WIP: multivariate statistics / proof of concept
Hi,
attached is a WIP patch implementing multivariate statistics. The code
is certainly not "ready" - parts of it look as if written by a rogue
chimp who got bored of attempting to type the complete works of William
Shakespeare and decided to try something different.
I also cut some corners to make it work, and those limitations need to
be fixed before the eventual commit (they are not difficult problems,
but solving them was not necessary for a proof-of-concept patch).
However, the patch seems to be working well enough at this point to
gather some useful feedback. So here we go.
I expect to be busy over the next two weeks because of travel, so I
apologize in advance for somewhat delayed responses. If you happen to
attend pgconf.eu next week (Oct 20-24), we can of course discuss this
patch in person.
Goals and basics
----------------
The goal of this patch is to allow users to define multivariate
statistics (i.e. statistics on multiple columns), and to improve
estimation when the columns are correlated.
Take for example a table like this:
CREATE TABLE test (a INT, b INT, c INT);
INSERT INTO test SELECT i/10000, i/10000, i/10000
FROM generate_series(1,1000000) s(i);
ANALYZE test;
and do a query like this:
SELECT * FROM test WHERE (a = 10) AND (b = 10) AND (c = 10);
which is estimated like this:
QUERY PLAN
---------------------------------------------------------
Seq Scan on test (cost=0.00..22906.00 rows=1 width=12)
Filter: ((a = 10) AND (b = 10) AND (c = 10))
Planning time: 0.142 ms
(3 rows)
The query of course returns 10,000 rows, but the planner assumes the
columns are independent and thus multiplies the selectivities. A
selectivity of 1/100 per column means 1/1,000,000 in total, which is
a single row.
This example is of course somewhat artificial, but the problem is far
from uncommon, especially in denormalized datasets (e.g. star schemas).
If you have ever gotten an index scan instead of a sequential scan
because of a poor estimate, resulting in a query running for hours
instead of seconds, you know the pain.
The patch allows you to do this:
ALTER TABLE test ADD STATISTICS ON (a, b, c);
ANALYZE test;
which then results in this estimate:
QUERY PLAN
------------------------------------------------------------
Seq Scan on test (cost=0.00..22906.00 rows=9667 width=12)
Filter: ((a = 10) AND (b = 10) AND (c = 10))
Planning time: 0.110 ms
(3 rows)
This however is not free - both building such statistics (during
ANALYZE) and using them (during planning) costs some cycles. Even if
we optimize the hell out of it, it won't be entirely free.
One of the design goals of this patch is therefore not to make ANALYZE
or planning any more expensive unless you actually add such statistics.
Those who add them have presumably decided that the price is worth the
improved estimates and the lower risk of inefficient plans. If planning
takes a few more milliseconds, that's probably a good deal when the
alternative is the risk of queries running for minutes or hours because
of misestimates.
It also does not guarantee that the estimates will always be better.
There will be misestimates, although more likely in the other direction
(the independence assumption usually leads to underestimates, while
this may lead to overestimates). However, based on my experience from
writing the patch, I believe it's possible to reasonably limit the
extent of such errors (just as with single-column histograms, the error
is related to the bucket size - the smaller the buckets, the less
damage a single bucket can do).
Of course, there will be cases where the old approach gets lucky by
accident - there's not much we can do to beat luck. But we can't rely
on it either.
Design overview
---------------
The patch adds a new system catalog, called pg_mv_statistic, which is
used to keep track of requested statistics. There's also a pg_mv_stats
view, showing some basic info about the stats (not all the data).
There are three kinds of statistics:
- list of most common combinations of values (MCV list)
- multi-dimensional histogram
- associative rules
The first two are extensions of the single-column stats we already
have. The MCV list is a trivial extension to multiple dimensions,
simply tracking combinations of values and their frequencies. The
histogram is more complex - the structure itself is quite simple
(multi-dimensional rectangles), but there are many ways to build one.
Even the current naive and simple implementation seems to work quite
well, though.
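To illustrate how the stats may then be used during estimation, here
is a minimal sketch of estimating a conjunction of equality clauses
from the MCV list (illustrative only, not the actual code from the
patch - the EqClause struct and the function are made up for this
example, and the part of the data not covered by the MCV list would
have to fall back to the histogram or default estimates):

   /* hypothetical helper describing a "column = constant" clause */
   typedef struct EqClause
   {
       int    dim;    /* index of the column/dimension */
       Datum  value;  /* the constant */
   } EqClause;

   /* sum the frequencies of MCV items matching all equality clauses */
   static double
   mcv_selectivity(MCVList mcvlist, EqClause *clauses, int nclauses)
   {
       int    i, j;
       double sel = 0.0;

       for (i = 0; i < mcvlist->nitems; i++)
       {
           bool match = true;

           for (j = 0; j < nclauses; j++)
           {
               /* plain Datum comparison (pass-by-value types only) */
               if (mcvlist->items[i]->values[clauses[j].dim] != clauses[j].value)
               {
                   match = false;
                   break;
               }
           }

           if (match)
               sel += mcvlist->items[i]->frequency;
       }

       return sel;
   }

Roughly speaking, the histogram is used in a similar way, except that
it sums frequencies of the buckets overlapping the ranges implied by
the clauses.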
The last kind (associative rules) is an attempt to track "implications"
between columns. It is however an experiment that is not really used in
the patch yet, so I'll ignore it for now.
I'm not going to explain all the implementation details here - if you
want to learn more, the best way is to read the changes in these files
(ideally in this order):
src/include/utils/mvstats.h
src/backend/commands/analyze.c
src/backend/optimizer/path/clausesel.c
I tried to explain the ideas thoroughly in the comments, along with a
lot of TODO/FIXME items related to limitations, explained in the next
section.
Limitations
-----------
As I mentioned, the current patch has a number of practical limitations,
most importantly:
(a) only data types passed by value (no varlena types)
(b) only data types with a sort operator (needed to build the histogram)
(c) no support for NULL values
(d) no handling of DROP COLUMN, DROP TABLE and the like
(e) stats limited to 8 columns (max)
(f) the optimizer uses a single stats object per table
(g) limited list of compatible WHERE clauses
(h) incomplete ADD STATISTICS syntax
The first three limitations are really shortcuts taken to get a working
patch, and fixing them should not be difficult.
The limited number of columns is really just a sanity check. It's
possible to increase it, but I doubt stats on more columns will be
practical because of excessive size or poor accuracy.
A better approach is to support combining multiple stats defined on
various subsets of the columns. This is not implemented at the moment,
but it's certainly on the roadmap. Currently the "smallest" stats
covering the most columns are selected.
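To make the current selection a bit more concrete, it amounts to
something like this (a simplified sketch with illustrative names, not
the actual patch code):

   /*
    * Pick the stats matching the most columns referenced by the
    * clauses, breaking ties by preferring stats with fewer columns
    * in total (i.e. the "smallest" ones). Requires at least two
    * matching columns, otherwise single-column stats do fine.
    */
   static MVStats
   choose_mv_statistics(MVStats stats, int nstats, Bitmapset *clause_attnums)
   {
       int     i, j;
       MVStats choice = NULL;
       int     best_matched = 1;
       int     best_width = INT_MAX;

       for (i = 0; i < nstats; i++)
       {
           int matched = 0;
           int width = stats[i].stakeys->dim1;

           for (j = 0; j < width; j++)
               if (bms_is_member(stats[i].stakeys->values[j], clause_attnums))
                   matched++;

           if ((matched > best_matched) ||
               ((matched == best_matched) && (width < best_width)))
           {
               choice = &stats[i];
               best_matched = matched;
               best_width = width;
           }
       }

       return choice;
   }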
Regarding the compatible WHERE clauses, the patch currently handles
conditions of the form
   column OPERATOR constant
where OPERATOR is one of the comparison operators (=, <, >, <=, >=).
In the future it should be possible to add support for more conditions,
e.g. "column IS NULL" or "column OPERATOR column".
The last point is really just an unfinished implementation - the syntax
I propose is this:
   ALTER TABLE ... ADD STATISTICS (options) ON (columns)
where the options influence the MCV list and histogram sizes, etc. The
options are recognized and may give you an idea of what they might do,
but they're not really used at the moment (except for being stored in
the pg_mv_statistic catalog).
Examples
--------
Let's see a few examples of how to define the stats, and what difference
in estimates it makes:
CREATE TABLE test (a INT, b INT, c INT);
-- same value in all columns
INSERT INTO test SELECT mod(i,100), mod(i,100), mod(i,100)
FROM generate_series(1,1000000) s(i);
ANALYZE test;
=============== no multivariate stats ============================
SELECT * FROM test WHERE a = 10 AND b = 10;
QUERY PLAN
-------------------------------------------------------------------
Seq Scan on test (cost=0.00..20406.00 rows=101 width=12)
(actual time=0.007..60.902 rows=10000 loops=1)
Filter: ((a = 10) AND (b = 10))
Rows Removed by Filter: 990000
Planning time: 0.119 ms
Execution time: 61.164 ms
(5 rows)
SELECT * FROM test WHERE a = 10 AND b = 10 AND c = 10;
QUERY PLAN
-------------------------------------------------------------------
Seq Scan on test (cost=0.00..22906.00 rows=1 width=12)
(actual time=0.010..56.780 rows=10000 loops=1)
Filter: ((a = 10) AND (b = 10) AND (c = 10))
Rows Removed by Filter: 990000
Planning time: 0.061 ms
Execution time: 56.994 ms
(5 rows)
=============== with multivariate stats ===========================
ALTER TABLE test ADD STATISTICS ON (a, b, c);
ANALYZE test;
SELECT * FROM test WHERE a = 10 AND b = 10;
QUERY PLAN
-------------------------------------------------------------------
Seq Scan on test (cost=0.00..20406.00 rows=10767 width=12)
(actual time=0.007..58.981 rows=10000 loops=1)
Filter: ((a = 10) AND (b = 10))
Rows Removed by Filter: 990000
Planning time: 0.114 ms
Execution time: 59.214 ms
(5 rows)
SELECT * FROM test WHERE a = 10 AND b = 10 AND c = 10;
QUERY PLAN
-------------------------------------------------------------------
Seq Scan on test (cost=0.00..22906.00 rows=10767 width=12)
(actual time=0.008..61.838 rows=10000 loops=1)
Filter: ((a = 10) AND (b = 10) AND (c = 10))
Rows Removed by Filter: 990000
Planning time: 0.088 ms
Execution time: 62.057 ms
(5 rows)
OK, that was a rather significant improvement, but it was also a
trivial dataset. Let's see something more complicated - the following
table has correlated columns, with distributions skewed towards 0.
CREATE TABLE test (a INT, b INT, c INT);
INSERT INTO test SELECT r*MOD(i,50),
pow(r,2)*MOD(i,100),
pow(r,4)*MOD(i,500)
FROM (SELECT random() AS r, i
FROM generate_series(1,1000000) s(i)) foo;
ANALYZE test;
=============== no multivariate stats ============================
SELECT * FROM test WHERE a = 0 AND b = 0;
QUERY PLAN
-------------------------------------------------------------------
Seq Scan on test (cost=0.00..20406.00 rows=9024 width=12)
(actual time=0.007..62.969 rows=49503 loops=1)
Filter: ((a = 0) AND (b = 0))
Rows Removed by Filter: 950497
Planning time: 0.057 ms
Execution time: 64.098 ms
(5 rows)
SELECT * FROM test WHERE a = 0 AND b = 0 AND c = 0;
QUERY PLAN
-------------------------------------------------------------------
Seq Scan on test (cost=0.00..22906.00 rows=2126 width=12)
(actual time=0.008..63.862 rows=40770 loops=1)
Filter: ((a = 0) AND (b = 0) AND (c = 0))
Rows Removed by Filter: 959230
Planning time: 0.060 ms
Execution time: 64.794 ms
(5 rows)
=============== with multivariate stats ============================
ALTER TABLE test ADD STATISTICS ON (a, b, c);
ANALYZE test;
db=> SELECT * FROM pg_mv_stats;
schemaname | public
tablename | test
attnums | 1 2 3
mcvbytes | 25904
mcvinfo | nitems=809
histbytes | 568240
histinfo | nbuckets=13772
SELECT * FROM test WHERE a = 0 AND b = 0;
QUERY PLAN
-------------------------------------------------------------------
Seq Scan on test (cost=0.00..20406.00 rows=47717 width=12)
(actual time=0.007..61.782 rows=49503 loops=1)
Filter: ((a = 0) AND (b = 0))
Rows Removed by Filter: 950497
Planning time: 3.181 ms
Execution time: 62.859 ms
(5 rows)
SELECT * FROM test WHERE a = 0 AND b = 0 AND c = 0;
QUERY PLAN
-------------------------------------------------------------------
Seq Scan on test (cost=0.00..22906.00 rows=40567 width=12)
(actual time=0.009..66.685 rows=40770 loops=1)
Filter: ((a = 0) AND (b = 0) AND (c = 0))
Rows Removed by Filter: 959230
Planning time: 0.188 ms
Execution time: 67.593 ms
(5 rows)
regards
Tomas
Attachment: multivar-stats-v1.patch (text/x-diff)
diff --git a/src/backend/catalog/Makefile b/src/backend/catalog/Makefile
index b257b02..6e63afe 100644
--- a/src/backend/catalog/Makefile
+++ b/src/backend/catalog/Makefile
@@ -32,6 +32,7 @@ POSTGRES_BKI_SRCS = $(addprefix $(top_srcdir)/src/include/catalog/,\
pg_attrdef.h pg_constraint.h pg_inherits.h pg_index.h pg_operator.h \
pg_opfamily.h pg_opclass.h pg_am.h pg_amop.h pg_amproc.h \
pg_language.h pg_largeobject_metadata.h pg_largeobject.h pg_aggregate.h \
+ pg_mv_statistic.h \
pg_statistic.h pg_rewrite.h pg_trigger.h pg_event_trigger.h pg_description.h \
pg_cast.h pg_enum.h pg_namespace.h pg_conversion.h pg_depend.h \
pg_database.h pg_db_role_setting.h pg_tablespace.h pg_pltemplate.h \
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 9d9d239..68ec1aa 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -150,6 +150,18 @@ CREATE VIEW pg_indexes AS
LEFT JOIN pg_tablespace T ON (T.oid = I.reltablespace)
WHERE C.relkind IN ('r', 'm') AND I.relkind = 'i';
+CREATE VIEW pg_mv_stats AS
+ SELECT
+ N.nspname AS schemaname,
+ C.relname AS tablename,
+ S.stakeys AS attnums,
+ length(S.stamcv) AS mcvbytes,
+ pg_mv_stats_mvclist_info(S.stamcv) AS mcvinfo,
+ length(S.stahist) AS histbytes,
+ pg_mv_stats_histogram_info(S.stahist) AS histinfo
+ FROM (pg_mv_statistic S JOIN pg_class C ON (C.oid = S.starelid))
+ LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace);
+
CREATE VIEW pg_stats AS
SELECT
nspname AS schemaname,
diff --git a/src/backend/commands/analyze.c b/src/backend/commands/analyze.c
index c09ca7e..df51805 100644
--- a/src/backend/commands/analyze.c
+++ b/src/backend/commands/analyze.c
@@ -27,6 +27,7 @@
#include "catalog/indexing.h"
#include "catalog/pg_collation.h"
#include "catalog/pg_inherits_fn.h"
+#include "catalog/pg_mv_statistic.h"
#include "catalog/pg_namespace.h"
#include "commands/dbcommands.h"
#include "commands/tablecmds.h"
@@ -54,7 +55,11 @@
#include "utils/syscache.h"
#include "utils/timestamp.h"
#include "utils/tqual.h"
+#include "utils/fmgroids.h"
+#include "utils/builtins.h"
+#include "utils/mvstats.h"
+#include "access/sysattr.h"
/* Data structure for Algorithm S from Knuth 3.4.2 */
typedef struct
@@ -111,6 +116,62 @@ static Datum std_fetch_func(VacAttrStatsP stats, int rownum, bool *isNull);
static Datum ind_fetch_func(VacAttrStatsP stats, int rownum, bool *isNull);
+/* multivariate statistics (histogram, MCV list, associative rules) */
+
+static void build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
+ int natts, VacAttrStats **vacattrstats);
+static void update_mv_stats(Oid relid,
+ MVHistogram histogram, MCVList mcvlist);
+
+/* multivariate histograms */
+static MVHistogram build_mv_histogram(int numrows, HeapTuple *rows,
+ int2vector *attrs,
+ int attr_cnt, VacAttrStats **vacattrstats,
+ int numrows_total);
+static MVBucket create_initial_mv_bucket(int numrows, HeapTuple *rows,
+ int2vector *attrs, int natts,
+ VacAttrStats **vacattrstats);
+static MVBucket select_bucket_to_partition(int nbuckets, MVBucket * buckets);
+static MVBucket partition_bucket(MVBucket bucket, int2vector *attrs,
+ int natts, VacAttrStats **vacattrstats);
+static MVBucket copy_mv_bucket(MVBucket bucket, uint32 ndimensions);
+
+static void update_bucket_ndistinct(MVBucket bucket, int2vector *attrs,
+ VacAttrStats ** stats);
+static void update_dimension_ndistinct(MVBucket bucket, int dimension,
+ int2vector *attrs,
+ VacAttrStats ** stats,
+ bool update_boundaries);
+/* multivariate MCV list */
+static MCVList build_mv_mcvlist(int numrows, HeapTuple *rows,
+ int2vector *attrs,
+ int natts, VacAttrStats **vacattrstats,
+ int *numrows_filtered);
+
+/* multivariate associative rules */
+static void build_mv_associations(int numrows, HeapTuple *rows,
+ int2vector *attrs,
+ int natts, VacAttrStats **vacattrstats);
+
+/* serialization */
+static bytea * serialize_mv_histogram(MVHistogram histogram);
+static bytea * serialize_mv_mcvlist(MCVList mcvlist);
+
+/* comparators, used when constructing multivariate stats */
+static int compare_scalars_simple(const void *a, const void *b, void *arg);
+static int compare_scalars_partition(const void *a, const void *b, void *arg);
+static int compare_scalars_memcmp(const void *a, const void *b, void *arg);
+static int compare_scalars_memcmp_2(const void *a, const void *b);
+
+static VacAttrStats ** lookup_var_attr_stats(int2vector *attrs,
+ int natts, VacAttrStats **vacattrstats);
+
+/* some debugging methods */
+#ifdef MVSTATS_DEBUG
+static void print_mv_histogram_info(MVHistogram histogram);
+#endif
+
+
/*
* analyze_rel() -- analyze one relation
*/
@@ -469,6 +530,13 @@ do_analyze_rel(Relation onerel, VacuumStmt *vacstmt,
* all analyzable columns. We use a lower bound of 100 rows to avoid
* possible overflow in Vitter's algorithm. (Note: that will also be the
* target in the corner case where there are no analyzable columns.)
+ *
+ * FIXME This sample sizing is mostly OK when computing stats for
+ * individual columns, but it's rather insufficient when computing
+ * multivariate stats (histograms, MCV lists, ...). For a small
+ * number of dimensions it works, but for complex stats it'd be
+ * nice to use a sample proportional to the table size (say,
+ * 0.5% - 1%) instead of a fixed size.
*/
targrows = 100;
for (i = 0; i < attr_cnt; i++)
@@ -571,6 +639,9 @@ do_analyze_rel(Relation onerel, VacuumStmt *vacstmt,
update_attstats(RelationGetRelid(Irel[ind]), false,
thisdata->attr_cnt, thisdata->vacattrstats);
}
+
+ /* Build multivariate stats (if there are any). */
+ build_mv_stats(onerel, numrows, rows, attr_cnt, vacattrstats);
}
/*
@@ -2810,3 +2881,1979 @@ compare_mcvs(const void *a, const void *b)
return da - db;
}
+
+/*
+ * Compute requested multivariate stats, using the rows sampled for the
+ * plain (single-column) stats.
+ *
+ * This fetches a list of stats from pg_mv_statistic, computes the stats
+ * and serializes them back into the catalog (as bytea values).
+ */
+static void
+build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
+ int natts, VacAttrStats **vacattrstats)
+{
+ int i;
+ MVStats mvstats;
+ int nmvstats;
+
+ /*
+ * Fetch defined MV groups from pg_mv_statistic, and then compute
+ * the MV statistics (histograms for now).
+ *
+ * TODO move this to a separate method or something ...
+ */
+ mvstats = list_mv_stats(RelationGetRelid(onerel), &nmvstats, false);
+
+ for (i = 0; i < nmvstats; i++)
+ {
+ MCVList mcvlist = NULL;
+ MVHistogram histogram = NULL;
+ int numrows_filtered = 0;
+
+ /* int2 vector of attnums the stats should be computed on */
+ int2vector * attrs = mvstats[i].stakeys;
+
+ /* check allowed number of dimensions */
+ Assert((attrs->dim1 >= 2) && (attrs->dim1 <= MVSTATS_MAX_DIMENSIONS));
+
+ /*
+ * Analyze associations between pairs of columns.
+ *
+ * FIXME store the identified associations back to pg_mv_statistic
+ */
+ build_mv_associations(numrows, rows, attrs, natts, vacattrstats);
+
+ /* build the MCV list */
+ mcvlist = build_mv_mcvlist(numrows, rows, attrs, natts, vacattrstats, &numrows_filtered);
+
+ /*
+ * Build a multivariate histogram on the columns.
+ *
+ * FIXME remove the rows used to build the MCV from the histogram.
+ * Another option might be subtracting the MCV selectivities
+ * from the histogram, but I'm not sure whether that works
+ * accurately (maybe it introduces additional errors).
+ */
+ if (numrows_filtered > 0)
+ histogram = build_mv_histogram(numrows_filtered, rows, attrs, natts, vacattrstats, numrows);
+
+ /* store the histogram / MCV list in the catalog */
+ update_mv_stats(mvstats[i].mvoid, histogram, mcvlist);
+
+#ifdef MVSTATS_DEBUG
+ print_mv_histogram_info(histogram);
+#endif
+
+ }
+}
+
+/*
+ * Lookup the VacAttrStats info for the selected columns, with indexes
+ * matching the attrs vector (to make it easy to work with when
+ * computing multivariate stats).
+ */
+static VacAttrStats **
+lookup_var_attr_stats(int2vector *attrs, int natts, VacAttrStats **vacattrstats)
+{
+ int i, j;
+ int numattrs = attrs->dim1;
+ VacAttrStats **stats = (VacAttrStats**)palloc0(numattrs * sizeof(VacAttrStats*));
+
+ /* lookup VacAttrStats info for the requested columns (same attnum) */
+ for (i = 0; i < numattrs; i++)
+ {
+ stats[i] = NULL;
+ for (j = 0; j < natts; j++)
+ {
+ if (attrs->values[i] == vacattrstats[j]->tupattnum)
+ {
+ stats[i] = vacattrstats[j];
+ break;
+ }
+ }
+
+ /*
+ * Check that we found the info, that the attnum matches, that the
+ * requested 'lt' operator is available, and that the type is
+ * passed by value.
+ */
+ Assert(stats[i] != NULL);
+ Assert(stats[i]->tupattnum == attrs->values[i]);
+
+ /* FIXME This is a rather ugly way to check for 'ltopr' (which
+ * is defined for 'scalar' attributes).
+ */
+ Assert(stats[i]->compute_stats == compute_scalar_stats);
+
+ /* TODO remove the 'pass by value' requirement */
+ Assert(stats[i]->attrtype->typbyval);
+ }
+
+ return stats;
+}
+
+/*
+ * TODO Add ndistinct estimation, probably the one described in "Towards
+ * Estimation Error Guarantees for Distinct Values, PODS 2000,
+ * p. 268-279" (the ones called GEE, or maybe AE).
+ *
+ * TODO The "combined" ndistinct is more likely to scale with the number
+ * of rows (in the table), because a single column behaving this
+ * way is sufficient for such behavior.
+ */
+static MVBucket
+create_initial_mv_bucket(int numrows, HeapTuple *rows, int2vector *attrs,
+ int natts, VacAttrStats **vacattrstats)
+{
+ int i;
+ int numattrs = attrs->dim1;
+
+ /* info for the interesting attributes only */
+ VacAttrStats **stats = lookup_var_attr_stats(attrs, natts, vacattrstats);
+
+ /* resulting bucket */
+ MVBucket bucket = (MVBucket)palloc0(sizeof(MVBucketData));
+
+ Assert(numrows > 0);
+ Assert(rows != NULL);
+ Assert((numattrs >= 2) && (numattrs <= MVSTATS_MAX_DIMENSIONS));
+
+ /* allocate the per-dimension arrays */
+ bucket->ndistincts = (uint32*)palloc0(numattrs * sizeof(uint32));
+ bucket->nullsonly = (bool*)palloc0(numattrs * sizeof(bool));
+
+ /* inclusiveness boundaries - lower/upper bounds */
+ bucket->min_inclusive = (bool*)palloc0(numattrs * sizeof(bool));
+ bucket->max_inclusive = (bool*)palloc0(numattrs * sizeof(bool));
+
+ /* lower/upper boundaries */
+ bucket->min = (Datum*)palloc0(numattrs * sizeof(Datum));
+ bucket->max = (Datum*)palloc0(numattrs * sizeof(Datum));
+
+ /*
+ * All the sample rows fall into the initial bucket.
+ *
+ * FIXME This is wrong (unless all columns are NOT NULL), because we
+ * skipped the NULL values.
+ */
+ bucket->numrows = numrows;
+ bucket->ntuples = numrows;
+ bucket->rows = rows;
+
+ /*
+ * Update the number of ndistinct combinations in the bucket (which
+ * we use when selecting bucket to partition), and then number of
+ * distinct values for each partition (which we use when choosing
+ * which dimension to split).
+ */
+ update_bucket_ndistinct(bucket, attrs, stats);
+
+ for (i = 0; i < numattrs; i++)
+ update_dimension_ndistinct(bucket, i, attrs, stats, true);
+
+ /*
+ * The initial bucket was not split at all, so we'll start with the
+ * first dimension in the next round (index = 0).
+ */
+ bucket->last_split_dimension = -1;
+
+ return bucket;
+}
+
+/*
+ * TODO Fix to handle arbitrarily-sized histograms (not just 2D ones)
+ * and call the right output procedures (for the particular type).
+ *
+ * TODO This should somehow fetch info about the data types, and use
+ * the appropriate output functions to print the boundary values.
+ * Right now this prints the 8B value as an integer.
+ *
+ * TODO Also, provide a special function for 2D histogram, printing
+ * a gnuplot script (with rectangles).
+ *
+ * TODO For string types (once supported) we can sort the strings first,
+ * assign them a sequence of integers and use the original values
+ * as labels.
+ */
+#ifdef MVSTATS_DEBUG
+static void
+print_mv_histogram_info(MVHistogram histogram)
+{
+ int i = 0;
+
+ elog(WARNING, "histogram nbuckets=%d", histogram->nbuckets);
+
+ for (i = 0; i < histogram->nbuckets; i++)
+ {
+ MVBucket bucket = histogram->buckets[i];
+ elog(WARNING, " bucket %d : ndistinct=%f ntuples=%d min=[%ld, %ld], max=[%ld, %ld] distinct=[%d,%d]",
+ i, bucket->ndistinct, bucket->numrows,
+ bucket->min[0], bucket->min[1], bucket->max[0], bucket->max[1],
+ bucket->ndistincts[0], bucket->ndistincts[1]);
+ }
+}
+#endif
+
+/*
+ * A very simple partitioning selection criteria - choose the bucket
+ * with the highest number of distinct values.
+ *
+ * Returns either pointer to the bucket selected to be partitioned,
+ * or NULL if there are no buckets that may be split (i.e. all buckets
+ * contain a single distinct value).
+ *
+ * TODO Consider other partitioning criteria (v-optimal, maxdiff etc.).
+ *
+ * TODO Allowing the bucket to degenerate to a single combination of
+ * values makes it a rather strange MCV list. Maybe we should use
+ * a higher lower boundary, or maybe make the selection criteria
+ * more complex (e.g. consider number of rows in the bucket, etc.).
+ *
+ * That however is different from buckets 'degenerated' only for
+ * some dimensions (e.g. half of them), which is perfectly
+ * appropriate for statistics on a combination of low and high
+ * cardinality columns.
+ */
+static MVBucket
+select_bucket_to_partition(int nbuckets, MVBucket * buckets)
+{
+ int i;
+ int ndistinct = 1; /* if ndistinct=1, we can't split the bucket */
+ MVBucket bucket = NULL;
+
+ for (i = 0; i < nbuckets; i++)
+ {
+ /* if the ndistinct count is higher, use this bucket */
+ if (buckets[i]->ndistinct > ndistinct) {
+ bucket = buckets[i];
+ ndistinct = buckets[i]->ndistinct;
+ }
+ }
+
+ /* may be NULL if there are no buckets with (ndistinct > 1) */
+ return bucket;
+}
+
+/*
+ * A simple bucket partitioning implementation - splits the dimensions in
+ * a round-robin manner (considering only those with ndistinct>1). That
+ * is, first dimension 0 is split, then 1, 2, ... until reaching the
+ * end of the attribute list, and then wrapping back to 0. Of course,
+ * dimensions with a single distinct value are skipped.
+ *
+ * This is essentially what Muralikrishna/DeWitt described in their SIGMOD
+ * article (M. Muralikrishna, David J. DeWitt: Equi-Depth Histograms For
+ * Estimating Selectivity Factors For Multi-Dimensional Queries. SIGMOD
+ * Conference 1988: 28-36).
+ *
+ * There are multiple histogram options, centered around the partitioning
+ * criteria, specifying both how to choose a bucket and the dimension
+ * most in need of a split. For a nice summary and general overview, see
+ * "rK-Hist : an R-Tree based histogram for multi-dimensional selectivity
+ * estimation" thesis by J. A. Lopez, Concordia University, p.34-37 (and
+ * possibly p. 32-34 for explanation of the terms).
+ *
+ * This splits the bucket by tweaking the existing one, and returning the
+ * new bucket (essentially shrinking the existing one in-place and returning
+ * the other "half" as a new bucket). The caller is responsible for adding
+ * the new bucket into the list of buckets.
+ *
+ * TODO It requires care to prevent splitting only one dimension and not
+ * splitting another one at all (which might happen easily in case of
+ * strongly dependent columns - e.g. y=x).
+ *
+ * TODO Should probably consider statistics target for the columns (e.g. to
+ * split dimensions with higher statistics target more frequently).
+ */
+static MVBucket
+partition_bucket(MVBucket bucket, int2vector *attrs,
+ int natts, VacAttrStats **vacattrstats)
+{
+ int i;
+ int dimension;
+ int numattrs = attrs->dim1;
+
+ Datum split_value;
+ MVBucket new_bucket;
+
+ /* needed for sort, when looking for the split value */
+ bool isNull;
+ int nvalues = 0;
+ StdAnalyzeData * mystats = NULL;
+ ScalarItem * values = (ScalarItem*)palloc0(bucket->numrows * sizeof(ScalarItem));
+ SortSupportData ssup;
+
+ /* looking for the split value */
+ int ndistinct = 1; /* number of distinct values below current value */
+ int nrows = 1; /* number of rows below current value */
+
+ /* needed when splitting the values */
+ HeapTuple * oldrows = bucket->rows;
+ int oldnrows = bucket->numrows;
+
+ /* info for the interesting attributes only */
+ VacAttrStats **stats = lookup_var_attr_stats(attrs, natts, vacattrstats);
+
+ /*
+ * We can't split buckets with a single distinct value (this also
+ * disqualifies NULL-only dimensions). Also, there have to be multiple
+ * sample rows (otherwise there couldn't be multiple distinct values).
+ */
+ Assert(bucket->ndistinct > 1);
+ Assert(bucket->numrows > 1);
+ Assert((numattrs >= 2) && (numattrs <= MVSTATS_MAX_DIMENSIONS));
+
+ /*
+ * Look for the next dimension to split, in a round robin manner.
+ * We'll use the first one with (ndistinct > 1).
+ *
+ * If we happen to wrap around, something clearly went wrong (we
+ * don't update last_split_dimension inside the loop, so that this
+ * check can actually detect the wrap-around).
+ */
+ dimension = bucket->last_split_dimension;
+ while (true)
+ {
+ dimension = (dimension + 1) % numattrs;
+
+ if (bucket->ndistincts[dimension] > 1)
+ break;
+
+ /* if we got back to the previous split dimension, we've wrapped around */
+ Assert(dimension != bucket->last_split_dimension);
+ }
+
+ /* Remember the dimension for the next split of this bucket. */
+ bucket->last_split_dimension = dimension;
+
+ /*
+ * Walk through the selected dimension, collect and sort the values
+ * and then choose the value to use as the new boundary.
+ */
+ mystats = (StdAnalyzeData *) stats[dimension]->extra_data;
+
+ /* initialize sort support, etc. */
+ memset(&ssup, 0, sizeof(ssup));
+ ssup.ssup_cxt = CurrentMemoryContext;
+
+ /* We always use the default collation for statistics */
+ ssup.ssup_collation = DEFAULT_COLLATION_OID;
+ ssup.ssup_nulls_first = false;
+
+ PrepareSortSupportFromOrderingOp(mystats->ltopr, &ssup);
+
+ for (i = 0; i < bucket->numrows; i++)
+ {
+ /* remember the index of the sample row, to make the partitioning simpler */
+ values[nvalues].value = heap_getattr(bucket->rows[i], attrs->values[dimension],
+ stats[dimension]->tupDesc, &isNull);
+ values[nvalues].tupno = i;
+
+ /* no NULL values allowed here (we don't do splits by null-only dimensions) */
+ Assert(!isNull);
+
+ nvalues++;
+ }
+
+ /* sort the array (pass-by-value datums) */
+ qsort_arg((void *) values, nvalues, sizeof(ScalarItem),
+ compare_scalars_partition, (void *) &ssup);
+
+ /*
+ * We know there are bucket->ndistincts[dimension] distinct values
+ * in this dimension, and we want to split it in half, so walk
+ * through the array and stop once we see (ndistinct/2) values.
+ *
+ * We always choose the "next" value, i.e. (n/2+1)-th distinct value,
+ * and use it as an exclusive upper boundary (and inclusive lower
+ * boundary).
+ *
+ * TODO Maybe we should use "average" of the two middle distinct
+ * values (at least for even distinct counts), but that would
+ * require being able to do an average (which does not work
+ * for non-arithmetic types).
+ *
+ * TODO Another option is to look for a split that'd give about
+ * 50% tuples (not distinct values) in each partition. That
+ * might work better when there are a few very frequent
+ * values, and many rare ones.
+ */
+ split_value = values[0].value;
+ for (i = 1; i < bucket->numrows; i++)
+ {
+ /* count distinct values */
+ if (values[i].value != values[i-1].value)
+ ndistinct += 1;
+
+ /* once we've seen half of the distinct values, use this value */
+ if (ndistinct > bucket->ndistincts[dimension] / 2)
+ {
+ split_value = values[i].value;
+ break;
+ }
+
+ /* keep track of how many rows belong to the first bucket */
+ nrows += 1;
+ }
+
+ Assert(nrows > 0);
+ Assert(nrows < bucket->numrows);
+
+ /* create the new bucket as an (incomplete) copy of the one being partitioned */
+ new_bucket = copy_mv_bucket(bucket, numattrs);
+
+ /*
+ * Do the actual split of the chosen dimension, using the split value as the
+ * upper bound for the existing bucket, and lower bound for the new one.
+ */
+ bucket->max[dimension] = split_value;
+ new_bucket->min[dimension] = split_value;
+
+ bucket->max_inclusive[dimension] = false;
+ new_bucket->min_inclusive[dimension] = true;
+
+ /*
+ * Redistribute the sample tuples using the 'ScalarItem->tupno'
+ * index. We know 'nrows' rows should remain in the original
+ * bucket and the rest goes to the new one.
+ */
+
+ bucket->rows = (HeapTuple*)palloc0(nrows * sizeof(HeapTuple));
+ new_bucket->rows = (HeapTuple*)palloc0((oldnrows - nrows) * sizeof(HeapTuple));
+
+ bucket->numrows = nrows;
+ new_bucket->numrows = (oldnrows - nrows);
+
+ /*
+ * The first nrows should go to the first bucket, the rest should
+ * go to the new one. Use the tupno field to get the actual HeapTuple
+ * row from the original array of sample rows.
+ */
+ for (i = 0; i < nrows; i++)
+ memcpy(&bucket->rows[i], &oldrows[values[i].tupno], sizeof(HeapTuple));
+
+ for (i = nrows; i < oldnrows; i++)
+ memcpy(&new_bucket->rows[i-nrows], &oldrows[values[i].tupno], sizeof(HeapTuple));
+
+ /* update ndistinct values for the buckets (total and per dimension) */
+ update_bucket_ndistinct(bucket, attrs, stats);
+ update_bucket_ndistinct(new_bucket, attrs, stats);
+
+ /*
+ * TODO We don't need to do this for the dimension we used for split,
+ * because we know how many distinct values went to each partition.
+ */
+ for (i = 0; i < numattrs; i++)
+ {
+ update_dimension_ndistinct(bucket, i, attrs, stats, false);
+ update_dimension_ndistinct(new_bucket, i, attrs, stats, false);
+ }
+
+ pfree(oldrows);
+ pfree(values);
+
+ return new_bucket;
+}
+
+/*
+ * Copy a histogram bucket. The copy does not include the build-time
+ * data, i.e. sampled rows etc.
+ */
+static MVBucket
+copy_mv_bucket(MVBucket bucket, uint32 ndimensions)
+{
+ MVBucket new_bucket = (MVBucket)palloc0(sizeof(MVBucketData));
+
+ /* Copy only the fields that stay the same after the split; the
+ * rest will be recomputed once the split is done. */
+
+ new_bucket->last_split_dimension = bucket->last_split_dimension;
+
+ /* allocate the per-dimension arrays */
+ new_bucket->ndistincts = (uint32*)palloc0(ndimensions * sizeof(uint32));
+ new_bucket->nullsonly = (bool*)palloc0(ndimensions * sizeof(bool));
+
+ /* inclusiveness boundaries - lower/upper bounds */
+ new_bucket->min_inclusive = (bool*)palloc0(ndimensions * sizeof(bool));
+ new_bucket->max_inclusive = (bool*)palloc0(ndimensions * sizeof(bool));
+
+ /* lower/upper boundaries */
+ new_bucket->min = (Datum*)palloc0(ndimensions * sizeof(Datum));
+ new_bucket->max = (Datum*)palloc0(ndimensions * sizeof(Datum));
+
+ /* copy data */
+ memcpy(new_bucket->nullsonly, bucket->nullsonly, ndimensions * sizeof(bool));
+
+ memcpy(new_bucket->min_inclusive, bucket->min_inclusive, ndimensions*sizeof(bool));
+ memcpy(new_bucket->min, bucket->min, ndimensions*sizeof(Datum));
+
+ memcpy(new_bucket->max_inclusive, bucket->max_inclusive, ndimensions*sizeof(bool));
+ memcpy(new_bucket->max, bucket->max, ndimensions*sizeof(Datum));
+
+ return new_bucket;
+}
+
+/*
+ * Counts the number of distinct values in the bucket. This just copies
+ * the Datum values into a simple array, and sorts them using memcmp-based
+ * comparator. That means it only works for pass-by-value data types
+ * (assuming they don't use collations etc.)
+ *
+ * FIXME Make this work with all types (not just pass-by-value ones).
+ *
+ * TODO This might evaluate and store the distinct counts for all
+ * possible attribute combinations. The assumption is this might be
+ * useful for estimating things like GROUP BY cardinalities (e.g.
+ * in cases when some buckets contain a lot of low-frequency
+ * combinations, and other buckets contain few high-frequency ones).
+ *
+ * But it's unclear whether it's worth the price. Computing this
+ * is actually quite cheap, because it may be evaluated at the very
+ * end, when the buckets are rather small (so sorting it in 2^N ways
+ * is not a big deal). Assuming the partitioning algorithm does not
+ * use these values to do the decisions, of course (the current
+ * algorithm does not).
+ *
+ * The overhead with storing, fetching and parsing the data is more
+ * concerning - adding 2^N values per bucket (even if it's just
+ * a 1B or 2B value) would significantly bloat the histogram, and
+ * thus the impact on optimizer. Which is not really desirable.
+ *
+ * TODO This only updates the ndistinct for the sample (or bucket), but
+ * we eventually need an estimate of the total number of distinct
+ * values in the dataset. It's possible to either use the current
+ * 1D approach (i.e., if it's more than 10% of the sample, assume
+ * it's proportional to the number of rows). Or it's possible to
+ * implement the estimator suggested in the article, supposedly
+ * giving 'optimal' estimates (w.r.t. probability of error).
+ */
+static void
+update_bucket_ndistinct(MVBucket bucket, int2vector *attrs, VacAttrStats ** stats)
+{
+ int i, j, idx = 0;
+ int numattrs = attrs->dim1;
+ Size len = sizeof(Datum) * numattrs;
+ bool isNull;
+
+ /*
+ * We could collect this while walking through all the attributes
+ * above (this way we have to call heap_getattr twice).
+ */
+ Datum * values = palloc0(bucket->numrows * numattrs * sizeof(Datum));
+
+ for (j = 0; j < bucket->numrows; j++)
+ for (i = 0; i < numattrs; i++)
+ values[idx++] = heap_getattr(bucket->rows[j], attrs->values[i],
+ stats[i]->tupDesc, &isNull);
+
+ qsort_arg((void *) values, bucket->numrows, sizeof(Datum) * numattrs,
+ compare_scalars_memcmp, &len);
+
+ bucket->ndistinct = 1;
+
+ for (i = 1; i < bucket->numrows; i++)
+ if (memcmp(&values[i * numattrs], &values[(i-1) * numattrs], len) != 0)
+ bucket->ndistinct += 1;
+
+ pfree(values);
+
+}
+
+/*
+ * Count distinct values per bucket dimension.
+ *
+ * TODO Remove unnecessary parameters - don't pass in the whole arrays,
+ * just the proper elements.
+ */
+static void
+update_dimension_ndistinct(MVBucket bucket, int dimension, int2vector *attrs,
+ VacAttrStats ** stats, bool update_boundaries)
+{
+ int j;
+ int nvalues = 0;
+ bool isNull;
+ Datum * values = (Datum*)palloc0(bucket->numrows * sizeof(Datum));
+ SortSupportData ssup;
+
+ StdAnalyzeData * mystats = (StdAnalyzeData *) stats[dimension]->extra_data;
+
+ memset(&ssup, 0, sizeof(ssup));
+ ssup.ssup_cxt = CurrentMemoryContext;
+
+ /* We always use the default collation for statistics */
+ ssup.ssup_collation = DEFAULT_COLLATION_OID;
+ ssup.ssup_nulls_first = false;
+
+ PrepareSortSupportFromOrderingOp(mystats->ltopr, &ssup);
+
+ for (j = 0; j < bucket->numrows; j++)
+ {
+ values[nvalues] = heap_getattr(bucket->rows[j], attrs->values[dimension],
+ stats[dimension]->tupDesc, &isNull);
+
+ /* ignore NULL values */
+ if (! isNull)
+ nvalues++;
+ }
+
+ /* there's always at least 1 distinct value (may be NULL) */
+ bucket->ndistincts[dimension] = 1;
+
+ /* if there are only NULL values in the column, mark it so and continue
+ * with the next one */
+ if (nvalues == 0)
+ {
+ pfree(values);
+ bucket->nullsonly[dimension] = true;
+ return;
+ }
+
+ /* sort the array (pass-by-value datums) */
+ qsort_arg((void *) values, nvalues, sizeof(Datum),
+ compare_scalars_simple, (void *) &ssup);
+
+ /*
+ * Update min/max boundaries to the smallest bounding box. Generally, this
+ * needs to be done only when constructing the initial bucket.
+ */
+ if (update_boundaries)
+ {
+ /* store the min/max values */
+ bucket->min[dimension] = values[0];
+ bucket->min_inclusive[dimension] = true;
+
+ bucket->max[dimension] = values[nvalues-1];
+ bucket->max_inclusive[dimension] = true;
+ }
+
+ /*
+ * Walk through the array and count distinct values by comparing
+ * succeeding values.
+ *
+ * FIXME This only works for pass-by-value types (i.e. not VARCHARs etc.).
+ */
+ for (j = 1; j < nvalues; j++) {
+ if (values[j] != values[j-1])
+ bucket->ndistincts[dimension] += 1;
+ }
+
+ pfree(values);
+}
+
+/*
+ * Fetch list of MV stats defined on a table, without the actual data
+ * for histograms, MCV lists etc.
+ */
+MVStats
+list_mv_stats(Oid relid, int *nstats, bool built_only)
+{
+ Relation indrel;
+ SysScanDesc indscan;
+ ScanKeyData skey;
+ HeapTuple htup;
+ MVStats result;
+
+ /* start with 16 items; that should be enough for most cases */
+ int maxitems = 16;
+ result = (MVStats)palloc0(sizeof(MVStatsData) * maxitems);
+ *nstats = 0;
+
+ /* Prepare to scan pg_mv_statistic for entries having starelid = this rel. */
+ ScanKeyInit(&skey,
+ Anum_pg_mv_statistic_starelid,
+ BTEqualStrategyNumber, F_OIDEQ,
+ ObjectIdGetDatum(relid));
+
+ indrel = heap_open(MvStatisticRelationId, AccessShareLock);
+ indscan = systable_beginscan(indrel, MvStatisticRelidIndexId, true,
+ NULL, 1, &skey);
+
+ while (HeapTupleIsValid(htup = systable_getnext(indscan)))
+ {
+ Form_pg_mv_statistic stats = (Form_pg_mv_statistic) GETSTRUCT(htup);
+
+ /*
+ * Skip statistics that were not computed yet (if only stats
+ * that were already built were requested)
+ */
+ if (built_only && (! (stats->hist_built || stats->mcv_built || stats->assoc_built)))
+ continue;
+
+ /* double the array size if needed */
+ if (*nstats == maxitems)
+ {
+ maxitems *= 2;
+ result = (MVStats)repalloc(result, sizeof(MVStatsData) * maxitems);
+ }
+
+ result[*nstats].mvoid = HeapTupleGetOid(htup);
+ result[*nstats].stakeys = buildint2vector(stats->stakeys.values, stats->stakeys.dim1);
+ result[*nstats].hist_built = stats->hist_built;
+ result[*nstats].mcv_built = stats->mcv_built;
+ result[*nstats].assoc_built = stats->assoc_built;
+ *nstats += 1;
+ }
+
+ systable_endscan(indscan);
+
+ heap_close(indrel, AccessShareLock);
+
+ /* TODO maybe save the list into relcache, as in RelationGetIndexList
+ * (which was used as inspiration for this one). */
+
+ return result;
+}
+
+
+/*
+ * Serialize the MV histogram into a bytea value.
+ *
+ * The serializer first deduplicates the boundary values into a separate
+ * array, and uses 2B indexes when serializing the buckets. This saves
+ * a significant amount of space, because each bucket split adds a single
+ * new boundary value, so e.g. with 4 attributes and 8191 splits (thus
+ * 8192 buckets), there are only ~8200 distinct boundary values.
+ *
+ * But as each bucket has 8 boundary values (4+4), that's ~64k Datums.
+ * That's roughly 65kB vs. 512kB, but we haven't included the indexes
+ * used to reference the boundary values. By using int16 indexes (which
+ * should be more than enough for all reasonable histogram sizes),
+ * this amounts to ~128kB (8192*8*2). So in total it's ~196kB vs. 512kB,
+ * i.e. more than 2x compression, which is nice.
+ *
+ * The implementation is simple - walk through the buckets, collect all
+ * the boundary values, keep only distinct values (in a sorted array)
+ * and then replace the values with indexes (using binary search).
+ *
+ * It's possible to either serialize/deserialize the histogram into
+ * a MVHistogram, or create a special structure working with this
+ * compressed structure (and keep MVBucket/MVHistogram only for the
+ * building phase). This might actually work better thanks to better
+ * CPU cache hit ratio, and simpler deserialization.
+ *
+ * This encoding will probably prevent automatic varlena compression,
+ * because first part of the serialized bytea will be an array of unique
+ * values (although sorted), and pglz decides whether to compress by
+ * trying to compress the first part (~1kB or so). Which will be poor,
+ * due to the lack of repetition.
+ *
+ * But in this case this is probably desirable - the data in general
+ * won't be really compressible (in addition to the 2x compression we
+ * got thanks to the encoding). In a sense the encoding scheme is
+ * actually a context-aware compression (usually compressing to ~30%).
+ * So this seems appropriate in this case.
+ *
+ * FIXME Make this work with arbitrary types.
+ *
+ * TODO Try to keep the compressed form, instead of deserializing it to
+ * MVHistogram/MVBucket.
+ *
+ * TODO We might get a bit better compression by considering the actual
+ * data type length. The current implementation treats all data as
+ * 8B values, but for INT it's actually 4B etc. OTOH this is only
+ * related to the lookup table, and most of the space is occupied
+ * by the buckets (with int16 indexes). And we don't have type info
+ * at the moment, so it would be difficult (but we'll need it to
+ * support all types, so maybe then).
+ */
+static bytea *
+serialize_mv_histogram(MVHistogram histogram)
+{
+ int i = 0, j = 0;
+
+ /* total size (histogram header + all buckets) */
+ Size total_len;
+ char *tmp = NULL;
+ bytea *result = NULL;
+
+ /* we need to accumulate all boundary values (min/max) */
+ int idx = 0;
+ int max_values = histogram->nbuckets * histogram->ndimensions * 2;
+ Datum * values = (Datum*)palloc0(max_values * sizeof(Datum));
+ Size len = sizeof(Datum);
+
+ /* we'll collect unique boundary values into this */
+ int ndistinct = 0;
+ Datum *lookup = NULL;
+
+ /*
+ * Collect the boundary values first, sort them and generate a small
+ * array with only distinct values.
+ */
+ for (i = 0; i < histogram->nbuckets; i++)
+ {
+ for (j = 0; j < histogram->ndimensions; j++)
+ {
+ values[idx++] = histogram->buckets[i]->min[j];
+ values[idx++] = histogram->buckets[i]->max[j];
+ }
+ }
+
+ /*
+ * We've allocated just enough space for all boundary values, but
+ * this may change once we start handling NULL values (as we'll
+ * probably skip those).
+ *
+ * Also, we expect at least one boundary value at this moment.
+ */
+ Assert(max_values == idx);
+ Assert(idx > 1);
+
+ /*
+ * Sort the collected boundary values using a simple memcmp-based
+ * comparator (this won't work for pass-by-reference types), and
+ * then walk the data and count the distinct values.
+ */
+ qsort((void *) values, idx, len, compare_scalars_memcmp_2);
+
+ ndistinct = 1;
+ for (i = 1; i < max_values; i++)
+ ndistinct += (values[i-1] != values[i]) ? 1 : 0;
+
+ /*
+ * At this moment we can allocate the bytea value (and we'll collect
+ * the boundary values directly into it).
+ *
+ * The bytea will be structured like this:
+ *
+ * - varlena header : VARHDRSZ
+ * - histogram header : offsetof(MVHistogram,buckets)
+ * - number of boundary values : sizeof(uint32)
+ * - boundary values : ndistinct * sizeof(Datum)
+ * - buckets : nbuckets * BUCKET_SIZE_SERIALIZED
+ *
+ * We'll assume 2B indexes into the boundary values, because each
+ * bucket 'split' introduces one boundary value. Moreover, multiple
+ * splits may introduce the same value, so this should be enough for
+ * at least 65k buckets (and likely more). That's more than enough
+ * for reasonable histogram sizes.
+ */
+
+ Assert(ndistinct <= 65536);
+
+ total_len = VARHDRSZ + offsetof(MVHistogramData, buckets) +
+ (sizeof(uint32) + ndistinct * sizeof(Datum)) +
+ histogram->nbuckets * BUCKET_SIZE_SERIALIZED(histogram->ndimensions);
+
+ result = (bytea*)palloc0(total_len);
+ tmp = VARDATA(result);
+
+ SET_VARSIZE(result, total_len);
+
+ /* copy the global histogram header */
+ memcpy(tmp, histogram, offsetof(MVHistogramData, buckets));
+ tmp += offsetof(MVHistogramData, buckets);
+
+ /*
+ * Copy the number of distinct values, and then all the distinct
+ * values currently stored in the 'values' array (sorted).
+ */
+ memcpy(tmp, &ndistinct, sizeof(uint32));
+ tmp += sizeof(uint32);
+
+ lookup = (Datum*)tmp;
+
+ for (i = 0; i < max_values; i++)
+ {
+ /* skip values that are equal to the previous one */
+ if ((i > 0) && (values[i-1] == values[i]))
+ continue;
+
+ memcpy(tmp, &values[i], sizeof(Datum));
+ tmp += sizeof(Datum);
+ }
+
+ Assert(tmp - (char*)lookup == ndistinct * sizeof(Datum));
+
+ /* now serialize all the buckets - first the header, without the
+ * variable-length part, then all the variable length parts */
+ for (i = 0; i < histogram->nbuckets; i++)
+ {
+ MVBucket bucket = histogram->buckets[i];
+ uint16 indexes[histogram->ndimensions];
+
+ /* write the common bucket header */
+ memcpy(tmp, bucket, offsetof(MVBucketData, ndistincts));
+ tmp += offsetof(MVBucketData, ndistincts);
+
+ /* per-dimension ndistincts / nullsonly */
+ memcpy(tmp, bucket->ndistincts, sizeof(uint32)*histogram->ndimensions);
+ tmp += sizeof(uint32)*histogram->ndimensions;
+
+ memcpy(tmp, bucket->nullsonly, sizeof(bool)*histogram->ndimensions);
+ tmp += sizeof(bool)*histogram->ndimensions;
+
+ memcpy(tmp, bucket->min_inclusive, sizeof(bool)*histogram->ndimensions);
+ tmp += sizeof(bool)*histogram->ndimensions;
+
+ memcpy(tmp, bucket->max_inclusive, sizeof(bool)*histogram->ndimensions);
+ tmp += sizeof(bool)*histogram->ndimensions;
+
+ /* and now translate the min (and then max) boundaries to indexes */
+ for (j = 0; j < histogram->ndimensions; j++)
+ {
+ Datum *v = (Datum*)bsearch(&bucket->min[j], lookup, ndistinct,
+ sizeof(Datum), compare_scalars_memcmp_2);
+
+ Assert(v != NULL);
+ indexes[j] = (v - lookup); /* Datum arithmetic (not char) */
+ Assert(indexes[j] < ndistinct); /* we have to be within the array */
+ }
+
+ memcpy(tmp, indexes, sizeof(uint16)*histogram->ndimensions);
+ tmp += sizeof(uint16)*histogram->ndimensions;
+
+ for (j = 0; j < histogram->ndimensions; j++)
+ {
+ Datum *v = (Datum*)bsearch(&bucket->max[j], lookup, ndistinct,
+ sizeof(Datum), compare_scalars_memcmp_2);
+ Assert(v != NULL);
+ indexes[j] = (v - lookup); /* Datum arithmetic (not char) */
+ Assert(indexes[j] < ndistinct); /* we have to be within the array */
+ }
+
+ memcpy(tmp, indexes, sizeof(uint16)*histogram->ndimensions);
+ tmp += sizeof(uint16)*histogram->ndimensions;
+ }
+
+ return result;
+}
+
+/*
+ * The reverse of serialize_mv_histogram. This essentially expands the
+ * serialized form back into MVHistogram / MVBucket structures.
+ */
+MVHistogram
+deserialize_mv_histogram(bytea * data)
+{
+ int i = 0, j = 0;
+
+ Size expected_length;
+ char *tmp = NULL;
+ MVHistogram histogram;
+
+ uint32 nlookup; /* Datum lookup table */
+ Datum *lookup = NULL;
+
+ if (data == NULL)
+ return NULL;
+
+ /* get pointer to the data part of the varlena */
+ tmp = VARDATA(data);
+
+ histogram = (MVHistogram)palloc0(sizeof(MVHistogramData));
+
+ /* copy the histogram header in place */
+ memcpy(histogram, tmp, offsetof(MVHistogramData, buckets));
+ tmp += offsetof(MVHistogramData, buckets);
+
+ if (histogram->magic != MVHIST_MAGIC)
+ {
+ pfree(histogram);
+ elog(WARNING, "not an MV histogram (magic number mismatch)");
+ return NULL;
+ }
+
+ Assert(histogram->type == MVHIST_TYPE_BASIC);
+ Assert(histogram->nbuckets > 0);
+ Assert(histogram->nbuckets <= MVHIST_MAX_BUCKETS);
+ Assert(histogram->ndimensions > 0);
+ Assert(histogram->ndimensions <= MVSTATS_MAX_DIMENSIONS);
+
+ /* now, get the size of the lookup table */
+ memcpy(&nlookup, tmp, sizeof(uint32));
+ tmp += sizeof(uint32);
+ lookup = (Datum*)tmp;
+
+ /* skip to the first bucket */
+ tmp += sizeof(Datum) * nlookup;
+
+ /* check the total serialized length */
+ expected_length = offsetof(MVHistogramData, buckets) +
+ sizeof(uint32) + nlookup * sizeof(Datum) +
+ histogram->nbuckets * BUCKET_SIZE_SERIALIZED(histogram->ndimensions);
+
+ /* check serialized length */
+ if (VARSIZE_ANY_EXHDR(data) != expected_length)
+ {
+ elog(ERROR, "invalid MV histogram serialized size (expected %ld, got %ld)",
+ expected_length, VARSIZE_ANY_EXHDR(data));
+ return NULL;
+ }
+
+ /* allocate bucket pointers */
+ histogram->buckets = (MVBucket*)palloc0(histogram->nbuckets * sizeof(MVBucket));
+
+ /* deserialize the buckets, one by one */
+ for (i = 0; i < histogram->nbuckets; i++)
+ {
+ /* don't allocate space for the build-only fields */
+ MVBucket bucket = (MVBucket)palloc0(offsetof(MVBucketData, rows));
+ uint16 *indexes = NULL;
+
+ /* read the common bucket header */
+ memcpy(bucket, tmp, offsetof(MVBucketData, ndistincts));
+ tmp += offsetof(MVBucketData, ndistincts);
+
+ /* per-dimension ndistincts / nullsonly */
+ bucket->ndistincts = (uint32*)palloc0(sizeof(uint32)*histogram->ndimensions);
+ memcpy(bucket->ndistincts, tmp, sizeof(uint32)*histogram->ndimensions);
+ tmp += sizeof(uint32)*histogram->ndimensions;
+
+ bucket->nullsonly = (bool*)palloc0(sizeof(bool)*histogram->ndimensions);
+ memcpy(bucket->nullsonly, tmp, sizeof(bool)*histogram->ndimensions);
+ tmp += sizeof(bool)*histogram->ndimensions;
+
+ bucket->min_inclusive = (bool*)palloc0(sizeof(bool)*histogram->ndimensions);
+ memcpy(bucket->min_inclusive, tmp, sizeof(bool)*histogram->ndimensions);
+ tmp += sizeof(bool)*histogram->ndimensions;
+
+ bucket->max_inclusive = (bool*)palloc0(sizeof(bool)*histogram->ndimensions);
+ memcpy(bucket->max_inclusive, tmp, sizeof(bool)*histogram->ndimensions);
+ tmp += sizeof(bool)*histogram->ndimensions;
+
+ /* translate the indexes back to Datum values */
+ bucket->min = (Datum*)palloc0(sizeof(Datum)*histogram->ndimensions);
+ bucket->max = (Datum*)palloc0(sizeof(Datum)*histogram->ndimensions);
+
+ indexes = (uint16*)tmp;
+ tmp += sizeof(uint16) * histogram->ndimensions;
+ for (j = 0; j < histogram->ndimensions; j++)
+ memcpy(&bucket->min[j], &lookup[indexes[j]], sizeof(Datum));
+
+ indexes = (uint16*)tmp;
+ tmp += sizeof(uint16) * histogram->ndimensions;
+ for (j = 0; j < histogram->ndimensions; j++)
+ memcpy(&bucket->max[j], &lookup[indexes[j]], sizeof(Datum));
+
+ histogram->buckets[i] = bucket;
+ }
+
+ return histogram;
+}
+
+/*
+ * Serialize MCV list into a bytea value.
+ *
+ * This does not use any kind of deduplication (compared to histogram
+ * serialization), as we don't expect the same efficiency here.
+ *
+ * This simply writes an MCV header (number of items, ...) and then the
+ * Datum values for all attributes of each item, followed by the item frequency
+ * (as a double).
+ */
+static bytea *
+serialize_mv_mcvlist(MCVList mcvlist)
+{
+ int i;
+
+ /* varlena header, MCV header, plus (ndimensions Datums + a double) per item */
+ Size len = VARHDRSZ + offsetof(MCVListData, items) + mcvlist->nitems * (sizeof(Datum) * mcvlist->ndimensions + sizeof(double));
+
+ bytea * output = (bytea*)palloc0(len);
+
+ char * tmp = VARDATA(output);
+
+ SET_VARSIZE(output, len);
+
+ /* first, store the number of dimensions / items */
+ memcpy(tmp, mcvlist, offsetof(MCVListData, items));
+ tmp += offsetof(MCVListData, items);
+
+ /* now, walk through the items and store values + frequency for each MCV item */
+ for (i = 0; i < mcvlist->nitems; i++)
+ {
+ memcpy(tmp, mcvlist->items[i]->values, mcvlist->ndimensions * sizeof(Datum));
+ tmp += mcvlist->ndimensions * sizeof(Datum);
+
+ memcpy(tmp, &mcvlist->items[i]->frequency, sizeof(double));
+ tmp += sizeof(double);
+ }
+
+ return output;
+
+}
+
+MCVList
+deserialize_mv_mcvlist(bytea *data)
+{
+ int i;
+ Size expected_size;
+ MCVList mcvlist;
+ char *tmp;
+
+ if (data == NULL)
+ return NULL;
+
+ if (VARSIZE_ANY_EXHDR(data) < offsetof(MCVListData,items))
+ elog(ERROR, "invalid MCV size %ld (expected at least %ld)",
+ VARSIZE_ANY_EXHDR(data), offsetof(MCVListData,items));
+
+ /* read the MCV list header */
+ mcvlist = (MCVList)palloc0(sizeof(MCVListData));
+
+ /* initialize pointer to the data part (skip the varlena header) */
+ tmp = VARDATA(data);
+
+ /* get the header and perform basic sanity checks */
+ memcpy(mcvlist, tmp, offsetof(MCVListData,items));
+ tmp += offsetof(MCVListData,items);
+
+ if (mcvlist->magic != MVSTAT_MCV_MAGIC)
+ elog(ERROR, "invalid MCV magic %d (expected %d)",
+ mcvlist->magic, MVSTAT_MCV_MAGIC);
+
+ if (mcvlist->type != MVSTAT_MCV_TYPE_BASIC)
+ elog(ERROR, "invalid MCV type %d (expected %d)",
+ mcvlist->type, MVSTAT_MCV_TYPE_BASIC);
+
+ Assert(mcvlist->nitems > 0);
+ Assert((mcvlist->ndimensions >= 2) && (mcvlist->ndimensions <= MVSTATS_MAX_DIMENSIONS));
+
+ /* what bytea size do we expect for those parameters */
+ expected_size = offsetof(MCVListData,items) +
+ mcvlist->nitems * (sizeof(Datum) * mcvlist->ndimensions + sizeof(double));
+
+ if (VARSIZE_ANY_EXHDR(data) != expected_size)
+ elog(ERROR, "invalid MCV size %ld (expected %ld)",
+ VARSIZE_ANY_EXHDR(data), expected_size);
+
+ /* allocate space for the MCV items */
+ mcvlist->items = (MCVItem*)palloc0(sizeof(MCVItem) * mcvlist->nitems);
+
+ for (i = 0; i < mcvlist->nitems; i++)
+ {
+ MCVItem item = (MCVItem)palloc0(offsetof(MCVItemData, values) +
+ mcvlist->ndimensions * sizeof(Datum));
+
+ memcpy(item->values, tmp, mcvlist->ndimensions * sizeof(Datum));
+ tmp += mcvlist->ndimensions * sizeof(Datum);
+
+ memcpy(&item->frequency, tmp, sizeof(double));
+ tmp += sizeof(double);
+
+ mcvlist->items[i] = item;
+ }
+
+ return mcvlist;
+}
+
+static void
+update_mv_stats(Oid mvoid, MVHistogram histogram, MCVList mcvlist)
+{
+ HeapTuple stup,
+ oldtup;
+ Datum values[Natts_pg_mv_statistic];
+ bool nulls[Natts_pg_mv_statistic];
+ bool replaces[Natts_pg_mv_statistic];
+
+ Relation sd = heap_open(MvStatisticRelationId, RowExclusiveLock);
+
+ memset(nulls, 1, Natts_pg_mv_statistic * sizeof(bool));
+ memset(replaces, 0, Natts_pg_mv_statistic * sizeof(bool));
+ memset(values, 0, Natts_pg_mv_statistic * sizeof(Datum));
+
+ /*
+ * Construct a new pg_mv_statistic tuple - replace only the histogram
+ * and MCV list, depending on whether each was actually computed.
+ */
+ if (histogram != NULL)
+ {
+ nulls[Anum_pg_mv_statistic_stahist-1] = false;
+ values[Anum_pg_mv_statistic_stahist - 1]
+ = PointerGetDatum(serialize_mv_histogram(histogram));
+ }
+
+ if (mcvlist != NULL)
+ {
+ nulls[Anum_pg_mv_statistic_stamcv -1] = false;
+ values[Anum_pg_mv_statistic_stamcv - 1]
+ = PointerGetDatum(serialize_mv_mcvlist(mcvlist));
+ }
+
+ /* always replace the value (either by bytea or NULL) */
+ replaces[Anum_pg_mv_statistic_stahist-1] = true;
+ replaces[Anum_pg_mv_statistic_stamcv -1] = true;
+
+ /* always change the availability flags */
+ nulls[Anum_pg_mv_statistic_hist_built-1] = false;
+ nulls[Anum_pg_mv_statistic_mcv_built -1] = false;
+
+ replaces[Anum_pg_mv_statistic_hist_built -1] = true;
+ replaces[Anum_pg_mv_statistic_mcv_built -1] = true;
+
+ values[Anum_pg_mv_statistic_hist_built -1] = BoolGetDatum(histogram != NULL);
+ values[Anum_pg_mv_statistic_mcv_built -1] = BoolGetDatum(mcvlist != NULL);
+
+ /* Is there already a pg_mv_statistic tuple for these stats? */
+ oldtup = SearchSysCache1(MVSTATOID,
+ ObjectIdGetDatum(mvoid));
+
+ if (HeapTupleIsValid(oldtup))
+ {
+ /* Yes, replace it */
+ stup = heap_modify_tuple(oldtup,
+ RelationGetDescr(sd),
+ values,
+ nulls,
+ replaces);
+ ReleaseSysCache(oldtup);
+ simple_heap_update(sd, &stup->t_self, stup);
+ }
+ else
+ elog(ERROR, "invalid pg_mv_statistic record (oid=%d)", mvoid);
+
+ /* update indexes too */
+ CatalogUpdateIndexes(sd, stup);
+
+ heap_freetuple(stup);
+
+ heap_close(sd, RowExclusiveLock);
+}
+
+
+/* MV stats */
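+
+/*
+ * Functions below inspect the serialized stats from the SQL level. For
+ * illustration (hypothetical query, assuming the matching pg_proc
+ * entries defined elsewhere in the patch):
+ *
+ * SELECT pg_mv_stats_mvclist_info(stamcv)
+ * FROM pg_mv_statistic WHERE starelid = 'test'::regclass;
+ *
+ * which prints something like "nitems=27".
+ */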
+
+Datum
+pg_mv_stats_histogram_info(PG_FUNCTION_ARGS)
+{
+ bytea *data = PG_GETARG_BYTEA_P(0);
+ char *result;
+
+ MVHistogram hist = deserialize_mv_histogram(data);
+
+ result = palloc0(128);
+ snprintf(result, 128, "nbuckets=%d", hist->nbuckets);
+
+ PG_RETURN_TEXT_P(cstring_to_text(result));
+}
+
+Datum
+pg_mv_stats_mvclist_info(PG_FUNCTION_ARGS)
+{
+ bytea *data = PG_GETARG_BYTEA_P(0);
+ char *result;
+
+ MCVList mcvlist = deserialize_mv_mcvlist(data);
+
+ result = palloc0(128);
+ snprintf(result, 128, "nitems=%d", mcvlist->nitems);
+
+ pfree(mcvlist);
+
+ PG_RETURN_TEXT_P(cstring_to_text(result));
+}
+
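+/*
+ * Dump the first two dimensions of the histogram buckets as gnuplot
+ * "rect" objects, producing lines like these (values made up):
+ *
+ * set object 1 rect from 0,0 to 4999,5199 lw 1
+ * set object 2 rect from 5000,0 to 9999,4799 lw 1
+ *
+ * The output can be pasted into a gnuplot script to visualize the
+ * buckets.
+ */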
+Datum
+pg_mv_stats_histogram_gnuplot(PG_FUNCTION_ARGS)
+{
+ int i = 0;
+
+ /* FIXME handle the length properly, e.g. by using a StringInfo */
+ Size len = 1024*1024;
+ char *buffer = palloc0(len);
+ char *str = buffer;
+ bytea *data = PG_GETARG_BYTEA_P(0);
+
+ MVHistogram hist = deserialize_mv_histogram(data);
+
+ for (i = 0; i < hist->nbuckets; i++)
+ {
+ str += snprintf(str, len - (str - buffer),
+ "set object %d rect from %ld,%ld to %ld,%ld lw 1\n",
+ (i+1),
+ hist->buckets[i]->min[0], hist->buckets[i]->min[1],
+ hist->buckets[i]->max[0], hist->buckets[i]->max[1]);
+ }
+
+ PG_RETURN_TEXT_P(cstring_to_text(buffer));
+
+}
+
+bytea *
+fetch_mv_histogram(Oid mvoid)
+{
+ Relation indrel;
+ SysScanDesc indscan;
+ ScanKeyData skey;
+ HeapTuple htup;
+ bytea *stahist = NULL;
+
+ /* Prepare to scan pg_mv_statistic for the entry with the given OID. */
+ ScanKeyInit(&skey,
+ ObjectIdAttributeNumber,
+ BTEqualStrategyNumber, F_OIDEQ,
+ ObjectIdGetDatum(mvoid));
+
+ indrel = heap_open(MvStatisticRelationId, AccessShareLock);
+ indscan = systable_beginscan(indrel, MvStatisticOidIndexId, true,
+ NULL, 1, &skey);
+
+ while (HeapTupleIsValid(htup = systable_getnext(indscan)))
+ {
+ bool isnull = false;
+ Datum hist = SysCacheGetAttr(MVSTATOID, htup,
+ Anum_pg_mv_statistic_stahist, &isnull);
+
+ Assert(!isnull);
+
+ stahist = DatumGetByteaP(hist);
+
+ break;
+ }
+
+ systable_endscan(indscan);
+
+ heap_close(indrel, AccessShareLock);
+
+ /* TODO Maybe save the result in the relcache, similarly to
+ * RelationGetIndexList (which inspired this function). */
+
+ return stahist;
+}
+
+bytea *
+fetch_mv_mcvlist(Oid mvoid)
+{
+ Relation indrel;
+ SysScanDesc indscan;
+ ScanKeyData skey;
+ HeapTuple htup;
+ bytea *mcvlist = NULL;
+
+ /* Prepare to scan pg_mv_statistic for the entry with the given OID. */
+ ScanKeyInit(&skey,
+ ObjectIdAttributeNumber,
+ BTEqualStrategyNumber, F_OIDEQ,
+ ObjectIdGetDatum(mvoid));
+
+ indrel = heap_open(MvStatisticRelationId, AccessShareLock);
+ indscan = systable_beginscan(indrel, MvStatisticOidIndexId, true,
+ NULL, 1, &skey);
+
+ while (HeapTupleIsValid(htup = systable_getnext(indscan)))
+ {
+ bool isnull = false;
+ Datum tmp = SysCacheGetAttr(MVSTATOID, htup,
+ Anum_pg_mv_statistic_stamcv, &isnull);
+
+ Assert(!isnull);
+
+ mcvlist = DatumGetByteaP(tmp);
+
+ break;
+ }
+
+ systable_endscan(indscan);
+
+ heap_close(indrel, AccessShareLock);
+
+ /* TODO Maybe save the result in the relcache, similarly to
+ * RelationGetIndexList (which inspired this function). */
+
+ return mcvlist;
+}
+
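+/*
+ * Translate an attribute number to the index (dimension) within the
+ * sorted stakeys vector, e.g. for stakeys = [2, 5, 7] the attribute
+ * number 5 translates to dimension 1 (the values are illustrative).
+ */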
+int
+mv_get_index(AttrNumber varattno, int2vector * stakeys)
+{
+ int i, idx = 0;
+ for (i = 0; i < stakeys->dim1; i++)
+ {
+ if (stakeys->values[i] < varattno)
+ idx += 1;
+ else
+ break;
+ }
+ return idx;
+}
+
+/*
+ * Build a multivariate histogram. In short, this first creates a single
+ * bucket containing all the rows, and then repeatedly splits it by
+ * searching for the bucket / dimension most in need of a split.
+ *
+ * The current criterion is rather simple - it looks at the number of
+ * distinct values (combinations of column values for a bucket, column
+ * values for a dimension). This is somewhat naive, but seems to work
+ * quite well. See the discussion at select_bucket_to_partition and
+ * partition_bucket for more details about alternative algorithms.
+ *
+ * So the current algorithm looks like this:
+ *
+ * while [not reaching maximum number of buckets]
+ *
+ * choose bucket to partition (max distinct combinations)
+ * if no bucket to partition
+ * terminate the algorithm
+ *
+ * choose bucket dimension to partition (max distinct values)
+ * split the bucket into two buckets
+ *
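+ * As an illustration (made-up numbers): a sample where column "a" has
+ * 1500 distinct values and "b" has 20 starts as a single bucket; the
+ * loop keeps picking the bucket with the most distinct combinations
+ * and splits it along the dimension with the most distinct values
+ * (mostly "a" here), until MVHIST_MAX_BUCKETS buckets exist or no
+ * bucket can be split further.
+ *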
+ */
+static MVHistogram
+build_mv_histogram(int numrows, HeapTuple *rows, int2vector *attrs,
+ int attr_cnt, VacAttrStats **vacattrstats,
+ int numrows_total)
+{
+ int i;
+ int ndistinct;
+ int numattrs = attrs->dim1;
+ int ndistincts[numattrs];
+
+ MVHistogram histogram = (MVHistogram)palloc0(sizeof(MVHistogramData));
+
+ HeapTuple * rows_copy = (HeapTuple*)palloc0(numrows * sizeof(HeapTuple));
+ memcpy(rows_copy, rows, sizeof(HeapTuple) * numrows);
+
+ Assert((numattrs >= 2) && (numattrs <= MVSTATS_MAX_DIMENSIONS));
+
+ histogram->ndimensions = numattrs;
+
+ histogram->magic = MVHIST_MAGIC;
+ histogram->type = MVHIST_TYPE_BASIC;
+ histogram->nbuckets = 1;
+
+ /* create max buckets (better than repalloc for short-lived objects) */
+ histogram->buckets = (MVBucket*)palloc0(MVHIST_MAX_BUCKETS * sizeof(MVBucket));
+
+ /* create the initial bucket, covering the whole sample set */
+ histogram->buckets[0] = create_initial_mv_bucket(numrows, rows_copy, attrs,
+ attr_cnt, vacattrstats);
+
+ ndistinct = histogram->buckets[0]->ndistinct;
+
+ /* keep the global ndistinct values */
+ for (i = 0; i < numattrs; i++)
+ ndistincts[i] = histogram->buckets[0]->ndistincts[i];
+
+ while (histogram->nbuckets < MVHIST_MAX_BUCKETS)
+ {
+ MVBucket bucket = select_bucket_to_partition(histogram->nbuckets, histogram->buckets);
+
+ /* no more buckets to partition */
+ if (bucket == NULL)
+ break;
+
+ histogram->buckets[histogram->nbuckets] = partition_bucket(bucket, attrs,
+ attr_cnt, vacattrstats);
+
+ histogram->nbuckets += 1;
+ }
+
+ /*
+ * FIXME store the histogram in a catalog in a serialized form (simple for
+ * pass-by-value, more complicated for buckets on varlena types)
+ */
+ for (i = 0; i < histogram->nbuckets; i++)
+ {
+ int d;
+ histogram->buckets[i]->ntuples = (histogram->buckets[i]->numrows * 1.0) / numrows_total;
+ histogram->buckets[i]->ndistinct = (histogram->buckets[i]->ndistinct * 1.0) / ndistinct;
+
+ for (d = 0; d < numattrs; d++)
+ histogram->buckets[i]->ndistincts[d] = (histogram->buckets[i]->ndistincts[d] * 1.0) / ndistincts[d];
+ }
+
+ return histogram;
+
+}
+
+/*
+ * Mine associations between the columns, in the form (A => B).
+ *
+ * At the moment this only works for associations between two columns,
+ * but it might be useful to mine for rules involving multiple columns
+ * on the left side. That is rules [A,B] => C and so on. Handling
+ * multiple columns on the right side is not necessary, because such
+ * rules may be decomposed into a set of rules, one for each column.
+ * I.e. A => [B,C] is exactly the same as (A => B) & (A => C).
+ *
+ * Those rules don't immediately identify redundant clauses, because the
+ * user may choose "incompatible conditions" (e.g. by using a zip code
+ * and a mismatching city) and so on. This should however be easy to
+ * identify from a histogram, because the conditions will match a bucket
+ * with low frequencies.
+ *
+ * The question is whether this can be useful when we have a histogram,
+ * because such incompatible conditions should result in not matching
+ * any buckets (or matching only buckets with low frequencies).
+ *
+ * The problem is that histograms only work like this when the sort
+ * order is compatible with the meaning of the data. We often use data
+ * types that support sorting (e.g. INT, BIGINT) as a kind of label,
+ * where the sort order does not make much sense. Sorting by ZIP code
+ * will order the cities quite randomly, and similarly for most
+ * surrogate primary / foreign keys. In such cases the histograms are
+ * pretty useless.
+ *
+ * So, a good approach might be testing the independence of the data
+ * (by building a contingency table) and building the MV histogram only
+ * when there's a dependency. For the 'label' data this should detect
+ * that the histogram is useless, so we won't build it (and we may use
+ * that as a sign supporting the association rule).
+ *
+ * Another option is to look at selectivity of A and B separately, and
+ * then use the minimum of those.
+ *
+ * TODO investigate using histogram and MCV list to confirm the
+ * associative rule
+ *
+ * TODO investigate statistical testing of the distribution (to decide
+ * whether it makes sense to build the histogram)
+ *
+ * TODO Using a min/max of selectivities would probably make more sense
+ * for the associated columns.
+ */
+static void
+build_mv_associations(int numrows, HeapTuple *rows, int2vector *attrs,
+ int natts, VacAttrStats **vacattrstats)
+{
+ int i;
+ bool isNull;
+ Size len = 2 * sizeof(Datum); /* only simple associations a => b */
+ int numattrs = attrs->dim1;
+
+ /* TODO Maybe this should be related to the number of distinct
+ * values in the two columns we're currently analyzing. Assuming
+ * the distribution is uniform, we can compute the average group
+ * size we'd expect to observe in the sample, and then use that
+ * as a threshold. That seems better than a static value.
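+ *
+ * For example (made-up numbers): with a 30000-row sample and 1000
+ * distinct values in the first column, the expected group size is
+ * about 30 rows, suggesting a threshold around 30 rather than the
+ * fixed 10 used below.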
+ */
+ int min_group_size = 10;
+
+ /* dimension indexes we'll check for associations [a => b] */
+ int dima, dimb;
+
+ /* info for the interesting attributes only
+ *
+ * TODO Compute this only once and pass it to all the methods
+ * that need it.
+ */
+ VacAttrStats **stats = lookup_var_attr_stats(attrs, natts, vacattrstats);
+
+ /* We'll reuse the same array for all the combinations */
+ Datum * values = (Datum*)palloc0(numrows * 2 * sizeof(Datum));
+
+ Assert(numattrs >= 2);
+
+ for (dima = 0; dima < numattrs; dima++)
+ {
+
+ for (dimb = 0; dimb < numattrs; dimb++)
+ {
+
+ int supporting = 0;
+ int contradicting = 0;
+
+ Datum val_a, val_b;
+ int violations = 0;
+ int group_size = 0;
+
+ int supporting_rows = 0;
+
+ /* skip (dima==dimb) */
+ if (dima == dimb)
+ continue;
+
+ /*
+ * FIXME Not sure if this handles NULL values properly (not sure
+ * how to do that). We assume that NULL means 0 for now,
+ * handling it just like any other value.
+ */
+ for (i = 0; i < numrows; i++)
+ {
+ values[i*2] = heap_getattr(rows[i], attrs->values[dima], stats[dima]->tupDesc, &isNull);
+ values[i*2+1] = heap_getattr(rows[i], attrs->values[dimb], stats[dimb]->tupDesc, &isNull);
+ }
+
+ qsort_arg((void *) values, numrows, sizeof(Datum) * 2, compare_scalars_memcmp, &len);
+
+ /*
+ * Walk through the array, split it into groups according to
+ * the A value, and count distinct B values in each group.
+ * If there's a single B value for the whole group, we count
+ * it as supporting the association, otherwise we count it
+ * as contradicting.
+ *
+ * Furthermore we require a group to have at least a certain
+ * number of rows to be counted as supporting. Contradicting
+ * groups are counted regardless of their size.
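+ *
+ * Example (hypothetical [zip => city] data), after sorting:
+ *
+ * (10001, NYC) x 25 - single B value, large group => supporting
+ * (10002, NYC), (10002, Albany) - two B values => contradicting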
+ */
+
+ /* start with values from the first row */
+ val_a = values[0];
+ val_b = values[1];
+ group_size = 1;
+
+ for (i = 1; i < numrows; i++)
+ {
+ if (values[2*i] != val_a) /* end of the group */
+ {
+ /*
+ * If there are no contradicting rows, count it as
+ * supporting (otherwise contradicting), but only if
+ * the group is large enough.
+ *
+ * The requirement of a minimum group size makes it
+ * impossible to identify [unique,unique] cases, but
+ * that's probably a different case. This is more
+ * about [zip => city] associations etc.
+ */
+ supporting += ((violations == 0) && (group_size >= min_group_size)) ? 1 : 0;
+ contradicting += (violations != 0) ? 1 : 0;
+
+ supporting_rows += ((violations == 0) && (group_size >= min_group_size)) ? group_size : 0;
+
+ /* current values start a new group */
+ val_a = values[2*i];
+ val_b = values[2*i+1];
+ violations = 0;
+ group_size = 1;
+ }
+ else
+ {
+ if (values[2*i+1] != val_b) /* mismatch of a B value */
+ {
+ val_b = values[2*i+1];
+ violations += 1;
+ }
+
+ group_size += 1;
+ }
+ }
+
+ /* handle the last group (FIXME: duplicates the logic above) */
+ supporting += ((violations == 0) && (group_size >= min_group_size)) ? 1 : 0;
+ contradicting += (violations != 0) ? 1 : 0;
+ supporting_rows += ((violations == 0) && (group_size >= min_group_size)) ? group_size : 0;
+
+ /*
+ * See if the number of rows supporting the association is at least
+ * 10x the number of rows violating the hypothetical rule.
+ *
+ * TODO This is rather arbitrary limit - I guess it's possible to do
+ * some math to come up with a better rule (e.g. testing a hypothesis
+ * 'this is due to randomness'). We can create a contingency table
+ * from the values and use it for testing. Possibly only when
+ * there are no contradicting rows?
+ *
+ * TODO Also, if (a => b) and (b => a) at the same time, it pretty much
+ * means the columns have the same values (or one is a 'label'),
+ * making the conditions rather redundant. Although it's possible
+ * that the query uses an incompatible combination of values.
+ */
+ if (supporting_rows > (numrows - supporting_rows) * 10)
+ {
+ // elog(WARNING, "%d => %d : supporting=%d contradicting=%d", dima, dimb, supporting, contradicting);
+ }
+
+ }
+ }
+
+ pfree(values);
+
+}
+
+/*
+ * Compute the list of most common items, where an item is a combination
+ * of values for all the columns. For a small number of distinct values,
+ * we may be able to represent the distribution quite exactly, with
+ * per-item statistics.
+ *
+ * If we can represent the distribution using a MCV list only, it's great
+ * because that allows much better estimates (especially for equality).
+ * Such discrete distributions are also easier to combine (more
+ * efficient and more accurate) than when using histograms.
+ *
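+ * For example (hypothetical): columns like (country, currency) have
+ * only a few hundred distinct combinations, so a MCV list may capture
+ * the distribution exactly, while a histogram only approximates it.
+ *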
+ * FIXME This does not handle NULL values at the moment.
+ *
+ * TODO When computing equality selectivity (a=1 AND b=2), we can do that
+ * pretty exactly assuming (a) we hit a MCV item and (b) the
+ * histogram is built on those two columns only (i.e. there are no
+ * other columns). In that case we can estimate the selectivity
+ * using only the MCV.
+ *
+ * When we don't hit a MCV item, we can use the frequency of the
+ * least probable MCV item as upper bound of the selectivity
+ * (otherwise it'd get into the MCV list). Again, this only works
+ * when the histogram size matches the restricted columns.
+ *
+ * When the histogram is larger (i.e. there are additional columns),
+ * we can't be sure how the selectivity is distributed among the MCV
+ * list and the histogram (we may get several MCV items matching
+ * the conditions and several histogram buckets at the same time).
+ *
+ * In this case we can probably clamp the selectivity by minimum of
+ * selectivities for each condition. For example if we know the
+ * number of distinct values for each column, we can use 1/ndistinct
+ * as a per-column estimate. Or rather 1/ndistinct + selectivity
+ * derived from the MCV list.
+ *
+ * If there's no histogram (thus the distribution is approximated
+ * only by the MCV list), the size of the stats (whether there are
+ * some other columns, not referenced in the conditions) does not
+ * matter. We can do pretty accurate estimation using the MCV.
+ *
+ * TODO Currently there's no logic to consider building only a MCV list
+ * (and not building the histogram at all).
+ *
+ * TODO For types that don't reasonably support ordering (either because
+ * the type does not support that or when the user adds some option
+ * to the ADD STATISTICS command - e.g. UNSORTED_STATS), building
+ * the histogram may be pointless and inefficient. This is esp.
+ * true for varlena types that may be quite large and a large MCV
+ * list may be a better choice, because it makes equality estimates
+ * more accurate. Due to the unsorted nature, range queries on those
+ * attributes are rather useless anyway.
+ *
+ * Another thing is that by restricting to MCV list and equality
+ * conditions, we can use hash values instead of long varlena values.
+ * The equality estimation will be very accurate.
+ *
+ * This however complicates matching the columns to available
+ * statistics, as it will require matching clauses (not columns) to
+ * stats. And it may get quite complex - e.g. what if there are
+ * multiple clauses, each compatible with different stats subset?
+ *
+ * FIXME Create a special-purpose type for MCV items (instead of a plain
+ * Datum array, which is very difficult to work with).
+ */
+static MCVList
+build_mv_mcvlist(int numrows, HeapTuple *rows, int2vector *attrs,
+ int natts, VacAttrStats **vacattrstats,
+ int *numrows_filtered)
+{
+ int i, j, idx = 0;
+ int numattrs = attrs->dim1;
+ Size len = sizeof(Datum) * numattrs;
+ bool isNull;
+ int ndistinct = 0;
+ int mcv_threshold = 0;
+ int count = 0;
+ int nitems = 0;
+
+ MCVList mcvlist = NULL;
+
+ VacAttrStats **stats = lookup_var_attr_stats(attrs, natts, vacattrstats);
+
+ /*
+ * We could have collected this while walking through the attributes
+ * earlier (as it is, we have to call heap_getattr twice).
+ *
+ * TODO We're using Datum (8B), even for data types smaller than this
+ * (notably int4 and float4). Maybe we could save some space here,
+ * although it seems the bytea compression will handle it just fine.
+ */
+ Datum * values = palloc0(numrows * numattrs * sizeof(Datum));
+
+ for (j = 0; j < numrows; j++)
+ for (i = 0; i < numattrs; i++)
+ values[idx++] = heap_getattr(rows[j], attrs->values[i], stats[i]->tupDesc, &isNull);
+
+ qsort_arg((void *) values, numrows, sizeof(Datum) * numattrs, compare_scalars_memcmp, &len);
+
+ /*
+ * Count the number of distinct values - we need this to determine
+ * the threshold (125% of the average frequency).
+ */
+ ndistinct = 1;
+ for (i = 1; i < numrows; i++)
+ if (memcmp(&values[i * numattrs], &values[(i-1) * numattrs], len) != 0)
+ ndistinct += 1;
+
+ /*
+ * Determine how many groups actually exceed the threshold, and then
+ * walk the array again and collect them into an array.
+ *
+ * TODO for now the threshold is the same as in the single-column
+ * case (average + 25%), but maybe that's worth revisiting
+ *
+ * TODO see if we can fit all the distinct values in the MCV list
+ */
+ mcv_threshold = 1.25 * numrows / ndistinct;
+ mcv_threshold = (mcv_threshold < 4) ? 4 : mcv_threshold;
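+
+ /*
+ * For example (made-up numbers): with numrows = 30000 and
+ * ndistinct = 100 the threshold is 1.25 * 30000 / 100 = 375,
+ * so only groups with at least 375 rows in the sample become
+ * MCV items.
+ */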
+
+ /*
+ * If there are fewer distinct values than MVSTAT_MCVLIST_MAX_ITEMS,
+ * store all groups with at least two rows in the sample.
+ *
+ * FIXME We can do this only if we believe we got all the distinct
+ * values of the table.
+ */
+ if (ndistinct <= MVSTAT_MCVLIST_MAX_ITEMS)
+ mcv_threshold = 2;
+
+ count = 1;
+ for (i = 1; i <= numrows; i++)
+ {
+ /* last row or a new group */
+ if ((i == numrows) || (memcmp(&values[i * numattrs], &values[(i-1) * numattrs], len) != 0))
+ {
+ /* count the MCV item if exceeding the threshold */
+ if (count >= mcv_threshold)
+ nitems += 1;
+
+ count = 1;
+ }
+ else /* same group, just increase the number of items */
+ count += 1;
+ }
+
+ /* by default we keep all the rows (even if there's no MCV list) */
+ *numrows_filtered = numrows;
+
+ /* we know the number of mcvitems, now collect them in a 2nd pass */
+ if (nitems > 0)
+ {
+ /* each item stores the frequency in addition to the values, so (numattrs + 1) fields */
+ mcvlist = (MCVList)palloc0(sizeof(MCVListData));
+
+ mcvlist->magic = MVSTAT_MCV_MAGIC;
+ mcvlist->type = MVSTAT_MCV_TYPE_BASIC;
+ mcvlist->ndimensions = numattrs;
+ mcvlist->nitems = nitems;
+ mcvlist->items = (MCVItem*)palloc0(sizeof(MCVItem)*nitems);
+
+ /* now repeat the same loop as above, but this time copy the data
+ * for items exceeding the threshold */
+ count = 1;
+ nitems = 0;
+ for (i = 1; i <= numrows; i++)
+ {
+
+ /* last row or a new group */
+ if ((i == numrows) || (memcmp(&values[i * numattrs], &values[(i-1) * numattrs], len) != 0))
+ {
+ /* count the MCV item if exceeding the threshold (and copy into the array) */
+ if (count >= mcv_threshold)
+ {
+ /* first, allocate the item (with the proper size of values) */
+ MCVItem item = (MCVItem)palloc0(offsetof(MCVItemData, values) +
+ sizeof(Datum)*mcvlist->ndimensions);
+
+ /* then copy values from the _previous_ group */
+ memcpy(item->values, &values[(i-1)*numattrs], len);
+
+ /* and finally the group frequency */
+ item->frequency = (double)count / numrows;
+
+ mcvlist->items[nitems] = item;
+ nitems += 1;
+ }
+
+ count = 1;
+ }
+ else /* same group, just increase the number of items */
+ count += 1;
+ }
+
+ /* make sure the loops are consistent */
+ Assert(nitems == mcvlist->nitems);
+
+ /*
+ * Remove the rows matching the MCV items.
+ *
+ * FIXME This implementation is rather naive, effectively O(N^2).
+ * As the MCV list grows, the check will take longer and
+ * longer. And as the number of sampled rows increases (by
+ * increasing statistics target), it will take longer and
+ * longer. One option is to sort the MCV items first and
+ * then perform a binary search.
+ */
+ if (nitems == ndistinct) /* all rows are covered by MCV items */
+ *numrows_filtered = 0;
+ else /* (nitems < ndistinct) && (nitems > 0) */
+ {
+ int nfiltered = 0;
+ HeapTuple *rows_filtered = (HeapTuple*)palloc0(sizeof(HeapTuple) * numrows);
+
+ /* walk through the tuples, compare the values to MCV items */
+ for (i = 0; i < numrows; i++)
+ {
+ bool match = false;
+ Datum keys[numattrs];
+
+ /* collect the key values */
+ for (j = 0; j < numattrs; j++)
+ keys[j] = heap_getattr(rows[i], attrs->values[j], stats[j]->tupDesc, &isNull);
+
+ /* scan through the MCV list for matches */
+ for (j = 0; j < mcvlist->nitems; j++)
+ if (memcmp(keys, mcvlist->items[j]->values, sizeof(Datum)*numattrs) == 0)
+ {
+ match = true;
+ break;
+ }
+
+ /* if no match in the MCV list, copy the row into the filtered ones */
+ if (! match)
+ memcpy(&rows_filtered[nfiltered++], &rows[i], sizeof(HeapTuple));
+ }
+
+ /* replace the first part */
+ memcpy(rows, rows_filtered, sizeof(HeapTuple) * nfiltered);
+ *numrows_filtered = nfiltered;
+
+ pfree(rows_filtered);
+
+ }
+ }
+
+ pfree(values);
+
+ /*
+ * TODO Single-dimensional MCV is stored sorted by frequency (descending).
+ * Maybe this should be stored like that too?
+ */
+
+ return mcvlist;
+}
+
+/* multi-variate stats comparator */
+
+/*
+ * qsort_arg comparator for sorting Datums (MV stats)
+ *
+ * This does not maintain the tupnoLink array.
+ */
+static int
+compare_scalars_simple(const void *a, const void *b, void *arg)
+{
+ Datum da = *(Datum*)a;
+ Datum db = *(Datum*)b;
+ SortSupport ssup= (SortSupport) arg;
+
+ return ApplySortComparator(da, false, db, false, ssup);
+}
+
+/*
+ * qsort_arg comparator for sorting data when partitioning a MV bucket
+ */
+static int
+compare_scalars_partition(const void *a, const void *b, void *arg)
+{
+ Datum da = ((ScalarItem*)a)->value;
+ Datum db = ((ScalarItem*)b)->value;
+ SortSupport ssup= (SortSupport) arg;
+
+ return ApplySortComparator(da, false, db, false, ssup);
+}
+
+/*
+ * qsort_arg comparator for sorting Datum[] (row of Datums) when
+ * counting distinct values.
+ */
+static int
+compare_scalars_memcmp(const void *a, const void *b, void *arg)
+{
+ Size len = *(Size*)arg;
+
+ return memcmp(a, b, len);
+}
+
+static int
+compare_scalars_memcmp_2(const void *a, const void *b)
+{
+ return memcmp(a, b, sizeof(Datum));
+}
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index cb16c53..28bad78 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -34,6 +34,7 @@
#include "catalog/pg_foreign_table.h"
#include "catalog/pg_inherits.h"
#include "catalog/pg_inherits_fn.h"
+#include "catalog/pg_mv_statistic.h"
#include "catalog/pg_namespace.h"
#include "catalog/pg_opclass.h"
#include "catalog/pg_rowsecurity.h"
@@ -89,7 +90,7 @@
#include "utils/syscache.h"
#include "utils/tqual.h"
#include "utils/typcache.h"
-
+#include "utils/mvstats.h"
/*
* ON COMMIT action list
@@ -137,8 +138,9 @@ static List *on_commits = NIL;
#define AT_PASS_ADD_COL 5 /* ADD COLUMN */
#define AT_PASS_ADD_INDEX 6 /* ADD indexes */
#define AT_PASS_ADD_CONSTR 7 /* ADD constraints, defaults */
-#define AT_PASS_MISC 8 /* other stuff */
-#define AT_NUM_PASSES 9
+#define AT_PASS_ADD_STATS 8 /* ADD statistics */
+#define AT_PASS_MISC 9 /* other stuff */
+#define AT_NUM_PASSES 10
typedef struct AlteredTableInfo
{
@@ -412,7 +414,8 @@ static void ATExecReplicaIdentity(Relation rel, ReplicaIdentityStmt *stmt, LOCKM
static void ATExecGenericOptions(Relation rel, List *options);
static void ATExecEnableRowSecurity(Relation rel);
static void ATExecDisableRowSecurity(Relation rel);
-
+static void ATExecAddStatistics(AlteredTableInfo *tab, Relation rel,
+ StatisticsDef *def, LOCKMODE lockmode);
static void copy_relation_data(SMgrRelation rel, SMgrRelation dst,
ForkNumber forkNum, char relpersistence);
static const char *storage_name(char c);
@@ -2963,6 +2966,7 @@ AlterTableGetLockLevel(List *cmds)
* updates.
*/
case AT_SetStatistics: /* Uses MVCC in getTableAttrs() */
+ case AT_AddStatistics: /* XXX not sure if the right level */
case AT_ClusterOn: /* Uses MVCC in getIndexes() */
case AT_DropCluster: /* Uses MVCC in getIndexes() */
case AT_SetOptions: /* Uses MVCC in getTableAttrs() */
@@ -3110,6 +3114,7 @@ ATPrepCmd(List **wqueue, Relation rel, AlterTableCmd *cmd,
pass = AT_PASS_ADD_CONSTR;
break;
case AT_SetStatistics: /* ALTER COLUMN SET STATISTICS */
+ case AT_AddStatistics: /* XXX maybe not the right place */
ATSimpleRecursion(wqueue, rel, cmd, recurse, lockmode);
/* Performs own permission checks */
ATPrepSetStatistics(rel, cmd->name, cmd->def, lockmode);
@@ -3405,6 +3410,9 @@ ATExecCmd(List **wqueue, AlteredTableInfo *tab, Relation rel,
case AT_SetStatistics: /* ALTER COLUMN SET STATISTICS */
ATExecSetStatistics(rel, cmd->name, cmd->def, lockmode);
break;
+ case AT_AddStatistics: /* ADD STATISTICS */
+ ATExecAddStatistics(tab, rel, (StatisticsDef *) cmd->def, lockmode);
+ break;
case AT_SetOptions: /* ALTER COLUMN SET ( options ) */
ATExecSetOptions(rel, cmd->name, cmd->def, false, lockmode);
break;
@@ -11614,3 +11622,197 @@ RangeVarCallbackForAlterRelation(const RangeVar *rv, Oid relid, Oid oldrelid,
ReleaseSysCache(tuple);
}
+
+/* used for sorting the attnums in ATExecAddStatistics */
+static int compare_int16(const void *a, const void *b)
+{
+ return memcmp(a, b, sizeof(int16));
+}
+
+/*
+ * Implements the ALTER TABLE ... ADD STATISTICS (options) ON (columns).
+ *
+ * The code is an unholy mix of pieces that really belong to other parts
+ * of the source tree.
+ *
+ * FIXME Check that the types are pass-by-value and support sort,
+ * although maybe we can live without the sort (and only build
+ * MCV list / association rules).
+ *
+ * FIXME This should probably check for duplicate stats (i.e. same
+ * keys, same options). Although maybe it's useful to have
+ * multiple stats on the same columns with different options
+ * (say, a detailed MCV-only stats for some queries, histogram
+ * for others, etc.)
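+ *
+ * For illustration, a hypothetical invocation (using the option names
+ * parsed below; the exact grammar is defined elsewhere in the patch)
+ * might look like:
+ *
+ * ALTER TABLE test ADD STATISTICS (mcv true, max_buckets 2048)
+ * ON (a, b, c);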
+ */
+static void ATExecAddStatistics(AlteredTableInfo *tab, Relation rel,
+ StatisticsDef *def, LOCKMODE lockmode)
+{
+ int i, j;
+ ListCell *l;
+ int16 attnums[INDEX_MAX_KEYS];
+ Oid atttypids[INDEX_MAX_KEYS];
+ int numcols = 0;
+
+ Oid mvstatoid;
+ HeapTuple htup;
+ Datum values[Natts_pg_mv_statistic];
+ bool nulls[Natts_pg_mv_statistic];
+ int2vector *stakeys;
+ Relation mvstatrel;
+
+ /* by default build everything */
+ bool build_histogram = true,
+ build_mcv = true,
+ build_associations = true;
+
+ /* build regular MCV (not hashed by default) */
+ bool mcv_hashed = false;
+
+ int32 max_buckets = -1,
+ max_mcv_items = -1;
+
+ Assert(IsA(def, StatisticsDef));
+
+ /* transform the column names to attnum values */
+
+ foreach(l, def->keys)
+ {
+ char *attname = strVal(lfirst(l));
+ HeapTuple atttuple;
+
+ atttuple = SearchSysCacheAttName(RelationGetRelid(rel), attname);
+
+ if (!HeapTupleIsValid(atttuple))
+ ereport(ERROR,
+ (errcode(ERRCODE_UNDEFINED_COLUMN),
+ errmsg("column \"%s\" referenced in statistics does not exist",
+ attname)));
+
+ /* more than MVSTATS_MAX_DIMENSIONS columns not allowed */
+ if (numcols >= MVSTATS_MAX_DIMENSIONS)
+ ereport(ERROR,
+ (errcode(ERRCODE_TOO_MANY_COLUMNS),
+ errmsg("cannot have more than %d keys in a statistics",
+ MVSTATS_MAX_DIMENSIONS)));
+
+ attnums[numcols] = ((Form_pg_attribute) GETSTRUCT(atttuple))->attnum;
+ atttypids[numcols] = ((Form_pg_attribute) GETSTRUCT(atttuple))->atttypid;
+ ReleaseSysCache(atttuple);
+ numcols++;
+ }
+
+ /*
+ * Check the lower bound (at least 2 columns), the upper bound was
+ * already checked in the loop.
+ */
+ if (numcols < 2)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
+ errmsg("multivariate stats require 2 or more columns")));
+
+ /* look for duplicates */
+ for (i = 0; i < numcols; i++)
+ for (j = 0; j < numcols; j++)
+ if ((i != j) && (attnums[i] == attnums[j]))
+ ereport(ERROR,
+ (errcode(ERRCODE_DUPLICATE_COLUMN),
+ errmsg("duplicate column name in statistics definition")));
+
+ /* parse the statistics options */
+ foreach (l, def->options)
+ {
+ DefElem *opt = (DefElem*)lfirst(l);
+
+ if (strcmp(opt->defname, "histogram") == 0)
+ build_histogram = defGetBoolean(opt);
+ else if (strcmp(opt->defname, "mcv") == 0)
+ build_mcv = defGetBoolean(opt);
+ else if (strcmp(opt->defname, "mcv_hashed") == 0)
+ mcv_hashed = defGetBoolean(opt);
+ else if (strcmp(opt->defname, "associations") == 0)
+ build_associations = defGetBoolean(opt);
+ else if (strcmp(opt->defname, "max_buckets") == 0)
+ {
+ max_buckets = defGetInt32(opt);
+
+ /* TODO check that this is not used with 'histogram off' */
+
+ /* sanity check */
+ if (max_buckets < 1024)
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("minimum number of buckets is 1024")));
+
+ else if (max_buckets > 32768) /* FIXME use the proper constant */
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("maximum number of buckets is 32768")));
+
+ }
+ else if (strcmp(opt->defname, "max_mcv_items") == 0)
+ {
+ max_mcv_items = defGetInt32(opt);
+
+ /* TODO check that this is not used with 'mcv off' */
+
+ /* sanity check */
+ if (max_mcv_items < 0)
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("max number of MCV items must be non-negative")));
+
+ else if (max_mcv_items > 8192) /* FIXME use the proper constant */
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("max number of MCV items is 8192")));
+
+ }
+ else
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("unrecognized STATISTICS option \"%s\"",
+ opt->defname)));
+ }
+
+ /* sort the attnums and build int2vector */
+ qsort(attnums, numcols, sizeof(int16), compare_int16);
+ stakeys = buildint2vector(attnums, numcols);
+
+ /*
+ * Okay, let's create the pg_mv_statistic entry.
+ */
+ memset(values, 0, sizeof(values));
+ memset(nulls, false, sizeof(nulls));
+
+ /* no stats collected yet, so just the keys */
+ values[Anum_pg_mv_statistic_starelid-1] = ObjectIdGetDatum(RelationGetRelid(rel));
+
+ values[Anum_pg_mv_statistic_stakeys -1] = PointerGetDatum(stakeys);
+ values[Anum_pg_mv_statistic_hist_enabled -1] = BoolGetDatum(build_histogram);
+ values[Anum_pg_mv_statistic_mcv_enabled -1] = BoolGetDatum(build_mcv);
+ values[Anum_pg_mv_statistic_mcv_hashed -1] = BoolGetDatum(mcv_hashed);
+ values[Anum_pg_mv_statistic_assoc_enabled -1] = BoolGetDatum(build_associations);
+
+ values[Anum_pg_mv_statistic_hist_max_buckets -1] = Int32GetDatum(max_buckets);
+ values[Anum_pg_mv_statistic_mcv_max_items -1] = Int32GetDatum(max_mcv_items);
+
+ nulls[Anum_pg_mv_statistic_staassoc -1] = true;
+ nulls[Anum_pg_mv_statistic_stamcv -1] = true;
+ nulls[Anum_pg_mv_statistic_stahist -1] = true;
+
+ /* insert the tuple into pg_mv_statistic */
+ mvstatrel = heap_open(MvStatisticRelationId, RowExclusiveLock);
+
+ htup = heap_form_tuple(mvstatrel->rd_att, values, nulls);
+
+ mvstatoid = simple_heap_insert(mvstatrel, htup);
+
+ CatalogUpdateIndexes(mvstatrel, htup);
+
+ heap_freetuple(htup);
+
+ heap_close(mvstatrel, RowExclusiveLock);
+
+ return;
+}
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 225756c..18464b9 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -3879,6 +3879,17 @@ _copyAlterPolicyStmt(const AlterPolicyStmt *from)
return newnode;
}
+static StatisticsDef *
+_copyStatisticsDef(const StatisticsDef *from)
+{
+ StatisticsDef *newnode = makeNode(StatisticsDef);
+
+ COPY_NODE_FIELD(keys);
+ COPY_NODE_FIELD(options);
+
+ return newnode;
+}
+
/* ****************************************************************
* pg_list.h copy functions
* ****************************************************************
@@ -4690,6 +4701,9 @@ copyObject(const void *from)
case T_CommonTableExpr:
retval = _copyCommonTableExpr(from);
break;
+ case T_StatisticsDef:
+ retval = _copyStatisticsDef(from);
+ break;
case T_PrivGrantee:
retval = _copyPrivGrantee(from);
break;
@@ -4702,7 +4716,6 @@ copyObject(const void *from)
case T_XmlSerialize:
retval = _copyXmlSerialize(from);
break;
-
default:
elog(ERROR, "unrecognized node type: %d", (int) nodeTag(from));
retval = 0; /* keep compiler quiet */
diff --git a/src/backend/optimizer/path/clausesel.c b/src/backend/optimizer/path/clausesel.c
index 9b657fb..9c32735 100644
--- a/src/backend/optimizer/path/clausesel.c
+++ b/src/backend/optimizer/path/clausesel.c
@@ -24,6 +24,9 @@
#include "utils/lsyscache.h"
#include "utils/selfuncs.h"
+#include "utils/mvstats.h"
+#include "catalog/pg_collation.h"
+#include "utils/typcache.h"
/*
* Data structure for accumulating info about possible range-query
@@ -43,6 +46,23 @@ static void addRangeClause(RangeQueryClause **rqlist, Node *clause,
bool varonleft, bool isLTsel, Selectivity s2);
+static bool is_mv_compatible(Node *clause, Oid varRelid, Index *varno,
+ Bitmapset **attnums);
+static Bitmapset *collect_mv_attnums(PlannerInfo *root, List *clauses,
+ Oid varRelid, Oid *relid);
+static int choose_mv_histogram(int nmvstats, MVStats mvstats,
+ Bitmapset *attnums);
+static List *clauselist_mv_split(List *clauses, Oid varRelid,
+ List **mvclauses, MVStats mvstats);
+
+static Selectivity clauselist_mv_selectivity(PlannerInfo *root,
+ List *clauses, MVStats mvstats);
+static Selectivity clauselist_mv_selectivity_mcvlist(PlannerInfo *root,
+ List *clauses, MVStats mvstats,
+ bool *fullmatch, Selectivity *lowsel);
+static Selectivity clauselist_mv_selectivity_histogram(PlannerInfo *root,
+ List *clauses, MVStats mvstats);
+
/****************************************************************************
* ROUTINES TO COMPUTE SELECTIVITIES
****************************************************************************/
@@ -100,14 +120,74 @@ clauselist_selectivity(PlannerInfo *root,
RangeQueryClause *rqlist = NULL;
ListCell *l;
+ /* processing mv stats */
+ Oid relid = InvalidOid;
+ int nmvstats = 0;
+ MVStats mvstats = NULL;
+
+ /* attributes in mv-compatible clauses */
+ Bitmapset *mvattnums = NULL;
+
/*
- * If there's exactly one clause, then no use in trying to match up pairs,
- * so just go directly to clause_selectivity().
+ * If there's exactly one clause, then no use in trying to match up
+ * pairs, so just go directly to clause_selectivity().
*/
if (list_length(clauses) == 1)
return clause_selectivity(root, (Node *) linitial(clauses),
varRelid, jointype, sjinfo);
+ /* collect attributes from mv-compatible clauses */
+ mvattnums = collect_mv_attnums(root, clauses, varRelid, &relid);
+
+ /*
+ * If there are mv-compatible clauses, referencing at least two
+ * columns (otherwise it makes no sense to use mv stats), fetch the
+ * MV histograms for the relation (only the column keys, not the
+ * histograms yet - we'll decide which histogram to use first).
+ */
+ if (bms_num_members(mvattnums) >= 2)
+ {
+ /* clauses compatible with multi-variate stats */
+ List *mvclauses = NIL;
+
+ /* fetch info from the catalog (not the serialized stats yet) */
+ mvstats = list_mv_stats(relid, &nmvstats, true);
+
+ /*
+ * If there are candidate statistics, choose the histogram first.
+ * At the moment we only use a single statistics, covering the
+ * most columns (using info from the previous step). If there
+ * are multiple such histograms, we'll use the smallest one
+ * (with the lowest number of dimensions).
+ *
+ * This may not be optimal choice, if the 'smaller' stats has
+ * much less buckets than the rejected one (making it less
+ * accurate).
+ *
+ * We may end up without multivariate statistics, if none of the
+ * stats matches at least two columns from the clauses (in that
+ * case we may just use the single dimensional stats).
+ */
+ if (nmvstats > 0)
+ {
+ int idx = choose_mv_histogram(nmvstats, mvstats, mvattnums);
+
+ if (idx >= 0) /* we have a matching stats */
+ {
+ MVStats mvstat = &mvstats[idx];
+
+ /* split the clauselist into regular and mv-clauses */
+ clauses = clauselist_mv_split(clauses, varRelid, &mvclauses, mvstat);
+
+ /* we've chosen the histogram to match the clauses */
+ Assert(mvclauses != NIL);
+
+ /* compute the multivariate stats */
+ s1 *= clauselist_mv_selectivity(root, mvclauses, mvstat);
+ }
+ }
+ }
+
/*
* Initial scan over clauses. Anything that doesn't look like a potential
* rangequery clause gets multiplied into s1 and forgotten. Anything that
@@ -782,3 +862,1010 @@ clause_selectivity(PlannerInfo *root,
return s1;
}
+
+/*
+ * Estimate selectivity for the list of MV-compatible clauses, using that
+ * particular histogram.
+ *
+ * When we hit a single bucket, we don't know what portion of it actually
+ * matches the clauses (e.g. equality), and we use 1/2 the bucket by
+ * default. However, the MV histograms are usually less detailed than
+ * the per-column ones, meaning the summed frequency is often quite high
+ * (thanks to combining a lot of "partially hit" buckets).
+ *
+ * There are several ways to improve this, usually with cases when it
+ * won't really help. Also, the more complex the process, the worse
+ * the failures (i.e. misestimates).
+ *
+ * (1) Use the MV histogram only as a way to combine multiple
+ * per-column histograms, essentially rewriting
+ *
+ * P(A & B) = P(A) * P(B|A)
+ *
+ * where P(B|A) may be computed using a proper "slice" of the
+ * histogram, by first selecting only buckets where A is true, and
+ * then using the boundaries to 'restrict' the per-column histogram.
+ *
+ * With more clauses, it gets more complicated, of course
+ *
+ * P(A & B & C) = P(A & C) * P(B|A & C)
+ * = P(A) * P(C|A) * P(B|A & C)
+ *
+ * and so on.
+ *
+ * Of course, the question is how well and efficiently we can
+ * compute the conditional probabilities - whether this approach
+ * can improve the estimates (instead of amplifying the errors).
+ *
+ * Also, this does not eliminate the need for histogram on [A,B,C].
+ *
+ * (2) Use multiple smaller (and more accurate) histograms, and combine
+ * them using a process similar to the above. E.g. by assuming that
+ * B and C are independent, we can rewrite
+ *
+ * P(B|A & C) = P(B|A)
+ *
+ * so we can rewrite the whole formula to
+ *
+ * P(A & B & C) = P(A) * P(C|A) * P(B|A)
+ *
+ * and we're OK with two 2D histograms [A,C] and [A,B].
+ *
+ * It'd be nice to perform some sort of statistical test (e.g.
+ * Fisher's exact test or a chi-squared test) to identify independent
+ * components and automatically separate them into smaller histograms.
+ *
+ * (3) Using the estimated number of distinct values in a bucket to
+ * decide the selectivity of equality in the bucket (instead of
+ * blindly using 1/2 of the bucket, we may use 1/ndistinct).
+ * Of course, if the ndistinct estimate is way off, or when the
+ * distribution is not uniform (some distinct values get many more
+ * rows), this will fail. Also, we currently don't have the ndistinct
+ * estimate available at this point (but it shouldn't be that
+ * difficult to compute, as ndistinct and ntuples should be available).
+ *
+ * TODO Clamp the selectivity by min of the per-clause selectivities
+ * (i.e. the selectivity of the most restrictive clause), because
+ * that's the maximum we can ever get from ANDed list of clauses.
+ * This may probably prevent issues with hitting too many buckets
+ * and low precision histograms.
+ *
+ * TODO We may support some additional conditions, most importantly
+ * those matching multiple columns (e.g. "a = b" or "a < b").
+ * Ultimately we could track multi-table histograms for join
+ * cardinality estimation.
+ *
+ * TODO Currently this is only estimating all clauses, or clauses
+ * matching varRelid (when it's not 0). I'm not sure what the
+ * purpose of varRelid is, but my assumption is that it's used for
+ * join conditions and such. In that case we can use those clauses
+ * to restrict the other (i.e. filter the histogram buckets first,
+ * before estimating the other clauses). This is essentially equal
+ * to computing P(A|B) where "B" are the clauses not matching the
+ * varRelid.
+ *
+ * TODO Further thoughts on processing equality clauses - maybe it'd be
+ * better to look for stats (with MCV) covered by the equality
+ * clauses, because then we have a chance to find an exact match
+ * in the MCV list, which is pretty much the best we can do. We may
+ * also look at the least frequent MCV item, and use it as an upper
+ * boundary for the selectivity (had there been a more frequent
+ * item, it'd be in the MCV list).
+ *
+ * These conditions may then be used as a condition for the other
+ * selectivities, i.e. we may estimate P(A,B) first, and then
+ * compute P(C|A,B) from another histogram. This may be useful when
+ * we can estimate P(A,B) accurately (e.g. because it's a complete
+ * equality match evaluated on MCV list), and then compute the
+ * conditional probability P(C|A,B), giving us the requested stats
+ *
+ * P(A,B,C) = P(A,B) * P(C|A,B)
+ *
+ * TODO There are several options for 'sanity clamping' the estimates.
+ *
+ * First, if we have selectivities for each condition, then
+ *
+ * P(A,B) <= MIN(P(A), P(B))
+ *
+ * Because additional conditions (connected by AND) can only lower
+ * the probability.
+ *
+ * So we can do some basic sanity checks using the single-variate
+ * stats (the ones we have right now).
+ *
+ * Second, when we have multivariate stats with a MCV list, then
+ *
+ * (a) if we have a full equality condition (one equality condition
+ * on each column) and we found a match in the MCV list, this is
+ * the selectivity (and it's supposed to be exact)
+ *
+ * (b) if we have a full equality condition and we haven't found a
+ * match in the MCV list, then the selectivity is below the
+ * lowest selectivity in the MCV list
+ *
+ * (c) if we have a equality condition (not full), we can still
+ * search the MCV for matches and use the sum of probabilities
+ * as a lower boundary for the histogram (if there are no
+ * matches in the MCV list, then we have no boundary)
+ *
+ * Third, if there are multiple multivariate stats for a set of
+ * clauses, we may compute all of them and then somehow aggregate
+ * them - e.g. by choosing the minimum, median or average. The
+ * multi-variate stats are susceptible to overestimation (because
+ * we take 50% of the bucket for partial matches). Some stats may
+ * give better estimates than others, but it's very difficult to
+ * determine in advance which one is the best (it depends
+ * on the number of buckets, number of additional columns not
+ * referenced in the clauses etc.) so we may compute all and then
+ * choose a sane aggregation (minimum seems like a good approach).
+ * Of course, this may result in longer / more expensive estimation
+ * (CPU-wise), but it may be worth it.
+ *
+ * There are ways to address this, though. First, it's possible to
+ * add a GUC choosing between a 'simple' estimation (using a single
+ * stats expected to give the best estimate) and a 'full' one
+ * (combining the multiple estimates).
+ *
+ * multivariate_estimates = (simple|full)
+ *
+ * Also, this might be enabled at a table level, by something like
+ *
+ * ALTER TABLE ... SET STATISTICS (simple|full)
+ *
+ * Which would make it possible to use this only for the tables
+ * where the simple approach does not work.
+ *
+ * Also, there are ways to optimize this algorithmically. E.g. we
+ * may try to get an estimate from a matching MCV list first, and
+ * if we happen to get a "full equality match" we may stop computing
+ * the estimates from other stats (for this condition) because
+ * that's probably the best estimate we can really get.
+ *
+ * TODO When applying the clauses to the histogram/MCV list, we can do
+ * that from the most selective clauses first, because that'll
+ * eliminate the buckets/items sooner (so we'll be able to skip
+ * them without the more expensive inspection).
+ */
+static Selectivity
+clauselist_mv_selectivity(PlannerInfo *root, List *clauses, MVStats mvstats)
+{
+ bool fullmatch = false;
+ Selectivity s1 = 0.0, s2 = 0.0;
+
+ /*
+ * Lowest frequency in the MCV list (may be used as an upper bound
+ * for full equality conditions that did not match any MCV item).
+ */
+ Selectivity mcv_low = 0.0;
+
+ /* TODO Evaluate simple 1D selectivities, use the smallest one as
+ * an upper bound, product as lower bound, and sort the
+ * clauses in ascending order by selectivity (to optimize the
+ * MCV/histogram evaluation).
+ */
+
+ /* Evaluate the MCV first. */
+ s1 = clauselist_mv_selectivity_mcvlist(root, clauses, mvstats,
+ &fullmatch, &mcv_low);
+
+ /*
+ * If we got a full equality match on the MCV list, we're done (and
+ * the estimate is pretty good).
+ */
+ if (fullmatch && (s1 > 0.0))
+ return s1;
+
+ /* FIXME if (fullmatch) without matching MCV item, use the mcv_low
+ * selectivity as upper bound */
+
+ s2 = clauselist_mv_selectivity_histogram(root, clauses, mvstats);
+
+ /* TODO clamp to <= 1.0 (or more strictly, when possible) */
+ return s1 + s2;
+}
+
+/*
+ * Collect attributes from mv-compatible clauses.
+ *
+ */
+static Bitmapset *
+collect_mv_attnums(PlannerInfo *root, List *clauses, Oid varRelid, Oid *relid)
+{
+ Index varno = 0;
+ Bitmapset *attnums = NULL;
+ ListCell *l;
+
+ /*
+ * Walk through the clauses and identify the ones we can estimate
+ * using multivariate stats, and remember the relid/columns. We'll
+ * then cross-check if we have suitable stats, and only if needed
+ * we'll split the clauses into multivariate and regular lists.
+ *
+ * For now we're only interested in RestrictInfo nodes with nested
+ * OpExpr, using either a range or equality.
+ */
+ foreach (l, clauses)
+ {
+ Node *clause = (Node *) lfirst(l);
+
+ /* ignore the result for now - we only need the info */
+ is_mv_compatible(clause, varRelid, &varno, &attnums);
+ }
+
+ /*
+ * If there are at least two attributes referenced by the clause(s),
+ * fetch the relation info (and pass back the Oid of the relation).
+ */
+ if (bms_num_members(attnums) > 1)
+ {
+ RelOptInfo *rel = find_base_rel(root, varno);
+ *relid = root->simple_rte_array[bms_singleton_member(rel->relids)]->relid;
+ }
+ else
+ {
+ if (attnums != NULL)
+ pfree(attnums);
+ attnums = NULL;
+ *relid = InvalidOid;
+ }
+
+ return attnums;
+}
+
+/*
+ * We're looking for a histogram matching at least 2 attributes, and we
+ * want the smallest one available wrt. the number of dimensions (to
+ * get efficient estimation and likely better precision). The precision
+ * depends on the total number of buckets too, but the lower the number
+ * of dimensions, the smaller (and more precise) the buckets can get.
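+ *
+ * For example (hypothetical stats): with clauses on (a, b, c) and
+ * stats on [a,b], [b,c,d] and [a,b,c,e], the last one is chosen as it
+ * matches three columns; had there also been stats on [a,b,c], those
+ * would win instead - same number of matches, fewer dimensions.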
+ */
+static int
+choose_mv_histogram(int nmvstats, MVStats mvstats, Bitmapset *attnums)
+{
+ int i, j;
+
+ int choice = -1;
+ int current_matches = 1; /* goal #1: maximize */
+ int current_dims = (MVSTATS_MAX_DIMENSIONS+1); /* goal #2: minimize */
+
+ for (i = 0; i < nmvstats; i++)
+ {
+ int matches = 0; /* columns matching this histogram */
+
+ int2vector * attrs = mvstats[i].stakeys;
+ int numattrs = mvstats[i].stakeys->dim1;
+
+ /* count columns covered by the histogram */
+ for (j = 0; j < numattrs; j++)
+ if (bms_is_member(attrs->values[j], attnums))
+ matches++;
+
+ /*
+ * Use this histogram when it improves the number of matches or
+ * when it keeps the number of matches and is smaller.
+ */
+ if ((matches > current_matches) ||
+ ((matches == current_matches) && (current_dims > numattrs)))
+ {
+ choice = i;
+ current_matches = matches;
+ current_dims = numattrs;
+ }
+ }
+
+ return choice;
+}
+
+/*
+ * This splits the clauses list into two parts - one containing clauses
+ * that will be evaluated using the chosen histogram, and the remaining
+ * clauses (either not mv-compatible, or not related to the histogram).
+ */
+static List *
+clauselist_mv_split(List *clauses, Oid varRelid, List **mvclauses, MVStats mvstats)
+{
+ int i;
+ ListCell *l;
+ List *non_mvclauses = NIL;
+
+ /* FIXME is there a better way to get info on int2vector? */
+ int2vector * attrs = mvstats->stakeys;
+ int numattrs = mvstats->stakeys->dim1;
+
+ /* erase the list of mv-compatible clauses */
+ *mvclauses = NIL;
+
+ foreach (l, clauses)
+ {
+ RestrictInfo *rinfo;
+ Node *clause = (Node *) lfirst(l);
+
+ /*
+ * Only RestrictInfo nodes may be mv-compatible, so everything else
+ * goes to the non-mv list directly.
+ *
+ * TODO create a macro/function to decide mv-compatible clauses
+ * (along the is_opclause for example)
+ */
+ if (! IsA(clause, RestrictInfo))
+ {
+ non_mvclauses = lappend(non_mvclauses, clause);
+ continue;
+ }
+
+ rinfo = (RestrictInfo *) clause;
+ clause = (Node*)rinfo->clause;
+
+ /* Pseudoconstants go directly to the non-mv list too. */
+ if (rinfo->pseudoconstant)
+ {
+ non_mvclauses = lappend(non_mvclauses, rinfo);
+ continue;
+ }
+
+ if (is_opclause(clause) && list_length(((OpExpr *) clause)->args) == 2)
+ {
+ OpExpr *expr = (OpExpr *) clause;
+ bool varonleft = true;
+ bool ok;
+
+ ok = (bms_membership(rinfo->clause_relids) == BMS_SINGLETON) &&
+ (is_pseudo_constant_clause_relids(lsecond(expr->args),
+ rinfo->right_relids) ||
+ (varonleft = false,
+ is_pseudo_constant_clause_relids(linitial(expr->args),
+ rinfo->left_relids)));
+
+ if (ok)
+ {
+
+ Var * var = (varonleft) ? linitial(expr->args) : lsecond(expr->args);
+
+ /*
+ * Only consider this variable if (varRelid == 0) or when the varno
+ * matches varRelid (see explanation at clause_selectivity).
+ */
+ if (! ((varRelid == 0) || (varRelid == var->varno)))
+ {
+ non_mvclauses = lappend(non_mvclauses, rinfo);
+ continue;
+ }
+
+ /*
+ * If it's not a "<" or ">" or "=" operator, just ignore the
+ * clause. Otherwise note the relid and attnum for the variable.
+ */
+ switch (get_oprrest(expr->opno))
+ {
+ case F_SCALARLTSEL:
+ case F_SCALARGTSEL:
+ case F_EQSEL:
+ if (! IS_SPECIAL_VARNO(var->varno)) /* FIXME necessary here? */
+ {
+ bool match = false;
+ for (i = 0; i < numattrs; i++)
+ if (attrs->values[i] == var->varattno)
+ match = true;
+
+ if (match)
+ *mvclauses = lappend(*mvclauses, clause);
+ else
+ non_mvclauses = lappend(non_mvclauses, rinfo);
+ }
+ }
+ }
+ }
+ }
+
+ /*
+ * Perform regular estimation using the clauses incompatible
+ * with the chosen histogram (or MV stats in general).
+ */
+ return non_mvclauses;
+
+}
+
+/*
+ * Determines whether the clause is compatible with multivariate stats,
+ * and if it is, returns some additional information - varno (index
+ * into simple_rte_array) and a bitmap of attributes. This is then
+ * used to fetch related multivariate statistics.
+ *
+ * At this moment we only support basic conditions of the form
+ *
+ * variable OP constant
+ *
+ * where OP is one of [=,<,<=,>=,>] (which is however determined by
+ * looking at the associated function for estimating selectivity, just
+ * like with the single-dimensional case).
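+ *
+ * For example, (a = 1) or (b < 100) are compatible clauses, while
+ * (a = b), (a + b < 3) or (a IN (1, 2, 3)) are not (at least not in
+ * the current implementation).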
+ */
+static bool
+is_mv_compatible(Node *clause, Oid varRelid, Index *varno, Bitmapset **attnums)
+{
+
+ if (IsA(clause, RestrictInfo))
+ {
+ RestrictInfo *rinfo = (RestrictInfo *) clause;
+
+ /* Pseudoconstants are not really interesting here. */
+ if (rinfo->pseudoconstant)
+ return false;
+
+ /* get the actual clause from the RestrictInfo ... */
+ clause = (Node*)rinfo->clause;
+
+ /* is it 'variable op constant' ? */
+ if (is_opclause(clause) && list_length(((OpExpr *) clause)->args) == 2)
+ {
+ OpExpr *expr = (OpExpr *) clause;
+ bool varonleft = true;
+ bool ok;
+
+ ok = (bms_membership(rinfo->clause_relids) == BMS_SINGLETON) &&
+ (is_pseudo_constant_clause_relids(lsecond(expr->args),
+ rinfo->right_relids) ||
+ (varonleft = false,
+ is_pseudo_constant_clause_relids(linitial(expr->args),
+ rinfo->left_relids)));
+
+ if (ok)
+ {
+
+ Var * var = (varonleft) ? linitial(expr->args) : lsecond(expr->args);
+
+ /*
+ * Only consider this variable if (varRelid == 0) or when the varno
+ * matches varRelid (see explanation at clause_selectivity).
+ */
+ if (! ((varRelid == 0) || (varRelid == var->varno)))
+ return false;
+
+ /* Also skip special varno values, and system attributes ... */
+ if ((IS_SPECIAL_VARNO(var->varno)) || (! AttrNumberIsForUserDefinedAttr(var->varattno)))
+ return false;
+
+ /*
+ * If it's not a "<" or ">" or "=" operator, just ignore the
+ * clause. Otherwise note the relid and attnum for the variable.
+ * This uses the function for estimating selectivity, not the
+ * operator directly (a bit awkward, but well ...).
+ */
+ switch (get_oprrest(expr->opno))
+ {
+ case F_SCALARLTSEL:
+ case F_SCALARGTSEL:
+ case F_EQSEL:
+ *varno = var->varno;
+ *attnums = bms_add_member(*attnums, var->varattno);
+ return true;
+ }
+ }
+ }
+ }
+
+ return false;
+
+}
+
+/*
+ * Estimate selectivity of clauses using a MCV list.
+ *
+ * If there's no MCV list for the stats, the function returns 0.0.
+ *
+ * While computing the estimate, the function checks whether all the
+ * columns were matched with an equality condition. If that's the case,
+ * it's assumed we can skip computing the estimate from histogram,
+ * because all the rows matching the condition are represented by the
+ * MCV item.
+ *
+ * The function also returns the frequency of the least frequent item
+ * on the MCV list, which may be useful for clamping the estimate
+ * from the histogram.
+ */
+static Selectivity
+clauselist_mv_selectivity_mcvlist(PlannerInfo *root, List *clauses,
+ MVStats mvstats, bool *fullmatch,
+ Selectivity *lowsel)
+{
+ int i;
+ Selectivity s = 0.0;
+ ListCell * l;
+ char * mcvitems = NULL;
+ MCVList mcvlist = NULL;
+
+ Bitmapset *matches = NULL; /* attributes with equality matches */
+
+ /* there's no MCV list yet */
+ if (! mvstats->mcv_built)
+ return 0.0;
+
+ mcvlist = deserialize_mv_mcvlist(fetch_mv_mcvlist(mvstats->mvoid));
+
+ Assert(mcvlist != NULL);
+ Assert (clauses != NIL);
+ Assert (list_length(clauses) >= 2);
+
+ mcvitems = palloc0(sizeof(char) * mcvlist->nitems);
+ memset(mcvitems, MVSTATS_MATCH_FULL, sizeof(char)*mcvlist->nitems);
+
+ /* start at 1.0, lowered as we walk through the MCV items */
+ *lowsel = 1.0;
+
+ /* loop through the list of MV-compatible clauses and do the estimation */
+ foreach (l, clauses)
+ {
+ Node * clause = (Node*)lfirst(l);
+ OpExpr * expr = (OpExpr*)clause;
+ bool varonleft = true;
+ bool ok;
+
+ /* operator */
+ FmgrInfo opproc;
+
+ fmgr_info(get_opcode(expr->opno), &opproc);
+
+ ok = (NumRelids(clause) == 1) &&
+ (is_pseudo_constant_clause(lsecond(expr->args)) ||
+ (varonleft = false,
+ is_pseudo_constant_clause(linitial(expr->args))));
+
+ if (ok)
+ {
+
+ FmgrInfo ltproc, gtproc;
+ Var * var = (varonleft) ? linitial(expr->args) : lsecond(expr->args);
+ Const * cst = (varonleft) ? lsecond(expr->args) : linitial(expr->args);
+ bool isgt = (! varonleft);
+
+ /*
+ * TODO Fetch only when really needed (probably for equality only)
+ * TODO Technically either lt/gt is sufficient.
+ *
+ * FIXME The code in analyze.c creates histograms only for types
+ * with enough ordering (by calling get_sort_group_operators).
+ * Is this the same assumption, i.e. are we certain that we
+ * get the ltproc/gtproc every time we ask? Or are there types
+ * where get_sort_group_operators returns ltopr and here we
+ * get nothing?
+ */
+ TypeCacheEntry *typecache = lookup_type_cache(var->vartype, TYPECACHE_EQ_OPR | TYPECACHE_LT_OPR | TYPECACHE_GT_OPR);
+
+ /* FIXME proper matching attribute to dimension */
+ int idx = mv_get_index(var->varattno, mvstats->stakeys);
+
+ fmgr_info(get_opcode(typecache->lt_opr), <proc);
+ fmgr_info(get_opcode(typecache->gt_opr), >proc);
+
+ /* process the MCV list first */
+ for (i = 0; i < mcvlist->nitems; i++)
+ {
+ bool tmp;
+ MCVItem item = mcvlist->items[i];
+
+ /* find the lowest selectivity in the MCV */
+ if (item->frequency < *lowsel)
+ *lowsel = item->frequency;
+
+ /* skip MCV items already ruled out */
+ if (mcvitems[i] == MVSTATS_MATCH_NONE)
+ continue;
+
+ /* TODO consider bsearch here (list is sorted by values)
+ * TODO handle other operators too (LT, GT)
+ * TODO identify "full match" when the clauses fully
+ * match the whole MCV list (so that checking the
+ * histogram is not needed)
+ */
+ if (get_oprrest(expr->opno) == F_EQSEL)
+ {
+ /*
+ * We don't care about isgt in equality, because it does not matter
+ * whether it's (var = const) or (const = var).
+ */
+ if (memcmp(&cst->constvalue, &item->values[idx], sizeof(Datum)) != 0)
+ mcvitems[i] = MVSTATS_MATCH_NONE;
+ else
+ matches = bms_add_member(matches, idx);
+ }
+ else if (get_oprrest(expr->opno) == F_SCALARLTSEL) /* column < constant */
+ {
+
+ if (! isgt) /* (var < const) */
+ {
+ /*
+ * First check whether the constant is below the lower boundary (in that
+ * case we can skip the bucket, because there's no overlap).
+ */
+ tmp = DatumGetBool(FunctionCall2Coll(&opproc,
+ DEFAULT_COLLATION_OID,
+ cst->constvalue,
+ item->values[idx]));
+
+ if (tmp)
+ {
+ mcvitems[i] = MVSTATS_MATCH_NONE; /* no match */
+ continue;
+ }
+
+ } /* (var < const) */
+ else /* (const < var) */
+ {
+ /*
+ * First check whether the constant is above the upper boundary (in that
+ * case we can skip the bucket, because there's no overlap).
+ */
+ tmp = DatumGetBool(FunctionCall2Coll(&opproc,
+ DEFAULT_COLLATION_OID,
+ item->values[idx],
+ cst->constvalue));
+ if (tmp)
+ {
+ mcvitems[i] = MVSTATS_MATCH_NONE; /* no match */
+ continue;
+ }
+ }
+ }
+ else if (get_oprrest(expr->opno) == F_SCALARGTSEL) /* column > constant */
+ {
+
+ if (! isgt) /* (var > const) */
+ {
+ /*
+ * First check whether the constant is above the upper boundary (in that
+ * case we can skip the bucket, because there's no overlap).
+ */
+ tmp = DatumGetBool(FunctionCall2Coll(&opproc,
+ DEFAULT_COLLATION_OID,
+ cst->constvalue,
+ item->values[idx]));
+ if (tmp)
+ {
+ mcvitems[i] = MVSTATS_MATCH_NONE; /* no match */
+ continue;
+ }
+
+ }
+ else /* (const > var) */
+ {
+ /*
+ * First check whether the constant is below the lower boundary (in
+ * that case we can skip the bucket, because there's no overlap).
+ */
+ tmp = DatumGetBool(FunctionCall2Coll(&opproc,
+ DEFAULT_COLLATION_OID,
+ item->values[idx],
+ cst->constvalue));
+ if (tmp)
+ {
+ mcvitems[i] = MVSTATS_MATCH_NONE; /* no match */
+ continue;
+ }
+ }
+
+ } /* (get_oprrest(expr->opno) == F_SCALARGTSEL) */
+
+ }
+ }
+ }
+
+ for (i = 0; i < mcvlist->nitems; i++)
+ {
+ if (mcvitems[i] != MVSTATS_MATCH_NONE)
+ s += mcvlist->items[i]->frequency;
+ }
+
+ *fullmatch = (bms_num_members(matches) == mcvlist->ndimensions);
+
+ pfree(mcvitems);
+ pfree(mcvlist);
+
+ return s;
+}
+
+/*
+ * Estimate selectivity of clauses using a histogram.
+ *
+ * If there's no histogram for the stats, the function returns 0.0.
+ */
+static Selectivity
+clauselist_mv_selectivity_histogram(PlannerInfo *root, List *clauses,
+ MVStats mvstats)
+{
+ int i;
+ Selectivity s = 0.0;
+ ListCell * l;
+ char *buckets = NULL;
+ MVHistogram mvhist = NULL;
+
+ /* there's no histogram */
+ if (! mvstats->hist_built)
+ return 0.0;
+
+ /* There may be no histogram in the stats (check hist_built flag) */
+ mvhist = deserialize_mv_histogram(fetch_mv_histogram(mvstats->mvoid));
+
+ Assert (mvhist != NULL);
+ Assert (clauses != NIL);
+ Assert (list_length(clauses) >= 2);
+
+ /*
+ * Bitmap of bucket matches (mismatch, partial, full). By default
+ * all buckets fully match, and the clauses can only lower that.
+ */
+ buckets = palloc0(sizeof(char) * mvhist->nbuckets);
+ memset(buckets, MVSTATS_MATCH_FULL, sizeof(char)*mvhist->nbuckets);
+
+ /* loop through the clauses and do the estimation */
+ foreach (l, clauses)
+ {
+ Node * clause = (Node*)lfirst(l);
+ OpExpr * expr = (OpExpr*)clause;
+ bool varonleft = true;
+ bool ok;
+
+ FmgrInfo opproc; /* operator */
+ fmgr_info(get_opcode(expr->opno), &opproc);
+
+ ok = (NumRelids(clause) == 1) &&
+ (is_pseudo_constant_clause(lsecond(expr->args)) ||
+ (varonleft = false,
+ is_pseudo_constant_clause(linitial(expr->args))));
+
+ if (ok)
+ {
+ FmgrInfo ltproc;
+ Var * var = (varonleft) ? linitial(expr->args) : lsecond(expr->args);
+ Const * cst = (varonleft) ? lsecond(expr->args) : linitial(expr->args);
+ bool isgt = (! varonleft);
+
+ /*
+ * TODO Fetch only when really needed (probably for equality only)
+ *
+ * TODO Technically either lt/gt is sufficient.
+ *
+ * FIXME The code in analyze.c creates histograms only for types
+ * with enough ordering (by calling get_sort_group_operators).
+ * Is this the same assumption, i.e. are we certain that we
+ * get the ltproc/gtproc every time we ask? Or are there types
+ * where get_sort_group_operators returns ltopr and here we
+ * get nothing?
+ */
+ TypeCacheEntry *typecache
+ = lookup_type_cache(var->vartype, TYPECACHE_EQ_OPR | TYPECACHE_LT_OPR
+ | TYPECACHE_GT_OPR);
+
+ /* lookup dimension for the attribute */
+ int idx = mv_get_index(var->varattno, mvstats->stakeys);
+
+ fmgr_info(get_opcode(typecache->lt_opr), &ltproc);
+
+ /*
+ * Check this for all buckets that haven't been eliminated yet.
+ *
+ * We already know the clauses use suitable operators (because that's
+ * how we filtered them).
+ */
+ for (i = 0; i < mvhist->nbuckets; i++)
+ {
+ bool tmp;
+ MVBucket bucket = mvhist->buckets[i];
+
+ /*
+ * Skip buckets that were already eliminated - this is important
+ * considering how we update the info (we only lower the match)
+ */
+ if (buckets[i] == MVSTATS_MATCH_NONE)
+ continue;
+
+ /*
+ * Decide how the bucket matches the clause (no match, partial or
+ * full match), using the appropriate operator.
+ *
+ * TODO I'm really unsure whether the handling of the 'isgt' flag (that
+ * is, clauses with reverse order of variable/constant) is correct. I wouldn't
+ * be surprised if there was some mixup. Using the lt/gt operators
+ * instead of messing with the opproc could make it simpler.
+ * It would however be using a different operator than the query,
+ * although it's not any shadier than using the selectivity function
+ * as is done currently.
+ *
+ * FIXME Once the min/max values are deduplicated, we can easily minimize
+ * the number of calls to the comparator (assuming we keep the
+ * deduplicated structure). See the note on compression at MVBucket
+ * serialize/deserialize methods.
+ */
+ switch (get_oprrest(expr->opno))
+ {
+ case F_SCALARLTSEL: /* column < constant */
+
+ if (! isgt) /* (var < const) */
+ {
+ /*
+ * First check whether the constant is below the lower boundary (in that
+ * case we can skip the bucket, because there's no overlap).
+ */
+ tmp = DatumGetBool(FunctionCall2Coll(&opproc,
+ DEFAULT_COLLATION_OID,
+ cst->constvalue,
+ bucket->min[idx]));
+ if (tmp)
+ {
+ buckets[i] = MVSTATS_MATCH_NONE; /* no match */
+ continue;
+ }
+
+ /*
+ * Now check whether the upper boundary is below the constant (in that
+ * case it's a partial match).
+ */
+ tmp = DatumGetBool(FunctionCall2Coll(&opproc,
+ DEFAULT_COLLATION_OID,
+ cst->constvalue,
+ bucket->max[idx]));
+
+ if (tmp)
+ buckets[i] = MVSTATS_MATCH_PARTIAL; /* partial match */
+ }
+ else /* (const < var) */
+ {
+ /*
+ * First check whether the constant is above the upper boundary (in that
+ * case we can skip the bucket, because there's no overlap).
+ */
+ tmp = DatumGetBool(FunctionCall2Coll(&opproc,
+ DEFAULT_COLLATION_OID,
+ bucket->max[idx],
+ cst->constvalue));
+ if (tmp)
+ {
+ buckets[i] = MVSTATS_MATCH_NONE; /* no match */
+ continue;
+ }
+
+ /*
+ * Now check whether the lower boundary is below the constant (in that
+ * case it's a partial match).
+ */
+ tmp = DatumGetBool(FunctionCall2Coll(&opproc,
+ DEFAULT_COLLATION_OID,
+ bucket->min[idx],
+ cst->constvalue));
+
+ if (tmp)
+ buckets[i] = MVSTATS_MATCH_PARTIAL; /* partial match */
+ }
+ break;
+
+ case F_SCALARGTSEL: /* column > constant */
+
+ if (! isgt) /* (var > const) */
+ {
+ /*
+ * First check whether the constant is above the upper boundary (in that
+ * case we can skip the bucket, because there's no overlap).
+ */
+ tmp = DatumGetBool(FunctionCall2Coll(&opproc,
+ DEFAULT_COLLATION_OID,
+ cst->constvalue,
+ bucket->max[idx]));
+ if (tmp)
+ {
+ buckets[i] = MVSTATS_MATCH_NONE; /* no match */
+ continue;
+ }
+
+ /*
+ * Now check whether the lower boundary is below the constant (in that
+ * case it's a partial match).
+ */
+ tmp = DatumGetBool(FunctionCall2Coll(&opproc,
+ DEFAULT_COLLATION_OID,
+ cst->constvalue,
+ bucket->min[idx]));
+
+ if (tmp)
+ buckets[i] = MVSTATS_MATCH_PARTIAL; /* partial match */
+ }
+ else /* (const > var) */
+ {
+ /*
+ * First check whether the constant is below the lower boundary (in
+ * that case we can skip the bucket, because there's no overlap).
+ */
+ tmp = DatumGetBool(FunctionCall2Coll(&opproc,
+ DEFAULT_COLLATION_OID,
+ bucket->min[idx],
+ cst->constvalue));
+ if (tmp)
+ {
+ buckets[i] = MVSTATS_MATCH_NONE; /* no match */
+ continue;
+ }
+
+ /*
+ * Now check whether the upper boundary is below the constant (in that
+ * case it's a partial match).
+ */
+ tmp = DatumGetBool(FunctionCall2Coll(&opproc,
+ DEFAULT_COLLATION_OID,
+ bucket->max[idx],
+ cst->constvalue));
+
+ if (tmp)
+ buckets[i] = MVSTATS_MATCH_PARTIAL; /* partial match */
+ }
+
+ break;
+
+ case F_EQSEL:
+
+ /*
+ * We only check whether the value is within the bucket, using the lt/gt
+ * operators fetched from the type cache.
+ *
+ * TODO We'll use the default 50% estimate, but that's probably way off
+ * if there are multiple distinct values. Consider tweaking this
+ * somehow, e.g. using only a part inversely proportional to the
+ * estimated number of distinct values in the bucket.
+ *
+ * TODO This does not handle inclusion flags at the moment, thus counting
+ * some buckets twice (when hitting the boundary).
+ *
+ * TODO Optimization is that if max[i] == min[i], it's effectively an MCV
+ * item and we can count the whole bucket as a complete match (thus
+ * using 100% bucket selectivity and not just 50%).
+ *
+ * TODO Technically some buckets may "degenerate" into single-value
+ * buckets (not necessarily for all the dimensions) - maybe this
+ * is better than keeping a separate MCV list (multi-dimensional).
+ * Update: Actually, that's unlikely to be better than a separate
+ * MCV list for two reasons - first, it requires ~2x the space
+ * (because of storing lower/upper boundaries) and second because
+ * the buckets are ranges - depending on the partitioning algorithm
+ * it may not even degenerate into a (min=max) bucket. For example,
+ * the current partitioning algorithm never does that.
+ */
+ tmp = DatumGetBool(FunctionCall2Coll(&ltproc,
+ DEFAULT_COLLATION_OID,
+ cst->constvalue,
+ bucket->min[idx]));
+
+ if (tmp)
+ {
+ buckets[i] = MVSTATS_MATCH_NONE; /* constvalue < min */
+ continue;
+ }
+
+ tmp = DatumGetBool(FunctionCall2Coll(&ltproc,
+ DEFAULT_COLLATION_OID,
+ bucket->max[idx],
+ cst->constvalue));
+
+ if (tmp)
+ {
+ buckets[i] = MVSTATS_MATCH_NONE; /* constvalue > max */
+ continue;
+ }
+
+ /* partial match */
+ buckets[i] = MVSTATS_MATCH_PARTIAL;
+
+ break;
+ }
+ }
+ }
+ }
+
+ /* now, walk through the buckets and sum the selectivities */
+ for (i = 0; i < mvhist->nbuckets; i++)
+ {
+ if (buckets[i] == MVSTATS_MATCH_FULL)
+ s += mvhist->buckets[i]->ntuples;
+ else if (buckets[i] == MVSTATS_MATCH_PARTIAL)
+ s += 0.5 * mvhist->buckets[i]->ntuples;
+ }
+
+ return s;
+}
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 77d2f29..038c878 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -365,6 +365,13 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
create_generic_options alter_generic_options
relation_expr_list dostmt_opt_list
+%type <list> OptStatsOptions
+%type <str> stats_options_name
+%type <node> stats_options_arg
+%type <defelt> stats_options_elem
+%type <list> stats_options_list
+
+
%type <list> opt_fdw_options fdw_options
%type <defelt> fdw_option
@@ -483,7 +490,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
%type <keyword> unreserved_keyword type_func_name_keyword
%type <keyword> col_name_keyword reserved_keyword
-%type <node> TableConstraint TableLikeClause
+%type <node> TableConstraint TableLikeClause TableStatistics
%type <ival> TableLikeOptionList TableLikeOption
%type <list> ColQualList
%type <node> ColConstraint ColConstraintElem ConstraintAttr
@@ -2327,6 +2334,14 @@ alter_table_cmd:
n->subtype = AT_DisableRowSecurity;
$$ = (Node *)n;
}
+ /* ALTER TABLE <name> ADD STATISTICS (options) ON (columns) ... */
+ | ADD_P TableStatistics
+ {
+ AlterTableCmd *n = makeNode(AlterTableCmd);
+ n->subtype = AT_AddStatistics;
+ n->def = $2;
+ $$ = (Node *)n;
+ }
| alter_generic_options
{
AlterTableCmd *n = makeNode(AlterTableCmd);
@@ -3397,6 +3412,56 @@ OptConsTableSpace: USING INDEX TABLESPACE name { $$ = $4; }
ExistingIndex: USING INDEX index_name { $$ = $3; }
;
+/*****************************************************************************
+ *
+ * QUERY :
+ * ALTER TABLE relname ADD STATISTICS (options) ON (columns)
+ *
+ *****************************************************************************/
+
+TableStatistics:
+ STATISTICS OptStatsOptions ON '(' columnList ')'
+ {
+ StatisticsDef *n = makeNode(StatisticsDef);
+ n->keys = $5;
+ n->options = $2;
+ $$ = (Node *) n;
+ }
+ ;
+
+OptStatsOptions:
+ '(' stats_options_list ')' { $$ = $2; }
+ | /*EMPTY*/ { $$ = NIL; }
+ ;
+
+stats_options_list:
+ stats_options_elem
+ {
+ $$ = list_make1($1);
+ }
+ | stats_options_list ',' stats_options_elem
+ {
+ $$ = lappend($1, $3);
+ }
+ ;
+
+stats_options_elem:
+ stats_options_name stats_options_arg
+ {
+ $$ = makeDefElem($1, $2);
+ }
+ ;
+
+stats_options_name:
+ NonReservedWord { $$ = $1; }
+ ;
+
+stats_options_arg:
+ opt_boolean_or_string { $$ = (Node *) makeString($1); }
+ | NumericOnly { $$ = (Node *) $1; }
+ | /* EMPTY */ { $$ = NULL; }
+ ;
+
/*****************************************************************************
*
diff --git a/src/backend/utils/cache/syscache.c b/src/backend/utils/cache/syscache.c
index 94d951c..ec90773 100644
--- a/src/backend/utils/cache/syscache.c
+++ b/src/backend/utils/cache/syscache.c
@@ -43,6 +43,7 @@
#include "catalog/pg_foreign_server.h"
#include "catalog/pg_foreign_table.h"
#include "catalog/pg_language.h"
+#include "catalog/pg_mv_statistic.h"
#include "catalog/pg_namespace.h"
#include "catalog/pg_opclass.h"
#include "catalog/pg_operator.h"
@@ -499,6 +500,17 @@ static const struct cachedesc cacheinfo[] = {
},
4
},
+ {MvStatisticRelationId, /* MVSTATOID */
+ MvStatisticOidIndexId,
+ 1,
+ {
+ ObjectIdAttributeNumber,
+ 0,
+ 0,
+ 0
+ },
+ 128
+ },
{NamespaceRelationId, /* NAMESPACENAME */
NamespaceNameIndexId,
1,
diff --git a/src/include/catalog/indexing.h b/src/include/catalog/indexing.h
index 870692c..d57cdbe 100644
--- a/src/include/catalog/indexing.h
+++ b/src/include/catalog/indexing.h
@@ -173,6 +173,11 @@ DECLARE_UNIQUE_INDEX(pg_largeobject_loid_pn_index, 2683, on pg_largeobject using
DECLARE_UNIQUE_INDEX(pg_largeobject_metadata_oid_index, 2996, on pg_largeobject_metadata using btree(oid oid_ops));
#define LargeObjectMetadataOidIndexId 2996
+DECLARE_UNIQUE_INDEX(pg_mv_statistic_oid_index, 3259, on pg_mv_statistic using btree(oid oid_ops));
+#define MvStatisticOidIndexId 3259
+DECLARE_INDEX(pg_mv_statistic_relid_index, 3264, on pg_mv_statistic using btree(starelid oid_ops));
+#define MvStatisticRelidIndexId 3264
+
DECLARE_UNIQUE_INDEX(pg_namespace_nspname_index, 2684, on pg_namespace using btree(nspname name_ops));
#define NamespaceNameIndexId 2684
DECLARE_UNIQUE_INDEX(pg_namespace_oid_index, 2685, on pg_namespace using btree(oid oid_ops));
diff --git a/src/include/catalog/pg_mv_statistic.h b/src/include/catalog/pg_mv_statistic.h
new file mode 100644
index 0000000..703931e
--- /dev/null
+++ b/src/include/catalog/pg_mv_statistic.h
@@ -0,0 +1,89 @@
+/*-------------------------------------------------------------------------
+ *
+ * pg_mv_statistic.h
+ * definition of the system "multivariate statistic" relation (pg_mv_statistic)
+ * along with the relation's initial contents.
+ *
+ *
+ * Portions Copyright (c) 1996-2014, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/catalog/pg_mv_statistic.h
+ *
+ * NOTES
+ * the genbki.pl script reads this file and generates .bki
+ * information from the DATA() statements.
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef PG_MV_STATISTIC_H
+#define PG_MV_STATISTIC_H
+
+#include "catalog/genbki.h"
+
+/* ----------------
+ * pg_mv_statistic definition. cpp turns this into
+ * typedef struct FormData_pg_mv_statistic
+ * ----------------
+ */
+#define MvStatisticRelationId 3260
+
+CATALOG(pg_mv_statistic,3260)
+{
+ /* These fields form the unique key for the entry: */
+ Oid starelid; /* relation containing attributes */
+
+ /* statistics requested to build */
+ bool hist_enabled; /* build histogram? */
+ bool mcv_enabled; /* build MCV list? */
+ bool mcv_hashed; /* build hashed MCV? */
+ bool assoc_enabled; /* analyze associations? */
+
+ /* histogram / MCV size */
+ int32 hist_max_buckets; /* max buckets */
+ int32 mcv_max_items; /* max MCV items */
+
+ /* statistics that are available (if requested) */
+ bool hist_built; /* histogram was built */
+ bool mcv_built; /* MCV list was built */
+ bool assoc_built; /* associations were built */
+
+ /* variable-length fields start here, but we allow direct access to stakeys */
+ int2vector stakeys; /* array of column keys */
+
+#ifdef CATALOG_VARLEN
+ bytea staassoc; /* association rules (serialized) */
+ bytea stamcv; /* MCV list (serialized) */
+ bytea stahist; /* MV histogram (serialized) */
+#endif
+
+} FormData_pg_mv_statistic;
+
+/* ----------------
+ * Form_pg_mv_statistic corresponds to a pointer to a tuple with
+ * the format of pg_mv_statistic relation.
+ * ----------------
+ */
+typedef FormData_pg_mv_statistic *Form_pg_mv_statistic;
+
+/* ----------------
+ * compiler constants for pg_mv_statistic
+ * ----------------
+ */
+#define Natts_pg_mv_statistic 14
+#define Anum_pg_mv_statistic_starelid 1
+#define Anum_pg_mv_statistic_hist_enabled 2
+#define Anum_pg_mv_statistic_mcv_enabled 3
+#define Anum_pg_mv_statistic_mcv_hashed 4
+#define Anum_pg_mv_statistic_assoc_enabled 5
+#define Anum_pg_mv_statistic_hist_max_buckets 6
+#define Anum_pg_mv_statistic_mcv_max_items 7
+#define Anum_pg_mv_statistic_hist_built 8
+#define Anum_pg_mv_statistic_mcv_built 9
+#define Anum_pg_mv_statistic_assoc_built 10
+#define Anum_pg_mv_statistic_stakeys 11
+#define Anum_pg_mv_statistic_staassoc 12
+#define Anum_pg_mv_statistic_stamcv 13
+#define Anum_pg_mv_statistic_stahist 14
+
+#endif /* PG_MV_STATISTIC_H */
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
index 3ce9849..6961b7c 100644
--- a/src/include/catalog/pg_proc.h
+++ b/src/include/catalog/pg_proc.h
@@ -2647,6 +2647,13 @@ DESCR("current user privilege on any column by rel name");
DATA(insert OID = 3029 ( has_any_column_privilege PGNSP PGUID 12 10 0 0 0 f f f f t f s 2 0 16 "26 25" _null_ _null_ _null_ _null_ has_any_column_privilege_id _null_ _null_ _null_ ));
DESCR("current user privilege on any column by rel oid");
+DATA(insert OID = 3261 ( pg_mv_stats_histogram_info PGNSP PGUID 12 1 0 0 0 f f f f t f i 1 0 25 "17" _null_ _null_ _null_ _null_ pg_mv_stats_histogram_info _null_ _null_ _null_ ));
+DESCR("multi-variate statistics: histogram info");
+DATA(insert OID = 3262 ( pg_mv_stats_mvclist_info PGNSP PGUID 12 1 0 0 0 f f f f t f i 1 0 25 "17" _null_ _null_ _null_ _null_ pg_mv_stats_mvclist_info _null_ _null_ _null_ ));
+DESCR("multi-variate statistics: MCV list info");
+DATA(insert OID = 3263 ( pg_mv_stats_histogram_gnuplot PGNSP PGUID 12 1 0 0 0 f f f f t f i 1 0 25 "17" _null_ _null_ _null_ _null_ pg_mv_stats_histogram_gnuplot _null_ _null_ _null_ ));
+DESCR("multi-variate statistics: 2D histogram gnuplot");
+
DATA(insert OID = 1928 ( pg_stat_get_numscans PGNSP PGUID 12 1 0 0 0 f f f f t f s 1 0 20 "26" _null_ _null_ _null_ _null_ pg_stat_get_numscans _null_ _null_ _null_ ));
DESCR("statistics: number of scans done for table/index");
DATA(insert OID = 1929 ( pg_stat_get_tuples_returned PGNSP PGUID 12 1 0 0 0 f f f f t f s 1 0 20 "26" _null_ _null_ _null_ _null_ pg_stat_get_tuples_returned _null_ _null_ _null_ ));
diff --git a/src/include/catalog/toasting.h b/src/include/catalog/toasting.h
index a4af551..02b9aa3 100644
--- a/src/include/catalog/toasting.h
+++ b/src/include/catalog/toasting.h
@@ -49,6 +49,7 @@ extern void BootstrapToastTable(char *relName,
DECLARE_TOAST(pg_attrdef, 2830, 2831);
DECLARE_TOAST(pg_constraint, 2832, 2833);
DECLARE_TOAST(pg_description, 2834, 2835);
+DECLARE_TOAST(pg_mv_statistic, 3952, 3954);
DECLARE_TOAST(pg_proc, 2836, 2837);
DECLARE_TOAST(pg_rewrite, 2838, 2839);
DECLARE_TOAST(pg_seclabel, 3598, 3599);
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index 154d943..36e675b 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -410,6 +410,7 @@ typedef enum NodeTag
T_XmlSerialize,
T_WithClause,
T_CommonTableExpr,
+ T_StatisticsDef,
/*
* TAGS FOR REPLICATION GRAMMAR PARSE NODES (replnodes.h)
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index f3aa69e..e7ed773 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -542,6 +542,14 @@ typedef struct ColumnDef
int location; /* parse location, or -1 if none/unknown */
} ColumnDef;
+typedef struct StatisticsDef
+{
+ NodeTag type;
+ List *keys; /* String nodes naming referenced column(s) */
+ List *options; /* list of DefElem nodes */
+} StatisticsDef;
+
+
/*
* TableLikeClause - CREATE TABLE ( ... LIKE ... ) clause
*/
@@ -1337,7 +1345,8 @@ typedef enum AlterTableType
AT_ReplicaIdentity, /* REPLICA IDENTITY */
AT_EnableRowSecurity, /* ENABLE ROW SECURITY */
AT_DisableRowSecurity, /* DISABLE ROW SECURITY */
- AT_GenericOptions /* OPTIONS (...) */
+ AT_GenericOptions, /* OPTIONS (...) */
+ AT_AddStatistics /* add statistics */
} AlterTableType;
typedef struct ReplicaIdentityStmt
diff --git a/src/include/utils/mvstats.h b/src/include/utils/mvstats.h
new file mode 100644
index 0000000..157891a
--- /dev/null
+++ b/src/include/utils/mvstats.h
@@ -0,0 +1,283 @@
+/*-------------------------------------------------------------------------
+ *
+ * mvstats.h
+ * Multivariate statistics and selectivity estimation functions.
+ *
+ *
+ * Portions Copyright (c) 1996-2014, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/utils/mvstats.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef MVSTATS_H
+#define MVSTATS_H
+
+/*
+ * Multivariate statistics for planner/optimizer, implementing extensions
+ * of the single-column statistics:
+ *
+ * - multivariate MCV list
+ * - multivariate histograms
+ *
+ * There's also an experimental support for associative rules (values in
+ * one column implying values in other columns - e.g. ZIP code implies
+ * name of a city, etc.).
+ *
+ * The current implementation has various limitations:
+ *
+ * (a) it supports only data types passed by value
+ *
+ * (b) no support for NULL values
+ *
+ * Both (a) and (b) should be straightforward to fix (and usually
+ * described in comments at related data structures or functions).
+ *
+ * The stats may be built only directly on columns, not on expressions.
+ * And there are usually some additional technical limits (e.g. number
+ * of columns in a histogram, etc.).
+ *
+ * Those limits serve mostly as sanity checks and while increasing them
+ * is possible (the implementation should not break), it's expected to
+ * lead either to very bad precision or expensive planning.
+ */
+
+/*
+ * Multivariate histograms
+ *
+ * Histograms are a collection of buckets, represented by n-dimensional
+ * rectangles. Each rectangle is delimited by an array of lower and
+ * upper boundaries, so that for the i-th attribute
+ *
+ * min[i] <= value[i] <= max[i]
+ *
+ * Each bucket tracks frequency (fraction of tuples it contains),
+ * information about the inequalities, number of distinct values in
+ * each dimension (which is used when building the histogram) etc.
+ *
+ * The boundaries may be either inclusive or exclusive, or the whole
+ * dimension may be NULL.
+ *
+ * The buckets may overlap (assuming the build algorithm keeps the
+ * frequencies additive) or may not cover the whole space (i.e. allow
+ * gaps). This entirely depends on the algorithm used to build the
+ * histogram.
+ *
+ * The histograms are marked with a 'magic' constant, mostly to make
+ * sure the bytea really is a histogram in serialized form.
+ *
+ * We do expect to support multiple histogram types, with different
+ * features etc. The 'type' field is used to identify those types.
+ * Technically some histogram types might use completely different
+ * bucket representation, but that's not expected at the moment.
+ *
+ * TODO Add pointer to 'private' data, meant for private data for
+ * other algorithms for building the histogram.
+ *
+ * TODO The current implementation does not handle NULL values (it's
+ * somewhat prepared for that, but the algorithm building the
+ * histogram ignores them). The idea is to build buckets with one
+ * or more NULL-only dimensions - there'll be at most 2^ndimensions
+ * such buckets, which for 8 attributes (current limit) is 256.
+ * That's quite reasonable, considering we expect thousands of
+ * buckets in total.
+ *
+ * TODO This structure is used both when building the histogram, and
+ * then when using it to compute estimates. That's why the last
+ * few elements are not used once the histogram is built.
+ *
+ * TODO The limit on number of buckets is quite arbitrary, aiming for
+ * sufficient accuracy while still being fast. Probably should be
+ * replaced with a dynamic limit dependent on statistics target,
+ * number of attributes (dimensions) and statistics target
+ * associated with the attributes. Also, this needs to be related
+ * to the number of sampled rows, by either clamping it to a
+ * reasonable number (after seeing the number of rows) or using
+ * it when computing the number of rows to sample. Something like
+ * 10 rows per bucket seems reasonable.
+ *
+ * TODO We may replace the bool arrays with a suitably large data type
+ * (say, uint16 or uint32) and get rid of the allocations. It's
+ * unlikely we'll ever support more than 32 columns as that'd
+ * result in poor precision, huge histograms (splitting each
+ * dimension once would mean 2^32 buckets), and very expensive
+ * estimation. MCVItem already does it this way.
+ *
+ * TODO Actually the distinct stats (both for combination of all columns
+ * and for combinations of various subsets of columns) should be
+ * moved to a separate structure (next to histogram/MCV/...) to
+ * make it useful even without a histogram computed etc.
+ */
+typedef struct MVBucketData {
+
+ /* Frequencies of this bucket. */
+ float ntuples; /* frequency of tuples */
+ float ndistinct; /* frequency of distinct values */
+
+ /*
+ * Number of distinct values in each dimension. This is used when
+ * building the histogram (and is not serialized/deserialized), but
+ * it could be useful for estimating ndistinct for combinations of
+ * columns.
+ *
+ * It would mean tracking 2^N values for each bucket, and even if
+ * those values might be stored in 1B each, it's still a lot of space
+ * (considering the expected number of buckets).
+ *
+ * TODO Consider tracking ndistincts for all attribute combinations.
+ */
+ uint32 *ndistincts;
+
+ /*
+ * Information about dimensions being NULL-only. Not yet used.
+ */
+ bool *nullsonly;
+
+ /* lower boundaries - values and information about the inequalities */
+ Datum *min;
+ bool *min_inclusive;
+
+ /* upper boundaries - values and information about the inequalities */
+ Datum *max;
+ bool *max_inclusive;
+
+ /*
+ * Sample tuples falling into this bucket, index of the dimension
+ * the bucket was split by in the last step.
+ *
+ * XXX These fields are needed only while building the histogram,
+ * and are not serialized at all.
+ */
+ HeapTuple *rows;
+ uint32 numrows;
+ int last_split_dimension;
+
+} MVBucketData;
+
+typedef MVBucketData *MVBucket;
+
+
+typedef struct MVHistogramData {
+
+ uint32 magic; /* magic constant marker */
+ uint32 type; /* type of histogram (BASIC) */
+ uint32 nbuckets; /* number of buckets (buckets array) */
+ uint32 ndimensions; /* number of dimensions */
+
+ MVBucket *buckets; /* array of buckets */
+
+} MVHistogramData;
+
+typedef MVHistogramData *MVHistogram;
+
+
+/* used to flag stats serialized to bytea */
+#define MVHIST_MAGIC 0x7F8C5670 /* marks serialized bytea */
+#define MVHIST_TYPE_BASIC 1 /* basic histogram type */
+
+/* limits (mostly sanity check, may be relaxed in the future) */
+#define MVHIST_MAX_BUCKETS 16384 /* max number of buckets */
+
+/* bucket size in a serialized form */
+#define BUCKET_SIZE_SERIALIZED(ndims) \
+ (offsetof(MVBucketData, ndistincts) + \
+ (ndims) * (2 * sizeof(uint16) + sizeof(uint32) + 3 * sizeof(bool)))
+
+
+/*
+ * Multivariate MCV (most-common value) lists
+ *
+ * A straightforward extension of MCV items - i.e. a list (array) of
+ * combinations of attribute values, together with a frequency and
+ * null flags.
+ *
+ * This already uses the trick with using uint32 as a null bitmap.
+ *
+ * TODO Shouldn't the MCVItemData use a plain pointer for values,
+ * instead of the single-item array trick?
+ *
+ * TODO It's possible to build a special case of MCV list, storing not
+ * the actual values but only 32/64-bit hash. This is only useful
+ * for estimating equality clauses and for large varlena types.
+ */
+typedef struct MCVItemData {
+ double frequency; /* frequency of this combination */
+ uint32 nulls; /* flags of NULL values (up to 32 columns) */
+ Datum values[1]; /* variable-length (ndimensions) */
+} MCVItemData;
+
+typedef MCVItemData *MCVItem;
+
+/* multivariate MCV list - essentially an array of MCV items */
+typedef struct MCVListData {
+ uint32 magic; /* magic constant marker */
+ uint32 type; /* type of MCV list (BASIC) */
+ uint32 ndimensions; /* number of dimensions */
+ uint32 nitems; /* number of MCV items in the array */
+ MCVItem *items; /* array of MCV items */
+} MCVListData;
+
+typedef MCVListData *MCVList;
+
+/* used to flag stats serialized to bytea */
+#define MVSTAT_MCV_MAGIC 0xE1A651C2 /* marks serialized bytea */
+#define MVSTAT_MCV_TYPE_BASIC 1 /* basic MCV list type */
+
+/* TODO consider increasing the limit, and/or using statistics target */
+#define MVSTAT_MCVLIST_MAX_ITEMS 1024 /* max items in MCV list */
+
+
+/*
+ * Basic info about the stats, used when choosing what to use
+ *
+ * TODO Add info about what statistics are available (histogram, MCV,
+ * hashed MCV, associative rules).
+ */
+typedef struct MVStatsData {
+ Oid mvoid; /* OID of the stats in pg_mv_statistic */
+ int2vector *stakeys; /* attnums for columns in the stats */
+ bool hist_built; /* histogram is already available */
+ bool mcv_built; /* MCV list is already available */
+ bool assoc_built; /* associative rules available */
+} MVStatsData;
+
+typedef struct MVStatsData *MVStats;
+
+
+/*
+ * Degree of how much MCV item / histogram bucket matches a clause.
+ * This is then considered when computing the selectivity.
+ */
+#define MVSTATS_MATCH_NONE 0 /* no match at all */
+#define MVSTATS_MATCH_PARTIAL 1 /* partial match */
+#define MVSTATS_MATCH_FULL 2 /* full match */
+
+
+#define MVSTATS_MAX_DIMENSIONS 8 /* max number of attributes */
+
+/*
+ * TODO Maybe fetching the histogram/MCV list separately is inefficient?
+ * Consider adding a single `fetch_stats` method, fetching all
+ * stats specified using flags (or something like that).
+ */
+MVStats list_mv_stats(Oid relid, int *nstats, bool built_only);
+bytea * fetch_mv_histogram(Oid mvoid);
+bytea * fetch_mv_mcvlist(Oid mvoid);
+
+/* deserialization of stats (serialization is private to analyze) */
+MVHistogram deserialize_mv_histogram(bytea * data);
+MCVList deserialize_mv_mcvlist(bytea * data);
+
+/*
+ * Returns index of the attribute number within the vector (i.e. a
+ * dimension within the stats).
+ */
+int mv_get_index(AttrNumber varattno, int2vector * stakeys);
+
+/* FIXME this probably belongs somewhere else (not to operations stats) */
+extern Datum pg_mv_stats_histogram_info(PG_FUNCTION_ARGS);
+extern Datum pg_mv_stats_histogram_gnuplot(PG_FUNCTION_ARGS);
+extern Datum pg_mv_stats_mvclist_info(PG_FUNCTION_ARGS);
+
+#endif
diff --git a/src/include/utils/syscache.h b/src/include/utils/syscache.h
index f97229f..a275bd5 100644
--- a/src/include/utils/syscache.h
+++ b/src/include/utils/syscache.h
@@ -66,6 +66,7 @@ enum SysCacheIdentifier
INDEXRELID,
LANGNAME,
LANGOID,
+ MVSTATOID,
NAMESPACENAME,
NAMESPACEOID,
OPERNAMENSP,
Tomas Vondra wrote:
attached is a WIP patch implementing multivariate statistics.
I think that is pretty useful.
Oracle has an identical feature called "extended statistics".
That's probably an entirely different thing, but it would be very
nice to have statistics to estimate the correlation between columns
of different tables, to improve the estimate for the number of rows
in a join.
Yours,
Laurenz Albe
Hi!
On 13.10.2014 09:36, Albe Laurenz wrote:
Tomas Vondra wrote:
attached is a WIP patch implementing multivariate statistics.
I think that is pretty useful.
Oracle has an identical feature called "extended statistics".
That's probably an entirely different thing, but it would be very
nice to have statistics to estimate the correlation between columns
of different tables, to improve the estimate for the number of rows
in a join.
I don't have a clear idea of how that should work, but from the quick
look at how join selectivity estimation is implemented, I believe two
things might be possible:
(a) using conditional probabilities
Say we have a join "ta JOIN tb ON (ta.x = tb.y)"
Currently, the selectivity is derived from stats on the two keys.
Essentially probabilities P(x), P(y), represented by the MCV lists.
But if there are additional WHERE conditions on the tables, and we
have suitable multivariate stats, it's possible to use conditional
probabilities.
E.g. if the query actually uses
... ta JOIN tb ON (ta.x = tb.y) WHERE ta.z = 10
and we have stats on (ta.x, ta.z), we can use P(x|z=10) instead.
If the two columns are correlated, this might be much different.
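Just to illustrate with made-up numbers: if P(x=5) = 0.01 overall, but
P(x=5|z=10) = 0.5 because of the correlation, the contribution of the
(z=10) rows to the join estimate changes by a factor of 50 compared to
the independence assumption.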
(b) using this for multi-column conditions
If the join condition involves multiple columns, e.g.
ON (ta.x = tb.y AND ta.p = tb.q)
and we happen to have stats on (ta.x,ta.p) and (tb.y,tb.q), we may
use this to compute the cardinality (pretty much as we do today).
But I haven't really worked on this so far, I suspect there are various
subtle issues and I certainly don't plan to address this in the first
phase of the patch.
Tomas
On Mon, Oct 13, 2014 at 11:00 AM, Tomas Vondra <tv@fuzzy.cz> wrote:
Hi,
attached is a WIP patch implementing multivariate statistics. The code
certainly is not "ready" - parts of it look as if written by a rogue
chimp who got bored of attempts to type the complete works of William
Shakespeare, and decided to try something different.
I'm really glad you're working on this. I had been thinking of looking into
doing this myself.
The last point is really just "unfinished implementation" - the syntax I
propose is this:
ALTER TABLE ... ADD STATISTICS (options) ON (columns)
where the options influence the MCV list and histogram size, etc. The
options are recognized and may give you an idea of what it might do, but
it's not really used at the moment (except for storing in the
pg_mv_statistic catalog).
I've not really gotten around to looking at the patch yet, but I'm also
wondering if it would be simple to include functional statistics too.
The pg_mv_statistic name seems to indicate multi columns, but how about
stats on date(datetime_column), or perhaps any non-volatile function. This
would help to solve the problem highlighted here
/messages/by-id/CAApHDvp2vH=7O-gp-zAf7aWy+A-WHWVg7h3Vc6=5pf9Uf34DhQ@mail.gmail.com
. Without giving it too much thought, perhaps any expression that can be
indexed should be allowed to have stats? Would that be really difficult to
implement in comparison to what you've already done with the patch so far?
I'm quite interested in reviewing your work on this, but it appears that
some of your changes are not C89:
src\backend\commands\analyze.c(3774): error C2057: expected constant
expression [D:\Postgres\a\postgres.vcxproj]
src\backend\commands\analyze.c(3774): error C2466: cannot allocate an
array of constant size 0 [D:\Postgres\a\postgres.vcxproj]
src\backend\commands\analyze.c(3774): error C2133: 'indexes' : unknown
size [D:\Postgres\a\postgres.vcxproj]
src\backend\commands\analyze.c(4302): error C2057: expected constant
expression [D:\Postgres\a\postgres.vcxproj]
src\backend\commands\analyze.c(4302): error C2466: cannot allocate an
array of constant size 0 [D:\Postgres\a\postgres.vcxproj]
src\backend\commands\analyze.c(4302): error C2133: 'ndistincts' : unknown
size [D:\Postgres\a\postgres.vcxproj]
src\backend\commands\analyze.c(4775): error C2057: expected constant
expression [D:\Postgres\a\postgres.vcxproj]
src\backend\commands\analyze.c(4775): error C2466: cannot allocate an
array of constant size 0 [D:\Postgres\a\postgres.vcxproj]
src\backend\commands\analyze.c(4775): error C2133: 'keys' : unknown size
[D:\Postgres\a\postgres.vcxproj]
The compiler I'm using is a bit too stupid to understand the C99 syntax.
I guess you'd need to palloc() these arrays instead in order to comply with
the project standards.
http://www.postgresql.org/docs/devel/static/install-requirements.html
I'm going to sign myself up to review this, so probably my first feedback
would be the compiling problem.
Regards
David Rowley
On 29 October 2014 10:41, David Rowley wrote:
I've not really gotten around to looking at the patch yet, but I'm also
wondering if it would be simple to include functional statistics
too.
The pg_mv_statistic name seems to indicate multi columns, but how about
stats on date(datetime_column), or perhaps any non-volatile function. This
would help to solve the problem highlighted here
/messages/by-id/CAApHDvp2vH=7O-gp-zAf7aWy+A-WHWVg7h3Vc6=5pf9Uf34DhQ@mail.gmail.com
. Without giving it too much thought, perhaps any expression that can be
indexed should be allowed to have stats? Would that be really difficult to
implement in comparison to what you've already done with the patch so far?
I don't know, but it seems mostly orthogonal to what the patch aims to do.
If we add collecting statistics on expressions (on a single column), then I'd
expect it to be reasonably simple to add this to the multi-column case.
There are features like join stats or range type stats, that are probably
more directly related to the patch (but out of scope for the initial
version).
I'm quite interested in reviewing your work on this, but it appears that
some of your changes are not C89:
src\backend\commands\analyze.c(3774): error C2057: expected constant
expression [D:\Postgres\a\postgres.vcxproj]
src\backend\commands\analyze.c(3774): error C2466: cannot allocate an
array of constant size 0 [D:\Postgres\a\postgres.vcxproj]
src\backend\commands\analyze.c(3774): error C2133: 'indexes' : unknown
size [D:\Postgres\a\postgres.vcxproj]
src\backend\commands\analyze.c(4302): error C2057: expected constant
expression [D:\Postgres\a\postgres.vcxproj]
src\backend\commands\analyze.c(4302): error C2466: cannot allocate an
array of constant size 0 [D:\Postgres\a\postgres.vcxproj]
src\backend\commands\analyze.c(4302): error C2133: 'ndistincts' : unknown
size [D:\Postgres\a\postgres.vcxproj]
src\backend\commands\analyze.c(4775): error C2057: expected constant
expression [D:\Postgres\a\postgres.vcxproj]
src\backend\commands\analyze.c(4775): error C2466: cannot allocate an
array of constant size 0 [D:\Postgres\a\postgres.vcxproj]
src\backend\commands\analyze.c(4775): error C2133: 'keys' : unknown size
[D:\Postgres\a\postgres.vcxproj]
The compiler I'm using is a bit too stupid to understand the C99 syntax.
I guess you'd need to palloc() these arrays instead in order to comply with
the project standards.
http://www.postgresql.org/docs/devel/static/install-requirements.html
I'm going to sign myself up to review this, so probably my first feedback
would be the compiling problem.
I'll look into that. The thing is I don't have access to MSVC, so it's a bit
difficult to spot / fix those issues :-(
regards
Tomas
On 29/10/14 10:41, David Rowley wrote:
On Mon, Oct 13, 2014 at 11:00 AM, Tomas Vondra <tv@fuzzy.cz> wrote:
The last point is really just "unfinished implementation" - the syntax I
propose is this:
ALTER TABLE ... ADD STATISTICS (options) ON (columns)
where the options influence the MCV list and histogram size, etc. The
options are recognized and may give you an idea of what it might do, but
it's not really used at the moment (except for storing in the
pg_mv_statistic catalog).
I've not really gotten around to looking at the patch yet, but I'm also
wondering if it would be simple to include functional statistics
too. The pg_mv_statistic name seems to indicate multi columns, but how
about stats on date(datetime_column), or perhaps any non-volatile
function. This would help to solve the problem highlighted here
/messages/by-id/CAApHDvp2vH=7O-gp-zAf7aWy+A-WHWVg7h3Vc6=5pf9Uf34DhQ@mail.gmail.com
. Without giving it too much thought, perhaps any expression that can be
indexed should be allowed to have stats? Would that be really difficult
to implement in comparison to what you've already done with the patch so
far?
I would not over-complicate requirements for the first version of this,
I think it's already complicated enough.
A quick look at the patch suggests that it mainly needs discussion about
design and particular implementation choices, there is a fair amount of
TODOs and FIXMEs. I'd like to look at it too but I doubt that I'll have
time to do an in-depth review in this CF.
--
Petr Jelinek http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On 29 October 2014 12:31, Petr Jelinek wrote:
On 29/10/14 10:41, David Rowley wrote:
On Mon, Oct 13, 2014 at 11:00 AM, Tomas Vondra <tv@fuzzy.cz> wrote:
The last point is really just "unfinished implementation" - the syntax I
propose is this:
ALTER TABLE ... ADD STATISTICS (options) ON (columns)
where the options influence the MCV list and histogram size, etc. The
options are recognized and may give you an idea of what it might do, but
it's not really used at the moment (except for storing in the
pg_mv_statistic catalog).
I've not really gotten around to looking at the patch yet, but I'm also
wondering if it would be simple to include functional statistics
too. The pg_mv_statistic name seems to indicate multi columns, but how
about stats on date(datetime_column), or perhaps any non-volatile
function. This would help to solve the problem highlighted here
/messages/by-id/CAApHDvp2vH=7O-gp-zAf7aWy+A-WHWVg7h3Vc6=5pf9Uf34DhQ@mail.gmail.com
. Without giving it too much thought, perhaps any expression that can be
indexed should be allowed to have stats? Would that be really difficult
to implement in comparison to what you've already done with the patch so
far?
I would not over-complicate requirements for the first version of this,
I think it's already complicated enough.
My thoughts, exactly. I'm not willing to put more features into the
initial version of the patch. Actually, I'm thinking about ripping out
some experimental features (particularly "hashed MCV" and "associative
rules").
A quick look at the patch suggests that it mainly needs discussion about
design and particular implementation choices, there is a fair amount of
TODOs and FIXMEs. I'd like to look at it too but I doubt that I'll have
time to do an in-depth review in this CF.
Yes. I think it's a bit premature to discuss the code thoroughly at this
point - I'd like to discuss the general approach to the feature (i.e.
minimizing the impact on those not using it, etc.).
The most interesting part of the code is probably the comments,
explaining the design in more detail, known shortcomings and possible ways
to address them.
regards
Tomas
On Thu, Oct 30, 2014 at 12:48 AM, Tomas Vondra <tv@fuzzy.cz> wrote:
On 29 October 2014 12:31, Petr Jelinek wrote:
I've not really gotten around to looking at the patch yet, but I'm also
wondering if it would be simple to include functional statistics
too. The pg_mv_statistic name seems to indicate multi columns, but how
about stats on date(datetime_column), or perhaps any non-volatile
function. This would help to solve the problem highlighted here
/messages/by-id/CAApHDvp2vH=7O-gp-zAf7aWy+A-WHWVg7h3Vc6=5pf9Uf34DhQ@mail.gmail.com
. Without giving it too much thought, perhaps any expression that can be
indexed should be allowed to have stats? Would that be really difficult
to implement in comparison to what you've already done with the patch so
far?
I would not over-complicate requirements for the first version of this,
I think it's already complicated enough.
My thoughts, exactly. I'm not willing to put more features into the
initial version of the patch. Actually, I'm thinking about ripping out
some experimental features (particularly "hashed MCV" and "associative
rules").
That's fair, but I didn't really mean to imply that you should go work on
that too and that it should be part of this patch.
I was thinking more along the lines that I don't really agree with the
table name for the new stats and that at some later date someone will want
to add expression stats, so we'd probably better come up with a design that would
be friendly towards that. At this time I can only think that the name of
the table might not suit well to expression stats, I'd hate to see someone
have to invent a 3rd table to support these when we could likely come up
with something that could be extended later and still make sense both today
and in the future.
I was just looking at how expression indexes are stored in pg_index and I
see that if it's an expression index, the expression is stored in
the indexprs column which is of type pg_node_tree, so quite possibly at
some point in the future the new stats table could just have an extra
column added, and for today, we'd just need to come up with a future-proof
name... Perhaps pg_statistic_ext or pg_statisticx, and name functions and
source files something along those lines instead?
Regards
David Rowley
On Thu, Oct 30, 2014 at 12:21 AM, Tomas Vondra <tv@fuzzy.cz> wrote:
On 29 October 2014 10:41, David Rowley wrote:
I'm quite interested in reviewing your work on this, but it appears that
some of your changes are not C89:
src\backend\commands\analyze.c(3774): error C2057: expected constant
expression [D:\Postgres\a\postgres.vcxproj]
src\backend\commands\analyze.c(3774): error C2466: cannot allocate an
array of constant size 0 [D:\Postgres\a\postgres.vcxproj]
src\backend\commands\analyze.c(3774): error C2133: 'indexes' : unknown
size [D:\Postgres\a\postgres.vcxproj]
src\backend\commands\analyze.c(4302): error C2057: expected constant
expression [D:\Postgres\a\postgres.vcxproj]
src\backend\commands\analyze.c(4302): error C2466: cannot allocate an
array of constant size 0 [D:\Postgres\a\postgres.vcxproj]
src\backend\commands\analyze.c(4302): error C2133: 'ndistincts' : unknown
size [D:\Postgres\a\postgres.vcxproj]
src\backend\commands\analyze.c(4775): error C2057: expected constant
expression [D:\Postgres\a\postgres.vcxproj]
src\backend\commands\analyze.c(4775): error C2466: cannot allocate an
array of constant size 0 [D:\Postgres\a\postgres.vcxproj]
src\backend\commands\analyze.c(4775): error C2133: 'keys' : unknown size
[D:\Postgres\a\postgres.vcxproj]
I'll look into that. The thing is I don't have access to MSVC, so it's a bit
difficult to spot / fix those issues :-(
It should be a pretty simple fix, just use the files and line numbers from
the above. It's just a problem that in those 3 places you're declaring an
array of a variable size, which is not allowed in C89. The thing to do
instead would just be to palloc() the size you need and then pfree() it when
you're done.
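Just to illustrate, a minimal sketch of the conversion, using the 'keys'
array from the last error as an example (the 'natts' size variable is made
up for the example, it may be named differently in analyze.c):

    int *keys;      /* was: int keys[natts]; i.e. a C99 variable-length array */

    keys = (int *) palloc(natts * sizeof(int));

    /* ... use keys[0 .. natts - 1] exactly as before ... */

    pfree(keys);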
Regards
David Rowley
On 30 October 2014 10:17, David Rowley wrote:
On Thu, Oct 30, 2014 at 12:48 AM, Tomas Vondra <tv@fuzzy.cz> wrote:
On 29 October 2014 12:31, Petr Jelinek wrote:
I've not really gotten around to looking at the patch yet, but I'm also
wondering if it would be simple to include functional statistics
too. The pg_mv_statistic name seems to indicate multi columns, but how
about stats on date(datetime_column), or perhaps any non-volatile
function. This would help to solve the problem highlighted here
/messages/by-id/CAApHDvp2vH=7O-gp-zAf7aWy+A-WHWVg7h3Vc6=5pf9Uf34DhQ@mail.gmail.com
. Without giving it too much thought, perhaps any expression that can be
indexed should be allowed to have stats? Would that be really difficult
to implement in comparison to what you've already done with the patch so
far?
this,
I think it's already complicated enough.
My thoughts, exactly. I'm not willing to put more features into the
initial version of the patch. Actually, I'm thinking about ripping out
some experimental features (particularly "hashed MCV" and "associative
rules").That's fair, but I didn't really mean to imply that you should go work on
that too and that it should be part of this patch..
I was thinking more along the lines that I don't really agree with the
table name for the new stats and that at some later date someone will want
to add expression stats, so we'd probably better come up with a design that would
be friendly towards that. At this time I can only think that the name of
the table might not suit well to expression stats, I'd hate to see someone
have to invent a 3rd table to support these when we could likely come up
with something that could be extended later and still make sense both today
and in the future.
I was just looking at how expression indexes are stored in pg_index and I
see that if it's an expression index, the expression is stored in
the indexprs column which is of type pg_node_tree, so quite possibly at
some point in the future the new stats table could just have an extra
column added, and for today, we'd just need to come up with a future-proof
name... Perhaps pg_statistic_ext or pg_statisticx, and name functions and
source files something along those lines instead?
Ah, OK. I don't think the catalog name "pg_mv_statistic" is
inappropriate for this purpose, though. IMHO the "multivariate" does not
mean "only columns" or "no expressions", it simply describes that the
approximated density function has multiple input variables, be it
attributes or expressions.
But maybe there's a better name.
Tomas
On 30.10.2014 10:23, David Rowley wrote:
On Thu, Oct 30, 2014 at 12:21 AM, Tomas Vondra <tv@fuzzy.cz> wrote:
On 29 October 2014 10:41, David Rowley wrote:
I'm quite interested in reviewing your work on this, but it appears that
some of your changes are not C89:
src\backend\commands\analyze.c(3774): error C2057: expected constant
expression [D:\Postgres\a\postgres.vcxproj]
src\backend\commands\analyze.c(3774): error C2466: cannot allocate an
array of constant size 0 [D:\Postgres\a\postgres.vcxproj]
src\backend\commands\analyze.c(3774): error C2133: 'indexes' : unknown
size [D:\Postgres\a\postgres.vcxproj]
src\backend\commands\analyze.c(4302): error C2057: expected constant
expression [D:\Postgres\a\postgres.vcxproj]
src\backend\commands\analyze.c(4302): error C2466: cannot allocate an
array of constant size 0 [D:\Postgres\a\postgres.vcxproj]
src\backend\commands\analyze.c(4302): error C2133: 'ndistincts' : unknown
size [D:\Postgres\a\postgres.vcxproj]
src\backend\commands\analyze.c(4775): error C2057: expected constant
expression [D:\Postgres\a\postgres.vcxproj]
src\backend\commands\analyze.c(4775): error C2466: cannot allocate an
array of constant size 0 [D:\Postgres\a\postgres.vcxproj]
src\backend\commands\analyze.c(4775): error C2133: 'keys' : unknown size
[D:\Postgres\a\postgres.vcxproj]
I'll look into that. The thing is I don't have access to MSVC, so
it's a bit difficult to spot / fix those issues :-(
It should be a pretty simple fix, just use the files and line
numbers from the above. It's just a problem that in those 3 places
you're declaring an array of a variable size, which is not allowed in
C89. The thing to do instead would just be to palloc() the size you
need and then pfree() it when you're done.
Attached is a patch that should fix these issues.
The bad news is there are a few installcheck failures (they were in the
previous patch too, but I hadn't noticed for some reason). Apparently,
there's some mixup in how the patch handles Var->varno in some cases,
causing issues with a handful of regression tests.
The problem is that is_mv_compatible (checking whether the condition is
compatible with multivariate stats) does this
if (! ((varRelid == 0) || (varRelid == var->varno)))
return false;
/* Also skip special varno values, and system attributes ... */
if ((IS_SPECIAL_VARNO(var->varno)) ||
(! AttrNumberIsForUserDefinedAttr(var->varattno)))
return false;
assuming that after this, varno represents an index into the range
table, and passes it out to the caller.
And the caller (collect_mv_attnums) does this:
RelOptInfo *rel = find_base_rel(root, varno);
which fails with errors like these:
ERROR: no relation entry for relid 0
ERROR: no relation entry for relid 1880
or whatever. What's even stranger is this:
regression=# SELECT table_name, is_updatable, is_insertable_into
regression-# FROM information_schema.views
regression-# WHERE table_name = 'rw_view1';
ERROR: no relation entry for relid 0
regression=# SELECT table_name, is_updatable, is_insertable_into
regression-# FROM information_schema.views
regression-# ;
regression=# SELECT table_name, is_updatable, is_insertable_into
regression-# FROM information_schema.views
regression-# WHERE table_name = 'rw_view1';
table_name | is_updatable | is_insertable_into
------------+--------------+--------------------
(0 rows)
regression=# explain SELECT table_name, is_updatable, is_insertable_into
FROM information_schema.views
WHERE table_name = 'rw_view1';
ERROR: no relation entry for relid 0
So, the query fails. After removing the WHERE clause it works, and this
somehow fixes the original query (with the WHERE clause). Nevertheless,
I still can't run EXPLAIN on the query.
Clearly, I'm doing something wrong. I suspect it's caused either by
conditions involving function calls, or the fact that the view is a join
of multiple tables. But what?
For simple queries (single table, ...) it seems to be working fine.
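One thing I plan to try (just a rough sketch, not tested, and quite
possibly missing some cases) is to sanity-check the varno before the
find_base_rel() lookup, along these lines:

    /* bail out unless varno points at a plain base relation */
    if ((var->varno == 0) || (var->varno >= root->simple_rel_array_size))
        return false;

    if (planner_rt_fetch(var->varno, root)->rtekind != RTE_RELATION)
        return false;

But maybe the root cause is elsewhere - suggestions welcome.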
regards
Tomas
Attachments:
multivar-stats-v2.patch (text/x-diff)
diff --git a/src/backend/catalog/Makefile b/src/backend/catalog/Makefile
index b257b02..6e63afe 100644
--- a/src/backend/catalog/Makefile
+++ b/src/backend/catalog/Makefile
@@ -32,6 +32,7 @@ POSTGRES_BKI_SRCS = $(addprefix $(top_srcdir)/src/include/catalog/,\
pg_attrdef.h pg_constraint.h pg_inherits.h pg_index.h pg_operator.h \
pg_opfamily.h pg_opclass.h pg_am.h pg_amop.h pg_amproc.h \
pg_language.h pg_largeobject_metadata.h pg_largeobject.h pg_aggregate.h \
+ pg_mv_statistic.h \
pg_statistic.h pg_rewrite.h pg_trigger.h pg_event_trigger.h pg_description.h \
pg_cast.h pg_enum.h pg_namespace.h pg_conversion.h pg_depend.h \
pg_database.h pg_db_role_setting.h pg_tablespace.h pg_pltemplate.h \
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index a819952..bb82fe8 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -152,6 +152,18 @@ CREATE VIEW pg_indexes AS
LEFT JOIN pg_tablespace T ON (T.oid = I.reltablespace)
WHERE C.relkind IN ('r', 'm') AND I.relkind = 'i';
+CREATE VIEW pg_mv_stats AS
+ SELECT
+ N.nspname AS schemaname,
+ C.relname AS tablename,
+ S.stakeys AS attnums,
+ length(S.stamcv) AS mcvbytes,
+ pg_mv_stats_mvclist_info(S.stamcv) AS mcvinfo,
+ length(S.stahist) AS histbytes,
+ pg_mv_stats_histogram_info(S.stahist) AS histinfo
+ FROM (pg_mv_statistic S JOIN pg_class C ON (C.oid = S.starelid))
+ LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace);
+
CREATE VIEW pg_stats AS
SELECT
nspname AS schemaname,
diff --git a/src/backend/commands/analyze.c b/src/backend/commands/analyze.c
index 954e5a6..32e0d07 100644
--- a/src/backend/commands/analyze.c
+++ b/src/backend/commands/analyze.c
@@ -27,6 +27,7 @@
#include "catalog/indexing.h"
#include "catalog/pg_collation.h"
#include "catalog/pg_inherits_fn.h"
+#include "catalog/pg_mv_statistic.h"
#include "catalog/pg_namespace.h"
#include "commands/dbcommands.h"
#include "commands/tablecmds.h"
@@ -54,7 +55,11 @@
#include "utils/syscache.h"
#include "utils/timestamp.h"
#include "utils/tqual.h"
+#include "utils/fmgroids.h"
+#include "utils/builtins.h"
+#include "utils/mvstats.h"
+#include "access/sysattr.h"
/* Data structure for Algorithm S from Knuth 3.4.2 */
typedef struct
@@ -111,6 +116,62 @@ static Datum std_fetch_func(VacAttrStatsP stats, int rownum, bool *isNull);
static Datum ind_fetch_func(VacAttrStatsP stats, int rownum, bool *isNull);
+/* multivariate statistics (histogram, MCV list, associative rules) */
+
+static void build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
+ int natts, VacAttrStats **vacattrstats);
+static void update_mv_stats(Oid relid,
+ MVHistogram histogram, MCVList mcvlist);
+
+/* multivariate histograms */
+static MVHistogram build_mv_histogram(int numrows, HeapTuple *rows,
+ int2vector *attrs,
+ int attr_cnt, VacAttrStats **vacattrstats,
+ int numrows_total);
+static MVBucket create_initial_mv_bucket(int numrows, HeapTuple *rows,
+ int2vector *attrs, int natts,
+ VacAttrStats **vacattrstats);
+static MVBucket select_bucket_to_partition(int nbuckets, MVBucket * buckets);
+static MVBucket partition_bucket(MVBucket bucket, int2vector *attrs,
+ int natts, VacAttrStats **vacattrstats);
+static MVBucket copy_mv_bucket(MVBucket bucket, uint32 ndimensions);
+
+static void update_bucket_ndistinct(MVBucket bucket, int2vector *attrs,
+ VacAttrStats ** stats);
+static void update_dimension_ndistinct(MVBucket bucket, int dimension,
+ int2vector *attrs,
+ VacAttrStats ** stats,
+ bool update_boundaries);
+/* multivariate MCV list */
+static MCVList build_mv_mcvlist(int numrows, HeapTuple *rows,
+ int2vector *attrs,
+ int natts, VacAttrStats **vacattrstats,
+ int *numrows_filtered);
+
+/* multivariate associative rules */
+static void build_mv_associations(int numrows, HeapTuple *rows,
+ int2vector *attrs,
+ int natts, VacAttrStats **vacattrstats);
+
+/* serialization */
+static bytea * serialize_mv_histogram(MVHistogram histogram);
+static bytea * serialize_mv_mcvlist(MCVList mcvlist);
+
+/* comparators, used when constructing multivariate stats */
+static int compare_scalars_simple(const void *a, const void *b, void *arg);
+static int compare_scalars_partition(const void *a, const void *b, void *arg);
+static int compare_scalars_memcmp(const void *a, const void *b, void *arg);
+static int compare_scalars_memcmp_2(const void *a, const void *b);
+
+static VacAttrStats ** lookup_var_attr_stats(int2vector *attrs,
+ int natts, VacAttrStats **vacattrstats);
+
+/* some debugging methods */
+#ifdef MVSTATS_DEBUG
+static void print_mv_histogram_info(MVHistogram histogram);
+#endif
+
+
/*
* analyze_rel() -- analyze one relation
*/
@@ -472,6 +533,13 @@ do_analyze_rel(Relation onerel, VacuumStmt *vacstmt,
* all analyzable columns. We use a lower bound of 100 rows to avoid
* possible overflow in Vitter's algorithm. (Note: that will also be the
* target in the corner case where there are no analyzable columns.)
+ *
+ * FIXME This sample sizing is mostly OK when computing stats for
+ * individual columns, but rather insufficient when computing
+ * multivariate stats (histograms, MCV lists, ...). For a small
+ * number of dimensions it works, but for more complex stats it'd
+ * be nice to use a sample proportional to the table size (say,
+ * 0.5% - 1%) instead of a fixed size.
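+ *
+ * (For comparison, the default statistics target of 100 means
+ * sampling 300 * 100 = 30000 rows, while 0.5% of a 100M row
+ * table would be 500000 rows.)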
*/
targrows = 100;
for (i = 0; i < attr_cnt; i++)
@@ -574,6 +642,9 @@ do_analyze_rel(Relation onerel, VacuumStmt *vacstmt,
update_attstats(RelationGetRelid(Irel[ind]), false,
thisdata->attr_cnt, thisdata->vacattrstats);
}
+
+ /* Build multivariate stats (if there are any). */
+ build_mv_stats(onerel, numrows, rows, attr_cnt, vacattrstats);
}
/*
@@ -2815,3 +2886,1985 @@ compare_mcvs(const void *a, const void *b)
return da - db;
}
+
+/*
+ * Compute requested multivariate stats, using the rows sampled for the
+ * plain (single-column) stats.
+ *
+ * This fetches a list of stats from pg_mv_statistic, computes the stats
+ * and serializes them back into the catalog (as bytea values).
+ */
+static void
+build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
+ int natts, VacAttrStats **vacattrstats)
+{
+ int i;
+ MVStats mvstats;
+ int nmvstats;
+
+ /*
+ * Fetch defined MV groups from pg_mv_statistic, and then compute
+ * the MV statistics (histograms for now).
+ *
+ * TODO move this to a separate method or something ...
+ */
+ mvstats = list_mv_stats(RelationGetRelid(onerel), &nmvstats, false);
+
+ for (i = 0; i < nmvstats; i++)
+ {
+ MCVList mcvlist = NULL;
+ MVHistogram histogram = NULL;
+ int numrows_filtered = 0;
+
+ /* int2 vector of attnums the stats should be computed on */
+ int2vector * attrs = mvstats[i].stakeys;
+
+ /* check allowed number of dimensions */
+ Assert((attrs->dim1 >= 2) && (attrs->dim1 <= MVSTATS_MAX_DIMENSIONS));
+
+ /*
+ * Analyze associations between pairs of columns.
+ *
+ * FIXME store the identified associations back to pg_mv_statistic
+ */
+ build_mv_associations(numrows, rows, attrs, natts, vacattrstats);
+
+ /* build the MCV list */
+ mcvlist = build_mv_mcvlist(numrows, rows, attrs, natts, vacattrstats, &numrows_filtered);
+
+ /*
+ * Build a multivariate histogram on the columns.
+ *
+ * FIXME remove the rows used to build the MCV from the histogram.
+ * Another option might be subtracting the MCV selectivities
+ * from the histogram, but I'm not sure whether that works
+ * accurately (maybe it introduces additional errors).
+ */
+ if (numrows_filtered > 0)
+ histogram = build_mv_histogram(numrows_filtered, rows, attrs, natts, vacattrstats, numrows);
+
+ /* store the histogram / MCV list in the catalog */
+ update_mv_stats(mvstats[i].mvoid, histogram, mcvlist);
+
+#ifdef MVSTATS_DEBUG
+ print_mv_histogram_info(histogram);
+#endif
+
+ }
+}
+
+/*
+ * Lookup the VacAttrStats info for the selected columns, with indexes
+ * matching the attrs vector (to make it easy to work with when
+ * computing multivariate stats).
+ */
+static VacAttrStats **
+lookup_var_attr_stats(int2vector *attrs, int natts, VacAttrStats **vacattrstats)
+{
+ int i, j;
+ int numattrs = attrs->dim1;
+ VacAttrStats **stats = (VacAttrStats**)palloc0(numattrs * sizeof(VacAttrStats*));
+
+ /* lookup VacAttrStats info for the requested columns (same attnum) */
+ for (i = 0; i < numattrs; i++)
+ {
+ stats[i] = NULL;
+ for (j = 0; j < natts; j++)
+ {
+ if (attrs->values[i] == vacattrstats[j]->tupattnum)
+ {
+ stats[i] = vacattrstats[j];
+ break;
+ }
+ }
+
+ /*
+ * Check that we found the info, that the attnum matches and
+ * that there's the requested 'lt' operator and that the type
+ * is 'passed-by-value'.
+ */
+ Assert(stats[i] != NULL);
+ Assert(stats[i]->tupattnum == attrs->values[i]);
+
+ /* FIXME This is a rather ugly way to check for 'ltopr' (which
+ * is defined for 'scalar' attributes).
+ */
+ Assert(stats[i]->compute_stats == compute_scalar_stats);
+
+ /* TODO remove the 'pass by value' requirement */
+ Assert(stats[i]->attrtype->typbyval);
+ }
+
+ return stats;
+}
+
+/*
+ * TODO Add ndistinct estimation, probably the one described in "Towards
+ * Estimation Error Guarantees for Distinct Values, PODS 2000,
+ * p. 268-279" (the ones called GEE, or maybe AE).
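+ *
+ * (If I recall correctly, GEE estimates ndistinct as
+ * sqrt(N/n) * f1 + f2 + f3 + ..., where f_k is the number of
+ * values observed exactly k times in a sample of n rows taken
+ * from N rows total.)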
+ *
+ * TODO The "combined" ndistinct is more likely to scale with the number
+ * of rows (in the table), because a single column behaving this
+ * way is sufficient for such behavior.
+ */
+static MVBucket
+create_initial_mv_bucket(int numrows, HeapTuple *rows, int2vector *attrs,
+ int natts, VacAttrStats **vacattrstats)
+{
+ int i;
+ int numattrs = attrs->dim1;
+
+ /* info for the interesting attributes only */
+ VacAttrStats **stats = lookup_var_attr_stats(attrs, natts, vacattrstats);
+
+ /* resulting bucket */
+ MVBucket bucket = (MVBucket)palloc0(sizeof(MVBucketData));
+
+ Assert(numrows > 0);
+ Assert(rows != NULL);
+ Assert((numattrs >= 2) && (numattrs <= MVSTATS_MAX_DIMENSIONS));
+
+ /* allocate the per-dimension arrays */
+ bucket->ndistincts = (uint32*)palloc0(numattrs * sizeof(uint32));
+ bucket->nullsonly = (bool*)palloc0(numattrs * sizeof(bool));
+
+ /* inclusiveness boundaries - lower/upper bounds */
+ bucket->min_inclusive = (bool*)palloc0(numattrs * sizeof(bool));
+ bucket->max_inclusive = (bool*)palloc0(numattrs * sizeof(bool));
+
+ /* lower/upper boundaries */
+ bucket->min = (Datum*)palloc0(numattrs * sizeof(Datum));
+ bucket->max = (Datum*)palloc0(numattrs * sizeof(Datum));
+
+ /*
+ * All the sample rows fall into the initial bucket.
+ *
+ * FIXME This is wrong (unless all columns are NOT NULL), because we
+ * skipped the NULL values.
+ */
+ bucket->numrows = numrows;
+ bucket->ntuples = numrows;
+ bucket->rows = rows;
+
+ /*
+ * Update the number of ndistinct combinations in the bucket (which
+ * we use when selecting bucket to partition), and then number of
+ * distinct values for each partition (which we use when choosing
+ * which dimension to split).
+ */
+ update_bucket_ndistinct(bucket, attrs, stats);
+
+ for (i = 0; i < numattrs; i++)
+ update_dimension_ndistinct(bucket, i, attrs, stats, true);
+
+ /*
+ * The initial bucket was not split at all, so we'll start with the
+ * first dimension in the next round (index = 0).
+ */
+ bucket->last_split_dimension = -1;
+
+ return bucket;
+}
+
+/*
+ * TODO Fix to handle arbitrarily-sized histograms (not just 2D ones)
+ * and call the right output procedures (for the particular type).
+ *
+ * TODO This should somehow fetch info about the data types, and use
+ * the appropriate output functions to print the boundary values.
+ * Right now this prints the 8B value as an integer.
+ *
+ * TODO Also, provide a special function for 2D histogram, printing
+ * a gnuplot script (with rectangles).
+ *
+ * TODO For string types (once supported) we can sort the strings first,
+ * assign them a sequence of integers and use the original values
+ * as labels.
+ */
+#ifdef MVSTATS_DEBUG
+static void
+print_mv_histogram_info(MVHistogram histogram)
+{
+ int i = 0;
+
+ elog(WARNING, "histogram nbuckets=%d", histogram->nbuckets);
+
+ for (i = 0; i < histogram->nbuckets; i++)
+ {
+ MVBucket bucket = histogram->buckets[i];
+ elog(WARNING, " bucket %d : ndistinct=%f ntuples=%d min=[%ld, %ld], max=[%ld, %ld] distinct=[%d,%d]",
+ i, bucket->ndistinct, bucket->numrows,
+ bucket->min[0], bucket->min[1], bucket->max[0], bucket->max[1],
+ bucket->ndistincts[0], bucket->ndistincts[1]);
+ }
+}
+#endif
+
+/*
+ * A very simple partitioning selection criteria - choose the bucket
+ * with the highest number of distinct values.
+ *
+ * Returns either pointer to the bucket selected to be partitioned,
+ * or NULL if there are no buckets that may be split (i.e. all buckets
+ * contain a single distinct value).
+ *
+ * TODO Consider other partitioning criteria (v-optimal, maxdiff etc.).
+ *
+ * TODO Allowing the bucket to degenerate to a single combination of
+ * values makes it rather strange MCV list. Maybe we should use
+ * higher lower boundary, or maybe make the selection criteria
+ * more complex (e.g. consider number of rows in the bucket, etc.).
+ *
+ * That however is different from buckets 'degenerated' only for
+ * some dimensions (e.g. half of them), which is perfectly
+ * appropriate for statistics on a combination of low and high
+ * cardinality columns.
+ */
+static MVBucket
+select_bucket_to_partition(int nbuckets, MVBucket * buckets)
+{
+ int i;
+ int ndistinct = 1; /* if ndistinct=1, we can't split the bucket */
+ MVBucket bucket = NULL;
+
+ for (i = 0; i < nbuckets; i++)
+ {
+ /* if the ndistinct count is higher, use this bucket */
+ if (buckets[i]->ndistinct > ndistinct) {
+ bucket = buckets[i];
+ ndistinct = buckets[i]->ndistinct;
+ }
+ }
+
+ /* may be NULL if there are no buckets with (ndistinct > 1) */
+ return bucket;
+}
+
+/*
+ * A simple bucket partitioning implementation - splits the dimensions in
+ * a round-robin manner (considering only those with ndistinct>1). That
+ * is, first dimension 0 is split, then 1, 2, ... until reaching the
+ * end of the attribute list, and then wrapping back to 0. Of course,
+ * dimensions with a single distinct value are skipped.
+ *
+ * This is essentially what Muralikrishna/DeWitt described in their SIGMOD
+ * article (M. Muralikrishna, David J. DeWitt: Equi-Depth Histograms For
+ * Estimating Selectivity Factors For Multi-Dimensional Queries. SIGMOD
+ * Conference 1988: 28-36).
+ *
+ * There are multiple histogram options, centered around the partitioning
+ * criteria, specifying both how to choose a bucket and the dimension
+ * most in need of a split. For a nice summary and general overview, see
+ * "rK-Hist : an R-Tree based histogram for multi-dimensional selectivity
+ * estimation" thesis by J. A. Lopez, Concordia University, p.34-37 (and
+ * possibly p. 32-34 for explanation of the terms).
+ *
+ * This splits the bucket by tweaking the existing one, and returning the
+ * new bucket (essentially shrinking the existing one in-place and returning
+ * the other "half" as a new bucket). The caller is responsible for adding
+ * the new bucket into the list of buckets.
+ *
+ * TODO It requires care to prevent splitting only one dimension and not
+ * splitting another one at all (which might happen easily in case of
+ * strongly dependent columns - e.g. y=x).
+ *
+ * TODO Should probably consider statistics target for the columns (e.g. to
+ * split dimensions with higher statistics target more frequently).
+ */
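+/*
+ * For example, if a bucket has per-dimension distinct counts [4, 1, 3],
+ * the first split uses dimension 0, the next split of that bucket uses
+ * dimension 2 (dimension 1 is skipped, having a single distinct value),
+ * then dimension 0 again, and so on.
+ */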
+static MVBucket
+partition_bucket(MVBucket bucket, int2vector *attrs,
+ int natts, VacAttrStats **vacattrstats)
+{
+ int i;
+ int dimension;
+ int numattrs = attrs->dim1;
+
+ Datum split_value;
+ MVBucket new_bucket;
+
+ /* needed for sort, when looking for the split value */
+ bool isNull;
+ int nvalues = 0;
+ StdAnalyzeData * mystats = NULL;
+ ScalarItem * values = (ScalarItem*)palloc0(bucket->numrows * sizeof(ScalarItem));
+ SortSupportData ssup;
+
+ /* looking for the split value */
+ int ndistinct = 1; /* number of distinct values below current value */
+ int nrows = 1; /* number of rows below current value */
+
+ /* needed when splitting the values */
+ HeapTuple * oldrows = bucket->rows;
+ int oldnrows = bucket->numrows;
+
+ /* info for the interesting attributes only */
+ VacAttrStats **stats = lookup_var_attr_stats(attrs, natts, vacattrstats);
+
+ /*
+ * We can't split buckets with a single distinct value (this also
+ * disqualifies NULL-only dimensions). Also, there have to be multiple
+ * sample rows (otherwise there couldn't be multiple distinct values).
+ */
+ Assert(bucket->ndistinct > 1);
+ Assert(bucket->numrows > 1);
+ Assert((numattrs >= 2) && (numattrs <= MVSTATS_MAX_DIMENSIONS));
+
+ /*
+ * Look for the next dimension to split, in a round robin manner.
+ * We'll use the first one with (ndistinct > 1).
+ *
+ * If we happen to wrap all the way around, something clearly went
+ * wrong (we iterate a local variable instead of updating
+ * last_split_dimension in place, so that we can detect this).
+ */
+ dimension = bucket->last_split_dimension;
+ while (true)
+ {
+ dimension = (dimension + 1) % numattrs;
+
+ if (bucket->ndistincts[dimension] > 1)
+ break;
+
+ /* if we ran the previous split dimension, it's infinite loop */
+ Assert(dimension != bucket->last_split_dimension);
+ }
+
+ /* Remember the dimension for the next split of this bucket. */
+ bucket->last_split_dimension = dimension;
+
+ /*
+ * Walk through the selected dimension, collect and sort the values
+ * and then choose the value to use as the new boundary.
+ */
+ mystats = (StdAnalyzeData *) stats[dimension]->extra_data;
+
+ /* initialize sort support, etc. */
+ memset(&ssup, 0, sizeof(ssup));
+ ssup.ssup_cxt = CurrentMemoryContext;
+
+ /* We always use the default collation for statistics */
+ ssup.ssup_collation = DEFAULT_COLLATION_OID;
+ ssup.ssup_nulls_first = false;
+
+ PrepareSortSupportFromOrderingOp(mystats->ltopr, &ssup);
+
+ for (i = 0; i < bucket->numrows; i++)
+ {
+ /* remember the index of the sample row, to make the partitioning simpler */
+ values[nvalues].value = heap_getattr(bucket->rows[i], attrs->values[dimension],
+ stats[dimension]->tupDesc, &isNull);
+ values[nvalues].tupno = i;
+
+ /* no NULL values allowed here (we don't do splits by null-only dimensions) */
+ Assert(!isNull);
+
+ nvalues++;
+ }
+
+ /* sort the array (pass-by-value datums) */
+ qsort_arg((void *) values, nvalues, sizeof(ScalarItem),
+ compare_scalars_partition, (void *) &ssup);
+
+ /*
+ * We know there are bucket->ndistincts[dimension] distinct values
+ * in this dimension, and we want to split this into half, so walk
+ * through the array and stop once we see (ndistinct/2) values.
+ *
+ * We always choose the "next" value, i.e. (n/2+1)-th distinct value,
+ * and use it as an exclusive upper boundary (and inclusive lower
+ * boundary).
+ *
+ * TODO Maybe we should use "average" of the two middle distinct
+ * values (at least for even distinct counts), but that would
+ * require being able to do an average (which does not work
+ * for non-arithmetic types).
+ *
+ * TODO Another option is to look for a split that'd give about
+ * 50% tuples (not distinct values) in each partition. That
+ * might work better when there are a few very frequent
+ * values, and many rare ones.
+ */
+ split_value = values[0].value;
+ for (i = 1; i < bucket->numrows; i++)
+ {
+ /* count distinct values */
+ if (values[i].value != values[i-1].value)
+ ndistinct += 1;
+
+ /* once we've seen half the distinct values, use the current value as the split */
+ if (ndistinct > bucket->ndistincts[dimension] / 2)
+ {
+ split_value = values[i].value;
+ break;
+ }
+
+ /* keep track how many rows belong to the first bucket */
+ nrows += 1;
+ }
+
+ Assert(nrows > 0);
+ Assert(nrows < bucket->numrows);
+
+ /* create the new bucket as an (incomplete) copy of the one being partitioned */
+ new_bucket = copy_mv_bucket(bucket, numattrs);
+
+ /*
+ * Do the actual split of the chosen dimension, using the split value as the
+ * upper bound for the existing bucket, and lower bound for the new one.
+ */
+ bucket->max[dimension] = split_value;
+ new_bucket->min[dimension] = split_value;
+
+ bucket->max_inclusive[dimension] = false;
+ new_bucket->min_inclusive[dimension] = true;
+
+ /*
+ * Redistribute the sample tuples using the 'ScalarItem->tupno'
+ * index. We know 'nrows' rows should remain in the original
+ * bucket and the rest goes to the new one.
+ */
+
+ bucket->rows = (HeapTuple*)palloc0(nrows * sizeof(HeapTuple));
+ new_bucket->rows = (HeapTuple*)palloc0((oldnrows - nrows) * sizeof(HeapTuple));
+
+ bucket->numrows = nrows;
+ new_bucket->numrows = (oldnrows - nrows);
+
+ /*
+ * The first nrows should go to the first bucket, the rest should
+ * go to the new one. Use the tupno field to get the actual HeapTuple
+ * row from the original array of sample rows.
+ */
+ for (i = 0; i < nrows; i++)
+ memcpy(&bucket->rows[i], &oldrows[values[i].tupno], sizeof(HeapTuple));
+
+ for (i = nrows; i < oldnrows; i++)
+ memcpy(&new_bucket->rows[i-nrows], &oldrows[values[i].tupno], sizeof(HeapTuple));
+
+ /* update ndistinct values for the buckets (total and per dimension) */
+ update_bucket_ndistinct(bucket, attrs, stats);
+ update_bucket_ndistinct(new_bucket, attrs, stats);
+
+ /*
+ * TODO We don't need to do this for the dimension we used for split,
+ * because we know how many distinct values went to each partition.
+ */
+ for (i = 0; i < numattrs; i++)
+ {
+ update_dimension_ndistinct(bucket, i, attrs, stats, false);
+ update_dimension_ndistinct(new_bucket, i, attrs, stats, false);
+ }
+
+ pfree(oldrows);
+ pfree(values);
+
+ return new_bucket;
+}
+
+/*
+ * Copy a histogram bucket. The copy does not include the build-time
+ * data, i.e. sampled rows etc.
+ */
+static MVBucket
+copy_mv_bucket(MVBucket bucket, uint32 ndimensions)
+{
+ MVBucket new_bucket = (MVBucket)palloc0(sizeof(MVBucketData));
+
+ /* Copy only the fields that will stay the same after the split;
+ * we'll recompute the rest once the split is done. */
+
+ new_bucket->last_split_dimension = bucket->last_split_dimension;
+
+ /* allocate the per-dimension arrays */
+ new_bucket->ndistincts = (uint32*)palloc0(ndimensions * sizeof(uint32));
+ new_bucket->nullsonly = (bool*)palloc0(ndimensions * sizeof(bool));
+
+ /* inclusiveness boundaries - lower/upper bounds */
+ new_bucket->min_inclusive = (bool*)palloc0(ndimensions * sizeof(bool));
+ new_bucket->max_inclusive = (bool*)palloc0(ndimensions * sizeof(bool));
+
+ /* lower/upper boundaries */
+ new_bucket->min = (Datum*)palloc0(ndimensions * sizeof(Datum));
+ new_bucket->max = (Datum*)palloc0(ndimensions * sizeof(Datum));
+
+ /* copy data */
+ memcpy(new_bucket->nullsonly, bucket->nullsonly, ndimensions * sizeof(bool));
+
+ memcpy(new_bucket->min_inclusive, bucket->min_inclusive, ndimensions*sizeof(bool));
+ memcpy(new_bucket->min, bucket->min, ndimensions*sizeof(Datum));
+
+ memcpy(new_bucket->max_inclusive, bucket->max_inclusive, ndimensions*sizeof(bool));
+ memcpy(new_bucket->max, bucket->max, ndimensions*sizeof(Datum));
+
+ return new_bucket;
+}
+
+/*
+ * Counts the number of distinct values in the bucket. This just copies
+ * the Datum values into a simple array and sorts them using a
+ * memcmp-based comparator, which means it only works for pass-by-value
+ * data types (assuming they don't use collations etc.).
+ *
+ * FIXME Make this work with all types (not just pass-by-value ones).
+ *
+ * TODO This might evaluate and store the distinct counts for all
+ * possible attribute combinations. The assumption is this might be
+ * useful for estimating things like GROUP BY cardinalities (e.g.
+ * in cases when some buckets contain a lot of low-frequency
+ * combinations, and other buckets contain few high-frequency ones).
+ *
+ * But it's unclear whether it's worth the price. Computing this
+ * is actually quite cheap, because it may be evaluated at the very
+ * end, when the buckets are rather small (so sorting it in 2^N ways
+ * is not a big deal). Assuming the partitioning algorithm does not
+ * use these values to do the decisions, of course (the current
+ * algorithm does not).
+ *
+ * The overhead with storing, fetching and parsing the data is more
+ * concerning - adding 2^N values per bucket (even if it's just
+ * a 1B or 2B value) would significantly bloat the histogram, and
+ * thus the impact on optimizer. Which is not really desirable.
+ *
+ * TODO This only updates the ndistinct for the sample (or bucket), but
+ * we eventually need an estimate of the total number of distinct
+ * values in the dataset. It's possible to either use the current
+ * 1D approach (i.e., if it's more than 10% of the sample, assume
+ * it's proportional to the number of rows). Or it's possible to
+ * implement the estimator suggested in the article, supposedly
+ * giving 'optimal' estimates (w.r.t. probability of error).
+ */
+static void
+update_bucket_ndistinct(MVBucket bucket, int2vector *attrs, VacAttrStats ** stats)
+{
+ int i, j, idx = 0;
+ int numattrs = attrs->dim1;
+ Size len = sizeof(Datum) * numattrs;
+ bool isNull;
+
+ /*
+ * We could collect this while walking through all the attributes
+ * above (as it is, we have to call heap_getattr twice).
+ */
+ Datum * values = palloc0(bucket->numrows * numattrs * sizeof(Datum));
+
+ for (j = 0; j < bucket->numrows; j++)
+ for (i = 0; i < numattrs; i++)
+ values[idx++] = heap_getattr(bucket->rows[j], attrs->values[i],
+ stats[i]->tupDesc, &isNull);
+
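+ /* Sort the rows as opaque chunks of (numattrs * sizeof(Datum))
+ * bytes, so that equal combinations of values become adjacent. */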
+ qsort_arg((void *) values, bucket->numrows, sizeof(Datum) * numattrs,
+ compare_scalars_memcmp, &len);
+
+ bucket->ndistinct = 1;
+
+ for (i = 1; i < bucket->numrows; i++)
+ if (memcmp(&values[i * numattrs], &values[(i-1) * numattrs], len) != 0)
+ bucket->ndistinct += 1;
+
+ pfree(values);
+
+}
+
+/*
+ * Count distinct values per bucket dimension.
+ *
+ * TODO Remove unnecessary parameters - don't pass in the whole arrays,
+ * just the proper elements.
+ */
+static void
+update_dimension_ndistinct(MVBucket bucket, int dimension, int2vector *attrs,
+ VacAttrStats ** stats, bool update_boundaries)
+{
+ int j;
+ int nvalues = 0;
+ bool isNull;
+ Datum * values = (Datum*)palloc0(bucket->numrows * sizeof(Datum));
+ SortSupportData ssup;
+
+ StdAnalyzeData * mystats = (StdAnalyzeData *) stats[dimension]->extra_data;
+
+ memset(&ssup, 0, sizeof(ssup));
+ ssup.ssup_cxt = CurrentMemoryContext;
+
+ /* We always use the default collation for statistics */
+ ssup.ssup_collation = DEFAULT_COLLATION_OID;
+ ssup.ssup_nulls_first = false;
+
+ PrepareSortSupportFromOrderingOp(mystats->ltopr, &ssup);
+
+ for (j = 0; j < bucket->numrows; j++)
+ {
+ values[nvalues] = heap_getattr(bucket->rows[j], attrs->values[dimension],
+ stats[dimension]->tupDesc, &isNull);
+
+ /* ignore NULL values */
+ if (! isNull)
+ nvalues++;
+ }
+
+ /* there's always at least 1 distinct value (may be NULL) */
+ bucket->ndistincts[dimension] = 1;
+
+ /* if there are only NULL values in the column, mark the dimension
+ * as NULL-only and bail out */
+ if (nvalues == 0)
+ {
+ pfree(values);
+ bucket->nullsonly[dimension] = true;
+ return;
+ }
+
+ /* sort the array (pass-by-value datums) */
+ qsort_arg((void *) values, nvalues, sizeof(Datum),
+ compare_scalars_simple, (void *) &ssup);
+
+ /*
+ * Update min/max boundaries to the smallest bounding box. Generally, this
+ * needs to be done only when constructing the initial bucket.
+ */
+ if (update_boundaries)
+ {
+ /* store the min/max values */
+ bucket->min[dimension] = values[0];
+ bucket->min_inclusive[dimension] = true;
+
+ bucket->max[dimension] = values[nvalues-1];
+ bucket->max_inclusive[dimension] = true;
+ }
+
+ /*
+ * Walk through the array and count distinct values by comparing
+ * succeeding values.
+ *
+ * FIXME This only works for pass-by-value types (i.e. not VARCHARs etc.).
+ */
+ for (j = 1; j < nvalues; j++) {
+ if (values[j] != values[j-1])
+ bucket->ndistincts[dimension] += 1;
+ }
+
+ pfree(values);
+}
+
+/*
+ * Fetch list of MV stats defined on a table, without the actual data
+ * for histograms, MCV lists etc.
+ */
+MVStats
+list_mv_stats(Oid relid, int *nstats, bool built_only)
+{
+ Relation indrel;
+ SysScanDesc indscan;
+ ScanKeyData skey;
+ HeapTuple htup;
+ MVStats result;
+
+ /* start with 16 items, that should be enough for most cases */
+ int maxitems = 16;
+ result = (MVStats)palloc0(sizeof(MVStatsData) * maxitems);
+ *nstats = 0;
+
+ /* Prepare to scan pg_mv_statistic for entries having indrelid = this rel. */
+ ScanKeyInit(&skey,
+ Anum_pg_mv_statistic_starelid,
+ BTEqualStrategyNumber, F_OIDEQ,
+ ObjectIdGetDatum(relid));
+
+ indrel = heap_open(MvStatisticRelationId, AccessShareLock);
+ indscan = systable_beginscan(indrel, MvStatisticRelidIndexId, true,
+ NULL, 1, &skey);
+
+ while (HeapTupleIsValid(htup = systable_getnext(indscan)))
+ {
+ Form_pg_mv_statistic stats = (Form_pg_mv_statistic) GETSTRUCT(htup);
+
+ /*
+ * Skip statistics that were not computed yet (if only stats
+ * that were already built were requested)
+ */
+ if (built_only && (! (stats->hist_built || stats->mcv_built || stats->assoc_built)))
+ continue;
+
+ /* double the array size if needed */
+ if (*nstats == maxitems)
+ {
+ maxitems *= 2;
+ result = (MVStats)repalloc(result, sizeof(MVStatsData) * maxitems);
+ }
+
+ result[*nstats].mvoid = HeapTupleGetOid(htup);
+ result[*nstats].stakeys = buildint2vector(stats->stakeys.values, stats->stakeys.dim1);
+ result[*nstats].hist_built = stats->hist_built;
+ result[*nstats].mcv_built = stats->mcv_built;
+ result[*nstats].assoc_built = stats->assoc_built;
+ *nstats += 1;
+ }
+
+ systable_endscan(indscan);
+
+ heap_close(indrel, AccessShareLock);
+
+ /* TODO maybe save the list into relcache, as in RelationGetIndexList
+ * (which was used as an inspiration for this one). */
+
+ return result;
+}
+
+
+/*
+ * Serialize the MV histogram into a bytea value.
+ *
+ * The serialization first deduplicates the boundary values into a
+ * separate array and uses 2B indexes when serializing the buckets.
+ * This saves a significant amount of space, because each bucket split adds a single
+ * new boundary value, so e.g. with 4 attributes and 8191 splits (thus
+ * 8192 buckets), there are only ~8200 distinct boundary values.
+ *
+ * But as each bucket has 8 boundary values (4+4), that's ~64k Datums.
+ * That's roughly 65kB vs. 512kB, but we haven't included the indexes
+ * used to reference the boundary values. By using int16 indexes (which
+ * should be more than enough for all reasonable histogram sizes),
+ * this amounts to ~128kB (8192*8*2). So in total it's ~196kB vs. 512kB,
+ * i.e. more than 2x compression, which is nice.
+ *
+ * The implementation is simple - walk through the buckets, collect all
+ * the boundary values, keep only distinct values (in a sorted array)
+ * and then replace the values with indexes (using binary search).
+ *
+ * It's possible to either serialize/deserialize the histogram into
+ * a MVHistogram, or create a special structure working with this
+ * compressed structure (and keep MVBucket/MVHistogram only for the
+ * building phase). This might actually work better thanks to better
+ * CPU cache hit ratio, and simpler deserialization.
+ *
+ * This encoding will probably prevent automatic varlena compression,
+ * because the first part of the serialized bytea will be an array of
+ * unique (although sorted) values, and pglz decides whether to compress
+ * by trying to compress the first part (~1kB or so), which will compress
+ * poorly due to the lack of repetition.
+ *
+ * But in this case this is probably desirable - the data in general
+ * won't be really compressible (in addition to the 2x compression we
+ * got thanks to the encoding). In a sense the encoding scheme is
+ * actually a context-aware compression (usually compressing to ~30%).
+ * So this seems appropriate in this case.
+ *
+ * FIXME Make this work with arbitrary types.
+ *
+ * TODO Try to keep the compressed form, instead of deserializing it to
+ * MVHistogram/MVBucket.
+ *
+ * TODO We might get a bit better compression by considering the actual
+ * data type length. The current implementation treats all data as
+ * 8B values, but for INT it's actually 4B etc. OTOH this is only
+ * related to the lookup table, and most of the space is occupied
+ * by the buckets (with int16 indexes). And we don't have type info
+ * at the moment, so it would be difficult (but we'll need it to
+ * support all types, so maybe then).
+ */
+static bytea *
+serialize_mv_histogram(MVHistogram histogram)
+{
+ int i = 0, j = 0;
+
+ /* total size (histogram header + all buckets) */
+ Size total_len;
+ char *tmp = NULL;
+ bytea *result = NULL;
+
+ /* we need to accumulate all boundary values (min/max) */
+ int idx = 0;
+ int max_values = histogram->nbuckets * histogram->ndimensions * 2;
+ Datum * values = (Datum*)palloc0(max_values * sizeof(Datum));
+ Size len = sizeof(Datum);
+
+ /* we'll collect unique boundary values into this */
+ int ndistinct = 0;
+ Datum *lookup = NULL;
+ uint16 *indexes = (uint16*)palloc0(sizeof(uint16) * histogram->ndimensions);
+
+ /*
+ * Collect the boundary values first, sort them and generate a small
+ * array with only distinct values.
+ */
+ for (i = 0; i < histogram->nbuckets; i++)
+ {
+ for (j = 0; j < histogram->ndimensions; j++)
+ {
+ values[idx++] = histogram->buckets[i]->min[j];
+ values[idx++] = histogram->buckets[i]->max[j];
+ }
+ }
+
+ /*
+ * We've allocated just enough space for all boundary values, but
+ * this may change once we start handling NULL values (as we'll
+ * probably skip those).
+ *
+ * Also, we expect at least one boundary value at this moment.
+ */
+ Assert(max_values == idx);
+ Assert(idx > 1);
+
+ /*
+ * Sort the collected boundary values using a simple memcmp-based
+ * comparator (this won't work for pass-by-reference types), and
+ * then walk the data and count the distinct values.
+ */
+ qsort((void *) values, idx, len, compare_scalars_memcmp_2);
+
+ ndistinct = 1;
+ for (i = 1; i < max_values; i++)
+ ndistinct += (values[i-1] != values[i]) ? 1 : 0;
+
+ /*
+ * At this moment we can allocate the bytea value (and we'll collect
+ * the boundary values directly into it).
+ *
+ * The bytea will be structured like this:
+ *
+ * - varlena header : VARHDRSZ
+ * - histogram header : offsetof(MVHistogram,buckets)
+ * - number of boundary values : sizeof(uint32)
+ * - boundary values : ndistinct * sizeof(Datum)
+ * - buckets : nbuckets * BUCKET_SIZE_SERIALIZED
+ *
+ * We'll assume 2B indexes into the boundary values, because each
+ * bucket 'split' introduces one boundary value. Moreover, multiple
+ * splits may introduce the same value, so this should be enough for
+ * at least 65k buckets (and likely more). That's more than enough
+ * for reasonable histogram sizes.
+ */
+
+ Assert(ndistinct <= 65536);
+
+ total_len = VARHDRSZ + offsetof(MVHistogramData, buckets) +
+ (sizeof(uint32) + ndistinct * sizeof(Datum)) +
+ histogram->nbuckets * BUCKET_SIZE_SERIALIZED(histogram->ndimensions);
+
+ result = (bytea*)palloc0(total_len);
+ tmp = VARDATA(result);
+
+ SET_VARSIZE(result, total_len);
+
+ /* copy the global histogram header */
+ memcpy(tmp, histogram, offsetof(MVHistogramData, buckets));
+ tmp += offsetof(MVHistogramData, buckets);
+
+ /*
+ * Copy the number of distinct values, and then all the distinct
+ * values currently stored in the 'values' array (sorted).
+ */
+ memcpy(tmp, &ndistinct, sizeof(uint32));
+ tmp += sizeof(uint32);
+
+ lookup = (Datum*)tmp;
+
+ for (i = 0; i < max_values; i++)
+ {
+ /* skip values that are equal to the previous one */
+ if ((i > 0) && (values[i-1] == values[i]))
+ continue;
+
+ memcpy(tmp, &values[i], sizeof(Datum));
+ tmp += sizeof(Datum);
+ }
+
+ Assert(tmp - (char*)lookup == ndistinct * sizeof(Datum));
+
+ /* now serialize all the buckets - first the header, without the
+ * variable-length part, then all the variable length parts */
+ for (i = 0; i < histogram->nbuckets; i++)
+ {
+ MVBucket bucket = histogram->buckets[i];
+
+ /* write the common bucket header */
+ memcpy(tmp, bucket, offsetof(MVBucketData, ndistincts));
+ tmp += offsetof(MVBucketData, ndistincts);
+
+ /* per-dimension ndistincts / nullsonly */
+ memcpy(tmp, bucket->ndistincts, sizeof(uint32)*histogram->ndimensions);
+ tmp += sizeof(uint32)*histogram->ndimensions;
+
+ memcpy(tmp, bucket->nullsonly, sizeof(bool)*histogram->ndimensions);
+ tmp += sizeof(bool)*histogram->ndimensions;
+
+ memcpy(tmp, bucket->min_inclusive, sizeof(bool)*histogram->ndimensions);
+ tmp += sizeof(bool)*histogram->ndimensions;
+
+ memcpy(tmp, bucket->max_inclusive, sizeof(bool)*histogram->ndimensions);
+ tmp += sizeof(bool)*histogram->ndimensions;
+
+ /* and now translate the min (and then max) boundaries to indexes */
+ for (j = 0; j < histogram->ndimensions; j++)
+ {
+ Datum *v = (Datum*)bsearch(&bucket->min[j], lookup, ndistinct,
+ sizeof(Datum), compare_scalars_memcmp_2);
+
+ Assert(v != NULL);
+ indexes[j] = (v - lookup); /* Datum arithmetic (not char) */
+ Assert(indexes[j] < ndistinct); /* we have to be within the array */
+ }
+
+ memcpy(tmp, indexes, sizeof(uint16)*histogram->ndimensions);
+ tmp += sizeof(uint16)*histogram->ndimensions;
+
+ for (j = 0; j < histogram->ndimensions; j++)
+ {
+ Datum *v = (Datum*)bsearch(&bucket->max[j], lookup, ndistinct,
+ sizeof(Datum), compare_scalars_memcmp_2);
+ Assert(v != NULL);
+ indexes[j] = (v - lookup); /* Datum arithmetic (not char) */
+ Assert(indexes[j] < ndistinct); /* we have to be within the array */
+ }
+
+ memcpy(tmp, indexes, sizeof(uint16)*histogram->ndimensions);
+ tmp += sizeof(uint16)*histogram->ndimensions;
+ }
+
+ pfree(indexes);
+
+ return result;
+}
+
+/*
+ * Reverse of serialize_mv_histogram. This essentially expands the serialized
+ * form back to MVHistogram / MVBucket.
+ */
+MVHistogram
+deserialize_mv_histogram(bytea * data)
+{
+ int i = 0, j = 0;
+
+ Size expected_length;
+ char *tmp = NULL;
+ MVHistogram histogram;
+
+ uint32 nlookup; /* Datum lookup table */
+ Datum *lookup = NULL;
+
+ if (data == NULL)
+ return NULL;
+
+ /* get pointer to the data part of the varlena */
+ tmp = VARDATA(data);
+
+ histogram = (MVHistogram)palloc0(sizeof(MVHistogramData));
+
+ /* copy the histogram header in place */
+ memcpy(histogram, tmp, offsetof(MVHistogramData, buckets));
+ tmp += offsetof(MVHistogramData, buckets);
+
+ if (histogram->magic != MVHIST_MAGIC)
+ {
+ pfree(histogram);
+ elog(WARNING, "not a MV Histogram (magic number mismatch)");
+ return NULL;
+ }
+
+ Assert(histogram->type == MVHIST_TYPE_BASIC);
+ Assert(histogram->nbuckets > 0);
+ Assert(histogram->nbuckets <= MVHIST_MAX_BUCKETS);
+ Assert(histogram->ndimensions > 0);
+ Assert(histogram->ndimensions <= MVSTATS_MAX_DIMENSIONS);
+
+ /* now, get the size of the lookup table */
+ memcpy(&nlookup, tmp, sizeof(uint32));
+ tmp += sizeof(uint32);
+ lookup = (Datum*)tmp;
+
+ /* skip to the first bucket */
+ tmp += sizeof(Datum) * nlookup;
+
+ /* check the total serialized length */
+ expected_length = offsetof(MVHistogramData, buckets) +
+ sizeof(uint32) + nlookup * sizeof(Datum) +
+ histogram->nbuckets * BUCKET_SIZE_SERIALIZED(histogram->ndimensions);
+
+ /* check serialized length */
+ if (VARSIZE_ANY_EXHDR(data) != expected_length)
+ {
+ elog(ERROR, "invalid MV histogram serialized size (expected %ld, got %ld)",
+ VARSIZE_ANY_EXHDR(data), expected_length);
+ return NULL;
+ }
+
+ /* allocate bucket pointers */
+ histogram->buckets = (MVBucket*)palloc0(histogram->nbuckets * sizeof(MVBucket));
+
+ /* deserialize the buckets, one by one */
+ for (i = 0; i < histogram->nbuckets; i++)
+ {
+ /* don't allocate space for the build-only fields */
+ MVBucket bucket = (MVBucket)palloc0(offsetof(MVBucketData, rows));
+ uint16 *indexes = NULL;
+
+ /* write the common bucket header */
+ memcpy(bucket, tmp, offsetof(MVBucketData, ndistincts));
+ tmp += offsetof(MVBucketData, ndistincts);
+
+ /* per-dimension ndistincts / nullsonly */
+ bucket->ndistincts = (uint32*)palloc0(sizeof(uint32)*histogram->ndimensions);
+ memcpy(bucket->ndistincts, tmp, sizeof(uint32)*histogram->ndimensions);
+ tmp += sizeof(uint32)*histogram->ndimensions;
+
+ bucket->nullsonly = (bool*)palloc0(sizeof(bool)*histogram->ndimensions);
+ memcpy(bucket->nullsonly, tmp, sizeof(bool)*histogram->ndimensions);
+ tmp += sizeof(bool)*histogram->ndimensions;
+
+ bucket->min_inclusive = (bool*)palloc0(sizeof(bool)*histogram->ndimensions);
+ memcpy(bucket->min_inclusive, tmp, sizeof(bool)*histogram->ndimensions);
+ tmp += sizeof(bool)*histogram->ndimensions;
+
+ bucket->max_inclusive = (bool*)palloc0(sizeof(bool)*histogram->ndimensions);
+ memcpy(bucket->max_inclusive, tmp, sizeof(bool)*histogram->ndimensions);
+ tmp += sizeof(bool)*histogram->ndimensions;
+
+ /* translate the indexes back to Datum values */
+ bucket->min = (Datum*)palloc0(sizeof(Datum)*histogram->ndimensions);
+ bucket->max = (Datum*)palloc0(sizeof(Datum)*histogram->ndimensions);
+
+ indexes = (uint16*)tmp;
+ tmp += sizeof(uint16) * histogram->ndimensions;
+ for (j = 0; j < histogram->ndimensions; j++)
+ memcpy(&bucket->min[j], &lookup[indexes[j]], sizeof(Datum));
+
+ indexes = (uint16*)tmp;
+ tmp += sizeof(uint16) * histogram->ndimensions;
+ for (j = 0; j < histogram->ndimensions; j++)
+ memcpy(&bucket->max[j], &lookup[indexes[j]], sizeof(Datum));
+
+ histogram->buckets[i] = bucket;
+ }
+
+ return histogram;
+}
+
+/*
+ * Serialize MCV list into a bytea value.
+ *
+ * This does not use any kind of deduplication (compared to histogram
+ * serialization), as we don't expect the same efficiency here.
+ *
+ * This simply writes a MCV header (number of items, ...) and then Datum
+ * values for all attributes of an item, followed by the item frequency
+ * (as a double).
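+ *
+ * So the resulting bytea is laid out as:
+ *
+ * - varlena header : VARHDRSZ
+ * - MCV list header : offsetof(MCVListData, items)
+ * - items : nitems * (ndimensions * sizeof(Datum) + sizeof(double))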
+ */
+static bytea *
+serialize_mv_mcvlist(MCVList mcvlist)
+{
+ int i;
+
+ /* we need to store nitems, and each item needs ndimensions * sizeof(Datum), plus a double */
+ Size len = VARHDRSZ + offsetof(MCVListData, items) + mcvlist->nitems * (sizeof(Datum) * mcvlist->ndimensions + sizeof(double));
+
+ bytea * output = (bytea*)palloc0(len);
+
+ char * tmp = VARDATA(output);
+
+ SET_VARSIZE(output, len);
+
+ /* first, store the number of dimensions / items */
+ memcpy(tmp, mcvlist, offsetof(MCVListData, items));
+ tmp += offsetof(MCVListData, items);
+
+ /* now, walk through the items and store values + frequency for each MCV item */
+ for (i = 0; i < mcvlist->nitems; i++)
+ {
+ memcpy(tmp, mcvlist->items[i]->values, mcvlist->ndimensions * sizeof(Datum));
+ tmp += mcvlist->ndimensions * sizeof(Datum);
+
+ memcpy(tmp, &mcvlist->items[i]->frequency, sizeof(double));
+ tmp += sizeof(double);
+ }
+
+ return output;
+
+}
+
+MCVList
+deserialize_mv_mcvlist(bytea * data)
+{
+ int i;
+ Size expected_size;
+ MCVList mcvlist;
+ char *tmp;
+
+ if (data == NULL)
+ return NULL;
+
+ if (VARSIZE_ANY_EXHDR(data) < offsetof(MCVListData,items))
+ elog(ERROR, "invalid MCV Size %ld (expected at least %ld)",
+ VARSIZE_ANY_EXHDR(data), offsetof(MCVListData,items));
+
+ /* read the MCV list header */
+ mcvlist = (MCVList)palloc0(sizeof(MCVListData));
+
+ /* initialize pointer to the data part (skip the varlena header) */
+ tmp = VARDATA(data);
+
+ /* get the header and perform basic sanity checks */
+ memcpy(mcvlist, tmp, offsetof(MCVListData,items));
+ tmp += offsetof(MCVListData,items);
+
+ if (mcvlist->magic != MVSTAT_MCV_MAGIC)
+ elog(ERROR, "invalid MCV magic %d (expected %dd)",
+ mcvlist->magic, MVSTAT_MCV_MAGIC);
+
+ if (mcvlist->type != MVSTAT_MCV_TYPE_BASIC)
+ elog(ERROR, "invalid MCV type %d (expected %dd)",
+ mcvlist->type, MVSTAT_MCV_TYPE_BASIC);
+
+ Assert(mcvlist->nitems > 0);
+ Assert((mcvlist->ndimensions >= 2) && (mcvlist->ndimensions <= MVSTATS_MAX_DIMENSIONS));
+
+ /* what bytea size do we expect for those parameters */
+ expected_size = offsetof(MCVListData,items) +
+ mcvlist->nitems * (sizeof(Datum) * mcvlist->ndimensions + sizeof(double));
+
+ if (VARSIZE_ANY_EXHDR(data) != expected_size)
+ elog(ERROR, "invalid MCV Size %ld (expected %ld)",
+ VARSIZE_ANY_EXHDR(data), expected_size);
+
+ /* allocate space for the MCV items */
+ mcvlist->items = (MCVItem*)palloc0(sizeof(MCVItem) * mcvlist->nitems);
+
+ for (i = 0; i < mcvlist->nitems; i++)
+ {
+ MCVItem item = (MCVItem)palloc0(offsetof(MCVItemData, values) +
+ mcvlist->ndimensions * sizeof(Datum));
+
+ memcpy(item->values, tmp, mcvlist->ndimensions * sizeof(Datum));
+ tmp += mcvlist->ndimensions * sizeof(Datum);
+
+ memcpy(&item->frequency, tmp, sizeof(double));
+ tmp += sizeof(double);
+
+ mcvlist->items[i] = item;
+ }
+
+ return mcvlist;
+}
+
+static void
+update_mv_stats(Oid mvoid, MVHistogram histogram, MCVList mcvlist)
+{
+ HeapTuple stup,
+ oldtup;
+ Datum values[Natts_pg_mv_statistic];
+ bool nulls[Natts_pg_mv_statistic];
+ bool replaces[Natts_pg_mv_statistic];
+
+ Relation sd = heap_open(MvStatisticRelationId, RowExclusiveLock);
+
+ memset(nulls, 1, Natts_pg_mv_statistic * sizeof(bool));
+ memset(replaces, 0, Natts_pg_mv_statistic * sizeof(bool));
+ memset(values, 0, Natts_pg_mv_statistic * sizeof(Datum));
+
+ /*
+ * Construct a new pg_mv_statistic tuple - replace only the histogram
+ * and MCV list, depending on whether each was actually computed.
+ */
+ if (histogram != NULL)
+ {
+ nulls[Anum_pg_mv_statistic_stahist-1] = false;
+ values[Anum_pg_mv_statistic_stahist - 1]
+ = PointerGetDatum(serialize_mv_histogram(histogram));
+ }
+
+ if (mcvlist != NULL)
+ {
+ nulls[Anum_pg_mv_statistic_stamcv -1] = false;
+ values[Anum_pg_mv_statistic_stamcv - 1]
+ = PointerGetDatum(serialize_mv_mcvlist(mcvlist));
+ }
+
+ /* always replace the value (either by bytea or NULL) */
+ replaces[Anum_pg_mv_statistic_stahist-1] = true;
+ replaces[Anum_pg_mv_statistic_stamcv -1] = true;
+
+ /* always change the availability flags */
+ nulls[Anum_pg_mv_statistic_hist_built-1] = false;
+ nulls[Anum_pg_mv_statistic_mcv_built -1] = false;
+
+ replaces[Anum_pg_mv_statistic_hist_built -1] = true;
+ replaces[Anum_pg_mv_statistic_mcv_built -1] = true;
+
+ values[Anum_pg_mv_statistic_hist_built -1] = BoolGetDatum(histogram != NULL);
+ values[Anum_pg_mv_statistic_mcv_built -1] = BoolGetDatum(mcvlist != NULL);
+
+ /* Is there already a pg_mv_statistic tuple for this attribute? */
+ oldtup = SearchSysCache1(MVSTATOID,
+ ObjectIdGetDatum(mvoid));
+
+ if (HeapTupleIsValid(oldtup))
+ {
+ /* Yes, replace it */
+ stup = heap_modify_tuple(oldtup,
+ RelationGetDescr(sd),
+ values,
+ nulls,
+ replaces);
+ ReleaseSysCache(oldtup);
+ simple_heap_update(sd, &stup->t_self, stup);
+ }
+ else
+ elog(ERROR, "invalid pg_mv_statistic record (oid=%d)", mvoid);
+
+ /* update indexes too */
+ CatalogUpdateIndexes(sd, stup);
+
+ heap_freetuple(stup);
+
+ heap_close(sd, RowExclusiveLock);
+}
+
+
+/* MV stats */
+
+Datum
+pg_mv_stats_histogram_info(PG_FUNCTION_ARGS)
+{
+ bytea *data = PG_GETARG_BYTEA_P(0);
+ char *result;
+
+ MVHistogram hist = deserialize_mv_histogram(data);
+
+ result = palloc0(128);
+ snprintf(result, 128, "nbuckets=%d", hist->nbuckets);
+
+ PG_RETURN_TEXT_P(cstring_to_text(result));
+}
+
+Datum
+pg_mv_stats_mvclist_info(PG_FUNCTION_ARGS)
+{
+ bytea *data = PG_GETARG_BYTEA_P(0);
+ char *result;
+
+ MCVList mcvlist = deserialize_mv_mcvlist(data);
+
+ result = palloc0(128);
+ snprintf(result, 128, "nitems=%d", mcvlist->nitems);
+
+ pfree(mcvlist);
+
+ PG_RETURN_TEXT_P(cstring_to_text(result));
+}
+
+Datum
+pg_mv_stats_histogram_gnuplot(PG_FUNCTION_ARGS)
+{
+ int i = 0;
+
+ /* FIXME handle the length properly, e.g. using StringInfo */
+ Size len = 1024*1024;
+ char *buffer = palloc0(len);
+ char *str = buffer;
+ bytea *data = PG_GETARG_BYTEA_P(0);
+
+ MVHistogram hist = deserialize_mv_histogram(data);
+
+ for (i = 0; i < hist->nbuckets; i++)
+ {
+ str += snprintf(str, len - (str - buffer),
+ "set object %d rect from %ld,%ld to %ld,%ld lw 1\n",
+ (i+1),
+ hist->buckets[i]->min[0], hist->buckets[i]->min[1],
+ hist->buckets[i]->max[0], hist->buckets[i]->max[1]);
+ }
+
+ PG_RETURN_TEXT_P(cstring_to_text(buffer));
+
+}
+
+bytea *
+fetch_mv_histogram(Oid mvoid)
+{
+ Relation indrel;
+ SysScanDesc indscan;
+ ScanKeyData skey;
+ HeapTuple htup;
+ bytea *stahist = NULL;
+
+ /* Prepare to scan pg_mv_statistic for entries having indrelid = this rel. */
+ ScanKeyInit(&skey,
+ ObjectIdAttributeNumber,
+ BTEqualStrategyNumber, F_OIDEQ,
+ ObjectIdGetDatum(mvoid));
+
+ indrel = heap_open(MvStatisticRelationId, AccessShareLock);
+ indscan = systable_beginscan(indrel, MvStatisticOidIndexId, true,
+ NULL, 1, &skey);
+
+ while (HeapTupleIsValid(htup = systable_getnext(indscan)))
+ {
+ bool isnull = false;
+ Datum hist = SysCacheGetAttr(MVSTATOID, htup,
+ Anum_pg_mv_statistic_stahist, &isnull);
+
+ Assert(!isnull);
+
+ stahist = DatumGetByteaP(hist);
+
+ break;
+ }
+
+ systable_endscan(indscan);
+
+ heap_close(indrel, AccessShareLock);
+
+ /* TODO maybe save the list into relcache, as in RelationGetIndexList
+ * (which was used as an inspiration for this one). */
+
+ return stahist;
+}
+
+bytea *
+fetch_mv_mcvlist(Oid mvoid)
+{
+ Relation indrel;
+ SysScanDesc indscan;
+ ScanKeyData skey;
+ HeapTuple htup;
+ bytea *mcvlist = NULL;
+
+ /* Prepare to scan pg_mv_statistic for entries having indrelid = this rel. */
+ ScanKeyInit(&skey,
+ ObjectIdAttributeNumber,
+ BTEqualStrategyNumber, F_OIDEQ,
+ ObjectIdGetDatum(mvoid));
+
+ indrel = heap_open(MvStatisticRelationId, AccessShareLock);
+ indscan = systable_beginscan(indrel, MvStatisticOidIndexId, true,
+ NULL, 1, &skey);
+
+ while (HeapTupleIsValid(htup = systable_getnext(indscan)))
+ {
+ bool isnull = false;
+ Datum tmp = SysCacheGetAttr(MVSTATOID, htup,
+ Anum_pg_mv_statistic_stamcv, &isnull);
+
+ Assert(!isnull);
+
+ mcvlist = DatumGetByteaP(tmp);
+
+ break;
+ }
+
+ systable_endscan(indscan);
+
+ heap_close(indrel, AccessShareLock);
+
+ /* TODO maybe save the list into relcache, as in RelationGetIndexList
+ * (which was used as an inspiration for this one). */
+
+ return mcvlist;
+}
+
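+/*
+ * Return the index of the dimension the attribute maps to within the
+ * statistics, i.e. the position of varattno in the (sorted) stakeys
+ * vector.
+ */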
+int
+mv_get_index(AttrNumber varattno, int2vector * stakeys)
+{
+ int i, idx = 0;
+ for (i = 0; i < stakeys->dim1; i++)
+ {
+ if (stakeys->values[i] < varattno)
+ idx += 1;
+ else
+ break;
+ }
+ return idx;
+}
+
+/*
+ * Build a multivariate histogram. In short, this first creates a single
+ * bucket containing all the rows, and then repeatedly splits it,
+ * searching each time for the bucket / dimension most in need of a split.
+ *
+ * The current criterion is rather simple, looking at the number of
+ * distinct values (combinations of column values for a bucket, column
+ * values for a dimension). This is somewhat naive, but seems to work
+ * quite well. See the discussion at select_bucket_to_partition and
+ * partition_bucket for more details about alternative algorithms.
+ *
+ * So the current algorithm looks like this:
+ *
+ * while [not reaching maximum number of buckets]
+ *
+ * choose bucket to partition (max distinct combinations)
+ * if no bucket to partition
+ * terminate the algorithm
+ *
+ * choose bucket dimension to partition (max distinct values)
+ * split the bucket into two buckets
+ *
+ */
+static MVHistogram
+build_mv_histogram(int numrows, HeapTuple *rows, int2vector *attrs,
+ int attr_cnt, VacAttrStats **vacattrstats,
+ int numrows_total)
+{
+ int i;
+ int ndistinct;
+ int numattrs = attrs->dim1;
+ int *ndistincts = (int*)palloc0(sizeof(int) * numattrs);
+
+ MVHistogram histogram = (MVHistogram)palloc0(sizeof(MVHistogramData));
+
+ HeapTuple * rows_copy = (HeapTuple*)palloc0(numrows * sizeof(HeapTuple));
+ memcpy(rows_copy, rows, sizeof(HeapTuple) * numrows);
+
+ Assert((numattrs >= 2) && (numattrs <= MVSTATS_MAX_DIMENSIONS));
+
+ histogram->ndimensions = numattrs;
+
+ histogram->magic = MVHIST_MAGIC;
+ histogram->type = MVHIST_TYPE_BASIC;
+ histogram->nbuckets = 1;
+
+ /* create max buckets (better than repalloc for short-lived objects) */
+ histogram->buckets = (MVBucket*)palloc0(MVHIST_MAX_BUCKETS * sizeof(MVBucket));
+
+ /* create the initial bucket, covering the whole sample set */
+ histogram->buckets[0] = create_initial_mv_bucket(numrows, rows_copy, attrs,
+ attr_cnt, vacattrstats);
+
+ ndistinct = histogram->buckets[0]->ndistinct;
+
+ /* keep the global ndistinct values */
+ for (i = 0; i < numattrs; i++)
+ ndistincts[i] = histogram->buckets[0]->ndistincts[i];
+
+ while (histogram->nbuckets < MVHIST_MAX_BUCKETS)
+ {
+ MVBucket bucket = select_bucket_to_partition(histogram->nbuckets, histogram->buckets);
+
+ /* no more buckets to partition */
+ if (bucket == NULL)
+ break;
+
+ histogram->buckets[histogram->nbuckets] = partition_bucket(bucket, attrs,
+ attr_cnt, vacattrstats);
+
+ histogram->nbuckets += 1;
+ }
+
+ /*
+ * FIXME store the histogram in a catalog in a serialized form (simple for
+ * pass-by-value, more complicated for buckets on varlena types)
+ */
+ for (i = 0; i < histogram->nbuckets; i++)
+ {
+ int d;
+ histogram->buckets[i]->ntuples = (histogram->buckets[i]->numrows * 1.0) / numrows_total;
+ histogram->buckets[i]->ndistinct = (histogram->buckets[i]->ndistinct * 1.0) / ndistinct;
+
+ for (d = 0; d < numattrs; d++)
+ histogram->buckets[i]->ndistincts[d] = (histogram->buckets[i]->ndistincts[d] * 1.0) / ndistincts[d];
+ }
+
+ pfree(ndistincts);
+
+ return histogram;
+
+}
+
+/*
+ * Mine associations between the columns, in the form (A => B).
+ *
+ * At the moment this only works for associations between two columns,
+ * but it might be useful to mine for rules involving multiple columns
+ * on the left side. That is, rules [A,B] => C and so on. Handling
+ * multiple columns on the right side is not necessary, because such
+ * rules may be decomposed into a set of rules, one for each column.
+ * I.e. A => [B,C] is exactly the same as (A => B) & (A => C).
+ *
+ * Those rules don't immediately identify redundant clauses, because the
+ * user may choose "incompatible conditions" (e.g. by using a zip code
+ * and a mismatching city) and so on. This should however be easy to
+ * identify from a histogram, because the conditions will match a bucket
+ * with low frequencies.
+ *
+ * The question is whether this can be useful when we have a histogram,
+ * because such incompatible conditions should result in not matching
+ * any buckets (or matching only buckets with low frequencies).
+ *
+ * The problem is that histograms only work like this when the sort
+ * order is compatible with the meaning of the data. We're often using
+ * data types that support sorting (e.g. INT, BIGINT) as labels, where
+ * the sorting really does not make much sense. Sorting by ZIP code will
+ * result in sorting the cities quite randomly, and similarly for most
+ * surrogate primary / foreign keys. In such cases the histograms are
+ * pretty useless.
+ *
+ * So, a good approach might be testing the independence of the data
+ * (by building a contingency table) and building the MV histogram only
+ * when there's a dependency. For 'label' data this should notice that
+ * the histogram is useless, so we won't build it (and we may use that
+ * as a sign supporting the association rule).
+ *
+ * Another option is to look at selectivity of A and B separately, and
+ * then use the minimum of those.
+ *
+ * TODO investigate using histogram and MCV list to confirm the
+ * associative rule
+ *
+ * TODO investigate statistical testing of the distribution (to decide
+ * whether it makes sense to build the histogram)
+ *
+ * TODO Using a min/max of selectivities would probably make more sense
+ * for the associated columns.
+ */
+static void
+build_mv_associations(int numrows, HeapTuple *rows, int2vector *attrs,
+ int natts, VacAttrStats **vacattrstats)
+{
+ int i;
+ bool isNull;
+ Size len = 2 * sizeof(Datum); /* only simple associations a => b */
+ int numattrs = attrs->dim1;
+
+ /* TODO Maybe this should be somehow related to the number of
+ * distinct values in the two columns we're currently analyzing.
+ * Assuming a uniform distribution, we can compute the average
+ * group size we'd expect to observe in the sample, and use that
+ * as a threshold. That seems better than a static approach.
+ */
+ int min_group_size = 10;
+
+ /* dimension indexes we'll check for associations [a => b] */
+ int dima, dimb;
+
+ /* info for the interesting attributes only
+ *
+ * TODO Compute this only once and pass it to all the methods
+ * that need it.
+ */
+ VacAttrStats **stats = lookup_var_attr_stats(attrs, natts, vacattrstats);
+
+ /* We'll reuse the same array for all the combinations */
+ Datum * values = (Datum*)palloc0(numrows * 2 * sizeof(Datum));
+
+ Assert(numattrs >= 2);
+
+ for (dima = 0; dima < numattrs; dima++)
+ {
+
+ for (dimb = 0; dimb < numattrs; dimb++)
+ {
+
+ int supporting = 0;
+ int contradicting = 0;
+
+ Datum val_a, val_b;
+ int violations = 0;
+ int group_size = 0;
+
+ int supporting_rows = 0;
+
+ /* skip (dima==dimb) */
+ if (dima == dimb)
+ continue;
+
+ /*
+ * FIXME Not sure if this handles NULL values properly (not sure
+ * how to do that). We assume that NULL means 0 for now,
+ * handling it just like any other value.
+ */
+ for (i = 0; i < numrows; i++)
+ {
+ values[i*2] = heap_getattr(rows[i], attrs->values[dima], stats[dima]->tupDesc, &isNull);
+ values[i*2+1] = heap_getattr(rows[i], attrs->values[dimb], stats[dimb]->tupDesc, &isNull);
+ }
+
+ qsort_arg((void *) values, numrows, sizeof(Datum) * 2, compare_scalars_memcmp, &len);
+
+ /*
+ * Walk through the array, split it into groups according to
+ * the A value, and count distinct B values in each group.
+ * If there's a single B value for the whole group, we count
+ * it as supporting the association, otherwise we count it
+ * as contradicting.
+ *
+ * Furthermore we require a group to have at least a certain
+ * number of rows to be counted as supporting. Contradicting
+ * groups are counted regardless of their size.
+ */
+
+ /* start with values from the first row */
+ val_a = values[0];
+ val_b = values[1];
+ group_size = 1;
+
+ for (i = 1; i < numrows; i++)
+ {
+ if (values[2*i] != val_a) /* end of the group */
+ {
+ /*
+ * If there are no contradicting rows, count it as
+ * supporting (otherwise contradicting), but only if
+ * the group is large enough.
+ *
+ * The requirement of a minimum group size makes it
+ * impossible to identify [unique,unique] cases, but
+ * that's probably a different case. This is more
+ * about [zip => city] associations etc.
+ */
+ supporting += ((violations == 0) && (group_size >= min_group_size)) ? 1 : 0;
+ contradicting += (violations != 0) ? 1 : 0;
+
+ supporting_rows += ((violations == 0) && (group_size >= min_group_size)) ? group_size : 0;
+
+ /* current values start a new group */
+ val_a = values[2*i];
+ val_b = values[2*i+1];
+ violations = 0;
+ group_size = 1;
+ }
+ else
+ {
+ if (values[2*i+1] != val_b) /* mismatch of a B value */
+ {
+ val_b = values[2*i+1];
+ violations += 1;
+ }
+
+ group_size += 1;
+ }
+ }
+
+ /* close the last group (the loop above only closes groups on a value change) */
+ supporting += ((violations == 0) && (group_size >= min_group_size)) ? 1 : 0;
+ contradicting += (violations != 0) ? 1 : 0;
+ supporting_rows += ((violations == 0) && (group_size >= min_group_size)) ? group_size : 0;
+
+ /*
+ * See if the number of rows supporting the association is at least
+ * 10x the number of remaining rows (those in contradicting or
+ * too-small groups).
+ *
+ * TODO This is a rather arbitrary limit - I guess it's possible to do
+ * some math to come up with a better rule (e.g. testing a hypothesis
+ * 'this is due to randomness'). We can create a contingency table
+ * from the values and use it for testing. Possibly only when
+ * there are no contradicting rows?
+ *
+ * TODO Also, if (a => b) and (b => a) at the same time, it pretty much
+ * means the columns have the same values (or one is a 'label'),
+ * making the conditions rather redundant. Although it's possible
+ * that the query uses an incompatible combination of values.
+ */
+ if (supporting_rows > (numrows - supporting_rows) * 10)
+ {
+ /*
+ * TODO Store the discovered association (dima => dimb) in the
+ * catalog - for now we only detect it. Debugging output:
+ *
+ * elog(WARNING, "%d => %d : supporting=%d contradicting=%d",
+ * dima, dimb, supporting, contradicting);
+ */
+ }
+
+ }
+ }
+
+ pfree(values);
+
+}
+
+/*
+ * Compute the list of most common items, where an item is a combination
+ * of values for all the columns. For a small number of distinct values,
+ * we may be able to represent the distribution almost exactly, with
+ * per-item statistics.
+ *
+ * If we can represent the distribution using a MCV list only, it's great
+ * because that allows much better estimates (especially for equality).
+ * Such discrete distributions are also easier to combine (more
+ * efficient and more accurate) than when using histograms.
+ *
+ * FIXME This does not handle NULL values at the moment.
+ *
+ * TODO When computing equality selectivity (a=1 AND b=2), we can do that
+ * pretty exactly assuming (a) we hit a MCV item and (b) the
+ * histogram is built on those two columns only (i.e. there are no
+ * other columns). In that case we can estimate the selectivity
+ * using only the MCV.
+ *
+ * When we don't hit a MCV item, we can use the frequency of the
+ * least probable MCV item as upper bound of the selectivity
+ * (otherwise it'd get into the MCV list). Again, this only works
+ * when the histogram size matches the restricted columns.
+ *
+ * When the histogram is larger (i.e. there are additional columns),
+ * we can't be sure how the selectivity is distributed among the MCV
+ * list and the histogram (we may get several MCV items matching
+ * the conditions and several histogram buckets at the same time).
+ *
+ * In this case we can probably clamp the selectivity by minimum of
+ * selectivities for each condition. For example if we know the
+ * number of distinct values for each column, we can use 1/ndistinct
+ * as a per-column estimate. Or rather 1/ndistinct + selectivity
+ * derived from the MCV list.
+ *
+ * If there's no histogram (thus the distribution is approximated
+ * only by the MCV list), the size of the stats (whether there are
+ * some other columns, not referenced in the conditions) does not
+ * matter. We can do pretty accurate estimation using the MCV.
+ *
+ * TODO Currently there's no logic to consider building only a MCV list
+ * (and not building the histogram at all).
+ *
+ * TODO For types that don't reasonably support ordering (either because
+ * the type does not support that or when the user adds some option
+ * to the ADD STATISTICS command - e.g. UNSORTED_STATS), building
+ * the histogram may be pointless and inefficient. This is esp.
+ * true for varlena types that may be quite large and a large MCV
+ * list may be a better choice, because it makes equality estimates
+ * more accurate. Due to the unsorted nature, range queries on those
+ * attributes are rather useless anyway.
+ *
+ * Another thing is that by restricting to MCV list and equality
+ * conditions, we can use hash values instead of long varlena values.
+ * The equality estimation will be very accurate.
+ *
+ * This however complicates matching the columns to available
+ * statistics, as it will require matching clauses (not columns) to
+ * stats. And it may get quite complex - e.g. what if there are
+ * multiple clauses, each compatible with different stats subset?
+ *
+ * FIXME Create a special-purpose type for MCV items (instead of a plain
+ * Datum array, which is very difficult to work with).
+ */
+static MCVList
+build_mv_mcvlist(int numrows, HeapTuple *rows, int2vector *attrs,
+ int natts, VacAttrStats **vacattrstats,
+ int *numrows_filtered)
+{
+ int i, j, idx = 0;
+ int numattrs = attrs->dim1;
+ Size len = sizeof(Datum) * numattrs;
+ bool isNull;
+ int ndistinct = 0;
+ int mcv_threshold = 0;
+ int count = 0;
+ int nitems = 0;
+
+ MCVList mcvlist = NULL;
+
+ VacAttrStats **stats = lookup_var_attr_stats(attrs, natts, vacattrstats);
+
+ /*
+ * We could collect this while walking through the rows in the
+ * caller - the current approach means heap_getattr gets called
+ * twice for each value (once here, once when building the other
+ * statistics).
+ *
+ * TODO We're using Datum (8B), even for data types smaller than this
+ * (notably int4 and float4). Maybe we could save some space here,
+ * although it seems the bytea compression will handle it just fine.
+ */
+ Datum * values = palloc0(numrows * numattrs * sizeof(Datum));
+
+ for (j = 0; j < numrows; j++)
+ for (i = 0; i < numattrs; i++)
+ values[idx++] = heap_getattr(rows[j], attrs->values[i], stats[i]->tupDesc, &isNull);
+
+ qsort_arg((void *) values, numrows, sizeof(Datum) * numattrs, compare_scalars_memcmp, &len);
+
+ /*
+ * Count the number of distinct values - we need this to determine
+ * the threshold (125% of the average frequency).
+ */
+ ndistinct = 1;
+ for (i = 1; i < numrows; i++)
+ if (memcmp(&values[i * numattrs], &values[(i-1) * numattrs], len) != 0)
+ ndistinct += 1;
+
+ /*
+ * Determine how many groups actually exceed the threshold, and then
+ * walk the array again and collect them into an array.
+ *
+ * TODO for now the threshold is the same as in the single-column
+ * case (average + 25%), but maybe that's worth revisiting
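+ *
+ * For example, with numrows = 30000 and ndistinct = 1000, the
+ * average group has 30 rows and the threshold works out to
+ * 1.25 * 30 = 37 rows (hypothetical numbers, for illustration).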
+ *
+ * TODO see if we can fit all the distinct values in the MCV list
+ */
+ mcv_threshold = 1.25 * numrows / ndistinct;
+ mcv_threshold = (mcv_threshold < 4) ? 4 : mcv_threshold;
+
+ /*
+ * If the number of distinct items is small enough, store all items
+ * that appear at least twice in the sample.
+ *
+ * FIXME We can do this only if we believe we got all the distinct
+ * values of the table.
+ */
+ if (ndistinct <= MVSTAT_MCVLIST_MAX_ITEMS)
+ mcv_threshold = 2;
+
+ count = 1;
+ for (i = 1; i <= numrows; i++)
+ {
+ /* last row or a new group */
+ if ((i == numrows) || (memcmp(&values[i * numattrs], &values[(i-1) * numattrs], len) != 0))
+ {
+ /* count the MCV item if exceeding the threshold */
+ if (count >= mcv_threshold)
+ nitems += 1;
+
+ count = 1;
+ }
+ else /* same group, just increase the row count */
+ count += 1;
+ }
+
+ /* by default we keep all the rows (even if there's no MCV list) */
+ *numrows_filtered = numrows;
+
+ /* we know the number of mcvitems, now collect them in a 2nd pass */
+ if (nitems > 0)
+ {
+ /* allocate the MCV list structure and fill the basic parameters */
+ mcvlist = (MCVList)palloc0(sizeof(MCVListData));
+
+ mcvlist->magic = MVSTAT_MCV_MAGIC;
+ mcvlist->type = MVSTAT_MCV_TYPE_BASIC;
+ mcvlist->ndimensions = numattrs;
+ mcvlist->nitems = nitems;
+ mcvlist->items = (MCVItem*)palloc0(sizeof(MCVItem)*nitems);
+
+ /* now repeat the same loop as above, but this time copy the data
+ * for items exceeding the threshold */
+ count = 1;
+ nitems = 0;
+ for (i = 1; i <= numrows; i++)
+ {
+
+ /* last row or a new group */
+ if ((i == numrows) || (memcmp(&values[i * numattrs], &values[(i-1) * numattrs], len) != 0))
+ {
+ /* count the MCV item if exceeding the threshold (and copy into the array) */
+ if (count >= mcv_threshold)
+ {
+ /* first, allocate the item (with the proper size of values) */
+ MCVItem item = (MCVItem)palloc0(offsetof(MCVItemData, values) +
+ sizeof(Datum)*mcvlist->ndimensions);
+
+ /* then copy values from the _previous_ group */
+ memcpy(item->values, &values[(i-1)*numattrs], len);
+
+ /* and finally the group frequency */
+ item->frequency = (double)count / numrows;
+
+ mcvlist->items[nitems] = item;
+ nitems += 1;
+ }
+
+ count = 1;
+ }
+ else /* same group, just increase the row count */
+ count += 1;
+ }
+
+ /* make sure the loops are consistent */
+ Assert(nitems == mcvlist->nitems);
+
+ /*
+ * Remove the rows matching the MCV items.
+ *
+ * FIXME This implementation is rather naive, effectively O(N^2).
+ * As the MCV list grows, the check will take longer and
+ * longer. And as the number of sampled rows increases (by
+ * increasing statistics target), it will take longer and
+ * longer. One option is to sort the MCV items first and
+ * then perform a binary search.
+ */
+ if (nitems == ndistinct) /* all rows are covered by MCV items */
+ *numrows_filtered = 0;
+ else /* (nitems < ndistinct) && (nitems > 0) */
+ {
+ int nfiltered = 0;
+ HeapTuple *rows_filtered = (HeapTuple*)palloc0(sizeof(HeapTuple) * numrows);
+
+ /* walk through the tuples, compare the values to MCV items */
+ for (i = 0; i < numrows; i++)
+ {
+ bool match = false;
+ Datum *keys = (Datum*)palloc0(numattrs * sizeof(Datum));
+
+ /* collect the key values */
+ for (j = 0; j < numattrs; j++)
+ keys[j] = heap_getattr(rows[i], attrs->values[j], stats[j]->tupDesc, &isNull);
+
+ /* scan through the MCV list for matches */
+ for (j = 0; j < mcvlist->nitems; j++)
+ if (memcmp(keys, mcvlist->items[j]->values, sizeof(Datum)*numattrs) == 0)
+ {
+ match = true;
+ break;
+ }
+
+ /* if no match in the MCV list, copy the row into the filtered ones */
+ if (! match)
+ memcpy(&rows_filtered[nfiltered++], &rows[i], sizeof(HeapTuple));
+
+ pfree(keys);
+ }
+
+ /* replace the first part */
+ memcpy(rows, rows_filtered, sizeof(HeapTuple) * nfiltered);
+ *numrows_filtered = nfiltered;
+
+ pfree(rows_filtered);
+
+ }
+ }
+
+ pfree(values);
+
+ /*
+ * TODO Single-dimensional MCV is stored sorted by frequency (descending).
+ * Maybe this should be stored like that too?
+ */
+
+ return mcvlist;
+}
+
+/* comparators for the multi-variate stats */
+
+/*
+ * qsort_arg comparator for sorting Datums (MV stats)
+ *
+ * This does not maintain the tupnoLink array.
+ */
+static int
+compare_scalars_simple(const void *a, const void *b, void *arg)
+{
+ Datum da = *(Datum*)a;
+ Datum db = *(Datum*)b;
+ SortSupport ssup = (SortSupport) arg;
+
+ return ApplySortComparator(da, false, db, false, ssup);
+}
+
+/*
+ * qsort_arg comparator for sorting data when partitioning a MV bucket
+ */
+static int
+compare_scalars_partition(const void *a, const void *b, void *arg)
+{
+ Datum da = ((ScalarItem*)a)->value;
+ Datum db = ((ScalarItem*)b)->value;
+ SortSupport ssup = (SortSupport) arg;
+
+ return ApplySortComparator(da, false, db, false, ssup);
+}
+
+/*
+ * qsort_arg comparator for sorting Datum[] (row of Datums) when
+ * counting distinct values.
+ */
+static int
+compare_scalars_memcmp(const void *a, const void *b, void *arg)
+{
+ Size len = *(Size*)arg;
+
+ return memcmp(a, b, len);
+}
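+
+/*
+ * For example, the build methods above use it like this, with 'len'
+ * serving both as the qsort element size and as the comparator
+ * argument:
+ *
+ * Size len = sizeof(Datum) * numattrs;
+ * qsort_arg((void *) values, numrows, len,
+ * compare_scalars_memcmp, &len);
+ */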
+
+static int
+compare_scalars_memcmp_2(const void *a, const void *b)
+{
+ return memcmp(a, b, sizeof(Datum));
+}
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 714a9f1..7f9e54f 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -35,6 +35,7 @@
#include "catalog/pg_foreign_table.h"
#include "catalog/pg_inherits.h"
#include "catalog/pg_inherits_fn.h"
+#include "catalog/pg_mv_statistic.h"
#include "catalog/pg_namespace.h"
#include "catalog/pg_opclass.h"
#include "catalog/pg_rowsecurity.h"
@@ -91,7 +92,7 @@
#include "utils/syscache.h"
#include "utils/tqual.h"
#include "utils/typcache.h"
-
+#include "utils/mvstats.h"
/*
* ON COMMIT action list
@@ -139,8 +140,9 @@ static List *on_commits = NIL;
#define AT_PASS_ADD_COL 5 /* ADD COLUMN */
#define AT_PASS_ADD_INDEX 6 /* ADD indexes */
#define AT_PASS_ADD_CONSTR 7 /* ADD constraints, defaults */
-#define AT_PASS_MISC 8 /* other stuff */
-#define AT_NUM_PASSES 9
+#define AT_PASS_ADD_STATS 8 /* ADD statistics */
+#define AT_PASS_MISC 9 /* other stuff */
+#define AT_NUM_PASSES 10
typedef struct AlteredTableInfo
{
@@ -414,7 +416,8 @@ static void ATExecReplicaIdentity(Relation rel, ReplicaIdentityStmt *stmt, LOCKM
static void ATExecGenericOptions(Relation rel, List *options);
static void ATExecEnableRowSecurity(Relation rel);
static void ATExecDisableRowSecurity(Relation rel);
-
+static void ATExecAddStatistics(AlteredTableInfo *tab, Relation rel,
+ StatisticsDef *def, LOCKMODE lockmode);
static void copy_relation_data(SMgrRelation rel, SMgrRelation dst,
ForkNumber forkNum, char relpersistence);
static const char *storage_name(char c);
@@ -2965,6 +2968,7 @@ AlterTableGetLockLevel(List *cmds)
* updates.
*/
case AT_SetStatistics: /* Uses MVCC in getTableAttrs() */
+ case AT_AddStatistics: /* XXX not sure if this is the right lock level */
case AT_ClusterOn: /* Uses MVCC in getIndexes() */
case AT_DropCluster: /* Uses MVCC in getIndexes() */
case AT_SetOptions: /* Uses MVCC in getTableAttrs() */
@@ -3112,6 +3116,7 @@ ATPrepCmd(List **wqueue, Relation rel, AlterTableCmd *cmd,
pass = AT_PASS_ADD_CONSTR;
break;
case AT_SetStatistics: /* ALTER COLUMN SET STATISTICS */
+ case AT_AddStatistics: /* XXX maybe not the right place */
ATSimpleRecursion(wqueue, rel, cmd, recurse, lockmode);
/* Performs own permission checks */
ATPrepSetStatistics(rel, cmd->name, cmd->def, lockmode);
@@ -3407,6 +3412,9 @@ ATExecCmd(List **wqueue, AlteredTableInfo *tab, Relation rel,
case AT_SetStatistics: /* ALTER COLUMN SET STATISTICS */
ATExecSetStatistics(rel, cmd->name, cmd->def, lockmode);
break;
+ case AT_AddStatistics: /* ADD STATISTICS */
+ ATExecAddStatistics(tab, rel, (StatisticsDef *) cmd->def, lockmode);
+ break;
case AT_SetOptions: /* ALTER COLUMN SET ( options ) */
ATExecSetOptions(rel, cmd->name, cmd->def, false, lockmode);
break;
@@ -11616,3 +11624,197 @@ RangeVarCallbackForAlterRelation(const RangeVar *rv, Oid relid, Oid oldrelid,
ReleaseSysCache(tuple);
}
+
+/* used for sorting the attnums in ATExecAddStatistics */
+static int compare_int16(const void *a, const void *b)
+{
+ return memcmp(a, b, sizeof(int16));
+}
+
+/*
+ * Implements the ALTER TABLE ... ADD STATISTICS (options) ON (columns).
+ *
+ * The code is an unholy mix of pieces that really belong to other parts
+ * of the source tree.
+ *
+ * FIXME Check that the types are pass-by-value and support sort,
+ * although maybe we can live without the sort (and only build
+ * MCV list / association rules).
+ *
+ * FIXME This should probably check for duplicate stats (i.e. same
+ * keys, same options). Although maybe it's useful to have
+ * multiple stats on the same columns with different options
+ * (say, a detailed MCV-only stats for some queries, histogram
+ * for others, etc.)
+ */
+static void ATExecAddStatistics(AlteredTableInfo *tab, Relation rel,
+ StatisticsDef *def, LOCKMODE lockmode)
+{
+ int i, j;
+ ListCell *l;
+ int16 attnums[INDEX_MAX_KEYS];
+ Oid atttypids[INDEX_MAX_KEYS];
+ int numcols = 0;
+
+ Oid mvstatoid;
+ HeapTuple htup;
+ Datum values[Natts_pg_mv_statistic];
+ bool nulls[Natts_pg_mv_statistic];
+ int2vector *stakeys;
+ Relation mvstatrel;
+
+ /* by default build everything */
+ bool build_histogram = true,
+ build_mcv = true,
+ build_associations = true;
+
+ /* build a regular MCV list (not a hashed one) by default */
+ bool mcv_hashed = false;
+
+ int32 max_buckets = -1,
+ max_mcv_items = -1;
+
+ Assert(IsA(def, StatisticsDef));
+
+ /* transform the column names to attnum values */
+
+ foreach(l, def->keys)
+ {
+ char *attname = strVal(lfirst(l));
+ HeapTuple atttuple;
+
+ atttuple = SearchSysCacheAttName(RelationGetRelid(rel), attname);
+
+ if (!HeapTupleIsValid(atttuple))
+ ereport(ERROR,
+ (errcode(ERRCODE_UNDEFINED_COLUMN),
+ errmsg("column \"%s\" referenced in statistics does not exist",
+ attname)));
+
+ /* more than MVSTATS_MAX_DIMENSIONS columns not allowed */
+ if (numcols >= MVSTATS_MAX_DIMENSIONS)
+ ereport(ERROR,
+ (errcode(ERRCODE_TOO_MANY_COLUMNS),
+ errmsg("cannot have more than %d keys in a statistics",
+ MVSTATS_MAX_DIMENSIONS)));
+
+ attnums[numcols] = ((Form_pg_attribute) GETSTRUCT(atttuple))->attnum;
+ atttypids[numcols] = ((Form_pg_attribute) GETSTRUCT(atttuple))->atttypid;
+ ReleaseSysCache(atttuple);
+ numcols++;
+ }
+
+ /*
+ * Check the lower bound (at least 2 columns), the upper bound was
+ * already checked in the loop.
+ */
+ if (numcols < 2)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
+ errmsg("multivariate stats require 2 or more columns")));
+
+ /* look for duplicate columns */
+ for (i = 0; i < numcols; i++)
+ for (j = 0; j < i; j++)
+ if (attnums[i] == attnums[j])
+ ereport(ERROR,
+ (errcode(ERRCODE_DUPLICATE_COLUMN),
+ errmsg("duplicate column name in statistics definition")));
+
+ /* parse the statistics options */
+ foreach (l, def->options)
+ {
+ DefElem *opt = (DefElem*)lfirst(l);
+
+ if (strcmp(opt->defname, "histogram") == 0)
+ build_histogram = defGetBoolean(opt);
+ else if (strcmp(opt->defname, "mcv") == 0)
+ build_mcv = defGetBoolean(opt);
+ else if (strcmp(opt->defname, "mcv_hashed") == 0)
+ mcv_hashed = defGetBoolean(opt);
+ else if (strcmp(opt->defname, "associations") == 0)
+ build_associations = defGetBoolean(opt);
+ else if (strcmp(opt->defname, "max_buckets") == 0)
+ {
+ max_buckets = defGetInt32(opt);
+
+ /* TODO check that this is not used with 'histogram off' */
+
+ /* sanity check */
+ if (max_buckets < 1024)
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("minimum number of buckets is 1024")));
+
+ else if (max_buckets > 32768) /* FIXME use the proper constant */
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("minimum number of buckets is 1024")));
+
+ }
+ else if (strcmp(opt->defname, "max_mcv_items") == 0)
+ {
+ max_mcv_items = defGetInt32(opt);
+
+ /* TODO check that this is not used with 'mcv off' */
+
+ /* sanity check */
+ if (max_mcv_items < 0)
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("max number of MCV items must be non-negative")));
+
+ else if (max_mcv_items > 8192) /* FIXME use the proper constant */
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("max number of MCV items is 8192")));
+
+ }
+ else
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("unrecognized STATISTICS option \"%s\"",
+ opt->defname)));
+ }
+
+ /* sort the attnums and build int2vector */
+ qsort(attnums, numcols, sizeof(int16), compare_int16);
+ stakeys = buildint2vector(attnums, numcols);
+
+ /*
+ * Okay, let's create the pg_mv_statistic entry.
+ */
+ memset(values, 0, sizeof(values));
+ memset(nulls, false, sizeof(nulls));
+
+ /* no stats collected yet, so just the keys */
+ values[Anum_pg_mv_statistic_starelid-1] = ObjectIdGetDatum(RelationGetRelid(rel));
+
+ values[Anum_pg_mv_statistic_stakeys -1] = PointerGetDatum(stakeys);
+ values[Anum_pg_mv_statistic_hist_enabled -1] = BoolGetDatum(build_histogram);
+ values[Anum_pg_mv_statistic_mcv_enabled -1] = BoolGetDatum(build_mcv);
+ values[Anum_pg_mv_statistic_mcv_hashed -1] = BoolGetDatum(mcv_hashed);
+ values[Anum_pg_mv_statistic_assoc_enabled -1] = BoolGetDatum(build_associations);
+
+ values[Anum_pg_mv_statistic_hist_max_buckets -1] = Int32GetDatum(max_buckets);
+ values[Anum_pg_mv_statistic_mcv_max_items -1] = Int32GetDatum(max_mcv_items);
+
+ nulls[Anum_pg_mv_statistic_staassoc -1] = true;
+ nulls[Anum_pg_mv_statistic_stamcv -1] = true;
+ nulls[Anum_pg_mv_statistic_stahist -1] = true;
+
+ /* insert the tuple into pg_mv_statistic */
+ mvstatrel = heap_open(MvStatisticRelationId, RowExclusiveLock);
+
+ htup = heap_form_tuple(mvstatrel->rd_att, values, nulls);
+
+ mvstatoid = simple_heap_insert(mvstatrel, htup);
+
+ CatalogUpdateIndexes(mvstatrel, htup);
+
+ heap_freetuple(htup);
+
+ heap_close(mvstatrel, RowExclusiveLock);
+
+ return;
+}
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index e76b5b3..da35331 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -3903,6 +3903,17 @@ _copyAlterPolicyStmt(const AlterPolicyStmt *from)
return newnode;
}
+static StatisticsDef *
+_copyStatisticsDef(const StatisticsDef *from)
+{
+ StatisticsDef *newnode = makeNode(StatisticsDef);
+
+ COPY_NODE_FIELD(keys);
+ COPY_NODE_FIELD(options);
+
+ return newnode;
+}
+
/* ****************************************************************
* pg_list.h copy functions
* ****************************************************************
@@ -4717,6 +4728,9 @@ copyObject(const void *from)
case T_CommonTableExpr:
retval = _copyCommonTableExpr(from);
break;
+ case T_StatisticsDef:
+ retval = _copyStatisticsDef(from);
+ break;
case T_PrivGrantee:
retval = _copyPrivGrantee(from);
break;
@@ -4729,7 +4743,6 @@ copyObject(const void *from)
case T_XmlSerialize:
retval = _copyXmlSerialize(from);
break;
-
default:
elog(ERROR, "unrecognized node type: %d", (int) nodeTag(from));
retval = 0; /* keep compiler quiet */
diff --git a/src/backend/optimizer/path/clausesel.c b/src/backend/optimizer/path/clausesel.c
index 9b657fb..9c32735 100644
--- a/src/backend/optimizer/path/clausesel.c
+++ b/src/backend/optimizer/path/clausesel.c
@@ -24,6 +24,9 @@
#include "utils/lsyscache.h"
#include "utils/selfuncs.h"
+#include "utils/mvstats.h"
+#include "catalog/pg_collation.h"
+#include "utils/typcache.h"
/*
* Data structure for accumulating info about possible range-query
@@ -43,6 +46,23 @@ static void addRangeClause(RangeQueryClause **rqlist, Node *clause,
bool varonleft, bool isLTsel, Selectivity s2);
+static bool is_mv_compatible(Node *clause, Oid varRelid, Index *varno,
+ Bitmapset **attnums);
+static Bitmapset *collect_mv_attnums(PlannerInfo *root, List *clauses,
+ Oid varRelid, Oid *relid);
+static int choose_mv_histogram(int nmvstats, MVStats mvstats,
+ Bitmapset *attnums);
+static List *clauselist_mv_split(List *clauses, Oid varRelid,
+ List **mvclauses, MVStats mvstats);
+
+static Selectivity clauselist_mv_selectivity(PlannerInfo *root,
+ List *clauses, MVStats mvstats);
+static Selectivity clauselist_mv_selectivity_mcvlist(PlannerInfo *root,
+ List *clauses, MVStats mvstats,
+ bool *fullmatch, Selectivity *lowsel);
+static Selectivity clauselist_mv_selectivity_histogram(PlannerInfo *root,
+ List *clauses, MVStats mvstats);
+
/****************************************************************************
* ROUTINES TO COMPUTE SELECTIVITIES
****************************************************************************/
@@ -100,14 +120,74 @@ clauselist_selectivity(PlannerInfo *root,
RangeQueryClause *rqlist = NULL;
ListCell *l;
+ /* processing mv stats */
+ Oid relid = InvalidOid;
+ int nmvstats = 0;
+ MVStats mvstats = NULL;
+
+ /* attributes in mv-compatible clauses */
+ Bitmapset *mvattnums = NULL;
+
/*
- * If there's exactly one clause, then no use in trying to match up pairs,
- * so just go directly to clause_selectivity().
+ * If there's exactly one clause, then no use in trying to match up
+ * pairs, so just go directly to clause_selectivity().
*/
if (list_length(clauses) == 1)
return clause_selectivity(root, (Node *) linitial(clauses),
varRelid, jointype, sjinfo);
+ /* collect attributes from mv-compatible clauses */
+ mvattnums = collect_mv_attnums(root, clauses, varRelid, &relid);
+
+ /*
+ * If there are mv-compatible clauses, referencing at least two
+ * columns (otherwise it makes no sense to use mv stats), fetch the
+ * MV histograms for the relation (only the column keys, not the
+ * histograms yet - we'll decide which histogram to use first).
+ */
+ if (bms_num_members(mvattnums) >= 2)
+ {
+ /* clauses compatible with multi-variate stats */
+ List *mvclauses = NIL;
+
+ /* fetch info from the catalog (not the serialized stats yet) */
+ mvstats = list_mv_stats(relid, &nmvstats, true);
+
+ /*
+ * If there are candidate statistics, choose the histogram first.
+ * At the moment we use only a single statistics object, the one
+ * covering the most columns (using info from the previous step).
+ * If there are multiple such candidates, we'll use the smallest
+ * one (with the lowest number of dimensions).
+ *
+ * This may not be the optimal choice if the 'smaller' stats has
+ * far fewer buckets than the rejected one (making it less
+ * accurate).
+ *
+ * We may end up without multivariate statistics, if none of the
+ * stats matches at least two columns from the clauses (in that
+ * case we may just use the single dimensional stats).
+ */
+ if (nmvstats > 0)
+ {
+ int idx = choose_mv_histogram(nmvstats, mvstats, mvattnums);
+
+ if (idx >= 0) /* we have a matching stats */
+ {
+ MVStats mvstat = &mvstats[idx];
+
+ /* split the clauselist into regular and mv-clauses */
+ clauses = clauselist_mv_split(clauses, varRelid, &mvclauses, mvstat);
+
+ /* we've chosen the histogram to match the clauses */
+ Assert(mvclauses != NIL);
+
+ /* compute the multivariate stats */
+ s1 *= clauselist_mv_selectivity(root, mvclauses, mvstat);
+ }
+ }
+ }
+
/*
* Initial scan over clauses. Anything that doesn't look like a potential
* rangequery clause gets multiplied into s1 and forgotten. Anything that
@@ -782,3 +862,1010 @@ clause_selectivity(PlannerInfo *root,
return s1;
}
+
+/*
+ * Estimate selectivity for the list of MV-compatible clauses, using that
+ * particular histogram.
+ *
+ * When we hit a single bucket, we don't know what portion of it actually
+ * matches the clauses (e.g. equality), and we use 1/2 the bucket by
+ * default. However, the MV histograms are usually less detailed than
+ * the per-column ones, so the sum over the matched buckets is often
+ * quite high (thanks to combining a lot of "partially hit" buckets).
+ *
+ * There are several ways to improve this, each usually with cases
+ * where it won't really help. Also, the more complex the process,
+ * the worse the failures (i.e. misestimates).
+ *
+ * (1) Use the MV histogram only as a way to combine multiple
+ * per-column histograms, essentially rewriting
+ *
+ * P(A & B) = P(A) * P(B|A)
+ *
+ * where P(B|A) may be computed using a proper "slice" of the
+ * histogram, by first selecting only buckets where A is true, and
+ * then using the boundaries to 'restrict' the per-column histogram.
+ *
+ * With more clauses, it gets more complicated, of course
+ *
+ * P(A & B & C) = P(A & C) * P(B|A & C)
+ * = P(A) * P(C|A) * P(B|A & C)
+ *
+ * and so on.
+ *
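+ * A worked example (hypothetical values): with P(A) = 0.01 and
+ * P(B|A) = 0.9 (strongly correlated columns), we get
+ * P(A & B) = 0.01 * 0.9 = 0.009, while the independence assumption
+ * with P(B) = 0.01 would give 0.01 * 0.01 = 0.0001, i.e. two
+ * orders of magnitude less.
+ *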
+ * Of course, the question is how well and efficiently we can
+ * compute the conditional probabilities - whether this approach
+ * can improve the estimates (instead of amplifying the errors).
+ *
+ * Also, this does not eliminate the need for histogram on [A,B,C].
+ *
+ * (2) Use multiple smaller (and more accurate) histograms, and combine
+ * them using a process similar to the above. E.g. by assuming that
+ * B and C are independent, we can rewrite
+ *
+ * P(B|A & C) = P(B|A)
+ *
+ * so we can rewrite the whole formula to
+ *
+ * P(A & B & C) = P(A) * P(C|A) * P(B|A)
+ *
+ * and we're OK with two 2D histograms [A,C] and [A,B].
+ *
+ * It'd be nice to perform some sort of statistical test (Fisher
+ * or another chi-squared test) to identify independent components
+ * and automatically separate them into smaller histograms.
+ *
+ * (3) Using the estimated number of distinct values in a bucket to
+ * decide the selectivity of equality in the bucket (instead of
+ * blindly using 1/2 of the bucket, we may use 1/ndistinct).
+ * Of course, if the ndistinct estimate is way off, or when the
+ * distribution is not uniform (some distinct values get many more
+ * rows than others), this will fail. Also, we don't have the
+ * ndistinct estimate available at this moment (but it shouldn't be
+ * that difficult to compute, as ndistinct and ntuples should be
+ * available).
+ *
+ * TODO Clamp the selectivity by min of the per-clause selectivities
+ * (i.e. the selectivity of the most restrictive clause), because
+ * that's the maximum we can ever get from ANDed list of clauses.
+ * This would probably prevent issues with hitting too many buckets
+ * and low precision histograms.
+ *
+ * TODO We may support some additional conditions, most importantly
+ * those matching multiple columns (e.g. "a = b" or "a < b").
+ * Ultimately we could track multi-table histograms for join
+ * cardinality estimation.
+ *
+ * TODO Currently this is only estimating all clauses, or clauses
+ * matching varRelid (when it's not 0). I'm not sure what the
+ * purpose of varRelid is, but my assumption is this is used for
+ * join conditions and such. In that case we can use those clauses
+ * to restrict the other (i.e. filter the histogram buckets first,
+ * before estimating the other clauses). This is essentially equal
+ * to computing P(A|B) where "B" are the clauses not matching the
+ * varRelid.
+ *
+ * TODO Further thoughts on processing equality clauses - maybe it'd be
+ * better to look for stats (with MCV) covered by the equality
+ * clauses, because then we have a chance to find an exact match
+ * in the MCV list, which is pretty much the best we can do. We may
+ * also look at the least frequent MCV item, and use it as an upper
+ * boundary for the selectivity (had there been a more frequent
+ * item, it'd be in the MCV list).
+ *
+ * These conditions may then be used as a condition for the other
+ * selectivities, i.e. we may estimate P(A,B) first, and then
+ * compute P(C|A,B) from another histogram. This may be useful when
+ * we can estimate P(A,B) accurately (e.g. because it's a complete
+ * equality match evaluated on MCV list), and then compute the
+ * conditional probability P(C|A,B), giving us the requested stats
+ *
+ * P(A,B,C) = P(A,B) * P(C|A,B)
+ *
+ * TODO There are several options for 'sanity clamping' the estimates.
+ *
+ * First, if we have selectivities for each condition, then
+ *
+ * P(A,B) <= MIN(P(A), P(B))
+ *
+ * Because additional conditions (connected by AND) can only lower
+ * the probability.
+ *
+ * So we can do some basic sanity checks using the single-variate
+ * stats (the ones we have right now).
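+ *
+ * (For example, with P(A) = 0.05 and P(B) = 0.2, any estimate of
+ * P(A,B) above 0.05 may be clamped to 0.05 - hypothetical values,
+ * for illustration only.)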
+ *
+ * Second, when we have multivariate stats with a MCV list, then
+ *
+ * (a) if we have a full equality condition (one equality condition
+ * on each column) and we found a match in the MCV list, this is
+ * the selectivity (and it's supposed to be exact)
+ *
+ * (b) if we have a full equality condition and we haven't found a
+ * match in the MCV list, then the selectivity is below the
+ * lowest selectivity in the MCV list
+ *
+ * (c) if we have a equality condition (not full), we can still
+ * search the MCV for matches and use the sum of probabilities
+ * as a lower boundary for the histogram (if there are no
+ * matches in the MCV list, then we have no boundary)
+ *
+ * Third, if there are multiple multivariate stats for a set of
+ * clauses, we may compute all of them and then somehow aggregate
+ * them - e.g. by choosing the minimum, median or average. The
+ * multi-variate stats are susceptible to overestimation (because
+ * we take 50% of the bucket for partial matches). Some stats may
+ * give better estimates than others, but it's very difficult to
+ * determine in advance which one is the best (it depends
+ * on the number of buckets, number of additional columns not
+ * referenced in the clauses etc.) so we may compute all and then
+ * choose a sane aggregation (minimum seems like a good approach).
+ * Of course, this may result in longer / more expensive estimation
+ * (CPU-wise), but it may be worth it.
+ *
+ * There are ways to address this, though. First, it's possible to
+ * add a GUC choosing between a 'simple' estimation (using a single
+ * stats object expected to give the best estimate) and a 'full' one
+ * (combining the multiple estimates), e.g.
+ *
+ * multivariate_estimates = (simple|full)
+ *
+ * Also, this might be enabled at a table level, by something like
+ *
+ * ALTER TABLE ... SET STATISTICS (simple|full)
+ *
+ * Which would make it possible to use this only for the tables
+ * where the simple approach does not work.
+ *
+ * Also, there are ways to optimize this algorithmically. E.g. we
+ * may try to get an estimate from a matching MCV list first, and
+ * if we happen to get a "full equality match" we may stop computing
+ * the estimates from other stats (for this condition) because
+ * that's probably the best estimate we can really get.
+ *
+ * TODO When applying the clauses to the histogram/MCV list, we can do
+ * that from the most selective clauses first, because that'll
+ * eliminate the buckets/items sooner (so we'll be able to skip
+ * them without inspection, which is more expensive).
+ */
+static Selectivity
+clauselist_mv_selectivity(PlannerInfo *root, List *clauses, MVStats mvstats)
+{
+ bool fullmatch = false;
+ Selectivity s1 = 0.0, s2 = 0.0;
+
+ /*
+ * Lowest frequency in the MCV list (may be used as an upper bound
+ * for full equality conditions that did not match any MCV item).
+ */
+ Selectivity mcv_low = 0.0;
+
+ /* TODO Evaluate simple 1D selectivities, use the smallest one as
+ * an upper bound, product as lower bound, and sort the
+ * clauses in ascending order by selectivity (to optimize the
+ * MCV/histogram evaluation).
+ */
+
+ /* Evaluate the MCV first. */
+ s1 = clauselist_mv_selectivity_mcvlist(root, clauses, mvstats,
+ &fullmatch, &mcv_low);
+
+ /*
+ * If we got a full equality match on the MCV list, we're done (and
+ * the estimate is pretty good).
+ */
+ if (fullmatch && (s1 > 0.0))
+ return s1;
+
+ /* FIXME if (fullmatch) without matching MCV item, use the mcv_low
+ * selectivity as upper bound */
+
+ s2 = clauselist_mv_selectivity_histogram(root, clauses, mvstats);
+
+ /* TODO clamp to <= 1.0 (or more strictly, when possible) */
+ return s1 + s2;
+}
+
+/*
+ * Collect attributes from mv-compatible clauses.
+ */
+static Bitmapset *
+collect_mv_attnums(PlannerInfo *root, List *clauses, Oid varRelid, Oid *relid)
+{
+ Index varno = 0;
+ Bitmapset *attnums = NULL;
+ ListCell *l;
+
+ /*
+ * Walk through the clauses and identify the ones we can estimate
+ * using multivariate stats, and remember the relid/columns. We'll
+ * then cross-check if we have suitable stats, and only if needed
+ * we'll split the clauses into multivariate and regular lists.
+ *
+ * For now we're only interested in RestrictInfo nodes with nested
+ * OpExpr, using either a range or equality.
+ */
+ foreach (l, clauses)
+ {
+ Node *clause = (Node *) lfirst(l);
+
+ /* ignore the result for now - we only need the info */
+ is_mv_compatible(clause, varRelid, &varno, &attnums);
+ }
+
+ /*
+ * If there are at least two attributes referenced by the clause(s),
+ * fetch the relation info (and pass back the Oid of the relation).
+ */
+ if (bms_num_members(attnums) > 1)
+ {
+ RelOptInfo *rel = find_base_rel(root, varno);
+ *relid = root->simple_rte_array[bms_singleton_member(rel->relids)]->relid;
+ }
+ else
+ {
+ if (attnums != NULL)
+ pfree(attnums);
+ attnums = NULL;
+ *relid = InvalidOid;
+ }
+
+ return attnums;
+}
+
+/*
+ * We're looking for a histogram matching at least 2 attributes, and
+ * among the candidates we want the one with the lowest number of
+ * dimensions (to get efficient estimation and likely better
+ * precision). The precision depends on the total number of buckets
+ * too, but the lower the number of dimensions, the smaller (and more
+ * precise) the buckets can get. For example, given stats on (a,b)
+ * and (a,b,c), a query referencing only a and b is better served by
+ * the (a,b) stats.
+ */
+static int
+choose_mv_histogram(int nmvstats, MVStats mvstats, Bitmapset *attnums)
+{
+ int i, j;
+
+ int choice = -1;
+ int current_matches = 1; /* goal #1: maximize */
+ int current_dims = (MVSTATS_MAX_DIMENSIONS+1); /* goal #2: minimize */
+
+ for (i = 0; i < nmvstats; i++)
+ {
+ int matches = 0; /* columns matching this histogram */
+
+ int2vector * attrs = mvstats[i].stakeys;
+ int numattrs = mvstats[i].stakeys->dim1;
+
+ /* count columns covered by the histogram */
+ for (j = 0; j < numattrs; j++)
+ if (bms_is_member(attrs->values[j], attnums))
+ matches++;
+
+ /*
+ * Use this histogram when it improves the number of matches or
+ * when it keeps the number of matches and is smaller.
+ */
+ if ((matches > current_matches) ||
+ ((matches == current_matches) && (current_dims > numattrs)))
+ {
+ choice = i;
+ current_matches = matches;
+ current_dims = numattrs;
+ }
+ }
+
+ return choice;
+}
+
+/*
+ * This splits the clauses list into two parts - one containing clauses
+ * that will be evaluated using the chosen histogram, and the remaining
+ * clauses (either non-mvcompatible, or not related to the histogram).
+ */
+static List *
+clauselist_mv_split(List *clauses, Oid varRelid, List **mvclauses, MVStats mvstats)
+{
+ int i;
+ ListCell *l;
+ List *non_mvclauses = NIL;
+
+ /* FIXME is there a better way to get info on int2vector? */
+ int2vector * attrs = mvstats->stakeys;
+ int numattrs = mvstats->stakeys->dim1;
+
+ /* erase the list of mv-compatible clauses */
+ *mvclauses = NIL;
+
+ foreach (l, clauses)
+ {
+ RestrictInfo *rinfo;
+ Node *clause = (Node *) lfirst(l);
+
+ /*
+ * Only restrictinfo may be mv-compatible, so everything else
+ * goes to the non-mv list directly
+ *
+ * TODO create a macro/function to decide mv-compatible clauses
+ * (along the lines of is_opclause, for example)
+ */
+ if (! IsA(clause, RestrictInfo))
+ {
+ non_mvclauses = lappend(non_mvclauses, clause);
+ continue;
+ }
+
+ rinfo = (RestrictInfo *) clause;
+ clause = (Node*)rinfo->clause;
+
+ /* Pseudoconstants go directly to the non-mv list too. */
+ if (rinfo->pseudoconstant)
+ {
+ non_mvclauses = lappend(non_mvclauses, rinfo);
+ continue;
+ }
+
+ if (is_opclause(clause) && list_length(((OpExpr *) clause)->args) == 2)
+ {
+ OpExpr *expr = (OpExpr *) clause;
+ bool varonleft = true;
+ bool ok;
+
+ ok = (bms_membership(rinfo->clause_relids) == BMS_SINGLETON) &&
+ (is_pseudo_constant_clause_relids(lsecond(expr->args),
+ rinfo->right_relids) ||
+ (varonleft = false,
+ is_pseudo_constant_clause_relids(linitial(expr->args),
+ rinfo->left_relids)));
+
+ if (ok)
+ {
+
+ Var * var = (varonleft) ? linitial(expr->args) : lsecond(expr->args);
+
+ /*
+ * Only consider this variable if (varRelid == 0) or when the varno
+ * matches varRelid (see explanation at clause_selectivity).
+ */
+ if (! ((varRelid == 0) || (varRelid == var->varno)))
+ {
+ non_mvclauses = lappend(non_mvclauses, rinfo);
+ continue;
+ }
+
+ /*
+ * If it's a "<", ">" or "=" operator, check whether the column
+ * is covered by the chosen stats, and sort the clause into the
+ * matching list.
+ */
+ switch (get_oprrest(expr->opno))
+ {
+ case F_SCALARLTSEL:
+ case F_SCALARGTSEL:
+ case F_EQSEL:
+ if (! IS_SPECIAL_VARNO(var->varno)) /* FIXME necessary here? */
+ {
+ bool match = false;
+ for (i = 0; i < numattrs; i++)
+ if (attrs->values[i] == var->varattno)
+ match = true;
+
+ if (match)
+ {
+ *mvclauses = lappend(*mvclauses, clause);
+ break;
+ }
+ }
+ /* attribute not covered by the chosen stats */
+ non_mvclauses = lappend(non_mvclauses, rinfo);
+ break;
+
+ default:
+ /* not a supported operator, estimate the regular way */
+ non_mvclauses = lappend(non_mvclauses, rinfo);
+ break;
+ }
+ }
+ else
+ non_mvclauses = lappend(non_mvclauses, rinfo);
+ }
+ else
+ non_mvclauses = lappend(non_mvclauses, rinfo);
+ }
+
+ /*
+ * Return the remaining clauses - those will be estimated the
+ * regular way (they're either not compatible with MV stats, or
+ * not covered by the chosen histogram).
+ */
+ return non_mvclauses;
+
+}
+
+/*
+ * Determines whether the clause is compatible with multivariate stats,
+ * and if it is, returns some additional information - varno (index
+ * into simple_rte_array) and a bitmap of attributes. This is then
+ * used to fetch related multivariate statistics.
+ *
+ * At this moment we only support basic conditions of the form
+ *
+ * variable OP constant
+ *
+ * where OP is one of [=,<,<=,>=,>] (which is however determined by
+ * looking at the associated function for estimating selectivity, just
+ * like with the single-dimensional case).
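+ *
+ * For example, clauses like (a = 1) or (b < 10) qualify, while
+ * (a = b), (a IS NULL) or (a + b < 10) do not - there has to be a
+ * plain Var on one side and a pseudo-constant on the other (the
+ * clauses here are hypothetical, for illustration only).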
+ */
+static bool
+is_mv_compatible(Node *clause, Oid varRelid, Index *varno, Bitmapset **attnums)
+{
+
+ if (IsA(clause, RestrictInfo))
+ {
+ RestrictInfo *rinfo = (RestrictInfo *) clause;
+
+ /* Pseudoconstants are not really interesting here. */
+ if (rinfo->pseudoconstant)
+ return false;
+
+ /* get the actual clause from the RestrictInfo ... */
+ clause = (Node*)rinfo->clause;
+
+ /* is it 'variable op constant' ? */
+ if (is_opclause(clause) && list_length(((OpExpr *) clause)->args) == 2)
+ {
+ OpExpr *expr = (OpExpr *) clause;
+ bool varonleft = true;
+ bool ok;
+
+ ok = (bms_membership(rinfo->clause_relids) == BMS_SINGLETON) &&
+ (is_pseudo_constant_clause_relids(lsecond(expr->args),
+ rinfo->right_relids) ||
+ (varonleft = false,
+ is_pseudo_constant_clause_relids(linitial(expr->args),
+ rinfo->left_relids)));
+
+ if (ok)
+ {
+
+ Var * var = (varonleft) ? linitial(expr->args) : lsecond(expr->args);
+
+ /*
+ * Only consider this variable if (varRelid == 0) or when the varno
+ * matches varRelid (see explanation at clause_selectivity).
+ */
+ if (! ((varRelid == 0) || (varRelid == var->varno)))
+ return false;
+
+ /* Also skip special varno values, and system attributes ... */
+ if ((IS_SPECIAL_VARNO(var->varno)) || (! AttrNumberIsForUserDefinedAttr(var->varattno)))
+ return false;
+
+ /*
+ * If it's not a "<" or ">" or "=" operator, just ignore the
+ * clause. Otherwise note the relid and attnum for the variable.
+ * This uses the function for estimating selectivity, not the
+ * operator directly (a bit awkward, but well ...).
+ */
+ switch (get_oprrest(expr->opno))
+ {
+ case F_SCALARLTSEL:
+ case F_SCALARGTSEL:
+ case F_EQSEL:
+ *varno = var->varno;
+ *attnums = bms_add_member(*attnums, var->varattno);
+ return true;
+ }
+ }
+ }
+ }
+
+ return false;
+
+}
+
+/*
+ * Estimate selectivity of clauses using a MCV list.
+ *
+ * If there's no MCV list for the stats, the function returns 0.0.
+ *
+ * While computing the estimate, the function checks whether all the
+ * columns were matched with an equality condition. If that's the case,
+ * it's assumed we can skip computing the estimate from histogram,
+ * because all the rows matching the condition are represented by the
+ * MCV item.
+ *
+ * The function also returns the frequency of the least frequent item
+ * on the MCV list, which may be useful for clamping estimate from the
+ * histogram.
+ */
+static Selectivity
+clauselist_mv_selectivity_mcvlist(PlannerInfo *root, List *clauses,
+ MVStats mvstats, bool *fullmatch,
+ Selectivity *lowsel)
+{
+ int i;
+ Selectivity s = 0.0;
+ ListCell * l;
+ char * mcvitems = NULL;
+ MCVList mcvlist = NULL;
+
+ Bitmapset *matches = NULL; /* attributes with equality matches */
+
+ /* there's no MCV list yet */
+ if (! mvstats->mcv_built)
+ return 0.0;
+
+ mcvlist = deserialize_mv_mcvlist(fetch_mv_mcvlist(mvstats->mvoid));
+
+ Assert(mcvlist != NULL);
+ Assert (clauses != NIL);
+ Assert (list_length(clauses) >= 2);
+
+ mcvitems = palloc0(sizeof(char) * mcvlist->nitems);
+ memset(mcvitems, MVSTATS_MATCH_FULL, sizeof(char)*mcvlist->nitems);
+
+ /* start with the highest possible value, lowered as we scan the MCV items */
+ *lowsel = 1.0;
+
+ /* loop through the list of MV-compatible clauses and do the estimation */
+ foreach (l, clauses)
+ {
+ Node * clause = (Node*)lfirst(l);
+ OpExpr * expr = (OpExpr*)clause;
+ bool varonleft = true;
+ bool ok;
+
+ /* operator */
+ FmgrInfo opproc;
+
+ fmgr_info(get_opcode(expr->opno), &opproc);
+
+ ok = (NumRelids(clause) == 1) &&
+ (is_pseudo_constant_clause(lsecond(expr->args)) ||
+ (varonleft = false,
+ is_pseudo_constant_clause(linitial(expr->args))));
+
+ if (ok)
+ {
+
+ FmgrInfo ltproc, gtproc;
+ Var * var = (varonleft) ? linitial(expr->args) : lsecond(expr->args);
+ Const * cst = (varonleft) ? lsecond(expr->args) : linitial(expr->args);
+ bool isgt = (! varonleft);
+
+ /*
+ * TODO Fetch only when really needed (probably for equality only)
+ * TODO Technically either lt/gt is sufficient.
+ *
+ * FIXME The code in analyze.c creates histograms only for types
+ * with enough ordering (by calling get_sort_group_operators).
+ * Is this the same assumption, i.e. are we certain that we
+ * get the ltproc/gtproc every time we ask? Or are there types
+ * where get_sort_group_operators returns ltopr and here we
+ * get nothing?
+ */
+ TypeCacheEntry *typecache = lookup_type_cache(var->vartype, TYPECACHE_EQ_OPR | TYPECACHE_LT_OPR | TYPECACHE_GT_OPR);
+
+ /* FIXME do proper matching of the attribute to the dimension */
+ int idx = mv_get_index(var->varattno, mvstats->stakeys);
+
+ fmgr_info(get_opcode(typecache->lt_opr), &ltproc);
+ fmgr_info(get_opcode(typecache->gt_opr), &gtproc);
+
+ /* process the MCV list first */
+ for (i = 0; i < mcvlist->nitems; i++)
+ {
+ bool tmp;
+ MCVItem item = mcvlist->items[i];
+
+ /* find the lowest selectivity in the MCV */
+ if (item->frequency < *lowsel)
+ *lowsel = item->frequency;
+
+ /* skip MCV items already ruled out */
+ if (mcvitems[i] == MVSTATS_MATCH_NONE)
+ continue;
+
+ /* TODO consider bsearch here (list is sorted by values)
+ * TODO handle other operators too (LT, GT)
+ * TODO identify "full match" when the clauses fully
+ * match the whole MCV list (so that checking the
+ * histogram is not needed)
+ */
+ if (get_oprrest(expr->opno) == F_EQSEL)
+ {
+ /*
+ * We don't care about isgt in equality, because it does not matter
+ * whether it's (var = const) or (const = var).
+ */
+ if (memcmp(&cst->constvalue, &item->values[idx], sizeof(Datum)) != 0)
+ mcvitems[i] = MVSTATS_MATCH_NONE;
+ else
+ matches = bms_add_member(matches, idx);
+ }
+ else /* F_SCALARLTSEL or F_SCALARGTSEL, i.e. an inequality */
+ {
+ /*
+ * For inequalities, evaluate the operator from the query
+ * directly on the MCV value, with the arguments in the
+ * original order (var op const, or const op var when the
+ * constant is on the left). Unlike with histogram buckets
+ * there are no partial matches here - the item either
+ * satisfies the clause or it does not. This also gets the
+ * boundary (equality) cases right for both strict and
+ * non-strict operators.
+ */
+ tmp = DatumGetBool(FunctionCall2Coll(&opproc,
+ DEFAULT_COLLATION_OID,
+ isgt ? cst->constvalue : item->values[idx],
+ isgt ? item->values[idx] : cst->constvalue));
+
+ if (! tmp)
+ {
+ mcvitems[i] = MVSTATS_MATCH_NONE; /* no match */
+ continue;
+ }
+ }
+
+ }
+ }
+ }
+
+ for (i = 0; i < mcvlist->nitems; i++)
+ {
+ if (mcvitems[i] != MVSTATS_MATCH_NONE)
+ s += mcvlist->items[i]->frequency;
+ }
+
+ *fullmatch = (bms_num_members(matches) == mcvlist->ndimensions);
+
+ pfree(mcvitems);
+ pfree(mcvlist);
+
+ return s;
+}
+
+/*
+ * Estimate selectivity of clauses using a histogram.
+ *
+ * If there's no histogram for the stats, the function returns 0.0.
+ */
+static Selectivity
+clauselist_mv_selectivity_histogram(PlannerInfo *root, List *clauses,
+ MVStats mvstats)
+{
+ int i;
+ Selectivity s = 0.0;
+ ListCell * l;
+ char *buckets = NULL;
+ MVHistogram mvhist = NULL;
+
+ /* there's no histogram */
+ if (! mvstats->hist_built)
+ return 0.0;
+
+ /* There may be no histogram in the stats (check hist_built flag) */
+ mvhist = deserialize_mv_histogram(fetch_mv_histogram(mvstats->mvoid));
+
+ Assert (mvhist != NULL);
+ Assert (clauses != NIL);
+ Assert (list_length(clauses) >= 2);
+
+ /*
+ * Bitmap of bucket matches (mismatch, partial, full). By default
+ * all buckets fully match, and we eliminate or downgrade them as
+ * the clauses are evaluated.
+ */
+ buckets = palloc0(sizeof(char) * mvhist->nbuckets);
+ memset(buckets, MVSTATS_MATCH_FULL, sizeof(char)*mvhist->nbuckets);
+
+ /* loop through the clauses and do the estimation */
+ foreach (l, clauses)
+ {
+ Node * clause = (Node*)lfirst(l);
+ OpExpr * expr = (OpExpr*)clause;
+ bool varonleft = true;
+ bool ok;
+
+ FmgrInfo opproc; /* operator */
+ fmgr_info(get_opcode(expr->opno), &opproc);
+
+ ok = (NumRelids(clause) == 1) &&
+ (is_pseudo_constant_clause(lsecond(expr->args)) ||
+ (varonleft = false,
+ is_pseudo_constant_clause(linitial(expr->args))));
+
+ if (ok)
+ {
+ FmgrInfo ltproc;
+ Var * var = (varonleft) ? linitial(expr->args) : lsecond(expr->args);
+ Const * cst = (varonleft) ? lsecond(expr->args) : linitial(expr->args);
+ bool isgt = (! varonleft);
+
+ /*
+ * TODO Fetch only when really needed (probably for equality only)
+ *
+ * TODO Technically either lt/gt is sufficient.
+ *
+ * FIXME The code in analyze.c creates histograms only for types
+ * with enough ordering (by calling get_sort_group_operators).
+ * Is this the same assumption, i.e. are we certain that we
+ * get the ltproc/gtproc every time we ask? Or are there types
+ * where get_sort_group_operators returns ltopr and here we
+ * get nothing?
+ */
+ TypeCacheEntry *typecache
+ = lookup_type_cache(var->vartype, TYPECACHE_EQ_OPR | TYPECACHE_LT_OPR
+ | TYPECACHE_GT_OPR);
+
+ /* lookup dimension for the attribute */
+ int idx = mv_get_index(var->varattno, mvstats->stakeys);
+
+ fmgr_info(get_opcode(typecache->lt_opr), &ltproc);
+
+ /*
+ * Check this for all buckets that have not been eliminated yet.
+ *
+ * We already know the clauses use suitable operators (because that's
+ * how we filtered them).
+ */
+ for (i = 0; i < mvhist->nbuckets; i++)
+ {
+ bool tmp;
+ MVBucket bucket = mvhist->buckets[i];
+
+ /*
+ * Skip buckets that were already eliminated - this is important
+ * considering how we update the info (we only ever lower the match)
+ */
+ if (buckets[i] == MVSTATS_MATCH_NONE)
+ continue;
+
+ /*
+ * If it's not a "<" or ">" or "=" operator, just ignore the
+ * clause. Otherwise note the relid and attnum for the variable.
+ *
+ * TODO I'm really unsure the handling of 'isgt' flag (that is, clauses
+ * with reverse order of variable/constant) is correct. I wouldn't
+ * be surprised if there was some mixup. Using the lt/gt operators
+ * instead of messing with the opproc could make it simpler.
+ * It would however be using a different operator than the query,
+ * although it's not any shadier than using the selectivity function
+ * as is done currently.
+ *
+ * FIXME Once the min/max values are deduplicated, we can easily minimize
+ * the number of calls to the comparator (assuming we keep the
+ * deduplicated structure). See the note on compression at MVBucket
+ * serialize/deserialize methods.
+ */
+ switch (get_oprrest(expr->opno))
+ {
+ case F_SCALARLTSEL: /* column < constant */
+
+ if (! isgt) /* (var < const) */
+ {
+ /*
+ * First check whether the constant is below the lower boundary (in that
+ * case we can skip the bucket, because there's no overlap).
+ */
+ tmp = DatumGetBool(FunctionCall2Coll(&opproc,
+ DEFAULT_COLLATION_OID,
+ cst->constvalue,
+ bucket->min[idx]));
+ if (tmp)
+ {
+ buckets[i] = MVSTATS_MATCH_NONE; /* no match */
+ continue;
+ }
+
+ /*
+ * Now check whether the upper boundary is below the constant (in that
+ * case it's a partial match).
+ */
+ tmp = DatumGetBool(FunctionCall2Coll(&opproc,
+ DEFAULT_COLLATION_OID,
+ cst->constvalue,
+ bucket->max[idx]));
+
+ if (tmp)
+ buckets[i] = MVSTATS_MATCH_PARTIAL; /* partial match */
+ }
+ else /* (const < var) */
+ {
+ /*
+ * First check whether the constant is above the upper boundary (in that
+ * case we can skip the bucket, because there's no overlap).
+ */
+ tmp = DatumGetBool(FunctionCall2Coll(&opproc,
+ DEFAULT_COLLATION_OID,
+ bucket->max[idx],
+ cst->constvalue));
+ if (tmp)
+ {
+ buckets[i] = MVSTATS_MATCH_NONE; /* no match */
+ continue;
+ }
+
+ /*
+ * Now check whether the lower boundary is below the constant (in that
+ * case it's a partial match).
+ */
+ tmp = DatumGetBool(FunctionCall2Coll(&opproc,
+ DEFAULT_COLLATION_OID,
+ bucket->min[idx],
+ cst->constvalue));
+
+ if (tmp)
+ buckets[i] = MVSTATS_MATCH_PARTIAL; /* partial match */
+ }
+ break;
+
+ case F_SCALARGTSEL: /* column > constant */
+
+ if (! isgt) /* (var > const) */
+ {
+ /*
+ * First check whether the constant is above the upper boundary (in that
+ * case we can skip the bucket, because there's no overlap).
+ */
+ tmp = DatumGetBool(FunctionCall2Coll(&opproc,
+ DEFAULT_COLLATION_OID,
+ cst->constvalue,
+ bucket->max[idx]));
+ if (tmp)
+ {
+ buckets[i] = MVSTATS_MATCH_NONE; /* no match */
+ continue;
+ }
+
+ /*
+ * Now check whether the lower boundary is below the constant (in that
+ * case it's a partial match).
+ */
+ tmp = DatumGetBool(FunctionCall2Coll(&opproc,
+ DEFAULT_COLLATION_OID,
+ cst->constvalue,
+ bucket->min[idx]));
+
+ if (tmp)
+ buckets[i] = MVSTATS_MATCH_PARTIAL; /* partial match */
+ }
+ else /* (const > var) */
+ {
+ /*
+ * First check whether the constant is below the lower boundary (in
+ * that case we can skip the bucket, because there's no overlap).
+ */
+ tmp = DatumGetBool(FunctionCall2Coll(&opproc,
+ DEFAULT_COLLATION_OID,
+ bucket->min[idx],
+ cst->constvalue));
+ if (tmp)
+ {
+ buckets[i] = MVSTATS_MATCH_NONE; /* no match */
+ continue;
+ }
+
+ /*
+ * Now check whether the upper boundary is below the constant (in that
+ * case it's a partial match).
+ */
+ tmp = DatumGetBool(FunctionCall2Coll(&opproc,
+ DEFAULT_COLLATION_OID,
+ bucket->max[idx],
+ cst->constvalue));
+
+ if (tmp)
+ buckets[i] = MVSTATS_MATCH_PARTIAL; /* partial match */
+ }
+
+ break;
+
+ case F_EQSEL:
+
+ /*
+ * We only check whether the value is within the bucket, using the
+ * lt operator fetched from the type cache (used for both boundary
+ * checks, with swapped arguments).
+ *
+ * TODO We'll use the default 50% estimate, but that's probably way off
+ * if there are multiple distinct values. Consider tweaking this
+ * somehow, e.g. using only a part inversely proportional to the
+ * estimated number of distinct values in the bucket.
+ *
+ * TODO This does not handle inclusion flags at the moment, thus counting
+ * some buckets twice (when hitting the boundary).
+ *
+ * TODO Optimization is that if max[i] == min[i], it's effectively a MCV
+ * item and we can count the whole bucket as a complete match (thus
+ * using 100% bucket selectivity and not just 50%).
+ *
+ * TODO Technically some buckets may "degenerate" into single-value
+ * buckets (not necessarily for all the dimensions) - maybe this
+ * is better than keeping a separate MCV list (multi-dimensional).
+ * Update: Actually, that's unlikely to be better than a separate
+ * MCV list for two reasons - first, it requires ~2x the space
+ * (because of storing lower/upper boundaries) and second because
+ * the buckets are ranges - depending on the partitioning algorithm
+ * it may not even degenerate into a (min=max) bucket. For example,
+ * the current partitioning algorithm never does that.
+ */
+ tmp = DatumGetBool(FunctionCall2Coll(&ltproc,
+ DEFAULT_COLLATION_OID,
+ cst->constvalue,
+ bucket->min[idx]));
+
+ if (tmp)
+ {
+ buckets[i] = MVSTATS_MATCH_NONE; /* constvalue < min */
+ continue;
+ }
+
+ tmp = DatumGetBool(FunctionCall2Coll(&ltproc,
+ DEFAULT_COLLATION_OID,
+ bucket->max[idx],
+ cst->constvalue));
+
+ if (tmp)
+ {
+ buckets[i] = MVSTATS_MATCH_NONE; /* constvalue > max */
+ continue;
+ }
+
+ /* partial match */
+ buckets[i] = MVSTATS_MATCH_PARTIAL;
+
+ break;
+ }
+ }
+ }
+ }
+
+ /* now, walk through the buckets and sum the selectivities */
+ for (i = 0; i < mvhist->nbuckets; i++)
+ {
+ if (buckets[i] == MVSTATS_MATCH_FULL)
+ s += mvhist->buckets[i]->ntuples;
+ else if (buckets[i] == MVSTATS_MATCH_PARTIAL)
+ s += 0.5 * mvhist->buckets[i]->ntuples;
+ }
+
+ return s;
+}
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index bd180e7..d725ae0 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -366,6 +366,13 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
create_generic_options alter_generic_options
relation_expr_list dostmt_opt_list
+%type <list> OptStatsOptions
+%type <str> stats_options_name
+%type <node> stats_options_arg
+%type <defelt> stats_options_elem
+%type <list> stats_options_list
+
+
%type <list> opt_fdw_options fdw_options
%type <defelt> fdw_option
@@ -484,7 +491,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
%type <keyword> unreserved_keyword type_func_name_keyword
%type <keyword> col_name_keyword reserved_keyword
-%type <node> TableConstraint TableLikeClause
+%type <node> TableConstraint TableLikeClause TableStatistics
%type <ival> TableLikeOptionList TableLikeOption
%type <list> ColQualList
%type <node> ColConstraint ColConstraintElem ConstraintAttr
@@ -2312,6 +2319,14 @@ alter_table_cmd:
n->subtype = AT_DisableRowSecurity;
$$ = (Node *)n;
}
+ /* ALTER TABLE <name> ADD STATISTICS (options) ON (columns) ... */
+ | ADD_P TableStatistics
+ {
+ AlterTableCmd *n = makeNode(AlterTableCmd);
+ n->subtype = AT_AddStatistics;
+ n->def = $2;
+ $$ = (Node *)n;
+ }
| alter_generic_options
{
AlterTableCmd *n = makeNode(AlterTableCmd);
@@ -3382,6 +3397,56 @@ OptConsTableSpace: USING INDEX TABLESPACE name { $$ = $4; }
ExistingIndex: USING INDEX index_name { $$ = $3; }
;
+/*****************************************************************************
+ *
+ * QUERY :
+ * ALTER TABLE relname ADD STATISTICS (options) ON (columns)
+ *
+ *****************************************************************************/
+
+TableStatistics:
+ STATISTICS OptStatsOptions ON '(' columnList ')'
+ {
+ StatisticsDef *n = makeNode(StatisticsDef);
+ n->keys = $5;
+ n->options = $2;
+ $$ = (Node *) n;
+ }
+ ;
+
+OptStatsOptions:
+ '(' stats_options_list ')' { $$ = $2; }
+ | /*EMPTY*/ { $$ = NIL; }
+ ;
+
+stats_options_list:
+ stats_options_elem
+ {
+ $$ = list_make1($1);
+ }
+ | stats_options_list ',' stats_options_elem
+ {
+ $$ = lappend($1, $3);
+ }
+ ;
+
+stats_options_elem:
+ stats_options_name stats_options_arg
+ {
+ $$ = makeDefElem($1, $2);
+ }
+ ;
+
+stats_options_name:
+ NonReservedWord { $$ = $1; }
+ ;
+
+stats_options_arg:
+ opt_boolean_or_string { $$ = (Node *) makeString($1); }
+ | NumericOnly { $$ = (Node *) $1; }
+ | /* EMPTY */ { $$ = NULL; }
+ ;
+
/*****************************************************************************
*
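To make the new grammar concrete, it accepts statements of the following shape. This is only an illustrative sketch - the option names are my guesses (presumably DefElems mapped to the hist_*/mcv_* catalog fields defined below), I have not verified which names the ALTER TABLE code actually accepts:

ALTER TABLE test ADD STATISTICS (mcv true, max_buckets 1000) ON (a, b);

Note that per stats_options_arg the value may be omitted entirely, so a bare option name like (mcv) should parse as well.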
diff --git a/src/backend/utils/cache/syscache.c b/src/backend/utils/cache/syscache.c
index 94d951c..ec90773 100644
--- a/src/backend/utils/cache/syscache.c
+++ b/src/backend/utils/cache/syscache.c
@@ -43,6 +43,7 @@
#include "catalog/pg_foreign_server.h"
#include "catalog/pg_foreign_table.h"
#include "catalog/pg_language.h"
+#include "catalog/pg_mv_statistic.h"
#include "catalog/pg_namespace.h"
#include "catalog/pg_opclass.h"
#include "catalog/pg_operator.h"
@@ -499,6 +500,17 @@ static const struct cachedesc cacheinfo[] = {
},
4
},
+ {MvStatisticRelationId, /* MVSTATOID */
+ MvStatisticOidIndexId,
+ 1,
+ {
+ ObjectIdAttributeNumber,
+ 0,
+ 0,
+ 0
+ },
+ 128
+ },
{NamespaceRelationId, /* NAMESPACENAME */
NamespaceNameIndexId,
1,
diff --git a/src/include/catalog/indexing.h b/src/include/catalog/indexing.h
index 870692c..d2266c0 100644
--- a/src/include/catalog/indexing.h
+++ b/src/include/catalog/indexing.h
@@ -173,6 +173,11 @@ DECLARE_UNIQUE_INDEX(pg_largeobject_loid_pn_index, 2683, on pg_largeobject using
DECLARE_UNIQUE_INDEX(pg_largeobject_metadata_oid_index, 2996, on pg_largeobject_metadata using btree(oid oid_ops));
#define LargeObjectMetadataOidIndexId 2996
+DECLARE_UNIQUE_INDEX(pg_mv_statistic_oid_index, 3259, on pg_mv_statistic using btree(oid oid_ops));
+#define MvStatisticOidIndexId 3259
+DECLARE_INDEX(pg_mv_statistic_relid_index, 3264, on pg_mv_statistic using btree(starelid oid_ops));
+#define MvStatisticRelidIndexId 3264
+
DECLARE_UNIQUE_INDEX(pg_namespace_nspname_index, 2684, on pg_namespace using btree(nspname name_ops));
#define NamespaceNameIndexId 2684
DECLARE_UNIQUE_INDEX(pg_namespace_oid_index, 2685, on pg_namespace using btree(oid oid_ops));
diff --git a/src/include/catalog/pg_mv_statistic.h b/src/include/catalog/pg_mv_statistic.h
new file mode 100644
index 0000000..d725957
--- /dev/null
+++ b/src/include/catalog/pg_mv_statistic.h
@@ -0,0 +1,89 @@
+/*-------------------------------------------------------------------------
+ *
+ * pg_mv_statistic.h
+ * definition of the system "multivariate statistic" relation (pg_mv_statistic)
+ * along with the relation's initial contents.
+ *
+ *
+ * Portions Copyright (c) 1996-2014, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/catalog/pg_mv_statistic.h
+ *
+ * NOTES
+ * the genbki.pl script reads this file and generates .bki
+ * information from the DATA() statements.
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef PG_MV_STATISTIC_H
+#define PG_MV_STATISTIC_H
+
+#include "catalog/genbki.h"
+
+/* ----------------
+ * pg_mv_statistic definition. cpp turns this into
+ * typedef struct FormData_pg_mv_statistic
+ * ----------------
+ */
+#define MvStatisticRelationId 3260
+
+CATALOG(pg_mv_statistic,3260)
+{
+ /* These fields form the unique key for the entry: */
+ Oid starelid; /* relation containing attributes */
+
+ /* statistics requested to build */
+ bool hist_enabled; /* build histogram? */
+ bool mcv_enabled; /* build MCV list? */
+ bool mcv_hashed; /* build hashed MCV? */
+ bool assoc_enabled; /* analyze associations? */
+
+ /* histogram / MCV size */
+ int32 hist_max_buckets; /* max buckets */
+ int32 mcv_max_items; /* max MCV items */
+
+ /* statistics that are available (if requested) */
+ bool hist_built; /* histogram was built */
+ bool mcv_built; /* MCV list was built */
+ bool assoc_built; /* associations were built */
+
+ /* variable-length fields start here, but we allow direct access to stakeys */
+ int2vector stakeys; /* array of column keys */
+
+#ifdef CATALOG_VARLEN
+ bytea staassoc; /* association rules (serialized) */
+ bytea stamcv; /* MCV list (serialized) */
+ bytea stahist; /* MV histogram (serialized) */
+#endif
+
+} FormData_pg_mv_statistic;
+
+/* ----------------
+ * Form_pg_mv_statistic corresponds to a pointer to a tuple with
+ * the format of pg_mv_statistic relation.
+ * ----------------
+ */
+typedef FormData_pg_mv_statistic *Form_pg_mv_statistic;
+
+/* ----------------
+ * compiler constants for pg_mv_statistic
+ * ----------------
+ */
+#define Natts_pg_mv_statistic 14
+#define Anum_pg_mv_statistic_starelid 1
+#define Anum_pg_mv_statistic_hist_enabled 2
+#define Anum_pg_mv_statistic_mcv_enabled 3
+#define Anum_pg_mv_statistic_mcv_hashed 4
+#define Anum_pg_mv_statistic_assoc_enabled 5
+#define Anum_pg_mv_statistic_hist_max_buckets 6
+#define Anum_pg_mv_statistic_mcv_max_items 7
+#define Anum_pg_mv_statistic_hist_built 8
+#define Anum_pg_mv_statistic_mcv_built 9
+#define Anum_pg_mv_statistic_assoc_built 10
+#define Anum_pg_mv_statistic_stakeys 11
+#define Anum_pg_mv_statistic_staassoc 12
+#define Anum_pg_mv_statistic_stamcv 13
+#define Anum_pg_mv_statistic_stahist 14
+
+#endif /* PG_MV_STATISTIC_H */
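For reviewers poking at this, the new catalog can be queried directly once some statistics are defined - e.g. this (using the columns defined above) shows what was requested and what ANALYZE has actually built:

SELECT starelid::regclass AS rel, stakeys,
       hist_enabled, hist_built, mcv_enabled, mcv_built
FROM pg_mv_statistic;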
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
index 497e652..c3c03b6 100644
--- a/src/include/catalog/pg_proc.h
+++ b/src/include/catalog/pg_proc.h
@@ -2676,6 +2676,13 @@ DESCR("current user privilege on any column by rel name");
DATA(insert OID = 3029 ( has_any_column_privilege PGNSP PGUID 12 10 0 0 0 f f f f t f s 2 0 16 "26 25" _null_ _null_ _null_ _null_ has_any_column_privilege_id _null_ _null_ _null_ ));
DESCR("current user privilege on any column by rel oid");
+DATA(insert OID = 3261 ( pg_mv_stats_histogram_info PGNSP PGUID 12 1 0 0 0 f f f f t f i 1 0 25 "17" _null_ _null_ _null_ _null_ pg_mv_stats_histogram_info _null_ _null_ _null_ ));
+DESCR("multi-variate statistics: histogram info");
+DATA(insert OID = 3262 ( pg_mv_stats_mvclist_info PGNSP PGUID 12 1 0 0 0 f f f f t f i 1 0 25 "17" _null_ _null_ _null_ _null_ pg_mv_stats_mvclist_info _null_ _null_ _null_ ));
+DESCR("multi-variate statistics: MCV list info");
+DATA(insert OID = 3263 ( pg_mv_stats_histogram_gnuplot PGNSP PGUID 12 1 0 0 0 f f f f t f i 1 0 25 "17" _null_ _null_ _null_ _null_ pg_mv_stats_histogram_gnuplot _null_ _null_ _null_ ));
+DESCR("multi-variate statistics: 2D histogram gnuplot");
+
DATA(insert OID = 1928 ( pg_stat_get_numscans PGNSP PGUID 12 1 0 0 0 f f f f t f s 1 0 20 "26" _null_ _null_ _null_ _null_ pg_stat_get_numscans _null_ _null_ _null_ ));
DESCR("statistics: number of scans done for table/index");
DATA(insert OID = 1929 ( pg_stat_get_tuples_returned PGNSP PGUID 12 1 0 0 0 f f f f t f s 1 0 20 "26" _null_ _null_ _null_ _null_ pg_stat_get_tuples_returned _null_ _null_ _null_ ));
diff --git a/src/include/catalog/toasting.h b/src/include/catalog/toasting.h
index a4af551..c7839c0 100644
--- a/src/include/catalog/toasting.h
+++ b/src/include/catalog/toasting.h
@@ -49,6 +49,7 @@ extern void BootstrapToastTable(char *relName,
DECLARE_TOAST(pg_attrdef, 2830, 2831);
DECLARE_TOAST(pg_constraint, 2832, 2833);
DECLARE_TOAST(pg_description, 2834, 2835);
+DECLARE_TOAST(pg_mv_statistic, 3265, 3954);
DECLARE_TOAST(pg_proc, 2836, 2837);
DECLARE_TOAST(pg_rewrite, 2838, 2839);
DECLARE_TOAST(pg_seclabel, 3598, 3599);
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index bc71fea..b916edd 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -413,6 +413,7 @@ typedef enum NodeTag
T_XmlSerialize,
T_WithClause,
T_CommonTableExpr,
+ T_StatisticsDef,
/*
* TAGS FOR REPLICATION GRAMMAR PARSE NODES (replnodes.h)
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 3e4f815..c3e458a 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -543,6 +543,14 @@ typedef struct ColumnDef
int location; /* parse location, or -1 if none/unknown */
} ColumnDef;
+typedef struct StatisticsDef
+{
+ NodeTag type;
+ List *keys; /* String nodes naming referenced column(s) */
+ List *options; /* list of DefElem nodes */
+} StatisticsDef;
+
+
/*
* TableLikeClause - CREATE TABLE ( ... LIKE ... ) clause
*/
@@ -1338,7 +1346,8 @@ typedef enum AlterTableType
AT_ReplicaIdentity, /* REPLICA IDENTITY */
AT_EnableRowSecurity, /* ENABLE ROW SECURITY */
AT_DisableRowSecurity, /* DISABLE ROW SECURITY */
- AT_GenericOptions /* OPTIONS (...) */
+ AT_GenericOptions, /* OPTIONS (...) */
+ AT_AddStatistics /* add statistics */
} AlterTableType;
typedef struct ReplicaIdentityStmt
diff --git a/src/include/utils/mvstats.h b/src/include/utils/mvstats.h
new file mode 100644
index 0000000..157891a
--- /dev/null
+++ b/src/include/utils/mvstats.h
@@ -0,0 +1,283 @@
+/*-------------------------------------------------------------------------
+ *
+ * mvstats.h
+ * Multivariate statistics and selectivity estimation functions.
+ *
+ *
+ * Portions Copyright (c) 1996-2014, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/utils/mvstats.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef MVSTATS_H
+#define MVSTATS_H
+
+/*
+ * Multivariate statistics for planner/optimizer, implementing extensions
+ * of the single-column statistics:
+ *
+ * - multivariate MCV list
+ * - multivariate histograms
+ *
+ * There's also an experimental support for associative rules (values in
+ * one column implying values in other columns - e.g. ZIP code implies
+ * name of a city, etc.).
+ *
+ * The current implementation has various limitations:
+ *
+ * (a) it supports only data types passed by value
+ *
+ * (b) no support for NULL values
+ *
+ * Both (a) and (b) should be straightforward to fix (the necessary
+ * changes are usually described in comments at the related data
+ * structures or functions).
+ *
+ * The stats may be built only directly on columns, not on expressions.
+ * And there are usually some additional technical limits (e.g. number
+ * of columns in a histogram, etc.).
+ *
+ * Those limits serve mostly as sanity checks and while increasing them
+ * is possible (the implementation should not break), it's expected to
+ * lead either to very bad precision or expensive planning.
+ */
+
+/*
+ * Multivariate histograms
+ *
+ * Histograms are a collection of buckets, represented by n-dimensional
+ * rectangles. Each rectangle is delimited by an array of lower and
+ * upper boundaries, so that for the i-th attribute
+ *
+ * min[i] <= value[i] <= max[i]
+ *
+ * Each bucket tracks frequency (fraction of tuples it contains),
+ * information about the inequalities, number of distinct values in
+ * each dimension (which is used when building the histogram) etc.
+ *
+ * The boundaries may be either inclusive or exclusive, or the whole
+ * dimension may be NULL.
+ *
+ * The buckets may overlap (assuming the build algorithm keeps the
+ * frequencies additive) or may not cover the whole space (i.e. allow
+ * gaps). This entirely depends on the algorithm used to build the
+ * histogram.
+ *
+ * The histograms are marked with a 'magic' constant, mostly to make
+ * sure the bytea really is a histogram in serialized form.
+ *
+ * We do expect to support multiple histogram types, with different
+ * features etc. The 'type' field is used to identify those types.
+ * Technically some histogram types might use completely different
+ * bucket representation, but that's not expected at the moment.
+ *
+ * TODO Add pointer to 'private' data, meant for private data for
+ * other algorithms for building the histogram.
+ *
+ * TODO The current implementation does not handle NULL values (it's
+ * somehow prepared for that, but the algorithm building the
+ * histogram ignores them). The idea is to build buckets with one
+ * or more NULL-only dimensions - there'll be at most 2^ndimensions
+ * such buckets, which for 8 attributes (current limit) is 256.
+ * That's quite reasonable, considering we expect thousands of
+ * buckets in total.
+ *
+ * TODO This structure is used both when building the histogram, and
+ * then when using it to compute estimates. That's why the last
+ * few elements are not used once the histogram is built.
+ *
+ * TODO The limit on number of buckets is quite arbitrary, aiming for
+ * sufficient accuracy while still being fast. Probably should be
+ * replaced with a dynamic limit dependent on statistics target,
+ * number of attributes (dimensions) and statistics target
+ * associated with the attributes. Also, this needs to be related
+ * to the number of sampled rows, by either clamping it to a
+ * reasonable number (after seeing the number of rows) or using
+ * it when computing the number of rows to sample. Something like
+ * 10 rows per bucket seems reasonable.
+ *
+ * TODO We may replace the bool arrays with a suitably large data type
+ * (say, uint16 or uint32) and get rid of the allocations. It's
+ * unlikely we'll ever support more than 32 columns as that'd
+ * result in poor precision, huge histograms (splitting each
+ * dimension once would mean 2^32 buckets), and very expensive
+ * estimation. MCVItem already does it this way.
+ *
+ * TODO Actually the distinct stats (both for combination of all columns
+ * and for combinations of various subsets of columns) should be
+ * moved to a separate structure (next to histogram/MCV/...) to
+ * make it useful even without a histogram computed etc.
+ */
+typedef struct MVBucketData {
+
+ /* Frequencies of this bucket. */
+ float ntuples; /* frequency of tuples */
+ float ndistinct; /* frequency of distinct values */
+
+ /*
+ * Number of distinct values in each dimension. This is used when
+ * building the histogram (and is not serialized/deserialized), but
+ * it could be useful for estimating ndistinct for combinations of
+ * columns.
+ *
+ * It would mean tracking 2^N values for each bucket, and even if
+ * those values might be stored in 1B, it's still a lot of space
+ * (considering the expected number of buckets).
+ *
+ * TODO Consider tracking ndistincts for all attribute combinations.
+ */
+ uint32 *ndistincts;
+
+ /*
+ * Information about dimensions being NULL-only. Not yet used.
+ */
+ bool *nullsonly;
+
+ /* lower boundaries - values and information about the inequalities */
+ Datum *min;
+ bool *min_inclusive;
+
+ /* upper boundaries - values and information about the inequalities */
+ Datum *max;
+ bool *max_inclusive;
+
+ /*
+ * Sample tuples falling into this bucket, index of the dimension
+ * the bucket was split by in the last step.
+ *
+ * XXX These fields are needed only while building the histogram,
+ * and are not serialized at all.
+ */
+ HeapTuple *rows;
+ uint32 numrows;
+ int last_split_dimension;
+
+} MVBucketData;
+
+typedef MVBucketData *MVBucket;
+
+
+typedef struct MVHistogramData {
+
+ uint32 magic; /* magic constant marker */
+ uint32 type; /* type of histogram (BASIC) */
+ uint32 nbuckets; /* number of buckets (buckets array) */
+ uint32 ndimensions; /* number of dimensions */
+
+ MVBucket *buckets; /* array of buckets */
+
+} MVHistogramData;
+
+typedef MVHistogramData *MVHistogram;
+
+
+/* used to flag stats serialized to bytea */
+#define MVHIST_MAGIC 0x7F8C5670 /* marks serialized bytea */
+#define MVHIST_TYPE_BASIC 1 /* basic histogram type */
+
+/* limits (mostly sanity check, may be relaxed in the future) */
+#define MVHIST_MAX_BUCKETS 16384 /* max number of buckets */
+
+/* bucket size in a serialized form */
+#define BUCKET_SIZE_SERIALIZED(ndims) \
+ (offsetof(MVBucketData, ndistincts) + \
+ (ndims) * (2 * sizeof(uint16) + sizeof(uint32) + 3 * sizeof(bool)))
+
+
+/*
+ * Multivariate MCV (most-common value) lists
+ *
+ * A straightforward extension of MCV items - i.e. a list (array) of
+ * combinations of attribute values, together with a frequency and
+ * null flags.
+ *
+ * This already uses the trick with using uint32 as a null bitmap.
+ *
+ * TODO Shouldn't the MCVItemData use a plain pointer for values, instead
+ * of the single-item array trick?
+ *
+ * TODO It's possible to build a special case of MCV list, storing not
+ * the actual values but only 32/64-bit hash. This is only useful
+ * for estimating equality clauses and for large varlena types.
+ */
+typedef struct MCVItemData {
+ double frequency; /* frequency of this combination */
+ uint32 nulls; /* flags of NULL values (up to 32 columns) */
+ Datum values[1]; /* variable-length (ndimensions) */
+} MCVItemData;
+
+typedef MCVItemData *MCVItem;
+
+/* multivariate MCV list - essentially an array of MCV items */
+typedef struct MCVListData {
+ uint32 magic; /* magic constant marker */
+ uint32 type; /* type of MCV list (BASIC) */
+ uint32 ndimensions; /* number of dimensions */
+ uint32 nitems; /* number of MCV items in the array */
+ MCVItem *items; /* array of MCV items */
+} MCVListData;
+
+typedef MCVListData *MCVList;
+
+/* used to flag stats serialized to bytea */
+#define MVSTAT_MCV_MAGIC 0xE1A651C2 /* marks serialized bytea */
+#define MVSTAT_MCV_TYPE_BASIC 1 /* basic MCV list type */
+
+/* TODO consider increasing the limit, and/or using statistics target */
+#define MVSTAT_MCVLIST_MAX_ITEMS 1024 /* max items in MCV list */
+
+
+/*
+ * Basic info about the stats, used when choosing what to use
+ *
+ * TODO Add info about which statistics are available (histogram, MCV,
+ * hashed MCV, associative rules).
+ */
+typedef struct MVStatsData {
+ Oid mvoid; /* OID of the stats in pg_mv_statistic */
+ int2vector *stakeys; /* attnums for columns in the stats */
+ bool hist_built; /* histogram is already available */
+ bool mcv_built; /* MCV list is already available */
+ bool assoc_built; /* associative rules available */
+} MVStatsData;
+
+typedef struct MVStatsData *MVStats;
+
+
+/*
+ * Degree of how much MCV item / histogram bucket matches a clause.
+ * This is then considered when computing the selectivity.
+ */
+#define MVSTATS_MATCH_NONE 0 /* no match at all */
+#define MVSTATS_MATCH_PARTIAL 1 /* partial match */
+#define MVSTATS_MATCH_FULL 2 /* full match */
+
+
+#define MVSTATS_MAX_DIMENSIONS 8 /* max number of attributes */
+
+/*
+ * TODO Maybe fetching the histogram/MCV list separately is inefficient?
+ * Consider adding a single `fetch_stats` method, fetching all
+ * stats specified using flags (or something like that).
+ */
+MVStats list_mv_stats(Oid relid, int *nstats, bool built_only);
+bytea * fetch_mv_histogram(Oid mvoid);
+bytea * fetch_mv_mcvlist(Oid mvoid);
+
+/* deserialization of stats (serialization is private to analyze) */
+MVHistogram deserialize_mv_histogram(bytea * data);
+MCVList deserialize_mv_mcvlist(bytea * data);
+
+/*
+ * Returns index of the attribute number within the vector (i.e. a
+ * dimension within the stats).
+ */
+int mv_get_index(AttrNumber varattno, int2vector * stakeys);
+
+/* FIXME this probably belongs somewhere else (not to operations stats) */
+extern Datum pg_mv_stats_histogram_info(PG_FUNCTION_ARGS);
+extern Datum pg_mv_stats_histogram_gnuplot(PG_FUNCTION_ARGS);
+extern Datum pg_mv_stats_mvclist_info(PG_FUNCTION_ARGS);
+
+#endif
diff --git a/src/include/utils/syscache.h b/src/include/utils/syscache.h
index f97229f..a275bd5 100644
--- a/src/include/utils/syscache.h
+++ b/src/include/utils/syscache.h
@@ -66,6 +66,7 @@ enum SysCacheIdentifier
INDEXRELID,
LANGNAME,
LANGOID,
+ MVSTATOID,
NAMESPACENAME,
NAMESPACEOID,
OPERNAMENSP,
diff --git a/src/test/regress/regression.diffs b/src/test/regress/regression.diffs
new file mode 100644
index 0000000..179c09d
--- /dev/null
+++ b/src/test/regress/regression.diffs
@@ -0,0 +1,294 @@
+*** /home/tomas/work/postgres/src/test/regress/expected/updatable_views.out 2014-10-29 00:22:04.820171312 +0100
+--- /home/tomas/work/postgres/src/test/regress/results/updatable_views.out 2014-11-10 02:54:44.083052362 +0100
+***************
+*** 657,668 ****
+ FROM information_schema.views
+ WHERE table_name LIKE 'rw_view%'
+ ORDER BY table_name;
+! table_name | is_updatable | is_insertable_into | is_trigger_updatable | is_trigger_deletable | is_trigger_insertable_into
+! ------------+--------------+--------------------+----------------------+----------------------+----------------------------
+! rw_view1 | NO | NO | NO | NO | NO
+! rw_view2 | NO | NO | NO | NO | NO
+! (2 rows)
+!
+ SELECT table_name, column_name, is_updatable
+ FROM information_schema.columns
+ WHERE table_name LIKE 'rw_view%'
+--- 657,663 ----
+ FROM information_schema.views
+ WHERE table_name LIKE 'rw_view%'
+ ORDER BY table_name;
+! ERROR: no relation entry for relid 1880
+ SELECT table_name, column_name, is_updatable
+ FROM information_schema.columns
+ WHERE table_name LIKE 'rw_view%'
+***************
+*** 710,721 ****
+ FROM information_schema.views
+ WHERE table_name LIKE 'rw_view%'
+ ORDER BY table_name;
+! table_name | is_updatable | is_insertable_into | is_trigger_updatable | is_trigger_deletable | is_trigger_insertable_into
+! ------------+--------------+--------------------+----------------------+----------------------+----------------------------
+! rw_view1 | NO | NO | NO | NO | YES
+! rw_view2 | NO | NO | NO | NO | NO
+! (2 rows)
+!
+ SELECT table_name, column_name, is_updatable
+ FROM information_schema.columns
+ WHERE table_name LIKE 'rw_view%'
+--- 705,711 ----
+ FROM information_schema.views
+ WHERE table_name LIKE 'rw_view%'
+ ORDER BY table_name;
+! ERROR: no relation entry for relid 1880
+ SELECT table_name, column_name, is_updatable
+ FROM information_schema.columns
+ WHERE table_name LIKE 'rw_view%'
+***************
+*** 746,757 ****
+ FROM information_schema.views
+ WHERE table_name LIKE 'rw_view%'
+ ORDER BY table_name;
+! table_name | is_updatable | is_insertable_into | is_trigger_updatable | is_trigger_deletable | is_trigger_insertable_into
+! ------------+--------------+--------------------+----------------------+----------------------+----------------------------
+! rw_view1 | NO | NO | YES | NO | YES
+! rw_view2 | NO | NO | NO | NO | NO
+! (2 rows)
+!
+ SELECT table_name, column_name, is_updatable
+ FROM information_schema.columns
+ WHERE table_name LIKE 'rw_view%'
+--- 736,742 ----
+ FROM information_schema.views
+ WHERE table_name LIKE 'rw_view%'
+ ORDER BY table_name;
+! ERROR: no relation entry for relid 1880
+ SELECT table_name, column_name, is_updatable
+ FROM information_schema.columns
+ WHERE table_name LIKE 'rw_view%'
+***************
+*** 782,793 ****
+ FROM information_schema.views
+ WHERE table_name LIKE 'rw_view%'
+ ORDER BY table_name;
+! table_name | is_updatable | is_insertable_into | is_trigger_updatable | is_trigger_deletable | is_trigger_insertable_into
+! ------------+--------------+--------------------+----------------------+----------------------+----------------------------
+! rw_view1 | NO | NO | YES | YES | YES
+! rw_view2 | NO | NO | NO | NO | NO
+! (2 rows)
+!
+ SELECT table_name, column_name, is_updatable
+ FROM information_schema.columns
+ WHERE table_name LIKE 'rw_view%'
+--- 767,773 ----
+ FROM information_schema.views
+ WHERE table_name LIKE 'rw_view%'
+ ORDER BY table_name;
+! ERROR: no relation entry for relid 1880
+ SELECT table_name, column_name, is_updatable
+ FROM information_schema.columns
+ WHERE table_name LIKE 'rw_view%'
+***************
+*** 1385,1398 ****
+ Options: check_option=local
+
+ SELECT * FROM information_schema.views WHERE table_name = 'rw_view1';
+! table_catalog | table_schema | table_name | view_definition | check_option | is_updatable | is_insertable_into | is_trigger_updatable | is_trigger_deletable | is_trigger_insertable_into
+! ---------------+--------------+------------+------------------------------------+--------------+--------------+--------------------+----------------------+----------------------+----------------------------
+! regression | public | rw_view1 | SELECT base_tbl.a, +| LOCAL | YES | YES | NO | NO | NO
+! | | | base_tbl.b +| | | | | |
+! | | | FROM base_tbl +| | | | | |
+! | | | WHERE (base_tbl.a < base_tbl.b); | | | | | |
+! (1 row)
+!
+ INSERT INTO rw_view1 VALUES(3,4); -- ok
+ INSERT INTO rw_view1 VALUES(4,3); -- should fail
+ ERROR: new row violates WITH CHECK OPTION for "rw_view1"
+--- 1365,1371 ----
+ Options: check_option=local
+
+ SELECT * FROM information_schema.views WHERE table_name = 'rw_view1';
+! ERROR: no relation entry for relid 1880
+ INSERT INTO rw_view1 VALUES(3,4); -- ok
+ INSERT INTO rw_view1 VALUES(4,3); -- should fail
+ ERROR: new row violates WITH CHECK OPTION for "rw_view1"
+***************
+*** 1437,1449 ****
+ Options: check_option=cascaded
+
+ SELECT * FROM information_schema.views WHERE table_name = 'rw_view2';
+! table_catalog | table_schema | table_name | view_definition | check_option | is_updatable | is_insertable_into | is_trigger_updatable | is_trigger_deletable | is_trigger_insertable_into
+! ---------------+--------------+------------+----------------------------+--------------+--------------+--------------------+----------------------+----------------------+----------------------------
+! regression | public | rw_view2 | SELECT rw_view1.a +| CASCADED | YES | YES | NO | NO | NO
+! | | | FROM rw_view1 +| | | | | |
+! | | | WHERE (rw_view1.a < 10); | | | | | |
+! (1 row)
+!
+ INSERT INTO rw_view2 VALUES (-5); -- should fail
+ ERROR: new row violates WITH CHECK OPTION for "rw_view1"
+ DETAIL: Failing row contains (-5).
+--- 1410,1416 ----
+ Options: check_option=cascaded
+
+ SELECT * FROM information_schema.views WHERE table_name = 'rw_view2';
+! ERROR: no relation entry for relid 1880
+ INSERT INTO rw_view2 VALUES (-5); -- should fail
+ ERROR: new row violates WITH CHECK OPTION for "rw_view1"
+ DETAIL: Failing row contains (-5).
+***************
+*** 1477,1489 ****
+ Options: check_option=local
+
+ SELECT * FROM information_schema.views WHERE table_name = 'rw_view2';
+! table_catalog | table_schema | table_name | view_definition | check_option | is_updatable | is_insertable_into | is_trigger_updatable | is_trigger_deletable | is_trigger_insertable_into
+! ---------------+--------------+------------+----------------------------+--------------+--------------+--------------------+----------------------+----------------------+----------------------------
+! regression | public | rw_view2 | SELECT rw_view1.a +| LOCAL | YES | YES | NO | NO | NO
+! | | | FROM rw_view1 +| | | | | |
+! | | | WHERE (rw_view1.a < 10); | | | | | |
+! (1 row)
+!
+ INSERT INTO rw_view2 VALUES (-10); -- ok, but not in view
+ INSERT INTO rw_view2 VALUES (20); -- should fail
+ ERROR: new row violates WITH CHECK OPTION for "rw_view2"
+--- 1444,1450 ----
+ Options: check_option=local
+
+ SELECT * FROM information_schema.views WHERE table_name = 'rw_view2';
+! ERROR: no relation entry for relid 1880
+ INSERT INTO rw_view2 VALUES (-10); -- ok, but not in view
+ INSERT INTO rw_view2 VALUES (20); -- should fail
+ ERROR: new row violates WITH CHECK OPTION for "rw_view2"
+***************
+*** 1517,1529 ****
+ WHERE rw_view1.a < 10;
+
+ SELECT * FROM information_schema.views WHERE table_name = 'rw_view2';
+! table_catalog | table_schema | table_name | view_definition | check_option | is_updatable | is_insertable_into | is_trigger_updatable | is_trigger_deletable | is_trigger_insertable_into
+! ---------------+--------------+------------+----------------------------+--------------+--------------+--------------------+----------------------+----------------------+----------------------------
+! regression | public | rw_view2 | SELECT rw_view1.a +| NONE | YES | YES | NO | NO | NO
+! | | | FROM rw_view1 +| | | | | |
+! | | | WHERE (rw_view1.a < 10); | | | | | |
+! (1 row)
+!
+ INSERT INTO rw_view2 VALUES (30); -- ok, but not in view
+ SELECT * FROM base_tbl;
+ a
+--- 1478,1484 ----
+ WHERE rw_view1.a < 10;
+
+ SELECT * FROM information_schema.views WHERE table_name = 'rw_view2';
+! ERROR: no relation entry for relid 1880
+ INSERT INTO rw_view2 VALUES (30); -- ok, but not in view
+ SELECT * FROM base_tbl;
+ a
+***************
+*** 1543,1559 ****
+ CREATE VIEW rw_view2 AS SELECT * FROM rw_view1 WHERE a > 0;
+ CREATE VIEW rw_view3 AS SELECT * FROM rw_view2 WITH CHECK OPTION;
+ SELECT * FROM information_schema.views WHERE table_name LIKE E'rw\\_view_' ORDER BY table_name;
+! table_catalog | table_schema | table_name | view_definition | check_option | is_updatable | is_insertable_into | is_trigger_updatable | is_trigger_deletable | is_trigger_insertable_into
+! ---------------+--------------+------------+---------------------------+--------------+--------------+--------------------+----------------------+----------------------+----------------------------
+! regression | public | rw_view1 | SELECT base_tbl.a +| CASCADED | YES | YES | NO | NO | NO
+! | | | FROM base_tbl; | | | | | |
+! regression | public | rw_view2 | SELECT rw_view1.a +| NONE | YES | YES | NO | NO | NO
+! | | | FROM rw_view1 +| | | | | |
+! | | | WHERE (rw_view1.a > 0); | | | | | |
+! regression | public | rw_view3 | SELECT rw_view2.a +| CASCADED | YES | YES | NO | NO | NO
+! | | | FROM rw_view2; | | | | | |
+! (3 rows)
+!
+ INSERT INTO rw_view1 VALUES (-1); -- ok
+ INSERT INTO rw_view1 VALUES (1); -- ok
+ INSERT INTO rw_view2 VALUES (-2); -- ok, but not in view
+--- 1498,1504 ----
+ CREATE VIEW rw_view2 AS SELECT * FROM rw_view1 WHERE a > 0;
+ CREATE VIEW rw_view3 AS SELECT * FROM rw_view2 WITH CHECK OPTION;
+ SELECT * FROM information_schema.views WHERE table_name LIKE E'rw\\_view_' ORDER BY table_name;
+! ERROR: no relation entry for relid 1880
+ INSERT INTO rw_view1 VALUES (-1); -- ok
+ INSERT INTO rw_view1 VALUES (1); -- ok
+ INSERT INTO rw_view2 VALUES (-2); -- ok, but not in view
+
+======================================================================
+
+*** /home/tomas/work/postgres/src/test/regress/expected/sanity_check.out 2014-10-29 00:22:04.812171313 +0100
+--- /home/tomas/work/postgres/src/test/regress/results/sanity_check.out 2014-11-10 02:54:44.150052357 +0100
+***************
+*** 113,118 ****
+--- 113,119 ----
+ pg_language|t
+ pg_largeobject|t
+ pg_largeobject_metadata|t
++ pg_mv_statistic|t
+ pg_namespace|t
+ pg_opclass|t
+ pg_operator|t
+
+======================================================================
+
+*** /home/tomas/work/postgres/src/test/regress/expected/rowsecurity.out 2014-10-29 00:22:04.811171313 +0100
+--- /home/tomas/work/postgres/src/test/regress/results/rowsecurity.out 2014-11-10 02:54:45.775052238 +0100
+***************
+*** 901,925 ****
+ -- prepared statement with rls_regress_user0 privilege
+ PREPARE p1(int) AS SELECT * FROM t1 WHERE a <= $1;
+ EXECUTE p1(2);
+! a | b
+! ---+-----
+! 2 | bbb
+! 2 | bcd
+! 2 | yyy
+! (3 rows)
+!
+ EXPLAIN (COSTS OFF) EXECUTE p1(2);
+! QUERY PLAN
+! ----------------------------------------------
+! Append
+! -> Seq Scan on t1
+! Filter: ((a <= 2) AND ((a % 2) = 0))
+! -> Seq Scan on t2
+! Filter: ((a <= 2) AND ((a % 2) = 0))
+! -> Seq Scan on t3
+! Filter: ((a <= 2) AND ((a % 2) = 0))
+! (7 rows)
+!
+ -- superuser is allowed to bypass RLS checks
+ RESET SESSION AUTHORIZATION;
+ SET row_security TO OFF;
+--- 901,909 ----
+ -- prepared statement with rls_regress_user0 privilege
+ PREPARE p1(int) AS SELECT * FROM t1 WHERE a <= $1;
+ EXECUTE p1(2);
+! ERROR: no relation entry for relid 530
+ EXPLAIN (COSTS OFF) EXECUTE p1(2);
+! ERROR: no relation entry for relid 530
+ -- superuser is allowed to bypass RLS checks
+ RESET SESSION AUTHORIZATION;
+ SET row_security TO OFF;
+
+======================================================================
+
+*** /home/tomas/work/postgres/src/test/regress/expected/rules.out 2014-10-29 00:22:04.812171313 +0100
+--- /home/tomas/work/postgres/src/test/regress/results/rules.out 2014-11-10 02:54:48.329052050 +0100
+***************
+*** 1353,1358 ****
+--- 1353,1368 ----
+ LEFT JOIN pg_namespace n ON ((n.oid = c.relnamespace)))
+ LEFT JOIN pg_tablespace t ON ((t.oid = c.reltablespace)))
+ WHERE (c.relkind = 'm'::"char");
++ pg_mv_stats| SELECT n.nspname AS schemaname,
++ c.relname AS tablename,
++ s.stakeys AS attnums,
++ length(s.stamcv) AS mcvbytes,
++ pg_mv_stats_mvclist_info(s.stamcv) AS mcvinfo,
++ length(s.stahist) AS histbytes,
++ pg_mv_stats_histogram_info(s.stahist) AS histinfo
++ FROM ((pg_mv_statistic s
++ JOIN pg_class c ON ((c.oid = s.starelid)))
++ LEFT JOIN pg_namespace n ON ((n.oid = c.relnamespace)));
+ pg_policies| SELECT n.nspname AS schemaname,
+ c.relname AS tablename,
+ rs.rsecpolname AS policyname,
+
+======================================================================
+
diff --git a/src/test/regress/regression.out b/src/test/regress/regression.out
new file mode 100644
index 0000000..48a4a25
--- /dev/null
+++ b/src/test/regress/regression.out
@@ -0,0 +1,147 @@
+test tablespace ... ok
+test boolean ... ok
+test char ... ok
+test name ... ok
+test varchar ... ok
+test text ... ok
+test int2 ... ok
+test int4 ... ok
+test int8 ... ok
+test oid ... ok
+test float4 ... ok
+test float8 ... ok
+test bit ... ok
+test numeric ... ok
+test txid ... ok
+test uuid ... ok
+test enum ... ok
+test money ... ok
+test rangetypes ... ok
+test pg_lsn ... ok
+test regproc ... ok
+test strings ... ok
+test numerology ... ok
+test point ... ok
+test lseg ... ok
+test line ... ok
+test box ... ok
+test path ... ok
+test polygon ... ok
+test circle ... ok
+test date ... ok
+test time ... ok
+test timetz ... ok
+test timestamp ... ok
+test timestamptz ... ok
+test interval ... ok
+test abstime ... ok
+test reltime ... ok
+test tinterval ... ok
+test inet ... ok
+test macaddr ... ok
+test tstypes ... ok
+test comments ... ok
+test geometry ... ok
+test horology ... ok
+test regex ... ok
+test oidjoins ... ok
+test type_sanity ... ok
+test opr_sanity ... ok
+test insert ... ok
+test create_function_1 ... ok
+test create_type ... ok
+test create_table ... ok
+test create_function_2 ... ok
+test copy ... ok
+test copyselect ... ok
+test create_misc ... ok
+test create_operator ... ok
+test create_index ... ok
+test create_view ... ok
+test create_aggregate ... ok
+test create_function_3 ... ok
+test create_cast ... ok
+test constraints ... ok
+test triggers ... ok
+test inherit ... ok
+test create_table_like ... ok
+test typed_table ... ok
+test vacuum ... ok
+test drop_if_exists ... ok
+test updatable_views ... FAILED
+test sanity_check ... FAILED
+test errors ... ok
+test select ... ok
+test select_into ... ok
+test select_distinct ... ok
+test select_distinct_on ... ok
+test select_implicit ... ok
+test select_having ... ok
+test subselect ... ok
+test union ... ok
+test case ... ok
+test join ... ok
+test aggregates ... ok
+test transactions ... ok
+test random ... ok
+test portals ... ok
+test arrays ... ok
+test btree_index ... ok
+test hash_index ... ok
+test update ... ok
+test delete ... ok
+test namespace ... ok
+test prepared_xacts ... ok
+test privileges ... ok
+test security_label ... ok
+test collate ... ok
+test matview ... ok
+test lock ... ok
+test replica_identity ... ok
+test rowsecurity ... FAILED
+test alter_generic ... ok
+test brin ... ok
+test misc ... ok
+test psql ... ok
+test async ... ok
+test rules ... FAILED
+test event_trigger ... ok
+test select_views ... ok
+test portals_p2 ... ok
+test foreign_key ... ok
+test cluster ... ok
+test dependency ... ok
+test guc ... ok
+test bitmapops ... ok
+test combocid ... ok
+test tsearch ... ok
+test tsdicts ... ok
+test foreign_data ... ok
+test window ... ok
+test xmlmap ... ok
+test functional_deps ... ok
+test advisory_lock ... ok
+test json ... ok
+test jsonb ... ok
+test indirect_toast ... ok
+test equivclass ... ok
+test plancache ... ok
+test limit ... ok
+test plpgsql ... ok
+test copy2 ... ok
+test temp ... ok
+test domain ... ok
+test rangefuncs ... ok
+test prepare ... ok
+test without_oid ... ok
+test conversion ... ok
+test truncate ... ok
+test alter_table ... ok
+test sequence ... ok
+test polymorphism ... ok
+test rowtypes ... ok
+test returning ... ok
+test largeobject ... ok
+test with ... ok
+test xml ... ok
+test stats ... ok
On 12 October 2014 23:00, Tomas Vondra <tv@fuzzy.cz> wrote:
It however seems to be working sufficiently well at this point, enough
to get some useful feedback. So here we go.
This looks interesting and useful.
What I'd like to check before a detailed review is that this has
sufficient applicability to be useful.
My understanding is that Q9 and Q18 of TPC-H have poor plans as a
result of multi-column stats errors.
Could you look at those queries and confirm that this patch can
produce better plans for them?
If so, I will work with you to review this patch.
One aspect of the patch that seems to be missing is a user declaration
of correlation, just as we have for setting n_distinct. It seems like
an even easier place to start to just let the user specify the stats
declaratively. That way we can split the patch into two parts. First,
allow multi column stats that are user declared. Then add user stats
collected by ANALYZE. The first part is possibly contentious and thus
a good initial focus. The second part will have lots of discussion, so
good to skip for a first version.
--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On 13 November 2014 12:31, Simon Riggs wrote:
On 12 October 2014 23:00, Tomas Vondra <tv@fuzzy.cz> wrote:
It however seems to be working sufficiently well at this point, enough
to get some useful feedback. So here we go.
This looks interesting and useful.
What I'd like to check before a detailed review is that this has
sufficient applicability to be useful.
My understanding is that Q9 and Q18 of TPC-H have poor plans as a
result of multi-column stats errors.
Could you look at those queries and confirm that this patch can
produce better plans for them?
Sure. I planned to do such verification/demonstration anyway, after
discussing the overall approach.
I planned to give it a try on TPC-DS, but I can start with the TPC-H
queries you propose. I'm not sure whether the poor estimates in Q9 & Q18
come from column correlation though - if it's due to some other issues
(e.g. conditions that are difficult to estimate), this patch can't do
anything with them. But it's a good start.
If so, I will work with you to review this patch.
Thanks!
One aspect of the patch that seems to be missing is a user declaration
of correlation, just as we have for setting n_distinct. It seems like
an even easier place to start to just let the user specify the stats
declaratively. That way we can split the patch into two parts. First,
allow multi column stats that are user declared. Then add user stats
collected by ANALYZE. The first part is possibly contentious and thus
a good initial focus. The second part will have lots of discussion, so
good to skip for a first version.
I'm not a big fan of this approach, for a number of reasons.
Firstly, it only works for "simple" parameters that are trivial to specify
(say, Pearson's correlation coefficient), and the patch does not work with
those at all - it only works with histograms, MCV lists (and might work
with associative rules in the future). And we certainly can't ask users to
specify multivariate histograms - because it's very difficult to do, and
also because complex stats are more susceptible to getting stale after adding
new data to the table.
Secondly, even if we add such "simple" parameters to the patch, we have to
come up with a way to apply those parameters to the estimates. The
problem is that as the parameters get simpler, it's less and less useful
to compute the stats.
Another question is whether it should support more than 2 columns ...
The only place where I think this might work are the associative rules.
It's simple to specify rules like ("ZIP code" implies "city") and we could
even do some simple check against the data to see if it actually makes
sense (and 'disable' the rule if not).
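For example, a declared ("ZIP code" implies "city") rule could be sanity-checked with a query like this (hypothetical table/column names, of course):

SELECT zip, count(DISTINCT city) AS cities
FROM addresses
GROUP BY zip
HAVING count(DISTINCT city) > 1;

If that returns more than a tiny fraction of the ZIP codes, the rule is clearly bogus and we'd 'disable' it.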
But maybe I got it wrong and you have something particular in mind? Can
you give an example of how it would work?
regards
Tomas
On 13.11.2014 14:11, Tomas Vondra wrote:
Dne 13 Listopad 2014, 12:31, Simon Riggs napsal(a):
On 12 October 2014 23:00, Tomas Vondra <tv@fuzzy.cz> wrote:
It however seems to be working sufficiently well at this point, enough
to get some useful feedback. So here we go.
This looks interesting and useful.
What I'd like to check before a detailed review is that this has
sufficient applicability to be useful.
My understanding is that Q9 and Q18 of TPC-H have poor plans as a
result of multi-column stats errors.
Could you look at those queries and confirm that this patch can
produce better plans for them?
Sure. I planned to do such verification/demonstration anyway, after
discussing the overall approach.
I planned to give it a try on TPC-DS, but I can start with the TPC-H
queries you propose. I'm not sure whether the poor estimates in Q9 & Q18
come from column correlation though - if it's due to some other issues
(e.g. conditions that are difficult to estimate), this patch can't do
anything with them. But it's a good start.
If so, I will work with you to review this patch.
Thanks!
One aspect of the patch that seems to be missing is a user declaration
of correlation, just as we have for setting n_distinct. It seems like
an even easier place to start to just let the user specify the stats
declaratively. That way we can split the patch into two parts. First,
allow multi column stats that are user declared. Then add user stats
collected by ANALYZE. The first part is possibly contentious and thus
a good initial focus. The second part will have lots of discussion, so
good to skip for a first version.
I'm not a big fan of this approach, for a number of reasons.
Firstly, it only works for "simple" parameters that are trivial to specify
(say, Pearson's correlation coefficient), and the patch does not work with
those at all - it only works with histograms, MCV lists (and might work
with associative rules in the future). And we certainly can't ask users to
specify multivariate histograms - because it's very difficult to do, and
also because complex stats are more susceptible to getting stale after adding
new data to the table.
Secondly, even if we add such "simple" parameters to the patch, we have to
come up with a way to apply those parameters to the estimates. The
problem is that as the parameters get simpler, it's less and less useful
to compute the stats.
Another question is whether it should support more than 2 columns ...
The only place where I think this might work are the associative rules.
It's simple to specify rules like ("ZIP code" implies "city") and we could
even do some simple check against the data to see if it actually makes
sense (and 'disable' the rule if not).
and even this simple example has its limits - at least in Germany, ZIP
codes are not unique in rural areas, where several villages share the
same ZIP code.
I guess there are just a few examples where columns are completely
functionally dependent without any exceptions.
But of course, if the user gives this information just for optimizing
the statistics, some exceptions don't matter.
If this information should be used for creating different execution
plans (e.g. if there is an index on column A and column B is functionally
dependent on it, one could think about using this index on A and the
dependency instead of running through the whole table to find all tuples
that fit the query on column B), exceptions are a very important issue.
But maybe I got it wrong and you have something particular in mind? Can
you give an example of how it would work?regards
Tomas
--
Dipl.-Math. Katharina Büchse
Friedrich-Schiller-Universität Jena
Institut für Informatik
Lehrstuhl für Datenbanken und Informationssysteme
Ernst-Abbe-Platz 2
07743 Jena
Telefon 03641/946367
Webseite http://users.minet.uni-jena.de/~re89qen/
On 13 November 2014 16:51, Katharina Büchse wrote:
On 13.11.2014 14:11, Tomas Vondra wrote:
The only place where I think this might work are the associative rules.
It's simple to specify rules like ("ZIP code" implies "city") and we
could
even do some simple check against the data to see if it actually makes
sense (and 'disable' the rule if not).
and even this simple example has its limits - at least in Germany, ZIP
codes are not unique in rural areas, where several villages share the
same ZIP code.
I guess there are just a few examples where columns are completely
functionally dependent without any exceptions.
But of course, if the user gives this information just for optimizing
the statistics, some exceptions don't matter.
If this information should be used for creating different execution
plans (e.g. if there is an index on column A and column B is functionally
dependent on it, one could think about using this index on A and the
dependency instead of running through the whole table to find all tuples
that fit the query on column B), exceptions are a very important issue.
Yes, exactly. The aim of this patch is "only" improving estimates, not
removing conditions from the plan (e.g. checking only the ZIP code and not
the city name). That certainly can't be done solely based on approximate
statistics, and as you point out most real-world data either contain bugs
or are inherently imperfect (we have the same kind of ZIP/city
inconsistencies in Czech). That's not a big issue for estimates (assuming
only a small fraction of rows violates the rule) though.
Tomas
Tomas Vondra <tv@fuzzy.cz> wrote:
On 13 November 2014 16:51, Katharina Büchse wrote:
On 13.11.2014 14:11, Tomas Vondra wrote:
The only place where I think this might work are the associative rules.
It's simple to specify rules like ("ZIP code" implies "city") and we could
even do some simple check against the data to see if it actually makes
sense (and 'disable' the rule if not).and even this simple example has its limits, at least in Germany ZIP
codes are not unique for rural areas, where several villages have the
same ZIP code.
as you point out most real-world data either contain bugs
or are inherently imperfect (we have the same kind of ZIP/city
inconsistencies in Czech).
You can have lots of fun with U.S. zip codes, too. Just on the
nominally "Madison, Wisconsin" zip codes (those starting with 537),
there are several exceptions:
select zipcode, city, locationtype
from zipcode
where zipcode like '537%'
and Decommisioned = 'false'
and zipcodetype = 'STANDARD'
and locationtype in ('PRIMARY', 'ACCEPTABLE')
order by zipcode, city;
zipcode | city | locationtype
---------+-----------+--------------
53703 | MADISON | PRIMARY
53704 | MADISON | PRIMARY
53705 | MADISON | PRIMARY
53706 | MADISON | PRIMARY
53711 | FITCHBURG | ACCEPTABLE
53711 | MADISON | PRIMARY
53713 | FITCHBURG | ACCEPTABLE
53713 | MADISON | PRIMARY
53713 | MONONA | ACCEPTABLE
53714 | MADISON | PRIMARY
53714 | MONONA | ACCEPTABLE
53715 | MADISON | PRIMARY
53716 | MADISON | PRIMARY
53716 | MONONA | ACCEPTABLE
53717 | MADISON | PRIMARY
53718 | MADISON | PRIMARY
53719 | FITCHBURG | ACCEPTABLE
53719 | MADISON | PRIMARY
53725 | MADISON | PRIMARY
53726 | MADISON | PRIMARY
53744 | MADISON | PRIMARY
(21 rows)
If you eliminate the quals besides the zipcode column you get 61
rows and it gets much stranger, with legal municipalities that are
completely surrounded by Madison that the postal service would
rather you didn't use in addressing your envelopes, but they have
to deliver to anyway, and organizations inside Madison receiving
enough mail to (literally) have their own zip code -- where the
postal service allows the organization name as a deliverable
"city".
If you want to have your own fun with this data, you can download
it here:
http://federalgovernmentzipcodes.us/free-zipcode-database.csv
I was able to load it into PostgreSQL with this:
create table zipcode
(
recordnumber integer not null,
zipcode text not null,
zipcodetype text not null,
city text not null,
state text not null,
locationtype text not null,
lat double precision,
long double precision,
xaxis double precision not null,
yaxis double precision not null,
zaxis double precision not null,
worldregion text not null,
country text not null,
locationtext text,
location text,
decommisioned text not null,
taxreturnsfiled bigint,
estimatedpopulation bigint,
totalwages bigint,
notes text
);
comment on column zipcode.zipcode is 'Zipcode or military postal code(FPO/APO)';
comment on column zipcode.zipcodetype is 'Standard, PO BOX Only, Unique, Military(implies APO or FPO)';
comment on column zipcode.city is 'official city name(s)';
comment on column zipcode.state is 'official state, territory, or quasi-state (AA, AE, AP) abbreviation code';
comment on column zipcode.locationtype is 'Primary, Acceptable, Not Acceptable';
comment on column zipcode.lat is 'Decimal Latitude, if available';
comment on column zipcode.long is 'Decimal Longitude, if available';
comment on column zipcode.location is 'Standard Display (eg Phoenix, AZ ; Pago Pago, AS ; Melbourne, AU )';
comment on column zipcode.decommisioned is 'If Primary location, Yes implies historical Zipcode, No Implies current Zipcode; If not Primary, Yes implies Historical Placename';
comment on column zipcode.taxreturnsfiled is 'Number of Individual Tax Returns Filed in 2008';
copy zipcode from 'filepath' with (format csv, header);
alter table zipcode add primary key (recordnumber);
create unique index zipcode_city on zipcode (zipcode, city);
I bet there are all sorts of correlation possibilities with, for
example, latitude and longitude and other variables. With 81831
rows and so many correlations among the columns, it might be a
useful data set to test with.
--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 15.11.2014 18:49, Kevin Grittner wrote:
If you eliminate the quals besides the zipcode column you get 61
rows and it gets much stranger, with legal municipalities that are
completely surrounded by Madison that the postal service would
rather you didn't use in addressing your envelopes, but they have
to deliver to anyway, and organizations inside Madison receiving
enough mail to (literally) have their own zip code -- where the
postal service allows the organization name as a deliverable
"city".
If you want to have your own fun with this data, you can download
it here:
http://federalgovernmentzipcodes.us/free-zipcode-database.csv
...
I bet there are all sorts of correlation possibilities with, for
example, latitude and longitude and other variables. With 81831
rows and so many correlations among the columns, it might be a
useful data set to test with.
Thanks for the link. I've been looking for a good dataset with such
data, and this one is by far the best one.
The current version of the patch supports only data types passed by
value (i.e. no varlena types - text, ...), which means it's impossible to
build multivariate stats on some of the interesting columns (state,
city, ...).
I guess it's time to start working on removing this limitation.
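In the meantime, the pass-by-value columns should already work - e.g. something like this (using the WIP syntax and the lat/long columns from your script):

ALTER TABLE zipcode ADD STATISTICS ON (lat, long);
ANALYZE zipcode;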
Tomas
On Sun, Nov 16, 2014 at 3:35 AM, Tomas Vondra <tv@fuzzy.cz> wrote:
Thanks for the link. I've been looking for a good dataset with such
data, and this one is by far the best one.
The current version of the patch supports only data types passed by
value (i.e. no varlena types - text, ...), which means it's impossible to
build multivariate stats on some of the interesting columns (state,
city, ...).
I guess it's time to start working on removing this limitation.
Tomas, what's your status on this patch? Are you planning to make it
more complicated than it is? For now I have switched it to a "Needs
Review" state because even your first version did not get advanced
review (that's quite big btw). I guess that we should switch it to the
next CF.
Regards,
--
Michael
On 8.12.2014 02:01, Michael Paquier wrote:
On Sun, Nov 16, 2014 at 3:35 AM, Tomas Vondra <tv@fuzzy.cz> wrote:
Thanks for the link. I've been looking for a good dataset with such
data, and this one is by far the best one.
The current version of the patch supports only data types passed by
value (i.e. no varlena types - text, ...), which means it's impossible to
build multivariate stats on some of the interesting columns (state,
city, ...).
I guess it's time to start working on removing this limitation.
Tomas, what's your status on this patch? Are you planning to make it
more complicated than it is? For now I have switched it to a "Needs
Review" state because even your first version did not get advanced
review (that's quite big btw). I guess that we should switch it to the
next CF.
Hello Michael,
I agree with moving the patch to the next CF - I'm working on the patch,
but I will take a bit more time to submit a new version and I can do
that in the next CF.
regards
Tomas
On 10/13/2014 01:00 AM, Tomas Vondra wrote:
Hi,
attached is a WIP patch implementing multivariate statistics.
Great! Really glad to see you working on this.
+ * FIXME This sample sizing is mostly OK when computing stats for
+ * individual columns, but when computing multi-variate stats
+ * for multivariate stats (histograms, mcv, ...) it's rather
+ * insufficient. For small number of dimensions it works, but
+ * for complex stats it'd be nice use sample proportional to
+ * the table (say, 0.5% - 1%) instead of a fixed size.
I don't think a fraction of the table is appropriate. As long as the
sample is random, the accuracy of a sample doesn't depend much on the
size of the population. For example, if you sample 1,000 rows from a
table with 100,000 rows, or 1000 rows from a table with 100,000,000
rows, the accuracy is pretty much the same. That doesn't change when you
go from a single variable to multiple variables.
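(That's the usual binomial sampling error - for an estimated fraction p from a sample of n rows it's roughly sqrt(p * (1 - p) / n), with no dependence on the population size at all.)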
You do need a bigger sample with multiple variables, however. My gut
feeling is that if you sample N rows for a single variable, with two
variables you need to sample N^2 rows to get the same accuracy. But it's
not proportional to the table size. (I have no proof for that, but I'm
sure there is literature on this.)
+ * Multivariate histograms
+ *
+ * Histograms are a collection of buckets, represented by n-dimensional
+ * rectangles. Each rectangle is delimited by an array of lower and
+ * upper boundaries, so that for the i-th attribute
+ *
+ * min[i] <= value[i] <= max[i]
+ *
+ * Each bucket tracks frequency (fraction of tuples it contains),
+ * information about the inequalities, number of distinct values in
+ * each dimension (which is used when building the histogram) etc.
+ *
+ * The boundaries may be either inclusive or exclusive, or the whole
+ * dimension may be NULL.
+ *
+ * The buckets may overlap (assuming the build algorithm keeps the
+ * frequencies additive) or may not cover the whole space (i.e. allow
+ * gaps). This entirely depends on the algorithm used to build the
+ * histogram.
That sounds pretty exotic. These buckets are quite different from the
single-dimension buckets we currently have.
The paper you reference in the partition_bucket() function, M.
Muralikrishna, David J. DeWitt: Equi-Depth Histograms For Estimating
Selectivity Factors For Multi-Dimensional Queries. SIGMOD Conference
1988: 28-36, actually doesn't mention overlapping buckets at all. I
haven't read the code in detail, but if it implements the algorithm from
that paper, there will be no overlap.
- Heikki
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 11.12.2014 17:53, Heikki Linnakangas wrote:
On 10/13/2014 01:00 AM, Tomas Vondra wrote:
Hi,
attached is a WIP patch implementing multivariate statistics.
Great! Really glad to see you working on this.
+ * FIXME This sample sizing is mostly OK when computing stats for
+ * individual columns, but when computing multivariate stats
+ * (histograms, MCV lists, ...) it's rather insufficient. For a
+ * small number of dimensions it works, but for complex stats
+ * it'd be nice to use a sample proportional to the table
+ * (say, 0.5% - 1%) instead of a fixed size.

I don't think a fraction of the table is appropriate. As long as the
sample is random, the accuracy of a sample doesn't depend much on
the size of the population. For example, if you sample 1,000 rows
from a table with 100,000 rows, or 1,000 rows from a table with
100,000,000 rows, the accuracy is pretty much the same. That doesn't
change when you go from a single variable to multiple variables.
I might be wrong, but I doubt that. First, I read a number of papers
while working on this patch, and all of them used samples proportional
to the data set. That's only indirect evidence, though.
You do need a bigger sample with multiple variables, however. My gut
feeling is that if you sample N rows for a single variable, with two
variables you need to sample N^2 rows to get the same accuracy. But
it's not proportional to the table size. (I have no proof for that,
but I'm sure there is literature on this.)
Maybe. I think it's related to the number of buckets (which determines
the precision of the histogram). If you want 1000 buckets, the number
of rows sampled needs to be e.g. 10x that. With multi-variate
histograms, we may shoot for more buckets (say, 100 in each dimension).
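
For example (illustrative numbers only, assuming ~10 sampled rows per
bucket):

    1 dimension:  100 buckets                -> ~1,000 sampled rows
    2 dimensions: 100 x 100 = 10,000 buckets -> ~100,000 sampled rows
    d dimensions: 100^d buckets              -> ~10 * 100^d sampled rows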
+ * Multivariate histograms
+ *
+ * Histograms are a collection of buckets, represented by n-dimensional
+ * rectangles. Each rectangle is delimited by an array of lower and
+ * upper boundaries, so that for the i-th attribute
+ *
+ *     min[i] <= value[i] <= max[i]
+ *
+ * Each bucket tracks frequency (fraction of tuples it contains),
+ * information about the inequalities, number of distinct values in
+ * each dimension (which is used when building the histogram) etc.
+ *
+ * The boundaries may be either inclusive or exclusive, or the whole
+ * dimension may be NULL.
+ *
+ * The buckets may overlap (assuming the build algorithm keeps the
+ * frequencies additive) or may not cover the whole space (i.e. allow
+ * gaps). This entirely depends on the algorithm used to build the
+ * histogram.

That sounds pretty exotic. These buckets are quite different from
the single-dimension buckets we currently have.

The paper you reference in the partition_bucket() function, M.
Muralikrishna, David J. DeWitt: Equi-Depth Histograms For Estimating
Selectivity Factors For Multi-Dimensional Queries. SIGMOD Conference
1988: 28-36, actually doesn't mention overlapping buckets at all. I
haven't read the code in detail, but if it implements the algorithm
from that paper, there will be no overlap.
The algorithm implemented in partition_bucket() is very simple and
naive, and it mostly resembles the algorithm described in the paper.
I'm sure there are differences - it's not a 1:1 implementation - but
you're right that it produces non-overlapping buckets.
The point is that I envision more complex algorithms or different
histogram types, and some of them may produce overlapping buckets.
Maybe that's a premature comment, and it will turn out not to be
really necessary.
regards
Tomas
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Wed, Dec 10, 2014 at 5:15 AM, Tomas Vondra <tv@fuzzy.cz> wrote:
I agree with moving the patch to the next CF - I'm working on the patch,
but I will take a bit more time to submit a new version and I can do
that in the next CF.
OK, cool. I just moved it myself - I didn't see it registered in 2014-12 yet.
Thanks,
--
Michael
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Mon, Dec 15, 2014 at 11:55 AM, Michael Paquier
<michael.paquier@gmail.com> wrote:
On Wed, Dec 10, 2014 at 5:15 AM, Tomas Vondra <tv@fuzzy.cz> wrote:
I agree with moving the patch to the next CF - I'm working on the patch,
but I will take a bit more time to submit a new version and I can do
that in the next CF.

OK, cool. I just moved it myself - I didn't see it registered in 2014-12 yet.
Marked as returned with feedback. No new version showed up in the last
month, and this patch was waiting for input from the author.
--
Michael
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Hi,
attached is an updated version of the multivariate stats patch. This is
going to be a somewhat longer mail, so I'll put a small ToC here ;-)
1) patch split into 4 parts
2) where to start / documentation
3) state of the code
4) main changes/improvements
5) remaining limitations
The motivation and design ideas explained in the first message of this
thread are still valid. It might be a good idea to read it first:
/messages/by-id/543AFA15.4080608@fuzzy.cz
BTW if you happen to go to FOSDEM [PGDay], I'll gladly give you an
intro to the patch in person, or discuss it in general.
1) Patch split into 4 parts
---------------------------
Firstly, the patch got broken into the following four pieces, to make
the reviews somewhat easier:
1) 0001-shared-infrastructure-and-functional-dependencies.patch
- infrastructure, shared by all the kinds of stats added
in the following patches (catalog, ALTER TABLE, ANALYZE ...)
- implementation of a simple statistics, tracking functional
dependencies between columns (previously called "associative
rules", but that's incorrect for several reasons)
- this does not modify the optimizer in any way
2) 0002-clause-reduction-using-functional-dependencies.patch
- applies the functional dependencies to optimizer (i.e. considers
the rules in clauselist_selectivity())
3) 0003-multivariate-MCV-lists.patch
- multivariate MCV lists (both ANALYZE and optimizer parts)
4) 0004-multivariate-histograms.patch
- multivariate histograms (both ANALYZE and optimizer parts)
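
As a quick illustration of the syntax added in 0001 (the table is
hypothetical, and going by the grammar, 'dependencies' is the only
option recognized so far):

    CREATE TABLE zipcodes (zip TEXT, city TEXT);

    ALTER TABLE zipcodes ADD STATISTICS (dependencies true)
                                     ON (zip, city);
    ANALYZE zipcodes;

    -- basic info about the stats built so far
    SELECT * FROM pg_mv_stats;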
You may look at the patches at github here:
https://github.com/tvondra/postgres/tree/multivariate-stats-squashed
The branch is not stable, i.e. I'll rebase / squash / force-push changes
in the future. (There's also a multivariate-stats development branch
with unsquashed changes, but you don't want to look at that, trust me.)
The patches are not exactly small (being in the 50-100 kB range), but
that's mostly because of the amount of comments explaining the goals and
implementation details.
2) Where to start / documentation
---------------------------------
I strived to document all the pieces properly, mostly in the form of
comments. There's no sgml documentation at this point, which should
obviously change in the future.
Anyway, I'd suggest reading the first e-mail in this thread, explaining
the ideas, and then these comments:
1) functional dependencies (patch 0001)
- src/backend/utils/mvstats/dependencies.c
2) MCV lists (patch 0003)
- src/backend/utils/mvstats/mcv.c
3) histograms (patch 0004)
- src/backend/utils/mvstats/histogram.c
- also see clauselist_mv_selectivity_mcvlist() in clausesel.c
- also see clauselist_mv_selectivity_histogram() in clausesel.c
4) selectivity estimation (patches 0002-0004)
- all in src/backend/optimizer/path/clausesel.c
- clauselist_selectivity() - overview of how the stats are applied
- clauselist_apply_dependencies() - functional dependencies reduction
- clauselist_mv_selectivity_mcvlist() - MCV list estimation
- clauselist_mv_selectivity_histogram() - histogram estimation
3) State of the code
--------------------
I've spent a fair amount of time testing the patches, and while I
believe there are no segfaults or the like, I know parts of the code
need a bit more love.
The part most in need of improvements / comments is probably the code in
clausesel.c - that seems a bit quirky. Reviews / comments regarding this
part of the code are very welcome - I'm sure there are many ways to
improve this part.
There are a few FIXMEs elsewhere (e.g. about memory allocation in the
(de)serialization code), but those are mostly well-defined issues that I
know how to address (at least I believe so).
4) Main changes/improvements
----------------------------
There are many significant improvements. The previous patch version was
in the 'proof of concept' category (missing pieces, knowingly broken in
some areas); the current patch should 'mostly work'.
The patch fixes the most annoying limitations of the first version:
(a) support for all data types (not just those passed by value)
(b) handles NULL values properly
(c) adds support for IS [NOT] NULL clauses
Aside from that the code was significantly improved, there are proper
regression tests and plenty of comments explaining the details.
5) Remaining limitations
------------------------
(a) limited to stats on 8 columns
This is mostly just a 'safeguard' restriction.
(b) only data types with '<' operator
I don't think this will change anytime soon, because all the
algorithms for building the stats rely on this. I don't see
this as a serious limitation though.
(c) not handling DROP COLUMN or DROP TABLE and so on
Currently this is not handled at all (so the regression tests
do an explicit DELETE from the pg_mv_statistic catalog).
Handling DROP TABLE won't be difficult - it's similar to the
current stats. Handling ALTER TABLE ... DROP COLUMN will be much
more tricky, I guess - should we drop all the stats referencing
that column, or should we just remove it from the stats? Or
should we keep it and treat it as NULL? I'm not sure what the
best solution is. (See the workaround in the example after this
list.)
(d) limited list of compatible WHERE clauses
The initial patch handled only simple operator clauses
(Var op Constant)
where the operator is one of ('<', '<=', '=', '>=', '>'). Now it also
handles IS [NOT] NULL clauses (see the example after this list).
Adding more clause types should
not be overly difficult - starting with more traditional
'BooleanTest' conditions, or even multi-column conditions
(Var op Var)
which are difficult to estimate using simple-column stats.
(e) optimizer uses single stats per table
This is still true, and I don't think this will change soon. I do
have some ideas on how to merge multiple stats etc., but it's
certainly complex stuff, unlikely to happen within this CF. The
patch makes a lot of sense even without this particular feature,
because you can create multiple stats, each suitable for different
queries.
(f) no JOIN conditions
Similarly to the previous point, it's on the TODO but it's not
going to happen in this CF.
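
To illustrate (c) and (d) - a sketch only, assuming a hypothetical
table t(a INT, b INT) with statistics defined on (a, b):

    -- clause shapes the multivariate estimator can currently use
    SELECT * FROM t WHERE (a = 1) AND (b < 10);
    SELECT * FROM t WHERE (a IS NULL) AND (b IS NOT NULL);

    -- not handled yet, falls back to the independence assumption
    SELECT * FROM t WHERE (a = b);

    -- workaround used by the regression tests for (c), until
    -- DROP TABLE / DROP COLUMN are handled properly
    DELETE FROM pg_mv_statistic WHERE starelid = 't'::regclass;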
kind regards
--
Tomas Vondra http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachments:
0001-shared-infrastructure-and-functional-dependencies.patchtext/x-diff; name=0001-shared-infrastructure-and-functional-dependencies.patchDownload
>From 2b8cbad288a0cd8fb5603af447c99f706ba7bbee Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tv@fuzzy.cz>
Date: Sun, 11 Jan 2015 19:51:48 +0100
Subject: [PATCH 1/4] shared infrastructure and functional dependencies
Basic infrastructure shared by all kinds of multivariate
stats, most importantly:
- adds a new system catalog (pg_mv_statistic)
- ALTER TABLE ... ADD STATISTICS syntax
- implementation of functional dependencies (the simplest
type of multivariate statistics)
- building functional dependencies in ANALYZE
- updates regression tests (new catalog etc.)
This does not include any changes to the optimizer, i.e.
it does not influence the query planning.
FIX: invalid assert in lookup_var_attr_stats()
The current implementation requires a valid 'ltopr'
so that we can sort the sample rows in various ways,
and the assert verified this by checking that the
function is 'compute_scalar_stats'. That is however a
private function in analyze.c, so the check failed
after moving the code into common.c.
Fixed by checking the 'ltopr' operator directly.
Eventually this will be removed, as ltopr is only
needed for histograms (functional dependencies and
MCV lists may be built without it).
FIX: improved comments about functional dependencies
FIX: add magic (MVSTAT_DEPS_MAGIC) into MVDependencies
FIX: improved analysis of functional dependencies
Changes:
- decreased minimum group size
- count contradicting rows ('not supporting' ones)
The algorithm is still rather simple and probably needs
other improvements.
FIX: add pg_mv_stats_dependencies_show() function
This function actually prints the rules, not just some basic
info (number of rules) as pg_mv_stats_dependencies_info() does.
FIX: (dependencies != NULL) in pg_mv_stats_dependencies_info()
STRICT is not a solution, because the deserialization may fail
for some reason (corrupted data, ...)
FIX: rename 'associative rules' to 'functional dependencies'
It's a more appropriate name, as functional dependencies
(as defined in relational theory, esp. normal forms)
track column-level dependencies.

Associative (or more correctly 'association') rules
track dependencies between particular values, and not
necessarily in different columns (market basket analysis).
Also, did a bunch of comment improvements, minor fixes.
This does not include changes in clausesel.c!
FIX: remove obsolete Assert() enforcing typbyval types
---
src/backend/catalog/Makefile | 1 +
src/backend/catalog/system_views.sql | 10 +
src/backend/commands/analyze.c | 17 +-
src/backend/commands/tablecmds.c | 149 +++++++-
src/backend/nodes/copyfuncs.c | 15 +-
src/backend/parser/gram.y | 67 +++-
src/backend/utils/Makefile | 2 +-
src/backend/utils/cache/syscache.c | 12 +
src/backend/utils/mvstats/Makefile | 17 +
src/backend/utils/mvstats/common.c | 272 ++++++++++++++
src/backend/utils/mvstats/common.h | 70 ++++
src/backend/utils/mvstats/dependencies.c | 554 +++++++++++++++++++++++++++++
src/include/catalog/indexing.h | 5 +
src/include/catalog/pg_mv_statistic.h | 69 ++++
src/include/catalog/pg_proc.h | 5 +
src/include/catalog/toasting.h | 1 +
src/include/nodes/nodes.h | 1 +
src/include/nodes/parsenodes.h | 11 +-
src/include/utils/mvstats.h | 86 +++++
src/include/utils/syscache.h | 1 +
src/test/regress/expected/rules.out | 8 +
src/test/regress/expected/sanity_check.out | 1 +
22 files changed, 1365 insertions(+), 9 deletions(-)
create mode 100644 src/backend/utils/mvstats/Makefile
create mode 100644 src/backend/utils/mvstats/common.c
create mode 100644 src/backend/utils/mvstats/common.h
create mode 100644 src/backend/utils/mvstats/dependencies.c
create mode 100644 src/include/catalog/pg_mv_statistic.h
create mode 100644 src/include/utils/mvstats.h
diff --git a/src/backend/catalog/Makefile b/src/backend/catalog/Makefile
index a403c64..d6c16f8 100644
--- a/src/backend/catalog/Makefile
+++ b/src/backend/catalog/Makefile
@@ -32,6 +32,7 @@ POSTGRES_BKI_SRCS = $(addprefix $(top_srcdir)/src/include/catalog/,\
pg_attrdef.h pg_constraint.h pg_inherits.h pg_index.h pg_operator.h \
pg_opfamily.h pg_opclass.h pg_am.h pg_amop.h pg_amproc.h \
pg_language.h pg_largeobject_metadata.h pg_largeobject.h pg_aggregate.h \
+ pg_mv_statistic.h \
pg_statistic.h pg_rewrite.h pg_trigger.h pg_event_trigger.h pg_description.h \
pg_cast.h pg_enum.h pg_namespace.h pg_conversion.h pg_depend.h \
pg_database.h pg_db_role_setting.h pg_tablespace.h pg_pltemplate.h \
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 4bc874f..da957fc 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -152,6 +152,16 @@ CREATE VIEW pg_indexes AS
LEFT JOIN pg_tablespace T ON (T.oid = I.reltablespace)
WHERE C.relkind IN ('r', 'm') AND I.relkind = 'i';
+CREATE VIEW pg_mv_stats AS
+ SELECT
+ N.nspname AS schemaname,
+ C.relname AS tablename,
+ S.stakeys AS attnums,
+ length(S.stadeps) as depsbytes,
+ pg_mv_stats_dependencies_info(S.stadeps) as depsinfo
+ FROM (pg_mv_statistic S JOIN pg_class C ON (C.oid = S.starelid))
+ LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace);
+
CREATE VIEW pg_stats AS
SELECT
nspname AS schemaname,
diff --git a/src/backend/commands/analyze.c b/src/backend/commands/analyze.c
index 5de2b39..a02dcb2 100644
--- a/src/backend/commands/analyze.c
+++ b/src/backend/commands/analyze.c
@@ -27,6 +27,7 @@
#include "catalog/indexing.h"
#include "catalog/pg_collation.h"
#include "catalog/pg_inherits_fn.h"
+#include "catalog/pg_mv_statistic.h"
#include "catalog/pg_namespace.h"
#include "commands/dbcommands.h"
#include "commands/tablecmds.h"
@@ -54,7 +55,11 @@
#include "utils/syscache.h"
#include "utils/timestamp.h"
#include "utils/tqual.h"
+#include "utils/fmgroids.h"
+#include "utils/builtins.h"
+#include "utils/mvstats.h"
+#include "access/sysattr.h"
/* Data structure for Algorithm S from Knuth 3.4.2 */
typedef struct
@@ -110,7 +115,6 @@ static void update_attstats(Oid relid, bool inh,
static Datum std_fetch_func(VacAttrStatsP stats, int rownum, bool *isNull);
static Datum ind_fetch_func(VacAttrStatsP stats, int rownum, bool *isNull);
-
/*
* analyze_rel() -- analyze one relation
*/
@@ -472,6 +476,13 @@ do_analyze_rel(Relation onerel, VacuumStmt *vacstmt,
* all analyzable columns. We use a lower bound of 100 rows to avoid
* possible overflow in Vitter's algorithm. (Note: that will also be the
* target in the corner case where there are no analyzable columns.)
+ *
+ * FIXME This sample sizing is mostly OK when computing stats for
+ * individual columns, but when computing multivariate stats
+ * (histograms, MCV lists, ...) it's rather insufficient. For a
+ * small number of dimensions it works, but for complex stats
+ * it'd be nice to use a sample proportional to the table
+ * (say, 0.5% - 1%) instead of a fixed size.
*/
targrows = 100;
for (i = 0; i < attr_cnt; i++)
@@ -574,6 +585,9 @@ do_analyze_rel(Relation onerel, VacuumStmt *vacstmt,
update_attstats(RelationGetRelid(Irel[ind]), false,
thisdata->attr_cnt, thisdata->vacattrstats);
}
+
+ /* Build multivariate stats (if there are any). */
+ build_mv_stats(onerel, numrows, rows, attr_cnt, vacattrstats);
}
/*
@@ -2819,3 +2833,4 @@ compare_mcvs(const void *a, const void *b)
return da - db;
}
+
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 66d5083..3ec1a5a 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -35,6 +35,7 @@
#include "catalog/pg_foreign_table.h"
#include "catalog/pg_inherits.h"
#include "catalog/pg_inherits_fn.h"
+#include "catalog/pg_mv_statistic.h"
#include "catalog/pg_namespace.h"
#include "catalog/pg_opclass.h"
#include "catalog/pg_tablespace.h"
@@ -91,7 +92,7 @@
#include "utils/syscache.h"
#include "utils/tqual.h"
#include "utils/typcache.h"
-
+#include "utils/mvstats.h"
/*
* ON COMMIT action list
@@ -139,8 +140,9 @@ static List *on_commits = NIL;
#define AT_PASS_ADD_COL 5 /* ADD COLUMN */
#define AT_PASS_ADD_INDEX 6 /* ADD indexes */
#define AT_PASS_ADD_CONSTR 7 /* ADD constraints, defaults */
-#define AT_PASS_MISC 8 /* other stuff */
-#define AT_NUM_PASSES 9
+#define AT_PASS_ADD_STATS 8 /* ADD statistics */
+#define AT_PASS_MISC 9 /* other stuff */
+#define AT_NUM_PASSES 10
typedef struct AlteredTableInfo
{
@@ -415,7 +417,8 @@ static void ATExecReplicaIdentity(Relation rel, ReplicaIdentityStmt *stmt, LOCKM
static void ATExecGenericOptions(Relation rel, List *options);
static void ATExecEnableRowSecurity(Relation rel);
static void ATExecDisableRowSecurity(Relation rel);
-
+static void ATExecAddStatistics(AlteredTableInfo *tab, Relation rel,
+ StatisticsDef *def, LOCKMODE lockmode);
static void copy_relation_data(SMgrRelation rel, SMgrRelation dst,
ForkNumber forkNum, char relpersistence);
static const char *storage_name(char c);
@@ -2965,6 +2968,7 @@ AlterTableGetLockLevel(List *cmds)
* updates.
*/
case AT_SetStatistics: /* Uses MVCC in getTableAttrs() */
+ case AT_AddStatistics: /* XXX not sure if the right level */
case AT_ClusterOn: /* Uses MVCC in getIndexes() */
case AT_DropCluster: /* Uses MVCC in getIndexes() */
case AT_SetOptions: /* Uses MVCC in getTableAttrs() */
@@ -3119,6 +3123,7 @@ ATPrepCmd(List **wqueue, Relation rel, AlterTableCmd *cmd,
pass = AT_PASS_ADD_CONSTR;
break;
case AT_SetStatistics: /* ALTER COLUMN SET STATISTICS */
+ case AT_AddStatistics: /* XXX maybe not the right place */
ATSimpleRecursion(wqueue, rel, cmd, recurse, lockmode);
/* Performs own permission checks */
ATPrepSetStatistics(rel, cmd->name, cmd->def, lockmode);
@@ -3414,6 +3419,9 @@ ATExecCmd(List **wqueue, AlteredTableInfo *tab, Relation rel,
case AT_SetStatistics: /* ALTER COLUMN SET STATISTICS */
ATExecSetStatistics(rel, cmd->name, cmd->def, lockmode);
break;
+ case AT_AddStatistics: /* ADD STATISTICS */
+ ATExecAddStatistics(tab, rel, (StatisticsDef *) cmd->def, lockmode);
+ break;
case AT_SetOptions: /* ALTER COLUMN SET ( options ) */
ATExecSetOptions(rel, cmd->name, cmd->def, false, lockmode);
break;
@@ -11605,3 +11613,136 @@ RangeVarCallbackForAlterRelation(const RangeVar *rv, Oid relid, Oid oldrelid,
ReleaseSysCache(tuple);
}
+
+/* used for sorting the attnums in ATExecAddStatistics */
+static int compare_int16(const void *a, const void *b)
+{
+ return memcmp(a, b, sizeof(int16));
+}
+
+/*
+ * Implements the ALTER TABLE ... ADD STATISTICS (options) ON (columns).
+ *
+ * The code is an unholy mix of pieces that really belong to other parts
+ * of the source tree.
+ *
+ * FIXME Check that the types are pass-by-value and support sort,
+ * although maybe we can live without the sort (and only build
+ * MCV list / association rules).
+ *
+ * FIXME This should probably check for duplicate stats (i.e. same
+ * keys, same options). Although maybe it's useful to have
+ * multiple stats on the same columns with different options
+ * (say, a detailed MCV-only stats for some queries, histogram
+ * for others, etc.)
+ */
+static void ATExecAddStatistics(AlteredTableInfo *tab, Relation rel,
+ StatisticsDef *def, LOCKMODE lockmode)
+{
+ int i, j;
+ ListCell *l;
+ int16 attnums[INDEX_MAX_KEYS];
+ int numcols = 0;
+
+ HeapTuple htup;
+ Datum values[Natts_pg_mv_statistic];
+ bool nulls[Natts_pg_mv_statistic];
+ int2vector *stakeys;
+ Relation mvstatrel;
+
+ /* by default build everything */
+ bool build_dependencies = true;
+
+ Assert(IsA(def, StatisticsDef));
+
+ /* transform the column names to attnum values */
+
+ foreach(l, def->keys)
+ {
+ char *attname = strVal(lfirst(l));
+ HeapTuple atttuple;
+
+ atttuple = SearchSysCacheAttName(RelationGetRelid(rel), attname);
+
+ if (!HeapTupleIsValid(atttuple))
+ ereport(ERROR,
+ (errcode(ERRCODE_UNDEFINED_COLUMN),
+ errmsg("column \"%s\" referenced in statistics does not exist",
+ attname)));
+
+ /* more than MVSTATS_MAX_DIMENSIONS columns not allowed */
+ if (numcols >= MVSTATS_MAX_DIMENSIONS)
+ ereport(ERROR,
+ (errcode(ERRCODE_TOO_MANY_COLUMNS),
+ errmsg("cannot have more than %d keys in a statistics",
+ MVSTATS_MAX_DIMENSIONS)));
+
+ attnums[numcols] = ((Form_pg_attribute) GETSTRUCT(atttuple))->attnum;
+ ReleaseSysCache(atttuple);
+ numcols++;
+ }
+
+ /*
+ * Check the lower bound (at least 2 columns), the upper bound was
+ * already checked in the loop.
+ */
+ if (numcols < 2)
+ ereport(ERROR,
+ (errcode(ERRCODE_TOO_MANY_COLUMNS),
+ errmsg("multivariate stats require 2 or more columns")));
+
+ /* look for duplicate columns */
+ for (i = 0; i < numcols; i++)
+ for (j = 0; j < numcols; j++)
+ if ((i != j) && (attnums[i] == attnums[j]))
+ ereport(ERROR,
+ (errcode(ERRCODE_UNDEFINED_COLUMN),
+ errmsg("duplicate column name in statistics definition")));
+
+ /* parse the statistics options */
+ foreach (l, def->options)
+ {
+ DefElem *opt = (DefElem*)lfirst(l);
+
+ if (strcmp(opt->defname, "dependencies") == 0)
+ build_dependencies = defGetBoolean(opt);
+ else
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("unrecognized STATISTICS option \"%s\"",
+ opt->defname)));
+ }
+
+ /* sort the attnums and build int2vector */
+ qsort(attnums, numcols, sizeof(int16), compare_int16);
+ stakeys = buildint2vector(attnums, numcols);
+
+ /*
+ * Okay, let's create the pg_mv_statistic entry.
+ */
+ memset(values, 0, sizeof(values));
+ memset(nulls, false, sizeof(nulls));
+
+ /* no stats collected yet, so just the keys */
+ values[Anum_pg_mv_statistic_starelid-1] = ObjectIdGetDatum(RelationGetRelid(rel));
+
+ values[Anum_pg_mv_statistic_stakeys -1] = PointerGetDatum(stakeys);
+ values[Anum_pg_mv_statistic_deps_enabled -1] = BoolGetDatum(build_dependencies);
+
+ nulls[Anum_pg_mv_statistic_stadeps -1] = true;
+
+ /* insert the tuple into pg_mv_statistic */
+ mvstatrel = heap_open(MvStatisticRelationId, RowExclusiveLock);
+
+ htup = heap_form_tuple(mvstatrel->rd_att, values, nulls);
+
+ simple_heap_insert(mvstatrel, htup);
+
+ CatalogUpdateIndexes(mvstatrel, htup);
+
+ heap_freetuple(htup);
+
+ heap_close(mvstatrel, RowExclusiveLock);
+
+ return;
+}
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index f1a24f5..eb406ff 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -3909,6 +3909,17 @@ _copyAlterPolicyStmt(const AlterPolicyStmt *from)
return newnode;
}
+static StatisticsDef *
+_copyStatisticsDef(const StatisticsDef *from)
+{
+ StatisticsDef *newnode = makeNode(StatisticsDef);
+
+ COPY_NODE_FIELD(keys);
+ COPY_NODE_FIELD(options);
+
+ return newnode;
+}
+
/* ****************************************************************
* pg_list.h copy functions
* ****************************************************************
@@ -4723,6 +4734,9 @@ copyObject(const void *from)
case T_CommonTableExpr:
retval = _copyCommonTableExpr(from);
break;
+ case T_StatisticsDef:
+ retval = _copyStatisticsDef(from);
+ break;
case T_PrivGrantee:
retval = _copyPrivGrantee(from);
break;
@@ -4735,7 +4749,6 @@ copyObject(const void *from)
case T_XmlSerialize:
retval = _copyXmlSerialize(from);
break;
-
default:
elog(ERROR, "unrecognized node type: %d", (int) nodeTag(from));
retval = 0; /* keep compiler quiet */
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 679e1bb..7a89f6c 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -366,6 +366,13 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
create_generic_options alter_generic_options
relation_expr_list dostmt_opt_list
+%type <list> OptStatsOptions
+%type <str> stats_options_name
+%type <node> stats_options_arg
+%type <defelt> stats_options_elem
+%type <list> stats_options_list
+
+
%type <list> opt_fdw_options fdw_options
%type <defelt> fdw_option
@@ -484,7 +491,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
%type <keyword> unreserved_keyword type_func_name_keyword
%type <keyword> col_name_keyword reserved_keyword
-%type <node> TableConstraint TableLikeClause
+%type <node> TableConstraint TableLikeClause TableStatistics
%type <ival> TableLikeOptionList TableLikeOption
%type <list> ColQualList
%type <node> ColConstraint ColConstraintElem ConstraintAttr
@@ -2312,6 +2319,14 @@ alter_table_cmd:
n->subtype = AT_DisableRowSecurity;
$$ = (Node *)n;
}
+ /* ALTER TABLE <name> ADD STATISTICS (options) ON (columns) ... */
+ | ADD_P TableStatistics
+ {
+ AlterTableCmd *n = makeNode(AlterTableCmd);
+ n->subtype = AT_AddStatistics;
+ n->def = $2;
+ $$ = (Node *)n;
+ }
| alter_generic_options
{
AlterTableCmd *n = makeNode(AlterTableCmd);
@@ -3382,6 +3397,56 @@ OptConsTableSpace: USING INDEX TABLESPACE name { $$ = $4; }
ExistingIndex: USING INDEX index_name { $$ = $3; }
;
+/*****************************************************************************
+ *
+ * QUERY :
+ * ALTER TABLE relname ADD STATISTICS (options) ON (columns)
+ *
+ *****************************************************************************/
+
+TableStatistics:
+ STATISTICS OptStatsOptions ON '(' columnList ')'
+ {
+ StatisticsDef *n = makeNode(StatisticsDef);
+ n->keys = $5;
+ n->options = $2;
+ $$ = (Node *) n;
+ }
+ ;
+
+OptStatsOptions:
+ '(' stats_options_list ')' { $$ = $2; }
+ | /*EMPTY*/ { $$ = NIL; }
+ ;
+
+stats_options_list:
+ stats_options_elem
+ {
+ $$ = list_make1($1);
+ }
+ | stats_options_list ',' stats_options_elem
+ {
+ $$ = lappend($1, $3);
+ }
+ ;
+
+stats_options_elem:
+ stats_options_name stats_options_arg
+ {
+ $$ = makeDefElem($1, $2);
+ }
+ ;
+
+stats_options_name:
+ NonReservedWord { $$ = $1; }
+ ;
+
+stats_options_arg:
+ opt_boolean_or_string { $$ = (Node *) makeString($1); }
+ | NumericOnly { $$ = (Node *) $1; }
+ | /* EMPTY */ { $$ = NULL; }
+ ;
+
/*****************************************************************************
*
diff --git a/src/backend/utils/Makefile b/src/backend/utils/Makefile
index 8374533..eba0352 100644
--- a/src/backend/utils/Makefile
+++ b/src/backend/utils/Makefile
@@ -9,7 +9,7 @@ top_builddir = ../../..
include $(top_builddir)/src/Makefile.global
OBJS = fmgrtab.o
-SUBDIRS = adt cache error fmgr hash init mb misc mmgr resowner sort time
+SUBDIRS = adt cache error fmgr hash init mb misc mmgr mvstats resowner sort time
# location of Catalog.pm
catalogdir = $(top_srcdir)/src/backend/catalog
diff --git a/src/backend/utils/cache/syscache.c b/src/backend/utils/cache/syscache.c
index bd27168..f61ef7e 100644
--- a/src/backend/utils/cache/syscache.c
+++ b/src/backend/utils/cache/syscache.c
@@ -43,6 +43,7 @@
#include "catalog/pg_foreign_server.h"
#include "catalog/pg_foreign_table.h"
#include "catalog/pg_language.h"
+#include "catalog/pg_mv_statistic.h"
#include "catalog/pg_namespace.h"
#include "catalog/pg_opclass.h"
#include "catalog/pg_operator.h"
@@ -499,6 +500,17 @@ static const struct cachedesc cacheinfo[] = {
},
4
},
+ {MvStatisticRelationId, /* MVSTATOID */
+ MvStatisticOidIndexId,
+ 1,
+ {
+ ObjectIdAttributeNumber,
+ 0,
+ 0,
+ 0
+ },
+ 128
+ },
{NamespaceRelationId, /* NAMESPACENAME */
NamespaceNameIndexId,
1,
diff --git a/src/backend/utils/mvstats/Makefile b/src/backend/utils/mvstats/Makefile
new file mode 100644
index 0000000..099f1ed
--- /dev/null
+++ b/src/backend/utils/mvstats/Makefile
@@ -0,0 +1,17 @@
+#-------------------------------------------------------------------------
+#
+# Makefile--
+# Makefile for utils/mvstats
+#
+# IDENTIFICATION
+# src/backend/utils/mvstats/Makefile
+#
+#-------------------------------------------------------------------------
+
+subdir = src/backend/utils/mvstats
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+
+OBJS = common.o dependencies.o
+
+include $(top_srcdir)/src/backend/common.mk
diff --git a/src/backend/utils/mvstats/common.c b/src/backend/utils/mvstats/common.c
new file mode 100644
index 0000000..36757d5
--- /dev/null
+++ b/src/backend/utils/mvstats/common.c
@@ -0,0 +1,272 @@
+/*-------------------------------------------------------------------------
+ *
+ * common.c
+ * POSTGRES multivariate statistics
+ *
+ *
+ * Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ * src/backend/utils/mvstats/common.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "common.h"
+
+/*
+ * Compute requested multivariate stats, using the rows sampled for the
+ * plain (single-column) stats.
+ *
+ * This fetches a list of stats from pg_mv_statistic, computes the stats
+ * and serializes them back into the catalog (as bytea values).
+ */
+void
+build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
+ int natts, VacAttrStats **vacattrstats)
+{
+ int i;
+ MVStats mvstats;
+ int nmvstats;
+
+ /*
+ * Fetch defined MV groups from pg_mv_statistic, and then compute
+ * the MV statistics (histograms for now).
+ */
+ mvstats = list_mv_stats(RelationGetRelid(onerel), &nmvstats, false);
+
+ for (i = 0; i < nmvstats; i++)
+ {
+ MVDependencies deps = NULL;
+
+ /* int2 vector of attnums the stats should be computed on */
+ int2vector * attrs = mvstats[i].stakeys;
+
+ /* check allowed number of dimensions */
+ Assert((attrs->dim1 >= 2) && (attrs->dim1 <= MVSTATS_MAX_DIMENSIONS));
+
+ /*
+ * Analyze functional dependencies of columns.
+ */
+ deps = build_mv_dependencies(numrows, rows, attrs, natts, vacattrstats);
+
+ /* store the histogram / MCV list in the catalog */
+ update_mv_stats(mvstats[i].mvoid, deps);
+ }
+}
+
+/*
+ * Lookup the VacAttrStats info for the selected columns, with indexes
+ * matching the attrs vector (to make it easy to work with when
+ * computing multivariate stats).
+ */
+VacAttrStats **
+lookup_var_attr_stats(int2vector *attrs, int natts, VacAttrStats **vacattrstats)
+{
+ int i, j;
+ int numattrs = attrs->dim1;
+ VacAttrStats **stats = (VacAttrStats**)palloc0(numattrs * sizeof(VacAttrStats*));
+
+ /* lookup VacAttrStats info for the requested columns (same attnum) */
+ for (i = 0; i < numattrs; i++)
+ {
+ stats[i] = NULL;
+ for (j = 0; j < natts; j++)
+ {
+ if (attrs->values[i] == vacattrstats[j]->tupattnum)
+ {
+ stats[i] = vacattrstats[j];
+ break;
+ }
+ }
+
+ /*
+ * Check that we found the info, that the attnum matches and
+ * that there's the requested 'lt' operator and that the type
+ * is 'passed-by-value'.
+ */
+ Assert(stats[i] != NULL);
+ Assert(stats[i]->tupattnum == attrs->values[i]);
+
+ /* FIXME This is rather ugly way to check for 'ltopr' (which
+ * is defined for 'scalar' attributes).
+ */
+ Assert(((StdAnalyzeData *)stats[i]->extra_data)->ltopr != InvalidOid);
+ }
+
+ return stats;
+}
+
+/*
+ * Fetch list of MV stats defined on a table, without the actual data
+ * for histograms, MCV lists etc.
+ */
+MVStats
+list_mv_stats(Oid relid, int *nstats, bool built_only)
+{
+ Relation indrel;
+ SysScanDesc indscan;
+ ScanKeyData skey;
+ HeapTuple htup;
+ MVStats result;
+
+ /* start with 16 items, that should be enough for most cases */
+ int maxitems = 16;
+ result = (MVStats)palloc0(sizeof(MVStatsData) * maxitems);
+ *nstats = 0;
+
+ /* Prepare to scan pg_mv_statistic for entries with starelid = this rel. */
+ ScanKeyInit(&skey,
+ Anum_pg_mv_statistic_starelid,
+ BTEqualStrategyNumber, F_OIDEQ,
+ ObjectIdGetDatum(relid));
+
+ indrel = heap_open(MvStatisticRelationId, AccessShareLock);
+ indscan = systable_beginscan(indrel, MvStatisticRelidIndexId, true,
+ NULL, 1, &skey);
+
+ while (HeapTupleIsValid(htup = systable_getnext(indscan)))
+ {
+ Form_pg_mv_statistic stats = (Form_pg_mv_statistic) GETSTRUCT(htup);
+
+ /*
+ * Skip statistics that were not computed yet (if only stats
+ * that were already built were requested)
+ */
+ if (built_only && (! stats->deps_built))
+ continue;
+
+ /* double the array size if needed */
+ if (*nstats == maxitems)
+ {
+ maxitems *= 2;
+ result = (MVStats)repalloc(result, sizeof(MVStatsData) * maxitems);
+ }
+
+ result[*nstats].mvoid = HeapTupleGetOid(htup);
+ result[*nstats].stakeys = buildint2vector(stats->stakeys.values, stats->stakeys.dim1);
+ result[*nstats].deps_built = stats->deps_built;
+ *nstats += 1;
+ }
+
+ systable_endscan(indscan);
+
+ heap_close(indrel, AccessShareLock);
+
+ /* TODO maybe save the list into relcache, as in RelationGetIndexList
+ * (which was used as an inspiration for this one)? */
+
+ return result;
+}
+
+void
+update_mv_stats(Oid mvoid, MVDependencies dependencies)
+{
+ HeapTuple stup,
+ oldtup;
+ Datum values[Natts_pg_mv_statistic];
+ bool nulls[Natts_pg_mv_statistic];
+ bool replaces[Natts_pg_mv_statistic];
+
+ Relation sd = heap_open(MvStatisticRelationId, RowExclusiveLock);
+
+ memset(nulls, 1, Natts_pg_mv_statistic * sizeof(bool));
+ memset(replaces, 0, Natts_pg_mv_statistic * sizeof(bool));
+ memset(values, 0, Natts_pg_mv_statistic * sizeof(Datum));
+
+ /*
+ * Construct a new pg_mv_statistic tuple - replace only the
+ * dependencies value, depending on whether it was actually computed.
+ */
+ if (dependencies != NULL)
+ {
+ nulls[Anum_pg_mv_statistic_stadeps -1] = false;
+ values[Anum_pg_mv_statistic_stadeps - 1]
+ = PointerGetDatum(serialize_mv_dependencies(dependencies));
+ }
+
+ /* always replace the value (either by bytea or NULL) */
+ replaces[Anum_pg_mv_statistic_stadeps -1] = true;
+
+ /* always change the availability flags */
+ nulls[Anum_pg_mv_statistic_deps_built -1] = false;
+
+ replaces[Anum_pg_mv_statistic_deps_built-1] = true;
+
+ values[Anum_pg_mv_statistic_deps_built-1] = BoolGetDatum(dependencies != NULL);
+
+ /* Is there already a pg_mv_statistic tuple for this attribute? */
+ oldtup = SearchSysCache1(MVSTATOID,
+ ObjectIdGetDatum(mvoid));
+
+ if (HeapTupleIsValid(oldtup))
+ {
+ /* Yes, replace it */
+ stup = heap_modify_tuple(oldtup,
+ RelationGetDescr(sd),
+ values,
+ nulls,
+ replaces);
+ ReleaseSysCache(oldtup);
+ simple_heap_update(sd, &stup->t_self, stup);
+ }
+ else
+ elog(ERROR, "invalid pg_mv_statistic record (oid=%d)", mvoid);
+
+ /* update indexes too */
+ CatalogUpdateIndexes(sd, stup);
+
+ heap_freetuple(stup);
+
+ heap_close(sd, RowExclusiveLock);
+}
+
+/* multi-variate stats comparator */
+
+/*
+ * qsort_arg comparator for sorting Datums (MV stats)
+ *
+ * This does not maintain the tupnoLink array.
+ */
+int
+compare_scalars_simple(const void *a, const void *b, void *arg)
+{
+ Datum da = *(Datum*)a;
+ Datum db = *(Datum*)b;
+ SortSupport ssup= (SortSupport) arg;
+
+ return ApplySortComparator(da, false, db, false, ssup);
+}
+
+/*
+ * qsort_arg comparator for sorting data when partitioning a MV bucket
+ */
+int
+compare_scalars_partition(const void *a, const void *b, void *arg)
+{
+ Datum da = ((ScalarItem*)a)->value;
+ Datum db = ((ScalarItem*)b)->value;
+ SortSupport ssup= (SortSupport) arg;
+
+ return ApplySortComparator(da, false, db, false, ssup);
+}
+
+/*
+ * qsort_arg comparator for sorting Datum[] (row of Datums) when
+ * counting distinct values.
+ */
+int
+compare_scalars_memcmp(const void *a, const void *b, void *arg)
+{
+ Size len = *(Size*)arg;
+
+ return memcmp(a, b, len);
+}
+
+int
+compare_scalars_memcmp_2(const void *a, const void *b)
+{
+ return memcmp(a, b, sizeof(Datum));
+}
diff --git a/src/backend/utils/mvstats/common.h b/src/backend/utils/mvstats/common.h
new file mode 100644
index 0000000..f511c4e
--- /dev/null
+++ b/src/backend/utils/mvstats/common.h
@@ -0,0 +1,70 @@
+/*-------------------------------------------------------------------------
+ *
+ * common.h
+ * POSTGRES multivariate statistics
+ *
+ *
+ * Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ * src/backend/utils/mvstats/common.h
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres.h"
+
+#include "access/tuptoaster.h"
+#include "catalog/indexing.h"
+#include "catalog/pg_collation.h"
+#include "catalog/pg_mv_statistic.h"
+#include "foreign/fdwapi.h"
+#include "postmaster/autovacuum.h"
+#include "storage/lmgr.h"
+#include "utils/datum.h"
+#include "utils/sortsupport.h"
+#include "utils/syscache.h"
+#include "utils/fmgroids.h"
+#include "utils/builtins.h"
+#include "access/sysattr.h"
+
+#include "utils/mvstats.h"
+
+/* FIXME private structure copied from analyze.c */
+
+typedef struct
+{
+ Oid eqopr; /* '=' operator for datatype, if any */
+ Oid eqfunc; /* and associated function */
+ Oid ltopr; /* '<' operator for datatype, if any */
+} StdAnalyzeData;
+
+typedef struct
+{
+ Datum value; /* a data value */
+ int tupno; /* position index for tuple it came from */
+} ScalarItem;
+
+typedef struct
+{
+ int count; /* # of duplicates */
+ int first; /* values[] index of first occurrence */
+} ScalarMCVItem;
+
+typedef struct
+{
+ SortSupport ssup;
+ int *tupnoLink;
+} CompareScalarsContext;
+
+
+VacAttrStats ** lookup_var_attr_stats(int2vector *attrs,
+ int natts, VacAttrStats **vacattrstats);
+
+/* comparators, used when constructing multivariate stats */
+int compare_scalars_simple(const void *a, const void *b, void *arg);
+int compare_scalars_partition(const void *a, const void *b, void *arg);
+int compare_scalars_memcmp(const void *a, const void *b, void *arg);
+int compare_scalars_memcmp_2(const void *a, const void *b);
diff --git a/src/backend/utils/mvstats/dependencies.c b/src/backend/utils/mvstats/dependencies.c
new file mode 100644
index 0000000..b900efd
--- /dev/null
+++ b/src/backend/utils/mvstats/dependencies.c
@@ -0,0 +1,554 @@
+/*-------------------------------------------------------------------------
+ *
+ * dependencies.c
+ * POSTGRES multivariate functional dependencies
+ *
+ *
+ * Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ * src/backend/utils/mvstats/dependencies.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "common.h"
+
+/*
+ * Mine functional dependencies between columns, in the form (A => B),
+ * meaning that a value in column 'A' determines value in 'B'. A simple
+ * artificial example may be a table created like this
+ *
+ * CREATE TABLE deptest (a INT, b INT)
+ * AS SELECT i, i/10 FROM generate_series(1,100000) s(i);
+ *
+ * Clearly, once we know the value for 'A' we can easily determine the
+ * value of 'B' by dividing (A/10). A more practical example may be
+ * addresses, where (ZIP code => city name), i.e. once we know the ZIP,
+ * we probably know which city it belongs to. Larger cities usually have
+ * multiple ZIP codes, so the dependency can't be reversed.
+ *
+ * Functional dependencies are a concept well described in relational
+ * theory, especially in definition of normalization and "normal forms".
+ * Wikipedia has a nice definition of a functional dependency [1]:
+ *
+ * In a given table, an attribute Y is said to have a functional
+ * dependency on a set of attributes X (written X -> Y) if and only
+ * if each X value is associated with precisely one Y value. For
+ * example, in an "Employee" table that includes the attributes
+ * "Employee ID" and "Employee Date of Birth", the functional
+ * dependency {Employee ID} -> {Employee Date of Birth} would hold.
+ * It follows from the previous two sentences that each {Employee ID}
+ * is associated with precisely one {Employee Date of Birth}.
+ *
+ * [1] http://en.wikipedia.org/wiki/Database_normalization
+ *
+ * Most datasets could be normalized so as not to contain any such
+ * functional dependencies, but sometimes it's not practical. In some
+ * cases it's actually a conscious choice to model the dataset in a
+ * denormalized way, either for performance or to make querying easier.
+ *
+ * The current implementation supports only dependencies between two
+ * columns, but this is merely a simplification in the initial version
+ * of the patch. It's certainly useful to mine for dependencies with
+ * multiple columns on the 'left' side, i.e. in the condition of the
+ * dependency - that is, dependencies [A,B] => C and so on.
+ *
+ * Handling multiple columns on the right side is not necessary, as such
+ * dependencies may be decomposed into a set of dependencies with
+ * the same meaning, one for each column on the right side. For example
+ *
+ * A => [B,C]
+ *
+ * is exactly the same as
+ *
+ * (A => B) & (A => C).
+ *
+ * Of course, storing (A => [B, C]) may be more efficient than storing
+ * the two dependencies (A => B) and (A => C) separately.
+ *
+ *
+ * Dependency mining (ANALYZE)
+ * ---------------------------
+ *
+ * FIXME Add more details about how build_mv_dependencies() works
+ * (minimum group size, supporting/contradicting etc.).
+ *
+ * Real-world datasets are imperfect - there may be errors (e.g. due to
+ * data-entry mistakes), or factually correct records, yet contradicting
+ * the dependency (e.g. when a city splits into two, but both keep the
+ * same ZIP code). A strict ANALYZE implementation (where the functional
+ * dependencies are identified) would ignore dependencies on such noisy
+ * data, making the approach unusable in practice.
+ *
+ * The proposed implementation attempts to handle such noisy cases
+ * gracefully, by tolerating a small number of contradicting cases.
+ *
+ * In the future this might also perform some sort of test and decide
+ * whether it's worth building any other kind of multivariate stats,
+ * or whether the dependencies sufficiently describe the data. Or at
+ * least not build the MCV list / histogram on the implied columns.
+ * Such reduction would however make the 'verification' (see the next
+ * section) impossible.
+ *
+ *
+ * Clause reduction (planner/optimizer)
+ * ------------------------------------
+ *
+ * FIXME Explain how reduction works.
+ *
+ * The problem with the reduction is that the query may use conditions
+ * that are not redundant, but in fact contradictory - e.g. the user
+ * may search for a ZIP code and a city name not matching the ZIP code.
+ *
+ * In such cases, the condition on the city name is not redundant,
+ * but actually contradictory (making the result empty), and removing
+ * it while estimating the cardinality will make the estimate worse.
+ *
+ * The current estimation assuming independence (and multiplying the
+ * selectivities) works better in this case, but only by utter luck.
+ *
+ * In some cases this might be verified using the other multivariate
+ * statistics - MCV lists and histograms. For MCV lists the verification
+ * might be very simple - peek into the list to see if there are any
+ * items matching the clause on the 'A' column (e.g. ZIP code), and if
+ * such an item is found, check that the 'B' column matches the other
+ * clause. If it does not, the clauses are contradictory. We can't
+ * really say anything when no such item is found, except maybe
+ * restricting the selectivity using the MCV data (e.g. using min/max
+ * selectivity, or something).
+ *
+ * With histograms, it might work similarly - we can't check the values
+ * directly (because histograms use buckets, unlike MCV lists, which
+ * store the actual values). So we can only observe the buckets matching the
+ * clauses - if those buckets have very low frequency, it probably means
+ * the two clauses are incompatible.
+ *
+ * It's unclear what 'low frequency' is, but if one of the clauses is
+ * implied (automatically true because of the other clause), then
+ *
+ * selectivity[clause(A)] = selectivity[clause(A) & clause(B)]
+ *
+ * So we might compute selectivity of the first clause (on the column
+ * A in dependency [A=>B]) - for example using regular statistics.
+ * And then check if the selectivity computed from the histogram is
+ * about the same (or significantly lower).
+ *
+ * The problem is that histograms work well only when the data ordering
+ * matches the natural meaning. For values that serve as labels - like
+ * city names or ZIP codes, or even generated IDs, histograms really
+ * don't work all that well. For example sorting cities by name won't
+ * match the sorting of ZIP codes, rendering the histogram unusable.
+ *
+ * The MCV are probably going to work much better, because they don't
+ * really assume any sort of ordering. And it's probably more appropriate
+ * for the label-like data.
+ *
+ * TODO Support dependencies with multiple columns on left/right.
+ *
+ * TODO Investigate using histogram and MCV list to confirm the
+ * functional dependencies.
+ *
+ * TODO Investigate statistical testing of the distribution (to decide
+ * whether it makes sense to build the histogram/MCV list).
+ *
+ * TODO Using a min/max of selectivities would probably make more sense
+ * for the associated columns.
+ *
+ * TODO Consider eliminating the implied columns from the histogram and
+ * MCV lists (but maybe that's not a good idea).
+ *
+ * FIXME Not sure if this handles NULL values properly (not sure how to
+ * do that). We assume that NULL means 0 for now, handling it just
+ * like any other value.
+ */
+MVDependencies
+build_mv_dependencies(int numrows, HeapTuple *rows, int2vector *attrs,
+ int natts, VacAttrStats **vacattrstats)
+{
+ int i;
+ bool isNull;
+ Size len = 2 * sizeof(Datum); /* only simple associations a => b */
+ int numattrs = attrs->dim1;
+
+ /* result */
+ int ndeps = 0;
+ MVDependencies dependencies = NULL;
+
+ /* TODO Maybe this should be somehow related to the number of
+ * distinct values in the two columns we're currently analyzing.
+ * Assuming the distribution is uniform, we can compute the average
+ * group size we should expect to observe in the sample, and then
+ * use that as a threshold. That seems better than a static approach.
+ */
+ int min_group_size = 3;
+
+ /* dimension indexes we'll check for dependencies [a => b] */
+ int dima, dimb;
+
+ /* info for the interesting attributes only
+ *
+ * TODO Compute this only once and pass it to all the methods
+ * that need it.
+ */
+ VacAttrStats **stats = lookup_var_attr_stats(attrs, natts, vacattrstats);
+
+ /* We'll reuse the same array for all the combinations */
+ Datum * values = (Datum*)palloc0(numrows * 2 * sizeof(Datum));
+
+ Assert(numattrs >= 2);
+
+ /*
+ * Evaluate all possible combinations of [A => B], using a simple algorithm:
+ *
+ * (a) sort the data by [A,B]
+ * (b) split the data into groups by A (new group whenever a value changes)
+ * (c) count different values in the B column (again, value changes)
+ *
+ * TODO It should be rather simple to merge [A => B] and [A => C] into
+ * [A => B,C]. Just keep A constant, collect all the "implied" columns
+ * and you're done.
+ */
+ for (dima = 0; dima < numattrs; dima++)
+ {
+ for (dimb = 0; dimb < numattrs; dimb++)
+ {
+ Datum val_a, val_b;
+
+ /* number of groups supporting / contradicting the dependency */
+ int n_supporting = 0;
+ int n_contradicting = 0;
+
+ /* counters valid within a group */
+ int group_size = 0;
+ int n_violations = 0;
+
+ int n_supporting_rows = 0;
+ int n_contradicting_rows = 0;
+
+ /* make sure the columns are different (skip A => A) */
+ if (dima == dimb)
+ continue;
+
+ /* accumulate all the data for both columns into an array and sort it */
+ for (i = 0; i < numrows; i++)
+ {
+ values[i*2] = heap_getattr(rows[i], attrs->values[dima], stats[dima]->tupDesc, &isNull);
+ values[i*2+1] = heap_getattr(rows[i], attrs->values[dimb], stats[dimb]->tupDesc, &isNull);
+ }
+
+ qsort_arg((void *) values, numrows, sizeof(Datum) * 2, compare_scalars_memcmp, &len);
+
+ /*
+ * Walk through the array, split it into groups according to
+ * the A value, and count distinct values in the other column.
+ * If there's a single B value for the whole group, we count
+ * it as supporting the dependency, otherwise we count it
+ * as contradicting.
+ *
+ * Furthermore we require a group to have at least a certain
+ * number of rows to be considered useful for supporting the
+ * dependency. But when it's contradicting, we count it always.
+ */
+
+ /* start with values from the first row */
+ val_a = values[0];
+ val_b = values[1];
+ group_size = 1;
+
+ for (i = 1; i < numrows; i++)
+ {
+ if (values[2*i] != val_a) /* end of the group */
+ {
+ /*
+ * If there are no contradicting rows, count it as
+ * supporting (otherwise contradicting), but only if
+ * the group is large enough.
+ *
+ * The requirement of a minimum group size makes it
+ * impossible to identify [unique,unique] cases, but
+ * that's probably a different case. This is more
+ * about [zip => city] associations etc.
+ */
+ n_supporting += ((n_violations == 0) && (group_size >= min_group_size)) ? 1 : 0;
+ n_contradicting += (n_violations != 0) ? 1 : 0;
+
+ n_supporting_rows += ((n_violations == 0) && (group_size >= min_group_size)) ? group_size : 0;
+ n_contradicting_rows += (n_violations > 0) ? group_size : 0;
+
+ /* current values start a new group */
+ val_a = values[2*i];
+ val_b = values[2*i+1];
+ n_violations = 0;
+ group_size = 1;
+ }
+ else
+ {
+ if (values[2*i+1] != val_b) /* mismatch of a B value is contradicting */
+ {
+ val_b = values[2*i+1];
+ n_violations += 1;
+ }
+
+ group_size += 1;
+ }
+ }
+
+ /* handle the last group */
+ n_supporting += ((n_violations == 0) && (group_size >= min_group_size)) ? 1 : 0;
+ n_contradicting += (n_violations != 0) ? 1 : 0;
+ n_supporting_rows += ((n_violations == 0) && (group_size >= min_group_size)) ? group_size : 0;
+ n_contradicting_rows += (n_violations > 0) ? group_size : 0;
+
+ /*
+ * See if the number of rows supporting the association is at least
+ * 10x the number of rows violating the hypothetical dependency.
+ *
+ * TODO This is a rather arbitrary limit - I guess it's possible to do
+ * some math to come up with a better rule (e.g. testing a hypothesis
+ * 'this is due to randomness'). We can create a contingency table
+ * from the values and use it for testing. Possibly only when
+ * there are no contradicting rows?
+ *
+ * TODO Also, if (a => b) and (b => a) at the same time, it pretty much
+ * means the columns have the same values (or one is a 'label'),
+ * making the conditions rather redundant. Although it's possible
+ * that the query uses incompatible combination of values.
+ */
+ if (n_supporting_rows > (n_contradicting_rows * 10))
+ {
+ if (dependencies == NULL)
+ {
+ dependencies = (MVDependencies)palloc0(sizeof(MVDependenciesData));
+ dependencies->magic = MVSTAT_DEPS_MAGIC;
+ }
+ else
+ dependencies = repalloc(dependencies, offsetof(MVDependenciesData, deps)
+ + sizeof(MVDependency) * (dependencies->ndeps + 1));
+
+ /* add the new dependency to the list */
+ dependencies->deps[ndeps] = (MVDependency)palloc0(sizeof(MVDependencyData));
+ dependencies->deps[ndeps]->a = attrs->values[dima];
+ dependencies->deps[ndeps]->b = attrs->values[dimb];
+
+ dependencies->ndeps = (++ndeps);
+ }
+ }
+ }
+
+ pfree(values);
+
+ return dependencies;
+}
+
+/*
+ * Store the dependencies into a bytea, so that it can be stored in the
+ * pg_mv_statistic catalog.
+ *
+ * Currently this only supports simple two-column rules, and stores them
+ * as a sequence of attnum pairs. In the future, this needs to be made
+ * more complex to support multiple columns on both sides of the
+ * implication (using AND on left, OR on right).
+ */
+bytea *
+serialize_mv_dependencies(MVDependencies dependencies)
+{
+ int i;
+
+ /* we need to store ndeps, and each needs 2 * int16 */
+ Size len = VARHDRSZ + offsetof(MVDependenciesData, deps)
+ + dependencies->ndeps * (sizeof(int16) * 2);
+
+ bytea * output = (bytea*)palloc0(len);
+
+ char * tmp = VARDATA(output);
+
+ SET_VARSIZE(output, len);
+
+ /* first, store the number of dimensions / items */
+ memcpy(tmp, dependencies, offsetof(MVDependenciesData, deps));
+ tmp += offsetof(MVDependenciesData, deps);
+
+ /* walk through the dependencies and copy both columns into the bytea */
+ for (i = 0; i < dependencies->ndeps; i++)
+ {
+ memcpy(tmp, &(dependencies->deps[i]->a), sizeof(int16));
+ tmp += sizeof(int16);
+
+ memcpy(tmp, &(dependencies->deps[i]->b), sizeof(int16));
+ tmp += sizeof(int16);
+ }
+
+ return output;
+}
+
+/*
+ * Reads serialized dependencies into MVDependencies structure.
+ */
+MVDependencies
+deserialize_mv_dependencies(bytea * data)
+{
+ int i;
+ Size expected_size;
+ MVDependencies dependencies;
+ char *tmp;
+
+ if (data == NULL)
+ return NULL;
+
+ if (VARSIZE_ANY_EXHDR(data) < offsetof(MVDependenciesData,deps))
+ elog(ERROR, "invalid MVDependencies size %ld (expected at least %ld)",
+ VARSIZE_ANY_EXHDR(data), offsetof(MVDependenciesData,deps));
+
+ /* read the MVDependencies header */
+ dependencies = (MVDependencies)palloc0(sizeof(MVDependenciesData));
+
+ /* initialize pointer to the data part (skip the varlena header) */
+ tmp = VARDATA(data);
+
+ /* get the header and perform basic sanity checks */
+ memcpy(dependencies, tmp, offsetof(MVDependenciesData, deps));
+ tmp += offsetof(MVDependenciesData, deps);
+
+ if (dependencies->magic != MVSTAT_DEPS_MAGIC)
+ {
+ pfree(dependencies);
+ elog(WARNING, "not a MV Dependencies (magic number mismatch)");
+ return NULL;
+ }
+
+ Assert(dependencies->ndeps > 0);
+
+ /* what bytea size do we expect for those parameters */
+ expected_size = offsetof(MVDependenciesData,deps) +
+ dependencies->ndeps * sizeof(int16) * 2;
+
+ if (VARSIZE_ANY_EXHDR(data) != expected_size)
+ elog(ERROR, "invalid dependencies size %ld (expected %ld)",
+ VARSIZE_ANY_EXHDR(data), expected_size);
+
+ /* allocate space for the dependency pointers */
+ dependencies = repalloc(dependencies, offsetof(MVDependenciesData,deps)
+ + (dependencies->ndeps * sizeof(MVDependency)));
+
+ for (i = 0; i < dependencies->ndeps; i++)
+ {
+ dependencies->deps[i] = (MVDependency)palloc0(sizeof(MVDependencyData));
+
+ memcpy(&(dependencies->deps[i]->a), tmp, sizeof(int16));
+ tmp += sizeof(int16);
+
+ memcpy(&(dependencies->deps[i]->b), tmp, sizeof(int16));
+ tmp += sizeof(int16);
+ }
+
+ return dependencies;
+}
+
+/* print some basic info about dependencies (number of dependencies) */
+Datum
+pg_mv_stats_dependencies_info(PG_FUNCTION_ARGS)
+{
+ bytea *data = PG_GETARG_BYTEA_P(0);
+ char *result;
+
+ MVDependencies dependencies = deserialize_mv_dependencies(data);
+
+ if (dependencies == NULL)
+ PG_RETURN_NULL();
+
+ result = palloc0(128);
+ snprintf(result, 128, "dependencies=%d", dependencies->ndeps);
+
+ /* FIXME free the deserialized data (pfree is not enough) */
+
+ PG_RETURN_TEXT_P(cstring_to_text(result));
+}
+
+/* print the dependencies
+ *
+ * TODO Would be nice if this knew the actual column names (instead of
+ * the attnums).
+ *
+ * FIXME This is really ugly and does not really check the lengths and
+ * strcpy/snprintf return values properly. Needs to be fixed.
+ */
+Datum
+pg_mv_stats_dependencies_show(PG_FUNCTION_ARGS)
+{
+ int i = 0;
+ bytea *data = PG_GETARG_BYTEA_P(0);
+ char *result = NULL;
+ int len = 0;
+
+ MVDependencies dependencies = deserialize_mv_dependencies(data);
+
+ if (dependencies == NULL)
+ PG_RETURN_NULL();
+
+ for (i = 0; i < dependencies->ndeps; i++)
+ {
+ MVDependency dependency = dependencies->deps[i];
+ char buffer[128];
+
+ int tmp = snprintf(buffer, 128, "%s%d => %d",
+ ((i == 0) ? "" : ", "), dependency->a, dependency->b);
+
+ if (tmp < 127)
+ {
+ if (result == NULL)
+ result = palloc0(len + tmp + 1);
+ else
+ result = repalloc(result, len + tmp + 1);
+
+ strcpy(result + len, buffer);
+ len += tmp;
+ }
+ }
+
+ PG_RETURN_TEXT_P(cstring_to_text(result));
+}
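+
+/*
+ * For illustration, with dependencies [1 => 2] and [2 => 3] the
+ * function above emits the text "1 => 2, 2 => 3", where the numbers
+ * are the attnums of the columns involved.
+ */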
+
+bytea *
+fetch_mv_dependencies(Oid mvoid)
+{
+ Relation indrel;
+ SysScanDesc indscan;
+ ScanKeyData skey;
+ HeapTuple htup;
+ bytea *stadeps = NULL;
+
+ /* Prepare to scan pg_mv_statistic for the entry with OID = mvoid. */
+ ScanKeyInit(&skey,
+ ObjectIdAttributeNumber,
+ BTEqualStrategyNumber, F_OIDEQ,
+ ObjectIdGetDatum(mvoid));
+
+ indrel = heap_open(MvStatisticRelationId, AccessShareLock);
+ indscan = systable_beginscan(indrel, MvStatisticOidIndexId, true,
+ NULL, 1, &skey);
+
+ while (HeapTupleIsValid(htup = systable_getnext(indscan)))
+ {
+ bool isnull = false;
+ Datum deps = SysCacheGetAttr(MVSTATOID, htup,
+ Anum_pg_mv_statistic_stadeps, &isnull);
+
+ Assert(!isnull);
+
+ stadeps = DatumGetByteaP(deps);
+
+ break;
+ }
+
+ systable_endscan(indscan);
+
+ heap_close(indrel, AccessShareLock);
+
+ /* TODO Maybe save the result into relcache, as RelationGetIndexList
+ * (which served as an inspiration for this function) does? */
+
+ return stadeps;
+}
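+
+/*
+ * A typical caller combines this with deserialize_mv_dependencies(),
+ * e.g. (sketch, NULL handling omitted):
+ *
+ * MVDependencies deps
+ * = deserialize_mv_dependencies(fetch_mv_dependencies(mvoid));
+ *
+ * and then walks deps->deps[0 .. deps->ndeps - 1].
+ */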
diff --git a/src/include/catalog/indexing.h b/src/include/catalog/indexing.h
index a680229..048cd7c 100644
--- a/src/include/catalog/indexing.h
+++ b/src/include/catalog/indexing.h
@@ -173,6 +173,11 @@ DECLARE_UNIQUE_INDEX(pg_largeobject_loid_pn_index, 2683, on pg_largeobject using
DECLARE_UNIQUE_INDEX(pg_largeobject_metadata_oid_index, 2996, on pg_largeobject_metadata using btree(oid oid_ops));
#define LargeObjectMetadataOidIndexId 2996
+DECLARE_UNIQUE_INDEX(pg_mv_statistic_oid_index, 3277, on pg_mv_statistic using btree(oid oid_ops));
+#define MvStatisticOidIndexId 3277
+DECLARE_INDEX(pg_mv_statistic_relid_index, 3278, on pg_mv_statistic using btree(starelid oid_ops));
+#define MvStatisticRelidIndexId 3278
+
DECLARE_UNIQUE_INDEX(pg_namespace_nspname_index, 2684, on pg_namespace using btree(nspname name_ops));
#define NamespaceNameIndexId 2684
DECLARE_UNIQUE_INDEX(pg_namespace_oid_index, 2685, on pg_namespace using btree(oid oid_ops));
diff --git a/src/include/catalog/pg_mv_statistic.h b/src/include/catalog/pg_mv_statistic.h
new file mode 100644
index 0000000..76b7db7
--- /dev/null
+++ b/src/include/catalog/pg_mv_statistic.h
@@ -0,0 +1,69 @@
+/*-------------------------------------------------------------------------
+ *
+ * pg_mv_statistic.h
+ * definition of the system "multivariate statistic" relation (pg_mv_statistic)
+ * along with the relation's initial contents.
+ *
+ *
+ * Portions Copyright (c) 1996-2014, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/catalog/pg_mv_statistic.h
+ *
+ * NOTES
+ * the genbki.pl script reads this file and generates .bki
+ * information from the DATA() statements.
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef PG_MV_STATISTIC_H
+#define PG_MV_STATISTIC_H
+
+#include "catalog/genbki.h"
+
+/* ----------------
+ * pg_mv_statistic definition. cpp turns this into
+ * typedef struct FormData_pg_mv_statistic
+ * ----------------
+ */
+#define MvStatisticRelationId 3281
+
+CATALOG(pg_mv_statistic,3281)
+{
+ /* These fields form the unique key for the entry: */
+ Oid starelid; /* relation containing attributes */
+
+ /* statistics requested to build */
+ bool deps_enabled; /* analyze dependencies? */
+
+ /* statistics that are available (if requested) */
+ bool deps_built; /* dependencies were built */
+
+ /* variable-length fields start here, but we allow direct access to stakeys */
+ int2vector stakeys; /* array of column keys */
+
+#ifdef CATALOG_VARLEN
+ bytea stadeps; /* dependencies (serialized) */
+#endif
+
+} FormData_pg_mv_statistic;
+
+/* ----------------
+ * Form_pg_mv_statistic corresponds to a pointer to a tuple with
+ * the format of pg_mv_statistic relation.
+ * ----------------
+ */
+typedef FormData_pg_mv_statistic *Form_pg_mv_statistic;
+
+/* ----------------
+ * compiler constants for pg_mv_statistic
+ * ----------------
+ */
+#define Natts_pg_mv_statistic 5
+#define Anum_pg_mv_statistic_starelid 1
+#define Anum_pg_mv_statistic_deps_enabled 2
+#define Anum_pg_mv_statistic_deps_built 3
+#define Anum_pg_mv_statistic_stakeys 4
+#define Anum_pg_mv_statistic_stadeps 5
+
+#endif /* PG_MV_STATISTIC_H */
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
index 9edfdb8..9fb118a 100644
--- a/src/include/catalog/pg_proc.h
+++ b/src/include/catalog/pg_proc.h
@@ -2683,6 +2683,11 @@ DESCR("current user privilege on any column by rel name");
DATA(insert OID = 3029 ( has_any_column_privilege PGNSP PGUID 12 10 0 0 0 f f f f t f s 2 0 16 "26 25" _null_ _null_ _null_ _null_ has_any_column_privilege_id _null_ _null_ _null_ ));
DESCR("current user privilege on any column by rel oid");
+DATA(insert OID = 3284 ( pg_mv_stats_dependencies_info PGNSP PGUID 12 1 0 0 0 f f f f t f i 1 0 25 "17" _null_ _null_ _null_ _null_ pg_mv_stats_dependencies_info _null_ _null_ _null_ ));
+DESCR("multivariate stats: functional dependencies info");
+DATA(insert OID = 3285 ( pg_mv_stats_dependencies_show PGNSP PGUID 12 1 0 0 0 f f f f t f i 1 0 25 "17" _null_ _null_ _null_ _null_ pg_mv_stats_dependencies_show _null_ _null_ _null_ ));
+DESCR("multivariate stats: functional dependencies show");
+
DATA(insert OID = 1928 ( pg_stat_get_numscans PGNSP PGUID 12 1 0 0 0 f f f f t f s 1 0 20 "26" _null_ _null_ _null_ _null_ pg_stat_get_numscans _null_ _null_ _null_ ));
DESCR("statistics: number of scans done for table/index");
DATA(insert OID = 1929 ( pg_stat_get_tuples_returned PGNSP PGUID 12 1 0 0 0 f f f f t f s 1 0 20 "26" _null_ _null_ _null_ _null_ pg_stat_get_tuples_returned _null_ _null_ _null_ ));
diff --git a/src/include/catalog/toasting.h b/src/include/catalog/toasting.h
index cba4ae7..bf11005 100644
--- a/src/include/catalog/toasting.h
+++ b/src/include/catalog/toasting.h
@@ -49,6 +49,7 @@ extern void BootstrapToastTable(char *relName,
DECLARE_TOAST(pg_attrdef, 2830, 2831);
DECLARE_TOAST(pg_constraint, 2832, 2833);
DECLARE_TOAST(pg_description, 2834, 2835);
+DECLARE_TOAST(pg_mv_statistic, 3279, 3280);
DECLARE_TOAST(pg_proc, 2836, 2837);
DECLARE_TOAST(pg_rewrite, 2838, 2839);
DECLARE_TOAST(pg_seclabel, 3598, 3599);
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index 97ef0fc..f1d79eb 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -413,6 +413,7 @@ typedef enum NodeTag
T_XmlSerialize,
T_WithClause,
T_CommonTableExpr,
+ T_StatisticsDef,
/*
* TAGS FOR REPLICATION GRAMMAR PARSE NODES (replnodes.h)
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index b1dfa85..b8700dd 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -543,6 +543,14 @@ typedef struct ColumnDef
int location; /* parse location, or -1 if none/unknown */
} ColumnDef;
+typedef struct StatisticsDef
+{
+ NodeTag type;
+ List *keys; /* String nodes naming referenced column(s) */
+ List *options; /* list of DefElem nodes */
+} StatisticsDef;
+
+
/*
* TableLikeClause - CREATE TABLE ( ... LIKE ... ) clause
*/
@@ -1340,7 +1348,8 @@ typedef enum AlterTableType
AT_ReplicaIdentity, /* REPLICA IDENTITY */
AT_EnableRowSecurity, /* ENABLE ROW SECURITY */
AT_DisableRowSecurity, /* DISABLE ROW SECURITY */
- AT_GenericOptions /* OPTIONS (...) */
+ AT_GenericOptions, /* OPTIONS (...) */
+ AT_AddStatistics /* add statistics */
} AlterTableType;
typedef struct ReplicaIdentityStmt
diff --git a/src/include/utils/mvstats.h b/src/include/utils/mvstats.h
new file mode 100644
index 0000000..2b59c2d
--- /dev/null
+++ b/src/include/utils/mvstats.h
@@ -0,0 +1,86 @@
+/*-------------------------------------------------------------------------
+ *
+ * mvstats.h
+ * Multivariate statistics and selectivity estimation functions.
+ *
+ *
+ * Portions Copyright (c) 1996-2014, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/utils/mvstats.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef MVSTATS_H
+#define MVSTATS_H
+
+#include "commands/vacuum.h"
+
+/*
+ * Basic info about the stats, used when choosing what to use
+ *
+ * TODO Add info about what statistics is available (histogram, MCV,
+ * hashed MCV, functional dependencies).
+ */
+typedef struct MVStatsData {
+ Oid mvoid; /* OID of the stats in pg_mv_statistic */
+ int2vector *stakeys; /* attnums for columns in the stats */
+ bool deps_built; /* functional dependencies available */
+} MVStatsData;
+
+typedef struct MVStatsData *MVStats;
+
+
+#define MVSTATS_MAX_DIMENSIONS 8 /* max number of attributes */
+
+/* An associative rule, tracking [a => b] dependency.
+ *
+ * TODO Make this work with multiple columns on both sides.
+ */
+typedef struct MVDependencyData {
+ int16 a;
+ int16 b;
+} MVDependencyData;
+
+typedef MVDependencyData* MVDependency;
+
+typedef struct MVDependenciesData {
+ uint32 magic; /* magic constant marker */
+ int32 ndeps; /* number of dependencies */
+ MVDependency deps[1]; /* XXX why not a pointer? */
+} MVDependenciesData;
+
+typedef MVDependenciesData* MVDependencies;
+
+#define MVSTAT_DEPS_MAGIC 0xB4549A2C /* marks serialized bytea */
+#define MVSTAT_DEPS_TYPE_BASIC 1 /* basic dependencies type */
+
+/*
+ * TODO Maybe fetching the histogram/MCV list separately is inefficient?
+ * Consider adding a single `fetch_stats` method, fetching all
+ * stats specified using flags (or something like that).
+ */
+MVStats list_mv_stats(Oid relid, int *nstats, bool built_only);
+
+bytea * fetch_mv_dependencies(Oid mvoid);
+
+bytea * serialize_mv_dependencies(MVDependencies dependencies);
+
+/* deserialization of stats (serialization is private to analyze) */
+MVDependencies deserialize_mv_dependencies(bytea * data);
+
+/* FIXME this probably belongs somewhere else (not to operations stats) */
+extern Datum pg_mv_stats_dependencies_info(PG_FUNCTION_ARGS);
+extern Datum pg_mv_stats_dependencies_show(PG_FUNCTION_ARGS);
+
+MVDependencies
+build_mv_dependencies(int numrows, HeapTuple *rows,
+ int2vector *attrs,
+ int natts, VacAttrStats **vacattrstats);
+
+void build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
+ int natts, VacAttrStats **vacattrstats);
+
+void update_mv_stats(Oid relid, MVDependencies dependencies);
+
+#endif
diff --git a/src/include/utils/syscache.h b/src/include/utils/syscache.h
index ba0b090..12147ab 100644
--- a/src/include/utils/syscache.h
+++ b/src/include/utils/syscache.h
@@ -66,6 +66,7 @@ enum SysCacheIdentifier
INDEXRELID,
LANGNAME,
LANGOID,
+ MVSTATOID,
NAMESPACENAME,
NAMESPACEOID,
OPERNAMENSP,
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 80c3351..82c2659 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1353,6 +1353,14 @@ pg_matviews| SELECT n.nspname AS schemaname,
LEFT JOIN pg_namespace n ON ((n.oid = c.relnamespace)))
LEFT JOIN pg_tablespace t ON ((t.oid = c.reltablespace)))
WHERE (c.relkind = 'm'::"char");
+pg_mv_stats| SELECT n.nspname AS schemaname,
+ c.relname AS tablename,
+ s.stakeys AS attnums,
+ length(s.stadeps) AS depsbytes,
+ pg_mv_stats_dependencies_info(s.stadeps) AS depsinfo
+ FROM ((pg_mv_statistic s
+ JOIN pg_class c ON ((c.oid = s.starelid)))
+ LEFT JOIN pg_namespace n ON ((n.oid = c.relnamespace)));
pg_policies| SELECT n.nspname AS schemaname,
c.relname AS tablename,
pol.polname AS policyname,
diff --git a/src/test/regress/expected/sanity_check.out b/src/test/regress/expected/sanity_check.out
index c7be273..00f5fe7 100644
--- a/src/test/regress/expected/sanity_check.out
+++ b/src/test/regress/expected/sanity_check.out
@@ -113,6 +113,7 @@ pg_inherits|t
pg_language|t
pg_largeobject|t
pg_largeobject_metadata|t
+pg_mv_statistic|t
pg_namespace|t
pg_opclass|t
pg_operator|t
--
2.0.5
Attachment: 0002-clause-reduction-using-functional-dependencies.patch (text/x-diff)
From 4881b97548ed6fa8acbef153562da617ea7e58cb Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@pgaddict.com>
Date: Fri, 16 Jan 2015 22:33:41 +0100
Subject: [PATCH 2/4] clause reduction using functional dependencies
During planning, use functional dependencies to decide
which clauses to skip during cardinality estimation.
Initial and rather simplistic implementation.
FIX: second part of the rename to functional dependencies
FIX: don't build functional dependencies by default
FIX: build deps only when requested
FIX: use treat_as_join_clause() in clause_is_mv_compatible()
We don't want to process clauses that are used for joining,
but only simple WHERE clauses.
FIX: use planner_rt_fetch() to identify relation
The clause_is_mv_compatible() needs to identify the relation
(so that we can fetch the list of multivariate stats by OID).
planner_rt_fetch() seems like the appropriate way to get the
relation OID, but apparently it only works with simple vars.
Maybe examine_variable() would make this work with more
complex vars too?
FIX: comment about functional dependencies and transitivity
FIX: comment about multi-column functional dependencies
FIX: test: functional dependencies / ANALYZE
Test analyzing functional dependencies (part of ANALYZE)
on several datasets (no dependencies, no transitive
dependencies, ...).
FIX: test: clause reduction using function dependencies / EXPLAIN
Checks that a query with conditions on two columns, where one (B)
is functionally dependent on the other one (A), correctly ignores
the clause on (B) and chooses bitmap index scan instead of plain
index scan (which is what happens otherwise, thanks to assumption
of independence).
FIX: comment about building multi-column dependencies (TODO)
FIX: support functional dependencies on all data types
Until now build_mv_dependencies() only supported data types
passed by value (i.e. not varlena types or types passed by
reference). This commit adds support for these data types
by using SortSupport to do the sorting.
This however keeps the 'typbyval' assert in common.c:
Assert(stats[i]->attrtype->typbyval);
as that method is used for all types of multivariate stats
and we don't want to make this work for all of them. If
you want to play with functional dependencies on columns
with such data types, comment this assert out.
FIX: support NULL values in functional dependencies
FIX: typo in regression test of functional dependencies
FIX: added regression test for functional dependencies with TEXT
FIX: rework build_mv_dependencies() not to fail with mixed columns
FIX: readability improvement in build_mv_dependencies()
FIX: readability fixes in build_mv_dependencies()
FIX: regression test - dependencies with mix of data types / NULLs
FIX: minor formatting fixes in build_mv_dependencies()
FIX: comment about efficient building of multi-column dependencies
FIX: comment about proper NULL handling in build_mv_dependencies()
FIX: minor comment in build_mv_dependencies()
FIX: comment about handling NULLs like regular values (dependencies)
FIX: explanation of allocations in build_mv_dependencies()
FIX: move multisort typedefs/functions to common.h/c
FIX: check that at least some statistics were requested (dependencies)
FIX: comment about handling NULL values in dependencies
FIX: minor improvements in mvstat.h (functional dependencies)
FIX: add regression test for ADD STATISTICS options (dependencies)
FIX: comment about multivariate stats at clauselist_selectivity()
FIX: updated comments in clausesel.c (dependencies)
FIX: note in clauselist_selectivity()
FIX: fixed typo in tablecmds.c (comma -> semicolon)
FIX: make regression tests parallel-happy (functional dependencies)
---
src/backend/commands/tablecmds.c | 8 +-
src/backend/optimizer/path/clausesel.c | 476 +++++++++++++++++++++++++-
src/backend/utils/mvstats/common.c | 86 ++++-
src/backend/utils/mvstats/common.h | 22 ++
src/backend/utils/mvstats/dependencies.c | 170 +++++++--
src/include/utils/mvstats.h | 23 +-
src/test/regress/expected/mv_dependencies.out | 175 ++++++++++
src/test/regress/parallel_schedule | 3 +
src/test/regress/serial_schedule | 1 +
src/test/regress/sql/mv_dependencies.sql | 153 +++++++++
10 files changed, 1076 insertions(+), 41 deletions(-)
create mode 100644 src/test/regress/expected/mv_dependencies.out
create mode 100644 src/test/regress/sql/mv_dependencies.sql
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 3ec1a5a..3c82b89 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -11651,7 +11651,7 @@ static void ATExecAddStatistics(AlteredTableInfo *tab, Relation rel,
Relation mvstatrel;
/* by default build everything */
- bool build_dependencies = true;
+ bool build_dependencies = false;
Assert(IsA(def, StatisticsDef));
@@ -11713,6 +11713,12 @@ static void ATExecAddStatistics(AlteredTableInfo *tab, Relation rel,
opt->defname)));
}
+ /* check that at least some statistics were requested */
+ if (! build_dependencies)
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("no statistics type (dependencies) was requested")));
+
/* sort the attnums and build int2vector */
qsort(attnums, numcols, sizeof(int16), compare_int16);
stakeys = buildint2vector(attnums, numcols);
diff --git a/src/backend/optimizer/path/clausesel.c b/src/backend/optimizer/path/clausesel.c
index dcac1c1..36e5bce 100644
--- a/src/backend/optimizer/path/clausesel.c
+++ b/src/backend/optimizer/path/clausesel.c
@@ -24,6 +24,12 @@
#include "utils/lsyscache.h"
#include "utils/selfuncs.h"
+#include "utils/mvstats.h"
+#include "catalog/pg_collation.h"
+#include "utils/typcache.h"
+
+#include "parser/parsetree.h"
+
/*
* Data structure for accumulating info about possible range-query
@@ -43,6 +49,16 @@ static void addRangeClause(RangeQueryClause **rqlist, Node *clause,
bool varonleft, bool isLTsel, Selectivity s2);
+static bool clause_is_mv_compatible(PlannerInfo *root, Node *clause, Oid varRelid,
+ Oid *relid, AttrNumber *attnum, SpecialJoinInfo *sjinfo);
+
+static Bitmapset *collect_mv_attnums(PlannerInfo *root, List *clauses,
+ Oid varRelid, Oid *relid, SpecialJoinInfo *sjinfo);
+
+static List *clauselist_apply_dependencies(PlannerInfo *root, List *clauses,
+ Oid varRelid, int nmvstats, MVStats mvstats,
+ SpecialJoinInfo *sjinfo);
+
/****************************************************************************
* ROUTINES TO COMPUTE SELECTIVITIES
****************************************************************************/
@@ -61,7 +77,7 @@ static void addRangeClause(RangeQueryClause **rqlist, Node *clause,
* subclauses. However, that's only right if the subclauses have independent
* probabilities, and in reality they are often NOT independent. So,
* we want to be smarter where we can.
-
+ *
* Currently, the only extra smarts we have is to recognize "range queries",
* such as "x > 34 AND x < 42". Clauses are recognized as possible range
* query components if they are restriction opclauses whose operators have
@@ -88,6 +104,76 @@ static void addRangeClause(RangeQueryClause **rqlist, Node *clause,
*
* Of course this is all very dependent on the behavior of
* scalarltsel/scalargtsel; perhaps some day we can generalize the approach.
+ *
+ *
+ * Multivariate statistics
+ * -----------------------
+ * This also uses multivariate stats to estimate combinations of conditions,
+ * in a way attempting to minimize the overhead when there are no suitable
+ * multivariate stats.
+ *
+ * The following checks are performed (in this order), and the optimizer
+ * falls back to regular stats on the first 'false'.
+ *
+ * NOTE: This explains how this works with all the patches applied, not
+ * just the functional dependencies.
+ *
+ * (1) check that at least two columns are referenced from conditions
+ * compatible with multivariate stats
+ *
+ * If there are no conditions that might be handled by multivariate
+ * stats, or if the conditions reference just a single column, it
+ * makes no sense to use multivariate stats.
+ *
+ * What conditions are compatible with multivariate stats is decided
+ * by clause_is_mv_compatible(). At this moment, only simple conditions
+ * of the form "column operator constant" (for simple comparison
+ * operators), and IS NULL / IS NOT NULL are considered compatible
+ * with multivariate statistics.
+ *
+ * (2) reduce the clauses using functional dependencies
+ *
+ * This simply attempts to 'reduce' the clauses by applying functional
+ * dependencies. For example if there are two clauses:
+ *
+ * WHERE (a = 1) AND (b = 2)
+ *
+ * and we know that 'a' determines the value of 'b', we may remove
+ * the second condition (b = 2) when computing the selectivity.
+ * This is of course tricky - see mvstats/dependencies.c for details.
+ *
+ * After the reduction, step (1) is to be repeated.
+ *
+ * (3) check if there are multivariate stats built on the columns
+ *
+ * If there are no multivariate statistics, we have to fall back to
+ * the regular stats. We might perform checks (1) and (2) in reverse
+ * order, i.e. first check if there are multivariate statistics and
+ * then collect the attributes only if needed. The assumption is
+ * that checking the clauses is cheaper than querying the catalog,
+ * so this check is performed first.
+ *
+ * (4) choose the stats matching the most columns (at least two)
+ *
+ * If there are multiple instances of multivariate statistics (e.g.
+ * built on different sets of columns), we choose the stats covering
+ * the most columns from step (1). It may happen that all available
+ * stats match just a single column - for example with conditions
+ *
+ * WHERE a = 1 AND b = 2
+ *
+ * and statistics built on (a,c) and (b,c). In such a case just fall
+ * back to the regular stats because it makes no sense to use the
+ * multivariate statistics.
+ *
+ * This selection criterion (the most columns) is certainly very
+ * simple and definitely not optimal - it's simple to come up with
+ * examples where other approaches work better. More about this
+ * at choose_mv_statistics().
+ *
+ * (5) use the multivariate stats to estimate matching clauses
+ *
+ * (6) estimate the remaining clauses using the regular statistics
*/
Selectivity
clauselist_selectivity(PlannerInfo *root,
@@ -100,6 +186,14 @@ clauselist_selectivity(PlannerInfo *root,
RangeQueryClause *rqlist = NULL;
ListCell *l;
+ /* processing mv stats */
+ Oid relid = InvalidOid;
+ int nmvstats = 0;
+ MVStats mvstats = NULL;
+
+ /* attributes in mv-compatible clauses */
+ Bitmapset *mvattnums = NULL;
+
/*
* If there's exactly one clause, then no use in trying to match up pairs,
* so just go directly to clause_selectivity().
@@ -108,6 +202,28 @@ clauselist_selectivity(PlannerInfo *root,
return clause_selectivity(root, (Node *) linitial(clauses),
varRelid, jointype, sjinfo);
+ /* collect attributes referenced by mv-compatible clauses */
+ mvattnums = collect_mv_attnums(root, clauses, varRelid, &relid, sjinfo);
+
+ /*
+ * If there are mv-compatible clauses, referencing at least two
+ * different columns (otherwise it makes no sense to use mv stats),
+ * try to reduce the clauses using functional dependencies, and
+ * recollect the attributes from the reduced list.
+ *
+ * We don't need to select a single statistics for this - we can
+ * apply all the functional dependencies we have.
+ */
+ if (bms_num_members(mvattnums) >= 2)
+ {
+ /* fetch info from the catalog (not the serialized stats yet) */
+ mvstats = list_mv_stats(relid, &nmvstats, true);
+
+ /* reduce clauses by applying functional dependencies rules */
+ clauses = clauselist_apply_dependencies(root, clauses, varRelid,
+ nmvstats, mvstats, sjinfo);
+ }
+
/*
* Initial scan over clauses. Anything that doesn't look like a potential
* rangequery clause gets multiplied into s1 and forgotten. Anything that
@@ -782,3 +898,361 @@ clause_selectivity(PlannerInfo *root,
return s1;
}
+
+/*
+ * Collect attributes from mv-compatible clauses.
+ * Returns a bitmap of their attnums, or NULL if fewer than two.
+ */
+static Bitmapset *
+collect_mv_attnums(PlannerInfo *root, List *clauses, Oid varRelid,
+ Oid *relid, SpecialJoinInfo *sjinfo)
+{
+ Bitmapset *attnums = NULL;
+ ListCell *l;
+
+ /*
+ * Walk through the clauses and identify the ones we can estimate
+ * using multivariate stats, and remember the relid/columns. We'll
+ * then cross-check if we have suitable stats, and only if needed
+ * we'll split the clauses into multivariate and regular lists.
+ *
+ * For now we're only interested in RestrictInfo nodes with nested
+ * OpExpr, using either a range or equality.
+ */
+ foreach (l, clauses)
+ {
+ AttrNumber attnum;
+ Node *clause = (Node *) lfirst(l);
+
+ /* if the clause is mv-compatible, remember the attnum it references */
+ if (clause_is_mv_compatible(root, clause, varRelid, relid, &attnum, sjinfo))
+ attnums = bms_add_member(attnums, attnum);
+ }
+
+ /*
+ * If there are not at least two attributes referenced by the clause(s),
+ * we can throw everything out (as we'll revert to simple stats).
+ */
+ if (bms_num_members(attnums) <= 1)
+ {
+ if (attnums != NULL)
+ pfree(attnums);
+ attnums = NULL;
+ *relid = InvalidOid;
+ }
+
+ return attnums;
+}
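+
+/*
+ * For illustration, given
+ *
+ * WHERE (a = 1) AND (b < 10) AND (a + b = 3)
+ *
+ * this collects the attnums of "a" and "b" from the first two
+ * clauses; the third one is skipped because its left side is not a
+ * simple Var (see clause_is_mv_compatible).
+ */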
+
+/*
+ * Determines whether the clause is compatible with multivariate stats,
+ * and if it is, returns some additional information - varno (index
+ * into simple_rte_array) and a bitmap of attributes. This is then
+ * used to fetch related multivariate statistics.
+ *
+ * At this moment we only support basic conditions of the form
+ *
+ * variable OP constant
+ *
+ * where OP is one of [=,<,<=,>=,>] (which is however determined by
+ * looking at the associated function for estimating selectivity, just
+ * like with the single-dimensional case).
+ *
+ * TODO Support 'OR clauses' - shouldn't be all that difficult to
+ * evaluate them using multivariate stats.
+ */
+static bool
+clause_is_mv_compatible(PlannerInfo *root, Node *clause, Oid varRelid,
+ Oid *relid, AttrNumber *attnum, SpecialJoinInfo *sjinfo)
+{
+
+ if (IsA(clause, RestrictInfo))
+ {
+ RestrictInfo *rinfo = (RestrictInfo *) clause;
+
+ /* Pseudoconstants are not really interesting here. */
+ if (rinfo->pseudoconstant)
+ return false;
+
+ /* no support for OR clauses at this point */
+ if (rinfo->orclause)
+ return false;
+
+ /* get the actual clause from the RestrictInfo (it's not an OR clause) */
+ clause = (Node*)rinfo->clause;
+
+ /* only simple opclauses are compatible with multivariate stats */
+ if (! is_opclause(clause))
+ return false;
+
+ /* we don't support join conditions at this moment */
+ if (treat_as_join_clause(clause, rinfo, varRelid, sjinfo))
+ return false;
+
+ /* is it 'variable op constant' ? */
+ if (list_length(((OpExpr *) clause)->args) == 2)
+ {
+ OpExpr *expr = (OpExpr *) clause;
+ bool varonleft = true;
+ bool ok;
+
+ ok = (bms_membership(rinfo->clause_relids) == BMS_SINGLETON) &&
+ (is_pseudo_constant_clause_relids(lsecond(expr->args),
+ rinfo->right_relids) ||
+ (varonleft = false,
+ is_pseudo_constant_clause_relids(linitial(expr->args),
+ rinfo->left_relids)));
+
+ if (ok)
+ {
+ RangeTblEntry * rte;
+ Var * var = (varonleft) ? linitial(expr->args) : lsecond(expr->args);
+
+ /*
+ * Simple variables only - otherwise the planner_rt_fetch seems to fail
+ * (return NULL).
+ *
+ * TODO Maybe using examine_variable() would fix that?
+ */
+ if (! (IsA(var, Var) && (varRelid == 0 || varRelid == var->varno)))
+ return false;
+
+ /*
+ * Only consider this variable if (varRelid == 0) or when the varno
+ * matches varRelid (see explanation at clause_selectivity).
+ *
+ * FIXME I suspect this may not really be necessary. The (varRelid == 0)
+ * part seems to be enforced by treat_as_join_clause().
+ */
+ if (! ((varRelid == 0) || (varRelid == var->varno)))
+ return false;
+
+ /* Also skip special varno values, and system attributes ... */
+ if ((IS_SPECIAL_VARNO(var->varno)) || (! AttrNumberIsForUserDefinedAttr(var->varattno)))
+ return false;
+
+ /* Lookup info about the base relation (we need to pass the OID out) */
+ rte = planner_rt_fetch(var->varno, root);
+ *relid = rte->relid;
+
+ /*
+ * If it's not a "<" or ">" or "=" operator, just ignore the
+ * clause. Otherwise note the relid and attnum for the variable.
+ * This uses the function for estimating selectivity, not the
+ * operator directly (a bit awkward, but well ...).
+ */
+ switch (get_oprrest(expr->opno))
+ {
+ case F_SCALARLTSEL:
+ case F_SCALARGTSEL:
+ case F_EQSEL:
+ *attnum = var->varattno;
+ return true;
+ }
+ }
+ }
+ }
+
+ return false;
+
+}
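+
+/*
+ * Examples of the compatibility check (illustrative, varRelid = 0):
+ *
+ * (a = 1) - compatible (F_EQSEL)
+ * (a < 1) - compatible (F_SCALARLTSEL)
+ * (a = b) - not compatible (neither side is a pseudo-constant)
+ * (a = 1 OR b = 2) - not compatible (OR clauses not supported yet)
+ */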
+
+/*
+ * Performs reduction of clauses using functional dependencies, i.e.
+ * removes clauses that are considered redundant. It simply walks
+ * through dependencies, and checks whether the dependency 'matches'
+ * the clauses, i.e. if there's a clause matching the condition. If yes,
+ * all clauses matching the implied part of the dependency are removed
+ * from the list.
+ *
+ * This simply looks at attnums referenced by the clauses, not at the
+ * type of the operator (equality, inequality, ...). This may not be the
+ * right way to do it - it certainly works best for equalities, which are
+ * naturally consistent with functional dependencies (implications).
+ * It's not clear that other operators are handled sensibly - for
+ * example for inequalities, like
+ *
+ * WHERE (A >= 10) AND (B <= 20)
+ *
+ * and a trivial case where [A == B], resulting in symmetric pair of
+ * rules [A => B], [B => A], it's rather clear we can't remove either of
+ * those clauses.
+ *
+ * That only highlights that functional dependencies are most suitable
+ * for label-like data, where using non-equality operators is very rare.
+ * Using the common city/zipcode example, clauses like
+ *
+ * (zipcode <= 12345)
+ *
+ * or
+ *
+ * (cityname >= 'Washington')
+ *
+ * are rare. So restricting the reduction to equality should not harm
+ * the usefulness / applicability.
+ *
+ * The other assumption is that this assumes 'compatible' clauses. For
+ * example by using mismatching zip code and city name, this is unable
+ * to identify the discrepancy and eliminates one of the clauses. The
+ * usual approach (multiplying both selectivities) thus produces a more
+ * accurate estimate, although mostly by luck - the multiplication
+ * comes from assumption of statistical independence of the two
+ * conditions (which is not valid in this case), but moves the
+ * estimate in the right direction (towards 0%).
+ *
+ * This might be somewhat improved by cross-checking the selectivities
+ * against MCV and/or histogram.
+ *
+ * The implementation needs to be careful about cyclic rules, i.e. rules
+ * like [A => B] and [B => A] at the same time. This must not reduce
+ * clauses on both attributes at the same time.
+ *
+ * Technically we might consider selectivities here too, somehow. E.g.
+ * when (A => B) and (B => A), we might use the clauses with minimum
+ * selectivity.
+ *
+ * TODO Consider restricting the reduction to equality clauses. Or maybe
+ * use equality classes somehow?
+ *
+ * TODO Merge this docs to dependencies.c, as it's saying mostly the
+ * same things as the comments there.
+ */
+static List *
+clauselist_apply_dependencies(PlannerInfo *root, List *clauses, Oid varRelid,
+ int nmvstats, MVStats mvstats, SpecialJoinInfo *sjinfo)
+{
+ int i;
+ ListCell *lc;
+ List * reduced_clauses = NIL;
+ Oid relid;
+
+ /*
+ * preallocate space for all clauses, including non-mv-compatible,
+ * so that we don't need to reallocate the arrays repeatedly
+ */
+ bool *reduced = (bool*)palloc0(list_length(clauses) * sizeof(bool));
+ AttrNumber *mvattnums = (AttrNumber*)palloc0(list_length(clauses) * sizeof(AttrNumber));
+ Node **mvclauses = (Node**)palloc0(list_length(clauses) * sizeof(Node*));
+ int nmvclauses = 0; /* number of mv-compatible clauses */
+
+ /*
+ * Walk through the clauses - copy those that are not mv-compatible
+ * directly into the result list, and store the mv-compatible ones
+ * into an array of clauses (and remember the attnum in another array).
+ */
+ foreach (lc, clauses)
+ {
+ AttrNumber attnum;
+ Node *clause = (Node *) lfirst(lc);
+ if (! clause_is_mv_compatible(root, clause, varRelid, &relid, &attnum, sjinfo))
+ reduced_clauses = lappend(reduced_clauses, clause);
+ else
+ {
+ mvclauses[nmvclauses] = clause;
+ mvattnums[nmvclauses] = attnum;
+ nmvclauses++;
+ }
+ }
+
+ Assert(nmvclauses >= 2);
+
+ /* walk through all the mvstats, and try to apply all the rules */
+ for (i = 0; i < nmvstats; i++)
+ {
+ int j;
+ MVDependencies dependencies = NULL;
+
+ /* skip stats without functional dependencies built */
+ if (! mvstats[i].deps_built)
+ continue;
+
+ /* fetch dependencies */
+ dependencies = deserialize_mv_dependencies(fetch_mv_dependencies(mvstats[i].mvoid));
+ if (dependencies == NULL)
+ continue;
+
+ /*
+ * Walk through the dependencies and eliminate all the implied
+ * clauses, i.e. when there's a rule [A => B], and if we find
+ * a clause referencing column A (not yet eliminated), eliminate
+ * all clauses referencing "B".
+ *
+ * This is imperfect for a number of reasons. First, this greedy
+ * approach does not guarantee eliminating the most clauses.
+ * For example consider dependency [A => B] and [B => A], and
+ * three clauses referencing A, A and B, i.e. something like
+ *
+ * WHERE (A >= 10) AND (A <= 20) AND (B = 20)
+ *
+ * Then by considering the dependency [A => B] a single clause
+ * on B is eliminated, while by considering [B => A], both
+ * clauses on A are eliminated.
+ *
+ * The order in which dependencies are applied may be due either
+ * to ordering within a single pg_mv_statistic record, or to rules
+ * placed in different records.
+ *
+ * Possible solutions:
+ *
+ * (a) backtracking/recursion, with tracking of how many clauses
+ * were eliminated
+ *
+ * (b) building adjacency matrix (where A and B are adjacent
+ * when [A => B]), and multiplying it to construct
+ * transitive implications. I.e. by having [A=>B] and [B=>C]
+ * this also results in [A=>C]. Then we can simply choose
+ * the attribute that eliminates the most clauses (and
+ * repeat).
+ *
+ * We don't expect enough clauses for this to result in long
+ * runtimes.
+ *
+ * This may also merge all the dependencies, possibly leading to
+ * longer sequences of transitive dependencies.
+ *
+ * E.g. rule [A=>B] in one pg_mv_statistic record and [B=>C] in
+ * another one results in [A=>C], which can't be deduced if the
+ * records are considered separately.
+ */
+ for (j = 0; j < dependencies->ndeps; j++)
+ {
+ int k;
+ bool applicable = false;
+
+ /* is there a not-yet-eliminated clause on 'A'? */
+ for (k = 0; k < nmvclauses; k++)
+ {
+ /* clause on 'A' and not yet eliminated */
+ if ((! reduced[k]) && (mvattnums[k] == dependencies->deps[j]->a))
+ {
+ applicable = true; /* we can apply this rule */
+ break;
+ }
+ }
+
+ /* if the rule is not applicable, skip to the next one */
+ if (! applicable)
+ continue;
+
+ /* eliminate all clauses on 'B' */
+ for (k = 0; k < nmvclauses; k++)
+ {
+ if (mvattnums[k] == dependencies->deps[j]->b)
+ reduced[k] = true;
+ }
+ }
+ }
+
+ /* now walk through the clauses, and keep those that were not reduced */
+ for (i = 0; i < nmvclauses; i++)
+ {
+ if (! reduced[i])
+ reduced_clauses = lappend(reduced_clauses, mvclauses[i]);
+ }
+
+ pfree(reduced);
+ pfree(mvclauses);
+ pfree(mvattnums);
+
+ return reduced_clauses;
+}
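+
+/*
+ * A worked example (illustrative): clauses [(a = 1), (b = 2), (c = 3)]
+ * and a single dependency [a => b]. There is a not-yet-reduced clause
+ * on "a", so the rule applies and the clause on "b" is marked as
+ * reduced, giving the result [(a = 1), (c = 3)].
+ */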
diff --git a/src/backend/utils/mvstats/common.c b/src/backend/utils/mvstats/common.c
index 36757d5..0edaaa6 100644
--- a/src/backend/utils/mvstats/common.c
+++ b/src/backend/utils/mvstats/common.c
@@ -50,7 +50,8 @@ build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
/*
* Analyze functional dependencies of columns.
*/
- deps = build_mv_dependencies(numrows, rows, attrs, natts, vacattrstats);
+ if (mvstats->deps_enabled)
+ deps = build_mv_dependencies(numrows, rows, attrs, natts, vacattrstats);
/* store the histogram / MCV list in the catalog */
update_mv_stats(mvstats[i].mvoid, deps);
@@ -147,6 +148,7 @@ list_mv_stats(Oid relid, int *nstats, bool built_only)
result[*nstats].mvoid = HeapTupleGetOid(htup);
result[*nstats].stakeys = buildint2vector(stats->stakeys.values, stats->stakeys.dim1);
+ result[*nstats].deps_enabled = stats->deps_enabled;
result[*nstats].deps_built = stats->deps_built;
*nstats += 1;
}
@@ -270,3 +272,85 @@ compare_scalars_memcmp_2(const void *a, const void *b)
{
return memcmp(a, b, sizeof(Datum));
}
+
+
+/* initialize multi-dimensional sort */
+MultiSortSupport
+multi_sort_init(int ndims)
+{
+ MultiSortSupport mss;
+
+ Assert(ndims >= 2);
+
+ mss = (MultiSortSupport)palloc0(offsetof(MultiSortSupportData, ssup)
+ + sizeof(SortSupportData)*ndims);
+
+ mss->ndims = ndims;
+
+ return mss;
+}
+
+/*
+ * add sort info for dimension 'dim' (index into vacattrstats) to mss,
+ * at position 'sortdim'
+ */
+void
+multi_sort_add_dimension(MultiSortSupport mss, int sortdim,
+ int dim, VacAttrStats **vacattrstats)
+{
+ /* first, lookup StdAnalyzeData for the dimension (attribute) */
+ SortSupportData ssup;
+ StdAnalyzeData *tmp = (StdAnalyzeData *)vacattrstats[dim]->extra_data;
+
+ Assert(mss != NULL);
+ Assert(sortdim < mss->ndims);
+
+ /* initialize sort support, etc. */
+ memset(&ssup, 0, sizeof(ssup));
+ ssup.ssup_cxt = CurrentMemoryContext;
+
+ /* We always use the default collation for statistics */
+ ssup.ssup_collation = DEFAULT_COLLATION_OID;
+ ssup.ssup_nulls_first = false;
+
+ PrepareSortSupportFromOrderingOp(tmp->ltopr, &ssup);
+
+ mss->ssup[sortdim] = ssup;
+}
+
+/* compare all the dimensions in the selected order */
+int
+multi_sort_compare(const void *a, const void *b, void *arg)
+{
+ int i;
+ SortItem *ia = (SortItem*)a;
+ SortItem *ib = (SortItem*)b;
+
+ MultiSortSupport mss = (MultiSortSupport)arg;
+
+ for (i = 0; i < mss->ndims; i++)
+ {
+ int compare;
+
+ compare = ApplySortComparator(ia->values[i], ia->isnull[i],
+ ib->values[i], ib->isnull[i],
+ &mss->ssup[i]);
+
+ if (compare != 0)
+ return compare;
+
+ }
+
+ /* equal by default */
+ return 0;
+}
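+
+/*
+ * Usage sketch (matching how dependencies.c drives the sort):
+ *
+ * MultiSortSupport mss = multi_sort_init(2);
+ *
+ * multi_sort_add_dimension(mss, 0, dima, vacattrstats);
+ * multi_sort_add_dimension(mss, 1, dimb, vacattrstats);
+ *
+ * qsort_arg(items, numrows, sizeof(SortItem),
+ * multi_sort_compare, mss);
+ */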
+
+/* compare selected dimension */
+int
+multi_sort_compare_dim(int dim, const SortItem *a, const SortItem *b,
+ MultiSortSupport mss)
+{
+ return ApplySortComparator(a->values[dim], a->isnull[dim],
+ b->values[dim], b->isnull[dim],
+ &mss->ssup[dim]);
+}
diff --git a/src/backend/utils/mvstats/common.h b/src/backend/utils/mvstats/common.h
index f511c4e..b98ceb7 100644
--- a/src/backend/utils/mvstats/common.h
+++ b/src/backend/utils/mvstats/common.h
@@ -59,6 +59,28 @@ typedef struct
int *tupnoLink;
} CompareScalarsContext;
+/* multi-sort */
+typedef struct MultiSortSupportData {
+ int ndims; /* number of dimensions supported by the sort */
+ SortSupportData ssup[1]; /* sort support data for each dimension */
+} MultiSortSupportData;
+
+typedef MultiSortSupportData* MultiSortSupport;
+
+typedef struct SortItem {
+ Datum *values;
+ bool *isnull;
+} SortItem;
+
+MultiSortSupport multi_sort_init(int ndims);
+
+void multi_sort_add_dimension(MultiSortSupport mss, int sortdim,
+ int dim, VacAttrStats **vacattrstats);
+
+int multi_sort_compare(const void *a, const void *b, void *arg);
+
+int multi_sort_compare_dim(int dim, const SortItem *a,
+ const SortItem *b, MultiSortSupport mss);
VacAttrStats ** lookup_var_attr_stats(int2vector *attrs,
int natts, VacAttrStats **vacattrstats);
diff --git a/src/backend/utils/mvstats/dependencies.c b/src/backend/utils/mvstats/dependencies.c
index b900efd..93a2fa6 100644
--- a/src/backend/utils/mvstats/dependencies.c
+++ b/src/backend/utils/mvstats/dependencies.c
@@ -15,6 +15,7 @@
*/
#include "common.h"
+#include "utils/lsyscache.h"
/*
* Mine functional dependencies between columns, in the form (A => B),
@@ -56,6 +57,20 @@
* columns on the 'left' side, i.e. a condition for the dependency.
* That is dependencies [A,B] => C and so on.
*
+ * TODO The implementation may/should be smart enough not to mine both
+ * [A => B] and [A,C => B], because the second dependency is a
+ * consequence of the first one (if values of A determine values
+ * of B, adding another column won't change that). The ANALYZE
+ * should first analyze 1:1 dependencies, then 2:1 dependencies
+ * (and skip the already identified ones), etc.
+ *
+ * For example the dependency [city name => zip code] is much weaker
+ * than [city name, state name => zip code], because there may be
+ * multiple cities with the same name in various states. It's not
+ * perfect though - there are probably cities with the same name within
+ * the same state, but that is hopefully a relatively rare occurrence.
+ * More about this in the section about dependency mining.
+ *
* Handling multiple columns on the right side is not necessary, as such
* dependencies may be decomposed into a set of dependencies with
* the same meaning, one for each column on the right side. For example
@@ -163,19 +178,61 @@
* FIXME Not sure if this handles NULL values properly (not sure how to
* do that). We assume that NULL means 0 for now, handling it just
* like any other value.
+ *
+ * FIXME This builds a complete set of dependencies, i.e. including
+ * transitive dependencies - if we identify [A => B] and [B => C],
+ * we're likely to identify [A => C] too. It might be better to
+ * keep only the minimal set of dependencies, i.e. prune all the
+ * dependencies that we can recreate by transitivity.
+ *
+ * There are two conceptual ways to do that:
+ *
+ * (a) generate all the rules, and then prune the rules that may
+ * be recreated by combining other dependencies, or
+ *
+ * (b) perform the 'is combination of other dependencies' check
+ * before actually doing the work
+ *
+ * The second option has the advantage that we don't really need
+ * to perform the sort/count. It's not sufficient alone, though,
+ * because we may discover the dependencies in the wrong order.
+ * For example [A => B], [A => C] and then [B => C]. None of those
+ * dependencies is a combination of the already known ones, yet
+ * [A => C] is a combination of [A => B] and [B => C].
+ *
+ * TODO Not sure the current NULL handling makes much sense. It's
+ * handled like a regular value (NULL == NULL), so all NULLs in
+ * a single column form a single group. Maybe that's not the right
+ * thing to do, especially with equality conditions - in that case
+ * NULLs are irrelevant. So maybe the right solution would be to
+ * just ignore NULL values here?
+ *
+ * However simply "ignoring" the NULL values does not seem like
+ * a good idea - imagine columns A and B, where for each value of
+ * A, values in B are constant (same for the whole group) or NULL.
+ * Let's say only 10% of B values in each group is not NULL. Then
+ * ignoring the NULL values will result in 10x misestimate (and
+ * it's trivial to construct arbitrary errors). So maybe handling
+ * NULL values just like a regular value is the right thing here.
+ *
+ * Or maybe NULL values should be treated differently on each side
+ * of the dependency? E.g. as ignored on the left (condition) and
+ * as regular values on the right - this seems consistent with how
+ * equality clauses work, as an equality clause implies NOT NULL.
+ * So if we say [A => B] then it may also imply "NOT NULL" on the
+ * right side.
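+ *
+ * As a small illustration of the mining itself: sorted pairs
+ * (A,B) = (1,x), (1,x), (2,y), (2,z) form two groups by A. The
+ * group A=1 contains a single B value and thus supports [A => B]
+ * (assuming it reaches min_group_size), while the group A=2
+ * contains two different B values and thus contradicts it.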
*/
MVDependencies
build_mv_dependencies(int numrows, HeapTuple *rows, int2vector *attrs,
int natts, VacAttrStats **vacattrstats)
{
int i;
- bool isNull;
- Size len = 2 * sizeof(Datum); /* only simple associations a => b */
int numattrs = attrs->dim1;
/* result */
int ndeps = 0;
MVDependencies dependencies = NULL;
+ MultiSortSupport mss = multi_sort_init(2); /* 2 dimensions for now */
/* TODO Maybe this should be somehow related to the number of
* distinct values in the two columns we're currently analyzing.
@@ -195,8 +252,24 @@ build_mv_dependencies(int numrows, HeapTuple *rows, int2vector *attrs,
*/
VacAttrStats **stats = lookup_var_attr_stats(attrs, natts, vacattrstats);
- /* We'll reuse the same array for all the combinations */
- Datum * values = (Datum*)palloc0(numrows * 2 * sizeof(Datum));
+ /*
+ * We'll reuse the same array for all the 2-column combinations.
+ *
+ * It's possible to sort the sample rows directly, but this seemed
+ * somewhat simpler / less error prone. Another option would be to
+ * allocate the arrays for each SortItem separately, but that'd be
+ * significant overhead (not just CPU, but especially memory bloat).
+ */
+ SortItem * items = (SortItem*)palloc0(numrows * sizeof(SortItem));
+
+ Datum *values = (Datum*)palloc0(sizeof(Datum) * numrows * 2);
+ bool *isnull = (bool*)palloc0(sizeof(bool) * numrows * 2);
+
+ for (i = 0; i < numrows; i++)
+ {
+ items[i].values = &values[i * 2];
+ items[i].isnull = &isnull[i * 2];
+ }
Assert(numattrs >= 2);
@@ -213,9 +286,12 @@ build_mv_dependencies(int numrows, HeapTuple *rows, int2vector *attrs,
*/
for (dima = 0; dima < numattrs; dima++)
{
+ /* prepare the sort function for the first dimension */
+ multi_sort_add_dimension(mss, 0, dima, vacattrstats);
+
for (dimb = 0; dimb < numattrs; dimb++)
{
- Datum val_a, val_b;
+ SortItem current;
/* number of groups supporting / contradicting the dependency */
int n_supporting = 0;
@@ -232,14 +308,27 @@ build_mv_dependencies(int numrows, HeapTuple *rows, int2vector *attrs,
if (dima == dimb)
continue;
+ /* prepare the sort function for the second dimension */
+ multi_sort_add_dimension(mss, 1, dimb, vacattrstats);
+
+ /* reset the values and isnull flags */
+ memset(values, 0, sizeof(Datum) * numrows * 2);
+ memset(isnull, 0, sizeof(bool) * numrows * 2);
+
/* accumulate all the data for both columns into an array and sort it */
for (i = 0; i < numrows; i++)
{
- values[i*2] = heap_getattr(rows[i], attrs->values[dima], stats[dima]->tupDesc, &isNull);
- values[i*2+1] = heap_getattr(rows[i], attrs->values[dimb], stats[dimb]->tupDesc, &isNull);
+ items[i].values[0]
+ = heap_getattr(rows[i], attrs->values[dima],
+ stats[dima]->tupDesc, &items[i].isnull[0]);
+
+ items[i].values[1]
+ = heap_getattr(rows[i], attrs->values[dimb],
+ stats[dimb]->tupDesc, &items[i].isnull[1]);
}
- qsort_arg((void *) values, numrows, sizeof(Datum) * 2, compare_scalars_memcmp, &len);
+ qsort_arg((void *) items, numrows, sizeof(SortItem),
+ multi_sort_compare, mss);
/*
* Walk through the array, split it into rows according to
@@ -254,13 +343,13 @@ build_mv_dependencies(int numrows, HeapTuple *rows, int2vector *attrs,
*/
/* start with values from the first row */
- val_a = values[0];
- val_b = values[1];
+ current = items[0];
group_size = 1;
for (i = 1; i < numrows; i++)
{
- if (values[2*i] != val_a) /* end of the group */
+ /* end of the group */
+ if (multi_sort_compare_dim(0, &items[i], &current, mss) != 0)
{
/*
* If there are no contradicting rows, count it as
@@ -271,36 +360,49 @@ build_mv_dependencies(int numrows, HeapTuple *rows, int2vector *attrs,
* impossible to identify [unique,unique] cases, but
* that's probably a different case. This is more
* about [zip => city] associations etc.
+ *
+ * If there are violations, count the group/rows as
+ * a violation.
+ *
+ * It may be neither, if the group is too small (does
+ * not contain at least min_group_size rows).
*/
- n_supporting += ((n_violations == 0) && (group_size >= min_group_size)) ? 1 : 0;
- n_contradicting += (n_violations != 0) ? 1 : 0;
-
- n_supporting_rows += ((n_violations == 0) && (group_size >= min_group_size)) ? group_size : 0;
- n_contradicting_rows += (n_violations > 0) ? group_size : 0;
+ if ((n_violations == 0) && (group_size >= min_group_size))
+ {
+ n_supporting += 1;
+ n_supporting_rows += group_size;
+ }
+ else if (n_violations > 0)
+ {
+ n_contradicting += 1;
+ n_contradicting_rows += group_size;
+ }
/* current values start a new group */
- val_a = values[2*i];
- val_b = values[2*i+1];
n_violations = 0;
- group_size = 1;
+ group_size = 0;
}
- else
+ /* mismatch of a B value is contradicting */
+ else if (multi_sort_compare_dim(1, &items[i], &current, mss) != 0)
{
- if (values[2*i+1] != val_b) /* mismatch of a B value is contradicting */
- {
- val_b = values[2*i+1];
- n_violations += 1;
- }
-
- group_size += 1;
+ n_violations += 1;
}
+
+ current = items[i];
+ group_size += 1;
}
- /* handle the last group */
- n_supporting += ((n_violations == 0) && (group_size >= min_group_size)) ? 1 : 0;
- n_contradicting += (n_violations != 0) ? 1 : 0;
- n_supporting_rows += ((n_violations == 0) && (group_size >= min_group_size)) ? group_size : 0;
- n_contradicting_rows += (n_violations > 0) ? group_size : 0;
+ /* handle the last group (just like above) */
+ if ((n_violations == 0) && (group_size >= min_group_size))
+ {
+ n_supporting += 1;
+ n_supporting_rows += group_size;
+ }
+ else if (n_violations)
+ {
+ n_contradicting += 1;
+ n_contradicting_rows += group_size;
+ }
/*
* See if the number of rows supporting the association is at least
@@ -338,7 +440,11 @@ build_mv_dependencies(int numrows, HeapTuple *rows, int2vector *attrs,
}
}
+ pfree(items);
pfree(values);
+ pfree(isnull);
+ pfree(stats);
+ pfree(mss);
return dependencies;
}
diff --git a/src/include/utils/mvstats.h b/src/include/utils/mvstats.h
index 2b59c2d..a074253 100644
--- a/src/include/utils/mvstats.h
+++ b/src/include/utils/mvstats.h
@@ -18,24 +18,34 @@
/*
* Basic info about the stats, used when choosing what to use
- *
- * TODO Add info about what statistics is available (histogram, MCV,
- * hashed MCV, functional dependencies).
*/
typedef struct MVStatsData {
Oid mvoid; /* OID of the stats in pg_mv_statistic */
int2vector *stakeys; /* attnums for columns in the stats */
+
+ /* statistics requested in ALTER TABLE ... ADD STATISTICS */
+ bool deps_enabled; /* analyze functional dependencies */
+
+ /* available statistics (computed by ANALYZE) */
bool deps_built; /* functional dependencies available */
} MVStatsData;
typedef struct MVStatsData *MVStats;
+/*
+ * Degree of how much MCV item / histogram bucket matches a clause.
+ * This is then considered when computing the selectivity.
+ */
+#define MVSTATS_MATCH_NONE 0 /* no match at all */
+#define MVSTATS_MATCH_PARTIAL 1 /* partial match */
+#define MVSTATS_MATCH_FULL 2 /* full match */
#define MVSTATS_MAX_DIMENSIONS 8 /* max number of attributes */
-/* An associative rule, tracking [a => b] dependency.
- *
- * TODO Make this work with multiple columns on both sides.
+
+/*
+ * Functional dependencies, tracking column-level relationships (values
+ * in one column determine values in another one).
*/
typedef struct MVDependencyData {
int16 a;
@@ -61,6 +71,7 @@ typedef MVDependenciesData* MVDependencies;
* stats specified using flags (or something like that).
*/
MVStats list_mv_stats(Oid relid, int *nstats, bool built_only);
+bytea * fetch_mv_rules(Oid mvoid);
bytea * fetch_mv_dependencies(Oid mvoid);
diff --git a/src/test/regress/expected/mv_dependencies.out b/src/test/regress/expected/mv_dependencies.out
new file mode 100644
index 0000000..dbfb5cf
--- /dev/null
+++ b/src/test/regress/expected/mv_dependencies.out
@@ -0,0 +1,175 @@
+-- data type passed by value
+CREATE TABLE functional_dependencies (
+ a INT,
+ b INT,
+ c INT
+);
+-- unknown column
+ALTER TABLE functional_dependencies ADD STATISTICS (dependencies) ON (unknown_column);
+ERROR: column "unknown_column" referenced in statistics does not exist
+-- single column
+ALTER TABLE functional_dependencies ADD STATISTICS (dependencies) ON (a);
+ERROR: multivariate stats require 2 or more columns
+-- single column, duplicated
+ALTER TABLE functional_dependencies ADD STATISTICS (dependencies) ON (a, a);
+ERROR: duplicate column name in statistics definition
+-- two columns, one duplicated
+ALTER TABLE functional_dependencies ADD STATISTICS (dependencies) ON (a, a, b);
+ERROR: duplicate column name in statistics definition
+-- unknown option
+ALTER TABLE functional_dependencies ADD STATISTICS (unknown_option) ON (a, b, c);
+ERROR: unrecognized STATISTICS option "unknown_option"
+-- correct command
+ALTER TABLE functional_dependencies ADD STATISTICS (dependencies) ON (a, b, c);
+-- random data (no functional dependencies)
+INSERT INTO functional_dependencies
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+ANALYZE functional_dependencies;
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+ deps_enabled | deps_built | pg_mv_stats_dependencies_show
+--------------+------------+-------------------------------
+ t | f |
+(1 row)
+
+TRUNCATE functional_dependencies;
+-- a => b, a => c, b => c
+INSERT INTO functional_dependencies
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE functional_dependencies;
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+ deps_enabled | deps_built | pg_mv_stats_dependencies_show
+--------------+------------+-------------------------------
+ t | t | 1 => 2, 1 => 3, 2 => 3
+(1 row)
+
+TRUNCATE functional_dependencies;
+-- a => b, a => c
+INSERT INTO functional_dependencies
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE functional_dependencies;
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+ deps_enabled | deps_built | pg_mv_stats_dependencies_show
+--------------+------------+-------------------------------
+ t | t | 1 => 2, 1 => 3
+(1 row)
+
+TRUNCATE functional_dependencies;
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO functional_dependencies
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX fdeps_idx ON functional_dependencies (a, b);
+ANALYZE functional_dependencies;
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+ deps_enabled | deps_built | pg_mv_stats_dependencies_show
+--------------+------------+-------------------------------
+ t | t | 1 => 2, 1 => 3, 2 => 3
+(1 row)
+
+EXPLAIN (COSTS off)
+ SELECT * FROM functional_dependencies WHERE a = 10 AND b = 5;
+ QUERY PLAN
+---------------------------------------------
+ Bitmap Heap Scan on functional_dependencies
+ Recheck Cond: ((a = 10) AND (b = 5))
+ -> Bitmap Index Scan on fdeps_idx
+ Index Cond: ((a = 10) AND (b = 5))
+(4 rows)
+
+DELETE FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+DROP TABLE functional_dependencies;
+-- varlena type (text)
+CREATE TABLE functional_dependencies (
+ a TEXT,
+ b TEXT,
+ c TEXT
+);
+ALTER TABLE functional_dependencies ADD STATISTICS (dependencies) ON (a, b, c);
+-- random data (no functional dependencies)
+INSERT INTO functional_dependencies
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+ANALYZE functional_dependencies;
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+ deps_enabled | deps_built | pg_mv_stats_dependencies_show
+--------------+------------+-------------------------------
+ t | f |
+(1 row)
+
+TRUNCATE functional_dependencies;
+-- a => b, a => c, b => c
+INSERT INTO functional_dependencies
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE functional_dependencies;
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+ deps_enabled | deps_built | pg_mv_stats_dependencies_show
+--------------+------------+-------------------------------
+ t | t | 1 => 2, 1 => 3, 2 => 3
+(1 row)
+
+TRUNCATE functional_dependencies;
+-- a => b, a => c
+INSERT INTO functional_dependencies
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE functional_dependencies;
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+ deps_enabled | deps_built | pg_mv_stats_dependencies_show
+--------------+------------+-------------------------------
+ t | t | 1 => 2, 1 => 3
+(1 row)
+
+TRUNCATE functional_dependencies;
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO functional_dependencies
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX fdeps_idx ON functional_dependencies (a, b);
+ANALYZE functional_dependencies;
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+ deps_enabled | deps_built | pg_mv_stats_dependencies_show
+--------------+------------+-------------------------------
+ t | t | 1 => 2, 1 => 3, 2 => 3
+(1 row)
+
+EXPLAIN (COSTS off)
+ SELECT * FROM functional_dependencies WHERE a = '10' AND b = '5';
+ QUERY PLAN
+------------------------------------------------------------
+ Bitmap Heap Scan on functional_dependencies
+ Recheck Cond: ((a = '10'::text) AND (b = '5'::text))
+ -> Bitmap Index Scan on fdeps_idx
+ Index Cond: ((a = '10'::text) AND (b = '5'::text))
+(4 rows)
+
+DELETE FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+DROP TABLE functional_dependencies;
+-- NULL values (mix of int and text columns)
+CREATE TABLE functional_dependencies (
+ a INT,
+ b TEXT,
+ c INT,
+ d TEXT
+);
+ALTER TABLE functional_dependencies ADD STATISTICS (dependencies) ON (a, b, c, d);
+INSERT INTO functional_dependencies
+ SELECT
+ mod(i, 100),
+ (CASE WHEN mod(i, 200) = 0 THEN NULL ELSE mod(i,200) END),
+ mod(i, 400),
+ (CASE WHEN mod(i, 300) = 0 THEN NULL ELSE mod(i,600) END)
+ FROM generate_series(1,10000) s(i);
+ANALYZE functional_dependencies;
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+ deps_enabled | deps_built | pg_mv_stats_dependencies_show
+--------------+------------+----------------------------------------
+ t | t | 2 => 1, 3 => 1, 3 => 2, 4 => 1, 4 => 2
+(1 row)
+
+DELETE FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+DROP TABLE functional_dependencies;
diff --git a/src/test/regress/parallel_schedule b/src/test/regress/parallel_schedule
index e0ae2f2..c41762c 100644
--- a/src/test/regress/parallel_schedule
+++ b/src/test/regress/parallel_schedule
@@ -109,3 +109,6 @@ test: event_trigger
# run stats by itself because its delay may be insufficient under heavy load
test: stats
+
+# run tests of multivariate stats
+test: mv_dependencies
diff --git a/src/test/regress/serial_schedule b/src/test/regress/serial_schedule
index 7f762bd..3845b0f 100644
--- a/src/test/regress/serial_schedule
+++ b/src/test/regress/serial_schedule
@@ -152,3 +152,4 @@ test: with
test: xml
test: event_trigger
test: stats
+test: mv_dependencies
diff --git a/src/test/regress/sql/mv_dependencies.sql b/src/test/regress/sql/mv_dependencies.sql
new file mode 100644
index 0000000..5d1ad52
--- /dev/null
+++ b/src/test/regress/sql/mv_dependencies.sql
@@ -0,0 +1,153 @@
+-- data type passed by value
+CREATE TABLE functional_dependencies (
+ a INT,
+ b INT,
+ c INT
+);
+
+-- unknown column
+ALTER TABLE functional_dependencies ADD STATISTICS (dependencies) ON (unknown_column);
+
+-- single column
+ALTER TABLE functional_dependencies ADD STATISTICS (dependencies) ON (a);
+
+-- single column, duplicated
+ALTER TABLE functional_dependencies ADD STATISTICS (dependencies) ON (a, a);
+
+-- two columns, one duplicated
+ALTER TABLE functional_dependencies ADD STATISTICS (dependencies) ON (a, a, b);
+
+-- unknown option
+ALTER TABLE functional_dependencies ADD STATISTICS (unknown_option) ON (a, b, c);
+
+-- correct command
+ALTER TABLE functional_dependencies ADD STATISTICS (dependencies) ON (a, b, c);
+
+-- random data (no functional dependencies)
+INSERT INTO functional_dependencies
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+
+ANALYZE functional_dependencies;
+
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+
+TRUNCATE functional_dependencies;
+
+-- a => b, a => c, b => c
+INSERT INTO functional_dependencies
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+
+ANALYZE functional_dependencies;
+
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+
+TRUNCATE functional_dependencies;
+
+-- a => b, a => c
+INSERT INTO functional_dependencies
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE functional_dependencies;
+
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+
+TRUNCATE functional_dependencies;
+
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO functional_dependencies
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX fdeps_idx ON functional_dependencies (a, b);
+ANALYZE functional_dependencies;
+
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+
+EXPLAIN (COSTS off)
+ SELECT * FROM functional_dependencies WHERE a = 10 AND b = 5;
+
+DELETE FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+DROP TABLE functional_dependencies;
+
+-- varlena type (text)
+CREATE TABLE functional_dependencies (
+ a TEXT,
+ b TEXT,
+ c TEXT
+);
+
+ALTER TABLE functional_dependencies ADD STATISTICS (dependencies) ON (a, b, c);
+
+-- random data (no functional dependencies)
+INSERT INTO functional_dependencies
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+
+ANALYZE functional_dependencies;
+
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+
+TRUNCATE functional_dependencies;
+
+-- a => b, a => c, b => c
+INSERT INTO functional_dependencies
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+
+ANALYZE functional_dependencies;
+
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+
+TRUNCATE functional_dependencies;
+
+-- a => b, a => c
+INSERT INTO functional_dependencies
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE functional_dependencies;
+
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+
+TRUNCATE functional_dependencies;
+
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO functional_dependencies
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX fdeps_idx ON functional_dependencies (a, b);
+ANALYZE functional_dependencies;
+
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+
+EXPLAIN (COSTS off)
+ SELECT * FROM functional_dependencies WHERE a = '10' AND b = '5';
+
+DELETE FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+DROP TABLE functional_dependencies;
+
+-- NULL values (mix of int and text columns)
+CREATE TABLE functional_dependencies (
+ a INT,
+ b TEXT,
+ c INT,
+ d TEXT
+);
+
+ALTER TABLE functional_dependencies ADD STATISTICS (dependencies) ON (a, b, c, d);
+
+INSERT INTO functional_dependencies
+ SELECT
+ mod(i, 100),
+ (CASE WHEN mod(i, 200) = 0 THEN NULL ELSE mod(i,200) END),
+ mod(i, 400),
+ (CASE WHEN mod(i, 300) = 0 THEN NULL ELSE mod(i,600) END)
+ FROM generate_series(1,10000) s(i);
+
+ANALYZE functional_dependencies;
+
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+
+DELETE FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+DROP TABLE functional_dependencies;
--
2.0.5
Attachment: 0003-multivariate-MCV-lists.patch (text/x-diff)
From d6d169988a8ccef1e41e9620599bdfc83a192433 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tv@fuzzy.cz>
Date: Sun, 11 Jan 2015 20:15:37 +0100
Subject: [PATCH 3/4] multivariate MCV lists
- extends the pg_mv_statistic catalog (add 'mcv' fields)
- building the MCV lists during ANALYZE
- simple estimation while planning the queries
FIX: don't build MCV list by default
FIX: analyze MCV lists only when requested
FIX: improvements in clauselist_mv_selectivity_mcvlist()
FIX: comment about upper selectivity boundaries for MCV lists
FIX: switch MCV build to multi_sort functions (from dependencies)
This adds support for proper sorting (per data type), arbitrary
data types (as long as they have '<' operator) and NULL values
to building MCV lists. It still needs a fair amount of love, and
does nothing about serializing/deserializing the MCV lists.
FIX: comment about using max_mcv_items (ADD STATISTICS option)
FIX: initial support for all data types and NULL in MCV lists
This changes the serialize_mcvlist/update_mv_stats in a somewhat
strange way (passing VacAttrStats all over the place). This
needs to be improved, somehow, before rebasing into the MCV
part. Otherwise it'll cause needless conflicts.
FIX: fixed MCV build / removed debugging WARNING log message
FIX: refactoring lookup_var_attr_stats() - moving to common.c, static
This only includes changes in the common part + functional dependencies.
FIX: refactoring lookup_var_attr_stats() / MCV lists
FIX: a set of regression tests for MCV lists
This is mostly equal to a combination of all the regression tests
for functional dependencies.
One of the tests (EXPLAIN with TEXT columns) currently fails, and
produces Index Scan instead of Bitmap Index Scan. Will investigate.
FIX: comment about memory corruption in deserializing MCV list
FIX: correct MCV spelling in a few places (MVC -> MCV)
FIX: get rid of the custom comparators in mcv.c
FIX: use USE_ASSERT_CHECKING for assert-only variable (MCV)
FIX: check 'mcv' and 'mcv_max_items' options in ADD STATISTICS
FIX: proper handling of 'mcv_max_items' options (constants etc.)
FIX: check that either dependencies or MCV were requested
FIX: improved comments / docs for MCV lists
FIX: move DimensionInfo to common.h
FIX: move MCV list definitions after functional dependencies
FIX: incorrect memcpy() when building MCV list, causing segfaults
FIX: replace variables by macros in MCV serialize/deserialize
FIX: rework clauselist_mv_split() to call clause_is_mv_compatible()
It mostly duplicated the code, making it difficult to add more clause
types etc.
FIX: fixed estimation of equality clauses using MCV lists
FIX: add support for 'IS [NOT] NULL' support to MCV lists
FIX: add regression test for ADD STATISTICS options (MCV list)
FIX: added regression test to test IS [NOT] NULL with MCV lists
FIX: updated comments in clausesel.c (mcv)
FIX: obsolete Assert in mcv code (indexes -> ITEM_INDEXES)
FIX: make regression tests parallel-happy (MCV lists)
---
src/backend/catalog/system_views.sql | 4 +-
src/backend/commands/tablecmds.c | 47 +-
src/backend/optimizer/path/clausesel.c | 788 ++++++++++++++++++++++-
src/backend/utils/mvstats/Makefile | 2 +-
src/backend/utils/mvstats/common.c | 65 +-
src/backend/utils/mvstats/common.h | 12 +-
src/backend/utils/mvstats/dependencies.c | 13 +-
src/backend/utils/mvstats/mcv.c | 1002 ++++++++++++++++++++++++++++++
src/include/catalog/pg_mv_statistic.h | 18 +-
src/include/catalog/pg_proc.h | 2 +
src/include/utils/mvstats.h | 68 +-
src/test/regress/expected/mv_mcv.out | 210 +++++++
src/test/regress/expected/rules.out | 4 +-
src/test/regress/parallel_schedule | 2 +-
src/test/regress/serial_schedule | 1 +
src/test/regress/sql/mv_mcv.sql | 181 ++++++
16 files changed, 2370 insertions(+), 49 deletions(-)
create mode 100644 src/backend/utils/mvstats/mcv.c
create mode 100644 src/test/regress/expected/mv_mcv.out
create mode 100644 src/test/regress/sql/mv_mcv.sql
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index da957fc..8acf160 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -158,7 +158,9 @@ CREATE VIEW pg_mv_stats AS
C.relname AS tablename,
S.stakeys AS attnums,
length(S.stadeps) as depsbytes,
- pg_mv_stats_dependencies_info(S.stadeps) as depsinfo
+ pg_mv_stats_dependencies_info(S.stadeps) as depsinfo,
+ length(S.stamcv) AS mcvbytes,
+ pg_mv_stats_mcvlist_info(S.stamcv) AS mcvinfo
FROM (pg_mv_statistic S JOIN pg_class C ON (C.oid = S.starelid))
LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace);
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 3c82b89..1f08c1c 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -11651,7 +11651,13 @@ static void ATExecAddStatistics(AlteredTableInfo *tab, Relation rel,
Relation mvstatrel;
/* by default build everything */
- bool build_dependencies = false;
+ bool build_dependencies = false,
+ build_mcv = false;
+
+ int32 max_mcv_items = -1;
+
+ /* options required because of other options */
+ bool require_mcv = false;
Assert(IsA(def, StatisticsDef));
@@ -11706,6 +11712,29 @@ static void ATExecAddStatistics(AlteredTableInfo *tab, Relation rel,
if (strcmp(opt->defname, "dependencies") == 0)
build_dependencies = defGetBoolean(opt);
+ else if (strcmp(opt->defname, "mcv") == 0)
+ build_mcv = defGetBoolean(opt);
+ else if (strcmp(opt->defname, "max_mcv_items") == 0)
+ {
+ max_mcv_items = defGetInt32(opt);
+
+ /* this option requires 'mcv' to be enabled */
+ require_mcv = true;
+
+ /* sanity check */
+ if (max_mcv_items < MVSTAT_MCVLIST_MIN_ITEMS)
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("max number of MCV items must be at least %d",
+ MVSTAT_MCVLIST_MIN_ITEMS)));
+
+ else if (max_mcv_items > MVSTAT_MCVLIST_MAX_ITEMS)
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("max number of MCV items is %d",
+ MVSTAT_MCVLIST_MAX_ITEMS)));
+
+ }
else
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
@@ -11714,10 +11743,16 @@ static void ATExecAddStatistics(AlteredTableInfo *tab, Relation rel,
}
/* check that at least some statistics were requested */
- if (! build_dependencies)
+ if (! (build_dependencies || build_mcv))
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
- errmsg("no statistics type (dependencies) was requested")));
+ errmsg("no statistics type (dependencies, mcv) was requested")));
+
+ /* now do some checking of the options */
+ if (require_mcv && (! build_mcv))
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("option 'mcv' is required by other options(s)")));
/* sort the attnums and build int2vector */
qsort(attnums, numcols, sizeof(int16), compare_int16);
@@ -11733,9 +11768,13 @@ static void ATExecAddStatistics(AlteredTableInfo *tab, Relation rel,
values[Anum_pg_mv_statistic_starelid-1] = ObjectIdGetDatum(RelationGetRelid(rel));
values[Anum_pg_mv_statistic_stakeys -1] = PointerGetDatum(stakeys);
+
values[Anum_pg_mv_statistic_deps_enabled -1] = BoolGetDatum(build_dependencies);
+ values[Anum_pg_mv_statistic_mcv_enabled -1] = BoolGetDatum(build_mcv);
+ values[Anum_pg_mv_statistic_mcv_max_items -1] = Int32GetDatum(max_mcv_items);
- nulls[Anum_pg_mv_statistic_stadeps -1] = true;
+ nulls[Anum_pg_mv_statistic_stadeps -1] = true;
+ nulls[Anum_pg_mv_statistic_stamcv -1] = true;
/* insert the tuple into pg_mv_statistic */
mvstatrel = heap_open(MvStatisticRelationId, RowExclusiveLock);
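Going by the option parsing above, the new syntax would presumably be
exercised like this (the table and the item limit are made up for
illustration; the option names are taken from the hunk above):

ALTER TABLE test ADD STATISTICS (mcv) ON (a, b, c);
ALTER TABLE test ADD STATISTICS (mcv, max_mcv_items = 1000) ON (a, b, c);
-- fails, because 'max_mcv_items' requires 'mcv' to be enabled
ALTER TABLE test ADD STATISTICS (dependencies, max_mcv_items = 1000) ON (a, b, c);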
diff --git a/src/backend/optimizer/path/clausesel.c b/src/backend/optimizer/path/clausesel.c
index 36e5bce..1446fa0 100644
--- a/src/backend/optimizer/path/clausesel.c
+++ b/src/backend/optimizer/path/clausesel.c
@@ -59,6 +59,18 @@ static List *clauselist_apply_dependencies(PlannerInfo *root, List *clauses,
Oid varRelid, int nmvstats, MVStats mvstats,
SpecialJoinInfo *sjinfo);
+static int choose_mv_statistics(int nmvstats, MVStats mvstats,
+ Bitmapset *attnums);
+static List *clauselist_mv_split(PlannerInfo *root, SpecialJoinInfo *sjinfo,
+ List *clauses, Oid varRelid,
+ List **mvclauses, MVStats mvstats);
+
+static Selectivity clauselist_mv_selectivity(PlannerInfo *root,
+ List *clauses, MVStats mvstats);
+static Selectivity clauselist_mv_selectivity_mcvlist(PlannerInfo *root,
+ List *clauses, MVStats mvstats,
+ bool *fullmatch, Selectivity *lowsel);
+
/****************************************************************************
* ROUTINES TO COMPUTE SELECTIVITIES
****************************************************************************/
@@ -195,8 +207,8 @@ clauselist_selectivity(PlannerInfo *root,
Bitmapset *mvattnums = NULL;
/*
- * If there's exactly one clause, then no use in trying to match up pairs,
- * so just go directly to clause_selectivity().
+ * If there's exactly one clause, then no use in trying to match up
+ * pairs, so just go directly to clause_selectivity().
*/
if (list_length(clauses) == 1)
return clause_selectivity(root, (Node *) linitial(clauses),
@@ -222,6 +234,46 @@ clauselist_selectivity(PlannerInfo *root,
/* reduce clauses by applying functional dependencies rules */
clauses = clauselist_apply_dependencies(root, clauses, varRelid,
nmvstats, mvstats, sjinfo);
+
+ /*
+ * recollect attributes from mv-compatible clauses (maybe we've
+ * removed so many clauses we have a single mv-compatible attnum)
+ */
+ mvattnums = collect_mv_attnums(root, clauses, varRelid, &relid, sjinfo);
+ }
+
+ /*
+ * If there still are at least two columns, we'll try to select
+ * suitable multivariate statistics.
+ */
+ if (bms_num_members(mvattnums) >= 2)
+ {
+ /* fetch info from the catalog (not the serialized stats yet) */
+ mvstats = list_mv_stats(relid, &nmvstats, true);
+
+ /* see choose_mv_statistics() for details */
+ if (nmvstats > 0)
+ {
+ int idx = choose_mv_statistics(nmvstats, mvstats, mvattnums);
+
+ if (idx >= 0) /* we have matching stats */
+ {
+ MVStats mvstat = &mvstats[idx];
+
+ /* clauses compatible with multi-variate stats */
+ List *mvclauses = NIL;
+
+ /* split the clauselist into regular and mv-clauses */
+ clauses = clauselist_mv_split(root, sjinfo, clauses,
+ varRelid, &mvclauses, mvstat);
+
+ /* we've chosen the statistics to match the clauses */
+ Assert(mvclauses != NIL);
+
+ /* compute the multivariate stats */
+ s1 *= clauselist_mv_selectivity(root, mvclauses, mvstat);
+ }
+ }
}
/*
@@ -899,6 +951,192 @@ clause_selectivity(PlannerInfo *root,
return s1;
}
+
+/*
+ * Estimate selectivity for the list of MV-compatible clauses, using that
+ * particular histogram.
+ *
+ * When we hit a single bucket, we don't know what portion of it actually
+ * matches the clauses (e.g. equality), and we use 1/2 the bucket by
+ * default. However, the MV histograms are usually less detailed than
+ * the per-column ones, so the summed estimate is often quite high
+ * (thanks to combining a lot of "partially hit" buckets).
+ *
+ * There are several ways to improve this, each with cases where it
+ * won't really help. Also, the more complex the process, the worse
+ * the failures (i.e. misestimates) tend to be.
+ *
+ * (1) Use the MV histogram only as a way to combine multiple
+ * per-column histograms, essentially rewriting
+ *
+ * P(A & B) = P(A) * P(B|A)
+ *
+ * where P(B|A) may be computed using a proper "slice" of the
+ * histogram, by first selecting only buckets where A is true, and
+ * then using the boundaries to 'restrict' the per-column histogram.
+ *
+ * With more clauses, it gets more complicated, of course
+ *
+ * P(A & B & C) = P(A & C) * P(B|A & C)
+ * = P(A) * P(C|A) * P(B|A & C)
+ *
+ * and so on.
+ *
+ * Of course, the question is how well and efficiently we can
+ * compute the conditional probabilities - whether this approach
+ * can improve the estimates (instead of amplifying the errors).
+ *
+ * Also, this does not eliminate the need for histogram on [A,B,C].
+ *
+ * (2) Use multiple smaller (and more accurate) histograms, and combine
+ * them using a process similar to the above. E.g. by assuming that
+ * B and C are independent, we can rewrite
+ *
+ * P(B|A & C) = P(B|A)
+ *
+ * so we can rewrite the whole formula to
+ *
+ * P(A & B & C) = P(A) * P(C|A) * P(B|A)
+ *
+ * and we're OK with two 2D histograms [A,C] and [A,B].
+ *
+ * It'd be nice to perform some sort of statistical test (Fisher's
+ * exact test or a chi-squared test) to identify independent components
+ * and automatically separate them into smaller histograms.
+ *
+ * (3) Using the estimated number of distinct values in a bucket to
+ * decide the selectivity of equality in the bucket (instead of
+ * blindly using 1/2 of the bucket, we may use 1/ndistinct).
+ * Of course, if the ndistinct estimate is way off, or if the
+ * distribution is not uniform (one distinct value gets many more
+ * rows), this will fail. Also, we currently don't have ndistinct
+ * estimate available at this moment (but it shouldn't be that
+ * difficult to compute as ndistinct and ntuples should be available).
+ *
+ * TODO Clamp the selectivity by min of the per-clause selectivities
+ * (i.e. the selectivity of the most restrictive clause), because
+ * that's the maximum we can ever get from ANDed list of clauses.
+ * This would probably prevent issues with hitting too many buckets
+ * and low precision histograms.
+ *
+ * TODO We may support some additional conditions, most importantly
+ * those matching multiple columns (e.g. "a = b" or "a < b").
+ * Ultimately we could track multi-table histograms for join
+ * cardinality estimation.
+ *
+ * TODO Currently this is only estimating all clauses, or clauses
+ * matching varRelid (when it's not 0). I'm not sure what the
+ * purpose of varRelid is, but my assumption is that it's used for
+ * join conditions and such. In that case we can use those clauses
+ * to restrict the other (i.e. filter the histogram buckets first,
+ * before estimating the other clauses). This is essentially equivalent
+ * to computing P(A|B) where "B" are the clauses not matching the
+ * varRelid.
+ *
+ * TODO Further thoughts on processing equality clauses - maybe it'd be
+ * better to look for stats (with MCV) covered by the equality
+ * clauses, because then we have a chance to find an exact match
+ * in the MCV list, which is pretty much the best we can do. We may
+ * also look at the least frequent MCV item, and use it as an upper
+ * boundary for the selectivity (had there been a more frequent
+ * item, it'd be in the MCV list).
+ *
+ * These conditions may then be used as a condition for the other
+ * selectivities, i.e. we may estimate P(A,B) first, and then
+ * compute P(C|A,B) from another histogram. This may be useful when
+ * we can estimate P(A,B) accurately (e.g. because it's a complete
+ * equality match evaluated on MCV list), and then compute the
+ * conditional probability P(C|A,B), giving us the requested stats
+ *
+ * P(A,B,C) = P(A,B) * P(C|A,B)
+ *
+ * TODO There are several options for 'sanity clamping' the estimates.
+ *
+ * First, if we have selectivities for each condition, then
+ *
+ * P(A,B) <= MIN(P(A), P(B))
+ *
+ * Because additional conditions (connected by AND) can only lower
+ * the probability.
+ *
+ * So we can do some basic sanity checks using the single-variate
+ * stats (the ones we have right now).
+ *
+ * Second, when we have multivariate stats with a MCV list, then
+ *
+ * (a) if we have a full equality condition (one equality condition
+ * on each column) and we found a match in the MCV list, this is
+ * the selectivity (and it's supposed to be exact)
+ *
+ * (b) if we have a full equality condition and we haven't found a
+ * match in the MCV list, then the selectivity is below the
+ * lowest selectivity in the MCV list
+ *
+ * (c) if we have an equality condition (not full), we can still
+ * search the MCV for matches and use the sum of probabilities
+ * as a lower boundary for the histogram (if there are no
+ * matches in the MCV list, then we have no boundary)
+ *
+ * Third, if there are multiple multivariate stats for a set of
+ * clauses, we may compute all of them and then somehow aggregate
+ * them - e.g. by choosing the minimum, median or average. The
+ * multi-variate stats are susceptible to overestimation (because
+ * we take 50% of the bucket for partial matches). Some stats may
+ * give better estimates than others, but it's very difficult to
+ * determine in advance which one is best (it depends
+ * on the number of buckets, number of additional columns not
+ * referenced in the clauses etc.) so we may compute all and then
+ * choose a sane aggregation (minimum seems like a good approach).
+ * Of course, this may result in longer / more expensive estimation
+ * (CPU-wise), but it may be worth it.
+ *
+ * There are ways to address this, though. First, it's possible to
+ * add a GUC choosing between a 'simple' estimate (using the single
+ * statistics expected to give the best estimate) and a 'complex' one
+ * (combining the multiple estimates).
+ *
+ * multivariate_estimates = (simple|full)
+ *
+ * Also, this might be enabled at a table level, by something like
+ *
+ * ALTER TABLE ... SET STATISTICS (simple|full)
+ *
+ * Which would make it possible to use this only for the tables
+ * where the simple approach does not work.
+ *
+ * Also, there are ways to optimize this algorithmically. E.g. we
+ * may try to get an estimate from a matching MCV list first, and
+ * if we happen to get a "full equality match" we may stop computing
+ * the estimates from other stats (for this condition) because
+ * that's probably the best estimate we can really get.
+ *
+ * TODO When applying the clauses to the histogram/MCV list, we can do
+ * that from the most selective clauses first, because that'll
+ * eliminate the buckets/items sooner (so we'll be able to skip
+ * them without the more expensive inspection).
+ */
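+/*
+ * A made-up example of approach (1) above, for illustration only:
+ * if the per-column stats give P(A) = 0.1 and the histogram slice
+ * restricted to rows satisfying A gives P(B|A) = 0.5, we'd estimate
+ * P(A & B) = 0.1 * 0.5 = 0.05, instead of the independence-based
+ * estimate P(A) * P(B).
+ */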
+static Selectivity
+clauselist_mv_selectivity(PlannerInfo *root, List *clauses, MVStats mvstats)
+{
+ bool fullmatch = false;
+
+ /*
+ * Lowest frequency in the MCV list (may be used as an upper bound
+ * for full equality conditions that did not match any MCV item).
+ */
+ Selectivity mcv_low = 0.0;
+
+ /* TODO Evaluate simple 1D selectivities, use the smallest one as
+ * an upper bound, product as lower bound, and sort the
+ * clauses in ascending order by selectivity (to optimize the
+ * MCV/histogram evaluation).
+ */
+
+ /* Evaluate the MCV selectivity */
+ return clauselist_mv_selectivity_mcvlist(root, clauses, mvstats,
+ &fullmatch, &mcv_low);
+}
+
/*
* Collect attributes from mv-compatible clauses.
*
@@ -945,6 +1183,175 @@ collect_mv_attnums(PlannerInfo *root, List *clauses, Oid varRelid,
}
/*
+ * We're looking for statistics matching at least 2 attributes,
+ * referenced in the clauses compatible with multivariate statistics.
+ * The current selection criterion is very simple - we choose the
+ * statistics referencing the most attributes.
+ *
+ * If there are multiple statistics referencing the same number of
+ * columns (from the clauses), the one with fewer source columns
+ * (as listed in the ADD STATISTICS when creating the statistics) wins.
+ * Otherwise the first one wins.
+ *
+ * This is a very simple criterion, and it has several weaknesses:
+ *
+ * (a) does not consider the accuracy of the statistics
+ *
+ * If there are two histograms built on the same set of columns,
+ * but one has 100 buckets and the other one has 1000 buckets (thus
+ * likely providing better estimates), this is not currently
+ * considered.
+ *
+ * (b) does not consider the type of statistics
+ *
+ * If there are three statistics - one containing just a MCV list,
+ * another one with just a histogram and a third one with both,
+ * this is not considered.
+ *
+ * (c) does not consider the number of clauses
+ *
+ * As explained, only the number of referenced attributes counts,
+ * so if there are multiple clauses on a single attribute, this
+ * still counts as a single attribute.
+ *
+ * (d) does not consider type of condition
+ *
+ * Some clauses may work better with some statistics - for example
+ * equality clauses probably work better with MCV lists than with
+ * histograms. But IS [NOT] NULL conditions may often work better
+ * with histograms (thanks to NULL-buckets).
+ *
+ * So for example with five WHERE conditions
+ *
+ * WHERE (a = 1) AND (b = 1) AND (c = 1) AND (d = 1) AND (e = 1)
+ *
+ * and statistics on (a,b), (a,b,e) and (a,b,c,d), the last one will be
+ * selected as it references the most columns.
+ *
+ * Once we have selected the multivariate statistics, we split the list
+ * of clauses into two parts - conditions that are compatible with the
+ * selected stats, and conditions estimated using simple statistics.
+ *
+ * From the example above, conditions
+ *
+ * (a = 1) AND (b = 1) AND (c = 1) AND (d = 1)
+ *
+ * will be estimated using the multivariate statistics (a,b,c,d) while
+ * the last condition (e = 1) will get estimated using the regular ones.
+ *
+ * There are various alternative selection criteria (e.g. counting
+ * conditions instead of just referenced attributes), but eventually
+ * the best option should be to combine multiple statistics. But that's
+ * much harder to do correctly.
+ *
+ * TODO Select multiple statistics and combine them when computing
+ * the estimate.
+ *
+ * TODO This will probably have to consider compatibility of clauses,
+ * because 'dependencies' will probably work only with equality
+ * clauses.
+ */
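+/*
+ * To illustrate the tie-break (made-up statistics): with clauses on
+ * (a, b) and statistics defined on (a, b) and on (a, b, c), both
+ * match two attributes, so the two-column statistics wins because
+ * it has fewer source columns (current_dims is minimized).
+ */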
+static int
+choose_mv_statistics(int nmvstats, MVStats mvstats, Bitmapset *attnums)
+{
+ int i, j;
+
+ int choice = -1;
+ int current_matches = 1; /* goal #1: maximize */
+ int current_dims = (MVSTATS_MAX_DIMENSIONS+1); /* goal #2: minimize */
+
+ /*
+ * Walk through the statistics (simple array with nmvstats elements)
+ * and for each one count the referenced attributes (encoded in
+ * the 'attnums' bitmap).
+ */
+ for (i = 0; i < nmvstats; i++)
+ {
+ /* columns matching this statistics */
+ int matches = 0;
+
+ int2vector * attrs = mvstats[i].stakeys;
+ int numattrs = mvstats[i].stakeys->dim1;
+
+ /* count columns covered by this statistics */
+ for (j = 0; j < numattrs; j++)
+ if (bms_is_member(attrs->values[j], attnums))
+ matches++;
+
+ /*
+ * Use this statistics when it improves the number of matches or
+ * when it matches the same number of attributes but is smaller.
+ */
+ if ((matches > current_matches) ||
+ ((matches == current_matches) && (current_dims > numattrs)))
+ {
+ choice = i;
+ current_matches = matches;
+ current_dims = numattrs;
+ }
+ }
+
+ return choice;
+}
+
+/*
+ * This splits the clause list into two parts - one containing clauses
+ * that will be evaluated using the chosen statistics, and the remaining
+ * clauses (either not mv-compatible, or not covered by the statistics).
+ */
+static List *
+clauselist_mv_split(PlannerInfo *root, SpecialJoinInfo *sjinfo,
+ List *clauses, Oid varRelid, List **mvclauses,
+ MVStats mvstats)
+{
+ int i;
+ ListCell *l;
+ List *non_mvclauses = NIL;
+
+ /* FIXME is there a better way to get info on int2vector? */
+ int2vector * attrs = mvstats->stakeys;
+ int numattrs = mvstats->stakeys->dim1;
+
+ /* erase the list of mv-compatible clauses */
+ *mvclauses = NIL;
+
+ foreach (l, clauses)
+ {
+ bool match = false; /* by default not mv-compatible */
+ AttrNumber attnum;
+ Node *clause = (Node *) lfirst(l);
+
+ if (clause_is_mv_compatible(root, clause, varRelid, NULL, &attnum, sjinfo))
+ {
+ /* Is the attribute part of the selected stats? */
+ for (i = 0; i < numattrs; i++)
+ if (attrs->values[i] == attnum)
+ match = true;
+ }
+
+ if (match)
+ {
+ /*
+ * The clause matches the selected stats, so extract the
+ * clause from the RestrictInfo and put it on the
+ * multivariate list. We'll use it directly.
+ */
+ RestrictInfo * rinfo = (RestrictInfo *) clause;
+ *mvclauses = lappend(*mvclauses, (Node*)rinfo->clause);
+ }
+ else
+ non_mvclauses = lappend(non_mvclauses, clause);
+ }
+
+ /*
+ * Perform regular estimation using the clauses incompatible
+ * with the chosen histogram (or MV stats in general).
+ */
+ return non_mvclauses;
+
+}
+
+/*
* Determines whether the clause is compatible with multivariate stats,
* and if it is, returns some additional information - varno (index
* into simple_rte_array) and a bitmap of attributes. This is then
@@ -981,21 +1388,23 @@ clause_is_mv_compatible(PlannerInfo *root, Node *clause, Oid varRelid,
/* get the actual clause from the RestrictInfo (it's not an OR clause) */
clause = (Node*)rinfo->clause;
- /* only simple opclauses are compatible with multivariate stats */
- if (! is_opclause(clause))
- return false;
-
/* we don't support join conditions at this moment */
if (treat_as_join_clause(clause, rinfo, varRelid, sjinfo))
return false;
- /* is it 'variable op constant' ? */
- if (list_length(((OpExpr *) clause)->args) == 2)
+ /*
+ * Only simple opclauses and IS NULL tests are compatible with
+ * multivariate stats at this point.
+ */
+ if ((is_opclause(clause))
+ && (list_length(((OpExpr *) clause)->args) == 2))
{
OpExpr *expr = (OpExpr *) clause;
bool varonleft = true;
bool ok;
+ /* is it 'variable op constant' ? */
+
ok = (bms_membership(rinfo->clause_relids) == BMS_SINGLETON) &&
(is_pseudo_constant_clause_relids(lsecond(expr->args),
rinfo->right_relids) ||
@@ -1032,8 +1441,11 @@ clause_is_mv_compatible(PlannerInfo *root, Node *clause, Oid varRelid,
return false;
/* Lookup info about the base relation (we need to pass the OID out) */
- rte = planner_rt_fetch(var->varno, root);
- *relid = rte->relid;
+ if (relid != NULL)
+ {
+ rte = planner_rt_fetch(var->varno, root);
+ *relid = rte->relid;
+ }
/*
* If it's not a "<" or ">" or "=" operator, just ignore the
@@ -1051,6 +1463,45 @@ clause_is_mv_compatible(PlannerInfo *root, Node *clause, Oid varRelid,
}
}
}
+ else if (IsA(clause, NullTest)
+ && IsA(((NullTest*)clause)->arg, Var))
+ {
+ RangeTblEntry * rte;
+ Var * var = (Var*)((NullTest*)clause)->arg;
+
+ /*
+ * Simple variables only - otherwise the planner_rt_fetch seems to fail
+ * (return NULL).
+ *
+ * TODO Maybe using examine_variable() would fix that?
+ */
+ if (! (IsA(var, Var) && (varRelid == 0 || varRelid == var->varno)))
+ return false;
+
+ /*
+ * Only consider this variable if (varRelid == 0) or when the varno
+ * matches varRelid (see explanation at clause_selectivity).
+ *
+ * FIXME I suspect this may not really be necessary. The (varRelid == 0)
+ * part seems to be enforced by treat_as_join_clause().
+ */
+ if (! ((varRelid == 0) || (varRelid == var->varno)))
+ return false;
+
+ /* Also skip special varno values, and system attributes ... */
+ if ((IS_SPECIAL_VARNO(var->varno)) || (! AttrNumberIsForUserDefinedAttr(var->varattno)))
+ return false;
+
+ /* Lookup info about the base relation (we need to pass the OID out) */
+ if (relid != NULL)
+ {
+ rte = planner_rt_fetch(var->varno, root);
+ *relid = rte->relid;
+ }
+ *attnum = var->varattno;
+
+ return true;
+ }
}
return false;
@@ -1256,3 +1707,320 @@ clauselist_apply_dependencies(PlannerInfo *root, List *clauses, Oid varRelid,
return reduced_clauses;
}
+
+/*
+ * Estimate selectivity of clauses using a MCV list.
+ *
+ * If there's no MCV list for the stats, the function returns 0.0.
+ *
+ * While computing the estimate, the function checks whether all the
+ * columns were matched with an equality condition. If that's the case,
+ * we can skip processing the histogram, as there can be no rows in
+ * it with the same values - all the rows matching the condition are
+ * represented by the MCV item. This can only happen with equality
+ * on all the attributes.
+ *
+ * The algorithm works like this:
+ *
+ * 1) mark all items as 'match'
+ * 2) walk through all the clauses
+ * 3) for a particular clause, walk through all the items
+ * 4) skip items that are already 'no match'
+ * 5) check clause for items that still match
+ * 6) sum frequencies for items to get selectivity
+ *
+ * The function also returns the frequency of the least frequent item
+ * on the MCV list, which may be useful for clamping the estimate from the
+ * histogram (all items not present in the MCV list are less frequent).
+ * This however seems useful only for cases with conditions on all
+ * attributes.
+ *
+ * TODO This only handles AND-ed clauses, but it might work for OR-ed
+ * lists too - it just needs to reverse the logic a bit. I.e. start
+ * with 'no match' for all items, and mark the items as a match
+ * as the clauses are processed (and skip items that are 'match').
+ */
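+/*
+ * A made-up illustration of the algorithm: given MCV items {1,2}
+ * (frequency 0.2), {1,7} (0.1) and {3,2} (0.3), and the clauses
+ * (a = 1) AND (b < 5), the first clause marks {3,2} as 'no match',
+ * the second eliminates {1,7}, and summing the remaining items
+ * yields a selectivity of 0.2.
+ */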
+static Selectivity
+clauselist_mv_selectivity_mcvlist(PlannerInfo *root, List *clauses,
+ MVStats mvstats, bool *fullmatch,
+ Selectivity *lowsel)
+{
+ int i;
+ Selectivity s = 0.0;
+ ListCell * l;
+ MCVList mcvlist = NULL;
+ int nmatches = 0;
+
+ char * matches = NULL; /* match/mismatch for each MCV item */
+ Bitmapset *eqmatches = NULL; /* attributes with equality matches */
+
+ /* there's no MCV list built yet */
+ if (! mvstats->mcv_built)
+ return 0.0;
+
+ mcvlist = deserialize_mv_mcvlist(fetch_mv_mcvlist(mvstats->mvoid));
+
+ Assert(mcvlist != NULL);
+ Assert(clauses != NIL);
+ Assert(mcvlist->nitems > 0);
+ Assert(list_length(clauses) >= 2);
+
+ /* by default all the MCV items match the clauses fully */
+ matches = palloc0(sizeof(char) * mcvlist->nitems);
+ memset(matches, MVSTATS_MATCH_FULL, sizeof(char)*mcvlist->nitems);
+
+ /* frequency of the lowest MCV item */
+ *lowsel = 1.0;
+
+ /* number of matching MCV items */
+ nmatches = mcvlist->nitems;
+
+ /*
+ * Loop through the list of clauses, and for each of them evaluate
+ * all the MCV items not yet eliminated by the preceding clauses.
+ *
+ * FIXME This would probably deserve a refactoring, I guess. Unify
+ * the two loops and put the checks inside, or something like
+ * that.
+ */
+ foreach (l, clauses)
+ {
+ Node * clause = (Node*)lfirst(l);
+
+ /* if there are no remaining matches possible, we can stop */
+ if (nmatches == 0)
+ break;
+
+ /* it's either OpClause, or NullTest */
+ if (is_opclause(clause))
+ {
+ OpExpr * expr = (OpExpr*)clause;
+ bool varonleft = true;
+ bool ok;
+
+ /* operator */
+ FmgrInfo opproc;
+
+ fmgr_info(get_opcode(expr->opno), &opproc);
+
+ ok = (NumRelids(clause) == 1) &&
+ (is_pseudo_constant_clause(lsecond(expr->args)) ||
+ (varonleft = false,
+ is_pseudo_constant_clause(linitial(expr->args))));
+
+ if (ok)
+ {
+
+ FmgrInfo ltproc, gtproc;
+ Var * var = (varonleft) ? linitial(expr->args) : lsecond(expr->args);
+ Const * cst = (varonleft) ? lsecond(expr->args) : linitial(expr->args);
+ bool isgt = (! varonleft);
+
+ /*
+ * TODO Fetch only when really needed (probably for equality only)
+ * TODO Technically either lt/gt is sufficient.
+ *
+ * FIXME The code in analyze.c creates histograms only for types
+ * with enough ordering (by calling get_sort_group_operators).
+ * Is this the same assumption, i.e. are we certain that we
+ * get the ltproc/gtproc every time we ask? Or are there types
+ * where get_sort_group_operators returns ltopr and here we
+ * get nothing?
+ */
+ TypeCacheEntry *typecache
+ = lookup_type_cache(var->vartype,
+ TYPECACHE_EQ_OPR | TYPECACHE_LT_OPR | TYPECACHE_GT_OPR);
+
+ /* FIXME proper matching attribute to dimension */
+ int idx = mv_get_index(var->varattno, mvstats->stakeys);
+
+ fmgr_info(get_opcode(typecache->lt_opr), <proc);
+ fmgr_info(get_opcode(typecache->gt_opr), >proc);
+
+ /*
+ * Walk through the MCV items and evaluate the current clause. We can
+ * skip items that were already ruled out, and terminate if there are
+ * no remaining MCV items that might possibly match.
+ */
+ for (i = 0; i < mcvlist->nitems; i++)
+ {
+ bool tmp;
+ MCVItem item = mcvlist->items[i];
+
+ /*
+ * find the lowest selectivity in the MCV
+ * FIXME Maybe not the best place to do this (it runs for all clauses).
+ */
+ if (item->frequency < *lowsel)
+ *lowsel = item->frequency;
+
+ /* if there are no more matches, we can stop processing this clause */
+ if (nmatches == 0)
+ break;
+
+ /* skip MCV items that were already ruled out */
+ if (matches[i] == MVSTATS_MATCH_NONE)
+ continue;
+
+ /* TODO consider bsearch here (list is sorted by values)
+ * TODO handle other operators too (LT, GT)
+ * TODO identify "full match" when the clauses fully
+ * match the whole MCV list (so that checking the
+ * histogram is not needed)
+ */
+ if (get_oprrest(expr->opno) == F_EQSEL)
+ {
+ /*
+ * We don't care about isgt in equality, because it does not
+ * matter whether it's (var = const) or (const = var).
+ */
+ tmp = DatumGetBool(FunctionCall2Coll(&opproc,
+ DEFAULT_COLLATION_OID,
+ cst->constvalue,
+ item->values[idx]));
+ if (! tmp)
+ matches[i] = MVSTATS_MATCH_NONE;
+ else
+ eqmatches = bms_add_member(eqmatches, idx);
+ }
+ else if (get_oprrest(expr->opno) == F_SCALARLTSEL) /* column < constant */
+ {
+
+ if (! isgt) /* (var < const) */
+ {
+ /*
+ * Check whether the constant is below the item's value - in that
+ * case the item cannot satisfy (var < const), so it's no match.
+ */
+ tmp = DatumGetBool(FunctionCall2Coll(&opproc,
+ DEFAULT_COLLATION_OID,
+ cst->constvalue,
+ item->values[idx]));
+
+ if (tmp)
+ {
+ matches[i] = MVSTATS_MATCH_NONE; /* no match */
+ continue;
+ }
+
+ } /* (get_oprrest(expr->opno) == F_SCALARLTSEL) */
+ else /* (const < var) */
+ {
+ /*
+ * Check whether the item's value is below the constant - in that
+ * case the item cannot satisfy (const < var), so it's no match.
+ */
+ tmp = DatumGetBool(FunctionCall2Coll(&opproc,
+ DEFAULT_COLLATION_OID,
+ item->values[idx],
+ cst->constvalue));
+ if (tmp)
+ {
+ matches[i] = MVSTATS_MATCH_NONE; /* no match */
+ continue;
+ }
+ }
+ }
+ else if (get_oprrest(expr->opno) == F_SCALARGTSEL) /* column > constant */
+ {
+
+ if (! isgt) /* (var > const) */
+ {
+ /*
+ * Check whether the constant is above the item's value - in that
+ * case the item cannot satisfy (var > const), so it's no match.
+ */
+ tmp = DatumGetBool(FunctionCall2Coll(&opproc,
+ DEFAULT_COLLATION_OID,
+ cst->constvalue,
+ item->values[idx]));
+ if (tmp)
+ {
+ matches[i] = MVSTATS_MATCH_NONE; /* no match */
+ continue;
+ }
+
+ }
+ else /* (const > var) */
+ {
+ /*
+ * Check whether the item's value is above the constant - in
+ * that case the item cannot satisfy (const > var), so no match.
+ */
+ tmp = DatumGetBool(FunctionCall2Coll(&opproc,
+ DEFAULT_COLLATION_OID,
+ item->values[idx],
+ cst->constvalue));
+ if (tmp)
+ {
+ matches[i] = MVSTATS_MATCH_NONE; /* no match */
+ continue;
+ }
+ }
+
+ } /* (get_oprrest(expr->opno) == F_SCALARGTSEL) */
+ }
+ }
+ }
+ else if (IsA(clause, NullTest))
+ {
+ NullTest * expr = (NullTest*)clause;
+ Var * var = (Var*)(expr->arg);
+
+ /* FIXME proper matching attribute to dimension */
+ int idx = mv_get_index(var->varattno, mvstats->stakeys);
+
+ /*
+ * Walk through the MCV items and evaluate the current clause. We can
+ * skip items that were already ruled out, and terminate if there are
+ * no remaining MCV items that might possibly match.
+ */
+ for (i = 0; i < mcvlist->nitems; i++)
+ {
+ MCVItem item = mcvlist->items[i];
+
+ /*
+ * find the lowest selectivity in the MCV
+ * FIXME Maybe not the best place to do this (it runs for all clauses).
+ */
+ if (item->frequency < *lowsel)
+ *lowsel = item->frequency;
+
+ /* if there are no more matches, we can stop processing this clause */
+ if (nmatches == 0)
+ break;
+
+ /* skip MCV items that were already ruled out */
+ if (matches[i] == MVSTATS_MATCH_NONE)
+ continue;
+
+ /* if the clause mismatches the MCV item, set it as MATCH_NONE */
+ if ((expr->nulltesttype == IS_NULL)
+ && (! mcvlist->items[i]->isnull[idx]))
+ matches[i] = MVSTATS_MATCH_NONE;
+ else if ((expr->nulltesttype == IS_NOT_NULL) &&
+ (mcvlist->items[i]->isnull[idx]))
+ matches[i] = MVSTATS_MATCH_NONE;
+ }
+ }
+ }
+
+ /* sum frequencies for all the matching MCV items */
+ for (i = 0; i < mcvlist->nitems; i++)
+ {
+ if (matches[i] != MVSTATS_MATCH_NONE)
+ s += mcvlist->items[i]->frequency;
+ }
+
+ /*
+ * If all the columns were matched by equality, it's a full match.
+ * In this case at most one MCV item can match (two distinct
+ * items cannot both equal the same set of constants).
+ */
+ *fullmatch = (bms_num_members(eqmatches) == mcvlist->ndimensions);
+
+ pfree(matches);
+ pfree(mcvlist);
+
+ return s;
+}
diff --git a/src/backend/utils/mvstats/Makefile b/src/backend/utils/mvstats/Makefile
index 099f1ed..3c0aff4 100644
--- a/src/backend/utils/mvstats/Makefile
+++ b/src/backend/utils/mvstats/Makefile
@@ -12,6 +12,6 @@ subdir = src/backend/utils/mvstats
top_builddir = ../../../..
include $(top_builddir)/src/Makefile.global
-OBJS = common.o dependencies.o
+OBJS = common.o mcv.o dependencies.o
include $(top_srcdir)/src/backend/common.mk
diff --git a/src/backend/utils/mvstats/common.c b/src/backend/utils/mvstats/common.c
index 0edaaa6..69ab805 100644
--- a/src/backend/utils/mvstats/common.c
+++ b/src/backend/utils/mvstats/common.c
@@ -16,6 +16,10 @@
#include "common.h"
+static VacAttrStats ** lookup_var_attr_stats(int2vector *attrs,
+ int natts,
+ VacAttrStats **vacattrstats);
+
/*
* Compute requested multivariate stats, using the rows sampled for the
* plain (single-column) stats.
@@ -40,10 +44,15 @@ build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
for (i = 0; i < nmvstats; i++)
{
MVDependencies deps = NULL;
+ MCVList mcvlist = NULL;
+ int numrows_filtered = 0;
/* int2 vector of attnums the stats should be computed on */
int2vector * attrs = mvstats[i].stakeys;
+ /* filter only the interesting vacattrstats records */
+ VacAttrStats **stats = lookup_var_attr_stats(attrs, natts, vacattrstats);
+
/* check allowed number of dimensions */
Assert((attrs->dim1 >= 2) && (attrs->dim1 <= MVSTATS_MAX_DIMENSIONS));
@@ -51,10 +60,14 @@ build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
* Analyze functional dependencies of columns.
*/
if (mvstats->deps_enabled)
- deps = build_mv_dependencies(numrows, rows, attrs, natts, vacattrstats);
+ deps = build_mv_dependencies(numrows, rows, attrs, stats);
+
+ /* build the MCV list */
+ if (mvstats->mcv_enabled)
+ mcvlist = build_mv_mcvlist(numrows, rows, attrs, stats, &numrows_filtered);
/* store the histogram / MCV list in the catalog */
- update_mv_stats(mvstats[i].mvoid, deps);
+ update_mv_stats(mvstats[i].mvoid, deps, mcvlist, attrs, stats);
}
}
@@ -63,7 +76,7 @@ build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
* matching the attrs vector (to make it easy to work with when
* computing multivariate stats).
*/
-VacAttrStats **
+static VacAttrStats **
lookup_var_attr_stats(int2vector *attrs, int natts, VacAttrStats **vacattrstats)
{
int i, j;
@@ -136,7 +149,7 @@ list_mv_stats(Oid relid, int *nstats, bool built_only)
* Skip statistics that were not computed yet (if only stats
* that were already built were requested)
*/
- if (built_only && (! stats->deps_built))
+ if (built_only && (! (stats->mcv_built || stats->deps_built)))
continue;
/* double the array size if needed */
@@ -149,7 +162,9 @@ list_mv_stats(Oid relid, int *nstats, bool built_only)
result[*nstats].mvoid = HeapTupleGetOid(htup);
result[*nstats].stakeys = buildint2vector(stats->stakeys.values, stats->stakeys.dim1);
result[*nstats].deps_enabled = stats->deps_enabled;
+ result[*nstats].mcv_enabled = stats->mcv_enabled;
result[*nstats].deps_built = stats->deps_built;
+ result[*nstats].mcv_built = stats->mcv_built;
*nstats += 1;
}
@@ -164,7 +179,9 @@ list_mv_stats(Oid relid, int *nstats, bool built_only)
}
void
-update_mv_stats(Oid mvoid, MVDependencies dependencies)
+update_mv_stats(Oid mvoid,
+ MVDependencies dependencies, MCVList mcvlist,
+ int2vector *attrs, VacAttrStats **stats)
{
HeapTuple stup,
oldtup;
@@ -189,15 +206,26 @@ update_mv_stats(Oid mvoid, MVDependencies dependencies)
= PointerGetDatum(serialize_mv_dependencies(dependencies));
}
+ if (mcvlist != NULL)
+ {
+ bytea * data = serialize_mv_mcvlist(mcvlist, attrs, stats);
+ nulls[Anum_pg_mv_statistic_stamcv -1] = (data == NULL);
+ values[Anum_pg_mv_statistic_stamcv - 1] = PointerGetDatum(data);
+ }
+
/* always replace the value (either by bytea or NULL) */
replaces[Anum_pg_mv_statistic_stadeps -1] = true;
+ replaces[Anum_pg_mv_statistic_stamcv -1] = true;
/* always change the availability flags */
nulls[Anum_pg_mv_statistic_deps_built -1] = false;
+ nulls[Anum_pg_mv_statistic_mcv_built -1] = false;
replaces[Anum_pg_mv_statistic_deps_built-1] = true;
+ replaces[Anum_pg_mv_statistic_mcv_built -1] = true;
values[Anum_pg_mv_statistic_deps_built-1] = BoolGetDatum(dependencies != NULL);
+ values[Anum_pg_mv_statistic_mcv_built -1] = BoolGetDatum(mcvlist != NULL);
/* Is there already a pg_mv_statistic tuple for this attribute? */
oldtup = SearchSysCache1(MVSTATOID,
@@ -225,6 +253,21 @@ update_mv_stats(Oid mvoid, MVDependencies dependencies)
heap_close(sd, RowExclusiveLock);
}
+
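+/*
+ * Map an attribute number to its dimension index, relying on the
+ * stakeys vector being sorted (it is built from qsort-ed attnums).
+ * For illustration: with stakeys {2, 5, 7}, varattno = 5 maps to
+ * dimension 1.
+ */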
+int
+mv_get_index(AttrNumber varattno, int2vector * stakeys)
+{
+ int i, idx = 0;
+ for (i = 0; i < stakeys->dim1; i++)
+ {
+ if (stakeys->values[i] < varattno)
+ idx += 1;
+ else
+ break;
+ }
+ return idx;
+}
+
/* multi-variate stats comparator */
/*
@@ -235,11 +278,15 @@ update_mv_stats(Oid mvoid, MVDependencies dependencies)
int
compare_scalars_simple(const void *a, const void *b, void *arg)
{
- Datum da = *(Datum*)a;
- Datum db = *(Datum*)b;
- SortSupport ssup= (SortSupport) arg;
+ return compare_datums_simple(*(Datum*)a,
+ *(Datum*)b,
+ (SortSupport)arg);
+}
- return ApplySortComparator(da, false, db, false, ssup);
+int
+compare_datums_simple(Datum a, Datum b, SortSupport ssup)
+{
+ return ApplySortComparator(a, false, b, false, ssup);
}
/*
diff --git a/src/backend/utils/mvstats/common.h b/src/backend/utils/mvstats/common.h
index b98ceb7..fca2782 100644
--- a/src/backend/utils/mvstats/common.h
+++ b/src/backend/utils/mvstats/common.h
@@ -59,6 +59,14 @@ typedef struct
int *tupnoLink;
} CompareScalarsContext;
+/* (de)serialization info */
+typedef struct DimensionInfo {
+ int nvalues; /* number of deduplicated values */
+ int nbytes; /* number of bytes (serialized) */
+ int typlen; /* pg_type.typlen */
+ bool typbyval; /* pg_type.typbyval */
+} DimensionInfo;
+
/* multi-sort */
typedef struct MultiSortSupportData {
int ndims; /* number of dimensions supported by the sort */
@@ -82,10 +90,8 @@ int multi_sort_compare(const void *a, const void *b, void *arg);
int multi_sort_compare_dim(int dim, const SortItem *a,
const SortItem *b, MultiSortSupport mss);
-VacAttrStats ** lookup_var_attr_stats(int2vector *attrs,
- int natts, VacAttrStats **vacattrstats);
-
/* comparators, used when constructing multivariate stats */
+int compare_datums_simple(Datum a, Datum b, SortSupport ssup);
int compare_scalars_simple(const void *a, const void *b, void *arg);
int compare_scalars_partition(const void *a, const void *b, void *arg);
int compare_scalars_memcmp(const void *a, const void *b, void *arg);
diff --git a/src/backend/utils/mvstats/dependencies.c b/src/backend/utils/mvstats/dependencies.c
index 93a2fa6..0543690 100644
--- a/src/backend/utils/mvstats/dependencies.c
+++ b/src/backend/utils/mvstats/dependencies.c
@@ -224,7 +224,7 @@
*/
MVDependencies
build_mv_dependencies(int numrows, HeapTuple *rows, int2vector *attrs,
- int natts, VacAttrStats **vacattrstats)
+ VacAttrStats **stats)
{
int i;
int numattrs = attrs->dim1;
@@ -245,13 +245,6 @@ build_mv_dependencies(int numrows, HeapTuple *rows, int2vector *attrs,
/* dimension indexes we'll check for associations [a => b] */
int dima, dimb;
- /* info for the interesting attributes only
- *
- * TODO Compute this only once and pass it to all the methods
- * that need it.
- */
- VacAttrStats **stats = lookup_var_attr_stats(attrs, natts, vacattrstats);
-
/*
* We'll reuse the same array for all the 2-column combinations.
*
@@ -287,7 +280,7 @@ build_mv_dependencies(int numrows, HeapTuple *rows, int2vector *attrs,
for (dima = 0; dima < numattrs; dima++)
{
/* prepare the sort function for the first dimension */
- multi_sort_add_dimension(mss, 0, dima, vacattrstats);
+ multi_sort_add_dimension(mss, 0, dima, stats);
for (dimb = 0; dimb < numattrs; dimb++)
{
@@ -309,7 +302,7 @@ build_mv_dependencies(int numrows, HeapTuple *rows, int2vector *attrs,
continue;
/* prepare the sort function for the second dimension */
- multi_sort_add_dimension(mss, 1, dimb, vacattrstats);
+ multi_sort_add_dimension(mss, 1, dimb, stats);
/* reset the values and isnull flags */
memset(values, 0, sizeof(Datum) * numrows * 2);
diff --git a/src/backend/utils/mvstats/mcv.c b/src/backend/utils/mvstats/mcv.c
new file mode 100644
index 0000000..2b3d171
--- /dev/null
+++ b/src/backend/utils/mvstats/mcv.c
@@ -0,0 +1,1002 @@
+/*-------------------------------------------------------------------------
+ *
+ * mcv.c
+ * POSTGRES multivariate MCV lists
+ *
+ *
+ * Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ * src/backend/utils/mvstats/mcv.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "common.h"
+
+/*
+ * Multivariate MCVs (most-common values lists) are a straightforward
+ * extension of regular MCV lists, tracking combinations of values for
+ * several attributes (columns), including NULL flags and the frequency
+ * of each combination.
+ *
+ * For columns with a small number of distinct values, this works quite
+ * well and may represent the distribution pretty exactly. For columns
+ * with a large number of distinct values (e.g. stored as FLOAT), this
+ * does not work that well.
+ *
+ * If we can represent the distribution as a MCV list, we can estimate
+ * some clauses (e.g. equality clauses) much more accurately than
+ * using histograms, for example.
+ *
+ * Discrete distributions are also easier to combine into a larger
+ * distribution (but this is not yet implemented).
+ *
+ *
+ * TODO For types that don't reasonably support ordering (either because
+ * the type does not support that or when the user adds some option
+ * to the ADD STATISTICS command - e.g. UNSORTED_STATS), building
+ * the histogram may be pointless and inefficient. This is esp.
+ * true for varlena types that may be quite large and a large MCV
+ * list may be a better choice, because it makes equality estimates
+ * more accurate. Due to the unsorted nature, range queries on those
+ * attributes are rather useless anyway.
+ *
+ * Another thing is that by restricting to MCV list and equality
+ * conditions, we can use hash values instead of long varlena values.
+ * The equality estimation will be very accurate.
+ *
+ * This however complicates matching the columns to available
+ * statistics, as it will require matching clauses (not columns) to
+ * stats. And it may get quite complex - e.g. what if there are
+ * multiple clauses, each compatible with different stats subset?
+ *
+ *
+ * Selectivity estimation
+ * ----------------------
+ * The estimation, implemented in clauselist_mv_selectivity_mcvlist(),
+ * is quite simple in principle - walk through the MCV items and sum
+ * frequencies of all the items that match all the clauses.
+ *
+ * The current implementation uses MCV lists to estimate these types
+ * of clauses (think of WHERE conditions):
+ *
+ * (a) equality clauses WHERE (a = 1) AND (b = 2)
+ *
+ * (b) inequality clauses WHERE (a < 1) AND (b >= 2)
+ *
+ * It's possible to add more clauses, for example:
+ *
+ * (c) NULL clauses WHERE (a IS NULL) AND (b IS NOT NULL)
+ *
+ * (d) multi-var clauses WHERE (a > b)
+ *
+ * and so on. These are tasks for the future, not yet implemented.
+ *
+ *
+ * Estimating equality clauses
+ * ---------------------------
+ * When computing selectivity estimate for equality clauses
+ *
+ * (a = 1) AND (b = 2)
+ *
+ * we can do this estimate pretty exactly assuming that two conditions
+ * are met:
+ *
+ * (1) there's an equality condition on each attribute
+ *
+ * (2) we find a matching item in the MCV list
+ *
+ * In that case we know the MCV item represents all the tuples matching
+ * the clauses, and the selectivity estimate is complete. This is what
+ * we call 'full match'.
+ *
+ * When only (1) holds, but there's no matching MCV item, we don't know
+ * whether there are no such rows or whether they are just not frequent
+ * enough. We can however use the frequency of the least frequent MCV
+ * item as an upper bound for the selectivity.
+ *
+ * If the equality conditions match only a subset of the attributes
+ * the MCV list is built on, we can't get a full match - we may get
+ * multiple MCV items matching the clauses, and even if we get a single
+ * match there may be items that did not get into the MCV list. In this
+ * case we can still use the frequency of the last MCV item to clamp
+ * the 'additional' selectivity not accounted for by the matching items.
+ *
+ * If there's no histogram because the MCV list approximates the
+ * distribution accurately (and not because the histogram was disabled),
+ * it does not really matter whether there are equality conditions on
+ * all the columns - we can do pretty accurate estimation using the MCV.
+ *
+ * TODO For a combination of equality conditions (not full-match case)
+ * we probably can clamp the selectivity by the minimum of
+ * selectivities for each condition. For example if we know the
+ * number of distinct values for each column, we can use 1/ndistinct
+ * as a per-column estimate. Or rather 1/ndistinct + selectivity
+ * derived from the MCV list.
+ *
+ * If we know the estimated number of distinct combinations of the
+ * columns (i.e. ndistinct(A,B)), we may estimate the average
+ * frequency of items in the remaining 10% (see the example in the
+ * next section) as [10% / ndistinct(A,B)].
+ *
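+ * A worked example (made-up numbers): assume an MCV list on (a, b)
+ * whose least frequent item has frequency 0.001. If the clauses
+ * (a = 1) AND (b = 2) match the item {1, 2} with frequency 0.015,
+ * that's a full match and the selectivity is exactly 0.015. If no
+ * item matches, the selectivity is still bounded by 0.001 - a more
+ * frequent combination would have made it into the list.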
+ *
+ * Bounding estimates
+ * ------------------
+ * In general the MCV lists may not provide estimates as accurate as
+ * for the full-match equality case, but may provide some useful
+ * lower/upper boundaries for the estimation error.
+ *
+ * With equality clauses we can do a few more tricks to narrow this
+ * error range (see the previous section and TODO), but with inequality
+ * clauses (or generally non-equality clauses), it's rather difficult.
+ * There's nothing like a 'full match' - we have to consider both the
+ * MCV items and the remaining part every time. We can't use the minimum
+ * selectivity of MCV items, as the clauses may match multiple items.
+ *
+ * For example, with an MCV list on columns (A, B) covering 90% of the
+ * table (a fraction computed while building the MCV list), about 10%
+ * of the table is not represented by the MCV list. So even if the
+ * clauses match all the remaining rows (not represented by the MCV
+ * items), the selectivity can't exceed those 10%. We may use 1/2 of
+ * the remaining selectivity as an estimate (minimizing average error).
+ *
+ * TODO Most of these ideas (error limiting) are not yet implemented.
+ *
+ *
+ * General TODO
+ * ------------
+ *
+ * FIXME Use max_mcv_items from ALTER TABLE ADD STATISTICS command.
+ *
+ * TODO Add support for IS [NOT] NULL clauses, and clauses referencing
+ * multiple columns (a < b).
+ *
+ * TODO It's possible to build a special case of MCV list, storing not
+ * the actual values but only 32/64-bit hash. This is only useful
+ * for estimating equality clauses and for large varlena types,
+ * which are very impractical for plain MCV list because of size.
+ * But for those data types we really want just the equality
+ * clauses, so it's actually a good solution.
+ *
+ * TODO Currently there's no logic to consider building only a MCV list
+ * (and not building the histogram at all), except for doing this
+ * decision manually in ADD STATISTICS.
+ */
+
+/*
+ * Each serialized item needs to store (in this order):
+ *
+ * - indexes (ndim * sizeof(int32))
+ * - null flags (ndim * sizeof(bool))
+ * - frequency (sizeof(double))
+ *
+ * So in total:
+ *
+ * ndim * (sizeof(int32) + sizeof(bool)) + sizeof(double)
+ */
+#define ITEM_SIZE(ndims) \
+ ((ndims) * (sizeof(int32) + sizeof(bool)) + sizeof(double))
+
+/* pointers into a flat serialized item of ITEM_SIZE(n) bytes */
+#define ITEM_INDEXES(item) ((int32*)(item))
+#define ITEM_NULLS(item,ndims) ((bool*)(ITEM_INDEXES(item) + (ndims)))
+#define ITEM_FREQUENCY(item,ndims) ((double*)(ITEM_NULLS(item,ndims) + (ndims)))
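+
+/*
+ * For example (illustration only), with ndims = 2 an item occupies
+ * ITEM_SIZE(2) = 2 * (4 + 1) + 8 = 18 bytes, assuming the usual
+ * 4-byte int32, 1-byte bool and 8-byte double:
+ *
+ *     bytes  0- 7   two int32 indexes (ITEM_INDEXES)
+ *     bytes  8- 9   two bool NULL flags (ITEM_NULLS)
+ *     bytes 10-17   double frequency (ITEM_FREQUENCY)
+ *
+ * Note the frequency is not 8-byte aligned - that's fine, as the
+ * serialize/deserialize code always accesses it through memcpy(),
+ * never by dereferencing the double pointer directly.
+ */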
+
+/*
+ * Builds an MCV list from the sample rows, and removes the rows
+ * represented by the MCV list from the sample (the number of remaining
+ * sample rows is returned via the numrows_filtered parameter).
+ *
+ * The method is quite simple - in short, it performs these steps:
+ *
+ * (1) sort the data (default collation, '<' for the data type)
+ *
+ * (2) count distinct groups, decide how many to keep
+ *
+ * (3) build the MCV list using the threshold determined in (2)
+ *
+ * (4) remove rows represented by the MCV from the sample
+ *
+ * For more details, see the comments in the code.
+ *
+ * FIXME Single-dimensional MCV is sorted by frequency (descending). We
+ * should do that too, because when walking through the list we
+ * want to check the most frequent items first.
+ *
+ * TODO We're using Datum (8B), even for smaller data types (e.g.
+ * int4 or float4). Maybe we could save some space here, but the
+ * bytea compression should handle it just fine.
+ *
+ * TODO This probably should not use the ndistinct computed from the
+ * sample directly, but should rather estimate the number of distinct
+ * values in the table, no?
+ */
+MCVList
+build_mv_mcvlist(int numrows, HeapTuple *rows, int2vector *attrs,
+ VacAttrStats **stats, int *numrows_filtered)
+{
+ int i, j;
+ int numattrs = attrs->dim1;
+ int ndistinct = 0;
+ int mcv_threshold = 0;
+ int count = 0;
+ int nitems = 0;
+
+ MCVList mcvlist = NULL;
+
+ /* Sort by multiple columns (using array of SortSupport) */
+ MultiSortSupport mss = multi_sort_init(numattrs);
+
+ /*
+ * Preallocate space for all the items as a single chunk, and point
+ * the items to the appropriate parts of the array.
+ */
+ SortItem *items = (SortItem*)palloc0(numrows * sizeof(SortItem));
+ Datum *values = (Datum*)palloc0(sizeof(Datum) * numrows * numattrs);
+ bool *isnull = (bool*)palloc0(sizeof(bool) * numrows * numattrs);
+
+ /* keep all the rows by default (as if there was no MCV list) */
+ *numrows_filtered = numrows;
+
+ for (i = 0; i < numrows; i++)
+ {
+ items[i].values = &values[i * numattrs];
+ items[i].isnull = &isnull[i * numattrs];
+ }
+
+ /* load the values/null flags from sample rows */
+ for (j = 0; j < numrows; j++)
+ for (i = 0; i < numattrs; i++)
+ items[j].values[i] = heap_getattr(rows[j], attrs->values[i],
+ stats[i]->tupDesc, &items[j].isnull[i]);
+
+ /* prepare the sort functions for all the attributes */
+ for (i = 0; i < numattrs; i++)
+ multi_sort_add_dimension(mss, i, i, stats);
+
+ /* do the sort, using the multi-sort */
+ qsort_arg((void *) items, numrows, sizeof(SortItem),
+ multi_sort_compare, mss);
+
+ /*
+ * Count the number of distinct groups - just walk through the
+ * sorted list and count the number of key changes. We use this to
+ * determine the threshold (125% of the average frequency).
+ */
+ ndistinct = 1;
+ for (i = 1; i < numrows; i++)
+ if (multi_sort_compare(&items[i], &items[i-1], mss) != 0)
+ ndistinct += 1;
+
+ /*
+ * Determine how many groups actually exceed the threshold, and then
+ * walk the array again and collect them into an array. We'll always
+ * require at least 4 rows per group.
+ *
+ * But if we can fit all the distinct values in the MCV list (i.e.
+ * if there are fewer distinct groups than MVSTAT_MCVLIST_MAX_ITEMS),
+ * we'll require only 2 rows per group.
+ *
+ * TODO For now the threshold is the same as in the single-column
+ * case (average + 25%), but maybe that's worth revisiting
+ * for the multivariate case.
+ *
+ * TODO We can do this only if we believe we got all the distinct
+ * values of the table.
+ *
+ * FIXME This should really reference mcv_max_items (from catalog)
+ * instead of the constant MVSTAT_MCVLIST_MAX_ITEMS.
+ */
+ mcv_threshold = 1.25 * numrows / ndistinct;
+ mcv_threshold = (mcv_threshold < 4) ? 4 : mcv_threshold;
+
+ if (ndistinct <= MVSTAT_MCVLIST_MAX_ITEMS)
+ mcv_threshold = 2;
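+
+ /*
+ * Example with made-up numbers: for numrows = 30000 sample rows
+ * and ndistinct = 1000 groups, the threshold works out as
+ * 1.25 * 30000 / 1000 = 37 (integer truncation), so only groups
+ * of at least 37 sample rows become MCV items.
+ */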
+
+ /*
+ * Walk through the sorted data again, and see how many groups
+ * reach the mcv_threshold (and become an item in the MCV list).
+ */
+ count = 1;
+ for (i = 1; i <= numrows; i++)
+ {
+ /* last row or new group, so check if we exceed mcv_threshold */
+ if ((i == numrows) || (multi_sort_compare(&items[i], &items[i-1], mss) != 0))
+ {
+ /* group hits the threshold, count the group as MCV item */
+ if (count >= mcv_threshold)
+ nitems += 1;
+
+ count = 1;
+ }
+ else /* within group, so increase the number of items */
+ count += 1;
+ }
+
+ /* we know the number of MCV list items, so let's build the list */
+ if (nitems > 0)
+ {
+ /* allocate the MCV list structure, set parameters we know */
+ mcvlist = (MCVList)palloc0(sizeof(MCVListData));
+
+ mcvlist->magic = MVSTAT_MCV_MAGIC;
+ mcvlist->type = MVSTAT_MCV_TYPE_BASIC;
+ mcvlist->ndimensions = numattrs;
+ mcvlist->nitems = nitems;
+
+ /*
+ * Preallocate the Datum/isnull arrays (not as a single chunk, as
+ * we'll pass this out of this method and thus it needs to be easy
+ * to pfree() the data - with a single chunk we wouldn't know where
+ * the arrays start).
+ *
+ * TODO Maybe the reasoning that we can't allocate a single
+ * piece because we're passing it out is bogus? Who'd
+ * free a single item of the MCV list, anyway?
+ *
+ * TODO With a proper encoding (stuffing all the values into a
+ * list-level array), this may no longer be true?
+ */
+ mcvlist->items = (MCVItem*)palloc0(sizeof(MCVItem)*nitems);
+
+ for (i = 0; i < nitems; i++)
+ {
+ mcvlist->items[i] = (MCVItem)palloc0(sizeof(MCVItemData));
+ mcvlist->items[i]->values = (Datum*)palloc0(sizeof(Datum)*numattrs);
+ mcvlist->items[i]->isnull = (bool*)palloc0(sizeof(bool)*numattrs);
+ }
+
+ /*
+ * Repeat the same loop as above, but this time copy the data
+ * into the MCV list (for items exceeding the threshold).
+ *
+ * TODO Maybe we could simply remember indexes of the last item
+ * in each group (from the previous loop)?
+ */
+ count = 1;
+ nitems = 0;
+ for (i = 1; i <= numrows; i++)
+ {
+ /* last row or a new group */
+ if ((i == numrows) || (multi_sort_compare(&items[i], &items[i-1], mss) != 0))
+ {
+ /* count the MCV item if exceeding the threshold (and copy into the array) */
+ if (count >= mcv_threshold)
+ {
+ /* just pointer to the proper place in the list */
+ MCVItem item = mcvlist->items[nitems];
+
+ /* copy values from the _previous_ group (its last item) */
+ memcpy(item->values, items[(i-1)].values, sizeof(Datum) * numattrs);
+ memcpy(item->isnull, items[(i-1)].isnull, sizeof(bool) * numattrs);
+
+ /* and finally the group frequency */
+ item->frequency = (double)count / numrows;
+
+ /* next item */
+ nitems += 1;
+ }
+
+ count = 1;
+ }
+ else /* same group, just increase the number of items */
+ count += 1;
+ }
+
+ /* make sure the loops are consistent */
+ Assert(nitems == mcvlist->nitems);
+
+ /*
+ * Remove the rows matching the MCV list (i.e. keep only rows
+ * that are not represented by the MCV list).
+ *
+ * FIXME This implementation is rather naive, effectively O(N^2).
+ * As the MCV list grows, the check will take longer and
+ * longer. And as the number of sampled rows increases (by
+ * increasing statistics target), it will take longer and
+ * longer. One option is to sort the MCV items first and
+ * then perform a binary search.
+ *
+ * A better option would be keeping the ID of the row in
+ * the sort item, and then just walk through the items and
+ * mark rows to remove (in a bitmap of the same size).
+ * There's not space for that in SortItem at this moment,
+ * but it's trivial to add 'private' pointer, or just
+ * using another structure with extra field (starting with
+ * SortItem, so that the comparators etc. still work).
+ *
+ * Another option is to use the sorted array of items
+ * (because that's how we sorted the source data), and
+ * simply do a bsearch() into it. If we find a matching
+ * item, the row belongs to the MCV list.
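+ *
+ * A sketch of that last option (not implemented, and the helpers
+ * are hypothetical): the MCV items are built from the sorted
+ * sample, so they are already ordered by multi_sort_compare(),
+ * and each row could be tested with
+ *
+ *     found = (bsearch_arg(&item, mcvlist->items,
+ *                          mcvlist->nitems, sizeof(MCVItem),
+ *                          mcv_compare, mss) != NULL);
+ *
+ * where bsearch_arg() would be a bsearch() variant accepting a
+ * context argument (see bsearch_comparator below for why stock
+ * bsearch() is not enough).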
+ */
+ if (nitems == ndistinct) /* all rows are covered by MCV items */
+ *numrows_filtered = 0;
+ else /* (nitems < ndistinct) && (nitems > 0) */
+ {
+ int nfiltered = 0;
+ HeapTuple *rows_filtered = (HeapTuple*)palloc0(sizeof(HeapTuple) * numrows);
+
+ /* used for the searches */
+ SortItem item, mcvitem;
+
+ item.values = (Datum*)palloc0(numattrs * sizeof(Datum));
+ item.isnull = (bool*)palloc0(numattrs * sizeof(bool));
+
+ /*
+ * FIXME we don't need to allocate this, we can reference
+ * the MCV item directly ...
+ */
+ mcvitem.values = (Datum*)palloc0(numattrs * sizeof(Datum));
+ mcvitem.isnull = (bool*)palloc0(numattrs * sizeof(bool));
+
+ /* walk through the tuples, compare the values to MCV items */
+ for (i = 0; i < numrows; i++)
+ {
+ bool match = false;
+
+ /* collect the key values from the row */
+ for (j = 0; j < numattrs; j++)
+ item.values[j] = heap_getattr(rows[i], attrs->values[j],
+ stats[j]->tupDesc, &item.isnull[j]);
+
+ /* scan through the MCV list for matches */
+ for (j = 0; j < mcvlist->nitems; j++)
+ {
+ /*
+ * TODO Create a SortItem/MCVItem comparator so that
+ * we don't need to do memcpy() like crazy.
+ */
+ memcpy(mcvitem.values, mcvlist->items[j]->values,
+ numattrs * sizeof(Datum));
+ memcpy(mcvitem.isnull, mcvlist->items[j]->isnull,
+ numattrs * sizeof(bool));
+
+ if (multi_sort_compare(&item, &mcvitem, mss) == 0)
+ {
+ match = true;
+ break;
+ }
+ }
+
+ /* if no match in the MCV list, copy the row into the filtered ones */
+ if (! match)
+ memcpy(&rows_filtered[nfiltered++], &rows[i], sizeof(HeapTuple));
+ }
+
+ /* replace the rows and remember how many rows we kept */
+ memcpy(rows, rows_filtered, sizeof(HeapTuple) * nfiltered);
+ *numrows_filtered = nfiltered;
+
+ /* free all the data used here */
+ pfree(rows_filtered);
+ pfree(item.values);
+ pfree(item.isnull);
+ pfree(mcvitem.values);
+ pfree(mcvitem.isnull);
+ }
+ }
+
+ pfree(values);
+ pfree(items);
+ pfree(isnull);
+
+ return mcvlist;
+}
+
+
+/* fetch the MCV list (as a bytea) from the pg_mv_statistic catalog */
+bytea *
+fetch_mv_mcvlist(Oid mvoid)
+{
+ Relation indrel;
+ SysScanDesc indscan;
+ ScanKeyData skey;
+ HeapTuple htup;
+ bytea *mcvlist = NULL;
+
+ /* Prepare to scan pg_mv_statistic for the row with OID = mvoid. */
+ ScanKeyInit(&skey,
+ ObjectIdAttributeNumber,
+ BTEqualStrategyNumber, F_OIDEQ,
+ ObjectIdGetDatum(mvoid));
+
+ indrel = heap_open(MvStatisticRelationId, AccessShareLock);
+ indscan = systable_beginscan(indrel, MvStatisticOidIndexId, true,
+ NULL, 1, &skey);
+
+ while (HeapTupleIsValid(htup = systable_getnext(indscan)))
+ {
+ bool isnull = false;
+ Datum tmp = SysCacheGetAttr(MVSTATOID, htup,
+ Anum_pg_mv_statistic_stamcv, &isnull);
+
+ Assert(!isnull);
+
+ mcvlist = DatumGetByteaP(tmp);
+
+ break;
+ }
+
+ systable_endscan(indscan);
+
+ heap_close(indrel, AccessShareLock);
+
+ /* TODO maybe save the list into relcache, as in RelationGetIndexList
+ * (which served as the inspiration for this function)? */
+
+ return mcvlist;
+}
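+
+/*
+ * Expected usage (a sketch - the caller is responsible for pairing
+ * this with deserialization):
+ *
+ *     bytea  *data = fetch_mv_mcvlist(mvoid);
+ *     MCVList mcvlist = deserialize_mv_mcvlist(data);
+ *
+ * Note the FIXME in deserialize_mv_mcvlist() - the result may still
+ * reference 'data', so don't pfree() it while the list is in use.
+ */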
+
+/* print some basic info about the MCV list
+ *
+ * TODO Add info about what part of the table this covers.
+ */
+Datum
+pg_mv_stats_mcvlist_info(PG_FUNCTION_ARGS)
+{
+ bytea *data = PG_GETARG_BYTEA_P(0);
+ char *result;
+
+ MCVList mcvlist = deserialize_mv_mcvlist(data);
+
+ result = palloc0(128);
+ snprintf(result, 128, "nitems=%d", mcvlist->nitems);
+
+ pfree(mcvlist);
+
+ PG_RETURN_TEXT_P(cstring_to_text(result));
+}
+
+/* used to pass context into bsearch() */
+static SortSupport ssup_private = NULL;
+
+static int bsearch_comparator(const void * a, const void * b);
+
+/*
+ * Serialize MCV list into a bytea value. The basic algorithm is simple:
+ *
+ * (1) perform deduplication for each attribute (separately)
+ * (a) collect all (non-NULL) attribute values from all MCV items
+ * (b) sort the data (using 'lt' from VacAttrStats)
+ * (c) remove duplicate values from the array
+ *
+ * (2) serialize the arrays into a bytea value
+ *
+ * (3) process all MCV list items
+ * (a) replace values with indexes into the arrays
+ *
+ * Each attribute has to be processed separately, because we're mixing
+ * different datatypes, and we don't know what equality means for them.
+ * We're also mixing pass-by-value and pass-by-ref types, and so on.
+ *
+ * We'll use 32-bit values for the indexes in step (3), although we
+ * could probably use just 16 bits as we don't allow more than 8k
+ * items in the MCV list (max_mcv_items) - well, we might increase
+ * this to 32k and still fit into a signed 16-bit value. But let's be
+ * lazy and rely on the varlena compression to kick in - most of the
+ * index bytes will be 0x00, so it should compress nicely.
+ *
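+ * As a small example of the deduplication, an MCV list with items
+ * (1, 'x'), (2, 'x') and (2, 'y') gets per-dimension arrays [1, 2]
+ * and ['x', 'y'], and the serialized items store the index pairs
+ * (0, 0), (1, 0) and (1, 1), plus the NULL flags and frequencies.
+ *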
+ * FIXME This probably leaks memory, or at least uses it inefficiently
+ * (many small palloc() calls instead of a large one).
+ *
+ * TODO Consider using 16-bit values for the indexes in step (3).
+ *
+ * TODO Consider packing boolean flags (NULL) for each item into 'char'
+ * or a longer type (instead of using an array of bool items).
+ */
+bytea *
+serialize_mv_mcvlist(MCVList mcvlist, int2vector *attrs,
+ VacAttrStats **stats)
+{
+ int i, j;
+ int ndims = mcvlist->ndimensions;
+ int itemsize = ITEM_SIZE(ndims);
+
+ Size total_length = 0;
+
+ char *item = palloc0(itemsize);
+
+ /* serialized items (indexes into arrays, etc.) */
+ bytea *output;
+ char *data = NULL;
+
+ /* values per dimension (and number of non-NULL values) */
+ Datum **values = (Datum**)palloc0(sizeof(Datum*) * ndims);
+ int *counts = (int*)palloc0(sizeof(int) * ndims);
+
+ /* info about dimensions (for deserialize) */
+ DimensionInfo * info
+ = (DimensionInfo *)palloc0(sizeof(DimensionInfo)*ndims);
+
+ /* sort support data */
+ SortSupport ssup = (SortSupport)palloc0(sizeof(SortSupportData)*ndims);
+
+ /* collect and deduplicate values for each dimension */
+ for (i = 0; i < ndims; i++)
+ {
+ int count;
+ StdAnalyzeData *tmp = (StdAnalyzeData *)stats[i]->extra_data;
+
+ /* keep important info about the data type */
+ info[i].typlen = stats[i]->attrtype->typlen;
+ info[i].typbyval = stats[i]->attrtype->typbyval;
+
+ /* allocate space for all values, including NULLs (won't use them) */
+ values[i] = (Datum*)palloc0(sizeof(Datum) * mcvlist->nitems);
+
+ for (j = 0; j < mcvlist->nitems; j++)
+ {
+ if (! mcvlist->items[j]->isnull[i]) /* skip NULL values */
+ {
+ values[i][counts[i]] = mcvlist->items[j]->values[i];
+ counts[i] += 1;
+ }
+ }
+
+ /* there are just NULL values in this dimension */
+ if (counts[i] == 0)
+ continue;
+
+ /* sort and deduplicate */
+ ssup[i].ssup_cxt = CurrentMemoryContext;
+ ssup[i].ssup_collation = DEFAULT_COLLATION_OID;
+ ssup[i].ssup_nulls_first = false;
+
+ PrepareSortSupportFromOrderingOp(tmp->ltopr, &ssup[i]);
+
+ qsort_arg(values[i], counts[i], sizeof(Datum),
+ compare_scalars_simple, &ssup[i]);
+
+ /*
+ * Walk through the array and eliminate duplicate values, but
+ * keep the ordering (so that we can do a bsearch later). We know
+ * there's at least 1 item, so we can skip the first element.
+ */
+ count = 1; /* number of deduplicated items */
+ for (j = 1; j < counts[i]; j++)
+ {
+ /* if it's different from the previous value, we need to keep it */
+ if (compare_datums_simple(values[i][j-1], values[i][j], &ssup[i]) != 0)
+ {
+ /* XXX: not needed if (count == j) */
+ values[i][count] = values[i][j];
+ count += 1;
+ }
+ }
+
+ /* keep info about the deduplicated count */
+ info[i].nvalues = count;
+
+ /* compute size of the serialized data */
+ if (info[i].typbyval)
+ /*
+ * passed by value, so just Datum array (int4, int8, ...)
+ *
+ * TODO Might save a few bytes here by storing just typlen
+ * bytes instead of the whole Datum (8B on 64-bit platforms).
+ */
+ info[i].nbytes = info[i].nvalues * sizeof(Datum);
+ else if (info[i].typlen > 0)
+ /* passed by reference, but fixed length (name, tid, ...) */
+ info[i].nbytes = info[i].nvalues * info[i].typlen;
+ else if (info[i].typlen == -1)
+ /* varlena, so just use VARSIZE_ANY */
+ for (j = 0; j < info[i].nvalues; j++)
+ info[i].nbytes += VARSIZE_ANY(values[i][j]);
+ else if (info[i].typlen == -2)
+ /* cstring, so strlen + 1 byte for the '\0' terminator
+ * (the write loop below copies strlen + 1 bytes) */
+ for (j = 0; j < info[i].nvalues; j++)
+ info[i].nbytes += strlen(DatumGetPointer(values[i][j])) + 1;
+ else
+ elog(ERROR, "unknown data type typbyval=%d typlen=%d",
+ info[i].typbyval, info[i].typlen);
+ }
+
+ /*
+ * Now we finally know how much space we'll need for the serialized
+ * MCV list, as it contains these fields:
+ *
+ * - length (4B) for varlena
+ * - magic (4B)
+ * - type (4B)
+ * - ndimensions (4B)
+ * - nitems (4B)
+ * - info (ndim * sizeof(DimensionInfo))
+ * - arrays of values for each dimension
+ * - serialized items (nitems * itemsize)
+ *
+ * So the 'header' size is 20B + ndim * sizeof(DimensionInfo) and
+ * then we'll place the data.
+ */
+ total_length = (sizeof(int32) + offsetof(MCVListData, items)
+ + ndims * sizeof(DimensionInfo)
+ + mcvlist->nitems * itemsize);
+
+ for (i = 0; i < ndims; i++)
+ total_length += info[i].nbytes;
+
+ /* enforce arbitrary limit of 1MB */
+ if (total_length > 1024 * 1024)
+ elog(ERROR, "serialized MCV exceeds 1MB (%ld)", total_length);
+
+ /* allocate space for the serialized MCV list, set header fields */
+ output = (bytea*)palloc0(total_length);
+ SET_VARSIZE(output, total_length);
+
+ /* we'll use 'data' to keep track of the place to write to */
+ data = VARDATA(output);
+
+ memcpy(data, mcvlist, offsetof(MCVListData, items));
+ data += offsetof(MCVListData, items);
+
+ memcpy(data, info, sizeof(DimensionInfo) * ndims);
+ data += sizeof(DimensionInfo) * ndims;
+
+ /* value array for each dimension */
+ for (i = 0; i < ndims; i++)
+ {
+#ifdef USE_ASSERT_CHECKING
+ char *tmp = data;
+#endif
+ for (j = 0; j < info[i].nvalues; j++)
+ {
+ if (info[i].typbyval)
+ {
+ /* passed by value / Datum */
+ memcpy(data, &values[i][j], sizeof(Datum));
+ data += sizeof(Datum);
+ }
+ else if (info[i].typlen > 0)
+ {
+ /* passed by reference, but fixed length (name, tid, ...) */
+ memcpy(data, &values[i][j], info[i].typlen);
+ data += info[i].typlen;
+ }
+ else if (info[i].typlen == -1)
+ {
+ /* varlena */
+ memcpy(data, DatumGetPointer(values[i][j]),
+ VARSIZE_ANY(values[i][j]));
+ data += VARSIZE_ANY(values[i][j]);
+ }
+ else if (info[i].typlen == -2)
+ {
+ /* cstring (don't forget the \0 terminator!) */
+ memcpy(data, DatumGetPointer(values[i][j]),
+ strlen(DatumGetPointer(values[i][j])) + 1);
+ data += strlen(DatumGetPointer(values[i][j])) + 1;
+ }
+ }
+ Assert((data - tmp) == info[i].nbytes);
+ }
+
+ /* and finally, the MCV items */
+ for (i = 0; i < mcvlist->nitems; i++)
+ {
+ /* don't write beyond the allocated space */
+ Assert(data <= (char*)output + total_length - itemsize);
+
+ /* reset the values for each item */
+ memset(item, 0, itemsize);
+
+ for (j = 0; j < ndims; j++)
+ {
+ /* do the lookup only for non-NULL values */
+ if (! mcvlist->items[i]->isnull[j])
+ {
+ Datum * v = NULL;
+ ssup_private = &ssup[j];
+
+ v = (Datum*)bsearch(&mcvlist->items[i]->values[j],
+ values[j], info[j].nvalues, sizeof(Datum),
+ bsearch_comparator);
+
+ if (v == NULL)
+ elog(ERROR, "value for dim %d not found in array", j);
+
+ /* compute index within the array */
+ ITEM_INDEXES(item)[j] = (v - values[j]);
+
+ /* check the index is within expected bounds */
+ Assert(ITEM_INDEXES(item)[j] >= 0);
+ Assert(ITEM_INDEXES(item)[j] < info[j].nvalues);
+ }
+ }
+
+ /* copy NULL and frequency flags into the item */
+ memcpy(ITEM_NULLS(item, ndims),
+ mcvlist->items[i]->isnull, sizeof(bool) * ndims);
+ memcpy(ITEM_FREQUENCY(item, ndims),
+ &mcvlist->items[i]->frequency, sizeof(double));
+
+ /* copy the item into the array */
+ memcpy(data, item, itemsize);
+
+ data += itemsize;
+ }
+
+ /* at this point we expect to match the total_length exactly */
+ Assert((data - (char*)output) == total_length);
+
+ return output;
+}
+
+/* inverse to serialize_mv_mcvlist() - see the comment there */
+MCVList
+deserialize_mv_mcvlist(bytea *data)
+{
+ int i, j;
+ Size expected_size;
+ MCVList mcvlist;
+ char *tmp;
+
+ int ndims, nitems, itemsize;
+ DimensionInfo *info = NULL;
+
+ int32 *indexes = NULL;
+ Datum **values = NULL;
+
+ if (data == NULL)
+ return NULL;
+
+ if (VARSIZE_ANY_EXHDR(data) < offsetof(MCVListData,items))
+ elog(ERROR, "invalid MCV Size %ld (expected at least %ld)",
+ VARSIZE_ANY_EXHDR(data), offsetof(MCVListData,items));
+
+ /* read the MCV list header */
+ mcvlist = (MCVList)palloc0(sizeof(MCVListData));
+
+ /* initialize pointer to the data part (skip the varlena header) */
+ tmp = VARDATA(data);
+
+ /* get the header and perform basic sanity checks */
+ memcpy(mcvlist, tmp, offsetof(MCVListData,items));
+ tmp += offsetof(MCVListData,items);
+
+ if (mcvlist->magic != MVSTAT_MCV_MAGIC)
+ elog(ERROR, "invalid MCV magic %d (expected %dd)",
+ mcvlist->magic, MVSTAT_MCV_MAGIC);
+
+ if (mcvlist->type != MVSTAT_MCV_TYPE_BASIC)
+ elog(ERROR, "invalid MCV type %d (expected %dd)",
+ mcvlist->type, MVSTAT_MCV_TYPE_BASIC);
+
+ nitems = mcvlist->nitems;
+ ndims = mcvlist->ndimensions;
+ itemsize = ITEM_SIZE(ndims);
+
+ Assert(nitems > 0);
+ Assert((ndims >= 2) && (ndims <= MVSTATS_MAX_DIMENSIONS));
+
+ /*
+ * Compute the size we expect with these parameters - it's incomplete,
+ * as we have yet to add the sizes of the value arrays (from the
+ * DimensionInfo records).
+ */
+ expected_size = offsetof(MCVListData,items) +
+ ndims * sizeof(DimensionInfo) +
+ (nitems * itemsize);
+
+ /* check that we have at least the DimensionInfo records */
+ if (VARSIZE_ANY_EXHDR(data) < expected_size)
+ elog(ERROR, "invalid MCV Size %ld (expected %ld)",
+ VARSIZE_ANY_EXHDR(data), expected_size);
+
+ info = (DimensionInfo*)(tmp);
+ tmp += ndims * sizeof(DimensionInfo);
+
+ /* account for the value arrays */
+ for (i = 0; i < ndims; i++)
+ expected_size += info[i].nbytes;
+
+ if (VARSIZE_ANY_EXHDR(data) != expected_size)
+ elog(ERROR, "invalid MCV Size %ld (expected %ld)",
+ VARSIZE_ANY_EXHDR(data), expected_size);
+
+ /* looks OK - not corrupted or something */
+
+ /* let's parse the value arrays */
+ values = (Datum**)palloc0(sizeof(Datum*) * ndims);
+
+ /*
+ * FIXME This uses pointers to the original data array (the types
+ * not passed by value), so when someone frees the memory,
+ * e.g. by doing something like this:
+ *
+ * bytea * data = ... fetch the data from catalog ...
+ * MCVList mcvlist = deserialize_mv_mcvlist(data);
+ * pfree(data);
+ *
+ * then 'mcvlist' references the freed memory. This needs to
+ * copy the pieces.
+ */
+ for (i = 0; i < ndims; i++)
+ {
+ if (info[i].typbyval)
+ {
+ /* passed by value / Datum - simply reuse the array */
+ values[i] = (Datum*)tmp;
+ tmp += info[i].nbytes;
+ }
+ else if (info[i].typlen > 0)
+ {
+ /* passed by reference, but fixed length (name, tid, ...) */
+ values[i] = (Datum*)palloc0(sizeof(Datum) * info[i].nvalues);
+ for (j = 0; j < info[i].nvalues; j++)
+ {
+ /* just point into the array */
+ values[i][j] = PointerGetDatum(tmp);
+ tmp += info[i].typlen;
+ }
+ }
+ else if (info[i].typlen == -1)
+ {
+ /* varlena */
+ values[i] = (Datum*)palloc0(sizeof(Datum) * info[i].nvalues);
+ for (j = 0; j < info[i].nvalues; j++)
+ {
+ /* just point into the array */
+ values[i][j] = PointerGetDatum(tmp);
+ tmp += VARSIZE_ANY(tmp);
+ }
+ }
+ else if (info[i].typlen == -2)
+ {
+ /* cstring */
+ values[i] = (Datum*)palloc0(sizeof(Datum) * info[i].nvalues);
+ for (j = 0; j < info[i].nvalues; j++)
+ {
+ /* just point into the array */
+ values[i][j] = PointerGetDatum(tmp);
+ tmp += (strlen(tmp) + 1); /* don't forget the \0 */
+ }
+ }
+ }
+
+ /* allocate space for the MCV items */
+ mcvlist->items = (MCVItem*)palloc0(sizeof(MCVItem) * nitems);
+
+ for (i = 0; i < nitems; i++)
+ {
+ MCVItem item = (MCVItem)palloc0(sizeof(MCVItemData));
+
+ item->values = (Datum*)palloc0(sizeof(Datum)*ndims);
+ item->isnull = (bool*) palloc0(sizeof(bool) *ndims);
+
+ /* just point to the right place */
+ indexes = ITEM_INDEXES(tmp);
+
+ memcpy(item->isnull, ITEM_NULLS(tmp, ndims), sizeof(bool) * ndims);
+ memcpy(&item->frequency, ITEM_FREQUENCY(tmp, ndims), sizeof(double));
+
+ /* translate the values */
+ for (j = 0; j < ndims; j++)
+ if (! item->isnull[j])
+ item->values[j] = values[j][indexes[j]];
+
+ mcvlist->items[i] = item;
+
+ tmp += ITEM_SIZE(ndims);
+
+ Assert(tmp <= (char*)data + VARSIZE_ANY(data));
+ }
+
+ /* check that we processed all the data */
+ Assert(tmp == (char*)data + VARSIZE_ANY(data));
+
+ return mcvlist;
+}
+
+/*
+ * We need to pass the SortSupport to the comparator, but bsearch()
+ * has no 'context' parameter, so we use a global variable (ugly).
+ */
+static int
+bsearch_comparator(const void * a, const void * b)
+{
+ Assert(ssup_private != NULL);
+ return compare_scalars_simple(a, b, (void*)ssup_private);
+}
diff --git a/src/include/catalog/pg_mv_statistic.h b/src/include/catalog/pg_mv_statistic.h
index 76b7db7..f88e200 100644
--- a/src/include/catalog/pg_mv_statistic.h
+++ b/src/include/catalog/pg_mv_statistic.h
@@ -35,15 +35,21 @@ CATALOG(pg_mv_statistic,3281)
/* statistics requested to build */
bool deps_enabled; /* analyze dependencies? */
+ bool mcv_enabled; /* build MCV list? */
+
+ /* MCV size */
+ int32 mcv_max_items; /* max MCV items */
/* statistics that are available (if requested) */
bool deps_built; /* dependencies were built */
+ bool mcv_built; /* MCV list was built */
/* variable-length fields start here, but we allow direct access to stakeys */
int2vector stakeys; /* array of column keys */
#ifdef CATALOG_VARLEN
bytea stadeps; /* dependencies (serialized) */
+ bytea stamcv; /* MCV list (serialized) */
#endif
} FormData_pg_mv_statistic;
@@ -59,11 +65,15 @@ typedef FormData_pg_mv_statistic *Form_pg_mv_statistic;
* compiler constants for pg_attrdef
* ----------------
*/
-#define Natts_pg_mv_statistic 5
+#define Natts_pg_mv_statistic 9
#define Anum_pg_mv_statistic_starelid 1
#define Anum_pg_mv_statistic_deps_enabled 2
-#define Anum_pg_mv_statistic_deps_built 3
-#define Anum_pg_mv_statistic_stakeys 4
-#define Anum_pg_mv_statistic_stadeps 5
+#define Anum_pg_mv_statistic_mcv_enabled 3
+#define Anum_pg_mv_statistic_mcv_max_items 4
+#define Anum_pg_mv_statistic_deps_built 5
+#define Anum_pg_mv_statistic_mcv_built 6
+#define Anum_pg_mv_statistic_stakeys 7
+#define Anum_pg_mv_statistic_stadeps 8
+#define Anum_pg_mv_statistic_stamcv 9
#endif /* PG_MV_STATISTIC_H */
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
index 9fb118a..b4e7b4f 100644
--- a/src/include/catalog/pg_proc.h
+++ b/src/include/catalog/pg_proc.h
@@ -2687,6 +2687,8 @@ DATA(insert OID = 3284 ( pg_mv_stats_dependencies_info PGNSP PGUID 12 1 0 0
DESCR("multivariate stats: functional dependencies info");
DATA(insert OID = 3285 ( pg_mv_stats_dependencies_show PGNSP PGUID 12 1 0 0 0 f f f f t f i 1 0 25 "17" _null_ _null_ _null_ _null_ pg_mv_stats_dependencies_show _null_ _null_ _null_ ));
DESCR("multivariate stats: functional dependencies show");
+DATA(insert OID = 3283 ( pg_mv_stats_mcvlist_info PGNSP PGUID 12 1 0 0 0 f f f f t f i 1 0 25 "17" _null_ _null_ _null_ _null_ pg_mv_stats_mcvlist_info _null_ _null_ _null_ ));
+DESCR("multi-variate statistics: MCV list info");
DATA(insert OID = 1928 ( pg_stat_get_numscans PGNSP PGUID 12 1 0 0 0 f f f f t f s 1 0 20 "26" _null_ _null_ _null_ _null_ pg_stat_get_numscans _null_ _null_ _null_ ));
DESCR("statistics: number of scans done for table/index");
diff --git a/src/include/utils/mvstats.h b/src/include/utils/mvstats.h
index a074253..e11aefc 100644
--- a/src/include/utils/mvstats.h
+++ b/src/include/utils/mvstats.h
@@ -25,9 +25,11 @@ typedef struct MVStatsData {
/* statistics requested in ALTER TABLE ... ADD STATISTICS */
bool deps_enabled; /* analyze functional dependencies */
+ bool mcv_enabled; /* analyze MCV lists */
/* available statistics (computed by ANALYZE) */
bool deps_built; /* functional dependencies available */
+ bool mcv_built; /* MCV list is already available */
} MVStatsData;
typedef struct MVStatsData *MVStats;
@@ -66,6 +68,47 @@ typedef MVDependenciesData* MVDependencies;
#define MVSTAT_DEPS_TYPE_BASIC 1 /* basic dependencies type */
/*
+ * Multivariate MCV (most-common value) lists
+ *
+ * A straight-forward extension of MCV items - i.e. a list (array) of
+ * combinations of attribute values, together with a frequency and
+ * null flags.
+ */
+typedef struct MCVItemData {
+ double frequency; /* frequency of this combination */
+ bool *isnull; /* flags of NULL values (up to 32 columns) */
+ Datum *values; /* variable-length (ndimensions) */
+} MCVItemData;
+
+typedef MCVItemData *MCVItem;
+
+/* multivariate MCV list - essentially an array of MCV items */
+typedef struct MCVListData {
+ uint32 magic; /* magic constant marker */
+ uint32 type; /* type of MCV list (BASIC) */
+ uint32 ndimensions; /* number of dimensions */
+ uint32 nitems; /* number of MCV items in the array */
+ MCVItem *items; /* array of MCV items */
+} MCVListData;
+
+typedef MCVListData *MCVList;
+
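+/*
+ * A minimal sketch of walking a (deserialized) MCV list:
+ *
+ *     for (i = 0; i < mcvlist->nitems; i++)
+ *     {
+ *         MCVItem item = mcvlist->items[i];
+ *
+ *         if (!item->isnull[dim])
+ *             ... inspect item->values[dim] ...
+ *     }
+ */
+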
+/* used to flag stats serialized to bytea */
+#define MVSTAT_MCV_MAGIC 0xE1A651C2 /* marks serialized bytea */
+#define MVSTAT_MCV_TYPE_BASIC 1 /* basic MCV list type */
+
+/*
+ * Limits used for mcv_max_items option, i.e. we're always guaranteed
+ * to have space for at least MVSTAT_MCVLIST_MIN_ITEMS, and we cannot
+ * have more than MVSTAT_MCVLIST_MAX_ITEMS items.
+ *
+ * This is just a boundary for the 'max' threshold - the actual list
+ * may of course contain fewer items than MVSTAT_MCVLIST_MIN_ITEMS.
+ */
+#define MVSTAT_MCVLIST_MIN_ITEMS 128 /* min items in MCV list */
+#define MVSTAT_MCVLIST_MAX_ITEMS 8192 /* max items in MCV list */
+
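+/*
+ * These limits apply to the max_mcv_items option of ADD STATISTICS,
+ * e.g. (with a hypothetical table t, mirroring the regression tests):
+ *
+ *     ALTER TABLE t ADD STATISTICS (mcv, max_mcv_items 200) ON (a, b);
+ */
+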
+/*
* TODO Maybe fetching the histogram/MCV list separately is inefficient?
* Consider adding a single `fetch_stats` method, fetching all
* stats specified using flags (or something like that).
@@ -74,24 +117,39 @@ MVStats list_mv_stats(Oid relid, int *nstats, bool built_only);
bytea * fetch_mv_rules(Oid mvoid);
bytea * fetch_mv_dependencies(Oid mvoid);
+bytea * fetch_mv_mcvlist(Oid mvoid);
bytea * serialize_mv_dependencies(MVDependencies dependencies);
+bytea * serialize_mv_mcvlist(MCVList mcvlist, int2vector *attrs,
+ VacAttrStats **stats);
/* deserialization of stats (serialization is private to analyze) */
MVDependencies deserialize_mv_dependencies(bytea * data);
+MCVList deserialize_mv_mcvlist(bytea * data);
+
+/*
+ * Returns index of the attribute number within the vector (i.e. a
+ * dimension within the stats).
+ */
+int mv_get_index(AttrNumber varattno, int2vector * stakeys);
/* FIXME this probably belongs somewhere else (not to operations stats) */
extern Datum pg_mv_stats_dependencies_info(PG_FUNCTION_ARGS);
extern Datum pg_mv_stats_dependencies_show(PG_FUNCTION_ARGS);
+extern Datum pg_mv_stats_mcvlist_info(PG_FUNCTION_ARGS);
MVDependencies
-build_mv_dependencies(int numrows, HeapTuple *rows,
- int2vector *attrs,
- int natts, VacAttrStats **vacattrstats);
+build_mv_dependencies(int numrows, HeapTuple *rows, int2vector *attrs,
+ VacAttrStats **stats);
+
+MCVList
+build_mv_mcvlist(int numrows, HeapTuple *rows, int2vector *attrs,
+ VacAttrStats **stats, int *numrows_filtered);
void build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
- int natts, VacAttrStats **vacattrstats);
+ int natts, VacAttrStats **vacattrstats);
-void update_mv_stats(Oid relid, MVDependencies dependencies);
+void update_mv_stats(Oid relid, MVDependencies dependencies, MCVList mcvlist,
+ int2vector *attrs, VacAttrStats **stats);
#endif
diff --git a/src/test/regress/expected/mv_mcv.out b/src/test/regress/expected/mv_mcv.out
new file mode 100644
index 0000000..fa298ea
--- /dev/null
+++ b/src/test/regress/expected/mv_mcv.out
@@ -0,0 +1,210 @@
+-- data type passed by value
+CREATE TABLE mcv_list (
+ a INT,
+ b INT,
+ c INT
+);
+-- unknown column
+ALTER TABLE mcv_list ADD STATISTICS (mcv) ON (unknown_column);
+ERROR: column "unknown_column" referenced in statistics does not exist
+-- single column
+ALTER TABLE mcv_list ADD STATISTICS (mcv) ON (a);
+ERROR: multivariate stats require 2 or more columns
+-- single column, duplicated
+ALTER TABLE mcv_list ADD STATISTICS (mcv) ON (a, a);
+ERROR: duplicate column name in statistics definition
+-- two columns, one duplicated
+ALTER TABLE mcv_list ADD STATISTICS (mcv) ON (a, a, b);
+ERROR: duplicate column name in statistics definition
+-- unknown option
+ALTER TABLE mcv_list ADD STATISTICS (unknown_option) ON (a, b, c);
+ERROR: unrecognized STATISTICS option "unknown_option"
+-- missing MCV statistics
+ALTER TABLE mcv_list ADD STATISTICS (dependencies, max_mcv_items 200) ON (a, b, c);
+ERROR: option 'mcv' is required by other option(s)
+-- invalid mcv_max_items value / too low
+ALTER TABLE mcv_list ADD STATISTICS (mcv, max_mcv_items 10) ON (a, b, c);
+ERROR: max number of MCV items must be at least 128
+-- invalid mcv_max_items value / too high
+ALTER TABLE mcv_list ADD STATISTICS (mcv, max_mcv_items 10000) ON (a, b, c);
+ERROR: max number of MCV items is 8192
+-- correct command
+ALTER TABLE mcv_list ADD STATISTICS (mcv) ON (a, b, c);
+-- random data
+INSERT INTO mcv_list
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+ANALYZE mcv_list;
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+ mcv_enabled | mcv_built | pg_mv_stats_mcvlist_info
+-------------+-----------+--------------------------
+ t | f |
+(1 row)
+
+TRUNCATE mcv_list;
+-- a => b, a => c, b => c
+INSERT INTO mcv_list
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE mcv_list;
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+ mcv_enabled | mcv_built | pg_mv_stats_mcvlist_info
+-------------+-----------+--------------------------
+ t | t | nitems=1000
+(1 row)
+
+TRUNCATE mcv_list;
+-- a => b, a => c
+INSERT INTO mcv_list
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE mcv_list;
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+ mcv_enabled | mcv_built | pg_mv_stats_mcvlist_info
+-------------+-----------+--------------------------
+ t | t | nitems=1000
+(1 row)
+
+TRUNCATE mcv_list;
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO mcv_list
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX mcv_idx ON mcv_list (a, b);
+ANALYZE mcv_list;
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+ mcv_enabled | mcv_built | pg_mv_stats_mcvlist_info
+-------------+-----------+--------------------------
+ t | t | nitems=100
+(1 row)
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mcv_list WHERE a = 10 AND b = 5;
+ QUERY PLAN
+--------------------------------------------
+ Bitmap Heap Scan on mcv_list
+ Recheck Cond: ((a = 10) AND (b = 5))
+ -> Bitmap Index Scan on mcv_idx
+ Index Cond: ((a = 10) AND (b = 5))
+(4 rows)
+
+DELETE FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+DROP TABLE mcv_list;
+-- varlena type (text)
+CREATE TABLE mcv_list (
+ a TEXT,
+ b TEXT,
+ c TEXT
+);
+ALTER TABLE mcv_list ADD STATISTICS (mcv) ON (a, b, c);
+-- random data
+INSERT INTO mcv_list
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+ANALYZE mcv_list;
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+ mcv_enabled | mcv_built | pg_mv_stats_mcvlist_info
+-------------+-----------+--------------------------
+ t | f |
+(1 row)
+
+TRUNCATE mcv_list;
+-- a => b, a => c, b => c
+INSERT INTO mcv_list
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE mcv_list;
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+ mcv_enabled | mcv_built | pg_mv_stats_mcvlist_info
+-------------+-----------+--------------------------
+ t | t | nitems=1000
+(1 row)
+
+TRUNCATE mcv_list;
+-- a => b, a => c
+INSERT INTO mcv_list
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE mcv_list;
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+ mcv_enabled | mcv_built | pg_mv_stats_mcvlist_info
+-------------+-----------+--------------------------
+ t | t | nitems=1000
+(1 row)
+
+TRUNCATE mcv_list;
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO mcv_list
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX mcv_idx ON mcv_list (a, b);
+ANALYZE mcv_list;
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+ mcv_enabled | mcv_built | pg_mv_stats_mcvlist_info
+-------------+-----------+--------------------------
+ t | t | nitems=100
+(1 row)
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mcv_list WHERE a = '10' AND b = '5';
+ QUERY PLAN
+------------------------------------------------------------
+ Bitmap Heap Scan on mcv_list
+ Recheck Cond: ((a = '10'::text) AND (b = '5'::text))
+ -> Bitmap Index Scan on mcv_idx
+ Index Cond: ((a = '10'::text) AND (b = '5'::text))
+(4 rows)
+
+TRUNCATE mcv_list;
+-- check explain (expect bitmap index scan, not plain index scan) with NULLs
+INSERT INTO mcv_list
+ SELECT
+ (CASE WHEN i/10000 = 0 THEN NULL ELSE i/10000 END),
+ (CASE WHEN i/20000 = 0 THEN NULL ELSE i/20000 END),
+ (CASE WHEN i/40000 = 0 THEN NULL ELSE i/40000 END)
+ FROM generate_series(1,1000000) s(i);
+ANALYZE mcv_list;
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+ mcv_enabled | mcv_built | pg_mv_stats_mcvlist_info
+-------------+-----------+--------------------------
+ t | t | nitems=100
+(1 row)
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mcv_list WHERE a IS NULL AND b IS NULL;
+ QUERY PLAN
+---------------------------------------------------
+ Bitmap Heap Scan on mcv_list
+ Recheck Cond: ((a IS NULL) AND (b IS NULL))
+ -> Bitmap Index Scan on mcv_idx
+ Index Cond: ((a IS NULL) AND (b IS NULL))
+(4 rows)
+
+DELETE FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+DROP TABLE mcv_list;
+-- NULL values (mix of int and text columns)
+CREATE TABLE mcv_list (
+ a INT,
+ b TEXT,
+ c INT,
+ d TEXT
+);
+ALTER TABLE mcv_list ADD STATISTICS (mcv) ON (a, b, c, d);
+INSERT INTO mcv_list
+ SELECT
+ mod(i, 100),
+ (CASE WHEN mod(i, 200) = 0 THEN NULL ELSE mod(i,200) END),
+ mod(i, 400),
+ (CASE WHEN mod(i, 300) = 0 THEN NULL ELSE mod(i,600) END)
+ FROM generate_series(1,10000) s(i);
+ANALYZE mcv_list;
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+ mcv_enabled | mcv_built | pg_mv_stats_mcvlist_info
+-------------+-----------+--------------------------
+ t | t | nitems=1200
+(1 row)
+
+DELETE FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+DROP TABLE mcv_list;
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 82c2659..80375b8 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1357,7 +1357,9 @@ pg_mv_stats| SELECT n.nspname AS schemaname,
c.relname AS tablename,
s.stakeys AS attnums,
length(s.stadeps) AS depsbytes,
- pg_mv_stats_dependencies_info(s.stadeps) AS depsinfo
+ pg_mv_stats_dependencies_info(s.stadeps) AS depsinfo,
+ length(s.stamcv) AS mcvbytes,
+ pg_mv_stats_mcvlist_info(s.stamcv) AS mcvinfo
FROM ((pg_mv_statistic s
JOIN pg_class c ON ((c.oid = s.starelid)))
LEFT JOIN pg_namespace n ON ((n.oid = c.relnamespace)));
diff --git a/src/test/regress/parallel_schedule b/src/test/regress/parallel_schedule
index c41762c..78c9b04 100644
--- a/src/test/regress/parallel_schedule
+++ b/src/test/regress/parallel_schedule
@@ -111,4 +111,4 @@ test: event_trigger
test: stats
# run tests of multivariate stats
-test: mv_dependencies
+test: mv_dependencies mv_mcv
diff --git a/src/test/regress/serial_schedule b/src/test/regress/serial_schedule
index 3845b0f..3f9884f 100644
--- a/src/test/regress/serial_schedule
+++ b/src/test/regress/serial_schedule
@@ -153,3 +153,4 @@ test: xml
test: event_trigger
test: stats
test: mv_dependencies
+test: mv_mcv
diff --git a/src/test/regress/sql/mv_mcv.sql b/src/test/regress/sql/mv_mcv.sql
new file mode 100644
index 0000000..090731e
--- /dev/null
+++ b/src/test/regress/sql/mv_mcv.sql
@@ -0,0 +1,181 @@
+-- data type passed by value
+CREATE TABLE mcv_list (
+ a INT,
+ b INT,
+ c INT
+);
+
+-- unknown column
+ALTER TABLE mcv_list ADD STATISTICS (mcv) ON (unknown_column);
+
+-- single column
+ALTER TABLE mcv_list ADD STATISTICS (mcv) ON (a);
+
+-- single column, duplicated
+ALTER TABLE mcv_list ADD STATISTICS (mcv) ON (a, a);
+
+-- two columns, one duplicated
+ALTER TABLE mcv_list ADD STATISTICS (mcv) ON (a, a, b);
+
+-- unknown option
+ALTER TABLE mcv_list ADD STATISTICS (unknown_option) ON (a, b, c);
+
+-- missing MCV statistics
+ALTER TABLE mcv_list ADD STATISTICS (dependencies, max_mcv_items 200) ON (a, b, c);
+
+-- invalid mcv_max_items value / too low
+ALTER TABLE mcv_list ADD STATISTICS (mcv, max_mcv_items 10) ON (a, b, c);
+
+-- invalid mcv_max_items value / too high
+ALTER TABLE mcv_list ADD STATISTICS (mcv, max_mcv_items 10000) ON (a, b, c);
+
+-- correct command
+ALTER TABLE mcv_list ADD STATISTICS (mcv) ON (a, b, c);
+
+-- random data
+INSERT INTO mcv_list
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+
+ANALYZE mcv_list;
+
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+
+TRUNCATE mcv_list;
+
+-- a => b, a => c, b => c
+INSERT INTO mcv_list
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+
+ANALYZE mcv_list;
+
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+
+TRUNCATE mcv_list;
+
+-- a => b, a => c
+INSERT INTO mcv_list
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+
+ANALYZE mcv_list;
+
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+
+TRUNCATE mcv_list;
+
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO mcv_list
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX mcv_idx ON mcv_list (a, b);
+
+ANALYZE mcv_list;
+
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mcv_list WHERE a = 10 AND b = 5;
+
+DELETE FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+DROP TABLE mcv_list;
+
+-- varlena type (text)
+CREATE TABLE mcv_list (
+ a TEXT,
+ b TEXT,
+ c TEXT
+);
+
+ALTER TABLE mcv_list ADD STATISTICS (mcv) ON (a, b, c);
+
+-- random data
+INSERT INTO mcv_list
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+
+ANALYZE mcv_list;
+
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+
+TRUNCATE mcv_list;
+
+-- a => b, a => c, b => c
+INSERT INTO mcv_list
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+
+ANALYZE mcv_list;
+
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+
+TRUNCATE mcv_list;
+
+-- a => b, a => c
+INSERT INTO mcv_list
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE mcv_list;
+
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+
+TRUNCATE mcv_list;
+
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO mcv_list
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX mcv_idx ON mcv_list (a, b);
+ANALYZE mcv_list;
+
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mcv_list WHERE a = '10' AND b = '5';
+
+TRUNCATE mcv_list;
+
+-- check explain (expect bitmap index scan, not plain index scan) with NULLs
+INSERT INTO mcv_list
+ SELECT
+ (CASE WHEN i/10000 = 0 THEN NULL ELSE i/10000 END),
+ (CASE WHEN i/20000 = 0 THEN NULL ELSE i/20000 END),
+ (CASE WHEN i/40000 = 0 THEN NULL ELSE i/40000 END)
+ FROM generate_series(1,1000000) s(i);
+ANALYZE mcv_list;
+
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mcv_list WHERE a IS NULL AND b IS NULL;
+
+DELETE FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+DROP TABLE mcv_list;
+
+-- NULL values (mix of int and text columns)
+CREATE TABLE mcv_list (
+ a INT,
+ b TEXT,
+ c INT,
+ d TEXT
+);
+
+ALTER TABLE mcv_list ADD STATISTICS (mcv) ON (a, b, c, d);
+
+INSERT INTO mcv_list
+ SELECT
+ mod(i, 100),
+ (CASE WHEN mod(i, 200) = 0 THEN NULL ELSE mod(i,200) END),
+ mod(i, 400),
+ (CASE WHEN mod(i, 300) = 0 THEN NULL ELSE mod(i,600) END)
+ FROM generate_series(1,10000) s(i);
+
+ANALYZE mcv_list;
+
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+
+DELETE FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+DROP TABLE mcv_list;
--
2.0.5
Attachment: 0004-multivariate-histograms.patch (text/x-diff)
>From 397a9b96670097df72f95af04687b1874fb6ae31 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tv@fuzzy.cz>
Date: Sun, 11 Jan 2015 20:18:24 +0100
Subject: [PATCH 4/4] multivariate histograms
- extends the pg_mv_statistic catalog (add 'hist' fields)
- building the histograms during ANALYZE
- simple estimation while planning the queries
FIX: don't build histogram by default
FIX: analyze histogram only when requested
FIX: improvements in clauselist_mv_selectivity_histogram()
FIX: minor cleanup in multivariate histograms
- move BUCKET_SIZE_SERIALIZED() macro to histogram.c
- rename MVHIST_ constants to MVSTAT_HIST
FIX: comment about building histograms on too many columns
FIX: added comment about check in ALTER TABLE ... ADD STATISTICS
FIX: added comment about handling DROP TABLE / DROP COLUMN
FIX: added comment about ALTER TABLE ... DROP STATISTICS
FIX: comment about building NULL-buckets for a histogram
FIX: initial support for all data types and NULL in MV histograms
This changes the serialize_histogram/update_mv_stats in a bit
strange way (passing VacAttrStats all over the place). This
needs to be improved, somehow, before rebasing into the
histogram part. Otherwise it'll cause needless conflicts.
FIX: refactoring lookup_var_attr_stats() / histograms
FIX: a set of regression tests for MV histograms
This is mostly equal to a combination of all the regression tests
for functional dependencies / MCV lists.
The last test fails due to Assert(!isNull) in partition_bucket()
which prevents NULL values in histograms.
FIX: remove the two memcmp-based comparators (used for histograms)
FIX: comment about memory corruption in deserializing histograms
FIX: remove CompareScalarsContext/ScalarMCVItem from common.h
FIX: fix the lookup_vac_attr() refactoring in histograms
FIX: get rid of the custom comparators in histogram.c
FIX: building NULL-buckets - buckets with just NULLs in some dimension(s)
FIX: fixed bugs in serialize/deserialize methods for histogram
When serializing, BUCKET_MIN_INDEXES were set twice (once instead
of BUCKET_MAX_INDEXES, which were not set at all).
When deserializing, the 'tmp' pointer was not advanced, so only
the first bucket was ever deserialized (and copied into all the
histogram buckets).
Added a few asserts to the deserialize method, similar to how
it's done in serialize.
FIX: formatting issues in the histogram regression test
FIX: remove sample-dependent results from histogram regression test
FIX: add USE_ASSERT_CHECKING to assert-only variable (histogram)
FIX: check ADD STATISTICS options (histograms)
FIX: improved comments/docs for the multivariate histograms
FIX: reuse DimensionInfo (after move to common.h)
FIX: remove obsolete TODO about NULL-buckets, improve comments
FIX: move multivariate histogram definitions after MCV lists
FIX: correct variable names in error message (dimension index 'j')
FIX: add support for 'IS [NOT] NULL' support to histograms
FIX: add regression test for ADD STATISTICS options (histograms)
FIX: added regression test to test IS [NOT] NULL with histograms
FIX: make regression tests parallel-happy (histograms)
---
src/backend/catalog/system_views.sql | 4 +-
src/backend/commands/tablecmds.c | 55 +-
src/backend/optimizer/path/clausesel.c | 391 +++++-
src/backend/utils/mvstats/Makefile | 2 +-
src/backend/utils/mvstats/common.c | 67 +-
src/backend/utils/mvstats/common.h | 14 -
src/backend/utils/mvstats/histogram.c | 1778 ++++++++++++++++++++++++++++
src/backend/utils/mvstats/mcv.c | 1 +
src/include/catalog/pg_mv_statistic.h | 24 +-
src/include/catalog/pg_proc.h | 2 +
src/include/utils/mvstats.h | 99 +-
src/test/regress/expected/mv_histogram.out | 210 ++++
src/test/regress/expected/rules.out | 4 +-
src/test/regress/parallel_schedule | 2 +-
src/test/regress/serial_schedule | 1 +
src/test/regress/sql/mv_histogram.sql | 179 +++
16 files changed, 2776 insertions(+), 57 deletions(-)
create mode 100644 src/backend/utils/mvstats/histogram.c
create mode 100644 src/test/regress/expected/mv_histogram.out
create mode 100644 src/test/regress/sql/mv_histogram.sql
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 8acf160..3aa7d2b 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -160,7 +160,9 @@ CREATE VIEW pg_mv_stats AS
length(S.stadeps) as depsbytes,
pg_mv_stats_dependencies_info(S.stadeps) as depsinfo,
length(S.stamcv) AS mcvbytes,
- pg_mv_stats_mcvlist_info(S.stamcv) AS mcvinfo
+ pg_mv_stats_mcvlist_info(S.stamcv) AS mcvinfo,
+ length(S.stahist) AS histbytes,
+ pg_mv_stats_histogram_info(S.stahist) AS histinfo
FROM (pg_mv_statistic S JOIN pg_class C ON (C.oid = S.starelid))
LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace);
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 1f08c1c..2bd3884 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -11635,6 +11635,16 @@ static int compare_int16(const void *a, const void *b)
* multiple stats on the same columns with different options
* (say, a detailed MCV-only stats for some queries, histogram
* for others, etc.)
+ *
+ * FIXME Check that at least one of the statistics types is enabled,
+ * and that only compatible options are used. For example if 'mcv' is
+ * not selected, then 'mcv_max_items' can't be used (an alternative
+ * might be to enable it automatically).
+ *
+ * TODO It might be useful to have ALTER TABLE DROP STATISTICS too, but
+ * it's tricky because there may be multiple kinds of stats for the
+ * same list of columns, with different options (e.g. one just MCV
+ * list, another with histogram, etc.).
*/
static void ATExecAddStatistics(AlteredTableInfo *tab, Relation rel,
StatisticsDef *def, LOCKMODE lockmode)
@@ -11652,12 +11662,15 @@ static void ATExecAddStatistics(AlteredTableInfo *tab, Relation rel,
/* by default build nothing */
bool build_dependencies = false,
- build_mcv = false;
+ build_mcv = false,
+ build_histogram = false;
- int32 max_mcv_items = -1;
+ int32 max_buckets = -1,
+ max_mcv_items = -1;
/* options required because of other options */
- bool require_mcv = false;
+ bool require_mcv = false,
+ require_histogram = false;
Assert(IsA(def, StatisticsDef));
@@ -11735,6 +11748,29 @@ static void ATExecAddStatistics(AlteredTableInfo *tab, Relation rel,
MVSTAT_MCVLIST_MAX_ITEMS)));
}
+ else if (strcmp(opt->defname, "histogram") == 0)
+ build_histogram = defGetBoolean(opt);
+ else if (strcmp(opt->defname, "max_buckets") == 0)
+ {
+ max_buckets = defGetInt32(opt);
+
+ /* this option requires 'histogram' to be enabled */
+ require_histogram = true;
+
+ /* sanity check */
+ if (max_buckets < MVSTAT_HIST_MIN_BUCKETS)
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("minimum number of buckets is %d",
+ MVSTAT_HIST_MIN_BUCKETS)));
+
+ else if (max_buckets > MVSTAT_HIST_MAX_BUCKETS)
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("minimum number of buckets is %d",
+ MVSTAT_HIST_MAX_BUCKETS)));
+
+ }
else
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
@@ -11743,10 +11779,10 @@ static void ATExecAddStatistics(AlteredTableInfo *tab, Relation rel,
}
/* check that at least some statistics were requested */
- if (! (build_dependencies || build_mcv))
+ if (! (build_dependencies || build_mcv || build_histogram))
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
- errmsg("no statistics type (dependencies, mcv) was requested")));
+ errmsg("no statistics type (dependencies, mcv, histogram) was requested")));
/* now do some checking of the options */
if (require_mcv && (! build_mcv))
@@ -11754,6 +11790,11 @@ static void ATExecAddStatistics(AlteredTableInfo *tab, Relation rel,
(errcode(ERRCODE_SYNTAX_ERROR),
errmsg("option 'mcv' is required by other options(s)")));
+ if (require_histogram && (! build_histogram))
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("option 'histogram' is required by other options(s)")));
+
/* sort the attnums and build int2vector */
qsort(attnums, numcols, sizeof(int16), compare_int16);
stakeys = buildint2vector(attnums, numcols);
@@ -11771,10 +11812,14 @@ static void ATExecAddStatistics(AlteredTableInfo *tab, Relation rel,
values[Anum_pg_mv_statistic_deps_enabled -1] = BoolGetDatum(build_dependencies);
values[Anum_pg_mv_statistic_mcv_enabled -1] = BoolGetDatum(build_mcv);
+ values[Anum_pg_mv_statistic_hist_enabled -1] = BoolGetDatum(build_histogram);
+
values[Anum_pg_mv_statistic_mcv_max_items -1] = Int32GetDatum(max_mcv_items);
+ values[Anum_pg_mv_statistic_hist_max_buckets -1] = Int32GetDatum(max_buckets);
nulls[Anum_pg_mv_statistic_stadeps -1] = true;
nulls[Anum_pg_mv_statistic_stamcv -1] = true;
+ nulls[Anum_pg_mv_statistic_stahist -1] = true;
/* insert the tuple into pg_mv_statistic */
mvstatrel = heap_open(MvStatisticRelationId, RowExclusiveLock);
diff --git a/src/backend/optimizer/path/clausesel.c b/src/backend/optimizer/path/clausesel.c
index 1446fa0..a4e6d16 100644
--- a/src/backend/optimizer/path/clausesel.c
+++ b/src/backend/optimizer/path/clausesel.c
@@ -70,6 +70,8 @@ static Selectivity clauselist_mv_selectivity(PlannerInfo *root,
static Selectivity clauselist_mv_selectivity_mcvlist(PlannerInfo *root,
List *clauses, MVStats mvstats,
bool *fullmatch, Selectivity *lowsel);
+static Selectivity clauselist_mv_selectivity_histogram(PlannerInfo *root,
+ List *clauses, MVStats mvstats);
/****************************************************************************
* ROUTINES TO COMPUTE SELECTIVITIES
@@ -1119,6 +1121,7 @@ static Selectivity
clauselist_mv_selectivity(PlannerInfo *root, List *clauses, MVStats mvstats)
{
bool fullmatch = false;
+ Selectivity s1 = 0.0, s2 = 0.0;
/*
* Lowest frequency in the MCV list (may be used as an upper bound
@@ -1132,9 +1135,24 @@ clauselist_mv_selectivity(PlannerInfo *root, List *clauses, MVStats mvstats)
* MCV/histogram evaluation).
*/
- /* Evaluate the MCV selectivity */
- return clauselist_mv_selectivity_mcvlist(root, clauses, mvstats,
+ /* Evaluate the MCV first. */
+ s1 = clauselist_mv_selectivity_mcvlist(root, clauses, mvstats,
&fullmatch, &mcv_low);
+
+ /*
+ * If we got a full equality match on the MCV list, we're done (and
+ * the estimate is pretty good).
+ */
+ if (fullmatch && (s1 > 0.0))
+ return s1;
+
+ /* FIXME if (fullmatch) without matching MCV item, use the mcv_low
+ * selectivity as upper bound */
+
+ s2 = clauselist_mv_selectivity_histogram(root, clauses, mvstats);
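+
+ /*
+ * XXX Adding s1 and s2 together assumes the MCV list and the
+ * histogram represent (mostly) disjoint parts of the sample
+ * (see the FIXME in build_mv_stats), so the sum should not
+ * count the same rows twice.
+ */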
+
+ /* TODO clamp to <= 1.0 (or more strictly, when possible) */
+ return s1 + s2;
}
/*
@@ -2024,3 +2042,372 @@ clauselist_mv_selectivity_mcvlist(PlannerInfo *root, List *clauses,
return s;
}
+
+/*
+ * Estimate selectivity of clauses using a histogram.
+ *
+ * If there's no histogram for the stats, the function returns 0.0.
+ *
+ * The general idea of this method is similar to how MCV lists are
+ * processed, except that this introduces the concept of a partial
+ * match (MCV only works with full match / mismatch).
+ *
+ * The algorithm works like this:
+ *
+ * 1) mark all buckets as 'full match'
+ * 2) walk through all the clauses
+ * 3) for a particular clause, walk through all the buckets
+ * 4) skip buckets that are already 'no match'
+ * 5) check clause for buckets that still match (at least partially)
+ * 6) sum frequencies for buckets to get selectivity
+ *
+ * Unlike MCV lists, histograms have a concept of a partial match.
+ *
+ * TODO This only handles AND-ed clauses, but it might work for OR-ed
+ * lists too - it just needs to reverse the logic a bit. I.e. start
+ * with 'no match' for all buckets, and increase the match level
+ * for the clauses (and skip buckets that are 'full match').
+ *
+ * TODO This might use a similar shortcut to MCV lists - count buckets
+ * marked as partial/full match, and terminate once this drops to 0.
+ * Not sure if it's really worth it - for MCV lists a situation like
+ * this is not uncommon, but for histograms it's not that clear.
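+ *
+ * As a simple example, assume a 1-D histogram with buckets [0,2],
+ * [2,4] and [4,6], and a clause (a < 3). The first bucket is a full
+ * match, the second a partial match and the third no match - so the
+ * estimate is freq(bucket1) + 0.5 * freq(bucket2).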
+ */
+static Selectivity
+clauselist_mv_selectivity_histogram(PlannerInfo *root, List *clauses,
+ MVStats mvstats)
+{
+ int i;
+ Selectivity s = 0.0;
+ ListCell * l;
+ char *matches = NULL;
+ MVHistogram mvhist = NULL;
+
+ /* there's no histogram */
+ if (! mvstats->hist_built)
+ return 0.0;
+
+ /* fetch and deserialize the histogram from the catalog */
+ mvhist = deserialize_mv_histogram(fetch_mv_histogram(mvstats->mvoid));
+
+ Assert (mvhist != NULL);
+ Assert (clauses != NIL);
+ Assert (list_length(clauses) >= 2);
+
+ /*
+ * Bitmap of bucket matches (mismatch, partial, full). By default
+ * all buckets fully match, and we'll gradually eliminate them.
+ */
+ matches = palloc0(sizeof(char) * mvhist->nbuckets);
+ memset(matches, MVSTATS_MATCH_FULL, sizeof(char)*mvhist->nbuckets);
+
+ /* loop through the clauses and do the estimation */
+ foreach (l, clauses)
+ {
+ Node * clause = (Node*)lfirst(l);
+
+ /* it's either OpClause, or NullTest */
+ if (is_opclause(clause))
+ {
+ OpExpr * expr = (OpExpr*)clause;
+ bool varonleft = true;
+ bool ok;
+
+ FmgrInfo opproc; /* operator */
+ fmgr_info(get_opcode(expr->opno), &opproc);
+
+ ok = (NumRelids(clause) == 1) &&
+ (is_pseudo_constant_clause(lsecond(expr->args)) ||
+ (varonleft = false,
+ is_pseudo_constant_clause(linitial(expr->args))));
+
+ if (ok)
+ {
+ FmgrInfo ltproc;
+ Var * var = (varonleft) ? linitial(expr->args) : lsecond(expr->args);
+ Const * cst = (varonleft) ? lsecond(expr->args) : linitial(expr->args);
+ bool isgt = (! varonleft);
+
+ /*
+ * TODO Fetch only when really needed (probably for equality only)
+ *
+ * TODO Technically either lt/gt is sufficient.
+ *
+ * FIXME The code in analyze.c creates histograms only for types
+ * with enough ordering (by calling get_sort_group_operators).
+ * Is this the same assumption, i.e. are we certain that we
+ * get the ltproc/gtproc every time we ask? Or are there types
+ * where get_sort_group_operators returns ltopr and here we
+ * get nothing?
+ */
+ TypeCacheEntry *typecache
+ = lookup_type_cache(var->vartype, TYPECACHE_EQ_OPR | TYPECACHE_LT_OPR
+ | TYPECACHE_GT_OPR);
+
+ /* lookup dimension for the attribute */
+ int idx = mv_get_index(var->varattno, mvstats->stakeys);
+
+ fmgr_info(get_opcode(typecache->lt_opr), <proc);
+
+ /*
+ * Check this for all buckets that still have "true" in the bitmap
+ *
+ * We already know the clauses use suitable operators (because that's
+ * how we filtered them).
+ */
+ for (i = 0; i < mvhist->nbuckets; i++)
+ {
+ bool tmp;
+ MVBucket bucket = mvhist->buckets[i];
+
+ /*
+ * Skip buckets that were already eliminated - this is important
+ * considering how we update the info (we only lower the match).
+ */
+ if (matches[i] == MVSTATS_MATCH_NONE)
+ continue;
+
+ /*
+ * If it's not a "<" or ">" or "=" operator, just ignore the
+ * clause. Otherwise note the relid and attnum for the variable.
+ *
+ * TODO I'm really unsure whether the handling of the 'isgt' flag (that is, clauses
+ * with reverse order of variable/constant) is correct. I wouldn't
+ * be surprised if there was some mixup. Using the lt/gt operators
+ * instead of messing with the opproc could make it simpler.
+ * It would however be using a different operator than the query,
+ * although it's not any shadier than using the selectivity function
+ * as is done currently.
+ *
+ * FIXME Once the min/max values are deduplicated, we can easily minimize
+ * the number of calls to the comparator (assuming we keep the
+ * deduplicated structure). See the note on compression at MVBucket
+ * serialize/deserialize methods.
+ */
+ switch (get_oprrest(expr->opno))
+ {
+ case F_SCALARLTSEL: /* column < constant */
+
+ if (! isgt) /* (var < const) */
+ {
+ /*
+ * First check whether the constant is below the lower boundary (in that
+ * case we can skip the bucket, because there's no overlap).
+ */
+ tmp = DatumGetBool(FunctionCall2Coll(&opproc,
+ DEFAULT_COLLATION_OID,
+ cst->constvalue,
+ bucket->min[idx]));
+ if (tmp)
+ {
+ matches[i] = MVSTATS_MATCH_NONE; /* no match */
+ continue;
+ }
+
+ /*
+ * Now check whether the upper boundary is below the constant (in that
+ * case it's a partial match).
+ */
+ tmp = DatumGetBool(FunctionCall2Coll(&opproc,
+ DEFAULT_COLLATION_OID,
+ cst->constvalue,
+ bucket->max[idx]));
+
+ if (tmp)
+ matches[i] = MVSTATS_MATCH_PARTIAL; /* partial match */
+ }
+ else /* (const < var) */
+ {
+ /*
+ * First check whether the constant is above the upper boundary (in that
+ * case we can skip the bucket, because there's no overlap).
+ */
+ tmp = DatumGetBool(FunctionCall2Coll(&opproc,
+ DEFAULT_COLLATION_OID,
+ bucket->max[idx],
+ cst->constvalue));
+ if (tmp)
+ {
+ matches[i] = MVSTATS_MATCH_NONE; /* no match */
+ continue;
+ }
+
+ /*
+ * Now check whether the lower boundary is below the constant (in that
+ * case it's a partial match).
+ */
+ tmp = DatumGetBool(FunctionCall2Coll(&opproc,
+ DEFAULT_COLLATION_OID,
+ bucket->min[idx],
+ cst->constvalue));
+
+ if (tmp)
+ matches[i] = MVSTATS_MATCH_PARTIAL; /* partial match */
+ }
+ break;
+
+ case F_SCALARGTSEL: /* column > constant */
+
+ if (! isgt) /* (var > const) */
+ {
+ /*
+ * First check whether the constant is above the upper boundary (in that
+ * case we can skip the bucket, because there's no overlap).
+ */
+ tmp = DatumGetBool(FunctionCall2Coll(&opproc,
+ DEFAULT_COLLATION_OID,
+ cst->constvalue,
+ bucket->max[idx]));
+ if (tmp)
+ {
+ matches[i] = MVSTATS_MATCH_NONE; /* no match */
+ continue;
+ }
+
+ /*
+ * Now check whether the lower boundary is below the constant (in that
+ * case it's a partial match).
+ */
+ tmp = DatumGetBool(FunctionCall2Coll(&opproc,
+ DEFAULT_COLLATION_OID,
+ cst->constvalue,
+ bucket->min[idx]));
+
+ if (tmp)
+ matches[i] = MVSTATS_MATCH_PARTIAL; /* partial match */
+ }
+ else /* (const > var) */
+ {
+ /*
+ * First check whether the constant is below the lower boundary (in
+ * that case we can skip the bucket, because there's no overlap).
+ */
+ tmp = DatumGetBool(FunctionCall2Coll(&opproc,
+ DEFAULT_COLLATION_OID,
+ bucket->min[idx],
+ cst->constvalue));
+ if (tmp)
+ {
+ matches[i] = MVSTATS_MATCH_NONE; /* no match */
+ continue;
+ }
+
+ /*
+ * Now check whether the upper boundary is below the constant (in that
+ * case it's a partial match).
+ */
+ tmp = DatumGetBool(FunctionCall2Coll(&opproc,
+ DEFAULT_COLLATION_OID,
+ bucket->max[idx],
+ cst->constvalue));
+
+ if (tmp)
+ matches[i] = MVSTATS_MATCH_PARTIAL; /* partial match */
+ }
+
+ break;
+
+ case F_EQSEL:
+
+ /*
+ * We only check whether the value is within the bucket, using the lt/gt
+ * operators fetched from type cache.
+ *
+ * TODO We'll use the default 50% estimate, but that's probably way off
+ * if there are multiple distinct values. Consider tweaking this a
+ * somehow, e.g. using only a part inversely proportional to the
+ * estimated number of distinct values in the bucket.
+ *
+ * TODO This does not handle inclusion flags at the moment, thus counting
+ * some buckets twice (when hitting the boundary).
+ *
+ * TODO Optimization is that if max[i] == min[i], it's effectively a MCV
+ * item and we can count the whole bucket as a complete match (thus
+ * using 100% bucket selectivity and not just 50%).
+ *
+ * TODO Technically some buckets may "degenerate" into single-value
+ * buckets (not necessarily for all the dimensions) - maybe this
+ * is better than keeping a separate MCV list (multi-dimensional).
+ * Update: Actually, that's unlikely to be better than a separate
+ * MCV list for two reasons - first, it requires ~2x the space
+ * (because of storing lower/upper boundaries) and second because
+ * the buckets are ranges - depending on the partitioning algorithm
+ * it may not even degenerate into a (min=max) bucket. For example,
+ * the current partitioning algorithm never does that.
+ */
+ tmp = DatumGetBool(FunctionCall2Coll(<proc,
+ DEFAULT_COLLATION_OID,
+ cst->constvalue,
+ bucket->min[idx]));
+
+ if (tmp)
+ {
+ matches[i] = MVSTATS_MATCH_NONE; /* constvalue < min */
+ continue;
+ }
+
+ tmp = DatumGetBool(FunctionCall2Coll(<proc,
+ DEFAULT_COLLATION_OID,
+ bucket->max[idx],
+ cst->constvalue));
+
+ if (tmp)
+ {
+ matches[i] = MVSTATS_MATCH_NONE; /* constvalue > max */
+ continue;
+ }
+
+ /* partial match */
+ matches[i] = MVSTATS_MATCH_PARTIAL;
+
+ break;
+ }
+ }
+ }
+ }
+ else if (IsA(clause, NullTest))
+ {
+ NullTest * expr = (NullTest*)clause;
+ Var * var = (Var*)(expr->arg);
+
+ /* FIXME proper matching attribute to dimension */
+ int idx = mv_get_index(var->varattno, mvstats->stakeys);
+
+ /*
+ * Walk through the buckets and evaluate the current clause. We can
+ * skip items that were already ruled out, and terminate if there are
+ * no remaining buckets that might possibly match.
+ */
+ for (i = 0; i < mvhist->nbuckets; i++)
+ {
+ MVBucket bucket = mvhist->buckets[i];
+
+ /*
+ * Skip buckets that were already eliminated - this is important
+ * considering how we update the info (we only lower the match).
+ */
+ if (matches[i] == MVSTATS_MATCH_NONE)
+ continue;
+
+ /* if the clause mismatches the bucket, set it to MATCH_NONE */
+ if ((expr->nulltesttype == IS_NULL)
+ && (! bucket->nullsonly[idx]))
+ matches[i] = MVSTATS_MATCH_NONE;
+ else if ((expr->nulltesttype == IS_NOT_NULL) &&
+ (bucket->nullsonly[idx]))
+ matches[i] = MVSTATS_MATCH_NONE;
+ }
+ }
+ }
+
+ /* now, walk through the buckets and sum the selectivities */
+ for (i = 0; i < mvhist->nbuckets; i++)
+ {
+ if (matches[i] == MVSTATS_MATCH_FULL)
+ s += mvhist->buckets[i]->ntuples;
+ else if (matches[i] == MVSTATS_MATCH_PARTIAL)
+ s += 0.5 * mvhist->buckets[i]->ntuples;
+ }
+
+ return s;
+}
diff --git a/src/backend/utils/mvstats/Makefile b/src/backend/utils/mvstats/Makefile
index 3c0aff4..9dbb3b6 100644
--- a/src/backend/utils/mvstats/Makefile
+++ b/src/backend/utils/mvstats/Makefile
@@ -12,6 +12,6 @@ subdir = src/backend/utils/mvstats
top_builddir = ../../../..
include $(top_builddir)/src/Makefile.global
-OBJS = common.o mcv.o dependencies.o
+OBJS = common.o dependencies.o histogram.o mcv.o
include $(top_srcdir)/src/backend/common.mk
diff --git a/src/backend/utils/mvstats/common.c b/src/backend/utils/mvstats/common.c
index 69ab805..f6edb2f 100644
--- a/src/backend/utils/mvstats/common.c
+++ b/src/backend/utils/mvstats/common.c
@@ -45,7 +45,8 @@ build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
{
MVDependencies deps = NULL;
MCVList mcvlist = NULL;
- int numrows_filtered = 0;
+ MVHistogram histogram = NULL;
+ int numrows_filtered = numrows;
/* int2 vector of attnums the stats should be computed on */
int2vector * attrs = mvstats[i].stakeys;
@@ -66,8 +67,23 @@ build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
if (mvstats->mcv_enabled)
mcvlist = build_mv_mcvlist(numrows, rows, attrs, stats, &numrows_filtered);
+ /*
+ * Build a multivariate histogram on the columns.
+ *
+ * FIXME remove the rows used to build the MCV from the histogram.
+ * Another option might be subtracting the MCV selectivities
+ * from the histogram, but I'm not sure whether that works
+ * accurately (maybe it introduces additional errors).
+ */
+ if ((numrows_filtered > 0) && (mvstats->hist_enabled))
+ histogram = build_mv_histogram(numrows_filtered, rows, attrs, stats, numrows);
+
/* store the histogram / MCV list in the catalog */
- update_mv_stats(mvstats[i].mvoid, deps, mcvlist, attrs, stats);
+ update_mv_stats(mvstats[i].mvoid, deps, mcvlist, histogram, attrs, stats);
+
+#ifdef MVSTATS_DEBUG
+ print_mv_histogram_info(histogram);
+#endif
}
}
@@ -149,7 +165,7 @@ list_mv_stats(Oid relid, int *nstats, bool built_only)
* Skip statistics that were not computed yet (if only stats
* that were already built were requested)
*/
- if (built_only && (! (stats->mcv_built || stats->deps_built)))
+ if (built_only && (! (stats->mcv_built || stats->deps_built || stats->hist_built)))
continue;
/* double the array size if needed */
@@ -161,10 +177,15 @@ list_mv_stats(Oid relid, int *nstats, bool built_only)
result[*nstats].mvoid = HeapTupleGetOid(htup);
result[*nstats].stakeys = buildint2vector(stats->stakeys.values, stats->stakeys.dim1);
+
result[*nstats].deps_enabled = stats->deps_enabled;
result[*nstats].mcv_enabled = stats->mcv_enabled;
+ result[*nstats].hist_enabled = stats->hist_enabled;
+
result[*nstats].deps_built = stats->deps_built;
result[*nstats].mcv_built = stats->mcv_built;
+ result[*nstats].hist_built = stats->hist_built;
+
*nstats += 1;
}
@@ -178,9 +199,16 @@ list_mv_stats(Oid relid, int *nstats, bool built_only)
return result;
}
+/*
+ * FIXME This adds statistics, but we need to drop statistics when the
+ * table is dropped. Not sure what to do when a column is dropped.
+ * Either we can (a) remove all stats on that column, (b) remove
+ * the column from defined stats and force rebuild, (c) remove the
+ * column on next ANALYZE. Or maybe something else?
+ */
void
update_mv_stats(Oid mvoid,
- MVDependencies dependencies, MCVList mcvlist,
+ MVDependencies dependencies, MCVList mcvlist, MVHistogram histogram,
int2vector *attrs, VacAttrStats **stats)
{
HeapTuple stup,
@@ -213,19 +241,31 @@ update_mv_stats(Oid mvoid,
values[Anum_pg_mv_statistic_stamcv - 1] = PointerGetDatum(data);
}
+ if (histogram != NULL)
+ {
+ bytea * data = serialize_mv_histogram(histogram, attrs, stats);
+ nulls[Anum_pg_mv_statistic_stahist-1] = (data == NULL);
+ values[Anum_pg_mv_statistic_stahist - 1]
+ = PointerGetDatum(data);
+ }
+
/* always replace the value (either by bytea or NULL) */
replaces[Anum_pg_mv_statistic_stadeps -1] = true;
replaces[Anum_pg_mv_statistic_stamcv -1] = true;
+ replaces[Anum_pg_mv_statistic_stahist-1] = true;
/* always change the availability flags */
nulls[Anum_pg_mv_statistic_deps_built -1] = false;
nulls[Anum_pg_mv_statistic_mcv_built -1] = false;
+ nulls[Anum_pg_mv_statistic_hist_built-1] = false;
replaces[Anum_pg_mv_statistic_deps_built-1] = true;
replaces[Anum_pg_mv_statistic_mcv_built -1] = true;
+ replaces[Anum_pg_mv_statistic_hist_built -1] = true;
values[Anum_pg_mv_statistic_deps_built-1] = BoolGetDatum(dependencies != NULL);
values[Anum_pg_mv_statistic_mcv_built -1] = BoolGetDatum(mcvlist != NULL);
+ values[Anum_pg_mv_statistic_hist_built -1] = BoolGetDatum(histogram != NULL);
/* Is there already a pg_mv_statistic tuple for this attribute? */
oldtup = SearchSysCache1(MVSTATOID,
@@ -302,25 +342,6 @@ compare_scalars_partition(const void *a, const void *b, void *arg)
return ApplySortComparator(da, false, db, false, ssup);
}
-/*
- * qsort_arg comparator for sorting Datum[] (row of Datums) when
- * counting distinct values.
- */
-int
-compare_scalars_memcmp(const void *a, const void *b, void *arg)
-{
- Size len = *(Size*)arg;
-
- return memcmp(a, b, len);
-}
-
-int
-compare_scalars_memcmp_2(const void *a, const void *b)
-{
- return memcmp(a, b, sizeof(Datum));
-}
-
-
/* initialize multi-dimensional sort */
MultiSortSupport
multi_sort_init(int ndims)
diff --git a/src/backend/utils/mvstats/common.h b/src/backend/utils/mvstats/common.h
index fca2782..f4309f7 100644
--- a/src/backend/utils/mvstats/common.h
+++ b/src/backend/utils/mvstats/common.h
@@ -47,18 +47,6 @@ typedef struct
int tupno; /* position index for tuple it came from */
} ScalarItem;
-typedef struct
-{
- int count; /* # of duplicates */
- int first; /* values[] index of first occurrence */
-} ScalarMCVItem;
-
-typedef struct
-{
- SortSupport ssup;
- int *tupnoLink;
-} CompareScalarsContext;
-
/* (de)serialization info */
typedef struct DimensionInfo {
int nvalues; /* number of deduplicated values */
@@ -94,5 +82,3 @@ int multi_sort_compare_dim(int dim, const SortItem *a,
int compare_datums_simple(Datum a, Datum b, SortSupport ssup);
int compare_scalars_simple(const void *a, const void *b, void *arg);
int compare_scalars_partition(const void *a, const void *b, void *arg);
-int compare_scalars_memcmp(const void *a, const void *b, void *arg);
-int compare_scalars_memcmp_2(const void *a, const void *b);
diff --git a/src/backend/utils/mvstats/histogram.c b/src/backend/utils/mvstats/histogram.c
new file mode 100644
index 0000000..3acbea2
--- /dev/null
+++ b/src/backend/utils/mvstats/histogram.c
@@ -0,0 +1,1778 @@
+/*-------------------------------------------------------------------------
+ *
+ * histogram.c
+ * POSTGRES multivariate histograms
+ *
+ *
+ * Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ * src/backend/utils/mvstats/histogram.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "common.h"
+
+/*
+ * Multivariate histograms
+ *
+ * Histograms are a collection of buckets, represented by n-dimensional
+ * rectangles. Each rectangle is delimited by a min/max value in each
+ * dimension, stored in an array, so that the bucket includes values
+ * fulfilling condition
+ *
+ * min[i] <= value[i] <= max[i]
+ *
+ * where 'i' is the dimension. In 1D this corresponds to a simple
+ * interval, in 2D to a rectangle, and in 3D to a block. If you can
+ * imagine this in 4D, congrats!
+ *
+ * In addition to the boundaries, each bucket tracks additional details:
+ *
+ * * frequency (fraction of tuples it matches)
+ * * whether the boundaries are inclusive or exclusive
+ * * whether the dimension contains only NULL values
+ * * number of distinct values in each dimension (for building)
+ *
+ * and possibly some additional information.
+ *
+ * We do expect to support multiple histogram types, with different
+ * features etc. The 'type' field is used to identify those types.
+ * Technically some histogram types might use completely different
+ * bucket representation, but that's not expected at the moment.
+ *
+ * Although the current implementation builds non-overlapping buckets,
+ * the code does not rely on the non-overlapping nature - there are
+ * interesting types of histograms / histogram building algorithms
+ * producing overlapping buckets.
+ *
+ * TODO Currently the histogram does not include information about what
+ * part of the table it covers (because the frequencies are
+ * computed from the rows that may be filtered by MCV list). Seems
+ * wrong, possibly causing misestimates (when not matching the MCV
+ * list, we'll probably get much higher selectivity).
+ *
+ *
+ * Estimating selectivity
+ * ----------------------
+ * With histograms, we always "match" a whole bucket, not individual
+ * rows (or values), irrespective of the type of clause. Therefore we
+ * can't use the optimizations for equality clauses, as in MCV lists.
+ *
+ * The current implementation uses histograms to estimate these types
+ * of clauses (think of WHERE conditions):
+ *
+ * (a) equality clauses WHERE (a = 1) AND (b = 2)
+ * (b) inequality clauses WHERE (a < 1) AND (b >= 2)
+ *
+ * It's possible to add more clauses, for example:
+ *
+ * (a) NULL clauses WHERE (a IS NULL) AND (b IS NOT NULL)
+ * (b) multi-var clauses WHERE (a > b)
+ *
+ * and so on. These are tasks for the future, not yet implemented.
+ *
+ * When used on low-cardinality data, histograms usually perform
+ * considerably worse than MCV lists (which are a good fit for this
+ * kind of data). This is especially true on categorical data, where
+ * the ordering of the values is only loosely related to the meaning
+ * of the data, as proper ordering is crucial for histograms.
+ *
+ * On high-cardinality data the histograms are usually a better choice,
+ * because MCV lists can't accurately represent the distribution.
+ *
+ * By evaluating a clause on a bucket, we may get one of three results:
+ *
+ * (a) FULL_MATCH - The bucket definitely matches the clause.
+ *
+ * (b) PARTIAL_MATCH - The bucket matches the clause, but not
+ * necessarily all the tuples it represents.
+ *
+ * (c) NO_MATCH - The bucket definitely does not match the clause.
+ *
+ * This may be illustrated using a range [1, 5], which is essentially
+ * a 1D bucket. With clause
+ *
+ * WHERE (a < 10) => FULL_MATCH (all range values are below
+ * 10, so the whole bucket matches)
+ *
+ * WHERE (a < 3) => PARTIAL_MATCH (there may be values matching
+ * the clause, but we don't know how many)
+ *
+ * WHERE (a < 0) => NO_MATCH (all range values are at least 1,
+ * no values from the bucket match)
+ *
+ * Some clauses may produce only some of those results - for example
+ * equality clauses may never produce FULL_MATCH as we always hit only
+ * part of the bucket, not all the values. This results in less accurate
+ * estimates compared to MCV lists, where we can match an MCV item exactly
+ * (an extreme case of that is 'full match').
+ *
+ * There are clauses that may not produce any PARTIAL_MATCH results.
+ * A nice example of that is 'IS [NOT] NULL' clause, which either
+ * matches the bucket completely (FULL_MATCH) or not at all (NO_MATCH),
+ * thanks to how the NULL-buckets are constructed.
+ *
+ * TODO The IS [NOT] NULL clause is not yet implemented, but should be
+ * rather trivial to add.
+ *
+ * Computing the total selectivity estimate is trivial - simply sum
+ * selectivities from all the FULL_MATCH and PARTIAL_MATCH buckets, but
+ * multiply the PARTIAL_MATCH buckets by 0.5 to minimize average error.
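+ *
+ * That is, with freq(b) being the frequency of bucket 'b':
+ *
+ * s = sum(freq(full)) + 0.5 * sum(freq(partial))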
+ *
+ *
+ * NULL handling
+ * -------------
+ * Buckets may not contain tuples with NULL and non-NULL values in
+ * a single dimension (attribute). To handle this, the histogram may
+ * contain NULL-buckets, i.e. buckets with one or more NULL-only
+ * dimensions.
+ *
+ * The maximum number of NULL-buckets is determined by the number of
+ * attributes the histogram is built on. For N-dimensional histogram,
+ * the maximum number of NULL-buckets is 2^N. So for 8 attributes
+ * (which is the current value of MVSTATS_MAX_DIMENSIONS), there may be
+ * up to 256 NULL-buckets.
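+ *
+ * For example, with two columns (a,b) there may be buckets where only
+ * 'a' is NULL-only, where only 'b' is NULL-only, and where both are,
+ * i.e. up to 2^2 = 4 combinations (including the no-NULLs case).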
+ *
+ * Those buckets are only built if needed - if there are no NULL values
+ * in the data, no such buckets are built.
+ *
+ *
+ * Serialization
+ * -------------
+ * After serialization, the histograms are marked with a 'magic' constant,
+ * to make sure the bytea really is a histogram in serialized form.
+ *
+ * FIXME info about deduplication
+ *
+ *
+ * TODO This structure is used both when building the histogram, and
+ * then when using it to compute estimates. That's why the last
+ * few elements are not used once the histogram is built.
+ *
+ * Add a pointer to 'private' data, meant for data specific to
+ * other algorithms for building the histogram. That would also
+ * remove the bogus / unnecessary fields.
+ *
+ * TODO The limit on number of buckets is quite arbitrary, aiming for
+ * sufficient accuracy while still being fast. Probably should be
+ * replaced with a dynamic limit dependent on the statistics target,
+ * the number of attributes (dimensions) and the statistics targets
+ * associated with the attributes. Also, this needs to be related
+ * to the number of sampled rows, by either clamping it to a
+ * reasonable number (after seeing the number of rows) or using
+ * it when computing the number of rows to sample. Something like
+ * 10 rows per bucket seems reasonable.
+ *
+ * TODO Add MVSTAT_HIST_ROWS_PER_BUCKET tracking minimal number of
+ * tuples per bucket (also, see the previous TODO).
+ *
+ * TODO We may replace the bool arrays with a suitably large data type
+ * (say, uint16 or uint32) and get rid of the allocations. It's
+ * unlikely we'll ever support more than 32 columns as that'd
+ * result in poor precision, huge histograms (splitting each
+ * dimension once would mean 2^32 buckets), and very expensive
+ * estimation. MCVItem already does it this way.
+ *
+ * Update: Actually, this is not 100% true, because we're splitting
+ * a single bucket, not all the buckets at the same time. So each
+ * split simply adds one new bucket, and we choose the bucket that
+ * is most in need of a split. So even with 32 columns this might
+ * give reasonable accuracy, maybe? After 1000 splits we'll get
+ * about 1001 buckets, and some may be quite large (if that area
+ * has a low frequency of tuples).
+ *
+ * There are other challenges though - e.g. with this many columns
+ * it's more likely to reference both label/non-label columns,
+ * which is rather quirky (especially with histograms).
+ *
+ * However, while this would save some space for histograms built
+ * on many columns, it won't save anything for up to 4 columns
+ * (actually, on less than 3 columns it's probably wasteful).
+ *
+ * TODO Maybe the distinct stats (both for combination of all columns
+ * and for combinations of various subsets of columns) should be
+ * moved to a separate structure (next to histogram/MCV/...) to
+ * make it useful even without a histogram computed etc.
+ */
+
+static MVBucket create_initial_mv_bucket(int numrows, HeapTuple *rows,
+ int2vector *attrs,
+ VacAttrStats **stats);
+
+static MVBucket select_bucket_to_partition(int nbuckets, MVBucket * buckets);
+
+static MVBucket partition_bucket(MVBucket bucket, int2vector *attrs,
+ VacAttrStats **stats);
+
+static MVBucket copy_mv_bucket(MVBucket bucket, uint32 ndimensions);
+
+static void update_bucket_ndistinct(MVBucket bucket, int2vector *attrs,
+ VacAttrStats ** stats);
+
+static void update_dimension_ndistinct(MVBucket bucket, int dimension,
+ int2vector *attrs,
+ VacAttrStats ** stats,
+ bool update_boundaries);
+
+static void create_null_buckets(MVHistogram histogram, int bucket_idx,
+ int2vector *attrs, VacAttrStats ** stats);
+
+static int bsearch_comparator(const void * a, const void * b);
+
+/*
+ * Each serialized bucket needs to store (in this order):
+ *
+ * - number of tuples (float)
+ * - number of distinct (float)
+ * - min inclusive flags (ndim * sizeof(bool))
+ * - max inclusive flags (ndim * sizeof(bool))
+ * - null dimension flags (ndim * sizeof(bool))
+ * - min boundary indexes (2 * ndim * sizeof(int32))
+ * - max boundary indexes (2 * ndim * sizeof(int32))
+ *
+ * So in total:
+ *
+ * ndim * (4 * sizeof(int32) + 3 * sizeof(bool)) +
+ * 2 * sizeof(float)
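+ *
+ * For example, assuming 4-byte int32/float and 1-byte bool (as the
+ * macro below is written), a 2-dimensional bucket needs
+ *
+ * 2 * (4 * 4 + 3 * 1) + 2 * 4 = 46 bytes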
+ */
+#define BUCKET_SIZE(ndims) \
+ (ndims * (4 * sizeof(int32) + 3 * sizeof(bool)) + 2 * sizeof(float))
+
+/* pointers into a flat serialized bucket of BUCKET_SIZE(n) bytes */
+#define BUCKET_NTUPLES(b) ((float*)b)
+#define BUCKET_NDISTINCT(b) ((float*)(b + sizeof(float)))
+#define BUCKET_MIN_INCL(b,n) ((bool*)(b + 2 * sizeof(float)))
+#define BUCKET_MAX_INCL(b,n) (BUCKET_MIN_INCL(b,n) + n)
+#define BUCKET_NULLS_ONLY(b,n) (BUCKET_MAX_INCL(b,n) + n)
+#define BUCKET_MIN_INDEXES(b,n) ((int32*)(BUCKET_NULLS_ONLY(b,n) + n))
+#define BUCKET_MAX_INDEXES(b,n) ((BUCKET_MIN_INDEXES(b,n) + n))
+
+/* some debugging methods */
+#ifdef MVSTATS_DEBUG
+static void print_mv_histogram_info(MVHistogram histogram);
+#endif
+
+/*
+ * Building a multivariate histogram. In short, it first creates a single
+ * bucket containing all the rows, and then repeatedly splits it by
+ * searching for the bucket / dimension most in need of a split.
+ *
+ * The current criterion is rather simple - it looks at the number of
+ * distinct values (combinations of column values for a bucket, column
+ * values for a dimension). This is somewhat naive, but seems to work
+ * quite well. See the discussion at select_bucket_to_partition and
+ * partition_bucket for more details about alternative algorithms.
+ *
+ * So the current algorithm looks like this:
+ *
+ * build NULL-buckets (create_null_buckets)
+ *
+ * while [not reaching maximum number of buckets]
+ *
+ * choose bucket to partition (max distinct combinations)
+ * if no bucket to partition
+ * terminate the algorithm
+ *
+ * choose bucket dimension to partition (max distinct values)
+ * split the bucket into two buckets
+ */
+MVHistogram
+build_mv_histogram(int numrows, HeapTuple *rows, int2vector *attrs,
+ VacAttrStats **stats, int numrows_total)
+{
+ int i;
+ int ndistinct;
+ int numattrs = attrs->dim1;
+ int *ndistincts = (int*)palloc0(sizeof(int) * numattrs);
+
+ MVHistogram histogram = (MVHistogram)palloc0(sizeof(MVHistogramData));
+
+ HeapTuple * rows_copy = (HeapTuple*)palloc0(numrows * sizeof(HeapTuple));
+ memcpy(rows_copy, rows, sizeof(HeapTuple) * numrows);
+
+ Assert((numattrs >= 2) && (numattrs <= MVSTATS_MAX_DIMENSIONS));
+
+ histogram->ndimensions = numattrs;
+
+ histogram->magic = MVSTAT_HIST_MAGIC;
+ histogram->type = MVSTAT_HIST_TYPE_BASIC;
+ histogram->nbuckets = 1;
+
+ /* create max buckets (better than repalloc for short-lived objects) */
+ histogram->buckets = (MVBucket*)palloc0(MVSTAT_HIST_MAX_BUCKETS * sizeof(MVBucket));
+
+ /* create the initial bucket, covering the whole sample set */
+ histogram->buckets[0]
+ = create_initial_mv_bucket(numrows, rows_copy, attrs, stats);
+
+ /*
+ * We may use this to limit number of buckets too - there can never
+ * be more than ndistinct buckets (or ndistinct/k if we require at
+ * least k tuples per bucket).
+ *
+ * With NULL buckets it's a bit more complicated, because there may
+ * be 2^ndims NULL buckets, and if each contains a single tuple then
+ * there may be up to
+ *
+ * (ndistinct - 2^ndims)/k + 2^ndims
+ *
+ * buckets. But of course, it may happen that (ndistinct < 2^ndims)
+ * which needs to be checked.
+ *
+ * TODO Use this for alternative estimate of number of buckets.
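+ *
+ * For example (hypothetical numbers): with ndims=3, k=10 and
+ * ndistinct=1000, this gives (1000 - 8)/10 + 8 = ~107 buckets.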
+ */
+ ndistinct = histogram->buckets[0]->ndistinct;
+
+ /*
+ * The initial bucket may contain NULL values, so we have to create
+ * buckets with NULL-only dimensions.
+ *
+ * FIXME We may need up to 2^ndims buckets - check that there are
+ * enough buckets (MVSTAT_HIST_MAX_BUCKETS >= 2^ndims).
+ */
+ create_null_buckets(histogram, 0, attrs, stats);
+
+ /* keep the global ndistinct values */
+ for (i = 0; i < numattrs; i++)
+ ndistincts[i] = histogram->buckets[0]->ndistincts[i];
+
+ while (histogram->nbuckets < MVSTAT_HIST_MAX_BUCKETS)
+ {
+ MVBucket bucket = select_bucket_to_partition(histogram->nbuckets,
+ histogram->buckets);
+
+ /* no more buckets to partition */
+ if (bucket == NULL)
+ break;
+
+ histogram->buckets[histogram->nbuckets]
+ = partition_bucket(bucket, attrs, stats);
+
+ histogram->nbuckets += 1;
+ }
+
+ /* finalize the frequencies etc. */
+ for (i = 0; i < histogram->nbuckets; i++)
+ {
+ int d;
+ histogram->buckets[i]->ntuples
+ = (histogram->buckets[i]->numrows * 1.0) / numrows_total;
+ histogram->buckets[i]->ndistinct
+ = (histogram->buckets[i]->ndistinct * 1.0) / ndistinct;
+
+ for (d = 0; d < numattrs; d++)
+ histogram->buckets[i]->ndistincts[d]
+ = (histogram->buckets[i]->ndistincts[d] * 1.0) / ndistincts[d];
+ }
+
+ pfree(ndistincts);
+
+ return histogram;
+}
+
+/* fetch the histogram (as a bytea) from the pg_mv_statistic catalog */
+bytea *
+fetch_mv_histogram(Oid mvoid)
+{
+ Relation indrel;
+ SysScanDesc indscan;
+ ScanKeyData skey;
+ HeapTuple htup;
+ bytea *stahist = NULL;
+
+ /* Prepare to scan pg_mv_statistic for the entry with the given OID. */
+ ScanKeyInit(&skey,
+ ObjectIdAttributeNumber,
+ BTEqualStrategyNumber, F_OIDEQ,
+ ObjectIdGetDatum(mvoid));
+
+ indrel = heap_open(MvStatisticRelationId, AccessShareLock);
+ indscan = systable_beginscan(indrel, MvStatisticOidIndexId, true,
+ NULL, 1, &skey);
+
+ while (HeapTupleIsValid(htup = systable_getnext(indscan)))
+ {
+ bool isnull = false;
+ Datum hist = SysCacheGetAttr(MVSTATOID, htup,
+ Anum_pg_mv_statistic_stahist, &isnull);
+
+ Assert(!isnull);
+
+ stahist = DatumGetByteaP(hist);
+
+ break;
+ }
+
+ systable_endscan(indscan);
+
+ heap_close(indrel, AccessShareLock);
+
+ /*
+ * TODO Maybe save the histogram into relcache, as in RelationGetIndexList
+ * (which served as an inspiration for this function)?
+ */
+
+ return stahist;
+}
+
+/* print some basic info about the histogram */
+Datum
+pg_mv_stats_histogram_info(PG_FUNCTION_ARGS)
+{
+ bytea *data = PG_GETARG_BYTEA_P(0);
+ char *result;
+
+ MVHistogram hist = deserialize_mv_histogram(data);
+
+ result = palloc0(128);
+ snprintf(result, 128, "nbuckets=%d", hist->nbuckets);
+
+ PG_RETURN_TEXT_P(cstring_to_text(result));
+}
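+
+/*
+ * A simple way to inspect this from SQL is the pg_mv_stats view, e.g.
+ * (a sketch, showing only the histogram-related columns):
+ *
+ * SELECT histbytes, histinfo FROM pg_mv_stats;
+ */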
+
+
+/*
+ * Used to pass the sort-support context into bsearch_comparator, as
+ * bsearch() (unlike qsort_arg) has no 'extra argument' parameter.
+ */
+static SortSupport ssup_private = NULL;
+
+/*
+ * Serialize the MV histogram into a bytea value. The basic algorithm
+ * is simple, and mostly mimics the MCV serialization:
+ *
+ * (1) perform deduplication for each attribute (separately)
+ * (a) collect all (non-NULL) attribute values from all buckets
+ * (b) sort the data (using 'lt' from VacAttrStats)
+ * (c) remove duplicate values from the array
+ *
+ * (2) serialize the arrays into a bytea value
+ *
+ * (3) process all buckets
+ * (a) replace min/max values with indexes into the arrays
+ *
+ * Each attribute has to be processed separately, because we're mixing
+ * different datatypes, and we don't know what equality means for them.
+ * We're also mixing pass-by-value and pass-by-ref types, and so on.
+ *
+ * We'll use 32-bit values for the indexes in step (3), although we
+ * could probably use just 16 bits, as we don't allow more than 8k
+ * buckets in the histogram (well, we might increase this to 16k and
+ * still fit into signed 16 bits). But let's be lazy and rely on the
+ * varlena compression to kick in - most bytes will be 0x00, so it
+ * should compress nicely.
+ *
+ *
+ * Deduplication in serialization
+ * ------------------------------
+ * The deduplication is very effective and important here, because every
+ * time we split a bucket, we keep all the boundary values, except for
+ * the dimension that was used for the split. Another way to look at
+ * this is that each split introduces 1 new value (the value used to do
+ * the split). A histogram with M buckets was created by (M-1) splits
+ * of the initial bucket, and each bucket has 2*N boundary values. So
+ * assuming the initial bucket does not have any 'collapsed' dimensions,
+ * the number of distinct values is
+ *
+ * (2*N + (M-1))
+ *
+ * but the total number of boundary values is
+ *
+ * 2*N*M
+ *
+ * which is clearly much higher. For a histogram on two columns, with
+ * 1024 buckets, it's 1027 vs. 4096. Of course, we're not saving all
+ * the difference (because we'll use 32-bit indexes into the values).
+ * But with large values (e.g. stored as varlena), this saves a lot.
+ *
+ * An interesting feature is that the total number of distinct values
+ * does not really grow with the number of dimensions, except for the
+ * size of the initial bucket. After that it only depends on number of
+ * buckets (i.e. number of splits).
+ *
+ * XXX Of course this only holds for the current histogram building
+ * algorithm. Algorithms doing the splits differently (e.g.
+ * producing overlapping buckets) may behave differently.
+ *
+ * TODO This only confirms we can use the uint16 indexes. The worst
+ * that could happen is if all the splits happened by a single
+ * dimension. To exhaust the uint16 this would require ~64k
+ * splits (needs to be reflected in MVSTAT_HIST_MAX_BUCKETS).
+ *
+ * TODO We don't need to use a separate boolean for each flag, instead
+ * use a single char and set bits.
+ *
+ * TODO We might get a bit better compression by considering the actual
+ * data type length. The current implementation treats all data
+ * types passed by value as requiring 8B, but for INT it's actually
+ * just 4B etc.
+ *
+ * OTOH this is only related to the lookup table, and most of the
+ * space is occupied by the buckets (with int16 indexes).
+ *
+ *
+ * Varlena compression
+ * -------------------
+ * This encoding may prevent automatic varlena compression (similarly
+ * to JSONB), because first part of the serialized bytea will be an
+ * array of unique values (although sorted), and pglz decides whether
+ * to compress by trying to compress the first part (~1kB or so), which
+ * is likely to compress poorly, due to the lack of repetition.
+ *
+ * One possible cure to that might be storing the buckets first, and
+ * then the deduplicated arrays. The buckets might be better suited
+ * for compression.
+ *
+ * On the other hand the encoding scheme is a context-aware compression,
+ * usually compressing to ~30% (or less, with large data types). So the
+ * lack of pglz compression may be OK.
+ *
+ * XXX But maybe we don't really want to compress this, to save on
+ * planning time?
+ *
+ * TODO Try storing the buckets / deduplicated arrays in reverse order,
+ * measure impact on compression.
+ *
+ *
+ * Deserialization
+ * ---------------
+ * The deserialization is currently implemented so that it reconstructs
+ * the histogram back into the same structures - this involves quite
+ * a few memcpy() and palloc() calls, but maybe we could create a special
+ * structure for the serialized histogram, and access the data directly,
+ * without the unpacking.
+ *
+ * Not only would it save some memory and CPU time, but it might
+ * actually work better with CPU caches (not polluting the caches).
+ *
+ * TODO Try to keep the compressed form, instead of deserializing it to
+ * MVHistogram/MVBucket.
+ *
+ *
+ * General TODOs
+ * -------------
+ * FIXME This probably leaks memory, or at least uses it inefficiently
+ * (many small palloc() calls instead of a large one).
+ *
+ * TODO Consider packing boolean flags (NULL) for each item into 'char'
+ * or a longer type (instead of using an array of bool items).
+ */
+bytea *
+serialize_mv_histogram(MVHistogram histogram, int2vector *attrs,
+ VacAttrStats **stats)
+{
+ int i = 0, j = 0;
+ Size total_length = 0;
+
+ bytea *output = NULL;
+ char *data = NULL;
+
+ int nbuckets = histogram->nbuckets;
+ int ndims = histogram->ndimensions;
+
+ /* allocated for serialized bucket data */
+ int bucketsize = BUCKET_SIZE(ndims);
+ char *bucket = palloc0(bucketsize);
+
+ /* values per dimension (and number of non-NULL values) */
+ Datum **values = (Datum**)palloc0(sizeof(Datum*) * ndims);
+ int *counts = (int*)palloc0(sizeof(int) * ndims);
+
+ /* info about dimensions (for deserialize) */
+ DimensionInfo * info
+ = (DimensionInfo *)palloc0(sizeof(DimensionInfo)*ndims);
+
+ /* sort support data */
+ SortSupport ssup = (SortSupport)palloc0(sizeof(SortSupportData)*ndims);
+
+ /* collect and deduplicate values for each dimension separately */
+ for (i = 0; i < ndims; i++)
+ {
+ int count;
+ StdAnalyzeData *tmp = (StdAnalyzeData *)stats[i]->extra_data;
+
+ /* keep important info about the data type */
+ info[i].typlen = stats[i]->attrtype->typlen;
+ info[i].typbyval = stats[i]->attrtype->typbyval;
+
+ /*
+ * Allocate space for all min/max values, including NULLs
+ * (we won't use them, but we don't know how many there are),
+ * and then collect all non-NULL values.
+ */
+ values[i] = (Datum*)palloc0(sizeof(Datum) * nbuckets * 2);
+
+ for (j = 0; j < histogram->nbuckets; j++)
+ {
+ /* skip buckets where this dimension is NULL-only */
+ if (! histogram->buckets[j]->nullsonly[i])
+ {
+ values[i][counts[i]] = histogram->buckets[j]->min[i];
+ counts[i] += 1;
+
+ values[i][counts[i]] = histogram->buckets[j]->max[i];
+ counts[i] += 1;
+ }
+ }
+
+ /* there are just NULL values in this dimension */
+ if (counts[i] == 0)
+ continue;
+
+ /* sort and deduplicate */
+ ssup[i].ssup_cxt = CurrentMemoryContext;
+ ssup[i].ssup_collation = DEFAULT_COLLATION_OID;
+ ssup[i].ssup_nulls_first = false;
+
+ PrepareSortSupportFromOrderingOp(tmp->ltopr, &ssup[i]);
+
+ qsort_arg(values[i], counts[i], sizeof(Datum),
+ compare_scalars_simple, &ssup[i]);
+
+ /*
+ * Walk through the array and eliminate duplicate values, but
+ * keep the ordering (so that we can do bsearch later). We know
+ * there's at least 1 item, so we can skip the first element.
+ */
+ count = 1; /* number of deduplicated items */
+ for (j = 1; j < counts[i]; j++)
+ {
+ /* if it's different from the previous value, we need to keep it */
+ if (compare_datums_simple(values[i][j-1], values[i][j], &ssup[i]) != 0)
+ {
+ /* XXX: not needed if (count == j) */
+ values[i][count] = values[i][j];
+ count += 1;
+ }
+ }
+
+ /* keep info about the deduplicated count */
+ info[i].nvalues = count;
+
+ /* compute size of the serialized data */
+ if (info[i].typbyval)
+ /*
+ * passed by value, so just Datum array (int4, int8, ...)
+ *
+ * TODO Might save a few bytes here, by storing just typlen
+ * bytes instead of whole Datum (8B) on 64-bits.
+ */
+ info[i].nbytes = info[i].nvalues * sizeof(Datum);
+ else if (info[i].typlen > 0)
+ /* passed by reference, but fixed length (name, tid, ...) */
+ info[i].nbytes = info[i].nvalues * info[i].typlen;
+ else if (info[i].typlen == -1)
+ /* varlena, so just use VARSIZE_ANY */
+ for (j = 0; j < info[i].nvalues; j++)
+ info[i].nbytes += VARSIZE_ANY(values[i][j]);
+ else if (info[i].typlen == -2)
+ /* cstring, so simply strlen */
+ for (j = 0; j < info[i].nvalues; j++)
+ info[i].nbytes += strlen(DatumGetPointer(values[i][j]));
+ else
+ elog(ERROR, "unknown data type typbyval=%d typlen=%d",
+ info[i].typbyval, info[i].typlen);
+ }
+
+ /*
+ * Now we finally know how much space we'll need for the serialized
+ * histogram, as it contains these fields:
+ *
+ * - length (4B) for varlena
+ * - magic (4B)
+ * - type (4B)
+ * - ndimensions (4B)
+ * - nbuckets (4B)
+ * - info (ndim * sizeof(DimensionInfo)
+ * - arrays of values for each dimension
+ * - serialized buckets (nbuckets * bucketsize)
+ *
+ * So the 'header' size is 20B + ndim * sizeof(DimensionInfo) and
+ * then we'll place the data (and buckets).
+ */
+ total_length = (sizeof(int32) + offsetof(MVHistogramData, buckets)
+ + ndims * sizeof(DimensionInfo)
+ + nbuckets * bucketsize);
+
+ /* account for the deduplicated data */
+ for (i = 0; i < ndims; i++)
+ total_length += info[i].nbytes;
+
+ /* enforce arbitrary limit of 1MB */
+ if (total_length > 1024 * 1024)
+ elog(ERROR, "serialized histogram exceeds 1MB (%ld)", total_length);
+
+ /* allocate space for the serialized histogram list, set header */
+ output = (bytea*)palloc0(total_length);
+ SET_VARSIZE(output, total_length);
+
+ /* we'll use 'data' to keep track of the place to write data */
+ data = VARDATA(output);
+
+ memcpy(data, histogram, offsetof(MVHistogramData, buckets));
+ data += offsetof(MVHistogramData, buckets);
+
+ memcpy(data, info, sizeof(DimensionInfo) * ndims);
+ data += sizeof(DimensionInfo) * ndims;
+
+ /* value array for each dimension */
+ for (i = 0; i < ndims; i++)
+ {
+#ifdef USE_ASSERT_CHECKING
+ char *tmp = data;
+#endif
+ for (j = 0; j < info[i].nvalues; j++)
+ {
+ if (info[i].typbyval)
+ {
+ /* passed by value / Datum */
+ memcpy(data, &values[i][j], sizeof(Datum));
+ data += sizeof(Datum);
+ }
+ else if (info[i].typlen > 0)
+ {
+ /* passed by reference, but fixed length (name, tid, ...) */
+ memcpy(data, &values[i][j], info[i].typlen);
+ data += info[i].typlen;
+ }
+ else if (info[i].typlen == -1)
+ {
+ /* varlena */
+ memcpy(data, DatumGetPointer(values[i][j]),
+ VARSIZE_ANY(values[i][j]));
+ data += VARSIZE_ANY(values[i][j]);
+ }
+ else if (info[i].typlen == -2)
+ {
+ /* cstring (don't forget the \0 terminator!) */
+ memcpy(data, DatumGetPointer(values[i][j]),
+ strlen(DatumGetPointer(values[i][j])) + 1);
+ data += strlen(DatumGetPointer(values[i][j])) + 1;
+ }
+ }
+ Assert((data - tmp) == info[i].nbytes);
+ }
+
+ /* and finally, the histogram buckets */
+ for (i = 0; i < nbuckets; i++)
+ {
+ /* don't write beyond the allocated space */
+ Assert(data <= (char*)output + total_length - bucketsize);
+
+ /* reset the values for each item */
+ memset(bucket, 0, bucketsize);
+
+ *BUCKET_NTUPLES(bucket) = histogram->buckets[i]->ntuples;
+ *BUCKET_NDISTINCT(bucket) = histogram->buckets[i]->ndistinct;
+
+ for (j = 0; j < ndims; j++)
+ {
+ /* do the lookup only for non-NULL values */
+ if (! histogram->buckets[i]->nullsonly[j])
+ {
+ int idx;
+ Datum * v = NULL;
+ ssup_private = &ssup[j];
+
+ /* min boundary */
+ v = (Datum*)bsearch(&histogram->buckets[i]->min[j],
+ values[j], info[j].nvalues, sizeof(Datum),
+ bsearch_comparator);
+
+ if (v == NULL)
+ elog(ERROR, "value for dim %d not found in array", j);
+
+ /* compute index within the array */
+ idx = (v - values[j]);
+
+ Assert((idx >= 0) && (idx < info[j].nvalues));
+
+ BUCKET_MIN_INDEXES(bucket, ndims)[j] = idx;
+
+ /* max boundary */
+ v = (Datum*)bsearch(&histogram->buckets[i]->max[j],
+ values[j], info[j].nvalues, sizeof(Datum),
+ bsearch_comparator);
+
+ if (v == NULL)
+ elog(ERROR, "value for dim %d not found in array", j);
+
+ /* compute index within the array */
+ idx = (v - values[j]);
+
+ Assert((idx >= 0) && (idx < info[j].nvalues));
+
+ BUCKET_MAX_INDEXES(bucket, ndims)[j] = idx;
+ }
+ }
+
+ /* copy flags (nulls, min/max inclusive) */
+ memcpy(BUCKET_NULLS_ONLY(bucket, ndims),
+ histogram->buckets[i]->nullsonly, sizeof(bool) * ndims);
+
+ memcpy(BUCKET_MIN_INCL(bucket, ndims),
+ histogram->buckets[i]->min_inclusive, sizeof(bool) * ndims);
+
+ memcpy(BUCKET_MAX_INCL(bucket, ndims),
+ histogram->buckets[i]->max_inclusive, sizeof(bool) * ndims);
+
+ /* copy the item into the array */
+ memcpy(data, bucket, bucketsize);
+
+ data += bucketsize;
+ }
+
+ /* at this point we expect to match the total_length exactly */
+ Assert((data - (char*)output) == total_length);
+
+ /* FIXME free the values/counts arrays here */
+
+ return output;
+}
+
+/*
+ * Reverse of serialize_mv_histogram. This essentially expands the serialized
+ * form back to MVHistogram / MVBucket.
+ */
+MVHistogram
+deserialize_mv_histogram(bytea * data)
+{
+ int i = 0, j = 0;
+
+ Size expected_size;
+ char *tmp = NULL;
+ Datum **values = NULL;
+
+ MVHistogram histogram;
+ DimensionInfo *info;
+
+ int nbuckets;
+ int ndims;
+ int bucketsize;
+
+ if (data == NULL)
+ return NULL;
+
+ if (VARSIZE_ANY_EXHDR(data) < offsetof(MVHistogramData,buckets))
+ elog(ERROR, "invalid histogram size %ld (expected at least %ld)",
+ VARSIZE_ANY_EXHDR(data), offsetof(MVHistogramData,buckets));
+
+ /* read the histogram header */
+ histogram = (MVHistogram)palloc0(sizeof(MVHistogramData));
+
+ /* initialize pointer to the data part (skip the varlena header) */
+ tmp = VARDATA(data);
+
+ /* get the header and perform basic sanity checks */
+ memcpy(histogram, tmp, offsetof(MVHistogramData, buckets));
+ tmp += offsetof(MVHistogramData, buckets);
+
+ if (histogram->magic != MVSTAT_HIST_MAGIC)
+ elog(ERROR, "invalid histogram magic %d (expected %dd)",
+ histogram->magic, MVSTAT_HIST_MAGIC);
+
+ if (histogram->type != MVSTAT_HIST_TYPE_BASIC)
+ elog(ERROR, "invalid histogram type %d (expected %dd)",
+ histogram->type, MVSTAT_HIST_TYPE_BASIC);
+
+ nbuckets = histogram->nbuckets;
+ ndims = histogram->ndimensions;
+ bucketsize = BUCKET_SIZE(ndims);
+
+ Assert((nbuckets > 0) && (nbuckets <= MVSTAT_HIST_MAX_BUCKETS));
+ Assert((ndims >= 2) && (ndims <= MVSTATS_MAX_DIMENSIONS));
+
+ /*
+ * What size do we expect with those parameters? (This is incomplete,
+ * as we have yet to add the value-array sizes from the DimensionInfo
+ * records.)
+ */
+ expected_size = offsetof(MVHistogramData,buckets) +
+ ndims * sizeof(DimensionInfo) +
+ (nbuckets * bucketsize);
+
+ /* check that we have at least the DimensionInfo records */
+ if (VARSIZE_ANY_EXHDR(data) < expected_size)
+ elog(ERROR, "invalid histogram size %ld (expected %ld)",
+ VARSIZE_ANY_EXHDR(data), expected_size);
+
+ info = (DimensionInfo*)(tmp);
+ tmp += ndims * sizeof(DimensionInfo);
+
+ /* account for the value arrays */
+ for (i = 0; i < ndims; i++)
+ expected_size += info[i].nbytes;
+
+ if (VARSIZE_ANY_EXHDR(data) != expected_size)
+ elog(ERROR, "invalid histogram size %ld (expected %ld)",
+ VARSIZE_ANY_EXHDR(data), expected_size);
+
+ /* looks OK - not corrupted or something */
+
+ /* let's parse the value arrays */
+ values = (Datum**)palloc0(sizeof(Datum*) * ndims);
+
+ /*
+ * FIXME This uses pointers to the original data array (the types
+ * not passed by value), so when someone frees the memory,
+ * e.g. by doing something like this:
+ *
+ * bytea * data = ... fetch the data from catalog ...
+ * MVHistogram histogram = deserialize_mv_histogram(data);
+ * pfree(data);
+ *
+ * then 'histogram' references the freed memory. This needs to
+ * copy the pieces.
+ *
+ * TODO same as in MCV deserialization / consider moving to common.c
+ */
+ for (i = 0; i < ndims; i++)
+ {
+ if (info[i].typbyval)
+ {
+ /* passed by value / Datum - simply reuse the array */
+ values[i] = (Datum*)tmp;
+ tmp += info[i].nbytes;
+ }
+ else if (info[i].typlen > 0)
+ {
+ /* passed by reference, but fixed length (name, tid, ...) */
+ values[i] = (Datum*)palloc0(sizeof(Datum) * info[i].nvalues);
+ for (j = 0; j < info[i].nvalues; j++)
+ {
+ /* just point into the array */
+ values[i][j] = PointerGetDatum(tmp);
+ tmp += info[i].typlen;
+ }
+ }
+ else if (info[i].typlen == -1)
+ {
+ /* varlena */
+ values[i] = (Datum*)palloc0(sizeof(Datum) * info[i].nvalues);
+ for (j = 0; j < info[i].nvalues; j++)
+ {
+ /* just point into the array */
+ values[i][j] = PointerGetDatum(tmp);
+ tmp += VARSIZE_ANY(tmp);
+ }
+ }
+ else if (info[i].typlen == -2)
+ {
+ /* cstring */
+ values[i] = (Datum*)palloc0(sizeof(Datum) * info[i].nvalues);
+ for (j = 0; j < info[i].nvalues; j++)
+ {
+ /* just point into the array */
+ values[i][j] = PointerGetDatum(tmp);
+ tmp += (strlen(tmp) + 1); /* don't forget the \0 */
+ }
+ }
+ }
+
+ /* allocate space for the buckets */
+ histogram->buckets = (MVBucket*)palloc0(sizeof(MVBucket) * nbuckets);
+
+ for (i = 0; i < nbuckets; i++)
+ {
+ MVBucket bucket = (MVBucket)palloc0(sizeof(MVBucketData));
+
+ bucket->nullsonly = (bool*) palloc0(sizeof(bool) * ndims);
+ bucket->min_inclusive = (bool*) palloc0(sizeof(bool) * ndims);
+ bucket->max_inclusive = (bool*) palloc0(sizeof(bool) * ndims);
+
+ bucket->min = (Datum*) palloc0(sizeof(Datum) * ndims);
+ bucket->max = (Datum*) palloc0(sizeof(Datum) * ndims);
+
+ bucket->ntuples = *BUCKET_NTUPLES(tmp);
+ bucket->ndistinct = *BUCKET_NDISTINCT(tmp);
+
+ memcpy(bucket->nullsonly, BUCKET_NULLS_ONLY(tmp, ndims),
+ sizeof(bool) * ndims);
+
+ memcpy(bucket->min_inclusive, BUCKET_MIN_INCL(tmp, ndims),
+ sizeof(bool) * ndims);
+
+ memcpy(bucket->max_inclusive, BUCKET_MAX_INCL(tmp, ndims),
+ sizeof(bool) * ndims);
+
+ /* translate the indexes to values */
+ for (j = 0; j < ndims; j++)
+ {
+ if (! bucket->nullsonly[j])
+ {
+ bucket->min[j] = values[j][BUCKET_MIN_INDEXES(tmp, ndims)[j]];
+ bucket->max[j] = values[j][BUCKET_MAX_INDEXES(tmp, ndims)[j]];
+ }
+ }
+
+ histogram->buckets[i] = bucket;
+
+ Assert(tmp <= (char*)data + VARSIZE_ANY(data));
+
+ tmp += bucketsize;
+ }
+
+ /* at this point we expect to match the total_length exactly */
+ Assert((tmp - VARDATA(data)) == expected_size);
+
+ return histogram;
+}
+
+/*
+ * Build the initial bucket, which will be then split into smaller
+ * buckets.
+ *
+ * TODO Add ndistinct estimation, probably the one described in "Towards
+ * Estimation Error Guarantees for Distinct Values" (PODS 2000,
+ * p. 268-279) - the estimators called GEE, or maybe AE.
+ *
+ * TODO The "combined" ndistinct is more likely to scale with the number
+ * of rows (in the table), because a single column behaving this
+ * way is sufficient for such behavior.
+ */
+static MVBucket
+create_initial_mv_bucket(int numrows, HeapTuple *rows, int2vector *attrs,
+ VacAttrStats **stats)
+{
+ int i;
+ int numattrs = attrs->dim1;
+
+ /* TODO allocate bucket as a single piece, including all the fields. */
+ MVBucket bucket = (MVBucket)palloc0(sizeof(MVBucketData));
+
+ Assert(numrows > 0);
+ Assert(rows != NULL);
+ Assert((numattrs >= 2) && (numattrs <= MVSTATS_MAX_DIMENSIONS));
+
+ /* allocate the per-dimension arrays */
+ bucket->ndistincts = (uint32*)palloc0(numattrs * sizeof(uint32));
+ bucket->nullsonly = (bool*)palloc0(numattrs * sizeof(bool));
+
+ /* inclusiveness boundaries - lower/upper bounds */
+ bucket->min_inclusive = (bool*)palloc0(numattrs * sizeof(bool));
+ bucket->max_inclusive = (bool*)palloc0(numattrs * sizeof(bool));
+
+ /* lower/upper boundaries */
+ bucket->min = (Datum*)palloc0(numattrs * sizeof(Datum));
+ bucket->max = (Datum*)palloc0(numattrs * sizeof(Datum));
+
+ /* all the sample rows fall into the initial bucket */
+ bucket->numrows = numrows;
+ bucket->ntuples = numrows;
+ bucket->rows = rows;
+
+ /*
+ * Update the number of distinct combinations in the bucket (which
+ * we use when selecting a bucket to partition), and then the number
+ * of distinct values for each dimension (which we use when choosing
+ * which dimension to split).
+ */
+ update_bucket_ndistinct(bucket, attrs, stats);
+
+ /* Update ndistinct (and also set min/max) for all dimensions. */
+ for (i = 0; i < numattrs; i++)
+ update_dimension_ndistinct(bucket, i, attrs, stats, true);
+
+ /*
+ * The initial bucket was not split at all, so we'll start with the
+ * first dimension in the next round (index = 0).
+ */
+ bucket->last_split_dimension = -1;
+
+ return bucket;
+}
+
+/*
+ * TODO Fix to handle arbitrarily-sized histograms (not just 2D ones)
+ * and call the right output procedures (for the particular type).
+ *
+ * TODO This should somehow fetch info about the data types, and use
+ * the appropriate output functions to print the boundary values.
+ * Right now this prints the 8B value as an integer.
+ *
+ * TODO Also, provide a special function for 2D histogram, printing
+ * a gnuplot script (with rectangles).
+ *
+ * TODO For string types (once supported) we can sort the strings first,
+ * assign them a sequence of integers and use the original values
+ * as labels.
+ */
+#ifdef MVSTATS_DEBUG
+static void
+print_mv_histogram_info(MVHistogram histogram)
+{
+ int i = 0;
+
+ elog(WARNING, "histogram nbuckets=%d", histogram->nbuckets);
+
+ for (i = 0; i < histogram->nbuckets; i++)
+ {
+ MVBucket bucket = histogram->buckets[i];
+ elog(WARNING, " bucket %d : ndistinct=%f ntuples=%d min=[%ld, %ld], max=[%ld, %ld] distinct=[%d,%d]",
+ i, bucket->ndistinct, bucket->numrows,
+ bucket->min[0], bucket->min[1], bucket->max[0], bucket->max[1],
+ bucket->ndistincts[0], bucket->ndistincts[1]);
+ }
+}
+#endif
+
+/*
+ * A very simple partitioning selection criteria - choose the bucket
+ * with the highest number of distinct values.
+ *
+ * Returns either pointer to the bucket selected to be partitioned,
+ * or NULL if there are no buckets that may be split (i.e. all buckets
+ * contain a single distinct value).
+ *
+ * TODO Consider other partitioning criteria (v-optimal, maxdiff etc.).
+ *
+ * TODO Allowing the bucket to degenerate to a single combination of
+ * values makes it a rather strange MCV list. Maybe we should use
+ * a higher lower boundary, or maybe make the selection criteria
+ * more complex (e.g. consider number of rows in the bucket, etc.).
+ *
+ * That however is different from buckets 'degenerated' only for
+ * some dimensions (e.g. half of them), which is perfectly
+ * appropriate for statistics on a combination of low and high
+ * cardinality columns.
+ */
+static MVBucket
+select_bucket_to_partition(int nbuckets, MVBucket * buckets)
+{
+ int i;
+ int ndistinct = 1; /* if ndistinct=1, we can't split the bucket */
+ MVBucket bucket = NULL;
+
+ for (i = 0; i < nbuckets; i++)
+ {
+ /* if the ndistinct count is higher, use this bucket */
+ if (buckets[i]->ndistinct > ndistinct)
+ {
+ bucket = buckets[i];
+ ndistinct = buckets[i]->ndistinct;
+ }
+ }
+
+ /* may be NULL if there are no buckets with (ndistinct > 1) */
+ return bucket;
+}
+
+/*
+ * A simple bucket partitioning implementation - splits the dimensions in
+ * a round-robin manner (considering only those with ndistinct > 1).
+ * That is, dimension 0 is split first, then 1, 2, ... until reaching
+ * the end of the attribute list, and then wrapping back to 0. Of
+ * course, dimensions with a single distinct value are skipped.
+ *
+ * This is essentially what Muralikrishna/DeWitt described in their SIGMOD
+ * article (M. Muralikrishna, David J. DeWitt: Equi-Depth Histograms For
+ * Estimating Selectivity Factors For Multi-Dimensional Queries. SIGMOD
+ * Conference 1988: 28-36).
+ *
+ * There are multiple histogram options, centered around the partitioning
+ * criteria, specifying both how to choose a bucket and the dimension
+ * most in need of a split. For a nice summary and general overview, see
+ * "rK-Hist : an R-Tree based histogram for multi-dimensional selectivity
+ * estimation" thesis by J. A. Lopez, Concordia University, p.34-37 (and
+ * possibly p. 32-34 for explanation of the terms).
+ *
+ * This splits the bucket by tweaking the existing one, and returning the
+ * new bucket (essentially shrinking the existing one in-place and returning
+ * the other "half" as a new bucket). The caller is responsible for adding
+ * the new bucket into the list of buckets.
+ *
+ * TODO It requires care to prevent splitting only one dimension and not
+ * splitting another one at all (which might happen easily in case of
+ * strongly dependent columns - e.g. y=x).
+ *
+ * TODO Should probably consider statistics target for the columns (e.g. to
+ * split dimensions with higher statistics target more frequently).
+ */
+static MVBucket
+partition_bucket(MVBucket bucket, int2vector *attrs,
+ VacAttrStats **stats)
+{
+ int i;
+ int dimension;
+ int numattrs = attrs->dim1;
+
+ Datum split_value;
+ MVBucket new_bucket;
+
+ /* needed for sort, when looking for the split value */
+ bool isNull;
+ int nvalues = 0;
+ StdAnalyzeData * mystats = NULL;
+ ScalarItem * values = (ScalarItem*)palloc0(bucket->numrows * sizeof(ScalarItem));
+ SortSupportData ssup;
+
+ /* looking for the split value */
+ int ndistinct = 1; /* number of distinct values below current value */
+ int nrows = 1; /* number of rows below current value */
+
+ /* needed when splitting the values */
+ HeapTuple * oldrows = bucket->rows;
+ int oldnrows = bucket->numrows;
+
+ /*
+ * We can't split buckets with a single distinct value (this also
+ * disqualifies NULL-only dimensions). Also, there have to be multiple
+ * sample rows (otherwise there couldn't be multiple distinct values).
+ */
+ Assert(bucket->ndistinct > 1);
+ Assert(bucket->numrows > 1);
+ Assert((numattrs >= 2) && (numattrs <= MVSTATS_MAX_DIMENSIONS));
+
+ /*
+ * Look for the next dimension to split, in a round robin manner.
+ * We'll use the first one with (ndistinct > 1).
+ *
+ * If we happen to wrap all the way around to the dimension we split
+ * last, something clearly went wrong - the Assert below catches that
+ * (which is also why last_split_dimension is not updated until the
+ * loop terminates).
+ */
+ dimension = bucket->last_split_dimension;
+ while (true)
+ {
+ dimension = (dimension + 1) % numattrs;
+
+ if (bucket->ndistincts[dimension] > 1)
+ break;
+
+ /* if we reached the previous split dimension again, we're in an infinite loop */
+ Assert(dimension != bucket->last_split_dimension);
+ }
+
+ /* Remember the dimension for the next split of this bucket. */
+ bucket->last_split_dimension = dimension;
+
+ /*
+ * Walk through the selected dimension, collect and sort the values
+ * and then choose the value to use as the new boundary.
+ */
+ mystats = (StdAnalyzeData *) stats[dimension]->extra_data;
+
+ /* initialize sort support, etc. */
+ memset(&ssup, 0, sizeof(ssup));
+ ssup.ssup_cxt = CurrentMemoryContext;
+
+ /* We always use the default collation for statistics */
+ ssup.ssup_collation = DEFAULT_COLLATION_OID;
+ ssup.ssup_nulls_first = false;
+
+ PrepareSortSupportFromOrderingOp(mystats->ltopr, &ssup);
+
+ for (i = 0; i < bucket->numrows; i++)
+ {
+ /* remember the index of the sample row, to make the partitioning simpler */
+ values[nvalues].value = heap_getattr(bucket->rows[i], attrs->values[dimension],
+ stats[dimension]->tupDesc, &isNull);
+ values[nvalues].tupno = i;
+
+ /* no NULL values allowed here (we don't do splits by null-only dimensions) */
+ Assert(!isNull);
+
+ nvalues++;
+ }
+
+ /* sort the array of values */
+ qsort_arg((void *) values, nvalues, sizeof(ScalarItem),
+ compare_scalars_partition, (void *) &ssup);
+
+ /*
+ * We know there are bucket->ndistincts[dimension] distinct values
+ * in this dimension, and we want to split this into half, so walk
+ * through the array and stop once we see (ndistinct/2) values.
+ *
+ * We always choose the "next" value, i.e. (n/2+1)-th distinct value,
+ * and use it as an exclusive upper boundary (and inclusive lower
+ * boundary).
+ *
+ * TODO Maybe we should use "average" of the two middle distinct
+ * values (at least for even distinct counts), but that would
+ * require being able to do an average (which does not work
+ * for non-arithmetic types).
+ *
+ * TODO Another option is to look for a split that'd give about
+ * 50% tuples (not distinct values) in each partition. That
+ * might work better when there are a few very frequent
+ * values, and many rare ones.
+ */
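+ /*
+ * A worked example (illustration only): for sorted values
+ * (1,1,2,2,3,3) the dimension has ndistinct = 3, so the loop below
+ * stops at the first '2' (the second distinct value). Thus '2'
+ * becomes the exclusive upper bound of this bucket and the inclusive
+ * lower bound of the new one, with nrows = 2 rows staying here.
+ */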
+ split_value = values[0].value;
+ for (i = 1; i < bucket->numrows; i++)
+ {
+ /* count distinct values */
+ if (values[i].value != values[i-1].value)
+ ndistinct += 1;
+
+ /* once we've seen half of the distinct values, use the current one */
+ if (ndistinct > bucket->ndistincts[dimension] / 2)
+ {
+ split_value = values[i].value;
+ break;
+ }
+
+ /* keep track how many rows belong to the first bucket */
+ nrows += 1;
+ }
+
+ Assert(nrows > 0);
+ Assert(nrows < bucket->numrows);
+
+ /* create the new bucket as an (incomplete) copy of the one being partitioned */
+ new_bucket = copy_mv_bucket(bucket, numattrs);
+
+ /*
+ * Do the actual split of the chosen dimension, using the split value as the
+ * upper bound for the existing bucket, and lower bound for the new one.
+ */
+ bucket->max[dimension] = split_value;
+ new_bucket->min[dimension] = split_value;
+
+ bucket->max_inclusive[dimension] = false;
+ new_bucket->min_inclusive[dimension] = true;
+
+ /*
+ * Redistribute the sample tuples using the 'ScalarItem->tupno'
+ * index. We know 'nrows' rows should remain in the original
+ * bucket and the rest goes to the new one.
+ */
+
+ bucket->rows = (HeapTuple*)palloc0(nrows * sizeof(HeapTuple));
+ new_bucket->rows = (HeapTuple*)palloc0((oldnrows - nrows) * sizeof(HeapTuple));
+
+ bucket->numrows = nrows;
+ new_bucket->numrows = (oldnrows - nrows);
+
+ /*
+ * The first nrows should go to the first bucket, the rest should
+ * go to the new one. Use the tupno field to get the actual HeapTuple
+ * row from the original array of sample rows.
+ */
+ for (i = 0; i < nrows; i++)
+ memcpy(&bucket->rows[i], &oldrows[values[i].tupno], sizeof(HeapTuple));
+
+ for (i = nrows; i < oldnrows; i++)
+ memcpy(&new_bucket->rows[i-nrows], &oldrows[values[i].tupno], sizeof(HeapTuple));
+
+ /* update ndistinct values for the buckets (total and per dimension) */
+ update_bucket_ndistinct(bucket, attrs, stats);
+ update_bucket_ndistinct(new_bucket, attrs, stats);
+
+ /*
+ * TODO We don't need to do this for the dimension we used for split,
+ * because we know how many distinct values went to each partition.
+ */
+ for (i = 0; i < numattrs; i++)
+ {
+ update_dimension_ndistinct(bucket, i, attrs, stats, false);
+ update_dimension_ndistinct(new_bucket, i, attrs, stats, false);
+ }
+
+ pfree(oldrows);
+ pfree(values);
+
+ return new_bucket;
+}
+
+/*
+ * Copy a histogram bucket. The copy does not include the build-time
+ * data, i.e. sampled rows etc.
+ */
+static MVBucket
+copy_mv_bucket(MVBucket bucket, uint32 ndimensions)
+{
+ /* TODO allocate as a single piece (including all the fields) */
+ MVBucket new_bucket = (MVBucket)palloc0(sizeof(MVBucketData));
+
+ /*
+ * Copy only the attributes that will stay the same after the split;
+ * the rest will be recomputed once the split is done.
+ */
+
+ new_bucket->last_split_dimension = bucket->last_split_dimension;
+
+ /* allocate the per-dimension arrays */
+ new_bucket->ndistincts = (uint32*)palloc0(ndimensions * sizeof(uint32));
+ new_bucket->nullsonly = (bool*)palloc0(ndimensions * sizeof(bool));
+
+ /* inclusiveness boundaries - lower/upper bounds */
+ new_bucket->min_inclusive = (bool*)palloc0(ndimensions * sizeof(bool));
+ new_bucket->max_inclusive = (bool*)palloc0(ndimensions * sizeof(bool));
+
+ /* lower/upper boundaries */
+ new_bucket->min = (Datum*)palloc0(ndimensions * sizeof(Datum));
+ new_bucket->max = (Datum*)palloc0(ndimensions * sizeof(Datum));
+
+ /* copy data */
+ memcpy(new_bucket->nullsonly, bucket->nullsonly, ndimensions * sizeof(bool));
+
+ memcpy(new_bucket->min_inclusive, bucket->min_inclusive, ndimensions*sizeof(bool));
+ memcpy(new_bucket->min, bucket->min, ndimensions*sizeof(Datum));
+
+ memcpy(new_bucket->max_inclusive, bucket->max_inclusive, ndimensions*sizeof(bool));
+ memcpy(new_bucket->max, bucket->max, ndimensions*sizeof(Datum));
+
+ return new_bucket;
+}
+
+/*
+ * Counts the number of distinct value combinations in the bucket. This
+ * just copies the Datum values into a simple array of items, sorts it
+ * using a multi-column comparator (built from the per-column sort
+ * support), and compares adjacent items.
+ *
+ * TODO This might evaluate and store the distinct counts for all
+ * possible attribute combinations. The assumption is this might be
+ * useful for estimating things like GROUP BY cardinalities (e.g.
+ * in cases when some buckets contain a lot of low-frequency
+ * combinations, and other buckets contain few high-frequency ones).
+ *
+ * But it's unclear whether it's worth the price. Computing this
+ * is actually quite cheap, because it may be evaluated at the very
+ * end, when the buckets are rather small (so sorting it in 2^N ways
+ * is not a big deal). Assuming the partitioning algorithm does not
+ * use these values to do the decisions, of course (the current
+ * algorithm does not).
+ *
+ * The overhead with storing, fetching and parsing the data is more
+ * concerning - adding 2^N values per bucket (even if it's just
+ * a 1B or 2B value) would significantly bloat the histogram, and
+ * thus the impact on optimizer. Which is not really desirable.
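+ *
+ * (For illustration: with 3 columns there are 2^3 - 1 = 7 non-empty
+ * attribute combinations - {a}, {b}, {c}, {a,b}, {a,c}, {b,c} and
+ * {a,b,c} - hence the 2^N concern above.)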
+ *
+ * TODO This only updates the ndistinct for the sample (or bucket), but
+ * we eventually need an estimate of the total number of distinct
+ * values in the dataset. It's possible to either use the current
+ * 1D approach (i.e., if it's more than 10% of the sample, assume
+ * it's proportional to the number of rows). Or it's possible to
+ * implement the estimator suggested in the article, supposedly
+ * giving 'optimal' estimates (w.r.t. probability of error).
+ */
+static void
+update_bucket_ndistinct(MVBucket bucket, int2vector *attrs, VacAttrStats ** stats)
+{
+ int i, j;
+ int numrows = bucket->numrows;
+ int numattrs = attrs->dim1;
+
+ MultiSortSupport mss = multi_sort_init(numattrs);
+
+ /*
+ * We could collect these values while already walking through the
+ * sample rows elsewhere (as implemented, heap_getattr ends up being
+ * called twice for each value).
+ */
+ SortItem *items = (SortItem*)palloc0(numrows * sizeof(SortItem));
+ Datum *values = (Datum*)palloc0(numrows * sizeof(Datum) * numattrs);
+ bool *isnull = (bool*)palloc0(numrows * sizeof(bool) * numattrs);
+
+ for (i = 0; i < numrows; i++)
+ {
+ items[i].values = &values[i * numattrs];
+ items[i].isnull = &isnull[i * numattrs];
+ }
+
+ /* prepare the sort functions for all dimensions */
+ for (i = 0; i < numattrs; i++)
+ multi_sort_add_dimension(mss, i, i, stats);
+
+ /* collect the values */
+ for (i = 0; i < numrows; i++)
+ for (j = 0; j < numattrs; j++)
+ items[i].values[j]
+ = heap_getattr(bucket->rows[i], attrs->values[j],
+ stats[j]->tupDesc, &items[i].isnull[j]);
+
+ qsort_arg((void *) items, numrows, sizeof(SortItem),
+ multi_sort_compare, mss);
+
+ bucket->ndistinct = 1;
+
+ for (i = 1; i < numrows; i++)
+ if (multi_sort_compare(&items[i], &items[i-1], mss) != 0)
+ bucket->ndistinct += 1;
+
+ pfree(items);
+ pfree(values);
+ pfree(isnull);
+}
+
+/*
+ * Count distinct values per bucket dimension.
+ */
+static void
+update_dimension_ndistinct(MVBucket bucket, int dimension, int2vector *attrs,
+ VacAttrStats ** stats, bool update_boundaries)
+{
+ int j;
+ int nvalues = 0;
+ bool isNull;
+ Datum * values = (Datum*)palloc0(bucket->numrows * sizeof(Datum));
+ SortSupportData ssup;
+
+ StdAnalyzeData * mystats = (StdAnalyzeData *) stats[dimension]->extra_data;
+
+ /* if we already know this is a NULL-only dimension, we're done */
+ if (bucket->nullsonly[dimension])
+ {
+ bucket->ndistincts[dimension] = 1;
+ pfree(values);
+ return;
+ }
+
+ memset(&ssup, 0, sizeof(ssup));
+ ssup.ssup_cxt = CurrentMemoryContext;
+
+ /* We always use the default collation for statistics */
+ ssup.ssup_collation = DEFAULT_COLLATION_OID;
+ ssup.ssup_nulls_first = false;
+
+ PrepareSortSupportFromOrderingOp(mystats->ltopr, &ssup);
+
+ for (j = 0; j < bucket->numrows; j++)
+ {
+ values[nvalues] = heap_getattr(bucket->rows[j], attrs->values[dimension],
+ stats[dimension]->tupDesc, &isNull);
+
+ /* ignore NULL values */
+ if (! isNull)
+ nvalues++;
+ }
+
+ /* there's always at least 1 distinct value (may be NULL) */
+ bucket->ndistincts[dimension] = 1;
+
+ /*
+ * If there are only NULL values in the column, mark the dimension
+ * as NULL-only and bail out.
+ */
+ if (nvalues == 0)
+ {
+ pfree(values);
+ bucket->nullsonly[dimension] = true;
+ return;
+ }
+
+ /* sort the array of (pass-by-value) datums */
+ qsort_arg((void *) values, nvalues, sizeof(Datum),
+ compare_scalars_simple, (void *) &ssup);
+
+ /*
+ * Update min/max boundaries to the smallest bounding box. Generally, this
+ * needs to be done only when constructing the initial bucket.
+ */
+ if (update_boundaries)
+ {
+ /* store the min/max values */
+ bucket->min[dimension] = values[0];
+ bucket->min_inclusive[dimension] = true;
+
+ bucket->max[dimension] = values[nvalues-1];
+ bucket->max_inclusive[dimension] = true;
+ }
+
+ /*
+ * Walk through the array and count distinct values by comparing
+ * succeeding values.
+ *
+ * FIXME This only works for pass-by-value types (i.e. not VARCHARs
+ * etc.). Although thanks to the deduplication it might work
+ * even for those types (equal values will get the same item
+ * in the deduplicated array).
+ */
+ for (j = 1; j < nvalues; j++)
+ if (values[j] != values[j-1])
+ bucket->ndistincts[dimension] += 1;
+
+ pfree(values);
+}
+
+/*
+ * A properly built histogram must not contain buckets mixing NULL and
+ * non-NULL values in a single dimension. Each dimension may either be
+ * marked as 'nulls only', and thus containing only NULL values, or
+ * it must not contain any NULL values.
+ *
+ * Therefore, if the sample contains NULL values in any of the columns,
+ * it's necessary to build those NULL-buckets. This is done in an
+ * iterative way using this algorithm, operating on a single bucket:
+ *
+ * (1) Check that all dimensions are well-formed (not mixing NULL
+ * and non-NULL values).
+ *
+ * (2) If all dimensions are well-formed, terminate.
+ *
+ * (3) If the dimension contains only NULL values, but is not
+ * marked as NULL-only, mark it as NULL-only and run the
+ * algorithm again (on this bucket).
+ *
+ * (4) If the dimension mixes NULL and non-NULL values, split the
+ * bucket into two parts - one with NULL values, one with
+ * non-NULL values (replacing the current one). Then run
+ * the algorithm on both buckets.
+ *
+ * This is executed in a recursive manner, but the number of executions
+ * should be quite low - limited by the number of NULL-buckets. Also,
+ * in each branch the number of nested calls is limited by the number
+ * of dimensions (attributes) of the histogram.
+ *
+ * At the end, there should be buckets with no mixed dimensions. The
+ * number of buckets produced by this algorithm is rather limited - with
+ * N dimensions, there may be only 2^N such buckets (each dimension may
+ * be either NULL or non-NULL). So with 8 dimensions (current value of
+ * MVSTATS_MAX_DIMENSIONS) there may be only 256 such buckets.
+ *
+ * After this, a 'regular' bucket-split algorithm shall run, further
+ * optimizing the histogram.
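+ *
+ * For example (illustration only), with two dimensions where both
+ * columns contain NULL values, repeated application of steps (3)
+ * and (4) produces up to four buckets: (NULL, NULL), (NULL, non-NULL),
+ * (non-NULL, NULL) and (non-NULL, non-NULL).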
+ */
+static void
+create_null_buckets(MVHistogram histogram, int bucket_idx,
+ int2vector *attrs, VacAttrStats ** stats)
+{
+ int i, j;
+ int null_dim = -1;
+ int null_count = 0;
+ bool null_found = false;
+ MVBucket bucket, null_bucket;
+ int null_idx, curr_idx;
+
+ /* remember original values from the bucket */
+ int numrows;
+ HeapTuple *oldrows = NULL;
+
+ Assert(bucket_idx < histogram->nbuckets);
+ Assert(histogram->ndimensions == attrs->dim1);
+
+ bucket = histogram->buckets[bucket_idx];
+
+ /*
+ * Walk through all rows / dimensions, and stop once we find NULL
+ * in a dimension not yet marked as NULL-only.
+ */
+ for (i = 0; i < bucket->numrows; i++)
+ {
+ for (j = 0; j < histogram->ndimensions; j++)
+ {
+ /* Is this a NULL-only dimension? If yes, skip. */
+ if (bucket->nullsonly[j])
+ continue;
+
+ /* found a NULL in that dimension? */
+ if (heap_attisnull(bucket->rows[i], attrs->values[j]))
+ {
+ null_found = true;
+ null_dim = j;
+ break;
+ }
+ }
+
+ /* terminate if we found attribute with NULL values */
+ if (null_found)
+ break;
+ }
+
+ /* no regular dimension contains NULL values => we're done */
+ if (! null_found)
+ return;
+
+ /* walk through the rows again, count NULL values in 'null_dim' */
+ for (i = 0; i < bucket->numrows; i++)
+ {
+ if (heap_attisnull(bucket->rows[i], attrs->values[null_dim]))
+ null_count += 1;
+ }
+
+ Assert(null_count <= bucket->numrows);
+
+ /*
+ * If (null_count == numrows) the dimension already is NULL-only,
+ * but is not yet marked as such. It's enough to mark it and
+ * repeat the process recursively (until we run out of dimensions).
+ */
+ if (null_count == bucket->numrows)
+ {
+ bucket->nullsonly[null_dim] = true;
+ create_null_buckets(histogram, bucket_idx, attrs, stats);
+ return;
+ }
+
+ /*
+ * We have to split the bucket into two - one with NULL values in
+ * the dimension, one with non-NULL values. We don't need to sort
+ * the data or anything, but otherwise it's similar to what's done
+ * in partition_bucket().
+ */
+
+ /* create bucket with NULL-only dimension 'null_dim' */
+ null_bucket = copy_mv_bucket(bucket, histogram->ndimensions);
+
+ /* remember the current array info */
+ oldrows = bucket->rows;
+ numrows = bucket->numrows;
+
+ /* we'll keep non-NULL values in the current bucket */
+ bucket->numrows = (numrows - null_count);
+ bucket->rows
+ = (HeapTuple*)palloc0(bucket->numrows * sizeof(HeapTuple));
+
+ /* and the NULL values will go to the new one */
+ null_bucket->numrows = null_count;
+ null_bucket->rows
+ = (HeapTuple*)palloc0(null_bucket->numrows * sizeof(HeapTuple));
+
+ /* mark the dimension as NULL-only (in the new bucket) */
+ null_bucket->nullsonly[null_dim] = true;
+
+ /* walk through the sample rows and distribute them accordingly */
+ null_idx = 0;
+ curr_idx = 0;
+ for (i = 0; i < numrows; i++)
+ {
+ if (heap_attisnull(oldrows[i], attrs->values[null_dim]))
+ /* NULL => copy to the new bucket */
+ memcpy(&null_bucket->rows[null_idx++], &oldrows[i],
+ sizeof(HeapTuple));
+ else
+ memcpy(&bucket->rows[curr_idx++], &oldrows[i],
+ sizeof(HeapTuple));
+ }
+
+ /* update ndistinct values for the buckets (total and per dimension) */
+ update_bucket_ndistinct(bucket, attrs, stats);
+ update_bucket_ndistinct(null_bucket, attrs, stats);
+
+ /*
+ * TODO We don't need to do this for the dimension we used for split,
+ * because we know how many distinct values went to each
+ * bucket (NULL is not a value, so 0, and the other bucket got
+ * all the ndistinct values).
+ */
+ for (i = 0; i < histogram->ndimensions; i++)
+ {
+ update_dimension_ndistinct(bucket, i, attrs, stats, false);
+ update_dimension_ndistinct(null_bucket, i, attrs, stats, false);
+ }
+
+ pfree(oldrows);
+
+ /* add the NULL bucket to the histogram */
+ histogram->buckets[histogram->nbuckets++] = null_bucket;
+
+ /*
+ * And now run the function recursively on both buckets (the new
+ * one first, because the call may change number of buckets, and
+ * it's used as an index).
+ */
+ create_null_buckets(histogram, (histogram->nbuckets-1), attrs, stats);
+ create_null_buckets(histogram, bucket_idx, attrs, stats);
+}
+
+/*
+ * We need to pass the SortSupport to the comparator, but bsearch()
+ * has no 'context' parameter, so we use a global variable (ugly).
+ */
+static int
+bsearch_comparator(const void * a, const void * b)
+{
+ Assert(ssup_private != NULL);
+ return compare_scalars_simple(a, b, (void*)ssup_private);
+}
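+
+/*
+ * Usage sketch (illustration only, not taken from the actual callers):
+ * set the global before calling bsearch() on one of the sorted arrays
+ * of pass-by-value datums.
+ *
+ *     ssup_private = &ssup;
+ *     match = (Datum *) bsearch(&value, values, nvalues,
+ *                               sizeof(Datum), bsearch_comparator);
+ */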
diff --git a/src/backend/utils/mvstats/mcv.c b/src/backend/utils/mvstats/mcv.c
index 2b3d171..b0cea61 100644
--- a/src/backend/utils/mvstats/mcv.c
+++ b/src/backend/utils/mvstats/mcv.c
@@ -961,6 +961,7 @@ MCVList deserialize_mv_mcvlist(bytea * data)
for (i = 0; i < nitems; i++)
{
+ /* FIXME allocate as a single chunk (minimize palloc overhead) */
MCVItem item = (MCVItem)palloc0(sizeof(MCVItemData));
item->values = (Datum*)palloc0(sizeof(Datum)*ndims);
diff --git a/src/include/catalog/pg_mv_statistic.h b/src/include/catalog/pg_mv_statistic.h
index f88e200..08424bd 100644
--- a/src/include/catalog/pg_mv_statistic.h
+++ b/src/include/catalog/pg_mv_statistic.h
@@ -36,13 +36,16 @@ CATALOG(pg_mv_statistic,3281)
/* statistics requested to build */
bool deps_enabled; /* analyze dependencies? */
bool mcv_enabled; /* build MCV list? */
+ bool hist_enabled; /* build histogram? */
- /* MCV size */
+ /* histogram / MCV size */
int32 mcv_max_items; /* max MCV items */
+ int32 hist_max_buckets; /* max histogram buckets */
/* statistics that are available (if requested) */
bool deps_built; /* dependencies were built */
bool mcv_built; /* MCV list was built */
+ bool hist_built; /* histogram was built */
/* variable-length fields start here, but we allow direct access to stakeys */
int2vector stakeys; /* array of column keys */
@@ -50,6 +53,7 @@ CATALOG(pg_mv_statistic,3281)
#ifdef CATALOG_VARLEN
bytea stadeps; /* dependencies (serialized) */
bytea stamcv; /* MCV list (serialized) */
+ bytea stahist; /* MV histogram (serialized) */
#endif
} FormData_pg_mv_statistic;
@@ -65,15 +69,19 @@ typedef FormData_pg_mv_statistic *Form_pg_mv_statistic;
* compiler constants for pg_attrdef
* ----------------
*/
-#define Natts_pg_mv_statistic 9
+#define Natts_pg_mv_statistic 13
#define Anum_pg_mv_statistic_starelid 1
#define Anum_pg_mv_statistic_deps_enabled 2
#define Anum_pg_mv_statistic_mcv_enabled 3
-#define Anum_pg_mv_statistic_mcv_max_items 4
-#define Anum_pg_mv_statistic_deps_built 5
-#define Anum_pg_mv_statistic_mcv_built 6
-#define Anum_pg_mv_statistic_stakeys 7
-#define Anum_pg_mv_statistic_stadeps 8
-#define Anum_pg_mv_statistic_stamcv 9
+#define Anum_pg_mv_statistic_hist_enabled 4
+#define Anum_pg_mv_statistic_mcv_max_items 5
+#define Anum_pg_mv_statistic_hist_max_buckets 6
+#define Anum_pg_mv_statistic_deps_built 7
+#define Anum_pg_mv_statistic_mcv_built 8
+#define Anum_pg_mv_statistic_hist_built 9
+#define Anum_pg_mv_statistic_stakeys 10
+#define Anum_pg_mv_statistic_stadeps 11
+#define Anum_pg_mv_statistic_stamcv 12
+#define Anum_pg_mv_statistic_stahist 13
#endif /* PG_MV_STATISTIC_H */
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
index b4e7b4f..448e76a 100644
--- a/src/include/catalog/pg_proc.h
+++ b/src/include/catalog/pg_proc.h
@@ -2689,6 +2689,8 @@ DATA(insert OID = 3285 ( pg_mv_stats_dependencies_show PGNSP PGUID 12 1 0 0
DESCR("multivariate stats: functional dependencies show");
DATA(insert OID = 3283 ( pg_mv_stats_mcvlist_info PGNSP PGUID 12 1 0 0 0 f f f f t f i 1 0 25 "17" _null_ _null_ _null_ _null_ pg_mv_stats_mcvlist_info _null_ _null_ _null_ ));
DESCR("multi-variate statistics: MCV list info");
+DATA(insert OID = 3282 ( pg_mv_stats_histogram_info PGNSP PGUID 12 1 0 0 0 f f f f t f i 1 0 25 "17" _null_ _null_ _null_ _null_ pg_mv_stats_histogram_info _null_ _null_ _null_ ));
+DESCR("multi-variate statistics: histogram info");
DATA(insert OID = 1928 ( pg_stat_get_numscans PGNSP PGUID 12 1 0 0 0 f f f f t f s 1 0 20 "26" _null_ _null_ _null_ _null_ pg_stat_get_numscans _null_ _null_ _null_ ));
DESCR("statistics: number of scans done for table/index");
diff --git a/src/include/utils/mvstats.h b/src/include/utils/mvstats.h
index e11aefc..028a634 100644
--- a/src/include/utils/mvstats.h
+++ b/src/include/utils/mvstats.h
@@ -26,10 +26,12 @@ typedef struct MVStatsData {
/* statistics requested in ALTER TABLE ... ADD STATISTICS */
bool deps_enabled; /* analyze functional dependencies */
bool mcv_enabled; /* analyze MCV lists */
+ bool hist_enabled; /* analyze histogram */
/* available statistics (computed by ANALYZE) */
bool deps_built; /* functional dependencies available */
bool mcv_built; /* MCV list is already available */
+ bool hist_built; /* histogram is already available */
} MVStatsData;
typedef struct MVStatsData *MVStats;
@@ -109,6 +111,91 @@ typedef MCVListData *MCVList;
#define MVSTAT_MCVLIST_MAX_ITEMS 8192 /* max items in MCV list */
/*
+ * Multivariate histograms
+ */
+typedef struct MVBucketData {
+
+ /* Frequencies of this bucket. */
+ float ntuples; /* frequency of tuples */
+ float ndistinct; /* frequency of distinct values */
+
+ /*
+ * Number of distinct values in each dimension. This is used when
+ * building the histogram (and is not serialized/deserialized), but
+ * it could be useful for estimating ndistinct for combinations of
+ * columns.
+ *
+ * It would mean tracking 2^N values for each bucket, and even if
+ * those values might be stored in 1B each, it's still a lot of space
+ * (considering the expected number of buckets).
+ *
+ * TODO Consider tracking ndistincts for all attribute combinations.
+ */
+ uint32 *ndistincts;
+
+ /*
+ * Information about dimensions being NULL-only. Not yet used.
+ */
+ bool *nullsonly;
+
+ /* lower boundaries - values and information about the inequalities */
+ Datum *min;
+ bool *min_inclusive;
+
+ /* upper boundaries - values and information about the inequalities */
+ Datum *max;
+ bool *max_inclusive;
+
+ /*
+ * Sample tuples falling into this bucket, index of the dimension
+ * the bucket was split by in the last step.
+ *
+ * XXX These fields are needed only while building the histogram,
+ * and are not serialized at all.
+ */
+ HeapTuple *rows;
+ uint32 numrows;
+ int last_split_dimension;
+
+} MVBucketData;
+
+typedef MVBucketData *MVBucket;
+
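+/*
+ * Example (illustration only): in a two-dimensional histogram, a
+ * bucket covering the range [10, 20) in the first dimension and the
+ * single value 5 in the second would have min = {10, 5}, max = {20, 5},
+ * min_inclusive = {true, true} and max_inclusive = {false, true}.
+ */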
+
+typedef struct MVHistogramData {
+
+ uint32 magic; /* magic constant marker */
+ uint32 type; /* type of histogram (BASIC) */
+ uint32 nbuckets; /* number of buckets (buckets array) */
+ uint32 ndimensions; /* number of dimensions */
+
+ MVBucket *buckets; /* array of buckets */
+
+} MVHistogramData;
+
+typedef MVHistogramData *MVHistogram;
+
+/* used to flag stats serialized to bytea */
+#define MVSTAT_HIST_MAGIC 0x7F8C5670 /* marks serialized bytea */
+#define MVSTAT_HIST_TYPE_BASIC 1 /* basic histogram type */
+
+/*
+ * Limits used for max_buckets option, i.e. we're always guaranteed
+ * to have space for at least MVSTAT_HIST_MIN_BUCKETS, and we cannot
+ * have more than MVSTAT_HIST_MAX_BUCKETS buckets.
+ *
+ * This is just a boundary for the 'max' threshold - the actual
+ * histogram may use fewer buckets than MVSTAT_HIST_MAX_BUCKETS.
+ *
+ * TODO The MVSTAT_HIST_MIN_BUCKETS should be related to the number of
+ * attributes (MVSTATS_MAX_DIMENSIONS) because of NULL-buckets.
+ * There should be at least 2^N buckets, otherwise we may be unable
+ * to build the NULL buckets (e.g. with 8 dimensions up to 2^8 = 256
+ * NULL-buckets may be needed, exceeding the current minimum of 128).
+ */
+#define MVSTAT_HIST_MIN_BUCKETS 128 /* min number of buckets */
+#define MVSTAT_HIST_MAX_BUCKETS 16384 /* max number of buckets */
+
+/*
* TODO Maybe fetching the histogram/MCV list separately is inefficient?
* Consider adding a single `fetch_stats` method, fetching all
* stats specified using flags (or something like that).
@@ -118,14 +205,18 @@ bytea * fetch_mv_rules(Oid mvoid);
bytea * fetch_mv_dependencies(Oid mvoid);
bytea * fetch_mv_mcvlist(Oid mvoid);
+bytea * fetch_mv_histogram(Oid mvoid);
bytea * serialize_mv_dependencies(MVDependencies dependencies);
bytea * serialize_mv_mcvlist(MCVList mcvlist, int2vector *attrs,
VacAttrStats **stats);
+bytea * serialize_mv_histogram(MVHistogram histogram, int2vector *attrs,
+ VacAttrStats **stats);
/* deserialization of stats (serialization is private to analyze) */
MVDependencies deserialize_mv_dependencies(bytea * data);
MCVList deserialize_mv_mcvlist(bytea * data);
+MVHistogram deserialize_mv_histogram(bytea * data);
/*
* Returns index of the attribute number within the vector (i.e. a
@@ -137,6 +228,7 @@ int mv_get_index(AttrNumber varattno, int2vector * stakeys);
extern Datum pg_mv_stats_dependencies_info(PG_FUNCTION_ARGS);
extern Datum pg_mv_stats_dependencies_show(PG_FUNCTION_ARGS);
extern Datum pg_mv_stats_mcvlist_info(PG_FUNCTION_ARGS);
+extern Datum pg_mv_stats_histogram_info(PG_FUNCTION_ARGS);
MVDependencies
build_mv_dependencies(int numrows, HeapTuple *rows, int2vector *attrs,
@@ -146,10 +238,15 @@ MCVList
build_mv_mcvlist(int numrows, HeapTuple *rows, int2vector *attrs,
VacAttrStats **stats, int *numrows_filtered);
+MVHistogram
+build_mv_histogram(int numrows, HeapTuple *rows, int2vector *attrs,
+ VacAttrStats **stats, int numrows_total);
+
void build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
int natts, VacAttrStats **vacattrstats);
-void update_mv_stats(Oid relid, MVDependencies dependencies, MCVList mcvlist,
+void update_mv_stats(Oid relid, MVDependencies dependencies,
+ MCVList mcvlist, MVHistogram histogram,
int2vector *attrs, VacAttrStats **stats);
#endif
diff --git a/src/test/regress/expected/mv_histogram.out b/src/test/regress/expected/mv_histogram.out
new file mode 100644
index 0000000..a0cf37f
--- /dev/null
+++ b/src/test/regress/expected/mv_histogram.out
@@ -0,0 +1,210 @@
+-- data type passed by value
+CREATE TABLE mv_histogram (
+ a INT,
+ b INT,
+ c INT
+);
+-- unknown column
+ALTER TABLE mv_histogram ADD STATISTICS (histogram) ON (unknown_column);
+ERROR: column "unknown_column" referenced in statistics does not exist
+-- single column
+ALTER TABLE mv_histogram ADD STATISTICS (histogram) ON (a);
+ERROR: multivariate stats require 2 or more columns
+-- single column, duplicated
+ALTER TABLE mv_histogram ADD STATISTICS (histogram) ON (a, a);
+ERROR: duplicate column name in statistics definition
+-- two columns, one duplicated
+ALTER TABLE mv_histogram ADD STATISTICS (histogram) ON (a, a, b);
+ERROR: duplicate column name in statistics definition
+-- unknown option
+ALTER TABLE mv_histogram ADD STATISTICS (unknown_option) ON (a, b, c);
+ERROR: unrecognized STATISTICS option "unknown_option"
+-- missing histogram statistics
+ALTER TABLE mv_histogram ADD STATISTICS (dependencies, max_buckets 200) ON (a, b, c);
+ERROR: option 'histogram' is required by other option(s)
+-- invalid max_buckets value / too low
+ALTER TABLE mv_histogram ADD STATISTICS (mcv, max_buckets 10) ON (a, b, c);
+ERROR: minimum number of buckets is 128
+-- invalid max_buckets value / too high
+ALTER TABLE mv_histogram ADD STATISTICS (mcv, max_buckets 100000) ON (a, b, c);
+ERROR: maximum number of buckets is 16384
+-- correct command
+ALTER TABLE mv_histogram ADD STATISTICS (histogram) ON (a, b, c);
+-- random data (no functional dependencies)
+INSERT INTO mv_histogram
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+ANALYZE mv_histogram;
+SELECT hist_enabled, hist_built, pg_mv_stats_histogram_info(stahist)
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+ hist_enabled | hist_built | pg_mv_stats_histogram_info
+--------------+------------+----------------------------
+ t | t | nbuckets=10000
+(1 row)
+
+TRUNCATE mv_histogram;
+-- a => b, a => c, b => c
+INSERT INTO mv_histogram
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE mv_histogram;
+SELECT hist_enabled, hist_built, pg_mv_stats_histogram_info(stahist)
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+ hist_enabled | hist_built | pg_mv_stats_histogram_info
+--------------+------------+----------------------------
+ t | t | nbuckets=1001
+(1 row)
+
+TRUNCATE mv_histogram;
+-- a => b, a => c
+INSERT INTO mv_histogram
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE mv_histogram;
+SELECT hist_enabled, hist_built, pg_mv_stats_histogram_info(stahist)
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+ hist_enabled | hist_built | pg_mv_stats_histogram_info
+--------------+------------+----------------------------
+ t | t | nbuckets=1001
+(1 row)
+
+TRUNCATE mv_histogram;
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO mv_histogram
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX hist_idx ON mv_histogram (a, b);
+ANALYZE mv_histogram;
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+ hist_enabled | hist_built
+--------------+------------
+ t | t
+(1 row)
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mv_histogram WHERE a = 10 AND b = 5;
+ QUERY PLAN
+--------------------------------------------
+ Bitmap Heap Scan on mv_histogram
+ Recheck Cond: ((a = 10) AND (b = 5))
+ -> Bitmap Index Scan on hist_idx
+ Index Cond: ((a = 10) AND (b = 5))
+(4 rows)
+
+DELETE FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+DROP TABLE mv_histogram;
+-- varlena type (text)
+CREATE TABLE mv_histogram (
+ a TEXT,
+ b TEXT,
+ c TEXT
+);
+ALTER TABLE mv_histogram ADD STATISTICS (histogram) ON (a, b, c);
+-- random data (no functional dependencies)
+INSERT INTO mv_histogram
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+ANALYZE mv_histogram;
+SELECT hist_enabled, hist_built, pg_mv_stats_histogram_info(stahist)
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+ hist_enabled | hist_built | pg_mv_stats_histogram_info
+--------------+------------+----------------------------
+ t | t | nbuckets=10000
+(1 row)
+
+TRUNCATE mv_histogram;
+-- a => b, a => c, b => c
+INSERT INTO mv_histogram
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE mv_histogram;
+SELECT hist_enabled, hist_built, pg_mv_stats_histogram_info(stahist)
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+ hist_enabled | hist_built | pg_mv_stats_histogram_info
+--------------+------------+----------------------------
+ t | t | nbuckets=3492
+(1 row)
+
+TRUNCATE mv_histogram;
+-- a => b, a => c
+INSERT INTO mv_histogram
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE mv_histogram;
+SELECT hist_enabled, hist_built, pg_mv_stats_histogram_info(stahist)
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+ hist_enabled | hist_built | pg_mv_stats_histogram_info
+--------------+------------+----------------------------
+ t | t | nbuckets=3433
+(1 row)
+
+TRUNCATE mv_histogram;
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO mv_histogram
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX hist_idx ON mv_histogram (a, b);
+ANALYZE mv_histogram;
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+ hist_enabled | hist_built
+--------------+------------
+ t | t
+(1 row)
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mv_histogram WHERE a = '10' AND b = '5';
+ QUERY PLAN
+------------------------------------------------------------
+ Bitmap Heap Scan on mv_histogram
+ Recheck Cond: ((a = '10'::text) AND (b = '5'::text))
+ -> Bitmap Index Scan on hist_idx
+ Index Cond: ((a = '10'::text) AND (b = '5'::text))
+(4 rows)
+
+TRUNCATE mv_histogram;
+-- check explain (expect bitmap index scan, not plain index scan) with NULLs
+INSERT INTO mv_histogram
+ SELECT
+ (CASE WHEN i/10000 = 0 THEN NULL ELSE i/10000 END),
+ (CASE WHEN i/20000 = 0 THEN NULL ELSE i/20000 END),
+ (CASE WHEN i/40000 = 0 THEN NULL ELSE i/40000 END)
+ FROM generate_series(1,1000000) s(i);
+ANALYZE mv_histogram;
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+ hist_enabled | hist_built
+--------------+------------
+ t | t
+(1 row)
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mv_histogram WHERE a IS NULL AND b IS NULL;
+ QUERY PLAN
+---------------------------------------------------
+ Bitmap Heap Scan on mv_histogram
+ Recheck Cond: ((a IS NULL) AND (b IS NULL))
+ -> Bitmap Index Scan on hist_idx
+ Index Cond: ((a IS NULL) AND (b IS NULL))
+(4 rows)
+
+DELETE FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+DROP TABLE mv_histogram;
+-- NULL values (mix of int and text columns)
+CREATE TABLE mv_histogram (
+ a INT,
+ b TEXT,
+ c INT,
+ d TEXT
+);
+ALTER TABLE mv_histogram ADD STATISTICS (histogram) ON (a, b, c, d);
+INSERT INTO mv_histogram
+ SELECT
+ mod(i, 100),
+ (CASE WHEN mod(i, 200) = 0 THEN NULL ELSE mod(i,200) END),
+ mod(i, 400),
+ (CASE WHEN mod(i, 300) = 0 THEN NULL ELSE mod(i,600) END)
+ FROM generate_series(1,10000) s(i);
+ANALYZE mv_histogram;
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+ hist_enabled | hist_built
+--------------+------------
+ t | t
+(1 row)
+
+DELETE FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+DROP TABLE mv_histogram;
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 80375b8..07896b4 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1359,7 +1359,9 @@ pg_mv_stats| SELECT n.nspname AS schemaname,
length(s.stadeps) AS depsbytes,
pg_mv_stats_dependencies_info(s.stadeps) AS depsinfo,
length(s.stamcv) AS mcvbytes,
- pg_mv_stats_mcvlist_info(s.stamcv) AS mcvinfo
+ pg_mv_stats_mcvlist_info(s.stamcv) AS mcvinfo,
+ length(s.stahist) AS histbytes,
+ pg_mv_stats_histogram_info(s.stahist) AS histinfo
FROM ((pg_mv_statistic s
JOIN pg_class c ON ((c.oid = s.starelid)))
LEFT JOIN pg_namespace n ON ((n.oid = c.relnamespace)));
diff --git a/src/test/regress/parallel_schedule b/src/test/regress/parallel_schedule
index 78c9b04..d9864b7 100644
--- a/src/test/regress/parallel_schedule
+++ b/src/test/regress/parallel_schedule
@@ -111,4 +111,4 @@ test: event_trigger
test: stats
# run tests of multivariate stats
-test: mv_dependencies mv_mcv
+test: mv_dependencies mv_mcv mv_histogram
diff --git a/src/test/regress/serial_schedule b/src/test/regress/serial_schedule
index 3f9884f..d901a78 100644
--- a/src/test/regress/serial_schedule
+++ b/src/test/regress/serial_schedule
@@ -154,3 +154,4 @@ test: event_trigger
test: stats
test: mv_dependencies
test: mv_mcv
+test: mv_histogram
diff --git a/src/test/regress/sql/mv_histogram.sql b/src/test/regress/sql/mv_histogram.sql
new file mode 100644
index 0000000..a693e35
--- /dev/null
+++ b/src/test/regress/sql/mv_histogram.sql
@@ -0,0 +1,179 @@
+-- data type passed by value
+CREATE TABLE mv_histogram (
+ a INT,
+ b INT,
+ c INT
+);
+
+-- unknown column
+ALTER TABLE mv_histogram ADD STATISTICS (histogram) ON (unknown_column);
+
+-- single column
+ALTER TABLE mv_histogram ADD STATISTICS (histogram) ON (a);
+
+-- single column, duplicated
+ALTER TABLE mv_histogram ADD STATISTICS (histogram) ON (a, a);
+
+-- two columns, one duplicated
+ALTER TABLE mv_histogram ADD STATISTICS (histogram) ON (a, a, b);
+
+-- unknown option
+ALTER TABLE mv_histogram ADD STATISTICS (unknown_option) ON (a, b, c);
+
+-- missing histogram statistics
+ALTER TABLE mv_histogram ADD STATISTICS (dependencies, max_buckets 200) ON (a, b, c);
+
+-- invalid max_buckets value / too low
+ALTER TABLE mv_histogram ADD STATISTICS (mcv, max_buckets 10) ON (a, b, c);
+
+-- invalid max_buckets value / too high
+ALTER TABLE mv_histogram ADD STATISTICS (mcv, max_buckets 100000) ON (a, b, c);
+
+-- correct command
+ALTER TABLE mv_histogram ADD STATISTICS (histogram) ON (a, b, c);
+
+-- random data (no functional dependencies)
+INSERT INTO mv_histogram
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+
+ANALYZE mv_histogram;
+
+SELECT hist_enabled, hist_built, pg_mv_stats_histogram_info(stahist)
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+
+TRUNCATE mv_histogram;
+
+-- a => b, a => c, b => c
+INSERT INTO mv_histogram
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+
+ANALYZE mv_histogram;
+
+SELECT hist_enabled, hist_built, pg_mv_stats_histogram_info(stahist)
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+
+TRUNCATE mv_histogram;
+
+-- a => b, a => c
+INSERT INTO mv_histogram
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE mv_histogram;
+
+SELECT hist_enabled, hist_built, pg_mv_stats_histogram_info(stahist)
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+
+TRUNCATE mv_histogram;
+
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO mv_histogram
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX hist_idx ON mv_histogram (a, b);
+ANALYZE mv_histogram;
+
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mv_histogram WHERE a = 10 AND b = 5;
+
+DELETE FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+DROP TABLE mv_histogram;
+
+-- varlena type (text)
+CREATE TABLE mv_histogram (
+ a TEXT,
+ b TEXT,
+ c TEXT
+);
+
+ALTER TABLE mv_histogram ADD STATISTICS (histogram) ON (a, b, c);
+
+-- random data (no functional dependencies)
+INSERT INTO mv_histogram
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+
+ANALYZE mv_histogram;
+
+SELECT hist_enabled, hist_built, pg_mv_stats_histogram_info(stahist)
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+
+TRUNCATE mv_histogram;
+
+-- a => b, a => c, b => c
+INSERT INTO mv_histogram
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+
+ANALYZE mv_histogram;
+
+SELECT hist_enabled, hist_built, pg_mv_stats_histogram_info(stahist)
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+
+TRUNCATE mv_histogram;
+
+-- a => b, a => c
+INSERT INTO mv_histogram
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE mv_histogram;
+
+SELECT hist_enabled, hist_built, pg_mv_stats_histogram_info(stahist)
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+
+TRUNCATE mv_histogram;
+
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO mv_histogram
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX hist_idx ON mv_histogram (a, b);
+ANALYZE mv_histogram;
+
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mv_histogram WHERE a = '10' AND b = '5';
+
+TRUNCATE mv_histogram;
+
+-- check explain (expect bitmap index scan, not plain index scan) with NULLs
+INSERT INTO mv_histogram
+ SELECT
+ (CASE WHEN i/10000 = 0 THEN NULL ELSE i/10000 END),
+ (CASE WHEN i/20000 = 0 THEN NULL ELSE i/20000 END),
+ (CASE WHEN i/40000 = 0 THEN NULL ELSE i/40000 END)
+ FROM generate_series(1,1000000) s(i);
+ANALYZE mv_histogram;
+
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mv_histogram WHERE a IS NULL AND b IS NULL;
+
+DELETE FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+DROP TABLE mv_histogram;
+
+-- NULL values (mix of int and text columns)
+CREATE TABLE mv_histogram (
+ a INT,
+ b TEXT,
+ c INT,
+ d TEXT
+);
+
+ALTER TABLE mv_histogram ADD STATISTICS (histogram) ON (a, b, c, d);
+
+INSERT INTO mv_histogram
+ SELECT
+ mod(i, 100),
+ (CASE WHEN mod(i, 200) = 0 THEN NULL ELSE mod(i,200) END),
+ mod(i, 400),
+ (CASE WHEN mod(i, 300) = 0 THEN NULL ELSE mod(i,600) END)
+ FROM generate_series(1,10000) s(i);
+
+ANALYZE mv_histogram;
+
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+
+DELETE FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+DROP TABLE mv_histogram;
--
2.0.5
Hello,
Patch 0001 needs changes for OIDs since my patch was
committed. The attached is compatible with current master.
I also gave it a try as follows, and got the following error during
analyze. Unfortunately I don't have enough time to
investigate it now.
postgres=# create table t1 (a int, b int, c int);
postgres=# insert into t1 (select a / 10000, a / 10000, a / 10000 from generate_series(0, 99999) a);
postgres=# analyze t1;
ERROR: invalid memory alloc request size 1485176862
regards,
At Sat, 24 Jan 2015 21:21:39 +0100, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote in <54C3FED3.1060600@2ndquadrant.com>
Hi,
attached is an updated version of the multivariate stats patch. This is
going to be a bit longer mail, so I'll put here a small ToC ;-)

1) patch split into 4 parts
2) where to start / documentation
3) state of the code
4) main changes/improvements
5) remaining limitations

The motivation and design ideas, explained in the first message of this
thread, are still valid. It might be a good idea to read it first:

/messages/by-id/543AFA15.4080608@fuzzy.cz

BTW if you happen to go to FOSDEM [PGDay], I'll gladly give you an intro
to the patch in person, or discuss the patch in general.

1) Patch split into 4 parts
---------------------------
Firstly, the patch got broken into the following four pieces, to make
the reviews somewhat easier:

1) 0001-shared-infrastructure-and-functional-dependencies.patch

- infrastructure, shared by all the kinds of stats added
in the following patches (catalog, ALTER TABLE, ANALYZE ...)

- implementation of a simple statistics, tracking functional
dependencies between columns (previously called "associative
rules", but that's incorrect for several reasons)

- this does not modify the optimizer in any way

2) 0002-clause-reduction-using-functional-dependencies.patch

- applies the functional dependencies to the optimizer (i.e.
considers the rules in clauselist_selectivity())

3) 0003-multivariate-MCV-lists.patch
- multivariate MCV lists (both ANALYZE and optimizer parts)
4) 0004-multivariate-histograms.patch
- multivariate histograms (both ANALYZE and optimizer parts)

You may look at the patches at github here:
https://github.com/tvondra/postgres/tree/multivariate-stats-squashed
The branch is not stable, i.e. I'll rebase / squash / force-push changes
in the future. (There's also multivariate-stats development branch with
unsquashed changes, but you don't want to look at that, trust me.)

The patches are not exactly small (being in the 50-100 kB range), but
that's mostly because of the amount of comments explaining the goals and
implementation details.

2) Where to start / documentation
---------------------------------
I strived to document all the pieces properly, mostly in the form of
comments. There's no sgml documentation at this point, which should
obviously change in the future.

Anyway, I'd suggest reading the first e-mail in this thread, explaining
the ideas, and then these comments:

1) functional dependencies (patch 0001)
- src/backend/utils/mvstats/dependencies.c

2) MCV lists (patch 0003)
- src/backend/utils/mvstats/mcv.c
- also see clauselist_mv_selectivity_mcvlist() in clausesel.c

3) histograms (patch 0004)
- src/backend/utils/mvstats/histogram.c
- also see clauselist_mv_selectivity_histogram() in clausesel.c

4) selectivity estimation (patches 0002-0004)
- all in src/backend/optimizer/path/clausesel.c
- clauselist_selectivity() - overview of how the stats are applied
- clauselist_apply_dependencies() - functional dependencies reduction
- clauselist_mv_selectivity_mcvlist() - MCV list estimation
- clauselist_mv_selectivity_histogram() - histogram estimation

3) State of the code
--------------------
I've spent a fair amount of time testing the patches, and while I
believe there are no segfaults or such, I know parts of the code need
a bit more love.

The part most in need of improvements / comments is probably the code
in clausesel.c - it seems a bit quirky. Reviews / comments regarding
this part of the code are very welcome - I'm sure there are many ways
to improve it.

There are a few FIXMEs elsewhere (e.g. about memory allocation in the
(de)serialization code), but those are mostly well-defined issues that
I know how to address (at least I believe so).

4) Main changes/improvements
----------------------------
There are many significant improvements. The previous patch version
was in the 'proof of concept' category (missing pieces, knowingly
broken in some areas), while the current patch should 'mostly work'.

The patch fixes the most annoying limitations of the first version:

(a) support for all data types (not just those passed by value)
(b) handles NULL values properly
(c) adds support for IS [NOT] NULL clauses

Aside from that, the code was significantly improved, there are proper
regression tests and plenty of comments explaining the details.

5) Remaining limitations
------------------------

(a) limited to stats on 8 columns

This is mostly just a 'safeguard' restriction.

(b) only data types with '<' operator

I don't think this will change anytime soon, because all the
algorithms for building the stats rely on this. I don't see
this as a serious limitation though.

(c) not handling DROP COLUMN or DROP TABLE and so on

Currently this is not handled at all (so the regression tests
do an explicit DELETE from the pg_mv_statistic catalog).

Handling DROP TABLE won't be difficult - it's similar to the
current stats. Handling ALTER TABLE ... DROP COLUMN will be much
more tricky, I guess - should we drop all the stats referencing
that column, or should we just remove it from the stats? Or
should we keep it and treat it as NULL? Not sure what's the best
solution.

(d) limited list of compatible WHERE clauses
The initial patch handled only simple operator clauses
(Var op Constant)
where operator is one of ('<', '<=', '=', '>=', '>'). Now it also
handles IS [NOT] NULL clauses. Adding more clause types should
not be overly difficult - starting with more traditional
'BooleanTest' conditions, or even multi-column conditions
(Var op Var)

which are difficult to estimate using single-column stats.

(e) optimizer uses single stats per table

This is still true, and I don't think this will change soon. I do
have some ideas on how to merge multiple stats etc., but it's
certainly complex stuff, unlikely to happen within this CF. The
patch makes a lot of sense even without this particular feature,
because you can create multiple stats, each suitable for different
queries.

(f) no JOIN conditions

Similarly to the previous point, it's on the TODO but it's not
going to happen in this CF.

kind regards
--
Tomas Vondra http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachments:
0001-shared-infrastructure-and-functional-dependencies.patch (text/x-patch)
>From 9ebfadb5d6cd9b55dd2707cfc8c789884dafa7fa Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tv@fuzzy.cz>
Date: Sun, 11 Jan 2015 19:51:48 +0100
Subject: [PATCH 1/4] shared infrastructure and functional dependencies
Basic infrastructure shared by all kinds of multivariate
stats, most importantly:
- adds a new system catalog (pg_mv_statistic)
- ALTER TABLE ... ADD STATISTICS syntax
- implementation of functional dependencies (the simplest
type of multivariate statistics)
- building functional dependencies in ANALYZE
- updates regression tests (new catalog etc.)
This does not include any changes to the optimizer, i.e.
it does not influence the query planning.
FIX: invalid assert in lookup_var_attr_stats()
The current implementation requires a valid 'ltopr'
so that we can sort the sample rows in various ways,
and the assert did verify this by checking that the
function is 'compute_scalar_stats'. This is however
private function in analyze.c, so the check failed
after moving the code into common.c.
Fixed by checking the 'ltopr' operator directly.
Eventually this will be removed, as ltopr is only
needed for histograms (functional dependencies and
MCV lists may be built without it).
FIX: improved comments about functional dependencies
FIX: add magic (MVSTAT_DEPS_MAGIC) into MVDependencies
FIX: improved analysis of functional dependencies
Changes:
- decreased minimum group size
- count contradicting rows ('not supporting' ones)
The algorithm is still rather simple and probably needs
other improvements.
FIX: add pg_mv_stats_dependencies_show() function
This function actually prints the rules, not just some basic
info (number of rules) as pg_mv_stats_dependencies_info().
FIX: (dependencies != NULL) in pg_mv_stats_dependencies_info()
STRICT is not a solution, because the deserialization may fail
for some reason (corrupted data, ...)
FIX: rename 'associative rules' to 'functional dependencies'
It's a more appropriate name as functional dependencies,
as defined in relational theory (esp. Normal Forms) are
tracking column-level dependencies.
Associative (or more correctly 'association') rules are
tracking dependencies between particular values, and not
necessarily in different columns (shopping bag analysis).
Also, did a bunch of comment improvements, minor fixes.
This does not include changes in clausesel.c!
FIX: remove obsolete Assert() enforcing typbyval types
---
src/backend/catalog/Makefile | 1 +
src/backend/catalog/system_views.sql | 10 +
src/backend/commands/analyze.c | 17 +-
src/backend/commands/tablecmds.c | 149 +++++++-
src/backend/nodes/copyfuncs.c | 15 +-
src/backend/parser/gram.y | 67 +++-
src/backend/utils/Makefile | 2 +-
src/backend/utils/cache/syscache.c | 12 +
src/backend/utils/mvstats/Makefile | 17 +
src/backend/utils/mvstats/common.c | 272 ++++++++++++++
src/backend/utils/mvstats/common.h | 70 ++++
src/backend/utils/mvstats/dependencies.c | 554 +++++++++++++++++++++++++++++
src/include/catalog/indexing.h | 5 +
src/include/catalog/pg_mv_statistic.h | 69 ++++
src/include/catalog/pg_proc.h | 5 +
src/include/catalog/toasting.h | 1 +
src/include/nodes/nodes.h | 1 +
src/include/nodes/parsenodes.h | 11 +-
src/include/utils/mvstats.h | 86 +++++
src/include/utils/syscache.h | 1 +
src/test/regress/expected/rules.out | 8 +
src/test/regress/expected/sanity_check.out | 1 +
22 files changed, 1365 insertions(+), 9 deletions(-)
create mode 100644 src/backend/utils/mvstats/Makefile
create mode 100644 src/backend/utils/mvstats/common.c
create mode 100644 src/backend/utils/mvstats/common.h
create mode 100644 src/backend/utils/mvstats/dependencies.c
create mode 100644 src/include/catalog/pg_mv_statistic.h
create mode 100644 src/include/utils/mvstats.h
diff --git a/src/backend/catalog/Makefile b/src/backend/catalog/Makefile
index a403c64..d6c16f8 100644
--- a/src/backend/catalog/Makefile
+++ b/src/backend/catalog/Makefile
@@ -32,6 +32,7 @@ POSTGRES_BKI_SRCS = $(addprefix $(top_srcdir)/src/include/catalog/,\
pg_attrdef.h pg_constraint.h pg_inherits.h pg_index.h pg_operator.h \
pg_opfamily.h pg_opclass.h pg_am.h pg_amop.h pg_amproc.h \
pg_language.h pg_largeobject_metadata.h pg_largeobject.h pg_aggregate.h \
+ pg_mv_statistic.h \
pg_statistic.h pg_rewrite.h pg_trigger.h pg_event_trigger.h pg_description.h \
pg_cast.h pg_enum.h pg_namespace.h pg_conversion.h pg_depend.h \
pg_database.h pg_db_role_setting.h pg_tablespace.h pg_pltemplate.h \
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 2800f73..d05a716 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -150,6 +150,16 @@ CREATE VIEW pg_indexes AS
LEFT JOIN pg_tablespace T ON (T.oid = I.reltablespace)
WHERE C.relkind IN ('r', 'm') AND I.relkind = 'i';
+CREATE VIEW pg_mv_stats AS
+ SELECT
+ N.nspname AS schemaname,
+ C.relname AS tablename,
+ S.stakeys AS attnums,
+ length(S.stadeps) as depsbytes,
+ pg_mv_stats_dependencies_info(S.stadeps) as depsinfo
+ FROM (pg_mv_statistic S JOIN pg_class C ON (C.oid = S.starelid))
+ LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace);
+
CREATE VIEW pg_stats AS
SELECT
nspname AS schemaname,
diff --git a/src/backend/commands/analyze.c b/src/backend/commands/analyze.c
index 75b45f7..da98d54 100644
--- a/src/backend/commands/analyze.c
+++ b/src/backend/commands/analyze.c
@@ -27,6 +27,7 @@
#include "catalog/indexing.h"
#include "catalog/pg_collation.h"
#include "catalog/pg_inherits_fn.h"
+#include "catalog/pg_mv_statistic.h"
#include "catalog/pg_namespace.h"
#include "commands/dbcommands.h"
#include "commands/tablecmds.h"
@@ -54,7 +55,11 @@
#include "utils/syscache.h"
#include "utils/timestamp.h"
#include "utils/tqual.h"
+#include "utils/fmgroids.h"
+#include "utils/builtins.h"
+#include "utils/mvstats.h"
+#include "access/sysattr.h"
/* Data structure for Algorithm S from Knuth 3.4.2 */
typedef struct
@@ -110,7 +115,6 @@ static void update_attstats(Oid relid, bool inh,
static Datum std_fetch_func(VacAttrStatsP stats, int rownum, bool *isNull);
static Datum ind_fetch_func(VacAttrStatsP stats, int rownum, bool *isNull);
-
/*
* analyze_rel() -- analyze one relation
*/
@@ -472,6 +476,13 @@ do_analyze_rel(Relation onerel, int options, List *va_cols,
* all analyzable columns. We use a lower bound of 100 rows to avoid
* possible overflow in Vitter's algorithm. (Note: that will also be the
* target in the corner case where there are no analyzable columns.)
+ *
+ * FIXME This sample sizing is mostly OK when computing stats for
+ * individual columns, but rather insufficient when computing
+ * multivariate stats (histograms, MCV lists, ...). For a small
+ * number of dimensions it works, but for complex stats it'd be
+ * nice to use a sample proportional to the table size (say,
+ * 0.5% - 1%) instead of a fixed size.
*/
targrows = 100;
for (i = 0; i < attr_cnt; i++)
@@ -574,6 +585,9 @@ do_analyze_rel(Relation onerel, int options, List *va_cols,
update_attstats(RelationGetRelid(Irel[ind]), false,
thisdata->attr_cnt, thisdata->vacattrstats);
}
+
+ /* Build multivariate stats (if there are any). */
+ build_mv_stats(onerel, numrows, rows, attr_cnt, vacattrstats);
}
/*
@@ -2825,3 +2839,4 @@ compare_mcvs(const void *a, const void *b)
return da - db;
}
+
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 623e6bf..0df7f03 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -35,6 +35,7 @@
#include "catalog/pg_foreign_table.h"
#include "catalog/pg_inherits.h"
#include "catalog/pg_inherits_fn.h"
+#include "catalog/pg_mv_statistic.h"
#include "catalog/pg_namespace.h"
#include "catalog/pg_opclass.h"
#include "catalog/pg_tablespace.h"
@@ -92,7 +93,7 @@
#include "utils/syscache.h"
#include "utils/tqual.h"
#include "utils/typcache.h"
-
+#include "utils/mvstats.h"
/*
* ON COMMIT action list
@@ -140,8 +141,9 @@ static List *on_commits = NIL;
#define AT_PASS_ADD_COL 5 /* ADD COLUMN */
#define AT_PASS_ADD_INDEX 6 /* ADD indexes */
#define AT_PASS_ADD_CONSTR 7 /* ADD constraints, defaults */
-#define AT_PASS_MISC 8 /* other stuff */
-#define AT_NUM_PASSES 9
+#define AT_PASS_ADD_STATS 8 /* ADD statistics */
+#define AT_PASS_MISC 9 /* other stuff */
+#define AT_NUM_PASSES 10
typedef struct AlteredTableInfo
{
@@ -416,7 +418,8 @@ static void ATExecReplicaIdentity(Relation rel, ReplicaIdentityStmt *stmt, LOCKM
static void ATExecGenericOptions(Relation rel, List *options);
static void ATExecEnableRowSecurity(Relation rel);
static void ATExecDisableRowSecurity(Relation rel);
-
+static void ATExecAddStatistics(AlteredTableInfo *tab, Relation rel,
+ StatisticsDef *def, LOCKMODE lockmode);
static void copy_relation_data(SMgrRelation rel, SMgrRelation dst,
ForkNumber forkNum, char relpersistence);
static const char *storage_name(char c);
@@ -2989,6 +2992,7 @@ AlterTableGetLockLevel(List *cmds)
* updates.
*/
case AT_SetStatistics: /* Uses MVCC in getTableAttrs() */
+ case AT_AddStatistics: /* XXX not sure if the right level */
case AT_ClusterOn: /* Uses MVCC in getIndexes() */
case AT_DropCluster: /* Uses MVCC in getIndexes() */
case AT_SetOptions: /* Uses MVCC in getTableAttrs() */
@@ -3145,6 +3149,7 @@ ATPrepCmd(List **wqueue, Relation rel, AlterTableCmd *cmd,
pass = AT_PASS_ADD_CONSTR;
break;
case AT_SetStatistics: /* ALTER COLUMN SET STATISTICS */
+ case AT_AddStatistics: /* XXX maybe not the right place */
ATSimpleRecursion(wqueue, rel, cmd, recurse, lockmode);
/* Performs own permission checks */
ATPrepSetStatistics(rel, cmd->name, cmd->def, lockmode);
@@ -3440,6 +3445,9 @@ ATExecCmd(List **wqueue, AlteredTableInfo *tab, Relation rel,
case AT_SetStatistics: /* ALTER COLUMN SET STATISTICS */
ATExecSetStatistics(rel, cmd->name, cmd->def, lockmode);
break;
+ case AT_AddStatistics: /* ADD STATISTICS */
+ ATExecAddStatistics(tab, rel, (StatisticsDef *) cmd->def, lockmode);
+ break;
case AT_SetOptions: /* ALTER COLUMN SET ( options ) */
ATExecSetOptions(rel, cmd->name, cmd->def, false, lockmode);
break;
@@ -11638,3 +11646,136 @@ RangeVarCallbackForAlterRelation(const RangeVar *rv, Oid relid, Oid oldrelid,
ReleaseSysCache(tuple);
}
+
+/* used for sorting the attnums in ATExecAddStatistics */
+static int compare_int16(const void *a, const void *b)
+{
+ return memcmp(a, b, sizeof(int16));
+}
+
+/*
+ * Implements the ALTER TABLE ... ADD STATISTICS (options) ON (columns).
+ *
+ * The code is an unholy mix of pieces that really belong to other parts
+ * of the source tree.
+ *
+ * FIXME Check that the types are pass-by-value and support sort,
+ * although maybe we can live without the sort (and only build
+ * MCV list / association rules).
+ *
+ * FIXME This should probably check for duplicate stats (i.e. same
+ * keys, same options). Although maybe it's useful to have
+ * multiple stats on the same columns with different options
+ * (say, a detailed MCV-only stats for some queries, histogram
+ * for others, etc.)
+ */
+static void ATExecAddStatistics(AlteredTableInfo *tab, Relation rel,
+ StatisticsDef *def, LOCKMODE lockmode)
+{
+ int i, j;
+ ListCell *l;
+ int16 attnums[INDEX_MAX_KEYS];
+ int numcols = 0;
+
+ HeapTuple htup;
+ Datum values[Natts_pg_mv_statistic];
+ bool nulls[Natts_pg_mv_statistic];
+ int2vector *stakeys;
+ Relation mvstatrel;
+
+ /* by default build everything */
+ bool build_dependencies = true;
+
+ Assert(IsA(def, StatisticsDef));
+
+ /* transform the column names to attnum values */
+
+ foreach(l, def->keys)
+ {
+ char *attname = strVal(lfirst(l));
+ HeapTuple atttuple;
+
+ atttuple = SearchSysCacheAttName(RelationGetRelid(rel), attname);
+
+ if (!HeapTupleIsValid(atttuple))
+ ereport(ERROR,
+ (errcode(ERRCODE_UNDEFINED_COLUMN),
+ errmsg("column \"%s\" referenced in statistics does not exist",
+ attname)));
+
+ /* more than MVSTATS_MAX_DIMENSIONS columns not allowed */
+ if (numcols >= MVSTATS_MAX_DIMENSIONS)
+ ereport(ERROR,
+ (errcode(ERRCODE_TOO_MANY_COLUMNS),
+ errmsg("cannot have more than %d keys in a statistics",
+ MVSTATS_MAX_DIMENSIONS)));
+
+ attnums[numcols] = ((Form_pg_attribute) GETSTRUCT(atttuple))->attnum;
+ ReleaseSysCache(atttuple);
+ numcols++;
+ }
+
+ /*
+ * Check the lower bound (at least 2 columns), the upper bound was
+ * already checked in the loop.
+ */
+ if (numcols < 2)
+ ereport(ERROR,
+ (errcode(ERRCODE_TOO_MANY_COLUMNS),
+ errmsg("multivariate stats require 2 or more columns")));
+
+ /* look for duplicates */
+ for (i = 0; i < numcols; i++)
+ for (j = 0; j < numcols; j++)
+ if ((i != j) && (attnums[i] == attnums[j]))
+ ereport(ERROR,
+ (errcode(ERRCODE_UNDEFINED_COLUMN),
+ errmsg("duplicate column name in statistics definition")));
+
+ /* parse the statistics options */
+ foreach (l, def->options)
+ {
+ DefElem *opt = (DefElem*)lfirst(l);
+
+ if (strcmp(opt->defname, "dependencies") == 0)
+ build_dependencies = defGetBoolean(opt);
+ else
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("unrecognized STATISTICS option \"%s\"",
+ opt->defname)));
+ }
+
+ /* sort the attnums and build int2vector */
+ qsort(attnums, numcols, sizeof(int16), compare_int16);
+ stakeys = buildint2vector(attnums, numcols);
+
+ /*
+ * Okay, let's create the pg_mv_statistic entry.
+ */
+ memset(values, 0, sizeof(values));
+ memset(nulls, false, sizeof(nulls));
+
+ /* no stats collected yet, so just the keys */
+ values[Anum_pg_mv_statistic_starelid-1] = ObjectIdGetDatum(RelationGetRelid(rel));
+
+ values[Anum_pg_mv_statistic_stakeys -1] = PointerGetDatum(stakeys);
+ values[Anum_pg_mv_statistic_deps_enabled -1] = BoolGetDatum(build_dependencies);
+
+ nulls[Anum_pg_mv_statistic_stadeps -1] = true;
+
+ /* insert the tuple into pg_mv_statistic */
+ mvstatrel = heap_open(MvStatisticRelationId, RowExclusiveLock);
+
+ htup = heap_form_tuple(mvstatrel->rd_att, values, nulls);
+
+ simple_heap_insert(mvstatrel, htup);
+
+ CatalogUpdateIndexes(mvstatrel, htup);
+
+ heap_freetuple(htup);
+
+ heap_close(mvstatrel, RowExclusiveLock);
+
+ return;
+}
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 029761e..df230d6 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -3918,6 +3918,17 @@ _copyAlterPolicyStmt(const AlterPolicyStmt *from)
return newnode;
}
+static StatisticsDef *
+_copyStatisticsDef(const StatisticsDef *from)
+{
+ StatisticsDef *newnode = makeNode(StatisticsDef);
+
+ COPY_NODE_FIELD(keys);
+ COPY_NODE_FIELD(options);
+
+ return newnode;
+}
+
/* ****************************************************************
* pg_list.h copy functions
* ****************************************************************
@@ -4744,7 +4755,9 @@ copyObject(const void *from)
case T_RoleSpec:
retval = _copyRoleSpec(from);
break;
-
+ case T_StatisticsDef:
+ retval = _copyStatisticsDef(from);
+ break;
default:
elog(ERROR, "unrecognized node type: %d", (int) nodeTag(from));
retval = 0; /* keep compiler quiet */
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 82405b9..0346a00 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -367,6 +367,13 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
create_generic_options alter_generic_options
relation_expr_list dostmt_opt_list
+%type <list> OptStatsOptions
+%type <str> stats_options_name
+%type <node> stats_options_arg
+%type <defelt> stats_options_elem
+%type <list> stats_options_list
+
+
%type <list> opt_fdw_options fdw_options
%type <defelt> fdw_option
@@ -486,7 +493,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
%type <keyword> unreserved_keyword type_func_name_keyword
%type <keyword> col_name_keyword reserved_keyword
-%type <node> TableConstraint TableLikeClause
+%type <node> TableConstraint TableLikeClause TableStatistics
%type <ival> TableLikeOptionList TableLikeOption
%type <list> ColQualList
%type <node> ColConstraint ColConstraintElem ConstraintAttr
@@ -2311,6 +2318,14 @@ alter_table_cmd:
n->subtype = AT_DisableRowSecurity;
$$ = (Node *)n;
}
+ /* ALTER TABLE <name> ADD STATISTICS (options) ON (columns) ... */
+ | ADD_P TableStatistics
+ {
+ AlterTableCmd *n = makeNode(AlterTableCmd);
+ n->subtype = AT_AddStatistics;
+ n->def = $2;
+ $$ = (Node *)n;
+ }
| alter_generic_options
{
AlterTableCmd *n = makeNode(AlterTableCmd);
@@ -3381,6 +3396,56 @@ OptConsTableSpace: USING INDEX TABLESPACE name { $$ = $4; }
ExistingIndex: USING INDEX index_name { $$ = $3; }
;
+/*****************************************************************************
+ *
+ * QUERY :
+ * ALTER TABLE relname ADD STATISTICS (options) ON (columns)
+ *
+ *****************************************************************************/
+
+TableStatistics:
+ STATISTICS OptStatsOptions ON '(' columnList ')'
+ {
+ StatisticsDef *n = makeNode(StatisticsDef);
+ n->keys = $5;
+ n->options = $2;
+ $$ = (Node *) n;
+ }
+ ;
+
+OptStatsOptions:
+ '(' stats_options_list ')' { $$ = $2; }
+ | /*EMPTY*/ { $$ = NIL; }
+ ;
+
+stats_options_list:
+ stats_options_elem
+ {
+ $$ = list_make1($1);
+ }
+ | stats_options_list ',' stats_options_elem
+ {
+ $$ = lappend($1, $3);
+ }
+ ;
+
+stats_options_elem:
+ stats_options_name stats_options_arg
+ {
+ $$ = makeDefElem($1, $2);
+ }
+ ;
+
+stats_options_name:
+ NonReservedWord { $$ = $1; }
+ ;
+
+stats_options_arg:
+ opt_boolean_or_string { $$ = (Node *) makeString($1); }
+ | NumericOnly { $$ = (Node *) $1; }
+ | /* EMPTY */ { $$ = NULL; }
+ ;
+
/*****************************************************************************
*
diff --git a/src/backend/utils/Makefile b/src/backend/utils/Makefile
index 8374533..eba0352 100644
--- a/src/backend/utils/Makefile
+++ b/src/backend/utils/Makefile
@@ -9,7 +9,7 @@ top_builddir = ../../..
include $(top_builddir)/src/Makefile.global
OBJS = fmgrtab.o
-SUBDIRS = adt cache error fmgr hash init mb misc mmgr resowner sort time
+SUBDIRS = adt cache error fmgr hash init mb misc mmgr mvstats resowner sort time
# location of Catalog.pm
catalogdir = $(top_srcdir)/src/backend/catalog
diff --git a/src/backend/utils/cache/syscache.c b/src/backend/utils/cache/syscache.c
index bd27168..f61ef7e 100644
--- a/src/backend/utils/cache/syscache.c
+++ b/src/backend/utils/cache/syscache.c
@@ -43,6 +43,7 @@
#include "catalog/pg_foreign_server.h"
#include "catalog/pg_foreign_table.h"
#include "catalog/pg_language.h"
+#include "catalog/pg_mv_statistic.h"
#include "catalog/pg_namespace.h"
#include "catalog/pg_opclass.h"
#include "catalog/pg_operator.h"
@@ -499,6 +500,17 @@ static const struct cachedesc cacheinfo[] = {
},
4
},
+ {MvStatisticRelationId, /* MVSTATOID */
+ MvStatisticOidIndexId,
+ 1,
+ {
+ ObjectIdAttributeNumber,
+ 0,
+ 0,
+ 0
+ },
+ 128
+ },
{NamespaceRelationId, /* NAMESPACENAME */
NamespaceNameIndexId,
1,
diff --git a/src/backend/utils/mvstats/Makefile b/src/backend/utils/mvstats/Makefile
new file mode 100644
index 0000000..099f1ed
--- /dev/null
+++ b/src/backend/utils/mvstats/Makefile
@@ -0,0 +1,17 @@
+#-------------------------------------------------------------------------
+#
+# Makefile--
+# Makefile for utils/mvstats
+#
+# IDENTIFICATION
+# src/backend/utils/mvstats/Makefile
+#
+#-------------------------------------------------------------------------
+
+subdir = src/backend/utils/mvstats
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+
+OBJS = common.o dependencies.o
+
+include $(top_srcdir)/src/backend/common.mk
diff --git a/src/backend/utils/mvstats/common.c b/src/backend/utils/mvstats/common.c
new file mode 100644
index 0000000..36757d5
--- /dev/null
+++ b/src/backend/utils/mvstats/common.c
@@ -0,0 +1,272 @@
+/*-------------------------------------------------------------------------
+ *
+ * common.c
+ * POSTGRES multivariate statistics
+ *
+ *
+ * Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ * src/backend/utils/mvstats/common.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "common.h"
+
+/*
+ * Compute requested multivariate stats, using the rows sampled for the
+ * plain (single-column) stats.
+ *
+ * This fetches a list of stats from pg_mv_statistic, computes the stats
+ * and serializes them back into the catalog (as bytea values).
+ */
+void
+build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
+ int natts, VacAttrStats **vacattrstats)
+{
+ int i;
+ MVStats mvstats;
+ int nmvstats;
+
+ /*
+ * Fetch defined MV groups from pg_mv_statistic, and then compute
+ * the MV statistics (functional dependencies for now).
+ */
+ mvstats = list_mv_stats(RelationGetRelid(onerel), &nmvstats, false);
+
+ for (i = 0; i < nmvstats; i++)
+ {
+ MVDependencies deps = NULL;
+
+ /* int2 vector of attnums the stats should be computed on */
+ int2vector * attrs = mvstats[i].stakeys;
+
+ /* check allowed number of dimensions */
+ Assert((attrs->dim1 >= 2) && (attrs->dim1 <= MVSTATS_MAX_DIMENSIONS));
+
+ /*
+ * Analyze functional dependencies of columns.
+ */
+ deps = build_mv_dependencies(numrows, rows, attrs, natts, vacattrstats);
+
+ /* store the functional dependencies in the catalog */
+ update_mv_stats(mvstats[i].mvoid, deps);
+ }
+}
+
+/*
+ * Lookup the VacAttrStats info for the selected columns, with indexes
+ * matching the attrs vector (to make it easy to work with when
+ * computing multivariate stats).
+ */
+VacAttrStats **
+lookup_var_attr_stats(int2vector *attrs, int natts, VacAttrStats **vacattrstats)
+{
+ int i, j;
+ int numattrs = attrs->dim1;
+ VacAttrStats **stats = (VacAttrStats**)palloc0(numattrs * sizeof(VacAttrStats*));
+
+ /* lookup VacAttrStats info for the requested columns (same attnum) */
+ for (i = 0; i < numattrs; i++)
+ {
+ stats[i] = NULL;
+ for (j = 0; j < natts; j++)
+ {
+ if (attrs->values[i] == vacattrstats[j]->tupattnum)
+ {
+ stats[i] = vacattrstats[j];
+ break;
+ }
+ }
+
+ /*
+ * Check that we found the info, that the attnum matches,
+ * and that the requested 'lt' operator is available for
+ * the column's data type.
+ */
+ Assert(stats[i] != NULL);
+ Assert(stats[i]->tupattnum == attrs->values[i]);
+
+ /* FIXME This is a rather ugly way to check for 'ltopr' (which
+ * is defined for 'scalar' attributes).
+ */
+ Assert(((StdAnalyzeData *)stats[i]->extra_data)->ltopr != InvalidOid);
+ }
+
+ return stats;
+}
+
+/*
+ * Fetch list of MV stats defined on a table, without the actual data
+ * for histograms, MCV lists etc.
+ */
+MVStats
+list_mv_stats(Oid relid, int *nstats, bool built_only)
+{
+ Relation indrel;
+ SysScanDesc indscan;
+ ScanKeyData skey;
+ HeapTuple htup;
+ MVStats result;
+
+ /* start with 16 items, that should be enough for most cases */
+ int maxitems = 16;
+ result = (MVStats)palloc0(sizeof(MVStatsData) * maxitems);
+ *nstats = 0;
+
+ /* Prepare to scan pg_mv_statistic for entries having starelid = this rel. */
+ ScanKeyInit(&skey,
+ Anum_pg_mv_statistic_starelid,
+ BTEqualStrategyNumber, F_OIDEQ,
+ ObjectIdGetDatum(relid));
+
+ indrel = heap_open(MvStatisticRelationId, AccessShareLock);
+ indscan = systable_beginscan(indrel, MvStatisticRelidIndexId, true,
+ NULL, 1, &skey);
+
+ while (HeapTupleIsValid(htup = systable_getnext(indscan)))
+ {
+ Form_pg_mv_statistic stats = (Form_pg_mv_statistic) GETSTRUCT(htup);
+
+ /*
+ * Skip statistics that were not computed yet (if only stats
+ * that were already built were requested)
+ */
+ if (built_only && (! stats->deps_built))
+ continue;
+
+ /* double the array size if needed */
+ if (*nstats == maxitems)
+ {
+ maxitems *= 2;
+ result = (MVStats)repalloc(result, sizeof(MVStatsData) * maxitems);
+ }
+
+ result[*nstats].mvoid = HeapTupleGetOid(htup);
+ result[*nstats].stakeys = buildint2vector(stats->stakeys.values, stats->stakeys.dim1);
+ result[*nstats].deps_built = stats->deps_built;
+ *nstats += 1;
+ }
+
+ systable_endscan(indscan);
+
+ heap_close(indrel, AccessShareLock);
+
+ /* TODO Maybe save the list into relcache, as in RelationGetIndexList
+ * (which was used as inspiration for this one). */
+
+ return result;
+}
+
+void
+update_mv_stats(Oid mvoid, MVDependencies dependencies)
+{
+ HeapTuple stup,
+ oldtup;
+ Datum values[Natts_pg_mv_statistic];
+ bool nulls[Natts_pg_mv_statistic];
+ bool replaces[Natts_pg_mv_statistic];
+
+ Relation sd = heap_open(MvStatisticRelationId, RowExclusiveLock);
+
+ memset(nulls, 1, Natts_pg_mv_statistic * sizeof(bool));
+ memset(replaces, 0, Natts_pg_mv_statistic * sizeof(bool));
+ memset(values, 0, Natts_pg_mv_statistic * sizeof(Datum));
+
+ /*
+ * Construct a new pg_mv_statistic tuple - replace only the functional
+ * dependencies, depending on whether they were actually computed.
+ */
+ if (dependencies != NULL)
+ {
+ nulls[Anum_pg_mv_statistic_stadeps -1] = false;
+ values[Anum_pg_mv_statistic_stadeps - 1]
+ = PointerGetDatum(serialize_mv_dependencies(dependencies));
+ }
+
+ /* always replace the value (either by bytea or NULL) */
+ replaces[Anum_pg_mv_statistic_stadeps -1] = true;
+
+ /* always change the availability flags */
+ nulls[Anum_pg_mv_statistic_deps_built -1] = false;
+
+ replaces[Anum_pg_mv_statistic_deps_built-1] = true;
+
+ values[Anum_pg_mv_statistic_deps_built-1] = BoolGetDatum(dependencies != NULL);
+
+ /* Is there already a pg_mv_statistic tuple for this attribute? */
+ oldtup = SearchSysCache1(MVSTATOID,
+ ObjectIdGetDatum(mvoid));
+
+ if (HeapTupleIsValid(oldtup))
+ {
+ /* Yes, replace it */
+ stup = heap_modify_tuple(oldtup,
+ RelationGetDescr(sd),
+ values,
+ nulls,
+ replaces);
+ ReleaseSysCache(oldtup);
+ simple_heap_update(sd, &stup->t_self, stup);
+ }
+ else
+ elog(ERROR, "invalid pg_mv_statistic record (oid=%d)", mvoid);
+
+ /* update indexes too */
+ CatalogUpdateIndexes(sd, stup);
+
+ heap_freetuple(stup);
+
+ heap_close(sd, RowExclusiveLock);
+}
+
+/* multi-variate stats comparator */
+
+/*
+ * qsort_arg comparator for sorting Datums (MV stats)
+ *
+ * This does not maintain the tupnoLink array.
+ */
+int
+compare_scalars_simple(const void *a, const void *b, void *arg)
+{
+ Datum da = *(Datum*)a;
+ Datum db = *(Datum*)b;
+ SortSupport ssup= (SortSupport) arg;
+
+ return ApplySortComparator(da, false, db, false, ssup);
+}
+
+/*
+ * qsort_arg comparator for sorting data when partitioning a MV bucket
+ */
+int
+compare_scalars_partition(const void *a, const void *b, void *arg)
+{
+ Datum da = ((ScalarItem*)a)->value;
+ Datum db = ((ScalarItem*)b)->value;
+ SortSupport ssup= (SortSupport) arg;
+
+ return ApplySortComparator(da, false, db, false, ssup);
+}
+
+/*
+ * qsort_arg comparator for sorting Datum[] (row of Datums) when
+ * counting distinct values.
+ */
+int
+compare_scalars_memcmp(const void *a, const void *b, void *arg)
+{
+ Size len = *(Size*)arg;
+
+ return memcmp(a, b, len);
+}
+
+int
+compare_scalars_memcmp_2(const void *a, const void *b)
+{
+ return memcmp(a, b, sizeof(Datum));
+}
diff --git a/src/backend/utils/mvstats/common.h b/src/backend/utils/mvstats/common.h
new file mode 100644
index 0000000..f511c4e
--- /dev/null
+++ b/src/backend/utils/mvstats/common.h
@@ -0,0 +1,70 @@
+/*-------------------------------------------------------------------------
+ *
+ * common.h
+ * POSTGRES multivariate statistics
+ *
+ *
+ * Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ * src/backend/utils/mvstats/common.h
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres.h"
+
+#include "access/tuptoaster.h"
+#include "catalog/indexing.h"
+#include "catalog/pg_collation.h"
+#include "catalog/pg_mv_statistic.h"
+#include "foreign/fdwapi.h"
+#include "postmaster/autovacuum.h"
+#include "storage/lmgr.h"
+#include "utils/datum.h"
+#include "utils/sortsupport.h"
+#include "utils/syscache.h"
+#include "utils/fmgroids.h"
+#include "utils/builtins.h"
+#include "access/sysattr.h"
+
+#include "utils/mvstats.h"
+
+/* FIXME private structure copied from analyze.c */
+
+typedef struct
+{
+ Oid eqopr; /* '=' operator for datatype, if any */
+ Oid eqfunc; /* and associated function */
+ Oid ltopr; /* '<' operator for datatype, if any */
+} StdAnalyzeData;
+
+typedef struct
+{
+ Datum value; /* a data value */
+ int tupno; /* position index for tuple it came from */
+} ScalarItem;
+
+typedef struct
+{
+ int count; /* # of duplicates */
+ int first; /* values[] index of first occurrence */
+} ScalarMCVItem;
+
+typedef struct
+{
+ SortSupport ssup;
+ int *tupnoLink;
+} CompareScalarsContext;
+
+
+VacAttrStats ** lookup_var_attr_stats(int2vector *attrs,
+ int natts, VacAttrStats **vacattrstats);
+
+/* comparators, used when constructing multivariate stats */
+int compare_scalars_simple(const void *a, const void *b, void *arg);
+int compare_scalars_partition(const void *a, const void *b, void *arg);
+int compare_scalars_memcmp(const void *a, const void *b, void *arg);
+int compare_scalars_memcmp_2(const void *a, const void *b);
diff --git a/src/backend/utils/mvstats/dependencies.c b/src/backend/utils/mvstats/dependencies.c
new file mode 100644
index 0000000..b900efd
--- /dev/null
+++ b/src/backend/utils/mvstats/dependencies.c
@@ -0,0 +1,554 @@
+/*-------------------------------------------------------------------------
+ *
+ * dependencies.c
+ * POSTGRES multivariate functional dependencies
+ *
+ *
+ * Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ * src/backend/utils/mvstats/dependencies.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "common.h"
+
+/*
+ * Mine functional dependencies between columns, in the form (A => B),
+ * meaning that a value in column 'A' determines value in 'B'. A simple
+ * artificial example may be a table created like this
+ *
+ * CREATE TABLE deptest (a INT, b INT)
+ * AS SELECT i, i/10 FROM generate_series(1,100000) s(i);
+ *
+ * Clearly, once we know the value for 'A' we can easily determine the
+ * value of 'B' by dividing (A/10). A more practical example may be
+ * addresses, where (ZIP code => city name), i.e. once we know the ZIP,
+ * we probably know which city it belongs to. Larger cities usually have
+ * multiple ZIP codes, so the dependency can't be reversed.
+ *
+ * Functional dependencies are a concept well described in relational
+ * theory, especially in definition of normalization and "normal forms".
+ * Wikipedia has a nice definition of a functional dependency [1]:
+ *
+ * In a given table, an attribute Y is said to have a functional
+ * dependency on a set of attributes X (written X -> Y) if and only
+ * if each X value is associated with precisely one Y value. For
+ * example, in an "Employee" table that includes the attributes
+ * "Employee ID" and "Employee Date of Birth", the functional
+ * dependency {Employee ID} -> {Employee Date of Birth} would hold.
+ * It follows from the previous two sentences that each {Employee ID}
+ * is associated with precisely one {Employee Date of Birth}.
+ *
+ * [1] http://en.wikipedia.org/wiki/Database_normalization
+ *
+ * Most datasets might be normalized not to contain any such functional
+ * dependencies, but sometimes it's not practical. In some cases it's
+ * actually a conscious choice to model the dataset in a denormalized way,
+ * either because of performance or to make querying easier.
+ *
+ * The current implementation supports only dependencies between two
+ * columns, but this is merely a simplification of the initial patch.
+ * It's certainly useful to mine for dependencies involving multiple
+ * columns on the 'left' side, i.e. a condition for the dependency.
+ * That is, dependencies [A,B] => C and so on.
+ *
+ * Handling multiple columns on the right side is not necessary, as such
+ * dependencies may be decomposed into a set of dependencies with
+ * the same meaning, one for each column on the right side. For example
+ *
+ * A => [B,C]
+ *
+ * is exactly the same as
+ *
+ * (A => B) & (A => C).
+ *
+ * Of course, storing (A => [B, C]) may be more efficient than storing
+ * the two dependencies (A => B) and (A => C) separately.
+ *
+ *
+ * Dependency mining (ANALYZE)
+ * ---------------------------
+ *
+ * FIXME Add more details about how build_mv_dependencies() works
+ * (minimum group size, supporting/contradicting etc.).
+ *
+ * Real-world datasets are imperfect - there may be errors (e.g. due to
+ * data-entry mistakes), or factually correct records, yet contradicting
+ * the dependency (e.g. when a city splits into two, but both keep the
+ * same ZIP code). A strict ANALYZE implementation (where the functional
+ * dependencies are identified) would ignore dependencies on such noisy
+ * data, making the approach unusable in practice.
+ *
+ * The proposed implementation attempts to handle such noisy cases
+ * gracefully, by tolerating a small number of contradicting cases.
+ *
+ * In the future this might also perform some sort of test and decide
+ * whether it's worth building any other kind of multivariate stats,
+ * or whether the dependencies sufficiently describe the data. Or at
+ * least not build the MCV list / histogram on the implied columns.
+ * Such reduction would however make the 'verification' (see the next
+ * section) impossible.
+ *
+ *
+ * Clause reduction (planner/optimizer)
+ * ------------------------------------
+ *
+ * FIXME Explain how reduction works.
+ *
+ * The problem with the reduction is that the query may use conditions
+ * that are not redundant, but in fact contradictory - e.g. the user
+ * may search for a ZIP code and a city name not matching the ZIP code.
+ *
+ * In such cases, the condition on the city name is not actually
+ * redundant, but actually contradictory (making the result empty), and
+ * removing it while estimating the cardinality will make the estimate
+ * worse.
+ *
+ * The current estimation assuming independence (and multiplying the
+ * selectivities) works better in this case, but only by utter luck.
+ *
+ * In some cases this might be verified using the other multivariate
+ * statistics - MCV lists and histograms. For MCV lists the verification
+ * might be very simple - peek into the list to see whether any items
+ * match the clause on the 'A' column (e.g. ZIP code), and if such an
+ * item is found, check that the 'B' column matches the other clause.
+ * If it does not, the clauses are contradictory. We can't really say
+ * anything if no such item is found, except maybe restricting the
+ * selectivity using the MCV data (e.g. using min/max selectivity).
+ *
+ * With histograms, it might work similarly - we can't check the values
+ * directly (because histograms use buckets, unlike MCV lists, which
+ * store the actual values). So we can only observe the buckets matching the
+ * clauses - if those buckets have very low frequency, it probably means
+ * the two clauses are incompatible.
+ *
+ * It's unclear what 'low frequency' is, but if one of the clauses is
+ * implied (automatically true because of the other clause), then
+ *
+ * selectivity[clause(A)] = selectivity[clause(A) & clause(B)]
+ *
+ * So we might compute selectivity of the first clause (on the column
+ * A in dependency [A=>B]) - for example using regular statistics.
+ * And then check if the selectivity computed from the histogram is
+ * about the same (or significantly lower).
+ *
+ * The problem is that histograms work well only when the data ordering
+ * matches the natural meaning. For values that serve as labels - like
+ * city names or ZIP codes, or even generated IDs, histograms really
+ * don't work all that well. For example sorting cities by name won't
+ * match the sorting of ZIP codes, rendering the histogram unusable.
+ *
+ * The MCV lists are probably going to work much better, because they don't
+ * really assume any sort of ordering. And it's probably more appropriate
+ * for the label-like data.
+ *
+ * TODO Support dependencies with multiple columns on left/right.
+ *
+ * TODO Investigate using histogram and MCV list to confirm the
+ * functional dependencies.
+ *
+ * TODO Investigate statistical testing of the distribution (to decide
+ * whether it makes sense to build the histogram/MCV list).
+ *
+ * TODO Using a min/max of selectivities would probably make more sense
+ * for the associated columns.
+ *
+ * TODO Consider eliminating the implied columns from the histogram and
+ * MCV lists (but maybe that's not a good idea).
+ *
+ * FIXME Not sure if this handles NULL values properly (not sure how to
+ * do that). We assume that NULL means 0 for now, handling it just
+ * like any other value.
+ */
+MVDependencies
+build_mv_dependencies(int numrows, HeapTuple *rows, int2vector *attrs,
+ int natts, VacAttrStats **vacattrstats)
+{
+ int i;
+ bool isNull;
+ Size len = 2 * sizeof(Datum); /* only simple associations a => b */
+ int numattrs = attrs->dim1;
+
+ /* result */
+ int ndeps = 0;
+ MVDependencies dependencies = NULL;
+
+ /* TODO Maybe this should be somehow related to the number of
+ * distinct values in the two columns we're currently analyzing.
+ * Assuming a uniform distribution, we can estimate the group
+ * sizes we'd expect to see in the sample, and then use the
+ * average group size as a threshold. That seems better than
+ * a static approach. */
+ int min_group_size = 3;
+
+ /* dimension indexes we'll check for associations [a => b] */
+ int dima, dimb;
+
+ /* info for the interesting attributes only
+ *
+ * TODO Compute this only once and pass it to all the methods
+ * that need it.
+ */
+ VacAttrStats **stats = lookup_var_attr_stats(attrs, natts, vacattrstats);
+
+ /* We'll reuse the same array for all the combinations */
+ Datum * values = (Datum*)palloc0(numrows * 2 * sizeof(Datum));
+
+ Assert(numattrs >= 2);
+
+ /*
+ * Evaluate all possible combinations of [A => B], using a simple algorithm:
+ *
+ * (a) sort the data by [A,B]
+ * (b) split the data into groups by A (new group whenever a value changes)
+ * (c) count different values in the B column (again, value changes)
+ *
+ * TODO It should be rather simple to merge [A => B] and [A => C] into
+ * [A => B,C]. Just keep A constant, collect all the "implied" columns
+ * and you're done.
+ */
+ for (dima = 0; dima < numattrs; dima++)
+ {
+ for (dimb = 0; dimb < numattrs; dimb++)
+ {
+ Datum val_a, val_b;
+
+ /* number of groups supporting / contradicting the dependency */
+ int n_supporting = 0;
+ int n_contradicting = 0;
+
+ /* counters valid within a group */
+ int group_size = 0;
+ int n_violations = 0;
+
+ int n_supporting_rows = 0;
+ int n_contradicting_rows = 0;
+
+ /* make sure the columns are different (i.e. skip A => A) */
+ if (dima == dimb)
+ continue;
+
+ /* accumulate all the data for both columns into an array and sort it */
+ for (i = 0; i < numrows; i++)
+ {
+ values[i*2] = heap_getattr(rows[i], attrs->values[dima], stats[dima]->tupDesc, &isNull);
+ values[i*2+1] = heap_getattr(rows[i], attrs->values[dimb], stats[dimb]->tupDesc, &isNull);
+ }
+
+ qsort_arg((void *) values, numrows, sizeof(Datum) * 2, compare_scalars_memcmp, &len);
+
+ /*
+ * Walk through the array, split it into groups according to
+ * the A value, and count distinct B values in each group.
+ * If there's a single B value for the whole group, we count
+ * it as supporting the dependency, otherwise we count it
+ * as contradicting.
+ *
+ * Furthermore we require a group to have at least a certain
+ * number of rows to be considered useful for supporting the
+ * dependency. A contradicting group always counts, though.
+ */
+
+ /* start with values from the first row */
+ val_a = values[0];
+ val_b = values[1];
+ group_size = 1;
+
+ for (i = 1; i < numrows; i++)
+ {
+ if (values[2*i] != val_a) /* end of the group */
+ {
+ /*
+ * If there are no contradicting rows, count it as
+ * supporting (otherwise contradicting), but only if
+ * the group is large enough.
+ *
+ * The requirement of a minimum group size makes it
+ * impossible to identify [unique,unique] cases, but
+ * that's probably a different case. This is more
+ * about [zip => city] associations etc.
+ */
+ n_supporting += ((n_violations == 0) && (group_size >= min_group_size)) ? 1 : 0;
+ n_contradicting += (n_violations != 0) ? 1 : 0;
+
+ n_supporting_rows += ((n_violations == 0) && (group_size >= min_group_size)) ? group_size : 0;
+ n_contradicting_rows += (n_violations > 0) ? group_size : 0;
+
+ /* current values start a new group */
+ val_a = values[2*i];
+ val_b = values[2*i+1];
+ n_violations = 0;
+ group_size = 1;
+ }
+ else
+ {
+ if (values[2*i+1] != val_b) /* mismatch of a B value is contradicting */
+ {
+ val_b = values[2*i+1];
+ n_violations += 1;
+ }
+
+ group_size += 1;
+ }
+ }
+
+ /* handle the last group */
+ n_supporting += ((n_violations == 0) && (group_size >= min_group_size)) ? 1 : 0;
+ n_contradicting += (n_violations != 0) ? 1 : 0;
+ n_supporting_rows += ((n_violations == 0) && (group_size >= min_group_size)) ? group_size : 0;
+ n_contradicting_rows += (n_violations > 0) ? group_size : 0;
+
+ /*
+ * See if the number of rows supporting the association is at least
+ * 10x the number of rows violating the hypothetical dependency.
+ *
+ * TODO This is a rather arbitrary limit - I guess it's possible to do
+ * some math to come up with a better rule (e.g. testing a hypothesis
+ * 'this is due to randomness'). We can create a contingency table
+ * from the values and use it for testing. Possibly only when
+ * there are no contradicting rows?
+ *
+ * TODO Also, if (a => b) and (b => a) at the same time, it pretty much
+ * means the columns have the same values (or one is a 'label'),
+ * making the conditions rather redundant. Although it's possible
+ * that the query uses an incompatible combination of values.
+ */
+ if (n_supporting_rows > (n_contradicting_rows * 10))
+ {
+ if (dependencies == NULL)
+ {
+ dependencies = (MVDependencies)palloc0(sizeof(MVDependenciesData));
+ dependencies->magic = MVSTAT_DEPS_MAGIC;
+ }
+ else
+ dependencies = repalloc(dependencies, offsetof(MVDependenciesData, deps)
+ + sizeof(MVDependency) * (dependencies->ndeps + 1));
+
+ /* append the new dependency */
+ dependencies->deps[ndeps] = (MVDependency)palloc0(sizeof(MVDependencyData));
+ dependencies->deps[ndeps]->a = attrs->values[dima];
+ dependencies->deps[ndeps]->b = attrs->values[dimb];
+
+ dependencies->ndeps = (++ndeps);
+ }
+ }
+ }
+
+ pfree(values);
+
+ return dependencies;
+}
+
+/*
+ * Store the dependencies into a bytea, so that it can be stored in the
+ * pg_mv_statistic catalog.
+ *
+ * Currently this only supports simple two-column rules, and stores them
+ * as a sequence of attnum pairs. In the future, this needs to be made
+ * more complex to support multiple columns on both sides of the
+ * implication (using AND on left, OR on right).
+ */
+bytea *
+serialize_mv_dependencies(MVDependencies dependencies)
+{
+ int i;
+
+ /* we need to store ndeps, and each dependency needs 2 * int16 */
+ Size len = VARHDRSZ + offsetof(MVDependenciesData, deps)
+ + dependencies->ndeps * (sizeof(int16) * 2);
+
+ bytea * output = (bytea*)palloc0(len);
+
+ char * tmp = VARDATA(output);
+
+ SET_VARSIZE(output, len);
+
+ /* first, store the header (the magic number and number of deps) */
+ memcpy(tmp, dependencies, offsetof(MVDependenciesData, deps));
+ tmp += offsetof(MVDependenciesData, deps);
+
+ /* walk through the dependencies and copy both columns into the bytea */
+ for (i = 0; i < dependencies->ndeps; i++)
+ {
+ memcpy(tmp, &(dependencies->deps[i]->a), sizeof(int16));
+ tmp += sizeof(int16);
+
+ memcpy(tmp, &(dependencies->deps[i]->b), sizeof(int16));
+ tmp += sizeof(int16);
+ }
+
+ return output;
+}
+
+/*
+ * Reads serialized dependencies into MVDependencies structure.
+ */
+MVDependencies
+deserialize_mv_dependencies(bytea * data)
+{
+ int i;
+ Size expected_size;
+ MVDependencies dependencies;
+ char *tmp;
+
+ if (data == NULL)
+ return NULL;
+
+ if (VARSIZE_ANY_EXHDR(data) < offsetof(MVDependenciesData,deps))
+ elog(ERROR, "invalid MVDependencies size %ld (expected at least %ld)",
+ VARSIZE_ANY_EXHDR(data), offsetof(MVDependenciesData,deps));
+
+ /* read the MVDependencies header */
+ dependencies = (MVDependencies)palloc0(sizeof(MVDependenciesData));
+
+ /* initialize pointer to the data part (skip the varlena header) */
+ tmp = VARDATA(data);
+
+ /* get the header and perform basic sanity checks */
+ memcpy(dependencies, tmp, offsetof(MVDependenciesData, deps));
+ tmp += offsetof(MVDependenciesData, deps);
+
+ if (dependencies->magic != MVSTAT_DEPS_MAGIC)
+ {
+ pfree(dependencies);
+ elog(WARNING, "not a MV Dependencies (magic number mismatch)");
+ return NULL;
+ }
+
+ Assert(dependencies->ndeps > 0);
+
+ /* what bytea size do we expect for those parameters */
+ expected_size = offsetof(MVDependenciesData,deps) +
+ dependencies->ndeps * sizeof(int16) * 2;
+
+ if (VARSIZE_ANY_EXHDR(data) != expected_size)
+ elog(ERROR, "invalid dependencies size %ld (expected %ld)",
+ VARSIZE_ANY_EXHDR(data), expected_size);
+
+ /* allocate space for the dependencies */
+ dependencies = repalloc(dependencies, offsetof(MVDependenciesData,deps)
+ + (dependencies->ndeps * sizeof(MVDependency)));
+
+ for (i = 0; i < dependencies->ndeps; i++)
+ {
+ dependencies->deps[i] = (MVDependency)palloc0(sizeof(MVDependencyData));
+
+ memcpy(&(dependencies->deps[i]->a), tmp, sizeof(int16));
+ tmp += sizeof(int16);
+
+ memcpy(&(dependencies->deps[i]->b), tmp, sizeof(int16));
+ tmp += sizeof(int16);
+ }
+
+ return dependencies;
+}
+
+/* print some basic info about dependencies (number of dependencies) */
+Datum
+pg_mv_stats_dependencies_info(PG_FUNCTION_ARGS)
+{
+ bytea *data = PG_GETARG_BYTEA_P(0);
+ char *result;
+
+ MVDependencies dependencies = deserialize_mv_dependencies(data);
+
+ if (dependencies == NULL)
+ PG_RETURN_NULL();
+
+ result = palloc0(128);
+ snprintf(result, 128, "dependencies=%d", dependencies->ndeps);
+
+ /* FIXME free the deserialized data (pfree is not enough) */
+
+ PG_RETURN_TEXT_P(cstring_to_text(result));
+}
+
+/* print the dependencies
+ *
+ * TODO Would be nice if this knew the actual column names (instead of
+ * the attnums).
+ *
+ * FIXME This is really ugly and does not really check the lengths and
+ * strcpy/snprintf return values properly. Needs to be fixed.
+ */
+Datum
+pg_mv_stats_dependencies_show(PG_FUNCTION_ARGS)
+{
+ int i = 0;
+ bytea *data = PG_GETARG_BYTEA_P(0);
+ char *result = NULL;
+ int len = 0;
+
+ MVDependencies dependencies = deserialize_mv_dependencies(data);
+
+ if (dependencies == NULL)
+ PG_RETURN_NULL();
+
+ for (i = 0; i < dependencies->ndeps; i++)
+ {
+ MVDependency dependency = dependencies->deps[i];
+ char buffer[128];
+
+ int tmp = snprintf(buffer, 128, "%s%d => %d",
+ ((i == 0) ? "" : ", "), dependency->a, dependency->b);
+
+ if (tmp < 127)
+ {
+ if (result == NULL)
+ result = palloc0(len + tmp + 1);
+ else
+ result = repalloc(result, len + tmp + 1);
+
+ strcpy(result + len, buffer);
+ len += tmp;
+ }
+ }
+
+ PG_RETURN_TEXT_P(cstring_to_text(result));
+}
+
+bytea *
+fetch_mv_dependencies(Oid mvoid)
+{
+ Relation indrel;
+ SysScanDesc indscan;
+ ScanKeyData skey;
+ HeapTuple htup;
+ bytea *stadeps = NULL;
+
+ /* Prepare to scan pg_mv_statistic for the entry with this OID. */
+ ScanKeyInit(&skey,
+ ObjectIdAttributeNumber,
+ BTEqualStrategyNumber, F_OIDEQ,
+ ObjectIdGetDatum(mvoid));
+
+ indrel = heap_open(MvStatisticRelationId, AccessShareLock);
+ indscan = systable_beginscan(indrel, MvStatisticOidIndexId, true,
+ NULL, 1, &skey);
+
+ while (HeapTupleIsValid(htup = systable_getnext(indscan)))
+ {
+ bool isnull = false;
+ Datum deps = SysCacheGetAttr(MVSTATOID, htup,
+ Anum_pg_mv_statistic_stadeps, &isnull);
+
+ Assert(!isnull);
+
+ stadeps = DatumGetByteaP(deps);
+
+ break;
+ }
+
+ systable_endscan(indscan);
+
+ heap_close(indrel, AccessShareLock);
+
+ /* TODO Maybe save the list into relcache, as in RelationGetIndexList
+ * (which was used as inspiration for this one). */
+
+ return stadeps;
+}
diff --git a/src/include/catalog/indexing.h b/src/include/catalog/indexing.h
index a680229..f69eb7c 100644
--- a/src/include/catalog/indexing.h
+++ b/src/include/catalog/indexing.h
@@ -173,6 +173,11 @@ DECLARE_UNIQUE_INDEX(pg_largeobject_loid_pn_index, 2683, on pg_largeobject using
DECLARE_UNIQUE_INDEX(pg_largeobject_metadata_oid_index, 2996, on pg_largeobject_metadata using btree(oid oid_ops));
#define LargeObjectMetadataOidIndexId 2996
+DECLARE_UNIQUE_INDEX(pg_mv_statistic_oid_index, 3286, on pg_mv_statistic using btree(oid oid_ops));
+#define MvStatisticOidIndexId 3286
+DECLARE_INDEX(pg_mv_statistic_relid_index, 3287, on pg_mv_statistic using btree(starelid oid_ops));
+#define MvStatisticRelidIndexId 3287
+
DECLARE_UNIQUE_INDEX(pg_namespace_nspname_index, 2684, on pg_namespace using btree(nspname name_ops));
#define NamespaceNameIndexId 2684
DECLARE_UNIQUE_INDEX(pg_namespace_oid_index, 2685, on pg_namespace using btree(oid oid_ops));
diff --git a/src/include/catalog/pg_mv_statistic.h b/src/include/catalog/pg_mv_statistic.h
new file mode 100644
index 0000000..76b7db7
--- /dev/null
+++ b/src/include/catalog/pg_mv_statistic.h
@@ -0,0 +1,69 @@
+/*-------------------------------------------------------------------------
+ *
+ * pg_mv_statistic.h
+ * definition of the system "multivariate statistic" relation (pg_mv_statistic)
+ * along with the relation's initial contents.
+ *
+ *
+ * Portions Copyright (c) 1996-2014, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/catalog/pg_mv_statistic.h
+ *
+ * NOTES
+ * the genbki.pl script reads this file and generates .bki
+ * information from the DATA() statements.
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef PG_MV_STATISTIC_H
+#define PG_MV_STATISTIC_H
+
+#include "catalog/genbki.h"
+
+/* ----------------
+ * pg_mv_statistic definition. cpp turns this into
+ * typedef struct FormData_pg_mv_statistic
+ * ----------------
+ */
+#define MvStatisticRelationId 3281
+
+CATALOG(pg_mv_statistic,3281)
+{
+ /* These fields form the unique key for the entry: */
+ Oid starelid; /* relation containing attributes */
+
+ /* statistics requested to build */
+ bool deps_enabled; /* analyze dependencies? */
+
+ /* statistics that are available (if requested) */
+ bool deps_built; /* dependencies were built */
+
+ /* variable-length fields start here, but we allow direct access to stakeys */
+ int2vector stakeys; /* array of column keys */
+
+#ifdef CATALOG_VARLEN
+ bytea stadeps; /* dependencies (serialized) */
+#endif
+
+} FormData_pg_mv_statistic;
+
+/* ----------------
+ * Form_pg_mv_statistic corresponds to a pointer to a tuple with
+ * the format of pg_mv_statistic relation.
+ * ----------------
+ */
+typedef FormData_pg_mv_statistic *Form_pg_mv_statistic;
+
+/* ----------------
+ * compiler constants for pg_mv_statistic
+ * ----------------
+ */
+#define Natts_pg_mv_statistic 5
+#define Anum_pg_mv_statistic_starelid 1
+#define Anum_pg_mv_statistic_deps_enabled 2
+#define Anum_pg_mv_statistic_deps_built 3
+#define Anum_pg_mv_statistic_stakeys 4
+#define Anum_pg_mv_statistic_stadeps 5
+
+#endif /* PG_MV_STATISTIC_H */
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
index 6a757f3..4b7ae1f 100644
--- a/src/include/catalog/pg_proc.h
+++ b/src/include/catalog/pg_proc.h
@@ -2693,6 +2693,11 @@ DESCR("current user privilege on any column by rel name");
DATA(insert OID = 3029 ( has_any_column_privilege PGNSP PGUID 12 10 0 0 0 f f f f t f s 2 0 16 "26 25" _null_ _null_ _null_ _null_ has_any_column_privilege_id _null_ _null_ _null_ ));
DESCR("current user privilege on any column by rel oid");
+DATA(insert OID = 3284 ( pg_mv_stats_dependencies_info PGNSP PGUID 12 1 0 0 0 f f f f t f i 1 0 25 "17" _null_ _null_ _null_ _null_ pg_mv_stats_dependencies_info _null_ _null_ _null_ ));
+DESCR("multivariate stats: functional dependencies info");
+DATA(insert OID = 3285 ( pg_mv_stats_dependencies_show PGNSP PGUID 12 1 0 0 0 f f f f t f i 1 0 25 "17" _null_ _null_ _null_ _null_ pg_mv_stats_dependencies_show _null_ _null_ _null_ ));
+DESCR("multivariate stats: functional dependencies show");
+
DATA(insert OID = 1928 ( pg_stat_get_numscans PGNSP PGUID 12 1 0 0 0 f f f f t f s 1 0 20 "26" _null_ _null_ _null_ _null_ pg_stat_get_numscans _null_ _null_ _null_ ));
DESCR("statistics: number of scans done for table/index");
DATA(insert OID = 1929 ( pg_stat_get_tuples_returned PGNSP PGUID 12 1 0 0 0 f f f f t f s 1 0 20 "26" _null_ _null_ _null_ _null_ pg_stat_get_tuples_returned _null_ _null_ _null_ ));
diff --git a/src/include/catalog/toasting.h b/src/include/catalog/toasting.h
index cba4ae7..45d3b5a 100644
--- a/src/include/catalog/toasting.h
+++ b/src/include/catalog/toasting.h
@@ -49,6 +49,7 @@ extern void BootstrapToastTable(char *relName,
DECLARE_TOAST(pg_attrdef, 2830, 2831);
DECLARE_TOAST(pg_constraint, 2832, 2833);
DECLARE_TOAST(pg_description, 2834, 2835);
+DECLARE_TOAST(pg_mv_statistic, 3288, 3289);
DECLARE_TOAST(pg_proc, 2836, 2837);
DECLARE_TOAST(pg_rewrite, 2838, 2839);
DECLARE_TOAST(pg_seclabel, 3598, 3599);
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index 38469ef..3a0e7c4 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -414,6 +414,7 @@ typedef enum NodeTag
T_WithClause,
T_CommonTableExpr,
T_RoleSpec,
+ T_StatisticsDef,
/*
* TAGS FOR REPLICATION GRAMMAR PARSE NODES (replnodes.h)
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index ec0d0ea..b256162 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -570,6 +570,14 @@ typedef struct ColumnDef
int location; /* parse location, or -1 if none/unknown */
} ColumnDef;
+typedef struct StatisticsDef
+{
+ NodeTag type;
+ List *keys; /* String nodes naming referenced column(s) */
+ List *options; /* list of DefElem nodes */
+} StatisticsDef;
+
+
/*
* TableLikeClause - CREATE TABLE ( ... LIKE ... ) clause
*/
@@ -1362,7 +1370,8 @@ typedef enum AlterTableType
AT_ReplicaIdentity, /* REPLICA IDENTITY */
AT_EnableRowSecurity, /* ENABLE ROW SECURITY */
AT_DisableRowSecurity, /* DISABLE ROW SECURITY */
- AT_GenericOptions /* OPTIONS (...) */
+ AT_GenericOptions, /* OPTIONS (...) */
+ AT_AddStatistics /* add statistics */
} AlterTableType;
typedef struct ReplicaIdentityStmt
diff --git a/src/include/utils/mvstats.h b/src/include/utils/mvstats.h
new file mode 100644
index 0000000..2b59c2d
--- /dev/null
+++ b/src/include/utils/mvstats.h
@@ -0,0 +1,86 @@
+/*-------------------------------------------------------------------------
+ *
+ * mvstats.h
+ * Multivariate statistics and selectivity estimation functions.
+ *
+ *
+ * Portions Copyright (c) 1996-2014, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/utils/mvstats.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef MVSTATS_H
+#define MVSTATS_H
+
+#include "commands/vacuum.h"
+
+/*
+ * Basic info about the stats, used when choosing what to use
+ *
+ * TODO Add info about what statistics are available (histogram, MCV,
+ * hashed MCV, functional dependencies).
+ */
+typedef struct MVStatsData {
+ Oid mvoid; /* OID of the stats in pg_mv_statistic */
+ int2vector *stakeys; /* attnums for columns in the stats */
+ bool deps_built; /* functional dependencies available */
+} MVStatsData;
+
+typedef struct MVStatsData *MVStats;
+
+
+#define MVSTATS_MAX_DIMENSIONS 8 /* max number of attributes */
+
+/* A functional dependency, tracking the [a => b] dependency.
+ *
+ * TODO Make this work with multiple columns on both sides.
+ */
+typedef struct MVDependencyData {
+ int16 a;
+ int16 b;
+} MVDependencyData;
+
+typedef MVDependencyData* MVDependency;
+
+typedef struct MVDependenciesData {
+ uint32 magic; /* magic constant marker */
+ int32 ndeps; /* number of dependencies */
+ MVDependency deps[1]; /* XXX why not a pointer? */
+} MVDependenciesData;
+
+typedef MVDependenciesData* MVDependencies;
+
+#define MVSTAT_DEPS_MAGIC 0xB4549A2C /* marks serialized bytea */
+#define MVSTAT_DEPS_TYPE_BASIC 1 /* basic dependencies type */
+
+/*
+ * TODO Maybe fetching the histogram/MCV list separately is inefficient?
+ * Consider adding a single `fetch_stats` method, fetching all
+ * stats specified using flags (or something like that).
+ */
+MVStats list_mv_stats(Oid relid, int *nstats, bool built_only);
+
+bytea * fetch_mv_dependencies(Oid mvoid);
+
+bytea * serialize_mv_dependencies(MVDependencies dependencies);
+
+/* deserialization of stats (serialization is private to analyze) */
+MVDependencies deserialize_mv_dependencies(bytea * data);
+
+/* FIXME this probably belongs somewhere else (not to operations stats) */
+extern Datum pg_mv_stats_dependencies_info(PG_FUNCTION_ARGS);
+extern Datum pg_mv_stats_dependencies_show(PG_FUNCTION_ARGS);
+
+MVDependencies
+build_mv_dependencies(int numrows, HeapTuple *rows,
+ int2vector *attrs,
+ int natts, VacAttrStats **vacattrstats);
+
+void build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
+ int natts, VacAttrStats **vacattrstats);
+
+void update_mv_stats(Oid mvoid, MVDependencies dependencies);
+
+#endif
diff --git a/src/include/utils/syscache.h b/src/include/utils/syscache.h
index ba0b090..12147ab 100644
--- a/src/include/utils/syscache.h
+++ b/src/include/utils/syscache.h
@@ -66,6 +66,7 @@ enum SysCacheIdentifier
INDEXRELID,
LANGNAME,
LANGOID,
+ MVSTATOID,
NAMESPACENAME,
NAMESPACEOID,
OPERNAMENSP,
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 1788270..f0117ca 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1353,6 +1353,14 @@ pg_matviews| SELECT n.nspname AS schemaname,
LEFT JOIN pg_namespace n ON ((n.oid = c.relnamespace)))
LEFT JOIN pg_tablespace t ON ((t.oid = c.reltablespace)))
WHERE (c.relkind = 'm'::"char");
+pg_mv_stats| SELECT n.nspname AS schemaname,
+ c.relname AS tablename,
+ s.stakeys AS attnums,
+ length(s.stadeps) AS depsbytes,
+ pg_mv_stats_dependencies_info(s.stadeps) AS depsinfo
+ FROM ((pg_mv_statistic s
+ JOIN pg_class c ON ((c.oid = s.starelid)))
+ LEFT JOIN pg_namespace n ON ((n.oid = c.relnamespace)));
pg_policies| SELECT n.nspname AS schemaname,
c.relname AS tablename,
pol.polname AS policyname,
diff --git a/src/test/regress/expected/sanity_check.out b/src/test/regress/expected/sanity_check.out
index c7be273..00f5fe7 100644
--- a/src/test/regress/expected/sanity_check.out
+++ b/src/test/regress/expected/sanity_check.out
@@ -113,6 +113,7 @@ pg_inherits|t
pg_language|t
pg_largeobject|t
pg_largeobject_metadata|t
+pg_mv_statistic|t
pg_namespace|t
pg_opclass|t
pg_operator|t
--
2.1.0.GIT
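To make the new syntax concrete, here is a short usage sketch against the patch as posted (the table and column names are made up, and the exact output will of course vary):
CREATE TABLE zips (zip INT, city INT, state INT);
INSERT INTO zips SELECT i/10, i/100, i/1000
FROM generate_series(1,100000) s(i);
-- two separate statistics on the same table, each useful for different queries
ALTER TABLE zips ADD STATISTICS (dependencies true) ON (zip, city);
ALTER TABLE zips ADD STATISTICS (dependencies true) ON (city, state);
ANALYZE zips;
-- basic info, using the pg_mv_stats view added by the patch
SELECT tablename, attnums, depsinfo FROM pg_mv_stats;
-- print the discovered dependencies (as pairs of attnums)
SELECT pg_mv_stats_dependencies_show(stadeps)
FROM pg_mv_statistic WHERE starelid = 'zips'::regclass;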
Hello,
On 20.3.2015 09:33, Kyotaro HORIGUCHI wrote:
Hello,
Patch 0001 needs changes for OIDs since my patch was
committed. The attached is compatible with current master.
Thanks. I plan to submit a new version of the patch in a few days, with
significant progress in various directions. I'll have to rebase to
current master before submitting the new version anyway (which includes
fixing duplicate OIDs).
And I tried this like this, and got the following error on
analyze. But unfortunately I don't have enough time to
investigate it now.
postgres=# create table t1 (a int, b int, c int);
insert into t1 (select a/ 10000, a / 10000, a / 10000 from
generate_series(0, 99999) a);
postgres=# analyze t1;
ERROR: invalid memory alloc request size 1485176862
Interesting - particularly because this does not involve any
multivariate stats. I can't reproduce it with the current version of the
patch, so either it's unrelated, or I've fixed it since posting the last
version.
regards
--
Tomas Vondra http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Hello,
Sorry, not shown above: the *previous* t1 had "alter
table t1 add statistics (a, b, c)" applied to it. Dropping t1 didn't
remove that setting; re-initializing the cluster let me run the
ANALYZE without error.
The complete steps were as follows.
===
create table t1 (a int, b int, c int);
alter table t1 add statistics (histogram) on (a, b, c);
drop table t1; -- This does not remove the above setting.
create table t1 (a int, b int, c int);
insert into t1 (select a / 10000, a / 10000, a / 10000 from generate_series(0, 99999) a);
regards,
--
Kyotaro Horiguchi
NTT Open Source Software Center
Hello,
On 03/24/15 06:34, Kyotaro HORIGUCHI wrote:
Sorry, not shown above: the *previous* t1 had "alter table
t1 add statistics (a, b, c)" applied to it. Dropping t1 didn't remove
that setting; re-initializing the cluster let me run the ANALYZE
without error.
OK, thanks. My guess is this issue has already been fixed in my working
copy, but I will double-check that.
Admittedly, the management of the stats (e.g. removing stats when the
table is dropped) is one of the incomplete parts. You have to delete the
rows manually from pg_mv_statistic.
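Until that's implemented, a manual cleanup of entries referencing
dropped tables might look like this (just a sketch - deleting from
catalogs directly requires superuser privileges):

DELETE FROM pg_mv_statistic
 WHERE starelid NOT IN (SELECT oid FROM pg_class);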
--
Tomas Vondra http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Hello,
attached is a new version of the patch series. Aside from fixing various
issues (crashes, memory leaks), the patches are rebased to current
master, and I also attach a few SQL scripts I used for testing (nothing
fancy, just stress-testing all the parts the patch touches).
The main changes in the patches (each requiring plenty of changes in
other parts) are these:
(1) combining multiple statistics on a table
--------------------------------------------
In the previous version of the patch, it was only possible to use a
single statistics on a table - when there was a statistics "covering"
all the conditions, it worked fine, but that's not always the case.
The new patch is able to combine multiple statistics by decomposing the
probability (=selectivity) into conditional probabilities. Imagine
estimating selectivity of clauses
WHERE (a=1) AND (b=1) AND (c=1) AND (d=1)
with statistics on [a,b,c] and [b,c,d]. The selectivity may be split for
example like this:
P(a=1,b=1,c=1,d=1) = P(a=1,b=1,c=1) * P(d=1|a=1,b=1,c=1)
where P(a=1,b=1,c=1) may be estimated using statistics [a,b,c], and the
second may be simplified like this:
P(d=1|a=1,b=1,c=1) = P(d=1|b=1,c=1)
using the assumption "no multivariate stats => independent". Both these
probabilities match the existing statistics.
The idea is described in a bit more detail in part #5 of the patch.
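For illustration, a setup exercising this combination might look like
this (a sketch using the syntax from this patch; the table and column
names are made up):

CREATE TABLE t (a INT, b INT, c INT, d INT);

-- two overlapping statistics, neither covering all four columns
ALTER TABLE t ADD STATISTICS ON (a, b, c);
ALTER TABLE t ADD STATISTICS ON (b, c, d);
ANALYZE t;

-- may now be estimated by combining both statistics as above
SELECT * FROM t WHERE (a = 1) AND (b = 1) AND (c = 1) AND (d = 1);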
(2) choosing the best combination of statistics
-----------------------------------------------
There may be multiple statistics on a table, and multiple possible ways
to use them to estimate the clauses (different ordering, overlapping
statistics, etc.).
The patch formulates this as an optimization task with two goals.
(a) cover as many clauses as possible
(b) reuse as many conditions (i.e. dependencies) as possible
and implements two algorithms to solve this: (a) exhaustive, walking
through all possible states (using dynamic programming), and (b) greedy,
choosing the best local solution in each step.
The time requirements of the exhaustive solution grow pretty quickly
with the number of clauses and statistics on a table (~O(N!)). The
greedy algorithm is much faster, as it's ~O(N), and in fact much more
time is spent actually processing the selected statistics (walking
through the histograms etc.).
I assume the exhaustive search may find a better solution in some cases
(that the greedy algorithm misses), but so far I've been unable to come
up with such an example.
To make this easier to test, I've added a GUC to switch between these
algorithms (set to 'greedy' by default):
mvstat_search = {'greedy', 'exhaustive'}
I assume this GUC will be removed eventually, after we figure out which
algorithm is the right one.
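Comparing the two is then just a matter of flipping the GUC before
running EXPLAIN (a sketch, reusing the made-up table from above):

SET mvstat_search = 'exhaustive';
EXPLAIN SELECT * FROM t WHERE (a = 1) AND (b = 1) AND (c = 1) AND (d = 1);

SET mvstat_search = 'greedy';
EXPLAIN SELECT * FROM t WHERE (a = 1) AND (b = 1) AND (c = 1) AND (d = 1);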
(3) estimation of more complex conditions (AND/OR clauses)
----------------------------------------------------------
I've added the ability to estimate more complex clauses - combinations of
AND/OR clauses and such. It's somewhat incomplete at the moment, but
hopefully the ideas will be clear from the TODOs/FIXMEs along the way.
Let me know if you have any questions about this version of the patch,
or about the ideas it implements in general.
I also welcome real-world examples of poorly estimated queries, so that
I can test if these patches improve that particular situation.
regards
--
Tomas Vondra http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachments:
0001-shared-infrastructure-and-functional-dependencies.patch (text/x-diff)
From 7c8f0ce0017beea314219c24146cbb64d0d37a3d Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tv@fuzzy.cz>
Date: Sun, 11 Jan 2015 19:51:48 +0100
Subject: [PATCH 1/5] shared infrastructure and functional dependencies
Basic infrastructure shared by all kinds of multivariate
stats, most importantly:
- adds a new system catalog (pg_mv_statistic)
- ALTER TABLE ... ADD STATISTICS syntax
- implementation of functional dependencies (the simplest
type of multivariate statistics)
- building functional dependencies in ANALYZE
- updates regression tests (new catalog etc.)
This does not include any changes to the optimizer, i.e.
it does not influence the query planning (subject to
follow-up patches).
The current implementation requires a valid 'ltopr' for
the columns, so that we can sort the sample rows in various
ways, both in this patch and other kinds of statistics.
Maybe this restriction could be relaxed in the future,
requiring just 'eqopr' in case of stats not sorting the
data (e.g. functional dependencies and MCV lists).
The algorithm detecting the dependencies is rather simple
and probably needs improvements.
The name 'functional dependencies' is more correct (than
'association rules') as it's exactly the name used in
relational theory (esp. Normal Forms) for tracking
column-level dependencies.
---
src/backend/catalog/Makefile | 1 +
src/backend/catalog/system_views.sql | 10 +
src/backend/commands/analyze.c | 20 +-
src/backend/commands/tablecmds.c | 149 ++++++-
src/backend/nodes/copyfuncs.c | 14 +
src/backend/parser/gram.y | 67 ++-
src/backend/utils/Makefile | 2 +-
src/backend/utils/cache/syscache.c | 12 +
src/backend/utils/mvstats/Makefile | 17 +
src/backend/utils/mvstats/common.c | 342 +++++++++++++++
src/backend/utils/mvstats/common.h | 75 ++++
src/backend/utils/mvstats/dependencies.c | 680 +++++++++++++++++++++++++++++
src/include/catalog/indexing.h | 5 +
src/include/catalog/pg_mv_statistic.h | 69 +++
src/include/catalog/pg_proc.h | 5 +
src/include/catalog/toasting.h | 1 +
src/include/nodes/nodes.h | 1 +
src/include/nodes/parsenodes.h | 11 +-
src/include/utils/mvstats.h | 86 ++++
src/include/utils/syscache.h | 1 +
src/test/regress/expected/rules.out | 8 +
src/test/regress/expected/sanity_check.out | 1 +
22 files changed, 1569 insertions(+), 8 deletions(-)
create mode 100644 src/backend/utils/mvstats/Makefile
create mode 100644 src/backend/utils/mvstats/common.c
create mode 100644 src/backend/utils/mvstats/common.h
create mode 100644 src/backend/utils/mvstats/dependencies.c
create mode 100644 src/include/catalog/pg_mv_statistic.h
create mode 100644 src/include/utils/mvstats.h
diff --git a/src/backend/catalog/Makefile b/src/backend/catalog/Makefile
index a403c64..d6c16f8 100644
--- a/src/backend/catalog/Makefile
+++ b/src/backend/catalog/Makefile
@@ -32,6 +32,7 @@ POSTGRES_BKI_SRCS = $(addprefix $(top_srcdir)/src/include/catalog/,\
pg_attrdef.h pg_constraint.h pg_inherits.h pg_index.h pg_operator.h \
pg_opfamily.h pg_opclass.h pg_am.h pg_amop.h pg_amproc.h \
pg_language.h pg_largeobject_metadata.h pg_largeobject.h pg_aggregate.h \
+ pg_mv_statistic.h \
pg_statistic.h pg_rewrite.h pg_trigger.h pg_event_trigger.h pg_description.h \
pg_cast.h pg_enum.h pg_namespace.h pg_conversion.h pg_depend.h \
pg_database.h pg_db_role_setting.h pg_tablespace.h pg_pltemplate.h \
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 2800f73..d05a716 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -150,6 +150,16 @@ CREATE VIEW pg_indexes AS
LEFT JOIN pg_tablespace T ON (T.oid = I.reltablespace)
WHERE C.relkind IN ('r', 'm') AND I.relkind = 'i';
+CREATE VIEW pg_mv_stats AS
+ SELECT
+ N.nspname AS schemaname,
+ C.relname AS tablename,
+ S.stakeys AS attnums,
+ length(S.stadeps) as depsbytes,
+ pg_mv_stats_dependencies_info(S.stadeps) as depsinfo
+ FROM (pg_mv_statistic S JOIN pg_class C ON (C.oid = S.starelid))
+ LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace);
+
CREATE VIEW pg_stats AS
SELECT
nspname AS schemaname,
diff --git a/src/backend/commands/analyze.c b/src/backend/commands/analyze.c
index d4d1914..f82fcf5 100644
--- a/src/backend/commands/analyze.c
+++ b/src/backend/commands/analyze.c
@@ -27,6 +27,7 @@
#include "catalog/indexing.h"
#include "catalog/pg_collation.h"
#include "catalog/pg_inherits_fn.h"
+#include "catalog/pg_mv_statistic.h"
#include "catalog/pg_namespace.h"
#include "commands/dbcommands.h"
#include "commands/tablecmds.h"
@@ -54,7 +55,11 @@
#include "utils/syscache.h"
#include "utils/timestamp.h"
#include "utils/tqual.h"
+#include "utils/fmgroids.h"
+#include "utils/builtins.h"
+#include "utils/mvstats.h"
+#include "access/sysattr.h"
/* Data structure for Algorithm S from Knuth 3.4.2 */
typedef struct
@@ -110,7 +115,6 @@ static void update_attstats(Oid relid, bool inh,
static Datum std_fetch_func(VacAttrStatsP stats, int rownum, bool *isNull);
static Datum ind_fetch_func(VacAttrStatsP stats, int rownum, bool *isNull);
-
/*
* analyze_rel() -- analyze one relation
*/
@@ -471,6 +475,17 @@ do_analyze_rel(Relation onerel, int options, List *va_cols,
* all analyzable columns. We use a lower bound of 100 rows to avoid
* possible overflow in Vitter's algorithm. (Note: that will also be the
* target in the corner case where there are no analyzable columns.)
+ *
+ * FIXME This sample sizing is mostly OK when computing stats for
+ * individual columns, but when computing multivariate stats
+ * (histograms, MCV lists, ...) it's rather
+ * insufficient. For stats on multiple columns / complex stats
+ * we need larger sample sizes, and in some cases samples
+ * proportional to the table (say, 0.5% - 1%) instead of a
+ * fixed size might be more appropriate. Also, this should be
+ * bound to the requested statistics size - e.g. number of MCV
+ * items or histogram buckets should require several sample
+ * rows per item/bucket (so the sample should be k*size).
*/
targrows = 100;
for (i = 0; i < attr_cnt; i++)
@@ -573,6 +588,9 @@ do_analyze_rel(Relation onerel, int options, List *va_cols,
update_attstats(RelationGetRelid(Irel[ind]), false,
thisdata->attr_cnt, thisdata->vacattrstats);
}
+
+ /* Build multivariate stats (if there are any). */
+ build_mv_stats(onerel, numrows, rows, attr_cnt, vacattrstats);
}
/*
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 002319e..a321755 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -35,6 +35,7 @@
#include "catalog/pg_foreign_table.h"
#include "catalog/pg_inherits.h"
#include "catalog/pg_inherits_fn.h"
+#include "catalog/pg_mv_statistic.h"
#include "catalog/pg_namespace.h"
#include "catalog/pg_opclass.h"
#include "catalog/pg_tablespace.h"
@@ -92,7 +93,7 @@
#include "utils/syscache.h"
#include "utils/tqual.h"
#include "utils/typcache.h"
-
+#include "utils/mvstats.h"
/*
* ON COMMIT action list
@@ -140,8 +141,9 @@ static List *on_commits = NIL;
#define AT_PASS_ADD_COL 5 /* ADD COLUMN */
#define AT_PASS_ADD_INDEX 6 /* ADD indexes */
#define AT_PASS_ADD_CONSTR 7 /* ADD constraints, defaults */
-#define AT_PASS_MISC 8 /* other stuff */
-#define AT_NUM_PASSES 9
+#define AT_PASS_ADD_STATS 8 /* ADD statistics */
+#define AT_PASS_MISC 9 /* other stuff */
+#define AT_NUM_PASSES 10
typedef struct AlteredTableInfo
{
@@ -416,7 +418,8 @@ static void ATExecReplicaIdentity(Relation rel, ReplicaIdentityStmt *stmt, LOCKM
static void ATExecGenericOptions(Relation rel, List *options);
static void ATExecEnableRowSecurity(Relation rel);
static void ATExecDisableRowSecurity(Relation rel);
-
+static void ATExecAddStatistics(AlteredTableInfo *tab, Relation rel,
+ StatisticsDef *def, LOCKMODE lockmode);
static void copy_relation_data(SMgrRelation rel, SMgrRelation dst,
ForkNumber forkNum, char relpersistence);
static const char *storage_name(char c);
@@ -2999,6 +3002,7 @@ AlterTableGetLockLevel(List *cmds)
* updates.
*/
case AT_SetStatistics: /* Uses MVCC in getTableAttrs() */
+ case AT_AddStatistics: /* XXX not sure if the right level */
case AT_ClusterOn: /* Uses MVCC in getIndexes() */
case AT_DropCluster: /* Uses MVCC in getIndexes() */
case AT_SetOptions: /* Uses MVCC in getTableAttrs() */
@@ -3155,6 +3159,7 @@ ATPrepCmd(List **wqueue, Relation rel, AlterTableCmd *cmd,
pass = AT_PASS_ADD_CONSTR;
break;
case AT_SetStatistics: /* ALTER COLUMN SET STATISTICS */
+ case AT_AddStatistics: /* XXX maybe not the right place */
ATSimpleRecursion(wqueue, rel, cmd, recurse, lockmode);
/* Performs own permission checks */
ATPrepSetStatistics(rel, cmd->name, cmd->def, lockmode);
@@ -3457,6 +3462,9 @@ ATExecCmd(List **wqueue, AlteredTableInfo *tab, Relation rel,
case AT_SetStatistics: /* ALTER COLUMN SET STATISTICS */
address = ATExecSetStatistics(rel, cmd->name, cmd->def, lockmode);
break;
+ case AT_AddStatistics: /* ADD STATISTICS */
+ ATExecAddStatistics(tab, rel, (StatisticsDef *) cmd->def, lockmode);
+ break;
case AT_SetOptions: /* ALTER COLUMN SET ( options ) */
address = ATExecSetOptions(rel, cmd->name, cmd->def, false, lockmode);
break;
@@ -11854,3 +11862,136 @@ RangeVarCallbackForAlterRelation(const RangeVar *rv, Oid relid, Oid oldrelid,
ReleaseSysCache(tuple);
}
+
+/* used for sorting the attnums in ATExecAddStatistics */
+static int compare_int16(const void *a, const void *b)
+{
+ return memcmp(a, b, sizeof(int16));
+}
+
+/*
+ * Implements the ALTER TABLE ... ADD STATISTICS (options) ON (columns).
+ *
+ * The code is an unholy mix of pieces that really belong to other parts
+ * of the source tree.
+ *
+ * FIXME Check that the types are pass-by-value and support sort,
+ * although maybe we can live without the sort (and only build
+ * MCV list / association rules).
+ *
+ * FIXME This should probably check for duplicate stats (i.e. same
+ * keys, same options). Although maybe it's useful to have
+ * multiple stats on the same columns with different options
+ * (say, a detailed MCV-only stats for some queries, histogram
+ * for others, etc.)
+ */
+static void ATExecAddStatistics(AlteredTableInfo *tab, Relation rel,
+ StatisticsDef *def, LOCKMODE lockmode)
+{
+ int i, j;
+ ListCell *l;
+ int16 attnums[INDEX_MAX_KEYS];
+ int numcols = 0;
+
+ HeapTuple htup;
+ Datum values[Natts_pg_mv_statistic];
+ bool nulls[Natts_pg_mv_statistic];
+ int2vector *stakeys;
+ Relation mvstatrel;
+
+ /* by default build everything */
+ bool build_dependencies = true;
+
+ Assert(IsA(def, StatisticsDef));
+
+ /* transform the column names to attnum values */
+
+ foreach(l, def->keys)
+ {
+ char *attname = strVal(lfirst(l));
+ HeapTuple atttuple;
+
+ atttuple = SearchSysCacheAttName(RelationGetRelid(rel), attname);
+
+ if (!HeapTupleIsValid(atttuple))
+ ereport(ERROR,
+ (errcode(ERRCODE_UNDEFINED_COLUMN),
+ errmsg("column \"%s\" referenced in statistics does not exist",
+ attname)));
+
+ /* more than MVSTATS_MAX_DIMENSIONS columns not allowed */
+ if (numcols >= MVSTATS_MAX_DIMENSIONS)
+ ereport(ERROR,
+ (errcode(ERRCODE_TOO_MANY_COLUMNS),
+ errmsg("cannot have more than %d keys in a statistics",
+ MVSTATS_MAX_DIMENSIONS)));
+
+ attnums[numcols] = ((Form_pg_attribute) GETSTRUCT(atttuple))->attnum;
+ ReleaseSysCache(atttuple);
+ numcols++;
+ }
+
+ /*
+ * Check the lower bound (at least 2 columns), the upper bound was
+ * already checked in the loop.
+ */
+ if (numcols < 2)
+ ereport(ERROR,
+ (errcode(ERRCODE_TOO_MANY_COLUMNS),
+ errmsg("multivariate stats require 2 or more columns")));
+
+ /* look for duplicate columns */
+ for (i = 0; i < numcols; i++)
+ for (j = 0; j < numcols; j++)
+ if ((i != j) && (attnums[i] == attnums[j]))
+ ereport(ERROR,
+ (errcode(ERRCODE_UNDEFINED_COLUMN),
+ errmsg("duplicate column name in statistics definition")));
+
+ /* parse the statistics options */
+ foreach (l, def->options)
+ {
+ DefElem *opt = (DefElem*)lfirst(l);
+
+ if (strcmp(opt->defname, "dependencies") == 0)
+ build_dependencies = defGetBoolean(opt);
+ else
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("unrecognized STATISTICS option \"%s\"",
+ opt->defname)));
+ }
+
+ /* sort the attnums and build int2vector */
+ qsort(attnums, numcols, sizeof(int16), compare_int16);
+ stakeys = buildint2vector(attnums, numcols);
+
+ /*
+ * Okay, let's create the pg_mv_statistic entry.
+ */
+ memset(values, 0, sizeof(values));
+ memset(nulls, false, sizeof(nulls));
+
+ /* no stats collected yet, so just the keys */
+ values[Anum_pg_mv_statistic_starelid-1] = ObjectIdGetDatum(RelationGetRelid(rel));
+
+ values[Anum_pg_mv_statistic_stakeys -1] = PointerGetDatum(stakeys);
+ values[Anum_pg_mv_statistic_deps_enabled -1] = BoolGetDatum(build_dependencies);
+
+ nulls[Anum_pg_mv_statistic_stadeps -1] = true;
+
+ /* insert the tuple into pg_mv_statistic */
+ mvstatrel = heap_open(MvStatisticRelationId, RowExclusiveLock);
+
+ htup = heap_form_tuple(mvstatrel->rd_att, values, nulls);
+
+ simple_heap_insert(mvstatrel, htup);
+
+ CatalogUpdateIndexes(mvstatrel, htup);
+
+ heap_freetuple(htup);
+
+ heap_close(mvstatrel, RowExclusiveLock);
+
+ return;
+}
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 029761e..a4ce2c9 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -3918,6 +3918,17 @@ _copyAlterPolicyStmt(const AlterPolicyStmt *from)
return newnode;
}
+static StatisticsDef *
+_copyStatisticsDef(const StatisticsDef *from)
+{
+ StatisticsDef *newnode = makeNode(StatisticsDef);
+
+ COPY_NODE_FIELD(keys);
+ COPY_NODE_FIELD(options);
+
+ return newnode;
+}
+
/* ****************************************************************
* pg_list.h copy functions
* ****************************************************************
@@ -4732,6 +4743,9 @@ copyObject(const void *from)
case T_CommonTableExpr:
retval = _copyCommonTableExpr(from);
break;
+ case T_StatisticsDef:
+ retval = _copyStatisticsDef(from);
+ break;
case T_FuncWithArgs:
retval = _copyFuncWithArgs(from);
break;
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 3aa9e42..17183ef 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -367,6 +367,13 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
create_generic_options alter_generic_options
relation_expr_list dostmt_opt_list
+%type <list> OptStatsOptions
+%type <str> stats_options_name
+%type <node> stats_options_arg
+%type <defelt> stats_options_elem
+%type <list> stats_options_list
+
+
%type <list> opt_fdw_options fdw_options
%type <defelt> fdw_option
@@ -486,7 +493,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
%type <keyword> unreserved_keyword type_func_name_keyword
%type <keyword> col_name_keyword reserved_keyword
-%type <node> TableConstraint TableLikeClause
+%type <node> TableConstraint TableLikeClause TableStatistics
%type <ival> TableLikeOptionList TableLikeOption
%type <list> ColQualList
%type <node> ColConstraint ColConstraintElem ConstraintAttr
@@ -2311,6 +2318,14 @@ alter_table_cmd:
n->subtype = AT_DisableRowSecurity;
$$ = (Node *)n;
}
+ /* ALTER TABLE <name> ADD STATISTICS (options) ON (columns) ... */
+ | ADD_P TableStatistics
+ {
+ AlterTableCmd *n = makeNode(AlterTableCmd);
+ n->subtype = AT_AddStatistics;
+ n->def = $2;
+ $$ = (Node *)n;
+ }
| alter_generic_options
{
AlterTableCmd *n = makeNode(AlterTableCmd);
@@ -3385,6 +3400,56 @@ OptConsTableSpace: USING INDEX TABLESPACE name { $$ = $4; }
ExistingIndex: USING INDEX index_name { $$ = $3; }
;
+/*****************************************************************************
+ *
+ * QUERY :
+ * ALTER TABLE relname ADD STATISTICS (options) ON (columns)
+ *
+ *****************************************************************************/
+
+TableStatistics:
+ STATISTICS OptStatsOptions ON '(' columnList ')'
+ {
+ StatisticsDef *n = makeNode(StatisticsDef);
+ n->keys = $5;
+ n->options = $2;
+ $$ = (Node *) n;
+ }
+ ;
+
+OptStatsOptions:
+ '(' stats_options_list ')' { $$ = $2; }
+ | /*EMPTY*/ { $$ = NIL; }
+ ;
+
+stats_options_list:
+ stats_options_elem
+ {
+ $$ = list_make1($1);
+ }
+ | stats_options_list ',' stats_options_elem
+ {
+ $$ = lappend($1, $3);
+ }
+ ;
+
+stats_options_elem:
+ stats_options_name stats_options_arg
+ {
+ $$ = makeDefElem($1, $2);
+ }
+ ;
+
+stats_options_name:
+ NonReservedWord { $$ = $1; }
+ ;
+
+stats_options_arg:
+ opt_boolean_or_string { $$ = (Node *) makeString($1); }
+ | NumericOnly { $$ = (Node *) $1; }
+ | /* EMPTY */ { $$ = NULL; }
+ ;
+
/*****************************************************************************
*
diff --git a/src/backend/utils/Makefile b/src/backend/utils/Makefile
index 8374533..eba0352 100644
--- a/src/backend/utils/Makefile
+++ b/src/backend/utils/Makefile
@@ -9,7 +9,7 @@ top_builddir = ../../..
include $(top_builddir)/src/Makefile.global
OBJS = fmgrtab.o
-SUBDIRS = adt cache error fmgr hash init mb misc mmgr resowner sort time
+SUBDIRS = adt cache error fmgr hash init mb misc mmgr mvstats resowner sort time
# location of Catalog.pm
catalogdir = $(top_srcdir)/src/backend/catalog
diff --git a/src/backend/utils/cache/syscache.c b/src/backend/utils/cache/syscache.c
index bd27168..f61ef7e 100644
--- a/src/backend/utils/cache/syscache.c
+++ b/src/backend/utils/cache/syscache.c
@@ -43,6 +43,7 @@
#include "catalog/pg_foreign_server.h"
#include "catalog/pg_foreign_table.h"
#include "catalog/pg_language.h"
+#include "catalog/pg_mv_statistic.h"
#include "catalog/pg_namespace.h"
#include "catalog/pg_opclass.h"
#include "catalog/pg_operator.h"
@@ -499,6 +500,17 @@ static const struct cachedesc cacheinfo[] = {
},
4
},
+ {MvStatisticRelationId, /* MVSTATOID */
+ MvStatisticOidIndexId,
+ 1,
+ {
+ ObjectIdAttributeNumber,
+ 0,
+ 0,
+ 0
+ },
+ 128
+ },
{NamespaceRelationId, /* NAMESPACENAME */
NamespaceNameIndexId,
1,
diff --git a/src/backend/utils/mvstats/Makefile b/src/backend/utils/mvstats/Makefile
new file mode 100644
index 0000000..099f1ed
--- /dev/null
+++ b/src/backend/utils/mvstats/Makefile
@@ -0,0 +1,17 @@
+#-------------------------------------------------------------------------
+#
+# Makefile--
+# Makefile for utils/mvstats
+#
+# IDENTIFICATION
+# src/backend/utils/mvstats/Makefile
+#
+#-------------------------------------------------------------------------
+
+subdir = src/backend/utils/mvstats
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+
+OBJS = common.o dependencies.o
+
+include $(top_srcdir)/src/backend/common.mk
diff --git a/src/backend/utils/mvstats/common.c b/src/backend/utils/mvstats/common.c
new file mode 100644
index 0000000..8efc5ba
--- /dev/null
+++ b/src/backend/utils/mvstats/common.c
@@ -0,0 +1,342 @@
+/*-------------------------------------------------------------------------
+ *
+ * common.c
+ * POSTGRES multivariate statistics
+ *
+ *
+ * Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ * src/backend/utils/mvstats/common.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "common.h"
+
+static VacAttrStats ** lookup_var_attr_stats(int2vector *attrs,
+ int natts, VacAttrStats **vacattrstats);
+
+
+/*
+ * Compute requested multivariate stats, using the rows sampled for the
+ * plain (single-column) stats.
+ *
+ * This fetches a list of stats from pg_mv_statistic, computes the stats
+ * and serializes them back into the catalog (as bytea values).
+ */
+void
+build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
+ int natts, VacAttrStats **vacattrstats)
+{
+ int i;
+ MVStats mvstats;
+ int nmvstats;
+
+ /*
+ * Fetch defined MV groups from pg_mv_statistic, and then compute
+ * the MV statistics (functional dependencies for now).
+ */
+ mvstats = list_mv_stats(RelationGetRelid(onerel), &nmvstats, false);
+
+ for (i = 0; i < nmvstats; i++)
+ {
+ MVDependencies deps = NULL;
+
+ /* int2 vector of attnums the stats should be computed on */
+ int2vector * attrs = mvstats[i].stakeys;
+
+ /* filter only the interesting vacattrstats records */
+ VacAttrStats **stats = lookup_var_attr_stats(attrs, natts, vacattrstats);
+
+ /* check allowed number of dimensions */
+ Assert((attrs->dim1 >= 2) && (attrs->dim1 <= MVSTATS_MAX_DIMENSIONS));
+
+ /*
+ * Analyze functional dependencies of columns.
+ */
+ deps = build_mv_dependencies(numrows, rows, attrs, stats);
+
+ /* store the dependencies in the catalog */
+ update_mv_stats(mvstats[i].mvoid, deps);
+ }
+}
+
+/*
+ * Lookup the VacAttrStats info for the selected columns, with indexes
+ * matching the attrs vector (to make it easy to work with when
+ * computing multivariate stats).
+ */
+static VacAttrStats **
+lookup_var_attr_stats(int2vector *attrs, int natts, VacAttrStats **vacattrstats)
+{
+ int i, j;
+ int numattrs = attrs->dim1;
+ VacAttrStats **stats = (VacAttrStats**)palloc0(numattrs * sizeof(VacAttrStats*));
+
+ /* lookup VacAttrStats info for the requested columns (same attnum) */
+ for (i = 0; i < numattrs; i++)
+ {
+ stats[i] = NULL;
+ for (j = 0; j < natts; j++)
+ {
+ if (attrs->values[i] == vacattrstats[j]->tupattnum)
+ {
+ stats[i] = vacattrstats[j];
+ break;
+ }
+ }
+
+ /*
+ * Check that we found the info, that the attnum matches,
+ * and that the requested 'lt' operator is available for
+ * the column's data type.
+ */
+ Assert(stats[i] != NULL);
+ Assert(stats[i]->tupattnum == attrs->values[i]);
+
+ /* FIXME This is a rather ugly way to check for 'ltopr' (which
+ * is defined for 'scalar' attributes).
+ */
+ Assert(((StdAnalyzeData *)stats[i]->extra_data)->ltopr != InvalidOid);
+ }
+
+ return stats;
+}
+
+/*
+ * Fetch list of MV stats defined on a table, without the actual data
+ * for histograms, MCV lists etc.
+ */
+MVStats
+list_mv_stats(Oid relid, int *nstats, bool built_only)
+{
+ Relation indrel;
+ SysScanDesc indscan;
+ ScanKeyData skey;
+ HeapTuple htup;
+ MVStats result;
+
+ /* start with 16 items, that should be enough for most cases */
+ int maxitems = 16;
+ result = (MVStats)palloc0(sizeof(MVStatsData) * maxitems);
+ *nstats = 0;
+
+ /* Prepare to scan pg_mv_statistic for entries having starelid = this rel. */
+ ScanKeyInit(&skey,
+ Anum_pg_mv_statistic_starelid,
+ BTEqualStrategyNumber, F_OIDEQ,
+ ObjectIdGetDatum(relid));
+
+ indrel = heap_open(MvStatisticRelationId, AccessShareLock);
+ indscan = systable_beginscan(indrel, MvStatisticRelidIndexId, true,
+ NULL, 1, &skey);
+
+ while (HeapTupleIsValid(htup = systable_getnext(indscan)))
+ {
+ Form_pg_mv_statistic stats = (Form_pg_mv_statistic) GETSTRUCT(htup);
+
+ /*
+ * Skip statistics that were not computed yet (if only stats
+ * that were already built were requested)
+ */
+ if (built_only && (! stats->deps_built))
+ continue;
+
+ /* double the array size if needed */
+ if (*nstats == maxitems)
+ {
+ maxitems *= 2;
+ result = (MVStats)repalloc(result, sizeof(MVStatsData) * maxitems);
+ }
+
+ result[*nstats].mvoid = HeapTupleGetOid(htup);
+ result[*nstats].stakeys = buildint2vector(stats->stakeys.values, stats->stakeys.dim1);
+ result[*nstats].deps_built = stats->deps_built;
+ *nstats += 1;
+ }
+
+ systable_endscan(indscan);
+
+ heap_close(indrel, AccessShareLock);
+
+ /* TODO maybe save the list into relcache, as in RelationGetIndexList
+ * (which was used as an inspiration for this one)? */
+
+ return result;
+}
+
+void
+update_mv_stats(Oid mvoid, MVDependencies dependencies)
+{
+ HeapTuple stup,
+ oldtup;
+ Datum values[Natts_pg_mv_statistic];
+ bool nulls[Natts_pg_mv_statistic];
+ bool replaces[Natts_pg_mv_statistic];
+
+ Relation sd = heap_open(MvStatisticRelationId, RowExclusiveLock);
+
+ memset(nulls, 1, Natts_pg_mv_statistic * sizeof(bool));
+ memset(replaces, 0, Natts_pg_mv_statistic * sizeof(bool));
+ memset(values, 0, Natts_pg_mv_statistic * sizeof(Datum));
+
+ /*
+ * Construct a new pg_mv_statistic tuple - replace only the
+ * dependencies, depending on whether they were actually computed.
+ */
+ if (dependencies != NULL)
+ {
+ nulls[Anum_pg_mv_statistic_stadeps -1] = false;
+ values[Anum_pg_mv_statistic_stadeps - 1]
+ = PointerGetDatum(serialize_mv_dependencies(dependencies));
+ }
+
+ /* always replace the value (either by bytea or NULL) */
+ replaces[Anum_pg_mv_statistic_stadeps -1] = true;
+
+ /* always change the availability flags */
+ nulls[Anum_pg_mv_statistic_deps_built -1] = false;
+
+ replaces[Anum_pg_mv_statistic_deps_built-1] = true;
+
+ values[Anum_pg_mv_statistic_deps_built-1] = BoolGetDatum(dependencies != NULL);
+
+ /* Is there already a pg_mv_statistic tuple for this attribute? */
+ oldtup = SearchSysCache1(MVSTATOID,
+ ObjectIdGetDatum(mvoid));
+
+ if (HeapTupleIsValid(oldtup))
+ {
+ /* Yes, replace it */
+ stup = heap_modify_tuple(oldtup,
+ RelationGetDescr(sd),
+ values,
+ nulls,
+ replaces);
+ ReleaseSysCache(oldtup);
+ simple_heap_update(sd, &stup->t_self, stup);
+ }
+ else
+ elog(ERROR, "invalid pg_mv_statistic record (oid=%u)", mvoid);
+
+ /* update indexes too */
+ CatalogUpdateIndexes(sd, stup);
+
+ heap_freetuple(stup);
+
+ heap_close(sd, RowExclusiveLock);
+}
+
+/* multi-variate stats comparator */
+
+/*
+ * qsort_arg comparator for sorting Datums (MV stats)
+ *
+ * This does not maintain the tupnoLink array.
+ */
+int
+compare_scalars_simple(const void *a, const void *b, void *arg)
+{
+ Datum da = *(Datum*)a;
+ Datum db = *(Datum*)b;
+ SortSupport ssup= (SortSupport) arg;
+
+ return ApplySortComparator(da, false, db, false, ssup);
+}
+
+/*
+ * qsort_arg comparator for sorting data when partitioning a MV bucket
+ */
+int
+compare_scalars_partition(const void *a, const void *b, void *arg)
+{
+ Datum da = ((ScalarItem*)a)->value;
+ Datum db = ((ScalarItem*)b)->value;
+ SortSupport ssup= (SortSupport) arg;
+
+ return ApplySortComparator(da, false, db, false, ssup);
+}
+
+/* initialize multi-dimensional sort */
+MultiSortSupport
+multi_sort_init(int ndims)
+{
+ MultiSortSupport mss;
+
+ Assert(ndims >= 2);
+
+ mss = (MultiSortSupport)palloc0(offsetof(MultiSortSupportData, ssup)
+ + sizeof(SortSupportData)*ndims);
+
+ mss->ndims = ndims;
+
+ return mss;
+}
+
+/*
+ * add sort info for dimension 'dim' (index into vacattrstats) to mss,
+ * at the position 'sortattr'
+ */
+void
+multi_sort_add_dimension(MultiSortSupport mss, int sortdim,
+ int dim, VacAttrStats **vacattrstats)
+{
+ /* first, lookup StdAnalyzeData for the dimension (attribute) */
+ SortSupportData ssup;
+ StdAnalyzeData *tmp = (StdAnalyzeData *)vacattrstats[dim]->extra_data;
+
+ Assert(mss != NULL);
+ Assert(sortdim < mss->ndims);
+
+ /* initialize sort support, etc. */
+ memset(&ssup, 0, sizeof(ssup));
+ ssup.ssup_cxt = CurrentMemoryContext;
+
+ /* We always use the default collation for statistics */
+ ssup.ssup_collation = DEFAULT_COLLATION_OID;
+ ssup.ssup_nulls_first = false;
+
+ PrepareSortSupportFromOrderingOp(tmp->ltopr, &ssup);
+
+ mss->ssup[sortdim] = ssup;
+}
+
+/* compare all the dimensions in the selected order */
+int
+multi_sort_compare(const void *a, const void *b, void *arg)
+{
+ int i;
+ SortItem *ia = (SortItem*)a;
+ SortItem *ib = (SortItem*)b;
+
+ MultiSortSupport mss = (MultiSortSupport)arg;
+
+ for (i = 0; i < mss->ndims; i++)
+ {
+ int compare;
+
+ compare = ApplySortComparator(ia->values[i], ia->isnull[i],
+ ib->values[i], ib->isnull[i],
+ &mss->ssup[i]);
+
+ if (compare != 0)
+ return compare;
+
+ }
+
+ /* equal by default */
+ return 0;
+}
+
+/* compare selected dimension */
+int
+multi_sort_compare_dim(int dim, const SortItem *a, const SortItem *b,
+ MultiSortSupport mss)
+{
+ return ApplySortComparator(a->values[dim], a->isnull[dim],
+ b->values[dim], b->isnull[dim],
+ &mss->ssup[dim]);
+}
diff --git a/src/backend/utils/mvstats/common.h b/src/backend/utils/mvstats/common.h
new file mode 100644
index 0000000..6d5465b
--- /dev/null
+++ b/src/backend/utils/mvstats/common.h
@@ -0,0 +1,75 @@
+/*-------------------------------------------------------------------------
+ *
+ * common.h
+ * POSTGRES multivariate statistics
+ *
+ *
+ * Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ * src/backend/utils/mvstats/common.h
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres.h"
+
+#include "access/tuptoaster.h"
+#include "catalog/indexing.h"
+#include "catalog/pg_collation.h"
+#include "catalog/pg_mv_statistic.h"
+#include "foreign/fdwapi.h"
+#include "postmaster/autovacuum.h"
+#include "storage/lmgr.h"
+#include "utils/datum.h"
+#include "utils/sortsupport.h"
+#include "utils/syscache.h"
+#include "utils/fmgroids.h"
+#include "utils/builtins.h"
+#include "access/sysattr.h"
+
+#include "utils/mvstats.h"
+
+/* FIXME private structure copied from analyze.c */
+
+typedef struct
+{
+ Oid eqopr; /* '=' operator for datatype, if any */
+ Oid eqfunc; /* and associated function */
+ Oid ltopr; /* '<' operator for datatype, if any */
+} StdAnalyzeData;
+
+typedef struct
+{
+ Datum value; /* a data value */
+ int tupno; /* position index for tuple it came from */
+} ScalarItem;
+
+/* multi-sort */
+typedef struct MultiSortSupportData {
+ int ndims; /* number of dimensions supported by the */
+ SortSupportData ssup[1]; /* sort support data for each dimension */
+} MultiSortSupportData;
+
+typedef MultiSortSupportData* MultiSortSupport;
+
+typedef struct SortItem {
+ Datum *values;
+ bool *isnull;
+} SortItem;
+
+MultiSortSupport multi_sort_init(int ndims);
+
+void multi_sort_add_dimension(MultiSortSupport mss, int sortdim,
+ int dim, VacAttrStats **vacattrstats);
+
+int multi_sort_compare(const void *a, const void *b, void *arg);
+
+int multi_sort_compare_dim(int dim, const SortItem *a,
+ const SortItem *b, MultiSortSupport mss);
+
+/* comparators, used when constructing multivariate stats */
+int compare_scalars_simple(const void *a, const void *b, void *arg);
+int compare_scalars_partition(const void *a, const void *b, void *arg);
diff --git a/src/backend/utils/mvstats/dependencies.c b/src/backend/utils/mvstats/dependencies.c
new file mode 100644
index 0000000..9e7f294
--- /dev/null
+++ b/src/backend/utils/mvstats/dependencies.c
@@ -0,0 +1,680 @@
+/*-------------------------------------------------------------------------
+ *
+ * dependencies.c
+ * POSTGRES multivariate functional dependencies
+ *
+ *
+ * Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ * src/backend/utils/mvstats/dependencies.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "common.h"
+#include "utils/lsyscache.h"
+
+/*
+ * Mine functional dependencies between columns, in the form (A => B),
+ * meaning that a value in column 'A' determines value in 'B'. A simple
+ * artificial example may be a table created like this
+ *
+ * CREATE TABLE deptest (a INT, b INT)
+ * AS SELECT i, i/10 FROM generate_series(1,100000) s(i);
+ *
+ * Clearly, once we know the value for 'A' we can easily determine the
+ * value of 'B' by dividing (A/10). A more practical example may be
+ * addresses, where (ZIP code => city name), i.e. once we know the ZIP,
+ * we probably know which city it belongs to. Larger cities usually have
+ * multiple ZIP codes, so the dependency can't be reversed.
+ *
+ * Functional dependencies are a concept well described in relational
+ * theory, especially in definition of normalization and "normal forms".
+ * Wikipedia has a nice definition of a functional dependency [1]:
+ *
+ * In a given table, an attribute Y is said to have a functional
+ * dependency on a set of attributes X (written X -> Y) if and only
+ * if each X value is associated with precisely one Y value. For
+ * example, in an "Employee" table that includes the attributes
+ * "Employee ID" and "Employee Date of Birth", the functional
+ * dependency {Employee ID} -> {Employee Date of Birth} would hold.
+ * It follows from the previous two sentences that each {Employee ID}
+ * is associated with precisely one {Employee Date of Birth}.
+ *
+ * [1] http://en.wikipedia.org/wiki/Database_normalization
+ *
+ * Most datasets might be normalized not to contain any such functional
+ * dependencies, but sometimes it's not practical. In some cases it's
+ * actually a conscious choice to model the dataset in denormalized way,
+ * either because of performance or to make querying easier.
+ *
+ * The current implementation supports only dependencies between two
+ * columns, but this is merely a simplification of the initial patch.
+ * It's certainly useful to mine for dependencies involving multiple
+ * columns on the 'left' side, i.e. a condition for the dependency.
+ * That is dependencies [A,B] => C and so on.
+ *
+ * TODO The implementation may/should be smart enough not to mine both
+ * [A => B] and [A,C => B], because the second dependency is a
+ * consequence of the first one (if values of A determine values
+ * of B, adding another column won't change that). The ANALYZE
+ * should first analyze 1:1 dependencies, then 2:1 dependencies
+ * (and skip the already identified ones), etc.
+ *
+ * For example the dependency [city name => zip code] is much weaker
+ * than [city name, state name => zip code], because there may be
+ * multiple cities with the same name in various states. It's not
+ * perfect though - there are probably cities with the same name within
+ * the same state, but this is hopefully a relatively rare occurrence.
+ * More about this in the section about dependency mining.
+ *
+ * Handling multiple columns on the right side is not necessary, as such
+ * dependencies may be decomposed into a set of dependencies with
+ * the same meaning, one for each column on the right side. For example
+ *
+ * A => [B,C]
+ *
+ * is exactly the same as
+ *
+ * (A => B) & (A => C).
+ *
+ * Of course, storing (A => [B, C]) may be more efficient than storing
+ * the two dependencies (A => B) and (A => C) separately.
+ *
+ *
+ * Dependency mining (ANALYZE)
+ * ---------------------------
+ *
+ * The current build algorithm is rather simple - for each pair [A,B] of
+ * columns, the data are sorted lexicographically (first by A, then B),
+ * and then a number of metrics is computed by walking the sorted data.
+ *
+ * In general the algorithm counts distinct values of A (forming groups
+ * thanks to the sorting), supporting or contradicting the hypothesis
+ * that A => B (i.e. that values of B are predetermined by A). If there
+ * are multiple values of B for a single value of A, it's counted as
+ * contradicting.
+ *
+ * A group may be neither supporting nor contradicting. To be counted as
+ * supporting, the group has to have at least min_group_size(=3) rows.
+ * Smaller 'supporting' groups are counted as neutral.
+ *
+ * Finally, the number of rows in supporting and contradicting groups is
+ * compared, and if there is at least 10x more supporting rows, the
+ * dependency is considered valid.
+ *
+ *
+ * Real-world datasets are imperfect - there may be errors (e.g. due to
+ * data-entry mistakes), or factually correct records, yet contradicting
+ * the dependency (e.g. when a city splits into two, but both keep the
+ * same ZIP code). A strict ANALYZE implementation (where the functional
+ * dependencies are identified) would ignore dependencies on such noisy
+ * data, making the approach unusable in practice.
+ *
+ * The proposed implementation attempts to handle such noisy cases
+ * gracefully, by tolerating small number of contradicting cases.
+ *
+ * In the future this might also perform some sort of test and decide
+ * whether it's worth building any other kind of multivariate stats,
+ * or whether the dependencies sufficiently describe the data. Or at
+ * least not build the MCV list / histogram on the implied columns.
+ * Such reduction would however make the 'verification' (see the next
+ * section) impossible.
+ *
+ *
+ * Clause reduction (planner/optimizer)
+ * ------------------------------------
+ *
+ * Applying the dependencies is quite simple - given a list of clauses,
+ * try to apply all the dependencies. For example given clause list
+ *
+ * (a = 1) AND (b = 1) AND (c = 1) AND (d < 100)
+ *
+ * and dependencies [a=>b] and [a=>d], this may be reduced to
+ *
+ * (a = 1) AND (c = 1) AND (d < 100)
+ *
+ * The (d<100) can't be reduced as it's not an equality clause, so the
+ * dependency [a=>d] can't be applied.
+ *
+ * See clauselist_apply_dependencies() for more details.
+ *
+ * The problem with the reduction is that the query may use conditions
+ * that are not redundant, but in fact contradictory - e.g. the user
+ * may search for a ZIP code and a city name not matching the ZIP code.
+ *
+ * In such cases, the condition on the city name is not
+ * redundant, but actually contradictory (making the result empty), and
+ * removing it while estimating the cardinality will make the estimate
+ * worse.
+ *
+ * The current estimation assuming independence (and multiplying the
+ * selectivities) works better in this case, but only by utter luck.
+ *
+ * In some cases this might be verified using the other multivariate
+ * statistics - MCV lists and histograms. For MCV lists the verification
+ * might be very simple - peek into the list if there are any items
+ * matching the clause on the 'A' column (e.g. ZIP code), and if such
+ * item is found, check that the 'B' column matches the other clause.
+ * If it does not, the clauses are contradictory. We can't really say
+ * if such item was not found, except maybe restricting the selectivity
+ * using the MCV data (e.g. using min/max selectivity, or something).
+ *
+ * With histograms, it might work similarly - we can't check the values
+ * directly (because histograms use buckets, unlike MCV lists, storing
+ * the actual values). So we can only observe the buckets matching the
+ * clauses - if those buckets have very low frequency, it probably means
+ * the two clauses are incompatible.
+ *
+ * It's unclear what 'low frequency' is, but if one of the clauses is
+ * implied (automatically true because of the other clause), then
+ *
+ * selectivity[clause(A)] = selectivity[clause(A) & clause(B)]
+ *
+ * So we might compute selectivity of the first clause (on the column
+ * A in dependency [A=>B]) - for example using regular statistics.
+ * And then check if the selectivity computed from the histogram is
+ * about the same (or significantly lower).
+ *
+ * The problem is that histograms work well only when the data ordering
+ * matches the natural meaning. For values that serve as labels - like
+ * city names or ZIP codes, or even generated IDs, histograms really
+ * don't work all that well. For example sorting cities by name won't
+ * match the sorting of ZIP codes, rendering the histogram unusable.
+ *
+ * The MCV are probably going to work much better, because they don't
+ * really assume any sort of ordering. And it's probably more appropriate
+ * for the label-like data.
+ *
+ * TODO Support dependencies with multiple columns on left/right.
+ *
+ * TODO Investigate using histogram and MCV list to confirm the
+ * functional dependencies.
+ *
+ * TODO Investigate statistical testing of the distribution (to decide
+ * whether it makes sense to build the histogram/MCV list).
+ *
+ * TODO Using a min/max of selectivities would probably make more sense
+ * for the associated columns.
+ *
+ * TODO Consider eliminating the implied columns from the histogram and
+ * MCV lists (but maybe that's not a good idea, because that'd make
+ * it impossible to use these stats for non-equality clauses and
+ * also it wouldn't be possible to use the stats for verification
+ * of the dependencies as proposed in another TODO).
+ *
+ * TODO This builds a complete set of dependencies, i.e. including
+ * transitive dependencies - if we identify [A => B] and [B => C],
+ * we're likely to identify [A => C] too. It might be better to
+ * keep only the minimal set of dependencies, i.e. prune all the
+ * dependencies that we can recreate by transitivity.
+ *
+ * There are two conceptual ways to do that:
+ *
+ * (a) generate all the rules, and then prune the rules that may
+ * be recreated by combining other dependencies, or
+ *
+ * (b) performing the 'is combination of other dependencies' check
+ * before actually doing the work
+ *
+ * The second option has the advantage that we don't really need
+ * to perform the sort/count. It's not sufficient alone, though,
+ * because we may discover the dependencies in the wrong order.
+ * For example [A => B], [A => C] and then [B => C]. None of those
+ * dependencies is a combination of the already known ones, yet
+ * [A => C] is a combination of [A => B] and [B => C].
+ *
+ * FIXME Not sure the current NULL handling makes much sense. We assume
+ * that NULL is 0, so it's handled like a regular value
+ * (NULL == NULL), so all NULLs in a single column form a single
+ * group. Maybe that's not the right thing to do, especially with
+ * equality conditions - in that case NULLs are irrelevant. So
+ * maybe the right solution would be to just ignore NULL values?
+ *
+ * However simply "ignoring" the NULL values does not seem like
+ * a good idea - imagine columns A and B, where for each value of
+ * A, values in B are constant (same for the whole group) or NULL.
+ * Let's say only 10% of B values in each group is not NULL. Then
+ * ignoring the NULL values will result in 10x misestimate (and
+ * it's trivial to construct arbitrary errors). So maybe handling
+ * NULL values just like a regular value is the right thing here.
+ *
+ * Or maybe NULL values should be treated differently on each side
+ * of the dependency? E.g. as ignored on the left (condition) and
+ * as regular values on the right - this seems consistent with how
+ * equality clauses work, as equality clause means 'NOT NULL'.
+ * So if we say [A => B] then it may also imply "NOT NULL" on the
+ * right side.
+ */
+MVDependencies
+build_mv_dependencies(int numrows, HeapTuple *rows, int2vector *attrs,
+ VacAttrStats **stats)
+{
+ int i;
+ int numattrs = attrs->dim1;
+
+ /* result */
+ int ndeps = 0;
+ MVDependencies dependencies = NULL;
+ MultiSortSupport mss = multi_sort_init(2); /* 2 dimensions for now */
+
+ /* TODO Maybe this should be somehow related to the number of
+ * distinct values in the two columns we're currently analyzing.
+ * Assuming the distribution is uniform, we can estimate the
+ * average group size we'd expect in the sample, and use that
+ * as a threshold. That seems better than a static approach.
+ */
+ int min_group_size = 3;
+
+ /* dimension indexes we'll check for associations [a => b] */
+ int dima, dimb;
+
+ /*
+ * We'll reuse the same array for all the 2-column combinations.
+ *
+ * It's possible to sort the sample rows directly, but this seemed
+ * somewhat simpler / less error prone. Another option would be to
+ * allocate the arrays for each SortItem separately, but that'd be
+ * significant overhead (not just CPU, but especially memory bloat).
+ */
+ SortItem * items = (SortItem*)palloc0(numrows * sizeof(SortItem));
+
+ Datum *values = (Datum*)palloc0(sizeof(Datum) * numrows * 2);
+ bool *isnull = (bool*)palloc0(sizeof(bool) * numrows * 2);
+
+ for (i = 0; i < numrows; i++)
+ {
+ items[i].values = &values[i * 2];
+ items[i].isnull = &isnull[i * 2];
+ }
+
+ Assert(numattrs >= 2);
+
+ /*
+ * Evaluate all possible combinations of [A => B], using a simple algorithm:
+ *
+ * (a) sort the data by [A,B]
+ * (b) split the data into groups by A (new group whenever a value changes)
+ * (c) count different values in the B column (again, value changes)
+ *
+ * TODO It should be rather simple to merge [A => B] and [A => C] into
+ * [A => B,C]. Just keep A constant, collect all the "implied" columns
+ * and you're done.
+ */
+ for (dima = 0; dima < numattrs; dima++)
+ {
+ /* prepare the sort function for the first dimension */
+ multi_sort_add_dimension(mss, 0, dima, stats);
+
+ for (dimb = 0; dimb < numattrs; dimb++)
+ {
+ SortItem current;
+
+ /* number of groups supporting / contradicting the dependency */
+ int n_supporting = 0;
+ int n_contradicting = 0;
+
+ /* counters valid within a group */
+ int group_size = 0;
+ int n_violations = 0;
+
+ int n_supporting_rows = 0;
+ int n_contradicting_rows = 0;
+
+ /* make sure the columns are different (i.e. skip A => A) */
+ if (dima == dimb)
+ continue;
+
+ /* prepare the sort function for the second dimension */
+ multi_sort_add_dimension(mss, 1, dimb, stats);
+
+ /* reset the values and isnull flags */
+ memset(values, 0, sizeof(Datum) * numrows * 2);
+ memset(isnull, 0, sizeof(bool) * numrows * 2);
+
+ /* accumulate all the data for both columns into an array and sort it */
+ for (i = 0; i < numrows; i++)
+ {
+ items[i].values[0]
+ = heap_getattr(rows[i], attrs->values[dima],
+ stats[dima]->tupDesc, &items[i].isnull[0]);
+
+ items[i].values[1]
+ = heap_getattr(rows[i], attrs->values[dimb],
+ stats[dimb]->tupDesc, &items[i].isnull[1]);
+ }
+
+ qsort_arg((void *) items, numrows, sizeof(SortItem),
+ multi_sort_compare, mss);
+
+ /*
+ * Walk through the array, split it into rows according to
+ * the A value, and count distinct values in the other one.
+ * If there's a single B value for the whole group, we count
+ * it as supporting the association, otherwise we count it
+ * as contradicting.
+ *
+ * Furthermore we require a group to have at least a certain
+ * number of rows to be considered useful for supporting the
+ * dependency. But a contradicting group is always counted.
+ */
+
+ /* start with values from the first row */
+ current = items[0];
+ group_size = 1;
+
+ for (i = 1; i < numrows; i++)
+ {
+ /* end of the group */
+ if (multi_sort_compare_dim(0, &items[i], &current, mss) != 0)
+ {
+ /*
+ * If there are no contradicting rows, count it as
+ * supporting (otherwise contradicting), but only if
+ * the group is large enough.
+ *
+ * The requirement of a minimum group size makes it
+ * impossible to identify [unique,unique] cases, but
+ * that's probably a different case. This is more
+ * about [zip => city] associations etc.
+ *
+ * If there are violations, count the group/rows as
+ * a violation.
+ *
+ * It may be neither, if the group is too small (does
+ * not contain at least min_group_size rows).
+ */
+ if ((n_violations == 0) && (group_size >= min_group_size))
+ {
+ n_supporting += 1;
+ n_supporting_rows += group_size;
+ }
+ else if (n_violations > 0)
+ {
+ n_contradicting += 1;
+ n_contradicting_rows += group_size;
+ }
+
+ /* current values start a new group */
+ n_violations = 0;
+ group_size = 0;
+ }
+ /* mismatch of a B value is contradicting */
+ else if (multi_sort_compare_dim(1, &items[i], &current, mss) != 0)
+ {
+ n_violations += 1;
+ }
+
+ current = items[i];
+ group_size += 1;
+ }
+
+ /* handle the last group (just like above) */
+ if ((n_violations == 0) && (group_size >= min_group_size))
+ {
+ n_supporting += 1;
+ n_supporting_rows += group_size;
+ }
+ else if (n_violations)
+ {
+ n_contradicting += 1;
+ n_contradicting_rows += group_size;
+ }
+
+ /*
+ * See if the number of rows supporting the association is at least
+ * 10x the number of rows violating the hypothetical dependency.
+ *
+ * TODO This is rather arbitrary limit - I guess it's possible to do
+ * some math to come up with a better rule (e.g. testing a hypothesis
+ * 'this is due to randomness'). We can create a contingency table
+ * from the values and use it for testing. Possibly only when
+ * there are no contradicting rows?
+ *
+ * TODO Also, if (a => b) and (b => a) at the same time, it pretty much
+ * means there's a 1:1 relation (or one is a 'label'), making the
+ * conditions rather redundant. Although it's possible that the
+ * query uses incompatible combination of values.
+ */
+ if (n_supporting_rows > (n_contradicting_rows * 10))
+ {
+ if (dependencies == NULL)
+ {
+ dependencies = (MVDependencies)palloc0(sizeof(MVDependenciesData));
+ dependencies->magic = MVSTAT_DEPS_MAGIC;
+ }
+ else
+ dependencies = repalloc(dependencies, offsetof(MVDependenciesData, deps)
+ + sizeof(MVDependency) * (dependencies->ndeps + 1));
+
+ /* update the list of dependencies */
+ dependencies->deps[ndeps] = (MVDependency)palloc0(sizeof(MVDependencyData));
+ dependencies->deps[ndeps]->a = attrs->values[dima];
+ dependencies->deps[ndeps]->b = attrs->values[dimb];
+
+ dependencies->ndeps = (++ndeps);
+ }
+ }
+ }
+
+ pfree(items);
+ pfree(values);
+ pfree(isnull);
+ pfree(stats);
+ pfree(mss);
+
+ return dependencies;
+}
+
+/*
+ * Store the dependencies into a bytea, so that it can be stored in the
+ * pg_mv_statistic catalog.
+ *
+ * Currently this only supports simple two-column rules, and stores them
+ * as a sequence of attnum pairs. In the future, this needs to be made
+ * more complex to support multiple columns on both sides of the
+ * implication (using AND on left, OR on right).
+ */
+bytea *
+serialize_mv_dependencies(MVDependencies dependencies)
+{
+ int i;
+
+ /* we need to store ndeps, and each needs 2 * int16 */
+ Size len = VARHDRSZ + offsetof(MVDependenciesData, deps)
+ + dependencies->ndeps * (sizeof(int16) * 2);
+
+ bytea * output = (bytea*)palloc0(len);
+
+ char * tmp = VARDATA(output);
+
+ SET_VARSIZE(output, len);
+
+ /* first, store the number of dimensions / items */
+ memcpy(tmp, dependencies, offsetof(MVDependenciesData, deps));
+ tmp += offsetof(MVDependenciesData, deps);
+
+ /* walk through the dependencies and copy both columns into the bytea */
+ for (i = 0; i < dependencies->ndeps; i++)
+ {
+ memcpy(tmp, &(dependencies->deps[i]->a), sizeof(int16));
+ tmp += sizeof(int16);
+
+ memcpy(tmp, &(dependencies->deps[i]->b), sizeof(int16));
+ tmp += sizeof(int16);
+ }
+
+ return output;
+}
+
+/*
+ * Reads serialized dependencies into MVDependencies structure.
+ */
+MVDependencies
+deserialize_mv_dependencies(bytea * data)
+{
+ int i;
+ Size expected_size;
+ MVDependencies dependencies;
+ char *tmp;
+
+ if (data == NULL)
+ return NULL;
+
+ if (VARSIZE_ANY_EXHDR(data) < offsetof(MVDependenciesData,deps))
+ elog(ERROR, "invalid MVDependencies size %ld (expected at least %ld)",
+ VARSIZE_ANY_EXHDR(data), offsetof(MVDependenciesData,deps));
+
+ /* read the MVDependencies header */
+ dependencies = (MVDependencies)palloc0(sizeof(MVDependenciesData));
+
+ /* initialize pointer to the data part (skip the varlena header) */
+ tmp = VARDATA(data);
+
+ /* get the header and perform basic sanity checks */
+ memcpy(dependencies, tmp, offsetof(MVDependenciesData, deps));
+ tmp += offsetof(MVDependenciesData, deps);
+
+ if (dependencies->magic != MVSTAT_DEPS_MAGIC)
+ {
+ pfree(dependencies);
+ elog(WARNING, "not a MV Dependencies (magic number mismatch)");
+ return NULL;
+ }
+
+ Assert(dependencies->ndeps > 0);
+
+ /* what bytea size do we expect for those parameters */
+ expected_size = offsetof(MVDependenciesData,deps) +
+ dependencies->ndeps * sizeof(int16) * 2;
+
+ if (VARSIZE_ANY_EXHDR(data) != expected_size)
+ elog(ERROR, "invalid dependencies size %ld (expected %ld)",
+ VARSIZE_ANY_EXHDR(data), expected_size);
+
+ /* allocate space for the dependency pointers */
+ dependencies = repalloc(dependencies, offsetof(MVDependenciesData,deps)
+ + (dependencies->ndeps * sizeof(MVDependency)));
+
+ for (i = 0; i < dependencies->ndeps; i++)
+ {
+ dependencies->deps[i] = (MVDependency)palloc0(sizeof(MVDependencyData));
+
+ memcpy(&(dependencies->deps[i]->a), tmp, sizeof(int16));
+ tmp += sizeof(int16);
+
+ memcpy(&(dependencies->deps[i]->b), tmp, sizeof(int16));
+ tmp += sizeof(int16);
+ }
+
+ return dependencies;
+}
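+/*
+ * Illustration (not part of the patch logic): serialize/deserialize are
+ * meant to be inverse operations, so e.g. a debugging cross-check might
+ * look like this:
+ *
+ * MVDependencies copy
+ *     = deserialize_mv_dependencies(serialize_mv_dependencies(deps));
+ * Assert(copy->ndeps == deps->ndeps);
+ */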
+
+/* print some basic info about dependencies (number of dependencies) */
+Datum
+pg_mv_stats_dependencies_info(PG_FUNCTION_ARGS)
+{
+ bytea *data = PG_GETARG_BYTEA_P(0);
+ char *result;
+
+ MVDependencies dependencies = deserialize_mv_dependencies(data);
+
+ if (dependencies == NULL)
+ PG_RETURN_NULL();
+
+ result = palloc0(128);
+ snprintf(result, 128, "dependencies=%d", dependencies->ndeps);
+
+ /* FIXME free the deserialized data (pfree is not enough) */
+
+ PG_RETURN_TEXT_P(cstring_to_text(result));
+}
+
+/*
+ * Print the dependencies.
+ *
+ * TODO Would be nice if this knew the actual column names (instead of
+ * the attnums).
+ *
+ * FIXME This is really ugly and does not really check the lengths and
+ * strcpy/snprintf return values properly. Needs to be fixed.
+ */
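+/*
+ * Example output (as seen in the regression tests below):
+ *
+ * "1 => 2, 1 => 3, 2 => 3"
+ *
+ * i.e. a comma-separated list of (attnum => attnum) implications.
+ */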
+Datum
+pg_mv_stats_dependencies_show(PG_FUNCTION_ARGS)
+{
+ int i = 0;
+ bytea *data = PG_GETARG_BYTEA_P(0);
+ char *result = NULL;
+ int len = 0;
+
+ MVDependencies dependencies = deserialize_mv_dependencies(data);
+
+ if (dependencies == NULL)
+ PG_RETURN_NULL();
+
+ for (i = 0; i < dependencies->ndeps; i++)
+ {
+ MVDependency dependency = dependencies->deps[i];
+ char buffer[128];
+
+ int tmp = snprintf(buffer, 128, "%s%d => %d",
+ ((i == 0) ? "" : ", "), dependency->a, dependency->b);
+
+ if (tmp < 127)
+ {
+ if (result == NULL)
+ result = palloc0(len + tmp + 1);
+ else
+ result = repalloc(result, len + tmp + 1);
+
+ strcpy(result + len, buffer);
+ len += tmp;
+ }
+ }
+
+ PG_RETURN_TEXT_P(cstring_to_text(result));
+}
+
+bytea *
+fetch_mv_dependencies(Oid mvoid)
+{
+ Relation indrel;
+ SysScanDesc indscan;
+ ScanKeyData skey;
+ HeapTuple htup;
+ bytea *stadeps = NULL;
+
+ /* Prepare to scan pg_mv_statistic for the entry with OID = mvoid. */
+ ScanKeyInit(&skey,
+ ObjectIdAttributeNumber,
+ BTEqualStrategyNumber, F_OIDEQ,
+ ObjectIdGetDatum(mvoid));
+
+ indrel = heap_open(MvStatisticRelationId, AccessShareLock);
+ indscan = systable_beginscan(indrel, MvStatisticOidIndexId, true,
+ NULL, 1, &skey);
+
+ while (HeapTupleIsValid(htup = systable_getnext(indscan)))
+ {
+ bool isnull = false;
+ Datum deps = SysCacheGetAttr(MVSTATOID, htup,
+ Anum_pg_mv_statistic_stadeps, &isnull);
+
+ Assert(!isnull);
+
+ stadeps = DatumGetByteaP(deps);
+
+ break;
+ }
+
+ systable_endscan(indscan);
+
+ heap_close(indrel, AccessShareLock);
+
+ /* TODO Maybe save the result into relcache, as RelationGetIndexList
+ * (which inspired this function) does? */
+
+ return stadeps;
+}
diff --git a/src/include/catalog/indexing.h b/src/include/catalog/indexing.h
index a680229..22bb781 100644
--- a/src/include/catalog/indexing.h
+++ b/src/include/catalog/indexing.h
@@ -173,6 +173,11 @@ DECLARE_UNIQUE_INDEX(pg_largeobject_loid_pn_index, 2683, on pg_largeobject using
DECLARE_UNIQUE_INDEX(pg_largeobject_metadata_oid_index, 2996, on pg_largeobject_metadata using btree(oid oid_ops));
#define LargeObjectMetadataOidIndexId 2996
+DECLARE_UNIQUE_INDEX(pg_mv_statistic_oid_index, 3380, on pg_mv_statistic using btree(oid oid_ops));
+#define MvStatisticOidIndexId 3380
+DECLARE_INDEX(pg_mv_statistic_relid_index, 3379, on pg_mv_statistic using btree(starelid oid_ops));
+#define MvStatisticRelidIndexId 3379
+
DECLARE_UNIQUE_INDEX(pg_namespace_nspname_index, 2684, on pg_namespace using btree(nspname name_ops));
#define NamespaceNameIndexId 2684
DECLARE_UNIQUE_INDEX(pg_namespace_oid_index, 2685, on pg_namespace using btree(oid oid_ops));
diff --git a/src/include/catalog/pg_mv_statistic.h b/src/include/catalog/pg_mv_statistic.h
new file mode 100644
index 0000000..81ec23b
--- /dev/null
+++ b/src/include/catalog/pg_mv_statistic.h
@@ -0,0 +1,69 @@
+/*-------------------------------------------------------------------------
+ *
+ * pg_mv_statistic.h
+ * definition of the system "multivariate statistic" relation (pg_mv_statistic)
+ * along with the relation's initial contents.
+ *
+ *
+ * Portions Copyright (c) 1996-2014, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/catalog/pg_mv_statistic.h
+ *
+ * NOTES
+ * the genbki.pl script reads this file and generates .bki
+ * information from the DATA() statements.
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef PG_MV_STATISTIC_H
+#define PG_MV_STATISTIC_H
+
+#include "catalog/genbki.h"
+
+/* ----------------
+ * pg_mv_statistic definition. cpp turns this into
+ * typedef struct FormData_pg_mv_statistic
+ * ----------------
+ */
+#define MvStatisticRelationId 3381
+
+CATALOG(pg_mv_statistic,3381)
+{
+ /* These fields form the unique key for the entry: */
+ Oid starelid; /* relation containing attributes */
+
+ /* statistics requested to build */
+ bool deps_enabled; /* analyze dependencies? */
+
+ /* statistics that are available (if requested) */
+ bool deps_built; /* dependencies were built */
+
+ /* variable-length fields start here, but we allow direct access to stakeys */
+ int2vector stakeys; /* array of column keys */
+
+#ifdef CATALOG_VARLEN
+ bytea stadeps; /* dependencies (serialized) */
+#endif
+
+} FormData_pg_mv_statistic;
+
+/* ----------------
+ * Form_pg_mv_statistic corresponds to a pointer to a tuple with
+ * the format of pg_mv_statistic relation.
+ * ----------------
+ */
+typedef FormData_pg_mv_statistic *Form_pg_mv_statistic;
+
+/* ----------------
+ * compiler constants for pg_mv_statistic
+ * ----------------
+ */
+#define Natts_pg_mv_statistic 5
+#define Anum_pg_mv_statistic_starelid 1
+#define Anum_pg_mv_statistic_deps_enabled 2
+#define Anum_pg_mv_statistic_deps_built 3
+#define Anum_pg_mv_statistic_stakeys 4
+#define Anum_pg_mv_statistic_stadeps 5
+
+#endif /* PG_MV_STATISTIC_H */
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
index 8890ade..f728d88 100644
--- a/src/include/catalog/pg_proc.h
+++ b/src/include/catalog/pg_proc.h
@@ -2712,6 +2712,11 @@ DESCR("current user privilege on any column by rel name");
DATA(insert OID = 3029 ( has_any_column_privilege PGNSP PGUID 12 10 0 0 0 f f f f t f s 2 0 16 "26 25" _null_ _null_ _null_ _null_ has_any_column_privilege_id _null_ _null_ _null_ ));
DESCR("current user privilege on any column by rel oid");
+DATA(insert OID = 3284 ( pg_mv_stats_dependencies_info PGNSP PGUID 12 1 0 0 0 f f f f t f i 1 0 25 "17" _null_ _null_ _null_ _null_ pg_mv_stats_dependencies_info _null_ _null_ _null_ ));
+DESCR("multivariate stats: functional dependencies info");
+DATA(insert OID = 3285 ( pg_mv_stats_dependencies_show PGNSP PGUID 12 1 0 0 0 f f f f t f i 1 0 25 "17" _null_ _null_ _null_ _null_ pg_mv_stats_dependencies_show _null_ _null_ _null_ ));
+DESCR("multivariate stats: functional dependencies show");
+
DATA(insert OID = 1928 ( pg_stat_get_numscans PGNSP PGUID 12 1 0 0 0 f f f f t f s 1 0 20 "26" _null_ _null_ _null_ _null_ pg_stat_get_numscans _null_ _null_ _null_ ));
DESCR("statistics: number of scans done for table/index");
DATA(insert OID = 1929 ( pg_stat_get_tuples_returned PGNSP PGUID 12 1 0 0 0 f f f f t f s 1 0 20 "26" _null_ _null_ _null_ _null_ pg_stat_get_tuples_returned _null_ _null_ _null_ ));
diff --git a/src/include/catalog/toasting.h b/src/include/catalog/toasting.h
index fb2f035..724a169 100644
--- a/src/include/catalog/toasting.h
+++ b/src/include/catalog/toasting.h
@@ -49,6 +49,7 @@ extern void BootstrapToastTable(char *relName,
DECLARE_TOAST(pg_attrdef, 2830, 2831);
DECLARE_TOAST(pg_constraint, 2832, 2833);
DECLARE_TOAST(pg_description, 2834, 2835);
+DECLARE_TOAST(pg_mv_statistic, 3288, 3289);
DECLARE_TOAST(pg_proc, 2836, 2837);
DECLARE_TOAST(pg_rewrite, 2838, 2839);
DECLARE_TOAST(pg_seclabel, 3598, 3599);
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index 38469ef..3a0e7c4 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -414,6 +414,7 @@ typedef enum NodeTag
T_WithClause,
T_CommonTableExpr,
T_RoleSpec,
+ T_StatisticsDef,
/*
* TAGS FOR REPLICATION GRAMMAR PARSE NODES (replnodes.h)
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 2893cef..81ca159 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -570,6 +570,14 @@ typedef struct ColumnDef
int location; /* parse location, or -1 if none/unknown */
} ColumnDef;
+typedef struct StatisticsDef
+{
+ NodeTag type;
+ List *keys; /* String nodes naming referenced column(s) */
+ List *options; /* list of DefElem nodes */
+} StatisticsDef;
+
+
/*
* TableLikeClause - CREATE TABLE ( ... LIKE ... ) clause
*/
@@ -1362,7 +1370,8 @@ typedef enum AlterTableType
AT_ReplicaIdentity, /* REPLICA IDENTITY */
AT_EnableRowSecurity, /* ENABLE ROW SECURITY */
AT_DisableRowSecurity, /* DISABLE ROW SECURITY */
- AT_GenericOptions /* OPTIONS (...) */
+ AT_GenericOptions, /* OPTIONS (...) */
+ AT_AddStatistics /* add statistics */
} AlterTableType;
typedef struct ReplicaIdentityStmt
diff --git a/src/include/utils/mvstats.h b/src/include/utils/mvstats.h
new file mode 100644
index 0000000..5c8643d
--- /dev/null
+++ b/src/include/utils/mvstats.h
@@ -0,0 +1,86 @@
+/*-------------------------------------------------------------------------
+ *
+ * mvstats.h
+ * Multivariate statistics and selectivity estimation functions.
+ *
+ *
+ * Portions Copyright (c) 1996-2014, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/utils/mvstats.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef MVSTATS_H
+#define MVSTATS_H
+
+#include "commands/vacuum.h"
+
+/*
+ * Basic info about the stats, used when choosing what to use
+ *
+ * TODO Add info about what statistics is available (histogram, MCV,
+ * hashed MCV, functional dependencies).
+ */
+typedef struct MVStatsData {
+ Oid mvoid; /* OID of the stats in pg_mv_statistic */
+ int2vector *stakeys; /* attnums for columns in the stats */
+ bool deps_built; /* functional dependencies available */
+} MVStatsData;
+
+typedef struct MVStatsData *MVStats;
+
+
+#define MVSTATS_MAX_DIMENSIONS 8 /* max number of attributes */
+
+/* An associative rule, tracking [a => b] dependency.
+ *
+ * TODO Make this work with multiple columns on both sides.
+ */
+typedef struct MVDependencyData {
+ int16 a;
+ int16 b;
+} MVDependencyData;
+
+typedef MVDependencyData* MVDependency;
+
+typedef struct MVDependenciesData {
+ uint32 magic; /* magic constant marker */
+ int32 ndeps; /* number of dependencies */
+ MVDependency deps[1]; /* XXX why not a pointer? */
+} MVDependenciesData;
+
+typedef MVDependenciesData* MVDependencies;
+
+#define MVSTAT_DEPS_MAGIC 0xB4549A2C /* marks serialized bytea */
+#define MVSTAT_DEPS_TYPE_BASIC 1 /* basic dependencies type */
+
+/*
+ * TODO Maybe fetching the histogram/MCV list separately is inefficient?
+ * Consider adding a single `fetch_stats` method, fetching all
+ * stats specified using flags (or something like that).
+ */
+MVStats list_mv_stats(Oid relid, int *nstats, bool built_only);
+
+bytea * fetch_mv_dependencies(Oid mvoid);
+
+bytea * serialize_mv_dependencies(MVDependencies dependencies);
+
+/* deserialization of stats (serialization is private to analyze) */
+MVDependencies deserialize_mv_dependencies(bytea * data);
+
+/* FIXME this probably belongs somewhere else (not to operations stats) */
+extern Datum pg_mv_stats_dependencies_info(PG_FUNCTION_ARGS);
+extern Datum pg_mv_stats_dependencies_show(PG_FUNCTION_ARGS);
+
+MVDependencies
+build_mv_dependencies(int numrows, HeapTuple *rows,
+ int2vector *attrs,
+ VacAttrStats **stats);
+
+void build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
+ int natts, VacAttrStats **vacattrstats);
+
+void update_mv_stats(Oid relid, MVDependencies dependencies);
+
+#endif
diff --git a/src/include/utils/syscache.h b/src/include/utils/syscache.h
index ba0b090..12147ab 100644
--- a/src/include/utils/syscache.h
+++ b/src/include/utils/syscache.h
@@ -66,6 +66,7 @@ enum SysCacheIdentifier
INDEXRELID,
LANGNAME,
LANGOID,
+ MVSTATOID,
NAMESPACENAME,
NAMESPACEOID,
OPERNAMENSP,
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 1788270..f0117ca 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1353,6 +1353,14 @@ pg_matviews| SELECT n.nspname AS schemaname,
LEFT JOIN pg_namespace n ON ((n.oid = c.relnamespace)))
LEFT JOIN pg_tablespace t ON ((t.oid = c.reltablespace)))
WHERE (c.relkind = 'm'::"char");
+pg_mv_stats| SELECT n.nspname AS schemaname,
+ c.relname AS tablename,
+ s.stakeys AS attnums,
+ length(s.stadeps) AS depsbytes,
+ pg_mv_stats_dependencies_info(s.stadeps) AS depsinfo
+ FROM ((pg_mv_statistic s
+ JOIN pg_class c ON ((c.oid = s.starelid)))
+ LEFT JOIN pg_namespace n ON ((n.oid = c.relnamespace)));
pg_policies| SELECT n.nspname AS schemaname,
c.relname AS tablename,
pol.polname AS policyname,
diff --git a/src/test/regress/expected/sanity_check.out b/src/test/regress/expected/sanity_check.out
index c7be273..00f5fe7 100644
--- a/src/test/regress/expected/sanity_check.out
+++ b/src/test/regress/expected/sanity_check.out
@@ -113,6 +113,7 @@ pg_inherits|t
pg_language|t
pg_largeobject|t
pg_largeobject_metadata|t
+pg_mv_statistic|t
pg_namespace|t
pg_opclass|t
pg_operator|t
--
2.0.5
Attachment: 0002-clause-reduction-using-functional-dependencies.patch (text/x-diff)
From 47a48180be115db2fa29ac659f4e4f259e01600d Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@pgaddict.com>
Date: Fri, 16 Jan 2015 22:33:41 +0100
Subject: [PATCH 2/5] clause reduction using functional dependencies
During planning, use functional dependencies to decide
which clauses to skip during cardinality estimation.
Initial and rather simplistic implementation.
This only works with regular WHERE clauses, not clauses
used for joining.
Note: clause_is_mv_compatible() needs to identify the relation
(so that we can fetch the list of multivariate stats by OID).
planner_rt_fetch() seems like the appropriate way to get the
relation OID, but apparently it only works with simple vars.
Maybe using examine_variable() would make this work with more
complex vars too?
Includes regression tests analyzing functional dependencies
(part of ANALYZE) on several datasets (no dependencies, no
transitive dependencies, ...).
Checks that a query with conditions on two columns, where one (B)
is functionally dependent on the other one (A), correctly ignores
the clause on (B) and chooses a bitmap index scan instead of a
plain index scan (which is what happens otherwise, thanks to the
assumption of independence).
Note: Functional dependencies only work with equality clauses,
no inequalities etc.
---
src/backend/commands/analyze.c | 1 +
src/backend/commands/tablecmds.c | 9 +-
src/backend/optimizer/path/clausesel.c | 650 +++++++++++++++++++++++++-
src/backend/utils/mvstats/common.c | 5 +-
src/include/catalog/pg_proc.h | 4 +-
src/include/utils/mvstats.h | 23 +-
src/test/regress/expected/mv_dependencies.out | 175 +++++++
src/test/regress/parallel_schedule | 3 +
src/test/regress/serial_schedule | 1 +
src/test/regress/sql/mv_dependencies.sql | 153 ++++++
10 files changed, 1013 insertions(+), 11 deletions(-)
create mode 100644 src/test/regress/expected/mv_dependencies.out
create mode 100644 src/test/regress/sql/mv_dependencies.sql
diff --git a/src/backend/commands/analyze.c b/src/backend/commands/analyze.c
index f82fcf5..e247f84 100644
--- a/src/backend/commands/analyze.c
+++ b/src/backend/commands/analyze.c
@@ -115,6 +115,7 @@ static void update_attstats(Oid relid, bool inh,
static Datum std_fetch_func(VacAttrStatsP stats, int rownum, bool *isNull);
static Datum ind_fetch_func(VacAttrStatsP stats, int rownum, bool *isNull);
+
/*
* analyze_rel() -- analyze one relation
*/
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index a321755..965d342 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -420,6 +420,7 @@ static void ATExecEnableRowSecurity(Relation rel);
static void ATExecDisableRowSecurity(Relation rel);
static void ATExecAddStatistics(AlteredTableInfo *tab, Relation rel,
StatisticsDef *def, LOCKMODE lockmode);
+
static void copy_relation_data(SMgrRelation rel, SMgrRelation dst,
ForkNumber forkNum, char relpersistence);
static const char *storage_name(char c);
@@ -11900,7 +11901,7 @@ static void ATExecAddStatistics(AlteredTableInfo *tab, Relation rel,
Relation mvstatrel;
/* by default build everything */
- bool build_dependencies = true;
+ bool build_dependencies = false;
Assert(IsA(def, StatisticsDef));
@@ -11962,6 +11963,12 @@ static void ATExecAddStatistics(AlteredTableInfo *tab, Relation rel,
opt->defname)));
}
+ /* check that at least some statistics were requested */
+ if (! build_dependencies)
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("no statistics type (dependencies) was requested")));
+
/* sort the attnums and build int2vector */
qsort(attnums, numcols, sizeof(int16), compare_int16);
stakeys = buildint2vector(attnums, numcols);
diff --git a/src/backend/optimizer/path/clausesel.c b/src/backend/optimizer/path/clausesel.c
index dcac1c1..e742827 100644
--- a/src/backend/optimizer/path/clausesel.c
+++ b/src/backend/optimizer/path/clausesel.c
@@ -24,6 +24,14 @@
#include "utils/lsyscache.h"
#include "utils/selfuncs.h"
+#include "utils/mvstats.h"
+#include "catalog/pg_collation.h"
+#include "utils/typcache.h"
+
+#include "parser/parsetree.h"
+
+
+#include <stdio.h>
/*
* Data structure for accumulating info about possible range-query
@@ -43,6 +51,16 @@ static void addRangeClause(RangeQueryClause **rqlist, Node *clause,
bool varonleft, bool isLTsel, Selectivity s2);
+static bool clause_is_mv_compatible(PlannerInfo *root, Node *clause, Oid varRelid,
+ Oid *relid, AttrNumber *attnum, SpecialJoinInfo *sjinfo);
+
+static Bitmapset *collect_mv_attnums(PlannerInfo *root, List *clauses,
+ Oid varRelid, Oid *relid, SpecialJoinInfo *sjinfo);
+
+static List *clauselist_apply_dependencies(PlannerInfo *root, List *clauses,
+ Oid varRelid, int nmvstats, MVStats mvstats,
+ SpecialJoinInfo *sjinfo);
+
/****************************************************************************
* ROUTINES TO COMPUTE SELECTIVITIES
****************************************************************************/
@@ -61,7 +79,7 @@ static void addRangeClause(RangeQueryClause **rqlist, Node *clause,
* subclauses. However, that's only right if the subclauses have independent
* probabilities, and in reality they are often NOT independent. So,
* we want to be smarter where we can.
-
+ *
* Currently, the only extra smarts we have is to recognize "range queries",
* such as "x > 34 AND x < 42". Clauses are recognized as possible range
* query components if they are restriction opclauses whose operators have
@@ -88,6 +106,76 @@ static void addRangeClause(RangeQueryClause **rqlist, Node *clause,
*
* Of course this is all very dependent on the behavior of
* scalarltsel/scalargtsel; perhaps some day we can generalize the approach.
+ *
+ *
+ * Multivariate statistics
+ * -----------------------
+ * This also uses multivariate stats to estimate combinations of conditions,
+ * while attempting to minimize the overhead when there are no suitable
+ * multivariate stats.
+ *
+ * The following checks are performed (in this order), and the optimizer
+ * falls back to regular stats on the first 'false'.
+ *
+ * NOTE: This explains how this works with all the patches applied, not
+ * just the functional dependencies.
+ *
+ * (1) check that at least two columns are referenced from conditions
+ * compatible with multivariate stats
+ *
+ * If there are no conditions that might be handled by multivariate
+ * stats, or if the conditions reference just a single column, it
+ * makes no sense to use multivariate stats.
+ *
+ * What conditions are compatible with multivariate stats is decided
+ * by clause_is_mv_compatible(). At this moment, only simple conditions
+ * of the form "column operator constant" (for simple comparison
+ * operators), and IS NULL / IS NOT NULL are considered compatible
+ * with multivariate statistics.
+ *
+ * (2) reduce the clauses using functional dependencies
+ *
+ * This simply attempts to 'reduce' the clauses by applying functional
+ * dependencies. For example if there are two clauses:
+ *
+ * WHERE (a = 1) AND (b = 2)
+ *
+ * and we know that 'a' determines the value of 'b', we may remove
+ * the second condition (b = 2) when computing the selectivity.
+ * This is of course tricky - see mvstats/dependencies.c for details.
+ *
+ * After the reduction, step (1) is to be repeated.
+ *
+ * (3) check if there are multivariate stats built on the columns
+ *
+ * If there are no multivariate statistics, we have to fall back to
+ * the regular stats. We might perform checks (1) and (2) in reverse
+ * order, i.e. first check if there are multivariate statistics and
+ * then collect the attributes only if needed. The assumption is
+ * that checking the clauses is cheaper than querying the catalog,
+ * so this check is performed first.
+ *
+ * (4) choose the stats matching the most columns (at least two)
+ *
+ * If there are multiple instances of multivariate statistics (e.g.
+ * built on different sets of columns), we choose the stats covering
+ * the most columns from step (1). It may happen that all available
+ * stats match just a single column - for example with conditions
+ *
+ * WHERE a = 1 AND b = 2
+ *
+ * and statistics built on (a,c) and (b,c). In such case just fall
+ * back to the regular stats because it makes no sense to use the
+ * multivariate statistics.
+ *
+ * This selection criterion (the most columns) is certainly very
+ * simple and definitely not optimal - it's easy to come up with
+ * examples where other approaches work better. More about this
+ * at choose_mv_statistics().
+ *
+ * (5) use the multivariate stats to estimate matching clauses
+ *
+ * (6) estimate the remaining clauses using the regular statistics
*/
Selectivity
clauselist_selectivity(PlannerInfo *root,
@@ -100,6 +188,14 @@ clauselist_selectivity(PlannerInfo *root,
RangeQueryClause *rqlist = NULL;
ListCell *l;
+ /* processing mv stats */
+ Oid relid = InvalidOid;
+ int nmvstats = 0;
+ MVStats mvstats = NULL;
+
+ /* attributes in mv-compatible clauses */
+ Bitmapset *mvattnums = NULL;
+
/*
* If there's exactly one clause, then no use in trying to match up pairs,
* so just go directly to clause_selectivity().
@@ -108,6 +204,28 @@ clauselist_selectivity(PlannerInfo *root,
return clause_selectivity(root, (Node *) linitial(clauses),
varRelid, jointype, sjinfo);
+ /* collect attributes referenced by mv-compatible clauses */
+ mvattnums = collect_mv_attnums(root, clauses, varRelid, &relid, sjinfo);
+
+ /*
+ * If there are mv-compatible clauses, referencing at least two
+ * different columns (otherwise it makes no sense to use mv stats),
+ * try to reduce the clauses using functional dependencies, and
+ * recollect the attributes from the reduced list.
+ *
+ * We don't need to select a single statistics for this - we can
+ * apply all the functional dependencies we have.
+ */
+ if (bms_num_members(mvattnums) >= 2)
+ {
+ /* fetch info from the catalog (not the serialized stats yet) */
+ mvstats = list_mv_stats(relid, &nmvstats, true);
+
+ /* reduce clauses by applying functional dependencies rules */
+ clauses = clauselist_apply_dependencies(root, clauses, varRelid,
+ nmvstats, mvstats, sjinfo);
+ }
+
/*
* Initial scan over clauses. Anything that doesn't look like a potential
* rangequery clause gets multiplied into s1 and forgotten. Anything that
@@ -782,3 +900,533 @@ clause_selectivity(PlannerInfo *root,
return s1;
}
+
+/*
+ * Collect attributes from mv-compatible clauses.
+ */
+static Bitmapset *
+collect_mv_attnums(PlannerInfo *root, List *clauses, Oid varRelid,
+ Oid *relid, SpecialJoinInfo *sjinfo)
+{
+ Bitmapset *attnums = NULL;
+ ListCell *l;
+
+ /*
+ * Walk through the clauses and identify the ones we can estimate
+ * using multivariate stats, and remember the relid/columns. We'll
+ * then cross-check if we have suitable stats, and only if needed
+ * we'll split the clauses into multivariate and regular lists.
+ *
+ * For now we're only interested in RestrictInfo nodes with nested
+ * OpExpr, using either a range or equality.
+ */
+ foreach (l, clauses)
+ {
+ AttrNumber attnum;
+ Node *clause = (Node *) lfirst(l);
+
+ /* ignore the result for now - we only need the info */
+ if (clause_is_mv_compatible(root, clause, varRelid, relid, &attnum, sjinfo))
+ attnums = bms_add_member(attnums, attnum);
+ }
+
+ /*
+ * If there are not at least two attributes referenced by the clause(s),
+ * we can throw everything out (as we'll revert to simple stats).
+ */
+ if (bms_num_members(attnums) <= 1)
+ {
+ if (attnums != NULL)
+ pfree(attnums);
+ attnums = NULL;
+ *relid = InvalidOid;
+ }
+
+ return attnums;
+}
+
+/*
+ * Determines whether the clause is compatible with multivariate stats,
+ * and if it is, returns some additional information - varno (index
+ * into simple_rte_array) and a bitmap of attributes. This is then
+ * used to fetch related multivariate statistics.
+ *
+ * At this moment we only support basic conditions of the form
+ *
+ * variable OP constant
+ *
+ * where OP is one of [=,<,<=,>=,>] (which is however determined by
+ * looking at the associated function for estimating selectivity, just
+ * like with the single-dimensional case).
+ *
+ * TODO Support 'OR clauses' - shouldn't be all that difficult to
+ * evaluate them using multivariate stats.
+ */
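+/*
+ * Examples (illustration of the checks performed below):
+ *
+ * (a = 1) - compatible (Var = Const, estimated by eqsel)
+ * (1 = a) - compatible (Const = Var, i.e. varonleft = false)
+ * (a = b) - not compatible (no pseudo-constant side)
+ * (a < 1) - not used here (only F_EQSEL is handled in the switch)
+ */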
+static bool
+clause_is_mv_compatible(PlannerInfo *root, Node *clause, Oid varRelid,
+ Oid *relid, AttrNumber *attnum, SpecialJoinInfo *sjinfo)
+{
+
+ if (IsA(clause, RestrictInfo))
+ {
+ RestrictInfo *rinfo = (RestrictInfo *) clause;
+
+ /* Pseudoconstants are not really interesting here. */
+ if (rinfo->pseudoconstant)
+ return false;
+
+ /* no support for OR clauses at this point */
+ if (rinfo->orclause)
+ return false;
+
+ /* get the actual clause from the RestrictInfo (it's not an OR clause) */
+ clause = (Node*)rinfo->clause;
+
+ /* only simple opclauses are compatible with multivariate stats */
+ if (! is_opclause(clause))
+ return false;
+
+ /* we don't support join conditions at this moment */
+ if (treat_as_join_clause(clause, rinfo, varRelid, sjinfo))
+ return false;
+
+ /* is it 'variable op constant' ? */
+ if (list_length(((OpExpr *) clause)->args) == 2)
+ {
+ OpExpr *expr = (OpExpr *) clause;
+ bool varonleft = true;
+ bool ok;
+
+ ok = (bms_membership(rinfo->clause_relids) == BMS_SINGLETON) &&
+ (is_pseudo_constant_clause_relids(lsecond(expr->args),
+ rinfo->right_relids) ||
+ (varonleft = false,
+ is_pseudo_constant_clause_relids(linitial(expr->args),
+ rinfo->left_relids)));
+
+ if (ok)
+ {
+ RangeTblEntry * rte;
+ Var * var = (varonleft) ? linitial(expr->args) : lsecond(expr->args);
+
+ /*
+ * Simple variables only - otherwise the planner_rt_fetch seems to fail
+ * (return NULL).
+ *
+ * TODO Maybe use examine_variable() would fix that?
+ */
+ if (! (IsA(var, Var) && (varRelid == 0 || varRelid == var->varno)))
+ return false;
+
+ /*
+ * Only consider this variable if (varRelid == 0) or when the varno
+ * matches varRelid (see explanation at clause_selectivity).
+ *
+ * FIXME I suspect this may not be really necessary. The (varRelid == 0)
+ * part seems to be enforced by treat_as_join_clause().
+ */
+ if (! ((varRelid == 0) || (varRelid == var->varno)))
+ return false;
+
+ /* Also skip special varno values, and system attributes ... */
+ if ((IS_SPECIAL_VARNO(var->varno)) || (! AttrNumberIsForUserDefinedAttr(var->varattno)))
+ return false;
+
+ /* Lookup info about the base relation (we need to pass the OID out) */
+ rte = planner_rt_fetch(var->varno, root);
+ *relid = rte->relid;
+
+ /*
+ * If it's not an "=" operator, just ignore the clause - functional
+ * dependencies only work with equality clauses. Otherwise note the
+ * relid and attnum for the variable. This looks at the function used
+ * for estimating selectivity, not at the operator directly (a bit
+ * awkward, but well ...).
+ */
+ switch (get_oprrest(expr->opno))
+ {
+ case F_EQSEL:
+ *attnum = var->varattno;
+ return true;
+ }
+ }
+ }
+ }
+
+ return false;
+
+}
+
+/*
+ * Performs reduction of clauses using functional dependencies, i.e.
+ * removes clauses that are considered redundant. It simply walks
+ * through dependencies, and checks whether the dependency 'matches'
+ * the clauses, i.e. if there's a clause matching the condition. If yes,
+ * all clauses matching the implied part of the dependency are removed
+ * from the list.
+ *
+ * This simply looks at the attnums referenced by the clauses, not at the
+ * type of the operator (equality, inequality, ...). This may not be the
+ * right way to do it - it certainly works best for equalities, which is
+ * naturally consistent with functional dependencies (implications).
+ * It's not clear that other operators are handled sensibly - for
+ * example for inequalities, like
+ *
+ * WHERE (A >= 10) AND (B <= 20)
+ *
+ * and a trivial case where [A == B], resulting in symmetric pair of
+ * rules [A => B], [B => A], it's rather clear we can't remove either of
+ * those clauses.
+ *
+ * That only highlights that functional dependencies are most suitable
+ * for label-like data, where using non-equality operators is very rare.
+ * Using the common city/zipcode example, clauses like
+ *
+ * (zipcode <= 12345)
+ *
+ * or
+ *
+ * (cityname >= 'Washington')
+ *
+ * are rare. So restricting the reduction to equality should not harm
+ * the usefulness / applicability.
+ *
+ * Another limitation is that this assumes 'compatible' clauses. For
+ * example, with a mismatched zip code and city name, this is unable
+ * to identify the discrepancy and still eliminates one of the clauses. The
+ * usual approach (multiplying both selectivities) thus produces a more
+ * accurate estimate, although mostly by luck - the multiplication
+ * comes from assumption of statistical independence of the two
+ * conditions (which is not valid in this case), but moves the
+ * estimate in the right direction (towards 0%).
+ *
+ * This might be somewhat improved by cross-checking the selectivities
+ * against MCV and/or histogram.
+ *
+ * The implementation needs to be careful about cyclic rules, i.e. rules
+ * like [A => B] and [B => A] at the same time. This must not reduce
+ * clauses on both attributes at the same time.
+ *
+ * Technically we might consider selectivities here too, somehow. E.g.
+ * when (A => B) and (B => A), we might keep the clause with the minimum
+ * selectivity.
+ *
+ * TODO Consider restricting the reduction to equality clauses. Or maybe
+ * use equality classes somehow?
+ *
+ * TODO Merge these docs into dependencies.c, as they say mostly the
+ * same things as the comments there.
+ */
+static List *
+clauselist_apply_dependencies(PlannerInfo *root, List *clauses, Oid varRelid,
+ int nmvstats, MVStats mvstats, SpecialJoinInfo *sjinfo)
+{
+ int i;
+ ListCell *lc;
+ List * reduced_clauses = NIL;
+ Oid relid;
+
+ /*
+ * preallocate space for all clauses, including non-mv-compatible,
+ * so that we don't need to reallocate the arrays repeatedly
+ *
+ * XXX This assumes each clause references exactly one Var, so the
+ * arrays are sized accordingly - for functional dependencies
+ * this is safe, because it only works with Var=Const.
+ */
+ bool *reduced;
+ AttrNumber *mvattnums;
+ Node **mvclauses;
+ int nmvclauses = 0; /* number clauses in the arrays */
+
+ /*
+ * matrix of (natts x natts), 1 means x=>y
+ *
+ * This serves two purposes - first, it merges dependencies from all
+ * the statistics, second it makes generating all the transitive
+ * dependencies easier.
+ *
+ * We need to build this only for attributes from the dependencies,
+ * not for all attributes in the table.
+ *
+ * We can't do that only for attributes from the clauses, because we
+ * want to build transitive dependencies (including those going
+ * through attributes not listed in the stats).
+ *
+ * This only works for A=>B dependencies, not sure how to do that
+ * for complex dependencies.
+ */
+ bool *deps_matrix;
+ int deps_natts; /* size of the matrix */
+
+ /* mapping attnum <=> matrix index */
+ int *deps_idx_to_attnum;
+ int *deps_attnum_to_idx;
+
+ /* attnums in dependencies and clauses (and intersection) */
+ Bitmapset *deps_attnums = NULL;
+ Bitmapset *clause_attnums = NULL;
+ Bitmapset *intersect_attnums = NULL;
+
+ int attnum, attidx, attnum_max;
+
+ bool has_deps_built = false;
+
+ /* see if there's at least one statistics with dependencies */
+ for (i = 0; i < nmvstats; i++)
+ {
+ if (mvstats[i].deps_built)
+ {
+ has_deps_built = true;
+ break;
+ }
+ }
+
+ /* no dependencies available - return the original clauses */
+ if (! has_deps_built)
+ return clauses;
+
+ mvclauses = (Node**)palloc0(list_length(clauses) * sizeof(Node*));
+ mvattnums = (AttrNumber*)palloc0(list_length(clauses) * sizeof(AttrNumber));
+
+ /*
+ * Walk through the clauses - clauses that are not mv-compatible copy
+ * directly into the result list, and mv-compatible ones store into
+ * an array of clauses (and remember the attnum in another array).
+ */
+ foreach (lc, clauses)
+ {
+ AttrNumber attnum;
+ Node *clause = (Node *) lfirst(lc);
+ if (! clause_is_mv_compatible(root, clause, varRelid, &relid, &attnum, sjinfo))
+ reduced_clauses = lappend(reduced_clauses, clause);
+ else
+ {
+ mvclauses[nmvclauses] = clause;
+ mvattnums[nmvclauses] = attnum;
+ nmvclauses++;
+
+ clause_attnums = bms_add_member(clause_attnums, attnum);
+ }
+ }
+
+ /*
+ * we need at least two clauses referencing two different attributes
+ * to do the reduction
+ */
+ if ((nmvclauses < 2) || (bms_num_members(clause_attnums) < 2))
+ {
+ pfree(mvattnums);
+ pfree(mvclauses);
+
+ bms_free(clause_attnums);
+ list_free(reduced_clauses);
+
+ return clauses;
+ }
+
+ reduced = (bool*)palloc0(list_length(clauses) * sizeof(bool));
+
+ /* build the dependency matrix */
+ attnum_max = -1;
+ for (i = 0; i < nmvstats; i++)
+ {
+ int j;
+ int2vector *stakeys = mvstats[i].stakeys;
+
+ /* skip stats without functional dependencies built */
+ if (! mvstats[i].deps_built)
+ continue;
+
+ for (j = 0; j < stakeys->dim1; j++)
+ {
+ int attnum = stakeys->values[j];
+ deps_attnums = bms_add_member(deps_attnums, attnum);
+
+ /* keep the max attnum in the dependencies */
+ attnum_max = (attnum > attnum_max) ? attnum : attnum_max;
+ }
+ }
+
+ /*
+ * We need at least two matching attributes in the clauses and
+ * dependencies, otherwise we can't reduce anything.
+ */
+ intersect_attnums = bms_intersect(clause_attnums, deps_attnums);
+ if (bms_num_members(intersect_attnums) < 2)
+ {
+ pfree(mvattnums);
+ pfree(mvclauses);
+
+ bms_free(clause_attnums);
+ bms_free(deps_attnums);
+ bms_free(intersect_attnums);
+
+ list_free(reduced_clauses);
+
+ return clauses;
+ }
+
+ /* allocate the matrix and mappings */
+ deps_natts = bms_num_members(deps_attnums);
+ deps_matrix = (bool*)palloc0(deps_natts * deps_natts * sizeof(bool));
+ deps_idx_to_attnum = (int*)palloc0(deps_natts * sizeof(int));
+ deps_attnum_to_idx = (int*)palloc0((attnum_max+1) * sizeof(int));
+
+ /* build the (attnum => attidx) and (attidx => attnum) mappings */
+ attidx = 0;
+ attnum = -1;
+
+ while (true)
+ {
+ attnum = bms_next_member(deps_attnums, attnum);
+ if (attnum == -2)
+ break;
+
+ deps_idx_to_attnum[attidx] = attnum;
+ deps_attnum_to_idx[attnum] = attidx;
+
+ attidx += 1;
+ }
+
+ /* do we have all the attributes mapped? */
+ Assert(attidx == deps_natts);
+
+ /* walk through all the mvstats, build the adjacency matrix */
+ for (i = 0; i < nmvstats; i++)
+ {
+ int j;
+ MVDependencies dependencies = NULL;
+
+ /* skip stats without functional dependencies built */
+ if (! mvstats[i].deps_built)
+ continue;
+
+ /* fetch dependencies */
+ dependencies = deserialize_mv_dependencies(fetch_mv_dependencies(mvstats[i].mvoid));
+ if (dependencies == NULL)
+ continue;
+
+ /* set deps_matrix[a,b] to 'true' if 'a=>b' */
+ for (j = 0; j < dependencies->ndeps; j++)
+ {
+ int aidx = deps_attnum_to_idx[dependencies->deps[j]->a];
+ int bidx = deps_attnum_to_idx[dependencies->deps[j]->b];
+
+ /* a=> b */
+ deps_matrix[aidx * deps_natts + bidx] = true;
+ }
+ }
+
+ /*
+ * Multiply the matrix N-times (N = size of the matrix), so that we
+ * get all the transitive dependencies. That makes the next step
+ * much easier and faster.
+ *
+ * This is essentially an adjacency matrix from graph theory, and
+ * by multiplying it we get transitive edges. We don't really care
+ * about the exact number (number of paths between vertices) though,
+ * so we can do the multiplication in-place (we don't care whether
+ * we found the dependency in this round or in the previous one).
+ *
+ * Track how many new dependencies were added, and stop when 0, but
+ * we can't multiply more than N-times (longest path in the graph).
+ */
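+ /*
+ * Example (illustration only): with dependencies (a => b) and
+ * (b => c), mapped to indexes 0, 1 and 2, the initial adjacency
+ * matrix is
+ *
+ * 0 1 0
+ * 0 0 1
+ * 0 0 0
+ *
+ * and the first multiplication adds the transitive edge (a => c),
+ * i.e. sets the cell [0,2].
+ */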
+ for (i = 0; i < deps_natts; i++)
+ {
+ int k, l, m;
+ int nchanges = 0;
+
+ /* k => l */
+ for (k = 0; k < deps_natts; k++)
+ {
+ for (l = 0; l < deps_natts; l++)
+ {
+ /* we already have this dependency */
+ if (deps_matrix[k * deps_natts + l])
+ continue;
+
+ /* we don't really care about the exact value, just 0/1 */
+ for (m = 0; m < deps_natts; m++)
+ {
+ if (deps_matrix[k * deps_natts + m] * deps_matrix[m * deps_natts + l])
+ {
+ deps_matrix[k * deps_natts + l] = true;
+ nchanges += 1;
+ break;
+ }
+ }
+ }
+ }
+
+ /* no transitive dependency added here, so terminate */
+ if (nchanges == 0)
+ break;
+ }
+
+ /*
+ * Walk through the clauses, and see which other clauses we may
+ * reduce. The matrix contains all transitive dependencies, which
+ * makes this very fast.
+ *
+ * We have to be careful not to reduce the clause using itself, or
+ * reducing all clauses forming a cycle (so we have to skip already
+ * eliminated clauses).
+ *
+ * I'm not sure whether this guarantees finding the best solution,
+ * i.e. reducing the most clauses, but it probably does (thanks to
+ * having all the transitive dependencies).
+ */
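+ /*
+ * Example (illustration only): with clauses (a = 1) AND (b = 2) and
+ * a symmetric pair of dependencies (a => b) and (b => a), only one
+ * of the clauses gets reduced - once (b = 2) is marked as reduced,
+ * it can no longer be used to reduce (a = 1).
+ */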
+ for (i = 0; i < nmvclauses; i++)
+ {
+ int j;
+
+ /* not covered by dependencies */
+ if (! bms_is_member(mvattnums[i], deps_attnums))
+ continue;
+
+ /* this clause was already reduced, so let's skip it */
+ if (reduced[i])
+ continue;
+
+ /* walk the potentially 'implied' clauses */
+ for (j = 0; j < nmvclauses; j++)
+ {
+ int aidx, bidx;
+
+ /* not covered by dependencies */
+ if (! bms_is_member(mvattnums[j], deps_attnums))
+ continue;
+
+ aidx = deps_attnum_to_idx[mvattnums[i]];
+ bidx = deps_attnum_to_idx[mvattnums[j]];
+
+ /* can't reduce the clause by itself, or if already reduced */
+ if ((i == j) || reduced[j])
+ continue;
+
+ /* mark the clause as reduced (if aidx => bidx) */
+ reduced[j] = deps_matrix[aidx * deps_natts + bidx];
+ }
+ }
+
+ /* now walk through the clauses, and keep only those not reduced */
+ for (i = 0; i < nmvclauses; i++)
+ {
+ if (! reduced[i])
+ reduced_clauses = lappend(reduced_clauses, mvclauses[i]);
+ }
+
+ pfree(reduced);
+ pfree(mvclauses);
+ pfree(mvattnums);
+
+ pfree(deps_matrix);
+ pfree(deps_idx_to_attnum);
+ pfree(deps_attnum_to_idx);
+
+ bms_free(deps_attnums);
+ bms_free(clause_attnums);
+ bms_free(intersect_attnums);
+
+ return reduced_clauses;
+}
diff --git a/src/backend/utils/mvstats/common.c b/src/backend/utils/mvstats/common.c
index 8efc5ba..d44b95a 100644
--- a/src/backend/utils/mvstats/common.c
+++ b/src/backend/utils/mvstats/common.c
@@ -57,7 +57,8 @@ build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
/*
* Analyze functional dependencies of columns.
*/
- deps = build_mv_dependencies(numrows, rows, attrs, stats);
+ if (mvstats->deps_enabled)
+ deps = build_mv_dependencies(numrows, rows, attrs, stats);
/* store the histogram / MCV list in the catalog */
update_mv_stats(mvstats[i].mvoid, deps);
@@ -154,6 +155,7 @@ list_mv_stats(Oid relid, int *nstats, bool built_only)
result[*nstats].mvoid = HeapTupleGetOid(htup);
result[*nstats].stakeys = buildint2vector(stats->stakeys.values, stats->stakeys.dim1);
+ result[*nstats].deps_enabled = stats->deps_enabled;
result[*nstats].deps_built = stats->deps_built;
*nstats += 1;
}
@@ -260,6 +262,7 @@ compare_scalars_partition(const void *a, const void *b, void *arg)
return ApplySortComparator(da, false, db, false, ssup);
}
+
/* initialize multi-dimensional sort */
MultiSortSupport
multi_sort_init(int ndims)
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
index f728d88..2916f11 100644
--- a/src/include/catalog/pg_proc.h
+++ b/src/include/catalog/pg_proc.h
@@ -2712,9 +2712,9 @@ DESCR("current user privilege on any column by rel name");
DATA(insert OID = 3029 ( has_any_column_privilege PGNSP PGUID 12 10 0 0 0 f f f f t f s 2 0 16 "26 25" _null_ _null_ _null_ _null_ has_any_column_privilege_id _null_ _null_ _null_ ));
DESCR("current user privilege on any column by rel oid");
-DATA(insert OID = 3284 ( pg_mv_stats_dependencies_info PGNSP PGUID 12 1 0 0 0 f f f f t f i 1 0 25 "17" _null_ _null_ _null_ _null_ pg_mv_stats_dependencies_info _null_ _null_ _null_ ));
+DATA(insert OID = 3377 ( pg_mv_stats_dependencies_info PGNSP PGUID 12 1 0 0 0 f f f f t f i 1 0 25 "17" _null_ _null_ _null_ _null_ pg_mv_stats_dependencies_info _null_ _null_ _null_ ));
DESCR("multivariate stats: functional dependencies info");
-DATA(insert OID = 3285 ( pg_mv_stats_dependencies_show PGNSP PGUID 12 1 0 0 0 f f f f t f i 1 0 25 "17" _null_ _null_ _null_ _null_ pg_mv_stats_dependencies_show _null_ _null_ _null_ ));
+DATA(insert OID = 3378 ( pg_mv_stats_dependencies_show PGNSP PGUID 12 1 0 0 0 f f f f t f i 1 0 25 "17" _null_ _null_ _null_ _null_ pg_mv_stats_dependencies_show _null_ _null_ _null_ ));
DESCR("multivariate stats: functional dependencies show");
DATA(insert OID = 1928 ( pg_stat_get_numscans PGNSP PGUID 12 1 0 0 0 f f f f t f s 1 0 20 "26" _null_ _null_ _null_ _null_ pg_stat_get_numscans _null_ _null_ _null_ ));
diff --git a/src/include/utils/mvstats.h b/src/include/utils/mvstats.h
index 5c8643d..ec6764b 100644
--- a/src/include/utils/mvstats.h
+++ b/src/include/utils/mvstats.h
@@ -18,24 +18,34 @@
/*
* Basic info about the stats, used when choosing what to use
- *
- * TODO Add info about what statistics is available (histogram, MCV,
- * hashed MCV, functional dependencies).
*/
typedef struct MVStatsData {
Oid mvoid; /* OID of the stats in pg_mv_statistic */
int2vector *stakeys; /* attnums for columns in the stats */
+
+ /* statistics requested in ALTER TABLE ... ADD STATISTICS */
+ bool deps_enabled; /* analyze functional dependencies */
+
+ /* available statistics (computed by ANALYZE) */
bool deps_built; /* functional dependencies available */
} MVStatsData;
typedef struct MVStatsData *MVStats;
+/*
+ * Degree of how much MCV item / histogram bucket matches a clause.
+ * This is then considered when computing the selectivity.
+ */
+#define MVSTATS_MATCH_NONE 0 /* no match at all */
+#define MVSTATS_MATCH_PARTIAL 1 /* partial match */
+#define MVSTATS_MATCH_FULL 2 /* full match */
#define MVSTATS_MAX_DIMENSIONS 8 /* max number of attributes */
-/* An associative rule, tracking [a => b] dependency.
- *
- * TODO Make this work with multiple columns on both sides.
+
+/*
+ * Functional dependencies, tracking column-level relationships (values
+ * in one column determine values in another one).
*/
typedef struct MVDependencyData {
int16 a;
@@ -61,6 +71,7 @@ typedef MVDependenciesData* MVDependencies;
* stats specified using flags (or something like that).
*/
MVStats list_mv_stats(Oid relid, int *nstats, bool built_only);
+bytea * fetch_mv_rules(Oid mvoid);
bytea * fetch_mv_dependencies(Oid mvoid);
diff --git a/src/test/regress/expected/mv_dependencies.out b/src/test/regress/expected/mv_dependencies.out
new file mode 100644
index 0000000..159d317
--- /dev/null
+++ b/src/test/regress/expected/mv_dependencies.out
@@ -0,0 +1,175 @@
+-- data type passed by value
+CREATE TABLE functional_dependencies (
+ a INT,
+ b INT,
+ c INT
+);
+-- unknown column
+ALTER TABLE functional_dependencies ADD STATISTICS (dependencies) ON (unknown_column);
+ERROR: column "unknown_column" referenced in statistics does not exist
+-- single column
+ALTER TABLE functional_dependencies ADD STATISTICS (dependencies) ON (a);
+ERROR: multivariate stats require 2 or more columns
+-- single column, duplicated
+ALTER TABLE functional_dependencies ADD STATISTICS (dependencies) ON (a, a);
+ERROR: duplicate column name in statistics definition
+-- two columns, one duplicated
+ALTER TABLE functional_dependencies ADD STATISTICS (dependencies) ON (a, a, b);
+ERROR: duplicate column name in statistics definition
+-- unknown option
+ALTER TABLE functional_dependencies ADD STATISTICS (unknown_option) ON (a, b, c);
+ERROR: unrecognized STATISTICS option "unknown_option"
+-- correct command
+ALTER TABLE functional_dependencies ADD STATISTICS (dependencies) ON (a, b, c);
+-- random data (no functional dependencies)
+INSERT INTO functional_dependencies
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+ANALYZE functional_dependencies;
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+ deps_enabled | deps_built | pg_mv_stats_dependencies_show
+--------------+------------+-------------------------------
+ t | f |
+(1 row)
+
+TRUNCATE functional_dependencies;
+-- a => b, a => c, b => c
+INSERT INTO functional_dependencies
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE functional_dependencies;
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+ deps_enabled | deps_built | pg_mv_stats_dependencies_show
+--------------+------------+-------------------------------
+ t | t | 1 => 2, 1 => 3, 2 => 3
+(1 row)
+
+TRUNCATE functional_dependencies;
+-- a => b, a => c
+INSERT INTO functional_dependencies
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE functional_dependencies;
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+ deps_enabled | deps_built | pg_mv_stats_dependencies_show
+--------------+------------+-------------------------------
+ t | t | 1 => 2, 1 => 3
+(1 row)
+
+TRUNCATE functional_dependencies;
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO functional_dependencies
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX fdeps_idx ON functional_dependencies (a, b);
+ANALYZE functional_dependencies;
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+ deps_enabled | deps_built | pg_mv_stats_dependencies_show
+--------------+------------+-------------------------------
+ t | t | 1 => 2, 1 => 3, 2 => 3
+(1 row)
+
+EXPLAIN (COSTS off)
+ SELECT * FROM functional_dependencies WHERE a = 10 AND b = 5;
+ QUERY PLAN
+---------------------------------------------
+ Bitmap Heap Scan on functional_dependencies
+ Recheck Cond: ((a = 10) AND (b = 5))
+ -> Bitmap Index Scan on fdeps_idx
+ Index Cond: ((a = 10) AND (b = 5))
+(4 rows)
+
+DELETE FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+DROP TABLE functional_dependencies;
+-- varlena type (text)
+CREATE TABLE functional_dependencies (
+ a TEXT,
+ b TEXT,
+ c TEXT
+);
+ALTER TABLE functional_dependencies ADD STATISTICS (dependencies) ON (a, b, c);
+-- random data (no functional dependencies)
+INSERT INTO functional_dependencies
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+ANALYZE functional_dependencies;
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+ deps_enabled | deps_built | pg_mv_stats_dependencies_show
+--------------+------------+-------------------------------
+ t | f |
+(1 row)
+
+TRUNCATE functional_dependencies;
+-- a => b, a => c, b => c
+INSERT INTO functional_dependencies
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE functional_dependencies;
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+ deps_enabled | deps_built | pg_mv_stats_dependencies_show
+--------------+------------+-------------------------------
+ t | t | 1 => 2, 1 => 3, 2 => 3
+(1 row)
+
+TRUNCATE functional_dependencies;
+-- a => b, a => c
+INSERT INTO functional_dependencies
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE functional_dependencies;
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+ deps_enabled | deps_built | pg_mv_stats_dependencies_show
+--------------+------------+-------------------------------
+ t | t | 1 => 2, 1 => 3
+(1 row)
+
+TRUNCATE functional_dependencies;
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO functional_dependencies
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX fdeps_idx ON functional_dependencies (a, b);
+ANALYZE functional_dependencies;
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+ deps_enabled | deps_built | pg_mv_stats_dependencies_show
+--------------+------------+-------------------------------
+ t | t | 1 => 2, 1 => 3, 2 => 3
+(1 row)
+
+EXPLAIN (COSTS off)
+ SELECT * FROM functional_dependencies WHERE a = '10' AND b = '5';
+ QUERY PLAN
+------------------------------------------------------------
+ Bitmap Heap Scan on functional_dependencies
+ Recheck Cond: ((a = '10'::text) AND (b = '5'::text))
+ -> Bitmap Index Scan on fdeps_idx
+ Index Cond: ((a = '10'::text) AND (b = '5'::text))
+(4 rows)
+
+DELETE FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+DROP TABLE functional_dependencies;
+-- NULL values (mix of int and text columns)
+CREATE TABLE functional_dependencies (
+ a INT,
+ b TEXT,
+ c INT,
+ d TEXT
+);
+ALTER TABLE functional_dependencies ADD STATISTICS (dependencies) ON (a, b, c, d);
+INSERT INTO functional_dependencies
+ SELECT
+ mod(i, 100),
+ (CASE WHEN mod(i, 200) = 0 THEN NULL ELSE mod(i,200) END),
+ mod(i, 400),
+ (CASE WHEN mod(i, 300) = 0 THEN NULL ELSE mod(i,600) END)
+ FROM generate_series(1,10000) s(i);
+ANALYZE functional_dependencies;
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+ deps_enabled | deps_built | pg_mv_stats_dependencies_show
+--------------+------------+----------------------------------------
+ t | t | 2 => 1, 3 => 1, 3 => 2, 4 => 1, 4 => 2
+(1 row)
+
+DELETE FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+DROP TABLE functional_dependencies;
diff --git a/src/test/regress/parallel_schedule b/src/test/regress/parallel_schedule
index 6d3b865..00c6ddf 100644
--- a/src/test/regress/parallel_schedule
+++ b/src/test/regress/parallel_schedule
@@ -109,3 +109,6 @@ test: event_trigger
# run stats by itself because its delay may be insufficient under heavy load
test: stats
+
+# run tests of multivariate stats
+test: mv_dependencies
diff --git a/src/test/regress/serial_schedule b/src/test/regress/serial_schedule
index 8326894..b818be9 100644
--- a/src/test/regress/serial_schedule
+++ b/src/test/regress/serial_schedule
@@ -153,3 +153,4 @@ test: with
test: xml
test: event_trigger
test: stats
+test: mv_dependencies
diff --git a/src/test/regress/sql/mv_dependencies.sql b/src/test/regress/sql/mv_dependencies.sql
new file mode 100644
index 0000000..f95dbf5
--- /dev/null
+++ b/src/test/regress/sql/mv_dependencies.sql
@@ -0,0 +1,153 @@
+-- data type passed by value
+CREATE TABLE functional_dependencies (
+ a INT,
+ b INT,
+ c INT
+);
+
+-- unknown column
+ALTER TABLE functional_dependencies ADD STATISTICS (dependencies) ON (unknown_column);
+
+-- single column
+ALTER TABLE functional_dependencies ADD STATISTICS (dependencies) ON (a);
+
+-- single column, duplicated
+ALTER TABLE functional_dependencies ADD STATISTICS (dependencies) ON (a, a);
+
+-- two columns, one duplicated
+ALTER TABLE functional_dependencies ADD STATISTICS (dependencies) ON (a, a, b);
+
+-- unknown option
+ALTER TABLE functional_dependencies ADD STATISTICS (unknown_option) ON (a, b, c);
+
+-- correct command
+ALTER TABLE functional_dependencies ADD STATISTICS (dependencies) ON (a, b, c);
+
+-- random data (no functional dependencies)
+INSERT INTO functional_dependencies
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+
+ANALYZE functional_dependencies;
+
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+
+TRUNCATE functional_dependencies;
+
+-- a => b, a => c, b => c
+INSERT INTO functional_dependencies
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+
+ANALYZE functional_dependencies;
+
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+
+TRUNCATE functional_dependencies;
+
+-- a => b, a => c
+INSERT INTO functional_dependencies
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE functional_dependencies;
+
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+
+TRUNCATE functional_dependencies;
+
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO functional_dependencies
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX fdeps_idx ON functional_dependencies (a, b);
+ANALYZE functional_dependencies;
+
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+
+EXPLAIN (COSTS off)
+ SELECT * FROM functional_dependencies WHERE a = 10 AND b = 5;
+
+DELETE FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+DROP TABLE functional_dependencies;
+
+-- varlena type (text)
+CREATE TABLE functional_dependencies (
+ a TEXT,
+ b TEXT,
+ c TEXT
+);
+
+ALTER TABLE functional_dependencies ADD STATISTICS (dependencies) ON (a, b, c);
+
+-- random data (no functional dependencies)
+INSERT INTO functional_dependencies
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+
+ANALYZE functional_dependencies;
+
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+
+TRUNCATE functional_dependencies;
+
+-- a => b, a => c, b => c
+INSERT INTO functional_dependencies
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+
+ANALYZE functional_dependencies;
+
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+
+TRUNCATE functional_dependencies;
+
+-- a => b, a => c
+INSERT INTO functional_dependencies
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE functional_dependencies;
+
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+
+TRUNCATE functional_dependencies;
+
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO functional_dependencies
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX fdeps_idx ON functional_dependencies (a, b);
+ANALYZE functional_dependencies;
+
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+
+EXPLAIN (COSTS off)
+ SELECT * FROM functional_dependencies WHERE a = '10' AND b = '5';
+
+DELETE FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+DROP TABLE functional_dependencies;
+
+-- NULL values (mix of int and text columns)
+CREATE TABLE functional_dependencies (
+ a INT,
+ b TEXT,
+ c INT,
+ d TEXT
+);
+
+ALTER TABLE functional_dependencies ADD STATISTICS (dependencies) ON (a, b, c, d);
+
+INSERT INTO functional_dependencies
+ SELECT
+ mod(i, 100),
+ (CASE WHEN mod(i, 200) = 0 THEN NULL ELSE mod(i,200) END),
+ mod(i, 400),
+ (CASE WHEN mod(i, 300) = 0 THEN NULL ELSE mod(i,600) END)
+ FROM generate_series(1,10000) s(i);
+
+ANALYZE functional_dependencies;
+
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+
+DELETE FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+DROP TABLE functional_dependencies;
--
2.0.5
Attachment: 0003-multivariate-MCV-lists.patch (text/x-diff)
From 13c3d4cbe85bbbe6b9509de15dd08384df1df97f Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tv@fuzzy.cz>
Date: Sun, 11 Jan 2015 20:15:37 +0100
Subject: [PATCH 3/5] multivariate MCV lists
- extends the pg_mv_statistic catalog (adds the 'mcv' fields)
- builds the MCV lists during ANALYZE
- uses the MCV lists for simple estimation while planning queries
Includes regression tests, mostly mirroring the regression tests for
functional dependencies.
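As a usage sketch (the option names follow the tablecmds.c changes
below; 1024 is just a value assumed to fall between
MVSTAT_MCVLIST_MIN_ITEMS and MVSTAT_MCVLIST_MAX_ITEMS), the MCV lists
might be enabled like this:

    ALTER TABLE t ADD STATISTICS (mcv) ON (a, b);
    ALTER TABLE t ADD STATISTICS (mcv, max_mcv_items = 1024) ON (a, b);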
---
src/backend/catalog/system_views.sql | 4 +-
src/backend/commands/tablecmds.c | 47 +-
src/backend/optimizer/path/clausesel.c | 1153 ++++++++++++++++++++++++++++++--
src/backend/utils/mvstats/Makefile | 2 +-
src/backend/utils/mvstats/common.c | 58 +-
src/backend/utils/mvstats/common.h | 11 +-
src/backend/utils/mvstats/mcv.c | 1002 +++++++++++++++++++++++++++
src/include/catalog/pg_mv_statistic.h | 18 +-
src/include/catalog/pg_proc.h | 2 +
src/include/utils/mvstats.h | 68 +-
src/test/regress/expected/mv_mcv.out | 210 ++++++
src/test/regress/expected/rules.out | 4 +-
src/test/regress/parallel_schedule | 2 +-
src/test/regress/serial_schedule | 1 +
src/test/regress/sql/mv_mcv.sql | 181 +++++
15 files changed, 2662 insertions(+), 101 deletions(-)
create mode 100644 src/backend/utils/mvstats/mcv.c
create mode 100644 src/test/regress/expected/mv_mcv.out
create mode 100644 src/test/regress/sql/mv_mcv.sql
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index d05a716..4538e63 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -156,7 +156,9 @@ CREATE VIEW pg_mv_stats AS
C.relname AS tablename,
S.stakeys AS attnums,
length(S.stadeps) as depsbytes,
- pg_mv_stats_dependencies_info(S.stadeps) as depsinfo
+ pg_mv_stats_dependencies_info(S.stadeps) as depsinfo,
+ length(S.stamcv) AS mcvbytes,
+ pg_mv_stats_mcvlist_info(S.stamcv) AS mcvinfo
FROM (pg_mv_statistic S JOIN pg_class C ON (C.oid = S.starelid))
LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace);
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 965d342..fae0fc7 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -11901,7 +11901,13 @@ static void ATExecAddStatistics(AlteredTableInfo *tab, Relation rel,
Relation mvstatrel;
/* by default build everything */
- bool build_dependencies = false;
+ bool build_dependencies = false,
+ build_mcv = false;
+
+ int32 max_mcv_items = -1;
+
+ /* options required because of other options */
+ bool require_mcv = false;
Assert(IsA(def, StatisticsDef));
@@ -11956,6 +11962,29 @@ static void ATExecAddStatistics(AlteredTableInfo *tab, Relation rel,
if (strcmp(opt->defname, "dependencies") == 0)
build_dependencies = defGetBoolean(opt);
+ else if (strcmp(opt->defname, "mcv") == 0)
+ build_mcv = defGetBoolean(opt);
+ else if (strcmp(opt->defname, "max_mcv_items") == 0)
+ {
+ max_mcv_items = defGetInt32(opt);
+
+ /* this option requires 'mcv' to be enabled */
+ require_mcv = true;
+
+ /* sanity check */
+ if (max_mcv_items < MVSTAT_MCVLIST_MIN_ITEMS)
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("max number of MCV items must be at least %d",
+ MVSTAT_MCVLIST_MIN_ITEMS)));
+
+ else if (max_mcv_items > MVSTAT_MCVLIST_MAX_ITEMS)
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("max number of MCV items is %d",
+ MVSTAT_MCVLIST_MAX_ITEMS)));
+
+ }
else
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
@@ -11964,10 +11993,16 @@ static void ATExecAddStatistics(AlteredTableInfo *tab, Relation rel,
}
/* check that at least some statistics were requested */
- if (! build_dependencies)
+ if (! (build_dependencies || build_mcv))
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
- errmsg("no statistics type (dependencies) was requested")));
+ errmsg("no statistics type (dependencies, mcv) was requested")));
+
+ /* now do some checking of the options */
+ if (require_mcv && (! build_mcv))
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("option 'mcv' is required by other options(s)")));
/* sort the attnums and build int2vector */
qsort(attnums, numcols, sizeof(int16), compare_int16);
@@ -11983,9 +12018,13 @@ static void ATExecAddStatistics(AlteredTableInfo *tab, Relation rel,
values[Anum_pg_mv_statistic_starelid-1] = ObjectIdGetDatum(RelationGetRelid(rel));
values[Anum_pg_mv_statistic_stakeys -1] = PointerGetDatum(stakeys);
+
values[Anum_pg_mv_statistic_deps_enabled -1] = BoolGetDatum(build_dependencies);
+ values[Anum_pg_mv_statistic_mcv_enabled -1] = BoolGetDatum(build_mcv);
+ values[Anum_pg_mv_statistic_mcv_max_items -1] = Int32GetDatum(max_mcv_items);
- nulls[Anum_pg_mv_statistic_stadeps -1] = true;
+ nulls[Anum_pg_mv_statistic_stadeps -1] = true;
+ nulls[Anum_pg_mv_statistic_stamcv -1] = true;
/* insert the tuple into pg_mv_statistic */
mvstatrel = heap_open(MvStatisticRelationId, RowExclusiveLock);
diff --git a/src/backend/optimizer/path/clausesel.c b/src/backend/optimizer/path/clausesel.c
index e742827..d24aedf 100644
--- a/src/backend/optimizer/path/clausesel.c
+++ b/src/backend/optimizer/path/clausesel.c
@@ -20,6 +20,7 @@
#include "optimizer/cost.h"
#include "optimizer/pathnode.h"
#include "optimizer/plancat.h"
+#include "optimizer/var.h"
#include "utils/fmgroids.h"
#include "utils/lsyscache.h"
#include "utils/selfuncs.h"
@@ -50,17 +51,46 @@ typedef struct RangeQueryClause
static void addRangeClause(RangeQueryClause **rqlist, Node *clause,
bool varonleft, bool isLTsel, Selectivity s2);
+#define MV_CLAUSE_TYPE_FDEP 0x01
+#define MV_CLAUSE_TYPE_MCV 0x02
static bool clause_is_mv_compatible(PlannerInfo *root, Node *clause, Oid varRelid,
- Oid *relid, AttrNumber *attnum, SpecialJoinInfo *sjinfo);
+ Oid *relid, Bitmapset **attnums, SpecialJoinInfo *sjinfo,
+ int type);
static Bitmapset *collect_mv_attnums(PlannerInfo *root, List *clauses,
- Oid varRelid, Oid *relid, SpecialJoinInfo *sjinfo);
+ Oid varRelid, Oid *relid, SpecialJoinInfo *sjinfo,
+ int type);
static List *clauselist_apply_dependencies(PlannerInfo *root, List *clauses,
Oid varRelid, int nmvstats, MVStats mvstats,
SpecialJoinInfo *sjinfo);
+static int choose_mv_statistics(int nmvstats, MVStats mvstats,
+ Bitmapset *attnums);
+static List *clauselist_mv_split(PlannerInfo *root, SpecialJoinInfo *sjinfo,
+ List *clauses, Oid varRelid,
+ List **mvclauses, MVStats mvstats, int types);
+
+static Selectivity clauselist_mv_selectivity(PlannerInfo *root,
+ List *clauses, MVStats mvstats);
+static Selectivity clauselist_mv_selectivity_mcvlist(PlannerInfo *root,
+ List *clauses, MVStats mvstats,
+ bool *fullmatch, Selectivity *lowsel);
+
+static int update_match_bitmap_mcvlist(PlannerInfo *root, List *clauses,
+ int2vector *stakeys, MCVList mcvlist,
+ int nmatches, char * matches,
+ Selectivity *lowsel, bool *fullmatch,
+ bool is_or);
+
+/* used for merging bitmaps - AND (min), OR (max) */
+#define MAX(x, y) (((x) > (y)) ? (x) : (y))
+#define MIN(x, y) (((x) < (y)) ? (x) : (y))
+
+#define UPDATE_RESULT(m,r,or) \
+ (m) = (or) ? (MAX(m,r)) : (MIN(m,r))
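+
+/*
+ * Illustration (not from the original patch): this assumes
+ * MVSTATS_MATCH_FULL is numerically greater than MVSTATS_MATCH_NONE,
+ * which the MIN/MAX merge relies on. Merging a "full match" with a
+ * "no match" for the same MCV item then gives:
+ *
+ *   m = MVSTATS_MATCH_FULL; r = MVSTATS_MATCH_NONE;
+ *   UPDATE_RESULT(m, r, false);   AND-merge (MIN) => m = MATCH_NONE
+ *   UPDATE_RESULT(m, r, true);    OR-merge (MAX)  => m = MATCH_FULL
+ */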
+
/****************************************************************************
* ROUTINES TO COMPUTE SELECTIVITIES
****************************************************************************/
@@ -197,15 +227,19 @@ clauselist_selectivity(PlannerInfo *root,
Bitmapset *mvattnums = NULL;
/*
- * If there's exactly one clause, then no use in trying to match up pairs,
- * so just go directly to clause_selectivity().
+ * If there's exactly one clause, then no use in trying to match up
+ * pairs, so just go directly to clause_selectivity().
*/
if (list_length(clauses) == 1)
return clause_selectivity(root, (Node *) linitial(clauses),
varRelid, jointype, sjinfo);
- /* collect attributes referenced by mv-compatible clauses */
- mvattnums = collect_mv_attnums(root, clauses, varRelid, &relid, sjinfo);
+ /*
+ * Collect attributes referenced by mv-compatible clauses (looking
+ * for clauses compatible with functional dependencies for now).
+ */
+ mvattnums = collect_mv_attnums(root, clauses, varRelid, &relid, sjinfo,
+ MV_CLAUSE_TYPE_FDEP);
/*
* If there are mv-compatible clauses, referencing at least two
@@ -227,6 +261,49 @@ clauselist_selectivity(PlannerInfo *root,
}
/*
+ * Recollect attributes from mv-compatible clauses (maybe we've
+ * removed so many clauses that only a single mv-compatible attnum
+ * remains). From now on we're only interested in MCV-compatible
+ * clauses.
+ */
+ mvattnums = collect_mv_attnums(root, clauses, varRelid, &relid, sjinfo,
+ MV_CLAUSE_TYPE_MCV);
+
+ /*
+ * If there still are at least two columns, we'll try to select
+ * suitable multivariate statistics.
+ */
+ if (bms_num_members(mvattnums) >= 2)
+ {
+ /* fetch info from the catalog (not the serialized stats yet) */
+ mvstats = list_mv_stats(relid, &nmvstats, true);
+
+ /* see choose_mv_statistics() for details */
+ if (nmvstats > 0)
+ {
+ int idx = choose_mv_statistics(nmvstats, mvstats, mvattnums);
+
+ if (idx >= 0) /* we found matching stats */
+ {
+ MVStats mvstat = &mvstats[idx];
+
+ /* clauses compatible with multi-variate stats */
+ List *mvclauses = NIL;
+
+ /* split the clauselist into regular and mv-clauses */
+ clauses = clauselist_mv_split(root, sjinfo, clauses,
+ varRelid, &mvclauses, mvstat,
+ MV_CLAUSE_TYPE_MCV);
+
+ /* we've chosen the stats to match the clauses */
+ Assert(mvclauses != NIL);
+
+ /* compute the multivariate stats */
+ s1 *= clauselist_mv_selectivity(root, mvclauses, mvstat);
+ }
+ }
+ }
+
+ /*
* Initial scan over clauses. Anything that doesn't look like a potential
* rangequery clause gets multiplied into s1 and forgotten. Anything that
* does gets inserted into an rqlist entry.
@@ -901,12 +978,198 @@ clause_selectivity(PlannerInfo *root,
return s1;
}
+
+/*
+ * Estimate selectivity for the list of MV-compatible clauses, using that
+ * particular histogram.
+ *
+ * When we hit a single bucket, we don't know what portion of it actually
+ * matches the clauses (e.g. equality), and we use 1/2 the bucket by
+ * default. However, the MV histograms are usually less detailed than
+ * the per-column ones, so the sum over the buckets is often quite high
+ * (thanks to combining a lot of "partially hit" buckets).
+ *
+ * There are several ways to improve this, usually with cases when it
+ * won't really help. Also, the more complex the process, the worse
+ * the failures (i.e. misestimates).
+ *
+ * (1) Use the MV histogram only as a way to combine multiple
+ * per-column histograms, essentially rewriting
+ *
+ * P(A & B) = P(A) * P(B|A)
+ *
+ * where P(B|A) may be computed using a proper "slice" of the
+ * histogram, by first selecting only buckets where A is true, and
+ * then using the boundaries to 'restrict' the per-column histogram.
+ *
+ * With more clauses, it gets more complicated, of course
+ *
+ * P(A & B & C) = P(A & C) * P(B|A & C)
+ * = P(A) * P(C|A) * P(B|A & C)
+ *
+ * and so on.
+ *
+ * Of course, the question is how well and efficiently we can
+ * compute the conditional probabilities - whether this approach
+ * can improve the estimates (instead of amplifying the errors).
+ *
+ * Also, this does not eliminate the need for histogram on [A,B,C].
+ *
+ * (2) Use multiple smaller (and more accurate) histograms, and combine
+ * them using a process similar to the above. E.g. by assuming that
+ * B and C are independent, we can rewrite
+ *
+ * P(B|A & C) = P(B|A)
+ *
+ * so we can rewrite the whole formula to
+ *
+ * P(A & B & C) = P(A) * P(C|A) * P(B|A)
+ *
+ * and we're OK with two 2D histograms [A,C] and [A,B].
+ *
+ * It'd be nice to perform some sort of statistical test (Fisher's
+ * exact test or a chi-squared test) to identify independent
+ * components and automatically separate them into smaller histograms.
+ *
+ * (3) Using the estimated number of distinct values in a bucket to
+ * decide the selectivity of equality in the bucket (instead of
+ * blindly using 1/2 of the bucket, we may use 1/ndistinct).
+ * Of course, if the ndistinct estimate is way off, or when the
+ * distribution is not uniform (some distinct values occur much more
+ * often than others), this will fail. Also, we don't have an
+ * ndistinct estimate available at this point (but it shouldn't be
+ * difficult to compute, as ndistinct and ntuples should be available).
+ *
+ * TODO Clamp the selectivity by min of the per-clause selectivities
+ * (i.e. the selectivity of the most restrictive clause), because
+ * that's the maximum we can ever get from ANDed list of clauses.
+ * This would probably prevent issues with hitting too many buckets
+ * and low precision histograms.
+ *
+ * TODO We may support some additional conditions, most importantly
+ * those matching multiple columns (e.g. "a = b" or "a < b").
+ * Ultimately we could track multi-table histograms for join
+ * cardinality estimation.
+ *
+ * TODO Currently this is only estimating all clauses, or clauses
+ * matching varRelid (when it's not 0). I'm not sure what the
+ * purpose of varRelid is, but my assumption is that it's used for
+ * join conditions and such. In that case we can use those clauses
+ * to restrict the others (i.e. filter the histogram buckets first,
+ * before estimating the other clauses). This is essentially equal
+ * to computing P(A|B) where "B" are the clauses not matching the
+ * varRelid.
+ *
+ * TODO Further thoughts on processing equality clauses - maybe it'd be
+ * better to look for stats (with MCV) covered by the equality
+ * clauses, because then we have a chance to find an exact match
+ * in the MCV list, which is pretty much the best we can do. We may
+ * also look at the least frequent MCV item, and use it as an upper
+ * boundary for the selectivity (had there been a more frequent
+ * item, it'd be in the MCV list).
+ *
+ * These conditions may then be used as a condition for the other
+ * selectivities, i.e. we may estimate P(A,B) first, and then
+ * compute P(C|A,B) from another histogram. This may be useful when
+ * we can estimate P(A,B) accurately (e.g. because it's a complete
+ * equality match evaluated on MCV list), and then compute the
+ * conditional probability P(C|A,B), giving us the requested stats
+ *
+ * P(A,B,C) = P(A,B) * P(C|A,B)
+ *
+ * TODO There are several options for 'sanity clamping' the estimates.
+ *
+ * First, if we have selectivities for each condition, then
+ *
+ * P(A,B) <= MIN(P(A), P(B))
+ *
+ * Because additional conditions (connected by AND) can only lower
+ * the probability.
+ *
+ * So we can do some basic sanity checks using the single-variate
+ * stats (the ones we have right now).
+ *
+ * Second, when we have multivariate stats with a MCV list, then
+ *
+ * (a) if we have a full equality condition (one equality condition
+ * on each column) and we found a match in the MCV list, this is
+ * the selectivity (and it's supposed to be exact)
+ *
+ * (b) if we have a full equality condition and we haven't found a
+ * match in the MCV list, then the selectivity is below the
+ * lowest selectivity in the MCV list
+ *
+ * (c) if we have an equality condition (not full), we can still
+ * search the MCV for matches and use the sum of probabilities
+ * as a lower boundary for the histogram (if there are no
+ * matches in the MCV list, then we have no boundary)
+ *
+ * Third, if there are multiple multivariate stats for a set of
+ * clauses, we may compute all of them and then somehow aggregate
+ * them - e.g. by choosing the minimum, median or average. The
+ * multi-variate stats are susceptible to overestimation (because
+ * we take 50% of the bucket for partial matches). Some stats may
+ * give better estimates than others, but it's very difficult to
+ * determine in advance which one is the best (it depends
+ * on the number of buckets, number of additional columns not
+ * referenced in the clauses etc.) so we may compute all and then
+ * choose a sane aggregation (minimum seems like a good approach).
+ * Of course, this may result in longer / more expensive estimation
+ * (CPU-wise), but it may be worth it.
+ *
+ * There are ways to address this, though. First, it's possible to
+ * add a GUC choosing between a 'simple' estimation (using the single
+ * statistics expected to give the best estimate) and a 'full' one
+ * (combining the multiple estimates).
+ *
+ * multivariate_estimates = (simple|full)
+ *
+ * Also, this might be enabled at a table level, by something like
+ *
+ * ALTER TABLE ... SET STATISTICS (simple|full)
+ *
+ * Which would make it possible to use this only for the tables
+ * where the simple approach does not work.
+ *
+ * Also, there are ways to optimize this algorithmically. E.g. we
+ * may try to get an estimate from a matching MCV list first, and
+ * if we happen to get a "full equality match" we may stop computing
+ * the estimates from other stats (for this condition) because
+ * that's probably the best estimate we can really get.
+ *
+ * TODO When applying the clauses to the histogram/MCV list, we can
+ * process the most selective clauses first, because that'll
+ * eliminate the buckets/items sooner (so we'll be able to skip
+ * them without the more expensive inspection).
+ */
+static Selectivity
+clauselist_mv_selectivity(PlannerInfo *root, List *clauses, MVStats mvstats)
+{
+ bool fullmatch = false;
+
+ /*
+ * Lowest frequency in the MCV list (may be used as an upper bound
+ * for full equality conditions that did not match any MCV item).
+ */
+ Selectivity mcv_low = 0.0;
+
+ /* TODO Evaluate simple 1D selectivities, use the smallest one as
+ * an upper bound, product as lower bound, and sort the
+ * clauses in ascending order by selectivity (to optimize the
+ * MCV/histogram evaluation).
+ */
+
+ /* Evaluate the MCV selectivity */
+ return clauselist_mv_selectivity_mcvlist(root, clauses, mvstats,
+ &fullmatch, &mcv_low);
+}
+
/*
* Collect attributes from mv-compatible clauses.
*/
static Bitmapset *
collect_mv_attnums(PlannerInfo *root, List *clauses, Oid varRelid,
- Oid *relid, SpecialJoinInfo *sjinfo)
+ Oid *relid, SpecialJoinInfo *sjinfo, int types)
{
Bitmapset *attnums = NULL;
ListCell *l;
@@ -922,12 +1185,11 @@ collect_mv_attnums(PlannerInfo *root, List *clauses, Oid varRelid,
*/
foreach (l, clauses)
{
- AttrNumber attnum;
Node *clause = (Node *) lfirst(l);
- /* ignore the result for now - we only need the info */
- if (clause_is_mv_compatible(root, clause, varRelid, relid, &attnum, sjinfo))
- attnums = bms_add_member(attnums, attnum);
+ /* ignore the result here - we only need the attnums */
+ clause_is_mv_compatible(root, clause, varRelid, relid, &attnums,
+ sjinfo, types);
}
/*
@@ -946,6 +1208,180 @@ collect_mv_attnums(PlannerInfo *root, List *clauses, Oid varRelid,
}
/*
+ * We're looking for statistics matching at least 2 attributes,
+ * referenced in the clauses compatible with multivariate statistics.
+ * The current selection criterion is very simple - we choose the
+ * statistics referencing the most attributes.
+ *
+ * If there are multiple statistics referencing the same number of
+ * columns (from the clauses), the one with fewer source columns
+ * (as listed in the ADD STATISTICS when creating the statistics) wins.
+ * Otherwise the first one wins.
+ *
+ * This is a very simple criterion, and it has several weaknesses:
+ *
+ * (a) does not consider the accuracy of the statistics
+ *
+ * If there are two histograms built on the same set of columns,
+ * but one has 100 buckets and the other one has 1000 buckets (thus
+ * likely providing better estimates), this is not currently
+ * considered.
+ *
+ * (b) does not consider the type of statistics
+ *
+ * If there are three statistics - one containing just a MCV list,
+ * another one with just a histogram and a third one with both,
+ * this is not considered.
+ *
+ * (c) does not consider the number of clauses
+ *
+ * As explained, only the number of referenced attributes counts,
+ * so if there are multiple clauses on a single attribute, this
+ * still counts as a single attribute.
+ *
+ * (d) does not consider type of condition
+ *
+ * Some clauses may work better with some statistics - for example
+ * equality clauses probably work better with MCV lists than with
+ * histograms. But IS [NOT] NULL conditions may often work better
+ * with histograms (thanks to NULL-buckets).
+ *
+ * So for example with five WHERE conditions
+ *
+ * WHERE (a = 1) AND (b = 1) AND (c = 1) AND (d = 1) AND (e = 1)
+ *
+ * and statistics on (a,b), (a,b,e) and (a,b,c,d), the last one will be
+ * selected as it references the most columns.
+ *
+ * Once we have selected the multivariate statistics, we split the list
+ * of clauses into two parts - conditions that are compatible with the
+ * selected stats, and conditions that will be estimated using simple
+ * statistics.
+ *
+ * From the example above, conditions
+ *
+ * (a = 1) AND (b = 1) AND (c = 1) AND (d = 1)
+ *
+ * will be estimated using the multivariate statistics (a,b,c,d) while
+ * the last condition (e = 1) will get estimated using the regular ones.
+ *
+ * There are various alternative selection criteria (e.g. counting
+ * conditions instead of just referenced attributes), but eventually
+ * the best option should be to combine multiple statistics. But that's
+ * much harder to do correctly.
+ *
+ * TODO Select multiple statistics and combine them when computing
+ * the estimate.
+ *
+ * TODO This will probably have to consider compatibility of clauses,
+ * because 'dependencies' will probably work only with equality
+ * clauses.
+ */
+static int
+choose_mv_statistics(int nmvstats, MVStats mvstats, Bitmapset *attnums)
+{
+ int i, j;
+
+ int choice = -1;
+ int current_matches = 1; /* goal #1: maximize */
+ int current_dims = (MVSTATS_MAX_DIMENSIONS+1); /* goal #2: minimize */
+
+ /*
+ * Walk through the statistics (simple array with nmvstats elements)
+ * and for each one count the referenced attributes (encoded in
+ * the 'attnums' bitmap).
+ */
+ for (i = 0; i < nmvstats; i++)
+ {
+ /* columns matching this statistics */
+ int matches = 0;
+
+ int2vector * attrs = mvstats[i].stakeys;
+ int numattrs = mvstats[i].stakeys->dim1;
+
+ /* count columns covered by the histogram */
+ for (j = 0; j < numattrs; j++)
+ if (bms_is_member(attrs->values[j], attnums))
+ matches++;
+
+ /*
+ * Use these statistics if they improve the number of matches, or
+ * if they match the same number of attributes but have fewer
+ * source columns.
+ */
+ if ((matches > current_matches) ||
+ ((matches == current_matches) && (current_dims > numattrs)))
+ {
+ choice = i;
+ current_matches = matches;
+ current_dims = numattrs;
+ }
+ }
+
+ return choice;
+}
+
+
+/*
+ * This splits the clauses list into two parts - one containing clauses
+ * that will be evaluated using the chosen statistics, and the remaining
+ * clauses (either non-mvcompatible, or not related to the histogram).
+ */
+static List *
+clauselist_mv_split(PlannerInfo *root, SpecialJoinInfo *sjinfo,
+ List *clauses, Oid varRelid, List **mvclauses,
+ MVStats mvstats, int types)
+{
+ int i;
+ ListCell *l;
+ List *non_mvclauses = NIL;
+
+ /* FIXME is there a better way to get info on int2vector? */
+ int2vector * attrs = mvstats->stakeys;
+ int numattrs = mvstats->stakeys->dim1;
+
+ Bitmapset *mvattnums = NULL;
+
+ /* build bitmap of attributes covered by the stats, so we can
+ * do bms_is_subset later */
+ for (i = 0; i < numattrs; i++)
+ mvattnums = bms_add_member(mvattnums, attrs->values[i]);
+
+ /* erase the list of mv-compatible clauses */
+ *mvclauses = NIL;
+
+ foreach (l, clauses)
+ {
+ bool match = false; /* by default not mv-compatible */
+ Bitmapset *attnums = NULL;
+ Node *clause = (Node *) lfirst(l);
+
+ if (clause_is_mv_compatible(root, clause, varRelid, NULL,
+ &attnums, sjinfo, types))
+ {
+ /* are all the attributes part of the selected stats? */
+ if (bms_is_subset(attnums, mvattnums))
+ match = true;
+ }
+
+ /*
+ * If the clause matches the selected stats, put it into the list
+ * of mv-compatible clauses. Otherwise keep it in the list of
+ * 'regular' clauses (to be estimated the usual way).
+ */
+ if (match)
+ *mvclauses = lappend(*mvclauses, clause);
+ else
+ non_mvclauses = lappend(non_mvclauses, clause);
+ }
+
+ /*
+ * Return the remaining clauses, to be estimated the regular way
+ * (they are incompatible with the chosen MV stats).
+ */
+ return non_mvclauses;
+
+}
+
+/*
* Determines whether the clause is compatible with multivariate stats,
* and if it is, returns some additional information - varno (index
* into simple_rte_array) and a bitmap of attributes. This is then
@@ -964,96 +1400,205 @@ collect_mv_attnums(PlannerInfo *root, List *clauses, Oid varRelid,
*/
static bool
clause_is_mv_compatible(PlannerInfo *root, Node *clause, Oid varRelid,
- Oid *relid, AttrNumber *attnum, SpecialJoinInfo *sjinfo)
+ Oid *relid, Bitmapset **attnums, SpecialJoinInfo *sjinfo,
+ int types)
{
+ Relids clause_relids;
+ Relids left_relids;
+ Relids right_relids;
if (IsA(clause, RestrictInfo))
{
RestrictInfo *rinfo = (RestrictInfo *) clause;
/* Pseudoconstants are not really interesting here. */
if (rinfo->pseudoconstant)
return false;
- /* no support for OR clauses at this point */
- if (rinfo->orclause)
- return false;
/* get the actual clause from the RestrictInfo (it's not an OR clause) */
clause = (Node*)rinfo->clause;
- /* only simple opclauses are compatible with multivariate stats */
- if (! is_opclause(clause))
- return false;
-
/* we don't support join conditions at this moment */
if (treat_as_join_clause(clause, rinfo, varRelid, sjinfo))
return false;
+ clause_relids = rinfo->clause_relids;
+ left_relids = rinfo->left_relids;
+ right_relids = rinfo->right_relids;
+ }
+ else if (is_opclause(clause) && list_length(((OpExpr *) clause)->args) == 2)
+ {
+ left_relids = pull_varnos(get_leftop((Expr*)clause));
+ right_relids = pull_varnos(get_rightop((Expr*)clause));
+
+ clause_relids = bms_union(left_relids,
+ right_relids);
+ }
+ else
+ {
+ /* Not a binary opclause, so mark left/right relid sets as empty */
+ left_relids = NULL;
+ right_relids = NULL;
+ /* and get the total relid set the hard way */
+ clause_relids = pull_varnos((Node *) clause);
+ }
+
+ /*
+ * Only simple opclauses and IS NULL tests are compatible with
+ * multivariate stats at this point.
+ */
+ if ((is_opclause(clause))
+ && (list_length(((OpExpr *) clause)->args) == 2))
+ {
+ OpExpr *expr = (OpExpr *) clause;
+ bool varonleft = true;
+ bool ok;
+
/* is it 'variable op constant' ? */
- if (list_length(((OpExpr *) clause)->args) == 2)
- {
- OpExpr *expr = (OpExpr *) clause;
- bool varonleft = true;
- bool ok;
- ok = (bms_membership(rinfo->clause_relids) == BMS_SINGLETON) &&
- (is_pseudo_constant_clause_relids(lsecond(expr->args),
- rinfo->right_relids) ||
- (varonleft = false,
- is_pseudo_constant_clause_relids(linitial(expr->args),
- rinfo->left_relids)));
+ ok = (bms_membership(clause_relids) == BMS_SINGLETON) &&
+ (is_pseudo_constant_clause_relids(lsecond(expr->args),
+ right_relids) ||
+ (varonleft = false,
+ is_pseudo_constant_clause_relids(linitial(expr->args),
+ left_relids)));
- if (ok)
- {
- RangeTblEntry * rte;
- Var * var = (varonleft) ? linitial(expr->args) : lsecond(expr->args);
+ if (ok)
+ {
+ RangeTblEntry * rte;
+ Var * var = (varonleft) ? linitial(expr->args) : lsecond(expr->args);
- /*
- * Simple variables only - otherwise the planner_rt_fetch seems to fail
- * (return NULL).
- *
- * TODO Maybe use examine_variable() would fix that?
- */
- if (! (IsA(var, Var) && (varRelid == 0 || varRelid == var->varno)))
- return false;
+ /*
+ * Simple variables only - otherwise the planner_rt_fetch seems to fail
+ * (return NULL).
+ *
+ * TODO Maybe using examine_variable() would fix that?
+ */
+ if (! (IsA(var, Var) && (varRelid == 0 || varRelid == var->varno)))
+ return false;
- /*
- * Only consider this variable if (varRelid == 0) or when the varno
- * matches varRelid (see explanation at clause_selectivity).
- *
- * FIXME I suspect this may not be really necessary. The (varRelid == 0)
- * part seems to be enforced by treat_as_join_clause().
- */
- if (! ((varRelid == 0) || (varRelid == var->varno)))
- return false;
+ /*
+ * Only consider this variable if (varRelid == 0) or when the varno
+ * matches varRelid (see explanation at clause_selectivity).
+ *
+ * FIXME I suspect this may not be really necessary. The (varRelid == 0)
+ * part seems to be enforced by treat_as_join_clause().
+ */
+ if (! ((varRelid == 0) || (varRelid == var->varno)))
+ return false;
- /* Also skip special varno values, and system attributes ... */
- if ((IS_SPECIAL_VARNO(var->varno)) || (! AttrNumberIsForUserDefinedAttr(var->varattno)))
- return false;
+ /* Also skip special varno values, and system attributes ... */
+ if ((IS_SPECIAL_VARNO(var->varno)) || (! AttrNumberIsForUserDefinedAttr(var->varattno)))
+ return false;
- /* Lookup info about the base relation (we need to pass the OID out) */
+ /* Lookup info about the base relation (we need to pass the OID out) */
+ if (relid != NULL)
+ {
rte = planner_rt_fetch(var->varno, root);
*relid = rte->relid;
-
- /*
- * If it's not a "<" or ">" or "=" operator, just ignore the
- * clause. Otherwise note the relid and attnum for the variable.
- * This uses the function for estimating selectivity, ont the
- * operator directly (a bit awkward, but well ...).
- */
- switch (get_oprrest(expr->opno))
- {
- case F_EQSEL:
- *attnum = var->varattno;
- return true;
- }
}
+
+ /*
+ * If it's not a "<" or ">" or "=" operator, just ignore the
+ * clause. Otherwise note the relid and attnum for the variable.
+ * This uses the function for estimating selectivity, not the
+ * operator directly (a bit awkward, but well ...).
+ */
+ switch (get_oprrest(expr->opno))
+ {
+ case F_SCALARLTSEL:
+ case F_SCALARGTSEL:
+ /* not compatible with functional dependencies */
+ if (types & MV_CLAUSE_TYPE_MCV)
+ {
+ *attnums = bms_add_member(*attnums, var->varattno);
+ return true;
+ }
+ return false;
+
+ case F_EQSEL:
+ *attnums = bms_add_member(*attnums, var->varattno);
+ return true;
+ }
}
}
+ else if (IsA(clause, NullTest)
+ && IsA(((NullTest*)clause)->arg, Var))
+ {
+ RangeTblEntry * rte;
+ Var * var = (Var*)((NullTest*)clause)->arg;
- return false;
+ /*
+ * Simple variables only - otherwise the planner_rt_fetch seems to fail
+ * (return NULL).
+ *
+ * TODO Maybe using examine_variable() would fix that?
+ */
+ if (! (IsA(var, Var) && (varRelid == 0 || varRelid == var->varno)))
+ return false;
+ /*
+ * Only consider this variable if (varRelid == 0) or when the varno
+ * matches varRelid (see explanation at clause_selectivity).
+ *
+ * FIXME I suspect this may not be really necessary. The (varRelid == 0)
+ * part seems to be enforced by treat_as_join_clause().
+ */
+ if (! ((varRelid == 0) || (varRelid == var->varno)))
+ return false;
+
+ /* Also skip special varno values, and system attributes ... */
+ if ((IS_SPECIAL_VARNO(var->varno)) || (! AttrNumberIsForUserDefinedAttr(var->varattno)))
+ return false;
+
+ /* Lookup info about the base relation (we need to pass the OID out) */
+ if (relid != NULL)
+ {
+ rte = planner_rt_fetch(var->varno, root);
+ *relid = rte->relid;
+ }
+
+ *attnums = bms_add_member(*attnums, var->varattno);
+
+ return true;
+ }
+ else if (or_clause(clause) || and_clause(clause))
+ {
+ /*
+ * AND/OR-clauses are supported if all sub-clauses are supported
+ *
+ * TODO We might support mixed case, where some of the clauses
+ * are supported and some are not, and treat all supported
+ * subclauses as a single clause, compute its selectivity
+ * using mv stats, and compute the total selectivity using
+ * the current algorithm.
+ *
+ * TODO For RestrictInfo above an OR-clause, we might use the
+ * orclause with nested RestrictInfo - we won't have to
+ * call pull_varnos() for each clause, saving time.
+ */
+ Bitmapset *tmp = NULL;
+ ListCell *l;
+ foreach (l, ((BoolExpr*)clause)->args)
+ {
+ if (! clause_is_mv_compatible(root, (Node*)lfirst(l),
+ varRelid, relid, &tmp, sjinfo, types))
+ return false;
+ }
+
+ /* add the attnums from the AND/OR-clause to the set of attnums */
+ *attnums = bms_join(*attnums, tmp);
+
+ return true;
+ }
+
+ return false;
}
/*
@@ -1115,6 +1660,13 @@ clause_is_mv_compatible(PlannerInfo *root, Node *clause, Oid varRelid,
*
* TODO Merge this docs to dependencies.c, as it's saying mostly the
* same things as the comments there.
+ *
+ * TODO Currently this is applied only to the top-level clauses, but
+ * maybe we could apply it to lists at subtrees too, e.g. to the
+ * two AND-clauses in
+ *
+ * (x=1 AND y=2) OR (z=3 AND q=10)
+ *
*/
static List *
clauselist_apply_dependencies(PlannerInfo *root, List *clauses, Oid varRelid,
@@ -1195,17 +1747,27 @@ clauselist_apply_dependencies(PlannerInfo *root, List *clauses, Oid varRelid,
*/
foreach (lc, clauses)
{
- AttrNumber attnum;
+ Bitmapset *attnums = NULL;
Node *clause = (Node *) lfirst(lc);
- if (! clause_is_mv_compatible(root, clause, varRelid, &relid, &attnum, sjinfo))
+ if (! clause_is_mv_compatible(root, clause, varRelid, &relid, &attnums,
+ sjinfo, MV_CLAUSE_TYPE_FDEP))
+ reduced_clauses = lappend(reduced_clauses, clause);
+ else if (bms_num_members(attnums) > 1)
+ /* FIXME This may happen thanks to OR-clauses, which should
+ * really be handled differently for functional
+ * dependencies.
+ */
reduced_clauses = lappend(reduced_clauses, clause);
else
{
+ /* functional dependencies support only [Var = Const] */
+ Assert(bms_num_members(attnums) == 1);
mvclauses[nmvclauses] = clause;
- mvattnums[nmvclauses] = attnum;
+ mvattnums[nmvclauses] = bms_singleton_member(attnums);
nmvclauses++;
- clause_attnums = bms_add_member(clause_attnums, attnum);
+ clause_attnums = bms_add_member(clause_attnums,
+ bms_singleton_member(attnums));
}
}
@@ -1430,3 +1992,446 @@ clauselist_apply_dependencies(PlannerInfo *root, List *clauses, Oid varRelid,
return reduced_clauses;
}
+
+/*
+ * Estimate selectivity of clauses using a MCV list.
+ *
+ * If there's no MCV list for the stats, the function returns 0.0.
+ *
+ * While computing the estimate, the function checks whether all the
+ * columns were matched with an equality condition. If that's the case,
+ * we can skip processing the histogram, as there can be no rows in
+ * it with the same values - all the rows matching the condition are
+ * represented by the MCV item. This can only happen with equality
+ * on all the attributes.
+ *
+ * The algorithm works like this:
+ *
+ * 1) mark all items as 'match'
+ * 2) walk through all the clauses
+ * 3) for a particular clause, walk through all the items
+ * 4) skip items that are already 'no match'
+ * 5) check clause for items that still match
+ * 6) sum frequencies for items to get selectivity
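+ *
+ * For example (an illustration, not from the original patch): with
+ * MCV items {(1,1): 0.3, (1,2): 0.2, (2,2): 0.1} and the clauses
+ * (a = 1) AND (b = 2), the clause (a = 1) rules out (2,2), the
+ * clause (b = 2) rules out (1,1), and the estimate is the frequency
+ * of the remaining item (1,2), i.e. 0.2.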
+ *
+ * The function also returns the frequency of the least frequent item
+ * on the MCV list, which may be useful for clamping estimate from the
+ * histogram (all items not present in the MCV list are less frequent).
+ * This however seems useful only for cases with conditions on all
+ * attributes.
+ *
+ * TODO This only handles AND-ed clauses, but it might work for OR-ed
+ * lists too - it just needs to reverse the logic a bit. I.e. start
+ * with 'no match' for all items, and mark the items as a match
+ * as the clauses are processed (and skip items that are 'match').
+ */
+static Selectivity
+clauselist_mv_selectivity_mcvlist(PlannerInfo *root, List *clauses,
+ MVStats mvstats, bool *fullmatch,
+ Selectivity *lowsel)
+{
+ int i;
+ Selectivity s = 0.0;
+ MCVList mcvlist = NULL;
+ int nmatches = 0;
+
+ /* match/mismatch bitmap for each MCV item */
+ char * matches = NULL;
+
+ Assert(clauses != NIL);
+ Assert(list_length(clauses) >= 2);
+
+ /* there's no MCV list built yet */
+ if (! mvstats->mcv_built)
+ return 0.0;
+
+ mcvlist = deserialize_mv_mcvlist(fetch_mv_mcvlist(mvstats->mvoid));
+
+ Assert(mcvlist != NULL);
+ Assert(mcvlist->nitems > 0);
+
+ /* by default all the MCV items match the clauses fully */
+ matches = palloc0(sizeof(char) * mcvlist->nitems);
+ memset(matches, MVSTATS_MATCH_FULL, sizeof(char)*mcvlist->nitems);
+
+ /* number of matching MCV items */
+ nmatches = mcvlist->nitems;
+
+ nmatches = update_match_bitmap_mcvlist(root, clauses,
+ mvstats->stakeys, mcvlist,
+ nmatches, matches,
+ lowsel, fullmatch, false);
+
+ /* sum frequencies for all the matching MCV items */
+ for (i = 0; i < mcvlist->nitems; i++)
+ {
+ if (matches[i] != MVSTATS_MATCH_NONE)
+ s += mcvlist->items[i]->frequency;
+ }
+
+ pfree(matches);
+ pfree(mcvlist);
+
+ return s;
+}
+
+/*
+ * Evaluate clauses using the MCV list, and update the match bitmap.
+ *
+ * The bitmap may be already partially set, so this is really a way to
+ * combine results of several clause lists - either when computing
+ * conditional probability P(A|B) or a combination of AND/OR clauses.
+ *
+ * TODO This works with 'bitmap' where each bit is represented as a char,
+ * which is slightly wasteful. Instead, we could use a regular
+ * bitmap, reducing the size to ~1/8. Another thing is merging the
+ * bitmaps using & and |, which might be faster than min/max.
+ */
+static int
+update_match_bitmap_mcvlist(PlannerInfo *root, List *clauses,
+ int2vector *stakeys, MCVList mcvlist,
+ int nmatches, char * matches,
+ Selectivity *lowsel, bool *fullmatch,
+ bool is_or)
+{
+ int i;
+ ListCell * l;
+
+ Bitmapset *eqmatches = NULL; /* attributes with equality matches */
+
+ /* The bitmap may be partially built. */
+ Assert(nmatches >= 0);
+ Assert(nmatches <= mcvlist->nitems);
+ Assert(clauses != NIL);
+ Assert(list_length(clauses) >= 1);
+ Assert(mcvlist != NULL);
+ Assert(mcvlist->nitems > 0);
+
+ /* nothing to do - no remaining matches (AND) or everything matches (OR) */
+ if (((nmatches == 0) && (! is_or)) ||
+ ((nmatches == mcvlist->nitems) && is_or))
+ return nmatches;
+
+ /* frequency of the lowest MCV item */
+ *lowsel = 1.0;
+
+ /*
+ * Loop through the list of clauses, and for each of them evaluate
+ * all the MCV items not yet eliminated by the preceding clauses.
+ *
+ * FIXME This would probably deserve a refactoring, I guess. Unify
+ * the two loops and put the checks inside, or something like
+ * that.
+ */
+ foreach (l, clauses)
+ {
+ Node * clause = (Node*)lfirst(l);
+
+ /* if it's a RestrictInfo, then extract the clause */
+ if (IsA(clause, RestrictInfo))
+ clause = (Node*)((RestrictInfo*)clause)->clause;
+
+ /* if there are no remaining matches possible, we can stop */
+ if (((nmatches == 0) && (! is_or)) ||
+ ((nmatches == mcvlist->nitems) && is_or))
+ break;
+
+ /* it's an OpExpr, a NullTest or an AND/OR clause */
+ if (is_opclause(clause))
+ {
+ OpExpr * expr = (OpExpr*)clause;
+ bool varonleft = true;
+ bool ok;
+
+ /* operator */
+ FmgrInfo opproc;
+
+ fmgr_info(get_opcode(expr->opno), &opproc);
+
+ ok = (NumRelids(clause) == 1) &&
+ (is_pseudo_constant_clause(lsecond(expr->args)) ||
+ (varonleft = false,
+ is_pseudo_constant_clause(linitial(expr->args))));
+
+ if (ok)
+ {
+
+ FmgrInfo ltproc, gtproc;
+ Var * var = (varonleft) ? linitial(expr->args) : lsecond(expr->args);
+ Const * cst = (varonleft) ? lsecond(expr->args) : linitial(expr->args);
+ bool isgt = (! varonleft);
+
+ /*
+ * TODO Fetch only when really needed (probably for equality only)
+ * TODO Technically either lt/gt is sufficient.
+ *
+ * FIXME The code in analyze.c creates histograms only for types
+ * with enough ordering (by calling get_sort_group_operators).
+ * Is this the same assumption, i.e. are we certain that we
+ * get the ltproc/gtproc every time we ask? Or are there types
+ * where get_sort_group_operators returns ltopr and here we
+ * get nothing?
+ */
+ TypeCacheEntry *typecache
+ = lookup_type_cache(var->vartype,
+ TYPECACHE_EQ_OPR | TYPECACHE_LT_OPR | TYPECACHE_GT_OPR);
+
+ /* FIXME do proper matching of the attribute to the dimension */
+ int idx = mv_get_index(var->varattno, stakeys);
+
+ fmgr_info(get_opcode(typecache->lt_opr), <proc);
+ fmgr_info(get_opcode(typecache->gt_opr), >proc);
+
+ /*
+ * Walk through the MCV items and evaluate the current clause. We can
+ * skip items that were already ruled out, and terminate if there are
+ * no remaining MCV items that might possibly match.
+ */
+ for (i = 0; i < mcvlist->nitems; i++)
+ {
+ bool mismatch = false;
+ MCVItem item = mcvlist->items[i];
+
+ /*
+ * find the lowest selectivity in the MCV
+ * FIXME Maybe not the best place to do this (it's repeated for each clause).
+ */
+ if (item->frequency < *lowsel)
+ *lowsel = item->frequency;
+
+ /*
+ * If there are no more matches (AND) or no remaining unmatched
+ * items (OR), we can stop processing this clause.
+ */
+ if (((nmatches == 0) && (! is_or)) ||
+ ((nmatches == mcvlist->nitems) && is_or))
+ break;
+
+ /*
+ * For AND-lists, we can also mark NULL items as 'no match' (and
+ * then skip them). For OR-lists this is not possible.
+ */
+ if ((! is_or) && item->isnull[idx])
+ matches[i] = MVSTATS_MATCH_NONE;
+
+ /* skip MCV items that were already ruled out */
+ if ((! is_or) && (matches[i] == MVSTATS_MATCH_NONE))
+ continue;
+ else if (is_or && (matches[i] == MVSTATS_MATCH_FULL))
+ continue;
+
+ /* TODO consider bsearch here (list is sorted by values)
+ * TODO handle other operators too (LT, GT)
+ * TODO identify "full match" when the clauses fully
+ * match the whole MCV list (so that checking the
+ * histogram is not needed)
+ */
+ if (get_oprrest(expr->opno) == F_EQSEL)
+ {
+ /*
+ * We don't care about isgt in equality, because it does not
+ * matter whether it's (var = const) or (const = var).
+ */
+ bool match = DatumGetBool(FunctionCall2Coll(&opproc,
+ DEFAULT_COLLATION_OID,
+ cst->constvalue,
+ item->values[idx]));
+
+ if (match)
+ eqmatches = bms_add_member(eqmatches, idx);
+
+ mismatch = (! match);
+ }
+ else if (get_oprrest(expr->opno) == F_SCALARLTSEL) /* column < constant */
+ {
+
+ if (! isgt) /* (var < const) */
+ {
+ /*
+ * Check whether the constant is below the MCV item's value (in
+ * that case the item cannot match the clause).
+ */
+ mismatch = DatumGetBool(FunctionCall2Coll(&opproc,
+ DEFAULT_COLLATION_OID,
+ cst->constvalue,
+ item->values[idx]));
+
+ } /* (get_oprrest(expr->opno) == F_SCALARLTSEL) */
+ else /* (const < var) */
+ {
+ /*
+ * Check whether the constant is above the MCV item's value (in
+ * that case the item cannot match the clause).
+ */
+ mismatch = DatumGetBool(FunctionCall2Coll(&opproc,
+ DEFAULT_COLLATION_OID,
+ item->values[idx],
+ cst->constvalue));
+ }
+ }
+ else if (get_oprrest(expr->opno) == F_SCALARGTSEL) /* column > constant */
+ {
+
+ if (! isgt) /* (var > const) */
+ {
+ /*
+ * Check whether the constant is above the MCV item's value (in
+ * that case the item cannot match the clause).
+ */
+ mismatch = DatumGetBool(FunctionCall2Coll(&opproc,
+ DEFAULT_COLLATION_OID,
+ cst->constvalue,
+ item->values[idx]));
+ }
+ else /* (const > var) */
+ {
+ /*
+ * Check whether the constant is below the MCV item's value (in
+ * that case the item cannot match the clause).
+ */
+ mismatch = DatumGetBool(FunctionCall2Coll(&opproc,
+ DEFAULT_COLLATION_OID,
+ item->values[idx],
+ cst->constvalue));
+ }
+
+ } /* (get_oprrest(expr->opno) == F_SCALARGTSEL) */
+
+ /* XXX The conditions on matches[i] are not needed, as we
+ * skip MCV items that can't become true/false, depending
+ * on the current flag. See beginning of the loop over
+ * MCV items.
+ */
+
+ if ((is_or) && (matches[i] == MVSTATS_MATCH_NONE) && (! mismatch))
+ {
+ /* OR - was MATCH_NONE, but will be MATCH_FULL */
+ matches[i] = MVSTATS_MATCH_FULL;
+ ++nmatches;
+ continue;
+ }
+ else if ((! is_or) && (matches[i] == MVSTATS_MATCH_FULL) && mismatch)
+ {
+ /* AND - was MATCH_FULL, but will be MATCH_NONE */
+ matches[i] = MVSTATS_MATCH_NONE;
+ --nmatches;
+ continue;
+ }
+
+ }
+ }
+ }
+ else if (IsA(clause, NullTest))
+ {
+ NullTest * expr = (NullTest*)clause;
+ Var * var = (Var*)(expr->arg);
+
+ /* FIXME do proper matching of the attribute to the dimension */
+ int idx = mv_get_index(var->varattno, stakeys);
+
+ /*
+ * Walk through the MCV items and evaluate the current clause. We can
+ * skip items that were already ruled out, and terminate if there are
+ * no remaining MCV items that might possibly match.
+ */
+ for (i = 0; i < mcvlist->nitems; i++)
+ {
+ MCVItem item = mcvlist->items[i];
+
+ /*
+ * find the lowest selectivity in the MCV
+ * FIXME Maybe not the best place to do this (it's repeated for each clause).
+ */
+ if (item->frequency < *lowsel)
+ *lowsel = item->frequency;
+
+ /* if there are no more matches, we can stop processing this clause */
+ if (nmatches == 0)
+ break;
+
+ /* skip MCV items that were already ruled out */
+ if (matches[i] == MVSTATS_MATCH_NONE)
+ continue;
+
+ /* if the clause mismatches the MCV item, set it as MATCH_NONE */
+ if (((expr->nulltesttype == IS_NULL) && (! mcvlist->items[i]->isnull[idx])) ||
+ ((expr->nulltesttype == IS_NOT_NULL) && (mcvlist->items[i]->isnull[idx])))
+ {
+ matches[i] = MVSTATS_MATCH_NONE;
+ --nmatches;
+ }
+ }
+ }
+ else if (or_clause(clause) || and_clause(clause))
+ {
+ /* AND/OR clause, with all clauses compatible with the selected MV stat */
+
+ int i;
+ BoolExpr *orclause = ((BoolExpr*)clause);
+ List *orclauses = orclause->args;
+
+ /* match/mismatch bitmap for each MCV item */
+ int or_nmatches = 0;
+ char * or_matches = NULL;
+
+ Assert(orclauses != NIL);
+ Assert(list_length(orclauses) >= 2);
+
+ /* number of matching MCV items */
+ or_nmatches = mcvlist->nitems;
+
+ /* by default none of the MCV items matches the clauses */
+ or_matches = palloc0(sizeof(char) * or_nmatches);
+
+ if (or_clause(clause))
+ {
+ /* OR clauses assume nothing matches, initially */
+ memset(or_matches, MVSTATS_MATCH_NONE, sizeof(char)*or_nmatches);
+ or_nmatches = 0;
+ }
+ else
+ {
+ /* AND clauses assume everything matches, initially */
+ memset(or_matches, MVSTATS_MATCH_FULL, sizeof(char)*or_nmatches);
+ }
+
+ /* build the match bitmap for the OR-clauses */
+ or_nmatches = update_match_bitmap_mcvlist(root, orclauses,
+ stakeys, mcvlist,
+ or_nmatches, or_matches,
+ lowsel, fullmatch, or_clause(clause));
+
+ /* merge the bitmap into the existing one */
+ for (i = 0; i < mcvlist->nitems; i++)
+ {
+ /*
+ * To AND-merge the bitmaps, a MIN() semantics is used.
+ * For OR-merge, use MAX().
+ *
+ * FIXME this does not decrease the number of matches
+ */
+ UPDATE_RESULT(matches[i], or_matches[i], is_or);
+ }
+
+ pfree(or_matches);
+
+ }
+ else
+ {
+ elog(ERROR, "unknown clause type: %d", clause->type);
+ }
+ }
+
+ /*
+ * If all the columns were matched by equality, it's a full match.
+ * In this case at most one MCV item can match the clauses (two
+ * matching items would have to be equal on all the columns, i.e.
+ * be the same item).
+ */
+ *fullmatch = (bms_num_members(eqmatches) == mcvlist->ndimensions);
+
+ /* free the allocated pieces */
+ if (eqmatches)
+ pfree(eqmatches);
+
+ return nmatches;
+}
diff --git a/src/backend/utils/mvstats/Makefile b/src/backend/utils/mvstats/Makefile
index 099f1ed..3c0aff4 100644
--- a/src/backend/utils/mvstats/Makefile
+++ b/src/backend/utils/mvstats/Makefile
@@ -12,6 +12,6 @@ subdir = src/backend/utils/mvstats
top_builddir = ../../../..
include $(top_builddir)/src/Makefile.global
-OBJS = common.o dependencies.o
+OBJS = common.o mcv.o dependencies.o
include $(top_srcdir)/src/backend/common.mk
diff --git a/src/backend/utils/mvstats/common.c b/src/backend/utils/mvstats/common.c
index d44b95a..bd952c6 100644
--- a/src/backend/utils/mvstats/common.c
+++ b/src/backend/utils/mvstats/common.c
@@ -17,8 +17,8 @@
#include "common.h"
static VacAttrStats ** lookup_var_attr_stats(int2vector *attrs,
- int natts, VacAttrStats **vacattrstats);
-
+ int natts,
+ VacAttrStats **vacattrstats);
/*
* Compute requested multivariate stats, using the rows sampled for the
@@ -44,6 +44,8 @@ build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
for (i = 0; i < nmvstats; i++)
{
MVDependencies deps = NULL;
+ MCVList mcvlist = NULL;
+ int numrows_filtered = 0;
/* int2 vector of attnums the stats should be computed on */
int2vector * attrs = mvstats[i].stakeys;
@@ -60,8 +62,12 @@ build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
if (mvstats->deps_enabled)
deps = build_mv_dependencies(numrows, rows, attrs, stats);
+ /* build the MCV list */
+ if (mvstats[i].mcv_enabled)
+ mcvlist = build_mv_mcvlist(numrows, rows, attrs, stats, &numrows_filtered);
+
/* store the histogram / MCV list in the catalog */
- update_mv_stats(mvstats[i].mvoid, deps);
+ update_mv_stats(mvstats[i].mvoid, deps, mcvlist, attrs, stats);
}
}
@@ -143,7 +149,7 @@ list_mv_stats(Oid relid, int *nstats, bool built_only)
* Skip statistics that were not computed yet (if only stats
* that were already built were requested)
*/
- if (built_only && (! stats->deps_built))
+ if (built_only && (! (stats->mcv_built || stats->deps_built)))
continue;
/* double the array size if needed */
@@ -156,7 +162,9 @@ list_mv_stats(Oid relid, int *nstats, bool built_only)
result[*nstats].mvoid = HeapTupleGetOid(htup);
result[*nstats].stakeys = buildint2vector(stats->stakeys.values, stats->stakeys.dim1);
result[*nstats].deps_enabled = stats->deps_enabled;
+ result[*nstats].mcv_enabled = stats->mcv_enabled;
result[*nstats].deps_built = stats->deps_built;
+ result[*nstats].mcv_built = stats->mcv_built;
*nstats += 1;
}
@@ -171,7 +179,9 @@ list_mv_stats(Oid relid, int *nstats, bool built_only)
}
void
-update_mv_stats(Oid mvoid, MVDependencies dependencies)
+update_mv_stats(Oid mvoid,
+ MVDependencies dependencies, MCVList mcvlist,
+ int2vector *attrs, VacAttrStats **stats)
{
HeapTuple stup,
oldtup;
@@ -196,15 +206,26 @@ update_mv_stats(Oid mvoid, MVDependencies dependencies)
= PointerGetDatum(serialize_mv_dependencies(dependencies));
}
+ if (mcvlist != NULL)
+ {
+ bytea * data = serialize_mv_mcvlist(mcvlist, attrs, stats);
+ nulls[Anum_pg_mv_statistic_stamcv -1] = (data == NULL);
+ values[Anum_pg_mv_statistic_stamcv - 1] = PointerGetDatum(data);
+ }
+
/* always replace the value (either by bytea or NULL) */
replaces[Anum_pg_mv_statistic_stadeps -1] = true;
+ replaces[Anum_pg_mv_statistic_stamcv -1] = true;
/* always change the availability flags */
nulls[Anum_pg_mv_statistic_deps_built -1] = false;
+ nulls[Anum_pg_mv_statistic_mcv_built -1] = false;
replaces[Anum_pg_mv_statistic_deps_built-1] = true;
+ replaces[Anum_pg_mv_statistic_mcv_built -1] = true;
values[Anum_pg_mv_statistic_deps_built-1] = BoolGetDatum(dependencies != NULL);
+ values[Anum_pg_mv_statistic_mcv_built -1] = BoolGetDatum(mcvlist != NULL);
/* Is there already a pg_mv_statistic tuple for this attribute? */
oldtup = SearchSysCache1(MVSTATOID,
@@ -232,6 +253,21 @@ update_mv_stats(Oid mvoid, MVDependencies dependencies)
heap_close(sd, RowExclusiveLock);
}
+
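+/*
+ * Return the index of the dimension (within the stats) the attribute
+ * maps to - e.g. for stakeys [2,5,7] and varattno 5 this returns 1
+ * (an illustration, not from the original patch). This relies on the
+ * attnums being sorted when the statistics are defined (see the qsort
+ * in ATExecAddStatistics).
+ */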
+int
+mv_get_index(AttrNumber varattno, int2vector * stakeys)
+{
+ int i, idx = 0;
+ for (i = 0; i < stakeys->dim1; i++)
+ {
+ if (stakeys->values[i] < varattno)
+ idx += 1;
+ else
+ break;
+ }
+ return idx;
+}
+
/* multi-variate stats comparator */
/*
@@ -242,11 +278,15 @@ update_mv_stats(Oid mvoid, MVDependencies dependencies)
int
compare_scalars_simple(const void *a, const void *b, void *arg)
{
- Datum da = *(Datum*)a;
- Datum db = *(Datum*)b;
- SortSupport ssup= (SortSupport) arg;
+ return compare_datums_simple(*(Datum*)a,
+ *(Datum*)b,
+ (SortSupport)arg);
+}
- return ApplySortComparator(da, false, db, false, ssup);
+int
+compare_datums_simple(Datum a, Datum b, SortSupport ssup)
+{
+ return ApplySortComparator(a, false, b, false, ssup);
}
/*
diff --git a/src/backend/utils/mvstats/common.h b/src/backend/utils/mvstats/common.h
index 6d5465b..f4309f7 100644
--- a/src/backend/utils/mvstats/common.h
+++ b/src/backend/utils/mvstats/common.h
@@ -46,7 +46,15 @@ typedef struct
Datum value; /* a data value */
int tupno; /* position index for tuple it came from */
} ScalarItem;
-
+
+/* (de)serialization info */
+typedef struct DimensionInfo {
+ int nvalues; /* number of deduplicated values */
+ int nbytes; /* number of bytes (serialized) */
+ int typlen; /* pg_type.typlen */
+ bool typbyval; /* pg_type.typbyval */
+} DimensionInfo;
+
/* multi-sort */
typedef struct MultiSortSupportData {
int ndims; /* number of dimensions supported by the */
@@ -71,5 +79,6 @@ int multi_sort_compare_dim(int dim, const SortItem *a,
const SortItem *b, MultiSortSupport mss);
/* comparators, used when constructing multivariate stats */
+int compare_datums_simple(Datum a, Datum b, SortSupport ssup);
int compare_scalars_simple(const void *a, const void *b, void *arg);
int compare_scalars_partition(const void *a, const void *b, void *arg);
diff --git a/src/backend/utils/mvstats/mcv.c b/src/backend/utils/mvstats/mcv.c
new file mode 100644
index 0000000..4466cee
--- /dev/null
+++ b/src/backend/utils/mvstats/mcv.c
@@ -0,0 +1,1002 @@
+/*-------------------------------------------------------------------------
+ *
+ * mcv.c
+ * POSTGRES multivariate MCV lists
+ *
+ *
+ * Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ * src/backend/utils/mvstats/mcv.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "common.h"
+
+/*
+ * Multivariate MCVs (most-common values lists) are a straightforward
+ * extension of the regular MCV list - they track combinations of values
+ * for several attributes (columns), including NULL flags, and the
+ * frequency of each combination.
+ *
+ * For columns with a small number of distinct values, this works quite
+ * well and may represent the distribution pretty exactly. For columns
+ * with a large number of distinct values (e.g. stored as FLOAT), this
+ * does not work that well.
+ *
+ * If we can represent the distribution as an MCV list, we can estimate
+ * some clauses (e.g. equality clauses) much more accurately than when
+ * using histograms, for example.
+ *
+ * Discrete distributions are also easier to combine into a larger
+ * distribution (but this is not yet implemented).
+ *
+ *
+ * TODO For types that don't reasonably support ordering (either because
+ * the type does not support that or when the user adds some option
+ * to the ADD STATISTICS command - e.g. UNSORTED_STATS), building
+ * the histogram may be pointless and inefficient. This is esp.
+ * true for varlena types that may be quite large and a large MCV
+ * list may be a better choice, because it makes equality estimates
+ * more accurate. Due to the unsorted nature, range queries on those
+ * attributes are rather useless anyway.
+ *
+ * Another thing is that by restricting to MCV list and equality
+ * conditions, we can use hash values instead of long varlena values.
+ * The equality estimation will be very accurate.
+ *
+ * This however complicates matching the columns to available
+ * statistics, as it will require matching clauses (not columns) to
+ * stats. And it may get quite complex - e.g. what if there are
+ * multiple clauses, each compatible with different stats subset?
+ *
+ *
+ * Selectivity estimation
+ * ----------------------
+ * The estimation, implemented in clauselist_mv_selectivity_mcvlist(),
+ * is quite simple in principle - walk through the MCV items and sum
+ * frequencies of all the items that match all the clauses.
+ *
+ * The current implementation uses MCV lists to estimate these types
+ * of clauses (think of WHERE conditions):
+ *
+ * (a) equality clauses WHERE (a = 1) AND (b = 2)
+ *
+ * (b) inequality clauses WHERE (a < 1) AND (b >= 2)
+ *
+ * It's possible to add more clauses, for example:
+ *
+ * (a) NULL clauses WHERE (a IS NULL) AND (b IS NOT NULL)
+ *
+ * (b) multi-var clauses WHERE (a > b)
+ *
+ * and so on. These are tasks for the future, not yet implemented.
+ *
+ *
+ * Estimating equality clauses
+ * ---------------------------
+ * When computing selectivity estimate for equality clauses
+ *
+ * (a = 1) AND (b = 2)
+ *
+ * we can do this estimate pretty exactly assuming that two conditions
+ * are met:
+ *
+ * (1) there's an equality condition on each attribute
+ *
+ * (2) we find a matching item in the MCV list
+ *
+ * In that case we know the MCV item represents all the tuples matching
+ * the clauses, and the selectivity estimate is complete. This is what
+ * we call 'full match'.
+ *
+ * When only (1) holds, but there's no matching MCV item, we don't
+ * know whether there are no such rows or they are just not frequent
+ * enough. We can however use the frequency of the least frequent
+ * MCV item as an upper bound for the selectivity.
+ *
+ * If the equality conditions match only a subset of the attributes
+ * the MCV list is built on, we can't get a full match - we may get
+ * multiple MCV items matching the clauses, and even a single match
+ * does not rule out rows that never made it into the MCV list. But
+ * in this case we can still use the frequency of the least frequent
+ * MCV item to clamp the 'additional' selectivity not accounted for
+ * by the matching items.
+ *
+ * If there's no histogram, because the MCV list approximates the
+ * distribution accurately (not because the histogram was disabled),
+ * it does not really matter whether there are equality conditions on
+ * all the columns - we can do pretty accurate estimation using the MCV.
+ *
+ * TODO For a combination of equality conditions (not full-match case)
+ * we probably can clamp the selectivity by the minimum of
+ * selectivities for each condition. For example if we know the
+ * number of distinct values for each column, we can use 1/ndistinct
+ * as a per-column estimate. Or rather 1/ndistinct + selectivity
+ * derived from the MCV list.
+ *
+ * If we know an estimate of the number of distinct combinations of
+ * the columns (i.e. ndistinct(A,B)), we may estimate the average
+ * frequency of the items in the part not covered by the MCV list
+ * (say the remaining 10%) as [10% / ndistinct(A,B)].
+ *
+ *
+ * Bounding estimates
+ * ------------------
+ * In general the MCV list may not provide estimates as accurate as
+ * in the full-match equality case, but it may still provide useful
+ * lower/upper bounds limiting the estimation error.
+ *
+ * With equality clauses we can do a few more tricks to narrow this
+ * error range (see the previous section and TODO), but with inequality
+ * clauses (or generally non-equality clauses), it's rather difficult.
+ * There's nothing like a 'full match' - we have to consider both the
+ * MCV items and the remaining part every time. We can't use the minimum
+ * selectivity of MCV items, as the clauses may match multiple items.
+ *
+ * For example with a MCV list on columns (A, B) covering 90% of the
+ * table (a fraction computed while building the MCV list), ~10% of
+ * the table is not represented by the MCV list. So even if the
+ * conditions match all the remaining rows (not represented by the
+ * MCV items), we can't get a selectivity higher than those 10%. We
+ * may use 1/2 of the remaining selectivity as an estimate
+ * (minimizing the average error).
+ *
+ * TODO Most of these ideas (error limiting) are not yet implemented.
+ *
+ *
+ * General TODO
+ * ------------
+ *
+ * FIXME Use max_mcv_items from ALTER TABLE ADD STATISTICS command.
+ *
+ * TODO Add support for IS [NOT] NULL clauses, and clauses referencing
+ * multiple columns (a < b).
+ *
+ * TODO It's possible to build a special case of MCV list, storing not
+ * the actual values but only 32/64-bit hash. This is only useful
+ * for estimating equality clauses and for large varlena types,
+ * which are very impractical for plain MCV list because of size.
+ * But for those data types we really want just the equality
+ * clauses, so it's actually a good solution.
+ *
+ * TODO Currently there's no logic to consider building only a MCV
+ * list (and not building the histogram at all), except for making
+ * this decision manually in ADD STATISTICS.
+ */
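+
+/*
+ * A worked example of the estimation logic described above, with
+ * made-up numbers (illustration only):
+ *
+ * MCV list on (a,b): (1,1) -> 0.20, (1,2) -> 0.10, (2,2) -> 0.05
+ *
+ * WHERE (a = 1) AND (b = 2) ... full match, estimate 0.10
+ *
+ * WHERE (a = 3) AND (b = 3) ... no matching item, but the estimate
+ * can't exceed 0.05 (the least frequent item), so we may use
+ * that as an upper bound
+ *
+ * WHERE (a = 1) ... partial match, the matching items sum to
+ * 0.20 + 0.10 = 0.30, plus possibly some rows not represented
+ * in the MCV list at all
+ */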
+
+/*
+ * Each serialized item needs to store (in this order):
+ *
+ * - indexes (ndim * sizeof(int32))
+ * - null flags (ndim * sizeof(bool))
+ * - frequency (sizeof(double))
+ *
+ * So in total:
+ *
+ * ndim * (sizeof(int32) + sizeof(bool)) + sizeof(double)
+ */
+#define ITEM_SIZE(ndims) \
+ ((ndims) * (sizeof(int32) + sizeof(bool)) + sizeof(double))
+
+/* pointers into a flat serialized item of ITEM_SIZE(n) bytes */
+#define ITEM_INDEXES(item) ((int32*)(item))
+#define ITEM_NULLS(item,ndims) ((bool*)(ITEM_INDEXES(item) + (ndims)))
+#define ITEM_FREQUENCY(item,ndims) ((double*)(ITEM_NULLS(item,ndims) + (ndims)))
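+
+/*
+ * For example, with ndims = 2 the serialized item layout is
+ *
+ * [int32 idx0][int32 idx1][bool null0][bool null1][double freq]
+ *
+ * i.e. ITEM_SIZE(2) = 2 * (4 + 1) + 8 = 18 bytes (assuming the usual
+ * 4-byte int32, 1-byte bool and 8-byte double).
+ */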
+
+/*
+ * Builds a MCV list from the sample rows, and removes the rows
+ * represented by the MCV list from the sample (the number of
+ * remaining sample rows is returned through the numrows_filtered
+ * parameter).
+ *
+ * The method is quite simple - in short, it consists of these steps:
+ *
+ * (1) sort the data (default collation, '<' for the data type)
+ *
+ * (2) count distinct groups, decide how many to keep
+ *
+ * (3) build the MCV list using the threshold determined in (2)
+ *
+ * (4) remove rows represented by the MCV from the sample
+ *
+ * For more details, see the comments in the code.
+ *
+ * FIXME Single-dimensional MCV is sorted by frequency (descending). We
+ * should do that too, because when walking through the list we
+ * want to check the most frequent items first.
+ *
+ * TODO We're using the full 8B Datum even for narrower data types
+ * (e.g. int4 or float4). Maybe we could save some space here,
+ * but the bytea compression should handle it just fine.
+ *
+ * TODO This probably should not use the ndistinct computed from the
+ * sample directly, but rather an estimate of the number of
+ * distinct values in the whole table, no?
+ */
+MCVList
+build_mv_mcvlist(int numrows, HeapTuple *rows, int2vector *attrs,
+ VacAttrStats **stats, int *numrows_filtered)
+{
+ int i, j;
+ int numattrs = attrs->dim1;
+ int ndistinct = 0;
+ int mcv_threshold = 0;
+ int count = 0;
+ int nitems = 0;
+
+ MCVList mcvlist = NULL;
+
+ /* Sort by multiple columns (using array of SortSupport) */
+ MultiSortSupport mss = multi_sort_init(numattrs);
+
+ /*
+ * Preallocate space for all the items as a single chunk, and point
+ * the items to the appropriate parts of the array.
+ */
+ SortItem *items = (SortItem*)palloc0(numrows * sizeof(SortItem));
+ Datum *values = (Datum*)palloc0(sizeof(Datum) * numrows * numattrs);
+ bool *isnull = (bool*)palloc0(sizeof(bool) * numrows * numattrs);
+
+ /* keep all the rows by default (as if there was no MCV list) */
+ *numrows_filtered = numrows;
+
+ for (i = 0; i < numrows; i++)
+ {
+ items[i].values = &values[i * numattrs];
+ items[i].isnull = &isnull[i * numattrs];
+ }
+
+ /* load the values/null flags from sample rows */
+ for (j = 0; j < numrows; j++)
+ for (i = 0; i < numattrs; i++)
+ items[j].values[i] = heap_getattr(rows[j], attrs->values[i],
+ stats[i]->tupDesc, &items[j].isnull[i]);
+
+ /* prepare the sort functions for all the attributes */
+ for (i = 0; i < numattrs; i++)
+ multi_sort_add_dimension(mss, i, i, stats);
+
+ /* do the sort, using the multi-sort */
+ qsort_arg((void *) items, numrows, sizeof(SortItem),
+ multi_sort_compare, mss);
+
+ /*
+ * Count the number of distinct groups - just walk through the
+ * sorted list and count the number of key changes. We use this to
+ * determine the threshold (125% of the average frequency).
+ */
+ ndistinct = 1;
+ for (i = 1; i < numrows; i++)
+ if (multi_sort_compare(&items[i], &items[i-1], mss) != 0)
+ ndistinct += 1;
+
+ /*
+ * Determine how many groups actually exceed the threshold, and then
+ * walk the array again and collect them into an array. We'll always
+ * require at least 4 rows per group.
+ *
+ * But if we can fit all the distinct values in the MCV list (i.e.
+ * if there are fewer distinct groups than MVSTAT_MCVLIST_MAX_ITEMS),
+ * we'll require only 2 rows per group.
+ *
+ * TODO For now the threshold is the same as in the single-column
+ * case (average + 25%), but maybe that's worth revisiting
+ * for the multivariate case.
+ *
+ * TODO We can do this only if we believe we got all the distinct
+ * values of the table.
+ *
+ * FIXME This should really reference mcv_max_items (from catalog)
+ * instead of the constant MVSTAT_MCVLIST_MAX_ITEMS.
+ */
+ mcv_threshold = 1.25 * numrows / ndistinct;
+ mcv_threshold = (mcv_threshold < 4) ? 4 : mcv_threshold;
+
+ if (ndistinct <= MVSTAT_MCVLIST_MAX_ITEMS)
+ mcv_threshold = 2;
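+
+ /*
+ * Example with made-up numbers: for numrows = 30000 and
+ * ndistinct = 10000, the average group has 3 rows, giving a
+ * threshold of 1.25 * 3 = 3.75, clamped up to the minimum of 4.
+ * The relaxed threshold of 2 does not apply, because 10000
+ * groups would not fit into MVSTAT_MCVLIST_MAX_ITEMS (8192).
+ */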
+
+ /*
+ * Walk through the sorted data again, and see how many groups
+ * reach the mcv_threshold (and become an item in the MCV list).
+ */
+ count = 1;
+ for (i = 1; i <= numrows; i++)
+ {
+ /* last row or new group, so check if we exceed mcv_threshold */
+ if ((i == numrows) || (multi_sort_compare(&items[i], &items[i-1], mss) != 0))
+ {
+ /* group hits the threshold, count the group as MCV item */
+ if (count >= mcv_threshold)
+ nitems += 1;
+
+ count = 1;
+ }
+ else /* within group, so increase the number of items */
+ count += 1;
+ }
+
+ /* we know the number of MCV list items, so let's build the list */
+ if (nitems > 0)
+ {
+ /* allocate the MCV list structure, set parameters we know */
+ mcvlist = (MCVList)palloc0(sizeof(MCVListData));
+
+ mcvlist->magic = MVSTAT_MCV_MAGIC;
+ mcvlist->type = MVSTAT_MCV_TYPE_BASIC;
+ mcvlist->ndimensions = numattrs;
+ mcvlist->nitems = nitems;
+
+ /*
+ * Preallocate the Datum/isnull arrays (not as a single chunk, as
+ * we'll pass this outside this method and thus it needs to be
+ * easy to pfree() the data - we wouldn't know where the arrays
+ * start).
+ *
+ * TODO Maybe the reasoning that we can't allocate a single
+ * piece because we're passing it out is bogus? Who'd
+ * free a single item of the MCV list, anyway?
+ *
+ * TODO Maybe with a proper encoding (stuffing all the values
+ * into a list-level array), this will no longer be true?
+ */
+ mcvlist->items = (MCVItem*)palloc0(sizeof(MCVItem)*nitems);
+
+ for (i = 0; i < nitems; i++)
+ {
+ mcvlist->items[i] = (MCVItem)palloc0(sizeof(MCVItemData));
+ mcvlist->items[i]->values = (Datum*)palloc0(sizeof(Datum)*numattrs);
+ mcvlist->items[i]->isnull = (bool*)palloc0(sizeof(bool)*numattrs);
+ }
+
+ /*
+ * Repeat the same loop as above, but this time copy the data
+ * into the MCV list (for items exceeding the threshold).
+ *
+ * TODO Maybe we could simply remember indexes of the last item
+ * in each group (from the previous loop)?
+ */
+ count = 1;
+ nitems = 0;
+ for (i = 1; i <= numrows; i++)
+ {
+ /* last row or a new group */
+ if ((i == numrows) || (multi_sort_compare(&items[i], &items[i-1], mss) != 0))
+ {
+ /* count the MCV item if exceeding the threshold (and copy into the array) */
+ if (count >= mcv_threshold)
+ {
+ /* just pointer to the proper place in the list */
+ MCVItem item = mcvlist->items[nitems];
+
+ /* copy values from the _previous_ group (i.e. its last item) */
+ memcpy(item->values, items[(i-1)].values, sizeof(Datum) * numattrs);
+ memcpy(item->isnull, items[(i-1)].isnull, sizeof(bool) * numattrs);
+
+ /* and finally the group frequency */
+ item->frequency = (double)count / numrows;
+
+ /* next item */
+ nitems += 1;
+ }
+
+ count = 1;
+ }
+ else /* same group, just increase the number of items */
+ count += 1;
+ }
+
+ /* make sure the loops are consistent */
+ Assert(nitems == mcvlist->nitems);
+
+ /*
+ * Remove the rows matching the MCV list (i.e. keep only rows
+ * that are not represented by the MCV list).
+ *
+ * FIXME This implementation is rather naive, effectively O(N^2).
+ * As the MCV list grows, the check will take longer and
+ * longer. And as the number of sampled rows increases (by
+ * increasing statistics target), it will take longer and
+ * longer. One option is to sort the MCV items first and
+ * then perform a binary search.
+ *
+ * A better option would be keeping the ID of the row in
+ * the sort item, and then just walk through the items and
+ * mark rows to remove (in a bitmap of the same size).
+ * There's not space for that in SortItem at this moment,
+ * but it's trivial to add 'private' pointer, or just
+ * using another structure with extra field (starting with
+ * SortItem, so that the comparators etc. still work).
+ *
+ * Another option is to use the sorted array of items
+ * (because that's how we sorted the source data), and
+ * simply do a bsearch() into it. If we find a matching
+ * item, the row belongs to the MCV list.
+ */
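+
+ /*
+ * For illustration, the bsearch option suggested above might look
+ * roughly like this (a sketch only, not what the code below does):
+ * sort the MCV items using multi_sort_compare(), then for each
+ * sample row do
+ *
+ * bsearch(&item, mcv_items, nitems, sizeof(SortItem), cmp)
+ *
+ * passing the comparator context through a static variable, the
+ * same trick serialize_mv_mcvlist() uses below. That would turn
+ * the O(numrows * nitems) scan into O(numrows * log(nitems)).
+ */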
+ if (nitems == ndistinct) /* all rows are covered by MCV items */
+ *numrows_filtered = 0;
+ else /* (nitems < ndistinct) && (nitems > 0) */
+ {
+ int nfiltered = 0;
+ HeapTuple *rows_filtered = (HeapTuple*)palloc0(sizeof(HeapTuple) * numrows);
+
+ /* used for the searches */
+ SortItem item, mcvitem;
+
+ item.values = (Datum*)palloc0(numattrs * sizeof(Datum));
+ item.isnull = (bool*)palloc0(numattrs * sizeof(bool));
+
+ /*
+ * FIXME we don't need to allocate this, we can reference
+ * the MCV item directly ...
+ */
+ mcvitem.values = (Datum*)palloc0(numattrs * sizeof(Datum));
+ mcvitem.isnull = (bool*)palloc0(numattrs * sizeof(bool));
+
+ /* walk through the tuples, compare the values to MCV items */
+ for (i = 0; i < numrows; i++)
+ {
+ bool match = false;
+
+ /* collect the key values from the row */
+ for (j = 0; j < numattrs; j++)
+ item.values[j] = heap_getattr(rows[i], attrs->values[j],
+ stats[j]->tupDesc, &item.isnull[j]);
+
+ /* scan through the MCV list for matches */
+ for (j = 0; j < mcvlist->nitems; j++)
+ {
+ /*
+ * TODO Create a SortItem/MCVItem comparator so that
+ * we don't need to do memcpy() like crazy.
+ */
+ memcpy(mcvitem.values, mcvlist->items[j]->values,
+ numattrs * sizeof(Datum));
+ memcpy(mcvitem.isnull, mcvlist->items[j]->isnull,
+ numattrs * sizeof(bool));
+
+ if (multi_sort_compare(&item, &mcvitem, mss) == 0)
+ {
+ match = true;
+ break;
+ }
+ }
+
+ /* if no match in the MCV list, copy the row into the filtered ones */
+ if (! match)
+ memcpy(&rows_filtered[nfiltered++], &rows[i], sizeof(HeapTuple));
+ }
+
+ /* replace the rows and remember how many rows we kept */
+ memcpy(rows, rows_filtered, sizeof(HeapTuple) * nfiltered);
+ *numrows_filtered = nfiltered;
+
+ /* free all the data used here */
+ pfree(rows_filtered);
+ pfree(item.values);
+ pfree(item.isnull);
+ pfree(mcvitem.values);
+ pfree(mcvitem.isnull);
+ }
+ }
+
+ pfree(values);
+ pfree(items);
+ pfree(isnull);
+
+ return mcvlist;
+}
+
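+/*
+ * A minimal sketch of the expected caller (the variable names are
+ * illustrative, not taken from the actual ANALYZE code):
+ *
+ * int numrows_filtered;
+ * MCVList mcvlist;
+ *
+ * mcvlist = build_mv_mcvlist(numrows, rows, attrs, stats,
+ * &numrows_filtered);
+ *
+ * On return 'rows' contains only the numrows_filtered rows that are
+ * not represented by the MCV list, so the remaining statistics (e.g.
+ * a histogram) can be built on just those rows.
+ */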
+
+/* fetch the MCV list (as a bytea) from the pg_mv_statistic catalog */
+bytea *
+fetch_mv_mcvlist(Oid mvoid)
+{
+ Relation indrel;
+ SysScanDesc indscan;
+ ScanKeyData skey;
+ HeapTuple htup;
+ bytea *mcvlist = NULL;
+
+ /* Prepare to scan pg_mv_statistic for the entry with this OID. */
+ ScanKeyInit(&skey,
+ ObjectIdAttributeNumber,
+ BTEqualStrategyNumber, F_OIDEQ,
+ ObjectIdGetDatum(mvoid));
+
+ indrel = heap_open(MvStatisticRelationId, AccessShareLock);
+ indscan = systable_beginscan(indrel, MvStatisticOidIndexId, true,
+ NULL, 1, &skey);
+
+ while (HeapTupleIsValid(htup = systable_getnext(indscan)))
+ {
+ bool isnull = false;
+ Datum tmp = SysCacheGetAttr(MVSTATOID, htup,
+ Anum_pg_mv_statistic_stamcv, &isnull);
+
+ Assert(!isnull);
+
+ mcvlist = DatumGetByteaP(tmp);
+
+ break;
+ }
+
+ systable_endscan(indscan);
+
+ heap_close(indrel, AccessShareLock);
+
+ /* TODO Maybe save the list into relcache, as in RelationGetIndexList
+ * (which served as an inspiration for this function)? */
+
+ return mcvlist;
+}
+
+/* print some basic info about the MCV list
+ *
+ * TODO Add info about what part of the table this covers.
+ */
+Datum
+pg_mv_stats_mcvlist_info(PG_FUNCTION_ARGS)
+{
+ bytea *data = PG_GETARG_BYTEA_P(0);
+ char *result;
+
+ MCVList mcvlist = deserialize_mv_mcvlist(data);
+
+ result = palloc0(128);
+ snprintf(result, 128, "nitems=%d", mcvlist->nitems);
+
+ pfree(mcvlist);
+
+ PG_RETURN_TEXT_P(cstring_to_text(result));
+}
+
+/* used to pass context into bsearch() */
+static SortSupport ssup_private = NULL;
+
+static int bsearch_comparator(const void * a, const void * b);
+
+/*
+ * Serialize MCV list into a bytea value. The basic algorithm is simple:
+ *
+ * (1) perform deduplication for each attribute (separately)
+ * (a) collect all (non-NULL) attribute values from all MCV items
+ * (b) sort the data (using 'lt' from VacAttrStats)
+ * (c) remove duplicate values from the array
+ *
+ * (2) serialize the arrays into a bytea value
+ *
+ * (3) process all MCV list items
+ * (a) replace values with indexes into the arrays
+ *
+ * Each attribute has to be processed separately, because we're mixing
+ * different datatypes, and we don't know what equality means for them.
+ * We're also mixing pass-by-value and pass-by-ref types, and so on.
+ *
+ * We'll use 32-bit values for the indexes in step (3), although we
+ * could probably use just 16 bits as we don't allow more than 8k
+ * items in the MCV list (max_mcv_items) - and we might increase
+ * that to 32k and still fit into a signed 16-bit integer. But let's
+ * be lazy and rely on the varlena compression to kick in; most of
+ * the bytes will be 0x00, so it should compress nicely.
+ *
+ * FIXME This probably leaks memory, or at least uses it inefficiently
+ * (many small palloc() calls instead of a large one).
+ *
+ * TODO Consider using 16-bit values for the indexes in step (3).
+ *
+ * TODO Consider packing boolean flags (NULL) for each item into 'char'
+ * or a longer type (instead of using an array of bool items).
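+ *
+ * As an example of the encoding (made-up values): for the MCV items
+ *
+ * (1, 'x'), (1, 'y'), (2, 'x')
+ *
+ * the deduplicated per-dimension arrays are {1, 2} and {'x', 'y'},
+ * and the three items are then stored as pairs of indexes into
+ * those arrays:
+ *
+ * (0, 0), (0, 1), (1, 0)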
+ */
+bytea *
+serialize_mv_mcvlist(MCVList mcvlist, int2vector *attrs,
+ VacAttrStats **stats)
+{
+ int i, j;
+ int ndims = mcvlist->ndimensions;
+ int itemsize = ITEM_SIZE(ndims);
+
+ Size total_length = 0;
+
+ char *item = palloc0(itemsize);
+
+ /* serialized items (indexes into arrays, etc.) */
+ bytea *output;
+ char *data = NULL;
+
+ /* values per dimension (and number of non-NULL values) */
+ Datum **values = (Datum**)palloc0(sizeof(Datum*) * ndims);
+ int *counts = (int*)palloc0(sizeof(int) * ndims);
+
+ /* info about dimensions (for deserialize) */
+ DimensionInfo * info
+ = (DimensionInfo *)palloc0(sizeof(DimensionInfo)*ndims);
+
+ /* sort support data */
+ SortSupport ssup = (SortSupport)palloc0(sizeof(SortSupportData)*ndims);
+
+ /* collect and deduplicate values for each dimension */
+ for (i = 0; i < ndims; i++)
+ {
+ int count;
+ StdAnalyzeData *tmp = (StdAnalyzeData *)stats[i]->extra_data;
+
+ /* keep important info about the data type */
+ info[i].typlen = stats[i]->attrtype->typlen;
+ info[i].typbyval = stats[i]->attrtype->typbyval;
+
+ /* allocate space for all values, including NULLs (won't use them) */
+ values[i] = (Datum*)palloc0(sizeof(Datum) * mcvlist->nitems);
+
+ for (j = 0; j < mcvlist->nitems; j++)
+ {
+ if (! mcvlist->items[j]->isnull[i]) /* skip NULL values */
+ {
+ values[i][counts[i]] = mcvlist->items[j]->values[i];
+ counts[i] += 1;
+ }
+ }
+
+ /* there are just NULL values in this dimension */
+ if (counts[i] == 0)
+ continue;
+
+ /* sort and deduplicate */
+ ssup[i].ssup_cxt = CurrentMemoryContext;
+ ssup[i].ssup_collation = DEFAULT_COLLATION_OID;
+ ssup[i].ssup_nulls_first = false;
+
+ PrepareSortSupportFromOrderingOp(tmp->ltopr, &ssup[i]);
+
+ qsort_arg(values[i], counts[i], sizeof(Datum),
+ compare_scalars_simple, &ssup[i]);
+
+ /*
+ * Walk through the array and eliminate duplicate values, but
+ * keep the ordering (so that we can do bsearch later). We know
+ * there's at least 1 item, so we can skip the first element.
+ */
+ count = 1; /* number of deduplicated items */
+ for (j = 1; j < counts[i]; j++)
+ {
+ /* if it's different from the previous value, we need to keep it */
+ if (compare_datums_simple(values[i][j-1], values[i][j], &ssup[i]) != 0)
+ {
+ /* XXX: not needed if (count == j) */
+ values[i][count] = values[i][j];
+ count += 1;
+ }
+ }
+
+ /* keep info about the deduplicated count */
+ info[i].nvalues = count;
+
+ /* compute size of the serialized data */
+ if (info[i].typbyval)
+ /*
+ * passed by value, so just Datum array (int4, int8, ...)
+ *
+ * TODO Might save a few bytes here, by storing just typlen
+ * bytes instead of the whole Datum (8B on 64-bit builds).
+ */
+ info[i].nbytes = info[i].nvalues * sizeof(Datum);
+ else if (info[i].typlen > 0)
+ /* passed by reference, but fixed length (name, tid, ...) */
+ info[i].nbytes = info[i].nvalues * info[i].typlen;
+ else if (info[i].typlen == -1)
+ /* varlena, so just use VARSIZE_ANY */
+ for (j = 0; j < info[i].nvalues; j++)
+ info[i].nbytes += VARSIZE_ANY(values[i][j]);
+ else if (info[i].typlen == -2)
+ /* cstring, so strlen + 1 byte for the \0 terminator */
+ for (j = 0; j < info[i].nvalues; j++)
+ info[i].nbytes += strlen(DatumGetPointer(values[i][j])) + 1;
+ else
+ elog(ERROR, "unknown data type typbyval=%d typlen=%d",
+ info[i].typbyval, info[i].typlen);
+ }
+
+ /*
+ * Now we finally know how much space we'll need for the serialized
+ * MCV list, as it contains these fields:
+ *
+ * - length (4B) for varlena
+ * - magic (4B)
+ * - type (4B)
+ * - ndimensions (4B)
+ * - nitems (4B)
+ * - info (ndim * sizeof(DimensionInfo)
+ * - arrays of values for each dimension
+ * - serialized items (nitems * itemsize)
+ *
+ * So the 'header' size is 20B + ndim * sizeof(DimensionInfo) and
+ * then we'll place the data.
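+ *
+ * For example, with ndims = 3 and nitems = 100 this works out as
+ * 20 + 3 * sizeof(DimensionInfo) + 100 * ITEM_SIZE(3) bytes, plus
+ * whatever the three deduplicated value arrays need.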
+ */
+ total_length = (sizeof(int32) + offsetof(MCVListData, items)
+ + ndims * sizeof(DimensionInfo)
+ + mcvlist->nitems * itemsize);
+
+ for (i = 0; i < ndims; i++)
+ total_length += info[i].nbytes;
+
+ /* enforce arbitrary limit of 1MB */
+ if (total_length > 1024 * 1024)
+ elog(ERROR, "serialized MCV exceeds 1MB (%ld)", total_length);
+
+ /* allocate space for the serialized MCV list, set header fields */
+ output = (bytea*)palloc0(total_length);
+ SET_VARSIZE(output, total_length);
+
+ /* we'll use 'data' to keep track of the position to write to */
+ data = VARDATA(output);
+
+ memcpy(data, mcvlist, offsetof(MCVListData, items));
+ data += offsetof(MCVListData, items);
+
+ memcpy(data, info, sizeof(DimensionInfo) * ndims);
+ data += sizeof(DimensionInfo) * ndims;
+
+ /* value array for each dimension */
+ for (i = 0; i < ndims; i++)
+ {
+#ifdef USE_ASSERT_CHECKING
+ char *tmp = data;
+#endif
+ for (j = 0; j < info[i].nvalues; j++)
+ {
+ if (info[i].typbyval)
+ {
+ /* passed by value / Datum */
+ memcpy(data, &values[i][j], sizeof(Datum));
+ data += sizeof(Datum);
+ }
+ else if (info[i].typlen > 0)
+ {
+ /* passed by reference, but fixed length (name, tid, ...) */
+ memcpy(data, &values[i][j], info[i].typlen);
+ data += info[i].typlen;
+ }
+ else if (info[i].typlen == -1)
+ {
+ /* varlena */
+ memcpy(data, DatumGetPointer(values[i][j]),
+ VARSIZE_ANY(values[i][j]));
+ data += VARSIZE_ANY(values[i][j]);
+ }
+ else if (info[i].typlen == -2)
+ {
+ /* cstring (don't forget the \0 terminator!) */
+ memcpy(data, DatumGetPointer(values[i][j]),
+ strlen(DatumGetPointer(values[i][j])) + 1);
+ data += strlen(DatumGetPointer(values[i][j])) + 1;
+ }
+ }
+ Assert((data - tmp) == info[i].nbytes);
+ }
+
+ /* and finally, the MCV items */
+ for (i = 0; i < mcvlist->nitems; i++)
+ {
+ /* don't write beyond the allocated space */
+ Assert(data <= (char*)output + total_length - itemsize);
+
+ /* reset the values for each item */
+ memset(item, 0, itemsize);
+
+ for (j = 0; j < ndims; j++)
+ {
+ /* do the lookup only for non-NULL values */
+ if (! mcvlist->items[i]->isnull[j])
+ {
+ Datum * v = NULL;
+ ssup_private = &ssup[j];
+
+ v = (Datum*)bsearch(&mcvlist->items[i]->values[j],
+ values[j], info[j].nvalues, sizeof(Datum),
+ bsearch_comparator);
+
+ if (v == NULL)
+ elog(ERROR, "value for dim %d not found in array", j);
+
+ /* compute index within the array */
+ ITEM_INDEXES(item)[j] = (v - values[j]);
+
+ /* check the index is within expected bounds */
+ Assert(ITEM_INDEXES(item)[j] >= 0);
+ Assert(ITEM_INDEXES(item)[j] < info[j].nvalues);
+ }
+ }
+
+ /* copy NULL and frequency flags into the item */
+ memcpy(ITEM_NULLS(item, ndims),
+ mcvlist->items[i]->isnull, sizeof(bool) * ndims);
+ memcpy(ITEM_FREQUENCY(item, ndims),
+ &mcvlist->items[i]->frequency, sizeof(double));
+
+ /* copy the item into the array */
+ memcpy(data, item, itemsize);
+
+ data += itemsize;
+ }
+
+ /* at this point we expect to match the total_length exactly */
+ Assert((data - (char*)output) == total_length);
+
+ return output;
+}
+
+/* inverse to serialize_mv_mcvlist() - see the comment there */
+MCVList
+deserialize_mv_mcvlist(bytea *data)
+{
+ int i, j;
+ Size expected_size;
+ MCVList mcvlist;
+ char *tmp;
+
+ int ndims, nitems, itemsize;
+ DimensionInfo *info = NULL;
+
+ int32 *indexes = NULL;
+ Datum **values = NULL;
+
+ if (data == NULL)
+ return NULL;
+
+ if (VARSIZE_ANY_EXHDR(data) < offsetof(MCVListData,items))
+ elog(ERROR, "invalid MCV Size %ld (expected at least %ld)",
+ VARSIZE_ANY_EXHDR(data), offsetof(MCVListData,items));
+
+ /* read the MCV list header */
+ mcvlist = (MCVList)palloc0(sizeof(MCVListData));
+
+ /* initialize pointer to the data part (skip the varlena header) */
+ tmp = VARDATA(data);
+
+ /* get the header and perform basic sanity checks */
+ memcpy(mcvlist, tmp, offsetof(MCVListData,items));
+ tmp += offsetof(MCVListData,items);
+
+ if (mcvlist->magic != MVSTAT_MCV_MAGIC)
+ elog(ERROR, "invalid MCV magic %d (expected %dd)",
+ mcvlist->magic, MVSTAT_MCV_MAGIC);
+
+ if (mcvlist->type != MVSTAT_MCV_TYPE_BASIC)
+ elog(ERROR, "invalid MCV type %d (expected %dd)",
+ mcvlist->type, MVSTAT_MCV_TYPE_BASIC);
+
+ nitems = mcvlist->nitems;
+ ndims = mcvlist->ndimensions;
+ itemsize = ITEM_SIZE(ndims);
+
+ Assert(nitems > 0);
+ Assert((ndims >= 2) && (ndims <= MVSTATS_MAX_DIMENSIONS));
+
+ /*
+ * Compute the size we expect with these parameters (it's still
+ * incomplete, as we have yet to add the sizes of the value arrays,
+ * from the DimensionInfo records).
+ */
+ expected_size = offsetof(MCVListData,items) +
+ ndims * sizeof(DimensionInfo) +
+ (nitems * itemsize);
+
+ /* check that we have at least the DimensionInfo records */
+ if (VARSIZE_ANY_EXHDR(data) < expected_size)
+ elog(ERROR, "invalid MCV Size %ld (expected %ld)",
+ VARSIZE_ANY_EXHDR(data), expected_size);
+
+ info = (DimensionInfo*)(tmp);
+ tmp += ndims * sizeof(DimensionInfo);
+
+ /* account for the value arrays */
+ for (i = 0; i < ndims; i++)
+ expected_size += info[i].nbytes;
+
+ if (VARSIZE_ANY_EXHDR(data) != expected_size)
+ elog(ERROR, "invalid MCV Size %ld (expected %ld)",
+ VARSIZE_ANY_EXHDR(data), expected_size);
+
+ /* looks OK - the data does not seem corrupted */
+
+ /* let's parse the value arrays */
+ values = (Datum**)palloc0(sizeof(Datum*) * ndims);
+
+ /*
+ * FIXME This uses pointers to the original data array (the types
+ * not passed by value), so when someone frees the memory,
+ * e.g. by doing something like this:
+ *
+ * bytea * data = ... fetch the data from catalog ...
+ * MCVList mcvlist = deserialize_mcv_list(data);
+ * pfree(data);
+ *
+ * then 'mcvlist' references the freed memory. This needs to
+ * copy the pieces.
+ */
+ for (i = 0; i < ndims; i++)
+ {
+ if (info[i].typbyval)
+ {
+ /* passed by value / Datum - simply reuse the array */
+ values[i] = (Datum*)tmp;
+ tmp += info[i].nbytes;
+ }
+ else if (info[i].typlen > 0)
+ {
+ /* passed by reference, but fixed length (name, tid, ...) */
+ values[i] = (Datum*)palloc0(sizeof(Datum) * info[i].nvalues);
+ for (j = 0; j < info[i].nvalues; j++)
+ {
+ /* just point into the array */
+ values[i][j] = PointerGetDatum(tmp);
+ tmp += info[i].typlen;
+ }
+ }
+ else if (info[i].typlen == -1)
+ {
+ /* varlena */
+ values[i] = (Datum*)palloc0(sizeof(Datum) * info[i].nvalues);
+ for (j = 0; j < info[i].nvalues; j++)
+ {
+ /* just point into the array */
+ values[i][j] = PointerGetDatum(tmp);
+ tmp += VARSIZE_ANY(tmp);
+ }
+ }
+ else if (info[i].typlen == -2)
+ {
+ /* cstring */
+ values[i] = (Datum*)palloc0(sizeof(Datum) * info[i].nvalues);
+ for (j = 0; j < info[i].nvalues; j++)
+ {
+ /* just point into the array */
+ values[i][j] = PointerGetDatum(tmp);
+ tmp += (strlen(tmp) + 1); /* don't forget the \0 */
+ }
+ }
+ }
+
+ /* allocate space for the MCV items */
+ mcvlist->items = (MCVItem*)palloc0(sizeof(MCVItem) * nitems);
+
+ for (i = 0; i < nitems; i++)
+ {
+ MCVItem item = (MCVItem)palloc0(sizeof(MCVItemData));
+
+ item->values = (Datum*)palloc0(sizeof(Datum)*ndims);
+ item->isnull = (bool*) palloc0(sizeof(bool) *ndims);
+
+ /* just point to the right place */
+ indexes = ITEM_INDEXES(tmp);
+
+ memcpy(item->isnull, ITEM_NULLS(tmp, ndims), sizeof(bool) * ndims);
+ memcpy(&item->frequency, ITEM_FREQUENCY(tmp, ndims), sizeof(double));
+
+ /* translate the values */
+ for (j = 0; j < ndims; j++)
+ if (! item->isnull[j])
+ item->values[j] = values[j][indexes[j]];
+
+ mcvlist->items[i] = item;
+
+ tmp += ITEM_SIZE(ndims);
+
+ Assert(tmp <= (char*)data + VARSIZE_ANY(data));
+ }
+
+ /* check that we processed all the data */
+ Assert(tmp == (char*)data + VARSIZE_ANY(data));
+
+ return mcvlist;
+}
+
+/*
+ * We need to pass the SortSupport to the comparator, but bsearch()
+ * has no 'context' parameter, so we use a global variable (ugly).
+ */
+static int
+bsearch_comparator(const void * a, const void * b)
+{
+ Assert(ssup_private != NULL);
+ return compare_scalars_simple(a, b, (void*)ssup_private);
+}
diff --git a/src/include/catalog/pg_mv_statistic.h b/src/include/catalog/pg_mv_statistic.h
index 81ec23b..c6e7d74 100644
--- a/src/include/catalog/pg_mv_statistic.h
+++ b/src/include/catalog/pg_mv_statistic.h
@@ -35,15 +35,21 @@ CATALOG(pg_mv_statistic,3381)
/* statistics requested to build */
bool deps_enabled; /* analyze dependencies? */
+ bool mcv_enabled; /* build MCV list? */
+
+ /* MCV size */
+ int32 mcv_max_items; /* max MCV items */
/* statistics that are available (if requested) */
bool deps_built; /* dependencies were built */
+ bool mcv_built; /* MCV list was built */
/* variable-length fields start here, but we allow direct access to stakeys */
int2vector stakeys; /* array of column keys */
#ifdef CATALOG_VARLEN
bytea stadeps; /* dependencies (serialized) */
+ bytea stamcv; /* MCV list (serialized) */
#endif
} FormData_pg_mv_statistic;
@@ -59,11 +65,15 @@ typedef FormData_pg_mv_statistic *Form_pg_mv_statistic;
* compiler constants for pg_mv_statistic
* ----------------
*/
-#define Natts_pg_mv_statistic 5
+#define Natts_pg_mv_statistic 9
#define Anum_pg_mv_statistic_starelid 1
#define Anum_pg_mv_statistic_deps_enabled 2
-#define Anum_pg_mv_statistic_deps_built 3
-#define Anum_pg_mv_statistic_stakeys 4
-#define Anum_pg_mv_statistic_stadeps 5
+#define Anum_pg_mv_statistic_mcv_enabled 3
+#define Anum_pg_mv_statistic_mcv_max_items 4
+#define Anum_pg_mv_statistic_deps_built 5
+#define Anum_pg_mv_statistic_mcv_built 6
+#define Anum_pg_mv_statistic_stakeys 7
+#define Anum_pg_mv_statistic_stadeps 8
+#define Anum_pg_mv_statistic_stamcv 9
#endif /* PG_MV_STATISTIC_H */
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
index 2916f11..b2aa815 100644
--- a/src/include/catalog/pg_proc.h
+++ b/src/include/catalog/pg_proc.h
@@ -2716,6 +2716,8 @@ DATA(insert OID = 3377 ( pg_mv_stats_dependencies_info PGNSP PGUID 12 1 0 0
DESCR("multivariate stats: functional dependencies info");
DATA(insert OID = 3378 ( pg_mv_stats_dependencies_show PGNSP PGUID 12 1 0 0 0 f f f f t f i 1 0 25 "17" _null_ _null_ _null_ _null_ pg_mv_stats_dependencies_show _null_ _null_ _null_ ));
DESCR("multivariate stats: functional dependencies show");
+DATA(insert OID = 3376 ( pg_mv_stats_mcvlist_info PGNSP PGUID 12 1 0 0 0 f f f f t f i 1 0 25 "17" _null_ _null_ _null_ _null_ pg_mv_stats_mcvlist_info _null_ _null_ _null_ ));
+DESCR("multi-variate statistics: MCV list info");
DATA(insert OID = 1928 ( pg_stat_get_numscans PGNSP PGUID 12 1 0 0 0 f f f f t f s 1 0 20 "26" _null_ _null_ _null_ _null_ pg_stat_get_numscans _null_ _null_ _null_ ));
DESCR("statistics: number of scans done for table/index");
diff --git a/src/include/utils/mvstats.h b/src/include/utils/mvstats.h
index ec6764b..6ff29d6 100644
--- a/src/include/utils/mvstats.h
+++ b/src/include/utils/mvstats.h
@@ -25,9 +25,11 @@ typedef struct MVStatsData {
/* statistics requested in ALTER TABLE ... ADD STATISTICS */
bool deps_enabled; /* analyze functional dependencies */
+ bool mcv_enabled; /* analyze MCV lists */
/* available statistics (computed by ANALYZE) */
bool deps_built; /* functional dependencies available */
+ bool mcv_built; /* MCV list is already available */
} MVStatsData;
typedef struct MVStatsData *MVStats;
@@ -66,6 +68,47 @@ typedef MVDependenciesData* MVDependencies;
#define MVSTAT_DEPS_TYPE_BASIC 1 /* basic dependencies type */
/*
+ * Multivariate MCV (most-common value) lists
+ *
+ * A straightforward extension of MCV items - i.e. a list (array) of
+ * combinations of attribute values, together with a frequency and
+ * null flags.
+ */
+typedef struct MCVItemData {
+ double frequency; /* frequency of this combination */
+ bool *isnull; /* flags of NULL values (up to 32 columns) */
+ Datum *values; /* variable-length (ndimensions) */
+} MCVItemData;
+
+typedef MCVItemData *MCVItem;
+
+/* multivariate MCV list - essentially an array of MCV items */
+typedef struct MCVListData {
+ uint32 magic; /* magic constant marker */
+ uint32 type; /* type of MCV list (BASIC) */
+ uint32 ndimensions; /* number of dimensions */
+ uint32 nitems; /* number of MCV items in the array */
+ MCVItem *items; /* array of MCV items */
+} MCVListData;
+
+typedef MCVListData *MCVList;
+
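+/*
+ * A consumer is expected to walk a deserialized list roughly like
+ * this (an illustrative sketch, not code from this patch):
+ *
+ * for (i = 0; i < mcvlist->nitems; i++)
+ * {
+ * MCVItem item = mcvlist->items[i];
+ *
+ * if (! item->isnull[dim])
+ * ... compare item->values[dim] to the clause constant ...
+ * }
+ */
+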
+/* used to flag stats serialized to bytea */
+#define MVSTAT_MCV_MAGIC 0xE1A651C2 /* marks serialized bytea */
+#define MVSTAT_MCV_TYPE_BASIC 1 /* basic MCV list type */
+
+/*
+ * Limits for the max_mcv_items option, i.e. we're always guaranteed
+ * to have space for at least MVSTAT_MCVLIST_MIN_ITEMS items, and we
+ * cannot have more than MVSTAT_MCVLIST_MAX_ITEMS items.
+ *
+ * These are just boundaries for the 'max' threshold - the actual list
+ * may of course contain fewer items than MVSTAT_MCVLIST_MIN_ITEMS.
+ */
+#define MVSTAT_MCVLIST_MIN_ITEMS 128 /* min items in MCV list */
+#define MVSTAT_MCVLIST_MAX_ITEMS 8192 /* max items in MCV list */
+
+/*
* TODO Maybe fetching the histogram/MCV list separately is inefficient?
* Consider adding a single `fetch_stats` method, fetching all
* stats specified using flags (or something like that).
@@ -74,24 +117,39 @@ MVStats list_mv_stats(Oid relid, int *nstats, bool built_only);
bytea * fetch_mv_rules(Oid mvoid);
bytea * fetch_mv_dependencies(Oid mvoid);
+bytea * fetch_mv_mcvlist(Oid mvoid);
bytea * serialize_mv_dependencies(MVDependencies dependencies);
+bytea * serialize_mv_mcvlist(MCVList mcvlist, int2vector *attrs,
+ VacAttrStats **stats);
/* deserialization of stats (serialization is private to analyze) */
MVDependencies deserialize_mv_dependencies(bytea * data);
+MCVList deserialize_mv_mcvlist(bytea * data);
+
+/*
+ * Returns index of the attribute number within the vector (i.e. a
+ * dimension within the stats).
+ */
+int mv_get_index(AttrNumber varattno, int2vector * stakeys);
/* FIXME this probably belongs somewhere else (not to operations stats) */
extern Datum pg_mv_stats_dependencies_info(PG_FUNCTION_ARGS);
extern Datum pg_mv_stats_dependencies_show(PG_FUNCTION_ARGS);
+extern Datum pg_mv_stats_mcvlist_info(PG_FUNCTION_ARGS);
MVDependencies
-build_mv_dependencies(int numrows, HeapTuple *rows,
- int2vector *attrs,
- VacAttrStats **stats);
+build_mv_dependencies(int numrows, HeapTuple *rows, int2vector *attrs,
+ VacAttrStats **stats);
+
+MCVList
+build_mv_mcvlist(int numrows, HeapTuple *rows, int2vector *attrs,
+ VacAttrStats **stats, int *numrows_filtered);
void build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
- int natts, VacAttrStats **vacattrstats);
+ int natts, VacAttrStats **vacattrstats);
-void update_mv_stats(Oid relid, MVDependencies dependencies);
+void update_mv_stats(Oid relid, MVDependencies dependencies, MCVList mcvlist,
+ int2vector *attrs, VacAttrStats **stats);
#endif
diff --git a/src/test/regress/expected/mv_mcv.out b/src/test/regress/expected/mv_mcv.out
new file mode 100644
index 0000000..595cfbf
--- /dev/null
+++ b/src/test/regress/expected/mv_mcv.out
@@ -0,0 +1,210 @@
+-- data type passed by value
+CREATE TABLE mcv_list (
+ a INT,
+ b INT,
+ c INT
+);
+-- unknown column
+ALTER TABLE mcv_list ADD STATISTICS (mcv) ON (unknown_column);
+ERROR: column "unknown_column" referenced in statistics does not exist
+-- single column
+ALTER TABLE mcv_list ADD STATISTICS (mcv) ON (a);
+ERROR: multivariate stats require 2 or more columns
+-- single column, duplicated
+ALTER TABLE mcv_list ADD STATISTICS (mcv) ON (a, a);
+ERROR: duplicate column name in statistics definition
+-- two columns, one duplicated
+ALTER TABLE mcv_list ADD STATISTICS (mcv) ON (a, a, b);
+ERROR: duplicate column name in statistics definition
+-- unknown option
+ALTER TABLE mcv_list ADD STATISTICS (unknown_option) ON (a, b, c);
+ERROR: unrecognized STATISTICS option "unknown_option"
+-- missing MCV statistics
+ALTER TABLE mcv_list ADD STATISTICS (dependencies, max_mcv_items 200) ON (a, b, c);
+ERROR: option 'mcv' is required by other option(s)
+-- invalid max_mcv_items value / too low
+ALTER TABLE mcv_list ADD STATISTICS (mcv, max_mcv_items 10) ON (a, b, c);
+ERROR: max number of MCV items must be at least 128
+-- invalid max_mcv_items value / too high
+ALTER TABLE mcv_list ADD STATISTICS (mcv, max_mcv_items 10000) ON (a, b, c);
+ERROR: max number of MCV items is 8192
+-- correct command
+ALTER TABLE mcv_list ADD STATISTICS (mcv) ON (a, b, c);
+-- random data
+INSERT INTO mcv_list
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+ANALYZE mcv_list;
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+ mcv_enabled | mcv_built | pg_mv_stats_mcvlist_info
+-------------+-----------+--------------------------
+ t | f |
+(1 row)
+
+TRUNCATE mcv_list;
+-- a => b, a => c, b => c
+INSERT INTO mcv_list
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE mcv_list;
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+ mcv_enabled | mcv_built | pg_mv_stats_mcvlist_info
+-------------+-----------+--------------------------
+ t | t | nitems=1000
+(1 row)
+
+TRUNCATE mcv_list;
+-- a => b, a => c
+INSERT INTO mcv_list
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE mcv_list;
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+ mcv_enabled | mcv_built | pg_mv_stats_mcvlist_info
+-------------+-----------+--------------------------
+ t | t | nitems=1000
+(1 row)
+
+TRUNCATE mcv_list;
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO mcv_list
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX mcv_idx ON mcv_list (a, b);
+ANALYZE mcv_list;
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+ mcv_enabled | mcv_built | pg_mv_stats_mcvlist_info
+-------------+-----------+--------------------------
+ t | t | nitems=100
+(1 row)
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mcv_list WHERE a = 10 AND b = 5;
+ QUERY PLAN
+--------------------------------------------
+ Bitmap Heap Scan on mcv_list
+ Recheck Cond: ((a = 10) AND (b = 5))
+ -> Bitmap Index Scan on mcv_idx
+ Index Cond: ((a = 10) AND (b = 5))
+(4 rows)
+
+DELETE FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+DROP TABLE mcv_list;
+-- varlena type (text)
+CREATE TABLE mcv_list (
+ a TEXT,
+ b TEXT,
+ c TEXT
+);
+ALTER TABLE mcv_list ADD STATISTICS (mcv) ON (a, b, c);
+-- random data
+INSERT INTO mcv_list
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+ANALYZE mcv_list;
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+ mcv_enabled | mcv_built | pg_mv_stats_mcvlist_info
+-------------+-----------+--------------------------
+ t | f |
+(1 row)
+
+TRUNCATE mcv_list;
+-- a => b, a => c, b => c
+INSERT INTO mcv_list
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE mcv_list;
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+ mcv_enabled | mcv_built | pg_mv_stats_mcvlist_info
+-------------+-----------+--------------------------
+ t | t | nitems=1000
+(1 row)
+
+TRUNCATE mcv_list;
+-- a => b, a => c
+INSERT INTO mcv_list
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE mcv_list;
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+ mcv_enabled | mcv_built | pg_mv_stats_mcvlist_info
+-------------+-----------+--------------------------
+ t | t | nitems=1000
+(1 row)
+
+TRUNCATE mcv_list;
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO mcv_list
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX mcv_idx ON mcv_list (a, b);
+ANALYZE mcv_list;
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+ mcv_enabled | mcv_built | pg_mv_stats_mcvlist_info
+-------------+-----------+--------------------------
+ t | t | nitems=100
+(1 row)
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mcv_list WHERE a = '10' AND b = '5';
+ QUERY PLAN
+------------------------------------------------------------
+ Bitmap Heap Scan on mcv_list
+ Recheck Cond: ((a = '10'::text) AND (b = '5'::text))
+ -> Bitmap Index Scan on mcv_idx
+ Index Cond: ((a = '10'::text) AND (b = '5'::text))
+(4 rows)
+
+TRUNCATE mcv_list;
+-- check explain (expect bitmap index scan, not plain index scan) with NULLs
+INSERT INTO mcv_list
+ SELECT
+ (CASE WHEN i/10000 = 0 THEN NULL ELSE i/10000 END),
+ (CASE WHEN i/20000 = 0 THEN NULL ELSE i/20000 END),
+ (CASE WHEN i/40000 = 0 THEN NULL ELSE i/40000 END)
+ FROM generate_series(1,1000000) s(i);
+ANALYZE mcv_list;
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+ mcv_enabled | mcv_built | pg_mv_stats_mcvlist_info
+-------------+-----------+--------------------------
+ t | t | nitems=100
+(1 row)
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mcv_list WHERE a IS NULL AND b IS NULL;
+ QUERY PLAN
+---------------------------------------------------
+ Bitmap Heap Scan on mcv_list
+ Recheck Cond: ((a IS NULL) AND (b IS NULL))
+ -> Bitmap Index Scan on mcv_idx
+ Index Cond: ((a IS NULL) AND (b IS NULL))
+(4 rows)
+
+DELETE FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+DROP TABLE mcv_list;
+-- NULL values (mix of int and text columns)
+CREATE TABLE mcv_list (
+ a INT,
+ b TEXT,
+ c INT,
+ d TEXT
+);
+ALTER TABLE mcv_list ADD STATISTICS (mcv) ON (a, b, c, d);
+INSERT INTO mcv_list
+ SELECT
+ mod(i, 100),
+ (CASE WHEN mod(i, 200) = 0 THEN NULL ELSE mod(i,200) END),
+ mod(i, 400),
+ (CASE WHEN mod(i, 300) = 0 THEN NULL ELSE mod(i,600) END)
+ FROM generate_series(1,10000) s(i);
+ANALYZE mcv_list;
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+ mcv_enabled | mcv_built | pg_mv_stats_mcvlist_info
+-------------+-----------+--------------------------
+ t | t | nitems=1200
+(1 row)
+
+DELETE FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+DROP TABLE mcv_list;
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index f0117ca..6d9ab2f 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1357,7 +1357,9 @@ pg_mv_stats| SELECT n.nspname AS schemaname,
c.relname AS tablename,
s.stakeys AS attnums,
length(s.stadeps) AS depsbytes,
- pg_mv_stats_dependencies_info(s.stadeps) AS depsinfo
+ pg_mv_stats_dependencies_info(s.stadeps) AS depsinfo,
+ length(s.stamcv) AS mcvbytes,
+ pg_mv_stats_mcvlist_info(s.stamcv) AS mcvinfo
FROM ((pg_mv_statistic s
JOIN pg_class c ON ((c.oid = s.starelid)))
LEFT JOIN pg_namespace n ON ((n.oid = c.relnamespace)));
diff --git a/src/test/regress/parallel_schedule b/src/test/regress/parallel_schedule
index 00c6ddf..63727a4 100644
--- a/src/test/regress/parallel_schedule
+++ b/src/test/regress/parallel_schedule
@@ -111,4 +111,4 @@ test: event_trigger
test: stats
# run tests of multivariate stats
-test: mv_dependencies
+test: mv_dependencies mv_mcv
diff --git a/src/test/regress/serial_schedule b/src/test/regress/serial_schedule
index b818be9..5b07b3b 100644
--- a/src/test/regress/serial_schedule
+++ b/src/test/regress/serial_schedule
@@ -154,3 +154,4 @@ test: xml
test: event_trigger
test: stats
test: mv_dependencies
+test: mv_mcv
diff --git a/src/test/regress/sql/mv_mcv.sql b/src/test/regress/sql/mv_mcv.sql
new file mode 100644
index 0000000..410b52d
--- /dev/null
+++ b/src/test/regress/sql/mv_mcv.sql
@@ -0,0 +1,181 @@
+-- data type passed by value
+CREATE TABLE mcv_list (
+ a INT,
+ b INT,
+ c INT
+);
+
+-- unknown column
+ALTER TABLE mcv_list ADD STATISTICS (mcv) ON (unknown_column);
+
+-- single column
+ALTER TABLE mcv_list ADD STATISTICS (mcv) ON (a);
+
+-- single column, duplicated
+ALTER TABLE mcv_list ADD STATISTICS (mcv) ON (a, a);
+
+-- two columns, one duplicated
+ALTER TABLE mcv_list ADD STATISTICS (mcv) ON (a, a, b);
+
+-- unknown option
+ALTER TABLE mcv_list ADD STATISTICS (unknown_option) ON (a, b, c);
+
+-- missing MCV statistics
+ALTER TABLE mcv_list ADD STATISTICS (dependencies, max_mcv_items 200) ON (a, b, c);
+
+-- invalid max_mcv_items value / too low
+ALTER TABLE mcv_list ADD STATISTICS (mcv, max_mcv_items 10) ON (a, b, c);
+
+-- invalid max_mcv_items value / too high
+ALTER TABLE mcv_list ADD STATISTICS (mcv, max_mcv_items 10000) ON (a, b, c);
+
+-- correct command
+ALTER TABLE mcv_list ADD STATISTICS (mcv) ON (a, b, c);
+
+-- random data
+INSERT INTO mcv_list
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+
+ANALYZE mcv_list;
+
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+
+TRUNCATE mcv_list;
+
+-- a => b, a => c, b => c
+INSERT INTO mcv_list
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+
+ANALYZE mcv_list;
+
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+
+TRUNCATE mcv_list;
+
+-- a => b, a => c
+INSERT INTO mcv_list
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+
+ANALYZE mcv_list;
+
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+
+TRUNCATE mcv_list;
+
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO mcv_list
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX mcv_idx ON mcv_list (a, b);
+
+ANALYZE mcv_list;
+
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mcv_list WHERE a = 10 AND b = 5;
+
+DELETE FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+DROP TABLE mcv_list;
+
+-- varlena type (text)
+CREATE TABLE mcv_list (
+ a TEXT,
+ b TEXT,
+ c TEXT
+);
+
+ALTER TABLE mcv_list ADD STATISTICS (mcv) ON (a, b, c);
+
+-- random data
+INSERT INTO mcv_list
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+
+ANALYZE mcv_list;
+
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+
+TRUNCATE mcv_list;
+
+-- a => b, a => c, b => c
+INSERT INTO mcv_list
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+
+ANALYZE mcv_list;
+
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+
+TRUNCATE mcv_list;
+
+-- a => b, a => c
+INSERT INTO mcv_list
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE mcv_list;
+
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+
+TRUNCATE mcv_list;
+
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO mcv_list
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX mcv_idx ON mcv_list (a, b);
+ANALYZE mcv_list;
+
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mcv_list WHERE a = '10' AND b = '5';
+
+TRUNCATE mcv_list;
+
+-- check explain (expect bitmap index scan, not plain index scan) with NULLs
+INSERT INTO mcv_list
+ SELECT
+ (CASE WHEN i/10000 = 0 THEN NULL ELSE i/10000 END),
+ (CASE WHEN i/20000 = 0 THEN NULL ELSE i/20000 END),
+ (CASE WHEN i/40000 = 0 THEN NULL ELSE i/40000 END)
+ FROM generate_series(1,1000000) s(i);
+ANALYZE mcv_list;
+
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mcv_list WHERE a IS NULL AND b IS NULL;
+
+DELETE FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+DROP TABLE mcv_list;
+
+-- NULL values (mix of int and text columns)
+CREATE TABLE mcv_list (
+ a INT,
+ b TEXT,
+ c INT,
+ d TEXT
+);
+
+ALTER TABLE mcv_list ADD STATISTICS (mcv) ON (a, b, c, d);
+
+INSERT INTO mcv_list
+ SELECT
+ mod(i, 100),
+ (CASE WHEN mod(i, 200) = 0 THEN NULL ELSE mod(i,200) END),
+ mod(i, 400),
+ (CASE WHEN mod(i, 300) = 0 THEN NULL ELSE mod(i,600) END)
+ FROM generate_series(1,10000) s(i);
+
+ANALYZE mcv_list;
+
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+
+DELETE FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+DROP TABLE mcv_list;
--
2.0.5
[Attachment: 0004-multivariate-histograms.patch (text/x-diff)]
From 166a13ed6152ebc0e384c53f765946ae8be5193f Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tv@fuzzy.cz>
Date: Sun, 11 Jan 2015 20:18:24 +0100
Subject: [PATCH 4/5] multivariate histograms
- extends the pg_mv_statistic catalog (add 'hist' fields)
- building the histograms during ANALYZE
- simple estimation while planning the queries
Includes regression tests mostly equal to those for functional
dependencies / MCV lists.
---
src/backend/catalog/system_views.sql | 4 +-
src/backend/commands/tablecmds.c | 67 +-
src/backend/optimizer/path/clausesel.c | 549 ++++++++-
src/backend/utils/mvstats/Makefile | 2 +-
src/backend/utils/mvstats/common.c | 41 +-
src/backend/utils/mvstats/histogram.c | 1800 ++++++++++++++++++++++++++++
src/backend/utils/mvstats/mcv.c | 1 +
src/include/catalog/pg_mv_statistic.h | 24 +-
src/include/catalog/pg_proc.h | 2 +
src/include/utils/mvstats.h | 76 +-
src/test/regress/expected/mv_histogram.out | 210 ++++
src/test/regress/expected/rules.out | 4 +-
src/test/regress/parallel_schedule | 2 +-
src/test/regress/serial_schedule | 1 +
src/test/regress/sql/mv_histogram.sql | 179 +++
15 files changed, 2924 insertions(+), 38 deletions(-)
create mode 100644 src/backend/utils/mvstats/histogram.c
create mode 100644 src/test/regress/expected/mv_histogram.out
create mode 100644 src/test/regress/sql/mv_histogram.sql
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 4538e63..87086f9 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -158,7 +158,9 @@ CREATE VIEW pg_mv_stats AS
length(S.stadeps) as depsbytes,
pg_mv_stats_dependencies_info(S.stadeps) as depsinfo,
length(S.stamcv) AS mcvbytes,
- pg_mv_stats_mcvlist_info(S.stamcv) AS mcvinfo
+ pg_mv_stats_mcvlist_info(S.stamcv) AS mcvinfo,
+ length(S.stahist) AS histbytes,
+ pg_mv_stats_histogram_info(S.stahist) AS histinfo
FROM (pg_mv_statistic S JOIN pg_class C ON (C.oid = S.starelid))
LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace);
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index fae0fc7..6b01660 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -11876,15 +11876,19 @@ static int compare_int16(const void *a, const void *b)
* The code is an unholy mix of pieces that really belong to other parts
* of the source tree.
*
- * FIXME Check that the types are pass-by-value and support sort,
- * although maybe we can live without the sort (and only build
- * MCV list / association rules).
- *
- * FIXME This should probably check for duplicate stats (i.e. same
- * keys, same options). Although maybe it's useful to have
- * multiple stats on the same columns with different options
- * (say, a detailed MCV-only stats for some queries, histogram
- * for others, etc.)
+ * TODO Check that the types support sort, although maybe we can live
+ * without it (and only build MCV list / association rules).
+ *
+ * TODO This should probably check for duplicate stats (i.e. same
+ * keys, same options). Although maybe it's useful to have
+ * multiple stats on the same columns with different options
+ * (say, a detailed MCV-only stats for some queries, histogram
+ * for others, etc.)
+ *
+ * TODO It might be useful to have ALTER TABLE DROP STATISTICS too, but
+ * it's tricky because there may be multiple kinds of stats for the
+ * same list of columns, with different options (e.g. one just MCV
+ * list, another with histogram, etc.).
*/
static void ATExecAddStatistics(AlteredTableInfo *tab, Relation rel,
StatisticsDef *def, LOCKMODE lockmode)
@@ -11902,12 +11906,15 @@ static void ATExecAddStatistics(AlteredTableInfo *tab, Relation rel,
/* by default build everything */
bool build_dependencies = false,
- build_mcv = false;
+ build_mcv = false,
+ build_histogram = false;
- int32 max_mcv_items = -1;
+ int32 max_buckets = -1,
+ max_mcv_items = -1;
/* options required because of other options */
- bool require_mcv = false;
+ bool require_mcv = false,
+ require_histogram = false;
Assert(IsA(def, StatisticsDef));
@@ -11985,6 +11992,29 @@ static void ATExecAddStatistics(AlteredTableInfo *tab, Relation rel,
MVSTAT_MCVLIST_MAX_ITEMS)));
}
+ else if (strcmp(opt->defname, "histogram") == 0)
+ build_histogram = defGetBoolean(opt);
+ else if (strcmp(opt->defname, "max_buckets") == 0)
+ {
+ max_buckets = defGetInt32(opt);
+
+ /* this option requires 'histogram' to be enabled */
+ require_histogram = true;
+
+ /* sanity check */
+ if (max_buckets < MVSTAT_HIST_MIN_BUCKETS)
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("minimum number of buckets is %d",
+ MVSTAT_HIST_MIN_BUCKETS)));
+
+ else if (max_buckets > MVSTAT_HIST_MAX_BUCKETS)
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("minimum number of buckets is %d",
+ MVSTAT_HIST_MAX_BUCKETS)));
+
+ }
else
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
@@ -11993,10 +12023,10 @@ static void ATExecAddStatistics(AlteredTableInfo *tab, Relation rel,
}
/* check that at least some statistics were requested */
- if (! (build_dependencies || build_mcv))
+ if (! (build_dependencies || build_mcv || build_histogram))
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
- errmsg("no statistics type (dependencies, mcv) was requested")));
+ errmsg("no statistics type (dependencies, mcv, histogram) was requested")));
/* now do some checking of the options */
if (require_mcv && (! build_mcv))
@@ -12004,6 +12034,11 @@ static void ATExecAddStatistics(AlteredTableInfo *tab, Relation rel,
(errcode(ERRCODE_SYNTAX_ERROR),
errmsg("option 'mcv' is required by other options(s)")));
+ if (require_histogram && (! build_histogram))
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("option 'histogram' is required by other options(s)")));
+
/* sort the attnums and build int2vector */
qsort(attnums, numcols, sizeof(int16), compare_int16);
stakeys = buildint2vector(attnums, numcols);
@@ -12021,10 +12056,14 @@ static void ATExecAddStatistics(AlteredTableInfo *tab, Relation rel,
values[Anum_pg_mv_statistic_deps_enabled -1] = BoolGetDatum(build_dependencies);
values[Anum_pg_mv_statistic_mcv_enabled -1] = BoolGetDatum(build_mcv);
+ values[Anum_pg_mv_statistic_hist_enabled -1] = BoolGetDatum(build_histogram);
+
values[Anum_pg_mv_statistic_mcv_max_items -1] = Int32GetDatum(max_mcv_items);
+ values[Anum_pg_mv_statistic_hist_max_buckets -1] = Int32GetDatum(max_buckets);
nulls[Anum_pg_mv_statistic_stadeps -1] = true;
nulls[Anum_pg_mv_statistic_stamcv -1] = true;
+ nulls[Anum_pg_mv_statistic_stahist -1] = true;
/* insert the tuple into pg_mv_statistic */
mvstatrel = heap_open(MvStatisticRelationId, RowExclusiveLock);
diff --git a/src/backend/optimizer/path/clausesel.c b/src/backend/optimizer/path/clausesel.c
index d24aedf..ea4d588 100644
--- a/src/backend/optimizer/path/clausesel.c
+++ b/src/backend/optimizer/path/clausesel.c
@@ -53,6 +53,7 @@ static void addRangeClause(RangeQueryClause **rqlist, Node *clause,
#define MV_CLAUSE_TYPE_FDEP 0x01
#define MV_CLAUSE_TYPE_MCV 0x02
+#define MV_CLAUSE_TYPE_HIST 0x04
static bool clause_is_mv_compatible(PlannerInfo *root, Node *clause, Oid varRelid,
Oid *relid, Bitmapset **attnums, SpecialJoinInfo *sjinfo,
@@ -77,6 +78,8 @@ static Selectivity clauselist_mv_selectivity(PlannerInfo *root,
static Selectivity clauselist_mv_selectivity_mcvlist(PlannerInfo *root,
List *clauses, MVStats mvstats,
bool *fullmatch, Selectivity *lowsel);
+static Selectivity clauselist_mv_selectivity_histogram(PlannerInfo *root,
+ List *clauses, MVStats mvstats);
static int update_match_bitmap_mcvlist(PlannerInfo *root, List *clauses,
int2vector *stakeys, MCVList mcvlist,
@@ -84,6 +87,11 @@ static int update_match_bitmap_mcvlist(PlannerInfo *root, List *clauses,
Selectivity *lowsel, bool *fullmatch,
bool is_or);
+static int update_match_bitmap_histogram(PlannerInfo *root, List *clauses,
+ int2vector *stakeys, MVHistogram mvhist,
+ int nmatches, char * matches,
+ bool is_or);
+
/* used for merging bitmaps - AND (min), OR (max) */
#define MAX(x, y) (((x) > (y)) ? (x) : (y))
#define MIN(x, y) (((x) < (y)) ? (x) : (y))
@@ -266,7 +274,7 @@ clauselist_selectivity(PlannerInfo *root,
* From now on we're only interested in MCV-compatible clauses.
*/
mvattnums = collect_mv_attnums(root, clauses, varRelid, &relid, sjinfo,
- MV_CLAUSE_TYPE_MCV);
+ (MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST));
/*
* If there still are at least two columns, we'll try to select
@@ -292,7 +300,7 @@ clauselist_selectivity(PlannerInfo *root,
/* split the clauselist into regular and mv-clauses */
clauses = clauselist_mv_split(root, sjinfo, clauses,
varRelid, &mvclauses, mvstat,
- MV_CLAUSE_TYPE_MCV);
+ (MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST));
/* we've chosen the histogram to match the clauses */
Assert(mvclauses != NIL);
@@ -1146,6 +1154,7 @@ static Selectivity
clauselist_mv_selectivity(PlannerInfo *root, List *clauses, MVStats mvstats)
{
bool fullmatch = false;
+ Selectivity s1 = 0.0, s2 = 0.0;
/*
* Lowest frequency in the MCV list (may be used as an upper bound
@@ -1159,9 +1168,24 @@ clauselist_mv_selectivity(PlannerInfo *root, List *clauses, MVStats mvstats)
* MCV/histogram evaluation).
*/
- /* Evaluate the MCV selectivity */
- return clauselist_mv_selectivity_mcvlist(root, clauses, mvstats,
+ /* Evaluate the MCV first. */
+ s1 = clauselist_mv_selectivity_mcvlist(root, clauses, mvstats,
&fullmatch, &mcv_low);
+
+ /*
+ * If we got a full equality match on the MCV list, we're done (and
+ * the estimate is pretty good).
+ */
+ if (fullmatch && (s1 > 0.0))
+ return s1;
+
+ /* FIXME if (fullmatch) without matching MCV item, use the mcv_low
+ * selectivity as upper bound */
+
+ s2 = clauselist_mv_selectivity_histogram(root, clauses, mvstats);
+
+ /* TODO clamp to <= 1.0 (or more strictly, when possible) */
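+
+ /*
+ * XXX The histogram is built only from the sample rows not covered
+ * by the MCV list (see build_mv_stats), so the two selectivities
+ * should refer to disjoint sets of rows and may simply be added.
+ */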
+ return s1 + s2;
}
/*
@@ -1461,7 +1485,6 @@ clause_is_mv_compatible(PlannerInfo *root, Node *clause, Oid varRelid,
bool ok;
/* is it 'variable op constant' ? */
-
ok = (bms_membership(clause_relids) == BMS_SINGLETON) &&
(is_pseudo_constant_clause_relids(lsecond(expr->args),
right_relids) ||
@@ -1515,10 +1538,10 @@ clause_is_mv_compatible(PlannerInfo *root, Node *clause, Oid varRelid,
case F_SCALARLTSEL:
case F_SCALARGTSEL:
/* not compatible with functional dependencies */
- if (types & MV_CLAUSE_TYPE_MCV)
+ if (types & (MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST))
{
*attnums = bms_add_member(*attnums, var->varattno);
- return (types & MV_CLAUSE_TYPE_MCV);
+ return (types & (MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST));
}
return false;
@@ -2435,3 +2458,515 @@ update_match_bitmap_mcvlist(PlannerInfo *root, List *clauses,
return nmatches;
}
+
+/*
+ * Estimate selectivity of clauses using a histogram.
+ *
+ * If there's no histogram for the stats, the function returns 0.0.
+ *
+ * The general idea of this method is similar to how MCV lists are
+ * processed, except that this introduces the concept of a partial
+ * match (MCV only works with full match / mismatch).
+ *
+ * The algorithm works like this:
+ *
+ * 1) mark all buckets as 'full match'
+ * 2) walk through all the clauses
+ * 3) for a particular clause, walk through all the buckets
+ * 4) skip buckets that are already 'no match'
+ * 5) check clause for buckets that still match (at least partially)
+ * 6) sum frequencies for buckets to get selectivity
+ *
+ * Unlike MCV lists, histograms have a concept of a partial match.
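+ *
+ * For example a bucket with range [1, 5] in some dimension matches
+ * a clause (a < 3) only partially - some of the tuples it represents
+ * may match, but we don't know how many of them.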
+ *
+ * TODO This only handles AND-ed clauses, but it might work for OR-ed
+ * lists too - it just needs to reverse the logic a bit. I.e. start
+ * with 'no match' for all buckets, and increase the match level
+ * for the clauses (and skip buckets that are 'full match').
+ *
+ * TODO This might use a similar shortcut to MCV lists - count buckets
+ * marked as partial/full match, and terminate once this drop to 0.
+ * Not sure if it's really worth it - for MCV lists a situation like
+ * this is not uncommon, but for histograms it's not that clear.
+ */
+static Selectivity
+clauselist_mv_selectivity_histogram(PlannerInfo *root, List *clauses,
+ MVStats mvstats)
+{
+ int i;
+ Selectivity s = 0.0;
+ int nmatches = 0;
+ char *matches = NULL;
+ MVHistogram mvhist = NULL;
+
+ /* there's no histogram */
+ if (! mvstats->hist_built)
+ return 0.0;
+
+ /* fetch and deserialize the histogram from the catalog */
+ mvhist = deserialize_mv_histogram(fetch_mv_histogram(mvstats->mvoid));
+
+ Assert (mvhist != NULL);
+ Assert (clauses != NIL);
+ Assert (list_length(clauses) >= 2);
+
+ /*
+ * Bitmap of bucket matches (mismatch, partial, full). By default
+ * all buckets fully match, and the clauses gradually eliminate them.
+ */
+ matches = palloc0(sizeof(char) * mvhist->nbuckets);
+ memset(matches, MVSTATS_MATCH_FULL, sizeof(char)*mvhist->nbuckets);
+
+ nmatches = mvhist->nbuckets;
+
+ /* build the match bitmap */
+ update_match_bitmap_histogram(root, clauses,
+ mvstats->stakeys, mvhist,
+ nmatches, matches, false);
+
+ /* now, walk through the buckets and sum the selectivities */
+ for (i = 0; i < mvhist->nbuckets; i++)
+ {
+ if (matches[i] == MVSTATS_MATCH_FULL)
+ s += mvhist->buckets[i]->ntuples;
+ else if (matches[i] == MVSTATS_MATCH_PARTIAL)
+ s += 0.5 * mvhist->buckets[i]->ntuples;
+ }
+
+ /* release the allocated bitmap and deserialized histogram */
+ pfree(matches);
+ pfree(mvhist);
+
+ return s;
+}
+
+/*
+ * Evaluate clauses using the histogram, and update the match bitmap.
+ *
+ * The bitmap may be already partially set, so this is really a way to
+ * combine results of several clause lists - either when computing
+ * conditional probability P(A|B) or a combination of AND/OR clauses.
+ *
+ * Note: This is not a simple bitmap in the sense that there are more
+ * than two possible values for each item - no match, partial
+ * match and full match. So we need 2 bits per item.
+ *
+ * TODO This works with 'bitmap' where each item is represented as a
+ * char, which is slightly wasteful. Instead, we could use a bitmap
+ * with 2 bits per item, reducing the size to ~1/4. By using values
+ * 0, 1 and 3 (instead of 0, 1 and 2), the operations (merging etc.)
+ * might be performed just like for simple bitmap by using & and |,
+ * which might be faster than min/max.
+ */
+static int
+update_match_bitmap_histogram(PlannerInfo *root, List *clauses,
+ int2vector *stakeys, MVHistogram mvhist,
+ int nmatches, char * matches,
+ bool is_or)
+{
+ int i;
+ ListCell * l;
+
+ Assert (mvhist != NULL);
+ Assert(mvhist->nbuckets > 0);
+ Assert(nmatches >= 0);
+ Assert(nmatches <= mvhist->nbuckets);
+
+ Assert (clauses != NIL);
+ Assert(list_length(clauses) >= 1);
+
+ /* loop through the clauses and do the estimation */
+ foreach (l, clauses)
+ {
+ Node * clause = (Node*)lfirst(l);
+
+ /* if it's a RestrictInfo, then extract the clause */
+ if (IsA(clause, RestrictInfo))
+ clause = (Node*)((RestrictInfo*)clause)->clause;
+
+ /* it's either an OpExpr, a NullTest, or an AND/OR clause */
+ if (is_opclause(clause))
+ {
+ OpExpr * expr = (OpExpr*)clause;
+ bool varonleft = true;
+ bool ok;
+
+ FmgrInfo opproc; /* operator */
+ fmgr_info(get_opcode(expr->opno), &opproc);
+
+ ok = (NumRelids(clause) == 1) &&
+ (is_pseudo_constant_clause(lsecond(expr->args)) ||
+ (varonleft = false,
+ is_pseudo_constant_clause(linitial(expr->args))));
+
+ if (ok)
+ {
+ FmgrInfo ltproc;
+ Var * var = (varonleft) ? linitial(expr->args) : lsecond(expr->args);
+ Const * cst = (varonleft) ? lsecond(expr->args) : linitial(expr->args);
+ bool isgt = (! varonleft);
+
+ /*
+ * TODO Fetch only when really needed (probably for equality only)
+ *
+ * TODO Technically either lt/gt is sufficient.
+ *
+ * FIXME The code in analyze.c creates histograms only for types
+ * with enough ordering (by calling get_sort_group_operators).
+ * Is this the same assumption, i.e. are we certain that we
+ * get the ltproc/gtproc every time we ask? Or are there types
+ * where get_sort_group_operators returns ltopr and here we
+ * get nothing?
+ */
+ TypeCacheEntry *typecache
+ = lookup_type_cache(var->vartype, TYPECACHE_EQ_OPR | TYPECACHE_LT_OPR
+ | TYPECACHE_GT_OPR);
+
+ /* lookup dimension for the attribute */
+ int idx = mv_get_index(var->varattno, stakeys);
+
+ fmgr_info(get_opcode(typecache->lt_opr), &ltproc);
+
+ /*
+ * Check this for all buckets that still have "true" in the bitmap
+ *
+ * We already know the clauses use suitable operators (because that's
+ * how we filtered them).
+ */
+ for (i = 0; i < mvhist->nbuckets; i++)
+ {
+ bool tmp;
+ MVBucket bucket = mvhist->buckets[i];
+
+ /*
+ * For AND-lists, we can also mark NULL buckets as 'no match'
+ * (and then skip them). For OR-lists this is not possible.
+ */
+ if ((! is_or) && bucket->nullsonly[idx])
+ matches[i] = MVSTATS_MATCH_NONE;
+
+ /*
+ * Skip buckets that were already eliminated - this is important
+ * considering how we update the info (we only lower the match).
+ * We can't really do anything about the MATCH_PARTIAL buckets.
+ */
+ if ((! is_or) && (matches[i] == MVSTATS_MATCH_NONE))
+ continue;
+ else if (is_or && (matches[i] == MVSTATS_MATCH_FULL))
+ continue;
+
+ /*
+ * TODO Maybe it's possible to add here a similar optimization
+ * as for the MCV lists:
+ *
+ * (nmatches == 0) && AND-list => all eliminated (FALSE)
+ * (nmatches == N) && OR-list => all eliminated (TRUE)
+ *
+ * But it's more complex because of the partial matches.
+ */
+
+ /*
+ * If it's not a "<" or ">" or "=" operator, just ignore the
+ * clause. Otherwise note the relid and attnum for the variable.
+ *
+ * TODO I'm really unsure the handling of 'isgt' flag (that is, clauses
+ * with reverse order of variable/constant) is correct. I wouldn't
+ * be surprised if there was some mixup. Using the lt/gt operators
+ * instead of messing with the opproc could make it simpler.
+ * It would however be using a different operator than the query,
+ * although it's not any shadier than using the selectivity function
+ * as is done currently.
+ *
+ * FIXME Once the min/max values are deduplicated, we can easily minimize
+ * the number of calls to the comparator (assuming we keep the
+ * deduplicated structure). See the note on compression at MVBucket
+ * serialize/deserialize methods.
+ */
+ switch (get_oprrest(expr->opno))
+ {
+ case F_SCALARLTSEL: /* column < constant */
+
+ if (! isgt) /* (var < const) */
+ {
+ /*
+ * First check whether the constant is below the lower boundary (in that
+ * case we can skip the bucket, because there's no overlap).
+ */
+ tmp = DatumGetBool(FunctionCall2Coll(&opproc,
+ DEFAULT_COLLATION_OID,
+ cst->constvalue,
+ bucket->min[idx]));
+ if (tmp)
+ {
+ /* no match */
+ UPDATE_RESULT(matches[i], MVSTATS_MATCH_NONE, is_or);
+ continue;
+ }
+
+ /*
+ * Now check whether the upper boundary is below the constant (in that
+ * case it's a partial match).
+ */
+ tmp = DatumGetBool(FunctionCall2Coll(&opproc,
+ DEFAULT_COLLATION_OID,
+ cst->constvalue,
+ bucket->max[idx]));
+
+ if (tmp)
+ /* partial match */
+ UPDATE_RESULT(matches[i], MVSTATS_MATCH_PARTIAL, is_or);
+
+ }
+ else /* (const < var) */
+ {
+ /*
+ * First check whether the constant is above the upper boundary (in that
+ * case we can skip the bucket, because there's no overlap).
+ */
+ tmp = DatumGetBool(FunctionCall2Coll(&opproc,
+ DEFAULT_COLLATION_OID,
+ bucket->max[idx],
+ cst->constvalue));
+ if (tmp)
+ {
+ /* no match */
+ UPDATE_RESULT(matches[i], MVSTATS_MATCH_NONE, is_or);
+ continue;
+ }
+
+ /*
+ * Now check whether the lower boundary is below the constant (in that
+ * case it's a partial match).
+ */
+ tmp = DatumGetBool(FunctionCall2Coll(&opproc,
+ DEFAULT_COLLATION_OID,
+ bucket->min[idx],
+ cst->constvalue));
+
+ if (tmp)
+ /* partial match */
+ UPDATE_RESULT(matches[i], MVSTATS_MATCH_PARTIAL, is_or);
+ }
+ break;
+
+ case F_SCALARGTSEL: /* column > constant */
+
+ if (! isgt) /* (var > const) */
+ {
+ /*
+ * First check whether the constant is above the upper boundary (in that
+ * case we can skip the bucket, because there's no overlap).
+ */
+ tmp = DatumGetBool(FunctionCall2Coll(&opproc,
+ DEFAULT_COLLATION_OID,
+ cst->constvalue,
+ bucket->max[idx]));
+ if (tmp)
+ {
+ /* no match */
+ UPDATE_RESULT(matches[i], MVSTATS_MATCH_NONE, is_or);
+ continue;
+ }
+
+ /*
+ * Now check whether the lower boundary is below the constant (in that
+ * case it's a partial match).
+ */
+ tmp = DatumGetBool(FunctionCall2Coll(&opproc,
+ DEFAULT_COLLATION_OID,
+ cst->constvalue,
+ bucket->min[idx]));
+
+ if (tmp)
+ /* partial match */
+ UPDATE_RESULT(matches[i], MVSTATS_MATCH_PARTIAL, is_or);
+ }
+ else /* (const > var) */
+ {
+ /*
+ * First check whether the constant is below the lower boundary (in
+ * that case we can skip the bucket, because there's no overlap).
+ */
+ tmp = DatumGetBool(FunctionCall2Coll(&opproc,
+ DEFAULT_COLLATION_OID,
+ bucket->min[idx],
+ cst->constvalue));
+ if (tmp)
+ {
+ /* no match */
+ UPDATE_RESULT(matches[i], MVSTATS_MATCH_NONE, is_or);
+ continue;
+ }
+
+ /*
+ * Now check whether the upper boundary is below the constant (in that
+ * case it's a partial match).
+ */
+ tmp = DatumGetBool(FunctionCall2Coll(&opproc,
+ DEFAULT_COLLATION_OID,
+ bucket->max[idx],
+ cst->constvalue));
+
+ if (tmp)
+ /* partial match */
+ UPDATE_RESULT(matches[i], MVSTATS_MATCH_PARTIAL, is_or);
+ }
+ break;
+
+ case F_EQSEL:
+
+ /*
+ * We only check whether the value is within the bucket, using the lt/gt
+ * operators fetched from type cache.
+ *
+ * TODO We'll use the default 50% estimate, but that's probably way off
+ * if there are multiple distinct values. Consider tweaking this
+ * somehow, e.g. using only a part inversely proportional to the
+ * estimated number of distinct values in the bucket.
+ *
+ * TODO This does not handle inclusion flags at the moment, thus counting
+ * some buckets twice (when hitting the boundary).
+ *
+ * TODO Optimization is that if max[i] == min[i], it's effectively a MCV
+ * item and we can count the whole bucket as a complete match (thus
+ * using 100% bucket selectivity and not just 50%).
+ *
+ * TODO Technically some buckets may "degenerate" into single-value
+ * buckets (not necessarily for all the dimensions) - maybe this
+ * is better than keeping a separate MCV list (multi-dimensional).
+ * Update: Actually, that's unlikely to be better than a separate
+ * MCV list for two reasons - first, it requires ~2x the space
+ * (because of storing lower/upper boundaries) and second because
+ * the buckets are ranges - depending on the partitioning algorithm
+ * it may not even degenerate into a (min=max) bucket. For example
+ * the current partitioning algorithm never does that.
+ */
+ tmp = DatumGetBool(FunctionCall2Coll(&ltproc,
+ DEFAULT_COLLATION_OID,
+ cst->constvalue,
+ bucket->min[idx]));
+
+ if (tmp)
+ {
+ /* no match */
+ UPDATE_RESULT(matches[i], MVSTATS_MATCH_NONE, is_or);
+ continue;
+ }
+
+ tmp = DatumGetBool(FunctionCall2Coll(&ltproc,
+ DEFAULT_COLLATION_OID,
+ bucket->max[idx],
+ cst->constvalue));
+
+ if (tmp)
+ {
+ /* no match */
+ UPDATE_RESULT(matches[i], MVSTATS_MATCH_NONE, is_or);
+ continue;
+ }
+
+ /* partial match */
+ UPDATE_RESULT(matches[i], MVSTATS_MATCH_PARTIAL, is_or);
+
+ break;
+ }
+ }
+ }
+ }
+ else if (IsA(clause, NullTest))
+ {
+ NullTest * expr = (NullTest*)clause;
+ Var * var = (Var*)(expr->arg);
+
+ /* FIXME proper matching attribute to dimension */
+ int idx = mv_get_index(var->varattno, stakeys);
+
+ /*
+ * Walk through the buckets and evaluate the current clause. We can
+ * skip items that were already ruled out, and terminate if there are
+ * no remaining buckets that might possibly match.
+ */
+ for (i = 0; i < mvhist->nbuckets; i++)
+ {
+ MVBucket bucket = mvhist->buckets[i];
+
+ /*
+ * Skip buckets that were already eliminated - this is important
+ * considering how we update the info (we only lower the match)
+ */
+ if ((! is_or) && (matches[i] == MVSTATS_MATCH_NONE))
+ continue;
+ else if (is_or && (matches[i] == MVSTATS_MATCH_FULL))
+ continue;
+
+ /* if the clause mismatches the bucket, set it as MATCH_NONE */
+ if ((expr->nulltesttype == IS_NULL)
+ && (! bucket->nullsonly[idx]))
+ UPDATE_RESULT(matches[i], MVSTATS_MATCH_NONE, is_or);
+
+ else if ((expr->nulltesttype == IS_NOT_NULL) &&
+ (bucket->nullsonly[idx]))
+ UPDATE_RESULT(matches[i], MVSTATS_MATCH_NONE, is_or);
+ }
+ }
+ else if (or_clause(clause) || and_clause(clause))
+ {
+ /* AND/OR clause, with all clauses compatible with the selected MV stat */
+
+ int i;
+ BoolExpr *orclause = ((BoolExpr*)clause);
+ List *orclauses = orclause->args;
+
+ /* match/mismatch bitmap for each bucket */
+ int or_nmatches = 0;
+ char * or_matches = NULL;
+
+ Assert(orclauses != NIL);
+ Assert(list_length(orclauses) >= 2);
+
+ /* number of matching buckets */
+ or_nmatches = mvhist->nbuckets;
+
+ /* match bitmap for the nested clauses (initialized below) */
+ or_matches = palloc0(sizeof(char) * or_nmatches);
+
+ if (or_clause(clause))
+ {
+ /* OR clauses assume nothing matches, initially */
+ memset(or_matches, MVSTATS_MATCH_NONE, sizeof(char)*or_nmatches);
+ or_nmatches = 0;
+ }
+ else
+ {
+ /* AND clauses assume everything matches, initially */
+ memset(or_matches, MVSTATS_MATCH_FULL, sizeof(char)*or_nmatches);
+ }
+
+ /* build the match bitmap for the OR-clauses */
+ or_nmatches = update_match_bitmap_histogram(root, orclauses,
+ stakeys, mvhist,
+ or_nmatches, or_matches, or_clause(clause));
+
+ /* merge the bitmap into the existing one */
+ for (i = 0; i < mvhist->nbuckets; i++)
+ {
+ /*
+ * To AND-merge the bitmaps, a MIN() semantics is used.
+ * For OR-merge, use MAX().
+ *
+ * FIXME this does not decrease the number of matches
+ */
+ UPDATE_RESULT(matches[i], or_matches[i], is_or);
+ }
+
+ pfree(or_matches);
+
+ }
+ else
+ {
+ elog(ERROR, "unknown clause type: %d", clause->type);
+ }
+ }
+
+ return nmatches;
+}
diff --git a/src/backend/utils/mvstats/Makefile b/src/backend/utils/mvstats/Makefile
index 3c0aff4..9dbb3b6 100644
--- a/src/backend/utils/mvstats/Makefile
+++ b/src/backend/utils/mvstats/Makefile
@@ -12,6 +12,6 @@ subdir = src/backend/utils/mvstats
top_builddir = ../../../..
include $(top_builddir)/src/Makefile.global
-OBJS = common.o mcv.o dependencies.o
+OBJS = common.o dependencies.o histogram.o mcv.o
include $(top_srcdir)/src/backend/common.mk
diff --git a/src/backend/utils/mvstats/common.c b/src/backend/utils/mvstats/common.c
index bd952c6..6e824bd 100644
--- a/src/backend/utils/mvstats/common.c
+++ b/src/backend/utils/mvstats/common.c
@@ -45,7 +45,8 @@ build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
{
MVDependencies deps = NULL;
MCVList mcvlist = NULL;
- int numrows_filtered = 0;
+ MVHistogram histogram = NULL;
+ int numrows_filtered = numrows;
/* int2 vector of attnums the stats should be computed on */
int2vector * attrs = mvstats[i].stakeys;
@@ -66,8 +67,16 @@ build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
if (mvstats->mcv_enabled)
mcvlist = build_mv_mcvlist(numrows, rows, attrs, stats, &numrows_filtered);
+ /*
+ * Build a multivariate histogram on the columns, using only the
+ * rows not covered by the MCV list (numrows_filtered), so that
+ * the two statistics do not overlap.
+ */
+ if ((numrows_filtered > 0) && (mvstats->hist_enabled))
+ histogram = build_mv_histogram(numrows_filtered, rows, attrs, stats, numrows);
+
/* store the histogram / MCV list in the catalog */
- update_mv_stats(mvstats[i].mvoid, deps, mcvlist, attrs, stats);
+ update_mv_stats(mvstats[i].mvoid, deps, mcvlist, histogram, attrs, stats);
+
+#ifdef MVSTATS_DEBUG
+ print_mv_histogram_info(histogram);
+#endif
}
}
@@ -149,7 +158,7 @@ list_mv_stats(Oid relid, int *nstats, bool built_only)
* Skip statistics that were not computed yet (if only stats
* that were already built were requested)
*/
- if (built_only && (! (stats->mcv_built || stats->deps_built)))
+ if (built_only && (! (stats->mcv_built || stats->deps_built || stats->hist_built)))
continue;
/* double the array size if needed */
@@ -161,10 +170,15 @@ list_mv_stats(Oid relid, int *nstats, bool built_only)
result[*nstats].mvoid = HeapTupleGetOid(htup);
result[*nstats].stakeys = buildint2vector(stats->stakeys.values, stats->stakeys.dim1);
+
result[*nstats].deps_enabled = stats->deps_enabled;
result[*nstats].mcv_enabled = stats->mcv_enabled;
+ result[*nstats].hist_enabled = stats->hist_enabled;
+
result[*nstats].deps_built = stats->deps_built;
result[*nstats].mcv_built = stats->mcv_built;
+ result[*nstats].hist_built = stats->hist_built;
+
*nstats += 1;
}
@@ -178,9 +192,16 @@ list_mv_stats(Oid relid, int *nstats, bool built_only)
return result;
}
+/*
+ * FIXME This adds statistics, but we need to drop statistics when the
+ * table is dropped. Not sure what to do when a column is dropped.
+ * Either we can (a) remove all stats on that column, (b) remove
+ * the column from defined stats and force rebuild, (c) remove the
+ * column on next ANALYZE. Or maybe something else?
+ */
void
update_mv_stats(Oid mvoid,
- MVDependencies dependencies, MCVList mcvlist,
+ MVDependencies dependencies, MCVList mcvlist, MVHistogram histogram,
int2vector *attrs, VacAttrStats **stats)
{
HeapTuple stup,
@@ -213,19 +234,31 @@ update_mv_stats(Oid mvoid,
values[Anum_pg_mv_statistic_stamcv - 1] = PointerGetDatum(data);
}
+ if (histogram != NULL)
+ {
+ bytea * data = serialize_mv_histogram(histogram, attrs, stats);
+ nulls[Anum_pg_mv_statistic_stahist-1] = (data == NULL);
+ values[Anum_pg_mv_statistic_stahist - 1]
+ = PointerGetDatum(data);
+ }
+
/* always replace the value (either by bytea or NULL) */
replaces[Anum_pg_mv_statistic_stadeps -1] = true;
replaces[Anum_pg_mv_statistic_stamcv -1] = true;
+ replaces[Anum_pg_mv_statistic_stahist-1] = true;
/* always change the availability flags */
nulls[Anum_pg_mv_statistic_deps_built -1] = false;
nulls[Anum_pg_mv_statistic_mcv_built -1] = false;
+ nulls[Anum_pg_mv_statistic_hist_built-1] = false;
replaces[Anum_pg_mv_statistic_deps_built-1] = true;
replaces[Anum_pg_mv_statistic_mcv_built -1] = true;
+ replaces[Anum_pg_mv_statistic_hist_built -1] = true;
values[Anum_pg_mv_statistic_deps_built-1] = BoolGetDatum(dependencies != NULL);
values[Anum_pg_mv_statistic_mcv_built -1] = BoolGetDatum(mcvlist != NULL);
+ values[Anum_pg_mv_statistic_hist_built -1] = BoolGetDatum(histogram != NULL);
/* Is there already a pg_mv_statistic tuple for this attribute? */
oldtup = SearchSysCache1(MVSTATOID,
diff --git a/src/backend/utils/mvstats/histogram.c b/src/backend/utils/mvstats/histogram.c
new file mode 100644
index 0000000..2a7f660
--- /dev/null
+++ b/src/backend/utils/mvstats/histogram.c
@@ -0,0 +1,1800 @@
+/*-------------------------------------------------------------------------
+ *
+ * histogram.c
+ * POSTGRES multivariate histograms
+ *
+ *
+ * Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ * src/backend/utils/mvstats/histogram.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "common.h"
+
+/*
+ * Multivariate histograms
+ *
+ * Histograms are a collection of buckets, represented by n-dimensional
+ * rectangles. Each rectangle is delimited by a min/max value in each
+ * dimension, stored in an array, so that the bucket includes values
+ * fulfilling condition
+ *
+ * min[i] <= value[i] <= max[i]
+ *
+ * where 'i' is the dimension. In 1D this corresponds to a simple
+ * interval, in 2D to a rectangle, and in 3D to a block. If you can
+ * imagine this in 4D, congrats!
+ *
+ * In addition to the boundaries, each bucket tracks additional details:
+ *
+ * * frequency (fraction of tuples it matches)
+ * * whether the boundaries are inclusive or exclusive
+ * * whether the dimension contains only NULL values
+ * * number of distinct values in each dimension (for building)
+ *
+ * and possibly some additional information.
+ *
+ * We do expect to support multiple histogram types, with different
+ * features etc. The 'type' field is used to identify those types.
+ * Technically some histogram types might use completely different
+ * bucket representation, but that's not expected at the moment.
+ *
+ * Although the current implementation builds non-overlapping buckets,
+ * the code does not rely on the non-overlapping nature - there are
+ * interesting types of histograms / histogram building algorithms
+ * producing overlapping buckets.
+ *
+ * TODO Currently the histogram does not include information about what
+ * part of the table it covers (because the frequencies are
+ * computed from the rows that may be filtered by the MCV list). Seems
+ * wrong, possibly causing misestimates (when not matching the MCV
+ * list, we'll probably get much higher selectivity).
+ *
+ *
+ * Estimating selectivity
+ * ----------------------
+ * With histograms, we always "match" a whole bucket, not individual
+ * rows (or values), irrespective of the type of clause. Therefore we
+ * can't use the optimizations for equality clauses, as in MCV lists.
+ *
+ * The current implementation uses histograms to estimate these types
+ * of clauses (think of WHERE conditions):
+ *
+ * (a) equality clauses WHERE (a = 1) AND (b = 2)
+ * (b) inequality clauses WHERE (a < 1) AND (b >= 2)
+ *
+ * It's possible to add more clauses, for example:
+ *
+ * (a) NULL clauses WHERE (a IS NULL) AND (b IS NOT NULL)
+ * (b) multi-var clauses WHERE (a > b)
+ *
+ * and so on. These are tasks for the future, not yet implemented.
+ *
+ * When used on low-cardinality data, histograms usually perform
+ * considerably worse than MCV lists (which are a good fit for this
+ * kind of data). This is especially true on categorical data, where
+ * ordering of the values is only loosely related to the meaning of the
+ * data, as proper ordering is crucial for histograms.
+ *
+ * On high-cardinality data the histograms are usually a better choice,
+ * because MCV lists can't accurately represent the distribution.
+ *
+ * By evaluating a clause on a bucket, we may get one of three results:
+ *
+ * (a) FULL_MATCH - The bucket definitely matches the clause.
+ *
+ * (b) PARTIAL_MATCH - The bucket matches the clause, but not
+ * necessarily all the tuples it represents.
+ *
+ * (c) NO_MATCH - The bucket definitely does not match the clause.
+ *
+ * This may be illustrated using a range [1, 5], which is essentially
+ * a 1D bucket. With clause
+ *
+ * WHERE (a < 10) => FULL_MATCH (all range values are below
+ * 10, so the whole bucket matches)
+ *
+ * WHERE (a < 3) => PARTIAL_MATCH (there may be values matching
+ * the clause, but we don't know how many)
+ *
+ * WHERE (a < 0) => NO_MATCH (all range values are at least 1, so
+ * no values from the bucket match)
+ *
+ * Some clauses may produce only some of those results - for example
+ * equality clauses may never produce FULL_MATCH as we always hit only
+ * part of the bucket, not all the values. This results in less accurate
+ * estimates compared to MCV lists, where we can hit an MCV item exactly
+ * (an extreme case of that is 'full match').
+ *
+ * There are clauses that may not produce any PARTIAL_MATCH results.
+ * A nice example of that is 'IS [NOT] NULL' clause, which either
+ * matches the bucket completely (FULL_MATCH) or not at all (NO_MATCH),
+ * thanks to how the NULL-buckets are constructed.
+ *
+ * TODO The IS [NOT] NULL clause is not yet implemented, but should be
+ * rather trivial to add.
+ *
+ * Computing the total selectivity estimate is trivial - simply sum
+ * selectivities from all the FULL_MATCH and PARTIAL_MATCH buckets, but
+ * multiply the PARTIAL_MATCH buckets by 0.5 to minimize average error.
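+ *
+ * For example, assume three buckets with frequencies 0.2, 0.1 and
+ * 0.1, where the first one is a FULL_MATCH, the second one a
+ * PARTIAL_MATCH and the last one a NO_MATCH. Then the estimate is
+ *
+ * 0.2 + 0.5 * 0.1 + 0.0 = 0.25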
+ *
+ *
+ * NULL handling
+ * -------------
+ * Buckets may not contain tuples with NULL and non-NULL values in
+ * a single dimension (attribute). To handle this, the histogram may
+ * contain NULL-buckets, i.e. buckets with one or more NULL-only
+ * dimensions.
+ *
+ * The maximum number of NULL-buckets is determined by the number of
+ * attributes the histogram is built on. For N-dimensional histogram,
+ * the maximum number of NULL-buckets is 2^N. So for 8 attributes
+ * (which is the current value of MVSTATS_MAX_DIMENSIONS), there may be
+ * up to 256 NULL-buckets.
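+ *
+ * For example, for two columns (a,b) there may be buckets with
+ * dimensions (NULL, NULL), (NULL, non-NULL) and (non-NULL, NULL),
+ * in addition to the regular (non-NULL, non-NULL) buckets.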
+ *
+ * Those buckets are only built if needed - if there are no NULL values
+ * in the data, no such buckets are built.
+ *
+ *
+ * Serialization
+ * -------------
+ * After building, the histogram is serialized into a more efficient
+ * form (dedup boundary values etc.). See serialize_mv_histogram() for
+ * more details about how it's done.
+ *
+ * Serialized histograms are marked with 'magic' constant, to make it
+ * easier to check the bytea really is a histogram in serialized form.
+ *
+ *
+ * TODO This structure is used both when building the histogram, and
+ * then when using it to compute estimates. That's why the last
+ * few elements are not used once the histogram is built.
+ *
+ * Add a pointer to 'private' data, meant for data private to the
+ * various histogram-building algorithms. That would also remove
+ * the bogus / unnecessary fields.
+ *
+ * TODO The limit on number of buckets is quite arbitrary, aiming for
+ * sufficient accuracy while still being fast. Probably should be
+ * replaced with a dynamic limit dependent on statistics target,
+ * number of attributes (dimensions) and statistics target
+ * associated with the attributes. Also, this needs to be related
+ * to the number of sampled rows, by either clamping it to a
+ * reasonable number (after seeing the number of rows) or using
+ * it when computing the number of rows to sample. Something like
+ * 10 rows per bucket seems reasonable.
+ *
+ * TODO Add MVSTAT_HIST_ROWS_PER_BUCKET tracking minimal number of
+ * tuples per bucket (also, see the previous TODO).
+ *
+ * TODO We may replace the bool arrays with a suitably large data type
+ * (say, uint16 or uint32) and get rid of the allocations. It's
+ * unlikely we'll ever support more than 32 columns as that'd
+ * result in poor precision, huge histograms (splitting each
+ * dimension once would mean 2^32 buckets), and very expensive
+ * estimation. MCVItem already does it this way.
+ *
+ * Update: Actually, this is not 100% true, because we're splitting
+ * a single bucket, not all the buckets at the same time. So each
+ * split simply adds one new bucket, and we choose the bucket that
+ * is most in need of a split. So even with 32 columns this might
+ * give reasonable accuracy, maybe? After 1000 splits we'll get
+ * about 1001 buckets, and some may be quite large (if that area
+ * has a low frequency of tuples).
+ *
+ * There are other challenges though - e.g. with this many columns
+ * it's more likely to reference both label/non-label columns,
+ * which is rather quirky (especially with histograms).
+ *
+ * However, while this would save some space for histograms built
+ * on many columns, it won't save anything for up to 4 columns
+ * (actually, on less than 3 columns it's probably wasteful).
+ *
+ * TODO Maybe the distinct stats (both for combination of all columns
+ * and for combinations of various subsets of columns) should be
+ * moved to a separate structure (next to histogram/MCV/...) to
+ * make it useful even without a histogram computed etc.
+ */
+
+static MVBucket create_initial_mv_bucket(int numrows, HeapTuple *rows,
+ int2vector *attrs,
+ VacAttrStats **stats);
+
+static MVBucket select_bucket_to_partition(int nbuckets, MVBucket * buckets);
+
+static MVBucket partition_bucket(MVBucket bucket, int2vector *attrs,
+ VacAttrStats **stats);
+
+static MVBucket copy_mv_bucket(MVBucket bucket, uint32 ndimensions);
+
+static void update_bucket_ndistinct(MVBucket bucket, int2vector *attrs,
+ VacAttrStats ** stats);
+
+static void update_dimension_ndistinct(MVBucket bucket, int dimension,
+ int2vector *attrs,
+ VacAttrStats ** stats,
+ bool update_boundaries);
+
+static void create_null_buckets(MVHistogram histogram, int bucket_idx,
+ int2vector *attrs, VacAttrStats ** stats);
+
+static int bsearch_comparator(const void * a, const void * b);
+
+/*
+ * Each serialized bucket needs to store (in this order):
+ *
+ * - number of tuples (float)
+ * - min inclusive flags (ndim * sizeof(bool))
+ * - max inclusive flags (ndim * sizeof(bool))
+ * - null dimension flags (ndim * sizeof(bool))
+ * - min boundary indexes (ndim * sizeof(int32))
+ * - max boundary indexes (ndim * sizeof(int32))
+ *
+ * So in total:
+ *
+ * ndim * (2 * sizeof(int32) + 3 * sizeof(bool)) +
+ * sizeof(float)
+ */
+#define BUCKET_SIZE(ndims) \
+ (ndims * (2 * sizeof(int32) + 3 * sizeof(bool)) + sizeof(float))
+
+/* pointers into a flat serialized bucket of BUCKET_SIZE(n) bytes */
+#define BUCKET_NTUPLES(b) ((float*)b)
+#define BUCKET_MIN_INCL(b,n) ((bool*)(b + sizeof(float)))
+#define BUCKET_MAX_INCL(b,n) (BUCKET_MIN_INCL(b,n) + n)
+#define BUCKET_NULLS_ONLY(b,n) (BUCKET_MAX_INCL(b,n) + n)
+#define BUCKET_MIN_INDEXES(b,n) ((int32*)(BUCKET_NULLS_ONLY(b,n) + n))
+#define BUCKET_MAX_INDEXES(b,n) ((BUCKET_MIN_INDEXES(b,n) + n))
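+
+/*
+ * For example, for a 2-dimensional bucket (and assuming 1B booleans
+ * and 4B floats), the serialized layout is:
+ *
+ * [ntuples 4B][min_incl 2B][max_incl 2B][nulls_only 2B]
+ * [min_indexes 2 x 4B][max_indexes 2 x 4B]
+ */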
+
+/* some debugging methods */
+#ifdef MVSTATS_DEBUG
+static void print_mv_histogram_info(MVHistogram histogram);
+#endif
+
+/*
+ * Data used while building the histogram.
+ */
+typedef struct HistogramBuildData {
+
+ float ndistinct; /* frequency of distinct values */
+
+ HeapTuple *rows; /* array of sample rows */
+ uint32 numrows; /* number of sample rows (array size) */
+
+ /* index of the dimension the bucket was split previously */
+ int last_split_dimension;
+
+ /*
+ * Number of distinct values in each dimension. This is used when
+ * building the histogram (and is not serialized/deserialized).
+ *
+ * XXX Maybe it could be useful for improving ndistinct estimates for
+ * combinations of columns (e.g. in GROUP BY queries). It would
+ * probably mean tracking 2^N values for each bucket, and even if
+ * those values might be stored in 1B (which is unlikely) it's
+ * still a lot of space (considering the expected number of
+ * buckets). So maybe that might be tracked just at the top level.
+ *
+ * TODO Consider tracking ndistincts for all attribute combinations.
+ */
+ uint32 *ndistincts;
+
+} HistogramBuildData;
+
+typedef HistogramBuildData *HistogramBuild;
+
+/*
+ * Building a multivariate histogram. In short, it first creates a single
+ * bucket containing all the rows, and then repeatedly splits it by first
+ * searching for the bucket / dimension most in need of a split.
+ *
+ * The current criteria is rather simple, by looking at the number of
+ * distinct values (combinations of column values for a bucket, column
+ * values for a dimension). This is somewhat naive, but seems to work
+ * quite well. See the discussion at select_bucket_to_partition and
+ * partition_bucket for more details about alternative algorithms.
+ *
+ * So the current algorithm looks like this:
+ *
+ * build NULL-buckets (create_null_buckets)
+ *
+ * while [not reaching maximum number of buckets]
+ *
+ * choose bucket to partition (max distinct combinations)
+ * if no bucket to partition
+ * terminate the algorithm
+ *
+ * choose bucket dimension to partition (max distinct values)
+ * split the bucket into two buckets
+ */
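+
+/*
+ * For example, with a limit of N buckets the loop below performs at
+ * most (N-1) splits, as each split adds exactly one bucket (a few
+ * more buckets may be created up front by create_null_buckets).
+ */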
+MVHistogram
+build_mv_histogram(int numrows, HeapTuple *rows, int2vector *attrs,
+ VacAttrStats **stats, int numrows_total)
+{
+ int i;
+ int numattrs = attrs->dim1;
+
+ MVHistogram histogram = (MVHistogram)palloc0(sizeof(MVHistogramData));
+
+ HeapTuple * rows_copy = (HeapTuple*)palloc0(numrows * sizeof(HeapTuple));
+ memcpy(rows_copy, rows, sizeof(HeapTuple) * numrows);
+
+ Assert((numattrs >= 2) && (numattrs <= MVSTATS_MAX_DIMENSIONS));
+
+ histogram->ndimensions = numattrs;
+
+ histogram->magic = MVSTAT_HIST_MAGIC;
+ histogram->type = MVSTAT_HIST_TYPE_BASIC;
+ histogram->nbuckets = 1;
+
+ /* create max buckets (better than repalloc for short-lived objects) */
+ histogram->buckets = (MVBucket*)palloc0(MVSTAT_HIST_MAX_BUCKETS * sizeof(MVBucket));
+
+ /* create the initial bucket, covering the whole sample set */
+ histogram->buckets[0]
+ = create_initial_mv_bucket(numrows, rows_copy, attrs, stats);
+
+ /*
+ * The initial bucket may contain NULL values, so we have to create
+ * buckets with NULL-only dimensions.
+ *
+ * FIXME We may need up to 2^ndims buckets - check that there are
+ * enough buckets (MVSTAT_HIST_MAX_BUCKETS >= 2^ndims).
+ */
+ create_null_buckets(histogram, 0, attrs, stats);
+
+ while (histogram->nbuckets < MVSTAT_HIST_MAX_BUCKETS)
+ {
+ MVBucket bucket = select_bucket_to_partition(histogram->nbuckets,
+ histogram->buckets);
+
+ /* no more buckets to partition */
+ if (bucket == NULL)
+ break;
+
+ histogram->buckets[histogram->nbuckets]
+ = partition_bucket(bucket, attrs, stats);
+
+ histogram->nbuckets += 1;
+ }
+
+ /* finalize the frequencies etc. */
+ for (i = 0; i < histogram->nbuckets; i++)
+ {
+ HistogramBuild build_data = ((HistogramBuild)histogram->buckets[i]->build_data);
+ histogram->buckets[i]->ntuples
+ = (build_data->numrows * 1.0) / numrows_total;
+ }
+
+ return histogram;
+}
+
+/* fetch the histogram (as a bytea) from the pg_mv_statistic catalog */
+bytea *
+fetch_mv_histogram(Oid mvoid)
+{
+ Relation indrel;
+ SysScanDesc indscan;
+ ScanKeyData skey;
+ HeapTuple htup;
+ bytea *stahist = NULL;
+
+ /* Prepare to scan pg_mv_statistic for the entry with the given OID. */
+ ScanKeyInit(&skey,
+ ObjectIdAttributeNumber,
+ BTEqualStrategyNumber, F_OIDEQ,
+ ObjectIdGetDatum(mvoid));
+
+ indrel = heap_open(MvStatisticRelationId, AccessShareLock);
+ indscan = systable_beginscan(indrel, MvStatisticOidIndexId, true,
+ NULL, 1, &skey);
+
+ while (HeapTupleIsValid(htup = systable_getnext(indscan)))
+ {
+ bool isnull = false;
+ Datum hist = SysCacheGetAttr(MVSTATOID, htup,
+ Anum_pg_mv_statistic_stahist, &isnull);
+
+ Assert(!isnull);
+
+ stahist = DatumGetByteaP(hist);
+
+ break;
+ }
+
+ systable_endscan(indscan);
+
+ heap_close(indrel, AccessShareLock);
+
+ /*
+ * TODO Maybe save the histogram into relcache, as in RelationGetIndexList
+ * (which served as an inspiration for this one)?
+ */
+
+ return stahist;
+}
+
+/* print some basic info about the histogram */
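+/*
+ * A sketch of the intended usage, assuming the function is exposed
+ * in SQL under the same name:
+ *
+ * SELECT pg_mv_stats_histogram_info(stahist) FROM pg_mv_statistic;
+ */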
+Datum
+pg_mv_stats_histogram_info(PG_FUNCTION_ARGS)
+{
+ bytea *data = PG_GETARG_BYTEA_P(0);
+ char *result;
+
+ MVHistogram hist = deserialize_mv_histogram(data);
+
+ result = palloc0(128);
+ snprintf(result, 128, "nbuckets=%d", hist->nbuckets);
+
+ PG_RETURN_TEXT_P(cstring_to_text(result));
+}
+
+
+/* used to pass context into bsearch() */
+static SortSupport ssup_private = NULL;
+
+/*
+ * Serialize the MV histogram into a bytea value. The basic algorithm
+ * is simple, and mostly mimics the MCV serialization:
+ *
+ * (1) perform deduplication for each attribute (separately)
+ * (a) collect all (non-NULL) attribute values from all buckets
+ * (b) sort the data (using 'lt' from VacAttrStats)
+ * (c) remove duplicate values from the array
+ *
+ * (2) serialize the arrays into a bytea value
+ *
+ * (3) process all buckets
+ * (a) replace min/max values with indexes into the arrays
+ *
+ * Each attribute has to be processed separately, because we're mixing
+ * different datatypes, and we don't know what equality means for them.
+ * We're also mixing pass-by-value and pass-by-ref types, and so on.
+ *
+ * We'll use 32-bit values for the indexes in step (3), although we
+ * could probably use just 16 bits as we don't allow more than 8k
+ * buckets in the histogram (max_buckets). Well, we might increase
+ * this to 16k and still fit into signed 16 bits. But let's be lazy
+ * and rely on the varlena compression to kick in - most bytes will
+ * be 0x00, so it should compress nicely.
+ *
+ *
+ * Deduplication in serialization
+ * ------------------------------
+ * The deduplication is very effective and important here, because every
+ * time we split a bucket, we keep all the boundary values, except for
+ * the dimension that was used for the split. Another way to look at
+ * this is that each split introduces 1 new value (the value used to do
+ * the split). A histogram with M buckets was created by (M-1) splits
+ * of the initial bucket, and each bucket has 2*N boundary values. So
+ * assuming the initial bucket does not have any 'collapsed' dimensions,
+ * the number of distinct values is
+ *
+ * (2*N + (M-1))
+ *
+ * but the total number of boundary values is
+ *
+ * 2*N*M
+ *
+ * which is clearly much higher. For a histogram on two columns, with
+ * 1024 buckets, it's 1027 vs. 4096. Of course, we're not saving all
+ * the difference (because we'll use 32-bit indexes into the values).
+ * But with large values (e.g. stored as varlena), this saves a lot.
+ *
+ * An interesting feature is that the total number of distinct values
+ * does not really grow with the number of dimensions, except for the
+ * size of the initial bucket. After that it only depends on number of
+ * buckets (i.e. number of splits).
+ *
+ * XXX Of course this only holds for the current histogram building
+ * algorithm. Algorithms doing the splits differently (e.g.
+ * producing overlapping buckets) may behave differently.
+ *
+ * TODO This only confirms we can use the uint16 indexes. The worst
+ * that could happen is if all the splits happened by a single
+ * dimension. To exhaust the uint16 this would require ~64k
+ * splits (needs to be reflected in MVSTAT_HIST_MAX_BUCKETS).
+ *
+ * TODO We don't need to use a separate boolean for each flag, instead
+ * use a single char and set bits.
+ *
+ * TODO We might get a bit better compression by considering the actual
+ * data type length. The current implementation treats all data
+ * types passed by value as requiring 8B, but for INT it's actually
+ * just 4B etc.
+ *
+ * OTOH this is only related to the lookup table, and most of the
+ * space is occupied by the buckets (with int16 indexes).
+ *
+ *
+ * Varlena compression
+ * -------------------
+ * This encoding may prevent automatic varlena compression (similarly
+ * to JSONB), because the first part of the serialized bytea will be
+ * an array of unique values (although sorted), and pglz decides
+ * whether to compress by trying to compress the first part (~1kB or
+ * so), which is likely to compress poorly due to the lack of
+ * repetition.
+ *
+ * One possible cure to that might be storing the buckets first, and
+ * then the deduplicated arrays. The buckets might be better suited
+ * for compression.
+ *
+ * On the other hand the encoding scheme is a context-aware compression,
+ * usually compressing to ~30% (or less, with large data types). So the
+ * lack of pglz compression may be OK.
+ *
+ * XXX But maybe we don't really want to compress this, to save on
+ * planning time?
+ *
+ * TODO Try storing the buckets / deduplicated arrays in reverse order,
+ * measure impact on compression.
+ *
+ *
+ * Deserialization
+ * ---------------
+ * The deserialization is currently implemented so that it reconstructs
+ * the histogram back into the same structures - this involves quite
+ * a few memcpy() and palloc() calls, but maybe we could create a special
+ * structure for the serialized histogram, and access the data directly,
+ * without the unpacking.
+ *
+ * Not only would it save some memory and CPU time, but it might actually
+ * work better with CPU caches (not polluting the caches).
+ *
+ * TODO Try to keep the compressed form, instead of deserializing it to
+ * MVHistogram/MVBucket.
+ *
+ *
+ * General TODOs
+ * -------------
+ * FIXME This probably leaks memory, or at least uses it inefficiently
+ * (many small palloc() calls instead of a large one).
+ *
+ * TODO Consider packing boolean flags (NULL) for each item into 'char'
+ * or a longer type (instead of using an array of bool items).
+ */
+bytea *
+serialize_mv_histogram(MVHistogram histogram, int2vector *attrs,
+ VacAttrStats **stats)
+{
+ int i = 0, j = 0;
+ Size total_length = 0;
+
+ bytea *output = NULL;
+ char *data = NULL;
+
+ int nbuckets = histogram->nbuckets;
+ int ndims = histogram->ndimensions;
+
+ /* allocated for serialized bucket data */
+ int bucketsize = BUCKET_SIZE(ndims);
+ char *bucket = palloc0(bucketsize);
+
+ /* values per dimension (and number of non-NULL values) */
+ Datum **values = (Datum**)palloc0(sizeof(Datum*) * ndims);
+ int *counts = (int*)palloc0(sizeof(int) * ndims);
+
+ /* info about dimensions (for deserialize) */
+ DimensionInfo * info
+ = (DimensionInfo *)palloc0(sizeof(DimensionInfo)*ndims);
+
+ /* sort support data */
+ SortSupport ssup = (SortSupport)palloc0(sizeof(SortSupportData)*ndims);
+
+ /* collect and deduplicate values for each dimension separately */
+ for (i = 0; i < ndims; i++)
+ {
+ int count;
+ StdAnalyzeData *tmp = (StdAnalyzeData *)stats[i]->extra_data;
+
+ /* keep important info about the data type */
+ info[i].typlen = stats[i]->attrtype->typlen;
+ info[i].typbyval = stats[i]->attrtype->typbyval;
+
+ /*
+ * Allocate space for all min/max values, including NULLs
+ * (we won't use them, but we don't know how many there are),
+ * and then collect all non-NULL values.
+ */
+ values[i] = (Datum*)palloc0(sizeof(Datum) * nbuckets * 2);
+
+ for (j = 0; j < histogram->nbuckets; j++)
+ {
+ /* skip buckets where this dimension is NULL-only */
+ if (! histogram->buckets[j]->nullsonly[i])
+ {
+ values[i][counts[i]] = histogram->buckets[j]->min[i];
+ counts[i] += 1;
+
+ values[i][counts[i]] = histogram->buckets[j]->max[i];
+ counts[i] += 1;
+ }
+ }
+
+ /* there are just NULL values in this dimension */
+ if (counts[i] == 0)
+ continue;
+
+ /* sort and deduplicate */
+ ssup[i].ssup_cxt = CurrentMemoryContext;
+ ssup[i].ssup_collation = DEFAULT_COLLATION_OID;
+ ssup[i].ssup_nulls_first = false;
+
+ PrepareSortSupportFromOrderingOp(tmp->ltopr, &ssup[i]);
+
+ qsort_arg(values[i], counts[i], sizeof(Datum),
+ compare_scalars_simple, &ssup[i]);
+
+ /*
+ * Walk through the array and eliminate duplicate values, but
+ * keep the ordering (so that we can do bsearch later). We know
+ * there's at least 1 item, so we can skip the first element.
+ */
+ count = 1; /* number of deduplicated items */
+ for (j = 1; j < counts[i]; j++)
+ {
+ /* if it's different from the previous value, we need to keep it */
+ if (compare_datums_simple(values[i][j-1], values[i][j], &ssup[i]) != 0)
+ {
+ /* XXX: not needed if (count == j) */
+ values[i][count] = values[i][j];
+ count += 1;
+ }
+ }
+
+ /* keep info about the deduplicated count */
+ info[i].nvalues = count;
+
+ /* compute size of the serialized data */
+ if (info[i].typbyval)
+ /*
+ * passed by value, so just Datum array (int4, int8, ...)
+ *
+ * TODO Might save a few bytes here, by storing just typlen
+ * bytes instead of whole Datum (8B) on 64-bits.
+ */
+ info[i].nbytes = info[i].nvalues * sizeof(Datum);
+ else if (info[i].typlen > 0)
+ /* passed by reference, but fixed length (name, tid, ...) */
+ info[i].nbytes = info[i].nvalues * info[i].typlen;
+ else if (info[i].typlen == -1)
+ /* varlena, so just use VARSIZE_ANY */
+ for (j = 0; j < info[i].nvalues; j++)
+ info[i].nbytes += VARSIZE_ANY(values[i][j]);
+ else if (info[i].typlen == -2)
+ /* cstring, so simply strlen */
+ for (j = 0; j < info[i].nvalues; j++)
+ info[i].nbytes += strlen(DatumGetPointer(values[i][j]));
+ else
+ elog(ERROR, "unknown data type typbyval=%d typlen=%d",
+ info[i].typbyval, info[i].typlen);
+ }
+
+ /*
+ * Now we finally know how much space we'll need for the serialized
+ * histogram, as it contains these fields:
+ *
+ * - length (4B) for varlena
+ * - magic (4B)
+ * - type (4B)
+ * - ndimensions (4B)
+ * - nbuckets (4B)
+ * - info (ndim * sizeof(DimensionInfo))
+ * - arrays of values for each dimension
+ * - serialized buckets (nbuckets * bucketsize)
+ *
+ * So the 'header' size is 20B + ndim * sizeof(DimensionInfo) and
+ * then we'll place the data (and buckets).
+ */
+ total_length = (sizeof(int32) + offsetof(MVHistogramData, buckets)
+ + ndims * sizeof(DimensionInfo)
+ + nbuckets * bucketsize);
+
+ /* account for the deduplicated data */
+ for (i = 0; i < ndims; i++)
+ total_length += info[i].nbytes;
+
+ /* enforce arbitrary limit of 1MB */
+ if (total_length > 1024 * 1024)
+ elog(ERROR, "serialized histogram exceeds 1MB (%ld)", total_length);
+
+ /* allocate space for the serialized histogram list, set header */
+ output = (bytea*)palloc0(total_length);
+ SET_VARSIZE(output, total_length);
+
+ /* we'll use 'data' to keep track of the place to write data */
+ data = VARDATA(output);
+
+ memcpy(data, histogram, offsetof(MVHistogramData, buckets));
+ data += offsetof(MVHistogramData, buckets);
+
+ memcpy(data, info, sizeof(DimensionInfo) * ndims);
+ data += sizeof(DimensionInfo) * ndims;
+
+ /* value array for each dimension */
+ for (i = 0; i < ndims; i++)
+ {
+#ifdef USE_ASSERT_CHECKING
+ char *tmp = data;
+#endif
+ for (j = 0; j < info[i].nvalues; j++)
+ {
+ if (info[i].typbyval)
+ {
+ /* passed by value / Datum */
+ memcpy(data, &values[i][j], sizeof(Datum));
+ data += sizeof(Datum);
+ }
+ else if (info[i].typlen > 0)
+ {
+ /* passed by reference, but fixed length (name, tid, ...) */
+ memcpy(data, &values[i][j], info[i].typlen);
+ data += info[i].typlen;
+ }
+ else if (info[i].typlen == -1)
+ {
+ /* varlena */
+ memcpy(data, DatumGetPointer(values[i][j]),
+ VARSIZE_ANY(values[i][j]));
+ data += VARSIZE_ANY(values[i][j]);
+ }
+ else if (info[i].typlen == -2)
+ {
+ /* cstring (don't forget the \0 terminator!) */
+ memcpy(data, DatumGetPointer(values[i][j]),
+ strlen(DatumGetPointer(values[i][j])) + 1);
+ data += strlen(DatumGetPointer(values[i][j])) + 1;
+ }
+ }
+ Assert((data - tmp) == info[i].nbytes);
+ }
+
+ /* and finally, the histogram buckets */
+ for (i = 0; i < nbuckets; i++)
+ {
+ /* don't write beyond the allocated space */
+ Assert(data <= (char*)output + total_length - bucketsize);
+
+ /* reset the values for each item */
+ memset(bucket, 0, bucketsize);
+
+ *BUCKET_NTUPLES(bucket) = histogram->buckets[i]->ntuples;
+
+ for (j = 0; j < ndims; j++)
+ {
+ /* do the lookup only for non-NULL values */
+ if (! histogram->buckets[i]->nullsonly[j])
+ {
+ int idx;
+ Datum * v = NULL;
+ ssup_private = &ssup[j];
+
+ /* min boundary */
+ v = (Datum*)bsearch(&histogram->buckets[i]->min[j],
+ values[j], info[j].nvalues, sizeof(Datum),
+ bsearch_comparator);
+
+ if (v == NULL)
+ elog(ERROR, "value for dim %d not found in array", j);
+
+ /* compute index within the array */
+ idx = (v - values[j]);
+
+ Assert((idx >= 0) && (idx < info[j].nvalues));
+
+ BUCKET_MIN_INDEXES(bucket, ndims)[j] = idx;
+
+ /* max boundary */
+ v = (Datum*)bsearch(&histogram->buckets[i]->max[j],
+ values[j], info[j].nvalues, sizeof(Datum),
+ bsearch_comparator);
+
+ if (v == NULL)
+ elog(ERROR, "value for dim %d not found in array", j);
+
+ /* compute index within the array */
+ idx = (v - values[j]);
+
+ Assert((idx >= 0) && (idx < info[j].nvalues));
+
+ BUCKET_MAX_INDEXES(bucket, ndims)[j] = idx;
+ }
+ }
+
+ /* copy flags (nulls, min/max inclusive) */
+ memcpy(BUCKET_NULLS_ONLY(bucket, ndims),
+ histogram->buckets[i]->nullsonly, sizeof(bool) * ndims);
+
+ memcpy(BUCKET_MIN_INCL(bucket, ndims),
+ histogram->buckets[i]->min_inclusive, sizeof(bool) * ndims);
+
+ memcpy(BUCKET_MAX_INCL(bucket, ndims),
+ histogram->buckets[i]->max_inclusive, sizeof(bool) * ndims);
+
+ /* copy the item into the array */
+ memcpy(data, bucket, bucketsize);
+
+ data += bucketsize;
+ }
+
+ /* at this point we expect to match the total_length exactly */
+ Assert((data - (char*)output) == total_length);
+
+ /* FIXME free the values/counts arrays here */
+
+ return output;
+}
+
+/*
+ * Reverse of serialize_mv_histogram. This essentially expands the serialized
+ * form back to MVHistogram / MVBucket.
+ */
+MVHistogram
+deserialize_mv_histogram(bytea * data)
+{
+ int i = 0, j = 0;
+
+ Size expected_size;
+ char *tmp = NULL;
+ Datum **values = NULL;
+
+ MVHistogram histogram;
+ DimensionInfo *info;
+
+ int nbuckets;
+ int ndims;
+ int bucketsize;
+
+ if (data == NULL)
+ return NULL;
+
+ if (VARSIZE_ANY_EXHDR(data) < offsetof(MVHistogramData,buckets))
+ elog(ERROR, "invalid histogram size %ld (expected at least %ld)",
+ VARSIZE_ANY_EXHDR(data), offsetof(MVHistogramData,buckets));
+
+ /* read the histogram header */
+ histogram = (MVHistogram)palloc0(sizeof(MVHistogramData));
+
+ /* initialize pointer to the data part (skip the varlena header) */
+ tmp = VARDATA(data);
+
+ /* get the header and perform basic sanity checks */
+ memcpy(histogram, tmp, offsetof(MVHistogramData, buckets));
+ tmp += offsetof(MVHistogramData, buckets);
+
+ if (histogram->magic != MVSTAT_HIST_MAGIC)
+ elog(ERROR, "invalid histogram magic %d (expected %dd)",
+ histogram->magic, MVSTAT_HIST_MAGIC);
+
+ if (histogram->type != MVSTAT_HIST_TYPE_BASIC)
+ elog(ERROR, "invalid histogram type %d (expected %dd)",
+ histogram->type, MVSTAT_HIST_TYPE_BASIC);
+
+ nbuckets = histogram->nbuckets;
+ ndims = histogram->ndimensions;
+ bucketsize = BUCKET_SIZE(ndims);
+
+ Assert((nbuckets > 0) && (nbuckets <= MVSTAT_HIST_MAX_BUCKETS));
+ Assert((ndims >= 2) && (ndims <= MVSTATS_MAX_DIMENSIONS));
+
+ /*
+ * What size do we expect with those parameters? It's incomplete at
+ * this point, as we have yet to add the sizes of the value arrays
+ * (from the DimensionInfo records).
+ */
+ expected_size = offsetof(MVHistogramData,buckets) +
+ ndims * sizeof(DimensionInfo) +
+ (nbuckets * bucketsize);
+
+ /* check that we have at least the DimensionInfo records */
+ if (VARSIZE_ANY_EXHDR(data) < expected_size)
+ elog(ERROR, "invalid histogram size %ld (expected %ld)",
+ VARSIZE_ANY_EXHDR(data), expected_size);
+
+ info = (DimensionInfo*)(tmp);
+ tmp += ndims * sizeof(DimensionInfo);
+
+ /* account for the value arrays */
+ for (i = 0; i < ndims; i++)
+ expected_size += info[i].nbytes;
+
+ if (VARSIZE_ANY_EXHDR(data) != expected_size)
+ elog(ERROR, "invalid histogram size %ld (expected %ld)",
+ VARSIZE_ANY_EXHDR(data), expected_size);
+
+ /* looks OK - not corrupted or something */
+
+ /* let's parse the value arrays */
+ values = (Datum**)palloc0(sizeof(Datum*) * ndims);
+
+ /*
+ * FIXME This uses pointers to the original data array (the types
+ * not passed by value), so when someone frees the memory,
+ * e.g. by doing something like this:
+ *
+ * bytea * data = ... fetch the data from catalog ...
+ * MCVList mcvlist = deserialize_mcv_list(data);
+ * pfree(data);
+ *
+ * then 'mcvlist' references the freed memory. This needs to
+ * copy the pieces.
+ *
+ * TODO same as in MCV deserialization / consider moving to common.c
+ */
+ for (i = 0; i < ndims; i++)
+ {
+ if (info[i].typbyval)
+ {
+ /* passed by value / Datum - simply reuse the array */
+ values[i] = (Datum*)tmp;
+ tmp += info[i].nbytes;
+ }
+ else if (info[i].typlen > 0)
+ {
+ /* passed by reference, but fixed length (name, tid, ...) */
+ values[i] = (Datum*)palloc0(sizeof(Datum) * info[i].nvalues);
+ for (j = 0; j < info[i].nvalues; j++)
+ {
+ /* just point into the array */
+ values[i][j] = PointerGetDatum(tmp);
+ tmp += info[i].typlen;
+ }
+ }
+ else if (info[i].typlen == -1)
+ {
+ /* varlena */
+ values[i] = (Datum*)palloc0(sizeof(Datum) * info[i].nvalues);
+ for (j = 0; j < info[i].nvalues; j++)
+ {
+ /* just point into the array */
+ values[i][j] = PointerGetDatum(tmp);
+ tmp += VARSIZE_ANY(tmp);
+ }
+ }
+ else if (info[i].typlen == -2)
+ {
+ /* cstring */
+ values[i] = (Datum*)palloc0(sizeof(Datum) * info[i].nvalues);
+ for (j = 0; j < info[i].nvalues; j++)
+ {
+ /* just point into the array */
+ values[i][j] = PointerGetDatum(tmp);
+ tmp += (strlen(tmp) + 1); /* don't forget the \0 */
+ }
+ }
+ }
+
+ /* allocate space for the buckets */
+ histogram->buckets = (MVBucket*)palloc0(sizeof(MVBucket) * nbuckets);
+
+ for (i = 0; i < nbuckets; i++)
+ {
+ MVBucket bucket = (MVBucket)palloc0(sizeof(MVBucketData));
+
+ bucket->nullsonly = (bool*) palloc0(sizeof(bool) * ndims);
+ bucket->min_inclusive = (bool*) palloc0(sizeof(bool) * ndims);
+ bucket->max_inclusive = (bool*) palloc0(sizeof(bool) * ndims);
+
+ bucket->min = (Datum*) palloc0(sizeof(Datum) * ndims);
+ bucket->max = (Datum*) palloc0(sizeof(Datum) * ndims);
+
+ bucket->ntuples = *BUCKET_NTUPLES(tmp);
+
+ memcpy(bucket->nullsonly, BUCKET_NULLS_ONLY(tmp, ndims),
+ sizeof(bool) * ndims);
+
+ memcpy(bucket->min_inclusive, BUCKET_MIN_INCL(tmp, ndims),
+ sizeof(bool) * ndims);
+
+ memcpy(bucket->max_inclusive, BUCKET_MAX_INCL(tmp, ndims),
+ sizeof(bool) * ndims);
+
+ /* translate the indexes to values */
+ for (j = 0; j < ndims; j++)
+ {
+ if (! bucket->nullsonly[j])
+ {
+ bucket->min[j] = values[j][BUCKET_MIN_INDEXES(tmp, ndims)[j]];
+ bucket->max[j] = values[j][BUCKET_MAX_INDEXES(tmp, ndims)[j]];
+ }
+ }
+
+ histogram->buckets[i] = bucket;
+
+ Assert(tmp <= (char*)data + VARSIZE_ANY(data));
+
+ tmp += bucketsize;
+ }
+
+ /* at this point we expect to have consumed exactly expected_size bytes */
+ Assert((tmp - VARDATA(data)) == expected_size);
+
+ return histogram;
+}
+
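For orientation, the serialized format parsed above can be summarized
by the following sketch. The helper histogram_expected_size() is
hypothetical (not part of the patch); DimensionInfo and BUCKET_SIZE
are the ones used by the deserialization code:

    /*
     * Serialized layout assumed by deserialize_mv_histogram():
     *
     *   [varlena header]
     *   [MVHistogramData header, up to offsetof(MVHistogramData, buckets)]
     *   [ndimensions x DimensionInfo]  (typlen, typbyval, nvalues, nbytes)
     *   [per-dimension deduplicated value arrays, info[i].nbytes each]
     *   [nbuckets x BUCKET_SIZE(ndimensions) bucket records]
     */
    static Size
    histogram_expected_size(MVHistogram histogram, DimensionInfo *info)
    {
        int     i;
        Size    size;

        size = offsetof(MVHistogramData, buckets)
             + histogram->ndimensions * sizeof(DimensionInfo)
             + histogram->nbuckets * BUCKET_SIZE(histogram->ndimensions);

        /* add the (variable-length) deduplicated value arrays */
        for (i = 0; i < histogram->ndimensions; i++)
            size += info[i].nbytes;

        return size;
    }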
+/*
+ * Build the initial bucket, which will be then split into smaller
+ * buckets.
+ *
+ * TODO Add ndistinct estimation, probably the one described in "Towards
+ * Estimation Error Guarantees for Distinct Values, PODS 2000,
+ * p. 268-279" (the ones called GEE, or maybe AE).
+ *
+ * TODO The "combined" ndistinct is more likely to scale with the number
+ * of rows (in the table), because a single column behaving this
+ * way is sufficient for such behavior.
+ */
+static MVBucket
+create_initial_mv_bucket(int numrows, HeapTuple *rows, int2vector *attrs,
+ VacAttrStats **stats)
+{
+ int i;
+ int numattrs = attrs->dim1;
+ HistogramBuild data = NULL;
+
+ /* TODO allocate bucket as a single piece, including all the fields. */
+ MVBucket bucket = (MVBucket)palloc0(sizeof(MVBucketData));
+
+ Assert(numrows > 0);
+ Assert(rows != NULL);
+ Assert((numattrs >= 2) && (numattrs <= MVSTATS_MAX_DIMENSIONS));
+
+ /* allocate the per-dimension arrays */
+
+ /* flags for null-only dimensions */
+ bucket->nullsonly = (bool*)palloc0(numattrs * sizeof(bool));
+
+ /* inclusiveness boundaries - lower/upper bounds */
+ bucket->min_inclusive = (bool*)palloc0(numattrs * sizeof(bool));
+ bucket->max_inclusive = (bool*)palloc0(numattrs * sizeof(bool));
+
+ /* lower/upper boundaries */
+ bucket->min = (Datum*)palloc0(numattrs * sizeof(Datum));
+ bucket->max = (Datum*)palloc0(numattrs * sizeof(Datum));
+
+ /* build-data */
+ data = (HistogramBuild)palloc0(sizeof(HistogramBuildData));
+
+ /* number of distinct values (per dimension) */
+ data->ndistincts = (uint32*)palloc0(numattrs * sizeof(uint32));
+
+ /* all the sample rows fall into the initial bucket */
+ data->numrows = numrows;
+ data->rows = rows;
+
+ /*
+ * The initial bucket was not split at all, so we'll start with the
+ * first dimension in the next round (index = 0).
+ */
+ data->last_split_dimension = -1;
+
+ bucket->build_data = data;
+
+ /*
+ * Update the number of ndistinct combinations in the bucket (which
+ * we use when selecting bucket to partition), and then number of
+ * distinct values for each partition (which we use when choosing
+ * which dimension to split).
+ */
+ update_bucket_ndistinct(bucket, attrs, stats);
+
+ /* Update ndistinct (and also set min/max) for all dimensions. */
+ for (i = 0; i < numattrs; i++)
+ update_dimension_ndistinct(bucket, i, attrs, stats, true);
+
+ return bucket;
+}
+
+/*
+ * TODO Fix to handle arbitrarily-sized histograms (not just 2D ones)
+ * and call the right output procedures (for the particular type).
+ *
+ * TODO This should somehow fetch info about the data types, and use
+ * the appropriate output functions to print the boundary values.
+ * Right now this prints the 8B value as an integer.
+ *
+ * TODO Also, provide a special function for 2D histogram, printing
+ * a gnuplot script (with rectangles).
+ *
+ * TODO For string types (once supported) we can sort the strings first,
+ * assign them a sequence of integers and use the original values
+ * as labels.
+ */
+#ifdef MVSTATS_DEBUG
+static void
+print_mv_histogram_info(MVHistogram histogram)
+{
+ int i = 0;
+
+ elog(WARNING, "histogram nbuckets=%d", histogram->nbuckets);
+
+ for (i = 0; i < histogram->nbuckets; i++)
+ {
+ MVBucket bucket = histogram->buckets[i];
+ elog(WARNING, " bucket %d : ndistinct=%f ntuples=%d min=[%ld, %ld], max=[%ld, %ld] distinct=[%d,%d]",
+ i, bucket->ndistinct, bucket->numrows,
+ bucket->min[0], bucket->min[1], bucket->max[0], bucket->max[1],
+ bucket->ndistincts[0], bucket->ndistincts[1]);
+ }
+}
+#endif
+
+/*
+ * A very simple partitioning selection criteria - choose the bucket
+ * with the highest number of distinct values.
+ *
+ * Returns either pointer to the bucket selected to be partitioned,
+ * or NULL if there are no buckets that may be split (i.e. all buckets
+ * contain a single distinct value).
+ *
+ * TODO Consider other partitioning criteria (v-optimal, maxdiff etc.).
+ *
+ * TODO Allowing the bucket to degenerate to a single combination of
+ * values makes it a rather strange MCV list. Maybe we should use
+ * a higher lower boundary, or maybe make the selection criteria
+ * more complex (e.g. consider number of rows in the bucket, etc.).
+ *
+ * That however is different from buckets 'degenerated' only for
+ * some dimensions (e.g. half of them), which is perfectly
+ * appropriate for statistics on a combination of low and high
+ * cardinality columns.
+ */
+static MVBucket
+select_bucket_to_partition(int nbuckets, MVBucket * buckets)
+{
+ int i;
+ int ndistinct = 1; /* if ndistinct=1, we can't split the bucket */
+ MVBucket bucket = NULL;
+
+ for (i = 0; i < nbuckets; i++)
+ {
+ HistogramBuild data = (HistogramBuild)buckets[i]->build_data;
+ /* if the ndistinct count is higher, use this bucket */
+ if (data->ndistinct > ndistinct) {
+ bucket = buckets[i];
+ ndistinct = data->ndistinct;
+ }
+ }
+
+ /* may be NULL if there are no buckets with (ndistinct>1) */
+ return bucket;
+}
+
+/*
+ * A simple bucket partitioning implementation - splits the dimensions in
+ * a round-robin manner (considering only those with ndistinct>1). That
+ * is, first dimension 0 is split, then 1, 2, ... until reaching the
+ * end of the attribute list, and then wrapping back to 0. Of course,
+ * dimensions with a single distinct value are skipped.
+ *
+ * This is essentially what Muralikrishna/DeWitt described in their SIGMOD
+ * article (M. Muralikrishna, David J. DeWitt: Equi-Depth Histograms For
+ * Estimating Selectivity Factors For Multi-Dimensional Queries. SIGMOD
+ * Conference 1988: 28-36).
+ *
+ * There are multiple histogram options, centered around the partitioning
+ * criteria, specifying both how to choose a bucket and the dimension
+ * most in need of a split. For a nice summary and general overview, see
+ * "rK-Hist : an R-Tree based histogram for multi-dimensional selectivity
+ * estimation" thesis by J. A. Lopez, Concordia University, p.34-37 (and
+ * possibly p. 32-34 for explanation of the terms).
+ *
+ * This splits the bucket by tweaking the existing one, and returning the
+ * new bucket (essentially shrinking the existing one in-place and returning
+ * the other "half" as a new bucket). The caller is responsible for adding
+ * the new bucket into the list of buckets.
+ *
+ * TODO It requires care to prevent splitting only one dimension and not
+ * splitting another one at all (which might happen easily in case of
+ * strongly dependent columns - e.g. y=x).
+ *
+ * TODO Should probably consider statistics target for the columns (e.g. to
+ * split dimensions with higher statistics target more frequently).
+ */
+static MVBucket
+partition_bucket(MVBucket bucket, int2vector *attrs,
+ VacAttrStats **stats)
+{
+ int i;
+ int dimension;
+ int numattrs = attrs->dim1;
+
+ Datum split_value;
+ MVBucket new_bucket;
+ HistogramBuild new_data;
+
+ /* needed for sort, when looking for the split value */
+ bool isNull;
+ int nvalues = 0;
+ HistogramBuild data = (HistogramBuild)bucket->build_data;
+ StdAnalyzeData * mystats = NULL;
+ ScalarItem * values = (ScalarItem*)palloc0(data->numrows * sizeof(ScalarItem));
+ SortSupportData ssup;
+
+ /* looking for the split value */
+ int ndistinct = 1; /* number of distinct values below current value */
+ int nrows = 1; /* number of rows below current value */
+
+ /* needed when splitting the values */
+ HeapTuple * oldrows = data->rows;
+ int oldnrows = data->numrows;
+
+ /*
+ * We can't split buckets with a single distinct value (this also
+ * disqualifies NULL-only dimensions). Also, there have to be multiple
+ * sample rows (otherwise there couldn't be multiple distinct values).
+ */
+ Assert(data->ndistinct > 1);
+ Assert(data->numrows > 1);
+ Assert((numattrs >= 2) && (numattrs <= MVSTATS_MAX_DIMENSIONS));
+
+ /*
+ * Look for the next dimension to split, in a round robin manner.
+ * We'll use the first one with (ndistinct > 1).
+ *
+ * If we happen to wrap around, something clearly went wrong (we
+ * don't update last_split_dimension inside the loop, because then
+ * we couldn't detect the wrap-around).
+ */
+ dimension = data->last_split_dimension;
+ while (true)
+ {
+ dimension = (dimension + 1) % numattrs;
+
+ if (data->ndistincts[dimension] > 1)
+ break;
+
+ /* if we reach the previous split dimension again, we're in an infinite loop */
+ Assert(dimension != data->last_split_dimension);
+ }
+
+ /* Remember the dimension for the next split of this bucket. */
+ data->last_split_dimension = dimension;
+
+ /*
+ * Walk through the selected dimension, collect and sort the values
+ * and then choose the value to use as the new boundary.
+ */
+ mystats = (StdAnalyzeData *) stats[dimension]->extra_data;
+
+ /* initialize sort support, etc. */
+ memset(&ssup, 0, sizeof(ssup));
+ ssup.ssup_cxt = CurrentMemoryContext;
+
+ /* We always use the default collation for statistics */
+ ssup.ssup_collation = DEFAULT_COLLATION_OID;
+ ssup.ssup_nulls_first = false;
+
+ PrepareSortSupportFromOrderingOp(mystats->ltopr, &ssup);
+
+ for (i = 0; i < data->numrows; i++)
+ {
+ /* remember the index of the sample row, to make the partitioning simpler */
+ values[nvalues].value = heap_getattr(data->rows[i], attrs->values[dimension],
+ stats[dimension]->tupDesc, &isNull);
+ values[nvalues].tupno = i;
+
+ /* no NULL values allowed here (we don't do splits by null-only dimensions) */
+ Assert(!isNull);
+
+ nvalues++;
+ }
+
+ /* sort the array of values */
+ qsort_arg((void *) values, nvalues, sizeof(ScalarItem),
+ compare_scalars_partition, (void *) &ssup);
+
+ /*
+ * We know there are bucket->ndistincts[dimension] distinct values
+ * in this dimension, and we want to split them in half, so walk
+ * through the array and stop once we see (ndistinct/2) values.
+ *
+ * We always choose the "next" value, i.e. (n/2+1)-th distinct value,
+ * and use it as an exclusive upper boundary (and inclusive lower
+ * boundary).
+ *
+ * TODO Maybe we should use "average" of the two middle distinct
+ * values (at least for even distinct counts), but that would
+ * require being able to do an average (which does not work
+ * for non-arithmetic types).
+ *
+ * TODO Another option is to look for a split that'd give about
+ * 50% tuples (not distinct values) in each partition. That
+ * might work better when there are a few very frequent
+ * values, and many rare ones.
+ */
+ split_value = values[0].value;
+ for (i = 1; i < data->numrows; i++)
+ {
+ /* count distinct values */
+ if (values[i].value != values[i-1].value)
+ ndistinct += 1;
+
+ /* once we've seen half of the distinct values, use this value as the split */
+ if (ndistinct > data->ndistincts[dimension] / 2)
+ {
+ split_value = values[i].value;
+ break;
+ }
+
+ /* keep track how many rows belong to the first bucket */
+ nrows += 1;
+ }
+
+ Assert(nrows > 0);
+ Assert(nrows < data->numrows);
+
+ /* create the new bucket as an (incomplete) copy of the one being partitioned */
+ new_bucket = copy_mv_bucket(bucket, numattrs);
+ new_data = (HistogramBuild)new_bucket->build_data;
+
+ /*
+ * Do the actual split of the chosen dimension, using the split value as the
+ * upper bound for the existing bucket, and lower bound for the new one.
+ */
+ bucket->max[dimension] = split_value;
+ new_bucket->min[dimension] = split_value;
+
+ bucket->max_inclusive[dimension] = false;
+ new_bucket->max_inclusive[dimension] = true;
+
+ /*
+ * Redistribute the sample tuples using the 'ScalarItem->tupno'
+ * index. We know 'nrows' rows should remain in the original
+ * bucket and the rest goes to the new one.
+ */
+
+ data->rows = (HeapTuple*)palloc0(nrows * sizeof(HeapTuple));
+ new_data->rows = (HeapTuple*)palloc0((oldnrows - nrows) * sizeof(HeapTuple));
+
+ data->numrows = nrows;
+ new_data->numrows = (oldnrows - nrows);
+
+ /*
+ * The first nrows should go to the first bucket, the rest should
+ * go to the new one. Use the tupno field to get the actual HeapTuple
+ * row from the original array of sample rows.
+ */
+ for (i = 0; i < nrows; i++)
+ memcpy(&data->rows[i], &oldrows[values[i].tupno], sizeof(HeapTuple));
+
+ for (i = nrows; i < oldnrows; i++)
+ memcpy(&new_data->rows[i-nrows], &oldrows[values[i].tupno], sizeof(HeapTuple));
+
+ /* update ndistinct values for the buckets (total and per dimension) */
+ update_bucket_ndistinct(bucket, attrs, stats);
+ update_bucket_ndistinct(new_bucket, attrs, stats);
+
+ /*
+ * TODO We don't need to do this for the dimension we used for split,
+ * because we know how many distinct values went to each partition.
+ */
+ for (i = 0; i < numattrs; i++)
+ {
+ update_dimension_ndistinct(bucket, i, attrs, stats, false);
+ update_dimension_ndistinct(new_bucket, i, attrs, stats, false);
+ }
+
+ pfree(oldrows);
+ pfree(values);
+
+ return new_bucket;
+}
+
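To make the split-value choice above concrete, here is a stripped-down
sketch (a hypothetical helper, not in the patch) operating on an
already sorted array of pass-by-value datums:

    /*
     * Return the index of the first row that belongs to the new bucket,
     * i.e. the first occurrence of the (ndistinct/2 + 1)-th distinct
     * value. Rows before this index stay in the original bucket.
     */
    static int
    find_split_index(Datum *values, int nvalues, int ndistinct)
    {
        int     i;
        int     seen = 1;       /* values[0] is the first distinct value */

        for (i = 1; i < nvalues; i++)
        {
            if (values[i] != values[i - 1])
                seen++;

            if (seen > ndistinct / 2)
                return i;       /* values[i] is the split value */
        }

        return nvalues;         /* can't happen for ndistinct > 1 */
    }

So for values = {1, 1, 2, 2, 3, 3} (ndistinct = 3) this returns index 2,
using '2' as the split value - two rows stay in the original bucket and
four rows go to the new one.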
+/*
+ * Copy a histogram bucket. The copy does not include the build-time
+ * data, i.e. sampled rows etc.
+ */
+static MVBucket
+copy_mv_bucket(MVBucket bucket, uint32 ndimensions)
+{
+ /* TODO allocate as a single piece (including all the fields) */
+ MVBucket new_bucket = (MVBucket)palloc0(sizeof(MVBucketData));
+ HistogramBuild data = (HistogramBuild)palloc0(sizeof(HistogramBuildData));
+
+ /* Copy only the attributes that will stay the same after the split;
+ * the rest will be recomputed once the split is done. */
+
+ /* allocate the per-dimension arrays */
+ new_bucket->nullsonly = (bool*)palloc0(ndimensions * sizeof(bool));
+
+ /* inclusiveness boundaries - lower/upper bounds */
+ new_bucket->min_inclusive = (bool*)palloc0(ndimensions * sizeof(bool));
+ new_bucket->max_inclusive = (bool*)palloc0(ndimensions * sizeof(bool));
+
+ /* lower/upper boundaries */
+ new_bucket->min = (Datum*)palloc0(ndimensions * sizeof(Datum));
+ new_bucket->max = (Datum*)palloc0(ndimensions * sizeof(Datum));
+
+ /* copy data */
+ memcpy(new_bucket->nullsonly, bucket->nullsonly, ndimensions * sizeof(bool));
+
+ memcpy(new_bucket->min_inclusive, bucket->min_inclusive, ndimensions*sizeof(bool));
+ memcpy(new_bucket->min, bucket->min, ndimensions*sizeof(Datum));
+
+ memcpy(new_bucket->max_inclusive, bucket->max_inclusive, ndimensions*sizeof(bool));
+ memcpy(new_bucket->max, bucket->max, ndimensions*sizeof(Datum));
+
+ /* allocate and copy the interesting part of the build data */
+ data->last_split_dimension = ((HistogramBuild)bucket->build_data)->last_split_dimension;
+ data->ndistincts = (uint32*)palloc0(ndimensions * sizeof(uint32));
+
+ new_bucket->build_data = data;
+
+ return new_bucket;
+}
+
+/*
+ * Counts the number of distinct values in the bucket. This just copies
+ * the Datum values into a simple array, and sorts them using memcmp-based
+ * comparator. That means it only works for pass-by-value data types
+ * (assuming they don't use collations etc.)
+ *
+ * TODO This might evaluate and store the distinct counts for all
+ * possible attribute combinations. The assumption is this might be
+ * useful for estimating things like GROUP BY cardinalities (e.g.
+ * in cases when some buckets contain a lot of low-frequency
+ * combinations, and other buckets contain few high-frequency ones).
+ *
+ * But it's unclear whether it's worth the price. Computing this
+ * is actually quite cheap, because it may be evaluated at the very
+ * end, when the buckets are rather small (so sorting it in 2^N ways
+ * is not a big deal). Assuming the partitioning algorithm does not
+ * use these values to make its decisions, of course (the current
+ * algorithm does not).
+ *
+ * The overhead with storing, fetching and parsing the data is more
+ * concerning - adding 2^N values per bucket (even if it's just
+ * a 1B or 2B value) would significantly bloat the histogram, and
+ * thus the impact on the optimizer, which is not really desirable.
+ *
+ * TODO This only updates the ndistinct for the sample (or bucket), but
+ * we eventually need an estimate of the total number of distinct
+ * values in the dataset. We can either use the current 1D approach
+ * (i.e., if it's more than 10% of the sample, assume it's
+ * proportional to the number of rows), or implement the estimator
+ * suggested in the article, supposedly giving 'optimal' estimates
+ * (w.r.t. probability of error).
+ */
+static void
+update_bucket_ndistinct(MVBucket bucket, int2vector *attrs, VacAttrStats ** stats)
+{
+ int i, j;
+ int numattrs = attrs->dim1;
+
+ HistogramBuild data = (HistogramBuild)bucket->build_data;
+ int numrows = data->numrows;
+
+ MultiSortSupport mss = multi_sort_init(numattrs);
+
+ /*
+ * We could collect this while walking through all the attributes
+ * above (as it is, we have to call heap_getattr twice).
+ */
+ SortItem *items = (SortItem*)palloc0(numrows * sizeof(SortItem));
+ Datum *values = (Datum*)palloc0(numrows * sizeof(Datum) * numattrs);
+ bool *isnull = (bool*)palloc0(numrows * sizeof(bool) * numattrs);
+
+ for (i = 0; i < numrows; i++)
+ {
+ items[i].values = &values[i * numattrs];
+ items[i].isnull = &isnull[i * numattrs];
+ }
+
+ /* prepare the sort functions for all the dimensions */
+ for (i = 0; i < numattrs; i++)
+ multi_sort_add_dimension(mss, i, i, stats);
+
+ /* collect the values */
+ for (i = 0; i < numrows; i++)
+ for (j = 0; j < numattrs; j++)
+ items[i].values[j]
+ = heap_getattr(data->rows[i], attrs->values[j],
+ stats[j]->tupDesc, &items[i].isnull[j]);
+
+ qsort_arg((void *) items, numrows, sizeof(SortItem),
+ multi_sort_compare, mss);
+
+ data->ndistinct = 1;
+
+ for (i = 1; i < numrows; i++)
+ if (multi_sort_compare(&items[i], &items[i-1], mss) != 0)
+ data->ndistinct += 1;
+
+ pfree(items);
+ pfree(values);
+ pfree(isnull);
+}
+
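The same sort-and-scan idea in miniature - a hypothetical single-column
variant (assuming pass-by-value datums, n > 0, and the
compare_scalars_simple comparator used elsewhere in this file):

    static int
    count_distinct(Datum *values, int n, SortSupport ssup)
    {
        int     i;
        int     ndistinct = 1;

        /* sort first, then count the positions where the value changes */
        qsort_arg((void *) values, n, sizeof(Datum),
                  compare_scalars_simple, (void *) ssup);

        for (i = 1; i < n; i++)
            if (values[i] != values[i - 1])
                ndistinct++;

        return ndistinct;
    }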
+/*
+ * Count distinct values per bucket dimension.
+ */
+static void
+update_dimension_ndistinct(MVBucket bucket, int dimension, int2vector *attrs,
+ VacAttrStats ** stats, bool update_boundaries)
+{
+ int j;
+ int nvalues = 0;
+ bool isNull;
+ HistogramBuild data = (HistogramBuild)bucket->build_data;
+ Datum * values = (Datum*)palloc0(data->numrows * sizeof(Datum));
+ SortSupportData ssup;
+
+ StdAnalyzeData * mystats = (StdAnalyzeData *) stats[dimension]->extra_data;
+
+ /* we may already know this is a NULL-only dimension */
+ if (bucket->nullsonly[dimension])
+ data->ndistincts[dimension] = 1;
+
+ memset(&ssup, 0, sizeof(ssup));
+ ssup.ssup_cxt = CurrentMemoryContext;
+
+ /* We always use the default collation for statistics */
+ ssup.ssup_collation = DEFAULT_COLLATION_OID;
+ ssup.ssup_nulls_first = false;
+
+ PrepareSortSupportFromOrderingOp(mystats->ltopr, &ssup);
+
+ for (j = 0; j < data->numrows; j++)
+ {
+ values[nvalues] = heap_getattr(data->rows[j], attrs->values[dimension],
+ stats[dimension]->tupDesc, &isNull);
+
+ /* ignore NULL values */
+ if (! isNull)
+ nvalues++;
+ }
+
+ /* there's always at least 1 distinct value (may be NULL) */
+ data->ndistincts[dimension] = 1;
+
+ /* if there are only NULL values in this dimension, mark it as
+ * NULL-only and bail out */
+ if (nvalues == 0)
+ {
+ pfree(values);
+ bucket->nullsonly[dimension] = true;
+ return;
+ }
+
+ /* sort the array (pass-by-value datums) */
+ qsort_arg((void *) values, nvalues, sizeof(Datum),
+ compare_scalars_simple, (void *) &ssup);
+
+ /*
+ * Update min/max boundaries to the smallest bounding box. Generally, this
+ * needs to be done only when constructing the initial bucket.
+ */
+ if (update_boundaries)
+ {
+ /* store the min/max values */
+ bucket->min[dimension] = values[0];
+ bucket->min_inclusive[dimension] = true;
+
+ bucket->max[dimension] = values[nvalues-1];
+ bucket->max_inclusive[dimension] = true;
+ }
+
+ /*
+ * Walk through the array and count distinct values by comparing
+ * succeeding values.
+ *
+ * FIXME This only works for pass-by-value types (i.e. not VARCHARs
+ * etc.). Although thanks to the deduplication it might work
+ * even for those types (equal values will get the same item
+ * in the deduplicated array).
+ */
+ for (j = 1; j < nvalues; j++) {
+ if (values[j] != values[j-1])
+ data->ndistincts[dimension] += 1;
+ }
+
+ pfree(values);
+}
+
+/*
+ * A properly built histogram must not contain buckets mixing NULL and
+ * non-NULL values in a single dimension. Each dimension may either be
+ * marked as 'nulls only', and thus containing only NULL values, or
+ * it must not contain any NULL values.
+ *
+ * Therefore, if the sample contains NULL values in any of the columns,
+ * it's necessary to build those NULL-buckets. This is done in an
+ * iterative way using this algorithm, operating on a single bucket:
+ *
+ * (1) Check that all dimensions are well-formed (not mixing NULL
+ * and non-NULL values).
+ *
+ * (2) If all dimensions are well-formed, terminate.
+ *
+ * (3) If a dimension contains only NULL values, but is not
+ * marked as NULL-only, mark it as NULL-only and run the
+ * algorithm again (on this bucket).
+ *
+ * (4) If a dimension mixes NULL and non-NULL values, split the
+ * bucket into two parts - one with NULL values, one with
+ * non-NULL values (replacing the current one). Then run
+ * the algorithm on both buckets.
+ *
+ * This is executed in a recursive manner, but the number of executions
+ * should be quite low - limited by the number of NULL-buckets. Also,
+ * in each branch the number of nested calls is limited by the number
+ * of dimensions (attributes) of the histogram.
+ *
+ * At the end, there should be buckets with no mixed dimensions. The
+ * number of buckets produced by this algorithm is rather limited - with
+ * N dimensions, there may be only 2^N such buckets (each dimension may
+ * be either NULL or non-NULL). So with 8 dimensions (current value of
+ * MVSTATS_MAX_DIMENSIONS) there may be only 256 such buckets.
+ *
+ * After this, a 'regular' bucket-split algorithm shall run, further
+ * optimizing the histogram.
+ */
+static void
+create_null_buckets(MVHistogram histogram, int bucket_idx,
+ int2vector *attrs, VacAttrStats ** stats)
+{
+ int i, j;
+ int null_dim = -1;
+ int null_count = 0;
+ bool null_found = false;
+ MVBucket bucket, null_bucket;
+ int null_idx, curr_idx;
+ HistogramBuild data, null_data;
+
+ /* remember original values from the bucket */
+ int numrows;
+ HeapTuple *oldrows = NULL;
+
+ Assert(bucket_idx < histogram->nbuckets);
+ Assert(histogram->ndimensions == attrs->dim1);
+
+ bucket = histogram->buckets[bucket_idx];
+ data = (HistogramBuild)bucket->build_data;
+
+ numrows = data->numrows;
+ oldrows = data->rows;
+
+ /*
+ * Walk through all rows / dimensions, and stop once we find NULL
+ * in a dimension not yet marked as NULL-only.
+ */
+ for (i = 0; i < data->numrows; i++)
+ {
+ for (j = 0; j < histogram->ndimensions; j++)
+ {
+ /* Is this a NULL-only dimension? If yes, skip. */
+ if (bucket->nullsonly[j])
+ continue;
+
+ /* found a NULL in that dimension? */
+ if (heap_attisnull(data->rows[i], attrs->values[j]))
+ {
+ null_found = true;
+ null_dim = j;
+ break;
+ }
+ }
+
+ /* terminate if we found attribute with NULL values */
+ if (null_found)
+ break;
+ }
+
+ /* no regular dimension contains NULL values => we're done */
+ if (! null_found)
+ return;
+
+ /* walk through the rows again, count NULL values in 'null_dim' */
+ for (i = 0; i < data->numrows; i++)
+ {
+ if (heap_attisnull(data->rows[i], attrs->values[null_dim]))
+ null_count += 1;
+ }
+
+ Assert(null_count <= data->numrows);
+
+ /*
+ * If (null_count == numrows) the dimension already is NULL-only,
+ * but is not yet marked as such. It's enough to mark it and
+ * repeat the process recursively (until we run out of dimensions).
+ */
+ if (null_count == data->numrows)
+ {
+ bucket->nullsonly[null_dim] = true;
+ create_null_buckets(histogram, bucket_idx, attrs, stats);
+ return;
+ }
+
+ /*
+ * We have to split the bucket into two - one with NULL values in
+ * the dimension, one with non-NULL values. We don't need to sort
+ * the data or anything, but otherwise it's similar to what's done
+ * in partition_bucket().
+ */
+
+ /* create bucket with NULL-only dimension 'dim' */
+ null_bucket = copy_mv_bucket(bucket, histogram->ndimensions);
+ null_data = (HistogramBuild)null_bucket->build_data;
+
+ /* remember the current array info */
+ oldrows = data->rows;
+ numrows = data->numrows;
+
+ /* we'll keep non-NULL values in the current bucket */
+ data->numrows = (numrows - null_count);
+ data->rows
+ = (HeapTuple*)palloc0(data->numrows * sizeof(HeapTuple));
+
+ /* and the NULL values will go to the new one */
+ null_data->numrows = null_count;
+ null_data->rows
+ = (HeapTuple*)palloc0(null_data->numrows * sizeof(HeapTuple));
+
+ /* mark the dimension as NULL-only (in the new bucket) */
+ null_bucket->nullsonly[null_dim] = true;
+
+ /* walk through the sample rows and distribute them accordingly */
+ null_idx = 0;
+ curr_idx = 0;
+ for (i = 0; i < numrows; i++)
+ {
+ if (heap_attisnull(oldrows[i], attrs->values[null_dim]))
+ /* NULL => copy to the new bucket */
+ memcpy(&null_data->rows[null_idx++], &oldrows[i],
+ sizeof(HeapTuple));
+ else
+ memcpy(&data->rows[curr_idx++], &oldrows[i],
+ sizeof(HeapTuple));
+ }
+
+ /* update ndistinct values for the buckets (total and per dimension) */
+ update_bucket_ndistinct(bucket, attrs, stats);
+ update_bucket_ndistinct(null_bucket, attrs, stats);
+
+ /*
+ * TODO We don't need to do this for the dimension we used for split,
+ * because we know how many distinct values went to each
+ * bucket (NULL is not a value, so 0, and the other bucket got
+ * all the ndistinct values).
+ */
+ for (i = 0; i < histogram->ndimensions; i++)
+ {
+ update_dimension_ndistinct(bucket, i, attrs, stats, false);
+ update_dimension_ndistinct(null_bucket, i, attrs, stats, false);
+ }
+
+ pfree(oldrows);
+
+ /* add the NULL bucket to the histogram */
+ histogram->buckets[histogram->nbuckets++] = null_bucket;
+
+ /*
+ * And now run the function recursively on both buckets (the new
+ * one first, because the call may change the number of buckets, and
+ * it's used as an index).
+ */
+ create_null_buckets(histogram, (histogram->nbuckets-1), attrs, stats);
+ create_null_buckets(histogram, bucket_idx, attrs, stats);
+
+}
+
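To illustrate the 2^N bound mentioned above: each bucket produced by
create_null_buckets() is fully described by its NULL/non-NULL pattern,
one bit per dimension, so there can be at most 2^ndimensions such
buckets per initial bucket. A hypothetical helper (not in the patch)
encoding that pattern:

    static uint32
    null_pattern(MVBucket bucket, int ndims)
    {
        int     i;
        uint32  pattern = 0;

        /* one bit per dimension, set if the dimension is NULL-only */
        for (i = 0; i < ndims; i++)
            if (bucket->nullsonly[i])
                pattern |= ((uint32) 1 << i);

        return pattern;     /* at most 2^ndims distinct patterns */
    }

For example, with 2 dimensions the possible patterns are 00, 01, 10
and 11, i.e. at most 4 NULL-buckets.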
+/*
+ * We need to pass the SortSupport to the comparator, but bsearch()
+ * has no 'context' parameter, so we use a global variable (ugly).
+ */
+static int
+bsearch_comparator(const void * a, const void * b)
+{
+ Assert(ssup_private != NULL);
+ return compare_scalars_simple(a, b, (void*)ssup_private);
+}
diff --git a/src/backend/utils/mvstats/mcv.c b/src/backend/utils/mvstats/mcv.c
index 4466cee..4f60bd1 100644
--- a/src/backend/utils/mvstats/mcv.c
+++ b/src/backend/utils/mvstats/mcv.c
@@ -961,6 +961,7 @@ MCVList deserialize_mv_mcvlist(bytea * data)
for (i = 0; i < nitems; i++)
{
+ /* FIXME allocate as a single chunk (minimize palloc overhead) */
MCVItem item = (MCVItem)palloc0(sizeof(MCVItemData));
item->values = (Datum*)palloc0(sizeof(Datum)*ndims);
diff --git a/src/include/catalog/pg_mv_statistic.h b/src/include/catalog/pg_mv_statistic.h
index c6e7d74..84579da 100644
--- a/src/include/catalog/pg_mv_statistic.h
+++ b/src/include/catalog/pg_mv_statistic.h
@@ -36,13 +36,16 @@ CATALOG(pg_mv_statistic,3381)
/* statistics requested to build */
bool deps_enabled; /* analyze dependencies? */
bool mcv_enabled; /* build MCV list? */
+ bool hist_enabled; /* build histogram? */
- /* MCV size */
+ /* histogram / MCV size */
int32 mcv_max_items; /* max MCV items */
+ int32 hist_max_buckets; /* max histogram buckets */
/* statistics that are available (if requested) */
bool deps_built; /* dependencies were built */
bool mcv_built; /* MCV list was built */
+ bool hist_built; /* histogram was built */
/* variable-length fields start here, but we allow direct access to stakeys */
int2vector stakeys; /* array of column keys */
@@ -50,6 +53,7 @@ CATALOG(pg_mv_statistic,3381)
#ifdef CATALOG_VARLEN
bytea stadeps; /* dependencies (serialized) */
bytea stamcv; /* MCV list (serialized) */
+ bytea stahist; /* MV histogram (serialized) */
#endif
} FormData_pg_mv_statistic;
@@ -65,15 +69,19 @@ typedef FormData_pg_mv_statistic *Form_pg_mv_statistic;
* compiler constants for pg_attrdef
* ----------------
*/
-#define Natts_pg_mv_statistic 9
+#define Natts_pg_mv_statistic 13
#define Anum_pg_mv_statistic_starelid 1
#define Anum_pg_mv_statistic_deps_enabled 2
#define Anum_pg_mv_statistic_mcv_enabled 3
-#define Anum_pg_mv_statistic_mcv_max_items 4
-#define Anum_pg_mv_statistic_deps_built 5
-#define Anum_pg_mv_statistic_mcv_built 6
-#define Anum_pg_mv_statistic_stakeys 7
-#define Anum_pg_mv_statistic_stadeps 8
-#define Anum_pg_mv_statistic_stamcv 9
+#define Anum_pg_mv_statistic_hist_enabled 4
+#define Anum_pg_mv_statistic_mcv_max_items 5
+#define Anum_pg_mv_statistic_hist_max_buckets 6
+#define Anum_pg_mv_statistic_deps_built 7
+#define Anum_pg_mv_statistic_mcv_built 8
+#define Anum_pg_mv_statistic_hist_built 9
+#define Anum_pg_mv_statistic_stakeys 10
+#define Anum_pg_mv_statistic_stadeps 11
+#define Anum_pg_mv_statistic_stamcv 12
+#define Anum_pg_mv_statistic_stahist 13
#endif /* PG_MV_STATISTIC_H */
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
index b2aa815..a1b5e2b 100644
--- a/src/include/catalog/pg_proc.h
+++ b/src/include/catalog/pg_proc.h
@@ -2718,6 +2718,8 @@ DATA(insert OID = 3378 ( pg_mv_stats_dependencies_show PGNSP PGUID 12 1 0 0
DESCR("multivariate stats: functional dependencies show");
DATA(insert OID = 3376 ( pg_mv_stats_mcvlist_info PGNSP PGUID 12 1 0 0 0 f f f f t f i 1 0 25 "17" _null_ _null_ _null_ _null_ pg_mv_stats_mcvlist_info _null_ _null_ _null_ ));
DESCR("multi-variate statistics: MCV list info");
+DATA(insert OID = 3375 ( pg_mv_stats_histogram_info PGNSP PGUID 12 1 0 0 0 f f f f t f i 1 0 25 "17" _null_ _null_ _null_ _null_ pg_mv_stats_histogram_info _null_ _null_ _null_ ));
+DESCR("multi-variate statistics: histogram info");
DATA(insert OID = 1928 ( pg_stat_get_numscans PGNSP PGUID 12 1 0 0 0 f f f f t f s 1 0 20 "26" _null_ _null_ _null_ _null_ pg_stat_get_numscans _null_ _null_ _null_ ));
DESCR("statistics: number of scans done for table/index");
diff --git a/src/include/utils/mvstats.h b/src/include/utils/mvstats.h
index 6ff29d6..673e546 100644
--- a/src/include/utils/mvstats.h
+++ b/src/include/utils/mvstats.h
@@ -26,10 +26,12 @@ typedef struct MVStatsData {
/* statistics requested in ALTER TABLE ... ADD STATISTICS */
bool deps_enabled; /* analyze functional dependencies */
bool mcv_enabled; /* analyze MCV lists */
+ bool hist_enabled; /* analyze histogram */
/* available statistics (computed by ANALYZE) */
bool deps_built; /* functional dependencies available */
bool mcv_built; /* MCV list is already available */
+ bool hist_built; /* histogram is already available */
} MVStatsData;
typedef struct MVStatsData *MVStats;
@@ -109,6 +111,68 @@ typedef MCVListData *MCVList;
#define MVSTAT_MCVLIST_MAX_ITEMS 8192 /* max items in MCV list */
/*
+ * Multivariate histograms
+ */
+typedef struct MVBucketData {
+
+ /* Frequency of this bucket. */
+ float ntuples; /* frequency of tuples in this bucket */
+
+ /*
+ * Information about dimensions being NULL-only. Not yet used.
+ */
+ bool *nullsonly;
+
+ /* lower boundaries - values and information about the inequalities */
+ Datum *min;
+ bool *min_inclusive;
+
+ /* upper boundaries - values and information about the inequalities */
+ Datum *max;
+ bool *max_inclusive;
+
+ /* used when building the histogram (not serialized/deserialized) */
+ void *build_data;
+
+} MVBucketData;
+
+typedef MVBucketData *MVBucket;
+
+
+typedef struct MVHistogramData {
+
+ uint32 magic; /* magic constant marker */
+ uint32 type; /* type of histogram (BASIC) */
+ uint32 nbuckets; /* number of buckets (buckets array) */
+ uint32 ndimensions; /* number of dimensions */
+
+ MVBucket *buckets; /* array of buckets */
+
+} MVHistogramData;
+
+typedef MVHistogramData *MVHistogram;
+
+/* used to flag stats serialized to bytea */
+#define MVSTAT_HIST_MAGIC 0x7F8C5670 /* marks serialized bytea */
+#define MVSTAT_HIST_TYPE_BASIC 1 /* basic histogram type */
+
+/*
+ * Limits used for max_buckets option, i.e. we're always guaranteed
+ * to have space for at least MVSTAT_HIST_MIN_BUCKETS, and we cannot
+ * have more than MVSTAT_HIST_MAX_BUCKETS buckets.
+ *
+ * This is just a boundary for the 'max' threshold - the actual
+ * histogram may use fewer buckets than MVSTAT_HIST_MAX_BUCKETS.
+ *
+ * TODO The MVSTAT_HIST_MIN_BUCKETS should be related to the number of
+ * attributes (MVSTATS_MAX_DIMENSIONS) because of NULL-buckets.
+ * There should be at least 2^N buckets, otherwise we may be unable
+ * to build the NULL buckets.
+ */
+#define MVSTAT_HIST_MIN_BUCKETS 128 /* min number of buckets */
+#define MVSTAT_HIST_MAX_BUCKETS 16384 /* max number of buckets */
+
+/*
* TODO Maybe fetching the histogram/MCV list separately is inefficient?
* Consider adding a single `fetch_stats` method, fetching all
* stats specified using flags (or something like that).
@@ -118,14 +182,18 @@ bytea * fetch_mv_rules(Oid mvoid);
bytea * fetch_mv_dependencies(Oid mvoid);
bytea * fetch_mv_mcvlist(Oid mvoid);
+bytea * fetch_mv_histogram(Oid mvoid);
bytea * serialize_mv_dependencies(MVDependencies dependencies);
bytea * serialize_mv_mcvlist(MCVList mcvlist, int2vector *attrs,
VacAttrStats **stats);
+bytea * serialize_mv_histogram(MVHistogram histogram, int2vector *attrs,
+ VacAttrStats **stats);
/* deserialization of stats (serialization is private to analyze) */
MVDependencies deserialize_mv_dependencies(bytea * data);
MCVList deserialize_mv_mcvlist(bytea * data);
+MVHistogram deserialize_mv_histogram(bytea * data);
/*
* Returns index of the attribute number within the vector (i.e. a
@@ -137,6 +205,7 @@ int mv_get_index(AttrNumber varattno, int2vector * stakeys);
extern Datum pg_mv_stats_dependencies_info(PG_FUNCTION_ARGS);
extern Datum pg_mv_stats_dependencies_show(PG_FUNCTION_ARGS);
extern Datum pg_mv_stats_mcvlist_info(PG_FUNCTION_ARGS);
+extern Datum pg_mv_stats_histogram_info(PG_FUNCTION_ARGS);
MVDependencies
build_mv_dependencies(int numrows, HeapTuple *rows, int2vector *attrs,
@@ -146,10 +215,15 @@ MCVList
build_mv_mcvlist(int numrows, HeapTuple *rows, int2vector *attrs,
VacAttrStats **stats, int *numrows_filtered);
+MVHistogram
+build_mv_histogram(int numrows, HeapTuple *rows, int2vector *attrs,
+ VacAttrStats **stats, int numrows_total);
+
void build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
int natts, VacAttrStats **vacattrstats);
-void update_mv_stats(Oid relid, MVDependencies dependencies, MCVList mcvlist,
+void update_mv_stats(Oid relid, MVDependencies dependencies,
+ MCVList mcvlist, MVHistogram histogram,
int2vector *attrs, VacAttrStats **stats);
#endif
diff --git a/src/test/regress/expected/mv_histogram.out b/src/test/regress/expected/mv_histogram.out
new file mode 100644
index 0000000..ff2f0cc
--- /dev/null
+++ b/src/test/regress/expected/mv_histogram.out
@@ -0,0 +1,210 @@
+-- data type passed by value
+CREATE TABLE mv_histogram (
+ a INT,
+ b INT,
+ c INT
+);
+-- unknown column
+ALTER TABLE mv_histogram ADD STATISTICS (histogram) ON (unknown_column);
+ERROR: column "unknown_column" referenced in statistics does not exist
+-- single column
+ALTER TABLE mv_histogram ADD STATISTICS (histogram) ON (a);
+ERROR: multivariate stats require 2 or more columns
+-- single column, duplicated
+ALTER TABLE mv_histogram ADD STATISTICS (histogram) ON (a, a);
+ERROR: duplicate column name in statistics definition
+-- two columns, one duplicated
+ALTER TABLE mv_histogram ADD STATISTICS (histogram) ON (a, a, b);
+ERROR: duplicate column name in statistics definition
+-- unknown option
+ALTER TABLE mv_histogram ADD STATISTICS (unknown_option) ON (a, b, c);
+ERROR: unrecognized STATISTICS option "unknown_option"
+-- missing histogram statistics
+ALTER TABLE mv_histogram ADD STATISTICS (dependencies, max_buckets 200) ON (a, b, c);
+ERROR: option 'histogram' is required by other option(s)
+-- invalid max_buckets value / too low
+ALTER TABLE mv_histogram ADD STATISTICS (mcv, max_buckets 10) ON (a, b, c);
+ERROR: minimum number of buckets is 128
+-- invalid max_buckets value / too high
+ALTER TABLE mv_histogram ADD STATISTICS (mcv, max_buckets 100000) ON (a, b, c);
+ERROR: maximum number of buckets is 16384
+-- correct command
+ALTER TABLE mv_histogram ADD STATISTICS (histogram) ON (a, b, c);
+-- random data (no functional dependencies)
+INSERT INTO mv_histogram
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+ANALYZE mv_histogram;
+SELECT hist_enabled, hist_built, pg_mv_stats_histogram_info(stahist)
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+ hist_enabled | hist_built | pg_mv_stats_histogram_info
+--------------+------------+----------------------------
+ t | t | nbuckets=10000
+(1 row)
+
+TRUNCATE mv_histogram;
+-- a => b, a => c, b => c
+INSERT INTO mv_histogram
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE mv_histogram;
+SELECT hist_enabled, hist_built, pg_mv_stats_histogram_info(stahist)
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+ hist_enabled | hist_built | pg_mv_stats_histogram_info
+--------------+------------+----------------------------
+ t | t | nbuckets=1001
+(1 row)
+
+TRUNCATE mv_histogram;
+-- a => b, a => c
+INSERT INTO mv_histogram
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE mv_histogram;
+SELECT hist_enabled, hist_built, pg_mv_stats_histogram_info(stahist)
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+ hist_enabled | hist_built | pg_mv_stats_histogram_info
+--------------+------------+----------------------------
+ t | t | nbuckets=1001
+(1 row)
+
+TRUNCATE mv_histogram;
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO mv_histogram
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX hist_idx ON mv_histogram (a, b);
+ANALYZE mv_histogram;
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+ hist_enabled | hist_built
+--------------+------------
+ t | t
+(1 row)
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mv_histogram WHERE a = 10 AND b = 5;
+ QUERY PLAN
+--------------------------------------------
+ Bitmap Heap Scan on mv_histogram
+ Recheck Cond: ((a = 10) AND (b = 5))
+ -> Bitmap Index Scan on hist_idx
+ Index Cond: ((a = 10) AND (b = 5))
+(4 rows)
+
+DELETE FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+DROP TABLE mv_histogram;
+-- varlena type (text)
+CREATE TABLE mv_histogram (
+ a TEXT,
+ b TEXT,
+ c TEXT
+);
+ALTER TABLE mv_histogram ADD STATISTICS (histogram) ON (a, b, c);
+-- random data (no functional dependencies)
+INSERT INTO mv_histogram
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+ANALYZE mv_histogram;
+SELECT hist_enabled, hist_built, pg_mv_stats_histogram_info(stahist)
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+ hist_enabled | hist_built | pg_mv_stats_histogram_info
+--------------+------------+----------------------------
+ t | t | nbuckets=10000
+(1 row)
+
+TRUNCATE mv_histogram;
+-- a => b, a => c, b => c
+INSERT INTO mv_histogram
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE mv_histogram;
+SELECT hist_enabled, hist_built, pg_mv_stats_histogram_info(stahist)
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+ hist_enabled | hist_built | pg_mv_stats_histogram_info
+--------------+------------+----------------------------
+ t | t | nbuckets=3492
+(1 row)
+
+TRUNCATE mv_histogram;
+-- a => b, a => c
+INSERT INTO mv_histogram
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE mv_histogram;
+SELECT hist_enabled, hist_built, pg_mv_stats_histogram_info(stahist)
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+ hist_enabled | hist_built | pg_mv_stats_histogram_info
+--------------+------------+----------------------------
+ t | t | nbuckets=3433
+(1 row)
+
+TRUNCATE mv_histogram;
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO mv_histogram
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX hist_idx ON mv_histogram (a, b);
+ANALYZE mv_histogram;
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+ hist_enabled | hist_built
+--------------+------------
+ t | t
+(1 row)
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mv_histogram WHERE a = '10' AND b = '5';
+ QUERY PLAN
+------------------------------------------------------------
+ Bitmap Heap Scan on mv_histogram
+ Recheck Cond: ((a = '10'::text) AND (b = '5'::text))
+ -> Bitmap Index Scan on hist_idx
+ Index Cond: ((a = '10'::text) AND (b = '5'::text))
+(4 rows)
+
+TRUNCATE mv_histogram;
+-- check explain (expect bitmap index scan, not plain index scan) with NULLs
+INSERT INTO mv_histogram
+ SELECT
+ (CASE WHEN i/10000 = 0 THEN NULL ELSE i/10000 END),
+ (CASE WHEN i/20000 = 0 THEN NULL ELSE i/20000 END),
+ (CASE WHEN i/40000 = 0 THEN NULL ELSE i/40000 END)
+ FROM generate_series(1,1000000) s(i);
+ANALYZE mv_histogram;
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+ hist_enabled | hist_built
+--------------+------------
+ t | t
+(1 row)
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mv_histogram WHERE a IS NULL AND b IS NULL;
+ QUERY PLAN
+---------------------------------------------------
+ Bitmap Heap Scan on mv_histogram
+ Recheck Cond: ((a IS NULL) AND (b IS NULL))
+ -> Bitmap Index Scan on hist_idx
+ Index Cond: ((a IS NULL) AND (b IS NULL))
+(4 rows)
+
+DELETE FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+DROP TABLE mv_histogram;
+-- NULL values (mix of int and text columns)
+CREATE TABLE mv_histogram (
+ a INT,
+ b TEXT,
+ c INT,
+ d TEXT
+);
+ALTER TABLE mv_histogram ADD STATISTICS (histogram) ON (a, b, c, d);
+INSERT INTO mv_histogram
+ SELECT
+ mod(i, 100),
+ (CASE WHEN mod(i, 200) = 0 THEN NULL ELSE mod(i,200) END),
+ mod(i, 400),
+ (CASE WHEN mod(i, 300) = 0 THEN NULL ELSE mod(i,600) END)
+ FROM generate_series(1,10000) s(i);
+ANALYZE mv_histogram;
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+ hist_enabled | hist_built
+--------------+------------
+ t | t
+(1 row)
+
+DELETE FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+DROP TABLE mv_histogram;
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 6d9ab2f..ccc778a 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1359,7 +1359,9 @@ pg_mv_stats| SELECT n.nspname AS schemaname,
length(s.stadeps) AS depsbytes,
pg_mv_stats_dependencies_info(s.stadeps) AS depsinfo,
length(s.stamcv) AS mcvbytes,
- pg_mv_stats_mcvlist_info(s.stamcv) AS mcvinfo
+ pg_mv_stats_mcvlist_info(s.stamcv) AS mcvinfo,
+ length(s.stahist) AS histbytes,
+ pg_mv_stats_histogram_info(s.stahist) AS histinfo
FROM ((pg_mv_statistic s
JOIN pg_class c ON ((c.oid = s.starelid)))
LEFT JOIN pg_namespace n ON ((n.oid = c.relnamespace)));
diff --git a/src/test/regress/parallel_schedule b/src/test/regress/parallel_schedule
index 63727a4..aeb89f8 100644
--- a/src/test/regress/parallel_schedule
+++ b/src/test/regress/parallel_schedule
@@ -111,4 +111,4 @@ test: event_trigger
test: stats
# run tests of multivariate stats
-test: mv_dependencies mv_mcv
+test: mv_dependencies mv_mcv mv_histogram
diff --git a/src/test/regress/serial_schedule b/src/test/regress/serial_schedule
index 5b07b3b..ee1468d 100644
--- a/src/test/regress/serial_schedule
+++ b/src/test/regress/serial_schedule
@@ -155,3 +155,4 @@ test: event_trigger
test: stats
test: mv_dependencies
test: mv_mcv
+test: mv_histogram
diff --git a/src/test/regress/sql/mv_histogram.sql b/src/test/regress/sql/mv_histogram.sql
new file mode 100644
index 0000000..78890c8
--- /dev/null
+++ b/src/test/regress/sql/mv_histogram.sql
@@ -0,0 +1,179 @@
+-- data type passed by value
+CREATE TABLE mv_histogram (
+ a INT,
+ b INT,
+ c INT
+);
+
+-- unknown column
+ALTER TABLE mv_histogram ADD STATISTICS (histogram) ON (unknown_column);
+
+-- single column
+ALTER TABLE mv_histogram ADD STATISTICS (histogram) ON (a);
+
+-- single column, duplicated
+ALTER TABLE mv_histogram ADD STATISTICS (histogram) ON (a, a);
+
+-- two columns, one duplicated
+ALTER TABLE mv_histogram ADD STATISTICS (histogram) ON (a, a, b);
+
+-- unknown option
+ALTER TABLE mv_histogram ADD STATISTICS (unknown_option) ON (a, b, c);
+
+-- missing histogram statistics
+ALTER TABLE mv_histogram ADD STATISTICS (dependencies, max_buckets 200) ON (a, b, c);
+
+-- invalid max_buckets value / too low
+ALTER TABLE mv_histogram ADD STATISTICS (mcv, max_buckets 10) ON (a, b, c);
+
+-- invalid max_buckets value / too high
+ALTER TABLE mv_histogram ADD STATISTICS (mcv, max_buckets 100000) ON (a, b, c);
+
+-- correct command
+ALTER TABLE mv_histogram ADD STATISTICS (histogram) ON (a, b, c);
+
+-- random data (no functional dependencies)
+INSERT INTO mv_histogram
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+
+ANALYZE mv_histogram;
+
+SELECT hist_enabled, hist_built, pg_mv_stats_histogram_info(stahist)
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+
+TRUNCATE mv_histogram;
+
+-- a => b, a => c, b => c
+INSERT INTO mv_histogram
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+
+ANALYZE mv_histogram;
+
+SELECT hist_enabled, hist_built, pg_mv_stats_histogram_info(stahist)
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+
+TRUNCATE mv_histogram;
+
+-- a => b, a => c
+INSERT INTO mv_histogram
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE mv_histogram;
+
+SELECT hist_enabled, hist_built, pg_mv_stats_histogram_info(stahist)
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+
+TRUNCATE mv_histogram;
+
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO mv_histogram
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX hist_idx ON mv_histogram (a, b);
+ANALYZE mv_histogram;
+
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mv_histogram WHERE a = 10 AND b = 5;
+
+DELETE FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+DROP TABLE mv_histogram;
+
+-- varlena type (text)
+CREATE TABLE mv_histogram (
+ a TEXT,
+ b TEXT,
+ c TEXT
+);
+
+ALTER TABLE mv_histogram ADD STATISTICS (histogram) ON (a, b, c);
+
+-- random data (no functional dependencies)
+INSERT INTO mv_histogram
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+
+ANALYZE mv_histogram;
+
+SELECT hist_enabled, hist_built, pg_mv_stats_histogram_info(stahist)
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+
+TRUNCATE mv_histogram;
+
+-- a => b, a => c, b => c
+INSERT INTO mv_histogram
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+
+ANALYZE mv_histogram;
+
+SELECT hist_enabled, hist_built, pg_mv_stats_histogram_info(stahist)
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+
+TRUNCATE mv_histogram;
+
+-- a => b, a => c
+INSERT INTO mv_histogram
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE mv_histogram;
+
+SELECT hist_enabled, hist_built, pg_mv_stats_histogram_info(stahist)
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+
+TRUNCATE mv_histogram;
+
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO mv_histogram
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX hist_idx ON mv_histogram (a, b);
+ANALYZE mv_histogram;
+
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mv_histogram WHERE a = '10' AND b = '5';
+
+TRUNCATE mv_histogram;
+
+-- check explain (expect bitmap index scan, not plain index scan) with NULLs
+INSERT INTO mv_histogram
+ SELECT
+ (CASE WHEN i/10000 = 0 THEN NULL ELSE i/10000 END),
+ (CASE WHEN i/20000 = 0 THEN NULL ELSE i/20000 END),
+ (CASE WHEN i/40000 = 0 THEN NULL ELSE i/40000 END)
+ FROM generate_series(1,1000000) s(i);
+ANALYZE mv_histogram;
+
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mv_histogram WHERE a IS NULL AND b IS NULL;
+
+DELETE FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+DROP TABLE mv_histogram;
+
+-- NULL values (mix of int and text columns)
+CREATE TABLE mv_histogram (
+ a INT,
+ b TEXT,
+ c INT,
+ d TEXT
+);
+
+ALTER TABLE mv_histogram ADD STATISTICS (histogram) ON (a, b, c, d);
+
+INSERT INTO mv_histogram
+ SELECT
+ mod(i, 100),
+ (CASE WHEN mod(i, 200) = 0 THEN NULL ELSE mod(i,200) END),
+ mod(i, 400),
+ (CASE WHEN mod(i, 300) = 0 THEN NULL ELSE mod(i,600) END)
+ FROM generate_series(1,10000) s(i);
+
+ANALYZE mv_histogram;
+
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+
+DELETE FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+DROP TABLE mv_histogram;
--
2.0.5
>From db24cc534985ce97b238ea539b4216d8e33397a5 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@pgaddict.com>
Date: Fri, 6 Feb 2015 01:42:38 +0100
Subject: [PATCH 5/5] multi-statistics estimation
The general idea is that a probability (which
is what selectivity is) can be split into a product of
conditional probabilities like this:
P(A & B & C) = P(A & B) * P(C|A & B)
If we assume that C and B are independent (given A), the
last term may be simplified like this
P(A & B & C) = P(A & B) * P(C|A)
so we only need probabilities on [A,B] and [C,A] to compute
the original probability.
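As a made-up numeric example: with P(A & B) = 0.05 and
P(C|A) = 0.2, this gives

P(A & B & C) = 0.05 * 0.2 = 0.01

whereas the plain independence assumption would instead
multiply the three per-column selectivities
P(A) * P(B) * P(C), which can be off by orders of
magnitude for correlated columns.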
The implementation works in the other direction, though.
We know what probability P(A & B & C) we need to compute,
and also what statistics are available.
So we search for a combination of statistics, covering
the clauses in an optimal way (most clauses covered, most
dependencies exploited).
There are two possible approaches - exhaustive and greedy.
The exhaustive one walks through all permutations of
stats using dynamic programming, so it's guaranteed to
find the optimal solution, but it soon gets very slow as
it's roughly O(N!). The dynamic programming may improve
that a bit, but it's still far too expensive for large
numbers of statistics (on a single table).
The greedy algorithm is very simple - in every step it
chooses the locally best statistics. That may not find
the globally best solution (but maybe it does?), but it
only needs N steps, so it's very fast (processing the
selected stats is usually way more expensive).
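A minimal sketch of the greedy step (hypothetical code,
ignoring conditions and tie-breaking; the real logic is in
choose_mv_statistics()):

    /*
     * In each round, pick the statistics covering the most
     * not-yet-covered clause attributes; stop once no
     * statistics adds anything new.
     */
    static int
    greedy_search(int nmvstats, Bitmapset **stat_attnums,
                  Bitmapset *clause_attnums, int *solution)
    {
        int         nsolution = 0;
        Bitmapset  *covered = NULL;

        while (true)
        {
            int     i;
            int     best = -1;
            int     best_gain = 0;

            for (i = 0; i < nmvstats; i++)
            {
                /* attributes this statistics would newly cover */
                Bitmapset  *gain = bms_difference(clause_attnums,
                                                  covered);

                gain = bms_int_members(gain, stat_attnums[i]);

                if (bms_num_members(gain) > best_gain)
                {
                    best = i;
                    best_gain = bms_num_members(gain);
                }

                bms_free(gain);
            }

            if (best < 0)
                break;

            covered = bms_add_members(covered, stat_attnums[best]);
            solution[nsolution++] = best;
        }

        return nsolution;
    }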
There's a GUC for selecting the search algorithm
mvstat_search = {'greedy', 'exhaustive'}
The default value is 'greedy' as that's much safer (with
respect to runtime). See choose_mv_statistics().
Once we have found a sequence of statistics, we apply
them to the clauses using the conditional probabilities.
We process the selected stats one by one, and for each
we select the estimated clauses and conditions. See
clauselist_selectivity() for more details.
Limitations
-----------
It's still true that each clause at a given level has to
be covered by a single MV statistics. So with this query
WHERE (clause1) AND (clause2) AND (clause3 OR clause4)
each parenthesized clause has to be covered by a single
multivariate statistics.
Clauses not covered by a single statistics at this level
will be passed to clause_selectivity() but this will treat
them as a collection of simpler clauses (connected by AND
or OR), and the clauses from the previous level will be
used as conditions.
So using the same example, the last clause will be passed
to clause_selectivity() with 'clause1' and 'clause2' as
conditions, and it will be processed using multivariate
stats if possible.
The other limitation is that all the expressions have to
be mv-compatible, i.e. there can't be a mix of compatible
and incompatible expressions.
Fixing this should be relatively simple - just split the
list into two parts (mv-compatible/incompatible), as at
the top level.
---
src/backend/optimizer/path/clausesel.c | 1533 ++++++++++++++++++++++++++++++--
src/backend/optimizer/path/costsize.c | 23 +-
src/backend/optimizer/util/orclauses.c | 4 +-
src/backend/utils/adt/selfuncs.c | 17 +-
src/backend/utils/misc/guc.c | 20 +
src/include/optimizer/cost.h | 6 +-
src/include/utils/mvstats.h | 8 +
7 files changed, 1513 insertions(+), 98 deletions(-)
diff --git a/src/backend/optimizer/path/clausesel.c b/src/backend/optimizer/path/clausesel.c
index ea4d588..98ad802 100644
--- a/src/backend/optimizer/path/clausesel.c
+++ b/src/backend/optimizer/path/clausesel.c
@@ -30,7 +30,7 @@
#include "utils/typcache.h"
#include "parser/parsetree.h"
-
+#include "miscadmin.h"
#include <stdio.h>
@@ -63,23 +63,25 @@ static Bitmapset *collect_mv_attnums(PlannerInfo *root, List *clauses,
Oid varRelid, Oid *relid, SpecialJoinInfo *sjinfo,
int type);
+static Bitmapset *clause_mv_get_attnums(PlannerInfo *root, Node *clause);
+
static List *clauselist_apply_dependencies(PlannerInfo *root, List *clauses,
Oid varRelid, int nmvstats, MVStats mvstats,
SpecialJoinInfo *sjinfo);
-static int choose_mv_statistics(int nmvstats, MVStats mvstats,
- Bitmapset *attnums);
static List *clauselist_mv_split(PlannerInfo *root, SpecialJoinInfo *sjinfo,
List *clauses, Oid varRelid,
List **mvclauses, MVStats mvstats, int types);
static Selectivity clauselist_mv_selectivity(PlannerInfo *root,
- List *clauses, MVStats mvstats);
+ MVStats mvstats, List *clauses, List *conditions);
static Selectivity clauselist_mv_selectivity_mcvlist(PlannerInfo *root,
- List *clauses, MVStats mvstats,
+ MVStats mvstats,
+ List *clauses, List *conditions,
bool *fullmatch, Selectivity *lowsel);
static Selectivity clauselist_mv_selectivity_histogram(PlannerInfo *root,
- List *clauses, MVStats mvstats);
+ MVStats mvstats,
+ List *clauses, List *conditions);
static int update_match_bitmap_mcvlist(PlannerInfo *root, List *clauses,
int2vector *stakeys, MCVList mcvlist,
@@ -92,6 +94,31 @@ static int update_match_bitmap_histogram(PlannerInfo *root, List *clauses,
int nmatches, char * matches,
bool is_or);
+/*
+ * Describes a combination of multiple statistics to cover attributes
+ * referenced by the clauses. The array 'stats' (with nstats elements)
+ * lists the chosen statistics (in the order they are applied), along
+ * with the number of clause attributes covered by this solution.
+ *
+ * choose_mv_statistics_exhaustive() uses this to track both the current
+ * and the best solutions, while walking through the space of possible
+ * combinations.
+ */
+typedef struct mv_solution_t {
+ int nclauses; /* number of clauses covered */
+ int nconditions; /* number of conditions covered */
+ int nstats; /* number of stats applied */
+ int *stats; /* stats (in the apply order) */
+} mv_solution_t;
+
+static mv_solution_t *choose_mv_statistics(PlannerInfo *root,
+ int nmvstats, MVStats mvstats,
+ List *clauses, List *conditions,
+ Oid varRelid,
+ SpecialJoinInfo *sjinfo, int type);
+
+int mvstat_search_type = MVSTAT_SEARCH_GREEDY;
+
/* used for merging bitmaps - AND (min), OR (max) */
#define MAX(x, y) (((x) > (y)) ? (x) : (y))
#define MIN(x, y) (((x) < (y)) ? (x) : (y))
@@ -220,7 +247,8 @@ clauselist_selectivity(PlannerInfo *root,
List *clauses,
int varRelid,
JoinType jointype,
- SpecialJoinInfo *sjinfo)
+ SpecialJoinInfo *sjinfo,
+ List *conditions)
{
Selectivity s1 = 1.0;
RangeQueryClause *rqlist = NULL;
@@ -234,13 +262,8 @@ clauselist_selectivity(PlannerInfo *root,
/* attributes in mv-compatible clauses */
Bitmapset *mvattnums = NULL;
- /*
- * If there's exactly one clause, then no use in trying to match up
- * pairs, so just go directly to clause_selectivity().
- */
- if (list_length(clauses) == 1)
- return clause_selectivity(root, (Node *) linitial(clauses),
- varRelid, jointype, sjinfo);
+ /* local conditions, accumulated and passed to clauses in this list */
+ List *conditions_local = list_copy(conditions);
/*
* Collect attributes referenced by mv-compatible clauses (looking
@@ -288,30 +311,185 @@ clauselist_selectivity(PlannerInfo *root,
/* see choose_mv_statistics() for details */
if (nmvstats > 0)
{
- int idx = choose_mv_statistics(nmvstats, mvstats, mvattnums);
+ mv_solution_t * solution
+ = choose_mv_statistics(root, nmvstats, mvstats,
+ clauses, conditions,
+ varRelid, sjinfo,
+ (MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST));
- if (idx >= 0) /* we have a matching stats */
+ /*
+ * FIXME This probably leaks memory a bit - the lists of clauses
+ * should be freed properly.
+ */
+
+ /* we have a usable solution */
+ if (solution != NULL)
{
- MVStats mvstat = &mvstats[idx];
+ int i, j, k;
+
+ for (i = 0; i < solution->nstats; i++)
+ {
+ /* clauses compatible with multi-variate stats */
+ List *mvclauses = NIL;
+ List *mvclauses_new = NIL;
+ List *mvclauses_conditions = NIL;
+ Bitmapset *stat_attnums = NULL;
+
+ MVStats mvstat = &mvstats[solution->stats[i]];
+
+ /* build attnum bitmapset for this statistics */
+ for (k = 0; k < mvstat->stakeys->dim1; k++)
+ stat_attnums = bms_add_member(stat_attnums,
+ mvstat->stakeys->values[k]);
+
+ /*
+ * Append the compatible conditions (passed from above)
+ * to mvclauses_conditions.
+ */
+ foreach (l, conditions)
+ {
+ Node *c = (Node*)lfirst(l);
+ Bitmapset *tmp = clause_mv_get_attnums(root, c);
- /* clauses compatible with multi-variate stats */
- List *mvclauses = NIL;
+ if (bms_is_subset(tmp, stat_attnums))
+ mvclauses_conditions
+ = lappend(mvclauses_conditions, c);
+
+ bms_free(tmp);
+ }
+
+ /* split the clauselist into regular and mv-clauses
+ *
+ * We keep the list of clauses (we don't remove the
+ * clauses yet, because we want to use the clauses
+ * as conditions of other clauses).
+ *
+ * FIXME Do this only once, i.e. filter the clauses
+ * once (selecting clauses covered by at least
+ * one statistics) and then convert them into
+ * smaller per-statistics lists of conditions
+ * and estimated clauses.
+ */
+ clauselist_mv_split(root, sjinfo, clauses,
+ varRelid, &mvclauses, mvstat,
+ (MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST));
+
+ /*
+ * We've chosen the statistics to match the clauses, so
+ * each statistics from the solution should have at least
+ * one new clause (not covered by the previous stats).
+ */
+ Assert(mvclauses != NIL);
+
+ /*
+ * Mvclauses now contains only clauses compatible
+ * with the currently selected stats, but we have to
+ * split that into conditions (already matched by
+ * the previous stats), and the new clauses we need
+ * to estimate using this statistics.
+ */
+ foreach (l, mvclauses)
+ {
+ bool covered = false;
+ Node *clause = (Node *) lfirst(l);
+ Bitmapset *clause_attnums = clause_mv_get_attnums(root, clause);
+
+ /*
+ * If already covered by previous stats, add it to
+ * conditions.
+ *
+ * TODO Maybe this could be relaxed a bit? Because
+ * with complex and/or clauses, this might
+ * mean no statistics actually covers such
+ * a complex clause.
+ */
+ for (j = 0; j < i; j++)
+ {
+ int k;
+ Bitmapset *stat_attnums = NULL;
+ MVStats prev_stat = &mvstats[solution->stats[j]];
+
+ for (k = 0; k < prev_stat->stakeys->dim1; k++)
+ stat_attnums = bms_add_member(stat_attnums,
+ prev_stat->stakeys->values[k]);
+
+ covered = bms_is_subset(clause_attnums, stat_attnums);
+
+ bms_free(stat_attnums);
+
+ if (covered)
+ break;
+ }
+
+ if (covered)
+ mvclauses_conditions
+ = lappend(mvclauses_conditions, clause);
+ else
+ mvclauses_new
+ = lappend(mvclauses_new, clause);
+ }
+
+ /*
+ * We need at least one new clause (not just conditions).
+ */
+ Assert(mvclauses_new != NIL);
+
+ /* compute the multivariate stats */
+ s1 *= clauselist_mv_selectivity(root, mvstat,
+ mvclauses_new,
+ mvclauses_conditions);
+ }
+
+ /*
+ * And now finally remove all the mv-compatible clauses.
+ *
+ * This only repeats the same split as above, but this
+ * time we actually use the result list (and feed it to
+ * the next call).
+ */
+ for (i = 0; i < solution->nstats; i++)
+ {
+ /* clauses compatible with multi-variate stats */
+ List *mvclauses = NIL;
- /* split the clauselist into regular and mv-clauses */
- clauses = clauselist_mv_split(root, sjinfo, clauses,
- varRelid, &mvclauses, mvstat,
- (MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST));
+ MVStats mvstat = &mvstats[solution->stats[i]];
- /* we've chosen the histogram to match the clauses */
- Assert(mvclauses != NIL);
+ /* split the list into regular and mv-clauses */
+ clauses = clauselist_mv_split(root, sjinfo, clauses,
+ varRelid, &mvclauses, mvstat,
+ (MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST));
- /* compute the multivariate stats */
- s1 *= clauselist_mv_selectivity(root, mvclauses, mvstat);
+ /*
+ * Add the clauses to the conditions (to be passed
+ * to regular clauses), irrespective of whether each
+ * will be used as a condition or a clause here.
+ *
+ * We only keep the remaining conditions in the
+ * clauses (we keep what clauselist_mv_split returns)
+ * so we add each MV condition exactly once.
+ */
+ foreach (l, mvclauses)
+ conditions_local = lappend(conditions_local,
+ (Node*)lfirst(l));
+ }
}
}
}
/*
+ * If there's exactly one clause, then no use in trying to match up
+ * pairs, so just go directly to clause_selectivity().
+ */
+ if (list_length(clauses) == 1)
+ {
+ Selectivity s = clause_selectivity(root, (Node *) linitial(clauses),
+ varRelid, jointype, sjinfo,
+ conditions_local);
+ list_free(conditions_local);
+ return s;
+ }
+
+ /*
* Initial scan over clauses. Anything that doesn't look like a potential
* rangequery clause gets multiplied into s1 and forgotten. Anything that
* does gets inserted into an rqlist entry.
@@ -323,7 +501,8 @@ clauselist_selectivity(PlannerInfo *root,
Selectivity s2;
/* Always compute the selectivity using clause_selectivity */
- s2 = clause_selectivity(root, clause, varRelid, jointype, sjinfo);
+ s2 = clause_selectivity(root, clause, varRelid, jointype, sjinfo,
+ conditions_local);
/*
* Check for being passed a RestrictInfo.
@@ -478,6 +657,9 @@ clauselist_selectivity(PlannerInfo *root,
rqlist = rqnext;
}
+ /* free the local conditions */
+ list_free(conditions_local);
+
return s1;
}
@@ -688,7 +870,8 @@ clause_selectivity(PlannerInfo *root,
Node *clause,
int varRelid,
JoinType jointype,
- SpecialJoinInfo *sjinfo)
+ SpecialJoinInfo *sjinfo,
+ List *conditions)
{
Selectivity s1 = 0.5; /* default for any unhandled clause type */
RestrictInfo *rinfo = NULL;
@@ -818,7 +1001,8 @@ clause_selectivity(PlannerInfo *root,
(Node *) get_notclausearg((Expr *) clause),
varRelid,
jointype,
- sjinfo);
+ sjinfo,
+ conditions);
}
else if (and_clause(clause))
{
@@ -827,7 +1011,8 @@ clause_selectivity(PlannerInfo *root,
((BoolExpr *) clause)->args,
varRelid,
jointype,
- sjinfo);
+ sjinfo,
+ conditions);
}
else if (or_clause(clause))
{
@@ -839,6 +1024,13 @@ clause_selectivity(PlannerInfo *root,
*/
ListCell *arg;
+ /* TODO Split the clause list into mv-compatible part, pretty
+ * much just like in clauselist_selectivity(), and call
+ * clauselist_mv_selectivity(). It has to be taught about
+ * OR-semantics (right now it assumes AND) or maybe just
+ * create a fake OR clause here, and pass it in.
+ */
+
s1 = 0.0;
foreach(arg, ((BoolExpr *) clause)->args)
{
@@ -846,7 +1038,8 @@ clause_selectivity(PlannerInfo *root,
(Node *) lfirst(arg),
varRelid,
jointype,
- sjinfo);
+ sjinfo,
+ conditions);
s1 = s1 + s2 - s1 * s2;
}
@@ -958,7 +1151,8 @@ clause_selectivity(PlannerInfo *root,
(Node *) ((RelabelType *) clause)->arg,
varRelid,
jointype,
- sjinfo);
+ sjinfo,
+ conditions);
}
else if (IsA(clause, CoerceToDomain))
{
@@ -967,7 +1161,8 @@ clause_selectivity(PlannerInfo *root,
(Node *) ((CoerceToDomain *) clause)->arg,
varRelid,
jointype,
- sjinfo);
+ sjinfo,
+ conditions);
}
/* Cache the result if possible */
@@ -1149,9 +1344,67 @@ clause_selectivity(PlannerInfo *root,
* that from the most selective clauses first, because that'll
* eliminate the buckets/items sooner (so we'll be able to skip
* them without inspection, which is more expensive).
+ *
+ * TODO All this is based on the assumption that the statistics represent
+ * the necessary dependencies, i.e. that if two columns are not in
+ * the same statistics, there's no dependency. If that's not the
+ * case, we may get misestimates, just like before. For example
+ * assume we have a table with three columns [a,b,c] with exactly
+ * the same values, and statistics on [a,b] and [b,c]. So something
+ * like this:
+ *
+ * CREATE TABLE test AS SELECT i AS a, i AS b, i AS c
+ * FROM generate_series(1,1000) s(i);
+ *
+ * ALTER TABLE test ADD STATISTICS (mcv) ON (a,b);
+ * ALTER TABLE test ADD STATISTICS (mcv) ON (b,c);
+ *
+ * ANALYZE test;
+ *
+ * EXPLAIN ANALYZE SELECT * FROM test
+ * WHERE (a < 10) AND (b < 20) AND (c < 10);
+ *
+ * The problem here is that the only shared column between the two
+ * statistics is 'b' so the probability will be computed like this
+ *
+ * P[(a < 10) & (b < 20) & (c < 10)]
+ * = P[(a < 10) & (b < 20)] * P[(c < 10) | (a < 10) & (b < 20)]
+ * = P[(a < 10) & (b < 20)] * P[(c < 10) | (b < 20)]
+ *
+ * or like this
+ *
+ * P[(a < 10) & (b < 20) & (c < 10)]
+ * = P[(b < 20) & (c < 10)] * P[(a < 10) | (b < 20) & (c < 10)]
+ * = P[(b < 20) & (c < 10)] * P[(a < 10) | (b < 20)]
+ *
+ * In both cases the conditional probabilities will be evaluated as
+ * 0.5, because they lack the other column (which would make it 1.0).
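+ *
+ * With the example above, that means an estimate of roughly
+ * 1000 * (9/1000) * 0.5 = ~4.5 rows, instead of the correct
+ * 9 rows (i < 10).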
+ *
+ * Theoretically it might be possible to transfer the dependency,
+ * e.g. by building a bitmap for [a,b] and then combining it with
+ * [b,c] by doing something like this:
+ *
+ * 1) build bitmap on [a,b] using [(a<10) & (b < 20)]
+ * 2) for each element in [b,c] check the bitmap
+ *
+ * But that's certainly nontrivial - for example the statistics may
+ * be different (MCV list vs. histogram) and/or the items may not
+ * match (e.g. MCV items or histogram buckets will be built
+ * differently). Also, for one value of 'b' there might be multiple
+ * MCV items (because of the other column values) with different
+ * bitmap values (some will match, some won't) - so it's not exactly
+ * a bitmap but a partial match.
+ *
+ * Maybe a hash table with number of matches and mismatches (or
+ * maybe sums of frequencies) would work? The step (2) would then
+ * lookup the values and use that to weight the item somehow.
+ *
+ * Currently the only solution is to build statistics on all three
+ * columns.
*/
static Selectivity
-clauselist_mv_selectivity(PlannerInfo *root, List *clauses, MVStats mvstats)
+clauselist_mv_selectivity(PlannerInfo *root, MVStats mvstats,
+ List *clauses, List *conditions)
{
bool fullmatch = false;
Selectivity s1 = 0.0, s2 = 0.0;
@@ -1169,7 +1422,8 @@ clauselist_mv_selectivity(PlannerInfo *root, List *clauses, MVStats mvstats)
*/
/* Evaluate the MCV first. */
- s1 = clauselist_mv_selectivity_mcvlist(root, clauses, mvstats,
+ s1 = clauselist_mv_selectivity_mcvlist(root, mvstats,
+ clauses, conditions,
&fullmatch, &mcv_low);
/*
@@ -1182,7 +1436,8 @@ clauselist_mv_selectivity(PlannerInfo *root, List *clauses, MVStats mvstats)
/* FIXME if (fullmatch) without matching MCV item, use the mcv_low
* selectivity as upper bound */
- s2 = clauselist_mv_selectivity_histogram(root, clauses, mvstats);
+ s2 = clauselist_mv_selectivity_histogram(root, mvstats,
+ clauses, conditions);
/* TODO clamp to <= 1.0 (or more strictly, when possible) */
return s1 + s2;
@@ -1232,6 +1487,665 @@ collect_mv_attnums(PlannerInfo *root, List *clauses, Oid varRelid,
}
/*
+ * Selects the best combination of multivariate statistics, where
+ * 'best' means:
+ *
+ * (a) covering the most attributes (referenced by clauses)
+ * (b) using the least number of multivariate stats
+ *
+ * There may be other optimality criteria, not considered in the initial
+ * implementation (more on that in the 'Weaknesses' section below).
+ *
+ * This is pretty much equal to splitting the probability of clauses
+ * (aka selectivity) into a sequence of conditional probabilities, like
+ * this
+ *
+ * P(A,B,C,D) = P(A,B) * P(C|A,B) * P(D|A,B,C)
+ *
+ * and removing the attributes not referenced by the existing stats,
+ * under the assumption that there's no dependency (otherwise the DBA
+ * would create the stats).
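+ *
+ * For example (a hypothetical illustration), with statistics on
+ * [A,B] and [B,C,D] the decomposition gets approximated as
+ *
+ * P(A,B,C,D) ~ P(A,B) * P(C,D|B)
+ *
+ * where the second term drops the condition on A, as A is not
+ * covered by the second statistics.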
+ *
+ *
+ * Algorithm
+ * ---------
+ * The algorithm is a recursive implementation of backtracking, with
+ * maximum 'depth' equal to the number of multi-variate statistics
+ * available on the table.
+ *
+ * It explores all the possible permutations of the stats.
+ *
+ * Whenever it considers adding the next statistics, the clauses it
+ * matches are divided into 'conditions' (clauses already matched by at
+ * least one previous statistics) and clauses that are estimated.
+ *
+ * Then several checks are performed:
+ *
+ * (a) The statistics covers at least 2 columns, referenced in the
+ * estimated clauses (otherwise multi-variate stats are useless).
+ *
+ * (b) The statistics covers at least 1 new column, i.e. column not
+ * referenced by the already used stats (and the new column has
+ * to be referenced by the clauses, of course). Otherwise the
+ * statistics would not add any new information.
+ *
+ * There are some other sanity checks (e.g. that the stats must not be
+ * used twice etc.).
+ *
+ * Finally the new solution is compared to the currently best one, and
+ * if it's considered better, it's used instead.
+ *
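+ * In pseudo-code, the recursion looks roughly like this (a
+ * simplified sketch of choose_mv_statistics_exhaustive below):
+ *
+ *     solve(step, current, best)
+ *         for each statistics S not yet used or ruled out:
+ *             split the clauses covered by S into conditions
+ *             (covered by previous stats) and new clauses
+ *             if S covers >= 2 attributes and >= 1 new clause:
+ *                 add S to the current solution
+ *                 if current is better than best, replace best
+ *                 solve(step + 1, current, best)
+ *                 remove S from the current solution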
+ *
+ * Weaknesses
+ * ----------
+ * The current implementation uses a somewhat simple optimality criterion,
+ * suffering from the following weaknesses.
+ *
+ * (a) There may be multiple solutions with the same number of covered
+ * attributes and number of statistics (e.g. the same solution but
+ * with statistics in a different order). It's unclear which solution
+ * is the best one - in a sense all of them are equal.
+ *
+ * TODO It might be possible to compute estimate for each of those
+ * solutions, and then combine them to get the final estimate
+ * (e.g. by using average or median).
+ *
+ * (b) Does not consider that some types of stats are a better match for
+ * some types of clauses (e.g. an MCV list is a better match for
+ * equality clauses than a histogram).
+ *
+ * XXX Maybe MCV is almost always better / more accurate?
+ *
+ * But maybe this is pointless - generally, each column is either
+ * a label (it's not important whether because of the data type or
+ * how it's used), or a value with ordering that makes sense. So
+ * either a MCV list is more appropriate (labels) or a histogram
+ * (values with orderings).
+ *
+ * Not sure what to do with statistics mixing columns of
+ * both types - maybe it'd be better to invent a new type of stats
+ * combining MCV list and histogram (keeping a small histogram for
+ * each MCV item, and a separate histogram for values not on the
+ * MCV list). But that's not implemented at this moment.
+ *
+ * (c) Does not consider that some solutions may better exploit the
+ * dependencies. For example with clauses on columns [A,B,C,D] and
+ * statistics on [A,B,C] and [C,D] cover all the columns just like
+ * [A,B,C] and [B,C,D], but the latter probably exploits additional
+ * dependencies thanks to having 'B' in both stats (thus allowing
+ * using it as a condition for the second stats). Of course, if
+ * B and [C,D] are independent, this is untrue - but if we have that
+ * statistics created, it's a sign that the DBA/developer believes
+ * there's a dependency.
+ *
+ * (d) Does not consider the order of clauses, which may be significant.
+ * For example, when there's a mix of simple and complex clauses,
+ * i.e. something like
+ *
+ * (a=2) AND (b=3 OR (c=3 AND d=4)) AND (c=3)
+ *
+ * It may be better to evaluate the simple clauses first, and then
+ * use them as conditions for the complex clause.
+ *
+ * We can for example count number of different attributes
+ * referenced in the clause, and use that as a metric of complexity
+ * (lower number -> simpler). Maybe use ratio (#vars/#atts) or
+ * (#clauses/#atts) as secondary metrics? Also the general complexity
+ * of the clause (levels of nesting etc.) might be useful.
+ *
+ * Hopefully most clauses will be reasonably simple, though.
+ *
+ * Update: On second thought, I believe the order of clauses is
+ * determined by choosing the order of statistics, and therefore
+ * optimized by the current algorithm.
+ *
+ * TODO Consider adding a counter of attributes covered by previous
+ * stats (possibly tracking the number of how many stats reference
+ * it too), and use this 'dependency_count' when selecting the best
+ * solution (not sure how). Similarly to (a) it might be possible
+ * to build estimate for each solution (different criteria) and then
+ * combine them somehow.
+ *
+ * TODO The current implementation repeatedly walks through the previous
+ * stats, just to compute the number of covered attributes over and
+ * over. With non-trivial number of statistics this might be an
+ * issue, so maybe we should keep track of 'covered' attributes by
+ * each step, so that we can get rid of this. We'll need this
+ * information anyway (when splitting clauses into condition and
+ * the estimated part).
+ *
+ * TODO This needs to consider the conditions passed from the preceding
+ * and upper clauses (in complex cases), but only as conditions
+ * and not as estimated clauses. So it needs to somehow affect the
+ * score (the more conditions we use the better).
+ *
+ * TODO The algorithm should probably count number of Vars (not just
+ * attnums) when computing the 'score' of each solution. Computing
+ * the ratio of (num of all vars) / (num of condition vars) as a
+ * measure of how well the solution uses conditions might be
+ * useful.
+ *
+ * TODO This might be much easier if we kept Bitmapset of attributes
+ * covered by the stats up to that step.
+ *
+ * FIXME When comparing the solutions, we currently use this condition:
+ *
+ * ((current->nstats > (*best)->nstats))
+ *
+ * i.e. we're choosing the solution with more stats, because with
+ * clauses
+ *
+ * (a = 1) AND (b = 1) AND (c = 1) AND (d = 1)
+ *
+ * and stats on [a,b], [b,c], [c,d] we want to choose the solution
+ * with all three stats, and not just [a,b], [c,d]. Otherwise we'd
+ * fail to exploit one of the dependencies.
+ *
+ * This is however a workaround for another issue - we're not
+ * tracking number of 'dependencies' covered by the solution, only
+ * number of clauses, and that's the same for both solutions.
+ * ([a,b], [c,d]) and ([a,b], [b,c], [c,d]) both cover all 4 clauses.
+ *
+ * Once a suitable metric is added, we want to choose the solution
+ * with less stats, assuming it covers the same number of clauses
+ * and exploits the same number of dependencies.
+ */
+static void
+choose_mv_statistics_exhaustive(PlannerInfo *root, int step,
+ int nmvstats, MVStats mvstats, Bitmapset ** stats_attnums,
+ int nclauses, Node ** clauses, Bitmapset ** clauses_attnums,
+ int nconditions, Node ** conditions, Bitmapset ** conditions_attnums,
+ bool *cover_map, bool *condition_map, int *ruled_out,
+ mv_solution_t *current, mv_solution_t **best)
+{
+ int i, j;
+
+ Assert(best != NULL);
+ Assert((step == 0 && current == NULL) || (step > 0 && current != NULL));
+
+ CHECK_FOR_INTERRUPTS();
+
+ if (current == NULL)
+ {
+ current = (mv_solution_t*)palloc0(sizeof(mv_solution_t));
+ current->stats = (int*)palloc0(sizeof(int)*nmvstats);
+ current->nstats = 0;
+ current->nclauses = 0;
+ current->nconditions = 0;
+ }
+
+ /*
+ * Now try to apply each statistics, matching at least two attributes,
+ * unless it's already used in one of the previous steps.
+ */
+ for (i = 0; i < nmvstats; i++)
+ {
+ int c;
+
+ int ncovered_clauses = 0; /* number of covered clauses */
+ int ncovered_conditions = 0; /* number of covered conditions */
+ int nattnums = 0; /* number of covered attributes */
+
+ Bitmapset *all_attnums = NULL;
+ Bitmapset *new_attnums = NULL;
+
+ /* skip statistics that were already used or eliminated */
+ if (ruled_out[i] != -1)
+ continue;
+
+ /*
+ * See if we have clauses covered by this statistics, but not
+ * yet covered by any of the preceding ones.
+ */
+ for (c = 0; c < nclauses; c++)
+ {
+ bool covered = false;
+ Bitmapset *clause_attnums = clauses_attnums[c];
+ Bitmapset *tmp = NULL;
+
+ /*
+ * If this clause is not covered by this statistics, we
+ * can't use the statistics to estimate it at all.
+ */
+ if (! cover_map[i * nclauses + c])
+ continue;
+
+ /*
+ * Now we know we'll use this clause - either as a condition
+ * or as a new clause (the estimated one). So let's add the
+ * attributes to the attnums from all the clauses usable with
+ * this statistics.
+ */
+ tmp = bms_union(all_attnums, clause_attnums);
+
+ /* free the old bitmap */
+ bms_free(all_attnums);
+ all_attnums = tmp;
+
+ /* let's see if it's covered by any of the previous stats */
+ for (j = 0; j < step; j++)
+ {
+ /* already covered by the previous stats */
+ if (cover_map[current->stats[j] * nclauses + c])
+ covered = true;
+
+ if (covered)
+ break;
+ }
+
+ /* if already covered, continue with the next clause */
+ if (covered)
+ {
+ ncovered_conditions += 1;
+ continue;
+ }
+
+ /*
+ * OK, this clause is covered by this statistics (and not by
+ * any of the previous ones)
+ */
+ ncovered_clauses += 1;
+
+ /* add the attnums into attnums from 'new clauses' */
+ // new_attnums = bms_union(new_attnums, clause_attnums);
+ }
+
+ /* can't have more new clauses than original clauses */
+ Assert(nclauses >= ncovered_clauses);
+ Assert(ncovered_clauses >= 0); /* mostly paranoia */
+
+ nattnums = bms_num_members(all_attnums);
+
+ /* free all the bitmapsets - we don't need them anymore */
+ bms_free(all_attnums);
+ bms_free(new_attnums);
+
+ all_attnums = NULL;
+ new_attnums = NULL;
+
+ /*
+ * See which conditions (clauses passed from above) are
+ * covered by this statistics.
+ */
+ for (c = 0; c < nconditions; c++)
+ {
+ Bitmapset *clause_attnums = conditions_attnums[c];
+ Bitmapset *tmp = NULL;
+
+ /*
+ * If this condition is not covered by this statistics,
+ * we can't use it here at all.
+ */
+ if (! condition_map[i * nconditions + c])
+ continue;
+
+ /* count this as a condition */
+ ncovered_conditions += 1;
+
+ /*
+ * Now we know we'll use this clause - either as a condition
+ * or as a new clause (the estimated one). So let's add the
+ * attributes to the attnums from all the clauses usable with
+ * this statistics.
+ */
+ tmp = bms_union(all_attnums, clause_attnums);
+
+ /* free the old bitmap */
+ bms_free(all_attnums);
+ all_attnums = tmp;
+ }
+
+ /*
+ * Let's mark the statistics as 'ruled out' - either we'll use
+ * it (and proceed to the next step), or it's incompatible.
+ */
+ ruled_out[i] = step;
+
+ /*
+ * There are no clauses usable with this statistics (not already
+ * covered by some of the previous stats).
+ *
+ * Similarly, if the clauses only use a single attribute, we
+ * can't really use that.
+ */
+ if ((ncovered_clauses == 0) || (nattnums < 2))
+ continue;
+
+ /*
+ * TODO Not sure if it's possible to add a clause referencing
+ * only attributes already covered by previous stats?
+ * Introducing only some new dependency, not a new
+ * attribute. Couldn't come up with an example, though.
+ * Might be worth adding some assert.
+ */
+
+ /*
+ * got a suitable statistics - let's update the current solution,
+ * maybe use it as the best solution
+ */
+ current->nclauses += ncovered_clauses;
+ current->nconditions += ncovered_conditions;
+ current->nstats += 1;
+ current->stats[step] = i;
+
+ /*
+ * We can never cover more clauses, or use more stats than we
+ * actually have at the beginning.
+ */
+ Assert(nclauses >= current->nclauses);
+ Assert(nmvstats >= current->nstats);
+ Assert(step < nmvstats);
+
+ /* we can't get more conditions than clauses and conditions combined
+ *
+ * FIXME This assert does not work because we count the conditions
+ * repeatedly (once for each statistics covering it).
+ */
+ /* Assert((nconditions + nclauses) >= current->nconditions); */
+
+ if (*best == NULL)
+ {
+ *best = (mv_solution_t*)palloc0(sizeof(mv_solution_t));
+ (*best)->stats = (int*)palloc0(sizeof(int)*nmvstats);
+ (*best)->nstats = 0;
+ (*best)->nclauses = 0;
+ (*best)->nconditions = 0;
+ }
+
+ /* see if it's better than the current 'best' solution */
+ if ((current->nclauses > (*best)->nclauses) ||
+ ((current->nclauses == (*best)->nclauses) &&
+ ((current->nstats > (*best)->nstats))))
+ {
+ (*best)->nstats = current->nstats;
+ (*best)->nclauses = current->nclauses;
+ (*best)->nconditions = current->nconditions;
+ memcpy((*best)->stats, current->stats, nmvstats * sizeof(int));
+ }
+
+ /*
+ * The recursion only makes sense if we haven't covered all the
+ * attributes (then adding stats is not really possible).
+ */
+ if ((step + 1) < nmvstats)
+ choose_mv_statistics_exhaustive(root, step+1,
+ nmvstats, mvstats, stats_attnums,
+ nclauses, clauses, clauses_attnums,
+ nconditions, conditions, conditions_attnums,
+ cover_map, condition_map, ruled_out,
+ current, best);
+
+ /* reset the last step */
+ current->nclauses -= ncovered_clauses;
+ current->nconditions -= ncovered_conditions;
+ current->nstats -= 1;
+ current->stats[step] = 0;
+
+ /* mark the statistics as usable again */
+ ruled_out[i] = -1;
+
+ Assert(current->nclauses >= 0);
+ Assert(current->nstats >= 0);
+ }
+
+ /* reset all statistics eliminated in this step */
+ for (i = 0; i < nmvstats; i++)
+ if (ruled_out[i] == step)
+ ruled_out[i] = -1;
+
+}
+
+/*
+ * Greedy search for a multivariate solution - a sequence of statistics
+ * covering the clauses. This chooses the "best" statistics at each step,
+ * so the resulting solution may not be the best solution globally, but
+ * this produces the solution in only N steps (where N is the number of
+ * statistics), while the exhaustive approach may have to walk through
+ * ~N! combinations (although some of those are terminated early).
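+ *
+ * For example, with clauses on columns (a,b,c,d) and statistics
+ * on [a,b], [b,c] and [c,d], the greedy search picks one
+ * statistics per step (3 steps at most), while the exhaustive
+ * search may walk through up to 3! = 6 orderings of those stats.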
+ *
+ * TODO There are probably other metrics we might use - e.g. using
+ * number of columns (num_cond_columns / num_cov_columns), which
+ * might work better with a mix of simple and complex clauses.
+ *
+ * TODO Also the choice at the very first step should be handled
+ * in a special way, because there will be 0 conditions at that
+ * moment, so there needs to be some other criterion - e.g. using
+ * the simplest (or most complex?) clause might be a good idea.
+ *
+ * TODO We might also select multiple stats using different criteria,
+ * and branch the search. This is however tricky, because if we
+ * choose k statistics at each step, we get k^N branches to
+ * walk through (with N steps). That's not really good with
+ * a large number of stats (yet still better than exhaustive search).
+ */
+static void
+choose_mv_statistics_greedy(PlannerInfo *root, int step,
+ int nmvstats, MVStats mvstats, Bitmapset ** stats_attnums,
+ int nclauses, Node ** clauses, Bitmapset ** clauses_attnums,
+ int nconditions, Node ** conditions, Bitmapset ** conditions_attnums,
+ bool *cover_map, bool *condition_map, int *ruled_out,
+ mv_solution_t *current, mv_solution_t **best)
+{
+ int i, j;
+ int best_stat = -1;
+ double gain, max_gain = -1.0;
+
+ /*
+ * Bitmap tracking which clauses are already covered (by the previous
+ * statistics) and may thus serve only as a condition in this step.
+ */
+ bool *covered_clauses = (bool*)palloc0(nclauses * sizeof(bool));
+
+ /*
+ * Number of clauses and columns covered by each statistics - this
+ * includes both conditions and clauses covered by the statistics for
+ * the first time. The number of columns may count some columns
+ * repeatedly - if a column is shared by multiple clauses, it will
+ * be counted once for each clause (covered by the statistics).
+ * So with two clauses [(a=1 OR b=2),(a<2 OR c>1)] the column "a"
+ * will be counted twice (if both clauses are covered).
+ *
+ * The values for reduded statistics (that can't be applied) are
+ * not computed, because that'd be pointless.
+ */
+ int *num_cov_clauses = (int*)palloc0(sizeof(int) * nmvstats);
+ int *num_cov_columns = (int*)palloc0(sizeof(int) * nmvstats);
+
+ /*
+ * Same as above, but this only includes clauses that are already
+ * covered by the previous stats (and the current one).
+ */
+ int *num_cond_clauses = (int*)palloc0(sizeof(int) * nmvstats);
+ int *num_cond_columns = (int*)palloc0(sizeof(int) * nmvstats);
+
+ /*
+ * Number of attributes for each clause.
+ *
+ * TODO Might be computed in choose_mv_statistics() and then passed
+ * here, but then the function would not have the same signature
+ * as _exhaustive().
+ */
+ int *attnum_counts = (int*)palloc0(sizeof(int) * nclauses);
+ int *attnum_cond_counts = (int*)palloc0(sizeof(int) * nconditions);
+
+ CHECK_FOR_INTERRUPTS();
+
+ Assert(best != NULL);
+ Assert((step == 0 && current == NULL) || (step > 0 && current != NULL));
+
+ /* compute attributes (columns) for each clause */
+ for (i = 0; i < nclauses; i++)
+ attnum_counts[i] = bms_num_members(clauses_attnums[i]);
+
+ /* compute attributes (columns) for each condition */
+ for (i = 0; i < nconditions; i++)
+ attnum_cond_counts[i] = bms_num_members(conditions_attnums[i]);
+
+ /* see which clauses are already covered at this point (by previous stats) */
+ for (i = 0; i < step; i++)
+ for (j = 0; j < nclauses; j++)
+ covered_clauses[j] |= (cover_map[current->stats[i] * nclauses + j]);
+
+ /* which remaining statistics covers most clauses / uses most conditions? */
+ for (i = 0; i < nmvstats; i++)
+ {
+ Bitmapset *attnums_covered = NULL;
+ Bitmapset *attnums_conditions = NULL;
+
+ /* skip stats that are already ruled out (either used or inapplicable) */
+ if (ruled_out[i] != -1)
+ continue;
+
+ /* count covered clauses and conditions (for the statistics) */
+ for (j = 0; j < nclauses; j++)
+ {
+ if (cover_map[i * nclauses + j])
+ {
+ Bitmapset *new = bms_union(attnums_covered, clauses_attnums[j]);
+
+ /* get rid of the old bitmap and keep the unified result */
+ bms_free(attnums_covered);
+ attnums_covered = new;
+
+ num_cov_clauses[i] += 1;
+ num_cov_columns[i] += attnum_counts[j];
+
+ /* is the clause already covered (i.e. a condition)? */
+ if (covered_clauses[j])
+ {
+ num_cond_clauses[i] += 1;
+ num_cond_columns[i] += attnum_counts[j];
+ new = bms_union(attnums_conditions,
+ clauses_attnums[j]);
+
+ bms_free(attnums_conditions);
+ attnums_conditions = new;
+ }
+ }
+ }
+
+ /* if all covered clauses are covered by prev stats (thus conditions) */
+ if (num_cov_clauses[i] == num_cond_clauses[i])
+ ruled_out[i] = step;
+
+ /* same if there are no new attributes */
+ else if (bms_num_members(attnums_conditions) == bms_num_members(attnums_covered))
+ ruled_out[i] = step;
+
+ bms_free(attnums_covered);
+ bms_free(attnums_conditions);
+
+ /* if the statistics is inapplicable, try the next one */
+ if (ruled_out[i] != -1)
+ continue;
+
+ /* now let's walk through conditions and count the covered */
+ for (j = 0; j < nconditions; j++)
+ {
+ if (condition_map[i * nconditions + j])
+ {
+ num_cond_clauses[i] += 1;
+ num_cond_columns[i] += attnum_cond_counts[j];
+ }
+ }
+
+ /* otherwise see if this statistics improves the gain metric */
+ gain = num_cond_columns[i] / (double)num_cov_columns[i];
+
+ if (gain > max_gain)
+ {
+ max_gain = gain;
+ best_stat = i;
+ }
+ }
+
+ /*
+ * Have we found a suitable statistics? Add it to the solution and
+ * try next step.
+ */
+ if (best_stat != -1)
+ {
+ /* mark the statistics, so that we skip it in next steps */
+ ruled_out[best_stat] = step;
+
+ /* allocate current solution if necessary */
+ if (current == NULL)
+ {
+ current = (mv_solution_t*)palloc0(sizeof(mv_solution_t));
+ current->stats = (int*)palloc0(sizeof(int)*nmvstats);
+ current->nstats = 0;
+ current->nclauses = 0;
+ current->nconditions = 0;
+ }
+
+ current->nclauses += num_cov_clauses[best_stat];
+ current->nconditions += num_cond_clauses[best_stat];
+ current->stats[step] = best_stat;
+ current->nstats++;
+
+ if (*best == NULL)
+ {
+ (*best) = (mv_solution_t*)palloc0(sizeof(mv_solution_t));
+ (*best)->nstats = current->nstats;
+ (*best)->nclauses = current->nclauses;
+ (*best)->nconditions = current->nconditions;
+
+ (*best)->stats = (int*)palloc0(sizeof(int)*nmvstats);
+ memcpy((*best)->stats, current->stats, nmvstats * sizeof(int));
+ }
+ else
+ {
+ /* see if this is a better solution */
+ double current_gain = (double)current->nconditions / current->nclauses;
+ double best_gain = (double)(*best)->nconditions / (*best)->nclauses;
+
+ if ((current_gain > best_gain) ||
+ ((current_gain == best_gain) && (current->nstats < (*best)->nstats)))
+ {
+ (*best)->nstats = current->nstats;
+ (*best)->nclauses = current->nclauses;
+ (*best)->nconditions = current->nconditions;
+ memcpy((*best)->stats, current->stats, nmvstats * sizeof(int));
+ }
+ }
+
+ /*
+ * The recursion only makes sense if we haven't covered all the
+ * attributes (then adding stats is not really possible).
+ */
+ if ((step + 1) < nmvstats)
+ choose_mv_statistics_greedy(root, step+1,
+ nmvstats, mvstats, stats_attnums,
+ nclauses, clauses, clauses_attnums,
+ nconditions, conditions, conditions_attnums,
+ cover_map, condition_map, ruled_out,
+ current, best);
+
+ /* reset the last step */
+ current->nclauses -= num_cov_clauses[best_stat];
+ current->nconditions -= num_cond_clauses[best_stat];
+ current->nstats -= 1;
+ current->stats[step] = 0;
+
+ /* mark the statistics as usable again */
+ ruled_out[best_stat] = -1;
+ }
+
+ /* reset all statistics eliminated in this step */
+ for (i = 0; i < nmvstats; i++)
+ if (ruled_out[i] == step)
+ ruled_out[i] = -1;
+
+ /* free everything allocated in this step */
+ pfree(covered_clauses);
+ pfree(attnum_counts);
+ pfree(attnum_cond_counts);
+ pfree(num_cov_clauses);
+ pfree(num_cov_columns);
+ pfree(num_cond_clauses);
+ pfree(num_cond_columns);
+}
+
+/*
* We're looking for statistics matching at least 2 attributes,
* referenced in the clauses compatible with multivariate statistics.
* The current selection criteria is very simple - we choose the
@@ -1299,48 +2213,386 @@ collect_mv_attnums(PlannerInfo *root, List *clauses, Oid varRelid,
* TODO This will probably have to consider compatibility of clauses,
* because 'dependencies' will probably work only with equality
* clauses.
+ *
+ * TODO Another way to make the optimization problems smaller might
+ * be splitting the statistics into several disjoint subsets, i.e.
+ * if we can split the graph of statistics (after the elimination)
+ * into multiple components (so that stats in different components
+ * share no attributes), we can do the optimization for each
+ * component separately.
+ *
+ * TODO Another possible optimization might be removing redundant
+ * statistics - if statistics S1 covers S2 (covers S2 attributes
+ * and possibly some more), we can probably remove S2. What
+ * actually matters are attributes from covered clauses (not all
+ * the original attributes). This might however prefer larger,
+ * and thus less accurate, statistics.
+ *
+ * TODO If we could compute what is a "perfect solution" maybe we could
+ * terminate the search after reaching ~90% of it? Say, if we knew
+ * that we can cover 10 clauses and reuse 8 dependencies, maybe
+ * covering 9 clauses and 7 dependencies would be OK?
*/
-static int
-choose_mv_statistics(int nmvstats, MVStats mvstats, Bitmapset *attnums)
+static mv_solution_t *
+choose_mv_statistics(PlannerInfo *root, int nmvstats, MVStats mvstats,
+ List *clauses, List *conditions,
+ Oid varRelid, SpecialJoinInfo *sjinfo, int type)
{
int i, j;
+ mv_solution_t *best = NULL;
+ ListCell *l;
+
+ /* pass only stats matching at least two attributes (from clauses) */
+ MVStats mvstats_filtered = (MVStats)palloc0(nmvstats * sizeof(MVStatsData));
+ int nmvstats_filtered;
+ bool repeat = true;
+ bool *clause_cover_map = NULL,
+ *condition_cover_map = NULL;
+ int *ruled_out = NULL;
+
+ /* build bitmapsets for all stats and clauses */
+ Bitmapset **stats_attnums;
+ Bitmapset **clauses_attnums;
+ Bitmapset **conditions_attnums;
+
+ int nclauses, nconditions;
+ Node ** clauses_array;
+ Node ** conditions_array;
- int choice = -1;
- int current_matches = 1; /* goal #1: maximize */
- int current_dims = (MVSTATS_MAX_DIMENSIONS+1); /* goal #2: minimize */
+ /* copy lists, so that we can free them during elimination easily */
+ clauses = list_copy(clauses);
+ conditions = list_copy(conditions);
/*
- * Walk through the statistics (simple array with nmvstats elements)
- * and for each one count the referenced attributes (encoded in
- * the 'attnums' bitmap).
+ * Reduce the optimization problem size as much as possible.
+ *
+ * Eliminate clauses and conditions not covered by any statistics,
+ * or statistics not matching at least two attributes (one of them
+ * has to be in a regular clause).
+ *
+ * It's possible that removing a statistics in one iteration
+ * eliminates a clause in the next one, so we'll repeat this until we
+ * eliminate no clauses/stats in that iteration.
+ *
+ * This can only happen after eliminating a statistics - clauses are
+ * eliminated first, so statistics always reflect that.
*/
+ while (repeat)
+ {
+ /* pass only mv-compatible clauses covered by at least one statistics */
+ List *compatible_clauses = NIL;
+ List *compatible_conditions = NIL;
+
+ Bitmapset *compatible_attnums = NULL;
+ Bitmapset *condition_attnums = NULL;
+ Bitmapset *all_attnums = NULL;
+
+ /*
+ * Clauses
+ *
+ * Walk through clauses and keep only those covered by at least
+ * one of the statistics we still have. Also, collect bitmap of
+ * attributes so that we can make sure we add at least one new.
+ */
+ foreach (l, clauses)
+ {
+ Node *clause = (Node*)lfirst(l);
+ Bitmapset *clause_attnums = NULL;
+ Oid relid;
+
+ /*
+ * The clause has to be mv-compatible (suitable operators etc.).
+ */
+ if (! clause_is_mv_compatible(root, clause, varRelid,
+ &relid, &clause_attnums, sjinfo, type))
+ continue;
+
+ /* is there a statistics covering this clause? */
+ for (i = 0; i < nmvstats; i++)
+ {
+ int k, matches = 0;
+ for (k = 0; k < mvstats[i].stakeys->dim1; k++)
+ {
+ if (bms_is_member(mvstats[i].stakeys->values[k],
+ clause_attnums))
+ matches += 1;
+ }
+
+ /*
+ * The clause is compatible if all attributes it references
+ * are covered by the statistics.
+ */
+ if (bms_num_members(clause_attnums) == matches)
+ {
+ compatible_attnums = bms_union(compatible_attnums,
+ clause_attnums);
+ compatible_clauses = lappend(compatible_clauses,
+ clause);
+ break;
+ }
+ }
+
+ bms_free(clause_attnums);
+ }
+
+ /* we can't have more compatible clauses than source clauses */
+ Assert(list_length(clauses) >= list_length(compatible_clauses));
+
+ /* work with only compatible clauses from now */
+ list_free(clauses);
+ clauses = compatible_clauses;
+
+ /*
+ * Conditions
+ *
+ * Walk through conditions and keep only those covered by at
+ * least one of the statistics we still have. Also, collect a
+ * bitmap of their attributes.
+ */
+
+ /* next, generate bitmap of attnums from all mv_compatible conditions */
+ foreach (l, conditions)
+ {
+ Node *clause = (Node*)lfirst(l);
+ Bitmapset *clause_attnums = NULL;
+ Oid relid;
+
+ /*
+ * The clause has to be mv-compatible (suitable operators etc.).
+ */
+ if (! clause_is_mv_compatible(root, clause, varRelid,
+ &relid, &clause_attnums, sjinfo, type))
+ continue;
+
+ /* is there a statistics covering this clause? */
+ for (i = 0; i < nmvstats; i++)
+ {
+ int k, matches = 0;
+ for (k = 0; k < mvstats[i].stakeys->dim1; k++)
+ {
+ if (bms_is_member(mvstats[i].stakeys->values[k],
+ clause_attnums))
+ matches += 1;
+ }
+
+ if (bms_num_members(clause_attnums) == matches)
+ {
+ condition_attnums = bms_union(condition_attnums,
+ clause_attnums);
+ compatible_conditions = lappend(compatible_conditions,
+ clause);
+ break;
+ }
+ }
+
+ bms_free(clause_attnums);
+ }
+
+ /* we can't have more compatible conditions than source conditions */
+ Assert(list_length(conditions) >= list_length(compatible_conditions));
+
+ /* keep only compatible clauses */
+ list_free(conditions);
+ conditions = compatible_conditions;
+
+ /* get a union of attnums (from conditions and clauses) */
+ all_attnums = bms_union(compatible_attnums, condition_attnums);
+
+ /*
+ * Statistics
+ *
+ * Walk through statistics and only keep those covering at least
+ * one new attribute (excluding conditions) and at least two
+ * attributes in clauses and conditions combined.
+ */
+ nmvstats_filtered = 0;
+
+ for (i = 0; i < nmvstats; i++)
+ {
+ int k;
+ int matches_new = 0,
+ matches_all = 0;
+
+ for (k = 0; k < mvstats[i].stakeys->dim1; k++)
+ {
+ /* attribute covered by new clause(s) */
+ if (bms_is_member(mvstats[i].stakeys->values[k],
+ compatible_attnums))
+ matches_new += 1;
+
+ /* attribute covered by clause(s) or condition(s) */
+ if (bms_is_member(mvstats[i].stakeys->values[k],
+ all_attnums))
+ matches_all += 1;
+ }
+
+ /* check we have enough attributes for this statistics */
+ if ((matches_new >= 1) && (matches_all >= 2))
+ {
+ mvstats_filtered[nmvstats_filtered] = mvstats[i];
+ nmvstats_filtered += 1;
+ }
+ }
+
+ /* we can't have more useful stats than we had originally */
+ Assert(nmvstats >= nmvstats_filtered);
+
+ /* if we've eliminated a statistics, trigger another round */
+ repeat = (nmvstats > nmvstats_filtered);
+
+ /*
+ * work only with filtered statistics from now
+ *
+ * FIXME This rewrites the input 'mvstats' array, which is not
+ * exactly pretty as it's an unexpected side-effect (the
+ * caller may use the stats for something else). But the
+ * solution contains indexes into this 'reduced' array so
+ * we can't stop doing that easily.
+ *
+ * Another issue is that we only modify the local 'mvstats'
+ * value, so the caller will still see the original number
+ * of stats (and thus maybe duplicate entries).
+ *
+ * We should make a copy of the array, and only mess with
+ * that copy (and map the indexes to the original ones at
+ * the end, when returning the solution to the user). Or
+ * simply work with OIDs.
+ */
+ if (nmvstats_filtered < nmvstats)
+ {
+ nmvstats = nmvstats_filtered;
+ memcpy(mvstats, mvstats_filtered, sizeof(MVStatsData)*nmvstats);
+ nmvstats_filtered = 0;
+ }
+ }
+
+ /* only do the optimization if we have clauses/statistics */
+ if ((nmvstats == 0) || (list_length(clauses) == 0))
+ return NULL;
+
+ stats_attnums
+ = (Bitmapset **)palloc0(nmvstats * sizeof(Bitmapset *));
+
+ /*
+ * TODO We should sort the stats to make the order deterministic,
+ * otherwise we may get different estimates on different
+ * executions - if there are multiple "equally good" solutions,
+ * we'll keep the first solution we see.
+ *
+ * Sorting by OID probably is not the right solution though,
+ * because we'd like it to be somehow reproducible,
+ * irrespective of the order of ADD STATISTICS commands.
+ * So maybe statkeys?
+ */
+
for (i = 0; i < nmvstats; i++)
{
- /* columns matching this statistics */
- int matches = 0;
+ for (j = 0; j < mvstats[i].stakeys->dim1; j++)
+ stats_attnums[i] = bms_add_member(stats_attnums[i],
+ mvstats[i].stakeys->values[j]);
+ }
- int2vector * attrs = mvstats[i].stakeys;
- int numattrs = mvstats[i].stakeys->dim1;
+ /* collect clauses and bitmaps of attnums */
+ nclauses = 0;
+ clauses_attnums = (Bitmapset **)palloc0(list_length(clauses)
+ * sizeof(Bitmapset *));
+ clauses_array = (Node **)palloc0(list_length(clauses)
+ * sizeof(Node *));
- /* count columns covered by the histogram */
- for (j = 0; j < numattrs; j++)
- if (bms_is_member(attrs->values[j], attnums))
- matches++;
+ foreach (l, clauses)
+ {
+ Oid relid;
+ Bitmapset * attnums = NULL;
/*
- * Use this statistics when it improves the number of matches or
- * when it matches the same number of attributes but is smaller.
+ * The clause has to be mv-compatible (suitable operators etc.).
*/
- if ((matches > current_matches) ||
- ((matches == current_matches) && (current_dims > numattrs)))
+ if (! clause_is_mv_compatible(root, (Node *)lfirst(l), varRelid,
+ &relid, &attnums, sjinfo, type))
+ elog(ERROR, "should not get non-mv-compatible clause");
+
+ clauses_attnums[nclauses] = attnums;
+ clauses_array[nclauses] = (Node *)lfirst(l);
+ nclauses += 1;
+ }
+
+ /* collect conditions and bitmap of attnums */
+ nconditions = 0;
+ conditions_attnums = (Bitmapset **)palloc0(list_length(conditions)
+ * sizeof(Bitmapset *));
+ conditions_array = (Node **)palloc0(list_length(conditions)
+ * sizeof(Node *));
+
+ foreach (l, conditions)
+ {
+ Oid relid;
+ Bitmapset * attnums = NULL;
+
+ /* conditions are mv-compatible (thanks to the reduction) */
+ if (! clause_is_mv_compatible(root, (Node *)lfirst(l), varRelid,
+ &relid, &attnums, sjinfo, type))
+ elog(ERROR, "should not get non-mv-compatible clause");
+
+ conditions_attnums[nconditions] = attnums;
+ conditions_array[nconditions] = (Node *)lfirst(l);
+ nconditions += 1;
+ }
+
+ /*
+ * Build bitmaps with info about which clauses/conditions are
+ * covered by each statistics (so that we don't need to call the
+ * bms_is_subset over and over again).
+ */
+ clause_cover_map = (bool*)palloc0(nclauses * nmvstats * sizeof(bool));
+ condition_cover_map = (bool*)palloc0(nconditions * nmvstats * sizeof(bool));
+ ruled_out = (int*)palloc0(nmvstats * sizeof(int));
+
+ for (i = 0; i < nmvstats; i++)
+ {
+ ruled_out[i] = -1; /* not ruled out by default */
+ for (j = 0; j < nclauses; j++)
+ {
+ clause_cover_map[i * nclauses + j]
+ = bms_is_subset(clauses_attnums[j],
+ stats_attnums[i]);
+ }
+
+ for (j = 0; j < nconditions; j++)
{
- choice = i;
- current_matches = matches;
- current_dims = numattrs;
+ condition_cover_map[i * nconditions + j]
+ = bms_is_subset(conditions_attnums[j],
+ stats_attnums[i]);
}
}
- return choice;
+ /* do the optimization itself */
+ if (mvstat_search_type == MVSTAT_SEARCH_EXHAUSTIVE)
+ choose_mv_statistics_exhaustive(root, 0,
+ nmvstats, mvstats, stats_attnums,
+ nclauses, clauses_array, clauses_attnums,
+ nconditions, conditions_array, conditions_attnums,
+ clause_cover_map, condition_cover_map,
+ ruled_out, NULL, &best);
+ else
+ choose_mv_statistics_greedy(root, 0,
+ nmvstats, mvstats, stats_attnums,
+ nclauses, clauses_array, clauses_attnums,
+ nconditions, conditions_array, conditions_attnums,
+ clause_cover_map, condition_cover_map,
+ ruled_out, NULL, &best);
+
+ /* maybe we should leave the cleanup up to the memory context */
+ pfree(mvstats_filtered);
+ pfree(stats_attnums);
+ pfree(clauses_attnums);
+ pfree(clauses_array);
+ pfree(conditions_attnums);
+ pfree(conditions_array);
+ pfree(clause_cover_map);
+ pfree(condition_cover_map);
+ pfree(ruled_out);
+
+ return best;
}
@@ -1624,6 +2876,51 @@ clause_is_mv_compatible(PlannerInfo *root, Node *clause, Oid varRelid,
return false;
}
+
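+/*
+ * Recursively collect attnums referenced by a clause. Only simple
+ * opclauses, IS NULL tests and AND/OR clauses are handled, matching
+ * what clause_is_mv_compatible() accepts - the clause is expected
+ * to have already passed that check.
+ */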
+static Bitmapset *
+clause_mv_get_attnums(PlannerInfo *root, Node *clause)
+{
+ Bitmapset * attnums = NULL;
+
+ /* Extract clause from restrict info, if needed. */
+ if (IsA(clause, RestrictInfo))
+ clause = (Node*)((RestrictInfo*)clause)->clause;
+
+ /*
+ * Only simple opclauses, IS NULL tests and AND/OR clauses are
+ * compatible with multivariate stats at this point.
+ */
+ if ((is_opclause(clause))
+ && (list_length(((OpExpr *) clause)->args) == 2))
+ {
+ OpExpr *expr = (OpExpr *) clause;
+
+ if (IsA(linitial(expr->args), Var))
+ attnums = bms_add_member(attnums,
+ ((Var*)linitial(expr->args))->varattno);
+ else
+ attnums = bms_add_member(attnums,
+ ((Var*)lsecond(expr->args))->varattno);
+ }
+ else if (IsA(clause, NullTest)
+ && IsA(((NullTest*)clause)->arg, Var))
+ {
+ attnums = bms_add_member(attnums,
+ ((Var*)((NullTest*)clause)->arg)->varattno);
+ }
+ else if (or_clause(clause) || and_clause(clause))
+ {
+ ListCell *l;
+ foreach (l, ((BoolExpr*)clause)->args)
+ {
+ attnums = bms_join(attnums,
+ clause_mv_get_attnums(root, (Node*)lfirst(l)));
+ }
+ }
+
+ return attnums;
+}
+
/*
* Performs reduction of clauses using functional dependencies, i.e.
* removes clauses that are considered redundant. It simply walks
@@ -2049,20 +3346,24 @@ clauselist_apply_dependencies(PlannerInfo *root, List *clauses, Oid varRelid,
* as the clauses are processed (and skip items that are 'match').
*/
static Selectivity
-clauselist_mv_selectivity_mcvlist(PlannerInfo *root, List *clauses,
- MVStats mvstats, bool *fullmatch,
- Selectivity *lowsel)
+clauselist_mv_selectivity_mcvlist(PlannerInfo *root, MVStats mvstats,
+ List *clauses, List *conditions,
+ bool *fullmatch, Selectivity *lowsel)
{
int i;
Selectivity s = 0.0;
+ Selectivity t = 0.0;
MCVList mcvlist = NULL;
+
int nmatches = 0;
+ int nconditions = 0;
/* match/mismatch bitmap for each MCV item */
char * matches = NULL;
+ char * condition_matches = NULL;
Assert(clauses != NIL);
- Assert(list_length(clauses) >= 2);
+ Assert(list_length(clauses) >= 1);
/* there's no MCV list built yet */
if (! mvstats->mcv_built)
@@ -2073,13 +3374,34 @@ clauselist_mv_selectivity_mcvlist(PlannerInfo *root, List *clauses,
Assert(mcvlist != NULL);
Assert(mcvlist->nitems > 0);
- /* by default all the MCV items match the clauses fully */
- matches = palloc0(sizeof(char) * mcvlist->nitems);
- memset(matches, MVSTATS_MATCH_FULL, sizeof(char)*mcvlist->nitems);
-
/* number of matching MCV items */
nmatches = mcvlist->nitems;
+ nconditions = mcvlist->nitems;
+
+ /* conditions */
+ condition_matches = palloc0(sizeof(char) * nconditions);
+ memset(condition_matches, MVSTATS_MATCH_FULL, sizeof(char)*nconditions);
+ /* by default all the MCV items match the clauses fully */
+ matches = palloc0(sizeof(char) * nmatches);
+ memset(matches, MVSTATS_MATCH_FULL, sizeof(char)*nmatches);
+
+ /*
+ * build the match bitmap for the conditions
+ */
+ if (conditions != NIL)
+ nconditions = update_match_bitmap_mcvlist(root, conditions,
+ mvstats->stakeys, mcvlist,
+ nconditions, condition_matches,
+ lowsel, fullmatch, false);
+
+ /*
+ * build the match bitmap for the estimated clauses
+ *
+ * TODO This evaluates the clauses for all MCV items, even those
+ * ruled out by the conditions. The final result should be the
+ * same, but it might be faster.
+ */
nmatches = update_match_bitmap_mcvlist(root, clauses,
mvstats->stakeys, mcvlist,
nmatches, matches,
@@ -2088,14 +3410,25 @@ clauselist_mv_selectivity_mcvlist(PlannerInfo *root, List *clauses,
/* sum frequencies for all the matching MCV items */
for (i = 0; i < mcvlist->nitems; i++)
{
+ /* skip MCV items not matching the conditions */
+ if (condition_matches[i] == MVSTATS_MATCH_NONE)
+ continue;
+
if (matches[i] != MVSTATS_MATCH_NONE)
s += mcvlist->items[i]->frequency;
+
+ t += mcvlist->items[i]->frequency;
}
pfree(matches);
+ pfree(condition_matches);
pfree(mcvlist);
- return s;
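+ /*
+ * The result is a conditional probability - 's' sums frequencies
+ * of MCV items matching both the conditions and the clauses, 't'
+ * sums the items matching the conditions alone, so (s / t)
+ * estimates P(clauses | conditions).
+ */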
+ /* no condition matches */
+ if (t == 0.0)
+ return (Selectivity)0.0;
+
+ return (s / t);
}
/*
@@ -2490,13 +3823,17 @@ update_match_bitmap_mcvlist(PlannerInfo *root, List *clauses,
* this is not uncommon, but for histograms it's not that clear.
*/
static Selectivity
-clauselist_mv_selectivity_histogram(PlannerInfo *root, List *clauses,
- MVStats mvstats)
+clauselist_mv_selectivity_histogram(PlannerInfo *root, MVStats mvstats,
+ List *clauses, List *conditions)
{
int i;
Selectivity s = 0.0;
+ Selectivity t = 0.0;
int nmatches = 0;
+ int nconditions = 0;
char *matches = NULL;
+ char *condition_matches = NULL;
+
MVHistogram mvhist = NULL;
/* there's no histogram */
@@ -2508,18 +3845,34 @@ clauselist_mv_selectivity_histogram(PlannerInfo *root, List *clauses,
Assert (mvhist != NULL);
Assert (clauses != NIL);
- Assert (list_length(clauses) >= 2);
+ Assert (list_length(clauses) >= 1);
+
+ nmatches = mvhist->nbuckets;
+ nconditions = mvhist->nbuckets;
/*
* Bitmap of bucket matches (mismatch, partial, full). by default
* all buckets fully match (and we'll eliminate them).
*/
- matches = palloc0(sizeof(char) * mvhist->nbuckets);
- memset(matches, MVSTATS_MATCH_FULL, sizeof(char)*mvhist->nbuckets);
+ matches = palloc0(sizeof(char) * nmatches);
+ memset(matches, MVSTATS_MATCH_FULL, sizeof(char)*nmatches);
- nmatches = mvhist->nbuckets;
+ condition_matches = palloc0(sizeof(char)*nconditions);
+ memset(condition_matches, MVSTATS_MATCH_FULL, sizeof(char)*nconditions);
+
+ /* build the match bitmap for the conditions */
+ if (conditions != NIL)
+ update_match_bitmap_histogram(root, conditions,
+ mvstats->stakeys, mvhist,
+ nconditions, condition_matches, false);
- /* build the match bitmap */
+ /*
+ * build the match bitmap for the estimated clauses
+ *
+ * TODO This evaluates the clauses for all buckets, even those
+ * ruled out by the conditions. The final result should be
+ * the same, but it might be faster.
+ */
update_match_bitmap_histogram(root, clauses,
mvstats->stakeys, mvhist,
nmatches, matches, false);
@@ -2527,17 +3880,37 @@ clauselist_mv_selectivity_histogram(PlannerInfo *root, List *clauses,
/* now, walk through the buckets and sum the selectivities */
for (i = 0; i < mvhist->nbuckets; i++)
{
+ float coeff = 1.0;
+
+ /* skip buckets not matching the conditions */
+ if (condition_matches[i] == MVSTATS_MATCH_NONE)
+ continue;
+ else if (condition_matches[i] == MVSTATS_MATCH_PARTIAL)
+ coeff = 0.5;
+
+ t += coeff * mvhist->buckets[i]->ntuples;
+
if (matches[i] == MVSTATS_MATCH_FULL)
- s += mvhist->buckets[i]->ntuples;
+ s += coeff * mvhist->buckets[i]->ntuples;
else if (matches[i] == MVSTATS_MATCH_PARTIAL)
- s += 0.5 * mvhist->buckets[i]->ntuples;
+ /*
+ * TODO If both conditions and clauses match partially, this
+ * will use a 0.25 match - not sure if that's the
+ * right solution, but it seems about right.
+ */
+ s += coeff * 0.5 * mvhist->buckets[i]->ntuples;
}
/* release the allocated bitmap and deserialized histogram */
pfree(matches);
+ pfree(condition_matches);
pfree(mvhist);
- return s;
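+ /*
+ * As with the MCV list, (s / t) estimates P(clauses | conditions),
+ * except that partially matching buckets are weighted by 0.5 on
+ * both the condition and the clause side.
+ */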
+ /* no condition matches */
+ if (t == 0.0)
+ return (Selectivity)0.0;
+
+ return (s / t);
}
/*
diff --git a/src/backend/optimizer/path/costsize.c b/src/backend/optimizer/path/costsize.c
index 1a0d358..71beb2e 100644
--- a/src/backend/optimizer/path/costsize.c
+++ b/src/backend/optimizer/path/costsize.c
@@ -3280,7 +3280,8 @@ compute_semi_anti_join_factors(PlannerInfo *root,
joinquals,
0,
jointype,
- sjinfo);
+ sjinfo,
+ NIL);
/*
* Also get the normal inner-join selectivity of the join clauses.
@@ -3303,7 +3304,8 @@ compute_semi_anti_join_factors(PlannerInfo *root,
joinquals,
0,
JOIN_INNER,
- &norm_sjinfo);
+ &norm_sjinfo,
+ NIL);
/* Avoid leaking a lot of ListCells */
if (jointype == JOIN_ANTI)
@@ -3470,7 +3472,7 @@ approx_tuple_count(PlannerInfo *root, JoinPath *path, List *quals)
Node *qual = (Node *) lfirst(l);
/* Note that clause_selectivity will be able to cache its result */
- selec *= clause_selectivity(root, qual, 0, JOIN_INNER, &sjinfo);
+ selec *= clause_selectivity(root, qual, 0, JOIN_INNER, &sjinfo, NIL);
}
/* Apply it to the input relation sizes */
@@ -3506,7 +3508,8 @@ set_baserel_size_estimates(PlannerInfo *root, RelOptInfo *rel)
rel->baserestrictinfo,
0,
JOIN_INNER,
- NULL);
+ NULL,
+ NIL);
rel->rows = clamp_row_est(nrows);
@@ -3543,7 +3546,8 @@ get_parameterized_baserel_size(PlannerInfo *root, RelOptInfo *rel,
allclauses,
rel->relid, /* do not use 0! */
JOIN_INNER,
- NULL);
+ NULL,
+ NIL);
nrows = clamp_row_est(nrows);
/* For safety, make sure result is not more than the base estimate */
if (nrows > rel->rows)
@@ -3681,12 +3685,14 @@ calc_joinrel_size_estimate(PlannerInfo *root,
joinquals,
0,
jointype,
- sjinfo);
+ sjinfo,
+ NIL);
pselec = clauselist_selectivity(root,
pushedquals,
0,
jointype,
- sjinfo);
+ sjinfo,
+ NIL);
/* Avoid leaking a lot of ListCells */
list_free(joinquals);
@@ -3698,7 +3704,8 @@ calc_joinrel_size_estimate(PlannerInfo *root,
restrictlist,
0,
jointype,
- sjinfo);
+ sjinfo,
+ NIL);
pselec = 0.0; /* not used, keep compiler quiet */
}
diff --git a/src/backend/optimizer/util/orclauses.c b/src/backend/optimizer/util/orclauses.c
index f0acc14..e41508b 100644
--- a/src/backend/optimizer/util/orclauses.c
+++ b/src/backend/optimizer/util/orclauses.c
@@ -280,7 +280,7 @@ consider_new_or_clause(PlannerInfo *root, RelOptInfo *rel,
* saving work later.)
*/
or_selec = clause_selectivity(root, (Node *) or_rinfo,
- 0, JOIN_INNER, NULL);
+ 0, JOIN_INNER, NULL, NIL);
/*
* The clause is only worth adding to the query if it rejects a useful
@@ -342,7 +342,7 @@ consider_new_or_clause(PlannerInfo *root, RelOptInfo *rel,
/* Compute inner-join size */
orig_selec = clause_selectivity(root, (Node *) join_or_rinfo,
- 0, JOIN_INNER, &sjinfo);
+ 0, JOIN_INNER, &sjinfo, NIL);
/* And hack cached selectivity so join size remains the same */
join_or_rinfo->norm_selec = orig_selec / or_selec;
diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index 4dd3f9f..326dd36 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -1580,13 +1580,15 @@ booltestsel(PlannerInfo *root, BoolTestType booltesttype, Node *arg,
case IS_NOT_FALSE:
selec = (double) clause_selectivity(root, arg,
varRelid,
- jointype, sjinfo);
+ jointype, sjinfo,
+ NIL);
break;
case IS_FALSE:
case IS_NOT_TRUE:
selec = 1.0 - (double) clause_selectivity(root, arg,
varRelid,
- jointype, sjinfo);
+ jointype, sjinfo,
+ NIL);
break;
default:
elog(ERROR, "unrecognized booltesttype: %d",
@@ -6196,7 +6198,8 @@ genericcostestimate(PlannerInfo *root,
indexSelectivity = clauselist_selectivity(root, selectivityQuals,
index->rel->relid,
JOIN_INNER,
- NULL);
+ NULL,
+ NIL);
/*
* If caller didn't give us an estimate, estimate the number of index
@@ -6521,7 +6524,8 @@ btcostestimate(PG_FUNCTION_ARGS)
btreeSelectivity = clauselist_selectivity(root, selectivityQuals,
index->rel->relid,
JOIN_INNER,
- NULL);
+ NULL,
+ NIL);
numIndexTuples = btreeSelectivity * index->rel->tuples;
/*
@@ -7264,7 +7268,8 @@ gincostestimate(PG_FUNCTION_ARGS)
*indexSelectivity = clauselist_selectivity(root, selectivityQuals,
index->rel->relid,
JOIN_INNER,
- NULL);
+ NULL,
+ NIL);
/* fetch estimated page cost for tablespace containing index */
get_tablespace_page_costs(index->reltablespace,
@@ -7496,7 +7501,7 @@ brincostestimate(PG_FUNCTION_ARGS)
*indexSelectivity =
clauselist_selectivity(root, indexQuals,
path->indexinfo->rel->relid,
- JOIN_INNER, NULL);
+ JOIN_INNER, NULL, NIL);
*indexCorrelation = 1;
/*
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index b8a0f9f..5cd2583 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -75,6 +75,7 @@
#include "utils/bytea.h"
#include "utils/guc_tables.h"
#include "utils/memutils.h"
+#include "utils/mvstats.h"
#include "utils/pg_locale.h"
#include "utils/plancache.h"
#include "utils/portal.h"
@@ -396,6 +397,15 @@ static const struct config_enum_entry row_security_options[] = {
};
/*
+ * Search algorithm for multivariate stats.
+ */
+static const struct config_enum_entry mvstat_search_options[] = {
+ {"greedy", MVSTAT_SEARCH_GREEDY, false},
+ {"exhaustive", MVSTAT_SEARCH_EXHAUSTIVE, false},
+ {NULL, 0, false}
+};
+
+/*
* Options for enum values stored in other modules
*/
extern const struct config_enum_entry wal_level_options[];
@@ -3651,6 +3661,16 @@ static struct config_enum ConfigureNamesEnum[] =
NULL, NULL, NULL
},
+ {
+ {"mvstat_search", PGC_USERSET, QUERY_TUNING_OTHER,
+ gettext_noop("Sets the algorithm used for combining multivariate stats."),
+ NULL
+ },
+ &mvstat_search_type,
+ MVSTAT_SEARCH_GREEDY, mvstat_search_options,
+ NULL, NULL, NULL
+ },
+
/* End-of-list marker */
{
{NULL, 0, 0, NULL, NULL}, NULL, 0, NULL, NULL, NULL, NULL
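As a usage sketch, assuming a server built with this patch: the GUC is
USERSET, so the search strategy can be switched per session, with the
option names matching the enum added to mvstats.h below:

SET mvstat_search = 'exhaustive';  -- exhaustive search
SET mvstat_search = 'greedy';      -- greedy search (the default)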
diff --git a/src/include/optimizer/cost.h b/src/include/optimizer/cost.h
index 9c2000b..7a3835b 100644
--- a/src/include/optimizer/cost.h
+++ b/src/include/optimizer/cost.h
@@ -182,11 +182,13 @@ extern Selectivity clauselist_selectivity(PlannerInfo *root,
List *clauses,
int varRelid,
JoinType jointype,
- SpecialJoinInfo *sjinfo);
+ SpecialJoinInfo *sjinfo,
+ List *conditions);
extern Selectivity clause_selectivity(PlannerInfo *root,
Node *clause,
int varRelid,
JoinType jointype,
- SpecialJoinInfo *sjinfo);
+ SpecialJoinInfo *sjinfo,
+ List *conditions);
#endif /* COST_H */
diff --git a/src/include/utils/mvstats.h b/src/include/utils/mvstats.h
index 673e546..194bbf8 100644
--- a/src/include/utils/mvstats.h
+++ b/src/include/utils/mvstats.h
@@ -16,6 +16,14 @@
#include "commands/vacuum.h"
+typedef enum MVStatSearchType
+{
+ MVSTAT_SEARCH_EXHAUSTIVE, /* exhaustive search */
+ MVSTAT_SEARCH_GREEDY /* greedy search */
+} MVStatSearchType;
+
+extern int mvstat_search_type;
+
/*
* Basic info about the stats, used when choosing what to use
*/
--
2.0.5
On Mon, Mar 30, 2015 at 5:26 PM, Tomas Vondra <tomas.vondra@2ndquadrant.com>
wrote:
Hello,
attached is a new version of the patch series. Aside from fixing various
issues (crashes, memory leaks), the patches are rebased to current
master, and I also attach a few SQL scripts I used for testing (nothing
fancy, just stress-testing all the parts the patch touches).
Hi Tomas,
I get cascading conflicts in pg_proc.h. It looked easy enough to fix,
except then I get compiler errors:
funcapi.c: In function 'get_func_trftypes':
funcapi.c:890: warning: unused variable 'procStruct'
utils/fmgrtab.o:(.rodata+0x10cf8): undefined reference to `_null_'
utils/fmgrtab.o:(.rodata+0x10d18): undefined reference to `_null_'
utils/fmgrtab.o:(.rodata+0x10d38): undefined reference to `_null_'
utils/fmgrtab.o:(.rodata+0x10d58): undefined reference to `_null_'
collect2: ld returned 1 exit status
make[2]: *** [postgres] Error 1
make[1]: *** [all-backend-recurse] Error 2
make: *** [all-src-recurse] Error 2
make: *** Waiting for unfinished jobs....
make: *** [temp-install] Error 2
Cheers,
Jeff
* Jeff Janes (jeff.janes@gmail.com) wrote:
On Mon, Mar 30, 2015 at 5:26 PM, Tomas Vondra <tomas.vondra@2ndquadrant.com>
wrote:
attached is a new version of the patch series. Aside from fixing various
issues (crashes, memory leaks), the patches are rebased to current
master, and I also attach a few SQL scripts I used for testing (nothing
fancy, just stress-testing all the parts the patch touches).
I get cascading conflicts in pg_proc.h. It looked easy enough to fix,
except then I get compiler errors:
Yeah, those are because you didn't address the new column which was
added to pg_proc. You need to add another _null_ in the pg_proc.h lines
in the correct place, apparently on four lines.
Thanks!
Stephen
On Tue, Apr 28, 2015 at 9:13 AM, Stephen Frost <sfrost@snowman.net> wrote:
* Jeff Janes (jeff.janes@gmail.com) wrote:
On Mon, Mar 30, 2015 at 5:26 PM, Tomas Vondra <tomas.vondra@2ndquadrant.com>
wrote:
attached is a new version of the patch series. Aside from fixing various
issues (crashes, memory leaks), the patches are rebased to current
master, and I also attach a few SQL scripts I used for testing (nothing
fancy, just stress-testing all the parts the patch touches).
I get cascading conflicts in pg_proc.h. It looked easy enough to fix,
except then I get compiler errors:
Yeah, those are because you didn't address the new column which was
added to pg_proc. You need to add another _null_ in the pg_proc.h lines
in the correct place, apparently on four lines.
Thanks. I think I tried that, but was still having trouble. But it turns
out that the trouble was for an unrelated reason, and I got it to compile
now.
Some of the FDWs need a patch as well in order to compile; see attached.
Cheers,
Jeff
Attachments:
multivariate_contrib.patch (application/octet-stream)
diff --git a/contrib/file_fdw/file_fdw.c b/contrib/file_fdw/file_fdw.c
new file mode 100644
index 4368897..7b4839b
*** a/contrib/file_fdw/file_fdw.c
--- b/contrib/file_fdw/file_fdw.c
*************** estimate_size(PlannerInfo *root, RelOptI
*** 947,953 ****
baserel->baserestrictinfo,
0,
JOIN_INNER,
! NULL);
nrows = clamp_row_est(nrows);
--- 947,954 ----
baserel->baserestrictinfo,
0,
JOIN_INNER,
! NULL,
! NIL);
nrows = clamp_row_est(nrows);
diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c
new file mode 100644
index 478e124..ff6b438
*** a/contrib/postgres_fdw/postgres_fdw.c
--- b/contrib/postgres_fdw/postgres_fdw.c
*************** postgresGetForeignRelSize(PlannerInfo *r
*** 478,484 ****
fpinfo->local_conds,
baserel->relid,
JOIN_INNER,
! NULL);
cost_qual_eval(&fpinfo->local_conds_cost, fpinfo->local_conds, root);
--- 478,485 ----
fpinfo->local_conds,
baserel->relid,
JOIN_INNER,
! NULL,
! NIL);
cost_qual_eval(&fpinfo->local_conds_cost, fpinfo->local_conds, root);
*************** estimate_path_cost_size(PlannerInfo *roo
*** 1770,1776 ****
local_join_conds,
baserel->relid,
JOIN_INNER,
! NULL);
local_sel *= fpinfo->local_conds_sel;
rows = clamp_row_est(rows * local_sel);
--- 1771,1778 ----
local_join_conds,
baserel->relid,
JOIN_INNER,
! NULL,
! NIL);
local_sel *= fpinfo->local_conds_sel;
rows = clamp_row_est(rows * local_sel);
Hi,
On 04/28/15 19:36, Jeff Janes wrote:
...
Thanks. I think I tried that, but was still having trouble. But it
turns out that the trouble was for an unrelated reason, and I got it
to compile now.
Yeah, a new column was added to pg_proc the day after I submitted the
pacth. Will address that in a new version, hopefully in a few days.
Some of the FDWs need a patch as well in order to compile; see
attached.
Thanks, I forgot to tweak the clauselist_selectivity() calls in contrib :-(
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attached is v6 of the multivariate stats, with a number of improvements:
1) fix of the contrib compile-time errors (reported by Jeff)
2) fix of pg_proc issues (reported by Jeff)
3) rebase to current master
4) fix a bunch of issues in the previous patches, due to referencing
some parts too early (e.g. histograms in the first patch, etc.)
5) remove the explicit DELETEs from pg_mv_statistic (in the regression
tests), this is now handled automatically by DROP TABLE etc.
6) a number of performance optimizations in selectivity estimation:
(a) minimize calls to get_oprrest, significantly reducing
syscache calls
(b) significant reduction of palloc overhead in deserialization of
MCV lists and histograms
(c) use more compact serialized representation of MCV lists and
histograms, often reducing the size by ~50%
(d) use histograms with limited deserialization, which also allows
caching function calls
(e) modified histogram bucket partitioning, resulting in more even
bucket distribution (i.e. producing buckets with more equal
density and about equal size of each dimension)
7) add functions for listing MCV list items and histogram buckets:
- pg_mv_mcvlist_items(oid)
- pg_mv_histogram_buckets(oid, type)
This is quite useful when analyzing the MCV lists / histograms (see
the usage sketch after this list).
8) improved support for OR clauses
9) allow calling pull_varnos() on expression trees containing
RestrictInfo nodes (not sure if this is the right fix, it's being
discussed in another thread)
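A quick usage sketch for item 7 (the OID is that of the pg_mv_statistic
row for the statistics in question; <type> stands for the second
argument of pg_mv_histogram_buckets, and the placeholders need to be
filled in):

SELECT oid, stakeys FROM pg_mv_statistic
 WHERE starelid = 'test'::regclass;

SELECT * FROM pg_mv_mcvlist_items(<oid>);
SELECT * FROM pg_mv_histogram_buckets(<oid>, <type>);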
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachments:
0001-shared-infrastructure-and-functional-dependencies.patch (text/x-patch)
From 62e862b0debfdb44976388a577798179eb7a0727 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tv@fuzzy.cz>
Date: Sun, 11 Jan 2015 19:51:48 +0100
Subject: [PATCH 1/6] shared infrastructure and functional dependencies
Basic infrastructure shared by all kinds of multivariate
stats, most importantly:
- adds a new system catalog (pg_mv_statistic)
- ALTER TABLE ... ADD STATISTICS syntax
- implementation of functional dependencies (the simplest
type of multivariate statistics)
- building functional dependencies in ANALYZE
- updates regression tests (new catalog etc.)
This does not include any changes to the optimizer, i.e.
it does not influence the query planning (subject to
follow-up patches).
The current implementation requires a valid 'ltopr' for
the columns, so that we can sort the sample rows in various
ways, both in this patch and other kinds of statistics.
Maybe this restriction could be relaxed in the future,
requiring just 'eqopr' in case of stats not sorting the
data (e.g. functional dependencies and MCV lists).
The algorithm detecting the dependencies is rather simple
and probably needs improvements.
The name 'functional dependencies' is more correct (than
'association rules') as it's exactly the name used in
relational theory (esp. Normal Forms) for tracking
column-level dependencies.
The multivariate statistics are automatically removed in
two situations
(a) after a DROP TABLE (obviously)
(b) after ALTER TABLE ... DROP COLUMN, if the statistics
would be defined on less than 2 columns (remaining)
If there are at least 2 columns remaining, we keep
the statistics but perform cleanup on the next ANALYZE.
The dropped columns are removed from stakeys, and the new
statistics is built on the smaller set.
We can't do this at DROP COLUMN, because that'd leave us
with invalid statistics, or we'd have to throw away
statistics we can still use. This lazy approach lets us
keep using the statistics even though some of the columns
are dead.
Dropping the statistics is done using DROP STATISTICS
ALTER TABLE ... DROP STATISTICS ALL;
ALTER TABLE ... DROP STATISTICS (opts) ON (cols);
The bad consequence of this is that 'statistics' becomes
a reserved keyword (it was unreserved before), as it
otherwise conflicts with DROP <columnname> in the grammar.
Not sure if there's a workaround for this.
This also adds a simple list of statistics to \d in psql.
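
For illustration, the commands added by this patch then look like this
(the table and column names are made up):

ALTER TABLE addresses ADD STATISTICS (dependencies true) ON (zip, city);
ANALYZE addresses;

ALTER TABLE addresses DROP STATISTICS (dependencies true) ON (zip, city);
ALTER TABLE addresses DROP STATISTICS ALL;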
---
src/backend/catalog/Makefile | 1 +
src/backend/catalog/heap.c | 102 +++++
src/backend/catalog/system_views.sql | 10 +
src/backend/commands/analyze.c | 20 +-
src/backend/commands/tablecmds.c | 346 +++++++++++++++-
src/backend/nodes/copyfuncs.c | 14 +
src/backend/nodes/outfuncs.c | 18 +
src/backend/optimizer/util/plancat.c | 63 +++
src/backend/parser/gram.y | 84 +++-
src/backend/utils/Makefile | 2 +-
src/backend/utils/cache/relcache.c | 59 +++
src/backend/utils/cache/syscache.c | 12 +
src/backend/utils/mvstats/Makefile | 17 +
src/backend/utils/mvstats/common.c | 356 ++++++++++++++++
src/backend/utils/mvstats/common.h | 75 ++++
src/backend/utils/mvstats/dependencies.c | 638 +++++++++++++++++++++++++++++
src/bin/psql/describe.c | 40 ++
src/include/catalog/heap.h | 1 +
src/include/catalog/indexing.h | 5 +
src/include/catalog/pg_mv_statistic.h | 69 ++++
src/include/catalog/pg_proc.h | 5 +
src/include/catalog/toasting.h | 1 +
src/include/nodes/nodes.h | 2 +
src/include/nodes/parsenodes.h | 12 +-
src/include/nodes/relation.h | 28 ++
src/include/parser/kwlist.h | 2 +-
src/include/utils/mvstats.h | 69 ++++
src/include/utils/rel.h | 4 +
src/include/utils/relcache.h | 1 +
src/include/utils/syscache.h | 1 +
src/test/regress/expected/rules.out | 8 +
src/test/regress/expected/sanity_check.out | 1 +
32 files changed, 2057 insertions(+), 9 deletions(-)
create mode 100644 src/backend/utils/mvstats/Makefile
create mode 100644 src/backend/utils/mvstats/common.c
create mode 100644 src/backend/utils/mvstats/common.h
create mode 100644 src/backend/utils/mvstats/dependencies.c
create mode 100644 src/include/catalog/pg_mv_statistic.h
create mode 100644 src/include/utils/mvstats.h
diff --git a/src/backend/catalog/Makefile b/src/backend/catalog/Makefile
index 37d05d1..8476489 100644
--- a/src/backend/catalog/Makefile
+++ b/src/backend/catalog/Makefile
@@ -32,6 +32,7 @@ POSTGRES_BKI_SRCS = $(addprefix $(top_srcdir)/src/include/catalog/,\
pg_attrdef.h pg_constraint.h pg_inherits.h pg_index.h pg_operator.h \
pg_opfamily.h pg_opclass.h pg_am.h pg_amop.h pg_amproc.h \
pg_language.h pg_largeobject_metadata.h pg_largeobject.h pg_aggregate.h \
+ pg_mv_statistic.h \
pg_statistic.h pg_rewrite.h pg_trigger.h pg_event_trigger.h pg_description.h \
pg_cast.h pg_enum.h pg_namespace.h pg_conversion.h pg_depend.h \
pg_database.h pg_db_role_setting.h pg_tablespace.h pg_pltemplate.h \
diff --git a/src/backend/catalog/heap.c b/src/backend/catalog/heap.c
index d04e94d..1c28ca3 100644
--- a/src/backend/catalog/heap.c
+++ b/src/backend/catalog/heap.c
@@ -46,6 +46,7 @@
#include "catalog/pg_constraint.h"
#include "catalog/pg_foreign_table.h"
#include "catalog/pg_inherits.h"
+#include "catalog/pg_mv_statistic.h"
#include "catalog/pg_namespace.h"
#include "catalog/pg_statistic.h"
#include "catalog/pg_tablespace.h"
@@ -1611,7 +1612,10 @@ RemoveAttributeById(Oid relid, AttrNumber attnum)
heap_close(attr_rel, RowExclusiveLock);
if (attnum > 0)
+ {
RemoveStatistics(relid, attnum);
+ RemoveMVStatistics(relid, attnum);
+ }
relation_close(rel, NoLock);
}
@@ -1839,6 +1843,11 @@ heap_drop_with_catalog(Oid relid)
RemoveStatistics(relid, 0);
/*
+ * delete multi-variate statistics
+ */
+ RemoveMVStatistics(relid, 0);
+
+ /*
* delete attribute tuples
*/
DeleteAttributeTuples(relid);
@@ -2694,6 +2703,99 @@ RemoveStatistics(Oid relid, AttrNumber attnum)
/*
+ * RemoveMVStatistics --- remove entries in pg_mv_statistic for a rel
+ *
+ * If attnum is zero, remove all entries for rel; else remove only the one(s)
+ * for that column.
+ */
+void
+RemoveMVStatistics(Oid relid, AttrNumber attnum)
+{
+ Relation pgmvstatistic;
+ TupleDesc tupdesc = NULL;
+ SysScanDesc scan;
+ ScanKeyData key;
+ HeapTuple tuple;
+
+ /*
+ * When dropping a column, we'll drop statistics with a single
+ * remaining (undropped column). To do that, we need the tuple
+ * descriptor.
+ *
+ * We already have the relation locked (as we're running ALTER
+ * TABLE ... DROP COLUMN), so we'll just get the descriptor here.
+ */
+ if (attnum != 0)
+ {
+ Relation rel = relation_open(relid, NoLock);
+
+ /* multivariate stats are supported on tables and matviews */
+ if (rel->rd_rel->relkind == RELKIND_RELATION ||
+ rel->rd_rel->relkind == RELKIND_MATVIEW)
+ tupdesc = RelationGetDescr(rel);
+
+ relation_close(rel, NoLock);
+ }
+
+ if (tupdesc == NULL)
+ return;
+
+ pgmvstatistic = heap_open(MvStatisticRelationId, RowExclusiveLock);
+
+ ScanKeyInit(&key,
+ Anum_pg_mv_statistic_starelid,
+ BTEqualStrategyNumber, F_OIDEQ,
+ ObjectIdGetDatum(relid));
+
+ scan = systable_beginscan(pgmvstatistic,
+ MvStatisticRelidIndexId,
+ true, NULL, 1, &key);
+
+ /* we must loop even when attnum != 0, in case of inherited stats */
+ while (HeapTupleIsValid(tuple = systable_getnext(scan)))
+ {
+ bool delete = true;
+
+ if (attnum != 0)
+ {
+ Datum adatum;
+ bool isnull;
+ int i;
+ int ncolumns = 0;
+ ArrayType *arr;
+ int16 *attnums;
+
+ /* get the columns */
+ adatum = SysCacheGetAttr(MVSTATOID, tuple,
+ Anum_pg_mv_statistic_stakeys, &isnull);
+ Assert(!isnull);
+
+ arr = DatumGetArrayTypeP(adatum);
+ attnums = (int16*)ARR_DATA_PTR(arr);
+
+ for (i = 0; i < ARR_DIMS(arr)[0]; i++)
+ {
+ /* count the column unless it has been / is being dropped */
+ if ((! tupdesc->attrs[attnums[i]-1]->attisdropped) &&
+ (attnums[i] != attnum))
+ ncolumns += 1;
+ }
+
+ /* delete if there are less than two attributes */
+ delete = (ncolumns < 2);
+ }
+
+ if (delete)
+ simple_heap_delete(pgmvstatistic, &tuple->t_self);
+ }
+
+ systable_endscan(scan);
+
+ heap_close(pgmvstatistic, RowExclusiveLock);
+}
+
+
+/*
* RelationTruncateIndexes - truncate all indexes associated
* with the heap relation to zero tuples.
*
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 2ad01f4..07586c6 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -150,6 +150,16 @@ CREATE VIEW pg_indexes AS
LEFT JOIN pg_tablespace T ON (T.oid = I.reltablespace)
WHERE C.relkind IN ('r', 'm') AND I.relkind = 'i';
+CREATE VIEW pg_mv_stats AS
+ SELECT
+ N.nspname AS schemaname,
+ C.relname AS tablename,
+ S.stakeys AS attnums,
+ length(S.stadeps) as depsbytes,
+ pg_mv_stats_dependencies_info(S.stadeps) as depsinfo
+ FROM (pg_mv_statistic S JOIN pg_class C ON (C.oid = S.starelid))
+ LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace);
+
CREATE VIEW pg_stats AS
SELECT
nspname AS schemaname,
diff --git a/src/backend/commands/analyze.c b/src/backend/commands/analyze.c
index 15ec0ad..fff27e0 100644
--- a/src/backend/commands/analyze.c
+++ b/src/backend/commands/analyze.c
@@ -27,6 +27,7 @@
#include "catalog/indexing.h"
#include "catalog/pg_collation.h"
#include "catalog/pg_inherits_fn.h"
+#include "catalog/pg_mv_statistic.h"
#include "catalog/pg_namespace.h"
#include "commands/dbcommands.h"
#include "commands/tablecmds.h"
@@ -54,7 +55,11 @@
#include "utils/syscache.h"
#include "utils/timestamp.h"
#include "utils/tqual.h"
+#include "utils/fmgroids.h"
+#include "utils/builtins.h"
+#include "utils/mvstats.h"
+#include "access/sysattr.h"
/* Data structure for Algorithm S from Knuth 3.4.2 */
typedef struct
@@ -111,7 +116,6 @@ static void update_attstats(Oid relid, bool inh,
static Datum std_fetch_func(VacAttrStatsP stats, int rownum, bool *isNull);
static Datum ind_fetch_func(VacAttrStatsP stats, int rownum, bool *isNull);
-
/*
* analyze_rel() -- analyze one relation
*/
@@ -474,6 +478,17 @@ do_analyze_rel(Relation onerel, int options, VacuumParams *params,
* all analyzable columns. We use a lower bound of 100 rows to avoid
* possible overflow in Vitter's algorithm. (Note: that will also be the
* target in the corner case where there are no analyzable columns.)
+ *
+ * FIXME This sample sizing is mostly OK when computing stats for
+ * individual columns, but when computing multivariate stats
+ * (histograms, MCV lists, ...) it's rather insufficient.
+ * For stats on multiple columns / complex stats
+ * we need larger sample sizes, and in some cases samples
+ * proportional to the table (say, 0.5% - 1%) instead of a
+ * fixed size might be more appropriate. Also, this should be
+ * bound to the requested statistics size - e.g. number of MCV
+ * items or histogram buckets should require several sample
+ * rows per item/bucket (so the sample should be k*size).
*/
targrows = 100;
for (i = 0; i < attr_cnt; i++)
@@ -576,6 +591,9 @@ do_analyze_rel(Relation onerel, int options, VacuumParams *params,
update_attstats(RelationGetRelid(Irel[ind]), false,
thisdata->attr_cnt, thisdata->vacattrstats);
}
+
+ /* Build multivariate stats (if there are any). */
+ build_mv_stats(onerel, numrows, rows, attr_cnt, vacattrstats);
}
/*
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 299d8cc..5c57146 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -35,6 +35,7 @@
#include "catalog/pg_foreign_table.h"
#include "catalog/pg_inherits.h"
#include "catalog/pg_inherits_fn.h"
+#include "catalog/pg_mv_statistic.h"
#include "catalog/pg_namespace.h"
#include "catalog/pg_opclass.h"
#include "catalog/pg_tablespace.h"
@@ -92,7 +93,7 @@
#include "utils/syscache.h"
#include "utils/tqual.h"
#include "utils/typcache.h"
-
+#include "utils/mvstats.h"
/*
* ON COMMIT action list
@@ -140,8 +141,9 @@ static List *on_commits = NIL;
#define AT_PASS_ADD_COL 5 /* ADD COLUMN */
#define AT_PASS_ADD_INDEX 6 /* ADD indexes */
#define AT_PASS_ADD_CONSTR 7 /* ADD constraints, defaults */
-#define AT_PASS_MISC 8 /* other stuff */
-#define AT_NUM_PASSES 9
+#define AT_PASS_ADD_STATS 8 /* ADD statistics */
+#define AT_PASS_MISC 9 /* other stuff */
+#define AT_NUM_PASSES 10
typedef struct AlteredTableInfo
{
@@ -416,6 +418,10 @@ static void ATExecReplicaIdentity(Relation rel, ReplicaIdentityStmt *stmt, LOCKM
static void ATExecGenericOptions(Relation rel, List *options);
static void ATExecEnableRowSecurity(Relation rel);
static void ATExecDisableRowSecurity(Relation rel);
+static void ATExecAddStatistics(AlteredTableInfo *tab, Relation rel,
+ StatisticsDef *def, LOCKMODE lockmode);
+static void ATExecDropStatistics(AlteredTableInfo *tab, Relation rel,
+ StatisticsDef *def, LOCKMODE lockmode);
static void copy_relation_data(SMgrRelation rel, SMgrRelation dst,
ForkNumber forkNum, char relpersistence);
@@ -3011,6 +3017,8 @@ AlterTableGetLockLevel(List *cmds)
* updates.
*/
case AT_SetStatistics: /* Uses MVCC in getTableAttrs() */
+ case AT_AddStatistics: /* XXX not sure if the right level */
+ case AT_DropStatistics: /* XXX not sure if the right level */
case AT_ClusterOn: /* Uses MVCC in getIndexes() */
case AT_DropCluster: /* Uses MVCC in getIndexes() */
case AT_SetOptions: /* Uses MVCC in getTableAttrs() */
@@ -3167,6 +3175,8 @@ ATPrepCmd(List **wqueue, Relation rel, AlterTableCmd *cmd,
pass = AT_PASS_ADD_CONSTR;
break;
case AT_SetStatistics: /* ALTER COLUMN SET STATISTICS */
+ case AT_AddStatistics: /* XXX maybe not the right place */
+ case AT_DropStatistics: /* XXX maybe not the right place */
ATSimpleRecursion(wqueue, rel, cmd, recurse, lockmode);
/* Performs own permission checks */
ATPrepSetStatistics(rel, cmd->name, cmd->def, lockmode);
@@ -3469,6 +3479,12 @@ ATExecCmd(List **wqueue, AlteredTableInfo *tab, Relation rel,
case AT_SetStatistics: /* ALTER COLUMN SET STATISTICS */
address = ATExecSetStatistics(rel, cmd->name, cmd->def, lockmode);
break;
+ case AT_AddStatistics: /* ADD STATISTICS */
+ ATExecAddStatistics(tab, rel, (StatisticsDef *) cmd->def, lockmode);
+ break;
+ case AT_DropStatistics: /* DROP STATISTICS */
+ ATExecDropStatistics(tab, rel, (StatisticsDef *) cmd->def, lockmode);
+ break;
case AT_SetOptions: /* ALTER COLUMN SET ( options ) */
address = ATExecSetOptions(rel, cmd->name, cmd->def, false, lockmode);
break;
@@ -11860,3 +11876,327 @@ RangeVarCallbackForAlterRelation(const RangeVar *rv, Oid relid, Oid oldrelid,
ReleaseSysCache(tuple);
}
+
+/* used for sorting the attnums in ATExecAddStatistics */
+static int compare_int16(const void *a, const void *b)
+{
+ return memcmp(a, b, sizeof(int16));
+}
+
+/*
+ * Implements the ALTER TABLE ... ADD STATISTICS (options) ON (columns).
+ *
+ * The code is an unholy mix of pieces that really belong to other parts
+ * of the source tree.
+ *
+ * FIXME Check that the types are pass-by-value and support sort,
+ * although maybe we can live without the sort (and only build
+ * MCV list / association rules).
+ *
+ * FIXME This should probably check for duplicate stats (i.e. same
+ * keys, same options). Although maybe it's useful to have
+ * multiple stats on the same columns with different options
+ * (say, a detailed MCV-only stats for some queries, histogram
+ * for others, etc.)
+ */
+static void ATExecAddStatistics(AlteredTableInfo *tab, Relation rel,
+ StatisticsDef *def, LOCKMODE lockmode)
+{
+ int i, j;
+ ListCell *l;
+ int16 attnums[INDEX_MAX_KEYS];
+ int numcols = 0;
+
+ HeapTuple htup;
+ Datum values[Natts_pg_mv_statistic];
+ bool nulls[Natts_pg_mv_statistic];
+ int2vector *stakeys;
+ Relation mvstatrel;
+
+ /* by default build everything */
+ bool build_dependencies = true;
+
+ Assert(IsA(def, StatisticsDef));
+
+ /* transform the column names to attnum values */
+
+ foreach(l, def->keys)
+ {
+ char *attname = strVal(lfirst(l));
+ HeapTuple atttuple;
+
+ atttuple = SearchSysCacheAttName(RelationGetRelid(rel), attname);
+
+ if (!HeapTupleIsValid(atttuple))
+ ereport(ERROR,
+ (errcode(ERRCODE_UNDEFINED_COLUMN),
+ errmsg("column \"%s\" referenced in statistics does not exist",
+ attname)));
+
+ /* more than MVSTATS_MAX_DIMENSIONS columns not allowed */
+ if (numcols >= MVSTATS_MAX_DIMENSIONS)
+ ereport(ERROR,
+ (errcode(ERRCODE_TOO_MANY_COLUMNS),
+ errmsg("cannot have more than %d keys in a statistics",
+ MVSTATS_MAX_DIMENSIONS)));
+
+ attnums[numcols] = ((Form_pg_attribute) GETSTRUCT(atttuple))->attnum;
+ ReleaseSysCache(atttuple);
+ numcols++;
+ }
+
+ /*
+ * Check the lower bound (at least 2 columns), the upper bound was
+ * already checked in the loop.
+ */
+ if (numcols < 2)
+ ereport(ERROR,
+ (errcode(ERRCODE_TOO_MANY_COLUMNS),
+ errmsg("multivariate stats require 2 or more columns")));
+
+ /* look for duplicate columns */
+ for (i = 0; i < numcols; i++)
+ for (j = 0; j < numcols; j++)
+ if ((i != j) && (attnums[i] == attnums[j]))
+ ereport(ERROR,
+ (errcode(ERRCODE_UNDEFINED_COLUMN),
+ errmsg("duplicate column name in statistics definition")));
+
+ /* parse the statistics options */
+ foreach (l, def->options)
+ {
+ DefElem *opt = (DefElem*)lfirst(l);
+
+ if (strcmp(opt->defname, "dependencies") == 0)
+ build_dependencies = defGetBoolean(opt);
+ else
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("unrecognized STATISTICS option \"%s\"",
+ opt->defname)));
+ }
+
+ /* sort the attnums and build int2vector */
+ qsort(attnums, numcols, sizeof(int16), compare_int16);
+ stakeys = buildint2vector(attnums, numcols);
+
+ /*
+ * Okay, let's create the pg_mv_statistic entry.
+ */
+ memset(values, 0, sizeof(values));
+ memset(nulls, false, sizeof(nulls));
+
+ /* no stats collected yet, so just the keys */
+ values[Anum_pg_mv_statistic_starelid-1] = ObjectIdGetDatum(RelationGetRelid(rel));
+
+ values[Anum_pg_mv_statistic_stakeys -1] = PointerGetDatum(stakeys);
+ values[Anum_pg_mv_statistic_deps_enabled -1] = BoolGetDatum(build_dependencies);
+
+ nulls[Anum_pg_mv_statistic_stadeps -1] = true;
+
+ /* insert the tuple into pg_mv_statistic */
+ mvstatrel = heap_open(MvStatisticRelationId, RowExclusiveLock);
+
+ htup = heap_form_tuple(mvstatrel->rd_att, values, nulls);
+
+ simple_heap_insert(mvstatrel, htup);
+
+ CatalogUpdateIndexes(mvstatrel, htup);
+
+ heap_freetuple(htup);
+
+ heap_close(mvstatrel, RowExclusiveLock);
+
+ /*
+ * Invalidate relcache so that others see the new statistics.
+ */
+ CacheInvalidateRelcache(rel);
+
+ return;
+}
+
+/*
+ * Implements the ALTER TABLE ... DROP STATISTICS in two forms:
+ *
+ * ALTER TABLE ... DROP STATISTICS (options) ON (columns)
+ * ALTER TABLE ... DROP STATISTICS ALL;
+ *
+ * The first one requires an exact match, the second one just drops
+ * all the statistics on a table.
+ */
+static void ATExecDropStatistics(AlteredTableInfo *tab, Relation rel,
+ StatisticsDef *def, LOCKMODE lockmode)
+{
+ Relation statrel;
+ SysScanDesc scan;
+ ScanKeyData key;
+ HeapTuple tuple;
+
+ ListCell *l;
+
+ int16 attnums[INDEX_MAX_KEYS];
+ int numcols = 0;
+
+ /* check whether the statistics match / should be dropped */
+ bool build_dependencies = false;
+ bool check_dependencies = false;
+
+ if (def != NULL)
+ {
+ Assert(IsA(def, StatisticsDef));
+
+ /* collect attribute numbers */
+ foreach(l, def->keys)
+ {
+ char *attname = strVal(lfirst(l));
+ HeapTuple atttuple;
+
+ atttuple = SearchSysCacheAttName(RelationGetRelid(rel), attname);
+
+ if (!HeapTupleIsValid(atttuple))
+ ereport(ERROR,
+ (errcode(ERRCODE_UNDEFINED_COLUMN),
+ errmsg("column \"%s\" referenced in statistics does not exist",
+ attname)));
+
+ /* more than MVSTATS_MAX_DIMENSIONS columns not allowed */
+ if (numcols >= MVSTATS_MAX_DIMENSIONS)
+ ereport(ERROR,
+ (errcode(ERRCODE_TOO_MANY_COLUMNS),
+ errmsg("cannot have more than %d keys in a statistics",
+ MVSTATS_MAX_DIMENSIONS)));
+
+ attnums[numcols] = ((Form_pg_attribute) GETSTRUCT(atttuple))->attnum;
+ ReleaseSysCache(atttuple);
+ numcols++;
+ }
+
+ /* parse the statistics options */
+ foreach (l, def->options)
+ {
+ DefElem *opt = (DefElem*)lfirst(l);
+
+ if (strcmp(opt->defname, "dependencies") == 0)
+ {
+ check_dependencies = true;
+ build_dependencies = defGetBoolean(opt);
+ }
+ else
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("unrecognized STATISTICS option \"%s\"",
+ opt->defname)));
+ }
+
+ }
+
+ statrel = heap_open(MvStatisticRelationId, RowExclusiveLock);
+
+ ScanKeyInit(&key,
+ Anum_pg_mv_statistic_starelid,
+ BTEqualStrategyNumber, F_OIDEQ,
+ ObjectIdGetDatum(RelationGetRelid(rel)));
+
+ scan = systable_beginscan(statrel,
+ MvStatisticRelidIndexId,
+ true, NULL, 1, &key);
+
+ /* loop over all statistics defined on the relation */
+ while (HeapTupleIsValid(tuple = systable_getnext(scan)))
+ {
+ /* by default we delete everything */
+ bool delete = true;
+
+ /* check that the options match (dependencies, mcv, histogram) */
+ if (delete && check_dependencies)
+ {
+ bool isnull;
+ Datum adatum = heap_getattr(tuple,
+ Anum_pg_mv_statistic_deps_enabled,
+ RelationGetDescr(statrel),
+ &isnull);
+
+ delete = (! isnull) &&
+ (DatumGetBool(adatum) == build_dependencies);
+ }
+
+ /* check that the columns match the statistics definition */
+ if (delete && (numcols > 0))
+ {
+ int i, j;
+ ArrayType *arr;
+ bool isnull;
+
+ int16 *stakeys;
+ int nstakeys;
+
+ Datum adatum = SysCacheGetAttr(MVSTATOID, tuple,
+ Anum_pg_mv_statistic_stakeys, &isnull);
+ Assert(!isnull);
+
+ arr = DatumGetArrayTypeP(adatum);
+
+ nstakeys = ARR_DIMS(arr)[0];
+ stakeys = (int16 *) ARR_DATA_PTR(arr);
+
+ /* assume match */
+ delete = true;
+
+ /* check that for each column we find a match in stakeys */
+ for (i = 0; i < numcols; i++)
+ {
+ bool found = false;
+ for (j = 0; j < nstakeys; j++)
+ {
+ if (attnums[i] == stakeys[j])
+ {
+ found = true;
+ break;
+ }
+ }
+
+ if (! found)
+ {
+ delete = false;
+ break;
+ }
+ }
+
+ /* check that for each stakeys we find a match in columns */
+ for (j = 0; j < nstakeys; j++)
+ {
+ bool found = false;
+
+ for (i = 0; i < numcols; i++)
+ {
+ if (attnums[i] == stakeys[j])
+ {
+ found = true;
+ break;
+ }
+ }
+
+ if (! found)
+ {
+ delete = false;
+ break;
+ }
+ }
+ }
+
+ /* don't delete, if we've found mismatches */
+ if (delete)
+ simple_heap_delete(statrel, &tuple->t_self);
+ }
+
+ systable_endscan(scan);
+
+ heap_close(statrel, RowExclusiveLock);
+
+ /*
+ * Invalidate relcache so that others forget the dropped statistics.
+ */
+ CacheInvalidateRelcache(rel);
+
+ return;
+}
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 805045d..ddc88a3 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -3938,6 +3938,17 @@ _copyAlterPolicyStmt(const AlterPolicyStmt *from)
return newnode;
}
+static StatisticsDef *
+_copyStatisticsDef(const StatisticsDef *from)
+{
+ StatisticsDef *newnode = makeNode(StatisticsDef);
+
+ COPY_NODE_FIELD(keys);
+ COPY_NODE_FIELD(options);
+
+ return newnode;
+}
+
/* ****************************************************************
* pg_list.h copy functions
* ****************************************************************
@@ -4755,6 +4766,9 @@ copyObject(const void *from)
case T_CommonTableExpr:
retval = _copyCommonTableExpr(from);
break;
+ case T_StatisticsDef:
+ retval = _copyStatisticsDef(from);
+ break;
case T_FuncWithArgs:
retval = _copyFuncWithArgs(from);
break;
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index f9f948e..c2d5dc5 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -1842,6 +1842,21 @@ _outIndexOptInfo(StringInfo str, const IndexOptInfo *node)
}
static void
+_outMVStatisticInfo(StringInfo str, const MVStatisticInfo *node)
+{
+ WRITE_NODE_TYPE("MVSTATISTICINFO");
+
+ /* NB: this isn't a complete set of fields */
+ WRITE_OID_FIELD(mvoid);
+
+ /* enabled statistics */
+ WRITE_BOOL_FIELD(deps_enabled);
+
+ /* built/available statistics */
+ WRITE_BOOL_FIELD(deps_built);
+}
+
+static void
_outEquivalenceClass(StringInfo str, const EquivalenceClass *node)
{
/*
@@ -3220,6 +3235,9 @@ _outNode(StringInfo str, const void *obj)
case T_PlannerParamItem:
_outPlannerParamItem(str, obj);
break;
+ case T_MVStatisticInfo:
+ _outMVStatisticInfo(str, obj);
+ break;
case T_CreateStmt:
_outCreateStmt(str, obj);
diff --git a/src/backend/optimizer/util/plancat.c b/src/backend/optimizer/util/plancat.c
index 068ab39..1cf64f8 100644
--- a/src/backend/optimizer/util/plancat.c
+++ b/src/backend/optimizer/util/plancat.c
@@ -26,6 +26,7 @@
#include "access/xlog.h"
#include "catalog/catalog.h"
#include "catalog/heap.h"
+#include "catalog/pg_mv_statistic.h"
#include "foreign/fdwapi.h"
#include "miscadmin.h"
#include "nodes/makefuncs.h"
@@ -38,7 +39,9 @@
#include "parser/parsetree.h"
#include "rewrite/rewriteManip.h"
#include "storage/bufmgr.h"
+#include "utils/builtins.h"
#include "utils/lsyscache.h"
+#include "utils/syscache.h"
#include "utils/rel.h"
#include "utils/snapmgr.h"
@@ -89,6 +92,7 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
Relation relation;
bool hasindex;
List *indexinfos = NIL;
+ List *stainfos = NIL;
/*
* We need not lock the relation since it was already locked, either by
@@ -377,6 +381,65 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
rel->indexlist = indexinfos;
+ if (true)
+ {
+ List *mvstatoidlist;
+ ListCell *l;
+
+ mvstatoidlist = RelationGetMVStatList(relation);
+
+ foreach(l, mvstatoidlist)
+ {
+ ArrayType *arr;
+ Datum adatum;
+ bool isnull;
+ Oid mvoid = lfirst_oid(l);
+ Form_pg_mv_statistic mvstat;
+ MVStatisticInfo *info;
+
+ HeapTuple htup = SearchSysCache1(MVSTATOID, ObjectIdGetDatum(mvoid));
+
+ /* XXX syscache contains OIDs of deleted stats (not invalidated) */
+ if (! HeapTupleIsValid(htup))
+ continue;
+
+ mvstat = (Form_pg_mv_statistic) GETSTRUCT(htup);
+
+ /* unavailable stats are not interesting for the planner */
+ if (mvstat->deps_built)
+ {
+ info = makeNode(MVStatisticInfo);
+
+ info->mvoid = mvoid;
+ info->rel = rel;
+
+ /* enabled statistics */
+ info->deps_enabled = mvstat->deps_enabled;
+
+ /* built/available statistics */
+ info->deps_built = mvstat->deps_built;
+
+ /* stakeys */
+ adatum = SysCacheGetAttr(MVSTATOID, htup,
+ Anum_pg_mv_statistic_stakeys, &isnull);
+ Assert(!isnull);
+
+ arr = DatumGetArrayTypeP(adatum);
+
+ info->stakeys = buildint2vector((int16 *) ARR_DATA_PTR(arr),
+ ARR_DIMS(arr)[0]);
+
+ stainfos = lcons(info, stainfos);
+ }
+
+ ReleaseSysCache(htup);
+ }
+
+ list_free(mvstatoidlist);
+ }
+
+ rel->mvstatlist = stainfos;
+
/* Grab the fdwroutine info using the relcache, while we have it */
if (relation->rd_rel->relkind == RELKIND_FOREIGN_TABLE)
{
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 0180530..dbeb3c8 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -369,6 +369,13 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
relation_expr_list dostmt_opt_list
transform_element_list transform_type_list
+%type <list> OptStatsOptions
+%type <str> stats_options_name
+%type <node> stats_options_arg
+%type <defelt> stats_options_elem
+%type <list> stats_options_list
+
+
%type <list> opt_fdw_options fdw_options
%type <defelt> fdw_option
@@ -488,7 +495,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
%type <keyword> unreserved_keyword type_func_name_keyword
%type <keyword> col_name_keyword reserved_keyword
-%type <node> TableConstraint TableLikeClause
+%type <node> TableConstraint TableLikeClause TableStatistics
%type <ival> TableLikeOptionList TableLikeOption
%type <list> ColQualList
%type <node> ColConstraint ColConstraintElem ConstraintAttr
@@ -2315,6 +2322,29 @@ alter_table_cmd:
n->subtype = AT_DisableRowSecurity;
$$ = (Node *)n;
}
+ /* ALTER TABLE <name> ADD STATISTICS (options) ON (columns) */
+ | ADD_P TableStatistics
+ {
+ AlterTableCmd *n = makeNode(AlterTableCmd);
+ n->subtype = AT_AddStatistics;
+ n->def = $2;
+ $$ = (Node *)n;
+ }
+ /* ALTER TABLE <name> DROP STATISTICS (options) ON (columns) */
+ | DROP TableStatistics
+ {
+ AlterTableCmd *n = makeNode(AlterTableCmd);
+ n->subtype = AT_DropStatistics;
+ n->def = $2;
+ $$ = (Node *)n;
+ }
+ /* ALTER TABLE <name> DROP STATISTICS ALL */
+ | DROP STATISTICS ALL
+ {
+ AlterTableCmd *n = makeNode(AlterTableCmd);
+ n->subtype = AT_DropStatistics;
+ $$ = (Node *)n;
+ }
| alter_generic_options
{
AlterTableCmd *n = makeNode(AlterTableCmd);
@@ -3389,6 +3419,56 @@ OptConsTableSpace: USING INDEX TABLESPACE name { $$ = $4; }
ExistingIndex: USING INDEX index_name { $$ = $3; }
;
+/*****************************************************************************
+ *
+ * QUERY :
+ * ALTER TABLE relname ADD STATISTICS (columns) WITH (options)
+ *
+ *****************************************************************************/
+
+TableStatistics:
+ STATISTICS OptStatsOptions ON '(' columnList ')'
+ {
+ StatisticsDef *n = makeNode(StatisticsDef);
+ n->keys = $5;
+ n->options = $2;
+ $$ = (Node *) n;
+ }
+ ;
+
+OptStatsOptions:
+ '(' stats_options_list ')' { $$ = $2; }
+ | /*EMPTY*/ { $$ = NIL; }
+ ;
+
+stats_options_list:
+ stats_options_elem
+ {
+ $$ = list_make1($1);
+ }
+ | stats_options_list ',' stats_options_elem
+ {
+ $$ = lappend($1, $3);
+ }
+ ;
+
+stats_options_elem:
+ stats_options_name stats_options_arg
+ {
+ $$ = makeDefElem($1, $2);
+ }
+ ;
+
+stats_options_name:
+ NonReservedWord { $$ = $1; }
+ ;
+
+stats_options_arg:
+ opt_boolean_or_string { $$ = (Node *) makeString($1); }
+ | NumericOnly { $$ = (Node *) $1; }
+ | /* EMPTY */ { $$ = NULL; }
+ ;
+
/*****************************************************************************
*
@@ -13547,7 +13627,6 @@ unreserved_keyword:
| STANDALONE_P
| START
| STATEMENT
- | STATISTICS
| STDIN
| STDOUT
| STORAGE
@@ -13762,6 +13841,7 @@ reserved_keyword:
| SELECT
| SESSION_USER
| SOME
+ | STATISTICS
| SYMMETRIC
| TABLE
| THEN
diff --git a/src/backend/utils/Makefile b/src/backend/utils/Makefile
index 8374533..eba0352 100644
--- a/src/backend/utils/Makefile
+++ b/src/backend/utils/Makefile
@@ -9,7 +9,7 @@ top_builddir = ../../..
include $(top_builddir)/src/Makefile.global
OBJS = fmgrtab.o
-SUBDIRS = adt cache error fmgr hash init mb misc mmgr resowner sort time
+SUBDIRS = adt cache error fmgr hash init mb misc mmgr mvstats resowner sort time
# location of Catalog.pm
catalogdir = $(top_srcdir)/src/backend/catalog
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index e745006..855ff05 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -47,6 +47,7 @@
#include "catalog/pg_auth_members.h"
#include "catalog/pg_constraint.h"
#include "catalog/pg_database.h"
+#include "catalog/pg_mv_statistic.h"
#include "catalog/pg_namespace.h"
#include "catalog/pg_opclass.h"
#include "catalog/pg_proc.h"
@@ -3906,6 +3907,62 @@ RelationGetIndexList(Relation relation)
return result;
}
+
+List *
+RelationGetMVStatList(Relation relation)
+{
+ Relation indrel;
+ SysScanDesc indscan;
+ ScanKeyData skey;
+ HeapTuple htup;
+ List *result;
+ List *oldlist;
+ MemoryContext oldcxt;
+
+ /* Quick exit if we already computed the list. */
+ if (relation->rd_mvstatvalid != 0)
+ return list_copy(relation->rd_mvstatlist);
+
+ /*
+ * We build the list we intend to return (in the caller's context) while
+ * doing the scan. After successfully completing the scan, we copy that
+ * list into the relcache entry. This avoids cache-context memory leakage
+ * if we get some sort of error partway through.
+ */
+ result = NIL;
+
+ /* Prepare to scan pg_mv_statistic for entries having starelid = this rel. */
+ ScanKeyInit(&skey,
+ Anum_pg_mv_statistic_starelid,
+ BTEqualStrategyNumber, F_OIDEQ,
+ ObjectIdGetDatum(RelationGetRelid(relation)));
+
+ indrel = heap_open(MvStatisticRelationId, AccessShareLock);
+ indscan = systable_beginscan(indrel, MvStatisticRelidIndexId, true,
+ NULL, 1, &skey);
+
+ while (HeapTupleIsValid(htup = systable_getnext(indscan)))
+ /* TODO maybe include only already built statistics? */
+ result = insert_ordered_oid(result, HeapTupleGetOid(htup));
+
+ systable_endscan(indscan);
+
+ heap_close(indrel, AccessShareLock);
+
+ /* Now save a copy of the completed list in the relcache entry. */
+ oldcxt = MemoryContextSwitchTo(CacheMemoryContext);
+ oldlist = relation->rd_mvstatlist;
+ relation->rd_mvstatlist = list_copy(result);
+
+ relation->rd_mvstatvalid = true;
+ MemoryContextSwitchTo(oldcxt);
+
+ /* Don't leak the old list, if there is one */
+ list_free(oldlist);
+
+ return result;
+}
+
/*
* insert_ordered_oid
* Insert a new Oid into a sorted list of Oids, preserving ordering
@@ -4875,6 +4932,8 @@ load_relcache_init_file(bool shared)
rel->rd_indexattr = NULL;
rel->rd_keyattr = NULL;
rel->rd_idattr = NULL;
+ rel->rd_mvstatvalid = false;
+ rel->rd_mvstatlist = NIL;
rel->rd_createSubid = InvalidSubTransactionId;
rel->rd_newRelfilenodeSubid = InvalidSubTransactionId;
rel->rd_amcache = NULL;
diff --git a/src/backend/utils/cache/syscache.c b/src/backend/utils/cache/syscache.c
index f58e1ce..9aaf68f 100644
--- a/src/backend/utils/cache/syscache.c
+++ b/src/backend/utils/cache/syscache.c
@@ -43,6 +43,7 @@
#include "catalog/pg_foreign_server.h"
#include "catalog/pg_foreign_table.h"
#include "catalog/pg_language.h"
+#include "catalog/pg_mv_statistic.h"
#include "catalog/pg_namespace.h"
#include "catalog/pg_opclass.h"
#include "catalog/pg_operator.h"
@@ -501,6 +502,17 @@ static const struct cachedesc cacheinfo[] = {
},
4
},
+ {MvStatisticRelationId, /* MVSTATOID */
+ MvStatisticOidIndexId,
+ 1,
+ {
+ ObjectIdAttributeNumber,
+ 0,
+ 0,
+ 0
+ },
+ 128
+ },
{NamespaceRelationId, /* NAMESPACENAME */
NamespaceNameIndexId,
1,
diff --git a/src/backend/utils/mvstats/Makefile b/src/backend/utils/mvstats/Makefile
new file mode 100644
index 0000000..099f1ed
--- /dev/null
+++ b/src/backend/utils/mvstats/Makefile
@@ -0,0 +1,17 @@
+#-------------------------------------------------------------------------
+#
+# Makefile--
+# Makefile for utils/mvstats
+#
+# IDENTIFICATION
+# src/backend/utils/mvstats/Makefile
+#
+#-------------------------------------------------------------------------
+
+subdir = src/backend/utils/mvstats
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+
+OBJS = common.o dependencies.o
+
+include $(top_srcdir)/src/backend/common.mk
diff --git a/src/backend/utils/mvstats/common.c b/src/backend/utils/mvstats/common.c
new file mode 100644
index 0000000..a755c49
--- /dev/null
+++ b/src/backend/utils/mvstats/common.c
@@ -0,0 +1,356 @@
+/*-------------------------------------------------------------------------
+ *
+ * common.c
+ * POSTGRES multivariate statistics
+ *
+ *
+ * Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ * src/backend/utils/mvstats/common.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "common.h"
+
+static VacAttrStats ** lookup_var_attr_stats(int2vector *attrs,
+ int natts, VacAttrStats **vacattrstats);
+
+static List* list_mv_stats(Oid relid);
+
+
+/*
+ * Compute requested multivariate stats, using the rows sampled for the
+ * plain (single-column) stats.
+ *
+ * This fetches a list of stats from pg_mv_statistic, computes the stats
+ * and serializes them back into the catalog (as bytea values).
+ */
+void
+build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
+ int natts, VacAttrStats **vacattrstats)
+{
+ ListCell *lc;
+ List *mvstats;
+
+ TupleDesc tupdesc = RelationGetDescr(onerel);
+
+ /*
+ * Fetch defined MV groups from pg_mv_statistic, and then compute
+ * the MV statistics (functional dependencies for now).
+ */
+ mvstats = list_mv_stats(RelationGetRelid(onerel));
+
+ foreach (lc, mvstats)
+ {
+ int j;
+ MVStatisticInfo *stat = (MVStatisticInfo *)lfirst(lc);
+ MVDependencies deps = NULL;
+
+ VacAttrStats **stats = NULL;
+ int numatts = 0;
+
+ /* int2 vector of attnums the stats should be computed on */
+ int2vector * attrs = stat->stakeys;
+
+ /* see how many of the columns are not dropped */
+ for (j = 0; j < attrs->dim1; j++)
+ if (! tupdesc->attrs[attrs->values[j]-1]->attisdropped)
+ numatts += 1;
+
+ /* if there are dropped attributes, build a filtered int2vector */
+ if (numatts != attrs->dim1)
+ {
+ int16 *tmp = palloc0(numatts * sizeof(int16));
+ int attnum = 0;
+
+ for (j = 0; j < attrs->dim1; j++)
+ if (! tupdesc->attrs[attrs->values[j]-1]->attisdropped)
+ tmp[attnum++] = attrs->values[j];
+
+ pfree(attrs);
+ attrs = buildint2vector(tmp, numatts);
+ }
+
+ /* filter only the interesting vacattrstats records */
+ stats = lookup_var_attr_stats(attrs, natts, vacattrstats);
+
+ /* check allowed number of dimensions */
+ Assert((attrs->dim1 >= 2) && (attrs->dim1 <= MVSTATS_MAX_DIMENSIONS));
+
+ /*
+ * Analyze functional dependencies of columns.
+ */
+ deps = build_mv_dependencies(numrows, rows, attrs, stats);
+
+ /* store the functional dependencies in the catalog */
+ update_mv_stats(stat->mvoid, deps, attrs);
+ }
+}
+
+/*
+ * Lookup the VacAttrStats info for the selected columns, with indexes
+ * matching the attrs vector (to make it easy to work with when
+ * computing multivariate stats).
+ */
+static VacAttrStats **
+lookup_var_attr_stats(int2vector *attrs, int natts, VacAttrStats **vacattrstats)
+{
+ int i, j;
+ int numattrs = attrs->dim1;
+ VacAttrStats **stats = (VacAttrStats**)palloc0(numattrs * sizeof(VacAttrStats*));
+
+ /* lookup VacAttrStats info for the requested columns (same attnum) */
+ for (i = 0; i < numattrs; i++)
+ {
+ stats[i] = NULL;
+ for (j = 0; j < natts; j++)
+ {
+ if (attrs->values[i] == vacattrstats[j]->tupattnum)
+ {
+ stats[i] = vacattrstats[j];
+ break;
+ }
+ }
+
+ /*
+ * Check that we found the info, that the attnum matches, that
+ * the requested 'lt' operator is available, and that the type
+ * is 'passed-by-value'.
+ */
+ Assert(stats[i] != NULL);
+ Assert(stats[i]->tupattnum == attrs->values[i]);
+
+ /* FIXME This is a rather ugly way to check for 'ltopr' (which
+ * is defined for 'scalar' attributes).
+ */
+ Assert(((StdAnalyzeData *)stats[i]->extra_data)->ltopr != InvalidOid);
+ }
+
+ return stats;
+}
+
+/*
+ * Fetch list of MV stats defined on a table, without the actual data
+ * for histograms, MCV lists etc.
+ */
+static List*
+list_mv_stats(Oid relid)
+{
+ Relation indrel;
+ SysScanDesc indscan;
+ ScanKeyData skey;
+ HeapTuple htup;
+ List *result = NIL;
+
+ /* Prepare to scan pg_mv_statistic for entries having starelid = this rel. */
+ ScanKeyInit(&skey,
+ Anum_pg_mv_statistic_starelid,
+ BTEqualStrategyNumber, F_OIDEQ,
+ ObjectIdGetDatum(relid));
+
+ indrel = heap_open(MvStatisticRelationId, AccessShareLock);
+ indscan = systable_beginscan(indrel, MvStatisticRelidIndexId, true,
+ NULL, 1, &skey);
+
+ while (HeapTupleIsValid(htup = systable_getnext(indscan)))
+ {
+ MVStatisticInfo *info = makeNode(MVStatisticInfo);
+ Form_pg_mv_statistic stats = (Form_pg_mv_statistic) GETSTRUCT(htup);
+
+ info->mvoid = HeapTupleGetOid(htup);
+ info->stakeys = buildint2vector(stats->stakeys.values, stats->stakeys.dim1);
+ info->deps_built = stats->deps_built;
+
+ result = lappend(result, info);
+ }
+
+ systable_endscan(indscan);
+
+ heap_close(indrel, AccessShareLock);
+
+ /* TODO maybe save the list into relcache, as in RelationGetIndexList
+ * (which served as the inspiration for this one)? */
+
+ return result;
+}
+
+void
+update_mv_stats(Oid mvoid, MVDependencies dependencies, int2vector *attrs)
+{
+ HeapTuple stup,
+ oldtup;
+ Datum values[Natts_pg_mv_statistic];
+ bool nulls[Natts_pg_mv_statistic];
+ bool replaces[Natts_pg_mv_statistic];
+
+ Relation sd = heap_open(MvStatisticRelationId, RowExclusiveLock);
+
+ memset(nulls, 1, Natts_pg_mv_statistic * sizeof(bool));
+ memset(replaces, 0, Natts_pg_mv_statistic * sizeof(bool));
+ memset(values, 0, Natts_pg_mv_statistic * sizeof(Datum));
+
+ /*
+ * Construct a new pg_mv_statistic tuple - replace only the
+ * dependencies value, depending on whether it actually was computed.
+ */
+ if (dependencies != NULL)
+ {
+ nulls[Anum_pg_mv_statistic_stadeps -1] = false;
+ values[Anum_pg_mv_statistic_stadeps - 1]
+ = PointerGetDatum(serialize_mv_dependencies(dependencies));
+ }
+
+ /* always replace the value (either by bytea or NULL) */
+ replaces[Anum_pg_mv_statistic_stadeps -1] = true;
+
+ /* always change the availability flags */
+ nulls[Anum_pg_mv_statistic_deps_built -1] = false;
+ nulls[Anum_pg_mv_statistic_stakeys-1] = false;
+
+ /* use the new attnums, in case we removed some dropped ones */
+ replaces[Anum_pg_mv_statistic_deps_built-1] = true;
+ replaces[Anum_pg_mv_statistic_stakeys -1] = true;
+
+ values[Anum_pg_mv_statistic_deps_built-1] = BoolGetDatum(dependencies != NULL);
+ values[Anum_pg_mv_statistic_stakeys -1] = PointerGetDatum(attrs);
+
+ /* Is there already a pg_mv_statistic tuple for this attribute? */
+ oldtup = SearchSysCache1(MVSTATOID,
+ ObjectIdGetDatum(mvoid));
+
+ if (HeapTupleIsValid(oldtup))
+ {
+ /* Yes, replace it */
+ stup = heap_modify_tuple(oldtup,
+ RelationGetDescr(sd),
+ values,
+ nulls,
+ replaces);
+ ReleaseSysCache(oldtup);
+ simple_heap_update(sd, &stup->t_self, stup);
+ }
+ else
+ elog(ERROR, "invalid pg_mv_statistic record (oid=%d)", mvoid);
+
+ /* update indexes too */
+ CatalogUpdateIndexes(sd, stup);
+
+ heap_freetuple(stup);
+
+ heap_close(sd, RowExclusiveLock);
+}
+
+/* multi-variate stats comparator */
+
+/*
+ * qsort_arg comparator for sorting Datums (MV stats)
+ *
+ * This does not maintain the tupnoLink array.
+ */
+int
+compare_scalars_simple(const void *a, const void *b, void *arg)
+{
+ Datum da = *(Datum*)a;
+ Datum db = *(Datum*)b;
+ SortSupport ssup= (SortSupport) arg;
+
+ return ApplySortComparator(da, false, db, false, ssup);
+}
+
+/*
+ * qsort_arg comparator for sorting data when partitioning a MV bucket
+ */
+int
+compare_scalars_partition(const void *a, const void *b, void *arg)
+{
+ Datum da = ((ScalarItem*)a)->value;
+ Datum db = ((ScalarItem*)b)->value;
+ SortSupport ssup= (SortSupport) arg;
+
+ return ApplySortComparator(da, false, db, false, ssup);
+}
+
+/* initialize multi-dimensional sort */
+MultiSortSupport
+multi_sort_init(int ndims)
+{
+ MultiSortSupport mss;
+
+ Assert(ndims >= 2);
+
+ mss = (MultiSortSupport)palloc0(offsetof(MultiSortSupportData, ssup)
+ + sizeof(SortSupportData)*ndims);
+
+ mss->ndims = ndims;
+
+ return mss;
+}
+
+/*
+ * add sort info for dimension 'dim' (index into vacattrstats) to mss,
+ * at the position 'sortdim'
+ */
+void
+multi_sort_add_dimension(MultiSortSupport mss, int sortdim,
+ int dim, VacAttrStats **vacattrstats)
+{
+ /* first, lookup StdAnalyzeData for the dimension (attribute) */
+ SortSupportData ssup;
+ StdAnalyzeData *tmp = (StdAnalyzeData *)vacattrstats[dim]->extra_data;
+
+ Assert(mss != NULL);
+ Assert(sortdim < mss->ndims);
+
+ /* initialize sort support, etc. */
+ memset(&ssup, 0, sizeof(ssup));
+ ssup.ssup_cxt = CurrentMemoryContext;
+
+ /* We always use the default collation for statistics */
+ ssup.ssup_collation = DEFAULT_COLLATION_OID;
+ ssup.ssup_nulls_first = false;
+
+ PrepareSortSupportFromOrderingOp(tmp->ltopr, &ssup);
+
+ mss->ssup[sortdim] = ssup;
+}
+
+/* compare all the dimensions in the selected order */
+int
+multi_sort_compare(const void *a, const void *b, void *arg)
+{
+ int i;
+ SortItem *ia = (SortItem*)a;
+ SortItem *ib = (SortItem*)b;
+
+ MultiSortSupport mss = (MultiSortSupport)arg;
+
+ for (i = 0; i < mss->ndims; i++)
+ {
+ int compare;
+
+ compare = ApplySortComparator(ia->values[i], ia->isnull[i],
+ ib->values[i], ib->isnull[i],
+ &mss->ssup[i]);
+
+ if (compare != 0)
+ return compare;
+
+ }
+
+ /* equal by default */
+ return 0;
+}
+
+/* compare selected dimension */
+int
+multi_sort_compare_dim(int dim, const SortItem *a, const SortItem *b,
+ MultiSortSupport mss)
+{
+ return ApplySortComparator(a->values[dim], a->isnull[dim],
+ b->values[dim], b->isnull[dim],
+ &mss->ssup[dim]);
+}
diff --git a/src/backend/utils/mvstats/common.h b/src/backend/utils/mvstats/common.h
new file mode 100644
index 0000000..6d5465b
--- /dev/null
+++ b/src/backend/utils/mvstats/common.h
@@ -0,0 +1,75 @@
+/*-------------------------------------------------------------------------
+ *
+ * common.h
+ * POSTGRES multivariate statistics
+ *
+ *
+ * Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ * src/backend/utils/mvstats/common.h
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres.h"
+
+#include "access/tuptoaster.h"
+#include "catalog/indexing.h"
+#include "catalog/pg_collation.h"
+#include "catalog/pg_mv_statistic.h"
+#include "foreign/fdwapi.h"
+#include "postmaster/autovacuum.h"
+#include "storage/lmgr.h"
+#include "utils/datum.h"
+#include "utils/sortsupport.h"
+#include "utils/syscache.h"
+#include "utils/fmgroids.h"
+#include "utils/builtins.h"
+#include "access/sysattr.h"
+
+#include "utils/mvstats.h"
+
+/* FIXME private structure copied from analyze.c */
+
+typedef struct
+{
+ Oid eqopr; /* '=' operator for datatype, if any */
+ Oid eqfunc; /* and associated function */
+ Oid ltopr; /* '<' operator for datatype, if any */
+} StdAnalyzeData;
+
+typedef struct
+{
+ Datum value; /* a data value */
+ int tupno; /* position index for tuple it came from */
+} ScalarItem;
+
+/* multi-sort */
+typedef struct MultiSortSupportData {
+ int ndims; /* number of dimensions supported by the sort */
+ SortSupportData ssup[1]; /* sort support data for each dimension */
+} MultiSortSupportData;
+
+typedef MultiSortSupportData* MultiSortSupport;
+
+typedef struct SortItem {
+ Datum *values;
+ bool *isnull;
+} SortItem;
+
+MultiSortSupport multi_sort_init(int ndims);
+
+void multi_sort_add_dimension(MultiSortSupport mss, int sortdim,
+ int dim, VacAttrStats **vacattrstats);
+
+int multi_sort_compare(const void *a, const void *b, void *arg);
+
+int multi_sort_compare_dim(int dim, const SortItem *a,
+ const SortItem *b, MultiSortSupport mss);
+
+/* comparators, used when constructing multivariate stats */
+int compare_scalars_simple(const void *a, const void *b, void *arg);
+int compare_scalars_partition(const void *a, const void *b, void *arg);
diff --git a/src/backend/utils/mvstats/dependencies.c b/src/backend/utils/mvstats/dependencies.c
new file mode 100644
index 0000000..0ca16a0
--- /dev/null
+++ b/src/backend/utils/mvstats/dependencies.c
@@ -0,0 +1,638 @@
+/*-------------------------------------------------------------------------
+ *
+ * dependencies.c
+ * POSTGRES multivariate functional dependencies
+ *
+ *
+ * Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ * src/backend/utils/mvstats/dependencies.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "common.h"
+#include "utils/lsyscache.h"
+
+/*
+ * Mine functional dependencies between columns, in the form (A => B),
+ * meaning that a value in column 'A' determines value in 'B'. A simple
+ * artificial example may be a table created like this
+ *
+ * CREATE TABLE deptest (a INT, b INT)
+ * AS SELECT i, i/10 FROM generate_series(1,100000) s(i);
+ *
+ * Clearly, once we know the value for 'A' we can easily determine the
+ * value of 'B' by dividing (A/10). A more practical example may be
+ * addresses, where (ZIP code => city name), i.e. once we know the ZIP,
+ * we probably know which city it belongs to. Larger cities usually have
+ * multiple ZIP codes, so the dependency can't be reversed.
+ *
+ * Functional dependencies are a concept well described in relational
+ * theory, especially in definition of normalization and "normal forms".
+ * Wikipedia has a nice definition of a functional dependency [1]:
+ *
+ * In a given table, an attribute Y is said to have a functional
+ * dependency on a set of attributes X (written X -> Y) if and only
+ * if each X value is associated with precisely one Y value. For
+ * example, in an "Employee" table that includes the attributes
+ * "Employee ID" and "Employee Date of Birth", the functional
+ * dependency {Employee ID} -> {Employee Date of Birth} would hold.
+ * It follows from the previous two sentences that each {Employee ID}
+ * is associated with precisely one {Employee Date of Birth}.
+ *
+ * [1] http://en.wikipedia.org/wiki/Database_normalization
+ *
+ * Most datasets might be normalized not to contain any such functional
+ * dependencies, but sometimes it's not practical. In some cases it's
+ * actually a conscious choice to model the dataset in a denormalized way,
+ * either because of performance or to make querying easier.
+ *
+ * The current implementation supports only dependencies between two
+ * columns, but this is merely a simplification of the initial patch.
+ * It's certainly useful to mine for dependencies involving multiple
+ * columns on the 'left' side, i.e. a condition for the dependency.
+ * That is, dependencies [A,B] => C and so on.
+ *
+ * TODO The implementation may/should be smart enough not to mine both
+ * [A => B] and [A,C => B], because the second dependency is a
+ * consequence of the first one (if values of A determine values
+ * of B, adding another column won't change that). The ANALYZE
+ * should first analyze 1:1 dependencies, then 2:1 dependencies
+ * (and skip the already identified ones), etc.
+ *
+ * For example the dependency [city name => zip code] is much weaker
+ * than [city name, state name => zip code], because there may be
+ * multiple cities with the same name in various states. It's not
+ * perfect though - there are probably cities with the same name within
+ * the same state, but that is hopefully a relatively rare occurrence.
+ * More about this in the section about dependency mining.
+ *
+ * Handling multiple columns on the right side is not necessary, as such
+ * dependencies may be decomposed into a set of dependencies with
+ * the same meaning, one for each column on the right side. For example
+ *
+ * A => [B,C]
+ *
+ * is exactly the same as
+ *
+ * (A => B) & (A => C).
+ *
+ * Of course, storing (A => [B, C]) may be more efficient than storing
+ * the two dependencies (A => B) and (A => C) separately.
+ *
+ *
+ * Dependency mining (ANALYZE)
+ * ---------------------------
+ *
+ * The current build algorithm is rather simple - for each pair [A,B] of
+ * columns, the data are sorted lexicographically (first by A, then B),
+ * and then a number of metrics are computed by walking the sorted data.
+ *
+ * In general the algorithm counts distinct values of A (forming groups
+ * thanks to the sorting), supporting or contradicting the hypothesis
+ * that A => B (i.e. that values of B are predetermined by A). If there
+ * are multiple values of B for a single value of A, it's counted as
+ * contradicting.
+ *
+ * A group may be neither supporting nor contradicting. To be counted as
+ * supporting, the group has to have at least min_group_size(=3) rows.
+ * Smaller 'supporting' groups are counted as neutral.
+ *
+ * Finally, the number of rows in supporting and contradicting groups is
+ * compared, and if there is at least 10x more supporting rows, the
+ * dependency is considered valid.
+ *
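+ * A hypothetical mini-example: with min_group_size = 3, a sample of
+ * [A,B] pairs sorted as
+ *
+ * (1,10) (1,10) (1,10) (2,20) (2,21) (3,30) (3,30)
+ *
+ * contains a supporting group A=1 (three rows, a single B value), a
+ * contradicting group A=2 (two B values - size does not matter here)
+ * and a neutral group A=3 (consistent, but below min_group_size).
+ * With 3 supporting vs. 2 contradicting rows the 10x threshold is
+ * not met, so no dependency is stored.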
+ *
+ * Real-world datasets are imperfect - there may be errors (e.g. due to
+ * data-entry mistakes), or factually correct records, yet contradicting
+ * the dependency (e.g. when a city splits into two, but both keep the
+ * same ZIP code). A strict ANALYZE implementation (where the functional
+ * dependencies are identified) would ignore dependencies on such noisy
+ * data, making the approach unusable in practice.
+ *
+ * The proposed implementation attempts to handle such noisy cases
+ * gracefully, by tolerating a small number of contradicting cases.
+ *
+ * In the future this might also perform some sort of test and decide
+ * whether it's worth building any other kind of multivariate stats,
+ * or whether the dependencies sufficiently describe the data. Or at
+ * least not build the MCV list / histogram on the implied columns.
+ * Such reduction would however make the 'verification' (see the next
+ * section) impossible.
+ *
+ *
+ * Clause reduction (planner/optimizer)
+ * ------------------------------------
+ *
+ * Applying the dependencies is quite simple - given a list of clauses,
+ * try to apply all the dependencies. For example given clause list
+ *
+ * (a = 1) AND (b = 1) AND (c = 1) AND (d < 100)
+ *
+ * and dependencies [a=>b] and [a=>d], this may be reduced to
+ *
+ * (a = 1) AND (c = 1) AND (d < 100)
+ *
+ * The (d<100) can't be reduced as it's not an equality clause, so the
+ * dependency [a=>d] can't be applied.
+ *
+ * See clauselist_apply_dependencies() for more details.
+ *
+ * The problem with the reduction is that the query may use conditions
+ * that are not redundant, but in fact contradictory - e.g. the user
+ * may search for a ZIP code and a city name not matching the ZIP code.
+ *
+ * In such cases, the condition on the city name is not redundant,
+ * but contradictory (making the result empty), and
+ * removing it while estimating the cardinality will make the estimate
+ * worse.
+ *
+ * The current estimation assuming independence (and multiplying the
+ * selectivities) works better in this case, but only by utter luck.
+ *
+ * In some cases this might be verified using the other multivariate
+ * statistics - MCV lists and histograms. For MCV lists the verification
+ * might be very simple - peek into the list to see if there are any
+ * items matching the clause on the 'A' column (e.g. ZIP code), and if
+ * such an item is found, check that the 'B' column matches the other
+ * clause. If it does not, the clauses are contradictory. We can't
+ * really say anything if no such item is found, except maybe
+ * restricting the selectivity using the MCV data (e.g. using min/max
+ * selectivity, or something).
+ *
+ * With histograms, it might work similarly - we can't check the values
+ * directly (because histograms use buckets, unlike MCV lists, which
+ * store the actual values). So we can only observe the buckets matching the
+ * clauses - if those buckets have very low frequency, it probably means
+ * the two clauses are incompatible.
+ *
+ * It's unclear what 'low frequency' is, but if one of the clauses is
+ * implied (automatically true because of the other clause), then
+ *
+ * selectivity[clause(A)] = selectivity[clause(A) & clause(B)]
+ *
+ * So we might compute selectivity of the first clause (on the column
+ * A in dependency [A=>B]) - for example using regular statistics.
+ * And then check if the selectivity computed from the histogram is
+ * about the same (or significantly lower).
+ *
+ * The problem is that histograms work well only when the data ordering
+ * matches the natural meaning. For values that serve as labels - like
+ * city names or ZIP codes, or even generated IDs, histograms really
+ * don't work all that well. For example sorting cities by name won't
+ * match the sorting of ZIP codes, rendering the histogram unusable.
+ *
+ * MCV lists are probably going to work much better, because they don't
+ * really assume any sort of ordering, and they're probably more
+ * appropriate for label-like data.
+ *
+ * TODO Support dependencies with multiple columns on left/right.
+ *
+ * TODO Investigate using histogram and MCV list to confirm the
+ * functional dependencies.
+ *
+ * TODO Investigate statistical testing of the distribution (to decide
+ * whether it makes sense to build the histogram/MCV list).
+ *
+ * TODO Using a min/max of selectivities would probably make more sense
+ * for the associated columns.
+ *
+ * TODO Consider eliminating the implied columns from the histogram and
+ * MCV lists (but maybe that's not a good idea, because that'd make
+ * it impossible to use these stats for non-equality clauses and
+ * also it wouldn't be possible to use the stats for verification
+ * of the dependencies as proposed in another TODO).
+ *
+ * TODO This builds a complete set of dependencies, i.e. including
+ * transitive dependencies - if we identify [A => B] and [B => C],
+ * we're likely to identify [A => C] too. It might be better to
+ * keep only the minimal set of dependencies, i.e. prune all the
+ * dependencies that we can recreate by transitivity.
+ *
+ * There are two conceptual ways to do that:
+ *
+ * (a) generate all the rules, and then prune the rules that may
+ * be recreated by combining other dependencies, or
+ *
+ * (b) perform the 'is a combination of other dependencies' check
+ * before actually doing the work
+ *
+ * The second option has the advantage that we don't really need
+ * to perform the sort/count. It's not sufficient alone, though,
+ * because we may discover the dependencies in the wrong order.
+ * For example [A => B], [A => C] and then [B => C]. None of those
+ * dependencies is a combination of the already known ones, yet
+ * [A => C] is a combination of [A => B] and [B => C].
+ *
+ * FIXME Not sure the current NULL handling makes much sense. We assume
+ * that NULL is 0, so it's handled like a regular value
+ * (NULL == NULL), so all NULLs in a single column form a single
+ * group. Maybe that's not the right thing to do, especially with
+ * equality conditions - in that case NULLs are irrelevant. So
+ * maybe the right solution would be to just ignore NULL values?
+ *
+ * However simply "ignoring" the NULL values does not seem like
+ * a good idea - imagine columns A and B, where for each value of
+ * A, values in B are constant (same for the whole group) or NULL.
+ * Let's say only 10% of B values in each group are not NULL. Then
+ * ignoring the NULL values will result in 10x misestimate (and
+ * it's trivial to construct arbitrary errors). So maybe handling
+ * NULL values just like a regular value is the right thing here.
+ *
+ * Or maybe NULL values should be treated differently on each side
+ * of the dependency? E.g. as ignored on the left (condition) and
+ * as regular values on the right - this seems consistent with how
+ * equality clauses work, as equality clause means 'NOT NULL'.
+ * So if we say [A => B] then it may also imply "NOT NULL" on the
+ * right side.
+ */
+MVDependencies
+build_mv_dependencies(int numrows, HeapTuple *rows, int2vector *attrs,
+ VacAttrStats **stats)
+{
+ int i;
+ int numattrs = attrs->dim1;
+
+ /* result */
+ int ndeps = 0;
+ MVDependencies dependencies = NULL;
+ MultiSortSupport mss = multi_sort_init(2); /* 2 dimensions for now */
+
+ /* TODO Maybe this should be somehow related to the number of
+ * distinct values in the two columns we're currently analyzing.
+ * Assuming the distribution is uniform, that gives the average
+ * group size we should expect to observe in the sample - we could
+ * then use that as a threshold, which seems better than a static
+ * value.
+ */
+ int min_group_size = 3;
+
+ /* dimension indexes we'll check for associations [a => b] */
+ int dima, dimb;
+
+ /*
+ * We'll reuse the same array for all the 2-column combinations.
+ *
+ * It's possible to sort the sample rows directly, but this seemed
+ * somewhat simpler / less error prone. Another option would be to
+ * allocate the arrays for each SortItem separately, but that'd be
+ * significant overhead (not just CPU, but especially memory bloat).
+ */
+ SortItem * items = (SortItem*)palloc0(numrows * sizeof(SortItem));
+
+ Datum *values = (Datum*)palloc0(sizeof(Datum) * numrows * 2);
+ bool *isnull = (bool*)palloc0(sizeof(bool) * numrows * 2);
+
+ for (i = 0; i < numrows; i++)
+ {
+ items[i].values = &values[i * 2];
+ items[i].isnull = &isnull[i * 2];
+ }
+
+ Assert(numattrs >= 2);
+
+ /*
+ * Evaluate all possible combinations of [A => B], using a simple algorithm:
+ *
+ * (a) sort the data by [A,B]
+ * (b) split the data into groups by A (new group whenever a value changes)
+ * (c) count different values in the B column (again, value changes)
+ *
+ * TODO It should be rather simple to merge [A => B] and [A => C] into
+ * [A => B,C]. Just keep A constant, collect all the "implied" columns
+ * and you're done.
+ */
+ for (dima = 0; dima < numattrs; dima++)
+ {
+ /* prepare the sort function for the first dimension */
+ multi_sort_add_dimension(mss, 0, dima, stats);
+
+ for (dimb = 0; dimb < numattrs; dimb++)
+ {
+ SortItem current;
+
+ /* number of groups supporting / contradicting the dependency */
+ int n_supporting = 0;
+ int n_contradicting = 0;
+
+ /* counters valid within a group */
+ int group_size = 0;
+ int n_violations = 0;
+
+ int n_supporting_rows = 0;
+ int n_contradicting_rows = 0;
+
+ /* make sure the columns are different (skip the trivial A => A) */
+ if (dima == dimb)
+ continue;
+
+ /* prepare the sort function for the second dimension */
+ multi_sort_add_dimension(mss, 1, dimb, stats);
+
+ /* reset the values and isnull flags */
+ memset(values, 0, sizeof(Datum) * numrows * 2);
+ memset(isnull, 0, sizeof(bool) * numrows * 2);
+
+ /* accumulate all the data for both columns into an array and sort it */
+ for (i = 0; i < numrows; i++)
+ {
+ items[i].values[0]
+ = heap_getattr(rows[i], attrs->values[dima],
+ stats[dima]->tupDesc, &items[i].isnull[0]);
+
+ items[i].values[1]
+ = heap_getattr(rows[i], attrs->values[dimb],
+ stats[dimb]->tupDesc, &items[i].isnull[1]);
+ }
+
+ qsort_arg((void *) items, numrows, sizeof(SortItem),
+ multi_sort_compare, mss);
+
+ /*
+ * Walk through the array, split it into groups according to
+ * the A value, and count distinct values of B in each group.
+ * If there's a single B value for the whole group, we count
+ * it as supporting the association, otherwise we count it
+ * as contradicting.
+ *
+ * Furthermore we require a group to have at least a certain
+ * number of rows to be counted as supporting the dependency.
+ * A contradicting group however counts regardless of its size.
+ */
+
+ /* start with values from the first row */
+ current = items[0];
+ group_size = 1;
+
+ for (i = 1; i < numrows; i++)
+ {
+ /* end of the group */
+ if (multi_sort_compare_dim(0, &items[i], &current, mss) != 0)
+ {
+ /*
+ * If there are no contradicting rows, count it as
+ * supporting (otherwise contradicting), but only if
+ * the group is large enough.
+ *
+ * The requirement of a minimum group size makes it
+ * impossible to identify [unique,unique] cases, but
+ * that's probably a different case. This is more
+ * about [zip => city] associations etc.
+ *
+ * If there are violations, count the group/rows as
+ * a violation.
+ *
+ * It may be neither, if the group is too small (does
+ * not contain at least min_group_size rows).
+ */
+ if ((n_violations == 0) && (group_size >= min_group_size))
+ {
+ n_supporting += 1;
+ n_supporting_rows += group_size;
+ }
+ else if (n_violations > 0)
+ {
+ n_contradicting += 1;
+ n_contradicting_rows += group_size;
+ }
+
+ /* current values start a new group */
+ n_violations = 0;
+ group_size = 0;
+ }
+ /* mismatch of a B value is contradicting */
+ else if (multi_sort_compare_dim(1, &items[i], &current, mss) != 0)
+ {
+ n_violations += 1;
+ }
+
+ current = items[i];
+ group_size += 1;
+ }
+
+ /* handle the last group (just like above) */
+ if ((n_violations == 0) && (group_size >= min_group_size))
+ {
+ n_supporting += 1;
+ n_supporting_rows += group_size;
+ }
+ else if (n_violations)
+ {
+ n_contradicting += 1;
+ n_contradicting_rows += group_size;
+ }
+
+ /*
+ * See if the number of rows supporting the association is at least
+ * 10x the number of rows violating the hypothetical dependency.
+ *
+ * TODO This is a rather arbitrary limit - I guess it's possible to do
+ * some math to come up with a better rule (e.g. testing a hypothesis
+ * 'this is due to randomness'). We can create a contingency table
+ * from the values and use it for testing. Possibly only when
+ * there are no contradicting rows?
+ *
+ * TODO Also, if (a => b) and (b => a) at the same time, it pretty much
+ * means there's a 1:1 relation (or one is a 'label'), making the
+ * conditions rather redundant. Although it's possible that the
+ * query uses an incompatible combination of values.
+ */
+ if (n_supporting_rows > (n_contradicting_rows * 10))
+ {
+ if (dependencies == NULL)
+ {
+ dependencies = (MVDependencies)palloc0(sizeof(MVDependenciesData));
+ dependencies->magic = MVSTAT_DEPS_MAGIC;
+ }
+ else
+ dependencies = repalloc(dependencies, offsetof(MVDependenciesData, deps)
+ + sizeof(MVDependency) * (dependencies->ndeps + 1));
+
+ /* add the new dependency to the list */
+ dependencies->deps[ndeps] = (MVDependency)palloc0(sizeof(MVDependencyData));
+ dependencies->deps[ndeps]->a = attrs->values[dima];
+ dependencies->deps[ndeps]->b = attrs->values[dimb];
+
+ dependencies->ndeps = (++ndeps);
+ }
+ }
+ }
+
+ pfree(items);
+ pfree(values);
+ pfree(isnull);
+ pfree(stats);
+ pfree(mss);
+
+ return dependencies;
+}
+
+/*
+ * Store the dependencies into a bytea, so that it can be stored in the
+ * pg_mv_statistic catalog.
+ *
+ * Currently this only supports simple two-column rules, and stores them
+ * as a sequence of attnum pairs. In the future, this needs to be made
+ * more complex to support multiple columns on both sides of the
+ * implication (using AND on left, OR on right).
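+ *
+ * The resulting layout is thus (a sketch of the code below):
+ *
+ * [varlena header] [magic (uint32), ndeps (int32)] [a,b] [a,b] ...
+ *
+ * with each [a,b] pair stored as two int16 attnums.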
+ */
+bytea *
+serialize_mv_dependencies(MVDependencies dependencies)
+{
+ int i;
+
+ /* we need the varlena header, the struct header (incl. ndeps), and 2 * int16 per dependency */
+ Size len = VARHDRSZ + offsetof(MVDependenciesData, deps)
+ + dependencies->ndeps * (sizeof(int16) * 2);
+
+ bytea * output = (bytea*)palloc0(len);
+
+ char * tmp = VARDATA(output);
+
+ SET_VARSIZE(output, len);
+
+ /* first, store the number of dimensions / items */
+ memcpy(tmp, dependencies, offsetof(MVDependenciesData, deps));
+ tmp += offsetof(MVDependenciesData, deps);
+
+ /* walk through the dependencies and copy both columns into the bytea */
+ for (i = 0; i < dependencies->ndeps; i++)
+ {
+ memcpy(tmp, &(dependencies->deps[i]->a), sizeof(int16));
+ tmp += sizeof(int16);
+
+ memcpy(tmp, &(dependencies->deps[i]->b), sizeof(int16));
+ tmp += sizeof(int16);
+ }
+
+ return output;
+}
+
+/*
+ * Reads serialized dependencies into MVDependencies structure.
+ */
+MVDependencies
+deserialize_mv_dependencies(bytea * data)
+{
+ int i;
+ Size expected_size;
+ MVDependencies dependencies;
+ char *tmp;
+
+ if (data == NULL)
+ return NULL;
+
+ if (VARSIZE_ANY_EXHDR(data) < offsetof(MVDependenciesData,deps))
+ elog(ERROR, "invalid MVDependencies size %ld (expected at least %ld)",
+ VARSIZE_ANY_EXHDR(data), offsetof(MVDependenciesData,deps));
+
+ /* read the MVDependencies header */
+ dependencies = (MVDependencies)palloc0(sizeof(MVDependenciesData));
+
+ /* initialize pointer to the data part (skip the varlena header) */
+ tmp = VARDATA(data);
+
+ /* get the header and perform basic sanity checks */
+ memcpy(dependencies, tmp, offsetof(MVDependenciesData, deps));
+ tmp += offsetof(MVDependenciesData, deps);
+
+ if (dependencies->magic != MVSTAT_DEPS_MAGIC)
+ {
+ pfree(dependencies);
+ elog(WARNING, "not a MV Dependencies (magic number mismatch)");
+ return NULL;
+ }
+
+ Assert(dependencies->ndeps > 0);
+
+ /* what bytea size do we expect for those parameters */
+ expected_size = offsetof(MVDependenciesData,deps) +
+ dependencies->ndeps * sizeof(int16) * 2;
+
+ if (VARSIZE_ANY_EXHDR(data) != expected_size)
+ elog(ERROR, "invalid dependencies size %ld (expected %ld)",
+ VARSIZE_ANY_EXHDR(data), expected_size);
+
+ /* allocate space for the dependency items */
+ dependencies = repalloc(dependencies, offsetof(MVDependenciesData,deps)
+ + (dependencies->ndeps * sizeof(MVDependency)));
+
+ for (i = 0; i < dependencies->ndeps; i++)
+ {
+ dependencies->deps[i] = (MVDependency)palloc0(sizeof(MVDependencyData));
+
+ memcpy(&(dependencies->deps[i]->a), tmp, sizeof(int16));
+ tmp += sizeof(int16);
+
+ memcpy(&(dependencies->deps[i]->b), tmp, sizeof(int16));
+ tmp += sizeof(int16);
+ }
+
+ return dependencies;
+}
+
+/* print some basic info about dependencies (number of dependencies) */
+Datum
+pg_mv_stats_dependencies_info(PG_FUNCTION_ARGS)
+{
+ bytea *data = PG_GETARG_BYTEA_P(0);
+ char *result;
+
+ MVDependencies dependencies = deserialize_mv_dependencies(data);
+
+ if (dependencies == NULL)
+ PG_RETURN_NULL();
+
+ result = palloc0(128);
+ snprintf(result, 128, "dependencies=%d", dependencies->ndeps);
+
+ /* FIXME free the deserialized data (pfree is not enough) */
+
+ PG_RETURN_TEXT_P(cstring_to_text(result));
+}
+
+/* print the dependencies
+ *
+ * TODO Would be nice if this knew the actual column names (instead of
+ * the attnums).
+ *
+ * FIXME This is really ugly and does not really check the lengths and
+ * strcpy/snprintf return values properly. Needs to be fixed.
+ */
+Datum
+pg_mv_stats_dependencies_show(PG_FUNCTION_ARGS)
+{
+ int i = 0;
+ bytea *data = PG_GETARG_BYTEA_P(0);
+ char *result = NULL;
+ int len = 0;
+
+ MVDependencies dependencies = deserialize_mv_dependencies(data);
+
+ if (dependencies == NULL)
+ PG_RETURN_NULL();
+
+ for (i = 0; i < dependencies->ndeps; i++)
+ {
+ MVDependency dependency = dependencies->deps[i];
+ char buffer[128];
+
+ int tmp = snprintf(buffer, 128, "%s%d => %d",
+ ((i == 0) ? "" : ", "), dependency->a, dependency->b);
+
+ if (tmp < 127)
+ {
+ if (result == NULL)
+ result = palloc0(len + tmp + 1);
+ else
+ result = repalloc(result, len + tmp + 1);
+
+ strcpy(result + len, buffer);
+ len += tmp;
+ }
+ }
+
+ PG_RETURN_TEXT_P(cstring_to_text(result));
+}
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 04d769e..0b3518c 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -2096,6 +2096,46 @@ describeOneTableDetails(const char *schemaname,
PQclear(result);
}
+ /* print any multivariate statistics */
+ if (pset.sversion >= 90500)
+ {
+ printfPQExpBuffer(&buf,
+ "SELECT oid, stakeys,\n"
+ " deps_enabled,\n"
+ " deps_built,\n"
+ " mcv_max_items, hist_max_buckets,\n"
+ " (SELECT string_agg(attname::text,', ')\n"
+ " FROM ((SELECT unnest(stakeys) AS attnum) s\n"
+ " JOIN pg_attribute a ON (starelid = a.attrelid and a.attnum = s.attnum))) AS attnums\n"
+ "FROM pg_mv_statistic stat WHERE starelid = '%s' ORDER BY 1;",
+ oid);
+
+ result = PSQLexec(buf.data);
+ if (!result)
+ goto error_return;
+ else
+ tuples = PQntuples(result);
+
+ if (tuples > 0)
+ {
+ printTableAddFooter(&cont, _("Statistics:"));
+ for (i = 0; i < tuples; i++)
+ {
+ printfPQExpBuffer(&buf, " ");
+
+ /* options */
+ if (!strcmp(PQgetvalue(result, i, 2), "t"))
+ appendPQExpBuffer(&buf, "(dependencies)");
+
+ appendPQExpBuffer(&buf, " ON (%s)",
+ PQgetvalue(result, i, 6));
+
+ printTableAddFooter(&cont, buf.data);
+ }
+ }
+ PQclear(result);
+ }
+
/* print rules */
if (tableinfo.hasrules && tableinfo.relkind != 'm')
{
diff --git a/src/include/catalog/heap.h b/src/include/catalog/heap.h
index e6ac394..36debeb 100644
--- a/src/include/catalog/heap.h
+++ b/src/include/catalog/heap.h
@@ -119,6 +119,7 @@ extern void RemoveAttrDefault(Oid relid, AttrNumber attnum,
DropBehavior behavior, bool complain, bool internal);
extern void RemoveAttrDefaultById(Oid attrdefId);
extern void RemoveStatistics(Oid relid, AttrNumber attnum);
+extern void RemoveMVStatistics(Oid relid, AttrNumber attnum);
extern Form_pg_attribute SystemAttributeDefinition(AttrNumber attno,
bool relhasoids);
diff --git a/src/include/catalog/indexing.h b/src/include/catalog/indexing.h
index 71e0010..e404ae3 100644
--- a/src/include/catalog/indexing.h
+++ b/src/include/catalog/indexing.h
@@ -173,6 +173,11 @@ DECLARE_UNIQUE_INDEX(pg_largeobject_loid_pn_index, 2683, on pg_largeobject using
DECLARE_UNIQUE_INDEX(pg_largeobject_metadata_oid_index, 2996, on pg_largeobject_metadata using btree(oid oid_ops));
#define LargeObjectMetadataOidIndexId 2996
+DECLARE_UNIQUE_INDEX(pg_mv_statistic_oid_index, 3380, on pg_mv_statistic using btree(oid oid_ops));
+#define MvStatisticOidIndexId 3380
+DECLARE_INDEX(pg_mv_statistic_relid_index, 3379, on pg_mv_statistic using btree(starelid oid_ops));
+#define MvStatisticRelidIndexId 3379
+
DECLARE_UNIQUE_INDEX(pg_namespace_nspname_index, 2684, on pg_namespace using btree(nspname name_ops));
#define NamespaceNameIndexId 2684
DECLARE_UNIQUE_INDEX(pg_namespace_oid_index, 2685, on pg_namespace using btree(oid oid_ops));
diff --git a/src/include/catalog/pg_mv_statistic.h b/src/include/catalog/pg_mv_statistic.h
new file mode 100644
index 0000000..81ec23b
--- /dev/null
+++ b/src/include/catalog/pg_mv_statistic.h
@@ -0,0 +1,69 @@
+/*-------------------------------------------------------------------------
+ *
+ * pg_mv_statistic.h
+ * definition of the system "multivariate statistic" relation (pg_mv_statistic)
+ * along with the relation's initial contents.
+ *
+ *
+ * Portions Copyright (c) 1996-2014, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/catalog/pg_mv_statistic.h
+ *
+ * NOTES
+ * the genbki.pl script reads this file and generates .bki
+ * information from the DATA() statements.
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef PG_MV_STATISTIC_H
+#define PG_MV_STATISTIC_H
+
+#include "catalog/genbki.h"
+
+/* ----------------
+ * pg_mv_statistic definition. cpp turns this into
+ * typedef struct FormData_pg_mv_statistic
+ * ----------------
+ */
+#define MvStatisticRelationId 3381
+
+CATALOG(pg_mv_statistic,3381)
+{
+ /* These fields form the unique key for the entry: */
+ Oid starelid; /* relation containing attributes */
+
+ /* statistics requested to build */
+ bool deps_enabled; /* analyze dependencies? */
+
+ /* statistics that are available (if requested) */
+ bool deps_built; /* dependencies were built */
+
+ /* variable-length fields start here, but we allow direct access to stakeys */
+ int2vector stakeys; /* array of column keys */
+
+#ifdef CATALOG_VARLEN
+ bytea stadeps; /* dependencies (serialized) */
+#endif
+
+} FormData_pg_mv_statistic;
+
+/* ----------------
+ * Form_pg_mv_statistic corresponds to a pointer to a tuple with
+ * the format of pg_mv_statistic relation.
+ * ----------------
+ */
+typedef FormData_pg_mv_statistic *Form_pg_mv_statistic;
+
+/* ----------------
+ * compiler constants for pg_mv_statistic
+ * ----------------
+ */
+#define Natts_pg_mv_statistic 5
+#define Anum_pg_mv_statistic_starelid 1
+#define Anum_pg_mv_statistic_deps_enabled 2
+#define Anum_pg_mv_statistic_deps_built 3
+#define Anum_pg_mv_statistic_stakeys 4
+#define Anum_pg_mv_statistic_stadeps 5
+
+#endif /* PG_MV_STATISTIC_H */
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
index bd67d72..5024a01 100644
--- a/src/include/catalog/pg_proc.h
+++ b/src/include/catalog/pg_proc.h
@@ -2724,6 +2724,11 @@ DESCR("current user privilege on any column by rel name");
DATA(insert OID = 3029 ( has_any_column_privilege PGNSP PGUID 12 10 0 0 0 f f f f t f s 2 0 16 "26 25" _null_ _null_ _null_ _null_ _null_ has_any_column_privilege_id _null_ _null_ _null_ ));
DESCR("current user privilege on any column by rel oid");
+DATA(insert OID = 3284 ( pg_mv_stats_dependencies_info PGNSP PGUID 12 1 0 0 0 f f f f t f i 1 0 25 "17" _null_ _null_ _null_ _null_ _null_ pg_mv_stats_dependencies_info _null_ _null_ _null_ ));
+DESCR("multivariate stats: functional dependencies info");
+DATA(insert OID = 3285 ( pg_mv_stats_dependencies_show PGNSP PGUID 12 1 0 0 0 f f f f t f i 1 0 25 "17" _null_ _null_ _null_ _null_ _null_ pg_mv_stats_dependencies_show _null_ _null_ _null_ ));
+DESCR("multivariate stats: functional dependencies show");
+
DATA(insert OID = 1928 ( pg_stat_get_numscans PGNSP PGUID 12 1 0 0 0 f f f f t f s 1 0 20 "26" _null_ _null_ _null_ _null_ _null_ pg_stat_get_numscans _null_ _null_ _null_ ));
DESCR("statistics: number of scans done for table/index");
DATA(insert OID = 1929 ( pg_stat_get_tuples_returned PGNSP PGUID 12 1 0 0 0 f f f f t f s 1 0 20 "26" _null_ _null_ _null_ _null_ _null_ pg_stat_get_tuples_returned _null_ _null_ _null_ ));
diff --git a/src/include/catalog/toasting.h b/src/include/catalog/toasting.h
index fb2f035..724a169 100644
--- a/src/include/catalog/toasting.h
+++ b/src/include/catalog/toasting.h
@@ -49,6 +49,7 @@ extern void BootstrapToastTable(char *relName,
DECLARE_TOAST(pg_attrdef, 2830, 2831);
DECLARE_TOAST(pg_constraint, 2832, 2833);
DECLARE_TOAST(pg_description, 2834, 2835);
+DECLARE_TOAST(pg_mv_statistic, 3288, 3289);
DECLARE_TOAST(pg_proc, 2836, 2837);
DECLARE_TOAST(pg_rewrite, 2838, 2839);
DECLARE_TOAST(pg_seclabel, 3598, 3599);
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index 8991f3f..d60835f 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -243,6 +243,7 @@ typedef enum NodeTag
T_PlaceHolderInfo,
T_MinMaxAggInfo,
T_PlannerParamItem,
+ T_MVStatisticInfo,
/*
* TAGS FOR MEMORY NODES (memnodes.h)
@@ -415,6 +416,7 @@ typedef enum NodeTag
T_WithClause,
T_CommonTableExpr,
T_RoleSpec,
+ T_StatisticsDef,
/*
* TAGS FOR REPLICATION GRAMMAR PARSE NODES (replnodes.h)
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 852eb4f..3cd57fd 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -570,6 +570,14 @@ typedef struct ColumnDef
int location; /* parse location, or -1 if none/unknown */
} ColumnDef;
+typedef struct StatisticsDef
+{
+ NodeTag type;
+ List *keys; /* String nodes naming referenced column(s) */
+ List *options; /* list of DefElem nodes */
+} StatisticsDef;
+
+
/*
* TableLikeClause - CREATE TABLE ( ... LIKE ... ) clause
*/
@@ -1372,7 +1380,9 @@ typedef enum AlterTableType
AT_ReplicaIdentity, /* REPLICA IDENTITY */
AT_EnableRowSecurity, /* ENABLE ROW SECURITY */
AT_DisableRowSecurity, /* DISABLE ROW SECURITY */
- AT_GenericOptions /* OPTIONS (...) */
+ AT_GenericOptions, /* OPTIONS (...) */
+ AT_AddStatistics, /* ADD STATISTICS */
+ AT_DropStatistics /* DROP STATISTICS */
} AlterTableType;
typedef struct ReplicaIdentityStmt
diff --git a/src/include/nodes/relation.h b/src/include/nodes/relation.h
index 1713d29..f6c4932 100644
--- a/src/include/nodes/relation.h
+++ b/src/include/nodes/relation.h
@@ -453,6 +453,7 @@ typedef struct RelOptInfo
Relids lateral_relids; /* minimum parameterization of rel */
Relids lateral_referencers; /* rels that reference me laterally */
List *indexlist; /* list of IndexOptInfo */
+ List *mvstatlist; /* list of MVStatisticInfo */
BlockNumber pages; /* size estimates derived from pg_class */
double tuples;
double allvisfrac;
@@ -545,6 +546,33 @@ typedef struct IndexOptInfo
bool amhasgetbitmap; /* does AM have amgetbitmap interface? */
} IndexOptInfo;
+/*
+ * MVStatisticInfo
+ * Information about multivariate stats for planning/optimization
+ *
+ * This contains information about which columns are covered by the
+ * statistics (stakeys), which options were requested while adding the
+ * statistics (*_enabled), and which kinds of statistics were actually
+ * built and are available for the optimizer (*_built).
+ */
+typedef struct MVStatisticInfo
+{
+ NodeTag type;
+
+ Oid mvoid; /* OID of the statistics row */
+ RelOptInfo *rel; /* back-link to index's table */
+
+ /* enabled statistics */
+ bool deps_enabled; /* functional dependencies enabled */
+
+ /* built/available statistics */
+ bool deps_built; /* functional dependencies built */
+
+ /* columns in the statistics (attnums) */
+ int2vector *stakeys; /* attnums of the columns covered */
+
+} MVStatisticInfo;
+
/*
* EquivalenceClasses
diff --git a/src/include/parser/kwlist.h b/src/include/parser/kwlist.h
index 5b1ee15..0d7d758 100644
--- a/src/include/parser/kwlist.h
+++ b/src/include/parser/kwlist.h
@@ -355,7 +355,7 @@ PG_KEYWORD("stable", STABLE, UNRESERVED_KEYWORD)
PG_KEYWORD("standalone", STANDALONE_P, UNRESERVED_KEYWORD)
PG_KEYWORD("start", START, UNRESERVED_KEYWORD)
PG_KEYWORD("statement", STATEMENT, UNRESERVED_KEYWORD)
-PG_KEYWORD("statistics", STATISTICS, UNRESERVED_KEYWORD)
+PG_KEYWORD("statistics", STATISTICS, RESERVED_KEYWORD)
PG_KEYWORD("stdin", STDIN, UNRESERVED_KEYWORD)
PG_KEYWORD("stdout", STDOUT, UNRESERVED_KEYWORD)
PG_KEYWORD("storage", STORAGE, UNRESERVED_KEYWORD)
diff --git a/src/include/utils/mvstats.h b/src/include/utils/mvstats.h
new file mode 100644
index 0000000..411cd16
--- /dev/null
+++ b/src/include/utils/mvstats.h
@@ -0,0 +1,69 @@
+/*-------------------------------------------------------------------------
+ *
+ * mvstats.h
+ * Multivariate statistics and selectivity estimation functions.
+ *
+ *
+ * Portions Copyright (c) 1996-2014, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/utils/mvstats.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef MVSTATS_H
+#define MVSTATS_H
+
+#include "commands/vacuum.h"
+
+
+#define MVSTATS_MAX_DIMENSIONS 8 /* max number of attributes */
+
+/* An association rule, tracking an [a => b] dependency.
+ *
+ * TODO Make this work with multiple columns on both sides.
+ */
+typedef struct MVDependencyData {
+ int16 a;
+ int16 b;
+} MVDependencyData;
+
+typedef MVDependencyData* MVDependency;
+
+typedef struct MVDependenciesData {
+ uint32 magic; /* magic constant marker */
+ int32 ndeps; /* number of dependencies */
+ MVDependency deps[1]; /* XXX why not a pointer? */
+} MVDependenciesData;
+
+typedef MVDependenciesData* MVDependencies;
+
+#define MVSTAT_DEPS_MAGIC 0xB4549A2C /* marks serialized bytea */
+#define MVSTAT_DEPS_TYPE_BASIC 1 /* basic dependencies type */
+
+/*
+ * TODO Maybe fetching the histogram/MCV list separately is inefficient?
+ * Consider adding a single `fetch_stats` method, fetching all
+ * stats specified using flags (or something like that).
+ */
+
+bytea * serialize_mv_dependencies(MVDependencies dependencies);
+
+/* deserialization of stats (serialization is private to analyze) */
+MVDependencies deserialize_mv_dependencies(bytea * data);
+
+/* FIXME this probably belongs somewhere else (not to operations stats) */
+extern Datum pg_mv_stats_dependencies_info(PG_FUNCTION_ARGS);
+extern Datum pg_mv_stats_dependencies_show(PG_FUNCTION_ARGS);
+
+MVDependencies
+build_mv_dependencies(int numrows, HeapTuple *rows,
+ int2vector *attrs,
+ VacAttrStats **stats);
+
+void build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
+ int natts, VacAttrStats **vacattrstats);
+
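+/*
+ * Typical usage (a sketch, mirroring build_mv_dependencies): initialize
+ * the support with multi_sort_init(ndims), set up each dimension with
+ * multi_sort_add_dimension(), and then sort an array of SortItem using
+ * qsort_arg(items, nitems, sizeof(SortItem), multi_sort_compare, mss).
+ */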
+void update_mv_stats(Oid relid, MVDependencies dependencies, int2vector *attrs);
+
+#endif
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index 9e17d87..83ca7fb 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -80,6 +80,7 @@ typedef struct RelationData
bool rd_isvalid; /* relcache entry is valid */
char rd_indexvalid; /* state of rd_indexlist: 0 = not valid, 1 =
* valid, 2 = temporarily forced */
+ bool rd_mvstatvalid; /* state of rd_mvstatlist: true/false */
/*
* rd_createSubid is the ID of the highest subtransaction the rel has
@@ -112,6 +113,9 @@ typedef struct RelationData
List *rd_indexlist; /* list of OIDs of indexes on relation */
Oid rd_oidindex; /* OID of unique index on OID, if any */
Oid rd_replidindex; /* OID of replica identity index, if any */
+
+ /* data managed by RelationGetMVStatList: */
+ List *rd_mvstatlist; /* list of OIDs of multivariate stats */
/* data managed by RelationGetIndexAttrBitmap: */
Bitmapset *rd_indexattr; /* identifies columns used in indexes */
diff --git a/src/include/utils/relcache.h b/src/include/utils/relcache.h
index 6953281..77efeff 100644
--- a/src/include/utils/relcache.h
+++ b/src/include/utils/relcache.h
@@ -38,6 +38,7 @@ extern void RelationClose(Relation relation);
* Routines to compute/retrieve additional cached information
*/
extern List *RelationGetIndexList(Relation relation);
+extern List *RelationGetMVStatList(Relation relation);
extern Oid RelationGetOidIndex(Relation relation);
extern Oid RelationGetReplicaIndex(Relation relation);
extern List *RelationGetIndexExpressions(Relation relation);
diff --git a/src/include/utils/syscache.h b/src/include/utils/syscache.h
index 6634099..ac119b3 100644
--- a/src/include/utils/syscache.h
+++ b/src/include/utils/syscache.h
@@ -66,6 +66,7 @@ enum SysCacheIdentifier
INDEXRELID,
LANGNAME,
LANGOID,
+ MVSTATOID,
NAMESPACENAME,
NAMESPACEOID,
OPERNAMENSP,
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index f7f016b..2f9758f 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1353,6 +1353,14 @@ pg_matviews| SELECT n.nspname AS schemaname,
LEFT JOIN pg_namespace n ON ((n.oid = c.relnamespace)))
LEFT JOIN pg_tablespace t ON ((t.oid = c.reltablespace)))
WHERE (c.relkind = 'm'::"char");
+pg_mv_stats| SELECT n.nspname AS schemaname,
+ c.relname AS tablename,
+ s.stakeys AS attnums,
+ length(s.stadeps) AS depsbytes,
+ pg_mv_stats_dependencies_info(s.stadeps) AS depsinfo
+ FROM ((pg_mv_statistic s
+ JOIN pg_class c ON ((c.oid = s.starelid)))
+ LEFT JOIN pg_namespace n ON ((n.oid = c.relnamespace)));
pg_policies| SELECT n.nspname AS schemaname,
c.relname AS tablename,
pol.polname AS policyname,
diff --git a/src/test/regress/expected/sanity_check.out b/src/test/regress/expected/sanity_check.out
index eb0bc88..92a0d8a 100644
--- a/src/test/regress/expected/sanity_check.out
+++ b/src/test/regress/expected/sanity_check.out
@@ -113,6 +113,7 @@ pg_inherits|t
pg_language|t
pg_largeobject|t
pg_largeobject_metadata|t
+pg_mv_statistic|t
pg_namespace|t
pg_opclass|t
pg_operator|t
--
1.9.3
Attachment: 0002-clause-reduction-using-functional-dependencies.patch (text/x-patch)
From 827e5633e4368706a60ea6a949208205fc0928a3 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@pgaddict.com>
Date: Mon, 6 Apr 2015 19:42:18 +0200
Subject: [PATCH 2/6] clause reduction using functional dependencies
During planning, use functional dependencies to decide
which clauses to skip during cardinality estimation.
Initial and rather simplistic implementation.
This only works with regular WHERE clauses, not clauses
used for joining.
Note: The clause_is_mv_compatible() needs to identify the
relation (so that we can fetch the list of multivariate stats
by OID). planner_rt_fetch() seems like the appropriate way to
get the relation OID, but apparently it only works with simple
vars. Maybe examine_variable() would make this work with more
complex vars too?
Includes regression tests analyzing functional dependencies
(part of ANALYZE) on several datasets (no dependencies, no
transitive dependencies, ...).
Checks that a query with conditions on two columns, where one (B)
is functionally dependent on the other one (A), correctly ignores
the clause on (B) and chooses bitmap index scan instead of plain
index scan (which is what happens otherwise, thanks to assumption
of independence).
Note: Functional dependencies only work with equality clauses,
no inequalities etc.
---
src/backend/commands/analyze.c | 1 +
src/backend/commands/tablecmds.c | 8 +-
src/backend/optimizer/path/clausesel.c | 659 +++++++++++++++++++++++++-
src/backend/utils/mvstats/common.c | 5 +-
src/backend/utils/mvstats/dependencies.c | 24 +
src/include/catalog/pg_proc.h | 4 +-
src/include/utils/mvstats.h | 16 +-
src/test/regress/expected/mv_dependencies.out | 172 +++++++
src/test/regress/parallel_schedule | 3 +
src/test/regress/serial_schedule | 1 +
src/test/regress/sql/mv_dependencies.sql | 150 ++++++
11 files changed, 1035 insertions(+), 8 deletions(-)
create mode 100644 src/test/regress/expected/mv_dependencies.out
create mode 100644 src/test/regress/sql/mv_dependencies.sql
diff --git a/src/backend/commands/analyze.c b/src/backend/commands/analyze.c
index fff27e0..8f335f2 100644
--- a/src/backend/commands/analyze.c
+++ b/src/backend/commands/analyze.c
@@ -116,6 +116,7 @@ static void update_attstats(Oid relid, bool inh,
static Datum std_fetch_func(VacAttrStatsP stats, int rownum, bool *isNull);
static Datum ind_fetch_func(VacAttrStatsP stats, int rownum, bool *isNull);
+
/*
* analyze_rel() -- analyze one relation
*/
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 5c57146..b372660 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -11914,7 +11914,7 @@ static void ATExecAddStatistics(AlteredTableInfo *tab, Relation rel,
Relation mvstatrel;
/* by default build everything */
- bool build_dependencies = true;
+ bool build_dependencies = false;
Assert(IsA(def, StatisticsDef));
@@ -11976,6 +11976,12 @@ static void ATExecAddStatistics(AlteredTableInfo *tab, Relation rel,
opt->defname)));
}
+ /* check that at least some statistics were requested */
+ if (! build_dependencies)
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("no statistics type (dependencies) was requested")));
+
/* sort the attnums and build int2vector */
qsort(attnums, numcols, sizeof(int16), compare_int16);
stakeys = buildint2vector(attnums, numcols);
diff --git a/src/backend/optimizer/path/clausesel.c b/src/backend/optimizer/path/clausesel.c
index dcac1c1..fb7adf8 100644
--- a/src/backend/optimizer/path/clausesel.c
+++ b/src/backend/optimizer/path/clausesel.c
@@ -24,6 +24,14 @@
#include "utils/lsyscache.h"
#include "utils/selfuncs.h"
+#include "utils/mvstats.h"
+#include "catalog/pg_collation.h"
+#include "utils/typcache.h"
+
+#include "parser/parsetree.h"
+
+
+#include <stdio.h>
/*
* Data structure for accumulating info about possible range-query
@@ -43,6 +51,16 @@ static void addRangeClause(RangeQueryClause **rqlist, Node *clause,
bool varonleft, bool isLTsel, Selectivity s2);
+static bool clause_is_mv_compatible(PlannerInfo *root, Node *clause, Oid varRelid,
+ Index *relid, AttrNumber *attnum, SpecialJoinInfo *sjinfo);
+
+static Bitmapset *collect_mv_attnums(PlannerInfo *root, List *clauses,
+ Oid varRelid, Index *relid, SpecialJoinInfo *sjinfo);
+
+static List *clauselist_apply_dependencies(PlannerInfo *root, List *clauses,
+ Oid varRelid, List *stats,
+ SpecialJoinInfo *sjinfo);
+
/****************************************************************************
* ROUTINES TO COMPUTE SELECTIVITIES
****************************************************************************/
@@ -61,7 +79,7 @@ static void addRangeClause(RangeQueryClause **rqlist, Node *clause,
* subclauses. However, that's only right if the subclauses have independent
* probabilities, and in reality they are often NOT independent. So,
* we want to be smarter where we can.
-
+ *
* Currently, the only extra smarts we have is to recognize "range queries",
* such as "x > 34 AND x < 42". Clauses are recognized as possible range
* query components if they are restriction opclauses whose operators have
@@ -88,6 +106,76 @@ static void addRangeClause(RangeQueryClause **rqlist, Node *clause,
*
* Of course this is all very dependent on the behavior of
* scalarltsel/scalargtsel; perhaps some day we can generalize the approach.
+ *
+ *
+ * Multivariate statistics
+ * -----------------------
+ * This also uses multivariate stats to estimate combinations of conditions,
+ * in a way attempting to minimize the overhead when there are no suitable
+ * multivariate stats.
+ *
+ * The following checks are performed (in this order), and the optimizer
+ * falls back to regular stats on the first 'false'.
+ *
+ * NOTE: This explains how this works with all the patches applied, not
+ * just the functional dependencies.
+ *
+ * (1) check that at least two columns are referenced from conditions
+ * compatible with multivariate stats
+ *
+ * If there are no conditions that might be handled by multivariate
+ * stats, or if the conditions reference just a single column, it
+ * makes no sense to use multivariate stats.
+ *
+ * What conditions are compatible with multivariate stats is decided
+ * by clause_is_mv_compatible(). At this moment, only simple conditions
+ * of the form "column operator constant" (for simple comparison
+ * operators), and IS NULL / IS NOT NULL are considered compatible
+ * with multivariate statistics.
+ *
+ * (2) reduce the clauses using functional dependencies
+ *
+ * This simply attempts to 'reduce' the clauses by applying functional
+ * dependencies. For example if there are two clauses:
+ *
+ * WHERE (a = 1) AND (b = 2)
+ *
+ * and we know that 'a' determines the value of 'b', we may remove
+ * the second condition (b = 2) when computing the selectivity.
+ * This is of course tricky - see mvstats/dependencies.c for details.
+ *
+ * After the reduction, step (1) is to be repeated.
+ *
+ * (3) check if there are multivariate stats built on the columns
+ *
+ * If there are no multivariate statistics, we have to fall back to
+ * the regular stats. We might perform checks (1) and (2) in reverse
+ * order, i.e. first check if there are multivariate statistics and
+ * then collect the attributes only if needed. The assumption is
+ * that checking the clauses is cheaper than querying the catalog,
+ * so this check is performed first.
+ *
+ * (4) choose the stats matching the most columns (at least two)
+ *
+ * If there are multiple instances of multivariate statistics (e.g.
+ * built on different sets of columns), we choose the stats covering
+ * the most columns from step (1). It may happen that all available
+ * stats match just a single column - for example with conditions
+ *
+ * WHERE a = 1 AND b = 2
+ *
+ * and statistics built on (a,c) and (b,c). In such case just fall
+ * back to the regular stats because it makes no sense to use the
+ * multivariate statistics.
+ *
+ * This selection criterion (the most columns) is certainly very
+ * simple and definitely not optimal - it's simple to come up with
+ * examples where other approaches work better. More about this
+ * at choose_mv_statistics().
+ *
+ * (5) use the multivariate stats to estimate matching clauses
+ *
+ * (6) estimate the remaining clauses using the regular statistics
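+ *
+ * To illustrate with a sketch: given a dependency [a => b] built by
+ * ANALYZE and a query
+ *
+ * WHERE (a = 1) AND (b = 2) AND (c < 10)
+ *
+ * steps (1) and (2) reduce the list to (a = 1) AND (c < 10), and the
+ * remaining clauses are estimated using the regular per-column stats.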
*/
Selectivity
clauselist_selectivity(PlannerInfo *root,
@@ -100,6 +188,12 @@ clauselist_selectivity(PlannerInfo *root,
RangeQueryClause *rqlist = NULL;
ListCell *l;
+ /* processing mv stats */
+ Oid relid = InvalidOid;
+
+ /* attributes in mv-compatible clauses */
+ Bitmapset *mvattnums = NULL;
+
/*
* If there's exactly one clause, then no use in trying to match up pairs,
* so just go directly to clause_selectivity().
@@ -108,6 +202,35 @@ clauselist_selectivity(PlannerInfo *root,
return clause_selectivity(root, (Node *) linitial(clauses),
varRelid, jointype, sjinfo);
+ /* collect attributes referenced by mv-compatible clauses */
+ mvattnums = collect_mv_attnums(root, clauses, varRelid, &relid, sjinfo);
+
+ /*
+ * If there are mv-compatible clauses, referencing at least two
+ * different columns (otherwise it makes no sense to use mv stats),
+ * try to reduce the clauses using functional dependencies, and
+ * recollect the attributes from the reduced list.
+ *
+ * We don't need to select a single statistics for this - we can
+ * apply all the functional dependencies we have.
+ */
+ if (bms_num_members(mvattnums) >= 2)
+ {
+ /*
+ * fetch info from the catalog (not the serialized stats yet)
+ *
+ * TODO This is rather ugly - we get the stats as a list from
+ * RelOptInfo (thanks to relcache/syscache), but we transform
+ * it into an array (which the other methods use for now).
+ * This should not be necessary, I guess.
+ */
+ List *stats = root->simple_rel_array[relid]->mvstatlist;
+
+ /* reduce clauses by applying functional dependencies rules */
+ clauses = clauselist_apply_dependencies(root, clauses, varRelid,
+ stats, sjinfo);
+ }
+
/*
* Initial scan over clauses. Anything that doesn't look like a potential
* rangequery clause gets multiplied into s1 and forgotten. Anything that
@@ -782,3 +905,537 @@ clause_selectivity(PlannerInfo *root,
return s1;
}
+
+/*
+ * Collect attributes from mv-compatible clauses.
+ */
+static Bitmapset *
+collect_mv_attnums(PlannerInfo *root, List *clauses, Oid varRelid,
+ Index *relid, SpecialJoinInfo *sjinfo)
+{
+ Bitmapset *attnums = NULL;
+ ListCell *l;
+
+ /*
+ * Walk through the clauses and identify the ones we can estimate
+ * using multivariate stats, and remember the relid/columns. We'll
+ * then cross-check if we have suitable stats, and only if needed
+ * we'll split the clauses into multivariate and regular lists.
+ *
+ * For now we're only interested in RestrictInfo nodes with nested
+ * OpExpr, using either a range or equality.
+ */
+ foreach (l, clauses)
+ {
+ AttrNumber attnum;
+ Node *clause = (Node *) lfirst(l);
+
+ /* ignore the result for now - we only need the info */
+ if (clause_is_mv_compatible(root, clause, varRelid, relid, &attnum, sjinfo))
+ attnums = bms_add_member(attnums, attnum);
+ }
+
+ /*
+ * If there are not at least two attributes referenced by the clause(s),
+ * we can throw everything out (as we'll revert to simple stats).
+ */
+ if (bms_num_members(attnums) <= 1)
+ {
+ if (attnums != NULL)
+ pfree(attnums);
+ attnums = NULL;
+ *relid = InvalidOid;
+ }
+
+ return attnums;
+}
+
+/*
+ * Determines whether the clause is compatible with multivariate stats,
+ * and if it is, returns some additional information - varno (index
+ * into simple_rte_array) and a bitmap of attributes. This is then
+ * used to fetch related multivariate statistics.
+ *
+ * At this moment we only support basic conditions of the form
+ *
+ * variable OP constant
+ *
+ * where OP is one of [=,<,<=,>=,>] (which is however determined by
+ * looking at the associated function for estimating selectivity, just
+ * like with the single-dimensional case).
+ *
+ * TODO Support 'OR clauses' - shouldn't be all that difficult to
+ * evaluate them using multivariate stats.
+ */
+static bool
+clause_is_mv_compatible(PlannerInfo *root, Node *clause, Oid varRelid,
+ Index *relid, AttrNumber *attnum, SpecialJoinInfo *sjinfo)
+{
+
+ if (IsA(clause, RestrictInfo))
+ {
+ RestrictInfo *rinfo = (RestrictInfo *) clause;
+
+ /* Pseudoconstants are not really interesting here. */
+ if (rinfo->pseudoconstant)
+ return false;
+
+ /* no support for OR clauses at this point */
+ if (rinfo->orclause)
+ return false;
+
+ /* get the actual clause from the RestrictInfo (it's not an OR clause) */
+ clause = (Node*)rinfo->clause;
+
+ /* only simple opclauses are compatible with multivariate stats */
+ if (! is_opclause(clause))
+ return false;
+
+ /* we don't support join conditions at this moment */
+ if (treat_as_join_clause(clause, rinfo, varRelid, sjinfo))
+ return false;
+
+ /* is it 'variable op constant' ? */
+ if (list_length(((OpExpr *) clause)->args) == 2)
+ {
+ OpExpr *expr = (OpExpr *) clause;
+ bool varonleft = true;
+ bool ok;
+
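+ /*
+ * Check the shape (Var op Const) or (Const op Var) - the second
+ * is_pseudo_constant_clause_relids() test flips varonleft to
+ * false when the Var turns out to be on the right side.
+ */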
+ ok = (bms_membership(rinfo->clause_relids) == BMS_SINGLETON) &&
+ (is_pseudo_constant_clause_relids(lsecond(expr->args),
+ rinfo->right_relids) ||
+ (varonleft = false,
+ is_pseudo_constant_clause_relids(linitial(expr->args),
+ rinfo->left_relids)));
+
+ if (ok)
+ {
+ Var * var = (varonleft) ? linitial(expr->args) : lsecond(expr->args);
+
+ /*
+ * Simple variables only - otherwise the planner_rt_fetch seems to fail
+ * (return NULL).
+ *
+ * TODO Maybe using examine_variable() would fix that?
+ */
+ if (! (IsA(var, Var) && (varRelid == 0 || varRelid == var->varno)))
+ return false;
+
+ /*
+ * Only consider this variable if (varRelid == 0) or when the varno
+ * matches varRelid (see explanation at clause_selectivity).
+ *
+ * FIXME I suspect this may not be really necessary. The (varRelid == 0)
+ * part seems to be enforced by treat_as_join_clause().
+ */
+ if (! ((varRelid == 0) || (varRelid == var->varno)))
+ return false;
+
+ /* Also skip special varno values, and system attributes ... */
+ if ((IS_SPECIAL_VARNO(var->varno)) || (! AttrNumberIsForUserDefinedAttr(var->varattno)))
+ return false;
+
+ *relid = var->varno;
+
+ /*
+ * If it's not a "<" or ">" or "=" operator, just ignore the
+ * clause. Otherwise note the relid and attnum for the variable.
+ * This uses the function for estimating selectivity, not the
+ * operator directly (a bit awkward, but well ...).
+ */
+ switch (get_oprrest(expr->opno))
+ {
+ case F_EQSEL:
+ *attnum = var->varattno;
+ return true;
+ }
+ }
+ }
+ }
+
+ return false;
+
+}
+
+/*
+ * Performs reduction of clauses using functional dependencies, i.e.
+ * removes clauses that are considered redundant. It simply walks
+ * through dependencies, and checks whether the dependency 'matches'
+ * the clauses, i.e. if there's a clause matching the condition. If yes,
+ * all clauses matching the implied part of the dependency are removed
+ * from the list.
+ *
+ * This simply looks at attnums referenced by the clauses, not at the
+ * type of the operator (equality, inequality, ...). This may not be the
+ * right way to do it - it certainly works best for equalities, which is
+ * naturally consistent with functional dependencies (implications).
+ * It's not clear that other operators are handled sensibly - for
+ * example for inequalities, like
+ *
+ * WHERE (A >= 10) AND (B <= 20)
+ *
+ * and a trivial case where [A == B], resulting in a symmetric pair of
+ * rules [A => B] and [B => A], it's rather clear we can't remove either of
+ * those clauses.
+ *
+ * That only highlights that functional dependencies are most suitable
+ * for label-like data, where using non-equality operators is very rare.
+ * Using the common city/zipcode example, clauses like
+ *
+ * (zipcode <= 12345)
+ *
+ * or
+ *
+ * (cityname >= 'Washington')
+ *
+ * are rare. So restricting the reduction to equality should not harm
+ * the usefulness / applicability.
+ *
+ * Another limitation is that this assumes 'compatible' clauses. For
+ * example with a mismatching zip code and city name, this is unable
+ * to identify the discrepancy and still eliminates one of the clauses.
+ * The usual approach (multiplying both selectivities) thus produces a
+ * more accurate estimate, although mostly by luck - the multiplication
+ * comes from the assumption of statistical independence of the two
+ * conditions (which is not valid in this case), but moves the
+ * estimate in the right direction (towards 0%).
+ *
+ * This might be somewhat improved by cross-checking the selectivities
+ * against MCV and/or histogram.
+ *
+ * The implementation needs to be careful about cyclic rules, i.e. rules
+ * like [A => B] and [B => A] at the same time. This must not reduce
+ * clauses on both attributes at the same time.
+ *
+ * Technically we might consider selectivities here too, somehow. E.g.
+ * when (A => B) and (B => A), we might use the clauses with minimum
+ * selectivity.
+ *
+ * TODO Consider restricting the reduction to equality clauses. Or maybe
+ * use equality classes somehow?
+ *
+ * TODO Merge this docs to dependencies.c, as it's saying mostly the
+ * same things as the comments there.
+ */
+static List *
+clauselist_apply_dependencies(PlannerInfo *root, List *clauses,
+ Oid varRelid, List *stats,
+ SpecialJoinInfo *sjinfo)
+{
+ int i;
+ ListCell *lc;
+ List * reduced_clauses = NIL;
+ Index relid;
+
+ /*
+ * preallocate space for all clauses, including non-mv-compatible,
+ * so that we don't need to reallocate the arrays repeatedly
+ *
+ * XXX This assumes each clause references exactly one Var, so the
+ * arrays are sized accordingly - for functional dependencies
+ * this is safe, because it only works with Var=Const.
+ */
+ bool *reduced;
+ AttrNumber *mvattnums;
+ Node **mvclauses;
+ int nmvclauses = 0; /* number of clauses in the arrays */
+
+ /*
+ * matrix of (natts x natts), 1 means x=>y
+ *
+ * This serves two purposes - first, it merges dependencies from all
+ * the statistics, second it makes generating all the transitive
+ * dependencies easier.
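+ *
+ * For example, when one statistics yields [A => B] and another
+ * [B => C], the matrix makes it easy to derive the transitive
+ * [A => C] - e.g. via a Warshall-style closure over the matrix
+ * (a sketch):
+ *
+ * for each k, i, j: if m[i][k] && m[k][j] then m[i][j] = true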
+ *
+ * We need to build this only for attributes from the dependencies,
+ * not for all attributes in the table.
+ *
+ * We can't do that only for attributes from the clauses, because we
+ * want to build transitive dependencies (including those going
+ * through attributes not listed in the stats).
+ *
+ * This only works for A=>B dependencies, not sure how to do that
+ * for complex dependencies.
+ */
+ bool *deps_matrix;
+ int deps_natts; /* size of the matrix */
+
+ /* mapping attnum <=> matrix index */
+ int *deps_idx_to_attnum;
+ int *deps_attnum_to_idx;
+
+ /* attnums in dependencies and clauses (and intersection) */
+ Bitmapset *deps_attnums = NULL;
+ Bitmapset *clause_attnums = NULL;
+ Bitmapset *intersect_attnums = NULL;
+
+ int attnum, attidx, attnum_max;
+
+ bool has_deps_built = false;
+
+ /* see if there's at least one statistics with dependencies */
+ foreach (lc, stats)
+ {
+ MVStatisticInfo *info = (MVStatisticInfo *)lfirst(lc);
+
+ if (info->deps_built)
+ {
+ has_deps_built = true;
+ break;
+ }
+ }
+
+ /* no dependencies available - return the original clauses */
+ if (! has_deps_built)
+ return clauses;
+
+ mvclauses = (Node**)palloc0(list_length(clauses) * sizeof(Node*));
+ mvattnums = (AttrNumber*)palloc0(list_length(clauses) * sizeof(AttrNumber));
+
+ /*
+ * Walk through the clauses - copy clauses that are not mv-compatible
+ * directly into the result list, and store mv-compatible ones into
+ * an array of clauses (remembering the attnum in another array).
+ */
+ foreach (lc, clauses)
+ {
+ AttrNumber attnum;
+ Node *clause = (Node *) lfirst(lc);
+ if (! clause_is_mv_compatible(root, clause, varRelid, &relid, &attnum, sjinfo))
+ reduced_clauses = lappend(reduced_clauses, clause);
+ else
+ {
+ mvclauses[nmvclauses] = clause;
+ mvattnums[nmvclauses] = attnum;
+ nmvclauses++;
+
+ clause_attnums = bms_add_member(clause_attnums, attnum);
+ }
+ }
+
+ /*
+ * we need at least two clauses, referencing two different attributes,
+ * to do the reduction
+ */
+ if ((nmvclauses < 2) || (bms_num_members(clause_attnums) < 2))
+ {
+ pfree(mvattnums);
+ pfree(mvclauses);
+
+ bms_free(clause_attnums);
+ list_free(reduced_clauses);
+
+ return clauses;
+ }
+
+ reduced = (bool*)palloc0(list_length(clauses) * sizeof(bool));
+
+ /* build the dependency matrix */
+ attnum_max = -1;
+
+ foreach (lc, stats)
+ {
+ int j;
+ MVStatisticInfo *info = (MVStatisticInfo *)lfirst(lc);
+
+ int2vector *stakeys = info->stakeys;
+
+ /* skip stats without functional dependencies built */
+ if (! info->deps_built)
+ continue;
+
+ for (j = 0; j < stakeys->dim1; j++)
+ {
+ int attnum = stakeys->values[j];
+ deps_attnums = bms_add_member(deps_attnums, attnum);
+
+ /* keep the max attnum in the dependencies */
+ attnum_max = (attnum > attnum_max) ? attnum : attnum_max;
+ }
+ }
+
+ /*
+ * We need at least two matching attributes in the clauses and
+ * dependencies, otherwise we can't reduce anything.
+ */
+ intersect_attnums = bms_intersect(clause_attnums, deps_attnums);
+ if (bms_num_members(intersect_attnums) < 2)
+ {
+ pfree(mvattnums);
+ pfree(mvclauses);
+
+ bms_free(clause_attnums);
+ bms_free(deps_attnums);
+ bms_free(intersect_attnums);
+
+ list_free(reduced_clauses);
+
+ return clauses;
+ }
+
+ /* allocate the matrix and mappings */
+ deps_natts = bms_num_members(deps_attnums);
+ deps_matrix = (bool*)palloc0(deps_natts * deps_natts * sizeof(bool));
+ deps_idx_to_attnum = (int*)palloc0(deps_natts * sizeof(int));
+ deps_attnum_to_idx = (int*)palloc0((attnum_max+1) * sizeof(int));
+
+ /* build the (attnum => attidx) and (attidx => attnum) mappings */
+ attidx = 0;
+ attnum = -1;
+
+ while (true)
+ {
+ attnum = bms_next_member(deps_attnums, attnum);
+ if (attnum == -2)
+ break;
+
+ deps_idx_to_attnum[attidx] = attnum;
+ deps_attnum_to_idx[attnum] = attidx;
+
+ attidx += 1;
+ }
+
+ /* do we have all the attributes mapped? */
+ Assert(attidx == deps_natts);
+
+ /* walk through all the mvstats, build the adjacency matrix */
+ foreach (lc, stats)
+ {
+ int j;
+ MVStatisticInfo *info = (MVStatisticInfo *)lfirst(lc);
+ MVDependencies dependencies = NULL;
+
+ /* skip stats without functional dependencies built */
+ if (! info->deps_built)
+ continue;
+
+ /* fetch dependencies */
+ dependencies = load_mv_dependencies(info->mvoid);
+ if (dependencies == NULL)
+ continue;
+
+ /* set deps_matrix[a,b] to 'true' if 'a=>b' */
+ for (j = 0; j < dependencies->ndeps; j++)
+ {
+ int aidx = deps_attnum_to_idx[dependencies->deps[j]->a];
+ int bidx = deps_attnum_to_idx[dependencies->deps[j]->b];
+
+ /* a => b */
+ deps_matrix[aidx * deps_natts + bidx] = true;
+ }
+ }
+
+ /*
+ * Multiply the matrix N times (N = size of the matrix), so that we
+ * get all the transitive dependencies. That makes the next step
+ * much easier and faster.
+ *
+ * This is essentially an adjacency matrix from graph theory, and
+ * by multiplying it we get the transitive edges. We don't really
+ * care about the exact number of paths between vertices, so we can
+ * do the multiplication in-place (it doesn't matter whether we
+ * found the dependency in this round or in the previous one).
+ *
+ * Track how many new dependencies were added, and stop when none
+ * were - we never need more than N multiplications (N bounds the
+ * length of the longest path in the graph).
+ */
+ for (i = 0; i < deps_natts; i++)
+ {
+ int k, l, m;
+ int nchanges = 0;
+
+ /* k => l */
+ for (k = 0; k < deps_natts; k++)
+ {
+ for (l = 0; l < deps_natts; l++)
+ {
+ /* we already have this dependency */
+ if (deps_matrix[k * deps_natts + l])
+ continue;
+
+ /* we don't really care about the exact value, just 0/1 */
+ for (m = 0; m < deps_natts; m++)
+ {
+ if (deps_matrix[k * deps_natts + m] * deps_matrix[m * deps_natts + l])
+ {
+ deps_matrix[k * deps_natts + l] = true;
+ nchanges += 1;
+ break;
+ }
+ }
+ }
+ }
+
+ /* no transitive dependency added here, so terminate */
+ if (nchanges == 0)
+ break;
+ }
+
+ /*
+ * Walk through the clauses, and see which other clauses we may
+ * reduce. The matrix contains all transitive dependencies, which
+ * makes this very fast.
+ *
+ * We have to be careful not to reduce a clause using itself, or to
+ * reduce all clauses forming a cycle (so we have to skip already
+ * eliminated clauses).
+ *
+ * I'm not sure whether this guarantees finding the best solution,
+ * i.e. reducing the most clauses, but it probably does (thanks to
+ * having all the transitive dependencies).
+ */
+ for (i = 0; i < nmvclauses; i++)
+ {
+ int j;
+
+ /* not covered by dependencies */
+ if (! bms_is_member(mvattnums[i], deps_attnums))
+ continue;
+
+ /* this clause was already reduced, so let's skip it */
+ if (reduced[i])
+ continue;
+
+ /* walk the potentially 'implied' clauses */
+ for (j = 0; j < nmvclauses; j++)
+ {
+ int aidx, bidx;
+
+ /* not covered by dependencies */
+ if (! bms_is_member(mvattnums[j], deps_attnums))
+ continue;
+
+ aidx = deps_attnum_to_idx[mvattnums[i]];
+ bidx = deps_attnum_to_idx[mvattnums[j]];
+
+ /* can't reduce the clause by itself, or if already reduced */
+ if ((i == j) || reduced[j])
+ continue;
+
+ /* mark the clause as reduced (if aidx => bidx) */
+ reduced[j] = deps_matrix[aidx * deps_natts + bidx];
+ }
+ }
+
+ /* now walk through the clauses, and keep only those not reduced */
+ for (i = 0; i < nmvclauses; i++)
+ {
+ if (! reduced[i])
+ reduced_clauses = lappend(reduced_clauses, mvclauses[i]);
+ }
+
+ pfree(reduced);
+ pfree(mvclauses);
+ pfree(mvattnums);
+
+ pfree(deps_matrix);
+ pfree(deps_idx_to_attnum);
+ pfree(deps_attnum_to_idx);
+
+ bms_free(deps_attnums);
+ bms_free(clause_attnums);
+ bms_free(intersect_attnums);
+
+ return reduced_clauses;
+}
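
To illustrate the transitive-closure step above in isolation, here is
a minimal standalone sketch. The three attributes and the dependencies
[0 => 1] and [1 => 2] are made up, and the fixed-size arrays stand in
for the palloc'd ones - it mirrors the in-place multiplication loop in
clauselist_apply_dependencies:

#include <stdbool.h>
#include <stdio.h>

#define NATTS 3

int
main(void)
{
    bool    matrix[NATTS][NATTS] = {{false}};
    int     i, k, l, m;

    matrix[0][1] = true;    /* 0 => 1 */
    matrix[1][2] = true;    /* 1 => 2 */

    /* at most NATTS rounds (length of the longest path in the graph) */
    for (i = 0; i < NATTS; i++)
    {
        int     nchanges = 0;

        for (k = 0; k < NATTS; k++)
            for (l = 0; l < NATTS; l++)
            {
                /* we already have this dependency */
                if (matrix[k][l])
                    continue;

                /* add (k => l) if (k => m) and (m => l) for some m */
                for (m = 0; m < NATTS; m++)
                    if (matrix[k][m] && matrix[m][l])
                    {
                        matrix[k][l] = true;
                        nchanges++;
                        break;
                    }
            }

        /* nothing added in this round, so we're done */
        if (nchanges == 0)
            break;
    }

    /* prints "0 => 1", "0 => 2" (the transitive one) and "1 => 2" */
    for (k = 0; k < NATTS; k++)
        for (l = 0; l < NATTS; l++)
            if (matrix[k][l])
                printf("%d => %d\n", k, l);

    return 0;
}
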
diff --git a/src/backend/utils/mvstats/common.c b/src/backend/utils/mvstats/common.c
index a755c49..bd200bc 100644
--- a/src/backend/utils/mvstats/common.c
+++ b/src/backend/utils/mvstats/common.c
@@ -84,7 +84,8 @@ build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
/*
* Analyze functional dependencies of columns.
*/
- deps = build_mv_dependencies(numrows, rows, attrs, stats);
+ if (stat->deps_enabled)
+ deps = build_mv_dependencies(numrows, rows, attrs, stats);
/* store the histogram / MCV list in the catalog */
update_mv_stats(stat->mvoid, deps, attrs);
@@ -163,6 +164,7 @@ list_mv_stats(Oid relid)
info->mvoid = HeapTupleGetOid(htup);
info->stakeys = buildint2vector(stats->stakeys.values, stats->stakeys.dim1);
+ info->deps_enabled = stats->deps_enabled;
info->deps_built = stats->deps_built;
result = lappend(result, info);
@@ -274,6 +276,7 @@ compare_scalars_partition(const void *a, const void *b, void *arg)
return ApplySortComparator(da, false, db, false, ssup);
}
+
/* initialize multi-dimensional sort */
MultiSortSupport
multi_sort_init(int ndims)
diff --git a/src/backend/utils/mvstats/dependencies.c b/src/backend/utils/mvstats/dependencies.c
index 0ca16a0..cf66bc5 100644
--- a/src/backend/utils/mvstats/dependencies.c
+++ b/src/backend/utils/mvstats/dependencies.c
@@ -636,3 +636,27 @@ pg_mv_stats_dependencies_show(PG_FUNCTION_ARGS)
PG_RETURN_TEXT_P(cstring_to_text(result));
}
+
+MVDependencies
+load_mv_dependencies(Oid mvoid)
+{
+ bool isnull = false;
+ Datum deps;
+
+ /* Fetch the pg_mv_statistic tuple for the given statistics OID. */
+ HeapTuple htup = SearchSysCache1(MVSTATOID, ObjectIdGetDatum(mvoid));
+
+#ifdef USE_ASSERT_CHECKING
+ Form_pg_mv_statistic mvstat = (Form_pg_mv_statistic) GETSTRUCT(htup);
+ Assert(mvstat->deps_enabled && mvstat->deps_built);
+#endif
+
+ deps = SysCacheGetAttr(MVSTATOID, htup,
+ Anum_pg_mv_statistic_stadeps, &isnull);
+
+ Assert(!isnull);
+
+ ReleaseSysCache(htup);
+
+ return deserialize_mv_dependencies(DatumGetByteaP(deps));
+}
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
index 5024a01..2178f6c 100644
--- a/src/include/catalog/pg_proc.h
+++ b/src/include/catalog/pg_proc.h
@@ -2724,9 +2724,9 @@ DESCR("current user privilege on any column by rel name");
DATA(insert OID = 3029 ( has_any_column_privilege PGNSP PGUID 12 10 0 0 0 f f f f t f s 2 0 16 "26 25" _null_ _null_ _null_ _null_ _null_ has_any_column_privilege_id _null_ _null_ _null_ ));
DESCR("current user privilege on any column by rel oid");
-DATA(insert OID = 3284 ( pg_mv_stats_dependencies_info PGNSP PGUID 12 1 0 0 0 f f f f t f i 1 0 25 "17" _null_ _null_ _null_ _null_ _null_ pg_mv_stats_dependencies_info _null_ _null_ _null_ ));
+DATA(insert OID = 3377 ( pg_mv_stats_dependencies_info PGNSP PGUID 12 1 0 0 0 f f f f t f i 1 0 25 "17" _null_ _null_ _null_ _null_ _null_ pg_mv_stats_dependencies_info _null_ _null_ _null_ ));
DESCR("multivariate stats: functional dependencies info");
-DATA(insert OID = 3285 ( pg_mv_stats_dependencies_show PGNSP PGUID 12 1 0 0 0 f f f f t f i 1 0 25 "17" _null_ _null_ _null_ _null_ _null_ pg_mv_stats_dependencies_show _null_ _null_ _null_ ));
+DATA(insert OID = 3378 ( pg_mv_stats_dependencies_show PGNSP PGUID 12 1 0 0 0 f f f f t f i 1 0 25 "17" _null_ _null_ _null_ _null_ _null_ pg_mv_stats_dependencies_show _null_ _null_ _null_ ));
DESCR("multivariate stats: functional dependencies show");
DATA(insert OID = 1928 ( pg_stat_get_numscans PGNSP PGUID 12 1 0 0 0 f f f f t f s 1 0 20 "26" _null_ _null_ _null_ _null_ _null_ pg_stat_get_numscans _null_ _null_ _null_ ));
diff --git a/src/include/utils/mvstats.h b/src/include/utils/mvstats.h
index 411cd16..02a7dda 100644
--- a/src/include/utils/mvstats.h
+++ b/src/include/utils/mvstats.h
@@ -16,12 +16,20 @@
#include "commands/vacuum.h"
+/*
+ * Degree of how much MCV item / histogram bucket matches a clause.
+ * This is then considered when computing the selectivity.
+ */
+#define MVSTATS_MATCH_NONE 0 /* no match at all */
+#define MVSTATS_MATCH_PARTIAL 1 /* partial match */
+#define MVSTATS_MATCH_FULL 2 /* full match */
#define MVSTATS_MAX_DIMENSIONS 8 /* max number of attributes */
-/* An associative rule, tracking [a => b] dependency.
- *
- * TODO Make this work with multiple columns on both sides.
+
+/*
+ * Functional dependencies, tracking column-level relationships (values
+ * in one column determine values in another one).
*/
typedef struct MVDependencyData {
int16 a;
@@ -47,6 +55,8 @@ typedef MVDependenciesData* MVDependencies;
* stats specified using flags (or something like that).
*/
+MVDependencies load_mv_dependencies(Oid mvoid);
+
bytea * serialize_mv_dependencies(MVDependencies dependencies);
/* deserialization of stats (serialization is private to analyze) */
diff --git a/src/test/regress/expected/mv_dependencies.out b/src/test/regress/expected/mv_dependencies.out
new file mode 100644
index 0000000..cf986e8
--- /dev/null
+++ b/src/test/regress/expected/mv_dependencies.out
@@ -0,0 +1,172 @@
+-- data type passed by value
+CREATE TABLE functional_dependencies (
+ a INT,
+ b INT,
+ c INT
+);
+-- unknown column
+ALTER TABLE functional_dependencies ADD STATISTICS (dependencies) ON (unknown_column);
+ERROR: column "unknown_column" referenced in statistics does not exist
+-- single column
+ALTER TABLE functional_dependencies ADD STATISTICS (dependencies) ON (a);
+ERROR: multivariate stats require 2 or more columns
+-- single column, duplicated
+ALTER TABLE functional_dependencies ADD STATISTICS (dependencies) ON (a, a);
+ERROR: duplicate column name in statistics definition
+-- two columns, one duplicated
+ALTER TABLE functional_dependencies ADD STATISTICS (dependencies) ON (a, a, b);
+ERROR: duplicate column name in statistics definition
+-- unknown option
+ALTER TABLE functional_dependencies ADD STATISTICS (unknown_option) ON (a, b, c);
+ERROR: unrecognized STATISTICS option "unknown_option"
+-- correct command
+ALTER TABLE functional_dependencies ADD STATISTICS (dependencies) ON (a, b, c);
+-- random data (no functional dependencies)
+INSERT INTO functional_dependencies
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+ANALYZE functional_dependencies;
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+ deps_enabled | deps_built | pg_mv_stats_dependencies_show
+--------------+------------+-------------------------------
+ t | f |
+(1 row)
+
+TRUNCATE functional_dependencies;
+-- a => b, a => c, b => c
+INSERT INTO functional_dependencies
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE functional_dependencies;
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+ deps_enabled | deps_built | pg_mv_stats_dependencies_show
+--------------+------------+-------------------------------
+ t | t | 1 => 2, 1 => 3, 2 => 3
+(1 row)
+
+TRUNCATE functional_dependencies;
+-- a => b, a => c
+INSERT INTO functional_dependencies
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE functional_dependencies;
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+ deps_enabled | deps_built | pg_mv_stats_dependencies_show
+--------------+------------+-------------------------------
+ t | t | 1 => 2, 1 => 3
+(1 row)
+
+TRUNCATE functional_dependencies;
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO functional_dependencies
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX fdeps_idx ON functional_dependencies (a, b);
+ANALYZE functional_dependencies;
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+ deps_enabled | deps_built | pg_mv_stats_dependencies_show
+--------------+------------+-------------------------------
+ t | t | 1 => 2, 1 => 3, 2 => 3
+(1 row)
+
+EXPLAIN (COSTS off)
+ SELECT * FROM functional_dependencies WHERE a = 10 AND b = 5;
+ QUERY PLAN
+---------------------------------------------
+ Bitmap Heap Scan on functional_dependencies
+ Recheck Cond: ((a = 10) AND (b = 5))
+ -> Bitmap Index Scan on fdeps_idx
+ Index Cond: ((a = 10) AND (b = 5))
+(4 rows)
+
+DROP TABLE functional_dependencies;
+-- varlena type (text)
+CREATE TABLE functional_dependencies (
+ a TEXT,
+ b TEXT,
+ c TEXT
+);
+ALTER TABLE functional_dependencies ADD STATISTICS (dependencies) ON (a, b, c);
+-- random data (no functional dependencies)
+INSERT INTO functional_dependencies
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+ANALYZE functional_dependencies;
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+ deps_enabled | deps_built | pg_mv_stats_dependencies_show
+--------------+------------+-------------------------------
+ t | f |
+(1 row)
+
+TRUNCATE functional_dependencies;
+-- a => b, a => c, b => c
+INSERT INTO functional_dependencies
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE functional_dependencies;
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+ deps_enabled | deps_built | pg_mv_stats_dependencies_show
+--------------+------------+-------------------------------
+ t | t | 1 => 2, 1 => 3, 2 => 3
+(1 row)
+
+TRUNCATE functional_dependencies;
+-- a => b, a => c
+INSERT INTO functional_dependencies
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE functional_dependencies;
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+ deps_enabled | deps_built | pg_mv_stats_dependencies_show
+--------------+------------+-------------------------------
+ t | t | 1 => 2, 1 => 3
+(1 row)
+
+TRUNCATE functional_dependencies;
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO functional_dependencies
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX fdeps_idx ON functional_dependencies (a, b);
+ANALYZE functional_dependencies;
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+ deps_enabled | deps_built | pg_mv_stats_dependencies_show
+--------------+------------+-------------------------------
+ t | t | 1 => 2, 1 => 3, 2 => 3
+(1 row)
+
+EXPLAIN (COSTS off)
+ SELECT * FROM functional_dependencies WHERE a = '10' AND b = '5';
+ QUERY PLAN
+------------------------------------------------------------
+ Bitmap Heap Scan on functional_dependencies
+ Recheck Cond: ((a = '10'::text) AND (b = '5'::text))
+ -> Bitmap Index Scan on fdeps_idx
+ Index Cond: ((a = '10'::text) AND (b = '5'::text))
+(4 rows)
+
+DROP TABLE functional_dependencies;
+-- NULL values (mix of int and text columns)
+CREATE TABLE functional_dependencies (
+ a INT,
+ b TEXT,
+ c INT,
+ d TEXT
+);
+ALTER TABLE functional_dependencies ADD STATISTICS (dependencies) ON (a, b, c, d);
+INSERT INTO functional_dependencies
+ SELECT
+ mod(i, 100),
+ (CASE WHEN mod(i, 200) = 0 THEN NULL ELSE mod(i,200) END),
+ mod(i, 400),
+ (CASE WHEN mod(i, 300) = 0 THEN NULL ELSE mod(i,600) END)
+ FROM generate_series(1,10000) s(i);
+ANALYZE functional_dependencies;
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+ deps_enabled | deps_built | pg_mv_stats_dependencies_show
+--------------+------------+----------------------------------------
+ t | t | 2 => 1, 3 => 1, 3 => 2, 4 => 1, 4 => 2
+(1 row)
+
+DROP TABLE functional_dependencies;
diff --git a/src/test/regress/parallel_schedule b/src/test/regress/parallel_schedule
index 6d3b865..00c6ddf 100644
--- a/src/test/regress/parallel_schedule
+++ b/src/test/regress/parallel_schedule
@@ -109,3 +109,6 @@ test: event_trigger
# run stats by itself because its delay may be insufficient under heavy load
test: stats
+
+# run tests of multivariate stats
+test: mv_dependencies
diff --git a/src/test/regress/serial_schedule b/src/test/regress/serial_schedule
index 8326894..b818be9 100644
--- a/src/test/regress/serial_schedule
+++ b/src/test/regress/serial_schedule
@@ -153,3 +153,4 @@ test: with
test: xml
test: event_trigger
test: stats
+test: mv_dependencies
diff --git a/src/test/regress/sql/mv_dependencies.sql b/src/test/regress/sql/mv_dependencies.sql
new file mode 100644
index 0000000..2491aca
--- /dev/null
+++ b/src/test/regress/sql/mv_dependencies.sql
@@ -0,0 +1,150 @@
+-- data type passed by value
+CREATE TABLE functional_dependencies (
+ a INT,
+ b INT,
+ c INT
+);
+
+-- unknown column
+ALTER TABLE functional_dependencies ADD STATISTICS (dependencies) ON (unknown_column);
+
+-- single column
+ALTER TABLE functional_dependencies ADD STATISTICS (dependencies) ON (a);
+
+-- single column, duplicated
+ALTER TABLE functional_dependencies ADD STATISTICS (dependencies) ON (a, a);
+
+-- two columns, one duplicated
+ALTER TABLE functional_dependencies ADD STATISTICS (dependencies) ON (a, a, b);
+
+-- unknown option
+ALTER TABLE functional_dependencies ADD STATISTICS (unknown_option) ON (a, b, c);
+
+-- correct command
+ALTER TABLE functional_dependencies ADD STATISTICS (dependencies) ON (a, b, c);
+
+-- random data (no functional dependencies)
+INSERT INTO functional_dependencies
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+
+ANALYZE functional_dependencies;
+
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+
+TRUNCATE functional_dependencies;
+
+-- a => b, a => c, b => c
+INSERT INTO functional_dependencies
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+
+ANALYZE functional_dependencies;
+
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+
+TRUNCATE functional_dependencies;
+
+-- a => b, a => c
+INSERT INTO functional_dependencies
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE functional_dependencies;
+
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+
+TRUNCATE functional_dependencies;
+
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO functional_dependencies
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX fdeps_idx ON functional_dependencies (a, b);
+ANALYZE functional_dependencies;
+
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+
+EXPLAIN (COSTS off)
+ SELECT * FROM functional_dependencies WHERE a = 10 AND b = 5;
+
+DROP TABLE functional_dependencies;
+
+-- varlena type (text)
+CREATE TABLE functional_dependencies (
+ a TEXT,
+ b TEXT,
+ c TEXT
+);
+
+ALTER TABLE functional_dependencies ADD STATISTICS (dependencies) ON (a, b, c);
+
+-- random data (no functional dependencies)
+INSERT INTO functional_dependencies
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+
+ANALYZE functional_dependencies;
+
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+
+TRUNCATE functional_dependencies;
+
+-- a => b, a => c, b => c
+INSERT INTO functional_dependencies
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+
+ANALYZE functional_dependencies;
+
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+
+TRUNCATE functional_dependencies;
+
+-- a => b, a => c
+INSERT INTO functional_dependencies
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE functional_dependencies;
+
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+
+TRUNCATE functional_dependencies;
+
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO functional_dependencies
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX fdeps_idx ON functional_dependencies (a, b);
+ANALYZE functional_dependencies;
+
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+
+EXPLAIN (COSTS off)
+ SELECT * FROM functional_dependencies WHERE a = '10' AND b = '5';
+
+DROP TABLE functional_dependencies;
+
+-- NULL values (mix of int and text columns)
+CREATE TABLE functional_dependencies (
+ a INT,
+ b TEXT,
+ c INT,
+ d TEXT
+);
+
+ALTER TABLE functional_dependencies ADD STATISTICS (dependencies) ON (a, b, c, d);
+
+INSERT INTO functional_dependencies
+ SELECT
+ mod(i, 100),
+ (CASE WHEN mod(i, 200) = 0 THEN NULL ELSE mod(i,200) END),
+ mod(i, 400),
+ (CASE WHEN mod(i, 300) = 0 THEN NULL ELSE mod(i,600) END)
+ FROM generate_series(1,10000) s(i);
+
+ANALYZE functional_dependencies;
+
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+
+DROP TABLE functional_dependencies;
--
1.9.3
0003-multivariate-MCV-lists.patch
From d454055da3025437cbfab0ca772df818fccc3c13 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@pgaddict.com>
Date: Mon, 6 Apr 2015 16:52:15 +0200
Subject: [PATCH 3/6] multivariate MCV lists
- extends the pg_mv_statistic catalog (add 'mcv' fields)
- building the MCV lists during ANALYZE
- simple estimation while planning the queries
Includes regression tests, mostly equal to regression tests for
functional dependencies.
Conflicts:
src/backend/optimizer/path/clausesel.c
---
src/backend/catalog/system_views.sql | 4 +-
src/backend/commands/tablecmds.c | 89 ++-
src/backend/nodes/outfuncs.c | 2 +
src/backend/optimizer/path/clausesel.c | 1167 ++++++++++++++++++++++++++++--
src/backend/optimizer/util/plancat.c | 4 +-
src/backend/utils/mvstats/Makefile | 2 +-
src/backend/utils/mvstats/common.c | 104 ++-
src/backend/utils/mvstats/common.h | 11 +-
src/backend/utils/mvstats/mcv.c | 1232 ++++++++++++++++++++++++++++++++
src/bin/psql/describe.c | 24 +-
src/include/catalog/pg_mv_statistic.h | 18 +-
src/include/catalog/pg_proc.h | 4 +
src/include/nodes/relation.h | 2 +
src/include/utils/mvstats.h | 69 +-
src/test/regress/expected/mv_mcv.out | 207 ++++++
src/test/regress/expected/rules.out | 4 +-
src/test/regress/parallel_schedule | 2 +-
src/test/regress/serial_schedule | 1 +
src/test/regress/sql/mv_mcv.sql | 178 +++++
19 files changed, 3021 insertions(+), 103 deletions(-)
create mode 100644 src/backend/utils/mvstats/mcv.c
create mode 100644 src/test/regress/expected/mv_mcv.out
create mode 100644 src/test/regress/sql/mv_mcv.sql
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 07586c6..74fedf0 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -156,7 +156,9 @@ CREATE VIEW pg_mv_stats AS
C.relname AS tablename,
S.stakeys AS attnums,
length(S.stadeps) as depsbytes,
- pg_mv_stats_dependencies_info(S.stadeps) as depsinfo
+ pg_mv_stats_dependencies_info(S.stadeps) as depsinfo,
+ length(S.stamcv) AS mcvbytes,
+ pg_mv_stats_mcvlist_info(S.stamcv) AS mcvinfo
FROM (pg_mv_statistic S JOIN pg_class C ON (C.oid = S.starelid))
LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace);
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index b372660..545b595 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -11914,7 +11914,13 @@ static void ATExecAddStatistics(AlteredTableInfo *tab, Relation rel,
Relation mvstatrel;
/* by default build everything */
- bool build_dependencies = false;
+ bool build_dependencies = false,
+ build_mcv = false;
+
+ int32 max_mcv_items = -1;
+
+ /* options required because of other options */
+ bool require_mcv = false;
Assert(IsA(def, StatisticsDef));
@@ -11969,6 +11975,29 @@ static void ATExecAddStatistics(AlteredTableInfo *tab, Relation rel,
if (strcmp(opt->defname, "dependencies") == 0)
build_dependencies = defGetBoolean(opt);
+ else if (strcmp(opt->defname, "mcv") == 0)
+ build_mcv = defGetBoolean(opt);
+ else if (strcmp(opt->defname, "max_mcv_items") == 0)
+ {
+ max_mcv_items = defGetInt32(opt);
+
+ /* this option requires 'mcv' to be enabled */
+ require_mcv = true;
+
+ /* sanity check */
+ if (max_mcv_items < MVSTAT_MCVLIST_MIN_ITEMS)
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("max number of MCV items must be at least %d",
+ MVSTAT_MCVLIST_MIN_ITEMS)));
+
+ else if (max_mcv_items > MVSTAT_MCVLIST_MAX_ITEMS)
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("max number of MCV items is %d",
+ MVSTAT_MCVLIST_MAX_ITEMS)));
+
+ }
else
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
@@ -11977,10 +12006,16 @@ static void ATExecAddStatistics(AlteredTableInfo *tab, Relation rel,
}
/* check that at least some statistics were requested */
- if (! build_dependencies)
+ if (! (build_dependencies || build_mcv))
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("no statistics type (dependencies, mcv) was requested")));
+
+ /* now do some checking of the options */
+ if (require_mcv && (! build_mcv))
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
- errmsg("no statistics type (dependencies) was requested")));
+ errmsg("option 'mcv' is required by other options(s)")));
/* sort the attnums and build int2vector */
qsort(attnums, numcols, sizeof(int16), compare_int16);
@@ -11996,9 +12031,13 @@ static void ATExecAddStatistics(AlteredTableInfo *tab, Relation rel,
values[Anum_pg_mv_statistic_starelid-1] = ObjectIdGetDatum(RelationGetRelid(rel));
values[Anum_pg_mv_statistic_stakeys -1] = PointerGetDatum(stakeys);
+
values[Anum_pg_mv_statistic_deps_enabled -1] = BoolGetDatum(build_dependencies);
+ values[Anum_pg_mv_statistic_mcv_enabled -1] = BoolGetDatum(build_mcv);
+ values[Anum_pg_mv_statistic_mcv_max_items -1] = Int32GetDatum(max_mcv_items);
- nulls[Anum_pg_mv_statistic_stadeps -1] = true;
+ nulls[Anum_pg_mv_statistic_stadeps -1] = true;
+ nulls[Anum_pg_mv_statistic_stamcv -1] = true;
/* insert the tuple into pg_mv_statistic */
mvstatrel = heap_open(MvStatisticRelationId, RowExclusiveLock);
@@ -12045,7 +12084,13 @@ static void ATExecDropStatistics(AlteredTableInfo *tab, Relation rel,
/* checking whether the statistics matches / should be dropped */
bool build_dependencies = false;
+ bool build_mcv = false;
+
+ int32 max_mcv_items = 0;
+
bool check_dependencies = false;
+ bool check_mcv = false;
+ bool check_mcv_items = false;
if (def != NULL)
{
@@ -12087,6 +12132,18 @@ static void ATExecDropStatistics(AlteredTableInfo *tab, Relation rel,
check_dependencies = true;
build_dependencies = defGetBoolean(opt);
}
+ else if (strcmp(opt->defname, "mcv") == 0)
+ {
+ check_mcv = true;
+ build_mcv = defGetBoolean(opt);
+ }
+ else if (strcmp(opt->defname, "max_mcv_items") == 0)
+ {
+ check_mcv = true;
+ check_mcv_items = true;
+ build_mcv = true;
+ max_mcv_items = defGetInt32(opt);
+ }
else
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
@@ -12126,6 +12183,30 @@ static void ATExecDropStatistics(AlteredTableInfo *tab, Relation rel,
(DatumGetBool(adatum) == build_dependencies);
}
+ if (delete && check_mcv)
+ {
+ bool isnull;
+ Datum adatum = heap_getattr(tuple,
+ Anum_pg_mv_statistic_mcv_enabled,
+ RelationGetDescr(statrel),
+ &isnull);
+
+ delete = (! isnull) &&
+ (DatumGetBool(adatum) == build_mcv);
+ }
+
+ if (delete && check_mcv_items)
+ {
+ bool isnull;
+ Datum adatum = heap_getattr(tuple,
+ Anum_pg_mv_statistic_mcv_max_items,
+ RelationGetDescr(statrel),
+ &isnull);
+
+ delete = (! isnull) &&
+ (DatumGetInt32(adatum) == max_mcv_items);
+ }
+
/* check that the columns match the statistics definition */
if (delete && (numcols > 0))
{
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index c2d5dc5..635ccc1 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -1851,9 +1851,11 @@ _outMVStatisticInfo(StringInfo str, const MVStatisticInfo *node)
/* enabled statistics */
WRITE_BOOL_FIELD(deps_enabled);
+ WRITE_BOOL_FIELD(mcv_enabled);
/* built/available statistics */
WRITE_BOOL_FIELD(deps_built);
+ WRITE_BOOL_FIELD(mcv_built);
}
static void
diff --git a/src/backend/optimizer/path/clausesel.c b/src/backend/optimizer/path/clausesel.c
index fb7adf8..abffb0a 100644
--- a/src/backend/optimizer/path/clausesel.c
+++ b/src/backend/optimizer/path/clausesel.c
@@ -20,6 +20,7 @@
#include "optimizer/cost.h"
#include "optimizer/pathnode.h"
#include "optimizer/plancat.h"
+#include "optimizer/var.h"
#include "utils/fmgroids.h"
#include "utils/lsyscache.h"
#include "utils/selfuncs.h"
@@ -50,17 +51,46 @@ typedef struct RangeQueryClause
static void addRangeClause(RangeQueryClause **rqlist, Node *clause,
bool varonleft, bool isLTsel, Selectivity s2);
+#define MV_CLAUSE_TYPE_FDEP 0x01
+#define MV_CLAUSE_TYPE_MCV 0x02
static bool clause_is_mv_compatible(PlannerInfo *root, Node *clause, Oid varRelid,
- Index *relid, AttrNumber *attnum, SpecialJoinInfo *sjinfo);
+ Index *relid, Bitmapset **attnums, SpecialJoinInfo *sjinfo,
+ int types);
static Bitmapset *collect_mv_attnums(PlannerInfo *root, List *clauses,
- Oid varRelid, Index *relid, SpecialJoinInfo *sjinfo);
+ Oid varRelid, Index *relid, SpecialJoinInfo *sjinfo,
+ int type);
static List *clauselist_apply_dependencies(PlannerInfo *root, List *clauses,
Oid varRelid, List *stats,
SpecialJoinInfo *sjinfo);
+static MVStatisticInfo *choose_mv_statistics(List *mvstats, Bitmapset *attnums);
+
+static List *clauselist_mv_split(PlannerInfo *root, SpecialJoinInfo *sjinfo,
+ List *clauses, Oid varRelid,
+ List **mvclauses, MVStatisticInfo *mvstats, int types);
+
+static Selectivity clauselist_mv_selectivity(PlannerInfo *root,
+ List *clauses, MVStatisticInfo *mvstats);
+static Selectivity clauselist_mv_selectivity_mcvlist(PlannerInfo *root,
+ List *clauses, MVStatisticInfo *mvstats,
+ bool *fullmatch, Selectivity *lowsel);
+
+static int update_match_bitmap_mcvlist(PlannerInfo *root, List *clauses,
+ int2vector *stakeys, MCVList mcvlist,
+ int nmatches, char * matches,
+ Selectivity *lowsel, bool *fullmatch,
+ bool is_or);
+
+/* used for merging bitmaps - AND (min), OR (max) */
+#define MAX(x, y) (((x) > (y)) ? (x) : (y))
+#define MIN(x, y) (((x) < (y)) ? (x) : (y))
+
+#define UPDATE_RESULT(m,r,isor) \
+ (m) = (isor) ? (MAX(m,r)) : (MIN(m,r))
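+
+/*
+ * For example, when merging per-clause match degrees (see the
+ * MVSTATS_MATCH_* constants in mvstats.h), ANDing a FULL (2) and a
+ * PARTIAL (1) match yields PARTIAL (1), while ORing them yields
+ * FULL (2).
+ */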
+
/****************************************************************************
* ROUTINES TO COMPUTE SELECTIVITIES
****************************************************************************/
@@ -195,15 +225,19 @@ clauselist_selectivity(PlannerInfo *root,
Bitmapset *mvattnums = NULL;
/*
- * If there's exactly one clause, then no use in trying to match up pairs,
- * so just go directly to clause_selectivity().
+ * If there's exactly one clause, then no use in trying to match up
+ * pairs, so just go directly to clause_selectivity().
*/
if (list_length(clauses) == 1)
return clause_selectivity(root, (Node *) linitial(clauses),
varRelid, jointype, sjinfo);
- /* collect attributes referenced by mv-compatible clauses */
- mvattnums = collect_mv_attnums(root, clauses, varRelid, &relid, sjinfo);
+ /*
+ * Collect attributes referenced by mv-compatible clauses (looking
+ * for clauses compatible with functional dependencies for now).
+ */
+ mvattnums = collect_mv_attnums(root, clauses, varRelid, &relid, sjinfo,
+ MV_CLAUSE_TYPE_FDEP);
/*
* If there are mv-compatible clauses, referencing at least two
@@ -232,6 +266,58 @@ clauselist_selectivity(PlannerInfo *root,
}
/*
+ * Recollect attributes from mv-compatible clauses (maybe we've
+ * removed so many clauses we have a single mv-compatible attnum).
+ * From now on we're only interested in MCV-compatible clauses.
+ */
+ mvattnums = collect_mv_attnums(root, clauses, varRelid, &relid, sjinfo,
+ MV_CLAUSE_TYPE_MCV);
+
+ /*
+ * If there still are at least two columns, we'll try to select
+ * a suitable multivariate stats.
+ */
+ if (bms_num_members(mvattnums) >= 2)
+ {
+ /*
+ * fetch info from the catalog (not the serialized stats yet)
+ *
+ * TODO We may need to repeat this, because the previous load only
+ * happens if there are at least 2 clauses compatible with
+ * functional dependencies.
+ *
+ * TODO This is rather ugly - we get the stats as a list from
+ * RelOptInfo (thanks to relcache/syscache), but we transform
+ * it into an array (which the other methods use for now).
+ * This should not be necessary, I guess.
+ */
+ List *stats = root->simple_rel_array[relid]->mvstatlist;
+
+ /* see choose_mv_statistics() for details */
+ if (stats != NIL)
+ {
+ MVStatisticInfo *mvstat = choose_mv_statistics(stats, mvattnums);
+
+ if (mvstat != NULL) /* we have a matching stats */
+ {
+ /* clauses compatible with multi-variate stats */
+ List *mvclauses = NIL;
+
+ /* split the clauselist into regular and mv-clauses */
+ clauses = clauselist_mv_split(root, sjinfo, clauses,
+ varRelid, &mvclauses, mvstat,
+ MV_CLAUSE_TYPE_MCV);
+
+ /* we've chosen the histogram to match the clauses */
+ Assert(mvclauses != NIL);
+
+ /* compute the multivariate stats */
+ s1 *= clauselist_mv_selectivity(root, mvclauses, mvstat);
+ }
+ }
+ }
+
+ /*
* Initial scan over clauses. Anything that doesn't look like a potential
* rangequery clause gets multiplied into s1 and forgotten. Anything that
* does gets inserted into an rqlist entry.
@@ -906,12 +992,198 @@ clause_selectivity(PlannerInfo *root,
return s1;
}
+
+/*
+ * Estimate selectivity for the list of MV-compatible clauses, using that
+ * particular histogram.
+ *
+ * When we hit a single bucket, we don't know what portion of it actually
+ * matches the clauses (e.g. equality), and we use 1/2 the bucket by
+ * default. However, the MV histograms are usually less detailed than
+ * the per-column ones, meaning the sum of buckets is often quite high
+ * (thanks to combining a lot of "partially hit" buckets).
+ *
+ * There are several ways to improve this, each with cases where it
+ * won't really help. Also, the more complex the process, the worse
+ * the failures (i.e. misestimates).
+ *
+ * (1) Use the MV histogram only as a way to combine multiple
+ * per-column histograms, essentially rewriting
+ *
+ * P(A & B) = P(A) * P(B|A)
+ *
+ * where P(B|A) may be computed using a proper "slice" of the
+ * histogram, by first selecting only buckets where A is true, and
+ * then using the boundaries to 'restrict' the per-column histogram.
+ *
+ * With more clauses, it gets more complicated, of course
+ *
+ * P(A & B & C) = P(A & C) * P(B|A & C)
+ * = P(A) * P(C|A) * P(B|A & C)
+ *
+ * and so on.
+ *
+ * Of course, the question is how well and efficiently we can
+ * compute the conditional probabilities - whether this approach
+ * can improve the estimates (instead of amplifying the errors).
+ *
+ * Also, this does not eliminate the need for histogram on [A,B,C].
+ *
+ * (2) Use multiple smaller (and more accurate) histograms, and combine
+ * them using a process similar to the above. E.g. by assuming that
+ * B and C are independent, we can rewrite
+ *
+ * P(B|A & C) = P(B|A)
+ *
+ * so we can rewrite the whole formula to
+ *
+ * P(A & B & C) = P(A) * P(C|A) * P(B|A)
+ *
+ * and we're OK with two 2D histograms [A,C] and [A,B].
+ *
+ * It'd be nice to perform some sort of statistical independence
+ * test (Fisher's exact test or a chi-squared test) to identify
+ * independent components and automatically separate them into
+ * smaller histograms.
+ *
+ * (3) Using the estimated number of distinct values in a bucket to
+ * decide the selectivity of equality in the bucket (instead of
+ * blindly using 1/2 of the bucket, we may use 1/ndistinct).
+ * Of course, if the ndistinct estimate is way off, or when the
+ * distribution is not uniform (one distict items get much more
+ * items), this will fail. Also, we currently don't have ndistinct
+ * estimate available at this moment (but it shouldn't be that
+ * difficult to compute as ndistinct and ntuples should be available).
+ *
+ * TODO Clamp the selectivity by min of the per-clause selectivities
+ * (i.e. the selectivity of the most restrictive clause), because
+ * that's the maximum we can ever get from ANDed list of clauses.
+ * This may probably prevent issues with hitting too many buckets
+ * and low precision histograms.
+ *
+ * TODO We may support some additional conditions, most importantly
+ * those matching multiple columns (e.g. "a = b" or "a < b").
+ * Ultimately we could track multi-table histograms for join
+ * cardinality estimation.
+ *
+ * TODO Currently this is only estimating all clauses, or clauses
+ * matching varRelid (when it's not 0). I'm not sure what's the
+ * purpose of varRelid, but my assumption is this is used for
+ * join conditions and such. In that case we can use those clauses
+ * to restrict the other (i.e. filter the histogram buckets first,
+ * before estimating the other clauses). This is essentially equal
+ * to computing P(A|B) where "B" are the clauses not matching the
+ * varRelid.
+ *
+ * TODO Further thoughts on processing equality clauses - maybe it'd be
+ * better to look for stats (with MCV) covered by the equality
+ * clauses, because then we have a chance to find an exact match
+ * in the MCV list, which is pretty much the best we can do. We may
+ * also look at the least frequent MCV item, and use it as an upper
+ * boundary for the selectivity (had there been a more frequent
+ * item, it'd be in the MCV list).
+ *
+ * These conditions may then be used as a condition for the other
+ * selectivities, i.e. we may estimate P(A,B) first, and then
+ * compute P(C|A,B) from another histogram. This may be useful when
+ * we can estimate P(A,B) accurately (e.g. because it's a complete
+ * equality match evaluated on MCV list), and then compute the
+ * conditional probability P(C|A,B), giving us the requested stats
+ *
+ * P(A,B,C) = P(A,B) * P(C|A,B)
+ *
+ * TODO There are several options for 'sanity clamping' the estimates.
+ *
+ * First, if we have selectivities for each condition, then
+ *
+ * P(A,B) <= MIN(P(A), P(B))
+ *
+ * Because additional conditions (connected by AND) can only lower
+ * the probability.
+ *
+ * So we can do some basic sanity checks using the single-variate
+ * stats (the ones we have right now).
+ *
+ * Second, when we have multivariate stats with a MCV list, then
+ *
+ * (a) if we have a full equality condition (one equality condition
+ * on each column) and we found a match in the MCV list, this is
+ * the selectivity (and it's supposed to be exact)
+ *
+ * (b) if we have a full equality condition and we haven't found a
+ * match in the MCV list, then the selectivity is below the
+ * lowest selectivity in the MCV list
+ *
+ * (c) if we have an equality condition (not full), we can still
+ * search the MCV for matches and use the sum of probabilities
+ * as a lower boundary for the histogram (if there are no
+ * matches in the MCV list, then we have no boundary)
+ *
+ * Third, if there are multiple multivariate stats for a set of
+ * clauses, we may compute all of them and then somehow aggregate
+ * them - e.g. by choosing the minimum, median or average. The
+ * multi-variate stats are susceptible to overestimation (because
+ * we take 50% of the bucket for partial matches). Some stats may
+ * give better estimates than others, but it's very difficult to
+ * determine in advance which one is the best (it depends on the
+ * number of buckets, number of additional columns not
+ * referenced in the clauses etc.) so we may compute all and then
+ * choose a sane aggregation (minimum seems like a good approach).
+ * Of course, this may result in longer / more expensive estimation
+ * (CPU-wise), but it may be worth it.
+ *
+ * There are ways to address this, though. First, it's possible to
+ * add a GUC choosing between 'simple' estimation (using the single
+ * statistics expected to give the best estimate) and 'full'
+ * estimation (combining all the available estimates).
+ *
+ * multivariate_estimates = (simple|full)
+ *
+ * Also, this might be enabled at a table level, by something like
+ *
+ * ALTER TABLE ... SET STATISTICS (simple|full)
+ *
+ * Which would make it possible to use this only for the tables
+ * where the simple approach does not work.
+ *
+ * Also, there are ways to optimize this algorithmically. E.g. we
+ * may try to get an estimate from a matching MCV list first, and
+ * if we happen to get a "full equality match" we may stop computing
+ * the estimates from other stats (for this condition) because
+ * that's probably the best estimate we can really get.
+ *
+ * TODO When applying the clauses to the histogram/MCV list, we can do
+ * that from the most selective clauses first, because that'll
+ * eliminate the buckets/items sooner (so we'll be able to skip
+ * them without inspection, which is more expensive).
+ */
+static Selectivity
+clauselist_mv_selectivity(PlannerInfo *root, List *clauses, MVStatisticInfo *mvstats)
+{
+ bool fullmatch = false;
+
+ /*
+ * Lowest frequency in the MCV list (may be used as an upper bound
+ * for full equality conditions that did not match any MCV item).
+ */
+ Selectivity mcv_low = 0.0;
+
+ /* TODO Evaluate simple 1D selectivities, use the smallest one as
+ * an upper bound and their product as a lower bound, and sort
+ * the clauses in ascending order by selectivity (to optimize
+ * the MCV/histogram evaluation).
+ */
+
+ /* Evaluate the MCV selectivity */
+ return clauselist_mv_selectivity_mcvlist(root, clauses, mvstats,
+ &fullmatch, &mcv_low);
+}
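+
+/*
+ * A small worked example of the decomposition described above, with
+ * made-up selectivities: take P(A) = 0.1, P(C|A) = 0.5 and
+ * P(B|A) = 0.2, and assume B and C are independent. Then
+ *
+ * P(A & B & C) = P(A) * P(C|A) * P(B|A)
+ *              = 0.1 * 0.5 * 0.2 = 0.01
+ *
+ * while the plain independence assumption with P(B) = P(C) = 0.05
+ * would give 0.1 * 0.05 * 0.05 = 0.00025, i.e. a 40x underestimate.
+ */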
+
/*
* Collect attributes from mv-compatible clauses.
*/
static Bitmapset *
collect_mv_attnums(PlannerInfo *root, List *clauses, Oid varRelid,
- Index *relid, SpecialJoinInfo *sjinfo)
+ Index *relid, SpecialJoinInfo *sjinfo, int types)
{
Bitmapset *attnums = NULL;
ListCell *l;
@@ -927,12 +1199,11 @@ collect_mv_attnums(PlannerInfo *root, List *clauses, Oid varRelid,
*/
foreach (l, clauses)
{
- AttrNumber attnum;
Node *clause = (Node *) lfirst(l);
- /* ignore the result for now - we only need the info */
- if (clause_is_mv_compatible(root, clause, varRelid, relid, &attnum, sjinfo))
- attnums = bms_add_member(attnums, attnum);
+ /* ignore the result here - we only need the attnums */
+ clause_is_mv_compatible(root, clause, varRelid, relid, &attnums,
+ sjinfo, types);
}
/*
@@ -951,6 +1222,188 @@ collect_mv_attnums(PlannerInfo *root, List *clauses, Oid varRelid,
}
/*
+ * We're looking for statistics matching at least 2 attributes,
+ * referenced in the clauses compatible with multivariate statistics.
+ * The current selection criterion is very simple - we choose the
+ * statistics referencing the most attributes.
+ *
+ * If there are multiple statistics referencing the same number of
+ * columns (from the clauses), the one with fewer source columns
+ * (as listed in ADD STATISTICS when creating the statistics) wins.
+ * Otherwise the first one wins.
+ *
+ * This is a very simple criterion, and it has several weaknesses:
+ *
+ * (a) does not consider the accuracy of the statistics
+ *
+ * If there are two histograms built on the same set of columns,
+ * but one has 100 buckets and the other one has 1000 buckets (thus
+ * likely providing better estimates), this is not currently
+ * considered.
+ *
+ * (b) does not consider the type of statistics
+ *
+ * If there are three statistics - one containing just a MCV list,
+ * another one with just a histogram and a third one with both,
+ * this is not considered.
+ *
+ * (c) does not consider the number of clauses
+ *
+ * As explained, only the number of referenced attributes counts,
+ * so if there are multiple clauses on a single attribute, this
+ * still counts as a single attribute.
+ *
+ * (d) does not consider type of condition
+ *
+ * Some clauses may work better with some statistics - for example
+ * equality clauses probably work better with MCV lists than with
+ * histograms. But IS [NOT] NULL conditions may often work better
+ * with histograms (thanks to NULL-buckets).
+ *
+ * So for example with five WHERE conditions
+ *
+ * WHERE (a = 1) AND (b = 1) AND (c = 1) AND (d = 1) AND (e = 1)
+ *
+ * and statistics on (a,b), (a,b,e) and (a,b,c,d), the last one will be
+ * selected as it references the most columns.
+ *
+ * Once we have selected the multivariate statistics, we split the list
+ * of clauses into two parts - conditions that are compatible with the
+ * selected stats, and conditions that are estimated using simple
+ * (per-column) statistics.
+ *
+ * From the example above, conditions
+ *
+ * (a = 1) AND (b = 1) AND (c = 1) AND (d = 1)
+ *
+ * will be estimated using the multivariate statistics (a,b,c,d) while
+ * the last condition (e = 1) will get estimated using the regular ones.
+ *
+ * There are various alternative selection criteria (e.g. counting
+ * conditions instead of just referenced attributes), but eventually
+ * the best option should be to combine multiple statistics. But that's
+ * much harder to do correctly.
+ *
+ * TODO Select multiple statistics and combine them when computing
+ * the estimate.
+ *
+ * TODO This will probably have to consider compatibility of clauses,
+ * because 'dependencies' will probably work only with equality
+ * clauses.
+ */
+static MVStatisticInfo *
+choose_mv_statistics(List *stats, Bitmapset *attnums)
+{
+ int i;
+ ListCell *lc;
+
+ MVStatisticInfo *choice = NULL;
+
+ int current_matches = 1; /* goal #1: maximize */
+ int current_dims = (MVSTATS_MAX_DIMENSIONS+1); /* goal #2: minimize */
+
+ /*
+ * Walk through the list of statistics, and for each one count the
+ * attributes it shares with the clauses (encoded in the 'attnums'
+ * bitmap).
+ */
+ foreach (lc, stats)
+ {
+ MVStatisticInfo *info = (MVStatisticInfo *)lfirst(lc);
+
+ /* columns matching this statistics */
+ int matches = 0;
+
+ int2vector * attrs = info->stakeys;
+ int numattrs = attrs->dim1;
+
+ /* skip dependencies-only stats */
+ if (! info->mcv_built)
+ continue;
+
+ /* count columns covered by the histogram */
+ for (i = 0; i < numattrs; i++)
+ if (bms_is_member(attrs->values[i], attnums))
+ matches++;
+
+ /*
+ * Use this statistics when it improves the number of matches or
+ * when it matches the same number of attributes but is smaller.
+ */
+ if ((matches > current_matches) ||
+ ((matches == current_matches) && (current_dims > numattrs)))
+ {
+ choice = info;
+ current_matches = matches;
+ current_dims = numattrs;
+ }
+ }
+
+ return choice;
+}
+
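+/*
+ * A tie-breaking example (with hypothetical statistics): given
+ * clauses on attributes {a, b, c}, statistics on (a,b,c) and on
+ * (a,b,c,d) both match three attributes, so the narrower (a,b,c)
+ * statistics wins - fewer unreferenced dimensions generally means
+ * finer-grained statistics.
+ */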
+
+/*
+ * This splits the clauses list into two parts - one containing clauses
+ * that will be evaluated using the chosen statistics, and the remaining
+ * clauses (either non-mvcompatible, or not related to the histogram).
+ */
+static List *
+clauselist_mv_split(PlannerInfo *root, SpecialJoinInfo *sjinfo,
+ List *clauses, Oid varRelid, List **mvclauses,
+ MVStatisticInfo *mvstats, int types)
+{
+ int i;
+ ListCell *l;
+ List *non_mvclauses = NIL;
+
+ /* FIXME is there a better way to get info on int2vector? */
+ int2vector * attrs = mvstats->stakeys;
+ int numattrs = mvstats->stakeys->dim1;
+
+ Bitmapset *mvattnums = NULL;
+
+ /* build bitmap of attributes covered by the stats, so we can
+ * do bms_is_subset later */
+ for (i = 0; i < numattrs; i++)
+ mvattnums = bms_add_member(mvattnums, attrs->values[i]);
+
+ /* erase the list of mv-compatible clauses */
+ *mvclauses = NIL;
+
+ foreach (l, clauses)
+ {
+ bool match = false; /* by default not mv-compatible */
+ Bitmapset *attnums = NULL;
+ Node *clause = (Node *) lfirst(l);
+
+ if (clause_is_mv_compatible(root, clause, varRelid, NULL,
+ &attnums, sjinfo, types))
+ {
+ /* are all the attributes part of the selected stats? */
+ if (bms_is_subset(attnums, mvattnums))
+ match = true;
+ }
+
+ /*
+ * The clause matches the selected stats, so put it to the list
+ * of mv-compatible clauses. Otherwise, keep it in the list of
+ * 'regular' clauses (that may be selected later).
+ */
+ if (match)
+ *mvclauses = lappend(*mvclauses, clause);
+ else
+ non_mvclauses = lappend(non_mvclauses, clause);
+ }
+
+ /*
+ * Return the clauses incompatible with the chosen statistics;
+ * the caller estimates those the regular way.
+ */
+ return non_mvclauses;
+}
+
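+/*
+ * For example, with (hypothetical) statistics on (a,b) and clauses
+ * [(a = 1), (b < 2), (c = 3)], the first two clauses end up in
+ * *mvclauses while [(c = 3)] is returned, to be estimated the
+ * regular way.
+ */
+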
+/*
* Determines whether the clause is compatible with multivariate stats,
* and if it is, returns some additional information - varno (index
* into simple_rte_array) and a bitmap of attributes. This is then
@@ -969,93 +1422,197 @@ collect_mv_attnums(PlannerInfo *root, List *clauses, Oid varRelid,
*/
static bool
clause_is_mv_compatible(PlannerInfo *root, Node *clause, Oid varRelid,
- Index *relid, AttrNumber *attnum, SpecialJoinInfo *sjinfo)
+ Index *relid, Bitmapset **attnums, SpecialJoinInfo *sjinfo,
+ int types)
{
+ Relids clause_relids;
+ Relids left_relids;
+ Relids right_relids;
if (IsA(clause, RestrictInfo))
{
RestrictInfo *rinfo = (RestrictInfo *) clause;
- /* Pseudoconstants are not really interesting here. */
- if (rinfo->pseudoconstant)
+ if (! IsA(clause, RestrictInfo))
+ {
+ elog(WARNING, "expected RestrictInfo, got type %d", clause->type);
return false;
+ }
- /* no support for OR clauses at this point */
- if (rinfo->orclause)
+ /* Pseudoconstants are not really interesting here. */
+ if (rinfo->pseudoconstant)
return false;
/* get the actual clause from the RestrictInfo (it's not an OR clause) */
clause = (Node*)rinfo->clause;
- /* only simple opclauses are compatible with multivariate stats */
- if (! is_opclause(clause))
- return false;
-
/* we don't support join conditions at this moment */
if (treat_as_join_clause(clause, rinfo, varRelid, sjinfo))
return false;
+ clause_relids = rinfo->clause_relids;
+ left_relids = rinfo->left_relids;
+ right_relids = rinfo->right_relids;
+ }
+ else if (is_opclause(clause) && list_length(((OpExpr *) clause)->args) == 2)
+ {
+ left_relids = pull_varnos(get_leftop((Expr*)clause));
+ right_relids = pull_varnos(get_rightop((Expr*)clause));
+
+ clause_relids = bms_union(left_relids,
+ right_relids);
+ }
+ else
+ {
+ /* Not a binary opclause, so mark left/right relid sets as empty */
+ left_relids = NULL;
+ right_relids = NULL;
+ /* and get the total relid set the hard way */
+ clause_relids = pull_varnos((Node *) clause);
+ }
+
+ /*
+ * Only simple opclauses and IS NULL tests are compatible with
+ * multivariate stats at this point.
+ */
+ if ((is_opclause(clause))
+ && (list_length(((OpExpr *) clause)->args) == 2))
+ {
+ OpExpr *expr = (OpExpr *) clause;
+ bool varonleft = true;
+ bool ok;
+
/* is it 'variable op constant' ? */
- if (list_length(((OpExpr *) clause)->args) == 2)
+
+ ok = (bms_membership(clause_relids) == BMS_SINGLETON) &&
+ (is_pseudo_constant_clause_relids(lsecond(expr->args),
+ right_relids) ||
+ (varonleft = false,
+ is_pseudo_constant_clause_relids(linitial(expr->args),
+ left_relids)));
+
+ if (ok)
{
- OpExpr *expr = (OpExpr *) clause;
- bool varonleft = true;
- bool ok;
+ Var * var = (varonleft) ? linitial(expr->args) : lsecond(expr->args);
- ok = (bms_membership(rinfo->clause_relids) == BMS_SINGLETON) &&
- (is_pseudo_constant_clause_relids(lsecond(expr->args),
- rinfo->right_relids) ||
- (varonleft = false,
- is_pseudo_constant_clause_relids(linitial(expr->args),
- rinfo->left_relids)));
+ /*
+ * Simple variables only - otherwise the planner_rt_fetch seems to fail
+ * (return NULL).
+ *
+ * TODO Maybe using examine_variable() would fix that?
+ */
+ if (! (IsA(var, Var) && (varRelid == 0 || varRelid == var->varno)))
+ return false;
- if (ok)
- {
- Var * var = (varonleft) ? linitial(expr->args) : lsecond(expr->args);
+ /*
+ * Only consider this variable if (varRelid == 0) or when the varno
+ * matches varRelid (see explanation at clause_selectivity).
+ *
+ * FIXME I suspect this may not be really necessary. The (varRelid == 0)
+ * part seems to be enforced by treat_as_join_clause().
+ */
+ if (! ((varRelid == 0) || (varRelid == var->varno)))
+ return false;
- /*
- * Simple variables only - otherwise the planner_rt_fetch seems to fail
- * (return NULL).
- *
- * TODO Maybe use examine_variable() would fix that?
- */
- if (! (IsA(var, Var) && (varRelid == 0 || varRelid == var->varno)))
- return false;
+ /* Also skip special varno values, and system attributes ... */
+ if ((IS_SPECIAL_VARNO(var->varno)) || (! AttrNumberIsForUserDefinedAttr(var->varattno)))
+ return false;
- /*
- * Only consider this variable if (varRelid == 0) or when the varno
- * matches varRelid (see explanation at clause_selectivity).
- *
- * FIXME I suspect this may not be really necessary. The (varRelid == 0)
- * part seems to be enforced by treat_as_join_clause().
- */
- if (! ((varRelid == 0) || (varRelid == var->varno)))
- return false;
+ /* Lookup info about the base relation (we need to pass the relid out) */
+ if (relid != NULL)
+ *relid = var->varno;
+
+ /*
+ * If it's not a "<" or ">" or "=" operator, just ignore the
+ * clause. Otherwise note the relid and attnum for the variable.
+ * This uses the function for estimating selectivity, not the
+ * operator directly (a bit awkward, but well ...).
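+ *
+ * For example, any operator whose restriction estimator is
+ * scalarltsel gets treated as "<" here, regardless of data type.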
+ */
+ switch (get_oprrest(expr->opno))
+ {
+ case F_SCALARLTSEL:
+ case F_SCALARGTSEL:
+ /* not compatible with functional dependencies */
+ if (types & MV_CLAUSE_TYPE_MCV)
+ {
+ *attnums = bms_add_member(*attnums, var->varattno);
+ return (types & MV_CLAUSE_TYPE_MCV);
+ }
+ return false;
+
+ case F_EQSEL:
+ *attnums = bms_add_member(*attnums, var->varattno);
+ return true;
+ }
+ }
+ }
+ else if (IsA(clause, NullTest)
+ && IsA(((NullTest*)clause)->arg, Var))
+ {
+ Var * var = (Var*)((NullTest*)clause)->arg;
+
+ /*
+ * Simple variables only - otherwise the planner_rt_fetch seems to fail
+ * (returns NULL).
+ *
+ * TODO Maybe using examine_variable() would fix that?
+ */
+ if (! (IsA(var, Var) && (varRelid == 0 || varRelid == var->varno)))
+ return false;
+
+ /*
+ * Only consider this variable if (varRelid == 0) or when the varno
+ * matches varRelid (see explanation at clause_selectivity).
+ *
+ * FIXME I suspect this may not be really necessary. The (varRelid == 0)
+ * part seems to be enforced by treat_as_join_clause().
+ */
+ if (! ((varRelid == 0) || (varRelid == var->varno)))
+ return false;
- /* Also skip special varno values, and system attributes ... */
- if ((IS_SPECIAL_VARNO(var->varno)) || (! AttrNumberIsForUserDefinedAttr(var->varattno)))
- return false;
+ /* Also skip special varno values, and system attributes ... */
+ if ((IS_SPECIAL_VARNO(var->varno)) || (! AttrNumberIsForUserDefinedAttr(var->varattno)))
+ return false;
+ /* Lookup info about the base relation (we need to pass the relid out) */
+ if (relid != NULL)
*relid = var->varno;
- /*
- * If it's not a "<" or ">" or "=" operator, just ignore the
- * clause. Otherwise note the relid and attnum for the variable.
- * This uses the function for estimating selectivity, ont the
- * operator directly (a bit awkward, but well ...).
- */
- switch (get_oprrest(expr->opno))
- {
- case F_EQSEL:
- *attnum = var->varattno;
- return true;
- }
- }
+ *attnums = bms_add_member(*attnums, var->varattno);
+
+ return true;
+ }
+ else if (or_clause(clause) || and_clause(clause))
+ {
+ /*
+ * AND/OR-clauses are supported if all sub-clauses are supported
+ *
+ * TODO We might support mixed case, where some of the clauses
+ * are supported and some are not, and treat all supported
+ * subclauses as a single clause, compute its selectivity
+ * using mv stats, and compute the total selectivity using
+ * the current algorithm.
+ *
+ * TODO For RestrictInfo above an OR-clause, we might use the
+ * orclause with nested RestrictInfo - we won't have to
+ * call pull_varnos() for each clause, saving time.
+ */
+ Bitmapset *tmp = NULL;
+ ListCell *l;
+ foreach (l, ((BoolExpr*)clause)->args)
+ {
+ if (! clause_is_mv_compatible(root, (Node*)lfirst(l),
+ varRelid, relid, &tmp, sjinfo, types))
+ return false;
}
+
+ /* add the attnums from the AND/OR-clause to the set of attnums */
+ *attnums = bms_join(*attnums, tmp);
+
+ return true;
}
return false;
-
}
/*
@@ -1117,6 +1674,13 @@ clause_is_mv_compatible(PlannerInfo *root, Node *clause, Oid varRelid,
*
* TODO Merge this docs to dependencies.c, as it's saying mostly the
* same things as the comments there.
+ *
+ * TODO Currently this is applied only to the top-level clauses, but
+ * maybe we could apply it to lists at subtrees too, e.g. to the
+ * two AND-clauses in
+ *
+ * (x=1 AND y=2) OR (z=3 AND q=10)
+ *
*/
static List *
clauselist_apply_dependencies(PlannerInfo *root, List *clauses,
@@ -1200,17 +1764,27 @@ clauselist_apply_dependencies(PlannerInfo *root, List *clauses,
*/
foreach (lc, clauses)
{
- AttrNumber attnum;
+ Bitmapset *attnums = NULL;
Node *clause = (Node *) lfirst(lc);
- if (! clause_is_mv_compatible(root, clause, varRelid, &relid, &attnum, sjinfo))
+ if (! clause_is_mv_compatible(root, clause, varRelid, &relid, &attnums,
+ sjinfo, MV_CLAUSE_TYPE_FDEP))
+ reduced_clauses = lappend(reduced_clauses, clause);
+ else if (bms_num_members(attnums) > 1)
+ /* FIXME This may happen thanks to OR-clauses, which should
+ * really be handled differently for functional
+ * dependencies.
+ */
reduced_clauses = lappend(reduced_clauses, clause);
else
{
+ /* functional dependencies support only [Var = Const] */
+ Assert(bms_num_members(attnums) == 1);
mvclauses[nmvclauses] = clause;
- mvattnums[nmvclauses] = attnum;
+ mvattnums[nmvclauses] = bms_singleton_member(attnums);
nmvclauses++;
- clause_attnums = bms_add_member(clause_attnums, attnum);
+ clause_attnums = bms_add_member(clause_attnums,
+ bms_singleton_member(attnums));
}
}
@@ -1439,3 +2013,454 @@ clauselist_apply_dependencies(PlannerInfo *root, List *clauses,
return reduced_clauses;
}
+
+/*
+ * Estimate selectivity of clauses using a MCV list.
+ *
+ * If there's no MCV list for the stats, the function returns 0.0.
+ *
+ * While computing the estimate, the function checks whether all the
+ * columns were matched with an equality condition. If that's the case,
+ * we can skip processing the histogram, as there can be no rows in
+ * it with the same values - all the rows matching the condition are
+ * represented by the MCV item. This can only happen with equality
+ * on all the attributes.
+ *
+ * The algorithm works like this:
+ *
+ * 1) mark all items as 'match'
+ * 2) walk through all the clauses
+ * 3) for a particular clause, walk through all the items
+ * 4) skip items that are already 'no match'
+ * 5) check clause for items that still match
+ * 6) sum frequencies for items to get selectivity
+ *
+ * The function also returns the frequency of the least frequent item
+ * on the MCV list, which may be useful for clamping the estimate
+ * from the histogram (all items not in the MCV list are less frequent).
+ * This however seems useful only for cases with conditions on all
+ * attributes.
+ *
+ * TODO This only handles AND-ed clauses, but it might work for OR-ed
+ * lists too - it just needs to reverse the logic a bit. I.e. start
+ * with 'no match' for all items, and mark the items as a match
+ * as the clauses are processed (and skip items that are 'match').
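+ *
+ * For illustration (hypothetical data): given an MCV list
+ * {(1,1): 0.4, (1,2): 0.3, (2,2): 0.2} and the clauses
+ * (a = 1) AND (b = 2), the first clause rules out item (2,2),
+ * the second rules out (1,1), and summing the frequencies of
+ * the remaining items gives a selectivity of 0.3.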
+ */
+static Selectivity
+clauselist_mv_selectivity_mcvlist(PlannerInfo *root, List *clauses,
+ MVStatisticInfo *mvstats, bool *fullmatch,
+ Selectivity *lowsel)
+{
+ int i;
+ Selectivity s = 0.0;
+ Selectivity u = 0.0;
+
+ MCVList mcvlist = NULL;
+ int nmatches = 0;
+
+ /* match/mismatch bitmap for each MCV item */
+ char * matches = NULL;
+
+ Assert(clauses != NIL);
+ Assert(list_length(clauses) >= 2);
+
+ /* there's no MCV list built yet */
+ if (! mvstats->mcv_built)
+ return 0.0;
+
+ mcvlist = load_mv_mcvlist(mvstats->mvoid);
+
+ Assert(mcvlist != NULL);
+ Assert(mcvlist->nitems > 0);
+
+ /* by default all the MCV items match the clauses fully */
+ matches = palloc0(sizeof(char) * mcvlist->nitems);
+ memset(matches, MVSTATS_MATCH_FULL, sizeof(char)*mcvlist->nitems);
+
+ /* number of matching MCV items */
+ nmatches = mcvlist->nitems;
+
+ nmatches = update_match_bitmap_mcvlist(root, clauses,
+ mvstats->stakeys, mcvlist,
+ nmatches, matches,
+ lowsel, fullmatch, false);
+
+ /* sum frequencies for all the matching MCV items */
+ for (i = 0; i < mcvlist->nitems; i++)
+ {
+ /* used to 'scale' for MCV lists not covering all tuples */
+ u += mcvlist->items[i]->frequency;
+
+ if (matches[i] != MVSTATS_MATCH_NONE)
+ s += mcvlist->items[i]->frequency;
+ }
+
+ pfree(matches);
+ pfree(mcvlist);
+
+ return s*u;
+}
+
+/*
+ * Evaluate clauses using the MCV list, and update the match bitmap.
+ *
+ * The bitmap may be already partially set, so this is really a way to
+ * combine results of several clause lists - either when computing
+ * conditional probability P(A|B) or a combination of AND/OR clauses.
+ *
+ * TODO This works with 'bitmap' where each bit is represented as a char,
+ * which is slightly wasteful. Instead, we could use a regular
+ * bitmap, reducing the size to ~1/8. Another thing is merging the
+ * bitmaps using & and |, which might be faster than min/max.
+ */
+static int
+update_match_bitmap_mcvlist(PlannerInfo *root, List *clauses,
+ int2vector *stakeys, MCVList mcvlist,
+ int nmatches, char * matches,
+ Selectivity *lowsel, bool *fullmatch,
+ bool is_or)
+{
+ int i;
+ ListCell * l;
+
+ Bitmapset *eqmatches = NULL; /* attributes with equality matches */
+
+ /* The bitmap may be partially built. */
+ Assert(nmatches >= 0);
+ Assert(nmatches <= mcvlist->nitems);
+ Assert(clauses != NIL);
+ Assert(list_length(clauses) >= 1);
+ Assert(mcvlist != NULL);
+ Assert(mcvlist->nitems > 0);
+
+ /* No further matches possible (AND), or everything matches already (OR) */
+ if (((nmatches == 0) && (! is_or)) ||
+ ((nmatches == mcvlist->nitems) && is_or))
+ return nmatches;
+
+ /* frequency of the lowest MCV item */
+ *lowsel = 1.0;
+
+ /*
+ * Loop through the list of clauses, and for each of them evaluate
+ * all the MCV items not yet eliminated by the preceding clauses.
+ *
+ * FIXME This would probably deserve a refactoring, I guess. Unify
+ * the two loops and put the checks inside, or something like
+ * that.
+ */
+ foreach (l, clauses)
+ {
+ Node * clause = (Node*)lfirst(l);
+
+ /* if it's a RestrictInfo, then extract the clause */
+ if (IsA(clause, RestrictInfo))
+ clause = (Node*)((RestrictInfo*)clause)->clause;
+
+ /* if there are no remaining matches possible, we can stop */
+ if (((nmatches == 0) && (! is_or)) ||
+ ((nmatches == mcvlist->nitems) && is_or))
+ break;
+
+ /* it's either an OpExpr, a NullTest, or an AND/OR clause */
+ if (is_opclause(clause))
+ {
+ OpExpr * expr = (OpExpr*)clause;
+ bool varonleft = true;
+ bool ok;
+
+ /* operator */
+ FmgrInfo opproc;
+
+ /* get procedure computing operator selectivity */
+ RegProcedure oprrest = get_oprrest(expr->opno);
+
+ fmgr_info(get_opcode(expr->opno), &opproc);
+
+ ok = (NumRelids(clause) == 1) &&
+ (is_pseudo_constant_clause(lsecond(expr->args)) ||
+ (varonleft = false,
+ is_pseudo_constant_clause(linitial(expr->args))));
+
+ if (ok)
+ {
+
+ FmgrInfo ltproc, gtproc;
+ Var * var = (varonleft) ? linitial(expr->args) : lsecond(expr->args);
+ Const * cst = (varonleft) ? lsecond(expr->args) : linitial(expr->args);
+ bool isgt = (! varonleft);
+
+ /*
+ * TODO Fetch only when really needed (probably for equality only)
+ * TODO Technically either lt/gt is sufficient.
+ *
+ * FIXME The code in analyze.c creates histograms only for types
+ * with enough ordering (by calling get_sort_group_operators).
+ * Is this the same assumption, i.e. are we certain that we
+ * get the ltproc/gtproc every time we ask? Or are there types
+ * where get_sort_group_operators returns ltopr and here we
+ * get nothing?
+ */
+ TypeCacheEntry *typecache
+ = lookup_type_cache(var->vartype,
+ TYPECACHE_EQ_OPR | TYPECACHE_LT_OPR | TYPECACHE_GT_OPR);
+
+ /* FIXME do proper matching of attribute to dimension */
+ int idx = mv_get_index(var->varattno, stakeys);
+
+ fmgr_info(get_opcode(typecache->lt_opr), &ltproc);
+ fmgr_info(get_opcode(typecache->gt_opr), &gtproc);
+
+ /*
+ * Walk through the MCV items and evaluate the current clause. We can
+ * skip items that were already ruled out, and terminate if there are
+ * no remaining MCV items that might possibly match.
+ */
+ for (i = 0; i < mcvlist->nitems; i++)
+ {
+ bool mismatch = false;
+ MCVItem item = mcvlist->items[i];
+
+ /*
+ * find the lowest selectivity in the MCV
+ * FIXME Maybe not the best place to do this (it runs for all clauses).
+ */
+ if (item->frequency < *lowsel)
+ *lowsel = item->frequency;
+
+ /*
+ * If there are no more matches (AND) or no remaining unmatched
+ * items (OR), we can stop processing this clause.
+ */
+ if (((nmatches == 0) && (! is_or)) ||
+ ((nmatches == mcvlist->nitems) && is_or))
+ break;
+
+ /*
+ * For AND-lists, we can also mark NULL items as 'no match' (and
+ * then skip them). For OR-lists this is not possible.
+ */
+ if ((! is_or) && item->isnull[idx])
+ matches[i] = MVSTATS_MATCH_NONE;
+
+ /* skip MCV items that were already ruled out */
+ if ((! is_or) && (matches[i] == MVSTATS_MATCH_NONE))
+ continue;
+ else if (is_or && (matches[i] == MVSTATS_MATCH_FULL))
+ continue;
+
+ /* TODO consider bsearch here (list is sorted by values)
+ * TODO handle other operators too (LT, GT)
+ * TODO identify "full match" when the clauses fully
+ * match the whole MCV list (so that checking the
+ * histogram is not needed)
+ */
+ if (oprrest == F_EQSEL)
+ {
+ /*
+ * We don't care about isgt in equality, because it does not
+ * matter whether it's (var = const) or (const = var).
+ */
+ bool match = DatumGetBool(FunctionCall2Coll(&opproc,
+ DEFAULT_COLLATION_OID,
+ cst->constvalue,
+ item->values[idx]));
+
+ if (match)
+ eqmatches = bms_add_member(eqmatches, idx);
+
+ mismatch = (! match);
+ }
+ else if (oprrest == F_SCALARLTSEL) /* column < constant */
+ {
+
+ if (! isgt) /* (var < const) */
+ {
+ /*
+ * If the constant is below the item's value, the item cannot
+ * match the (var < const) clause.
+ */
+ mismatch = DatumGetBool(FunctionCall2Coll(&opproc,
+ DEFAULT_COLLATION_OID,
+ cst->constvalue,
+ item->values[idx]));
+
+ } /* (get_oprrest(expr->opno) == F_SCALARLTSEL) */
+ else /* (const < var) */
+ {
+ /*
+ * If the item's value is below the constant, the item cannot
+ * match the (const < var) clause.
+ */
+ mismatch = DatumGetBool(FunctionCall2Coll(&opproc,
+ DEFAULT_COLLATION_OID,
+ item->values[idx],
+ cst->constvalue));
+ }
+ }
+ else if (oprrest == F_SCALARGTSEL) /* column > constant */
+ {
+
+ if (! isgt) /* (var > const) */
+ {
+ /*
+ * If the constant is above the item's value, the item cannot
+ * match the (var > const) clause.
+ */
+ mismatch = DatumGetBool(FunctionCall2Coll(&opproc,
+ DEFAULT_COLLATION_OID,
+ cst->constvalue,
+ item->values[idx]));
+ }
+ else /* (const > var) */
+ {
+ /*
+ * If the item's value is above the constant, the item cannot
+ * match the (const > var) clause.
+ */
+ mismatch = DatumGetBool(FunctionCall2Coll(&opproc,
+ DEFAULT_COLLATION_OID,
+ item->values[idx],
+ cst->constvalue));
+ }
+
+ } /* (get_oprrest(expr->opno) == F_SCALARGTSEL) */
+
+ /* XXX The conditions on matches[i] are not needed, as we
+ * skip MCV items that can't become true/false, depending
+ * on the current flag. See beginning of the loop over
+ * MCV items.
+ */
+
+ if ((is_or) && (matches[i] == MVSTATS_MATCH_NONE) && (! mismatch))
+ {
+ /* OR - was MATCH_NONE, but will be MATCH_FULL */
+ matches[i] = MVSTATS_MATCH_FULL;
+ ++nmatches;
+ continue;
+ }
+ else if ((! is_or) && (matches[i] == MVSTATS_MATCH_FULL) && mismatch)
+ {
+ /* AND - was MATCH_FULL, but will be MATCH_NONE */
+ matches[i] = MVSTATS_MATCH_NONE;
+ --nmatches;
+ continue;
+ }
+
+ }
+ }
+ }
+ else if (IsA(clause, NullTest))
+ {
+ NullTest * expr = (NullTest*)clause;
+ Var * var = (Var*)(expr->arg);
+
+ /* FIXME do proper matching of attribute to dimension */
+ int idx = mv_get_index(var->varattno, stakeys);
+
+ /*
+ * Walk through the MCV items and evaluate the current clause. We can
+ * skip items that were already ruled out, and terminate if there are
+ * no remaining MCV items that might possibly match.
+ */
+ for (i = 0; i < mcvlist->nitems; i++)
+ {
+ MCVItem item = mcvlist->items[i];
+
+ /*
+ * find the lowest selectivity in the MCV
+ * FIXME Maybe not the best place to do this (it runs for all clauses).
+ */
+ if (item->frequency < *lowsel)
+ *lowsel = item->frequency;
+
+ /* if there are no more matches, we can stop processing this clause */
+ if (nmatches == 0)
+ break;
+
+ /* skip MCV items that were already ruled out */
+ if (matches[i] == MVSTATS_MATCH_NONE)
+ continue;
+
+ /* if the clause mismatches the MCV item, set it as MATCH_NONE */
+ if (((expr->nulltesttype == IS_NULL) && (! mcvlist->items[i]->isnull[idx])) ||
+ ((expr->nulltesttype == IS_NOT_NULL) && (mcvlist->items[i]->isnull[idx])))
+ {
+ matches[i] = MVSTATS_MATCH_NONE;
+ --nmatches;
+ }
+ }
+ }
+ else if (or_clause(clause) || and_clause(clause))
+ {
+ /* AND/OR clause, with all clauses compatible with the selected MV stat */
+
+ int i;
+ BoolExpr *orclause = ((BoolExpr*)clause);
+ List *orclauses = orclause->args;
+
+ /* match/mismatch bitmap for each MCV item */
+ int or_nmatches = 0;
+ char * or_matches = NULL;
+
+ Assert(orclauses != NIL);
+ Assert(list_length(orclauses) >= 2);
+
+ /* number of matching MCV items */
+ or_nmatches = mcvlist->nitems;
+
+ /* by default none of the MCV items matches the clauses */
+ or_matches = palloc0(sizeof(char) * or_nmatches);
+
+ if (or_clause(clause))
+ {
+ /* OR clauses assume nothing matches, initially */
+ memset(or_matches, MVSTATS_MATCH_NONE, sizeof(char)*or_nmatches);
+ or_nmatches = 0;
+ }
+ else
+ {
+ /* AND clauses assume everything matches, initially */
+ memset(or_matches, MVSTATS_MATCH_FULL, sizeof(char)*or_nmatches);
+ }
+
+ /* build the match bitmap for the OR-clauses */
+ or_nmatches = update_match_bitmap_mcvlist(root, orclauses,
+ stakeys, mcvlist,
+ or_nmatches, or_matches,
+ lowsel, fullmatch, or_clause(clause));
+
+ /* merge the bitmap into the existing one */
+ for (i = 0; i < mcvlist->nitems; i++)
+ {
+ /*
+ * To AND-merge the bitmaps, MIN() semantics is used.
+ * For OR-merge, MAX() is used.
+ *
+ * FIXME this does not update nmatches (the number of matches)
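+ *
+ * E.g. AND-merging an item flagged MVSTATS_MATCH_FULL with
+ * MVSTATS_MATCH_NONE takes the MIN() of the two flags, so the
+ * result is MVSTATS_MATCH_NONE.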
+ */
+ UPDATE_RESULT(matches[i], or_matches[i], is_or);
+ }
+
+ pfree(or_matches);
+
+ }
+ else
+ {
+ elog(ERROR, "unknown clause type: %d", clause->type);
+ }
+ }
+
+ /*
+ * If all the columns were matched by equality, it's a full match.
+ * In this case there can be at most a single matching MCV item
+ * (two different items cannot both match all the equality clauses).
+ */
+ *fullmatch = (bms_num_members(eqmatches) == mcvlist->ndimensions);
+
+ /* free the allocated pieces */
+ if (eqmatches)
+ pfree(eqmatches);
+
+ return nmatches;
+}
diff --git a/src/backend/optimizer/util/plancat.c b/src/backend/optimizer/util/plancat.c
index 1cf64f8..c196ca0 100644
--- a/src/backend/optimizer/util/plancat.c
+++ b/src/backend/optimizer/util/plancat.c
@@ -406,7 +406,7 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
mvstat = (Form_pg_mv_statistic) GETSTRUCT(htup);
/* unavailable stats are not interesting for the planner */
- if (mvstat->deps_built)
+ if (mvstat->deps_built || mvstat->mcv_built)
{
info = makeNode(MVStatisticInfo);
@@ -415,9 +415,11 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
/* enabled statistics */
info->deps_enabled = mvstat->deps_enabled;
+ info->mcv_enabled = mvstat->mcv_enabled;
/* built/available statistics */
info->deps_built = mvstat->deps_built;
+ info->mcv_built = mvstat->mcv_built;
/* stakeys */
adatum = SysCacheGetAttr(MVSTATOID, htup,
diff --git a/src/backend/utils/mvstats/Makefile b/src/backend/utils/mvstats/Makefile
index 099f1ed..3c0aff4 100644
--- a/src/backend/utils/mvstats/Makefile
+++ b/src/backend/utils/mvstats/Makefile
@@ -12,6 +12,6 @@ subdir = src/backend/utils/mvstats
top_builddir = ../../../..
include $(top_builddir)/src/Makefile.global
-OBJS = common.o dependencies.o
+OBJS = common.o mcv.o dependencies.o
include $(top_srcdir)/src/backend/common.mk
diff --git a/src/backend/utils/mvstats/common.c b/src/backend/utils/mvstats/common.c
index bd200bc..d1da714 100644
--- a/src/backend/utils/mvstats/common.c
+++ b/src/backend/utils/mvstats/common.c
@@ -16,12 +16,14 @@
#include "common.h"
+#include "utils/array.h"
+
static VacAttrStats ** lookup_var_attr_stats(int2vector *attrs,
- int natts, VacAttrStats **vacattrstats);
+ int natts,
+ VacAttrStats **vacattrstats);
static List* list_mv_stats(Oid relid);
-
/*
* Compute requested multivariate stats, using the rows sampled for the
* plain (single-column) stats.
@@ -49,6 +51,8 @@ build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
int j;
MVStatisticInfo *stat = (MVStatisticInfo *)lfirst(lc);
MVDependencies deps = NULL;
+ MCVList mcvlist = NULL;
+ int numrows_filtered = 0;
VacAttrStats **stats = NULL;
int numatts = 0;
@@ -87,8 +91,12 @@ build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
if (stat->deps_enabled)
deps = build_mv_dependencies(numrows, rows, attrs, stats);
+ /* build the MCV list */
+ if (stat->mcv_enabled)
+ mcvlist = build_mv_mcvlist(numrows, rows, attrs, stats, &numrows_filtered);
+
/* store the histogram / MCV list in the catalog */
- update_mv_stats(stat->mvoid, deps, attrs);
+ update_mv_stats(stat->mvoid, deps, mcvlist, attrs, stats);
}
}
@@ -166,6 +174,8 @@ list_mv_stats(Oid relid)
info->stakeys = buildint2vector(stats->stakeys.values, stats->stakeys.dim1);
info->deps_enabled = stats->deps_enabled;
info->deps_built = stats->deps_built;
+ info->mcv_enabled = stats->mcv_enabled;
+ info->mcv_built = stats->mcv_built;
result = lappend(result, info);
}
@@ -180,8 +190,56 @@ list_mv_stats(Oid relid)
return result;
}
+
+/*
+ * Find attnums of MV stats using the mvoid.
+ */
+int2vector*
+find_mv_attnums(Oid mvoid, Oid *relid)
+{
+ ArrayType *arr;
+ Datum adatum;
+ bool isnull;
+ HeapTuple htup;
+ int2vector *keys;
+
+ /* Fetch the pg_mv_statistic tuple for the given mvoid. */
+ htup = SearchSysCache1(MVSTATOID,
+ ObjectIdGetDatum(mvoid));
+
+ /* XXX syscache contains OIDs of deleted stats (not invalidated) */
+ if (! HeapTupleIsValid(htup))
+ return NULL;
+
+ /* starelid */
+ adatum = SysCacheGetAttr(MVSTATOID, htup,
+ Anum_pg_mv_statistic_starelid, &isnull);
+ Assert(!isnull);
+
+ *relid = DatumGetObjectId(adatum);
+
+ /* stakeys */
+ adatum = SysCacheGetAttr(MVSTATOID, htup,
+ Anum_pg_mv_statistic_stakeys, &isnull);
+ Assert(!isnull);
+
+ arr = DatumGetArrayTypeP(adatum);
+
+ keys = buildint2vector((int16 *) ARR_DATA_PTR(arr),
+ ARR_DIMS(arr)[0]);
+ ReleaseSysCache(htup);
+
+ /* TODO maybe save the list into relcache, as in RelationGetIndexList
+ * (which was used as inspiration for this function)? */
+
+ return keys;
+}
+
+
void
-update_mv_stats(Oid mvoid, MVDependencies dependencies, int2vector *attrs)
+update_mv_stats(Oid mvoid,
+ MVDependencies dependencies, MCVList mcvlist,
+ int2vector *attrs, VacAttrStats **stats)
{
HeapTuple stup,
oldtup;
@@ -206,18 +264,29 @@ update_mv_stats(Oid mvoid, MVDependencies dependencies, int2vector *attrs)
= PointerGetDatum(serialize_mv_dependencies(dependencies));
}
+ if (mcvlist != NULL)
+ {
+ bytea * data = serialize_mv_mcvlist(mcvlist, attrs, stats);
+ nulls[Anum_pg_mv_statistic_stamcv -1] = (data == NULL);
+ values[Anum_pg_mv_statistic_stamcv - 1] = PointerGetDatum(data);
+ }
+
/* always replace the value (either by bytea or NULL) */
replaces[Anum_pg_mv_statistic_stadeps -1] = true;
+ replaces[Anum_pg_mv_statistic_stamcv -1] = true;
/* always change the availability flags */
nulls[Anum_pg_mv_statistic_deps_built -1] = false;
+ nulls[Anum_pg_mv_statistic_mcv_built -1] = false;
nulls[Anum_pg_mv_statistic_stakeys-1] = false;
/* use the new attnums, in case we removed some dropped ones */
replaces[Anum_pg_mv_statistic_deps_built-1] = true;
+ replaces[Anum_pg_mv_statistic_mcv_built -1] = true;
replaces[Anum_pg_mv_statistic_stakeys -1] = true;
values[Anum_pg_mv_statistic_deps_built-1] = BoolGetDatum(dependencies != NULL);
+ values[Anum_pg_mv_statistic_mcv_built -1] = BoolGetDatum(mcvlist != NULL);
values[Anum_pg_mv_statistic_stakeys -1] = PointerGetDatum(attrs);
/* Is there already a pg_mv_statistic tuple for this attribute? */
@@ -246,6 +315,21 @@ update_mv_stats(Oid mvoid, MVDependencies dependencies, int2vector *attrs)
heap_close(sd, RowExclusiveLock);
}
+
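+/*
+ * Translate the attribute number to an index (dimension) within
+ * the stats. The stakeys vector is sorted, so the dimension is
+ * simply the number of keys smaller than the given attnum.
+ */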
+int
+mv_get_index(AttrNumber varattno, int2vector * stakeys)
+{
+ int i, idx = 0;
+ for (i = 0; i < stakeys->dim1; i++)
+ {
+ if (stakeys->values[i] < varattno)
+ idx += 1;
+ else
+ break;
+ }
+ return idx;
+}
+
/* multi-variate stats comparator */
/*
@@ -256,11 +340,15 @@ update_mv_stats(Oid mvoid, MVDependencies dependencies, int2vector *attrs)
int
compare_scalars_simple(const void *a, const void *b, void *arg)
{
- Datum da = *(Datum*)a;
- Datum db = *(Datum*)b;
- SortSupport ssup= (SortSupport) arg;
+ return compare_datums_simple(*(Datum*)a,
+ *(Datum*)b,
+ (SortSupport)arg);
+}
- return ApplySortComparator(da, false, db, false, ssup);
+int
+compare_datums_simple(Datum a, Datum b, SortSupport ssup)
+{
+ return ApplySortComparator(a, false, b, false, ssup);
}
/*
diff --git a/src/backend/utils/mvstats/common.h b/src/backend/utils/mvstats/common.h
index 6d5465b..f4309f7 100644
--- a/src/backend/utils/mvstats/common.h
+++ b/src/backend/utils/mvstats/common.h
@@ -46,7 +46,15 @@ typedef struct
Datum value; /* a data value */
int tupno; /* position index for tuple it came from */
} ScalarItem;
-
+
+/* (de)serialization info */
+typedef struct DimensionInfo {
+ int nvalues; /* number of deduplicated values */
+ int nbytes; /* number of bytes (serialized) */
+ int typlen; /* pg_type.typlen */
+ bool typbyval; /* pg_type.typbyval */
+} DimensionInfo;
+
/* multi-sort */
typedef struct MultiSortSupportData {
int ndims; /* number of dimensions supported by the */
@@ -71,5 +79,6 @@ int multi_sort_compare_dim(int dim, const SortItem *a,
const SortItem *b, MultiSortSupport mss);
/* comparators, used when constructing multivariate stats */
+int compare_datums_simple(Datum a, Datum b, SortSupport ssup);
int compare_scalars_simple(const void *a, const void *b, void *arg);
int compare_scalars_partition(const void *a, const void *b, void *arg);
diff --git a/src/backend/utils/mvstats/mcv.c b/src/backend/utils/mvstats/mcv.c
new file mode 100644
index 0000000..96bdf41
--- /dev/null
+++ b/src/backend/utils/mvstats/mcv.c
@@ -0,0 +1,1232 @@
+/*-------------------------------------------------------------------------
+ *
+ * mcv.c
+ * POSTGRES multivariate MCV lists
+ *
+ *
+ * Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ * src/backend/utils/mvstats/mcv.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres.h"
+
+#include "fmgr.h"
+#include "funcapi.h"
+
+#include "utils/lsyscache.h"
+
+#include "common.h"
+
+/*
+ * Multivariate MCVs (most-common values lists) are a straightforward
+ * extension of the regular (single-column) MCV list - they track
+ * combinations of values for several attributes (columns), including
+ * NULL flags, and the frequency of each combination.
+ *
+ * For columns with a small number of distinct values, this works quite
+ * well and may represent the distribution pretty exactly. For columns
+ * with a large number of distinct values (e.g. stored as FLOAT), this
+ * does not work that well.
+ *
+ * If we can represent the distribution as a MCV list, we can estimate
+ * some clauses (e.g. equality clauses) much more accurately than using
+ * histograms, for example.
+ *
+ * Discrete distributions are also easier to combine into a larger
+ * distribution (but this is not yet implemented).
+ *
+ *
+ * TODO For types that don't reasonably support ordering (either because
+ * the type does not support that or when the user adds some option
+ * to the ADD STATISTICS command - e.g. UNSORTED_STATS), building
+ * the histogram may be pointless and inefficient. This is esp.
+ * true for varlena types that may be quite large and a large MCV
+ * list may be a better choice, because it makes equality estimates
+ * more accurate. Due to the unsorted nature, range queries on those
+ * attributes are rather useless anyway.
+ *
+ * Another thing is that by restricting to MCV list and equality
+ * conditions, we can use hash values instead of long varlena values.
+ * The equality estimation will be very accurate.
+ *
+ * This however complicates matching the columns to available
+ * statistics, as it will require matching clauses (not columns) to
+ * stats. And it may get quite complex - e.g. what if there are
+ * multiple clauses, each compatible with different stats subset?
+ *
+ *
+ * Selectivity estimation
+ * ----------------------
+ * The estimation, implemented in clauselist_mv_selectivity_mcvlist(),
+ * is quite simple in principle - walk through the MCV items and sum
+ * frequencies of all the items that match all the clauses.
+ *
+ * The current implementation uses MCV lists to estimate these types
+ * of clauses (think of WHERE conditions):
+ *
+ * (a) equality clauses WHERE (a = 1) AND (b = 2)
+ *
+ * (b) inequality clauses WHERE (a < 1) AND (b >= 2)
+ *
+ * It's possible to add more clauses, for example:
+ *
+ * (a) NULL clauses WHERE (a IS NULL) AND (b IS NOT NULL)
+ *
+ * (b) multi-var clauses WHERE (a > b)
+ *
+ * and so on. These are tasks for the future, not yet implemented.
+ *
+ *
+ * Estimating equality clauses
+ * ---------------------------
+ * When computing selectivity estimate for equality clauses
+ *
+ * (a = 1) AND (b = 2)
+ *
+ * we can do this estimate pretty exactly assuming that two conditions
+ * are met:
+ *
+ * (1) there's an equality condition on each attribute
+ *
+ * (2) we find a matching item in the MCV list
+ *
+ * In that case we know the MCV item represents all the tuples matching
+ * the clauses, and the selectivity estimate is complete. This is what
+ * we call 'full match'.
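+ *
+ * For example (hypothetical numbers): with stats on (a,b) and
+ * WHERE (a = 1) AND (b = 2), finding the MCV item (1,2) with
+ * frequency 0.04 gives a selectivity of exactly 0.04, as no rows
+ * outside that item can satisfy both equalities.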
+ *
+ * When only (1) holds, but there's no matching MCV item, we don't know
+ * whether there are no such rows at all, or whether they are just not
+ * frequent enough to make the list. We can however use the frequency
+ * of the least frequent MCV item as an upper
+ * bound for the selectivity.
+ *
+ * If the equality conditions match only a subset of the attributes
+ * the MCV list is built on, we can't get a full match - we may get
+ * multiple MCV items matching the clauses, and even if we get a single
+ * match there may be rows that did not get into the MCV list. But in
+ * this case we can still use the frequency of the least frequent MCV
+ * item to clamp the 'additional' selectivity not accounted for by the
+ * matching items.
+ *
+ * If there's no histogram, because the MCV list approximates the
+ * distribution accurately (not because the histogram was disabled),
+ * it does not really matter whether there are equality conditions on
+ * all the columns - we can do pretty accurate estimation using the MCV.
+ *
+ * TODO For a combination of equality conditions (not full-match case)
+ * we probably can clamp the selectivity by the minimum of
+ * selectivities for each condition. For example if we know the
+ * number of distinct values for each column, we can use 1/ndistinct
+ * as a per-column estimate. Or rather 1/ndistinct + selectivity
+ * derived from the MCV list.
+ *
+ * If we know the estimate of number of combinations of the columns
+ * (i.e. ndistinct(A,B)), we may estimate the average frequency of
+ * items in the remaining 10% as [10% / ndistinct(A,B)].
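+ *
+ * For example (hypothetical numbers): if the MCV list covers 90%
+ * of the table and we expect ndistinct(A,B) = 200 combinations in
+ * the remaining part, the average frequency of those items is
+ * 0.10 / 200 = 0.0005.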
+ *
+ *
+ * Bounding estimates
+ * ------------------
+ * In general the MCV lists may not provide estimates as accurate as
+ * for the full-match equality case, but may provide some useful
+ * lower/upper boundaries for the estimation error.
+ *
+ * With equality clauses we can do a few more tricks to narrow this
+ * error range (see the previous section and TODO), but with inequality
+ * clauses (or generally non-equality clauses), it's rather difficult.
+ * There's nothing like a 'full match' - we have to consider both the
+ * MCV items and the remaining part every time. We can't use the minimum
+ * selectivity of MCV items, as the clauses may match multiple items.
+ *
+ * For example with a MCV list on columns (A, B), covering 90% of the
+ * table (computed while building the MCV list), about ~10% of the table
+ * is not represented by the MCV list. So even if the conditions match
+ * all the remaining rows (not represented by the MCV items), we can't
+ * get selectivity higher than those 10%. We may use 1/2 the remaining
+ * selectivity as an estimate (minimizing average error).
+ *
+ * TODO Most of these ideas (error limiting) are not yet implemented.
+ *
+ *
+ * General TODO
+ * ------------
+ *
+ * FIXME Use max_mcv_items from ALTER TABLE ADD STATISTICS command.
+ *
+ * TODO Add support for IS [NOT] NULL clauses, and clauses referencing
+ * multiple columns (a < b).
+ *
+ * TODO It's possible to build a special case of MCV list, storing not
+ * the actual values but only 32/64-bit hash. This is only useful
+ * for estimating equality clauses and for large varlena types,
+ * which are very impractical for plain MCV list because of size.
+ * But for those data types we really want just the equality
+ * clauses, so it's actually a good solution.
+ *
+ * TODO Currently there's no logic to consider building only a MCV list
+ * (and not building the histogram at all), except for doing this
+ * decision manually in ADD STATISTICS.
+ */
+
+/*
+ * Each serialized item needs to store (in this order):
+ *
+ * - indexes (ndim * sizeof(uint16))
+ * - null flags (ndim * sizeof(bool))
+ * - frequency (sizeof(double))
+ *
+ * So in total:
+ *
+ * ndim * (sizeof(uint16) + sizeof(bool)) + sizeof(double)
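+ *
+ * For example, for ndims = 2 that is 2 * (2 + 1) + 8 = 14 bytes.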
+ */
+#define ITEM_SIZE(ndims) \
+ (ndims * (sizeof(uint16) + sizeof(bool)) + sizeof(double))
+
+/* pointers into a flat serialized item of ITEM_SIZE(n) bytes */
+#define ITEM_INDEXES(item) ((uint16*)item)
+#define ITEM_NULLS(item,ndims) ((bool*)(ITEM_INDEXES(item) + ndims))
+#define ITEM_FREQUENCY(item,ndims) ((double*)(ITEM_NULLS(item,ndims) + ndims))
+
+/*
+ * Builds MCV list from sample rows, and removes rows represented by
+ * the MCV list from the sample (the number of remaining sample rows is
+ * returned by the numrows_filtered parameter).
+ *
+ * The method is quite simple - in short, it performs these steps:
+ *
+ * (1) sort the data (default collation, '<' for the data type)
+ *
+ * (2) count distinct groups, decide how many to keep
+ *
+ * (3) build the MCV list using the threshold determined in (2)
+ *
+ * (4) remove rows represented by the MCV from the sample
+ *
+ * For more details, see the comments in the code.
+ *
+ * FIXME Single-dimensional MCV is sorted by frequency (descending). We
+ * should do that too, because when walking through the list we
+ * want to check the most frequent items first.
+ *
+ * TODO We're using Datum (8B), even for smaller data types (e.g.
+ * int4 or float4). Maybe we could save some space here, but the bytea
+ * compression should handle it just fine.
+ *
+ * TODO This probably should not use the ndistinct computed from
+ * the sample directly, but rather an estimate of the number of
+ * distinct values in the table, no?
+ */
+MCVList
+build_mv_mcvlist(int numrows, HeapTuple *rows, int2vector *attrs,
+ VacAttrStats **stats, int *numrows_filtered)
+{
+ int i, j;
+ int numattrs = attrs->dim1;
+ int ndistinct = 0;
+ int mcv_threshold = 0;
+ int count = 0;
+ int nitems = 0;
+
+ MCVList mcvlist = NULL;
+
+ /* Sort by multiple columns (using array of SortSupport) */
+ MultiSortSupport mss = multi_sort_init(numattrs);
+
+ /*
+ * Preallocate space for all the items as a single chunk, and point
+ * the items to the appropriate parts of the array.
+ */
+ SortItem *items = (SortItem*)palloc0(numrows * sizeof(SortItem));
+ Datum *values = (Datum*)palloc0(sizeof(Datum) * numrows * numattrs);
+ bool *isnull = (bool*)palloc0(sizeof(bool) * numrows * numattrs);
+
+ /* keep all the rows by default (as if there was no MCV list) */
+ *numrows_filtered = numrows;
+
+ for (i = 0; i < numrows; i++)
+ {
+ items[i].values = &values[i * numattrs];
+ items[i].isnull = &isnull[i * numattrs];
+ }
+
+ /* load the values/null flags from sample rows */
+ for (j = 0; j < numrows; j++)
+ for (i = 0; i < numattrs; i++)
+ items[j].values[i] = heap_getattr(rows[j], attrs->values[i],
+ stats[i]->tupDesc, &items[j].isnull[i]);
+
+ /* prepare the sort functions for all the attributes */
+ for (i = 0; i < numattrs; i++)
+ multi_sort_add_dimension(mss, i, i, stats);
+
+ /* do the sort, using the multi-sort */
+ qsort_arg((void *) items, numrows, sizeof(SortItem),
+ multi_sort_compare, mss);
+
+ /*
+ * Count the number of distinct groups - just walk through the
+ * sorted list and count the number of key changes. We use this to
+ * determine the threshold (125% of the average frequency).
+ */
+ ndistinct = 1;
+ for (i = 1; i < numrows; i++)
+ if (multi_sort_compare(&items[i], &items[i-1], mss) != 0)
+ ndistinct += 1;
+
+ /*
+ * Determine how many groups actually exceed the threshold, and then
+ * walk the array again and collect them into an array. We'll always
+ * require at least 4 rows per group.
+ *
+ * But if we can fit all the distinct values in the MCV list (i.e.
+ * if there are fewer distinct groups than MVSTAT_MCVLIST_MAX_ITEMS),
+ * we'll require only 2 rows per group.
+ *
+ * TODO For now the threshold is the same as in the single-column
+ * case (average + 25%), but maybe that's worth revisiting
+ * for the multivariate case.
+ *
+ * TODO We can do this only if we believe we got all the distinct
+ * values of the table.
+ *
+ * FIXME This should really reference mcv_max_items (from catalog)
+ * instead of the constant MVSTAT_MCVLIST_MAX_ITEMS.
+ */
+ mcv_threshold = 1.25 * numrows / ndistinct;
+ mcv_threshold = (mcv_threshold < 4) ? 4 : mcv_threshold;
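+
+ /*
+ * For illustration (hypothetical numbers): with 30000 sampled
+ * rows and 1000 distinct groups the average group size is 30
+ * rows, so the threshold works out to 1.25 * 30 = 37 rows.
+ */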
+
+ if (ndistinct <= MVSTAT_MCVLIST_MAX_ITEMS)
+ mcv_threshold = 2;
+
+ /*
+ * Walk through the sorted data again, and see how many groups
+ * reach the mcv_threshold (and become an item in the MCV list).
+ */
+ count = 1;
+ for (i = 1; i <= numrows; i++)
+ {
+ /* last row or new group, so check if we exceed mcv_threshold */
+ if ((i == numrows) || (multi_sort_compare(&items[i], &items[i-1], mss) != 0))
+ {
+ /* group hits the threshold, count the group as MCV item */
+ if (count >= mcv_threshold)
+ nitems += 1;
+
+ count = 1;
+ }
+ else /* within the same group, so increase the counter */
+ count += 1;
+ }
+
+ /* we know the number of MCV list items, so let's build the list */
+ if (nitems > 0)
+ {
+ /* allocate the MCV list structure, set parameters we know */
+ mcvlist = (MCVList)palloc0(sizeof(MCVListData));
+
+ mcvlist->magic = MVSTAT_MCV_MAGIC;
+ mcvlist->type = MVSTAT_MCV_TYPE_BASIC;
+ mcvlist->ndimensions = numattrs;
+ mcvlist->nitems = nitems;
+
+ /*
+ * Preallocate Datum/isnull arrays - not as a single chunk, as
+ * we'll pass this outside this method and thus it needs to be
+ * easy to pfree() the data (and we wouldn't know where the
+ * arrays start).
+ *
+ * TODO Maybe the reasoning that we can't allocate a single
+ * piece because we're passing it out is bogus? Who'd
+ * free a single item of the MCV list, anyway?
+ *
+ * TODO Maybe with a proper encoding (stuffing all the values
+ * into a list-level array), this will be untrue?
+ */
+ mcvlist->items = (MCVItem*)palloc0(sizeof(MCVItem)*nitems);
+
+ for (i = 0; i < nitems; i++)
+ {
+ mcvlist->items[i] = (MCVItem)palloc0(sizeof(MCVItemData));
+ mcvlist->items[i]->values = (Datum*)palloc0(sizeof(Datum)*numattrs);
+ mcvlist->items[i]->isnull = (bool*)palloc0(sizeof(bool)*numattrs);
+ }
+
+ /*
+ * Repeat the same loop as above, but this time copy the data
+ * into the MCV list (for items exceeding the threshold).
+ *
+ * TODO Maybe we could simply remember indexes of the last item
+ * in each group (from the previous loop)?
+ */
+ count = 1;
+ nitems = 0;
+ for (i = 1; i <= numrows; i++)
+ {
+ /* last row or a new group */
+ if ((i == numrows) || (multi_sort_compare(&items[i], &items[i-1], mss) != 0))
+ {
+ /* count the MCV item if exceeding the threshold (and copy into the array) */
+ if (count >= mcv_threshold)
+ {
+ /* just a pointer to the proper place in the list */
+ MCVItem item = mcvlist->items[nitems];
+
+ /* copy values from the _previous_ group (its last item) */
+ memcpy(item->values, items[(i-1)].values, sizeof(Datum) * numattrs);
+ memcpy(item->isnull, items[(i-1)].isnull, sizeof(bool) * numattrs);
+
+ /* and finally the group frequency */
+ item->frequency = (double)count / numrows;
+
+ /* next item */
+ nitems += 1;
+ }
+
+ count = 1;
+ }
+ else /* same group, just increase the counter */
+ count += 1;
+ }
+
+ /* make sure the loops are consistent */
+ Assert(nitems == mcvlist->nitems);
+
+ /*
+ * Remove the rows matching the MCV list (i.e. keep only rows
+ * that are not represented by the MCV list).
+ *
+ * FIXME This implementation is rather naive, effectively O(N^2).
+ * As the MCV list grows, the check will take longer and
+ * longer. And as the number of sampled rows increases (by
+ * increasing statistics target), it will take longer and
+ * longer. One option is to sort the MCV items first and
+ * then perform a binary search.
+ *
+ * A better option would be keeping the ID of the row in
+ * the sort item, and then just walk through the items and
+ * mark rows to remove (in a bitmap of the same size).
+ * There's no space for that in SortItem at this moment,
+ * but it's trivial to add a 'private' pointer, or just
+ * use another structure with an extra field (starting with
+ * SortItem, so that the comparators etc. still work).
+ *
+ * Another option is to use the sorted array of items
+ * (because that's how we sorted the source data), and
+ * simply do a bsearch() into it. If we find a matching
+ * item, the row belongs to the MCV list.
+ */
+ if (nitems == ndistinct) /* all rows are covered by MCV items */
+ *numrows_filtered = 0;
+ else /* (nitems < ndistinct) && (nitems > 0) */
+ {
+ int nfiltered = 0;
+ HeapTuple *rows_filtered = (HeapTuple*)palloc0(sizeof(HeapTuple) * numrows);
+
+ /* used for the searches */
+ SortItem item, mcvitem;
+
+ item.values = (Datum*)palloc0(numattrs * sizeof(Datum));
+ item.isnull = (bool*)palloc0(numattrs * sizeof(bool));
+
+ /*
+ * FIXME we don't need to allocate this, we can reference
+ * the MCV item directly ...
+ */
+ mcvitem.values = (Datum*)palloc0(numattrs * sizeof(Datum));
+ mcvitem.isnull = (bool*)palloc0(numattrs * sizeof(bool));
+
+ /* walk through the tuples, compare the values to MCV items */
+ for (i = 0; i < numrows; i++)
+ {
+ bool match = false;
+
+ /* collect the key values from the row */
+ for (j = 0; j < numattrs; j++)
+ item.values[j] = heap_getattr(rows[i], attrs->values[j],
+ stats[j]->tupDesc, &item.isnull[j]);
+
+ /* scan through the MCV list for matches */
+ for (j = 0; j < mcvlist->nitems; j++)
+ {
+ /*
+ * TODO Create a SortItem/MCVItem comparator so that
+ * we don't need to do memcpy() like crazy.
+ */
+ memcpy(mcvitem.values, mcvlist->items[j]->values,
+ numattrs * sizeof(Datum));
+ memcpy(mcvitem.isnull, mcvlist->items[j]->isnull,
+ numattrs * sizeof(bool));
+
+ if (multi_sort_compare(&item, &mcvitem, mss) == 0)
+ {
+ match = true;
+ break;
+ }
+ }
+
+ /* if no match in the MCV list, copy the row into the filtered ones */
+ if (! match)
+ memcpy(&rows_filtered[nfiltered++], &rows[i], sizeof(HeapTuple));
+ }
+
+ /* replace the rows and remember how many rows we kept */
+ memcpy(rows, rows_filtered, sizeof(HeapTuple) * nfiltered);
+ *numrows_filtered = nfiltered;
+
+ /* free all the data used here */
+ pfree(rows_filtered);
+ pfree(item.values);
+ pfree(item.isnull);
+ pfree(mcvitem.values);
+ pfree(mcvitem.isnull);
+ }
+ }
+
+ pfree(values);
+ pfree(items);
+ pfree(isnull);
+
+ return mcvlist;
+}
+
+
+/* fetch the MCV list (as a bytea) from the pg_mv_statistic catalog */
+MCVList
+load_mv_mcvlist(Oid mvoid)
+{
+ bool isnull = false;
+ Datum mcvlist;
+
+#ifdef USE_ASSERT_CHECKING
+ Form_pg_mv_statistic mvstat;
+#endif
+
+ /* Fetch the pg_mv_statistic tuple for the given mvoid. */
+ HeapTuple htup = SearchSysCache1(MVSTATOID, ObjectIdGetDatum(mvoid));
+
+ if (! HeapTupleIsValid(htup))
+ return NULL;
+
+#ifdef USE_ASSERT_CHECKING
+ mvstat = (Form_pg_mv_statistic) GETSTRUCT(htup);
+ Assert(mvstat->mcv_enabled && mvstat->mcv_built);
+#endif
+
+ mcvlist = SysCacheGetAttr(MVSTATOID, htup,
+ Anum_pg_mv_statistic_stamcv, &isnull);
+
+ Assert(!isnull);
+
+ ReleaseSysCache(htup);
+
+ return deserialize_mv_mcvlist(DatumGetByteaP(mcvlist));
+}
+
+/* print some basic info about the MCV list
+ *
+ * TODO Add info about what part of the table this covers.
+ */
+Datum
+pg_mv_stats_mcvlist_info(PG_FUNCTION_ARGS)
+{
+ bytea *data = PG_GETARG_BYTEA_P(0);
+ char *result;
+
+ MCVList mcvlist = deserialize_mv_mcvlist(data);
+
+ result = palloc0(128);
+ snprintf(result, 128, "nitems=%d", mcvlist->nitems);
+
+ pfree(mcvlist);
+
+ PG_RETURN_TEXT_P(cstring_to_text(result));
+}
+
+/* used to pass context into bsearch() */
+static SortSupport ssup_private = NULL;
+
+static int bsearch_comparator(const void * a, const void * b);
+
+/*
+ * Serialize MCV list into a bytea value. The basic algorithm is simple:
+ *
+ * (1) perform deduplication for each attribute (separately)
+ * (a) collect all (non-NULL) attribute values from all MCV items
+ * (b) sort the data (using 'lt' from VacAttrStats)
+ * (c) remove duplicate values from the array
+ *
+ * (2) serialize the arrays into a bytea value
+ *
+ * (3) process all MCV list items
+ * (a) replace values with indexes into the arrays
+ *
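+ * For illustration (hypothetical values): an MCV item (10, 'foo')
+ * may get serialized as indexes (0, 3), i.e. the 0th deduplicated
+ * value of the first dimension and the 3rd one of the second.
+ *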
+ * Each attribute has to be processed separately, because we're mixing
+ * different datatypes, and we don't know what equality means for them.
+ * We're also mixing pass-by-value and pass-by-ref types, and so on.
+ *
+ * We use 16-bit values for the indexes in step (3), which is enough
+ * as we don't allow more than 8k items in the MCV list
+ * (max_mcv_items). Most of the high bytes will be 0x00 anyway, so
+ * the varlena compression should handle the rest nicely.
+ *
+ * FIXME This probably leaks memory, or at least uses it inefficiently
+ * (many small palloc() calls instead of a large one).
+ *
+ * TODO Consider packing boolean flags (NULL) for each item into 'char'
+ * or a longer type (instead of using an array of bool items).
+ */
+bytea *
+serialize_mv_mcvlist(MCVList mcvlist, int2vector *attrs,
+ VacAttrStats **stats)
+{
+ int i, j;
+ int ndims = mcvlist->ndimensions;
+ int itemsize = ITEM_SIZE(ndims);
+
+ Size total_length = 0;
+
+ char *item = palloc0(itemsize);
+
+ /* serialized items (indexes into arrays, etc.) */
+ bytea *output;
+ char *data = NULL;
+
+ /* values per dimension (and number of non-NULL values) */
+ Datum **values = (Datum**)palloc0(sizeof(Datum*) * ndims);
+ int *counts = (int*)palloc0(sizeof(int) * ndims);
+
+ /* info about dimensions (for deserialize) */
+ DimensionInfo * info
+ = (DimensionInfo *)palloc0(sizeof(DimensionInfo)*ndims);
+
+ /* sort support data */
+ SortSupport ssup = (SortSupport)palloc0(sizeof(SortSupportData)*ndims);
+
+ /* collect and deduplicate values for each dimension */
+ for (i = 0; i < ndims; i++)
+ {
+ int count;
+ StdAnalyzeData *tmp = (StdAnalyzeData *)stats[i]->extra_data;
+
+ /* keep important info about the data type */
+ info[i].typlen = stats[i]->attrtype->typlen;
+ info[i].typbyval = stats[i]->attrtype->typbyval;
+
+ /* allocate space for all values, including NULLs (won't use them) */
+ values[i] = (Datum*)palloc0(sizeof(Datum) * mcvlist->nitems);
+
+ for (j = 0; j < mcvlist->nitems; j++)
+ {
+ if (! mcvlist->items[j]->isnull[i]) /* skip NULL values */
+ {
+ values[i][counts[i]] = mcvlist->items[j]->values[i];
+ counts[i] += 1;
+ }
+ }
+
+ /* there are just NULL values in this dimension */
+ if (counts[i] == 0)
+ continue;
+
+ /* sort and deduplicate */
+ ssup[i].ssup_cxt = CurrentMemoryContext;
+ ssup[i].ssup_collation = DEFAULT_COLLATION_OID;
+ ssup[i].ssup_nulls_first = false;
+
+ PrepareSortSupportFromOrderingOp(tmp->ltopr, &ssup[i]);
+
+ qsort_arg(values[i], counts[i], sizeof(Datum),
+ compare_scalars_simple, &ssup[i]);
+
+ /*
+ * Walk through the array and eliminate duplicate values, but
+ * keep the ordering (so that we can do bsearch later). We know
+ * there's at least 1 item, so we can skip the first element.
+ */
+ count = 1; /* number of deduplicated items */
+ for (j = 1; j < counts[i]; j++)
+ {
+ /* if it's different from the previous value, we need to keep it */
+ if (compare_datums_simple(values[i][j-1], values[i][j], &ssup[i]) != 0)
+ {
+ /* XXX: not needed if (count == j) */
+ values[i][count] = values[i][j];
+ count += 1;
+ }
+ }
+
+ /* do not exceed UINT16_MAX */
+ Assert(count <= UINT16_MAX);
+
+ /* keep info about the deduplicated count */
+ info[i].nvalues = count;
+
+ /* compute size of the serialized data */
+ if (info[i].typbyval || (info[i].typlen > 0))
+ /* passed by value, or by reference with fixed length */
+ info[i].nbytes = info[i].nvalues * info[i].typlen;
+ else if (info[i].typlen == -1)
+ /* varlena, so just use VARSIZE_ANY */
+ for (j = 0; j < info[i].nvalues; j++)
+ info[i].nbytes += VARSIZE_ANY(values[i][j]);
+ else if (info[i].typlen == -2)
+ /* cstring, so strlen + 1 byte for the \0 terminator */
+ for (j = 0; j < info[i].nvalues; j++)
+ info[i].nbytes += strlen(DatumGetPointer(values[i][j])) + 1;
+ else
+ elog(ERROR, "unknown data type typbyval=%d typlen=%d",
+ info[i].typbyval, info[i].typlen);
+ }
+
+ /*
+ * Now we finally know how much space we'll need for the serialized
+ * MCV list, as it contains these fields:
+ *
+ * - length (4B) for varlena
+ * - magic (4B)
+ * - type (4B)
+ * - ndimensions (4B)
+ * - nitems (4B)
+ * - info (ndim * sizeof(DimensionInfo)
+ * - arrays of values for each dimension
+ * - serialized items (nitems * itemsize)
+ *
+ * So the 'header' size is 20B + ndim * sizeof(DimensionInfo) and
+ * then we'll place the data.
+ */
+ total_length = (sizeof(int32) + offsetof(MCVListData, items)
+ + ndims * sizeof(DimensionInfo)
+ + mcvlist->nitems * itemsize);
+
+ for (i = 0; i < ndims; i++)
+ total_length += info[i].nbytes;
+
+ /* enforce arbitrary limit of 1MB */
+ if (total_length > 1024 * 1024)
+ elog(ERROR, "serialized MCV exceeds 1MB (%ld)", total_length);
+
+ /* allocate space for the serialized MCV list, set header fields */
+ output = (bytea*)palloc0(total_length);
+ SET_VARSIZE(output, total_length);
+
+ /* we'll use 'data' to keep track of the place to write to */
+ data = VARDATA(output);
+
+ memcpy(data, mcvlist, offsetof(MCVListData, items));
+ data += offsetof(MCVListData, items);
+
+ memcpy(data, info, sizeof(DimensionInfo) * ndims);
+ data += sizeof(DimensionInfo) * ndims;
+
+ /* value array for each dimension */
+ for (i = 0; i < ndims; i++)
+ {
+#ifdef USE_ASSERT_CHECKING
+ char *tmp = data;
+#endif
+ for (j = 0; j < info[i].nvalues; j++)
+ {
+ if (info[i].typbyval)
+ {
+ /* passed by value / Datum */
+ memcpy(data, &values[i][j], info[i].typlen);
+ data += info[i].typlen;
+ }
+ else if (info[i].typlen > 0)
+ {
+ /* passed by reference, but fixed length (name, tid, ...) */
+ memcpy(data, &values[i][j], info[i].typlen);
+ data += info[i].typlen;
+ }
+ else if (info[i].typlen == -1)
+ {
+ /* varlena */
+ memcpy(data, DatumGetPointer(values[i][j]),
+ VARSIZE_ANY(values[i][j]));
+ data += VARSIZE_ANY(values[i][j]);
+ }
+ else if (info[i].typlen == -2)
+ {
+ /* cstring (don't forget the \0 terminator!) */
+ memcpy(data, DatumGetPointer(values[i][j]),
+ strlen(DatumGetPointer(values[i][j])) + 1);
+ data += strlen(DatumGetPointer(values[i][j])) + 1;
+ }
+ }
+ Assert((data - tmp) == info[i].nbytes);
+ }
+
+ /* and finally, the MCV items */
+ for (i = 0; i < mcvlist->nitems; i++)
+ {
+ /* don't write beyond the allocated space */
+ Assert(data <= (char*)output + total_length - itemsize);
+
+ /* reset the values for each item */
+ memset(item, 0, itemsize);
+
+ for (j = 0; j < ndims; j++)
+ {
+ /* do the lookup only for non-NULL values */
+ if (! mcvlist->items[i]->isnull[j])
+ {
+ Datum * v = NULL;
+ ssup_private = &ssup[j];
+
+ v = (Datum*)bsearch(&mcvlist->items[i]->values[j],
+ values[j], info[j].nvalues, sizeof(Datum),
+ bsearch_comparator);
+
+ if (v == NULL)
+ elog(ERROR, "value for dim %d not found in array", j);
+
+ /* compute index within the array */
+ ITEM_INDEXES(item)[j] = (v - values[j]);
+
+ /* check the index is within expected bounds */
+ Assert(ITEM_INDEXES(item)[j] >= 0);
+ Assert(ITEM_INDEXES(item)[j] < info[j].nvalues);
+ }
+ }
+
+ /* copy NULL and frequency flags into the item */
+ memcpy(ITEM_NULLS(item, ndims),
+ mcvlist->items[i]->isnull, sizeof(bool) * ndims);
+ memcpy(ITEM_FREQUENCY(item, ndims),
+ &mcvlist->items[i]->frequency, sizeof(double));
+
+ /* copy the item into the array */
+ memcpy(data, item, itemsize);
+
+ data += itemsize;
+ }
+
+ /* at this point we expect to match the total_length exactly */
+ Assert((data - (char*)output) == total_length);
+
+ return output;
+}
+
+/* inverse to serialize_mv_mcvlist() - see the comment there */
+MCVList
+deserialize_mv_mcvlist(bytea *data)
+{
+ int i, j;
+ Size expected_size;
+ MCVList mcvlist;
+ char *tmp;
+
+ int ndims, nitems, itemsize;
+ DimensionInfo *info = NULL;
+
+ uint16 *indexes = NULL;
+ Datum **values = NULL;
+
+ /* local allocation buffer (used only for deserialization) */
+ int bufflen;
+ char *buff;
+ char *ptr;
+
+ /* buffer used for the result */
+ int rbufflen;
+ char *rbuff;
+ char *rptr;
+
+ if (data == NULL)
+ return NULL;
+
+ if (VARSIZE_ANY_EXHDR(data) < offsetof(MCVListData,items))
+ elog(ERROR, "invalid MCV Size %ld (expected at least %ld)",
+ VARSIZE_ANY_EXHDR(data), offsetof(MCVListData,items));
+
+ /* read the MCV list header */
+ mcvlist = (MCVList)palloc0(sizeof(MCVListData));
+
+ /* initialize pointer to the data part (skip the varlena header) */
+ tmp = VARDATA(data);
+
+ /* get the header and perform basic sanity checks */
+ memcpy(mcvlist, tmp, offsetof(MCVListData,items));
+ tmp += offsetof(MCVListData,items);
+
+ if (mcvlist->magic != MVSTAT_MCV_MAGIC)
+ elog(ERROR, "invalid MCV magic %d (expected %dd)",
+ mcvlist->magic, MVSTAT_MCV_MAGIC);
+
+ if (mcvlist->type != MVSTAT_MCV_TYPE_BASIC)
+ elog(ERROR, "invalid MCV type %d (expected %dd)",
+ mcvlist->type, MVSTAT_MCV_TYPE_BASIC);
+
+ nitems = mcvlist->nitems;
+ ndims = mcvlist->ndimensions;
+ itemsize = ITEM_SIZE(ndims);
+
+ Assert(nitems > 0);
+ Assert((ndims >= 2) && (ndims <= MVSTATS_MAX_DIMENSIONS));
+
+ /*
+ * What size do we expect with those parameters? It's incomplete,
+ * as we have yet to add the array sizes (from the DimensionInfo
+ * records).
+ */
+ expected_size = offsetof(MCVListData,items) +
+ ndims * sizeof(DimensionInfo) +
+ (nitems * itemsize);
+
+ /* check that we have at least the DimensionInfo records */
+ if (VARSIZE_ANY_EXHDR(data) < expected_size)
+ elog(ERROR, "invalid MCV Size %ld (expected %ld)",
+ VARSIZE_ANY_EXHDR(data), expected_size);
+
+ info = (DimensionInfo*)(tmp);
+ tmp += ndims * sizeof(DimensionInfo);
+
+ /* account for the value arrays */
+ for (i = 0; i < ndims; i++)
+ expected_size += info[i].nbytes;
+
+ if (VARSIZE_ANY_EXHDR(data) != expected_size)
+ elog(ERROR, "invalid MCV Size %ld (expected %ld)",
+ VARSIZE_ANY_EXHDR(data), expected_size);
+
+ /* looks OK - not corrupted or something */
+
+ /*
+ * We'll allocate one large chunk of memory for the intermediate
+ * data, needed only for deserializing the MCV list, and we'll use
+ * a local dense allocation to minimize the palloc overhead.
+ *
+ * Let's see how much space we'll actually need, and also include
+ * space for the array with pointers.
+ */
+ bufflen = sizeof(Datum*) * ndims; /* space for pointers */
+
+ for (i = 0; i < ndims; i++)
+ /* for full-size byval types, we reuse the serialized value */
+ if (! (info[i].typbyval && info[i].typlen == sizeof(Datum)))
+ bufflen += (sizeof(Datum) * info[i].nvalues);
+
+ buff = palloc(bufflen);
+ ptr = buff;
+
+ values = (Datum**)buff;
+ ptr += (sizeof(Datum*) * ndims);
+
+ /*
+ * FIXME This uses pointers to the original data array (the types
+ * not passed by value), so when someone frees the memory,
+ * e.g. by doing something like this:
+ *
+ * bytea * data = ... fetch the data from catalog ...
+ * MCVList mcvlist = deserialize_mcv_list(data);
+ * pfree(data);
+ *
+ * then 'mcvlist' references the freed memory. This needs to
+ * copy the pieces.
+ */
+ for (i = 0; i < ndims; i++)
+ {
+ if (info[i].typbyval)
+ {
+ /* passed by value / Datum - simply reuse the array */
+ if (info[i].typlen == sizeof(Datum))
+ {
+ values[i] = (Datum*)tmp;
+ tmp += info[i].nbytes;
+ }
+ else
+ {
+ values[i] = (Datum*)ptr;
+ ptr += (sizeof(Datum) * info[i].nvalues);
+
+ for (j = 0; j < info[i].nvalues; j++)
+ {
+ /* copy the value into the array */
+ memcpy(&values[i][j], tmp, info[i].typlen);
+ tmp += info[i].typlen;
+ }
+ }
+ }
+ else
+ {
+ /* all the varlena data need a chunk from the buffer */
+ values[i] = (Datum*)ptr;
+ ptr += (sizeof(Datum) * info[i].nvalues);
+
+ /* passed by reference, but fixed length (name, tid, ...) */
+ if (info[i].typlen > 0)
+ {
+ for (j = 0; j < info[i].nvalues; j++)
+ {
+ /* just point into the array */
+ values[i][j] = PointerGetDatum(tmp);
+ tmp += info[i].typlen;
+ }
+ }
+ else if (info[i].typlen == -1)
+ {
+ /* varlena */
+ for (j = 0; j < info[i].nvalues; j++)
+ {
+ /* just point into the array */
+ values[i][j] = PointerGetDatum(tmp);
+ tmp += VARSIZE_ANY(tmp);
+ }
+ }
+ else if (info[i].typlen == -2)
+ {
+ /* cstring */
+ for (j = 0; j < info[i].nvalues; j++)
+ {
+ /* just point into the array */
+ values[i][j] = PointerGetDatum(tmp);
+ tmp += (strlen(tmp) + 1); /* don't forget the \0 */
+ }
+ }
+ }
+ }
+
+ /* we should exhaust the buffer exactly */
+ Assert((ptr - buff) == bufflen);
+
+ /* allocate space for the MCV items in a single piece */
+ rbufflen = (sizeof(MCVItem) + sizeof(MCVItemData) +
+ sizeof(Datum)*ndims + sizeof(bool)*ndims) * nitems;
+
+ rbuff = palloc(rbufflen);
+ rptr = rbuff;
+
+ mcvlist->items = (MCVItem*)rbuff;
+ rptr += (sizeof(MCVItem) * nitems);
+
+ for (i = 0; i < nitems; i++)
+ {
+ MCVItem item = (MCVItem)rptr;
+ rptr += (sizeof(MCVItemData));
+
+ item->values = (Datum*)rptr;
+ rptr += (sizeof(Datum)*ndims);
+
+ item->isnull = (bool*)rptr;
+ rptr += (sizeof(bool) *ndims);
+
+ /* just point to the right place */
+ indexes = ITEM_INDEXES(tmp);
+
+ memcpy(item->isnull, ITEM_NULLS(tmp, ndims), sizeof(bool) * ndims);
+ memcpy(&item->frequency, ITEM_FREQUENCY(tmp, ndims), sizeof(double));
+
+#ifdef USE_ASSERT_CHECKING
+ for (j = 0; j < ndims; j++)
+ Assert(indexes[j] <= UINT16_MAX);
+#endif
+
+ /* translate the values */
+ for (j = 0; j < ndims; j++)
+ if (! item->isnull[j])
+ item->values[j] = values[j][indexes[j]];
+
+ mcvlist->items[i] = item;
+
+ tmp += ITEM_SIZE(ndims);
+
+ Assert(tmp <= (char*)data + VARSIZE_ANY(data));
+ }
+
+ /* check that we processed all the data */
+ Assert(tmp == (char*)data + VARSIZE_ANY(data));
+
+ /* release the temporary buffer */
+ pfree(buff);
+
+ return mcvlist;
+}
+
+/*
+ * We need to pass the SortSupport to the comparator, but bsearch()
+ * has no 'context' parameter, so we use a global variable (ugly).
+ */
+static int
+bsearch_comparator(const void * a, const void * b)
+{
+ Assert(ssup_private != NULL);
+ return compare_scalars_simple(a, b, (void*)ssup_private);
+}
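+
+/*
+ * A usage sketch: callers are expected to set ssup_private before
+ * searching and reset it afterwards, e.g.:
+ *
+ *     ssup_private = ssup;
+ *     match = bsearch(&value, values, nvalues, sizeof(Datum),
+ *                     bsearch_comparator);
+ *     ssup_private = NULL;
+ */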
+/*
+ * SRF with details about the items of an MCV list:
+ *
+ * - item ID (0...nitems)
+ * - values (string array)
+ * - nulls only (boolean array)
+ * - frequency (double precision)
+ *
+ * The input is the OID of the statistics, and no rows are returned
+ * if the statistics contains no MCV list.
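+ *
+ * Example (a sketch, picking an arbitrary statistics OID):
+ *
+ *     SELECT * FROM pg_mv_mcv_items(
+ *         (SELECT oid FROM pg_mv_statistic LIMIT 1));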
+ */
+PG_FUNCTION_INFO_V1(pg_mv_mcv_items);
+
+Datum
+pg_mv_mcv_items(PG_FUNCTION_ARGS)
+{
+ FuncCallContext *funcctx;
+ int call_cntr;
+ int max_calls;
+ TupleDesc tupdesc;
+ AttInMetadata *attinmeta;
+
+ /* stuff done only on the first call of the function */
+ if (SRF_IS_FIRSTCALL())
+ {
+ MemoryContext oldcontext;
+ MCVList mcvlist;
+
+ /* create a function context for cross-call persistence */
+ funcctx = SRF_FIRSTCALL_INIT();
+
+ /* switch to memory context appropriate for multiple function calls */
+ oldcontext = MemoryContextSwitchTo(funcctx->multi_call_memory_ctx);
+
+ mcvlist = load_mv_mcvlist(PG_GETARG_OID(0));
+
+ funcctx->user_fctx = mcvlist;
+
+ /* total number of tuples to be returned */
+ funcctx->max_calls = 0;
+ if (funcctx->user_fctx != NULL)
+ funcctx->max_calls = mcvlist->nitems;
+
+ /* Build a tuple descriptor for our result type */
+ if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("function returning record called in context "
+ "that cannot accept type record")));
+
+ /*
+ * generate attribute metadata needed later to produce tuples
+ * from raw C strings
+ */
+ attinmeta = TupleDescGetAttInMetadata(tupdesc);
+ funcctx->attinmeta = attinmeta;
+
+ MemoryContextSwitchTo(oldcontext);
+ }
+
+ /* stuff done on every call of the function */
+ funcctx = SRF_PERCALL_SETUP();
+
+ call_cntr = funcctx->call_cntr;
+ max_calls = funcctx->max_calls;
+ attinmeta = funcctx->attinmeta;
+
+ if (call_cntr < max_calls) /* do when there is more left to send */
+ {
+ char **values;
+ HeapTuple tuple;
+ Datum result;
+ int2vector *stakeys;
+ Oid relid;
+
+ char *buff = palloc0(1024);
+ char *format;
+
+ int i;
+
+ Oid *outfuncs;
+ FmgrInfo *fmgrinfo;
+
+ MCVList mcvlist;
+ MCVItem item;
+
+ mcvlist = (MCVList)funcctx->user_fctx;
+
+ Assert(call_cntr < mcvlist->nitems);
+
+ item = mcvlist->items[call_cntr];
+
+ stakeys = find_mv_attnums(PG_GETARG_OID(0), &relid);
+
+ /*
+ * Prepare a values array for building the returned tuple.
+ * This should be an array of C strings which will
+ * be processed later by the type input functions.
+ */
+ values = (char **) palloc(4 * sizeof(char *));
+
+ values[0] = (char *) palloc(64 * sizeof(char));
+
+ /* arrays */
+ values[1] = (char *) palloc0(1024 * sizeof(char));
+ values[2] = (char *) palloc0(1024 * sizeof(char));
+
+ /* frequency */
+ values[3] = (char *) palloc(64 * sizeof(char));
+
+ outfuncs = (Oid*)palloc0(sizeof(Oid) * mcvlist->ndimensions);
+ fmgrinfo = (FmgrInfo*)palloc0(sizeof(FmgrInfo) * mcvlist->ndimensions);
+
+ for (i = 0; i < mcvlist->ndimensions; i++)
+ {
+ bool isvarlena;
+
+ getTypeOutputInfo(get_atttype(relid, stakeys->values[i]),
+ &outfuncs[i], &isvarlena);
+
+ fmgr_info(outfuncs[i], &fmgrinfo[i]);
+ }
+
+ snprintf(values[0], 64, "%d", call_cntr); /* item ID */
+
+ for (i = 0; i < mcvlist->ndimensions; i++)
+ {
+ Datum val, valout;
+
+ format = "%s, %s";
+ if (i == 0)
+ format = "{%s%s";
+ else if (i == mcvlist->ndimensions-1)
+ format = "%s, %s}";
+
+ val = item->values[i];
+ valout = FunctionCall1(&fmgrinfo[i], val);
+
+ snprintf(buff, 1024, format, values[1], DatumGetPointer(valout));
+ strncpy(values[1], buff, 1023);
+ buff[0] = '\0';
+
+ snprintf(buff, 1024, format, values[2], item->isnull[i] ? "t" : "f");
+ strncpy(values[2], buff, 1023);
+ buff[0] = '\0';
+ }
+
+ snprintf(values[3], 64, "%f", item->frequency); /* frequency */
+
+ /* build a tuple */
+ tuple = BuildTupleFromCStrings(attinmeta, values);
+
+ /* make the tuple into a datum */
+ result = HeapTupleGetDatum(tuple);
+
+ /* clean up (this is not really necessary) */
+ pfree(values[0]);
+ pfree(values[1]);
+ pfree(values[2]);
+ pfree(values[3]);
+
+ pfree(values);
+
+ SRF_RETURN_NEXT(funcctx, result);
+ }
+ else /* do when there is no more left */
+ {
+ SRF_RETURN_DONE(funcctx);
+ }
+}
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 0b3518c..448cf35 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -2101,8 +2101,8 @@ describeOneTableDetails(const char *schemaname,
{
printfPQExpBuffer(&buf,
"SELECT oid, stakeys,\n"
- " deps_enabled,\n"
- " deps_built,\n"
+ " deps_enabled, mcv_enabled,\n"
+ " deps_built, mcv_built,\n"
" mcv_max_items, hist_max_buckets,\n"
" (SELECT string_agg(attname::text,', ')\n"
" FROM ((SELECT unnest(stakeys) AS attnum) s\n"
@@ -2121,14 +2121,28 @@ describeOneTableDetails(const char *schemaname,
printTableAddFooter(&cont, _("Statistics:"));
for (i = 0; i < tuples; i++)
{
+ bool first = true;
+
printfPQExpBuffer(&buf, " ");
/* options */
if (!strcmp(PQgetvalue(result, i, 2), "t"))
- appendPQExpBuffer(&buf, "(dependencies)");
+ {
+ appendPQExpBuffer(&buf, "(dependencies");
+ first = false;
+ }
+
+ if (!strcmp(PQgetvalue(result, i, 3), "t"))
+ {
+ if (! first)
+ appendPQExpBuffer(&buf, ", mcv");
+ else
+ appendPQExpBuffer(&buf, "(mcv");
+ first = false;
+ }
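+
+ /* e.g. the footer now reads: "(dependencies, mcv) ON (a, b, c)" */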
- appendPQExpBuffer(&buf, " ON (%s)",
- PQgetvalue(result, i, 6));
+ appendPQExpBuffer(&buf, ") ON (%s)",
+ PQgetvalue(result, i, 8));
printTableAddFooter(&cont, buf.data);
}
diff --git a/src/include/catalog/pg_mv_statistic.h b/src/include/catalog/pg_mv_statistic.h
index 81ec23b..c6e7d74 100644
--- a/src/include/catalog/pg_mv_statistic.h
+++ b/src/include/catalog/pg_mv_statistic.h
@@ -35,15 +35,21 @@ CATALOG(pg_mv_statistic,3381)
/* statistics requested to build */
bool deps_enabled; /* analyze dependencies? */
+ bool mcv_enabled; /* build MCV list? */
+
+ /* MCV size */
+ int32 mcv_max_items; /* max MCV items */
/* statistics that are available (if requested) */
bool deps_built; /* dependencies were built */
+ bool mcv_built; /* MCV list was built */
/* variable-length fields start here, but we allow direct access to stakeys */
int2vector stakeys; /* array of column keys */
#ifdef CATALOG_VARLEN
bytea stadeps; /* dependencies (serialized) */
+ bytea stamcv; /* MCV list (serialized) */
#endif
} FormData_pg_mv_statistic;
@@ -59,11 +65,15 @@ typedef FormData_pg_mv_statistic *Form_pg_mv_statistic;
* compiler constants for pg_attrdef
* ----------------
*/
-#define Natts_pg_mv_statistic 5
+#define Natts_pg_mv_statistic 9
#define Anum_pg_mv_statistic_starelid 1
#define Anum_pg_mv_statistic_deps_enabled 2
-#define Anum_pg_mv_statistic_deps_built 3
-#define Anum_pg_mv_statistic_stakeys 4
-#define Anum_pg_mv_statistic_stadeps 5
+#define Anum_pg_mv_statistic_mcv_enabled 3
+#define Anum_pg_mv_statistic_mcv_max_items 4
+#define Anum_pg_mv_statistic_deps_built 5
+#define Anum_pg_mv_statistic_mcv_built 6
+#define Anum_pg_mv_statistic_stakeys 7
+#define Anum_pg_mv_statistic_stadeps 8
+#define Anum_pg_mv_statistic_stamcv 9
#endif /* PG_MV_STATISTIC_H */
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
index 2178f6c..0d12dd3 100644
--- a/src/include/catalog/pg_proc.h
+++ b/src/include/catalog/pg_proc.h
@@ -2728,6 +2728,10 @@ DATA(insert OID = 3377 ( pg_mv_stats_dependencies_info PGNSP PGUID 12 1 0 0
DESCR("multivariate stats: functional dependencies info");
DATA(insert OID = 3378 ( pg_mv_stats_dependencies_show PGNSP PGUID 12 1 0 0 0 f f f f t f i 1 0 25 "17" _null_ _null_ _null_ _null_ _null_ pg_mv_stats_dependencies_show _null_ _null_ _null_ ));
DESCR("multivariate stats: functional dependencies show");
+DATA(insert OID = 3376 ( pg_mv_stats_mcvlist_info PGNSP PGUID 12 1 0 0 0 f f f f t f i 1 0 25 "17" _null_ _null_ _null_ _null_ _null_ pg_mv_stats_mcvlist_info _null_ _null_ _null_ ));
+DESCR("multi-variate statistics: MCV list info");
+DATA(insert OID = 3373 ( pg_mv_mcv_items PGNSP PGUID 12 1 1000 0 0 f f f f t t i 1 0 2249 "26" "{26,23,1009,1000,701}" "{i,o,o,o,o}" "{oid,index,values,nulls,frequency}" _null_ _null_ pg_mv_mcv_items _null_ _null_ _null_ ));
+DESCR("details about MCV list items");
DATA(insert OID = 1928 ( pg_stat_get_numscans PGNSP PGUID 12 1 0 0 0 f f f f t f s 1 0 20 "26" _null_ _null_ _null_ _null_ _null_ pg_stat_get_numscans _null_ _null_ _null_ ));
DESCR("statistics: number of scans done for table/index");
diff --git a/src/include/nodes/relation.h b/src/include/nodes/relation.h
index f6c4932..6fab94a 100644
--- a/src/include/nodes/relation.h
+++ b/src/include/nodes/relation.h
@@ -564,9 +564,11 @@ typedef struct MVStatisticInfo
/* enabled statistics */
bool deps_enabled; /* functional dependencies enabled */
+ bool mcv_enabled; /* MCV list enabled */
/* built/available statistics */
bool deps_built; /* functional dependencies built */
+ bool mcv_built; /* MCV list built */
/* columns in the statistics (attnums) */
int2vector *stakeys; /* attnums of the columns covered */
diff --git a/src/include/utils/mvstats.h b/src/include/utils/mvstats.h
index 02a7dda..b028192 100644
--- a/src/include/utils/mvstats.h
+++ b/src/include/utils/mvstats.h
@@ -50,30 +50,89 @@ typedef MVDependenciesData* MVDependencies;
#define MVSTAT_DEPS_TYPE_BASIC 1 /* basic dependencies type */
/*
+ * Multivariate MCV (most-common value) lists
+ *
+ * A straightforward extension of MCV items - i.e. a list (array) of
+ * combinations of attribute values, together with a frequency and
+ * null flags.
+ */
+typedef struct MCVItemData {
+ double frequency; /* frequency of this combination */
+ bool *isnull; /* lags of NULL values (up to 32 columns) */
+ Datum *values; /* variable-length (ndimensions) */
+} MCVItemData;
+
+typedef MCVItemData *MCVItem;
+
+/* multivariate MCV list - essentially an array of MCV items */
+typedef struct MCVListData {
+ uint32 magic; /* magic constant marker */
+ uint32 type; /* type of MCV list (BASIC) */
+ uint32 ndimensions; /* number of dimensions */
+ uint32 nitems; /* number of MCV items in the array */
+ MCVItem *items; /* array of MCV items */
+} MCVListData;
+
+typedef MCVListData *MCVList;
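+
+/*
+ * A minimal sketch of walking a deserialized MCV list (using only
+ * the fields defined above):
+ *
+ *     MCVList mcvlist = deserialize_mv_mcvlist(data);
+ *
+ *     for (i = 0; i < mcvlist->nitems; i++)
+ *     {
+ *         MCVItem item = mcvlist->items[i];
+ *
+ *         ... inspect item->values[j] / item->isnull[j] for each
+ *         dimension (j < ndimensions), and item->frequency ...
+ *     }
+ */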
+
+/* used to flag stats serialized to bytea */
+#define MVSTAT_MCV_MAGIC 0xE1A651C2 /* marks serialized bytea */
+#define MVSTAT_MCV_TYPE_BASIC 1 /* basic MCV list type */
+
+/*
+ * Limits used for the max_mcv_items option, i.e. we're always
+ * guaranteed to have space for at least MVSTAT_MCVLIST_MIN_ITEMS
+ * items, and we cannot have more than MVSTAT_MCVLIST_MAX_ITEMS items.
+ *
+ * This is just a boundary for the 'max' threshold - the actual list
+ * may of course contain fewer items than MVSTAT_MCVLIST_MIN_ITEMS.
+ */
+#define MVSTAT_MCVLIST_MIN_ITEMS 128 /* min items in MCV list */
+#define MVSTAT_MCVLIST_MAX_ITEMS 8192 /* max items in MCV list */
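+
+/*
+ * For example (a sketch using the ALTER TABLE syntax from this patch):
+ *
+ *     ALTER TABLE t ADD STATISTICS (mcv, max_mcv_items 1000) ON (a, b);
+ *
+ * is accepted, while values outside [128, 8192] raise an error.
+ */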
+
+/*
* TODO Maybe fetching the histogram/MCV list separately is inefficient?
* Consider adding a single `fetch_stats` method, fetching all
* stats specified using flags (or something like that).
*/
MVDependencies load_mv_dependencies(Oid mvoid);
+MCVList load_mv_mcvlist(Oid mvoid);
bytea * serialize_mv_dependencies(MVDependencies dependencies);
+bytea * serialize_mv_mcvlist(MCVList mcvlist, int2vector *attrs,
+ VacAttrStats **stats);
/* deserialization of stats (serialization is private to analyze) */
MVDependencies deserialize_mv_dependencies(bytea * data);
+MCVList deserialize_mv_mcvlist(bytea * data);
+
+/*
+ * Returns index of the attribute number within the vector (i.e. a
+ * dimension within the stats).
+ */
+int mv_get_index(AttrNumber varattno, int2vector * stakeys);
+
+int2vector* find_mv_attnums(Oid mvoid, Oid *relid);
/* FIXME this probably belongs somewhere else (not to operations stats) */
extern Datum pg_mv_stats_dependencies_info(PG_FUNCTION_ARGS);
extern Datum pg_mv_stats_dependencies_show(PG_FUNCTION_ARGS);
+extern Datum pg_mv_stats_mcvlist_info(PG_FUNCTION_ARGS);
+extern Datum pg_mv_mcv_items(PG_FUNCTION_ARGS);
MVDependencies
-build_mv_dependencies(int numrows, HeapTuple *rows,
- int2vector *attrs,
- VacAttrStats **stats);
+build_mv_dependencies(int numrows, HeapTuple *rows, int2vector *attrs,
+ VacAttrStats **stats);
+
+MCVList
+build_mv_mcvlist(int numrows, HeapTuple *rows, int2vector *attrs,
+ VacAttrStats **stats, int *numrows_filtered);
void build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
- int natts, VacAttrStats **vacattrstats);
+ int natts, VacAttrStats **vacattrstats);
-void update_mv_stats(Oid relid, MVDependencies dependencies, int2vector *attrs);
+void update_mv_stats(Oid relid, MVDependencies dependencies, MCVList mcvlist,
+ int2vector *attrs, VacAttrStats **stats);
#endif
diff --git a/src/test/regress/expected/mv_mcv.out b/src/test/regress/expected/mv_mcv.out
new file mode 100644
index 0000000..85e8499
--- /dev/null
+++ b/src/test/regress/expected/mv_mcv.out
@@ -0,0 +1,207 @@
+-- data type passed by value
+CREATE TABLE mcv_list (
+ a INT,
+ b INT,
+ c INT
+);
+-- unknown column
+ALTER TABLE mcv_list ADD STATISTICS (mcv) ON (unknown_column);
+ERROR: column "unknown_column" referenced in statistics does not exist
+-- single column
+ALTER TABLE mcv_list ADD STATISTICS (mcv) ON (a);
+ERROR: multivariate stats require 2 or more columns
+-- single column, duplicated
+ALTER TABLE mcv_list ADD STATISTICS (mcv) ON (a, a);
+ERROR: duplicate column name in statistics definition
+-- two columns, one duplicated
+ALTER TABLE mcv_list ADD STATISTICS (mcv) ON (a, a, b);
+ERROR: duplicate column name in statistics definition
+-- unknown option
+ALTER TABLE mcv_list ADD STATISTICS (unknown_option) ON (a, b, c);
+ERROR: unrecognized STATISTICS option "unknown_option"
+-- missing MCV statistics
+ALTER TABLE mcv_list ADD STATISTICS (dependencies, max_mcv_items 200) ON (a, b, c);
+ERROR: option 'mcv' is required by other options(s)
+-- invalid max_mcv_items value / too low
+ALTER TABLE mcv_list ADD STATISTICS (mcv, max_mcv_items 10) ON (a, b, c);
+ERROR: max number of MCV items must be at least 128
+-- invalid max_mcv_items value / too high
+ALTER TABLE mcv_list ADD STATISTICS (mcv, max_mcv_items 10000) ON (a, b, c);
+ERROR: max number of MCV items is 8192
+-- correct command
+ALTER TABLE mcv_list ADD STATISTICS (mcv) ON (a, b, c);
+-- random data
+INSERT INTO mcv_list
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+ANALYZE mcv_list;
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+ mcv_enabled | mcv_built | pg_mv_stats_mcvlist_info
+-------------+-----------+--------------------------
+ t | f |
+(1 row)
+
+TRUNCATE mcv_list;
+-- a => b, a => c, b => c
+INSERT INTO mcv_list
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE mcv_list;
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+ mcv_enabled | mcv_built | pg_mv_stats_mcvlist_info
+-------------+-----------+--------------------------
+ t | t | nitems=1000
+(1 row)
+
+TRUNCATE mcv_list;
+-- a => b, a => c
+INSERT INTO mcv_list
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE mcv_list;
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+ mcv_enabled | mcv_built | pg_mv_stats_mcvlist_info
+-------------+-----------+--------------------------
+ t | t | nitems=1000
+(1 row)
+
+TRUNCATE mcv_list;
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO mcv_list
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX mcv_idx ON mcv_list (a, b);
+ANALYZE mcv_list;
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+ mcv_enabled | mcv_built | pg_mv_stats_mcvlist_info
+-------------+-----------+--------------------------
+ t | t | nitems=100
+(1 row)
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mcv_list WHERE a = 10 AND b = 5;
+ QUERY PLAN
+--------------------------------------------
+ Bitmap Heap Scan on mcv_list
+ Recheck Cond: ((a = 10) AND (b = 5))
+ -> Bitmap Index Scan on mcv_idx
+ Index Cond: ((a = 10) AND (b = 5))
+(4 rows)
+
+DROP TABLE mcv_list;
+-- varlena type (text)
+CREATE TABLE mcv_list (
+ a TEXT,
+ b TEXT,
+ c TEXT
+);
+ALTER TABLE mcv_list ADD STATISTICS (mcv) ON (a, b, c);
+-- random data
+INSERT INTO mcv_list
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+ANALYZE mcv_list;
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+ mcv_enabled | mcv_built | pg_mv_stats_mcvlist_info
+-------------+-----------+--------------------------
+ t | f |
+(1 row)
+
+TRUNCATE mcv_list;
+-- a => b, a => c, b => c
+INSERT INTO mcv_list
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE mcv_list;
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+ mcv_enabled | mcv_built | pg_mv_stats_mcvlist_info
+-------------+-----------+--------------------------
+ t | t | nitems=1000
+(1 row)
+
+TRUNCATE mcv_list;
+-- a => b, a => c
+INSERT INTO mcv_list
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE mcv_list;
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+ mcv_enabled | mcv_built | pg_mv_stats_mcvlist_info
+-------------+-----------+--------------------------
+ t | t | nitems=1000
+(1 row)
+
+TRUNCATE mcv_list;
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO mcv_list
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX mcv_idx ON mcv_list (a, b);
+ANALYZE mcv_list;
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+ mcv_enabled | mcv_built | pg_mv_stats_mcvlist_info
+-------------+-----------+--------------------------
+ t | t | nitems=100
+(1 row)
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mcv_list WHERE a = '10' AND b = '5';
+ QUERY PLAN
+------------------------------------------------------------
+ Bitmap Heap Scan on mcv_list
+ Recheck Cond: ((a = '10'::text) AND (b = '5'::text))
+ -> Bitmap Index Scan on mcv_idx
+ Index Cond: ((a = '10'::text) AND (b = '5'::text))
+(4 rows)
+
+TRUNCATE mcv_list;
+-- check explain (expect bitmap index scan, not plain index scan) with NULLs
+INSERT INTO mcv_list
+ SELECT
+ (CASE WHEN i/10000 = 0 THEN NULL ELSE i/10000 END),
+ (CASE WHEN i/20000 = 0 THEN NULL ELSE i/20000 END),
+ (CASE WHEN i/40000 = 0 THEN NULL ELSE i/40000 END)
+ FROM generate_series(1,1000000) s(i);
+ANALYZE mcv_list;
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+ mcv_enabled | mcv_built | pg_mv_stats_mcvlist_info
+-------------+-----------+--------------------------
+ t | t | nitems=100
+(1 row)
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mcv_list WHERE a IS NULL AND b IS NULL;
+ QUERY PLAN
+---------------------------------------------------
+ Bitmap Heap Scan on mcv_list
+ Recheck Cond: ((a IS NULL) AND (b IS NULL))
+ -> Bitmap Index Scan on mcv_idx
+ Index Cond: ((a IS NULL) AND (b IS NULL))
+(4 rows)
+
+DROP TABLE mcv_list;
+-- NULL values (mix of int and text columns)
+CREATE TABLE mcv_list (
+ a INT,
+ b TEXT,
+ c INT,
+ d TEXT
+);
+ALTER TABLE mcv_list ADD STATISTICS (mcv) ON (a, b, c, d);
+INSERT INTO mcv_list
+ SELECT
+ mod(i, 100),
+ (CASE WHEN mod(i, 200) = 0 THEN NULL ELSE mod(i,200) END),
+ mod(i, 400),
+ (CASE WHEN mod(i, 300) = 0 THEN NULL ELSE mod(i,600) END)
+ FROM generate_series(1,10000) s(i);
+ANALYZE mcv_list;
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+ mcv_enabled | mcv_built | pg_mv_stats_mcvlist_info
+-------------+-----------+--------------------------
+ t | t | nitems=1200
+(1 row)
+
+DROP TABLE mcv_list;
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 2f9758f..fc27d34 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1357,7 +1357,9 @@ pg_mv_stats| SELECT n.nspname AS schemaname,
c.relname AS tablename,
s.stakeys AS attnums,
length(s.stadeps) AS depsbytes,
- pg_mv_stats_dependencies_info(s.stadeps) AS depsinfo
+ pg_mv_stats_dependencies_info(s.stadeps) AS depsinfo,
+ length(s.stamcv) AS mcvbytes,
+ pg_mv_stats_mcvlist_info(s.stamcv) AS mcvinfo
FROM ((pg_mv_statistic s
JOIN pg_class c ON ((c.oid = s.starelid)))
LEFT JOIN pg_namespace n ON ((n.oid = c.relnamespace)));
diff --git a/src/test/regress/parallel_schedule b/src/test/regress/parallel_schedule
index 00c6ddf..63727a4 100644
--- a/src/test/regress/parallel_schedule
+++ b/src/test/regress/parallel_schedule
@@ -111,4 +111,4 @@ test: event_trigger
test: stats
# run tests of multivariate stats
-test: mv_dependencies
+test: mv_dependencies mv_mcv
diff --git a/src/test/regress/serial_schedule b/src/test/regress/serial_schedule
index b818be9..5b07b3b 100644
--- a/src/test/regress/serial_schedule
+++ b/src/test/regress/serial_schedule
@@ -154,3 +154,4 @@ test: xml
test: event_trigger
test: stats
test: mv_dependencies
+test: mv_mcv
diff --git a/src/test/regress/sql/mv_mcv.sql b/src/test/regress/sql/mv_mcv.sql
new file mode 100644
index 0000000..5de3d29
--- /dev/null
+++ b/src/test/regress/sql/mv_mcv.sql
@@ -0,0 +1,178 @@
+-- data type passed by value
+CREATE TABLE mcv_list (
+ a INT,
+ b INT,
+ c INT
+);
+
+-- unknown column
+ALTER TABLE mcv_list ADD STATISTICS (mcv) ON (unknown_column);
+
+-- single column
+ALTER TABLE mcv_list ADD STATISTICS (mcv) ON (a);
+
+-- single column, duplicated
+ALTER TABLE mcv_list ADD STATISTICS (mcv) ON (a, a);
+
+-- two columns, one duplicated
+ALTER TABLE mcv_list ADD STATISTICS (mcv) ON (a, a, b);
+
+-- unknown option
+ALTER TABLE mcv_list ADD STATISTICS (unknown_option) ON (a, b, c);
+
+-- missing MCV statistics
+ALTER TABLE mcv_list ADD STATISTICS (dependencies, max_mcv_items 200) ON (a, b, c);
+
+-- invalid max_mcv_items value / too low
+ALTER TABLE mcv_list ADD STATISTICS (mcv, max_mcv_items 10) ON (a, b, c);
+
+-- invalid max_mcv_items value / too high
+ALTER TABLE mcv_list ADD STATISTICS (mcv, max_mcv_items 10000) ON (a, b, c);
+
+-- correct command
+ALTER TABLE mcv_list ADD STATISTICS (mcv) ON (a, b, c);
+
+-- random data
+INSERT INTO mcv_list
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+
+ANALYZE mcv_list;
+
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+
+TRUNCATE mcv_list;
+
+-- a => b, a => c, b => c
+INSERT INTO mcv_list
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+
+ANALYZE mcv_list;
+
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+
+TRUNCATE mcv_list;
+
+-- a => b, a => c
+INSERT INTO mcv_list
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+
+ANALYZE mcv_list;
+
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+
+TRUNCATE mcv_list;
+
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO mcv_list
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX mcv_idx ON mcv_list (a, b);
+
+ANALYZE mcv_list;
+
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mcv_list WHERE a = 10 AND b = 5;
+
+DROP TABLE mcv_list;
+
+-- varlena type (text)
+CREATE TABLE mcv_list (
+ a TEXT,
+ b TEXT,
+ c TEXT
+);
+
+ALTER TABLE mcv_list ADD STATISTICS (mcv) ON (a, b, c);
+
+-- random data
+INSERT INTO mcv_list
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+
+ANALYZE mcv_list;
+
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+
+TRUNCATE mcv_list;
+
+-- a => b, a => c, b => c
+INSERT INTO mcv_list
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+
+ANALYZE mcv_list;
+
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+
+TRUNCATE mcv_list;
+
+-- a => b, a => c
+INSERT INTO mcv_list
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE mcv_list;
+
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+
+TRUNCATE mcv_list;
+
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO mcv_list
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX mcv_idx ON mcv_list (a, b);
+ANALYZE mcv_list;
+
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mcv_list WHERE a = '10' AND b = '5';
+
+TRUNCATE mcv_list;
+
+-- check explain (expect bitmap index scan, not plain index scan) with NULLs
+INSERT INTO mcv_list
+ SELECT
+ (CASE WHEN i/10000 = 0 THEN NULL ELSE i/10000 END),
+ (CASE WHEN i/20000 = 0 THEN NULL ELSE i/20000 END),
+ (CASE WHEN i/40000 = 0 THEN NULL ELSE i/40000 END)
+ FROM generate_series(1,1000000) s(i);
+ANALYZE mcv_list;
+
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mcv_list WHERE a IS NULL AND b IS NULL;
+
+DROP TABLE mcv_list;
+
+-- NULL values (mix of int and text columns)
+CREATE TABLE mcv_list (
+ a INT,
+ b TEXT,
+ c INT,
+ d TEXT
+);
+
+ALTER TABLE mcv_list ADD STATISTICS (mcv) ON (a, b, c, d);
+
+INSERT INTO mcv_list
+ SELECT
+ mod(i, 100),
+ (CASE WHEN mod(i, 200) = 0 THEN NULL ELSE mod(i,200) END),
+ mod(i, 400),
+ (CASE WHEN mod(i, 300) = 0 THEN NULL ELSE mod(i,600) END)
+ FROM generate_series(1,10000) s(i);
+
+ANALYZE mcv_list;
+
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+
+DROP TABLE mcv_list;
--
1.9.3
Attachment: 0004-multivariate-histograms.patch (text/x-patch)
From 1ba83086a428bac548adba934bbb0c3909983978 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tv@fuzzy.cz>
Date: Sun, 11 Jan 2015 20:18:24 +0100
Subject: [PATCH 4/6] multivariate histograms
- extends the pg_mv_statistic catalog (add 'hist' fields)
- building the histograms during ANALYZE
- simple estimation while planning the queries
Includes regression tests mostly equal to those for functional
dependencies / MCV lists.
---
src/backend/catalog/system_views.sql | 4 +-
src/backend/commands/tablecmds.c | 108 +-
src/backend/nodes/outfuncs.c | 2 +
src/backend/optimizer/path/clausesel.c | 751 ++++++++-
src/backend/optimizer/util/plancat.c | 4 +-
src/backend/utils/mvstats/Makefile | 2 +-
src/backend/utils/mvstats/common.c | 41 +-
src/backend/utils/mvstats/histogram.c | 2486 ++++++++++++++++++++++++++++
src/bin/psql/describe.c | 15 +-
src/include/catalog/pg_mv_statistic.h | 24 +-
src/include/catalog/pg_proc.h | 4 +
src/include/nodes/relation.h | 2 +
src/include/utils/mvstats.h | 133 +-
src/test/regress/expected/mv_histogram.out | 207 +++
src/test/regress/expected/rules.out | 4 +-
src/test/regress/parallel_schedule | 2 +-
src/test/regress/serial_schedule | 1 +
src/test/regress/sql/mv_histogram.sql | 176 ++
18 files changed, 3921 insertions(+), 45 deletions(-)
create mode 100644 src/backend/utils/mvstats/histogram.c
create mode 100644 src/test/regress/expected/mv_histogram.out
create mode 100644 src/test/regress/sql/mv_histogram.sql
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 74fedf0..a9e761e 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -158,7 +158,9 @@ CREATE VIEW pg_mv_stats AS
length(S.stadeps) as depsbytes,
pg_mv_stats_dependencies_info(S.stadeps) as depsinfo,
length(S.stamcv) AS mcvbytes,
- pg_mv_stats_mcvlist_info(S.stamcv) AS mcvinfo
+ pg_mv_stats_mcvlist_info(S.stamcv) AS mcvinfo,
+ length(S.stahist) AS histbytes,
+ pg_mv_stats_histogram_info(S.stahist) AS histinfo
FROM (pg_mv_statistic S JOIN pg_class C ON (C.oid = S.starelid))
LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace);
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 545b595..831bd2f 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -11889,15 +11889,19 @@ static int compare_int16(const void *a, const void *b)
* The code is an unholy mix of pieces that really belong to other parts
* of the source tree.
*
- * FIXME Check that the types are pass-by-value and support sort,
- * although maybe we can live without the sort (and only build
- * MCV list / association rules).
- *
- * FIXME This should probably check for duplicate stats (i.e. same
- * keys, same options). Although maybe it's useful to have
- * multiple stats on the same columns with different options
- * (say, a detailed MCV-only stats for some queries, histogram
- * for others, etc.)
+ * TODO Check that the types support sort, although maybe we can live
+ * without it (and only build MCV list / association rules).
+ *
+ * TODO This should probably check for duplicate stats (i.e. same
+ * keys, same options). Although maybe it's useful to have
+ * multiple stats on the same columns with different options
+ * (say, a detailed MCV-only stats for some queries, histogram
+ * for others, etc.)
+ *
+ * TODO It might be useful to have ALTER TABLE DROP STATISTICS too, but
+ * it's tricky because there may be multiple kinds of stats for the
+ * same list of columns, with different options (e.g. one just MCV
+ * list, another with histogram, etc.).
*/
static void ATExecAddStatistics(AlteredTableInfo *tab, Relation rel,
StatisticsDef *def, LOCKMODE lockmode)
@@ -11915,12 +11919,15 @@ static void ATExecAddStatistics(AlteredTableInfo *tab, Relation rel,
/* by default build everything */
bool build_dependencies = false,
- build_mcv = false;
+ build_mcv = false,
+ build_histogram = false;
- int32 max_mcv_items = -1;
+ int32 max_buckets = -1,
+ max_mcv_items = -1;
/* options required because of other options */
- bool require_mcv = false;
+ bool require_mcv = false,
+ require_histogram = false;
Assert(IsA(def, StatisticsDef));
@@ -11998,6 +12005,29 @@ static void ATExecAddStatistics(AlteredTableInfo *tab, Relation rel,
MVSTAT_MCVLIST_MAX_ITEMS)));
}
+ else if (strcmp(opt->defname, "histogram") == 0)
+ build_histogram = defGetBoolean(opt);
+ else if (strcmp(opt->defname, "max_buckets") == 0)
+ {
+ max_buckets = defGetInt32(opt);
+
+ /* this option requires 'histogram' to be enabled */
+ require_histogram = true;
+
+ /* sanity check */
+ if (max_buckets < MVSTAT_HIST_MIN_BUCKETS)
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("minimum number of buckets is %d",
+ MVSTAT_HIST_MIN_BUCKETS)));
+
+ else if (max_buckets > MVSTAT_HIST_MAX_BUCKETS)
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("minimum number of buckets is %d",
+ MVSTAT_HIST_MAX_BUCKETS)));
+
+ }
else
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
@@ -12006,10 +12036,10 @@ static void ATExecAddStatistics(AlteredTableInfo *tab, Relation rel,
}
/* check that at least some statistics were requested */
- if (! (build_dependencies || build_mcv))
+ if (! (build_dependencies || build_mcv || build_histogram))
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
- errmsg("no statistics type (dependencies, mcv) was requested")));
+ errmsg("no statistics type (dependencies, mcv, histogram) was requested")));
/* now do some checking of the options */
if (require_mcv && (! build_mcv))
@@ -12017,6 +12047,11 @@ static void ATExecAddStatistics(AlteredTableInfo *tab, Relation rel,
(errcode(ERRCODE_SYNTAX_ERROR),
errmsg("option 'mcv' is required by other options(s)")));
+ if (require_histogram && (! build_histogram))
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("option 'histogram' is required by other options(s)")));
+
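+ /*
+ * For example (a sketch):
+ *
+ *   ALTER TABLE t ADD STATISTICS (histogram, max_buckets 1000) ON (a, b);
+ *
+ * passes these checks, while specifying 'max_buckets' without
+ * 'histogram' is rejected.
+ */
+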
/* sort the attnums and build int2vector */
qsort(attnums, numcols, sizeof(int16), compare_int16);
stakeys = buildint2vector(attnums, numcols);
@@ -12034,10 +12069,14 @@ static void ATExecAddStatistics(AlteredTableInfo *tab, Relation rel,
values[Anum_pg_mv_statistic_deps_enabled -1] = BoolGetDatum(build_dependencies);
values[Anum_pg_mv_statistic_mcv_enabled -1] = BoolGetDatum(build_mcv);
+ values[Anum_pg_mv_statistic_hist_enabled -1] = BoolGetDatum(build_histogram);
+
values[Anum_pg_mv_statistic_mcv_max_items -1] = Int32GetDatum(max_mcv_items);
+ values[Anum_pg_mv_statistic_hist_max_buckets -1] = Int32GetDatum(max_buckets);
nulls[Anum_pg_mv_statistic_stadeps -1] = true;
nulls[Anum_pg_mv_statistic_stamcv -1] = true;
+ nulls[Anum_pg_mv_statistic_stahist -1] = true;
/* insert the tuple into pg_mv_statistic */
mvstatrel = heap_open(MvStatisticRelationId, RowExclusiveLock);
@@ -12060,6 +12099,7 @@ static void ATExecAddStatistics(AlteredTableInfo *tab, Relation rel,
return;
}
+
/*
* Implements the ALTER TABLE ... DROP STATISTICS in two forms:
*
@@ -12085,12 +12125,16 @@ static void ATExecDropStatistics(AlteredTableInfo *tab, Relation rel,
/* checking whether the statistics matches / should be dropped */
bool build_dependencies = false;
bool build_mcv = false;
+ bool build_histogram = false;
bool max_mcv_items = 0;
+ int32 max_buckets = 0;
bool check_dependencies = false;
bool check_mcv = false;
bool check_mcv_items = false;
+ bool check_histogram = false;
+ bool check_buckets = false;
if (def != NULL)
{
@@ -12144,6 +12188,18 @@ static void ATExecDropStatistics(AlteredTableInfo *tab, Relation rel,
build_mcv = true;
max_mcv_items = defGetInt32(opt);
}
+ else if (strcmp(opt->defname, "histogram") == 0)
+ {
+ check_histogram = true;
+ build_histogram = defGetBoolean(opt);
+ }
+ else if (strcmp(opt->defname, "max_buckets") == 0)
+ {
+ check_histogram = true;
+ check_buckets = true;
+ max_buckets = defGetInt32(opt);
+ build_histogram = true;
+ }
else
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
@@ -12207,6 +12263,30 @@ static void ATExecDropStatistics(AlteredTableInfo *tab, Relation rel,
(DatumGetInt32(adatum) == max_mcv_items);
}
+ if (delete && check_histogram)
+ {
+ bool isnull;
+ Datum adatum = heap_getattr(tuple,
+ Anum_pg_mv_statistic_hist_enabled,
+ RelationGetDescr(statrel),
+ &isnull);
+
+ delete = (! isnull) &&
+ (DatumGetBool(adatum) == build_histogram);
+ }
+
+ if (delete && check_buckets)
+ {
+ bool isnull;
+ Datum adatum = heap_getattr(tuple,
+ Anum_pg_mv_statistic_hist_max_buckets,
+ RelationGetDescr(statrel),
+ &isnull);
+
+ delete = (! isnull) &&
+ (DatumGetInt32(adatum) == max_buckets);
+ }
+
/* check that the columns match the statistics definition */
if (delete && (numcols > 0))
{
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index 635ccc1..162b1be 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -1852,10 +1852,12 @@ _outMVStatisticInfo(StringInfo str, const MVStatisticInfo *node)
/* enabled statistics */
WRITE_BOOL_FIELD(deps_enabled);
WRITE_BOOL_FIELD(mcv_enabled);
+ WRITE_BOOL_FIELD(hist_enabled);
/* built/available statistics */
WRITE_BOOL_FIELD(deps_built);
WRITE_BOOL_FIELD(mcv_built);
+ WRITE_BOOL_FIELD(hist_built);
}
static void
diff --git a/src/backend/optimizer/path/clausesel.c b/src/backend/optimizer/path/clausesel.c
index abffb0a..2d3cf09 100644
--- a/src/backend/optimizer/path/clausesel.c
+++ b/src/backend/optimizer/path/clausesel.c
@@ -53,6 +53,7 @@ static void addRangeClause(RangeQueryClause **rqlist, Node *clause,
#define MV_CLAUSE_TYPE_FDEP 0x01
#define MV_CLAUSE_TYPE_MCV 0x02
+#define MV_CLAUSE_TYPE_HIST 0x04
static bool clause_is_mv_compatible(PlannerInfo *root, Node *clause, Oid varRelid,
Index *relid, Bitmapset **attnums, SpecialJoinInfo *sjinfo,
@@ -77,6 +78,8 @@ static Selectivity clauselist_mv_selectivity(PlannerInfo *root,
static Selectivity clauselist_mv_selectivity_mcvlist(PlannerInfo *root,
List *clauses, MVStatisticInfo *mvstats,
bool *fullmatch, Selectivity *lowsel);
+static Selectivity clauselist_mv_selectivity_histogram(PlannerInfo *root,
+ List *clauses, MVStatisticInfo *mvstats);
static int update_match_bitmap_mcvlist(PlannerInfo *root, List *clauses,
int2vector *stakeys, MCVList mcvlist,
@@ -84,6 +87,12 @@ static int update_match_bitmap_mcvlist(PlannerInfo *root, List *clauses,
Selectivity *lowsel, bool *fullmatch,
bool is_or);
+static int update_match_bitmap_histogram(PlannerInfo *root, List *clauses,
+ int2vector *stakeys,
+ MVSerializedHistogram mvhist,
+ int nmatches, char * matches,
+ bool is_or);
+
/* used for merging bitmaps - AND (min), OR (max) */
#define MAX(x, y) (((x) > (y)) ? (x) : (y))
#define MIN(x, y) (((x) < (y)) ? (x) : (y))
@@ -271,7 +280,7 @@ clauselist_selectivity(PlannerInfo *root,
* From now on we're only interested in MCV-compatible clauses.
*/
mvattnums = collect_mv_attnums(root, clauses, varRelid, &relid, sjinfo,
- MV_CLAUSE_TYPE_MCV);
+ (MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST));
/*
* If there still are at least two columns, we'll try to select
@@ -306,7 +315,7 @@ clauselist_selectivity(PlannerInfo *root,
/* split the clauselist into regular and mv-clauses */
clauses = clauselist_mv_split(root, sjinfo, clauses,
varRelid, &mvclauses, mvstat,
- MV_CLAUSE_TYPE_MCV);
+ (MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST));
/* we've chosen the histogram to match the clauses */
Assert(mvclauses != NIL);
@@ -1160,6 +1169,7 @@ static Selectivity
clauselist_mv_selectivity(PlannerInfo *root, List *clauses, MVStatisticInfo *mvstats)
{
bool fullmatch = false;
+ Selectivity s1 = 0.0, s2 = 0.0;
/*
* Lowest frequency in the MCV list (may be used as an upper bound
@@ -1173,9 +1183,24 @@ clauselist_mv_selectivity(PlannerInfo *root, List *clauses, MVStatisticInfo *mvs
* MCV/histogram evaluation).
*/
- /* Evaluate the MCV selectivity */
- return clauselist_mv_selectivity_mcvlist(root, clauses, mvstats,
+ /* Evaluate the MCV first. */
+ s1 = clauselist_mv_selectivity_mcvlist(root, clauses, mvstats,
&fullmatch, &mcv_low);
+
+ /*
+ * If we got a full equality match on the MCV list, we're done (and
+ * the estimate is pretty good).
+ */
+ if (fullmatch && (s1 > 0.0))
+ return s1;
+
+ /* FIXME if (fullmatch) without matching MCV item, use the mcv_low
+ * selectivity as upper bound */
+
+ s2 = clauselist_mv_selectivity_histogram(root, clauses, mvstats);
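+
+ /*
+ * A sketch of how the two parts are meant to combine: they should
+ * cover different subsets of the data (the MCV list the common
+ * values, the histogram the rest), so e.g. s1 = 0.30 and s2 = 0.05
+ * yield an estimate of 0.35.
+ */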
+
+ /* TODO clamp to <= 1.0 (or more strictly, when possible) */
+ return s1 + s2;
}
/*
@@ -1317,7 +1342,7 @@ choose_mv_statistics(List *stats, Bitmapset *attnums)
int numattrs = attrs->dim1;
/* skip dependencies-only stats */
- if (! info->mcv_built)
+ if (! (info->mcv_built || info->hist_built))
continue;
/* count columns covered by the histogram */
@@ -1483,7 +1508,6 @@ clause_is_mv_compatible(PlannerInfo *root, Node *clause, Oid varRelid,
bool ok;
/* is it 'variable op constant' ? */
-
ok = (bms_membership(clause_relids) == BMS_SINGLETON) &&
(is_pseudo_constant_clause_relids(lsecond(expr->args),
right_relids) ||
@@ -1533,10 +1557,10 @@ clause_is_mv_compatible(PlannerInfo *root, Node *clause, Oid varRelid,
case F_SCALARLTSEL:
case F_SCALARGTSEL:
/* not compatible with functional dependencies */
- if (types & MV_CLAUSE_TYPE_MCV)
+ if (types & (MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST))
{
*attnums = bms_add_member(*attnums, var->varattno);
- return (types & MV_CLAUSE_TYPE_MCV);
+ return (types & (MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST));
}
return false;
@@ -2464,3 +2488,714 @@ update_match_bitmap_mcvlist(PlannerInfo *root, List *clauses,
return nmatches;
}
+
+/*
+ * Estimate selectivity of clauses using a histogram.
+ *
+ * If there's no histogram for the stats, the function returns 0.0.
+ *
+ * The general idea of this method is similar to how MCV lists are
+ * processed, except that this introduces the concept of a partial
+ * match (MCV only works with full match / mismatch).
+ *
+ * The algorithm works like this:
+ *
+ * 1) mark all buckets as 'full match'
+ * 2) walk through all the clauses
+ * 3) for a particular clause, walk through all the buckets
+ * 4) skip buckets that are already 'no match'
+ * 5) check clause for buckets that still match (at least partially)
+ * 6) sum frequencies for buckets to get selectivity
+ *
+ * TODO This only handles AND-ed clauses, but it might work for OR-ed
+ * lists too - it just needs to reverse the logic a bit. I.e. start
+ * with 'no match' for all buckets, and increase the match level
+ * for the clauses (and skip buckets that are 'full match').
+ *
+ * TODO This might use a similar shortcut to MCV lists - count buckets
+ * marked as partial/full match, and terminate once this drops to 0.
+ * Not sure if it's really worth it - for MCV lists a situation like
+ * this is not uncommon, but for histograms it's not that clear.
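+ *
+ * A worked example (a sketch): with four buckets of frequency 0.25
+ * each, one FULL match and one PARTIAL match yield a selectivity of
+ * 0.25 + 0.5 * 0.25 = 0.375, as partial matches count as 50%.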
+ */
+static Selectivity
+clauselist_mv_selectivity_histogram(PlannerInfo *root, List *clauses,
+ MVStatisticInfo *mvstats)
+{
+ int i;
+ Selectivity s = 0.0;
+ int nmatches = 0;
+ char *matches = NULL;
+
+ MVSerializedHistogram mvhist = NULL;
+
+ /* there's no histogram */
+ if (! mvstats->hist_built)
+ return 0.0;
+
+ /* fetch the histogram (we know hist_built is set at this point) */
+ mvhist = load_mv_histogram2(mvstats->mvoid);
+
+ Assert (mvhist != NULL);
+ Assert (clauses != NIL);
+ Assert (list_length(clauses) >= 2);
+
+ /*
+ * Bitmap of bucket matches (mismatch, partial, full). By default
+ * all buckets fully match, and the clauses can only eliminate them.
+ */
+ matches = palloc0(sizeof(char) * mvhist->nbuckets);
+ memset(matches, MVSTATS_MATCH_FULL, sizeof(char)*mvhist->nbuckets);
+
+ nmatches = mvhist->nbuckets;
+
+ /* build the match bitmap */
+ update_match_bitmap_histogram(root, clauses,
+ mvstats->stakeys, mvhist,
+ nmatches, matches, false);
+
+ /* now, walk through the buckets and sum the selectivities */
+ for (i = 0; i < mvhist->nbuckets; i++)
+ {
+ if (matches[i] == MVSTATS_MATCH_FULL)
+ s += mvhist->buckets[i]->ntuples;
+ else if (matches[i] == MVSTATS_MATCH_PARTIAL)
+ s += 0.5 * mvhist->buckets[i]->ntuples;
+ }
+
+ /* release the allocated bitmap and deserialized histogram */
+ pfree(matches);
+ pfree(mvhist);
+
+ return s;
+}
+
+/*
+ * Evaluate clauses using the histogram, and update the match bitmap.
+ *
+ * The bitmap may be already partially set, so this is really a way to
+ * combine results of several clause lists - either when computing
+ * conditional probability P(A|B) or a combination of AND/OR clauses.
+ *
+ * Note: This is not a simple bitmap in the sense that there are more
+ * than two possible values for each item - no match, partial
+ * match and full match. So we need 2 bits per item.
+ *
+ * TODO This works with 'bitmap' where each item is represented as a
+ * char, which is slightly wasteful. Instead, we could use a bitmap
+ * with 2 bits per item, reducing the size to ~1/4. By using values
+ * 0, 1 and 3 (instead of 0, 1 and 2), the operations (merging etc.)
+ * might be performed just like for simple bitmap by using & and |,
+ * which might be faster than min/max.
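+ *
+ * E.g. with the 0/1/3 encoding, AND-merging a partial match (1)
+ * with a full match (3) gives (1 & 3) = 1, i.e. partial - the
+ * same result as MIN(partial, full) with the current encoding.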
+ */
+static int
+update_match_bitmap_histogram(PlannerInfo *root, List *clauses,
+ int2vector *stakeys,
+ MVSerializedHistogram mvhist,
+ int nmatches, char * matches,
+ bool is_or)
+{
+ int i;
+ ListCell * l;
+
+ /*
+ * Used for caching function calls, so that each deduplicated value
+ * is evaluated only once.
+ *
+ * We may have up to (2 * nbuckets) values per dimension. It's
+ * probably overkill, but let's allocate that once for all clauses,
+ * to minimize overhead.
+ *
+ * Also, we only need two bits per value, but this allocates byte
+ * per value. Might be worth optimizing.
+ *
+ * 0x00 - not yet called
+ * 0x01 - called, result is 'false'
+ * 0x03 - called, result is 'true'
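+ *
+ * The low bit records 'already called', the second bit stores the
+ * result, so (value & 0x02) extracts the cached boolean.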
+ */
+ char *callcache = palloc(mvhist->nbuckets);
+
+ int calls = 0, hits = 0;
+
+ Assert (mvhist != NULL);
+ Assert(mvhist->nbuckets > 0);
+ Assert(nmatches >= 0);
+ Assert(nmatches <= mvhist->nbuckets);
+
+ Assert (clauses != NIL);
+ Assert(list_length(clauses) >= 1);
+
+ /* loop through the clauses and do the estimation */
+ foreach (l, clauses)
+ {
+ Node * clause = (Node*)lfirst(l);
+
+ /* if it's a RestrictInfo, then extract the clause */
+ if (IsA(clause, RestrictInfo))
+ clause = (Node*)((RestrictInfo*)clause)->clause;
+
+ /* it's either OpClause, or NullTest */
+ if (is_opclause(clause))
+ {
+ OpExpr * expr = (OpExpr*)clause;
+ bool varonleft = true;
+ bool ok;
+
+ FmgrInfo opproc; /* operator */
+ fmgr_info(get_opcode(expr->opno), &opproc);
+
+ /* reset the cache (per clause) */
+ memset(callcache, 0, mvhist->nbuckets);
+
+ ok = (NumRelids(clause) == 1) &&
+ (is_pseudo_constant_clause(lsecond(expr->args)) ||
+ (varonleft = false,
+ is_pseudo_constant_clause(linitial(expr->args))));
+
+ if (ok)
+ {
+ FmgrInfo ltproc;
+ RegProcedure oprrest = get_oprrest(expr->opno);
+
+ Var * var = (varonleft) ? linitial(expr->args) : lsecond(expr->args);
+ Const * cst = (varonleft) ? lsecond(expr->args) : linitial(expr->args);
+ bool isgt = (! varonleft);
+
+ /*
+ * TODO Fetch only when really needed (probably for equality only)
+ *
+ * TODO Technically either lt/gt is sufficient.
+ *
+ * FIXME The code in analyze.c creates histograms only for types
+ * with enough ordering (by calling get_sort_group_operators).
+ * Is this the same assumption, i.e. are we certain that we
+ * get the ltproc/gtproc every time we ask? Or are there types
+ * where get_sort_group_operators returns ltopr and here we
+ * get nothing?
+ */
+ TypeCacheEntry *typecache
+ = lookup_type_cache(var->vartype, TYPECACHE_EQ_OPR | TYPECACHE_LT_OPR
+ | TYPECACHE_GT_OPR);
+
+ /* lookup dimension for the attribute */
+ int idx = mv_get_index(var->varattno, stakeys);
+
+ fmgr_info(get_opcode(typecache->lt_opr), <proc);
+
+ /*
+ * Check this for all buckets that still have "true" in the bitmap
+ *
+ * We already know the clauses use suitable operators (because that's
+ * how we filtered them).
+ */
+ for (i = 0; i < mvhist->nbuckets; i++)
+ {
+ bool tmp;
+ MVSerializedBucket bucket = mvhist->buckets[i];
+
+ /* histogram boundaries */
+ Datum minval, maxval;
+
+ /* values from the call cache */
+ char mincached, maxcached;
+
+ /*
+ * For AND-lists, we can also mark NULL buckets as 'no match'
+ * (and then skip them). For OR-lists this is not possible.
+ */
+ if ((! is_or) && bucket->nullsonly[idx])
+ matches[i] = MVSTATS_MATCH_NONE;
+
+ /*
+ * Skip buckets that were already eliminated - this is important
+ * considering how we update the info (we only lower the match).
+ * We can't really do anything about the MATCH_PARTIAL buckets.
+ */
+ if ((! is_or) && (matches[i] == MVSTATS_MATCH_NONE))
+ continue;
+ else if (is_or && (matches[i] == MVSTATS_MATCH_FULL))
+ continue;
+
+ /* lookup the values and cache of function calls */
+ minval = mvhist->values[idx][bucket->min[idx]];
+ maxval = mvhist->values[idx][bucket->max[idx]];
+
+ mincached = callcache[bucket->min[idx]];
+ maxcached = callcache[bucket->max[idx]];
+
+ /*
+ * TODO Maybe it's possible to add here a similar optimization
+ * as for the MCV lists:
+ *
+ * (nmatches == 0) && AND-list => all eliminated (FALSE)
+ * (nmatches == N) && OR-list => all eliminated (TRUE)
+ *
+ * But it's more complex because of the partial matches.
+ */
+
+ /*
+ * If it's not a "<" or ">" or "=" operator, just ignore the
+ * clause. Otherwise note the relid and attnum for the variable.
+ *
+ * TODO I'm really unsure whether the handling of the 'isgt' flag (that is, clauses
+ * with reverse order of variable/constant) is correct. I wouldn't
+ * be surprised if there was some mixup. Using the lt/gt operators
+ * instead of messing with the opproc could make it simpler.
+ * It would however be using a different operator than the query,
+ * although it's not any shadier than using the selectivity function
+ * as is done currently.
+ *
+ * FIXME Once the min/max values are deduplicated, we can easily minimize
+ * the number of calls to the comparator (assuming we keep the
+ * deduplicated structure). See the note on compression at MVBucket
+ * serialize/deserialize methods.
+ */
+ switch (oprrest)
+ {
+ case F_SCALARLTSEL: /* column < constant */
+
+ if (! isgt) /* (var < const) */
+ {
+ /*
+ * First check whether the constant is below the lower boundary (in that
+ * case we can skip the bucket, because there's no overlap).
+ */
+ ++calls;
+ if (! mincached)
+ {
+ tmp = DatumGetBool(FunctionCall2Coll(&opproc,
+ DEFAULT_COLLATION_OID,
+ cst->constvalue,
+ minval));
+
+ /*
+ * Update the cache (but in reverse, because we keep the
+ * cache keyed as (minval, constvalue)).
+ */
+ if (tmp)
+ callcache[bucket->min[idx]] = 0x01; /* cached, false */
+ else
+ callcache[bucket->min[idx]] = 0x03; /* cached, true */
+ }
+ else
+ {
+ ++hits;
+ tmp = !(mincached & 0x02); /* extract the result (reverse) */
+ }
+
+ if (tmp)
+ {
+ /* no match */
+ UPDATE_RESULT(matches[i], MVSTATS_MATCH_NONE, is_or);
+ continue;
+ }
+
+ /*
+ * Now check whether the constant is below the upper boundary (in that
+ * case it's a partial match).
+ */
+ ++calls;
+ if (! maxcached)
+ {
+ tmp = DatumGetBool(FunctionCall2Coll(&opproc,
+ DEFAULT_COLLATION_OID,
+ cst->constvalue,
+ maxval));
+
+ /* update the cache */
+ if (tmp)
+ callcache[bucket->max[idx]] = 0x01; /* cached, false */
+ else
+ callcache[bucket->max[idx]] = 0x03; /* cached, true */
+ }
+ else
+ {
+ ++hits;
+ tmp = !(maxcached & 0x02); /* extract the result (reverse) */
+ }
+
+ if (tmp)
+ /* partial match */
+ UPDATE_RESULT(matches[i], MVSTATS_MATCH_PARTIAL, is_or);
+
+ }
+ else /* (const < var) */
+ {
+ /*
+ * First check whether the constant is above the upper boundary (in that
+ * case we can skip the bucket, because there's no overlap).
+ */
+ ++calls;
+ if (! maxcached)
+ {
+ tmp = DatumGetBool(FunctionCall2Coll(&opproc,
+ DEFAULT_COLLATION_OID,
+ maxval,
+ cst->constvalue));
+
+ /* update the cache */
+ if (tmp)
+ callcache[bucket->max[idx]] = 0x03; /* cached, true */
+ else
+ callcache[bucket->max[idx]] = 0x01; /* cached, false */
+ }
+ else
+ {
+ ++hits;
+ tmp = (maxcached & 0x02); /* extract the result */
+ }
+
+ if (tmp)
+ {
+ /* no match */
+ UPDATE_RESULT(matches[i], MVSTATS_MATCH_NONE, is_or);
+ continue;
+ }
+
+ /*
+ * Now check whether the lower boundary is below the constant (in that
+ * case it's a partial match).
+ */
+ ++calls;
+ if (! mincached)
+ {
+ tmp = DatumGetBool(FunctionCall2Coll(&opproc,
+ DEFAULT_COLLATION_OID,
+ minval,
+ cst->constvalue));
+
+ /* update the cache */
+ if (tmp)
+ callcache[bucket->min[idx]] = 0x03; /* cached, true */
+ else
+ callcache[bucket->min[idx]] = 0x01; /* cached, false */
+ }
+ else
+ {
+ ++hits;
+ tmp = (mincached & 0x02); /* extract the result */
+ }
+
+ if (tmp)
+ /* partial match */
+ UPDATE_RESULT(matches[i], MVSTATS_MATCH_PARTIAL, is_or);
+ }
+ break;
+
+ case F_SCALARGTSEL: /* column > constant */
+
+ if (! isgt) /* (var > const) */
+ {
+ /*
+ * First check whether the constant is above the upper boundary (in that
+ * case we can skip the bucket, because there's no overlap).
+ */
+ ++calls;
+ if (! maxcached)
+ {
+ tmp = DatumGetBool(FunctionCall2Coll(&opproc,
+ DEFAULT_COLLATION_OID,
+ cst->constvalue,
+ maxval));
+
+ /* update the cache */
+ if (tmp)
+ callcache[bucket->max[idx]] = 0x01; /* cached, false */
+ else
+ callcache[bucket->max[idx]] = 0x03; /* cached, true */
+ }
+ else
+ {
+ ++hits;
+ tmp = !(maxcached & 0x02); /* extract the result */
+ }
+
+ if (tmp)
+ {
+ /* no match */
+ UPDATE_RESULT(matches[i], MVSTATS_MATCH_NONE, is_or);
+ continue;
+ }
+
+ /*
+ * Now check whether the lower boundary is below the constant (in that
+ * case it's a partial match).
+ */
+ ++calls;
+ if (! mincached)
+ {
+ tmp = DatumGetBool(FunctionCall2Coll(&opproc,
+ DEFAULT_COLLATION_OID,
+ cst->constvalue,
+ minval));
+
+ /* update the cache */
+ if (tmp)
+ callcache[bucket->min[idx]] = 0x01; /* cached, false */
+ else
+ callcache[bucket->min[idx]] = 0x03; /* cached, true */
+ }
+ else
+ {
+ ++hits;
+ tmp = !(mincached & 0x02); /* extract the result */
+ }
+
+ if (tmp)
+ /* partial match */
+ UPDATE_RESULT(matches[i], MVSTATS_MATCH_PARTIAL, is_or);
+ }
+ else /* (const > var) */
+ {
+ /*
+ * First check whether the constant is below the lower boundary (in
+ * that case we can skip the bucket, because there's no overlap).
+ */
+ ++calls;
+ if (! mincached)
+ {
+ tmp = DatumGetBool(FunctionCall2Coll(&opproc,
+ DEFAULT_COLLATION_OID,
+ minval,
+ cst->constvalue));
+
+ /* update the cache */
+ if (tmp)
+ callcache[bucket->min[idx]] = 0x03; /* cached, true */
+ else
+ callcache[bucket->min[idx]] = 0x01; /* cached, false */
+ }
+ else
+ {
+ ++hits;
+ tmp = (mincached & 0x02); /* extract the result */
+ }
+
+ if (tmp)
+ {
+ /* no match */
+ UPDATE_RESULT(matches[i], MVSTATS_MATCH_NONE, is_or);
+ continue;
+ }
+
+ /*
+ * Now check whether the constant is below the upper boundary (in that
+ * case it's a partial match).
+ */
+ ++calls;
+ if (! maxcached)
+ {
+ tmp = DatumGetBool(FunctionCall2Coll(&opproc,
+ DEFAULT_COLLATION_OID,
+ maxval,
+ cst->constvalue));
+
+ /* update the cache */
+ if (tmp)
+ callcache[bucket->max[idx]] = 0x03; /* cached, true */
+ else
+ callcache[bucket->max[idx]] = 0x01; /* cached, false */
+ }
+ else
+ {
+ ++hits;
+ tmp = (maxcached & 0x02); /* extract the result */
+ }
+
+ if (tmp)
+ /* partial match */
+ UPDATE_RESULT(matches[i], MVSTATS_MATCH_PARTIAL, is_or);
+ }
+ break;
+
+ case F_EQSEL:
+
+ /*
+ * We only check whether the value is within the bucket, using the lt/gt
+ * operators fetched from type cache.
+ *
+ * TODO We'll use the default 50% estimate, but that's probably way off
+ * if there are multiple distinct values. Consider tweaking this
+ * somehow, e.g. using only a fraction inversely proportional to the
+ * estimated number of distinct values in the bucket.
+ *
+ * TODO This does not handle inclusion flags at the moment, thus counting
+ * some buckets twice (when hitting the boundary).
+ *
+ * TODO Optimization is that if max[i] == min[i], it's effectively a MCV
+ * item and we can count the whole bucket as a complete match (thus
+ * using 100% bucket selectivity and not just 50%).
+ *
+ * TODO Technically some buckets may "degenerate" into single-value
+ * buckets (not necessarily for all the dimensions) - maybe this
+ * is better than keeping a separate MCV list (multi-dimensional).
+ * Update: Actually, that's unlikely to be better than a separate
+ * MCV list for two reasons - first, it requires ~2x the space
+ * (because of storing lower/upper boundaries) and second because
+ * the buckets are ranges - depending on the partitioning algorithm
+ * it may not even degenerate into (min=max) bucket. For example the
+ * the current partitioning algorithm never does that.
+ */
+ ++calls;
+ if (! mincached)
+ {
+ tmp = DatumGetBool(FunctionCall2Coll(&ltproc,
+ DEFAULT_COLLATION_OID,
+ cst->constvalue,
+ minval));
+
+ /* update the cache */
+ if (tmp)
+ callcache[bucket->min[idx]] = 0x03; /* cached, true */
+ else
+ callcache[bucket->min[idx]] = 0x01; /* cached, false */
+ }
+ else
+ {
+ ++hits;
+ tmp = (mincached & 0x02); /* extract the result */
+ }
+
+ if (tmp)
+ {
+ /* no match */
+ UPDATE_RESULT(matches[i], MVSTATS_MATCH_NONE, is_or);
+ continue;
+ }
+
+ ++calls;
+ if (! maxcached)
+ {
+ tmp = DatumGetBool(FunctionCall2Coll(&ltproc,
+ DEFAULT_COLLATION_OID,
+ maxval,
+ cst->constvalue));
+
+ /* update the cache */
+ if (tmp)
+ callcache[bucket->max[idx]] = 0x03; /* cached, true */
+ else
+ callcache[bucket->max[idx]] = 0x01; /* cached, false */
+ }
+ else
+ {
+ ++hits;
+ tmp = (maxcached & 0x02); /* extract the result */
+ }
+
+ if (tmp)
+ {
+ /* no match */
+ UPDATE_RESULT(matches[i], MVSTATS_MATCH_NONE, is_or);
+ continue;
+ }
+
+ /* partial match */
+ UPDATE_RESULT(matches[i], MVSTATS_MATCH_PARTIAL, is_or);
+
+ break;
+ }
+ }
+ }
+ }
+ else if (IsA(clause, NullTest))
+ {
+ NullTest * expr = (NullTest*)clause;
+ Var * var = (Var*)(expr->arg);
+
+ /* FIXME proper matching attribute to dimension */
+ int idx = mv_get_index(var->varattno, stakeys);
+
+ /*
+ * Walk through the buckets and evaluate the current clause. We can
+ * skip buckets that were already ruled out, and terminate if there
+ * are no remaining buckets that might possibly match.
+ */
+ for (i = 0; i < mvhist->nbuckets; i++)
+ {
+ MVSerializedBucket bucket = mvhist->buckets[i];
+
+ /*
+ * Skip buckets that were already eliminated - this is important
+ * considering how we update the info (we only lower the match)
+ */
+ if ((! is_or) && (matches[i] == MVSTATS_MATCH_NONE))
+ continue;
+ else if (is_or && (matches[i] == MVSTATS_MATCH_FULL))
+ continue;
+
+ /* if the bucket mismatches the clause, set it to MATCH_NONE */
+ if ((expr->nulltesttype == IS_NULL)
+ && (! bucket->nullsonly[idx]))
+ UPDATE_RESULT(matches[i], MVSTATS_MATCH_NONE, is_or);
+
+ else if ((expr->nulltesttype == IS_NOT_NULL) &&
+ (bucket->nullsonly[idx]))
+ UPDATE_RESULT(matches[i], MVSTATS_MATCH_NONE, is_or);
+ }
+ }
+ else if (or_clause(clause) || and_clause(clause))
+ {
+ /* AND/OR clause, with all clauses compatible with the selected MV stat */
+
+ int i;
+ BoolExpr *orclause = ((BoolExpr*)clause);
+ List *orclauses = orclause->args;
+
+ /* match/mismatch bitmap for each bucket */
+ int or_nmatches = 0;
+ char * or_matches = NULL;
+
+ Assert(orclauses != NIL);
+ Assert(list_length(orclauses) >= 2);
+
+ /* number of matching buckets */
+ or_nmatches = mvhist->nbuckets;
+
+ /* by default none of the buckets matches the clauses */
+ or_matches = palloc0(sizeof(char) * or_nmatches);
+
+ if (or_clause(clause))
+ {
+ /* OR clauses assume nothing matches, initially */
+ memset(or_matches, MVSTATS_MATCH_NONE, sizeof(char)*or_nmatches);
+ or_nmatches = 0;
+ }
+ else
+ {
+ /* AND clauses assume everything matches, initially */
+ memset(or_matches, MVSTATS_MATCH_FULL, sizeof(char)*or_nmatches);
+ }
+
+ /* build the match bitmap for the nested AND/OR clauses */
+ or_nmatches = update_match_bitmap_histogram(root, orclauses,
+ stakeys, mvhist,
+ or_nmatches, or_matches, or_clause(clause));
+
+ /* merge the bitmap into the existing one */
+ for (i = 0; i < mvhist->nbuckets; i++)
+ {
+ /*
+ * To AND-merge the bitmaps, a MIN() semantics is used.
+ * For OR-merge, use MAX().
+ *
+ * FIXME this does not decrease the number of matches
+ */
+ UPDATE_RESULT(matches[i], or_matches[i], is_or);
+ }
+
+ pfree(or_matches);
+
+ }
+ else
+ {
+ elog(ERROR, "unknown clause type: %d", clause->type);
+ }
+ }
+
+ elog(WARNING, "calls=%d hits=%d hit ratio %.2f",
+ calls, hits, hits * 100.0 / calls);
+
+ /* free the call cache */
+ pfree(callcache);
+
+ return nmatches;
+}
diff --git a/src/backend/optimizer/util/plancat.c b/src/backend/optimizer/util/plancat.c
index c196ca0..a05c811 100644
--- a/src/backend/optimizer/util/plancat.c
+++ b/src/backend/optimizer/util/plancat.c
@@ -406,7 +406,7 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
mvstat = (Form_pg_mv_statistic) GETSTRUCT(htup);
/* unavailable stats are not interesting for the planner */
- if (mvstat->deps_built || mvstat->mcv_built)
+ if (mvstat->deps_built || mvstat->mcv_built || mvstat->hist_built)
{
info = makeNode(MVStatisticInfo);
@@ -416,10 +416,12 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
/* enabled statistics */
info->deps_enabled = mvstat->deps_enabled;
info->mcv_enabled = mvstat->mcv_enabled;
+ info->hist_enabled = mvstat->hist_enabled;
/* built/available statistics */
info->deps_built = mvstat->deps_built;
info->mcv_built = mvstat->mcv_built;
+ info->hist_built = mvstat->hist_built;
/* stakeys */
adatum = SysCacheGetAttr(MVSTATOID, htup,
diff --git a/src/backend/utils/mvstats/Makefile b/src/backend/utils/mvstats/Makefile
index 3c0aff4..9dbb3b6 100644
--- a/src/backend/utils/mvstats/Makefile
+++ b/src/backend/utils/mvstats/Makefile
@@ -12,6 +12,6 @@ subdir = src/backend/utils/mvstats
top_builddir = ../../../..
include $(top_builddir)/src/Makefile.global
-OBJS = common.o mcv.o dependencies.o
+OBJS = common.o dependencies.o histogram.o mcv.o
include $(top_srcdir)/src/backend/common.mk
diff --git a/src/backend/utils/mvstats/common.c b/src/backend/utils/mvstats/common.c
index d1da714..6499357 100644
--- a/src/backend/utils/mvstats/common.c
+++ b/src/backend/utils/mvstats/common.c
@@ -13,11 +13,11 @@
*
*-------------------------------------------------------------------------
*/
+#include "postgres.h"
+#include "utils/array.h"
#include "common.h"
-#include "utils/array.h"
-
static VacAttrStats ** lookup_var_attr_stats(int2vector *attrs,
int natts,
VacAttrStats **vacattrstats);
@@ -52,7 +52,8 @@ build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
MVStatisticInfo *stat = (MVStatisticInfo *)lfirst(lc);
MVDependencies deps = NULL;
MCVList mcvlist = NULL;
- int numrows_filtered = 0;
+ MVHistogram histogram = NULL;
+ int numrows_filtered = numrows;
VacAttrStats **stats = NULL;
int numatts = 0;
@@ -95,8 +96,16 @@ build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
if (stat->mcv_enabled)
mcvlist = build_mv_mcvlist(numrows, rows, attrs, stats, &numrows_filtered);
+ /* build a multivariate histogram on the columns */
+ if ((numrows_filtered > 0) && (stat->hist_enabled))
+ histogram = build_mv_histogram(numrows_filtered, rows, attrs, stats, numrows);
+
/* store the histogram / MCV list in the catalog */
- update_mv_stats(stat->mvoid, deps, mcvlist, attrs, stats);
+ update_mv_stats(stat->mvoid, deps, mcvlist, histogram, attrs, stats);
+
+#ifdef MVSTATS_DEBUG
+ if (histogram != NULL)
+ print_mv_histogram_info(histogram);
+#endif
}
}
@@ -176,6 +185,8 @@ list_mv_stats(Oid relid)
info->deps_built = stats->deps_built;
info->mcv_enabled = stats->mcv_enabled;
info->mcv_built = stats->mcv_built;
+ info->hist_enabled = stats->hist_enabled;
+ info->hist_built = stats->hist_built;
result = lappend(result, info);
}
@@ -190,7 +201,6 @@ list_mv_stats(Oid relid)
return result;
}
-
/*
* Find attnims of MV stats using the mvoid.
*/
@@ -236,9 +246,16 @@ find_mv_attnums(Oid mvoid, Oid *relid)
}
+/*
+ * FIXME This adds statistics, but we need to drop statistics when the
+ * table is dropped. Not sure what to do when a column is dropped.
+ * Either we can (a) remove all stats on that column, (b) remove
+ * the column from defined stats and force rebuild, (c) remove the
+ * column on next ANALYZE. Or maybe something else?
+ */
void
update_mv_stats(Oid mvoid,
- MVDependencies dependencies, MCVList mcvlist,
+ MVDependencies dependencies, MCVList mcvlist, MVHistogram histogram,
int2vector *attrs, VacAttrStats **stats)
{
HeapTuple stup,
@@ -271,22 +288,34 @@ update_mv_stats(Oid mvoid,
values[Anum_pg_mv_statistic_stamcv - 1] = PointerGetDatum(data);
}
+ if (histogram != NULL)
+ {
+ bytea * data = serialize_mv_histogram(histogram, attrs, stats);
+ nulls[Anum_pg_mv_statistic_stahist-1] = (data == NULL);
+ values[Anum_pg_mv_statistic_stahist - 1]
+ = PointerGetDatum(data);
+ }
+
/* always replace the value (either by bytea or NULL) */
replaces[Anum_pg_mv_statistic_stadeps -1] = true;
replaces[Anum_pg_mv_statistic_stamcv -1] = true;
+ replaces[Anum_pg_mv_statistic_stahist-1] = true;
/* always change the availability flags */
nulls[Anum_pg_mv_statistic_deps_built -1] = false;
nulls[Anum_pg_mv_statistic_mcv_built -1] = false;
+ nulls[Anum_pg_mv_statistic_hist_built-1] = false;
nulls[Anum_pg_mv_statistic_stakeys-1] = false;
/* use the new attnums, in case we removed some dropped ones */
replaces[Anum_pg_mv_statistic_deps_built-1] = true;
replaces[Anum_pg_mv_statistic_mcv_built -1] = true;
+ replaces[Anum_pg_mv_statistic_hist_built -1] = true;
replaces[Anum_pg_mv_statistic_stakeys -1] = true;
values[Anum_pg_mv_statistic_deps_built-1] = BoolGetDatum(dependencies != NULL);
values[Anum_pg_mv_statistic_mcv_built -1] = BoolGetDatum(mcvlist != NULL);
+ values[Anum_pg_mv_statistic_hist_built -1] = BoolGetDatum(histogram != NULL);
values[Anum_pg_mv_statistic_stakeys -1] = PointerGetDatum(attrs);
/* Is there already a pg_mv_statistic tuple for this attribute? */
diff --git a/src/backend/utils/mvstats/histogram.c b/src/backend/utils/mvstats/histogram.c
new file mode 100644
index 0000000..4a7f4b2
--- /dev/null
+++ b/src/backend/utils/mvstats/histogram.c
@@ -0,0 +1,2486 @@
+/*-------------------------------------------------------------------------
+ *
+ * histogram.c
+ * POSTGRES multivariate histograms
+ *
+ *
+ * Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ * src/backend/utils/mvstats/histogram.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres.h"
+
+#include "fmgr.h"
+#include "funcapi.h"
+
+#include "utils/lsyscache.h"
+
+#include "common.h"
+#include <math.h>
+
+/*
+ * Multivariate histograms
+ *
+ * Histograms are a collection of buckets, represented by n-dimensional
+ * rectangles. Each rectangle is delimited by a min/max value in each
+ * dimension, stored in an array, so that the bucket includes values
+ * fulfilling condition
+ *
+ * min[i] <= value[i] <= max[i]
+ *
+ * where 'i' is the dimension. In 1D this corresponds to a simple
+ * interval, in 2D to a rectangle, and in 3D to a block. If you can
+ * imagine this in 4D, congrats!
+ *
+ * In addition to the boundaries, each bucket tracks additional details:
+ *
+ * * frequency (fraction of tuples it matches)
+ * * whether the boundaries are inclusive or exclusive
+ * * whether the dimension contains only NULL values
+ * * number of distinct values in each dimension (for building)
+ *
+ * and possibly some additional information.
+ *
+ * We do expect to support multiple histogram types, with different
+ * features etc. The 'type' field is used to identify those types.
+ * Technically some histogram types might use completely different
+ * bucket representation, but that's not expected at the moment.
+ *
+ * Although the current implementation builds non-overlapping buckets,
+ * the code does not rely on the non-overlapping nature - there are
+ * interesting types of histograms / histogram building algorithms
+ * producing overlapping buckets.
+ *
+ * TODO Currently the histogram does not include information about what
+ * part of the table it covers (because the frequencies are
+ * computed from the rows that may be filtered by MCV list). Seems
+ * wrong, possibly causing misestimates (when not matching the MCV
+ * list, we'll probably get much higher selectivity).
+ *
+ *
+ * Estimating selectivity
+ * ----------------------
+ * With histograms, we always "match" a whole bucket, not individual
+ * rows (or values), irrespective of the type of clause. Therefore we
+ * can't use the optimizations for equality clauses, as in MCV lists.
+ *
+ * The current implementation uses histograms to estimate these types
+ * of clauses (think of WHERE conditions):
+ *
+ * (a) equality clauses WHERE (a = 1) AND (b = 2)
+ * (b) inequality clauses WHERE (a < 1) AND (b >= 2)
+ *
+ * It's possible to add more clauses, for example:
+ *
+ * (a) NULL clauses WHERE (a IS NULL) AND (b IS NOT NULL)
+ * (b) multi-var clauses WHERE (a > b)
+ *
+ * and so on. These are tasks for the future, not yet implemented.
+ *
+ * When used on low-cardinality data, histograms usually perform
+ * considerably worse than MCV lists (which are a good fit for this
+ * kind of data). This is especially true on categorical data, where
+ * the ordering of values is only loosely related to the meaning of
+ * the data, as proper ordering is crucial for histograms.
+ *
+ * On high-cardinality data the histograms are usually a better choice,
+ * because MCV lists can't accurately represent the distribution.
+ *
+ * By evaluating a clause on a bucket, we may get one of three results:
+ *
+ * (a) FULL_MATCH - The bucket definitely matches the clause.
+ *
+ * (b) PARTIAL_MATCH - The bucket matches the clause, but not
+ * necessarily all the tuples it represents.
+ *
+ * (c) NO_MATCH - The bucket definitely does not match the clause.
+ *
+ * This may be illustrated using a range [1, 5], which is essentially
+ * a 1D bucket. With clause
+ *
+ * WHERE (a < 10) => FULL_MATCH (all range values are below
+ * 10, so the whole bucket matches)
+ *
+ * WHERE (a < 3) => PARTIAL_MATCH (there may be values matching
+ * the clause, but we don't know how many)
+ *
+ * WHERE (a < 0) => NO_MATCH (all range values are above 1, so
+ * no values from the bucket match)
+ *
+ * Some clauses may produce only some of those results - for example
+ * equality clauses never produce FULL_MATCH, as we always hit only
+ * part of the bucket, not all the values. This results in less accurate
+ * estimates compared to MCV lists, where we can hit an MCV item exactly
+ * (an extreme case of that is a 'full match').
+ *
+ * There are also clauses that never produce PARTIAL_MATCH results.
+ * A nice example of that is the 'IS [NOT] NULL' clause, which either
+ * matches the bucket completely (FULL_MATCH) or not at all (NO_MATCH),
+ * thanks to how the NULL-buckets are constructed.
+ *
+ * TODO The IS [NOT] NULL clause is not yet implemented, but should be
+ * rather trivial to add.
+ *
+ * Computing the total selectivity estimate is trivial - simply sum the
+ * frequencies of all the FULL_MATCH and PARTIAL_MATCH buckets, but
+ * multiply the PARTIAL_MATCH frequencies by 0.5 to minimize the
+ * average error.
+ *
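+ * For example, if the matching buckets have frequencies 0.1 (full
+ * match), 0.2 and 0.2 (both partial matches), the estimate is
+ *
+ *     0.1 + 0.5 * (0.2 + 0.2) = 0.3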
+ *
+ * NULL handling
+ * -------------
+ * Buckets may not contain tuples with NULL and non-NULL values in
+ * a single dimension (attribute). To handle this, the histogram may
+ * contain NULL-buckets, i.e. buckets with one or more NULL-only
+ * dimensions.
+ *
+ * The maximum number of NULL-buckets is determined by the number of
+ * attributes the histogram is built on. For N-dimensional histogram,
+ * the maximum number of NULL-buckets is 2^N. So for 8 attributes
+ * (which is the current value of MVSTATS_MAX_DIMENSIONS), there may be
+ * up to 256 NULL-buckets.
+ *
+ * Those buckets are only built if needed - if there are no NULL values
+ * in the data, no such buckets are built.
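+ *
+ * For example, with a histogram on two columns (a,b), there may be
+ * buckets with (a NULL, b non-NULL), (a non-NULL, b NULL) and (a,b)
+ * both NULL, in addition to the regular non-NULL buckets.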
+ *
+ *
+ * Serialization
+ * -------------
+ * After building, the histogram is serialized into a more efficient
+ * form (dedup boundary values etc.). See serialize_mv_histogram() for
+ * more details about how it's done.
+ *
+ * Serialized histograms are marked with 'magic' constant, to make it
+ * easier to check the bytea really is a histogram in serialized form.
+ *
+ *
+ * TODO This structure is used both when building the histogram and
+ * then when using it to compute estimates. That's why the last
+ * few elements are not used once the histogram is built.
+ *
+ * Add a pointer to 'private' data, meant for data specific to
+ * other histogram-building algorithms. That would also remove
+ * the bogus / unnecessary fields.
+ *
+ * TODO The limit on number of buckets is quite arbitrary, aiming for
+ * sufficient accuracy while still being fast. Probably should be
+ * replaced with a dynamic limit dependent on statistics target,
+ * number of attributes (dimensions) and statistics target
+ * associated with the attributes. Also, this needs to be related
+ * to the number of sampled rows, by either clamping it to a
+ * reasonable number (after seeing the number of rows) or using
+ * it when computing the number of rows to sample. Something like
+ * 10 rows per bucket seems reasonable.
+ *
+ * TODO Add MVSTAT_HIST_ROWS_PER_BUCKET tracking minimal number of
+ * tuples per bucket (also, see the previous TODO).
+ *
+ * TODO We may replace the bool arrays with a suitably large data type
+ * (say, uint16 or uint32) and get rid of the allocations. It's
+ * unlikely we'll ever support more than 32 columns as that'd
+ * result in poor precision, huge histograms (splitting each
+ * dimension once would mean 2^32 buckets), and very expensive
+ * estimation. MCVItem already does it this way.
+ *
+ * Update: Actually, this is not 100% true, because we're splitting
+ * a single bucket, not all the buckets at the same time. So each
+ * split simply adds one new bucket, and we choose the bucket that
+ * is most in need of a split. So even with 32 columns this might
+ * give reasonable accuracy, maybe? After 1000 splits we'll get
+ * 1001 buckets, and some may be quite large (if that area of the
+ * value space has a low frequency of tuples).
+ *
+ * There are other challenges though - e.g. with this many columns
+ * it's more likely to reference both label/non-label columns,
+ * which is rather quirky (especially with histograms).
+ *
+ * However, while this would save some space for histograms built
+ * on many columns, it won't save anything for up to 4 columns
+ * (actually, on less than 3 columns it's probably wasteful).
+ *
+ * TODO Maybe the distinct stats (both for combination of all columns
+ * and for combinations of various subsets of columns) should be
+ * moved to a separate structure (next to histogram/MCV/...) to
+ * make it useful even without a histogram computed etc.
+ */
+
+static MVBucket create_initial_mv_bucket(int numrows, HeapTuple *rows,
+ int2vector *attrs,
+ VacAttrStats **stats);
+
+static MVBucket select_bucket_to_partition(int nbuckets, MVBucket * buckets);
+
+static MVBucket partition_bucket(MVBucket bucket, int2vector *attrs,
+ VacAttrStats **stats,
+ int *ndistvalues, Datum **distvalues);
+
+static MVBucket copy_mv_bucket(MVBucket bucket, uint32 ndimensions);
+
+static void update_bucket_ndistinct(MVBucket bucket, int2vector *attrs,
+ VacAttrStats ** stats);
+
+static void update_dimension_ndistinct(MVBucket bucket, int dimension,
+ int2vector *attrs,
+ VacAttrStats ** stats,
+ bool update_boundaries);
+
+static void create_null_buckets(MVHistogram histogram, int bucket_idx,
+ int2vector *attrs, VacAttrStats ** stats);
+
+static int bsearch_comparator(const void * a, const void * b);
+
+/*
+ * Each serialized bucket needs to store (in this order):
+ *
+ * - number of tuples (float)
+ * - min inclusive flags (ndim * sizeof(bool))
+ * - max inclusive flags (ndim * sizeof(bool))
+ * - null dimension flags (ndim * sizeof(bool))
+ * - min boundary indexes (ndim * sizeof(uint16))
+ * - max boundary indexes (ndim * sizeof(uint16))
+ *
+ * So in total:
+ *
+ * ndim * (2 * sizeof(uint16) + 3 * sizeof(bool)) + sizeof(float)
+ */
+#define BUCKET_SIZE(ndims) \
+ (ndims * (2 * sizeof(uint16) + 3 * sizeof(bool)) + sizeof(float))
+
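+/*
+ * For example, with the usual type sizes (2B uint16, 1B bool, 4B float),
+ * a bucket in a two-dimensional histogram takes BUCKET_SIZE(2)
+ * = 2 * (2*2 + 3*1) + 4 = 18 bytes.
+ */
+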
+/* pointers into a flat serialized bucket of BUCKET_SIZE(n) bytes */
+#define BUCKET_NTUPLES(b) ((float*)b)
+#define BUCKET_MIN_INCL(b,n) ((bool*)(b + sizeof(float)))
+#define BUCKET_MAX_INCL(b,n) (BUCKET_MIN_INCL(b,n) + n)
+#define BUCKET_NULLS_ONLY(b,n) (BUCKET_MAX_INCL(b,n) + n)
+#define BUCKET_MIN_INDEXES(b,n) ((uint16*)(BUCKET_NULLS_ONLY(b,n) + n))
+#define BUCKET_MAX_INDEXES(b,n) ((BUCKET_MIN_INDEXES(b,n) + n))
+
+/* can't split bucket with less than 10 rows */
+#define MIN_BUCKET_ROWS 10
+
+/* some debugging methods */
+#ifdef MVSTATS_DEBUG
+static void print_mv_histogram_info(MVHistogram histogram);
+#endif
+
+/*
+ * Data used while building the histogram.
+ */
+typedef struct HistogramBuildData {
+
+ float ndistinct; /* frequency of distinct values */
+
+ HeapTuple *rows; /* array of sample rows */
+ uint32 numrows; /* number of sample rows (array size) */
+
+ /* index of the dimension by which the bucket was previously split */
+ int last_split_dimension;
+
+ /*
+ * Number of distinct values in each dimension. This is used when
+ * building the histogram (and is not serialized/deserialized).
+ *
+ * XXX Maybe it could be useful for improving ndistinct estimates for
+ * combinations of columns (e.g. in GROUP BY queries). It would
+ * probably mean tracking 2^N values for each bucket, and even if
+ * those values might be stored in 1B (which is unlikely), it's
+ * still a lot of space (considering the expected number of
+ * buckets). So maybe that might be tracked just at the top level.
+ *
+ * TODO Consider tracking ndistincts for all attribute combinations.
+ */
+ uint32 *ndistincts;
+
+} HistogramBuildData;
+
+typedef HistogramBuildData *HistogramBuild;
+
+/*
+ * Building a multivariate histogram. In short, this first creates a
+ * single bucket containing all the rows, and then repeatedly splits it,
+ * each time searching for the bucket / dimension most in need of a split.
+ *
+ * The current criterion is rather simple, looking at the number of
+ * distinct values (combinations of column values for a bucket, column
+ * values for a dimension). This is somewhat naive, but seems to work
+ * quite well. See the discussion at select_bucket_to_partition and
+ * partition_bucket for more details about alternative algorithms.
+ *
+ * So the current algorithm looks like this:
+ *
+ * build NULL-buckets (create_null_buckets)
+ *
+ * while [not reaching maximum number of buckets]
+ *
+ * choose bucket to partition (max distinct combinations)
+ * if no bucket to partition
+ * terminate the algorithm
+ *
+ * choose bucket dimension to partition (max distinct values)
+ * split the bucket into two buckets
+ */
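+
+/*
+ * Note that the histogram starts with a single bucket, and each split
+ * adds exactly one more bucket, so building a histogram with N buckets
+ * takes (N-1) splits.
+ */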
+MVHistogram
+build_mv_histogram(int numrows, HeapTuple *rows, int2vector *attrs,
+ VacAttrStats **stats, int numrows_total)
+{
+ int i;
+ int numattrs = attrs->dim1;
+
+ int *ndistvalues;
+ Datum **distvalues;
+
+ MVHistogram histogram = (MVHistogram)palloc0(sizeof(MVHistogramData));
+
+ HeapTuple * rows_copy = (HeapTuple*)palloc0(numrows * sizeof(HeapTuple));
+ memcpy(rows_copy, rows, sizeof(HeapTuple) * numrows);
+
+ Assert((numattrs >= 2) && (numattrs <= MVSTATS_MAX_DIMENSIONS));
+
+ histogram->ndimensions = numattrs;
+
+ histogram->magic = MVSTAT_HIST_MAGIC;
+ histogram->type = MVSTAT_HIST_TYPE_BASIC;
+ histogram->nbuckets = 1;
+
+ /* create max buckets (better than repalloc for short-lived objects) */
+ histogram->buckets
+ = (MVBucket*)palloc0(MVSTAT_HIST_MAX_BUCKETS * sizeof(MVBucket));
+
+ /* create the initial bucket, covering the whole sample set */
+ histogram->buckets[0]
+ = create_initial_mv_bucket(numrows, rows_copy, attrs, stats);
+
+ /*
+ * Collect info on distinct values in each dimension (used later
+ * to select dimension to partition).
+ */
+ ndistvalues = (int*)palloc0(sizeof(int) * numattrs);
+ distvalues = (Datum**)palloc0(sizeof(Datum*) * numattrs);
+
+ for (i = 0; i < numattrs; i++)
+ {
+ int j;
+ int nvals;
+ Datum *tmp;
+
+ SortSupportData ssup;
+ StdAnalyzeData *mystats = (StdAnalyzeData *) stats[i]->extra_data;
+
+ /* initialize sort support, etc. */
+ memset(&ssup, 0, sizeof(ssup));
+ ssup.ssup_cxt = CurrentMemoryContext;
+
+ /* We always use the default collation for statistics */
+ ssup.ssup_collation = DEFAULT_COLLATION_OID;
+ ssup.ssup_nulls_first = false;
+
+ PrepareSortSupportFromOrderingOp(mystats->ltopr, &ssup);
+
+ nvals = 0;
+ tmp = (Datum*)palloc0(sizeof(Datum) * numrows);
+
+ for (j = 0; j < numrows; j++)
+ {
+ bool isnull;
+
+ /* fetch the value of the attribute for this sample row */
+ Datum value = heap_getattr(rows[j], attrs->values[i],
+ stats[i]->tupDesc, &isnull);
+
+ if (isnull)
+ continue;
+
+ tmp[nvals++] = value;
+ }
+
+ /* do the sort and stuff only if there are non-NULL values */
+ if (nvals > 0)
+ {
+ /* sort the array of values */
+ qsort_arg((void *) tmp, nvals, sizeof(Datum),
+ compare_scalars_simple, (void *) &ssup);
+
+ /* count distinct values */
+ ndistvalues[i] = 1;
+ for (j = 1; j < nvals; j++)
+ if (compare_scalars_simple(&tmp[j], &tmp[j-1], &ssup) != 0)
+ ndistvalues[i] += 1;
+
+ /* allocate space for the distinct values (counted above) */
+ distvalues[i] = (Datum*)palloc0(sizeof(Datum) * ndistvalues[i]);
+
+ /* now collect distinct values into the array */
+ distvalues[i][0] = tmp[0];
+ ndistvalues[i] = 1;
+
+ for (j = 1; j < nvals; j++)
+ {
+ if (compare_scalars_simple(&tmp[j], &tmp[j-1], &ssup) != 0)
+ {
+ distvalues[i][ndistvalues[i]] = tmp[j];
+ ndistvalues[i] += 1;
+ }
+ }
+ }
+
+ pfree(tmp);
+ }
+
+ /*
+ * The initial bucket may contain NULL values, so we have to create
+ * buckets with NULL-only dimensions.
+ *
+ * FIXME We may need up to 2^ndims buckets - check that there are
+ * enough buckets (MVSTAT_HIST_MAX_BUCKETS >= 2^ndims).
+ */
+ create_null_buckets(histogram, 0, attrs, stats);
+
+ while (histogram->nbuckets < MVSTAT_HIST_MAX_BUCKETS)
+ {
+ MVBucket bucket = select_bucket_to_partition(histogram->nbuckets,
+ histogram->buckets);
+
+ /* no more buckets to partition */
+ if (bucket == NULL)
+ break;
+
+ histogram->buckets[histogram->nbuckets]
+ = partition_bucket(bucket, attrs, stats,
+ ndistvalues, distvalues);
+
+ histogram->nbuckets += 1;
+ }
+
+ /* finalize the frequencies etc. */
+ for (i = 0; i < histogram->nbuckets; i++)
+ {
+ HistogramBuild build_data = ((HistogramBuild)histogram->buckets[i]->build_data);
+ histogram->buckets[i]->ntuples
+ = (build_data->numrows * 1.0) / numrows_total;
+ }
+
+ return histogram;
+}
+
+/* fetch the histogram from the pg_mv_statistic catalog and deserialize it */
+MVHistogram
+load_mv_histogram(Oid mvoid)
+{
+ bool isnull = false;
+ Datum histogram;
+
+#ifdef USE_ASSERT_CHECKING
+ Form_pg_mv_statistic mvstat;
+#endif
+
+ /* Fetch the pg_mv_statistic tuple for this statistics OID. */
+ HeapTuple htup = SearchSysCache1(MVSTATOID, ObjectIdGetDatum(mvoid));
+
+ if (! HeapTupleIsValid(htup))
+ return NULL;
+
+#ifdef USE_ASSERT_CHECKING
+ mvstat = (Form_pg_mv_statistic) GETSTRUCT(htup);
+ Assert(mvstat->hist_enabled && mvstat->hist_built);
+#endif
+
+ histogram = SysCacheGetAttr(MVSTATOID, htup,
+ Anum_pg_mv_statistic_stahist, &isnull);
+
+ Assert(!isnull);
+
+ ReleaseSysCache(htup);
+
+ return deserialize_mv_histogram(DatumGetByteaP(histogram));
+}
+
+/* fetch the histogram from the catalog, in a partially-serialized form */
+MVSerializedHistogram
+load_mv_histogram2(Oid mvoid)
+{
+ bool isnull = false;
+ Datum histogram;
+
+#ifdef USE_ASSERT_CHECKING
+ Form_pg_mv_statistic mvstat;
+#endif
+
+ /* Fetch the pg_mv_statistic tuple for this statistics OID. */
+ HeapTuple htup = SearchSysCache1(MVSTATOID, ObjectIdGetDatum(mvoid));
+
+ if (! HeapTupleIsValid(htup))
+ return NULL;
+
+#ifdef USE_ASSERT_CHECKING
+ mvstat = (Form_pg_mv_statistic) GETSTRUCT(htup);
+ Assert(mvstat->hist_enabled && mvstat->hist_built);
+#endif
+
+ histogram = SysCacheGetAttr(MVSTATOID, htup,
+ Anum_pg_mv_statistic_stahist, &isnull);
+
+ Assert(!isnull);
+
+ ReleaseSysCache(htup);
+
+ return deserialize_mv_histogram_2(DatumGetByteaP(histogram));
+}
+
+/* print some basic info about the histogram */
+Datum
+pg_mv_stats_histogram_info(PG_FUNCTION_ARGS)
+{
+ bytea *data = PG_GETARG_BYTEA_P(0);
+ char *result;
+
+ MVHistogram hist = deserialize_mv_histogram(data);
+
+ result = palloc0(128);
+ snprintf(result, 128, "nbuckets=%d", hist->nbuckets);
+
+ PG_RETURN_TEXT_P(cstring_to_text(result));
+}
+
+
+/*
+ * Used to pass the sort-support context into bsearch_comparator, as
+ * bsearch() does not accept a context argument - this has to be set
+ * before each bsearch() call.
+ */
+static SortSupport ssup_private = NULL;
+
+/*
+ * Serialize the MV histogram into a bytea value. The basic algorithm
+ * is simple, and mostly mimics the MCV serialization:
+ *
+ * (1) perform deduplication for each attribute (separately)
+ * (a) collect all (non-NULL) attribute values from all buckets
+ * (b) sort the data (using 'lt' from VacAttrStats)
+ * (c) remove duplicate values from the array
+ *
+ * (2) serialize the arrays into a bytea value
+ *
+ * (3) process all buckets
+ * (a) replace min/max values with indexes into the arrays
+ *
+ * Each attribute has to be processed separately, because we're mixing
+ * different datatypes, and we don't know what equality means for them.
+ * We're also mixing pass-by-value and pass-by-ref types, and so on.
+ *
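+ * For example, boundary values (3, 7, 7, 11, 3, 11) in one dimension
+ * deduplicate into the sorted array (3, 7, 11), and the bucket
+ * boundaries are then stored as indexes 0, 1 and 2 into this array.
+ *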
+ * We use 16-bit (uint16) values for the indexes in step (3), which is
+ * sufficient as we don't allow more than 8k buckets in the histogram
+ * (and even 16k buckets would still fit into a signed 16-bit value).
+ *
+ *
+ * Deduplication in serialization
+ * ------------------------------
+ * The deduplication is very effective and important here, because every
+ * time we split a bucket, we keep all the boundary values, except for
+ * the dimension that was used for the split. Another way to look at
+ * this is that each split introduces 1 new value (the value used to do
+ * the split). A histogram with M buckets was created by (M-1) splits
+ * of the initial bucket, and each bucket has 2*N boundary values. So
+ * assuming the initial bucket does not have any 'collapsed' dimensions,
+ * the number of distinct values is
+ *
+ * (2*N + (M-1))
+ *
+ * but the total number of boundary values is
+ *
+ * 2*N*M
+ *
+ * which is clearly much higher. For a histogram on two columns, with
+ * 1024 buckets, it's 1027 vs. 4096. Of course, we're not saving all
+ * the difference (because we still store 16-bit indexes into the values).
+ * But with large values (e.g. stored as varlena), this saves a lot.
+ *
+ * An interesting feature is that the total number of distinct values
+ * does not really grow with the number of dimensions, except for the
+ * size of the initial bucket. After that it only depends on number of
+ * buckets (i.e. number of splits).
+ *
+ * XXX Of course this only holds for the current histogram building
+ * algorithm. Algorithms doing the splits differently (e.g.
+ * producing overlapping buckets) may behave differently.
+ *
+ * TODO This only confirms we can use the uint16 indexes. The worst
+ * that could happen is if all the splits happened by a single
+ * dimension. To exhaust the uint16 this would require ~64k
+ * splits (needs to be reflected in MVSTAT_HIST_MAX_BUCKETS).
+ *
+ * TODO We don't need to use a separate boolean for each flag, instead
+ * use a single char and set bits.
+ *
+ * TODO We might get a bit better compression by considering the actual
+ * data type length. The current implementation treats all data
+ * types passed by value as requiring 8B, but for INT it's actually
+ * just 4B etc.
+ *
+ * OTOH this is only related to the lookup table, and most of the
+ * space is occupied by the buckets (with int16 indexes).
+ *
+ *
+ * Varlena compression
+ * -------------------
+ * This encoding may prevent automatic varlena compression (similarly
+ * to JSONB), because the first part of the serialized bytea will be an
+ * array of unique values (although sorted), and pglz decides whether
+ * to compress by trying to compress the first part (~1kB or so), which
+ * is likely to compress poorly, due to the lack of repetition.
+ *
+ * One possible cure to that might be storing the buckets first, and
+ * then the deduplicated arrays. The buckets might be better suited
+ * for compression.
+ *
+ * On the other hand the encoding scheme is a context-aware compression,
+ * usually compressing to ~30% (or less, with large data types). So the
+ * lack of pglz compression may be OK.
+ *
+ * XXX But maybe we don't really want to compress this, to save on
+ * planning time?
+ *
+ * TODO Try storing the buckets / deduplicated arrays in reverse order,
+ * measure impact on compression.
+ *
+ *
+ * Deserialization
+ * ---------------
+ * The deserialization is currently implemented so that it reconstructs
+ * the histogram back into the same structures - this involves quite
+ * a few memcpy() and palloc() calls, but maybe we could create a
+ * special structure for the serialized histogram, and access the data
+ * directly, without the unpacking.
+ *
+ * Not only would it save some memory and CPU time, it might actually
+ * work better with CPU caches (by not polluting the caches).
+ *
+ * TODO Try to keep the compressed form, instead of deserializing it to
+ * MVHistogram/MVBucket.
+ *
+ *
+ * General TODOs
+ * -------------
+ * FIXME This probably leaks memory, or at least uses it inefficiently
+ * (many small palloc() calls instead of a large one).
+ *
+ * TODO Consider packing boolean flags (NULL) for each item into 'char'
+ * or a longer type (instead of using an array of bool items).
+ */
+bytea *
+serialize_mv_histogram(MVHistogram histogram, int2vector *attrs,
+ VacAttrStats **stats)
+{
+ int i = 0, j = 0;
+ Size total_length = 0;
+
+ bytea *output = NULL;
+ char *data = NULL;
+
+ int nbuckets = histogram->nbuckets;
+ int ndims = histogram->ndimensions;
+
+ /* allocated for serialized bucket data */
+ int bucketsize = BUCKET_SIZE(ndims);
+ char *bucket = palloc0(bucketsize);
+
+ /* values per dimension (and number of non-NULL values) */
+ Datum **values = (Datum**)palloc0(sizeof(Datum*) * ndims);
+ int *counts = (int*)palloc0(sizeof(int) * ndims);
+
+ /* info about dimensions (for deserialize) */
+ DimensionInfo * info
+ = (DimensionInfo *)palloc0(sizeof(DimensionInfo)*ndims);
+
+ /* sort support data */
+ SortSupport ssup = (SortSupport)palloc0(sizeof(SortSupportData)*ndims);
+
+ /* collect and deduplicate values for each dimension separately */
+ for (i = 0; i < ndims; i++)
+ {
+ int count;
+ StdAnalyzeData *tmp = (StdAnalyzeData *)stats[i]->extra_data;
+
+ /* keep important info about the data type */
+ info[i].typlen = stats[i]->attrtype->typlen;
+ info[i].typbyval = stats[i]->attrtype->typbyval;
+
+ /*
+ * Allocate space for all min/max values, including NULLs
+ * (we won't use them, but we don't know how many there are),
+ * and then collect all non-NULL values.
+ */
+ values[i] = (Datum*)palloc0(sizeof(Datum) * nbuckets * 2);
+
+ for (j = 0; j < histogram->nbuckets; j++)
+ {
+ /* skip buckets where this dimension is NULL-only */
+ if (! histogram->buckets[j]->nullsonly[i])
+ {
+ values[i][counts[i]] = histogram->buckets[j]->min[i];
+ counts[i] += 1;
+
+ values[i][counts[i]] = histogram->buckets[j]->max[i];
+ counts[i] += 1;
+ }
+ }
+
+ /* there are just NULL values in this dimension */
+ if (counts[i] == 0)
+ continue;
+
+ /* sort and deduplicate */
+ ssup[i].ssup_cxt = CurrentMemoryContext;
+ ssup[i].ssup_collation = DEFAULT_COLLATION_OID;
+ ssup[i].ssup_nulls_first = false;
+
+ PrepareSortSupportFromOrderingOp(tmp->ltopr, &ssup[i]);
+
+ qsort_arg(values[i], counts[i], sizeof(Datum),
+ compare_scalars_simple, &ssup[i]);
+
+ /*
+ * Walk through the array and eliminate duplicate values, but
+ * keep the ordering (so that we can do bsearch later). We know
+ * there's at least 1 item, so we can skip the first element.
+ */
+ count = 1; /* number of deduplicated items */
+ for (j = 1; j < counts[i]; j++)
+ {
+ /* if it's different from the previous value, we need to keep it */
+ if (compare_datums_simple(values[i][j-1], values[i][j], &ssup[i]) != 0)
+ {
+ /* XXX: not needed if (count == j) */
+ values[i][count] = values[i][j];
+ count += 1;
+ }
+ }
+
+ /* make sure we fit into uint16 */
+ Assert(count <= UINT16_MAX);
+
+ /* keep info about the deduplicated count */
+ info[i].nvalues = count;
+
+ /* compute size of the serialized data */
+ if (info[i].typlen > 0)
+ /* byval or byref, but with fixed length (name, tid, ...) */
+ info[i].nbytes = info[i].nvalues * info[i].typlen;
+ else if (info[i].typlen == -1)
+ /* varlena, so just use VARSIZE_ANY */
+ for (j = 0; j < info[i].nvalues; j++)
+ info[i].nbytes += VARSIZE_ANY(values[i][j]);
+ else if (info[i].typlen == -2)
+ /* cstring, so simply strlen */
+ for (j = 0; j < info[i].nvalues; j++)
+ info[i].nbytes += strlen(DatumGetPointer(values[i][j]));
+ else
+ elog(ERROR, "unknown data type typbyval=%d typlen=%d",
+ info[i].typbyval, info[i].typlen);
+ }
+
+ /*
+ * Now we finally know how much space we'll need for the serialized
+ * histogram, as it contains these fields:
+ *
+ * - length (4B) for varlena
+ * - magic (4B)
+ * - type (4B)
+ * - ndimensions (4B)
+ * - nbuckets (4B)
+ * - info (ndim * sizeof(DimensionInfo)
+ * - arrays of values for each dimension
+ * - serialized buckets (nbuckets * bucketsize)
+ *
+ * So the 'header' size is 20B + ndim * sizeof(DimensionInfo) and
+ * then we'll place the data (and buckets).
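+ *
+ * For a 2-column histogram this means 20B + 2 * sizeof(DimensionInfo)
+ * of header, followed by the two deduplicated value arrays and then
+ * nbuckets * BUCKET_SIZE(2) bytes of serialized buckets.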
+ */
+ total_length = (sizeof(int32) + offsetof(MVHistogramData, buckets)
+ + ndims * sizeof(DimensionInfo)
+ + nbuckets * bucketsize);
+
+ /* account for the deduplicated data */
+ for (i = 0; i < ndims; i++)
+ total_length += info[i].nbytes;
+
+ /* enforce an arbitrary limit of 10MB */
+ if (total_length > (10 * 1024 * 1024))
+ elog(ERROR, "serialized histogram exceeds 10MB (%ld > %d)",
+ total_length, (10 * 1024 * 1024));
+
+ /* allocate space for the serialized histogram list, set header */
+ output = (bytea*)palloc0(total_length);
+ SET_VARSIZE(output, total_length);
+
+ /* we'll use 'data' to keep track of the place to write data */
+ data = VARDATA(output);
+
+ memcpy(data, histogram, offsetof(MVHistogramData, buckets));
+ data += offsetof(MVHistogramData, buckets);
+
+ memcpy(data, info, sizeof(DimensionInfo) * ndims);
+ data += sizeof(DimensionInfo) * ndims;
+
+ /* value array for each dimension */
+ for (i = 0; i < ndims; i++)
+ {
+#ifdef USE_ASSERT_CHECKING
+ char *tmp = data;
+#endif
+ for (j = 0; j < info[i].nvalues; j++)
+ {
+ if (info[i].typlen > 0)
+ {
+ /* passed by value or by reference, but fixed length */
+ memcpy(data, &values[i][j], info[i].typlen);
+ data += info[i].typlen;
+ }
+ else if (info[i].typlen == -1)
+ {
+ /* varlena */
+ memcpy(data, DatumGetPointer(values[i][j]),
+ VARSIZE_ANY(values[i][j]));
+ data += VARSIZE_ANY(values[i][j]);
+ }
+ else if (info[i].typlen == -2)
+ {
+ /* cstring (don't forget the \0 terminator!) */
+ memcpy(data, DatumGetPointer(values[i][j]),
+ strlen(DatumGetPointer(values[i][j])) + 1);
+ data += strlen(DatumGetPointer(values[i][j])) + 1;
+ }
+ }
+ Assert((data - tmp) == info[i].nbytes);
+ }
+
+ /* and finally, the histogram buckets */
+ for (i = 0; i < nbuckets; i++)
+ {
+ /* don't write beyond the allocated space */
+ Assert(data <= (char*)output + total_length - bucketsize);
+
+ /* reset the values for each item */
+ memset(bucket, 0, bucketsize);
+
+ *BUCKET_NTUPLES(bucket) = histogram->buckets[i]->ntuples;
+
+ for (j = 0; j < ndims; j++)
+ {
+ /* do the lookup only for non-NULL values */
+ if (! histogram->buckets[i]->nullsonly[j])
+ {
+ uint16 idx;
+ Datum * v = NULL;
+ ssup_private = &ssup[j];
+
+ /* min boundary */
+ v = (Datum*)bsearch(&histogram->buckets[i]->min[j],
+ values[j], info[j].nvalues, sizeof(Datum),
+ bsearch_comparator);
+
+ if (v == NULL)
+ elog(ERROR, "value for dim %d not found in array", j);
+
+ /* compute index within the array */
+ idx = (v - values[j]);
+
+ Assert((idx >= 0) && (idx < info[j].nvalues));
+
+ BUCKET_MIN_INDEXES(bucket, ndims)[j] = idx;
+
+ /* max boundary */
+ v = (Datum*)bsearch(&histogram->buckets[i]->max[j],
+ values[j], info[j].nvalues, sizeof(Datum),
+ bsearch_comparator);
+
+ if (v == NULL)
+ elog(ERROR, "value for dim %d not found in array", j);
+
+ /* compute index within the array */
+ idx = (v - values[j]);
+
+ Assert((idx >= 0) && (idx < info[j].nvalues));
+
+ BUCKET_MAX_INDEXES(bucket, ndims)[j] = idx;
+ }
+ }
+
+ /* copy flags (nulls, min/max inclusive) */
+ memcpy(BUCKET_NULLS_ONLY(bucket, ndims),
+ histogram->buckets[i]->nullsonly, sizeof(bool) * ndims);
+
+ memcpy(BUCKET_MIN_INCL(bucket, ndims),
+ histogram->buckets[i]->min_inclusive, sizeof(bool) * ndims);
+
+ memcpy(BUCKET_MAX_INCL(bucket, ndims),
+ histogram->buckets[i]->max_inclusive, sizeof(bool) * ndims);
+
+ /* copy the item into the array */
+ memcpy(data, bucket, bucketsize);
+
+ data += bucketsize;
+ }
+
+ /* at this point we expect to match the total_length exactly */
+ Assert((data - (char*)output) == total_length);
+
+ /* FIXME free the values/counts arrays here */
+
+ return output;
+}
+
+/*
+ * Reverse of serialize_mv_histogram() - expands the serialized form
+ * back into MVHistogram / MVBucket structures.
+ */
+MVHistogram
+deserialize_mv_histogram(bytea * data)
+{
+ int i = 0, j = 0;
+
+ Size expected_size;
+ char *tmp = NULL;
+ Datum **values = NULL;
+
+ MVHistogram histogram;
+ DimensionInfo *info;
+
+ int nbuckets;
+ int ndims;
+ int bucketsize;
+
+ /* temporary deserialization buffer */
+ int bufflen;
+ char *buff;
+ char *ptr;
+
+ /* temporary deserialization buffer */
+ int rbufflen;
+ char *rbuff;
+ char *rptr;
+
+ if (data == NULL)
+ return NULL;
+
+ if (VARSIZE_ANY_EXHDR(data) < offsetof(MVHistogramData,buckets))
+ elog(ERROR, "invalid histogram size %ld (expected at least %ld)",
+ VARSIZE_ANY_EXHDR(data), offsetof(MVHistogramData,buckets));
+
+ /* read the histogram header */
+ histogram = (MVHistogram)palloc(sizeof(MVHistogramData));
+
+ /* initialize pointer to the data part (skip the varlena header) */
+ tmp = VARDATA(data);
+
+ /* get the header and perform basic sanity checks */
+ memcpy(histogram, tmp, offsetof(MVHistogramData, buckets));
+ tmp += offsetof(MVHistogramData, buckets);
+
+ if (histogram->magic != MVSTAT_HIST_MAGIC)
+ elog(ERROR, "invalid histogram magic %d (expected %dd)",
+ histogram->magic, MVSTAT_HIST_MAGIC);
+
+ if (histogram->type != MVSTAT_HIST_TYPE_BASIC)
+ elog(ERROR, "invalid histogram type %d (expected %dd)",
+ histogram->type, MVSTAT_HIST_TYPE_BASIC);
+
+ nbuckets = histogram->nbuckets;
+ ndims = histogram->ndimensions;
+ bucketsize = BUCKET_SIZE(ndims);
+
+ Assert((nbuckets > 0) && (nbuckets <= MVSTAT_HIST_MAX_BUCKETS));
+ Assert((ndims >= 2) && (ndims <= MVSTATS_MAX_DIMENSIONS));
+
+ /*
+ * Compute the size we expect with these parameters. It's incomplete
+ * for now, as we have yet to add the array sizes (from the
+ * DimensionInfo records).
+ */
+ expected_size = offsetof(MVHistogramData,buckets) +
+ ndims * sizeof(DimensionInfo) +
+ (nbuckets * bucketsize);
+
+ /* check that we have at least the DimensionInfo records */
+ if (VARSIZE_ANY_EXHDR(data) < expected_size)
+ elog(ERROR, "invalid histogram size %ld (expected %ld)",
+ VARSIZE_ANY_EXHDR(data), expected_size);
+
+ info = (DimensionInfo*)(tmp);
+ tmp += ndims * sizeof(DimensionInfo);
+
+ /* account for the value arrays */
+ for (i = 0; i < ndims; i++)
+ expected_size += info[i].nbytes;
+
+ if (VARSIZE_ANY_EXHDR(data) != expected_size)
+ elog(ERROR, "invalid histogram size %ld (expected %ld)",
+ VARSIZE_ANY_EXHDR(data), expected_size);
+
+ /* looks OK - not corrupted or something */
+
+ /*
+ * We'll allocate one large chunk of memory for the intermediate
+ * data, needed only for deserializing the histogram, and we'll use
+ * a local dense allocation to minimize the palloc overhead.
+ *
+ * Let's see how much space we'll actually need, and also include
+ * space for the array with pointers.
+ */
+ bufflen = sizeof(Datum*) * ndims; /* space for pointers */
+
+ for (i = 0; i < ndims; i++)
+ /* don't allocate space for byval types, matching Datum */
+ if (! (info[i].typbyval && (info[i].typlen == sizeof(Datum))))
+ bufflen += (sizeof(Datum) * info[i].nvalues);
+
+ buff = palloc(bufflen);
+ ptr = buff;
+
+ values = (Datum**)buff;
+ ptr += (sizeof(Datum*) * ndims);
+
+ /*
+ * FIXME This uses pointers to the original data array (the types
+ * not passed by value), so when someone frees the memory,
+ * e.g. by doing something like this:
+ *
+ * bytea * data = ... fetch the data from catalog ...
+ * MVHistogram histogram = deserialize_mv_histogram(data);
+ * pfree(data);
+ *
+ * then 'histogram' references the freed memory. This needs to
+ * copy the pieces.
+ *
+ * TODO same as in MCV deserialization / consider moving to common.c
+ */
+ for (i = 0; i < ndims; i++)
+ {
+ if (info[i].typbyval && (info[i].typlen == sizeof(Datum)))
+ {
+ /* passed by value and Datum-sized - simply reuse the array */
+ values[i] = (Datum*)tmp;
+ tmp += info[i].nbytes;
+ }
+ else
+ {
+ /* everything else needs a chunk from the local buffer */
+ values[i] = (Datum*)ptr;
+ ptr += (sizeof(Datum) * info[i].nvalues);
+
+ if (info[i].typbyval)
+ {
+ /* passed by value, but smaller than Datum */
+ for (j = 0; j < info[i].nvalues; j++)
+ {
+ /* copy the value into a full Datum */
+ memcpy(&values[i][j], tmp, info[i].typlen);
+ tmp += info[i].typlen;
+ }
+ }
+ else if (info[i].typlen > 0)
+ {
+ /* passed by reference, but fixed length (name, tid, ...) */
+ for (j = 0; j < info[i].nvalues; j++)
+ {
+ /* just point into the array */
+ values[i][j] = PointerGetDatum(tmp);
+ tmp += info[i].typlen;
+ }
+ }
+ else if (info[i].typlen == -1)
+ {
+ /* varlena */
+ for (j = 0; j < info[i].nvalues; j++)
+ {
+ /* just point into the array */
+ values[i][j] = PointerGetDatum(tmp);
+ tmp += VARSIZE_ANY(tmp);
+ }
+ }
+ else if (info[i].typlen == -2)
+ {
+ /* cstring */
+ for (j = 0; j < info[i].nvalues; j++)
+ {
+ /* just point into the array */
+ values[i][j] = PointerGetDatum(tmp);
+ tmp += (strlen(tmp) + 1); /* don't forget the \0 */
+ }
+ }
+ }
+ }
+
+ /* we should exhaust the buffer exactly */
+ Assert((ptr - buff) == bufflen);
+
+ /* allocate space for the histogram buckets in a single piece */
+ rbufflen = (sizeof(MVBucket) + sizeof(MVBucketData) +
+ (2 * sizeof(Datum) + 3 * sizeof(bool)) * ndims) * nbuckets;
+
+ rbuff = palloc(rbufflen);
+ rptr = rbuff;
+
+ histogram->buckets = (MVBucket*)rbuff;
+ rptr += (sizeof(MVBucket) * nbuckets);
+
+ for (i = 0; i < nbuckets; i++)
+ {
+
+ MVBucket bucket = (MVBucket)rptr;
+ rptr += sizeof(MVBucketData);
+
+ bucket->nullsonly = (bool*)rptr;
+ rptr += (sizeof(bool) * ndims);
+
+ bucket->min_inclusive = (bool*)rptr;
+ rptr += (sizeof(bool) * ndims);
+
+ bucket->max_inclusive = (bool*)rptr;
+ rptr += (sizeof(bool) * ndims);
+
+ bucket->min = (Datum*) rptr;
+ rptr += (sizeof(Datum) * ndims);
+
+ bucket->max = (Datum*) rptr;
+ rptr += (sizeof(Datum) * ndims);
+
+ bucket->ntuples = *BUCKET_NTUPLES(tmp);
+
+ memcpy(bucket->nullsonly, BUCKET_NULLS_ONLY(tmp, ndims),
+ sizeof(bool) * ndims);
+
+ memcpy(bucket->min_inclusive, BUCKET_MIN_INCL(tmp, ndims),
+ sizeof(bool) * ndims);
+
+ memcpy(bucket->max_inclusive, BUCKET_MAX_INCL(tmp, ndims),
+ sizeof(bool) * ndims);
+
+ /* translate the indexes to values */
+ for (j = 0; j < ndims; j++)
+ {
+ if (! bucket->nullsonly[j])
+ {
+ bucket->min[j] = values[j][BUCKET_MIN_INDEXES(tmp, ndims)[j]];
+ bucket->max[j] = values[j][BUCKET_MAX_INDEXES(tmp, ndims)[j]];
+ }
+ }
+
+ histogram->buckets[i] = bucket;
+
+ Assert(tmp <= (char*)data + VARSIZE_ANY(data));
+
+ tmp += bucketsize;
+ }
+
+ /* at this point we expect to match the total_length exactly */
+ Assert((tmp - VARDATA(data)) == expected_size);
+
+ pfree(buff);
+
+ return histogram;
+}
+
+
+
+/*
+ * Returns histogram in a partially-serialized form (keeps the boundary
+ * values deduplicated, so that it's possible to optimize the estimation
+ * part by caching function call results between buckets etc.).
+ */
+MVSerializedHistogram
+deserialize_mv_histogram_2(bytea * data)
+{
+ int i = 0, j = 0;
+
+ Size expected_size;
+ char *tmp = NULL;
+
+ MVSerializedHistogram histogram;
+ DimensionInfo *info;
+
+ int nbuckets;
+ int ndims;
+ int bucketsize;
+
+ /* temporary deserialization buffer */
+ int bufflen;
+ char *buff;
+ char *ptr;
+
+ if (data == NULL)
+ return NULL;
+
+ if (VARSIZE_ANY_EXHDR(data) < offsetof(MVSerializedHistogramData,buckets))
+ elog(ERROR, "invalid histogram size %ld (expected at least %ld)",
+ VARSIZE_ANY_EXHDR(data), offsetof(MVSerializedHistogramData,buckets));
+
+ /* read the histogram header */
+ histogram
+ = (MVSerializedHistogram)palloc(sizeof(MVSerializedHistogramData));
+
+ /* initialize pointer to the data part (skip the varlena header) */
+ tmp = VARDATA(data);
+
+ /* get the header and perform basic sanity checks */
+ memcpy(histogram, tmp, offsetof(MVSerializedHistogramData, buckets));
+ tmp += offsetof(MVSerializedHistogramData, buckets);
+
+ if (histogram->magic != MVSTAT_HIST_MAGIC)
+ elog(ERROR, "invalid histogram magic %d (expected %dd)",
+ histogram->magic, MVSTAT_HIST_MAGIC);
+
+ if (histogram->type != MVSTAT_HIST_TYPE_BASIC)
+ elog(ERROR, "invalid histogram type %d (expected %dd)",
+ histogram->type, MVSTAT_HIST_TYPE_BASIC);
+
+ nbuckets = histogram->nbuckets;
+ ndims = histogram->ndimensions;
+ bucketsize = BUCKET_SIZE(ndims);
+
+ Assert((nbuckets > 0) && (nbuckets <= MVSTAT_HIST_MAX_BUCKETS));
+ Assert((ndims >= 2) && (ndims <= MVSTATS_MAX_DIMENSIONS));
+
+ /*
+ * Compute the size we expect with these parameters. It's incomplete
+ * for now, as we have yet to add the array sizes (from the
+ * DimensionInfo records).
+ */
+ expected_size = offsetof(MVSerializedHistogramData,buckets) +
+ ndims * sizeof(DimensionInfo) +
+ (nbuckets * bucketsize);
+
+ /* check that we have at least the DimensionInfo records */
+ if (VARSIZE_ANY_EXHDR(data) < expected_size)
+ elog(ERROR, "invalid histogram size %ld (expected %ld)",
+ VARSIZE_ANY_EXHDR(data), expected_size);
+
+ info = (DimensionInfo*)(tmp);
+ tmp += ndims * sizeof(DimensionInfo);
+
+ /* account for the value arrays */
+ for (i = 0; i < ndims; i++)
+ expected_size += info[i].nbytes;
+
+ if (VARSIZE_ANY_EXHDR(data) != expected_size)
+ elog(ERROR, "invalid histogram size %ld (expected %ld)",
+ VARSIZE_ANY_EXHDR(data), expected_size);
+
+ /* looks OK - not corrupted or something */
+
+ /* now let's allocate a single buffer for all the values and counts */
+
+ bufflen = (sizeof(int) + sizeof(Datum*)) * ndims;
+ for (i = 0; i < ndims; i++)
+ {
+ /* don't allocate space for byval types, matching Datum */
+ if (! (info[i].typbyval && (info[i].typlen == sizeof(Datum))))
+ bufflen += (sizeof(Datum) * info[i].nvalues);
+ }
+
+ /* also, include space for the result, tracking the buckets */
+ bufflen += nbuckets * (
+ sizeof(MVSerializedBucket) + /* bucket pointer */
+ sizeof(MVSerializedBucketData)); /* bucket data */
+
+ buff = palloc(bufflen);
+ ptr = buff;
+
+ histogram->nvalues = (int*)ptr;
+ ptr += (sizeof(int) * ndims);
+
+ histogram->values = (Datum**)ptr;
+ ptr += (sizeof(Datum*) * ndims);
+
+ /*
+ * FIXME This uses pointers to the original data array (the types
+ * not passed by value), so when someone frees the memory,
+ * e.g. by doing something like this:
+ *
+ * bytea * data = ... fetch the data from catalog ...
+ * MVSerializedHistogram hist = deserialize_mv_histogram_2(data);
+ * pfree(data);
+ *
+ * then 'hist' references the freed memory. This needs to
+ * copy the pieces.
+ *
+ * TODO same as in MCV deserialization / consider moving to common.c
+ */
+ for (i = 0; i < ndims; i++)
+ {
+ histogram->nvalues[i] = info[i].nvalues;
+
+ if (info[i].typbyval && info[i].typlen == sizeof(Datum))
+ {
+ /* passed by value / Datum - simply reuse the array */
+ histogram->values[i] = (Datum*)tmp;
+ tmp += info[i].nbytes;
+ }
+ else
+ {
+ /* everything else needs a chunk from the local buffer */
+ histogram->values[i] = (Datum*)ptr;
+ ptr += (sizeof(Datum) * info[i].nvalues);
+
+ if (info[i].typbyval)
+ {
+ /* passed by value, but smaller than Datum */
+ for (j = 0; j < info[i].nvalues; j++)
+ {
+ /* copy the value into a full Datum */
+ memcpy(&histogram->values[i][j], tmp, info[i].typlen);
+ tmp += info[i].typlen;
+ }
+ }
+ else if (info[i].typlen > 0)
+ {
+ /* passed by reference, but fixed length (name, tid, ...) */
+ for (j = 0; j < info[i].nvalues; j++)
+ {
+ /* just point into the array */
+ histogram->values[i][j] = PointerGetDatum(tmp);
+ tmp += info[i].typlen;
+ }
+ }
+ else if (info[i].typlen == -1)
+ {
+ /* varlena */
+ for (j = 0; j < info[i].nvalues; j++)
+ {
+ /* just point into the array */
+ histogram->values[i][j] = PointerGetDatum(tmp);
+ tmp += VARSIZE_ANY(tmp);
+ }
+ }
+ else if (info[i].typlen == -2)
+ {
+ /* cstring */
+ for (j = 0; j < info[i].nvalues; j++)
+ {
+ /* just point into the array */
+ histogram->values[i][j] = PointerGetDatum(tmp);
+ tmp += (strlen(tmp) + 1); /* don't forget the \0 */
+ }
+ }
+ }
+ }
+
+ histogram->buckets = (MVSerializedBucket*)ptr;
+ ptr += (sizeof(MVSerializedBucket) * nbuckets);
+
+ for (i = 0; i < nbuckets; i++)
+ {
+ MVSerializedBucket bucket = (MVSerializedBucket)ptr;
+ ptr += sizeof(MVSerializedBucketData);
+
+ bucket->ntuples = *BUCKET_NTUPLES(tmp);
+ bucket->nullsonly = BUCKET_NULLS_ONLY(tmp, ndims);
+ bucket->min_inclusive = BUCKET_MIN_INCL(tmp, ndims);
+ bucket->max_inclusive = BUCKET_MAX_INCL(tmp, ndims);
+
+ bucket->min = BUCKET_MIN_INDEXES(tmp, ndims);
+ bucket->max = BUCKET_MAX_INDEXES(tmp, ndims);
+
+ histogram->buckets[i] = bucket;
+
+ Assert(tmp <= (char*)data + VARSIZE_ANY(data));
+
+ tmp += bucketsize;
+ }
+
+ /* at this point we expect to match the total_length exactly */
+ Assert((tmp - VARDATA(data)) == expected_size);
+
+ /* we should exhaust the output buffer exactly */
+ Assert((ptr - buff) == bufflen);
+
+ return histogram;
+}
+
+/*
+ * Build the initial bucket, which will then be split into smaller
+ * buckets.
+ *
+ * TODO Add ndistinct estimation, probably the one described in "Towards
+ * Estimation Error Guarantees for Distinct Values, PODS 2000,
+ * p. 268-279" (the ones called GEE, or maybe AE).
+ *
+ * TODO The "combined" ndistinct is more likely to scale with the number
+ * of rows (in the table), because a single column behaving this
+ * way is sufficient for such behavior.
+ */
+static MVBucket
+create_initial_mv_bucket(int numrows, HeapTuple *rows, int2vector *attrs,
+ VacAttrStats **stats)
+{
+ int i;
+ int numattrs = attrs->dim1;
+ HistogramBuild data = NULL;
+
+ /* TODO allocate bucket as a single piece, including all the fields. */
+ MVBucket bucket = (MVBucket)palloc0(sizeof(MVBucketData));
+
+ Assert(numrows > 0);
+ Assert(rows != NULL);
+ Assert((numattrs >= 2) && (numattrs <= MVSTATS_MAX_DIMENSIONS));
+
+ /* allocate the per-dimension arrays */
+
+ /* flags for null-only dimensions */
+ bucket->nullsonly = (bool*)palloc0(numattrs * sizeof(bool));
+
+ /* inclusiveness boundaries - lower/upper bounds */
+ bucket->min_inclusive = (bool*)palloc0(numattrs * sizeof(bool));
+ bucket->max_inclusive = (bool*)palloc0(numattrs * sizeof(bool));
+
+ /* lower/upper boundaries */
+ bucket->min = (Datum*)palloc0(numattrs * sizeof(Datum));
+ bucket->max = (Datum*)palloc0(numattrs * sizeof(Datum));
+
+ /* build-data */
+ data = (HistogramBuild)palloc0(sizeof(HistogramBuildData));
+
+ /* number of distinct values (per dimension) */
+ data->ndistincts = (uint32*)palloc0(numattrs * sizeof(uint32));
+
+ /* all the sample rows fall into the initial bucket */
+ data->numrows = numrows;
+ data->rows = rows;
+
+ /*
+ * The initial bucket was not split at all, so we'll start with the
+ * first dimension in the next round (index = 0).
+ */
+ data->last_split_dimension = -1;
+
+ bucket->build_data = data;
+
+ /*
+ * Update the number of distinct combinations in the bucket (which
+ * we use when selecting a bucket to partition), and then the number
+ * of distinct values for each dimension (which we use when choosing
+ * which dimension to split).
+ */
+ update_bucket_ndistinct(bucket, attrs, stats);
+
+ /* Update ndistinct (and also set min/max) for all dimensions. */
+ for (i = 0; i < numattrs; i++)
+ update_dimension_ndistinct(bucket, i, attrs, stats, true);
+
+ return bucket;
+}
+
+/*
+ * TODO Fix to handle arbitrarily-sized histograms (not just 2D ones)
+ * and call the right output procedures (for the particular type).
+ *
+ * TODO This should somehow fetch info about the data types, and use
+ * the appropriate output functions to print the boundary values.
+ * Right now this prints the 8B value as an integer.
+ *
+ * TODO Also, provide a special function for 2D histogram, printing
+ * a gnuplot script (with rectangles).
+ *
+ * TODO For string types (once supported) we can sort the strings first,
+ * assign them a sequence of integers and use the original values
+ * as labels.
+ */
+#ifdef MVSTATS_DEBUG
+static void
+print_mv_histogram_info(MVHistogram histogram)
+{
+ int i = 0;
+
+ elog(WARNING, "histogram nbuckets=%d", histogram->nbuckets);
+
+ for (i = 0; i < histogram->nbuckets; i++)
+ {
+ MVBucket bucket = histogram->buckets[i];
+ HistogramBuild data = (HistogramBuild)bucket->build_data;
+
+ elog(WARNING, " bucket %d : ndistinct=%f numrows=%d min=[%ld, %ld], max=[%ld, %ld] distinct=[%d,%d]",
+ i, data->ndistinct, data->numrows,
+ bucket->min[0], bucket->min[1], bucket->max[0], bucket->max[1],
+ data->ndistincts[0], data->ndistincts[1]);
+ }
+}
+#endif
+
+/*
+ * A very simple partitioning selection criterion - choose the bucket
+ * with the most sample rows, among the buckets that may still be
+ * split (i.e. have more than two distinct combinations and at least
+ * MIN_BUCKET_ROWS sample rows).
+ *
+ * Returns either a pointer to the bucket selected to be partitioned,
+ * or NULL if there are no such buckets.
+ *
+ * TODO Consider other partitioning criteria (v-optimal, maxdiff etc.).
+ *
+ * TODO Allowing the bucket to degenerate to a single combination of
+ * values makes it a rather strange MCV list. Maybe we should use
+ * a higher lower bound, or make the selection criterion more
+ * complex (e.g. consider the number of rows in the bucket, etc.).
+ *
+ * That however is different from buckets 'degenerate' only in
+ * some dimensions (e.g. half of them), which is perfectly
+ * appropriate for statistics on a combination of low- and
+ * high-cardinality columns.
+ */
+static MVBucket
+select_bucket_to_partition(int nbuckets, MVBucket * buckets)
+{
+ int i;
+ int numrows = 0;
+ MVBucket bucket = NULL;
+
+ for (i = 0; i < nbuckets; i++)
+ {
+ HistogramBuild data = (HistogramBuild)buckets[i]->build_data;
+ /* if this splittable bucket has more rows, use it instead */
+ if ((data->ndistinct > 2) &&
+ (data->numrows > numrows) &&
+ (data->numrows >= MIN_BUCKET_ROWS))
+ {
+ bucket = buckets[i];
+ numrows = data->numrows;
+ }
+ }
+
+ /* may be NULL if there are no buckets that can be split */
+ return bucket;
+}
+
+/*
+ * A simple bucket partitioning implementation - choose the dimension
+ * in which the bucket covers the largest fraction of the distinct
+ * values (considering only dimensions with ndistinct > 1), and split
+ * the bucket in that dimension. NULL-only dimensions and dimensions
+ * with a single distinct value are of course skipped.
+ *
+ * This is similar to the equi-depth approach Muralikrishna/DeWitt
+ * described in their SIGMOD article (M. Muralikrishna, David J.
+ * DeWitt: Equi-Depth Histograms For Estimating Selectivity Factors
+ * For Multi-Dimensional Queries. SIGMOD Conference 1988: 28-36).
+ *
+ * There are multiple histogram options, centered around the partitioning
+ * criteria, specifying both how to choose a bucket and the dimension
+ * most in need of a split. For a nice summary and general overview, see
+ * "rK-Hist : an R-Tree based histogram for multi-dimensional selectivity
+ * estimation" thesis by J. A. Lopez, Concordia University, p.34-37 (and
+ * possibly p. 32-34 for explanation of the terms).
+ *
+ * This splits the bucket by tweaking the existing one, and returning the
+ * new bucket (essentially shrinking the existing one in-place and returning
+ * the other "half" as a new bucket). The caller is responsible for adding
+ * the new bucket into the list of buckets.
+ *
+ * TODO It requires care to prevent splitting only one dimension and not
+ * splitting another one at all (which might happen easily in case of
+ * strongly dependent columns - e.g. y=x).
+ *
+ * TODO Should probably consider statistics target for the columns (e.g. to
+ * split dimensions with higher statistics target more frequently).
+ */
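+/*
+ * For illustration: if the chosen dimension contains the sorted sample
+ * values {1, 1, 2, 2, 2, 3, 4, 4}, the boundaries between distinct
+ * values are at indexes 2, 5 and 6, and index 5 is closest to the
+ * middle row (index 4). So '3' becomes the split value - rows with
+ * values below 3 stay in the original bucket, the remaining rows move
+ * to the new one.
+ */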
+static MVBucket
+partition_bucket(MVBucket bucket, int2vector *attrs,
+ VacAttrStats **stats,
+ int *ndistvalues, Datum **distvalues)
+{
+ int i;
+ int dimension;
+ int numattrs = attrs->dim1;
+
+ Datum split_value;
+ MVBucket new_bucket;
+ HistogramBuild new_data;
+
+ /* needed for sort, when looking for the split value */
+ bool isNull;
+ int nvalues = 0;
+ HistogramBuild data = (HistogramBuild)bucket->build_data;
+ StdAnalyzeData * mystats = NULL;
+ ScalarItem * values = (ScalarItem*)palloc0(data->numrows * sizeof(ScalarItem));
+ SortSupportData ssup;
+
+ /* looking for the split value */
+ int nrows = 1; /* number of rows below current value */
+ double delta;
+
+ /* needed when splitting the values */
+ HeapTuple * oldrows = data->rows;
+ int oldnrows = data->numrows;
+
+ /*
+ * We can't split buckets with a single distinct value (this also
+ * disqualifies NULL-only dimensions). Also, there have to be multiple
+ * sample rows (otherwise, how could there be more distinct values?).
+ */
+ Assert(data->ndistinct > 1);
+ Assert(data->numrows > 1);
+ Assert((numattrs >= 2) && (numattrs <= MVSTATS_MAX_DIMENSIONS));
+
+ /*
+ * Look for the dimension to split - the one where this bucket covers
+ * the largest fraction of the deduplicated distinct values.
+ */
+ delta = 0.0;
+ dimension = -1;
+
+ for (i = 0; i < numattrs; i++)
+ {
+ Datum *a, *b;
+
+ mystats = (StdAnalyzeData *) stats[i]->extra_data;
+
+ /* initialize sort support, etc. */
+ memset(&ssup, 0, sizeof(ssup));
+ ssup.ssup_cxt = CurrentMemoryContext;
+
+ /* We always use the default collation for statistics */
+ ssup.ssup_collation = DEFAULT_COLLATION_OID;
+ ssup.ssup_nulls_first = false;
+
+ PrepareSortSupportFromOrderingOp(mystats->ltopr, &ssup);
+
+ /* can't split NULL-only dimension */
+ if (bucket->nullsonly[i])
+ continue;
+
+ /* can't split dimension with a single ndistinct value */
+ if (data->ndistincts[i] <= 1)
+ continue;
+
+ /* sort support for the bsearch_comparator */
+ ssup_private = &ssup;
+
+ /* search for the min/max boundaries in the distinct list */
+ a = (Datum*)bsearch(&bucket->min[i],
+ distvalues[i], ndistvalues[i],
+ sizeof(Datum), bsearch_comparator);
+
+ b = (Datum*)bsearch(&bucket->max[i],
+ distvalues[i], ndistvalues[i],
+ sizeof(Datum), bsearch_comparator);
+
+ /* if this dimension is 'larger', partition by it */
+ if (((b-a)*1.0 / ndistvalues[i]) > delta)
+ {
+ delta = ((b-a)*1.0 / ndistvalues[i]);
+ dimension = i;
+ }
+ }
+
+ /*
+ * If we haven't found a dimension here, we've done something
+ * wrong in select_bucket_to_partition.
+ */
+ Assert(dimension != -1);
+
+ /* Remember the dimension for the next split of this bucket. */
+ data->last_split_dimension = dimension;
+
+ /*
+ * Walk through the selected dimension, collect and sort the values
+ * and then choose the value to use as the new boundary.
+ */
+ mystats = (StdAnalyzeData *) stats[dimension]->extra_data;
+
+ /* initialize sort support, etc. */
+ memset(&ssup, 0, sizeof(ssup));
+ ssup.ssup_cxt = CurrentMemoryContext;
+
+ /* We always use the default collation for statistics */
+ ssup.ssup_collation = DEFAULT_COLLATION_OID;
+ ssup.ssup_nulls_first = false;
+
+ PrepareSortSupportFromOrderingOp(mystats->ltopr, &ssup);
+
+ for (i = 0; i < data->numrows; i++)
+ {
+ /* remember the index of the sample row, to make the partitioning simpler */
+ values[nvalues].value = heap_getattr(data->rows[i], attrs->values[dimension],
+ stats[dimension]->tupDesc, &isNull);
+ values[nvalues].tupno = i;
+
+ /* no NULL values allowed here (we don't do splits by null-only dimensions) */
+ Assert(!isNull);
+
+ nvalues++;
+ }
+
+ /* sort the array of values */
+ qsort_arg((void *) values, nvalues, sizeof(ScalarItem),
+ compare_scalars_partition, (void *) &ssup);
+
+ /*
+ * Walk through the sorted array and pick the distinct-value boundary
+ * closest to the middle row, i.e. a split giving about 50% of the
+ * tuples in each partition (we can only split between two distinct
+ * values, so the actual fractions may differ). The chosen value is
+ * then used as an exclusive upper boundary of the existing bucket
+ * (and an inclusive lower boundary of the new one).
+ *
+ * TODO Maybe we should split by distinct values instead, e.g. use
+ * the (ndistinct/2+1)-th distinct value, or the "average" of
+ * the two middle distinct values - but the latter requires
+ * being able to do an average (which does not work for
+ * non-arithmetic types).
+ */
+ delta = fabs(data->numrows);
+ split_value = values[0].value;
+
+ for (i = 1; i < data->numrows; i++)
+ {
+ if (values[i].value != values[i-1].value)
+ {
+ /* are we closer to splitting the bucket in half? */
+ if (fabs(i - data->numrows/2.0) < delta)
+ {
+ /* let's assume we'll use this value for the split */
+ split_value = values[i].value;
+ delta = fabs(i - data->numrows/2.0);
+ nrows = i;
+ }
+ }
+ }
+
+ Assert(nrows > 0);
+ Assert(nrows < data->numrows);
+
+ /* create the new bucket as an (incomplete) copy of the one being partitioned. */
+ new_bucket = copy_mv_bucket(bucket, numattrs);
+ new_data = (HistogramBuild)new_bucket->build_data;
+
+ /*
+ * Do the actual split of the chosen dimension, using the split value as the
+ * upper bound for the existing bucket, and lower bound for the new one.
+ */
+ bucket->max[dimension] = split_value;
+ new_bucket->min[dimension] = split_value;
+
+ bucket->max_inclusive[dimension] = false;
+ new_bucket->min_inclusive[dimension] = true;
+
+ /*
+ * Redistribute the sample tuples using the 'ScalarItem->tupno'
+ * index. We know 'nrows' rows should remain in the original
+ * bucket and the rest goes to the new one.
+ */
+
+ data->rows = (HeapTuple*)palloc0(nrows * sizeof(HeapTuple));
+ new_data->rows = (HeapTuple*)palloc0((oldnrows - nrows) * sizeof(HeapTuple));
+
+ data->numrows = nrows;
+ new_data->numrows = (oldnrows - nrows);
+
+ /*
+ * The first nrows should go to the first bucket, the rest should
+ * go to the new one. Use the tupno field to get the actual HeapTuple
+ * row from the original array of sample rows.
+ */
+ for (i = 0; i < nrows; i++)
+ memcpy(&data->rows[i], &oldrows[values[i].tupno], sizeof(HeapTuple));
+
+ for (i = nrows; i < oldnrows; i++)
+ memcpy(&new_data->rows[i-nrows], &oldrows[values[i].tupno], sizeof(HeapTuple));
+
+ /* update ndistinct values for the buckets (total and per dimension) */
+ update_bucket_ndistinct(bucket, attrs, stats);
+ update_bucket_ndistinct(new_bucket, attrs, stats);
+
+ /*
+ * TODO We don't need to do this for the dimension we used for split,
+ * because we know how many distinct values went to each partition.
+ */
+ for (i = 0; i < numattrs; i++)
+ {
+ update_dimension_ndistinct(bucket, i, attrs, stats, false);
+ update_dimension_ndistinct(new_bucket, i, attrs, stats, false);
+ }
+
+ pfree(oldrows);
+ pfree(values);
+
+ return new_bucket;
+}
+
+/*
+ * Copy a histogram bucket. The copy does not include the build-time
+ * data, i.e. sampled rows etc.
+ */
+static MVBucket
+copy_mv_bucket(MVBucket bucket, uint32 ndimensions)
+{
+ /* TODO allocate as a single piece (including all the fields) */
+ MVBucket new_bucket = (MVBucket)palloc0(sizeof(MVBucketData));
+ HistogramBuild data = (HistogramBuild)palloc0(sizeof(HistogramBuildData));
+
+ /*
+ * Copy only the attributes that will stay the same after the split;
+ * the rest will be recomputed once the split is done.
+ */
+
+ /* allocate the per-dimension arrays */
+ new_bucket->nullsonly = (bool*)palloc0(ndimensions * sizeof(bool));
+
+ /* inclusiveness boundaries - lower/upper bounds */
+ new_bucket->min_inclusive = (bool*)palloc0(ndimensions * sizeof(bool));
+ new_bucket->max_inclusive = (bool*)palloc0(ndimensions * sizeof(bool));
+
+ /* lower/upper boundaries */
+ new_bucket->min = (Datum*)palloc0(ndimensions * sizeof(Datum));
+ new_bucket->max = (Datum*)palloc0(ndimensions * sizeof(Datum));
+
+ /* copy data */
+ memcpy(new_bucket->nullsonly, bucket->nullsonly, ndimensions * sizeof(bool));
+
+ memcpy(new_bucket->min_inclusive, bucket->min_inclusive, ndimensions*sizeof(bool));
+ memcpy(new_bucket->min, bucket->min, ndimensions*sizeof(Datum));
+
+ memcpy(new_bucket->max_inclusive, bucket->max_inclusive, ndimensions*sizeof(bool));
+ memcpy(new_bucket->max, bucket->max, ndimensions*sizeof(Datum));
+
+ /* allocate and copy the interesting part of the build data */
+ data->last_split_dimension = ((HistogramBuild)bucket->build_data)->last_split_dimension;
+ data->ndistincts = (uint32*)palloc0(ndimensions * sizeof(uint32));
+
+ new_bucket->build_data = data;
+
+ return new_bucket;
+}
+
+/*
+ * Counts the number of distinct combinations of values in the bucket.
+ * The combinations are collected into an array of sort items, sorted
+ * using the per-dimension comparators (multi_sort_compare), and then
+ * compared pairwise to count the distinct ones.
+ *
+ * TODO This might evaluate and store the distinct counts for all
+ * possible attribute combinations. The assumption is this might be
+ * useful for estimating things like GROUP BY cardinalities (e.g.
+ * in cases when some buckets contain a lot of low-frequency
+ * combinations, and other buckets contain few high-frequency ones).
+ *
+ * But it's unclear whether it's worth the price. Computing this
+ * is actually quite cheap, because it may be evaluated at the very
+ * end, when the buckets are rather small (so sorting it in 2^N ways
+ * is not a big deal). Assuming the partitioning algorithm does not
+ * use these values to make its decisions, of course (the current
+ * algorithm does not).
+ *
+ * The overhead with storing, fetching and parsing the data is more
+ * concerning - adding 2^N values per bucket (even if it's just
+ * a 1B or 2B value) would significantly bloat the histogram, and
+ * thus its impact on the optimizer. Which is not really desirable.
+ *
+ * TODO This only updates the ndistinct for the sample (or bucket), but
+ * we eventually need an estimate of the total number of distinct
+ * values in the dataset. It's possible to either use the current
+ * 1D approach (i.e., if it's more than 10% of the sample, assume
+ * it's proportional to the number of rows), or to implement the
+ * estimator suggested in the article, supposedly giving 'optimal'
+ * estimates (w.r.t. probability of error).
+ */
+static void
+update_bucket_ndistinct(MVBucket bucket, int2vector *attrs, VacAttrStats ** stats)
+{
+ int i, j;
+ int numattrs = attrs->dim1;
+
+ HistogramBuild data = (HistogramBuild)bucket->build_data;
+ int numrows = data->numrows;
+
+ MultiSortSupport mss = multi_sort_init(numattrs);
+
+ /*
+ * We could collect this while already walking through the attributes
+ * elsewhere; as it is, we call heap_getattr a second time here.
+ */
+ SortItem *items = (SortItem*)palloc0(numrows * sizeof(SortItem));
+ Datum *values = (Datum*)palloc0(numrows * sizeof(Datum) * numattrs);
+ bool *isnull = (bool*)palloc0(numrows * sizeof(bool) * numattrs);
+
+ for (i = 0; i < numrows; i++)
+ {
+ items[i].values = &values[i * numattrs];
+ items[i].isnull = &isnull[i * numattrs];
+ }
+
+ /* prepare the sort functions for all the dimensions */
+ for (i = 0; i < numattrs; i++)
+ multi_sort_add_dimension(mss, i, i, stats);
+
+ /* collect the values */
+ for (i = 0; i < numrows; i++)
+ for (j = 0; j < numattrs; j++)
+ items[i].values[j]
+ = heap_getattr(data->rows[i], attrs->values[j],
+ stats[j]->tupDesc, &items[i].isnull[j]);
+
+ qsort_arg((void *) items, numrows, sizeof(SortItem),
+ multi_sort_compare, mss);
+
+ data->ndistinct = 1;
+
+ for (i = 1; i < numrows; i++)
+ if (multi_sort_compare(&items[i], &items[i-1], mss) != 0)
+ data->ndistinct += 1;
+
+ pfree(items);
+ pfree(values);
+ pfree(isnull);
+}
+
+/*
+ * Count distinct values per bucket dimension.
+ */
+static void
+update_dimension_ndistinct(MVBucket bucket, int dimension, int2vector *attrs,
+ VacAttrStats ** stats, bool update_boundaries)
+{
+ int j;
+ int nvalues = 0;
+ bool isNull;
+ HistogramBuild data = (HistogramBuild)bucket->build_data;
+ Datum * values = (Datum*)palloc0(data->numrows * sizeof(Datum));
+ SortSupportData ssup;
+
+ StdAnalyzeData * mystats = (StdAnalyzeData *) stats[dimension]->extra_data;
+
+ /* we may already know this is a NULL-only dimension */
+ if (bucket->nullsonly[dimension])
+ data->ndistincts[dimension] = 1;
+
+ memset(&ssup, 0, sizeof(ssup));
+ ssup.ssup_cxt = CurrentMemoryContext;
+
+ /* We always use the default collation for statistics */
+ ssup.ssup_collation = DEFAULT_COLLATION_OID;
+ ssup.ssup_nulls_first = false;
+
+ PrepareSortSupportFromOrderingOp(mystats->ltopr, &ssup);
+
+ for (j = 0; j < data->numrows; j++)
+ {
+ values[nvalues] = heap_getattr(data->rows[j], attrs->values[dimension],
+ stats[dimension]->tupDesc, &isNull);
+
+ /* ignore NULL values */
+ if (! isNull)
+ nvalues++;
+ }
+
+ /* there's always at least 1 distinct value (may be NULL) */
+ data->ndistincts[dimension] = 1;
+
+ /*
+ * If there are only NULL values in the column, mark the dimension
+ * as NULL-only and bail out - there's nothing more to count.
+ */
+ if (nvalues == 0)
+ {
+ pfree(values);
+ bucket->nullsonly[dimension] = true;
+ return;
+ }
+
+ /* sort the array of (pass-by-value) datums */
+ qsort_arg((void *) values, nvalues, sizeof(Datum),
+ compare_scalars_simple, (void *) &ssup);
+
+ /*
+ * Update min/max boundaries to the smallest bounding box. Generally, this
+ * needs to be done only when constructing the initial bucket.
+ */
+ if (update_boundaries)
+ {
+ /* store the min/max values */
+ bucket->min[dimension] = values[0];
+ bucket->min_inclusive[dimension] = true;
+
+ bucket->max[dimension] = values[nvalues-1];
+ bucket->max_inclusive[dimension] = true;
+ }
+
+ /*
+ * Walk through the array and count distinct values by comparing
+ * succeeding values.
+ *
+ * FIXME This only works for pass-by-value types (i.e. not VARCHARs
+ * etc.). Although thanks to the deduplication it might work
+ * even for those types (equal values will get the same item
+ * in the deduplicated array).
+ */
+ for (j = 1; j < nvalues; j++) {
+ if (values[j] != values[j-1])
+ data->ndistincts[dimension] += 1;
+ }
+
+ pfree(values);
+}
+
+/*
+ * A properly built histogram must not contain buckets mixing NULL and
+ * non-NULL values in a single dimension. Each dimension either is
+ * marked as 'nulls only' (and thus contains only NULL values), or
+ * it contains no NULL values at all.
+ *
+ * Therefore, if the sample contains NULL values in any of the columns,
+ * it's necessary to build those NULL-buckets. This is done in an
+ * iterative way using this algorithm, operating on a single bucket:
+ *
+ * (1) Check that all dimensions are well-formed (not mixing NULL
+ * and non-NULL values).
+ *
+ * (2) If all dimensions are well-formed, terminate.
+ *
+ * (3) If the dimension contains only NULL values, but is not
+ * marked as NULL-only, mark it as NULL-only and run the
+ * algorithm again (on this bucket).
+ *
+ * (4) If the dimension mixes NULL and non-NULL values, split the
+ * bucket into two parts - one with NULL values, one with
+ * non-NULL values (replacing the current one). Then run
+ * the algorithm on both buckets.
+ *
+ * This is executed in a recursive manner, but the number of executions
+ * should be quite low - limited by the number of NULL-buckets. Also,
+ * in each branch the number of nested calls is limited by the number
+ * of dimensions (attributes) of the histogram.
+ *
+ * At the end, there should be buckets with no mixed dimensions. The
+ * number of buckets produced by this algorithm is rather limited - with
+ * N dimensions, there may be only 2^N such buckets (each dimension may
+ * be either NULL or non-NULL). So with 8 dimensions (current value of
+ * MVSTATS_MAX_DIMENSIONS) there may be only 256 such buckets.
+ *
+ * After this, a 'regular' bucket-split algorithm shall run, further
+ * optimizing the histogram.
+ */
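+/*
+ * For example, a 2-D bucket where column "a" mixes NULL and non-NULL
+ * values and column "b" contains no NULLs gets split by step (4) into
+ * a bucket with a=NULL (dimension marked nulls-only) and a bucket with
+ * the remaining rows; the algorithm then recurses into both halves.
+ */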
+static void
+create_null_buckets(MVHistogram histogram, int bucket_idx,
+ int2vector *attrs, VacAttrStats ** stats)
+{
+ int i, j;
+ int null_dim = -1;
+ int null_count = 0;
+ bool null_found = false;
+ MVBucket bucket, null_bucket;
+ int null_idx, curr_idx;
+ HistogramBuild data, null_data;
+
+ /* remember original values from the bucket */
+ int numrows;
+ HeapTuple *oldrows = NULL;
+
+ Assert(bucket_idx < histogram->nbuckets);
+ Assert(histogram->ndimensions == attrs->dim1);
+
+ bucket = histogram->buckets[bucket_idx];
+ data = (HistogramBuild)bucket->build_data;
+
+ numrows = data->numrows;
+ oldrows = data->rows;
+
+ /*
+ * Walk through all rows / dimensions, and stop once we find NULL
+ * in a dimension not yet marked as NULL-only.
+ */
+ for (i = 0; i < data->numrows; i++)
+ {
+ for (j = 0; j < histogram->ndimensions; j++)
+ {
+ /* Is this a NULL-only dimension? If yes, skip. */
+ if (bucket->nullsonly[j])
+ continue;
+
+ /* found a NULL in that dimension? */
+ if (heap_attisnull(data->rows[i], attrs->values[j]))
+ {
+ null_found = true;
+ null_dim = j;
+ break;
+ }
+ }
+
+ /* terminate if we found attribute with NULL values */
+ if (null_found)
+ break;
+ }
+
+ /* no regular dimension contains NULL values => we're done */
+ if (! null_found)
+ return;
+
+ /* walk through the rows again, count NULL values in 'null_dim' */
+ for (i = 0; i < data->numrows; i++)
+ {
+ if (heap_attisnull(data->rows[i], attrs->values[null_dim]))
+ null_count += 1;
+ }
+
+ Assert(null_count <= data->numrows);
+
+ /*
+ * If (null_count == numrows) the dimension already is NULL-only,
+ * but is not yet marked like that. It's enough to mark it and
+ * repeat the process recursively (until we run out of dimensions).
+ */
+ if (null_count == data->numrows)
+ {
+ bucket->nullsonly[null_dim] = true;
+ create_null_buckets(histogram, bucket_idx, attrs, stats);
+ return;
+ }
+
+ /*
+ * We have to split the bucket into two - one with NULL values in
+ * the dimension, one with non-NULL values. We don't need to sort
+ * the data or anything, but otherwise it's similar to what's done
+ * in partition_bucket().
+ */
+
+ /* create bucket with NULL-only dimension 'dim' */
+ null_bucket = copy_mv_bucket(bucket, histogram->ndimensions);
+ null_data = (HistogramBuild)null_bucket->build_data;
+
+ /* remember the current array info */
+ oldrows = data->rows;
+ numrows = data->numrows;
+
+ /* we'll keep non-NULL values in the current bucket */
+ data->numrows = (numrows - null_count);
+ data->rows
+ = (HeapTuple*)palloc0(data->numrows * sizeof(HeapTuple));
+
+ /* and the NULL values will go to the new one */
+ null_data->numrows = null_count;
+ null_data->rows
+ = (HeapTuple*)palloc0(null_data->numrows * sizeof(HeapTuple));
+
+ /* mark the dimension as NULL-only (in the new bucket) */
+ null_bucket->nullsonly[null_dim] = true;
+
+ /* walk through the sample rows and distribute them accordingly */
+ null_idx = 0;
+ curr_idx = 0;
+ for (i = 0; i < numrows; i++)
+ {
+ if (heap_attisnull(oldrows[i], attrs->values[null_dim]))
+ /* NULL => copy to the new bucket */
+ memcpy(&null_data->rows[null_idx++], &oldrows[i],
+ sizeof(HeapTuple));
+ else
+ memcpy(&data->rows[curr_idx++], &oldrows[i],
+ sizeof(HeapTuple));
+ }
+
+ /* update ndistinct values for the buckets (total and per dimension) */
+ update_bucket_ndistinct(bucket, attrs, stats);
+ update_bucket_ndistinct(null_bucket, attrs, stats);
+
+ /*
+ * TODO We don't need to do this for the dimension we used for split,
+ * because we know how many distinct values went to each
+ * bucket (NULL is not a value, so 0, and the other bucket got
+ * all the ndistinct values).
+ */
+ for (i = 0; i < histogram->ndimensions; i++)
+ {
+ update_dimension_ndistinct(bucket, i, attrs, stats, false);
+ update_dimension_ndistinct(null_bucket, i, attrs, stats, false);
+ }
+
+ pfree(oldrows);
+
+ /* add the NULL bucket to the histogram */
+ histogram->buckets[histogram->nbuckets++] = null_bucket;
+
+ /*
+ * And now run the function recursively on both buckets (the new
+ * one first, because the call may change number of buckets, and
+ * it's used as an index).
+ */
+ create_null_buckets(histogram, (histogram->nbuckets-1), attrs, stats);
+ create_null_buckets(histogram, bucket_idx, attrs, stats);
+
+}
+
+/*
+ * We need to pass the SortSupport to the comparator, but bsearch()
+ * has no 'context' parameter, so we use a global variable (ugly).
+ */
+static int
+bsearch_comparator(const void * a, const void * b)
+{
+ Assert(ssup_private != NULL);
+ return compare_scalars_simple(a, b, (void*)ssup_private);
+}
+
+/*
+ * SRF with details about buckets of a histogram:
+ *
+ * - bucket ID (0...nbuckets)
+ * - min values (string array)
+ * - max values (string array)
+ * - nulls only (boolean array)
+ * - min inclusive flags (boolean array)
+ * - max inclusive flags (boolean array)
+ * - frequency (double precision)
+ * - density (double precision)
+ * - bucket size (double precision)
+ *
+ * The inputs are the OID of the statistics and the output type
+ * (otype) for the boundary values; no rows are returned if the
+ * statistics contain no histogram.
+ */
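+/*
+ * Example (assuming a histogram was built and its OID looked up in
+ * pg_mv_statistic):
+ *
+ * SELECT * FROM pg_mv_histogram_buckets(
+ * (SELECT oid FROM pg_mv_statistic LIMIT 1), 0);
+ *
+ * where otype=0 prints the actual boundary values, otype=1 the indexes
+ * into the deduplicated arrays, and otype=2 the indexes normalized
+ * into [0, 1].
+ */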
+PG_FUNCTION_INFO_V1(pg_mv_histogram_buckets);
+
+Datum
+pg_mv_histogram_buckets(PG_FUNCTION_ARGS)
+{
+ FuncCallContext *funcctx;
+ int call_cntr;
+ int max_calls;
+ TupleDesc tupdesc;
+ AttInMetadata *attinmeta;
+
+ Oid mvoid = PG_GETARG_OID(0);
+ int otype = PG_GETARG_INT32(1);
+
+ if ((otype < 0) || (otype > 2))
+ elog(ERROR, "invalid output type specified");
+
+ /* stuff done only on the first call of the function */
+ if (SRF_IS_FIRSTCALL())
+ {
+ MemoryContext oldcontext;
+ MVSerializedHistogram histogram;
+
+ /* create a function context for cross-call persistence */
+ funcctx = SRF_FIRSTCALL_INIT();
+
+ /* switch to memory context appropriate for multiple function calls */
+ oldcontext = MemoryContextSwitchTo(funcctx->multi_call_memory_ctx);
+
+ histogram = load_mv_histogram2(mvoid);
+
+ funcctx->user_fctx = histogram;
+
+ /* total number of tuples to be returned */
+ funcctx->max_calls = 0;
+ if (funcctx->user_fctx != NULL)
+ funcctx->max_calls = histogram->nbuckets;
+
+ /* Build a tuple descriptor for our result type */
+ if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("function returning record called in context "
+ "that cannot accept type record")));
+
+ /*
+ * generate attribute metadata needed later to produce tuples
+ * from raw C strings
+ */
+ attinmeta = TupleDescGetAttInMetadata(tupdesc);
+ funcctx->attinmeta = attinmeta;
+
+ MemoryContextSwitchTo(oldcontext);
+ }
+
+ /* stuff done on every call of the function */
+ funcctx = SRF_PERCALL_SETUP();
+
+ call_cntr = funcctx->call_cntr;
+ max_calls = funcctx->max_calls;
+ attinmeta = funcctx->attinmeta;
+
+ if (call_cntr < max_calls) /* do when there is more left to send */
+ {
+ char **values;
+ HeapTuple tuple;
+ Datum result;
+ int2vector *stakeys;
+ Oid relid;
+ double bucket_size = 1.0;
+
+ char *buff = palloc0(1024);
+ char *format;
+
+ int i;
+
+ Oid *outfuncs;
+ FmgrInfo *fmgrinfo;
+
+ MVSerializedHistogram histogram;
+ MVSerializedBucket bucket;
+
+ histogram = (MVSerializedHistogram)funcctx->user_fctx;
+
+ Assert(call_cntr < histogram->nbuckets);
+
+ bucket = histogram->buckets[call_cntr];
+
+ stakeys = find_mv_attnums(mvoid, &relid);
+
+ /*
+ * Prepare a values array for building the returned tuple.
+ * This should be an array of C strings which will
+ * be processed later by the type input functions.
+ */
+ values = (char **) palloc(9 * sizeof(char *));
+
+ values[0] = (char *) palloc(64 * sizeof(char));
+
+ /* arrays */
+ values[1] = (char *) palloc0(1024 * sizeof(char));
+ values[2] = (char *) palloc0(1024 * sizeof(char));
+ values[3] = (char *) palloc0(1024 * sizeof(char));
+ values[4] = (char *) palloc0(1024 * sizeof(char));
+ values[5] = (char *) palloc0(1024 * sizeof(char));
+
+ values[6] = (char *) palloc(64 * sizeof(char));
+ values[7] = (char *) palloc(64 * sizeof(char));
+ values[8] = (char *) palloc(64 * sizeof(char));
+
+ /* we need to do this only when printing the actual values */
+ outfuncs = (Oid*)palloc0(sizeof(Oid) * histogram->ndimensions);
+ fmgrinfo = (FmgrInfo*)palloc0(sizeof(FmgrInfo) * histogram->ndimensions);
+
+ for (i = 0; i < histogram->ndimensions; i++)
+ {
+ bool isvarlena;
+
+ getTypeOutputInfo(get_atttype(relid, stakeys->values[i]),
+ &outfuncs[i], &isvarlena);
+
+ fmgr_info(outfuncs[i], &fmgrinfo[i]);
+ }
+
+ snprintf(values[0], 64, "%d", call_cntr); /* bucket ID */
+
+ /*
+ * The output format depends on otype: 0 prints the actual min/max
+ * values (using the output function of the attribute type), 1 prints
+ * indexes into the deduplicated arrays (which are sorted, so even
+ * the indexes are quite useful), and 2 prints the indexes normalized
+ * into [0, 1].
+ */
+
+ for (i = 0; i < histogram->ndimensions; i++)
+ {
+ bucket_size *= (bucket->max[i] - bucket->min[i]) * 1.0
+ / (histogram->nvalues[i]-1);
+
+ /* print the actual values, i.e. use output function etc. */
+ if (otype == 0)
+ {
+ Datum minval, maxval;
+ Datum minout, maxout;
+
+ format = "%s, %s";
+ if (i == 0)
+ format = "{%s%s";
+ else if (i == histogram->ndimensions-1)
+ format = "%s, %s}";
+
+ minval = histogram->values[i][bucket->min[i]];
+ minout = FunctionCall1(&fmgrinfo[i], minval);
+
+ maxval = histogram->values[i][bucket->max[i]];
+ maxout = FunctionCall1(&fmgrinfo[i], maxval);
+
+ snprintf(buff, 1024, format, values[1], DatumGetCString(minout));
+ strncpy(values[1], buff, 1023);
+ buff[0] = '\0';
+
+ snprintf(buff, 1024, format, values[2], DatumGetCString(maxout));
+ strncpy(values[2], buff, 1023);
+ buff[0] = '\0';
+ }
+ else if (otype == 1)
+ {
+ format = "%s, %d";
+ if (i == 0)
+ format = "{%s%d";
+ else if (i == histogram->ndimensions-1)
+ format = "%s, %d}";
+
+ snprintf(buff, 1024, format, values[1], bucket->min[i]);
+ strncpy(values[1], buff, 1023);
+ buff[0] = '\0';
+
+ snprintf(buff, 1024, format, values[2], bucket->max[i]);
+ strncpy(values[2], buff, 1023);
+ buff[0] = '\0';
+ }
+ else
+ {
+ format = "%s, %f";
+ if (i == 0)
+ format = "{%s%f";
+ else if (i == histogram->ndimensions-1)
+ format = "%s, %f}";
+
+ snprintf(buff, 1024, format, values[1],
+ bucket->min[i] * 1.0 / (histogram->nvalues[i]-1));
+ strncpy(values[1], buff, 1023);
+ buff[0] = '\0';
+
+ snprintf(buff, 1024, format, values[2],
+ bucket->max[i] * 1.0 / (histogram->nvalues[i]-1));
+ strncpy(values[2], buff, 1023);
+ buff[0] = '\0';
+ }
+
+ format = "%s, %s";
+ if (i == 0)
+ format = "{%s%s";
+ else if (i == histogram->ndimensions-1)
+ format = "%s, %s}";
+
+ snprintf(buff, 1024, format, values[3], bucket->nullsonly[i] ? "t" : "f");
+ strncpy(values[3], buff, 1023);
+ buff[0] = '\0';
+
+ snprintf(buff, 1024, format, values[4], bucket->min_inclusive[i] ? "t" : "f");
+ strncpy(values[4], buff, 1023);
+ buff[0] = '\0';
+
+ snprintf(buff, 1024, format, values[5], bucket->max_inclusive[i] ? "t" : "f");
+ strncpy(values[5], buff, 1023);
+ buff[0] = '\0';
+ }
+
+ snprintf(values[6], 64, "%f", bucket->ntuples); /* frequency */
+ snprintf(values[7], 64, "%f", bucket->ntuples / bucket_size); /* density */
+ snprintf(values[8], 64, "%f", bucket_size); /* bucket_size */
+
+ /* build a tuple */
+ tuple = BuildTupleFromCStrings(attinmeta, values);
+
+ /* make the tuple into a datum */
+ result = HeapTupleGetDatum(tuple);
+
+ /* clean up (this is not really necessary) */
+ pfree(values[0]);
+ pfree(values[1]);
+ pfree(values[2]);
+ pfree(values[3]);
+ pfree(values[4]);
+ pfree(values[5]);
+ pfree(values[6]);
+ pfree(values[7]);
+ pfree(values[8]);
+
+ pfree(values);
+
+ SRF_RETURN_NEXT(funcctx, result);
+ }
+ else /* do when there is no more left */
+ {
+ SRF_RETURN_DONE(funcctx);
+ }
+}
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 448cf35..0699d6c 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -2101,8 +2101,8 @@ describeOneTableDetails(const char *schemaname,
{
printfPQExpBuffer(&buf,
"SELECT oid, stakeys,\n"
- " deps_enabled, mcv_enabled,\n"
- " deps_built, mcv_built,\n"
+ " deps_enabled, mcv_enabled, hist_enabled,\n"
+ " deps_built, mcv_built, hist_built,\n"
" mcv_max_items, hist_max_buckets,\n"
" (SELECT string_agg(attname::text,', ')\n"
" FROM ((SELECT unnest(stakeys) AS attnum) s\n"
@@ -2141,8 +2141,17 @@ describeOneTableDetails(const char *schemaname,
first = false;
}
+ if (!strcmp(PQgetvalue(result, i, 4), "t"))
+ {
+ if (! first)
+ appendPQExpBuffer(&buf, ", histogram");
+ else
+ appendPQExpBuffer(&buf, "(histogram");
+ first = false;
+ }
+
appendPQExpBuffer(&buf, ") ON (%s)",
- PQgetvalue(result, i, 8));
+ PQgetvalue(result, i, 10));
printTableAddFooter(&cont, buf.data);
}
diff --git a/src/include/catalog/pg_mv_statistic.h b/src/include/catalog/pg_mv_statistic.h
index c6e7d74..84579da 100644
--- a/src/include/catalog/pg_mv_statistic.h
+++ b/src/include/catalog/pg_mv_statistic.h
@@ -36,13 +36,16 @@ CATALOG(pg_mv_statistic,3381)
/* statistics requested to build */
bool deps_enabled; /* analyze dependencies? */
bool mcv_enabled; /* build MCV list? */
+ bool hist_enabled; /* build histogram? */
- /* MCV size */
+ /* histogram / MCV size */
int32 mcv_max_items; /* max MCV items */
+ int32 hist_max_buckets; /* max histogram buckets */
/* statistics that are available (if requested) */
bool deps_built; /* dependencies were built */
bool mcv_built; /* MCV list was built */
+ bool hist_built; /* histogram was built */
/* variable-length fields start here, but we allow direct access to stakeys */
int2vector stakeys; /* array of column keys */
@@ -50,6 +53,7 @@ CATALOG(pg_mv_statistic,3381)
#ifdef CATALOG_VARLEN
bytea stadeps; /* dependencies (serialized) */
bytea stamcv; /* MCV list (serialized) */
+ bytea stahist; /* MV histogram (serialized) */
#endif
} FormData_pg_mv_statistic;
@@ -65,15 +69,19 @@ typedef FormData_pg_mv_statistic *Form_pg_mv_statistic;
* compiler constants for pg_attrdef
* ----------------
*/
-#define Natts_pg_mv_statistic 9
+#define Natts_pg_mv_statistic 13
#define Anum_pg_mv_statistic_starelid 1
#define Anum_pg_mv_statistic_deps_enabled 2
#define Anum_pg_mv_statistic_mcv_enabled 3
-#define Anum_pg_mv_statistic_mcv_max_items 4
-#define Anum_pg_mv_statistic_deps_built 5
-#define Anum_pg_mv_statistic_mcv_built 6
-#define Anum_pg_mv_statistic_stakeys 7
-#define Anum_pg_mv_statistic_stadeps 8
-#define Anum_pg_mv_statistic_stamcv 9
+#define Anum_pg_mv_statistic_hist_enabled 4
+#define Anum_pg_mv_statistic_mcv_max_items 5
+#define Anum_pg_mv_statistic_hist_max_buckets 6
+#define Anum_pg_mv_statistic_deps_built 7
+#define Anum_pg_mv_statistic_mcv_built 8
+#define Anum_pg_mv_statistic_hist_built 9
+#define Anum_pg_mv_statistic_stakeys 10
+#define Anum_pg_mv_statistic_stadeps 11
+#define Anum_pg_mv_statistic_stamcv 12
+#define Anum_pg_mv_statistic_stahist 13
#endif /* PG_MV_STATISTIC_H */
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
index 0d12dd3..9cd3e5a 100644
--- a/src/include/catalog/pg_proc.h
+++ b/src/include/catalog/pg_proc.h
@@ -2732,6 +2732,10 @@ DATA(insert OID = 3376 ( pg_mv_stats_mcvlist_info PGNSP PGUID 12 1 0 0 0 f f f
DESCR("multi-variate statistics: MCV list info");
DATA(insert OID = 3373 ( pg_mv_mcv_items PGNSP PGUID 12 1 1000 0 0 f f f f t t i 1 0 2249 "26" "{26,23,1009,1000,701}" "{i,o,o,o,o}" "{oid,index,values,nulls,frequency}" _null_ _null_ pg_mv_mcv_items _null_ _null_ _null_ ));
DESCR("details about MCV list items");
+DATA(insert OID = 3375 ( pg_mv_stats_histogram_info PGNSP PGUID 12 1 0 0 0 f f f f t f i 1 0 25 "17" _null_ _null_ _null_ _null_ _null_ pg_mv_stats_histogram_info _null_ _null_ _null_ ));
+DESCR("multi-variate statistics: histogram info");
+DATA(insert OID = 3374 ( pg_mv_histogram_buckets PGNSP PGUID 12 1 1000 0 0 f f f f t t i 2 0 2249 "26 23" "{26,23,23,1009,1009,1000,1000,1000,701,701,701}" "{i,i,o,o,o,o,o,o,o,o,o}" "{oid,otype,index,minvals,maxvals,nullsonly,mininclusive,maxinclusive,frequency,density,bucket_size}" _null_ _null_ pg_mv_histogram_buckets _null_ _null_ _null_ ));
+DESCR("details about histogram buckets");
DATA(insert OID = 1928 ( pg_stat_get_numscans PGNSP PGUID 12 1 0 0 0 f f f f t f s 1 0 20 "26" _null_ _null_ _null_ _null_ _null_ pg_stat_get_numscans _null_ _null_ _null_ ));
DESCR("statistics: number of scans done for table/index");
diff --git a/src/include/nodes/relation.h b/src/include/nodes/relation.h
index 6fab94a..b776962 100644
--- a/src/include/nodes/relation.h
+++ b/src/include/nodes/relation.h
@@ -565,10 +565,12 @@ typedef struct MVStatisticInfo
/* enabled statistics */
bool deps_enabled; /* functional dependencies enabled */
bool mcv_enabled; /* MCV list enabled */
+ bool hist_enabled; /* histogram enabled */
/* built/available statistics */
bool deps_built; /* functional dependencies built */
bool mcv_built; /* MCV list built */
+ bool hist_built; /* histogram built */
/* columns in the statistics (attnums) */
int2vector *stakeys; /* attnums of the columns covered */
diff --git a/src/include/utils/mvstats.h b/src/include/utils/mvstats.h
index b028192..1cb9400 100644
--- a/src/include/utils/mvstats.h
+++ b/src/include/utils/mvstats.h
@@ -91,6 +91,123 @@ typedef MCVListData *MCVList;
#define MVSTAT_MCVLIST_MAX_ITEMS 8192 /* max items in MCV list */
/*
+ * Multivariate histograms
+ */
+typedef struct MVBucketData {
+
+ /* Frequencies of this bucket. */
+ float ntuples; /* frequency of tuples in this bucket */
+
+ /*
+ * Information about dimensions being NULL-only (not yet used
+ * during estimation).
+ */
+ bool *nullsonly;
+
+ /* lower boundaries - values and information about the inequalities */
+ Datum *min;
+ bool *min_inclusive;
+
+ /* upper boundaries - values and information about the inequalities */
+ Datum *max;
+ bool *max_inclusive;
+
+ /* used when building the histogram (not serialized/deserialized) */
+ void *build_data;
+
+} MVBucketData;
+
+typedef MVBucketData *MVBucket;
+
+
+typedef struct MVHistogramData {
+
+ uint32 magic; /* magic constant marker */
+ uint32 type; /* type of histogram (BASIC) */
+ uint32 nbuckets; /* number of buckets (buckets array) */
+ uint32 ndimensions; /* number of dimensions */
+
+ MVBucket *buckets; /* array of buckets */
+
+} MVHistogramData;
+
+typedef MVHistogramData *MVHistogram;
+
+/*
+ * Histogram in a partially serialized form, with deduplicated boundary
+ * values etc.
+ *
+ * TODO add more detailed description here
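+ *
+ * In short, the boundary values are stored sorted and deduplicated,
+ * one array per dimension, and each bucket stores uint16 indexes
+ * into those arrays instead of the Datum values themselves.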
+ */
+
+typedef struct MVSerializedBucketData {
+
+ /* Frequencies of this bucket. */
+ float ntuples; /* frequency of tuples in this bucket */
+
+ /*
+ * Information about dimensions being NULL-only (not yet used
+ * during estimation).
+ */
+ bool *nullsonly;
+
+ /* lower boundaries - values and information about the inequalities */
+ uint16 *min;
+ bool *min_inclusive;
+
+ /* indexes of upper boundaries - values and information about the
+ * inequalities (exclusive vs. inclusive) */
+ uint16 *max;
+ bool *max_inclusive;
+
+} MVSerializedBucketData;
+
+typedef MVSerializedBucketData *MVSerializedBucket;
+
+typedef struct MVSerializedHistogramData {
+
+ uint32 magic; /* magic constant marker */
+ uint32 type; /* type of histogram (BASIC) */
+ uint32 nbuckets; /* number of buckets (buckets array) */
+ uint32 ndimensions; /* number of dimensions */
+
+ /*
+ * keep this the same as in MVHistogramData, because of
+ * deserialization (same offset)
+ */
+ MVSerializedBucket *buckets; /* array of buckets */
+
+ /*
+ * serialized boundary values, one array per dimension, deduplicated
+ * (the min/max indexes point into these arrays)
+ */
+ int *nvalues;
+ Datum **values;
+
+} MVSerializedHistogramData;
+
+typedef MVSerializedHistogramData *MVSerializedHistogram;
+
+
+/* used to flag stats serialized to bytea */
+#define MVSTAT_HIST_MAGIC 0x7F8C5670 /* marks serialized bytea */
+#define MVSTAT_HIST_TYPE_BASIC 1 /* basic histogram type */
+
+/*
+ * Limits used for max_buckets option, i.e. we're always guaranteed
+ * to have space for at least MVSTAT_HIST_MIN_BUCKETS, and we cannot
+ * have more than MVSTAT_HIST_MAX_BUCKETS buckets.
+ *
+ * This is just a boundary for the 'max' threshold - the actual
+ * histogram may use fewer buckets than MVSTAT_HIST_MAX_BUCKETS.
+ *
+ * TODO The MVSTAT_HIST_MIN_BUCKETS should be related to the number of
+ * attributes (MVSTATS_MAX_DIMENSIONS) because of NULL-buckets.
+ * There should be at least 2^N buckets, otherwise we may be unable
+ * to build the NULL buckets.
+ */
+#define MVSTAT_HIST_MIN_BUCKETS 128 /* min number of buckets */
+#define MVSTAT_HIST_MAX_BUCKETS 16384 /* max number of buckets */
+
+/*
* TODO Maybe fetching the histogram/MCV list separately is inefficient?
* Consider adding a single `fetch_stats` method, fetching all
* stats specified using flags (or something like that).
@@ -98,20 +215,27 @@ typedef MCVListData *MCVList;
MVDependencies load_mv_dependencies(Oid mvoid);
MCVList load_mv_mcvlist(Oid mvoid);
+MVHistogram load_mv_histogram(Oid mvoid);
+MVSerializedHistogram load_mv_histogram2(Oid mvoid);
bytea * serialize_mv_dependencies(MVDependencies dependencies);
bytea * serialize_mv_mcvlist(MCVList mcvlist, int2vector *attrs,
VacAttrStats **stats);
+bytea * serialize_mv_histogram(MVHistogram histogram, int2vector *attrs,
+ VacAttrStats **stats);
/* deserialization of stats (serialization is private to analyze) */
MVDependencies deserialize_mv_dependencies(bytea * data);
MCVList deserialize_mv_mcvlist(bytea * data);
+MVHistogram deserialize_mv_histogram(bytea * data);
+MVSerializedHistogram deserialize_mv_histogram_2(bytea * data);
/*
* Returns index of the attribute number within the vector (i.e. a
* dimension within the stats).
*/
int mv_get_index(AttrNumber varattno, int2vector * stakeys);
int2vector* find_mv_attnums(Oid mvoid, Oid *relid);
@@ -120,6 +244,8 @@ extern Datum pg_mv_stats_dependencies_info(PG_FUNCTION_ARGS);
extern Datum pg_mv_stats_dependencies_show(PG_FUNCTION_ARGS);
extern Datum pg_mv_stats_mcvlist_info(PG_FUNCTION_ARGS);
extern Datum pg_mv_mcvlist_items(PG_FUNCTION_ARGS);
+extern Datum pg_mv_stats_histogram_info(PG_FUNCTION_ARGS);
+extern Datum pg_mv_histogram_buckets(PG_FUNCTION_ARGS);
MVDependencies
build_mv_dependencies(int numrows, HeapTuple *rows, int2vector *attrs,
@@ -129,10 +255,15 @@ MCVList
build_mv_mcvlist(int numrows, HeapTuple *rows, int2vector *attrs,
VacAttrStats **stats, int *numrows_filtered);
+MVHistogram
+build_mv_histogram(int numrows, HeapTuple *rows, int2vector *attrs,
+ VacAttrStats **stats, int numrows_total);
+
void build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
int natts, VacAttrStats **vacattrstats);
-void update_mv_stats(Oid relid, MVDependencies dependencies, MCVList mcvlist,
+void update_mv_stats(Oid relid, MVDependencies dependencies,
+ MCVList mcvlist, MVHistogram histogram,
int2vector *attrs, VacAttrStats **stats);
#endif
diff --git a/src/test/regress/expected/mv_histogram.out b/src/test/regress/expected/mv_histogram.out
new file mode 100644
index 0000000..a3d3fd8
--- /dev/null
+++ b/src/test/regress/expected/mv_histogram.out
@@ -0,0 +1,207 @@
+-- data type passed by value
+CREATE TABLE mv_histogram (
+ a INT,
+ b INT,
+ c INT
+);
+-- unknown column
+ALTER TABLE mv_histogram ADD STATISTICS (histogram) ON (unknown_column);
+ERROR: column "unknown_column" referenced in statistics does not exist
+-- single column
+ALTER TABLE mv_histogram ADD STATISTICS (histogram) ON (a);
+ERROR: multivariate stats require 2 or more columns
+-- single column, duplicated
+ALTER TABLE mv_histogram ADD STATISTICS (histogram) ON (a, a);
+ERROR: duplicate column name in statistics definition
+-- two columns, one duplicated
+ALTER TABLE mv_histogram ADD STATISTICS (histogram) ON (a, a, b);
+ERROR: duplicate column name in statistics definition
+-- unknown option
+ALTER TABLE mv_histogram ADD STATISTICS (unknown_option) ON (a, b, c);
+ERROR: unrecognized STATISTICS option "unknown_option"
+-- missing histogram statistics
+ALTER TABLE mv_histogram ADD STATISTICS (dependencies, max_buckets 200) ON (a, b, c);
+ERROR: option 'histogram' is required by other option(s)
+-- invalid max_buckets value / too low
+ALTER TABLE mv_histogram ADD STATISTICS (mcv, max_buckets 10) ON (a, b, c);
+ERROR: minimum number of buckets is 128
+-- invalid max_buckets value / too high
+ALTER TABLE mv_histogram ADD STATISTICS (mcv, max_buckets 100000) ON (a, b, c);
+ERROR: maximum number of buckets is 16384
+-- correct command
+ALTER TABLE mv_histogram ADD STATISTICS (histogram) ON (a, b, c);
+-- random data (no functional dependencies)
+INSERT INTO mv_histogram
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+ANALYZE mv_histogram;
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+ hist_enabled | hist_built
+--------------+------------
+ t | t
+(1 row)
+
+TRUNCATE mv_histogram;
+-- a => b, a => c, b => c
+INSERT INTO mv_histogram
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE mv_histogram;
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+ hist_enabled | hist_built
+--------------+------------
+ t | t
+(1 row)
+
+TRUNCATE mv_histogram;
+-- a => b, a => c
+INSERT INTO mv_histogram
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE mv_histogram;
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+ hist_enabled | hist_built
+--------------+------------
+ t | t
+(1 row)
+
+TRUNCATE mv_histogram;
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO mv_histogram
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX hist_idx ON mv_histogram (a, b);
+ANALYZE mv_histogram;
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+ hist_enabled | hist_built
+--------------+------------
+ t | t
+(1 row)
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mv_histogram WHERE a = 10 AND b = 5;
+ QUERY PLAN
+--------------------------------------------
+ Bitmap Heap Scan on mv_histogram
+ Recheck Cond: ((a = 10) AND (b = 5))
+ -> Bitmap Index Scan on hist_idx
+ Index Cond: ((a = 10) AND (b = 5))
+(4 rows)
+
+DROP TABLE mv_histogram;
+-- varlena type (text)
+CREATE TABLE mv_histogram (
+ a TEXT,
+ b TEXT,
+ c TEXT
+);
+ALTER TABLE mv_histogram ADD STATISTICS (histogram) ON (a, b, c);
+-- random data (no functional dependencies)
+INSERT INTO mv_histogram
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+ANALYZE mv_histogram;
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+ hist_enabled | hist_built
+--------------+------------
+ t | t
+(1 row)
+
+TRUNCATE mv_histogram;
+-- a => b, a => c, b => c
+INSERT INTO mv_histogram
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE mv_histogram;
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+ hist_enabled | hist_built
+--------------+------------
+ t | t
+(1 row)
+
+TRUNCATE mv_histogram;
+-- a => b, a => c
+INSERT INTO mv_histogram
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE mv_histogram;
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+ hist_enabled | hist_built
+--------------+------------
+ t | t
+(1 row)
+
+TRUNCATE mv_histogram;
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO mv_histogram
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX hist_idx ON mv_histogram (a, b);
+ANALYZE mv_histogram;
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+ hist_enabled | hist_built
+--------------+------------
+ t | t
+(1 row)
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mv_histogram WHERE a = '10' AND b = '5';
+ QUERY PLAN
+------------------------------------------------------------
+ Bitmap Heap Scan on mv_histogram
+ Recheck Cond: ((a = '10'::text) AND (b = '5'::text))
+ -> Bitmap Index Scan on hist_idx
+ Index Cond: ((a = '10'::text) AND (b = '5'::text))
+(4 rows)
+
+TRUNCATE mv_histogram;
+-- check explain (expect bitmap index scan, not plain index scan) with NULLs
+INSERT INTO mv_histogram
+ SELECT
+ (CASE WHEN i/10000 = 0 THEN NULL ELSE i/10000 END),
+ (CASE WHEN i/20000 = 0 THEN NULL ELSE i/20000 END),
+ (CASE WHEN i/40000 = 0 THEN NULL ELSE i/40000 END)
+ FROM generate_series(1,1000000) s(i);
+ANALYZE mv_histogram;
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+ hist_enabled | hist_built
+--------------+------------
+ t | t
+(1 row)
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mv_histogram WHERE a IS NULL AND b IS NULL;
+ QUERY PLAN
+---------------------------------------------------
+ Bitmap Heap Scan on mv_histogram
+ Recheck Cond: ((a IS NULL) AND (b IS NULL))
+ -> Bitmap Index Scan on hist_idx
+ Index Cond: ((a IS NULL) AND (b IS NULL))
+(4 rows)
+
+DROP TABLE mv_histogram;
+-- NULL values (mix of int and text columns)
+CREATE TABLE mv_histogram (
+ a INT,
+ b TEXT,
+ c INT,
+ d TEXT
+);
+ALTER TABLE mv_histogram ADD STATISTICS (histogram) ON (a, b, c, d);
+INSERT INTO mv_histogram
+ SELECT
+ mod(i, 100),
+ (CASE WHEN mod(i, 200) = 0 THEN NULL ELSE mod(i,200) END),
+ mod(i, 400),
+ (CASE WHEN mod(i, 300) = 0 THEN NULL ELSE mod(i,600) END)
+ FROM generate_series(1,10000) s(i);
+ANALYZE mv_histogram;
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+ hist_enabled | hist_built
+--------------+------------
+ t | t
+(1 row)
+
+DROP TABLE mv_histogram;
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index fc27d34..b02d06e 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1359,7 +1359,9 @@ pg_mv_stats| SELECT n.nspname AS schemaname,
length(s.stadeps) AS depsbytes,
pg_mv_stats_dependencies_info(s.stadeps) AS depsinfo,
length(s.stamcv) AS mcvbytes,
- pg_mv_stats_mcvlist_info(s.stamcv) AS mcvinfo
+ pg_mv_stats_mcvlist_info(s.stamcv) AS mcvinfo,
+ length(s.stahist) AS histbytes,
+ pg_mv_stats_histogram_info(s.stahist) AS histinfo
FROM ((pg_mv_statistic s
JOIN pg_class c ON ((c.oid = s.starelid)))
LEFT JOIN pg_namespace n ON ((n.oid = c.relnamespace)));
diff --git a/src/test/regress/parallel_schedule b/src/test/regress/parallel_schedule
index 63727a4..aeb89f8 100644
--- a/src/test/regress/parallel_schedule
+++ b/src/test/regress/parallel_schedule
@@ -111,4 +111,4 @@ test: event_trigger
test: stats
# run tests of multivariate stats
-test: mv_dependencies mv_mcv
+test: mv_dependencies mv_mcv mv_histogram
diff --git a/src/test/regress/serial_schedule b/src/test/regress/serial_schedule
index 5b07b3b..ee1468d 100644
--- a/src/test/regress/serial_schedule
+++ b/src/test/regress/serial_schedule
@@ -155,3 +155,4 @@ test: event_trigger
test: stats
test: mv_dependencies
test: mv_mcv
+test: mv_histogram
diff --git a/src/test/regress/sql/mv_histogram.sql b/src/test/regress/sql/mv_histogram.sql
new file mode 100644
index 0000000..31c627a
--- /dev/null
+++ b/src/test/regress/sql/mv_histogram.sql
@@ -0,0 +1,176 @@
+-- data type passed by value
+CREATE TABLE mv_histogram (
+ a INT,
+ b INT,
+ c INT
+);
+
+-- unknown column
+ALTER TABLE mv_histogram ADD STATISTICS (histogram) ON (unknown_column);
+
+-- single column
+ALTER TABLE mv_histogram ADD STATISTICS (histogram) ON (a);
+
+-- single column, duplicated
+ALTER TABLE mv_histogram ADD STATISTICS (histogram) ON (a, a);
+
+-- two columns, one duplicated
+ALTER TABLE mv_histogram ADD STATISTICS (histogram) ON (a, a, b);
+
+-- unknown option
+ALTER TABLE mv_histogram ADD STATISTICS (unknown_option) ON (a, b, c);
+
+-- missing histogram statistics
+ALTER TABLE mv_histogram ADD STATISTICS (dependencies, max_buckets 200) ON (a, b, c);
+
+-- invalid max_buckets value / too low
+ALTER TABLE mv_histogram ADD STATISTICS (mcv, max_buckets 10) ON (a, b, c);
+
+-- invalid max_buckets value / too high
+ALTER TABLE mv_histogram ADD STATISTICS (mcv, max_buckets 100000) ON (a, b, c);
+
+-- correct command
+ALTER TABLE mv_histogram ADD STATISTICS (histogram) ON (a, b, c);
+
+-- random data (no functional dependencies)
+INSERT INTO mv_histogram
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+
+ANALYZE mv_histogram;
+
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+
+TRUNCATE mv_histogram;
+
+-- a => b, a => c, b => c
+INSERT INTO mv_histogram
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+
+ANALYZE mv_histogram;
+
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+
+TRUNCATE mv_histogram;
+
+-- a => b, a => c
+INSERT INTO mv_histogram
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE mv_histogram;
+
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+
+TRUNCATE mv_histogram;
+
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO mv_histogram
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX hist_idx ON mv_histogram (a, b);
+ANALYZE mv_histogram;
+
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mv_histogram WHERE a = 10 AND b = 5;
+
+DROP TABLE mv_histogram;
+
+-- varlena type (text)
+CREATE TABLE mv_histogram (
+ a TEXT,
+ b TEXT,
+ c TEXT
+);
+
+ALTER TABLE mv_histogram ADD STATISTICS (histogram) ON (a, b, c);
+
+-- random data (no functional dependencies)
+INSERT INTO mv_histogram
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+
+ANALYZE mv_histogram;
+
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+
+TRUNCATE mv_histogram;
+
+-- a => b, a => c, b => c
+INSERT INTO mv_histogram
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+
+ANALYZE mv_histogram;
+
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+
+TRUNCATE mv_histogram;
+
+-- a => b, a => c
+INSERT INTO mv_histogram
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE mv_histogram;
+
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+
+TRUNCATE mv_histogram;
+
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO mv_histogram
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX hist_idx ON mv_histogram (a, b);
+ANALYZE mv_histogram;
+
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mv_histogram WHERE a = '10' AND b = '5';
+
+TRUNCATE mv_histogram;
+
+-- check explain (expect bitmap index scan, not plain index scan) with NULLs
+INSERT INTO mv_histogram
+ SELECT
+ (CASE WHEN i/10000 = 0 THEN NULL ELSE i/10000 END),
+ (CASE WHEN i/20000 = 0 THEN NULL ELSE i/20000 END),
+ (CASE WHEN i/40000 = 0 THEN NULL ELSE i/40000 END)
+ FROM generate_series(1,1000000) s(i);
+ANALYZE mv_histogram;
+
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mv_histogram WHERE a IS NULL AND b IS NULL;
+
+DROP TABLE mv_histogram;
+
+-- NULL values (mix of int and text columns)
+CREATE TABLE mv_histogram (
+ a INT,
+ b TEXT,
+ c INT,
+ d TEXT
+);
+
+ALTER TABLE mv_histogram ADD STATISTICS (histogram) ON (a, b, c, d);
+
+INSERT INTO mv_histogram
+ SELECT
+ mod(i, 100),
+ (CASE WHEN mod(i, 200) = 0 THEN NULL ELSE mod(i,200) END),
+ mod(i, 400),
+ (CASE WHEN mod(i, 300) = 0 THEN NULL ELSE mod(i,600) END)
+ FROM generate_series(1,10000) s(i);
+
+ANALYZE mv_histogram;
+
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+
+DROP TABLE mv_histogram;
--
1.9.3
Attachment: 0005-multi-statistics-estimation.patch (text/x-patch)
From fb6240254c3fb2311c3ae91597ae29bcbf18f20b Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@pgaddict.com>
Date: Fri, 6 Feb 2015 01:42:38 +0100
Subject: [PATCH 5/6] multi-statistics estimation
The general idea is that a probability (which
is what selectivity is) can be split into a product of
conditional probabilities like this:
P(A & B & C) = P(A & B) * P(C|A & B)
If we assume that C and B are conditionally independent
(given A), the last term may be simplified like this:
P(A & B & C) = P(A & B) * P(C|A)
so we only need probabilities on [A,B] and [C,A] to compute
the original probability.
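As a made-up example: if the statistics on [A,B] give
P(A & B) = 0.01 and the statistics on [C,A] give P(C|A) = 0.5,
the estimate becomes 0.01 * 0.5 = 0.005, instead of the product
of three selectivities computed independently.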
The implementation works in the other direction, though.
We know what probability P(A & B & C) we need to compute,
and also what statistics are available.
So we search for a combination of statistics, covering
the clauses in an optimal way (most clauses covered, most
dependencies exploited).
There are two possible approaches - exhaustive and greedy.
The exhaustive one walks through all permutations of
stats using dynamic programming, so it's guaranteed to
find the optimal solution, but it soon gets very slow as
it's roughly O(N!). The dynamic programming may improve
that a bit, but it's still far too expensive for large
numbers of statistics (on a single table).
The greedy algorithm is very simple - in every step it
picks the locally best statistics. That may not guarantee
the globally optimal solution (but maybe it does?), but it
only needs N steps to find a solution, so it's very fast
(processing the selected stats is usually way more
expensive).
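(To illustrate the difference: with 8 statistics on a table
the exhaustive search may have to explore up to 8! = 40320
orderings, while the greedy one does at most 8 passes.)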
There's a GUC for selecting the search algorithm
mvstat_search = {'greedy', 'exhaustive'}
The default value is 'greedy' as that's much safer (with
respect to runtime). See choose_mv_statistics().
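Assuming the GUC ends up user-settable, switching the
algorithm for experiments should presumably be as simple as
SET mvstat_search = 'exhaustive';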
Once we have found a sequence of statistics, we apply
them to the clauses using the conditional probabilities.
We process the selected stats one by one, and for each
we select the estimated clauses and conditions. See
clauselist_selectivity() for more details.
Limitations
-----------
It's still true that each clause at a given level has to
be covered by a single MV statistics. So with this query
WHERE (clause1) AND (clause2) AND (clause3 OR clause4)
each parenthesized clause has to be covered by a single
multivariate statistics.
Clauses not covered by a single statistics at this level
will be passed to clause_selectivity(), which will treat
them as a collection of simpler clauses (connected by AND
or OR), with the clauses from the previous level used as
conditions.
So using the same example, the last clause will be passed
to clause_selectivity() with 'clause1' and 'clause2' as
conditions, and it will be processed using multivariate
stats if possible.
The other limitation is that all the expressions have to
be mv-compatible, i.e. there can't be a mix of
mv-compatible and mv-incompatible expressions.
Fixing this should be relatively simple - just split the
list into two parts (mv-compatible/incompatible), as at
the top level.
---
contrib/file_fdw/file_fdw.c | 3 +-
contrib/postgres_fdw/postgres_fdw.c | 6 +-
src/backend/optimizer/path/clausesel.c | 2182 +++++++++++++++++++++++++++++---
src/backend/optimizer/path/costsize.c | 23 +-
src/backend/optimizer/util/orclauses.c | 4 +-
src/backend/utils/adt/selfuncs.c | 17 +-
src/backend/utils/misc/guc.c | 20 +
src/include/optimizer/cost.h | 6 +-
src/include/utils/mvstats.h | 8 +
9 files changed, 2068 insertions(+), 201 deletions(-)
diff --git a/contrib/file_fdw/file_fdw.c b/contrib/file_fdw/file_fdw.c
index 4368897..7b4839b 100644
--- a/contrib/file_fdw/file_fdw.c
+++ b/contrib/file_fdw/file_fdw.c
@@ -947,7 +947,8 @@ estimate_size(PlannerInfo *root, RelOptInfo *baserel,
baserel->baserestrictinfo,
0,
JOIN_INNER,
- NULL);
+ NULL,
+ NIL);
nrows = clamp_row_est(nrows);
diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c
index 478e124..ff6b438 100644
--- a/contrib/postgres_fdw/postgres_fdw.c
+++ b/contrib/postgres_fdw/postgres_fdw.c
@@ -478,7 +478,8 @@ postgresGetForeignRelSize(PlannerInfo *root,
fpinfo->local_conds,
baserel->relid,
JOIN_INNER,
- NULL);
+ NULL,
+ NIL);
cost_qual_eval(&fpinfo->local_conds_cost, fpinfo->local_conds, root);
@@ -1770,7 +1771,8 @@ estimate_path_cost_size(PlannerInfo *root,
local_join_conds,
baserel->relid,
JOIN_INNER,
- NULL);
+ NULL,
+ NIL);
local_sel *= fpinfo->local_conds_sel;
rows = clamp_row_est(rows * local_sel);
diff --git a/src/backend/optimizer/path/clausesel.c b/src/backend/optimizer/path/clausesel.c
index 2d3cf09..7eb53b9 100644
--- a/src/backend/optimizer/path/clausesel.c
+++ b/src/backend/optimizer/path/clausesel.c
@@ -30,7 +30,8 @@
#include "utils/typcache.h"
#include "parser/parsetree.h"
-
+#include "access/sysattr.h"
+#include "miscadmin.h"
#include <stdio.h>
@@ -48,6 +49,13 @@ typedef struct RangeQueryClause
Selectivity hibound; /* Selectivity of a var < something clause */
} RangeQueryClause;
+static Selectivity clauselist_selectivity_or(PlannerInfo *root,
+ List *clauses,
+ int varRelid,
+ JoinType jointype,
+ SpecialJoinInfo *sjinfo,
+ List *conditions);
+
static void addRangeClause(RangeQueryClause **rqlist, Node *clause,
bool varonleft, bool isLTsel, Selectivity s2);
@@ -63,23 +71,29 @@ static Bitmapset *collect_mv_attnums(PlannerInfo *root, List *clauses,
Oid varRelid, Index *relid, SpecialJoinInfo *sjinfo,
int type);
+static Bitmapset *clause_mv_get_attnums(PlannerInfo *root, Node *clause);
+
static List *clauselist_apply_dependencies(PlannerInfo *root, List *clauses,
Oid varRelid, List *stats,
SpecialJoinInfo *sjinfo);
-static MVStatisticInfo *choose_mv_statistics(List *mvstats, Bitmapset *attnums);
-
static List *clauselist_mv_split(PlannerInfo *root, SpecialJoinInfo *sjinfo,
List *clauses, Oid varRelid,
List **mvclauses, MVStatisticInfo *mvstats, int types);
static Selectivity clauselist_mv_selectivity(PlannerInfo *root,
- List *clauses, MVStatisticInfo *mvstats);
+ MVStatisticInfo *mvstats, List *clauses,
+ List *conditions, bool is_or);
+
static Selectivity clauselist_mv_selectivity_mcvlist(PlannerInfo *root,
- List *clauses, MVStatisticInfo *mvstats,
- bool *fullmatch, Selectivity *lowsel);
+ MVStatisticInfo *mvstats,
+ List *clauses, List *conditions,
+ bool is_or, bool *fullmatch,
+ Selectivity *lowsel);
static Selectivity clauselist_mv_selectivity_histogram(PlannerInfo *root,
- List *clauses, MVStatisticInfo *mvstats);
+ MVStatisticInfo *mvstats,
+ List *clauses, List *conditions,
+ bool is_or);
static int update_match_bitmap_mcvlist(PlannerInfo *root, List *clauses,
int2vector *stakeys, MCVList mcvlist,
@@ -93,6 +107,33 @@ static int update_match_bitmap_histogram(PlannerInfo *root, List *clauses,
int nmatches, char * matches,
bool is_or);
+/*
+ * Describes a combination of multiple statistics covering the attributes
+ * referenced by the clauses. The array 'stats' (with nstats elements)
+ * lists the chosen statistics in the order they are applied, and the
+ * counters track how many clauses and conditions the solution covers.
+ *
+ * choose_mv_statistics_exhaustive() uses this to track both the current
+ * and the best solution, while walking through the space of possible
+ * combinations.
+ */
+typedef struct mv_solution_t {
+ int nclauses; /* number of clauses covered */
+ int nconditions; /* number of conditions covered */
+ int nstats; /* number of stats applied */
+ int *stats; /* stats (in the apply order) */
+} mv_solution_t;
+
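+/*
+ * For illustration: if a solution applies the statistics with indexes
+ * 0 and 1 (in the array built by choose_mv_statistics), in that order,
+ * it has nstats = 2 and stats = {0, 1}.
+ */
+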
+static List *choose_mv_statistics(PlannerInfo *root,
+ List *mvstats,
+ List *clauses, List *conditions,
+ Oid varRelid,
+ SpecialJoinInfo *sjinfo, int type);
+
+static Bitmapset * get_varattnos(Node * node, Index relid);
+
+int mvstat_search_type = MVSTAT_SEARCH_GREEDY;
+
/* used for merging bitmaps - AND (min), OR (max) */
#define MAX(x, y) (((x) > (y)) ? (x) : (y))
#define MIN(x, y) (((x) < (y)) ? (x) : (y))
@@ -221,112 +262,296 @@ clauselist_selectivity(PlannerInfo *root,
List *clauses,
int varRelid,
JoinType jointype,
- SpecialJoinInfo *sjinfo)
+ SpecialJoinInfo *sjinfo,
+ List *conditions)
{
Selectivity s1 = 1.0;
RangeQueryClause *rqlist = NULL;
ListCell *l;
/* processing mv stats */
- Oid relid = InvalidOid;
+ bool has_mv_stats;
+ Index relid = InvalidOid;
/* attributes in mv-compatible clauses */
Bitmapset *mvattnums = NULL;
- /*
- * If there's exactly one clause, then no use in trying to match up
- * pairs, so just go directly to clause_selectivity().
- */
- if (list_length(clauses) == 1)
- return clause_selectivity(root, (Node *) linitial(clauses),
- varRelid, jointype, sjinfo);
-
- /*
- * Collect attributes referenced by mv-compatible clauses (looking
- * for clauses compatible with functional dependencies for now).
- */
- mvattnums = collect_mv_attnums(root, clauses, varRelid, &relid, sjinfo,
- MV_CLAUSE_TYPE_FDEP);
+ /* local conditions, accumulated and passed to clauses in this list */
+ List *conditions_local = NIL;
/*
- * If there are mv-compatible clauses, referencing at least two
- * different columns (otherwise it makes no sense to use mv stats),
- * try to reduce the clauses using functional dependencies, and
- * recollect the attributes from the reduced list.
+ * Check whether there are multivariate stats on the table.
*
- * We don't need to select a single statistics for this - we can
- * apply all the functional dependencies we have.
- */
- if (bms_num_members(mvattnums) >= 2)
+ * FIXME This seems not to be working as expected. Sometimes there
+ * are multiple relids even when (varRelid==0).
+ * */
+ if (varRelid == 0)
{
- /*
- * fetch info from the catalog (not the serialized stats yet)
- *
- * TODO This is rather ugly - we get the stats as a list from
- * RelOptInfo (thanks to relcache/syscache), but we transform
- * it into an array (which the other methods use for now).
- * This should not be necessary, I guess.
- * */
- List *stats = root->simple_rel_array[relid]->mvstatlist;
+ /* find the (single) relid */
+ Index relidx;
+ Relids relids = pull_varnos((Node*)clauses);
- /* reduce clauses by applying functional dependencies rules */
- clauses = clauselist_apply_dependencies(root, clauses, varRelid,
- stats, sjinfo);
+ if (bms_num_members(relids) == 1)
+ {
+ relidx = bms_singleton_member(relids);
+ has_mv_stats
+ = (root->simple_rel_array[relidx]->mvstatlist != NIL);
+ }
+ else
+ has_mv_stats = false;
}
+ else
+ has_mv_stats
+ = (root->simple_rel_array[varRelid]->mvstatlist != NIL);
- /*
- * Recollect attributes from mv-compatible clauses (maybe we've
- * removed so many clauses we have a single mv-compatible attnum).
- * From now on we're only interested in MCV-compatible clauses.
- */
- mvattnums = collect_mv_attnums(root, clauses, varRelid, &relid, sjinfo,
- (MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST));
-
- /*
- * If there still are at least two columns, we'll try to select
- * a suitable multivariate stats.
- */
- if (bms_num_members(mvattnums) >= 2)
+ /* skip the processing if there are no mv stats */
+ if (has_mv_stats)
{
+ conditions_local = list_copy(conditions);
+
/*
- * fetch info from the catalog (not the serialized stats yet)
- *
- * TODO We may need to repeat this, because the previous load only
- * happens if there are at least 2 clauses compatible with
- * functional dependencies.
+ * Collect attributes referenced by mv-compatible clauses (looking
+ * for clauses compatible with functional dependencies for now).
+ */
+ mvattnums = collect_mv_attnums(root, clauses, varRelid, &relid, sjinfo,
+ MV_CLAUSE_TYPE_FDEP);
+
+ /*
+ * If there are mv-compatible clauses, referencing at least two
+ * different columns (otherwise it makes no sense to use mv stats),
+ * try to reduce the clauses using functional dependencies, and
+ * recollect the attributes from the reduced list.
*
- * TODO This is rather ugly - we get the stats as a list from
- * RelOptInfo (thanks to relcache/syscache), but we transform
- * it into an array (which the other methods use for now).
- * This should not be necessary, I guess.
- * */
- List *stats = root->simple_rel_array[relid]->mvstatlist;
-
- /* see choose_mv_statistics() for details */
- if (stats != NIL)
+ * We don't need to select a single statistics for this - we can
+ * apply all the functional dependencies we have.
+ */
+ if (bms_num_members(mvattnums) >= 2)
{
- MVStatisticInfo *mvstat = choose_mv_statistics(stats, mvattnums);
+ /*
+ * fetch info from the catalog (not the serialized stats yet)
+ *
+ * TODO This is rather ugly - we get the stats as a list from
+ * RelOptInfo (thanks to relcache/syscache), but we transform
+ * it into an array (which the other methods use for now).
+ * This should not be necessary, I guess.
+ * */
+ List *stats = root->simple_rel_array[relid]->mvstatlist;
+
+ /* reduce clauses by applying functional dependencies rules */
+ clauses = clauselist_apply_dependencies(root, clauses, varRelid,
+ stats, sjinfo);
+ }
- if (mvstat != NULL) /* we have a matching stats */
+ /*
+ * Recollect attributes from mv-compatible clauses (maybe we've
+ * removed so many clauses we have a single mv-compatible attnum).
+ * From now on we're only interested in MCV-compatible clauses.
+ */
+ mvattnums = collect_mv_attnums(root, clauses, varRelid, &relid, sjinfo,
+ (MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST));
+
+ /*
+ * If there still are at least two columns, we'll try to select
+ * a suitable multivariate stats.
+ */
+ if (bms_num_members(mvattnums) >= 2)
+ {
+ /*
+ * fetch info from the catalog (not the serialized stats yet)
+ *
+ * TODO We may need to repeat this, because the previous load only
+ * happens if there are at least 2 clauses compatible with
+ * functional dependencies.
+ *
+ * TODO This is rather ugly - we get the stats as a list from
+ * RelOptInfo (thanks to relcache/syscache), but we transform
+ * it into an array (which the other methods use for now).
+ * This should not be necessary, I guess.
+ * */
+ List *stats = root->simple_rel_array[relid]->mvstatlist;
+
+ /* see choose_mv_statistics() for details */
+ if (stats != NIL)
{
- /* clauses compatible with multi-variate stats */
- List *mvclauses = NIL;
+ int k;
+ ListCell *s;
- /* split the clauselist into regular and mv-clauses */
- clauses = clauselist_mv_split(root, sjinfo, clauses,
- varRelid, &mvclauses, mvstat,
- (MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST));
+ List *solution
+ = choose_mv_statistics(root, stats,
+ clauses, conditions,
+ varRelid, sjinfo,
+ (MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST));
+
+ /* apply the statistics from the solution, one by one */
+ foreach (s, solution)
+ {
+ MVStatisticInfo *mvstat = (MVStatisticInfo *)lfirst(s);
- /* we've chosen the histogram to match the clauses */
- Assert(mvclauses != NIL);
+ /* clauses compatible with multi-variate stats */
+ List *mvclauses = NIL;
+ List *mvclauses_new = NIL;
+ List *mvclauses_conditions = NIL;
+ Bitmapset *stat_attnums = NULL;
- /* compute the multivariate stats */
- s1 *= clauselist_mv_selectivity(root, mvclauses, mvstat);
+ /* build attnum bitmapset for this statistics */
+ for (k = 0; k < mvstat->stakeys->dim1; k++)
+ stat_attnums = bms_add_member(stat_attnums,
+ mvstat->stakeys->values[k]);
+
+ /*
+ * Append the compatible conditions (passed from above)
+ * to mvclauses_conditions.
+ */
+ foreach (l, conditions)
+ {
+ Node *c = (Node*)lfirst(l);
+ Bitmapset *tmp = clause_mv_get_attnums(root, c);
+
+ if (bms_is_subset(tmp, stat_attnums))
+ mvclauses_conditions
+ = lappend(mvclauses_conditions, c);
+
+ bms_free(tmp);
+ }
+
+ /* split the clauselist into regular and mv-clauses
+ *
+ * We keep the list of clauses (we don't remove the
+ * clauses yet, because we want to use the clauses
+ * as conditions of other clauses).
+ *
+ * FIXME Do this only once, i.e. filter the clauses
+ * once (selecting clauses covered by at least
+ * one statistics) and then convert them into
+ * smaller per-statistics lists of conditions
+ * and estimated clauses.
+ */
+ clauselist_mv_split(root, sjinfo, clauses,
+ varRelid, &mvclauses, mvstat,
+ (MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST));
+
+ /*
+ * We've chosen the statistics to match the clauses, so
+ * each statistics from the solution should have at least
+ * one new clause (not covered by the previous stats).
+ */
+ Assert(mvclauses != NIL);
+
+ /*
+ * Mvclauses now contains only clauses compatible
+ * with the currently selected stats, but we have to
+ * split that into conditions (already matched by
+ * the previous stats), and the new clauses we need
+ * to estimate using this stats.
+ */
+ foreach (l, mvclauses)
+ {
+ ListCell *p;
+ bool covered = false;
+ Node *clause = (Node *) lfirst(l);
+ Bitmapset *clause_attnums = clause_mv_get_attnums(root, clause);
+
+ /*
+ * If already covered by previous stats, add it to
+ * conditions.
+ *
+ * TODO Maybe this could be relaxed a bit? Because
+ * with complex and/or clauses, this might
+ * mean no statistics actually covers such a
+ * complex clause.
+ */
+ foreach (p, solution)
+ {
+ int k;
+ Bitmapset *stat_attnums = NULL;
+
+ MVStatisticInfo *prev_stat
+ = (MVStatisticInfo *)lfirst(p);
+
+ /* break once we've reached the current statistics */
+ if (prev_stat == mvstat)
+ break;
+
+ for (k = 0; k < prev_stat->stakeys->dim1; k++)
+ stat_attnums = bms_add_member(stat_attnums,
+ prev_stat->stakeys->values[k]);
+
+ covered = bms_is_subset(clause_attnums, stat_attnums);
+
+ bms_free(stat_attnums);
+
+ if (covered)
+ break;
+ }
+
+ if (covered)
+ mvclauses_conditions
+ = lappend(mvclauses_conditions, clause);
+ else
+ mvclauses_new
+ = lappend(mvclauses_new, clause);
+ }
+
+ /*
+ * We need at least one new clause (not just conditions).
+ */
+ Assert(mvclauses_new != NIL);
+
+ /* compute the multivariate stats */
+ s1 *= clauselist_mv_selectivity(root, mvstat,
+ mvclauses_new,
+ mvclauses_conditions,
+ false); /* AND */
+ }
+
+ /*
+ * And now finally remove all the mv-compatible clauses.
+ *
+ * This only repeats the same split as above, but this
+ * time we actually use the result list (and feed it to
+ * the next call).
+ */
+ foreach (s, solution)
+ {
+ /* clauses compatible with multi-variate stats */
+ List *mvclauses = NIL;
+
+ MVStatisticInfo *mvstat = (MVStatisticInfo *)lfirst(s);
+
+ /* split the list into regular and mv-clauses */
+ clauses = clauselist_mv_split(root, sjinfo, clauses,
+ varRelid, &mvclauses, mvstat,
+ (MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST));
+
+ /*
+ * Add the clauses to the conditions (to be passed
+ * to regular clauses), irrespective of whether they
+ * will be used as conditions or clauses here.
+ *
+ * We only keep the remaining conditions in the
+ * clauses (we keep what clauselist_mv_split returns)
+ * so we add each MV condition exactly once.
+ */
+ conditions_local = list_concat(conditions_local, mvclauses);
+ }
}
}
}
/*
+ * If there's exactly one clause, then no use in trying to match up
+ * pairs, so just go directly to clause_selectivity().
+ */
+ if (list_length(clauses) == 1)
+ {
+ Selectivity s = clause_selectivity(root, (Node *) linitial(clauses),
+ varRelid, jointype, sjinfo,
+ conditions_local);
+ list_free(conditions_local);
+ return s;
+ }
+
+ /*
* Initial scan over clauses. Anything that doesn't look like a potential
* rangequery clause gets multiplied into s1 and forgotten. Anything that
* does gets inserted into an rqlist entry.
@@ -338,7 +563,8 @@ clauselist_selectivity(PlannerInfo *root,
Selectivity s2;
/* Always compute the selectivity using clause_selectivity */
- s2 = clause_selectivity(root, clause, varRelid, jointype, sjinfo);
+ s2 = clause_selectivity(root, clause, varRelid, jointype, sjinfo,
+ conditions_local);
/*
* Check for being passed a RestrictInfo.
@@ -493,6 +719,293 @@ clauselist_selectivity(PlannerInfo *root,
rqlist = rqnext;
}
+ /* free the local conditions */
+ list_free(conditions_local);
+
+ return s1;
+}
+
+/*
+ * Similar to clauselist_selectivity(), but for clauses connected by OR.
+ *
+ * That means a few differences:
+ *
+ * - functional dependencies don't apply to OR-clauses
+ *
+ * - we can't add the previous clauses to conditions
+ *
+ * - combined selectivity is computed as (s1+s2 - s1*s2) and not as
+ * a multiplication (s1*s2)
+ *
+ * Another way to evaluate this might be turning
+ *
+ * (a OR b OR c)
+ *
+ * into
+ *
+ * NOT ((NOT a) AND (NOT b) AND (NOT c))
+ *
+ * and computing selectivity of that using clauselist_selectivity().
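+ *
+ * For example (made-up numbers): with s1 = 0.2 and s2 = 0.3 the
+ * combined estimate is 0.2 + 0.3 - 0.2 * 0.3 = 0.44, which matches
+ * 1 - (1 - 0.2) * (1 - 0.3) computed through the negated form.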
+ */
+static Selectivity
+clauselist_selectivity_or(PlannerInfo *root,
+ List *clauses,
+ int varRelid,
+ JoinType jointype,
+ SpecialJoinInfo *sjinfo,
+ List *conditions)
+{
+ Selectivity s1 = 0.0;
+ ListCell *l;
+
+ /* processing mv stats */
+ Index relid = InvalidOid;
+
+ /* attributes in mv-compatible clauses */
+ Bitmapset *mvattnums = NULL;
+ bool has_mv_stats;
+
+ /*
+ * Check whether there are multivariate stats on the table.
+ *
+ * FIXME This seems not to be working as expected. Sometimes there
+ * are multiple relids even when (varRelid==0).
+ * */
+ if (varRelid == 0)
+ {
+ /* find the (single) relid */
+ Index relidx;
+ Relids relids = pull_varnos((Node*)clauses);
+
+ if (bms_num_members(relids) == 1)
+ {
+ relidx = bms_singleton_member(relids);
+ has_mv_stats
+ = (root->simple_rel_array[relidx]->mvstatlist != NIL);
+ }
+ else
+ has_mv_stats = false;
+ }
+ else
+ has_mv_stats
+ = (root->simple_rel_array[varRelid]->mvstatlist != NIL);
+
+ if (has_mv_stats)
+ {
+ /*
+ * Collect attributes from mv-compatible clauses (there's no
+ * reduction step here, as functional dependencies don't apply
+ * to OR-clauses). We're only interested in MCV- and
+ * histogram-compatible clauses.
+ */
+ mvattnums = collect_mv_attnums(root, clauses, varRelid, &relid, sjinfo,
+ (MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST));
+
+ /*
+ * If there still are at least two columns, we'll try to select
+ * a suitable multivariate stats.
+ */
+ if (bms_num_members(mvattnums) >= 2)
+ {
+ /*
+ * fetch info from the catalog (not the serialized stats yet)
+ *
+ * TODO We may need to repeat this, because the previous load only
+ * happens if there are at least 2 clauses compatible with
+ * functional dependencies.
+ *
+ * TODO This is rather ugly - we get the stats as a list from
+ * RelOptInfo (thanks to relcache/syscache), but we transform
+ * it into an array (which the other methods use for now).
+ * This should not be necessary, I guess.
+ * */
+ List *stats = root->simple_rel_array[relid]->mvstatlist;
+
+ /* see choose_mv_statistics() for details */
+ if (stats != NIL)
+ {
+ int k;
+ ListCell *s;
+
+ List *solution
+ = choose_mv_statistics(root, stats,
+ clauses, conditions,
+ varRelid, sjinfo,
+ (MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST));
+
+ /* apply the statistics from the solution, one by one */
+ foreach (s, solution)
+ {
+ Selectivity s2;
+ MVStatisticInfo *mvstat = (MVStatisticInfo *)lfirst(s);
+
+ /* clauses compatible with multi-variate stats */
+ List *mvclauses = NIL;
+ List *mvclauses_new = NIL;
+ List *mvclauses_conditions = NIL;
+ Bitmapset *stat_attnums = NULL;
+
+ /* build attnum bitmapset for this statistics */
+ for (k = 0; k < mvstat->stakeys->dim1; k++)
+ stat_attnums = bms_add_member(stat_attnums,
+ mvstat->stakeys->values[k]);
+
+ /*
+ * Append the compatible conditions (passed from above)
+ * to mvclauses_conditions.
+ */
+ foreach (l, conditions)
+ {
+ Node *c = (Node*)lfirst(l);
+ Bitmapset *tmp = clause_mv_get_attnums(root, c);
+
+ if (bms_is_subset(tmp, stat_attnums))
+ mvclauses_conditions
+ = lappend(mvclauses_conditions, c);
+
+ bms_free(tmp);
+ }
+
+ /* split the clauselist into regular and mv-clauses
+ *
+ * We keep the list of clauses (we don't remove the
+ * clauses yet, because we want to use the clauses
+ * as conditions of other clauses).
+ *
+ * FIXME Do this only once, i.e. filter the clauses
+ * once (selecting clauses covered by at least
+ * one statistics) and then convert them into
+ * smaller per-statistics lists of conditions
+ * and estimated clauses.
+ */
+ clauselist_mv_split(root, sjinfo, clauses,
+ varRelid, &mvclauses, mvstat,
+ (MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST));
+
+ /*
+ * We've chosen the statistics to match the clauses, so
+ * each statistics from the solution should have at least
+ * one new clause (not covered by the previous stats).
+ */
+ Assert(mvclauses != NIL);
+
+ /*
+ * Mvclauses now contains only clauses compatible
+ * with the currently selected stats, but we have to
+ * split that into conditions (already matched by
+ * the previous stats), and the new clauses we need
+ * to estimate using this stats.
+ *
+ * XXX We'll only use the new clauses, but maybe we
+ * should use the conditions too, somehow. We can't
+ * use that directly in conditional probability, but
+ * maybe we might use them in a different way?
+ *
+ * If we have a clause (a OR b OR c), then knowing
+ * that 'a' is TRUE means (b OR c) can't make the
+ * whole clause FALSE.
+ *
+ * This is pretty much what
+ *
+ * (a OR b) == NOT ((NOT a) AND (NOT b))
+ *
+ * implies.
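+ *
+ * (Under independence this is the familiar identity
+ * P(a OR b) = P(a) + P(b) - P(a) * P(b)
+ * = 1 - (1 - P(a)) * (1 - P(b)),
+ * which is also how the per-statistics results get combined
+ * into s1 below.)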
+ */
+ foreach (l, mvclauses)
+ {
+ ListCell *p;
+ bool covered = false;
+ Node *clause = (Node *) lfirst(l);
+ Bitmapset *clause_attnums = clause_mv_get_attnums(root, clause);
+
+ /*
+ * If already covered by previous stats, add it to
+ * conditions.
+ *
+ * TODO Maybe this could be relaxed a bit? Because
+ * with complex and/or clauses, this might
+ * mean no statistics actually covers such a
+ * complex clause.
+ */
+ foreach (p, solution)
+ {
+ int k;
+ Bitmapset *stat_attnums = NULL;
+
+ MVStatisticInfo *prev_stat
+ = (MVStatisticInfo *)lfirst(p);
+
+ /* break once we've reached the current statistics */
+ if (prev_stat == mvstat)
+ break;
+
+ for (k = 0; k < prev_stat->stakeys->dim1; k++)
+ stat_attnums = bms_add_member(stat_attnums,
+ prev_stat->stakeys->values[k]);
+
+ covered = bms_is_subset(clause_attnums, stat_attnums);
+
+ bms_free(stat_attnums);
+
+ if (covered)
+ break;
+ }
+
+ if (! covered)
+ mvclauses_new = lappend(mvclauses_new, clause);
+ }
+
+ /*
+ * We need at least one new clause (not just conditions).
+ */
+ Assert(mvclauses_new != NIL);
+
+ /* compute the multivariate stats */
+ s2 = clauselist_mv_selectivity(root, mvstat,
+ mvclauses_new,
+ mvclauses_conditions,
+ true); /* OR */
+
+ s1 = s1 + s2 - s1 * s2;
+ }
+
+ /*
+ * And now finally remove all the mv-compatible clauses.
+ *
+ * This only repeats the same split as above, but this
+ * time we actually use the result list (and feed it to
+ * the next call).
+ */
+ foreach (s, solution)
+ {
+ /* clauses compatible with multi-variate stats */
+ List *mvclauses = NIL;
+
+ MVStatisticInfo *mvstat = (MVStatisticInfo *)lfirst(s);
+
+ /* split the list into regular and mv-clauses */
+ clauses = clauselist_mv_split(root, sjinfo, clauses,
+ varRelid, &mvclauses, mvstat,
+ (MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST));
+ }
+ }
+ }
+ }
+
+ /*
+ * Handle the remaining clauses (either using regular statistics,
+ * or by multivariate stats at the next level).
+ */
+ foreach(l, clauses)
+ {
+ Selectivity s2 = clause_selectivity(root,
+ (Node *) lfirst(l),
+ varRelid,
+ jointype,
+ sjinfo,
+ conditions);
+ s1 = s1 + s2 - s1 * s2;
+ }
+
return s1;
}
@@ -703,7 +1216,8 @@ clause_selectivity(PlannerInfo *root,
Node *clause,
int varRelid,
JoinType jointype,
- SpecialJoinInfo *sjinfo)
+ SpecialJoinInfo *sjinfo,
+ List *conditions)
{
Selectivity s1 = 0.5; /* default for any unhandled clause type */
RestrictInfo *rinfo = NULL;
@@ -833,7 +1347,8 @@ clause_selectivity(PlannerInfo *root,
(Node *) get_notclausearg((Expr *) clause),
varRelid,
jointype,
- sjinfo);
+ sjinfo,
+ conditions);
}
else if (and_clause(clause))
{
@@ -842,29 +1357,18 @@ clause_selectivity(PlannerInfo *root,
((BoolExpr *) clause)->args,
varRelid,
jointype,
- sjinfo);
+ sjinfo,
+ conditions);
}
else if (or_clause(clause))
{
- /*
- * Selectivities for an OR clause are computed as s1+s2 - s1*s2 to
- * account for the probable overlap of selected tuple sets.
- *
- * XXX is this too conservative?
- */
- ListCell *arg;
-
- s1 = 0.0;
- foreach(arg, ((BoolExpr *) clause)->args)
- {
- Selectivity s2 = clause_selectivity(root,
- (Node *) lfirst(arg),
- varRelid,
- jointype,
- sjinfo);
-
- s1 = s1 + s2 - s1 * s2;
- }
+ /* just call to clauselist_selectivity_or() */
+ s1 = clauselist_selectivity_or(root,
+ ((BoolExpr *) clause)->args,
+ varRelid,
+ jointype,
+ sjinfo,
+ conditions);
}
else if (is_opclause(clause) || IsA(clause, DistinctExpr))
{
@@ -973,7 +1477,8 @@ clause_selectivity(PlannerInfo *root,
(Node *) ((RelabelType *) clause)->arg,
varRelid,
jointype,
- sjinfo);
+ sjinfo,
+ conditions);
}
else if (IsA(clause, CoerceToDomain))
{
@@ -982,7 +1487,8 @@ clause_selectivity(PlannerInfo *root,
(Node *) ((CoerceToDomain *) clause)->arg,
varRelid,
jointype,
- sjinfo);
+ sjinfo,
+ conditions);
}
/* Cache the result if possible */
@@ -1164,9 +1670,67 @@ clause_selectivity(PlannerInfo *root,
* that from the most selective clauses first, because that'll
* eliminate the buckets/items sooner (so we'll be able to skip
* them without inspection, which is more expensive).
+ *
+ * TODO All this is based on the assumption that the statistics represent
+ * the necessary dependencies, i.e. that if two columns are not in
+ * the same statistics, there's no dependency. If that's not the
+ * case, we may get misestimates, just like before. For example
+ * assume we have a table with three columns [a,b,c] with exactly
+ * the same values, and statistics on [a,b] and [b,c]. So something
+ * like this:
+ *
+ * CREATE TABLE test (a, b, c) AS SELECT i, i, i
+ * FROM generate_series(1,1000) s(i);
+ *
+ * ALTER TABLE test ADD STATISTICS (mcv) ON (a,b);
+ * ALTER TABLE test ADD STATISTICS (mcv) ON (b,c);
+ *
+ * ANALYZE test;
+ *
+ * EXPLAIN ANALYZE SELECT * FROM test
+ * WHERE (a < 10) AND (b < 20) AND (c < 10);
+ *
+ * The problem here is that the only shared column between the two
+ * statistics is 'b' so the probability will be computed like this
+ *
+ * P[(a < 10) & (b < 20) & (c < 10)]
+ * = P[(a < 10) & (b < 20)] * P[(c < 10) | (a < 10) & (b < 20)]
+ * = P[(a < 10) & (b < 20)] * P[(c < 10) | (b < 20)]
+ *
+ * or like this
+ *
+ * P[(a < 10) & (b < 20) & (c < 10)]
+ * = P[(b < 20) & (c < 10)] * P[(a < 10) | (b < 20) & (c < 10)]
+ * = P[(b < 20) & (c < 10)] * P[(a < 10) | (b < 20)]
+ *
+ * In both cases the conditional probabilities will be evaluated as
+ * 0.5, because they lack the other column (which would make it 1.0).
+ *
+ * Theoretically it might be possible to transfer the dependency,
+ * e.g. by building bitmap for [a,b] and then combine it with [b,c]
+ * by doing something like this:
+ *
+ * 1) build bitmap on [a,b] using [(a<10) & (b < 20)]
+ * 2) for each element in [b,c] check the bitmap
+ *
+ * But that's certainly nontrivial - for example the statistics may
+ * be different (MCV list vs. histogram) and/or the items may not
+ * match (e.g. MCV items or histogram buckets will be built
+ * differently). Also, for one value of 'b' there might be multiple
+ * MCV items (because of the other column values) with different
+ * bitmap values (some will match, some won't) - so it's not exactly
+ * bitmap but a partial match.
+ *
+ * Maybe a hash table with number of matches and mismatches (or
+ * maybe sums of frequencies) would work? The step (2) would then
+ * lookup the values and use that to weight the item somehow.
+ *
+ * Currently the only solution is to build statistics on all three
+ * columns.
*/
static Selectivity
-clauselist_mv_selectivity(PlannerInfo *root, List *clauses, MVStatisticInfo *mvstats)
+clauselist_mv_selectivity(PlannerInfo *root, MVStatisticInfo *mvstats,
+ List *clauses, List *conditions, bool is_or)
{
bool fullmatch = false;
Selectivity s1 = 0.0, s2 = 0.0;
@@ -1184,7 +1748,8 @@ clauselist_mv_selectivity(PlannerInfo *root, List *clauses, MVStatisticInfo *mvs
*/
/* Evaluate the MCV first. */
- s1 = clauselist_mv_selectivity_mcvlist(root, clauses, mvstats,
+ s1 = clauselist_mv_selectivity_mcvlist(root, mvstats,
+ clauses, conditions, is_or,
&fullmatch, &mcv_low);
/*
@@ -1197,7 +1762,8 @@ clauselist_mv_selectivity(PlannerInfo *root, List *clauses, MVStatisticInfo *mvs
/* FIXME if (fullmatch) without matching MCV item, use the mcv_low
* selectivity as upper bound */
- s2 = clauselist_mv_selectivity_histogram(root, clauses, mvstats);
+ s2 = clauselist_mv_selectivity_histogram(root, mvstats,
+ clauses, conditions, is_or);
/* TODO clamp to <= 1.0 (or more strictly, when possible) */
return s1 + s2;
@@ -1226,24 +1792,683 @@ collect_mv_attnums(PlannerInfo *root, List *clauses, Oid varRelid,
{
Node *clause = (Node *) lfirst(l);
- /* ignore the result here - we only need the attnums */
- clause_is_mv_compatible(root, clause, varRelid, relid, &attnums,
- sjinfo, types);
- }
+ /* ignore the result here - we only need the attnums */
+ clause_is_mv_compatible(root, clause, varRelid, relid, &attnums,
+ sjinfo, types);
+ }
+
+ /*
+ * If there are not at least two attributes referenced by the clause(s),
+ * we can throw everything out (as we'll revert to simple stats).
+ */
+ if (bms_num_members(attnums) <= 1)
+ {
+ if (attnums != NULL)
+ pfree(attnums);
+ attnums = NULL;
+ *relid = InvalidOid;
+ }
+
+ return attnums;
+}
+
+/*
+ * Selects the best combination of multivariate statistics, where
+ * 'best' means:
+ *
+ * (a) covering the most attributes (referenced by clauses)
+ * (b) using the least number of multivariate stats
+ *
+ * There may be other optimality criteria, not considered in the initial
+ * implementation (more on that in the 'Weaknesses' section).
+ *
+ * This is pretty much equal to splitting the probability of clauses
+ * (aka selectivity) into a sequence of conditional probabilities, like
+ * this
+ *
+ * P(A,B,C,D) = P(A,B) * P(C|A,B) * P(D|A,B,C)
+ *
+ * and removing the attributes not referenced by the existing stats,
+ * under the assumption that there's no dependency (otherwise the DBA
+ * would create the stats).
+ *
+ *
+ * Algorithm
+ * ---------
+ * The algorithm is a recursive implementation of backtracking, with
+ * maximum 'depth' equal to the number of multi-variate statistics
+ * available on the table.
+ *
+ * It explores all the possible permutations of the stats.
+ *
+ * Whenever it considers adding the next statistics, the clauses it
+ * matches are divided into 'conditions' (clauses already matched by at
+ * least one previous statistics) and clauses that are estimated.
+ *
+ * Then several checks are performed:
+ *
+ * (a) The statistics covers at least 2 columns, referenced in the
+ * estimated clauses (otherwise multi-variate stats are useless).
+ *
+ * (b) The statistics covers at least 1 new column, i.e. a column not
+ * referenced by the already used stats (and the new column has
+ * to be referenced by the clauses, of course). Otherwise the
+ * statistics would not add any new information.
+ *
+ * There are some other sanity checks (e.g. that the stats must not be
+ * used twice etc.).
+ *
+ * Finally the new solution is compared to the currently best one, and
+ * if it's considered better, it's used instead.
+ *
+ *
+ * Weaknesses
+ * ----------
+ * The current implementation uses somewhat simplistic optimality criteria,
+ * suffering from the following weaknesses.
+ *
+ * (a) There may be multiple solutions with the same number of covered
+ * attributes and number of statistics (e.g. the same solution but
+ * with statistics in a different order). It's unclear which solution
+ * is the best one - in a sense all of them are equal.
+ *
+ * TODO It might be possible to compute estimate for each of those
+ * solutions, and then combine them to get the final estimate
+ * (e.g. by using average or median).
+ *
+ * (b) Does not consider that some types of stats are a better match for
+ * some types of clauses (e.g. an MCV list is a better match for
+ * equality clauses than a histogram).
+ *
+ * XXX Maybe MCV is almost always better / more accurate?
+ *
+ * But maybe this is pointless - generally, each column is either
+ * a label (it's not important whether because of the data type or
+ * how it's used), or a value with ordering that makes sense. So
+ * either a MCV list is more appropriate (labels) or a histogram
+ * (values with orderings).
+ *
+ * Not sure what to do with statistics mixing columns of both
+ * types - maybe it'd be better to invent a new type of stats
+ * combining MCV list and histogram (keeping a small histogram for
+ * each MCV item, and a separate histogram for values not on the
+ * MCV list). But that's not implemented at this moment.
+ *
+ * (c) Does not consider that some solutions may better exploit the
+ * dependencies. For example with clauses on columns [A,B,C,D] and
+ * statistics on [A,B,C] and [C,D] cover all the columns just like
+ * [A,B,C] and [B,C,D], but the latter probably exploits additional
+ * dependencies thanks to having 'B' in both stats (thus allowing
+ * using it as a condition for the second stats). Of course, if
+ * B and [C,D] are independent, this is untrue - but if we have that
+ * statistics created, it's a sign that the DBA/developer believes
+ * there's a dependency.
+ *
+ * (d) Does not consider the order of clauses, which may be significant.
+ * For example, when there's a mix of simple and complex clauses,
+ * i.e. something like
+ *
+ * (a=2) AND (b=3 OR (c=3 AND d=4)) AND (c=3)
+ *
+ * It may be better to evaluate the simple clauses first, and then
+ * use them as conditions for the complex clause.
+ *
+ * We can for example count number of different attributes
+ * referenced in the clause, and use that as a metric of complexity
+ * (lower number -> simpler). Maybe use ratio (#vars/#atts) or
+ * (#clauses/#atts) as secondary metrics? Also the general complexity
+ * of the clause (levels of nesting etc.) might be useful.
+ *
+ * Hopefully most clauses will be reasonably simple, though.
+ *
+ * Update: On second thought, I believe the order of clauses is
+ * determined by choosing the order of statistics, and therefore
+ * optimized by the current algorithm.
+ *
+ * TODO Consider adding a counter of attributes covered by previous
+ * stats (possibly tracking the number of how many stats reference
+ * it too), and use this 'dependency_count' when selecting the best
+ * solution (not sure how). Similarly to (a) it might be possible
+ * to build estimate for each solution (different criteria) and then
+ * combine them somehow.
+ *
+ * TODO The current implementation repeatedly walks through the previous
+ * stats, just to compute the number of covered attributes over and
+ * over. With non-trivial number of statistics this might be an
+ * issue, so maybe we should keep track of 'covered' attributes by
+ * each step, so that we can get rid of this. We'll need this
+ * information anyway (when splitting clauses into condition and
+ * the estimated part).
+ *
+ * TODO This needs to consider the conditions passed from the preceding
+ * and upper clauses (in complex cases), but only as conditions
+ * and not as estimated clauses. So it needs to somehow affect the
+ * score (the more conditions we use the better).
+ *
+ * TODO The algorithm should probably count number of Vars (not just
+ * attnums) when computing the 'score' of each solution. Computing
+ * the ratio of (num of all vars) / (num of condition vars) as a
+ * measure of how well the solution uses conditions might be
+ * useful.
+ *
+ * TODO This might be much easier if we kept Bitmapset of attributes
+ * covered by the stats up to that step.
+ *
+ * FIXME When comparing the solutions, we currently use this condition:
+ *
+ * ((current->nstats > (*best)->nstats))
+ *
+ * i.e. we're choosing solution with more stats, because with
+ * clauses
+ *
+ * (a = 1) AND (b = 1) AND (c = 1) AND (d = 1)
+ *
+ * and stats on [a,b], [b,c], [c,d] we want to choose the solution
+ * with all three stats, and not just [a,b], [c,d]. Otherwise we'd
+ * fail to exploit one of the dependencies.
+ *
+ * This is however a workaround for another issue - we're not
+ * tracking number of 'dependencies' covered by the solution, only
+ * number of clauses, and that's the same for both solutions.
+ * ([a,b], [c,d]) and ([a,b], [b,c], [c,d]) both cover all 4 clauses.
+ *
+ * Once a suitable metric is added, we want to choose the solution
+ * with less stats, assuming it covers the same number of clauses
+ * and exploits the same number of dependencies.
+ */
+static void
+choose_mv_statistics_exhaustive(PlannerInfo *root, int step,
+ int nmvstats, MVStatisticInfo *mvstats, Bitmapset ** stats_attnums,
+ int nclauses, Node ** clauses, Bitmapset ** clauses_attnums,
+ int nconditions, Node ** conditions, Bitmapset ** conditions_attnums,
+ bool *cover_map, bool *condition_map, int *ruled_out,
+ mv_solution_t *current, mv_solution_t **best)
+{
+ int i, j;
+
+ Assert(best != NULL);
+ Assert((step == 0 && current == NULL) || (step > 0 && current != NULL));
+
+ CHECK_FOR_INTERRUPTS();
+
+ if (current == NULL)
+ {
+ current = (mv_solution_t*)palloc0(sizeof(mv_solution_t));
+ current->stats = (int*)palloc0(sizeof(int)*nmvstats);
+ current->nstats = 0;
+ current->nclauses = 0;
+ current->nconditions = 0;
+ }
+
+ /*
+ * Now try to apply each statistics, matching at least two attributes,
+ * unless it's already used in one of the previous steps.
+ */
+ for (i = 0; i < nmvstats; i++)
+ {
+ int c;
+
+ int ncovered_clauses = 0; /* number of covered clauses */
+ int ncovered_conditions = 0; /* number of covered conditions */
+ int nattnums = 0; /* number of covered attributes */
+
+ Bitmapset *all_attnums = NULL;
+ Bitmapset *new_attnums = NULL;
+
+ /* skip statistics that were already used or eliminated */
+ if (ruled_out[i] != -1)
+ continue;
+
+ /*
+ * See if we have clauses covered by this statistics, but not
+ * yet covered by any of the preceding ones.
+ */
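+
+ /*
+ * Note: cover_map is laid out as a row-major boolean matrix with
+ * nmvstats rows and nclauses columns, so cover_map[i * nclauses + c]
+ * says whether statistics 'i' covers clause 'c' (condition_map works
+ * the same way, with nconditions columns).
+ */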
+ for (c = 0; c < nclauses; c++)
+ {
+ bool covered = false;
+ Bitmapset *clause_attnums = clauses_attnums[c];
+ Bitmapset *tmp = NULL;
+
+ /*
+ * If this clause is not covered by this stats, we can't
+ * use the stats to estimate that at all.
+ */
+ if (! cover_map[i * nclauses + c])
+ continue;
+
+ /*
+ * Now we know we'll use this clause - either as a condition
+ * or as a new clause (the estimated one). So let's add the
+ * attributes to the attnums from all the clauses usable with
+ * this statistics.
+ */
+ tmp = bms_union(all_attnums, clause_attnums);
+
+ /* free the old bitmap */
+ bms_free(all_attnums);
+ all_attnums = tmp;
+
+ /* let's see if it's covered by any of the previous stats */
+ for (j = 0; j < step; j++)
+ {
+ /* already covered by the previous stats */
+ if (cover_map[current->stats[j] * nclauses + c])
+ covered = true;
+
+ if (covered)
+ break;
+ }
+
+ /* if already covered, continue with the next clause */
+ if (covered)
+ {
+ ncovered_conditions += 1;
+ continue;
+ }
+
+ /*
+ * OK, this clause is covered by this statistics (and not by
+ * any of the previous ones)
+ */
+ ncovered_clauses += 1;
+
+ /* add the attnums into attnums from 'new clauses'
+ * (currently disabled - new_attnums is unused) */
+ /* new_attnums = bms_union(new_attnums, clause_attnums); */
+ }
+
+ /* can't have more new clauses than original clauses */
+ Assert(nclauses >= ncovered_clauses);
+ Assert(ncovered_clauses >= 0); /* mostly paranoia */
+
+ nattnums = bms_num_members(all_attnums);
+
+ /* free all the bitmapsets - we don't need them anymore */
+ bms_free(all_attnums);
+ bms_free(new_attnums);
+
+ all_attnums = NULL;
+ new_attnums = NULL;
+
+ /*
+ * Now walk through the conditions (passed from above) and count
+ * those covered by this statistics.
+ */
+ for (c = 0; c < nconditions; c++)
+ {
+ Bitmapset *clause_attnums = conditions_attnums[c];
+ Bitmapset *tmp = NULL;
+
+ /*
+ * If this condition is not covered by this stats, it's of
+ * no use here.
+ */
+ if (! condition_map[i * nconditions + c])
+ continue;
+
+ /* count this as a condition */
+ ncovered_conditions += 1;
+
+ /*
+ * Now we know we'll use this condition, so let's add its
+ * attributes to the set of attnums usable with this statistics.
+ */
+ tmp = bms_union(all_attnums, clause_attnums);
+
+ /* free the old bitmap */
+ bms_free(all_attnums);
+ all_attnums = tmp;
+ }
+
+ /*
+ * Let's mark the statistics as 'ruled out' - either we'll use
+ * it (and proceed to the next step), or it's incompatible.
+ */
+ ruled_out[i] = step;
+
+ /*
+ * There are no clauses usable with this statistics (not already
+ * covered by some of the previous stats).
+ *
+ * Similarly, if the clauses only use a single attribute, we
+ * can't really use that.
+ */
+ if ((ncovered_clauses == 0) || (nattnums < 2))
+ continue;
+
+ /*
+ * TODO Not sure if it's possible to add a clause referencing
+ * only attributes already covered by previous stats?
+ * Introducing only some new dependency, not a new
+ * attribute. Couldn't come up with an example, though.
+ * Might be worth adding some assert.
+ */
+
+ /*
+ * got a suitable statistics - let's update the current solution,
+ * maybe use it as the best solution
+ */
+ current->nclauses += ncovered_clauses;
+ current->nconditions += ncovered_conditions;
+ current->nstats += 1;
+ current->stats[step] = i;
+
+ /*
+ * We can never cover more clauses, or use more stats than we
+ * actually have at the beginning.
+ */
+ Assert(nclauses >= current->nclauses);
+ Assert(nmvstats >= current->nstats);
+ Assert(step < nmvstats);
+
+ /* we can't get more conditions than clauses and conditions combined
+ *
+ * FIXME This assert does not work because we count the conditions
+ * repeatedly (once for each statistics covering it).
+ */
+ /* Assert((nconditions + nclauses) >= current->nconditions); */
+
+ if (*best == NULL)
+ {
+ *best = (mv_solution_t*)palloc0(sizeof(mv_solution_t));
+ (*best)->stats = (int*)palloc0(sizeof(int)*nmvstats);
+ (*best)->nstats = 0;
+ (*best)->nclauses = 0;
+ (*best)->nconditions = 0;
+ }
+
+ /* see if it's better than the current 'best' solution */
+ if ((current->nclauses > (*best)->nclauses) ||
+ ((current->nclauses == (*best)->nclauses) &&
+ ((current->nstats > (*best)->nstats))))
+ {
+ (*best)->nstats = current->nstats;
+ (*best)->nclauses = current->nclauses;
+ (*best)->nconditions = current->nconditions;
+ memcpy((*best)->stats, current->stats, nmvstats * sizeof(int));
+ }
+
+ /*
+ * The recursion only makes sense if we haven't covered all the
+ * attributes (then adding stats is not really possible).
+ */
+ if ((step + 1) < nmvstats)
+ choose_mv_statistics_exhaustive(root, step+1,
+ nmvstats, mvstats, stats_attnums,
+ nclauses, clauses, clauses_attnums,
+ nconditions, conditions, conditions_attnums,
+ cover_map, condition_map, ruled_out,
+ current, best);
+
+ /* reset the last step */
+ current->nclauses -= ncovered_clauses;
+ current->nconditions -= ncovered_conditions;
+ current->nstats -= 1;
+ current->stats[step] = 0;
+
+ /* mark the statistics as usable again */
+ ruled_out[i] = -1;
+
+ Assert(current->nclauses >= 0);
+ Assert(current->nstats >= 0);
+ }
+
+ /* reset all statistics as 'incompatible' in this step */
+ for (i = 0; i < nmvstats; i++)
+ if (ruled_out[i] == step)
+ ruled_out[i] = -1;
+
+}
+
+/*
+ * Greedy search for a multivariate solution - a sequence of statistics
+ * covering the clauses. This chooses the "best" statistics at each step,
+ * so the resulting solution may not be the best solution globally, but
+ * this produces the solution in only N steps (where N is the number of
+ * statistics), while the exhaustive approach may have to walk through
+ * ~N! combinations (although some of those are terminated early).
+ *
+ * TODO There are probably other metrics we might use - e.g. using
+ * number of columns (num_cond_columns / num_cov_columns), which
+ * might work better with a mix of simple and complex clauses.
+ *
+ * TODO Also the choice at the very first step should be handled
+ * in a special way, because there will be 0 conditions at that
+ * moment, so there needs to be some other criteria - e.g. using
+ * the simplest (or most complex?) clause might be a good idea.
+ *
+ * TODO We might also select multiple stats using different criteria,
+ * and branch the search. This is however tricky, because if we
+ * choose k statistics at each step, we get k^N branches to
+ * walk through (with N steps). That's not really good with
+ * large number of stats (yet better than exhaustive search).
+ */
+static void
+choose_mv_statistics_greedy(PlannerInfo *root, int step,
+ int nmvstats, MVStatisticInfo *mvstats, Bitmapset ** stats_attnums,
+ int nclauses, Node ** clauses, Bitmapset ** clauses_attnums,
+ int nconditions, Node ** conditions, Bitmapset ** conditions_attnums,
+ bool *cover_map, bool *condition_map, int *ruled_out,
+ mv_solution_t *current, mv_solution_t **best)
+{
+ int i, j;
+ int best_stat = -1;
+ double gain, max_gain = -1.0;
+
+ /*
+ * Bitmap tracking which clauses are already covered (by the previous
+ * statistics) and may thus serve only as a condition in this step.
+ */
+ bool *covered_clauses = (bool*)palloc0(nclauses);
+
+ /*
+ * Number of clauses and columns covered by each statistics - this
+ * includes both conditions and clauses covered by the statistics for
+ * the first time. The number of columns may count some columns
+ * repeatedly - if a column is shared by multiple clauses, it will
+ * be counted once for each clause (covered by the statistics).
+ * So with two clauses [(a=1 OR b=2),(a<2 OR c>1)] the column "a"
+ * will be counted twice (if both clauses are covered).
+ *
+ * The values for ruled-out statistics (that can't be applied) are
+ * not computed, because that'd be pointless.
+ */
+ int *num_cov_clauses = (int*)palloc0(sizeof(int) * nmvstats);
+ int *num_cov_columns = (int*)palloc0(sizeof(int) * nmvstats);
+
+ /*
+ * Same as above, but this only includes clauses that are already
+ * covered by the previous stats (and the current one).
+ */
+ int *num_cond_clauses = (int*)palloc0(sizeof(int) * nmvstats);
+ int *num_cond_columns = (int*)palloc0(sizeof(int) * nmvstats);
+
+ /*
+ * Number of attributes for each clause.
+ *
+ * TODO Might be computed in choose_mv_statistics() and then passed
+ * here, but then the function would not have the same signature
+ * as _exhaustive().
+ */
+ int *attnum_counts = (int*)palloc0(sizeof(int) * nclauses);
+ int *attnum_cond_counts = (int*)palloc0(sizeof(int) * nconditions);
+
+ CHECK_FOR_INTERRUPTS();
+
+ Assert(best != NULL);
+ Assert((step == 0 && current == NULL) || (step > 0 && current != NULL));
+
+ /* compute attributes (columns) for each clause */
+ for (i = 0; i < nclauses; i++)
+ attnum_counts[i] = bms_num_members(clauses_attnums[i]);
+
+ /* compute attributes (columns) for each condition */
+ for (i = 0; i < nconditions; i++)
+ attnum_cond_counts[i] = bms_num_members(conditions_attnums[i]);
+
+ /* see which clauses are already covered at this point (by previous stats) */
+ for (i = 0; i < step; i++)
+ for (j = 0; j < nclauses; j++)
+ covered_clauses[j] |= (cover_map[current->stats[i] * nclauses + j]);
+
+ /* which remaining statistics covers most clauses / uses most conditions? */
+ for (i = 0; i < nmvstats; i++)
+ {
+ Bitmapset *attnums_covered = NULL;
+ Bitmapset *attnums_conditions = NULL;
+
+ /* skip stats that are already ruled out (either used or inapplicable) */
+ if (ruled_out[i] != -1)
+ continue;
+
+ /* count covered clauses and conditions (for the statistics) */
+ for (j = 0; j < nclauses; j++)
+ {
+ if (cover_map[i * nclauses + j])
+ {
+ Bitmapset *new = bms_union(attnums_covered, clauses_attnums[j]);
+
+ /* get rid of the old bitmap and keep the unified result */
+ bms_free(attnums_covered);
+ attnums_covered = new;
+
+ num_cov_clauses[i] += 1;
+ num_cov_columns[i] += attnum_counts[j];
+
+ /* is the clause already covered (i.e. a condition)? */
+ if (covered_clauses[j])
+ {
+ num_cond_clauses[i] += 1;
+ num_cond_columns[i] += attnum_counts[j];
+ new = bms_union(attnums_conditions,
+ clauses_attnums[j]);
+
+ bms_free(attnums_conditions);
+ attnums_conditions = new;
+ }
+ }
+ }
+
+ /* if all covered clauses are covered by prev stats (thus conditions) */
+ if (num_cov_clauses[i] == num_cond_clauses[i])
+ ruled_out[i] = step;
+
+ /* same if there are no new attributes */
+ else if (bms_num_members(attnums_conditions) == bms_num_members(attnums_covered))
+ ruled_out[i] = step;
+
+ bms_free(attnums_covered);
+ bms_free(attnums_conditions);
+
+ /* if the statistics is inapplicable, try the next one */
+ if (ruled_out[i] != -1)
+ continue;
+
+ /* now let's walk through conditions and count the covered */
+ for (j = 0; j < nconditions; j++)
+ {
+ if (condition_map[i * nconditions + j])
+ {
+ num_cond_clauses[i] += 1;
+ num_cond_columns[i] += attnum_cond_counts[j];
+ }
+ }
+
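+ /*
+ * Illustration (made-up numbers): if this statistics covers clauses
+ * referencing 6 columns in total (num_cov_columns) and 4 of those
+ * columns come from clauses/conditions that are already covered
+ * (num_cond_columns), the gain is 4/6 = 0.67 - i.e. two thirds of
+ * the columns act as conditions, which is what we maximize.
+ */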
+ /* otherwise see if this improves the interesting metrics */
+ gain = num_cond_columns[i] / (double)num_cov_columns[i];
+
+ if (gain > max_gain)
+ {
+ max_gain = gain;
+ best_stat = i;
+ }
+ }
+
+ /*
+ * Have we found a suitable statistics? Add it to the solution and
+ * try next step.
+ */
+ if (best_stat != -1)
+ {
+ /* mark the statistics, so that we skip it in next steps */
+ ruled_out[best_stat] = step;
+
+ /* allocate current solution if necessary */
+ if (current == NULL)
+ {
+ current = (mv_solution_t*)palloc0(sizeof(mv_solution_t));
+ current->stats = (int*)palloc0(sizeof(int)*nmvstats);
+ current->nstats = 0;
+ current->nclauses = 0;
+ current->nconditions = 0;
+ }
+
+ current->nclauses += num_cov_clauses[best_stat];
+ current->nconditions += num_cond_clauses[best_stat];
+ current->stats[step] = best_stat;
+ current->nstats++;
+
+ if (*best == NULL)
+ {
+ (*best) = (mv_solution_t*)palloc0(sizeof(mv_solution_t));
+ (*best)->nstats = current->nstats;
+ (*best)->nclauses = current->nclauses;
+ (*best)->nconditions = current->nconditions;
+
+ (*best)->stats = (int*)palloc0(sizeof(int)*nmvstats);
+ memcpy((*best)->stats, current->stats, nmvstats * sizeof(int));
+ }
+ else
+ {
+ /* see if this is a better solution */
+ double current_gain = (double)current->nconditions / current->nclauses;
+ double best_gain = (double)(*best)->nconditions / (*best)->nclauses;
+
+ if ((current_gain > best_gain) ||
+ ((current_gain == best_gain) && (current->nstats < (*best)->nstats)))
+ {
+ (*best)->nstats = current->nstats;
+ (*best)->nclauses = current->nclauses;
+ (*best)->nconditions = current->nconditions;
+ memcpy((*best)->stats, current->stats, nmvstats * sizeof(int));
+ }
+ }
- /*
- * If there are not at least two attributes referenced by the clause(s),
- * we can throw everything out (as we'll revert to simple stats).
- */
- if (bms_num_members(attnums) <= 1)
- {
- if (attnums != NULL)
- pfree(attnums);
- attnums = NULL;
- *relid = InvalidOid;
+ /*
+ * The recursion only makes sense if we haven't covered all the
+ * attributes (then adding stats is not really possible).
+ */
+ if ((step + 1) < nmvstats)
+ choose_mv_statistics_greedy(root, step+1,
+ nmvstats, mvstats, stats_attnums,
+ nclauses, clauses, clauses_attnums,
+ nconditions, conditions, conditions_attnums,
+ cover_map, condition_map, ruled_out,
+ current, best);
+
+ /* reset the last step */
+ current->nclauses -= num_cov_clauses[best_stat];
+ current->nconditions -= num_cond_clauses[best_stat];
+ current->nstats -= 1;
+ current->stats[step] = 0;
+
+ /* mark the statistics as usable again */
+ ruled_out[best_stat] = -1;
}
- return attnums;
+ /* reset all statistics eliminated in this step */
+ for (i = 0; i < nmvstats; i++)
+ if (ruled_out[i] == step)
+ ruled_out[i] = -1;
+
+ /* free everything allocated in this step */
+ pfree(covered_clauses);
+ pfree(attnum_counts);
+ pfree(attnum_cond_counts);
+ pfree(num_cov_clauses);
+ pfree(num_cov_columns);
+ pfree(num_cond_clauses);
+ pfree(num_cond_columns);
}
/*
@@ -1314,56 +2539,498 @@ collect_mv_attnums(PlannerInfo *root, List *clauses, Oid varRelid,
* TODO This will probably have to consider compatibility of clauses,
* because 'dependencies' will probably work only with equality
* clauses.
+ *
+ * TODO Another way to make the optimization problems smaller might
+ * be splitting the statistics into several disjoint subsets, i.e.
+ * if we can split the graph of statistics (after the elimination)
+ * into multiple components (so that stats in different components
+ * share no attributes), we can do the optimization for each
+ * component separately.
+ *
+ * TODO Another possible optimization might be removing redundant
+ * statistics - if statistics S1 covers S2 (covers S2 attributes
+ * and possibly some more), we can probably remove S2. What
+ * actually matters are attributes from covered clauses (not all
+ * the original attributes). This might however prefer larger,
+ * and thus less accurate, statistics.
+ *
+ * TODO If we could compute what is a "perfect solution" maybe we could
+ * terminate the search after reaching ~90% of it? Say, if we knew
+ * that we can cover 10 clauses and reuse 8 dependencies, maybe
+ * covering 9 clauses and 7 dependencies would be OK?
*/
-static MVStatisticInfo *
-choose_mv_statistics(List *stats, Bitmapset *attnums)
+static List*
+choose_mv_statistics(PlannerInfo *root, List *stats,
+ List *clauses, List *conditions,
+ Oid varRelid, SpecialJoinInfo *sjinfo, int type)
{
- int i;
- ListCell *lc;
+ int i, j;
+ mv_solution_t *best = NULL;
+ List *result = NIL;
+ ListCell *l;
+
+ int nmvstats = list_length(stats);
+ MVStatisticInfo *mvstats
+ = (MVStatisticInfo *)palloc0(nmvstats * sizeof(MVStatisticInfo));
+
+ /* pass only stats matching at least two attributes (from clauses) */
+ MVStatisticInfo *mvstats_filtered
+ = (MVStatisticInfo*)palloc0(nmvstats * sizeof(MVStatisticInfo));
- MVStatisticInfo *choice = NULL;
+ int nmvstats_filtered;
+ bool repeat = true;
+ bool *clause_cover_map = NULL,
+ *condition_cover_map = NULL;
+ int *ruled_out = NULL;
+
+ /* build bitmapsets for all stats and clauses */
+ Bitmapset **stats_attnums;
+ Bitmapset **clauses_attnums;
+ Bitmapset **conditions_attnums;
+
+ int nclauses, nconditions;
+ Node ** clauses_array;
+ Node ** conditions_array;
+
+ /* copy lists, so that we can free them during elimination easily */
+ clauses = list_copy(clauses);
+ conditions = list_copy(conditions);
+
+ /* convert the list of stats into array, to make it easier/faster */
+ nmvstats = 0;
+ foreach (l, stats)
+ {
+ MVStatisticInfo *info = (MVStatisticInfo *)lfirst(l);
+
+ /* we only care about stats with MCV/histogram in this part */
+ if (! (info->mcv_built || info->hist_built))
+ continue;
- int current_matches = 1; /* goal #1: maximize */
- int current_dims = (MVSTATS_MAX_DIMENSIONS+1); /* goal #2: minimize */
+ memcpy(&mvstats[nmvstats], info, sizeof(MVStatisticInfo));
+ nmvstats++;
+ }
/*
- * Walk through the statistics (simple array with nmvstats elements)
- * and for each one count the referenced attributes (encoded in
- * the 'attnums' bitmap).
+ * Reduce the optimization problem size as much as possible.
+ *
+ * Eliminate clauses and conditions not covered by any statistics,
+ * or statistics not matching at least two attributes (one of them
+ * has to be in a regular clause).
+ *
+ * It's possible that removing a statistics in one iteration
+ * eliminates a clause in the next one, so we'll repeat this until
+ * no clauses/stats get eliminated in an iteration.
+ *
+ * This can only happen after eliminating a statistics - clauses are
+ * eliminated first, so statistics always reflect that.
*/
- foreach (lc, stats)
+ while (repeat)
{
- MVStatisticInfo *info = (MVStatisticInfo *)lfirst(lc);
+ /* pass only mv-compatible clauses covered by at least one statistics */
+ List *compatible_clauses = NIL;
+ List *compatible_conditions = NIL;
- /* columns matching this statistics */
- int matches = 0;
+ Bitmapset *compatible_attnums = NULL;
+ Bitmapset *condition_attnums = NULL;
+ Bitmapset *all_attnums = NULL;
+
+ /*
+ * Clauses
+ *
+ * Walk through clauses and keep only those covered by at least
+ * one of the statistics we still have. Also, collect bitmap of
+ * attributes so that we can make sure we add at least one new.
+ */
+ foreach (l, clauses)
+ {
+ Node *clause = (Node*)lfirst(l);
+ Bitmapset *clause_attnums = NULL;
+ Index relid;
- int2vector * attrs = info->stakeys;
- int numattrs = attrs->dim1;
+ /*
+ * The clause has to be mv-compatible (suitable operators etc.).
+ */
+ if (! clause_is_mv_compatible(root, clause, varRelid,
+ &relid, &clause_attnums, sjinfo, type))
+ continue;
- /* skip dependencies-only stats */
- if (! (info->mcv_built || info->hist_built))
- continue;
+ /* is there a statistics covering this clause? */
+ for (i = 0; i < nmvstats; i++)
+ {
+ int k, matches = 0;
+ for (k = 0; k < mvstats[i].stakeys->dim1; k++)
+ {
+ if (bms_is_member(mvstats[i].stakeys->values[k],
+ clause_attnums))
+ matches += 1;
+ }
+
+ /*
+ * The clause is compatible if all attributes it references
+ * are covered by the statistics.
+ */
+ if (bms_num_members(clause_attnums) == matches)
+ {
+ compatible_attnums = bms_union(compatible_attnums,
+ clause_attnums);
+ compatible_clauses = lappend(compatible_clauses,
+ clause);
+ break;
+ }
+ }
+
+ bms_free(clause_attnums);
+ }
+
+ /* we can't have more compatible clauses than source clauses */
+ Assert(list_length(clauses) >= list_length(compatible_clauses));
+
+ /* work with only compatible clauses from now */
+ list_free(clauses);
+ clauses = compatible_clauses;
+
+ /*
+ * Conditions
+ *
+ * Walk through conditions and keep only those covered by at least
+ * one of the statistics we still have. Also, collect a bitmap of
+ * the attributes they reference.
+ */
+
+ /* next, generate bitmap of attnums from all mv_compatible conditions */
+ foreach (l, conditions)
+ {
+ Node *clause = (Node*)lfirst(l);
+ Bitmapset *clause_attnums = NULL;
+ Index relid;
+
+ /*
+ * The clause has to be mv-compatible (suitable operators etc.).
+ */
+ if (! clause_is_mv_compatible(root, clause, varRelid,
+ &relid, &clause_attnums, sjinfo, type))
+ continue;
+
+ /* is there a statistics covering this clause? */
+ for (i = 0; i < nmvstats; i++)
+ {
+ int k, matches = 0;
+ for (k = 0; k < mvstats[i].stakeys->dim1; k++)
+ {
+ if (bms_is_member(mvstats[i].stakeys->values[k],
+ clause_attnums))
+ matches += 1;
+ }
+
+ if (bms_num_members(clause_attnums) == matches)
+ {
+ condition_attnums = bms_union(condition_attnums,
+ clause_attnums);
+ compatible_conditions = lappend(compatible_conditions,
+ clause);
+ break;
+ }
+ }
+
+ bms_free(clause_attnums);
+ }
+
+ /* we can't have more compatible conditions than source conditions */
+ Assert(list_length(conditions) >= list_length(compatible_conditions));
+
+ /* keep only compatible conditions */
+ list_free(conditions);
+ conditions = compatible_conditions;
+
+ /* get a union of attnums (from conditions and clauses) */
+ all_attnums = bms_union(compatible_attnums, condition_attnums);
+
+ /*
+ * Statistics
+ *
+ * Walk through statistics and only keep those covering at least
+ * one new attribute (i.e. one from the clauses, not just the
+ * conditions) and at least two attributes from clauses and
+ * conditions combined.
+ */
+ nmvstats_filtered = 0;
+
+ for (i = 0; i < nmvstats; i++)
+ {
+ int k;
+ int matches_new = 0,
+ matches_all = 0;
+
+ for (k = 0; k < mvstats[i].stakeys->dim1; k++)
+ {
+ /* attribute covered by new clause(s) */
+ if (bms_is_member(mvstats[i].stakeys->values[k],
+ compatible_attnums))
+ matches_new += 1;
+
+ /* attribute covered by clause(s) or condition(s) */
+ if (bms_is_member(mvstats[i].stakeys->values[k],
+ all_attnums))
+ matches_all += 1;
+ }
+
+ /* check we have enough attributes for this statistics */
+ if ((matches_new >= 1) && (matches_all >= 2))
+ {
+ mvstats_filtered[nmvstats_filtered] = mvstats[i];
+ nmvstats_filtered += 1;
+ }
+ }
+
+ /* we can't have more useful stats than we had originally */
+ Assert(nmvstats >= nmvstats_filtered);
+
+ /* if we've eliminated a statistics, trigger another round */
+ repeat = (nmvstats > nmvstats_filtered);
+
+ /* work only with filtered statistics from now */
+ if (nmvstats_filtered < nmvstats)
+ {
+ nmvstats = nmvstats_filtered;
+ memcpy(mvstats, mvstats_filtered, sizeof(MVStatisticInfo)*nmvstats);
+ nmvstats_filtered = 0;
+ }
+ }
+
+ /* only do the optimization if we have clauses/statistics */
+ if ((nmvstats == 0) || (list_length(clauses) == 0))
+ return NIL;
+
+ stats_attnums
+ = (Bitmapset **)palloc0(nmvstats * sizeof(Bitmapset *));
+
+ /*
+ * TODO We should sort the stats to make the order deterministic,
+ * otherwise we may get different estimates on different
+ * executions - if there are multiple "equally good" solutions,
+ * we'll keep the first solution we see.
+ *
+ * Sorting by OID probably is not the right solution though,
+ * because we'd like it to be somehow reproducible,
+ * irrespective of the order of ADD STATISTICS commands.
+ * So maybe statkeys?
+ */
+
+ for (i = 0; i < nmvstats; i++)
+ {
+ for (j = 0; j < mvstats[i].stakeys->dim1; j++)
+ stats_attnums[i] = bms_add_member(stats_attnums[i],
+ mvstats[i].stakeys->values[j]);
+ }
+
+ /*
+ * Now let's remove redundant statistics, covering the same attributes
+ * as some other stats, when restricted to the attributes from
+ * remaining clauses.
+ *
+ * When a redundancy is detected, we simply keep the smaller
+ * statistics (fewer columns), on the assumption that it's
+ * more accurate and faster to process. That might be incorrect for
+ * two reasons - first, the accuracy really depends on number of
+ * buckets/MCV items, not the number of columns. Second, we might
+ * prefer MCV lists over histograms or something like that.
+ *
+ * XXX This might be done in the while loop above, but it does not
+ * change the result at all (or is not supposed to), so let's do
+ * that only once.
+ */
+ {
+ /* by default, none of the stats is redundant */
+ bool *redundant = palloc0(nmvstats * sizeof(bool));
+
+ /* we only expect a single varno here */
+ Relids varnos = pull_varnos((Node*)clauses);
+
+ /* get the varattnos (skip system attributes, although that
+ * should be impossible thanks to previous filtering out of
+ * incompatible clauses) */
+ Bitmapset *varattnos = get_varattnos((Node*)clauses,
+ bms_singleton_member(varnos));
+
+ for (i = 1; i < nmvstats; i++)
+ {
+ /* intersect with current statistics */
+ Bitmapset *curr = bms_intersect(stats_attnums[i], varattnos);
+
+ /* walk through 'previous' stats and check redundancy */
+ for (j = 0; j < i; j++)
+ {
+ /* intersect with current statistics */
+ Bitmapset *prev;
+
+ /* skip stats already identified as redundant */
+ if (redundant[j])
+ continue;
- /* count columns covered by the histogram */
- for (i = 0; i < numattrs; i++)
- if (bms_is_member(attrs->values[i], attnums))
- matches++;
+ prev = bms_intersect(stats_attnums[j], varattnos);
+
+ switch (bms_subset_compare(curr, prev))
+ {
+ case BMS_EQUAL:
+ /*
+ * Use the smaller one (hopefully more accurate).
+ * If both have the same size, use the first one.
+ */
+ if (mvstats[i].stakeys->dim1 >= mvstats[j].stakeys->dim1)
+ redundant[i] = TRUE;
+ else
+ redundant[j] = TRUE;
+
+ break;
+
+ case BMS_SUBSET1: /* curr is subset of prev */
+ redundant[i] = TRUE;
+ break;
+
+ case BMS_SUBSET2: /* prev is subset of curr */
+ redundant[j] = TRUE;
+ break;
+
+ case BMS_DIFFERENT:
+ /* do nothing - keep both stats */
+ break;
+ }
+
+ bms_free(prev);
+ }
+
+ bms_free(curr);
+ }
+
+ /* now, let's remove the redundant statistics from the arrays */
+ j = 0;
+ for (i = 0; i < nmvstats; i++)
+ {
+ if (redundant[i])
+ continue;
+
+ stats_attnums[j] = stats_attnums[i];
+ mvstats[j] = mvstats[i];
+
+ j++;
+ }
+
+ nmvstats = j;
+ }
+
+ /* collect clauses and a bitmap of attnums for each */
+ nclauses = 0;
+ clauses_attnums = (Bitmapset **)palloc0(list_length(clauses)
+ * sizeof(Bitmapset *));
+ clauses_array = (Node **)palloc0(list_length(clauses)
+ * sizeof(Node *));
+
+ foreach (l, clauses)
+ {
+ Index relid;
+ Bitmapset * attnums = NULL;
/*
- * Use this statistics when it improves the number of matches or
- * when it matches the same number of attributes but is smaller.
+ * The clause has to be mv-compatible (suitable operators etc.).
*/
- if ((matches > current_matches) ||
- ((matches == current_matches) && (current_dims > numattrs)))
+ if (! clause_is_mv_compatible(root, (Node *)lfirst(l), varRelid,
+ &relid, &attnums, sjinfo, type))
+ elog(ERROR, "should not get non-mv-compatible cluase");
+
+ clauses_attnums[nclauses] = attnums;
+ clauses_array[nclauses] = (Node *)lfirst(l);
+ nclauses += 1;
+ }
+
+ /* collect conditions and bitmap of attnums */
+ nconditions = 0;
+ conditions_attnums = (Bitmapset **)palloc0(list_length(conditions)
+ * sizeof(Bitmapset *));
+ conditions_array = (Node **)palloc0(list_length(conditions)
+ * sizeof(Node *));
+
+ foreach (l, conditions)
+ {
+ Index relid;
+ Bitmapset * attnums = NULL;
+
+ /* conditions are mv-compatible (thanks to the reduction) */
+ if (! clause_is_mv_compatible(root, (Node *)lfirst(l), varRelid,
+ &relid, &attnums, sjinfo, type))
+ elog(ERROR, "should not get non-mv-compatible cluase");
+
+ conditions_attnums[nconditions] = attnums;
+ conditions_array[nconditions] = (Node *)lfirst(l);
+ nconditions += 1;
+ }
+
+ /*
+ * Build bitmaps with info about which clauses/conditions are
+ * covered by each statistics (so that we don't need to call
+ * bms_is_subset over and over again).
+ */
+ clause_cover_map = (bool*)palloc0(nclauses * nmvstats * sizeof(bool));
+ condition_cover_map = (bool*)palloc0(nconditions * nmvstats * sizeof(bool));
+ ruled_out = (int*)palloc0(nmvstats * sizeof(int));
+
+ for (i = 0; i < nmvstats; i++)
+ {
+ ruled_out[i] = -1; /* not ruled out by default */
+ for (j = 0; j < nclauses; j++)
+ {
+ clause_cover_map[i * nclauses + j]
+ = bms_is_subset(clauses_attnums[j],
+ stats_attnums[i]);
+ }
+
+ for (j = 0; j < nconditions; j++)
+ {
+ condition_cover_map[i * nconditions + j]
+ = bms_is_subset(conditions_attnums[j],
+ stats_attnums[i]);
+ }
+ }
+
+ /* do the optimization itself */
+ if (mvstat_search_type == MVSTAT_SEARCH_EXHAUSTIVE)
+ choose_mv_statistics_exhaustive(root, 0,
+ nmvstats, mvstats, stats_attnums,
+ nclauses, clauses_array, clauses_attnums,
+ nconditions, conditions_array, conditions_attnums,
+ clause_cover_map, condition_cover_map,
+ ruled_out, NULL, &best);
+ else
+ choose_mv_statistics_greedy(root, 0,
+ nmvstats, mvstats, stats_attnums,
+ nclauses, clauses_array, clauses_attnums,
+ nconditions, conditions_array, conditions_attnums,
+ clause_cover_map, condition_cover_map,
+ ruled_out, NULL, &best);
+
+ /* maybe we should leave the cleanup up to the memory context */
+ pfree(mvstats_filtered);
+ pfree(stats_attnums);
+ pfree(clauses_attnums);
+ pfree(clauses_array);
+ pfree(conditions_attnums);
+ pfree(conditions_array);
+ pfree(clause_cover_map);
+ pfree(condition_cover_map);
+ pfree(ruled_out);
+
+ if (best != NULL)
+ {
+ for (i = 0; i < best->nstats; i++)
{
- choice = info;
- current_matches = matches;
- current_dims = numattrs;
+ MVStatisticInfo *info = makeNode(MVStatisticInfo);
+ memcpy(info, &mvstats[best->stats[i]], sizeof(MVStatisticInfo));
+ result = lappend(result, info);
}
+ pfree(best);
}
- return choice;
+ pfree(mvstats);
+
+ return result;
}
@@ -1639,6 +3306,51 @@ clause_is_mv_compatible(PlannerInfo *root, Node *clause, Oid varRelid,
return false;
}
+
+static Bitmapset *
+clause_mv_get_attnums(PlannerInfo *root, Node *clause)
+{
+ Bitmapset * attnums = NULL;
+
+ /* Extract clause from restrict info, if needed. */
+ if (IsA(clause, RestrictInfo))
+ clause = (Node*)((RestrictInfo*)clause)->clause;
+
+ /*
+ * Only simple opclauses and IS NULL tests are compatible with
+ * multivariate stats at this point.
+ */
+ if ((is_opclause(clause))
+ && (list_length(((OpExpr *) clause)->args) == 2))
+ {
+ OpExpr *expr = (OpExpr *) clause;
+
+ if (IsA(linitial(expr->args), Var))
+ attnums = bms_add_member(attnums,
+ ((Var*)linitial(expr->args))->varattno);
+ else
+ attnums = bms_add_member(attnums,
+ ((Var*)lsecond(expr->args))->varattno);
+ }
+ else if (IsA(clause, NullTest)
+ && IsA(((NullTest*)clause)->arg, Var))
+ {
+ attnums = bms_add_member(attnums,
+ ((Var*)((NullTest*)clause)->arg)->varattno);
+ }
+ else if (or_clause(clause) || and_clause(clause))
+ {
+ ListCell *l;
+ foreach (l, ((BoolExpr*)clause)->args)
+ {
+ attnums = bms_join(attnums,
+ clause_mv_get_attnums(root, (Node*)lfirst(l)));
+ }
+ }
+
+ return attnums;
+}
+
/*
* Performs reduction of clauses using functional dependencies, i.e.
* removes clauses that are considered redundant. It simply walks
@@ -2071,22 +3783,26 @@ clauselist_apply_dependencies(PlannerInfo *root, List *clauses,
* as the clauses are processed (and skip items that are 'match').
*/
static Selectivity
-clauselist_mv_selectivity_mcvlist(PlannerInfo *root, List *clauses,
- MVStatisticInfo *mvstats, bool *fullmatch,
- Selectivity *lowsel)
+clauselist_mv_selectivity_mcvlist(PlannerInfo *root, MVStatisticInfo *mvstats,
+ List *clauses, List *conditions, bool is_or,
+ bool *fullmatch, Selectivity *lowsel)
{
int i;
Selectivity s = 0.0;
+ Selectivity t = 0.0;
Selectivity u = 0.0;
MCVList mcvlist = NULL;
+
int nmatches = 0;
+ int nconditions = 0;
/* match/mismatch bitmap for each MCV item */
char * matches = NULL;
+ char * condition_matches = NULL;
Assert(clauses != NIL);
- Assert(list_length(clauses) >= 2);
+ Assert(list_length(clauses) >= 1);
/* there's no MCV list built yet */
if (! mvstats->mcv_built)
@@ -2097,17 +3813,44 @@ clauselist_mv_selectivity_mcvlist(PlannerInfo *root, List *clauses,
Assert(mcvlist != NULL);
Assert(mcvlist->nitems > 0);
- /* by default all the MCV items match the clauses fully */
- matches = palloc0(sizeof(char) * mcvlist->nitems);
- memset(matches, MVSTATS_MATCH_FULL, sizeof(char)*mcvlist->nitems);
-
/* number of matching MCV items */
nmatches = mcvlist->nitems;
+ nconditions = mcvlist->nitems;
+
+ /* conditions (always AND-connected) */
+ condition_matches = palloc0(sizeof(char) * nconditions);
+ memset(condition_matches, MVSTATS_MATCH_FULL, sizeof(char)*nconditions);
+ /* by default all the MCV items match the clauses fully (AND) or
+ * not at all (OR) */
+ matches = palloc0(sizeof(char) * nmatches);
+
+ if (is_or)
+ memset(matches, MVSTATS_MATCH_NONE, sizeof(char)*nmatches);
+ else
+ memset(matches, MVSTATS_MATCH_FULL, sizeof(char)*nmatches);
+
+ /*
+ * build the match bitmap for the conditions (conditions are always
+ * connected by AND)
+ */
+ if (conditions != NIL)
+ nconditions = update_match_bitmap_mcvlist(root, conditions,
+ mvstats->stakeys, mcvlist,
+ nconditions, condition_matches,
+ lowsel, fullmatch, false);
+
+ /*
+ * build the match bitmap for the estimated clauses
+ *
+ * TODO This evaluates the clauses for all MCV items, even those
+ * ruled out by the conditions. The final result would be the
+ * same, but skipping them might be faster.
+ */
nmatches = update_match_bitmap_mcvlist(root, clauses,
mvstats->stakeys, mcvlist,
- nmatches, matches,
- lowsel, fullmatch, false);
+ ((is_or) ? 0 : nmatches), matches,
+ lowsel, fullmatch, is_or);
/* sum frequencies for all the matching MCV items */
for (i = 0; i < mcvlist->nitems; i++)
@@ -2115,14 +3858,25 @@ clauselist_mv_selectivity_mcvlist(PlannerInfo *root, List *clauses,
/* used to 'scale' for MCV lists not covering all tuples */
u += mcvlist->items[i]->frequency;
+ /* skip MCV items not matching the conditions */
+ if (condition_matches[i] == MVSTATS_MATCH_NONE)
+ continue;
+
if (matches[i] != MVSTATS_MATCH_NONE)
s += mcvlist->items[i]->frequency;
+
+ t += mcvlist->items[i]->frequency;
}
pfree(matches);
+ pfree(condition_matches);
pfree(mcvlist);
- return s*u;
+ /* no condition matches */
+ if (t == 0.0)
+ return (Selectivity)0.0;
+
+ return (s / t) * u;
}
/*
@@ -2520,13 +4274,16 @@ update_match_bitmap_mcvlist(PlannerInfo *root, List *clauses,
* this is not uncommon, but for histograms it's not that clear.
*/
static Selectivity
-clauselist_mv_selectivity_histogram(PlannerInfo *root, List *clauses,
- MVStatisticInfo *mvstats)
+clauselist_mv_selectivity_histogram(PlannerInfo *root, MVStatisticInfo *mvstats,
+ List *clauses, List *conditions, bool is_or)
{
int i;
Selectivity s = 0.0;
+ Selectivity t = 0.0;
int nmatches = 0;
+ int nconditions = 0;
char *matches = NULL;
+ char *condition_matches = NULL;
MVSerializedHistogram mvhist = NULL;
@@ -2539,36 +4296,77 @@ clauselist_mv_selectivity_histogram(PlannerInfo *root, List *clauses,
Assert (mvhist != NULL);
Assert (clauses != NIL);
- Assert (list_length(clauses) >= 2);
+ Assert (list_length(clauses) >= 1);
+
+ nmatches = mvhist->nbuckets;
+ nconditions = mvhist->nbuckets;
/*
* Bitmap of bucket matches (mismatch, partial, full). By default
* all buckets fully match (AND) or not at all (OR), and the update
* below adjusts that.
*/
- matches = palloc0(sizeof(char) * mvhist->nbuckets);
- memset(matches, MVSTATS_MATCH_FULL, sizeof(char)*mvhist->nbuckets);
+ matches = palloc0(sizeof(char) * nmatches);
- nmatches = mvhist->nbuckets;
+ if (is_or)
+ memset(matches, MVSTATS_MATCH_NONE, sizeof(char)*nmatches);
+ else
+ memset(matches, MVSTATS_MATCH_FULL, sizeof(char)*nmatches);
+
+ condition_matches = palloc0(sizeof(char)*nconditions);
+ memset(condition_matches, MVSTATS_MATCH_FULL, sizeof(char)*nconditions);
+
+ /* build the match bitmap for the conditions (always AND-connected) */
+ if (conditions != NIL)
+ update_match_bitmap_histogram(root, conditions,
+ mvstats->stakeys, mvhist,
+ nconditions, condition_matches, false);
- /* build the match bitmap */
+ /*
+ * build the match bitmap for the estimated clauses
+ *
+ * TODO This evaluates the clauses for all buckets, even those
+ * ruled out by the conditions. The final result should be
+ * the same, but it might be faster.
+ */
update_match_bitmap_histogram(root, clauses,
mvstats->stakeys, mvhist,
- nmatches, matches, false);
+ ((is_or) ? 0 : nmatches), matches,
+ is_or);
/* now, walk through the buckets and sum the selectivities */
for (i = 0; i < mvhist->nbuckets; i++)
{
+ float coeff = 1.0;
+
+ /* skip buckets not matching the conditions */
+ if (condition_matches[i] == MVSTATS_MATCH_NONE)
+ continue;
+ else if (condition_matches[i] == MVSTATS_MATCH_PARTIAL)
+ coeff = 0.5;
+
+ t += coeff * mvhist->buckets[i]->ntuples;
+
if (matches[i] == MVSTATS_MATCH_FULL)
- s += mvhist->buckets[i]->ntuples;
+ s += coeff * mvhist->buckets[i]->ntuples;
else if (matches[i] == MVSTATS_MATCH_PARTIAL)
- s += 0.5 * mvhist->buckets[i]->ntuples;
+ /*
+ * TODO If both conditions and clauses match partially, this
+ * will use a 0.25 match - not sure if that's the right
+ * solution, but it seems reasonable.
+ */
+ s += coeff * 0.5 * mvhist->buckets[i]->ntuples;
}
/* release the allocated bitmap and deserialized histogram */
pfree(matches);
+ pfree(condition_matches);
pfree(mvhist);
- return s;
+ /* no condition matches */
+ if (t == 0.0)
+ return (Selectivity)0.0;
+
+ return (s / t);
}
/*
@@ -3191,11 +4989,35 @@ update_match_bitmap_histogram(PlannerInfo *root, List *clauses,
}
}
- elog(WARNING, "calls=%d hits=%d hit ratio %.2f",
- calls, hits, hits * 100.0 / calls);
+// elog(WARNING, "calls=%d hits=%d hit ratio %.2f",
+// calls, hits, hits * 100.0 / calls);
/* free the call cache */
pfree(callcache);
return nmatches;
}
+
+static Bitmapset *
+get_varattnos(Node * node, Index relid)
+{
+ int k;
+ Bitmapset *varattnos = NULL;
+ Bitmapset *result = NULL;
+
+ /* get the varattnos */
+ pull_varattnos(node, relid, &varattnos);
+
+ k = -1;
+ while ((k = bms_next_member(varattnos, k)) >= 0)
+ {
+ if (k + FirstLowInvalidHeapAttributeNumber > 0)
+ result
+ = bms_add_member(result,
+ k + FirstLowInvalidHeapAttributeNumber);
+ }
+
+ bms_free(varattnos);
+
+ return result;
+}
diff --git a/src/backend/optimizer/path/costsize.c b/src/backend/optimizer/path/costsize.c
index 1a0d358..71beb2e 100644
--- a/src/backend/optimizer/path/costsize.c
+++ b/src/backend/optimizer/path/costsize.c
@@ -3280,7 +3280,8 @@ compute_semi_anti_join_factors(PlannerInfo *root,
joinquals,
0,
jointype,
- sjinfo);
+ sjinfo,
+ NIL);
/*
* Also get the normal inner-join selectivity of the join clauses.
@@ -3303,7 +3304,8 @@ compute_semi_anti_join_factors(PlannerInfo *root,
joinquals,
0,
JOIN_INNER,
- &norm_sjinfo);
+ &norm_sjinfo,
+ NIL);
/* Avoid leaking a lot of ListCells */
if (jointype == JOIN_ANTI)
@@ -3470,7 +3472,7 @@ approx_tuple_count(PlannerInfo *root, JoinPath *path, List *quals)
Node *qual = (Node *) lfirst(l);
/* Note that clause_selectivity will be able to cache its result */
- selec *= clause_selectivity(root, qual, 0, JOIN_INNER, &sjinfo);
+ selec *= clause_selectivity(root, qual, 0, JOIN_INNER, &sjinfo, NIL);
}
/* Apply it to the input relation sizes */
@@ -3506,7 +3508,8 @@ set_baserel_size_estimates(PlannerInfo *root, RelOptInfo *rel)
rel->baserestrictinfo,
0,
JOIN_INNER,
- NULL);
+ NULL,
+ NIL);
rel->rows = clamp_row_est(nrows);
@@ -3543,7 +3546,8 @@ get_parameterized_baserel_size(PlannerInfo *root, RelOptInfo *rel,
allclauses,
rel->relid, /* do not use 0! */
JOIN_INNER,
- NULL);
+ NULL,
+ NIL);
nrows = clamp_row_est(nrows);
/* For safety, make sure result is not more than the base estimate */
if (nrows > rel->rows)
@@ -3681,12 +3685,14 @@ calc_joinrel_size_estimate(PlannerInfo *root,
joinquals,
0,
jointype,
- sjinfo);
+ sjinfo,
+ NIL);
pselec = clauselist_selectivity(root,
pushedquals,
0,
jointype,
- sjinfo);
+ sjinfo,
+ NIL);
/* Avoid leaking a lot of ListCells */
list_free(joinquals);
@@ -3698,7 +3704,8 @@ calc_joinrel_size_estimate(PlannerInfo *root,
restrictlist,
0,
jointype,
- sjinfo);
+ sjinfo,
+ NIL);
pselec = 0.0; /* not used, keep compiler quiet */
}
diff --git a/src/backend/optimizer/util/orclauses.c b/src/backend/optimizer/util/orclauses.c
index f0acc14..e41508b 100644
--- a/src/backend/optimizer/util/orclauses.c
+++ b/src/backend/optimizer/util/orclauses.c
@@ -280,7 +280,7 @@ consider_new_or_clause(PlannerInfo *root, RelOptInfo *rel,
* saving work later.)
*/
or_selec = clause_selectivity(root, (Node *) or_rinfo,
- 0, JOIN_INNER, NULL);
+ 0, JOIN_INNER, NULL, NIL);
/*
* The clause is only worth adding to the query if it rejects a useful
@@ -342,7 +342,7 @@ consider_new_or_clause(PlannerInfo *root, RelOptInfo *rel,
/* Compute inner-join size */
orig_selec = clause_selectivity(root, (Node *) join_or_rinfo,
- 0, JOIN_INNER, &sjinfo);
+ 0, JOIN_INNER, &sjinfo, NIL);
/* And hack cached selectivity so join size remains the same */
join_or_rinfo->norm_selec = orig_selec / or_selec;
diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index 4dd3f9f..326dd36 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -1580,13 +1580,15 @@ booltestsel(PlannerInfo *root, BoolTestType booltesttype, Node *arg,
case IS_NOT_FALSE:
selec = (double) clause_selectivity(root, arg,
varRelid,
- jointype, sjinfo);
+ jointype, sjinfo,
+ NIL);
break;
case IS_FALSE:
case IS_NOT_TRUE:
selec = 1.0 - (double) clause_selectivity(root, arg,
varRelid,
- jointype, sjinfo);
+ jointype, sjinfo,
+ NIL);
break;
default:
elog(ERROR, "unrecognized booltesttype: %d",
@@ -6196,7 +6198,8 @@ genericcostestimate(PlannerInfo *root,
indexSelectivity = clauselist_selectivity(root, selectivityQuals,
index->rel->relid,
JOIN_INNER,
- NULL);
+ NULL,
+ NIL);
/*
* If caller didn't give us an estimate, estimate the number of index
@@ -6521,7 +6524,8 @@ btcostestimate(PG_FUNCTION_ARGS)
btreeSelectivity = clauselist_selectivity(root, selectivityQuals,
index->rel->relid,
JOIN_INNER,
- NULL);
+ NULL,
+ NIL);
numIndexTuples = btreeSelectivity * index->rel->tuples;
/*
@@ -7264,7 +7268,8 @@ gincostestimate(PG_FUNCTION_ARGS)
*indexSelectivity = clauselist_selectivity(root, selectivityQuals,
index->rel->relid,
JOIN_INNER,
- NULL);
+ NULL,
+ NIL);
/* fetch estimated page cost for tablespace containing index */
get_tablespace_page_costs(index->reltablespace,
@@ -7496,7 +7501,7 @@ brincostestimate(PG_FUNCTION_ARGS)
*indexSelectivity =
clauselist_selectivity(root, indexQuals,
path->indexinfo->rel->relid,
- JOIN_INNER, NULL);
+ JOIN_INNER, NULL, NIL);
*indexCorrelation = 1;
/*
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 8727ee3..bd2c7a9 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -75,6 +75,7 @@
#include "utils/bytea.h"
#include "utils/guc_tables.h"
#include "utils/memutils.h"
+#include "utils/mvstats.h"
#include "utils/pg_locale.h"
#include "utils/plancache.h"
#include "utils/portal.h"
@@ -393,6 +394,15 @@ static const struct config_enum_entry row_security_options[] = {
};
/*
+ * Search algorithm for multivariate stats.
+ */
+static const struct config_enum_entry mvstat_search_options[] = {
+ {"greedy", MVSTAT_SEARCH_GREEDY, false},
+ {"exhaustive", MVSTAT_SEARCH_EXHAUSTIVE, false},
+ {NULL, 0, false}
+};
+
+/*
* Options for enum values stored in other modules
*/
extern const struct config_enum_entry wal_level_options[];
@@ -3648,6 +3658,16 @@ static struct config_enum ConfigureNamesEnum[] =
NULL, NULL, NULL
},
+ {
+ {"mvstat_search", PGC_USERSET, QUERY_TUNING_OTHER,
+ gettext_noop("Sets the algorithm used for combining multivariate stats."),
+ NULL
+ },
+ &mvstat_search_type,
+ MVSTAT_SEARCH_GREEDY, mvstat_search_options,
+ NULL, NULL, NULL
+ },
+
/* End-of-list marker */
{
{NULL, 0, 0, NULL, NULL}, NULL, 0, NULL, NULL, NULL, NULL
diff --git a/src/include/optimizer/cost.h b/src/include/optimizer/cost.h
index 9c2000b..7a3835b 100644
--- a/src/include/optimizer/cost.h
+++ b/src/include/optimizer/cost.h
@@ -182,11 +182,13 @@ extern Selectivity clauselist_selectivity(PlannerInfo *root,
List *clauses,
int varRelid,
JoinType jointype,
- SpecialJoinInfo *sjinfo);
+ SpecialJoinInfo *sjinfo,
+ List *conditions);
extern Selectivity clause_selectivity(PlannerInfo *root,
Node *clause,
int varRelid,
JoinType jointype,
- SpecialJoinInfo *sjinfo);
+ SpecialJoinInfo *sjinfo,
+ List *conditions);
#endif /* COST_H */
diff --git a/src/include/utils/mvstats.h b/src/include/utils/mvstats.h
index 1cb9400..6909294 100644
--- a/src/include/utils/mvstats.h
+++ b/src/include/utils/mvstats.h
@@ -16,6 +16,14 @@
#include "commands/vacuum.h"
+typedef enum MVStatSearchType
+{
+ MVSTAT_SEARCH_EXHAUSTIVE, /* exhaustive search */
+ MVSTAT_SEARCH_GREEDY /* greedy search */
+} MVStatSearchType;
+
+extern int mvstat_search_type;
+
/*
* Degree of how much MCV item / histogram bucket matches a clause.
* This is then considered when computing the selectivity.
--
1.9.3
Attachment: 0006-teach-expression-walker-about-RestrictInfo-because-o.patch (text/x-patch)
From 08f19b674c35127d9c8a8f2cfa371fbf3c80ff00 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@pgaddict.com>
Date: Tue, 28 Apr 2015 19:56:33 +0200
Subject: [PATCH 6/6] teach expression walker about RestrictInfo (because of
pull_varnos)
---
src/backend/nodes/nodeFuncs.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/src/backend/nodes/nodeFuncs.c b/src/backend/nodes/nodeFuncs.c
index d6f1f5b..843f06d 100644
--- a/src/backend/nodes/nodeFuncs.c
+++ b/src/backend/nodes/nodeFuncs.c
@@ -1933,6 +1933,8 @@ expression_tree_walker(Node *node,
return walker(((PlaceHolderInfo *) node)->ph_var, context);
case T_RangeTblFunction:
return walker(((RangeTblFunction *) node)->funcexpr, context);
+ case T_RestrictInfo:
+ return walker(((RestrictInfo *) node)->clause, context);
default:
elog(ERROR, "unrecognized node type: %d",
(int) nodeTag(node));
--
1.9.3
Hello, this might be somewhat out of place but strongly related
to this patch, so I'll propose it here.
This is a proposal of a new feature for this patch, or a request
for your approval to pursue it as a separate (but closely related)
project.
===
Attached is v6 of the multivariate stats, with a number of
improvements:
...
2) fix of pg_proc issues (reported by Jeff)
3) rebase to current master
Unfortunately, the v6 patch suffers from system OID conflicts
with recently added ones. And what is more unfortunate for me is
that the code for functional dependencies looks unfinished :)
I mention this because I recently hit an issue caused by strong
correlation between two columns in the dbt3 benchmark. Two columns
in one of the tables are strongly correlated but not functionally
dependent; there are too many values and their distribution is very
uniform, so an MCV list is of no use for the table (and a histogram
does not help with equality conditions). As a result, the planner
estimates the number of rows badly wrong, as expected, especially
for joins.
I then tried calculating the ratio between the product of the
distinctness of every column and the distinctness of the set of
the columns - call it the multivariate coefficient here - and found
it very useful for its small storage space, cheap calculation, and
simple code.
The attached first is a script to generate problematic tables.
And the second is a patch to make use of the mv coef on current
master. The patch is a very primitive POC so no syntactical
interfaces involved.
For the case of your first example,
=# create table t (a int, b int, c int);
=# insert into t (select a/10000, a/10000, a/10000
from generate_series(0, 999999) a);
=# analyze t;
=# explain analyze select * from t where a = 1 and b = 1 and c = 1;
Seq Scan on t (cost=0.00..22906.00 rows=1 width=12)
(actual time=3.878..250.628 rows=10000 loops=1)
Make use of mv coefficient.
=# insert into pg_mvcoefficient values ('t'::regclass, 1, 2, 3, 0);
=# analyze t;
=# explain analyze select * from t where a = 1 and b = 1 and c = 1;
Seq Scan on t (cost=0.00..22906.00 rows=9221 width=12)
(actual time=3.740..242.330 rows=10000 loops=1)
Row number estimation was largely improved.
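(A sketch of the arithmetic: the stored coefficient is
ndistinct(a,b,c) divided by the product of the per-column ndistincts,
and clauselist_selectivity() divides the combined selectivity by it.
Here every ndistinct is 100, so

    1/100 * 1/100 * 1/100
   ----------------------- = 1/100
   100 / (100 * 100 * 100)

i.e. 1/100 of the 1,000,000 rows = 10,000 rows; the 9221 estimate
deviates from that only because of ANALYZE sampling.)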
Well, my example,
$ perl gentbl.pl 10000 | psql postgres
$ psql postgres
=# explain analyze select * from t1 where a = 1 and b = 2501;
Seq Scan on t1 (cost=0.00..6216.00 rows=1 width=8)
(actual time=0.030..66.005 rows=8 loops=1)

=# explain analyze select * from t1 join t2 on (t1.a = t2.a and t1.b = t2.b);
Hash Join (cost=1177.00..11393.76 rows=76 width=16)
(actual time=29.811..322.271 rows=320000 loops=1)
A very bad estimate for the join.
=# insert into pg_mvcoefficient values ('t1'::regclass, 1, 2, 0, 0);
=# analyze t1;
=# explain analyze select * from t1 where a = 1 and b = 2501;
Seq Scan on t1 (cost=0.00..6216.00 rows=8 width=8)
(actual time=0.032..104.144 rows=8 loops=1)

=# explain analyze select * from t1 join t2 on (t1.a = t2.a and t1.b = t2.b);
Hash Join (cost=1177.00..11393.76 rows=305652 width=16)
(actual time=40.642..325.679 rows=320000 loops=1)
It now gives almost correct estimates.
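(The join estimate improves for the same reason: assuming both tables
have roughly the same per-column and combined ndistincts, dividing the
product of the per-clause join selectivities by the stored coefficient
moves the combined selectivity toward 1/ndistinct(a,b), as if (a,b)
were a single join key.)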
I think the results above show that the multivariate coefficient
significantly improves estimates when correlated columns are
involved.
Would you consider this for your patch? Otherwise, if you don't
mind, I'll pursue it as a project separate from yours. Except for
the user interface it shouldn't conflict with yours, I suppose, but
eventually they would need some consolidation work.
regards,
1) fix of the contrib compile-time errors (reported by Jeff)
2) fix of pg_proc issues (reported by Jeff)
3) rebase to current master
4) fix a bunch of issues in the previous patches, due to referencing
   some parts too early (e.g. histograms in the first patch, etc.)
5) remove the explicit DELETEs from pg_mv_statistic (in the regression
   tests), this is now handled automatically by DROP TABLE etc.
6) number of performance optimizations in selectivity estimations:
   (a) minimize calls to get_oprrest, significantly reducing
       syscache calls
   (b) significant reduction of palloc overhead in deserialization of
       MCV lists and histograms
   (c) use more compact serialized representation of MCV lists and
       histograms, often removing ~50% of the size
   (d) use histograms with limited deserialization, which also allows
       caching function calls
   (e) modified histogram bucket partitioning, resulting in more even
       bucket distribution (i.e. producing buckets with more equal
       density and about equal size of each dimension)
7) add functions for listing MCV list items and histogram buckets:
   - pg_mv_mcvlist_items(oid)
   - pg_mv_histogram_buckets(oid, type)
   This is quite useful when analyzing the MCV lists / histograms.
8) improved support for OR clauses
9) allow calling pull_varnos() on expression trees containing
RestrictInfo nodes (not sure if this is the right fix, it's being
discussed in another thread)
--
Kyotaro Horiguchi
NTT Open Source Software Center
Attachments:
mvcoef-poc-20150513.patch (text/x-patch; charset=us-ascii)
diff --git a/src/backend/catalog/Makefile b/src/backend/catalog/Makefile
index 37d05d1..d00835e 100644
--- a/src/backend/catalog/Makefile
+++ b/src/backend/catalog/Makefile
@@ -33,7 +33,8 @@ POSTGRES_BKI_SRCS = $(addprefix $(top_srcdir)/src/include/catalog/,\
pg_opfamily.h pg_opclass.h pg_am.h pg_amop.h pg_amproc.h \
pg_language.h pg_largeobject_metadata.h pg_largeobject.h pg_aggregate.h \
pg_statistic.h pg_rewrite.h pg_trigger.h pg_event_trigger.h pg_description.h \
- pg_cast.h pg_enum.h pg_namespace.h pg_conversion.h pg_depend.h \
+ pg_cast.h pg_enum.h pg_mvcoefficient.h pg_namespace.h pg_conversion.h \
+ pg_depend.h \
pg_database.h pg_db_role_setting.h pg_tablespace.h pg_pltemplate.h \
pg_authid.h pg_auth_members.h pg_shdepend.h pg_shdescription.h \
pg_ts_config.h pg_ts_config_map.h pg_ts_dict.h \
diff --git a/src/backend/commands/analyze.c b/src/backend/commands/analyze.c
index 15ec0ad..9edaa0f 100644
--- a/src/backend/commands/analyze.c
+++ b/src/backend/commands/analyze.c
@@ -27,6 +27,7 @@
#include "catalog/indexing.h"
#include "catalog/pg_collation.h"
#include "catalog/pg_inherits_fn.h"
+#include "catalog/pg_mvcoefficient.h"
#include "catalog/pg_namespace.h"
#include "commands/dbcommands.h"
#include "commands/tablecmds.h"
@@ -45,7 +46,9 @@
#include "storage/procarray.h"
#include "utils/acl.h"
#include "utils/attoptcache.h"
+#include "utils/catcache.h"
#include "utils/datum.h"
+#include "utils/fmgroids.h"
#include "utils/guc.h"
#include "utils/lsyscache.h"
#include "utils/memutils.h"
@@ -110,6 +113,12 @@ static void update_attstats(Oid relid, bool inh,
int natts, VacAttrStats **vacattrstats);
static Datum std_fetch_func(VacAttrStatsP stats, int rownum, bool *isNull);
static Datum ind_fetch_func(VacAttrStatsP stats, int rownum, bool *isNull);
+static float4 compute_mv_distinct(int nattrs,
+ int *stacolnums,
+ VacAttrStats **stats,
+ AnalyzeAttrFetchFunc fetchfunc,
+ int samplerows,
+ double totalrows);
/*
@@ -552,6 +561,92 @@ do_analyze_rel(Relation onerel, int options, VacuumParams *params,
MemoryContextResetAndDeleteChildren(col_context);
}
+ /* Compute multivariate distinctness if requested in pg_mvcoefficient */
+ {
+ ScanKeyData scankey;
+ SysScanDesc sysscan;
+ Relation mvcrel;
+ HeapTuple oldtup, newtup;
+ int i;
+
+ mvcrel = heap_open(MvCoefficientRelationId, RowExclusiveLock);
+
+ ScanKeyInit(&scankey,
+ Anum_pg_mvcoefficient_mvcreloid,
+ BTEqualStrategyNumber, F_OIDEQ,
+ ObjectIdGetDatum(onerel->rd_id));
+ sysscan = systable_beginscan(mvcrel, MvCoefficientIndexId, true,
+ NULL, 1, &scankey);
+ oldtup = systable_getnext(sysscan);
+
+ while (HeapTupleIsValid(oldtup))
+ {
+ int colnums[3];
+ int ncols = 0;
+ float4 nd;
+ Datum values[Natts_pg_mvcoefficient];
+ bool nulls[Natts_pg_mvcoefficient];
+ bool replaces[Natts_pg_mvcoefficient];
+ float4 simple_mv_distinct;
+
+ Form_pg_mvcoefficient mvc =
+ (Form_pg_mvcoefficient) GETSTRUCT (oldtup);
+
+ if (mvc->mvcattr1 > 0)
+ colnums[ncols++] = mvc->mvcattr1 - 1;
+ if (mvc->mvcattr2 > 0)
+ colnums[ncols++] = mvc->mvcattr2 - 1;
+ if (mvc->mvcattr3 > 0)
+ colnums[ncols++] = mvc->mvcattr3 - 1;
+
+ if (ncols > 0)
+ {
+ int j;
+ float4 nd_coef;
+
+ simple_mv_distinct =
+ vacattrstats[colnums[0]]->stadistinct;
+ if (simple_mv_distinct < 0)
+ simple_mv_distinct = -simple_mv_distinct * totalrows;
+ for (j = 1 ; j < ncols ; j++)
+ {
+ float4 t = vacattrstats[colnums[j]]->stadistinct;
+
+ if (t < 0)
+ t = -t * totalrows;
+ simple_mv_distinct *= t;
+ }
+
+ nd = compute_mv_distinct(ncols, colnums, vacattrstats,
+ std_fetch_func, numrows, totalrows);
+
+ nd_coef = nd / simple_mv_distinct;
+
+ for (i = 0; i < Natts_pg_mvcoefficient ; ++i)
+ {
+ nulls[i] = false;
+ replaces[i] = false;
+ }
+ values[Anum_pg_mvcoefficient_mvccoefficient - 1] =
+ Float4GetDatum(nd_coef);
+ replaces[Anum_pg_mvcoefficient_mvccoefficient - 1] = true;
+ newtup = heap_modify_tuple(oldtup,
+ RelationGetDescr(mvcrel),
+ values,
+ nulls,
+ replaces);
+ simple_heap_update(mvcrel, &oldtup->t_self, newtup);
+
+ CatalogUpdateIndexes(mvcrel, newtup);
+
+ oldtup = systable_getnext(sysscan);
+ }
+ }
+
+ systable_endscan(sysscan);
+ heap_close(mvcrel, RowExclusiveLock);
+ }
+
if (hasindex)
compute_index_stats(onerel, totalrows,
indexdata, nindexes,
@@ -1911,6 +2006,7 @@ static void compute_scalar_stats(VacAttrStatsP stats,
int samplerows,
double totalrows);
static int compare_scalars(const void *a, const void *b, void *arg);
+static int compare_mv_scalars(const void *a, const void *b, void *arg);
static int compare_mcvs(const void *a, const void *b);
@@ -2840,6 +2936,207 @@ compute_scalar_stats(VacAttrStatsP stats,
}
/*
+ * compute_mv_distinct() -- compute multicolumn distinctness
+ */
+
+static float4
+compute_mv_distinct(int nattrs,
+ int *stacolnums,
+ VacAttrStats **stats,
+ AnalyzeAttrFetchFunc fetchfunc,
+ int samplerows,
+ double totalrows)
+{
+ int i, j;
+ int null_cnt = 0;
+ int nonnull_cnt = 0;
+ int toowide_cnt = 0;
+ double total_width = 0;
+ bool is_varlena[3];
+ SortSupportData ssup[3];
+ ScalarItem **values, *values2;
+ int values_cnt = 0;
+ int *tupnoLink;
+ StdAnalyzeData *mystats[3];
+ float4 fndistinct;
+
+ Assert (nattrs <= 3);
+ for (i = 0 ; i < nattrs ; i++)
+ {
+ VacAttrStats *vas = stats[stacolnums[i]];
+ is_varlena[i] =
+ !vas->attrtype->typbyval && vas->attrtype->typlen == -1;
+ mystats[i] =
+ (StdAnalyzeData*) vas->extra_data;
+ }
+
+ values2 = (ScalarItem *) palloc(nattrs * samplerows * sizeof(ScalarItem));
+ values = (ScalarItem **) palloc(samplerows * sizeof(ScalarItem*));
+ tupnoLink = (int *) palloc(samplerows * sizeof(int));
+
+ for (i = 0 ; i < samplerows ; i++)
+ values[i] = &values2[i * nattrs];
+
+ memset(ssup, 0, sizeof(ssup));
+ for (i = 0 ; i < nattrs ; i++)
+ {
+ ssup[i].ssup_cxt = CurrentMemoryContext;
+ /* We always use the default collation for statistics */
+ ssup[i].ssup_collation = DEFAULT_COLLATION_OID;
+ ssup[i].ssup_nulls_first = false;
+ ssup[i].abbreviate = true;
+ PrepareSortSupportFromOrderingOp(mystats[i]->ltopr, &ssup[i]);
+ }
+ ssup[nattrs].ssup_cxt = NULL;
+
+ /* Initial scan to find sortable values */
+ for (i = 0; i < samplerows; i++)
+ {
+ Datum value[3]; /* room for up to 3 columns, per the Assert above */
+ bool isnull = false;
+ bool toowide = false;
+
+ vacuum_delay_point();
+
+ for (j = 0 ; j < nattrs ; j++)
+ {
+
+ value[j] = fetchfunc(stats[stacolnums[j]], i, &isnull);
+
+ /* Check for null/nonnull */
+ if (isnull)
+ break;
+
+ if (is_varlena[j])
+ {
+ total_width += VARSIZE_ANY(DatumGetPointer(value[j]));
+ if (toast_raw_datum_size(value[j]) > WIDTH_THRESHOLD)
+ {
+ toowide = true;
+ break;
+ }
+ value[j] = PointerGetDatum(PG_DETOAST_DATUM(value[j]));
+ }
+ }
+ if (isnull)
+ {
+ null_cnt++;
+ continue;
+ }
+ else if (toowide)
+ {
+ toowide_cnt++;
+ continue;
+ }
+ nonnull_cnt++;
+
+ /* Add it to the list to be sorted */
+ for (j = 0 ; j < nattrs ; j++)
+ values[values_cnt][j].value = value[j];
+
+ values[values_cnt][0].tupno = values_cnt;
+ tupnoLink[values_cnt] = values_cnt;
+ values_cnt++;
+ }
+
+ /* We can only compute real stats if we found some sortable values. */
+ if (values_cnt > 0)
+ {
+ int ndistinct, /* # distinct values in sample */
+ nmultiple, /* # that appear multiple times */
+ dups_cnt;
+ CompareScalarsContext cxt;
+
+ /* Sort the collected values */
+ cxt.ssup = ssup;
+ cxt.tupnoLink = tupnoLink;
+ qsort_arg((void *) values, values_cnt, sizeof(ScalarItem*),
+ compare_mv_scalars, (void *) &cxt);
+
+ ndistinct = 0;
+ nmultiple = 0;
+ dups_cnt = 0;
+ for (i = 0; i < values_cnt; i++)
+ {
+ int tupno = values[i][0].tupno;
+
+ dups_cnt++;
+ if (tupnoLink[tupno] == tupno)
+ {
+ /* Reached end of duplicates of this value */
+ ndistinct++;
+ if (dups_cnt > 1)
+ nmultiple++;
+
+ dups_cnt = 0;
+ }
+ }
+
+ if (nmultiple == 0)
+ {
+ /* If we found no repeated values, assume it's a unique column */
+ fndistinct = totalrows;
+ }
+ else if (toowide_cnt == 0 && nmultiple == ndistinct)
+ {
+ /*
+ * Every value in the sample appeared more than once. Assume the
+ * column has just these values.
+ */
+ fndistinct = (float4)ndistinct;
+ }
+ else
+ {
+ /*----------
+ * Estimate the number of distinct values using the estimator
+ * proposed by Haas and Stokes in IBM Research Report RJ 10025:
+ * n*d / (n - f1 + f1*n/N)
+ * where f1 is the number of distinct values that occurred
+ * exactly once in our sample of n rows (from a total of N),
+ * and d is the total number of distinct values in the sample.
+ * This is their Duj1 estimator; the other estimators they
+ * recommend are considerably more complex, and are numerically
+ * very unstable when n is much smaller than N.
+ *
+ * Overwidth values are assumed to have been distinct.
+ *----------
+ */
+ int f1 = ndistinct - nmultiple + toowide_cnt;
+ int d = f1 + nmultiple;
+ double numer,
+ denom,
+ stadistinct;
+
+ numer = (double) samplerows *(double) d;
+
+ denom = (double) (samplerows - f1) +
+ (double) f1 *(double) samplerows / totalrows;
+
+ stadistinct = numer / denom;
+ /* Clamp to sane range in case of roundoff error */
+ if (stadistinct < (double) d)
+ stadistinct = (double) d;
+ if (stadistinct > totalrows)
+ stadistinct = totalrows;
+ fndistinct = floor(stadistinct + 0.5);
+ }
+ }
+ else if (nonnull_cnt > 0)
+ {
+ /* Assume all too-wide values are distinct, so it's a unique column */
+ fndistinct = totalrows;
+ }
+ else if (null_cnt > 0)
+ {
+ fndistinct = 0.0; /* "unknown" */
+ }
+
+ /* We don't need to bother cleaning up any of our temporary palloc's */
+ return fndistinct;
+}
+
+
+/*
* qsort_arg comparator for sorting ScalarItems
*
* Aside from sorting the items, we update the tupnoLink[] array
@@ -2876,6 +3173,43 @@ compare_scalars(const void *a, const void *b, void *arg)
return ta - tb;
}
+static int
+compare_mv_scalars(const void *a, const void *b, void *arg)
+{
+ CompareScalarsContext *cxt = (CompareScalarsContext *) arg;
+ ScalarItem *va = *(ScalarItem**)a;
+ ScalarItem *vb = *(ScalarItem**)b;
+ Datum da, db;
+ int ta, tb;
+ int compare;
+ int i;
+
+ for (i = 0 ; cxt->ssup[i].ssup_cxt ; i++)
+ {
+ da = va[i].value;
+ db = vb[i].value;
+
+ compare = ApplySortComparator(da, false, db, false, &cxt->ssup[i]);
+ if (compare != 0)
+ return compare;
+ }
+
+ /*
+ * The two datums are equal, so update cxt->tupnoLink[].
+ */
+ ta = va[0].tupno;
+ tb = vb[0].tupno;
+ if (cxt->tupnoLink[ta] < tb)
+ cxt->tupnoLink[ta] = tb;
+ if (cxt->tupnoLink[tb] < ta)
+ cxt->tupnoLink[tb] = ta;
+
+ /*
+ * For equal datums, sort by tupno
+ */
+ return ta - tb;
+}
+
/*
* qsort comparator for sorting ScalarMCVItems by position
*/
diff --git a/src/backend/optimizer/path/clausesel.c b/src/backend/optimizer/path/clausesel.c
index dcac1c1..43712ba 100644
--- a/src/backend/optimizer/path/clausesel.c
+++ b/src/backend/optimizer/path/clausesel.c
@@ -14,8 +14,14 @@
*/
#include "postgres.h"
+#include "access/genam.h"
+#include "access/heapam.h"
+#include "access/htup_details.h"
+#include "catalog/indexing.h"
#include "catalog/pg_operator.h"
+#include "catalog/pg_mvcoefficient.h"
#include "nodes/makefuncs.h"
+#include "nodes/nodeFuncs.h"
#include "optimizer/clauses.h"
#include "optimizer/cost.h"
#include "optimizer/pathnode.h"
@@ -43,6 +49,93 @@ static void addRangeClause(RangeQueryClause **rqlist, Node *clause,
bool varonleft, bool isLTsel, Selectivity s2);
+static bool
+collect_collist_walker(Node *node, Bitmapset **colsetlist)
+{
+ if (node == NULL)
+ return false;
+ if (IsA(node, Var))
+ {
+ Var *var = (Var*)node;
+
+ if (AttrNumberIsForUserDefinedAttr(var->varattno))
+ colsetlist[var->varno] =
+ bms_add_member(colsetlist[var->varno], var->varattno);
+ }
+ return expression_tree_walker(node, collect_collist_walker,
+ (void*)colsetlist);
+}
+
+/* Find multivariate distinctness coefficient for clauselist */
+static double
+find_mv_join_coeffeicient(PlannerInfo *root, List *clauses)
+{
+ int relid;
+ ListCell *l;
+ Bitmapset **colsetlist = NULL;
+ double mv_coef = 1.0;
+
+ /* Collect the columns this clauselist references */
+ colsetlist = (Bitmapset**)
+ palloc0(root->simple_rel_array_size * sizeof(Bitmapset*));
+
+ foreach(l, clauses)
+ {
+ RestrictInfo *rti = (RestrictInfo *) lfirst(l);
+
+ /* Consider only EC-derived clauses between the joinrels */
+ if (IsA(rti, RestrictInfo) &&
+ rti->left_ec && rti->left_ec == rti->right_ec)
+ collect_collist_walker((Node*)rti->clause, colsetlist);
+ }
+
+ /* Find pg_mvcoefficient entries matching this column list */
+ for (relid = 1 ; relid < root->simple_rel_array_size ; relid++)
+ {
+ Relation mvcrel;
+ SysScanDesc sscan;
+ ScanKeyData skeys[1];
+ HeapTuple tuple;
+
+ if (bms_is_empty(colsetlist[relid])) continue;
+
+ if (root->simple_rte_array[relid]->rtekind != RTE_RELATION) continue;
+
+ ScanKeyInit(&skeys[0],
+ Anum_pg_mvcoefficient_mvcreloid,
+ BTEqualStrategyNumber, F_OIDEQ,
+ ObjectIdGetDatum(root->simple_rte_array[relid]->relid));
+
+ mvcrel = heap_open(MvCoefficientRelationId, AccessShareLock);
+ sscan = systable_beginscan(mvcrel, MvCoefficientIndexId, true,
+ NULL, 1, skeys);
+ while (HeapTupleIsValid(tuple = systable_getnext(sscan)))
+ {
+ Bitmapset *mvccols = NULL;
+ Form_pg_mvcoefficient mvc =
+ (Form_pg_mvcoefficient) GETSTRUCT (tuple);
+
+ mvccols = bms_add_member(mvccols, mvc->mvcattr1);
+ mvccols = bms_add_member(mvccols, mvc->mvcattr2);
+ if (mvc->mvcattr3 > 0)
+ mvccols = bms_add_member(mvccols, mvc->mvcattr3);
+
+ if (!bms_is_subset(mvccols, colsetlist[relid]))
+ continue;
+
+ /* Prefer smaller one */
+ if (mvc->mvccoefficient > 0 && mvc->mvccoefficient < mv_coef)
+ mv_coef = mvc->mvccoefficient;
+ }
+ systable_endscan(sscan);
+ heap_close(mvcrel, AccessShareLock);
+ }
+
+ return mv_coef;
+}
+
/****************************************************************************
* ROUTINES TO COMPUTE SELECTIVITIES
****************************************************************************/
@@ -200,6 +293,9 @@ clauselist_selectivity(PlannerInfo *root,
s1 = s1 * s2;
}
+ /* Try multivariate distinctness correction for clauses */
+ s1 /= find_mv_join_coeffeicient(root, clauses);
+
/*
* Now scan the rangequery pair list.
*/
diff --git a/src/backend/utils/cache/syscache.c b/src/backend/utils/cache/syscache.c
index f58e1ce..f4c1001 100644
--- a/src/backend/utils/cache/syscache.c
+++ b/src/backend/utils/cache/syscache.c
@@ -43,6 +43,7 @@
#include "catalog/pg_foreign_server.h"
#include "catalog/pg_foreign_table.h"
#include "catalog/pg_language.h"
+#include "catalog/pg_mvcoefficient.h"
#include "catalog/pg_namespace.h"
#include "catalog/pg_opclass.h"
#include "catalog/pg_operator.h"
@@ -501,6 +502,17 @@ static const struct cachedesc cacheinfo[] = {
},
4
},
+ {MvCoefficientRelationId, /* MVCOEFFICIENT */
+ MvCoefficientIndexId,
+ 4,
+ {
+ Anum_pg_mvcoefficient_mvcreloid,
+ Anum_pg_mvcoefficient_mvcattr1,
+ Anum_pg_mvcoefficient_mvcattr2,
+ Anum_pg_mvcoefficient_mvcattr3
+ },
+ 4
+ },
{NamespaceRelationId, /* NAMESPACENAME */
NamespaceNameIndexId,
1,
diff --git a/src/include/catalog/indexing.h b/src/include/catalog/indexing.h
index 71e0010..0c76f93 100644
--- a/src/include/catalog/indexing.h
+++ b/src/include/catalog/indexing.h
@@ -173,6 +173,9 @@ DECLARE_UNIQUE_INDEX(pg_largeobject_loid_pn_index, 2683, on pg_largeobject using
DECLARE_UNIQUE_INDEX(pg_largeobject_metadata_oid_index, 2996, on pg_largeobject_metadata using btree(oid oid_ops));
#define LargeObjectMetadataOidIndexId 2996
+DECLARE_UNIQUE_INDEX(pg_mvcoefficient_index, 3578, on pg_mvcoefficient using btree(mvcreloid oid_ops, mvcattr1 int2_ops, mvcattr2 int2_ops, mvcattr3 int2_ops));
+#define MvCoefficientIndexId 3578
+
DECLARE_UNIQUE_INDEX(pg_namespace_nspname_index, 2684, on pg_namespace using btree(nspname name_ops));
#define NamespaceNameIndexId 2684
DECLARE_UNIQUE_INDEX(pg_namespace_oid_index, 2685, on pg_namespace using btree(oid oid_ops));
diff --git a/src/include/catalog/pg_mvcoefficient.h b/src/include/catalog/pg_mvcoefficient.h
new file mode 100644
index 0000000..56259fd
--- /dev/null
+++ b/src/include/catalog/pg_mvcoefficient.h
@@ -0,0 +1,68 @@
+/*-------------------------------------------------------------------------
+ *
+ * pg_mvcoefficient.h
+ * definition of the system multivariate coefficient relation
+ * (pg_mvcoefficient) along with the relation's initial contents.
+ *
+ * Copyright (c) 2015, PostgreSQL Global Development Group
+ *
+ * src/include/catalog/pg_mvcoefficient.h
+ *
+ * NOTES
+ * the genbki.pl script reads this file and generates .bki
+ * information from the DATA() statements.
+ *
+ * XXX do NOT break up DATA() statements into multiple lines!
+ * the scripts are not as smart as you might think...
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef PG_MVCOEFFICIENT_H
+#define PG_MVCOEFFICIENT_H
+
+#include "catalog/genbki.h"
+#include "nodes/pg_list.h"
+
+/* ----------------
+ * pg_mvcoefficient definition. cpp turns this into
+ * typedef struct FormData_pg_mvcoefficient
+ * ----------------
+ */
+#define MvCoefficientRelationId 3577
+
+CATALOG(pg_mvcoefficient,3577) BKI_WITHOUT_OIDS
+{
+ Oid mvcreloid; /* OID of target relation */
+ int16 mvcattr1; /* Column numbers */
+ int16 mvcattr2;
+ int16 mvcattr3;
+ float4 mvccoefficient; /* multivariate distinctness coefficient */
+} FormData_pg_mvcoefficient;
+
+/* ----------------
+ * Form_pg_mvcoefficient corresponds to a pointer to a tuple with the
+ * format of pg_mvcoefficient relation.
+ * ----------------
+ */
+typedef FormData_pg_mvcoefficient *Form_pg_mvcoefficient;
+
+/* ----------------
+ * compiler constants for pg_mvcoefficient
+ * ----------------
+ */
+#define Natts_pg_mvcoefficient 5
+#define Anum_pg_mvcoefficient_mvcreloid 1
+#define Anum_pg_mvcoefficient_mvcattr1 2
+#define Anum_pg_mvcoefficient_mvcattr2 3
+#define Anum_pg_mvcoefficient_mvcattr3 4
+#define Anum_pg_mvcoefficient_mvccoefficient 5
+
+/* ----------------
+ * pg_mvcoefficient has no initial contents
+ * ----------------
+ */
+
+/*
+ * prototypes for functions operating on pg_mvcoefficient (none yet)
+ */
+#endif /* PG_MVCOEFFICIENT_H */
diff --git a/src/include/utils/syscache.h b/src/include/utils/syscache.h
index 6634099..db8454c 100644
--- a/src/include/utils/syscache.h
+++ b/src/include/utils/syscache.h
@@ -66,6 +66,7 @@ enum SysCacheIdentifier
INDEXRELID,
LANGNAME,
LANGOID,
+ MVCOEFFICIENT,
NAMESPACENAME,
NAMESPACEOID,
OPERNAMENSP,
diff --git a/src/test/regress/expected/sanity_check.out b/src/test/regress/expected/sanity_check.out
index eb0bc88..7c77796 100644
--- a/src/test/regress/expected/sanity_check.out
+++ b/src/test/regress/expected/sanity_check.out
@@ -113,6 +113,7 @@ pg_inherits|t
pg_language|t
pg_largeobject|t
pg_largeobject_metadata|t
+pg_mvcoefficient|t
pg_namespace|t
pg_opclass|t
pg_operator|t
On 05/13/15 10:31, Kyotaro HORIGUCHI wrote:
Hello, this might be somewhat out of place but strongly related
to this patch, so I'll propose it here.

This is a proposal of a new feature for this patch, or a request
for your approval to pursue it as a separate (but closely related)
project.

===
Attached is v6 of the multivariate stats, with a number of
improvements:
...
2) fix of pg_proc issues (reported by Jeff)
3) rebase to current master

Unfortunately, the v6 patch suffers from system OID conflicts
with recently added ones. And what is more unfortunate for me is
that the code for functional dependencies looks unfinished :)
I'll fix the OID conflicts once the CF completes, which should be in a
few days I guess. Until then you can apply it on top of master from
about May 6 (that's when the v6 was created, and there should be no
conflicts).
Regarding the functional dependencies - you're right there's room for
improvement. For example it only works with dependencies between pairs
of columns, not multi-column dependencies. Is this what you mean by
incomplete?
I mention this because I recently hit an issue caused by strong
correlation between two columns in the dbt3 benchmark. Two columns
in one of the tables are strongly correlated but not functionally
dependent; there are too many values and their distribution is very
uniform, so an MCV list is of no use for the table (and a histogram
does not help with equality conditions). As a result, the planner
estimates the number of rows badly wrong, as expected, especially
for joins.
I think the other statistics types (esp. histograms) might be more
useful here, but I assume you haven't tried that because of the conflicts.
The current patch does not handle joins at all, though.
I then tried calculating the ratio between the product of the
distinctness of every column and the distinctness of the set of
the columns - call it the multivariate coefficient here - and found
it very useful for its small storage space, cheap calculation, and
simple code.
So when you have two columns A and B, you compute this:
ndistinct(A) * ndistinct(B)
---------------------------
ndistinct(A,B)
where ndistinct(...) means the number of distinct values in the column(s)?
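For a quick sanity check, that ratio can be computed by hand - e.g.
for your t1 example, using exact counts rather than the sampled
estimates ANALYZE would store (just a sketch):

    SELECT (count(DISTINCT a)::float8 * count(DISTINCT b))
           / (SELECT count(*) FROM (SELECT DISTINCT a, b FROM t1) s)
           AS mv_coefficient
    FROM t1;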
The attached first is a script to generate problematic tables.
And the second is a patch to make use of the mv coef on current
master. The patch is a very primitive POC so no syntactical
interfaces involved.

For the case of your first example,
=# create table t (a int, b int, c int);
=# insert into t (select a/10000, a/10000, a/10000
from generate_series(0, 999999) a);
=# analyze t;
=# explain analyze select * from t where a = 1 and b = 1 and c = 1;
Seq Scan on t (cost=0.00..22906.00 rows=1 width=12)
(actual time=3.878..250.628 rows=10000 loops=1)

Make use of mv coefficient.
=# insert into pg_mvcoefficient values ('t'::regclass, 1, 2, 3, 0);
=# analyze t;
=# explain analyze select * from t where a = 1 and b = 1 and c = 1;
Seq Scan on t (cost=0.00..22906.00 rows=9221 width=12)
(actual time=3.740..242.330 rows=10000 loops=1)

Row number estimation was largely improved.
With my patch:
alter table t add statistics (mcv) on (a,b,c);
analyze t;
select * from pg_mv_stats;
tablename | attnums | mcvbytes | mcvinfo
-----------+---------+----------+------------
t | 1 2 3 | 2964 | nitems=100
explain (analyze,timing off)
select * from t where a = 1 and b = 1 and c = 1;
QUERY PLAN
------------------------------------------------------------
Seq Scan on t (cost=0.00..22906.00 rows=9533 width=12)
(actual rows=10000 loops=1)
Filter: ((a = 1) AND (b = 1) AND (c = 1))
Rows Removed by Filter: 990000
Planning time: 0.233 ms
Execution time: 93.212 ms
(5 rows)
alter table t drop statistics all;
alter table t add statistics (histogram) on (a,b,c);
analyze t;
explain (analyze,timing off)
select * from t where a = 1 and b = 1 and c = 1;
QUERY PLAN
--------------------------------------------------------------------
Seq Scan on t (cost=0.00..22906.00 rows=9667 width=12)
(actual rows=10000 loops=1)
Filter: ((a = 1) AND (b = 1) AND (c = 1))
Rows Removed by Filter: 990000
Planning time: 0.594 ms
Execution time: 109.917 ms
(5 rows)
So both the MCV list and the histogram do quite a good job here, but
there are certainly cases where they do not work and the mv
coefficient works better.
Well, my example,
$ perl gentbl.pl 10000 | psql postgres
$ psql postgres
=# explain analyze select * from t1 where a = 1 and b = 2501;
Seq Scan on t1 (cost=0.00..6216.00 rows=1 width=8)
(actual time=0.030..66.005 rows=8 loops=1)

=# explain analyze select * from t1 join t2 on (t1.a = t2.a and t1.b = t2.b);
Hash Join (cost=1177.00..11393.76 rows=76 width=16)
(actual time=29.811..322.271 rows=320000 loops=1)

A very bad estimate for the join.
=# insert into pg_mvcoefficient values ('t1'::regclass, 1, 2, 0, 0);
=# analyze t1;
=# explain analyze select * from t1 where a = 1 and b = 2501;
Seq Scan on t1 (cost=0.00..6216.00 rows=8 width=8)
(actual time=0.032..104.144 rows=8 loops=1)
=# explain analyze select * from t1 join t2 on (t1.a = t2.a and t1.b = t2.b);
Hash Join (cost=1177.00..11393.76 rows=305652 width=16)
(actual time=40.642..325.679 rows=320000 loops=1)
It gives almost correct estimates.
The current patch does not handle joins, but it's one of the TODO items.
I think the results above show that the multivariate coefficient
significantly improves estimates when correlated columns are
involved.
Yes, it looks interesting. I'm wondering what the "failure cases" are,
i.e. when the coefficient approach does not work. It seems to me it
relies on an assumption of consistency for all the ndistinct values.
For example, let's assume you have two columns - A and B, each with
1000 distinct values, and that each value in A has 100 matching values
in B, so the coefficient is ~10:
1,000 * 1,000 / 100,000 = 10
Now, let's assume the distribution looks different - with the first
100 values in A matching all 1000 values of B, and the remaining 900
values just a single B value. Then
1,000 * 1,000 / (100,000 + 900) = ~9.9
So a very different distribution, but almost the same coefficient.
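Those two distributions are easy to reproduce in SQL, if anyone wants
to verify the arithmetic (a sketch with made-up table names):
-- uniform: 1000 values in a, each paired with 100 of the 1000 b values
create table uniform as
  select i/100 as a, i % 1000 as b from generate_series(0, 99999) i;
-- skewed: first 100 a values match all 1000 b values, the rest just one
create table skewed as
  select a, b from generate_series(0, 99) a, generate_series(0, 999) b
  union all
  select a, 0 as b from generate_series(100, 999) a;
-- both yield nearly the same coefficient (~10 vs ~9.9)
select count(distinct a)::numeric * count(distinct b)
       / count(distinct (a, b)) from uniform;
select count(distinct a)::numeric * count(distinct b)
       / count(distinct (a, b)) from skewed;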
Are there any other assumptions like this?
Also, does the coefficient work for equality conditions only?
Would you consider this in your patch? Otherwise, if you don't mind,
I'll move on with this as a project separate from yours. Except for
the user interface it won't conflict with yours, I suppose, but
eventually they would need some consolidation work.
I think it's a neat idea, and I think it might be added to the patch.
It would fit in quite nicely, actually - I already have other kinds of
stats planned for addition, but I'm not going to work on that in the
near future. It will require changes in some parts of the patch
(selecting the stats for a list of clauses), and I'd like to complete
the current patch first, and then add features in follow-up patches.
regards
Tomas
Hello,
At Thu, 14 May 2015 12:35:50 +0200, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote in <55547A86.8020400@2ndquadrant.com>
On 05/13/15 10:31, Kyotaro HORIGUCHI wrote:
Hello, this might be somewhat out of place but it is strongly related
to this patch, so I'll propose it here.
This is a proposal of a new feature for this patch, or a request for
your approval of my pursuing it as a different (but very
close) project.
===
Attached is v6 of the multivariate stats, with a number of
improvements:...
2) fix of pg_proc issues (reported by Jeff)
3) rebase to current master
Unfortunately, the v6 patch suffers some system OID conflicts
with recently added ones. And what is more unfortunate for me is
that the code for functional dependencies looks undone :)
I'll fix the OID conflicts once the CF completes, which should be in a
few days I guess. Until then you can apply it on top of master from
about May 6 (that's when the v6 was created, and there should be no
conflicts).
I applied it with some further fixing; it wasn't a problem :)
Regarding the functional dependencies - you're right there's room for
improvement. For example it only works with dependencies between pairs
of columns, not multi-column dependencies. Is this what you mean by
incomplete?
No, it overruns dependencies->deps, because build_mv_dependencies
stores many elements into dependencies->deps[n] although it
really has room for only one element. I suppose that you paused
writing it when you noticed that the number of required elements
is unknown before finishing the walk through all pairs of
values. palloc'ing numattrs^2 elements is reasonable enough as POC
code for now. Am I looking at the wrong version of the patch?
- dependencies = (MVDependencies)palloc0(sizeof(MVDependenciesData))
+ dependencies = (MVDependencies)palloc0(sizeof(MVDependenciesData) +
+ sizeof(MVDependency) * numattrs * numattrs);
I mention this because I recently had an issue caused by strong
correlation between two columns in the dbt3 benchmark. Two columns in
some table are strongly correlated but not functionally
dependent, there are too many distinct values and their distribution
is very uniform, so the MCV list is of no use for the table (and a
histogram does not help with equality conditions). As a result, the
planner gets the row estimates badly wrong, especially for joins.
I think the other statistics types (esp. histograms) might be more
useful here, but I assume you haven't tried that because of the
conflicts. The current patch does not handle joins at all, though.
Well, that's one of the reasons. But I understood that no
deterministic estimation can be applied to such a distribution
when I saw what caused the wrong estimate. eqsel and eqjoinsel
ultimately rely on the random-match assumption over a uniform
distribution when the value is not found in the MCV list. And the
functional dependencies stuff in your old patch (which works)
(rightfully) failed to find such a relationship between the
problematic columns. So I tried ndistinct, which is not contained
in your patch, to see how well it works.
I then tried calculating the ratio between the product of the
distinctness of every column and the distinctness of the set of
the columns (call it the multivariate coefficient here), and found
that it looks greatly useful given its small storage space, cheap
calculation, and simple code.
So when you have two columns A and B, you compute this:
ndistinct(A) * ndistinct(B)
---------------------------
ndistinct(A,B)
Yes, I used the reciprocal of that, though.
where ndistinct(...) means the number of distinct values in the column(s)?
Yes.
The first attachment is a script to generate problematic tables,
and the second is a patch to make use of the mv coefficient on
current master. The patch is a very primitive POC, so no syntactic
interface is involved.
...
Make use of the mv coefficient:
=# insert into pg_mvcoefficient values ('t'::regclass, 1, 2, 3, 0);
=# analyze t;
=# explain analyze select * from t where a = 1 and b = 1 and c = 1;
Seq Scan on t (cost=0.00..22906.00 rows=9221 width=12)
(actual time=3.740..242.330 rows=10000 loops=1)
The row number estimation was largely improved.
With my patch:
alter table t add statistics (mcv) on (a,b,c);
...
Seq Scan on t (cost=0.00..22906.00 rows=9533 width=12)
Yes, your MV-MCV list presumably holds one third of all possible (sets
of) values, so it works fine, I guess. But my original problem
occurred under the condition that the (single-column) MCVs contain
under 1% of the possible values. MCV lists would not work for such
cases, but the very uniform distribution helps the random-match
assumption to work.
$ perl gentbl.pl 200000 | psql postgres
<takes a while..>
postgres=# alter table t1 add statistics (mcv true) on (a, b);
postgres=# analyze t1;
postgres=# explain analyze select * from t1 where a = 1 and b = 2501;
Seq Scan on t1 (cost=0.00..124319.00 rows=1 width=8)
(actual time=0.051..1250.773 rows=8 loops=1)
The estimate "rows=1" is internally 2.4e-11, which is 3.33e+11 times
smaller than the real number. This will result in roughly the
same order of error for joins. This is because the MV-MCV list holds
too small a part of the domain, and the rest is then calculated using
the random-match assumption. This won't be fixed by increasing
statistics_target to any sane amount.
alter table t drop statistics all;
alter table t add statistics (histogram) on (a,b,c);
...
Seq Scan on t (cost=0.00..22906.00 rows=9667 width=12)
So both the MCV list and the histogram do quite a good job here,
I understand how you calculate selectivity for equality clauses
using the histogram. It calculates the result rows as 2.3e-11,
which is almost the same as with MV-MCV; this comes from the same
cause and thus yields the same result for joins.
but there are certainly cases where that does not work and the
mv coefficient works better.
The mv coefficient is effective where, as mentioned above, the
MV-MCV list or MV-histogram cannot hold a sufficient part of the
domain. The appropriate combination of MV-MCV and the mv coefficient
would be the same as var_eq_(non_)const/eqjoinsel_inner for a single
column, that is, applying the mv coefficient to the part of the
selectivity corresponding to values not in the MV-MCV list. I have
no idea how to combine it with the MV-histogram right now.
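To illustrate that idea with made-up numbers (this is just my reading
of it, not what either patch implements): if the MV-MCV list covers
60% of the rows and the queried combination is not in the list, the
selectivity of the remainder might be estimated as
(1 - 0.6) / (ndistinct(A,B) - nitems(MV-MCV))
i.e. the non-MCV fraction spread uniformly over the distinct
combinations not covered by the list, analogous to what var_eq_const
does for a single column.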
The current patch does not handle joins, but it's one of the TODO
items.
Yes, but the results on very large tables can be deduced from
the discussion above.
I think the results above show that the multivariate coefficient
significantly improves estimates when correlated columns are
involved.
Yes, it looks interesting. I'm wondering what the "failure cases" are,
i.e. when the coefficient approach does not work. It seems to me it
relies on an assumption of consistency for all the ndistinct values.
For example, let's assume you have two columns - A and B, each with
1000 distinct values, and that each value in A has 100 matching values
in B, so the coefficient is ~10:
1,000 * 1,000 / 100,000 = 10
Now, let's assume the distribution looks different - with the first
100 values in A matching all 1000 values of B, and the remaining 900
values just a single B value. Then
1,000 * 1,000 / (100,000 + 900) = ~9.9
So a very different distribution, but almost the same coefficient.
Are there any other assumptions like this?
I think no, for now. Just like the current var_eq_(non_)const and
eqjoinsel_inner do: since no clue about *the true* distribution is
available, we have no choice other than to stand on the random-match
(uniform distribution) assumption. And it gives not-so-bad estimates
for not-so-extreme distributions. It's of course not perfect, but
good enough.
Also, does the coefficient work for equality conditions only?
The mv coefficient is a parallel of ndistinct (it is a bit of a weird
expression, though). So I guess it is applicable in the current
estimation code wherever ndistinct is used; almost all of those
places look related to equality comparison.
Would you consider this in your patch? Otherwise, if you don't mind,
I'll move on with this as a project separate from yours. Except for
the user interface it won't conflict with yours, I suppose, but
eventually they would need some consolidation work.
I think it's a neat idea, and I think it might be added to the
patch. It would fit in quite nicely, actually - I already have
other kinds of stats planned for addition, but I'm not going to work
on that in the near future. It will require changes in some parts of
the patch (selecting the stats for a list of clauses), and I'd like
to complete the current patch first, and then add features in
follow-up patches.
I see. Let's work on this for now.
regards,
--
Kyotaro Horiguchi
NTT Open Source Software Center
Hello,
On 05/15/15 08:29, Kyotaro HORIGUCHI wrote:
Hello,
Regarding the functional dependencies - you're right there's room
for improvement. For example it only works with dependencies
between pairs of columns, not multi-column dependencies. Is this
what you mean by incomplete?
No, it overruns dependencies->deps, because build_mv_dependencies
stores many elements into dependencies->deps[n] although it
really has room for only one element. I suppose that you paused
writing it when you noticed that the number of required elements
is unknown before finishing the walk through all pairs of
values. palloc'ing numattrs^2 elements is reasonable enough as POC
code for now. Am I looking at the wrong version of the patch?
- dependencies = (MVDependencies)palloc0(sizeof(MVDependenciesData))
+ dependencies = (MVDependencies)palloc0(sizeof(MVDependenciesData) +
+ sizeof(MVDependency) * numattrs * numattrs);
Ah! That's clearly a bug. Thanks for noticing that, will fix in the next
version of the patch.
I mention this because I recently had an issue caused by strong
correlation between two columns in the dbt3 benchmark. Two columns
in some table are strongly correlated but not functionally
dependent, there are too many distinct values and their distribution
is very uniform, so the MCV list is of no use for the table (and a
histogram does not help with equality conditions). As a result, the
planner gets the row estimates badly wrong, especially for joins.
I think the other statistics types (esp. histograms) might be more
useful here, but I assume you haven't tried that because of the
conflicts. The current patch does not handle joins at all, though.
Well, that's one of the reasons. But I understood that no
deterministic estimation can be applied to such a distribution
when I saw what caused the wrong estimate. eqsel and eqjoinsel
ultimately rely on the random-match assumption over a uniform
distribution when the value is not found in the MCV list. And the
functional dependencies stuff in your old patch (which works)
(rightfully) failed to find such a relationship between the
problematic columns. So I tried ndistinct, which is not contained
in your patch, to see how well it works.
Yes, that's certainly true. I think you're right that the mv
coefficient might be quite useful in some cases.
With my patch:
alter table t add statistics (mcv) on (a,b,c);
...
Seq Scan on t (cost=0.00..22906.00 rows=9533 width=12)
Yes, your MV-MCV list presumably holds one third of all possible
(sets of) values, so it works fine, I guess. But my original problem
occurred under the condition that the (single-column) MCVs contain
under 1% of the possible values. MCV lists would not work for such
cases, but the very uniform distribution helps the random-match
assumption to work.
Actually, I think the MCV list should contain all the items, as it
decides the sample contains all the values from the data. The usual
1-D MCV list uses the same logic. But you're right that on a data set
with more MCV items and a mostly uniform distribution, this won't
work.
$ perl gentbl.pl 200000 | psql postgres
<takes a while..>
postgres=# alter table t1 add statistics (mcv true) on (a, b);
postgres=# analyze t1;
postgres=# explain analyze select * from t1 where a = 1 and b = 2501;
Seq Scan on t1 (cost=0.00..124319.00 rows=1 width=8)
(actual time=0.051..1250.773 rows=8 loops=1)
The estimate "rows=1" is internally 2.4e-11, which is 3.33e+11 times
smaller than the real number. This will result in roughly the
same order of error for joins. This is because the MV-MCV list holds
too small a part of the domain, and the rest is then calculated using
the random-match assumption. This won't be fixed by increasing
statistics_target to any sane amount.
Yes, the MCV lists don't work well with data sets like this.
alter table t drop statistics all;
alter table t add statistics (histogram) on (a,b,c);
...
Seq Scan on t (cost=0.00..22906.00 rows=9667 width=12)
So both the MCV list and the histogram do quite a good job here,
I understand how you calculate selectivity for equality clauses
using the histogram. It calculates the result rows as 2.3e-11,
which is almost the same as with MV-MCV; this comes from the same
cause and thus yields the same result for joins.
but there are certainly cases where that does not work and the
mv coefficient works better.
+1
The mv coefficient is effective where, as mentioned above, the
MV-MCV list or MV-histogram cannot hold a sufficient part of the
domain. The appropriate combination of MV-MCV and the mv coefficient
would be the same as var_eq_(non_)const/eqjoinsel_inner for a single
column, that is, applying the mv coefficient to the part of the
selectivity corresponding to values not in the MV-MCV list. I have
no idea how to combine it with the MV-histogram right now.
The current patch does not handle joins, but it's one of the TODO
items.
Yes, but the results on very large tables can be deduced from
the discussion above.
I think the results above show that the multivariate coefficient
significantly improves estimates when correlated columns are
involved.
Yes, it looks interesting. I'm wondering what the "failure cases" are,
i.e. when the coefficient approach does not work. It seems to me it
relies on an assumption of consistency for all the ndistinct values.
For example, let's assume you have two columns - A and B, each with
1000 distinct values, and that each value in A has 100 matching values
in B, so the coefficient is ~10:
1,000 * 1,000 / 100,000 = 10
Now, let's assume the distribution looks different - with the first
100 values in A matching all 1000 values of B, and the remaining 900
values just a single B value. Then
1,000 * 1,000 / (100,000 + 900) = ~9.9
So a very different distribution, but almost the same coefficient.
Are there any other assumptions like this?
I think no, for now. Just like the current var_eq_(non_)const and
eqjoinsel_inner do: since no clue about *the true* distribution is
available, we have no choice other than to stand on the random-match
(uniform distribution) assumption. And it gives not-so-bad estimates
for not-so-extreme distributions. It's of course not perfect, but
good enough.
Also, does the coefficient work for equality conditions only?
The mv coefficient is a parallel of ndistinct (it is a bit of a weird
expression, though). So I guess it is applicable in the current
estimation code wherever ndistinct is used; almost all of those
places look related to equality comparison.
ISTM the estimation of GROUP BY might benefit tremendously from these
statistics - that is, helping with cardinality estimation of
analytical queries, etc.
Also, we've only discussed 2-column coefficients. Would it be useful
to track those coefficients for larger groups of columns? For example
ndistinct(A,B,C)
--------------------------------------------
ndistinct(A) * ndistinct(B) * ndistinct(C)
which might work better for queries like
SELECT a,b,c FROM t GROUP BY a,b,c;
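For example, on the test table from the original post, a manual check
would look like this (hypothetical - the planner does not compute this
today):
select count(distinct a) * count(distinct b) * count(distinct c)
         as independence_estimate,
       count(distinct (a, b, c)) as actual_groups
  from t;
Here independence_estimate is 1,000,000 while actual_groups is 100, so
the coefficient above (100 / 1,000,000) is exactly the correction
factor the GROUP BY estimate would need.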
Would you consider this in your patch? Otherwise, if you don't mind,
I'll move on with this as a project separate from yours. Except for
the user interface it won't conflict with yours, I suppose, but
eventually they would need some consolidation work.
I think it's a neat idea, and I think it might be added to the
patch. It would fit in quite nicely, actually - I already have
other kinds of stats planned for addition, but I'm not going to work
on that in the near future. It will require changes in some parts of
the patch (selecting the stats for a list of clauses), and I'd like
to complete the current patch first, and then add features in
follow-up patches.
I see. Let's work on this for now.
Thanks!
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Hello,
On 05/15/15 08:29, Kyotaro HORIGUCHI wrote:
Hello,
At Thu, 14 May 2015 12:35:50 +0200, Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote in <55547A86.8020400@2ndquadrant.com>
...
Regarding the functional dependencies - you're right there's room for
improvement. For example it only works with dependencies between pairs
of columns, not multi-column dependencies. Is this what you mean by
incomplete?
No, it overruns dependencies->deps, because build_mv_dependencies
stores many elements into dependencies->deps[n] although it
really has room for only one element. I suppose that you paused
writing it when you noticed that the number of required elements
is unknown before finishing the walk through all pairs of
values. palloc'ing numattrs^2 elements is reasonable enough as POC
code for now. Am I looking at the wrong version of the patch?
- dependencies = (MVDependencies)palloc0(sizeof(MVDependenciesData))
+ dependencies = (MVDependencies)palloc0(sizeof(MVDependenciesData) +
+ sizeof(MVDependency) * numattrs * numattrs);
Actually, looking at this a bit more, I think the current behavior is
correct. I assume the line is from build_mv_dependencies(), but the
whole block looks like this:
if (dependencies == NULL)
{
    /* first dependency - the initial allocation already includes
     * room for a single deps[] element */
    dependencies = (MVDependencies) palloc0(sizeof(MVDependenciesData));
    dependencies->magic = MVSTAT_DEPS_MAGIC;
}
else
    /* additional dependency - grow deps[] by one element */
    dependencies = repalloc(dependencies,
                            offsetof(MVDependenciesData, deps) +
                            sizeof(MVDependency) * (dependencies->ndeps + 1));
which allocates space for a single element initially, and then extends
that when other dependencies are added.
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Hello,
attached is v7 of the multivariate stats patch. The main improvement is
major refactoring of the clausesel.c portion - splitting the awfully
long spaghetti-style functions into smaller pieces, making it much more
understandable etc.
I assume some of those pieces are unnecessary because there already
is a helper function with the same purpose (that I'm simply not aware
of). But IMHO this piece of code begins to look reasonable (especially
when compared to the previous state).
The other major improvement is a review of the comments (including
FIXMEs and TODOs), and removal of the obsolete / misplaced ones. And
there were plenty of those ...
These changes made this version ~20k smaller than v6.
The patch is also rebased to current master, which I assume shall be
quite stable - so hopefully no more duplicate OIDs for a while.
There are 6 files attached, but only 0002-0006 are actually part of the
multivariate statistics patch itself. The first part makes it possible
to use pull_varnos() with expression trees containing RestrictInfo
nodes, but maybe this is not the right way to fix this (there's another
thread where this was discussed).
Also, the regression tests testing plan choice with multivariate stats
(e.g. that a bitmap index scan is chosen instead of an index scan)
fail from time to time. I suppose this happens because the
invalidation after ANALYZE is not processed before executing the
query, so the optimizer does not see the stats, or something like that.
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachments:
0001-teach-expression-walker-about-RestrictInfo-v7.patch (text/x-patch)
From 886edce86cbe571283ebe49177288e9978b10c81 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@pgaddict.com>
Date: Tue, 28 Apr 2015 19:56:33 +0200
Subject: [PATCH 1/6] teach expression walker about RestrictInfo
otherwise pull_varnos fails when processing OR clauses
---
src/backend/nodes/nodeFuncs.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/src/backend/nodes/nodeFuncs.c b/src/backend/nodes/nodeFuncs.c
index a2bcca5..7dcc1c1 100644
--- a/src/backend/nodes/nodeFuncs.c
+++ b/src/backend/nodes/nodeFuncs.c
@@ -1995,6 +1995,8 @@ expression_tree_walker(Node *node,
return walker(((PlaceHolderInfo *) node)->ph_var, context);
case T_RangeTblFunction:
return walker(((RangeTblFunction *) node)->funcexpr, context);
+ case T_RestrictInfo:
+ return walker(((RestrictInfo *) node)->clause, context);
default:
elog(ERROR, "unrecognized node type: %d",
(int) nodeTag(node));
--
1.9.3
0002-shared-infrastructure-and-functional-dependencies-v7.patch (text/x-patch)
From 9e4e4141af44c03ccec77490c84f6c70e68e4449 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tv@fuzzy.cz>
Date: Sun, 11 Jan 2015 19:51:48 +0100
Subject: [PATCH 2/6] shared infrastructure and functional dependencies
Basic infrastructure shared by all kinds of multivariate
stats, most importantly:
- adds a new system catalog (pg_mv_statistic)
- ALTER TABLE ... ADD STATISTICS
- ALTER TABLE ... DROP STATISTICS
- implementation of functional dependencies (the simplest
type of multivariate statistics)
- building functional dependencies in ANALYZE
- updates regression tests (new catalog etc.)
This does not include any changes to the optimizer, i.e.
it does not influence the query planning (subject to
follow-up patches).
The current implementation requires a valid 'ltopr' for
the columns, so that we can sort the sample rows in various
ways, both in this patch and other kinds of statistics.
Maybe this restriction could be relaxed in the future,
requiring just 'eqopr' in case of stats not sorting the
data (e.g. functional dependencies and MCV lists).
Maybe some of the stats (functional dependencies and MCV
list with limited functionality) might be made to work
with hashes of the values, which is sufficient for equality
comparisons. But the queries would require the equality
operator anyway, so it's not really a weaker requirement.
The hashes might reduce space requirements, though.
The algorithm detecting the dependencies is rather simple
and probably needs improvements, so that it detects more
complicated dependencies, and also validation of the math.
The name 'functional dependencies' is more correct (than
'association rules') as it's exactly the name used in
relational theory (esp. Normal Forms) for tracking
column-level dependencies.
The multivariate statistics are automatically removed in
two situations
(a) after a DROP TABLE (obviously)
(b) after ALTER TABLE ... DROP COLUMN, if the statistics
would be defined on less than 2 columns (remaining)
If there are at least 2 columns remaining, we keep
the statistics but perform cleanup on the next ANALYZE.
The dropped columns are removed from stakeys, and the new
statistics is built on the smaller set.
We can't do this at DROP COLUMN, because that'd leave us
with invalid statistics, or we'd have to throw it away
although we can still use it. This lazy approach lets us
use the statistics although some of the columns are dead.
Dropping the statistics is done using DROP STATISTICS
ALTER TABLE ... DROP STATISTICS ALL;
ALTER TABLE ... DROP STATISTICS (opts) ON (cols);
The bad consequence of this is that 'statistics' becomes
a reserved keyword (was unreserved before), otherwise it
conflicts with DROP <columnname> in the grammar. Not sure
if there's a workaround to this.
This also adds a simple list of statistics to \d in psql.
---
src/backend/catalog/Makefile | 1 +
src/backend/catalog/heap.c | 102 +++++
src/backend/catalog/system_views.sql | 10 +
src/backend/commands/analyze.c | 21 +
src/backend/commands/tablecmds.c | 342 +++++++++++++++-
src/backend/nodes/copyfuncs.c | 13 +
src/backend/nodes/outfuncs.c | 18 +
src/backend/optimizer/util/plancat.c | 63 +++
src/backend/parser/gram.y | 83 +++-
src/backend/utils/Makefile | 2 +-
src/backend/utils/cache/relcache.c | 59 +++
src/backend/utils/cache/syscache.c | 12 +
src/backend/utils/mvstats/Makefile | 17 +
src/backend/utils/mvstats/common.c | 356 ++++++++++++++++
src/backend/utils/mvstats/common.h | 75 ++++
src/backend/utils/mvstats/dependencies.c | 638 +++++++++++++++++++++++++++++
src/bin/psql/describe.c | 40 ++
src/include/catalog/heap.h | 1 +
src/include/catalog/indexing.h | 5 +
src/include/catalog/pg_mv_statistic.h | 69 ++++
src/include/catalog/pg_proc.h | 5 +
src/include/catalog/toasting.h | 1 +
src/include/nodes/nodes.h | 2 +
src/include/nodes/parsenodes.h | 12 +-
src/include/nodes/relation.h | 28 ++
src/include/parser/kwlist.h | 2 +-
src/include/utils/mvstats.h | 69 ++++
src/include/utils/rel.h | 4 +
src/include/utils/relcache.h | 1 +
src/include/utils/syscache.h | 1 +
src/test/regress/expected/rules.out | 8 +
src/test/regress/expected/sanity_check.out | 1 +
32 files changed, 2053 insertions(+), 8 deletions(-)
create mode 100644 src/backend/utils/mvstats/Makefile
create mode 100644 src/backend/utils/mvstats/common.c
create mode 100644 src/backend/utils/mvstats/common.h
create mode 100644 src/backend/utils/mvstats/dependencies.c
create mode 100644 src/include/catalog/pg_mv_statistic.h
create mode 100644 src/include/utils/mvstats.h
diff --git a/src/backend/catalog/Makefile b/src/backend/catalog/Makefile
index 3d1139b..c6de23c 100644
--- a/src/backend/catalog/Makefile
+++ b/src/backend/catalog/Makefile
@@ -32,6 +32,7 @@ POSTGRES_BKI_SRCS = $(addprefix $(top_srcdir)/src/include/catalog/,\
pg_attrdef.h pg_constraint.h pg_inherits.h pg_index.h pg_operator.h \
pg_opfamily.h pg_opclass.h pg_am.h pg_amop.h pg_amproc.h \
pg_language.h pg_largeobject_metadata.h pg_largeobject.h pg_aggregate.h \
+ pg_mv_statistic.h \
pg_statistic.h pg_rewrite.h pg_trigger.h pg_event_trigger.h pg_description.h \
pg_cast.h pg_enum.h pg_namespace.h pg_conversion.h pg_depend.h \
pg_database.h pg_db_role_setting.h pg_tablespace.h pg_pltemplate.h \
diff --git a/src/backend/catalog/heap.c b/src/backend/catalog/heap.c
index d04e94d..1c28ca3 100644
--- a/src/backend/catalog/heap.c
+++ b/src/backend/catalog/heap.c
@@ -46,6 +46,7 @@
#include "catalog/pg_constraint.h"
#include "catalog/pg_foreign_table.h"
#include "catalog/pg_inherits.h"
+#include "catalog/pg_mv_statistic.h"
#include "catalog/pg_namespace.h"
#include "catalog/pg_statistic.h"
#include "catalog/pg_tablespace.h"
@@ -1611,7 +1612,10 @@ RemoveAttributeById(Oid relid, AttrNumber attnum)
heap_close(attr_rel, RowExclusiveLock);
if (attnum > 0)
+ {
RemoveStatistics(relid, attnum);
+ RemoveMVStatistics(relid, attnum);
+ }
relation_close(rel, NoLock);
}
@@ -1839,6 +1843,11 @@ heap_drop_with_catalog(Oid relid)
RemoveStatistics(relid, 0);
/*
+ * delete multi-variate statistics
+ */
+ RemoveMVStatistics(relid, 0);
+
+ /*
* delete attribute tuples
*/
DeleteAttributeTuples(relid);
@@ -2694,6 +2703,99 @@ RemoveStatistics(Oid relid, AttrNumber attnum)
/*
+ * RemoveMVStatistics --- remove entries in pg_mv_statistic for a rel
+ *
+ * If attnum is zero, remove all entries for rel; else remove only the one(s)
+ * for that column.
+ */
+void
+RemoveMVStatistics(Oid relid, AttrNumber attnum)
+{
+ Relation pgmvstatistic;
+ TupleDesc tupdesc = NULL;
+ SysScanDesc scan;
+ ScanKeyData key;
+ HeapTuple tuple;
+
+ /*
+ * When dropping a column, we'll drop statistics with a single
+ * remaining (undropped column). To do that, we need the tuple
+ * descriptor.
+ *
+ * We already have the relation locked (as we're running ALTER
+ * TABLE ... DROP COLUMN), so we'll just get the descriptor here.
+ */
+ if (attnum != 0)
+ {
+ Relation rel = relation_open(relid, NoLock);
+
+ /* multivariate stats are supported on tables and matviews */
+ if (rel->rd_rel->relkind == RELKIND_RELATION ||
+ rel->rd_rel->relkind == RELKIND_MATVIEW)
+ tupdesc = RelationGetDescr(rel);
+
+ relation_close(rel, NoLock);
+ }
+
+ if (tupdesc == NULL)
+ return;
+
+ pgmvstatistic = heap_open(MvStatisticRelationId, RowExclusiveLock);
+
+ ScanKeyInit(&key,
+ Anum_pg_mv_statistic_starelid,
+ BTEqualStrategyNumber, F_OIDEQ,
+ ObjectIdGetDatum(relid));
+
+ scan = systable_beginscan(pgmvstatistic,
+ MvStatisticRelidIndexId,
+ true, NULL, 1, &key);
+
+ /* we must loop even when attnum != 0, in case of inherited stats */
+ while (HeapTupleIsValid(tuple = systable_getnext(scan)))
+ {
+ bool delete = true;
+
+ if (attnum != 0)
+ {
+ Datum adatum;
+ bool isnull;
+ int i;
+ int ncolumns = 0;
+ ArrayType *arr;
+ int16 *attnums;
+
+ /* get the columns */
+ adatum = SysCacheGetAttr(MVSTATOID, tuple,
+ Anum_pg_mv_statistic_stakeys, &isnull);
+ Assert(!isnull);
+
+ arr = DatumGetArrayTypeP(adatum);
+ attnums = (int16*)ARR_DATA_PTR(arr);
+
+ for (i = 0; i < ARR_DIMS(arr)[0]; i++)
+ {
+ /* count the column unless it has been / is being dropped */
+ if ((! tupdesc->attrs[attnums[i]-1]->attisdropped) &&
+ (attnums[i] != attnum))
+ ncolumns += 1;
+ }
+
+ /* delete if there are less than two attributes */
+ delete = (ncolumns < 2);
+ }
+
+ if (delete)
+ simple_heap_delete(pgmvstatistic, &tuple->t_self);
+ }
+
+ systable_endscan(scan);
+
+ heap_close(pgmvstatistic, RowExclusiveLock);
+}
+
+
+/*
* RelationTruncateIndexes - truncate all indexes associated
* with the heap relation to zero tuples.
*
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 18921c4..0dedaba 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -150,6 +150,16 @@ CREATE VIEW pg_indexes AS
LEFT JOIN pg_tablespace T ON (T.oid = I.reltablespace)
WHERE C.relkind IN ('r', 'm') AND I.relkind = 'i';
+CREATE VIEW pg_mv_stats AS
+ SELECT
+ N.nspname AS schemaname,
+ C.relname AS tablename,
+ S.stakeys AS attnums,
+ length(S.stadeps) as depsbytes,
+ pg_mv_stats_dependencies_info(S.stadeps) as depsinfo
+ FROM (pg_mv_statistic S JOIN pg_class C ON (C.oid = S.starelid))
+ LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace);
+
CREATE VIEW pg_stats AS
SELECT
nspname AS schemaname,
diff --git a/src/backend/commands/analyze.c b/src/backend/commands/analyze.c
index 861048f..1f50036 100644
--- a/src/backend/commands/analyze.c
+++ b/src/backend/commands/analyze.c
@@ -27,6 +27,7 @@
#include "catalog/indexing.h"
#include "catalog/pg_collation.h"
#include "catalog/pg_inherits_fn.h"
+#include "catalog/pg_mv_statistic.h"
#include "catalog/pg_namespace.h"
#include "commands/dbcommands.h"
#include "commands/tablecmds.h"
@@ -55,7 +56,11 @@
#include "utils/syscache.h"
#include "utils/timestamp.h"
#include "utils/tqual.h"
+#include "utils/fmgroids.h"
+#include "utils/builtins.h"
+#include "utils/mvstats.h"
+#include "access/sysattr.h"
/* Per-index data for ANALYZE */
typedef struct AnlIndexData
@@ -460,6 +465,19 @@ do_analyze_rel(Relation onerel, int options, VacuumParams *params,
* all analyzable columns. We use a lower bound of 100 rows to avoid
* possible overflow in Vitter's algorithm. (Note: that will also be the
* target in the corner case where there are no analyzable columns.)
+ *
+ * FIXME This sample sizing is mostly OK when computing stats for
+ * individual columns, but when computing multivariate stats
+ * (histograms, mcv, ...) it's rather
+ * insufficient. For stats on multiple columns / complex stats
+ * we need larger sample sizes, because we need to build more
+ * detailed stats (more MCV items / histogram buckets) to get
+ * good accuracy. Maybe it'd be appropriate to use samples
+ * proportional to the table size (say, 0.5% - 1%) instead of
+ * a fixed size. Also, this should be
+ * bound to the requested statistics size - e.g. number of MCV
+ * items or histogram buckets should require several sample
+ * rows per item/bucket (so the sample should be k*size).
*/
targrows = 100;
for (i = 0; i < attr_cnt; i++)
@@ -562,6 +580,9 @@ do_analyze_rel(Relation onerel, int options, VacuumParams *params,
update_attstats(RelationGetRelid(Irel[ind]), false,
thisdata->attr_cnt, thisdata->vacattrstats);
}
+
+ /* Build multivariate stats (if there are any). */
+ build_mv_stats(onerel, numrows, rows, attr_cnt, vacattrstats);
}
/*
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 84dbee0..d6c6f8e 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -35,6 +35,7 @@
#include "catalog/pg_foreign_table.h"
#include "catalog/pg_inherits.h"
#include "catalog/pg_inherits_fn.h"
+#include "catalog/pg_mv_statistic.h"
#include "catalog/pg_namespace.h"
#include "catalog/pg_opclass.h"
#include "catalog/pg_tablespace.h"
@@ -92,7 +93,7 @@
#include "utils/syscache.h"
#include "utils/tqual.h"
#include "utils/typcache.h"
-
+#include "utils/mvstats.h"
/*
* ON COMMIT action list
@@ -140,8 +141,9 @@ static List *on_commits = NIL;
#define AT_PASS_ADD_COL 5 /* ADD COLUMN */
#define AT_PASS_ADD_INDEX 6 /* ADD indexes */
#define AT_PASS_ADD_CONSTR 7 /* ADD constraints, defaults */
-#define AT_PASS_MISC 8 /* other stuff */
-#define AT_NUM_PASSES 9
+#define AT_PASS_ADD_STATS 8 /* ADD statistics */
+#define AT_PASS_MISC 9 /* other stuff */
+#define AT_NUM_PASSES 10
typedef struct AlteredTableInfo
{
@@ -416,6 +418,10 @@ static void ATExecReplicaIdentity(Relation rel, ReplicaIdentityStmt *stmt, LOCKM
static void ATExecGenericOptions(Relation rel, List *options);
static void ATExecEnableRowSecurity(Relation rel);
static void ATExecDisableRowSecurity(Relation rel);
+static void ATExecAddStatistics(AlteredTableInfo *tab, Relation rel,
+ StatisticsDef *def, LOCKMODE lockmode);
+static void ATExecDropStatistics(AlteredTableInfo *tab, Relation rel,
+ StatisticsDef *def, LOCKMODE lockmode);
static void copy_relation_data(SMgrRelation rel, SMgrRelation dst,
ForkNumber forkNum, char relpersistence);
@@ -3013,6 +3019,8 @@ AlterTableGetLockLevel(List *cmds)
* updates.
*/
case AT_SetStatistics: /* Uses MVCC in getTableAttrs() */
+ case AT_AddStatistics: /* XXX not sure if the right level */
+ case AT_DropStatistics: /* XXX not sure if the right level */
case AT_ClusterOn: /* Uses MVCC in getIndexes() */
case AT_DropCluster: /* Uses MVCC in getIndexes() */
case AT_SetOptions: /* Uses MVCC in getTableAttrs() */
@@ -3169,6 +3177,8 @@ ATPrepCmd(List **wqueue, Relation rel, AlterTableCmd *cmd,
pass = AT_PASS_ADD_CONSTR;
break;
case AT_SetStatistics: /* ALTER COLUMN SET STATISTICS */
+ case AT_AddStatistics: /* XXX maybe not the right place */
+ case AT_DropStatistics: /* XXX maybe not the right place */
ATSimpleRecursion(wqueue, rel, cmd, recurse, lockmode);
/* Performs own permission checks */
ATPrepSetStatistics(rel, cmd->name, cmd->def, lockmode);
@@ -3471,6 +3481,12 @@ ATExecCmd(List **wqueue, AlteredTableInfo *tab, Relation rel,
case AT_SetStatistics: /* ALTER COLUMN SET STATISTICS */
address = ATExecSetStatistics(rel, cmd->name, cmd->def, lockmode);
break;
+ case AT_AddStatistics: /* ADD STATISTICS */
+ ATExecAddStatistics(tab, rel, (StatisticsDef *) cmd->def, lockmode);
+ break;
+ case AT_DropStatistics: /* DROP STATISTICS */
+ ATExecDropStatistics(tab, rel, (StatisticsDef *) cmd->def, lockmode);
+ break;
case AT_SetOptions: /* ALTER COLUMN SET ( options ) */
address = ATExecSetOptions(rel, cmd->name, cmd->def, false, lockmode);
break;
@@ -11868,3 +11884,323 @@ RangeVarCallbackForAlterRelation(const RangeVar *rv, Oid relid, Oid oldrelid,
ReleaseSysCache(tuple);
}
+
+/* used for sorting the attnums in ATExecAddStatistics */
+static int compare_int16(const void *a, const void *b)
+{
+ return memcmp(a, b, sizeof(int16));
+}
+
+/*
+ * Implements the ALTER TABLE ... ADD STATISTICS (options) ON (columns).
+ *
+ * TODO Check that the types support sort, although maybe we can live
+ * without it (and only build MCV list / association rules).
+ *
+ * TODO This should probably check for duplicate stats (i.e. same
+ * keys, same options). Although maybe it's useful to have
+ * multiple stats on the same columns with different options
+ * (say, a detailed MCV-only stats for some queries, histogram
+ * for others, etc.)
+ */
+static void ATExecAddStatistics(AlteredTableInfo *tab, Relation rel,
+ StatisticsDef *def, LOCKMODE lockmode)
+{
+ int i, j;
+ ListCell *l;
+ int16 attnums[INDEX_MAX_KEYS];
+ int numcols = 0;
+
+ HeapTuple htup;
+ Datum values[Natts_pg_mv_statistic];
+ bool nulls[Natts_pg_mv_statistic];
+ int2vector *stakeys;
+ Relation mvstatrel;
+
+ /* by default build nothing */
+ bool build_dependencies = false;
+
+ Assert(IsA(def, StatisticsDef));
+
+ /* transform the column names to attnum values */
+
+ foreach(l, def->keys)
+ {
+ char *attname = strVal(lfirst(l));
+ HeapTuple atttuple;
+
+ atttuple = SearchSysCacheAttName(RelationGetRelid(rel), attname);
+
+ if (!HeapTupleIsValid(atttuple))
+ ereport(ERROR,
+ (errcode(ERRCODE_UNDEFINED_COLUMN),
+ errmsg("column \"%s\" referenced in statistics does not exist",
+ attname)));
+
+ /* more than MVHIST_MAX_DIMENSIONS columns not allowed */
+ if (numcols >= MVSTATS_MAX_DIMENSIONS)
+ ereport(ERROR,
+ (errcode(ERRCODE_TOO_MANY_COLUMNS),
+ errmsg("cannot have more than %d keys in a statistics",
+ MVSTATS_MAX_DIMENSIONS)));
+
+ attnums[numcols] = ((Form_pg_attribute) GETSTRUCT(atttuple))->attnum;
+ ReleaseSysCache(atttuple);
+ numcols++;
+ }
+
+ /*
+ * Check the lower bound (at least 2 columns), the upper bound was
+ * already checked in the loop.
+ */
+ if (numcols < 2)
+ ereport(ERROR,
+ (errcode(ERRCODE_TOO_MANY_COLUMNS),
+ errmsg("multivariate stats require 2 or more columns")));
+
+ /* look for duplicities */
+ for (i = 0; i < numcols; i++)
+ for (j = 0; j < numcols; j++)
+ if ((i != j) && (attnums[i] == attnums[j]))
+ ereport(ERROR,
+ (errcode(ERRCODE_UNDEFINED_COLUMN),
+ errmsg("duplicate column name in statistics definition")));
+
+ /* parse the statistics options */
+ foreach (l, def->options)
+ {
+ DefElem *opt = (DefElem*)lfirst(l);
+
+ if (strcmp(opt->defname, "dependencies") == 0)
+ build_dependencies = defGetBoolean(opt);
+ else
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("unrecognized STATISTICS option \"%s\"",
+ opt->defname)));
+ }
+
+ /* sort the attnums and build int2vector */
+ qsort(attnums, numcols, sizeof(int16), compare_int16);
+ stakeys = buildint2vector(attnums, numcols);
+
+ /*
+ * Okay, let's create the pg_mv_statistic entry.
+ */
+ memset(values, 0, sizeof(values));
+ memset(nulls, false, sizeof(nulls));
+
+ /* no stats collected yet, so just the keys */
+ values[Anum_pg_mv_statistic_starelid-1] = ObjectIdGetDatum(RelationGetRelid(rel));
+
+ values[Anum_pg_mv_statistic_stakeys -1] = PointerGetDatum(stakeys);
+ values[Anum_pg_mv_statistic_deps_enabled -1] = BoolGetDatum(build_dependencies);
+
+ nulls[Anum_pg_mv_statistic_stadeps -1] = true;
+
+ /* insert the tuple into pg_mv_statistic */
+ mvstatrel = heap_open(MvStatisticRelationId, RowExclusiveLock);
+
+ htup = heap_form_tuple(mvstatrel->rd_att, values, nulls);
+
+ simple_heap_insert(mvstatrel, htup);
+
+ CatalogUpdateIndexes(mvstatrel, htup);
+
+ heap_freetuple(htup);
+
+ heap_close(mvstatrel, RowExclusiveLock);
+
+ /*
+ * Invalidate relcache so that others see the new statistics.
+ */
+ CacheInvalidateRelcache(rel);
+
+ return;
+}
+
+/*
+ * Implements the ALTER TABLE ... DROP STATISTICS in two forms:
+ *
+ * ALTER TABLE ... DROP STATISTICS (options) ON (columns)
+ * ALTER TABLE ... DROP STATISTICS ALL;
+ *
+ * The first one requires an exact match, the second one just drops
+ * all the statistics on a table.
+ */
+static void ATExecDropStatistics(AlteredTableInfo *tab, Relation rel,
+ StatisticsDef *def, LOCKMODE lockmode)
+{
+ Relation statrel;
+ SysScanDesc scan;
+ ScanKeyData key;
+ HeapTuple tuple;
+
+ ListCell *l;
+
+ int16 attnums[INDEX_MAX_KEYS];
+ int numcols = 0;
+
+ /* checking whether the statistics matches / should be dropped */
+ bool build_dependencies = false;
+ bool check_dependencies = false;
+
+ if (def != NULL)
+ {
+ Assert(IsA(def, StatisticsDef));
+
+ /* collect attribute numbers */
+ foreach(l, def->keys)
+ {
+ char *attname = strVal(lfirst(l));
+ HeapTuple atttuple;
+
+ atttuple = SearchSysCacheAttName(RelationGetRelid(rel), attname);
+
+ if (!HeapTupleIsValid(atttuple))
+ ereport(ERROR,
+ (errcode(ERRCODE_UNDEFINED_COLUMN),
+ errmsg("column \"%s\" referenced in statistics does not exist",
+ attname)));
+
+ /* more than MVHIST_MAX_DIMENSIONS columns not allowed */
+ if (numcols >= MVSTATS_MAX_DIMENSIONS)
+ ereport(ERROR,
+ (errcode(ERRCODE_TOO_MANY_COLUMNS),
+ errmsg("cannot have more than %d keys in a statistics",
+ MVSTATS_MAX_DIMENSIONS)));
+
+ attnums[numcols] = ((Form_pg_attribute) GETSTRUCT(atttuple))->attnum;
+ ReleaseSysCache(atttuple);
+ numcols++;
+ }
+
+ /* parse the statistics options */
+ foreach (l, def->options)
+ {
+ DefElem *opt = (DefElem*)lfirst(l);
+
+ if (strcmp(opt->defname, "dependencies") == 0)
+ {
+ check_dependencies = true;
+ build_dependencies = defGetBoolean(opt);
+ }
+ else
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("unrecognized STATISTICS option \"%s\"",
+ opt->defname)));
+ }
+
+ }
+
+ statrel = heap_open(MvStatisticRelationId, RowExclusiveLock);
+
+ ScanKeyInit(&key,
+ Anum_pg_mv_statistic_starelid,
+ BTEqualStrategyNumber, F_OIDEQ,
+ ObjectIdGetDatum(RelationGetRelid(rel)));
+
+ scan = systable_beginscan(statrel,
+ MvStatisticRelidIndexId,
+ true, NULL, 1, &key);
+
+ /* we must loop even when attnum != 0, in case of inherited stats */
+ while (HeapTupleIsValid(tuple = systable_getnext(scan)))
+ {
+ /* by default we delete everything */
+ bool delete = true;
+
+ /* check that the options match (dependencies, mcv, histogram) */
+ if (delete && check_dependencies)
+ {
+ bool isnull;
+ Datum adatum = heap_getattr(tuple,
+ Anum_pg_mv_statistic_deps_enabled,
+ RelationGetDescr(statrel),
+ &isnull);
+
+ delete = (! isnull) &&
+ (DatumGetBool(adatum) == build_dependencies);
+ }
+
+ /* check that the columns match the statistics definition */
+ if (delete && (numcols > 0))
+ {
+ int i, j;
+ ArrayType *arr;
+ bool isnull;
+
+ int16 *stakeys;
+ int nstakeys;
+
+ Datum adatum = SysCacheGetAttr(MVSTATOID, tuple,
+ Anum_pg_mv_statistic_stakeys, &isnull);
+ Assert(!isnull);
+
+ arr = DatumGetArrayTypeP(adatum);
+
+ nstakeys = ARR_DIMS(arr)[0];
+ stakeys = (int16 *) ARR_DATA_PTR(arr);
+
+ /* assume match */
+ delete = true;
+
+ /* check that for each column we find a match in stakeys */
+ for (i = 0; i < numcols; i++)
+ {
+ bool found = false;
+ for (j = 0; j < nstakeys; j++)
+ {
+ if (attnums[i] == stakeys[j])
+ {
+ found = true;
+ break;
+ }
+ }
+
+ if (! found)
+ {
+ delete = false;
+ break;
+ }
+ }
+
+ /* check that for each stakeys we find a match in columns */
+ for (j = 0; j < nstakeys; j++)
+ {
+ bool found = false;
+
+ for (i = 0; i < numcols; i++)
+ {
+ if (attnums[i] == stakeys[j])
+ {
+ found = true;
+ break;
+ }
+ }
+
+ if (! found)
+ {
+ delete = false;
+ break;
+ }
+ }
+ }
+
+ /* don't delete, if we've found mismatches */
+ if (delete)
+ simple_heap_delete(statrel, &tuple->t_self);
+ }
+
+ systable_endscan(scan);
+
+ heap_close(statrel, RowExclusiveLock);
+
+ /*
+ * Invalidate relcache so that others forget the dropped statistics.
+ */
+ CacheInvalidateRelcache(rel);
+
+ return;
+}
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 4c363d3..e5a3d96 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -4095,6 +4095,17 @@ _copyAlterPolicyStmt(const AlterPolicyStmt *from)
return newnode;
}
+static StatisticsDef *
+_copyStatisticsDef(const StatisticsDef *from)
+{
+ StatisticsDef *newnode = makeNode(StatisticsDef);
+
+ COPY_NODE_FIELD(keys);
+ COPY_NODE_FIELD(options);
+
+ return newnode;
+}
+
/* ****************************************************************
* pg_list.h copy functions
* ****************************************************************
@@ -4938,6 +4949,9 @@ copyObject(const void *from)
break;
case T_TableSampleClause:
retval = _copyTableSampleClause(from);
+ break;
+ case T_StatisticsDef:
+ retval = _copyStatisticsDef(from);
break;
case T_FuncWithArgs:
retval = _copyFuncWithArgs(from);
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index 4775acf..93a6f04 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -1898,6 +1898,21 @@ _outIndexOptInfo(StringInfo str, const IndexOptInfo *node)
}
static void
+_outMVStatisticInfo(StringInfo str, const MVStatisticInfo *node)
+{
+ WRITE_NODE_TYPE("MVSTATISTICINFO");
+
+ /* NB: this isn't a complete set of fields */
+ WRITE_OID_FIELD(mvoid);
+
+ /* enabled statistics */
+ WRITE_BOOL_FIELD(deps_enabled);
+
+ /* built/available statistics */
+ WRITE_BOOL_FIELD(deps_built);
+}
+
+static void
_outEquivalenceClass(StringInfo str, const EquivalenceClass *node)
{
/*
@@ -3331,6 +3346,9 @@ _outNode(StringInfo str, const void *obj)
case T_PlannerParamItem:
_outPlannerParamItem(str, obj);
break;
+ case T_MVStatisticInfo:
+ _outMVStatisticInfo(str, obj);
+ break;
case T_CreateStmt:
_outCreateStmt(str, obj);
diff --git a/src/backend/optimizer/util/plancat.c b/src/backend/optimizer/util/plancat.c
index b04dc2e..c397773 100644
--- a/src/backend/optimizer/util/plancat.c
+++ b/src/backend/optimizer/util/plancat.c
@@ -27,6 +27,7 @@
#include "catalog/catalog.h"
#include "catalog/dependency.h"
#include "catalog/heap.h"
+#include "catalog/pg_mv_statistic.h"
#include "foreign/fdwapi.h"
#include "miscadmin.h"
#include "nodes/makefuncs.h"
@@ -39,7 +40,9 @@
#include "parser/parsetree.h"
#include "rewrite/rewriteManip.h"
#include "storage/bufmgr.h"
+#include "utils/builtins.h"
#include "utils/lsyscache.h"
+#include "utils/syscache.h"
#include "utils/rel.h"
#include "utils/snapmgr.h"
@@ -92,6 +95,7 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
Relation relation;
bool hasindex;
List *indexinfos = NIL;
+ List *stainfos = NIL;
/*
* We need not lock the relation since it was already locked, either by
@@ -380,6 +384,65 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
rel->indexlist = indexinfos;
+ if (true)
+ {
+ List *mvstatoidlist;
+ ListCell *l;
+
+ mvstatoidlist = RelationGetMVStatList(relation);
+
+ foreach(l, mvstatoidlist)
+ {
+ ArrayType *arr;
+ Datum adatum;
+ bool isnull;
+ Oid mvoid = lfirst_oid(l);
+ Form_pg_mv_statistic mvstat;
+ MVStatisticInfo *info;
+
+ HeapTuple htup = SearchSysCache1(MVSTATOID, ObjectIdGetDatum(mvoid));
+
+ /* XXX syscache contains OIDs of deleted stats (not invalidated) */
+ if (! HeapTupleIsValid(htup))
+ continue;
+
+ mvstat = (Form_pg_mv_statistic) GETSTRUCT(htup);
+
+ /* unavailable stats are not interesting for the planner */
+ if (mvstat->deps_built)
+ {
+ info = makeNode(MVStatisticInfo);
+
+ info->mvoid = mvoid;
+ info->rel = rel;
+
+ /* enabled statistics */
+ info->deps_enabled = mvstat->deps_enabled;
+
+ /* built/available statistics */
+ info->deps_built = mvstat->deps_built;
+
+ /* stakeys */
+ adatum = SysCacheGetAttr(MVSTATOID, htup,
+ Anum_pg_mv_statistic_stakeys, &isnull);
+ Assert(!isnull);
+
+ arr = DatumGetArrayTypeP(adatum);
+
+ info->stakeys = buildint2vector((int16 *) ARR_DATA_PTR(arr),
+ ARR_DIMS(arr)[0]);
+
+ stainfos = lcons(info, stainfos);
+ }
+
+ ReleaseSysCache(htup);
+ }
+
+ list_free(mvstatoidlist);
+ }
+
+ rel->mvstatlist = stainfos;
+
/* Grab foreign-table info using the relcache, while we have it */
if (relation->rd_rel->relkind == RELKIND_FOREIGN_TABLE)
{
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index e0ff6f1..d81bab6 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -375,6 +375,12 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
%type <node> group_by_item empty_grouping_set rollup_clause cube_clause
%type <node> grouping_sets_clause
+%type <list> OptStatsOptions
+%type <str> stats_options_name
+%type <node> stats_options_arg
+%type <defelt> stats_options_elem
+%type <list> stats_options_list
+
%type <list> opt_fdw_options fdw_options
%type <defelt> fdw_option
@@ -501,7 +507,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
%type <keyword> unreserved_keyword type_func_name_keyword
%type <keyword> col_name_keyword reserved_keyword
-%type <node> TableConstraint TableLikeClause
+%type <node> TableConstraint TableLikeClause TableStatistics
%type <ival> TableLikeOptionList TableLikeOption
%type <list> ColQualList
%type <node> ColConstraint ColConstraintElem ConstraintAttr
@@ -2333,6 +2339,29 @@ alter_table_cmd:
n->subtype = AT_DisableRowSecurity;
$$ = (Node *)n;
}
+ /* ALTER TABLE <name> ADD STATISTICS (options) ON (columns) */
+ | ADD_P TableStatistics
+ {
+ AlterTableCmd *n = makeNode(AlterTableCmd);
+ n->subtype = AT_AddStatistics;
+ n->def = $2;
+ $$ = (Node *)n;
+ }
+ /* ALTER TABLE <name> DROP STATISTICS (options) ON (columns) */
+ | DROP TableStatistics
+ {
+ AlterTableCmd *n = makeNode(AlterTableCmd);
+ n->subtype = AT_DropStatistics;
+ n->def = $2;
+ $$ = (Node *)n;
+ }
+ /* ALTER TABLE <name> DROP STATISTICS ALL */
+ | DROP STATISTICS ALL
+ {
+ AlterTableCmd *n = makeNode(AlterTableCmd);
+ n->subtype = AT_DropStatistics;
+ $$ = (Node *)n;
+ }
| alter_generic_options
{
AlterTableCmd *n = makeNode(AlterTableCmd);
@@ -3407,6 +3436,56 @@ OptConsTableSpace: USING INDEX TABLESPACE name { $$ = $4; }
ExistingIndex: USING INDEX index_name { $$ = $3; }
;
+/*****************************************************************************
+ *
+ * QUERY :
+ * ALTER TABLE relname ADD STATISTICS (columns) WITH (options)
+ *
+ *****************************************************************************/
+
+TableStatistics:
+ STATISTICS OptStatsOptions ON '(' columnList ')'
+ {
+ StatisticsDef *n = makeNode(StatisticsDef);
+ n->keys = $5;
+ n->options = $2;
+ $$ = (Node *) n;
+ }
+ ;
+
+OptStatsOptions:
+ '(' stats_options_list ')' { $$ = $2; }
+ | /*EMPTY*/ { $$ = NIL; }
+ ;
+
+stats_options_list:
+ stats_options_elem
+ {
+ $$ = list_make1($1);
+ }
+ | stats_options_list ',' stats_options_elem
+ {
+ $$ = lappend($1, $3);
+ }
+ ;
+
+stats_options_elem:
+ stats_options_name stats_options_arg
+ {
+ $$ = makeDefElem($1, $2);
+ }
+ ;
+
+stats_options_name:
+ NonReservedWord { $$ = $1; }
+ ;
+
+stats_options_arg:
+ opt_boolean_or_string { $$ = (Node *) makeString($1); }
+ | NumericOnly { $$ = (Node *) $1; }
+ | /* EMPTY */ { $$ = NULL; }
+ ;
+
/*****************************************************************************
*
@@ -13796,7 +13875,6 @@ unreserved_keyword:
| STANDALONE_P
| START
| STATEMENT
- | STATISTICS
| STDIN
| STDOUT
| STORAGE
@@ -14013,6 +14091,7 @@ reserved_keyword:
| SELECT
| SESSION_USER
| SOME
+ | STATISTICS
| SYMMETRIC
| TABLE
| THEN
diff --git a/src/backend/utils/Makefile b/src/backend/utils/Makefile
index 8374533..eba0352 100644
--- a/src/backend/utils/Makefile
+++ b/src/backend/utils/Makefile
@@ -9,7 +9,7 @@ top_builddir = ../../..
include $(top_builddir)/src/Makefile.global
OBJS = fmgrtab.o
-SUBDIRS = adt cache error fmgr hash init mb misc mmgr resowner sort time
+SUBDIRS = adt cache error fmgr hash init mb misc mmgr mvstats resowner sort time
# location of Catalog.pm
catalogdir = $(top_srcdir)/src/backend/catalog
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index f60f3cb..8e17872 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -47,6 +47,7 @@
#include "catalog/pg_auth_members.h"
#include "catalog/pg_constraint.h"
#include "catalog/pg_database.h"
+#include "catalog/pg_mv_statistic.h"
#include "catalog/pg_namespace.h"
#include "catalog/pg_opclass.h"
#include "catalog/pg_proc.h"
@@ -3906,6 +3907,62 @@ RelationGetIndexList(Relation relation)
return result;
}
+
+List *
+RelationGetMVStatList(Relation relation)
+{
+ Relation indrel;
+ SysScanDesc indscan;
+ ScanKeyData skey;
+ HeapTuple htup;
+ List *result;
+ List *oldlist;
+ MemoryContext oldcxt;
+
+ /* Quick exit if we already computed the list. */
+ if (relation->rd_mvstatvalid != 0)
+ return list_copy(relation->rd_mvstatlist);
+
+ /*
+ * We build the list we intend to return (in the caller's context) while
+ * doing the scan. After successfully completing the scan, we copy that
+ * list into the relcache entry. This avoids cache-context memory leakage
+ * if we get some sort of error partway through.
+ */
+ result = NIL;
+
+ /* Prepare to scan pg_mv_statistic for entries having starelid = this rel. */
+ ScanKeyInit(&skey,
+ Anum_pg_mv_statistic_starelid,
+ BTEqualStrategyNumber, F_OIDEQ,
+ ObjectIdGetDatum(RelationGetRelid(relation)));
+
+ indrel = heap_open(MvStatisticRelationId, AccessShareLock);
+ indscan = systable_beginscan(indrel, MvStatisticRelidIndexId, true,
+ NULL, 1, &skey);
+
+ while (HeapTupleIsValid(htup = systable_getnext(indscan)))
+ /* TODO maybe include only already built statistics? */
+ result = insert_ordered_oid(result, HeapTupleGetOid(htup));
+
+ systable_endscan(indscan);
+
+ heap_close(indrel, AccessShareLock);
+
+ /* Now save a copy of the completed list in the relcache entry. */
+ oldcxt = MemoryContextSwitchTo(CacheMemoryContext);
+ oldlist = relation->rd_mvstatlist;
+ relation->rd_mvstatlist = list_copy(result);
+
+ relation->rd_mvstatvalid = true;
+ MemoryContextSwitchTo(oldcxt);
+
+ /* Don't leak the old list, if there is one */
+ list_free(oldlist);
+
+ return result;
+}
+
/*
* insert_ordered_oid
* Insert a new Oid into a sorted list of Oids, preserving ordering
@@ -4875,6 +4932,8 @@ load_relcache_init_file(bool shared)
rel->rd_indexattr = NULL;
rel->rd_keyattr = NULL;
rel->rd_idattr = NULL;
+ rel->rd_mvstatvalid = false;
+ rel->rd_mvstatlist = NIL;
rel->rd_createSubid = InvalidSubTransactionId;
rel->rd_newRelfilenodeSubid = InvalidSubTransactionId;
rel->rd_amcache = NULL;
diff --git a/src/backend/utils/cache/syscache.c b/src/backend/utils/cache/syscache.c
index 58f90f6..89173d6 100644
--- a/src/backend/utils/cache/syscache.c
+++ b/src/backend/utils/cache/syscache.c
@@ -43,6 +43,7 @@
#include "catalog/pg_foreign_server.h"
#include "catalog/pg_foreign_table.h"
#include "catalog/pg_language.h"
+#include "catalog/pg_mv_statistic.h"
#include "catalog/pg_namespace.h"
#include "catalog/pg_opclass.h"
#include "catalog/pg_operator.h"
@@ -502,6 +503,17 @@ static const struct cachedesc cacheinfo[] = {
},
4
},
+ {MvStatisticRelationId, /* MVSTATOID */
+ MvStatisticOidIndexId,
+ 1,
+ {
+ ObjectIdAttributeNumber,
+ 0,
+ 0,
+ 0
+ },
+ 128
+ },
{NamespaceRelationId, /* NAMESPACENAME */
NamespaceNameIndexId,
1,
diff --git a/src/backend/utils/mvstats/Makefile b/src/backend/utils/mvstats/Makefile
new file mode 100644
index 0000000..099f1ed
--- /dev/null
+++ b/src/backend/utils/mvstats/Makefile
@@ -0,0 +1,17 @@
+#-------------------------------------------------------------------------
+#
+# Makefile--
+# Makefile for utils/mvstats
+#
+# IDENTIFICATION
+# src/backend/utils/mvstats/Makefile
+#
+#-------------------------------------------------------------------------
+
+subdir = src/backend/utils/mvstats
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+
+OBJS = common.o dependencies.o
+
+include $(top_srcdir)/src/backend/common.mk
diff --git a/src/backend/utils/mvstats/common.c b/src/backend/utils/mvstats/common.c
new file mode 100644
index 0000000..a755c49
--- /dev/null
+++ b/src/backend/utils/mvstats/common.c
@@ -0,0 +1,356 @@
+/*-------------------------------------------------------------------------
+ *
+ * common.c
+ * POSTGRES multivariate statistics
+ *
+ *
+ * Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ * src/backend/utils/mvstats/common.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "common.h"
+
+static VacAttrStats ** lookup_var_attr_stats(int2vector *attrs,
+ int natts, VacAttrStats **vacattrstats);
+
+static List* list_mv_stats(Oid relid);
+
+
+/*
+ * Compute requested multivariate stats, using the rows sampled for the
+ * plain (single-column) stats.
+ *
+ * This fetches a list of stats from pg_mv_statistic, computes the stats
+ * and serializes them back into the catalog (as bytea values).
+ */
+void
+build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
+ int natts, VacAttrStats **vacattrstats)
+{
+ ListCell *lc;
+ List *mvstats;
+
+ TupleDesc tupdesc = RelationGetDescr(onerel);
+
+ /*
+ * Fetch defined MV groups from pg_mv_statistic, and then compute
+ * the MV statistics (functional dependencies for now).
+ */
+ mvstats = list_mv_stats(RelationGetRelid(onerel));
+
+ foreach (lc, mvstats)
+ {
+ int j;
+ MVStatisticInfo *stat = (MVStatisticInfo *)lfirst(lc);
+ MVDependencies deps = NULL;
+
+ VacAttrStats **stats = NULL;
+ int numatts = 0;
+
+ /* int2 vector of attnums the stats should be computed on */
+ int2vector * attrs = stat->stakeys;
+
+ /* see how many of the columns are not dropped */
+ for (j = 0; j < attrs->dim1; j++)
+ if (! tupdesc->attrs[attrs->values[j]-1]->attisdropped)
+ numatts += 1;
+
+ /* if there are dropped attributes, build a filtered int2vector */
+ if (numatts != attrs->dim1)
+ {
+ int16 *tmp = palloc0(numatts * sizeof(int16));
+ int attnum = 0;
+
+ for (j = 0; j < attrs->dim1; j++)
+ if (! tupdesc->attrs[attrs->values[j]-1]->attisdropped)
+ tmp[attnum++] = attrs->values[j];
+
+ pfree(attrs);
+ attrs = buildint2vector(tmp, numatts);
+ }
+
+ /* filter only the interesting vacattrstats records */
+ stats = lookup_var_attr_stats(attrs, natts, vacattrstats);
+
+ /* check allowed number of dimensions */
+ Assert((attrs->dim1 >= 2) && (attrs->dim1 <= MVSTATS_MAX_DIMENSIONS));
+
+ /*
+ * Analyze functional dependencies of columns.
+ */
+ deps = build_mv_dependencies(numrows, rows, attrs, stats);
+
+ /* store the computed stats (functional dependencies for now) in the catalog */
+ update_mv_stats(stat->mvoid, deps, attrs);
+ }
+}
+
+/*
+ * Lookup the VacAttrStats info for the selected columns, with indexes
+ * matching the attrs vector (to make it easy to work with when
+ * computing multivariate stats).
+ */
+static VacAttrStats **
+lookup_var_attr_stats(int2vector *attrs, int natts, VacAttrStats **vacattrstats)
+{
+ int i, j;
+ int numattrs = attrs->dim1;
+ VacAttrStats **stats = (VacAttrStats**)palloc0(numattrs * sizeof(VacAttrStats*));
+
+ /* lookup VacAttrStats info for the requested columns (same attnum) */
+ for (i = 0; i < numattrs; i++)
+ {
+ stats[i] = NULL;
+ for (j = 0; j < natts; j++)
+ {
+ if (attrs->values[i] == vacattrstats[j]->tupattnum)
+ {
+ stats[i] = vacattrstats[j];
+ break;
+ }
+ }
+
+ /*
+ * Check that we found the info, that the attnum matches, and
+ * that the requested 'lt' operator is available.
+ */
+ Assert(stats[i] != NULL);
+ Assert(stats[i]->tupattnum == attrs->values[i]);
+
+ /* FIXME This is a rather ugly way to check for 'ltopr' (which
+ * is defined only for 'scalar' attributes).
+ */
+ Assert(((StdAnalyzeData *)stats[i]->extra_data)->ltopr != InvalidOid);
+ }
+
+ return stats;
+}
+
+/*
+ * Fetch list of MV stats defined on a table, without the actual data
+ * for histograms, MCV lists etc.
+ */
+static List*
+list_mv_stats(Oid relid)
+{
+ Relation indrel;
+ SysScanDesc indscan;
+ ScanKeyData skey;
+ HeapTuple htup;
+ List *result = NIL;
+
+ /* Prepare to scan pg_mv_statistic for entries having indrelid = this rel. */
+ ScanKeyInit(&skey,
+ Anum_pg_mv_statistic_starelid,
+ BTEqualStrategyNumber, F_OIDEQ,
+ ObjectIdGetDatum(relid));
+
+ indrel = heap_open(MvStatisticRelationId, AccessShareLock);
+ indscan = systable_beginscan(indrel, MvStatisticRelidIndexId, true,
+ NULL, 1, &skey);
+
+ while (HeapTupleIsValid(htup = systable_getnext(indscan)))
+ {
+ MVStatisticInfo *info = makeNode(MVStatisticInfo);
+ Form_pg_mv_statistic stats = (Form_pg_mv_statistic) GETSTRUCT(htup);
+
+ info->mvoid = HeapTupleGetOid(htup);
+ info->stakeys = buildint2vector(stats->stakeys.values, stats->stakeys.dim1);
+ info->deps_built = stats->deps_built;
+
+ result = lappend(result, info);
+ }
+
+ systable_endscan(indscan);
+
+ heap_close(indrel, AccessShareLock);
+
+ /* TODO Maybe save the list into the relcache, as we do in
+ * RelationGetIndexList (which served as inspiration for this one)? */
+
+ return result;
+}
+
+void
+update_mv_stats(Oid mvoid, MVDependencies dependencies, int2vector *attrs)
+{
+ HeapTuple stup,
+ oldtup;
+ Datum values[Natts_pg_mv_statistic];
+ bool nulls[Natts_pg_mv_statistic];
+ bool replaces[Natts_pg_mv_statistic];
+
+ Relation sd = heap_open(MvStatisticRelationId, RowExclusiveLock);
+
+ memset(nulls, 1, Natts_pg_mv_statistic * sizeof(bool));
+ memset(replaces, 0, Natts_pg_mv_statistic * sizeof(bool));
+ memset(values, 0, Natts_pg_mv_statistic * sizeof(Datum));
+
+ /*
+ * Construct a new pg_mv_statistic tuple - replace only the
+ * dependencies, depending on whether they were actually computed.
+ */
+ if (dependencies != NULL)
+ {
+ nulls[Anum_pg_mv_statistic_stadeps - 1] = false;
+ values[Anum_pg_mv_statistic_stadeps - 1]
+ = PointerGetDatum(serialize_mv_dependencies(dependencies));
+ }
+
+ /* always replace the value (either with a bytea or NULL) */
+ replaces[Anum_pg_mv_statistic_stadeps - 1] = true;
+
+ /* always change the availability flags */
+ nulls[Anum_pg_mv_statistic_deps_built - 1] = false;
+ nulls[Anum_pg_mv_statistic_stakeys - 1] = false;
+
+ /* use the new attnums, in case we removed some dropped ones */
+ replaces[Anum_pg_mv_statistic_deps_built - 1] = true;
+ replaces[Anum_pg_mv_statistic_stakeys - 1] = true;
+
+ values[Anum_pg_mv_statistic_deps_built - 1] = BoolGetDatum(dependencies != NULL);
+ values[Anum_pg_mv_statistic_stakeys - 1] = PointerGetDatum(attrs);
+
+ /* Is there already a pg_mv_statistic tuple for these statistics? */
+ oldtup = SearchSysCache1(MVSTATOID,
+ ObjectIdGetDatum(mvoid));
+
+ if (HeapTupleIsValid(oldtup))
+ {
+ /* Yes, replace it */
+ stup = heap_modify_tuple(oldtup,
+ RelationGetDescr(sd),
+ values,
+ nulls,
+ replaces);
+ ReleaseSysCache(oldtup);
+ simple_heap_update(sd, &stup->t_self, stup);
+ }
+ else
+ elog(ERROR, "invalid pg_mv_statistic record (oid=%d)", mvoid);
+
+ /* update indexes too */
+ CatalogUpdateIndexes(sd, stup);
+
+ heap_freetuple(stup);
+
+ heap_close(sd, RowExclusiveLock);
+}
+
+/* multi-variate stats comparator */
+
+/*
+ * qsort_arg comparator for sorting Datums (MV stats)
+ *
+ * Unlike compare_scalars() in analyze.c, this does not maintain
+ * the tupnoLink array.
+ */
+int
+compare_scalars_simple(const void *a, const void *b, void *arg)
+{
+ Datum da = *(Datum*)a;
+ Datum db = *(Datum*)b;
+ SortSupport ssup = (SortSupport) arg;
+
+ return ApplySortComparator(da, false, db, false, ssup);
+}
+
+/*
+ * qsort_arg comparator for sorting data when partitioning a MV bucket
+ */
+int
+compare_scalars_partition(const void *a, const void *b, void *arg)
+{
+ Datum da = ((ScalarItem*)a)->value;
+ Datum db = ((ScalarItem*)b)->value;
+ SortSupport ssup = (SortSupport) arg;
+
+ return ApplySortComparator(da, false, db, false, ssup);
+}
+
+/* initialize multi-dimensional sort */
+MultiSortSupport
+multi_sort_init(int ndims)
+{
+ MultiSortSupport mss;
+
+ Assert(ndims >= 2);
+
+ mss = (MultiSortSupport)palloc0(offsetof(MultiSortSupportData, ssup)
+ + sizeof(SortSupportData)*ndims);
+
+ mss->ndims = ndims;
+
+ return mss;
+}
+
+/*
+ * add sort info for dimension 'dim' (index into vacattrstats) to mss,
+ * at position 'sortdim'
+ */
+void
+multi_sort_add_dimension(MultiSortSupport mss, int sortdim,
+ int dim, VacAttrStats **vacattrstats)
+{
+ /* first, lookup StdAnalyzeData for the dimension (attribute) */
+ SortSupportData ssup;
+ StdAnalyzeData *tmp = (StdAnalyzeData *)vacattrstats[dim]->extra_data;
+
+ Assert(mss != NULL);
+ Assert(sortdim < mss->ndims);
+
+ /* initialize sort support, etc. */
+ memset(&ssup, 0, sizeof(ssup));
+ ssup.ssup_cxt = CurrentMemoryContext;
+
+ /* We always use the default collation for statistics */
+ ssup.ssup_collation = DEFAULT_COLLATION_OID;
+ ssup.ssup_nulls_first = false;
+
+ PrepareSortSupportFromOrderingOp(tmp->ltopr, &ssup);
+
+ mss->ssup[sortdim] = ssup;
+}
+
+/* compare all the dimensions in the selected order */
+int
+multi_sort_compare(const void *a, const void *b, void *arg)
+{
+ int i;
+ SortItem *ia = (SortItem*)a;
+ SortItem *ib = (SortItem*)b;
+
+ MultiSortSupport mss = (MultiSortSupport)arg;
+
+ for (i = 0; i < mss->ndims; i++)
+ {
+ int compare;
+
+ compare = ApplySortComparator(ia->values[i], ia->isnull[i],
+ ib->values[i], ib->isnull[i],
+ &mss->ssup[i]);
+
+ if (compare != 0)
+ return compare;
+
+ }
+
+ /* equal by default */
+ return 0;
+}
+
+/* compare selected dimension */
+int
+multi_sort_compare_dim(int dim, const SortItem *a, const SortItem *b,
+ MultiSortSupport mss)
+{
+ return ApplySortComparator(a->values[dim], a->isnull[dim],
+ b->values[dim], b->isnull[dim],
+ &mss->ssup[dim]);
+}
diff --git a/src/backend/utils/mvstats/common.h b/src/backend/utils/mvstats/common.h
new file mode 100644
index 0000000..6d5465b
--- /dev/null
+++ b/src/backend/utils/mvstats/common.h
@@ -0,0 +1,75 @@
+/*-------------------------------------------------------------------------
+ *
+ * common.h
+ * POSTGRES multivariate statistics
+ *
+ *
+ * Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ * src/backend/utils/mvstats/common.h
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres.h"
+
+#include "access/tuptoaster.h"
+#include "catalog/indexing.h"
+#include "catalog/pg_collation.h"
+#include "catalog/pg_mv_statistic.h"
+#include "foreign/fdwapi.h"
+#include "postmaster/autovacuum.h"
+#include "storage/lmgr.h"
+#include "utils/datum.h"
+#include "utils/sortsupport.h"
+#include "utils/syscache.h"
+#include "utils/fmgroids.h"
+#include "utils/builtins.h"
+#include "access/sysattr.h"
+
+#include "utils/mvstats.h"
+
+/* FIXME private structure copied from analyze.c */
+
+typedef struct
+{
+ Oid eqopr; /* '=' operator for datatype, if any */
+ Oid eqfunc; /* and associated function */
+ Oid ltopr; /* '<' operator for datatype, if any */
+} StdAnalyzeData;
+
+typedef struct
+{
+ Datum value; /* a data value */
+ int tupno; /* position index for tuple it came from */
+} ScalarItem;
+
+/* multi-sort */
+typedef struct MultiSortSupportData {
+ int ndims; /* number of dimensions supported by the sort */
+ SortSupportData ssup[1]; /* sort support data for each dimension */
+} MultiSortSupportData;
+
+typedef MultiSortSupportData* MultiSortSupport;
+
+typedef struct SortItem {
+ Datum *values;
+ bool *isnull;
+} SortItem;
+
+MultiSortSupport multi_sort_init(int ndims);
+
+void multi_sort_add_dimension(MultiSortSupport mss, int sortdim,
+ int dim, VacAttrStats **vacattrstats);
+
+int multi_sort_compare(const void *a, const void *b, void *arg);
+
+int multi_sort_compare_dim(int dim, const SortItem *a,
+ const SortItem *b, MultiSortSupport mss);
+
+/* comparators, used when constructing multivariate stats */
+int compare_scalars_simple(const void *a, const void *b, void *arg);
+int compare_scalars_partition(const void *a, const void *b, void *arg);
diff --git a/src/backend/utils/mvstats/dependencies.c b/src/backend/utils/mvstats/dependencies.c
new file mode 100644
index 0000000..84b6561
--- /dev/null
+++ b/src/backend/utils/mvstats/dependencies.c
@@ -0,0 +1,638 @@
+/*-------------------------------------------------------------------------
+ *
+ * dependencies.c
+ * POSTGRES multivariate functional dependencies
+ *
+ *
+ * Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ * src/backend/utils/mvstats/dependencies.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "common.h"
+#include "utils/lsyscache.h"
+
+/*
+ * Mine functional dependencies between columns, in the form (A => B),
+ * meaning that a value in column 'A' determines value in 'B'. A simple
+ * artificial example may be a table created like this
+ *
+ * CREATE TABLE deptest (a INT, b INT)
+ * AS SELECT i, i/10 FROM generate_series(1,100000) s(i);
+ *
+ * Clearly, once we know the value for 'A' we can easily determine the
+ * value of 'B' by dividing (A/10). A more practical example may be
+ * addresses, where (ZIP code => city name), i.e. once we know the ZIP,
+ * we probably know which city it belongs to. Larger cities usually have
+ * multiple ZIP codes, so the dependency can't be reversed.
+ *
+ * Functional dependencies are a concept well described in relational
+ * theory, especially in definition of normalization and "normal forms".
+ * Wikipedia has a nice definition of a functional dependency [1]:
+ *
+ * In a given table, an attribute Y is said to have a functional
+ * dependency on a set of attributes X (written X -> Y) if and only
+ * if each X value is associated with precisely one Y value. For
+ * example, in an "Employee" table that includes the attributes
+ * "Employee ID" and "Employee Date of Birth", the functional
+ * dependency {Employee ID} -> {Employee Date of Birth} would hold.
+ * It follows from the previous two sentences that each {Employee ID}
+ * is associated with precisely one {Employee Date of Birth}.
+ *
+ * [1] http://en.wikipedia.org/wiki/Database_normalization
+ *
+ * Ideally, datasets would be normalized not to contain any such
+ * functional dependencies, but sometimes that's not practical. In some
+ * cases it's actually a conscious choice to model the dataset in a
+ * denormalized way, either because of performance or to make querying
+ * easier.
+ *
+ * The current implementation supports only dependencies between two
+ * columns, but this is merely a simplification for the initial patch.
+ * It's certainly useful to mine for dependencies involving multiple
+ * columns on the 'left' side, i.e. the condition of the dependency -
+ * that is, dependencies such as [A,B] => C and so on.
+ *
+ * TODO The implementation may/should be smart enough not to mine both
+ * [A => B] and [A,C => B], because the second dependency is a
+ * consequence of the first one (if values of A determine values
+ * of B, adding another column won't change that). The ANALYZE
+ * should first analyze 1:1 dependencies, then 2:1 dependencies
+ * (and skip the already identified ones), etc.
+ *
+ * For example the dependency [city name => zip code] is much weaker
+ * than [city name, state name => zip code], because there may be
+ * multiple cities with the same name in various states. It's not
+ * perfect though - there are probably cities with the same name within
+ * the same state, but that is hopefully a relatively rare occurrence.
+ * More about this in the section about dependency mining.
+ *
+ * Handling multiple columns on the right side is not necessary, as such
+ * dependencies may be decomposed into a set of dependencies with
+ * the same meaning, one for each column on the right side. For example
+ *
+ * A => [B,C]
+ *
+ * is exactly the same as
+ *
+ * (A => B) & (A => C).
+ *
+ * Of course, storing (A => [B, C]) may be more efficient than storing
+ * the two dependencies (A => B) and (A => C) separately.
+ *
+ *
+ * Dependency mining (ANALYZE)
+ * ---------------------------
+ *
+ * The current build algorithm is rather simple - for each pair [A,B] of
+ * columns, the data are sorted lexicographically (first by A, then B),
+ * and then a number of metrics is computed by walking the sorted data.
+ *
+ * In general the algorithm counts distinct values of A (forming groups
+ * thanks to the sorting), supporting or contradicting the hypothesis
+ * that A => B (i.e. that values of B are predetermined by A). If there
+ * are multiple values of B for a single value of A, it's counted as
+ * contradicting.
+ *
+ * A group may be neither supporting nor contradicting. To be counted as
+ * supporting, the group has to have at least min_group_size(=3) rows.
+ * Smaller 'supporting' groups are counted as neutral.
+ *
+ * Finally, the number of rows in supporting and contradicting groups is
+ * compared, and if there is at least 10x more supporting rows, the
+ * dependency is considered valid.
+ *
+ *
+ * Real-world datasets are imperfect - there may be errors (e.g. due to
+ * data-entry mistakes), or factually correct records, yet contradicting
+ * the dependency (e.g. when a city splits into two, but both keep the
+ * same ZIP code). A strict ANALYZE implementation (where the functional
+ * dependencies are identified) would ignore dependencies on such noisy
+ * data, making the approach unusable in practice.
+ *
+ * The proposed implementation attempts to handle such noisy cases
+ * gracefully, by tolerating small number of contradicting cases.
+ *
+ * In the future this might also perform some sort of test and decide
+ * whether it's worth building any other kind of multivariate stats,
+ * or whether the dependencies sufficiently describe the data. Or at
+ * least not build the MCV list / histogram on the implied columns.
+ * Such reduction would however make the 'verification' (see the next
+ * section) impossible.
+ *
+ *
+ * Clause reduction (planner/optimizer)
+ * ------------------------------------
+ *
+ * Applying the dependencies is quite simple - given a list of clauses,
+ * try to apply all the dependencies. For example given clause list
+ *
+ * (a = 1) AND (b = 1) AND (c = 1) AND (d < 100)
+ *
+ * and dependencies [a=>b] and [a=>d], this may be reduced to
+ *
+ * (a = 1) AND (c = 1) AND (d < 100)
+ *
+ * The (d<100) can't be reduced as it's not an equality clause, so the
+ * dependency [a=>d] can't be applied.
+ *
+ * See clauselist_apply_dependencies() for more details.
+ *
+ * The problem with the reduction is that the query may use conditions
+ * that are not redundant, but in fact contradictory - e.g. the user
+ * may search for a ZIP code and a city name not matching that ZIP code.
+ *
+ * In such cases the condition on the city name is not redundant but
+ * contradictory (making the result empty), and removing it while
+ * estimating the cardinality will make the estimate worse.
+ *
+ * The current estimation assuming independence (and multiplying the
+ * selectivities) works better in this case, but only by utter luck.
+ *
+ * In some cases this might be verified using the other multivariate
+ * statistics - MCV lists and histograms. For MCV lists the verification
+ * might be very simple - peek into the list if there are any items
+ * matching the clause on the 'A' column (e.g. ZIP code), and if such
+ * item is found, check that the 'B' column matches the other clause.
+ * If it does not, the clauses are contradictory. We can't really say
+ * much if no such item is found, except maybe restricting the
+ * selectivity using the MCV data (e.g. using min/max selectivity).
+ *
+ * With histograms, it might work similarly - we can't check the values
+ * directly (because histograms use buckets, unlike MCV lists, which
+ * store the actual values). So we can only observe the buckets matching
+ * the clauses - if those buckets have very low frequency, it probably
+ * means the two clauses are incompatible.
+ *
+ * It's unclear what 'low frequency' is, but if one of the clauses is
+ * implied (automatically true because of the other clause), then
+ *
+ * selectivity[clause(A)] = selectivity[clause(A) & clause(B)]
+ *
+ * So we might compute selectivity of the first clause (on the column
+ * A in dependency [A=>B]) - for example using regular statistics.
+ * And then check if the selectivity computed from the histogram is
+ * about the same (or significantly lower).
+ *
+ * The problem is that histograms work well only when the data ordering
+ * matches the natural meaning. For values that serve as labels - like
+ * city names or ZIP codes, or even generated IDs, histograms really
+ * don't work all that well. For example sorting cities by name won't
+ * match the sorting of ZIP codes, rendering the histogram unusable.
+ *
+ * The MCV are probably going to work much better, because they don't
+ * really assume any sort of ordering. And it's probably more appropriate
+ * for the label-like data.
+ *
+ * TODO Support dependencies with multiple columns on left/right.
+ *
+ * TODO Investigate using histogram and MCV list to confirm the
+ * functional dependencies.
+ *
+ * TODO Investigate statistical testing of the distribution (to decide
+ * whether it makes sense to build the histogram/MCV list).
+ *
+ * TODO Using a min/max of selectivities would probably make more sense
+ * for the associated columns.
+ *
+ * TODO Consider eliminating the implied columns from the histogram and
+ * MCV lists (but maybe that's not a good idea, because that'd make
+ * it impossible to use these stats for non-equality clauses and
+ * also it wouldn't be possible to use the stats for verification
+ * of the dependencies as proposed in another TODO).
+ *
+ * TODO This builds a complete set of dependencies, i.e. including
+ * transitive dependencies - if we identify [A => B] and [B => C],
+ * we're likely to identify [A => C] too. It might be better to
+ * keep only the minimal set of dependencies, i.e. prune all the
+ * dependencies that we can recreate by transitivity.
+ *
+ * There are two conceptual ways to do that:
+ *
+ * (a) generate all the rules, and then prune the rules that may
+ * be recreated by combining other dependencies, or
+ *
+ * (b) performing the 'is combination of other dependencies' check
+ * before actually doing the work
+ *
+ * The second option has the advantage that we don't really need
+ * to perform the sort/count. It's not sufficient alone, though,
+ * because we may discover the dependencies in the wrong order.
+ * For example [A => B], [A => C] and then [B => C]. None of those
+ * dependencies is a combination of the already known ones, yet
+ * [A => C] is a combination of [A => B] and [B => C].
+ *
+ * FIXME Not sure the current NULL handling makes much sense. We assume
+ * that NULL is 0, so it's handled like a regular value
+ * (NULL == NULL), so all NULLs in a single column form a single
+ * group. Maybe that's not the right thing to do, especially with
+ * equality conditions - in that case NULLs are irrelevant. So
+ * maybe the right solution would be to just ignore NULL values?
+ *
+ * However simply "ignoring" the NULL values does not seem like
+ * a good idea - imagine columns A and B, where for each value of
+ * A, values in B are constant (same for the whole group) or NULL.
+ * Let's say only 10% of B values in each group are not NULL. Then
+ * ignoring the NULL values will result in 10x misestimate (and
+ * it's trivial to construct arbitrary errors). So maybe handling
+ * NULL values just like a regular value is the right thing here.
+ *
+ * Or maybe NULL values should be treated differently on each side
+ * of the dependency? E.g. as ignored on the left (condition) and
+ * as regular values on the right - this seems consistent with how
+ * equality clauses work, as equality clause means 'NOT NULL'.
+ * So if we say [A => B] then it may also imply "NOT NULL" on the
+ * right side.
+ */
+MVDependencies
+build_mv_dependencies(int numrows, HeapTuple *rows, int2vector *attrs,
+ VacAttrStats **stats)
+{
+ int i;
+ int numattrs = attrs->dim1;
+
+ /* result */
+ int ndeps = 0;
+ MVDependencies dependencies = NULL;
+ MultiSortSupport mss = multi_sort_init(2); /* 2 dimensions for now */
+
+ /* TODO Maybe this should be somehow related to the number of
+ * distinct values in the two columns we're currently analyzing.
+ * Assuming the distribution is uniform, we can estimate the
+ * average group size and use it as a threshold. Or something
+ * like that. Seems better than a static approach.
+ */
+ int min_group_size = 3;
+
+ /* dimension indexes we'll check for associations [a => b] */
+ int dima, dimb;
+
+ /*
+ * We'll reuse the same array for all the 2-column combinations.
+ *
+ * It's possible to sort the sample rows directly, but this seemed
+ * somewhat simpler / less error prone. Another option would be to
+ * allocate the arrays for each SortItem separately, but that'd be
+ * significant overhead (not just CPU, but especially memory bloat).
+ */
+ SortItem * items = (SortItem*)palloc0(numrows * sizeof(SortItem));
+
+ Datum *values = (Datum*)palloc0(sizeof(Datum) * numrows * 2);
+ bool *isnull = (bool*)palloc0(sizeof(bool) * numrows * 2);
+
+ for (i = 0; i < numrows; i++)
+ {
+ items[i].values = &values[i * 2];
+ items[i].isnull = &isnull[i * 2];
+ }
+
+ Assert(numattrs >= 2);
+
+ /*
+ * Evaluate all possible combinations of [A => B], using a simple algorithm:
+ *
+ * (a) sort the data by [A,B]
+ * (b) split the data into groups by A (new group whenever a value changes)
+ * (c) count different values in the B column (again, value changes)
+ *
+ * TODO It should be rather simple to merge [A => B] and [A => C] into
+ * [A => B,C]. Just keep A constant, collect all the "implied" columns
+ * and you're done.
+ */
+ for (dima = 0; dima < numattrs; dima++)
+ {
+ /* prepare the sort function for the first dimension */
+ multi_sort_add_dimension(mss, 0, dima, stats);
+
+ for (dimb = 0; dimb < numattrs; dimb++)
+ {
+ SortItem current;
+
+ /* number of groups supporting / contradicting the dependency */
+ int n_supporting = 0;
+ int n_contradicting = 0;
+
+ /* counters valid within a group */
+ int group_size = 0;
+ int n_violations = 0;
+
+ int n_supporting_rows = 0;
+ int n_contradicting_rows = 0;
+
+ /* make sure the columns are different (skip A => A) */
+ if (dima == dimb)
+ continue;
+
+ /* prepare the sort function for the second dimension */
+ multi_sort_add_dimension(mss, 1, dimb, stats);
+
+ /* reset the values and isnull flags */
+ memset(values, 0, sizeof(Datum) * numrows * 2);
+ memset(isnull, 0, sizeof(bool) * numrows * 2);
+
+ /* accumulate all the data for both columns into an array and sort it */
+ for (i = 0; i < numrows; i++)
+ {
+ items[i].values[0]
+ = heap_getattr(rows[i], attrs->values[dima],
+ stats[dima]->tupDesc, &items[i].isnull[0]);
+
+ items[i].values[1]
+ = heap_getattr(rows[i], attrs->values[dimb],
+ stats[dimb]->tupDesc, &items[i].isnull[1]);
+ }
+
+ qsort_arg((void *) items, numrows, sizeof(SortItem),
+ multi_sort_compare, mss);
+
+ /*
+ * Walk through the array, split it into groups according to
+ * the A value, and count distinct values in the other one.
+ * If there's a single B value for the whole group, we count
+ * it as supporting the association, otherwise we count it
+ * as contradicting.
+ *
+ * Furthermore we require a group to have at least a certain
+ * number of rows to be considered useful for supporting the
+ * dependency; a contradicting group however always counts.
+ */
+
+ /* start with values from the first row */
+ current = items[0];
+ group_size = 1;
+
+ for (i = 1; i < numrows; i++)
+ {
+ /* end of the group */
+ if (multi_sort_compare_dim(0, &items[i], &current, mss) != 0)
+ {
+ /*
+ * If there are no contradicting rows, count it as
+ * supporting (otherwise contradicting), but only if
+ * the group is large enough.
+ *
+ * The requirement of a minimum group size makes it
+ * impossible to identify [unique,unique] cases, but
+ * that's probably a different case. This is more
+ * about [zip => city] associations etc.
+ *
+ * If there are violations, count the group/rows as
+ * a violation.
+ *
+ * It may be neither, if the group is too small (does
+ * not contain at least min_group_size rows).
+ */
+ if ((n_violations == 0) && (group_size >= min_group_size))
+ {
+ n_supporting += 1;
+ n_supporting_rows += group_size;
+ }
+ else if (n_violations > 0)
+ {
+ n_contradicting += 1;
+ n_contradicting_rows += group_size;
+ }
+
+ /* current values start a new group */
+ n_violations = 0;
+ group_size = 0;
+ }
+ /* mismatch of a B value is contradicting */
+ else if (multi_sort_compare_dim(1, &items[i], &current, mss) != 0)
+ {
+ n_violations += 1;
+ }
+
+ current = items[i];
+ group_size += 1;
+ }
+
+ /* handle the last group (just like above) */
+ if ((n_violations == 0) && (group_size >= min_group_size))
+ {
+ n_supporting += 1;
+ n_supporting_rows += group_size;
+ }
+ else if (n_violations)
+ {
+ n_contradicting += 1;
+ n_contradicting_rows += group_size;
+ }
+
+ /*
+ * See if the number of rows supporting the association is at least
+ * 10x the number of rows violating the hypothetical dependency.
+ *
+ * TODO This is a rather arbitrary limit - I guess it's possible to do
+ * some math to come up with a better rule (e.g. testing a hypothesis
+ * 'this is due to randomness'). We can create a contingency table
+ * from the values and use it for testing. Possibly only when
+ * there are no contradicting rows?
+ *
+ * TODO Also, if (a => b) and (b => a) at the same time, it pretty much
+ * means there's a 1:1 relation (or one is a 'label'), making the
+ * conditions rather redundant. Although it's possible that the
+ * query uses incompatible combination of values.
+ */
+ if (n_supporting_rows > (n_contradicting_rows * 10))
+ {
+ if (dependencies == NULL)
+ {
+ dependencies = (MVDependencies)palloc0(sizeof(MVDependenciesData));
+ dependencies->magic = MVSTAT_DEPS_MAGIC;
+ }
+ else
+ dependencies = repalloc(dependencies, offsetof(MVDependenciesData, deps)
+ + sizeof(MVDependency) * (dependencies->ndeps + 1));
+
+ /* add the new dependency to the array */
+ dependencies->deps[ndeps] = (MVDependency)palloc0(sizeof(MVDependencyData));
+ dependencies->deps[ndeps]->a = attrs->values[dima];
+ dependencies->deps[ndeps]->b = attrs->values[dimb];
+
+ dependencies->ndeps = (++ndeps);
+ }
+ }
+ }
+
+ pfree(items);
+ pfree(values);
+ pfree(isnull);
+ pfree(stats);
+ pfree(mss);
+
+ return dependencies;
+}
+
+/*
+ * Store the dependencies into a bytea, so that it can be stored in the
+ * pg_mv_statistic catalog.
+ *
+ * Currently this only supports simple two-column rules, and stores them
+ * as a sequence of attnum pairs. In the future, this needs to be made
+ * more complex to support multiple columns on both sides of the
+ * implication (using AND on left, OR on right).
+ */
+bytea *
+serialize_mv_dependencies(MVDependencies dependencies)
+{
+ int i;
+
+ /* we need to store ndeps, and each needs 2 * int16 */
+ Size len = VARHDRSZ + offsetof(MVDependenciesData, deps)
+ + dependencies->ndeps * (sizeof(int16) * 2);
+
+ bytea * output = (bytea*)palloc0(len);
+
+ char * tmp = VARDATA(output);
+
+ SET_VARSIZE(output, len);
+
+ /* first, store the number of dimensions / items */
+ memcpy(tmp, dependencies, offsetof(MVDependenciesData, deps));
+ tmp += offsetof(MVDependenciesData, deps);
+
+ /* walk through the dependencies and copy both columns into the bytea */
+ for (i = 0; i < dependencies->ndeps; i++)
+ {
+ memcpy(tmp, &(dependencies->deps[i]->a), sizeof(int16));
+ tmp += sizeof(int16);
+
+ memcpy(tmp, &(dependencies->deps[i]->b), sizeof(int16));
+ tmp += sizeof(int16);
+ }
+
+ return output;
+}
+
+/*
+ * Reads serialized dependencies into MVDependencies structure.
+ */
+MVDependencies
+deserialize_mv_dependencies(bytea * data)
+{
+ int i;
+ Size expected_size;
+ MVDependencies dependencies;
+ char *tmp;
+
+ if (data == NULL)
+ return NULL;
+
+ if (VARSIZE_ANY_EXHDR(data) < offsetof(MVDependenciesData,deps))
+ elog(ERROR, "invalid MVDependencies size %ld (expected at least %ld)",
+ VARSIZE_ANY_EXHDR(data), offsetof(MVDependenciesData,deps));
+
+ /* read the MVDependencies header */
+ dependencies = (MVDependencies)palloc0(sizeof(MVDependenciesData));
+
+ /* initialize pointer to the data part (skip the varlena header) */
+ tmp = VARDATA(data);
+
+ /* get the header and perform basic sanity checks */
+ memcpy(dependencies, tmp, offsetof(MVDependenciesData, deps));
+ tmp += offsetof(MVDependenciesData, deps);
+
+ if (dependencies->magic != MVSTAT_DEPS_MAGIC)
+ {
+ pfree(dependencies);
+ elog(WARNING, "not a MV Dependencies (magic number mismatch)");
+ return NULL;
+ }
+
+ Assert(dependencies->ndeps > 0);
+
+ /* what bytea size do we expect for those parameters */
+ expected_size = offsetof(MVDependenciesData,deps) +
+ dependencies->ndeps * sizeof(int16) * 2;
+
+ if (VARSIZE_ANY_EXHDR(data) != expected_size)
+ elog(ERROR, "invalid dependencies size %ld (expected %ld)",
+ VARSIZE_ANY_EXHDR(data), expected_size);
+
+ /* allocate space for the dependency items */
+ dependencies = repalloc(dependencies, offsetof(MVDependenciesData,deps)
+ + (dependencies->ndeps * sizeof(MVDependency)));
+
+ for (i = 0; i < dependencies->ndeps; i++)
+ {
+ dependencies->deps[i] = (MVDependency)palloc0(sizeof(MVDependencyData));
+
+ memcpy(&(dependencies->deps[i]->a), tmp, sizeof(int16));
+ tmp += sizeof(int16);
+
+ memcpy(&(dependencies->deps[i]->b), tmp, sizeof(int16));
+ tmp += sizeof(int16);
+ }
+
+ return dependencies;
+}
+
+/* print some basic info about dependencies (number of dependencies) */
+Datum
+pg_mv_stats_dependencies_info(PG_FUNCTION_ARGS)
+{
+ bytea *data = PG_GETARG_BYTEA_P(0);
+ char *result;
+
+ MVDependencies dependencies = deserialize_mv_dependencies(data);
+
+ if (dependencies == NULL)
+ PG_RETURN_NULL();
+
+ result = palloc0(128);
+ snprintf(result, 128, "dependencies=%d", dependencies->ndeps);
+
+ /* FIXME free the deserialized data (pfree is not enough) */
+
+ PG_RETURN_TEXT_P(cstring_to_text(result));
+}
+
+/* print the dependencies
+ *
+ * TODO Would be nice if this knew the actual column names (instead of
+ * the attnums).
+ *
+ * FIXME This is really ugly and does not really check the lengths and
+ * strcpy/snprintf return values properly. Needs to be fixed.
+ */
+Datum
+pg_mv_stats_dependencies_show(PG_FUNCTION_ARGS)
+{
+ int i = 0;
+ bytea *data = PG_GETARG_BYTEA_P(0);
+ char *result = NULL;
+ int len = 0;
+
+ MVDependencies dependencies = deserialize_mv_dependencies(data);
+
+ if (dependencies == NULL)
+ PG_RETURN_NULL();
+
+ for (i = 0; i < dependencies->ndeps; i++)
+ {
+ MVDependency dependency = dependencies->deps[i];
+ char buffer[128];
+
+ int tmp = snprintf(buffer, 128, "%s%d => %d",
+ ((i == 0) ? "" : ", "), dependency->a, dependency->b);
+
+ if (tmp < 127)
+ {
+ if (result == NULL)
+ result = palloc0(len + tmp + 1);
+ else
+ result = repalloc(result, len + tmp + 1);
+
+ strcpy(result + len, buffer);
+ len += tmp;
+ }
+ }
+
+ PG_RETURN_TEXT_P(cstring_to_text(result));
+}
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index db56809..912b4f3 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -2096,6 +2096,46 @@ describeOneTableDetails(const char *schemaname,
PQclear(result);
}
+ /* print any multivariate statistics */
+ if (pset.sversion >= 90500)
+ {
+ printfPQExpBuffer(&buf,
+ "SELECT oid, stakeys,\n"
+ " deps_enabled,\n"
+ " deps_built,\n"
+ " mcv_max_items, hist_max_buckets,\n"
+ " (SELECT string_agg(attname::text,', ')\n"
+ " FROM ((SELECT unnest(stakeys) AS attnum) s\n"
+ " JOIN pg_attribute a ON (starelid = a.attrelid and a.attnum = s.attnum))) AS attnums\n"
+ "FROM pg_mv_statistic stat WHERE starelid = '%s' ORDER BY 1;",
+ oid);
+
+ result = PSQLexec(buf.data);
+ if (!result)
+ goto error_return;
+ else
+ tuples = PQntuples(result);
+
+ if (tuples > 0)
+ {
+ printTableAddFooter(&cont, _("Statistics:"));
+ for (i = 0; i < tuples; i++)
+ {
+ printfPQExpBuffer(&buf, " ");
+
+ /* options */
+ if (!strcmp(PQgetvalue(result, i, 2), "t"))
+ appendPQExpBuffer(&buf, "(dependencies)");
+
+ appendPQExpBuffer(&buf, " ON (%s)",
+ PQgetvalue(result, i, 4));
+
+ printTableAddFooter(&cont, buf.data);
+ }
+ }
+ PQclear(result);
+ }
+
/* print rules */
if (tableinfo.hasrules && tableinfo.relkind != 'm')
{
diff --git a/src/include/catalog/heap.h b/src/include/catalog/heap.h
index e6ac394..36debeb 100644
--- a/src/include/catalog/heap.h
+++ b/src/include/catalog/heap.h
@@ -119,6 +119,7 @@ extern void RemoveAttrDefault(Oid relid, AttrNumber attnum,
DropBehavior behavior, bool complain, bool internal);
extern void RemoveAttrDefaultById(Oid attrdefId);
extern void RemoveStatistics(Oid relid, AttrNumber attnum);
+extern void RemoveMVStatistics(Oid relid, AttrNumber attnum);
extern Form_pg_attribute SystemAttributeDefinition(AttrNumber attno,
bool relhasoids);
diff --git a/src/include/catalog/indexing.h b/src/include/catalog/indexing.h
index 748aadd..03ada1b 100644
--- a/src/include/catalog/indexing.h
+++ b/src/include/catalog/indexing.h
@@ -173,6 +173,11 @@ DECLARE_UNIQUE_INDEX(pg_largeobject_loid_pn_index, 2683, on pg_largeobject using
DECLARE_UNIQUE_INDEX(pg_largeobject_metadata_oid_index, 2996, on pg_largeobject_metadata using btree(oid oid_ops));
#define LargeObjectMetadataOidIndexId 2996
+DECLARE_UNIQUE_INDEX(pg_mv_statistic_oid_index, 3380, on pg_mv_statistic using btree(oid oid_ops));
+#define MvStatisticOidIndexId 3380
+DECLARE_INDEX(pg_mv_statistic_relid_index, 3379, on pg_mv_statistic using btree(starelid oid_ops));
+#define MvStatisticRelidIndexId 3379
+
DECLARE_UNIQUE_INDEX(pg_namespace_nspname_index, 2684, on pg_namespace using btree(nspname name_ops));
#define NamespaceNameIndexId 2684
DECLARE_UNIQUE_INDEX(pg_namespace_oid_index, 2685, on pg_namespace using btree(oid oid_ops));
diff --git a/src/include/catalog/pg_mv_statistic.h b/src/include/catalog/pg_mv_statistic.h
new file mode 100644
index 0000000..81ec23b
--- /dev/null
+++ b/src/include/catalog/pg_mv_statistic.h
@@ -0,0 +1,69 @@
+/*-------------------------------------------------------------------------
+ *
+ * pg_mv_statistic.h
+ * definition of the system "multivariate statistic" relation (pg_mv_statistic)
+ * along with the relation's initial contents.
+ *
+ *
+ * Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/catalog/pg_mv_statistic.h
+ *
+ * NOTES
+ * the genbki.pl script reads this file and generates .bki
+ * information from the DATA() statements.
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef PG_MV_STATISTIC_H
+#define PG_MV_STATISTIC_H
+
+#include "catalog/genbki.h"
+
+/* ----------------
+ * pg_mv_statistic definition. cpp turns this into
+ * typedef struct FormData_pg_mv_statistic
+ * ----------------
+ */
+#define MvStatisticRelationId 3381
+
+CATALOG(pg_mv_statistic,3381)
+{
+ /* These fields form the unique key for the entry: */
+ Oid starelid; /* relation containing attributes */
+
+ /* statistics requested to build */
+ bool deps_enabled; /* analyze dependencies? */
+
+ /* statistics that are available (if requested) */
+ bool deps_built; /* dependencies were built */
+
+ /* variable-length fields start here, but we allow direct access to stakeys */
+ int2vector stakeys; /* array of column keys */
+
+#ifdef CATALOG_VARLEN
+ bytea stadeps; /* dependencies (serialized) */
+#endif
+
+} FormData_pg_mv_statistic;
+
+/* ----------------
+ * Form_pg_mv_statistic corresponds to a pointer to a tuple with
+ * the format of pg_mv_statistic relation.
+ * ----------------
+ */
+typedef FormData_pg_mv_statistic *Form_pg_mv_statistic;
+
+/* ----------------
+ * compiler constants for pg_mv_statistic
+ * ----------------
+ */
+#define Natts_pg_mv_statistic 5
+#define Anum_pg_mv_statistic_starelid 1
+#define Anum_pg_mv_statistic_deps_enabled 2
+#define Anum_pg_mv_statistic_deps_built 3
+#define Anum_pg_mv_statistic_stakeys 4
+#define Anum_pg_mv_statistic_stadeps 5
+
+#endif /* PG_MV_STATISTIC_H */
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
index c0aab38..69fc482 100644
--- a/src/include/catalog/pg_proc.h
+++ b/src/include/catalog/pg_proc.h
@@ -2735,6 +2735,11 @@ DESCR("current user privilege on any column by rel name");
DATA(insert OID = 3029 ( has_any_column_privilege PGNSP PGUID 12 10 0 0 0 f f f f t f s 2 0 16 "26 25" _null_ _null_ _null_ _null_ _null_ has_any_column_privilege_id _null_ _null_ _null_ ));
DESCR("current user privilege on any column by rel oid");
+DATA(insert OID = 3307 ( pg_mv_stats_dependencies_info PGNSP PGUID 12 1 0 0 0 f f f f t f i 1 0 25 "17" _null_ _null_ _null_ _null_ _null_ pg_mv_stats_dependencies_info _null_ _null_ _null_ ));
+DESCR("multivariate stats: functional dependencies info");
+DATA(insert OID = 3308 ( pg_mv_stats_dependencies_show PGNSP PGUID 12 1 0 0 0 f f f f t f i 1 0 25 "17" _null_ _null_ _null_ _null_ _null_ pg_mv_stats_dependencies_show _null_ _null_ _null_ ));
+DESCR("multivariate stats: functional dependencies show");
+
DATA(insert OID = 1928 ( pg_stat_get_numscans PGNSP PGUID 12 1 0 0 0 f f f f t f s 1 0 20 "26" _null_ _null_ _null_ _null_ _null_ pg_stat_get_numscans _null_ _null_ _null_ ));
DESCR("statistics: number of scans done for table/index");
DATA(insert OID = 1929 ( pg_stat_get_tuples_returned PGNSP PGUID 12 1 0 0 0 f f f f t f s 1 0 20 "26" _null_ _null_ _null_ _null_ _null_ pg_stat_get_tuples_returned _null_ _null_ _null_ ));
diff --git a/src/include/catalog/toasting.h b/src/include/catalog/toasting.h
index fb2f035..55f6079 100644
--- a/src/include/catalog/toasting.h
+++ b/src/include/catalog/toasting.h
@@ -49,6 +49,7 @@ extern void BootstrapToastTable(char *relName,
DECLARE_TOAST(pg_attrdef, 2830, 2831);
DECLARE_TOAST(pg_constraint, 2832, 2833);
DECLARE_TOAST(pg_description, 2834, 2835);
+DECLARE_TOAST(pg_mv_statistic, 3309, 3310);
DECLARE_TOAST(pg_proc, 2836, 2837);
DECLARE_TOAST(pg_rewrite, 2838, 2839);
DECLARE_TOAST(pg_seclabel, 3598, 3599);
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index 290cdb3..9254f85 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -249,6 +249,7 @@ typedef enum NodeTag
T_PlaceHolderInfo,
T_MinMaxAggInfo,
T_PlannerParamItem,
+ T_MVStatisticInfo,
/*
* TAGS FOR MEMORY NODES (memnodes.h)
@@ -426,6 +427,7 @@ typedef enum NodeTag
T_RoleSpec,
T_RangeTableSample,
T_TableSampleClause,
+ T_StatisticsDef,
/*
* TAGS FOR REPLICATION GRAMMAR PARSE NODES (replnodes.h)
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 868905b..d81537c 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -610,6 +610,14 @@ typedef struct ColumnDef
int location; /* parse location, or -1 if none/unknown */
} ColumnDef;
+typedef struct StatisticsDef
+{
+ NodeTag type;
+ List *keys; /* String nodes naming referenced column(s) */
+ List *options; /* list of DefElem nodes */
+} StatisticsDef;
+
+
/*
* TableLikeClause - CREATE TABLE ( ... LIKE ... ) clause
*/
@@ -1515,7 +1523,9 @@ typedef enum AlterTableType
AT_ReplicaIdentity, /* REPLICA IDENTITY */
AT_EnableRowSecurity, /* ENABLE ROW SECURITY */
AT_DisableRowSecurity, /* DISABLE ROW SECURITY */
- AT_GenericOptions /* OPTIONS (...) */
+ AT_GenericOptions, /* OPTIONS (...) */
+ AT_AddStatistics, /* ADD STATISTICS */
+ AT_DropStatistics /* DROP STATISTICS */
} AlterTableType;
typedef struct ReplicaIdentityStmt
diff --git a/src/include/nodes/relation.h b/src/include/nodes/relation.h
index 279051e..10f7425 100644
--- a/src/include/nodes/relation.h
+++ b/src/include/nodes/relation.h
@@ -459,6 +459,7 @@ typedef struct RelOptInfo
Relids lateral_relids; /* minimum parameterization of rel */
Relids lateral_referencers; /* rels that reference me laterally */
List *indexlist; /* list of IndexOptInfo */
+ List *mvstatlist; /* list of MVStatisticInfo */
BlockNumber pages; /* size estimates derived from pg_class */
double tuples;
double allvisfrac;
@@ -553,6 +554,33 @@ typedef struct IndexOptInfo
bool amhasgetbitmap; /* does AM have amgetbitmap interface? */
} IndexOptInfo;
+/*
+ * MVStatisticInfo
+ * Information about multivariate stats for planning/optimization
+ *
+ * This contains information about which columns are covered by the
+ * statistics (stakeys), which options were requested while adding the
+ * statistics (*_enabled), and which kinds of statistics were actually
+ * built and are available for the optimizer (*_built).
+ */
+typedef struct MVStatisticInfo
+{
+ NodeTag type;
+
+ Oid mvoid; /* OID of the statistics row */
+ RelOptInfo *rel; /* back-link to index's table */
+
+ /* enabled statistics */
+ bool deps_enabled; /* functional dependencies enabled */
+
+ /* built/available statistics */
+ bool deps_built; /* functional dependencies built */
+
+ /* columns in the statistics (attnums) */
+ int2vector *stakeys; /* attnums of the columns covered */
+
+} MVStatisticInfo;
+
/*
* EquivalenceClasses
diff --git a/src/include/parser/kwlist.h b/src/include/parser/kwlist.h
index 2414069..f69480b 100644
--- a/src/include/parser/kwlist.h
+++ b/src/include/parser/kwlist.h
@@ -360,7 +360,7 @@ PG_KEYWORD("stable", STABLE, UNRESERVED_KEYWORD)
PG_KEYWORD("standalone", STANDALONE_P, UNRESERVED_KEYWORD)
PG_KEYWORD("start", START, UNRESERVED_KEYWORD)
PG_KEYWORD("statement", STATEMENT, UNRESERVED_KEYWORD)
-PG_KEYWORD("statistics", STATISTICS, UNRESERVED_KEYWORD)
+PG_KEYWORD("statistics", STATISTICS, RESERVED_KEYWORD)
PG_KEYWORD("stdin", STDIN, UNRESERVED_KEYWORD)
PG_KEYWORD("stdout", STDOUT, UNRESERVED_KEYWORD)
PG_KEYWORD("storage", STORAGE, UNRESERVED_KEYWORD)
diff --git a/src/include/utils/mvstats.h b/src/include/utils/mvstats.h
new file mode 100644
index 0000000..411cd16
--- /dev/null
+++ b/src/include/utils/mvstats.h
@@ -0,0 +1,69 @@
+/*-------------------------------------------------------------------------
+ *
+ * mvstats.h
+ * Multivariate statistics and selectivity estimation functions.
+ *
+ *
+ * Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/utils/mvstats.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef MVSTATS_H
+#define MVSTATS_H
+
+#include "commands/vacuum.h"
+
+
+#define MVSTATS_MAX_DIMENSIONS 8 /* max number of attributes */
+
+/* An association rule, tracking an [a => b] dependency.
+ *
+ * TODO Make this work with multiple columns on both sides.
+ */
+typedef struct MVDependencyData {
+ int16 a;
+ int16 b;
+} MVDependencyData;
+
+typedef MVDependencyData* MVDependency;
+
+typedef struct MVDependenciesData {
+ uint32 magic; /* magic constant marker */
+ int32 ndeps; /* number of dependencies */
+ MVDependency deps[1]; /* XXX why not a pointer? */
+} MVDependenciesData;
+
+typedef MVDependenciesData* MVDependencies;
+
+#define MVSTAT_DEPS_MAGIC 0xB4549A2C /* marks serialized bytea */
+#define MVSTAT_DEPS_TYPE_BASIC 1 /* basic dependencies type */
+
+/*
+ * TODO Maybe fetching the histogram/MCV list separately is inefficient?
+ * Consider adding a single `fetch_stats` method, fetching all
+ * stats specified using flags (or something like that).
+ */
+
+bytea * serialize_mv_dependencies(MVDependencies dependencies);
+
+/* deserialization of stats (serialization is private to analyze) */
+MVDependencies deserialize_mv_dependencies(bytea * data);
+
+/* FIXME this probably belongs somewhere else (not to operations stats) */
+extern Datum pg_mv_stats_dependencies_info(PG_FUNCTION_ARGS);
+extern Datum pg_mv_stats_dependencies_show(PG_FUNCTION_ARGS);
+
+MVDependencies
+build_mv_dependencies(int numrows, HeapTuple *rows,
+ int2vector *attrs,
+ VacAttrStats **stats);
+
+void build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
+ int natts, VacAttrStats **vacattrstats);
+
+void update_mv_stats(Oid mvoid, MVDependencies dependencies, int2vector *attrs);
+
+#endif
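
To make the serialized format easier to see: the bytea produced by
serialize_mv_dependencies() is simply the leading part of
MVDependenciesData (up to offsetof(MVDependenciesData, deps)) followed
by ndeps pairs of int16 attnums. A standalone sketch of the same layout
(leaving out the varlena header and the palloc machinery; DepsHeader is
a made-up stand-in for the struct header, not part of the patch):

    #include <assert.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>

    #define DEPS_MAGIC 0xB4549A2C

    typedef struct { uint32_t magic; int32_t ndeps; } DepsHeader;

    int
    main(void)
    {
        int16_t    deps[][2] = {{1, 2}, {1, 3}};   /* (a => b) attnum pairs */
        DepsHeader hdr = {DEPS_MAGIC, 2};
        char       buf[sizeof(DepsHeader) + sizeof(deps)];
        char      *p = buf;
        DepsHeader out;
        int16_t    pair[2];
        int        i;

        /* serialize: header first, then the attnum pairs */
        memcpy(p, &hdr, sizeof(hdr));
        p += sizeof(hdr);
        memcpy(p, deps, sizeof(deps));

        /* deserialize: read the header back, sanity-check the magic */
        p = buf;
        memcpy(&out, p, sizeof(out));
        p += sizeof(out);
        assert(out.magic == DEPS_MAGIC);

        for (i = 0; i < out.ndeps; i++)
        {
            memcpy(pair, p, sizeof(pair));
            p += sizeof(pair);
            printf("%d => %d\n", pair[0], pair[1]);
        }

        return 0;
    }
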
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index 8a55a09..4d6edb6 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -79,6 +79,7 @@ typedef struct RelationData
bool rd_isvalid; /* relcache entry is valid */
char rd_indexvalid; /* state of rd_indexlist: 0 = not valid, 1 =
* valid, 2 = temporarily forced */
+ bool rd_mvstatvalid; /* is rd_mvstatlist valid? */
/*
* rd_createSubid is the ID of the highest subtransaction the rel has
@@ -111,6 +112,9 @@ typedef struct RelationData
List *rd_indexlist; /* list of OIDs of indexes on relation */
Oid rd_oidindex; /* OID of unique index on OID, if any */
Oid rd_replidindex; /* OID of replica identity index, if any */
+
+ /* data managed by RelationGetMVStatList: */
+ List *rd_mvstatlist; /* list of OIDs of multivariate stats */
/* data managed by RelationGetIndexAttrBitmap: */
Bitmapset *rd_indexattr; /* identifies columns used in indexes */
diff --git a/src/include/utils/relcache.h b/src/include/utils/relcache.h
index 6953281..77efeff 100644
--- a/src/include/utils/relcache.h
+++ b/src/include/utils/relcache.h
@@ -38,6 +38,7 @@ extern void RelationClose(Relation relation);
* Routines to compute/retrieve additional cached information
*/
extern List *RelationGetIndexList(Relation relation);
+extern List *RelationGetMVStatList(Relation relation);
extern Oid RelationGetOidIndex(Relation relation);
extern Oid RelationGetReplicaIndex(Relation relation);
extern List *RelationGetIndexExpressions(Relation relation);
diff --git a/src/include/utils/syscache.h b/src/include/utils/syscache.h
index 2dbd384..814269b 100644
--- a/src/include/utils/syscache.h
+++ b/src/include/utils/syscache.h
@@ -66,6 +66,7 @@ enum SysCacheIdentifier
INDEXRELID,
LANGNAME,
LANGOID,
+ MVSTATOID,
NAMESPACENAME,
NAMESPACEOID,
OPERNAMENSP,
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 60c1f40..a12ad30 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1363,6 +1363,14 @@ pg_matviews| SELECT n.nspname AS schemaname,
LEFT JOIN pg_namespace n ON ((n.oid = c.relnamespace)))
LEFT JOIN pg_tablespace t ON ((t.oid = c.reltablespace)))
WHERE (c.relkind = 'm'::"char");
+pg_mv_stats| SELECT n.nspname AS schemaname,
+ c.relname AS tablename,
+ s.stakeys AS attnums,
+ length(s.stadeps) AS depsbytes,
+ pg_mv_stats_dependencies_info(s.stadeps) AS depsinfo
+ FROM ((pg_mv_statistic s
+ JOIN pg_class c ON ((c.oid = s.starelid)))
+ LEFT JOIN pg_namespace n ON ((n.oid = c.relnamespace)));
pg_policies| SELECT n.nspname AS schemaname,
c.relname AS tablename,
pol.polname AS policyname,
diff --git a/src/test/regress/expected/sanity_check.out b/src/test/regress/expected/sanity_check.out
index 14acd16..d740241 100644
--- a/src/test/regress/expected/sanity_check.out
+++ b/src/test/regress/expected/sanity_check.out
@@ -113,6 +113,7 @@ pg_inherits|t
pg_language|t
pg_largeobject|t
pg_largeobject_metadata|t
+pg_mv_statistic|t
pg_namespace|t
pg_opclass|t
pg_operator|t
--
1.9.3
Attachment: 0003-clause-reduction-using-functional-dependencies-v7.patch (text/x-patch)
>From 1bc8e278cf96a33bdb5716023ae9929e4c625893 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@pgaddict.com>
Date: Mon, 6 Apr 2015 19:42:18 +0200
Subject: [PATCH 3/6] clause reduction using functional dependencies
During planning, use functional dependencies to decide which
clauses to skip during cardinality estimation. Initial and
rather simplistic implementation.
This only works with regular WHERE clauses, not with clauses
used as join clauses.
Note: The clause_is_mv_compatible() needs to identify the
relation (so that we can fetch the list of multivariate stats
by OID). planner_rt_fetch() seems like the appropriate way to
get the relation OID, but apparently it only works with simple
vars. Maybe examine_variable() would make this work with more
complex vars too?
Includes regression tests analyzing functional dependencies
(part of ANALYZE) on several datasets (no dependencies, no
transitive dependencies, ...).
Checks that a query with conditions on two columns, where one (B)
is functionally dependent on the other one (A), correctly ignores
the clause on (B) and chooses bitmap index scan instead of plain
index scan (which is what happens otherwise, thanks to assumption
of independence).
Note: Functional dependencies only work with equality clauses,
no inequalities etc.
---
src/backend/commands/tablecmds.c | 6 +
src/backend/nodes/copyfuncs.c | 1 +
src/backend/optimizer/path/clausesel.c | 911 +++++++++++++++++++++++++-
src/backend/utils/mvstats/common.c | 5 +-
src/backend/utils/mvstats/dependencies.c | 24 +
src/bin/psql/describe.c | 1 -
src/include/utils/mvstats.h | 16 +-
src/test/regress/expected/mv_dependencies.out | 172 +++++
src/test/regress/parallel_schedule | 3 +
src/test/regress/regression.diffs | 30 +
src/test/regress/regression.out | 156 +++++
src/test/regress/serial_schedule | 1 +
src/test/regress/sql/mv_dependencies.sql | 150 +++++
13 files changed, 1470 insertions(+), 6 deletions(-)
create mode 100644 src/test/regress/expected/mv_dependencies.out
create mode 100644 src/test/regress/regression.diffs
create mode 100644 src/test/regress/regression.out
create mode 100644 src/test/regress/sql/mv_dependencies.sql
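
Judging from the new function names in clausesel.c below
(build_adjacency_matrix, multiply_adjacency_matrix,
fdeps_reduce_clauses), the reduction treats the dependencies as edges
in a directed graph over the attnums and computes a transitive closure,
so that knowing [a => b] and [b => c] also makes [a => c] usable. A
standalone sketch of that closure step (Warshall-style on a small
boolean matrix; the real code additionally maps attnums to matrix
indexes first):

    #include <stdbool.h>
    #include <stdio.h>

    #define NATTS 3

    int
    main(void)
    {
        /* edge[i][j] means "column i determines column j" */
        bool edge[NATTS][NATTS] = {{false}};
        int  i, j, k;

        edge[0][1] = true;      /* a => b */
        edge[1][2] = true;      /* b => c */

        /* transitive closure: if i => k and k => j, then i => j */
        for (k = 0; k < NATTS; k++)
            for (i = 0; i < NATTS; i++)
                for (j = 0; j < NATTS; j++)
                    if (edge[i][k] && edge[k][j])
                        edge[i][j] = true;

        for (i = 0; i < NATTS; i++)
            for (j = 0; j < NATTS; j++)
                if (edge[i][j])
                    printf("col %d => col %d\n", i, j);

        return 0;
    }
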
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index d6c6f8e..107e9fc 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -11980,6 +11980,12 @@ static void ATExecAddStatistics(AlteredTableInfo *tab, Relation rel,
opt->defname)));
}
+ /* check that at least some statistics were requested */
+ if (! build_dependencies)
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("no statistics type (dependencies) was requested")));
+
/* sort the attnums and build int2vector */
qsort(attnums, numcols, sizeof(int16), compare_int16);
stakeys = buildint2vector(attnums, numcols);
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index e5a3d96..36094c0 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -4949,6 +4949,7 @@ copyObject(const void *from)
break;
case T_TableSampleClause:
retval = _copyTableSampleClause(from);
+ break;
case T_StatisticsDef:
retval = _copyStatisticsDef(from);
break;
diff --git a/src/backend/optimizer/path/clausesel.c b/src/backend/optimizer/path/clausesel.c
index dcac1c1..6365425 100644
--- a/src/backend/optimizer/path/clausesel.c
+++ b/src/backend/optimizer/path/clausesel.c
@@ -14,15 +14,19 @@
*/
#include "postgres.h"
+#include "access/sysattr.h"
#include "catalog/pg_operator.h"
#include "nodes/makefuncs.h"
#include "optimizer/clauses.h"
#include "optimizer/cost.h"
#include "optimizer/pathnode.h"
#include "optimizer/plancat.h"
+#include "optimizer/var.h"
#include "utils/fmgroids.h"
#include "utils/lsyscache.h"
+#include "utils/mvstats.h"
#include "utils/selfuncs.h"
+#include "utils/typcache.h"
/*
@@ -42,6 +46,44 @@ typedef struct RangeQueryClause
static void addRangeClause(RangeQueryClause **rqlist, Node *clause,
bool varonleft, bool isLTsel, Selectivity s2);
+#define MV_CLAUSE_TYPE_FDEP 0x01
+
+static bool clause_is_mv_compatible(PlannerInfo *root, Node *clause, Oid varRelid,
+ Index *relid, AttrNumber *attnum, SpecialJoinInfo *sjinfo);
+
+static Bitmapset *collect_mv_attnums(PlannerInfo *root, List *clauses,
+ Oid varRelid, Index *relid, SpecialJoinInfo *sjinfo);
+
+static List *clauselist_apply_dependencies(PlannerInfo *root, List *clauses,
+ Oid varRelid, List *stats,
+ SpecialJoinInfo *sjinfo);
+
+static bool has_stats(List *stats, int type);
+
+static List * find_stats(PlannerInfo *root, List *clauses,
+ Oid varRelid, Index *relid);
+
+static Bitmapset* fdeps_collect_attnums(List *stats);
+
+static int *make_idx_to_attnum_mapping(Bitmapset *attnums);
+static int *make_attnum_to_idx_mapping(Bitmapset *attnums);
+
+static bool *build_adjacency_matrix(List *stats, Bitmapset *attnums,
+ int *idx_to_attnum, int *attnum_to_idx);
+
+static void multiply_adjacency_matrix(bool *matrix, int natts);
+
+static List* fdeps_reduce_clauses(List *clauses,
+ Bitmapset *attnums, bool *matrix,
+ int *idx_to_attnum, int *attnum_to_idx,
+ Index relid);
+
+static Bitmapset *fdeps_filter_clauses(PlannerInfo *root,
+ List *clauses, Bitmapset *deps_attnums,
+ List **reduced_clauses, List **deps_clauses,
+ Oid varRelid, Index *relid, SpecialJoinInfo *sjinfo);
+
+static Bitmapset * get_varattnos(Node * node, Index relid);
/****************************************************************************
* ROUTINES TO COMPUTE SELECTIVITIES
@@ -61,7 +103,7 @@ static void addRangeClause(RangeQueryClause **rqlist, Node *clause,
* subclauses. However, that's only right if the subclauses have independent
* probabilities, and in reality they are often NOT independent. So,
* we want to be smarter where we can.
-
+ *
* Currently, the only extra smarts we have is to recognize "range queries",
* such as "x > 34 AND x < 42". Clauses are recognized as possible range
* query components if they are restriction opclauses whose operators have
@@ -88,6 +130,88 @@ static void addRangeClause(RangeQueryClause **rqlist, Node *clause,
*
* Of course this is all very dependent on the behavior of
* scalarltsel/scalargtsel; perhaps some day we can generalize the approach.
+ *
+ *
+ * Multivariate statistics
+ * -----------------------
+ * This also uses multivariate stats to estimate combinations of
+ * conditions, in a way (a) maximizing the estimate accuracy by using
+ * as many stats as possible, and (b) minimizing the overhead,
+ * especially when there are no suitable multivariate stats (so if you
+ * are not using multivariate stats, there's no additional overhead).
+ *
+ * The following checks are performed (in this order), and the optimizer
+ * falls back to regular stats on the first 'false'.
+ *
+ * NOTE: This explains how this works with all the patches applied, not
+ * just the functional dependencies.
+ *
+ * (0) check if there are multivariate stats on the relation
+ *
+ * If no, just skip all the following steps (directly to the
+ * original code).
+ *
+ * (1) check how many attributes there are in conditions compatible
+ * with functional dependencies
+ *
+ * Only simple equality clauses are considered compatible with
+ * functional dependencies (and that's unlikely to change, because
+ * that's the only case when functional dependencies are useful).
+ *
+ * If there are no conditions that might be handled by multivariate
+ * stats, or if the conditions reference just a single column, it
+ * makes no sense to use functional dependencies, so skip to (4).
+ *
+ * (2) reduce the clauses using functional dependencies
+ *
+ * This simply attempts to 'reduce' the clauses by applying functional
+ * dependencies. For example if there are two clauses:
+ *
+ * WHERE (a = 1) AND (b = 2)
+ *
+ * and we know that 'a' determines the value of 'b', we may remove
+ * the second condition (b = 2) when computing the selectivity.
+ * This is of course tricky - see mvstats/dependencies.c for details.
+ *
+ * After the reduction, step (1) is to be repeated.
+ *
+ * (3) check which conditions are compatible with MCV lists and
+ * histograms
+ *
+ * What conditions are compatible with multivariate stats is decided
+ * by clause_is_mv_compatible(). At this moment, only conditions
+ * of the form "column operator constant" (for simple comparison
+ * operators), IS [NOT] NULL and some AND/OR clauses are considered
+ * compatible with multivariate statistics.
+ *
+ * Again, see clause_is_mv_compatible() for details.
+ *
+ * (4) check how many attributes there are in conditions compatible
+ * with MCV lists and histograms
+ *
+ * If there are no conditions that might be handled by MCV lists
+ * or histograms, or if the conditions reference just a single
+ * column, it makes no sense to continue, so just skip to (7).
+ *
+ * (5) choose the stats matching the most columns
+ *
+ * If there are multiple instances of multivariate statistics (e.g.
+ * built on different sets of columns), we choose the stats covering
+ * the most columns from step (1). It may happen that all available
+ * stats match just a single column - for example with conditions
+ *
+ * WHERE a = 1 AND b = 2
+ *
+ * and statistics built on (a,c) and (b,c). In such case just fall
+ * back to the regular stats because it makes no sense to use the
+ * multivariate statistics.
+ *
+ * For more details about how exactly we choose the stats, see
+ * choose_mv_statistics().
+ *
+ * (6) use the multivariate stats to estimate matching clauses
+ *
+ * (7) estimate the remaining clauses using the regular statistics
*/
Selectivity
clauselist_selectivity(PlannerInfo *root,
@@ -100,6 +224,16 @@ clauselist_selectivity(PlannerInfo *root,
RangeQueryClause *rqlist = NULL;
ListCell *l;
+ /* processing mv stats */
+ Oid relid = InvalidOid;
+
+ /* attributes in mv-compatible clauses */
+ Bitmapset *mvattnums = NULL;
+ List *stats = NIL;
+
+ /* use clauses (not conditions), because those are always non-empty */
+ stats = find_stats(root, clauses, varRelid, &relid);
+
/*
* If there's exactly one clause, then no use in trying to match up pairs,
* so just go directly to clause_selectivity().
@@ -109,6 +243,31 @@ clauselist_selectivity(PlannerInfo *root,
varRelid, jointype, sjinfo);
/*
+ * Check that there are some stats with functional dependencies
+ * built (by walking the stats list). We're going to find that
+ * anyway when trying to apply the functional dependencies, but
+ * this is probably a tad faster.
+ */
+ if (has_stats(stats, MV_CLAUSE_TYPE_FDEP))
+ {
+ /* collect attributes referenced by mv-compatible clauses */
+ mvattnums = collect_mv_attnums(root, clauses, varRelid, &relid, sjinfo);
+
+ /*
+ * If there are mv-compatible clauses, referencing at least two
+ * different columns (otherwise it makes no sense to use mv stats),
+ * try to reduce the clauses using functional dependencies, and
+ * recollect the attributes from the reduced list.
+ *
+ * We don't need to select a single statistics for this - we can
+ * apply all the functional dependencies we have.
+ */
+ if (bms_num_members(mvattnums) >= 2)
+ clauses = clauselist_apply_dependencies(root, clauses, varRelid,
+ stats, sjinfo);
+ }
+
+ /*
* Initial scan over clauses. Anything that doesn't look like a potential
* rangequery clause gets multiplied into s1 and forgotten. Anything that
* does gets inserted into an rqlist entry.
@@ -782,3 +941,753 @@ clause_selectivity(PlannerInfo *root,
return s1;
}
+
+/*
+ * Collect attributes from mv-compatible clauses.
+ */
+static Bitmapset *
+collect_mv_attnums(PlannerInfo *root, List *clauses, Oid varRelid,
+ Index *relid, SpecialJoinInfo *sjinfo)
+{
+ Bitmapset *attnums = NULL;
+ ListCell *l;
+
+ /*
+ * Walk through the clauses and identify the ones we can estimate
+ * using multivariate stats, and remember the relid/columns. We'll
+ * then cross-check if we have suitable stats, and only if needed
+ * we'll split the clauses into multivariate and regular lists.
+ *
+ * For now we're only interested in RestrictInfo nodes with nested
+ * OpExpr, using either a range or equality.
+ */
+ foreach (l, clauses)
+ {
+ AttrNumber attnum;
+ Node *clause = (Node *) lfirst(l);
+
+ /* ignore the result for now - we only need the info */
+ if (clause_is_mv_compatible(root, clause, varRelid, relid, &attnum, sjinfo))
+ attnums = bms_add_member(attnums, attnum);
+ }
+
+ /*
+ * If there are not at least two attributes referenced by the clause(s),
+ * we can throw everything out (as we'll revert to simple stats).
+ */
+ if (bms_num_members(attnums) <= 1)
+ {
+ if (attnums != NULL)
+ pfree(attnums);
+ attnums = NULL;
+ *relid = InvalidOid;
+ }
+
+ return attnums;
+}
+
+/*
+ * Determines whether the clause is compatible with multivariate stats,
+ * and if it is, returns some additional information - varno (index
+ * into simple_rte_array) and the attnum of the referenced column.
+ * This is then used to fetch related multivariate statistics.
+ *
+ * At this moment we only support basic conditions of the form
+ *
+ * variable OP constant
+ *
+ * where OP is '=' (determined by looking at the associated function
+ * for estimating selectivity, just like with the single-dimensional
+ * case; for functional dependencies, only equality is useful anyway).
+ *
+ * TODO Support 'OR clauses' - shouldn't be all that difficult to
+ * evaluate them using multivariate stats.
+ */
+static bool
+clause_is_mv_compatible(PlannerInfo *root, Node *clause, Oid varRelid,
+ Index *relid, AttrNumber *attnum, SpecialJoinInfo *sjinfo)
+{
+
+ if (IsA(clause, RestrictInfo))
+ {
+ RestrictInfo *rinfo = (RestrictInfo *) clause;
+
+ /* Pseudoconstants are not really interesting here. */
+ if (rinfo->pseudoconstant)
+ return false;
+
+ /* no support for OR clauses at this point */
+ if (rinfo->orclause)
+ return false;
+
+ /* get the actual clause from the RestrictInfo (it's not an OR clause) */
+ clause = (Node*)rinfo->clause;
+
+ /* only simple opclauses are compatible with multivariate stats */
+ if (! is_opclause(clause))
+ return false;
+
+ /* we don't support join conditions at this moment */
+ if (treat_as_join_clause(clause, rinfo, varRelid, sjinfo))
+ return false;
+
+ /* is it 'variable op constant' ? */
+ if (list_length(((OpExpr *) clause)->args) == 2)
+ {
+ OpExpr *expr = (OpExpr *) clause;
+ bool varonleft = true;
+ bool ok;
+
+ ok = (bms_membership(rinfo->clause_relids) == BMS_SINGLETON) &&
+ (is_pseudo_constant_clause_relids(lsecond(expr->args),
+ rinfo->right_relids) ||
+ (varonleft = false,
+ is_pseudo_constant_clause_relids(linitial(expr->args),
+ rinfo->left_relids)));
+
+ if (ok)
+ {
+ Var * var = (varonleft) ? linitial(expr->args) : lsecond(expr->args);
+
+ /*
+ * Simple variables only - otherwise the planner_rt_fetch seems to fail
+ * (return NULL).
+ *
+ * TODO Maybe using examine_variable() would fix that?
+ */
+ if (! (IsA(var, Var) && (varRelid == 0 || varRelid == var->varno)))
+ return false;
+
+ /*
+ * Only consider this variable if (varRelid == 0) or when the varno
+ * matches varRelid (see explanation at clause_selectivity).
+ *
+ * FIXME I suspect this may not be really necessary. The (varRelid == 0)
+ * part seems to be enforced by treat_as_join_clause().
+ */
+ if (! ((varRelid == 0) || (varRelid == var->varno)))
+ return false;
+
+ /* Also skip special varno values, and system attributes ... */
+ if ((IS_SPECIAL_VARNO(var->varno)) || (! AttrNumberIsForUserDefinedAttr(var->varattno)))
+ return false;
+
+ *relid = var->varno;
+
+ /*
+ * If it's not a "<" or ">" or "=" operator, just ignore the
+ * clause. Otherwise note the relid and attnum for the variable.
+ * This uses the function for estimating selectivity, not the
+ * operator directly (a bit awkward, but well ...).
+ */
+ switch (get_oprrest(expr->opno))
+ {
+ case F_EQSEL:
+ *attnum = var->varattno;
+ return true;
+ }
+ }
+ }
+ }
+
+ return false;
+
+}
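+
+/*
+ * For illustration: given "WHERE a = 1 AND b < 2 AND (c = 3 OR d = 4)"
+ * only the (a = 1) clause is mv-compatible here - the OR clause is
+ * rejected right away, and the inequality fails the F_EQSEL check.
+ */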
+
+/*
+ * Performs reduction of clauses using functional dependencies, i.e.
+ * removes clauses that are considered redundant. It simply walks
+ * through dependencies, and checks whether the dependency 'matches'
+ * the clauses, i.e. if there's a clause matching the condition. If yes,
+ * all clauses matching the implied part of the dependency are removed
+ * from the list.
+ *
+ * This simply looks at attnums referenced by the clauses, not at the
+ * type of the operator (equality, inequality, ...). This may not be the
+ * right way to do it - it certainly works best for equalities, which is
+ * naturally consistent with functional dependencies (implications).
+ * It's not clear that other operators are handled sensibly - for
+ * example for inequalities, like
+ *
+ * WHERE (A >= 10) AND (B <= 20)
+ *
+ * and a trivial case where [A == B], resulting in a symmetric pair of
+ * rules [A => B], [B => A], it's rather clear we can't remove either of
+ * those clauses.
+ *
+ * That only highlights that functional dependencies are most suitable
+ * for label-like data, where using non-equality operators is very rare.
+ * Using the common city/zipcode example, clauses like
+ *
+ * (zipcode <= 12345)
+ *
+ * or
+ *
+ * (cityname >= 'Washington')
+ *
+ * are rare. So restricting the reduction to equality should not harm
+ * the usefulness / applicability.
+ *
+ * The other assumption is that the clauses are 'compatible'. With
+ * a mismatching zip code and city name, for example, this is unable
+ * to identify the discrepancy and still eliminates one of the clauses.
+ * The usual approach (multiplying both selectivities) thus produces a
+ * more accurate estimate, although mostly by luck - the multiplication
+ * comes from the assumption of statistical independence of the two
+ * conditions (which is not valid in this case), but moves the
+ * estimate in the right direction (towards 0%).
+ *
+ * This might be somewhat improved by cross-checking the selectivities
+ * against MCV and/or histogram.
+ *
+ * The implementation needs to be careful about cyclic rules, i.e. rules
+ * like [A => B] and [B => A] at the same time. This must not reduce
+ * clauses on both attributes at the same time.
+ *
+ * Technically we might consider selectivities here too, somehow. E.g.
+ * when (A => B) and (B => A), we might use the clauses with minimum
+ * selectivity.
+ *
+ * TODO Consider restricting the reduction to equality clauses. Or maybe
+ * use equality classes somehow?
+ *
+ * TODO Merge this docs to dependencies.c, as it's saying mostly the
+ * same things as the comments there.
+ *
+ * TODO Currently this is applied only to the top-level clauses, but
+ * maybe we could apply it to lists at subtrees too, e.g. to the
+ * two AND-clauses in
+ *
+ * (x=1 AND y=2) OR (z=3 AND q=10)
+ *
+ */
+static List *
+clauselist_apply_dependencies(PlannerInfo *root, List *clauses,
+ Oid varRelid, List *stats,
+ SpecialJoinInfo *sjinfo)
+{
+ List *reduced_clauses = NIL;
+ Index relid;
+
+ /*
+ * matrix of (natts x natts), 1 means x=>y
+ *
+ * This serves two purposes - first, it merges dependencies from all
+ * the statistics, second it makes generating all the transitive
+ * dependencies easier.
+ *
+ * We need to build this only for attributes from the dependencies,
+ * not for all attributes in the table.
+ *
+ * We can't do that only for attributes from the clauses, because we
+ * want to build transitive dependencies (including those going
+ * through attributes not listed in the stats).
+ *
+ * This only works for A=>B dependencies; it's not clear how to do
+ * that for more complex (multi-column) dependencies.
+ */
+ bool *deps_matrix;
+ int deps_natts; /* size of the matrix */
+
+ /* mapping attnum <=> matrix index */
+ int *deps_idx_to_attnum;
+ int *deps_attnum_to_idx;
+
+ /* attnums in dependencies and clauses (and intersection) */
+ List *deps_clauses = NIL;
+ Bitmapset *deps_attnums = NULL;
+ Bitmapset *clause_attnums = NULL;
+ Bitmapset *intersect_attnums = NULL;
+
+ /*
+ * Is there at least one statistics with functional dependencies?
+ * If not, return the original clauses right away.
+ *
+ * XXX Isn't this pointless, thanks to exactly the same check in
+ * clauselist_selectivity()? Can we trigger the condition here?
+ */
+ if (! has_stats(stats, MV_CLAUSE_TYPE_FDEP))
+ return clauses;
+
+ /*
+ * Build the dependency matrix, i.e. attribute adjacency matrix,
+ * where 1 means (a=>b). Once we have the adjacency matrix, we'll
+ * multiply it by itself, to get transitive dependencies.
+ *
+ * Note: This is pretty much transitive closure from graph theory.
+ *
+ * First, let's see what attributes are covered by functional
+ * dependencies (sides of the adjacency matrix), and also a maximum
+ * attribute (size of mapping to simple integer indexes);
+ */
+ deps_attnums = fdeps_collect_attnums(stats);
+
+ /*
+ * Walk through the clauses - clauses that are (one of)
+ *
+ * (a) not mv-compatible
+ * (b) using more than a single attnum
+ * (c) using an attnum not covered by functional dependencies
+ *
+ * may be copied directly to the result. The interesting clauses are
+ * kept in 'deps_clauses' and will be processed later.
+ */
+ clause_attnums = fdeps_filter_clauses(root, clauses, deps_attnums,
+ &reduced_clauses, &deps_clauses,
+ varRelid, &relid, sjinfo);
+
+ /*
+ * we need at least two clauses referencing two different attributes
+ * to do the reduction
+ */
+ if ((list_length(deps_clauses) < 2) || (bms_num_members(clause_attnums) < 2))
+ {
+ bms_free(clause_attnums);
+ list_free(reduced_clauses);
+ list_free(deps_clauses);
+
+ return clauses;
+ }
+
+
+ /*
+ * We need at least two matching attributes in the clauses and
+ * dependencies, otherwise we can't really reduce anything.
+ */
+ intersect_attnums = bms_intersect(clause_attnums, deps_attnums);
+ if (bms_num_members(intersect_attnums) < 2)
+ {
+ bms_free(clause_attnums);
+ bms_free(deps_attnums);
+ bms_free(intersect_attnums);
+
+ list_free(deps_clauses);
+ list_free(reduced_clauses);
+
+ return clauses;
+ }
+
+ /*
+ * Build mapping between matrix indexes and attnums, and then the
+ * adjacency matrix itself.
+ */
+ deps_idx_to_attnum = make_idx_to_attnum_mapping(deps_attnums);
+ deps_attnum_to_idx = make_attnum_to_idx_mapping(deps_attnums);
+
+ /* build the adjacency matrix */
+ deps_matrix = build_adjacency_matrix(stats, deps_attnums,
+ deps_idx_to_attnum,
+ deps_attnum_to_idx);
+
+ deps_natts = bms_num_members(deps_attnums);
+
+ /*
+ * Multiply the matrix N-times (N = size of the matrix), so that we
+ * get all the transitive dependencies. That makes the next step
+ * much easier and faster.
+ *
+ * This is essentially an adjacency matrix from graph theory, and
+ * by multiplying it we get transitive edges. We don't really care
+ * about the exact number (number of paths between vertices) though,
+ * so we can do the multiplication in-place (we don't care whether
+ * we found the dependency in this round or in the previous one).
+ *
+ * Track how many new dependencies were added, and stop when 0, but
+ * we can't multiply more than N-times (longest path in the graph).
+ */
+ multiply_adjacency_matrix(deps_matrix, deps_natts);
+
+ /*
+ * Walk through the clauses, and see which other clauses we may
+ * reduce. The matrix contains all transitive dependencies, which
+ * makes this very fast.
+ *
+ * We have to be careful not to reduce the clause using itself, or
+ * reducing all clauses forming a cycle (so we have to skip already
+ * eliminated clauses).
+ *
+ * I'm not sure whether this guarantees finding the best solution,
+ * i.e. reducing the most clauses, but it probably does (thanks to
+ * having all the transitive dependencies).
+ */
+ deps_clauses = fdeps_reduce_clauses(deps_clauses,
+ deps_attnums, deps_matrix,
+ deps_idx_to_attnum,
+ deps_attnum_to_idx, relid);
+
+ /* join the two lists of clauses */
+ reduced_clauses = list_union(reduced_clauses, deps_clauses);
+
+ pfree(deps_matrix);
+ pfree(deps_idx_to_attnum);
+ pfree(deps_attnum_to_idx);
+
+ bms_free(deps_attnums);
+ bms_free(clause_attnums);
+ bms_free(intersect_attnums);
+
+ return reduced_clauses;
+}
+
+static bool
+has_stats(List *stats, int type)
+{
+ ListCell *s;
+
+ foreach (s, stats)
+ {
+ MVStatisticInfo *stat = (MVStatisticInfo *)lfirst(s);
+
+ if ((type & MV_CLAUSE_TYPE_FDEP) && stat->deps_built)
+ return true;
+ }
+
+ return false;
+}
+
+/*
+ * Determine the relid (either from varRelid or from the clauses) and
+ * then look up stats using it.
+ */
+static List *
+find_stats(PlannerInfo *root, List *clauses, Oid varRelid, Index *relid)
+{
+ /* unknown relid by default */
+ *relid = InvalidOid;
+
+ /*
+ * First we need to find the relid (index into simple_rel_array).
+ * If varRelid is not 0, we already have it, otherwise we have to
+ * look it up from the clauses.
+ */
+ if (varRelid != 0)
+ *relid = varRelid;
+ else
+ {
+ Relids relids = pull_varnos((Node*)clauses);
+
+ /*
+ * We only expect 0 or 1 members in the bitmapset. If there are
+ * no vars, we'll get empty bitmapset, otherwise we'll get the
+ * relid as the single member.
+ *
+ * FIXME For some reason we can get 2 relids here (e.g. \d in
+ * psql does that).
+ */
+ if (bms_num_members(relids) == 1)
+ *relid = bms_singleton_member(relids);
+
+ bms_free(relids);
+ }
+
+ /*
+ * if we found the relid, we can get the stats from simple_rel_array
+ *
+ * This only gets stats that are already built, because that's how
+ * we load it into RelOptInfo (see get_relation_info), but we don't
+ * detoast the whole stats yet. That'll be done later, after we
+ * decide which stats to use.
+ */
+ if (*relid != InvalidOid)
+ return root->simple_rel_array[*relid]->mvstatlist;
+
+ return NIL;
+}
+
+static Bitmapset*
+fdeps_collect_attnums(List *stats)
+{
+ ListCell *lc;
+ Bitmapset *attnums = NULL;
+
+ foreach (lc, stats)
+ {
+ int j;
+ MVStatisticInfo *info = (MVStatisticInfo *)lfirst(lc);
+
+ int2vector *stakeys = info->stakeys;
+
+ /* skip stats without functional dependencies built */
+ if (! info->deps_built)
+ continue;
+
+ for (j = 0; j < stakeys->dim1; j++)
+ attnums = bms_add_member(attnums, stakeys->values[j]);
+ }
+
+ return attnums;
+}
+
+
+static int*
+make_idx_to_attnum_mapping(Bitmapset *attnums)
+{
+ int attidx = 0;
+ int attnum = -1;
+
+ int *mapping = (int*)palloc0(bms_num_members(attnums) * sizeof(int));
+
+ while ((attnum = bms_next_member(attnums, attnum)) >= 0)
+ mapping[attidx++] = attnum;
+
+ Assert(attidx == bms_num_members(attnums));
+
+ return mapping;
+}
+
+static int*
+make_attnum_to_idx_mapping(Bitmapset *attnums)
+{
+ int attidx = 0;
+ int attnum = -1;
+ int maxattnum = -1;
+ int *mapping;
+
+ while ((attnum = bms_next_member(attnums, attnum)) >= 0)
+ maxattnum = attnum;
+
+ mapping = (int*)palloc0((maxattnum+1) * sizeof(int));
+
+ attnum = -1;
+ while ((attnum = bms_next_member(attnums, attnum)) >= 0)
+ mapping[attnum] = attidx++;
+
+ Assert(attidx == bms_num_members(attnums));
+
+ return mapping;
+}
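+
+/*
+ * For illustration: with attnums {2,5,7} the mappings built above are
+ *
+ * idx_to_attnum: [0] = 2, [1] = 5, [2] = 7
+ * attnum_to_idx: [2] = 0, [5] = 1, [7] = 2 (other slots unused)
+ */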
+
+static bool*
+build_adjacency_matrix(List *stats, Bitmapset *attnums,
+ int *idx_to_attnum, int *attnum_to_idx)
+{
+ ListCell *lc;
+ int natts = bms_num_members(attnums);
+ bool *matrix = (bool*)palloc0(natts * natts * sizeof(bool));
+
+ foreach (lc, stats)
+ {
+ int j;
+ MVStatisticInfo *stat = (MVStatisticInfo *)lfirst(lc);
+ MVDependencies dependencies = NULL;
+
+ /* skip stats without functional dependencies built */
+ if (! stat->deps_built)
+ continue;
+
+ /* fetch and deserialize dependencies */
+ dependencies = load_mv_dependencies(stat->mvoid);
+ if (dependencies == NULL)
+ {
+ elog(WARNING, "failed to deserialize func deps %d", stat->mvoid);
+ continue;
+ }
+
+ /* set matrix[a,b] to 'true' if 'a=>b' */
+ for (j = 0; j < dependencies->ndeps; j++)
+ {
+ int aidx = attnum_to_idx[dependencies->deps[j]->a];
+ int bidx = attnum_to_idx[dependencies->deps[j]->b];
+
+ /* a=> b */
+ matrix[aidx * natts + bidx] = true;
+ }
+ }
+
+ return matrix;
+}
+
+static void
+multiply_adjacency_matrix(bool *matrix, int natts)
+{
+ int i;
+
+ for (i = 0; i < natts; i++)
+ {
+ int k, l, m;
+ int nchanges = 0;
+
+ /* k => l */
+ for (k = 0; k < natts; k++)
+ {
+ for (l = 0; l < natts; l++)
+ {
+ /* we already have this dependency */
+ if (matrix[k * natts + l])
+ continue;
+
+ /* we don't really care about the exact value, just 0/1 */
+ for (m = 0; m < natts; m++)
+ {
+ if (matrix[k * natts + m] * matrix[m * natts + l])
+ {
+ matrix[k * natts + l] = true;
+ nchanges += 1;
+ break;
+ }
+ }
+ }
+ }
+
+ /* no transitive dependency added here, so terminate */
+ if (nchanges == 0)
+ break;
+ }
+}
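+
+/*
+ * For illustration: with dependencies (a => b) and (b => c), the first
+ * pass adds the transitive (a => c):
+ *
+ * a b c a b c
+ * a . 1 . a . 1 1
+ * b . . 1 ==> b . . 1
+ * c . . . c . . .
+ *
+ * and the second pass adds nothing, terminating the loop early.
+ */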
+
+static List*
+fdeps_reduce_clauses(List *clauses, Bitmapset *attnums, bool *matrix,
+ int *idx_to_attnum, int *attnum_to_idx, Index relid)
+{
+ int i;
+ ListCell *lc;
+ List *reduced_clauses = NIL;
+
+ int nmvclauses; /* size of the arrays */
+ bool *reduced;
+ AttrNumber *mvattnums;
+ Node **mvclauses;
+
+ int natts = bms_num_members(attnums);
+
+ /*
+ * Preallocate space for all clauses (the list only contains
+ * compatible clauses at this point). This makes it somewhat easier
+ * to access the stats / attnums randomly.
+ *
+ * XXX This assumes each clause references exactly one Var, so the
+ * arrays are sized accordingly - for functional dependencies
+ * this is safe, because it only works with Var=Const.
+ */
+ mvclauses = (Node**)palloc0(list_length(clauses) * sizeof(Node*));
+ mvattnums = (AttrNumber*)palloc0(list_length(clauses) * sizeof(AttrNumber));
+ reduced = (bool*)palloc0(list_length(clauses) * sizeof(bool));
+
+ /* fill the arrays */
+ nmvclauses = 0;
+ foreach (lc, clauses)
+ {
+ Node * clause = (Node*)lfirst(lc);
+ Bitmapset * attnums = get_varattnos(clause, relid);
+
+ mvclauses[nmvclauses] = clause;
+ mvattnums[nmvclauses] = bms_singleton_member(attnums);
+ nmvclauses++;
+ }
+
+ Assert(nmvclauses == list_length(clauses));
+
+ /* now try to reduce the clauses (using the dependencies) */
+ for (i = 0; i < nmvclauses; i++)
+ {
+ int j;
+
+ /* not covered by dependencies */
+ if (! bms_is_member(mvattnums[i], attnums))
+ continue;
+
+ /* this clause was already reduced, so let's skip it */
+ if (reduced[i])
+ continue;
+
+ /* walk the potentially 'implied' clauses */
+ for (j = 0; j < nmvclauses; j++)
+ {
+ int aidx, bidx;
+
+ /* not covered by dependencies */
+ if (! bms_is_member(mvattnums[j], attnums))
+ continue;
+
+ aidx = attnum_to_idx[mvattnums[i]];
+ bidx = attnum_to_idx[mvattnums[j]];
+
+ /* can't reduce the clause by itself, or if already reduced */
+ if ((i == j) || reduced[j])
+ continue;
+
+ /* mark the clause as reduced (if aidx => bidx) */
+ reduced[j] = matrix[aidx * natts + bidx];
+ }
+ }
+
+ /* now walk through the clauses, and keep only those not reduced */
+ for (i = 0; i < nmvclauses; i++)
+ if (! reduced[i])
+ reduced_clauses = lappend(reduced_clauses, mvclauses[i]);
+
+ pfree(reduced);
+ pfree(mvclauses);
+ pfree(mvattnums);
+
+ return reduced_clauses;
+}
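+
+/*
+ * For illustration: given clauses (a = 1), (b = 2) and (c = 3), and a
+ * (transitive) matrix with a => b, a => c and b => c, the clause on
+ * 'a' reduces the clauses on 'b' and 'c', so only (a = 1) survives.
+ * The reduced[] flags prevent reducing a clause by itself or by an
+ * already-reduced clause, which is what breaks cycles like
+ * [a => b, b => a].
+ */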
+
+
+static Bitmapset *
+fdeps_filter_clauses(PlannerInfo *root,
+ List *clauses, Bitmapset *deps_attnums,
+ List **reduced_clauses, List **deps_clauses,
+ Oid varRelid, Index *relid, SpecialJoinInfo *sjinfo)
+{
+ ListCell *lc;
+ Bitmapset *clause_attnums = NULL;
+
+ foreach (lc, clauses)
+ {
+ AttrNumber attnum;
+ Node *clause = (Node *) lfirst(lc);
+
+ if (! clause_is_mv_compatible(root, clause, varRelid, relid,
+ &attnum, sjinfo))
+
+ /* clause incompatible with functional dependencies */
+ *reduced_clauses = lappend(*reduced_clauses, clause);
+
+ else if (! bms_is_member(attnum, deps_attnums))
+
+ /* clause not covered by the dependencies */
+ *reduced_clauses = lappend(*reduced_clauses, clause);
+
+ else
+ {
+ *deps_clauses = lappend(*deps_clauses, clause);
+ clause_attnums = bms_add_member(clause_attnums, attnum);
+ }
+ }
+
+ return clause_attnums;
+}
+
+/*
+ * Pull varattnos from the clauses, similarly to pull_varattnos() but:
+ *
+ * (a) only get attributes for a particular relation (relid)
+ * (b) ignore system attributes (we can't build stats on them anyway)
+ *
+ * This makes it possible to directly compare the result with attnum
+ * values from pg_attribute etc.
+ */
+static Bitmapset *
+get_varattnos(Node * node, Index relid)
+{
+ int k;
+ Bitmapset *varattnos = NULL;
+ Bitmapset *result = NULL;
+
+ /* get the varattnos */
+ pull_varattnos(node, relid, &varattnos);
+
+ k = -1;
+ while ((k = bms_next_member(varattnos, k)) >= 0)
+ {
+ if (k + FirstLowInvalidHeapAttributeNumber > 0)
+ result
+ = bms_add_member(result,
+ k + FirstLowInvalidHeapAttributeNumber);
+ }
+
+ bms_free(varattnos);
+
+ return result;
+}
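
To see the reduction logic without wading through the planner
plumbing, here is a small self-contained sketch (not part of the
patch) of the same closure-and-reduce idea, using attribute indexes
instead of clauses and Warshall's algorithm for the closure (which
produces the same transitive closure as the repeated multiplication
above):

    #include <stdbool.h>
    #include <stdio.h>

    #define NATTS 3

    int
    main(void)
    {
        /* deps[i][j] means "attribute i determines attribute j" */
        bool deps[NATTS][NATTS] = {{false}};
        bool reduced[NATTS] = {false};
        int  i, j, k;

        deps[0][1] = true;      /* a => b */
        deps[1][2] = true;      /* b => c */

        /* transitive closure (adds a => c) */
        for (k = 0; k < NATTS; k++)
            for (i = 0; i < NATTS; i++)
                for (j = 0; j < NATTS; j++)
                    if (deps[i][k] && deps[k][j])
                        deps[i][j] = true;

        /* a clause on j is implied by a clause on i when deps[i][j] */
        for (i = 0; i < NATTS; i++)
        {
            if (reduced[i])
                continue;
            for (j = 0; j < NATTS; j++)
                if (i != j && !reduced[j] && deps[i][j])
                    reduced[j] = true;
        }

        /* keeps the clause on 0, reduces the clauses on 1 and 2 */
        for (i = 0; i < NATTS; i++)
            printf("clause on attribute %d: %s\n",
                   i, reduced[i] ? "reduced" : "kept");

        return 0;
    }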
diff --git a/src/backend/utils/mvstats/common.c b/src/backend/utils/mvstats/common.c
index a755c49..bd200bc 100644
--- a/src/backend/utils/mvstats/common.c
+++ b/src/backend/utils/mvstats/common.c
@@ -84,7 +84,8 @@ build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
/*
* Analyze functional dependencies of columns.
*/
- deps = build_mv_dependencies(numrows, rows, attrs, stats);
+ if (stat->deps_enabled)
+ deps = build_mv_dependencies(numrows, rows, attrs, stats);
/* store the histogram / MCV list in the catalog */
update_mv_stats(stat->mvoid, deps, attrs);
@@ -163,6 +164,7 @@ list_mv_stats(Oid relid)
info->mvoid = HeapTupleGetOid(htup);
info->stakeys = buildint2vector(stats->stakeys.values, stats->stakeys.dim1);
+ info->deps_enabled = stats->deps_enabled;
info->deps_built = stats->deps_built;
result = lappend(result, info);
@@ -274,6 +276,7 @@ compare_scalars_partition(const void *a, const void *b, void *arg)
return ApplySortComparator(da, false, db, false, ssup);
}
+
/* initialize multi-dimensional sort */
MultiSortSupport
multi_sort_init(int ndims)
diff --git a/src/backend/utils/mvstats/dependencies.c b/src/backend/utils/mvstats/dependencies.c
index 84b6561..0a08d12 100644
--- a/src/backend/utils/mvstats/dependencies.c
+++ b/src/backend/utils/mvstats/dependencies.c
@@ -636,3 +636,27 @@ pg_mv_stats_dependencies_show(PG_FUNCTION_ARGS)
PG_RETURN_TEXT_P(cstring_to_text(result));
}
+
+MVDependencies
+load_mv_dependencies(Oid mvoid)
+{
+ bool isnull = false;
+ Datum deps;
+
+ /* Fetch the pg_mv_statistic tuple for the given statistics OID. */
+ HeapTuple htup = SearchSysCache1(MVSTATOID, ObjectIdGetDatum(mvoid));
+
+#ifdef USE_ASSERT_CHECKING
+ Form_pg_mv_statistic mvstat = (Form_pg_mv_statistic) GETSTRUCT(htup);
+ Assert(mvstat->deps_enabled && mvstat->deps_built);
+#endif
+
+ deps = SysCacheGetAttr(MVSTATOID, htup,
+ Anum_pg_mv_statistic_stadeps, &isnull);
+
+ Assert(!isnull);
+
+ ReleaseSysCache(htup);
+
+ return deserialize_mv_dependencies(DatumGetByteaP(deps));
+}
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 912b4f3..5f89604 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -2103,7 +2103,6 @@ describeOneTableDetails(const char *schemaname,
"SELECT oid, stakeys,\n"
" deps_enabled,\n"
" deps_built,\n"
- " mcv_max_items, hist_max_buckets,\n"
" (SELECT string_agg(attname::text,', ')\n"
" FROM ((SELECT unnest(stakeys) AS attnum) s\n"
" JOIN pg_attribute a ON (starelid = a.attrelid and a.attnum = s.attnum))) AS attnums\n"
diff --git a/src/include/utils/mvstats.h b/src/include/utils/mvstats.h
index 411cd16..02a7dda 100644
--- a/src/include/utils/mvstats.h
+++ b/src/include/utils/mvstats.h
@@ -16,12 +16,20 @@
#include "commands/vacuum.h"
+/*
+ * Degree of how much MCV item / histogram bucket matches a clause.
+ * This is then considered when computing the selectivity.
+ */
+#define MVSTATS_MATCH_NONE 0 /* no match at all */
+#define MVSTATS_MATCH_PARTIAL 1 /* partial match */
+#define MVSTATS_MATCH_FULL 2 /* full match */
#define MVSTATS_MAX_DIMENSIONS 8 /* max number of attributes */
-/* An associative rule, tracking [a => b] dependency.
- *
- * TODO Make this work with multiple columns on both sides.
+
+/*
+ * Functional dependencies, tracking column-level relationships (values
+ * in one column determine values in another one).
*/
typedef struct MVDependencyData {
int16 a;
@@ -47,6 +55,8 @@ typedef MVDependenciesData* MVDependencies;
* stats specified using flags (or something like that).
*/
+MVDependencies load_mv_dependencies(Oid mvoid);
+
bytea * serialize_mv_dependencies(MVDependencies dependencies);
/* deserialization of stats (serialization is private to analyze) */
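
Consuming the deserialized dependencies is then just a loop over the
(a => b) pairs - a sketch of the access pattern (assuming the ndeps /
deps fields of MVDependenciesData, as used by build_adjacency_matrix
in clausesel.c):

    static void
    print_dependencies(MVDependencies dependencies)
    {
        int i;

        for (i = 0; i < dependencies->ndeps; i++)
            elog(DEBUG1, "dependency: %d => %d",
                 dependencies->deps[i]->a,
                 dependencies->deps[i]->b);
    }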
diff --git a/src/test/regress/expected/mv_dependencies.out b/src/test/regress/expected/mv_dependencies.out
new file mode 100644
index 0000000..cf986e8
--- /dev/null
+++ b/src/test/regress/expected/mv_dependencies.out
@@ -0,0 +1,172 @@
+-- data type passed by value
+CREATE TABLE functional_dependencies (
+ a INT,
+ b INT,
+ c INT
+);
+-- unknown column
+ALTER TABLE functional_dependencies ADD STATISTICS (dependencies) ON (unknown_column);
+ERROR: column "unknown_column" referenced in statistics does not exist
+-- single column
+ALTER TABLE functional_dependencies ADD STATISTICS (dependencies) ON (a);
+ERROR: multivariate stats require 2 or more columns
+-- single column, duplicated
+ALTER TABLE functional_dependencies ADD STATISTICS (dependencies) ON (a, a);
+ERROR: duplicate column name in statistics definition
+-- two columns, one duplicated
+ALTER TABLE functional_dependencies ADD STATISTICS (dependencies) ON (a, a, b);
+ERROR: duplicate column name in statistics definition
+-- unknown option
+ALTER TABLE functional_dependencies ADD STATISTICS (unknown_option) ON (a, b, c);
+ERROR: unrecognized STATISTICS option "unknown_option"
+-- correct command
+ALTER TABLE functional_dependencies ADD STATISTICS (dependencies) ON (a, b, c);
+-- random data (no functional dependencies)
+INSERT INTO functional_dependencies
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+ANALYZE functional_dependencies;
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+ deps_enabled | deps_built | pg_mv_stats_dependencies_show
+--------------+------------+-------------------------------
+ t | f |
+(1 row)
+
+TRUNCATE functional_dependencies;
+-- a => b, a => c, b => c
+INSERT INTO functional_dependencies
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE functional_dependencies;
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+ deps_enabled | deps_built | pg_mv_stats_dependencies_show
+--------------+------------+-------------------------------
+ t | t | 1 => 2, 1 => 3, 2 => 3
+(1 row)
+
+TRUNCATE functional_dependencies;
+-- a => b, a => c
+INSERT INTO functional_dependencies
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE functional_dependencies;
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+ deps_enabled | deps_built | pg_mv_stats_dependencies_show
+--------------+------------+-------------------------------
+ t | t | 1 => 2, 1 => 3
+(1 row)
+
+TRUNCATE functional_dependencies;
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO functional_dependencies
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX fdeps_idx ON functional_dependencies (a, b);
+ANALYZE functional_dependencies;
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+ deps_enabled | deps_built | pg_mv_stats_dependencies_show
+--------------+------------+-------------------------------
+ t | t | 1 => 2, 1 => 3, 2 => 3
+(1 row)
+
+EXPLAIN (COSTS off)
+ SELECT * FROM functional_dependencies WHERE a = 10 AND b = 5;
+ QUERY PLAN
+---------------------------------------------
+ Bitmap Heap Scan on functional_dependencies
+ Recheck Cond: ((a = 10) AND (b = 5))
+ -> Bitmap Index Scan on fdeps_idx
+ Index Cond: ((a = 10) AND (b = 5))
+(4 rows)
+
+DROP TABLE functional_dependencies;
+-- varlena type (text)
+CREATE TABLE functional_dependencies (
+ a TEXT,
+ b TEXT,
+ c TEXT
+);
+ALTER TABLE functional_dependencies ADD STATISTICS (dependencies) ON (a, b, c);
+-- random data (no functional dependencies)
+INSERT INTO functional_dependencies
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+ANALYZE functional_dependencies;
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+ deps_enabled | deps_built | pg_mv_stats_dependencies_show
+--------------+------------+-------------------------------
+ t | f |
+(1 row)
+
+TRUNCATE functional_dependencies;
+-- a => b, a => c, b => c
+INSERT INTO functional_dependencies
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE functional_dependencies;
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+ deps_enabled | deps_built | pg_mv_stats_dependencies_show
+--------------+------------+-------------------------------
+ t | t | 1 => 2, 1 => 3, 2 => 3
+(1 row)
+
+TRUNCATE functional_dependencies;
+-- a => b, a => c
+INSERT INTO functional_dependencies
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE functional_dependencies;
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+ deps_enabled | deps_built | pg_mv_stats_dependencies_show
+--------------+------------+-------------------------------
+ t | t | 1 => 2, 1 => 3
+(1 row)
+
+TRUNCATE functional_dependencies;
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO functional_dependencies
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX fdeps_idx ON functional_dependencies (a, b);
+ANALYZE functional_dependencies;
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+ deps_enabled | deps_built | pg_mv_stats_dependencies_show
+--------------+------------+-------------------------------
+ t | t | 1 => 2, 1 => 3, 2 => 3
+(1 row)
+
+EXPLAIN (COSTS off)
+ SELECT * FROM functional_dependencies WHERE a = '10' AND b = '5';
+ QUERY PLAN
+------------------------------------------------------------
+ Bitmap Heap Scan on functional_dependencies
+ Recheck Cond: ((a = '10'::text) AND (b = '5'::text))
+ -> Bitmap Index Scan on fdeps_idx
+ Index Cond: ((a = '10'::text) AND (b = '5'::text))
+(4 rows)
+
+DROP TABLE functional_dependencies;
+-- NULL values (mix of int and text columns)
+CREATE TABLE functional_dependencies (
+ a INT,
+ b TEXT,
+ c INT,
+ d TEXT
+);
+ALTER TABLE functional_dependencies ADD STATISTICS (dependencies) ON (a, b, c, d);
+INSERT INTO functional_dependencies
+ SELECT
+ mod(i, 100),
+ (CASE WHEN mod(i, 200) = 0 THEN NULL ELSE mod(i,200) END),
+ mod(i, 400),
+ (CASE WHEN mod(i, 300) = 0 THEN NULL ELSE mod(i,600) END)
+ FROM generate_series(1,10000) s(i);
+ANALYZE functional_dependencies;
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+ deps_enabled | deps_built | pg_mv_stats_dependencies_show
+--------------+------------+----------------------------------------
+ t | t | 2 => 1, 3 => 1, 3 => 2, 4 => 1, 4 => 2
+(1 row)
+
+DROP TABLE functional_dependencies;
diff --git a/src/test/regress/parallel_schedule b/src/test/regress/parallel_schedule
index 91780cd..11d9d38 100644
--- a/src/test/regress/parallel_schedule
+++ b/src/test/regress/parallel_schedule
@@ -110,3 +110,6 @@ test: event_trigger
# run stats by itself because its delay may be insufficient under heavy load
test: stats
+
+# run tests of multivariate stats
+test: mv_dependencies
diff --git a/src/test/regress/regression.diffs b/src/test/regress/regression.diffs
new file mode 100644
index 0000000..95b9cc5
--- /dev/null
+++ b/src/test/regress/regression.diffs
@@ -0,0 +1,30 @@
+*** /home/user/work/tvondra_postgres/src/test/regress/expected/rolenames.out Wed May 6 21:31:06 2015
+--- /home/user/work/tvondra_postgres/src/test/regress/results/rolenames.out Mon May 25 22:24:21 2015
+***************
+*** 38,47 ****
+--- 38,52 ----
+ ORDER BY 2;
+ $$ LANGUAGE SQL;
+ CREATE ROLE "Public";
++ ERROR: role "Public" already exists
+ CREATE ROLE "None";
++ ERROR: role "None" already exists
+ CREATE ROLE "current_user";
++ ERROR: role "current_user" already exists
+ CREATE ROLE "session_user";
++ ERROR: role "session_user" already exists
+ CREATE ROLE "user";
++ ERROR: role "user" already exists
+ CREATE ROLE current_user; -- error
+ ERROR: CURRENT_USER cannot be used as a role name here
+ LINE 1: CREATE ROLE current_user;
+***************
+*** 938,940 ****
+--- 943,946 ----
+ DROP OWNED BY testrol0, "Public", "current_user", testrol1, testrol2, testrolx CASCADE;
+ DROP ROLE testrol0, testrol1, testrol2, testrolx;
+ DROP ROLE "Public", "None", "current_user", "session_user", "user";
++ ERROR: current user cannot be dropped
+
+======================================================================
+
diff --git a/src/test/regress/regression.out b/src/test/regress/regression.out
new file mode 100644
index 0000000..bd81385
--- /dev/null
+++ b/src/test/regress/regression.out
@@ -0,0 +1,156 @@
+test tablespace ... ok
+test boolean ... ok
+test char ... ok
+test name ... ok
+test varchar ... ok
+test text ... ok
+test int2 ... ok
+test int4 ... ok
+test int8 ... ok
+test oid ... ok
+test float4 ... ok
+test float8 ... ok
+test bit ... ok
+test numeric ... ok
+test txid ... ok
+test uuid ... ok
+test enum ... ok
+test money ... ok
+test rangetypes ... ok
+test pg_lsn ... ok
+test regproc ... ok
+test strings ... ok
+test numerology ... ok
+test point ... ok
+test lseg ... ok
+test line ... ok
+test box ... ok
+test path ... ok
+test polygon ... ok
+test circle ... ok
+test date ... ok
+test time ... ok
+test timetz ... ok
+test timestamp ... ok
+test timestamptz ... ok
+test interval ... ok
+test abstime ... ok
+test reltime ... ok
+test tinterval ... ok
+test inet ... ok
+test macaddr ... ok
+test tstypes ... ok
+test comments ... ok
+test geometry ... ok
+test horology ... ok
+test regex ... ok
+test oidjoins ... ok
+test type_sanity ... ok
+test opr_sanity ... ok
+test insert ... ok
+test insert_conflict ... ok
+test create_function_1 ... ok
+test create_type ... ok
+test create_table ... ok
+test create_function_2 ... ok
+test copy ... ok
+test copyselect ... ok
+test create_misc ... ok
+test create_operator ... ok
+test create_index ... ok
+test create_view ... ok
+test create_aggregate ... ok
+test create_function_3 ... ok
+test create_cast ... ok
+test constraints ... ok
+test triggers ... ok
+test inherit ... ok
+test create_table_like ... ok
+test typed_table ... ok
+test vacuum ... ok
+test drop_if_exists ... ok
+test updatable_views ... ok
+test rolenames ... FAILED
+test sanity_check ... ok
+test errors ... ok
+test select ... ok
+test select_into ... ok
+test select_distinct ... ok
+test select_distinct_on ... ok
+test select_implicit ... ok
+test select_having ... ok
+test subselect ... ok
+test union ... ok
+test case ... ok
+test join ... ok
+test aggregates ... ok
+test groupingsets ... ok
+test transactions ... ok
+test random ... ok
+test portals ... ok
+test arrays ... ok
+test btree_index ... ok
+test hash_index ... ok
+test update ... ok
+test delete ... ok
+test namespace ... ok
+test prepared_xacts ... ok
+test brin ... ok
+test gin ... ok
+test gist ... ok
+test spgist ... ok
+test privileges ... ok
+test security_label ... ok
+test collate ... ok
+test matview ... ok
+test lock ... ok
+test replica_identity ... ok
+test rowsecurity ... ok
+test object_address ... ok
+test alter_generic ... ok
+test misc ... ok
+test psql ... ok
+test async ... ok
+test rules ... ok
+test select_views ... ok
+test portals_p2 ... ok
+test foreign_key ... ok
+test cluster ... ok
+test dependency ... ok
+test guc ... ok
+test bitmapops ... ok
+test combocid ... ok
+test tsearch ... ok
+test tsdicts ... ok
+test foreign_data ... ok
+test window ... ok
+test xmlmap ... ok
+test functional_deps ... ok
+test advisory_lock ... ok
+test json ... ok
+test jsonb ... ok
+test indirect_toast ... ok
+test equivclass ... ok
+test plancache ... ok
+test limit ... ok
+test plpgsql ... ok
+test copy2 ... ok
+test temp ... ok
+test domain ... ok
+test rangefuncs ... ok
+test prepare ... ok
+test without_oid ... ok
+test conversion ... ok
+test truncate ... ok
+test alter_table ... ok
+test sequence ... ok
+test polymorphism ... ok
+test rowtypes ... ok
+test returning ... ok
+test largeobject ... ok
+test with ... ok
+test xml ... ok
+test event_trigger ... ok
+test stats ... ok
+test tablesample ... ok
+test mv_dependencies ... ok
diff --git a/src/test/regress/serial_schedule b/src/test/regress/serial_schedule
index a2e0ceb..66925b3 100644
--- a/src/test/regress/serial_schedule
+++ b/src/test/regress/serial_schedule
@@ -156,3 +156,4 @@ test: xml
test: event_trigger
test: stats
test: tablesample
+test: mv_dependencies
diff --git a/src/test/regress/sql/mv_dependencies.sql b/src/test/regress/sql/mv_dependencies.sql
new file mode 100644
index 0000000..2491aca
--- /dev/null
+++ b/src/test/regress/sql/mv_dependencies.sql
@@ -0,0 +1,150 @@
+-- data type passed by value
+CREATE TABLE functional_dependencies (
+ a INT,
+ b INT,
+ c INT
+);
+
+-- unknown column
+ALTER TABLE functional_dependencies ADD STATISTICS (dependencies) ON (unknown_column);
+
+-- single column
+ALTER TABLE functional_dependencies ADD STATISTICS (dependencies) ON (a);
+
+-- single column, duplicated
+ALTER TABLE functional_dependencies ADD STATISTICS (dependencies) ON (a, a);
+
+-- two columns, one duplicated
+ALTER TABLE functional_dependencies ADD STATISTICS (dependencies) ON (a, a, b);
+
+-- unknown option
+ALTER TABLE functional_dependencies ADD STATISTICS (unknown_option) ON (a, b, c);
+
+-- correct command
+ALTER TABLE functional_dependencies ADD STATISTICS (dependencies) ON (a, b, c);
+
+-- random data (no functional dependencies)
+INSERT INTO functional_dependencies
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+
+ANALYZE functional_dependencies;
+
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+
+TRUNCATE functional_dependencies;
+
+-- a => b, a => c, b => c
+INSERT INTO functional_dependencies
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+
+ANALYZE functional_dependencies;
+
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+
+TRUNCATE functional_dependencies;
+
+-- a => b, a => c
+INSERT INTO functional_dependencies
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE functional_dependencies;
+
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+
+TRUNCATE functional_dependencies;
+
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO functional_dependencies
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX fdeps_idx ON functional_dependencies (a, b);
+ANALYZE functional_dependencies;
+
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+
+EXPLAIN (COSTS off)
+ SELECT * FROM functional_dependencies WHERE a = 10 AND b = 5;
+
+DROP TABLE functional_dependencies;
+
+-- varlena type (text)
+CREATE TABLE functional_dependencies (
+ a TEXT,
+ b TEXT,
+ c TEXT
+);
+
+ALTER TABLE functional_dependencies ADD STATISTICS (dependencies) ON (a, b, c);
+
+-- random data (no functional dependencies)
+INSERT INTO functional_dependencies
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+
+ANALYZE functional_dependencies;
+
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+
+TRUNCATE functional_dependencies;
+
+-- a => b, a => c, b => c
+INSERT INTO functional_dependencies
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+
+ANALYZE functional_dependencies;
+
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+
+TRUNCATE functional_dependencies;
+
+-- a => b, a => c
+INSERT INTO functional_dependencies
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE functional_dependencies;
+
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+
+TRUNCATE functional_dependencies;
+
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO functional_dependencies
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX fdeps_idx ON functional_dependencies (a, b);
+ANALYZE functional_dependencies;
+
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+
+EXPLAIN (COSTS off)
+ SELECT * FROM functional_dependencies WHERE a = '10' AND b = '5';
+
+DROP TABLE functional_dependencies;
+
+-- NULL values (mix of int and text columns)
+CREATE TABLE functional_dependencies (
+ a INT,
+ b TEXT,
+ c INT,
+ d TEXT
+);
+
+ALTER TABLE functional_dependencies ADD STATISTICS (dependencies) ON (a, b, c, d);
+
+INSERT INTO functional_dependencies
+ SELECT
+ mod(i, 100),
+ (CASE WHEN mod(i, 200) = 0 THEN NULL ELSE mod(i,200) END),
+ mod(i, 400),
+ (CASE WHEN mod(i, 300) = 0 THEN NULL ELSE mod(i,600) END)
+ FROM generate_series(1,10000) s(i);
+
+ANALYZE functional_dependencies;
+
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+
+DROP TABLE functional_dependencies;
--
1.9.3
Attachment: 0004-multivariate-MCV-lists-v7.patch (text/x-patch)
From 3114c82ae310d840f613583b169ac1cc79520f81 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@pgaddict.com>
Date: Mon, 6 Apr 2015 16:52:15 +0200
Subject: [PATCH 4/6] multivariate MCV lists
- extends the pg_mv_statistic catalog (add 'mcv' fields)
- building the MCV lists during ANALYZE
- simple estimation while planning the queries
Includes regression tests, mostly equal to regression tests for
functional dependencies.
---
src/backend/catalog/system_views.sql | 4 +-
src/backend/commands/tablecmds.c | 89 ++-
src/backend/nodes/outfuncs.c | 2 +
src/backend/optimizer/path/clausesel.c | 1079 ++++++++++++++++++++++++++--
src/backend/optimizer/util/plancat.c | 4 +-
src/backend/utils/mvstats/Makefile | 2 +-
src/backend/utils/mvstats/common.c | 104 ++-
src/backend/utils/mvstats/common.h | 11 +-
src/backend/utils/mvstats/mcv.c | 1237 ++++++++++++++++++++++++++++++++
src/bin/psql/describe.c | 25 +-
src/include/catalog/pg_mv_statistic.h | 18 +-
src/include/catalog/pg_proc.h | 4 +
src/include/nodes/relation.h | 2 +
src/include/utils/mvstats.h | 69 +-
src/test/regress/expected/mv_mcv.out | 207 ++++++
src/test/regress/expected/rules.out | 4 +-
src/test/regress/parallel_schedule | 2 +-
src/test/regress/regression.diffs | 30 -
src/test/regress/regression.out | 156 ----
src/test/regress/serial_schedule | 1 +
src/test/regress/sql/mv_mcv.sql | 178 +++++
21 files changed, 2940 insertions(+), 288 deletions(-)
create mode 100644 src/backend/utils/mvstats/mcv.c
create mode 100644 src/test/regress/expected/mv_mcv.out
delete mode 100644 src/test/regress/regression.diffs
delete mode 100644 src/test/regress/regression.out
create mode 100644 src/test/regress/sql/mv_mcv.sql
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 0dedaba..3144a29 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -156,7 +156,9 @@ CREATE VIEW pg_mv_stats AS
C.relname AS tablename,
S.stakeys AS attnums,
length(S.stadeps) as depsbytes,
- pg_mv_stats_dependencies_info(S.stadeps) as depsinfo
+ pg_mv_stats_dependencies_info(S.stadeps) as depsinfo,
+ length(S.stamcv) AS mcvbytes,
+ pg_mv_stats_mcvlist_info(S.stamcv) AS mcvinfo
FROM (pg_mv_statistic S JOIN pg_class C ON (C.oid = S.starelid))
LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace);
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 107e9fc..0d72aec 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -11918,7 +11918,13 @@ static void ATExecAddStatistics(AlteredTableInfo *tab, Relation rel,
Relation mvstatrel;
/* by default build nothing */
- bool build_dependencies = false;
+ bool build_dependencies = false,
+ build_mcv = false;
+
+ int32 max_mcv_items = -1;
+
+ /* options required because of other options */
+ bool require_mcv = false;
Assert(IsA(def, StatisticsDef));
@@ -11973,6 +11979,29 @@ static void ATExecAddStatistics(AlteredTableInfo *tab, Relation rel,
if (strcmp(opt->defname, "dependencies") == 0)
build_dependencies = defGetBoolean(opt);
+ else if (strcmp(opt->defname, "mcv") == 0)
+ build_mcv = defGetBoolean(opt);
+ else if (strcmp(opt->defname, "max_mcv_items") == 0)
+ {
+ max_mcv_items = defGetInt32(opt);
+
+ /* this option requires 'mcv' to be enabled */
+ require_mcv = true;
+
+ /* sanity check */
+ if (max_mcv_items < MVSTAT_MCVLIST_MIN_ITEMS)
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("max number of MCV items must be at least %d",
+ MVSTAT_MCVLIST_MIN_ITEMS)));
+
+ else if (max_mcv_items > MVSTAT_MCVLIST_MAX_ITEMS)
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("max number of MCV items is %d",
+ MVSTAT_MCVLIST_MAX_ITEMS)));
+
+ }
else
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
@@ -11981,10 +12010,16 @@ static void ATExecAddStatistics(AlteredTableInfo *tab, Relation rel,
}
/* check that at least some statistics were requested */
- if (! build_dependencies)
+ if (! (build_dependencies || build_mcv))
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("no statistics type (dependencies, mcv) was requested")));
+
+ /* now do some checking of the options */
+ if (require_mcv && (! build_mcv))
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
- errmsg("no statistics type (dependencies) was requested")));
+ errmsg("option 'mcv' is required by other options(s)")));
/* sort the attnums and build int2vector */
qsort(attnums, numcols, sizeof(int16), compare_int16);
@@ -12000,9 +12035,13 @@ static void ATExecAddStatistics(AlteredTableInfo *tab, Relation rel,
values[Anum_pg_mv_statistic_starelid-1] = ObjectIdGetDatum(RelationGetRelid(rel));
values[Anum_pg_mv_statistic_stakeys -1] = PointerGetDatum(stakeys);
+
values[Anum_pg_mv_statistic_deps_enabled -1] = BoolGetDatum(build_dependencies);
+ values[Anum_pg_mv_statistic_mcv_enabled -1] = BoolGetDatum(build_mcv);
+ values[Anum_pg_mv_statistic_mcv_max_items -1] = Int32GetDatum(max_mcv_items);
- nulls[Anum_pg_mv_statistic_stadeps -1] = true;
+ nulls[Anum_pg_mv_statistic_stadeps -1] = true;
+ nulls[Anum_pg_mv_statistic_stamcv -1] = true;
/* insert the tuple into pg_mv_statistic */
mvstatrel = heap_open(MvStatisticRelationId, RowExclusiveLock);
@@ -12049,7 +12088,13 @@ static void ATExecDropStatistics(AlteredTableInfo *tab, Relation rel,
/* checking whether the statistics matches / should be dropped */
bool build_dependencies = false;
+ bool build_mcv = false;
+
+ int32 max_mcv_items = 0;
+
bool check_dependencies = false;
+ bool check_mcv = false;
+ bool check_mcv_items = false;
if (def != NULL)
{
@@ -12091,6 +12136,18 @@ static void ATExecDropStatistics(AlteredTableInfo *tab, Relation rel,
check_dependencies = true;
build_dependencies = defGetBoolean(opt);
}
+ else if (strcmp(opt->defname, "mcv") == 0)
+ {
+ check_mcv = true;
+ build_mcv = defGetBoolean(opt);
+ }
+ else if (strcmp(opt->defname, "max_mcv_items") == 0)
+ {
+ check_mcv = true;
+ check_mcv_items = true;
+ build_mcv = true;
+ max_mcv_items = defGetInt32(opt);
+ }
else
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
@@ -12130,6 +12187,30 @@ static void ATExecDropStatistics(AlteredTableInfo *tab, Relation rel,
(DatumGetBool(adatum) == build_dependencies);
}
+ if (delete && check_mcv)
+ {
+ bool isnull;
+ Datum adatum = heap_getattr(tuple,
+ Anum_pg_mv_statistic_mcv_enabled,
+ RelationGetDescr(statrel),
+ &isnull);
+
+ delete = (! isnull) &&
+ (DatumGetBool(adatum) == build_mcv);
+ }
+
+ if (delete && check_mcv_items)
+ {
+ bool isnull;
+ Datum adatum = heap_getattr(tuple,
+ Anum_pg_mv_statistic_mcv_max_items,
+ RelationGetDescr(statrel),
+ &isnull);
+
+ delete = (! isnull) &&
+ (DatumGetInt32(adatum) == max_mcv_items);
+ }
+
/* check that the columns match the statistics definition */
if (delete && (numcols > 0))
{
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index 93a6f04..1867ab7 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -1907,9 +1907,11 @@ _outMVStatisticInfo(StringInfo str, const MVStatisticInfo *node)
/* enabled statistics */
WRITE_BOOL_FIELD(deps_enabled);
+ WRITE_BOOL_FIELD(mcv_enabled);
/* built/available statistics */
WRITE_BOOL_FIELD(deps_built);
+ WRITE_BOOL_FIELD(mcv_built);
}
static void
diff --git a/src/backend/optimizer/path/clausesel.c b/src/backend/optimizer/path/clausesel.c
index 6365425..95872de 100644
--- a/src/backend/optimizer/path/clausesel.c
+++ b/src/backend/optimizer/path/clausesel.c
@@ -15,6 +15,7 @@
#include "postgres.h"
#include "access/sysattr.h"
+#include "catalog/pg_collation.h"
#include "catalog/pg_operator.h"
#include "nodes/makefuncs.h"
#include "optimizer/clauses.h"
@@ -47,17 +48,38 @@ static void addRangeClause(RangeQueryClause **rqlist, Node *clause,
bool varonleft, bool isLTsel, Selectivity s2);
#define MV_CLAUSE_TYPE_FDEP 0x01
+#define MV_CLAUSE_TYPE_MCV 0x02
static bool clause_is_mv_compatible(PlannerInfo *root, Node *clause, Oid varRelid,
- Index *relid, AttrNumber *attnum, SpecialJoinInfo *sjinfo);
+ Index *relid, Bitmapset **attnums, SpecialJoinInfo *sjinfo,
+ int type);
static Bitmapset *collect_mv_attnums(PlannerInfo *root, List *clauses,
- Oid varRelid, Index *relid, SpecialJoinInfo *sjinfo);
+ Oid varRelid, Index *relid, SpecialJoinInfo *sjinfo,
+ int type);
static List *clauselist_apply_dependencies(PlannerInfo *root, List *clauses,
Oid varRelid, List *stats,
SpecialJoinInfo *sjinfo);
+static MVStatisticInfo *choose_mv_statistics(List *mvstats, Bitmapset *attnums);
+
+static List *clauselist_mv_split(PlannerInfo *root, SpecialJoinInfo *sjinfo,
+ List *clauses, Oid varRelid,
+ List **mvclauses, MVStatisticInfo *mvstats, int types);
+
+static Selectivity clauselist_mv_selectivity(PlannerInfo *root,
+ List *clauses, MVStatisticInfo *mvstats);
+static Selectivity clauselist_mv_selectivity_mcvlist(PlannerInfo *root,
+ List *clauses, MVStatisticInfo *mvstats,
+ bool *fullmatch, Selectivity *lowsel);
+
+static int update_match_bitmap_mcvlist(PlannerInfo *root, List *clauses,
+ int2vector *stakeys, MCVList mcvlist,
+ int nmatches, char * matches,
+ Selectivity *lowsel, bool *fullmatch,
+ bool is_or);
+
static bool has_stats(List *stats, int type);
static List * find_stats(PlannerInfo *root, List *clauses,
@@ -85,6 +107,13 @@ static Bitmapset *fdeps_filter_clauses(PlannerInfo *root,
static Bitmapset * get_varattnos(Node * node, Index relid);
+/* used for merging bitmaps - AND (min), OR (max) */
+#define MAX(x, y) (((x) > (y)) ? (x) : (y))
+#define MIN(x, y) (((x) < (y)) ? (x) : (y))
+
+#define UPDATE_RESULT(m,r,isor) \
+ (m) = (isor) ? (MAX(m,r)) : (MIN(m,r))
+
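A standalone illustration of the merge semantics (not part of the
patch - it merely demonstrates what UPDATE_RESULT does, assuming
MVSTATS_MATCH_NONE < MVSTATS_MATCH_FULL, e.g. 0 and 1):

    #include <stdio.h>

    #define MAX(x, y) (((x) > (y)) ? (x) : (y))
    #define MIN(x, y) (((x) < (y)) ? (x) : (y))
    #define UPDATE_RESULT(m,r,isor) (m) = (isor) ? (MAX(m,r)) : (MIN(m,r))

    int
    main(void)
    {
        char a[4] = {0, 0, 1, 1};   /* current match bitmap */
        char b[4] = {0, 1, 0, 1};   /* bitmap built for a sub-clause list */
        int  i;

        /* AND-merge uses MIN(), so an item remains a match only when
         * it matches in both bitmaps; passing 'true' (OR-merge, MAX())
         * would keep items matching in either bitmap */
        for (i = 0; i < 4; i++)
            UPDATE_RESULT(a[i], b[i], false);

        for (i = 0; i < 4; i++)
            printf("%d ", a[i]);    /* prints: 0 0 0 1 */
        printf("\n");

        return 0;
    }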
/****************************************************************************
* ROUTINES TO COMPUTE SELECTIVITIES
****************************************************************************/
@@ -250,8 +279,12 @@ clauselist_selectivity(PlannerInfo *root,
*/
if (has_stats(stats, MV_CLAUSE_TYPE_FDEP))
{
- /* collect attributes referenced by mv-compatible clauses */
- mvattnums = collect_mv_attnums(root, clauses, varRelid, &relid, sjinfo);
+ /*
+ * Collect attributes referenced by mv-compatible clauses (looking
+ * for clauses compatible with functional dependencies for now).
+ */
+ mvattnums = collect_mv_attnums(root, clauses, varRelid, &relid, sjinfo,
+ MV_CLAUSE_TYPE_FDEP);
/*
* If there are mv-compatible clauses, referencing at least two
@@ -268,6 +301,48 @@ clauselist_selectivity(PlannerInfo *root,
}
/*
+ * Check that there are statistics with a MCV list. If not, we don't
+ * need to waste time on the optimization.
+ */
+ if (has_stats(stats, MV_CLAUSE_TYPE_MCV))
+ {
+ /*
+ * Recollect attributes from mv-compatible clauses (maybe we've
+ * removed so many clauses that only a single mv-compatible attnum
+ * remains). From now on we're only interested in MCV-compatible
+ * clauses.
+ */
+ mvattnums = collect_mv_attnums(root, clauses, varRelid, &relid, sjinfo,
+ MV_CLAUSE_TYPE_MCV);
+
+ /*
+ * If there still are at least two columns, we'll try to select
+ * suitable multivariate statistics.
+ */
+ if (bms_num_members(mvattnums) >= 2)
+ {
+ /* see choose_mv_statistics() for details */
+ MVStatisticInfo *mvstat = choose_mv_statistics(stats, mvattnums);
+
+ if (mvstat != NULL) /* we have a matching stats */
+ {
+ /* clauses compatible with multi-variate stats */
+ List *mvclauses = NIL;
+
+ /* split the clauselist into regular and mv-clauses */
+ clauses = clauselist_mv_split(root, sjinfo, clauses,
+ varRelid, &mvclauses, mvstat,
+ MV_CLAUSE_TYPE_MCV);
+
+ /* we've chosen the statistics matching the clauses */
+ Assert(mvclauses != NIL);
+
+ /* compute the multivariate stats */
+ s1 *= clauselist_mv_selectivity(root, mvclauses, mvstat);
+ }
+ }
+ }
+
+ /*
* Initial scan over clauses. Anything that doesn't look like a potential
* rangequery clause gets multiplied into s1 and forgotten. Anything that
* does gets inserted into an rqlist entry.
@@ -942,12 +1017,129 @@ clause_selectivity(PlannerInfo *root,
return s1;
}
+
+/*
+ * Estimate selectivity for the list of MV-compatible clauses, using
+ * multivariate statistics (combining a histogram and MCV list).
+ *
+ * This simply passes the estimation to the MCV list and then to the
+ * histogram, if available.
+ *
+ * TODO Clamp the selectivity by min of the per-clause selectivities
+ * (i.e. the selectivity of the most restrictive clause), because
+ * that's the maximum we can ever get from an ANDed list of clauses.
+ * This would probably prevent issues with hitting too many buckets
+ * and low precision histograms.
+ *
+ * TODO We may support some additional conditions, most importantly
+ * those matching multiple columns (e.g. "a = b" or "a < b").
+ * Ultimately we could track multi-table histograms for join
+ * cardinality estimation.
+ *
+ * TODO Further thoughts on processing equality clauses: Maybe it'd be
+ * better to look for stats (with MCV) covered by the equality
+ * clauses, because then we have a chance to find an exact match
+ * in the MCV list, which is pretty much the best we can do. We may
+ * also look at the least frequent MCV item, and use it as an upper
+ * boundary for the selectivity (had there been a more frequent
+ * item, it'd be in the MCV list).
+ *
+ * TODO There are several options for 'sanity clamping' the estimates.
+ *
+ * First, if we have selectivities for each condition, then
+ *
+ * P(A,B) <= MIN(P(A), P(B))
+ *
+ * Because additional conditions (connected by AND) can only lower
+ * the probability.
+ *
+ * So we can do some basic sanity checks using the single-variate
+ * stats (the ones we have right now).
+ *
+ * Second, when we have multivariate stats with a MCV list, then
+ *
+ * (a) if we have a full equality condition (one equality condition
+ * on each column) and we found a match in the MCV list, this is
+ * the selectivity (and it's supposed to be exact)
+ *
+ * (b) if we have a full equality condition and we haven't found a
+ * match in the MCV list, then the selectivity is below the
+ * lowest selectivity in the MCV list
+ *
+ * (c) if we have an equality condition (not full), we can still
+ * search the MCV for matches and use the sum of probabilities
+ * as a lower boundary for the histogram (if there are no
+ * matches in the MCV list, then we have no boundary)
+ *
+ * Third, if there are multiple (combinations of) multivariate
+ * stats for a set of clauses, we may compute all of them and then
+ * somehow aggregate them - e.g. by choosing the minimum, median or
+ * average. The stats are susceptible to overestimation (because
+ * we take 50% of the bucket for partial matches). Some stats may
+ * give better estimates than others, but it's very difficult to
+ * say that in advance which one is the best (it depends on the
+ * number of buckets, number of additional columns not referenced
+ * in the clauses, type of condition etc.).
+ *
+ * So we may compute them all and then choose a sane aggregation
+ * (minimum seems like a good approach). Of course, this may result
+ * in longer / more expensive estimation (CPU-wise), but it may be
+ * worth it.
+ *
+ * It's possible to add a GUC choosing whether to do a 'simple'
+ * estimation (using a single statistics expected to give the best
+ * estimate) or a 'complex' one (combining the multiple estimates).
+ *
+ * multivariate_estimates = (simple|full)
+ *
+ * Also, this might be enabled at a table level, by something like
+ *
+ * ALTER TABLE ... SET STATISTICS (simple|full)
+ *
+ * Which would make it possible to use this only for the tables
+ * where the simple approach does not work.
+ *
+ * Also, there are ways to optimize this algorithmically. E.g. we
+ * may try to get an estimate from a matching MCV list first, and
+ * if we happen to get a "full equality match" we may stop computing
+ * the estimates from other stats (for this condition) because
+ * that's probably the best estimate we can really get.
+ *
+ * TODO When applying the clauses to the histogram/MCV list, we can do
+ * that from the most selective clauses first, because that'll
+ * eliminate the buckets/items sooner (so we'll be able to skip
+ * them without inspection, which is more expensive). But this
+ * requires really knowing the per-clause selectivities in advance,
+ * and that's not what we do now.
+ */
+static Selectivity
+clauselist_mv_selectivity(PlannerInfo *root, List *clauses, MVStatisticInfo *mvstats)
+{
+ bool fullmatch = false;
+
+ /*
+ * Lowest frequency in the MCV list (may be used as an upper bound
+ * for full equality conditions that did not match any MCV item).
+ */
+ Selectivity mcv_low = 0.0;
+
+ /* TODO Evaluate simple 1D selectivities, use the smallest one as
+ * an upper bound, product as lower bound, and sort the
+ * clauses in ascending order by selectivity (to optimize the
+ * MCV/histogram evaluation).
+ */
+
+ /* Evaluate the MCV selectivity */
+ return clauselist_mv_selectivity_mcvlist(root, clauses, mvstats,
+ &fullmatch, &mcv_low);
+}
+
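Regarding the clamping TODOs above, a minimal sketch of what I have in
mind (not implemented yet - clamp_mv_selectivity is a hypothetical
helper, and the per-clause selectivities are assumed to come from the
existing single-column estimation):

    typedef double Selectivity;     /* as in nodes/nodes.h */

    /*
     * Clamp the multivariate estimate by the most restrictive single
     * clause, because for an ANDed list P(A and B) <= MIN(P(A), P(B)).
     */
    static Selectivity
    clamp_mv_selectivity(Selectivity s_mv,
                         Selectivity *s_clauses, int nclauses)
    {
        int         i;
        Selectivity s_min = 1.0;

        for (i = 0; i < nclauses; i++)
            if (s_clauses[i] < s_min)
                s_min = s_clauses[i];

        return (s_mv < s_min) ? s_mv : s_min;
    }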
/*
* Collect attributes from mv-compatible clauses.
*/
static Bitmapset *
collect_mv_attnums(PlannerInfo *root, List *clauses, Oid varRelid,
- Index *relid, SpecialJoinInfo *sjinfo)
+ Index *relid, SpecialJoinInfo *sjinfo, int types)
{
Bitmapset *attnums = NULL;
ListCell *l;
@@ -963,12 +1155,11 @@ collect_mv_attnums(PlannerInfo *root, List *clauses, Oid varRelid,
*/
foreach (l, clauses)
{
- AttrNumber attnum;
Node *clause = (Node *) lfirst(l);
- /* ignore the result for now - we only need the info */
- if (clause_is_mv_compatible(root, clause, varRelid, relid, &attnum, sjinfo))
- attnums = bms_add_member(attnums, attnum);
+ /* ignore the result here - we only need the attnums */
+ clause_is_mv_compatible(root, clause, varRelid, relid, &attnums,
+ sjinfo, types);
}
/*
@@ -987,6 +1178,188 @@ collect_mv_attnums(PlannerInfo *root, List *clauses, Oid varRelid,
}
/*
+ * We're looking for statistics matching at least 2 attributes,
+ * referenced in the clauses compatible with multivariate statistics.
+ * The current selection criterion is very simple - we choose the
+ * statistics referencing the most attributes.
+ *
+ * If there are multiple statistics referencing the same number of
+ * columns (from the clauses), the one with fewer source columns
+ * (as listed in the ADD STATISTICS when creating the statistics) wins.
+ * Otherwise the first one wins.
+ *
+ * This is a very simple criterion, and it has several weaknesses:
+ *
+ * (a) does not consider the accuracy of the statistics
+ *
+ * If there are two histograms built on the same set of columns,
+ * but one has 100 buckets and the other one has 1000 buckets (thus
+ * likely providing better estimates), this is not currently
+ * considered.
+ *
+ * (b) does not consider the type of statistics
+ *
+ * If there are three statistics - one containing just a MCV list,
+ * another one with just a histogram and a third one with both,
+ * this is not considered.
+ *
+ * (c) does not consider the number of clauses
+ *
+ * As explained, only the number of referenced attributes counts,
+ * so if there are multiple clauses on a single attribute, this
+ * still counts as a single attribute.
+ *
+ * (d) does not consider the type of condition
+ *
+ * Some clauses may work better with some statistics - for example
+ * equality clauses probably work better with MCV lists than with
+ * histograms. But IS [NOT] NULL conditions may often work better
+ * with histograms (thanks to NULL-buckets).
+ *
+ * So for example with five WHERE conditions
+ *
+ * WHERE (a = 1) AND (b = 1) AND (c = 1) AND (d = 1) AND (e = 1)
+ *
+ * and statistics on (a,b), (a,b,e) and (a,b,c,d), the last one will be
+ * selected as it references the most columns.
+ *
+ * Once we have selected the multivariate statistics, we split the list
+ * of clauses into two parts - conditions that are compatible with the
+ * selected stats, and conditions that will be estimated using the
+ * simple (single-column) statistics.
+ *
+ * From the example above, conditions
+ *
+ * (a = 1) AND (b = 1) AND (c = 1) AND (d = 1)
+ *
+ * will be estimated using the multivariate statistics (a,b,c,d) while
+ * the last condition (e = 1) will get estimated using the regular ones.
+ *
+ * There are various alternative selection criteria (e.g. counting
+ * conditions instead of just referenced attributes), but eventually
+ * the best option should be to combine multiple statistics. But that's
+ * much harder to do correctly.
+ *
+ * TODO Select multiple statistics and combine them when computing
+ * the estimate.
+ *
+ * TODO This will probably have to consider compatibility of clauses,
+ * because 'dependencies' will probably work only with equality
+ * clauses.
+ */
+static MVStatisticInfo *
+choose_mv_statistics(List *stats, Bitmapset *attnums)
+{
+ int i;
+ ListCell *lc;
+
+ MVStatisticInfo *choice = NULL;
+
+ int current_matches = 1; /* goal #1: maximize */
+ int current_dims = (MVSTATS_MAX_DIMENSIONS+1); /* goal #2: minimize */
+
+ /*
+ * Walk through the list of statistics, and for each one count the
+ * attributes it shares with the clauses (encoded in the 'attnums'
+ * bitmap).
+ */
+ foreach (lc, stats)
+ {
+ MVStatisticInfo *info = (MVStatisticInfo *)lfirst(lc);
+
+ /* columns matching this statistics */
+ int matches = 0;
+
+ int2vector * attrs = info->stakeys;
+ int numattrs = attrs->dim1;
+
+ /* skip dependencies-only stats */
+ if (! info->mcv_built)
+ continue;
+
+ /* count columns covered by this statistics */
+ for (i = 0; i < numattrs; i++)
+ if (bms_is_member(attrs->values[i], attnums))
+ matches++;
+
+ /*
+ * Use this statistics when it improves the number of matches or
+ * when it matches the same number of attributes but is smaller.
+ */
+ if ((matches > current_matches) ||
+ ((matches == current_matches) && (current_dims > numattrs)))
+ {
+ choice = info;
+ current_matches = matches;
+ current_dims = numattrs;
+ }
+ }
+
+ return choice;
+}
+
+
+/*
+ * This splits the clauses list into two parts - one containing clauses
+ * that will be evaluated using the chosen statistics, and the remaining
+ * clauses (either not mv-compatible, or not covered by the statistics).
+ */
+static List *
+clauselist_mv_split(PlannerInfo *root, SpecialJoinInfo *sjinfo,
+ List *clauses, Oid varRelid, List **mvclauses,
+ MVStatisticInfo *mvstats, int types)
+{
+ int i;
+ ListCell *l;
+ List *non_mvclauses = NIL;
+
+ /* FIXME is there a better way to get info on int2vector? */
+ int2vector * attrs = mvstats->stakeys;
+ int numattrs = mvstats->stakeys->dim1;
+
+ Bitmapset *mvattnums = NULL;
+
+ /* build bitmap of attributes covered by the stats, so we can
+ * do bms_is_subset later */
+ for (i = 0; i < numattrs; i++)
+ mvattnums = bms_add_member(mvattnums, attrs->values[i]);
+
+ /* erase the list of mv-compatible clauses */
+ *mvclauses = NIL;
+
+ foreach (l, clauses)
+ {
+ bool match = false; /* by default not mv-compatible */
+ Bitmapset *attnums = NULL;
+ Node *clause = (Node *) lfirst(l);
+
+ if (clause_is_mv_compatible(root, clause, varRelid, NULL,
+ &attnums, sjinfo, types))
+ {
+ /* are all the attributes part of the selected stats? */
+ if (bms_is_subset(attnums, mvattnums))
+ match = true;
+ }
+
+ /*
+ * The clause matches the selected stats, so put it into the list
+ * of mv-compatible clauses. Otherwise, keep it in the list of
+ * 'regular' clauses (to be estimated the usual way).
+ */
+ if (match)
+ *mvclauses = lappend(*mvclauses, clause);
+ else
+ non_mvclauses = lappend(non_mvclauses, clause);
+ }
+
+ /*
+ * Return the remaining clauses, to be estimated using the regular
+ * (single-column) statistics.
+ */
+ return non_mvclauses;
+}
+
+/*
* Determines whether the clause is compatible with multivariate stats,
* and if it is, returns some additional information - varno (index
* into simple_rte_array) and a bitmap of attributes. This is then
@@ -1005,8 +1378,12 @@ collect_mv_attnums(PlannerInfo *root, List *clauses, Oid varRelid,
*/
static bool
clause_is_mv_compatible(PlannerInfo *root, Node *clause, Oid varRelid,
- Index *relid, AttrNumber *attnum, SpecialJoinInfo *sjinfo)
+ Index *relid, Bitmapset **attnums, SpecialJoinInfo *sjinfo,
+ int types)
{
+ Relids clause_relids;
+ Relids left_relids;
+ Relids right_relids;
if (IsA(clause, RestrictInfo))
{
@@ -1016,82 +1393,176 @@ clause_is_mv_compatible(PlannerInfo *root, Node *clause, Oid varRelid,
if (rinfo->pseudoconstant)
return false;
- /* no support for OR clauses at this point */
- if (rinfo->orclause)
- return false;
-
/* get the actual clause from the RestrictInfo (it's not an OR clause) */
clause = (Node*)rinfo->clause;
- /* only simple opclauses are compatible with multivariate stats */
- if (! is_opclause(clause))
- return false;
-
/* we don't support join conditions at this moment */
if (treat_as_join_clause(clause, rinfo, varRelid, sjinfo))
return false;
+ clause_relids = rinfo->clause_relids;
+ left_relids = rinfo->left_relids;
+ right_relids = rinfo->right_relids;
+ }
+ else if (is_opclause(clause) && list_length(((OpExpr *) clause)->args) == 2)
+ {
+ left_relids = pull_varnos(get_leftop((Expr*)clause));
+ right_relids = pull_varnos(get_rightop((Expr*)clause));
+
+ clause_relids = bms_union(left_relids,
+ right_relids);
+ }
+ else
+ {
+ /* Not a binary opclause, so mark left/right relid sets as empty */
+ left_relids = NULL;
+ right_relids = NULL;
+ /* and get the total relid set the hard way */
+ clause_relids = pull_varnos((Node *) clause);
+ }
+
+ /*
+ * Only simple opclauses and IS NULL tests are compatible with
+ * multivariate stats at this point.
+ */
+ if ((is_opclause(clause))
+ && (list_length(((OpExpr *) clause)->args) == 2))
+ {
+ OpExpr *expr = (OpExpr *) clause;
+ bool varonleft = true;
+ bool ok;
+
/* is it 'variable op constant' ? */
- if (list_length(((OpExpr *) clause)->args) == 2)
+
+ ok = (bms_membership(clause_relids) == BMS_SINGLETON) &&
+ (is_pseudo_constant_clause_relids(lsecond(expr->args),
+ right_relids) ||
+ (varonleft = false,
+ is_pseudo_constant_clause_relids(linitial(expr->args),
+ left_relids)));
+
+ if (ok)
{
- OpExpr *expr = (OpExpr *) clause;
- bool varonleft = true;
- bool ok;
+ Var * var = (varonleft) ? linitial(expr->args) : lsecond(expr->args);
- ok = (bms_membership(rinfo->clause_relids) == BMS_SINGLETON) &&
- (is_pseudo_constant_clause_relids(lsecond(expr->args),
- rinfo->right_relids) ||
- (varonleft = false,
- is_pseudo_constant_clause_relids(linitial(expr->args),
- rinfo->left_relids)));
+ /*
+ * Simple variables only - otherwise the planner_rt_fetch seems to fail
+ * (return NULL).
+ *
+ * TODO Maybe using examine_variable() would fix that?
+ */
+ if (! (IsA(var, Var) && (varRelid == 0 || varRelid == var->varno)))
+ return false;
- if (ok)
- {
- Var * var = (varonleft) ? linitial(expr->args) : lsecond(expr->args);
+ /*
+ * Only consider this variable if (varRelid == 0) or when the varno
+ * matches varRelid (see explanation at clause_selectivity).
+ *
+ * FIXME I suspect this may not be really necessary. The (varRelid == 0)
+ * part seems to be enforced by treat_as_join_clause().
+ */
+ if (! ((varRelid == 0) || (varRelid == var->varno)))
+ return false;
- /*
- * Simple variables only - otherwise the planner_rt_fetch seems to fail
- * (return NULL).
- *
- * TODO Maybe use examine_variable() would fix that?
- */
- if (! (IsA(var, Var) && (varRelid == 0 || varRelid == var->varno)))
- return false;
+ /* Also skip special varno values, and system attributes ... */
+ if ((IS_SPECIAL_VARNO(var->varno)) || (! AttrNumberIsForUserDefinedAttr(var->varattno)))
+ return false;
- /*
- * Only consider this variable if (varRelid == 0) or when the varno
- * matches varRelid (see explanation at clause_selectivity).
- *
- * FIXME I suspect this may not be really necessary. The (varRelid == 0)
- * part seems to be enforced by treat_as_join_clause().
- */
- if (! ((varRelid == 0) || (varRelid == var->varno)))
- return false;
+ /* Lookup info about the base relation (we need to pass the OID out) */
+ if (relid != NULL)
+ *relid = var->varno;
+
+ /*
+ * If it's not a "<" or ">" or "=" operator, just ignore the
+ * clause. Otherwise note the relid and attnum for the variable.
+ * This uses the function for estimating selectivity, not the
+ * operator directly (a bit awkward, but well ...).
+ */
+ switch (get_oprrest(expr->opno))
+ {
+ case F_SCALARLTSEL:
+ case F_SCALARGTSEL:
+ /* not compatible with functional dependencies */
+ if (types & MV_CLAUSE_TYPE_MCV)
+ {
+ *attnums = bms_add_member(*attnums, var->varattno);
+ return true;
+ }
+ return false;
+
+ case F_EQSEL:
+ *attnums = bms_add_member(*attnums, var->varattno);
+ return true;
+ }
+ }
+ }
+ else if (IsA(clause, NullTest)
+ && IsA(((NullTest*)clause)->arg, Var))
+ {
+ Var * var = (Var*)((NullTest*)clause)->arg;
+
+ /*
+ * Simple variables only - otherwise the planner_rt_fetch seems to fail
+ * (return NULL).
+ *
+ * TODO Maybe using examine_variable() would fix that?
+ */
+ if (! (IsA(var, Var) && (varRelid == 0 || varRelid == var->varno)))
+ return false;
+
+ /*
+ * Only consider this variable if (varRelid == 0) or when the varno
+ * matches varRelid (see explanation at clause_selectivity).
+ *
+ * FIXME I suspect this may not be really necessary. The (varRelid == 0)
+ * part seems to be enforced by treat_as_join_clause().
+ */
+ if (! ((varRelid == 0) || (varRelid == var->varno)))
+ return false;
- /* Also skip special varno values, and system attributes ... */
- if ((IS_SPECIAL_VARNO(var->varno)) || (! AttrNumberIsForUserDefinedAttr(var->varattno)))
- return false;
+ /* Also skip special varno values, and system attributes ... */
+ if ((IS_SPECIAL_VARNO(var->varno)) || (! AttrNumberIsForUserDefinedAttr(var->varattno)))
+ return false;
+ /* Lookup info about the base relation (we need to pass the OID out) */
+ if (relid != NULL)
*relid = var->varno;
- /*
- * If it's not a "<" or ">" or "=" operator, just ignore the
- * clause. Otherwise note the relid and attnum for the variable.
- * This uses the function for estimating selectivity, ont the
- * operator directly (a bit awkward, but well ...).
- */
- switch (get_oprrest(expr->opno))
- {
- case F_EQSEL:
- *attnum = var->varattno;
- return true;
- }
- }
+ *attnums = bms_add_member(*attnums, var->varattno);
+
+ return true;
+ }
+ else if (or_clause(clause) || and_clause(clause))
+ {
+ /*
+ * AND/OR-clauses are supported if all sub-clauses are supported
+ *
+ * TODO We might support mixed case, where some of the clauses
+ * are supported and some are not, and treat all supported
+ * subclauses as a single clause, compute its selectivity
+ * using mv stats, and compute the total selectivity using
+ * the current algorithm.
+ *
+ * TODO For RestrictInfo above an OR-clause, we might use the
+ * orclause with nested RestrictInfo - we won't have to
+ * call pull_varnos() for each clause, saving time.
+ */
+ Bitmapset *tmp = NULL;
+ ListCell *l;
+ foreach (l, ((BoolExpr*)clause)->args)
+ {
+ if (! clause_is_mv_compatible(root, (Node*)lfirst(l),
+ varRelid, relid, &tmp, sjinfo, types))
+ return false;
}
+
+ /* add the attnums from the AND/OR-clause to the set of attnums */
+ *attnums = bms_join(*attnums, tmp);
+
+ return true;
}
return false;
-
}
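To summarize what this function now accepts, on a few examples: (a = 1)
is compatible with both functional dependencies and MCV stats, (a < 1)
and (a > 1) only with MCV stats (hence the MV_CLAUSE_TYPE_MCV check
above), (a IS [NOT] NULL) is compatible, and (a = 1 OR b < 2) is
compatible iff every subclause is. Clauses like (a = b) and join
clauses are still rejected.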
/*
@@ -1340,6 +1811,9 @@ has_stats(List *stats, int type)
if ((type & MV_CLAUSE_TYPE_FDEP) && stat->deps_built)
return true;
+
+ if ((type & MV_CLAUSE_TYPE_MCV) && stat->mcv_built)
+ return true;
}
return false;
@@ -1635,25 +2109,39 @@ fdeps_filter_clauses(PlannerInfo *root,
foreach (lc, clauses)
{
- AttrNumber attnum;
+ Bitmapset *attnums = NULL;
Node *clause = (Node *) lfirst(lc);
- if (! clause_is_mv_compatible(root, clause, varRelid, relid,
- &attnum, sjinfo))
+ if (! clause_is_mv_compatible(root, clause, varRelid, relid, &attnums,
+ sjinfo, MV_CLAUSE_TYPE_FDEP))
/* clause incompatible with functional dependencies */
*reduced_clauses = lappend(*reduced_clauses, clause);
- else if (! bms_is_member(attnum, deps_attnums))
+ else if (bms_num_members(attnums) > 1)
+
+ /*
+ * clause referencing multiple attributes - strange, shouldn't
+ * this be handled by clause_is_mv_compatible directly?
+ */
+ *reduced_clauses = lappend(*reduced_clauses, clause);
+
+ else if (! bms_is_member(bms_singleton_member(attnums), deps_attnums))
/* clause not covered by the dependencies */
*reduced_clauses = lappend(*reduced_clauses, clause);
else
{
+ /* ok, clause compatible with existing dependencies */
+ Assert(bms_num_members(attnums) == 1);
+
*deps_clauses = lappend(*deps_clauses, clause);
- clause_attnums = bms_add_member(clause_attnums, attnum);
+ clause_attnums = bms_add_member(clause_attnums,
+ bms_singleton_member(attnums));
}
+
+ bms_free(attnums);
}
return clause_attnums;
@@ -1691,3 +2179,454 @@ get_varattnos(Node * node, Index relid)
return result;
}
+
+/*
+ * Estimate selectivity of clauses using a MCV list.
+ *
+ * If there's no MCV list for the stats, the function returns 0.0.
+ *
+ * While computing the estimate, the function checks whether all the
+ * columns were matched with an equality condition. If that's the case,
+ * we can skip processing the histogram, as there can be no rows in
+ * it with the same values - all the rows matching the condition are
+ * represented by the MCV item. This can only happen with equality
+ * on all the attributes.
+ *
+ * The algorithm works like this:
+ *
+ * 1) mark all items as 'match'
+ * 2) walk through all the clauses
+ * 3) for a particular clause, walk through all the items
+ * 4) skip items that are already 'no match'
+ * 5) check clause for items that still match
+ * 6) sum frequencies for items to get selectivity
+ *
+ * The function also returns the frequency of the least frequent item
+ * on the MCV list, which may be useful for clamping the estimate from
+ * the histogram (all items not present in the MCV list are less frequent).
+ * This however seems useful only for cases with conditions on all
+ * attributes.
+ *
+ * TODO This only handles AND-ed clauses, but it might work for OR-ed
+ * lists too - it just needs to reverse the logic a bit. I.e. start
+ * with 'no match' for all items, and mark the items as a match
+ * as the clauses are processed (and skip items that are 'match').
+ */
+static Selectivity
+clauselist_mv_selectivity_mcvlist(PlannerInfo *root, List *clauses,
+ MVStatisticInfo *mvstats, bool *fullmatch,
+ Selectivity *lowsel)
+{
+ int i;
+ Selectivity s = 0.0;
+ Selectivity u = 0.0;
+
+ MCVList mcvlist = NULL;
+ int nmatches = 0;
+
+ /* match/mismatch bitmap for each MCV item */
+ char * matches = NULL;
+
+ Assert(clauses != NIL);
+ Assert(list_length(clauses) >= 2);
+
+ /* there's no MCV list built yet */
+ if (! mvstats->mcv_built)
+ return 0.0;
+
+ mcvlist = load_mv_mcvlist(mvstats->mvoid);
+
+ Assert(mcvlist != NULL);
+ Assert(mcvlist->nitems > 0);
+
+ /* by default all the MCV items match the clauses fully */
+ matches = palloc0(sizeof(char) * mcvlist->nitems);
+ memset(matches, MVSTATS_MATCH_FULL, sizeof(char)*mcvlist->nitems);
+
+ /* number of matching MCV items */
+ nmatches = mcvlist->nitems;
+
+ nmatches = update_match_bitmap_mcvlist(root, clauses,
+ mvstats->stakeys, mcvlist,
+ nmatches, matches,
+ lowsel, fullmatch, false);
+
+ /* sum frequencies for all the matching MCV items */
+ for (i = 0; i < mcvlist->nitems; i++)
+ {
+ /* used to 'scale' for MCV lists not covering all tuples */
+ u += mcvlist->items[i]->frequency;
+
+ if (matches[i] != MVSTATS_MATCH_NONE)
+ s += mcvlist->items[i]->frequency;
+ }
+
+ pfree(matches);
+ pfree(mcvlist);
+
+ return s*u;
+}
+
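A worked example of the estimation, with made-up numbers: say the MCV
list on columns (a,b) contains three items

    (1,1) with frequency 0.30
    (1,2) with frequency 0.20
    (2,2) with frequency 0.15

so the items cover u = 0.65 of the table. For WHERE (a = 1) AND (b = 2)
only the second item remains MVSTATS_MATCH_FULL after both clauses are
applied, giving s = 0.20; fullmatch is set (both columns were matched
by equality) and lowsel = 0.15 is available as an upper bound for
combinations that did not make it into the list.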
+/*
+ * Evaluate clauses using the MCV list, and update the match bitmap.
+ *
+ * The bitmap may be already partially set, so this is really a way to
+ * combine results of several clause lists - either when computing
+ * conditional probability P(A|B) or a combination of AND/OR clauses.
+ *
+ * TODO This works with 'bitmap' where each bit is represented as a char,
+ * which is slightly wasteful. Instead, we could use a regular
+ * bitmap, reducing the size to ~1/8. Another thing is merging the
+ * bitmaps using & and |, which might be faster than min/max.
+ */
+static int
+update_match_bitmap_mcvlist(PlannerInfo *root, List *clauses,
+ int2vector *stakeys, MCVList mcvlist,
+ int nmatches, char * matches,
+ Selectivity *lowsel, bool *fullmatch,
+ bool is_or)
+{
+ int i;
+ ListCell * l;
+
+ Bitmapset *eqmatches = NULL; /* attributes with equality matches */
+
+ /* The bitmap may be partially built. */
+ Assert(nmatches >= 0);
+ Assert(nmatches <= mcvlist->nitems);
+ Assert(clauses != NIL);
+ Assert(list_length(clauses) >= 1);
+ Assert(mcvlist != NULL);
+ Assert(mcvlist->nitems > 0);
+
+ /* No chance of changing the result - no matches left (AND) or everything matches (OR) */
+ if (((nmatches == 0) && (! is_or)) ||
+ ((nmatches == mcvlist->nitems) && is_or))
+ return nmatches;
+
+ /* frequency of the lowest MCV item */
+ *lowsel = 1.0;
+
+ /*
+ * Loop through the list of clauses, and for each of them evaluate
+ * all the MCV items not yet eliminated by the preceding clauses.
+ *
+ * FIXME This would probably deserve a refactoring, I guess. Unify
+ * the two loops and put the checks inside, or something like
+ * that.
+ */
+ foreach (l, clauses)
+ {
+ Node * clause = (Node*)lfirst(l);
+
+ /* if it's a RestrictInfo, then extract the clause */
+ if (IsA(clause, RestrictInfo))
+ clause = (Node*)((RestrictInfo*)clause)->clause;
+
+ /* if there are no remaining matches possible, we can stop */
+ if (((nmatches == 0) && (! is_or)) ||
+ ((nmatches == mcvlist->nitems) && is_or))
+ break;
+
+ /* it's either an OpExpr, a NullTest, or an AND/OR clause */
+ if (is_opclause(clause))
+ {
+ OpExpr * expr = (OpExpr*)clause;
+ bool varonleft = true;
+ bool ok;
+
+ /* operator */
+ FmgrInfo opproc;
+
+ /* get procedure computing operator selectivity */
+ RegProcedure oprrest = get_oprrest(expr->opno);
+
+ fmgr_info(get_opcode(expr->opno), &opproc);
+
+ ok = (NumRelids(clause) == 1) &&
+ (is_pseudo_constant_clause(lsecond(expr->args)) ||
+ (varonleft = false,
+ is_pseudo_constant_clause(linitial(expr->args))));
+
+ if (ok)
+ {
+
+ FmgrInfo ltproc, gtproc;
+ Var * var = (varonleft) ? linitial(expr->args) : lsecond(expr->args);
+ Const * cst = (varonleft) ? lsecond(expr->args) : linitial(expr->args);
+ bool isgt = (! varonleft);
+
+ /*
+ * TODO Fetch only when really needed (probably for equality only)
+ * TODO Technically either lt/gt is sufficient.
+ *
+ * FIXME The code in analyze.c creates histograms only for types
+ * with enough ordering (by calling get_sort_group_operators).
+ * Is this the same assumption, i.e. are we certain that we
+ * get the ltproc/gtproc every time we ask? Or are there types
+ * where get_sort_group_operators returns ltopr and here we
+ * get nothing?
+ */
+ TypeCacheEntry *typecache
+ = lookup_type_cache(var->vartype,
+ TYPECACHE_EQ_OPR | TYPECACHE_LT_OPR | TYPECACHE_GT_OPR);
+
+ /* FIXME proper matching attribute to dimension */
+ int idx = mv_get_index(var->varattno, stakeys);
+
+ fmgr_info(get_opcode(typecache->lt_opr), <proc);
+ fmgr_info(get_opcode(typecache->gt_opr), >proc);
+
+ /*
+ * Walk through the MCV items and evaluate the current clause. We can
+ * skip items that were already ruled out, and terminate if there are
+ * no remaining MCV items that might possibly match.
+ */
+ for (i = 0; i < mcvlist->nitems; i++)
+ {
+ bool mismatch = false;
+ MCVItem item = mcvlist->items[i];
+
+ /*
+ * find the lowest frequency in the MCV list
+ * FIXME Maybe not the best place to do this (it runs once per clause).
+ */
+ if (item->frequency < *lowsel)
+ *lowsel = item->frequency;
+
+ /*
+ * If there are no more matches (AND) or no remaining unmatched
+ * items (OR), we can stop processing this clause.
+ */
+ if (((nmatches == 0) && (! is_or)) ||
+ ((nmatches == mcvlist->nitems) && is_or))
+ break;
+
+ /*
+ * For AND-lists, we can also mark NULL items as 'no match' (and
+ * then skip them). For OR-lists this is not possible.
+ */
+ if ((! is_or) && item->isnull[idx])
+ matches[i] = MVSTATS_MATCH_NONE;
+
+ /* skip MCV items that were already ruled out */
+ if ((! is_or) && (matches[i] == MVSTATS_MATCH_NONE))
+ continue;
+ else if (is_or && (matches[i] == MVSTATS_MATCH_FULL))
+ continue;
+
+ /* TODO consider bsearch here (list is sorted by values)
+ * TODO handle other operators too (LT, GT)
+ * TODO identify "full match" when the clauses fully
+ * match the whole MCV list (so that checking the
+ * histogram is not needed)
+ */
+ if (oprrest == F_EQSEL)
+ {
+ /*
+ * We don't care about isgt in equality, because it does not
+ * matter whether it's (var = const) or (const = var).
+ */
+ bool match = DatumGetBool(FunctionCall2Coll(&opproc,
+ DEFAULT_COLLATION_OID,
+ cst->constvalue,
+ item->values[idx]));
+
+ if (match)
+ eqmatches = bms_add_member(eqmatches, idx);
+
+ mismatch = (! match);
+ }
+ else if (oprrest == F_SCALARLTSEL) /* column < constant */
+ {
+
+ if (! isgt) /* (var < const) */
+ {
+ /*
+ * If the constant is below the MCV item value, the item
+ * cannot match (var < const), so mark it as a mismatch.
+ */
+ mismatch = DatumGetBool(FunctionCall2Coll(&opproc,
+ DEFAULT_COLLATION_OID,
+ cst->constvalue,
+ item->values[idx]));
+
+ } /* (get_oprrest(expr->opno) == F_SCALARLTSEL) */
+ else /* (const < var) */
+ {
+ /*
+ * If the MCV item value is below the constant, the item
+ * cannot match (const < var), so mark it as a mismatch.
+ */
+ mismatch = DatumGetBool(FunctionCall2Coll(&opproc,
+ DEFAULT_COLLATION_OID,
+ item->values[idx],
+ cst->constvalue));
+ }
+ }
+ else if (oprrest == F_SCALARGTSEL) /* column > constant */
+ {
+
+ if (! isgt) /* (var > const) */
+ {
+ /*
+ * If the constant is above the MCV item value, the item
+ * cannot match (var > const), so mark it as a mismatch.
+ */
+ mismatch = DatumGetBool(FunctionCall2Coll(&opproc,
+ DEFAULT_COLLATION_OID,
+ cst->constvalue,
+ item->values[idx]));
+ }
+ else /* (const > var) */
+ {
+ /*
+ * If the MCV item value is above the constant, the item
+ * cannot match (const > var), so mark it as a mismatch.
+ */
+ mismatch = DatumGetBool(FunctionCall2Coll(&opproc,
+ DEFAULT_COLLATION_OID,
+ item->values[idx],
+ cst->constvalue));
+ }
+
+ } /* (get_oprrest(expr->opno) == F_SCALARGTSEL) */
+
+ /* XXX The conditions on matches[i] are not needed, as we
+ * skip MCV items that can't become true/false, depending
+ * on the current flag. See beginning of the loop over
+ * MCV items.
+ */
+
+ if ((is_or) && (matches[i] == MVSTATS_MATCH_NONE) && (! mismatch))
+ {
+ /* OR - was MATCH_NONE, but will be MATCH_FULL */
+ matches[i] = MVSTATS_MATCH_FULL;
+ ++nmatches;
+ continue;
+ }
+ else if ((! is_or) && (matches[i] == MVSTATS_MATCH_FULL) && mismatch)
+ {
+ /* AND - was MATCH_FULL, but will be MATCH_NONE */
+ matches[i] = MVSTATS_MATCH_NONE;
+ --nmatches;
+ continue;
+ }
+
+ }
+ }
+ }
+ else if (IsA(clause, NullTest))
+ {
+ NullTest * expr = (NullTest*)clause;
+ Var * var = (Var*)(expr->arg);
+
+ /* FIXME proper matching attribute to dimension */
+ int idx = mv_get_index(var->varattno, stakeys);
+
+ /*
+ * Walk through the MCV items and evaluate the current clause. We can
+ * skip items that were already ruled out, and terminate if there are
+ * no remaining MCV items that might possibly match.
+ */
+ for (i = 0; i < mcvlist->nitems; i++)
+ {
+ MCVItem item = mcvlist->items[i];
+
+ /*
+ * find the lowest frequency in the MCV list
+ * FIXME Maybe not the best place to do this (it runs once per clause).
+ */
+ if (item->frequency < *lowsel)
+ *lowsel = item->frequency;
+
+ /* if there are no more matches, we can stop processing this clause */
+ if (nmatches == 0)
+ break;
+
+ /* skip MCV items that were already ruled out */
+ if (matches[i] == MVSTATS_MATCH_NONE)
+ continue;
+
+ /* if the clause mismatches the MCV item, set it as MATCH_NONE */
+ if (((expr->nulltesttype == IS_NULL) && (! mcvlist->items[i]->isnull[idx])) ||
+ ((expr->nulltesttype == IS_NOT_NULL) && (mcvlist->items[i]->isnull[idx])))
+ {
+ matches[i] = MVSTATS_MATCH_NONE;
+ --nmatches;
+ }
+ }
+ }
+ else if (or_clause(clause) || and_clause(clause))
+ {
+ /* AND/OR clause, with all clauses compatible with the selected MV stat */
+
+ int i;
+ BoolExpr *orclause = ((BoolExpr*)clause);
+ List *orclauses = orclause->args;
+
+ /* match/mismatch bitmap for each MCV item */
+ int or_nmatches = 0;
+ char * or_matches = NULL;
+
+ Assert(orclauses != NIL);
+ Assert(list_length(orclauses) >= 2);
+
+ /* number of matching MCV items */
+ or_nmatches = mcvlist->nitems;
+
+ /* allocate the match bitmap for the sub-clauses */
+ or_matches = palloc0(sizeof(char) * or_nmatches);
+
+ if (or_clause(clause))
+ {
+ /* OR clauses assume nothing matches, initially */
+ memset(or_matches, MVSTATS_MATCH_NONE, sizeof(char)*or_nmatches);
+ or_nmatches = 0;
+ }
+ else
+ {
+ /* AND clauses assume everything matches, initially */
+ memset(or_matches, MVSTATS_MATCH_FULL, sizeof(char)*or_nmatches);
+ }
+
+ /* build the match bitmap for the sub-clauses */
+ or_nmatches = update_match_bitmap_mcvlist(root, orclauses,
+ stakeys, mcvlist,
+ or_nmatches, or_matches,
+ lowsel, fullmatch, or_clause(clause));
+
+ /* merge the bitmap into the existing one */
+ for (i = 0; i < mcvlist->nitems; i++)
+ {
+ /*
+ * To AND-merge the bitmaps, a MIN() semantics is used.
+ * For OR-merge, use MAX().
+ *
+ * FIXME this does not update nmatches to reflect the merged bitmap
+ */
+ UPDATE_RESULT(matches[i], or_matches[i], is_or);
+ }
+
+ pfree(or_matches);
+
+ }
+ else
+ {
+ elog(ERROR, "unknown clause type: %d", clause->type);
+ }
+ }
+
+ /*
+ * If all the columns were matched by equality, it's a full match.
+ * In that case there can be at most a single matching MCV item
+ * (two matching items would have to be exactly the same).
+ */
+ *fullmatch = (bms_num_members(eqmatches) == mcvlist->ndimensions);
+
+ /* free the allocated pieces */
+ if (eqmatches)
+ pfree(eqmatches);
+
+ return nmatches;
+}
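The TODO above about using a real bitmap could look roughly like this
(a sketch, not part of the patch - one bit per MCV item, merging via
& and | instead of MIN/MAX):

    #include <stdint.h>

    #define BITMAP_NBYTES(nitems)   (((nitems) + 7) / 8)

    static inline void
    bitmap_set(uint8_t *bm, int i)
    {
        bm[i / 8] |= (1 << (i % 8));
    }

    static inline int
    bitmap_get(const uint8_t *bm, int i)
    {
        return (bm[i / 8] >> (i % 8)) & 1;
    }

    /* merge 'other' into 'bm' - & for AND-ed lists, | for OR-ed ones */
    static void
    bitmap_merge(uint8_t *bm, const uint8_t *other, int nitems, int is_or)
    {
        int i;

        for (i = 0; i < BITMAP_NBYTES(nitems); i++)
            bm[i] = is_or ? (bm[i] | other[i]) : (bm[i] & other[i]);
    }

One caveat is that a single bit can't represent a partial match, which
the histogram part presumably needs, so this would only work where we
track full matches and mismatches.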
diff --git a/src/backend/optimizer/util/plancat.c b/src/backend/optimizer/util/plancat.c
index c397773..8c4396a 100644
--- a/src/backend/optimizer/util/plancat.c
+++ b/src/backend/optimizer/util/plancat.c
@@ -409,7 +409,7 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
mvstat = (Form_pg_mv_statistic) GETSTRUCT(htup);
/* unavailable stats are not interesting for the planner */
- if (mvstat->deps_built)
+ if (mvstat->deps_built || mvstat->mcv_built)
{
info = makeNode(MVStatisticInfo);
@@ -418,9 +418,11 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
/* enabled statistics */
info->deps_enabled = mvstat->deps_enabled;
+ info->mcv_enabled = mvstat->mcv_enabled;
/* built/available statistics */
info->deps_built = mvstat->deps_built;
+ info->mcv_built = mvstat->mcv_built;
/* stakeys */
adatum = SysCacheGetAttr(MVSTATOID, htup,
diff --git a/src/backend/utils/mvstats/Makefile b/src/backend/utils/mvstats/Makefile
index 099f1ed..f9bf10c 100644
--- a/src/backend/utils/mvstats/Makefile
+++ b/src/backend/utils/mvstats/Makefile
@@ -12,6 +12,6 @@ subdir = src/backend/utils/mvstats
top_builddir = ../../../..
include $(top_builddir)/src/Makefile.global
-OBJS = common.o dependencies.o
+OBJS = common.o dependencies.o mcv.o
include $(top_srcdir)/src/backend/common.mk
diff --git a/src/backend/utils/mvstats/common.c b/src/backend/utils/mvstats/common.c
index bd200bc..d1da714 100644
--- a/src/backend/utils/mvstats/common.c
+++ b/src/backend/utils/mvstats/common.c
@@ -16,12 +16,14 @@
#include "common.h"
+#include "utils/array.h"
+
static VacAttrStats ** lookup_var_attr_stats(int2vector *attrs,
- int natts, VacAttrStats **vacattrstats);
+ int natts,
+ VacAttrStats **vacattrstats);
static List* list_mv_stats(Oid relid);
-
/*
* Compute requested multivariate stats, using the rows sampled for the
* plain (single-column) stats.
@@ -49,6 +51,8 @@ build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
int j;
MVStatisticInfo *stat = (MVStatisticInfo *)lfirst(lc);
MVDependencies deps = NULL;
+ MCVList mcvlist = NULL;
+ int numrows_filtered = 0;
VacAttrStats **stats = NULL;
int numatts = 0;
@@ -87,8 +91,12 @@ build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
if (stat->deps_enabled)
deps = build_mv_dependencies(numrows, rows, attrs, stats);
+ /* build the MCV list */
+ if (stat->mcv_enabled)
+ mcvlist = build_mv_mcvlist(numrows, rows, attrs, stats, &numrows_filtered);
+
/* store the histogram / MCV list in the catalog */
- update_mv_stats(stat->mvoid, deps, attrs);
+ update_mv_stats(stat->mvoid, deps, mcvlist, attrs, stats);
}
}
@@ -166,6 +174,8 @@ list_mv_stats(Oid relid)
info->stakeys = buildint2vector(stats->stakeys.values, stats->stakeys.dim1);
info->deps_enabled = stats->deps_enabled;
info->deps_built = stats->deps_built;
+ info->mcv_enabled = stats->mcv_enabled;
+ info->mcv_built = stats->mcv_built;
result = lappend(result, info);
}
@@ -180,8 +190,56 @@ list_mv_stats(Oid relid)
return result;
}
+
+/*
+ * Find attnums of MV stats using the mvoid.
+ */
+int2vector*
+find_mv_attnums(Oid mvoid, Oid *relid)
+{
+ ArrayType *arr;
+ Datum adatum;
+ bool isnull;
+ HeapTuple htup;
+ int2vector *keys;
+
+ /* Fetch the pg_mv_statistic tuple for the given mvoid from syscache. */
+ htup = SearchSysCache1(MVSTATOID,
+ ObjectIdGetDatum(mvoid));
+
+ /* XXX syscache contains OIDs of deleted stats (not invalidated) */
+ if (! HeapTupleIsValid(htup))
+ return NULL;
+
+ /* starelid */
+ adatum = SysCacheGetAttr(MVSTATOID, htup,
+ Anum_pg_mv_statistic_starelid, &isnull);
+ Assert(!isnull);
+
+ *relid = DatumGetObjectId(adatum);
+
+ /* stakeys */
+ adatum = SysCacheGetAttr(MVSTATOID, htup,
+ Anum_pg_mv_statistic_stakeys, &isnull);
+ Assert(!isnull);
+
+ arr = DatumGetArrayTypeP(adatum);
+
+ keys = buildint2vector((int16 *) ARR_DATA_PTR(arr),
+ ARR_DIMS(arr)[0]);
+ ReleaseSysCache(htup);
+
+ /* TODO maybe save the list into relcache, as in RelationGetIndexList
+ * (which was used as an inspiration for this function)? */
+
+ return keys;
+}
+
+
void
-update_mv_stats(Oid mvoid, MVDependencies dependencies, int2vector *attrs)
+update_mv_stats(Oid mvoid,
+ MVDependencies dependencies, MCVList mcvlist,
+ int2vector *attrs, VacAttrStats **stats)
{
HeapTuple stup,
oldtup;
@@ -206,18 +264,29 @@ update_mv_stats(Oid mvoid, MVDependencies dependencies, int2vector *attrs)
= PointerGetDatum(serialize_mv_dependencies(dependencies));
}
+ if (mcvlist != NULL)
+ {
+ bytea * data = serialize_mv_mcvlist(mcvlist, attrs, stats);
+ nulls[Anum_pg_mv_statistic_stamcv -1] = (data == NULL);
+ values[Anum_pg_mv_statistic_stamcv - 1] = PointerGetDatum(data);
+ }
+
/* always replace the value (either by bytea or NULL) */
replaces[Anum_pg_mv_statistic_stadeps -1] = true;
+ replaces[Anum_pg_mv_statistic_stamcv -1] = true;
/* always change the availability flags */
nulls[Anum_pg_mv_statistic_deps_built -1] = false;
+ nulls[Anum_pg_mv_statistic_mcv_built -1] = false;
nulls[Anum_pg_mv_statistic_stakeys-1] = false;
/* use the new attnums, in case we removed some dropped ones */
replaces[Anum_pg_mv_statistic_deps_built-1] = true;
+ replaces[Anum_pg_mv_statistic_mcv_built -1] = true;
replaces[Anum_pg_mv_statistic_stakeys -1] = true;
values[Anum_pg_mv_statistic_deps_built-1] = BoolGetDatum(dependencies != NULL);
+ values[Anum_pg_mv_statistic_mcv_built -1] = BoolGetDatum(mcvlist != NULL);
values[Anum_pg_mv_statistic_stakeys -1] = PointerGetDatum(attrs);
/* Is there already a pg_mv_statistic tuple for this attribute? */
@@ -246,6 +315,21 @@ update_mv_stats(Oid mvoid, MVDependencies dependencies, int2vector *attrs)
heap_close(sd, RowExclusiveLock);
}
+
+int
+mv_get_index(AttrNumber varattno, int2vector * stakeys)
+{
+ int i, idx = 0;
+ for (i = 0; i < stakeys->dim1; i++)
+ {
+ if (stakeys->values[i] < varattno)
+ idx += 1;
+ else
+ break;
+ }
+ return idx;
+}
+
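For example, with stakeys = {2, 5, 7} (the attnums are sorted when the
statistics is created), mv_get_index(5, stakeys) skips the one smaller
attnum and returns 1, i.e. attnum 5 is the second dimension. Note that
an attnum not present in the vector is not detected - the function
simply returns the number of smaller attnums.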
/* multi-variate stats comparator */
/*
@@ -256,11 +340,15 @@ update_mv_stats(Oid mvoid, MVDependencies dependencies, int2vector *attrs)
int
compare_scalars_simple(const void *a, const void *b, void *arg)
{
- Datum da = *(Datum*)a;
- Datum db = *(Datum*)b;
- SortSupport ssup= (SortSupport) arg;
+ return compare_datums_simple(*(Datum*)a,
+ *(Datum*)b,
+ (SortSupport)arg);
+}
- return ApplySortComparator(da, false, db, false, ssup);
+int
+compare_datums_simple(Datum a, Datum b, SortSupport ssup)
+{
+ return ApplySortComparator(a, false, b, false, ssup);
}
/*
diff --git a/src/backend/utils/mvstats/common.h b/src/backend/utils/mvstats/common.h
index 6d5465b..f4309f7 100644
--- a/src/backend/utils/mvstats/common.h
+++ b/src/backend/utils/mvstats/common.h
@@ -46,7 +46,15 @@ typedef struct
Datum value; /* a data value */
int tupno; /* position index for tuple it came from */
} ScalarItem;
-
+
+/* (de)serialization info */
+typedef struct DimensionInfo {
+ int nvalues; /* number of deduplicated values */
+ int nbytes; /* number of bytes (serialized) */
+ int typlen; /* pg_type.typlen */
+ bool typbyval; /* pg_type.typbyval */
+} DimensionInfo;
+
/* multi-sort */
typedef struct MultiSortSupportData {
int ndims; /* number of dimensions supported by the */
@@ -71,5 +79,6 @@ int multi_sort_compare_dim(int dim, const SortItem *a,
const SortItem *b, MultiSortSupport mss);
/* comparators, used when constructing multivariate stats */
+int compare_datums_simple(Datum a, Datum b, SortSupport ssup);
int compare_scalars_simple(const void *a, const void *b, void *arg);
int compare_scalars_partition(const void *a, const void *b, void *arg);
diff --git a/src/backend/utils/mvstats/mcv.c b/src/backend/utils/mvstats/mcv.c
new file mode 100644
index 0000000..670dbda
--- /dev/null
+++ b/src/backend/utils/mvstats/mcv.c
@@ -0,0 +1,1237 @@
+/*-------------------------------------------------------------------------
+ *
+ * mcv.c
+ * POSTGRES multivariate MCV lists
+ *
+ *
+ * Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ * src/backend/utils/mvstats/mcv.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres.h"
+
+#include "fmgr.h"
+#include "funcapi.h"
+
+#include "utils/lsyscache.h"
+
+#include "common.h"
+
+/*
+ * Multivariate MCVs (most-common values lists) are a straightforward
+ * extension of regular MCV lists, tracking combinations of values for
+ * several attributes (columns), including NULL flags and the frequency
+ * of each combination.
+ *
+ * For columns with a small number of distinct values, this works quite
+ * well and may represent the distribution very accurately. For columns
+ * with a large number of distinct values (e.g. stored as FLOAT), this
+ * does not work that well, especially if the distribution is mostly
+ * uniform, with no very common combinations.
+ *
+ * If we can represent the distribution as a MCV list, we can estimate
+ * some clauses (e.g. equality clauses) much more accurately than
+ * using histograms, for example.
+ *
+ * Another benefit of MCV lists (compared to histograms) is that they
+ * don't require sorting of the values, so that they work better for
+ * data types that either don't support sorting at all, or when the
+ * sorting does not really match the meaning. For example we know how to
+ * sort strings, but it's unlikely to make much sense for city names.
+ *
+ *
+ * Hashed MCV (not yet implemented)
+ * --------------------------------
+ * By restricting to MCV list and equality conditions, we may use hash
+ * values instead of the long varlena values. This significantly reduces
+ * the storage requirements, and we can still use it to estimate the
+ * equality conditions (assuming the collisions are rare enough).
+ *
+ * This however complicates matching the columns to available stats, as
+ * it requires matching clauses (not columns) to stats. And it may get
+ * quite complex - e.g. what if there are multiple clauses, each
+ * compatible with different stats subset?
+ *
+ *
+ * Selectivity estimation
+ * ----------------------
+ * The estimation, implemented in clauselist_mv_selectivity_mcvlist(),
+ * is quite simple in principle - walk through the MCV items and sum
+ * frequencies of all the items that match all the clauses.
+ *
+ * The current implementation uses MCV lists to estimate these types
+ * of clauses (think of WHERE conditions):
+ *
+ * (a) equality clauses WHERE (a = 1) AND (b = 2)
+ * (b) inequality clauses WHERE (a < 1) AND (b >= 2)
+ * (c) NULL clauses WHERE (a IS NULL) AND (b IS NOT NULL)
+ * (d) OR clauses WHERE (a < 1) OR (b >= 2)
+ *
+ * It's possible to add more clauses, for example:
+ *
+ * (e) multi-var clauses WHERE (a > b)
+ *
+ * and so on. These are tasks for the future, not yet implemented.
+ *
+ *
+ * Estimating equality clauses
+ * ---------------------------
+ * When computing selectivity estimate for equality clauses
+ *
+ * (a = 1) AND (b = 2)
+ *
+ * we can do this estimate pretty exactly assuming that two conditions
+ * are met:
+ *
+ * (1) there's an equality condition on each attribute
+ *
+ * (2) we find a matching item in the MCV list
+ *
+ * In that case we know the MCV item represents all the tuples matching
+ * the clauses, and the selectivity estimate is complete. This is what
+ * we call 'full match'.
+ *
+ * When only (1) holds, but there's no matching MCV item, we don't know
+ * whether such rows don't exist at all or are just not very frequent.
+ * We can however use the frequency of the least frequent MCV item as
+ * an upper bound for the selectivity.
+ *
+ * If the equality conditions match only a subset of the attributes
+ * the MCV list is built on, we can't get a full match - we may get
+ * multiple MCV items matching the clauses, and even a single match
+ * does not rule out rows that did not get into the MCV list. But in
+ * this case we can still use the frequency of the least frequent MCV
+ * item to clamp the 'additional' selectivity not accounted for by the
+ * matching items.
+ *
+ * If there's no histogram, because the MCV list approximates the
+ * distribution accurately (not because the histogram was disabled),
+ * it does not really matter whether there are equality conditions on
+ * all the columns - we can do pretty accurate estimation using the MCV.
+ *
+ * TODO For a combination of equality conditions (not full-match case)
+ * we probably can clamp the selectivity by the minimum of
+ * selectivities for each condition. For example if we know the
+ * number of distinct values for each column, we can use 1/ndistinct
+ * as a per-column estimate. Or rather 1/ndistinct + selectivity
+ * derived from the MCV list.
+ *
+ * If we know the estimate of number of combinations of the columns
+ * (i.e. ndistinct(A,B)), we may estimate the average frequency of
+ * items in the remaining 10% as [10% / ndistinct(A,B)].
+ *
+ *
+ * Bounding estimates
+ * ------------------
+ * In general the MCV lists may not provide estimates as accurate as
+ * for the full-match equality case, but may provide some useful
+ * lower/upper boundaries for the estimation error.
+ *
+ * With equality clauses we can do a few more tricks to narrow this
+ * error range (see the previous section and TODO), but with inequality
+ * clauses (or generally non-equality clauses), it's rather difficult.
+ * There's nothing like a 'full match' - we have to consider both the
+ * MCV items and the remaining part every time. We can't use the minimum
+ * selectivity of MCV items, as the clauses may match multiple items.
+ *
+ * For example with a MCV list on columns (A, B), covering 90% of the
+ * table (computed while building the MCV list), about ~10% of the table
+ * is not represented by the MCV list. So even if the conditions match
+ * all the remaining rows (not represented by the MCV items), we can't
+ * get selectivity higher than those 10%. We may use 1/2 the remaining
+ * selectivity as an estimate (minimizing average error).
+ *
+ * TODO Most of these ideas (error limiting) are not yet implemented.
+ *
+ *
+ * General TODO
+ * ------------
+ *
+ * FIXME Use max_mcv_items from ALTER TABLE ADD STATISTICS command.
+ *
+ * TODO Add support for clauses referencing multiple columns (a < b).
+ *
+ * TODO It's possible to build a special case of MCV list, storing not
+ * the actual values but only 32/64-bit hash. This is only useful
+ * for estimating equality clauses and for large varlena types,
+ * which are very impractical for plain MCV list because of size.
+ * But for those data types we really want just the equality
+ * clauses, so it's actually a good solution.
+ *
+ * TODO Currently there's no logic to consider building only a MCV list
+ * (and not building the histogram at all), except for making this
+ * decision manually in ADD STATISTICS.
+ */
+
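To make the full-match reasoning above concrete (made-up numbers):
with an MCV list on (a,b) whose least frequent item has frequency
0.008, a query WHERE (a = 1) AND (b = 2) is estimated as

    - item (1,2) found with frequency 0.04  =>  selectivity 0.04, exact

    - no matching item  =>  selectivity bounded by 0.008, because a
      more frequent combination would have made it into the MCV list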
+/*
+ * Each serialized item needs to store (in this order):
+ *
+ * - indexes (ndim * sizeof(uint16))
+ * - null flags (ndim * sizeof(bool))
+ * - frequency (sizeof(double))
+ *
+ * So in total:
+ *
+ * ndim * (sizeof(uint16) + sizeof(bool)) + sizeof(double)
+ */
+#define ITEM_SIZE(ndims) \
+ (ndims * (sizeof(uint16) + sizeof(bool)) + sizeof(double))
+
+/* pointers into a flat serialized item of ITEM_SIZE(n) bytes */
+#define ITEM_INDEXES(item) ((uint16*)item)
+#define ITEM_NULLS(item,ndims) ((bool*)(ITEM_INDEXES(item) + ndims))
+#define ITEM_FREQUENCY(item,ndims) ((double*)(ITEM_NULLS(item,ndims) + ndims))
+
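So with ndims = 3 an item takes 3 * (2 + 1) + 8 = 17 bytes: three
uint16 indexes at offset 0, three bool NULL flags at offset 6, and the
double frequency at offset 9. That offset is not 8-byte aligned, so
code reading the frequency may want to copy it out rather than
dereference the pointer - a sketch (illustration only, relying on the
macros above):

    #include <string.h>

    static double
    item_get_frequency(const char *item, int ndims)
    {
        double freq;

        /* the double sits at an unaligned offset (9 for ndims = 3),
         * so memcpy() it instead of dereferencing the pointer */
        memcpy(&freq, ITEM_FREQUENCY(item, ndims), sizeof(double));
        return freq;
    }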
+/*
+ * Builds MCV list from sample rows, and removes rows represented by
+ * the MCV list from the sample (the number of remaining sample rows is
+ * returned by the numrows_filtered parameter).
+ *
+ * The method is quite simple - in short it does about these steps:
+ *
+ * (1) sort the data (default collation, '<' for the data type)
+ *
+ * (2) count distinct groups, decide how many to keep
+ *
+ * (3) build the MCV list using the threshold determined in (2)
+ *
+ * (4) remove rows represented by the MCV from the sample
+ *
+ * For more details, see the comments in the code.
+ *
+ * FIXME Single-dimensional MCV is sorted by frequency (descending). We
+ * should do that too, because when walking through the list we
+ * want to check the most frequent items first.
+ *
+ * TODO We're using Datum (8B), even for data types (e.g. int4 or
+ * float4). Maybe we could save some space here, but the bytea
+ * compression should handle it just fine.
+ *
+ * TODO This probably should not use the ndistinct computed from the
+ * sample directly, but rather an estimate of the number of
+ * distinct values in the whole table, no?
+ */
+MCVList
+build_mv_mcvlist(int numrows, HeapTuple *rows, int2vector *attrs,
+ VacAttrStats **stats, int *numrows_filtered)
+{
+ int i, j;
+ int numattrs = attrs->dim1;
+ int ndistinct = 0;
+ int mcv_threshold = 0;
+ int count = 0;
+ int nitems = 0;
+
+ MCVList mcvlist = NULL;
+
+ /* Sort by multiple columns (using array of SortSupport) */
+ MultiSortSupport mss = multi_sort_init(numattrs);
+
+ /*
+ * Preallocate space for all the items as a single chunk, and point
+ * the items to the appropriate parts of the array.
+ */
+ SortItem *items = (SortItem*)palloc0(numrows * sizeof(SortItem));
+ Datum *values = (Datum*)palloc0(sizeof(Datum) * numrows * numattrs);
+ bool *isnull = (bool*)palloc0(sizeof(bool) * numrows * numattrs);
+
+ /* keep all the rows by default (as if there was no MCV list) */
+ *numrows_filtered = numrows;
+
+ for (i = 0; i < numrows; i++)
+ {
+ items[i].values = &values[i * numattrs];
+ items[i].isnull = &isnull[i * numattrs];
+ }
+
+ /* load the values/null flags from sample rows */
+ for (j = 0; j < numrows; j++)
+ for (i = 0; i < numattrs; i++)
+ items[j].values[i] = heap_getattr(rows[j], attrs->values[i],
+ stats[i]->tupDesc, &items[j].isnull[i]);
+
+ /* prepare the sort functions for all the attributes */
+ for (i = 0; i < numattrs; i++)
+ multi_sort_add_dimension(mss, i, i, stats);
+
+ /* do the sort, using the multi-sort */
+ qsort_arg((void *) items, numrows, sizeof(SortItem),
+ multi_sort_compare, mss);
+
+ /*
+ * Count the number of distinct groups - just walk through the
+ * sorted list and count the number of key changes. We use this to
+ * determine the threshold (125% of the average frequency).
+ */
+ ndistinct = 1;
+ for (i = 1; i < numrows; i++)
+ if (multi_sort_compare(&items[i], &items[i-1], mss) != 0)
+ ndistinct += 1;
+
+ /*
+ * Determine how many groups actually exceed the threshold, and then
+ * walk the array again and collect them into an array. We'll always
+ * require at least 4 rows per group.
+ *
+ * But if we can fit all the distinct values in the MCV list (i.e.
+ * if there are fewer distinct groups than MVSTAT_MCVLIST_MAX_ITEMS),
+ * we'll require only 2 rows per group.
+ *
+ * TODO For now the threshold is the same as in the single-column
+ * case (average + 25%), but maybe that's worth revisiting
+ * for the multivariate case.
+ *
+ * TODO We can do this only if we believe we got all the distinct
+ * values of the table.
+ *
+ * FIXME This should really reference mcv_max_items (from catalog)
+ * instead of the constant MVSTAT_MCVLIST_MAX_ITEMS.
+ */
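+ /*
+ * A worked example with hypothetical numbers: with 30000 sample
+ * rows and 1000 distinct groups the average group size is 30
+ * rows, so the threshold evaluates to 1.25 * 30 = 37.5, which
+ * the integer assignment truncates to 37 rows.
+ */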
+ mcv_threshold = 1.25 * numrows / ndistinct;
+ mcv_threshold = (mcv_threshold < 4) ? 4 : mcv_threshold;
+
+ if (ndistinct <= MVSTAT_MCVLIST_MAX_ITEMS)
+ mcv_threshold = 2;
+
+ /*
+ * Walk through the sorted data again, and see how many groups
+ * reach the mcv_threshold (and become an item in the MCV list).
+ */
+ count = 1;
+ for (i = 1; i <= numrows; i++)
+ {
+ /* last row or new group, so check if we exceed mcv_threshold */
+ if ((i == numrows) || (multi_sort_compare(&items[i], &items[i-1], mss) != 0))
+ {
+ /* group hits the threshold, count the group as MCV item */
+ if (count >= mcv_threshold)
+ nitems += 1;
+
+ count = 1;
+ }
+ else /* within group, so increase the number of items */
+ count += 1;
+ }
+
+ /* we know the number of MCV list items, so let's build the list */
+ if (nitems > 0)
+ {
+ /* allocate the MCV list structure, set parameters we know */
+ mcvlist = (MCVList)palloc0(sizeof(MCVListData));
+
+ mcvlist->magic = MVSTAT_MCV_MAGIC;
+ mcvlist->type = MVSTAT_MCV_TYPE_BASIC;
+ mcvlist->ndimensions = numattrs;
+ mcvlist->nitems = nitems;
+
+ /*
+ * Preallocate the Datum/isnull arrays (not as a single chunk, as
+ * we'll pass this outside this method and thus it needs to be
+ * easy to pfree() the data - we wouldn't know where the
+ * arrays start).
+ *
+ * TODO Maybe the reasoning that we can't allocate a single
+ * piece because we're passing it out is bogus? Who'd
+ * free a single item of the MCV list, anyway?
+ *
+ * TODO Maybe with a proper encoding (stuffing all the values
+ * into a list-level array), this will be untrue?
+ */
+ mcvlist->items = (MCVItem*)palloc0(sizeof(MCVItem)*nitems);
+
+ for (i = 0; i < nitems; i++)
+ {
+ mcvlist->items[i] = (MCVItem)palloc0(sizeof(MCVItemData));
+ mcvlist->items[i]->values = (Datum*)palloc0(sizeof(Datum)*numattrs);
+ mcvlist->items[i]->isnull = (bool*)palloc0(sizeof(bool)*numattrs);
+ }
+
+ /*
+ * Repeat the same loop as above, but this time copy the data
+ * into the MCV list (for items exceeding the threshold).
+ *
+ * TODO Maybe we could simply remember indexes of the last item
+ * in each group (from the previous loop)?
+ */
+ count = 1;
+ nitems = 0;
+ for (i = 1; i <= numrows; i++)
+ {
+ /* last row or a new group */
+ if ((i == numrows) || (multi_sort_compare(&items[i], &items[i-1], mss) != 0))
+ {
+ /* count the MCV item if exceeding the threshold (and copy into the array) */
+ if (count >= mcv_threshold)
+ {
+ /* just pointer to the proper place in the list */
+ MCVItem item = mcvlist->items[nitems];
+
+ /* copy values from the _previous_ group (i.e. its last item) */
+ memcpy(item->values, items[(i-1)].values, sizeof(Datum) * numattrs);
+ memcpy(item->isnull, items[(i-1)].isnull, sizeof(bool) * numattrs);
+
+ /* and finally the group frequency */
+ item->frequency = (double)count / numrows;
+
+ /* next item */
+ nitems += 1;
+ }
+
+ count = 1;
+ }
+ else /* same group, just increase the number of items */
+ count += 1;
+ }
+
+ /* make sure the loops are consistent */
+ Assert(nitems == mcvlist->nitems);
+
+ /*
+ * Remove the rows matching the MCV list (i.e. keep only rows
+ * that are not represented by the MCV list).
+ *
+ * FIXME This implementation is rather naive, effectively O(N^2).
+ * As the MCV list grows, the check will take longer and
+ * longer. And as the number of sampled rows increases (by
+ * increasing statistics target), it will take longer and
+ * longer. One option is to sort the MCV items first and
+ * then perform a binary search.
+ *
+ * A better option would be keeping the ID of the row in
+ * the sort item, and then just walking through the items
+ * and marking rows to remove (in a bitmap of the same
+ * size). There's no space for that in SortItem at the
+ * moment, but it's trivial to add a 'private' pointer, or
+ * to use another structure with an extra field (starting
+ * with SortItem, so that the comparators etc. still work).
+ *
+ * Another option is to use the sorted array of items
+ * (because that's how we sorted the source data), and
+ * simply do a bsearch() into it. If we find a matching
+ * item, the row belongs to the MCV list.
+ */
+ if (nitems == ndistinct) /* all rows are covered by MCV items */
+ *numrows_filtered = 0;
+ else /* (nitems < ndistinct) && (nitems > 0) */
+ {
+ int nfiltered = 0;
+ HeapTuple *rows_filtered = (HeapTuple*)palloc0(sizeof(HeapTuple) * numrows);
+
+ /* used for the searches */
+ SortItem item, mcvitem;
+
+ item.values = (Datum*)palloc0(numattrs * sizeof(Datum));
+ item.isnull = (bool*)palloc0(numattrs * sizeof(bool));
+
+ /*
+ * FIXME we don't need to allocate this, we can reference
+ * the MCV item directly ...
+ */
+ mcvitem.values = (Datum*)palloc0(numattrs * sizeof(Datum));
+ mcvitem.isnull = (bool*)palloc0(numattrs * sizeof(bool));
+
+ /* walk through the tuples, compare the values to MCV items */
+ for (i = 0; i < numrows; i++)
+ {
+ bool match = false;
+
+ /* collect the key values from the row */
+ for (j = 0; j < numattrs; j++)
+ item.values[j] = heap_getattr(rows[i], attrs->values[j],
+ stats[j]->tupDesc, &item.isnull[j]);
+
+ /* scan through the MCV list for matches */
+ for (j = 0; j < mcvlist->nitems; j++)
+ {
+ /*
+ * TODO Create a SortItem/MCVItem comparator so that
+ * we don't need to do memcpy() like crazy.
+ */
+ memcpy(mcvitem.values, mcvlist->items[j]->values,
+ numattrs * sizeof(Datum));
+ memcpy(mcvitem.isnull, mcvlist->items[j]->isnull,
+ numattrs * sizeof(bool));
+
+ if (multi_sort_compare(&item, &mcvitem, mss) == 0)
+ {
+ match = true;
+ break;
+ }
+ }
+
+ /* if no match in the MCV list, copy the row into the filtered ones */
+ if (! match)
+ memcpy(&rows_filtered[nfiltered++], &rows[i], sizeof(HeapTuple));
+ }
+
+ /* replace the rows and remember how many rows we kept */
+ memcpy(rows, rows_filtered, sizeof(HeapTuple) * nfiltered);
+ *numrows_filtered = nfiltered;
+
+ /* free all the data used here */
+ pfree(rows_filtered);
+ pfree(item.values);
+ pfree(item.isnull);
+ pfree(mcvitem.values);
+ pfree(mcvitem.isnull);
+ }
+ }
+
+ pfree(values);
+ pfree(items);
+ pfree(isnull);
+
+ return mcvlist;
+}
+
+
+/* fetch the MCV list (as a bytea) from the pg_mv_statistic catalog */
+MCVList
+load_mv_mcvlist(Oid mvoid)
+{
+ bool isnull = false;
+ Datum mcvlist;
+
+#ifdef USE_ASSERT_CHECKING
+ Form_pg_mv_statistic mvstat;
+#endif
+
+ /* fetch the pg_mv_statistic entry for the given statistics OID */
+ HeapTuple htup = SearchSysCache1(MVSTATOID, ObjectIdGetDatum(mvoid));
+
+ if (! HeapTupleIsValid(htup))
+ return NULL;
+
+#ifdef USE_ASSERT_CHECKING
+ mvstat = (Form_pg_mv_statistic) GETSTRUCT(htup);
+ Assert(mvstat->mcv_enabled && mvstat->mcv_built);
+#endif
+
+ mcvlist = SysCacheGetAttr(MVSTATOID, htup,
+ Anum_pg_mv_statistic_stamcv, &isnull);
+
+ Assert(!isnull);
+
+ ReleaseSysCache(htup);
+
+ return deserialize_mv_mcvlist(DatumGetByteaP(mcvlist));
+}
+
+/*
+ * Print some basic info about the MCV list.
+ *
+ * TODO Add info about what part of the table this covers.
+ */
+Datum
+pg_mv_stats_mcvlist_info(PG_FUNCTION_ARGS)
+{
+ bytea *data = PG_GETARG_BYTEA_P(0);
+ char *result;
+
+ MCVList mcvlist = deserialize_mv_mcvlist(data);
+
+ result = palloc0(128);
+ snprintf(result, 128, "nitems=%d", mcvlist->nitems);
+
+ pfree(mcvlist);
+
+ PG_RETURN_TEXT_P(cstring_to_text(result));
+}
+
+/* used to pass context into bsearch() */
+static SortSupport ssup_private = NULL;
+
+static int bsearch_comparator(const void * a, const void * b);
+
+/*
+ * Serialize MCV list into a bytea value. The basic algorithm is simple:
+ *
+ * (1) perform deduplication for each attribute (separately)
+ * (a) collect all (non-NULL) attribute values from all MCV items
+ * (b) sort the data (using 'lt' from VacAttrStats)
+ * (c) remove duplicate values from the array
+ *
+ * (2) serialize the arrays into a bytea value
+ *
+ * (3) process all MCV list items
+ * (a) replace values with indexes into the arrays
+ *
+ * Each attribute has to be processed separately, because we're mixing
+ * different datatypes, and we don't know what equality means for them.
+ * We're also mixing pass-by-value and pass-by-ref types, and so on.
+ *
+ * We'll use uint16 values for the indexes in step (3), as we don't
+ * allow more than 8k MCV items (see MVSTAT_MCVLIST_MAX_ITEMS). We
+ * might increase this to 65k and still fit into uint16.
+ *
+ * We don't really expect compression as high as with histograms,
+ * because we're not doing any bucket splits etc. (which is the source
+ * of high redundancy there), but we need to do it anyway as we need
+ * to serialize varlena values etc. We might invent another way to
+ * serialize MCV lists, but let's keep it consistent.
+ *
+ * FIXME This probably leaks memory, or at least uses it inefficiently
+ * (many small palloc() calls instead of a large one).
+ *
+ * TODO Consider using 16-bit values for the indexes in step (3).
+ *
+ * TODO Consider packing boolean flags (NULL) for each item into 'char'
+ * or a longer type (instead of using an array of bool items).
+ */
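+
+/*
+ * A small illustration of steps (1) and (3) with made-up data: for
+ * MCV items (1,'x'), (1,'y') and (2,'x') the deduplicated arrays
+ * are {1,2} for the first dimension and {'x','y'} for the second,
+ * so the serialized items become the index pairs (0,0), (0,1) and
+ * (1,0).
+ */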
+bytea *
+serialize_mv_mcvlist(MCVList mcvlist, int2vector *attrs,
+ VacAttrStats **stats)
+{
+ int i, j;
+ int ndims = mcvlist->ndimensions;
+ int itemsize = ITEM_SIZE(ndims);
+
+ Size total_length = 0;
+
+ char *item = palloc0(itemsize);
+
+ /* serialized items (indexes into arrays, etc.) */
+ bytea *output;
+ char *data = NULL;
+
+ /* values per dimension (and number of non-NULL values) */
+ Datum **values = (Datum**)palloc0(sizeof(Datum*) * ndims);
+ int *counts = (int*)palloc0(sizeof(int) * ndims);
+
+ /* info about dimensions (for deserialize) */
+ DimensionInfo * info
+ = (DimensionInfo *)palloc0(sizeof(DimensionInfo)*ndims);
+
+ /* sort support data */
+ SortSupport ssup = (SortSupport)palloc0(sizeof(SortSupportData)*ndims);
+
+ /* collect and deduplicate values for each dimension */
+ for (i = 0; i < ndims; i++)
+ {
+ int count;
+ StdAnalyzeData *tmp = (StdAnalyzeData *)stats[i]->extra_data;
+
+ /* keep important info about the data type */
+ info[i].typlen = stats[i]->attrtype->typlen;
+ info[i].typbyval = stats[i]->attrtype->typbyval;
+
+ /* allocate space for all values, including NULLs (won't use them) */
+ values[i] = (Datum*)palloc0(sizeof(Datum) * mcvlist->nitems);
+
+ for (j = 0; j < mcvlist->nitems; j++)
+ {
+ if (! mcvlist->items[j]->isnull[i]) /* skip NULL values */
+ {
+ values[i][counts[i]] = mcvlist->items[j]->values[i];
+ counts[i] += 1;
+ }
+ }
+
+ /* there are just NULL values in this dimension */
+ if (counts[i] == 0)
+ continue;
+
+ /* sort and deduplicate */
+ ssup[i].ssup_cxt = CurrentMemoryContext;
+ ssup[i].ssup_collation = DEFAULT_COLLATION_OID;
+ ssup[i].ssup_nulls_first = false;
+
+ PrepareSortSupportFromOrderingOp(tmp->ltopr, &ssup[i]);
+
+ qsort_arg(values[i], counts[i], sizeof(Datum),
+ compare_scalars_simple, &ssup[i]);
+
+ /*
+ * Walk through the array and eliminate duplicate values, but
+ * keep the ordering (so that we can do bsearch later). We know
+ * there's at least 1 item, so we can skip the first element.
+ */
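+ /* e.g. a sorted array {1, 1, 2, 3, 3} is compacted to {1, 2, 3} (count = 3) */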
+ count = 1; /* number of deduplicated items */
+ for (j = 1; j < counts[i]; j++)
+ {
+ /* if it's different from the previous value, we need to keep it */
+ if (compare_datums_simple(values[i][j-1], values[i][j], &ssup[i]) != 0)
+ {
+ /* XXX: not needed if (count == j) */
+ values[i][count] = values[i][j];
+ count += 1;
+ }
+ }
+
+ /* do not exceed UINT16_MAX */
+ Assert(count <= UINT16_MAX);
+
+ /* keep info about the deduplicated count */
+ info[i].nvalues = count;
+
+ /* compute size of the serialized data */
+ if (info[i].typbyval || (info[i].typlen > 0))
+ /* passed by value, or by reference but with fixed length */
+ info[i].nbytes = info[i].nvalues * info[i].typlen;
+ else if (info[i].typlen == -1)
+ /* varlena, so just use VARSIZE_ANY */
+ for (j = 0; j < info[i].nvalues; j++)
+ info[i].nbytes += VARSIZE_ANY(values[i][j]);
+ else if (info[i].typlen == -2)
+ /* cstring, so simply strlen */
+ for (j = 0; j < info[i].nvalues; j++)
+ info[i].nbytes += strlen(DatumGetPointer(values[i][j]));
+ else
+ elog(ERROR, "unknown data type typbyval=%d typlen=%d",
+ info[i].typbyval, info[i].typlen);
+ }
+
+ /*
+ * Now we finally know how much space we'll need for the serialized
+ * MCV list, as it contains these fields:
+ *
+ * - length (4B) for varlena
+ * - magic (4B)
+ * - type (4B)
+ * - ndimensions (4B)
+ * - nitems (4B)
+ * - info (ndims * sizeof(DimensionInfo))
+ * - arrays of values for each dimension
+ * - serialized items (nitems * itemsize)
+ *
+ * So the 'header' size is 20B + ndims * sizeof(DimensionInfo), and
+ * then we'll place the data.
+ */
+ total_length = (sizeof(int32) + offsetof(MCVListData, items)
+ + ndims * sizeof(DimensionInfo)
+ + mcvlist->nitems * itemsize);
+
+ for (i = 0; i < ndims; i++)
+ total_length += info[i].nbytes;
+
+ /* enforce arbitrary limit of 1MB */
+ if (total_length > 1024 * 1024)
+ elog(ERROR, "serialized MCV exceeds 1MB (%ld)", total_length);
+
+ /* allocate space for the serialized MCV list, set header fields */
+ output = (bytea*)palloc0(total_length);
+ SET_VARSIZE(output, total_length);
+
+ /* we'll use 'data' to keep track of the place to write data */
+ data = VARDATA(output);
+
+ memcpy(data, mcvlist, offsetof(MCVListData, items));
+ data += offsetof(MCVListData, items);
+
+ memcpy(data, info, sizeof(DimensionInfo) * ndims);
+ data += sizeof(DimensionInfo) * ndims;
+
+ /* value array for each dimension */
+ for (i = 0; i < ndims; i++)
+ {
+#ifdef USE_ASSERT_CHECKING
+ char *tmp = data;
+#endif
+ for (j = 0; j < info[i].nvalues; j++)
+ {
+ if (info[i].typbyval)
+ {
+ /* passed by value / Datum */
+ memcpy(data, &values[i][j], info[i].typlen);
+ data += info[i].typlen;
+ }
+ else if (info[i].typlen > 0)
+ {
+ /* passed by reference, but fixed length (name, tid, ...) */
+ memcpy(data, &values[i][j], info[i].typlen);
+ data += info[i].typlen;
+ }
+ else if (info[i].typlen == -1)
+ {
+ /* varlena */
+ memcpy(data, DatumGetPointer(values[i][j]),
+ VARSIZE_ANY(values[i][j]));
+ data += VARSIZE_ANY(values[i][j]);
+ }
+ else if (info[i].typlen == -2)
+ {
+ /* cstring (don't forget the \0 terminator!) */
+ memcpy(data, DatumGetPointer(values[i][j]),
+ strlen(DatumGetPointer(values[i][j])) + 1);
+ data += strlen(DatumGetPointer(values[i][j])) + 1;
+ }
+ }
+ Assert((data - tmp) == info[i].nbytes);
+ }
+
+ /* and finally, the MCV items */
+ for (i = 0; i < mcvlist->nitems; i++)
+ {
+ /* don't write beyond the allocated space */
+ Assert(data <= (char*)output + total_length - itemsize);
+
+ /* reset the values for each item */
+ memset(item, 0, itemsize);
+
+ for (j = 0; j < ndims; j++)
+ {
+ /* do the lookup only for non-NULL values */
+ if (! mcvlist->items[i]->isnull[j])
+ {
+ Datum * v = NULL;
+ ssup_private = &ssup[j];
+
+ v = (Datum*)bsearch(&mcvlist->items[i]->values[j],
+ values[j], info[j].nvalues, sizeof(Datum),
+ bsearch_comparator);
+
+ if (v == NULL)
+ elog(ERROR, "value for dim %d not found in array", j);
+
+ /* compute index within the array */
+ ITEM_INDEXES(item)[j] = (v - values[j]);
+
+ /* check the index is within expected bounds */
+ Assert(ITEM_INDEXES(item)[j] >= 0);
+ Assert(ITEM_INDEXES(item)[j] < info[j].nvalues);
+ }
+ }
+
+ /* copy NULL and frequency flags into the item */
+ memcpy(ITEM_NULLS(item, ndims),
+ mcvlist->items[i]->isnull, sizeof(bool) * ndims);
+ memcpy(ITEM_FREQUENCY(item, ndims),
+ &mcvlist->items[i]->frequency, sizeof(double));
+
+ /* copy the item into the array */
+ memcpy(data, item, itemsize);
+
+ data += itemsize;
+ }
+
+ /* at this point we expect to match the total_length exactly */
+ Assert((data - (char*)output) == total_length);
+
+ return output;
+}
+
+/*
+ * Inverse to serialize_mv_mcvlist() - see the comment there.
+ *
+ * We'll do a full deserialization, because we don't really expect high
+ * duplication of values, so caching would not be as efficient as with
+ * histograms.
+ */
+MCVList
+deserialize_mv_mcvlist(bytea *data)
+{
+ int i, j;
+ Size expected_size;
+ MCVList mcvlist;
+ char *tmp;
+
+ int ndims, nitems, itemsize;
+ DimensionInfo *info = NULL;
+
+ uint16 *indexes = NULL;
+ Datum **values = NULL;
+
+ /* local allocation buffer (used only for deserialization) */
+ int bufflen;
+ char *buff;
+ char *ptr;
+
+ /* buffer used for the result */
+ int rbufflen;
+ char *rbuff;
+ char *rptr;
+
+ if (data == NULL)
+ return NULL;
+
+ if (VARSIZE_ANY_EXHDR(data) < offsetof(MCVListData,items))
+ elog(ERROR, "invalid MCV Size %ld (expected at least %ld)",
+ VARSIZE_ANY_EXHDR(data), offsetof(MCVListData,items));
+
+ /* read the MCV list header */
+ mcvlist = (MCVList)palloc0(sizeof(MCVListData));
+
+ /* initialize pointer to the data part (skip the varlena header) */
+ tmp = VARDATA(data);
+
+ /* get the header and perform basic sanity checks */
+ memcpy(mcvlist, tmp, offsetof(MCVListData,items));
+ tmp += offsetof(MCVListData,items);
+
+ if (mcvlist->magic != MVSTAT_MCV_MAGIC)
+ elog(ERROR, "invalid MCV magic %d (expected %dd)",
+ mcvlist->magic, MVSTAT_MCV_MAGIC);
+
+ if (mcvlist->type != MVSTAT_MCV_TYPE_BASIC)
+ elog(ERROR, "invalid MCV type %d (expected %dd)",
+ mcvlist->type, MVSTAT_MCV_TYPE_BASIC);
+
+ nitems = mcvlist->nitems;
+ ndims = mcvlist->ndimensions;
+ itemsize = ITEM_SIZE(ndims);
+
+ Assert(nitems > 0);
+ Assert((ndims >= 2) && (ndims <= MVSTATS_MAX_DIMENSIONS));
+
+ /*
+ * What size do we expect with those parameters? It's incomplete,
+ * as we have yet to add the sizes of the value arrays (from the
+ * DimensionInfo records).
+ */
+ expected_size = offsetof(MCVListData,items) +
+ ndims * sizeof(DimensionInfo) +
+ (nitems * itemsize);
+
+ /* check that we have at least the DimensionInfo records */
+ if (VARSIZE_ANY_EXHDR(data) < expected_size)
+ elog(ERROR, "invalid MCV Size %ld (expected %ld)",
+ VARSIZE_ANY_EXHDR(data), expected_size);
+
+ info = (DimensionInfo*)(tmp);
+ tmp += ndims * sizeof(DimensionInfo);
+
+ /* account for the value arrays */
+ for (i = 0; i < ndims; i++)
+ expected_size += info[i].nbytes;
+
+ if (VARSIZE_ANY_EXHDR(data) != expected_size)
+ elog(ERROR, "invalid MCV Size %ld (expected %ld)",
+ VARSIZE_ANY_EXHDR(data), expected_size);
+
+ /* looks OK - not corrupted or something */
+
+ /*
+ * We'll allocate one large chunk of memory for the intermediate
+ * data, needed only for deserializing the MCV list, and we'll
+ * use a local dense allocation to minimize the palloc overhead.
+ *
+ * Let's see how much space we'll actually need, and also include
+ * space for the array with pointers.
+ */
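+ /*
+ * For instance (hypothetical numbers): two int4 dimensions on a
+ * 64-bit build (pass-by-value, but typlen 4 != sizeof(Datum))
+ * with 10 and 20 deduplicated values need 2 * sizeof(Datum*) for
+ * the pointers plus (10 + 20) * sizeof(Datum) for the arrays.
+ */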
+ bufflen = sizeof(Datum*) * ndims; /* space for pointers */
+
+ for (i = 0; i < ndims; i++)
+ /* for full-size byval types, we reuse the serialized value */
+ if (! (info[i].typbyval && info[i].typlen == sizeof(Datum)))
+ bufflen += (sizeof(Datum) * info[i].nvalues);
+
+ buff = palloc(bufflen);
+ ptr = buff;
+
+ values = (Datum**)buff;
+ ptr += (sizeof(Datum*) * ndims);
+
+ /*
+ * FIXME This uses pointers into the original data array (for types
+ * not passed by value), so when someone frees the memory,
+ * e.g. by doing something like this:
+ *
+ * bytea * data = ... fetch the data from catalog ...
+ * MCVList mcvlist = deserialize_mv_mcvlist(data);
+ * pfree(data);
+ *
+ * then 'mcvlist' references the freed memory. This needs to
+ * copy the pieces.
+ */
+ for (i = 0; i < ndims; i++)
+ {
+ if (info[i].typbyval)
+ {
+ /* passed by value / Datum - simply reuse the array */
+ if (info[i].typlen == sizeof(Datum))
+ {
+ values[i] = (Datum*)tmp;
+ tmp += info[i].nbytes;
+ }
+ else
+ {
+ values[i] = (Datum*)ptr;
+ ptr += (sizeof(Datum) * info[i].nvalues);
+
+ for (j = 0; j < info[i].nvalues; j++)
+ {
+ /* copy the value into the array */
+ memcpy(&values[i][j], tmp, info[i].typlen);
+ tmp += info[i].typlen;
+ }
+ }
+ }
+ else
+ {
+ /* all the varlena data need a chunk from the buffer */
+ values[i] = (Datum*)ptr;
+ ptr += (sizeof(Datum) * info[i].nvalues);
+
+ /* passed by reference, but fixed length (name, tid, ...) */
+ if (info[i].typlen > 0)
+ {
+ for (j = 0; j < info[i].nvalues; j++)
+ {
+ /* just point into the array */
+ values[i][j] = PointerGetDatum(tmp);
+ tmp += info[i].typlen;
+ }
+ }
+ else if (info[i].typlen == -1)
+ {
+ /* varlena */
+ for (j = 0; j < info[i].nvalues; j++)
+ {
+ /* just point into the array */
+ values[i][j] = PointerGetDatum(tmp);
+ tmp += VARSIZE_ANY(tmp);
+ }
+ }
+ else if (info[i].typlen == -2)
+ {
+ /* cstring */
+ for (j = 0; j < info[i].nvalues; j++)
+ {
+ /* just point into the array */
+ values[i][j] = PointerGetDatum(tmp);
+ tmp += (strlen(tmp) + 1); /* don't forget the \0 */
+ }
+ }
+ }
+ }
+
+ /* we should exhaust the buffer exactly */
+ Assert((ptr - buff) == bufflen);
+
+ /* allocate space for the MCV items in a single piece */
+ rbufflen = (sizeof(MCVItem) + sizeof(MCVItemData) +
+ sizeof(Datum)*ndims + sizeof(bool)*ndims) * nitems;
+
+ rbuff = palloc(rbufflen);
+ rptr = rbuff;
+
+ mcvlist->items = (MCVItem*)rbuff;
+ rptr += (sizeof(MCVItem) * nitems);
+
+ for (i = 0; i < nitems; i++)
+ {
+ MCVItem item = (MCVItem)rptr;
+ rptr += (sizeof(MCVItemData));
+
+ item->values = (Datum*)rptr;
+ rptr += (sizeof(Datum)*ndims);
+
+ item->isnull = (bool*)rptr;
+ rptr += (sizeof(bool) *ndims);
+
+ /* just point to the right place */
+ indexes = ITEM_INDEXES(tmp);
+
+ memcpy(item->isnull, ITEM_NULLS(tmp, ndims), sizeof(bool) * ndims);
+ memcpy(&item->frequency, ITEM_FREQUENCY(tmp, ndims), sizeof(double));
+
+#ifdef USE_ASSERT_CHECKING
+ for (j = 0; j < ndims; j++)
+ Assert(indexes[j] <= UINT16_MAX);
+#endif
+
+ /* translate the values */
+ for (j = 0; j < ndims; j++)
+ if (! item->isnull[j])
+ item->values[j] = values[j][indexes[j]];
+
+ mcvlist->items[i] = item;
+
+ tmp += ITEM_SIZE(ndims);
+
+ Assert(tmp <= (char*)data + VARSIZE_ANY(data));
+ }
+
+ /* check that we processed all the data */
+ Assert(tmp == (char*)data + VARSIZE_ANY(data));
+
+ /* release the temporary buffer */
+ pfree(buff);
+
+ return mcvlist;
+}
+
+/*
+ * We need to pass the SortSupport to the comparator, but bsearch()
+ * has no 'context' parameter, so we use a global variable (ugly).
+ */
+static int
+bsearch_comparator(const void * a, const void * b)
+{
+ Assert(ssup_private != NULL);
+ return compare_scalars_simple(a, b, (void*)ssup_private);
+}
+
+/*
+ * SRF with details about the items of an MCV list:
+ *
+ * - item ID (0...nitems)
+ * - values (string array)
+ * - nulls only (boolean array)
+ * - frequency (double precision)
+ *
+ * The input is the OID of the statistics, and there are no rows
+ * returned if the statistics contains no MCV list.
+ */
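+
+/*
+ * A hypothetical usage example (with 16385 standing in for the OID
+ * of a row in pg_mv_statistic):
+ *
+ * SELECT * FROM pg_mv_mcv_items(16385);
+ *
+ * This returns one row per MCV item, with columns (index, values,
+ * nulls, frequency).
+ */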
+PG_FUNCTION_INFO_V1(pg_mv_mcv_items);
+
+Datum
+pg_mv_mcv_items(PG_FUNCTION_ARGS)
+{
+ FuncCallContext *funcctx;
+ int call_cntr;
+ int max_calls;
+ TupleDesc tupdesc;
+ AttInMetadata *attinmeta;
+
+ /* stuff done only on the first call of the function */
+ if (SRF_IS_FIRSTCALL())
+ {
+ MemoryContext oldcontext;
+ MCVList mcvlist;
+
+ /* create a function context for cross-call persistence */
+ funcctx = SRF_FIRSTCALL_INIT();
+
+ /* switch to memory context appropriate for multiple function calls */
+ oldcontext = MemoryContextSwitchTo(funcctx->multi_call_memory_ctx);
+
+ mcvlist = load_mv_mcvlist(PG_GETARG_OID(0));
+
+ funcctx->user_fctx = mcvlist;
+
+ /* total number of tuples to be returned */
+ funcctx->max_calls = 0;
+ if (funcctx->user_fctx != NULL)
+ funcctx->max_calls = mcvlist->nitems;
+
+ /* Build a tuple descriptor for our result type */
+ if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("function returning record called in context "
+ "that cannot accept type record")));
+
+ /*
+ * generate attribute metadata needed later to produce tuples
+ * from raw C strings
+ */
+ attinmeta = TupleDescGetAttInMetadata(tupdesc);
+ funcctx->attinmeta = attinmeta;
+
+ MemoryContextSwitchTo(oldcontext);
+ }
+
+ /* stuff done on every call of the function */
+ funcctx = SRF_PERCALL_SETUP();
+
+ call_cntr = funcctx->call_cntr;
+ max_calls = funcctx->max_calls;
+ attinmeta = funcctx->attinmeta;
+
+ if (call_cntr < max_calls) /* do when there is more left to send */
+ {
+ char **values;
+ HeapTuple tuple;
+ Datum result;
+ int2vector *stakeys;
+ Oid relid;
+
+ char *buff = palloc0(1024);
+ char *format;
+
+ int i;
+
+ Oid *outfuncs;
+ FmgrInfo *fmgrinfo;
+
+ MCVList mcvlist;
+ MCVItem item;
+
+ mcvlist = (MCVList)funcctx->user_fctx;
+
+ Assert(call_cntr < mcvlist->nitems);
+
+ item = mcvlist->items[call_cntr];
+
+ stakeys = find_mv_attnums(PG_GETARG_OID(0), &relid);
+
+ /*
+ * Prepare a values array for building the returned tuple.
+ * This should be an array of C strings which will
+ * be processed later by the type input functions.
+ */
+ values = (char **) palloc(4 * sizeof(char *));
+
+ values[0] = (char *) palloc(64 * sizeof(char));
+
+ /* arrays */
+ values[1] = (char *) palloc0(1024 * sizeof(char));
+ values[2] = (char *) palloc0(1024 * sizeof(char));
+
+ /* frequency */
+ values[3] = (char *) palloc(64 * sizeof(char));
+
+ outfuncs = (Oid*)palloc0(sizeof(Oid) * mcvlist->ndimensions);
+ fmgrinfo = (FmgrInfo*)palloc0(sizeof(FmgrInfo) * mcvlist->ndimensions);
+
+ for (i = 0; i < mcvlist->ndimensions; i++)
+ {
+ bool isvarlena;
+
+ getTypeOutputInfo(get_atttype(relid, stakeys->values[i]),
+ &outfuncs[i], &isvarlena);
+
+ fmgr_info(outfuncs[i], &fmgrinfo[i]);
+ }
+
+ snprintf(values[0], 64, "%d", call_cntr); /* item ID */
+
+ for (i = 0; i < mcvlist->ndimensions; i++)
+ {
+ Datum val, valout;
+
+ format = "%s, %s";
+ if (i == 0)
+ format = "{%s%s";
+ else if (i == mcvlist->ndimensions-1)
+ format = "%s, %s}";
+
+ val = item->values[i];
+ valout = FunctionCall1(&fmgrinfo[i], val);
+
+ snprintf(buff, 1024, format, values[1], DatumGetPointer(valout));
+ strncpy(values[1], buff, 1023);
+ buff[0] = '\0';
+
+ snprintf(buff, 1024, format, values[2], item->isnull[i] ? "t" : "f");
+ strncpy(values[2], buff, 1023);
+ buff[0] = '\0';
+ }
+
+ snprintf(values[3], 64, "%f", item->frequency); /* frequency */
+
+ /* build a tuple */
+ tuple = BuildTupleFromCStrings(attinmeta, values);
+
+ /* make the tuple into a datum */
+ result = HeapTupleGetDatum(tuple);
+
+ /* clean up (this is not really necessary) */
+ pfree(values[0]);
+ pfree(values[1]);
+ pfree(values[2]);
+ pfree(values[3]);
+
+ pfree(values);
+
+ SRF_RETURN_NEXT(funcctx, result);
+ }
+ else /* do when there is no more left */
+ {
+ SRF_RETURN_DONE(funcctx);
+ }
+}
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 5f89604..01d29db 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -2101,8 +2101,9 @@ describeOneTableDetails(const char *schemaname,
{
printfPQExpBuffer(&buf,
"SELECT oid, stakeys,\n"
- " deps_enabled,\n"
- " deps_built,\n"
+ " deps_enabled, mcv_enabled,\n"
+ " deps_built, mcv_built,\n"
+ " mcv_max_items,\n"
" (SELECT string_agg(attname::text,', ')\n"
" FROM ((SELECT unnest(stakeys) AS attnum) s\n"
" JOIN pg_attribute a ON (starelid = a.attrelid and a.attnum = s.attnum))) AS attnums\n"
@@ -2120,14 +2121,28 @@ describeOneTableDetails(const char *schemaname,
printTableAddFooter(&cont, _("Statistics:"));
for (i = 0; i < tuples; i++)
{
+ bool first = true;
+
printfPQExpBuffer(&buf, " ");
/* options */
if (!strcmp(PQgetvalue(result, i, 2), "t"))
- appendPQExpBuffer(&buf, "(dependencies)");
+ {
+ appendPQExpBuffer(&buf, "(dependencies");
+ first = false;
+ }
+
+ if (!strcmp(PQgetvalue(result, i, 3), "t"))
+ {
+ if (! first)
+ appendPQExpBuffer(&buf, ", mcv");
+ else
+ appendPQExpBuffer(&buf, "(mcv");
+ first = false;
+ }
- appendPQExpBuffer(&buf, " ON (%s)",
- PQgetvalue(result, i, 6));
+ appendPQExpBuffer(&buf, ") ON (%s)",
+ PQgetvalue(result, i, 8));
printTableAddFooter(&cont, buf.data);
}
diff --git a/src/include/catalog/pg_mv_statistic.h b/src/include/catalog/pg_mv_statistic.h
index 81ec23b..c6e7d74 100644
--- a/src/include/catalog/pg_mv_statistic.h
+++ b/src/include/catalog/pg_mv_statistic.h
@@ -35,15 +35,21 @@ CATALOG(pg_mv_statistic,3381)
/* statistics requested to build */
bool deps_enabled; /* analyze dependencies? */
+ bool mcv_enabled; /* build MCV list? */
+
+ /* MCV size */
+ int32 mcv_max_items; /* max MCV items */
/* statistics that are available (if requested) */
bool deps_built; /* dependencies were built */
+ bool mcv_built; /* MCV list was built */
/* variable-length fields start here, but we allow direct access to stakeys */
int2vector stakeys; /* array of column keys */
#ifdef CATALOG_VARLEN
bytea stadeps; /* dependencies (serialized) */
+ bytea stamcv; /* MCV list (serialized) */
#endif
} FormData_pg_mv_statistic;
@@ -59,11 +65,15 @@ typedef FormData_pg_mv_statistic *Form_pg_mv_statistic;
* compiler constants for pg_attrdef
* ----------------
*/
-#define Natts_pg_mv_statistic 5
+#define Natts_pg_mv_statistic 9
#define Anum_pg_mv_statistic_starelid 1
#define Anum_pg_mv_statistic_deps_enabled 2
-#define Anum_pg_mv_statistic_deps_built 3
-#define Anum_pg_mv_statistic_stakeys 4
-#define Anum_pg_mv_statistic_stadeps 5
+#define Anum_pg_mv_statistic_mcv_enabled 3
+#define Anum_pg_mv_statistic_mcv_max_items 4
+#define Anum_pg_mv_statistic_deps_built 5
+#define Anum_pg_mv_statistic_mcv_built 6
+#define Anum_pg_mv_statistic_stakeys 7
+#define Anum_pg_mv_statistic_stadeps 8
+#define Anum_pg_mv_statistic_stamcv 9
#endif /* PG_MV_STATISTIC_H */
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
index 69fc482..890c763 100644
--- a/src/include/catalog/pg_proc.h
+++ b/src/include/catalog/pg_proc.h
@@ -2739,6 +2739,10 @@ DATA(insert OID = 3307 ( pg_mv_stats_dependencies_info PGNSP PGUID 12 1 0 0
DESCR("multivariate stats: functional dependencies info");
DATA(insert OID = 3308 ( pg_mv_stats_dependencies_show PGNSP PGUID 12 1 0 0 0 f f f f t f i 1 0 25 "17" _null_ _null_ _null_ _null_ _null_ pg_mv_stats_dependencies_show _null_ _null_ _null_ ));
DESCR("multivariate stats: functional dependencies show");
+DATA(insert OID = 3376 ( pg_mv_stats_mcvlist_info PGNSP PGUID 12 1 0 0 0 f f f f t f i 1 0 25 "17" _null_ _null_ _null_ _null_ _null_ pg_mv_stats_mcvlist_info _null_ _null_ _null_ ));
+DESCR("multi-variate statistics: MCV list info");
+DATA(insert OID = 3373 ( pg_mv_mcv_items PGNSP PGUID 12 1 1000 0 0 f f f f t t i 1 0 2249 "26" "{26,23,1009,1000,701}" "{i,o,o,o,o}" "{oid,index,values,nulls,frequency}" _null_ _null_ pg_mv_mcv_items _null_ _null_ _null_ ));
+DESCR("details about MCV list items");
DATA(insert OID = 1928 ( pg_stat_get_numscans PGNSP PGUID 12 1 0 0 0 f f f f t f s 1 0 20 "26" _null_ _null_ _null_ _null_ _null_ pg_stat_get_numscans _null_ _null_ _null_ ));
DESCR("statistics: number of scans done for table/index");
diff --git a/src/include/nodes/relation.h b/src/include/nodes/relation.h
index 10f7425..917ae8d 100644
--- a/src/include/nodes/relation.h
+++ b/src/include/nodes/relation.h
@@ -572,9 +572,11 @@ typedef struct MVStatisticInfo
/* enabled statistics */
bool deps_enabled; /* functional dependencies enabled */
+ bool mcv_enabled; /* MCV list enabled */
/* built/available statistics */
bool deps_built; /* functional dependencies built */
+ bool mcv_built; /* MCV list built */
/* columns in the statistics (attnums) */
int2vector *stakeys; /* attnums of the columns covered */
diff --git a/src/include/utils/mvstats.h b/src/include/utils/mvstats.h
index 02a7dda..b028192 100644
--- a/src/include/utils/mvstats.h
+++ b/src/include/utils/mvstats.h
@@ -50,30 +50,89 @@ typedef MVDependenciesData* MVDependencies;
#define MVSTAT_DEPS_TYPE_BASIC 1 /* basic dependencies type */
/*
+ * Multivariate MCV (most-common value) lists
+ *
+ * A straightforward extension of MCV items - i.e. a list (array) of
+ * combinations of attribute values, together with a frequency and
+ * null flags.
+ */
+typedef struct MCVItemData {
+ double frequency; /* frequency of this combination */
+ bool *isnull; /* flags of NULL values (up to 32 columns) */
+ Datum *values; /* variable-length (ndimensions) */
+} MCVItemData;
+
+typedef MCVItemData *MCVItem;
+
+/* multivariate MCV list - essentially an array of MCV items */
+typedef struct MCVListData {
+ uint32 magic; /* magic constant marker */
+ uint32 type; /* type of MCV list (BASIC) */
+ uint32 ndimensions; /* number of dimensions */
+ uint32 nitems; /* number of MCV items in the array */
+ MCVItem *items; /* array of MCV items */
+} MCVListData;
+
+typedef MCVListData *MCVList;
+
+/* used to flag stats serialized to bytea */
+#define MVSTAT_MCV_MAGIC 0xE1A651C2 /* marks serialized bytea */
+#define MVSTAT_MCV_TYPE_BASIC 1 /* basic MCV list type */
+
+/*
+ * Limits used for the max_mcv_items option, i.e. we're always
+ * guaranteed to have space for at least MVSTAT_MCVLIST_MIN_ITEMS
+ * items, and we cannot have more than MVSTAT_MCVLIST_MAX_ITEMS items.
+ *
+ * This is just a boundary for the 'max' threshold - the actual list
+ * may of course contain fewer items than MVSTAT_MCVLIST_MIN_ITEMS.
+ */
+#define MVSTAT_MCVLIST_MIN_ITEMS 128 /* min items in MCV list */
+#define MVSTAT_MCVLIST_MAX_ITEMS 8192 /* max items in MCV list */
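+
+/*
+ * A hypothetical example, following the syntax exercised in the
+ * regression tests - this requests an MCV list with at most 200
+ * items on three columns:
+ *
+ * ALTER TABLE t ADD STATISTICS (mcv, max_mcv_items 200) ON (a, b, c);
+ */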
+
+/*
* TODO Maybe fetching the histogram/MCV list separately is inefficient?
* Consider adding a single `fetch_stats` method, fetching all
* stats specified using flags (or something like that).
*/
MVDependencies load_mv_dependencies(Oid mvoid);
+MCVList load_mv_mcvlist(Oid mvoid);
bytea * serialize_mv_dependencies(MVDependencies dependencies);
+bytea * serialize_mv_mcvlist(MCVList mcvlist, int2vector *attrs,
+ VacAttrStats **stats);
/* deserialization of stats (serialization is private to analyze) */
MVDependencies deserialize_mv_dependencies(bytea * data);
+MCVList deserialize_mv_mcvlist(bytea * data);
+
+/*
+ * Returns the index of the attribute number within the vector (i.e.
+ * the dimension within the stats).
+ */
+int mv_get_index(AttrNumber varattno, int2vector * stakeys);
+
+int2vector* find_mv_attnums(Oid mvoid, Oid *relid);
/* FIXME this probably belongs somewhere else (not to operations stats) */
extern Datum pg_mv_stats_dependencies_info(PG_FUNCTION_ARGS);
extern Datum pg_mv_stats_dependencies_show(PG_FUNCTION_ARGS);
+extern Datum pg_mv_stats_mcvlist_info(PG_FUNCTION_ARGS);
+extern Datum pg_mv_mcv_items(PG_FUNCTION_ARGS);
MVDependencies
-build_mv_dependencies(int numrows, HeapTuple *rows,
- int2vector *attrs,
- VacAttrStats **stats);
+build_mv_dependencies(int numrows, HeapTuple *rows, int2vector *attrs,
+ VacAttrStats **stats);
+
+MCVList
+build_mv_mcvlist(int numrows, HeapTuple *rows, int2vector *attrs,
+ VacAttrStats **stats, int *numrows_filtered);
void build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
- int natts, VacAttrStats **vacattrstats);
+ int natts, VacAttrStats **vacattrstats);
-void update_mv_stats(Oid relid, MVDependencies dependencies, int2vector *attrs);
+void update_mv_stats(Oid relid, MVDependencies dependencies, MCVList mcvlist,
+ int2vector *attrs, VacAttrStats **stats);
#endif
diff --git a/src/test/regress/expected/mv_mcv.out b/src/test/regress/expected/mv_mcv.out
new file mode 100644
index 0000000..85e8499
--- /dev/null
+++ b/src/test/regress/expected/mv_mcv.out
@@ -0,0 +1,207 @@
+-- data type passed by value
+CREATE TABLE mcv_list (
+ a INT,
+ b INT,
+ c INT
+);
+-- unknown column
+ALTER TABLE mcv_list ADD STATISTICS (mcv) ON (unknown_column);
+ERROR: column "unknown_column" referenced in statistics does not exist
+-- single column
+ALTER TABLE mcv_list ADD STATISTICS (mcv) ON (a);
+ERROR: multivariate stats require 2 or more columns
+-- single column, duplicated
+ALTER TABLE mcv_list ADD STATISTICS (mcv) ON (a, a);
+ERROR: duplicate column name in statistics definition
+-- two columns, one duplicated
+ALTER TABLE mcv_list ADD STATISTICS (mcv) ON (a, a, b);
+ERROR: duplicate column name in statistics definition
+-- unknown option
+ALTER TABLE mcv_list ADD STATISTICS (unknown_option) ON (a, b, c);
+ERROR: unrecognized STATISTICS option "unknown_option"
+-- missing MCV statistics
+ALTER TABLE mcv_list ADD STATISTICS (dependencies, max_mcv_items 200) ON (a, b, c);
+ERROR: option 'mcv' is required by other options(s)
+-- invalid max_mcv_items value / too low
+ALTER TABLE mcv_list ADD STATISTICS (mcv, max_mcv_items 10) ON (a, b, c);
+ERROR: max number of MCV items must be at least 128
+-- invalid max_mcv_items value / too high
+ALTER TABLE mcv_list ADD STATISTICS (mcv, max_mcv_items 10000) ON (a, b, c);
+ERROR: max number of MCV items is 8192
+-- correct command
+ALTER TABLE mcv_list ADD STATISTICS (mcv) ON (a, b, c);
+-- random data
+INSERT INTO mcv_list
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+ANALYZE mcv_list;
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+ mcv_enabled | mcv_built | pg_mv_stats_mcvlist_info
+-------------+-----------+--------------------------
+ t | f |
+(1 row)
+
+TRUNCATE mcv_list;
+-- a => b, a => c, b => c
+INSERT INTO mcv_list
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE mcv_list;
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+ mcv_enabled | mcv_built | pg_mv_stats_mcvlist_info
+-------------+-----------+--------------------------
+ t | t | nitems=1000
+(1 row)
+
+TRUNCATE mcv_list;
+-- a => b, a => c
+INSERT INTO mcv_list
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE mcv_list;
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+ mcv_enabled | mcv_built | pg_mv_stats_mcvlist_info
+-------------+-----------+--------------------------
+ t | t | nitems=1000
+(1 row)
+
+TRUNCATE mcv_list;
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO mcv_list
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX mcv_idx ON mcv_list (a, b);
+ANALYZE mcv_list;
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+ mcv_enabled | mcv_built | pg_mv_stats_mcvlist_info
+-------------+-----------+--------------------------
+ t | t | nitems=100
+(1 row)
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mcv_list WHERE a = 10 AND b = 5;
+ QUERY PLAN
+--------------------------------------------
+ Bitmap Heap Scan on mcv_list
+ Recheck Cond: ((a = 10) AND (b = 5))
+ -> Bitmap Index Scan on mcv_idx
+ Index Cond: ((a = 10) AND (b = 5))
+(4 rows)
+
+DROP TABLE mcv_list;
+-- varlena type (text)
+CREATE TABLE mcv_list (
+ a TEXT,
+ b TEXT,
+ c TEXT
+);
+ALTER TABLE mcv_list ADD STATISTICS (mcv) ON (a, b, c);
+-- random data
+INSERT INTO mcv_list
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+ANALYZE mcv_list;
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+ mcv_enabled | mcv_built | pg_mv_stats_mcvlist_info
+-------------+-----------+--------------------------
+ t | f |
+(1 row)
+
+TRUNCATE mcv_list;
+-- a => b, a => c, b => c
+INSERT INTO mcv_list
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE mcv_list;
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+ mcv_enabled | mcv_built | pg_mv_stats_mcvlist_info
+-------------+-----------+--------------------------
+ t | t | nitems=1000
+(1 row)
+
+TRUNCATE mcv_list;
+-- a => b, a => c
+INSERT INTO mcv_list
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE mcv_list;
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+ mcv_enabled | mcv_built | pg_mv_stats_mcvlist_info
+-------------+-----------+--------------------------
+ t | t | nitems=1000
+(1 row)
+
+TRUNCATE mcv_list;
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO mcv_list
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX mcv_idx ON mcv_list (a, b);
+ANALYZE mcv_list;
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+ mcv_enabled | mcv_built | pg_mv_stats_mcvlist_info
+-------------+-----------+--------------------------
+ t | t | nitems=100
+(1 row)
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mcv_list WHERE a = '10' AND b = '5';
+ QUERY PLAN
+------------------------------------------------------------
+ Bitmap Heap Scan on mcv_list
+ Recheck Cond: ((a = '10'::text) AND (b = '5'::text))
+ -> Bitmap Index Scan on mcv_idx
+ Index Cond: ((a = '10'::text) AND (b = '5'::text))
+(4 rows)
+
+TRUNCATE mcv_list;
+-- check explain (expect bitmap index scan, not plain index scan) with NULLs
+INSERT INTO mcv_list
+ SELECT
+ (CASE WHEN i/10000 = 0 THEN NULL ELSE i/10000 END),
+ (CASE WHEN i/20000 = 0 THEN NULL ELSE i/20000 END),
+ (CASE WHEN i/40000 = 0 THEN NULL ELSE i/40000 END)
+ FROM generate_series(1,1000000) s(i);
+ANALYZE mcv_list;
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+ mcv_enabled | mcv_built | pg_mv_stats_mcvlist_info
+-------------+-----------+--------------------------
+ t | t | nitems=100
+(1 row)
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mcv_list WHERE a IS NULL AND b IS NULL;
+ QUERY PLAN
+---------------------------------------------------
+ Bitmap Heap Scan on mcv_list
+ Recheck Cond: ((a IS NULL) AND (b IS NULL))
+ -> Bitmap Index Scan on mcv_idx
+ Index Cond: ((a IS NULL) AND (b IS NULL))
+(4 rows)
+
+DROP TABLE mcv_list;
+-- NULL values (mix of int and text columns)
+CREATE TABLE mcv_list (
+ a INT,
+ b TEXT,
+ c INT,
+ d TEXT
+);
+ALTER TABLE mcv_list ADD STATISTICS (mcv) ON (a, b, c, d);
+INSERT INTO mcv_list
+ SELECT
+ mod(i, 100),
+ (CASE WHEN mod(i, 200) = 0 THEN NULL ELSE mod(i,200) END),
+ mod(i, 400),
+ (CASE WHEN mod(i, 300) = 0 THEN NULL ELSE mod(i,600) END)
+ FROM generate_series(1,10000) s(i);
+ANALYZE mcv_list;
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+ mcv_enabled | mcv_built | pg_mv_stats_mcvlist_info
+-------------+-----------+--------------------------
+ t | t | nitems=1200
+(1 row)
+
+DROP TABLE mcv_list;
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index a12ad30..faa41c7 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1367,7 +1367,9 @@ pg_mv_stats| SELECT n.nspname AS schemaname,
c.relname AS tablename,
s.stakeys AS attnums,
length(s.stadeps) AS depsbytes,
- pg_mv_stats_dependencies_info(s.stadeps) AS depsinfo
+ pg_mv_stats_dependencies_info(s.stadeps) AS depsinfo,
+ length(s.stamcv) AS mcvbytes,
+ pg_mv_stats_mcvlist_info(s.stamcv) AS mcvinfo
FROM ((pg_mv_statistic s
JOIN pg_class c ON ((c.oid = s.starelid)))
LEFT JOIN pg_namespace n ON ((n.oid = c.relnamespace)));
diff --git a/src/test/regress/parallel_schedule b/src/test/regress/parallel_schedule
index 11d9d38..d083442 100644
--- a/src/test/regress/parallel_schedule
+++ b/src/test/regress/parallel_schedule
@@ -112,4 +112,4 @@ test: event_trigger
test: stats
# run tests of multivariate stats
-test: mv_dependencies
+test: mv_dependencies mv_mcv
diff --git a/src/test/regress/regression.diffs b/src/test/regress/regression.diffs
deleted file mode 100644
index 95b9cc5..0000000
--- a/src/test/regress/regression.diffs
+++ /dev/null
@@ -1,30 +0,0 @@
-*** /home/user/work/tvondra_postgres/src/test/regress/expected/rolenames.out Wed May 6 21:31:06 2015
---- /home/user/work/tvondra_postgres/src/test/regress/results/rolenames.out Mon May 25 22:24:21 2015
-***************
-*** 38,47 ****
---- 38,52 ----
- ORDER BY 2;
- $$ LANGUAGE SQL;
- CREATE ROLE "Public";
-+ ERROR: role "Public" already exists
- CREATE ROLE "None";
-+ ERROR: role "None" already exists
- CREATE ROLE "current_user";
-+ ERROR: role "current_user" already exists
- CREATE ROLE "session_user";
-+ ERROR: role "session_user" already exists
- CREATE ROLE "user";
-+ ERROR: role "user" already exists
- CREATE ROLE current_user; -- error
- ERROR: CURRENT_USER cannot be used as a role name here
- LINE 1: CREATE ROLE current_user;
-***************
-*** 938,940 ****
---- 943,946 ----
- DROP OWNED BY testrol0, "Public", "current_user", testrol1, testrol2, testrolx CASCADE;
- DROP ROLE testrol0, testrol1, testrol2, testrolx;
- DROP ROLE "Public", "None", "current_user", "session_user", "user";
-+ ERROR: current user cannot be dropped
-
-======================================================================
-
diff --git a/src/test/regress/regression.out b/src/test/regress/regression.out
deleted file mode 100644
index bd81385..0000000
--- a/src/test/regress/regression.out
+++ /dev/null
@@ -1,156 +0,0 @@
-test tablespace ... ok
-test boolean ... ok
-test char ... ok
-test name ... ok
-test varchar ... ok
-test text ... ok
-test int2 ... ok
-test int4 ... ok
-test int8 ... ok
-test oid ... ok
-test float4 ... ok
-test float8 ... ok
-test bit ... ok
-test numeric ... ok
-test txid ... ok
-test uuid ... ok
-test enum ... ok
-test money ... ok
-test rangetypes ... ok
-test pg_lsn ... ok
-test regproc ... ok
-test strings ... ok
-test numerology ... ok
-test point ... ok
-test lseg ... ok
-test line ... ok
-test box ... ok
-test path ... ok
-test polygon ... ok
-test circle ... ok
-test date ... ok
-test time ... ok
-test timetz ... ok
-test timestamp ... ok
-test timestamptz ... ok
-test interval ... ok
-test abstime ... ok
-test reltime ... ok
-test tinterval ... ok
-test inet ... ok
-test macaddr ... ok
-test tstypes ... ok
-test comments ... ok
-test geometry ... ok
-test horology ... ok
-test regex ... ok
-test oidjoins ... ok
-test type_sanity ... ok
-test opr_sanity ... ok
-test insert ... ok
-test insert_conflict ... ok
-test create_function_1 ... ok
-test create_type ... ok
-test create_table ... ok
-test create_function_2 ... ok
-test copy ... ok
-test copyselect ... ok
-test create_misc ... ok
-test create_operator ... ok
-test create_index ... ok
-test create_view ... ok
-test create_aggregate ... ok
-test create_function_3 ... ok
-test create_cast ... ok
-test constraints ... ok
-test triggers ... ok
-test inherit ... ok
-test create_table_like ... ok
-test typed_table ... ok
-test vacuum ... ok
-test drop_if_exists ... ok
-test updatable_views ... ok
-test rolenames ... FAILED
-test sanity_check ... ok
-test errors ... ok
-test select ... ok
-test select_into ... ok
-test select_distinct ... ok
-test select_distinct_on ... ok
-test select_implicit ... ok
-test select_having ... ok
-test subselect ... ok
-test union ... ok
-test case ... ok
-test join ... ok
-test aggregates ... ok
-test groupingsets ... ok
-test transactions ... ok
-test random ... ok
-test portals ... ok
-test arrays ... ok
-test btree_index ... ok
-test hash_index ... ok
-test update ... ok
-test delete ... ok
-test namespace ... ok
-test prepared_xacts ... ok
-test brin ... ok
-test gin ... ok
-test gist ... ok
-test spgist ... ok
-test privileges ... ok
-test security_label ... ok
-test collate ... ok
-test matview ... ok
-test lock ... ok
-test replica_identity ... ok
-test rowsecurity ... ok
-test object_address ... ok
-test alter_generic ... ok
-test misc ... ok
-test psql ... ok
-test async ... ok
-test rules ... ok
-test select_views ... ok
-test portals_p2 ... ok
-test foreign_key ... ok
-test cluster ... ok
-test dependency ... ok
-test guc ... ok
-test bitmapops ... ok
-test combocid ... ok
-test tsearch ... ok
-test tsdicts ... ok
-test foreign_data ... ok
-test window ... ok
-test xmlmap ... ok
-test functional_deps ... ok
-test advisory_lock ... ok
-test json ... ok
-test jsonb ... ok
-test indirect_toast ... ok
-test equivclass ... ok
-test plancache ... ok
-test limit ... ok
-test plpgsql ... ok
-test copy2 ... ok
-test temp ... ok
-test domain ... ok
-test rangefuncs ... ok
-test prepare ... ok
-test without_oid ... ok
-test conversion ... ok
-test truncate ... ok
-test alter_table ... ok
-test sequence ... ok
-test polymorphism ... ok
-test rowtypes ... ok
-test returning ... ok
-test largeobject ... ok
-test with ... ok
-test xml ... ok
-test event_trigger ... ok
-test stats ... ok
-test tablesample ... ok
-test mv_dependencies ... ok
diff --git a/src/test/regress/serial_schedule b/src/test/regress/serial_schedule
index 66925b3..e63b7aa 100644
--- a/src/test/regress/serial_schedule
+++ b/src/test/regress/serial_schedule
@@ -157,3 +157,4 @@ test: event_trigger
test: stats
test: tablesample
test: mv_dependencies
+test: mv_mcv
diff --git a/src/test/regress/sql/mv_mcv.sql b/src/test/regress/sql/mv_mcv.sql
new file mode 100644
index 0000000..5de3d29
--- /dev/null
+++ b/src/test/regress/sql/mv_mcv.sql
@@ -0,0 +1,178 @@
+-- data type passed by value
+CREATE TABLE mcv_list (
+ a INT,
+ b INT,
+ c INT
+);
+
+-- unknown column
+ALTER TABLE mcv_list ADD STATISTICS (mcv) ON (unknown_column);
+
+-- single column
+ALTER TABLE mcv_list ADD STATISTICS (mcv) ON (a);
+
+-- single column, duplicated
+ALTER TABLE mcv_list ADD STATISTICS (mcv) ON (a, a);
+
+-- two columns, one duplicated
+ALTER TABLE mcv_list ADD STATISTICS (mcv) ON (a, a, b);
+
+-- unknown option
+ALTER TABLE mcv_list ADD STATISTICS (unknown_option) ON (a, b, c);
+
+-- missing MCV statistics
+ALTER TABLE mcv_list ADD STATISTICS (dependencies, max_mcv_items 200) ON (a, b, c);
+
+-- invalid max_mcv_items value / too low
+ALTER TABLE mcv_list ADD STATISTICS (mcv, max_mcv_items 10) ON (a, b, c);
+
+-- invalid max_mcv_items value / too high
+ALTER TABLE mcv_list ADD STATISTICS (mcv, max_mcv_items 10000) ON (a, b, c);
+
+-- correct command
+ALTER TABLE mcv_list ADD STATISTICS (mcv) ON (a, b, c);
+
+-- random data
+INSERT INTO mcv_list
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+
+ANALYZE mcv_list;
+
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+
+TRUNCATE mcv_list;
+
+-- a => b, a => c, b => c
+INSERT INTO mcv_list
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+
+ANALYZE mcv_list;
+
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+
+TRUNCATE mcv_list;
+
+-- a => b, a => c
+INSERT INTO mcv_list
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+
+ANALYZE mcv_list;
+
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+
+TRUNCATE mcv_list;
+
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO mcv_list
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX mcv_idx ON mcv_list (a, b);
+
+ANALYZE mcv_list;
+
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mcv_list WHERE a = 10 AND b = 5;
+
+DROP TABLE mcv_list;
+
+-- varlena type (text)
+CREATE TABLE mcv_list (
+ a TEXT,
+ b TEXT,
+ c TEXT
+);
+
+ALTER TABLE mcv_list ADD STATISTICS (mcv) ON (a, b, c);
+
+-- random data
+INSERT INTO mcv_list
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+
+ANALYZE mcv_list;
+
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+
+TRUNCATE mcv_list;
+
+-- a => b, a => c, b => c
+INSERT INTO mcv_list
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+
+ANALYZE mcv_list;
+
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+
+TRUNCATE mcv_list;
+
+-- a => b, a => c
+INSERT INTO mcv_list
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE mcv_list;
+
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+
+TRUNCATE mcv_list;
+
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO mcv_list
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX mcv_idx ON mcv_list (a, b);
+ANALYZE mcv_list;
+
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mcv_list WHERE a = '10' AND b = '5';
+
+TRUNCATE mcv_list;
+
+-- check explain (expect bitmap index scan, not plain index scan) with NULLs
+INSERT INTO mcv_list
+ SELECT
+ (CASE WHEN i/10000 = 0 THEN NULL ELSE i/10000 END),
+ (CASE WHEN i/20000 = 0 THEN NULL ELSE i/20000 END),
+ (CASE WHEN i/40000 = 0 THEN NULL ELSE i/40000 END)
+ FROM generate_series(1,1000000) s(i);
+ANALYZE mcv_list;
+
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mcv_list WHERE a IS NULL AND b IS NULL;
+
+DROP TABLE mcv_list;
+
+-- NULL values (mix of int and text columns)
+CREATE TABLE mcv_list (
+ a INT,
+ b TEXT,
+ c INT,
+ d TEXT
+);
+
+ALTER TABLE mcv_list ADD STATISTICS (mcv) ON (a, b, c, d);
+
+INSERT INTO mcv_list
+ SELECT
+ mod(i, 100),
+ (CASE WHEN mod(i, 200) = 0 THEN NULL ELSE mod(i,200) END),
+ mod(i, 400),
+ (CASE WHEN mod(i, 300) = 0 THEN NULL ELSE mod(i,600) END)
+ FROM generate_series(1,10000) s(i);
+
+ANALYZE mcv_list;
+
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+
+DROP TABLE mcv_list;
--
1.9.3
Attachment: 0005-multivariate-histograms-v7.patch (text/x-patch)
>From 89db32a7015e92bb5642604b822e9d3a41db2701 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tv@fuzzy.cz>
Date: Sun, 11 Jan 2015 20:18:24 +0100
Subject: [PATCH 5/6] multivariate histograms
- extends the pg_mv_statistic catalog (add 'hist' fields)
- building the histograms during ANALYZE
- simple estimation while planning the queries
Includes regression tests mostly equal to those for functional
dependencies / MCV lists.
---
src/backend/catalog/system_views.sql | 4 +-
src/backend/commands/tablecmds.c | 86 +-
src/backend/nodes/outfuncs.c | 2 +
src/backend/optimizer/path/clausesel.c | 713 ++++++++-
src/backend/optimizer/util/plancat.c | 4 +-
src/backend/utils/mvstats/Makefile | 2 +-
src/backend/utils/mvstats/common.c | 37 +-
src/backend/utils/mvstats/histogram.c | 2188 ++++++++++++++++++++++++++++
src/bin/psql/describe.c | 17 +-
src/include/catalog/pg_mv_statistic.h | 24 +-
src/include/catalog/pg_proc.h | 4 +
src/include/nodes/relation.h | 2 +
src/include/utils/mvstats.h | 131 +-
src/test/regress/expected/mv_histogram.out | 207 +++
src/test/regress/expected/rules.out | 4 +-
src/test/regress/parallel_schedule | 2 +-
src/test/regress/serial_schedule | 1 +
src/test/regress/sql/mv_histogram.sql | 176 +++
18 files changed, 3566 insertions(+), 38 deletions(-)
create mode 100644 src/backend/utils/mvstats/histogram.c
create mode 100644 src/test/regress/expected/mv_histogram.out
create mode 100644 src/test/regress/sql/mv_histogram.sql
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 3144a29..0a1c25b 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -158,7 +158,9 @@ CREATE VIEW pg_mv_stats AS
length(S.stadeps) as depsbytes,
pg_mv_stats_dependencies_info(S.stadeps) as depsinfo,
length(S.stamcv) AS mcvbytes,
- pg_mv_stats_mcvlist_info(S.stamcv) AS mcvinfo
+ pg_mv_stats_mcvlist_info(S.stamcv) AS mcvinfo,
+ length(S.stahist) AS histbytes,
+ pg_mv_stats_histogram_info(S.stahist) AS histinfo
FROM (pg_mv_statistic S JOIN pg_class C ON (C.oid = S.starelid))
LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace);
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 0d72aec..4c2da51 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -11919,12 +11919,15 @@ static void ATExecAddStatistics(AlteredTableInfo *tab, Relation rel,
/* by default build nothing */
bool build_dependencies = false,
- build_mcv = false;
+ build_mcv = false,
+ build_histogram = false;
- int32 max_mcv_items = -1;
+ int32 max_buckets = -1,
+ max_mcv_items = -1;
/* options required because of other options */
- bool require_mcv = false;
+ bool require_mcv = false,
+ require_histogram = false;
Assert(IsA(def, StatisticsDef));
@@ -12002,6 +12005,29 @@ static void ATExecAddStatistics(AlteredTableInfo *tab, Relation rel,
MVSTAT_MCVLIST_MAX_ITEMS)));
}
+ else if (strcmp(opt->defname, "histogram") == 0)
+ build_histogram = defGetBoolean(opt);
+ else if (strcmp(opt->defname, "max_buckets") == 0)
+ {
+ max_buckets = defGetInt32(opt);
+
+ /* this option requires 'histogram' to be enabled */
+ require_histogram = true;
+
+ /* sanity check */
+ if (max_buckets < MVSTAT_HIST_MIN_BUCKETS)
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("minimum number of buckets is %d",
+ MVSTAT_HIST_MIN_BUCKETS)));
+
+ else if (max_buckets > MVSTAT_HIST_MAX_BUCKETS)
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("minimum number of buckets is %d",
+ MVSTAT_HIST_MAX_BUCKETS)));
+
+ }
else
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
@@ -12010,10 +12036,10 @@ static void ATExecAddStatistics(AlteredTableInfo *tab, Relation rel,
}
/* check that at least some statistics were requested */
- if (! (build_dependencies || build_mcv))
+ if (! (build_dependencies || build_mcv || build_histogram))
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
- errmsg("no statistics type (dependencies, mcv) was requested")));
+ errmsg("no statistics type (dependencies, mcv, histogram) was requested")));
/* now do some checking of the options */
if (require_mcv && (! build_mcv))
@@ -12021,6 +12047,11 @@ static void ATExecAddStatistics(AlteredTableInfo *tab, Relation rel,
(errcode(ERRCODE_SYNTAX_ERROR),
errmsg("option 'mcv' is required by other options(s)")));
+ if (require_histogram && (! build_histogram))
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("option 'histogram' is required by other options(s)")));
+
/* sort the attnums and build int2vector */
qsort(attnums, numcols, sizeof(int16), compare_int16);
stakeys = buildint2vector(attnums, numcols);
@@ -12038,10 +12069,14 @@ static void ATExecAddStatistics(AlteredTableInfo *tab, Relation rel,
values[Anum_pg_mv_statistic_deps_enabled -1] = BoolGetDatum(build_dependencies);
values[Anum_pg_mv_statistic_mcv_enabled -1] = BoolGetDatum(build_mcv);
+ values[Anum_pg_mv_statistic_hist_enabled -1] = BoolGetDatum(build_histogram);
+
values[Anum_pg_mv_statistic_mcv_max_items -1] = Int32GetDatum(max_mcv_items);
+ values[Anum_pg_mv_statistic_hist_max_buckets -1] = Int32GetDatum(max_buckets);
nulls[Anum_pg_mv_statistic_stadeps -1] = true;
nulls[Anum_pg_mv_statistic_stamcv -1] = true;
+ nulls[Anum_pg_mv_statistic_stahist -1] = true;
/* insert the tuple into pg_mv_statistic */
mvstatrel = heap_open(MvStatisticRelationId, RowExclusiveLock);
@@ -12064,6 +12099,7 @@ static void ATExecAddStatistics(AlteredTableInfo *tab, Relation rel,
return;
}
+
/*
* Implements the ALTER TABLE ... DROP STATISTICS in two forms:
*
@@ -12089,12 +12125,16 @@ static void ATExecDropStatistics(AlteredTableInfo *tab, Relation rel,
/* checking whether the statistics matches / should be dropped */
bool build_dependencies = false;
bool build_mcv = false;
+ bool build_histogram = false;
bool max_mcv_items = 0;
+ int32 max_buckets = 0;
bool check_dependencies = false;
bool check_mcv = false;
bool check_mcv_items = false;
+ bool check_histogram = false;
+ bool check_buckets = false;
if (def != NULL)
{
@@ -12148,6 +12188,18 @@ static void ATExecDropStatistics(AlteredTableInfo *tab, Relation rel,
build_mcv = true;
max_mcv_items = defGetInt32(opt);
}
+ else if (strcmp(opt->defname, "histogram") == 0)
+ {
+ check_histogram = true;
+ build_histogram = defGetBoolean(opt);
+ }
+ else if (strcmp(opt->defname, "max_buckets") == 0)
+ {
+ check_histogram = true;
+ check_buckets = true;
+ max_buckets = defGetInt32(opt);
+ build_histogram = true;
+ }
else
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
@@ -12211,6 +12263,30 @@ static void ATExecDropStatistics(AlteredTableInfo *tab, Relation rel,
(DatumGetInt32(adatum) == max_mcv_items);
}
+ if (delete && check_histogram)
+ {
+ bool isnull;
+ Datum adatum = heap_getattr(tuple,
+ Anum_pg_mv_statistic_hist_enabled,
+ RelationGetDescr(statrel),
+ &isnull);
+
+ delete = (! isnull) &&
+ (DatumGetBool(adatum) == build_histogram);
+ }
+
+ if (delete && check_buckets)
+ {
+ bool isnull;
+ Datum adatum = heap_getattr(tuple,
+ Anum_pg_mv_statistic_hist_max_buckets,
+ RelationGetDescr(statrel),
+ &isnull);
+
+ delete = (! isnull) &&
+ (DatumGetInt32(adatum) == max_buckets);
+ }
+
/* check that the columns match the statistics definition */
if (delete && (numcols > 0))
{
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index 1867ab7..19d672f 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -1908,10 +1908,12 @@ _outMVStatisticInfo(StringInfo str, const MVStatisticInfo *node)
/* enabled statistics */
WRITE_BOOL_FIELD(deps_enabled);
WRITE_BOOL_FIELD(mcv_enabled);
+ WRITE_BOOL_FIELD(hist_enabled);
/* built/available statistics */
WRITE_BOOL_FIELD(deps_built);
WRITE_BOOL_FIELD(mcv_built);
+ WRITE_BOOL_FIELD(hist_built);
}
static void
diff --git a/src/backend/optimizer/path/clausesel.c b/src/backend/optimizer/path/clausesel.c
index 95872de..bc02e92 100644
--- a/src/backend/optimizer/path/clausesel.c
+++ b/src/backend/optimizer/path/clausesel.c
@@ -49,6 +49,7 @@ static void addRangeClause(RangeQueryClause **rqlist, Node *clause,
#define MV_CLAUSE_TYPE_FDEP 0x01
#define MV_CLAUSE_TYPE_MCV 0x02
+#define MV_CLAUSE_TYPE_HIST 0x04
static bool clause_is_mv_compatible(PlannerInfo *root, Node *clause, Oid varRelid,
Index *relid, Bitmapset **attnums, SpecialJoinInfo *sjinfo,
@@ -73,6 +74,8 @@ static Selectivity clauselist_mv_selectivity(PlannerInfo *root,
static Selectivity clauselist_mv_selectivity_mcvlist(PlannerInfo *root,
List *clauses, MVStatisticInfo *mvstats,
bool *fullmatch, Selectivity *lowsel);
+static Selectivity clauselist_mv_selectivity_histogram(PlannerInfo *root,
+ List *clauses, MVStatisticInfo *mvstats);
static int update_match_bitmap_mcvlist(PlannerInfo *root, List *clauses,
int2vector *stakeys, MCVList mcvlist,
@@ -80,6 +83,12 @@ static int update_match_bitmap_mcvlist(PlannerInfo *root, List *clauses,
Selectivity *lowsel, bool *fullmatch,
bool is_or);
+static int update_match_bitmap_histogram(PlannerInfo *root, List *clauses,
+ int2vector *stakeys,
+ MVSerializedHistogram mvhist,
+ int nmatches, char * matches,
+ bool is_or);
+
static bool has_stats(List *stats, int type);
static List * find_stats(PlannerInfo *root, List *clauses,
@@ -304,7 +313,7 @@ clauselist_selectivity(PlannerInfo *root,
* Check that there are statistics with MCV list. If not, we don't
* need to waste time with the optimization.
*/
- if (has_stats(stats, MV_CLAUSE_TYPE_MCV))
+ if (has_stats(stats, MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST))
{
/*
* Recollect attributes from mv-compatible clauses (maybe we've
@@ -312,7 +321,7 @@ clauselist_selectivity(PlannerInfo *root,
* From now on we're only interested in MCV-compatible clauses.
*/
mvattnums = collect_mv_attnums(root, clauses, varRelid, &relid, sjinfo,
- MV_CLAUSE_TYPE_MCV);
+ (MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST));
/*
* If there still are at least two columns, we'll try to select
@@ -331,7 +340,7 @@ clauselist_selectivity(PlannerInfo *root,
/* split the clauselist into regular and mv-clauses */
clauses = clauselist_mv_split(root, sjinfo, clauses,
varRelid, &mvclauses, mvstat,
- MV_CLAUSE_TYPE_MCV);
+ (MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST));
/* we've chosen the histogram to match the clauses */
Assert(mvclauses != NIL);
@@ -1116,6 +1125,7 @@ static Selectivity
clauselist_mv_selectivity(PlannerInfo *root, List *clauses, MVStatisticInfo *mvstats)
{
bool fullmatch = false;
+ Selectivity s1 = 0.0, s2 = 0.0;
/*
* Lowest frequency in the MCV list (may be used as an upper bound
@@ -1129,9 +1139,24 @@ clauselist_mv_selectivity(PlannerInfo *root, List *clauses, MVStatisticInfo *mvs
* MCV/histogram evaluation).
*/
- /* Evaluate the MCV selectivity */
- return clauselist_mv_selectivity_mcvlist(root, clauses, mvstats,
+ /* Evaluate the MCV first. */
+ s1 = clauselist_mv_selectivity_mcvlist(root, clauses, mvstats,
&fullmatch, &mcv_low);
+
+ /*
+ * If we got a full equality match on the MCV list, we're done (and
+ * the estimate is pretty good).
+ */
+ if (fullmatch && (s1 > 0.0))
+ return s1;
+
+ /* FIXME if (fullmatch) without matching MCV item, use the mcv_low
+ * selectivity as upper bound */
+
+ s2 = clauselist_mv_selectivity_histogram(root, clauses, mvstats);
+
+ /* TODO clamp to <= 1.0 (or more strictly, when possible) */
+ return s1 + s2;
}
/*
@@ -1273,7 +1298,7 @@ choose_mv_statistics(List *stats, Bitmapset *attnums)
int numattrs = attrs->dim1;
/* skip dependencies-only stats */
- if (! info->mcv_built)
+ if (! (info->mcv_built || info->hist_built))
continue;
/* count columns covered by the histogram */
@@ -1433,7 +1458,6 @@ clause_is_mv_compatible(PlannerInfo *root, Node *clause, Oid varRelid,
bool ok;
/* is it 'variable op constant' ? */
-
ok = (bms_membership(clause_relids) == BMS_SINGLETON) &&
(is_pseudo_constant_clause_relids(lsecond(expr->args),
right_relids) ||
@@ -1483,10 +1507,10 @@ clause_is_mv_compatible(PlannerInfo *root, Node *clause, Oid varRelid,
case F_SCALARLTSEL:
case F_SCALARGTSEL:
/* not compatible with functional dependencies */
- if (types & MV_CLAUSE_TYPE_MCV)
+ if (types & (MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST))
{
*attnums = bms_add_member(*attnums, var->varattno);
- return (types & MV_CLAUSE_TYPE_MCV);
+ return (types & (MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST));
}
return false;
@@ -1814,6 +1838,9 @@ has_stats(List *stats, int type)
if ((type & MV_CLAUSE_TYPE_MCV) && stat->mcv_built)
return true;
+
+ if ((type & MV_CLAUSE_TYPE_HIST) && stat->hist_built)
+ return true;
}
return false;
@@ -2630,3 +2657,671 @@ update_match_bitmap_mcvlist(PlannerInfo *root, List *clauses,
return nmatches;
}
+
+/*
+ * Estimate selectivity of clauses using a histogram.
+ *
+ * If there's no histogram for the stats, the function returns 0.0.
+ *
+ * The general idea of this method is similar to how MCV lists are
+ * processed, except that this introduces the concept of a partial
+ * match (MCV only works with full match / mismatch).
+ *
+ * The algorithm works like this:
+ *
+ * 1) mark all buckets as 'full match'
+ * 2) walk through all the clauses
+ * 3) for a particular clause, walk through all the buckets
+ * 4) skip buckets that are already 'no match'
+ * 5) check clause for buckets that still match (at least partially)
+ * 6) sum frequencies for buckets to get selectivity
+ *
+ * Unlike MCV lists, histograms have a concept of a partial match. In
+ * that case we use 1/2 the bucket, to minimize the average error. The
+ * MV histograms are usually less detailed than the per-column ones,
+ * meaning the sum is often quite high (thanks to combining a lot of
+ * "partially hit" buckets).
+ *
+ * Maybe we could use per-bucket information with number of distinct
+ * values it contains (for each dimension), and then use that to correct
+ * the estimate (so with 10 distinct values, we'd use 1/10 of the bucket
+ * frequency). We might also scale the value depending on the actual
+ * ndistinct estimate (not just the values observed in the sample).
+ *
+ * Another option would be to multiply the selectivities, i.e. if we get
+ * 'partial match' for a bucket for multiple conditions, we might use
+ * 0.5^k (where k is the number of conditions), instead of 0.5. This
+ * probably does not minimize the average error, though.
+ *
+ * TODO This might use a similar shortcut to MCV lists - count buckets
+ * marked as partial/full match, and terminate once this drop to 0.
+ * Not sure if it's really worth it - for MCV lists a situation like
+ * this is not uncommon, but for histograms it's not that clear.
+ */
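+/*
+ * A worked example (a sketch, not tied to any particular data set):
+ * assume four 1-D buckets [0,5], [5,10], [10,15] and [15,20], each
+ * with frequency 0.25, and a clause WHERE (a < 7). The first bucket
+ * is a full match, the second a partial match and the rest no match,
+ * so the estimate is 0.25 + 0.5 * 0.25 = 0.375.
+ */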
+static Selectivity
+clauselist_mv_selectivity_histogram(PlannerInfo *root, List *clauses,
+ MVStatisticInfo *mvstats)
+{
+ int i;
+ Selectivity s = 0.0;
+ Selectivity u = 0.0;
+
+ int nmatches = 0;
+ char *matches = NULL;
+
+ MVSerializedHistogram mvhist = NULL;
+
+ /* there's no histogram */
+ if (! mvstats->hist_built)
+ return 0.0;
+
+ /* load the histogram from the catalog */
+ mvhist = load_mv_histogram(mvstats->mvoid);
+
+ Assert (mvhist != NULL);
+ Assert (clauses != NIL);
+ Assert (list_length(clauses) >= 2);
+
+ /*
+ * Bitmap of bucket matches (mismatch, partial, full). By default
+ * all buckets fully match (and we'll eliminate them).
+ */
+ matches = palloc0(sizeof(char) * mvhist->nbuckets);
+ memset(matches, MVSTATS_MATCH_FULL, sizeof(char)*mvhist->nbuckets);
+
+ nmatches = mvhist->nbuckets;
+
+ /* build the match bitmap */
+ update_match_bitmap_histogram(root, clauses,
+ mvstats->stakeys, mvhist,
+ nmatches, matches, false);
+
+ /* now, walk through the buckets and sum the selectivities */
+ for (i = 0; i < mvhist->nbuckets; i++)
+ {
+ /*
+ * Find out what part of the data is covered by the histogram,
+ * so that we can 'scale' the selectivity properly (e.g. when
+ * only 50% of the sample got into the histogram, and the rest
+ * is in a MCV list).
+ *
+ * TODO This might be handled by keeping a global "frequency"
+ * for the whole histogram, which might save us some time
+ * spent accessing the not-matching part of the histogram.
+ * Although it's likely in a cache, so it's very fast.
+ */
+ u += mvhist->buckets[i]->ntuples;
+
+ if (matches[i] == MVSTATS_MATCH_FULL)
+ s += mvhist->buckets[i]->ntuples;
+ else if (matches[i] == MVSTATS_MATCH_PARTIAL)
+ s += 0.5 * mvhist->buckets[i]->ntuples;
+ }
+
+ /* release the allocated bitmap and deserialized histogram */
+ pfree(matches);
+ pfree(mvhist);
+
+ return s * u;
+}
+
+/*
+ * Evaluate clauses using the histogram, and update the match bitmap.
+ *
+ * The bitmap may be already partially set, so this is really a way to
+ * combine results of several clause lists - either when computing
+ * conditional probability P(A|B) or a combination of AND/OR clauses.
+ *
+ * Note: This is not a simple bitmap in the sense that there are more
+ * than two possible values for each item - no match, partial
+ * match and full match. So we need 2 bits per item.
+ *
+ * TODO This works with 'bitmap' where each item is represented as a
+ * char, which is slightly wasteful. Instead, we could use a bitmap
+ * with 2 bits per item, reducing the size to ~1/4. By using values
+ * 0, 1 and 3 (instead of 0, 1 and 2), the operations (merging etc.)
+ * might be performed just like for simple bitmap by using & and |,
+ * which might be faster than min/max.
+ */
+static int
+update_match_bitmap_histogram(PlannerInfo *root, List *clauses,
+ int2vector *stakeys,
+ MVSerializedHistogram mvhist,
+ int nmatches, char * matches,
+ bool is_or)
+{
+ int i;
+ ListCell * l;
+
+ /*
+ * Used for caching function calls, only once per deduplicated value.
+ *
+ * We know there may be up to (2 * nbuckets) values per dimension.
+ * It's probably overkill, but let's allocate that once for all
+ * clauses, to minimize overhead.
+ *
+ * Also, we only need two bits per value, but this allocates a byte
+ * per value. Might be worth optimizing.
+ *
+ * 0x00 - not yet called
+ * 0x01 - called, result is 'false'
+ * 0x03 - called, result is 'true'
+ */
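+ /*
+ * (Encoding sketch: bit 0 marks the value as already computed, bit 1
+ * stores the result, so 0x00 means 'not computed yet' and
+ * (callcache[x] & 0x02) recovers the cached boolean.)
+ */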
+ char *callcache = palloc(2 * mvhist->nbuckets);
+
+ Assert(mvhist != NULL);
+ Assert(mvhist->nbuckets > 0);
+ Assert(nmatches >= 0);
+ Assert(nmatches <= mvhist->nbuckets);
+
+ Assert(clauses != NIL);
+ Assert(list_length(clauses) >= 1);
+
+ /* loop through the clauses and do the estimation */
+ foreach (l, clauses)
+ {
+ Node * clause = (Node*)lfirst(l);
+
+ /* if it's a RestrictInfo, then extract the clause */
+ if (IsA(clause, RestrictInfo))
+ clause = (Node*)((RestrictInfo*)clause)->clause;
+
+ /* it's either OpClause, or NullTest */
+ if (is_opclause(clause))
+ {
+ OpExpr * expr = (OpExpr*)clause;
+ bool varonleft = true;
+ bool ok;
+
+ FmgrInfo opproc; /* operator */
+ fmgr_info(get_opcode(expr->opno), &opproc);
+
+ /* reset the cache (per clause) */
+ memset(callcache, 0, 2 * mvhist->nbuckets);
+
+ ok = (NumRelids(clause) == 1) &&
+ (is_pseudo_constant_clause(lsecond(expr->args)) ||
+ (varonleft = false,
+ is_pseudo_constant_clause(linitial(expr->args))));
+
+ if (ok)
+ {
+ FmgrInfo ltproc;
+ RegProcedure oprrest = get_oprrest(expr->opno);
+
+ Var * var = (varonleft) ? linitial(expr->args) : lsecond(expr->args);
+ Const * cst = (varonleft) ? lsecond(expr->args) : linitial(expr->args);
+ bool isgt = (! varonleft);
+
+ /*
+ * TODO Fetch only when really needed (probably for equality only)
+ *
+ * TODO Technically either lt/gt is sufficient.
+ *
+ * FIXME The code in analyze.c creates histograms only for types
+ * with enough ordering (by calling get_sort_group_operators).
+ * Is this the same assumption, i.e. are we certain that we
+ * get the ltproc/gtproc every time we ask? Or are there types
+ * where get_sort_group_operators returns ltopr and here we
+ * get nothing?
+ */
+ TypeCacheEntry *typecache
+ = lookup_type_cache(var->vartype, TYPECACHE_EQ_OPR | TYPECACHE_LT_OPR
+ | TYPECACHE_GT_OPR);
+
+ /* lookup dimension for the attribute */
+ int idx = mv_get_index(var->varattno, stakeys);
+
+ fmgr_info(get_opcode(typecache->lt_opr), <proc);
+
+ /*
+ * Check this for all buckets that still have "true" in the bitmap
+ *
+ * We already know the clauses use suitable operators (because that's
+ * how we filtered them).
+ */
+ for (i = 0; i < mvhist->nbuckets; i++)
+ {
+ bool tmp;
+ MVSerializedBucket bucket = mvhist->buckets[i];
+
+ /* histogram boundaries */
+ Datum minval, maxval;
+
+ /* values from the call cache */
+ char mincached, maxcached;
+
+ /*
+ * For AND-lists, we can also mark NULL buckets as 'no match'
+ * (and then skip them). For OR-lists this is not possible.
+ */
+ if ((! is_or) && bucket->nullsonly[idx])
+ matches[i] = MVSTATS_MATCH_NONE;
+
+ /*
+ * Skip buckets that were already eliminated - this is important
+ * considering how we update the info (we only lower the match).
+ * We can't really do anything about the MATCH_PARTIAL buckets.
+ */
+ if ((! is_or) && (matches[i] == MVSTATS_MATCH_NONE))
+ continue;
+ else if (is_or && (matches[i] == MVSTATS_MATCH_FULL))
+ continue;
+
+ /* lookup the values and cache of function calls */
+ minval = mvhist->values[idx][bucket->min[idx]];
+ maxval = mvhist->values[idx][bucket->max[idx]];
+
+ mincached = callcache[bucket->min[idx]];
+ maxcached = callcache[bucket->max[idx]];
+
+ /*
+ * TODO Maybe it's possible to add here a similar optimization
+ * as for the MCV lists:
+ *
+ * (nmatches == 0) && AND-list => all eliminated (FALSE)
+ * (nmatches == N) && OR-list => all eliminated (TRUE)
+ *
+ * But it's more complex because of the partial matches.
+ */
+
+ /*
+ * If it's not a "<" or ">" or "=" operator, just ignore the
+ * clause. Otherwise note the relid and attnum for the variable.
+ *
+ * TODO I'm really unsure the handling of 'isgt' flag (that is, clauses
+ * with reverse order of variable/constant) is correct. I wouldn't
+ * be surprised if there was some mixup. Using the lt/gt operators
+ * instead of messing with the opproc could make it simpler.
+ * It would however be using a different operator than the query,
+ * although it's not any shadier than using the selectivity function
+ * as is done currently.
+ *
+ * FIXME Once the min/max values are deduplicated, we can easily minimize
+ * the number of calls to the comparator (assuming we keep the
+ * deduplicated structure). See the note on compression at MVBucket
+ * serialize/deserialize methods.
+ */
+ switch (oprrest)
+ {
+ case F_SCALARLTSEL: /* column < constant */
+
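+ /*
+ * Example (a sketch): for a bucket [10, 20] and a clause
+ * (a < 15), the first call (15 < 10) is false, so the
+ * bucket is not eliminated, and the second call (15 < 20)
+ * is true, so the bucket becomes a partial match.
+ */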
+ if (! isgt) /* (var < const) */
+ {
+ /*
+ * First check whether the constant is below the lower boundary (in that
+ * case we can skip the bucket, because there's no overlap).
+ */
+ if (! mincached)
+ {
+ tmp = DatumGetBool(FunctionCall2Coll(&opproc,
+ DEFAULT_COLLATION_OID,
+ cst->constvalue,
+ minval));
+
+ /*
+ * Update the cache, but with the inverse value, as we keep the
+ * cache for calls with (minval, constvalue).
+ */
+ callcache[bucket->min[idx]] = (tmp) ? 0x01 : 0x03;
+ }
+ else
+ tmp = !(mincached & 0x02); /* get call result from the cache (inverse) */
+
+ if (tmp)
+ {
+ /* no match */
+ UPDATE_RESULT(matches[i], MVSTATS_MATCH_NONE, is_or);
+ continue;
+ }
+
+ /*
+ * Now check whether the upper boundary is below the constant (in that
+ * case it's a partial match).
+ */
+ if (! maxcached)
+ {
+ tmp = DatumGetBool(FunctionCall2Coll(&opproc,
+ DEFAULT_COLLATION_OID,
+ cst->constvalue,
+ maxval));
+
+ /*
+ * Update the cache, but with the inverse value, as we keep the
+ * cache for calls with (maxval, constvalue).
+ */
+ callcache[bucket->max[idx]] = (tmp) ? 0x01 : 0x03;
+ }
+ else
+ tmp = !(maxcached & 0x02); /* extract the result (reverse) */
+
+ if (tmp) /* partial match */
+ UPDATE_RESULT(matches[i], MVSTATS_MATCH_PARTIAL, is_or);
+
+ }
+ else /* (const < var) */
+ {
+ /*
+ * First check whether the constant is above the upper boundary (in that
+ * case we can skip the bucket, because there's no overlap).
+ */
+ if (! maxcached)
+ {
+ tmp = DatumGetBool(FunctionCall2Coll(&opproc,
+ DEFAULT_COLLATION_OID,
+ maxval,
+ cst->constvalue));
+
+ /* Update the cache. */
+ callcache[bucket->max[idx]] = (tmp) ? 0x03 : 0x01;
+ }
+ else
+ tmp = (maxcached & 0x02); /* extract the result */
+
+ if (tmp)
+ {
+ /* no match */
+ UPDATE_RESULT(matches[i], MVSTATS_MATCH_NONE, is_or);
+ continue;
+ }
+
+ /*
+ * Now check whether the lower boundary is below the constant (in that
+ * case it's a partial match).
+ */
+ if (! mincached)
+ {
+ tmp = DatumGetBool(FunctionCall2Coll(&opproc,
+ DEFAULT_COLLATION_OID,
+ minval,
+ cst->constvalue));
+
+ /* Update the cache. */
+ callcache[bucket->min[idx]] = (tmp) ? 0x03 : 0x01;
+ }
+ else
+ tmp = (mincached & 0x02); /* extract the result */
+
+ if (tmp) /* partial match */
+ UPDATE_RESULT(matches[i], MVSTATS_MATCH_PARTIAL, is_or);
+ }
+ break;
+
+ case F_SCALARGTSEL: /* column > constant */
+
+ if (! isgt) /* (var > const) */
+ {
+ /*
+ * First check whether the constant is above the upper boundary (in that
+ * case we can skip the bucket, because there's no overlap).
+ */
+ if (! maxcached)
+ {
+ tmp = DatumGetBool(FunctionCall2Coll(&opproc,
+ DEFAULT_COLLATION_OID,
+ cst->constvalue,
+ maxval));
+
+ /*
+ * Update the cache, but with the inverse value, as we keep the
+ * cache for calls with (val, constvalue).
+ */
+ callcache[bucket->max[idx]] = (tmp) ? 0x01 : 0x03;
+ }
+ else
+ tmp = !(maxcached & 0x02); /* extract the result */
+
+ if (tmp)
+ {
+ /* no match */
+ UPDATE_RESULT(matches[i], MVSTATS_MATCH_NONE, is_or);
+ continue;
+ }
+
+ /*
+ * Now check whether the lower boundary is below the constant (in that
+ * case it's a partial match).
+ */
+ if (! mincached)
+ {
+ tmp = DatumGetBool(FunctionCall2Coll(&opproc,
+ DEFAULT_COLLATION_OID,
+ cst->constvalue,
+ minval));
+
+ /*
+ * Update the cache, but with the inverse value, as we keep the
+ * cache for calls with (val, constvalue).
+ */
+ callcache[bucket->min[idx]] = (tmp) ? 0x01 : 0x03;
+ }
+ else
+ tmp = !(mincached & 0x02); /* extract the result */
+
+ if (tmp)
+ /* partial match */
+ UPDATE_RESULT(matches[i], MVSTATS_MATCH_PARTIAL, is_or);
+ }
+ else /* (const > var) */
+ {
+ /*
+ * First check whether the constant is below the lower boundary (in
+ * that case we can skip the bucket, because there's no overlap).
+ */
+ if (! mincached)
+ {
+ tmp = DatumGetBool(FunctionCall2Coll(&opproc,
+ DEFAULT_COLLATION_OID,
+ minval,
+ cst->constvalue));
+
+ /* Update the cache. */
+ callcache[bucket->min[idx]] = (tmp) ? 0x03 : 0x01;
+ }
+ else
+ tmp = (mincached & 0x02); /* extract the result */
+
+ if (tmp)
+ {
+ /* no match */
+ UPDATE_RESULT(matches[i], MVSTATS_MATCH_NONE, is_or);
+ continue;
+ }
+
+ /*
+ * Now check whether the upper boundary is below the constant (in that
+ * case it's a partial match).
+ */
+ if (! maxcached)
+ {
+ tmp = DatumGetBool(FunctionCall2Coll(&opproc,
+ DEFAULT_COLLATION_OID,
+ maxval,
+ cst->constvalue));
+
+ /* Update the cache. */
+ callcache[bucket->max[idx]] = (tmp) ? 0x03 : 0x01;
+ }
+ else
+ tmp = (maxcached & 0x02); /* extract the result */
+
+ if (tmp)
+ /* partial match */
+ UPDATE_RESULT(matches[i], MVSTATS_MATCH_PARTIAL, is_or);
+ }
+ break;
+
+ case F_EQSEL:
+
+ /*
+ * We only check whether the value is within the bucket, using the lt/gt
+ * operators fetched from type cache.
+ *
+ * TODO We'll use the default 50% estimate, but that's probably way off
+ * if there are multiple distinct values. Consider tweaking this
+ * somehow, e.g. using only a part inversely proportional to the
+ * estimated number of distinct values in the bucket.
+ *
+ * TODO This does not handle inclusion flags at the moment, thus counting
+ * some buckets twice (when hitting the boundary).
+ *
+ * TODO One optimization: if max[i] == min[i], it's effectively an MCV
+ * item and we can count the whole bucket as a complete match (thus
+ * using 100% bucket selectivity and not just 50%).
+ *
+ * TODO Technically some buckets may "degenerate" into single-value
+ * buckets (not necessarily for all the dimensions) - maybe this
+ * is better than keeping a separate MCV list (multi-dimensional).
+ * Update: Actually, that's unlikely to be better than a separate
+ * MCV list for two reasons - first, it requires ~2x the space
+ * (because of storing lower/upper boundaries) and second because
+ * the buckets are ranges - depending on the partitioning algorithm
+ * it may not even degenerate into a (min=max) bucket. For example
+ * the current partitioning algorithm never does that.
+ */
+ if (! mincached)
+ {
+ tmp = DatumGetBool(FunctionCall2Coll(<proc,
+ DEFAULT_COLLATION_OID,
+ cst->constvalue,
+ minval));
+
+ /* Update the cache. */
+ callcache[bucket->min[idx]] = (tmp) ? 0x03 : 0x01;
+ }
+ else
+ tmp = (mincached & 0x02); /* extract the result */
+
+ if (tmp)
+ {
+ /* no match */
+ UPDATE_RESULT(matches[i], MVSTATS_MATCH_NONE, is_or);
+ continue;
+ }
+
+ if (! maxcached)
+ {
+ tmp = DatumGetBool(FunctionCall2Coll(<proc,
+ DEFAULT_COLLATION_OID,
+ maxval,
+ cst->constvalue));
+
+ /* Update the cache. */
+ callcache[bucket->max[idx]] = (tmp) ? 0x03 : 0x01;
+ }
+ else
+ tmp = (maxcached & 0x02); /* extract the result */
+
+ if (tmp)
+ {
+ /* no match */
+ UPDATE_RESULT(matches[i], MVSTATS_MATCH_NONE, is_or);
+ continue;
+ }
+
+ /* partial match */
+ UPDATE_RESULT(matches[i], MVSTATS_MATCH_PARTIAL, is_or);
+
+ break;
+ }
+ }
+ }
+ }
+ else if (IsA(clause, NullTest))
+ {
+ NullTest * expr = (NullTest*)clause;
+ Var * var = (Var*)(expr->arg);
+
+ /* FIXME properly match the attribute to the histogram dimension */
+ int idx = mv_get_index(var->varattno, stakeys);
+
+ /*
+ * Walk through the buckets and evaluate the current clause. We can
+ * skip items that were already ruled out, and terminate if there are
+ * no remaining buckets that might possibly match.
+ */
+ for (i = 0; i < mvhist->nbuckets; i++)
+ {
+ MVSerializedBucket bucket = mvhist->buckets[i];
+
+ /*
+ * Skip buckets that were already eliminated - this is important
+ * considering how we update the info (we only lower the match)
+ */
+ if ((! is_or) && (matches[i] == MVSTATS_MATCH_NONE))
+ continue;
+ else if (is_or && (matches[i] == MVSTATS_MATCH_FULL))
+ continue;
+
+ /* if the clause mismatches the bucket, set it as MATCH_NONE */
+ if ((expr->nulltesttype == IS_NULL)
+ && (! bucket->nullsonly[idx]))
+ UPDATE_RESULT(matches[i], MVSTATS_MATCH_NONE, is_or);
+
+ else if ((expr->nulltesttype == IS_NOT_NULL) &&
+ (bucket->nullsonly[idx]))
+ UPDATE_RESULT(matches[i], MVSTATS_MATCH_NONE, is_or);
+ }
+ }
+ else if (or_clause(clause) || and_clause(clause))
+ {
+ /* AND/OR clause, with all clauses compatible with the selected MV stat */
+
+ int i;
+ BoolExpr *orclause = ((BoolExpr*)clause);
+ List *orclauses = orclause->args;
+
+ /* match/mismatch bitmap for each bucket */
+ int or_nmatches = 0;
+ char * or_matches = NULL;
+
+ Assert(orclauses != NIL);
+ Assert(list_length(orclauses) >= 2);
+
+ /* number of matching buckets */
+ or_nmatches = mvhist->nbuckets;
+
+ /* by default none of the buckets matches the clauses */
+ or_matches = palloc0(sizeof(char) * or_nmatches);
+
+ if (or_clause(clause))
+ {
+ /* OR clauses assume nothing matches, initially */
+ memset(or_matches, MVSTATS_MATCH_NONE, sizeof(char)*or_nmatches);
+ or_nmatches = 0;
+ }
+ else
+ {
+ /* AND clauses assume everything matches, initially */
+ memset(or_matches, MVSTATS_MATCH_FULL, sizeof(char)*or_nmatches);
+ }
+
+ /* build the match bitmap for the OR-clauses */
+ or_nmatches = update_match_bitmap_histogram(root, orclauses,
+ stakeys, mvhist,
+ or_nmatches, or_matches, or_clause(clause));
+
+ /* merge the bitmap into the existing one */
+ for (i = 0; i < mvhist->nbuckets; i++)
+ {
+ /*
+ * To AND-merge the bitmaps, a MIN() semantics is used.
+ * For OR-merge, use MAX().
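+ * E.g. min(FULL, PARTIAL) = PARTIAL and max(NONE, PARTIAL) =
+ * PARTIAL, which is the expected behavior for AND and OR.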
+ *
+ * FIXME this does not decrease the number of matches
+ */
+ UPDATE_RESULT(matches[i], or_matches[i], is_or);
+ }
+
+ pfree(or_matches);
+
+ }
+ else
+ elog(ERROR, "unknown clause type: %d", clause->type);
+ }
+
+ /* free the call cache */
+ pfree(callcache);
+
+ return nmatches;
+}
diff --git a/src/backend/optimizer/util/plancat.c b/src/backend/optimizer/util/plancat.c
index 8c4396a..0dc575a 100644
--- a/src/backend/optimizer/util/plancat.c
+++ b/src/backend/optimizer/util/plancat.c
@@ -409,7 +409,7 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
mvstat = (Form_pg_mv_statistic) GETSTRUCT(htup);
/* unavailable stats are not interesting for the planner */
- if (mvstat->deps_built || mvstat->mcv_built)
+ if (mvstat->deps_built || mvstat->mcv_built || mvstat->hist_built)
{
info = makeNode(MVStatisticInfo);
@@ -419,10 +419,12 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
/* enabled statistics */
info->deps_enabled = mvstat->deps_enabled;
info->mcv_enabled = mvstat->mcv_enabled;
+ info->hist_enabled = mvstat->hist_enabled;
/* built/available statistics */
info->deps_built = mvstat->deps_built;
info->mcv_built = mvstat->mcv_built;
+ info->hist_built = mvstat->hist_built;
/* stakeys */
adatum = SysCacheGetAttr(MVSTATOID, htup,
diff --git a/src/backend/utils/mvstats/Makefile b/src/backend/utils/mvstats/Makefile
index f9bf10c..9dbb3b6 100644
--- a/src/backend/utils/mvstats/Makefile
+++ b/src/backend/utils/mvstats/Makefile
@@ -12,6 +12,6 @@ subdir = src/backend/utils/mvstats
top_builddir = ../../../..
include $(top_builddir)/src/Makefile.global
-OBJS = common.o dependencies.o mcv.o
+OBJS = common.o dependencies.o histogram.o mcv.o
include $(top_srcdir)/src/backend/common.mk
diff --git a/src/backend/utils/mvstats/common.c b/src/backend/utils/mvstats/common.c
index d1da714..ffb76f4 100644
--- a/src/backend/utils/mvstats/common.c
+++ b/src/backend/utils/mvstats/common.c
@@ -13,11 +13,11 @@
*
*-------------------------------------------------------------------------
*/
+#include "postgres.h"
+#include "utils/array.h"
#include "common.h"
-#include "utils/array.h"
-
static VacAttrStats ** lookup_var_attr_stats(int2vector *attrs,
int natts,
VacAttrStats **vacattrstats);
@@ -52,7 +52,8 @@ build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
MVStatisticInfo *stat = (MVStatisticInfo *)lfirst(lc);
MVDependencies deps = NULL;
MCVList mcvlist = NULL;
- int numrows_filtered = 0;
+ MVHistogram histogram = NULL;
+ int numrows_filtered = numrows;
VacAttrStats **stats = NULL;
int numatts = 0;
@@ -95,8 +96,12 @@ build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
if (stat->mcv_enabled)
mcvlist = build_mv_mcvlist(numrows, rows, attrs, stats, &numrows_filtered);
+ /* build a multivariate histogram on the columns */
+ if ((numrows_filtered > 0) && (stat->hist_enabled))
+ histogram = build_mv_histogram(numrows_filtered, rows, attrs, stats, numrows);
+
/* store the histogram / MCV list in the catalog */
- update_mv_stats(stat->mvoid, deps, mcvlist, attrs, stats);
+ update_mv_stats(stat->mvoid, deps, mcvlist, histogram, attrs, stats);
}
}
@@ -176,6 +181,8 @@ list_mv_stats(Oid relid)
info->deps_built = stats->deps_built;
info->mcv_enabled = stats->mcv_enabled;
info->mcv_built = stats->mcv_built;
+ info->hist_enabled = stats->hist_enabled;
+ info->hist_built = stats->hist_built;
result = lappend(result, info);
}
@@ -190,7 +197,6 @@ list_mv_stats(Oid relid)
return result;
}
-
/*
* Find attnims of MV stats using the mvoid.
*/
@@ -236,9 +242,16 @@ find_mv_attnums(Oid mvoid, Oid *relid)
}
+/*
+ * FIXME This adds statistics, but we need to drop statistics when the
+ * table is dropped. Not sure what to do when a column is dropped.
+ * Either we can (a) remove all stats on that column, (b) remove
+ * the column from defined stats and force rebuild, (c) remove the
+ * column on next ANALYZE. Or maybe something else?
+ */
void
update_mv_stats(Oid mvoid,
- MVDependencies dependencies, MCVList mcvlist,
+ MVDependencies dependencies, MCVList mcvlist, MVHistogram histogram,
int2vector *attrs, VacAttrStats **stats)
{
HeapTuple stup,
@@ -271,22 +284,34 @@ update_mv_stats(Oid mvoid,
values[Anum_pg_mv_statistic_stamcv - 1] = PointerGetDatum(data);
}
+ if (histogram != NULL)
+ {
+ bytea * data = serialize_mv_histogram(histogram, attrs, stats);
+ nulls[Anum_pg_mv_statistic_stahist-1] = (data == NULL);
+ values[Anum_pg_mv_statistic_stahist - 1]
+ = PointerGetDatum(data);
+ }
+
/* always replace the value (either by bytea or NULL) */
replaces[Anum_pg_mv_statistic_stadeps -1] = true;
replaces[Anum_pg_mv_statistic_stamcv -1] = true;
+ replaces[Anum_pg_mv_statistic_stahist-1] = true;
/* always change the availability flags */
nulls[Anum_pg_mv_statistic_deps_built -1] = false;
nulls[Anum_pg_mv_statistic_mcv_built -1] = false;
+ nulls[Anum_pg_mv_statistic_hist_built-1] = false;
nulls[Anum_pg_mv_statistic_stakeys-1] = false;
/* use the new attnums, in case we removed some dropped ones */
replaces[Anum_pg_mv_statistic_deps_built-1] = true;
replaces[Anum_pg_mv_statistic_mcv_built -1] = true;
+ replaces[Anum_pg_mv_statistic_hist_built -1] = true;
replaces[Anum_pg_mv_statistic_stakeys -1] = true;
values[Anum_pg_mv_statistic_deps_built-1] = BoolGetDatum(dependencies != NULL);
values[Anum_pg_mv_statistic_mcv_built -1] = BoolGetDatum(mcvlist != NULL);
+ values[Anum_pg_mv_statistic_hist_built -1] = BoolGetDatum(histogram != NULL);
values[Anum_pg_mv_statistic_stakeys -1] = PointerGetDatum(attrs);
/* Is there already a pg_mv_statistic tuple for this attribute? */
diff --git a/src/backend/utils/mvstats/histogram.c b/src/backend/utils/mvstats/histogram.c
new file mode 100644
index 0000000..6290d2f
--- /dev/null
+++ b/src/backend/utils/mvstats/histogram.c
@@ -0,0 +1,2188 @@
+/*-------------------------------------------------------------------------
+ *
+ * histogram.c
+ * POSTGRES multivariate histograms
+ *
+ *
+ * Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ * src/backend/utils/mvstats/histogram.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres.h"
+
+#include "fmgr.h"
+#include "funcapi.h"
+
+#include "utils/lsyscache.h"
+
+#include "common.h"
+#include <math.h>
+
+/*
+ * Multivariate histograms
+ * -----------------------
+ *
+ * Histograms are a collection of buckets, represented by n-dimensional
+ * rectangles. Each rectangle is delimited by a min/max value in each
+ * dimension, stored in an array, so that the bucket includes values
+ * fulfilling condition
+ *
+ * min[i] <= value[i] <= max[i]
+ *
+ * where 'i' is the dimension. In 1D this corresponds to a simple
+ * interval, in 2D to a rectangle, and in 3D to a block. If you can
+ * imagine this in 4D, congrats!
+ *
+ * In addition to the boundaries, each bucket tracks additional details:
+ *
+ * * frequency (fraction of tuples it matches)
+ * * whether the boundaries are inclusive or exclusive
+ * * whether the dimension contains only NULL values
+ * * number of distinct values in each dimension (for building)
+ *
+ * and possibly some additional information.
+ *
+ * We do expect to support multiple histogram types, with different
+ * features etc. The 'type' field is used to identify those types.
+ * Technically some histogram types might use completely different
+ * bucket representation, but that's not expected at the moment.
+ *
+ * Although the current implementation builds non-overlapping buckets,
+ * the code does not (and should not) rely on the non-overlapping
+ * nature - there are interesting types of histograms / histogram
+ * building algorithms producing overlapping buckets.
+ *
+ *
+ * NULL handling (create_null_buckets)
+ * -----------------------------------
+ * Another thing worth mentioning is handling of NULL values. It would
+ * be quite difficult to work with buckets containing NULL and non-NULL
+ * values for a single dimension. To work around this, the initial step
+ * in building a histogram is building a set of 'NULL-buckets', i.e.
+ * buckets with one or more NULL-only dimensions.
+ *
+ * After that, no buckets are mixing NULL and non-NULL values in one
+ * dimension, and the actual histogram building starts. As that only
+ * splits the buckets into smaller ones, the resulting buckets can't
+ * mix NULL and non-NULL values either.
+ *
+ * The maximum number of NULL-buckets is determined by the number of
+ * attributes the histogram is built on. For N-dimensional histogram,
+ * the maximum number of NULL-buckets is 2^N. So for 8 attributes
+ * (which is the current value of MVSTATS_MAX_DIMENSIONS), there may be
+ * up to 256 NULL-buckets.
+ *
+ * Those buckets are only built if needed - if there are no NULL values
+ * in the data, no such buckets are built.
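+ *
+ * For example, for a histogram on two columns (a, b) there may be
+ * NULL-buckets for (a NULL, b NULL), (a NULL, b non-NULL) and
+ * (a non-NULL, b NULL), i.e. together with the regular non-NULL
+ * buckets up to 2^2 = 4 combinations.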
+ *
+ *
+ * Estimating selectivity
+ * ----------------------
+ * With histograms, we always "match" a whole bucket, not individual
+ * rows (or values), irrespective of the type of clause. Therefore we
+ * can't use the optimizations for equality clauses, as in MCV lists.
+ *
+ * The current implementation uses histograms to estimate these types
+ * of clauses (think of WHERE conditions):
+ *
+ * (a) equality clauses WHERE (a = 1) AND (b = 2)
+ * (b) inequality clauses WHERE (a < 1) AND (b >= 2)
+ * (c) NULL clauses WHERE (a IS NULL) AND (b IS NOT NULL)
+ * (d) OR-clauses WHERE (a = 1) OR (b = 2)
+ *
+ * It's possible to add more clauses, for example:
+ *
+ * (e) multi-var clauses WHERE (a > b)
+ *
+ * and so on. These are tasks for the future, not yet implemented.
+ *
+ * When used on low-cardinality data, histograms usually perform
+ * considerably worse than MCV lists (which are a good fit for this
+ * kind of data). This is especially true on categorical data, where
+ * ordering of the values is mostly unrelated to meaning of the data,
+ * as proper ordering is crucial for histograms.
+ *
+ * On high-cardinality data the histograms are usually a better choice,
+ * because MCV lists can't represent the distribution accurately enough.
+ *
+ * By evaluating a clause on a bucket, we may get one of three results:
+ *
+ * (a) FULL_MATCH - The bucket definitely matches the clause.
+ *
+ * (b) PARTIAL_MATCH - The bucket matches the clause, but not
+ * necessarily all the tuples it represents.
+ *
+ * (c) NO_MATCH - The bucket definitely does not match the clause.
+ *
+ * This may be illustrated using a range [1, 5], which is essentially
+ * a 1-D bucket. With clause
+ *
+ * WHERE (a < 10) => FULL_MATCH (all range values are below
+ * 10, so the whole bucket matches)
+ *
+ * WHERE (a < 3) => PARTIAL_MATCH (there may be values matching
+ * the clause, but we don't know how many)
+ *
+ * WHERE (a < 0) => NO_MATCH (the whole range is above 0, so
+ * no values from the bucket can match)
+ *
+ * Some clauses may produce only some of those results - for example
+ * equality clauses may never produce FULL_MATCH as we always hit only
+ * part of the bucket (we can't match both boundaries at the same time).
+ * This results in less accurate estimates compared to MCV lists, where
+ * we can hit an MCV item exactly (there's no PARTIAL match in MCV).
+ *
+ * There are clauses that may not produce any PARTIAL_MATCH results.
+ * A nice example of that is 'IS [NOT] NULL' clause, which either
+ * matches the bucket completely (FULL_MATCH) or not at all (NO_MATCH),
+ * thanks to how the NULL-buckets are constructed.
+ *
+ * Computing the total selectivity estimate is trivial - simply sum
+ * selectivities from all the FULL_MATCH and PARTIAL_MATCH buckets (but
+ * multiply the PARTIAL_MATCH buckets by 0.5 to minimize average error).
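+ *
+ * In pseudo-notation:
+ *
+ * selectivity = sum(frequency of FULL_MATCH buckets)
+ * + 0.5 * sum(frequency of PARTIAL_MATCH buckets)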
+ *
+ *
+ * Serialization
+ * -------------
+ * After building, the histogram is serialized into a more efficient
+ * form (dedup boundary values etc.). See serialize_mv_histogram() for
+ * more details about how it's done.
+ *
+ * Serialized histograms are marked with 'magic' constant, to make it
+ * easier to check the bytea value really is a serialized histogram.
+ *
+ * In the serialized form, values for each dimension are deduplicated,
+ * and referenced using an uint16 index. This saves a lot of space,
+ * because every time we split a bucket, we introduce a single new
+ * boundary value (to split the bucket by the selected dimension), but
+ * we actually copy all the boundary values for all dimensions. So for
+ * a histogram with 4 dimensions and 1000 buckets, we do have
+ *
+ * 1000 * 4 * 2 = 8000
+ *
+ * boundary values, but many of them are actually duplicated because
+ * the histogram started with a single bucket (8 boundary values) and
+ * then there were 999 splits (each introducing 1 new value):
+ *
+ * 8 + 999 = 1007
+ *
+ * So that's quite a large difference. Let's assume the Datum values are
+ * 8 bytes each. Storing the raw histogram would take ~ 64 kB, while
+ * with deduplication it's only ~18 kB.
+ *
+ * The difference may be removed by the transparent bytea compression,
+ * but the deduplication is also used to optimize the estimation. It's
+ * possible to process the deduplicated values, and then use this as
+ * a cache to minimize the actual function calls while checking the
+ * buckets. This significantly reduces the number of calls to the
+ * (often quite expensive) operator functions etc.
+ *
+ *
+ * The current limit on number of buckets (16384) is mostly arbitrary,
+ * but set so that it makes sure we don't exceed the number of distinct
+ * values indexable by uint16. In practice we could handle more buckets,
+ * because we index each dimension independently, and we do the splits
+ * over multiple dimensions.
+ *
+ * Histograms with more than 16k buckets are quite expensive to build
+ * and process, so the current limit is somewhat reasonable.
+ *
+ * The actual number of buckets is also related to statistics target,
+ * because we require MIN_BUCKET_ROWS (10) tuples per bucket before
+ * a split, so we can't have more than (2 * 300 * target / 10) buckets.
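+ *
+ * (For the default statistics target of 100 that works out to
+ * 2 * 300 * 100 / 10 = 6000 buckets, safely below the 16384 limit.)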
+ *
+ *
+ * TODO Maybe the distinct stats (both for combination of all columns
+ * and for combinations of various subsets of columns) should be
+ * moved to a separate structure (next to histogram/MCV/...) to
+ * make it useful even without a histogram computed etc.
+ *
+ * This would actually make mvcoeff (proposed by Kyotaro Horiguchi
+ * in [1]) possible. Seems like a good way to estimate GROUP BY
+ * cardinality, and also some other cases, pointed out by Kyotaro:
+ *
+ * [1] http://www.postgresql.org/message-id/20150515.152936.83796179.horiguchi.kyotaro@lab.ntt.co.jp
+ *
+ * This is not implemented at the moment, though. Also, Kyotaro's
+ * patch only works with pairs of columns, but maybe tracking all
+ * the combinations would be useful to handle more complex
+ * conditions. It only seems to handle equalities, though (but for
+ * GROUP BY estimation that's not a big deal).
+ */
+
+static MVBucket create_initial_mv_bucket(int numrows, HeapTuple *rows,
+ int2vector *attrs,
+ VacAttrStats **stats);
+
+static MVBucket select_bucket_to_partition(int nbuckets, MVBucket * buckets);
+
+static MVBucket partition_bucket(MVBucket bucket, int2vector *attrs,
+ VacAttrStats **stats,
+ int *ndistvalues, Datum **distvalues);
+
+static MVBucket copy_mv_bucket(MVBucket bucket, uint32 ndimensions);
+
+static void update_bucket_ndistinct(MVBucket bucket, int2vector *attrs,
+ VacAttrStats ** stats);
+
+static void update_dimension_ndistinct(MVBucket bucket, int dimension,
+ int2vector *attrs,
+ VacAttrStats ** stats,
+ bool update_boundaries);
+
+static void create_null_buckets(MVHistogram histogram, int bucket_idx,
+ int2vector *attrs, VacAttrStats ** stats);
+
+static int bsearch_comparator(const void * a, const void * b);
+
+/*
+ * Each serialized bucket needs to store (in this order):
+ *
+ * - number of tuples (float)
+ * - min inclusive flags (ndim * sizeof(bool))
+ * - max inclusive flags (ndim * sizeof(bool))
+ * - null dimension flags (ndim * sizeof(bool))
+ * - min boundary indexes (ndim * sizeof(uint16))
+ * - max boundary indexes (ndim * sizeof(uint16))
+ *
+ * So in total (the macro below reserves twice the space actually
+ * needed for the indexes, which is harmless):
+ *
+ * ndims * (4 * sizeof(uint16) + 3 * sizeof(bool)) +
+ * sizeof(float)
+ */
+#define BUCKET_SIZE(ndims) \
+ (ndims * (4 * sizeof(uint16) + 3 * sizeof(bool)) + sizeof(float))
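+
+/*
+ * For example, with ndims = 2 this amounts to 2 * (4 * 2 + 3 * 1) + 4
+ * = 26 bytes per serialized bucket (assuming 2-byte uint16, 1-byte
+ * bool and 4-byte float).
+ */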
+
+/* pointers into a flat serialized bucket of BUCKET_SIZE(n) bytes */
+#define BUCKET_NTUPLES(b) ((float*)b)
+#define BUCKET_MIN_INCL(b,n) ((bool*)(b + sizeof(float)))
+#define BUCKET_MAX_INCL(b,n) (BUCKET_MIN_INCL(b,n) + n)
+#define BUCKET_NULLS_ONLY(b,n) (BUCKET_MAX_INCL(b,n) + n)
+#define BUCKET_MIN_INDEXES(b,n) ((uint16*)(BUCKET_NULLS_ONLY(b,n) + n))
+#define BUCKET_MAX_INDEXES(b,n) ((BUCKET_MIN_INDEXES(b,n) + n))
+
+/* can't split bucket with less than 10 rows */
+#define MIN_BUCKET_ROWS 10
+
+/*
+ * Data used while building the histogram.
+ */
+typedef struct HistogramBuildData {
+
+ float ndistinct; /* frequency of distinct values */
+
+ HeapTuple *rows; /* array of sample rows */
+ uint32 numrows; /* number of sample rows (array size) */
+
+ /*
+ * Number of distinct values in each dimension. This is used when
+ * building the histogram (and is not serialized/deserialized).
+ */
+ uint32 *ndistincts;
+
+} HistogramBuildData;
+
+typedef HistogramBuildData *HistogramBuild;
+
+/*
+ * Building a multivariate histogram. In short, it first creates a single
+ * bucket containing all the rows, and then repeatedly splits it, first
+ * searching for the bucket / dimension most in need of a split.
+ *
+ * The current criteria is rather simple, chosen so that the algorithm
+ * produces buckets with about equal frequency and regular size.
+ *
+ * See the discussion at select_bucket_to_partition and partition_bucket
+ * for more details about the algorithm.
+ *
+ * The current algorithm works like this:
+ *
+ * build NULL-buckets (create_null_buckets)
+ *
+ * while [not reaching maximum number of buckets]
+ *
+ * choose bucket to partition (largest bucket)
+ * if no bucket to partition
+ * terminate the algorithm
+ *
+ * choose bucket dimension to partition (largest dimension)
+ * split the bucket into two buckets
+ */
+MVHistogram
+build_mv_histogram(int numrows, HeapTuple *rows, int2vector *attrs,
+ VacAttrStats **stats, int numrows_total)
+{
+ int i;
+ int numattrs = attrs->dim1;
+
+ int *ndistvalues;
+ Datum **distvalues;
+
+ MVHistogram histogram = (MVHistogram)palloc0(sizeof(MVHistogramData));
+
+ HeapTuple * rows_copy = (HeapTuple*)palloc0(numrows * sizeof(HeapTuple));
+ memcpy(rows_copy, rows, sizeof(HeapTuple) * numrows);
+
+ Assert((numattrs >= 2) && (numattrs <= MVSTATS_MAX_DIMENSIONS));
+
+ histogram->ndimensions = numattrs;
+
+ histogram->magic = MVSTAT_HIST_MAGIC;
+ histogram->type = MVSTAT_HIST_TYPE_BASIC;
+ histogram->nbuckets = 1;
+
+ /* create max buckets (better than repalloc for short-lived objects) */
+ histogram->buckets
+ = (MVBucket*)palloc0(MVSTAT_HIST_MAX_BUCKETS * sizeof(MVBucket));
+
+ /* create the initial bucket, covering the whole sample set */
+ histogram->buckets[0]
+ = create_initial_mv_bucket(numrows, rows_copy, attrs, stats);
+
+ /*
+ * Collect info on distinct values in each dimension (used later
+ * to select dimension to partition).
+ */
+ ndistvalues = (int*)palloc0(sizeof(int) * numattrs);
+ distvalues = (Datum**)palloc0(sizeof(Datum*) * numattrs);
+
+ for (i = 0; i < numattrs; i++)
+ {
+ int j;
+ int nvals;
+ Datum *tmp;
+
+ SortSupportData ssup;
+ StdAnalyzeData *mystats = (StdAnalyzeData *) stats[i]->extra_data;
+
+ /* initialize sort support, etc. */
+ memset(&ssup, 0, sizeof(ssup));
+ ssup.ssup_cxt = CurrentMemoryContext;
+
+ /* We always use the default collation for statistics */
+ ssup.ssup_collation = DEFAULT_COLLATION_OID;
+ ssup.ssup_nulls_first = false;
+
+ PrepareSortSupportFromOrderingOp(mystats->ltopr, &ssup);
+
+ nvals = 0;
+ tmp = (Datum*)palloc0(sizeof(Datum) * numrows);
+
+ for (j = 0; j < numrows; j++)
+ {
+ bool isnull;
+
+ /* fetch the attribute value for this sample row */
+ Datum value = heap_getattr(rows[j], attrs->values[i],
+ stats[i]->tupDesc, &isnull);
+
+ if (isnull)
+ continue;
+
+ tmp[nvals++] = value;
+ }
+
+ /* do the sort and stuff only if there are non-NULL values */
+ if (nvals > 0)
+ {
+ /* sort the array of values */
+ qsort_arg((void *) tmp, nvals, sizeof(Datum),
+ compare_scalars_simple, (void *) &ssup);
+
+ /* count distinct values */
+ ndistvalues[i] = 1;
+ for (j = 1; j < nvals; j++)
+ if (compare_scalars_simple(&tmp[j], &tmp[j-1], &ssup) != 0)
+ ndistvalues[i] += 1;
+
+ /* FIXME allocate only needed space (count ndistinct first) */
+ distvalues[i] = (Datum*)palloc0(sizeof(Datum) * ndistvalues[i]);
+
+ /* now collect distinct values into the array */
+ distvalues[i][0] = tmp[0];
+ ndistvalues[i] = 1;
+
+ for (j = 1; j < nvals; j++)
+ {
+ if (compare_scalars_simple(&tmp[j], &tmp[j-1], &ssup) != 0)
+ {
+ distvalues[i][ndistvalues[i]] = tmp[j];
+ ndistvalues[i] += 1;
+ }
+ }
+ }
+
+ pfree(tmp);
+ }
+
+ /*
+ * The initial bucket may contain NULL values, so we have to create
+ * buckets with NULL-only dimensions.
+ *
+ * FIXME We may need up to 2^ndims buckets - check that there are
+ * enough buckets (MVSTAT_HIST_MAX_BUCKETS >= 2^ndims).
+ */
+ create_null_buckets(histogram, 0, attrs, stats);
+
+ while (histogram->nbuckets < MVSTAT_HIST_MAX_BUCKETS)
+ {
+ MVBucket bucket = select_bucket_to_partition(histogram->nbuckets,
+ histogram->buckets);
+
+ /* no more buckets to partition */
+ if (bucket == NULL)
+ break;
+
+ histogram->buckets[histogram->nbuckets]
+ = partition_bucket(bucket, attrs, stats,
+ ndistvalues, distvalues);
+
+ histogram->nbuckets += 1;
+ }
+
+ /* finalize the frequencies etc. */
+ for (i = 0; i < histogram->nbuckets; i++)
+ {
+ HistogramBuild build_data
+ = ((HistogramBuild)histogram->buckets[i]->build_data);
+
+ /*
+ * The frequency has to be computed from the whole sample, in
+ * case some of the rows were used for MCV (and thus are missing
+ * from the histogram).
+ */
+ histogram->buckets[i]->ntuples
+ = (build_data->numrows * 1.0) / numrows_total;
+ }
+
+ return histogram;
+}
+
+/* fetch the histogram (as a bytea) from the pg_mv_statistic catalog */
+MVSerializedHistogram
+load_mv_histogram(Oid mvoid)
+{
+ bool isnull = false;
+ Datum histogram;
+
+#ifdef USE_ASSERT_CHECKING
+ Form_pg_mv_statistic mvstat;
+#endif
+
+ /* Fetch the pg_mv_statistic tuple for the given stats OID. */
+ HeapTuple htup = SearchSysCache1(MVSTATOID, ObjectIdGetDatum(mvoid));
+
+ if (! HeapTupleIsValid(htup))
+ return NULL;
+
+#ifdef USE_ASSERT_CHECKING
+ mvstat = (Form_pg_mv_statistic) GETSTRUCT(htup);
+ Assert(mvstat->hist_enabled && mvstat->hist_built);
+#endif
+
+ histogram = SysCacheGetAttr(MVSTATOID, htup,
+ Anum_pg_mv_statistic_stahist, &isnull);
+
+ Assert(!isnull);
+
+ ReleaseSysCache(htup);
+
+ return deserialize_mv_histogram(DatumGetByteaP(histogram));
+}
+
+/* print some basic info about the histogram */
+Datum
+pg_mv_stats_histogram_info(PG_FUNCTION_ARGS)
+{
+ bytea *data = PG_GETARG_BYTEA_P(0);
+ char *result;
+
+ MVSerializedHistogram hist = deserialize_mv_histogram(data);
+
+ result = palloc0(128);
+ snprintf(result, 128, "nbuckets=%d", hist->nbuckets);
+
+ PG_RETURN_TEXT_P(cstring_to_text(result));
+}
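+
+/*
+ * Usage sketch (mirroring the earlier MCV regression test; 'test' is
+ * a hypothetical table with histogram stats defined):
+ *
+ * SELECT hist_enabled, hist_built, pg_mv_stats_histogram_info(stahist)
+ * FROM pg_mv_statistic WHERE starelid = 'test'::regclass;
+ */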
+
+
+/* used to pass context into bsearch() */
+static SortSupport ssup_private = NULL;
+
+/*
+ * Serialize the MV histogram into a bytea value. The basic algorithm
+ * is simple, and mostly mimics the MCV serialization:
+ *
+ * (1) perform deduplication for each attribute (separately)
+ * (a) collect all (non-NULL) attribute values from all buckets
+ * (b) sort the data (using 'lt' from VacAttrStats)
+ * (c) remove duplicate values from the array
+ *
+ * (2) serialize the arrays into a bytea value
+ *
+ * (3) process all buckets
+ * (a) replace min/max values with indexes into the arrays
+ *
+ * Each attribute has to be processed separately, because we're mixing
+ * different datatypes, and we don't know what equality means for them.
+ * We're also mixing pass-by-value and pass-by-ref types, and so on.
+ *
+ * We'll use uint16 values for the indexes in step (3), which is
+ * enough as we don't allow more than 16k buckets in the histogram
+ * (see MVSTAT_HIST_MAX_BUCKETS). Many of the index bytes will still
+ * be 0x00, so the varlena compression should kick in nicely.
+ *
+ *
+ * Deduplication in serialization
+ * ------------------------------
+ * The deduplication is very effective and important here, because every
+ * time we split a bucket, we keep all the boundary values, except for
+ * the dimension that was used for the split. Another way to look at
+ * this is that each split introduces 1 new value (the value used to do
+ * the split). A histogram with M buckets was created by (M-1) splits
+ * of the initial bucket, and each bucket has 2*N boundary values. So
+ * assuming the initial bucket does not have any 'collapsed' dimensions,
+ * the number of distinct values is
+ *
+ * (2*N + (M-1))
+ *
+ * but the total number of boundary values is
+ *
+ * 2*N*M
+ *
+ * which is clearly much higher. For a histogram on two columns, with
+ * 1024 buckets, it's 1027 vs. 4096. Of course, we're not saving all
+ * the difference (because we'll use 32-bit indexes into the values).
+ * But with large values (e.g. stored as varlena), this saves a lot.
+ *
+ * An interesting feature is that the total number of distinct values
+ * does not really grow with the number of dimensions, except for the
+ * size of the initial bucket. After that it only depends on number of
+ * buckets (i.e. number of splits).
+ *
+ * XXX Of course this only holds for the current histogram building
+ * algorithm. Algorithms doing the splits differently (e.g.
+ * producing overlapping buckets) may behave differently.
+ *
+ * TODO This only confirms we can use the uint16 indexes. The worst
+ * that could happen is if all the splits happened by a single
+ * dimension. To exhaust the uint16 this would require ~64k
+ * splits (needs to be reflected in MVSTAT_HIST_MAX_BUCKETS).
+ *
+ * TODO We don't need to use a separate boolean for each flag, instead
+ * use a single char and set bits.
+ *
+ * TODO We might get a bit better compression by considering the actual
+ * data type length. The current implementation treats all data
+ * types passed by value as requiring 8B, but for INT it's actually
+ * just 4B etc.
+ *
+ * OTOH this is only related to the lookup table, and most of the
+ * space is occupied by the buckets (with int16 indexes).
+ *
+ *
+ * Varlena compression
+ * -------------------
+ * This encoding may prevent automatic varlena compression (similarly
+ * to JSONB), because first part of the serialized bytea will be an
+ * array of unique values (although sorted), and pglz decides whether
+ * to compress by trying to compress the first part (~1kB or so), which
+ * is likely to compress poorly, due to the lack of repetition.
+ *
+ * One possible cure to that might be storing the buckets first, and
+ * then the deduplicated arrays. The buckets might be better suited
+ * for compression.
+ *
+ * On the other hand the encoding scheme is a context-aware compression,
+ * usually compressing to ~30% (or less, with large data types). So the
+ * lack of pglz compression may be OK.
+ *
+ * XXX But maybe we don't really want to compress this, to save on
+ * planning time?
+ *
+ * TODO Try storing the buckets / deduplicated arrays in reverse order,
+ * measure impact on compression.
+ *
+ *
+ * Deserialization
+ * ---------------
+ * The deserialization is currently implemented so that it reconstructs
+ * the histogram back into the same structures - this involves quite
+ * a few of memcpy() and palloc(), but maybe we could create a special
+ * structure for the serialized histogram, and access the data directly,
+ * without the unpacking.
+ *
+ * Not only would it save some memory and CPU time, but it might actually
+ * work better with CPU caches (not polluting the caches).
+ *
+ * TODO Try to keep the compressed form, instead of deserializing it to
+ * MVHistogram/MVBucket.
+ *
+ *
+ * General TODOs
+ * -------------
+ * FIXME This probably leaks memory, or at least uses it inefficiently
+ * (many small palloc() calls instead of a large one).
+ *
+ * TODO Consider packing boolean flags (NULL) for each item into 'char'
+ * or a longer type (instead of using an array of bool items).
+ */
+bytea *
+serialize_mv_histogram(MVHistogram histogram, int2vector *attrs,
+ VacAttrStats **stats)
+{
+ int i = 0, j = 0;
+ Size total_length = 0;
+
+ bytea *output = NULL;
+ char *data = NULL;
+
+ int nbuckets = histogram->nbuckets;
+ int ndims = histogram->ndimensions;
+
+ /* allocated for serialized bucket data */
+ int bucketsize = BUCKET_SIZE(ndims);
+ char *bucket = palloc0(bucketsize);
+
+ /* values per dimension (and number of non-NULL values) */
+ Datum **values = (Datum**)palloc0(sizeof(Datum*) * ndims);
+ int *counts = (int*)palloc0(sizeof(int) * ndims);
+
+ /* info about dimensions (for deserialize) */
+ DimensionInfo * info
+ = (DimensionInfo *)palloc0(sizeof(DimensionInfo)*ndims);
+
+ /* sort support data */
+ SortSupport ssup = (SortSupport)palloc0(sizeof(SortSupportData)*ndims);
+
+ /* collect and deduplicate values for each dimension separately */
+ for (i = 0; i < ndims; i++)
+ {
+ int count;
+ StdAnalyzeData *tmp = (StdAnalyzeData *)stats[i]->extra_data;
+
+ /* keep important info about the data type */
+ info[i].typlen = stats[i]->attrtype->typlen;
+ info[i].typbyval = stats[i]->attrtype->typbyval;
+
+ /*
+ * Allocate space for all min/max values, including NULLs
+ * (we won't use them, but we don't know how many there are),
+ * and then collect all non-NULL values.
+ */
+ values[i] = (Datum*)palloc0(sizeof(Datum) * nbuckets * 2);
+
+ for (j = 0; j < histogram->nbuckets; j++)
+ {
+ /* skip buckets where this dimension is NULL-only */
+ if (! histogram->buckets[j]->nullsonly[i])
+ {
+ values[i][counts[i]] = histogram->buckets[j]->min[i];
+ counts[i] += 1;
+
+ values[i][counts[i]] = histogram->buckets[j]->max[i];
+ counts[i] += 1;
+ }
+ }
+
+ /* there are just NULL values in this dimension */
+ if (counts[i] == 0)
+ continue;
+
+ /* sort and deduplicate */
+ ssup[i].ssup_cxt = CurrentMemoryContext;
+ ssup[i].ssup_collation = DEFAULT_COLLATION_OID;
+ ssup[i].ssup_nulls_first = false;
+
+ PrepareSortSupportFromOrderingOp(tmp->ltopr, &ssup[i]);
+
+ qsort_arg(values[i], counts[i], sizeof(Datum),
+ compare_scalars_simple, &ssup[i]);
+
+ /*
+ * Walk through the array and eliminate duplicate values, but
+ * keep the ordering (so that we can do bsearch later). We know
+ * there's at least 1 item, so we can skip the first element.
+ */
+ count = 1; /* number of deduplicated items */
+ for (j = 1; j < counts[i]; j++)
+ {
+ /* if it's different from the previous value, we need to keep it */
+ if (compare_datums_simple(values[i][j-1], values[i][j], &ssup[i]) != 0)
+ {
+ /* XXX: not needed if (count == j) */
+ values[i][count] = values[i][j];
+ count += 1;
+ }
+ }
+
+ /* make sure we fit into uint16 */
+ Assert(count <= UINT16_MAX);
+
+ /* keep info about the deduplicated count */
+ info[i].nvalues = count;
+
+ /* compute size of the serialized data */
+ if (info[i].typlen > 0)
+ /* byval or byref, but with fixed length (name, tid, ...) */
+ info[i].nbytes = info[i].nvalues * info[i].typlen;
+ else if (info[i].typlen == -1)
+ /* varlena, so just use VARSIZE_ANY */
+ for (j = 0; j < info[i].nvalues; j++)
+ info[i].nbytes += VARSIZE_ANY(values[i][j]);
+ else if (info[i].typlen == -2)
+ /* cstring, so simply strlen */
+ for (j = 0; j < info[i].nvalues; j++)
+ info[i].nbytes += strlen(DatumGetPointer(values[i][j]));
+ else
+ elog(ERROR, "unknown data type typbyval=%d typlen=%d",
+ info[i].typbyval, info[i].typlen);
+ }
+
+ /*
+ * Now we finally know how much space we'll need for the serialized
+ * histogram, as it contains these fields:
+ *
+ * - length (4B) for varlena
+ * - magic (4B)
+ * - type (4B)
+ * - ndimensions (4B)
+ * - nbuckets (4B)
+ * - info (ndim * sizeof(DimensionInfo))
+ * - arrays of values for each dimension
+ * - serialized buckets (nbuckets * bucketsize)
+ *
+ * So the 'header' size is 20B + ndim * sizeof(DimensionInfo) and
+ * then we'll place the data (and buckets).
+ */
+ total_length = (sizeof(int32) + offsetof(MVHistogramData, buckets)
+ + ndims * sizeof(DimensionInfo)
+ + nbuckets * bucketsize);
+
+ /* account for the deduplicated data */
+ for (i = 0; i < ndims; i++)
+ total_length += info[i].nbytes;
+
+ /* enforce arbitrary limit of 10MB */
+ if (total_length > (10 * 1024 * 1024))
+ elog(ERROR, "serialized histogram exceeds 10MB (%ld > %d)",
+ total_length, (10 * 1024 * 1024));
+
+ /* allocate space for the serialized histogram list, set header */
+ output = (bytea*)palloc0(total_length);
+ SET_VARSIZE(output, total_length);
+
+ /* we'll use 'data' to keep track of the place to write data */
+ data = VARDATA(output);
+
+ memcpy(data, histogram, offsetof(MVHistogramData, buckets));
+ data += offsetof(MVHistogramData, buckets);
+
+ memcpy(data, info, sizeof(DimensionInfo) * ndims);
+ data += sizeof(DimensionInfo) * ndims;
+
+ /* value array for each dimension */
+ for (i = 0; i < ndims; i++)
+ {
+#ifdef USE_ASSERT_CHECKING
+ char *tmp = data;
+#endif
+ for (j = 0; j < info[i].nvalues; j++)
+ {
+ if (info[i].typlen > 0)
+ {
+ /* passed by value or reference, but fixed length */
+ memcpy(data, &values[i][j], info[i].typlen);
+ data += info[i].typlen;
+ }
+ else if (info[i].typlen == -1)
+ {
+ /* varlena */
+ memcpy(data, DatumGetPointer(values[i][j]),
+ VARSIZE_ANY(values[i][j]));
+ data += VARSIZE_ANY(values[i][j]);
+ }
+ else if (info[i].typlen == -2)
+ {
+ /* cstring (don't forget the \0 terminator!) */
+ memcpy(data, DatumGetPointer(values[i][j]),
+ strlen(DatumGetPointer(values[i][j])) + 1);
+ data += strlen(DatumGetPointer(values[i][j])) + 1;
+ }
+ }
+ Assert((data - tmp) == info[i].nbytes);
+ }
+
+ /* and finally, the histogram buckets */
+ for (i = 0; i < nbuckets; i++)
+ {
+ /* don't write beyond the allocated space */
+ Assert(data <= (char*)output + total_length - bucketsize);
+
+ /* reset the values for each item */
+ memset(bucket, 0, bucketsize);
+
+ *BUCKET_NTUPLES(bucket) = histogram->buckets[i]->ntuples;
+
+ for (j = 0; j < ndims; j++)
+ {
+ /* do the lookup only for non-NULL values */
+ if (! histogram->buckets[i]->nullsonly[j])
+ {
+ uint16 idx;
+ Datum * v = NULL;
+ ssup_private = &ssup[j];
+
+ /* min boundary */
+ v = (Datum*)bsearch(&histogram->buckets[i]->min[j],
+ values[j], info[j].nvalues, sizeof(Datum),
+ bsearch_comparator);
+
+ if (v == NULL)
+ elog(ERROR, "value for dim %d not found in array", j);
+
+ /* compute index within the array */
+ idx = (v - values[j]);
+
+ Assert((idx >= 0) && (idx < info[j].nvalues));
+
+ BUCKET_MIN_INDEXES(bucket, ndims)[j] = idx;
+
+ /* max boundary */
+ v = (Datum*)bsearch(&histogram->buckets[i]->max[j],
+ values[j], info[j].nvalues, sizeof(Datum),
+ bsearch_comparator);
+
+ if (v == NULL)
+ elog(ERROR, "value for dim %d not found in array", j);
+
+ /* compute index within the array */
+ idx = (v - values[j]);
+
+ Assert((idx >= 0) && (idx < info[j].nvalues));
+
+ BUCKET_MAX_INDEXES(bucket, ndims)[j] = idx;
+ }
+ }
+
+ /* copy flags (nulls, min/max inclusive) */
+ memcpy(BUCKET_NULLS_ONLY(bucket, ndims),
+ histogram->buckets[i]->nullsonly, sizeof(bool) * ndims);
+
+ memcpy(BUCKET_MIN_INCL(bucket, ndims),
+ histogram->buckets[i]->min_inclusive, sizeof(bool) * ndims);
+
+ memcpy(BUCKET_MAX_INCL(bucket, ndims),
+ histogram->buckets[i]->max_inclusive, sizeof(bool) * ndims);
+
+ /* copy the item into the array */
+ memcpy(data, bucket, bucketsize);
+
+ data += bucketsize;
+ }
+
+ /* at this point we expect to match the total_length exactly */
+ Assert((data - (char*)output) == total_length);
+
+ /* FIXME free the values/counts arrays here */
+
+ return output;
+}
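For readers who want to see the deduplicate-and-index encoding in isolation, here is a minimal stand-alone sketch of the same idea in plain C (independent of the patch; the names and data are made up for illustration):

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <stdint.h>

static int
cmp_int(const void *a, const void *b)
{
    int x = *(const int *) a;
    int y = *(const int *) b;

    return (x > y) - (x < y);
}

int
main(void)
{
    /* boundary values collected from the buckets (with duplicates) */
    int bounds[] = {10, 20, 10, 30, 20, 40};
    int n = 6;
    int ndistinct = 1;
    int values[6];
    int i;

    /* sort a copy, then deduplicate in place (keeping the order) */
    memcpy(values, bounds, sizeof(bounds));
    qsort(values, n, sizeof(int), cmp_int);

    for (i = 1; i < n; i++)
        if (values[i] != values[ndistinct - 1])
            values[ndistinct++] = values[i];

    /* replace each boundary value with a uint16 index into 'values' */
    for (i = 0; i < n; i++)
    {
        int *v = bsearch(&bounds[i], values, ndistinct,
                         sizeof(int), cmp_int);
        uint16_t idx = (uint16_t) (v - values);

        printf("%d -> index %d\n", bounds[i], (int) idx);
    }

    return 0;
}

Here 6 boundary values collapse to 4 distinct ones, so the serialized form stores the 4 values once plus 6 small indexes - the same saving the earlier comment computes for the 1027 vs. 4096 example.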
+
+/*
+ * Returns histogram in a partially-serialized form (keeps the boundary
+ * values deduplicated, so that it's possible to optimize the estimation
+ * part by caching function call results between buckets etc.).
+ */
+MVSerializedHistogram
+deserialize_mv_histogram(bytea * data)
+{
+ int i = 0, j = 0;
+
+ Size expected_size;
+ char *tmp = NULL;
+
+ MVSerializedHistogram histogram;
+ DimensionInfo *info;
+
+ int nbuckets;
+ int ndims;
+ int bucketsize;
+
+ /* temporary deserialization buffer */
+ int bufflen;
+ char *buff;
+ char *ptr;
+
+ if (data == NULL)
+ return NULL;
+
+ if (VARSIZE_ANY_EXHDR(data) < offsetof(MVSerializedHistogramData,buckets))
+ elog(ERROR, "invalid histogram size %ld (expected at least %ld)",
+ VARSIZE_ANY_EXHDR(data), offsetof(MVSerializedHistogramData,buckets));
+
+ /* read the histogram header */
+ histogram
+ = (MVSerializedHistogram)palloc(sizeof(MVSerializedHistogramData));
+
+ /* initialize pointer to the data part (skip the varlena header) */
+ tmp = VARDATA(data);
+
+ /* get the header and perform basic sanity checks */
+ memcpy(histogram, tmp, offsetof(MVSerializedHistogramData, buckets));
+ tmp += offsetof(MVSerializedHistogramData, buckets);
+
+ if (histogram->magic != MVSTAT_HIST_MAGIC)
+ elog(ERROR, "invalid histogram magic %d (expected %d)",
+ histogram->magic, MVSTAT_HIST_MAGIC);
+
+ if (histogram->type != MVSTAT_HIST_TYPE_BASIC)
+ elog(ERROR, "invalid histogram type %d (expected %d)",
+ histogram->type, MVSTAT_HIST_TYPE_BASIC);
+
+ nbuckets = histogram->nbuckets;
+ ndims = histogram->ndimensions;
+ bucketsize = BUCKET_SIZE(ndims);
+
+ Assert((nbuckets > 0) && (nbuckets <= MVSTAT_HIST_MAX_BUCKETS));
+ Assert((ndims >= 2) && (ndims <= MVSTATS_MAX_DIMENSIONS));
+
+ /*
+ * Compute the size we expect for these parameters. It's incomplete
+ * at this point, as we have yet to add the sizes of the value
+ * arrays (from the DimensionInfo records).
+ */
+ expected_size = offsetof(MVSerializedHistogramData,buckets) +
+ ndims * sizeof(DimensionInfo) +
+ (nbuckets * bucketsize);
+
+ /* check that we have at least the DimensionInfo records */
+ if (VARSIZE_ANY_EXHDR(data) < expected_size)
+ elog(ERROR, "invalid histogram size %ld (expected %ld)",
+ VARSIZE_ANY_EXHDR(data), expected_size);
+
+ info = (DimensionInfo*)(tmp);
+ tmp += ndims * sizeof(DimensionInfo);
+
+ /* account for the value arrays */
+ for (i = 0; i < ndims; i++)
+ expected_size += info[i].nbytes;
+
+ if (VARSIZE_ANY_EXHDR(data) != expected_size)
+ elog(ERROR, "invalid histogram size %ld (expected %ld)",
+ VARSIZE_ANY_EXHDR(data), expected_size);
+
+ /* looks OK - the data is not obviously corrupted */
+
+ /* now let's allocate a single buffer for all the values and counts */
+
+ bufflen = (sizeof(int) + sizeof(Datum*)) * ndims;
+ for (i = 0; i < ndims; i++)
+ {
+ /* no extra space needed for byval types that fit into a Datum */
+ if (! (info[i].typbyval && (info[i].typlen == sizeof(Datum))))
+ bufflen += (sizeof(Datum) * info[i].nvalues);
+ }
+
+ /* also, include space for the result, tracking the buckets */
+ bufflen += nbuckets * (
+ sizeof(MVSerializedBucket) + /* bucket pointer */
+ sizeof(MVSerializedBucketData)); /* bucket data */
+
+ buff = palloc(bufflen);
+ ptr = buff;
+
+ histogram->nvalues = (int*)ptr;
+ ptr += (sizeof(int) * ndims);
+
+ histogram->values = (Datum**)ptr;
+ ptr += (sizeof(Datum*) * ndims);
+
+ /*
+ * FIXME This uses pointers to the original data array (the types
+ * not passed by value), so when someone frees the memory,
+ * e.g. by doing something like this:
+ *
+ * bytea * data = ... fetch the data from catalog ...
+ * MCVList mcvlist = deserialize_mcv_list(data);
+ * pfree(data);
+ *
+ * then 'mcvlist' references the freed memory. This needs to
+ * copy the pieces.
+ *
+ * TODO same as in MCV deserialization / consider moving to common.c
+ */
+ for (i = 0; i < ndims; i++)
+ {
+ histogram->nvalues[i] = info[i].nvalues;
+
+ if (info[i].typbyval && info[i].typlen == sizeof(Datum))
+ {
+ /* passed by value / Datum - simply reuse the array */
+ histogram->values[i] = (Datum*)tmp;
+ tmp += info[i].nbytes;
+ }
+ else
+ {
+ /* all the other types need a Datum array carved from the buffer */
+ histogram->values[i] = (Datum*)ptr;
+ ptr += (sizeof(Datum) * info[i].nvalues);
+
+ if (info[i].typbyval)
+ {
+ /* passed by value, but smaller than Datum */
+ for (j = 0; j < info[i].nvalues; j++)
+ {
+ /* copy the value into the Datum array */
+ memcpy(&histogram->values[i][j], tmp, info[i].typlen);
+ tmp += info[i].typlen;
+ }
+ }
+ else if (info[i].typlen > 0)
+ {
+ /* passed by reference, but fixed length (name, tid, ...) */
+ for (j = 0; j < info[i].nvalues; j++)
+ {
+ /* just point into the array */
+ histogram->values[i][j] = PointerGetDatum(tmp);
+ tmp += info[i].typlen;
+ }
+ }
+ else if (info[i].typlen == -1)
+ {
+ /* varlena */
+ for (j = 0; j < info[i].nvalues; j++)
+ {
+ /* just point into the array */
+ histogram->values[i][j] = PointerGetDatum(tmp);
+ tmp += VARSIZE_ANY(tmp);
+ }
+ }
+ else if (info[i].typlen == -2)
+ {
+ /* cstring */
+ for (j = 0; j < info[i].nvalues; j++)
+ {
+ /* just point into the array */
+ histogram->values[i][j] = PointerGetDatum(tmp);
+ tmp += (strlen(tmp) + 1); /* don't forget the \0 */
+ }
+ }
+ }
+ }
+
+ histogram->buckets = (MVSerializedBucket*)ptr;
+ ptr += (sizeof(MVSerializedBucket) * nbuckets);
+
+ for (i = 0; i < nbuckets; i++)
+ {
+ MVSerializedBucket bucket = (MVSerializedBucket)ptr;
+ ptr += sizeof(MVSerializedBucketData);
+
+ bucket->ntuples = *BUCKET_NTUPLES(tmp);
+ bucket->nullsonly = BUCKET_NULLS_ONLY(tmp, ndims);
+ bucket->min_inclusive = BUCKET_MIN_INCL(tmp, ndims);
+ bucket->max_inclusive = BUCKET_MAX_INCL(tmp, ndims);
+
+ bucket->min = BUCKET_MIN_INDEXES(tmp, ndims);
+ bucket->max = BUCKET_MAX_INDEXES(tmp, ndims);
+
+ histogram->buckets[i] = bucket;
+
+ Assert(tmp <= (char*)data + VARSIZE_ANY(data));
+
+ tmp += bucketsize;
+ }
+
+ /* at this point we expect to have consumed expected_size exactly */
+ Assert((tmp - VARDATA(data)) == expected_size);
+
+ /* we should exhaust the output buffer exactly */
+ Assert((ptr - buff) == bufflen);
+
+ return histogram;
+}
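The FIXME above (the deserialized histogram keeping pointers into the source bytea) is a classic dangling-pointer hazard. A minimal stand-alone illustration in plain C - 'Deserialized' and 'deserialize' are made-up stand-ins, not names from the patch:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef struct Deserialized
{
    const char *payload;        /* points into the input buffer */
} Deserialized;

/* stand-in for deserialize_mv_histogram(): no copy, just pointers */
static Deserialized *
deserialize(const char *data)
{
    Deserialized *d = malloc(sizeof(Deserialized));

    d->payload = data + 4;      /* skip a 4-byte "header" */
    return d;
}

int
main(void)
{
    char *data = malloc(16);

    memcpy(data, "hdr:payload", 12);

    Deserialized *d = deserialize(data);

    printf("%s\n", d->payload); /* fine, 'data' is still alive */

    free(data);
    /* printf("%s\n", d->payload); -- would now be undefined behavior */

    free(d);
    return 0;
}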
+
+/*
+ * Build the initial bucket, which will be then split into smaller ones.
+ */
+static MVBucket
+create_initial_mv_bucket(int numrows, HeapTuple *rows, int2vector *attrs,
+ VacAttrStats **stats)
+{
+ int i;
+ int numattrs = attrs->dim1;
+ HistogramBuild data = NULL;
+
+ /* TODO allocate bucket as a single piece, including all the fields. */
+ MVBucket bucket = (MVBucket)palloc0(sizeof(MVBucketData));
+
+ Assert(numrows > 0);
+ Assert(rows != NULL);
+ Assert((numattrs >= 2) && (numattrs <= MVSTATS_MAX_DIMENSIONS));
+
+ /* allocate the per-dimension arrays */
+
+ /* flags for null-only dimensions */
+ bucket->nullsonly = (bool*)palloc0(numattrs * sizeof(bool));
+
+ /* inclusiveness boundaries - lower/upper bounds */
+ bucket->min_inclusive = (bool*)palloc0(numattrs * sizeof(bool));
+ bucket->max_inclusive = (bool*)palloc0(numattrs * sizeof(bool));
+
+ /* lower/upper boundaries */
+ bucket->min = (Datum*)palloc0(numattrs * sizeof(Datum));
+ bucket->max = (Datum*)palloc0(numattrs * sizeof(Datum));
+
+ /* build-data */
+ data = (HistogramBuild)palloc0(sizeof(HistogramBuildData));
+
+ /* number of distinct values (per dimension) */
+ data->ndistincts = (uint32*)palloc0(numattrs * sizeof(uint32));
+
+ /* all the sample rows fall into the initial bucket */
+ data->numrows = numrows;
+ data->rows = rows;
+
+ bucket->build_data = data;
+
+ /*
+ * Update the number of ndistinct combinations in the bucket (which
+ * we use when selecting bucket to partition), and then number of
+ * distinct values for each dimension (which we use when choosing
+ * which dimension to split).
+ */
+ update_bucket_ndistinct(bucket, attrs, stats);
+
+ /* Update ndistinct (and also set min/max) for all dimensions. */
+ for (i = 0; i < numattrs; i++)
+ update_dimension_ndistinct(bucket, i, attrs, stats, true);
+
+ return bucket;
+}
+
+/*
+ * Choose the bucket to partition next.
+ *
+ * The current criterion is rather simple, chosen so that the algorithm
+ * produces buckets with about equal frequency and regular size. We
+ * select the bucket with the highest number of distinct values, and
+ * then split it by the longest dimension.
+ *
+ * The distinct values are uniformly mapped to [0,1] interval, and this
+ * is used to compute length of the value range.
+ *
+ * NOTE: This is not the same array used for deduplication, as this
+ * contains values for all the tuples from the sample, not just
+ * the boundary values.
+ *
+ * Returns either pointer to the bucket selected to be partitioned,
+ * or NULL if there are no buckets that may be split (i.e. all buckets
+ * contain a single distinct value).
+ *
+ * TODO Consider other partitioning criteria (v-optimal, maxdiff etc.).
+ * For example use the "bucket volume" (product of dimension
+ * lengths) to select the bucket.
+ *
+ * TODO Allowing the bucket to degenerate to a single combination of
+ * values makes it rather strange MCV list. Maybe we should use
+ * higher lower boundary, or maybe make the selection criteria
+ * more complex (e.g. consider number of rows in the bucket, etc.).
+ *
+ * That however is different from buckets 'degenerated' only for
+ * some dimensions (e.g. half of them), which is perfectly
+ * appropriate for statistics on a combination of low and high
+ * cardinality columns.
+ */
+static MVBucket
+select_bucket_to_partition(int nbuckets, MVBucket * buckets)
+{
+ int i;
+ int numrows = 0;
+ MVBucket bucket = NULL;
+
+ for (i = 0; i < nbuckets; i++)
+ {
+ HistogramBuild data = (HistogramBuild)buckets[i]->build_data;
+ /* if the number of rows is higher, use this bucket */
+ if ((data->ndistinct > 2) &&
+ (data->numrows > numrows) &&
+ (data->numrows >= MIN_BUCKET_ROWS))
+ {
+ bucket = buckets[i];
+ numrows = data->numrows;
+ }
+ }
+
+ /* may be NULL if no bucket satisfies the conditions above */
+ return bucket;
+}
+
+/*
+ * A simple bucket partitioning implementation - we choose the longest
+ * bucket dimension, measured using the array of distinct values built
+ * at the very beginning of the build.
+ *
+ * We map all the distinct values to a [0,1] interval, uniformly
+ * distributed, and then use this to measure length. It's essentially
+ * the number of distinct values within the range, normalized to [0,1].
+ *
+ * Then we choose a 'middle' value splitting the bucket into two parts
+ * with roughly the same frequency.
+ *
+ * This splits the bucket by tweaking the existing one, and returning
+ * the new bucket (essentially shrinking the existing one in-place and
+ * returning the other "half" as a new bucket). The caller is responsible
+ * for adding the new bucket into the list of buckets.
+ *
+ * There are multiple histogram options, centered around the partitioning
+ * criteria, specifying both how to choose a bucket and the dimension
+ * most in need of a split. For a nice summary and general overview, see
+ * "rK-Hist : an R-Tree based histogram for multi-dimensional selectivity
+ * estimation" thesis by J. A. Lopez, Concordia University, p.34-37 (and
+ * possibly p. 32-34 for explanation of the terms).
+ *
+ * TODO It requires care to prevent splitting only one dimension and not
+ * splitting another one at all (which might happen easily in case
+ * of strongly dependent columns - e.g. y=x). The current algorithm
+ * minimizes this, but may still happen for perfectly dependent
+ * examples (when all the dimensions have equal length, the first
+ * one will be selected).
+ *
+ * TODO Should probably consider statistics target for the columns (e.g.
+ * to split dimensions with higher statistics target more frequently).
+ */
+static MVBucket
+partition_bucket(MVBucket bucket, int2vector *attrs,
+ VacAttrStats **stats,
+ int *ndistvalues, Datum **distvalues)
+{
+ int i;
+ int dimension;
+ int numattrs = attrs->dim1;
+
+ Datum split_value;
+ MVBucket new_bucket;
+ HistogramBuild new_data;
+
+ /* needed for sort, when looking for the split value */
+ bool isNull;
+ int nvalues = 0;
+ HistogramBuild data = (HistogramBuild)bucket->build_data;
+ StdAnalyzeData * mystats = NULL;
+ ScalarItem * values = (ScalarItem*)palloc0(data->numrows * sizeof(ScalarItem));
+ SortSupportData ssup;
+
+ /* looking for the split value */
+ int nrows = 1; /* number of rows below current value */
+ double delta;
+
+ /* needed when splitting the values */
+ HeapTuple * oldrows = data->rows;
+ int oldnrows = data->numrows;
+
+ /*
+ * We can't split buckets with a single distinct value (this also
+ * disqualifies NULL-only dimensions). Also, there have to be multiple
+ * sample rows (otherwise there couldn't be multiple distinct values).
+ */
+ Assert(data->ndistinct > 1);
+ Assert(data->numrows > 1);
+ Assert((numattrs >= 2) && (numattrs <= MVSTATS_MAX_DIMENSIONS));
+
+ /*
+ * Look for the next dimension to split.
+ */
+ delta = 0.0;
+ dimension = -1;
+
+ for (i = 0; i < numattrs; i++)
+ {
+ Datum *a, *b;
+
+ mystats = (StdAnalyzeData *) stats[i]->extra_data;
+
+ /* initialize sort support, etc. */
+ memset(&ssup, 0, sizeof(ssup));
+ ssup.ssup_cxt = CurrentMemoryContext;
+
+ /* We always use the default collation for statistics */
+ ssup.ssup_collation = DEFAULT_COLLATION_OID;
+ ssup.ssup_nulls_first = false;
+
+ PrepareSortSupportFromOrderingOp(mystats->ltopr, &ssup);
+
+ /* can't split NULL-only dimension */
+ if (bucket->nullsonly[i])
+ continue;
+
+ /* can't split dimension with a single ndistinct value */
+ if (data->ndistincts[i] <= 1)
+ continue;
+
+ /* sort support for the bsearch_comparator */
+ ssup_private = &ssup;
+
+ /* search for min boundary in the distinct list */
+ a = (Datum*)bsearch(&bucket->min[i],
+ distvalues[i], ndistvalues[i],
+ sizeof(Datum), bsearch_comparator);
+
+ b = (Datum*)bsearch(&bucket->max[i],
+ distvalues[i], ndistvalues[i],
+ sizeof(Datum), bsearch_comparator);
+
+ /* if this dimension is 'longer', we'll partition by it */
+ if (((b-a)*1.0 / ndistvalues[i]) > delta)
+ {
+ delta = ((b-a)*1.0 / ndistvalues[i]);
+ dimension = i;
+ }
+ }
+
+ /*
+ * If we haven't found a dimension here, we've done something
+ * wrong in select_bucket_to_partition.
+ */
+ Assert(dimension != -1);
+
+ /*
+ * Walk through the selected dimension, collect and sort the values
+ * and then choose the value to use as the new boundary.
+ */
+ mystats = (StdAnalyzeData *) stats[dimension]->extra_data;
+
+ /* initialize sort support, etc. */
+ memset(&ssup, 0, sizeof(ssup));
+ ssup.ssup_cxt = CurrentMemoryContext;
+
+ /* We always use the default collation for statistics */
+ ssup.ssup_collation = DEFAULT_COLLATION_OID;
+ ssup.ssup_nulls_first = false;
+
+ PrepareSortSupportFromOrderingOp(mystats->ltopr, &ssup);
+
+ for (i = 0; i < data->numrows; i++)
+ {
+ /* remember the index of the sample row, to make the partitioning simpler */
+ values[nvalues].value = heap_getattr(data->rows[i], attrs->values[dimension],
+ stats[dimension]->tupDesc, &isNull);
+ values[nvalues].tupno = i;
+
+ /* no NULL values allowed here (we don't do splits by null-only dimensions) */
+ Assert(!isNull);
+
+ nvalues++;
+ }
+
+ /* sort the array of values */
+ qsort_arg((void *) values, nvalues, sizeof(ScalarItem),
+ compare_scalars_partition, (void *) &ssup);
+
+ /*
+ * We know there are bucket->ndistincts[dimension] distinct values
+ * in this dimension, and we want to split this into half, so walk
+ * through the array and stop once we see (ndistinct/2) values.
+ *
+ * We always choose the "next" value, i.e. (n/2+1)-th distinct value,
+ * and use it as an exclusive upper boundary (and inclusive lower
+ * boundary).
+ *
+ * TODO Maybe we should use "average" of the two middle distinct
+ * values (at least for even distinct counts), but that would
+ * require being able to do an average (which does not work
+ * for non-arithmetic types).
+ *
+ * TODO Another option is to look for a split that'd give about
+ * 50% tuples (not distinct values) in each partition. That
+ * might work better when there are a few very frequent
+ * values, and many rare ones.
+ */
+ delta = fabs(data->numrows);
+ split_value = values[0].value;
+
+ for (i = 1; i < data->numrows; i++)
+ {
+ if (values[i].value != values[i-1].value)
+ {
+ /* are we closer to splitting the bucket in half? */
+ if (fabs(i - data->numrows/2.0) < delta)
+ {
+ /* let's assume we'll use this value for the split */
+ split_value = values[i].value;
+ delta = fabs(i - data->numrows/2.0);
+ nrows = i;
+ }
+ }
+ }
+
+ Assert(nrows > 0);
+ Assert(nrows < data->numrows);
+
+ /* create the new bucket as an (incomplete) copy of the one being partitioned */
+ new_bucket = copy_mv_bucket(bucket, numattrs);
+ new_data = (HistogramBuild)new_bucket->build_data;
+
+ /*
+ * Do the actual split of the chosen dimension, using the split value as the
+ * upper bound for the existing bucket, and lower bound for the new one.
+ */
+ bucket->max[dimension] = split_value;
+ new_bucket->min[dimension] = split_value;
+
+ bucket->max_inclusive[dimension] = false;
+ new_bucket->min_inclusive[dimension] = true;
+
+ /*
+ * Redistribute the sample tuples using the 'ScalarItem->tupno'
+ * index. We know 'nrows' rows should remain in the original
+ * bucket and the rest goes to the new one.
+ */
+
+ data->rows = (HeapTuple*)palloc0(nrows * sizeof(HeapTuple));
+ new_data->rows = (HeapTuple*)palloc0((oldnrows - nrows) * sizeof(HeapTuple));
+
+ data->numrows = nrows;
+ new_data->numrows = (oldnrows - nrows);
+
+ /*
+ * The first nrows should go to the first bucket, the rest should
+ * go to the new one. Use the tupno field to get the actual HeapTuple
+ * row from the original array of sample rows.
+ */
+ for (i = 0; i < nrows; i++)
+ memcpy(&data->rows[i], &oldrows[values[i].tupno], sizeof(HeapTuple));
+
+ for (i = nrows; i < oldnrows; i++)
+ memcpy(&new_data->rows[i-nrows], &oldrows[values[i].tupno], sizeof(HeapTuple));
+
+ /* update ndistinct values for the buckets (total and per dimension) */
+ update_bucket_ndistinct(bucket, attrs, stats);
+ update_bucket_ndistinct(new_bucket, attrs, stats);
+
+ /*
+ * TODO We don't need to do this for the dimension we used for split,
+ * because we know how many distinct values went to each partition.
+ */
+ for (i = 0; i < numattrs; i++)
+ {
+ update_dimension_ndistinct(bucket, i, attrs, stats, false);
+ update_dimension_ndistinct(new_bucket, i, attrs, stats, false);
+ }
+
+ pfree(oldrows);
+ pfree(values);
+
+ return new_bucket;
+}
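The split-value search above (pick the distinct-value boundary closest to numrows/2) may be easier to follow in isolation. A minimal sketch in plain C, with made-up sample data:

#include <stdio.h>
#include <math.h>

int
main(void)
{
    /* sorted sample values in the chosen dimension (with duplicates) */
    int values[] = {1, 1, 1, 2, 2, 3, 3, 3, 3, 4};
    int numrows = 10;
    int nrows = 1;
    int split_value = values[0];
    double delta = numrows;
    int i;

    /* each position where the value changes is a split candidate */
    for (i = 1; i < numrows; i++)
    {
        if (values[i] != values[i - 1] &&
            fabs(i - numrows / 2.0) < delta)
        {
            split_value = values[i];
            delta = fabs(i - numrows / 2.0);
            nrows = i;          /* rows that stay in the old bucket */
        }
    }

    /* rows [0, nrows) keep the old bucket, [nrows, numrows) move */
    printf("split at value %d, %d rows below\n", split_value, nrows);
    return 0;
}

With this data the best boundary is at position 5 (value 3), leaving 5 rows in each half.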
+
+/*
+ * Copy a histogram bucket. The copy does not include the build-time
+ * data, i.e. sampled rows etc.
+ */
+static MVBucket
+copy_mv_bucket(MVBucket bucket, uint32 ndimensions)
+{
+ /* TODO allocate as a single piece (including all the fields) */
+ MVBucket new_bucket = (MVBucket)palloc0(sizeof(MVBucketData));
+ HistogramBuild data = (HistogramBuild)palloc0(sizeof(HistogramBuildData));
+
+ /*
+ * Copy only the attributes that will stay the same after the split;
+ * the rest will be recomputed afterwards.
+ */
+
+ /* allocate the per-dimension arrays */
+ new_bucket->nullsonly = (bool*)palloc0(ndimensions * sizeof(bool));
+
+ /* inclusiveness boundaries - lower/upper bounds */
+ new_bucket->min_inclusive = (bool*)palloc0(ndimensions * sizeof(bool));
+ new_bucket->max_inclusive = (bool*)palloc0(ndimensions * sizeof(bool));
+
+ /* lower/upper boundaries */
+ new_bucket->min = (Datum*)palloc0(ndimensions * sizeof(Datum));
+ new_bucket->max = (Datum*)palloc0(ndimensions * sizeof(Datum));
+
+ /* copy data */
+ memcpy(new_bucket->nullsonly, bucket->nullsonly, ndimensions * sizeof(bool));
+
+ memcpy(new_bucket->min_inclusive, bucket->min_inclusive, ndimensions*sizeof(bool));
+ memcpy(new_bucket->min, bucket->min, ndimensions*sizeof(Datum));
+
+ memcpy(new_bucket->max_inclusive, bucket->max_inclusive, ndimensions*sizeof(bool));
+ memcpy(new_bucket->max, bucket->max, ndimensions*sizeof(Datum));
+
+ /* allocate and copy the interesting part of the build data */
+ data->ndistincts = (uint32*)palloc0(ndimensions * sizeof(uint32));
+
+ new_bucket->build_data = data;
+
+ return new_bucket;
+}
+
+/*
+ * Counts the number of distinct value combinations in the bucket. The
+ * values are copied into an array of sort items and sorted using a
+ * multi-column comparator built on the per-column sort support.
+ *
+ * TODO This might evaluate and store the distinct counts for all
+ * possible attribute combinations. The assumption is this might be
+ * useful for estimating things like GROUP BY cardinalities (e.g.
+ * in cases when some buckets contain a lot of low-frequency
+ * combinations, and other buckets contain few high-frequency ones).
+ *
+ * But it's unclear whether it's worth the price. Computing this
+ * is actually quite cheap, because it may be evaluated at the very
+ * end, when the buckets are rather small (so sorting it in 2^N ways
+ * is not a big deal). Assuming the partitioning algorithm does not
+ * use these values to do the decisions, of course (the current
+ * algorithm does not).
+ *
+ * The overhead with storing, fetching and parsing the data is more
+ * concerning - adding 2^N values per bucket (even if it's just
+ * a 1B or 2B value) would significantly bloat the histogram, and
+ * thus the impact on optimizer. Which is not really desirable.
+ *
+ * TODO This only updates the ndistinct for the sample (or bucket), but
+ * we eventually need an estimate of the total number of distinct
+ * values in the dataset. It's possible to either use the current
+ * 1D approach (i.e., if it's more than 10% of the sample, assume
+ * it's proportional to the number of rows). Or it's possible to
+ * implement the estimator suggested in the article, supposedly
+ * giving 'optimal' estimates (w.r.t. probability of error).
+ */
+static void
+update_bucket_ndistinct(MVBucket bucket, int2vector *attrs, VacAttrStats ** stats)
+{
+ int i, j;
+ int numattrs = attrs->dim1;
+
+ HistogramBuild data = (HistogramBuild)bucket->build_data;
+ int numrows = data->numrows;
+
+ MultiSortSupport mss = multi_sort_init(numattrs);
+
+ /*
+ * We could collect this while walking through all the attributes
+ * above (as it is, heap_getattr ends up being called twice).
+ */
+ SortItem *items = (SortItem*)palloc0(numrows * sizeof(SortItem));
+ Datum *values = (Datum*)palloc0(numrows * sizeof(Datum) * numattrs);
+ bool *isnull = (bool*)palloc0(numrows * sizeof(bool) * numattrs);
+
+ for (i = 0; i < numrows; i++)
+ {
+ items[i].values = &values[i * numattrs];
+ items[i].isnull = &isnull[i * numattrs];
+ }
+
+ /* prepare the sort functions for all dimensions */
+ for (i = 0; i < numattrs; i++)
+ multi_sort_add_dimension(mss, i, i, stats);
+
+ /* collect the values */
+ for (i = 0; i < numrows; i++)
+ for (j = 0; j < numattrs; j++)
+ items[i].values[j]
+ = heap_getattr(data->rows[i], attrs->values[j],
+ stats[j]->tupDesc, &items[i].isnull[j]);
+
+ qsort_arg((void *) items, numrows, sizeof(SortItem),
+ multi_sort_compare, mss);
+
+ data->ndistinct = 1;
+
+ for (i = 1; i < numrows; i++)
+ if (multi_sort_compare(&items[i], &items[i-1], mss) != 0)
+ data->ndistinct += 1;
+
+ pfree(items);
+ pfree(values);
+ pfree(isnull);
+}
+
+/*
+ * Count distinct values per bucket dimension.
+ */
+static void
+update_dimension_ndistinct(MVBucket bucket, int dimension, int2vector *attrs,
+ VacAttrStats ** stats, bool update_boundaries)
+{
+ int j;
+ int nvalues = 0;
+ bool isNull;
+ HistogramBuild data = (HistogramBuild)bucket->build_data;
+ Datum * values = (Datum*)palloc0(data->numrows * sizeof(Datum));
+ SortSupportData ssup;
+
+ StdAnalyzeData * mystats = (StdAnalyzeData *) stats[dimension]->extra_data;
+
+ /* we may already know this is a NULL-only dimension */
+ if (bucket->nullsonly[dimension])
+ data->ndistincts[dimension] = 1;
+
+ memset(&ssup, 0, sizeof(ssup));
+ ssup.ssup_cxt = CurrentMemoryContext;
+
+ /* We always use the default collation for statistics */
+ ssup.ssup_collation = DEFAULT_COLLATION_OID;
+ ssup.ssup_nulls_first = false;
+
+ PrepareSortSupportFromOrderingOp(mystats->ltopr, &ssup);
+
+ for (j = 0; j < data->numrows; j++)
+ {
+ values[nvalues] = heap_getattr(data->rows[j], attrs->values[dimension],
+ stats[dimension]->tupDesc, &isNull);
+
+ /* ignore NULL values */
+ if (! isNull)
+ nvalues++;
+ }
+
+ /* there's always at least 1 distinct value (may be NULL) */
+ data->ndistincts[dimension] = 1;
+
+ /*
+ * If there are only NULL values in the column, mark the dimension
+ * as NULL-only and bail out.
+ */
+ if (nvalues == 0)
+ {
+ pfree(values);
+ bucket->nullsonly[dimension] = true;
+ return;
+ }
+
+ /* sort the array of (pass-by-value) datums */
+ qsort_arg((void *) values, nvalues, sizeof(Datum),
+ compare_scalars_simple, (void *) &ssup);
+
+ /*
+ * Update min/max boundaries to the smallest bounding box. Generally, this
+ * needs to be done only when constructing the initial bucket.
+ */
+ if (update_boundaries)
+ {
+ /* store the min/max values */
+ bucket->min[dimension] = values[0];
+ bucket->min_inclusive[dimension] = true;
+
+ bucket->max[dimension] = values[nvalues-1];
+ bucket->max_inclusive[dimension] = true;
+ }
+
+ /*
+ * Walk through the array and count distinct values by comparing
+ * succeeding values.
+ *
+ * FIXME This only works for pass-by-value types (i.e. not VARCHARs
+ * etc.). Although thanks to the deduplication it might work
+ * even for those types (equal values will get the same item
+ * in the deduplicated array).
+ */
+ for (j = 1; j < nvalues; j++)
+ {
+ if (values[j] != values[j-1])
+ data->ndistincts[dimension] += 1;
+ }
+
+ pfree(values);
+}
+
+/*
+ * A properly built histogram must not contain buckets mixing NULL and
+ * non-NULL values in a single dimension. Each dimension may either be
+ * marked as 'nulls only', and thus containing only NULL values, or
+ * it must not contain any NULL values.
+ *
+ * Therefore, if the sample contains NULL values in any of the columns,
+ * it's necessary to build those NULL-buckets. This is done in an
+ * iterative way using this algorithm, operating on a single bucket:
+ *
+ * (1) Check that all dimensions are well-formed (not mixing NULL
+ * and non-NULL values).
+ *
+ * (2) If all dimensions are well-formed, terminate.
+ *
+ * (3) If the dimension contains only NULL values, but is not
+ * marked as NULL-only, mark it as NULL-only and run the
+ * algorithm again (on this bucket).
+ *
+ * (4) If the dimension mixes NULL and non-NULL values, split the
+ * bucket into two parts - one with NULL values, one with
+ * non-NULL values (replacing the current one). Then run
+ * the algorithm on both buckets.
+ *
+ * This is executed in a recursive manner, but the number of executions
+ * should be quite low - limited by the number of NULL-buckets. Also,
+ * in each branch the number of nested calls is limited by the number
+ * of dimensions (attributes) of the histogram.
+ *
+ * At the end, there should be buckets with no mixed dimensions. The
+ * number of buckets produced by this algorithm is rather limited - with
+ * N dimensions, there may be only 2^N such buckets (each dimension may
+ * be either NULL or non-NULL). So with 8 dimensions (current value of
+ * MVSTATS_MAX_DIMENSIONS) there may be only 256 such buckets.
+ *
+ * After this, a 'regular' bucket-split algorithm shall run, further
+ * optimizing the histogram.
+ */
+static void
+create_null_buckets(MVHistogram histogram, int bucket_idx,
+ int2vector *attrs, VacAttrStats ** stats)
+{
+ int i, j;
+ int null_dim = -1;
+ int null_count = 0;
+ bool null_found = false;
+ MVBucket bucket, null_bucket;
+ int null_idx, curr_idx;
+ HistogramBuild data, null_data;
+
+ /* remember original values from the bucket */
+ int numrows;
+ HeapTuple *oldrows = NULL;
+
+ Assert(bucket_idx < histogram->nbuckets);
+ Assert(histogram->ndimensions == attrs->dim1);
+
+ bucket = histogram->buckets[bucket_idx];
+ data = (HistogramBuild)bucket->build_data;
+
+ numrows = data->numrows;
+ oldrows = data->rows;
+
+ /*
+ * Walk through all rows / dimensions, and stop once we find NULL
+ * in a dimension not yet marked as NULL-only.
+ */
+ for (i = 0; i < data->numrows; i++)
+ {
+ for (j = 0; j < histogram->ndimensions; j++)
+ {
+ /* Is this a NULL-only dimension? If yes, skip. */
+ if (bucket->nullsonly[j])
+ continue;
+
+ /* found a NULL in that dimension? */
+ if (heap_attisnull(data->rows[i], attrs->values[j]))
+ {
+ null_found = true;
+ null_dim = j;
+ break;
+ }
+ }
+
+ /* terminate if we found attribute with NULL values */
+ if (null_found)
+ break;
+ }
+
+ /* no regular dimension contains NULL values => we're done */
+ if (! null_found)
+ return;
+
+ /* walk through the rows again, count NULL values in 'null_dim' */
+ for (i = 0; i < data->numrows; i++)
+ {
+ if (heap_attisnull(data->rows[i], attrs->values[null_dim]))
+ null_count += 1;
+ }
+
+ Assert(null_count <= data->numrows);
+
+ /*
+ * If (null_count == numrows) the dimension already is NULL-only,
+ * but is not yet marked as such. It's enough to mark it and
+ * repeat the process recursively (until we run out of dimensions).
+ */
+ if (null_count == data->numrows)
+ {
+ bucket->nullsonly[null_dim] = true;
+ create_null_buckets(histogram, bucket_idx, attrs, stats);
+ return;
+ }
+
+ /*
+ * We have to split the bucket into two - one with NULL values in
+ * the dimension, one with non-NULL values. We don't need to sort
+ * the data or anything, but otherwise it's similar to what's done
+ * in partition_bucket().
+ */
+
+ /* create bucket with NULL-only dimension 'null_dim' */
+ null_bucket = copy_mv_bucket(bucket, histogram->ndimensions);
+ null_data = (HistogramBuild)null_bucket->build_data;
+
+ /* remember the current array info */
+ oldrows = data->rows;
+ numrows = data->numrows;
+
+ /* we'll keep non-NULL values in the current bucket */
+ data->numrows = (numrows - null_count);
+ data->rows
+ = (HeapTuple*)palloc0(data->numrows * sizeof(HeapTuple));
+
+ /* and the NULL values will go to the new one */
+ null_data->numrows = null_count;
+ null_data->rows
+ = (HeapTuple*)palloc0(null_data->numrows * sizeof(HeapTuple));
+
+ /* mark the dimension as NULL-only (in the new bucket) */
+ null_bucket->nullsonly[null_dim] = true;
+
+ /* walk through the sample rows and distribute them accordingly */
+ null_idx = 0;
+ curr_idx = 0;
+ for (i = 0; i < numrows; i++)
+ {
+ if (heap_attisnull(oldrows[i], attrs->values[null_dim]))
+ /* NULL => copy to the new bucket */
+ memcpy(&null_data->rows[null_idx++], &oldrows[i],
+ sizeof(HeapTuple));
+ else
+ memcpy(&data->rows[curr_idx++], &oldrows[i],
+ sizeof(HeapTuple));
+ }
+
+ /* update ndistinct values for the buckets (total and per dimension) */
+ update_bucket_ndistinct(bucket, attrs, stats);
+ update_bucket_ndistinct(null_bucket, attrs, stats);
+
+ /*
+ * TODO We don't need to do this for the dimension we used for split,
+ * because we know how many distinct values went to each
+ * bucket (NULL is not a value, so 0, and the other bucket got
+ * all the ndistinct values).
+ */
+ for (i = 0; i < histogram->ndimensions; i++)
+ {
+ update_dimension_ndistinct(bucket, i, attrs, stats, false);
+ update_dimension_ndistinct(null_bucket, i, attrs, stats, false);
+ }
+
+ pfree(oldrows);
+
+ /* add the NULL bucket to the histogram */
+ histogram->buckets[histogram->nbuckets++] = null_bucket;
+
+ /*
+ * And now run the function recursively on both buckets (the new
+ * one first, because the call may change number of buckets, and
+ * it's used as an index).
+ */
+ create_null_buckets(histogram, (histogram->nbuckets-1), attrs, stats);
+ create_null_buckets(histogram, bucket_idx, attrs, stats);
+}
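The 2^N bound on NULL-buckets mentioned in the comment above follows from each dimension independently being either NULL-only or NULL-free. A trivial enumeration, for illustration only (plain C, ndims chosen arbitrarily):

#include <stdio.h>

int
main(void)
{
    int ndims = 3;
    int mask;
    int d;

    /* each dimension is either NULL-only ('N') or NULL-free ('-') */
    for (mask = 0; mask < (1 << ndims); mask++)
    {
        for (d = 0; d < ndims; d++)
            putchar((mask & (1 << d)) ? 'N' : '-');
        putchar('\n');
    }

    return 0;                   /* prints 2^ndims = 8 patterns */
}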
+
+/*
+ * We need to pass the SortSupport to the comparator, but bsearch()
+ * has no 'context' parameter, so we use a global variable (ugly).
+ */
+static int
+bsearch_comparator(const void * a, const void * b)
+{
+ Assert(ssup_private != NULL);
+ return compare_scalars_simple(a, b, (void*)ssup_private);
+}
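For readers unfamiliar with this workaround, here is the same trick in a minimal stand-alone form (plain C; 'CmpContext' and the other names are invented for the example):

#include <stdio.h>
#include <stdlib.h>

typedef struct CmpContext
{
    int sign;                   /* stand-in for SortSupport state */
} CmpContext;

static CmpContext *cmp_private; /* the "ugly" file-scope variable */

static int
cmp_with_context(const void *a, const void *b)
{
    int x = *(const int *) a;
    int y = *(const int *) b;

    return cmp_private->sign * ((x > y) - (x < y));
}

int
main(void)
{
    int keys[] = {1, 3, 5, 7, 9};
    int needle = 7;
    CmpContext ctx = {1};       /* ascending order */
    int *hit;

    cmp_private = &ctx;         /* set the context before bsearch */
    hit = bsearch(&needle, keys, 5, sizeof(int), cmp_with_context);

    printf("found at index %ld\n", hit ? (long) (hit - keys) : -1L);
    return 0;
}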
+
+/*
+ * SRF with details about buckets of a histogram:
+ *
+ * - bucket ID (0...nbuckets)
+ * - min values (string array)
+ * - max values (string array)
+ * - nulls only (boolean array)
+ * - min inclusive flags (boolean array)
+ * - max inclusive flags (boolean array)
+ * - frequency (double precision)
+ *
+ * The input is the OID of the statistics, and there are no rows
+ * returned if the statistics contains no histogram (or if there's no
+ * statistics for the OID).
+ *
+ * The second parameter (type) determines what values will be returned
+ * in the (minvals,maxvals). There are three possible values:
+ *
+ * 0 (actual values)
+ * -----------------
+ * - prints actual values
+ * - using the output function of the data type (as string)
+ * - handy for investigating the histogram
+ *
+ * 1 (distinct index)
+ * ------------------
+ * - prints index of the distinct value (into the serialized array)
+ * - makes it easier to spot neighbor buckets, etc.
+ * - handy for plotting the histogram
+ *
+ * 2 (normalized distinct index)
+ * -----------------------------
+ * - prints index of the distinct value, but normalized into [0,1]
+ * - similar to 1, but shows how 'long' the bucket range is
+ * - handy for plotting the histogram
+ *
+ * When plotting the histogram, be careful as the (1) and (2) options
+ * skew the lengths by distributing the distinct values uniformly. For
+ * data types without a clear meaning of 'distance' (e.g. strings) that
+ * is not a big deal, but for numbers it may be confusing.
+ */
+PG_FUNCTION_INFO_V1(pg_mv_histogram_buckets);
+
+Datum
+pg_mv_histogram_buckets(PG_FUNCTION_ARGS)
+{
+ FuncCallContext *funcctx;
+ int call_cntr;
+ int max_calls;
+ TupleDesc tupdesc;
+ AttInMetadata *attinmeta;
+
+ Oid mvoid = PG_GETARG_OID(0);
+ int otype = PG_GETARG_INT32(1);
+
+ if ((otype < 0) || (otype > 2))
+ elog(ERROR, "invalid output type specified");
+
+ /* stuff done only on the first call of the function */
+ if (SRF_IS_FIRSTCALL())
+ {
+ MemoryContext oldcontext;
+ MVSerializedHistogram histogram;
+
+ /* create a function context for cross-call persistence */
+ funcctx = SRF_FIRSTCALL_INIT();
+
+ /* switch to memory context appropriate for multiple function calls */
+ oldcontext = MemoryContextSwitchTo(funcctx->multi_call_memory_ctx);
+
+ histogram = load_mv_histogram(mvoid);
+
+ funcctx->user_fctx = histogram;
+
+ /* total number of tuples to be returned */
+ funcctx->max_calls = 0;
+ if (funcctx->user_fctx != NULL)
+ funcctx->max_calls = histogram->nbuckets;
+
+ /* Build a tuple descriptor for our result type */
+ if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("function returning record called in context "
+ "that cannot accept type record")));
+
+ /*
+ * generate attribute metadata needed later to produce tuples
+ * from raw C strings
+ */
+ attinmeta = TupleDescGetAttInMetadata(tupdesc);
+ funcctx->attinmeta = attinmeta;
+
+ MemoryContextSwitchTo(oldcontext);
+ }
+
+ /* stuff done on every call of the function */
+ funcctx = SRF_PERCALL_SETUP();
+
+ call_cntr = funcctx->call_cntr;
+ max_calls = funcctx->max_calls;
+ attinmeta = funcctx->attinmeta;
+
+ if (call_cntr < max_calls) /* do when there is more left to send */
+ {
+ char **values;
+ HeapTuple tuple;
+ Datum result;
+ int2vector *stakeys;
+ Oid relid;
+ double bucket_size = 1.0;
+
+ char *buff = palloc0(1024);
+ char *format;
+
+ int i;
+
+ Oid *outfuncs;
+ FmgrInfo *fmgrinfo;
+
+ MVSerializedHistogram histogram;
+ MVSerializedBucket bucket;
+
+ histogram = (MVSerializedHistogram)funcctx->user_fctx;
+
+ Assert(call_cntr < histogram->nbuckets);
+
+ bucket = histogram->buckets[call_cntr];
+
+ stakeys = find_mv_attnums(mvoid, &relid);
+
+ /*
+ * Prepare a values array for building the returned tuple.
+ * This should be an array of C strings which will
+ * be processed later by the type input functions.
+ */
+ values = (char **) palloc(9 * sizeof(char *));
+
+ values[0] = (char *) palloc(64 * sizeof(char));
+
+ /* arrays */
+ values[1] = (char *) palloc0(1024 * sizeof(char));
+ values[2] = (char *) palloc0(1024 * sizeof(char));
+ values[3] = (char *) palloc0(1024 * sizeof(char));
+ values[4] = (char *) palloc0(1024 * sizeof(char));
+ values[5] = (char *) palloc0(1024 * sizeof(char));
+
+ values[6] = (char *) palloc(64 * sizeof(char));
+ values[7] = (char *) palloc(64 * sizeof(char));
+ values[8] = (char *) palloc(64 * sizeof(char));
+
+ /* we need to do this only when printing the actual values */
+ outfuncs = (Oid*)palloc0(sizeof(Oid) * histogram->ndimensions);
+ fmgrinfo = (FmgrInfo*)palloc0(sizeof(FmgrInfo) * histogram->ndimensions);
+
+ for (i = 0; i < histogram->ndimensions; i++)
+ {
+ bool isvarlena;
+
+ getTypeOutputInfo(get_atttype(relid, stakeys->values[i]),
+ &outfuncs[i], &isvarlena);
+
+ fmgr_info(outfuncs[i], &fmgrinfo[i]);
+ }
+
+ snprintf(values[0], 64, "%d", call_cntr); /* bucket ID */
+
+ /*
+ * Print the boundaries either as actual values (otype 0), as indexes
+ * into the deduplicated arrays (otype 1), or as indexes normalized
+ * into [0,1] (otype 2). The deduplicated values are sorted, so even
+ * the indexes give a good idea of the bucket layout.
+ */
+
+ for (i = 0; i < histogram->ndimensions; i++)
+ {
+ bucket_size *= (bucket->max[i] - bucket->min[i]) * 1.0
+ / (histogram->nvalues[i]-1);
+
+ /* print the actual values, i.e. use output function etc. */
+ if (otype == 0)
+ {
+ Datum minval, maxval;
+ Datum minout, maxout;
+
+ format = "%s, %s";
+ if (i == 0)
+ format = "{%s%s";
+ else if (i == histogram->ndimensions-1)
+ format = "%s, %s}";
+
+ minval = histogram->values[i][bucket->min[i]];
+ minout = FunctionCall1(&fmgrinfo[i], minval);
+
+ maxval = histogram->values[i][bucket->max[i]];
+ maxout = FunctionCall1(&fmgrinfo[i], maxval);
+
+ snprintf(buff, 1024, format, values[1], DatumGetPointer(minout));
+ strncpy(values[1], buff, 1023);
+ buff[0] = '\0';
+
+ snprintf(buff, 1024, format, values[2], DatumGetPointer(maxout));
+ strncpy(values[2], buff, 1023);
+ buff[0] = '\0';
+ }
+ else if (otype == 1)
+ {
+ format = "%s, %d";
+ if (i == 0)
+ format = "{%s%d";
+ else if (i == histogram->ndimensions-1)
+ format = "%s, %d}";
+
+ snprintf(buff, 1024, format, values[1], bucket->min[i]);
+ strncpy(values[1], buff, 1023);
+ buff[0] = '\0';
+
+ snprintf(buff, 1024, format, values[2], bucket->max[i]);
+ strncpy(values[2], buff, 1023);
+ buff[0] = '\0';
+ }
+ else
+ {
+ format = "%s, %f";
+ if (i == 0)
+ format = "{%s%f";
+ else if (i == histogram->ndimensions-1)
+ format = "%s, %f}";
+
+ snprintf(buff, 1024, format, values[1],
+ bucket->min[i] * 1.0 / (histogram->nvalues[i]-1));
+ strncpy(values[1], buff, 1023);
+ buff[0] = '\0';
+
+ snprintf(buff, 1024, format, values[2],
+ bucket->max[i] * 1.0 / (histogram->nvalues[i]-1));
+ strncpy(values[2], buff, 1023);
+ buff[0] = '\0';
+ }
+
+ format = "%s, %s";
+ if (i == 0)
+ format = "{%s%s";
+ else if (i == histogram->ndimensions-1)
+ format = "%s, %s}";
+
+ snprintf(buff, 1024, format, values[3], bucket->nullsonly[i] ? "t" : "f");
+ strncpy(values[3], buff, 1023);
+ buff[0] = '\0';
+
+ snprintf(buff, 1024, format, values[4], bucket->min_inclusive[i] ? "t" : "f");
+ strncpy(values[4], buff, 1023);
+ buff[0] = '\0';
+
+ snprintf(buff, 1024, format, values[5], bucket->max_inclusive[i] ? "t" : "f");
+ strncpy(values[5], buff, 1023);
+ buff[0] = '\0';
+ }
+
+ snprintf(values[6], 64, "%f", bucket->ntuples); /* frequency */
+ snprintf(values[7], 64, "%f", bucket->ntuples / bucket_size); /* density */
+ snprintf(values[8], 64, "%f", bucket_size); /* bucket_size */
+
+ /* build a tuple */
+ tuple = BuildTupleFromCStrings(attinmeta, values);
+
+ /* make the tuple into a datum */
+ result = HeapTupleGetDatum(tuple);
+
+ /* clean up (this is not really necessary) */
+ pfree(values[0]);
+ pfree(values[1]);
+ pfree(values[2]);
+ pfree(values[3]);
+ pfree(values[4]);
+ pfree(values[5]);
+ pfree(values[6]);
+ pfree(values[7]);
+ pfree(values[8]);
+
+ pfree(values);
+
+ SRF_RETURN_NEXT(funcctx, result);
+ }
+ else /* do when there is no more left */
+ {
+ SRF_RETURN_DONE(funcctx);
+ }
+}
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 01d29db..af3bd62 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -2101,9 +2101,9 @@ describeOneTableDetails(const char *schemaname,
{
printfPQExpBuffer(&buf,
"SELECT oid, stakeys,\n"
- " deps_enabled, mcv_enabled,\n"
- " deps_built, mcv_built,\n"
- " mcv_max_items,\n"
+ " deps_enabled, mcv_enabled, hist_enabled,\n"
+ " deps_built, mcv_built, hist_built,\n"
+ " mcv_max_items, hist_max_buckets,\n"
" (SELECT string_agg(attname::text,', ')\n"
" FROM ((SELECT unnest(stakeys) AS attnum) s\n"
" JOIN pg_attribute a ON (starelid = a.attrelid and a.attnum = s.attnum))) AS attnums\n"
@@ -2141,8 +2141,17 @@ describeOneTableDetails(const char *schemaname,
first = false;
}
+ if (!strcmp(PQgetvalue(result, i, 4), "t"))
+ {
+ if (! first)
+ appendPQExpBuffer(&buf, ", histogram");
+ else
+ appendPQExpBuffer(&buf, "(histogram");
+ first = false;
+ }
+
appendPQExpBuffer(&buf, ") ON (%s)",
- PQgetvalue(result, i, 8));
+ PQgetvalue(result, i, 10));
printTableAddFooter(&cont, buf.data);
}
diff --git a/src/include/catalog/pg_mv_statistic.h b/src/include/catalog/pg_mv_statistic.h
index c6e7d74..84579da 100644
--- a/src/include/catalog/pg_mv_statistic.h
+++ b/src/include/catalog/pg_mv_statistic.h
@@ -36,13 +36,16 @@ CATALOG(pg_mv_statistic,3381)
/* statistics requested to build */
bool deps_enabled; /* analyze dependencies? */
bool mcv_enabled; /* build MCV list? */
+ bool hist_enabled; /* build histogram? */
- /* MCV size */
+ /* histogram / MCV size */
int32 mcv_max_items; /* max MCV items */
+ int32 hist_max_buckets; /* max histogram buckets */
/* statistics that are available (if requested) */
bool deps_built; /* dependencies were built */
bool mcv_built; /* MCV list was built */
+ bool hist_built; /* histogram was built */
/* variable-length fields start here, but we allow direct access to stakeys */
int2vector stakeys; /* array of column keys */
@@ -50,6 +53,7 @@ CATALOG(pg_mv_statistic,3381)
#ifdef CATALOG_VARLEN
bytea stadeps; /* dependencies (serialized) */
bytea stamcv; /* MCV list (serialized) */
+ bytea stahist; /* MV histogram (serialized) */
#endif
} FormData_pg_mv_statistic;
@@ -65,15 +69,19 @@ typedef FormData_pg_mv_statistic *Form_pg_mv_statistic;
* compiler constants for pg_attrdef
* ----------------
*/
-#define Natts_pg_mv_statistic 9
+#define Natts_pg_mv_statistic 13
#define Anum_pg_mv_statistic_starelid 1
#define Anum_pg_mv_statistic_deps_enabled 2
#define Anum_pg_mv_statistic_mcv_enabled 3
-#define Anum_pg_mv_statistic_mcv_max_items 4
-#define Anum_pg_mv_statistic_deps_built 5
-#define Anum_pg_mv_statistic_mcv_built 6
-#define Anum_pg_mv_statistic_stakeys 7
-#define Anum_pg_mv_statistic_stadeps 8
-#define Anum_pg_mv_statistic_stamcv 9
+#define Anum_pg_mv_statistic_hist_enabled 4
+#define Anum_pg_mv_statistic_mcv_max_items 5
+#define Anum_pg_mv_statistic_hist_max_buckets 6
+#define Anum_pg_mv_statistic_deps_built 7
+#define Anum_pg_mv_statistic_mcv_built 8
+#define Anum_pg_mv_statistic_hist_built 9
+#define Anum_pg_mv_statistic_stakeys 10
+#define Anum_pg_mv_statistic_stadeps 11
+#define Anum_pg_mv_statistic_stamcv 12
+#define Anum_pg_mv_statistic_stahist 13
#endif /* PG_MV_STATISTIC_H */
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
index 890c763..1d451f6 100644
--- a/src/include/catalog/pg_proc.h
+++ b/src/include/catalog/pg_proc.h
@@ -2743,6 +2743,10 @@ DATA(insert OID = 3376 ( pg_mv_stats_mcvlist_info PGNSP PGUID 12 1 0 0 0 f f f
DESCR("multi-variate statistics: MCV list info");
DATA(insert OID = 3373 ( pg_mv_mcv_items PGNSP PGUID 12 1 1000 0 0 f f f f t t i 1 0 2249 "26" "{26,23,1009,1000,701}" "{i,o,o,o,o}" "{oid,index,values,nulls,frequency}" _null_ _null_ pg_mv_mcv_items _null_ _null_ _null_ ));
DESCR("details about MCV list items");
+DATA(insert OID = 3375 ( pg_mv_stats_histogram_info PGNSP PGUID 12 1 0 0 0 f f f f t f i 1 0 25 "17" _null_ _null_ _null_ _null_ _null_ pg_mv_stats_histogram_info _null_ _null_ _null_ ));
+DESCR("multi-variate statistics: histogram info");
+DATA(insert OID = 3374 ( pg_mv_histogram_buckets PGNSP PGUID 12 1 1000 0 0 f f f f t t i 2 0 2249 "26 23" "{26,23,23,1009,1009,1000,1000,1000,701,701,701}" "{i,i,o,o,o,o,o,o,o,o,o}" "{oid,otype,index,minvals,maxvals,nullsonly,mininclusive,maxinclusive,frequency,density,bucket_size}" _null_ _null_ pg_mv_histogram_buckets _null_ _null_ _null_ ));
+DESCR("details about histogram buckets");
DATA(insert OID = 1928 ( pg_stat_get_numscans PGNSP PGUID 12 1 0 0 0 f f f f t f s 1 0 20 "26" _null_ _null_ _null_ _null_ _null_ pg_stat_get_numscans _null_ _null_ _null_ ));
DESCR("statistics: number of scans done for table/index");
diff --git a/src/include/nodes/relation.h b/src/include/nodes/relation.h
index 917ae8d..abf5815 100644
--- a/src/include/nodes/relation.h
+++ b/src/include/nodes/relation.h
@@ -573,10 +573,12 @@ typedef struct MVStatisticInfo
/* enabled statistics */
bool deps_enabled; /* functional dependencies enabled */
bool mcv_enabled; /* MCV list enabled */
+ bool hist_enabled; /* histogram enabled */
/* built/available statistics */
bool deps_built; /* functional dependencies built */
bool mcv_built; /* MCV list built */
+ bool hist_built; /* histogram built */
/* columns in the statistics (attnums) */
int2vector *stakeys; /* attnums of the columns covered */
diff --git a/src/include/utils/mvstats.h b/src/include/utils/mvstats.h
index b028192..70f79ed 100644
--- a/src/include/utils/mvstats.h
+++ b/src/include/utils/mvstats.h
@@ -91,6 +91,123 @@ typedef MCVListData *MCVList;
#define MVSTAT_MCVLIST_MAX_ITEMS 8192 /* max items in MCV list */
/*
+ * Multivariate histograms
+ */
+typedef struct MVBucketData {
+
+ /* Frequencies of this bucket. */
+ float ntuples; /* frequency of tuples in this bucket */
+
+ /*
+ * Information about dimensions being NULL-only.
+ */
+ bool *nullsonly;
+
+ /* lower boundaries - values and information about the inequalities */
+ Datum *min;
+ bool *min_inclusive;
+
+ /* upper boundaries - values and information about the inequalities */
+ Datum *max;
+ bool *max_inclusive;
+
+ /* used when building the histogram (not serialized/deserialized) */
+ void *build_data;
+
+} MVBucketData;
+
+typedef MVBucketData *MVBucket;
+
+
+typedef struct MVHistogramData {
+
+ uint32 magic; /* magic constant marker */
+ uint32 type; /* type of histogram (BASIC) */
+ uint32 nbuckets; /* number of buckets (buckets array) */
+ uint32 ndimensions; /* number of dimensions */
+
+ MVBucket *buckets; /* array of buckets */
+
+} MVHistogramData;
+
+typedef MVHistogramData *MVHistogram;
+
+/*
+ * Histogram in a partially serialized form, with deduplicated boundary
+ * values etc.
+ *
+ * TODO add more detailed description here
+ */
+
+typedef struct MVSerializedBucketData {
+
+ /* Frequencies of this bucket. */
+ float ntuples; /* frequency of tuples in this bucket */
+
+ /*
+ * Information about dimensions being NULL-only.
+ */
+ bool *nullsonly;
+
+ /* indexes of lower boundaries, and information about the
+ * inequalities (exclusive vs. inclusive) */
+ uint16 *min;
+ bool *min_inclusive;
+
+ /* indexes of upper boundaries - values and information about the
+ * inequalities (exclusive vs. inclusive) */
+ uint16 *max;
+ bool *max_inclusive;
+
+} MVSerializedBucketData;
+
+typedef MVSerializedBucketData *MVSerializedBucket;
+
+typedef struct MVSerializedHistogramData {
+
+ uint32 magic; /* magic constant marker */
+ uint32 type; /* type of histogram (BASIC) */
+ uint32 nbuckets; /* number of buckets (buckets array) */
+ uint32 ndimensions; /* number of dimensions */
+
+ /*
+ * keep this in sync with MVHistogramData, because deserialization
+ * relies on the fields being at the same offsets
+ */
+ MVSerializedBucket *buckets; /* array of buckets */
+
+ /*
+ * serialized boundary values, one array per dimension, deduplicated
+ * (the min/max indexes point into these arrays)
+ */
+ int *nvalues;
+ Datum **values;
+
+} MVSerializedHistogramData;
+
+typedef MVSerializedHistogramData *MVSerializedHistogram;
+
+
+/* used to flag stats serialized to bytea */
+#define MVSTAT_HIST_MAGIC 0x7F8C5670 /* marks serialized bytea */
+#define MVSTAT_HIST_TYPE_BASIC 1 /* basic histogram type */
+
+/*
+ * Limits used for max_buckets option, i.e. we're always guaranteed
+ * to have space for at least MVSTAT_HIST_MIN_BUCKETS, and we cannot
+ * have more than MVSTAT_HIST_MAX_BUCKETS buckets.
+ *
+ * This is just a boundary for the 'max' threshold - the actual
+ * histogram may use fewer buckets than MVSTAT_HIST_MAX_BUCKETS.
+ *
+ * TODO The MVSTAT_HIST_MIN_BUCKETS should be related to the number of
+ * attributes (MVSTATS_MAX_DIMENSIONS) because of NULL-buckets.
+ * There should be at least 2^N buckets, otherwise we may be unable
+ * to build the NULL buckets.
+ */
+#define MVSTAT_HIST_MIN_BUCKETS 128 /* min number of buckets */
+#define MVSTAT_HIST_MAX_BUCKETS 16384 /* max number of buckets */
+
+/*
* TODO Maybe fetching the histogram/MCV list separately is inefficient?
* Consider adding a single `fetch_stats` method, fetching all
* stats specified using flags (or something like that).
@@ -98,20 +215,25 @@ typedef MCVListData *MCVList;
MVDependencies load_mv_dependencies(Oid mvoid);
MCVList load_mv_mcvlist(Oid mvoid);
+MVSerializedHistogram load_mv_histogram(Oid mvoid);
bytea * serialize_mv_dependencies(MVDependencies dependencies);
bytea * serialize_mv_mcvlist(MCVList mcvlist, int2vector *attrs,
VacAttrStats **stats);
+bytea * serialize_mv_histogram(MVHistogram histogram, int2vector *attrs,
+ VacAttrStats **stats);
/* deserialization of stats (serialization is private to analyze) */
MVDependencies deserialize_mv_dependencies(bytea * data);
MCVList deserialize_mv_mcvlist(bytea * data);
+MVSerializedHistogram deserialize_mv_histogram(bytea * data);
/*
* Returns index of the attribute number within the vector (i.e. a
* dimension within the stats).
*/
int mv_get_index(AttrNumber varattno, int2vector * stakeys);
int2vector* find_mv_attnums(Oid mvoid, Oid *relid);
@@ -120,6 +242,8 @@ extern Datum pg_mv_stats_dependencies_info(PG_FUNCTION_ARGS);
extern Datum pg_mv_stats_dependencies_show(PG_FUNCTION_ARGS);
extern Datum pg_mv_stats_mcvlist_info(PG_FUNCTION_ARGS);
extern Datum pg_mv_mcvlist_items(PG_FUNCTION_ARGS);
+extern Datum pg_mv_stats_histogram_info(PG_FUNCTION_ARGS);
+extern Datum pg_mv_histogram_buckets(PG_FUNCTION_ARGS);
MVDependencies
build_mv_dependencies(int numrows, HeapTuple *rows, int2vector *attrs,
@@ -129,10 +253,15 @@ MCVList
build_mv_mcvlist(int numrows, HeapTuple *rows, int2vector *attrs,
VacAttrStats **stats, int *numrows_filtered);
+MVHistogram
+build_mv_histogram(int numrows, HeapTuple *rows, int2vector *attrs,
+ VacAttrStats **stats, int numrows_total);
+
void build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
int natts, VacAttrStats **vacattrstats);
-void update_mv_stats(Oid relid, MVDependencies dependencies, MCVList mcvlist,
+void update_mv_stats(Oid relid, MVDependencies dependencies,
+ MCVList mcvlist, MVHistogram histogram,
int2vector *attrs, VacAttrStats **stats);
#endif
diff --git a/src/test/regress/expected/mv_histogram.out b/src/test/regress/expected/mv_histogram.out
new file mode 100644
index 0000000..a3d3fd8
--- /dev/null
+++ b/src/test/regress/expected/mv_histogram.out
@@ -0,0 +1,207 @@
+-- data type passed by value
+CREATE TABLE mv_histogram (
+ a INT,
+ b INT,
+ c INT
+);
+-- unknown column
+ALTER TABLE mv_histogram ADD STATISTICS (histogram) ON (unknown_column);
+ERROR: column "unknown_column" referenced in statistics does not exist
+-- single column
+ALTER TABLE mv_histogram ADD STATISTICS (histogram) ON (a);
+ERROR: multivariate stats require 2 or more columns
+-- single column, duplicated
+ALTER TABLE mv_histogram ADD STATISTICS (histogram) ON (a, a);
+ERROR: duplicate column name in statistics definition
+-- two columns, one duplicated
+ALTER TABLE mv_histogram ADD STATISTICS (histogram) ON (a, a, b);
+ERROR: duplicate column name in statistics definition
+-- unknown option
+ALTER TABLE mv_histogram ADD STATISTICS (unknown_option) ON (a, b, c);
+ERROR: unrecognized STATISTICS option "unknown_option"
+-- missing histogram statistics
+ALTER TABLE mv_histogram ADD STATISTICS (dependencies, max_buckets 200) ON (a, b, c);
+ERROR: option 'histogram' is required by other option(s)
+-- invalid max_buckets value / too low
+ALTER TABLE mv_histogram ADD STATISTICS (histogram, max_buckets 10) ON (a, b, c);
+ERROR: minimum number of buckets is 128
+-- invalid max_buckets value / too high
+ALTER TABLE mv_histogram ADD STATISTICS (histogram, max_buckets 100000) ON (a, b, c);
+ERROR: maximum number of buckets is 16384
+-- correct command
+ALTER TABLE mv_histogram ADD STATISTICS (histogram) ON (a, b, c);
+-- random data (no functional dependencies)
+INSERT INTO mv_histogram
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+ANALYZE mv_histogram;
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+ hist_enabled | hist_built
+--------------+------------
+ t | t
+(1 row)
+
+TRUNCATE mv_histogram;
+-- a => b, a => c, b => c
+INSERT INTO mv_histogram
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE mv_histogram;
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+ hist_enabled | hist_built
+--------------+------------
+ t | t
+(1 row)
+
+TRUNCATE mv_histogram;
+-- a => b, a => c
+INSERT INTO mv_histogram
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE mv_histogram;
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+ hist_enabled | hist_built
+--------------+------------
+ t | t
+(1 row)
+
+TRUNCATE mv_histogram;
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO mv_histogram
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX hist_idx ON mv_histogram (a, b);
+ANALYZE mv_histogram;
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+ hist_enabled | hist_built
+--------------+------------
+ t | t
+(1 row)
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mv_histogram WHERE a = 10 AND b = 5;
+ QUERY PLAN
+--------------------------------------------
+ Bitmap Heap Scan on mv_histogram
+ Recheck Cond: ((a = 10) AND (b = 5))
+ -> Bitmap Index Scan on hist_idx
+ Index Cond: ((a = 10) AND (b = 5))
+(4 rows)
+
+DROP TABLE mv_histogram;
+-- varlena type (text)
+CREATE TABLE mv_histogram (
+ a TEXT,
+ b TEXT,
+ c TEXT
+);
+ALTER TABLE mv_histogram ADD STATISTICS (histogram) ON (a, b, c);
+-- random data (no functional dependencies)
+INSERT INTO mv_histogram
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+ANALYZE mv_histogram;
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+ hist_enabled | hist_built
+--------------+------------
+ t | t
+(1 row)
+
+TRUNCATE mv_histogram;
+-- a => b, a => c, b => c
+INSERT INTO mv_histogram
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE mv_histogram;
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+ hist_enabled | hist_built
+--------------+------------
+ t | t
+(1 row)
+
+TRUNCATE mv_histogram;
+-- a => b, a => c
+INSERT INTO mv_histogram
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE mv_histogram;
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+ hist_enabled | hist_built
+--------------+------------
+ t | t
+(1 row)
+
+TRUNCATE mv_histogram;
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO mv_histogram
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX hist_idx ON mv_histogram (a, b);
+ANALYZE mv_histogram;
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+ hist_enabled | hist_built
+--------------+------------
+ t | t
+(1 row)
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mv_histogram WHERE a = '10' AND b = '5';
+ QUERY PLAN
+------------------------------------------------------------
+ Bitmap Heap Scan on mv_histogram
+ Recheck Cond: ((a = '10'::text) AND (b = '5'::text))
+ -> Bitmap Index Scan on hist_idx
+ Index Cond: ((a = '10'::text) AND (b = '5'::text))
+(4 rows)
+
+TRUNCATE mv_histogram;
+-- check explain (expect bitmap index scan, not plain index scan) with NULLs
+INSERT INTO mv_histogram
+ SELECT
+ (CASE WHEN i/10000 = 0 THEN NULL ELSE i/10000 END),
+ (CASE WHEN i/20000 = 0 THEN NULL ELSE i/20000 END),
+ (CASE WHEN i/40000 = 0 THEN NULL ELSE i/40000 END)
+ FROM generate_series(1,1000000) s(i);
+ANALYZE mv_histogram;
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+ hist_enabled | hist_built
+--------------+------------
+ t | t
+(1 row)
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mv_histogram WHERE a IS NULL AND b IS NULL;
+ QUERY PLAN
+---------------------------------------------------
+ Bitmap Heap Scan on mv_histogram
+ Recheck Cond: ((a IS NULL) AND (b IS NULL))
+ -> Bitmap Index Scan on hist_idx
+ Index Cond: ((a IS NULL) AND (b IS NULL))
+(4 rows)
+
+DROP TABLE mv_histogram;
+-- NULL values (mix of int and text columns)
+CREATE TABLE mv_histogram (
+ a INT,
+ b TEXT,
+ c INT,
+ d TEXT
+);
+ALTER TABLE mv_histogram ADD STATISTICS (histogram) ON (a, b, c, d);
+INSERT INTO mv_histogram
+ SELECT
+ mod(i, 100),
+ (CASE WHEN mod(i, 200) = 0 THEN NULL ELSE mod(i,200) END),
+ mod(i, 400),
+ (CASE WHEN mod(i, 300) = 0 THEN NULL ELSE mod(i,600) END)
+ FROM generate_series(1,10000) s(i);
+ANALYZE mv_histogram;
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+ hist_enabled | hist_built
+--------------+------------
+ t | t
+(1 row)
+
+DROP TABLE mv_histogram;
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index faa41c7..e230e58 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1369,7 +1369,9 @@ pg_mv_stats| SELECT n.nspname AS schemaname,
length(s.stadeps) AS depsbytes,
pg_mv_stats_dependencies_info(s.stadeps) AS depsinfo,
length(s.stamcv) AS mcvbytes,
- pg_mv_stats_mcvlist_info(s.stamcv) AS mcvinfo
+ pg_mv_stats_mcvlist_info(s.stamcv) AS mcvinfo,
+ length(s.stahist) AS histbytes,
+ pg_mv_stats_histogram_info(s.stahist) AS histinfo
FROM ((pg_mv_statistic s
JOIN pg_class c ON ((c.oid = s.starelid)))
LEFT JOIN pg_namespace n ON ((n.oid = c.relnamespace)));
diff --git a/src/test/regress/parallel_schedule b/src/test/regress/parallel_schedule
index d083442..8715d17 100644
--- a/src/test/regress/parallel_schedule
+++ b/src/test/regress/parallel_schedule
@@ -112,4 +112,4 @@ test: event_trigger
test: stats
# run tests of multivariate stats
-test: mv_dependencies mv_mcv
+test: mv_dependencies mv_mcv mv_histogram
diff --git a/src/test/regress/serial_schedule b/src/test/regress/serial_schedule
index e63b7aa..6b9ed27 100644
--- a/src/test/regress/serial_schedule
+++ b/src/test/regress/serial_schedule
@@ -158,3 +158,4 @@ test: stats
test: tablesample
test: mv_dependencies
test: mv_mcv
+test: mv_histogram
diff --git a/src/test/regress/sql/mv_histogram.sql b/src/test/regress/sql/mv_histogram.sql
new file mode 100644
index 0000000..31c627a
--- /dev/null
+++ b/src/test/regress/sql/mv_histogram.sql
@@ -0,0 +1,176 @@
+-- data type passed by value
+CREATE TABLE mv_histogram (
+ a INT,
+ b INT,
+ c INT
+);
+
+-- unknown column
+ALTER TABLE mv_histogram ADD STATISTICS (histogram) ON (unknown_column);
+
+-- single column
+ALTER TABLE mv_histogram ADD STATISTICS (histogram) ON (a);
+
+-- single column, duplicated
+ALTER TABLE mv_histogram ADD STATISTICS (histogram) ON (a, a);
+
+-- two columns, one duplicated
+ALTER TABLE mv_histogram ADD STATISTICS (histogram) ON (a, a, b);
+
+-- unknown option
+ALTER TABLE mv_histogram ADD STATISTICS (unknown_option) ON (a, b, c);
+
+-- missing histogram statistics
+ALTER TABLE mv_histogram ADD STATISTICS (dependencies, max_buckets 200) ON (a, b, c);
+
+-- invalid max_buckets value / too low
+ALTER TABLE mv_histogram ADD STATISTICS (histogram, max_buckets 10) ON (a, b, c);
+
+-- invalid max_buckets value / too high
+ALTER TABLE mv_histogram ADD STATISTICS (histogram, max_buckets 100000) ON (a, b, c);
+
+-- correct command
+ALTER TABLE mv_histogram ADD STATISTICS (histogram) ON (a, b, c);
+
+-- random data (no functional dependencies)
+INSERT INTO mv_histogram
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+
+ANALYZE mv_histogram;
+
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+
+TRUNCATE mv_histogram;
+
+-- a => b, a => c, b => c
+INSERT INTO mv_histogram
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+
+ANALYZE mv_histogram;
+
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+
+TRUNCATE mv_histogram;
+
+-- a => b, a => c
+INSERT INTO mv_histogram
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE mv_histogram;
+
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+
+TRUNCATE mv_histogram;
+
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO mv_histogram
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX hist_idx ON mv_histogram (a, b);
+ANALYZE mv_histogram;
+
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mv_histogram WHERE a = 10 AND b = 5;
+
+DROP TABLE mv_histogram;
+
+-- varlena type (text)
+CREATE TABLE mv_histogram (
+ a TEXT,
+ b TEXT,
+ c TEXT
+);
+
+ALTER TABLE mv_histogram ADD STATISTICS (histogram) ON (a, b, c);
+
+-- random data (no functional dependencies)
+INSERT INTO mv_histogram
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+
+ANALYZE mv_histogram;
+
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+
+TRUNCATE mv_histogram;
+
+-- a => b, a => c, b => c
+INSERT INTO mv_histogram
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+
+ANALYZE mv_histogram;
+
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+
+TRUNCATE mv_histogram;
+
+-- a => b, a => c
+INSERT INTO mv_histogram
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE mv_histogram;
+
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+
+TRUNCATE mv_histogram;
+
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO mv_histogram
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX hist_idx ON mv_histogram (a, b);
+ANALYZE mv_histogram;
+
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mv_histogram WHERE a = '10' AND b = '5';
+
+TRUNCATE mv_histogram;
+
+-- check explain (expect bitmap index scan, not plain index scan) with NULLs
+INSERT INTO mv_histogram
+ SELECT
+ (CASE WHEN i/10000 = 0 THEN NULL ELSE i/10000 END),
+ (CASE WHEN i/20000 = 0 THEN NULL ELSE i/20000 END),
+ (CASE WHEN i/40000 = 0 THEN NULL ELSE i/40000 END)
+ FROM generate_series(1,1000000) s(i);
+ANALYZE mv_histogram;
+
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mv_histogram WHERE a IS NULL AND b IS NULL;
+
+DROP TABLE mv_histogram;
+
+-- NULL values (mix of int and text columns)
+CREATE TABLE mv_histogram (
+ a INT,
+ b TEXT,
+ c INT,
+ d TEXT
+);
+
+ALTER TABLE mv_histogram ADD STATISTICS (histogram) ON (a, b, c, d);
+
+INSERT INTO mv_histogram
+ SELECT
+ mod(i, 100),
+ (CASE WHEN mod(i, 200) = 0 THEN NULL ELSE mod(i,200) END),
+ mod(i, 400),
+ (CASE WHEN mod(i, 300) = 0 THEN NULL ELSE mod(i,600) END)
+ FROM generate_series(1,10000) s(i);
+
+ANALYZE mv_histogram;
+
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+
+DROP TABLE mv_histogram;
--
1.9.3
0006-multi-statistics-estimation-v7.patch
From a9df974e90067f68ea106e89a08ebc887412b5b5 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@pgaddict.com>
Date: Fri, 6 Feb 2015 01:42:38 +0100
Subject: [PATCH 6/6] multi-statistics estimation
The general idea is that a probability (which
is what selectivity is) can be split into a product of
conditional probabilities like this:
P(A & B & C) = P(A & B) * P(C|A & B)
If we assume that C and B are conditionally independent
(given A), the last part may be simplified like this
P(A & B & C) = P(A & B) * P(C|A)
so we only need probabilities on [A,B] and [C,A] to compute
the original probability.
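For example, with P(A & B) = 0.01 and P(C|A) = 0.5 this
gives 0.01 * 0.5 = 0.005, while the plain independence
assumption would simply multiply P(A), P(B) and P(C),
ignoring the correlation entirely.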
The implementation works in the other direction, though.
We know what probability P(A & B & C) we need to compute,
and also what statistics are available.
So we search for a combination of statistics, covering
the clauses in an optimal way (most clauses covered, most
dependencies exploited).
There are two possible approaches - exhaustive and greedy.
The exhaustive one walks through all permutations of the
stats (a recursive backtracking search), so it's guaranteed
to find the optimal solution, but it soon gets very slow,
as it's roughly O(N!). Pruning dead branches early improves
that a bit, but it's still far too expensive for large
numbers of statistics (on a single table).
The greedy algorithm is very simple - at every step it
picks the statistics that looks best locally. That may not
guarantee the best solution globally (but maybe it does?),
but it only needs N steps to find a solution, so it's very
fast (processing the selected stats is usually way more
expensive).
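To illustrate the difference: with 8 statistics on a table
the exhaustive search may walk through up to 8! = 40320
orderings, while the greedy one picks one of the remaining
statistics in each of its (at most) 8 steps.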
There's a GUC for selecting the search algorithm
mvstat_search = {'greedy', 'exhaustive'}
The default value is 'greedy' as that's much safer (with
respect to runtime). See choose_mv_statistics().
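For example, to try the exhaustive search in a session
(assuming the GUC behaves like the other planner GUCs and
can be changed with a plain SET):
SET mvstat_search = 'exhaustive';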
Once we have found a sequence of statistics, we apply
them to the clauses using the conditional probabilities.
We process the selected stats one by one, and for each
we select the estimated clauses and conditions. See
clauselist_selectivity() for more details.
Limitations
-----------
It's still true that each clause at a given level has to
be covered by a single MV statistics. So with this query
WHERE (clause1) AND (clause2) AND (clause3 OR clause4)
each parenthesized clause has to be covered by a single
multivariate statistics.
Clauses not covered by a single statistics at this level
will be passed to clause_selectivity() but this will treat
them as a collection of simpler clauses (connected by AND
or OR), and the clauses from the previous level will be
used as conditions.
So using the same example, the last clause will be passed
to clause_selectivity() with 'clause1' and 'clause2' as
conditions, and it will be processed using multivariate
stats if possible.
The other limitation is that all the expressions have to
be mv-compatible, i.e. a clause can't mix mv-compatible and
incompatible expressions. If this is violated, the clause
may be passed to the next level (just like a list of clauses
not covered by a single statistics), which splits it into
clauses handled by multivariate stats and clauses handled
by regular statistics.
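For example a clause like
WHERE (a = 1 OR f(b) < 10)
presumably mixes an mv-compatible expression (a = 1) with
an incompatible one (the function call), and so has to be
estimated the old way.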
---
contrib/file_fdw/file_fdw.c | 3 +-
contrib/postgres_fdw/postgres_fdw.c | 6 +-
src/backend/optimizer/path/clausesel.c | 2151 +++++++++++++++++++++++++++++---
src/backend/optimizer/path/costsize.c | 23 +-
src/backend/optimizer/util/orclauses.c | 4 +-
src/backend/utils/adt/selfuncs.c | 17 +-
src/backend/utils/misc/guc.c | 20 +
src/include/optimizer/cost.h | 6 +-
src/include/utils/mvstats.h | 8 +
9 files changed, 2016 insertions(+), 222 deletions(-)
diff --git a/contrib/file_fdw/file_fdw.c b/contrib/file_fdw/file_fdw.c
index 499f24f..0d7d2e7 100644
--- a/contrib/file_fdw/file_fdw.c
+++ b/contrib/file_fdw/file_fdw.c
@@ -949,7 +949,8 @@ estimate_size(PlannerInfo *root, RelOptInfo *baserel,
baserel->baserestrictinfo,
0,
JOIN_INNER,
- NULL);
+ NULL,
+ NIL);
nrows = clamp_row_est(nrows);
diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c
index 6da01e1..bd487c5 100644
--- a/contrib/postgres_fdw/postgres_fdw.c
+++ b/contrib/postgres_fdw/postgres_fdw.c
@@ -479,7 +479,8 @@ postgresGetForeignRelSize(PlannerInfo *root,
fpinfo->local_conds,
baserel->relid,
JOIN_INNER,
- NULL);
+ NULL,
+ NIL);
cost_qual_eval(&fpinfo->local_conds_cost, fpinfo->local_conds, root);
@@ -1785,7 +1786,8 @@ estimate_path_cost_size(PlannerInfo *root,
local_join_conds,
baserel->relid,
JOIN_INNER,
- NULL);
+ NULL,
+ NIL);
local_sel *= fpinfo->local_conds_sel;
rows = clamp_row_est(rows * local_sel);
diff --git a/src/backend/optimizer/path/clausesel.c b/src/backend/optimizer/path/clausesel.c
index bc02e92..fce77ec 100644
--- a/src/backend/optimizer/path/clausesel.c
+++ b/src/backend/optimizer/path/clausesel.c
@@ -29,6 +29,8 @@
#include "utils/selfuncs.h"
#include "utils/typcache.h"
+#include "miscadmin.h"
+
/*
* Data structure for accumulating info about possible range-query
@@ -44,6 +46,13 @@ typedef struct RangeQueryClause
Selectivity hibound; /* Selectivity of a var < something clause */
} RangeQueryClause;
+static Selectivity clauselist_selectivity_or(PlannerInfo *root,
+ List *clauses,
+ int varRelid,
+ JoinType jointype,
+ SpecialJoinInfo *sjinfo,
+ List *conditions);
+
static void addRangeClause(RangeQueryClause **rqlist, Node *clause,
bool varonleft, bool isLTsel, Selectivity s2);
@@ -59,23 +68,29 @@ static Bitmapset *collect_mv_attnums(PlannerInfo *root, List *clauses,
Oid varRelid, Index *relid, SpecialJoinInfo *sjinfo,
int type);
+static Bitmapset *clause_mv_get_attnums(PlannerInfo *root, Node *clause);
+
static List *clauselist_apply_dependencies(PlannerInfo *root, List *clauses,
Oid varRelid, List *stats,
SpecialJoinInfo *sjinfo);
-static MVStatisticInfo *choose_mv_statistics(List *mvstats, Bitmapset *attnums);
-
static List *clauselist_mv_split(PlannerInfo *root, SpecialJoinInfo *sjinfo,
List *clauses, Oid varRelid,
List **mvclauses, MVStatisticInfo *mvstats, int types);
static Selectivity clauselist_mv_selectivity(PlannerInfo *root,
- List *clauses, MVStatisticInfo *mvstats);
+ MVStatisticInfo *mvstats, List *clauses,
+ List *conditions, bool is_or);
+
static Selectivity clauselist_mv_selectivity_mcvlist(PlannerInfo *root,
- List *clauses, MVStatisticInfo *mvstats,
- bool *fullmatch, Selectivity *lowsel);
+ MVStatisticInfo *mvstats,
+ List *clauses, List *conditions,
+ bool is_or, bool *fullmatch,
+ Selectivity *lowsel);
static Selectivity clauselist_mv_selectivity_histogram(PlannerInfo *root,
- List *clauses, MVStatisticInfo *mvstats);
+ MVStatisticInfo *mvstats,
+ List *clauses, List *conditions,
+ bool is_or);
static int update_match_bitmap_mcvlist(PlannerInfo *root, List *clauses,
int2vector *stakeys, MCVList mcvlist,
@@ -89,11 +104,59 @@ static int update_match_bitmap_histogram(PlannerInfo *root, List *clauses,
int nmatches, char * matches,
bool is_or);
+/*
+ * Describes a combination of multiple statistics, used to cover the
+ * attributes referenced by the clauses. The array 'stats' (with
+ * nstats elements) lists the selected statistics as indexes, in the
+ * order they are applied, along with the number of clauses and
+ * conditions covered by the solution. For example nstats = 2 with
+ * stats = {3, 0} means statistics 3 is applied first, then
+ * statistics 0.
+ *
+ * choose_mv_statistics_exhaustive() uses this to track both the
+ * current and the best solutions, while walking through the space
+ * of possible combinations.
+ */
+typedef struct mv_solution_t {
+ int nclauses; /* number of clauses covered */
+ int nconditions; /* number of conditions covered */
+ int nstats; /* number of stats applied */
+ int *stats; /* stats (in the apply order) */
+} mv_solution_t;
+
+static List *choose_mv_statistics(PlannerInfo *root,
+ List *mvstats,
+ List *clauses, List *conditions,
+ Oid varRelid,
+ SpecialJoinInfo *sjinfo);
+
+static List *filter_clauses(PlannerInfo *root, Oid varRelid,
+ SpecialJoinInfo *sjinfo, int type,
+ List *stats, List *clauses,
+ Bitmapset **attnums);
+
+static List *filter_stats(List *stats, Bitmapset *new_attnums,
+ Bitmapset *all_attnums);
+
+static Bitmapset **make_stats_attnums(MVStatisticInfo *mvstats,
+ int nmvstats);
+
+static MVStatisticInfo *make_stats_array(List *stats, int *nmvstats);
+
+static List* filter_redundant_stats(List *stats,
+ List *clauses, List *conditions);
+
+static Node** make_clauses_array(List *clauses, int *nclauses);
+
+static Bitmapset ** make_clauses_attnums(PlannerInfo *root, Oid varRelid,
+ SpecialJoinInfo *sjinfo, int type,
+ Node **clauses, int nclauses);
+
+static bool* make_cover_map(Bitmapset **stats_attnums, int nmvstats,
+ Bitmapset **clauses_attnums, int nclauses);
+
static bool has_stats(List *stats, int type);
static List * find_stats(PlannerInfo *root, List *clauses,
Oid varRelid, Index *relid);
-
+
static Bitmapset* fdeps_collect_attnums(List *stats);
static int *make_idx_to_attnum_mapping(Bitmapset *attnums);
@@ -116,6 +179,8 @@ static Bitmapset *fdeps_filter_clauses(PlannerInfo *root,
static Bitmapset * get_varattnos(Node * node, Index relid);
+int mvstat_search_type = MVSTAT_SEARCH_GREEDY;
+
/* used for merging bitmaps - AND (min), OR (max) */
#define MAX(x, y) (((x) > (y)) ? (x) : (y))
#define MIN(x, y) (((x) < (y)) ? (x) : (y))
@@ -256,14 +321,15 @@ clauselist_selectivity(PlannerInfo *root,
List *clauses,
int varRelid,
JoinType jointype,
- SpecialJoinInfo *sjinfo)
+ SpecialJoinInfo *sjinfo,
+ List *conditions)
{
Selectivity s1 = 1.0;
RangeQueryClause *rqlist = NULL;
ListCell *l;
/* processing mv stats */
- Oid relid = InvalidOid;
+ Index relid = InvalidOid;
/* attributes in mv-compatible clauses */
Bitmapset *mvattnums = NULL;
@@ -273,12 +339,13 @@ clauselist_selectivity(PlannerInfo *root,
stats = find_stats(root, clauses, varRelid, &relid);
/*
- * If there's exactly one clause, then no use in trying to match up pairs,
- * so just go directly to clause_selectivity().
+ * If there's exactly one clause, then no use in trying to match up
+ * pairs, or matching multivariate statistics, so just go directly
+ * to clause_selectivity().
*/
if (list_length(clauses) == 1)
return clause_selectivity(root, (Node *) linitial(clauses),
- varRelid, jointype, sjinfo);
+ varRelid, jointype, sjinfo, conditions);
/*
* Check that there are some stats with functional dependencies
@@ -310,8 +377,8 @@ clauselist_selectivity(PlannerInfo *root,
}
/*
- * Check that there are statistics with MCV list. If not, we don't
- * need to waste time with the optimization.
+ * Check that there are statistics with MCV list or histogram.
+ * If not, we don't need to waste time with the optimization.
*/
if (has_stats(stats, MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST))
{
@@ -325,33 +392,194 @@ clauselist_selectivity(PlannerInfo *root,
/*
* If there still are at least two columns, we'll try to select
- * a suitable multivariate stats.
+ * a suitable combination of multivariate stats. If there are
+ * multiple combinations, we'll try to choose the best one.
+ * See choose_mv_statistics for more details.
*/
if (bms_num_members(mvattnums) >= 2)
{
- /* see choose_mv_statistics() for details */
- MVStatisticInfo *mvstat = choose_mv_statistics(stats, mvattnums);
+ int k;
+ ListCell *s;
- if (mvstat != NULL) /* we have a matching stats */
+ /*
+ * Copy the list of conditions, so that we can build a list
+ * of local conditions (and keep the original intact, for
+ * the other clauses at the same level).
+ */
+ List *conditions_local = list_copy(conditions);
+
+ /* find the best combination of statistics */
+ List *solution = choose_mv_statistics(root, stats,
+ clauses, conditions,
+ varRelid, sjinfo);
+
+ /* we have a good solution (list of stats) */
+ foreach (s, solution)
{
+ MVStatisticInfo *mvstat = (MVStatisticInfo *)lfirst(s);
+
/* clauses compatible with multi-variate stats */
List *mvclauses = NIL;
+ List *mvclauses_new = NIL;
+ List *mvclauses_conditions = NIL;
+ Bitmapset *stat_attnums = NULL;
- /* split the clauselist into regular and mv-clauses */
- clauses = clauselist_mv_split(root, sjinfo, clauses,
+ /* build attnum bitmapset for this statistics */
+ for (k = 0; k < mvstat->stakeys->dim1; k++)
+ stat_attnums = bms_add_member(stat_attnums,
+ mvstat->stakeys->values[k]);
+
+ /*
+ * Append the compatible conditions (passed from above)
+ * to mvclauses_conditions.
+ */
+ foreach (l, conditions)
+ {
+ Node *c = (Node*)lfirst(l);
+ Bitmapset *tmp = clause_mv_get_attnums(root, c);
+
+ if (bms_is_subset(tmp, stat_attnums))
+ mvclauses_conditions
+ = lappend(mvclauses_conditions, c);
+
+ bms_free(tmp);
+ }
+
+ /* split the clauselist into regular and mv-clauses
+ *
+ * We keep the list of clauses (we don't remove the
+ * clauses yet, because we want to use the clauses
+ * as conditions of other clauses).
+ *
+ * FIXME Do this only once, i.e. filter the clauses
+ * once (selecting clauses covered by at least
+ * one statistics) and then convert them into
+ * smaller per-statistics lists of conditions
+ * and estimated clauses.
+ */
+ clauselist_mv_split(root, sjinfo, clauses,
varRelid, &mvclauses, mvstat,
(MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST));
- /* we've chosen the histogram to match the clauses */
+ /*
+ * We've chosen the statistics to match the clauses, so
+ * each statistics from the solution should have at least
+ * one new clause (not covered by the previous stats).
+ */
Assert(mvclauses != NIL);
+ /*
+ * Mvclauses now contains only clauses compatible
+ * with the currently selected stats, but we have to
+ * split that into conditions (already matched by
+ * the previous stats), and the new clauses we need
+ * to estimate using this stats.
+ */
+ foreach (l, mvclauses)
+ {
+ ListCell *p;
+ bool covered = false;
+ Node *clause = (Node *) lfirst(l);
+ Bitmapset *clause_attnums = clause_mv_get_attnums(root, clause);
+
+ /*
+ * If already covered by previous stats, add it to
+ * conditions.
+ *
+ * TODO Maybe this could be relaxed a bit? Because
+ * with complex and/or clauses, this might
+ * mean no statistics actually covers such
+ * complex clause.
+ */
+ foreach (p, solution)
+ {
+ int k;
+ Bitmapset *stat_attnums = NULL;
+
+ MVStatisticInfo *prev_stat
+ = (MVStatisticInfo *)lfirst(p);
+
+ /* break if we've run into the current statistics */
+ if (prev_stat == mvstat)
+ break;
+
+ for (k = 0; k < prev_stat->stakeys->dim1; k++)
+ stat_attnums = bms_add_member(stat_attnums,
+ prev_stat->stakeys->values[k]);
+
+ covered = bms_is_subset(clause_attnums, stat_attnums);
+
+ bms_free(stat_attnums);
+
+ if (covered)
+ break;
+ }
+
+ if (covered)
+ mvclauses_conditions
+ = lappend(mvclauses_conditions, clause);
+ else
+ mvclauses_new
+ = lappend(mvclauses_new, clause);
+ }
+
+ /*
+ * We need at least one new clause (not just conditions).
+ */
+ Assert(mvclauses_new != NIL);
+
/* compute the multivariate stats */
- s1 *= clauselist_mv_selectivity(root, mvclauses, mvstat);
+ s1 *= clauselist_mv_selectivity(root, mvstat,
+ mvclauses_new,
+ mvclauses_conditions,
+ false); /* AND */
+ }
+
+ /*
+ * And now finally remove all the mv-compatible clauses.
+ *
+ * This only repeats the same split as above, but this
+ * time we actually use the result list (and feed it to
+ * the next call).
+ */
+ foreach (s, solution)
+ {
+ /* clauses compatible with multi-variate stats */
+ List *mvclauses = NIL;
+
+ MVStatisticInfo *mvstat = (MVStatisticInfo *)lfirst(s);
+
+ /* split the list into regular and mv-clauses */
+ clauses = clauselist_mv_split(root, sjinfo, clauses,
+ varRelid, &mvclauses, mvstat,
+ (MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST));
+
+ /*
+ * Add the clauses to the conditions (to be passed
+ * to regular clauses), irrespective of whether they
+ * are used as a condition or a clause here.
+ *
+ * We only keep the remaining clauses (what
+ * clauselist_mv_split returns), so we add each MV
+ * condition exactly once.
+ */
+ conditions_local = list_concat(conditions_local, mvclauses);
}
+
+ /* from now on, work with the 'local' list of conditions */
+ conditions = conditions_local;
}
}
/*
+ * If there's exactly one clause, then no use in trying to match up
+ * pairs, so just go directly to clause_selectivity().
+ */
+ if (list_length(clauses) == 1)
+ return clause_selectivity(root, (Node *) linitial(clauses),
+ varRelid, jointype, sjinfo, conditions);
+
+ /*
* Initial scan over clauses. Anything that doesn't look like a potential
* rangequery clause gets multiplied into s1 and forgotten. Anything that
* does gets inserted into an rqlist entry.
@@ -363,7 +591,8 @@ clauselist_selectivity(PlannerInfo *root,
Selectivity s2;
/* Always compute the selectivity using clause_selectivity */
- s2 = clause_selectivity(root, clause, varRelid, jointype, sjinfo);
+ s2 = clause_selectivity(root, clause, varRelid, jointype, sjinfo,
+ conditions);
/*
* Check for being passed a RestrictInfo.
@@ -522,6 +751,253 @@ clauselist_selectivity(PlannerInfo *root,
}
/*
+ * Similar to clauselist_selectivity(), but for clauses connected by OR.
+ *
+ * That means a few differences:
+ *
+ * - functional dependencies don't apply to OR-clauses
+ *
+ * - we can't add the previous clauses to conditions
+ *
+ * - selectivities are combined using (s1+s2 - s1*s2)
+ * and not as a multiplication (s1*s2)
+ *
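+ * For example, with s1 = 0.3 and s2 = 0.5 the OR formula
+ * gives 0.3 + 0.5 - 0.3 * 0.5 = 0.65, accounting for rows
+ * matched by both clauses.
+ *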
+ * Another way to evaluate this might be turning
+ *
+ * (a OR b OR c)
+ *
+ * into
+ *
+ * NOT ((NOT a) AND (NOT b) AND (NOT c))
+ *
+ * and computing selectivity of that using clauselist_selectivity().
+ * That would allow (a) using the clauselist_selectivity directly and
+ * (b) using the previous clauses as conditions. Not sure if it's
+ * worth the additional complexity, though.
+ */
+static Selectivity
+clauselist_selectivity_or(PlannerInfo *root,
+ List *clauses,
+ int varRelid,
+ JoinType jointype,
+ SpecialJoinInfo *sjinfo,
+ List *conditions)
+{
+ Selectivity s1 = 0.0;
+ ListCell *l;
+
+ /* processing mv stats */
+ Index relid = InvalidOid;
+
+ /* attributes in mv-compatible clauses */
+ Bitmapset *mvattnums = NULL;
+ List *stats = NIL;
+
+ /* use clauses (not conditions), because those are always non-empty */
+ stats = find_stats(root, clauses, varRelid, &relid);
+
+ /* OR-clauses do not work with functional dependencies */
+ if (has_stats(stats, MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST))
+ {
+ /*
+ * Recollect attributes from mv-compatible clauses (maybe we've
+ * removed so many clauses we have a single mv-compatible attnum).
+ * From now on we're only interested in clauses compatible
+ * with MCV lists or histograms.
+ */
+ mvattnums = collect_mv_attnums(root, clauses, varRelid, &relid, sjinfo,
+ (MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST));
+
+ /*
+ * If there still are at least two columns, we'll try to select
+ * a suitable multivariate stats.
+ */
+ if (bms_num_members(mvattnums) >= 2)
+ {
+ int k;
+ ListCell *s;
+
+ List *solution
+ = choose_mv_statistics(root, stats,
+ clauses, conditions,
+ varRelid, sjinfo);
+
+ /* we have a good solution stats */
+ foreach (s, solution)
+ {
+ Selectivity s2;
+ MVStatisticInfo *mvstat = (MVStatisticInfo *)lfirst(s);
+
+ /* clauses compatible with multi-variate stats */
+ List *mvclauses = NIL;
+ List *mvclauses_new = NIL;
+ List *mvclauses_conditions = NIL;
+ Bitmapset *stat_attnums = NULL;
+
+ /* build attnum bitmapset for this statistics */
+ for (k = 0; k < mvstat->stakeys->dim1; k++)
+ stat_attnums = bms_add_member(stat_attnums,
+ mvstat->stakeys->values[k]);
+
+ /*
+ * Append the compatible conditions (passed from above)
+ * to mvclauses_conditions.
+ */
+ foreach (l, conditions)
+ {
+ Node *c = (Node*)lfirst(l);
+ Bitmapset *tmp = clause_mv_get_attnums(root, c);
+
+ if (bms_is_subset(tmp, stat_attnums))
+ mvclauses_conditions
+ = lappend(mvclauses_conditions, c);
+
+ bms_free(tmp);
+ }
+
+ /* split the clauselist into regular and mv-clauses
+ *
+ * We keep the list of clauses (we don't remove the
+ * clauses yet, because we want to use the clauses
+ * as conditions of other clauses).
+ *
+ * FIXME Do this only once, i.e. filter the clauses
+ * once (selecting clauses covered by at least
+ * one statistics) and then convert them into
+ * smaller per-statistics lists of conditions
+ * and estimated clauses.
+ */
+ clauselist_mv_split(root, sjinfo, clauses,
+ varRelid, &mvclauses, mvstat,
+ (MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST));
+
+ /*
+ * We've chosen the statistics to match the clauses, so
+ * each statistics from the solution should have at least
+ * one new clause (not covered by the previous stats).
+ */
+ Assert(mvclauses != NIL);
+
+ /*
+ * Mvclauses now contains only clauses compatible
+ * with the currently selected stats, but we have to
+ * split that into conditions (already matched by
+ * the previous stats), and the new clauses we need
+ * to estimate using this stats.
+ *
+ * XXX We'll only use the new clauses, but maybe we
+ * should use the conditions too, somehow. We can't
+ * use that directly in conditional probability, but
+ * maybe we might use them in a different way?
+ *
+ * If we have a clause (a OR b OR c), then knowing
+ * that 'a' is TRUE means (b OR c) can't make the
+ * whole clause FALSE.
+ *
+ * This is pretty much what
+ *
+ * (a OR b) == NOT ((NOT a) AND (NOT b))
+ *
+ * implies.
+ */
+ foreach (l, mvclauses)
+ {
+ ListCell *p;
+ bool covered = false;
+ Node *clause = (Node *) lfirst(l);
+ Bitmapset *clause_attnums = clause_mv_get_attnums(root, clause);
+
+ /*
+ * If already covered by previous stats, skip it here
+ * (see the XXX above - for OR clauses we currently use
+ * only the new clauses, not the covered ones).
+ *
+ * TODO Maybe this could be relaxed a bit? Because
+ * with complex and/or clauses, this might
+ * mean no statistics actually covers such
+ * complex clause.
+ */
+ foreach (p, solution)
+ {
+ int k;
+ Bitmapset *stat_attnums = NULL;
+
+ MVStatisticInfo *prev_stat
+ = (MVStatisticInfo *)lfirst(p);
+
+ /* break if we've run into the current statistics */
+ if (prev_stat == mvstat)
+ break;
+
+ for (k = 0; k < prev_stat->stakeys->dim1; k++)
+ stat_attnums = bms_add_member(stat_attnums,
+ prev_stat->stakeys->values[k]);
+
+ covered = bms_is_subset(clause_attnums, stat_attnums);
+
+ bms_free(stat_attnums);
+
+ if (covered)
+ break;
+ }
+
+ if (! covered)
+ mvclauses_new = lappend(mvclauses_new, clause);
+ }
+
+ /*
+ * We need at least one new clause (not just conditions).
+ */
+ Assert(mvclauses_new != NIL);
+
+ /* compute the multivariate stats */
+ s2 = clauselist_mv_selectivity(root, mvstat,
+ mvclauses_new,
+ mvclauses_conditions,
+ true); /* OR */
+
+ s1 = s1 + s2 - s1 * s2;
+ }
+
+ /*
+ * And now finally remove all the mv-compatible clauses.
+ *
+ * This only repeats the same split as above, but this
+ * time we actually use the result list (and feed it to
+ * the next call).
+ */
+ foreach (s, solution)
+ {
+ /* clauses compatible with multi-variate stats */
+ List *mvclauses = NIL;
+
+ MVStatisticInfo *mvstat = (MVStatisticInfo *)lfirst(s);
+
+ /* split the list into regular and mv-clauses */
+ clauses = clauselist_mv_split(root, sjinfo, clauses,
+ varRelid, &mvclauses, mvstat,
+ (MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST));
+ }
+ }
+ }
+
+ /*
+ * Handle the remaining clauses (either using regular statistics,
+ * or by multivariate stats at the next level).
+ */
+ foreach(l, clauses)
+ {
+ Selectivity s2 = clause_selectivity(root,
+ (Node *) lfirst(l),
+ varRelid,
+ jointype,
+ sjinfo,
+ conditions);
+ s1 = s1 + s2 - s1 * s2;
+ }
+
+ return s1;
+}
+
+/*
* addRangeClause --- add a new range clause for clauselist_selectivity
*
* Here is where we try to match up pairs of range-query clauses
@@ -728,7 +1204,8 @@ clause_selectivity(PlannerInfo *root,
Node *clause,
int varRelid,
JoinType jointype,
- SpecialJoinInfo *sjinfo)
+ SpecialJoinInfo *sjinfo,
+ List *conditions)
{
Selectivity s1 = 0.5; /* default for any unhandled clause type */
RestrictInfo *rinfo = NULL;
@@ -858,7 +1335,8 @@ clause_selectivity(PlannerInfo *root,
(Node *) get_notclausearg((Expr *) clause),
varRelid,
jointype,
- sjinfo);
+ sjinfo,
+ conditions);
}
else if (and_clause(clause))
{
@@ -867,29 +1345,18 @@ clause_selectivity(PlannerInfo *root,
((BoolExpr *) clause)->args,
varRelid,
jointype,
- sjinfo);
+ sjinfo,
+ conditions);
}
else if (or_clause(clause))
{
- /*
- * Selectivities for an OR clause are computed as s1+s2 - s1*s2 to
- * account for the probable overlap of selected tuple sets.
- *
- * XXX is this too conservative?
- */
- ListCell *arg;
-
- s1 = 0.0;
- foreach(arg, ((BoolExpr *) clause)->args)
- {
- Selectivity s2 = clause_selectivity(root,
- (Node *) lfirst(arg),
- varRelid,
- jointype,
- sjinfo);
-
- s1 = s1 + s2 - s1 * s2;
- }
+ /* just call to clauselist_selectivity_or() */
+ s1 = clauselist_selectivity_or(root,
+ ((BoolExpr *) clause)->args,
+ varRelid,
+ jointype,
+ sjinfo,
+ conditions);
}
else if (is_opclause(clause) || IsA(clause, DistinctExpr))
{
@@ -998,7 +1465,8 @@ clause_selectivity(PlannerInfo *root,
(Node *) ((RelabelType *) clause)->arg,
varRelid,
jointype,
- sjinfo);
+ sjinfo,
+ conditions);
}
else if (IsA(clause, CoerceToDomain))
{
@@ -1007,7 +1475,8 @@ clause_selectivity(PlannerInfo *root,
(Node *) ((CoerceToDomain *) clause)->arg,
varRelid,
jointype,
- sjinfo);
+ sjinfo,
+ conditions);
}
/* Cache the result if possible */
@@ -1120,9 +1589,67 @@ clause_selectivity(PlannerInfo *root,
* them without inspection, which is more expensive). But this
* requires really knowing the per-clause selectivities in advance,
* and that's not what we do now.
+ *
+ * TODO All this is based on the assumption that the statistics represent
+ * the necessary dependencies, i.e. that if two columns are not in
+ * the same statistics, there's no dependency. If that's not the
+ * case, we may get misestimates, just like before. For example
+ * assume we have a table with three columns [a,b,c] with exactly
+ * the same values, and statistics on [a,b] and [b,c]. So something
+ * like this:
+ *
+ * CREATE TABLE test AS SELECT i AS a, i AS b, i AS c
+ * FROM generate_series(1,1000) s(i);
+ *
+ * ALTER TABLE test ADD STATISTICS (mcv) ON (a,b);
+ * ALTER TABLE test ADD STATISTICS (mcv) ON (b,c);
+ *
+ * ANALYZE test;
+ *
+ * EXPLAIN ANALYZE SELECT * FROM test
+ * WHERE (a < 10) AND (b < 20) AND (c < 10);
+ *
+ * The problem here is that the only shared column between the two
+ * statistics is 'b' so the probability will be computed like this
+ *
+ * P[(a < 10) & (b < 20) & (c < 10)]
+ * = P[(a < 10) & (b < 20)] * P[(c < 10) | (a < 10) & (b < 20)]
+ * = P[(a < 10) & (b < 20)] * P[(c < 10) | (b < 20)]
+ *
+ * or like this
+ *
+ * P[(a < 10) & (b < 20) & (c < 10)]
+ * = P[(b < 20) & (c < 10)] * P[(a < 10) | (b < 20) & (c < 10)]
+ * = P[(b < 20) & (c < 10)] * P[(a < 10) | (b < 20)]
+ *
+ * In both cases the conditional probabilities will be evaluated as
+ * 0.5, because they lack the other column (which would make it 1.0).
+ *
+ * Theoretically it might be possible to transfer the dependency,
+ * e.g. by building bitmap for [a,b] and then combine it with [b,c]
+ * by doing something like this:
+ *
+ * 1) build bitmap on [a,b] using [(a<10) & (b < 20)]
+ * 2) for each element in [b,c] check the bitmap
+ *
+ * But that's certainly nontrivial - for example the statistics may
+ * be different (MCV list vs. histogram) and/or the items may not
+ * match (e.g. MCV items or histogram buckets will be built
+ * differently). Also, for one value of 'b' there might be multiple
+ * MCV items (because of the other column values) with different
+ * bitmap values (some will match, some won't) - so it's not exactly
+ * bitmap but a partial match.
+ *
+ * Maybe a hash table with number of matches and mismatches (or
+ * maybe sums of frequencies) would work? The step (2) would then
+ * lookup the values and use that to weight the item somehow.
+ *
+ * Currently the only solution is to build statistics on all three
+ * columns.
*/
static Selectivity
-clauselist_mv_selectivity(PlannerInfo *root, List *clauses, MVStatisticInfo *mvstats)
+clauselist_mv_selectivity(PlannerInfo *root, MVStatisticInfo *mvstats,
+ List *clauses, List *conditions, bool is_or)
{
bool fullmatch = false;
Selectivity s1 = 0.0, s2 = 0.0;
@@ -1140,7 +1667,8 @@ clauselist_mv_selectivity(PlannerInfo *root, List *clauses, MVStatisticInfo *mvs
*/
/* Evaluate the MCV first. */
- s1 = clauselist_mv_selectivity_mcvlist(root, clauses, mvstats,
+ s1 = clauselist_mv_selectivity_mcvlist(root, mvstats,
+ clauses, conditions, is_or,
&fullmatch, &mcv_low);
/*
@@ -1153,7 +1681,8 @@ clauselist_mv_selectivity(PlannerInfo *root, List *clauses, MVStatisticInfo *mvs
/* FIXME if (fullmatch) without matching MCV item, use the mcv_low
* selectivity as upper bound */
- s2 = clauselist_mv_selectivity_histogram(root, clauses, mvstats);
+ s2 = clauselist_mv_selectivity_histogram(root, mvstats,
+ clauses, conditions, is_or);
/* TODO clamp to <= 1.0 (or more strictly, when possible) */
return s1 + s2;
@@ -1193,8 +1722,7 @@ collect_mv_attnums(PlannerInfo *root, List *clauses, Oid varRelid,
*/
if (bms_num_members(attnums) <= 1)
{
- if (attnums != NULL)
- pfree(attnums);
+ bms_free(attnums);
attnums = NULL;
*relid = InvalidOid;
}
@@ -1203,123 +1731,852 @@ collect_mv_attnums(PlannerInfo *root, List *clauses, Oid varRelid,
}
/*
- * We're looking for statistics matching at least 2 attributes,
- * referenced in the clauses compatible with multivariate statistics.
- * The current selection criteria is very simple - we choose the
- * statistics referencing the most attributes.
+ * Selects the best combination of multivariate statistics, in an
+ * exhaustive way, where 'best' means:
+ *
+ * (a) covering the most attributes (referenced by clauses)
+ * (b) using the least number of multivariate stats
+ * (c) using the most conditions to exploit dependency
+ *
+ * There may be other optimality criteria, not considered in the initial
+ * implementation (more on that in the 'weaknesses' section below).
+ *
+ * This pretty much splits the probability of clauses (aka selectivity)
+ * into a sequence of conditional probabilities, like this
+ *
+ * P(A,B,C,D) = P(A,B) * P(C|A,B) * P(D|A,B,C)
+ *
+ * and removing the attributes not referenced by the existing stats,
+ * under the assumption that there's no dependency (otherwise the DBA
+ * would create the stats).
+ *
+ * The last criterion means that when we have the choice to compute like
+ * this
+ *
+ * P(A,B,C,D) = P(A,B,C) * P(D|B,C)
*
- * If there are multiple statistics referencing the same number of
- * columns (from the clauses), the one with less source columns
- * (as listed in the ADD STATISTICS when creating the statistics) wins.
- * Other wise the first one wins.
+ * or like this
*
- * This is a very simple criteria, and has several weaknesses:
+ * P(A,B,C,D) = P(A,B,C) * P(D|C)
*
- * (a) does not consider the accuracy of the statistics
+ * we should use the first option, as that exploits more dependencies.
*
- * If there are two histograms built on the same set of columns,
- * but one has 100 buckets and the other one has 1000 buckets (thus
- * likely providing better estimates), this is not currently
- * considered.
+ * The order of statistics in the solution implicitly determines the
+ * order of estimation of clauses, because as we apply a statistics,
+ * we always use it to estimate all the clauses covered by it (and
+ * then we use those clauses as conditions for the next statistics).
*
- * (b) does not consider the type of statistics
+ * Don't call this directly but through choose_mv_statistics().
*
- * If there are three statistics - one containing just a MCV list,
- * another one with just a histogram and a third one with both,
- * this is not considered.
*
- * (c) does not consider the number of clauses
+ * Algorithm
+ * ---------
+ * The algorithm is a recursive implementation of backtracking, with
+ * maximum 'depth' equal to the number of multi-variate statistics
+ * available on the table.
*
- * As explained, only the number of referenced attributes counts,
- * so if there are multiple clauses on a single attribute, this
- * still counts as a single attribute.
+ * It explores all the possible permutations of the stats.
+ *
+ * Whenever it considers adding the next statistics, the clauses it
+ * matches are divided into 'conditions' (clauses already matched by at
+ * least one previous statistics) and clauses that are estimated.
*
- * (d) does not consider type of condition
+ * Then several checks are performed:
*
- * Some clauses may work better with some statistics - for example
- * equality clauses probably work better with MCV lists than with
- * histograms. But IS [NOT] NULL conditions may often work better
- * with histograms (thanks to NULL-buckets).
+ * (a) The statistics covers at least 2 columns, referenced in the
+ * estimated clauses (otherwise multi-variate stats are useless).
*
- * So for example with five WHERE conditions
+ * (b) The statistics covers at least 1 new column, i.e. column not
+ * referenced by the already used stats (and the new column has
+ * to be referenced by the clauses, of course). Otherwise the
+ * statistics would not add any new information.
*
- * WHERE (a = 1) AND (b = 1) AND (c = 1) AND (d = 1) AND (e = 1)
+ * There are some other sanity checks (e.g. that the stats must not be
+ * used twice etc.).
*
- * and statistics on (a,b), (a,b,e) and (a,b,c,d), the last one will be
- * selected as it references the most columns.
+ * Finally the new solution is compared to the currently best one, and
+ * if it's considered better, it's used instead.
*
- * Once we have selected the multivariate statistics, we split the list
- * of clauses into two parts - conditions that are compatible with the
- * selected stats, and conditions are estimated using simple statistics.
*
- * From the example above, conditions
+ * Weaknesses
+ * ----------
+ * The current implementation uses somewhat simplistic optimality
+ * criteria, suffering from the following weaknesses.
*
- * (a = 1) AND (b = 1) AND (c = 1) AND (d = 1)
+ * (a) There may be multiple solutions with the same number of covered
+ * attributes and number of statistics (e.g. the same solution but
+ * with statistics in a different order). It's unclear which solution
+ * is the best one - in a sense all of them are equal.
*
- * will be estimated using the multivariate statistics (a,b,c,d) while
- * the last condition (e = 1) will get estimated using the regular ones.
+ * TODO It might be possible to compute estimate for each of those
+ * solutions, and then combine them to get the final estimate
+ * (e.g. by using average or median).
*
- * There are various alternative selection criteria (e.g. counting
- * conditions instead of just referenced attributes), but eventually
- * the best option should be to combine multiple statistics. But that's
- * much harder to do correctly.
+ * (b) Does not consider that some types of stats are a better match for
+ * some types of clauses (e.g. a MCV list is a better match for
+ * equality clauses than a histogram).
*
- * TODO Select multiple statistics and combine them when computing
- * the estimate.
+ * XXX Maybe MCV is almost always better / more accurate?
+ *
+ * But maybe this is pointless - generally, each column is either
+ * a label (it's not important whether because of the data type or
+ * how it's used), or a value with ordering that makes sense. So
+ * either a MCV list is more appropriate (labels) or a histogram
+ * (values with orderings).
+ *
+ * Not sure what to do with statistics mixing columns of
+ * both types - maybe it'd be better to invent a new type of stats
+ * combining MCV list and histogram (keeping a small histogram for
+ * each MCV item, and a separate histogram for values not on the
+ * MCV list). But that's not implemented at this moment.
+ *
+ * TODO The algorithm should probably count number of Vars (not just
+ * attnums) when computing the 'score' of each solution. Computing
+ * the ratio of (num of all vars) / (num of condition vars) as a
+ * measure of how well the solution uses conditions might be
+ * useful.
+ */
+static void
+choose_mv_statistics_exhaustive(PlannerInfo *root, int step,
+ int nmvstats, MVStatisticInfo *mvstats, Bitmapset ** stats_attnums,
+ int nclauses, Node ** clauses, Bitmapset ** clauses_attnums,
+ int nconditions, Node ** conditions, Bitmapset ** conditions_attnums,
+ bool *cover_map, bool *condition_map, int *ruled_out,
+ mv_solution_t *current, mv_solution_t **best)
+{
+ int i, j;
+
+ Assert(best != NULL);
+ Assert((step == 0 && current == NULL) || (step > 0 && current != NULL));
+
+ CHECK_FOR_INTERRUPTS();
+
+ if (current == NULL)
+ {
+ current = (mv_solution_t*)palloc0(sizeof(mv_solution_t));
+ current->stats = (int*)palloc0(sizeof(int)*nmvstats);
+ current->nstats = 0;
+ current->nclauses = 0;
+ current->nconditions = 0;
+ }
+
+ /*
+ * Now try to apply each statistics, matching at least two attributes,
+ * unless it's already used in one of the previous steps.
+ */
+ for (i = 0; i < nmvstats; i++)
+ {
+ int c;
+
+ int ncovered_clauses = 0; /* number of covered clauses */
+ int ncovered_conditions = 0; /* number of covered conditions */
+ int nattnums = 0; /* number of covered attributes */
+
+ Bitmapset *all_attnums = NULL;
+ Bitmapset *new_attnums = NULL;
+
+ /* skip statistics that were already used or eliminated */
+ if (ruled_out[i] != -1)
+ continue;
+
+ /*
+ * See if we have clauses covered by this statistics, but not
+ * yet covered by any of the preceding ones.
+ */
+ for (c = 0; c < nclauses; c++)
+ {
+ bool covered = false;
+ Bitmapset *clause_attnums = clauses_attnums[c];
+ Bitmapset *tmp = NULL;
+
+ /*
+ * If this clause is not covered by this stats, we can't
+ * use the stats to estimate that at all.
+ */
+ if (! cover_map[i * nclauses + c])
+ continue;
+
+ /*
+ * Now we know we'll use this clause - either as a condition
+ * or as a new clause (the estimated one). So let's add the
+ * attributes to the attnums from all the clauses usable with
+ * this statistics.
+ */
+ tmp = bms_union(all_attnums, clause_attnums);
+
+ /* free the old bitmap */
+ bms_free(all_attnums);
+ all_attnums = tmp;
+
+ /* let's see if it's covered by any of the previous stats */
+ for (j = 0; j < step; j++)
+ {
+ /* already covered by the previous stats */
+ if (cover_map[current->stats[j] * nclauses + c])
+ covered = true;
+
+ if (covered)
+ break;
+ }
+
+ /* if already covered, continue with the next clause */
+ if (covered)
+ {
+ ncovered_conditions += 1;
+ continue;
+ }
+
+ /*
+ * OK, this clause is covered by this statistics (and not by
+ * any of the previous ones)
+ */
+ ncovered_clauses += 1;
+
+ /* add the attnums into attnums from 'new clauses' */
+ // new_attnums = bms_union(new_attnums, clause_attnums);
+ }
+
+ /* can't have more new clauses than original clauses */
+ Assert(nclauses >= ncovered_clauses);
+ Assert(ncovered_clauses >= 0); /* mostly paranoia */
+
+ nattnums = bms_num_members(all_attnums);
+
+ /* free all the bitmapsets - we don't need them anymore */
+ bms_free(all_attnums);
+ bms_free(new_attnums);
+
+ all_attnums = NULL;
+ new_attnums = NULL;
+
+ /*
+ * Walk the conditions (passed from above) and see which
+ * of them are covered by this statistics.
+ */
+ for (c = 0; c < nconditions; c++)
+ {
+ Bitmapset *clause_attnums = conditions_attnums[c];
+ Bitmapset *tmp = NULL;
+
+ /*
+ * If this clause is not covered by this stats, we can't
+ * use the stats to estimate that at all.
+ */
+ if (! condition_map[i * nconditions + c])
+ continue;
+
+ /* count this as a condition */
+ ncovered_conditions += 1;
+
+ /*
+ * Now we know we'll use this clause - either as a condition
+ * or as a new clause (the estimated one). So let's add the
+ * attributes to the attnums from all the clauses usable with
+ * this statistics.
+ */
+ tmp = bms_union(all_attnums, clause_attnums);
+
+ /* free the old bitmap */
+ bms_free(all_attnums);
+ all_attnums = tmp;
+ }
+
+ /*
+ * Let's mark the statistics as 'ruled out' - either we'll use
+ * it (and proceed to the next step), or it's incompatible.
+ */
+ ruled_out[i] = step;
+
+ /*
+ * There are no clauses usable with this statistics (not already
+ * covered by some of the previous stats).
+ *
+ * Similarly, if the clauses only use a single attribute, we
+ * can't really use that.
+ */
+ if ((ncovered_clauses == 0) || (nattnums < 2))
+ continue;
+
+ /*
+ * TODO Not sure if it's possible to add a clause referencing
+ * only attributes already covered by previous stats?
+ * Introducing only some new dependency, not a new
+ * attribute. Couldn't come up with an example, though.
+ * Might be worth adding some assert.
+ */
+
+ /*
+ * got a suitable statistics - let's update the current solution,
+ * maybe use it as the best solution
+ */
+ current->nclauses += ncovered_clauses;
+ current->nconditions += ncovered_conditions;
+ current->nstats += 1;
+ current->stats[step] = i;
+
+ /*
+ * We can never cover more clauses, or use more stats that we
+ * actually have at the beginning.
+ */
+ Assert(nclauses >= current->nclauses);
+ Assert(nmvstats >= current->nstats);
+ Assert(step < nmvstats);
+
+ /* we can't get more conditions than clauses and conditions combined
+ *
+ * FIXME This assert does not work because we count the conditions
+ * repeatedly (once for each statistics covering it).
+ */
+ /* Assert((nconditions + nclauses) >= current->nconditions); */
+
+ if (*best == NULL)
+ {
+ *best = (mv_solution_t*)palloc0(sizeof(mv_solution_t));
+ (*best)->stats = (int*)palloc0(sizeof(int)*nmvstats);
+ (*best)->nstats = 0;
+ (*best)->nclauses = 0;
+ (*best)->nconditions = 0;
+ }
+
+ /*
+ * See if it's better than the current 'best' solution, i.e.
+ * whether it covers more clauses, or the same number of
+ * clauses with fewer statistics (see the optimality criteria
+ * at the top of this comment block).
+ */
+ if ((current->nclauses > (*best)->nclauses) ||
+ ((current->nclauses == (*best)->nclauses) &&
+ ((current->nstats < (*best)->nstats))))
+ {
+ (*best)->nstats = current->nstats;
+ (*best)->nclauses = current->nclauses;
+ (*best)->nconditions = current->nconditions;
+ memcpy((*best)->stats, current->stats, nmvstats * sizeof(int));
+ }
+
+ /*
+ * The recursion only makes sense if there are statistics left to
+ * add (otherwise extending the solution is not possible).
+ */
+ if ((step + 1) < nmvstats)
+ choose_mv_statistics_exhaustive(root, step+1,
+ nmvstats, mvstats, stats_attnums,
+ nclauses, clauses, clauses_attnums,
+ nconditions, conditions, conditions_attnums,
+ cover_map, condition_map, ruled_out,
+ current, best);
+
+ /* reset the last step */
+ current->nclauses -= ncovered_clauses;
+ current->nconditions -= ncovered_conditions;
+ current->nstats -= 1;
+ current->stats[step] = 0;
+
+ /* mark the statistics as usable again */
+ ruled_out[i] = -1;
+
+ Assert(current->nclauses >= 0);
+ Assert(current->nstats >= 0);
+ }
+
+ /* reset all statistics as 'incompatible' in this step */
+ for (i = 0; i < nmvstats; i++)
+ if (ruled_out[i] == step)
+ ruled_out[i] = -1;
+
+}
+
+/*
+ * Greedy search for a multivariate solution - a sequence of statistics
+ * covering the clauses. This chooses the "best" statistics at each step,
+ * so the resulting solution may not be the best solution globally, but
+ * this produces the solution in only N steps (where N is the number of
+ * statistics), while the exhaustive approach may have to walk through
+ * ~N! combinations (although some of those are terminated early).
+ *
+ * See the comments at choose_mv_statistics_exhaustive() as this does
+ * the same thing (but in a different way).
+ *
+ * Don't call this directly, but through choose_mv_statistics().
+ *
+ * TODO There are probably other metrics we might use - e.g. using
+ * number of columns (num_cond_columns / num_cov_columns), which
+ * might work better with a mix of simple and complex clauses.
+ *
+ * TODO Also the choice at the very first step should be handled
+ * in a special way, because there will be 0 conditions at that
+ * moment, so there needs to be some other criteria - e.g. using
+ * the simplest (or most complex?) clause might be a good idea.
+ *
+ * TODO We might also select multiple stats using different criteria,
+ * and branch the search. This is however tricky, because if we
+ * choose k statistics at each step, we get k^N branches to
+ * walk through (with N steps). That's not really good with a
+ * large number of stats (yet better than exhaustive search).
+ */
+static void
+choose_mv_statistics_greedy(PlannerInfo *root, int step,
+ int nmvstats, MVStatisticInfo *mvstats, Bitmapset ** stats_attnums,
+ int nclauses, Node ** clauses, Bitmapset ** clauses_attnums,
+ int nconditions, Node ** conditions, Bitmapset ** conditions_attnums,
+ bool *cover_map, bool *condition_map, int *ruled_out,
+ mv_solution_t *current, mv_solution_t **best)
+{
+ int i, j;
+ int best_stat = -1;
+ double gain, max_gain = -1.0;
+
+ /*
+ * Bitmap tracking which clauses are already covered (by the previous
+ * statistics) and may thus serve only as a condition in this step.
+ */
+ bool *covered_clauses = (bool*)palloc0(nclauses * sizeof(bool));
+
+ /*
+ * Number of clauses and columns covered by each statistics - this
+ * includes both conditions and clauses covered by the statistics for
+ * the first time. The number of columns may count some columns
+ * repeatedly - if a column is shared by multiple clauses, it will
+ * be counted once for each clause (covered by the statistics).
+ * So with two clauses [(a=1 OR b=2),(a<2 OR c>1)] the column "a"
+ * will be counted twice (if both clauses are covered).
+ *
+ * The values for ruled-out statistics (that can't be applied) are
+ * not computed, because that'd be pointless.
+ */
+ int *num_cov_clauses = (int*)palloc0(sizeof(int) * nmvstats);
+ int *num_cov_columns = (int*)palloc0(sizeof(int) * nmvstats);
+
+ /*
+ * Same as above, but this only includes clauses that are already
+ * covered by the previous stats (and the current one).
+ */
+ int *num_cond_clauses = (int*)palloc0(sizeof(int) * nmvstats);
+ int *num_cond_columns = (int*)palloc0(sizeof(int) * nmvstats);
+
+ /*
+ * Number of attributes for each clause.
+ *
+ * TODO Might be computed in choose_mv_statistics() and then passed
+ * here, but then the function would not have the same signature
+ * as _exhaustive().
+ */
+ int *attnum_counts = (int*)palloc0(sizeof(int) * nclauses);
+ int *attnum_cond_counts = (int*)palloc0(sizeof(int) * nconditions);
+
+ CHECK_FOR_INTERRUPTS();
+
+ Assert(best != NULL);
+ Assert((step == 0 && current == NULL) || (step > 0 && current != NULL));
+
+ /* compute attributes (columns) for each clause */
+ for (i = 0; i < nclauses; i++)
+ attnum_counts[i] = bms_num_members(clauses_attnums[i]);
+
+ /* compute attributes (columns) for each condition */
+ for (i = 0; i < nconditions; i++)
+ attnum_cond_counts[i] = bms_num_members(conditions_attnums[i]);
+
+ /* see which clauses are already covered at this point (by previous stats) */
+ for (i = 0; i < step; i++)
+ for (j = 0; j < nclauses; j++)
+ covered_clauses[j] |= (cover_map[current->stats[i] * nclauses + j]);
+
+ /* which remaining statistics covers most clauses / uses most conditions? */
+ for (i = 0; i < nmvstats; i++)
+ {
+ Bitmapset *attnums_covered = NULL;
+ Bitmapset *attnums_conditions = NULL;
+
+ /* skip stats that are already ruled out (either used or inapplicable) */
+ if (ruled_out[i] != -1)
+ continue;
+
+ /* count covered clauses and conditions (for the statistics) */
+ for (j = 0; j < nclauses; j++)
+ {
+ if (cover_map[i * nclauses + j])
+ {
+ Bitmapset *attnums_new
+ = bms_union(attnums_covered, clauses_attnums[j]);
+
+ /* get rid of the old bitmap and keep the unified result */
+ bms_free(attnums_covered);
+ attnums_covered = attnums_new;
+
+ num_cov_clauses[i] += 1;
+ num_cov_columns[i] += attnum_counts[j];
+
+ /* is the clause already covered (i.e. a condition)? */
+ if (covered_clauses[j])
+ {
+ num_cond_clauses[i] += 1;
+ num_cond_columns[i] += attnum_counts[j];
+ attnums_new = bms_union(attnums_conditions,
+ clauses_attnums[j]);
+
+ bms_free(attnums_conditions);
+ attnums_conditions = attnums_new;
+ }
+ }
+ }
+
+ /* if all covered clauses are covered by prev stats (thus conditions) */
+ if (num_cov_clauses[i] == num_cond_clauses[i])
+ ruled_out[i] = step;
+
+ /* same if there are no new attributes */
+ else if (bms_num_members(attnums_conditions) == bms_num_members(attnums_covered))
+ ruled_out[i] = step;
+
+ bms_free(attnums_covered);
+ bms_free(attnums_conditions);
+
+ /* if the statistics is inapplicable, try the next one */
+ if (ruled_out[i] != -1)
+ continue;
+
+ /* now let's walk through conditions and count the covered */
+ for (j = 0; j < nconditions; j++)
+ {
+ if (condition_map[i * nconditions + j])
+ {
+ num_cond_clauses[i] += 1;
+ num_cond_columns[i] += attnum_cond_counts[j];
+ }
+ }
+
+ /* otherwise see if this improves the interesting metrics */
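+ /*
+ * The gain is the ratio of condition columns to covered columns
+ * (both counted once per covered clause). E.g. 3 condition columns
+ * against 4 covered columns gives gain 3/4 = 0.75 - most of what
+ * the statistics covers is already known from the conditions.
+ */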
+ gain = num_cond_columns[i] / (double)num_cov_columns[i];
+
+ if (gain > max_gain)
+ {
+ max_gain = gain;
+ best_stat = i;
+ }
+ }
+
+ /*
+ * Have we found a suitable statistics? Add it to the solution and
+ * try next step.
+ */
+ if (best_stat != -1)
+ {
+ /* mark the statistics, so that we skip it in next steps */
+ ruled_out[best_stat] = step;
+
+ /* allocate current solution if necessary */
+ if (current == NULL)
+ {
+ current = (mv_solution_t*)palloc0(sizeof(mv_solution_t));
+ current->stats = (int*)palloc0(sizeof(int)*nmvstats);
+ current->nstats = 0;
+ current->nclauses = 0;
+ current->nconditions = 0;
+ }
+
+ current->nclauses += num_cov_clauses[best_stat];
+ current->nconditions += num_cond_clauses[best_stat];
+ current->stats[step] = best_stat;
+ current->nstats++;
+
+ if (*best == NULL)
+ {
+ (*best) = (mv_solution_t*)palloc0(sizeof(mv_solution_t));
+ (*best)->nstats = current->nstats;
+ (*best)->nclauses = current->nclauses;
+ (*best)->nconditions = current->nconditions;
+
+ (*best)->stats = (int*)palloc0(sizeof(int)*nmvstats);
+ memcpy((*best)->stats, current->stats, nmvstats * sizeof(int));
+ }
+ else
+ {
+ /* see if this is a better solution */
+ double current_gain = (double)current->nconditions / current->nclauses;
+ double best_gain = (double)(*best)->nconditions / (*best)->nclauses;
+
+ if ((current_gain > best_gain) ||
+ ((current_gain == best_gain) && (current->nstats < (*best)->nstats)))
+ {
+ (*best)->nstats = current->nstats;
+ (*best)->nclauses = current->nclauses;
+ (*best)->nconditions = current->nconditions;
+ memcpy((*best)->stats, current->stats, nmvstats * sizeof(int));
+ }
+ }
+
+ /*
+ * The recursion only makes sense if there are statistics left to
+ * add (otherwise extending the solution is not possible).
+ */
+ if ((step + 1) < nmvstats)
+ choose_mv_statistics_greedy(root, step+1,
+ nmvstats, mvstats, stats_attnums,
+ nclauses, clauses, clauses_attnums,
+ nconditions, conditions, conditions_attnums,
+ cover_map, condition_map, ruled_out,
+ current, best);
+
+ /* reset the last step */
+ current->nclauses -= num_cov_clauses[best_stat];
+ current->nconditions -= num_cond_clauses[best_stat];
+ current->nstats -= 1;
+ current->stats[step] = 0;
+
+ /* mark the statistics as usable again */
+ ruled_out[best_stat] = -1;
+ }
+
+ /* reset all statistics eliminated in this step */
+ for (i = 0; i < nmvstats; i++)
+ if (ruled_out[i] == step)
+ ruled_out[i] = -1;
+
+ /* free everything allocated in this step */
+ pfree(covered_clauses);
+ pfree(attnum_counts);
+ pfree(attnum_cond_counts);
+ pfree(num_cov_clauses);
+ pfree(num_cov_columns);
+ pfree(num_cond_clauses);
+ pfree(num_cond_columns);
+}
+
+/*
+ * Chooses the combination of statistics, optimal for estimation of
+ * a particular clause list.
+ *
+ * This only handles the 'preparation' shared by the exhaustive and
+ * greedy implementations (see the previous functions), mostly trying
+ * to reduce the size of the problem (eliminating clauses/statistics
+ * that can't really be used in the solution).
+ *
+ * It also precomputes bitmaps for attributes covered by clauses and
+ * statistics, so that we don't need to do that over and over in the
+ * actual optimizations (as it's both CPU and memory intensive).
*
* TODO This will probably have to consider compatibility of clauses,
* because 'dependencies' will probably work only with equality
* clauses.
+ *
+ * TODO Another way to make the optimization problems smaller might
+ * be splitting the statistics into several disjoint subsets, i.e.
+ * if we can split the graph of statistics (after the elimination)
+ * into multiple components (so that stats in different components
+ * share no attributes), we can do the optimization for each
+ * component separately.
+ *
+ * TODO If we could compute what is a "perfect solution" maybe we could
+ * terminate the search after reaching ~90% of it? Say, if we knew
+ * that we can cover 10 clauses and reuse 8 dependencies, maybe
+ * covering 9 clauses and 7 dependencies would be OK?
*/
-static MVStatisticInfo *
-choose_mv_statistics(List *stats, Bitmapset *attnums)
+static List*
+choose_mv_statistics(PlannerInfo *root, List *stats,
+ List *clauses, List *conditions,
+ Oid varRelid, SpecialJoinInfo *sjinfo)
{
int i;
- ListCell *lc;
+ mv_solution_t *best = NULL;
+ List *result = NIL;
+
+ int nmvstats;
+ MVStatisticInfo *mvstats;
+
+ /* we only work with MCV lists and histograms here */
+ int type = (MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST);
+
+ bool *clause_cover_map = NULL,
+ *condition_cover_map = NULL;
+ int *ruled_out = NULL;
+
+ /* build bitmapsets for all stats and clauses */
+ Bitmapset **stats_attnums;
+ Bitmapset **clauses_attnums;
+ Bitmapset **conditions_attnums;
- MVStatisticInfo *choice = NULL;
+ int nclauses, nconditions;
+ Node ** clauses_array;
+ Node ** conditions_array;
- int current_matches = 1; /* goal #1: maximize */
- int current_dims = (MVSTATS_MAX_DIMENSIONS+1); /* goal #2: minimize */
+ /* copy lists, so that we can free them during elimination easily */
+ clauses = list_copy(clauses);
+ conditions = list_copy(conditions);
+ stats = list_copy(stats);
/*
- * Walk through the statistics (simple array with nmvstats elements)
- * and for each one count the referenced attributes (encoded in
- * the 'attnums' bitmap).
+ * Reduce the optimization problem size as much as possible.
+ *
+ * Eliminate clauses and conditions not covered by any statistics,
+ * or statistics not matching at least two attributes (one of them
+ * has to be in a regular clause).
+ *
+ * It's possible that removing a statistics in one iteration
+ * eliminates a clause in the next one, so we'll repeat this until
+ * an iteration eliminates no clauses/stats.
+ *
+ * This can only happen after eliminating a statistics - clauses are
+ * eliminated first, so statistics always reflect that.
*/
- foreach (lc, stats)
+ while (true)
{
- MVStatisticInfo *info = (MVStatisticInfo *)lfirst(lc);
+ List *tmp;
+
+ Bitmapset *compatible_attnums = NULL;
+ Bitmapset *condition_attnums = NULL;
+ Bitmapset *all_attnums = NULL;
+
+ /*
+ * Clauses
+ *
+ * Walk through clauses and keep only those covered by at least
+ * one of the statistics we still have. We'll also keep info
+ * about attnums in clauses (without conditions) so that we can
+ * ignore stats covering just conditions (which is pointless).
+ */
+ tmp = filter_clauses(root, varRelid, sjinfo, type,
+ stats, clauses, &compatible_attnums);
+
+ /* discard the original list */
+ list_free(clauses);
+ clauses = tmp;
+
+ /*
+ * Conditions
+ *
+ * Walk through conditions and keep only those covered by at least
+ * one of the statistics we still have. Also, collect bitmap of
+ * attributes so that we can make sure we add at least one new
+ * attribute (by comparing with clauses).
+ */
+ if (conditions != NIL)
+ {
+ tmp = filter_clauses(root, varRelid, sjinfo, type,
+ stats, conditions, &condition_attnums);
+
+ /* discard the original list */
+ list_free(conditions);
+ conditions = tmp;
+ }
+
+ /* get a union of attnums (from conditions and new clauses) */
+ all_attnums = bms_union(compatible_attnums, condition_attnums);
+
+ /*
+ * Statistics
+ *
+ * Walk through statistics and only keep those covering at least
+ * one new attribute (excluding conditions) and at least two
+ * attributes in clauses and conditions combined.
+ */
+ tmp = filter_stats(stats, compatible_attnums, all_attnums);
+
+ /* if we've not eliminated anything, terminate */
+ if (list_length(stats) == list_length(tmp))
+ break;
+
+ /* work only with filtered statistics from now */
+ list_free(stats);
+ stats = tmp;
+ }
+
+ /* only do the optimization if we have clauses/statistics */
+ if ((list_length(stats) == 0) || (list_length(clauses) == 0))
+ return NIL;
+
+ /* remove redundant stats (stats covered by another stats) */
+ stats = filter_redundant_stats(stats, clauses, conditions);
+
+ /*
+ * TODO We should sort the stats to make the order deterministic,
+ * otherwise we may get different estimates on different
+ * executions - if there are multiple "equally good" solutions,
+ * we'll keep the first solution we see.
+ *
+ * Sorting by OID probably is not the right solution though,
+ * because we'd like it to be somehow reproducible,
+ * irrespective of the order of ADD STATISTICS commands.
+ * So maybe statkeys?
+ */
+ mvstats = make_stats_array(stats, &nmvstats);
+ stats_attnums = make_stats_attnums(mvstats, nmvstats);
+
+ /* collect clauses and bitmaps of their attnums */
+ clauses_array = make_clauses_array(clauses, &nclauses);
+ clauses_attnums = make_clauses_attnums(root, varRelid, sjinfo, type,
+ clauses_array, nclauses);
+
+ /* collect conditions and bitmaps of their attnums */
+ conditions_array = make_clauses_array(conditions, &nconditions);
+ conditions_attnums = make_clauses_attnums(root, varRelid, sjinfo, type,
+ conditions_array, nconditions);
+
+ /*
+ * Build bitmaps with info about which clauses/conditions are
+ * covered by each statistics (so that we don't need to call the
+ * bms_is_subset over and over again).
+ */
+ clause_cover_map = make_cover_map(stats_attnums, nmvstats,
+ clauses_attnums, nclauses);
+
+ condition_cover_map = make_cover_map(stats_attnums, nmvstats,
+ conditions_attnums, nconditions);
+
+ ruled_out = (int*)palloc0(nmvstats * sizeof(int));
+
+ /* no stats are ruled out by default */
+ for (i = 0; i < nmvstats; i++)
+ ruled_out[i] = -1;
+
+ /* do the optimization itself */
+ if (mvstat_search_type == MVSTAT_SEARCH_EXHAUSTIVE)
+ choose_mv_statistics_exhaustive(root, 0,
+ nmvstats, mvstats, stats_attnums,
+ nclauses, clauses_array, clauses_attnums,
+ nconditions, conditions_array, conditions_attnums,
+ clause_cover_map, condition_cover_map,
+ ruled_out, NULL, &best);
+ else
+ choose_mv_statistics_greedy(root, 0,
+ nmvstats, mvstats, stats_attnums,
+ nclauses, clauses_array, clauses_attnums,
+ nconditions, conditions_array, conditions_attnums,
+ clause_cover_map, condition_cover_map,
+ ruled_out, NULL, &best);
+
+ /* create a list of statistics from the array */
+ if (best != NULL)
+ {
+ for (i = 0; i < best->nstats; i++)
+ {
+ MVStatisticInfo *info = makeNode(MVStatisticInfo);
+ memcpy(info, &mvstats[best->stats[i]], sizeof(MVStatisticInfo));
+ result = lappend(result, info);
+ }
+ pfree(best);
+ }
- /* columns matching this statistics */
- int matches = 0;
+ /* cleanup (maybe leave it up to the memory context?) */
+ for (i = 0; i < nmvstats; i++)
+ bms_free(stats_attnums[i]);
- int2vector * attrs = info->stakeys;
- int numattrs = attrs->dim1;
+ for (i = 0; i < nclauses; i++)
+ bms_free(clauses_attnums[i]);
- /* skip dependencies-only stats */
- if (! (info->mcv_built || info->hist_built))
- continue;
+ for (i = 0; i < nconditions; i++)
+ bms_free(conditions_attnums[i]);
- /* count columns covered by the histogram */
- for (i = 0; i < numattrs; i++)
- if (bms_is_member(attrs->values[i], attnums))
- matches++;
+ pfree(stats_attnums);
+ pfree(clauses_attnums);
+ pfree(conditions_attnums);
- /*
- * Use this statistics when it improves the number of matches or
- * when it matches the same number of attributes but is smaller.
- */
- if ((matches > current_matches) ||
- ((matches == current_matches) && (current_dims > numattrs)))
- {
- choice = info;
- current_matches = matches;
- current_dims = numattrs;
- }
- }
+ pfree(clauses_array);
+ pfree(conditions_array);
+ pfree(clause_cover_map);
+ pfree(condition_cover_map);
+ pfree(ruled_out);
+ pfree(mvstats);
- return choice;
+ list_free(clauses);
+ list_free(conditions);
+ list_free(stats);
+
+ return result;
}
@@ -1589,6 +2846,51 @@ clause_is_mv_compatible(PlannerInfo *root, Node *clause, Oid varRelid,
return false;
}
+
+static Bitmapset *
+clause_mv_get_attnums(PlannerInfo *root, Node *clause)
+{
+ Bitmapset * attnums = NULL;
+
+ /* Extract clause from restrict info, if needed. */
+ if (IsA(clause, RestrictInfo))
+ clause = (Node*)((RestrictInfo*)clause)->clause;
+
+ /*
+ * Only simple opclauses and IS NULL tests are compatible with
+ * multivariate stats at this point.
+ */
+ if ((is_opclause(clause))
+ && (list_length(((OpExpr *) clause)->args) == 2))
+ {
+ OpExpr *expr = (OpExpr *) clause;
+
+ if (IsA(linitial(expr->args), Var))
+ attnums = bms_add_member(attnums,
+ ((Var*)linitial(expr->args))->varattno);
+ else
+ attnums = bms_add_member(attnums,
+ ((Var*)lsecond(expr->args))->varattno);
+ }
+ else if (IsA(clause, NullTest)
+ && IsA(((NullTest*)clause)->arg, Var))
+ {
+ attnums = bms_add_member(attnums,
+ ((Var*)((NullTest*)clause)->arg)->varattno);
+ }
+ else if (or_clause(clause) || and_clause(clause))
+ {
+ ListCell *l;
+ foreach (l, ((BoolExpr*)clause)->args)
+ {
+ attnums = bms_join(attnums,
+ clause_mv_get_attnums(root, (Node*)lfirst(l)));
+ }
+ }
+
+ return attnums;
+}
+
/*
* Performs reduction of clauses using functional dependencies, i.e.
* removes clauses that are considered redundant. It simply walks
@@ -2240,22 +3542,26 @@ get_varattnos(Node * node, Index relid)
* as the clauses are processed (and skip items that are 'match').
*/
static Selectivity
-clauselist_mv_selectivity_mcvlist(PlannerInfo *root, List *clauses,
- MVStatisticInfo *mvstats, bool *fullmatch,
- Selectivity *lowsel)
+clauselist_mv_selectivity_mcvlist(PlannerInfo *root, MVStatisticInfo *mvstats,
+ List *clauses, List *conditions, bool is_or,
+ bool *fullmatch, Selectivity *lowsel)
{
int i;
Selectivity s = 0.0;
+ Selectivity t = 0.0;
Selectivity u = 0.0;
MCVList mcvlist = NULL;
+
int nmatches = 0;
+ int nconditions = 0;
/* match/mismatch bitmap for each MCV item */
char * matches = NULL;
+ char * condition_matches = NULL;
Assert(clauses != NIL);
- Assert(list_length(clauses) >= 2);
+ Assert(list_length(clauses) >= 1);
/* there's no MCV list built yet */
if (! mvstats->mcv_built)
@@ -2266,32 +3572,85 @@ clauselist_mv_selectivity_mcvlist(PlannerInfo *root, List *clauses,
Assert(mcvlist != NULL);
Assert(mcvlist->nitems > 0);
- /* by default all the MCV items match the clauses fully */
- matches = palloc0(sizeof(char) * mcvlist->nitems);
- memset(matches, MVSTATS_MATCH_FULL, sizeof(char)*mcvlist->nitems);
-
/* number of matching MCV items */
nmatches = mcvlist->nitems;
+ nconditions = mcvlist->nitems;
+
+ /*
+ * Bitmap of MCV item matches (mismatch, partial, full).
+ *
+ * For AND clauses all items match initially (and we'll eliminate them).
+ * For OR clauses no items match initially (and we'll add them).
+ *
+ * We only need to do the memset for AND clauses (for OR clauses
+ * it's already set correctly by the palloc0).
+ */
+ matches = palloc0(sizeof(char) * nmatches);
+ if (! is_or) /* AND-clause */
+ memset(matches, MVSTATS_MATCH_FULL, sizeof(char)*nmatches);
+
+ /* Conditions are treated as an AND clause, so everything matches by default. */
+ condition_matches = palloc0(sizeof(char) * nconditions);
+ memset(condition_matches, MVSTATS_MATCH_FULL, sizeof(char)*nconditions);
+
+ /*
+ * build the match bitmap for the conditions (conditions are always
+ * connected by AND)
+ */
+ if (conditions != NIL)
+ nconditions = update_match_bitmap_mcvlist(root, conditions,
+ mvstats->stakeys, mcvlist,
+ nconditions, condition_matches,
+ lowsel, fullmatch, false);
+
+ /*
+ * build the match bitmap for the estimated clauses
+ *
+ * TODO This evaluates the clauses for all MCV items, even those
+ * ruled out by the conditions. The final result should be the
+ * same, but skipping them might be faster.
+ */
nmatches = update_match_bitmap_mcvlist(root, clauses,
mvstats->stakeys, mcvlist,
- nmatches, matches,
- lowsel, fullmatch, false);
+ ((is_or) ? 0 : nmatches), matches,
+ lowsel, fullmatch, is_or);
/* sum frequencies for all the matching MCV items */
for (i = 0; i < mcvlist->nitems; i++)
{
- /* used to 'scale' for MCV lists not covering all tuples */
+ /*
+ * Find out what part of the data is covered by the MCV list,
+ * so that we can 'scale' the selectivity properly (e.g. when
+ * only 50% of the sample items got into the MCV, and the rest
+ * is either in a histogram, or not covered by stats).
+ *
+ * TODO This might be handled by keeping a global "frequency"
+ * for the whole list, which might save us a bit of time
+ * spent on accessing the not-matching part of the MCV list.
+ * Although it's likely in a cache, so it's very fast.
+ */
u += mcvlist->items[i]->frequency;
+ /* skip MCV items not matching the conditions */
+ if (condition_matches[i] == MVSTATS_MATCH_NONE)
+ continue;
+
if (matches[i] != MVSTATS_MATCH_NONE)
s += mcvlist->items[i]->frequency;
+
+ t += mcvlist->items[i]->frequency;
}
pfree(matches);
+ pfree(condition_matches);
pfree(mcvlist);
- return s*u;
+ /* no condition matches */
+ if (t == 0.0)
+ return (Selectivity)0.0;
+
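+ /*
+ * s/t is the conditional selectivity of the clauses, given the
+ * conditions (within the MCV list), and u scales the result to
+ * the fraction of data the MCV list actually covers.
+ */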
+ return (s / t) * u;
}
/*
@@ -2589,38 +3948,29 @@ update_match_bitmap_mcvlist(PlannerInfo *root, List *clauses,
/* AND/OR clause, with all clauses compatible with the selected MV stat */
int i;
- BoolExpr *orclause = ((BoolExpr*)clause);
- List *orclauses = orclause->args;
+ List *tmp_clauses = ((BoolExpr*)clause)->args;
/* match/mismatch bitmap for each MCV item */
- int or_nmatches = 0;
- char * or_matches = NULL;
+ int tmp_nmatches = 0;
+ char * tmp_matches = NULL;
- Assert(orclauses != NIL);
- Assert(list_length(orclauses) >= 2);
+ Assert(tmp_clauses != NIL);
+ Assert(list_length(tmp_clauses) >= 2);
/* number of matching MCV items */
- or_nmatches = mcvlist->nitems;
+ tmp_nmatches = (or_clause(clause)) ? 0 : mcvlist->nitems;
/* by default none of the MCV items matches the clauses */
- or_matches = palloc0(sizeof(char) * or_nmatches);
+ tmp_matches = palloc0(sizeof(char) * mcvlist->nitems);
- if (or_clause(clause))
- {
- /* OR clauses assume nothing matches, initially */
- memset(or_matches, MVSTATS_MATCH_NONE, sizeof(char)*or_nmatches);
- or_nmatches = 0;
- }
- else
- {
- /* AND clauses assume nothing matches, initially */
- memset(or_matches, MVSTATS_MATCH_FULL, sizeof(char)*or_nmatches);
- }
+ /* AND clauses assume everything matches, initially */
+ if (! or_clause(clause))
+ memset(tmp_matches, MVSTATS_MATCH_FULL, sizeof(char)*mcvlist->nitems);
/* build the match bitmap for the OR-clauses */
- or_nmatches = update_match_bitmap_mcvlist(root, orclauses,
+ tmp_nmatches = update_match_bitmap_mcvlist(root, tmp_clauses,
stakeys, mcvlist,
- or_nmatches, or_matches,
+ tmp_nmatches, tmp_matches,
lowsel, fullmatch, or_clause(clause));
/* merge the bitmap into the existing one*/
@@ -2632,16 +3982,14 @@ update_match_bitmap_mcvlist(PlannerInfo *root, List *clauses,
*
* FIXME this does not decrease the number of matches
*/
- UPDATE_RESULT(matches[i], or_matches[i], is_or);
+ UPDATE_RESULT(matches[i], tmp_matches[i], is_or);
}
- pfree(or_matches);
+ pfree(tmp_matches);
}
else
- {
elog(ERROR, "unknown clause type: %d", clause->type);
- }
}
/*
@@ -2699,15 +4047,18 @@ update_match_bitmap_mcvlist(PlannerInfo *root, List *clauses,
* this is not uncommon, but for histograms it's not that clear.
*/
static Selectivity
-clauselist_mv_selectivity_histogram(PlannerInfo *root, List *clauses,
- MVStatisticInfo *mvstats)
+clauselist_mv_selectivity_histogram(PlannerInfo *root, MVStatisticInfo *mvstats,
+ List *clauses, List *conditions, bool is_or)
{
int i;
Selectivity s = 0.0;
+ Selectivity t = 0.0;
Selectivity u = 0.0;
int nmatches = 0;
+ int nconditions = 0;
char *matches = NULL;
+ char *condition_matches = NULL;
MVSerializedHistogram mvhist = NULL;
@@ -2720,25 +4071,52 @@ clauselist_mv_selectivity_histogram(PlannerInfo *root, List *clauses,
Assert (mvhist != NULL);
Assert (clauses != NIL);
- Assert (list_length(clauses) >= 2);
+ Assert (list_length(clauses) >= 1);
+
+ nmatches = mvhist->nbuckets;
+ nconditions = mvhist->nbuckets;
/*
- * Bitmap of bucket matches (mismatch, partial, full). by default
- * all buckets fully match (and we'll eliminate them).
+ * Bitmap of bucket matches (mismatch, partial, full).
+ *
+ * For AND clauses all buckets match (and we'll eliminate them).
+ * For OR clauses no buckets match (and we'll add them).
+ *
+ * We only need to do the memset for AND clauses (for OR clauses
+ * it's already set correctly by the palloc0).
*/
- matches = palloc0(sizeof(char) * mvhist->nbuckets);
- memset(matches, MVSTATS_MATCH_FULL, sizeof(char)*mvhist->nbuckets);
+ matches = palloc0(sizeof(char) * nmatches);
- nmatches = mvhist->nbuckets;
+ if (! is_or) /* AND-clause */
+ memset(matches, MVSTATS_MATCH_FULL, sizeof(char)*nmatches);
+
+ /* Conditions are treated as an AND clause, so everything matches by default. */
+ condition_matches = palloc0(sizeof(char)*nconditions);
+ memset(condition_matches, MVSTATS_MATCH_FULL, sizeof(char)*nconditions);
- /* build the match bitmap */
+ /* build the match bitmap for the conditions */
+ if (conditions != NIL)
+ update_match_bitmap_histogram(root, conditions,
+ mvstats->stakeys, mvhist,
+ nconditions, condition_matches, is_or);
+
+ /*
+ * build the match bitmap for the estimated clauses
+ *
+ * TODO This evaluates the clauses for all buckets, even those
+ * ruled out by the conditions. The final result should be
+ * the same, but skipping them might be faster.
+ */
update_match_bitmap_histogram(root, clauses,
mvstats->stakeys, mvhist,
- nmatches, matches, false);
+ ((is_or) ? 0 : nmatches), matches,
+ is_or);
/* now, walk through the buckets and sum the selectivities */
for (i = 0; i < mvhist->nbuckets; i++)
{
+ float coeff = 1.0;
+
/*
* Find out what part of the data is covered by the histogram,
* so that we can 'scale' the selectivity properly (e.g. when
@@ -2752,17 +4130,35 @@ clauselist_mv_selectivity_histogram(PlannerInfo *root, List *clauses,
*/
u += mvhist->buckets[i]->ntuples;
+ /* skip buckets not matching the conditions */
+ if (condition_matches[i] == MVSTATS_MATCH_NONE)
+ continue;
+ else if (condition_matches[i] == MVSTATS_MATCH_PARTIAL)
+ coeff = 0.5;
+
+ t += coeff * mvhist->buckets[i]->ntuples;
+
if (matches[i] == MVSTATS_MATCH_FULL)
- s += mvhist->buckets[i]->ntuples;
+ s += coeff * mvhist->buckets[i]->ntuples;
else if (matches[i] == MVSTATS_MATCH_PARTIAL)
- s += 0.5 * mvhist->buckets[i]->ntuples;
+ /*
+ * TODO If both conditions and clauses match partially, this
+ * will use a 0.25 match - not sure that's the right
+ * solution, but it seems about right.
+ */
+ s += coeff * 0.5 * mvhist->buckets[i]->ntuples;
}
/* release the allocated bitmap and deserialized histogram */
pfree(matches);
+ pfree(condition_matches);
pfree(mvhist);
- return s * u;
+ /* no condition matches */
+ if (t == 0.0)
+ return (Selectivity)0.0;
+
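+ /*
+ * As with the MCV list: s/t is the conditional selectivity of
+ * the clauses given the conditions, scaled by u, the fraction
+ * of data covered by the histogram.
+ */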
+ return (s / t) * u;
}
/*
@@ -3268,38 +4664,31 @@ update_match_bitmap_histogram(PlannerInfo *root, List *clauses,
/* AND/OR clause, with all clauses compatible with the selected MV stat */
int i;
- BoolExpr *orclause = ((BoolExpr*)clause);
- List *orclauses = orclause->args;
+ List *tmp_clauses = ((BoolExpr*)clause)->args;
/* match/mismatch bitmap for each bucket */
- int or_nmatches = 0;
- char * or_matches = NULL;
+ int tmp_nmatches = 0;
+ char * tmp_matches = NULL;
- Assert(orclauses != NIL);
- Assert(list_length(orclauses) >= 2);
+ Assert(tmp_clauses != NIL);
+ Assert(list_length(tmp_clauses) >= 2);
/* number of matching buckets */
- or_nmatches = mvhist->nbuckets;
+ tmp_nmatches = (or_clause(clause)) ? 0 : mvhist->nbuckets;
/* by default none of the buckets matches the clauses */
- or_matches = palloc0(sizeof(char) * or_nmatches);
+ tmp_matches = palloc0(sizeof(char) * mvhist->nbuckets);
- if (or_clause(clause))
+ if (! or_clause(clause))
{
- /* OR clauses assume nothing matches, initially */
- memset(or_matches, MVSTATS_MATCH_NONE, sizeof(char)*or_nmatches);
- or_nmatches = 0;
- }
- else
- {
- /* AND clauses assume nothing matches, initially */
- memset(or_matches, MVSTATS_MATCH_FULL, sizeof(char)*or_nmatches);
+ /* AND clauses assume everything matches, initially */
+ memset(tmp_matches, MVSTATS_MATCH_FULL, sizeof(char)*mvhist->nbuckets);
}
/* build the match bitmap for the OR-clauses */
- or_nmatches = update_match_bitmap_histogram(root, orclauses,
+ tmp_nmatches = update_match_bitmap_histogram(root, tmp_clauses,
stakeys, mvhist,
- or_nmatches, or_matches, or_clause(clause));
+ tmp_nmatches, tmp_matches, or_clause(clause));
/* merge the bitmap into the existing one*/
for (i = 0; i < mvhist->nbuckets; i++)
@@ -3310,10 +4699,10 @@ update_match_bitmap_histogram(PlannerInfo *root, List *clauses,
*
* FIXME this does not decrease the number of matches
*/
- UPDATE_RESULT(matches[i], or_matches[i], is_or);
+ UPDATE_RESULT(matches[i], tmp_matches[i], is_or);
}
- pfree(or_matches);
+ pfree(tmp_matches);
}
else
@@ -3325,3 +4714,363 @@ update_match_bitmap_histogram(PlannerInfo *root, List *clauses,
return nmatches;
}
+
+/*
+ * Walk through clauses and keep only those covered by at least
+ * one of the statistics.
+ */
+static List *
+filter_clauses(PlannerInfo *root, Oid varRelid, SpecialJoinInfo *sjinfo,
+ int type, List *stats, List *clauses, Bitmapset **attnums)
+{
+ ListCell *c;
+ ListCell *s;
+
+ /* results (list of compatible clauses, attnums) */
+ List *rclauses = NIL;
+
+ foreach (c, clauses)
+ {
+ Node *clause = (Node*)lfirst(c);
+ Bitmapset *clause_attnums = NULL;
+ Index relid;
+
+ /*
+ * The clause has to be mv-compatible (suitable operators etc.).
+ */
+ if (! clause_is_mv_compatible(root, clause, varRelid,
+ &relid, &clause_attnums, sjinfo, type))
+ elog(ERROR, "should not get non-mv-compatible cluase");
+
+ /* is there a statistics covering this clause? */
+ foreach (s, stats)
+ {
+ int k, matches = 0;
+ MVStatisticInfo *stat = (MVStatisticInfo *)lfirst(s);
+
+ for (k = 0; k < stat->stakeys->dim1; k++)
+ {
+ if (bms_is_member(stat->stakeys->values[k],
+ clause_attnums))
+ matches += 1;
+ }
+
+ /*
+ * The clause is compatible if all attributes it references
+ * are covered by the statistics.
+ */
+ if (bms_num_members(clause_attnums) == matches)
+ {
+ *attnums = bms_union(*attnums, clause_attnums);
+ rclauses = lappend(rclauses, clause);
+ break;
+ }
+ }
+
+ bms_free(clause_attnums);
+ }
+
+ /* we can't have more compatible clauses than source clauses */
+ Assert(list_length(clauses) >= list_length(rclauses));
+
+ return rclauses;
+}
+
+
+/*
+ * Walk through statistics and only keep those covering at least
+ * one new attribute (excluding conditions) and at least two
+ * attributes in clauses and conditions combined.
+ *
+ * This check might be made more strict by checking against individual
+ * clauses, because by using the bitmapsets of all attnums we may
+ * actually use attnums from clauses that are not covered by the
+ * statistics. For example, we may have a condition
+ *
+ * (a=1 AND b=2)
+ *
+ * and a new clause
+ *
+ * (c=1 AND d=1)
+ *
+ * With only bitmapsets, statistics on [b,c] will pass through this
+ * (assuming there are some statistics covering both clauses).
+ *
+ * TODO Do the more strict check.
+ */
+static List *
+filter_stats(List *stats, Bitmapset *new_attnums, Bitmapset *all_attnums)
+{
+ ListCell *s;
+ List *stats_filtered = NIL;
+
+ foreach (s, stats)
+ {
+ int k;
+ int matches_new = 0,
+ matches_all = 0;
+
+ MVStatisticInfo *stat = (MVStatisticInfo *)lfirst(s);
+
+ /* see how many attributes the statistics covers */
+ for (k = 0; k < stat->stakeys->dim1; k++)
+ {
+ /* attributes from new clauses */
+ if (bms_is_member(stat->stakeys->values[k], new_attnums))
+ matches_new += 1;
+
+ /* attributes from conditions */
+ if (bms_is_member(stat->stakeys->values[k], all_attnums))
+ matches_all += 1;
+ }
+
+ /* check we have enough attributes for this statistics */
+ if ((matches_new >= 1) && (matches_all >= 2))
+ stats_filtered = lappend(stats_filtered, stat);
+ }
+
+ /* we can't have more useful stats than we had originally */
+ Assert(list_length(stats) >= list_length(stats_filtered));
+
+ return stats_filtered;
+}
+
+static MVStatisticInfo *
+make_stats_array(List *stats, int *nmvstats)
+{
+ int i;
+ ListCell *l;
+
+ MVStatisticInfo *mvstats = NULL;
+ *nmvstats = list_length(stats);
+
+ mvstats
+ = (MVStatisticInfo*)palloc0((*nmvstats) * sizeof(MVStatisticInfo));
+
+ i = 0;
+ foreach (l, stats)
+ {
+ MVStatisticInfo *stat = (MVStatisticInfo *)lfirst(l);
+ memcpy(&mvstats[i++], stat, sizeof(MVStatisticInfo));
+ }
+
+ return mvstats;
+}
+
+static Bitmapset **
+make_stats_attnums(MVStatisticInfo *mvstats, int nmvstats)
+{
+ int i, j;
+ Bitmapset **stats_attnums = NULL;
+
+ Assert(nmvstats > 0);
+
+ /* build bitmaps of attnums for the stats (easier to compare) */
+ stats_attnums = (Bitmapset **)palloc0(nmvstats * sizeof(Bitmapset*));
+
+ for (i = 0; i < nmvstats; i++)
+ for (j = 0; j < mvstats[i].stakeys->dim1; j++)
+ stats_attnums[i]
+ = bms_add_member(stats_attnums[i],
+ mvstats[i].stakeys->values[j]);
+
+ return stats_attnums;
+}
+
+
+/*
+ * Now let's remove redundant statistics, covering the same columns
+ * as some other stats, when restricted to the attributes from
+ * remaining clauses.
+ *
+ * If statistics S1 covers S2 (covers S2 attributes and possibly
+ * some more), we can probably remove S2. What actually matters are
+ * attributes from covered clauses (not all the attributes). This
+ * might however prefer larger, and thus less accurate, statistics.
+ *
+ * When a redundancy is detected, we simply keep the smaller
+ * statistics (fewer columns), on the assumption that it's
+ * more accurate and faster to process. That might be incorrect for
+ * two reasons - first, the accuracy really depends on number of
+ * buckets/MCV items, not the number of columns. Second, we might
+ * prefer MCV lists over histograms or something like that.
+ */
+static List*
+filter_redundant_stats(List *stats, List *clauses, List *conditions)
+{
+ int i, j, nmvstats;
+
+ MVStatisticInfo *mvstats;
+ bool *redundant;
+ Bitmapset **stats_attnums;
+ Bitmapset *varattnos;
+ Index relid;
+
+ Assert(list_length(stats) > 0);
+ Assert(list_length(clauses) > 0);
+
+ /*
+ * We'll convert the list of statistics into an array now, because
+ * the reduction of redundant statistics is easier to do that way
+ * (we can mark previous stats as redundant, etc.).
+ */
+ mvstats = make_stats_array(stats, &nmvstats);
+ stats_attnums = make_stats_attnums(mvstats, nmvstats);
+
+ /* by default, none of the stats is redundant (so palloc0) */
+ redundant = palloc0(nmvstats * sizeof(bool));
+
+ /*
+ * We only expect a single relid here, and also we should get the
+ * same relid from clauses and conditions (but we get it from
+ * clauses, because those are certainly non-empty).
+ */
+ relid = bms_singleton_member(pull_varnos((Node*)clauses));
+
+ /*
+ * Get the varattnos from both conditions and clauses.
+ *
+ * This skips system attributes, although that should be impossible
+ * thanks to previous filtering out of incompatible clauses.
+ *
+ * XXX Is that really true?
+ */
+ varattnos = bms_union(get_varattnos((Node*)clauses, relid),
+ get_varattnos((Node*)conditions, relid));
+
+ for (i = 1; i < nmvstats; i++)
+ {
+ /* intersect with current statistics */
+ Bitmapset *curr = bms_intersect(stats_attnums[i], varattnos);
+
+ /* walk through 'previous' stats and check redundancy */
+ for (j = 0; j < i; j++)
+ {
+ /* intersect with current statistics */
+ Bitmapset *prev;
+
+ /* skip stats already identified as redundant */
+ if (redundant[j])
+ continue;
+
+ prev = bms_intersect(stats_attnums[j], varattnos);
+
+ switch (bms_subset_compare(curr, prev))
+ {
+ case BMS_EQUAL:
+ /*
+ * Use the smaller one (hopefully more accurate).
+ * If both have the same size, use the first one.
+ */
+ if (mvstats[i].stakeys->dim1 >= mvstats[j].stakeys->dim1)
+ redundant[i] = TRUE;
+ else
+ redundant[j] = TRUE;
+
+ break;
+
+ case BMS_SUBSET1: /* curr is subset of prev */
+ redundant[i] = TRUE;
+ break;
+
+ case BMS_SUBSET2: /* prev is subset of curr */
+ redundant[j] = TRUE;
+ break;
+
+ case BMS_DIFFERENT:
+ /* do nothing - keep both stats */
+ break;
+ }
+
+ bms_free(prev);
+ }
+
+ bms_free(curr);
+ }
+
+ /* can't reduce all statistics (at least one has to remain) */
+ Assert(nmvstats > 0);
+
+ /* now, let's remove the reduced statistics from the arrays */
+ list_free(stats);
+ stats = NIL;
+
+ for (i = 0; i < nmvstats; i++)
+ {
+ MVStatisticInfo *info;
+
+ pfree(stats_attnums[i]);
+
+ if (redundant[i])
+ continue;
+
+ info = makeNode(MVStatisticInfo);
+ memcpy(info, &mvstats[i], sizeof(MVStatisticInfo));
+
+ stats = lappend(stats, info);
+ }
+
+ pfree(mvstats);
+ pfree(stats_attnums);
+ pfree(redundant);
+
+ return stats;
+}
+
+static Node**
+make_clauses_array(List *clauses, int *nclauses)
+{
+ int i;
+ ListCell *l;
+
+ Node** clauses_array;
+
+ *nclauses = list_length(clauses);
+ clauses_array = (Node **)palloc0((*nclauses) * sizeof(Node *));
+
+ i = 0;
+ foreach (l, clauses)
+ clauses_array[i++] = (Node *)lfirst(l);
+
+ *nclauses = i;
+
+ return clauses_array;
+}
+
+static Bitmapset **
+make_clauses_attnums(PlannerInfo *root, Oid varRelid, SpecialJoinInfo *sjinfo,
+ int type, Node **clauses, int nclauses)
+{
+ int i;
+ Index relid;
+ Bitmapset **clauses_attnums
+ = (Bitmapset **)palloc0(nclauses * sizeof(Bitmapset *));
+
+ for (i = 0; i < nclauses; i++)
+ {
+ Bitmapset * attnums = NULL;
+
+ if (! clause_is_mv_compatible(root, clauses[i], varRelid,
+ &relid, &attnums, sjinfo, type))
+ elog(ERROR, "should not get non-mv-compatible cluase");
+
+ clauses_attnums[i] = attnums;
+ }
+
+ return clauses_attnums;
+}
+
+static bool*
+make_cover_map(Bitmapset **stats_attnums, int nmvstats,
+ Bitmapset **clauses_attnums, int nclauses)
+{
+ int i, j;
+ bool *cover_map = (bool*)palloc0(nclauses * nmvstats * sizeof(bool));
+
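+ /*
+ * The result is a flat row-major matrix: cover_map[i * nclauses + j]
+ * is true if clause j references only attributes covered by
+ * statistics i (and so may be estimated using it).
+ */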
+ for (i = 0; i < nmvstats; i++)
+ for (j = 0; j < nclauses; j++)
+ cover_map[i * nclauses + j]
+ = bms_is_subset(clauses_attnums[j], stats_attnums[i]);
+
+ return cover_map;
+}
diff --git a/src/backend/optimizer/path/costsize.c b/src/backend/optimizer/path/costsize.c
index ac865be..8f625e6 100644
--- a/src/backend/optimizer/path/costsize.c
+++ b/src/backend/optimizer/path/costsize.c
@@ -3347,7 +3347,8 @@ compute_semi_anti_join_factors(PlannerInfo *root,
joinquals,
0,
jointype,
- sjinfo);
+ sjinfo,
+ NIL);
/*
* Also get the normal inner-join selectivity of the join clauses.
@@ -3370,7 +3371,8 @@ compute_semi_anti_join_factors(PlannerInfo *root,
joinquals,
0,
JOIN_INNER,
- &norm_sjinfo);
+ &norm_sjinfo,
+ NIL);
/* Avoid leaking a lot of ListCells */
if (jointype == JOIN_ANTI)
@@ -3537,7 +3539,7 @@ approx_tuple_count(PlannerInfo *root, JoinPath *path, List *quals)
Node *qual = (Node *) lfirst(l);
/* Note that clause_selectivity will be able to cache its result */
- selec *= clause_selectivity(root, qual, 0, JOIN_INNER, &sjinfo);
+ selec *= clause_selectivity(root, qual, 0, JOIN_INNER, &sjinfo, NIL);
}
/* Apply it to the input relation sizes */
@@ -3573,7 +3575,8 @@ set_baserel_size_estimates(PlannerInfo *root, RelOptInfo *rel)
rel->baserestrictinfo,
0,
JOIN_INNER,
- NULL);
+ NULL,
+ NIL);
rel->rows = clamp_row_est(nrows);
@@ -3610,7 +3613,8 @@ get_parameterized_baserel_size(PlannerInfo *root, RelOptInfo *rel,
allclauses,
rel->relid, /* do not use 0! */
JOIN_INNER,
- NULL);
+ NULL,
+ NIL);
nrows = clamp_row_est(nrows);
/* For safety, make sure result is not more than the base estimate */
if (nrows > rel->rows)
@@ -3748,12 +3752,14 @@ calc_joinrel_size_estimate(PlannerInfo *root,
joinquals,
0,
jointype,
- sjinfo);
+ sjinfo,
+ NIL);
pselec = clauselist_selectivity(root,
pushedquals,
0,
jointype,
- sjinfo);
+ sjinfo,
+ NIL);
/* Avoid leaking a lot of ListCells */
list_free(joinquals);
@@ -3765,7 +3771,8 @@ calc_joinrel_size_estimate(PlannerInfo *root,
restrictlist,
0,
jointype,
- sjinfo);
+ sjinfo,
+ NIL);
pselec = 0.0; /* not used, keep compiler quiet */
}
diff --git a/src/backend/optimizer/util/orclauses.c b/src/backend/optimizer/util/orclauses.c
index f0acc14..e41508b 100644
--- a/src/backend/optimizer/util/orclauses.c
+++ b/src/backend/optimizer/util/orclauses.c
@@ -280,7 +280,7 @@ consider_new_or_clause(PlannerInfo *root, RelOptInfo *rel,
* saving work later.)
*/
or_selec = clause_selectivity(root, (Node *) or_rinfo,
- 0, JOIN_INNER, NULL);
+ 0, JOIN_INNER, NULL, NIL);
/*
* The clause is only worth adding to the query if it rejects a useful
@@ -342,7 +342,7 @@ consider_new_or_clause(PlannerInfo *root, RelOptInfo *rel,
/* Compute inner-join size */
orig_selec = clause_selectivity(root, (Node *) join_or_rinfo,
- 0, JOIN_INNER, &sjinfo);
+ 0, JOIN_INNER, &sjinfo, NIL);
/* And hack cached selectivity so join size remains the same */
join_or_rinfo->norm_selec = orig_selec / or_selec;
diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index 04ed07b..3e2f7a4 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -1580,13 +1580,15 @@ booltestsel(PlannerInfo *root, BoolTestType booltesttype, Node *arg,
case IS_NOT_FALSE:
selec = (double) clause_selectivity(root, arg,
varRelid,
- jointype, sjinfo);
+ jointype, sjinfo,
+ NIL);
break;
case IS_FALSE:
case IS_NOT_TRUE:
selec = 1.0 - (double) clause_selectivity(root, arg,
varRelid,
- jointype, sjinfo);
+ jointype, sjinfo,
+ NIL);
break;
default:
elog(ERROR, "unrecognized booltesttype: %d",
@@ -6209,7 +6211,8 @@ genericcostestimate(PlannerInfo *root,
indexSelectivity = clauselist_selectivity(root, selectivityQuals,
index->rel->relid,
JOIN_INNER,
- NULL);
+ NULL,
+ NIL);
/*
* If caller didn't give us an estimate, estimate the number of index
@@ -6534,7 +6537,8 @@ btcostestimate(PG_FUNCTION_ARGS)
btreeSelectivity = clauselist_selectivity(root, selectivityQuals,
index->rel->relid,
JOIN_INNER,
- NULL);
+ NULL,
+ NIL);
numIndexTuples = btreeSelectivity * index->rel->tuples;
/*
@@ -7277,7 +7281,8 @@ gincostestimate(PG_FUNCTION_ARGS)
*indexSelectivity = clauselist_selectivity(root, selectivityQuals,
index->rel->relid,
JOIN_INNER,
- NULL);
+ NULL,
+ NIL);
/* fetch estimated page cost for tablespace containing index */
get_tablespace_page_costs(index->reltablespace,
@@ -7509,7 +7514,7 @@ brincostestimate(PG_FUNCTION_ARGS)
*indexSelectivity =
clauselist_selectivity(root, indexQuals,
path->indexinfo->rel->relid,
- JOIN_INNER, NULL);
+ JOIN_INNER, NULL, NIL);
*indexCorrelation = 1;
/*
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index be7ba4f..982b66a 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -75,6 +75,7 @@
#include "utils/bytea.h"
#include "utils/guc_tables.h"
#include "utils/memutils.h"
+#include "utils/mvstats.h"
#include "utils/pg_locale.h"
#include "utils/plancache.h"
#include "utils/portal.h"
@@ -393,6 +394,15 @@ static const struct config_enum_entry row_security_options[] = {
};
/*
+ * Search algorithm for multivariate stats.
+ */
+static const struct config_enum_entry mvstat_search_options[] = {
+ {"greedy", MVSTAT_SEARCH_GREEDY, false},
+ {"exhaustive", MVSTAT_SEARCH_EXHAUSTIVE, false},
+ {NULL, 0, false}
+};
+
+/*
* Options for enum values stored in other modules
*/
extern const struct config_enum_entry wal_level_options[];
@@ -3648,6 +3658,16 @@ static struct config_enum ConfigureNamesEnum[] =
NULL, NULL, NULL
},
+ {
+ {"mvstat_search", PGC_USERSET, QUERY_TUNING_OTHER,
+ gettext_noop("Sets the algorithm used for combining multivariate stats."),
+ NULL
+ },
+ &mvstat_search_type,
+ MVSTAT_SEARCH_GREEDY, mvstat_search_options,
+ NULL, NULL, NULL
+ },
+
/* End-of-list marker */
{
{NULL, 0, 0, NULL, NULL}, NULL, 0, NULL, NULL, NULL, NULL
diff --git a/src/include/optimizer/cost.h b/src/include/optimizer/cost.h
index 24003ae..6bfd338 100644
--- a/src/include/optimizer/cost.h
+++ b/src/include/optimizer/cost.h
@@ -183,11 +183,13 @@ extern Selectivity clauselist_selectivity(PlannerInfo *root,
List *clauses,
int varRelid,
JoinType jointype,
- SpecialJoinInfo *sjinfo);
+ SpecialJoinInfo *sjinfo,
+ List *conditions);
extern Selectivity clause_selectivity(PlannerInfo *root,
Node *clause,
int varRelid,
JoinType jointype,
- SpecialJoinInfo *sjinfo);
+ SpecialJoinInfo *sjinfo,
+ List *conditions);
#endif /* COST_H */
diff --git a/src/include/utils/mvstats.h b/src/include/utils/mvstats.h
index 70f79ed..f2fbc11 100644
--- a/src/include/utils/mvstats.h
+++ b/src/include/utils/mvstats.h
@@ -16,6 +16,14 @@
#include "commands/vacuum.h"
+typedef enum MVStatSearchType
+{
+ MVSTAT_SEARCH_EXHAUSTIVE, /* exhaustive search */
+ MVSTAT_SEARCH_GREEDY /* greedy search */
+} MVStatSearchType;
+
+extern int mvstat_search_type;
+
/*
* Degree of how much MCV item / histogram bucket matches a clause.
* This is then considered when computing the selectivity.
--
1.9.3
Hello, I started to work on this patch.
attached is v7 of the multivariate stats patch. The main improvement
is major refactoring of the clausesel.c portion - splitting the
awfully long spaghetti-style functions into smaller pieces, making it
much more understandable etc.
Thank you, it looks clearer. I have some comments from a brief
look at this. This patchset is relatively large, so I will comment
on a "per-notice" basis, which means I'll send comments before
examining the entire patchset. Sorry in advance for the
desultory comments.
=======
General comments:
- You included unnecessary stuff such as regression.diffs in
these patches.
- Now OID 3307 is used by pg_stat_file. I moved
pg_mv_stats_dependencies_info/show to 3311/3312.
- Single-variate stats have a mechanism to inject arbitrary
values as statistics, that is, get_relation_stats_hook and
similar facilities. I want a similar mechanism for multivariate
statistics, too.
0001:
- I also don't think it is the right thing for expression_tree_walker
to recognize RestrictInfo, since it is not part of an expression.
0003:
- In clauselist_selectivity, find_stats is uselessly called for a
single clause. It should be called only after the clause list is
found to consist of more than one clause.
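Something like the following is what I have in mind - just a sketch
against the patched clauselist_selectivity() (untested, and the exact
find_stats signature is my guess):

    Selectivity
    clauselist_selectivity(PlannerInfo *root, List *clauses, int varRelid,
                           JoinType jointype, SpecialJoinInfo *sjinfo,
                           List *conditions)
    {
        List   *stats;

        /* a single clause cannot benefit from multivariate stats */
        if (list_length(clauses) == 1)
            return clause_selectivity(root, (Node *) linitial(clauses),
                                      varRelid, jointype, sjinfo,
                                      conditions);

        /* only now pay for looking up multivariate statistics */
        stats = find_stats(root, clauses, varRelid);

        /* ... proceed as in the patch ... */
    }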
- The search for vars to be compared with mv-stats columns, which
find_stats does, should stop at disjunctions. But this patch
doesn't behave that way, which seems unwanted. The
following steps show that.
====
=# CREATE TABLE t1 (a int, b int, c int);
=# INSERT INTO t1 (SELECT a, a * 2, a * 3 FROM generate_series(0, 9999) a);
=# EXPLAIN SELECT * FROM t1 WHERE a = 1 AND b = 2 OR c = 3;
Seq Scan on t1 (cost=0.00..230.00 rows=1 width=12)
=# ALTER TABLE t1 ADD STATISTICS (HISTOGRAM) ON (a, b, c);
=# ANALYZE t1;
=# EXPLAIN SELECT * FROM t1 WHERE a = 1 AND b = 2 OR c = 3;
Seq Scan on t1 (cost=0.00..230.00 rows=268 width=12)
====
Rows changed unwantedly.
It seems not to be as simple a thing as your code assumes.
I do assume some of those pieces are unnecessary because there already
is a helper function with the same purpose (but I'm not aware of
that). But IMHO this piece of code begins to look reasonable
(especially when compared to the previous state).
Yeah, that kind of work should be done later :p. This patch is
not so invasive as to make that undoable.
The other major improvement is a review of the comments (including
FIXMEs and TODOs), and removal of the obsolete / misplaced ones. And
there was plenty of those ...

These changes made this version ~20k smaller than v6.
The patch also rebases to current master, which I assume shall be
quite stable - so hopefully no more duplicate OIDs for a while.

There are 6 files attached, but only 0002-0006 are actually part of
the multivariate statistics patch itself. The first part makes it
possible to use pull_varnos() with expression trees containing
RestrictInfo nodes, but maybe this is not the right way to fix this
(there's another thread where this was discussed).
As mentioned above, checking whether mv stats can be applied would
be a more complex matter than you are now assuming. I will also
consider that.
Also, the regression tests testing plan choice with multivariate stats
(e.g. that a bitmap index scan is chosen instead of an index scan) fail
from time to time. I suppose this happens because the invalidation
after ANALYZE is not processed before executing the query, so the
optimizer does not see the stats, or something like that.
I saw that occur, but so far I have no idea how it happens.
regards,
--
Kyotaro Horiguchi
NTT Open Source Software Center
Hello Horiguchi-san!
On 07/03/2015 07:30 AM, Kyotaro HORIGUCHI wrote:
Hello, I started to work on this patch.
attached is v7 of the multivariate stats patch. The main improvement
is major refactoring of the clausesel.c portion - splitting the
awfully long spaghetti-style functions into smaller pieces, making it
much more understandable etc.

Thank you, it looks clearer. I have some comments from a brief look
at this. This patchset is relatively large, so I will comment on a
"per-notice" basis, which means I'll send comments before examining
the entire patchset. Sorry in advance for the desultory comments.
Sure. If you run into something that's not clear enough, I'm happy to
explain that (I tried to cover all the important details in the
comments, but it's a large patch, indeed.)
=======
General comments:

- You included unnecessary stuff such as regression.diffs in
these patches.
Ahhhh :-/ Will fix.
- Now OID 3307 is used by pg_stat_file. I moved
pg_mv_stats_dependencies_info/show to 3311/3312.
Will fix while rebasing to current master.
- Single-variate stats have a mechanism to inject arbitrary
values as statistics, that is, get_relation_stats_hook and
similar facilities. I want a similar mechanism for multivariate
statistics, too.
Fair point, although I'm not sure where we should place the hook, how
exactly it should be defined, and how useful it would be in the end.
Can you give an example of how you'd use such a hook?
I've never used get_relation_stats_hook, but if I get it right, the
plugins can use the hook to create the stats (for each column), either
from scratch or by tweaking the existing stats.
I'm not sure how this should work with multivariate stats, though,
because there can be an arbitrary number of stats for a column, and it
really depends on all the clauses (so examine_variable() seems a bit
inappropriate, as it only sees a single variable at a time).
Moreover, with multivariate stats
(a) there may be an arbitrary number of stats for a column
(b) only some of the stats end up being used for the estimation
I see two or three possible places for calling such a hook:
(a) at the very beginning, after fetching the list of stats
- sees all the existing stats on a table
- may add entirely new stats or tweak the existing ones
(b) after collecting the list of variables compatible with
multivariate stats
- like (a) and additionally knows which columns are interesting
for the query (but only with respect to the existing stats)
(c) after optimization (selection of the right combination of stats)
- like (b), but can't affect the optimization
But I can't really imagine anyone building multivariate stats on the
fly, in the hook.
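To make the discussion a bit more concrete, for (a) I'd imagine
something along these lines (names entirely hypothetical) - the hook
sees the list of stats fetched from the catalog and may return a
modified list:

    typedef List *(*get_mv_stats_hook_type) (PlannerInfo *root,
                                             Oid relid,
                                             List *mvstats);

    extern get_mv_stats_hook_type get_mv_stats_hook;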
It's more complicated, though, because the query may call
clauselist_selectivity multiple times, depending on how complex the
WHERE clauses are.
0001:
- I also don't think it is the right thing for expression_tree_walker
to recognize RestrictInfo, since it is not part of an expression.
Yes. In my working git repo, I've reworked this to use the second
option, i.e. adding RestrictInfo handling to pull_(varno|varattno)_walker:
https://github.com/tvondra/postgres/commit/2dc79b914c759d31becd8ae670b37b79663a595f
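In essence it adds a case like this to the walkers (simplified sketch;
the exact shape in the commit may differ):

    if (IsA(node, RestrictInfo))
    {
        /* look through the RestrictInfo and walk the contained clause */
        RestrictInfo *rinfo = (RestrictInfo *) node;

        return pull_varnos_walker((Node *) rinfo->clause, context);
    }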
Do you think this is the correct solution? If not, how to fix it?
0003:
- In clauselist_selectivity, find_stats is uselessly called for a
single clause. It should be called only after the clause list is
found to consist of more than one clause.
Ok, will fix.
- The search for vars to be compared with mv-stats columns, which
find_stats does, should stop at disjunctions. But this patch
doesn't behave that way, which seems unwanted. The
following steps show that.
Why should it stop at disjunctions? There's nothing wrong with using
multivariate stats to estimate OR-clauses, IMHO.
====
=# CREATE TABLE t1 (a int, b int, c int);
=# INSERT INTO t1 (SELECT a, a * 2, a * 3 FROM generate_series(0, 9999) a);
=# EXPLAIN SELECT * FROM t1 WHERE a = 1 AND b = 2 OR c = 3;
Seq Scan on t1 (cost=0.00..230.00 rows=1 width=12)
=# ALTER TABLE t1 ADD STATISTICS (HISTOGRAM) ON (a, b, c);
=# ANALYZE t1;
=# EXPLAIN SELECT * FROM t1 WHERE a = 1 AND b = 2 OR c = 3;
Seq Scan on t1 (cost=0.00..230.00 rows=268 width=12)
====
Rows changed unwantedly.
That has nothing to do with OR clauses, but rather with using a type of
statistics that does not fit the data and queries. Histograms are quite
inaccurate for discrete data and equality conditions - in this case the
clauses probably match one bucket, and so we use 1/2 the bucket as an
estimate. There's nothing wrong with that.
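(E.g. a partial match of a bucket holding 5% of the rows contributes
0.5 * 0.05 to the selectivity, no matter how many rows in the bucket
actually satisfy the equality clauses.)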
So let's use MCV instead:
ALTER TABLE t1 ADD STATISTICS (MCV) ON (a, b, c);
ANALYZE t1;
EXPLAIN SELECT * FROM t1 WHERE a = 1 AND b = 2 OR c = 3;
QUERY PLAN
-----------------------------------------------------
Seq Scan on t1 (cost=0.00..230.00 rows=1 width=12)
Filter: (((a = 1) AND (b = 2)) OR (c = 3))
(2 rows)
It seems not to be as simple a thing as your code assumes.
Maybe, but I don't see which assumption is invalid. I see nothing wrong
with the previous query.
kind regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Hi, Tomas. I'll kick the gas pedal.
Thank you, it looks clearer. I have some comments from a brief look
at this. This patchset is relatively large, so I will comment on a
"per-notice" basis, which means I'll send comments before examining
the entire patchset. Sorry in advance for the desultory comments.

Sure. If you run into something that's not clear enough, I'm happy to
explain that (I tried to cover all the important details in the
comments, but it's a large patch, indeed.)
- Single-variate stats have a mechanism to inject arbitrary
values as statistics, that is, get_relation_stats_hook and
similar facilities. I want a similar mechanism for multivariate
statistics, too.

Fair point, although I'm not sure where we should place the hook, how
exactly it should be defined, and how useful it would be in the end.
Can you give an example of how you'd use such a hook?
It's my secret, but it is open :p. This is crucial for us when examining
many planner-related problems that occurred at our customers, in vitro.
http://pgdbmsstats.osdn.jp/pg_dbms_stats-en.html
# Mmm, this doc is a bit too old..
One of our tools works like the following:
- Copy pg_statistic and some attributes of pg_class into some
table. Of course this is exportable.
- For example, in examine_simple_variable, using the hook
get_relation_stats_hook, inject the saved statistics in place
of the real statistics.
The hook point is placed where the parameters specifying which
statistics are needed are available in a compact shape, and all the
hook function has to do is return the corresponding statistics
values.
So the parallel facility for mv stats would look like this:
MVStatisticInfo *
get_mv_statistics(PlannerInfo *root, Index relid);
or
MVStatisticInfo *
get_mv_statistics(PlannerInfo *root, Index relid, <bitmap or list of attnos>);
So by simply applying this, the current clauselist_selectivity
code will turn into the following:
if (list_length(clauses) == 1)
    return clause_selectivity(....);
Index varrelid = find_singleton_relid(root, clauses, varRelid);
if (varrelid)
{
    /* Bitmapset *attnums = collect_attnums(root, clauses, varrelid); */
    if (get_mv_statistics_hook)
        stats = get_mv_statistics_hook(root, varrelid /*, attnums */);
    else
        stats = get_mv_statistics(root, varrelid /*, attnums */);
    ....
In comparison to single-column statistics, it might be preferable
to separate the statistics values from the statistics definition.
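For illustration, a declaration parallel to get_relation_stats_hook
might look like this - just a sketch of the idea with made-up names
and types, not anything from the patch:
typedef List *(*get_mv_statistics_hook_type) (PlannerInfo *root,
                                              Index relid,
                                              Bitmapset *attnums);
extern PGDLLIMPORT get_mv_statistics_hook_type get_mv_statistics_hook;
The hook would return the list of MVStatisticInfo entries to consider
for the given relation and attribute set, or NIL to fall back to the
regular catalog lookup.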
I've never used get_relation_stats_hook, but if I get it right, the
plugins can use the hook to create the stats (for each column), either
from scratch or by tweaking the existing stats.
Mostly existing stats without change. I have seen a few hackers who
wanted to provide predefined statistics for typical cases. I haven't
seen anyone who tweaks existing stats.
I'm not sure how this should work with multivariate stats, though,
because there can be an arbitrary number of stats for a column, and it
really depends on all the clauses (so examine_variable() seems a bit
inappropriate, as it only sees a single variable at a time).
Restriction clauses are not a problem. What is needed to replace
stats values is defining a few APIs to retrieve them, and then
retrieving the stats values only in a way that is compatible with
the API. Substitute views for mv stats would be okay as an extreme
case, but that is not good.
Moreover, with multivariate stats
(a) there may be an arbitrary number of stats for a column
(b) only some of the stats end up being used for the estimation
I see two or three possible places for calling such a hook:
(a) at the very beginning, after fetching the list of stats
- sees all the existing stats on a table
- may add entirely new stats or tweak the existing ones
Getting all stats for a table would be okay, but an attnum list can
narrow things down, as in the second form of the example APIs
above. And we may ignore the case of forged or tweaked stats; that
is the hook author's problem, not ours.
(b) after collecting the list of variables compatible with
multivariate stats
- like (a), and additionally knows which columns are interesting
for the query (but only with respect to the existing stats)
We should carefully design the API to be able to point to the
pertinent stats in every situation. Mv stats are based on the
correlation of multiple columns, so I think a relid and an
attribute list are enough as the parameters.
| if (st.relid == param.relid && bms_equal(st.attnums, param.attnums))
|     /* these are the stats we want */
If we can filter the appropriate stats from all the stats using the
clause list, we can definitely build the appropriate parameter
(column set) prior to retrieving mv statistics. Isn't that correct?
(c) after optimization (selection of the right combination of stats)
- like (b), but can't affect the optimization
But I can't really imagine anyone building multivariate stats on the
fly, in the hook.
It's more complicated, though, because the query may call
clauselist_selectivity multiple times, depending on how complex the
WHERE clauses are.
0001:
- I also don't think it is the right thing for expression_tree_walker
to recognize RestrictInfo, since it is not a part of an expression.
Yes. In my working git repo, I've reworked this to use the second
option, i.e. adding RestrictInfo support to pull_(varno|varattno)_walker:
https://github.com/tvondra/postgres/commit/2dc79b914c759d31becd8ae670b37b79663a595f
Do you think this is the correct solution? If not, how to fix it?
The reason why I think it is not appropriate is that RestrictInfo
is not a part of an expression.
Increasing the selectivity of a condition based on column correlation
occurs only for a set of conjunctive clauses. An OR operation
divides the sets. Is that agreeable? RestrictInfos can be nested
within each other, and we should be aware of the AND/OR operators.
This is what expression_tree_walker does not handle.
Perhaps we should provide a dedicated function, such as
find_conjunctive_attr_set, which does this (see the sketch below):
- Check the type of the top expression of the clause.
- If it is a RestrictInfo, check clause_relids, then check the
clause.
- If it is a bool OR, stop searching and return an empty set of
attributes.
- If it is a bool AND, check the components further. A list of
RestrictInfos should be treated as an AND connection.
- If it is an operator expression, collect the used relids and attrs
by walking the expression tree.
I may be missing something, but I think the outline is correct.
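A rough sketch of such a walker, using only existing backend helpers
(or_clause, and_clause, is_opclause, pull_varattnos); this is merely
my reading of the outline above, not code from the patch:
#include "postgres.h"
#include "nodes/relation.h"
#include "optimizer/clauses.h"
#include "optimizer/var.h"
static Bitmapset *
find_conjunctive_attr_set(PlannerInfo *root, Node *clause, Index relid)
{
    /* RestrictInfo is not an expression - look at the clause inside */
    if (IsA(clause, RestrictInfo))
        return find_conjunctive_attr_set(root,
                    (Node *) ((RestrictInfo *) clause)->clause, relid);
    /* a disjunction stops the search: return an empty attribute set */
    if (or_clause(clause))
        return NULL;
    /* a conjunction: recurse into the arguments, union the results */
    if (and_clause(clause))
    {
        Bitmapset  *attnums = NULL;
        ListCell   *lc;
        foreach(lc, ((BoolExpr *) clause)->args)
            attnums = bms_union(attnums,
                                find_conjunctive_attr_set(root,
                                    (Node *) lfirst(lc), relid));
        return attnums;
    }
    /* an operator expression: collect the attributes it references */
    if (is_opclause(clause))
    {
        Bitmapset  *attnums = NULL;
        pull_varattnos(clause, relid, &attnums);
        return attnums;
    }
    return NULL;        /* anything else is unsupported */
}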
In addition to that, we should carefully avoid duplicate corrections
using the same mv statistics.
I haven't understood precisely what choose_mv_statistics does, but I
suppose what this function does could be split into a 'making the
parameter to find stats' part and a 'matching the parameter with
stats in order to retrieve the desired stats' part. Could you
reconstruct this process into a form like that?
I feel it is too invasive, or excessively intermixed.
0003:
- In clauselist_selectivity, find_stats is uselessly called for a
single clause. It should be called only after the clause list is
found to consist of more than one clause.
Ok, will fix.
- The search for vars to be compared with mv-stat columns, which
find_stats does, should stop at disjunctions. But this patch
doesn't behave that way, which is unwanted behavior. The
following steps show that.
Why should it stop at disjunctions? There's nothing wrong with using
multivariate stats to estimate OR-clauses, IMHO.
Mv statistics represent how often *every combination of the
column values* occurs. Is that correct? Here "combination" means the
values coexist, that is, AND. For example, an MV-MCV list:
(a, b, c)   freq
(1, 2, 3)    100
(1, 2, 5)     50
(1, 3, 8)     20
(1, 7, 2)      5
================
total        175
| select * from t where a = 1 and b = 2 and c = 3;
| SELECT 100
This is correct.
| select * from t where a = 1 and b = 2 or c = 3;
| SELECT 100
This is *not* correct. The correct number of tuples is 150.
This is a simple example where OR breaks the MV stats assumption.
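To spell out where the 150 comes from, by inclusion-exclusion over the
MCV list above: 150 rows match (a = 1 AND b = 2), 100 rows match
(c = 3), and the 100 rows of (1, 2, 3) satisfy both, so
    150 + 100 - 100 = 150.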
====
=# CREATE TABLE t1 (a int, b int, c int);
=# INSERT INTO t1 (SELECT a, a * 2, a * 3 FROM generate_series(0,
9999) a);
=# EXPLAIN SELECT * FROM t1 WHERE a = 1 AND b = 2 OR c = 3;
Seq Scan on t1 (cost=0.00..230.00 rows=1 width=12)
=# ALTER TABLE t1 ADD STATISTICS (HISTOGRAM) ON (a, b, c);
=# ANALYZE t1;
=# EXPLAIN SELECT * FROM t1 WHERE a = 1 AND b = 2 OR c = 3;
Seq Scan on t1 (cost=0.00..230.00 rows=268 width=12)
====
Rows changed unwantedly.
That has nothing to do with OR clauses, but rather with using a type
of statistics that does not fit the data and queries. Histograms are
quite inaccurate for discrete data and equality conditions - in this
case the clauses probably match one bucket, and so we use 1/2 the
bucket as an estimate. There's nothing wrong with that.
So let's use MCV instead:
Hmm, the problem is not what specific number is displayed as
rows. What is crucial is the fact that the row count changed even
though it shouldn't have, as I demonstrated above.
ALTER TABLE t1 ADD STATISTICS (MCV) ON (a, b, c);
ANALYZE t1;
EXPLAIN SELECT * FROM t1 WHERE a = 1 AND b = 2 OR c = 3;
QUERY PLAN
-----------------------------------------------------
Seq Scan on t1 (cost=0.00..230.00 rows=1 width=12)
Filter: (((a = 1) AND (b = 2)) OR (c = 3))
(2 rows)
It seems it is not so simple a thing as your code assumes.
Maybe, but I don't see which assumption is invalid? I see nothing wrong
with the previous query.
regards,
--
Kyotaro Horiguchi
NTT Open Source Software Center
Hi,
On 07/07/2015 08:05 AM, Kyotaro HORIGUCHI wrote:
Hi, Tomas. I'll kick the gas pedal.
Thank you, it looks clearer. I have some comments from a brief look
at this. This patchset is relatively large, so I will comment on a
"per-notice" basis, which means I'll send comments before examining
the entire patchset. Sorry in advance for the desultory comments.
Sure. If you run into something that's not clear enough, I'm happy to
explain that (I tried to cover all the important details in the
comments, but it's a large patch, indeed.)
- Single-variate stats have a mechanism to inject arbitrary
values as statistics, that is, get_relation_stats_hook and
similar stuff. I want a similar mechanism for multivariate
statistics, too.
Fair point, although I'm not sure where we should place the hook,
how exactly it should be defined and how useful it would be in
the end. Can you give an example of how you'd use such a hook?
...
We should carefully design the API to be able to point to the
pertinent stats in every situation. Mv stats are based on the
correlation of multiple columns, so I think a relid and an attribute
list are enough as the parameters.
| if (st.relid == param.relid && bms_equal(st.attnums, param.attnums))
|     /* these are the stats we want */
If we can filter the appropriate stats from all the stats using the
clause list, we can definitely build the appropriate parameter (column
set) prior to retrieving mv statistics. Isn't that correct?
Let me briefly explain how the current clauselist_selectivity
implementation works.
(1) check if there are multivariate statistics on the table - if not,
skip the multivariate parts altogether (the point of this is to
minimize impact on users who don't use the new feature)
(2) see if there are clauses compatible with multivariate stats - this
only checks "general compatibility" without actually checking the
existing stats (the point is to terminate early if the clauses
are not compatible somehow - e.g. if the clauses reference only a
single attribute, use unsupported operators etc.)
(3) if there are multivariate stats and compatible clauses, the
function choose_mv_stats tries to find the best combination of
multivariate stats with respect to the clauses (details later)
(4) the clauses are estimated using the stats, the remaining clauses
are estimated using the current statistics (single attribute)
The only way to reliably inject new stats is by calling a hook before
(1), allowing it to arbitrarily modify the list of stats. Based on the
use cases you provided, I don't think it makes much sense to add
additional hooks in the other phases.
At this point it's however not yet known which clauses are compatible
with multivariate stats, or which attributes they reference. It might
be possible to simply call pull_varattnos() and pass the result to the
hook, except that does not work with RestrictInfo :-/
Or maybe we could / should put the hook not into clauselist_selectivity
but somewhere else? Say, into get_relation_info, where we actually read
the list of stats for the relation?
0001:
- I also don't think it is the right thing for expression_tree_walker
to recognize RestrictInfo, since it is not a part of an expression.
Yes. In my working git repo, I've reworked this to use the second
option, i.e. adding RestrictInfo support to pull_(varno|varattno)_walker:
https://github.com/tvondra/postgres/commit/2dc79b914c759d31becd8ae670b37b79663a595f
Do you think this is the correct solution? If not, how to fix it?
The reason why I think it is not appropriate is that RestrictInfo
is not a part of an expression.
Increasing the selectivity of a condition based on column correlation
occurs only for a set of conjunctive clauses. An OR operation
divides the sets. Is that agreeable? RestrictInfos can be nested
within each other, and we should be aware of the AND/OR operators.
This is what expression_tree_walker does not handle.
I still don't understand why you think we need to differentiate between
AND and OR operators. There's nothing wrong with estimating OR clauses
using multivariate statistics.
Perhaps we should provide a dedicated function, such as
find_conjunctive_attr_set, which does this:
Perhaps. The reason why I added support for RestrictInfo into the
existing walker implementations is that it seemed like the easiest way
to fix the issue. But if there are reasons why that's incorrect, then
inventing a new function is probably the right way.
- Check the type of the top expression of the clause.
- If it is a RestrictInfo, check clause_relids, then check the
clause.
- If it is a bool OR, stop searching and return an empty set of
attributes.
- If it is a bool AND, check the components further. A list of
RestrictInfos should be treated as an AND connection.
- If it is an operator expression, collect the used relids and attrs
by walking the expression tree.
I may be missing something, but I think the outline is correct.
As I said before, there's nothing wrong with estimating OR clauses using
multivariate statistics. So OR and AND should be handled exactly the same.
I think you're missing the fact that it's not enough to look at the
relids from the RestrictInfo - we need to actually check what clauses
are used inside, i.e. we need to check the clauses.
That's because only some of the clauses are compatible with multivariate
stats, and only if all the clauses of the BoolExpr are "compatible" can
we estimate the clause as a whole. If it's a mix of supported and
unsupported clauses, we can simply pass it to clauselist_selectivity,
which will repeat the whole process.
In addition to that, we should carefully avoid duplicate corrections
using the same mv statistics.
Sure. That's what choose_mv_statistics does.
I haven't understood precisely what choose_mv_statistics does, but I
suppose what this function does could be split into a 'making the
parameter to find stats' part and a 'matching the parameter with
stats in order to retrieve the desired stats' part. Could you
reconstruct this process into a form like that?
The goal of choose_mv_statistics is very simple - given a list of
clauses, it tries to find the best combination of statistics, exploiting
as much information as possible.
So let's say you have clauses
WHERE a=1 AND b=1 AND c=1 AND d=1
but you only have statistics on [a,b], [b,c] and [b,c,d].
The simplest approach would be to use the 'largest' statistics, covering
the most columns from the clauses - in this case [b,c,d]. This is what
the initial patches do.
The last patch improves this significantly, by combining the statistics
using conditional probability. In this case it'd probably use all three
statistics, effectively decomposing the selectivity like this:
P(a=1,b=1,c=1,d=1) = P(a=1,b=1) * P(c=1|b=1) * P(d=1|b=1,c=1)
                       [a,b]        [b,c]         [b,c,d]
And each of those probabilities can be estimated using one of the stats.
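As a concrete illustration, using the ALTER TABLE syntax from this
patch (the table and data are made up), the setup for this example
would be:
CREATE TABLE t4 (a INT, b INT, c INT, d INT);
ALTER TABLE t4 ADD STATISTICS (mcv) ON (a, b);
ALTER TABLE t4 ADD STATISTICS (mcv) ON (b, c);
ALTER TABLE t4 ADD STATISTICS (mcv) ON (b, c, d);
ANALYZE t4;
EXPLAIN SELECT * FROM t4 WHERE a = 1 AND b = 1 AND c = 1 AND d = 1;
and the planner is then free to decompose the estimate as shown above.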
I feel it is too invasive, or excessively intermixed.
I don't think it really fits your model - the hook has to be called much
sooner, effectively at the very beginning of clauselist_selectivity
or even before that. Otherwise it might not get called at all (e.g. if
there are no multivariate stats on the table, this whole part will be
skipped).
Why should it stop at disjunctions? There's nothing wrong with using
multivariate stats to estimate OR-clauses, IMHO.
Mv statistics represent how often *every combination of the
column values* occurs. Is that correct? Here "combination" means the
values coexist, that is, AND. For example, an MV-MCV list:
(a, b, c)   freq
(1, 2, 3)    100
(1, 2, 5)     50
(1, 3, 8)     20
(1, 7, 2)      5
================
total        175
| select * from t where a = 1 and b = 2 and c = 3;
| SELECT 100
This is correct.
| select * from t where a = 1 and b = 2 or c = 3;
| SELECT 100
This is *not* correct. The correct number of tuples is 150.
This is a simple example where OR breaks the MV stats assumption.
No, it does not.
I'm not sure where the numbers are coming from, though. So let's see how
this actually works with multivariate statistics. I'll create a table
with the 4 combinations you used in your example, but with 1000x more
rows, to make the estimates a bit more accurate:
CREATE TABLE t (a INT, b INT, c INT);
INSERT INTO t SELECT 1, 2, 3 FROM generate_series(1,100000);
INSERT INTO t SELECT 1, 2, 5 FROM generate_series(1,50000);
INSERT INTO t SELECT 1, 3, 8 FROM generate_series(1,20000);
INSERT INTO t SELECT 1, 7, 2 FROM generate_series(1,5000);
ALTER TABLE t ADD STATISTICS (mcv) ON (a,b,c);
ANALYZE t;
And now let's see the two queries:
EXPLAIN select * from t where a = 1 and b = 2 and c = 3;
QUERY PLAN
----------------------------------------------------------
Seq Scan on t (cost=0.00..4008.50 rows=100403 width=12)
Filter: ((a = 1) AND (b = 2) AND (c = 3))
(2 rows)
EXPLAIN select * from t where a = 1 and b = 2 or c = 3;
QUERY PLAN
----------------------------------------------------------
Seq Scan on t (cost=0.00..4008.50 rows=150103 width=12)
Filter: (((a = 1) AND (b = 2)) OR (c = 3))
(2 rows)
So the first query estimates ~100k rows, the second one ~150k rows -
exactly the 100:150 ratio from your example, scaled up 1000x. That is
as expected, because MCV lists are discrete, match the data perfectly,
and behave exactly like your mental model.
If you try this with histograms though, you'll get the same estimate in
both cases:
ALTER TABLE t DROP STATISTICS ALL;
ALTER TABLE t ADD STATISTICS (histogram) ON (a,b,c);
ANALYZE t;
EXPLAIN select * from t where a = 1 and b = 2 and c = 3;
QUERY PLAN
---------------------------------------------------------
Seq Scan on t (cost=0.00..4008.50 rows=52707 width=12)
Filter: ((a = 1) AND (b = 2) AND (c = 3))
(2 rows)
EXPLAIN select * from t where a = 1 and b = 2 or c = 3;
QUERY PLAN
---------------------------------------------------------
Seq Scan on t (cost=0.00..4008.50 rows=52707 width=12)
Filter: (((a = 1) AND (b = 2)) OR (c = 3))
(2 rows)
That's unfortunate, but it has nothing to do with some assumptions of
multivariate statistics. The "problem" is that histograms are naturally
fuzzy, and both conditions hit the same bucket.
The solution is simple - don't use histograms for such discrete data.
====
=# CREATE TABLE t1 (a int, b int, c int);
=# INSERT INTO t1 (SELECT a, a * 2, a * 3 FROM generate_series(0,
9999) a);
=# EXPLAIN SELECT * FROM t1 WHERE a = 1 AND b = 2 OR c = 3;
Seq Scan on t1 (cost=0.00..230.00 rows=1 width=12)
=# ALTER TABLE t1 ADD STATISTICS (HISTOGRAM) ON (a, b, c);
=# ANALYZE t1;
=# EXPLAIN SELECT * FROM t1 WHERE a = 1 AND b = 2 OR c = 3;
Seq Scan on t1 (cost=0.00..230.00 rows=268 width=12)
====
Rows changed unwantedly.
That has nothing to do with OR clauses, but rather with using a
type of statistics that does not fit the data and queries.
Histograms are quite inaccurate for discrete data and equality
conditions - in this case the clauses probably match one bucket,
and so we use 1/2 the bucket as an estimate. There's nothing wrong
with that.
So let's use MCV instead:
Hmm, the problem is not what specific number is displayed as
rows. What is crucial is the fact that the row count changed even
though it shouldn't have, as I demonstrated above.
Again, that has nothing to do with any assumptions, and it certainly
does not demonstrate that OR clauses should not be handled by
multivariate statistics.
In this case, you're observing two effects.
(1) Natural inaccuracy of histograms when used for discrete data,
especially in combination with equality conditions (because
that's impossible to estimate accurately with histograms).
(2) The original estimate (without multivariate statistics) is only
seemingly accurate, because it falsely assumes independence.
It simply assumes that each condition matches 1/10000 of the
table, and multiplies that, getting a ~0.00001 row estimate. This
is rounded up to 1, which is accidentally the exact value.
Let me demonstrate this on two examples - one with discrete data, one
with continuous distribution.
1) discrete data
CREATE TABLE t (a INT, b INT, c INT);
INSERT INTO t SELECT i/1000, 2*(i/1000), 3*(i/1000)
FROM generate_series(1, 1000000) s(i);
ANALYZE t;
-- no multivariate stats (so assumption of independence)
EXPLAIN ANALYZE select * from t where a = 1 and b = 2 and c = 3;
Seq Scan on t (cost=0.00..22906.00 rows=1 width=12)
(actual time=0.290..59.120 rows=1000 loops=1)
EXPLAIN ANALYZE select * from t where a = 1 and b = 2 or c = 3;
Seq Scan on t (cost=0.00..22906.00 rows=966 width=12)
(actual time=0.434..117.643 rows=1000 loops=1)
EXPLAIN ANALYZE select * from t where a = 1 and b = 2 or c = 6;
Seq Scan on t (cost=0.00..22906.00 rows=966 width=12)
(actual time=0.433..96.956 rows=2000 loops=1)
-- now let's add a histogram
ALTER TABLE t ADD STATISTICS (histogram) on (a,b,c);
ANALYZE t;
EXPLAIN ANALYZE select * from t where a = 1 and b = 2 and c = 3;
Seq Scan on t (cost=0.00..22906.00 rows=817 width=12)
(actual time=0.268..116.318 rows=1000 loops=1)
EXPLAIN ANALYZE select * from t where a = 1 and b = 2 or c = 3;
Seq Scan on t (cost=0.00..22906.00 rows=30333 width=12)
(actual time=0.435..93.232 rows=1000 loops=1)
EXPLAIN ANALYZE select * from t where a = 1 and b = 2 or c = 6;
Seq Scan on t (cost=0.00..22906.00 rows=30333 width=12)
(actual time=0.434..122.930 rows=2000 loops=1)
-- now let's use a MCV list
ALTER TABLE t DROP STATISTICS ALL;
ALTER TABLE t ADD STATISTICS (mcv) on (a,b,c);
ANALYZE t;
EXPLAIN ANALYZE select * from t where a = 1 and b = 2 and c = 3;
Seq Scan on t (cost=0.00..22906.00 rows=767 width=12)
(actual time=0.268..70.604 rows=1000 loops=1)
EXPLAIN ANALYZE select * from t where a = 1 and b = 2 or c = 3;
Seq Scan on t (cost=0.00..22906.00 rows=767 width=12)
(actual time=0.268..70.604 rows=1000 loops=1)
EXPLAIN ANALYZE select * from t where a = 1 and b = 2 or c = 6;
Seq Scan on t (cost=0.00..22906.00 rows=1767 width=12)
(actual time=0.428..100.607 rows=2000 loops=1)
The default estimate of the AND query is rather bad. For the OR
clauses, it's not that bad (OR selectivity is less sensitive to the
dependency, but it's not difficult to construct counter-examples).
The histogram is not that good - for the OR queries it often results in
over-estimates (for equality conditions on discrete data).
But the MCV estimates are very accurate. The slight under-estimate is
probably caused by the block sampling we're using to get sample rows.
2) continuous data (I'll only show histograms)
CREATE TABLE t (a FLOAT, b FLOAT, c FLOAT);
INSERT INTO t SELECT r,
r + r*(random() - 0.5)/2,
r + r*(random() - 0.5)/2
FROM (SELECT random() as r
FROM generate_series(1,1000000)) foo;
ANALYZE t;
-- no multivariate stats
EXPLAIN ANALYZE select * from t where a < 0.3 and b < 0.3 and c < 0.3;
Seq Scan on t (cost=0.00..23870.00 rows=28768 width=24)
(actual time=0.026..323.383 rows=273897 loops=1)
EXPLAIN ANALYZE select * from t where a < 0.3 and b < 0.3 or c < 0.3;
Seq Scan on t (cost=0.00..23870.00 rows=372362 width=24)
(actual time=0.026..375.005 rows=317533 loops=1)
EXPLAIN ANALYZE select * from t where a < 0.3 and b < 0.3 or c > 0.9;
Seq Scan on t (cost=0.00..23870.00 rows=192979 width=24)
(actual time=0.026..431.376 rows=393528 loops=1)
-- histograms
ALTER TABLE t ADD STATISTICS (histogram) on (a,b,c);
ANALYZE t;
EXPLAIN ANALYZE select * from t where a < 0.3 and b < 0.3 and c < 0.3;
Seq Scan on t (cost=0.00..23870.00 rows=267033 width=24)
(actual time=0.021..330.487 rows=273897 loops=1)
EXPLAIN ANALYZE select * from t where a < 0.3 and b < 0.3 or c > 0.3;
Seq Scan on t (cost=0.00..23870.00 rows=14317 width=24)
(actual time=0.027..906.321 rows=966870 loops=1)
EXPLAIN ANALYZE select * from t where a < 0.3 and b < 0.3 or c > 0.9;
Seq Scan on t (cost=0.00..23870.00 rows=20367 width=24)
(actual time=0.028..452.494 rows=393528 loops=1)
This seems wrong, because the estimate for the OR queries should not be
lower than the estimate for the first query (with just AND), and it
should not increase when increasing the boundary. I'd bet this is a bug
in how the inequalities are handled with histograms, or how the AND/OR
clauses are combined. I'll look into that.
But once again, there's nothing that would make OR clauses somehow
incompatible with multivariate stats.
kind regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Hello Horiguchi-san!
On 07/07/2015 09:43 PM, Tomas Vondra wrote:
-- histograms
ALTER TABLE t ADD STATISTICS (histogram) on (a,b,c);
ANALYZE t;
EXPLAIN ANALYZE select * from t where a < 0.3 and b < 0.3 and c < 0.3;
Seq Scan on t (cost=0.00..23870.00 rows=267033 width=24)
(actual time=0.021..330.487 rows=273897 loops=1)
EXPLAIN ANALYZE select * from t where a < 0.3 and b < 0.3 or c > 0.3;
Seq Scan on t (cost=0.00..23870.00 rows=14317 width=24)
(actual time=0.027..906.321 rows=966870 loops=1)
EXPLAIN ANALYZE select * from t where a < 0.3 and b < 0.3 or c > 0.9;
Seq Scan on t (cost=0.00..23870.00 rows=20367 width=24)
(actual time=0.028..452.494 rows=393528 loops=1)
This seems wrong, because the estimate for the OR queries should not be
lower than the estimate for the first query (with just AND), and it
should not increase when increasing the boundary. I'd bet this is a bug
in how the inequalities are handled with histograms, or how the AND/OR
clauses are combined. I'll look into that.
FWIW this was a stupid bug in update_match_bitmap_histogram(), which
initially handled only AND clauses, and thus assumed the "match" of a
bucket can only decrease. But for OR clauses this is exactly the
opposite (we assume no buckets match and add buckets matching at least
one of the clauses).
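In pseudo-code, the fix amounts to merging the per-bucket match levels
differently for the two cases (a simplified sketch with made-up array
names; the actual code uses the UPDATE_RESULT macro with the same
MIN/MAX logic):
int     i;
for (i = 0; i < nbuckets; i++)
{
    if (is_or)
        matches[i] = Max(matches[i], clause_matches[i]);   /* OR grows */
    else
        matches[i] = Min(matches[i], clause_matches[i]);   /* AND shrinks */
}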
With this fixed, the estimates look like this:
EXPLAIN ANALYZE select * from t where a < 0.3 and b < 0.3 and c < 0.3;
Seq Scan on t (cost=0.00..23870.00 rows=267033 width=24)
(actual time=0.102..321.524 rows=273897 loops=1)
EXPLAIN ANALYZE select * from t where a < 0.3 and b < 0.3 or c < 0.3;
Seq Scan on t (cost=0.00..23870.00 rows=319400 width=24)
(actual time=0.103..386.089 rows=317533 loops=1)
EXPLAIN ANALYZE select * from t where a < 0.3 and b < 0.3 or c > 0.3;
Seq Scan on t (cost=0.00..23870.00 rows=956833 width=24)
(actual time=0.133..908.455 rows=966870 loops=1)
EXPLAIN ANALYZE select * from t where a < 0.3 and b < 0.3 or c > 0.9;
Seq Scan on t (cost=0.00..23870.00 rows=393633 width=24)
(actual time=0.105..440.607 rows=393528 loops=1)
IMHO pretty accurate estimates - no issue with OR clauses.
I've pushed this to github [1] but I need to do some additional fixes. I
also had to remove some optimizations while fixing this, and will have
to reimplement those.
That's not to say that the handling of OR-clauses is perfectly correct.
After looking at clauselist_selectivity_or(), I believe it's a bit
broken and will need a bunch of fixes, as explained in the FIXMEs I
pushed to github.
[1]: https://github.com/tvondra/postgres/tree/mvstats
kind regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Hi, thanks for the detailed explanation. I misunderstood the
code (to be more honest, I didn't look very closely there). Then I
looked at it more closely.
At Wed, 08 Jul 2015 03:03:16 +0200, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote in <559C76D4.2030805@2ndquadrant.com>
FWIW this was a stupid bug in update_match_bitmap_histogram(), which
initially handled only AND clauses, and thus assumed the "match" of a
bucket can only decrease. But for OR clauses this is exactly the
opposite (we assume no buckets match and add buckets matching at least
one of the clauses).
With this fixed, the estimates look like this:
IMHO pretty accurate estimates - no issue with OR clauses.
Ok, I understood the difference between what I thought and what
you say. The code is actually conscious of OR clauses, but it looks
somewhat confused.
Currently, choosing mv stats in clauselist_selectivity can be
outlined as follows:
1. find_stats finds candidate mv stats containing *all*
attributes that appear in the whole clause list, regardless of
AND/OR exprs, by walking the whole clause tree.
Perhaps this is a measure for early bailout.
2.1. Within every disjunction element, collect mv-related
attributes while checking whether all the leaf nodes (binop or
ifnull) are compatible, by (eventually) walking the whole
clause tree.
2.2. Check if all the collected attributes are contained in
mv-stats columns.
3. Finally, clauseset_mv_selectivity_histogram() (and others).
This function applies every ExprOp to every attribute in
every histogram bucket and (tries to) perform the boolean
operation on the result bitmaps.
I have some comments on the implementation, and I am also trying to
find solutions for them.
1. The flow above looks like it does very similar things repeatedly.
2. I believe what the current code does can be simplified.
3. As you mentioned in comments, some additional infrastructure is
needed.
After all, I think what we should do next is as follows,
as a first step.
- Add a means to judge the selectivity operator(?) by something
other than the oprrest of the ExprOp's operator. (You already
missed neqsel.)
I suppose one solution for this is adding oprmvstats, taking
'm' (MCV), 'h' (histogram) and 'f' (functional dependencies) and
their combinations. Or, for convenience, it could be a fixed-length
string like this:
oprname | oprmvstats
=       | 'mhf'
<>      | 'mhf'
<       | 'mh-'
>       | 'mh-'
>=      | 'mh-'
<=      | 'mh-'
This would make the code in clause_is_mv_compatible like this:
oprmvstats = get_mvstatsset(expr->opno); /* bitwise representation */
if (oprmvstats & types)
{
    *attnums = bms_add_member(*attnums, var->varattno);
    return true;
}
return false;
- The current design just manages to work, but it is too complicated
and has little affinity with the existing estimation
framework. I previously proposed separating the finding-stats phase
from the calculation phase, but after looking at the patch more
closely, I would now propose a transforming-RestrictInfo (and
finding-mvstats) phase and a phase that runs the transformed
RestrictInfo.
I think transforming RestrictInfo makes the situation
better. Since it needs different information, maybe it is
better to have a new struct, say, RestrictInfoForEstimate
(boo!). Then provide mvstatssel() for use in the new struct.
The rough shape of the code would be like below:
clauselist_selectivity()
{
    ...
    RestrictInfoForEstimate *esclause =
        transformClauseListForEstimation(root, clauses, varRelid);
    ...
    return clause_selectivity(esclause);
}
clause_selectivity(RestrictInfoForEstimate *esclause)
{
    if (IsA(clause, RestrictInfo)) ...
    if (IsA(clause, RestrictInfoForEstimate))
    {
        RestrictInfoForEstimate *ecl = (RestrictInfoForEstimate *) clause;
        if (ecl->selfunc)
        {
            sx = ecl->selfunc(root, ecl);
        }
    }
    if (IsA(clause, Var)) ...
}
transformClauseListForEstimation(...)
{
    ...
    relid = collect_mvstats_info(root, clause, &attlist);
    if (!relid)
        return;
    if (get_mvstats_hook)
        mvstats = (*get_mvstats_hook) (root, relid, attset);
    else
        mvstats = find_mv_stats(root, relid, attset);
}
...
I've pushed this to github [1] but I need to do some additional
fixes. I also had to remove some optimizations while fixing this, and
will have to reimplement those.
That's not to say that the handling of OR-clauses is perfectly
correct. After looking at clauselist_selectivity_or(), I believe it's
a bit broken and will need a bunch of fixes, as explained in the
FIXMEs I pushed to github.
I don't yet see whether it is doable or not, and I suppose you're
unwilling to change the big picture, so I will consider the idea
and show you the result, if it turns out to be possible and
promising.
regards,
--
Kyotaro Horiguchi
NTT Open Source Software Center
Hi,
On 07/13/2015 10:51 AM, Kyotaro HORIGUCHI wrote:
Ok, I understood the difference between what I thought and what you
say. The code is actually conscious of OR clauses, but it looks somewhat
confused.
I'm not sure which part is confused by the OR clauses, but it's
certainly possible. Initially it only handled AND clauses, and the
support for OR clauses was added later, so it's possible some parts are
not behaving correctly.
Currently, choosing mv stats in clauselist_selectivity can be
outlined as follows:
1. find_stats finds candidate mv stats containing *all*
attributes that appear in the whole clause list, regardless of
AND/OR exprs, by walking the whole clause tree.
Perhaps this is a measure for early bailout.
Not entirely. The goal of find_stats() is to look up all stats on the
'current' relation - it's coded the way it is because I had to deal with
varRelid=0 cases, in which case I have to inspect the Var nodes. But
maybe I got this wrong and there's a much simpler way to do that?
It is an early bailout in the sense that if there are no multivariate
stats defined on the table, there's no point in doing any of the
following steps. So that we don't increase planning times for users not
using multivariate stats.
2.1. Within every disjunction element, collect mv-related
attributes while checking whether all the leaf nodes (binop or
ifnull) are compatible, by (eventually) walking the whole
clause tree.
Generally, yes. The idea is to check whether there are clauses that
might be estimated using multivariate statistics, and whether the
clauses reference at least two different attributes. Imagine a query
like this:
SELECT * FROM t WHERE (a=1) AND (a>0) AND (a<100)
It makes no sense to process this using multivariate statistics, because
all the Var nodes reference a single attribute.
Similarly, the check is not just about the leaf nodes - to be able to
estimate a clause at this point, we have to be able to process the whole
tree, starting from the top-level clause. Although maybe that's no
longer true, now that support for OR clauses was added ... I wonder
whether there are other BoolExpr-like nodes that might make the tree
incompatible with multivariate statistics (in the sense that the current
implementation does not know how to handle them).
Also note that even though the clause may be "incompatible" at this
level, it may get partially processed by multivariate statistics later.
For example with a query:
SELECT * FROM t WHERE (a=1 OR b=2 OR c ~* 'xyz') AND (q=1 OR r=4)
the first condition is "incompatible" because it contains the
unsupported operator '~*', but it will eventually be processed as a
BoolExpr node, and
should be split into two parts - (a=1 OR b=2) which is compatible, and
(c ~* 'xyz') which is incompatible.
This split should happen in clauselist_selectivity_or(), and the other
interesting thing is that it uses (q=1 OR r=4) as a
condition. So if there's a statistic built on (a,b,q,r), we'll compute
the conditional probability
P(a=1,b=2 | q=1,r=4)
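(Spelled out, that is just the usual definition of conditional
probability:
    P(a=1,b=2 | q=1,r=4) = P(a=1,b=2,q=1,r=4) / P(q=1,r=4)
with both the numerator and the denominator estimated from the
(a,b,q,r) statistics.)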
2.2. Check if all the collected attributes are contained in
mv-stats columns.
No, I think you got this wrong. We do not check that *all* the
attributes are contained in mvstats columns - we only need two such
columns (then there's a chance that the multivariate statistics will get
applied).
Anyway, both 2.1 and 2.2 are meant as a quick bailout, before doing the
most expensive part, which is choose_mv_statistics(). Which is however
missing in this list.
3. Finally, clauseset_mv_selectivity_histogram() (and others).
This function applies every ExprOp to every attribute in
every histogram bucket and (tries to) perform the boolean
operation on the result bitmaps.
Yes, but this only happens after choose_mv_statistics(), because that's
the code that decides which statistics will be used and in what order.
The list is also missing handling of the 'functional dependencies', so a
complete list of steps would look like this:
1) find_stats - lookup stats on the current relation (from RelOptInfo)
2) apply functional dependencies
a) check if there are equality clauses that may be reduced using
functional dependencies, referencing at least two columns
b) if yes, perform the clause reduction
3) apply MCV lists and histograms
a) check if there are clauses 'compatible' with those types of
statistics, again containing at least two columns
b) if yes, use choose_mv_statistics() to decide which statistics to
apply and in which order
c) apply the selected histograms and MCV lists
4) estimate the remaining clauses using the regular statistics
a) this is where clauselist_mv_selectivity_histogram and the others
are called
I tried to explain this in the comment before clauselist_selectivity(),
but maybe it's not detailed enough / missing some important details.
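Condensed into pseudo-C, the flow looks roughly like this (a sketch
only - the function names and signatures match the patch, but the
control flow is heavily simplified):
Selectivity
clauselist_selectivity(PlannerInfo *root, List *clauses, int varRelid,
                       JoinType jointype, SpecialJoinInfo *sjinfo)
{
    Index   relid;
    /* (1) look up the stats on the current relation */
    List   *stats = find_stats(root, clauses, varRelid, &relid);
    if (stats != NIL && list_length(clauses) >= 2)
    {
        /* (2) reduce equality clauses using functional dependencies */
        clauses = clauselist_apply_dependencies(root, clauses, varRelid,
                                                stats, sjinfo);
        /* (3) choose the best combination of MCV lists / histograms
         * and estimate the clauses covered by them */
        ... choose_mv_statistics(root, stats, clauses, ...) ...
    }
    /* (4) estimate the remaining clauses using per-column statistics */
    ...
}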
I have some comments on the implementation, and I am also trying to
find solutions for them.
1. The flow above looks like it does very similar things repeatedly.
I worked hard to remove such code duplication, and believe all the
current steps are necessary - for example 2(a) and 3(a) may seem
similar, but it's really necessary to do that twice.
2. I believe what the current code does can be simplified.
Possibly.
3. As you mentioned in comments, some additional infrastructure is
needed.
After all, I think what we should do next is as follows,
as a first step.
OK.
- Add a means to judge the selectivity operator(?) by something
other than the oprrest of the ExprOp's operator. (You already
missed neqsel.)
Yes, the way we use 'oprno' to determine how to estimate the selectivity
is a bit awkward. It's inspired by the handling of range queries, and
having
something better would be nice.
But I don't think this is the reason why I missed neqsel, and I don't
see this as a significant design issue at this point. But if we can come
up with a better solution, why not ...
I suppose one solution for this is adding oprmvstats, taking
'm', 'h' and 'f' and their combinations. Or, for convenience, it
could be a fixed-length string like this:
oprname | oprmvstats
=       | 'mhf'
<>      | 'mhf'
<       | 'mh-'
>       | 'mh-'
>=      | 'mh-'
<=      | 'mh-'
This would make the code in clause_is_mv_compatible like this:
oprmvstats = get_mvstatsset(expr->opno); /* bitwise representation */
if (oprmvstats & types)
{
    *attnums = bms_add_member(*attnums, var->varattno);
    return true;
}
return false;
So this only determines the compatibility of operators with respect to
different types of statistics? How does that solve the neqsel case? It
will probably decide the clause is compatible, but it will later fail at
the actual estimation, no?
- The current design just manages to work, but it is too complicated
and has little affinity with the existing estimation
framework.
I respectfully disagree. I've strived to make it fit the current
implementation as closely as possible - maybe it's possible to improve
that, but I believe there's a natural difference between the two types
of statistics. It may be somewhat simplified, but it will never be
exactly the same.
I previously proposed separating the finding-stats phase from the
calculation phase, but after looking at the patch more closely,
I would now propose a transforming-RestrictInfo (and
finding-mvstats) phase and a phase that runs the transformed
RestrictInfo.
Those phases are already separated, as is illustrated by the steps
explained above.
So technically we might place a hook either right after the find_stats()
call, so that it's possible to process all the stats on the table, or
maybe after the choose_mv_statistics() call, so that we only process the
actually used stats.
I think transforming RestrictInfo makes the situation
better. Since it needs different information, maybe it is
better to have a new struct, say, RestrictInfoForEstimate
(boo!). Then provide mvstatssel() for use in the new struct.
The rough shape of the code would be like below:
clauselist_selectivity()
{
    ...
    RestrictInfoForEstimate *esclause =
        transformClauseListForEstimation(root, clauses, varRelid);
    ...
    return clause_selectivity(esclause);
}
clause_selectivity(RestrictInfoForEstimate *esclause)
{
    if (IsA(clause, RestrictInfo)) ...
    if (IsA(clause, RestrictInfoForEstimate))
    {
        RestrictInfoForEstimate *ecl = (RestrictInfoForEstimate *) clause;
        if (ecl->selfunc)
        {
            sx = ecl->selfunc(root, ecl);
        }
    }
    if (IsA(clause, Var)) ...
}
transformClauseListForEstimation(...)
{
    ...
    relid = collect_mvstats_info(root, clause, &attlist);
    if (!relid)
        return;
    if (get_mvstats_hook)
        mvstats = (*get_mvstats_hook) (root, relid, attset);
    else
        mvstats = find_mv_stats(root, relid, attset);
}
...
So you'd transform the clause tree first, replacing parts of the tree
(to be estimated by multivariate stats) by a new node type? That's an
interesting idea, I think ...
I can't really say whether it's a good approach, though. Can you explain
why you think it'd make the situation better?
The one benefit I can think of is being able to look at the processed
tree and see which parts will be estimated using multivariate stats.
But we'd effectively have to do the same stuff (choosing the stats,
...), and if we move this pre-processing before clauselist_selectivity
(I assume that's the point), we'd end up repeating a lot of the code. Or
maybe not, I'm not sure.
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Hi, I'd like to show you the modified structure of the
multivariate statistics application logic. Please find the
patches attached. They apply on top of your v7 patch.
The code that finds mv-applicable clauses is moved out of the main
flow of clauselist_selectivity. As I said in the previous mail,
the new function transformRestrictInfoForEstimate (a bad name,
but it's just for the PoC :) scans the clause list and generates a
RestrictStatData struct which drives the mv-aware selectivity
calculation. This struct isolates MV from non-MV estimation.
The struct RestrictStatData mainly consists of the following
three parts (see the sketch below):
- the clause to be estimated by the current logic (MV is not applicable),
- the clause to be estimated by MV-statistics,
- a list of child RestrictStatDatas, which are to be processed
recursively.
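Reconstructed from this description, the struct might look roughly
like the following - the field names are my guesses for illustration;
see the attached patch for the real definition:
typedef struct RestrictStatData
{
    NodeTag          type;
    List            *plainclauses;  /* estimated by the current logic */
    List            *mvclauses;     /* estimated using mv statistics */
    MVStatisticInfo *mvstats;       /* stats chosen for mvclauses */
    List            *children;      /* child RestrictStatData nodes,
                                     * processed recursively */
} RestrictStatData;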
mvclause_selectivty() is the topmost function where mv stats
work. This structure effectively prevents the main estimation flow
from being broken by modifications to the mvstats part. Although I
haven't measured it, I'm positive the code is much smaller than yours.
I attached two patches to this message. The first one rebases the
v7 patch onto the current (maybe) master, and the second applies
the refactoring.
I'm a little anxious about performance, but I think this makes the
process of applying mv-stats far clearer. Regression tests for mvstats
succeed as-is, except for fdep, which is not implemented in this
patch.
What do you think about this?
regards,
--
Kyotaro Horiguchi
NTT Open Source Software Center
Attachments:
0001-rebase-v7-patch-to-current-master.patch (text/x-patch)
From bd5a497a8eaa3276f4491537d2633268de079b18 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi <horiguchi.kyotaro@lab.ntt.co.jp>
Date: Mon, 6 Jul 2015 17:42:36 +0900
Subject: [PATCH 1/2] rebase v7 patch to current master
---
src/backend/nodes/nodeFuncs.c | 1 +
src/include/catalog/pg_proc.h | 4 ++--
2 files changed, 3 insertions(+), 2 deletions(-)
diff --git a/src/backend/nodes/nodeFuncs.c b/src/backend/nodes/nodeFuncs.c
index 9932c8c..115ff98 100644
--- a/src/backend/nodes/nodeFuncs.c
+++ b/src/backend/nodes/nodeFuncs.c
@@ -1996,6 +1996,7 @@ expression_tree_walker(Node *node,
case T_RangeTblFunction:
return walker(((RangeTblFunction *) node)->funcexpr, context);
case T_RestrictInfo:
+ elog(LOG, "HOGEEEEE: RestrictInfo");
return walker(((RestrictInfo *) node)->clause, context);
default:
elog(ERROR, "unrecognized node type: %d",
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
index 7810f97..b1e78a8 100644
--- a/src/include/catalog/pg_proc.h
+++ b/src/include/catalog/pg_proc.h
@@ -2735,9 +2735,9 @@ DESCR("current user privilege on any column by rel name");
DATA(insert OID = 3029 ( has_any_column_privilege PGNSP PGUID 12 10 0 0 0 f f f f t f s 2 0 16 "26 25" _null_ _null_ _null_ _null_ _null_ has_any_column_privilege_id _null_ _null_ _null_ ));
DESCR("current user privilege on any column by rel oid");
-DATA(insert OID = 3307 ( pg_mv_stats_dependencies_info PGNSP PGUID 12 1 0 0 0 f f f f t f i 1 0 25 "17" _null_ _null_ _null_ _null_ _null_ pg_mv_stats_dependencies_info _null_ _null_ _null_ ));
+DATA(insert OID = 3311 ( pg_mv_stats_dependencies_info PGNSP PGUID 12 1 0 0 0 f f f f t f i 1 0 25 "17" _null_ _null_ _null_ _null_ _null_ pg_mv_stats_dependencies_info _null_ _null_ _null_ ));
DESCR("multivariate stats: functional dependencies info");
-DATA(insert OID = 3308 ( pg_mv_stats_dependencies_show PGNSP PGUID 12 1 0 0 0 f f f f t f i 1 0 25 "17" _null_ _null_ _null_ _null_ _null_ pg_mv_stats_dependencies_show _null_ _null_ _null_ ));
+DATA(insert OID = 3312 ( pg_mv_stats_dependencies_show PGNSP PGUID 12 1 0 0 0 f f f f t f i 1 0 25 "17" _null_ _null_ _null_ _null_ _null_ pg_mv_stats_dependencies_show _null_ _null_ _null_ ));
DESCR("multivariate stats: functional dependencies show");
DATA(insert OID = 3376 ( pg_mv_stats_mcvlist_info PGNSP PGUID 12 1 0 0 0 f f f f t f i 1 0 25 "17" _null_ _null_ _null_ _null_ _null_ pg_mv_stats_mcvlist_info _null_ _null_ _null_ ));
DESCR("multi-variate statistics: MCV list info");
--
1.8.3.1
0002-PoC-Planner-part-refactoring-of-mv-stats-facility.patch (text/x-patch)
From 77ccd9c8d455a365b2ad6eb779ed76da0d431e3f Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi <horiguchi.kyotaro@lab.ntt.co.jp>
Date: Thu, 16 Jul 2015 13:56:58 +0900
Subject: [PATCH 2/2] PoC: Planner part refactoring of mv stats facility
---
src/backend/catalog/pg_operator.c | 6 +
src/backend/nodes/nodeFuncs.c | 2 +-
src/backend/optimizer/path/clausesel.c | 4107 +++++++-------------------------
src/backend/utils/cache/lsyscache.c | 40 +
src/include/catalog/pg_operator.h | 1550 ++++++------
src/include/nodes/nodes.h | 1 +
src/include/nodes/relation.h | 22 +-
src/include/optimizer/cost.h | 6 +-
src/include/utils/lsyscache.h | 1 +
src/include/utils/mvstats.h | 3 +
10 files changed, 1668 insertions(+), 4070 deletions(-)
diff --git a/src/backend/catalog/pg_operator.c b/src/backend/catalog/pg_operator.c
index 072f530..dea39d3 100644
--- a/src/backend/catalog/pg_operator.c
+++ b/src/backend/catalog/pg_operator.c
@@ -251,6 +251,9 @@ OperatorShellMake(const char *operatorName,
values[Anum_pg_operator_oprrest - 1] = ObjectIdGetDatum(InvalidOid);
values[Anum_pg_operator_oprjoin - 1] = ObjectIdGetDatum(InvalidOid);
+ /* XXXX: How this should be implemented? */
+ values[Anum_pg_operator_oprmvstat - 1] = CStringGetTextDatum("---");
+
/*
* open pg_operator
*/
@@ -508,6 +511,9 @@ OperatorCreate(const char *operatorName,
values[Anum_pg_operator_oprrest - 1] = ObjectIdGetDatum(restrictionId);
values[Anum_pg_operator_oprjoin - 1] = ObjectIdGetDatum(joinId);
+ /* XXXX: How this should be implemented? */
+ values[Anum_pg_operator_oprmvstat - 1] = CStringGetTextDatum("---");
+
pg_operator_desc = heap_open(OperatorRelationId, RowExclusiveLock);
/*
diff --git a/src/backend/nodes/nodeFuncs.c b/src/backend/nodes/nodeFuncs.c
index 115ff98..00ef04b 100644
--- a/src/backend/nodes/nodeFuncs.c
+++ b/src/backend/nodes/nodeFuncs.c
@@ -1996,7 +1996,7 @@ expression_tree_walker(Node *node,
case T_RangeTblFunction:
return walker(((RangeTblFunction *) node)->funcexpr, context);
case T_RestrictInfo:
- elog(LOG, "HOGEEEEE: RestrictInfo");
+// elog(LOG, "HOGEEEEE: RestrictInfo");
return walker(((RestrictInfo *) node)->clause, context);
default:
elog(ERROR, "unrecognized node type: %d",
diff --git a/src/backend/optimizer/path/clausesel.c b/src/backend/optimizer/path/clausesel.c
index fce77ec..61f3cd8 100644
--- a/src/backend/optimizer/path/clausesel.c
+++ b/src/backend/optimizer/path/clausesel.c
@@ -46,13 +46,6 @@ typedef struct RangeQueryClause
Selectivity hibound; /* Selectivity of a var < something clause */
} RangeQueryClause;
-static Selectivity clauselist_selectivity_or(PlannerInfo *root,
- List *clauses,
- int varRelid,
- JoinType jointype,
- SpecialJoinInfo *sjinfo,
- List *conditions);
-
static void addRangeClause(RangeQueryClause **rqlist, Node *clause,
bool varonleft, bool isLTsel, Selectivity s2);
@@ -60,38 +53,6 @@ static void addRangeClause(RangeQueryClause **rqlist, Node *clause,
#define MV_CLAUSE_TYPE_MCV 0x02
#define MV_CLAUSE_TYPE_HIST 0x04
-static bool clause_is_mv_compatible(PlannerInfo *root, Node *clause, Oid varRelid,
- Index *relid, Bitmapset **attnums, SpecialJoinInfo *sjinfo,
- int type);
-
-static Bitmapset *collect_mv_attnums(PlannerInfo *root, List *clauses,
- Oid varRelid, Index *relid, SpecialJoinInfo *sjinfo,
- int type);
-
-static Bitmapset *clause_mv_get_attnums(PlannerInfo *root, Node *clause);
-
-static List *clauselist_apply_dependencies(PlannerInfo *root, List *clauses,
- Oid varRelid, List *stats,
- SpecialJoinInfo *sjinfo);
-
-static List *clauselist_mv_split(PlannerInfo *root, SpecialJoinInfo *sjinfo,
- List *clauses, Oid varRelid,
- List **mvclauses, MVStatisticInfo *mvstats, int types);
-
-static Selectivity clauselist_mv_selectivity(PlannerInfo *root,
- MVStatisticInfo *mvstats, List *clauses,
- List *conditions, bool is_or);
-
-static Selectivity clauselist_mv_selectivity_mcvlist(PlannerInfo *root,
- MVStatisticInfo *mvstats,
- List *clauses, List *conditions,
- bool is_or, bool *fullmatch,
- Selectivity *lowsel);
-static Selectivity clauselist_mv_selectivity_histogram(PlannerInfo *root,
- MVStatisticInfo *mvstats,
- List *clauses, List *conditions,
- bool is_or);
-
static int update_match_bitmap_mcvlist(PlannerInfo *root, List *clauses,
int2vector *stakeys, MCVList mcvlist,
int nmatches, char * matches,
@@ -104,79 +65,11 @@ static int update_match_bitmap_histogram(PlannerInfo *root, List *clauses,
int nmatches, char * matches,
bool is_or);
-/*
- * Describes a combination of multiple statistics to cover attributes
- * referenced by the clauses. The array 'stats' (with nstats elements)
- * lists attributes (in the order as they are applied), and number of
- * clause attributes covered by this solution.
- *
- * choose_mv_statistics_exhaustive() uses this to track both the current
- * and the best solutions, while walking through the state of possible
- * combination.
- */
-typedef struct mv_solution_t {
- int nclauses; /* number of clauses covered */
- int nconditions; /* number of conditions covered */
- int nstats; /* number of stats applied */
- int *stats; /* stats (in the apply order) */
-} mv_solution_t;
-
-static List *choose_mv_statistics(PlannerInfo *root,
- List *mvstats,
- List *clauses, List *conditions,
- Oid varRelid,
- SpecialJoinInfo *sjinfo);
-
-static List *filter_clauses(PlannerInfo *root, Oid varRelid,
- SpecialJoinInfo *sjinfo, int type,
- List *stats, List *clauses,
- Bitmapset **attnums);
-
-static List *filter_stats(List *stats, Bitmapset *new_attnums,
- Bitmapset *all_attnums);
-
-static Bitmapset **make_stats_attnums(MVStatisticInfo *mvstats,
- int nmvstats);
-
-static MVStatisticInfo *make_stats_array(List *stats, int *nmvstats);
-
-static List* filter_redundant_stats(List *stats,
- List *clauses, List *conditions);
-
-static Node** make_clauses_array(List *clauses, int *nclauses);
-
-static Bitmapset ** make_clauses_attnums(PlannerInfo *root, Oid varRelid,
- SpecialJoinInfo *sjinfo, int type,
- Node **clauses, int nclauses);
-
-static bool* make_cover_map(Bitmapset **stats_attnums, int nmvstats,
- Bitmapset **clauses_attnums, int nclauses);
-
-static bool has_stats(List *stats, int type);
-
-static List * find_stats(PlannerInfo *root, List *clauses,
- Oid varRelid, Index *relid);
-
-static Bitmapset* fdeps_collect_attnums(List *stats);
-
-static int *make_idx_to_attnum_mapping(Bitmapset *attnums);
-static int *make_attnum_to_idx_mapping(Bitmapset *attnums);
-
-static bool *build_adjacency_matrix(List *stats, Bitmapset *attnums,
- int *idx_to_attnum, int *attnum_to_idx);
-
-static void multiply_adjacency_matrix(bool *matrix, int natts);
-
static List* fdeps_reduce_clauses(List *clauses,
Bitmapset *attnums, bool *matrix,
int *idx_to_attnum, int *attnum_to_idx,
Index relid);
-static Bitmapset *fdeps_filter_clauses(PlannerInfo *root,
- List *clauses, Bitmapset *deps_attnums,
- List **reduced_clauses, List **deps_clauses,
- Oid varRelid, Index *relid, SpecialJoinInfo *sjinfo);
-
static Bitmapset * get_varattnos(Node * node, Index relid);
int mvstat_search_type = MVSTAT_SEARCH_GREEDY;
@@ -188,397 +81,41 @@ int mvstat_search_type = MVSTAT_SEARCH_GREEDY;
#define UPDATE_RESULT(m,r,isor) \
(m) = (isor) ? (MAX(m,r)) : (MIN(m,r))
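+/*
+ * Result of applying multivariate statistics to a clause (a sketch of the
+ * protocol, as used by the functions below): NORMAL means an ordinary
+ * estimate, FULL_MATCH that a full equality match was found in the MCV
+ * list (so the estimate is considered exact), and FAILURE that the
+ * statistics could not be applied to the clause at all.
+ */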
+typedef enum mv_selec_status
+{
+ NORMAL,
+ FULL_MATCH,
+ FAILURE
+} mv_selec_status;
+
/****************************************************************************
* ROUTINES TO COMPUTE SELECTIVITIES
****************************************************************************/
+/*
+ * Transforms a list of RestrictInfos into a tree of RestrictStatData
+ * nodes, so that multivariate statistics can be applied to the covered
+ * clauses (XXX this prototype should eventually move to a header file).
+ */
+RestrictStatData *
+transformRestrictInfoForEstimate(PlannerInfo *root, List *clauses,
+								 int varRelid, SpecialJoinInfo *sjinfo);
/*
- * clauselist_selectivity -
- * Compute the selectivity of an implicitly-ANDed list of boolean
- * expression clauses. The list can be empty, in which case 1.0
- * must be returned. List elements may be either RestrictInfos
- * or bare expression clauses --- the former is preferred since
- * it allows caching of results.
- *
- * See clause_selectivity() for the meaning of the additional parameters.
- *
- * Our basic approach is to take the product of the selectivities of the
- * subclauses. However, that's only right if the subclauses have independent
- * probabilities, and in reality they are often NOT independent. So,
- * we want to be smarter where we can.
- *
- * Currently, the only extra smarts we have is to recognize "range queries",
- * such as "x > 34 AND x < 42". Clauses are recognized as possible range
- * query components if they are restriction opclauses whose operators have
- * scalarltsel() or scalargtsel() as their restriction selectivity estimator.
- * We pair up clauses of this form that refer to the same variable. An
- * unpairable clause of this kind is simply multiplied into the selectivity
- * product in the normal way. But when we find a pair, we know that the
- * selectivities represent the relative positions of the low and high bounds
- * within the column's range, so instead of figuring the selectivity as
- * hisel * losel, we can figure it as hisel + losel - 1. (To visualize this,
- * see that hisel is the fraction of the range below the high bound, while
- * losel is the fraction above the low bound; so hisel can be interpreted
- * directly as a 0..1 value but we need to convert losel to 1-losel before
- * interpreting it as a value. Then the available range is 1-losel to hisel.
- * However, this calculation double-excludes nulls, so really we need
- * hisel + losel + null_frac - 1.)
- *
- * If either selectivity is exactly DEFAULT_INEQ_SEL, we forget this equation
- * and instead use DEFAULT_RANGE_INEQ_SEL. The same applies if the equation
- * yields an impossible (negative) result.
- *
- * A free side-effect is that we can recognize redundant inequalities such
- * as "x < 4 AND x < 5"; only the tighter constraint will be counted.
- *
- * Of course this is all very dependent on the behavior of
- * scalarltsel/scalargtsel; perhaps some day we can generalize the approach.
- *
- *
- * Multivariate statististics
- * --------------------------
- * This also uses multivariate stats to estimate combinations of
- * conditions, in a way (a) maximizing the estimate accuracy by using
- * as many stats as possible, and (b) minimizing the overhead,
- * especially when there are no suitable multivariate stats (so if you
- * are not using multivariate stats, there's no additional overhead).
- *
- * The following checks are performed (in this order), and the optimizer
- * falls back to regular stats on the first 'false'.
- *
- * NOTE: This explains how this works with all the patches applied, not
- * just the functional dependencies.
- *
- * (0) check if there are multivariate stats on the relation
- *
- * If no, just skip all the following steps (directly to the
- * original code).
- *
- * (1) check how many attributes are there in conditions compatible
- * with functional dependencies
- *
- * Only simple equality clauses are considered compatible with
- * functional dependencies (and that's unlikely to change, because
- * that's the only case when functional dependencies are useful).
- *
- * If there are no conditions that might be handled by multivariate
- * stats, or if the conditions reference just a single column, it
- * makes no sense to use functional dependencies, so skip to (4).
- *
- * (2) reduce the clauses using functional dependencies
- *
- * This simply attempts to 'reduce' the clauses by applying functional
- * dependencies. For example if there are two clauses:
- *
- * WHERE (a = 1) AND (b = 2)
- *
- * and we know that 'a' determines the value of 'b', we may remove
- * the second condition (b = 2) when computing the selectivity.
- * This is of course tricky - see mvstats/dependencies.c for details.
- *
- * After the reduction, step (1) is to be repeated.
- *
- * (3) check how many attributes are there in conditions compatible
- * with MCV lists and histograms
- *
- * What conditions are compatible with multivariate stats is decided
- * by clause_is_mv_compatible(). At this moment, only conditions
- * of the form "column operator constant" (for simple comparison
- * operators), IS [NOT] NULL and some AND/OR clauses are considered
- * compatible with multivariate statistics.
- *
- * Again, see clause_is_mv_compatible() for details.
- *
- * (4) check how many attributes are there in conditions compatible
- * with MCV lists and histograms
- *
- * If there are no conditions that might be handled by MCV lists
- * or histograms, or if the conditions reference just a single
- * column, it makes no sense to continue, so just skip to (7).
- *
- * (5) choose the stats matching the most columns
- *
- * If there are multiple instances of multivariate statistics (e.g.
- * built on different sets of columns), we choose the stats covering
- * the most columns from step (1). It may happen that all available
- * stats match just a single column - for example with conditions
- *
- * WHERE a = 1 AND b = 2
- *
- * and statistics built on (a,c) and (b,c). In such case just fall
- * back to the regular stats because it makes no sense to use the
- * multivariate statistics.
- *
- * For more details about how exactly we choose the stats, see
- * choose_mv_statistics().
- *
- * (6) use the multivariate stats to estimate matching clauses
- *
- * (7) estimate the remaining clauses using the regular statistics
+ * and_clause_selectivity -
+ *	  Estimate an implicitly-ANDed list of clauses using the regular
+ *	  per-column statistics; multivariate statistics are applied by the
+ *	  callers before we get here.
*/
-Selectivity
-clauselist_selectivity(PlannerInfo *root,
+static Selectivity
+and_clause_selectivity(PlannerInfo *root,
List *clauses,
int varRelid,
JoinType jointype,
- SpecialJoinInfo *sjinfo,
- List *conditions)
+ SpecialJoinInfo *sjinfo)
{
Selectivity s1 = 1.0;
RangeQueryClause *rqlist = NULL;
ListCell *l;
- /* processing mv stats */
- Index relid = InvalidOid;
-
- /* attributes in mv-compatible clauses */
- Bitmapset *mvattnums = NULL;
- List *stats = NIL;
-
- /* use clauses (not conditions), because those are always non-empty */
- stats = find_stats(root, clauses, varRelid, &relid);
-
- /*
- * If there's exactly one clause, then no use in trying to match up
- * pairs, or matching multivariate statistics, so just go directly
- * to clause_selectivity().
- */
- if (list_length(clauses) == 1)
- return clause_selectivity(root, (Node *) linitial(clauses),
- varRelid, jointype, sjinfo, conditions);
-
- /*
- * Check that there are some stats with functional dependencies
- * built (by walking the stats list). We're going to find that
- * anyway when trying to apply the functional dependencies, but
- * this is probably a tad faster.
- */
- if (has_stats(stats, MV_CLAUSE_TYPE_FDEP))
- {
- /*
- * Collect attributes referenced by mv-compatible clauses (looking
- * for clauses compatible with functional dependencies for now).
- */
- mvattnums = collect_mv_attnums(root, clauses, varRelid, &relid, sjinfo,
- MV_CLAUSE_TYPE_FDEP);
-
- /*
- * If there are mv-compatible clauses, referencing at least two
- * different columns (otherwise it makes no sense to use mv stats),
- * try to reduce the clauses using functional dependencies, and
- * recollect the attributes from the reduced list.
- *
- * We don't need to select a single statistics for this - we can
- * apply all the functional dependencies we have.
- */
- if (bms_num_members(mvattnums) >= 2)
- clauses = clauselist_apply_dependencies(root, clauses, varRelid,
- stats, sjinfo);
- }
-
- /*
- * Check that there are statistics with MCV list or histogram.
- * If not, we don't need to waste time with the optimization.
- */
- if (has_stats(stats, MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST))
- {
- /*
- * Recollect attributes from mv-compatible clauses (maybe we've
- * removed so many clauses we have a single mv-compatible attnum).
- * From now on we're only interested in MCV-compatible clauses.
- */
- mvattnums = collect_mv_attnums(root, clauses, varRelid, &relid, sjinfo,
- (MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST));
-
- /*
- * If there still are at least two columns, we'll try to select
- * a suitable combination of multivariate stats. If there are
- * multiple combinations, we'll try to choose the best one.
- * See choose_mv_statistics for more details.
- */
- if (bms_num_members(mvattnums) >= 2)
- {
- int k;
- ListCell *s;
-
- /*
- * Copy the list of conditions, so that we can build a list
- * of local conditions (and keep the original intact, for
- * the other clauses at the same level).
- */
- List *conditions_local = list_copy(conditions);
-
- /* find the best combination of statistics */
- List *solution = choose_mv_statistics(root, stats,
- clauses, conditions,
- varRelid, sjinfo);
-
- /* we have a good solution (list of stats) */
- foreach (s, solution)
- {
- MVStatisticInfo *mvstat = (MVStatisticInfo *)lfirst(s);
-
- /* clauses compatible with multi-variate stats */
- List *mvclauses = NIL;
- List *mvclauses_new = NIL;
- List *mvclauses_conditions = NIL;
- Bitmapset *stat_attnums = NULL;
-
- /* build attnum bitmapset for this statistics */
- for (k = 0; k < mvstat->stakeys->dim1; k++)
- stat_attnums = bms_add_member(stat_attnums,
- mvstat->stakeys->values[k]);
-
- /*
- * Append the compatible conditions (passed from above)
- * to mvclauses_conditions.
- */
- foreach (l, conditions)
- {
- Node *c = (Node*)lfirst(l);
- Bitmapset *tmp = clause_mv_get_attnums(root, c);
-
- if (bms_is_subset(tmp, stat_attnums))
- mvclauses_conditions
- = lappend(mvclauses_conditions, c);
-
- bms_free(tmp);
- }
-
- /* split the clauselist into regular and mv-clauses
- *
- * We keep the list of clauses (we don't remove the
- * clauses yet, because we want to use the clauses
- * as conditions of other clauses).
- *
- * FIXME Do this only once, i.e. filter the clauses
- * once (selecting clauses covered by at least
- * one statistics) and then convert them into
- * smaller per-statistics lists of conditions
- * and estimated clauses.
- */
- clauselist_mv_split(root, sjinfo, clauses,
- varRelid, &mvclauses, mvstat,
- (MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST));
-
- /*
- * We've chosen the statistics to match the clauses, so
- * each statistics from the solution should have at least
- * one new clause (not covered by the previous stats).
- */
- Assert(mvclauses != NIL);
-
- /*
- * Mvclauses now contains only clauses compatible
- * with the currently selected stats, but we have to
- * split that into conditions (already matched by
- * the previous stats), and the new clauses we need
- * to estimate using this stats.
- */
- foreach (l, mvclauses)
- {
- ListCell *p;
- bool covered = false;
- Node *clause = (Node *) lfirst(l);
- Bitmapset *clause_attnums = clause_mv_get_attnums(root, clause);
-
- /*
- * If already covered by previous stats, add it to
- * conditions.
- *
- * TODO Maybe this could be relaxed a bit? Because
- * with complex and/or clauses, this might
- * mean no statistics actually covers such
- * complex clause.
- */
- foreach (p, solution)
- {
- int k;
- Bitmapset *stat_attnums = NULL;
-
- MVStatisticInfo *prev_stat
- = (MVStatisticInfo *)lfirst(p);
-
- /* break if we've ran into current statistic */
- if (prev_stat == mvstat)
- break;
-
- for (k = 0; k < prev_stat->stakeys->dim1; k++)
- stat_attnums = bms_add_member(stat_attnums,
- prev_stat->stakeys->values[k]);
-
- covered = bms_is_subset(clause_attnums, stat_attnums);
-
- bms_free(stat_attnums);
-
- if (covered)
- break;
- }
-
- if (covered)
- mvclauses_conditions
- = lappend(mvclauses_conditions, clause);
- else
- mvclauses_new
- = lappend(mvclauses_new, clause);
- }
-
- /*
- * We need at least one new clause (not just conditions).
- */
- Assert(mvclauses_new != NIL);
-
- /* compute the multivariate stats */
- s1 *= clauselist_mv_selectivity(root, mvstat,
- mvclauses_new,
- mvclauses_conditions,
- false); /* AND */
- }
-
- /*
- * And now finally remove all the mv-compatible clauses.
- *
- * This only repeats the same split as above, but this
- * time we actually use the result list (and feed it to
- * the next call).
- */
- foreach (s, solution)
- {
- /* clauses compatible with multi-variate stats */
- List *mvclauses = NIL;
-
- MVStatisticInfo *mvstat = (MVStatisticInfo *)lfirst(s);
-
- /* split the list into regular and mv-clauses */
- clauses = clauselist_mv_split(root, sjinfo, clauses,
- varRelid, &mvclauses, mvstat,
- (MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST));
-
- /*
- * Add the clauses to the conditions (to be passed
- * to regular clauses), irrespectedly whether it
- * will be used as a condition or a clause here.
- *
- * We only keep the remaining conditions in the
- * clauses (we keep what clauselist_mv_split returns)
- * so we add each MV condition exactly once.
- */
- conditions_local = list_concat(conditions_local, mvclauses);
- }
-
- /* from now on, work with the 'local' list of conditions */
- conditions = conditions_local;
- }
- }
-
/*
* If there's exactly one clause, then no use in trying to match up
* pairs, so just go directly to clause_selectivity().
*/
if (list_length(clauses) == 1)
return clause_selectivity(root, (Node *) linitial(clauses),
- varRelid, jointype, sjinfo, conditions);
-
+ varRelid, jointype, sjinfo);
/*
* Initial scan over clauses. Anything that doesn't look like a potential
* rangequery clause gets multiplied into s1 and forgotten. Anything that
@@ -591,8 +128,7 @@ clauselist_selectivity(PlannerInfo *root,
Selectivity s2;
/* Always compute the selectivity using clause_selectivity */
- s2 = clause_selectivity(root, clause, varRelid, jointype, sjinfo,
- conditions);
+ s2 = clause_selectivity(root, clause, varRelid, jointype, sjinfo);
/*
* Check for being passed a RestrictInfo.
@@ -750,250 +286,334 @@ clauselist_selectivity(PlannerInfo *root,
return s1;
}
-/*
- * Similar to clauselist_selectivity(), but for clauses connected by OR.
- *
- * That means a few differences:
- *
- * - functional dependencies don't apply to OR-clauses
- *
- * - we can't add the previous clauses to conditions
- *
- * - combined selectivities are combined using (s1+s2 - s1*s2)
- * and not as a multiplication (s1*s2)
- *
- * Another way to evaluate this might be turning
- *
- * (a OR b OR c)
- *
- * into
- *
- * NOT ((NOT a) AND (NOT b) AND (NOT c))
- *
- * and computing selectivity of that using clauselist_selectivity().
- * That would allow (a) using the clauselist_selectivity directly and
- * (b) using the previous clauses as conditions. Not sure if it's
- * worth the additional complexity, though.
- */
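+/*
+ * clause_mcv_selectivity -
+ *	  Estimate a bool-expr clause using a multivariate MCV list. Sets
+ *	  *status to FULL_MATCH when the clauses fully match an MCV item
+ *	  (equality on all columns), to FAILURE when the MCV list could not
+ *	  be used at all, and to NORMAL otherwise.
+ */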
static Selectivity
-clauselist_selectivity_or(PlannerInfo *root,
- List *clauses,
- int varRelid,
- JoinType jointype,
- SpecialJoinInfo *sjinfo,
- List *conditions)
+clause_mcv_selectivity(PlannerInfo *root, MVStatisticInfo *stats,
+ Node *clause, int *status)
{
- Selectivity s1 = 0.0;
- ListCell *l;
-
- /* processing mv stats */
- Index relid = InvalidOid;
-
- /* attributes in mv-compatible clauses */
- Bitmapset *mvattnums = NULL;
- List *stats = NIL;
-
- /* use clauses (not conditions), because those are always non-empty */
- stats = find_stats(root, clauses, varRelid, &relid);
+ MCVList mcvlist = NULL;
+ int nmatches = 0;
+ int nconditions = 0;
+ char *matches = NULL;
+ char *condition_matches = NULL;
+ Selectivity s = 0.0;
+ Selectivity t = 0.0;
+ Selectivity u = 0.0;
+ BoolExpr *expr = (BoolExpr*) clause;
+ bool is_or = or_clause(clause);
+ int i;
+ bool fullmatch;
+ Selectivity lowsel;
- /* OR-clauses do not work with functional dependencies */
- if (has_stats(stats, MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST))
+	Assert(expr == NULL || IsA(expr, BoolExpr));
+
+	if (!expr || not_clause(clause))	/* for now */
{
- /*
- * Recollect attributes from mv-compatible clauses (maybe we've
- * removed so many clauses we have a single mv-compatible attnum).
- * From now on we're only interested in MCV-compatible clauses.
- */
- mvattnums = collect_mv_attnums(root, clauses, varRelid, &relid, sjinfo,
- (MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST));
-
- /*
- * If there still are at least two columns, we'll try to select
- * a suitable multivariate stats.
- */
- if (bms_num_members(mvattnums) >= 2)
- {
- int k;
- ListCell *s;
+ *status = FAILURE;
+ return 0.0;
+ }
+ if (!stats->mcv_built)
+ {
+ *status = FAILURE;
+ return 0.0;
+ }
+
+ mcvlist = load_mv_mcvlist(stats->mvoid);
+ Assert (mcvlist != NULL);
+ Assert (mcvlist->nitems > 0);
- List *solution
- = choose_mv_statistics(root, stats,
- clauses, conditions,
- varRelid, sjinfo);
+ nmatches = mcvlist->nitems;
+ nconditions = mcvlist->nitems;
- /* we have a good solution stats */
- foreach (s, solution)
- {
- Selectivity s2;
- MVStatisticInfo *mvstat = (MVStatisticInfo *)lfirst(s);
+ matches = palloc0(sizeof(char) * nmatches);
- /* clauses compatible with multi-variate stats */
- List *mvclauses = NIL;
- List *mvclauses_new = NIL;
- List *mvclauses_conditions = NIL;
- Bitmapset *stat_attnums = NULL;
+ if (!is_or) /* AND-clause */
+ memset(matches, MVSTATS_MATCH_FULL, sizeof(char)*nmatches);
- /* build attnum bitmapset for this statistics */
- for (k = 0; k < mvstat->stakeys->dim1; k++)
- stat_attnums = bms_add_member(stat_attnums,
- mvstat->stakeys->values[k]);
+ /* Conditions are treated as AND clause, so match by default. */
+ condition_matches = palloc0(sizeof(char)*nconditions);
+ memset(condition_matches, MVSTATS_MATCH_FULL, sizeof(char)*nconditions);
- /*
- * Append the compatible conditions (passed from above)
- * to mvclauses_conditions.
- */
- foreach (l, conditions)
- {
- Node *c = (Node*)lfirst(l);
- Bitmapset *tmp = clause_mv_get_attnums(root, c);
+ nmatches = update_match_bitmap_mcvlist(root, expr->args,
+ stats->stakeys, mcvlist,
+ (is_or ? 0 : nmatches), matches,
+ &lowsel, &fullmatch, is_or);
- if (bms_is_subset(tmp, stat_attnums))
- mvclauses_conditions
- = lappend(mvclauses_conditions, c);
-
- bms_free(tmp);
- }
+ for (i = 0; i < mcvlist->nitems; i++)
+ {
+ u += mcvlist->items[i]->frequency;
+
+ if (condition_matches[i] == MVSTATS_MATCH_NONE)
+ continue;
- /* split the clauselist into regular and mv-clauses
- *
- * We keep the list of clauses (we don't remove the
- * clauses yet, because we want to use the clauses
- * as conditions of other clauses).
- *
- * FIXME Do this only once, i.e. filter the clauses
- * once (selecting clauses covered by at least
- * one statistics) and then convert them into
- * smaller per-statistics lists of conditions
- * and estimated clauses.
- */
- clauselist_mv_split(root, sjinfo, clauses,
- varRelid, &mvclauses, mvstat,
- (MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST));
+ if (matches[i] != MVSTATS_MATCH_NONE)
+ s += mcvlist->items[i]->frequency;
- /*
- * We've chosen the statistics to match the clauses, so
- * each statistics from the solution should have at least
- * one new clause (not covered by the previous stats).
- */
- Assert(mvclauses != NIL);
+ t += mcvlist->items[i]->frequency;
+ }
- /*
- * Mvclauses now contains only clauses compatible
- * with the currently selected stats, but we have to
- * split that into conditions (already matched by
- * the previous stats), and the new clauses we need
- * to estimate using this stats.
- *
- * XXX We'll only use the new clauses, but maybe we
- * should use the conditions too, somehow. We can't
- * use that directly in conditional probability, but
- * maybe we might use them in a different way?
- *
- * If we have a clause (a OR b OR c), then knowing
- * that 'a' is TRUE means (b OR c) can't make the
- * whole clause FALSE.
- *
- * This is pretty much what
- *
- * (a OR b) == NOT ((NOT a) AND (NOT b))
- *
- * implies.
- */
- foreach (l, mvclauses)
- {
- ListCell *p;
- bool covered = false;
- Node *clause = (Node *) lfirst(l);
- Bitmapset *clause_attnums = clause_mv_get_attnums(root, clause);
+ pfree(matches);
+ pfree(condition_matches);
+ pfree(mcvlist);
- /*
- * If already covered by previous stats, add it to
- * conditions.
- *
- * TODO Maybe this could be relaxed a bit? Because
- * with complex and/or clauses, this might
- * mean no statistics actually covers such
- * complex clause.
- */
- foreach (p, solution)
- {
- int k;
- Bitmapset *stat_attnums = NULL;
+	/* always set the status - the caller may read it uninitialized */
+	*status = fullmatch ? FULL_MATCH : NORMAL;
- MVStatisticInfo *prev_stat
- = (MVStatisticInfo *)lfirst(p);
+ /* mcv_low is omitted for now */
- /* break if we've ran into current statistic */
- if (prev_stat == mvstat)
- break;
+ /* no condition matches */
+ if (t == 0.0)
+ return (Selectivity)0.0;
- for (k = 0; k < prev_stat->stakeys->dim1; k++)
- stat_attnums = bms_add_member(stat_attnums,
- prev_stat->stakeys->values[k]);
+ return (s / t) * u;
+}
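+
+/*
+ * Both here and in clause_hist_selectivity() the result is computed as
+ * (s / t) * u, where s is the frequency matching both the estimated
+ * clause and the conditions, t the frequency matching the conditions
+ * alone, and u the total frequency covered by the statistics. That is,
+ * (s / t) is the conditional estimate P(clause | conditions) within the
+ * statistics, scaled back by the part u the statistics covers. With
+ * made-up numbers s = 0.1, t = 0.4 and u = 0.8, the estimate comes out
+ * as (0.1 / 0.4) * 0.8 = 0.2.
+ */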
- covered = bms_is_subset(clause_attnums, stat_attnums);
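+
+/*
+ * clause_hist_selectivity -
+ *	  Estimate a bool-expr clause using a multivariate histogram. Buckets
+ *	  matching the clause only partially are counted as 50% matches.
+ */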
+static Selectivity
+clause_hist_selectivity(PlannerInfo *root, MVStatisticInfo *stats,
+ Node *clause, int *status)
+{
+ MVSerializedHistogram mvhist = NULL;
+ int nmatches = 0;
+ int nconditions = 0;
+ char *matches = NULL;
+ char *condition_matches = NULL;
+ Selectivity s = 0.0;
+ Selectivity t = 0.0;
+ Selectivity u = 0.0;
+ BoolExpr *expr = (BoolExpr*) clause;
+ bool is_or = or_clause(clause);
+ int i;
- bms_free(stat_attnums);
+	Assert(expr == NULL || IsA(expr, BoolExpr));
- if (covered)
- break;
- }
+ if (!expr || not_clause(clause)) /* for now */
+ {
+		*status = FAILURE;
+ return 0.0;
+ }
+ if (!stats->hist_built)
+ {
+		*status = FAILURE;
+ return 0.0;
+ }
+ mvhist = load_mv_histogram(stats->mvoid);
+ Assert (mvhist != NULL);
+ Assert (clause != NULL);
- if (! covered)
- mvclauses_new = lappend(mvclauses_new, clause);
- }
+ nmatches = mvhist->nbuckets;
+ nconditions = mvhist->nbuckets;
+ matches = palloc0(sizeof(char) * nmatches);
+ if (!is_or) /* AND-clause */
+ memset(matches, MVSTATS_MATCH_FULL, sizeof(char)*nmatches);
- /*
- * We need at least one new clause (not just conditions).
- */
- Assert(mvclauses_new != NIL);
+ /* Conditions are treated as AND clause, so match by default. */
+ condition_matches = palloc0(sizeof(char)*nconditions);
+ memset(condition_matches, MVSTATS_MATCH_FULL, sizeof(char)*nconditions);
- /* compute the multivariate stats */
- s2 = clauselist_mv_selectivity(root, mvstat,
- mvclauses_new,
- mvclauses_conditions,
- true); /* OR */
+ update_match_bitmap_histogram(root, expr->args, stats->stakeys, mvhist,
+ (is_or ? 0 : nmatches), matches, is_or);
- s1 = s1 + s2 - s1 * s2;
- }
+ for (i = 0; i < mvhist->nbuckets; i++)
+ {
+ float coeff = 1.0;
+ u += mvhist->buckets[i]->ntuples;
- /*
- * And now finally remove all the mv-compatible clauses.
- *
- * This only repeats the same split as above, but this
- * time we actually use the result list (and feed it to
- * the next call).
- */
- foreach (s, solution)
- {
- /* clauses compatible with multi-variate stats */
- List *mvclauses = NIL;
+ if (condition_matches[i] == MVSTATS_MATCH_NONE)
+ continue;
+ else if (condition_matches[i] == MVSTATS_MATCH_PARTIAL)
+ coeff = 0.5;
- MVStatisticInfo *mvstat = (MVStatisticInfo *)lfirst(s);
+ t += coeff * mvhist->buckets[i]->ntuples;
- /* split the list into regular and mv-clauses */
- clauses = clauselist_mv_split(root, sjinfo, clauses,
- varRelid, &mvclauses, mvstat,
- (MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST));
- }
- }
+ if (matches[i] == MVSTATS_MATCH_FULL)
+ s += coeff * mvhist->buckets[i]->ntuples;
+ else if (matches[i] == MVSTATS_MATCH_PARTIAL)
+ s += coeff * 0.5 * mvhist->buckets[i]->ntuples;
}
- /*
- * Handle the remaining clauses (either using regular statistics,
- * or by multivariate stats at the next level).
- */
- foreach(l, clauses)
+ pfree(matches);
+ pfree(condition_matches);
+ pfree(mvhist);
+
+	*status = NORMAL;
+
+	/* no condition matches */
+	if (t == 0.0)
+		return (Selectivity)0.0;
+
+ return (s / t) * u;
+}
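+
+/*
+ * XXX Counting partially matching buckets as 50% matches (both for the
+ * conditions and for the estimated clauses) is only a heuristic and
+ * tends to overestimate; clamping the result by the per-column
+ * selectivities might limit such errors.
+ */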
+
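+/*
+ * apply_mvstats -
+ *	  Estimate a clause using a single statistics entry. The MCV list is
+ *	  tried first (a full equality match there is taken as exact), then
+ *	  the histogram estimate is added. Returns a negative value when
+ *	  neither could be applied, so that the caller falls back to the
+ *	  regular per-column estimate.
+ */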
+static Selectivity
+apply_mvstats(PlannerInfo *root, Node *clause, bm_mvstat *statent)
+{
+	Selectivity s1 = 0.0;
+	int			status = FAILURE;
+	bool		applied = false;
+
+ if (statent->mvkind & MVSTATISTIC_MCV)
{
- Selectivity s2 = clause_selectivity(root,
- (Node *) lfirst(l),
- varRelid,
- jointype,
- sjinfo,
- conditions);
+		s1 = clause_mcv_selectivity(root, statent->stats, clause, &status);
+		if (status == FULL_MATCH && s1 > 0.0)
+			return s1;
+
+		applied = (status != FAILURE);
+	}
+
+	if (statent->mvkind & MVSTATISTIC_HIST)
+	{
+		Selectivity s2 = clause_hist_selectivity(root, statent->stats,
+												 clause, &status);
+
+		if (status != FAILURE)
+		{
+			s1 += s2;
+			applied = true;
+		}
+	}
+
+	/* nothing applied - let the caller fall back to the regular code */
+	if (!applied)
+		return -1.0;
+
+	return s1;
+}
+
+static inline Selectivity
+merge_selectivity(Selectivity s1, Selectivity s2, BoolExprType op)
+{
+ if (op == AND_EXPR)
+ s1 = s1 * s2;
+ else
s1 = s1 + s2 - s1 * s2;
+
+ return s1;
+}
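+
+/*
+ * Example (made-up selectivities): with s1 = 0.2 and s2 = 0.3 this gives
+ * 0.2 * 0.3 = 0.06 for AND_EXPR and 0.2 + 0.3 - 0.2 * 0.3 = 0.44 for
+ * OR_EXPR, i.e. the usual combinations under the independence
+ * assumption.
+ */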
+
+/*
+ * mvclause_selectivity -
+ *	  Compute the selectivity of a RestrictStatData tree. Multivariate
+ *	  statistics are applied to the covered part (mvclause), the regular
+ *	  code handles the rest (nonmvclause), and child subtrees are merged
+ *	  recursively according to the node's boolop.
+ */
+static Selectivity
+mvclause_selectivity(PlannerInfo *root,
+ RestrictStatData *rstat,
+ int varRelid,
+ JoinType jointype,
+ SpecialJoinInfo *sjinfo)
+{
+ Selectivity s1;
+ ListCell *lc;
+
+ if (!rstat->mvclause && !rstat->nonmvclause && !rstat->children)
+ return clause_selectivity(root, rstat->clause, varRelid, jointype,
+ sjinfo);
+
+ if (rstat->boolop == NOT_EXPR)
+ {
+ RestrictStatData *clause =
+ (RestrictStatData *)linitial(rstat->children);
+
+ s1 = 1.0 - mvclause_selectivity(root, clause, varRelid,
+ jointype, sjinfo);
+ return s1;
+ }
+
+ s1 = (rstat->boolop == AND_EXPR ? 1.0 : 0.0);
+
+ if (rstat->nonmvclause)
+ s1 = merge_selectivity(s1,
+ clause_selectivity(root, rstat->nonmvclause,
+ varRelid, jointype, sjinfo),
+ rstat->boolop);
+
+ if (rstat->mvclause)
+ {
+ bm_mvstat *mvs = (bm_mvstat*)linitial(rstat->mvstats);
+ Selectivity s2 = apply_mvstats(root, rstat->mvclause, mvs);
+
+ /* Fall back to ordinary calculation */
+ if (s2 < 0)
+ s2 = clause_selectivity(root, rstat->mvclause, varRelid,
+ jointype, sjinfo);
+ s1 = merge_selectivity(s1, s2, rstat->boolop);
+ }
+
+ foreach(lc, rstat->children)
+ {
+ RestrictStatData *rsd = (RestrictStatData *) lfirst(lc);
+ Assert(IsA(rsd, RestrictStatData));
+
+ s1 = merge_selectivity(s1,
+ mvclause_selectivity(root, rsd, varRelid,
+ jointype, sjinfo),
+ rstat->boolop);
+ }
+
+ return s1;
+}
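+
+/*
+ * For illustration (a hypothetical transformation): a clause list like
+ *
+ *	WHERE (a = 1) AND (b = 2) AND (c = 3)
+ *
+ * with statistics on (a,b) may end up as a single RestrictStatData node
+ * with boolop = AND_EXPR, (a = 1) AND (b = 2) as its mvclause (estimated
+ * by apply_mvstats), (c = 3) as its nonmvclause (estimated by
+ * clause_selectivity), and the two results combined by
+ * merge_selectivity().
+ */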
+
+
+/*
+ * clauselist_selectivity -
+ * Compute the selectivity of an implicitly-ANDed list of boolean
+ * expression clauses. The list can be empty, in which case 1.0
+ * must be returned. List elements may be either RestrictInfos
+ * or bare expression clauses --- the former is preferred since
+ * it allows caching of results.
+ *
+ * See clause_selectivity() for the meaning of the additional parameters.
+ *
+ * Our basic approach is to take the product of the selectivities of the
+ * subclauses. However, that's only right if the subclauses have independent
+ * probabilities, and in reality they are often NOT independent. So,
+ * we want to be smarter where we can.
+ *
+ * Currently, the only extra smarts we have is to recognize "range queries",
+ * such as "x > 34 AND x < 42". Clauses are recognized as possible range
+ * query components if they are restriction opclauses whose operators have
+ * scalarltsel() or scalargtsel() as their restriction selectivity estimator.
+ * We pair up clauses of this form that refer to the same variable. An
+ * unpairable clause of this kind is simply multiplied into the selectivity
+ * product in the normal way. But when we find a pair, we know that the
+ * selectivities represent the relative positions of the low and high bounds
+ * within the column's range, so instead of figuring the selectivity as
+ * hisel * losel, we can figure it as hisel + losel - 1. (To visualize this,
+ * see that hisel is the fraction of the range below the high bound, while
+ * losel is the fraction above the low bound; so hisel can be interpreted
+ * directly as a 0..1 value but we need to convert losel to 1-losel before
+ * interpreting it as a value. Then the available range is 1-losel to hisel.
+ * However, this calculation double-excludes nulls, so really we need
+ * hisel + losel + null_frac - 1.)
+ *
+ * If either selectivity is exactly DEFAULT_INEQ_SEL, we forget this equation
+ * and instead use DEFAULT_RANGE_INEQ_SEL. The same applies if the equation
+ * yields an impossible (negative) result.
+ *
+ * A free side-effect is that we can recognize redundant inequalities such
+ * as "x < 4 AND x < 5"; only the tighter constraint will be counted.
+ *
+ * Of course this is all very dependent on the behavior of
+ * scalarltsel/scalargtsel; perhaps some day we can generalize the approach.
+ *
+ *
+ * Multivariate statistics
+ * --------------------------
+ * This also uses multivariate stats to estimate combinations of
+ * conditions, in a way (a) maximizing the estimate accuracy by using
+ * as many stats as possible, and (b) minimizing the overhead,
+ * especially when there are no suitable multivariate stats (so if you
+ * are not using multivariate stats, there's no additional overhead).
+ *
+ * The clause list is first transformed into a tree of RestrictStatData
+ * nodes (see transformRestrictInfoForEstimate), multivariate statistics
+ * are applied to the covered clauses (see mvclause_selectivity), and
+ * whatever remains is estimated by the regular per-column code in
+ * and_clause_selectivity(), which also performs the range-query pairing
+ * described above.
+ */
+Selectivity
+clauselist_selectivity(PlannerInfo *root,
+ List *clauses,
+ int varRelid,
+ JoinType jointype,
+ SpecialJoinInfo *sjinfo)
+{
+ Selectivity s1 = 1.0;
+ RestrictStatData *rstat;
+ List *rinfos = clauses;
+
+ /* Reconstruct clauses so that multivariate statistics can be applied */
+ rstat = transformRestrictInfoForEstimate(root, clauses, varRelid, sjinfo);
+
+ if (rstat)
+ {
+ rinfos = rstat->unusedrinfos;
+
+ s1 = mvclause_selectivity(root, rstat, varRelid, jointype, sjinfo);
}
+ s1 = s1 * and_clause_selectivity(root, rinfos, varRelid, jointype, sjinfo);
+
return s1;
}
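+
+/*
+ * A worked example of the range-query pairing described above (made-up
+ * numbers): for "x > 34 AND x < 42", with hisel = 0.6 (the fraction
+ * below the high bound) and losel = 0.7 (the fraction above the low
+ * bound), the paired estimate is 0.6 + 0.7 - 1 = 0.3, while multiplying
+ * the two selectivities would give 0.6 * 0.7 = 0.42.
+ */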
@@ -1204,8 +824,7 @@ clause_selectivity(PlannerInfo *root,
Node *clause,
int varRelid,
JoinType jointype,
- SpecialJoinInfo *sjinfo,
- List *conditions)
+ SpecialJoinInfo *sjinfo)
{
Selectivity s1 = 0.5; /* default for any unhandled clause type */
RestrictInfo *rinfo = NULL;
@@ -1335,28 +954,37 @@ clause_selectivity(PlannerInfo *root,
(Node *) get_notclausearg((Expr *) clause),
varRelid,
jointype,
- sjinfo,
- conditions);
+ sjinfo);
}
else if (and_clause(clause))
{
- /* share code with clauselist_selectivity() */
- s1 = clauselist_selectivity(root,
+ s1 = and_clause_selectivity(root,
((BoolExpr *) clause)->args,
varRelid,
jointype,
- sjinfo,
- conditions);
+ sjinfo);
}
else if (or_clause(clause))
{
- /* just call to clauselist_selectivity_or() */
- s1 = clauselist_selectivity_or(root,
- ((BoolExpr *) clause)->args,
- varRelid,
- jointype,
- sjinfo,
- conditions);
+ /*
+ * Selectivities for an OR clause are computed as s1+s2 - s1*s2 to
+ * account for the probable overlap of selected tuple sets.
+ *
+ * XXX is this too conservative?
+ */
+ ListCell *arg;
+
+ s1 = 0.0;
+ foreach(arg, ((BoolExpr *) clause)->args)
+ {
+ Selectivity s2 = clause_selectivity(root,
+ (Node *) lfirst(arg),
+ varRelid,
+ jointype,
+ sjinfo);
+
+ s1 = s1 + s2 - s1 * s2;
+ }
}
else if (is_opclause(clause) || IsA(clause, DistinctExpr))
{
@@ -1445,1899 +1073,55 @@ clause_selectivity(PlannerInfo *root,
s1 = booltestsel(root,
((BooleanTest *) clause)->booltesttype,
(Node *) ((BooleanTest *) clause)->arg,
- varRelid,
- jointype,
- sjinfo);
- }
- else if (IsA(clause, CurrentOfExpr))
- {
- /* CURRENT OF selects at most one row of its table */
- CurrentOfExpr *cexpr = (CurrentOfExpr *) clause;
- RelOptInfo *crel = find_base_rel(root, cexpr->cvarno);
-
- if (crel->tuples > 0)
- s1 = 1.0 / crel->tuples;
- }
- else if (IsA(clause, RelabelType))
- {
- /* Not sure this case is needed, but it can't hurt */
- s1 = clause_selectivity(root,
- (Node *) ((RelabelType *) clause)->arg,
- varRelid,
- jointype,
- sjinfo,
- conditions);
- }
- else if (IsA(clause, CoerceToDomain))
- {
- /* Not sure this case is needed, but it can't hurt */
- s1 = clause_selectivity(root,
- (Node *) ((CoerceToDomain *) clause)->arg,
- varRelid,
- jointype,
- sjinfo,
- conditions);
- }
-
- /* Cache the result if possible */
- if (cacheable)
- {
- if (jointype == JOIN_INNER)
- rinfo->norm_selec = s1;
- else
- rinfo->outer_selec = s1;
- }
-
-#ifdef SELECTIVITY_DEBUG
- elog(DEBUG4, "clause_selectivity: s1 %f", s1);
-#endif /* SELECTIVITY_DEBUG */
-
- return s1;
-}
-
-
-/*
- * Estimate selectivity for the list of MV-compatible clauses, using
- * using a MV statistics (combining a histogram and MCV list).
- *
- * This simply passes the estimation to the MCV list and then to the
- * histogram, if available.
- *
- * TODO Clamp the selectivity by min of the per-clause selectivities
- * (i.e. the selectivity of the most restrictive clause), because
- * that's the maximum we can ever get from ANDed list of clauses.
- * This may probably prevent issues with hitting too many buckets
- * and low precision histograms.
- *
- * TODO We may support some additional conditions, most importantly
- * those matching multiple columns (e.g. "a = b" or "a < b").
- * Ultimately we could track multi-table histograms for join
- * cardinality estimation.
- *
- * TODO Further thoughts on processing equality clauses: Maybe it'd be
- * better to look for stats (with MCV) covered by the equality
- * clauses, because then we have a chance to find an exact match
- * in the MCV list, which is pretty much the best we can do. We may
- * also look at the least frequent MCV item, and use it as a upper
- * boundary for the selectivity (had there been a more frequent
- * item, it'd be in the MCV list).
- *
- * TODO There are several options for 'sanity clamping' the estimates.
- *
- * First, if we have selectivities for each condition, then
- *
- * P(A,B) <= MIN(P(A), P(B))
- *
- * Because additional conditions (connected by AND) can only lower
- * the probability.
- *
- * So we can do some basic sanity checks using the single-variate
- * stats (the ones we have right now).
- *
- * Second, when we have multivariate stats with a MCV list, then
- *
- * (a) if we have a full equality condition (one equality condition
- * on each column) and we found a match in the MCV list, this is
- * the selectivity (and it's supposed to be exact)
- *
- * (b) if we have a full equality condition and we haven't found a
- * match in the MCV list, then the selectivity is below the
- * lowest selectivity in the MCV list
- *
- * (c) if we have a equality condition (not full), we can still
- * search the MCV for matches and use the sum of probabilities
- * as a lower boundary for the histogram (if there are no
- * matches in the MCV list, then we have no boundary)
- *
- * Third, if there are multiple (combinations of) multivariate
- * stats for a set of clauses, we may compute all of them and then
- * somehow aggregate them - e.g. by choosing the minimum, median or
- * average. The stats are susceptible to overestimation (because
- * we take 50% of the bucket for partial matches). Some stats may
- * give better estimates than others, but it's very difficult to
- * say that in advance which one is the best (it depends on the
- * number of buckets, number of additional columns not referenced
- * in the clauses, type of condition etc.).
- *
- * So we may compute them all and then choose a sane aggregation
- * (minimum seems like a good approach). Of course, this may result
- * in longer / more expensive estimation (CPU-wise), but it may be
- * worth it.
- *
- * It's possible to add a GUC choosing whether to do a 'simple'
- * (using a single stats expected to give the best estimate) and
- * 'complex' (combining the multiple estimates).
- *
- * multivariate_estimates = (simple|full)
- *
- * Also, this might be enabled at a table level, by something like
- *
- * ALTER TABLE ... SET STATISTICS (simple|full)
- *
- * Which would make it possible to use this only for the tables
- * where the simple approach does not work.
- *
- * Also, there are ways to optimize this algorithmically. E.g. we
- * may try to get an estimate from a matching MCV list first, and
- * if we happen to get a "full equality match" we may stop computing
- * the estimates from other stats (for this condition) because
- * that's probably the best estimate we can really get.
- *
- * TODO When applying the clauses to the histogram/MCV list, we can do
- * that from the most selective clauses first, because that'll
- * eliminate the buckets/items sooner (so we'll be able to skip
- * them without inspection, which is more expensive). But this
- * requires really knowing the per-clause selectivities in advance,
- * and that's not what we do now.
- *
- * TODO All this is based on the assumption that the statistics represent
- * the necessary dependencies, i.e. that if two colunms are not in
- * the same statistics, there's no dependency. If that's not the
- * case, we may get misestimates, just like before. For example
- * assume we have a table with three columns [a,b,c] with exactly
- * the same values, and statistics on [a,b] and [b,c]. So somthing
- * like this:
- *
- * CREATE TABLE test AS SELECT i, i, i
- FROM generate_series(1,1000);
- *
- * ALTER TABLE test ADD STATISTICS (mcv) ON (a,b);
- * ALTER TABLE test ADD STATISTICS (mcv) ON (b,c);
- *
- * ANALYZE test;
- *
- * EXPLAIN ANALYZE SELECT * FROM test
- * WHERE (a < 10) AND (b < 20) AND (c < 10);
- *
- * The problem here is that the only shared column between the two
- * statistics is 'b' so the probability will be computed like this
- *
- * P[(a < 10) & (b < 20) & (c < 10)]
- * = P[(a < 10) & (b < 20)] * P[(c < 10) | (a < 10) & (b < 20)]
- * = P[(a < 10) & (b < 20)] * P[(c < 10) | (b < 20)]
- *
- * or like this
- *
- * P[(a < 10) & (b < 20) & (c < 10)]
- * = P[(b < 20) & (c < 10)] * P[(a < 10) | (b < 20) & (c < 10)]
- * = P[(b < 20) & (c < 10)] * P[(a < 10) | (b < 20)]
- *
- * In both cases the conditional probabilities will be evaluated as
- * 0.5, because they lack the other column (which would make it 1.0).
- *
- * Theoretically it might be possible to transfer the dependency,
- * e.g. by building bitmap for [a,b] and then combine it with [b,c]
- * by doing something like this:
- *
- * 1) build bitmap on [a,b] using [(a<10) & (b < 20)]
- * 2) for each element in [b,c] check the bitmap
- *
- * But that's certainly nontrivial - for example the statistics may
- * be different (MCV list vs. histogram) and/or the items may not
- * match (e.g. MCV items or histogram buckets will be built
- * differently). Also, for one value of 'b' there might be multiple
- * MCV items (because of the other column values) with different
- * bitmap values (some will match, some won't) - so it's not exactly
- * bitmap but a partial match.
- *
- * Maybe a hash table with number of matches and mismatches (or
- * maybe sums of frequencies) would work? The step (2) would then
- * lookup the values and use that to weight the item somehow.
- *
- * Currently the only solution is to build statistics on all three
- * columns.
- */
-static Selectivity
-clauselist_mv_selectivity(PlannerInfo *root, MVStatisticInfo *mvstats,
- List *clauses, List *conditions, bool is_or)
-{
- bool fullmatch = false;
- Selectivity s1 = 0.0, s2 = 0.0;
-
- /*
- * Lowest frequency in the MCV list (may be used as an upper bound
- * for full equality conditions that did not match any MCV item).
- */
- Selectivity mcv_low = 0.0;
-
- /* TODO Evaluate simple 1D selectivities, use the smallest one as
- * an upper bound, product as lower bound, and sort the
- * clauses in ascending order by selectivity (to optimize the
- * MCV/histogram evaluation).
- */
-
- /* Evaluate the MCV first. */
- s1 = clauselist_mv_selectivity_mcvlist(root, mvstats,
- clauses, conditions, is_or,
- &fullmatch, &mcv_low);
-
- /*
- * If we got a full equality match on the MCV list, we're done (and
- * the estimate is pretty good).
- */
- if (fullmatch && (s1 > 0.0))
- return s1;
-
- /* FIXME if (fullmatch) without matching MCV item, use the mcv_low
- * selectivity as upper bound */
-
- s2 = clauselist_mv_selectivity_histogram(root, mvstats,
- clauses, conditions, is_or);
-
- /* TODO clamp to <= 1.0 (or more strictly, when possible) */
- return s1 + s2;
-}
-
-/*
- * Collect attributes from mv-compatible clauses.
- */
-static Bitmapset *
-collect_mv_attnums(PlannerInfo *root, List *clauses, Oid varRelid,
- Index *relid, SpecialJoinInfo *sjinfo, int types)
-{
- Bitmapset *attnums = NULL;
- ListCell *l;
-
- /*
- * Walk through the clauses and identify the ones we can estimate
- * using multivariate stats, and remember the relid/columns. We'll
- * then cross-check if we have suitable stats, and only if needed
- * we'll split the clauses into multivariate and regular lists.
- *
- * For now we're only interested in RestrictInfo nodes with nested
- * OpExpr, using either a range or equality.
- */
- foreach (l, clauses)
- {
- Node *clause = (Node *) lfirst(l);
-
- /* ignore the result here - we only need the attnums */
- clause_is_mv_compatible(root, clause, varRelid, relid, &attnums,
- sjinfo, types);
- }
-
- /*
- * If there are not at least two attributes referenced by the clause(s),
- * we can throw everything out (as we'll revert to simple stats).
- */
- if (bms_num_members(attnums) <= 1)
- {
- bms_free(attnums);
- attnums = NULL;
- *relid = InvalidOid;
- }
-
- return attnums;
-}
-
-/*
- * Selects the best combination of multivariate statistics, in an
- * exhaustive way, where 'best' means:
- *
- * (a) covering the most attributes (referenced by clauses)
- * (b) using the least number of multivariate stats
- * (c) using the most conditions to exploit dependency
- *
- * There may be other optimality criteria, not considered in the initial
- * implementation (more on that 'weaknesses' section).
- *
- * This pretty much splits the probability of clauses (aka selectivity)
- * into a sequence of conditional probabilities, like this
- *
- * P(A,B,C,D) = P(A,B) * P(C|A,B) * P(D|A,B,C)
- *
- * and removing the attributes not referenced by the existing stats,
- * under the assumption that there's no dependency (otherwise the DBA
- * would create the stats).
- *
- * The last criteria means that when we have the choice to compute like
- * this
- *
- * P(A,B,C,D) = P(A,B,C) * P(D|B,C)
- *
- * or like this
- *
- * P(A,B,C,D) = P(A,B,C) * P(D|C)
- *
- * we should use the first option, as that exploits more dependencies.
- *
- * The order of statistics in the solution implicitly determines the
- * order of estimation of clauses, because as we apply a statistics,
- * we always use it to estimate all the clauses covered by it (and
- * then we use those clauses as conditions for the next statistics).
- *
- * Don't call this directly but through choose_mv_statistics().
- *
- *
- * Algorithm
- * ---------
- * The algorithm is a recursive implementation of backtracking, with
- * maximum 'depth' equal to the number of multi-variate statistics
- * available on the table.
- *
- * It explores all the possible permutations of the stats.
- *
- * Whenever it considers adding the next statistics, the clauses it
- * matches are divided into 'conditions' (clauses already matched by at
- * least one previous statistics) and clauses that are estimated.
- *
- * Then several checks are performed:
- *
- * (a) The statistics covers at least 2 columns, referenced in the
- * estimated clauses (otherwise multi-variate stats are useless).
- *
- * (b) The statistics covers at least 1 new column, i.e. column not
- * refefenced by the already used stats (and the new column has
- * to be referenced by the clauses, of couse). Otherwise the
- * statistics would not add any new information.
- *
- * There are some other sanity checks (e.g. that the stats must not be
- * used twice etc.).
- *
- * Finally the new solution is compared to the currently best one, and
- * if it's considered better, it's used instead.
- *
- *
- * Weaknesses
- * ----------
- * The current implemetation uses a somewhat simple optimality criteria,
- * suffering by the following weaknesses.
- *
- * (a) There may be multiple solutions with the same number of covered
- * attributes and number of statistics (e.g. the same solution but
- * with statistics in a different order). It's unclear which solution
- * is the best one - in a sense all of them are equal.
- *
- * TODO It might be possible to compute estimate for each of those
- * solutions, and then combine them to get the final estimate
- * (e.g. by using average or median).
- *
- * (b) Does not consider that some types of stats are a better match for
- * some types of clauses (e.g. MCV list is a good match for equality
- * than a histogram).
- *
- * XXX Maybe MCV is almost always better / more accurate?
- *
- * But maybe this is pointless - generally, each column is either
- * a label (it's not important whether because of the data type or
- * how it's used), or a value with ordering that makes sense. So
- * either a MCV list is more appropriate (labels) or a histogram
- * (values with orderings).
- *
- * Now sure what to do with statistics on columns mixing columns of
- * both types - maybe it'd be beeter to invent a new type of stats
- * combining MCV list and histogram (keeping a small histogram for
- * each MCV item, and a separate histogram for values not on the
- * MCV list). But that's not implemented at this moment.
- *
- * TODO The algorithm should probably count number of Vars (not just
- * attnums) when computing the 'score' of each solution. Computing
- * the ratio of (num of all vars) / (num of condition vars) as a
- * measure of how well the solution uses conditions might be
- * useful.
- */
-static void
-choose_mv_statistics_exhaustive(PlannerInfo *root, int step,
- int nmvstats, MVStatisticInfo *mvstats, Bitmapset ** stats_attnums,
- int nclauses, Node ** clauses, Bitmapset ** clauses_attnums,
- int nconditions, Node ** conditions, Bitmapset ** conditions_attnums,
- bool *cover_map, bool *condition_map, int *ruled_out,
- mv_solution_t *current, mv_solution_t **best)
-{
- int i, j;
-
- Assert(best != NULL);
- Assert((step == 0 && current == NULL) || (step > 0 && current != NULL));
-
- CHECK_FOR_INTERRUPTS();
-
- if (current == NULL)
- {
- current = (mv_solution_t*)palloc0(sizeof(mv_solution_t));
- current->stats = (int*)palloc0(sizeof(int)*nmvstats);
- current->nstats = 0;
- current->nclauses = 0;
- current->nconditions = 0;
- }
-
- /*
- * Now try to apply each statistics, matching at least two attributes,
- * unless it's already used in one of the previous steps.
- */
- for (i = 0; i < nmvstats; i++)
- {
- int c;
-
- int ncovered_clauses = 0; /* number of covered clauses */
- int ncovered_conditions = 0; /* number of covered conditions */
- int nattnums = 0; /* number of covered attributes */
-
- Bitmapset *all_attnums = NULL;
- Bitmapset *new_attnums = NULL;
-
- /* skip statistics that were already used or eliminated */
- if (ruled_out[i] != -1)
- continue;
-
- /*
- * See if we have clauses covered by this statistics, but not
- * yet covered by any of the preceding onces.
- */
- for (c = 0; c < nclauses; c++)
- {
- bool covered = false;
- Bitmapset *clause_attnums = clauses_attnums[c];
- Bitmapset *tmp = NULL;
-
- /*
- * If this clause is not covered by this stats, we can't
- * use the stats to estimate that at all.
- */
- if (! cover_map[i * nclauses + c])
- continue;
-
- /*
- * Now we know we'll use this clause - either as a condition
- * or as a new clause (the estimated one). So let's add the
- * attributes to the attnums from all the clauses usable with
- * this statistics.
- */
- tmp = bms_union(all_attnums, clause_attnums);
-
- /* free the old bitmap */
- bms_free(all_attnums);
- all_attnums = tmp;
-
- /* let's see if it's covered by any of the previous stats */
- for (j = 0; j < step; j++)
- {
- /* already covered by the previous stats */
- if (cover_map[current->stats[j] * nclauses + c])
- covered = true;
-
- if (covered)
- break;
- }
-
- /* if already covered, continue with the next clause */
- if (covered)
- {
- ncovered_conditions += 1;
- continue;
- }
-
- /*
- * OK, this clause is covered by this statistics (and not by
- * any of the previous ones)
- */
- ncovered_clauses += 1;
-
- /* add the attnums into attnums from 'new clauses' */
- // new_attnums = bms_union(new_attnums, clause_attnums);
- }
-
- /* can't have more new clauses than original clauses */
- Assert(nclauses >= ncovered_clauses);
- Assert(ncovered_clauses >= 0); /* mostly paranoia */
-
- nattnums = bms_num_members(all_attnums);
-
- /* free all the bitmapsets - we don't need them anymore */
- bms_free(all_attnums);
- bms_free(new_attnums);
-
- all_attnums = NULL;
- new_attnums = NULL;
-
- /*
- * See if we have clauses covered by this statistics, but not
- * yet covered by any of the preceding onces.
- */
- for (c = 0; c < nconditions; c++)
- {
- Bitmapset *clause_attnums = conditions_attnums[c];
- Bitmapset *tmp = NULL;
-
- /*
- * If this clause is not covered by this stats, we can't
- * use the stats to estimate that at all.
- */
- if (! condition_map[i * nconditions + c])
- continue;
-
- /* count this as a condition */
- ncovered_conditions += 1;
-
- /*
- * Now we know we'll use this clause - either as a condition
- * or as a new clause (the estimated one). So let's add the
- * attributes to the attnums from all the clauses usable with
- * this statistics.
- */
- tmp = bms_union(all_attnums, clause_attnums);
-
- /* free the old bitmap */
- bms_free(all_attnums);
- all_attnums = tmp;
- }
-
- /*
- * Let's mark the statistics as 'ruled out' - either we'll use
- * it (and proceed to the next step), or it's incompatible.
- */
- ruled_out[i] = step;
-
- /*
- * There are no clauses usable with this statistics (not already
- * covered by aome of the previous stats).
- *
- * Similarly, if the clauses only use a single attribute, we
- * can't really use that.
- */
- if ((ncovered_clauses == 0) || (nattnums < 2))
- continue;
-
- /*
- * TODO Not sure if it's possible to add a clause referencing
- * only attributes already covered by previous stats?
- * Introducing only some new dependency, not a new
- * attribute. Couldn't come up with an example, though.
- * Might be worth adding some assert.
- */
-
- /*
- * got a suitable statistics - let's update the current solution,
- * maybe use it as the best solution
- */
- current->nclauses += ncovered_clauses;
- current->nconditions += ncovered_conditions;
- current->nstats += 1;
- current->stats[step] = i;
-
- /*
- * We can never cover more clauses, or use more stats that we
- * actually have at the beginning.
- */
- Assert(nclauses >= current->nclauses);
- Assert(nmvstats >= current->nstats);
- Assert(step < nmvstats);
-
- /* we can't get more conditions that clauses and conditions combined
- *
- * FIXME This assert does not work because we count the conditions
- * repeatedly (once for each statistics covering it).
- */
- /* Assert((nconditions + nclauses) >= current->nconditions); */
-
- if (*best == NULL)
- {
- *best = (mv_solution_t*)palloc0(sizeof(mv_solution_t));
- (*best)->stats = (int*)palloc0(sizeof(int)*nmvstats);
- (*best)->nstats = 0;
- (*best)->nclauses = 0;
- (*best)->nconditions = 0;
- }
-
- /* see if it's better than the current 'best' solution */
- if ((current->nclauses > (*best)->nclauses) ||
- ((current->nclauses == (*best)->nclauses) &&
- ((current->nstats > (*best)->nstats))))
- {
- (*best)->nstats = current->nstats;
- (*best)->nclauses = current->nclauses;
- (*best)->nconditions = current->nconditions;
- memcpy((*best)->stats, current->stats, nmvstats * sizeof(int));
- }
-
- /*
- * The recursion only makes sense if we haven't covered all the
- * attributes (then adding stats is not really possible).
- */
- if ((step + 1) < nmvstats)
- choose_mv_statistics_exhaustive(root, step+1,
- nmvstats, mvstats, stats_attnums,
- nclauses, clauses, clauses_attnums,
- nconditions, conditions, conditions_attnums,
- cover_map, condition_map, ruled_out,
- current, best);
-
- /* reset the last step */
- current->nclauses -= ncovered_clauses;
- current->nconditions -= ncovered_conditions;
- current->nstats -= 1;
- current->stats[step] = 0;
-
- /* mark the statistics as usable again */
- ruled_out[i] = -1;
-
- Assert(current->nclauses >= 0);
- Assert(current->nstats >= 0);
- }
-
- /* reset all statistics as 'incompatible' in this step */
- for (i = 0; i < nmvstats; i++)
- if (ruled_out[i] == step)
- ruled_out[i] = -1;
-
-}
-
-/*
- * Greedy search for a multivariate solution - a sequence of statistics
- * covering the clauses. This chooses the "best" statistics at each step,
- * so the resulting solution may not be the best solution globally, but
- * this produces the solution in only N steps (where N is the number of
- * statistics), while the exhaustive approach may have to walk through
- * ~N! combinations (although some of those are terminated early).
- *
- * See the comments at choose_mv_statistics_exhaustive() as this does
- * the same thing (but in a different way).
- *
- * Don't call this directly, but through choose_mv_statistics().
- *
- * TODO There are probably other metrics we might use - e.g. using
- * number of columns (num_cond_columns / num_cov_columns), which
- * might work better with a mix of simple and complex clauses.
- *
- * TODO Also the choice at the very first step should be handled
- * in a special way, because there will be 0 conditions at that
- * moment, so there needs to be some other criteria - e.g. using
- * the simplest (or most complex?) clause might be a good idea.
- *
- * TODO We might also select multiple stats using different criteria,
- * and branch the search. This is however tricky, because if we
- * choose k statistics at each step, we get k^N branches to
- * walk through (with N steps). That's not really good with
- * large number of stats (yet better than exhaustive search).
- */
-static void
-choose_mv_statistics_greedy(PlannerInfo *root, int step,
- int nmvstats, MVStatisticInfo *mvstats, Bitmapset ** stats_attnums,
- int nclauses, Node ** clauses, Bitmapset ** clauses_attnums,
- int nconditions, Node ** conditions, Bitmapset ** conditions_attnums,
- bool *cover_map, bool *condition_map, int *ruled_out,
- mv_solution_t *current, mv_solution_t **best)
-{
- int i, j;
- int best_stat = -1;
- double gain, max_gain = -1.0;
-
- /*
- * Bitmap tracking which clauses are already covered (by the previous
- * statistics) and may thus serve only as a condition in this step.
- */
- bool *covered_clauses = (bool*)palloc0(nclauses);
-
- /*
- * Number of clauses and columns covered by each statistics - this
- * includes both conditions and clauses covered by the statistics for
- * the first time. The number of columns may count some columns
- * repeatedly - if a column is shared by multiple clauses, it will
- * be counted once for each clause (covered by the statistics).
- * So with two clauses [(a=1 OR b=2),(a<2 OR c>1)] the column "a"
- * will be counted twice (if both clauses are covered).
- *
- * The values for reduded statistics (that can't be applied) are
- * not computed, because that'd be pointless.
- */
- int *num_cov_clauses = (int*)palloc0(sizeof(int) * nmvstats);
- int *num_cov_columns = (int*)palloc0(sizeof(int) * nmvstats);
-
- /*
- * Same as above, but this only includes clauses that are already
- * covered by the previous stats (and the current one).
- */
- int *num_cond_clauses = (int*)palloc0(sizeof(int) * nmvstats);
- int *num_cond_columns = (int*)palloc0(sizeof(int) * nmvstats);
-
- /*
- * Number of attributes for each clause.
- *
- * TODO Might be computed in choose_mv_statistics() and then passed
- * here, but then the function would not have the same signature
- * as _exhaustive().
- */
- int *attnum_counts = (int*)palloc0(sizeof(int) * nclauses);
- int *attnum_cond_counts = (int*)palloc0(sizeof(int) * nconditions);
-
- CHECK_FOR_INTERRUPTS();
-
- Assert(best != NULL);
- Assert((step == 0 && current == NULL) || (step > 0 && current != NULL));
-
- /* compute attributes (columns) for each clause */
- for (i = 0; i < nclauses; i++)
- attnum_counts[i] = bms_num_members(clauses_attnums[i]);
-
- /* compute attributes (columns) for each condition */
- for (i = 0; i < nconditions; i++)
- attnum_cond_counts[i] = bms_num_members(conditions_attnums[i]);
-
- /* see which clauses are already covered at this point (by previous stats) */
- for (i = 0; i < step; i++)
- for (j = 0; j < nclauses; j++)
- covered_clauses[j] |= (cover_map[current->stats[i] * nclauses + j]);
-
- /* which remaining statistics covers most clauses / uses most conditions? */
- for (i = 0; i < nmvstats; i++)
- {
- Bitmapset *attnums_covered = NULL;
- Bitmapset *attnums_conditions = NULL;
-
- /* skip stats that are already ruled out (either used or inapplicable) */
- if (ruled_out[i] != -1)
- continue;
-
- /* count covered clauses and conditions (for the statistics) */
- for (j = 0; j < nclauses; j++)
- {
- if (cover_map[i * nclauses + j])
- {
- Bitmapset *attnums_new
- = bms_union(attnums_covered, clauses_attnums[j]);
-
- /* get rid of the old bitmap and keep the unified result */
- bms_free(attnums_covered);
- attnums_covered = attnums_new;
-
- num_cov_clauses[i] += 1;
- num_cov_columns[i] += attnum_counts[j];
-
- /* is the clause already covered (i.e. a condition)? */
- if (covered_clauses[j])
- {
- num_cond_clauses[i] += 1;
- num_cond_columns[i] += attnum_counts[j];
- attnums_new = bms_union(attnums_conditions,
- clauses_attnums[j]);
-
- bms_free(attnums_conditions);
- attnums_conditions = attnums_new;
- }
- }
- }
-
- /* if all covered clauses are covered by prev stats (thus conditions) */
- if (num_cov_clauses[i] == num_cond_clauses[i])
- ruled_out[i] = step;
-
- /* same if there are no new attributes */
- else if (bms_num_members(attnums_conditions) == bms_num_members(attnums_covered))
- ruled_out[i] = step;
-
- bms_free(attnums_covered);
- bms_free(attnums_conditions);
-
- /* if the statistics is inapplicable, try the next one */
- if (ruled_out[i] != -1)
- continue;
-
- /* now let's walk through conditions and count the covered */
- for (j = 0; j < nconditions; j++)
- {
- if (condition_map[i * nconditions + j])
- {
- num_cond_clauses[i] += 1;
- num_cond_columns[i] += attnum_cond_counts[j];
- }
- }
-
- /* otherwise see if this improves the interesting metrics */
- gain = num_cond_columns[i] / (double)num_cov_columns[i];
-
- if (gain > max_gain)
- {
- max_gain = gain;
- best_stat = i;
- }
- }
-
- /*
- * Have we found a suitable statistics? Add it to the solution and
- * try next step.
- */
- if (best_stat != -1)
- {
- /* mark the statistics, so that we skip it in next steps */
- ruled_out[best_stat] = step;
-
- /* allocate current solution if necessary */
- if (current == NULL)
- {
- current = (mv_solution_t*)palloc0(sizeof(mv_solution_t));
- current->stats = (int*)palloc0(sizeof(int)*nmvstats);
- current->nstats = 0;
- current->nclauses = 0;
- current->nconditions = 0;
- }
-
- current->nclauses += num_cov_clauses[best_stat];
- current->nconditions += num_cond_clauses[best_stat];
- current->stats[step] = best_stat;
- current->nstats++;
-
- if (*best == NULL)
- {
- (*best) = (mv_solution_t*)palloc0(sizeof(mv_solution_t));
- (*best)->nstats = current->nstats;
- (*best)->nclauses = current->nclauses;
- (*best)->nconditions = current->nconditions;
-
- (*best)->stats = (int*)palloc0(sizeof(int)*nmvstats);
- memcpy((*best)->stats, current->stats, nmvstats * sizeof(int));
- }
- else
- {
- /* see if this is a better solution */
- double current_gain = (double)current->nconditions / current->nclauses;
- double best_gain = (double)(*best)->nconditions / (*best)->nclauses;
-
- if ((current_gain > best_gain) ||
- ((current_gain == best_gain) && (current->nstats < (*best)->nstats)))
- {
- (*best)->nstats = current->nstats;
- (*best)->nclauses = current->nclauses;
- (*best)->nconditions = current->nconditions;
- memcpy((*best)->stats, current->stats, nmvstats * sizeof(int));
- }
- }
-
- /*
- * The recursion only makes sense if there are statistics left
- * to add (each step of the greedy search consumes one).
- */
- if ((step + 1) < nmvstats)
- choose_mv_statistics_greedy(root, step+1,
- nmvstats, mvstats, stats_attnums,
- nclauses, clauses, clauses_attnums,
- nconditions, conditions, conditions_attnums,
- cover_map, condition_map, ruled_out,
- current, best);
-
- /* reset the last step */
- current->nclauses -= num_cov_clauses[best_stat];
- current->nconditions -= num_cond_clauses[best_stat];
- current->nstats -= 1;
- current->stats[step] = 0;
-
- /* mark the statistics as usable again */
- ruled_out[best_stat] = -1;
- }
-
- /* reset all statistics eliminated in this step */
- for (i = 0; i < nmvstats; i++)
- if (ruled_out[i] == step)
- ruled_out[i] = -1;
-
- /* free everything allocated in this step */
- pfree(covered_clauses);
- pfree(attnum_counts);
- pfree(attnum_cond_counts);
- pfree(num_cov_clauses);
- pfree(num_cov_columns);
- pfree(num_cond_clauses);
- pfree(num_cond_columns);
-}
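-
-/*
- * A worked example of the greedy gain metric (hypothetical numbers,
- * ignoring the separate conditions list): suppose a candidate
- * statistics covers three clauses, (a=1), (b=2) and (a<2 OR c>1),
- * of which (a=1) and (b=2) are already covered by previously chosen
- * stats. Then
- *
- *     num_cov_columns  = 1 + 1 + 2 = 4
- *     num_cond_columns = 1 + 1     = 2
- *     gain             = 2 / 4     = 0.5
- *
- * so candidates reusing more already-estimated columns relative to
- * the columns they cover score higher.
- */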
-
-/*
- * Chooses the combination of statistics, optimal for estimation of
- * a particular clause list.
- *
- * This only handles the 'preparation' step shared by the exhaustive
- * and greedy implementations (see the previous methods), mostly trying
- * to reduce the size of the problem (eliminating clauses/statistics
- * that can't really be used in the solution).
- *
- * It also precomputes bitmaps for attributes covered by clauses and
- * statistics, so that we don't need to do that over and over in the
- * actual optimizations (as it's both CPU and memory intensive).
- *
- * TODO This will probably have to consider compatibility of clauses,
- * because 'dependencies' will probably work only with equality
- * clauses.
- *
- * TODO Another way to make the optimization problems smaller might
- * be splitting the statistics into several disjoint subsets, i.e.
- * if we can split the graph of statistics (after the elimination)
- * into multiple components (so that stats in different components
- * share no attributes), we can do the optimization for each
- * component separately.
- *
- * TODO If we could compute what is a "perfect solution" maybe we could
- * terminate the search after reaching ~90% of it? Say, if we knew
- * that we can cover 10 clauses and reuse 8 dependencies, maybe
- * covering 9 clauses and 7 dependencies would be OK?
- */
-static List*
-choose_mv_statistics(PlannerInfo *root, List *stats,
- List *clauses, List *conditions,
- Oid varRelid, SpecialJoinInfo *sjinfo)
-{
- int i;
- mv_solution_t *best = NULL;
- List *result = NIL;
-
- int nmvstats;
- MVStatisticInfo *mvstats;
-
- /* we only work with MCV lists and histograms here */
- int type = (MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST);
-
- bool *clause_cover_map = NULL,
- *condition_cover_map = NULL;
- int *ruled_out = NULL;
-
- /* build bitmapsets for all stats and clauses */
- Bitmapset **stats_attnums;
- Bitmapset **clauses_attnums;
- Bitmapset **conditions_attnums;
-
- int nclauses, nconditions;
- Node ** clauses_array;
- Node ** conditions_array;
-
- /* copy lists, so that we can free them during elimination easily */
- clauses = list_copy(clauses);
- conditions = list_copy(conditions);
- stats = list_copy(stats);
-
- /*
- * Reduce the optimization problem size as much as possible.
- *
- * Eliminate clauses and conditions not covered by any statistics,
- * or statistics not matching at least two attributes (one of them
- * has to be in a regular clause).
- *
- * It's possible that removing a statistics in one iteration
- * eliminates a clause in the next one, so we'll repeat this until we
- * eliminate no clauses/stats in that iteration.
- *
- * This can only happen after eliminating a statistics - clauses are
- * eliminated first, so statistics always reflect that.
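- *
- * For example, with clauses [(a=1), (b=2), (c=3)] and statistics
- * S1 on (a,b), S2 on (c,d): in the first iteration S2 matches
- * only a single clause attribute (c) and gets eliminated; in the
- * next iteration (c=3) is no longer covered by any statistics and
- * gets eliminated too, leaving [(a=1), (b=2)] and [S1].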
- */
- while (true)
- {
- List *tmp;
-
- Bitmapset *compatible_attnums = NULL;
- Bitmapset *condition_attnums = NULL;
- Bitmapset *all_attnums = NULL;
-
- /*
- * Clauses
- *
- * Walk through clauses and keep only those covered by at least
- * one of the statistics we still have. We'll also keep info
- * about attnums in clauses (without conditions) so that we can
- * ignore stats covering just conditions (which is pointless).
- */
- tmp = filter_clauses(root, varRelid, sjinfo, type,
- stats, clauses, &compatible_attnums);
-
- /* discard the original list */
- list_free(clauses);
- clauses = tmp;
-
- /*
- * Conditions
- *
- * Walk through conditions and keep only those covered by at least
- * one of the statistics we still have. Also, collect bitmap of
- * attributes so that we can make sure we add at least one new
- * attribute (by comparing with clauses).
- */
- if (conditions != NIL)
- {
- tmp = filter_clauses(root, varRelid, sjinfo, type,
- stats, conditions, &condition_attnums);
-
- /* discard the original list */
- list_free(conditions);
- conditions = tmp;
- }
-
- /* get a union of attnums (from conditions and new clauses) */
- all_attnums = bms_union(compatible_attnums, condition_attnums);
-
- /*
- * Statistics
- *
- * Walk through statistics and only keep those covering at least
- * one new attribute (excluding conditions) and at least two attributes
- * in both clauses and conditions.
- */
- tmp = filter_stats(stats, compatible_attnums, all_attnums);
-
- /* if we've not eliminated anything, terminate */
- if (list_length(stats) == list_length(tmp))
- break;
-
- /* work only with filtered statistics from now */
- list_free(stats);
- stats = tmp;
- }
-
- /* only do the optimization if we have clauses/statistics */
- if ((list_length(stats) == 0) || (list_length(clauses) == 0))
- return NULL;
-
- /* remove redundant stats (stats covered by another stats) */
- stats = filter_redundant_stats(stats, clauses, conditions);
-
- /*
- * TODO We should sort the stats to make the order deterministic,
- * otherwise we may get different estimates on different
- * executions - if there are multiple "equally good" solutions,
- * we'll keep the first solution we see.
- *
- * Sorting by OID probably is not the right solution though,
- * because we'd like it to be somehow reproducible,
- * irrespective of the order of ADD STATISTICS commands.
- * So maybe statkeys?
- */
- mvstats = make_stats_array(stats, &nmvstats);
- stats_attnums = make_stats_attnums(mvstats, nmvstats);
-
- /* collect clauses and a bitmap of attnums */
- clauses_array = make_clauses_array(clauses, &nclauses);
- clauses_attnums = make_clauses_attnums(root, varRelid, sjinfo, type,
- clauses_array, nclauses);
-
- /* collect conditions and bitmap of attnums */
- conditions_array = make_clauses_array(conditions, &nconditions);
- conditions_attnums = make_clauses_attnums(root, varRelid, sjinfo, type,
- conditions_array, nconditions);
-
- /*
- * Build bitmaps with info about which clauses/conditions are
- * covered by each statistics (so that we don't need to call the
- * bms_is_subset over and over again).
- */
- clause_cover_map = make_cover_map(stats_attnums, nmvstats,
- clauses_attnums, nclauses);
-
- condition_cover_map = make_cover_map(stats_attnums, nmvstats,
- conditions_attnums, nconditions);
-
- ruled_out = (int*)palloc0(nmvstats * sizeof(int));
-
- /* no stats are ruled out by default */
- for (i = 0; i < nmvstats; i++)
- ruled_out[i] = -1;
-
- /* do the optimization itself */
- if (mvstat_search_type == MVSTAT_SEARCH_EXHAUSTIVE)
- choose_mv_statistics_exhaustive(root, 0,
- nmvstats, mvstats, stats_attnums,
- nclauses, clauses_array, clauses_attnums,
- nconditions, conditions_array, conditions_attnums,
- clause_cover_map, condition_cover_map,
- ruled_out, NULL, &best);
- else
- choose_mv_statistics_greedy(root, 0,
- nmvstats, mvstats, stats_attnums,
- nclauses, clauses_array, clauses_attnums,
- nconditions, conditions_array, conditions_attnums,
- clause_cover_map, condition_cover_map,
- ruled_out, NULL, &best);
-
- /* create a list of statistics from the array */
- if (best != NULL)
- {
- for (i = 0; i < best->nstats; i++)
- {
- MVStatisticInfo *info = makeNode(MVStatisticInfo);
- memcpy(info, &mvstats[best->stats[i]], sizeof(MVStatisticInfo));
- result = lappend(result, info);
- }
- pfree(best);
- }
-
- /* cleanup (maybe leave it up to the memory context?) */
- for (i = 0; i < nmvstats; i++)
- bms_free(stats_attnums[i]);
-
- for (i = 0; i < nclauses; i++)
- bms_free(clauses_attnums[i]);
-
- for (i = 0; i < nconditions; i++)
- bms_free(conditions_attnums[i]);
-
- pfree(stats_attnums);
- pfree(clauses_attnums);
- pfree(conditions_attnums);
-
- pfree(clauses_array);
- pfree(conditions_array);
- pfree(clause_cover_map);
- pfree(condition_cover_map);
- pfree(ruled_out);
- pfree(mvstats);
-
- list_free(clauses);
- list_free(conditions);
- list_free(stats);
-
- return result;
-}
-
-
-/*
- * This splits the clauses list into two parts - one containing clauses
- * that will be evaluated using the chosen statistics, and the remaining
- * clauses (either non-mvcompatible, or not related to the histogram).
- */
-static List *
-clauselist_mv_split(PlannerInfo *root, SpecialJoinInfo *sjinfo,
- List *clauses, Oid varRelid, List **mvclauses,
- MVStatisticInfo *mvstats, int types)
-{
- int i;
- ListCell *l;
- List *non_mvclauses = NIL;
-
- /* FIXME is there a better way to get info on int2vector? */
- int2vector * attrs = mvstats->stakeys;
- int numattrs = mvstats->stakeys->dim1;
-
- Bitmapset *mvattnums = NULL;
-
- /*
- * build bitmap of attributes covered by the stats, so we can
- * do bms_is_subset later
- */
- for (i = 0; i < numattrs; i++)
- mvattnums = bms_add_member(mvattnums, attrs->values[i]);
-
- /* erase the list of mv-compatible clauses */
- *mvclauses = NIL;
-
- foreach (l, clauses)
- {
- bool match = false; /* by default not mv-compatible */
- Bitmapset *attnums = NULL;
- Node *clause = (Node *) lfirst(l);
-
- if (clause_is_mv_compatible(root, clause, varRelid, NULL,
- &attnums, sjinfo, types))
- {
- /* are all the attributes part of the selected stats? */
- if (bms_is_subset(attnums, mvattnums))
- match = true;
- }
-
- /*
- * The clause matches the selected stats, so put it on the list
- * of mv-compatible clauses. Otherwise, keep it in the list of
- * 'regular' clauses (that may be selected later).
- */
- if (match)
- *mvclauses = lappend(*mvclauses, clause);
- else
- non_mvclauses = lappend(non_mvclauses, clause);
- }
-
- /*
- * Perform regular estimation using the clauses incompatible
- * with the chosen histogram (or MV stats in general).
- */
- return non_mvclauses;
-
-}
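-
-/*
- * For example, with statistics on (a,b) and clauses
- * [(a=1), (b=2), (c=3)], *mvclauses is set to [(a=1), (b=2)] and
- * [(c=3)] is returned for regular (per-column) estimation.
- */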
-
-/*
- * Determines whether the clause is compatible with multivariate stats,
- * and if it is, returns some additional information - varno (index
- * into simple_rte_array) and a bitmap of attributes. This is then
- * used to fetch related multivariate statistics.
- *
- * At this moment we only support basic conditions of the form
- *
- * variable OP constant
- *
- * where OP is one of [=,<,<=,>=,>] (which is however determined by
- * looking at the associated function for estimating selectivity, just
- * like with the single-dimensional case).
- *
- * TODO Support 'OR clauses' - shouldn't be all that difficult to
- * evaluate them using multivariate stats.
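- *
- * For illustration, clauses like (a = 1), (a < 10) or (a IS NULL)
- * are accepted, while (a = b) is rejected as a join clause and
- * (a + 1 = 3) is rejected because one side is not a simple Var.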
- */
-static bool
-clause_is_mv_compatible(PlannerInfo *root, Node *clause, Oid varRelid,
- Index *relid, Bitmapset **attnums, SpecialJoinInfo *sjinfo,
- int types)
-{
- Relids clause_relids;
- Relids left_relids;
- Relids right_relids;
-
- if (IsA(clause, RestrictInfo))
- {
- RestrictInfo *rinfo = (RestrictInfo *) clause;
-
- /* Pseudoconstants are not really interesting here. */
- if (rinfo->pseudoconstant)
- return false;
-
- /* get the actual clause from the RestrictInfo (it's not an OR clause) */
- clause = (Node*)rinfo->clause;
-
- /* we don't support join conditions at this moment */
- if (treat_as_join_clause(clause, rinfo, varRelid, sjinfo))
- return false;
-
- clause_relids = rinfo->clause_relids;
- left_relids = rinfo->left_relids;
- right_relids = rinfo->right_relids;
- }
- else if (is_opclause(clause) && list_length(((OpExpr *) clause)->args) == 2)
- {
- left_relids = pull_varnos(get_leftop((Expr*)clause));
- right_relids = pull_varnos(get_rightop((Expr*)clause));
-
- clause_relids = bms_union(left_relids,
- right_relids);
- }
- else
- {
- /* Not a binary opclause, so mark left/right relid sets as empty */
- left_relids = NULL;
- right_relids = NULL;
- /* and get the total relid set the hard way */
- clause_relids = pull_varnos((Node *) clause);
- }
-
- /*
- * Only simple opclauses and IS NULL tests are compatible with
- * multivariate stats at this point.
- */
- if ((is_opclause(clause))
- && (list_length(((OpExpr *) clause)->args) == 2))
- {
- OpExpr *expr = (OpExpr *) clause;
- bool varonleft = true;
- bool ok;
-
- /* is it 'variable op constant' ? */
- ok = (bms_membership(clause_relids) == BMS_SINGLETON) &&
- (is_pseudo_constant_clause_relids(lsecond(expr->args),
- right_relids) ||
- (varonleft = false,
- is_pseudo_constant_clause_relids(linitial(expr->args),
- left_relids)));
-
- if (ok)
- {
- Var * var = (varonleft) ? linitial(expr->args) : lsecond(expr->args);
-
- /*
- * Simple variables only - otherwise the planner_rt_fetch seems to fail
- * (return NULL).
- *
- * TODO Maybe using examine_variable() would fix that?
- */
- if (! (IsA(var, Var) && (varRelid == 0 || varRelid == var->varno)))
- return false;
-
- /*
- * Only consider this variable if (varRelid == 0) or when the varno
- * matches varRelid (see explanation at clause_selectivity).
- *
- * FIXME I suspect this may not be really necessary. The (varRelid == 0)
- * part seems to be enforced by treat_as_join_clause().
- */
- if (! ((varRelid == 0) || (varRelid == var->varno)))
- return false;
-
- /* Also skip special varno values, and system attributes ... */
- if ((IS_SPECIAL_VARNO(var->varno)) || (! AttrNumberIsForUserDefinedAttr(var->varattno)))
- return false;
-
- /* Lookup info about the base relation (we need to pass the OID out) */
- if (relid != NULL)
- *relid = var->varno;
-
- /*
- * If it's not a "<" or ">" or "=" operator, just ignore the
- * clause. Otherwise note the relid and attnum for the variable.
- * This uses the function for estimating selectivity, not the
- * operator directly (a bit awkward, but well ...).
- */
- switch (get_oprrest(expr->opno))
- {
- case F_SCALARLTSEL:
- case F_SCALARGTSEL:
- /* not compatible with functional dependencies */
- if (types & (MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST))
- {
- *attnums = bms_add_member(*attnums, var->varattno);
- return (types & (MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST));
- }
- return false;
-
- case F_EQSEL:
- *attnums = bms_add_member(*attnums, var->varattno);
- return true;
- }
- }
- }
- else if (IsA(clause, NullTest)
- && IsA(((NullTest*)clause)->arg, Var))
- {
- Var * var = (Var*)((NullTest*)clause)->arg;
-
- /*
- * Simple variables only - otherwise the planner_rt_fetch seems to fail
- * (return NULL).
- *
- * TODO Maybe using examine_variable() would fix that?
- */
- if (! (IsA(var, Var) && (varRelid == 0 || varRelid == var->varno)))
- return false;
-
- /*
- * Only consider this variable if (varRelid == 0) or when the varno
- * matches varRelid (see explanation at clause_selectivity).
- *
- * FIXME I suspect this may not be really necessary. The (varRelid == 0)
- * part seems to be enforced by treat_as_join_clause().
- */
- if (! ((varRelid == 0) || (varRelid == var->varno)))
- return false;
-
- /* Also skip special varno values, and system attributes ... */
- if ((IS_SPECIAL_VARNO(var->varno)) || (! AttrNumberIsForUserDefinedAttr(var->varattno)))
- return false;
-
- /* Lookup info about the base relation (we need to pass the OID out) */
- if (relid != NULL)
- *relid = var->varno;
-
- *attnums = bms_add_member(*attnums, var->varattno);
-
- return true;
- }
- else if (or_clause(clause) || and_clause(clause))
- {
- /*
- * AND/OR-clauses are supported if all sub-clauses are supported
- *
- * TODO We might support mixed case, where some of the clauses
- * are supported and some are not, and treat all supported
- * subclauses as a single clause, compute its selectivity
- * using mv stats, and compute the total selectivity using
- * the current algorithm.
- *
- * TODO For RestrictInfo above an OR-clause, we might use the
- * orclause with nested RestrictInfo - we won't have to
- * call pull_varnos() for each clause, saving time.
- */
- Bitmapset *tmp = NULL;
- ListCell *l;
- foreach (l, ((BoolExpr*)clause)->args)
- {
- if (! clause_is_mv_compatible(root, (Node*)lfirst(l),
- varRelid, relid, &tmp, sjinfo, types))
- return false;
- }
-
- /* add the attnums from the OR-clause to the set of attnums */
- *attnums = bms_join(*attnums, tmp);
-
- return true;
- }
-
- return false;
-}
-
-
-static Bitmapset *
-clause_mv_get_attnums(PlannerInfo *root, Node *clause)
-{
- Bitmapset * attnums = NULL;
-
- /* Extract clause from restrict info, if needed. */
- if (IsA(clause, RestrictInfo))
- clause = (Node*)((RestrictInfo*)clause)->clause;
-
- /*
- * Only simple opclauses and IS NULL tests are compatible with
- * multivariate stats at this point.
- */
- if ((is_opclause(clause))
- && (list_length(((OpExpr *) clause)->args) == 2))
- {
- OpExpr *expr = (OpExpr *) clause;
-
- if (IsA(linitial(expr->args), Var))
- attnums = bms_add_member(attnums,
- ((Var*)linitial(expr->args))->varattno);
- else
- attnums = bms_add_member(attnums,
- ((Var*)lsecond(expr->args))->varattno);
- }
- else if (IsA(clause, NullTest)
- && IsA(((NullTest*)clause)->arg, Var))
- {
- attnums = bms_add_member(attnums,
- ((Var*)((NullTest*)clause)->arg)->varattno);
- }
- else if (or_clause(clause) || and_clause(clause))
- {
- ListCell *l;
- foreach (l, ((BoolExpr*)clause)->args)
- {
- attnums = bms_join(attnums,
- clause_mv_get_attnums(root, (Node*)lfirst(l)));
- }
- }
-
- return attnums;
-}
-
-/*
- * Performs reduction of clauses using functional dependencies, i.e.
- * removes clauses that are considered redundant. It simply walks
- * through dependencies, and checks whether the dependency 'matches'
- * the clauses, i.e. if there's a clause matching the condition. If yes,
- * all clauses matching the implied part of the dependency are removed
- * from the list.
- *
- * This simply looks at attnums references by the clauses, not at the
- * type of the operator (equality, inequality, ...). This may not be the
- * right way to do - it certainly works best for equalities, which is
- * naturally consistent with functional dependencies (implications).
- * It's not clear that other operators are handled sensibly - for
- * example for inequalities, like
- *
- * WHERE (A >= 10) AND (B <= 20)
- *
- * and a trivial case where [A == B], resulting in symmetric pair of
- * rules [A => B], [B => A], it's rather clear we can't remove either of
- * those clauses.
- *
- * That only highlights that functional dependencies are most suitable
- * for label-like data, where using non-equality operators is very rare.
- * Using the common city/zipcode example, clauses like
- *
- * (zipcode <= 12345)
- *
- * or
- *
- * (cityname >= 'Washington')
- *
- * are rare. So restricting the reduction to equality should not harm
- * the usefulness / applicability.
- *
- * The other assumption is that this assumes 'compatible' clauses. For
- * example by using mismatching zip code and city name, this is unable
- * to identify the discrepancy and eliminates one of the clauses. The
- * usual approach (multiplying both selectivities) thus produces a more
- * accurate estimate, although mostly by luck - the multiplication
- * comes from assumption of statistical independence of the two
- * conditions (which is not valid in this case), but moves the
- * estimate in the right direction (towards 0%).
- *
- * This might be somewhat improved by cross-checking the selectivities
- * against MCV and/or histogram.
- *
- * The implementation needs to be careful about cyclic rules, i.e. rules
- * like [A => B] and [B => A] at the same time. This must not reduce
- * clauses on both attributes at the same time.
- *
- * Technically we might consider selectivities here too, somehow. E.g.
- * when (A => B) and (B => A), we might use the clauses with minimum
- * selectivity.
- *
- * TODO Consider restricting the reduction to equality clauses. Or maybe
- * use equality classes somehow?
- *
- * TODO Merge this docs to dependencies.c, as it's saying mostly the
- * same things as the comments there.
- *
- * TODO Currently this is applied only to the top-level clauses, but
- * maybe we could apply it to lists at subtrees too, e.g. to the
- * two AND-clauses in
- *
- * (x=1 AND y=2) OR (z=3 AND q=10)
- *
- */
-static List *
-clauselist_apply_dependencies(PlannerInfo *root, List *clauses,
- Oid varRelid, List *stats,
- SpecialJoinInfo *sjinfo)
-{
- List *reduced_clauses = NIL;
- Index relid;
-
- /*
- * matrix of (natts x natts), 1 means x=>y
- *
- * This serves two purposes - first, it merges dependencies from all
- * the statistics, second it makes generating all the transitive
- * dependencies easier.
- *
- * We need to build this only for attributes from the dependencies,
- * not for all attributes in the table.
- *
- * We can't do that only for attributes from the clauses, because we
- * want to build transitive dependencies (including those going
- * through attributes not listed in the stats).
- *
- * This only works for A=>B dependencies, not sure how to do that
- * for complex dependencies.
- */
- bool *deps_matrix;
- int deps_natts; /* size of the matrix */
-
- /* mapping attnum <=> matrix index */
- int *deps_idx_to_attnum;
- int *deps_attnum_to_idx;
-
- /* attnums in dependencies and clauses (and intersection) */
- List *deps_clauses = NIL;
- Bitmapset *deps_attnums = NULL;
- Bitmapset *clause_attnums = NULL;
- Bitmapset *intersect_attnums = NULL;
-
- /*
- * Is there at least one statistics with functional dependencies?
- * If not, return the original clauses right away.
- *
- * XXX Isn't this pointless, thanks to exactly the same check in
- * clauselist_selectivity()? Can we trigger the condition here?
- */
- if (! has_stats(stats, MV_CLAUSE_TYPE_FDEP))
- return clauses;
-
- /*
- * Build the dependency matrix, i.e. attribute adjacency matrix,
- * where 1 means (a=>b). Once we have the adjacency matrix, we'll
- * multiply it by itself, to get transitive dependencies.
- *
- * Note: This is pretty much transitive closure from graph theory.
- *
- * First, let's see what attributes are covered by functional
- * dependencies (sides of the adjacency matrix), and also a maximum
- * attribute (size of mapping to simple integer indexes);
- */
- deps_attnums = fdeps_collect_attnums(stats);
-
- /*
- * Walk through the clauses - clauses that are (one of)
- *
- * (a) not mv-compatible
- * (b) are using more than a single attnum
- * (c) using an attnum not covered by functional dependencies
- *
- * may be copied directly to the result. The interesting clauses are
- * kept in 'deps_clauses' and will be processed later.
- */
- clause_attnums = fdeps_filter_clauses(root, clauses, deps_attnums,
- &reduced_clauses, &deps_clauses,
- varRelid, &relid, sjinfo);
-
- /*
- * we need at least two clauses referencing two different
- * attributes to do the reduction
- */
- if ((list_length(deps_clauses) < 2) || (bms_num_members(clause_attnums) < 2))
- {
- bms_free(clause_attnums);
- list_free(reduced_clauses);
- list_free(deps_clauses);
-
- return clauses;
- }
-
-
- /*
- * We need at least two matching attributes in the clauses and
- * dependencies, otherwise we can't really reduce anything.
- */
- intersect_attnums = bms_intersect(clause_attnums, deps_attnums);
- if (bms_num_members(intersect_attnums) < 2)
- {
- bms_free(clause_attnums);
- bms_free(deps_attnums);
- bms_free(intersect_attnums);
-
- list_free(deps_clauses);
- list_free(reduced_clauses);
-
- return clauses;
- }
-
- /*
- * Build mapping between matrix indexes and attnums, and then the
- * adjacency matrix itself.
- */
- deps_idx_to_attnum = make_idx_to_attnum_mapping(deps_attnums);
- deps_attnum_to_idx = make_attnum_to_idx_mapping(deps_attnums);
-
- /* build the adjacency matrix */
- deps_matrix = build_adjacency_matrix(stats, deps_attnums,
- deps_idx_to_attnum,
- deps_attnum_to_idx);
-
- deps_natts = bms_num_members(deps_attnums);
-
- /*
- * Multiply the matrix N-times (N = size of the matrix), so that we
- * get all the transitive dependencies. That makes the next step
- * much easier and faster.
- *
- * This is essentially an adjacency matrix from graph theory, and
- * by multiplying it we get transitive edges. We don't really care
- * about the exact number (number of paths between vertices) though,
- * so we can do the multiplication in-place (we don't care whether
- * we found the dependency in this round or in the previous one).
- *
- * Track how many new dependencies were added, and stop when 0, but
- * we can't multiply more than N-times (longest path in the graph).
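- *
- * For example, with dependencies [a => b] and [b => c] the first
- * pass adds the transitive [a => c]; a second pass adds nothing
- * new, so the loop terminates early.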
- */
- multiply_adjacency_matrix(deps_matrix, deps_natts);
-
- /*
- * Walk through the clauses, and see which other clauses we may
- * reduce. The matrix contains all transitive dependencies, which
- * makes this very fast.
- *
- * We have to be careful not to reduce the clause using itself, or
- * reducing all clauses forming a cycle (so we have to skip already
- * eliminated clauses).
- *
- * I'm not sure whether this guarantees finding the best solution,
- * i.e. reducing the most clauses, but it probably does (thanks to
- * having all the transitive dependencies).
- */
- deps_clauses = fdeps_reduce_clauses(deps_clauses,
- deps_attnums, deps_matrix,
- deps_idx_to_attnum,
- deps_attnum_to_idx, relid);
-
- /* join the two lists of clauses */
- reduced_clauses = list_union(reduced_clauses, deps_clauses);
-
- pfree(deps_matrix);
- pfree(deps_idx_to_attnum);
- pfree(deps_attnum_to_idx);
-
- bms_free(deps_attnums);
- bms_free(clause_attnums);
- bms_free(intersect_attnums);
-
- return reduced_clauses;
-}
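-
-/*
- * A worked example of the reduction: with a functional dependency
- * (zipcode => city) and clauses
- *
- *     (zipcode = 12345) AND (city = 'Washington')
- *
- * the city clause is implied by the zipcode clause, so it is
- * removed and only the zipcode clause gets estimated, avoiding the
- * underestimate caused by multiplying the two selectivities.
- */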
-
-static bool
-has_stats(List *stats, int type)
-{
- ListCell *s;
-
- foreach (s, stats)
- {
- MVStatisticInfo *stat = (MVStatisticInfo *)lfirst(s);
-
- if ((type & MV_CLAUSE_TYPE_FDEP) && stat->deps_built)
- return true;
-
- if ((type & MV_CLAUSE_TYPE_MCV) && stat->mcv_built)
- return true;
-
- if ((type & MV_CLAUSE_TYPE_HIST) && stat->hist_built)
- return true;
- }
-
- return false;
-}
-
-/*
- * Determine the relid (either from varRelid or from the clauses) and
- * then look up stats using it.
- */
-static List *
-find_stats(PlannerInfo *root, List *clauses, Oid varRelid, Index *relid)
-{
- /* unknown relid by default */
- *relid = InvalidOid;
-
- /*
- * First we need to find the relid (index into simple_rel_array).
- * If varRelid is not 0, we already have it, otherwise we have to
- * look it up from the clauses.
- */
- if (varRelid != 0)
- *relid = varRelid;
- else
- {
- Relids relids = pull_varnos((Node*)clauses);
-
- /*
- * We only expect 0 or 1 members in the bitmapset. If there are
- * no vars, we'll get empty bitmapset, otherwise we'll get the
- * relid as the single member.
- *
- * FIXME For some reason we can get 2 relids here (e.g. \d in
- * psql does that).
- */
- if (bms_num_members(relids) == 1)
- *relid = bms_singleton_member(relids);
-
- bms_free(relids);
- }
-
- /*
- * if we found the relid, we can get the stats from simple_rel_array
- *
- * This only gets stats that are already built, because that's how
- * we load it into RelOptInfo (see get_relation_info), but we don't
- * detoast the whole stats yet. That'll be done later, after we
- * decide which stats to use.
- */
- if (*relid != InvalidOid)
- return root->simple_rel_array[*relid]->mvstatlist;
-
- return NIL;
-}
-
-static Bitmapset*
-fdeps_collect_attnums(List *stats)
-{
- ListCell *lc;
- Bitmapset *attnums = NULL;
-
- foreach (lc, stats)
- {
- int j;
- MVStatisticInfo *info = (MVStatisticInfo *)lfirst(lc);
-
- int2vector *stakeys = info->stakeys;
-
- /* skip stats without functional dependencies built */
- if (! info->deps_built)
- continue;
-
- for (j = 0; j < stakeys->dim1; j++)
- attnums = bms_add_member(attnums, stakeys->values[j]);
- }
-
- return attnums;
-}
-
-
-static int*
-make_idx_to_attnum_mapping(Bitmapset *attnums)
-{
- int attidx = 0;
- int attnum = -1;
-
- int *mapping = (int*)palloc0(bms_num_members(attnums) * sizeof(int));
-
- while ((attnum = bms_next_member(attnums, attnum)) >= 0)
- mapping[attidx++] = attnum;
-
- Assert(attidx == bms_num_members(attnums));
-
- return mapping;
-}
-
-static int*
-make_attnum_to_idx_mapping(Bitmapset *attnums)
-{
- int attidx = 0;
- int attnum = -1;
- int maxattnum = -1;
- int *mapping;
-
- while ((attnum = bms_next_member(attnums, attnum)) >= 0)
- maxattnum = attnum;
-
- mapping = (int*)palloc0((maxattnum+1) * sizeof(int));
-
- attnum = -1;
- while ((attnum = bms_next_member(attnums, attnum)) >= 0)
- mapping[attnum] = attidx++;
-
- Assert(attidx == bms_num_members(attnums));
-
- return mapping;
-}
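-
-/*
- * For example, for attnums {2, 5, 7} the two mappings above are
- *
- *     idx_to_attnum = [2, 5, 7]                (index -> attnum)
- *     attnum_to_idx = [2 -> 0, 5 -> 1, 7 -> 2] (attnum -> index)
- *
- * which lets the adjacency matrix use dense indexes for sparse
- * attnums.
- */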
-
-static bool*
-build_adjacency_matrix(List *stats, Bitmapset *attnums,
- int *idx_to_attnum, int *attnum_to_idx)
-{
- ListCell *lc;
- int natts = bms_num_members(attnums);
- bool *matrix = (bool*)palloc0(natts * natts * sizeof(bool));
-
- foreach (lc, stats)
- {
- int j;
- MVStatisticInfo *stat = (MVStatisticInfo *)lfirst(lc);
- MVDependencies dependencies = NULL;
-
- /* skip stats without functional dependencies built */
- if (! stat->deps_built)
- continue;
-
- /* fetch and deserialize dependencies */
- dependencies = load_mv_dependencies(stat->mvoid);
- if (dependencies == NULL)
- {
- elog(WARNING, "failed to deserialize func deps %d", stat->mvoid);
- continue;
- }
-
- /* set matrix[a,b] to 'true' if 'a=>b' */
- for (j = 0; j < dependencies->ndeps; j++)
- {
- int aidx = attnum_to_idx[dependencies->deps[j]->a];
- int bidx = attnum_to_idx[dependencies->deps[j]->b];
-
- /* a=> b */
- matrix[aidx * natts + bidx] = true;
- }
+ varRelid,
+ jointype,
+ sjinfo);
}
+ else if (IsA(clause, CurrentOfExpr))
+ {
+ /* CURRENT OF selects at most one row of its table */
+ CurrentOfExpr *cexpr = (CurrentOfExpr *) clause;
+ RelOptInfo *crel = find_base_rel(root, cexpr->cvarno);
- return matrix;
-}
-
-static void
-multiply_adjacency_matrix(bool *matrix, int natts)
-{
- int i;
-
- for (i = 0; i < natts; i++)
+ if (crel->tuples > 0)
+ s1 = 1.0 / crel->tuples;
+ }
+ else if (IsA(clause, RelabelType))
+ {
+ /* Not sure this case is needed, but it can't hurt */
+ s1 = clause_selectivity(root,
+ (Node *) ((RelabelType *) clause)->arg,
+ varRelid,
+ jointype,
+ sjinfo);
+ }
+ else if (IsA(clause, CoerceToDomain))
{
- int k, l, m;
- int nchanges = 0;
+ /* Not sure this case is needed, but it can't hurt */
+ s1 = clause_selectivity(root,
+ (Node *) ((CoerceToDomain *) clause)->arg,
+ varRelid,
+ jointype,
+ sjinfo);
+ }
- /* k => l */
- for (k = 0; k < natts; k++)
- {
- for (l = 0; l < natts; l++)
- {
- /* we already have this dependency */
- if (matrix[k * natts + l])
- continue;
+ /* Cache the result if possible */
+ if (cacheable)
+ {
+ if (jointype == JOIN_INNER)
+ rinfo->norm_selec = s1;
+ else
+ rinfo->outer_selec = s1;
+ }
- /* we don't really care about the exact value, just 0/1 */
- for (m = 0; m < natts; m++)
- {
- if (matrix[k * natts + m] * matrix[m * natts + l])
- {
- matrix[k * natts + l] = true;
- nchanges += 1;
- break;
- }
- }
- }
- }
+#ifdef SELECTIVITY_DEBUG
+ elog(DEBUG4, "clause_selectivity: s1 %f", s1);
+#endif /* SELECTIVITY_DEBUG */
- /* no transitive dependency added here, so terminate */
- if (nchanges == 0)
- break;
- }
+ return s1;
}
+
static List*
fdeps_reduce_clauses(List *clauses, Bitmapset *attnums, bool *matrix,
int *idx_to_attnum, int *attnum_to_idx, Index relid)
@@ -3427,55 +1211,6 @@ fdeps_reduce_clauses(List *clauses, Bitmapset *attnums, bool *matrix,
}
-static Bitmapset *
-fdeps_filter_clauses(PlannerInfo *root,
- List *clauses, Bitmapset *deps_attnums,
- List **reduced_clauses, List **deps_clauses,
- Oid varRelid, Index *relid, SpecialJoinInfo *sjinfo)
-{
- ListCell *lc;
- Bitmapset *clause_attnums = NULL;
-
- foreach (lc, clauses)
- {
- Bitmapset *attnums = NULL;
- Node *clause = (Node *) lfirst(lc);
-
- if (! clause_is_mv_compatible(root, clause, varRelid, relid, &attnums,
- sjinfo, MV_CLAUSE_TYPE_FDEP))
-
- /* clause incompatible with functional dependencies */
- *reduced_clauses = lappend(*reduced_clauses, clause);
-
- else if (bms_num_members(attnums) > 1)
-
- /*
- * clause referencing multiple attributes (strange; shouldn't
- * this be handled by clause_is_mv_compatible directly?)
- */
- *reduced_clauses = lappend(*reduced_clauses, clause);
-
- else if (! bms_is_member(bms_singleton_member(attnums), deps_attnums))
-
- /* clause not covered by the dependencies */
- *reduced_clauses = lappend(*reduced_clauses, clause);
-
- else
- {
- /* ok, clause compatible with existing dependencies */
- Assert(bms_num_members(attnums) == 1);
-
- *deps_clauses = lappend(*deps_clauses, clause);
- clause_attnums = bms_add_member(clause_attnums,
- bms_singleton_member(attnums));
- }
-
- bms_free(attnums);
- }
-
- return clause_attnums;
-}
-
/*
* Pull varattnos from the clauses, similarly to pull_varattnos() but:
*
@@ -3509,162 +1244,6 @@ get_varattnos(Node * node, Index relid)
return result;
}
-/*
- * Estimate selectivity of clauses using a MCV list.
- *
- * If there's no MCV list for the stats, the function returns 0.0.
- *
- * While computing the estimate, the function checks whether all the
- * columns were matched with an equality condition. If that's the case,
- * we can skip processing the histogram, as there can be no rows in
- * it with the same values - all the rows matching the condition are
- * represented by the MCV item. This can only happen with equality
- * on all the attributes.
- *
- * The algorithm works like this:
- *
- * 1) mark all items as 'match'
- * 2) walk through all the clauses
- * 3) for a particular clause, walk through all the items
- * 4) skip items that are already 'no match'
- * 5) check clause for items that still match
- * 6) sum frequencies for items to get selectivity
- *
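- * A worked example of the final estimate (hypothetical numbers):
- * if the MCV list covers u = 0.8 of the data, items matching the
- * conditions sum to t = 0.4, and items matching both conditions
- * and clauses sum to s = 0.1, the result is (s / t) * u = 0.2.
- *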
- * The function also returns the frequency of the least frequent item
- * on the MCV list, which may be useful for clamping the estimate from
- * histogram (all items not present in the MCV list are less frequent).
- * This however seems useful only for cases with conditions on all
- * attributes.
- *
- * TODO This only handles AND-ed clauses, but it might work for OR-ed
- * lists too - it just needs to reverse the logic a bit. I.e. start
- * with 'no match' for all items, and mark the items as a match
- * as the clauses are processed (and skip items that are 'match').
- */
-static Selectivity
-clauselist_mv_selectivity_mcvlist(PlannerInfo *root, MVStatisticInfo *mvstats,
- List *clauses, List *conditions, bool is_or,
- bool *fullmatch, Selectivity *lowsel)
-{
- int i;
- Selectivity s = 0.0;
- Selectivity t = 0.0;
- Selectivity u = 0.0;
-
- MCVList mcvlist = NULL;
-
- int nmatches = 0;
- int nconditions = 0;
-
- /* match/mismatch bitmap for each MCV item */
- char * matches = NULL;
- char * condition_matches = NULL;
-
- Assert(clauses != NIL);
- Assert(list_length(clauses) >= 1);
-
- /* there's no MCV list built yet */
- if (! mvstats->mcv_built)
- return 0.0;
-
- mcvlist = load_mv_mcvlist(mvstats->mvoid);
-
- Assert(mcvlist != NULL);
- Assert(mcvlist->nitems > 0);
-
- /* number of matching MCV items */
- nmatches = mcvlist->nitems;
- nconditions = mcvlist->nitems;
-
- /*
- * Bitmap of MCV item matches (mismatch, partial, full).
- *
- * For AND clauses all items match (and we'll eliminate them).
- * For OR clauses no items match (and we'll add them).
- *
- * We only need to do the memset for AND clauses (for OR clauses
- * it's already set correctly by the palloc0).
- */
- matches = palloc0(sizeof(char) * nmatches);
-
- if (! is_or) /* AND-clause */
- memset(matches, MVSTATS_MATCH_FULL, sizeof(char)*nmatches);
-
- /* Conditions are treated as an AND clause, so match by default. */
- condition_matches = palloc0(sizeof(char) * nconditions);
- memset(condition_matches, MVSTATS_MATCH_FULL, sizeof(char)*nconditions);
-
- /*
- * build the match bitmap for the conditions (conditions are always
- * connected by AND)
- */
- if (conditions != NIL)
- nconditions = update_match_bitmap_mcvlist(root, conditions,
- mvstats->stakeys, mcvlist,
- nconditions, condition_matches,
- lowsel, fullmatch, false);
-
- /*
- * build the match bitmap for the estimated clauses
- *
- * TODO This evaluates the clauses for all MCV items, even those
- * ruled out by the conditions. The final result should be the
- * same, but it might be faster.
- */
- nmatches = update_match_bitmap_mcvlist(root, clauses,
- mvstats->stakeys, mcvlist,
- ((is_or) ? 0 : nmatches), matches,
- lowsel, fullmatch, is_or);
-
- /* sum frequencies for all the matching MCV items */
- for (i = 0; i < mcvlist->nitems; i++)
- {
- /*
- * Find out what part of the data is covered by the MCV list,
- * so that we can 'scale' the selectivity properly (e.g. when
- * only 50% of the sample items got into the MCV, and the rest
- * is either in a histogram, or not covered by stats).
- *
- * TODO This might be handled by keeping a global "frequency"
- * for the whole list, which might save us a bit of time
- * spent on accessing the not-matching part of the MCV list.
- * Although it's likely in a cache, so it's very fast.
- */
- u += mcvlist->items[i]->frequency;
-
- /* skip MCV items not matching the conditions */
- if (condition_matches[i] == MVSTATS_MATCH_NONE)
- continue;
-
- if (matches[i] != MVSTATS_MATCH_NONE)
- s += mcvlist->items[i]->frequency;
-
- t += mcvlist->items[i]->frequency;
- }
-
- pfree(matches);
- pfree(condition_matches);
- pfree(mcvlist);
-
- /* no condition matches */
- if (t == 0.0)
- return (Selectivity)0.0;
-
- return (s / t) * u;
-}
-
-/*
- * Evaluate clauses using the MCV list, and update the match bitmap.
- *
- * The bitmap may be already partially set, so this is really a way to
- * combine results of several clause lists - either when computing
- * conditional probability P(A|B) or a combination of AND/OR clauses.
- *
- * TODO This works with 'bitmap' where each bit is represented as a char,
- * which is slightly wasteful. Instead, we could use a regular
- * bitmap, reducing the size to ~1/8. Another thing is merging the
- * bitmaps using & and |, which might be faster than min/max.
- */
static int
update_match_bitmap_mcvlist(PlannerInfo *root, List *clauses,
int2vector *stakeys, MCVList mcvlist,
@@ -3952,213 +1531,58 @@ update_match_bitmap_mcvlist(PlannerInfo *root, List *clauses,
/* match/mismatch bitmap for each MCV item */
int tmp_nmatches = 0;
- char * tmp_matches = NULL;
-
- Assert(tmp_clauses != NIL);
- Assert(list_length(tmp_clauses) >= 2);
-
- /* number of matching MCV items */
- tmp_nmatches = (or_clause(clause)) ? 0 : mcvlist->nitems;
-
- /* by default none of the MCV items matches the clauses */
- tmp_matches = palloc0(sizeof(char) * mcvlist->nitems);
-
- /* AND clauses assume everything matches, initially */
- if (! or_clause(clause))
- memset(tmp_matches, MVSTATS_MATCH_FULL, sizeof(char)*mcvlist->nitems);
-
- /* build the match bitmap for the OR-clauses */
- tmp_nmatches = update_match_bitmap_mcvlist(root, tmp_clauses,
- stakeys, mcvlist,
- tmp_nmatches, tmp_matches,
- lowsel, fullmatch, or_clause(clause));
-
- /* merge the bitmap into the existing one */
- for (i = 0; i < mcvlist->nitems; i++)
- {
- /*
- * To AND-merge the bitmaps, a MIN() semantics is used.
- * For OR-merge, use MAX().
- *
- * FIXME this does not decrease the number of matches
- */
- UPDATE_RESULT(matches[i], tmp_matches[i], is_or);
- }
-
- pfree(tmp_matches);
-
- }
- else
- elog(ERROR, "unknown clause type: %d", clause->type);
- }
-
- /*
- * If all the columns were matched by equality, it's a full match.
- * In this case there can be just a single MCV item matching the
- * clause (if two items matched, they would have to be equal).
- */
- *fullmatch = (bms_num_members(eqmatches) == mcvlist->ndimensions);
-
- /* free the allocated pieces */
- if (eqmatches)
- pfree(eqmatches);
-
- return nmatches;
-}
-
-/*
- * Estimate selectivity of clauses using a histogram.
- *
- * If there's no histogram for the stats, the function returns 0.0.
- *
- * The general idea of this method is similar to how MCV lists are
- * processed, except that this introduces the concept of a partial
- * match (MCV only works with full match / mismatch).
- *
- * The algorithm works like this:
- *
- * 1) mark all buckets as 'full match'
- * 2) walk through all the clauses
- * 3) for a particular clause, walk through all the buckets
- * 4) skip buckets that are already 'no match'
- * 5) check clause for buckets that still match (at least partially)
- * 6) sum frequencies for buckets to get selectivity
- *
- * Unlike MCV lists, histograms have a concept of a partial match. In
- * that case we use 1/2 the bucket, to minimize the average error. The
- * MV histograms are usually less detailed than the per-column ones,
- * meaning the sum is often quite high (thanks to combining a lot of
- * "partially hit" buckets).
- *
- * Maybe we could use per-bucket information with number of distinct
- * values it contains (for each dimension), and then use that to correct
- * the estimate (so with 10 distinct values, we'd use 1/10 of the bucket
- * frequency). We might also scale the value depending on the actual
- * ndistinct estimate (not just the values observed in the sample).
- *
- * Another option would be to multiply the selectivities, i.e. if we get
- * 'partial match' for a bucket for multiple conditions, we might use
- * 0.5^k (where k is the number of conditions), instead of 0.5. This
- * probably does not minimize the average error, though.
- *
- * TODO This might use a similar shortcut to MCV lists - count buckets
- * marked as partial/full match, and terminate once this drop to 0.
- * Not sure if it's really worth it - for MCV lists a situation like
- * this is not uncommon, but for histograms it's not that clear.
- */
-static Selectivity
-clauselist_mv_selectivity_histogram(PlannerInfo *root, MVStatisticInfo *mvstats,
- List *clauses, List *conditions, bool is_or)
-{
- int i;
- Selectivity s = 0.0;
- Selectivity t = 0.0;
- Selectivity u = 0.0;
-
- int nmatches = 0;
- int nconditions = 0;
- char *matches = NULL;
- char *condition_matches = NULL;
-
- MVSerializedHistogram mvhist = NULL;
-
- /* there's no histogram */
- if (! mvstats->hist_built)
- return 0.0;
-
- /* load and deserialize the histogram (hist_built was checked above) */
- mvhist = load_mv_histogram(mvstats->mvoid);
-
- Assert (mvhist != NULL);
- Assert (clauses != NIL);
- Assert (list_length(clauses) >= 1);
-
- nmatches = mvhist->nbuckets;
- nconditions = mvhist->nbuckets;
-
- /*
- * Bitmap of bucket matches (mismatch, partial, full).
- *
- * For AND clauses all buckets match (and we'll eliminate them).
- * For OR clauses no buckets match (and we'll add them).
- *
- * We only need to do the memset for AND clauses (for OR clauses
- * it's already set correctly by the palloc0).
- */
- matches = palloc0(sizeof(char) * nmatches);
-
- if (! is_or) /* AND-clause */
- memset(matches, MVSTATS_MATCH_FULL, sizeof(char)*nmatches);
+ char * tmp_matches = NULL;
- /* Conditions are treated as an AND clause, so match by default. */
- condition_matches = palloc0(sizeof(char)*nconditions);
- memset(condition_matches, MVSTATS_MATCH_FULL, sizeof(char)*nconditions);
+ Assert(tmp_clauses != NIL);
+ Assert(list_length(tmp_clauses) >= 2);
- /* build the match bitmap for the conditions */
- if (conditions != NIL)
- update_match_bitmap_histogram(root, conditions,
- mvstats->stakeys, mvhist,
- nconditions, condition_matches, is_or);
+ /* number of matching MCV items */
+ tmp_nmatches = (or_clause(clause)) ? 0 : mcvlist->nitems;
- /*
- * build the match bitmap for the estimated clauses
- *
- * TODO This evaluates the clauses for all buckets, even those
- * ruled out by the conditions. The final result should be
- * the same, but it might be faster.
- */
- update_match_bitmap_histogram(root, clauses,
- mvstats->stakeys, mvhist,
- ((is_or) ? 0 : nmatches), matches,
- is_or);
+ /* by default none of the MCV items matches the clauses */
+ tmp_matches = palloc0(sizeof(char) * mcvlist->nitems);
- /* now, walk through the buckets and sum the selectivities */
- for (i = 0; i < mvhist->nbuckets; i++)
- {
- float coeff = 1.0;
+ /* AND clauses assume everything matches, initially */
+ if (! or_clause(clause))
+ memset(tmp_matches, MVSTATS_MATCH_FULL, sizeof(char)*mcvlist->nitems);
- /*
- * Find out what part of the data is covered by the histogram,
- * so that we can 'scale' the selectivity properly (e.g. when
- * only 50% of the sample got into the histogram, and the rest
- * is in a MCV list).
- *
- * TODO This might be handled by keeping a global "frequency"
- * for the whole histogram, which might save us some time
- * spent accessing the not-matching part of the histogram.
- * Although it's likely in a cache, so it's very fast.
- */
- u += mvhist->buckets[i]->ntuples;
+ /* build the match bitmap for the OR-clauses */
+ tmp_nmatches = update_match_bitmap_mcvlist(root, tmp_clauses,
+ stakeys, mcvlist,
+ tmp_nmatches, tmp_matches,
+ lowsel, fullmatch, or_clause(clause));
- /* skip buckets not matching the conditions */
- if (condition_matches[i] == MVSTATS_MATCH_NONE)
- continue;
- else if (condition_matches[i] == MVSTATS_MATCH_PARTIAL)
- coeff = 0.5;
+ /* merge the bitmap into the existing one */
+ for (i = 0; i < mcvlist->nitems; i++)
+ {
+ /*
+ * To AND-merge the bitmaps, a MIN() semantics is used.
+ * For OR-merge, use MAX().
+ *
+ * FIXME this does not decrease the number of matches
+ */
+ UPDATE_RESULT(matches[i], tmp_matches[i], is_or);
+ }
- t += coeff * mvhist->buckets[i]->ntuples;
+ pfree(tmp_matches);
- if (matches[i] == MVSTATS_MATCH_FULL)
- s += coeff * mvhist->buckets[i]->ntuples;
- else if (matches[i] == MVSTATS_MATCH_PARTIAL)
- /*
- * TODO If both conditions and clauses match partially, this
+ * will use 0.25 match - not sure if that's the right
+ * solution, but it seems about right.
- */
- s += coeff * 0.5 * mvhist->buckets[i]->ntuples;
+ }
+ else
+ elog(ERROR, "unknown clause type: %d", clause->type);
}
- /* release the allocated bitmap and deserialized histogram */
- pfree(matches);
- pfree(condition_matches);
- pfree(mvhist);
+ /*
+ * If all the columns were matched by equality, it's a full match.
+ * In this case there can be just a single MCV item matching the
+ * clause (if two items matched, they would have to be equal).
+ */
+ *fullmatch = (bms_num_members(eqmatches) == mcvlist->ndimensions);
- /* no condition matches */
- if (t == 0.0)
- return (Selectivity)0.0;
+ /* free the allocated pieces */
+ if (eqmatches)
+ pfree(eqmatches);
- return (s / t) * u;
+ return nmatches;
}
/*
@@ -4715,362 +2139,463 @@ update_match_bitmap_histogram(PlannerInfo *root, List *clauses,
return nmatches;
}
-/*
- * Walk through clauses and keep only those covered by at least
- * one of the statistics.
- */
-static List *
-filter_clauses(PlannerInfo *root, Oid varRelid, SpecialJoinInfo *sjinfo,
- int type, List *stats, List *clauses, Bitmapset **attnums)
+static Node *
+stripRestrictStatData(List *clauses, BoolExprType boolop, Bitmapset **attrs)
{
- ListCell *c;
- ListCell *s;
-
- /* results (list of compatible clauses, attnums) */
- List *rclauses = NIL;
+ Expr *newexpr;
+ ListCell *lc;
- foreach (c, clauses)
+ if (attrs) *attrs = NULL;
+
+ if (list_length(clauses) == 0)
+ newexpr = NULL;
+ else if (list_length(clauses) == 1)
{
- Node *clause = (Node*)lfirst(c);
- Bitmapset *clause_attnums = NULL;
- Index relid;
+ RestrictStatData *rsd = (RestrictStatData *) linitial(clauses);
+ Assert(IsA(rsd, RestrictStatData));
- /*
- * The clause has to be mv-compatible (suitable operators etc.).
- */
- if (! clause_is_mv_compatible(root, clause, varRelid,
- &relid, &clause_attnums, sjinfo, type))
- elog(ERROR, "should not get non-mv-compatible cluase");
+ newexpr = (Expr*)(rsd->clause);
+ if (attrs) *attrs = rsd->mvattrs;
+ }
+ else
+ {
+ BoolExpr *newboolexpr;
+ newboolexpr = makeNode(BoolExpr);
+ newboolexpr->boolop = boolop;
- /* is there a statistics covering this clause? */
- foreach (s, stats)
+ foreach (lc, clauses)
{
- int k, matches = 0;
- MVStatisticInfo *stat = (MVStatisticInfo *)lfirst(s);
-
- for (k = 0; k < stat->stakeys->dim1; k++)
- {
- if (bms_is_member(stat->stakeys->values[k],
- clause_attnums))
- matches += 1;
- }
-
- /*
- * The clause is compatible if all attributes it references
- * are covered by the statistics.
- */
- if (bms_num_members(clause_attnums) == matches)
- {
- *attnums = bms_union(*attnums, clause_attnums);
- rclauses = lappend(rclauses, clause);
- break;
- }
+ RestrictStatData *rsd = (RestrictStatData *) lfirst(lc);
+ Assert(IsA(rsd, RestrictStatData));
+ newboolexpr->args =
+ lappend(newboolexpr->args, rsd->clause);
+ if (attrs)
+ *attrs = bms_add_members(*attrs, rsd->mvattrs);
}
-
- bms_free(clause_attnums);
+ newexpr = (Expr*) newboolexpr;
}
- /* we can't have more compatible conditions than source conditions */
- Assert(list_length(clauses) >= list_length(rclauses));
-
- return rclauses;
+ return (Node*)newexpr;
}
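+
+/*
+ * Usage sketch: given a list of two RestrictStatData nodes wrapping
+ * (a = 1) and (b > 2) and boolop = AND_EXPR, this returns the
+ * expression (a = 1 AND b > 2) and sets *attrs to the union of the
+ * nodes' mvattrs; a single-element list returns the bare clause,
+ * and an empty list yields NULL.
+ */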
-
-/*
- * Walk through statistics and only keep those covering at least
- * one new attribute (excluding conditions) and at least two attributes
- * in both clauses and conditions.
- *
- * This check might be made more strict by checking against individual
- * clauses, because by using the bitmapsets of all attnums we may
- * actually use attnums from clauses that are not covered by the
- * statistics. For example, we may have a condition
- *
- * (a=1 AND b=2)
- *
- * and a new clause
- *
- * (c=1 AND d=1)
- *
- * With only bitmapsets, statistics on [b,c] will pass through this
- * (assuming there are some statistics covering both clauses).
- *
- * TODO Do the more strict check.
- */
-static List *
-filter_stats(List *stats, Bitmapset *new_attnums, Bitmapset *all_attnums)
+RestrictStatData *
+transformRestrictInfoForEstimate(PlannerInfo *root, List *clauses,
+ int relid, SpecialJoinInfo *sjinfo)
{
- ListCell *s;
- List *stats_filtered = NIL;
+ static int level = 0;
+ int i = -1;
+ char head[100];
+ RestrictStatData *rdata = makeNode(RestrictStatData);
+ Node *clause;
- foreach (s, stats)
+ memset(head, '.', 100);
+ head[level] = 0;
+
+ if (list_length(clauses) == 1 &&
+ !IsA((Node*)linitial(clauses), RestrictInfo))
{
- int k;
- int matches_new = 0,
- matches_all = 0;
-
- MVStatisticInfo *stat = (MVStatisticInfo *)lfirst(s);
-
- /* see how many attributes the statistics covers */
- for (k = 0; k < stat->stakeys->dim1; k++)
- {
- /* attributes from new clauses */
- if (bms_is_member(stat->stakeys->values[k], new_attnums))
- matches_new += 1;
-
- /* attributes from conditions */
- if (bms_is_member(stat->stakeys->values[k], all_attnums))
- matches_all += 1;
- }
-
- /* check we have enough attributes for this statistics */
- if ((matches_new >= 1) && (matches_all >= 2))
- stats_filtered = lappend(stats_filtered, stat);
+ Assert(relid > 0);
+ clause = (Node*)linitial(clauses);
}
+ else
+ {
+ /* This is the top-level clause list. Convert it to an AND expression. */
+ ListCell *lc;
+ Index clauserelid = 0;
+ Relids relids = pull_varnos((Node*)clauses);
- /* we can't have more useful stats than we had originally */
- Assert(list_length(stats) >= list_length(stats_filtered));
-
- return stats_filtered;
-}
+ if (bms_num_members(relids) != 1)
+ return NULL;
-static MVStatisticInfo *
-make_stats_array(List *stats, int *nmvstats)
-{
- int i;
- ListCell *l;
+ clauserelid = bms_singleton_member(relids);
+ if (relid != 0 && relid != clauserelid)
+ return NULL;
- MVStatisticInfo *mvstats = NULL;
- *nmvstats = list_length(stats);
+ relid = clauserelid;
- mvstats
- = (MVStatisticInfo*)palloc0((*nmvstats) * sizeof(MVStatisticInfo));
+ if (list_length(clauses) == 1)
+ {
+ RestrictInfo *rinfo = (RestrictInfo *) linitial(clauses);
+ Assert(IsA(rinfo, RestrictInfo));
+
+ clause = (Node*) rinfo->clause;
+ }
+ else
+ {
+ BoolExpr *andexpr = makeNode(BoolExpr);
+ andexpr->boolop = AND_EXPR;
+ foreach (lc, clauses)
+ {
+ RestrictInfo *rinfo = (RestrictInfo *) lfirst(lc);
+
+ Assert(IsA(rinfo, RestrictInfo));
+ if (rinfo->pseudoconstant ||
+ treat_as_join_clause((Node*)rinfo->clause,
+ rinfo, 0, sjinfo))
+ rdata->unusedrinfos = lappend(rdata->unusedrinfos,
+ rinfo);
+ else
+ andexpr->args = lappend(andexpr->args, rinfo->clause);
+ }
+ clause = (Node*)andexpr;
+ }
- i = 0;
- foreach (l, stats)
- {
- MVStatisticInfo *stat = (MVStatisticInfo *)lfirst(l);
- memcpy(&mvstats[i++], stat, sizeof(MVStatisticInfo));
}
- return mvstats;
-}
-
-static Bitmapset **
-make_stats_attnums(MVStatisticInfo *mvstats, int nmvstats)
-{
- int i, j;
- Bitmapset **stats_attnums = NULL;
-
- Assert(nmvstats > 0);
+ Assert(!IsA(clause, RestrictInfo));
- /* build bitmaps of attnums for the stats (easier to compare) */
- stats_attnums = (Bitmapset **)palloc0(nmvstats * sizeof(Bitmapset*));
+ rdata->clause = clause;
+ rdata->boolop = AND_EXPR;
- for (i = 0; i < nmvstats; i++)
- for (j = 0; j < mvstats[i].stakeys->dim1; j++)
- stats_attnums[i]
- = bms_add_member(stats_attnums[i],
- mvstats[i].stakeys->values[j]);
+ if (and_clause(clause) || or_clause(clause))
+ {
+ BoolExpr *boolexpr = (BoolExpr *)clause;
+ ListCell *lc;
+ List *mvclauses = NIL;
+ List *nonmvclauses = NIL;
+ List *partialclauses = NIL;
+ Bitmapset *resultattrs = NULL;
+ List *resultstats = NIL;
- return stats_attnums;
-}
+ rdata->boolop = boolexpr->boolop;
+ ereport(DEBUG1,
+ (errmsg ("%s%s[%d][%d](%d)",
+ head,
+ and_clause(clause)?"AND":
+ (or_clause(clause)?"OR":"NOT"),
+ level, i, list_length(boolexpr->args)),
+ errhidestmt(level)));
+ /* Recursively process the subexpressions */
+ level++;
+ foreach (lc, (boolexpr->args))
+ {
+ Node *nd = (Node*) lfirst(lc);
+ RestrictStatData *tmpsd;
-/*
- * Now let's remove redundant statistics, covering the same columns
- * as some other stats, when restricted to the attributes from
- * remaining clauses.
- *
- * If statistics S1 covers S2 (covers S2 attributes and possibly
- * some more), we can probably remove S2. What actually matters are
- * attributes from covered clauses (not all the attributes). This
- * might however prefer larger, and thus less accurate, statistics.
- *
- * When a redundancy is detected, we simply keep the smaller
- * statistics (less number of columns), on the assumption that it's
- * more accurate and faster to process. That might be incorrect for
- * two reasons - first, the accuracy really depends on number of
- * buckets/MCV items, not the number of columns. Second, we might
- * prefer MCV lists over histograms or something like that.
- */
-static List*
-filter_redundant_stats(List *stats, List *clauses, List *conditions)
-{
- int i, j, nmvstats;
+ tmpsd = transformRestrictInfoForEstimate(root,
+ list_make1(nd),
+ relid, sjinfo);
+ /*
+ * mvclauses holds the child RestrictStatData nodes that can
+ * potentially be pulled up into this node's mvclause, which is
+ * to be estimated using multivariate statistics.
+ *
+ * partialclauses holds the child RestrictStatData nodes that
+ * cannot be pulled up.
+ *
+ * nonmvclauses holds the child RestrictStatData nodes to be
+ * pulled up into the clause estimated in the normal way.
+ */
+ if (tmpsd->mvattrs)
+ mvclauses = lappend(mvclauses, tmpsd);
+ else if (tmpsd->mvclause)
+ partialclauses = lappend(partialclauses, tmpsd);
+ else
+ nonmvclauses = lappend(nonmvclauses, tmpsd);
+ }
+ level--;
- MVStatisticInfo *mvstats;
- bool *redundant;
- Bitmapset **stats_attnums;
- Bitmapset *varattnos;
- Index relid;
- Assert(list_length(stats) > 0);
- Assert(list_length(clauses) > 0);
+ if (list_length(mvclauses) == 1)
+ {
+ /*
+ * If this boolean clause has only one mv clause, pull it up for
+ * now.
+ */
+ RestrictStatData *rsd = (RestrictStatData *) linitial(mvclauses);
+ resultattrs = rsd->mvattrs;
+ resultstats = rsd->mvstats;
+ }
+ if (list_length(mvclauses) > 1)
+ {
+ /*
+ * Pick the smallest mv stats covering as large a part as possible
+ * of the attributes appearing in the subclauses, then remove the
+ * clauses that are not covered by the selected mv stats.
+ */
+ int nmvstats = 0;
+ ListCell *lc;
+ bm_mvstat *mvstatslist[16];
+ int maxnattrs = 0;
+ int candidatestats;
+ int i;
+
+ /* Check functional dependency first, maybe.. */
+// if (list_length(mvclauses) == 2)
+// {
+// RestrictStatData *rsd1 =
+// (RestrictStatData *) linitial(mvclauses);
+// RestrictStatData *rsd2 =
+// (RestrictStatData *) lsecond(mvclauses);
+// /* To do more...*/
+// }
- /*
- * We'll convert the list of statistics into an array now, because
- * the reduction of redundant statistics is easier to do that way
- * (we can mark previous stats as redundant, etc.).
- */
- mvstats = make_stats_array(stats, &nmvstats);
- stats_attnums = make_stats_attnums(mvstats, nmvstats);
+ /*
+ * Collect all mvstats from all subclauses. The attribute set should
+ * be unique, so use it as the key. There should not be many stats.
+ */
+ foreach (lc, mvclauses)
+ {
+ RestrictStatData *rsd = (RestrictStatData *) lfirst(lc);
+ Bitmapset *mvattrs = rsd->mvattrs;
+ ListCell *lcs;
- /* by default, none of the stats is redundant (so palloc0) */
- redundant = palloc0(nmvstats * sizeof(bool));
+ /* make a covering attribute set of all clauses */
+ resultattrs = bms_add_members(resultattrs, mvattrs);
- /*
- * We only expect a single relid here, and also we should get the
- * same relid from clauses and conditions (but we get it from
- * clauses, because those are certainly non-empty).
- */
- relid = bms_singleton_member(pull_varnos((Node*)clauses));
+ /* pick up new mv stats */
+ foreach (lcs, rsd->mvstats)
+ {
+ bm_mvstat *mvs = (bm_mvstat*) lfirst(lcs);
+ bool found = false;
- /*
- * Get the varattnos from both conditions and clauses.
- *
- * This skips system attributes, although that should be impossible
- * thanks to previous filtering out of incompatible clauses.
- *
- * XXX Is that really true?
- */
- varattnos = bms_union(get_varattnos((Node*)clauses, relid),
- get_varattnos((Node*)conditions, relid));
+ for (i = 0 ; !found && i < nmvstats ; i++)
+ {
+ if (bms_equal(mvstatslist[i]->attrs, mvs->attrs))
+ found = true;
+ }
+ if (!found)
+ {
+ mvstatslist[nmvstats] = mvs;
+ nmvstats++;
+ }
- for (i = 1; i < nmvstats; i++)
- {
- /* intersect with current statistics */
- Bitmapset *curr = bms_intersect(stats_attnums[i], varattnos);
+ /* ignore more than 15(!) stats for a clause */
+ if (nmvstats > 15)
+ break;
+ }
+ }
- /* walk through 'previous' stats and check redundancy */
- for (j = 0; j < i; j++)
- {
- /* intersect with current statistics */
- Bitmapset *prev;
+ /* we try functional dependency first? */
+ //if (clauseboolop == AND_EXPR && ...
+
+ /*
+ * Find the mv stats that covers the largest number of attributes
+ * used in the clauses while having the smallest attribute set.
+ */
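+ /*
+ * For example, given stats on (a,b) and on (a,b,c,d) and clauses
+ * referencing {a,b,c}, the (a,b,c,d) stats covers three of the
+ * clause attributes versus two, so it is chosen despite being
+ * wider; the stats width only breaks ties in coverage.
+ */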
+ maxnattrs = 0;
+ candidatestats = -1;
+ for (i = 0 ; i < nmvstats ; i++)
+ {
+ Bitmapset *matchattr =
+ bms_intersect(resultattrs, mvstatslist[i]->attrs);
+ int nmatchattrs = bms_num_members(matchattr);
- /* skip stats already identified as redundant */
- if (redundant[j])
- continue;
+ if (maxnattrs < nmatchattrs)
+ {
+ candidatestats = i;
+ maxnattrs = nmatchattrs;
+ }
+ else if (maxnattrs > 0 && maxnattrs == nmatchattrs)
+ {
+ if (bms_num_members(mvstatslist[i]->attrs) <
+ bms_num_members(mvstatslist[candidatestats]->attrs))
+ candidatestats = i;
+ }
+ }
- prev = bms_intersect(stats_attnums[j], varattnos);
+ Assert(candidatestats >= 0);
- switch (bms_subset_compare(curr, prev))
+ if (maxnattrs == 1)
{
- case BMS_EQUAL:
+ /*
+ * No two of the mvclauses share an mv statistics. Make this node
+ * non-mv.
+ */
+ mvclauses = NIL;
+ nonmvclauses = NIL;
+ resultattrs = NULL;
+ resultstats = NIL;
+ }
+ else
+ {
+ if (!bms_is_subset(resultattrs,
+ mvstatslist[candidatestats]->attrs))
+ {
/*
- * Use the smaller one (hopefully more accurate).
- * If both have the same size, use the first one.
+ * move out the clauses that are not covered by the
+ * candidate stats
*/
- if (mvstats[i].stakeys->dim1 >= mvstats[j].stakeys->dim1)
- redundant[i] = TRUE;
- else
- redundant[j] = TRUE;
-
- break;
-
- case BMS_SUBSET1: /* curr is subset of prev */
- redundant[i] = TRUE;
- break;
+ List *old_mvclauses = mvclauses;
+ ListCell *lc;
+ Bitmapset *statsattrs =
+ mvstatslist[candidatestats]->attrs;
+ mvclauses = NIL;
- case BMS_SUBSET2: /* prev is subset of curr */
- redundant[j] = TRUE;
- break;
+ foreach(lc, old_mvclauses)
+ {
+ RestrictStatData *rsd = (RestrictStatData *) lfirst(lc);
+ Assert(IsA(rsd, RestrictStatData));
- case BMS_DIFFERENT:
- /* do nothing - keep both stats */
- break;
+ if (bms_is_subset(rsd->mvattrs, statsattrs))
+ mvclauses = lappend(mvclauses, rsd);
+ else
+ nonmvclauses = lappend(nonmvclauses, rsd);
+ }
+ resultattrs = bms_intersect(resultattrs,
+ mvstatslist[candidatestats]->attrs);
+ }
+ resultstats = list_make1(mvstatslist[candidatestats]);
}
-
- bms_free(prev);
}
- bms_free(curr);
- }
-
- /* can't reduce all statistics (at least one has to remain) */
- Assert(nmvstats > 0);
+ if (bms_num_members(resultattrs) < 2)
+ {
+ /*
+ * Make this node non-mv if the mvclauses cover only one mv attribute.
+ */
+ nonmvclauses = list_concat(nonmvclauses, mvclauses);
+ mvclauses = NULL;
+ resultattrs = NULL;
+ resultstats = NIL;
+ }
- /* now, let's remove the reduced statistics from the arrays */
- list_free(stats);
- stats = NIL;
+ /*
+ * All mvclauses are covered by the candidate stats here.
+ */
+ rdata->mvclause =
+ stripRestrictStatData(mvclauses, rdata->boolop, NULL);
+ rdata->children = partialclauses;
+ rdata->mvattrs = resultattrs;
+ rdata->nonmvclause =
+ stripRestrictStatData(nonmvclauses, rdata->boolop, NULL);
+ rdata->mvstats = resultstats;
- for (i = 0; i < nmvstats; i++)
+ }
+ else if (not_clause(clause))
{
- MVStatisticInfo *info;
-
- pfree(stats_attnums[i]);
+ Node *nd = (Node *) linitial(((BoolExpr*)clause)->args);
+ RestrictStatData *tmpsd;
- if (redundant[i])
- continue;
-
- info = makeNode(MVStatisticInfo);
- memcpy(info, &mvstats[i], sizeof(MVStatisticInfo));
-
- stats = lappend(stats, info);
+ tmpsd = transformRestrictInfoForEstimate(root, list_make1(nd),
+ relid, sjinfo);
+ rdata->children = list_make1(tmpsd);
}
-
- pfree(mvstats);
- pfree(stats_attnums);
- pfree(redundant);
-
- return stats;
-}
-
-static Node**
-make_clauses_array(List *clauses, int *nclauses)
-{
- int i;
- ListCell *l;
-
- Node** clauses_array;
-
- *nclauses = list_length(clauses);
- clauses_array = (Node **)palloc0((*nclauses) * sizeof(Node *));
-
- i = 0;
- foreach (l, clauses)
- clauses_array[i++] = (Node *)lfirst(l);
-
- *nclauses = i;
-
- return clauses_array;
-}
-
-static Bitmapset **
-make_clauses_attnums(PlannerInfo *root, Oid varRelid, SpecialJoinInfo *sjinfo,
- int type, Node **clauses, int nclauses)
-{
- int i;
- Index relid;
- Bitmapset **clauses_attnums
- = (Bitmapset **)palloc0(nclauses * sizeof(Bitmapset *));
-
- for (i = 0; i < nclauses; i++)
+ else if (is_opclause(clause) &&
+ list_length(((OpExpr *) clause)->args) == 2)
{
- Bitmapset * attnums = NULL;
+ Node *varnode = get_leftop((Expr*)clause);
+ Node *nonvarnode = get_rightop((Expr*)clause);
- if (! clause_is_mv_compatible(root, clauses[i], varRelid,
- &relid, &attnums, sjinfo, type))
- elog(ERROR, "should not get non-mv-compatible cluase");
+ /* Put the Var, if any, on varnode */
+ if (!IsA(varnode, Var))
+ {
+ Node *tmp = nonvarnode;
+ nonvarnode = varnode;
+ varnode = tmp;
+ }
+
+ if (IsA(varnode, Var) && is_pseudo_constant_clause(nonvarnode))
+ {
+ Var *var = (Var *)varnode;
+ List *statslist = root->simple_rel_array[relid]->mvstatlist;
+ Oid opno = ((OpExpr*)clause)->opno;
+ int varmvbitmap = get_oprmvstat(opno);
+
+ if (varmvbitmap &&
+ !IS_SPECIAL_VARNO(var->varno) &&
+ AttrNumberIsForUserDefinedAttr(var->varattno))
+ {
+ List *mvstats = NIL;
+ ListCell *lc;
+ Bitmapset *varattrs = bms_make_singleton(var->varattno);
- clauses_attnums[i] = attnums;
+ /*
+ * Add the mv statistics if it is applicable to this expression
+ */
+ foreach (lc, statslist)
+ {
+ int k;
+ MVStatisticInfo *stats = (MVStatisticInfo *) lfirst(lc);
+ Bitmapset *statsattrs = NULL;
+ int statsmvbitmap =
+ (stats->mcv_built ? MVSTATISTIC_MCV : 0) |
+ (stats->hist_built ? MVSTATISTIC_HIST : 0) |
+ (stats->deps_built ? MVSTATISTIC_FDEP : 0);
+
+ for (k = 0 ; k < stats->stakeys->dim1 ; k++)
+ statsattrs = bms_add_member(statsattrs,
+ stats->stakeys->values[k]);
+ /* XXX: Does this work as expected? */
+ if (bms_is_subset(varattrs, statsattrs) &&
+ (statsmvbitmap & varmvbitmap))
+ {
+ bm_mvstat *mvstatsent = palloc0(sizeof(bm_mvstat));
+ mvstatsent->attrs = statsattrs;
+ mvstatsent->stats = stats;
+ mvstatsent->mvkind = statsmvbitmap;
+ mvstats = lappend(mvstats, mvstatsent);
+ }
+ }
+ if (mvstats)
+ {
+ /* MV stats are potentially applicable to this expression */
+ ereport(DEBUG1,
+ (errmsg ("%sMATCH[%d][%d](varno = %d, attno = %d)",
+ head, level, i,
+ var->varno, var->varattno),
+ errhidestmt(level)));
+
+ rdata->mvstats = mvstats;
+ rdata->mvattrs = varattrs;
+ }
+ }
+ }
+ else
+ {
+ ereport(DEBUG1,
+ (errmsg ("%sno match BinOp[%d][%d]: r=%d, l=%d",
+ head, level, i,
+ varnode->type, nonvarnode->type),
+ errhidestmt(level)));
+ }
}
+ else if (IsA(clause, NullTest))
+ {
+ NullTest *expr = (NullTest*)clause;
+ Var *var = (Var *)(expr->arg);
- return clauses_attnums;
-}
-
-static bool*
-make_cover_map(Bitmapset **stats_attnums, int nmvstats,
- Bitmapset **clauses_attnums, int nclauses)
-{
- int i, j;
- bool *cover_map = (bool*)palloc0(nclauses * nmvstats);
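+ /*
+ * A NullTest has no operator to look up in pg_operator, so it is
+ * treated as compatible with MCV lists and histograms
+ * unconditionally (hence the hard-coded mvkind below).
+ */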
+ if (IsA(var, Var) &&
+ !IS_SPECIAL_VARNO(var->varno) &&
+ AttrNumberIsForUserDefinedAttr(var->varattno))
+ {
+ Bitmapset *varattrs = bms_make_singleton(var->varattno);
+ List *mvstats = NIL;
+ ListCell *lc;
- for (i = 0; i < nmvstats; i++)
- for (j = 0; j < nclauses; j++)
- cover_map[i * nclauses + j]
- = bms_is_subset(clauses_attnums[j], stats_attnums[i]);
+ foreach(lc, root->simple_rel_array[relid]->mvstatlist)
+ {
+ MVStatisticInfo *stats = (MVStatisticInfo *) lfirst(lc);
+ Bitmapset *statsattrs = NULL;
+ int k;
+
+ for (k = 0 ; k < stats->stakeys->dim1 ; k++)
+ statsattrs = bms_add_member(statsattrs,
+ stats->stakeys->values[k]);
+ if (bms_is_subset(varattrs, statsattrs))
+ {
+ bm_mvstat *mvstatsent = palloc0(sizeof(bm_mvstat));
+ mvstatsent->stats = stats;
+ mvstatsent->attrs = statsattrs;
+ mvstatsent->mvkind = (MVSTATISTIC_MCV |MVSTATISTIC_HIST);
+ mvstats = lappend(mvstats, mvstatsent);
+ }
+ }
+ if (mvstats)
+ {
+ rdata->mvstats = mvstats;
+ rdata->mvattrs = varattrs;
+ }
+ }
+ }
+ else
+ {
+ ereport(DEBUG1,
+ (errmsg ("%sno match node(%d)[%d][%d]",
+ head, clause->type, level, i),
+ errhidestmt(level)));
+ }
- return cover_map;
+ return rdata;
}
diff --git a/src/backend/utils/cache/lsyscache.c b/src/backend/utils/cache/lsyscache.c
index 7b32247..61e578f 100644
--- a/src/backend/utils/cache/lsyscache.c
+++ b/src/backend/utils/cache/lsyscache.c
@@ -45,6 +45,7 @@
#include "utils/rel.h"
#include "utils/syscache.h"
#include "utils/typcache.h"
+#include "utils/mvstats.h"
/* Hook for plugins to get control in get_attavgwidth() */
get_attavgwidth_hook_type get_attavgwidth_hook = NULL;
@@ -1345,6 +1346,45 @@ get_oprjoin(Oid opno)
return (RegProcedure) InvalidOid;
}
+/*
+ * get_oprmvstat
+ *
+ * Returns the mv stats compatibility of an operator for computing
+ * selectivity. The return value is a bitwise OR of MVSTATISTIC_*
+ * symbols.
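+ *
+ * For example, the int4 "=" operator is marked "mhf" and thus yields
+ * MVSTATISTIC_MCV | MVSTATISTIC_HIST | MVSTATISTIC_FDEP, while "<"
+ * is marked "mh-" and yields no MVSTATISTIC_FDEP bit.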
+ */
+int
+get_oprmvstat(Oid opno)
+{
+ HeapTuple tp;
+
+ tp = SearchSysCache1(OPEROID, ObjectIdGetDatum(opno));
+ if (HeapTupleIsValid(tp))
+ {
+ Datum tmp;
+ bool isnull;
+ char *str;
+ int result = 0;
+
+ tmp = SysCacheGetAttr(OPEROID, tp,
+ Anum_pg_operator_oprmvstat, &isnull);
+ if (!isnull)
+ {
+ str = TextDatumGetCString(tmp);
+ if (strlen(str) == 3)
+ {
+ if (str[0] != '-') result |= MVSTATISTIC_MCV;
+ if (str[1] != '-') result |= MVSTATISTIC_HIST;
+ if (str[2] != '-') result |= MVSTATISTIC_FDEP;
+ }
+ }
+ ReleaseSysCache(tp);
+ return result;
+ }
+ else
+ return 0;
+}
+
+
/* ---------- FUNCTION CACHE ---------- */
/*
diff --git a/src/include/catalog/pg_operator.h b/src/include/catalog/pg_operator.h
index 26c9d4e..c75ac72 100644
--- a/src/include/catalog/pg_operator.h
+++ b/src/include/catalog/pg_operator.h
@@ -49,6 +49,9 @@ CATALOG(pg_operator,2617)
regproc oprcode; /* OID of underlying function */
regproc oprrest; /* OID of restriction estimator, or 0 */
regproc oprjoin; /* OID of join estimator, or 0 */
+#ifdef CATALOG_VARLEN /* variable-length fields start here */
+ text oprmvstat; /* MV stat compatibility in '[m-][h-][f-]' */
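+ /* e.g. "mhf" for equality operators,
+ * "mh-" for inequalities, "---" when
+ * mv stats do not apply */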
+#endif
} FormData_pg_operator;
/* ----------------
@@ -63,7 +66,7 @@ typedef FormData_pg_operator *Form_pg_operator;
* ----------------
*/
-#define Natts_pg_operator 14
+#define Natts_pg_operator 15
#define Anum_pg_operator_oprname 1
#define Anum_pg_operator_oprnamespace 2
#define Anum_pg_operator_oprowner 3
@@ -78,6 +81,7 @@ typedef FormData_pg_operator *Form_pg_operator;
#define Anum_pg_operator_oprcode 12
#define Anum_pg_operator_oprrest 13
#define Anum_pg_operator_oprjoin 14
+#define Anum_pg_operator_oprmvstat 15
/* ----------------
* initial contents of pg_operator
@@ -91,1735 +95,1735 @@ typedef FormData_pg_operator *Form_pg_operator;
* for the underlying function.
*/
-DATA(insert OID = 15 ( "=" PGNSP PGUID b t t 23 20 16 416 36 int48eq eqsel eqjoinsel ));
+DATA(insert OID = 15 ( "=" PGNSP PGUID b t t 23 20 16 416 36 int48eq eqsel eqjoinsel "mhf"));
DESCR("equal");
-DATA(insert OID = 36 ( "<>" PGNSP PGUID b f f 23 20 16 417 15 int48ne neqsel neqjoinsel ));
+DATA(insert OID = 36 ( "<>" PGNSP PGUID b f f 23 20 16 417 15 int48ne neqsel neqjoinsel "mhf"));
DESCR("not equal");
-DATA(insert OID = 37 ( "<" PGNSP PGUID b f f 23 20 16 419 82 int48lt scalarltsel scalarltjoinsel ));
+DATA(insert OID = 37 ( "<" PGNSP PGUID b f f 23 20 16 419 82 int48lt scalarltsel scalarltjoinsel "mh-"));
DESCR("less than");
-DATA(insert OID = 76 ( ">" PGNSP PGUID b f f 23 20 16 418 80 int48gt scalargtsel scalargtjoinsel ));
+DATA(insert OID = 76 ( ">" PGNSP PGUID b f f 23 20 16 418 80 int48gt scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than");
-DATA(insert OID = 80 ( "<=" PGNSP PGUID b f f 23 20 16 430 76 int48le scalarltsel scalarltjoinsel ));
+DATA(insert OID = 80 ( "<=" PGNSP PGUID b f f 23 20 16 430 76 int48le scalarltsel scalarltjoinsel "mh-"));
DESCR("less than or equal");
-DATA(insert OID = 82 ( ">=" PGNSP PGUID b f f 23 20 16 420 37 int48ge scalargtsel scalargtjoinsel ));
+DATA(insert OID = 82 ( ">=" PGNSP PGUID b f f 23 20 16 420 37 int48ge scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than or equal");
-DATA(insert OID = 58 ( "<" PGNSP PGUID b f f 16 16 16 59 1695 boollt scalarltsel scalarltjoinsel ));
+DATA(insert OID = 58 ( "<" PGNSP PGUID b f f 16 16 16 59 1695 boollt scalarltsel scalarltjoinsel "mh-"));
DESCR("less than");
-DATA(insert OID = 59 ( ">" PGNSP PGUID b f f 16 16 16 58 1694 boolgt scalargtsel scalargtjoinsel ));
+DATA(insert OID = 59 ( ">" PGNSP PGUID b f f 16 16 16 58 1694 boolgt scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than");
-DATA(insert OID = 85 ( "<>" PGNSP PGUID b f f 16 16 16 85 91 boolne neqsel neqjoinsel ));
+DATA(insert OID = 85 ( "<>" PGNSP PGUID b f f 16 16 16 85 91 boolne neqsel neqjoinsel "mhf"));
DESCR("not equal");
#define BooleanNotEqualOperator 85
-DATA(insert OID = 91 ( "=" PGNSP PGUID b t t 16 16 16 91 85 booleq eqsel eqjoinsel ));
+DATA(insert OID = 91 ( "=" PGNSP PGUID b t t 16 16 16 91 85 booleq eqsel eqjoinsel "mhf"));
DESCR("equal");
#define BooleanEqualOperator 91
-DATA(insert OID = 1694 ( "<=" PGNSP PGUID b f f 16 16 16 1695 59 boolle scalarltsel scalarltjoinsel ));
+DATA(insert OID = 1694 ( "<=" PGNSP PGUID b f f 16 16 16 1695 59 boolle scalarltsel scalarltjoinsel "mh-"));
DESCR("less than or equal");
-DATA(insert OID = 1695 ( ">=" PGNSP PGUID b f f 16 16 16 1694 58 boolge scalargtsel scalargtjoinsel ));
+DATA(insert OID = 1695 ( ">=" PGNSP PGUID b f f 16 16 16 1694 58 boolge scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than or equal");
-DATA(insert OID = 92 ( "=" PGNSP PGUID b t t 18 18 16 92 630 chareq eqsel eqjoinsel ));
+DATA(insert OID = 92 ( "=" PGNSP PGUID b t t 18 18 16 92 630 chareq eqsel eqjoinsel "mhf"));
DESCR("equal");
-DATA(insert OID = 93 ( "=" PGNSP PGUID b t t 19 19 16 93 643 nameeq eqsel eqjoinsel ));
+DATA(insert OID = 93 ( "=" PGNSP PGUID b t t 19 19 16 93 643 nameeq eqsel eqjoinsel "mhf"));
DESCR("equal");
-DATA(insert OID = 94 ( "=" PGNSP PGUID b t t 21 21 16 94 519 int2eq eqsel eqjoinsel ));
+DATA(insert OID = 94 ( "=" PGNSP PGUID b t t 21 21 16 94 519 int2eq eqsel eqjoinsel "mhf"));
DESCR("equal");
-DATA(insert OID = 95 ( "<" PGNSP PGUID b f f 21 21 16 520 524 int2lt scalarltsel scalarltjoinsel ));
+DATA(insert OID = 95 ( "<" PGNSP PGUID b f f 21 21 16 520 524 int2lt scalarltsel scalarltjoinsel "mh-"));
DESCR("less than");
-DATA(insert OID = 96 ( "=" PGNSP PGUID b t t 23 23 16 96 518 int4eq eqsel eqjoinsel ));
+DATA(insert OID = 96 ( "=" PGNSP PGUID b t t 23 23 16 96 518 int4eq eqsel eqjoinsel "mhf"));
DESCR("equal");
#define Int4EqualOperator 96
-DATA(insert OID = 97 ( "<" PGNSP PGUID b f f 23 23 16 521 525 int4lt scalarltsel scalarltjoinsel ));
+DATA(insert OID = 97 ( "<" PGNSP PGUID b f f 23 23 16 521 525 int4lt scalarltsel scalarltjoinsel "mh-"));
DESCR("less than");
#define Int4LessOperator 97
-DATA(insert OID = 98 ( "=" PGNSP PGUID b t t 25 25 16 98 531 texteq eqsel eqjoinsel ));
+DATA(insert OID = 98 ( "=" PGNSP PGUID b t t 25 25 16 98 531 texteq eqsel eqjoinsel "mhf"));
DESCR("equal");
#define TextEqualOperator 98
-DATA(insert OID = 349 ( "||" PGNSP PGUID b f f 2277 2283 2277 0 0 array_append - - ));
+DATA(insert OID = 349 ( "||" PGNSP PGUID b f f 2277 2283 2277 0 0 array_append - - "---"));
DESCR("append element onto end of array");
-DATA(insert OID = 374 ( "||" PGNSP PGUID b f f 2283 2277 2277 0 0 array_prepend - - ));
+DATA(insert OID = 374 ( "||" PGNSP PGUID b f f 2283 2277 2277 0 0 array_prepend - - "---"));
DESCR("prepend element onto front of array");
-DATA(insert OID = 375 ( "||" PGNSP PGUID b f f 2277 2277 2277 0 0 array_cat - - ));
+DATA(insert OID = 375 ( "||" PGNSP PGUID b f f 2277 2277 2277 0 0 array_cat - - "---"));
DESCR("concatenate");
-DATA(insert OID = 352 ( "=" PGNSP PGUID b f t 28 28 16 352 0 xideq eqsel eqjoinsel ));
+DATA(insert OID = 352 ( "=" PGNSP PGUID b f t 28 28 16 352 0 xideq eqsel eqjoinsel "mhf"));
DESCR("equal");
-DATA(insert OID = 353 ( "=" PGNSP PGUID b f f 28 23 16 0 0 xideqint4 eqsel eqjoinsel ));
+DATA(insert OID = 353 ( "=" PGNSP PGUID b f f 28 23 16 0 0 xideqint4 eqsel eqjoinsel "mhf"));
DESCR("equal");
-DATA(insert OID = 388 ( "!" PGNSP PGUID r f f 20 0 1700 0 0 numeric_fac - - ));
+DATA(insert OID = 388 ( "!" PGNSP PGUID r f f 20 0 1700 0 0 numeric_fac - - "---"));
DESCR("factorial");
-DATA(insert OID = 389 ( "!!" PGNSP PGUID l f f 0 20 1700 0 0 numeric_fac - - ));
+DATA(insert OID = 389 ( "!!" PGNSP PGUID l f f 0 20 1700 0 0 numeric_fac - - "---"));
DESCR("deprecated, use ! instead");
-DATA(insert OID = 385 ( "=" PGNSP PGUID b f t 29 29 16 385 0 cideq eqsel eqjoinsel ));
+DATA(insert OID = 385 ( "=" PGNSP PGUID b f t 29 29 16 385 0 cideq eqsel eqjoinsel "mhf"));
DESCR("equal");
-DATA(insert OID = 386 ( "=" PGNSP PGUID b f t 22 22 16 386 0 int2vectoreq eqsel eqjoinsel ));
+DATA(insert OID = 386 ( "=" PGNSP PGUID b f t 22 22 16 386 0 int2vectoreq eqsel eqjoinsel "mhf"));
DESCR("equal");
-DATA(insert OID = 387 ( "=" PGNSP PGUID b t f 27 27 16 387 402 tideq eqsel eqjoinsel ));
+DATA(insert OID = 387 ( "=" PGNSP PGUID b t f 27 27 16 387 402 tideq eqsel eqjoinsel "mhf"));
DESCR("equal");
#define TIDEqualOperator 387
-DATA(insert OID = 402 ( "<>" PGNSP PGUID b f f 27 27 16 402 387 tidne neqsel neqjoinsel ));
+DATA(insert OID = 402 ( "<>" PGNSP PGUID b f f 27 27 16 402 387 tidne neqsel neqjoinsel "mhf"));
DESCR("not equal");
-DATA(insert OID = 2799 ( "<" PGNSP PGUID b f f 27 27 16 2800 2802 tidlt scalarltsel scalarltjoinsel ));
+DATA(insert OID = 2799 ( "<" PGNSP PGUID b f f 27 27 16 2800 2802 tidlt scalarltsel scalarltjoinsel "mh-"));
DESCR("less than");
#define TIDLessOperator 2799
-DATA(insert OID = 2800 ( ">" PGNSP PGUID b f f 27 27 16 2799 2801 tidgt scalargtsel scalargtjoinsel ));
+DATA(insert OID = 2800 ( ">" PGNSP PGUID b f f 27 27 16 2799 2801 tidgt scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than");
-DATA(insert OID = 2801 ( "<=" PGNSP PGUID b f f 27 27 16 2802 2800 tidle scalarltsel scalarltjoinsel ));
+DATA(insert OID = 2801 ( "<=" PGNSP PGUID b f f 27 27 16 2802 2800 tidle scalarltsel scalarltjoinsel "mh-"));
DESCR("less than or equal");
-DATA(insert OID = 2802 ( ">=" PGNSP PGUID b f f 27 27 16 2801 2799 tidge scalargtsel scalargtjoinsel ));
+DATA(insert OID = 2802 ( ">=" PGNSP PGUID b f f 27 27 16 2801 2799 tidge scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than or equal");
-DATA(insert OID = 410 ( "=" PGNSP PGUID b t t 20 20 16 410 411 int8eq eqsel eqjoinsel ));
+DATA(insert OID = 410 ( "=" PGNSP PGUID b t t 20 20 16 410 411 int8eq eqsel eqjoinsel "mhf"));
DESCR("equal");
-DATA(insert OID = 411 ( "<>" PGNSP PGUID b f f 20 20 16 411 410 int8ne neqsel neqjoinsel ));
+DATA(insert OID = 411 ( "<>" PGNSP PGUID b f f 20 20 16 411 410 int8ne neqsel neqjoinsel "mhf"));
DESCR("not equal");
-DATA(insert OID = 412 ( "<" PGNSP PGUID b f f 20 20 16 413 415 int8lt scalarltsel scalarltjoinsel ));
+DATA(insert OID = 412 ( "<" PGNSP PGUID b f f 20 20 16 413 415 int8lt scalarltsel scalarltjoinsel "mh-"));
DESCR("less than");
#define Int8LessOperator 412
-DATA(insert OID = 413 ( ">" PGNSP PGUID b f f 20 20 16 412 414 int8gt scalargtsel scalargtjoinsel ));
+DATA(insert OID = 413 ( ">" PGNSP PGUID b f f 20 20 16 412 414 int8gt scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than");
-DATA(insert OID = 414 ( "<=" PGNSP PGUID b f f 20 20 16 415 413 int8le scalarltsel scalarltjoinsel ));
+DATA(insert OID = 414 ( "<=" PGNSP PGUID b f f 20 20 16 415 413 int8le scalarltsel scalarltjoinsel "mh-"));
DESCR("less than or equal");
-DATA(insert OID = 415 ( ">=" PGNSP PGUID b f f 20 20 16 414 412 int8ge scalargtsel scalargtjoinsel ));
+DATA(insert OID = 415 ( ">=" PGNSP PGUID b f f 20 20 16 414 412 int8ge scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than or equal");
-DATA(insert OID = 416 ( "=" PGNSP PGUID b t t 20 23 16 15 417 int84eq eqsel eqjoinsel ));
+DATA(insert OID = 416 ( "=" PGNSP PGUID b t t 20 23 16 15 417 int84eq eqsel eqjoinsel "mhf"));
DESCR("equal");
-DATA(insert OID = 417 ( "<>" PGNSP PGUID b f f 20 23 16 36 416 int84ne neqsel neqjoinsel ));
+DATA(insert OID = 417 ( "<>" PGNSP PGUID b f f 20 23 16 36 416 int84ne neqsel neqjoinsel "mhf"));
DESCR("not equal");
-DATA(insert OID = 418 ( "<" PGNSP PGUID b f f 20 23 16 76 430 int84lt scalarltsel scalarltjoinsel ));
+DATA(insert OID = 418 ( "<" PGNSP PGUID b f f 20 23 16 76 430 int84lt scalarltsel scalarltjoinsel "mh-"));
DESCR("less than");
-DATA(insert OID = 419 ( ">" PGNSP PGUID b f f 20 23 16 37 420 int84gt scalargtsel scalargtjoinsel ));
+DATA(insert OID = 419 ( ">" PGNSP PGUID b f f 20 23 16 37 420 int84gt scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than");
-DATA(insert OID = 420 ( "<=" PGNSP PGUID b f f 20 23 16 82 419 int84le scalarltsel scalarltjoinsel ));
+DATA(insert OID = 420 ( "<=" PGNSP PGUID b f f 20 23 16 82 419 int84le scalarltsel scalarltjoinsel "mh-"));
DESCR("less than or equal");
-DATA(insert OID = 430 ( ">=" PGNSP PGUID b f f 20 23 16 80 418 int84ge scalargtsel scalargtjoinsel ));
+DATA(insert OID = 430 ( ">=" PGNSP PGUID b f f 20 23 16 80 418 int84ge scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than or equal");
-DATA(insert OID = 439 ( "%" PGNSP PGUID b f f 20 20 20 0 0 int8mod - - ));
+DATA(insert OID = 439 ( "%" PGNSP PGUID b f f 20 20 20 0 0 int8mod - - "---"));
DESCR("modulus");
-DATA(insert OID = 473 ( "@" PGNSP PGUID l f f 0 20 20 0 0 int8abs - - ));
+DATA(insert OID = 473 ( "@" PGNSP PGUID l f f 0 20 20 0 0 int8abs - - "---"));
DESCR("absolute value");
-DATA(insert OID = 484 ( "-" PGNSP PGUID l f f 0 20 20 0 0 int8um - - ));
+DATA(insert OID = 484 ( "-" PGNSP PGUID l f f 0 20 20 0 0 int8um - - "---"));
DESCR("negate");
-DATA(insert OID = 485 ( "<<" PGNSP PGUID b f f 604 604 16 0 0 poly_left positionsel positionjoinsel ));
+DATA(insert OID = 485 ( "<<" PGNSP PGUID b f f 604 604 16 0 0 poly_left positionsel positionjoinsel "---"));
DESCR("is left of");
-DATA(insert OID = 486 ( "&<" PGNSP PGUID b f f 604 604 16 0 0 poly_overleft positionsel positionjoinsel ));
+DATA(insert OID = 486 ( "&<" PGNSP PGUID b f f 604 604 16 0 0 poly_overleft positionsel positionjoinsel "---"));
DESCR("overlaps or is left of");
-DATA(insert OID = 487 ( "&>" PGNSP PGUID b f f 604 604 16 0 0 poly_overright positionsel positionjoinsel ));
+DATA(insert OID = 487 ( "&>" PGNSP PGUID b f f 604 604 16 0 0 poly_overright positionsel positionjoinsel "---"));
DESCR("overlaps or is right of");
-DATA(insert OID = 488 ( ">>" PGNSP PGUID b f f 604 604 16 0 0 poly_right positionsel positionjoinsel ));
+DATA(insert OID = 488 ( ">>" PGNSP PGUID b f f 604 604 16 0 0 poly_right positionsel positionjoinsel "---"));
DESCR("is right of");
-DATA(insert OID = 489 ( "<@" PGNSP PGUID b f f 604 604 16 490 0 poly_contained contsel contjoinsel ));
+DATA(insert OID = 489 ( "<@" PGNSP PGUID b f f 604 604 16 490 0 poly_contained contsel contjoinsel "---"));
DESCR("is contained by");
-DATA(insert OID = 490 ( "@>" PGNSP PGUID b f f 604 604 16 489 0 poly_contain contsel contjoinsel ));
+DATA(insert OID = 490 ( "@>" PGNSP PGUID b f f 604 604 16 489 0 poly_contain contsel contjoinsel "---"));
DESCR("contains");
-DATA(insert OID = 491 ( "~=" PGNSP PGUID b f f 604 604 16 491 0 poly_same eqsel eqjoinsel ));
+DATA(insert OID = 491 ( "~=" PGNSP PGUID b f f 604 604 16 491 0 poly_same eqsel eqjoinsel "mhf"));
DESCR("same as");
-DATA(insert OID = 492 ( "&&" PGNSP PGUID b f f 604 604 16 492 0 poly_overlap areasel areajoinsel ));
+DATA(insert OID = 492 ( "&&" PGNSP PGUID b f f 604 604 16 492 0 poly_overlap areasel areajoinsel "---"));
DESCR("overlaps");
-DATA(insert OID = 493 ( "<<" PGNSP PGUID b f f 603 603 16 0 0 box_left positionsel positionjoinsel ));
+DATA(insert OID = 493 ( "<<" PGNSP PGUID b f f 603 603 16 0 0 box_left positionsel positionjoinsel "---"));
DESCR("is left of");
-DATA(insert OID = 494 ( "&<" PGNSP PGUID b f f 603 603 16 0 0 box_overleft positionsel positionjoinsel ));
+DATA(insert OID = 494 ( "&<" PGNSP PGUID b f f 603 603 16 0 0 box_overleft positionsel positionjoinsel "---"));
DESCR("overlaps or is left of");
-DATA(insert OID = 495 ( "&>" PGNSP PGUID b f f 603 603 16 0 0 box_overright positionsel positionjoinsel ));
+DATA(insert OID = 495 ( "&>" PGNSP PGUID b f f 603 603 16 0 0 box_overright positionsel positionjoinsel "---"));
DESCR("overlaps or is right of");
-DATA(insert OID = 496 ( ">>" PGNSP PGUID b f f 603 603 16 0 0 box_right positionsel positionjoinsel ));
+DATA(insert OID = 496 ( ">>" PGNSP PGUID b f f 603 603 16 0 0 box_right positionsel positionjoinsel "---"));
DESCR("is right of");
-DATA(insert OID = 497 ( "<@" PGNSP PGUID b f f 603 603 16 498 0 box_contained contsel contjoinsel ));
+DATA(insert OID = 497 ( "<@" PGNSP PGUID b f f 603 603 16 498 0 box_contained contsel contjoinsel "---"));
DESCR("is contained by");
-DATA(insert OID = 498 ( "@>" PGNSP PGUID b f f 603 603 16 497 0 box_contain contsel contjoinsel ));
+DATA(insert OID = 498 ( "@>" PGNSP PGUID b f f 603 603 16 497 0 box_contain contsel contjoinsel "---"));
DESCR("contains");
-DATA(insert OID = 499 ( "~=" PGNSP PGUID b f f 603 603 16 499 0 box_same eqsel eqjoinsel ));
+DATA(insert OID = 499 ( "~=" PGNSP PGUID b f f 603 603 16 499 0 box_same eqsel eqjoinsel "mhf"));
DESCR("same as");
-DATA(insert OID = 500 ( "&&" PGNSP PGUID b f f 603 603 16 500 0 box_overlap areasel areajoinsel ));
+DATA(insert OID = 500 ( "&&" PGNSP PGUID b f f 603 603 16 500 0 box_overlap areasel areajoinsel "---"));
DESCR("overlaps");
-DATA(insert OID = 501 ( ">=" PGNSP PGUID b f f 603 603 16 505 504 box_ge areasel areajoinsel ));
+DATA(insert OID = 501 ( ">=" PGNSP PGUID b f f 603 603 16 505 504 box_ge areasel areajoinsel "---"));
DESCR("greater than or equal by area");
-DATA(insert OID = 502 ( ">" PGNSP PGUID b f f 603 603 16 504 505 box_gt areasel areajoinsel ));
+DATA(insert OID = 502 ( ">" PGNSP PGUID b f f 603 603 16 504 505 box_gt areasel areajoinsel "---"));
DESCR("greater than by area");
-DATA(insert OID = 503 ( "=" PGNSP PGUID b f f 603 603 16 503 0 box_eq eqsel eqjoinsel ));
+DATA(insert OID = 503 ( "=" PGNSP PGUID b f f 603 603 16 503 0 box_eq eqsel eqjoinsel "mhf"));
DESCR("equal by area");
-DATA(insert OID = 504 ( "<" PGNSP PGUID b f f 603 603 16 502 501 box_lt areasel areajoinsel ));
+DATA(insert OID = 504 ( "<" PGNSP PGUID b f f 603 603 16 502 501 box_lt areasel areajoinsel "---"));
DESCR("less than by area");
-DATA(insert OID = 505 ( "<=" PGNSP PGUID b f f 603 603 16 501 502 box_le areasel areajoinsel ));
+DATA(insert OID = 505 ( "<=" PGNSP PGUID b f f 603 603 16 501 502 box_le areasel areajoinsel "---"));
DESCR("less than or equal by area");
-DATA(insert OID = 506 ( ">^" PGNSP PGUID b f f 600 600 16 0 0 point_above positionsel positionjoinsel ));
+DATA(insert OID = 506 ( ">^" PGNSP PGUID b f f 600 600 16 0 0 point_above positionsel positionjoinsel "---"));
DESCR("is above");
-DATA(insert OID = 507 ( "<<" PGNSP PGUID b f f 600 600 16 0 0 point_left positionsel positionjoinsel ));
+DATA(insert OID = 507 ( "<<" PGNSP PGUID b f f 600 600 16 0 0 point_left positionsel positionjoinsel "---"));
DESCR("is left of");
-DATA(insert OID = 508 ( ">>" PGNSP PGUID b f f 600 600 16 0 0 point_right positionsel positionjoinsel ));
+DATA(insert OID = 508 ( ">>" PGNSP PGUID b f f 600 600 16 0 0 point_right positionsel positionjoinsel "---"));
DESCR("is right of");
-DATA(insert OID = 509 ( "<^" PGNSP PGUID b f f 600 600 16 0 0 point_below positionsel positionjoinsel ));
+DATA(insert OID = 509 ( "<^" PGNSP PGUID b f f 600 600 16 0 0 point_below positionsel positionjoinsel "---"));
DESCR("is below");
-DATA(insert OID = 510 ( "~=" PGNSP PGUID b f f 600 600 16 510 713 point_eq eqsel eqjoinsel ));
+DATA(insert OID = 510 ( "~=" PGNSP PGUID b f f 600 600 16 510 713 point_eq eqsel eqjoinsel "mhf"));
DESCR("same as");
-DATA(insert OID = 511 ( "<@" PGNSP PGUID b f f 600 603 16 433 0 on_pb contsel contjoinsel ));
+DATA(insert OID = 511 ( "<@" PGNSP PGUID b f f 600 603 16 433 0 on_pb contsel contjoinsel "---"));
DESCR("point inside box");
-DATA(insert OID = 433 ( "@>" PGNSP PGUID b f f 603 600 16 511 0 box_contain_pt contsel contjoinsel ));
+DATA(insert OID = 433 ( "@>" PGNSP PGUID b f f 603 600 16 511 0 box_contain_pt contsel contjoinsel "---"));
DESCR("contains");
-DATA(insert OID = 512 ( "<@" PGNSP PGUID b f f 600 602 16 755 0 on_ppath - - ));
+DATA(insert OID = 512 ( "<@" PGNSP PGUID b f f 600 602 16 755 0 on_ppath - - "---"));
DESCR("point within closed path, or point on open path");
-DATA(insert OID = 513 ( "@@" PGNSP PGUID l f f 0 603 600 0 0 box_center - - ));
+DATA(insert OID = 513 ( "@@" PGNSP PGUID l f f 0 603 600 0 0 box_center - - "---"));
DESCR("center of");
-DATA(insert OID = 514 ( "*" PGNSP PGUID b f f 23 23 23 514 0 int4mul - - ));
+DATA(insert OID = 514 ( "*" PGNSP PGUID b f f 23 23 23 514 0 int4mul - - "---"));
DESCR("multiply");
-DATA(insert OID = 517 ( "<->" PGNSP PGUID b f f 600 600 701 517 0 point_distance - - ));
+DATA(insert OID = 517 ( "<->" PGNSP PGUID b f f 600 600 701 517 0 point_distance - - "---"));
DESCR("distance between");
-DATA(insert OID = 518 ( "<>" PGNSP PGUID b f f 23 23 16 518 96 int4ne neqsel neqjoinsel ));
+DATA(insert OID = 518 ( "<>" PGNSP PGUID b f f 23 23 16 518 96 int4ne neqsel neqjoinsel "mhf"));
DESCR("not equal");
-DATA(insert OID = 519 ( "<>" PGNSP PGUID b f f 21 21 16 519 94 int2ne neqsel neqjoinsel ));
+DATA(insert OID = 519 ( "<>" PGNSP PGUID b f f 21 21 16 519 94 int2ne neqsel neqjoinsel "mhf"));
DESCR("not equal");
-DATA(insert OID = 520 ( ">" PGNSP PGUID b f f 21 21 16 95 522 int2gt scalargtsel scalargtjoinsel ));
+DATA(insert OID = 520 ( ">" PGNSP PGUID b f f 21 21 16 95 522 int2gt scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than");
-DATA(insert OID = 521 ( ">" PGNSP PGUID b f f 23 23 16 97 523 int4gt scalargtsel scalargtjoinsel ));
+DATA(insert OID = 521 ( ">" PGNSP PGUID b f f 23 23 16 97 523 int4gt scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than");
-DATA(insert OID = 522 ( "<=" PGNSP PGUID b f f 21 21 16 524 520 int2le scalarltsel scalarltjoinsel ));
+DATA(insert OID = 522 ( "<=" PGNSP PGUID b f f 21 21 16 524 520 int2le scalarltsel scalarltjoinsel "mh-"));
DESCR("less than or equal");
-DATA(insert OID = 523 ( "<=" PGNSP PGUID b f f 23 23 16 525 521 int4le scalarltsel scalarltjoinsel ));
+DATA(insert OID = 523 ( "<=" PGNSP PGUID b f f 23 23 16 525 521 int4le scalarltsel scalarltjoinsel "mh-"));
DESCR("less than or equal");
-DATA(insert OID = 524 ( ">=" PGNSP PGUID b f f 21 21 16 522 95 int2ge scalargtsel scalargtjoinsel ));
+DATA(insert OID = 524 ( ">=" PGNSP PGUID b f f 21 21 16 522 95 int2ge scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than or equal");
-DATA(insert OID = 525 ( ">=" PGNSP PGUID b f f 23 23 16 523 97 int4ge scalargtsel scalargtjoinsel ));
+DATA(insert OID = 525 ( ">=" PGNSP PGUID b f f 23 23 16 523 97 int4ge scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than or equal");
-DATA(insert OID = 526 ( "*" PGNSP PGUID b f f 21 21 21 526 0 int2mul - - ));
+DATA(insert OID = 526 ( "*" PGNSP PGUID b f f 21 21 21 526 0 int2mul - - "---"));
DESCR("multiply");
-DATA(insert OID = 527 ( "/" PGNSP PGUID b f f 21 21 21 0 0 int2div - - ));
+DATA(insert OID = 527 ( "/" PGNSP PGUID b f f 21 21 21 0 0 int2div - - "---"));
DESCR("divide");
-DATA(insert OID = 528 ( "/" PGNSP PGUID b f f 23 23 23 0 0 int4div - - ));
+DATA(insert OID = 528 ( "/" PGNSP PGUID b f f 23 23 23 0 0 int4div - - "---"));
DESCR("divide");
-DATA(insert OID = 529 ( "%" PGNSP PGUID b f f 21 21 21 0 0 int2mod - - ));
+DATA(insert OID = 529 ( "%" PGNSP PGUID b f f 21 21 21 0 0 int2mod - - "---"));
DESCR("modulus");
-DATA(insert OID = 530 ( "%" PGNSP PGUID b f f 23 23 23 0 0 int4mod - - ));
+DATA(insert OID = 530 ( "%" PGNSP PGUID b f f 23 23 23 0 0 int4mod - - "---"));
DESCR("modulus");
-DATA(insert OID = 531 ( "<>" PGNSP PGUID b f f 25 25 16 531 98 textne neqsel neqjoinsel ));
+DATA(insert OID = 531 ( "<>" PGNSP PGUID b f f 25 25 16 531 98 textne neqsel neqjoinsel "mhf"));
DESCR("not equal");
-DATA(insert OID = 532 ( "=" PGNSP PGUID b t t 21 23 16 533 538 int24eq eqsel eqjoinsel ));
+DATA(insert OID = 532 ( "=" PGNSP PGUID b t t 21 23 16 533 538 int24eq eqsel eqjoinsel "mhf"));
DESCR("equal");
-DATA(insert OID = 533 ( "=" PGNSP PGUID b t t 23 21 16 532 539 int42eq eqsel eqjoinsel ));
+DATA(insert OID = 533 ( "=" PGNSP PGUID b t t 23 21 16 532 539 int42eq eqsel eqjoinsel "mhf"));
DESCR("equal");
-DATA(insert OID = 534 ( "<" PGNSP PGUID b f f 21 23 16 537 542 int24lt scalarltsel scalarltjoinsel ));
+DATA(insert OID = 534 ( "<" PGNSP PGUID b f f 21 23 16 537 542 int24lt scalarltsel scalarltjoinsel "mh-"));
DESCR("less than");
-DATA(insert OID = 535 ( "<" PGNSP PGUID b f f 23 21 16 536 543 int42lt scalarltsel scalarltjoinsel ));
+DATA(insert OID = 535 ( "<" PGNSP PGUID b f f 23 21 16 536 543 int42lt scalarltsel scalarltjoinsel "mh-"));
DESCR("less than");
-DATA(insert OID = 536 ( ">" PGNSP PGUID b f f 21 23 16 535 540 int24gt scalargtsel scalargtjoinsel ));
+DATA(insert OID = 536 ( ">" PGNSP PGUID b f f 21 23 16 535 540 int24gt scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than");
-DATA(insert OID = 537 ( ">" PGNSP PGUID b f f 23 21 16 534 541 int42gt scalargtsel scalargtjoinsel ));
+DATA(insert OID = 537 ( ">" PGNSP PGUID b f f 23 21 16 534 541 int42gt scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than");
-DATA(insert OID = 538 ( "<>" PGNSP PGUID b f f 21 23 16 539 532 int24ne neqsel neqjoinsel ));
+DATA(insert OID = 538 ( "<>" PGNSP PGUID b f f 21 23 16 539 532 int24ne neqsel neqjoinsel "mhf"));
DESCR("not equal");
-DATA(insert OID = 539 ( "<>" PGNSP PGUID b f f 23 21 16 538 533 int42ne neqsel neqjoinsel ));
+DATA(insert OID = 539 ( "<>" PGNSP PGUID b f f 23 21 16 538 533 int42ne neqsel neqjoinsel "mhf"));
DESCR("not equal");
-DATA(insert OID = 540 ( "<=" PGNSP PGUID b f f 21 23 16 543 536 int24le scalarltsel scalarltjoinsel ));
+DATA(insert OID = 540 ( "<=" PGNSP PGUID b f f 21 23 16 543 536 int24le scalarltsel scalarltjoinsel "mh-"));
DESCR("less than or equal");
-DATA(insert OID = 541 ( "<=" PGNSP PGUID b f f 23 21 16 542 537 int42le scalarltsel scalarltjoinsel ));
+DATA(insert OID = 541 ( "<=" PGNSP PGUID b f f 23 21 16 542 537 int42le scalarltsel scalarltjoinsel "mh-"));
DESCR("less than or equal");
-DATA(insert OID = 542 ( ">=" PGNSP PGUID b f f 21 23 16 541 534 int24ge scalargtsel scalargtjoinsel ));
+DATA(insert OID = 542 ( ">=" PGNSP PGUID b f f 21 23 16 541 534 int24ge scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than or equal");
-DATA(insert OID = 543 ( ">=" PGNSP PGUID b f f 23 21 16 540 535 int42ge scalargtsel scalargtjoinsel ));
+DATA(insert OID = 543 ( ">=" PGNSP PGUID b f f 23 21 16 540 535 int42ge scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than or equal");
-DATA(insert OID = 544 ( "*" PGNSP PGUID b f f 21 23 23 545 0 int24mul - - ));
+DATA(insert OID = 544 ( "*" PGNSP PGUID b f f 21 23 23 545 0 int24mul - - "---"));
DESCR("multiply");
-DATA(insert OID = 545 ( "*" PGNSP PGUID b f f 23 21 23 544 0 int42mul - - ));
+DATA(insert OID = 545 ( "*" PGNSP PGUID b f f 23 21 23 544 0 int42mul - - "---"));
DESCR("multiply");
-DATA(insert OID = 546 ( "/" PGNSP PGUID b f f 21 23 23 0 0 int24div - - ));
+DATA(insert OID = 546 ( "/" PGNSP PGUID b f f 21 23 23 0 0 int24div - - "---"));
DESCR("divide");
-DATA(insert OID = 547 ( "/" PGNSP PGUID b f f 23 21 23 0 0 int42div - - ));
+DATA(insert OID = 547 ( "/" PGNSP PGUID b f f 23 21 23 0 0 int42div - - "---"));
DESCR("divide");
-DATA(insert OID = 550 ( "+" PGNSP PGUID b f f 21 21 21 550 0 int2pl - - ));
+DATA(insert OID = 550 ( "+" PGNSP PGUID b f f 21 21 21 550 0 int2pl - - "---"));
DESCR("add");
-DATA(insert OID = 551 ( "+" PGNSP PGUID b f f 23 23 23 551 0 int4pl - - ));
+DATA(insert OID = 551 ( "+" PGNSP PGUID b f f 23 23 23 551 0 int4pl - - "---"));
DESCR("add");
-DATA(insert OID = 552 ( "+" PGNSP PGUID b f f 21 23 23 553 0 int24pl - - ));
+DATA(insert OID = 552 ( "+" PGNSP PGUID b f f 21 23 23 553 0 int24pl - - "---"));
DESCR("add");
-DATA(insert OID = 553 ( "+" PGNSP PGUID b f f 23 21 23 552 0 int42pl - - ));
+DATA(insert OID = 553 ( "+" PGNSP PGUID b f f 23 21 23 552 0 int42pl - - "---"));
DESCR("add");
-DATA(insert OID = 554 ( "-" PGNSP PGUID b f f 21 21 21 0 0 int2mi - - ));
+DATA(insert OID = 554 ( "-" PGNSP PGUID b f f 21 21 21 0 0 int2mi - - "---"));
DESCR("subtract");
-DATA(insert OID = 555 ( "-" PGNSP PGUID b f f 23 23 23 0 0 int4mi - - ));
+DATA(insert OID = 555 ( "-" PGNSP PGUID b f f 23 23 23 0 0 int4mi - - "---"));
DESCR("subtract");
-DATA(insert OID = 556 ( "-" PGNSP PGUID b f f 21 23 23 0 0 int24mi - - ));
+DATA(insert OID = 556 ( "-" PGNSP PGUID b f f 21 23 23 0 0 int24mi - - "---"));
DESCR("subtract");
-DATA(insert OID = 557 ( "-" PGNSP PGUID b f f 23 21 23 0 0 int42mi - - ));
+DATA(insert OID = 557 ( "-" PGNSP PGUID b f f 23 21 23 0 0 int42mi - - "---"));
DESCR("subtract");
-DATA(insert OID = 558 ( "-" PGNSP PGUID l f f 0 23 23 0 0 int4um - - ));
+DATA(insert OID = 558 ( "-" PGNSP PGUID l f f 0 23 23 0 0 int4um - - "---"));
DESCR("negate");
-DATA(insert OID = 559 ( "-" PGNSP PGUID l f f 0 21 21 0 0 int2um - - ));
+DATA(insert OID = 559 ( "-" PGNSP PGUID l f f 0 21 21 0 0 int2um - - "---"));
DESCR("negate");
-DATA(insert OID = 560 ( "=" PGNSP PGUID b t t 702 702 16 560 561 abstimeeq eqsel eqjoinsel ));
+DATA(insert OID = 560 ( "=" PGNSP PGUID b t t 702 702 16 560 561 abstimeeq eqsel eqjoinsel "mhf"));
DESCR("equal");
-DATA(insert OID = 561 ( "<>" PGNSP PGUID b f f 702 702 16 561 560 abstimene neqsel neqjoinsel ));
+DATA(insert OID = 561 ( "<>" PGNSP PGUID b f f 702 702 16 561 560 abstimene neqsel neqjoinsel "mhf"));
DESCR("not equal");
-DATA(insert OID = 562 ( "<" PGNSP PGUID b f f 702 702 16 563 565 abstimelt scalarltsel scalarltjoinsel ));
+DATA(insert OID = 562 ( "<" PGNSP PGUID b f f 702 702 16 563 565 abstimelt scalarltsel scalarltjoinsel "mh-"));
DESCR("less than");
-DATA(insert OID = 563 ( ">" PGNSP PGUID b f f 702 702 16 562 564 abstimegt scalargtsel scalargtjoinsel ));
+DATA(insert OID = 563 ( ">" PGNSP PGUID b f f 702 702 16 562 564 abstimegt scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than");
-DATA(insert OID = 564 ( "<=" PGNSP PGUID b f f 702 702 16 565 563 abstimele scalarltsel scalarltjoinsel ));
+DATA(insert OID = 564 ( "<=" PGNSP PGUID b f f 702 702 16 565 563 abstimele scalarltsel scalarltjoinsel "mh-"));
DESCR("less than or equal");
-DATA(insert OID = 565 ( ">=" PGNSP PGUID b f f 702 702 16 564 562 abstimege scalargtsel scalargtjoinsel ));
+DATA(insert OID = 565 ( ">=" PGNSP PGUID b f f 702 702 16 564 562 abstimege scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than or equal");
-DATA(insert OID = 566 ( "=" PGNSP PGUID b t t 703 703 16 566 567 reltimeeq eqsel eqjoinsel ));
+DATA(insert OID = 566 ( "=" PGNSP PGUID b t t 703 703 16 566 567 reltimeeq eqsel eqjoinsel "mhf"));
DESCR("equal");
-DATA(insert OID = 567 ( "<>" PGNSP PGUID b f f 703 703 16 567 566 reltimene neqsel neqjoinsel ));
+DATA(insert OID = 567 ( "<>" PGNSP PGUID b f f 703 703 16 567 566 reltimene neqsel neqjoinsel "mhf"));
DESCR("not equal");
-DATA(insert OID = 568 ( "<" PGNSP PGUID b f f 703 703 16 569 571 reltimelt scalarltsel scalarltjoinsel ));
+DATA(insert OID = 568 ( "<" PGNSP PGUID b f f 703 703 16 569 571 reltimelt scalarltsel scalarltjoinsel "mh-"));
DESCR("less than");
-DATA(insert OID = 569 ( ">" PGNSP PGUID b f f 703 703 16 568 570 reltimegt scalargtsel scalargtjoinsel ));
+DATA(insert OID = 569 ( ">" PGNSP PGUID b f f 703 703 16 568 570 reltimegt scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than");
-DATA(insert OID = 570 ( "<=" PGNSP PGUID b f f 703 703 16 571 569 reltimele scalarltsel scalarltjoinsel ));
+DATA(insert OID = 570 ( "<=" PGNSP PGUID b f f 703 703 16 571 569 reltimele scalarltsel scalarltjoinsel "mh-"));
DESCR("less than or equal");
-DATA(insert OID = 571 ( ">=" PGNSP PGUID b f f 703 703 16 570 568 reltimege scalargtsel scalargtjoinsel ));
+DATA(insert OID = 571 ( ">=" PGNSP PGUID b f f 703 703 16 570 568 reltimege scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than or equal");
-DATA(insert OID = 572 ( "~=" PGNSP PGUID b f f 704 704 16 572 0 tintervalsame eqsel eqjoinsel ));
+DATA(insert OID = 572 ( "~=" PGNSP PGUID b f f 704 704 16 572 0 tintervalsame eqsel eqjoinsel "mhf"));
DESCR("same as");
-DATA(insert OID = 573 ( "<<" PGNSP PGUID b f f 704 704 16 0 0 tintervalct - - ));
+DATA(insert OID = 573 ( "<<" PGNSP PGUID b f f 704 704 16 0 0 tintervalct - - "---"));
DESCR("contains");
-DATA(insert OID = 574 ( "&&" PGNSP PGUID b f f 704 704 16 574 0 tintervalov - - ));
+DATA(insert OID = 574 ( "&&" PGNSP PGUID b f f 704 704 16 574 0 tintervalov - - "---"));
DESCR("overlaps");
-DATA(insert OID = 575 ( "#=" PGNSP PGUID b f f 704 703 16 0 576 tintervalleneq - - ));
+DATA(insert OID = 575 ( "#=" PGNSP PGUID b f f 704 703 16 0 576 tintervalleneq - - "---"));
DESCR("equal by length");
-DATA(insert OID = 576 ( "#<>" PGNSP PGUID b f f 704 703 16 0 575 tintervallenne - - ));
+DATA(insert OID = 576 ( "#<>" PGNSP PGUID b f f 704 703 16 0 575 tintervallenne - - "---"));
DESCR("not equal by length");
-DATA(insert OID = 577 ( "#<" PGNSP PGUID b f f 704 703 16 0 580 tintervallenlt - - ));
+DATA(insert OID = 577 ( "#<" PGNSP PGUID b f f 704 703 16 0 580 tintervallenlt - - "---"));
DESCR("less than by length");
-DATA(insert OID = 578 ( "#>" PGNSP PGUID b f f 704 703 16 0 579 tintervallengt - - ));
+DATA(insert OID = 578 ( "#>" PGNSP PGUID b f f 704 703 16 0 579 tintervallengt - - "---"));
DESCR("greater than by length");
-DATA(insert OID = 579 ( "#<=" PGNSP PGUID b f f 704 703 16 0 578 tintervallenle - - ));
+DATA(insert OID = 579 ( "#<=" PGNSP PGUID b f f 704 703 16 0 578 tintervallenle - - "---"));
DESCR("less than or equal by length");
-DATA(insert OID = 580 ( "#>=" PGNSP PGUID b f f 704 703 16 0 577 tintervallenge - - ));
+DATA(insert OID = 580 ( "#>=" PGNSP PGUID b f f 704 703 16 0 577 tintervallenge - - "---"));
DESCR("greater than or equal by length");
-DATA(insert OID = 581 ( "+" PGNSP PGUID b f f 702 703 702 0 0 timepl - - ));
+DATA(insert OID = 581 ( "+" PGNSP PGUID b f f 702 703 702 0 0 timepl - - "---"));
DESCR("add");
-DATA(insert OID = 582 ( "-" PGNSP PGUID b f f 702 703 702 0 0 timemi - - ));
+DATA(insert OID = 582 ( "-" PGNSP PGUID b f f 702 703 702 0 0 timemi - - "---"));
DESCR("subtract");
-DATA(insert OID = 583 ( "<?>" PGNSP PGUID b f f 702 704 16 0 0 intinterval - - ));
+DATA(insert OID = 583 ( "<?>" PGNSP PGUID b f f 702 704 16 0 0 intinterval - - "---"));
DESCR("is contained by");
-DATA(insert OID = 584 ( "-" PGNSP PGUID l f f 0 700 700 0 0 float4um - - ));
+DATA(insert OID = 584 ( "-" PGNSP PGUID l f f 0 700 700 0 0 float4um - - "---"));
DESCR("negate");
-DATA(insert OID = 585 ( "-" PGNSP PGUID l f f 0 701 701 0 0 float8um - - ));
+DATA(insert OID = 585 ( "-" PGNSP PGUID l f f 0 701 701 0 0 float8um - - "---"));
DESCR("negate");
-DATA(insert OID = 586 ( "+" PGNSP PGUID b f f 700 700 700 586 0 float4pl - - ));
+DATA(insert OID = 586 ( "+" PGNSP PGUID b f f 700 700 700 586 0 float4pl - - "---"));
DESCR("add");
-DATA(insert OID = 587 ( "-" PGNSP PGUID b f f 700 700 700 0 0 float4mi - - ));
+DATA(insert OID = 587 ( "-" PGNSP PGUID b f f 700 700 700 0 0 float4mi - - "---"));
DESCR("subtract");
-DATA(insert OID = 588 ( "/" PGNSP PGUID b f f 700 700 700 0 0 float4div - - ));
+DATA(insert OID = 588 ( "/" PGNSP PGUID b f f 700 700 700 0 0 float4div - - "---"));
DESCR("divide");
-DATA(insert OID = 589 ( "*" PGNSP PGUID b f f 700 700 700 589 0 float4mul - - ));
+DATA(insert OID = 589 ( "*" PGNSP PGUID b f f 700 700 700 589 0 float4mul - - "---"));
DESCR("multiply");
-DATA(insert OID = 590 ( "@" PGNSP PGUID l f f 0 700 700 0 0 float4abs - - ));
+DATA(insert OID = 590 ( "@" PGNSP PGUID l f f 0 700 700 0 0 float4abs - - "---"));
DESCR("absolute value");
-DATA(insert OID = 591 ( "+" PGNSP PGUID b f f 701 701 701 591 0 float8pl - - ));
+DATA(insert OID = 591 ( "+" PGNSP PGUID b f f 701 701 701 591 0 float8pl - - "---"));
DESCR("add");
-DATA(insert OID = 592 ( "-" PGNSP PGUID b f f 701 701 701 0 0 float8mi - - ));
+DATA(insert OID = 592 ( "-" PGNSP PGUID b f f 701 701 701 0 0 float8mi - - "---"));
DESCR("subtract");
-DATA(insert OID = 593 ( "/" PGNSP PGUID b f f 701 701 701 0 0 float8div - - ));
+DATA(insert OID = 593 ( "/" PGNSP PGUID b f f 701 701 701 0 0 float8div - - "---"));
DESCR("divide");
-DATA(insert OID = 594 ( "*" PGNSP PGUID b f f 701 701 701 594 0 float8mul - - ));
+DATA(insert OID = 594 ( "*" PGNSP PGUID b f f 701 701 701 594 0 float8mul - - "---"));
DESCR("multiply");
-DATA(insert OID = 595 ( "@" PGNSP PGUID l f f 0 701 701 0 0 float8abs - - ));
+DATA(insert OID = 595 ( "@" PGNSP PGUID l f f 0 701 701 0 0 float8abs - - "---"));
DESCR("absolute value");
-DATA(insert OID = 596 ( "|/" PGNSP PGUID l f f 0 701 701 0 0 dsqrt - - ));
+DATA(insert OID = 596 ( "|/" PGNSP PGUID l f f 0 701 701 0 0 dsqrt - - "---"));
DESCR("square root");
-DATA(insert OID = 597 ( "||/" PGNSP PGUID l f f 0 701 701 0 0 dcbrt - - ));
+DATA(insert OID = 597 ( "||/" PGNSP PGUID l f f 0 701 701 0 0 dcbrt - - "---"));
DESCR("cube root");
-DATA(insert OID = 1284 ( "|" PGNSP PGUID l f f 0 704 702 0 0 tintervalstart - - ));
+DATA(insert OID = 1284 ( "|" PGNSP PGUID l f f 0 704 702 0 0 tintervalstart - - "---"));
DESCR("start of interval");
-DATA(insert OID = 606 ( "<#>" PGNSP PGUID b f f 702 702 704 0 0 mktinterval - - ));
+DATA(insert OID = 606 ( "<#>" PGNSP PGUID b f f 702 702 704 0 0 mktinterval - - "---"));
DESCR("convert to tinterval");
-DATA(insert OID = 607 ( "=" PGNSP PGUID b t t 26 26 16 607 608 oideq eqsel eqjoinsel ));
+DATA(insert OID = 607 ( "=" PGNSP PGUID b t t 26 26 16 607 608 oideq eqsel eqjoinsel "mhf"));
DESCR("equal");
-DATA(insert OID = 608 ( "<>" PGNSP PGUID b f f 26 26 16 608 607 oidne neqsel neqjoinsel ));
+DATA(insert OID = 608 ( "<>" PGNSP PGUID b f f 26 26 16 608 607 oidne neqsel neqjoinsel "mhf"));
DESCR("not equal");
-DATA(insert OID = 609 ( "<" PGNSP PGUID b f f 26 26 16 610 612 oidlt scalarltsel scalarltjoinsel ));
+DATA(insert OID = 609 ( "<" PGNSP PGUID b f f 26 26 16 610 612 oidlt scalarltsel scalarltjoinsel "mh-"));
DESCR("less than");
-DATA(insert OID = 610 ( ">" PGNSP PGUID b f f 26 26 16 609 611 oidgt scalargtsel scalargtjoinsel ));
+DATA(insert OID = 610 ( ">" PGNSP PGUID b f f 26 26 16 609 611 oidgt scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than");
-DATA(insert OID = 611 ( "<=" PGNSP PGUID b f f 26 26 16 612 610 oidle scalarltsel scalarltjoinsel ));
+DATA(insert OID = 611 ( "<=" PGNSP PGUID b f f 26 26 16 612 610 oidle scalarltsel scalarltjoinsel "mh-"));
DESCR("less than or equal");
-DATA(insert OID = 612 ( ">=" PGNSP PGUID b f f 26 26 16 611 609 oidge scalargtsel scalargtjoinsel ));
+DATA(insert OID = 612 ( ">=" PGNSP PGUID b f f 26 26 16 611 609 oidge scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than or equal");
-DATA(insert OID = 644 ( "<>" PGNSP PGUID b f f 30 30 16 644 649 oidvectorne neqsel neqjoinsel ));
+DATA(insert OID = 644 ( "<>" PGNSP PGUID b f f 30 30 16 644 649 oidvectorne neqsel neqjoinsel "mhf"));
DESCR("not equal");
-DATA(insert OID = 645 ( "<" PGNSP PGUID b f f 30 30 16 646 648 oidvectorlt scalarltsel scalarltjoinsel ));
+DATA(insert OID = 645 ( "<" PGNSP PGUID b f f 30 30 16 646 648 oidvectorlt scalarltsel scalarltjoinsel "mh-"));
DESCR("less than");
-DATA(insert OID = 646 ( ">" PGNSP PGUID b f f 30 30 16 645 647 oidvectorgt scalargtsel scalargtjoinsel ));
+DATA(insert OID = 646 ( ">" PGNSP PGUID b f f 30 30 16 645 647 oidvectorgt scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than");
-DATA(insert OID = 647 ( "<=" PGNSP PGUID b f f 30 30 16 648 646 oidvectorle scalarltsel scalarltjoinsel ));
+DATA(insert OID = 647 ( "<=" PGNSP PGUID b f f 30 30 16 648 646 oidvectorle scalarltsel scalarltjoinsel "mh-"));
DESCR("less than or equal");
-DATA(insert OID = 648 ( ">=" PGNSP PGUID b f f 30 30 16 647 645 oidvectorge scalargtsel scalargtjoinsel ));
+DATA(insert OID = 648 ( ">=" PGNSP PGUID b f f 30 30 16 647 645 oidvectorge scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than or equal");
-DATA(insert OID = 649 ( "=" PGNSP PGUID b t t 30 30 16 649 644 oidvectoreq eqsel eqjoinsel ));
+DATA(insert OID = 649 ( "=" PGNSP PGUID b t t 30 30 16 649 644 oidvectoreq eqsel eqjoinsel "mhf"));
DESCR("equal");
-DATA(insert OID = 613 ( "<->" PGNSP PGUID b f f 600 628 701 0 0 dist_pl - - ));
+DATA(insert OID = 613 ( "<->" PGNSP PGUID b f f 600 628 701 0 0 dist_pl - - "---"));
DESCR("distance between");
-DATA(insert OID = 614 ( "<->" PGNSP PGUID b f f 600 601 701 0 0 dist_ps - - ));
+DATA(insert OID = 614 ( "<->" PGNSP PGUID b f f 600 601 701 0 0 dist_ps - - "---"));
DESCR("distance between");
-DATA(insert OID = 615 ( "<->" PGNSP PGUID b f f 600 603 701 0 0 dist_pb - - ));
+DATA(insert OID = 615 ( "<->" PGNSP PGUID b f f 600 603 701 0 0 dist_pb - - "---"));
DESCR("distance between");
-DATA(insert OID = 616 ( "<->" PGNSP PGUID b f f 601 628 701 0 0 dist_sl - - ));
+DATA(insert OID = 616 ( "<->" PGNSP PGUID b f f 601 628 701 0 0 dist_sl - - "---"));
DESCR("distance between");
-DATA(insert OID = 617 ( "<->" PGNSP PGUID b f f 601 603 701 0 0 dist_sb - - ));
+DATA(insert OID = 617 ( "<->" PGNSP PGUID b f f 601 603 701 0 0 dist_sb - - "---"));
DESCR("distance between");
-DATA(insert OID = 618 ( "<->" PGNSP PGUID b f f 600 602 701 0 0 dist_ppath - - ));
+DATA(insert OID = 618 ( "<->" PGNSP PGUID b f f 600 602 701 0 0 dist_ppath - - "---"));
DESCR("distance between");
-DATA(insert OID = 620 ( "=" PGNSP PGUID b t t 700 700 16 620 621 float4eq eqsel eqjoinsel ));
+DATA(insert OID = 620 ( "=" PGNSP PGUID b t t 700 700 16 620 621 float4eq eqsel eqjoinsel "mhf"));
DESCR("equal");
-DATA(insert OID = 621 ( "<>" PGNSP PGUID b f f 700 700 16 621 620 float4ne neqsel neqjoinsel ));
+DATA(insert OID = 621 ( "<>" PGNSP PGUID b f f 700 700 16 621 620 float4ne neqsel neqjoinsel "mhf"));
DESCR("not equal");
-DATA(insert OID = 622 ( "<" PGNSP PGUID b f f 700 700 16 623 625 float4lt scalarltsel scalarltjoinsel ));
+DATA(insert OID = 622 ( "<" PGNSP PGUID b f f 700 700 16 623 625 float4lt scalarltsel scalarltjoinsel "mh-"));
DESCR("less than");
-DATA(insert OID = 623 ( ">" PGNSP PGUID b f f 700 700 16 622 624 float4gt scalargtsel scalargtjoinsel ));
+DATA(insert OID = 623 ( ">" PGNSP PGUID b f f 700 700 16 622 624 float4gt scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than");
-DATA(insert OID = 624 ( "<=" PGNSP PGUID b f f 700 700 16 625 623 float4le scalarltsel scalarltjoinsel ));
+DATA(insert OID = 624 ( "<=" PGNSP PGUID b f f 700 700 16 625 623 float4le scalarltsel scalarltjoinsel "mh-"));
DESCR("less than or equal");
-DATA(insert OID = 625 ( ">=" PGNSP PGUID b f f 700 700 16 624 622 float4ge scalargtsel scalargtjoinsel ));
+DATA(insert OID = 625 ( ">=" PGNSP PGUID b f f 700 700 16 624 622 float4ge scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than or equal");
-DATA(insert OID = 630 ( "<>" PGNSP PGUID b f f 18 18 16 630 92 charne neqsel neqjoinsel ));
+DATA(insert OID = 630 ( "<>" PGNSP PGUID b f f 18 18 16 630 92 charne neqsel neqjoinsel "mhf"));
DESCR("not equal");
-DATA(insert OID = 631 ( "<" PGNSP PGUID b f f 18 18 16 633 634 charlt scalarltsel scalarltjoinsel ));
+DATA(insert OID = 631 ( "<" PGNSP PGUID b f f 18 18 16 633 634 charlt scalarltsel scalarltjoinsel "mh-"));
DESCR("less than");
-DATA(insert OID = 632 ( "<=" PGNSP PGUID b f f 18 18 16 634 633 charle scalarltsel scalarltjoinsel ));
+DATA(insert OID = 632 ( "<=" PGNSP PGUID b f f 18 18 16 634 633 charle scalarltsel scalarltjoinsel "mh-"));
DESCR("less than or equal");
-DATA(insert OID = 633 ( ">" PGNSP PGUID b f f 18 18 16 631 632 chargt scalargtsel scalargtjoinsel ));
+DATA(insert OID = 633 ( ">" PGNSP PGUID b f f 18 18 16 631 632 chargt scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than");
-DATA(insert OID = 634 ( ">=" PGNSP PGUID b f f 18 18 16 632 631 charge scalargtsel scalargtjoinsel ));
+DATA(insert OID = 634 ( ">=" PGNSP PGUID b f f 18 18 16 632 631 charge scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than or equal");
-DATA(insert OID = 639 ( "~" PGNSP PGUID b f f 19 25 16 0 640 nameregexeq regexeqsel regexeqjoinsel ));
+DATA(insert OID = 639 ( "~" PGNSP PGUID b f f 19 25 16 0 640 nameregexeq regexeqsel regexeqjoinsel "mhf"));
DESCR("matches regular expression, case-sensitive");
#define OID_NAME_REGEXEQ_OP 639
-DATA(insert OID = 640 ( "!~" PGNSP PGUID b f f 19 25 16 0 639 nameregexne regexnesel regexnejoinsel ));
+DATA(insert OID = 640 ( "!~" PGNSP PGUID b f f 19 25 16 0 639 nameregexne regexnesel regexnejoinsel "---"));
DESCR("does not match regular expression, case-sensitive");
-DATA(insert OID = 641 ( "~" PGNSP PGUID b f f 25 25 16 0 642 textregexeq regexeqsel regexeqjoinsel ));
+DATA(insert OID = 641 ( "~" PGNSP PGUID b f f 25 25 16 0 642 textregexeq regexeqsel regexeqjoinsel "mhf"));
DESCR("matches regular expression, case-sensitive");
#define OID_TEXT_REGEXEQ_OP 641
-DATA(insert OID = 642 ( "!~" PGNSP PGUID b f f 25 25 16 0 641 textregexne regexnesel regexnejoinsel ));
+DATA(insert OID = 642 ( "!~" PGNSP PGUID b f f 25 25 16 0 641 textregexne regexnesel regexnejoinsel "---"));
DESCR("does not match regular expression, case-sensitive");
-DATA(insert OID = 643 ( "<>" PGNSP PGUID b f f 19 19 16 643 93 namene neqsel neqjoinsel ));
+DATA(insert OID = 643 ( "<>" PGNSP PGUID b f f 19 19 16 643 93 namene neqsel neqjoinsel "mhf"));
DESCR("not equal");
-DATA(insert OID = 654 ( "||" PGNSP PGUID b f f 25 25 25 0 0 textcat - - ));
+DATA(insert OID = 654 ( "||" PGNSP PGUID b f f 25 25 25 0 0 textcat - - "---"));
DESCR("concatenate");
-DATA(insert OID = 660 ( "<" PGNSP PGUID b f f 19 19 16 662 663 namelt scalarltsel scalarltjoinsel ));
+DATA(insert OID = 660 ( "<" PGNSP PGUID b f f 19 19 16 662 663 namelt scalarltsel scalarltjoinsel "mh-"));
DESCR("less than");
-DATA(insert OID = 661 ( "<=" PGNSP PGUID b f f 19 19 16 663 662 namele scalarltsel scalarltjoinsel ));
+DATA(insert OID = 661 ( "<=" PGNSP PGUID b f f 19 19 16 663 662 namele scalarltsel scalarltjoinsel "mh-"));
DESCR("less than or equal");
-DATA(insert OID = 662 ( ">" PGNSP PGUID b f f 19 19 16 660 661 namegt scalargtsel scalargtjoinsel ));
+DATA(insert OID = 662 ( ">" PGNSP PGUID b f f 19 19 16 660 661 namegt scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than");
-DATA(insert OID = 663 ( ">=" PGNSP PGUID b f f 19 19 16 661 660 namege scalargtsel scalargtjoinsel ));
+DATA(insert OID = 663 ( ">=" PGNSP PGUID b f f 19 19 16 661 660 namege scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than or equal");
-DATA(insert OID = 664 ( "<" PGNSP PGUID b f f 25 25 16 666 667 text_lt scalarltsel scalarltjoinsel ));
+DATA(insert OID = 664 ( "<" PGNSP PGUID b f f 25 25 16 666 667 text_lt scalarltsel scalarltjoinsel "mh-"));
DESCR("less than");
-DATA(insert OID = 665 ( "<=" PGNSP PGUID b f f 25 25 16 667 666 text_le scalarltsel scalarltjoinsel ));
+DATA(insert OID = 665 ( "<=" PGNSP PGUID b f f 25 25 16 667 666 text_le scalarltsel scalarltjoinsel "mh-"));
DESCR("less than or equal");
-DATA(insert OID = 666 ( ">" PGNSP PGUID b f f 25 25 16 664 665 text_gt scalargtsel scalargtjoinsel ));
+DATA(insert OID = 666 ( ">" PGNSP PGUID b f f 25 25 16 664 665 text_gt scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than");
-DATA(insert OID = 667 ( ">=" PGNSP PGUID b f f 25 25 16 665 664 text_ge scalargtsel scalargtjoinsel ));
+DATA(insert OID = 667 ( ">=" PGNSP PGUID b f f 25 25 16 665 664 text_ge scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than or equal");
-DATA(insert OID = 670 ( "=" PGNSP PGUID b t t 701 701 16 670 671 float8eq eqsel eqjoinsel ));
+DATA(insert OID = 670 ( "=" PGNSP PGUID b t t 701 701 16 670 671 float8eq eqsel eqjoinsel "mhf"));
DESCR("equal");
-DATA(insert OID = 671 ( "<>" PGNSP PGUID b f f 701 701 16 671 670 float8ne neqsel neqjoinsel ));
+DATA(insert OID = 671 ( "<>" PGNSP PGUID b f f 701 701 16 671 670 float8ne neqsel neqjoinsel "mhf"));
DESCR("not equal");
-DATA(insert OID = 672 ( "<" PGNSP PGUID b f f 701 701 16 674 675 float8lt scalarltsel scalarltjoinsel ));
+DATA(insert OID = 672 ( "<" PGNSP PGUID b f f 701 701 16 674 675 float8lt scalarltsel scalarltjoinsel "mh-"));
DESCR("less than");
#define Float8LessOperator 672
-DATA(insert OID = 673 ( "<=" PGNSP PGUID b f f 701 701 16 675 674 float8le scalarltsel scalarltjoinsel ));
+DATA(insert OID = 673 ( "<=" PGNSP PGUID b f f 701 701 16 675 674 float8le scalarltsel scalarltjoinsel "mh-"));
DESCR("less than or equal");
-DATA(insert OID = 674 ( ">" PGNSP PGUID b f f 701 701 16 672 673 float8gt scalargtsel scalargtjoinsel ));
+DATA(insert OID = 674 ( ">" PGNSP PGUID b f f 701 701 16 672 673 float8gt scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than");
-DATA(insert OID = 675 ( ">=" PGNSP PGUID b f f 701 701 16 673 672 float8ge scalargtsel scalargtjoinsel ));
+DATA(insert OID = 675 ( ">=" PGNSP PGUID b f f 701 701 16 673 672 float8ge scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than or equal");
-DATA(insert OID = 682 ( "@" PGNSP PGUID l f f 0 21 21 0 0 int2abs - - ));
+DATA(insert OID = 682 ( "@" PGNSP PGUID l f f 0 21 21 0 0 int2abs - - "---"));
DESCR("absolute value");
-DATA(insert OID = 684 ( "+" PGNSP PGUID b f f 20 20 20 684 0 int8pl - - ));
+DATA(insert OID = 684 ( "+" PGNSP PGUID b f f 20 20 20 684 0 int8pl - - "---"));
DESCR("add");
-DATA(insert OID = 685 ( "-" PGNSP PGUID b f f 20 20 20 0 0 int8mi - - ));
+DATA(insert OID = 685 ( "-" PGNSP PGUID b f f 20 20 20 0 0 int8mi - - "---"));
DESCR("subtract");
-DATA(insert OID = 686 ( "*" PGNSP PGUID b f f 20 20 20 686 0 int8mul - - ));
+DATA(insert OID = 686 ( "*" PGNSP PGUID b f f 20 20 20 686 0 int8mul - - "---"));
DESCR("multiply");
-DATA(insert OID = 687 ( "/" PGNSP PGUID b f f 20 20 20 0 0 int8div - - ));
+DATA(insert OID = 687 ( "/" PGNSP PGUID b f f 20 20 20 0 0 int8div - - "---"));
DESCR("divide");
-DATA(insert OID = 688 ( "+" PGNSP PGUID b f f 20 23 20 692 0 int84pl - - ));
+DATA(insert OID = 688 ( "+" PGNSP PGUID b f f 20 23 20 692 0 int84pl - - "---"));
DESCR("add");
-DATA(insert OID = 689 ( "-" PGNSP PGUID b f f 20 23 20 0 0 int84mi - - ));
+DATA(insert OID = 689 ( "-" PGNSP PGUID b f f 20 23 20 0 0 int84mi - - "---"));
DESCR("subtract");
-DATA(insert OID = 690 ( "*" PGNSP PGUID b f f 20 23 20 694 0 int84mul - - ));
+DATA(insert OID = 690 ( "*" PGNSP PGUID b f f 20 23 20 694 0 int84mul - - "---"));
DESCR("multiply");
-DATA(insert OID = 691 ( "/" PGNSP PGUID b f f 20 23 20 0 0 int84div - - ));
+DATA(insert OID = 691 ( "/" PGNSP PGUID b f f 20 23 20 0 0 int84div - - "---"));
DESCR("divide");
-DATA(insert OID = 692 ( "+" PGNSP PGUID b f f 23 20 20 688 0 int48pl - - ));
+DATA(insert OID = 692 ( "+" PGNSP PGUID b f f 23 20 20 688 0 int48pl - - "---"));
DESCR("add");
-DATA(insert OID = 693 ( "-" PGNSP PGUID b f f 23 20 20 0 0 int48mi - - ));
+DATA(insert OID = 693 ( "-" PGNSP PGUID b f f 23 20 20 0 0 int48mi - - "---"));
DESCR("subtract");
-DATA(insert OID = 694 ( "*" PGNSP PGUID b f f 23 20 20 690 0 int48mul - - ));
+DATA(insert OID = 694 ( "*" PGNSP PGUID b f f 23 20 20 690 0 int48mul - - "---"));
DESCR("multiply");
-DATA(insert OID = 695 ( "/" PGNSP PGUID b f f 23 20 20 0 0 int48div - - ));
+DATA(insert OID = 695 ( "/" PGNSP PGUID b f f 23 20 20 0 0 int48div - - "---"));
DESCR("divide");
-DATA(insert OID = 818 ( "+" PGNSP PGUID b f f 20 21 20 822 0 int82pl - - ));
+DATA(insert OID = 818 ( "+" PGNSP PGUID b f f 20 21 20 822 0 int82pl - - "---"));
DESCR("add");
-DATA(insert OID = 819 ( "-" PGNSP PGUID b f f 20 21 20 0 0 int82mi - - ));
+DATA(insert OID = 819 ( "-" PGNSP PGUID b f f 20 21 20 0 0 int82mi - - "---"));
DESCR("subtract");
-DATA(insert OID = 820 ( "*" PGNSP PGUID b f f 20 21 20 824 0 int82mul - - ));
+DATA(insert OID = 820 ( "*" PGNSP PGUID b f f 20 21 20 824 0 int82mul - - "---"));
DESCR("multiply");
-DATA(insert OID = 821 ( "/" PGNSP PGUID b f f 20 21 20 0 0 int82div - - ));
+DATA(insert OID = 821 ( "/" PGNSP PGUID b f f 20 21 20 0 0 int82div - - "---"));
DESCR("divide");
-DATA(insert OID = 822 ( "+" PGNSP PGUID b f f 21 20 20 818 0 int28pl - - ));
+DATA(insert OID = 822 ( "+" PGNSP PGUID b f f 21 20 20 818 0 int28pl - - "---"));
DESCR("add");
-DATA(insert OID = 823 ( "-" PGNSP PGUID b f f 21 20 20 0 0 int28mi - - ));
+DATA(insert OID = 823 ( "-" PGNSP PGUID b f f 21 20 20 0 0 int28mi - - "---"));
DESCR("subtract");
-DATA(insert OID = 824 ( "*" PGNSP PGUID b f f 21 20 20 820 0 int28mul - - ));
+DATA(insert OID = 824 ( "*" PGNSP PGUID b f f 21 20 20 820 0 int28mul - - "---"));
DESCR("multiply");
-DATA(insert OID = 825 ( "/" PGNSP PGUID b f f 21 20 20 0 0 int28div - - ));
+DATA(insert OID = 825 ( "/" PGNSP PGUID b f f 21 20 20 0 0 int28div - - "---"));
DESCR("divide");
-DATA(insert OID = 706 ( "<->" PGNSP PGUID b f f 603 603 701 706 0 box_distance - - ));
+DATA(insert OID = 706 ( "<->" PGNSP PGUID b f f 603 603 701 706 0 box_distance - - "---"));
DESCR("distance between");
-DATA(insert OID = 707 ( "<->" PGNSP PGUID b f f 602 602 701 707 0 path_distance - - ));
+DATA(insert OID = 707 ( "<->" PGNSP PGUID b f f 602 602 701 707 0 path_distance - - "---"));
DESCR("distance between");
-DATA(insert OID = 708 ( "<->" PGNSP PGUID b f f 628 628 701 708 0 line_distance - - ));
+DATA(insert OID = 708 ( "<->" PGNSP PGUID b f f 628 628 701 708 0 line_distance - - "---"));
DESCR("distance between");
-DATA(insert OID = 709 ( "<->" PGNSP PGUID b f f 601 601 701 709 0 lseg_distance - - ));
+DATA(insert OID = 709 ( "<->" PGNSP PGUID b f f 601 601 701 709 0 lseg_distance - - "---"));
DESCR("distance between");
-DATA(insert OID = 712 ( "<->" PGNSP PGUID b f f 604 604 701 712 0 poly_distance - - ));
+DATA(insert OID = 712 ( "<->" PGNSP PGUID b f f 604 604 701 712 0 poly_distance - - "---"));
DESCR("distance between");
-DATA(insert OID = 713 ( "<>" PGNSP PGUID b f f 600 600 16 713 510 point_ne neqsel neqjoinsel ));
+DATA(insert OID = 713 ( "<>" PGNSP PGUID b f f 600 600 16 713 510 point_ne neqsel neqjoinsel "mhf"));
DESCR("not equal");
/* add translation/rotation/scaling operators for geometric types. - thomas 97/05/10 */
-DATA(insert OID = 731 ( "+" PGNSP PGUID b f f 600 600 600 731 0 point_add - - ));
+DATA(insert OID = 731 ( "+" PGNSP PGUID b f f 600 600 600 731 0 point_add - - "---"));
DESCR("add points (translate)");
-DATA(insert OID = 732 ( "-" PGNSP PGUID b f f 600 600 600 0 0 point_sub - - ));
+DATA(insert OID = 732 ( "-" PGNSP PGUID b f f 600 600 600 0 0 point_sub - - "---"));
DESCR("subtract points (translate)");
-DATA(insert OID = 733 ( "*" PGNSP PGUID b f f 600 600 600 733 0 point_mul - - ));
+DATA(insert OID = 733 ( "*" PGNSP PGUID b f f 600 600 600 733 0 point_mul - - "---"));
DESCR("multiply points (scale/rotate)");
-DATA(insert OID = 734 ( "/" PGNSP PGUID b f f 600 600 600 0 0 point_div - - ));
+DATA(insert OID = 734 ( "/" PGNSP PGUID b f f 600 600 600 0 0 point_div - - "---"));
DESCR("divide points (scale/rotate)");
-DATA(insert OID = 735 ( "+" PGNSP PGUID b f f 602 602 602 735 0 path_add - - ));
+DATA(insert OID = 735 ( "+" PGNSP PGUID b f f 602 602 602 735 0 path_add - - "---"));
DESCR("concatenate");
-DATA(insert OID = 736 ( "+" PGNSP PGUID b f f 602 600 602 0 0 path_add_pt - - ));
+DATA(insert OID = 736 ( "+" PGNSP PGUID b f f 602 600 602 0 0 path_add_pt - - "---"));
DESCR("add (translate path)");
-DATA(insert OID = 737 ( "-" PGNSP PGUID b f f 602 600 602 0 0 path_sub_pt - - ));
+DATA(insert OID = 737 ( "-" PGNSP PGUID b f f 602 600 602 0 0 path_sub_pt - - "---"));
DESCR("subtract (translate path)");
-DATA(insert OID = 738 ( "*" PGNSP PGUID b f f 602 600 602 0 0 path_mul_pt - - ));
+DATA(insert OID = 738 ( "*" PGNSP PGUID b f f 602 600 602 0 0 path_mul_pt - - "---"));
DESCR("multiply (rotate/scale path)");
-DATA(insert OID = 739 ( "/" PGNSP PGUID b f f 602 600 602 0 0 path_div_pt - - ));
+DATA(insert OID = 739 ( "/" PGNSP PGUID b f f 602 600 602 0 0 path_div_pt - - "---"));
DESCR("divide (rotate/scale path)");
-DATA(insert OID = 755 ( "@>" PGNSP PGUID b f f 602 600 16 512 0 path_contain_pt - - ));
+DATA(insert OID = 755 ( "@>" PGNSP PGUID b f f 602 600 16 512 0 path_contain_pt - - "---"));
DESCR("contains");
-DATA(insert OID = 756 ( "<@" PGNSP PGUID b f f 600 604 16 757 0 pt_contained_poly contsel contjoinsel ));
+DATA(insert OID = 756 ( "<@" PGNSP PGUID b f f 600 604 16 757 0 pt_contained_poly contsel contjoinsel "---"));
DESCR("is contained by");
-DATA(insert OID = 757 ( "@>" PGNSP PGUID b f f 604 600 16 756 0 poly_contain_pt contsel contjoinsel ));
+DATA(insert OID = 757 ( "@>" PGNSP PGUID b f f 604 600 16 756 0 poly_contain_pt contsel contjoinsel "---"));
DESCR("contains");
-DATA(insert OID = 758 ( "<@" PGNSP PGUID b f f 600 718 16 759 0 pt_contained_circle contsel contjoinsel ));
+DATA(insert OID = 758 ( "<@" PGNSP PGUID b f f 600 718 16 759 0 pt_contained_circle contsel contjoinsel "---"));
DESCR("is contained by");
-DATA(insert OID = 759 ( "@>" PGNSP PGUID b f f 718 600 16 758 0 circle_contain_pt contsel contjoinsel ));
+DATA(insert OID = 759 ( "@>" PGNSP PGUID b f f 718 600 16 758 0 circle_contain_pt contsel contjoinsel "---"));
DESCR("contains");
-DATA(insert OID = 773 ( "@" PGNSP PGUID l f f 0 23 23 0 0 int4abs - - ));
+DATA(insert OID = 773 ( "@" PGNSP PGUID l f f 0 23 23 0 0 int4abs - - "---"));
DESCR("absolute value");
/* additional operators for geometric types - thomas 1997-07-09 */
-DATA(insert OID = 792 ( "=" PGNSP PGUID b f f 602 602 16 792 0 path_n_eq eqsel eqjoinsel ));
+DATA(insert OID = 792 ( "=" PGNSP PGUID b f f 602 602 16 792 0 path_n_eq eqsel eqjoinsel "mhf"));
DESCR("equal");
-DATA(insert OID = 793 ( "<" PGNSP PGUID b f f 602 602 16 794 0 path_n_lt - - ));
+DATA(insert OID = 793 ( "<" PGNSP PGUID b f f 602 602 16 794 0 path_n_lt - - "---"));
DESCR("less than");
-DATA(insert OID = 794 ( ">" PGNSP PGUID b f f 602 602 16 793 0 path_n_gt - - ));
+DATA(insert OID = 794 ( ">" PGNSP PGUID b f f 602 602 16 793 0 path_n_gt - - "---"));
DESCR("greater than");
-DATA(insert OID = 795 ( "<=" PGNSP PGUID b f f 602 602 16 796 0 path_n_le - - ));
+DATA(insert OID = 795 ( "<=" PGNSP PGUID b f f 602 602 16 796 0 path_n_le - - "---"));
DESCR("less than or equal");
-DATA(insert OID = 796 ( ">=" PGNSP PGUID b f f 602 602 16 795 0 path_n_ge - - ));
+DATA(insert OID = 796 ( ">=" PGNSP PGUID b f f 602 602 16 795 0 path_n_ge - - "---"));
DESCR("greater than or equal");
-DATA(insert OID = 797 ( "#" PGNSP PGUID l f f 0 602 23 0 0 path_npoints - - ));
+DATA(insert OID = 797 ( "#" PGNSP PGUID l f f 0 602 23 0 0 path_npoints - - "---"));
DESCR("number of points");
-DATA(insert OID = 798 ( "?#" PGNSP PGUID b f f 602 602 16 0 0 path_inter - - ));
+DATA(insert OID = 798 ( "?#" PGNSP PGUID b f f 602 602 16 0 0 path_inter - - "---"));
DESCR("intersect");
-DATA(insert OID = 799 ( "@-@" PGNSP PGUID l f f 0 602 701 0 0 path_length - - ));
+DATA(insert OID = 799 ( "@-@" PGNSP PGUID l f f 0 602 701 0 0 path_length - - "---"));
DESCR("sum of path segment lengths");
-DATA(insert OID = 800 ( ">^" PGNSP PGUID b f f 603 603 16 0 0 box_above_eq positionsel positionjoinsel ));
+DATA(insert OID = 800 ( ">^" PGNSP PGUID b f f 603 603 16 0 0 box_above_eq positionsel positionjoinsel "---"));
DESCR("is above (allows touching)");
-DATA(insert OID = 801 ( "<^" PGNSP PGUID b f f 603 603 16 0 0 box_below_eq positionsel positionjoinsel ));
+DATA(insert OID = 801 ( "<^" PGNSP PGUID b f f 603 603 16 0 0 box_below_eq positionsel positionjoinsel "---"));
DESCR("is below (allows touching)");
-DATA(insert OID = 802 ( "?#" PGNSP PGUID b f f 603 603 16 0 0 box_overlap areasel areajoinsel ));
+DATA(insert OID = 802 ( "?#" PGNSP PGUID b f f 603 603 16 0 0 box_overlap areasel areajoinsel "---"));
DESCR("deprecated, use && instead");
-DATA(insert OID = 803 ( "#" PGNSP PGUID b f f 603 603 603 0 0 box_intersect - - ));
+DATA(insert OID = 803 ( "#" PGNSP PGUID b f f 603 603 603 0 0 box_intersect - - "---"));
DESCR("box intersection");
-DATA(insert OID = 804 ( "+" PGNSP PGUID b f f 603 600 603 0 0 box_add - - ));
+DATA(insert OID = 804 ( "+" PGNSP PGUID b f f 603 600 603 0 0 box_add - - "---"));
DESCR("add point to box (translate)");
-DATA(insert OID = 805 ( "-" PGNSP PGUID b f f 603 600 603 0 0 box_sub - - ));
+DATA(insert OID = 805 ( "-" PGNSP PGUID b f f 603 600 603 0 0 box_sub - - "---"));
DESCR("subtract point from box (translate)");
-DATA(insert OID = 806 ( "*" PGNSP PGUID b f f 603 600 603 0 0 box_mul - - ));
+DATA(insert OID = 806 ( "*" PGNSP PGUID b f f 603 600 603 0 0 box_mul - - "---"));
DESCR("multiply box by point (scale)");
-DATA(insert OID = 807 ( "/" PGNSP PGUID b f f 603 600 603 0 0 box_div - - ));
+DATA(insert OID = 807 ( "/" PGNSP PGUID b f f 603 600 603 0 0 box_div - - "---"));
DESCR("divide box by point (scale)");
-DATA(insert OID = 808 ( "?-" PGNSP PGUID b f f 600 600 16 808 0 point_horiz - - ));
+DATA(insert OID = 808 ( "?-" PGNSP PGUID b f f 600 600 16 808 0 point_horiz - - "---"));
DESCR("horizontally aligned");
-DATA(insert OID = 809 ( "?|" PGNSP PGUID b f f 600 600 16 809 0 point_vert - - ));
+DATA(insert OID = 809 ( "?|" PGNSP PGUID b f f 600 600 16 809 0 point_vert - - "---"));
DESCR("vertically aligned");
-DATA(insert OID = 811 ( "=" PGNSP PGUID b t f 704 704 16 811 812 tintervaleq eqsel eqjoinsel ));
+DATA(insert OID = 811 ( "=" PGNSP PGUID b t f 704 704 16 811 812 tintervaleq eqsel eqjoinsel "mhf"));
DESCR("equal");
-DATA(insert OID = 812 ( "<>" PGNSP PGUID b f f 704 704 16 812 811 tintervalne neqsel neqjoinsel ));
+DATA(insert OID = 812 ( "<>" PGNSP PGUID b f f 704 704 16 812 811 tintervalne neqsel neqjoinsel "mhf"));
DESCR("not equal");
-DATA(insert OID = 813 ( "<" PGNSP PGUID b f f 704 704 16 814 816 tintervallt scalarltsel scalarltjoinsel ));
+DATA(insert OID = 813 ( "<" PGNSP PGUID b f f 704 704 16 814 816 tintervallt scalarltsel scalarltjoinsel "mh-"));
DESCR("less than");
-DATA(insert OID = 814 ( ">" PGNSP PGUID b f f 704 704 16 813 815 tintervalgt scalargtsel scalargtjoinsel ));
+DATA(insert OID = 814 ( ">" PGNSP PGUID b f f 704 704 16 813 815 tintervalgt scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than");
-DATA(insert OID = 815 ( "<=" PGNSP PGUID b f f 704 704 16 816 814 tintervalle scalarltsel scalarltjoinsel ));
+DATA(insert OID = 815 ( "<=" PGNSP PGUID b f f 704 704 16 816 814 tintervalle scalarltsel scalarltjoinsel "mh-"));
DESCR("less than or equal");
-DATA(insert OID = 816 ( ">=" PGNSP PGUID b f f 704 704 16 815 813 tintervalge scalargtsel scalargtjoinsel ));
+DATA(insert OID = 816 ( ">=" PGNSP PGUID b f f 704 704 16 815 813 tintervalge scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than or equal");
-DATA(insert OID = 843 ( "*" PGNSP PGUID b f f 790 700 790 845 0 cash_mul_flt4 - - ));
+DATA(insert OID = 843 ( "*" PGNSP PGUID b f f 790 700 790 845 0 cash_mul_flt4 - - "---"));
DESCR("multiply");
-DATA(insert OID = 844 ( "/" PGNSP PGUID b f f 790 700 790 0 0 cash_div_flt4 - - ));
+DATA(insert OID = 844 ( "/" PGNSP PGUID b f f 790 700 790 0 0 cash_div_flt4 - - "---"));
DESCR("divide");
-DATA(insert OID = 845 ( "*" PGNSP PGUID b f f 700 790 790 843 0 flt4_mul_cash - - ));
+DATA(insert OID = 845 ( "*" PGNSP PGUID b f f 700 790 790 843 0 flt4_mul_cash - - "---"));
DESCR("multiply");
-DATA(insert OID = 900 ( "=" PGNSP PGUID b t f 790 790 16 900 901 cash_eq eqsel eqjoinsel ));
+DATA(insert OID = 900 ( "=" PGNSP PGUID b t f 790 790 16 900 901 cash_eq eqsel eqjoinsel "mhf"));
DESCR("equal");
-DATA(insert OID = 901 ( "<>" PGNSP PGUID b f f 790 790 16 901 900 cash_ne neqsel neqjoinsel ));
+DATA(insert OID = 901 ( "<>" PGNSP PGUID b f f 790 790 16 901 900 cash_ne neqsel neqjoinsel "mhf"));
DESCR("not equal");
-DATA(insert OID = 902 ( "<" PGNSP PGUID b f f 790 790 16 903 905 cash_lt scalarltsel scalarltjoinsel ));
+DATA(insert OID = 902 ( "<" PGNSP PGUID b f f 790 790 16 903 905 cash_lt scalarltsel scalarltjoinsel "mh-"));
DESCR("less than");
-DATA(insert OID = 903 ( ">" PGNSP PGUID b f f 790 790 16 902 904 cash_gt scalargtsel scalargtjoinsel ));
+DATA(insert OID = 903 ( ">" PGNSP PGUID b f f 790 790 16 902 904 cash_gt scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than");
-DATA(insert OID = 904 ( "<=" PGNSP PGUID b f f 790 790 16 905 903 cash_le scalarltsel scalarltjoinsel ));
+DATA(insert OID = 904 ( "<=" PGNSP PGUID b f f 790 790 16 905 903 cash_le scalarltsel scalarltjoinsel "mh-"));
DESCR("less than or equal");
-DATA(insert OID = 905 ( ">=" PGNSP PGUID b f f 790 790 16 904 902 cash_ge scalargtsel scalargtjoinsel ));
+DATA(insert OID = 905 ( ">=" PGNSP PGUID b f f 790 790 16 904 902 cash_ge scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than or equal");
-DATA(insert OID = 906 ( "+" PGNSP PGUID b f f 790 790 790 906 0 cash_pl - - ));
+DATA(insert OID = 906 ( "+" PGNSP PGUID b f f 790 790 790 906 0 cash_pl - - "---"));
DESCR("add");
-DATA(insert OID = 907 ( "-" PGNSP PGUID b f f 790 790 790 0 0 cash_mi - - ));
+DATA(insert OID = 907 ( "-" PGNSP PGUID b f f 790 790 790 0 0 cash_mi - - "---"));
DESCR("subtract");
-DATA(insert OID = 908 ( "*" PGNSP PGUID b f f 790 701 790 916 0 cash_mul_flt8 - - ));
+DATA(insert OID = 908 ( "*" PGNSP PGUID b f f 790 701 790 916 0 cash_mul_flt8 - - "---"));
DESCR("multiply");
-DATA(insert OID = 909 ( "/" PGNSP PGUID b f f 790 701 790 0 0 cash_div_flt8 - - ));
+DATA(insert OID = 909 ( "/" PGNSP PGUID b f f 790 701 790 0 0 cash_div_flt8 - - "---"));
DESCR("divide");
-DATA(insert OID = 912 ( "*" PGNSP PGUID b f f 790 23 790 917 0 cash_mul_int4 - - ));
+DATA(insert OID = 912 ( "*" PGNSP PGUID b f f 790 23 790 917 0 cash_mul_int4 - - "---"));
DESCR("multiply");
-DATA(insert OID = 913 ( "/" PGNSP PGUID b f f 790 23 790 0 0 cash_div_int4 - - ));
+DATA(insert OID = 913 ( "/" PGNSP PGUID b f f 790 23 790 0 0 cash_div_int4 - - "---"));
DESCR("divide");
-DATA(insert OID = 914 ( "*" PGNSP PGUID b f f 790 21 790 918 0 cash_mul_int2 - - ));
+DATA(insert OID = 914 ( "*" PGNSP PGUID b f f 790 21 790 918 0 cash_mul_int2 - - "---"));
DESCR("multiply");
-DATA(insert OID = 915 ( "/" PGNSP PGUID b f f 790 21 790 0 0 cash_div_int2 - - ));
+DATA(insert OID = 915 ( "/" PGNSP PGUID b f f 790 21 790 0 0 cash_div_int2 - - "---"));
DESCR("divide");
-DATA(insert OID = 916 ( "*" PGNSP PGUID b f f 701 790 790 908 0 flt8_mul_cash - - ));
+DATA(insert OID = 916 ( "*" PGNSP PGUID b f f 701 790 790 908 0 flt8_mul_cash - - "---"));
DESCR("multiply");
-DATA(insert OID = 917 ( "*" PGNSP PGUID b f f 23 790 790 912 0 int4_mul_cash - - ));
+DATA(insert OID = 917 ( "*" PGNSP PGUID b f f 23 790 790 912 0 int4_mul_cash - - "---"));
DESCR("multiply");
-DATA(insert OID = 918 ( "*" PGNSP PGUID b f f 21 790 790 914 0 int2_mul_cash - - ));
+DATA(insert OID = 918 ( "*" PGNSP PGUID b f f 21 790 790 914 0 int2_mul_cash - - "---"));
DESCR("multiply");
-DATA(insert OID = 3825 ( "/" PGNSP PGUID b f f 790 790 701 0 0 cash_div_cash - - ));
+DATA(insert OID = 3825 ( "/" PGNSP PGUID b f f 790 790 701 0 0 cash_div_cash - - "---"));
DESCR("divide");
-DATA(insert OID = 965 ( "^" PGNSP PGUID b f f 701 701 701 0 0 dpow - - ));
+DATA(insert OID = 965 ( "^" PGNSP PGUID b f f 701 701 701 0 0 dpow - - "---"));
DESCR("exponentiation");
-DATA(insert OID = 966 ( "+" PGNSP PGUID b f f 1034 1033 1034 0 0 aclinsert - - ));
+DATA(insert OID = 966 ( "+" PGNSP PGUID b f f 1034 1033 1034 0 0 aclinsert - - "---"));
DESCR("add/update ACL item");
-DATA(insert OID = 967 ( "-" PGNSP PGUID b f f 1034 1033 1034 0 0 aclremove - - ));
+DATA(insert OID = 967 ( "-" PGNSP PGUID b f f 1034 1033 1034 0 0 aclremove - - "---"));
DESCR("remove ACL item");
-DATA(insert OID = 968 ( "@>" PGNSP PGUID b f f 1034 1033 16 0 0 aclcontains - - ));
+DATA(insert OID = 968 ( "@>" PGNSP PGUID b f f 1034 1033 16 0 0 aclcontains - - "---"));
DESCR("contains");
-DATA(insert OID = 974 ( "=" PGNSP PGUID b f t 1033 1033 16 974 0 aclitemeq eqsel eqjoinsel ));
+DATA(insert OID = 974 ( "=" PGNSP PGUID b f t 1033 1033 16 974 0 aclitemeq eqsel eqjoinsel "mhf"));
DESCR("equal");
/* additional geometric operators - thomas 1997-07-09 */
-DATA(insert OID = 969 ( "@@" PGNSP PGUID l f f 0 601 600 0 0 lseg_center - - ));
+DATA(insert OID = 969 ( "@@" PGNSP PGUID l f f 0 601 600 0 0 lseg_center - - "---"));
DESCR("center of");
-DATA(insert OID = 970 ( "@@" PGNSP PGUID l f f 0 602 600 0 0 path_center - - ));
+DATA(insert OID = 970 ( "@@" PGNSP PGUID l f f 0 602 600 0 0 path_center - - "---"));
DESCR("center of");
-DATA(insert OID = 971 ( "@@" PGNSP PGUID l f f 0 604 600 0 0 poly_center - - ));
+DATA(insert OID = 971 ( "@@" PGNSP PGUID l f f 0 604 600 0 0 poly_center - - "---"));
DESCR("center of");
-DATA(insert OID = 1054 ( "=" PGNSP PGUID b t t 1042 1042 16 1054 1057 bpchareq eqsel eqjoinsel ));
+DATA(insert OID = 1054 ( "=" PGNSP PGUID b t t 1042 1042 16 1054 1057 bpchareq eqsel eqjoinsel "mhf"));
DESCR("equal");
-DATA(insert OID = 1055 ( "~" PGNSP PGUID b f f 1042 25 16 0 1056 bpcharregexeq regexeqsel regexeqjoinsel ));
+DATA(insert OID = 1055 ( "~" PGNSP PGUID b f f 1042 25 16 0 1056 bpcharregexeq regexeqsel regexeqjoinsel "mhf"));
DESCR("matches regular expression, case-sensitive");
#define OID_BPCHAR_REGEXEQ_OP 1055
-DATA(insert OID = 1056 ( "!~" PGNSP PGUID b f f 1042 25 16 0 1055 bpcharregexne regexnesel regexnejoinsel ));
+DATA(insert OID = 1056 ( "!~" PGNSP PGUID b f f 1042 25 16 0 1055 bpcharregexne regexnesel regexnejoinsel "---"));
DESCR("does not match regular expression, case-sensitive");
-DATA(insert OID = 1057 ( "<>" PGNSP PGUID b f f 1042 1042 16 1057 1054 bpcharne neqsel neqjoinsel ));
+DATA(insert OID = 1057 ( "<>" PGNSP PGUID b f f 1042 1042 16 1057 1054 bpcharne neqsel neqjoinsel "mhf"));
DESCR("not equal");
-DATA(insert OID = 1058 ( "<" PGNSP PGUID b f f 1042 1042 16 1060 1061 bpcharlt scalarltsel scalarltjoinsel ));
+DATA(insert OID = 1058 ( "<" PGNSP PGUID b f f 1042 1042 16 1060 1061 bpcharlt scalarltsel scalarltjoinsel "mh-"));
DESCR("less than");
-DATA(insert OID = 1059 ( "<=" PGNSP PGUID b f f 1042 1042 16 1061 1060 bpcharle scalarltsel scalarltjoinsel ));
+DATA(insert OID = 1059 ( "<=" PGNSP PGUID b f f 1042 1042 16 1061 1060 bpcharle scalarltsel scalarltjoinsel "mh-"));
DESCR("less than or equal");
-DATA(insert OID = 1060 ( ">" PGNSP PGUID b f f 1042 1042 16 1058 1059 bpchargt scalargtsel scalargtjoinsel ));
+DATA(insert OID = 1060 ( ">" PGNSP PGUID b f f 1042 1042 16 1058 1059 bpchargt scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than");
-DATA(insert OID = 1061 ( ">=" PGNSP PGUID b f f 1042 1042 16 1059 1058 bpcharge scalargtsel scalargtjoinsel ));
+DATA(insert OID = 1061 ( ">=" PGNSP PGUID b f f 1042 1042 16 1059 1058 bpcharge scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than or equal");
/* generic array comparison operators */
-DATA(insert OID = 1070 ( "=" PGNSP PGUID b t t 2277 2277 16 1070 1071 array_eq eqsel eqjoinsel ));
+DATA(insert OID = 1070 ( "=" PGNSP PGUID b t t 2277 2277 16 1070 1071 array_eq eqsel eqjoinsel "mhf"));
DESCR("equal");
#define ARRAY_EQ_OP 1070
-DATA(insert OID = 1071 ( "<>" PGNSP PGUID b f f 2277 2277 16 1071 1070 array_ne neqsel neqjoinsel ));
+DATA(insert OID = 1071 ( "<>" PGNSP PGUID b f f 2277 2277 16 1071 1070 array_ne neqsel neqjoinsel "mhf"));
DESCR("not equal");
-DATA(insert OID = 1072 ( "<" PGNSP PGUID b f f 2277 2277 16 1073 1075 array_lt scalarltsel scalarltjoinsel ));
+DATA(insert OID = 1072 ( "<" PGNSP PGUID b f f 2277 2277 16 1073 1075 array_lt scalarltsel scalarltjoinsel "mh-"));
DESCR("less than");
#define ARRAY_LT_OP 1072
-DATA(insert OID = 1073 ( ">" PGNSP PGUID b f f 2277 2277 16 1072 1074 array_gt scalargtsel scalargtjoinsel ));
+DATA(insert OID = 1073 ( ">" PGNSP PGUID b f f 2277 2277 16 1072 1074 array_gt scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than");
#define ARRAY_GT_OP 1073
-DATA(insert OID = 1074 ( "<=" PGNSP PGUID b f f 2277 2277 16 1075 1073 array_le scalarltsel scalarltjoinsel ));
+DATA(insert OID = 1074 ( "<=" PGNSP PGUID b f f 2277 2277 16 1075 1073 array_le scalarltsel scalarltjoinsel "mh-"));
DESCR("less than or equal");
-DATA(insert OID = 1075 ( ">=" PGNSP PGUID b f f 2277 2277 16 1074 1072 array_ge scalargtsel scalargtjoinsel ));
+DATA(insert OID = 1075 ( ">=" PGNSP PGUID b f f 2277 2277 16 1074 1072 array_ge scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than or equal");
/* date operators */
-DATA(insert OID = 1076 ( "+" PGNSP PGUID b f f 1082 1186 1114 2551 0 date_pl_interval - - ));
+DATA(insert OID = 1076 ( "+" PGNSP PGUID b f f 1082 1186 1114 2551 0 date_pl_interval - - "---"));
DESCR("add");
-DATA(insert OID = 1077 ( "-" PGNSP PGUID b f f 1082 1186 1114 0 0 date_mi_interval - - ));
+DATA(insert OID = 1077 ( "-" PGNSP PGUID b f f 1082 1186 1114 0 0 date_mi_interval - - "---"));
DESCR("subtract");
-DATA(insert OID = 1093 ( "=" PGNSP PGUID b t t 1082 1082 16 1093 1094 date_eq eqsel eqjoinsel ));
+DATA(insert OID = 1093 ( "=" PGNSP PGUID b t t 1082 1082 16 1093 1094 date_eq eqsel eqjoinsel "mhf"));
DESCR("equal");
-DATA(insert OID = 1094 ( "<>" PGNSP PGUID b f f 1082 1082 16 1094 1093 date_ne neqsel neqjoinsel ));
+DATA(insert OID = 1094 ( "<>" PGNSP PGUID b f f 1082 1082 16 1094 1093 date_ne neqsel neqjoinsel "mhf"));
DESCR("not equal");
-DATA(insert OID = 1095 ( "<" PGNSP PGUID b f f 1082 1082 16 1097 1098 date_lt scalarltsel scalarltjoinsel ));
+DATA(insert OID = 1095 ( "<" PGNSP PGUID b f f 1082 1082 16 1097 1098 date_lt scalarltsel scalarltjoinsel "mh-"));
DESCR("less than");
-DATA(insert OID = 1096 ( "<=" PGNSP PGUID b f f 1082 1082 16 1098 1097 date_le scalarltsel scalarltjoinsel ));
+DATA(insert OID = 1096 ( "<=" PGNSP PGUID b f f 1082 1082 16 1098 1097 date_le scalarltsel scalarltjoinsel "mh-"));
DESCR("less than or equal");
-DATA(insert OID = 1097 ( ">" PGNSP PGUID b f f 1082 1082 16 1095 1096 date_gt scalargtsel scalargtjoinsel ));
+DATA(insert OID = 1097 ( ">" PGNSP PGUID b f f 1082 1082 16 1095 1096 date_gt scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than");
-DATA(insert OID = 1098 ( ">=" PGNSP PGUID b f f 1082 1082 16 1096 1095 date_ge scalargtsel scalargtjoinsel ));
+DATA(insert OID = 1098 ( ">=" PGNSP PGUID b f f 1082 1082 16 1096 1095 date_ge scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than or equal");
-DATA(insert OID = 1099 ( "-" PGNSP PGUID b f f 1082 1082 23 0 0 date_mi - - ));
+DATA(insert OID = 1099 ( "-" PGNSP PGUID b f f 1082 1082 23 0 0 date_mi - - "---"));
DESCR("subtract");
-DATA(insert OID = 1100 ( "+" PGNSP PGUID b f f 1082 23 1082 2555 0 date_pli - - ));
+DATA(insert OID = 1100 ( "+" PGNSP PGUID b f f 1082 23 1082 2555 0 date_pli - - "---"));
DESCR("add");
-DATA(insert OID = 1101 ( "-" PGNSP PGUID b f f 1082 23 1082 0 0 date_mii - - ));
+DATA(insert OID = 1101 ( "-" PGNSP PGUID b f f 1082 23 1082 0 0 date_mii - - "---"));
DESCR("subtract");
/* time operators */
-DATA(insert OID = 1108 ( "=" PGNSP PGUID b t t 1083 1083 16 1108 1109 time_eq eqsel eqjoinsel ));
+DATA(insert OID = 1108 ( "=" PGNSP PGUID b t t 1083 1083 16 1108 1109 time_eq eqsel eqjoinsel "mhf"));
DESCR("equal");
-DATA(insert OID = 1109 ( "<>" PGNSP PGUID b f f 1083 1083 16 1109 1108 time_ne neqsel neqjoinsel ));
+DATA(insert OID = 1109 ( "<>" PGNSP PGUID b f f 1083 1083 16 1109 1108 time_ne neqsel neqjoinsel "mhf"));
DESCR("not equal");
-DATA(insert OID = 1110 ( "<" PGNSP PGUID b f f 1083 1083 16 1112 1113 time_lt scalarltsel scalarltjoinsel ));
+DATA(insert OID = 1110 ( "<" PGNSP PGUID b f f 1083 1083 16 1112 1113 time_lt scalarltsel scalarltjoinsel "mh-"));
DESCR("less than");
-DATA(insert OID = 1111 ( "<=" PGNSP PGUID b f f 1083 1083 16 1113 1112 time_le scalarltsel scalarltjoinsel ));
+DATA(insert OID = 1111 ( "<=" PGNSP PGUID b f f 1083 1083 16 1113 1112 time_le scalarltsel scalarltjoinsel "mh-"));
DESCR("less than or equal");
-DATA(insert OID = 1112 ( ">" PGNSP PGUID b f f 1083 1083 16 1110 1111 time_gt scalargtsel scalargtjoinsel ));
+DATA(insert OID = 1112 ( ">" PGNSP PGUID b f f 1083 1083 16 1110 1111 time_gt scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than");
-DATA(insert OID = 1113 ( ">=" PGNSP PGUID b f f 1083 1083 16 1111 1110 time_ge scalargtsel scalargtjoinsel ));
+DATA(insert OID = 1113 ( ">=" PGNSP PGUID b f f 1083 1083 16 1111 1110 time_ge scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than or equal");
/* timetz operators */
-DATA(insert OID = 1550 ( "=" PGNSP PGUID b t t 1266 1266 16 1550 1551 timetz_eq eqsel eqjoinsel ));
+DATA(insert OID = 1550 ( "=" PGNSP PGUID b t t 1266 1266 16 1550 1551 timetz_eq eqsel eqjoinsel "mhf"));
DESCR("equal");
-DATA(insert OID = 1551 ( "<>" PGNSP PGUID b f f 1266 1266 16 1551 1550 timetz_ne neqsel neqjoinsel ));
+DATA(insert OID = 1551 ( "<>" PGNSP PGUID b f f 1266 1266 16 1551 1550 timetz_ne neqsel neqjoinsel "mhf"));
DESCR("not equal");
-DATA(insert OID = 1552 ( "<" PGNSP PGUID b f f 1266 1266 16 1554 1555 timetz_lt scalarltsel scalarltjoinsel ));
+DATA(insert OID = 1552 ( "<" PGNSP PGUID b f f 1266 1266 16 1554 1555 timetz_lt scalarltsel scalarltjoinsel "mh-"));
DESCR("less than");
-DATA(insert OID = 1553 ( "<=" PGNSP PGUID b f f 1266 1266 16 1555 1554 timetz_le scalarltsel scalarltjoinsel ));
+DATA(insert OID = 1553 ( "<=" PGNSP PGUID b f f 1266 1266 16 1555 1554 timetz_le scalarltsel scalarltjoinsel "mh-"));
DESCR("less than or equal");
-DATA(insert OID = 1554 ( ">" PGNSP PGUID b f f 1266 1266 16 1552 1553 timetz_gt scalargtsel scalargtjoinsel ));
+DATA(insert OID = 1554 ( ">" PGNSP PGUID b f f 1266 1266 16 1552 1553 timetz_gt scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than");
-DATA(insert OID = 1555 ( ">=" PGNSP PGUID b f f 1266 1266 16 1553 1552 timetz_ge scalargtsel scalargtjoinsel ));
+DATA(insert OID = 1555 ( ">=" PGNSP PGUID b f f 1266 1266 16 1553 1552 timetz_ge scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than or equal");
/* float48 operators */
-DATA(insert OID = 1116 ( "+" PGNSP PGUID b f f 700 701 701 1126 0 float48pl - - ));
+DATA(insert OID = 1116 ( "+" PGNSP PGUID b f f 700 701 701 1126 0 float48pl - - "---"));
DESCR("add");
-DATA(insert OID = 1117 ( "-" PGNSP PGUID b f f 700 701 701 0 0 float48mi - - ));
+DATA(insert OID = 1117 ( "-" PGNSP PGUID b f f 700 701 701 0 0 float48mi - - "---"));
DESCR("subtract");
-DATA(insert OID = 1118 ( "/" PGNSP PGUID b f f 700 701 701 0 0 float48div - - ));
+DATA(insert OID = 1118 ( "/" PGNSP PGUID b f f 700 701 701 0 0 float48div - - "---"));
DESCR("divide");
-DATA(insert OID = 1119 ( "*" PGNSP PGUID b f f 700 701 701 1129 0 float48mul - - ));
+DATA(insert OID = 1119 ( "*" PGNSP PGUID b f f 700 701 701 1129 0 float48mul - - "---"));
DESCR("multiply");
-DATA(insert OID = 1120 ( "=" PGNSP PGUID b t t 700 701 16 1130 1121 float48eq eqsel eqjoinsel ));
+DATA(insert OID = 1120 ( "=" PGNSP PGUID b t t 700 701 16 1130 1121 float48eq eqsel eqjoinsel "mhf"));
DESCR("equal");
-DATA(insert OID = 1121 ( "<>" PGNSP PGUID b f f 700 701 16 1131 1120 float48ne neqsel neqjoinsel ));
+DATA(insert OID = 1121 ( "<>" PGNSP PGUID b f f 700 701 16 1131 1120 float48ne neqsel neqjoinsel "mhf"));
DESCR("not equal");
-DATA(insert OID = 1122 ( "<" PGNSP PGUID b f f 700 701 16 1133 1125 float48lt scalarltsel scalarltjoinsel ));
+DATA(insert OID = 1122 ( "<" PGNSP PGUID b f f 700 701 16 1133 1125 float48lt scalarltsel scalarltjoinsel "mh-"));
DESCR("less than");
-DATA(insert OID = 1123 ( ">" PGNSP PGUID b f f 700 701 16 1132 1124 float48gt scalargtsel scalargtjoinsel ));
+DATA(insert OID = 1123 ( ">" PGNSP PGUID b f f 700 701 16 1132 1124 float48gt scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than");
-DATA(insert OID = 1124 ( "<=" PGNSP PGUID b f f 700 701 16 1135 1123 float48le scalarltsel scalarltjoinsel ));
+DATA(insert OID = 1124 ( "<=" PGNSP PGUID b f f 700 701 16 1135 1123 float48le scalarltsel scalarltjoinsel "mh-"));
DESCR("less than or equal");
-DATA(insert OID = 1125 ( ">=" PGNSP PGUID b f f 700 701 16 1134 1122 float48ge scalargtsel scalargtjoinsel ));
+DATA(insert OID = 1125 ( ">=" PGNSP PGUID b f f 700 701 16 1134 1122 float48ge scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than or equal");
/* float84 operators */
-DATA(insert OID = 1126 ( "+" PGNSP PGUID b f f 701 700 701 1116 0 float84pl - - ));
+DATA(insert OID = 1126 ( "+" PGNSP PGUID b f f 701 700 701 1116 0 float84pl - - "---"));
DESCR("add");
-DATA(insert OID = 1127 ( "-" PGNSP PGUID b f f 701 700 701 0 0 float84mi - - ));
+DATA(insert OID = 1127 ( "-" PGNSP PGUID b f f 701 700 701 0 0 float84mi - - "---"));
DESCR("subtract");
-DATA(insert OID = 1128 ( "/" PGNSP PGUID b f f 701 700 701 0 0 float84div - - ));
+DATA(insert OID = 1128 ( "/" PGNSP PGUID b f f 701 700 701 0 0 float84div - - "---"));
DESCR("divide");
-DATA(insert OID = 1129 ( "*" PGNSP PGUID b f f 701 700 701 1119 0 float84mul - - ));
+DATA(insert OID = 1129 ( "*" PGNSP PGUID b f f 701 700 701 1119 0 float84mul - - "---"));
DESCR("multiply");
-DATA(insert OID = 1130 ( "=" PGNSP PGUID b t t 701 700 16 1120 1131 float84eq eqsel eqjoinsel ));
+DATA(insert OID = 1130 ( "=" PGNSP PGUID b t t 701 700 16 1120 1131 float84eq eqsel eqjoinsel "mhf"));
DESCR("equal");
-DATA(insert OID = 1131 ( "<>" PGNSP PGUID b f f 701 700 16 1121 1130 float84ne neqsel neqjoinsel ));
+DATA(insert OID = 1131 ( "<>" PGNSP PGUID b f f 701 700 16 1121 1130 float84ne neqsel neqjoinsel "mhf"));
DESCR("not equal");
-DATA(insert OID = 1132 ( "<" PGNSP PGUID b f f 701 700 16 1123 1135 float84lt scalarltsel scalarltjoinsel ));
+DATA(insert OID = 1132 ( "<" PGNSP PGUID b f f 701 700 16 1123 1135 float84lt scalarltsel scalarltjoinsel "mh-"));
DESCR("less than");
-DATA(insert OID = 1133 ( ">" PGNSP PGUID b f f 701 700 16 1122 1134 float84gt scalargtsel scalargtjoinsel ));
+DATA(insert OID = 1133 ( ">" PGNSP PGUID b f f 701 700 16 1122 1134 float84gt scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than");
-DATA(insert OID = 1134 ( "<=" PGNSP PGUID b f f 701 700 16 1125 1133 float84le scalarltsel scalarltjoinsel ));
+DATA(insert OID = 1134 ( "<=" PGNSP PGUID b f f 701 700 16 1125 1133 float84le scalarltsel scalarltjoinsel "mh-"));
DESCR("less than or equal");
-DATA(insert OID = 1135 ( ">=" PGNSP PGUID b f f 701 700 16 1124 1132 float84ge scalargtsel scalargtjoinsel ));
+DATA(insert OID = 1135 ( ">=" PGNSP PGUID b f f 701 700 16 1124 1132 float84ge scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than or equal");
/* LIKE hacks by Keith Parks. */
-DATA(insert OID = 1207 ( "~~" PGNSP PGUID b f f 19 25 16 0 1208 namelike likesel likejoinsel ));
+DATA(insert OID = 1207 ( "~~" PGNSP PGUID b f f 19 25 16 0 1208 namelike likesel likejoinsel "---"));
DESCR("matches LIKE expression");
#define OID_NAME_LIKE_OP 1207
-DATA(insert OID = 1208 ( "!~~" PGNSP PGUID b f f 19 25 16 0 1207 namenlike nlikesel nlikejoinsel ));
+DATA(insert OID = 1208 ( "!~~" PGNSP PGUID b f f 19 25 16 0 1207 namenlike nlikesel nlikejoinsel "---"));
DESCR("does not match LIKE expression");
-DATA(insert OID = 1209 ( "~~" PGNSP PGUID b f f 25 25 16 0 1210 textlike likesel likejoinsel ));
+DATA(insert OID = 1209 ( "~~" PGNSP PGUID b f f 25 25 16 0 1210 textlike likesel likejoinsel "---"));
DESCR("matches LIKE expression");
#define OID_TEXT_LIKE_OP 1209
-DATA(insert OID = 1210 ( "!~~" PGNSP PGUID b f f 25 25 16 0 1209 textnlike nlikesel nlikejoinsel ));
+DATA(insert OID = 1210 ( "!~~" PGNSP PGUID b f f 25 25 16 0 1209 textnlike nlikesel nlikejoinsel "---"));
DESCR("does not match LIKE expression");
-DATA(insert OID = 1211 ( "~~" PGNSP PGUID b f f 1042 25 16 0 1212 bpcharlike likesel likejoinsel ));
+DATA(insert OID = 1211 ( "~~" PGNSP PGUID b f f 1042 25 16 0 1212 bpcharlike likesel likejoinsel "---"));
DESCR("matches LIKE expression");
#define OID_BPCHAR_LIKE_OP 1211
-DATA(insert OID = 1212 ( "!~~" PGNSP PGUID b f f 1042 25 16 0 1211 bpcharnlike nlikesel nlikejoinsel ));
+DATA(insert OID = 1212 ( "!~~" PGNSP PGUID b f f 1042 25 16 0 1211 bpcharnlike nlikesel nlikejoinsel "---"));
DESCR("does not match LIKE expression");
/* case-insensitive regex hacks */
-DATA(insert OID = 1226 ( "~*" PGNSP PGUID b f f 19 25 16 0 1227 nameicregexeq icregexeqsel icregexeqjoinsel ));
+DATA(insert OID = 1226 ( "~*" PGNSP PGUID b f f 19 25 16 0 1227 nameicregexeq icregexeqsel icregexeqjoinsel "mhf"));
DESCR("matches regular expression, case-insensitive");
#define OID_NAME_ICREGEXEQ_OP 1226
-DATA(insert OID = 1227 ( "!~*" PGNSP PGUID b f f 19 25 16 0 1226 nameicregexne icregexnesel icregexnejoinsel ));
+DATA(insert OID = 1227 ( "!~*" PGNSP PGUID b f f 19 25 16 0 1226 nameicregexne icregexnesel icregexnejoinsel "---"));
DESCR("does not match regular expression, case-insensitive");
-DATA(insert OID = 1228 ( "~*" PGNSP PGUID b f f 25 25 16 0 1229 texticregexeq icregexeqsel icregexeqjoinsel ));
+DATA(insert OID = 1228 ( "~*" PGNSP PGUID b f f 25 25 16 0 1229 texticregexeq icregexeqsel icregexeqjoinsel "mhf"));
DESCR("matches regular expression, case-insensitive");
#define OID_TEXT_ICREGEXEQ_OP 1228
-DATA(insert OID = 1229 ( "!~*" PGNSP PGUID b f f 25 25 16 0 1228 texticregexne icregexnesel icregexnejoinsel ));
+DATA(insert OID = 1229 ( "!~*" PGNSP PGUID b f f 25 25 16 0 1228 texticregexne icregexnesel icregexnejoinsel "---"));
DESCR("does not match regular expression, case-insensitive");
-DATA(insert OID = 1234 ( "~*" PGNSP PGUID b f f 1042 25 16 0 1235 bpcharicregexeq icregexeqsel icregexeqjoinsel ));
+DATA(insert OID = 1234 ( "~*" PGNSP PGUID b f f 1042 25 16 0 1235 bpcharicregexeq icregexeqsel icregexeqjoinsel "mhf"));
DESCR("matches regular expression, case-insensitive");
#define OID_BPCHAR_ICREGEXEQ_OP 1234
-DATA(insert OID = 1235 ( "!~*" PGNSP PGUID b f f 1042 25 16 0 1234 bpcharicregexne icregexnesel icregexnejoinsel ));
+DATA(insert OID = 1235 ( "!~*" PGNSP PGUID b f f 1042 25 16 0 1234 bpcharicregexne icregexnesel icregexnejoinsel "---"));
DESCR("does not match regular expression, case-insensitive");
/* timestamptz operators */
-DATA(insert OID = 1320 ( "=" PGNSP PGUID b t t 1184 1184 16 1320 1321 timestamptz_eq eqsel eqjoinsel ));
+DATA(insert OID = 1320 ( "=" PGNSP PGUID b t t 1184 1184 16 1320 1321 timestamptz_eq eqsel eqjoinsel "mhf"));
DESCR("equal");
-DATA(insert OID = 1321 ( "<>" PGNSP PGUID b f f 1184 1184 16 1321 1320 timestamptz_ne neqsel neqjoinsel ));
+DATA(insert OID = 1321 ( "<>" PGNSP PGUID b f f 1184 1184 16 1321 1320 timestamptz_ne neqsel neqjoinsel "mhf"));
DESCR("not equal");
-DATA(insert OID = 1322 ( "<" PGNSP PGUID b f f 1184 1184 16 1324 1325 timestamptz_lt scalarltsel scalarltjoinsel ));
+DATA(insert OID = 1322 ( "<" PGNSP PGUID b f f 1184 1184 16 1324 1325 timestamptz_lt scalarltsel scalarltjoinsel "mh-"));
DESCR("less than");
-DATA(insert OID = 1323 ( "<=" PGNSP PGUID b f f 1184 1184 16 1325 1324 timestamptz_le scalarltsel scalarltjoinsel ));
+DATA(insert OID = 1323 ( "<=" PGNSP PGUID b f f 1184 1184 16 1325 1324 timestamptz_le scalarltsel scalarltjoinsel "mh-"));
DESCR("less than or equal");
-DATA(insert OID = 1324 ( ">" PGNSP PGUID b f f 1184 1184 16 1322 1323 timestamptz_gt scalargtsel scalargtjoinsel ));
+DATA(insert OID = 1324 ( ">" PGNSP PGUID b f f 1184 1184 16 1322 1323 timestamptz_gt scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than");
-DATA(insert OID = 1325 ( ">=" PGNSP PGUID b f f 1184 1184 16 1323 1322 timestamptz_ge scalargtsel scalargtjoinsel ));
+DATA(insert OID = 1325 ( ">=" PGNSP PGUID b f f 1184 1184 16 1323 1322 timestamptz_ge scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than or equal");
-DATA(insert OID = 1327 ( "+" PGNSP PGUID b f f 1184 1186 1184 2554 0 timestamptz_pl_interval - - ));
+DATA(insert OID = 1327 ( "+" PGNSP PGUID b f f 1184 1186 1184 2554 0 timestamptz_pl_interval - - "---"));
DESCR("add");
-DATA(insert OID = 1328 ( "-" PGNSP PGUID b f f 1184 1184 1186 0 0 timestamptz_mi - - ));
+DATA(insert OID = 1328 ( "-" PGNSP PGUID b f f 1184 1184 1186 0 0 timestamptz_mi - - "---"));
DESCR("subtract");
-DATA(insert OID = 1329 ( "-" PGNSP PGUID b f f 1184 1186 1184 0 0 timestamptz_mi_interval - - ));
+DATA(insert OID = 1329 ( "-" PGNSP PGUID b f f 1184 1186 1184 0 0 timestamptz_mi_interval - - "---"));
DESCR("subtract");
/* interval operators */
-DATA(insert OID = 1330 ( "=" PGNSP PGUID b t t 1186 1186 16 1330 1331 interval_eq eqsel eqjoinsel ));
+DATA(insert OID = 1330 ( "=" PGNSP PGUID b t t 1186 1186 16 1330 1331 interval_eq eqsel eqjoinsel "mhf"));
DESCR("equal");
-DATA(insert OID = 1331 ( "<>" PGNSP PGUID b f f 1186 1186 16 1331 1330 interval_ne neqsel neqjoinsel ));
+DATA(insert OID = 1331 ( "<>" PGNSP PGUID b f f 1186 1186 16 1331 1330 interval_ne neqsel neqjoinsel "mhf"));
DESCR("not equal");
-DATA(insert OID = 1332 ( "<" PGNSP PGUID b f f 1186 1186 16 1334 1335 interval_lt scalarltsel scalarltjoinsel ));
+DATA(insert OID = 1332 ( "<" PGNSP PGUID b f f 1186 1186 16 1334 1335 interval_lt scalarltsel scalarltjoinsel "mh-"));
DESCR("less than");
-DATA(insert OID = 1333 ( "<=" PGNSP PGUID b f f 1186 1186 16 1335 1334 interval_le scalarltsel scalarltjoinsel ));
+DATA(insert OID = 1333 ( "<=" PGNSP PGUID b f f 1186 1186 16 1335 1334 interval_le scalarltsel scalarltjoinsel "mh-"));
DESCR("less than or equal");
-DATA(insert OID = 1334 ( ">" PGNSP PGUID b f f 1186 1186 16 1332 1333 interval_gt scalargtsel scalargtjoinsel ));
+DATA(insert OID = 1334 ( ">" PGNSP PGUID b f f 1186 1186 16 1332 1333 interval_gt scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than");
-DATA(insert OID = 1335 ( ">=" PGNSP PGUID b f f 1186 1186 16 1333 1332 interval_ge scalargtsel scalargtjoinsel ));
+DATA(insert OID = 1335 ( ">=" PGNSP PGUID b f f 1186 1186 16 1333 1332 interval_ge scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than or equal");
-DATA(insert OID = 1336 ( "-" PGNSP PGUID l f f 0 1186 1186 0 0 interval_um - - ));
+DATA(insert OID = 1336 ( "-" PGNSP PGUID l f f 0 1186 1186 0 0 interval_um - - "---"));
DESCR("negate");
-DATA(insert OID = 1337 ( "+" PGNSP PGUID b f f 1186 1186 1186 1337 0 interval_pl - - ));
+DATA(insert OID = 1337 ( "+" PGNSP PGUID b f f 1186 1186 1186 1337 0 interval_pl - - "---"));
DESCR("add");
-DATA(insert OID = 1338 ( "-" PGNSP PGUID b f f 1186 1186 1186 0 0 interval_mi - - ));
+DATA(insert OID = 1338 ( "-" PGNSP PGUID b f f 1186 1186 1186 0 0 interval_mi - - "---"));
DESCR("subtract");
-DATA(insert OID = 1360 ( "+" PGNSP PGUID b f f 1082 1083 1114 1363 0 datetime_pl - - ));
+DATA(insert OID = 1360 ( "+" PGNSP PGUID b f f 1082 1083 1114 1363 0 datetime_pl - - "---"));
DESCR("convert date and time to timestamp");
-DATA(insert OID = 1361 ( "+" PGNSP PGUID b f f 1082 1266 1184 1366 0 datetimetz_pl - - ));
+DATA(insert OID = 1361 ( "+" PGNSP PGUID b f f 1082 1266 1184 1366 0 datetimetz_pl - - "---"));
DESCR("convert date and time with time zone to timestamp with time zone");
-DATA(insert OID = 1363 ( "+" PGNSP PGUID b f f 1083 1082 1114 1360 0 timedate_pl - - ));
+DATA(insert OID = 1363 ( "+" PGNSP PGUID b f f 1083 1082 1114 1360 0 timedate_pl - - "---"));
DESCR("convert time and date to timestamp");
-DATA(insert OID = 1366 ( "+" PGNSP PGUID b f f 1266 1082 1184 1361 0 timetzdate_pl - - ));
+DATA(insert OID = 1366 ( "+" PGNSP PGUID b f f 1266 1082 1184 1361 0 timetzdate_pl - - "---"));
DESCR("convert time with time zone and date to timestamp with time zone");
-DATA(insert OID = 1399 ( "-" PGNSP PGUID b f f 1083 1083 1186 0 0 time_mi_time - - ));
+DATA(insert OID = 1399 ( "-" PGNSP PGUID b f f 1083 1083 1186 0 0 time_mi_time - - "---"));
DESCR("subtract");
/* additional geometric operators - thomas 97/04/18 */
-DATA(insert OID = 1420 ( "@@" PGNSP PGUID l f f 0 718 600 0 0 circle_center - - ));
+DATA(insert OID = 1420 ( "@@" PGNSP PGUID l f f 0 718 600 0 0 circle_center - - "---"));
DESCR("center of");
-DATA(insert OID = 1500 ( "=" PGNSP PGUID b f f 718 718 16 1500 1501 circle_eq eqsel eqjoinsel ));
+DATA(insert OID = 1500 ( "=" PGNSP PGUID b f f 718 718 16 1500 1501 circle_eq eqsel eqjoinsel "mhf"));
DESCR("equal by area");
-DATA(insert OID = 1501 ( "<>" PGNSP PGUID b f f 718 718 16 1501 1500 circle_ne neqsel neqjoinsel ));
+DATA(insert OID = 1501 ( "<>" PGNSP PGUID b f f 718 718 16 1501 1500 circle_ne neqsel neqjoinsel "mhf"));
DESCR("not equal by area");
-DATA(insert OID = 1502 ( "<" PGNSP PGUID b f f 718 718 16 1503 1505 circle_lt areasel areajoinsel ));
+DATA(insert OID = 1502 ( "<" PGNSP PGUID b f f 718 718 16 1503 1505 circle_lt areasel areajoinsel "---"));
DESCR("less than by area");
-DATA(insert OID = 1503 ( ">" PGNSP PGUID b f f 718 718 16 1502 1504 circle_gt areasel areajoinsel ));
+DATA(insert OID = 1503 ( ">" PGNSP PGUID b f f 718 718 16 1502 1504 circle_gt areasel areajoinsel "---"));
DESCR("greater than by area");
-DATA(insert OID = 1504 ( "<=" PGNSP PGUID b f f 718 718 16 1505 1503 circle_le areasel areajoinsel ));
+DATA(insert OID = 1504 ( "<=" PGNSP PGUID b f f 718 718 16 1505 1503 circle_le areasel areajoinsel "---"));
DESCR("less than or equal by area");
-DATA(insert OID = 1505 ( ">=" PGNSP PGUID b f f 718 718 16 1504 1502 circle_ge areasel areajoinsel ));
+DATA(insert OID = 1505 ( ">=" PGNSP PGUID b f f 718 718 16 1504 1502 circle_ge areasel areajoinsel "---"));
DESCR("greater than or equal by area");
-DATA(insert OID = 1506 ( "<<" PGNSP PGUID b f f 718 718 16 0 0 circle_left positionsel positionjoinsel ));
+DATA(insert OID = 1506 ( "<<" PGNSP PGUID b f f 718 718 16 0 0 circle_left positionsel positionjoinsel "---"));
DESCR("is left of");
-DATA(insert OID = 1507 ( "&<" PGNSP PGUID b f f 718 718 16 0 0 circle_overleft positionsel positionjoinsel ));
+DATA(insert OID = 1507 ( "&<" PGNSP PGUID b f f 718 718 16 0 0 circle_overleft positionsel positionjoinsel "---"));
DESCR("overlaps or is left of");
-DATA(insert OID = 1508 ( "&>" PGNSP PGUID b f f 718 718 16 0 0 circle_overright positionsel positionjoinsel ));
+DATA(insert OID = 1508 ( "&>" PGNSP PGUID b f f 718 718 16 0 0 circle_overright positionsel positionjoinsel "---"));
DESCR("overlaps or is right of");
-DATA(insert OID = 1509 ( ">>" PGNSP PGUID b f f 718 718 16 0 0 circle_right positionsel positionjoinsel ));
+DATA(insert OID = 1509 ( ">>" PGNSP PGUID b f f 718 718 16 0 0 circle_right positionsel positionjoinsel "---"));
DESCR("is right of");
-DATA(insert OID = 1510 ( "<@" PGNSP PGUID b f f 718 718 16 1511 0 circle_contained contsel contjoinsel ));
+DATA(insert OID = 1510 ( "<@" PGNSP PGUID b f f 718 718 16 1511 0 circle_contained contsel contjoinsel "---"));
DESCR("is contained by");
-DATA(insert OID = 1511 ( "@>" PGNSP PGUID b f f 718 718 16 1510 0 circle_contain contsel contjoinsel ));
+DATA(insert OID = 1511 ( "@>" PGNSP PGUID b f f 718 718 16 1510 0 circle_contain contsel contjoinsel "---"));
DESCR("contains");
-DATA(insert OID = 1512 ( "~=" PGNSP PGUID b f f 718 718 16 1512 0 circle_same eqsel eqjoinsel ));
+DATA(insert OID = 1512 ( "~=" PGNSP PGUID b f f 718 718 16 1512 0 circle_same eqsel eqjoinsel "mhf"));
DESCR("same as");
-DATA(insert OID = 1513 ( "&&" PGNSP PGUID b f f 718 718 16 1513 0 circle_overlap areasel areajoinsel ));
+DATA(insert OID = 1513 ( "&&" PGNSP PGUID b f f 718 718 16 1513 0 circle_overlap areasel areajoinsel "---"));
DESCR("overlaps");
-DATA(insert OID = 1514 ( "|>>" PGNSP PGUID b f f 718 718 16 0 0 circle_above positionsel positionjoinsel ));
+DATA(insert OID = 1514 ( "|>>" PGNSP PGUID b f f 718 718 16 0 0 circle_above positionsel positionjoinsel "---"));
DESCR("is above");
-DATA(insert OID = 1515 ( "<<|" PGNSP PGUID b f f 718 718 16 0 0 circle_below positionsel positionjoinsel ));
+DATA(insert OID = 1515 ( "<<|" PGNSP PGUID b f f 718 718 16 0 0 circle_below positionsel positionjoinsel "---"));
DESCR("is below");
-DATA(insert OID = 1516 ( "+" PGNSP PGUID b f f 718 600 718 0 0 circle_add_pt - - ));
+DATA(insert OID = 1516 ( "+" PGNSP PGUID b f f 718 600 718 0 0 circle_add_pt - - "---"));
DESCR("add");
-DATA(insert OID = 1517 ( "-" PGNSP PGUID b f f 718 600 718 0 0 circle_sub_pt - - ));
+DATA(insert OID = 1517 ( "-" PGNSP PGUID b f f 718 600 718 0 0 circle_sub_pt - - "---"));
DESCR("subtract");
-DATA(insert OID = 1518 ( "*" PGNSP PGUID b f f 718 600 718 0 0 circle_mul_pt - - ));
+DATA(insert OID = 1518 ( "*" PGNSP PGUID b f f 718 600 718 0 0 circle_mul_pt - - "---"));
DESCR("multiply");
-DATA(insert OID = 1519 ( "/" PGNSP PGUID b f f 718 600 718 0 0 circle_div_pt - - ));
+DATA(insert OID = 1519 ( "/" PGNSP PGUID b f f 718 600 718 0 0 circle_div_pt - - "---"));
DESCR("divide");
-DATA(insert OID = 1520 ( "<->" PGNSP PGUID b f f 718 718 701 1520 0 circle_distance - - ));
+DATA(insert OID = 1520 ( "<->" PGNSP PGUID b f f 718 718 701 1520 0 circle_distance - - "---"));
DESCR("distance between");
-DATA(insert OID = 1521 ( "#" PGNSP PGUID l f f 0 604 23 0 0 poly_npoints - - ));
+DATA(insert OID = 1521 ( "#" PGNSP PGUID l f f 0 604 23 0 0 poly_npoints - - "---"));
DESCR("number of points");
-DATA(insert OID = 1522 ( "<->" PGNSP PGUID b f f 600 718 701 3291 0 dist_pc - - ));
+DATA(insert OID = 1522 ( "<->" PGNSP PGUID b f f 600 718 701 3291 0 dist_pc - - "---"));
DESCR("distance between");
-DATA(insert OID = 3291 ( "<->" PGNSP PGUID b f f 718 600 701 1522 0 dist_cpoint - - ));
+DATA(insert OID = 3291 ( "<->" PGNSP PGUID b f f 718 600 701 1522 0 dist_cpoint - - "---"));
DESCR("distance between");
-DATA(insert OID = 3276 ( "<->" PGNSP PGUID b f f 600 604 701 3289 0 dist_ppoly - - ));
+DATA(insert OID = 3276 ( "<->" PGNSP PGUID b f f 600 604 701 3289 0 dist_ppoly - - "---"));
DESCR("distance between");
-DATA(insert OID = 3289 ( "<->" PGNSP PGUID b f f 604 600 701 3276 0 dist_polyp - - ));
+DATA(insert OID = 3289 ( "<->" PGNSP PGUID b f f 604 600 701 3276 0 dist_polyp - - "---"));
DESCR("distance between");
-DATA(insert OID = 1523 ( "<->" PGNSP PGUID b f f 718 604 701 0 0 dist_cpoly - - ));
+DATA(insert OID = 1523 ( "<->" PGNSP PGUID b f f 718 604 701 0 0 dist_cpoly - - "---"));
DESCR("distance between");
/* additional geometric operators - thomas 1997-07-09 */
-DATA(insert OID = 1524 ( "<->" PGNSP PGUID b f f 628 603 701 0 0 dist_lb - - ));
+DATA(insert OID = 1524 ( "<->" PGNSP PGUID b f f 628 603 701 0 0 dist_lb - - "---"));
DESCR("distance between");
-DATA(insert OID = 1525 ( "?#" PGNSP PGUID b f f 601 601 16 1525 0 lseg_intersect - - ));
+DATA(insert OID = 1525 ( "?#" PGNSP PGUID b f f 601 601 16 1525 0 lseg_intersect - - "---"));
DESCR("intersect");
-DATA(insert OID = 1526 ( "?||" PGNSP PGUID b f f 601 601 16 1526 0 lseg_parallel - - ));
+DATA(insert OID = 1526 ( "?||" PGNSP PGUID b f f 601 601 16 1526 0 lseg_parallel - - "---"));
DESCR("parallel");
-DATA(insert OID = 1527 ( "?-|" PGNSP PGUID b f f 601 601 16 1527 0 lseg_perp - - ));
+DATA(insert OID = 1527 ( "?-|" PGNSP PGUID b f f 601 601 16 1527 0 lseg_perp - - "---"));
DESCR("perpendicular");
-DATA(insert OID = 1528 ( "?-" PGNSP PGUID l f f 0 601 16 0 0 lseg_horizontal - - ));
+DATA(insert OID = 1528 ( "?-" PGNSP PGUID l f f 0 601 16 0 0 lseg_horizontal - - "---"));
DESCR("horizontal");
-DATA(insert OID = 1529 ( "?|" PGNSP PGUID l f f 0 601 16 0 0 lseg_vertical - - ));
+DATA(insert OID = 1529 ( "?|" PGNSP PGUID l f f 0 601 16 0 0 lseg_vertical - - "---"));
DESCR("vertical");
-DATA(insert OID = 1535 ( "=" PGNSP PGUID b f f 601 601 16 1535 1586 lseg_eq eqsel eqjoinsel ));
+DATA(insert OID = 1535 ( "=" PGNSP PGUID b f f 601 601 16 1535 1586 lseg_eq eqsel eqjoinsel "mhf"));
DESCR("equal");
-DATA(insert OID = 1536 ( "#" PGNSP PGUID b f f 601 601 600 1536 0 lseg_interpt - - ));
+DATA(insert OID = 1536 ( "#" PGNSP PGUID b f f 601 601 600 1536 0 lseg_interpt - - "---"));
DESCR("intersection point");
-DATA(insert OID = 1537 ( "?#" PGNSP PGUID b f f 601 628 16 0 0 inter_sl - - ));
+DATA(insert OID = 1537 ( "?#" PGNSP PGUID b f f 601 628 16 0 0 inter_sl - - "---"));
DESCR("intersect");
-DATA(insert OID = 1538 ( "?#" PGNSP PGUID b f f 601 603 16 0 0 inter_sb - - ));
+DATA(insert OID = 1538 ( "?#" PGNSP PGUID b f f 601 603 16 0 0 inter_sb - - "---"));
DESCR("intersect");
-DATA(insert OID = 1539 ( "?#" PGNSP PGUID b f f 628 603 16 0 0 inter_lb - - ));
+DATA(insert OID = 1539 ( "?#" PGNSP PGUID b f f 628 603 16 0 0 inter_lb - - "---"));
DESCR("intersect");
-DATA(insert OID = 1546 ( "<@" PGNSP PGUID b f f 600 628 16 0 0 on_pl - - ));
+DATA(insert OID = 1546 ( "<@" PGNSP PGUID b f f 600 628 16 0 0 on_pl - - "---"));
DESCR("point on line");
-DATA(insert OID = 1547 ( "<@" PGNSP PGUID b f f 600 601 16 0 0 on_ps - - ));
+DATA(insert OID = 1547 ( "<@" PGNSP PGUID b f f 600 601 16 0 0 on_ps - - "---"));
DESCR("is contained by");
-DATA(insert OID = 1548 ( "<@" PGNSP PGUID b f f 601 628 16 0 0 on_sl - - ));
+DATA(insert OID = 1548 ( "<@" PGNSP PGUID b f f 601 628 16 0 0 on_sl - - "---"));
DESCR("lseg on line");
-DATA(insert OID = 1549 ( "<@" PGNSP PGUID b f f 601 603 16 0 0 on_sb - - ));
+DATA(insert OID = 1549 ( "<@" PGNSP PGUID b f f 601 603 16 0 0 on_sb - - "---"));
DESCR("is contained by");
-DATA(insert OID = 1557 ( "##" PGNSP PGUID b f f 600 628 600 0 0 close_pl - - ));
+DATA(insert OID = 1557 ( "##" PGNSP PGUID b f f 600 628 600 0 0 close_pl - - "---"));
DESCR("closest point to A on B");
-DATA(insert OID = 1558 ( "##" PGNSP PGUID b f f 600 601 600 0 0 close_ps - - ));
+DATA(insert OID = 1558 ( "##" PGNSP PGUID b f f 600 601 600 0 0 close_ps - - "---"));
DESCR("closest point to A on B");
-DATA(insert OID = 1559 ( "##" PGNSP PGUID b f f 600 603 600 0 0 close_pb - - ));
+DATA(insert OID = 1559 ( "##" PGNSP PGUID b f f 600 603 600 0 0 close_pb - - "---"));
DESCR("closest point to A on B");
-DATA(insert OID = 1566 ( "##" PGNSP PGUID b f f 601 628 600 0 0 close_sl - - ));
+DATA(insert OID = 1566 ( "##" PGNSP PGUID b f f 601 628 600 0 0 close_sl - - "---"));
DESCR("closest point to A on B");
-DATA(insert OID = 1567 ( "##" PGNSP PGUID b f f 601 603 600 0 0 close_sb - - ));
+DATA(insert OID = 1567 ( "##" PGNSP PGUID b f f 601 603 600 0 0 close_sb - - "---"));
DESCR("closest point to A on B");
-DATA(insert OID = 1568 ( "##" PGNSP PGUID b f f 628 603 600 0 0 close_lb - - ));
+DATA(insert OID = 1568 ( "##" PGNSP PGUID b f f 628 603 600 0 0 close_lb - - "---"));
DESCR("closest point to A on B");
-DATA(insert OID = 1577 ( "##" PGNSP PGUID b f f 628 601 600 0 0 close_ls - - ));
+DATA(insert OID = 1577 ( "##" PGNSP PGUID b f f 628 601 600 0 0 close_ls - - "---"));
DESCR("closest point to A on B");
-DATA(insert OID = 1578 ( "##" PGNSP PGUID b f f 601 601 600 0 0 close_lseg - - ));
+DATA(insert OID = 1578 ( "##" PGNSP PGUID b f f 601 601 600 0 0 close_lseg - - "---"));
DESCR("closest point to A on B");
-DATA(insert OID = 1583 ( "*" PGNSP PGUID b f f 1186 701 1186 1584 0 interval_mul - - ));
+DATA(insert OID = 1583 ( "*" PGNSP PGUID b f f 1186 701 1186 1584 0 interval_mul - - "---"));
DESCR("multiply");
-DATA(insert OID = 1584 ( "*" PGNSP PGUID b f f 701 1186 1186 1583 0 mul_d_interval - - ));
+DATA(insert OID = 1584 ( "*" PGNSP PGUID b f f 701 1186 1186 1583 0 mul_d_interval - - "---"));
DESCR("multiply");
-DATA(insert OID = 1585 ( "/" PGNSP PGUID b f f 1186 701 1186 0 0 interval_div - - ));
+DATA(insert OID = 1585 ( "/" PGNSP PGUID b f f 1186 701 1186 0 0 interval_div - - "---"));
DESCR("divide");
-DATA(insert OID = 1586 ( "<>" PGNSP PGUID b f f 601 601 16 1586 1535 lseg_ne neqsel neqjoinsel ));
+DATA(insert OID = 1586 ( "<>" PGNSP PGUID b f f 601 601 16 1586 1535 lseg_ne neqsel neqjoinsel "mhf"));
DESCR("not equal");
-DATA(insert OID = 1587 ( "<" PGNSP PGUID b f f 601 601 16 1589 1590 lseg_lt - - ));
+DATA(insert OID = 1587 ( "<" PGNSP PGUID b f f 601 601 16 1589 1590 lseg_lt - - "---"));
DESCR("less than by length");
-DATA(insert OID = 1588 ( "<=" PGNSP PGUID b f f 601 601 16 1590 1589 lseg_le - - ));
+DATA(insert OID = 1588 ( "<=" PGNSP PGUID b f f 601 601 16 1590 1589 lseg_le - - "---"));
DESCR("less than or equal by length");
-DATA(insert OID = 1589 ( ">" PGNSP PGUID b f f 601 601 16 1587 1588 lseg_gt - - ));
+DATA(insert OID = 1589 ( ">" PGNSP PGUID b f f 601 601 16 1587 1588 lseg_gt - - "---"));
DESCR("greater than by length");
-DATA(insert OID = 1590 ( ">=" PGNSP PGUID b f f 601 601 16 1588 1587 lseg_ge - - ));
+DATA(insert OID = 1590 ( ">=" PGNSP PGUID b f f 601 601 16 1588 1587 lseg_ge - - "---"));
DESCR("greater than or equal by length");
-DATA(insert OID = 1591 ( "@-@" PGNSP PGUID l f f 0 601 701 0 0 lseg_length - - ));
+DATA(insert OID = 1591 ( "@-@" PGNSP PGUID l f f 0 601 701 0 0 lseg_length - - "---"));
DESCR("distance between endpoints");
-DATA(insert OID = 1611 ( "?#" PGNSP PGUID b f f 628 628 16 1611 0 line_intersect - - ));
+DATA(insert OID = 1611 ( "?#" PGNSP PGUID b f f 628 628 16 1611 0 line_intersect - - "---"));
DESCR("intersect");
-DATA(insert OID = 1612 ( "?||" PGNSP PGUID b f f 628 628 16 1612 0 line_parallel - - ));
+DATA(insert OID = 1612 ( "?||" PGNSP PGUID b f f 628 628 16 1612 0 line_parallel - - "---"));
DESCR("parallel");
-DATA(insert OID = 1613 ( "?-|" PGNSP PGUID b f f 628 628 16 1613 0 line_perp - - ));
+DATA(insert OID = 1613 ( "?-|" PGNSP PGUID b f f 628 628 16 1613 0 line_perp - - "---"));
DESCR("perpendicular");
-DATA(insert OID = 1614 ( "?-" PGNSP PGUID l f f 0 628 16 0 0 line_horizontal - - ));
+DATA(insert OID = 1614 ( "?-" PGNSP PGUID l f f 0 628 16 0 0 line_horizontal - - "---"));
DESCR("horizontal");
-DATA(insert OID = 1615 ( "?|" PGNSP PGUID l f f 0 628 16 0 0 line_vertical - - ));
+DATA(insert OID = 1615 ( "?|" PGNSP PGUID l f f 0 628 16 0 0 line_vertical - - "---"));
DESCR("vertical");
-DATA(insert OID = 1616 ( "=" PGNSP PGUID b f f 628 628 16 1616 0 line_eq eqsel eqjoinsel ));
+DATA(insert OID = 1616 ( "=" PGNSP PGUID b f f 628 628 16 1616 0 line_eq eqsel eqjoinsel "mhf"));
DESCR("equal");
-DATA(insert OID = 1617 ( "#" PGNSP PGUID b f f 628 628 600 1617 0 line_interpt - - ));
+DATA(insert OID = 1617 ( "#" PGNSP PGUID b f f 628 628 600 1617 0 line_interpt - - "---"));
DESCR("intersection point");
/* MAC type */
-DATA(insert OID = 1220 ( "=" PGNSP PGUID b t t 829 829 16 1220 1221 macaddr_eq eqsel eqjoinsel ));
+DATA(insert OID = 1220 ( "=" PGNSP PGUID b t t 829 829 16 1220 1221 macaddr_eq eqsel eqjoinsel "mhf"));
DESCR("equal");
-DATA(insert OID = 1221 ( "<>" PGNSP PGUID b f f 829 829 16 1221 1220 macaddr_ne neqsel neqjoinsel ));
+DATA(insert OID = 1221 ( "<>" PGNSP PGUID b f f 829 829 16 1221 1220 macaddr_ne neqsel neqjoinsel "mhf"));
DESCR("not equal");
-DATA(insert OID = 1222 ( "<" PGNSP PGUID b f f 829 829 16 1224 1225 macaddr_lt scalarltsel scalarltjoinsel ));
+DATA(insert OID = 1222 ( "<" PGNSP PGUID b f f 829 829 16 1224 1225 macaddr_lt scalarltsel scalarltjoinsel "mh-"));
DESCR("less than");
-DATA(insert OID = 1223 ( "<=" PGNSP PGUID b f f 829 829 16 1225 1224 macaddr_le scalarltsel scalarltjoinsel ));
+DATA(insert OID = 1223 ( "<=" PGNSP PGUID b f f 829 829 16 1225 1224 macaddr_le scalarltsel scalarltjoinsel "mh-"));
DESCR("less than or equal");
-DATA(insert OID = 1224 ( ">" PGNSP PGUID b f f 829 829 16 1222 1223 macaddr_gt scalargtsel scalargtjoinsel ));
+DATA(insert OID = 1224 ( ">" PGNSP PGUID b f f 829 829 16 1222 1223 macaddr_gt scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than");
-DATA(insert OID = 1225 ( ">=" PGNSP PGUID b f f 829 829 16 1223 1222 macaddr_ge scalargtsel scalargtjoinsel ));
+DATA(insert OID = 1225 ( ">=" PGNSP PGUID b f f 829 829 16 1223 1222 macaddr_ge scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than or equal");
-DATA(insert OID = 3147 ( "~" PGNSP PGUID l f f 0 829 829 0 0 macaddr_not - - ));
+DATA(insert OID = 3147 ( "~" PGNSP PGUID l f f 0 829 829 0 0 macaddr_not - - "---"));
DESCR("bitwise not");
-DATA(insert OID = 3148 ( "&" PGNSP PGUID b f f 829 829 829 0 0 macaddr_and - - ));
+DATA(insert OID = 3148 ( "&" PGNSP PGUID b f f 829 829 829 0 0 macaddr_and - - "---"));
DESCR("bitwise and");
-DATA(insert OID = 3149 ( "|" PGNSP PGUID b f f 829 829 829 0 0 macaddr_or - - ));
+DATA(insert OID = 3149 ( "|" PGNSP PGUID b f f 829 829 829 0 0 macaddr_or - - "---"));
DESCR("bitwise or");
/* INET type (these also support CIDR via implicit cast) */
-DATA(insert OID = 1201 ( "=" PGNSP PGUID b t t 869 869 16 1201 1202 network_eq eqsel eqjoinsel ));
+DATA(insert OID = 1201 ( "=" PGNSP PGUID b t t 869 869 16 1201 1202 network_eq eqsel eqjoinsel "mhf"));
DESCR("equal");
-DATA(insert OID = 1202 ( "<>" PGNSP PGUID b f f 869 869 16 1202 1201 network_ne neqsel neqjoinsel ));
+DATA(insert OID = 1202 ( "<>" PGNSP PGUID b f f 869 869 16 1202 1201 network_ne neqsel neqjoinsel "mhf"));
DESCR("not equal");
-DATA(insert OID = 1203 ( "<" PGNSP PGUID b f f 869 869 16 1205 1206 network_lt scalarltsel scalarltjoinsel ));
+DATA(insert OID = 1203 ( "<" PGNSP PGUID b f f 869 869 16 1205 1206 network_lt scalarltsel scalarltjoinsel "mh-"));
DESCR("less than");
-DATA(insert OID = 1204 ( "<=" PGNSP PGUID b f f 869 869 16 1206 1205 network_le scalarltsel scalarltjoinsel ));
+DATA(insert OID = 1204 ( "<=" PGNSP PGUID b f f 869 869 16 1206 1205 network_le scalarltsel scalarltjoinsel "mh-"));
DESCR("less than or equal");
-DATA(insert OID = 1205 ( ">" PGNSP PGUID b f f 869 869 16 1203 1204 network_gt scalargtsel scalargtjoinsel ));
+DATA(insert OID = 1205 ( ">" PGNSP PGUID b f f 869 869 16 1203 1204 network_gt scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than");
-DATA(insert OID = 1206 ( ">=" PGNSP PGUID b f f 869 869 16 1204 1203 network_ge scalargtsel scalargtjoinsel ));
+DATA(insert OID = 1206 ( ">=" PGNSP PGUID b f f 869 869 16 1204 1203 network_ge scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than or equal");
-DATA(insert OID = 931 ( "<<" PGNSP PGUID b f f 869 869 16 933 0 network_sub networksel networkjoinsel ));
+DATA(insert OID = 931 ( "<<" PGNSP PGUID b f f 869 869 16 933 0 network_sub networksel networkjoinsel "---"));
DESCR("is subnet");
#define OID_INET_SUB_OP 931
-DATA(insert OID = 932 ( "<<=" PGNSP PGUID b f f 869 869 16 934 0 network_subeq networksel networkjoinsel ));
+DATA(insert OID = 932 ( "<<=" PGNSP PGUID b f f 869 869 16 934 0 network_subeq networksel networkjoinsel "---"));
DESCR("is subnet or equal");
#define OID_INET_SUBEQ_OP 932
-DATA(insert OID = 933 ( ">>" PGNSP PGUID b f f 869 869 16 931 0 network_sup networksel networkjoinsel ));
+DATA(insert OID = 933 ( ">>" PGNSP PGUID b f f 869 869 16 931 0 network_sup networksel networkjoinsel "---"));
DESCR("is supernet");
#define OID_INET_SUP_OP 933
-DATA(insert OID = 934 ( ">>=" PGNSP PGUID b f f 869 869 16 932 0 network_supeq networksel networkjoinsel ));
+DATA(insert OID = 934 ( ">>=" PGNSP PGUID b f f 869 869 16 932 0 network_supeq networksel networkjoinsel "---"));
DESCR("is supernet or equal");
#define OID_INET_SUPEQ_OP 934
-DATA(insert OID = 3552 ( "&&" PGNSP PGUID b f f 869 869 16 3552 0 network_overlap networksel networkjoinsel ));
+DATA(insert OID = 3552 ( "&&" PGNSP PGUID b f f 869 869 16 3552 0 network_overlap networksel networkjoinsel "---"));
DESCR("overlaps (is subnet or supernet)");
#define OID_INET_OVERLAP_OP 3552
-DATA(insert OID = 2634 ( "~" PGNSP PGUID l f f 0 869 869 0 0 inetnot - - ));
+DATA(insert OID = 2634 ( "~" PGNSP PGUID l f f 0 869 869 0 0 inetnot - - "---"));
DESCR("bitwise not");
-DATA(insert OID = 2635 ( "&" PGNSP PGUID b f f 869 869 869 0 0 inetand - - ));
+DATA(insert OID = 2635 ( "&" PGNSP PGUID b f f 869 869 869 0 0 inetand - - "---"));
DESCR("bitwise and");
-DATA(insert OID = 2636 ( "|" PGNSP PGUID b f f 869 869 869 0 0 inetor - - ));
+DATA(insert OID = 2636 ( "|" PGNSP PGUID b f f 869 869 869 0 0 inetor - - "---"));
DESCR("bitwise or");
-DATA(insert OID = 2637 ( "+" PGNSP PGUID b f f 869 20 869 2638 0 inetpl - - ));
+DATA(insert OID = 2637 ( "+" PGNSP PGUID b f f 869 20 869 2638 0 inetpl - - "---"));
DESCR("add");
-DATA(insert OID = 2638 ( "+" PGNSP PGUID b f f 20 869 869 2637 0 int8pl_inet - - ));
+DATA(insert OID = 2638 ( "+" PGNSP PGUID b f f 20 869 869 2637 0 int8pl_inet - - "---"));
DESCR("add");
-DATA(insert OID = 2639 ( "-" PGNSP PGUID b f f 869 20 869 0 0 inetmi_int8 - - ));
+DATA(insert OID = 2639 ( "-" PGNSP PGUID b f f 869 20 869 0 0 inetmi_int8 - - "---"));
DESCR("subtract");
-DATA(insert OID = 2640 ( "-" PGNSP PGUID b f f 869 869 20 0 0 inetmi - - ));
+DATA(insert OID = 2640 ( "-" PGNSP PGUID b f f 869 869 20 0 0 inetmi - - "---"));
DESCR("subtract");
/* case-insensitive LIKE hacks */
-DATA(insert OID = 1625 ( "~~*" PGNSP PGUID b f f 19 25 16 0 1626 nameiclike iclikesel iclikejoinsel ));
+DATA(insert OID = 1625 ( "~~*" PGNSP PGUID b f f 19 25 16 0 1626 nameiclike iclikesel iclikejoinsel "---"));
DESCR("matches LIKE expression, case-insensitive");
#define OID_NAME_ICLIKE_OP 1625
-DATA(insert OID = 1626 ( "!~~*" PGNSP PGUID b f f 19 25 16 0 1625 nameicnlike icnlikesel icnlikejoinsel ));
+DATA(insert OID = 1626 ( "!~~*" PGNSP PGUID b f f 19 25 16 0 1625 nameicnlike icnlikesel icnlikejoinsel "---"));
DESCR("does not match LIKE expression, case-insensitive");
-DATA(insert OID = 1627 ( "~~*" PGNSP PGUID b f f 25 25 16 0 1628 texticlike iclikesel iclikejoinsel ));
+DATA(insert OID = 1627 ( "~~*" PGNSP PGUID b f f 25 25 16 0 1628 texticlike iclikesel iclikejoinsel "---"));
DESCR("matches LIKE expression, case-insensitive");
#define OID_TEXT_ICLIKE_OP 1627
-DATA(insert OID = 1628 ( "!~~*" PGNSP PGUID b f f 25 25 16 0 1627 texticnlike icnlikesel icnlikejoinsel ));
+DATA(insert OID = 1628 ( "!~~*" PGNSP PGUID b f f 25 25 16 0 1627 texticnlike icnlikesel icnlikejoinsel "---"));
DESCR("does not match LIKE expression, case-insensitive");
-DATA(insert OID = 1629 ( "~~*" PGNSP PGUID b f f 1042 25 16 0 1630 bpchariclike iclikesel iclikejoinsel ));
+DATA(insert OID = 1629 ( "~~*" PGNSP PGUID b f f 1042 25 16 0 1630 bpchariclike iclikesel iclikejoinsel "---"));
DESCR("matches LIKE expression, case-insensitive");
#define OID_BPCHAR_ICLIKE_OP 1629
-DATA(insert OID = 1630 ( "!~~*" PGNSP PGUID b f f 1042 25 16 0 1629 bpcharicnlike icnlikesel icnlikejoinsel ));
+DATA(insert OID = 1630 ( "!~~*" PGNSP PGUID b f f 1042 25 16 0 1629 bpcharicnlike icnlikesel icnlikejoinsel "---"));
DESCR("does not match LIKE expression, case-insensitive");
/* NUMERIC type - OID's 1700-1799 */
-DATA(insert OID = 1751 ( "-" PGNSP PGUID l f f 0 1700 1700 0 0 numeric_uminus - - ));
+DATA(insert OID = 1751 ( "-" PGNSP PGUID l f f 0 1700 1700 0 0 numeric_uminus - - "---"));
DESCR("negate");
-DATA(insert OID = 1752 ( "=" PGNSP PGUID b t t 1700 1700 16 1752 1753 numeric_eq eqsel eqjoinsel ));
+DATA(insert OID = 1752 ( "=" PGNSP PGUID b t t 1700 1700 16 1752 1753 numeric_eq eqsel eqjoinsel "mhf"));
DESCR("equal");
-DATA(insert OID = 1753 ( "<>" PGNSP PGUID b f f 1700 1700 16 1753 1752 numeric_ne neqsel neqjoinsel ));
+DATA(insert OID = 1753 ( "<>" PGNSP PGUID b f f 1700 1700 16 1753 1752 numeric_ne neqsel neqjoinsel "mhf"));
DESCR("not equal");
-DATA(insert OID = 1754 ( "<" PGNSP PGUID b f f 1700 1700 16 1756 1757 numeric_lt scalarltsel scalarltjoinsel ));
+DATA(insert OID = 1754 ( "<" PGNSP PGUID b f f 1700 1700 16 1756 1757 numeric_lt scalarltsel scalarltjoinsel "mh-"));
DESCR("less than");
-DATA(insert OID = 1755 ( "<=" PGNSP PGUID b f f 1700 1700 16 1757 1756 numeric_le scalarltsel scalarltjoinsel ));
+DATA(insert OID = 1755 ( "<=" PGNSP PGUID b f f 1700 1700 16 1757 1756 numeric_le scalarltsel scalarltjoinsel "mh-"));
DESCR("less than or equal");
-DATA(insert OID = 1756 ( ">" PGNSP PGUID b f f 1700 1700 16 1754 1755 numeric_gt scalargtsel scalargtjoinsel ));
+DATA(insert OID = 1756 ( ">" PGNSP PGUID b f f 1700 1700 16 1754 1755 numeric_gt scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than");
-DATA(insert OID = 1757 ( ">=" PGNSP PGUID b f f 1700 1700 16 1755 1754 numeric_ge scalargtsel scalargtjoinsel ));
+DATA(insert OID = 1757 ( ">=" PGNSP PGUID b f f 1700 1700 16 1755 1754 numeric_ge scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than or equal");
-DATA(insert OID = 1758 ( "+" PGNSP PGUID b f f 1700 1700 1700 1758 0 numeric_add - - ));
+DATA(insert OID = 1758 ( "+" PGNSP PGUID b f f 1700 1700 1700 1758 0 numeric_add - - "---"));
DESCR("add");
-DATA(insert OID = 1759 ( "-" PGNSP PGUID b f f 1700 1700 1700 0 0 numeric_sub - - ));
+DATA(insert OID = 1759 ( "-" PGNSP PGUID b f f 1700 1700 1700 0 0 numeric_sub - - "---"));
DESCR("subtract");
-DATA(insert OID = 1760 ( "*" PGNSP PGUID b f f 1700 1700 1700 1760 0 numeric_mul - - ));
+DATA(insert OID = 1760 ( "*" PGNSP PGUID b f f 1700 1700 1700 1760 0 numeric_mul - - "---"));
DESCR("multiply");
-DATA(insert OID = 1761 ( "/" PGNSP PGUID b f f 1700 1700 1700 0 0 numeric_div - - ));
+DATA(insert OID = 1761 ( "/" PGNSP PGUID b f f 1700 1700 1700 0 0 numeric_div - - "---"));
DESCR("divide");
-DATA(insert OID = 1762 ( "%" PGNSP PGUID b f f 1700 1700 1700 0 0 numeric_mod - - ));
+DATA(insert OID = 1762 ( "%" PGNSP PGUID b f f 1700 1700 1700 0 0 numeric_mod - - "---"));
DESCR("modulus");
-DATA(insert OID = 1038 ( "^" PGNSP PGUID b f f 1700 1700 1700 0 0 numeric_power - - ));
+DATA(insert OID = 1038 ( "^" PGNSP PGUID b f f 1700 1700 1700 0 0 numeric_power - - "---"));
DESCR("exponentiation");
-DATA(insert OID = 1763 ( "@" PGNSP PGUID l f f 0 1700 1700 0 0 numeric_abs - - ));
+DATA(insert OID = 1763 ( "@" PGNSP PGUID l f f 0 1700 1700 0 0 numeric_abs - - "---"));
DESCR("absolute value");
-DATA(insert OID = 1784 ( "=" PGNSP PGUID b t f 1560 1560 16 1784 1785 biteq eqsel eqjoinsel ));
+DATA(insert OID = 1784 ( "=" PGNSP PGUID b t f 1560 1560 16 1784 1785 biteq eqsel eqjoinsel "mhf"));
DESCR("equal");
-DATA(insert OID = 1785 ( "<>" PGNSP PGUID b f f 1560 1560 16 1785 1784 bitne neqsel neqjoinsel ));
+DATA(insert OID = 1785 ( "<>" PGNSP PGUID b f f 1560 1560 16 1785 1784 bitne neqsel neqjoinsel "mhf"));
DESCR("not equal");
-DATA(insert OID = 1786 ( "<" PGNSP PGUID b f f 1560 1560 16 1787 1789 bitlt scalarltsel scalarltjoinsel ));
+DATA(insert OID = 1786 ( "<" PGNSP PGUID b f f 1560 1560 16 1787 1789 bitlt scalarltsel scalarltjoinsel "mh-"));
DESCR("less than");
-DATA(insert OID = 1787 ( ">" PGNSP PGUID b f f 1560 1560 16 1786 1788 bitgt scalargtsel scalargtjoinsel ));
+DATA(insert OID = 1787 ( ">" PGNSP PGUID b f f 1560 1560 16 1786 1788 bitgt scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than");
-DATA(insert OID = 1788 ( "<=" PGNSP PGUID b f f 1560 1560 16 1789 1787 bitle scalarltsel scalarltjoinsel ));
+DATA(insert OID = 1788 ( "<=" PGNSP PGUID b f f 1560 1560 16 1789 1787 bitle scalarltsel scalarltjoinsel "mh-"));
DESCR("less than or equal");
-DATA(insert OID = 1789 ( ">=" PGNSP PGUID b f f 1560 1560 16 1788 1786 bitge scalargtsel scalargtjoinsel ));
+DATA(insert OID = 1789 ( ">=" PGNSP PGUID b f f 1560 1560 16 1788 1786 bitge scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than or equal");
-DATA(insert OID = 1791 ( "&" PGNSP PGUID b f f 1560 1560 1560 1791 0 bitand - - ));
+DATA(insert OID = 1791 ( "&" PGNSP PGUID b f f 1560 1560 1560 1791 0 bitand - - "---"));
DESCR("bitwise and");
-DATA(insert OID = 1792 ( "|" PGNSP PGUID b f f 1560 1560 1560 1792 0 bitor - - ));
+DATA(insert OID = 1792 ( "|" PGNSP PGUID b f f 1560 1560 1560 1792 0 bitor - - "---"));
DESCR("bitwise or");
-DATA(insert OID = 1793 ( "#" PGNSP PGUID b f f 1560 1560 1560 1793 0 bitxor - - ));
+DATA(insert OID = 1793 ( "#" PGNSP PGUID b f f 1560 1560 1560 1793 0 bitxor - - "---"));
DESCR("bitwise exclusive or");
-DATA(insert OID = 1794 ( "~" PGNSP PGUID l f f 0 1560 1560 0 0 bitnot - - ));
+DATA(insert OID = 1794 ( "~" PGNSP PGUID l f f 0 1560 1560 0 0 bitnot - - "---"));
DESCR("bitwise not");
-DATA(insert OID = 1795 ( "<<" PGNSP PGUID b f f 1560 23 1560 0 0 bitshiftleft - - ));
+DATA(insert OID = 1795 ( "<<" PGNSP PGUID b f f 1560 23 1560 0 0 bitshiftleft - - "---"));
DESCR("bitwise shift left");
-DATA(insert OID = 1796 ( ">>" PGNSP PGUID b f f 1560 23 1560 0 0 bitshiftright - - ));
+DATA(insert OID = 1796 ( ">>" PGNSP PGUID b f f 1560 23 1560 0 0 bitshiftright - - "---"));
DESCR("bitwise shift right");
-DATA(insert OID = 1797 ( "||" PGNSP PGUID b f f 1562 1562 1562 0 0 bitcat - - ));
+DATA(insert OID = 1797 ( "||" PGNSP PGUID b f f 1562 1562 1562 0 0 bitcat - - "---"));
DESCR("concatenate");
-DATA(insert OID = 1800 ( "+" PGNSP PGUID b f f 1083 1186 1083 1849 0 time_pl_interval - - ));
+DATA(insert OID = 1800 ( "+" PGNSP PGUID b f f 1083 1186 1083 1849 0 time_pl_interval - - "---"));
DESCR("add");
-DATA(insert OID = 1801 ( "-" PGNSP PGUID b f f 1083 1186 1083 0 0 time_mi_interval - - ));
+DATA(insert OID = 1801 ( "-" PGNSP PGUID b f f 1083 1186 1083 0 0 time_mi_interval - - "---"));
DESCR("subtract");
-DATA(insert OID = 1802 ( "+" PGNSP PGUID b f f 1266 1186 1266 2552 0 timetz_pl_interval - - ));
+DATA(insert OID = 1802 ( "+" PGNSP PGUID b f f 1266 1186 1266 2552 0 timetz_pl_interval - - "---"));
DESCR("add");
-DATA(insert OID = 1803 ( "-" PGNSP PGUID b f f 1266 1186 1266 0 0 timetz_mi_interval - - ));
+DATA(insert OID = 1803 ( "-" PGNSP PGUID b f f 1266 1186 1266 0 0 timetz_mi_interval - - "---"));
DESCR("subtract");
-DATA(insert OID = 1804 ( "=" PGNSP PGUID b t f 1562 1562 16 1804 1805 varbiteq eqsel eqjoinsel ));
+DATA(insert OID = 1804 ( "=" PGNSP PGUID b t f 1562 1562 16 1804 1805 varbiteq eqsel eqjoinsel "mhf"));
DESCR("equal");
-DATA(insert OID = 1805 ( "<>" PGNSP PGUID b f f 1562 1562 16 1805 1804 varbitne neqsel neqjoinsel ));
+DATA(insert OID = 1805 ( "<>" PGNSP PGUID b f f 1562 1562 16 1805 1804 varbitne neqsel neqjoinsel "mhf"));
DESCR("not equal");
-DATA(insert OID = 1806 ( "<" PGNSP PGUID b f f 1562 1562 16 1807 1809 varbitlt scalarltsel scalarltjoinsel ));
+DATA(insert OID = 1806 ( "<" PGNSP PGUID b f f 1562 1562 16 1807 1809 varbitlt scalarltsel scalarltjoinsel "mh-"));
DESCR("less than");
-DATA(insert OID = 1807 ( ">" PGNSP PGUID b f f 1562 1562 16 1806 1808 varbitgt scalargtsel scalargtjoinsel ));
+DATA(insert OID = 1807 ( ">" PGNSP PGUID b f f 1562 1562 16 1806 1808 varbitgt scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than");
-DATA(insert OID = 1808 ( "<=" PGNSP PGUID b f f 1562 1562 16 1809 1807 varbitle scalarltsel scalarltjoinsel ));
+DATA(insert OID = 1808 ( "<=" PGNSP PGUID b f f 1562 1562 16 1809 1807 varbitle scalarltsel scalarltjoinsel "mh-"));
DESCR("less than or equal");
-DATA(insert OID = 1809 ( ">=" PGNSP PGUID b f f 1562 1562 16 1808 1806 varbitge scalargtsel scalargtjoinsel ));
+DATA(insert OID = 1809 ( ">=" PGNSP PGUID b f f 1562 1562 16 1808 1806 varbitge scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than or equal");
-DATA(insert OID = 1849 ( "+" PGNSP PGUID b f f 1186 1083 1083 1800 0 interval_pl_time - - ));
+DATA(insert OID = 1849 ( "+" PGNSP PGUID b f f 1186 1083 1083 1800 0 interval_pl_time - - "---"));
DESCR("add");
-DATA(insert OID = 1862 ( "=" PGNSP PGUID b t t 21 20 16 1868 1863 int28eq eqsel eqjoinsel ));
+DATA(insert OID = 1862 ( "=" PGNSP PGUID b t t 21 20 16 1868 1863 int28eq eqsel eqjoinsel "mhf"));
DESCR("equal");
-DATA(insert OID = 1863 ( "<>" PGNSP PGUID b f f 21 20 16 1869 1862 int28ne neqsel neqjoinsel ));
+DATA(insert OID = 1863 ( "<>" PGNSP PGUID b f f 21 20 16 1869 1862 int28ne neqsel neqjoinsel "mhf"));
DESCR("not equal");
-DATA(insert OID = 1864 ( "<" PGNSP PGUID b f f 21 20 16 1871 1867 int28lt scalarltsel scalarltjoinsel ));
+DATA(insert OID = 1864 ( "<" PGNSP PGUID b f f 21 20 16 1871 1867 int28lt scalarltsel scalarltjoinsel "mh-"));
DESCR("less than");
-DATA(insert OID = 1865 ( ">" PGNSP PGUID b f f 21 20 16 1870 1866 int28gt scalargtsel scalargtjoinsel ));
+DATA(insert OID = 1865 ( ">" PGNSP PGUID b f f 21 20 16 1870 1866 int28gt scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than");
-DATA(insert OID = 1866 ( "<=" PGNSP PGUID b f f 21 20 16 1873 1865 int28le scalarltsel scalarltjoinsel ));
+DATA(insert OID = 1866 ( "<=" PGNSP PGUID b f f 21 20 16 1873 1865 int28le scalarltsel scalarltjoinsel "mh-"));
DESCR("less than or equal");
-DATA(insert OID = 1867 ( ">=" PGNSP PGUID b f f 21 20 16 1872 1864 int28ge scalargtsel scalargtjoinsel ));
+DATA(insert OID = 1867 ( ">=" PGNSP PGUID b f f 21 20 16 1872 1864 int28ge scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than or equal");
-DATA(insert OID = 1868 ( "=" PGNSP PGUID b t t 20 21 16 1862 1869 int82eq eqsel eqjoinsel ));
+DATA(insert OID = 1868 ( "=" PGNSP PGUID b t t 20 21 16 1862 1869 int82eq eqsel eqjoinsel "mhf"));
DESCR("equal");
-DATA(insert OID = 1869 ( "<>" PGNSP PGUID b f f 20 21 16 1863 1868 int82ne neqsel neqjoinsel ));
+DATA(insert OID = 1869 ( "<>" PGNSP PGUID b f f 20 21 16 1863 1868 int82ne neqsel neqjoinsel "mhf"));
DESCR("not equal");
-DATA(insert OID = 1870 ( "<" PGNSP PGUID b f f 20 21 16 1865 1873 int82lt scalarltsel scalarltjoinsel ));
+DATA(insert OID = 1870 ( "<" PGNSP PGUID b f f 20 21 16 1865 1873 int82lt scalarltsel scalarltjoinsel "mh-"));
DESCR("less than");
-DATA(insert OID = 1871 ( ">" PGNSP PGUID b f f 20 21 16 1864 1872 int82gt scalargtsel scalargtjoinsel ));
+DATA(insert OID = 1871 ( ">" PGNSP PGUID b f f 20 21 16 1864 1872 int82gt scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than");
-DATA(insert OID = 1872 ( "<=" PGNSP PGUID b f f 20 21 16 1867 1871 int82le scalarltsel scalarltjoinsel ));
+DATA(insert OID = 1872 ( "<=" PGNSP PGUID b f f 20 21 16 1867 1871 int82le scalarltsel scalarltjoinsel "mh-"));
DESCR("less than or equal");
-DATA(insert OID = 1873 ( ">=" PGNSP PGUID b f f 20 21 16 1866 1870 int82ge scalargtsel scalargtjoinsel ));
+DATA(insert OID = 1873 ( ">=" PGNSP PGUID b f f 20 21 16 1866 1870 int82ge scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than or equal");
-DATA(insert OID = 1874 ( "&" PGNSP PGUID b f f 21 21 21 1874 0 int2and - - ));
+DATA(insert OID = 1874 ( "&" PGNSP PGUID b f f 21 21 21 1874 0 int2and - - "---"));
DESCR("bitwise and");
-DATA(insert OID = 1875 ( "|" PGNSP PGUID b f f 21 21 21 1875 0 int2or - - ));
+DATA(insert OID = 1875 ( "|" PGNSP PGUID b f f 21 21 21 1875 0 int2or - - "---"));
DESCR("bitwise or");
-DATA(insert OID = 1876 ( "#" PGNSP PGUID b f f 21 21 21 1876 0 int2xor - - ));
+DATA(insert OID = 1876 ( "#" PGNSP PGUID b f f 21 21 21 1876 0 int2xor - - "---"));
DESCR("bitwise exclusive or");
-DATA(insert OID = 1877 ( "~" PGNSP PGUID l f f 0 21 21 0 0 int2not - - ));
+DATA(insert OID = 1877 ( "~" PGNSP PGUID l f f 0 21 21 0 0 int2not - - "---"));
DESCR("bitwise not");
-DATA(insert OID = 1878 ( "<<" PGNSP PGUID b f f 21 23 21 0 0 int2shl - - ));
+DATA(insert OID = 1878 ( "<<" PGNSP PGUID b f f 21 23 21 0 0 int2shl - - "---"));
DESCR("bitwise shift left");
-DATA(insert OID = 1879 ( ">>" PGNSP PGUID b f f 21 23 21 0 0 int2shr - - ));
+DATA(insert OID = 1879 ( ">>" PGNSP PGUID b f f 21 23 21 0 0 int2shr - - "---"));
DESCR("bitwise shift right");
-DATA(insert OID = 1880 ( "&" PGNSP PGUID b f f 23 23 23 1880 0 int4and - - ));
+DATA(insert OID = 1880 ( "&" PGNSP PGUID b f f 23 23 23 1880 0 int4and - - "---"));
DESCR("bitwise and");
-DATA(insert OID = 1881 ( "|" PGNSP PGUID b f f 23 23 23 1881 0 int4or - - ));
+DATA(insert OID = 1881 ( "|" PGNSP PGUID b f f 23 23 23 1881 0 int4or - - "---"));
DESCR("bitwise or");
-DATA(insert OID = 1882 ( "#" PGNSP PGUID b f f 23 23 23 1882 0 int4xor - - ));
+DATA(insert OID = 1882 ( "#" PGNSP PGUID b f f 23 23 23 1882 0 int4xor - - "---"));
DESCR("bitwise exclusive or");
-DATA(insert OID = 1883 ( "~" PGNSP PGUID l f f 0 23 23 0 0 int4not - - ));
+DATA(insert OID = 1883 ( "~" PGNSP PGUID l f f 0 23 23 0 0 int4not - - "---"));
DESCR("bitwise not");
-DATA(insert OID = 1884 ( "<<" PGNSP PGUID b f f 23 23 23 0 0 int4shl - - ));
+DATA(insert OID = 1884 ( "<<" PGNSP PGUID b f f 23 23 23 0 0 int4shl - - "---"));
DESCR("bitwise shift left");
-DATA(insert OID = 1885 ( ">>" PGNSP PGUID b f f 23 23 23 0 0 int4shr - - ));
+DATA(insert OID = 1885 ( ">>" PGNSP PGUID b f f 23 23 23 0 0 int4shr - - "---"));
DESCR("bitwise shift right");
-DATA(insert OID = 1886 ( "&" PGNSP PGUID b f f 20 20 20 1886 0 int8and - - ));
+DATA(insert OID = 1886 ( "&" PGNSP PGUID b f f 20 20 20 1886 0 int8and - - "---"));
DESCR("bitwise and");
-DATA(insert OID = 1887 ( "|" PGNSP PGUID b f f 20 20 20 1887 0 int8or - - ));
+DATA(insert OID = 1887 ( "|" PGNSP PGUID b f f 20 20 20 1887 0 int8or - - "---"));
DESCR("bitwise or");
-DATA(insert OID = 1888 ( "#" PGNSP PGUID b f f 20 20 20 1888 0 int8xor - - ));
+DATA(insert OID = 1888 ( "#" PGNSP PGUID b f f 20 20 20 1888 0 int8xor - - "---"));
DESCR("bitwise exclusive or");
-DATA(insert OID = 1889 ( "~" PGNSP PGUID l f f 0 20 20 0 0 int8not - - ));
+DATA(insert OID = 1889 ( "~" PGNSP PGUID l f f 0 20 20 0 0 int8not - - "---"));
DESCR("bitwise not");
-DATA(insert OID = 1890 ( "<<" PGNSP PGUID b f f 20 23 20 0 0 int8shl - - ));
+DATA(insert OID = 1890 ( "<<" PGNSP PGUID b f f 20 23 20 0 0 int8shl - - "---"));
DESCR("bitwise shift left");
-DATA(insert OID = 1891 ( ">>" PGNSP PGUID b f f 20 23 20 0 0 int8shr - - ));
+DATA(insert OID = 1891 ( ">>" PGNSP PGUID b f f 20 23 20 0 0 int8shr - - "---"));
DESCR("bitwise shift right");
-DATA(insert OID = 1916 ( "+" PGNSP PGUID l f f 0 20 20 0 0 int8up - - ));
+DATA(insert OID = 1916 ( "+" PGNSP PGUID l f f 0 20 20 0 0 int8up - - "---"));
DESCR("unary plus");
-DATA(insert OID = 1917 ( "+" PGNSP PGUID l f f 0 21 21 0 0 int2up - - ));
+DATA(insert OID = 1917 ( "+" PGNSP PGUID l f f 0 21 21 0 0 int2up - - "---"));
DESCR("unary plus");
-DATA(insert OID = 1918 ( "+" PGNSP PGUID l f f 0 23 23 0 0 int4up - - ));
+DATA(insert OID = 1918 ( "+" PGNSP PGUID l f f 0 23 23 0 0 int4up - - "---"));
DESCR("unary plus");
-DATA(insert OID = 1919 ( "+" PGNSP PGUID l f f 0 700 700 0 0 float4up - - ));
+DATA(insert OID = 1919 ( "+" PGNSP PGUID l f f 0 700 700 0 0 float4up - - "---"));
DESCR("unary plus");
-DATA(insert OID = 1920 ( "+" PGNSP PGUID l f f 0 701 701 0 0 float8up - - ));
+DATA(insert OID = 1920 ( "+" PGNSP PGUID l f f 0 701 701 0 0 float8up - - "---"));
DESCR("unary plus");
-DATA(insert OID = 1921 ( "+" PGNSP PGUID l f f 0 1700 1700 0 0 numeric_uplus - - ));
+DATA(insert OID = 1921 ( "+" PGNSP PGUID l f f 0 1700 1700 0 0 numeric_uplus - - "---"));
DESCR("unary plus");
/* bytea operators */
-DATA(insert OID = 1955 ( "=" PGNSP PGUID b t t 17 17 16 1955 1956 byteaeq eqsel eqjoinsel ));
+DATA(insert OID = 1955 ( "=" PGNSP PGUID b t t 17 17 16 1955 1956 byteaeq eqsel eqjoinsel "mhf"));
DESCR("equal");
-DATA(insert OID = 1956 ( "<>" PGNSP PGUID b f f 17 17 16 1956 1955 byteane neqsel neqjoinsel ));
+DATA(insert OID = 1956 ( "<>" PGNSP PGUID b f f 17 17 16 1956 1955 byteane neqsel neqjoinsel "mhf"));
DESCR("not equal");
-DATA(insert OID = 1957 ( "<" PGNSP PGUID b f f 17 17 16 1959 1960 bytealt scalarltsel scalarltjoinsel ));
+DATA(insert OID = 1957 ( "<" PGNSP PGUID b f f 17 17 16 1959 1960 bytealt scalarltsel scalarltjoinsel "mh-"));
DESCR("less than");
-DATA(insert OID = 1958 ( "<=" PGNSP PGUID b f f 17 17 16 1960 1959 byteale scalarltsel scalarltjoinsel ));
+DATA(insert OID = 1958 ( "<=" PGNSP PGUID b f f 17 17 16 1960 1959 byteale scalarltsel scalarltjoinsel "mh-"));
DESCR("less than or equal");
-DATA(insert OID = 1959 ( ">" PGNSP PGUID b f f 17 17 16 1957 1958 byteagt scalargtsel scalargtjoinsel ));
+DATA(insert OID = 1959 ( ">" PGNSP PGUID b f f 17 17 16 1957 1958 byteagt scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than");
-DATA(insert OID = 1960 ( ">=" PGNSP PGUID b f f 17 17 16 1958 1957 byteage scalargtsel scalargtjoinsel ));
+DATA(insert OID = 1960 ( ">=" PGNSP PGUID b f f 17 17 16 1958 1957 byteage scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than or equal");
-DATA(insert OID = 2016 ( "~~" PGNSP PGUID b f f 17 17 16 0 2017 bytealike likesel likejoinsel ));
+DATA(insert OID = 2016 ( "~~" PGNSP PGUID b f f 17 17 16 0 2017 bytealike likesel likejoinsel "---"));
DESCR("matches LIKE expression");
#define OID_BYTEA_LIKE_OP 2016
-DATA(insert OID = 2017 ( "!~~" PGNSP PGUID b f f 17 17 16 0 2016 byteanlike nlikesel nlikejoinsel ));
+DATA(insert OID = 2017 ( "!~~" PGNSP PGUID b f f 17 17 16 0 2016 byteanlike nlikesel nlikejoinsel "---"));
DESCR("does not match LIKE expression");
-DATA(insert OID = 2018 ( "||" PGNSP PGUID b f f 17 17 17 0 0 byteacat - - ));
+DATA(insert OID = 2018 ( "||" PGNSP PGUID b f f 17 17 17 0 0 byteacat - - "---"));
DESCR("concatenate");
/* timestamp operators */
-DATA(insert OID = 2060 ( "=" PGNSP PGUID b t t 1114 1114 16 2060 2061 timestamp_eq eqsel eqjoinsel ));
+DATA(insert OID = 2060 ( "=" PGNSP PGUID b t t 1114 1114 16 2060 2061 timestamp_eq eqsel eqjoinsel "mhf"));
DESCR("equal");
-DATA(insert OID = 2061 ( "<>" PGNSP PGUID b f f 1114 1114 16 2061 2060 timestamp_ne neqsel neqjoinsel ));
+DATA(insert OID = 2061 ( "<>" PGNSP PGUID b f f 1114 1114 16 2061 2060 timestamp_ne neqsel neqjoinsel "mhf"));
DESCR("not equal");
-DATA(insert OID = 2062 ( "<" PGNSP PGUID b f f 1114 1114 16 2064 2065 timestamp_lt scalarltsel scalarltjoinsel ));
+DATA(insert OID = 2062 ( "<" PGNSP PGUID b f f 1114 1114 16 2064 2065 timestamp_lt scalarltsel scalarltjoinsel "mh-"));
DESCR("less than");
-DATA(insert OID = 2063 ( "<=" PGNSP PGUID b f f 1114 1114 16 2065 2064 timestamp_le scalarltsel scalarltjoinsel ));
+DATA(insert OID = 2063 ( "<=" PGNSP PGUID b f f 1114 1114 16 2065 2064 timestamp_le scalarltsel scalarltjoinsel "mh-"));
DESCR("less than or equal");
-DATA(insert OID = 2064 ( ">" PGNSP PGUID b f f 1114 1114 16 2062 2063 timestamp_gt scalargtsel scalargtjoinsel ));
+DATA(insert OID = 2064 ( ">" PGNSP PGUID b f f 1114 1114 16 2062 2063 timestamp_gt scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than");
-DATA(insert OID = 2065 ( ">=" PGNSP PGUID b f f 1114 1114 16 2063 2062 timestamp_ge scalargtsel scalargtjoinsel ));
+DATA(insert OID = 2065 ( ">=" PGNSP PGUID b f f 1114 1114 16 2063 2062 timestamp_ge scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than or equal");
-DATA(insert OID = 2066 ( "+" PGNSP PGUID b f f 1114 1186 1114 2553 0 timestamp_pl_interval - - ));
+DATA(insert OID = 2066 ( "+" PGNSP PGUID b f f 1114 1186 1114 2553 0 timestamp_pl_interval - - "---"));
DESCR("add");
-DATA(insert OID = 2067 ( "-" PGNSP PGUID b f f 1114 1114 1186 0 0 timestamp_mi - - ));
+DATA(insert OID = 2067 ( "-" PGNSP PGUID b f f 1114 1114 1186 0 0 timestamp_mi - - "---"));
DESCR("subtract");
-DATA(insert OID = 2068 ( "-" PGNSP PGUID b f f 1114 1186 1114 0 0 timestamp_mi_interval - - ));
+DATA(insert OID = 2068 ( "-" PGNSP PGUID b f f 1114 1186 1114 0 0 timestamp_mi_interval - - "---"));
DESCR("subtract");
/* character-by-character (not collation order) comparison operators for character types */
-DATA(insert OID = 2314 ( "~<~" PGNSP PGUID b f f 25 25 16 2318 2317 text_pattern_lt scalarltsel scalarltjoinsel ));
+DATA(insert OID = 2314 ( "~<~" PGNSP PGUID b f f 25 25 16 2318 2317 text_pattern_lt scalarltsel scalarltjoinsel "mh-"));
DESCR("less than");
-DATA(insert OID = 2315 ( "~<=~" PGNSP PGUID b f f 25 25 16 2317 2318 text_pattern_le scalarltsel scalarltjoinsel ));
+DATA(insert OID = 2315 ( "~<=~" PGNSP PGUID b f f 25 25 16 2317 2318 text_pattern_le scalarltsel scalarltjoinsel "mh-"));
DESCR("less than or equal");
-DATA(insert OID = 2317 ( "~>=~" PGNSP PGUID b f f 25 25 16 2315 2314 text_pattern_ge scalargtsel scalargtjoinsel ));
+DATA(insert OID = 2317 ( "~>=~" PGNSP PGUID b f f 25 25 16 2315 2314 text_pattern_ge scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than or equal");
-DATA(insert OID = 2318 ( "~>~" PGNSP PGUID b f f 25 25 16 2314 2315 text_pattern_gt scalargtsel scalargtjoinsel ));
+DATA(insert OID = 2318 ( "~>~" PGNSP PGUID b f f 25 25 16 2314 2315 text_pattern_gt scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than");
-DATA(insert OID = 2326 ( "~<~" PGNSP PGUID b f f 1042 1042 16 2330 2329 bpchar_pattern_lt scalarltsel scalarltjoinsel ));
+DATA(insert OID = 2326 ( "~<~" PGNSP PGUID b f f 1042 1042 16 2330 2329 bpchar_pattern_lt scalarltsel scalarltjoinsel "mh-"));
DESCR("less than");
-DATA(insert OID = 2327 ( "~<=~" PGNSP PGUID b f f 1042 1042 16 2329 2330 bpchar_pattern_le scalarltsel scalarltjoinsel ));
+DATA(insert OID = 2327 ( "~<=~" PGNSP PGUID b f f 1042 1042 16 2329 2330 bpchar_pattern_le scalarltsel scalarltjoinsel "mh-"));
DESCR("less than or equal");
-DATA(insert OID = 2329 ( "~>=~" PGNSP PGUID b f f 1042 1042 16 2327 2326 bpchar_pattern_ge scalargtsel scalargtjoinsel ));
+DATA(insert OID = 2329 ( "~>=~" PGNSP PGUID b f f 1042 1042 16 2327 2326 bpchar_pattern_ge scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than or equal");
-DATA(insert OID = 2330 ( "~>~" PGNSP PGUID b f f 1042 1042 16 2326 2327 bpchar_pattern_gt scalargtsel scalargtjoinsel ));
+DATA(insert OID = 2330 ( "~>~" PGNSP PGUID b f f 1042 1042 16 2326 2327 bpchar_pattern_gt scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than");
/* crosstype operations for date vs. timestamp and timestamptz */
-DATA(insert OID = 2345 ( "<" PGNSP PGUID b f f 1082 1114 16 2375 2348 date_lt_timestamp scalarltsel scalarltjoinsel ));
+DATA(insert OID = 2345 ( "<" PGNSP PGUID b f f 1082 1114 16 2375 2348 date_lt_timestamp scalarltsel scalarltjoinsel "mh-"));
DESCR("less than");
-DATA(insert OID = 2346 ( "<=" PGNSP PGUID b f f 1082 1114 16 2374 2349 date_le_timestamp scalarltsel scalarltjoinsel ));
+DATA(insert OID = 2346 ( "<=" PGNSP PGUID b f f 1082 1114 16 2374 2349 date_le_timestamp scalarltsel scalarltjoinsel "mh-"));
DESCR("less than or equal");
-DATA(insert OID = 2347 ( "=" PGNSP PGUID b t f 1082 1114 16 2373 2350 date_eq_timestamp eqsel eqjoinsel ));
+DATA(insert OID = 2347 ( "=" PGNSP PGUID b t f 1082 1114 16 2373 2350 date_eq_timestamp eqsel eqjoinsel "mhf"));
DESCR("equal");
-DATA(insert OID = 2348 ( ">=" PGNSP PGUID b f f 1082 1114 16 2372 2345 date_ge_timestamp scalargtsel scalargtjoinsel ));
+DATA(insert OID = 2348 ( ">=" PGNSP PGUID b f f 1082 1114 16 2372 2345 date_ge_timestamp scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than or equal");
-DATA(insert OID = 2349 ( ">" PGNSP PGUID b f f 1082 1114 16 2371 2346 date_gt_timestamp scalargtsel scalargtjoinsel ));
+DATA(insert OID = 2349 ( ">" PGNSP PGUID b f f 1082 1114 16 2371 2346 date_gt_timestamp scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than");
-DATA(insert OID = 2350 ( "<>" PGNSP PGUID b f f 1082 1114 16 2376 2347 date_ne_timestamp neqsel neqjoinsel ));
+DATA(insert OID = 2350 ( "<>" PGNSP PGUID b f f 1082 1114 16 2376 2347 date_ne_timestamp neqsel neqjoinsel "mhf"));
DESCR("not equal");
-DATA(insert OID = 2358 ( "<" PGNSP PGUID b f f 1082 1184 16 2388 2361 date_lt_timestamptz scalarltsel scalarltjoinsel ));
+DATA(insert OID = 2358 ( "<" PGNSP PGUID b f f 1082 1184 16 2388 2361 date_lt_timestamptz scalarltsel scalarltjoinsel "mh-"));
DESCR("less than");
-DATA(insert OID = 2359 ( "<=" PGNSP PGUID b f f 1082 1184 16 2387 2362 date_le_timestamptz scalarltsel scalarltjoinsel ));
+DATA(insert OID = 2359 ( "<=" PGNSP PGUID b f f 1082 1184 16 2387 2362 date_le_timestamptz scalarltsel scalarltjoinsel "mh-"));
DESCR("less than or equal");
-DATA(insert OID = 2360 ( "=" PGNSP PGUID b t f 1082 1184 16 2386 2363 date_eq_timestamptz eqsel eqjoinsel ));
+DATA(insert OID = 2360 ( "=" PGNSP PGUID b t f 1082 1184 16 2386 2363 date_eq_timestamptz eqsel eqjoinsel "mhf"));
DESCR("equal");
-DATA(insert OID = 2361 ( ">=" PGNSP PGUID b f f 1082 1184 16 2385 2358 date_ge_timestamptz scalargtsel scalargtjoinsel ));
+DATA(insert OID = 2361 ( ">=" PGNSP PGUID b f f 1082 1184 16 2385 2358 date_ge_timestamptz scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than or equal");
-DATA(insert OID = 2362 ( ">" PGNSP PGUID b f f 1082 1184 16 2384 2359 date_gt_timestamptz scalargtsel scalargtjoinsel ));
+DATA(insert OID = 2362 ( ">" PGNSP PGUID b f f 1082 1184 16 2384 2359 date_gt_timestamptz scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than");
-DATA(insert OID = 2363 ( "<>" PGNSP PGUID b f f 1082 1184 16 2389 2360 date_ne_timestamptz neqsel neqjoinsel ));
+DATA(insert OID = 2363 ( "<>" PGNSP PGUID b f f 1082 1184 16 2389 2360 date_ne_timestamptz neqsel neqjoinsel "mhf"));
DESCR("not equal");
-DATA(insert OID = 2371 ( "<" PGNSP PGUID b f f 1114 1082 16 2349 2374 timestamp_lt_date scalarltsel scalarltjoinsel ));
+DATA(insert OID = 2371 ( "<" PGNSP PGUID b f f 1114 1082 16 2349 2374 timestamp_lt_date scalarltsel scalarltjoinsel "mh-"));
DESCR("less than");
-DATA(insert OID = 2372 ( "<=" PGNSP PGUID b f f 1114 1082 16 2348 2375 timestamp_le_date scalarltsel scalarltjoinsel ));
+DATA(insert OID = 2372 ( "<=" PGNSP PGUID b f f 1114 1082 16 2348 2375 timestamp_le_date scalarltsel scalarltjoinsel "mh-"));
DESCR("less than or equal");
-DATA(insert OID = 2373 ( "=" PGNSP PGUID b t f 1114 1082 16 2347 2376 timestamp_eq_date eqsel eqjoinsel ));
+DATA(insert OID = 2373 ( "=" PGNSP PGUID b t f 1114 1082 16 2347 2376 timestamp_eq_date eqsel eqjoinsel "mhf"));
DESCR("equal");
-DATA(insert OID = 2374 ( ">=" PGNSP PGUID b f f 1114 1082 16 2346 2371 timestamp_ge_date scalargtsel scalargtjoinsel ));
+DATA(insert OID = 2374 ( ">=" PGNSP PGUID b f f 1114 1082 16 2346 2371 timestamp_ge_date scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than or equal");
-DATA(insert OID = 2375 ( ">" PGNSP PGUID b f f 1114 1082 16 2345 2372 timestamp_gt_date scalargtsel scalargtjoinsel ));
+DATA(insert OID = 2375 ( ">" PGNSP PGUID b f f 1114 1082 16 2345 2372 timestamp_gt_date scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than");
-DATA(insert OID = 2376 ( "<>" PGNSP PGUID b f f 1114 1082 16 2350 2373 timestamp_ne_date neqsel neqjoinsel ));
+DATA(insert OID = 2376 ( "<>" PGNSP PGUID b f f 1114 1082 16 2350 2373 timestamp_ne_date neqsel neqjoinsel "mhf"));
DESCR("not equal");
-DATA(insert OID = 2384 ( "<" PGNSP PGUID b f f 1184 1082 16 2362 2387 timestamptz_lt_date scalarltsel scalarltjoinsel ));
+DATA(insert OID = 2384 ( "<" PGNSP PGUID b f f 1184 1082 16 2362 2387 timestamptz_lt_date scalarltsel scalarltjoinsel "mh-"));
DESCR("less than");
-DATA(insert OID = 2385 ( "<=" PGNSP PGUID b f f 1184 1082 16 2361 2388 timestamptz_le_date scalarltsel scalarltjoinsel ));
+DATA(insert OID = 2385 ( "<=" PGNSP PGUID b f f 1184 1082 16 2361 2388 timestamptz_le_date scalarltsel scalarltjoinsel "mh-"));
DESCR("less than or equal");
-DATA(insert OID = 2386 ( "=" PGNSP PGUID b t f 1184 1082 16 2360 2389 timestamptz_eq_date eqsel eqjoinsel ));
+DATA(insert OID = 2386 ( "=" PGNSP PGUID b t f 1184 1082 16 2360 2389 timestamptz_eq_date eqsel eqjoinsel "mhf"));
DESCR("equal");
-DATA(insert OID = 2387 ( ">=" PGNSP PGUID b f f 1184 1082 16 2359 2384 timestamptz_ge_date scalargtsel scalargtjoinsel ));
+DATA(insert OID = 2387 ( ">=" PGNSP PGUID b f f 1184 1082 16 2359 2384 timestamptz_ge_date scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than or equal");
-DATA(insert OID = 2388 ( ">" PGNSP PGUID b f f 1184 1082 16 2358 2385 timestamptz_gt_date scalargtsel scalargtjoinsel ));
+DATA(insert OID = 2388 ( ">" PGNSP PGUID b f f 1184 1082 16 2358 2385 timestamptz_gt_date scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than");
-DATA(insert OID = 2389 ( "<>" PGNSP PGUID b f f 1184 1082 16 2363 2386 timestamptz_ne_date neqsel neqjoinsel ));
+DATA(insert OID = 2389 ( "<>" PGNSP PGUID b f f 1184 1082 16 2363 2386 timestamptz_ne_date neqsel neqjoinsel "mhf"));
DESCR("not equal");
/* crosstype operations for timestamp vs. timestamptz */
-DATA(insert OID = 2534 ( "<" PGNSP PGUID b f f 1114 1184 16 2544 2537 timestamp_lt_timestamptz scalarltsel scalarltjoinsel ));
+DATA(insert OID = 2534 ( "<" PGNSP PGUID b f f 1114 1184 16 2544 2537 timestamp_lt_timestamptz scalarltsel scalarltjoinsel "mh-"));
DESCR("less than");
-DATA(insert OID = 2535 ( "<=" PGNSP PGUID b f f 1114 1184 16 2543 2538 timestamp_le_timestamptz scalarltsel scalarltjoinsel ));
+DATA(insert OID = 2535 ( "<=" PGNSP PGUID b f f 1114 1184 16 2543 2538 timestamp_le_timestamptz scalarltsel scalarltjoinsel "mh-"));
DESCR("less than or equal");
-DATA(insert OID = 2536 ( "=" PGNSP PGUID b t f 1114 1184 16 2542 2539 timestamp_eq_timestamptz eqsel eqjoinsel ));
+DATA(insert OID = 2536 ( "=" PGNSP PGUID b t f 1114 1184 16 2542 2539 timestamp_eq_timestamptz eqsel eqjoinsel "mhf"));
DESCR("equal");
-DATA(insert OID = 2537 ( ">=" PGNSP PGUID b f f 1114 1184 16 2541 2534 timestamp_ge_timestamptz scalargtsel scalargtjoinsel ));
+DATA(insert OID = 2537 ( ">=" PGNSP PGUID b f f 1114 1184 16 2541 2534 timestamp_ge_timestamptz scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than or equal");
-DATA(insert OID = 2538 ( ">" PGNSP PGUID b f f 1114 1184 16 2540 2535 timestamp_gt_timestamptz scalargtsel scalargtjoinsel ));
+DATA(insert OID = 2538 ( ">" PGNSP PGUID b f f 1114 1184 16 2540 2535 timestamp_gt_timestamptz scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than");
-DATA(insert OID = 2539 ( "<>" PGNSP PGUID b f f 1114 1184 16 2545 2536 timestamp_ne_timestamptz neqsel neqjoinsel ));
+DATA(insert OID = 2539 ( "<>" PGNSP PGUID b f f 1114 1184 16 2545 2536 timestamp_ne_timestamptz neqsel neqjoinsel "mhf"));
DESCR("not equal");
-DATA(insert OID = 2540 ( "<" PGNSP PGUID b f f 1184 1114 16 2538 2543 timestamptz_lt_timestamp scalarltsel scalarltjoinsel ));
+DATA(insert OID = 2540 ( "<" PGNSP PGUID b f f 1184 1114 16 2538 2543 timestamptz_lt_timestamp scalarltsel scalarltjoinsel "mh-"));
DESCR("less than");
-DATA(insert OID = 2541 ( "<=" PGNSP PGUID b f f 1184 1114 16 2537 2544 timestamptz_le_timestamp scalarltsel scalarltjoinsel ));
+DATA(insert OID = 2541 ( "<=" PGNSP PGUID b f f 1184 1114 16 2537 2544 timestamptz_le_timestamp scalarltsel scalarltjoinsel "mh-"));
DESCR("less than or equal");
-DATA(insert OID = 2542 ( "=" PGNSP PGUID b t f 1184 1114 16 2536 2545 timestamptz_eq_timestamp eqsel eqjoinsel ));
+DATA(insert OID = 2542 ( "=" PGNSP PGUID b t f 1184 1114 16 2536 2545 timestamptz_eq_timestamp eqsel eqjoinsel "mhf"));
DESCR("equal");
-DATA(insert OID = 2543 ( ">=" PGNSP PGUID b f f 1184 1114 16 2535 2540 timestamptz_ge_timestamp scalargtsel scalargtjoinsel ));
+DATA(insert OID = 2543 ( ">=" PGNSP PGUID b f f 1184 1114 16 2535 2540 timestamptz_ge_timestamp scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than or equal");
-DATA(insert OID = 2544 ( ">" PGNSP PGUID b f f 1184 1114 16 2534 2541 timestamptz_gt_timestamp scalargtsel scalargtjoinsel ));
+DATA(insert OID = 2544 ( ">" PGNSP PGUID b f f 1184 1114 16 2534 2541 timestamptz_gt_timestamp scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than");
-DATA(insert OID = 2545 ( "<>" PGNSP PGUID b f f 1184 1114 16 2539 2542 timestamptz_ne_timestamp neqsel neqjoinsel ));
+DATA(insert OID = 2545 ( "<>" PGNSP PGUID b f f 1184 1114 16 2539 2542 timestamptz_ne_timestamp neqsel neqjoinsel "mhf"));
DESCR("not equal");
/* formerly-missing interval + datetime operators */
-DATA(insert OID = 2551 ( "+" PGNSP PGUID b f f 1186 1082 1114 1076 0 interval_pl_date - - ));
+DATA(insert OID = 2551 ( "+" PGNSP PGUID b f f 1186 1082 1114 1076 0 interval_pl_date - - "---"));
DESCR("add");
-DATA(insert OID = 2552 ( "+" PGNSP PGUID b f f 1186 1266 1266 1802 0 interval_pl_timetz - - ));
+DATA(insert OID = 2552 ( "+" PGNSP PGUID b f f 1186 1266 1266 1802 0 interval_pl_timetz - - "---"));
DESCR("add");
-DATA(insert OID = 2553 ( "+" PGNSP PGUID b f f 1186 1114 1114 2066 0 interval_pl_timestamp - - ));
+DATA(insert OID = 2553 ( "+" PGNSP PGUID b f f 1186 1114 1114 2066 0 interval_pl_timestamp - - "---"));
DESCR("add");
-DATA(insert OID = 2554 ( "+" PGNSP PGUID b f f 1186 1184 1184 1327 0 interval_pl_timestamptz - - ));
+DATA(insert OID = 2554 ( "+" PGNSP PGUID b f f 1186 1184 1184 1327 0 interval_pl_timestamptz - - "---"));
DESCR("add");
-DATA(insert OID = 2555 ( "+" PGNSP PGUID b f f 23 1082 1082 1100 0 integer_pl_date - - ));
+DATA(insert OID = 2555 ( "+" PGNSP PGUID b f f 23 1082 1082 1100 0 integer_pl_date - - "---"));
DESCR("add");
/* new operators for Y-direction rtree opfamilies */
-DATA(insert OID = 2570 ( "<<|" PGNSP PGUID b f f 603 603 16 0 0 box_below positionsel positionjoinsel ));
+DATA(insert OID = 2570 ( "<<|" PGNSP PGUID b f f 603 603 16 0 0 box_below positionsel positionjoinsel "---"));
DESCR("is below");
-DATA(insert OID = 2571 ( "&<|" PGNSP PGUID b f f 603 603 16 0 0 box_overbelow positionsel positionjoinsel ));
+DATA(insert OID = 2571 ( "&<|" PGNSP PGUID b f f 603 603 16 0 0 box_overbelow positionsel positionjoinsel "---"));
DESCR("overlaps or is below");
-DATA(insert OID = 2572 ( "|&>" PGNSP PGUID b f f 603 603 16 0 0 box_overabove positionsel positionjoinsel ));
+DATA(insert OID = 2572 ( "|&>" PGNSP PGUID b f f 603 603 16 0 0 box_overabove positionsel positionjoinsel "---"));
DESCR("overlaps or is above");
-DATA(insert OID = 2573 ( "|>>" PGNSP PGUID b f f 603 603 16 0 0 box_above positionsel positionjoinsel ));
+DATA(insert OID = 2573 ( "|>>" PGNSP PGUID b f f 603 603 16 0 0 box_above positionsel positionjoinsel "---"));
DESCR("is above");
-DATA(insert OID = 2574 ( "<<|" PGNSP PGUID b f f 604 604 16 0 0 poly_below positionsel positionjoinsel ));
+DATA(insert OID = 2574 ( "<<|" PGNSP PGUID b f f 604 604 16 0 0 poly_below positionsel positionjoinsel "---"));
DESCR("is below");
-DATA(insert OID = 2575 ( "&<|" PGNSP PGUID b f f 604 604 16 0 0 poly_overbelow positionsel positionjoinsel ));
+DATA(insert OID = 2575 ( "&<|" PGNSP PGUID b f f 604 604 16 0 0 poly_overbelow positionsel positionjoinsel "---"));
DESCR("overlaps or is below");
-DATA(insert OID = 2576 ( "|&>" PGNSP PGUID b f f 604 604 16 0 0 poly_overabove positionsel positionjoinsel ));
+DATA(insert OID = 2576 ( "|&>" PGNSP PGUID b f f 604 604 16 0 0 poly_overabove positionsel positionjoinsel "---"));
DESCR("overlaps or is above");
-DATA(insert OID = 2577 ( "|>>" PGNSP PGUID b f f 604 604 16 0 0 poly_above positionsel positionjoinsel ));
+DATA(insert OID = 2577 ( "|>>" PGNSP PGUID b f f 604 604 16 0 0 poly_above positionsel positionjoinsel "---"));
DESCR("is above");
-DATA(insert OID = 2589 ( "&<|" PGNSP PGUID b f f 718 718 16 0 0 circle_overbelow positionsel positionjoinsel ));
+DATA(insert OID = 2589 ( "&<|" PGNSP PGUID b f f 718 718 16 0 0 circle_overbelow positionsel positionjoinsel "---"));
DESCR("overlaps or is below");
-DATA(insert OID = 2590 ( "|&>" PGNSP PGUID b f f 718 718 16 0 0 circle_overabove positionsel positionjoinsel ));
+DATA(insert OID = 2590 ( "|&>" PGNSP PGUID b f f 718 718 16 0 0 circle_overabove positionsel positionjoinsel "---"));
DESCR("overlaps or is above");
/* overlap/contains/contained for arrays */
-DATA(insert OID = 2750 ( "&&" PGNSP PGUID b f f 2277 2277 16 2750 0 arrayoverlap arraycontsel arraycontjoinsel ));
+DATA(insert OID = 2750 ( "&&" PGNSP PGUID b f f 2277 2277 16 2750 0 arrayoverlap arraycontsel arraycontjoinsel "---"));
DESCR("overlaps");
#define OID_ARRAY_OVERLAP_OP 2750
-DATA(insert OID = 2751 ( "@>" PGNSP PGUID b f f 2277 2277 16 2752 0 arraycontains arraycontsel arraycontjoinsel ));
+DATA(insert OID = 2751 ( "@>" PGNSP PGUID b f f 2277 2277 16 2752 0 arraycontains arraycontsel arraycontjoinsel "---"));
DESCR("contains");
#define OID_ARRAY_CONTAINS_OP 2751
-DATA(insert OID = 2752 ( "<@" PGNSP PGUID b f f 2277 2277 16 2751 0 arraycontained arraycontsel arraycontjoinsel ));
+DATA(insert OID = 2752 ( "<@" PGNSP PGUID b f f 2277 2277 16 2751 0 arraycontained arraycontsel arraycontjoinsel "---"));
DESCR("is contained by");
#define OID_ARRAY_CONTAINED_OP 2752
/* capturing operators to preserve pre-8.3 behavior of text concatenation */
-DATA(insert OID = 2779 ( "||" PGNSP PGUID b f f 25 2776 25 0 0 textanycat - - ));
+DATA(insert OID = 2779 ( "||" PGNSP PGUID b f f 25 2776 25 0 0 textanycat - - "---"));
DESCR("concatenate");
-DATA(insert OID = 2780 ( "||" PGNSP PGUID b f f 2776 25 25 0 0 anytextcat - - ));
+DATA(insert OID = 2780 ( "||" PGNSP PGUID b f f 2776 25 25 0 0 anytextcat - - "---"));
DESCR("concatenate");
/* obsolete names for contains/contained-by operators; remove these someday */
-DATA(insert OID = 2860 ( "@" PGNSP PGUID b f f 604 604 16 2861 0 poly_contained contsel contjoinsel ));
+DATA(insert OID = 2860 ( "@" PGNSP PGUID b f f 604 604 16 2861 0 poly_contained contsel contjoinsel "---"));
DESCR("deprecated, use <@ instead");
-DATA(insert OID = 2861 ( "~" PGNSP PGUID b f f 604 604 16 2860 0 poly_contain contsel contjoinsel ));
+DATA(insert OID = 2861 ( "~" PGNSP PGUID b f f 604 604 16 2860 0 poly_contain contsel contjoinsel "---"));
DESCR("deprecated, use @> instead");
-DATA(insert OID = 2862 ( "@" PGNSP PGUID b f f 603 603 16 2863 0 box_contained contsel contjoinsel ));
+DATA(insert OID = 2862 ( "@" PGNSP PGUID b f f 603 603 16 2863 0 box_contained contsel contjoinsel "---"));
DESCR("deprecated, use <@ instead");
-DATA(insert OID = 2863 ( "~" PGNSP PGUID b f f 603 603 16 2862 0 box_contain contsel contjoinsel ));
+DATA(insert OID = 2863 ( "~" PGNSP PGUID b f f 603 603 16 2862 0 box_contain contsel contjoinsel "---"));
DESCR("deprecated, use @> instead");
-DATA(insert OID = 2864 ( "@" PGNSP PGUID b f f 718 718 16 2865 0 circle_contained contsel contjoinsel ));
+DATA(insert OID = 2864 ( "@" PGNSP PGUID b f f 718 718 16 2865 0 circle_contained contsel contjoinsel "---"));
DESCR("deprecated, use <@ instead");
-DATA(insert OID = 2865 ( "~" PGNSP PGUID b f f 718 718 16 2864 0 circle_contain contsel contjoinsel ));
+DATA(insert OID = 2865 ( "~" PGNSP PGUID b f f 718 718 16 2864 0 circle_contain contsel contjoinsel "---"));
DESCR("deprecated, use @> instead");
-DATA(insert OID = 2866 ( "@" PGNSP PGUID b f f 600 603 16 0 0 on_pb - - ));
+DATA(insert OID = 2866 ( "@" PGNSP PGUID b f f 600 603 16 0 0 on_pb - - "---"));
DESCR("deprecated, use <@ instead");
-DATA(insert OID = 2867 ( "@" PGNSP PGUID b f f 600 602 16 2868 0 on_ppath - - ));
+DATA(insert OID = 2867 ( "@" PGNSP PGUID b f f 600 602 16 2868 0 on_ppath - - "---"));
DESCR("deprecated, use <@ instead");
-DATA(insert OID = 2868 ( "~" PGNSP PGUID b f f 602 600 16 2867 0 path_contain_pt - - ));
+DATA(insert OID = 2868 ( "~" PGNSP PGUID b f f 602 600 16 2867 0 path_contain_pt - - "---"));
DESCR("deprecated, use @> instead");
-DATA(insert OID = 2869 ( "@" PGNSP PGUID b f f 600 604 16 2870 0 pt_contained_poly - - ));
+DATA(insert OID = 2869 ( "@" PGNSP PGUID b f f 600 604 16 2870 0 pt_contained_poly - - "---"));
DESCR("deprecated, use <@ instead");
-DATA(insert OID = 2870 ( "~" PGNSP PGUID b f f 604 600 16 2869 0 poly_contain_pt - - ));
+DATA(insert OID = 2870 ( "~" PGNSP PGUID b f f 604 600 16 2869 0 poly_contain_pt - - "---"));
DESCR("deprecated, use @> instead");
-DATA(insert OID = 2871 ( "@" PGNSP PGUID b f f 600 718 16 2872 0 pt_contained_circle - - ));
+DATA(insert OID = 2871 ( "@" PGNSP PGUID b f f 600 718 16 2872 0 pt_contained_circle - - "---"));
DESCR("deprecated, use <@ instead");
-DATA(insert OID = 2872 ( "~" PGNSP PGUID b f f 718 600 16 2871 0 circle_contain_pt - - ));
+DATA(insert OID = 2872 ( "~" PGNSP PGUID b f f 718 600 16 2871 0 circle_contain_pt - - "---"));
DESCR("deprecated, use @> instead");
-DATA(insert OID = 2873 ( "@" PGNSP PGUID b f f 600 628 16 0 0 on_pl - - ));
+DATA(insert OID = 2873 ( "@" PGNSP PGUID b f f 600 628 16 0 0 on_pl - - "---"));
DESCR("deprecated, use <@ instead");
-DATA(insert OID = 2874 ( "@" PGNSP PGUID b f f 600 601 16 0 0 on_ps - - ));
+DATA(insert OID = 2874 ( "@" PGNSP PGUID b f f 600 601 16 0 0 on_ps - - "---"));
DESCR("deprecated, use <@ instead");
-DATA(insert OID = 2875 ( "@" PGNSP PGUID b f f 601 628 16 0 0 on_sl - - ));
+DATA(insert OID = 2875 ( "@" PGNSP PGUID b f f 601 628 16 0 0 on_sl - - "---"));
DESCR("deprecated, use <@ instead");
-DATA(insert OID = 2876 ( "@" PGNSP PGUID b f f 601 603 16 0 0 on_sb - - ));
+DATA(insert OID = 2876 ( "@" PGNSP PGUID b f f 601 603 16 0 0 on_sb - - "---"));
DESCR("deprecated, use <@ instead");
-DATA(insert OID = 2877 ( "~" PGNSP PGUID b f f 1034 1033 16 0 0 aclcontains - - ));
+DATA(insert OID = 2877 ( "~" PGNSP PGUID b f f 1034 1033 16 0 0 aclcontains - - "---"));
DESCR("deprecated, use @> instead");
/* uuid operators */
-DATA(insert OID = 2972 ( "=" PGNSP PGUID b t t 2950 2950 16 2972 2973 uuid_eq eqsel eqjoinsel ));
+DATA(insert OID = 2972 ( "=" PGNSP PGUID b t t 2950 2950 16 2972 2973 uuid_eq eqsel eqjoinsel "mhf"));
DESCR("equal");
-DATA(insert OID = 2973 ( "<>" PGNSP PGUID b f f 2950 2950 16 2973 2972 uuid_ne neqsel neqjoinsel ));
+DATA(insert OID = 2973 ( "<>" PGNSP PGUID b f f 2950 2950 16 2973 2972 uuid_ne neqsel neqjoinsel "mhf"));
DESCR("not equal");
-DATA(insert OID = 2974 ( "<" PGNSP PGUID b f f 2950 2950 16 2975 2977 uuid_lt scalarltsel scalarltjoinsel ));
+DATA(insert OID = 2974 ( "<" PGNSP PGUID b f f 2950 2950 16 2975 2977 uuid_lt scalarltsel scalarltjoinsel "mh-"));
DESCR("less than");
-DATA(insert OID = 2975 ( ">" PGNSP PGUID b f f 2950 2950 16 2974 2976 uuid_gt scalargtsel scalargtjoinsel ));
+DATA(insert OID = 2975 ( ">" PGNSP PGUID b f f 2950 2950 16 2974 2976 uuid_gt scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than");
-DATA(insert OID = 2976 ( "<=" PGNSP PGUID b f f 2950 2950 16 2977 2975 uuid_le scalarltsel scalarltjoinsel ));
+DATA(insert OID = 2976 ( "<=" PGNSP PGUID b f f 2950 2950 16 2977 2975 uuid_le scalarltsel scalarltjoinsel "mh-"));
DESCR("less than or equal");
-DATA(insert OID = 2977 ( ">=" PGNSP PGUID b f f 2950 2950 16 2976 2974 uuid_ge scalargtsel scalargtjoinsel ));
+DATA(insert OID = 2977 ( ">=" PGNSP PGUID b f f 2950 2950 16 2976 2974 uuid_ge scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than or equal");
/* pg_lsn operators */
-DATA(insert OID = 3222 ( "=" PGNSP PGUID b t t 3220 3220 16 3222 3223 pg_lsn_eq eqsel eqjoinsel ));
+DATA(insert OID = 3222 ( "=" PGNSP PGUID b t t 3220 3220 16 3222 3223 pg_lsn_eq eqsel eqjoinsel "mhf"));
DESCR("equal");
-DATA(insert OID = 3223 ( "<>" PGNSP PGUID b f f 3220 3220 16 3223 3222 pg_lsn_ne neqsel neqjoinsel ));
+DATA(insert OID = 3223 ( "<>" PGNSP PGUID b f f 3220 3220 16 3223 3222 pg_lsn_ne neqsel neqjoinsel "mhf"));
DESCR("not equal");
-DATA(insert OID = 3224 ( "<" PGNSP PGUID b f f 3220 3220 16 3225 3227 pg_lsn_lt scalarltsel scalarltjoinsel ));
+DATA(insert OID = 3224 ( "<" PGNSP PGUID b f f 3220 3220 16 3225 3227 pg_lsn_lt scalarltsel scalarltjoinsel "mh-"));
DESCR("less than");
-DATA(insert OID = 3225 ( ">" PGNSP PGUID b f f 3220 3220 16 3224 3226 pg_lsn_gt scalargtsel scalargtjoinsel ));
+DATA(insert OID = 3225 ( ">" PGNSP PGUID b f f 3220 3220 16 3224 3226 pg_lsn_gt scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than");
-DATA(insert OID = 3226 ( "<=" PGNSP PGUID b f f 3220 3220 16 3227 3225 pg_lsn_le scalarltsel scalarltjoinsel ));
+DATA(insert OID = 3226 ( "<=" PGNSP PGUID b f f 3220 3220 16 3227 3225 pg_lsn_le scalarltsel scalarltjoinsel "mh-"));
DESCR("less than or equal");
-DATA(insert OID = 3227 ( ">=" PGNSP PGUID b f f 3220 3220 16 3226 3224 pg_lsn_ge scalargtsel scalargtjoinsel ));
+DATA(insert OID = 3227 ( ">=" PGNSP PGUID b f f 3220 3220 16 3226 3224 pg_lsn_ge scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than or equal");
-DATA(insert OID = 3228 ( "-" PGNSP PGUID b f f 3220 3220 1700 0 0 pg_lsn_mi - - ));
+DATA(insert OID = 3228 ( "-" PGNSP PGUID b f f 3220 3220 1700 0 0 pg_lsn_mi - - "---"));
DESCR("minus");
/* enum operators */
-DATA(insert OID = 3516 ( "=" PGNSP PGUID b t t 3500 3500 16 3516 3517 enum_eq eqsel eqjoinsel ));
+DATA(insert OID = 3516 ( "=" PGNSP PGUID b t t 3500 3500 16 3516 3517 enum_eq eqsel eqjoinsel "mhf"));
DESCR("equal");
-DATA(insert OID = 3517 ( "<>" PGNSP PGUID b f f 3500 3500 16 3517 3516 enum_ne neqsel neqjoinsel ));
+DATA(insert OID = 3517 ( "<>" PGNSP PGUID b f f 3500 3500 16 3517 3516 enum_ne neqsel neqjoinsel "mhf"));
DESCR("not equal");
-DATA(insert OID = 3518 ( "<" PGNSP PGUID b f f 3500 3500 16 3519 3521 enum_lt scalarltsel scalarltjoinsel ));
+DATA(insert OID = 3518 ( "<" PGNSP PGUID b f f 3500 3500 16 3519 3521 enum_lt scalarltsel scalarltjoinsel "mh-"));
DESCR("less than");
-DATA(insert OID = 3519 ( ">" PGNSP PGUID b f f 3500 3500 16 3518 3520 enum_gt scalargtsel scalargtjoinsel ));
+DATA(insert OID = 3519 ( ">" PGNSP PGUID b f f 3500 3500 16 3518 3520 enum_gt scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than");
-DATA(insert OID = 3520 ( "<=" PGNSP PGUID b f f 3500 3500 16 3521 3519 enum_le scalarltsel scalarltjoinsel ));
+DATA(insert OID = 3520 ( "<=" PGNSP PGUID b f f 3500 3500 16 3521 3519 enum_le scalarltsel scalarltjoinsel "mh-"));
DESCR("less than or equal");
-DATA(insert OID = 3521 ( ">=" PGNSP PGUID b f f 3500 3500 16 3520 3518 enum_ge scalargtsel scalargtjoinsel ));
+DATA(insert OID = 3521 ( ">=" PGNSP PGUID b f f 3500 3500 16 3520 3518 enum_ge scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than or equal");
/*
* tsearch operations
*/
-DATA(insert OID = 3627 ( "<" PGNSP PGUID b f f 3614 3614 16 3632 3631 tsvector_lt scalarltsel scalarltjoinsel ));
+DATA(insert OID = 3627 ( "<" PGNSP PGUID b f f 3614 3614 16 3632 3631 tsvector_lt scalarltsel scalarltjoinsel "mh-"));
DESCR("less than");
-DATA(insert OID = 3628 ( "<=" PGNSP PGUID b f f 3614 3614 16 3631 3632 tsvector_le scalarltsel scalarltjoinsel ));
+DATA(insert OID = 3628 ( "<=" PGNSP PGUID b f f 3614 3614 16 3631 3632 tsvector_le scalarltsel scalarltjoinsel "mh-"));
DESCR("less than or equal");
-DATA(insert OID = 3629 ( "=" PGNSP PGUID b t f 3614 3614 16 3629 3630 tsvector_eq eqsel eqjoinsel ));
+DATA(insert OID = 3629 ( "=" PGNSP PGUID b t f 3614 3614 16 3629 3630 tsvector_eq eqsel eqjoinsel "mhf"));
DESCR("equal");
-DATA(insert OID = 3630 ( "<>" PGNSP PGUID b f f 3614 3614 16 3630 3629 tsvector_ne neqsel neqjoinsel ));
+DATA(insert OID = 3630 ( "<>" PGNSP PGUID b f f 3614 3614 16 3630 3629 tsvector_ne neqsel neqjoinsel "mhf"));
DESCR("not equal");
-DATA(insert OID = 3631 ( ">=" PGNSP PGUID b f f 3614 3614 16 3628 3627 tsvector_ge scalargtsel scalargtjoinsel ));
+DATA(insert OID = 3631 ( ">=" PGNSP PGUID b f f 3614 3614 16 3628 3627 tsvector_ge scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than or equal");
-DATA(insert OID = 3632 ( ">" PGNSP PGUID b f f 3614 3614 16 3627 3628 tsvector_gt scalargtsel scalargtjoinsel ));
+DATA(insert OID = 3632 ( ">" PGNSP PGUID b f f 3614 3614 16 3627 3628 tsvector_gt scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than");
-DATA(insert OID = 3633 ( "||" PGNSP PGUID b f f 3614 3614 3614 0 0 tsvector_concat - - ));
+DATA(insert OID = 3633 ( "||" PGNSP PGUID b f f 3614 3614 3614 0 0 tsvector_concat - - "---"));
DESCR("concatenate");
-DATA(insert OID = 3636 ( "@@" PGNSP PGUID b f f 3614 3615 16 3637 0 ts_match_vq tsmatchsel tsmatchjoinsel ));
+DATA(insert OID = 3636 ( "@@" PGNSP PGUID b f f 3614 3615 16 3637 0 ts_match_vq tsmatchsel tsmatchjoinsel "---"));
DESCR("text search match");
-DATA(insert OID = 3637 ( "@@" PGNSP PGUID b f f 3615 3614 16 3636 0 ts_match_qv tsmatchsel tsmatchjoinsel ));
+DATA(insert OID = 3637 ( "@@" PGNSP PGUID b f f 3615 3614 16 3636 0 ts_match_qv tsmatchsel tsmatchjoinsel "---"));
DESCR("text search match");
-DATA(insert OID = 3660 ( "@@@" PGNSP PGUID b f f 3614 3615 16 3661 0 ts_match_vq tsmatchsel tsmatchjoinsel ));
+DATA(insert OID = 3660 ( "@@@" PGNSP PGUID b f f 3614 3615 16 3661 0 ts_match_vq tsmatchsel tsmatchjoinsel "---"));
DESCR("deprecated, use @@ instead");
-DATA(insert OID = 3661 ( "@@@" PGNSP PGUID b f f 3615 3614 16 3660 0 ts_match_qv tsmatchsel tsmatchjoinsel ));
+DATA(insert OID = 3661 ( "@@@" PGNSP PGUID b f f 3615 3614 16 3660 0 ts_match_qv tsmatchsel tsmatchjoinsel "---"));
DESCR("deprecated, use @@ instead");
-DATA(insert OID = 3674 ( "<" PGNSP PGUID b f f 3615 3615 16 3679 3678 tsquery_lt scalarltsel scalarltjoinsel ));
+DATA(insert OID = 3674 ( "<" PGNSP PGUID b f f 3615 3615 16 3679 3678 tsquery_lt scalarltsel scalarltjoinsel "mh-"));
DESCR("less than");
-DATA(insert OID = 3675 ( "<=" PGNSP PGUID b f f 3615 3615 16 3678 3679 tsquery_le scalarltsel scalarltjoinsel ));
+DATA(insert OID = 3675 ( "<=" PGNSP PGUID b f f 3615 3615 16 3678 3679 tsquery_le scalarltsel scalarltjoinsel "mh-"));
DESCR("less than or equal");
-DATA(insert OID = 3676 ( "=" PGNSP PGUID b t f 3615 3615 16 3676 3677 tsquery_eq eqsel eqjoinsel ));
+DATA(insert OID = 3676 ( "=" PGNSP PGUID b t f 3615 3615 16 3676 3677 tsquery_eq eqsel eqjoinsel "mhf"));
DESCR("equal");
-DATA(insert OID = 3677 ( "<>" PGNSP PGUID b f f 3615 3615 16 3677 3676 tsquery_ne neqsel neqjoinsel ));
+DATA(insert OID = 3677 ( "<>" PGNSP PGUID b f f 3615 3615 16 3677 3676 tsquery_ne neqsel neqjoinsel "mhf"));
DESCR("not equal");
-DATA(insert OID = 3678 ( ">=" PGNSP PGUID b f f 3615 3615 16 3675 3674 tsquery_ge scalargtsel scalargtjoinsel ));
+DATA(insert OID = 3678 ( ">=" PGNSP PGUID b f f 3615 3615 16 3675 3674 tsquery_ge scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than or equal");
-DATA(insert OID = 3679 ( ">" PGNSP PGUID b f f 3615 3615 16 3674 3675 tsquery_gt scalargtsel scalargtjoinsel ));
+DATA(insert OID = 3679 ( ">" PGNSP PGUID b f f 3615 3615 16 3674 3675 tsquery_gt scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than");
-DATA(insert OID = 3680 ( "&&" PGNSP PGUID b f f 3615 3615 3615 0 0 tsquery_and - - ));
+DATA(insert OID = 3680 ( "&&" PGNSP PGUID b f f 3615 3615 3615 0 0 tsquery_and - - "---"));
DESCR("AND-concatenate");
-DATA(insert OID = 3681 ( "||" PGNSP PGUID b f f 3615 3615 3615 0 0 tsquery_or - - ));
+DATA(insert OID = 3681 ( "||" PGNSP PGUID b f f 3615 3615 3615 0 0 tsquery_or - - "---"));
DESCR("OR-concatenate");
-DATA(insert OID = 3682 ( "!!" PGNSP PGUID l f f 0 3615 3615 0 0 tsquery_not - - ));
+DATA(insert OID = 3682 ( "!!" PGNSP PGUID l f f 0 3615 3615 0 0 tsquery_not - - "---"));
DESCR("NOT tsquery");
-DATA(insert OID = 3693 ( "@>" PGNSP PGUID b f f 3615 3615 16 3694 0 tsq_mcontains contsel contjoinsel ));
+DATA(insert OID = 3693 ( "@>" PGNSP PGUID b f f 3615 3615 16 3694 0 tsq_mcontains contsel contjoinsel "---"));
DESCR("contains");
-DATA(insert OID = 3694 ( "<@" PGNSP PGUID b f f 3615 3615 16 3693 0 tsq_mcontained contsel contjoinsel ));
+DATA(insert OID = 3694 ( "<@" PGNSP PGUID b f f 3615 3615 16 3693 0 tsq_mcontained contsel contjoinsel "---"));
DESCR("is contained by");
-DATA(insert OID = 3762 ( "@@" PGNSP PGUID b f f 25 25 16 0 0 ts_match_tt contsel contjoinsel ));
+DATA(insert OID = 3762 ( "@@" PGNSP PGUID b f f 25 25 16 0 0 ts_match_tt contsel contjoinsel "---"));
DESCR("text search match");
-DATA(insert OID = 3763 ( "@@" PGNSP PGUID b f f 25 3615 16 0 0 ts_match_tq contsel contjoinsel ));
+DATA(insert OID = 3763 ( "@@" PGNSP PGUID b f f 25 3615 16 0 0 ts_match_tq contsel contjoinsel "---"));
DESCR("text search match");
/* generic record comparison operators */
-DATA(insert OID = 2988 ( "=" PGNSP PGUID b t f 2249 2249 16 2988 2989 record_eq eqsel eqjoinsel ));
+DATA(insert OID = 2988 ( "=" PGNSP PGUID b t f 2249 2249 16 2988 2989 record_eq eqsel eqjoinsel "mhf"));
DESCR("equal");
#define RECORD_EQ_OP 2988
-DATA(insert OID = 2989 ( "<>" PGNSP PGUID b f f 2249 2249 16 2989 2988 record_ne neqsel neqjoinsel ));
+DATA(insert OID = 2989 ( "<>" PGNSP PGUID b f f 2249 2249 16 2989 2988 record_ne neqsel neqjoinsel "mhf"));
DESCR("not equal");
-DATA(insert OID = 2990 ( "<" PGNSP PGUID b f f 2249 2249 16 2991 2993 record_lt scalarltsel scalarltjoinsel ));
+DATA(insert OID = 2990 ( "<" PGNSP PGUID b f f 2249 2249 16 2991 2993 record_lt scalarltsel scalarltjoinsel "mh-"));
DESCR("less than");
#define RECORD_LT_OP 2990
-DATA(insert OID = 2991 ( ">" PGNSP PGUID b f f 2249 2249 16 2990 2992 record_gt scalargtsel scalargtjoinsel ));
+DATA(insert OID = 2991 ( ">" PGNSP PGUID b f f 2249 2249 16 2990 2992 record_gt scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than");
#define RECORD_GT_OP 2991
-DATA(insert OID = 2992 ( "<=" PGNSP PGUID b f f 2249 2249 16 2993 2991 record_le scalarltsel scalarltjoinsel ));
+DATA(insert OID = 2992 ( "<=" PGNSP PGUID b f f 2249 2249 16 2993 2991 record_le scalarltsel scalarltjoinsel "mh-"));
DESCR("less than or equal");
-DATA(insert OID = 2993 ( ">=" PGNSP PGUID b f f 2249 2249 16 2992 2990 record_ge scalargtsel scalargtjoinsel ));
+DATA(insert OID = 2993 ( ">=" PGNSP PGUID b f f 2249 2249 16 2992 2990 record_ge scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than or equal");
/* byte-oriented tests for identical rows and fast sorting */
-DATA(insert OID = 3188 ( "*=" PGNSP PGUID b t f 2249 2249 16 3188 3189 record_image_eq eqsel eqjoinsel ));
+DATA(insert OID = 3188 ( "*=" PGNSP PGUID b t f 2249 2249 16 3188 3189 record_image_eq eqsel eqjoinsel "mhf"));
DESCR("identical");
-DATA(insert OID = 3189 ( "*<>" PGNSP PGUID b f f 2249 2249 16 3189 3188 record_image_ne neqsel neqjoinsel ));
+DATA(insert OID = 3189 ( "*<>" PGNSP PGUID b f f 2249 2249 16 3189 3188 record_image_ne neqsel neqjoinsel "mhf"));
DESCR("not identical");
-DATA(insert OID = 3190 ( "*<" PGNSP PGUID b f f 2249 2249 16 3191 3193 record_image_lt scalarltsel scalarltjoinsel ));
+DATA(insert OID = 3190 ( "*<" PGNSP PGUID b f f 2249 2249 16 3191 3193 record_image_lt scalarltsel scalarltjoinsel "mh-"));
DESCR("less than");
-DATA(insert OID = 3191 ( "*>" PGNSP PGUID b f f 2249 2249 16 3190 3192 record_image_gt scalargtsel scalargtjoinsel ));
+DATA(insert OID = 3191 ( "*>" PGNSP PGUID b f f 2249 2249 16 3190 3192 record_image_gt scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than");
-DATA(insert OID = 3192 ( "*<=" PGNSP PGUID b f f 2249 2249 16 3193 3191 record_image_le scalarltsel scalarltjoinsel ));
+DATA(insert OID = 3192 ( "*<=" PGNSP PGUID b f f 2249 2249 16 3193 3191 record_image_le scalarltsel scalarltjoinsel "mh-"));
DESCR("less than or equal");
-DATA(insert OID = 3193 ( "*>=" PGNSP PGUID b f f 2249 2249 16 3192 3190 record_image_ge scalargtsel scalargtjoinsel ));
+DATA(insert OID = 3193 ( "*>=" PGNSP PGUID b f f 2249 2249 16 3192 3190 record_image_ge scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than or equal");
/* generic range type operators */
-DATA(insert OID = 3882 ( "=" PGNSP PGUID b t t 3831 3831 16 3882 3883 range_eq eqsel eqjoinsel ));
+DATA(insert OID = 3882 ( "=" PGNSP PGUID b t t 3831 3831 16 3882 3883 range_eq eqsel eqjoinsel "mhf"));
DESCR("equal");
-DATA(insert OID = 3883 ( "<>" PGNSP PGUID b f f 3831 3831 16 3883 3882 range_ne neqsel neqjoinsel ));
+DATA(insert OID = 3883 ( "<>" PGNSP PGUID b f f 3831 3831 16 3883 3882 range_ne neqsel neqjoinsel "mhf"));
DESCR("not equal");
-DATA(insert OID = 3884 ( "<" PGNSP PGUID b f f 3831 3831 16 3887 3886 range_lt rangesel scalarltjoinsel ));
+DATA(insert OID = 3884 ( "<" PGNSP PGUID b f f 3831 3831 16 3887 3886 range_lt rangesel scalarltjoinsel "---"));
DESCR("less than");
#define OID_RANGE_LESS_OP 3884
-DATA(insert OID = 3885 ( "<=" PGNSP PGUID b f f 3831 3831 16 3886 3887 range_le rangesel scalarltjoinsel ));
+DATA(insert OID = 3885 ( "<=" PGNSP PGUID b f f 3831 3831 16 3886 3887 range_le rangesel scalarltjoinsel "---"));
DESCR("less than or equal");
#define OID_RANGE_LESS_EQUAL_OP 3885
-DATA(insert OID = 3886 ( ">=" PGNSP PGUID b f f 3831 3831 16 3885 3884 range_ge rangesel scalargtjoinsel ));
+DATA(insert OID = 3886 ( ">=" PGNSP PGUID b f f 3831 3831 16 3885 3884 range_ge rangesel scalargtjoinsel "---"));
DESCR("greater than or equal");
#define OID_RANGE_GREATER_EQUAL_OP 3886
-DATA(insert OID = 3887 ( ">" PGNSP PGUID b f f 3831 3831 16 3884 3885 range_gt rangesel scalargtjoinsel ));
+DATA(insert OID = 3887 ( ">" PGNSP PGUID b f f 3831 3831 16 3884 3885 range_gt rangesel scalargtjoinsel "---"));
DESCR("greater than");
#define OID_RANGE_GREATER_OP 3887
-DATA(insert OID = 3888 ( "&&" PGNSP PGUID b f f 3831 3831 16 3888 0 range_overlaps rangesel areajoinsel ));
+DATA(insert OID = 3888 ( "&&" PGNSP PGUID b f f 3831 3831 16 3888 0 range_overlaps rangesel areajoinsel "---"));
DESCR("overlaps");
#define OID_RANGE_OVERLAP_OP 3888
-DATA(insert OID = 3889 ( "@>" PGNSP PGUID b f f 3831 2283 16 3891 0 range_contains_elem rangesel contjoinsel ));
+DATA(insert OID = 3889 ( "@>" PGNSP PGUID b f f 3831 2283 16 3891 0 range_contains_elem rangesel contjoinsel "---"));
DESCR("contains");
#define OID_RANGE_CONTAINS_ELEM_OP 3889
-DATA(insert OID = 3890 ( "@>" PGNSP PGUID b f f 3831 3831 16 3892 0 range_contains rangesel contjoinsel ));
+DATA(insert OID = 3890 ( "@>" PGNSP PGUID b f f 3831 3831 16 3892 0 range_contains rangesel contjoinsel "---"));
DESCR("contains");
#define OID_RANGE_CONTAINS_OP 3890
-DATA(insert OID = 3891 ( "<@" PGNSP PGUID b f f 2283 3831 16 3889 0 elem_contained_by_range rangesel contjoinsel ));
+DATA(insert OID = 3891 ( "<@" PGNSP PGUID b f f 2283 3831 16 3889 0 elem_contained_by_range rangesel contjoinsel "---"));
DESCR("is contained by");
#define OID_RANGE_ELEM_CONTAINED_OP 3891
-DATA(insert OID = 3892 ( "<@" PGNSP PGUID b f f 3831 3831 16 3890 0 range_contained_by rangesel contjoinsel ));
+DATA(insert OID = 3892 ( "<@" PGNSP PGUID b f f 3831 3831 16 3890 0 range_contained_by rangesel contjoinsel "---"));
DESCR("is contained by");
#define OID_RANGE_CONTAINED_OP 3892
-DATA(insert OID = 3893 ( "<<" PGNSP PGUID b f f 3831 3831 16 3894 0 range_before rangesel scalarltjoinsel ));
+DATA(insert OID = 3893 ( "<<" PGNSP PGUID b f f 3831 3831 16 3894 0 range_before rangesel scalarltjoinsel "---"));
DESCR("is left of");
#define OID_RANGE_LEFT_OP 3893
-DATA(insert OID = 3894 ( ">>" PGNSP PGUID b f f 3831 3831 16 3893 0 range_after rangesel scalargtjoinsel ));
+DATA(insert OID = 3894 ( ">>" PGNSP PGUID b f f 3831 3831 16 3893 0 range_after rangesel scalargtjoinsel "---"));
DESCR("is right of");
#define OID_RANGE_RIGHT_OP 3894
-DATA(insert OID = 3895 ( "&<" PGNSP PGUID b f f 3831 3831 16 0 0 range_overleft rangesel scalarltjoinsel ));
+DATA(insert OID = 3895 ( "&<" PGNSP PGUID b f f 3831 3831 16 0 0 range_overleft rangesel scalarltjoinsel "---"));
DESCR("overlaps or is left of");
#define OID_RANGE_OVERLAPS_LEFT_OP 3895
-DATA(insert OID = 3896 ( "&>" PGNSP PGUID b f f 3831 3831 16 0 0 range_overright rangesel scalargtjoinsel ));
+DATA(insert OID = 3896 ( "&>" PGNSP PGUID b f f 3831 3831 16 0 0 range_overright rangesel scalargtjoinsel "---"));
DESCR("overlaps or is right of");
#define OID_RANGE_OVERLAPS_RIGHT_OP 3896
-DATA(insert OID = 3897 ( "-|-" PGNSP PGUID b f f 3831 3831 16 3897 0 range_adjacent contsel contjoinsel ));
+DATA(insert OID = 3897 ( "-|-" PGNSP PGUID b f f 3831 3831 16 3897 0 range_adjacent contsel contjoinsel "---"));
DESCR("is adjacent to");
-DATA(insert OID = 3898 ( "+" PGNSP PGUID b f f 3831 3831 3831 3898 0 range_union - - ));
+DATA(insert OID = 3898 ( "+" PGNSP PGUID b f f 3831 3831 3831 3898 0 range_union - - "---"));
DESCR("range union");
-DATA(insert OID = 3899 ( "-" PGNSP PGUID b f f 3831 3831 3831 0 0 range_minus - - ));
+DATA(insert OID = 3899 ( "-" PGNSP PGUID b f f 3831 3831 3831 0 0 range_minus - - "---"));
DESCR("range difference");
-DATA(insert OID = 3900 ( "*" PGNSP PGUID b f f 3831 3831 3831 3900 0 range_intersect - - ));
+DATA(insert OID = 3900 ( "*" PGNSP PGUID b f f 3831 3831 3831 3900 0 range_intersect - - "---"));
DESCR("range intersection");
-DATA(insert OID = 3962 ( "->" PGNSP PGUID b f f 114 25 114 0 0 json_object_field - - ));
+DATA(insert OID = 3962 ( "->" PGNSP PGUID b f f 114 25 114 0 0 json_object_field - - "---"));
DESCR("get json object field");
-DATA(insert OID = 3963 ( "->>" PGNSP PGUID b f f 114 25 25 0 0 json_object_field_text - - ));
+DATA(insert OID = 3963 ( "->>" PGNSP PGUID b f f 114 25 25 0 0 json_object_field_text - - "---"));
DESCR("get json object field as text");
-DATA(insert OID = 3964 ( "->" PGNSP PGUID b f f 114 23 114 0 0 json_array_element - - ));
+DATA(insert OID = 3964 ( "->" PGNSP PGUID b f f 114 23 114 0 0 json_array_element - - "---"));
DESCR("get json array element");
-DATA(insert OID = 3965 ( "->>" PGNSP PGUID b f f 114 23 25 0 0 json_array_element_text - - ));
+DATA(insert OID = 3965 ( "->>" PGNSP PGUID b f f 114 23 25 0 0 json_array_element_text - - "---"));
DESCR("get json array element as text");
-DATA(insert OID = 3966 ( "#>" PGNSP PGUID b f f 114 1009 114 0 0 json_extract_path - - ));
+DATA(insert OID = 3966 ( "#>" PGNSP PGUID b f f 114 1009 114 0 0 json_extract_path - - "---"));
DESCR("get value from json with path elements");
-DATA(insert OID = 3967 ( "#>>" PGNSP PGUID b f f 114 1009 25 0 0 json_extract_path_text - - ));
+DATA(insert OID = 3967 ( "#>>" PGNSP PGUID b f f 114 1009 25 0 0 json_extract_path_text - - "---"));
DESCR("get value from json as text with path elements");
-DATA(insert OID = 3211 ( "->" PGNSP PGUID b f f 3802 25 3802 0 0 jsonb_object_field - - ));
+DATA(insert OID = 3211 ( "->" PGNSP PGUID b f f 3802 25 3802 0 0 jsonb_object_field - - "---"));
DESCR("get jsonb object field");
-DATA(insert OID = 3477 ( "->>" PGNSP PGUID b f f 3802 25 25 0 0 jsonb_object_field_text - - ));
+DATA(insert OID = 3477 ( "->>" PGNSP PGUID b f f 3802 25 25 0 0 jsonb_object_field_text - - "---"));
DESCR("get jsonb object field as text");
-DATA(insert OID = 3212 ( "->" PGNSP PGUID b f f 3802 23 3802 0 0 jsonb_array_element - - ));
+DATA(insert OID = 3212 ( "->" PGNSP PGUID b f f 3802 23 3802 0 0 jsonb_array_element - - "---"));
DESCR("get jsonb array element");
-DATA(insert OID = 3481 ( "->>" PGNSP PGUID b f f 3802 23 25 0 0 jsonb_array_element_text - - ));
+DATA(insert OID = 3481 ( "->>" PGNSP PGUID b f f 3802 23 25 0 0 jsonb_array_element_text - - "---"));
DESCR("get jsonb array element as text");
-DATA(insert OID = 3213 ( "#>" PGNSP PGUID b f f 3802 1009 3802 0 0 jsonb_extract_path - - ));
+DATA(insert OID = 3213 ( "#>" PGNSP PGUID b f f 3802 1009 3802 0 0 jsonb_extract_path - - "---"));
DESCR("get value from jsonb with path elements");
-DATA(insert OID = 3206 ( "#>>" PGNSP PGUID b f f 3802 1009 25 0 0 jsonb_extract_path_text - - ));
+DATA(insert OID = 3206 ( "#>>" PGNSP PGUID b f f 3802 1009 25 0 0 jsonb_extract_path_text - - "---"));
DESCR("get value from jsonb as text with path elements");
-DATA(insert OID = 3240 ( "=" PGNSP PGUID b t t 3802 3802 16 3240 3241 jsonb_eq eqsel eqjoinsel ));
+DATA(insert OID = 3240 ( "=" PGNSP PGUID b t t 3802 3802 16 3240 3241 jsonb_eq eqsel eqjoinsel "mhf"));
DESCR("equal");
-DATA(insert OID = 3241 ( "<>" PGNSP PGUID b f f 3802 3802 16 3241 3240 jsonb_ne neqsel neqjoinsel ));
+DATA(insert OID = 3241 ( "<>" PGNSP PGUID b f f 3802 3802 16 3241 3240 jsonb_ne neqsel neqjoinsel "mhf"));
DESCR("not equal");
-DATA(insert OID = 3242 ( "<" PGNSP PGUID b f f 3802 3802 16 3243 3245 jsonb_lt scalarltsel scalarltjoinsel ));
+DATA(insert OID = 3242 ( "<" PGNSP PGUID b f f 3802 3802 16 3243 3245 jsonb_lt scalarltsel scalarltjoinsel "mh-"));
DESCR("less than");
-DATA(insert OID = 3243 ( ">" PGNSP PGUID b f f 3802 3802 16 3242 3244 jsonb_gt scalargtsel scalargtjoinsel ));
+DATA(insert OID = 3243 ( ">" PGNSP PGUID b f f 3802 3802 16 3242 3244 jsonb_gt scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than");
-DATA(insert OID = 3244 ( "<=" PGNSP PGUID b f f 3802 3802 16 3245 3243 jsonb_le scalarltsel scalarltjoinsel ));
+DATA(insert OID = 3244 ( "<=" PGNSP PGUID b f f 3802 3802 16 3245 3243 jsonb_le scalarltsel scalarltjoinsel "mh-"));
DESCR("less than or equal");
-DATA(insert OID = 3245 ( ">=" PGNSP PGUID b f f 3802 3802 16 3244 3242 jsonb_ge scalargtsel scalargtjoinsel ));
+DATA(insert OID = 3245 ( ">=" PGNSP PGUID b f f 3802 3802 16 3244 3242 jsonb_ge scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than or equal");
-DATA(insert OID = 3246 ( "@>" PGNSP PGUID b f f 3802 3802 16 3250 0 jsonb_contains contsel contjoinsel ));
+DATA(insert OID = 3246 ( "@>" PGNSP PGUID b f f 3802 3802 16 3250 0 jsonb_contains contsel contjoinsel "---"));
DESCR("contains");
-DATA(insert OID = 3247 ( "?" PGNSP PGUID b f f 3802 25 16 0 0 jsonb_exists contsel contjoinsel ));
+DATA(insert OID = 3247 ( "?" PGNSP PGUID b f f 3802 25 16 0 0 jsonb_exists contsel contjoinsel "---"));
DESCR("exists");
-DATA(insert OID = 3248 ( "?|" PGNSP PGUID b f f 3802 1009 16 0 0 jsonb_exists_any contsel contjoinsel ));
+DATA(insert OID = 3248 ( "?|" PGNSP PGUID b f f 3802 1009 16 0 0 jsonb_exists_any contsel contjoinsel "---"));
DESCR("exists any");
-DATA(insert OID = 3249 ( "?&" PGNSP PGUID b f f 3802 1009 16 0 0 jsonb_exists_all contsel contjoinsel ));
+DATA(insert OID = 3249 ( "?&" PGNSP PGUID b f f 3802 1009 16 0 0 jsonb_exists_all contsel contjoinsel "---"));
DESCR("exists all");
-DATA(insert OID = 3250 ( "<@" PGNSP PGUID b f f 3802 3802 16 3246 0 jsonb_contained contsel contjoinsel ));
+DATA(insert OID = 3250 ( "<@" PGNSP PGUID b f f 3802 3802 16 3246 0 jsonb_contained contsel contjoinsel "---"));
DESCR("is contained by");
-DATA(insert OID = 3284 ( "||" PGNSP PGUID b f f 3802 3802 3802 0 0 jsonb_concat - - ));
+DATA(insert OID = 3284 ( "||" PGNSP PGUID b f f 3802 3802 3802 0 0 jsonb_concat - - "---"));
DESCR("concatenate");
-DATA(insert OID = 3285 ( "-" PGNSP PGUID b f f 3802 25 3802 0 0 3302 - - ));
+DATA(insert OID = 3285 ( "-" PGNSP PGUID b f f 3802 25 3802 0 0 3302 - - "---"));
DESCR("delete object field");
-DATA(insert OID = 3286 ( "-" PGNSP PGUID b f f 3802 23 3802 0 0 3303 - - ));
+DATA(insert OID = 3286 ( "-" PGNSP PGUID b f f 3802 23 3802 0 0 3303 - - "---"));
DESCR("delete array element");
-DATA(insert OID = 3287 ( "#-" PGNSP PGUID b f f 3802 1009 3802 0 0 jsonb_delete_path - - ));
+DATA(insert OID = 3287 ( "#-" PGNSP PGUID b f f 3802 1009 3802 0 0 jsonb_delete_path - - "---"));
DESCR("delete path");
/*
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index 9254f85..9865a9c 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -250,6 +250,7 @@ typedef enum NodeTag
T_MinMaxAggInfo,
T_PlannerParamItem,
T_MVStatisticInfo,
+ T_RestrictStatData,
/*
* TAGS FOR MEMORY NODES (memnodes.h)
diff --git a/src/include/nodes/relation.h b/src/include/nodes/relation.h
index 1979cdf..b78ee5d 100644
--- a/src/include/nodes/relation.h
+++ b/src/include/nodes/relation.h
@@ -15,12 +15,12 @@
#define RELATION_H
#include "access/sdir.h"
+#include "access/htup.h"
#include "lib/stringinfo.h"
#include "nodes/params.h"
#include "nodes/parsenodes.h"
#include "storage/block.h"
-
/*
* Relids
* Set of relation identifiers (indexes into the rangetable).
@@ -1341,6 +1341,26 @@ typedef struct RestrictInfo
Selectivity right_bucketsize; /* avg bucketsize of right side */
} RestrictInfo;
+typedef struct bm_mvstat
+{
+ Bitmapset *attrs;
+ MVStatisticInfo *stats;
+ int mvkind;
+} bm_mvstat;
+
+typedef struct RestrictStatData
+{
+ NodeTag type;
+ BoolExprType boolop;
+ Node *clause;
+ Node *mvclause;
+ Node *nonmvclause;
+ List *children;
+ List *mvstats;
+ Bitmapset *mvattrs;
+ List *unusedrinfos;
+} RestrictStatData;
+
/*
* Since mergejoinscansel() is a relatively expensive function, and would
* otherwise be invoked many times while planning a large join tree,
diff --git a/src/include/optimizer/cost.h b/src/include/optimizer/cost.h
index 6bfd338..24003ae 100644
--- a/src/include/optimizer/cost.h
+++ b/src/include/optimizer/cost.h
@@ -183,13 +183,11 @@ extern Selectivity clauselist_selectivity(PlannerInfo *root,
List *clauses,
int varRelid,
JoinType jointype,
- SpecialJoinInfo *sjinfo,
- List *conditions);
+ SpecialJoinInfo *sjinfo);
extern Selectivity clause_selectivity(PlannerInfo *root,
Node *clause,
int varRelid,
JoinType jointype,
- SpecialJoinInfo *sjinfo,
- List *conditions);
+ SpecialJoinInfo *sjinfo);
#endif /* COST_H */
diff --git a/src/include/utils/lsyscache.h b/src/include/utils/lsyscache.h
index a40c9b1..bb9d68b 100644
--- a/src/include/utils/lsyscache.h
+++ b/src/include/utils/lsyscache.h
@@ -84,6 +84,7 @@ extern Oid get_commutator(Oid opno);
extern Oid get_negator(Oid opno);
extern RegProcedure get_oprrest(Oid opno);
extern RegProcedure get_oprjoin(Oid opno);
+extern int get_oprmvstat(Oid opno);
extern char *get_func_name(Oid funcid);
extern Oid get_func_namespace(Oid funcid);
extern Oid get_func_rettype(Oid funcid);
diff --git a/src/include/utils/mvstats.h b/src/include/utils/mvstats.h
index f2fbc11..a08fd58 100644
--- a/src/include/utils/mvstats.h
+++ b/src/include/utils/mvstats.h
@@ -34,6 +34,9 @@ extern int mvstat_search_type;
#define MVSTATS_MAX_DIMENSIONS 8 /* max number of attributes */
+#define MVSTATISTIC_MCV 1
+#define MVSTATISTIC_HIST 2
+#define MVSTATISTIC_FDEP 4
/*
* Functional dependencies, tracking column-level relationships (values
--
1.8.3.1
Hi,
On 07/16/2015 01:51 PM, Kyotaro HORIGUCHI wrote:
Hi, I'd like to show you the modified structure of the
multivariate statistics application logic. Please find the
attached. They apply on your v7 patch.
Sadly I do have some trouble getting it to apply correctly :-(
So for now all my comments are based on just reading the code.
FWIW I've rebased my patch to the current master, it's available on
github as usual:
https://github.com/tvondra/postgres/commits/mvstats
The code to find mv-applicable clauses is moved out of the main
flow of clauselist_selectivity. As I said in the previous mail,
the new function transformRestrictInfoForEstimate (too bad a name,
but just for PoC :) scans the clauselist and generates a
RestrictStatData struct which drives mv-aware selectivity
calculation. This struct isolates MV and non-MV estimation.
The struct RestrictStatData mainly consists of the following
three parts:
- clause to be estimated by the current logic (MV is not applicable)
- clause to be estimated by MV-statistics
- list of child RestrictStatDatas, which are to be run recursively
mvclause_selectivity() is the topmost function where mv stats
works. This structure effectively prevents the main estimation flow
from being broken by modifications to the mvstats part. Although I
haven't measured it, I'm positive the code is far smaller than yours.
I attached two patches to this message. The first one rebases the
v7 patch to the current (maybe) master and the second applies
the refactoring.
I'm a little anxious about performance, but I think this makes the
process of applying mv-stats far clearer. Regtests for mvstats
succeeded as-is except for fdep, which is not implemented in this
patch.
What do you think about this?
I'm not sure, at this point. I'm having a hard time understanding how
exactly the code works - there are pretty much no comments explaining
the implementation, so it takes time to understand the code. This is
especially true about transformRestrictInfoForEstimate which is also
quite long. I understand it's a PoC, but comments would really help.
On a conceptual level, I think the idea to split the estimation into two
phases - enrich the expression tree with nodes with details about stats
etc, and then actually do the estimation in the second phase might be
interesting. Not because it's somehow clearer, but because it gives us a
chance to see the expression tree as a whole, with details about all the
stats (with the current code we process/estimate the tree
incrementally). But I don't really know how useful that would be.
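If I understand the struct correctly, the second phase would then be
a fairly simple recursion over the annotated tree. Something like
this, perhaps (just my sketch of the general idea based on the struct
definition, not code from either patch - mvclause_selectivity() here
is a placeholder for whatever the actual entry point is):

/*
 * Sketch only: combine one component selectivity into the running
 * total, assuming independence between the components.
 */
static Selectivity
combine(BoolExprType boolop, Selectivity total, Selectivity s)
{
    if (boolop == OR_EXPR)
        return total + s - total * s;       /* P(A or B) */
    return total * s;                       /* P(A and B) */
}

static Selectivity
estimate_rsd(PlannerInfo *root, RestrictStatData *rsd)
{
    /* identity element: 0.0 for OR, 1.0 for AND */
    Selectivity total = (rsd->boolop == OR_EXPR) ? 0.0 : 1.0;
    ListCell   *lc;

    /* mv-compatible part, estimated as a whole from the mv stats */
    if (rsd->mvclause != NULL)
        total = combine(rsd->boolop, total,
                        mvclause_selectivity(root, rsd->mvclause,
                                             rsd->mvstats));

    /* the remainder, estimated by the current per-column logic */
    if (rsd->nonmvclause != NULL)
        total = combine(rsd->boolop, total,
                        clause_selectivity(root, rsd->nonmvclause,
                                           0, JOIN_INNER, NULL));

    /* nested AND/OR subtrees, estimated recursively */
    foreach(lc, rsd->children)
        total = combine(rsd->boolop, total,
                        estimate_rsd(root,
                                     (RestrictStatData *) lfirst(lc)));

    return total;
}

If that's roughly the shape, then the question is mostly where the
information needed for conditions and for combining multiple stats
would live in such a tree.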
I don't think the proposed change makes the process somehow clearer. I
know it's a PoC at this point, so I don't expect it to be perfect, but
for me the original code is certainly clearer. Of course, I'm biased as
I wrote the current code, and I (naturally) shaped it to match my ideas
during the development process, and I'm much more familiar with it.
Omitting the support for functional dependencies is a bit unfortunate, I
think. Is that merely to make the PoC simpler, or is there something
that makes it impossible to support that kind of stats?
Another thing that I noticed is that you completely removed the code
that combined multiple stats (and selected the best combination of
stats). In other words, you've reverted to the intermediate single
statistics approach, including removing the improved handling of OR
clauses and conditions. It's a bit difficult to judge the proposed
approach not knowing how well it supports those (quite crucial)
features. What if it can't support some of them, or what if it makes the
code much more complicated (thus defeating the goal of making it more
clear)?
I share your concern about the performance impact - one thing is that
this new code might be slower than the original one, but a more serious
issue IMHO is that the performance impact will happen even for relations
with no multivariate stats at all. The original patch was very careful
about getting ~0% overhead in such cases, and if the new code does not
allow that, I don't see this approach as acceptable. We must not put
additional overhead on people not using multivariate stats.
But I think it's worth exploring this idea a bit more - can you rebase
it to the current patch version (as on github) and add the missing
pieces (functional dependencies, multi-statistics estimation and passing
conditions)?
One more thing - I noticed you extended the pg_operator catalog with an
oprmvstat attribute, used to flag operators that are compatible with
multivariate stats. I'm not happy with the current approach (using
oprrest to make this decision), but I'm not really sure this is a good
solution either. The culprit is that it only answers one of the two
important questions - Is it compatible? How to perform the estimation?
So we'd have to rely on oprrest anyway, when actually performing the
estimation of a clause with "compatible" operator. And we'd have to keep
in sync two places (catalog and checks in file), and we'd have to update
the catalog after improving the implementation (adding support for
another operator).
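For the record, this is how I read the new column: each operator gets
a three-character string ("mhf", "mh-", "---"), one position per
statistics type, presumably mapping to the MVSTATISTIC_* flags from
mvstats.h roughly like this (my interpretation of the catalog entries,
not code from your patch):

/*
 * Sketch: decode an oprmvstat string into a bitmask of the
 * MVSTATISTIC_* flags (a guess at the intended meaning of the
 * three positions, based on the catalog entries above).
 */
static int
decode_oprmvstat(const char *oprmvstat)
{
    int         kinds = 0;

    if (oprmvstat[0] == 'm')        /* usable with MCV lists */
        kinds |= MVSTATISTIC_MCV;
    if (oprmvstat[1] == 'h')        /* usable with histograms */
        kinds |= MVSTATISTIC_HIST;
    if (oprmvstat[2] == 'f')        /* usable with functional deps */
        kinds |= MVSTATISTIC_FDEP;

    return kinds;
}

So equality operators get "mhf", the inequalities get "mh-", and
everything else gets "---".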
kind regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Hello,
At Sat, 25 Jul 2015 23:09:31 +0200, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote in <55B3FB0B.7000201@2ndquadrant.com>
Hi,
On 07/16/2015 01:51 PM, Kyotaro HORIGUCHI wrote:
Hi, I'd like to show you the modified structure of the
multivariate statistics application logic. Please find the
attached. They apply on your v7 patch.
Sadly I do have some trouble getting it to apply correctly :-(
So for now all my comments are based on just reading the code.
Ah. My modification to rebase to master at the time is probably the
culprit. Sorry for the dirty patch.
# I would have recreated the patch if you had complained before
# struggling with the thing..
The core of the modification is in clausesel.c. I attached the
patched clausesel.c.
FWIW I've rebased my patch to the current master, it's available on
github as usual:
Thanks.
The code to find mv-applicable clauses is moved out of the main
flow of clauselist_selectivity. As I said in the previous mail,
the new function transformRestrictInfoForEstimate (too bad a name,
but just for PoC :) scans the clauselist and generates a
RestrictStatData struct which drives mv-aware selectivity
calculation. This struct isolates MV and non-MV estimation.
The struct RestrictStatData mainly consists of the following
three parts:
- clause to be estimated by the current logic (MV is not applicable)
- clause to be estimated by MV-statistics
- list of child RestrictStatDatas, which are to be run recursively
mvclause_selectivity() is the topmost function where mv stats
works. This structure effectively prevents the main estimation flow
from being broken by modifications to the mvstats part. Although I
haven't measured it, I'm positive the code is far smaller than yours.
I attached two patches to this message. The first one rebases the
v7 patch to the current (maybe) master and the second applies
the refactoring.
I'm a little anxious about performance, but I think this makes the
process of applying mv-stats far clearer. Regtests for mvstats
succeeded as-is except for fdep, which is not implemented in this
patch.
What do you think about this?
I'm not sure, at this point. I'm having a hard time understanding how
exactly the code works - there are pretty much no comments explaining
the implementation, so it takes time to understand the code. This is
especially true about transformRestrictInfoForEstimate which is also
quite long. I understand it's a PoC, but comments would really help.
The patch itself is hardly readable because it's not based on
master but on your last patch plus something.
My concern about the code at the time was the following,
- You embedded the logic of multivariate estimation into
clauselist_selectivity. I think estimation using multivariate
statistics is quite different from the ordinary estimation based
on single-column stats, so the two are logically separable and
we should separate them.
- You are taking a top-down approach that walks the tree to
check the applicability of mv-stats at every step down the
clause tree. If a subtree is found to be mv-applicable, it is
split into two parts - mv-compatible and non-compatible. These
steps require expression tree walking, which looks like it uses
too much CPU.
- You seem to be considering cases where users create many
multivariate statistics on overlapping attribute sets. But that
looks like too much to me. MV-stats are more resource-hungry, so
we can assume they will be used sparingly.
My suggestion in the patch is a bottom-up approach to find the
mv-applicable portion(s) of the expression tree, which is the
planner's basic way of working overall. The approach requires no
repeated runs of the tree walker, that is, pull_varnos. It could
fail to find the 'optimal' solution in complex situations, but it
needs far less calculation for almost the same return (I think..).
Even though it doesn't consider functional dependencies, the
reduction in code shows the efficiency. It does nothing tricky.
On a conceptual level, I think the idea to split the estimation into
two phases - enrich the expression tree with nodes with details about
stats etc, and then actually do the estimation in the second phase
might be interesting. Not because it's somehow clearer, but because it
gives us a chance to see the expression tree as a whole, with details
about all the stats (with the current code we process/estimate the
tree incrementally). But I don't really know how useful that would be.
It is difficult to say which approach is better, since that is
affected by what we consider more important than other things.
However, I am concerned that your code substantially reconstructs
the expression (clause) tree in the midst of processing it. I
believe that should be a separate phase, for simplicity. Of course,
the additional resources required should also be considered, but
they are rather reduced in this case.
I don't think the proposed change makes the process somehow clearer. I
know it's a PoC at this point, so I don't expect it to be perfect, but
for me the original code is certainly clearer. Of course, I'm biased
as I wrote the current code, and I (naturally) shaped it to match my
ideas during the development process, and I'm much more familiar with
it.
Mmm, we need someone else's opinion :) What I think on this point
is described just above... OK, I'll try to describe this in other
words.
The embedded approach increases the state and code paths roughly
on a multiplicative basis. The separate approach adds them on an
additive basis. I think this is the most significant reason why I
feel it is 'clearer'.
Of course, the acceptable complexity differs according to the
fundamental complexity, performance, required memory or something
else, but I feel this is too much complexity for the objective.
Omitting the support for functional dependencies is a bit unfortunate,
I think. Is that merely to make the PoC simpler, or is there something
that makes it impossible to support that kind of stats?
I don't think so. I omitted it simply because it would take more
time to implement.
Another thing that I noticed is that you completely removed the code
that combined multiple stats (and selected the best combination of
stats). In other words, you've reverted to the intermediate single
statistics approach, including removing the improved handling of OR
clauses and conditions.
Yeah, good catch :p I noticed just after submitting the patch
that I retain only one statistic at the second level from the
bottom, but that is easily fixed by changing the pruning timing.
The struct can hold multiple statistics anyway.
And I don't omit the OR case. It is handled along with the AND
case. (In the wrong way?)
It's a bit difficult to judge the proposed
approach not knowing how well it supports those (quite crucial)
features. What if it can't support some of them, or what if it makes the
code much more complicated (thus defeating the goal of making it more
clear)?
OR is supported, and fdep is probably supportable, but all of it
occurs within the function with the entangled name
(transform..something). But I should consider your latest code
more before that.
I share your concern about the performance impact - one thing is that
this new code might be slower than the original one, but a more
serious issue IMHO is that the performance impact will happen even for
relations with no multivariate stats at all. The original patch was
very careful about getting ~0% overhead in such cases,
I don't think so. find_stats runs pull_varnos and
transformRestric.. also uses pull_varnos to bail out at the top
level. They should have almost the same overhead in that case.
and if the new
code does not allow that, I don't see this approach as acceptable. We
must not put additional overhead on people not using multivariate
stats.
But I think it's worth exploring this idea a bit more - can you rebase
it to the current patch version (as on github) and add the missing
pieces (functional dependencies, multi-statistics estimation and
passing conditions)?
With pleasure. Please wait for a while.
One more thing - I noticed you extended the pg_operator catalog with an
oprmvstat attribute, used to flag operators that are compatible with
multivariate stats. I'm not happy with the current approach (using
oprrest to make this decision), but I'm not really sure this is a good
solution either. The culprit is that it only answers one of the two
important questions - Is it compatible? How to perform the estimation?
Honestly speaking, I also don't like this. But checking oprrest is
just as unpleasant.
So we'd have to rely on oprrest anyway, when actually performing the
estimation of a clause with "compatible" operator. And we'd have to
keep in sync two places (catalog and checks in file), and we'd have to
update the catalog after improving the implementation (adding support
for another operator).
Mmm. It depends on what the developers think about the definition
of oprrest. More practically, I'm worried about whether it can be
anything other than eqsel for an equality operator. And the same
for comparison operators.
regards,
--
Kyotaro Horiguchi
NTT Open Source Software Center
Hello Horiguchi-san,
On 07/27/2015 09:04 AM, Kyotaro HORIGUCHI wrote:
Hello,
At Sat, 25 Jul 2015 23:09:31 +0200, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote in <55B3FB0B.7000201@2ndquadrant.com>
Hi,
On 07/16/2015 01:51 PM, Kyotaro HORIGUCHI wrote:
Hi, I'd like to show you the modified structure of the
multivariate statistics application logic. Please find the
attached. They apply on your v7 patch.
Sadly I do have some trouble getting it to apply correctly :-(
So for now all my comments are based on just reading the code.
Ah. My modification to rebase to master at the time is probably
the culprit. Sorry for the dirty patch.
# I would have recreated the patch if you had complained before
# struggling with the thing..
The core of the modification is in clausesel.c. I attached the
patched clausesel.c.
I don't see any attachment. Perhaps you forgot to actually attach it?
My concern about the code at the time was the following,
- You embedded the logic of multivariate estimation into
clauselist_selectivity. I think estimation using multivariate
statistics is quite different from the ordinary estimation based
on single-column stats, so the two are logically separable and
we should separate them.
I don't see them as very different, actually quite the opposite. The two
kinds of statistics are complementary and should naturally coexist.
Perhaps the current code is not perfect and a refactoring would make the
code more readable, but I don't think its primary aim should be to
separate regular and multivariate stats.
- You are taking a top-down approach that walks the tree to
check the applicability of mv-stats at every step down the
clause tree. If a subtree is found to be mv-applicable, it is
split into two parts - mv-compatible and non-compatible. These
steps require expression tree walking, which looks like it uses
too much CPU.
I'm taking top-down approach because that's what the regular stats do,
and also because that's what allows implementing the features that I
think are interesting - ability to combine multiple stats in an
efficient way, pass conditions and such. I think those two features are
very useful and allow very interesting things.
The bottom-up would work too, probably - I mean, we could start from
leaves of the expression tree, and build the largest "subtree"
compatible with multivariate stats and then try to estimate it. I don't
see how we could pass conditions though, which works naturally in the
top-down approach.
Or maybe a combination of both - identify the "compatible" subtrees
first, then perform the top-down phase.
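Regarding conditions - just to be explicit about what I mean by
passing them: when one part of a clause list is estimated separately,
the clauses already being estimated elsewhere can be handed down as
conditions, so the stats can estimate a conditional probability
instead of assuming independence. Roughly, for clauses A and B covered
by the same statistics, P(A AND B) = P(A) * P(B | A), and it's the
P(B | A) part that requires knowing about A while estimating B -
that's exactly the information the conditions carry.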
- You seem to be considering cases where users create many
multivariate statistics on overlapping attribute sets. But that
looks like too much to me. MV-stats are more resource-hungry, so
we can assume they will be used sparingly.
Not really. I don't expect huge numbers of multivariate stats to be
built on the tables.
But I think restricting users to a single multivariate statistic
per table would be a significant limitation. And once you allow
using multiple multivariate statistics for a set of clauses,
supporting overlapping stats is not that difficult.
What it however makes possible is combining multiple "small" stats into
a larger one in a very efficient way - assuming the overlap is
sufficient, of course. But if that's true, you may build multiple small
(and very accurate) stats instead of one huge (and likely inaccurate)
statistic.
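For example, with the syntax from the patch you might build two
overlapping statistics (just an illustration):

ALTER TABLE t ADD STATISTICS ON (a, b, c);
ALTER TABLE t ADD STATISTICS ON (b, c, d);

and let the planner combine them through the shared columns (b, c)
when a query references all four columns.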
This also makes it possible to handle complex combinations of clauses
that are compatible and incompatible with multivariate statistics, by
passing the conditions.
My suggestion in the patch is a bottom-up approach to find the
mv-applicable portion(s) of the expression tree, which is the
planner's basic way of working overall. The approach requires no
repeated runs of the tree walker, that is, pull_varnos. It could
fail to find the 'optimal' solution in complex situations, but it
needs far less calculation for almost the same return (I think..).
Even though it doesn't consider functional dependencies, the
reduction in code shows the efficiency. It does nothing tricky.
OK
On a conceptual level, I think the idea to split the estimation into
two phases - enrich the expression tree with nodes with details about
stats etc, and then actually do the estimation in the second phase
might be interesting. Not because it's somehow clearer, but because it
gives us a chance to see the expression tree as a whole, with details
about all the stats (with the current code we process/estimate the
tree incrementally). But I don't really know how useful that would be.
It is difficult to say which approach is better, since that is
affected by what we consider more important than other things.
However, I am concerned that your code substantially reconstructs
the expression (clause) tree in the midst of processing it. I
believe that should be a separate phase, for simplicity. Of course,
the additional resources required should also be considered, but
they are rather reduced in this case.
What do you mean by "reconstruct the expression tree"? It's true I'm
walking the expression tree top-down, but how is that reconstructing?
I don't think the proposed change makes the process somehow clearer. I
know it's a PoC at this point, so I don't expect it to be perfect, but
for me the original code is certainly clearer. Of course, I'm biased
as I wrote the current code, and I (naturally) shaped it to match my
ideas during the development process, and I'm much more familiar with
it.
Mmm, we need someone else's opinion :) What I think on this point
is described just above... OK, I'll try to describe this in other
words.
I find your comments very valuable. I may not agree with some of them,
but I certainly appreciate your point of view. So thank you very much
for the time you spent reviewing this patch so far!
The embedded approach increases the state and code paths roughly
on a multiplicative basis. The separate approach adds them on an
additive basis. I think this is the most significant reason why I
feel it is 'clearer'.
Of course, the acceptable complexity differs according to the
fundamental complexity, performance, required memory or something
else, but I feel this is too much complexity for the objective.
Yes, I think we might have slightly different objectives in mind.
Regarding the complexity - I am not too worried about spending more CPU
cycles on this, as long as it does not impact the case where people have
no multivariate statistics at all. That's because I expect people to use
this for large DSS/DWH data sets with lots of dependencies in the (often
denormalized) tables and complex conditions - in those cases the
planning difference is negligible, especially if the improved estimates
make the query run in seconds instead of hours.
This is why I was so careful to entirely skip the expensive processing
when there were no multivariate stats, and why I don't like the fact
that your approach makes this skip more difficult (or maybe impossible,
I'm not sure).
It's also true that most OLTP queries (especially the short ones, thus
most impacted by the increase of planning time) use rather short/simple
clause lists, so even the top-down approach should be very cheap.
Omitting the support for functional dependencies is a bit unfortunate,
I think. Is that merely to make the PoC simpler, or is there something
that makes it impossible to support that kind of stats?I don't think so. I ommited it simply because it would more time
to implement.
OK, thanks for confirming this.
Another thing that I noticed is that you completely removed the code
that combined multiple stats (and selected the best combination of
stats). In other words, you've reverted to the intermediate single
statistics approach, including removing the improved handling of OR
clauses and conditions.
Yeah, good catch :p I noticed just after submitting the patch
that I retain only one statistic at the second level from the
bottom, but that is easily fixed by changing the pruning timing.
The struct can hold multiple statistics anyway.
Great!
And I don't omit the OR case. It is handled along with the AND
case. (In the wrong way?)
Oh, I see. I got a bit confused because you've removed the optimization
step (and conditions), and that needs to be handled a bit differently
for the OR clauses.
It's a bit difficult to judge the proposed
approach not knowing how well it supports those (quite crucial)
features. What if it can't support some of them, or what if it makes the
code much more complicated (thus defeating the goal of making it more
clear)?
OR is supported, and fdep is probably supportable, but all of it
occurs within the function with the entangled name
(transform..something). But I should consider your latest code
more before that.
Good. Likewise, I'd like to see more of your approach ;-)
I share your concern about the performance impact - one thing is that
this new code might be slower than the original one, but a more
serious issue IMHO is that the performance impact will happen even for
relations with no multivariate stats at all. The original patch was
very careful about getting ~0% overhead in such cases,
I don't think so. find_stats runs pull_varnos and
transformRestric.. also uses pull_varnos to bail out at the top
level. They should have almost the same overhead in that case.
Understood. As I explained above, I'm not all that concerned about the
performance impact, as long as we make sure it only applies to people
using the multivariate stats.
I also think a combined approach - first a bottom-up step (identifying
the largest compatible subtrees & caching the varnos), then a top-down
step (doing the same optimization as implemented today) might minimize
the performance impact.
and if the new
code does not allow that, I don't see this approach as acceptable. We
must not put additional overhead on people not using multivariate
stats.
But I think it's worth exploring this idea a bit more - can you rebase
it to the current patch version (as on github) and add the missing
pieces (functional dependencies, multi-statistics estimation and
passing conditions)?
With pleasure. Please wait for a while.
Sure. Take your time.
One more thing - I noticed you extended the pg_operator catalog with an
oprmvstat attribute, used to flag operators that are compatible with
multivariate stats. I'm not happy with the current approach (using
oprrest to make this decision), but I'm not really sure this is a good
solution either. The culprit is that it only answers one of the two
important questions - Is it compatible? How to perform the estimation?
Honestly speaking, I also don't like this. But checking oprrest is
just as unpleasant.
The patch is already quite massive, so let's use the same approach as
current stats, and leave this problem for another patch. If we come up
with a great idea, we can work on it, but I see this as a loosely
related annoyance rather than something this patch aims to address.
So we'd have to rely on oprrest anyway, when actually performing the
estimation of a clause with "compatible" operator. And we'd have to
keep in sync two places (catalog and checks in file), and we'd have to
update the catalog after improving the implementation (adding support
for another operator).
Mmm. It depends on what the developers think about the definition
of oprrest. More practically, I'm worried about whether it can be
anything other than eqsel for an equality operator. And the same
for comparison operators.
OTOH if you define a new operator with oprrest=F_EQSEL, you're
effectively saying "It's OK to estimate this using regular eq/lt/gt
operators". If your operator is somehow incompatible with that, you
should not set oprrest=F_EQSEL.
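For example (a made-up operator, just to illustrate the point):

CREATE OPERATOR === (
    LEFTARG = int4,
    RIGHTARG = int4,
    PROCEDURE = int4eq,
    COMMUTATOR = ===,
    RESTRICT = eqsel,
    JOIN = eqjoinsel
);

By picking RESTRICT = eqsel you're already declaring that the operator
behaves like equality for estimation purposes, so treating it as
equality for multivariate stats seems consistent with that.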
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 05/25/2015 11:43 PM, Tomas Vondra wrote:
There are 6 files attached, but only 0002-0006 are actually part of the
multivariate statistics patch itself.
All of these patches are huge. In order to review this in a reasonable
amount of time, we need to do this in several steps. So let's see what
would be the minimal set of these patches that could be reviewed and
committed, while still being useful.
The main patches are:
1. shared infrastructure and functional dependencies
2. clause reduction using functional dependencies
3. multivariate MCV lists
4. multivariate histograms
5. multi-statistics estimation
Would it make sense to commit only patches 1 and 2 first? Would that be
enough to get a benefit from this?
I have some doubts about the clause reduction and functional
dependencies part of this. It seems to treat functional dependency as a
boolean property, but even with the classic zipcode and city case, it's
not always an all or nothing thing. At least in some countries, there
can be zipcodes that span multiple cities. So zipcode=X does not
completely imply city=Y, although there is a strong correlation (if
that's the right term). How strong does the correlation need to be for
this patch to decide that zipcode implies city? I couldn't actually see
a clear threshold stated anywhere.
So rather than treating functional dependence as a boolean, I think it
would make more sense to attach a 0.0-1.0 number to it. That means that
you can't do clause reduction like it's done in this patch, where you
actually remove clauses from the query for cost estimation purposes.
Instead, you need to calculate the selectivity for each clause
independently, but instead of just multiplying the selectivities
together, apply the "dependence factor" to it.
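For example, with a dependence factor f for "zipcode implies city",
the combined selectivity could be computed something like this (just a
sketch of the idea, not a patch):

/*
 * f = 1.0 means zipcode fully implies city, f = 0.0 means the
 * columns are independent. With weight f the city clause adds no
 * new restriction (it is implied); with weight (1 - f) we fall
 * back to the independence assumption.
 */
Selectivity
selectivity_with_dependence(Selectivity s_zipcode,
                            Selectivity s_city,
                            double f)
{
    return s_zipcode * (f + (1.0 - f) * s_city);
}

That way a dependency that holds for, say, 99% of the rows still
behaves almost like a full implication, without removing any clause.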
Does that make sense? I haven't really looked at the MCV, histogram and
"multi-statistics estimation" patches yet. Do those patches make the
clause reduction patch obsolete? Should we forget about the clause
reduction and functional dependency patch, and focus on those later
patches instead?
- Heikki
Hello, I certainly attached the file this time.
At Mon, 27 Jul 2015 23:54:08 +0200, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote in <55B6A880.3050801@2ndquadrant.com>
The core of the modification is in clausesel.c. I attached the
patched clausesel.c.
I don't see any attachment. Perhaps you forgot to actually attach it?
Very sorry to have forgotten to attach it. I attached a new
patch applicable on the head of the mvstats branch of your
repository.
My concern about the code at the time was the following,
- You embedded the logic of multivariate estimation into
clauselist_selectivity. I think estimation using multivariate
statistics is quite different from the ordinary estimation based
on single-column stats, so the two are logically separable and
we should separate them.
I don't see them as very different, actually quite the opposite. The
two kinds of statistics are complementary and should naturally
coexist. Perhaps the current code is not perfect and a refactoring
would make the code more readable, but I don't think its primary aim
should be to separate regular and multivariate stats.
- You are taking a top-down approach that walks the tree to
check the applicability of mv-stats at every step down the
clause tree. If a subtree is found to be mv-applicable, it is
split into two parts - mv-compatible and non-compatible. These
steps require expression tree walking, which looks like it uses
too much CPU.
I'm taking the top-down approach because that's what the regular stats
do, and also because that's what allows implementing the features that
I think are interesting - the ability to combine multiple stats in an
efficient way, pass conditions and such. I think those two features
are very useful and allow very interesting things.
The bottom-up would work too, probably - I mean, we could start from
the leaves of the expression tree, build the largest "subtree"
compatible with multivariate stats and then try to estimate it. I
don't see how we could pass conditions though, which works naturally
in the top-down approach.
By the way, 'conditions' looks to mean what is received by the
parameter of clause(list)_selectivity with the same name. But it
is always NIL. Looking at the comment for collect_mv_attnum, it is
prepared for 'multitable statistics'. If so, I think it's better
removed from the current patch, because it is useless now.
Or maybe a combination of both - identify the "compatible" subtrees
first, then perform the top-down phase.- You look to be considering the cases when users create many
multivariate statistics on attribute sets having
duplications. But it looks too-much for me. MV-stats are more
resource-eating so we can assume the minimum usage of that.Not really. I don't expect huge numbers of multivariate stats to be
built on the tables.But I think restricting the users to use a single multivariate
statistics per table would be a significant limitation. And once you
allow using multiple multivariate statistics for a set of clauses,
supporting overlapping stats is not that difficult.
What it does make possible, however, is combining multiple "small"
stats into a larger one in a very efficient way - assuming the
overlap is sufficient, of course. But if that's true, you may build
multiple small (and very accurate) stats instead of one huge (or
very inaccurate) statistics.
This also makes it possible to handle complex combinations of clauses
that are compatible and incompatible with multivariate statistics, by
passing the conditions.
My suggestion in the patch is a bottom-up approach to find
mv-applicable portion(s) of the expression tree, which is how the
planner works in general. The approach requires no repetitive runs
of a tree walker, that is, pull_varnos. It could fail to find the
'optimal' solution in complex situations, but it needs far less
calculation for almost the same return (I think...).
Even though it doesn't consider functional dependencies yet, the
reduction in code size shows the efficiency. It does nothing tricky.
OK
The functional dependency code looks immature in both the
detection phase and the application phase, in comparison to MCV and
histogram. In addition, as the comment in dependencies.c says, fdeps
are less significant (than MCV/HIST) because they are usually
carefully avoided, and should be noticed and considered when
designing the application or the whole system.
Insisting on applying them all at once doesn't seem like a good
strategy to adopt this early.
Or perhaps it might be better to register the dependency itself,
rather than registering incomplete information (only the set of
columns involved in the relationship) and trying to detect the
relationship from the given values. I suppose those who can register
the column set know the precise nature of the dependency in advance.
On a conceptual level, I think the idea to split the estimation into
two phases - enrich the expression tree with nodes with details about
stats etc, and then actually do the estimation in the second phase
might be interesting. Not because it's somehow clearer, but because it
gives us a chance to see the expression tree as a whole, with details
about all the stats (with the current code we process/estimate the
tree incrementally). But I don't really know how useful that would be.
It is difficult to say which approach is better, since that is
affected by what we consider more important. However, I am concerned
that your code substantially reconstructs the expression (clause)
tree in the midst of processing it. I believe that should be a
separate phase, for simplicity. Of course, the additional resources
required should also be considered, but they are rather modest in
this case.
What do you mean by "reconstruct the expression tree"? It's true I'm
walking the expression tree top-down, but how is that reconstructing?
For example, clauselist_mv_split does. It separates mvclauses from
the original clauselist, applies mv-stats at once, and (perhaps) lets
the rest be processed via the 'normal' route. I called this
"reconstruction", which I tried to do explicitly and separately.
I don't think the proposed change makes the process somehow clearer. I
know it's a PoC at this point, so I don't expect it to be perfect, but
for me the original code is certainly clearer. Of course, I'm biased
as I wrote the current code, and I (naturally) shaped it to match my
ideas during the development process, and I'm much more familiar with
it.
Mmm, we need someone else's opinion :) What I think on this point
is described just above... OK, I'll try to describe it in other
words.
I find your comments very valuable. I may not agree with some of them,
but I certainly appreciate your point of view. So thank you very much
for the time you spent reviewing this patch so far!
Yeah, thank you for your patience and kindness.
The embedded approach increases the state and code paths roughly
multiplicatively, while the separate approach adds to them
additively. I think this is the most significant reason why I feel
it is 'clear'.
Of course, the acceptable complexity differs according to the
fundamental complexity, performance, required memory or something
else, but I feel it is too much complexity for the objective.
Yes, I think we might have slightly different objectives in mind.
Sure! Now I understand what the point is.
Regarding the complexity - I am not too worried about spending more
CPU cycles on this, as long as it does not impact the case where
people have no multivariate statistics at all. That's because I expect
people to use this for large DSS/DWH data sets with lots of
dependencies in the (often denormalized) tables and complex conditions
- in those cases the planning difference is negligible, especially if
the improved estimates make the query run in seconds instead of hours.
I share that vision. If that is the case, the mv-stats route
should not intrude on the existing non-mv-stats route. I feel you
have intruded into clauselist_selectivity all the more.
If that is the case, my mv-distinct code has a different objective
from yours. It aims to prevent the misestimates from multicolumn
correlations that occur more commonly in OLTP usage.
This is why I was so careful to entirely skip the expensive processing
when there were no multivariate stats, and why I don't like the fact
that your approach makes this skip more difficult (or maybe
impossible, I'm not sure).
My code skips it entirely if transformRestrictionForEstimate returns
NULL, and runs clauselist_selectivity as usual. I think that is
almost the same as yours.
However, if that is the concern, I believe we should not only skip
the calculation but also hide the additional code blocks, which
overwhelm the normal route. That is one of the major objectives of
my approach.
It's also true that most OLTP queries (especially the short ones, thus
most impacted by the increase of planning time) use rather
short/simple clause lists, so even the top-down approach should be
very cheap.
Omitting the support for functional dependencies is a bit unfortunate,
I think. Is that merely to make the PoC simpler, or is there something
that makes it impossible to support that kind of stats?
I don't think so. I omitted it simply because it would take more time
to implement.
OK, thanks for confirming this.
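(For context, the reduction that functional dependencies enable -
this is the example from the comment block the patch removes from
clausesel.c: if the value of 'a' determines the value of 'b', then
for

   WHERE (a = 1) AND (b = 2)

the clause on 'b' adds no information, so

   P[(a = 1) & (b = 2)] = P[(a = 1)]

assuming 2 is the 'b' value implied by a = 1; otherwise the result
is empty.)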
Another thing that I noticed is that you completely removed the code
that combined multiple stats (and selected the best combination of
stats). In other words, you've reverted to the intermediate single
statistics approach, including removing the improved handling of OR
clauses and conditions.
Yeah, good catch :p I noticed just after submitting the patch
that I retain only one statistics at the second level from the
bottom, but it is easily fixed by changing the pruning timing. The
struct can hold multiple statistics anyway.
Great!
But sorry - I found that considering multiple stats at every level
cannot be done without exhaustively searching combinations among
child clauses, and it needs an additional data structure. It needs
more thought... As mentioned later, top-down might be more suitable
for this optimization.
And I don't omit the OR case. It is handled along with the AND
case (in the wrong way?).
Oh, I see. I got a bit confused because you've removed the
optimization step (and conditions), and that needs to be handled a bit
differently for the OR clauses.
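To illustrate the AND/OR handling being discussed, here is a tiny
standalone sketch mirroring merge_selectivity() from the attached
patch (the typedefs and the main() driver are mine, inlined only so
it compiles outside the backend):

#include <stdio.h>

typedef double Selectivity;
typedef enum { AND_EXPR, OR_EXPR } BoolExprType;

/*
 * AND-ed estimates combine as a product (independence assumption),
 * OR-ed estimates as s1 + s2 - s1*s2 (inclusion-exclusion, again
 * assuming independence).
 */
static Selectivity
merge_selectivity(Selectivity s1, Selectivity s2, BoolExprType op)
{
	if (op == AND_EXPR)
		return s1 * s2;
	return s1 + s2 - s1 * s2;
}

int
main(void)
{
	/* two clauses, each matching 1% of the rows */
	printf("AND: %g\n", merge_selectivity(0.01, 0.01, AND_EXPR)); /* 0.0001 */
	printf("OR:  %g\n", merge_selectivity(0.01, 0.01, OR_EXPR));  /* 0.0199 */
	return 0;
}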
Sorry to have forced you to read an inapplicable patch :p
It's a bit difficult to judge the proposed approach without knowing
how well it supports those (quite crucial) features. What if it
can't support some of them, or what if it makes the code much more
complicated (thus defeating the goal of making it clearer)?
OR is supported, and fdeps are probably supportable, but all of this
happens within the function with the tangled name
(transform..something). But I should give more consideration to your
latest code before that.
Good. Likewise, I'd like to see more of your approach ;-)
I share your concern about the performance impact - one thing is that
this new code might be slower than the original one, but a more
serious issue IMHO is that the performance impact will happen even for
relations with no multivariate stats at all. The original patch was
very careful about getting ~0% overhead in such cases,
I don't think so. find_stats runs pull_varnos, and
transformRestric.. also uses pull_varnos to bail out at the top
level. They should have almost the same overhead in that case.
Understood. As I explained above, I'm not all that concerned about the
performance impact, as long as we make sure it only applies to people
using the multivariate stats.
I also think a combined approach - first a bottom-up step (identifying
the largest compatible subtrees & caching the varnos), then a top-down
step (doing the same optimization as implemented today) might minimize
the performance impact.
I am almost reaching the same conclusion.
and if the new
code does not allow that, I don't see this approach as acceptable. We
must not put additional overhead on people not using multivariate
stats.
But I think it's worth exploring this idea a bit more - can you rebase
it onto the current patch version (as on github) and add the missing
pieces (functional dependencies, multi-statistics estimation and
passing conditions)?
With pleasure. Please wait for a while.
Sure. Take your time.
One more thing - I noticed you extended the pg_operator catalog with an
oprmvstat attribute, used to flag operators that are compatible with
multivariate stats. I'm not happy with the current approach (using
oprrest to make this decision), but I'm not really sure this is a good
solution either. The culprit is that it only answers one of the two
important questions - Is it compatible? How do we perform the estimation?
Honestly speaking, I also don't like this. But checking oprrest is
just about as unpleasant.
The patch is already quite massive, so let's use the same approach as
current stats, and leave this problem for another patch. If we come up
with a great idea, we can work on it, but I see this as a loosely
related annoyance rather than something this patch aims to address.
Agreed.
So we'd have to rely on oprrest anyway when actually performing the
estimation of a clause with a "compatible" operator. And we'd have to
keep two places in sync (the catalog and the checks in the file), and
we'd have to update the catalog after improving the implementation
(adding support for another operator).
Mmm. It depends on what the developers think about the definition
of oprrest. More practically, I'm worried about whether it can ever
be anything other than eqsel for an equality operator. And the same
for comparison operators.
OTOH if you define a new operator with oprrest=F_EQSEL, you're
effectively saying "It's OK to estimate this using regular eq/lt/gt
operators". If your operator is somehow incompatible with that, you
should not set oprrest=F_EQSEL.
In contrast, some function other than F_EQSEL might be compatible
with mv-statistics.
For all that, it's not my main concern. Although I think they really
are effectively the same, I'm uneasy about using a field apparently
not intended (or suited) to distinguish this kind of operator
property.
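To make the oprrest-based check concrete, a minimal sketch of a
hypothetical helper (the name is mine, not from the patch), deciding
mv-compatibility from the operator's restriction estimator the way
the current code does:

#include "postgres.h"
#include "utils/fmgroids.h"
#include "utils/lsyscache.h"

/*
 * Treat an operator as mv-compatible if its restriction estimator
 * is one of the estimators used by the ordinary equality and
 * comparison operators.
 */
static bool
mv_compatible_operator(Oid opno)
{
	switch (get_oprrest(opno))
	{
		case F_EQSEL:
		case F_SCALARLTSEL:
		case F_SCALARGTSEL:
			return true;
		default:
			return false;
	}
}

The oprmvstat approach in the attached patch would replace this kind
of check with an explicit catalog flag, at the cost of keeping the
two places in sync.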
regards,
--
Kyotaro Horiguchi
NTT Open Source Software Center
Attachments:
0001-Modify-the-estimate-path-to-be-bottom-up-processing.patch (text/x-patch; charset=us-ascii)
From 69da94afdd35ed3469dfe9793db38d895adf2b1e Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi <horiguchi.kyotaro@lab.ntt.co.jp>
Date: Thu, 30 Jul 2015 18:16:30 +0900
Subject: [PATCH] Modify the estimate path to be bottom-up processing.
---
src/backend/catalog/pg_operator.c | 6 +
src/backend/optimizer/path/clausesel.c | 4134 +++++++-------------------------
src/backend/optimizer/path/costsize.c | 23 +-
src/backend/optimizer/util/orclauses.c | 4 +-
src/backend/utils/adt/selfuncs.c | 17 +-
src/backend/utils/cache/lsyscache.c | 40 +
src/include/catalog/pg_operator.h | 1550 ++++++------
src/include/nodes/nodes.h | 1 +
src/include/nodes/relation.h | 22 +-
src/include/optimizer/cost.h | 6 +-
src/include/utils/lsyscache.h | 1 +
src/include/utils/mvstats.h | 3 +
12 files changed, 1693 insertions(+), 4114 deletions(-)
diff --git a/src/backend/catalog/pg_operator.c b/src/backend/catalog/pg_operator.c
index 072f530..dea39d3 100644
--- a/src/backend/catalog/pg_operator.c
+++ b/src/backend/catalog/pg_operator.c
@@ -251,6 +251,9 @@ OperatorShellMake(const char *operatorName,
values[Anum_pg_operator_oprrest - 1] = ObjectIdGetDatum(InvalidOid);
values[Anum_pg_operator_oprjoin - 1] = ObjectIdGetDatum(InvalidOid);
+ /* XXXX: How should this be implemented? */
+ values[Anum_pg_operator_oprmvstat - 1] = CStringGetTextDatum("---");
+
/*
* open pg_operator
*/
@@ -508,6 +511,9 @@ OperatorCreate(const char *operatorName,
values[Anum_pg_operator_oprrest - 1] = ObjectIdGetDatum(restrictionId);
values[Anum_pg_operator_oprjoin - 1] = ObjectIdGetDatum(joinId);
+ /* XXXX: How should this be implemented? */
+ values[Anum_pg_operator_oprmvstat - 1] = CStringGetTextDatum("---");
+
pg_operator_desc = heap_open(OperatorRelationId, RowExclusiveLock);
/*
diff --git a/src/backend/optimizer/path/clausesel.c b/src/backend/optimizer/path/clausesel.c
index 9bb5b3f..b8bb9f3 100644
--- a/src/backend/optimizer/path/clausesel.c
+++ b/src/backend/optimizer/path/clausesel.c
@@ -46,13 +46,6 @@ typedef struct RangeQueryClause
Selectivity hibound; /* Selectivity of a var < something clause */
} RangeQueryClause;
-static Selectivity clauselist_selectivity_or(PlannerInfo *root,
- List *clauses,
- int varRelid,
- JoinType jointype,
- SpecialJoinInfo *sjinfo,
- List *conditions);
-
static void addRangeClause(RangeQueryClause **rqlist, Node *clause,
bool varonleft, bool isLTsel, Selectivity s2);
@@ -60,38 +53,6 @@ static void addRangeClause(RangeQueryClause **rqlist, Node *clause,
#define MV_CLAUSE_TYPE_MCV 0x02
#define MV_CLAUSE_TYPE_HIST 0x04
-static bool clause_is_mv_compatible(PlannerInfo *root, Node *clause, Oid varRelid,
- Index *relid, Bitmapset **attnums, SpecialJoinInfo *sjinfo,
- int type);
-
-static Bitmapset *collect_mv_attnums(PlannerInfo *root, List *clauses,
- Oid varRelid, Index *relid, SpecialJoinInfo *sjinfo,
- int type);
-
-static Bitmapset *clause_mv_get_attnums(PlannerInfo *root, Node *clause);
-
-static List *clauselist_apply_dependencies(PlannerInfo *root, List *clauses,
- Oid varRelid, List *stats,
- SpecialJoinInfo *sjinfo);
-
-static List *clauselist_mv_split(PlannerInfo *root, SpecialJoinInfo *sjinfo,
- List *clauses, Oid varRelid,
- List **mvclauses, MVStatisticInfo *mvstats, int types);
-
-static Selectivity clauselist_mv_selectivity(PlannerInfo *root,
- MVStatisticInfo *mvstats, List *clauses,
- List *conditions, bool is_or);
-
-static Selectivity clauselist_mv_selectivity_mcvlist(PlannerInfo *root,
- MVStatisticInfo *mvstats,
- List *clauses, List *conditions,
- bool is_or, bool *fullmatch,
- Selectivity *lowsel);
-static Selectivity clauselist_mv_selectivity_histogram(PlannerInfo *root,
- MVStatisticInfo *mvstats,
- List *clauses, List *conditions,
- bool is_or);
-
static int update_match_bitmap_mcvlist(PlannerInfo *root, List *clauses,
int2vector *stakeys, MCVList mcvlist,
int nmatches, char * matches,
@@ -104,79 +65,11 @@ static int update_match_bitmap_histogram(PlannerInfo *root, List *clauses,
int nmatches, char * matches,
bool is_or);
-/*
- * Describes a combination of multiple statistics to cover attributes
- * referenced by the clauses. The array 'stats' (with nstats elements)
- * lists attributes (in the order as they are applied), and number of
- * clause attributes covered by this solution.
- *
- * choose_mv_statistics_exhaustive() uses this to track both the current
- * and the best solutions, while walking through the state of possible
- * combination.
- */
-typedef struct mv_solution_t {
- int nclauses; /* number of clauses covered */
- int nconditions; /* number of conditions covered */
- int nstats; /* number of stats applied */
- int *stats; /* stats (in the apply order) */
-} mv_solution_t;
-
-static List *choose_mv_statistics(PlannerInfo *root,
- List *mvstats,
- List *clauses, List *conditions,
- Oid varRelid,
- SpecialJoinInfo *sjinfo);
-
-static List *filter_clauses(PlannerInfo *root, Oid varRelid,
- SpecialJoinInfo *sjinfo, int type,
- List *stats, List *clauses,
- Bitmapset **attnums);
-
-static List *filter_stats(List *stats, Bitmapset *new_attnums,
- Bitmapset *all_attnums);
-
-static Bitmapset **make_stats_attnums(MVStatisticInfo *mvstats,
- int nmvstats);
-
-static MVStatisticInfo *make_stats_array(List *stats, int *nmvstats);
-
-static List* filter_redundant_stats(List *stats,
- List *clauses, List *conditions);
-
-static Node** make_clauses_array(List *clauses, int *nclauses);
-
-static Bitmapset ** make_clauses_attnums(PlannerInfo *root, Oid varRelid,
- SpecialJoinInfo *sjinfo, int type,
- Node **clauses, int nclauses);
-
-static bool* make_cover_map(Bitmapset **stats_attnums, int nmvstats,
- Bitmapset **clauses_attnums, int nclauses);
-
-static bool has_stats(List *stats, int type);
-
-static List * find_stats(PlannerInfo *root, List *clauses,
- Oid varRelid, Index *relid);
-
-static Bitmapset* fdeps_collect_attnums(List *stats);
-
-static int *make_idx_to_attnum_mapping(Bitmapset *attnums);
-static int *make_attnum_to_idx_mapping(Bitmapset *attnums);
-
-static bool *build_adjacency_matrix(List *stats, Bitmapset *attnums,
- int *idx_to_attnum, int *attnum_to_idx);
-
-static void multiply_adjacency_matrix(bool *matrix, int natts);
-
static List* fdeps_reduce_clauses(List *clauses,
Bitmapset *attnums, bool *matrix,
int *idx_to_attnum, int *attnum_to_idx,
Index relid);
-static Bitmapset *fdeps_filter_clauses(PlannerInfo *root,
- List *clauses, Bitmapset *deps_attnums,
- List **reduced_clauses, List **deps_clauses,
- Oid varRelid, Index *relid, SpecialJoinInfo *sjinfo);
-
static Bitmapset * get_varattnos(Node * node, Index relid);
int mvstat_search_type = MVSTAT_SEARCH_GREEDY;
@@ -188,397 +81,41 @@ int mvstat_search_type = MVSTAT_SEARCH_GREEDY;
#define UPDATE_RESULT(m,r,isor) \
(m) = (isor) ? (MAX(m,r)) : (MIN(m,r))
+typedef enum mv_selec_status
+{
+ NORMAL,
+ FULL_MATCH,
+ FAILURE
+} mv_selec_status;
+
/****************************************************************************
* ROUTINES TO COMPUTE SELECTIVITIES
****************************************************************************/
+/***************/
+RestrictStatData *
+transformRestrictInfoForEstimate(PlannerInfo *root, List *clauses, int varRelid, SpecialJoinInfo *sjinfo);
/*
- * clauselist_selectivity -
- * Compute the selectivity of an implicitly-ANDed list of boolean
- * expression clauses. The list can be empty, in which case 1.0
- * must be returned. List elements may be either RestrictInfos
- * or bare expression clauses --- the former is preferred since
- * it allows caching of results.
- *
- * See clause_selectivity() for the meaning of the additional parameters.
- *
- * Our basic approach is to take the product of the selectivities of the
- * subclauses. However, that's only right if the subclauses have independent
- * probabilities, and in reality they are often NOT independent. So,
- * we want to be smarter where we can.
- *
- * Currently, the only extra smarts we have is to recognize "range queries",
- * such as "x > 34 AND x < 42". Clauses are recognized as possible range
- * query components if they are restriction opclauses whose operators have
- * scalarltsel() or scalargtsel() as their restriction selectivity estimator.
- * We pair up clauses of this form that refer to the same variable. An
- * unpairable clause of this kind is simply multiplied into the selectivity
- * product in the normal way. But when we find a pair, we know that the
- * selectivities represent the relative positions of the low and high bounds
- * within the column's range, so instead of figuring the selectivity as
- * hisel * losel, we can figure it as hisel + losel - 1. (To visualize this,
- * see that hisel is the fraction of the range below the high bound, while
- * losel is the fraction above the low bound; so hisel can be interpreted
- * directly as a 0..1 value but we need to convert losel to 1-losel before
- * interpreting it as a value. Then the available range is 1-losel to hisel.
- * However, this calculation double-excludes nulls, so really we need
- * hisel + losel + null_frac - 1.)
- *
- * If either selectivity is exactly DEFAULT_INEQ_SEL, we forget this equation
- * and instead use DEFAULT_RANGE_INEQ_SEL. The same applies if the equation
- * yields an impossible (negative) result.
- *
- * A free side-effect is that we can recognize redundant inequalities such
- * as "x < 4 AND x < 5"; only the tighter constraint will be counted.
- *
- * Of course this is all very dependent on the behavior of
- * scalarltsel/scalargtsel; perhaps some day we can generalize the approach.
- *
- *
- * Multivariate statististics
- * --------------------------
- * This also uses multivariate stats to estimate combinations of
- * conditions, in a way (a) maximizing the estimate accuracy by using
- * as many stats as possible, and (b) minimizing the overhead,
- * especially when there are no suitable multivariate stats (so if you
- * are not using multivariate stats, there's no additional overhead).
- *
- * The following checks are performed (in this order), and the optimizer
- * falls back to regular stats on the first 'false'.
- *
- * NOTE: This explains how this works with all the patches applied, not
- * just the functional dependencies.
- *
- * (0) check if there are multivariate stats on the relation
- *
- * If no, just skip all the following steps (directly to the
- * original code).
- *
- * (1) check how many attributes are there in conditions compatible
- * with functional dependencies
- *
- * Only simple equality clauses are considered compatible with
- * functional dependencies (and that's unlikely to change, because
- * that's the only case when functional dependencies are useful).
- *
- * If there are no conditions that might be handled by multivariate
- * stats, or if the conditions reference just a single column, it
- * makes no sense to use functional dependencies, so skip to (4).
- *
- * (2) reduce the clauses using functional dependencies
- *
- * This simply attempts to 'reduce' the clauses by applying functional
- * dependencies. For example if there are two clauses:
- *
- * WHERE (a = 1) AND (b = 2)
- *
- * and we know that 'a' determines the value of 'b', we may remove
- * the second condition (b = 2) when computing the selectivity.
- * This is of course tricky - see mvstats/dependencies.c for details.
- *
- * After the reduction, step (1) is to be repeated.
- *
- * (3) check how many attributes are there in conditions compatible
- * with MCV lists and histograms
- *
- * What conditions are compatible with multivariate stats is decided
- * by clause_is_mv_compatible(). At this moment, only conditions
- * of the form "column operator constant" (for simple comparison
- * operators), IS [NOT] NULL and some AND/OR clauses are considered
- * compatible with multivariate statistics.
- *
- * Again, see clause_is_mv_compatible() for details.
- *
- * (4) check how many attributes are there in conditions compatible
- * with MCV lists and histograms
- *
- * If there are no conditions that might be handled by MCV lists
- * or histograms, or if the conditions reference just a single
- * column, it makes no sense to continue, so just skip to (7).
- *
- * (5) choose the stats matching the most columns
- *
- * If there are multiple instances of multivariate statistics (e.g.
- * built on different sets of columns), we choose the stats covering
- * the most columns from step (1). It may happen that all available
- * stats match just a single column - for example with conditions
- *
- * WHERE a = 1 AND b = 2
- *
- * and statistics built on (a,c) and (b,c). In such case just fall
- * back to the regular stats because it makes no sense to use the
- * multivariate statistics.
- *
- * For more details about how exactly we choose the stats, see
- * choose_mv_statistics().
- *
- * (6) use the multivariate stats to estimate matching clauses
- *
- * (7) estimate the remaining clauses using the regular statistics
+ * and_clause_selectivity -
*/
-Selectivity
-clauselist_selectivity(PlannerInfo *root,
+static Selectivity
+and_clause_selectivity(PlannerInfo *root,
List *clauses,
int varRelid,
JoinType jointype,
- SpecialJoinInfo *sjinfo,
- List *conditions)
+ SpecialJoinInfo *sjinfo)
{
Selectivity s1 = 1.0;
RangeQueryClause *rqlist = NULL;
ListCell *l;
- /* processing mv stats */
- Index relid = InvalidOid;
-
- /* attributes in mv-compatible clauses */
- Bitmapset *mvattnums = NULL;
- List *stats = NIL;
-
- /* use clauses (not conditions), because those are always non-empty */
- stats = find_stats(root, clauses, varRelid, &relid);
-
- /*
- * If there's exactly one clause, then no use in trying to match up
- * pairs, or matching multivariate statistics, so just go directly
- * to clause_selectivity().
- */
- if (list_length(clauses) == 1)
- return clause_selectivity(root, (Node *) linitial(clauses),
- varRelid, jointype, sjinfo, conditions);
-
- /*
- * Check that there are some stats with functional dependencies
- * built (by walking the stats list). We're going to find that
- * anyway when trying to apply the functional dependencies, but
- * this is probably a tad faster.
- */
- if (has_stats(stats, MV_CLAUSE_TYPE_FDEP))
- {
- /*
- * Collect attributes referenced by mv-compatible clauses (looking
- * for clauses compatible with functional dependencies for now).
- */
- mvattnums = collect_mv_attnums(root, clauses, varRelid, &relid, sjinfo,
- MV_CLAUSE_TYPE_FDEP);
-
- /*
- * If there are mv-compatible clauses, referencing at least two
- * different columns (otherwise it makes no sense to use mv stats),
- * try to reduce the clauses using functional dependencies, and
- * recollect the attributes from the reduced list.
- *
- * We don't need to select a single statistics for this - we can
- * apply all the functional dependencies we have.
- */
- if (bms_num_members(mvattnums) >= 2)
- clauses = clauselist_apply_dependencies(root, clauses, varRelid,
- stats, sjinfo);
- }
-
- /*
- * Check that there are statistics with MCV list or histogram.
- * If not, we don't need to waste time with the optimization.
- */
- if (has_stats(stats, MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST))
- {
- /*
- * Recollect attributes from mv-compatible clauses (maybe we've
- * removed so many clauses we have a single mv-compatible attnum).
- * From now on we're only interested in MCV-compatible clauses.
- */
- mvattnums = collect_mv_attnums(root, clauses, varRelid, &relid, sjinfo,
- (MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST));
-
- /*
- * If there still are at least two columns, we'll try to select
- * a suitable combination of multivariate stats. If there are
- * multiple combinations, we'll try to choose the best one.
- * See choose_mv_statistics for more details.
- */
- if (bms_num_members(mvattnums) >= 2)
- {
- int k;
- ListCell *s;
-
- /*
- * Copy the list of conditions, so that we can build a list
- * of local conditions (and keep the original intact, for
- * the other clauses at the same level).
- */
- List *conditions_local = list_copy(conditions);
-
- /* find the best combination of statistics */
- List *solution = choose_mv_statistics(root, stats,
- clauses, conditions,
- varRelid, sjinfo);
-
- /* we have a good solution (list of stats) */
- foreach (s, solution)
- {
- MVStatisticInfo *mvstat = (MVStatisticInfo *)lfirst(s);
-
- /* clauses compatible with multi-variate stats */
- List *mvclauses = NIL;
- List *mvclauses_new = NIL;
- List *mvclauses_conditions = NIL;
- Bitmapset *stat_attnums = NULL;
-
- /* build attnum bitmapset for this statistics */
- for (k = 0; k < mvstat->stakeys->dim1; k++)
- stat_attnums = bms_add_member(stat_attnums,
- mvstat->stakeys->values[k]);
-
- /*
- * Append the compatible conditions (passed from above)
- * to mvclauses_conditions.
- */
- foreach (l, conditions)
- {
- Node *c = (Node*)lfirst(l);
- Bitmapset *tmp = clause_mv_get_attnums(root, c);
-
- if (bms_is_subset(tmp, stat_attnums))
- mvclauses_conditions
- = lappend(mvclauses_conditions, c);
-
- bms_free(tmp);
- }
-
- /* split the clauselist into regular and mv-clauses
- *
- * We keep the list of clauses (we don't remove the
- * clauses yet, because we want to use the clauses
- * as conditions of other clauses).
- *
- * FIXME Do this only once, i.e. filter the clauses
- * once (selecting clauses covered by at least
- * one statistics) and then convert them into
- * smaller per-statistics lists of conditions
- * and estimated clauses.
- */
- clauselist_mv_split(root, sjinfo, clauses,
- varRelid, &mvclauses, mvstat,
- (MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST));
-
- /*
- * We've chosen the statistics to match the clauses, so
- * each statistics from the solution should have at least
- * one new clause (not covered by the previous stats).
- */
- Assert(mvclauses != NIL);
-
- /*
- * Mvclauses now contains only clauses compatible
- * with the currently selected stats, but we have to
- * split that into conditions (already matched by
- * the previous stats), and the new clauses we need
- * to estimate using this stats.
- */
- foreach (l, mvclauses)
- {
- ListCell *p;
- bool covered = false;
- Node *clause = (Node *) lfirst(l);
- Bitmapset *clause_attnums = clause_mv_get_attnums(root, clause);
-
- /*
- * If already covered by previous stats, add it to
- * conditions.
- *
- * TODO Maybe this could be relaxed a bit? Because
- * with complex and/or clauses, this might
- * mean no statistics actually covers such
- * complex clause.
- */
- foreach (p, solution)
- {
- int k;
- Bitmapset *stat_attnums = NULL;
-
- MVStatisticInfo *prev_stat
- = (MVStatisticInfo *)lfirst(p);
-
- /* break if we've ran into current statistic */
- if (prev_stat == mvstat)
- break;
-
- for (k = 0; k < prev_stat->stakeys->dim1; k++)
- stat_attnums = bms_add_member(stat_attnums,
- prev_stat->stakeys->values[k]);
-
- covered = bms_is_subset(clause_attnums, stat_attnums);
-
- bms_free(stat_attnums);
-
- if (covered)
- break;
- }
-
- if (covered)
- mvclauses_conditions
- = lappend(mvclauses_conditions, clause);
- else
- mvclauses_new
- = lappend(mvclauses_new, clause);
- }
-
- /*
- * We need at least one new clause (not just conditions).
- */
- Assert(mvclauses_new != NIL);
-
- /* compute the multivariate stats */
- s1 *= clauselist_mv_selectivity(root, mvstat,
- mvclauses_new,
- mvclauses_conditions,
- false); /* AND */
- }
-
- /*
- * And now finally remove all the mv-compatible clauses.
- *
- * This only repeats the same split as above, but this
- * time we actually use the result list (and feed it to
- * the next call).
- */
- foreach (s, solution)
- {
- /* clauses compatible with multi-variate stats */
- List *mvclauses = NIL;
-
- MVStatisticInfo *mvstat = (MVStatisticInfo *)lfirst(s);
-
- /* split the list into regular and mv-clauses */
- clauses = clauselist_mv_split(root, sjinfo, clauses,
- varRelid, &mvclauses, mvstat,
- (MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST));
-
- /*
- * Add the clauses to the conditions (to be passed
- * to regular clauses), irrespectedly whether it
- * will be used as a condition or a clause here.
- *
- * We only keep the remaining conditions in the
- * clauses (we keep what clauselist_mv_split returns)
- * so we add each MV condition exactly once.
- */
- conditions_local = list_concat(conditions_local, mvclauses);
- }
-
- /* from now on, work with the 'local' list of conditions */
- conditions = conditions_local;
- }
- }
-
/*
* If there's exactly one clause, then no use in trying to match up
* pairs, so just go directly to clause_selectivity().
*/
if (list_length(clauses) == 1)
return clause_selectivity(root, (Node *) linitial(clauses),
- varRelid, jointype, sjinfo, conditions);
-
+ varRelid, jointype, sjinfo);
/*
* Initial scan over clauses. Anything that doesn't look like a potential
* rangequery clause gets multiplied into s1 and forgotten. Anything that
@@ -591,8 +128,7 @@ clauselist_selectivity(PlannerInfo *root,
Selectivity s2;
/* Always compute the selectivity using clause_selectivity */
- s2 = clause_selectivity(root, clause, varRelid, jointype, sjinfo,
- conditions);
+ s2 = clause_selectivity(root, clause, varRelid, jointype, sjinfo);
/*
* Check for being passed a RestrictInfo.
@@ -750,270 +286,333 @@ clauselist_selectivity(PlannerInfo *root,
return s1;
}
-/*
- * Similar to clauselist_selectivity(), but for clauses connected by OR.
- *
- * That means a few differences:
- *
- * - functional dependencies don't apply to OR-clauses
- *
- * - we can't add the previous clauses to conditions
- *
- * - combined selectivities are combined using (s1+s2 - s1*s2)
- * and not as a multiplication (s1*s2)
- *
- * Another way to evaluate this might be turning
- *
- * (a OR b OR c)
- *
- * into
- *
- * NOT ((NOT a) AND (NOT b) AND (NOT c))
- *
- * and computing selectivity of that using clauselist_selectivity().
- * That would allow (a) using the clauselist_selectivity directly and
- * (b) using the previous clauses as conditions. Not sure if it's
- * worth the additional complexity, though.
- *
- * FIXME I'm not entirely sure, but ISTM to me that the clauses might
- * be processed repeatedly - once for each statistics in the
- * solution. E.g. with (a=1 OR b=1 OR c=1) and statistics on
- * [a,b] and [b,c], we can't use [b=1] with both stats, because
- * we can't combine those using conditional probabilities as with
- * AND clauses (no conditions with OR clauses).
- *
- * FIXME Maybe we'll need an alternative choose_mv_statistics for OR
- * clauses, because we can't do so complicated stuff anyway
- * (conditions, etc.). We generally need to split the clauses
- * into multiple disjunct subsets, each estimated separately.
- * So just search for the smallest number of stats, covering the
- * clauses.
- *
- * Or maybe just get rid of all this and use the simple formula
- *
- * s1 + s2 * (s1*s2) formula, which seems to be working
- *
- * quite reasonably.
- */
static Selectivity
-clauselist_selectivity_or(PlannerInfo *root,
- List *clauses,
- int varRelid,
- JoinType jointype,
- SpecialJoinInfo *sjinfo,
- List *conditions)
+clause_mcv_selectivity(PlannerInfo *root, MVStatisticInfo *stats,
+ Node *clause, int *status)
{
- Selectivity s1 = 0.0;
- ListCell *l;
-
- /* processing mv stats */
- Index relid = InvalidOid;
+ MCVList mcvlist = NULL;
+ int nmatches = 0;
+ int nconditions = 0;
+ char *matches = NULL;
+ char *condition_matches = NULL;
+ Selectivity s = 0.0;
+ Selectivity t = 0.0;
+ Selectivity u = 0.0;
+ BoolExpr *expr = (BoolExpr*) clause;
+ bool is_or = or_clause(clause);
+ int i;
+ bool fullmatch;
+ Selectivity lowsel;
- /* attributes in mv-compatible clauses */
- Bitmapset *mvattnums = NULL;
- List *stats = NIL;
+ Assert(IsA(expr, BoolExpr));
+
+ if (!expr || not_clause(clause)) /* For now!! */
+ {
+ *status = FAILURE;
+ return 0.0;
+ }
+ if (!stats->mcv_built)
+ {
+ *status = FAILURE;
+ return 0.0;
+ }
+
+ mcvlist = load_mv_mcvlist(stats->mvoid);
+ Assert (mcvlist != NULL);
+ Assert (mcvlist->nitems > 0);
- /* use clauses (not conditions), because those are always non-empty */
- stats = find_stats(root, clauses, varRelid, &relid);
+ nmatches = mcvlist->nitems;
+ nconditions = mcvlist->nitems;
+ matches = palloc0(sizeof(char) * nmatches);
- /* OR-clauses do not work with functional dependencies */
- if (has_stats(stats, MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST))
- {
- /*
- * Recollect attributes from mv-compatible clauses (maybe we've
- * removed so many clauses we have a single mv-compatible attnum).
- * From now on we're only interested in MCV-compatible clauses.
- */
- mvattnums = collect_mv_attnums(root, clauses, varRelid, &relid, sjinfo,
- (MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST));
+ if (!is_or) /* AND-clause */
+ memset(matches, MVSTATS_MATCH_FULL, sizeof(char)*nmatches);
- /*
- * If there still are at least two columns, we'll try to select
- * a suitable multivariate stats.
- */
- if (bms_num_members(mvattnums) >= 2)
- {
- int k;
- ListCell *s;
+ /* Conditions are treated as AND clause, so match by default. */
+ condition_matches = palloc0(sizeof(char)*nconditions);
+ memset(condition_matches, MVSTATS_MATCH_FULL, sizeof(char)*nconditions);
- List *solution
- = choose_mv_statistics(root, stats,
- clauses, conditions,
- varRelid, sjinfo);
+ nmatches = update_match_bitmap_mcvlist(root, expr->args,
+ stats->stakeys, mcvlist,
+ (is_or ? 0 : nmatches), matches,
+ &lowsel, &fullmatch, is_or);
- /* we have a good solution stats */
- foreach (s, solution)
- {
- Selectivity s2;
- MVStatisticInfo *mvstat = (MVStatisticInfo *)lfirst(s);
+ for (i = 0; i < mcvlist->nitems; i++)
+ {
+ u += mcvlist->items[i]->frequency;
+
+ if (condition_matches[i] == MVSTATS_MATCH_NONE)
+ continue;
- /* clauses compatible with multi-variate stats */
- List *mvclauses = NIL;
- List *mvclauses_new = NIL;
- List *mvclauses_conditions = NIL;
- Bitmapset *stat_attnums = NULL;
+ if (matches[i] != MVSTATS_MATCH_NONE)
+ s += mcvlist->items[i]->frequency;
- /* build attnum bitmapset for this statistics */
- for (k = 0; k < mvstat->stakeys->dim1; k++)
- stat_attnums = bms_add_member(stat_attnums,
- mvstat->stakeys->values[k]);
+ t += mcvlist->items[i]->frequency;
+ }
- /*
- * Append the compatible conditions (passed from above)
- * to mvclauses_conditions.
- */
- foreach (l, conditions)
- {
- Node *c = (Node*)lfirst(l);
- Bitmapset *tmp = clause_mv_get_attnums(root, c);
+ pfree(matches);
+ pfree(condition_matches);
+ pfree(mcvlist);
- if (bms_is_subset(tmp, stat_attnums))
- mvclauses_conditions
- = lappend(mvclauses_conditions, c);
+ if (fullmatch)
+ *status = FULL_MATCH;
- bms_free(tmp);
- }
+ /* mcv_low is omitted for now */
- /* split the clauselist into regular and mv-clauses
- *
- * We keep the list of clauses (we don't remove the
- * clauses yet, because we want to use the clauses
- * as conditions of other clauses).
- *
- * FIXME Do this only once, i.e. filter the clauses
- * once (selecting clauses covered by at least
- * one statistics) and then convert them into
- * smaller per-statistics lists of conditions
- * and estimated clauses.
- */
- clauselist_mv_split(root, sjinfo, clauses,
- varRelid, &mvclauses, mvstat,
- (MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST));
+ /* no condition matches */
+ if (t == 0.0)
+ return (Selectivity)0.0;
- /*
- * We've chosen the statistics to match the clauses, so
- * each statistics from the solution should have at least
- * one new clause (not covered by the previous stats).
- */
- Assert(mvclauses != NIL);
+ return (s / t) * u;
+}
- /*
- * Mvclauses now contains only clauses compatible
- * with the currently selected stats, but we have to
- * split that into conditions (already matched by
- * the previous stats), and the new clauses we need
- * to estimate using this stats.
- *
- * XXX We'll only use the new clauses, but maybe we
- * should use the conditions too, somehow. We can't
- * use that directly in conditional probability, but
- * maybe we might use them in a different way?
- *
- * If we have a clause (a OR b OR c), then knowing
- * that 'a' is TRUE means (b OR c) can't make the
- * whole clause FALSE.
- *
- * This is pretty much what
- *
- * (a OR b) == NOT ((NOT a) AND (NOT b))
- *
- * implies.
- */
- foreach (l, mvclauses)
- {
- ListCell *p;
- bool covered = false;
- Node *clause = (Node *) lfirst(l);
- Bitmapset *clause_attnums = clause_mv_get_attnums(root, clause);
+static Selectivity
+clause_hist_selectivity(PlannerInfo *root, MVStatisticInfo *stats,
+ Node *clause, int *status)
+{
+ MVSerializedHistogram mvhist = NULL;
+ int nmatches = 0;
+ int nconditions = 0;
+ char *matches = NULL;
+ char *condition_matches = NULL;
+ Selectivity s = 0.0;
+ Selectivity t = 0.0;
+ Selectivity u = 0.0;
+ BoolExpr *expr = (BoolExpr*) clause;
+ bool is_or = or_clause(clause);
+ int i;
- /*
- * If already covered by previous stats, add it to
- * conditions.
- *
- * TODO Maybe this could be relaxed a bit? Because
- * with complex and/or clauses, this might
- * mean no statistics actually covers such
- * complex clause.
- */
- foreach (p, solution)
- {
- int k;
- Bitmapset *stat_attnums = NULL;
+ Assert(IsA(expr, BoolExpr));
- MVStatisticInfo *prev_stat
- = (MVStatisticInfo *)lfirst(p);
+ if (!expr || not_clause(clause)) /* for now */
+ {
+ *status = 0;
+ return 0.0;
+ }
+ if (!stats->hist_built)
+ {
+ *status = 1;
+ return 0.0;
+ }
+ mvhist = load_mv_histogram(stats->mvoid);
+ Assert (mvhist != NULL);
+ Assert (clause != NULL);
- /* break if we've ran into current statistic */
- if (prev_stat == mvstat)
- break;
+ nmatches = mvhist->nbuckets;
+ nconditions = mvhist->nbuckets;
+ matches = palloc0(sizeof(char) * nmatches);
+ if (!is_or) /* AND-clause */
+ memset(matches, MVSTATS_MATCH_FULL, sizeof(char)*nmatches);
- for (k = 0; k < prev_stat->stakeys->dim1; k++)
- stat_attnums = bms_add_member(stat_attnums,
- prev_stat->stakeys->values[k]);
+ /* Conditions are treated as AND clause, so match by default. */
+ condition_matches = palloc0(sizeof(char)*nconditions);
+ memset(condition_matches, MVSTATS_MATCH_FULL, sizeof(char)*nconditions);
- covered = bms_is_subset(clause_attnums, stat_attnums);
+ update_match_bitmap_histogram(root, expr->args, stats->stakeys, mvhist,
+ (is_or ? 0 : nmatches), matches, is_or);
- bms_free(stat_attnums);
+ for (i = 0; i < mvhist->nbuckets; i++)
+ {
+ float coeff = 1.0;
+ u += mvhist->buckets[i]->ntuples;
- if (covered)
- break;
- }
+ if (condition_matches[i] == MVSTATS_MATCH_NONE)
+ continue;
+ else if (condition_matches[i] == MVSTATS_MATCH_PARTIAL)
+ coeff = 0.5;
- if (! covered)
- mvclauses_new = lappend(mvclauses_new, clause);
- }
+ t += coeff * mvhist->buckets[i]->ntuples;
- /*
- * We need at least one new clause (not just conditions).
- */
- Assert(mvclauses_new != NIL);
+ if (matches[i] == MVSTATS_MATCH_FULL)
+ s += coeff * mvhist->buckets[i]->ntuples;
+ else if (matches[i] == MVSTATS_MATCH_PARTIAL)
+ s += coeff * 0.5 * mvhist->buckets[i]->ntuples;
+ }
- /* compute the multivariate stats */
- s2 = clauselist_mv_selectivity(root, mvstat,
- mvclauses_new,
- mvclauses_conditions,
- true); /* OR */
+ pfree(matches);
+ pfree(condition_matches);
+ pfree(mvhist);
- s1 = s1 + s2 - s1 * s2;
- }
+ /* no condition matches */
+ if (t == 0.0)
+ return (Selectivity)0.0;
- /*
- * And now finally remove all the mv-compatible clauses.
- *
- * This only repeats the same split as above, but this
- * time we actually use the result list (and feed it to
- * the next call).
- */
- foreach (s, solution)
- {
- /* clauses compatible with multi-variate stats */
- List *mvclauses = NIL;
+ return (s / t) * u;
+}
- MVStatisticInfo *mvstat = (MVStatisticInfo *)lfirst(s);
+static Selectivity
+apply_mvstats(PlannerInfo *root, Node *clause, bm_mvstat *statent)
+{
+ Selectivity s1 = 0.0;
+ int status;
- /* split the list into regular and mv-clauses */
- clauses = clauselist_mv_split(root, sjinfo, clauses,
- varRelid, &mvclauses, mvstat,
- (MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST));
- }
- }
+ if (statent->mvkind & MVSTATISTIC_MCV)
+ {
+ s1 = clause_mcv_selectivity(root, statent->stats, clause, &status);
+ if (status == FULL_MATCH && s1 > 0.0)
+ return s1;
}
+
+ if (statent->mvkind & MVSTATISTIC_HIST)
+ s1 = s1 + clause_hist_selectivity(root, statent->stats,
+ clause, &status);
- /*
- * Handle the remaining clauses (either using regular statistics,
- * or by multivariate stats at the next level).
- */
- foreach(l, clauses)
- {
- Selectivity s2 = clause_selectivity(root,
- (Node *) lfirst(l),
- varRelid,
- jointype,
- sjinfo,
- conditions);
+ return s1;
+}
+
+static inline Selectivity
+merge_selectivity(Selectivity s1, Selectivity s2, BoolExprType op)
+{
+ if (op == AND_EXPR)
+ s1 = s1 * s2;
+ else
s1 = s1 + s2 - s1 * s2;
+
+ return s1;
+}
+/*
+ * mvclause_selectivity -
+ */
+static Selectivity
+mvclause_selectivity(PlannerInfo *root,
+ RestrictStatData *rstat,
+ int varRelid,
+ JoinType jointype,
+ SpecialJoinInfo *sjinfo)
+{
+ Selectivity s1;
+ ListCell *lc;
+
+ if (!rstat->mvclause && !rstat->nonmvclause && !rstat->children)
+ return clause_selectivity(root, rstat->clause, varRelid, jointype,
+ sjinfo);
+
+ if (rstat->boolop == NOT_EXPR)
+ {
+ RestrictStatData *clause =
+ (RestrictStatData *)linitial(rstat->children);
+
+ s1 = 1.0 - mvclause_selectivity(root, clause, varRelid,
+ jointype, sjinfo);
+ return s1;
+ }
+
+ s1 = (rstat->boolop == AND_EXPR ? 1.0 : 0.0);
+
+ if (rstat->nonmvclause)
+ s1 = merge_selectivity(s1,
+ clause_selectivity(root, rstat->nonmvclause,
+ varRelid, jointype, sjinfo),
+ rstat->boolop);
+
+ if (rstat->mvclause)
+ {
+ bm_mvstat *mvs = (bm_mvstat*)linitial(rstat->mvstats);
+ Selectivity s2 = apply_mvstats(root, rstat->mvclause, mvs);
+
+ /* Fall back to ordinary calculation */
+ if (s2 < 0)
+ s2 = clause_selectivity(root, rstat->mvclause, varRelid,
+ jointype, sjinfo);
+ s1 = merge_selectivity(s1, s2, rstat->boolop);
+ }
+
+ foreach(lc, rstat->children)
+ {
+ RestrictStatData *rsd = (RestrictStatData *) lfirst(lc);
+ Assert(IsA(rsd, RestrictStatData));
+
+ s1 = merge_selectivity(s1,
+ mvclause_selectivity(root, rsd, varRelid,
+ jointype, sjinfo),
+ rstat->boolop);
+ }
+
+ return s1;
+}
+
+
+/*
+ * clauselist_selectivity -
+ * Compute the selectivity of an implicitly-ANDed list of boolean
+ * expression clauses. The list can be empty, in which case 1.0
+ * must be returned. List elements may be either RestrictInfos
+ * or bare expression clauses --- the former is preferred since
+ * it allows caching of results.
+ *
+ * See clause_selectivity() for the meaning of the additional parameters.
+ *
+ * Our basic approach is to take the product of the selectivities of the
+ * subclauses. However, that's only right if the subclauses have independent
+ * probabilities, and in reality they are often NOT independent. So,
+ * we want to be smarter where we can.
+ *
+ * Currently, the only extra smarts we have is to recognize "range queries",
+ * such as "x > 34 AND x < 42". Clauses are recognized as possible range
+ * query components if they are restriction opclauses whose operators have
+ * scalarltsel() or scalargtsel() as their restriction selectivity estimator.
+ * We pair up clauses of this form that refer to the same variable. An
+ * unpairable clause of this kind is simply multiplied into the selectivity
+ * product in the normal way. But when we find a pair, we know that the
+ * selectivities represent the relative positions of the low and high bounds
+ * within the column's range, so instead of figuring the selectivity as
+ * hisel * losel, we can figure it as hisel + losel - 1. (To visualize this,
+ * see that hisel is the fraction of the range below the high bound, while
+ * losel is the fraction above the low bound; so hisel can be interpreted
+ * directly as a 0..1 value but we need to convert losel to 1-losel before
+ * interpreting it as a value. Then the available range is 1-losel to hisel.
+ * However, this calculation double-excludes nulls, so really we need
+ * hisel + losel + null_frac - 1.)
+ *
+ * If either selectivity is exactly DEFAULT_INEQ_SEL, we forget this equation
+ * and instead use DEFAULT_RANGE_INEQ_SEL. The same applies if the equation
+ * yields an impossible (negative) result.
+ *
+ * A free side-effect is that we can recognize redundant inequalities such
+ * as "x < 4 AND x < 5"; only the tighter constraint will be counted.
+ *
+ * Of course this is all very dependent on the behavior of
+ * scalarltsel/scalargtsel; perhaps some day we can generalize the approach.
+ *
+ *
+ * Multivariate statistics
+ * --------------------------
+ * This also uses multivariate stats to estimate combinations of
+ * conditions, in a way (a) maximizing the estimate accuracy by using
+ * as many stats as possible, and (b) minimizing the overhead,
+ * especially when there are no suitable multivariate stats (so if you
+ * are not using multivariate stats, there's no additional overhead).
+ *
+ * The following checks are performed (in this order), and the optimizer
+ * falls back to regular stats on the first 'false'.
+ *
+ * NOTE: This explains how this works with all the patches applied, not
+ * just the functional dependencies.
+ *
+ */
+Selectivity
+clauselist_selectivity(PlannerInfo *root,
+ List *clauses,
+ int varRelid,
+ JoinType jointype,
+ SpecialJoinInfo *sjinfo)
+{
+ Selectivity s1 = 1.0;
+ RestrictStatData *rstat;
+ List *rinfos = clauses;
+
+ /* Reconstruct clauses so that multivariate statistics can be applied */
+ rstat = transformRestrictInfoForEstimate(root, clauses, varRelid, sjinfo);
+
+ if (rstat)
+ {
+ rinfos = rstat->unusedrinfos;
+
+ s1 = mvclause_selectivity(root, rstat, varRelid, jointype, sjinfo);
}
+ s1 = s1 * and_clause_selectivity(root, rinfos, varRelid, jointype, sjinfo);
+
return s1;
}
@@ -1224,8 +823,7 @@ clause_selectivity(PlannerInfo *root,
Node *clause,
int varRelid,
JoinType jointype,
- SpecialJoinInfo *sjinfo,
- List *conditions)
+ SpecialJoinInfo *sjinfo)
{
Selectivity s1 = 0.5; /* default for any unhandled clause type */
RestrictInfo *rinfo = NULL;
@@ -1355,28 +953,37 @@ clause_selectivity(PlannerInfo *root,
(Node *) get_notclausearg((Expr *) clause),
varRelid,
jointype,
- sjinfo,
- conditions);
+ sjinfo);
}
else if (and_clause(clause))
{
- /* share code with clauselist_selectivity() */
- s1 = clauselist_selectivity(root,
+ s1 = and_clause_selectivity(root,
((BoolExpr *) clause)->args,
varRelid,
jointype,
- sjinfo,
- conditions);
+ sjinfo);
}
else if (or_clause(clause))
{
- /* just call to clauselist_selectivity_or() */
- s1 = clauselist_selectivity_or(root,
- ((BoolExpr *) clause)->args,
- varRelid,
- jointype,
- sjinfo,
- conditions);
+ /*
+ * Selectivities for an OR clause are computed as s1+s2 - s1*s2 to
+ * account for the probable overlap of selected tuple sets.
+ *
+ * XXX is this too conservative?
+ */
+ ListCell *arg;
+
+ s1 = 0.0;
+ foreach(arg, ((BoolExpr *) clause)->args)
+ {
+ Selectivity s2 = clause_selectivity(root,
+ (Node *) lfirst(arg),
+ varRelid,
+ jointype,
+ sjinfo);
+
+ s1 = s1 + s2 - s1 * s2;
+ }
}
else if (is_opclause(clause) || IsA(clause, DistinctExpr))
{
@@ -1469,1895 +1076,51 @@ clause_selectivity(PlannerInfo *root,
jointype,
sjinfo);
}
- else if (IsA(clause, CurrentOfExpr))
- {
- /* CURRENT OF selects at most one row of its table */
- CurrentOfExpr *cexpr = (CurrentOfExpr *) clause;
- RelOptInfo *crel = find_base_rel(root, cexpr->cvarno);
-
- if (crel->tuples > 0)
- s1 = 1.0 / crel->tuples;
- }
- else if (IsA(clause, RelabelType))
- {
- /* Not sure this case is needed, but it can't hurt */
- s1 = clause_selectivity(root,
- (Node *) ((RelabelType *) clause)->arg,
- varRelid,
- jointype,
- sjinfo,
- conditions);
- }
- else if (IsA(clause, CoerceToDomain))
- {
- /* Not sure this case is needed, but it can't hurt */
- s1 = clause_selectivity(root,
- (Node *) ((CoerceToDomain *) clause)->arg,
- varRelid,
- jointype,
- sjinfo,
- conditions);
- }
-
- /* Cache the result if possible */
- if (cacheable)
- {
- if (jointype == JOIN_INNER)
- rinfo->norm_selec = s1;
- else
- rinfo->outer_selec = s1;
- }
-
-#ifdef SELECTIVITY_DEBUG
- elog(DEBUG4, "clause_selectivity: s1 %f", s1);
-#endif /* SELECTIVITY_DEBUG */
-
- return s1;
-}
-
-
-/*
- * Estimate selectivity for the list of MV-compatible clauses, using
- * using a MV statistics (combining a histogram and MCV list).
- *
- * This simply passes the estimation to the MCV list and then to the
- * histogram, if available.
- *
- * TODO Clamp the selectivity by min of the per-clause selectivities
- * (i.e. the selectivity of the most restrictive clause), because
- * that's the maximum we can ever get from ANDed list of clauses.
- * This may probably prevent issues with hitting too many buckets
- * and low precision histograms.
- *
- * TODO We may support some additional conditions, most importantly
- * those matching multiple columns (e.g. "a = b" or "a < b").
- * Ultimately we could track multi-table histograms for join
- * cardinality estimation.
- *
- * TODO Further thoughts on processing equality clauses: Maybe it'd be
- * better to look for stats (with MCV) covered by the equality
- * clauses, because then we have a chance to find an exact match
- * in the MCV list, which is pretty much the best we can do. We may
- * also look at the least frequent MCV item, and use it as a upper
- * boundary for the selectivity (had there been a more frequent
- * item, it'd be in the MCV list).
- *
- * TODO There are several options for 'sanity clamping' the estimates.
- *
- * First, if we have selectivities for each condition, then
- *
- * P(A,B) <= MIN(P(A), P(B))
- *
- * Because additional conditions (connected by AND) can only lower
- * the probability.
- *
- * So we can do some basic sanity checks using the single-variate
- * stats (the ones we have right now).
- *
- * Second, when we have multivariate stats with a MCV list, then
- *
- * (a) if we have a full equality condition (one equality condition
- * on each column) and we found a match in the MCV list, this is
- * the selectivity (and it's supposed to be exact)
- *
- * (b) if we have a full equality condition and we haven't found a
- * match in the MCV list, then the selectivity is below the
- * lowest selectivity in the MCV list
- *
- * (c) if we have a equality condition (not full), we can still
- * search the MCV for matches and use the sum of probabilities
- * as a lower boundary for the histogram (if there are no
- * matches in the MCV list, then we have no boundary)
- *
- * Third, if there are multiple (combinations of) multivariate
- * stats for a set of clauses, we may compute all of them and then
- * somehow aggregate them - e.g. by choosing the minimum, median or
- * average. The stats are susceptible to overestimation (because
- * we take 50% of the bucket for partial matches). Some stats may
- * give better estimates than others, but it's very difficult to
- * say that in advance which one is the best (it depends on the
- * number of buckets, number of additional columns not referenced
- * in the clauses, type of condition etc.).
- *
- * So we may compute them all and then choose a sane aggregation
- * (minimum seems like a good approach). Of course, this may result
- * in longer / more expensive estimation (CPU-wise), but it may be
- * worth it.
- *
- * It's possible to add a GUC choosing whether to do a 'simple'
- * (using a single stats expected to give the best estimate) and
- * 'complex' (combining the multiple estimates).
- *
- * multivariate_estimates = (simple|full)
- *
- * Also, this might be enabled at a table level, by something like
- *
- * ALTER TABLE ... SET STATISTICS (simple|full)
- *
- * Which would make it possible to use this only for the tables
- * where the simple approach does not work.
- *
- * Also, there are ways to optimize this algorithmically. E.g. we
- * may try to get an estimate from a matching MCV list first, and
- * if we happen to get a "full equality match" we may stop computing
- * the estimates from other stats (for this condition) because
- * that's probably the best estimate we can really get.
- *
- * TODO When applying the clauses to the histogram/MCV list, we can do
- * that from the most selective clauses first, because that'll
- * eliminate the buckets/items sooner (so we'll be able to skip
- * them without inspection, which is more expensive). But this
- * requires really knowing the per-clause selectivities in advance,
- * and that's not what we do now.
- *
- * TODO All this is based on the assumption that the statistics represent
- * the necessary dependencies, i.e. that if two colunms are not in
- * the same statistics, there's no dependency. If that's not the
- * case, we may get misestimates, just like before. For example
- * assume we have a table with three columns [a,b,c] with exactly
- * the same values, and statistics on [a,b] and [b,c]. So somthing
- * like this:
- *
- * CREATE TABLE test AS SELECT i, i, i
- FROM generate_series(1,1000);
- *
- * ALTER TABLE test ADD STATISTICS (mcv) ON (a,b);
- * ALTER TABLE test ADD STATISTICS (mcv) ON (b,c);
- *
- * ANALYZE test;
- *
- * EXPLAIN ANALYZE SELECT * FROM test
- * WHERE (a < 10) AND (b < 20) AND (c < 10);
- *
- * The problem here is that the only shared column between the two
- * statistics is 'b' so the probability will be computed like this
- *
- * P[(a < 10) & (b < 20) & (c < 10)]
- * = P[(a < 10) & (b < 20)] * P[(c < 10) | (a < 10) & (b < 20)]
- * = P[(a < 10) & (b < 20)] * P[(c < 10) | (b < 20)]
- *
- * or like this
- *
- * P[(a < 10) & (b < 20) & (c < 10)]
- * = P[(b < 20) & (c < 10)] * P[(a < 10) | (b < 20) & (c < 10)]
- * = P[(b < 20) & (c < 10)] * P[(a < 10) | (b < 20)]
- *
- * In both cases the conditional probabilities will be evaluated as
- * 0.5, because they lack the other column (which would make it 1.0).
- *
- * Theoretically it might be possible to transfer the dependency,
- * e.g. by building bitmap for [a,b] and then combine it with [b,c]
- * by doing something like this:
- *
- * 1) build bitmap on [a,b] using [(a<10) & (b < 20)]
- * 2) for each element in [b,c] check the bitmap
- *
- * But that's certainly nontrivial - for example the statistics may
- * be different (MCV list vs. histogram) and/or the items may not
- * match (e.g. MCV items or histogram buckets will be built
- * differently). Also, for one value of 'b' there might be multiple
- * MCV items (because of the other column values) with different
- * bitmap values (some will match, some won't) - so it's not exactly
- * a bitmap but a partial match.
- *
- * Maybe a hash table with number of matches and mismatches (or
- * maybe sums of frequencies) would work? The step (2) would then
- * look up the values and use that to weight the item somehow.
- *
- * Currently the only solution is to build statistics on all three
- * columns.
- */
-static Selectivity
-clauselist_mv_selectivity(PlannerInfo *root, MVStatisticInfo *mvstats,
- List *clauses, List *conditions, bool is_or)
-{
- bool fullmatch = false;
- Selectivity s1 = 0.0, s2 = 0.0;
-
- /*
- * Lowest frequency in the MCV list (may be used as an upper bound
- * for full equality conditions that did not match any MCV item).
- */
- Selectivity mcv_low = 0.0;
-
- /* TODO Evaluate simple 1D selectivities, use the smallest one as
- * an upper bound, product as lower bound, and sort the
- * clauses in ascending order by selectivity (to optimize the
- * MCV/histogram evaluation).
- */
-
- /* Evaluate the MCV first. */
- s1 = clauselist_mv_selectivity_mcvlist(root, mvstats,
- clauses, conditions, is_or,
- &fullmatch, &mcv_low);
-
- /*
- * If we got a full equality match on the MCV list, we're done (and
- * the estimate is pretty good).
- */
- if (fullmatch && (s1 > 0.0))
- return s1;
-
- /* FIXME if (fullmatch) without matching MCV item, use the mcv_low
- * selectivity as upper bound */
-
- s2 = clauselist_mv_selectivity_histogram(root, mvstats,
- clauses, conditions, is_or);
-
- /* TODO clamp to <= 1.0 (or more strictly, when possible) */
- return s1 + s2;
-}
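
To make the misestimate described in the TODO above concrete, here is a
tiny standalone program (not part of the patch - the 0.5 partial-match
factor is the assumption from that comment, the other numbers come from
its [a,b] / [b,c] example):

#include <stdio.h>

/*
 * Standalone illustration of the [a,b] + [b,c] decomposition from the
 * comment above, using its example table (a = b = c = i, i = 1..1000).
 * The 0.5 factor models the "partial match" produced when a statistics
 * lacks one of the columns; the exact value is an assumption.
 */
int
main(void)
{
    double  p_ab = 9.0 / 1000;      /* P[(a < 10) & (b < 20)] - (a < 10) implies (b < 20) */
    double  p_c_given_b = 0.5;      /* P[(c < 10) | (b < 20)] - partial match only */
    double  true_sel = 9.0 / 1000;  /* all three conditions reduce to (i < 10) */

    printf("estimate = %f\n", p_ab * p_c_given_b);  /* 0.004500 */
    printf("actual   = %f\n", true_sel);            /* 0.009000 */

    /* with statistics on all three columns, the estimate would be 0.009 */
    return 0;
}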
-
-/*
- * Collect attributes from mv-compatible clauses.
- */
-static Bitmapset *
-collect_mv_attnums(PlannerInfo *root, List *clauses, Oid varRelid,
- Index *relid, SpecialJoinInfo *sjinfo, int types)
-{
- Bitmapset *attnums = NULL;
- ListCell *l;
-
- /*
- * Walk through the clauses and identify the ones we can estimate
- * using multivariate stats, and remember the relid/columns. We'll
- * then cross-check if we have suitable stats, and only if needed
- * we'll split the clauses into multivariate and regular lists.
- *
- * For now we're only interested in RestrictInfo nodes with nested
- * OpExpr, using either a range or equality.
- */
- foreach (l, clauses)
- {
- Node *clause = (Node *) lfirst(l);
-
- /* ignore the result here - we only need the attnums */
- clause_is_mv_compatible(root, clause, varRelid, relid, &attnums,
- sjinfo, types);
- }
-
- /*
- * If there are not at least two attributes referenced by the clause(s),
- * we can throw everything out (as we'll revert to simple stats).
- */
- if (bms_num_members(attnums) <= 1)
- {
- bms_free(attnums);
- attnums = NULL;
- *relid = InvalidOid;
- }
-
- return attnums;
-}
-
-/*
- * Selects the best combination of multivariate statistics, in an
- * exhaustive way, where 'best' means:
- *
- * (a) covering the most attributes (referenced by clauses)
- * (b) using the least number of multivariate stats
- * (c) using the most conditions to exploit dependency
- *
- * There may be other optimality criteria, not considered in the initial
- * implementation (more on that in the 'weaknesses' section).
- *
- * This pretty much splits the probability of clauses (aka selectivity)
- * into a sequence of conditional probabilities, like this
- *
- * P(A,B,C,D) = P(A,B) * P(C|A,B) * P(D|A,B,C)
- *
- * and removing the attributes not referenced by the existing stats,
- * under the assumption that there's no dependency (otherwise the DBA
- * would create the stats).
- *
- * The last criterion means that when we have the choice to compute
- * like this
- *
- * P(A,B,C,D) = P(A,B,C) * P(D|B,C)
- *
- * or like this
- *
- * P(A,B,C,D) = P(A,B,C) * P(D|C)
- *
- * we should use the first option, as that exploits more dependencies.
- *
- * The order of statistics in the solution implicitly determines the
- * order of estimation of clauses, because as we apply a statistics,
- * we always use it to estimate all the clauses covered by it (and
- * then we use those clauses as conditions for the next statistics).
- *
- * Don't call this directly but through choose_mv_statistics().
- *
- *
- * Algorithm
- * ---------
- * The algorithm is a recursive implementation of backtracking, with
- * maximum 'depth' equal to the number of multivariate statistics
- * available on the table.
- *
- * It explores all the possible permutations of the stats.
- *
- * Whenever it considers adding the next statistics, the clauses it
- * matches are divided into 'conditions' (clauses already matched by at
- * least one previous statistics) and clauses that are estimated.
- *
- * Then several checks are performed:
- *
- * (a) The statistics covers at least 2 columns, referenced in the
- * estimated clauses (otherwise multi-variate stats are useless).
- *
- * (b) The statistics covers at least 1 new column, i.e. column not
- * referenced by the already used stats (and the new column has
- * to be referenced by the clauses, of course). Otherwise the
- * statistics would not add any new information.
- *
- * There are some other sanity checks (e.g. that the stats must not be
- * used twice etc.).
- *
- * Finally the new solution is compared to the currently best one, and
- * if it's considered better, it's used instead.
- *
- *
- * Weaknesses
- * ----------
- * The current implementation uses a somewhat simplistic optimality
- * criterion, suffering from the following weaknesses.
- *
- * (a) There may be multiple solutions with the same number of covered
- * attributes and number of statistics (e.g. the same solution but
- * with statistics in a different order). It's unclear which solution
- * is the best one - in a sense all of them are equal.
- *
- * TODO It might be possible to compute estimate for each of those
- * solutions, and then combine them to get the final estimate
- * (e.g. by using average or median).
- *
- * (b) Does not consider that some types of stats are a better match for
- * some types of clauses (e.g. an MCV list is a better match for
- * equality clauses than a histogram).
- *
- * XXX Maybe MCV is almost always better / more accurate?
- *
- * But maybe this is pointless - generally, each column is either
- * a label (whether because of the data type or how it's used is
- * not important), or a value with an ordering that makes sense.
- * So either an MCV list is more appropriate (labels) or a
- * histogram (values with an ordering).
- *
- * Not sure what to do with statistics mixing columns of both
- * types - maybe it'd be better to invent a new type of stats
- * combining MCV list and histogram (keeping a small histogram for
- * each MCV item, and a separate histogram for values not on the
- * MCV list). But that's not implemented at this moment.
- *
- * TODO The algorithm should probably count number of Vars (not just
- * attnums) when computing the 'score' of each solution. Computing
- * the ratio of (num of all vars) / (num of condition vars) as a
- * measure of how well the solution uses conditions might be
- * useful.
- */
-static void
-choose_mv_statistics_exhaustive(PlannerInfo *root, int step,
- int nmvstats, MVStatisticInfo *mvstats, Bitmapset ** stats_attnums,
- int nclauses, Node ** clauses, Bitmapset ** clauses_attnums,
- int nconditions, Node ** conditions, Bitmapset ** conditions_attnums,
- bool *cover_map, bool *condition_map, int *ruled_out,
- mv_solution_t *current, mv_solution_t **best)
-{
- int i, j;
-
- Assert(best != NULL);
- Assert((step == 0 && current == NULL) || (step > 0 && current != NULL));
-
- CHECK_FOR_INTERRUPTS();
-
- if (current == NULL)
- {
- current = (mv_solution_t*)palloc0(sizeof(mv_solution_t));
- current->stats = (int*)palloc0(sizeof(int)*nmvstats);
- current->nstats = 0;
- current->nclauses = 0;
- current->nconditions = 0;
- }
-
- /*
- * Now try to apply each statistics, matching at least two attributes,
- * unless it's already used in one of the previous steps.
- */
- for (i = 0; i < nmvstats; i++)
- {
- int c;
-
- int ncovered_clauses = 0; /* number of covered clauses */
- int ncovered_conditions = 0; /* number of covered conditions */
- int nattnums = 0; /* number of covered attributes */
-
- Bitmapset *all_attnums = NULL;
- Bitmapset *new_attnums = NULL;
-
- /* skip statistics that were already used or eliminated */
- if (ruled_out[i] != -1)
- continue;
-
- /*
- * See if we have clauses covered by this statistics, but not
- * yet covered by any of the preceding ones.
- */
- for (c = 0; c < nclauses; c++)
- {
- bool covered = false;
- Bitmapset *clause_attnums = clauses_attnums[c];
- Bitmapset *tmp = NULL;
-
- /*
- * If this clause is not covered by this stats, we can't
- * use the stats to estimate that at all.
- */
- if (! cover_map[i * nclauses + c])
- continue;
-
- /*
- * Now we know we'll use this clause - either as a condition
- * or as a new clause (the estimated one). So let's add the
- * attributes to the attnums from all the clauses usable with
- * this statistics.
- */
- tmp = bms_union(all_attnums, clause_attnums);
-
- /* free the old bitmap */
- bms_free(all_attnums);
- all_attnums = tmp;
-
- /* let's see if it's covered by any of the previous stats */
- for (j = 0; j < step; j++)
- {
- /* already covered by the previous stats */
- if (cover_map[current->stats[j] * nclauses + c])
- covered = true;
-
- if (covered)
- break;
- }
-
- /* if already covered, continue with the next clause */
- if (covered)
- {
- ncovered_conditions += 1;
- continue;
- }
-
- /*
- * OK, this clause is covered by this statistics (and not by
- * any of the previous ones)
- */
- ncovered_clauses += 1;
-
- /* add the attnums into attnums from 'new clauses' */
- // new_attnums = bms_union(new_attnums, clause_attnums);
- }
-
- /* can't have more new clauses than original clauses */
- Assert(nclauses >= ncovered_clauses);
- Assert(ncovered_clauses >= 0); /* mostly paranoia */
-
- nattnums = bms_num_members(all_attnums);
-
- /* free all the bitmapsets - we don't need them anymore */
- bms_free(all_attnums);
- bms_free(new_attnums);
-
- all_attnums = NULL;
- new_attnums = NULL;
-
- /*
- * Now walk through the conditions covered by this statistics,
- * counting them and collecting their attnums as well.
- */
- for (c = 0; c < nconditions; c++)
- {
- Bitmapset *clause_attnums = conditions_attnums[c];
- Bitmapset *tmp = NULL;
-
- /*
- * If this clause is not covered by this stats, we can't
- * use the stats to estimate that at all.
- */
- if (! condition_map[i * nconditions + c])
- continue;
-
- /* count this as a condition */
- ncovered_conditions += 1;
-
- /*
- * Now we know we'll use this clause - either as a condition
- * or as a new clause (the estimated one). So let's add the
- * attributes to the attnums from all the clauses usable with
- * this statistics.
- */
- tmp = bms_union(all_attnums, clause_attnums);
-
- /* free the old bitmap */
- bms_free(all_attnums);
- all_attnums = tmp;
- }
-
- /*
- * Let's mark the statistics as 'ruled out' - either we'll use
- * it (and proceed to the next step), or it's incompatible.
- */
- ruled_out[i] = step;
-
- /*
- * There are no clauses usable with this statistics (not already
- * covered by some of the previous stats).
- *
- * Similarly, if the clauses only use a single attribute, we
- * can't really use that.
- */
- if ((ncovered_clauses == 0) || (nattnums < 2))
- continue;
-
- /*
- * TODO Not sure if it's possible to add a clause referencing
- * only attributes already covered by previous stats, i.e.
- * introducing only a new dependency and no new attribute.
- * I couldn't come up with an example, though - might be
- * worth adding an assert.
- */
-
- /*
- * got a suitable statistics - let's update the current solution,
- * maybe use it as the best solution
- */
- current->nclauses += ncovered_clauses;
- current->nconditions += ncovered_conditions;
- current->nstats += 1;
- current->stats[step] = i;
-
- /*
- * We can never cover more clauses, or use more stats, than we
- * actually have at the beginning.
- */
- Assert(nclauses >= current->nclauses);
- Assert(nmvstats >= current->nstats);
- Assert(step < nmvstats);
-
- /* we can't get more conditions than clauses and conditions combined
- *
- * FIXME This assert does not work because we count the conditions
- * repeatedly (once for each statistics covering it).
- */
- /* Assert((nconditions + nclauses) >= current->nconditions); */
-
- if (*best == NULL)
- {
- *best = (mv_solution_t*)palloc0(sizeof(mv_solution_t));
- (*best)->stats = (int*)palloc0(sizeof(int)*nmvstats);
- (*best)->nstats = 0;
- (*best)->nclauses = 0;
- (*best)->nconditions = 0;
- }
-
- /*
- * See if it's better than the current 'best' solution, i.e. covers
- * more clauses, or covers the same number of clauses using fewer
- * statistics (per the optimality criteria in the comment above).
- */
- if ((current->nclauses > (*best)->nclauses) ||
- ((current->nclauses == (*best)->nclauses) &&
- ((current->nstats < (*best)->nstats))))
- {
- (*best)->nstats = current->nstats;
- (*best)->nclauses = current->nclauses;
- (*best)->nconditions = current->nconditions;
- memcpy((*best)->stats, current->stats, nmvstats * sizeof(int));
- }
-
- /*
- * The recursion only makes sense if we haven't covered all the
- * attributes (then adding stats is not really possible).
- */
- if ((step + 1) < nmvstats)
- choose_mv_statistics_exhaustive(root, step+1,
- nmvstats, mvstats, stats_attnums,
- nclauses, clauses, clauses_attnums,
- nconditions, conditions, conditions_attnums,
- cover_map, condition_map, ruled_out,
- current, best);
-
- /* reset the last step */
- current->nclauses -= ncovered_clauses;
- current->nconditions -= ncovered_conditions;
- current->nstats -= 1;
- current->stats[step] = 0;
-
- /* mark the statistics as usable again */
- ruled_out[i] = -1;
-
- Assert(current->nclauses >= 0);
- Assert(current->nstats >= 0);
- }
-
- /* reset all statistics as 'incompatible' in this step */
- for (i = 0; i < nmvstats; i++)
- if (ruled_out[i] == step)
- ruled_out[i] = -1;
-
-}
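
For illustration, here is a heavily simplified standalone sketch of the
same backtracking shape (not the patch code - it only tracks a clause
cover map, and ignores conditions, attnum checks and most pruning):

#include <stdio.h>

#define NSTATS   3
#define NCLAUSES 4

/* cover[i][c] == 1 means statistics i covers clause c (made-up data) */
static int cover[NSTATS][NCLAUSES] = {
    {1, 1, 0, 0},   /* stats 0 covers clauses 0,1 */
    {0, 1, 1, 0},   /* stats 1 covers clauses 1,2 */
    {0, 0, 1, 1},   /* stats 2 covers clauses 2,3 */
};

static int best_nclauses = -1;
static int best_nstats = 0;

static void
search(int used[NSTATS], int covered[NCLAUSES], int nstats)
{
    int     i, c, nclauses = 0;

    for (c = 0; c < NCLAUSES; c++)
        nclauses += covered[c];

    /* prefer more covered clauses, then fewer statistics */
    if (nclauses > best_nclauses ||
        (nclauses == best_nclauses && nstats < best_nstats))
    {
        best_nclauses = nclauses;
        best_nstats = nstats;
    }

    for (i = 0; i < NSTATS; i++)
    {
        int     added = 0;
        int     newly[NCLAUSES] = {0};

        if (used[i])
            continue;

        /* which clauses would this statistics cover for the first time? */
        for (c = 0; c < NCLAUSES; c++)
            if (cover[i][c] && !covered[c])
            {
                newly[c] = 1;
                added++;
            }

        if (added == 0)
            continue;       /* adds nothing new, prune this branch */

        /* apply, recurse, undo - the same shape as the function above */
        used[i] = 1;
        for (c = 0; c < NCLAUSES; c++)
            covered[c] |= newly[c];

        search(used, covered, nstats + 1);

        used[i] = 0;
        for (c = 0; c < NCLAUSES; c++)
            covered[c] &= ~newly[c];
    }
}

int
main(void)
{
    int     used[NSTATS] = {0};
    int     covered[NCLAUSES] = {0};

    search(used, covered, 0);
    printf("best: %d clauses using %d stats\n", best_nclauses, best_nstats);
    return 0;
}

With this cover map the search settles on stats 0 and 2, covering all
four clauses with two statistics.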
-
-/*
- * Greedy search for a multivariate solution - a sequence of statistics
- * covering the clauses. This chooses the "best" statistics at each step,
- * so the resulting solution may not be the best solution globally, but
- * this produces the solution in only N steps (where N is the number of
- * statistics), while the exhaustive approach may have to walk through
- * ~N! combinations (although some of those are terminated early).
- *
- * See the comments at choose_mv_statistics_exhaustive() as this does
- * the same thing (but in a different way).
- *
- * Don't call this directly, but through choose_mv_statistics().
- *
- * TODO There are probably other metrics we might use - e.g. using
- * number of columns (num_cond_columns / num_cov_columns), which
- * might work better with a mix of simple and complex clauses.
- *
- * TODO Also the choice at the very first step should be handled
- * in a special way, because there will be 0 conditions at that
- * moment, so there needs to be some other criterion - e.g. using
- * the simplest (or most complex?) clause might be a good idea.
- *
- * TODO We might also select multiple stats using different criteria,
- * and branch the search. This is however tricky, because if we
- * choose k statistics at each step, we get k^N branches to
- * walk through (with N steps). That's not really good with
- * large number of stats (yet better than exhaustive search).
- */
-static void
-choose_mv_statistics_greedy(PlannerInfo *root, int step,
- int nmvstats, MVStatisticInfo *mvstats, Bitmapset ** stats_attnums,
- int nclauses, Node ** clauses, Bitmapset ** clauses_attnums,
- int nconditions, Node ** conditions, Bitmapset ** conditions_attnums,
- bool *cover_map, bool *condition_map, int *ruled_out,
- mv_solution_t *current, mv_solution_t **best)
-{
- int i, j;
- int best_stat = -1;
- double gain, max_gain = -1.0;
-
- /*
- * Bitmap tracking which clauses are already covered (by the previous
- * statistics) and may thus serve only as a condition in this step.
- */
- bool *covered_clauses = (bool*)palloc0(nclauses * sizeof(bool));
-
- /*
- * Number of clauses and columns covered by each statistics - this
- * includes both conditions and clauses covered by the statistics for
- * the first time. The number of columns may count some columns
- * repeatedly - if a column is shared by multiple clauses, it will
- * be counted once for each clause (covered by the statistics).
- * So with two clauses [(a=1 OR b=2),(a<2 OR c>1)] the column "a"
- * will be counted twice (if both clauses are covered).
- *
- * The values for ruled-out statistics (those that can't be
- * applied) are not computed, because that'd be pointless.
- */
- int *num_cov_clauses = (int*)palloc0(sizeof(int) * nmvstats);
- int *num_cov_columns = (int*)palloc0(sizeof(int) * nmvstats);
-
- /*
- * Same as above, but this only includes clauses that are already
- * covered by the previous stats (and the current one).
- */
- int *num_cond_clauses = (int*)palloc0(sizeof(int) * nmvstats);
- int *num_cond_columns = (int*)palloc0(sizeof(int) * nmvstats);
-
- /*
- * Number of attributes for each clause.
- *
- * TODO Might be computed in choose_mv_statistics() and then passed
- * here, but then the function would not have the same signature
- * as _exhaustive().
- */
- int *attnum_counts = (int*)palloc0(sizeof(int) * nclauses);
- int *attnum_cond_counts = (int*)palloc0(sizeof(int) * nconditions);
-
- CHECK_FOR_INTERRUPTS();
-
- Assert(best != NULL);
- Assert((step == 0 && current == NULL) || (step > 0 && current != NULL));
-
- /* compute attributes (columns) for each clause */
- for (i = 0; i < nclauses; i++)
- attnum_counts[i] = bms_num_members(clauses_attnums[i]);
-
- /* compute attributes (columns) for each condition */
- for (i = 0; i < nconditions; i++)
- attnum_cond_counts[i] = bms_num_members(conditions_attnums[i]);
-
- /* see which clauses are already covered at this point (by previous stats) */
- for (i = 0; i < step; i++)
- for (j = 0; j < nclauses; j++)
- covered_clauses[j] |= (cover_map[current->stats[i] * nclauses + j]);
-
- /* which remaining statistics covers most clauses / uses most conditions? */
- for (i = 0; i < nmvstats; i++)
- {
- Bitmapset *attnums_covered = NULL;
- Bitmapset *attnums_conditions = NULL;
-
- /* skip stats that are already ruled out (either used or inapplicable) */
- if (ruled_out[i] != -1)
- continue;
-
- /* count covered clauses and conditions (for the statistics) */
- for (j = 0; j < nclauses; j++)
- {
- if (cover_map[i * nclauses + j])
- {
- Bitmapset *attnums_new
- = bms_union(attnums_covered, clauses_attnums[j]);
-
- /* get rid of the old bitmap and keep the unified result */
- bms_free(attnums_covered);
- attnums_covered = attnums_new;
-
- num_cov_clauses[i] += 1;
- num_cov_columns[i] += attnum_counts[j];
-
- /* is the clause already covered (i.e. a condition)? */
- if (covered_clauses[j])
- {
- num_cond_clauses[i] += 1;
- num_cond_columns[i] += attnum_counts[j];
- attnums_new = bms_union(attnums_conditions,
- clauses_attnums[j]);
-
- bms_free(attnums_conditions);
- attnums_conditions = attnums_new;
- }
- }
- }
-
- /* if all covered clauses are covered by prev stats (thus conditions) */
- if (num_cov_clauses[i] == num_cond_clauses[i])
- ruled_out[i] = step;
-
- /* same if there are no new attributes */
- else if (bms_num_members(attnums_conditions) == bms_num_members(attnums_covered))
- ruled_out[i] = step;
-
- bms_free(attnums_covered);
- bms_free(attnums_conditions);
-
- /* if the statistics is inapplicable, try the next one */
- if (ruled_out[i] != -1)
- continue;
-
- /* now let's walk through conditions and count the covered */
- for (j = 0; j < nconditions; j++)
- {
- if (condition_map[i * nconditions + j])
- {
- num_cond_clauses[i] += 1;
- num_cond_columns[i] += attnum_cond_counts[j];
- }
- }
-
- /* otherwise see if this improves the interesting metrics */
- gain = num_cond_columns[i] / (double)num_cov_columns[i];
-
- if (gain > max_gain)
- {
- max_gain = gain;
- best_stat = i;
- }
- }
-
- /*
- * Have we found a suitable statistics? Add it to the solution and
- * try the next step.
- */
- if (best_stat != -1)
- {
- /* mark the statistics, so that we skip it in next steps */
- ruled_out[best_stat] = step;
-
- /* allocate current solution if necessary */
- if (current == NULL)
- {
- current = (mv_solution_t*)palloc0(sizeof(mv_solution_t));
- current->stats = (int*)palloc0(sizeof(int)*nmvstats);
- current->nstats = 0;
- current->nclauses = 0;
- current->nconditions = 0;
- }
-
- current->nclauses += num_cov_clauses[best_stat];
- current->nconditions += num_cond_clauses[best_stat];
- current->stats[step] = best_stat;
- current->nstats++;
-
- if (*best == NULL)
- {
- (*best) = (mv_solution_t*)palloc0(sizeof(mv_solution_t));
- (*best)->nstats = current->nstats;
- (*best)->nclauses = current->nclauses;
- (*best)->nconditions = current->nconditions;
-
- (*best)->stats = (int*)palloc0(sizeof(int)*nmvstats);
- memcpy((*best)->stats, current->stats, nmvstats * sizeof(int));
- }
- else
- {
- /* see if this is a better solution */
- double current_gain = (double)current->nconditions / current->nclauses;
- double best_gain = (double)(*best)->nconditions / (*best)->nclauses;
-
- if ((current_gain > best_gain) ||
- ((current_gain == best_gain) && (current->nstats < (*best)->nstats)))
- {
- (*best)->nstats = current->nstats;
- (*best)->nclauses = current->nclauses;
- (*best)->nconditions = current->nconditions;
- memcpy((*best)->stats, current->stats, nmvstats * sizeof(int));
- }
- }
-
- /*
- * The recursion only makes sense if we haven't covered all the
- * attributes (then adding stats is not really possible).
- */
- if ((step + 1) < nmvstats)
- choose_mv_statistics_greedy(root, step+1,
- nmvstats, mvstats, stats_attnums,
- nclauses, clauses, clauses_attnums,
- nconditions, conditions, conditions_attnums,
- cover_map, condition_map, ruled_out,
- current, best);
-
- /* reset the last step */
- current->nclauses -= num_cov_clauses[best_stat];
- current->nconditions -= num_cond_clauses[best_stat];
- current->nstats -= 1;
- current->stats[step] = 0;
-
- /* mark the statistics as usable again */
- ruled_out[best_stat] = -1;
- }
-
- /* reset all statistics eliminated in this step */
- for (i = 0; i < nmvstats; i++)
- if (ruled_out[i] == step)
- ruled_out[i] = -1;
-
- /* free everything allocated in this step */
- pfree(covered_clauses);
- pfree(attnum_counts);
- pfree(attnum_cond_counts);
- pfree(num_cov_clauses);
- pfree(num_cov_columns);
- pfree(num_cond_clauses);
- pfree(num_cond_columns);
-}
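
Again for illustration, a standalone sketch of just the greedy step
(the counts are made-up numbers, not the patch code):

#include <stdio.h>

/*
 * Standalone sketch of the greedy choice above: given per-statistics
 * counts of covered columns and of columns already available as
 * conditions, pick the statistics with the highest conditions/covered
 * ratio.
 */
int
main(void)
{
    int     num_cov_columns[]  = {4, 3, 5};
    int     num_cond_columns[] = {1, 2, 2};
    int     i, best = -1;
    double  gain, max_gain = -1.0;

    for (i = 0; i < 3; i++)
    {
        /* same metric as the greedy search: reuse as many conditions
         * (already-estimated columns) as possible per covered column */
        gain = num_cond_columns[i] / (double) num_cov_columns[i];
        if (gain > max_gain)
        {
            max_gain = gain;
            best = i;
        }
    }

    printf("chose stats %d (gain %.2f)\n", best, max_gain); /* stats 1, 0.67 */
    return 0;
}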
-
-/*
- * Chooses the combination of statistics, optimal for estimation of
- * a particular clause list.
- *
- * This only handles a 'preparation' shared by the exhaustive and greedy
- * implementations (see the previous methods), mostly trying to reduce
- * the size of the problem (eliminate clauses/statistics that can't be
- * really used in the solution).
- *
- * It also precomputes bitmaps for attributes covered by clauses and
- * statistics, so that we don't need to do that over and over in the
- * actual optimizations (as it's both CPU and memory intensive).
- *
- * TODO This will probably have to consider compatibility of clauses,
- * because 'dependencies' will probably work only with equality
- * clauses.
- *
- * TODO Another way to make the optimization problems smaller might
- * be splitting the statistics into several disjoint subsets, i.e.
- * if we can split the graph of statistics (after the elimination)
- * into multiple components (so that stats in different components
- * share no attributes), we can do the optimization for each
- * component separately.
- *
- * TODO If we could compute what is a "perfect solution" maybe we could
- * terminate the search after reaching ~90% of it? Say, if we knew
- * that we can cover 10 clauses and reuse 8 dependencies, maybe
- * covering 9 clauses and 7 dependencies would be OK?
- */
-static List*
-choose_mv_statistics(PlannerInfo *root, List *stats,
- List *clauses, List *conditions,
- Oid varRelid, SpecialJoinInfo *sjinfo)
-{
- int i;
- mv_solution_t *best = NULL;
- List *result = NIL;
-
- int nmvstats;
- MVStatisticInfo *mvstats;
-
- /* we only work with MCV lists and histograms here */
- int type = (MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST);
-
- bool *clause_cover_map = NULL,
- *condition_cover_map = NULL;
- int *ruled_out = NULL;
-
- /* build bitmapsets for all stats and clauses */
- Bitmapset **stats_attnums;
- Bitmapset **clauses_attnums;
- Bitmapset **conditions_attnums;
-
- int nclauses, nconditions;
- Node ** clauses_array;
- Node ** conditions_array;
-
- /* copy lists, so that we can free them during elimination easily */
- clauses = list_copy(clauses);
- conditions = list_copy(conditions);
- stats = list_copy(stats);
-
- /*
- * Reduce the optimization problem size as much as possible.
- *
- * Eliminate clauses and conditions not covered by any statistics,
- * or statistics not matching at least two attributes (one of them
- * has to be in a regular clause).
- *
- * It's possible that removing a statistics in one iteration
- * eliminates a clause in the next one, so we repeat this until an
- * iteration eliminates no clauses/stats.
- *
- * This can only happen after eliminating a statistics - clauses are
- * eliminated first, so statistics always reflect that.
- */
- while (true)
- {
- List *tmp;
-
- Bitmapset *compatible_attnums = NULL;
- Bitmapset *condition_attnums = NULL;
- Bitmapset *all_attnums = NULL;
-
- /*
- * Clauses
- *
- * Walk through clauses and keep only those covered by at least
- * one of the statistics we still have. We'll also keep info
- * about attnums in clauses (without conditions) so that we can
- * ignore stats covering just conditions (which is pointless).
- */
- tmp = filter_clauses(root, varRelid, sjinfo, type,
- stats, clauses, &compatible_attnums);
-
- /* discard the original list */
- list_free(clauses);
- clauses = tmp;
-
- /*
- * Conditions
- *
- * Walk through clauses and keep only those covered by at least
- * one of the statistics we still have. Also, collect bitmap of
- * attributes so that we can make sure we add at least one new
- * attribute (by comparing with clauses).
- */
- if (conditions != NIL)
- {
- tmp = filter_clauses(root, varRelid, sjinfo, type,
- stats, conditions, &condition_attnums);
-
- /* discard the original list */
- list_free(conditions);
- conditions = tmp;
- }
-
- /* get a union of attnums (from conditions and new clauses) */
- all_attnums = bms_union(compatible_attnums, condition_attnums);
-
- /*
- * Statistics
- *
- * Walk through statistics and only keep those covering at least
- * one new attribute (excluding conditions) and at least two
- * attributes in both clauses and conditions.
- */
- tmp = filter_stats(stats, compatible_attnums, all_attnums);
-
- /* if we've not eliminated anything, terminate */
- if (list_length(stats) == list_length(tmp))
- break;
-
- /* work only with filtered statistics from now */
- list_free(stats);
- stats = tmp;
- }
-
- /* only do the optimization if we have clauses/statistics */
- if ((list_length(stats) == 0) || (list_length(clauses) == 0))
- return NULL;
-
- /* remove redundant stats (stats covered by another stats) */
- stats = filter_redundant_stats(stats, clauses, conditions);
-
- /*
- * TODO We should sort the stats to make the order deterministic,
- * otherwise we may get different estimates on different
- * executions - if there are multiple "equally good" solutions,
- * we'll keep the first solution we see.
- *
- * Sorting by OID probably is not the right solution though,
- * because we'd like it to be somehow reproducible,
- * irrespective of the order of ADD STATISTICS commands.
- * So maybe statkeys?
- */
- mvstats = make_stats_array(stats, &nmvstats);
- stats_attnums = make_stats_attnums(mvstats, nmvstats);
-
- /* collect clauses and bitmap of attnums */
- clauses_array = make_clauses_array(clauses, &nclauses);
- clauses_attnums = make_clauses_attnums(root, varRelid, sjinfo, type,
- clauses_array, nclauses);
-
- /* collect conditions and bitmap of attnums */
- conditions_array = make_clauses_array(conditions, &nconditions);
- conditions_attnums = make_clauses_attnums(root, varRelid, sjinfo, type,
- conditions_array, nconditions);
-
- /*
- * Build bitmaps with info about which clauses/conditions are
- * covered by each statistics (so that we don't need to call the
- * bms_is_subset over and over again).
- */
- clause_cover_map = make_cover_map(stats_attnums, nmvstats,
- clauses_attnums, nclauses);
-
- condition_cover_map = make_cover_map(stats_attnums, nmvstats,
- conditions_attnums, nconditions);
-
- ruled_out = (int*)palloc0(nmvstats * sizeof(int));
-
- /* no stats are ruled out by default */
- for (i = 0; i < nmvstats; i++)
- ruled_out[i] = -1;
-
- /* do the optimization itself */
- if (mvstat_search_type == MVSTAT_SEARCH_EXHAUSTIVE)
- choose_mv_statistics_exhaustive(root, 0,
- nmvstats, mvstats, stats_attnums,
- nclauses, clauses_array, clauses_attnums,
- nconditions, conditions_array, conditions_attnums,
- clause_cover_map, condition_cover_map,
- ruled_out, NULL, &best);
- else
- choose_mv_statistics_greedy(root, 0,
- nmvstats, mvstats, stats_attnums,
- nclauses, clauses_array, clauses_attnums,
- nconditions, conditions_array, conditions_attnums,
- clause_cover_map, condition_cover_map,
- ruled_out, NULL, &best);
-
- /* create a list of statistics from the array */
- if (best != NULL)
- {
- for (i = 0; i < best->nstats; i++)
- {
- MVStatisticInfo *info = makeNode(MVStatisticInfo);
- memcpy(info, &mvstats[best->stats[i]], sizeof(MVStatisticInfo));
- result = lappend(result, info);
- }
- pfree(best);
- }
-
- /* cleanup (maybe leave it up to the memory context?) */
- for (i = 0; i < nmvstats; i++)
- bms_free(stats_attnums[i]);
-
- for (i = 0; i < nclauses; i++)
- bms_free(clauses_attnums[i]);
-
- for (i = 0; i < nconditions; i++)
- bms_free(conditions_attnums[i]);
-
- pfree(stats_attnums);
- pfree(clauses_attnums);
- pfree(conditions_attnums);
-
- pfree(clauses_array);
- pfree(conditions_array);
- pfree(clause_cover_map);
- pfree(condition_cover_map);
- pfree(ruled_out);
- pfree(mvstats);
-
- list_free(clauses);
- list_free(conditions);
- list_free(stats);
-
- return result;
-}
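
Here is a standalone sketch (not the patch code - attnums are bits in
an unsigned int, and it uses GCC's __builtin_popcount) of the reduction
loop above, showing how dropping a statistics in one iteration can
cascade into dropping a clause in the next:

#include <stdio.h>

#define NBITS(x) __builtin_popcount(x)

int
main(void)
{
    unsigned    clauses[] = {0x1, 0x2, 0x8};    /* clauses on columns a, b, d */
    unsigned    stats[]   = {0x3, 0xc};         /* stats on {a,b} and {c,d} */
    int         clause_ok[] = {1, 1, 1};
    int         stat_ok[]   = {1, 1};
    int         i, j, changed = 1;

    while (changed)
    {
        changed = 0;

        /* drop clauses not covered by any remaining statistics */
        for (i = 0; i < 3; i++)
        {
            int     covered = 0;

            if (!clause_ok[i])
                continue;
            for (j = 0; j < 2; j++)
                if (stat_ok[j] && (clauses[i] & ~stats[j]) == 0)
                    covered = 1;
            if (!covered)
            {
                clause_ok[i] = 0;
                changed = 1;
            }
        }

        /* drop statistics covering fewer than two clause attnums */
        for (j = 0; j < 2; j++)
        {
            unsigned    attnums = 0;

            if (!stat_ok[j])
                continue;
            for (i = 0; i < 3; i++)
                if (clause_ok[i])
                    attnums |= (clauses[i] & stats[j]);
            if (NBITS(attnums) < 2)
            {
                stat_ok[j] = 0;
                changed = 1;
            }
        }
    }

    for (i = 0; i < 3; i++)
        printf("clause %d: %s\n", i, clause_ok[i] ? "kept" : "eliminated");
    for (j = 0; j < 2; j++)
        printf("stats %d: %s\n", j, stat_ok[j] ? "kept" : "eliminated");
    return 0;
}

The first iteration drops the {c,d} statistics (only one clause attnum),
which leaves the clause on d uncovered, so the second iteration drops
that clause too - exactly the cascade the loop comment describes.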
-
-
-/*
- * This splits the clauses list into two parts - one containing clauses
- * that will be evaluated using the chosen statistics, and the remaining
- * clauses (either not mv-compatible, or not covered by the chosen stats).
- */
-static List *
-clauselist_mv_split(PlannerInfo *root, SpecialJoinInfo *sjinfo,
- List *clauses, Oid varRelid, List **mvclauses,
- MVStatisticInfo *mvstats, int types)
-{
- int i;
- ListCell *l;
- List *non_mvclauses = NIL;
-
- /* FIXME is there a better way to get info on int2vector? */
- int2vector * attrs = mvstats->stakeys;
- int numattrs = mvstats->stakeys->dim1;
-
- Bitmapset *mvattnums = NULL;
-
- /* build bitmap of attributes covered by the stats, so we can
- * do bms_is_subset later */
- for (i = 0; i < numattrs; i++)
- mvattnums = bms_add_member(mvattnums, attrs->values[i]);
-
- /* erase the list of mv-compatible clauses */
- *mvclauses = NIL;
-
- foreach (l, clauses)
- {
- bool match = false; /* by default not mv-compatible */
- Bitmapset *attnums = NULL;
- Node *clause = (Node *) lfirst(l);
-
- if (clause_is_mv_compatible(root, clause, varRelid, NULL,
- &attnums, sjinfo, types))
- {
- /* are all the attributes part of the selected stats? */
- if (bms_is_subset(attnums, mvattnums))
- match = true;
- }
-
- /*
- * The clause matches the selected stats, so put it to the list
- * of mv-compatible clauses. Otherwise, keep it in the list of
- * 'regular' clauses (that may be selected later).
- */
- if (match)
- *mvclauses = lappend(*mvclauses, clause);
- else
- non_mvclauses = lappend(non_mvclauses, clause);
- }
-
- /*
- * Return the remaining clauses, to be estimated later using the
- * regular per-column statistics.
- */
- return non_mvclauses;
-
-}
-
-/*
- * Determines whether the clause is compatible with multivariate stats,
- * and if it is, returns some additional information - varno (index
- * into simple_rte_array) and a bitmap of attributes. This is then
- * used to fetch related multivariate statistics.
- *
- * At this moment we only support basic conditions of the form
- *
- * variable OP constant
- *
- * where OP is one of [=,<,<=,>=,>] (which is however determined by
- * looking at the associated function for estimating selectivity, just
- * like with the single-dimensional case).
- *
- * TODO Support 'OR clauses' - shouldn't be all that difficult to
- * evaluate them using multivariate stats.
- */
-static bool
-clause_is_mv_compatible(PlannerInfo *root, Node *clause, Oid varRelid,
- Index *relid, Bitmapset **attnums, SpecialJoinInfo *sjinfo,
- int types)
-{
- Relids clause_relids;
- Relids left_relids;
- Relids right_relids;
-
- if (IsA(clause, RestrictInfo))
- {
- RestrictInfo *rinfo = (RestrictInfo *) clause;
-
- /* Pseudoconstants are not really interesting here. */
- if (rinfo->pseudoconstant)
- return false;
-
- /* get the actual clause from the RestrictInfo (it's not an OR clause) */
- clause = (Node*)rinfo->clause;
-
- /* we don't support join conditions at this moment */
- if (treat_as_join_clause(clause, rinfo, varRelid, sjinfo))
- return false;
-
- clause_relids = rinfo->clause_relids;
- left_relids = rinfo->left_relids;
- right_relids = rinfo->right_relids;
- }
- else if (is_opclause(clause) && list_length(((OpExpr *) clause)->args) == 2)
- {
- left_relids = pull_varnos(get_leftop((Expr*)clause));
- right_relids = pull_varnos(get_rightop((Expr*)clause));
-
- clause_relids = bms_union(left_relids,
- right_relids);
- }
- else
- {
- /* Not a binary opclause, so mark left/right relid sets as empty */
- left_relids = NULL;
- right_relids = NULL;
- /* and get the total relid set the hard way */
- clause_relids = pull_varnos((Node *) clause);
- }
-
- /*
- * Only simple opclauses and IS NULL tests are compatible with
- * multivariate stats at this point.
- */
- if ((is_opclause(clause))
- && (list_length(((OpExpr *) clause)->args) == 2))
- {
- OpExpr *expr = (OpExpr *) clause;
- bool varonleft = true;
- bool ok;
-
- /* is it 'variable op constant' ? */
- ok = (bms_membership(clause_relids) == BMS_SINGLETON) &&
- (is_pseudo_constant_clause_relids(lsecond(expr->args),
- right_relids) ||
- (varonleft = false,
- is_pseudo_constant_clause_relids(linitial(expr->args),
- left_relids)));
-
- if (ok)
- {
- Var * var = (varonleft) ? linitial(expr->args) : lsecond(expr->args);
-
- /*
- * Simple variables only - otherwise the planner_rt_fetch seems to fail
- * (return NULL).
- *
- * TODO Maybe using examine_variable() would fix that?
- */
- if (! (IsA(var, Var) && (varRelid == 0 || varRelid == var->varno)))
- return false;
-
- /*
- * Only consider this variable if (varRelid == 0) or when the varno
- * matches varRelid (see explanation at clause_selectivity).
- *
- * FIXME I suspect this may not be really necessary. The (varRelid == 0)
- * part seems to be enforced by treat_as_join_clause().
- */
- if (! ((varRelid == 0) || (varRelid == var->varno)))
- return false;
-
- /* Also skip special varno values, and system attributes ... */
- if ((IS_SPECIAL_VARNO(var->varno)) || (! AttrNumberIsForUserDefinedAttr(var->varattno)))
- return false;
-
- /* Remember the relid of the base relation (we need to pass it out) */
- if (relid != NULL)
- *relid = var->varno;
-
- /*
- * If it's not a "<" or ">" or "=" operator, just ignore the
- * clause. Otherwise note the relid and attnum for the variable.
- * This uses the function for estimating selectivity, not the
- * operator directly (a bit awkward, but well ...).
- */
- switch (get_oprrest(expr->opno))
- {
- case F_SCALARLTSEL:
- case F_SCALARGTSEL:
- /* not compatible with functional dependencies */
- if (types & (MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST))
- {
- *attnums = bms_add_member(*attnums, var->varattno);
- return (types & (MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST));
- }
- return false;
-
- case F_EQSEL:
- *attnums = bms_add_member(*attnums, var->varattno);
- return true;
- }
- }
- }
- else if (IsA(clause, NullTest)
- && IsA(((NullTest*)clause)->arg, Var))
- {
- Var * var = (Var*)((NullTest*)clause)->arg;
-
- /*
- * Simple variables only - otherwise the planner_rt_fetch seems to fail
- * (return NULL).
- *
- * TODO Maybe using examine_variable() would fix that?
- */
- if (! (IsA(var, Var) && (varRelid == 0 || varRelid == var->varno)))
- return false;
-
- /*
- * Only consider this variable if (varRelid == 0) or when the varno
- * matches varRelid (see explanation at clause_selectivity).
- *
- * FIXME I suspect this may not be really necessary. The (varRelid == 0)
- * part seems to be enforced by treat_as_join_clause().
- */
- if (! ((varRelid == 0) || (varRelid == var->varno)))
- return false;
-
- /* Also skip special varno values, and system attributes ... */
- if ((IS_SPECIAL_VARNO(var->varno)) || (! AttrNumberIsForUserDefinedAttr(var->varattno)))
- return false;
-
- /* Remember the relid of the base relation (we need to pass it out) */
- if (relid != NULL)
- *relid = var->varno;
-
- *attnums = bms_add_member(*attnums, var->varattno);
-
- return true;
- }
- else if (or_clause(clause) || and_clause(clause))
- {
- /*
- * AND/OR-clauses are supported if all sub-clauses are supported
- *
- * TODO We might support mixed case, where some of the clauses
- * are supported and some are not, and treat all supported
- * subclauses as a single clause, compute its selectivity
- * using mv stats, and compute the total selectivity using
- * the current algorithm.
- *
- * TODO For RestrictInfo above an OR-clause, we might use the
- * orclause with nested RestrictInfo - we won't have to
- * call pull_varnos() for each clause, saving time.
- */
- Bitmapset *tmp = NULL;
- ListCell *l;
- foreach (l, ((BoolExpr*)clause)->args)
- {
- if (! clause_is_mv_compatible(root, (Node*)lfirst(l),
- varRelid, relid, &tmp, sjinfo, types))
- return false;
- }
-
- /* add the attnums from the AND/OR-clause to the set of attnums */
- *attnums = bms_join(*attnums, tmp);
-
- return true;
- }
-
- return false;
-}
-
-
-static Bitmapset *
-clause_mv_get_attnums(PlannerInfo *root, Node *clause)
-{
- Bitmapset * attnums = NULL;
-
- /* Extract clause from restrict info, if needed. */
- if (IsA(clause, RestrictInfo))
- clause = (Node*)((RestrictInfo*)clause)->clause;
-
- /*
- * Only simple opclauses and IS NULL tests are compatible with
- * multivariate stats at this point.
- */
- if ((is_opclause(clause))
- && (list_length(((OpExpr *) clause)->args) == 2))
- {
- OpExpr *expr = (OpExpr *) clause;
-
- if (IsA(linitial(expr->args), Var))
- attnums = bms_add_member(attnums,
- ((Var*)linitial(expr->args))->varattno);
- else
- attnums = bms_add_member(attnums,
- ((Var*)lsecond(expr->args))->varattno);
- }
- else if (IsA(clause, NullTest)
- && IsA(((NullTest*)clause)->arg, Var))
- {
- attnums = bms_add_member(attnums,
- ((Var*)((NullTest*)clause)->arg)->varattno);
- }
- else if (or_clause(clause) || and_clause(clause))
- {
- ListCell *l;
- foreach (l, ((BoolExpr*)clause)->args)
- {
- attnums = bms_join(attnums,
- clause_mv_get_attnums(root, (Node*)lfirst(l)));
- }
- }
-
- return attnums;
-}
-
-/*
- * Performs reduction of clauses using functional dependencies, i.e.
- * removes clauses that are considered redundant. It simply walks
- * through dependencies, and checks whether the dependency 'matches'
- * the clauses, i.e. if there's a clause matching the condition. If yes,
- * all clauses matching the implied part of the dependency are removed
- * from the list.
- *
- * This simply looks at attnums referenced by the clauses, not at the
- * type of the operator (equality, inequality, ...). This may not be the
- * right way to do it - it certainly works best for equalities, which is
- * naturally consistent with functional dependencies (implications).
- * It's not clear that other operators are handled sensibly - for
- * example for inequalities, like
- *
- * WHERE (A >= 10) AND (B <= 20)
- *
- * and a trivial case where [A == B], resulting in a symmetric pair of
- * rules [A => B], [B => A], it's rather clear we can't remove either of
- * those clauses.
- *
- * That only highlights that functional dependencies are most suitable
- * for label-like data, where using non-equality operators is very rare.
- * Using the common city/zipcode example, clauses like
- *
- * (zipcode <= 12345)
- *
- * or
- *
- * (cityname >= 'Washington')
- *
- * are rare. So restricting the reduction to equality should not harm
- * the usefulness / applicability.
- *
- * The other assumption is that the clauses are 'compatible'. For
- * example, with a mismatched zip code and city name, this is unable
- * to identify the discrepancy and still eliminates one of the clauses. The
- * usual approach (multiplying both selectivities) thus produces a more
- * accurate estimate, although mostly by luck - the multiplication
- * comes from assumption of statistical independence of the two
- * conditions (which is not valid in this case), but moves the
- * estimate in the right direction (towards 0%).
- *
- * This might be somewhat improved by cross-checking the selectivities
- * against MCV and/or histogram.
- *
- * The implementation needs to be careful about cyclic rules, i.e. rules
- * like [A => B] and [B => A] at the same time. This must not reduce
- * clauses on both attributes at the same time.
- *
- * Technically we might consider selectivities here too, somehow. E.g.
- * when (A => B) and (B => A), we might use the clauses with minimum
- * selectivity.
- *
- * TODO Consider restricting the reduction to equality clauses. Or maybe
- * use equality classes somehow?
- *
- * TODO Merge this docs to dependencies.c, as it's saying mostly the
- * same things as the comments there.
- *
- * TODO Currently this is applied only to the top-level clauses, but
- * maybe we could apply it to lists at subtrees too, e.g. to the
- * two AND-clauses in
- *
- * (x=1 AND y=2) OR (z=3 AND q=10)
- *
- */
-static List *
-clauselist_apply_dependencies(PlannerInfo *root, List *clauses,
- Oid varRelid, List *stats,
- SpecialJoinInfo *sjinfo)
-{
- List *reduced_clauses = NIL;
- Index relid;
-
- /*
- * matrix of (natts x natts), 1 means x=>y
- *
- * This serves two purposes - first, it merges dependencies from all
- * the statistics, second it makes generating all the transitive
- * dependencies easier.
- *
- * We need to build this only for attributes from the dependencies,
- * not for all attributes in the table.
- *
- * We can't do that only for attributes from the clauses, because we
- * want to build transitive dependencies (including those going
- * through attributes not listed in the stats).
- *
- * This only works for A=>B dependencies, not sure how to do that
- * for complex dependencies.
- */
- bool *deps_matrix;
- int deps_natts; /* size of the matrix */
-
- /* mapping attnum <=> matrix index */
- int *deps_idx_to_attnum;
- int *deps_attnum_to_idx;
-
- /* attnums in dependencies and clauses (and intersection) */
- List *deps_clauses = NIL;
- Bitmapset *deps_attnums = NULL;
- Bitmapset *clause_attnums = NULL;
- Bitmapset *intersect_attnums = NULL;
-
- /*
- * Is there at least one statistics with functional dependencies?
- * If not, return the original clauses right away.
- *
- * XXX Isn't this pointless, thanks to exactly the same check in
- * clauselist_selectivity()? Can we trigger the condition here?
- */
- if (! has_stats(stats, MV_CLAUSE_TYPE_FDEP))
- return clauses;
-
- /*
- * Build the dependency matrix, i.e. attribute adjacency matrix,
- * where 1 means (a=>b). Once we have the adjacency matrix, we'll
- * multiply it by itself, to get transitive dependencies.
- *
- * Note: This is pretty much transitive closure from graph theory.
- *
- * First, let's see what attributes are covered by functional
- * dependencies (sides of the adjacency matrix), and also the maximum
- * attribute (the size of the mapping to simple integer indexes).
- */
- deps_attnums = fdeps_collect_attnums(stats);
-
- /*
- * Walk through the clauses - clauses that are (one of)
- *
- * (a) not mv-compatible
- * (b) are using more than a single attnum
- * (c) using an attnum not covered by functional dependencies
- *
- * may be copied directly to the result. The interesting clauses are
- * kept in 'deps_clauses' and will be processed later.
- */
- clause_attnums = fdeps_filter_clauses(root, clauses, deps_attnums,
- &reduced_clauses, &deps_clauses,
- varRelid, &relid, sjinfo);
-
- /*
- * we need at least two clauses referencing two different
- * attributes to do the reduction
- */
- if ((list_length(deps_clauses) < 2) || (bms_num_members(clause_attnums) < 2))
- {
- bms_free(clause_attnums);
- list_free(reduced_clauses);
- list_free(deps_clauses);
-
- return clauses;
- }
-
-
- /*
- * We need at least two matching attributes in the clauses and
- * dependencies, otherwise we can't really reduce anything.
- */
- intersect_attnums = bms_intersect(clause_attnums, deps_attnums);
- if (bms_num_members(intersect_attnums) < 2)
- {
- bms_free(clause_attnums);
- bms_free(deps_attnums);
- bms_free(intersect_attnums);
-
- list_free(deps_clauses);
- list_free(reduced_clauses);
-
- return clauses;
- }
-
- /*
- * Build mapping between matrix indexes and attnums, and then the
- * adjacency matrix itself.
- */
- deps_idx_to_attnum = make_idx_to_attnum_mapping(deps_attnums);
- deps_attnum_to_idx = make_attnum_to_idx_mapping(deps_attnums);
-
- /* build the adjacency matrix */
- deps_matrix = build_adjacency_matrix(stats, deps_attnums,
- deps_idx_to_attnum,
- deps_attnum_to_idx);
-
- deps_natts = bms_num_members(deps_attnums);
-
- /*
- * Multiply the matrix N-times (N = size of the matrix), so that we
- * get all the transitive dependencies. That makes the next step
- * much easier and faster.
- *
- * This is essentially an adjacency matrix from graph theory, and
- * by multiplying it we get transitive edges. We don't really care
- * about the exact number (number of paths between vertices) though,
- * so we can do the multiplication in-place (we don't care whether
- * we found the dependency in this round or in the previous one).
- *
- * Track how many new dependencies were added, and stop when 0, but
- * we can't multiply more than N-times (longest path in the graph).
- */
- multiply_adjacency_matrix(deps_matrix, deps_natts);
-
- /*
- * Walk through the clauses, and see which other clauses we may
- * reduce. The matrix contains all transitive dependencies, which
- * makes this very fast.
- *
- * We have to be careful not to reduce the clause using itself, or
- * reducing all clauses forming a cycle (so we have to skip already
- * eliminated clauses).
- *
- * I'm not sure whether this guarantees finding the best solution,
- * i.e. reducing the most clauses, but it probably does (thanks to
- * having all the transitive dependencies).
- */
- deps_clauses = fdeps_reduce_clauses(deps_clauses,
- deps_attnums, deps_matrix,
- deps_idx_to_attnum,
- deps_attnum_to_idx, relid);
-
- /* join the two lists of clauses */
- reduced_clauses = list_union(reduced_clauses, deps_clauses);
-
- pfree(deps_matrix);
- pfree(deps_idx_to_attnum);
- pfree(deps_attnum_to_idx);
-
- bms_free(deps_attnums);
- bms_free(clause_attnums);
- bms_free(intersect_attnums);
-
- return reduced_clauses;
-}
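
A standalone sketch of the reduction itself (hypothetical zipcode/city
attributes, not the patch code) - note how skipping already-reduced
clauses prevents a cyclic pair of dependencies from removing both
clauses:

#include <stdio.h>

#define NATTS 2

int
main(void)
{
    /* matrix[i][j] == 1 means attribute i determines attribute j */
    int     matrix[NATTS][NATTS] = {
        {0, 1},     /* zipcode => city */
        {0, 0},
    };
    int     clause_attnum[] = {0, 1};   /* WHERE zipcode = ... AND city = ... */
    int     reduced[] = {0, 0};
    int     i, j;

    for (i = 0; i < 2; i++)
        for (j = 0; j < 2; j++)
        {
            if (i == j || reduced[j])
                continue;       /* never reduce using an eliminated clause */
            /* drop clause i if some remaining clause j implies it */
            if (matrix[clause_attnum[j]][clause_attnum[i]])
                reduced[i] = 1;
        }

    for (i = 0; i < 2; i++)
        printf("clause on attribute %d: %s\n", clause_attnum[i],
               reduced[i] ? "reduced" : "kept");
    return 0;
}

Here the clause on "city" is dropped as implied by the clause on
"zipcode"; with both [0 => 1] and [1 => 0] set, only one of the two
clauses would be removed.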
-
-static bool
-has_stats(List *stats, int type)
-{
- ListCell *s;
-
- foreach (s, stats)
- {
- MVStatisticInfo *stat = (MVStatisticInfo *)lfirst(s);
-
- if ((type & MV_CLAUSE_TYPE_FDEP) && stat->deps_built)
- return true;
-
- if ((type & MV_CLAUSE_TYPE_MCV) && stat->mcv_built)
- return true;
-
- if ((type & MV_CLAUSE_TYPE_HIST) && stat->hist_built)
- return true;
- }
-
- return false;
-}
-
-/*
- * Determine the relid (either from varRelid or from the clauses) and
- * then look up stats using that relid.
- */
-static List *
-find_stats(PlannerInfo *root, List *clauses, Oid varRelid, Index *relid)
-{
- /* unknown relid by default */
- *relid = InvalidOid;
-
- /*
- * First we need to find the relid (index into simple_rel_array).
- * If varRelid is not 0, we already have it, otherwise we have to
- * look it up from the clauses.
- */
- if (varRelid != 0)
- *relid = varRelid;
- else
- {
- Relids relids = pull_varnos((Node*)clauses);
-
- /*
- * We only expect 0 or 1 members in the bitmapset. If there are
- * no vars, we'll get an empty bitmapset, otherwise we'll get the
- * relid as the single member.
- *
- * FIXME For some reason we can get 2 relids here (e.g. \d in
- * psql does that).
- */
- if (bms_num_members(relids) == 1)
- *relid = bms_singleton_member(relids);
-
- bms_free(relids);
- }
-
- /*
- * if we found the relid, we can get the stats from simple_rel_array
- *
- * This only gets stats that are already built, because that's how
- * we load it into RelOptInfo (see get_relation_info), but we don't
- * detoast the whole stats yet. That'll be done later, after we
- * decide which stats to use.
- */
- if (*relid != InvalidOid)
- return root->simple_rel_array[*relid]->mvstatlist;
-
- return NIL;
-}
-
-static Bitmapset*
-fdeps_collect_attnums(List *stats)
-{
- ListCell *lc;
- Bitmapset *attnums = NULL;
-
- foreach (lc, stats)
- {
- int j;
- MVStatisticInfo *info = (MVStatisticInfo *)lfirst(lc);
-
- int2vector *stakeys = info->stakeys;
-
- /* skip stats without functional dependencies built */
- if (! info->deps_built)
- continue;
-
- for (j = 0; j < stakeys->dim1; j++)
- attnums = bms_add_member(attnums, stakeys->values[j]);
- }
-
- return attnums;
-}
-
-
-static int*
-make_idx_to_attnum_mapping(Bitmapset *attnums)
-{
- int attidx = 0;
- int attnum = -1;
-
- int *mapping = (int*)palloc0(bms_num_members(attnums) * sizeof(int));
-
- while ((attnum = bms_next_member(attnums, attnum)) >= 0)
- mapping[attidx++] = attnum;
-
- Assert(attidx == bms_num_members(attnums));
-
- return mapping;
-}
-
-static int*
-make_attnum_to_idx_mapping(Bitmapset *attnums)
-{
- int attidx = 0;
- int attnum = -1;
- int maxattnum = -1;
- int *mapping;
-
- while ((attnum = bms_next_member(attnums, attnum)) >= 0)
- maxattnum = attnum;
-
- mapping = (int*)palloc0((maxattnum+1) * sizeof(int));
-
- attnum = -1;
- while ((attnum = bms_next_member(attnums, attnum)) >= 0)
- mapping[attnum] = attidx++;
-
- Assert(attidx == bms_num_members(attnums));
-
- return mapping;
-}
-
-static bool*
-build_adjacency_matrix(List *stats, Bitmapset *attnums,
- int *idx_to_attnum, int *attnum_to_idx)
-{
- ListCell *lc;
- int natts = bms_num_members(attnums);
- bool *matrix = (bool*)palloc0(natts * natts * sizeof(bool));
-
- foreach (lc, stats)
- {
- int j;
- MVStatisticInfo *stat = (MVStatisticInfo *)lfirst(lc);
- MVDependencies dependencies = NULL;
-
- /* skip stats without functional dependencies built */
- if (! stat->deps_built)
- continue;
-
- /* fetch and deserialize dependencies */
- dependencies = load_mv_dependencies(stat->mvoid);
- if (dependencies == NULL)
- {
- elog(WARNING, "failed to deserialize func deps %d", stat->mvoid);
- continue;
- }
-
- /* set matrix[a,b] to 'true' if 'a=>b' */
- for (j = 0; j < dependencies->ndeps; j++)
- {
- int aidx = attnum_to_idx[dependencies->deps[j]->a];
- int bidx = attnum_to_idx[dependencies->deps[j]->b];
-
- /* a => b */
- matrix[aidx * natts + bidx] = true;
- }
- }
-
- return matrix;
-}
-
-static void
-multiply_adjacency_matrix(bool *matrix, int natts)
-{
- int i;
+ else if (IsA(clause, CurrentOfExpr))
+ {
+ /* CURRENT OF selects at most one row of its table */
+ CurrentOfExpr *cexpr = (CurrentOfExpr *) clause;
+ RelOptInfo *crel = find_base_rel(root, cexpr->cvarno);
- for (i = 0; i < natts; i++)
+ if (crel->tuples > 0)
+ s1 = 1.0 / crel->tuples;
+ }
+ else if (IsA(clause, RelabelType))
+ {
+ /* Not sure this case is needed, but it can't hurt */
+ s1 = clause_selectivity(root,
+ (Node *) ((RelabelType *) clause)->arg,
+ varRelid,
+ jointype,
+ sjinfo);
+ }
+ else if (IsA(clause, CoerceToDomain))
{
- int k, l, m;
- int nchanges = 0;
+ /* Not sure this case is needed, but it can't hurt */
+ s1 = clause_selectivity(root,
+ (Node *) ((CoerceToDomain *) clause)->arg,
+ varRelid,
+ jointype,
+ sjinfo);
+ }
- /* k => l */
- for (k = 0; k < natts; k++)
- {
- for (l = 0; l < natts; l++)
- {
- /* we already have this dependency */
- if (matrix[k * natts + l])
- continue;
+ /* Cache the result if possible */
+ if (cacheable)
+ {
+ if (jointype == JOIN_INNER)
+ rinfo->norm_selec = s1;
+ else
+ rinfo->outer_selec = s1;
+ }
- /* we don't really care about the exact value, just 0/1 */
- for (m = 0; m < natts; m++)
- {
- if (matrix[k * natts + m] * matrix[m * natts + l])
- {
- matrix[k * natts + l] = true;
- nchanges += 1;
- break;
- }
- }
- }
- }
+#ifdef SELECTIVITY_DEBUG
+ elog(DEBUG4, "clause_selectivity: s1 %f", s1);
+#endif /* SELECTIVITY_DEBUG */
- /* no transitive dependency added here, so terminate */
- if (nchanges == 0)
- break;
- }
+ return s1;
}
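
For reference, the transitive-closure idea from the removed
multiply_adjacency_matrix() in standalone form (not the patch code):
repeated boolean 'multiplication' of the adjacency matrix until no new
edge appears. With (a => b) and (b => c) it derives (a => c):

#include <stdio.h>

#define NATTS 3

int
main(void)
{
    /* matrix[i][j] == 1 means attribute i determines attribute j */
    int     matrix[NATTS][NATTS] = {
        {0, 1, 0},      /* a => b */
        {0, 0, 1},      /* b => c */
        {0, 0, 0},
    };
    int     i, j, k, nchanges = 1;

    while (nchanges)
    {
        nchanges = 0;
        for (i = 0; i < NATTS; i++)
            for (j = 0; j < NATTS; j++)
            {
                if (matrix[i][j])
                    continue;   /* we already have this dependency */
                for (k = 0; k < NATTS; k++)
                    if (matrix[i][k] && matrix[k][j])
                    {
                        matrix[i][j] = 1;   /* new transitive dependency */
                        nchanges++;
                        break;
                    }
            }
    }

    for (i = 0; i < NATTS; i++)
        for (j = 0; j < NATTS; j++)
            if (matrix[i][j])
                printf("%c => %c\n", 'a' + i, 'a' + j);
    return 0;
}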
+
static List*
fdeps_reduce_clauses(List *clauses, Bitmapset *attnums, bool *matrix,
int *idx_to_attnum, int *attnum_to_idx, Index relid)
@@ -3447,55 +1210,6 @@ fdeps_reduce_clauses(List *clauses, Bitmapset *attnums, bool *matrix,
}
-static Bitmapset *
-fdeps_filter_clauses(PlannerInfo *root,
- List *clauses, Bitmapset *deps_attnums,
- List **reduced_clauses, List **deps_clauses,
- Oid varRelid, Index *relid, SpecialJoinInfo *sjinfo)
-{
- ListCell *lc;
- Bitmapset *clause_attnums = NULL;
-
- foreach (lc, clauses)
- {
- Bitmapset *attnums = NULL;
- Node *clause = (Node *) lfirst(lc);
-
- if (! clause_is_mv_compatible(root, clause, varRelid, relid, &attnums,
- sjinfo, MV_CLAUSE_TYPE_FDEP))
-
- /* clause incompatible with functional dependencies */
- *reduced_clauses = lappend(*reduced_clauses, clause);
-
- else if (bms_num_members(attnums) > 1)
-
- /*
- * clause referencing multiple attributes (strange - shouldn't
- * this be handled by clause_is_mv_compatible directly?)
- */
- *reduced_clauses = lappend(*reduced_clauses, clause);
-
- else if (! bms_is_member(bms_singleton_member(attnums), deps_attnums))
-
- /* clause not covered by the dependencies */
- *reduced_clauses = lappend(*reduced_clauses, clause);
-
- else
- {
- /* ok, clause compatible with existing dependencies */
- Assert(bms_num_members(attnums) == 1);
-
- *deps_clauses = lappend(*deps_clauses, clause);
- clause_attnums = bms_add_member(clause_attnums,
- bms_singleton_member(attnums));
- }
-
- bms_free(attnums);
- }
-
- return clause_attnums;
-}
-
/*
* Pull varattnos from the clauses, similarly to pull_varattnos() but:
*
@@ -3529,162 +1243,6 @@ get_varattnos(Node * node, Index relid)
return result;
}
-/*
- * Estimate selectivity of clauses using a MCV list.
- *
- * If there's no MCV list for the stats, the function returns 0.0.
- *
- * While computing the estimate, the function checks whether all the
- * columns were matched with an equality condition. If that's the case,
- * we can skip processing the histogram, as there can be no rows in
- * it with the same values - all the rows matching the condition are
- * represented by the MCV item. This can only happen with equality
- * on all the attributes.
- *
- * The algorithm works like this:
- *
- * 1) mark all items as 'match'
- * 2) walk through all the clauses
- * 3) for a particular clause, walk through all the items
- * 4) skip items that are already 'no match'
- * 5) check clause for items that still match
- * 6) sum frequencies for items to get selectivity
- *
- * The function also returns the frequency of the least frequent item
- * on the MCV list, which may be useful for clamping estimate from the
- * histogram (all items not present in the MCV list are less frequent).
- * This however seems useful only for cases with conditions on all
- * attributes.
- *
- * TODO This only handles AND-ed clauses, but it might work for OR-ed
- * lists too - it just needs to reverse the logic a bit. I.e. start
- * with 'no match' for all items, and mark the items as a match
- * as the clauses are processed (and skip items that are 'match').
- */
-static Selectivity
-clauselist_mv_selectivity_mcvlist(PlannerInfo *root, MVStatisticInfo *mvstats,
- List *clauses, List *conditions, bool is_or,
- bool *fullmatch, Selectivity *lowsel)
-{
- int i;
- Selectivity s = 0.0;
- Selectivity t = 0.0;
- Selectivity u = 0.0;
-
- MCVList mcvlist = NULL;
-
- int nmatches = 0;
- int nconditions = 0;
-
- /* match/mismatch bitmap for each MCV item */
- char * matches = NULL;
- char * condition_matches = NULL;
-
- Assert(clauses != NIL);
- Assert(list_length(clauses) >= 1);
-
- /* there's no MCV list built yet */
- if (! mvstats->mcv_built)
- return 0.0;
-
- mcvlist = load_mv_mcvlist(mvstats->mvoid);
-
- Assert(mcvlist != NULL);
- Assert(mcvlist->nitems > 0);
-
- /* number of matching MCV items */
- nmatches = mcvlist->nitems;
- nconditions = mcvlist->nitems;
-
- /*
- * Bitmap of bucket matches (mismatch, partial, full).
- *
- * For AND clauses all buckets match (and we'll eliminate them).
- * For OR clauses no buckets match (and we'll add them).
- *
- * We only need to do the memset for AND clauses (for OR clauses
- * it's already set correctly by the palloc0).
- */
- matches = palloc0(sizeof(char) * nmatches);
-
- if (! is_or) /* AND-clause */
- memset(matches, MVSTATS_MATCH_FULL, sizeof(char)*nmatches);
-
- /* Conditions are treated as AND clause, so match by default. */
- condition_matches = palloc0(sizeof(char) * nconditions);
- memset(condition_matches, MVSTATS_MATCH_FULL, sizeof(char)*nconditions);
-
- /*
- * build the match bitmap for the conditions (conditions are always
- * connected by AND)
- */
- if (conditions != NIL)
- nconditions = update_match_bitmap_mcvlist(root, conditions,
- mvstats->stakeys, mcvlist,
- nconditions, condition_matches,
- lowsel, fullmatch, false);
-
- /*
- * build the match bitmap for the estimated clauses
- *
- * TODO This evaluates the clauses for all MCV items, even those
- * ruled out by the conditions. The final result should be the
- * same, but it might be faster.
- */
- nmatches = update_match_bitmap_mcvlist(root, clauses,
- mvstats->stakeys, mcvlist,
- ((is_or) ? 0 : nmatches), matches,
- lowsel, fullmatch, is_or);
-
- /* sum frequencies for all the matching MCV items */
- for (i = 0; i < mcvlist->nitems; i++)
- {
- /*
- * Find out what part of the data is covered by the MCV list,
- * so that we can 'scale' the selectivity properly (e.g. when
- * only 50% of the sample items got into the MCV, and the rest
- * is either in a histogram, or not covered by stats).
- *
- * TODO This might be handled by keeping a global "frequency"
- * for the whole list, which might save us a bit of time
- * spent on accessing the not-matching part of the MCV list.
- * Although it's likely in a cache, so it's very fast.
- */
- u += mcvlist->items[i]->frequency;
-
- /* skit MCV items not matching the conditions */
- if (condition_matches[i] == MVSTATS_MATCH_NONE)
- continue;
-
- if (matches[i] != MVSTATS_MATCH_NONE)
- s += mcvlist->items[i]->frequency;
-
- t += mcvlist->items[i]->frequency;
- }
-
- pfree(matches);
- pfree(condition_matches);
- pfree(mcvlist);
-
- /* no condition matches */
- if (t == 0.0)
- return (Selectivity)0.0;
-
- return (s / t) * u;
-}
-
-/*
- * Evaluate clauses using the MCV list, and update the match bitmap.
- *
- * The bitmap may be already partially set, so this is really a way to
- * combine results of several clause lists - either when computing
- * conditional probability P(A|B) or a combination of AND/OR clauses.
- *
- * TODO This works with 'bitmap' where each bit is represented as a char,
- * which is slightly wasteful. Instead, we could use a regular
- * bitmap, reducing the size to ~1/8. Another thing is merging the
- * bitmaps using & and |, which might be faster than min/max.
- */
static int
update_match_bitmap_mcvlist(PlannerInfo *root, List *clauses,
int2vector *stakeys, MCVList mcvlist,
@@ -3972,216 +1530,59 @@ update_match_bitmap_mcvlist(PlannerInfo *root, List *clauses,
/* match/mismatch bitmap for each MCV item */
int tmp_nmatches = 0;
- char * tmp_matches = NULL;
-
- Assert(tmp_clauses != NIL);
- Assert(list_length(tmp_clauses) >= 2);
-
- /* number of matching MCV items */
- tmp_nmatches = (or_clause(clause)) ? 0 : mcvlist->nitems;
-
- /* by default none of the MCV items matches the clauses */
- tmp_matches = palloc0(sizeof(char) * mcvlist->nitems);
-
- /* AND clauses assume everything matches, initially */
- if (! or_clause(clause))
- memset(tmp_matches, MVSTATS_MATCH_FULL, sizeof(char)*mcvlist->nitems);
-
- /* build the match bitmap for the OR-clauses */
- tmp_nmatches = update_match_bitmap_mcvlist(root, tmp_clauses,
- stakeys, mcvlist,
- tmp_nmatches, tmp_matches,
- lowsel, fullmatch, or_clause(clause));
-
- /* merge the bitmap into the existing one*/
- for (i = 0; i < mcvlist->nitems; i++)
- {
- /*
- * To AND-merge the bitmaps, a MIN() semantics is used.
- * For OR-merge, use MAX().
- *
- * FIXME this does not decrease the number of matches
- */
- UPDATE_RESULT(matches[i], tmp_matches[i], is_or);
- }
- pfree(tmp_matches);
-
- }
- else
- elog(ERROR, "unknown clause type: %d", clause->type);
- }
-
- /*
- * If all the columns were matched by equality, it's a full match.
- * In this case there can be just a single MCV item, matching the
- * clause (if there were two, both would match the other one).
- */
- *fullmatch = (bms_num_members(eqmatches) == mcvlist->ndimensions);
-
- /* free the allocated pieces */
- if (eqmatches)
- pfree(eqmatches);
-
- return nmatches;
-}
-
-/*
- * Estimate selectivity of clauses using a histogram.
- *
- * If there's no histogram for the stats, the function returns 0.0.
- *
- * The general idea of this method is similar to how MCV lists are
- * processed, except that this introduces the concept of a partial
- * match (MCV only works with full match / mismatch).
- *
- * The algorithm works like this:
- *
- * 1) mark all buckets as 'full match'
- * 2) walk through all the clauses
- * 3) for a particular clause, walk through all the buckets
- * 4) skip buckets that are already 'no match'
- * 5) check clause for buckets that still match (at least partially)
- * 6) sum frequencies for buckets to get selectivity
- *
- * Unlike MCV lists, histograms have a concept of a partial match. In
- * that case we use 1/2 the bucket, to minimize the average error. The
- * MV histograms are usually less detailed than the per-column ones,
- * meaning the sum is often quite high (thanks to combining a lot of
- * "partially hit" buckets).
- *
- * Maybe we could use per-bucket information with number of distinct
- * values it contains (for each dimension), and then use that to correct
- * the estimate (so with 10 distinct values, we'd use 1/10 of the bucket
- * frequency). We might also scale the value depending on the actual
- * ndistinct estimate (not just the values observed in the sample).
- *
- * Another option would be to multiply the selectivities, i.e. if we get
- * 'partial match' for a bucket for multiple conditions, we might use
- * 0.5^k (where k is the number of conditions), instead of 0.5. This
- * probably does not minimize the average error, though.
- *
- * TODO This might use a similar shortcut to MCV lists - count buckets
- * marked as partial/full match, and terminate once this drop to 0.
- * Not sure if it's really worth it - for MCV lists a situation like
- * this is not uncommon, but for histograms it's not that clear.
- */
-static Selectivity
-clauselist_mv_selectivity_histogram(PlannerInfo *root, MVStatisticInfo *mvstats,
- List *clauses, List *conditions, bool is_or)
-{
- int i;
- Selectivity s = 0.0;
- Selectivity t = 0.0;
- Selectivity u = 0.0;
-
- int nmatches = 0;
- int nconditions = 0;
- char *matches = NULL;
- char *condition_matches = NULL;
-
- MVSerializedHistogram mvhist = NULL;
-
- /* there's no histogram */
- if (! mvstats->hist_built)
- return 0.0;
-
- /* There may be no histogram in the stats (check hist_built flag) */
- mvhist = load_mv_histogram(mvstats->mvoid);
-
- Assert (mvhist != NULL);
- Assert (clauses != NIL);
- Assert (list_length(clauses) >= 1);
-
- nmatches = mvhist->nbuckets;
- nconditions = mvhist->nbuckets;
-
- /*
- * Bitmap of bucket matches (mismatch, partial, full).
- *
- * For AND clauses all buckets match (and we'll eliminate them).
- * For OR clauses no buckets match (and we'll add them).
- *
- * We only need to do the memset for AND clauses (for OR clauses
- * it's already set correctly by the palloc0).
- */
- matches = palloc0(sizeof(char) * nmatches);
-
- if (! is_or) /* AND-clause */
- memset(matches, MVSTATS_MATCH_FULL, sizeof(char)*nmatches);
-
- /* Conditions are treated as AND clause, so match by default. */
- condition_matches = palloc0(sizeof(char)*nconditions);
- memset(condition_matches, MVSTATS_MATCH_FULL, sizeof(char)*nconditions);
+ char * tmp_matches = NULL;
- /*
- * build the match bitmap for the conditions (conditions are always
- * connected by AND)
- */
- if (conditions != NIL)
- update_match_bitmap_histogram(root, conditions,
- mvstats->stakeys, mvhist,
- nconditions, condition_matches, false);
+ Assert(tmp_clauses != NIL);
+ Assert(list_length(tmp_clauses) >= 2);
- /*
- * build the match bitmap for the estimated clauses
- *
- * TODO This evaluates the clauses for all buckets, even those
- * ruled out by the conditions. The final result should be
- * the same, but it might be faster.
- */
- update_match_bitmap_histogram(root, clauses,
- mvstats->stakeys, mvhist,
- ((is_or) ? 0 : nmatches), matches,
- is_or);
+ /* number of matching MCV items */
+ tmp_nmatches = (or_clause(clause)) ? 0 : mcvlist->nitems;
- /* now, walk through the buckets and sum the selectivities */
- for (i = 0; i < mvhist->nbuckets; i++)
- {
- float coeff = 1.0;
+ /* by default none of the MCV items matches the clauses */
+ tmp_matches = palloc0(sizeof(char) * mcvlist->nitems);
- /*
- * Find out what part of the data is covered by the histogram,
- * so that we can 'scale' the selectivity properly (e.g. when
- * only 50% of the sample got into the histogram, and the rest
- * is in a MCV list).
- *
- * TODO This might be handled by keeping a global "frequency"
- * for the whole histogram, which might save us some time
- * spent accessing the not-matching part of the histogram.
- * Although it's likely in a cache, so it's very fast.
- */
- u += mvhist->buckets[i]->ntuples;
+ /* AND clauses assume everything matches, initially */
+ if (! or_clause(clause))
+ memset(tmp_matches, MVSTATS_MATCH_FULL, sizeof(char)*mcvlist->nitems);
- /* skip buckets not matching the conditions */
- if (condition_matches[i] == MVSTATS_MATCH_NONE)
- continue;
- else if (condition_matches[i] == MVSTATS_MATCH_PARTIAL)
- coeff = 0.5;
+ /* build the match bitmap for the OR-clauses */
+ tmp_nmatches = update_match_bitmap_mcvlist(root, tmp_clauses,
+ stakeys, mcvlist,
+ tmp_nmatches, tmp_matches,
+ lowsel, fullmatch, or_clause(clause));
- t += coeff * mvhist->buckets[i]->ntuples;
+ /* merge the bitmap into the existing one */
+ for (i = 0; i < mcvlist->nitems; i++)
+ {
+ /*
+ * To AND-merge the bitmaps, a MIN() semantics is used.
+ * For OR-merge, use MAX().
+ *
+ * FIXME this does not decrease the number of matches
+ */
+ UPDATE_RESULT(matches[i], tmp_matches[i], is_or);
+ }
- if (matches[i] == MVSTATS_MATCH_FULL)
- s += coeff * mvhist->buckets[i]->ntuples;
- else if (matches[i] == MVSTATS_MATCH_PARTIAL)
- /*
- * TODO If both conditions and clauses match partially, this
- * will use 0.25 match - not sure if that's the right
- * thing solution, but seems about right.
- */
- s += coeff * 0.5 * mvhist->buckets[i]->ntuples;
+ pfree(tmp_matches);
+
+ }
+ else
+ elog(ERROR, "unknown clause type: %d", clause->type);
}
- /* release the allocated bitmap and deserialized histogram */
- pfree(matches);
- pfree(condition_matches);
- pfree(mvhist);
+ /*
+ * If all the columns were matched by equality, it's a full match.
+ * In this case there can be just a single MCV item, matching the
+ * clause (if there were two, both would match the other one).
+ */
+ *fullmatch = (bms_num_members(eqmatches) == mcvlist->ndimensions);
- /* no condition matches */
- if (t == 0.0)
- return (Selectivity)0.0;
+ /* free the allocated pieces */
+ if (eqmatches)
+ pfree(eqmatches);
- return (s / t) * u;
+ return nmatches;
}
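
For illustration, the AND/OR merge done by UPDATE_RESULT above can be
summarized in a minimal sketch. The constant values and the helper below
are assumptions for illustration only (NONE must be 0 because the bitmaps
come from palloc0; the PARTIAL/FULL values and the helper are hypothetical):

/* assumed ordering of match states: NONE < PARTIAL < FULL */
#define MVSTATS_MATCH_NONE    0
#define MVSTATS_MATCH_PARTIAL 1
#define MVSTATS_MATCH_FULL    2

/* hypothetical helper mirroring UPDATE_RESULT's semantics */
static inline char
merge_match(char a, char b, bool is_or)
{
    /* AND-merge keeps the weaker state (MIN), OR-merge the stronger (MAX) */
    return is_or ? Max(a, b) : Min(a, b);
}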
/*
@@ -4691,362 +2092,479 @@ update_match_bitmap_histogram(PlannerInfo *root, List *clauses,
return nmatches;
}
-/*
- * Walk through clauses and keep only those covered by at least
- * one of the statistics.
- */
-static List *
-filter_clauses(PlannerInfo *root, Oid varRelid, SpecialJoinInfo *sjinfo,
- int type, List *stats, List *clauses, Bitmapset **attnums)
+static Node *
+stripRestrictStatData(List *clauses, BoolExprType boolop, Bitmapset **attrs)
{
- ListCell *c;
- ListCell *s;
-
- /* results (list of compatible clauses, attnums) */
- List *rclauses = NIL;
+ Expr *newexpr;
+ ListCell *lc;
- foreach (c, clauses)
+ if (attrs) *attrs = NULL;
+
+ if (list_length(clauses) == 0)
+ newexpr = NULL;
+ else if (list_length(clauses) == 1)
{
- Node *clause = (Node*)lfirst(c);
- Bitmapset *clause_attnums = NULL;
- Index relid;
+ RestrictStatData *rsd = (RestrictStatData *) linitial(clauses);
+ Assert(IsA(rsd, RestrictStatData));
- /*
- * The clause has to be mv-compatible (suitable operators etc.).
- */
- if (! clause_is_mv_compatible(root, clause, varRelid,
- &relid, &clause_attnums, sjinfo, type))
- elog(ERROR, "should not get non-mv-compatible cluase");
+ newexpr = (Expr*)(rsd->clause);
+ if (attrs) *attrs = rsd->mvattrs;
+ }
+ else
+ {
+ BoolExpr *newboolexpr;
+ newboolexpr = makeNode(BoolExpr);
+ newboolexpr->boolop = boolop;
- /* is there a statistics covering this clause? */
- foreach (s, stats)
+ foreach (lc, clauses)
{
- int k, matches = 0;
- MVStatisticInfo *stat = (MVStatisticInfo *)lfirst(s);
-
- for (k = 0; k < stat->stakeys->dim1; k++)
- {
- if (bms_is_member(stat->stakeys->values[k],
- clause_attnums))
- matches += 1;
- }
-
- /*
- * The clause is compatible if all attributes it references
- * are covered by the statistics.
- */
- if (bms_num_members(clause_attnums) == matches)
- {
- *attnums = bms_union(*attnums, clause_attnums);
- rclauses = lappend(rclauses, clause);
- break;
- }
+ RestrictStatData *rsd = (RestrictStatData *) lfirst(lc);
+ Assert(IsA(rsd, RestrictStatData));
+ newboolexpr->args =
+ lappend(newboolexpr->args, rsd->clause);
+ if (attrs)
+ *attrs = bms_add_members(*attrs, rsd->mvattrs);
}
-
- bms_free(clause_attnums);
+ newexpr = (Expr*) newboolexpr;
}
- /* we can't have more compatible conditions than source conditions */
- Assert(list_length(clauses) >= list_length(rclauses));
-
- return rclauses;
+ return (Node*)newexpr;
}
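
As a usage sketch (hypothetical variables: rsd_a and rsd_b are assumed to
be RestrictStatData nodes wrapping the clauses (a = 1) and (b = 2)):

Bitmapset *attrs;
Node *expr = stripRestrictStatData(list_make2(rsd_a, rsd_b),
                                   AND_EXPR, &attrs);
/* expr is now the BoolExpr (a = 1 AND b = 2), and attrs is the
 * union of rsd_a->mvattrs and rsd_b->mvattrs */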
-
-/*
- * Walk through statistics and only keep those covering at least
- * one new attribute (excluding conditions) and at two attributes
- * in both clauses and conditions.
- *
- * This check might be made more strict by checking against individual
- * clauses, because by using the bitmapsets of all attnums we may
- * actually use attnums from clauses that are not covered by the
- * statistics. For example, we may have a condition
- *
- * (a=1 AND b=2)
- *
- * and a new clause
- *
- * (c=1 AND d=1)
- *
- * With only bitmapsets, statistics on [b,c] will pass through this
- * (assuming there are some statistics covering both clases).
- *
- * TODO Do the more strict check.
- */
-static List *
-filter_stats(List *stats, Bitmapset *new_attnums, Bitmapset *all_attnums)
+RestrictStatData *
+transformRestrictInfoForEstimate(PlannerInfo *root, List *clauses,
+ int relid, SpecialJoinInfo *sjinfo)
{
- ListCell *s;
- List *stats_filtered = NIL;
+ static int level = 0;
+ int i = -1;
+ char head[100];
+ RestrictStatData *rdata = makeNode(RestrictStatData);
+ Node *clause;
- foreach (s, stats)
+ memset(head, '.', 100);
+ head[level] = 0;
+
+ if (list_length(clauses) == 1 &&
+ !IsA((Node*)linitial(clauses), RestrictInfo))
{
- int k;
- int matches_new = 0,
- matches_all = 0;
-
- MVStatisticInfo *stat = (MVStatisticInfo *)lfirst(s);
-
- /* see how many attributes the statistics covers */
- for (k = 0; k < stat->stakeys->dim1; k++)
- {
- /* attributes from new clauses */
- if (bms_is_member(stat->stakeys->values[k], new_attnums))
- matches_new += 1;
-
- /* attributes from onditions */
- if (bms_is_member(stat->stakeys->values[k], all_attnums))
- matches_all += 1;
- }
-
- /* check we have enough attributes for this statistics */
- if ((matches_new >= 1) && (matches_all >= 2))
- stats_filtered = lappend(stats_filtered, stat);
+ Assert(relid > 0);
+ clause = (Node*)linitial(clauses);
}
+ else
+ {
+ /* This is the top-level clause list. Convert it to an AND expression. */
+ ListCell *lc;
+ Index clauserelid = 0;
+ Relids relids = pull_varnos((Node*)clauses);
- /* we can't have more useful stats than we had originally */
- Assert(list_length(stats) >= list_length(stats_filtered));
-
- return stats_filtered;
-}
+ if (bms_num_members(relids) != 1)
+ return NULL;
-static MVStatisticInfo *
-make_stats_array(List *stats, int *nmvstats)
-{
- int i;
- ListCell *l;
+ clauserelid = bms_singleton_member(relids);
+ if (relid != 0 && relid != clauserelid)
+ return NULL;
- MVStatisticInfo *mvstats = NULL;
- *nmvstats = list_length(stats);
+ relid = clauserelid;
- mvstats
- = (MVStatisticInfo*)palloc0((*nmvstats) * sizeof(MVStatisticInfo));
+ if (list_length(clauses) == 1)
+ {
+ /*
+ * If the clause list has only one element, it should be a
+ * top-level RestrictInfo.
+ */
+ RestrictInfo *rinfo = (RestrictInfo *) linitial(clauses);
+ Assert(IsA(rinfo, RestrictInfo));
+
+ /* The only RestrictInfo is a join clause. Bail out. */
+ if (rinfo->pseudoconstant ||
+ treat_as_join_clause((Node*)rinfo->clause,
+ rinfo, 0, sjinfo))
+ return NULL;
+
+ clause = (Node*) rinfo->clause;
+ }
+ else
+ {
+ BoolExpr *andexpr = makeNode(BoolExpr);
+ andexpr->boolop = AND_EXPR;
+ foreach (lc, clauses)
+ {
+ RestrictInfo *rinfo = (RestrictInfo *) lfirst(lc);
+
+ Assert(IsA(rinfo, RestrictInfo));
+
+ /* stash clauses unrelated to multivariate statistics */
+ if (rinfo->pseudoconstant ||
+ treat_as_join_clause((Node*)rinfo->clause,
+ rinfo, 0, sjinfo))
+ rdata->unusedrinfos = lappend(rdata->unusedrinfos,
+ rinfo);
+ else
+ andexpr->args = lappend(andexpr->args, rinfo->clause);
+ }
+ clause = (Node*)andexpr;
+ }
- i = 0;
- foreach (l, stats)
- {
- MVStatisticInfo *stat = (MVStatisticInfo *)lfirst(l);
- memcpy(&mvstats[i++], stat, sizeof(MVStatisticInfo));
}
- return mvstats;
-}
-
-static Bitmapset **
-make_stats_attnums(MVStatisticInfo *mvstats, int nmvstats)
-{
- int i, j;
- Bitmapset **stats_attnums = NULL;
+ Assert(!IsA(clause, RestrictInfo));
- Assert(nmvstats > 0);
+ rdata->clause = clause;
+ rdata->boolop = AND_EXPR;
- /* build bitmaps of attnums for the stats (easier to compare) */
- stats_attnums = (Bitmapset **)palloc0(nmvstats * sizeof(Bitmapset*));
+ if (and_clause(clause) || or_clause(clause))
+ {
+ BoolExpr *boolexpr = (BoolExpr *)clause;
+ ListCell *lc;
+ List *mvclauses = NIL;
+ List *nonmvclauses = NIL;
+ List *partialclauses = NIL;
+ Bitmapset *resultattrs = NULL;
+ List *resultstats = NIL;
- for (i = 0; i < nmvstats; i++)
- for (j = 0; j < mvstats[i].stakeys->dim1; j++)
- stats_attnums[i]
- = bms_add_member(stats_attnums[i],
- mvstats[i].stakeys->values[j]);
+ rdata->boolop = boolexpr->boolop;
+ ereport(DEBUG1,
+ (errmsg ("%s%s[%d][%d](%d)",
+ head,
+ and_clause(clause)?"AND":
+ (or_clause(clause)?"OR":"NOT"),
+ level, i, list_length(boolexpr->args)),
+ errhidestmt(level)));
- return stats_attnums;
-}
+ /* Recursively process the subexpressions */
+ level++;
+ foreach (lc, (boolexpr->args))
+ {
+ Node *nd = (Node*) lfirst(lc);
+ RestrictStatData *tmpsd;
+ tmpsd = transformRestrictInfoForEstimate(root,
+ list_make1(nd),
+ relid, sjinfo);
+ /*
+ * mvclauses holds the child RestrictStatData nodes that can
+ * potentially be pulled up into this node's mvclause, which is
+ * to be estimated using multivariate statistics.
+ *
+ * partialclauses holds the child RestrictStatData nodes that
+ * cannot be pulled up.
+ *
+ * nonmvclauses holds the child RestrictStatData nodes to be
+ * pulled up into the clause estimated in the normal way.
+ */
+ if (tmpsd->mvattrs)
+ mvclauses = lappend(mvclauses, tmpsd);
+ else if (tmpsd->mvclause)
+ partialclauses = lappend(partialclauses, tmpsd);
+ else
+ nonmvclauses = lappend(nonmvclauses, tmpsd);
+ }
+ level--;
-/*
- * Now let's remove redundant statistics, covering the same columns
- * as some other stats, when restricted to the attributes from
- * remaining clauses.
- *
- * If statistics S1 covers S2 (covers S2 attributes and possibly
- * some more), we can probably remove S2. What actually matters are
- * attributes from covered clauses (not all the attributes). This
- * might however prefer larger, and thus less accurate, statistics.
- *
- * When a redundancy is detected, we simply keep the smaller
- * statistics (less number of columns), on the assumption that it's
- * more accurate and faster to process. That might be incorrect for
- * two reasons - first, the accuracy really depends on number of
- * buckets/MCV items, not the number of columns. Second, we might
- * prefer MCV lists over histograms or something like that.
- */
-static List*
-filter_redundant_stats(List *stats, List *clauses, List *conditions)
-{
- int i, j, nmvstats;
- MVStatisticInfo *mvstats;
- bool *redundant;
- Bitmapset **stats_attnums;
- Bitmapset *varattnos;
- Index relid;
+ if (list_length(mvclauses) == 1)
+ {
+ /*
+ * If this boolean clause has only one mv clause, pull it up for
+ * now.
+ */
+ RestrictStatData *rsd = (RestrictStatData *) linitial(mvclauses);
+ resultattrs = rsd->mvattrs;
+ resultstats = rsd->mvstats;
+ }
+ if (list_length(mvclauses) > 1)
+ {
+ /*
+ * Pick the smallest mv-stats covering as many as possible of the
+ * attributes appearing in the subclauses, then remove the clauses
+ * that are not covered by the selected mv-stats.
+ */
+ int nmvstats = 0;
+ ListCell *lc;
+ bm_mvstat *mvstatslist[16];
+ int maxnattrs = 0;
+ int i;
- Assert(list_length(stats) > 0);
- Assert(list_length(clauses) > 0);
+ /*
+ * Collect all mvstats from all subclauses. The attribute set
+ * should be unique, so use it as the key. There should not be
+ * many stats.
+ */
+ foreach (lc, mvclauses)
+ {
+ RestrictStatData *rsd = (RestrictStatData *) lfirst(lc);
+ Bitmapset *mvattrs = rsd->mvattrs;
+ ListCell *lcs;
- /*
- * We'll convert the list of statistics into an array now, because
- * the reduction of redundant statistics is easier to do that way
- * (we can mark previous stats as redundant, etc.).
- */
- mvstats = make_stats_array(stats, &nmvstats);
- stats_attnums = make_stats_attnums(mvstats, nmvstats);
+ /* make a covering attribute set of all clauses */
+ resultattrs = bms_add_members(resultattrs, mvattrs);
- /* by default, none of the stats is redundant (so palloc0) */
- redundant = palloc0(nmvstats * sizeof(bool));
+ /* pick up new mv stats from lower clauses */
+ foreach (lcs, rsd->mvstats)
+ {
+ bm_mvstat *mvs = (bm_mvstat*) lfirst(lcs);
+ bool found = false;
- /*
- * We only expect a single relid here, and also we should get the
- * same relid from clauses and conditions (but we get it from
- * clauses, because those are certainly non-empty).
- */
- relid = bms_singleton_member(pull_varnos((Node*)clauses));
+ for (i = 0 ; !found && i < nmvstats ; i++)
+ {
+ if (bms_equal(mvstatslist[i]->attrs, mvs->attrs))
+ found = true;
+ }
+ if (!found)
+ {
+ mvstatslist[nmvstats] = mvs;
+ nmvstats++;
+ }
- /*
- * Get the varattnos from both conditions and clauses.
- *
- * This skips system attributes, although that should be impossible
- * thanks to previous filtering out of incompatible clauses.
- *
- * XXX Is that really true?
- */
- varattnos = bms_union(get_varattnos((Node*)clauses, relid),
- get_varattnos((Node*)conditions, relid));
+ /* ignore more than 15(!) stats for a clause */
+ if (nmvstats > 15)
+ break;
+ }
+ }
- for (i = 1; i < nmvstats; i++)
- {
- /* intersect with current statistics */
- Bitmapset *curr = bms_intersect(stats_attnums[i], varattnos);
+ /* Check functional dependency first, maybe.. */
+// if (list_length(mvclauses) == 2)
+// {
+// RestrictStatData *rsd1 =
+// (RestrictStatData *) linitial(mvclauses);
+// RestrictStatData *rsd2 =
+// (RestrictStatData *) lsecond(mvclauses);
+// /* To do more...*/
+// }
- /* walk through 'previous' stats and check redundancy */
- for (j = 0; j < i; j++)
- {
- /* intersect with current statistics */
- Bitmapset *prev;
+ //if (clauseboolop == AND_EXPR && ...
+
+ maxnattrs = 0;
- /* skip stats already identified as redundant */
- if (redundant[j])
- continue;
+ /*
+ * Find stats covering largest number of attributes in this
+ * clause
+ */
+ for (i = 0 ; i < nmvstats ; i++)
+ {
+ Bitmapset *matchattr =
+ bms_intersect(resultattrs, mvstatslist[i]->attrs);
+ int nmatchattrs = bms_num_members(matchattr);
- prev = bms_intersect(stats_attnums[j], varattnos);
+ if (maxnattrs < nmatchattrs)
+ {
+ /* The candidates so far are not the maximum */
+ if (nmvstats - i > 0)
+ memmove(mvstatslist, mvstatslist + i,
+ (nmvstats - i) * sizeof(bm_mvstat*));
+ maxnattrs = nmatchattrs;
+ nmvstats -= i;
+ i = 0; /* Restart from the first */
+ }
+ else if (maxnattrs > nmatchattrs)
+ {
+ /* Remove this stats */
+ if (nmvstats - i - 1 > 0)
+ memmove(mvstatslist + i, mvstatslist + i + 1,
+ (nmvstats - i - 1) * sizeof(bm_mvstat*));
+ nmvstats--;
+ }
+ }
- switch (bms_subset_compare(curr, prev))
+ if (maxnattrs < 2)
+ {
+ /* mv stats don't apply to a single attribute */
+ mvclauses = NIL;
+ nonmvclauses = NIL;
+ resultattrs = NULL;
+ resultstats = NIL;
+ }
+ else
{
- case BMS_EQUAL:
+ /* Consider only the first stats for now. */
+ if (nmvstats > 1)
+ elog(LOG, "Some mv stats are ignored");
+
+ if (!bms_is_subset(resultattrs,
+ mvstatslist[0]->attrs))
+ {
/*
- * Use the smaller one (hopefully more accurate).
- * If both have the same size, use the first one.
+ * move out the clauses that are not covered by the
+ * candidate stats
*/
- if (mvstats[i].stakeys->dim1 >= mvstats[j].stakeys->dim1)
- redundant[i] = TRUE;
- else
- redundant[j] = TRUE;
-
- break;
-
- case BMS_SUBSET1: /* curr is subset of prev */
- redundant[i] = TRUE;
- break;
+ List *old_mvclauses = mvclauses;
+ ListCell *lc;
+ Bitmapset *statsattrs =
+ mvstatslist[0]->attrs;
+ mvclauses = NIL;
- case BMS_SUBSET2: /* prev is subset of curr */
- redundant[j] = TRUE;
- break;
+ foreach(lc, old_mvclauses)
+ {
+ RestrictStatData *rsd = (RestrictStatData *) lfirst(lc);
+ Assert(IsA(rsd, RestrictStatData));
- case BMS_DIFFERENT:
- /* do nothing - keep both stats */
- break;
+ if (bms_is_subset(rsd->mvattrs, statsattrs))
+ mvclauses = lappend(mvclauses, rsd);
+ else
+ nonmvclauses = lappend(nonmvclauses, rsd);
+ }
+ resultattrs = bms_intersect(resultattrs,
+ mvstatslist[0]->attrs);
+ }
+ resultstats = list_make1(mvstatslist[0]);
}
-
- bms_free(prev);
}
- bms_free(curr);
- }
-
- /* can't reduce all statistics (at least one has to remain) */
- Assert(nmvstats > 0);
+ if (bms_num_members(resultattrs) < 2)
+ {
+ /*
+ * make this non-mv if mvclause covers only one mv-attribute.
+ */
+ nonmvclauses = list_concat(nonmvclauses, mvclauses);
+ mvclauses = NULL;
+ resultattrs = NULL;
+ resultstats = NIL;
+ }
- /* now, let's remove the reduced statistics from the arrays */
- list_free(stats);
- stats = NIL;
+ /*
+ * All mvclauses are covered by the candidate stats here.
+ */
+ rdata->mvclause =
+ stripRestrictStatData(mvclauses, rdata->boolop, NULL);
+ rdata->children = partialclauses;
+ rdata->mvattrs = resultattrs;
+ rdata->nonmvclause =
+ stripRestrictStatData(nonmvclauses, rdata->boolop, NULL);
+ rdata->mvstats = resultstats;
- for (i = 0; i < nmvstats; i++)
+ }
+ else if (not_clause(clause))
{
- MVStatisticInfo *info;
-
- pfree(stats_attnums[i]);
+ Node *nd = (Node *) linitial(((BoolExpr*)clause)->args);
+ RestrictStatData *tmpsd;
- if (redundant[i])
- continue;
-
- info = makeNode(MVStatisticInfo);
- memcpy(info, &mvstats[i], sizeof(MVStatisticInfo));
-
- stats = lappend(stats, info);
+ tmpsd = transformRestrictInfoForEstimate(root, list_make1(nd),
+ relid, sjinfo);
+ rdata->children = list_make1(tmpsd);
}
-
- pfree(mvstats);
- pfree(stats_attnums);
- pfree(redundant);
-
- return stats;
-}
-
-static Node**
-make_clauses_array(List *clauses, int *nclauses)
-{
- int i;
- ListCell *l;
-
- Node** clauses_array;
-
- *nclauses = list_length(clauses);
- clauses_array = (Node **)palloc0((*nclauses) * sizeof(Node *));
-
- i = 0;
- foreach (l, clauses)
- clauses_array[i++] = (Node *)lfirst(l);
-
- *nclauses = i;
-
- return clauses_array;
-}
-
-static Bitmapset **
-make_clauses_attnums(PlannerInfo *root, Oid varRelid, SpecialJoinInfo *sjinfo,
- int type, Node **clauses, int nclauses)
-{
- int i;
- Index relid;
- Bitmapset **clauses_attnums
- = (Bitmapset **)palloc0(nclauses * sizeof(Bitmapset *));
-
- for (i = 0; i < nclauses; i++)
+ else if (is_opclause(clause) &&
+ list_length(((OpExpr *) clause)->args) == 2)
{
- Bitmapset * attnums = NULL;
+ Node *varnode = get_leftop((Expr*)clause);
+ Node *nonvarnode = get_rightop((Expr*)clause);
- if (! clause_is_mv_compatible(root, clauses[i], varRelid,
- &relid, &attnums, sjinfo, type))
- elog(ERROR, "should not get non-mv-compatible cluase");
+ /* Put the Var on varnode, if there is one */
+ if (!IsA(varnode, Var))
+ {
+ Node *tmp = nonvarnode;
+ nonvarnode = varnode;
+ varnode = tmp;
+ }
+
+ if (IsA(varnode, Var) && is_pseudo_constant_clause(nonvarnode))
+ {
+ Var *var = (Var *)varnode;
+ List *statslist = root->simple_rel_array[relid]->mvstatlist;
+ Oid opno = ((OpExpr*)clause)->opno;
+ int varmvbitmap = get_oprmvstat(opno);
+
+ if (varmvbitmap &&
+ !IS_SPECIAL_VARNO(var->varno) &&
+ AttrNumberIsForUserDefinedAttr(var->varattno))
+ {
+ List *mvstats = NIL;
+ ListCell *lc;
+ Bitmapset *varattrs = bms_make_singleton(var->varattno);
- clauses_attnums[i] = attnums;
+ /*
+ * Add mv statistics if they are applicable to this expression
+ */
+ foreach (lc, statslist)
+ {
+ int k;
+ MVStatisticInfo *stats = (MVStatisticInfo *) lfirst(lc);
+ Bitmapset *statsattrs = NULL;
+ int statsmvbitmap =
+ (stats->mcv_built ? MVSTATISTIC_MCV : 0) |
+ (stats->hist_built ? MVSTATISTIC_HIST : 0) |
+ (stats->deps_built ? MVSTATISTIC_FDEP : 0);
+
+ for (k = 0 ; k < stats->stakeys->dim1 ; k++)
+ statsattrs = bms_add_member(statsattrs,
+ stats->stakeys->values[k]);
+ /* XXX: Does this work as expected? */
+ if (bms_is_subset(varattrs, statsattrs) &&
+ (statsmvbitmap & varmvbitmap))
+ {
+ bm_mvstat *mvstatsent = palloc0(sizeof(bm_mvstat));
+ mvstatsent->attrs = statsattrs;
+ mvstatsent->stats = stats;
+ mvstatsent->mvkind = statsmvbitmap;
+ mvstats = lappend(mvstats, mvstatsent);
+ }
+ }
+ if (mvstats)
+ {
+ /* MV stats are potentially applicable to this expression */
+ ereport(DEBUG1,
+ (errmsg ("%sMATCH[%d][%d](varno = %d, attno = %d)",
+ head, level, i,
+ var->varno, var->varattno),
+ errhidestmt(level)));
+
+ rdata->mvstats = mvstats;
+ rdata->mvattrs = varattrs;
+ }
+ }
+ }
+ else
+ {
+ ereport(DEBUG1,
+ (errmsg ("%sno match BinOp[%d][%d]: r=%d, l=%d",
+ head, level, i,
+ varnode->type, nonvarnode->type),
+ errhidestmt(level)));
+ }
}
+ else if (IsA(clause, NullTest))
+ {
+ NullTest *expr = (NullTest*)clause;
+ Var *var = (Var *)(expr->arg);
- return clauses_attnums;
-}
-
-static bool*
-make_cover_map(Bitmapset **stats_attnums, int nmvstats,
- Bitmapset **clauses_attnums, int nclauses)
-{
- int i, j;
- bool *cover_map = (bool*)palloc0(nclauses * nmvstats);
+ if (IsA(var, Var) &&
+ !IS_SPECIAL_VARNO(var->varno) &&
+ AttrNumberIsForUserDefinedAttr(var->varattno))
+ {
+ Bitmapset *varattrs = bms_make_singleton(var->varattno);
+ List *mvstats = NIL;
+ ListCell *lc;
- for (i = 0; i < nmvstats; i++)
- for (j = 0; j < nclauses; j++)
- cover_map[i * nclauses + j]
- = bms_is_subset(clauses_attnums[j], stats_attnums[i]);
+ foreach(lc, root->simple_rel_array[relid]->mvstatlist)
+ {
+ MVStatisticInfo *stats = (MVStatisticInfo *) lfirst(lc);
+ Bitmapset *statsattrs = NULL;
+ int k;
+
+ for (k = 0 ; k < stats->stakeys->dim1 ; k++)
+ statsattrs = bms_add_member(statsattrs,
+ stats->stakeys->values[k]);
+ if (bms_is_subset(varattrs, statsattrs))
+ {
+ bm_mvstat *mvstatsent = palloc0(sizeof(bm_mvstat));
+ mvstatsent->stats = stats;
+ mvstatsent->attrs = statsattrs;
+ mvstatsent->mvkind = (MVSTATISTIC_MCV |MVSTATISTIC_HIST);
+ mvstats = lappend(mvstats, mvstatsent);
+ }
+ }
+ if (mvstats)
+ {
+ rdata->mvstats = mvstats;
+ rdata->mvattrs = varattrs;
+ }
+ }
+ }
+ else
+ {
+ ereport(DEBUG1,
+ (errmsg ("%sno match node(%d)[%d][%d]",
+ head, clause->type, level, i),
+ errhidestmt(level)));
+ }
- return cover_map;
+ return rdata;
}
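
To make the intended decomposition concrete, here is a hypothetical
walk-through, assuming multivariate statistics exist on (a, b) only:

/* clauses built from WHERE (a = 1) AND (b = 2) AND (d = 3) */
RestrictStatData *rdata =
    transformRestrictInfoForEstimate(root, clauses, rel->relid, sjinfo);

/* expected decomposition:
 *   rdata->mvclause    = (a = 1 AND b = 2)  -- covered by the stats
 *   rdata->nonmvclause = (d = 3)            -- estimated the usual way
 *   rdata->mvattrs     = {a, b}
 *   rdata->boolop      = AND_EXPR
 */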
diff --git a/src/backend/optimizer/path/costsize.c b/src/backend/optimizer/path/costsize.c
index 6837364..7069f60 100644
--- a/src/backend/optimizer/path/costsize.c
+++ b/src/backend/optimizer/path/costsize.c
@@ -3380,8 +3380,7 @@ compute_semi_anti_join_factors(PlannerInfo *root,
joinquals,
0,
jointype,
- sjinfo,
- NIL);
+ sjinfo);
/*
* Also get the normal inner-join selectivity of the join clauses.
@@ -3404,8 +3403,7 @@ compute_semi_anti_join_factors(PlannerInfo *root,
joinquals,
0,
JOIN_INNER,
- &norm_sjinfo,
- NIL);
+ &norm_sjinfo);
/* Avoid leaking a lot of ListCells */
if (jointype == JOIN_ANTI)
@@ -3572,7 +3570,7 @@ approx_tuple_count(PlannerInfo *root, JoinPath *path, List *quals)
Node *qual = (Node *) lfirst(l);
/* Note that clause_selectivity will be able to cache its result */
- selec *= clause_selectivity(root, qual, 0, JOIN_INNER, &sjinfo, NIL);
+ selec *= clause_selectivity(root, qual, 0, JOIN_INNER, &sjinfo);
}
/* Apply it to the input relation sizes */
@@ -3608,8 +3606,7 @@ set_baserel_size_estimates(PlannerInfo *root, RelOptInfo *rel)
rel->baserestrictinfo,
0,
JOIN_INNER,
- NULL,
- NIL);
+ NULL);
rel->rows = clamp_row_est(nrows);
@@ -3646,8 +3643,7 @@ get_parameterized_baserel_size(PlannerInfo *root, RelOptInfo *rel,
allclauses,
rel->relid, /* do not use 0! */
JOIN_INNER,
- NULL,
- NIL);
+ NULL);
nrows = clamp_row_est(nrows);
/* For safety, make sure result is not more than the base estimate */
if (nrows > rel->rows)
@@ -3785,14 +3781,12 @@ calc_joinrel_size_estimate(PlannerInfo *root,
joinquals,
0,
jointype,
- sjinfo,
- NIL);
+ sjinfo);
pselec = clauselist_selectivity(root,
pushedquals,
0,
jointype,
- sjinfo,
- NIL);
+ sjinfo);
/* Avoid leaking a lot of ListCells */
list_free(joinquals);
@@ -3804,8 +3798,7 @@ calc_joinrel_size_estimate(PlannerInfo *root,
restrictlist,
0,
jointype,
- sjinfo,
- NIL);
+ sjinfo);
pselec = 0.0; /* not used, keep compiler quiet */
}
diff --git a/src/backend/optimizer/util/orclauses.c b/src/backend/optimizer/util/orclauses.c
index e41508b..f0acc14 100644
--- a/src/backend/optimizer/util/orclauses.c
+++ b/src/backend/optimizer/util/orclauses.c
@@ -280,7 +280,7 @@ consider_new_or_clause(PlannerInfo *root, RelOptInfo *rel,
* saving work later.)
*/
or_selec = clause_selectivity(root, (Node *) or_rinfo,
- 0, JOIN_INNER, NULL, NIL);
+ 0, JOIN_INNER, NULL);
/*
* The clause is only worth adding to the query if it rejects a useful
@@ -342,7 +342,7 @@ consider_new_or_clause(PlannerInfo *root, RelOptInfo *rel,
/* Compute inner-join size */
orig_selec = clause_selectivity(root, (Node *) join_or_rinfo,
- 0, JOIN_INNER, &sjinfo, NIL);
+ 0, JOIN_INNER, &sjinfo);
/* And hack cached selectivity so join size remains the same */
join_or_rinfo->norm_selec = orig_selec / or_selec;
diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index cba54a4..64b6ae4 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -1580,15 +1580,13 @@ booltestsel(PlannerInfo *root, BoolTestType booltesttype, Node *arg,
case IS_NOT_FALSE:
selec = (double) clause_selectivity(root, arg,
varRelid,
- jointype, sjinfo,
- NIL);
+ jointype, sjinfo);
break;
case IS_FALSE:
case IS_NOT_TRUE:
selec = 1.0 - (double) clause_selectivity(root, arg,
varRelid,
- jointype, sjinfo,
- NIL);
+ jointype, sjinfo);
break;
default:
elog(ERROR, "unrecognized booltesttype: %d",
@@ -6208,8 +6206,7 @@ genericcostestimate(PlannerInfo *root,
indexSelectivity = clauselist_selectivity(root, selectivityQuals,
index->rel->relid,
JOIN_INNER,
- NULL,
- NIL);
+ NULL);
/*
* If caller didn't give us an estimate, estimate the number of index
@@ -6534,8 +6531,7 @@ btcostestimate(PG_FUNCTION_ARGS)
btreeSelectivity = clauselist_selectivity(root, selectivityQuals,
index->rel->relid,
JOIN_INNER,
- NULL,
- NIL);
+ NULL);
numIndexTuples = btreeSelectivity * index->rel->tuples;
/*
@@ -7278,8 +7274,7 @@ gincostestimate(PG_FUNCTION_ARGS)
*indexSelectivity = clauselist_selectivity(root, selectivityQuals,
index->rel->relid,
JOIN_INNER,
- NULL,
- NIL);
+ NULL);
/* fetch estimated page cost for tablespace containing index */
get_tablespace_page_costs(index->reltablespace,
@@ -7511,7 +7506,7 @@ brincostestimate(PG_FUNCTION_ARGS)
*indexSelectivity =
clauselist_selectivity(root, indexQuals,
path->indexinfo->rel->relid,
- JOIN_INNER, NULL, NIL);
+ JOIN_INNER, NULL);
*indexCorrelation = 1;
/*
diff --git a/src/backend/utils/cache/lsyscache.c b/src/backend/utils/cache/lsyscache.c
index 1dc2932..6e3a0c7 100644
--- a/src/backend/utils/cache/lsyscache.c
+++ b/src/backend/utils/cache/lsyscache.c
@@ -44,6 +44,7 @@
#include "utils/rel.h"
#include "utils/syscache.h"
#include "utils/typcache.h"
+#include "utils/mvstats.h"
/* Hook for plugins to get control in get_attavgwidth() */
get_attavgwidth_hook_type get_attavgwidth_hook = NULL;
@@ -1344,6 +1345,45 @@ get_oprjoin(Oid opno)
return (RegProcedure) InvalidOid;
}
+/*
+ * get_oprmvstat
+ *
+ * Returns the mv stats compatibility for computing selectivity.
+ * The return value is a bitwise OR of the MVSTATISTIC_* symbols.
+ */
+int
+get_oprmvstat(Oid opno)
+{
+ HeapTuple tp;
+
+ tp = SearchSysCache1(OPEROID, ObjectIdGetDatum(opno));
+ if (HeapTupleIsValid(tp))
+ {
+ Datum tmp;
+ bool isnull;
+ char *str;
+ int result = 0;
+
+ tmp = SysCacheGetAttr(OPEROID, tp,
+ Anum_pg_operator_oprmvstat, &isnull);
+ if (!isnull)
+ {
+ str = TextDatumGetCString(tmp);
+ if (strlen(str) == 3)
+ {
+ if (str[0] != '-') result |= MVSTATISTIC_MCV;
+ if (str[1] != '-') result |= MVSTATISTIC_HIST;
+ if (str[2] != '-') result |= MVSTATISTIC_FDEP;
+ }
+ }
+ ReleaseSysCache(tp);
+ return result;
+ }
+ else
+ return 0;
+}
+
+
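
For example, a minimal sketch of decoding (per the catalog entries below,
where int4eq is marked "mhf" and int4lt is marked "mh-"):

int kinds = get_oprmvstat(Int4EqualOperator);  /* OID 96, "mhf" */
/* kinds == (MVSTATISTIC_MCV | MVSTATISTIC_HIST | MVSTATISTIC_FDEP) */

kinds = get_oprmvstat(Int4LessOperator);       /* OID 97, "mh-" */
/* kinds == (MVSTATISTIC_MCV | MVSTATISTIC_HIST) */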
/* ---------- FUNCTION CACHE ---------- */
/*
diff --git a/src/include/catalog/pg_operator.h b/src/include/catalog/pg_operator.h
index 26c9d4e..c75ac72 100644
--- a/src/include/catalog/pg_operator.h
+++ b/src/include/catalog/pg_operator.h
@@ -49,6 +49,9 @@ CATALOG(pg_operator,2617)
regproc oprcode; /* OID of underlying function */
regproc oprrest; /* OID of restriction estimator, or 0 */
regproc oprjoin; /* OID of join estimator, or 0 */
+#ifdef CATALOG_VARLEN /* variable-length fields start here */
+ text oprmvstat; /* MV stat compatibility in '[m-][h-][f-]' */
+#endif
} FormData_pg_operator;
/* ----------------
@@ -63,7 +66,7 @@ typedef FormData_pg_operator *Form_pg_operator;
* ----------------
*/
-#define Natts_pg_operator 14
+#define Natts_pg_operator 15
#define Anum_pg_operator_oprname 1
#define Anum_pg_operator_oprnamespace 2
#define Anum_pg_operator_oprowner 3
@@ -78,6 +81,7 @@ typedef FormData_pg_operator *Form_pg_operator;
#define Anum_pg_operator_oprcode 12
#define Anum_pg_operator_oprrest 13
#define Anum_pg_operator_oprjoin 14
+#define Anum_pg_operator_oprmvstat 15
/* ----------------
* initial contents of pg_operator
@@ -91,1735 +95,1735 @@ typedef FormData_pg_operator *Form_pg_operator;
* for the underlying function.
*/
-DATA(insert OID = 15 ( "=" PGNSP PGUID b t t 23 20 16 416 36 int48eq eqsel eqjoinsel ));
+DATA(insert OID = 15 ( "=" PGNSP PGUID b t t 23 20 16 416 36 int48eq eqsel eqjoinsel "mhf"));
DESCR("equal");
-DATA(insert OID = 36 ( "<>" PGNSP PGUID b f f 23 20 16 417 15 int48ne neqsel neqjoinsel ));
+DATA(insert OID = 36 ( "<>" PGNSP PGUID b f f 23 20 16 417 15 int48ne neqsel neqjoinsel "mhf"));
DESCR("not equal");
-DATA(insert OID = 37 ( "<" PGNSP PGUID b f f 23 20 16 419 82 int48lt scalarltsel scalarltjoinsel ));
+DATA(insert OID = 37 ( "<" PGNSP PGUID b f f 23 20 16 419 82 int48lt scalarltsel scalarltjoinsel "mh-"));
DESCR("less than");
-DATA(insert OID = 76 ( ">" PGNSP PGUID b f f 23 20 16 418 80 int48gt scalargtsel scalargtjoinsel ));
+DATA(insert OID = 76 ( ">" PGNSP PGUID b f f 23 20 16 418 80 int48gt scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than");
-DATA(insert OID = 80 ( "<=" PGNSP PGUID b f f 23 20 16 430 76 int48le scalarltsel scalarltjoinsel ));
+DATA(insert OID = 80 ( "<=" PGNSP PGUID b f f 23 20 16 430 76 int48le scalarltsel scalarltjoinsel "mh-"));
DESCR("less than or equal");
-DATA(insert OID = 82 ( ">=" PGNSP PGUID b f f 23 20 16 420 37 int48ge scalargtsel scalargtjoinsel ));
+DATA(insert OID = 82 ( ">=" PGNSP PGUID b f f 23 20 16 420 37 int48ge scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than or equal");
-DATA(insert OID = 58 ( "<" PGNSP PGUID b f f 16 16 16 59 1695 boollt scalarltsel scalarltjoinsel ));
+DATA(insert OID = 58 ( "<" PGNSP PGUID b f f 16 16 16 59 1695 boollt scalarltsel scalarltjoinsel "mh-"));
DESCR("less than");
-DATA(insert OID = 59 ( ">" PGNSP PGUID b f f 16 16 16 58 1694 boolgt scalargtsel scalargtjoinsel ));
+DATA(insert OID = 59 ( ">" PGNSP PGUID b f f 16 16 16 58 1694 boolgt scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than");
-DATA(insert OID = 85 ( "<>" PGNSP PGUID b f f 16 16 16 85 91 boolne neqsel neqjoinsel ));
+DATA(insert OID = 85 ( "<>" PGNSP PGUID b f f 16 16 16 85 91 boolne neqsel neqjoinsel "mhf"));
DESCR("not equal");
#define BooleanNotEqualOperator 85
-DATA(insert OID = 91 ( "=" PGNSP PGUID b t t 16 16 16 91 85 booleq eqsel eqjoinsel ));
+DATA(insert OID = 91 ( "=" PGNSP PGUID b t t 16 16 16 91 85 booleq eqsel eqjoinsel "mhf"));
DESCR("equal");
#define BooleanEqualOperator 91
-DATA(insert OID = 1694 ( "<=" PGNSP PGUID b f f 16 16 16 1695 59 boolle scalarltsel scalarltjoinsel ));
+DATA(insert OID = 1694 ( "<=" PGNSP PGUID b f f 16 16 16 1695 59 boolle scalarltsel scalarltjoinsel "mh-"));
DESCR("less than or equal");
-DATA(insert OID = 1695 ( ">=" PGNSP PGUID b f f 16 16 16 1694 58 boolge scalargtsel scalargtjoinsel ));
+DATA(insert OID = 1695 ( ">=" PGNSP PGUID b f f 16 16 16 1694 58 boolge scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than or equal");
-DATA(insert OID = 92 ( "=" PGNSP PGUID b t t 18 18 16 92 630 chareq eqsel eqjoinsel ));
+DATA(insert OID = 92 ( "=" PGNSP PGUID b t t 18 18 16 92 630 chareq eqsel eqjoinsel "mhf"));
DESCR("equal");
-DATA(insert OID = 93 ( "=" PGNSP PGUID b t t 19 19 16 93 643 nameeq eqsel eqjoinsel ));
+DATA(insert OID = 93 ( "=" PGNSP PGUID b t t 19 19 16 93 643 nameeq eqsel eqjoinsel "mhf"));
DESCR("equal");
-DATA(insert OID = 94 ( "=" PGNSP PGUID b t t 21 21 16 94 519 int2eq eqsel eqjoinsel ));
+DATA(insert OID = 94 ( "=" PGNSP PGUID b t t 21 21 16 94 519 int2eq eqsel eqjoinsel "mhf"));
DESCR("equal");
-DATA(insert OID = 95 ( "<" PGNSP PGUID b f f 21 21 16 520 524 int2lt scalarltsel scalarltjoinsel ));
+DATA(insert OID = 95 ( "<" PGNSP PGUID b f f 21 21 16 520 524 int2lt scalarltsel scalarltjoinsel "mh-"));
DESCR("less than");
-DATA(insert OID = 96 ( "=" PGNSP PGUID b t t 23 23 16 96 518 int4eq eqsel eqjoinsel ));
+DATA(insert OID = 96 ( "=" PGNSP PGUID b t t 23 23 16 96 518 int4eq eqsel eqjoinsel "mhf"));
DESCR("equal");
#define Int4EqualOperator 96
-DATA(insert OID = 97 ( "<" PGNSP PGUID b f f 23 23 16 521 525 int4lt scalarltsel scalarltjoinsel ));
+DATA(insert OID = 97 ( "<" PGNSP PGUID b f f 23 23 16 521 525 int4lt scalarltsel scalarltjoinsel "mh-"));
DESCR("less than");
#define Int4LessOperator 97
-DATA(insert OID = 98 ( "=" PGNSP PGUID b t t 25 25 16 98 531 texteq eqsel eqjoinsel ));
+DATA(insert OID = 98 ( "=" PGNSP PGUID b t t 25 25 16 98 531 texteq eqsel eqjoinsel "mhf"));
DESCR("equal");
#define TextEqualOperator 98
-DATA(insert OID = 349 ( "||" PGNSP PGUID b f f 2277 2283 2277 0 0 array_append - - ));
+DATA(insert OID = 349 ( "||" PGNSP PGUID b f f 2277 2283 2277 0 0 array_append - - "---"));
DESCR("append element onto end of array");
-DATA(insert OID = 374 ( "||" PGNSP PGUID b f f 2283 2277 2277 0 0 array_prepend - - ));
+DATA(insert OID = 374 ( "||" PGNSP PGUID b f f 2283 2277 2277 0 0 array_prepend - - "---"));
DESCR("prepend element onto front of array");
-DATA(insert OID = 375 ( "||" PGNSP PGUID b f f 2277 2277 2277 0 0 array_cat - - ));
+DATA(insert OID = 375 ( "||" PGNSP PGUID b f f 2277 2277 2277 0 0 array_cat - - "---"));
DESCR("concatenate");
-DATA(insert OID = 352 ( "=" PGNSP PGUID b f t 28 28 16 352 0 xideq eqsel eqjoinsel ));
+DATA(insert OID = 352 ( "=" PGNSP PGUID b f t 28 28 16 352 0 xideq eqsel eqjoinsel "mhf"));
DESCR("equal");
-DATA(insert OID = 353 ( "=" PGNSP PGUID b f f 28 23 16 0 0 xideqint4 eqsel eqjoinsel ));
+DATA(insert OID = 353 ( "=" PGNSP PGUID b f f 28 23 16 0 0 xideqint4 eqsel eqjoinsel "mhf"));
DESCR("equal");
-DATA(insert OID = 388 ( "!" PGNSP PGUID r f f 20 0 1700 0 0 numeric_fac - - ));
+DATA(insert OID = 388 ( "!" PGNSP PGUID r f f 20 0 1700 0 0 numeric_fac - - "---"));
DESCR("factorial");
-DATA(insert OID = 389 ( "!!" PGNSP PGUID l f f 0 20 1700 0 0 numeric_fac - - ));
+DATA(insert OID = 389 ( "!!" PGNSP PGUID l f f 0 20 1700 0 0 numeric_fac - - "---"));
DESCR("deprecated, use ! instead");
-DATA(insert OID = 385 ( "=" PGNSP PGUID b f t 29 29 16 385 0 cideq eqsel eqjoinsel ));
+DATA(insert OID = 385 ( "=" PGNSP PGUID b f t 29 29 16 385 0 cideq eqsel eqjoinsel "mhf"));
DESCR("equal");
-DATA(insert OID = 386 ( "=" PGNSP PGUID b f t 22 22 16 386 0 int2vectoreq eqsel eqjoinsel ));
+DATA(insert OID = 386 ( "=" PGNSP PGUID b f t 22 22 16 386 0 int2vectoreq eqsel eqjoinsel "mhf"));
DESCR("equal");
-DATA(insert OID = 387 ( "=" PGNSP PGUID b t f 27 27 16 387 402 tideq eqsel eqjoinsel ));
+DATA(insert OID = 387 ( "=" PGNSP PGUID b t f 27 27 16 387 402 tideq eqsel eqjoinsel "mhf"));
DESCR("equal");
#define TIDEqualOperator 387
-DATA(insert OID = 402 ( "<>" PGNSP PGUID b f f 27 27 16 402 387 tidne neqsel neqjoinsel ));
+DATA(insert OID = 402 ( "<>" PGNSP PGUID b f f 27 27 16 402 387 tidne neqsel neqjoinsel "mhf"));
DESCR("not equal");
-DATA(insert OID = 2799 ( "<" PGNSP PGUID b f f 27 27 16 2800 2802 tidlt scalarltsel scalarltjoinsel ));
+DATA(insert OID = 2799 ( "<" PGNSP PGUID b f f 27 27 16 2800 2802 tidlt scalarltsel scalarltjoinsel "mh-"));
DESCR("less than");
#define TIDLessOperator 2799
-DATA(insert OID = 2800 ( ">" PGNSP PGUID b f f 27 27 16 2799 2801 tidgt scalargtsel scalargtjoinsel ));
+DATA(insert OID = 2800 ( ">" PGNSP PGUID b f f 27 27 16 2799 2801 tidgt scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than");
-DATA(insert OID = 2801 ( "<=" PGNSP PGUID b f f 27 27 16 2802 2800 tidle scalarltsel scalarltjoinsel ));
+DATA(insert OID = 2801 ( "<=" PGNSP PGUID b f f 27 27 16 2802 2800 tidle scalarltsel scalarltjoinsel "mh-"));
DESCR("less than or equal");
-DATA(insert OID = 2802 ( ">=" PGNSP PGUID b f f 27 27 16 2801 2799 tidge scalargtsel scalargtjoinsel ));
+DATA(insert OID = 2802 ( ">=" PGNSP PGUID b f f 27 27 16 2801 2799 tidge scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than or equal");
-DATA(insert OID = 410 ( "=" PGNSP PGUID b t t 20 20 16 410 411 int8eq eqsel eqjoinsel ));
+DATA(insert OID = 410 ( "=" PGNSP PGUID b t t 20 20 16 410 411 int8eq eqsel eqjoinsel "mhf"));
DESCR("equal");
-DATA(insert OID = 411 ( "<>" PGNSP PGUID b f f 20 20 16 411 410 int8ne neqsel neqjoinsel ));
+DATA(insert OID = 411 ( "<>" PGNSP PGUID b f f 20 20 16 411 410 int8ne neqsel neqjoinsel "mhf"));
DESCR("not equal");
-DATA(insert OID = 412 ( "<" PGNSP PGUID b f f 20 20 16 413 415 int8lt scalarltsel scalarltjoinsel ));
+DATA(insert OID = 412 ( "<" PGNSP PGUID b f f 20 20 16 413 415 int8lt scalarltsel scalarltjoinsel "mh-"));
DESCR("less than");
#define Int8LessOperator 412
-DATA(insert OID = 413 ( ">" PGNSP PGUID b f f 20 20 16 412 414 int8gt scalargtsel scalargtjoinsel ));
+DATA(insert OID = 413 ( ">" PGNSP PGUID b f f 20 20 16 412 414 int8gt scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than");
-DATA(insert OID = 414 ( "<=" PGNSP PGUID b f f 20 20 16 415 413 int8le scalarltsel scalarltjoinsel ));
+DATA(insert OID = 414 ( "<=" PGNSP PGUID b f f 20 20 16 415 413 int8le scalarltsel scalarltjoinsel "mh-"));
DESCR("less than or equal");
-DATA(insert OID = 415 ( ">=" PGNSP PGUID b f f 20 20 16 414 412 int8ge scalargtsel scalargtjoinsel ));
+DATA(insert OID = 415 ( ">=" PGNSP PGUID b f f 20 20 16 414 412 int8ge scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than or equal");
-DATA(insert OID = 416 ( "=" PGNSP PGUID b t t 20 23 16 15 417 int84eq eqsel eqjoinsel ));
+DATA(insert OID = 416 ( "=" PGNSP PGUID b t t 20 23 16 15 417 int84eq eqsel eqjoinsel "mhf"));
DESCR("equal");
-DATA(insert OID = 417 ( "<>" PGNSP PGUID b f f 20 23 16 36 416 int84ne neqsel neqjoinsel ));
+DATA(insert OID = 417 ( "<>" PGNSP PGUID b f f 20 23 16 36 416 int84ne neqsel neqjoinsel "mhf"));
DESCR("not equal");
-DATA(insert OID = 418 ( "<" PGNSP PGUID b f f 20 23 16 76 430 int84lt scalarltsel scalarltjoinsel ));
+DATA(insert OID = 418 ( "<" PGNSP PGUID b f f 20 23 16 76 430 int84lt scalarltsel scalarltjoinsel "mh-"));
DESCR("less than");
-DATA(insert OID = 419 ( ">" PGNSP PGUID b f f 20 23 16 37 420 int84gt scalargtsel scalargtjoinsel ));
+DATA(insert OID = 419 ( ">" PGNSP PGUID b f f 20 23 16 37 420 int84gt scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than");
-DATA(insert OID = 420 ( "<=" PGNSP PGUID b f f 20 23 16 82 419 int84le scalarltsel scalarltjoinsel ));
+DATA(insert OID = 420 ( "<=" PGNSP PGUID b f f 20 23 16 82 419 int84le scalarltsel scalarltjoinsel "mh-"));
DESCR("less than or equal");
-DATA(insert OID = 430 ( ">=" PGNSP PGUID b f f 20 23 16 80 418 int84ge scalargtsel scalargtjoinsel ));
+DATA(insert OID = 430 ( ">=" PGNSP PGUID b f f 20 23 16 80 418 int84ge scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than or equal");
-DATA(insert OID = 439 ( "%" PGNSP PGUID b f f 20 20 20 0 0 int8mod - - ));
+DATA(insert OID = 439 ( "%" PGNSP PGUID b f f 20 20 20 0 0 int8mod - - "---"));
DESCR("modulus");
-DATA(insert OID = 473 ( "@" PGNSP PGUID l f f 0 20 20 0 0 int8abs - - ));
+DATA(insert OID = 473 ( "@" PGNSP PGUID l f f 0 20 20 0 0 int8abs - - "---"));
DESCR("absolute value");
-DATA(insert OID = 484 ( "-" PGNSP PGUID l f f 0 20 20 0 0 int8um - - ));
+DATA(insert OID = 484 ( "-" PGNSP PGUID l f f 0 20 20 0 0 int8um - - "---"));
DESCR("negate");
-DATA(insert OID = 485 ( "<<" PGNSP PGUID b f f 604 604 16 0 0 poly_left positionsel positionjoinsel ));
+DATA(insert OID = 485 ( "<<" PGNSP PGUID b f f 604 604 16 0 0 poly_left positionsel positionjoinsel "---"));
DESCR("is left of");
-DATA(insert OID = 486 ( "&<" PGNSP PGUID b f f 604 604 16 0 0 poly_overleft positionsel positionjoinsel ));
+DATA(insert OID = 486 ( "&<" PGNSP PGUID b f f 604 604 16 0 0 poly_overleft positionsel positionjoinsel "---"));
DESCR("overlaps or is left of");
-DATA(insert OID = 487 ( "&>" PGNSP PGUID b f f 604 604 16 0 0 poly_overright positionsel positionjoinsel ));
+DATA(insert OID = 487 ( "&>" PGNSP PGUID b f f 604 604 16 0 0 poly_overright positionsel positionjoinsel "---"));
DESCR("overlaps or is right of");
-DATA(insert OID = 488 ( ">>" PGNSP PGUID b f f 604 604 16 0 0 poly_right positionsel positionjoinsel ));
+DATA(insert OID = 488 ( ">>" PGNSP PGUID b f f 604 604 16 0 0 poly_right positionsel positionjoinsel "---"));
DESCR("is right of");
-DATA(insert OID = 489 ( "<@" PGNSP PGUID b f f 604 604 16 490 0 poly_contained contsel contjoinsel ));
+DATA(insert OID = 489 ( "<@" PGNSP PGUID b f f 604 604 16 490 0 poly_contained contsel contjoinsel "---"));
DESCR("is contained by");
-DATA(insert OID = 490 ( "@>" PGNSP PGUID b f f 604 604 16 489 0 poly_contain contsel contjoinsel ));
+DATA(insert OID = 490 ( "@>" PGNSP PGUID b f f 604 604 16 489 0 poly_contain contsel contjoinsel "---"));
DESCR("contains");
-DATA(insert OID = 491 ( "~=" PGNSP PGUID b f f 604 604 16 491 0 poly_same eqsel eqjoinsel ));
+DATA(insert OID = 491 ( "~=" PGNSP PGUID b f f 604 604 16 491 0 poly_same eqsel eqjoinsel "mhf"));
DESCR("same as");
-DATA(insert OID = 492 ( "&&" PGNSP PGUID b f f 604 604 16 492 0 poly_overlap areasel areajoinsel ));
+DATA(insert OID = 492 ( "&&" PGNSP PGUID b f f 604 604 16 492 0 poly_overlap areasel areajoinsel "---"));
DESCR("overlaps");
-DATA(insert OID = 493 ( "<<" PGNSP PGUID b f f 603 603 16 0 0 box_left positionsel positionjoinsel ));
+DATA(insert OID = 493 ( "<<" PGNSP PGUID b f f 603 603 16 0 0 box_left positionsel positionjoinsel "---"));
DESCR("is left of");
-DATA(insert OID = 494 ( "&<" PGNSP PGUID b f f 603 603 16 0 0 box_overleft positionsel positionjoinsel ));
+DATA(insert OID = 494 ( "&<" PGNSP PGUID b f f 603 603 16 0 0 box_overleft positionsel positionjoinsel "---"));
DESCR("overlaps or is left of");
-DATA(insert OID = 495 ( "&>" PGNSP PGUID b f f 603 603 16 0 0 box_overright positionsel positionjoinsel ));
+DATA(insert OID = 495 ( "&>" PGNSP PGUID b f f 603 603 16 0 0 box_overright positionsel positionjoinsel "---"));
DESCR("overlaps or is right of");
-DATA(insert OID = 496 ( ">>" PGNSP PGUID b f f 603 603 16 0 0 box_right positionsel positionjoinsel ));
+DATA(insert OID = 496 ( ">>" PGNSP PGUID b f f 603 603 16 0 0 box_right positionsel positionjoinsel "---"));
DESCR("is right of");
-DATA(insert OID = 497 ( "<@" PGNSP PGUID b f f 603 603 16 498 0 box_contained contsel contjoinsel ));
+DATA(insert OID = 497 ( "<@" PGNSP PGUID b f f 603 603 16 498 0 box_contained contsel contjoinsel "---"));
DESCR("is contained by");
-DATA(insert OID = 498 ( "@>" PGNSP PGUID b f f 603 603 16 497 0 box_contain contsel contjoinsel ));
+DATA(insert OID = 498 ( "@>" PGNSP PGUID b f f 603 603 16 497 0 box_contain contsel contjoinsel "---"));
DESCR("contains");
-DATA(insert OID = 499 ( "~=" PGNSP PGUID b f f 603 603 16 499 0 box_same eqsel eqjoinsel ));
+DATA(insert OID = 499 ( "~=" PGNSP PGUID b f f 603 603 16 499 0 box_same eqsel eqjoinsel "mhf"));
DESCR("same as");
-DATA(insert OID = 500 ( "&&" PGNSP PGUID b f f 603 603 16 500 0 box_overlap areasel areajoinsel ));
+DATA(insert OID = 500 ( "&&" PGNSP PGUID b f f 603 603 16 500 0 box_overlap areasel areajoinsel "---"));
DESCR("overlaps");
-DATA(insert OID = 501 ( ">=" PGNSP PGUID b f f 603 603 16 505 504 box_ge areasel areajoinsel ));
+DATA(insert OID = 501 ( ">=" PGNSP PGUID b f f 603 603 16 505 504 box_ge areasel areajoinsel "---"));
DESCR("greater than or equal by area");
-DATA(insert OID = 502 ( ">" PGNSP PGUID b f f 603 603 16 504 505 box_gt areasel areajoinsel ));
+DATA(insert OID = 502 ( ">" PGNSP PGUID b f f 603 603 16 504 505 box_gt areasel areajoinsel "---"));
DESCR("greater than by area");
-DATA(insert OID = 503 ( "=" PGNSP PGUID b f f 603 603 16 503 0 box_eq eqsel eqjoinsel ));
+DATA(insert OID = 503 ( "=" PGNSP PGUID b f f 603 603 16 503 0 box_eq eqsel eqjoinsel "mhf"));
DESCR("equal by area");
-DATA(insert OID = 504 ( "<" PGNSP PGUID b f f 603 603 16 502 501 box_lt areasel areajoinsel ));
+DATA(insert OID = 504 ( "<" PGNSP PGUID b f f 603 603 16 502 501 box_lt areasel areajoinsel "---"));
DESCR("less than by area");
-DATA(insert OID = 505 ( "<=" PGNSP PGUID b f f 603 603 16 501 502 box_le areasel areajoinsel ));
+DATA(insert OID = 505 ( "<=" PGNSP PGUID b f f 603 603 16 501 502 box_le areasel areajoinsel "---"));
DESCR("less than or equal by area");
-DATA(insert OID = 506 ( ">^" PGNSP PGUID b f f 600 600 16 0 0 point_above positionsel positionjoinsel ));
+DATA(insert OID = 506 ( ">^" PGNSP PGUID b f f 600 600 16 0 0 point_above positionsel positionjoinsel "---"));
DESCR("is above");
-DATA(insert OID = 507 ( "<<" PGNSP PGUID b f f 600 600 16 0 0 point_left positionsel positionjoinsel ));
+DATA(insert OID = 507 ( "<<" PGNSP PGUID b f f 600 600 16 0 0 point_left positionsel positionjoinsel "---"));
DESCR("is left of");
-DATA(insert OID = 508 ( ">>" PGNSP PGUID b f f 600 600 16 0 0 point_right positionsel positionjoinsel ));
+DATA(insert OID = 508 ( ">>" PGNSP PGUID b f f 600 600 16 0 0 point_right positionsel positionjoinsel "---"));
DESCR("is right of");
-DATA(insert OID = 509 ( "<^" PGNSP PGUID b f f 600 600 16 0 0 point_below positionsel positionjoinsel ));
+DATA(insert OID = 509 ( "<^" PGNSP PGUID b f f 600 600 16 0 0 point_below positionsel positionjoinsel "---"));
DESCR("is below");
-DATA(insert OID = 510 ( "~=" PGNSP PGUID b f f 600 600 16 510 713 point_eq eqsel eqjoinsel ));
+DATA(insert OID = 510 ( "~=" PGNSP PGUID b f f 600 600 16 510 713 point_eq eqsel eqjoinsel "mhf"));
DESCR("same as");
-DATA(insert OID = 511 ( "<@" PGNSP PGUID b f f 600 603 16 433 0 on_pb contsel contjoinsel ));
+DATA(insert OID = 511 ( "<@" PGNSP PGUID b f f 600 603 16 433 0 on_pb contsel contjoinsel "---"));
DESCR("point inside box");
-DATA(insert OID = 433 ( "@>" PGNSP PGUID b f f 603 600 16 511 0 box_contain_pt contsel contjoinsel ));
+DATA(insert OID = 433 ( "@>" PGNSP PGUID b f f 603 600 16 511 0 box_contain_pt contsel contjoinsel "---"));
DESCR("contains");
-DATA(insert OID = 512 ( "<@" PGNSP PGUID b f f 600 602 16 755 0 on_ppath - - ));
+DATA(insert OID = 512 ( "<@" PGNSP PGUID b f f 600 602 16 755 0 on_ppath - - "---"));
DESCR("point within closed path, or point on open path");
-DATA(insert OID = 513 ( "@@" PGNSP PGUID l f f 0 603 600 0 0 box_center - - ));
+DATA(insert OID = 513 ( "@@" PGNSP PGUID l f f 0 603 600 0 0 box_center - - "---"));
DESCR("center of");
-DATA(insert OID = 514 ( "*" PGNSP PGUID b f f 23 23 23 514 0 int4mul - - ));
+DATA(insert OID = 514 ( "*" PGNSP PGUID b f f 23 23 23 514 0 int4mul - - "---"));
DESCR("multiply");
-DATA(insert OID = 517 ( "<->" PGNSP PGUID b f f 600 600 701 517 0 point_distance - - ));
+DATA(insert OID = 517 ( "<->" PGNSP PGUID b f f 600 600 701 517 0 point_distance - - "---"));
DESCR("distance between");
-DATA(insert OID = 518 ( "<>" PGNSP PGUID b f f 23 23 16 518 96 int4ne neqsel neqjoinsel ));
+DATA(insert OID = 518 ( "<>" PGNSP PGUID b f f 23 23 16 518 96 int4ne neqsel neqjoinsel "mhf"));
DESCR("not equal");
-DATA(insert OID = 519 ( "<>" PGNSP PGUID b f f 21 21 16 519 94 int2ne neqsel neqjoinsel ));
+DATA(insert OID = 519 ( "<>" PGNSP PGUID b f f 21 21 16 519 94 int2ne neqsel neqjoinsel "mhf"));
DESCR("not equal");
-DATA(insert OID = 520 ( ">" PGNSP PGUID b f f 21 21 16 95 522 int2gt scalargtsel scalargtjoinsel ));
+DATA(insert OID = 520 ( ">" PGNSP PGUID b f f 21 21 16 95 522 int2gt scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than");
-DATA(insert OID = 521 ( ">" PGNSP PGUID b f f 23 23 16 97 523 int4gt scalargtsel scalargtjoinsel ));
+DATA(insert OID = 521 ( ">" PGNSP PGUID b f f 23 23 16 97 523 int4gt scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than");
-DATA(insert OID = 522 ( "<=" PGNSP PGUID b f f 21 21 16 524 520 int2le scalarltsel scalarltjoinsel ));
+DATA(insert OID = 522 ( "<=" PGNSP PGUID b f f 21 21 16 524 520 int2le scalarltsel scalarltjoinsel "mh-"));
DESCR("less than or equal");
-DATA(insert OID = 523 ( "<=" PGNSP PGUID b f f 23 23 16 525 521 int4le scalarltsel scalarltjoinsel ));
+DATA(insert OID = 523 ( "<=" PGNSP PGUID b f f 23 23 16 525 521 int4le scalarltsel scalarltjoinsel "mh-"));
DESCR("less than or equal");
-DATA(insert OID = 524 ( ">=" PGNSP PGUID b f f 21 21 16 522 95 int2ge scalargtsel scalargtjoinsel ));
+DATA(insert OID = 524 ( ">=" PGNSP PGUID b f f 21 21 16 522 95 int2ge scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than or equal");
-DATA(insert OID = 525 ( ">=" PGNSP PGUID b f f 23 23 16 523 97 int4ge scalargtsel scalargtjoinsel ));
+DATA(insert OID = 525 ( ">=" PGNSP PGUID b f f 23 23 16 523 97 int4ge scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than or equal");
-DATA(insert OID = 526 ( "*" PGNSP PGUID b f f 21 21 21 526 0 int2mul - - ));
+DATA(insert OID = 526 ( "*" PGNSP PGUID b f f 21 21 21 526 0 int2mul - - "---"));
DESCR("multiply");
-DATA(insert OID = 527 ( "/" PGNSP PGUID b f f 21 21 21 0 0 int2div - - ));
+DATA(insert OID = 527 ( "/" PGNSP PGUID b f f 21 21 21 0 0 int2div - - "---"));
DESCR("divide");
-DATA(insert OID = 528 ( "/" PGNSP PGUID b f f 23 23 23 0 0 int4div - - ));
+DATA(insert OID = 528 ( "/" PGNSP PGUID b f f 23 23 23 0 0 int4div - - "---"));
DESCR("divide");
-DATA(insert OID = 529 ( "%" PGNSP PGUID b f f 21 21 21 0 0 int2mod - - ));
+DATA(insert OID = 529 ( "%" PGNSP PGUID b f f 21 21 21 0 0 int2mod - - "---"));
DESCR("modulus");
-DATA(insert OID = 530 ( "%" PGNSP PGUID b f f 23 23 23 0 0 int4mod - - ));
+DATA(insert OID = 530 ( "%" PGNSP PGUID b f f 23 23 23 0 0 int4mod - - "---"));
DESCR("modulus");
-DATA(insert OID = 531 ( "<>" PGNSP PGUID b f f 25 25 16 531 98 textne neqsel neqjoinsel ));
+DATA(insert OID = 531 ( "<>" PGNSP PGUID b f f 25 25 16 531 98 textne neqsel neqjoinsel "mhf"));
DESCR("not equal");
-DATA(insert OID = 532 ( "=" PGNSP PGUID b t t 21 23 16 533 538 int24eq eqsel eqjoinsel ));
+DATA(insert OID = 532 ( "=" PGNSP PGUID b t t 21 23 16 533 538 int24eq eqsel eqjoinsel "mhf"));
DESCR("equal");
-DATA(insert OID = 533 ( "=" PGNSP PGUID b t t 23 21 16 532 539 int42eq eqsel eqjoinsel ));
+DATA(insert OID = 533 ( "=" PGNSP PGUID b t t 23 21 16 532 539 int42eq eqsel eqjoinsel "mhf"));
DESCR("equal");
-DATA(insert OID = 534 ( "<" PGNSP PGUID b f f 21 23 16 537 542 int24lt scalarltsel scalarltjoinsel ));
+DATA(insert OID = 534 ( "<" PGNSP PGUID b f f 21 23 16 537 542 int24lt scalarltsel scalarltjoinsel "mh-"));
DESCR("less than");
-DATA(insert OID = 535 ( "<" PGNSP PGUID b f f 23 21 16 536 543 int42lt scalarltsel scalarltjoinsel ));
+DATA(insert OID = 535 ( "<" PGNSP PGUID b f f 23 21 16 536 543 int42lt scalarltsel scalarltjoinsel "mh-"));
DESCR("less than");
-DATA(insert OID = 536 ( ">" PGNSP PGUID b f f 21 23 16 535 540 int24gt scalargtsel scalargtjoinsel ));
+DATA(insert OID = 536 ( ">" PGNSP PGUID b f f 21 23 16 535 540 int24gt scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than");
-DATA(insert OID = 537 ( ">" PGNSP PGUID b f f 23 21 16 534 541 int42gt scalargtsel scalargtjoinsel ));
+DATA(insert OID = 537 ( ">" PGNSP PGUID b f f 23 21 16 534 541 int42gt scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than");
-DATA(insert OID = 538 ( "<>" PGNSP PGUID b f f 21 23 16 539 532 int24ne neqsel neqjoinsel ));
+DATA(insert OID = 538 ( "<>" PGNSP PGUID b f f 21 23 16 539 532 int24ne neqsel neqjoinsel "mhf"));
DESCR("not equal");
-DATA(insert OID = 539 ( "<>" PGNSP PGUID b f f 23 21 16 538 533 int42ne neqsel neqjoinsel ));
+DATA(insert OID = 539 ( "<>" PGNSP PGUID b f f 23 21 16 538 533 int42ne neqsel neqjoinsel "mhf"));
DESCR("not equal");
-DATA(insert OID = 540 ( "<=" PGNSP PGUID b f f 21 23 16 543 536 int24le scalarltsel scalarltjoinsel ));
+DATA(insert OID = 540 ( "<=" PGNSP PGUID b f f 21 23 16 543 536 int24le scalarltsel scalarltjoinsel "mh-"));
DESCR("less than or equal");
-DATA(insert OID = 541 ( "<=" PGNSP PGUID b f f 23 21 16 542 537 int42le scalarltsel scalarltjoinsel ));
+DATA(insert OID = 541 ( "<=" PGNSP PGUID b f f 23 21 16 542 537 int42le scalarltsel scalarltjoinsel "mh-"));
DESCR("less than or equal");
-DATA(insert OID = 542 ( ">=" PGNSP PGUID b f f 21 23 16 541 534 int24ge scalargtsel scalargtjoinsel ));
+DATA(insert OID = 542 ( ">=" PGNSP PGUID b f f 21 23 16 541 534 int24ge scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than or equal");
-DATA(insert OID = 543 ( ">=" PGNSP PGUID b f f 23 21 16 540 535 int42ge scalargtsel scalargtjoinsel ));
+DATA(insert OID = 543 ( ">=" PGNSP PGUID b f f 23 21 16 540 535 int42ge scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than or equal");
-DATA(insert OID = 544 ( "*" PGNSP PGUID b f f 21 23 23 545 0 int24mul - - ));
+DATA(insert OID = 544 ( "*" PGNSP PGUID b f f 21 23 23 545 0 int24mul - - "---"));
DESCR("multiply");
-DATA(insert OID = 545 ( "*" PGNSP PGUID b f f 23 21 23 544 0 int42mul - - ));
+DATA(insert OID = 545 ( "*" PGNSP PGUID b f f 23 21 23 544 0 int42mul - - "---"));
DESCR("multiply");
-DATA(insert OID = 546 ( "/" PGNSP PGUID b f f 21 23 23 0 0 int24div - - ));
+DATA(insert OID = 546 ( "/" PGNSP PGUID b f f 21 23 23 0 0 int24div - - "---"));
DESCR("divide");
-DATA(insert OID = 547 ( "/" PGNSP PGUID b f f 23 21 23 0 0 int42div - - ));
+DATA(insert OID = 547 ( "/" PGNSP PGUID b f f 23 21 23 0 0 int42div - - "---"));
DESCR("divide");
-DATA(insert OID = 550 ( "+" PGNSP PGUID b f f 21 21 21 550 0 int2pl - - ));
+DATA(insert OID = 550 ( "+" PGNSP PGUID b f f 21 21 21 550 0 int2pl - - "---"));
DESCR("add");
-DATA(insert OID = 551 ( "+" PGNSP PGUID b f f 23 23 23 551 0 int4pl - - ));
+DATA(insert OID = 551 ( "+" PGNSP PGUID b f f 23 23 23 551 0 int4pl - - "---"));
DESCR("add");
-DATA(insert OID = 552 ( "+" PGNSP PGUID b f f 21 23 23 553 0 int24pl - - ));
+DATA(insert OID = 552 ( "+" PGNSP PGUID b f f 21 23 23 553 0 int24pl - - "---"));
DESCR("add");
-DATA(insert OID = 553 ( "+" PGNSP PGUID b f f 23 21 23 552 0 int42pl - - ));
+DATA(insert OID = 553 ( "+" PGNSP PGUID b f f 23 21 23 552 0 int42pl - - "---"));
DESCR("add");
-DATA(insert OID = 554 ( "-" PGNSP PGUID b f f 21 21 21 0 0 int2mi - - ));
+DATA(insert OID = 554 ( "-" PGNSP PGUID b f f 21 21 21 0 0 int2mi - - "---"));
DESCR("subtract");
-DATA(insert OID = 555 ( "-" PGNSP PGUID b f f 23 23 23 0 0 int4mi - - ));
+DATA(insert OID = 555 ( "-" PGNSP PGUID b f f 23 23 23 0 0 int4mi - - "---"));
DESCR("subtract");
-DATA(insert OID = 556 ( "-" PGNSP PGUID b f f 21 23 23 0 0 int24mi - - ));
+DATA(insert OID = 556 ( "-" PGNSP PGUID b f f 21 23 23 0 0 int24mi - - "---"));
DESCR("subtract");
-DATA(insert OID = 557 ( "-" PGNSP PGUID b f f 23 21 23 0 0 int42mi - - ));
+DATA(insert OID = 557 ( "-" PGNSP PGUID b f f 23 21 23 0 0 int42mi - - "---"));
DESCR("subtract");
-DATA(insert OID = 558 ( "-" PGNSP PGUID l f f 0 23 23 0 0 int4um - - ));
+DATA(insert OID = 558 ( "-" PGNSP PGUID l f f 0 23 23 0 0 int4um - - "---"));
DESCR("negate");
-DATA(insert OID = 559 ( "-" PGNSP PGUID l f f 0 21 21 0 0 int2um - - ));
+DATA(insert OID = 559 ( "-" PGNSP PGUID l f f 0 21 21 0 0 int2um - - "---"));
DESCR("negate");
-DATA(insert OID = 560 ( "=" PGNSP PGUID b t t 702 702 16 560 561 abstimeeq eqsel eqjoinsel ));
+DATA(insert OID = 560 ( "=" PGNSP PGUID b t t 702 702 16 560 561 abstimeeq eqsel eqjoinsel "mhf"));
DESCR("equal");
-DATA(insert OID = 561 ( "<>" PGNSP PGUID b f f 702 702 16 561 560 abstimene neqsel neqjoinsel ));
+DATA(insert OID = 561 ( "<>" PGNSP PGUID b f f 702 702 16 561 560 abstimene neqsel neqjoinsel "mhf"));
DESCR("not equal");
-DATA(insert OID = 562 ( "<" PGNSP PGUID b f f 702 702 16 563 565 abstimelt scalarltsel scalarltjoinsel ));
+DATA(insert OID = 562 ( "<" PGNSP PGUID b f f 702 702 16 563 565 abstimelt scalarltsel scalarltjoinsel "mh-"));
DESCR("less than");
-DATA(insert OID = 563 ( ">" PGNSP PGUID b f f 702 702 16 562 564 abstimegt scalargtsel scalargtjoinsel ));
+DATA(insert OID = 563 ( ">" PGNSP PGUID b f f 702 702 16 562 564 abstimegt scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than");
-DATA(insert OID = 564 ( "<=" PGNSP PGUID b f f 702 702 16 565 563 abstimele scalarltsel scalarltjoinsel ));
+DATA(insert OID = 564 ( "<=" PGNSP PGUID b f f 702 702 16 565 563 abstimele scalarltsel scalarltjoinsel "mh-"));
DESCR("less than or equal");
-DATA(insert OID = 565 ( ">=" PGNSP PGUID b f f 702 702 16 564 562 abstimege scalargtsel scalargtjoinsel ));
+DATA(insert OID = 565 ( ">=" PGNSP PGUID b f f 702 702 16 564 562 abstimege scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than or equal");
-DATA(insert OID = 566 ( "=" PGNSP PGUID b t t 703 703 16 566 567 reltimeeq eqsel eqjoinsel ));
+DATA(insert OID = 566 ( "=" PGNSP PGUID b t t 703 703 16 566 567 reltimeeq eqsel eqjoinsel "mhf"));
DESCR("equal");
-DATA(insert OID = 567 ( "<>" PGNSP PGUID b f f 703 703 16 567 566 reltimene neqsel neqjoinsel ));
+DATA(insert OID = 567 ( "<>" PGNSP PGUID b f f 703 703 16 567 566 reltimene neqsel neqjoinsel "mhf"));
DESCR("not equal");
-DATA(insert OID = 568 ( "<" PGNSP PGUID b f f 703 703 16 569 571 reltimelt scalarltsel scalarltjoinsel ));
+DATA(insert OID = 568 ( "<" PGNSP PGUID b f f 703 703 16 569 571 reltimelt scalarltsel scalarltjoinsel "mh-"));
DESCR("less than");
-DATA(insert OID = 569 ( ">" PGNSP PGUID b f f 703 703 16 568 570 reltimegt scalargtsel scalargtjoinsel ));
+DATA(insert OID = 569 ( ">" PGNSP PGUID b f f 703 703 16 568 570 reltimegt scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than");
-DATA(insert OID = 570 ( "<=" PGNSP PGUID b f f 703 703 16 571 569 reltimele scalarltsel scalarltjoinsel ));
+DATA(insert OID = 570 ( "<=" PGNSP PGUID b f f 703 703 16 571 569 reltimele scalarltsel scalarltjoinsel "mh-"));
DESCR("less than or equal");
-DATA(insert OID = 571 ( ">=" PGNSP PGUID b f f 703 703 16 570 568 reltimege scalargtsel scalargtjoinsel ));
+DATA(insert OID = 571 ( ">=" PGNSP PGUID b f f 703 703 16 570 568 reltimege scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than or equal");
-DATA(insert OID = 572 ( "~=" PGNSP PGUID b f f 704 704 16 572 0 tintervalsame eqsel eqjoinsel ));
+DATA(insert OID = 572 ( "~=" PGNSP PGUID b f f 704 704 16 572 0 tintervalsame eqsel eqjoinsel "mhf"));
DESCR("same as");
-DATA(insert OID = 573 ( "<<" PGNSP PGUID b f f 704 704 16 0 0 tintervalct - - ));
+DATA(insert OID = 573 ( "<<" PGNSP PGUID b f f 704 704 16 0 0 tintervalct - - "---"));
DESCR("contains");
-DATA(insert OID = 574 ( "&&" PGNSP PGUID b f f 704 704 16 574 0 tintervalov - - ));
+DATA(insert OID = 574 ( "&&" PGNSP PGUID b f f 704 704 16 574 0 tintervalov - - "---"));
DESCR("overlaps");
-DATA(insert OID = 575 ( "#=" PGNSP PGUID b f f 704 703 16 0 576 tintervalleneq - - ));
+DATA(insert OID = 575 ( "#=" PGNSP PGUID b f f 704 703 16 0 576 tintervalleneq - - "---"));
DESCR("equal by length");
-DATA(insert OID = 576 ( "#<>" PGNSP PGUID b f f 704 703 16 0 575 tintervallenne - - ));
+DATA(insert OID = 576 ( "#<>" PGNSP PGUID b f f 704 703 16 0 575 tintervallenne - - "---"));
DESCR("not equal by length");
-DATA(insert OID = 577 ( "#<" PGNSP PGUID b f f 704 703 16 0 580 tintervallenlt - - ));
+DATA(insert OID = 577 ( "#<" PGNSP PGUID b f f 704 703 16 0 580 tintervallenlt - - "---"));
DESCR("less than by length");
-DATA(insert OID = 578 ( "#>" PGNSP PGUID b f f 704 703 16 0 579 tintervallengt - - ));
+DATA(insert OID = 578 ( "#>" PGNSP PGUID b f f 704 703 16 0 579 tintervallengt - - "---"));
DESCR("greater than by length");
-DATA(insert OID = 579 ( "#<=" PGNSP PGUID b f f 704 703 16 0 578 tintervallenle - - ));
+DATA(insert OID = 579 ( "#<=" PGNSP PGUID b f f 704 703 16 0 578 tintervallenle - - "---"));
DESCR("less than or equal by length");
-DATA(insert OID = 580 ( "#>=" PGNSP PGUID b f f 704 703 16 0 577 tintervallenge - - ));
+DATA(insert OID = 580 ( "#>=" PGNSP PGUID b f f 704 703 16 0 577 tintervallenge - - "---"));
DESCR("greater than or equal by length");
-DATA(insert OID = 581 ( "+" PGNSP PGUID b f f 702 703 702 0 0 timepl - - ));
+DATA(insert OID = 581 ( "+" PGNSP PGUID b f f 702 703 702 0 0 timepl - - "---"));
DESCR("add");
-DATA(insert OID = 582 ( "-" PGNSP PGUID b f f 702 703 702 0 0 timemi - - ));
+DATA(insert OID = 582 ( "-" PGNSP PGUID b f f 702 703 702 0 0 timemi - - "---"));
DESCR("subtract");
-DATA(insert OID = 583 ( "<?>" PGNSP PGUID b f f 702 704 16 0 0 intinterval - - ));
+DATA(insert OID = 583 ( "<?>" PGNSP PGUID b f f 702 704 16 0 0 intinterval - - "---"));
DESCR("is contained by");
-DATA(insert OID = 584 ( "-" PGNSP PGUID l f f 0 700 700 0 0 float4um - - ));
+DATA(insert OID = 584 ( "-" PGNSP PGUID l f f 0 700 700 0 0 float4um - - "---"));
DESCR("negate");
-DATA(insert OID = 585 ( "-" PGNSP PGUID l f f 0 701 701 0 0 float8um - - ));
+DATA(insert OID = 585 ( "-" PGNSP PGUID l f f 0 701 701 0 0 float8um - - "---"));
DESCR("negate");
-DATA(insert OID = 586 ( "+" PGNSP PGUID b f f 700 700 700 586 0 float4pl - - ));
+DATA(insert OID = 586 ( "+" PGNSP PGUID b f f 700 700 700 586 0 float4pl - - "---"));
DESCR("add");
-DATA(insert OID = 587 ( "-" PGNSP PGUID b f f 700 700 700 0 0 float4mi - - ));
+DATA(insert OID = 587 ( "-" PGNSP PGUID b f f 700 700 700 0 0 float4mi - - "---"));
DESCR("subtract");
-DATA(insert OID = 588 ( "/" PGNSP PGUID b f f 700 700 700 0 0 float4div - - ));
+DATA(insert OID = 588 ( "/" PGNSP PGUID b f f 700 700 700 0 0 float4div - - "---"));
DESCR("divide");
-DATA(insert OID = 589 ( "*" PGNSP PGUID b f f 700 700 700 589 0 float4mul - - ));
+DATA(insert OID = 589 ( "*" PGNSP PGUID b f f 700 700 700 589 0 float4mul - - "---"));
DESCR("multiply");
-DATA(insert OID = 590 ( "@" PGNSP PGUID l f f 0 700 700 0 0 float4abs - - ));
+DATA(insert OID = 590 ( "@" PGNSP PGUID l f f 0 700 700 0 0 float4abs - - "---"));
DESCR("absolute value");
-DATA(insert OID = 591 ( "+" PGNSP PGUID b f f 701 701 701 591 0 float8pl - - ));
+DATA(insert OID = 591 ( "+" PGNSP PGUID b f f 701 701 701 591 0 float8pl - - "---"));
DESCR("add");
-DATA(insert OID = 592 ( "-" PGNSP PGUID b f f 701 701 701 0 0 float8mi - - ));
+DATA(insert OID = 592 ( "-" PGNSP PGUID b f f 701 701 701 0 0 float8mi - - "---"));
DESCR("subtract");
-DATA(insert OID = 593 ( "/" PGNSP PGUID b f f 701 701 701 0 0 float8div - - ));
+DATA(insert OID = 593 ( "/" PGNSP PGUID b f f 701 701 701 0 0 float8div - - "---"));
DESCR("divide");
-DATA(insert OID = 594 ( "*" PGNSP PGUID b f f 701 701 701 594 0 float8mul - - ));
+DATA(insert OID = 594 ( "*" PGNSP PGUID b f f 701 701 701 594 0 float8mul - - "---"));
DESCR("multiply");
-DATA(insert OID = 595 ( "@" PGNSP PGUID l f f 0 701 701 0 0 float8abs - - ));
+DATA(insert OID = 595 ( "@" PGNSP PGUID l f f 0 701 701 0 0 float8abs - - "---"));
DESCR("absolute value");
-DATA(insert OID = 596 ( "|/" PGNSP PGUID l f f 0 701 701 0 0 dsqrt - - ));
+DATA(insert OID = 596 ( "|/" PGNSP PGUID l f f 0 701 701 0 0 dsqrt - - "---"));
DESCR("square root");
-DATA(insert OID = 597 ( "||/" PGNSP PGUID l f f 0 701 701 0 0 dcbrt - - ));
+DATA(insert OID = 597 ( "||/" PGNSP PGUID l f f 0 701 701 0 0 dcbrt - - "---"));
DESCR("cube root");
-DATA(insert OID = 1284 ( "|" PGNSP PGUID l f f 0 704 702 0 0 tintervalstart - - ));
+DATA(insert OID = 1284 ( "|" PGNSP PGUID l f f 0 704 702 0 0 tintervalstart - - "---"));
DESCR("start of interval");
-DATA(insert OID = 606 ( "<#>" PGNSP PGUID b f f 702 702 704 0 0 mktinterval - - ));
+DATA(insert OID = 606 ( "<#>" PGNSP PGUID b f f 702 702 704 0 0 mktinterval - - "---"));
DESCR("convert to tinterval");
-DATA(insert OID = 607 ( "=" PGNSP PGUID b t t 26 26 16 607 608 oideq eqsel eqjoinsel ));
+DATA(insert OID = 607 ( "=" PGNSP PGUID b t t 26 26 16 607 608 oideq eqsel eqjoinsel "mhf"));
DESCR("equal");
-DATA(insert OID = 608 ( "<>" PGNSP PGUID b f f 26 26 16 608 607 oidne neqsel neqjoinsel ));
+DATA(insert OID = 608 ( "<>" PGNSP PGUID b f f 26 26 16 608 607 oidne neqsel neqjoinsel "mhf"));
DESCR("not equal");
-DATA(insert OID = 609 ( "<" PGNSP PGUID b f f 26 26 16 610 612 oidlt scalarltsel scalarltjoinsel ));
+DATA(insert OID = 609 ( "<" PGNSP PGUID b f f 26 26 16 610 612 oidlt scalarltsel scalarltjoinsel "mh-"));
DESCR("less than");
-DATA(insert OID = 610 ( ">" PGNSP PGUID b f f 26 26 16 609 611 oidgt scalargtsel scalargtjoinsel ));
+DATA(insert OID = 610 ( ">" PGNSP PGUID b f f 26 26 16 609 611 oidgt scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than");
-DATA(insert OID = 611 ( "<=" PGNSP PGUID b f f 26 26 16 612 610 oidle scalarltsel scalarltjoinsel ));
+DATA(insert OID = 611 ( "<=" PGNSP PGUID b f f 26 26 16 612 610 oidle scalarltsel scalarltjoinsel "mh-"));
DESCR("less than or equal");
-DATA(insert OID = 612 ( ">=" PGNSP PGUID b f f 26 26 16 611 609 oidge scalargtsel scalargtjoinsel ));
+DATA(insert OID = 612 ( ">=" PGNSP PGUID b f f 26 26 16 611 609 oidge scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than or equal");
-DATA(insert OID = 644 ( "<>" PGNSP PGUID b f f 30 30 16 644 649 oidvectorne neqsel neqjoinsel ));
+DATA(insert OID = 644 ( "<>" PGNSP PGUID b f f 30 30 16 644 649 oidvectorne neqsel neqjoinsel "mhf"));
DESCR("not equal");
-DATA(insert OID = 645 ( "<" PGNSP PGUID b f f 30 30 16 646 648 oidvectorlt scalarltsel scalarltjoinsel ));
+DATA(insert OID = 645 ( "<" PGNSP PGUID b f f 30 30 16 646 648 oidvectorlt scalarltsel scalarltjoinsel "mh-"));
DESCR("less than");
-DATA(insert OID = 646 ( ">" PGNSP PGUID b f f 30 30 16 645 647 oidvectorgt scalargtsel scalargtjoinsel ));
+DATA(insert OID = 646 ( ">" PGNSP PGUID b f f 30 30 16 645 647 oidvectorgt scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than");
-DATA(insert OID = 647 ( "<=" PGNSP PGUID b f f 30 30 16 648 646 oidvectorle scalarltsel scalarltjoinsel ));
+DATA(insert OID = 647 ( "<=" PGNSP PGUID b f f 30 30 16 648 646 oidvectorle scalarltsel scalarltjoinsel "mh-"));
DESCR("less than or equal");
-DATA(insert OID = 648 ( ">=" PGNSP PGUID b f f 30 30 16 647 645 oidvectorge scalargtsel scalargtjoinsel ));
+DATA(insert OID = 648 ( ">=" PGNSP PGUID b f f 30 30 16 647 645 oidvectorge scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than or equal");
-DATA(insert OID = 649 ( "=" PGNSP PGUID b t t 30 30 16 649 644 oidvectoreq eqsel eqjoinsel ));
+DATA(insert OID = 649 ( "=" PGNSP PGUID b t t 30 30 16 649 644 oidvectoreq eqsel eqjoinsel "mhf"));
DESCR("equal");
-DATA(insert OID = 613 ( "<->" PGNSP PGUID b f f 600 628 701 0 0 dist_pl - - ));
+DATA(insert OID = 613 ( "<->" PGNSP PGUID b f f 600 628 701 0 0 dist_pl - - "---"));
DESCR("distance between");
-DATA(insert OID = 614 ( "<->" PGNSP PGUID b f f 600 601 701 0 0 dist_ps - - ));
+DATA(insert OID = 614 ( "<->" PGNSP PGUID b f f 600 601 701 0 0 dist_ps - - "---"));
DESCR("distance between");
-DATA(insert OID = 615 ( "<->" PGNSP PGUID b f f 600 603 701 0 0 dist_pb - - ));
+DATA(insert OID = 615 ( "<->" PGNSP PGUID b f f 600 603 701 0 0 dist_pb - - "---"));
DESCR("distance between");
-DATA(insert OID = 616 ( "<->" PGNSP PGUID b f f 601 628 701 0 0 dist_sl - - ));
+DATA(insert OID = 616 ( "<->" PGNSP PGUID b f f 601 628 701 0 0 dist_sl - - "---"));
DESCR("distance between");
-DATA(insert OID = 617 ( "<->" PGNSP PGUID b f f 601 603 701 0 0 dist_sb - - ));
+DATA(insert OID = 617 ( "<->" PGNSP PGUID b f f 601 603 701 0 0 dist_sb - - "---"));
DESCR("distance between");
-DATA(insert OID = 618 ( "<->" PGNSP PGUID b f f 600 602 701 0 0 dist_ppath - - ));
+DATA(insert OID = 618 ( "<->" PGNSP PGUID b f f 600 602 701 0 0 dist_ppath - - "---"));
DESCR("distance between");
-DATA(insert OID = 620 ( "=" PGNSP PGUID b t t 700 700 16 620 621 float4eq eqsel eqjoinsel ));
+DATA(insert OID = 620 ( "=" PGNSP PGUID b t t 700 700 16 620 621 float4eq eqsel eqjoinsel "mhf"));
DESCR("equal");
-DATA(insert OID = 621 ( "<>" PGNSP PGUID b f f 700 700 16 621 620 float4ne neqsel neqjoinsel ));
+DATA(insert OID = 621 ( "<>" PGNSP PGUID b f f 700 700 16 621 620 float4ne neqsel neqjoinsel "mhf"));
DESCR("not equal");
-DATA(insert OID = 622 ( "<" PGNSP PGUID b f f 700 700 16 623 625 float4lt scalarltsel scalarltjoinsel ));
+DATA(insert OID = 622 ( "<" PGNSP PGUID b f f 700 700 16 623 625 float4lt scalarltsel scalarltjoinsel "mh-"));
DESCR("less than");
-DATA(insert OID = 623 ( ">" PGNSP PGUID b f f 700 700 16 622 624 float4gt scalargtsel scalargtjoinsel ));
+DATA(insert OID = 623 ( ">" PGNSP PGUID b f f 700 700 16 622 624 float4gt scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than");
-DATA(insert OID = 624 ( "<=" PGNSP PGUID b f f 700 700 16 625 623 float4le scalarltsel scalarltjoinsel ));
+DATA(insert OID = 624 ( "<=" PGNSP PGUID b f f 700 700 16 625 623 float4le scalarltsel scalarltjoinsel "mh-"));
DESCR("less than or equal");
-DATA(insert OID = 625 ( ">=" PGNSP PGUID b f f 700 700 16 624 622 float4ge scalargtsel scalargtjoinsel ));
+DATA(insert OID = 625 ( ">=" PGNSP PGUID b f f 700 700 16 624 622 float4ge scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than or equal");
-DATA(insert OID = 630 ( "<>" PGNSP PGUID b f f 18 18 16 630 92 charne neqsel neqjoinsel ));
+DATA(insert OID = 630 ( "<>" PGNSP PGUID b f f 18 18 16 630 92 charne neqsel neqjoinsel "mhf"));
DESCR("not equal");
-DATA(insert OID = 631 ( "<" PGNSP PGUID b f f 18 18 16 633 634 charlt scalarltsel scalarltjoinsel ));
+DATA(insert OID = 631 ( "<" PGNSP PGUID b f f 18 18 16 633 634 charlt scalarltsel scalarltjoinsel "mh-"));
DESCR("less than");
-DATA(insert OID = 632 ( "<=" PGNSP PGUID b f f 18 18 16 634 633 charle scalarltsel scalarltjoinsel ));
+DATA(insert OID = 632 ( "<=" PGNSP PGUID b f f 18 18 16 634 633 charle scalarltsel scalarltjoinsel "mh-"));
DESCR("less than or equal");
-DATA(insert OID = 633 ( ">" PGNSP PGUID b f f 18 18 16 631 632 chargt scalargtsel scalargtjoinsel ));
+DATA(insert OID = 633 ( ">" PGNSP PGUID b f f 18 18 16 631 632 chargt scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than");
-DATA(insert OID = 634 ( ">=" PGNSP PGUID b f f 18 18 16 632 631 charge scalargtsel scalargtjoinsel ));
+DATA(insert OID = 634 ( ">=" PGNSP PGUID b f f 18 18 16 632 631 charge scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than or equal");
-DATA(insert OID = 639 ( "~" PGNSP PGUID b f f 19 25 16 0 640 nameregexeq regexeqsel regexeqjoinsel ));
+DATA(insert OID = 639 ( "~" PGNSP PGUID b f f 19 25 16 0 640 nameregexeq regexeqsel regexeqjoinsel "mhf"));
DESCR("matches regular expression, case-sensitive");
#define OID_NAME_REGEXEQ_OP 639
-DATA(insert OID = 640 ( "!~" PGNSP PGUID b f f 19 25 16 0 639 nameregexne regexnesel regexnejoinsel ));
+DATA(insert OID = 640 ( "!~" PGNSP PGUID b f f 19 25 16 0 639 nameregexne regexnesel regexnejoinsel "---"));
DESCR("does not match regular expression, case-sensitive");
-DATA(insert OID = 641 ( "~" PGNSP PGUID b f f 25 25 16 0 642 textregexeq regexeqsel regexeqjoinsel ));
+DATA(insert OID = 641 ( "~" PGNSP PGUID b f f 25 25 16 0 642 textregexeq regexeqsel regexeqjoinsel "mhf"));
DESCR("matches regular expression, case-sensitive");
#define OID_TEXT_REGEXEQ_OP 641
-DATA(insert OID = 642 ( "!~" PGNSP PGUID b f f 25 25 16 0 641 textregexne regexnesel regexnejoinsel ));
+DATA(insert OID = 642 ( "!~" PGNSP PGUID b f f 25 25 16 0 641 textregexne regexnesel regexnejoinsel "---"));
DESCR("does not match regular expression, case-sensitive");
-DATA(insert OID = 643 ( "<>" PGNSP PGUID b f f 19 19 16 643 93 namene neqsel neqjoinsel ));
+DATA(insert OID = 643 ( "<>" PGNSP PGUID b f f 19 19 16 643 93 namene neqsel neqjoinsel "mhf"));
DESCR("not equal");
-DATA(insert OID = 654 ( "||" PGNSP PGUID b f f 25 25 25 0 0 textcat - - ));
+DATA(insert OID = 654 ( "||" PGNSP PGUID b f f 25 25 25 0 0 textcat - - "---"));
DESCR("concatenate");
-DATA(insert OID = 660 ( "<" PGNSP PGUID b f f 19 19 16 662 663 namelt scalarltsel scalarltjoinsel ));
+DATA(insert OID = 660 ( "<" PGNSP PGUID b f f 19 19 16 662 663 namelt scalarltsel scalarltjoinsel "mh-"));
DESCR("less than");
-DATA(insert OID = 661 ( "<=" PGNSP PGUID b f f 19 19 16 663 662 namele scalarltsel scalarltjoinsel ));
+DATA(insert OID = 661 ( "<=" PGNSP PGUID b f f 19 19 16 663 662 namele scalarltsel scalarltjoinsel "mh-"));
DESCR("less than or equal");
-DATA(insert OID = 662 ( ">" PGNSP PGUID b f f 19 19 16 660 661 namegt scalargtsel scalargtjoinsel ));
+DATA(insert OID = 662 ( ">" PGNSP PGUID b f f 19 19 16 660 661 namegt scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than");
-DATA(insert OID = 663 ( ">=" PGNSP PGUID b f f 19 19 16 661 660 namege scalargtsel scalargtjoinsel ));
+DATA(insert OID = 663 ( ">=" PGNSP PGUID b f f 19 19 16 661 660 namege scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than or equal");
-DATA(insert OID = 664 ( "<" PGNSP PGUID b f f 25 25 16 666 667 text_lt scalarltsel scalarltjoinsel ));
+DATA(insert OID = 664 ( "<" PGNSP PGUID b f f 25 25 16 666 667 text_lt scalarltsel scalarltjoinsel "mh-"));
DESCR("less than");
-DATA(insert OID = 665 ( "<=" PGNSP PGUID b f f 25 25 16 667 666 text_le scalarltsel scalarltjoinsel ));
+DATA(insert OID = 665 ( "<=" PGNSP PGUID b f f 25 25 16 667 666 text_le scalarltsel scalarltjoinsel "mh-"));
DESCR("less than or equal");
-DATA(insert OID = 666 ( ">" PGNSP PGUID b f f 25 25 16 664 665 text_gt scalargtsel scalargtjoinsel ));
+DATA(insert OID = 666 ( ">" PGNSP PGUID b f f 25 25 16 664 665 text_gt scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than");
-DATA(insert OID = 667 ( ">=" PGNSP PGUID b f f 25 25 16 665 664 text_ge scalargtsel scalargtjoinsel ));
+DATA(insert OID = 667 ( ">=" PGNSP PGUID b f f 25 25 16 665 664 text_ge scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than or equal");
-DATA(insert OID = 670 ( "=" PGNSP PGUID b t t 701 701 16 670 671 float8eq eqsel eqjoinsel ));
+DATA(insert OID = 670 ( "=" PGNSP PGUID b t t 701 701 16 670 671 float8eq eqsel eqjoinsel "mhf"));
DESCR("equal");
-DATA(insert OID = 671 ( "<>" PGNSP PGUID b f f 701 701 16 671 670 float8ne neqsel neqjoinsel ));
+DATA(insert OID = 671 ( "<>" PGNSP PGUID b f f 701 701 16 671 670 float8ne neqsel neqjoinsel "mhf"));
DESCR("not equal");
-DATA(insert OID = 672 ( "<" PGNSP PGUID b f f 701 701 16 674 675 float8lt scalarltsel scalarltjoinsel ));
+DATA(insert OID = 672 ( "<" PGNSP PGUID b f f 701 701 16 674 675 float8lt scalarltsel scalarltjoinsel "mh-"));
DESCR("less than");
#define Float8LessOperator 672
-DATA(insert OID = 673 ( "<=" PGNSP PGUID b f f 701 701 16 675 674 float8le scalarltsel scalarltjoinsel ));
+DATA(insert OID = 673 ( "<=" PGNSP PGUID b f f 701 701 16 675 674 float8le scalarltsel scalarltjoinsel "mh-"));
DESCR("less than or equal");
-DATA(insert OID = 674 ( ">" PGNSP PGUID b f f 701 701 16 672 673 float8gt scalargtsel scalargtjoinsel ));
+DATA(insert OID = 674 ( ">" PGNSP PGUID b f f 701 701 16 672 673 float8gt scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than");
-DATA(insert OID = 675 ( ">=" PGNSP PGUID b f f 701 701 16 673 672 float8ge scalargtsel scalargtjoinsel ));
+DATA(insert OID = 675 ( ">=" PGNSP PGUID b f f 701 701 16 673 672 float8ge scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than or equal");
-DATA(insert OID = 682 ( "@" PGNSP PGUID l f f 0 21 21 0 0 int2abs - - ));
+DATA(insert OID = 682 ( "@" PGNSP PGUID l f f 0 21 21 0 0 int2abs - - "---"));
DESCR("absolute value");
-DATA(insert OID = 684 ( "+" PGNSP PGUID b f f 20 20 20 684 0 int8pl - - ));
+DATA(insert OID = 684 ( "+" PGNSP PGUID b f f 20 20 20 684 0 int8pl - - "---"));
DESCR("add");
-DATA(insert OID = 685 ( "-" PGNSP PGUID b f f 20 20 20 0 0 int8mi - - ));
+DATA(insert OID = 685 ( "-" PGNSP PGUID b f f 20 20 20 0 0 int8mi - - "---"));
DESCR("subtract");
-DATA(insert OID = 686 ( "*" PGNSP PGUID b f f 20 20 20 686 0 int8mul - - ));
+DATA(insert OID = 686 ( "*" PGNSP PGUID b f f 20 20 20 686 0 int8mul - - "---"));
DESCR("multiply");
-DATA(insert OID = 687 ( "/" PGNSP PGUID b f f 20 20 20 0 0 int8div - - ));
+DATA(insert OID = 687 ( "/" PGNSP PGUID b f f 20 20 20 0 0 int8div - - "---"));
DESCR("divide");
-DATA(insert OID = 688 ( "+" PGNSP PGUID b f f 20 23 20 692 0 int84pl - - ));
+DATA(insert OID = 688 ( "+" PGNSP PGUID b f f 20 23 20 692 0 int84pl - - "---"));
DESCR("add");
-DATA(insert OID = 689 ( "-" PGNSP PGUID b f f 20 23 20 0 0 int84mi - - ));
+DATA(insert OID = 689 ( "-" PGNSP PGUID b f f 20 23 20 0 0 int84mi - - "---"));
DESCR("subtract");
-DATA(insert OID = 690 ( "*" PGNSP PGUID b f f 20 23 20 694 0 int84mul - - ));
+DATA(insert OID = 690 ( "*" PGNSP PGUID b f f 20 23 20 694 0 int84mul - - "---"));
DESCR("multiply");
-DATA(insert OID = 691 ( "/" PGNSP PGUID b f f 20 23 20 0 0 int84div - - ));
+DATA(insert OID = 691 ( "/" PGNSP PGUID b f f 20 23 20 0 0 int84div - - "---"));
DESCR("divide");
-DATA(insert OID = 692 ( "+" PGNSP PGUID b f f 23 20 20 688 0 int48pl - - ));
+DATA(insert OID = 692 ( "+" PGNSP PGUID b f f 23 20 20 688 0 int48pl - - "---"));
DESCR("add");
-DATA(insert OID = 693 ( "-" PGNSP PGUID b f f 23 20 20 0 0 int48mi - - ));
+DATA(insert OID = 693 ( "-" PGNSP PGUID b f f 23 20 20 0 0 int48mi - - "---"));
DESCR("subtract");
-DATA(insert OID = 694 ( "*" PGNSP PGUID b f f 23 20 20 690 0 int48mul - - ));
+DATA(insert OID = 694 ( "*" PGNSP PGUID b f f 23 20 20 690 0 int48mul - - "---"));
DESCR("multiply");
-DATA(insert OID = 695 ( "/" PGNSP PGUID b f f 23 20 20 0 0 int48div - - ));
+DATA(insert OID = 695 ( "/" PGNSP PGUID b f f 23 20 20 0 0 int48div - - "---"));
DESCR("divide");
-DATA(insert OID = 818 ( "+" PGNSP PGUID b f f 20 21 20 822 0 int82pl - - ));
+DATA(insert OID = 818 ( "+" PGNSP PGUID b f f 20 21 20 822 0 int82pl - - "---"));
DESCR("add");
-DATA(insert OID = 819 ( "-" PGNSP PGUID b f f 20 21 20 0 0 int82mi - - ));
+DATA(insert OID = 819 ( "-" PGNSP PGUID b f f 20 21 20 0 0 int82mi - - "---"));
DESCR("subtract");
-DATA(insert OID = 820 ( "*" PGNSP PGUID b f f 20 21 20 824 0 int82mul - - ));
+DATA(insert OID = 820 ( "*" PGNSP PGUID b f f 20 21 20 824 0 int82mul - - "---"));
DESCR("multiply");
-DATA(insert OID = 821 ( "/" PGNSP PGUID b f f 20 21 20 0 0 int82div - - ));
+DATA(insert OID = 821 ( "/" PGNSP PGUID b f f 20 21 20 0 0 int82div - - "---"));
DESCR("divide");
-DATA(insert OID = 822 ( "+" PGNSP PGUID b f f 21 20 20 818 0 int28pl - - ));
+DATA(insert OID = 822 ( "+" PGNSP PGUID b f f 21 20 20 818 0 int28pl - - "---"));
DESCR("add");
-DATA(insert OID = 823 ( "-" PGNSP PGUID b f f 21 20 20 0 0 int28mi - - ));
+DATA(insert OID = 823 ( "-" PGNSP PGUID b f f 21 20 20 0 0 int28mi - - "---"));
DESCR("subtract");
-DATA(insert OID = 824 ( "*" PGNSP PGUID b f f 21 20 20 820 0 int28mul - - ));
+DATA(insert OID = 824 ( "*" PGNSP PGUID b f f 21 20 20 820 0 int28mul - - "---"));
DESCR("multiply");
-DATA(insert OID = 825 ( "/" PGNSP PGUID b f f 21 20 20 0 0 int28div - - ));
+DATA(insert OID = 825 ( "/" PGNSP PGUID b f f 21 20 20 0 0 int28div - - "---"));
DESCR("divide");
-DATA(insert OID = 706 ( "<->" PGNSP PGUID b f f 603 603 701 706 0 box_distance - - ));
+DATA(insert OID = 706 ( "<->" PGNSP PGUID b f f 603 603 701 706 0 box_distance - - "---"));
DESCR("distance between");
-DATA(insert OID = 707 ( "<->" PGNSP PGUID b f f 602 602 701 707 0 path_distance - - ));
+DATA(insert OID = 707 ( "<->" PGNSP PGUID b f f 602 602 701 707 0 path_distance - - "---"));
DESCR("distance between");
-DATA(insert OID = 708 ( "<->" PGNSP PGUID b f f 628 628 701 708 0 line_distance - - ));
+DATA(insert OID = 708 ( "<->" PGNSP PGUID b f f 628 628 701 708 0 line_distance - - "---"));
DESCR("distance between");
-DATA(insert OID = 709 ( "<->" PGNSP PGUID b f f 601 601 701 709 0 lseg_distance - - ));
+DATA(insert OID = 709 ( "<->" PGNSP PGUID b f f 601 601 701 709 0 lseg_distance - - "---"));
DESCR("distance between");
-DATA(insert OID = 712 ( "<->" PGNSP PGUID b f f 604 604 701 712 0 poly_distance - - ));
+DATA(insert OID = 712 ( "<->" PGNSP PGUID b f f 604 604 701 712 0 poly_distance - - "---"));
DESCR("distance between");
-DATA(insert OID = 713 ( "<>" PGNSP PGUID b f f 600 600 16 713 510 point_ne neqsel neqjoinsel ));
+DATA(insert OID = 713 ( "<>" PGNSP PGUID b f f 600 600 16 713 510 point_ne neqsel neqjoinsel "mhf"));
DESCR("not equal");
/* add translation/rotation/scaling operators for geometric types. - thomas 97/05/10 */
-DATA(insert OID = 731 ( "+" PGNSP PGUID b f f 600 600 600 731 0 point_add - - ));
+DATA(insert OID = 731 ( "+" PGNSP PGUID b f f 600 600 600 731 0 point_add - - "---"));
DESCR("add points (translate)");
-DATA(insert OID = 732 ( "-" PGNSP PGUID b f f 600 600 600 0 0 point_sub - - ));
+DATA(insert OID = 732 ( "-" PGNSP PGUID b f f 600 600 600 0 0 point_sub - - "---"));
DESCR("subtract points (translate)");
-DATA(insert OID = 733 ( "*" PGNSP PGUID b f f 600 600 600 733 0 point_mul - - ));
+DATA(insert OID = 733 ( "*" PGNSP PGUID b f f 600 600 600 733 0 point_mul - - "---"));
DESCR("multiply points (scale/rotate)");
-DATA(insert OID = 734 ( "/" PGNSP PGUID b f f 600 600 600 0 0 point_div - - ));
+DATA(insert OID = 734 ( "/" PGNSP PGUID b f f 600 600 600 0 0 point_div - - "---"));
DESCR("divide points (scale/rotate)");
-DATA(insert OID = 735 ( "+" PGNSP PGUID b f f 602 602 602 735 0 path_add - - ));
+DATA(insert OID = 735 ( "+" PGNSP PGUID b f f 602 602 602 735 0 path_add - - "---"));
DESCR("concatenate");
-DATA(insert OID = 736 ( "+" PGNSP PGUID b f f 602 600 602 0 0 path_add_pt - - ));
+DATA(insert OID = 736 ( "+" PGNSP PGUID b f f 602 600 602 0 0 path_add_pt - - "---"));
DESCR("add (translate path)");
-DATA(insert OID = 737 ( "-" PGNSP PGUID b f f 602 600 602 0 0 path_sub_pt - - ));
+DATA(insert OID = 737 ( "-" PGNSP PGUID b f f 602 600 602 0 0 path_sub_pt - - "---"));
DESCR("subtract (translate path)");
-DATA(insert OID = 738 ( "*" PGNSP PGUID b f f 602 600 602 0 0 path_mul_pt - - ));
+DATA(insert OID = 738 ( "*" PGNSP PGUID b f f 602 600 602 0 0 path_mul_pt - - "---"));
DESCR("multiply (rotate/scale path)");
-DATA(insert OID = 739 ( "/" PGNSP PGUID b f f 602 600 602 0 0 path_div_pt - - ));
+DATA(insert OID = 739 ( "/" PGNSP PGUID b f f 602 600 602 0 0 path_div_pt - - "---"));
DESCR("divide (rotate/scale path)");
-DATA(insert OID = 755 ( "@>" PGNSP PGUID b f f 602 600 16 512 0 path_contain_pt - - ));
+DATA(insert OID = 755 ( "@>" PGNSP PGUID b f f 602 600 16 512 0 path_contain_pt - - "---"));
DESCR("contains");
-DATA(insert OID = 756 ( "<@" PGNSP PGUID b f f 600 604 16 757 0 pt_contained_poly contsel contjoinsel ));
+DATA(insert OID = 756 ( "<@" PGNSP PGUID b f f 600 604 16 757 0 pt_contained_poly contsel contjoinsel "---"));
DESCR("is contained by");
-DATA(insert OID = 757 ( "@>" PGNSP PGUID b f f 604 600 16 756 0 poly_contain_pt contsel contjoinsel ));
+DATA(insert OID = 757 ( "@>" PGNSP PGUID b f f 604 600 16 756 0 poly_contain_pt contsel contjoinsel "---"));
DESCR("contains");
-DATA(insert OID = 758 ( "<@" PGNSP PGUID b f f 600 718 16 759 0 pt_contained_circle contsel contjoinsel ));
+DATA(insert OID = 758 ( "<@" PGNSP PGUID b f f 600 718 16 759 0 pt_contained_circle contsel contjoinsel "---"));
DESCR("is contained by");
-DATA(insert OID = 759 ( "@>" PGNSP PGUID b f f 718 600 16 758 0 circle_contain_pt contsel contjoinsel ));
+DATA(insert OID = 759 ( "@>" PGNSP PGUID b f f 718 600 16 758 0 circle_contain_pt contsel contjoinsel "---"));
DESCR("contains");
-DATA(insert OID = 773 ( "@" PGNSP PGUID l f f 0 23 23 0 0 int4abs - - ));
+DATA(insert OID = 773 ( "@" PGNSP PGUID l f f 0 23 23 0 0 int4abs - - "---"));
DESCR("absolute value");
/* additional operators for geometric types - thomas 1997-07-09 */
-DATA(insert OID = 792 ( "=" PGNSP PGUID b f f 602 602 16 792 0 path_n_eq eqsel eqjoinsel ));
+DATA(insert OID = 792 ( "=" PGNSP PGUID b f f 602 602 16 792 0 path_n_eq eqsel eqjoinsel "mhf"));
DESCR("equal");
-DATA(insert OID = 793 ( "<" PGNSP PGUID b f f 602 602 16 794 0 path_n_lt - - ));
+DATA(insert OID = 793 ( "<" PGNSP PGUID b f f 602 602 16 794 0 path_n_lt - - "---"));
DESCR("less than");
-DATA(insert OID = 794 ( ">" PGNSP PGUID b f f 602 602 16 793 0 path_n_gt - - ));
+DATA(insert OID = 794 ( ">" PGNSP PGUID b f f 602 602 16 793 0 path_n_gt - - "---"));
DESCR("greater than");
-DATA(insert OID = 795 ( "<=" PGNSP PGUID b f f 602 602 16 796 0 path_n_le - - ));
+DATA(insert OID = 795 ( "<=" PGNSP PGUID b f f 602 602 16 796 0 path_n_le - - "---"));
DESCR("less than or equal");
-DATA(insert OID = 796 ( ">=" PGNSP PGUID b f f 602 602 16 795 0 path_n_ge - - ));
+DATA(insert OID = 796 ( ">=" PGNSP PGUID b f f 602 602 16 795 0 path_n_ge - - "---"));
DESCR("greater than or equal");
-DATA(insert OID = 797 ( "#" PGNSP PGUID l f f 0 602 23 0 0 path_npoints - - ));
+DATA(insert OID = 797 ( "#" PGNSP PGUID l f f 0 602 23 0 0 path_npoints - - "---"));
DESCR("number of points");
-DATA(insert OID = 798 ( "?#" PGNSP PGUID b f f 602 602 16 0 0 path_inter - - ));
+DATA(insert OID = 798 ( "?#" PGNSP PGUID b f f 602 602 16 0 0 path_inter - - "---"));
DESCR("intersect");
-DATA(insert OID = 799 ( "@-@" PGNSP PGUID l f f 0 602 701 0 0 path_length - - ));
+DATA(insert OID = 799 ( "@-@" PGNSP PGUID l f f 0 602 701 0 0 path_length - - "---"));
DESCR("sum of path segment lengths");
-DATA(insert OID = 800 ( ">^" PGNSP PGUID b f f 603 603 16 0 0 box_above_eq positionsel positionjoinsel ));
+DATA(insert OID = 800 ( ">^" PGNSP PGUID b f f 603 603 16 0 0 box_above_eq positionsel positionjoinsel "---"));
DESCR("is above (allows touching)");
-DATA(insert OID = 801 ( "<^" PGNSP PGUID b f f 603 603 16 0 0 box_below_eq positionsel positionjoinsel ));
+DATA(insert OID = 801 ( "<^" PGNSP PGUID b f f 603 603 16 0 0 box_below_eq positionsel positionjoinsel "---"));
DESCR("is below (allows touching)");
-DATA(insert OID = 802 ( "?#" PGNSP PGUID b f f 603 603 16 0 0 box_overlap areasel areajoinsel ));
+DATA(insert OID = 802 ( "?#" PGNSP PGUID b f f 603 603 16 0 0 box_overlap areasel areajoinsel "---"));
DESCR("deprecated, use && instead");
-DATA(insert OID = 803 ( "#" PGNSP PGUID b f f 603 603 603 0 0 box_intersect - - ));
+DATA(insert OID = 803 ( "#" PGNSP PGUID b f f 603 603 603 0 0 box_intersect - - "---"));
DESCR("box intersection");
-DATA(insert OID = 804 ( "+" PGNSP PGUID b f f 603 600 603 0 0 box_add - - ));
+DATA(insert OID = 804 ( "+" PGNSP PGUID b f f 603 600 603 0 0 box_add - - "---"));
DESCR("add point to box (translate)");
-DATA(insert OID = 805 ( "-" PGNSP PGUID b f f 603 600 603 0 0 box_sub - - ));
+DATA(insert OID = 805 ( "-" PGNSP PGUID b f f 603 600 603 0 0 box_sub - - "---"));
DESCR("subtract point from box (translate)");
-DATA(insert OID = 806 ( "*" PGNSP PGUID b f f 603 600 603 0 0 box_mul - - ));
+DATA(insert OID = 806 ( "*" PGNSP PGUID b f f 603 600 603 0 0 box_mul - - "---"));
DESCR("multiply box by point (scale)");
-DATA(insert OID = 807 ( "/" PGNSP PGUID b f f 603 600 603 0 0 box_div - - ));
+DATA(insert OID = 807 ( "/" PGNSP PGUID b f f 603 600 603 0 0 box_div - - "---"));
DESCR("divide box by point (scale)");
-DATA(insert OID = 808 ( "?-" PGNSP PGUID b f f 600 600 16 808 0 point_horiz - - ));
+DATA(insert OID = 808 ( "?-" PGNSP PGUID b f f 600 600 16 808 0 point_horiz - - "---"));
DESCR("horizontally aligned");
-DATA(insert OID = 809 ( "?|" PGNSP PGUID b f f 600 600 16 809 0 point_vert - - ));
+DATA(insert OID = 809 ( "?|" PGNSP PGUID b f f 600 600 16 809 0 point_vert - - "---"));
DESCR("vertically aligned");
-DATA(insert OID = 811 ( "=" PGNSP PGUID b t f 704 704 16 811 812 tintervaleq eqsel eqjoinsel ));
+DATA(insert OID = 811 ( "=" PGNSP PGUID b t f 704 704 16 811 812 tintervaleq eqsel eqjoinsel "mhf"));
DESCR("equal");
-DATA(insert OID = 812 ( "<>" PGNSP PGUID b f f 704 704 16 812 811 tintervalne neqsel neqjoinsel ));
+DATA(insert OID = 812 ( "<>" PGNSP PGUID b f f 704 704 16 812 811 tintervalne neqsel neqjoinsel "mhf"));
DESCR("not equal");
-DATA(insert OID = 813 ( "<" PGNSP PGUID b f f 704 704 16 814 816 tintervallt scalarltsel scalarltjoinsel ));
+DATA(insert OID = 813 ( "<" PGNSP PGUID b f f 704 704 16 814 816 tintervallt scalarltsel scalarltjoinsel "mh-"));
DESCR("less than");
-DATA(insert OID = 814 ( ">" PGNSP PGUID b f f 704 704 16 813 815 tintervalgt scalargtsel scalargtjoinsel ));
+DATA(insert OID = 814 ( ">" PGNSP PGUID b f f 704 704 16 813 815 tintervalgt scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than");
-DATA(insert OID = 815 ( "<=" PGNSP PGUID b f f 704 704 16 816 814 tintervalle scalarltsel scalarltjoinsel ));
+DATA(insert OID = 815 ( "<=" PGNSP PGUID b f f 704 704 16 816 814 tintervalle scalarltsel scalarltjoinsel "mh-"));
DESCR("less than or equal");
-DATA(insert OID = 816 ( ">=" PGNSP PGUID b f f 704 704 16 815 813 tintervalge scalargtsel scalargtjoinsel ));
+DATA(insert OID = 816 ( ">=" PGNSP PGUID b f f 704 704 16 815 813 tintervalge scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than or equal");
-DATA(insert OID = 843 ( "*" PGNSP PGUID b f f 790 700 790 845 0 cash_mul_flt4 - - ));
+DATA(insert OID = 843 ( "*" PGNSP PGUID b f f 790 700 790 845 0 cash_mul_flt4 - - "---"));
DESCR("multiply");
-DATA(insert OID = 844 ( "/" PGNSP PGUID b f f 790 700 790 0 0 cash_div_flt4 - - ));
+DATA(insert OID = 844 ( "/" PGNSP PGUID b f f 790 700 790 0 0 cash_div_flt4 - - "---"));
DESCR("divide");
-DATA(insert OID = 845 ( "*" PGNSP PGUID b f f 700 790 790 843 0 flt4_mul_cash - - ));
+DATA(insert OID = 845 ( "*" PGNSP PGUID b f f 700 790 790 843 0 flt4_mul_cash - - "---"));
DESCR("multiply");
-DATA(insert OID = 900 ( "=" PGNSP PGUID b t f 790 790 16 900 901 cash_eq eqsel eqjoinsel ));
+DATA(insert OID = 900 ( "=" PGNSP PGUID b t f 790 790 16 900 901 cash_eq eqsel eqjoinsel "mhf"));
DESCR("equal");
-DATA(insert OID = 901 ( "<>" PGNSP PGUID b f f 790 790 16 901 900 cash_ne neqsel neqjoinsel ));
+DATA(insert OID = 901 ( "<>" PGNSP PGUID b f f 790 790 16 901 900 cash_ne neqsel neqjoinsel "mhf"));
DESCR("not equal");
-DATA(insert OID = 902 ( "<" PGNSP PGUID b f f 790 790 16 903 905 cash_lt scalarltsel scalarltjoinsel ));
+DATA(insert OID = 902 ( "<" PGNSP PGUID b f f 790 790 16 903 905 cash_lt scalarltsel scalarltjoinsel "mh-"));
DESCR("less than");
-DATA(insert OID = 903 ( ">" PGNSP PGUID b f f 790 790 16 902 904 cash_gt scalargtsel scalargtjoinsel ));
+DATA(insert OID = 903 ( ">" PGNSP PGUID b f f 790 790 16 902 904 cash_gt scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than");
-DATA(insert OID = 904 ( "<=" PGNSP PGUID b f f 790 790 16 905 903 cash_le scalarltsel scalarltjoinsel ));
+DATA(insert OID = 904 ( "<=" PGNSP PGUID b f f 790 790 16 905 903 cash_le scalarltsel scalarltjoinsel "mh-"));
DESCR("less than or equal");
-DATA(insert OID = 905 ( ">=" PGNSP PGUID b f f 790 790 16 904 902 cash_ge scalargtsel scalargtjoinsel ));
+DATA(insert OID = 905 ( ">=" PGNSP PGUID b f f 790 790 16 904 902 cash_ge scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than or equal");
-DATA(insert OID = 906 ( "+" PGNSP PGUID b f f 790 790 790 906 0 cash_pl - - ));
+DATA(insert OID = 906 ( "+" PGNSP PGUID b f f 790 790 790 906 0 cash_pl - - "---"));
DESCR("add");
-DATA(insert OID = 907 ( "-" PGNSP PGUID b f f 790 790 790 0 0 cash_mi - - ));
+DATA(insert OID = 907 ( "-" PGNSP PGUID b f f 790 790 790 0 0 cash_mi - - "---"));
DESCR("subtract");
-DATA(insert OID = 908 ( "*" PGNSP PGUID b f f 790 701 790 916 0 cash_mul_flt8 - - ));
+DATA(insert OID = 908 ( "*" PGNSP PGUID b f f 790 701 790 916 0 cash_mul_flt8 - - "---"));
DESCR("multiply");
-DATA(insert OID = 909 ( "/" PGNSP PGUID b f f 790 701 790 0 0 cash_div_flt8 - - ));
+DATA(insert OID = 909 ( "/" PGNSP PGUID b f f 790 701 790 0 0 cash_div_flt8 - - "---"));
DESCR("divide");
-DATA(insert OID = 912 ( "*" PGNSP PGUID b f f 790 23 790 917 0 cash_mul_int4 - - ));
+DATA(insert OID = 912 ( "*" PGNSP PGUID b f f 790 23 790 917 0 cash_mul_int4 - - "---"));
DESCR("multiply");
-DATA(insert OID = 913 ( "/" PGNSP PGUID b f f 790 23 790 0 0 cash_div_int4 - - ));
+DATA(insert OID = 913 ( "/" PGNSP PGUID b f f 790 23 790 0 0 cash_div_int4 - - "---"));
DESCR("divide");
-DATA(insert OID = 914 ( "*" PGNSP PGUID b f f 790 21 790 918 0 cash_mul_int2 - - ));
+DATA(insert OID = 914 ( "*" PGNSP PGUID b f f 790 21 790 918 0 cash_mul_int2 - - "---"));
DESCR("multiply");
-DATA(insert OID = 915 ( "/" PGNSP PGUID b f f 790 21 790 0 0 cash_div_int2 - - ));
+DATA(insert OID = 915 ( "/" PGNSP PGUID b f f 790 21 790 0 0 cash_div_int2 - - "---"));
DESCR("divide");
-DATA(insert OID = 916 ( "*" PGNSP PGUID b f f 701 790 790 908 0 flt8_mul_cash - - ));
+DATA(insert OID = 916 ( "*" PGNSP PGUID b f f 701 790 790 908 0 flt8_mul_cash - - "---"));
DESCR("multiply");
-DATA(insert OID = 917 ( "*" PGNSP PGUID b f f 23 790 790 912 0 int4_mul_cash - - ));
+DATA(insert OID = 917 ( "*" PGNSP PGUID b f f 23 790 790 912 0 int4_mul_cash - - "---"));
DESCR("multiply");
-DATA(insert OID = 918 ( "*" PGNSP PGUID b f f 21 790 790 914 0 int2_mul_cash - - ));
+DATA(insert OID = 918 ( "*" PGNSP PGUID b f f 21 790 790 914 0 int2_mul_cash - - "---"));
DESCR("multiply");
-DATA(insert OID = 3825 ( "/" PGNSP PGUID b f f 790 790 701 0 0 cash_div_cash - - ));
+DATA(insert OID = 3825 ( "/" PGNSP PGUID b f f 790 790 701 0 0 cash_div_cash - - "---"));
DESCR("divide");
-DATA(insert OID = 965 ( "^" PGNSP PGUID b f f 701 701 701 0 0 dpow - - ));
+DATA(insert OID = 965 ( "^" PGNSP PGUID b f f 701 701 701 0 0 dpow - - "---"));
DESCR("exponentiation");
-DATA(insert OID = 966 ( "+" PGNSP PGUID b f f 1034 1033 1034 0 0 aclinsert - - ));
+DATA(insert OID = 966 ( "+" PGNSP PGUID b f f 1034 1033 1034 0 0 aclinsert - - "---"));
DESCR("add/update ACL item");
-DATA(insert OID = 967 ( "-" PGNSP PGUID b f f 1034 1033 1034 0 0 aclremove - - ));
+DATA(insert OID = 967 ( "-" PGNSP PGUID b f f 1034 1033 1034 0 0 aclremove - - "---"));
DESCR("remove ACL item");
-DATA(insert OID = 968 ( "@>" PGNSP PGUID b f f 1034 1033 16 0 0 aclcontains - - ));
+DATA(insert OID = 968 ( "@>" PGNSP PGUID b f f 1034 1033 16 0 0 aclcontains - - "---"));
DESCR("contains");
-DATA(insert OID = 974 ( "=" PGNSP PGUID b f t 1033 1033 16 974 0 aclitemeq eqsel eqjoinsel ));
+DATA(insert OID = 974 ( "=" PGNSP PGUID b f t 1033 1033 16 974 0 aclitemeq eqsel eqjoinsel "mhf"));
DESCR("equal");
/* additional geometric operators - thomas 1997-07-09 */
-DATA(insert OID = 969 ( "@@" PGNSP PGUID l f f 0 601 600 0 0 lseg_center - - ));
+DATA(insert OID = 969 ( "@@" PGNSP PGUID l f f 0 601 600 0 0 lseg_center - - "---"));
DESCR("center of");
-DATA(insert OID = 970 ( "@@" PGNSP PGUID l f f 0 602 600 0 0 path_center - - ));
+DATA(insert OID = 970 ( "@@" PGNSP PGUID l f f 0 602 600 0 0 path_center - - "---"));
DESCR("center of");
-DATA(insert OID = 971 ( "@@" PGNSP PGUID l f f 0 604 600 0 0 poly_center - - ));
+DATA(insert OID = 971 ( "@@" PGNSP PGUID l f f 0 604 600 0 0 poly_center - - "---"));
DESCR("center of");
-DATA(insert OID = 1054 ( "=" PGNSP PGUID b t t 1042 1042 16 1054 1057 bpchareq eqsel eqjoinsel ));
+DATA(insert OID = 1054 ( "=" PGNSP PGUID b t t 1042 1042 16 1054 1057 bpchareq eqsel eqjoinsel "mhf"));
DESCR("equal");
-DATA(insert OID = 1055 ( "~" PGNSP PGUID b f f 1042 25 16 0 1056 bpcharregexeq regexeqsel regexeqjoinsel ));
+DATA(insert OID = 1055 ( "~" PGNSP PGUID b f f 1042 25 16 0 1056 bpcharregexeq regexeqsel regexeqjoinsel "mhf"));
DESCR("matches regular expression, case-sensitive");
#define OID_BPCHAR_REGEXEQ_OP 1055
-DATA(insert OID = 1056 ( "!~" PGNSP PGUID b f f 1042 25 16 0 1055 bpcharregexne regexnesel regexnejoinsel ));
+DATA(insert OID = 1056 ( "!~" PGNSP PGUID b f f 1042 25 16 0 1055 bpcharregexne regexnesel regexnejoinsel "---"));
DESCR("does not match regular expression, case-sensitive");
-DATA(insert OID = 1057 ( "<>" PGNSP PGUID b f f 1042 1042 16 1057 1054 bpcharne neqsel neqjoinsel ));
+DATA(insert OID = 1057 ( "<>" PGNSP PGUID b f f 1042 1042 16 1057 1054 bpcharne neqsel neqjoinsel "mhf"));
DESCR("not equal");
-DATA(insert OID = 1058 ( "<" PGNSP PGUID b f f 1042 1042 16 1060 1061 bpcharlt scalarltsel scalarltjoinsel ));
+DATA(insert OID = 1058 ( "<" PGNSP PGUID b f f 1042 1042 16 1060 1061 bpcharlt scalarltsel scalarltjoinsel "mh-"));
DESCR("less than");
-DATA(insert OID = 1059 ( "<=" PGNSP PGUID b f f 1042 1042 16 1061 1060 bpcharle scalarltsel scalarltjoinsel ));
+DATA(insert OID = 1059 ( "<=" PGNSP PGUID b f f 1042 1042 16 1061 1060 bpcharle scalarltsel scalarltjoinsel "mh-"));
DESCR("less than or equal");
-DATA(insert OID = 1060 ( ">" PGNSP PGUID b f f 1042 1042 16 1058 1059 bpchargt scalargtsel scalargtjoinsel ));
+DATA(insert OID = 1060 ( ">" PGNSP PGUID b f f 1042 1042 16 1058 1059 bpchargt scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than");
-DATA(insert OID = 1061 ( ">=" PGNSP PGUID b f f 1042 1042 16 1059 1058 bpcharge scalargtsel scalargtjoinsel ));
+DATA(insert OID = 1061 ( ">=" PGNSP PGUID b f f 1042 1042 16 1059 1058 bpcharge scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than or equal");
/* generic array comparison operators */
-DATA(insert OID = 1070 ( "=" PGNSP PGUID b t t 2277 2277 16 1070 1071 array_eq eqsel eqjoinsel ));
+DATA(insert OID = 1070 ( "=" PGNSP PGUID b t t 2277 2277 16 1070 1071 array_eq eqsel eqjoinsel "mhf"));
DESCR("equal");
#define ARRAY_EQ_OP 1070
-DATA(insert OID = 1071 ( "<>" PGNSP PGUID b f f 2277 2277 16 1071 1070 array_ne neqsel neqjoinsel ));
+DATA(insert OID = 1071 ( "<>" PGNSP PGUID b f f 2277 2277 16 1071 1070 array_ne neqsel neqjoinsel "mhf"));
DESCR("not equal");
-DATA(insert OID = 1072 ( "<" PGNSP PGUID b f f 2277 2277 16 1073 1075 array_lt scalarltsel scalarltjoinsel ));
+DATA(insert OID = 1072 ( "<" PGNSP PGUID b f f 2277 2277 16 1073 1075 array_lt scalarltsel scalarltjoinsel "mh-"));
DESCR("less than");
#define ARRAY_LT_OP 1072
-DATA(insert OID = 1073 ( ">" PGNSP PGUID b f f 2277 2277 16 1072 1074 array_gt scalargtsel scalargtjoinsel ));
+DATA(insert OID = 1073 ( ">" PGNSP PGUID b f f 2277 2277 16 1072 1074 array_gt scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than");
#define ARRAY_GT_OP 1073
-DATA(insert OID = 1074 ( "<=" PGNSP PGUID b f f 2277 2277 16 1075 1073 array_le scalarltsel scalarltjoinsel ));
+DATA(insert OID = 1074 ( "<=" PGNSP PGUID b f f 2277 2277 16 1075 1073 array_le scalarltsel scalarltjoinsel "mh-"));
DESCR("less than or equal");
-DATA(insert OID = 1075 ( ">=" PGNSP PGUID b f f 2277 2277 16 1074 1072 array_ge scalargtsel scalargtjoinsel ));
+DATA(insert OID = 1075 ( ">=" PGNSP PGUID b f f 2277 2277 16 1074 1072 array_ge scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than or equal");
/* date operators */
-DATA(insert OID = 1076 ( "+" PGNSP PGUID b f f 1082 1186 1114 2551 0 date_pl_interval - - ));
+DATA(insert OID = 1076 ( "+" PGNSP PGUID b f f 1082 1186 1114 2551 0 date_pl_interval - - "---"));
DESCR("add");
-DATA(insert OID = 1077 ( "-" PGNSP PGUID b f f 1082 1186 1114 0 0 date_mi_interval - - ));
+DATA(insert OID = 1077 ( "-" PGNSP PGUID b f f 1082 1186 1114 0 0 date_mi_interval - - "---"));
DESCR("subtract");
-DATA(insert OID = 1093 ( "=" PGNSP PGUID b t t 1082 1082 16 1093 1094 date_eq eqsel eqjoinsel ));
+DATA(insert OID = 1093 ( "=" PGNSP PGUID b t t 1082 1082 16 1093 1094 date_eq eqsel eqjoinsel "mhf"));
DESCR("equal");
-DATA(insert OID = 1094 ( "<>" PGNSP PGUID b f f 1082 1082 16 1094 1093 date_ne neqsel neqjoinsel ));
+DATA(insert OID = 1094 ( "<>" PGNSP PGUID b f f 1082 1082 16 1094 1093 date_ne neqsel neqjoinsel "mhf"));
DESCR("not equal");
-DATA(insert OID = 1095 ( "<" PGNSP PGUID b f f 1082 1082 16 1097 1098 date_lt scalarltsel scalarltjoinsel ));
+DATA(insert OID = 1095 ( "<" PGNSP PGUID b f f 1082 1082 16 1097 1098 date_lt scalarltsel scalarltjoinsel "mh-"));
DESCR("less than");
-DATA(insert OID = 1096 ( "<=" PGNSP PGUID b f f 1082 1082 16 1098 1097 date_le scalarltsel scalarltjoinsel ));
+DATA(insert OID = 1096 ( "<=" PGNSP PGUID b f f 1082 1082 16 1098 1097 date_le scalarltsel scalarltjoinsel "mh-"));
DESCR("less than or equal");
-DATA(insert OID = 1097 ( ">" PGNSP PGUID b f f 1082 1082 16 1095 1096 date_gt scalargtsel scalargtjoinsel ));
+DATA(insert OID = 1097 ( ">" PGNSP PGUID b f f 1082 1082 16 1095 1096 date_gt scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than");
-DATA(insert OID = 1098 ( ">=" PGNSP PGUID b f f 1082 1082 16 1096 1095 date_ge scalargtsel scalargtjoinsel ));
+DATA(insert OID = 1098 ( ">=" PGNSP PGUID b f f 1082 1082 16 1096 1095 date_ge scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than or equal");
-DATA(insert OID = 1099 ( "-" PGNSP PGUID b f f 1082 1082 23 0 0 date_mi - - ));
+DATA(insert OID = 1099 ( "-" PGNSP PGUID b f f 1082 1082 23 0 0 date_mi - - "---"));
DESCR("subtract");
-DATA(insert OID = 1100 ( "+" PGNSP PGUID b f f 1082 23 1082 2555 0 date_pli - - ));
+DATA(insert OID = 1100 ( "+" PGNSP PGUID b f f 1082 23 1082 2555 0 date_pli - - "---"));
DESCR("add");
-DATA(insert OID = 1101 ( "-" PGNSP PGUID b f f 1082 23 1082 0 0 date_mii - - ));
+DATA(insert OID = 1101 ( "-" PGNSP PGUID b f f 1082 23 1082 0 0 date_mii - - "---"));
DESCR("subtract");
/* time operators */
-DATA(insert OID = 1108 ( "=" PGNSP PGUID b t t 1083 1083 16 1108 1109 time_eq eqsel eqjoinsel ));
+DATA(insert OID = 1108 ( "=" PGNSP PGUID b t t 1083 1083 16 1108 1109 time_eq eqsel eqjoinsel "mhf"));
DESCR("equal");
-DATA(insert OID = 1109 ( "<>" PGNSP PGUID b f f 1083 1083 16 1109 1108 time_ne neqsel neqjoinsel ));
+DATA(insert OID = 1109 ( "<>" PGNSP PGUID b f f 1083 1083 16 1109 1108 time_ne neqsel neqjoinsel "mhf"));
DESCR("not equal");
-DATA(insert OID = 1110 ( "<" PGNSP PGUID b f f 1083 1083 16 1112 1113 time_lt scalarltsel scalarltjoinsel ));
+DATA(insert OID = 1110 ( "<" PGNSP PGUID b f f 1083 1083 16 1112 1113 time_lt scalarltsel scalarltjoinsel "mh-"));
DESCR("less than");
-DATA(insert OID = 1111 ( "<=" PGNSP PGUID b f f 1083 1083 16 1113 1112 time_le scalarltsel scalarltjoinsel ));
+DATA(insert OID = 1111 ( "<=" PGNSP PGUID b f f 1083 1083 16 1113 1112 time_le scalarltsel scalarltjoinsel "mh-"));
DESCR("less than or equal");
-DATA(insert OID = 1112 ( ">" PGNSP PGUID b f f 1083 1083 16 1110 1111 time_gt scalargtsel scalargtjoinsel ));
+DATA(insert OID = 1112 ( ">" PGNSP PGUID b f f 1083 1083 16 1110 1111 time_gt scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than");
-DATA(insert OID = 1113 ( ">=" PGNSP PGUID b f f 1083 1083 16 1111 1110 time_ge scalargtsel scalargtjoinsel ));
+DATA(insert OID = 1113 ( ">=" PGNSP PGUID b f f 1083 1083 16 1111 1110 time_ge scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than or equal");
/* timetz operators */
-DATA(insert OID = 1550 ( "=" PGNSP PGUID b t t 1266 1266 16 1550 1551 timetz_eq eqsel eqjoinsel ));
+DATA(insert OID = 1550 ( "=" PGNSP PGUID b t t 1266 1266 16 1550 1551 timetz_eq eqsel eqjoinsel "mhf"));
DESCR("equal");
-DATA(insert OID = 1551 ( "<>" PGNSP PGUID b f f 1266 1266 16 1551 1550 timetz_ne neqsel neqjoinsel ));
+DATA(insert OID = 1551 ( "<>" PGNSP PGUID b f f 1266 1266 16 1551 1550 timetz_ne neqsel neqjoinsel "mhf"));
DESCR("not equal");
-DATA(insert OID = 1552 ( "<" PGNSP PGUID b f f 1266 1266 16 1554 1555 timetz_lt scalarltsel scalarltjoinsel ));
+DATA(insert OID = 1552 ( "<" PGNSP PGUID b f f 1266 1266 16 1554 1555 timetz_lt scalarltsel scalarltjoinsel "mh-"));
DESCR("less than");
-DATA(insert OID = 1553 ( "<=" PGNSP PGUID b f f 1266 1266 16 1555 1554 timetz_le scalarltsel scalarltjoinsel ));
+DATA(insert OID = 1553 ( "<=" PGNSP PGUID b f f 1266 1266 16 1555 1554 timetz_le scalarltsel scalarltjoinsel "mh-"));
DESCR("less than or equal");
-DATA(insert OID = 1554 ( ">" PGNSP PGUID b f f 1266 1266 16 1552 1553 timetz_gt scalargtsel scalargtjoinsel ));
+DATA(insert OID = 1554 ( ">" PGNSP PGUID b f f 1266 1266 16 1552 1553 timetz_gt scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than");
-DATA(insert OID = 1555 ( ">=" PGNSP PGUID b f f 1266 1266 16 1553 1552 timetz_ge scalargtsel scalargtjoinsel ));
+DATA(insert OID = 1555 ( ">=" PGNSP PGUID b f f 1266 1266 16 1553 1552 timetz_ge scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than or equal");
/* float48 operators */
-DATA(insert OID = 1116 ( "+" PGNSP PGUID b f f 700 701 701 1126 0 float48pl - - ));
+DATA(insert OID = 1116 ( "+" PGNSP PGUID b f f 700 701 701 1126 0 float48pl - - "---"));
DESCR("add");
-DATA(insert OID = 1117 ( "-" PGNSP PGUID b f f 700 701 701 0 0 float48mi - - ));
+DATA(insert OID = 1117 ( "-" PGNSP PGUID b f f 700 701 701 0 0 float48mi - - "---"));
DESCR("subtract");
-DATA(insert OID = 1118 ( "/" PGNSP PGUID b f f 700 701 701 0 0 float48div - - ));
+DATA(insert OID = 1118 ( "/" PGNSP PGUID b f f 700 701 701 0 0 float48div - - "---"));
DESCR("divide");
-DATA(insert OID = 1119 ( "*" PGNSP PGUID b f f 700 701 701 1129 0 float48mul - - ));
+DATA(insert OID = 1119 ( "*" PGNSP PGUID b f f 700 701 701 1129 0 float48mul - - "---"));
DESCR("multiply");
-DATA(insert OID = 1120 ( "=" PGNSP PGUID b t t 700 701 16 1130 1121 float48eq eqsel eqjoinsel ));
+DATA(insert OID = 1120 ( "=" PGNSP PGUID b t t 700 701 16 1130 1121 float48eq eqsel eqjoinsel "mhf"));
DESCR("equal");
-DATA(insert OID = 1121 ( "<>" PGNSP PGUID b f f 700 701 16 1131 1120 float48ne neqsel neqjoinsel ));
+DATA(insert OID = 1121 ( "<>" PGNSP PGUID b f f 700 701 16 1131 1120 float48ne neqsel neqjoinsel "mhf"));
DESCR("not equal");
-DATA(insert OID = 1122 ( "<" PGNSP PGUID b f f 700 701 16 1133 1125 float48lt scalarltsel scalarltjoinsel ));
+DATA(insert OID = 1122 ( "<" PGNSP PGUID b f f 700 701 16 1133 1125 float48lt scalarltsel scalarltjoinsel "mh-"));
DESCR("less than");
-DATA(insert OID = 1123 ( ">" PGNSP PGUID b f f 700 701 16 1132 1124 float48gt scalargtsel scalargtjoinsel ));
+DATA(insert OID = 1123 ( ">" PGNSP PGUID b f f 700 701 16 1132 1124 float48gt scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than");
-DATA(insert OID = 1124 ( "<=" PGNSP PGUID b f f 700 701 16 1135 1123 float48le scalarltsel scalarltjoinsel ));
+DATA(insert OID = 1124 ( "<=" PGNSP PGUID b f f 700 701 16 1135 1123 float48le scalarltsel scalarltjoinsel "mh-"));
DESCR("less than or equal");
-DATA(insert OID = 1125 ( ">=" PGNSP PGUID b f f 700 701 16 1134 1122 float48ge scalargtsel scalargtjoinsel ));
+DATA(insert OID = 1125 ( ">=" PGNSP PGUID b f f 700 701 16 1134 1122 float48ge scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than or equal");
/* float84 operators */
-DATA(insert OID = 1126 ( "+" PGNSP PGUID b f f 701 700 701 1116 0 float84pl - - ));
+DATA(insert OID = 1126 ( "+" PGNSP PGUID b f f 701 700 701 1116 0 float84pl - - "---"));
DESCR("add");
-DATA(insert OID = 1127 ( "-" PGNSP PGUID b f f 701 700 701 0 0 float84mi - - ));
+DATA(insert OID = 1127 ( "-" PGNSP PGUID b f f 701 700 701 0 0 float84mi - - "---"));
DESCR("subtract");
-DATA(insert OID = 1128 ( "/" PGNSP PGUID b f f 701 700 701 0 0 float84div - - ));
+DATA(insert OID = 1128 ( "/" PGNSP PGUID b f f 701 700 701 0 0 float84div - - "---"));
DESCR("divide");
-DATA(insert OID = 1129 ( "*" PGNSP PGUID b f f 701 700 701 1119 0 float84mul - - ));
+DATA(insert OID = 1129 ( "*" PGNSP PGUID b f f 701 700 701 1119 0 float84mul - - "---"));
DESCR("multiply");
-DATA(insert OID = 1130 ( "=" PGNSP PGUID b t t 701 700 16 1120 1131 float84eq eqsel eqjoinsel ));
+DATA(insert OID = 1130 ( "=" PGNSP PGUID b t t 701 700 16 1120 1131 float84eq eqsel eqjoinsel "mhf"));
DESCR("equal");
-DATA(insert OID = 1131 ( "<>" PGNSP PGUID b f f 701 700 16 1121 1130 float84ne neqsel neqjoinsel ));
+DATA(insert OID = 1131 ( "<>" PGNSP PGUID b f f 701 700 16 1121 1130 float84ne neqsel neqjoinsel "mhf"));
DESCR("not equal");
-DATA(insert OID = 1132 ( "<" PGNSP PGUID b f f 701 700 16 1123 1135 float84lt scalarltsel scalarltjoinsel ));
+DATA(insert OID = 1132 ( "<" PGNSP PGUID b f f 701 700 16 1123 1135 float84lt scalarltsel scalarltjoinsel "mh-"));
DESCR("less than");
-DATA(insert OID = 1133 ( ">" PGNSP PGUID b f f 701 700 16 1122 1134 float84gt scalargtsel scalargtjoinsel ));
+DATA(insert OID = 1133 ( ">" PGNSP PGUID b f f 701 700 16 1122 1134 float84gt scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than");
-DATA(insert OID = 1134 ( "<=" PGNSP PGUID b f f 701 700 16 1125 1133 float84le scalarltsel scalarltjoinsel ));
+DATA(insert OID = 1134 ( "<=" PGNSP PGUID b f f 701 700 16 1125 1133 float84le scalarltsel scalarltjoinsel "mh-"));
DESCR("less than or equal");
-DATA(insert OID = 1135 ( ">=" PGNSP PGUID b f f 701 700 16 1124 1132 float84ge scalargtsel scalargtjoinsel ));
+DATA(insert OID = 1135 ( ">=" PGNSP PGUID b f f 701 700 16 1124 1132 float84ge scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than or equal");
/* LIKE hacks by Keith Parks. */
-DATA(insert OID = 1207 ( "~~" PGNSP PGUID b f f 19 25 16 0 1208 namelike likesel likejoinsel ));
+DATA(insert OID = 1207 ( "~~" PGNSP PGUID b f f 19 25 16 0 1208 namelike likesel likejoinsel "---"));
DESCR("matches LIKE expression");
#define OID_NAME_LIKE_OP 1207
-DATA(insert OID = 1208 ( "!~~" PGNSP PGUID b f f 19 25 16 0 1207 namenlike nlikesel nlikejoinsel ));
+DATA(insert OID = 1208 ( "!~~" PGNSP PGUID b f f 19 25 16 0 1207 namenlike nlikesel nlikejoinsel "---"));
DESCR("does not match LIKE expression");
-DATA(insert OID = 1209 ( "~~" PGNSP PGUID b f f 25 25 16 0 1210 textlike likesel likejoinsel ));
+DATA(insert OID = 1209 ( "~~" PGNSP PGUID b f f 25 25 16 0 1210 textlike likesel likejoinsel "---"));
DESCR("matches LIKE expression");
#define OID_TEXT_LIKE_OP 1209
-DATA(insert OID = 1210 ( "!~~" PGNSP PGUID b f f 25 25 16 0 1209 textnlike nlikesel nlikejoinsel ));
+DATA(insert OID = 1210 ( "!~~" PGNSP PGUID b f f 25 25 16 0 1209 textnlike nlikesel nlikejoinsel "---"));
DESCR("does not match LIKE expression");
-DATA(insert OID = 1211 ( "~~" PGNSP PGUID b f f 1042 25 16 0 1212 bpcharlike likesel likejoinsel ));
+DATA(insert OID = 1211 ( "~~" PGNSP PGUID b f f 1042 25 16 0 1212 bpcharlike likesel likejoinsel "---"));
DESCR("matches LIKE expression");
#define OID_BPCHAR_LIKE_OP 1211
-DATA(insert OID = 1212 ( "!~~" PGNSP PGUID b f f 1042 25 16 0 1211 bpcharnlike nlikesel nlikejoinsel ));
+DATA(insert OID = 1212 ( "!~~" PGNSP PGUID b f f 1042 25 16 0 1211 bpcharnlike nlikesel nlikejoinsel "---"));
DESCR("does not match LIKE expression");
/* case-insensitive regex hacks */
-DATA(insert OID = 1226 ( "~*" PGNSP PGUID b f f 19 25 16 0 1227 nameicregexeq icregexeqsel icregexeqjoinsel ));
+DATA(insert OID = 1226 ( "~*" PGNSP PGUID b f f 19 25 16 0 1227 nameicregexeq icregexeqsel icregexeqjoinsel "---"));
DESCR("matches regular expression, case-insensitive");
#define OID_NAME_ICREGEXEQ_OP 1226
-DATA(insert OID = 1227 ( "!~*" PGNSP PGUID b f f 19 25 16 0 1226 nameicregexne icregexnesel icregexnejoinsel ));
+DATA(insert OID = 1227 ( "!~*" PGNSP PGUID b f f 19 25 16 0 1226 nameicregexne icregexnesel icregexnejoinsel "---"));
DESCR("does not match regular expression, case-insensitive");
-DATA(insert OID = 1228 ( "~*" PGNSP PGUID b f f 25 25 16 0 1229 texticregexeq icregexeqsel icregexeqjoinsel ));
+DATA(insert OID = 1228 ( "~*" PGNSP PGUID b f f 25 25 16 0 1229 texticregexeq icregexeqsel icregexeqjoinsel "---"));
DESCR("matches regular expression, case-insensitive");
#define OID_TEXT_ICREGEXEQ_OP 1228
-DATA(insert OID = 1229 ( "!~*" PGNSP PGUID b f f 25 25 16 0 1228 texticregexne icregexnesel icregexnejoinsel ));
+DATA(insert OID = 1229 ( "!~*" PGNSP PGUID b f f 25 25 16 0 1228 texticregexne icregexnesel icregexnejoinsel "---"));
DESCR("does not match regular expression, case-insensitive");
-DATA(insert OID = 1234 ( "~*" PGNSP PGUID b f f 1042 25 16 0 1235 bpcharicregexeq icregexeqsel icregexeqjoinsel ));
+DATA(insert OID = 1234 ( "~*" PGNSP PGUID b f f 1042 25 16 0 1235 bpcharicregexeq icregexeqsel icregexeqjoinsel "---"));
DESCR("matches regular expression, case-insensitive");
#define OID_BPCHAR_ICREGEXEQ_OP 1234
-DATA(insert OID = 1235 ( "!~*" PGNSP PGUID b f f 1042 25 16 0 1234 bpcharicregexne icregexnesel icregexnejoinsel ));
+DATA(insert OID = 1235 ( "!~*" PGNSP PGUID b f f 1042 25 16 0 1234 bpcharicregexne icregexnesel icregexnejoinsel "---"));
DESCR("does not match regular expression, case-insensitive");
/* timestamptz operators */
-DATA(insert OID = 1320 ( "=" PGNSP PGUID b t t 1184 1184 16 1320 1321 timestamptz_eq eqsel eqjoinsel ));
+DATA(insert OID = 1320 ( "=" PGNSP PGUID b t t 1184 1184 16 1320 1321 timestamptz_eq eqsel eqjoinsel "mhf"));
DESCR("equal");
-DATA(insert OID = 1321 ( "<>" PGNSP PGUID b f f 1184 1184 16 1321 1320 timestamptz_ne neqsel neqjoinsel ));
+DATA(insert OID = 1321 ( "<>" PGNSP PGUID b f f 1184 1184 16 1321 1320 timestamptz_ne neqsel neqjoinsel "mhf"));
DESCR("not equal");
-DATA(insert OID = 1322 ( "<" PGNSP PGUID b f f 1184 1184 16 1324 1325 timestamptz_lt scalarltsel scalarltjoinsel ));
+DATA(insert OID = 1322 ( "<" PGNSP PGUID b f f 1184 1184 16 1324 1325 timestamptz_lt scalarltsel scalarltjoinsel "mh-"));
DESCR("less than");
-DATA(insert OID = 1323 ( "<=" PGNSP PGUID b f f 1184 1184 16 1325 1324 timestamptz_le scalarltsel scalarltjoinsel ));
+DATA(insert OID = 1323 ( "<=" PGNSP PGUID b f f 1184 1184 16 1325 1324 timestamptz_le scalarltsel scalarltjoinsel "mh-"));
DESCR("less than or equal");
-DATA(insert OID = 1324 ( ">" PGNSP PGUID b f f 1184 1184 16 1322 1323 timestamptz_gt scalargtsel scalargtjoinsel ));
+DATA(insert OID = 1324 ( ">" PGNSP PGUID b f f 1184 1184 16 1322 1323 timestamptz_gt scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than");
-DATA(insert OID = 1325 ( ">=" PGNSP PGUID b f f 1184 1184 16 1323 1322 timestamptz_ge scalargtsel scalargtjoinsel ));
+DATA(insert OID = 1325 ( ">=" PGNSP PGUID b f f 1184 1184 16 1323 1322 timestamptz_ge scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than or equal");
-DATA(insert OID = 1327 ( "+" PGNSP PGUID b f f 1184 1186 1184 2554 0 timestamptz_pl_interval - - ));
+DATA(insert OID = 1327 ( "+" PGNSP PGUID b f f 1184 1186 1184 2554 0 timestamptz_pl_interval - - "---"));
DESCR("add");
-DATA(insert OID = 1328 ( "-" PGNSP PGUID b f f 1184 1184 1186 0 0 timestamptz_mi - - ));
+DATA(insert OID = 1328 ( "-" PGNSP PGUID b f f 1184 1184 1186 0 0 timestamptz_mi - - "---"));
DESCR("subtract");
-DATA(insert OID = 1329 ( "-" PGNSP PGUID b f f 1184 1186 1184 0 0 timestamptz_mi_interval - - ));
+DATA(insert OID = 1329 ( "-" PGNSP PGUID b f f 1184 1186 1184 0 0 timestamptz_mi_interval - - "---"));
DESCR("subtract");
/* interval operators */
-DATA(insert OID = 1330 ( "=" PGNSP PGUID b t t 1186 1186 16 1330 1331 interval_eq eqsel eqjoinsel ));
+DATA(insert OID = 1330 ( "=" PGNSP PGUID b t t 1186 1186 16 1330 1331 interval_eq eqsel eqjoinsel "mhf"));
DESCR("equal");
-DATA(insert OID = 1331 ( "<>" PGNSP PGUID b f f 1186 1186 16 1331 1330 interval_ne neqsel neqjoinsel ));
+DATA(insert OID = 1331 ( "<>" PGNSP PGUID b f f 1186 1186 16 1331 1330 interval_ne neqsel neqjoinsel "mhf"));
DESCR("not equal");
-DATA(insert OID = 1332 ( "<" PGNSP PGUID b f f 1186 1186 16 1334 1335 interval_lt scalarltsel scalarltjoinsel ));
+DATA(insert OID = 1332 ( "<" PGNSP PGUID b f f 1186 1186 16 1334 1335 interval_lt scalarltsel scalarltjoinsel "mh-"));
DESCR("less than");
-DATA(insert OID = 1333 ( "<=" PGNSP PGUID b f f 1186 1186 16 1335 1334 interval_le scalarltsel scalarltjoinsel ));
+DATA(insert OID = 1333 ( "<=" PGNSP PGUID b f f 1186 1186 16 1335 1334 interval_le scalarltsel scalarltjoinsel "mh-"));
DESCR("less than or equal");
-DATA(insert OID = 1334 ( ">" PGNSP PGUID b f f 1186 1186 16 1332 1333 interval_gt scalargtsel scalargtjoinsel ));
+DATA(insert OID = 1334 ( ">" PGNSP PGUID b f f 1186 1186 16 1332 1333 interval_gt scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than");
-DATA(insert OID = 1335 ( ">=" PGNSP PGUID b f f 1186 1186 16 1333 1332 interval_ge scalargtsel scalargtjoinsel ));
+DATA(insert OID = 1335 ( ">=" PGNSP PGUID b f f 1186 1186 16 1333 1332 interval_ge scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than or equal");
-DATA(insert OID = 1336 ( "-" PGNSP PGUID l f f 0 1186 1186 0 0 interval_um - - ));
+DATA(insert OID = 1336 ( "-" PGNSP PGUID l f f 0 1186 1186 0 0 interval_um - - "---"));
DESCR("negate");
-DATA(insert OID = 1337 ( "+" PGNSP PGUID b f f 1186 1186 1186 1337 0 interval_pl - - ));
+DATA(insert OID = 1337 ( "+" PGNSP PGUID b f f 1186 1186 1186 1337 0 interval_pl - - "---"));
DESCR("add");
-DATA(insert OID = 1338 ( "-" PGNSP PGUID b f f 1186 1186 1186 0 0 interval_mi - - ));
+DATA(insert OID = 1338 ( "-" PGNSP PGUID b f f 1186 1186 1186 0 0 interval_mi - - "---"));
DESCR("subtract");
-DATA(insert OID = 1360 ( "+" PGNSP PGUID b f f 1082 1083 1114 1363 0 datetime_pl - - ));
+DATA(insert OID = 1360 ( "+" PGNSP PGUID b f f 1082 1083 1114 1363 0 datetime_pl - - "---"));
DESCR("convert date and time to timestamp");
-DATA(insert OID = 1361 ( "+" PGNSP PGUID b f f 1082 1266 1184 1366 0 datetimetz_pl - - ));
+DATA(insert OID = 1361 ( "+" PGNSP PGUID b f f 1082 1266 1184 1366 0 datetimetz_pl - - "---"));
DESCR("convert date and time with time zone to timestamp with time zone");
-DATA(insert OID = 1363 ( "+" PGNSP PGUID b f f 1083 1082 1114 1360 0 timedate_pl - - ));
+DATA(insert OID = 1363 ( "+" PGNSP PGUID b f f 1083 1082 1114 1360 0 timedate_pl - - "---"));
DESCR("convert time and date to timestamp");
-DATA(insert OID = 1366 ( "+" PGNSP PGUID b f f 1266 1082 1184 1361 0 timetzdate_pl - - ));
+DATA(insert OID = 1366 ( "+" PGNSP PGUID b f f 1266 1082 1184 1361 0 timetzdate_pl - - "---"));
DESCR("convert time with time zone and date to timestamp with time zone");
-DATA(insert OID = 1399 ( "-" PGNSP PGUID b f f 1083 1083 1186 0 0 time_mi_time - - ));
+DATA(insert OID = 1399 ( "-" PGNSP PGUID b f f 1083 1083 1186 0 0 time_mi_time - - "---"));
DESCR("subtract");
/* additional geometric operators - thomas 97/04/18 */
-DATA(insert OID = 1420 ( "@@" PGNSP PGUID l f f 0 718 600 0 0 circle_center - - ));
+DATA(insert OID = 1420 ( "@@" PGNSP PGUID l f f 0 718 600 0 0 circle_center - - "---"));
DESCR("center of");
-DATA(insert OID = 1500 ( "=" PGNSP PGUID b f f 718 718 16 1500 1501 circle_eq eqsel eqjoinsel ));
+DATA(insert OID = 1500 ( "=" PGNSP PGUID b f f 718 718 16 1500 1501 circle_eq eqsel eqjoinsel "mhf"));
DESCR("equal by area");
-DATA(insert OID = 1501 ( "<>" PGNSP PGUID b f f 718 718 16 1501 1500 circle_ne neqsel neqjoinsel ));
+DATA(insert OID = 1501 ( "<>" PGNSP PGUID b f f 718 718 16 1501 1500 circle_ne neqsel neqjoinsel "mhf"));
DESCR("not equal by area");
-DATA(insert OID = 1502 ( "<" PGNSP PGUID b f f 718 718 16 1503 1505 circle_lt areasel areajoinsel ));
+DATA(insert OID = 1502 ( "<" PGNSP PGUID b f f 718 718 16 1503 1505 circle_lt areasel areajoinsel "---"));
DESCR("less than by area");
-DATA(insert OID = 1503 ( ">" PGNSP PGUID b f f 718 718 16 1502 1504 circle_gt areasel areajoinsel ));
+DATA(insert OID = 1503 ( ">" PGNSP PGUID b f f 718 718 16 1502 1504 circle_gt areasel areajoinsel "---"));
DESCR("greater than by area");
-DATA(insert OID = 1504 ( "<=" PGNSP PGUID b f f 718 718 16 1505 1503 circle_le areasel areajoinsel ));
+DATA(insert OID = 1504 ( "<=" PGNSP PGUID b f f 718 718 16 1505 1503 circle_le areasel areajoinsel "---"));
DESCR("less than or equal by area");
-DATA(insert OID = 1505 ( ">=" PGNSP PGUID b f f 718 718 16 1504 1502 circle_ge areasel areajoinsel ));
+DATA(insert OID = 1505 ( ">=" PGNSP PGUID b f f 718 718 16 1504 1502 circle_ge areasel areajoinsel "---"));
DESCR("greater than or equal by area");
-DATA(insert OID = 1506 ( "<<" PGNSP PGUID b f f 718 718 16 0 0 circle_left positionsel positionjoinsel ));
+DATA(insert OID = 1506 ( "<<" PGNSP PGUID b f f 718 718 16 0 0 circle_left positionsel positionjoinsel "---"));
DESCR("is left of");
-DATA(insert OID = 1507 ( "&<" PGNSP PGUID b f f 718 718 16 0 0 circle_overleft positionsel positionjoinsel ));
+DATA(insert OID = 1507 ( "&<" PGNSP PGUID b f f 718 718 16 0 0 circle_overleft positionsel positionjoinsel "---"));
DESCR("overlaps or is left of");
-DATA(insert OID = 1508 ( "&>" PGNSP PGUID b f f 718 718 16 0 0 circle_overright positionsel positionjoinsel ));
+DATA(insert OID = 1508 ( "&>" PGNSP PGUID b f f 718 718 16 0 0 circle_overright positionsel positionjoinsel "---"));
DESCR("overlaps or is right of");
-DATA(insert OID = 1509 ( ">>" PGNSP PGUID b f f 718 718 16 0 0 circle_right positionsel positionjoinsel ));
+DATA(insert OID = 1509 ( ">>" PGNSP PGUID b f f 718 718 16 0 0 circle_right positionsel positionjoinsel "---"));
DESCR("is right of");
-DATA(insert OID = 1510 ( "<@" PGNSP PGUID b f f 718 718 16 1511 0 circle_contained contsel contjoinsel ));
+DATA(insert OID = 1510 ( "<@" PGNSP PGUID b f f 718 718 16 1511 0 circle_contained contsel contjoinsel "---"));
DESCR("is contained by");
-DATA(insert OID = 1511 ( "@>" PGNSP PGUID b f f 718 718 16 1510 0 circle_contain contsel contjoinsel ));
+DATA(insert OID = 1511 ( "@>" PGNSP PGUID b f f 718 718 16 1510 0 circle_contain contsel contjoinsel "---"));
DESCR("contains");
-DATA(insert OID = 1512 ( "~=" PGNSP PGUID b f f 718 718 16 1512 0 circle_same eqsel eqjoinsel ));
+DATA(insert OID = 1512 ( "~=" PGNSP PGUID b f f 718 718 16 1512 0 circle_same eqsel eqjoinsel "mhf"));
DESCR("same as");
-DATA(insert OID = 1513 ( "&&" PGNSP PGUID b f f 718 718 16 1513 0 circle_overlap areasel areajoinsel ));
+DATA(insert OID = 1513 ( "&&" PGNSP PGUID b f f 718 718 16 1513 0 circle_overlap areasel areajoinsel "---"));
DESCR("overlaps");
-DATA(insert OID = 1514 ( "|>>" PGNSP PGUID b f f 718 718 16 0 0 circle_above positionsel positionjoinsel ));
+DATA(insert OID = 1514 ( "|>>" PGNSP PGUID b f f 718 718 16 0 0 circle_above positionsel positionjoinsel "---"));
DESCR("is above");
-DATA(insert OID = 1515 ( "<<|" PGNSP PGUID b f f 718 718 16 0 0 circle_below positionsel positionjoinsel ));
+DATA(insert OID = 1515 ( "<<|" PGNSP PGUID b f f 718 718 16 0 0 circle_below positionsel positionjoinsel "---"));
DESCR("is below");
-DATA(insert OID = 1516 ( "+" PGNSP PGUID b f f 718 600 718 0 0 circle_add_pt - - ));
+DATA(insert OID = 1516 ( "+" PGNSP PGUID b f f 718 600 718 0 0 circle_add_pt - - "---"));
DESCR("add");
-DATA(insert OID = 1517 ( "-" PGNSP PGUID b f f 718 600 718 0 0 circle_sub_pt - - ));
+DATA(insert OID = 1517 ( "-" PGNSP PGUID b f f 718 600 718 0 0 circle_sub_pt - - "---"));
DESCR("subtract");
-DATA(insert OID = 1518 ( "*" PGNSP PGUID b f f 718 600 718 0 0 circle_mul_pt - - ));
+DATA(insert OID = 1518 ( "*" PGNSP PGUID b f f 718 600 718 0 0 circle_mul_pt - - "---"));
DESCR("multiply");
-DATA(insert OID = 1519 ( "/" PGNSP PGUID b f f 718 600 718 0 0 circle_div_pt - - ));
+DATA(insert OID = 1519 ( "/" PGNSP PGUID b f f 718 600 718 0 0 circle_div_pt - - "---"));
DESCR("divide");
-DATA(insert OID = 1520 ( "<->" PGNSP PGUID b f f 718 718 701 1520 0 circle_distance - - ));
+DATA(insert OID = 1520 ( "<->" PGNSP PGUID b f f 718 718 701 1520 0 circle_distance - - "---"));
DESCR("distance between");
-DATA(insert OID = 1521 ( "#" PGNSP PGUID l f f 0 604 23 0 0 poly_npoints - - ));
+DATA(insert OID = 1521 ( "#" PGNSP PGUID l f f 0 604 23 0 0 poly_npoints - - "---"));
DESCR("number of points");
-DATA(insert OID = 1522 ( "<->" PGNSP PGUID b f f 600 718 701 3291 0 dist_pc - - ));
+DATA(insert OID = 1522 ( "<->" PGNSP PGUID b f f 600 718 701 3291 0 dist_pc - - "---"));
DESCR("distance between");
-DATA(insert OID = 3291 ( "<->" PGNSP PGUID b f f 718 600 701 1522 0 dist_cpoint - - ));
+DATA(insert OID = 3291 ( "<->" PGNSP PGUID b f f 718 600 701 1522 0 dist_cpoint - - "---"));
DESCR("distance between");
-DATA(insert OID = 3276 ( "<->" PGNSP PGUID b f f 600 604 701 3289 0 dist_ppoly - - ));
+DATA(insert OID = 3276 ( "<->" PGNSP PGUID b f f 600 604 701 3289 0 dist_ppoly - - "---"));
DESCR("distance between");
-DATA(insert OID = 3289 ( "<->" PGNSP PGUID b f f 604 600 701 3276 0 dist_polyp - - ));
+DATA(insert OID = 3289 ( "<->" PGNSP PGUID b f f 604 600 701 3276 0 dist_polyp - - "---"));
DESCR("distance between");
-DATA(insert OID = 1523 ( "<->" PGNSP PGUID b f f 718 604 701 0 0 dist_cpoly - - ));
+DATA(insert OID = 1523 ( "<->" PGNSP PGUID b f f 718 604 701 0 0 dist_cpoly - - "---"));
DESCR("distance between");
/* additional geometric operators - thomas 1997-07-09 */
-DATA(insert OID = 1524 ( "<->" PGNSP PGUID b f f 628 603 701 0 0 dist_lb - - ));
+DATA(insert OID = 1524 ( "<->" PGNSP PGUID b f f 628 603 701 0 0 dist_lb - - "---"));
DESCR("distance between");
-DATA(insert OID = 1525 ( "?#" PGNSP PGUID b f f 601 601 16 1525 0 lseg_intersect - - ));
+DATA(insert OID = 1525 ( "?#" PGNSP PGUID b f f 601 601 16 1525 0 lseg_intersect - - "---"));
DESCR("intersect");
-DATA(insert OID = 1526 ( "?||" PGNSP PGUID b f f 601 601 16 1526 0 lseg_parallel - - ));
+DATA(insert OID = 1526 ( "?||" PGNSP PGUID b f f 601 601 16 1526 0 lseg_parallel - - "---"));
DESCR("parallel");
-DATA(insert OID = 1527 ( "?-|" PGNSP PGUID b f f 601 601 16 1527 0 lseg_perp - - ));
+DATA(insert OID = 1527 ( "?-|" PGNSP PGUID b f f 601 601 16 1527 0 lseg_perp - - "---"));
DESCR("perpendicular");
-DATA(insert OID = 1528 ( "?-" PGNSP PGUID l f f 0 601 16 0 0 lseg_horizontal - - ));
+DATA(insert OID = 1528 ( "?-" PGNSP PGUID l f f 0 601 16 0 0 lseg_horizontal - - "---"));
DESCR("horizontal");
-DATA(insert OID = 1529 ( "?|" PGNSP PGUID l f f 0 601 16 0 0 lseg_vertical - - ));
+DATA(insert OID = 1529 ( "?|" PGNSP PGUID l f f 0 601 16 0 0 lseg_vertical - - "---"));
DESCR("vertical");
-DATA(insert OID = 1535 ( "=" PGNSP PGUID b f f 601 601 16 1535 1586 lseg_eq eqsel eqjoinsel ));
+DATA(insert OID = 1535 ( "=" PGNSP PGUID b f f 601 601 16 1535 1586 lseg_eq eqsel eqjoinsel "mhf"));
DESCR("equal");
-DATA(insert OID = 1536 ( "#" PGNSP PGUID b f f 601 601 600 1536 0 lseg_interpt - - ));
+DATA(insert OID = 1536 ( "#" PGNSP PGUID b f f 601 601 600 1536 0 lseg_interpt - - "---"));
DESCR("intersection point");
-DATA(insert OID = 1537 ( "?#" PGNSP PGUID b f f 601 628 16 0 0 inter_sl - - ));
+DATA(insert OID = 1537 ( "?#" PGNSP PGUID b f f 601 628 16 0 0 inter_sl - - "---"));
DESCR("intersect");
-DATA(insert OID = 1538 ( "?#" PGNSP PGUID b f f 601 603 16 0 0 inter_sb - - ));
+DATA(insert OID = 1538 ( "?#" PGNSP PGUID b f f 601 603 16 0 0 inter_sb - - "---"));
DESCR("intersect");
-DATA(insert OID = 1539 ( "?#" PGNSP PGUID b f f 628 603 16 0 0 inter_lb - - ));
+DATA(insert OID = 1539 ( "?#" PGNSP PGUID b f f 628 603 16 0 0 inter_lb - - "---"));
DESCR("intersect");
-DATA(insert OID = 1546 ( "<@" PGNSP PGUID b f f 600 628 16 0 0 on_pl - - ));
+DATA(insert OID = 1546 ( "<@" PGNSP PGUID b f f 600 628 16 0 0 on_pl - - "---"));
DESCR("point on line");
-DATA(insert OID = 1547 ( "<@" PGNSP PGUID b f f 600 601 16 0 0 on_ps - - ));
+DATA(insert OID = 1547 ( "<@" PGNSP PGUID b f f 600 601 16 0 0 on_ps - - "---"));
DESCR("is contained by");
-DATA(insert OID = 1548 ( "<@" PGNSP PGUID b f f 601 628 16 0 0 on_sl - - ));
+DATA(insert OID = 1548 ( "<@" PGNSP PGUID b f f 601 628 16 0 0 on_sl - - "---"));
DESCR("lseg on line");
-DATA(insert OID = 1549 ( "<@" PGNSP PGUID b f f 601 603 16 0 0 on_sb - - ));
+DATA(insert OID = 1549 ( "<@" PGNSP PGUID b f f 601 603 16 0 0 on_sb - - "---"));
DESCR("is contained by");
-DATA(insert OID = 1557 ( "##" PGNSP PGUID b f f 600 628 600 0 0 close_pl - - ));
+DATA(insert OID = 1557 ( "##" PGNSP PGUID b f f 600 628 600 0 0 close_pl - - "---"));
DESCR("closest point to A on B");
-DATA(insert OID = 1558 ( "##" PGNSP PGUID b f f 600 601 600 0 0 close_ps - - ));
+DATA(insert OID = 1558 ( "##" PGNSP PGUID b f f 600 601 600 0 0 close_ps - - "---"));
DESCR("closest point to A on B");
-DATA(insert OID = 1559 ( "##" PGNSP PGUID b f f 600 603 600 0 0 close_pb - - ));
+DATA(insert OID = 1559 ( "##" PGNSP PGUID b f f 600 603 600 0 0 close_pb - - "---"));
DESCR("closest point to A on B");
-DATA(insert OID = 1566 ( "##" PGNSP PGUID b f f 601 628 600 0 0 close_sl - - ));
+DATA(insert OID = 1566 ( "##" PGNSP PGUID b f f 601 628 600 0 0 close_sl - - "---"));
DESCR("closest point to A on B");
-DATA(insert OID = 1567 ( "##" PGNSP PGUID b f f 601 603 600 0 0 close_sb - - ));
+DATA(insert OID = 1567 ( "##" PGNSP PGUID b f f 601 603 600 0 0 close_sb - - "---"));
DESCR("closest point to A on B");
-DATA(insert OID = 1568 ( "##" PGNSP PGUID b f f 628 603 600 0 0 close_lb - - ));
+DATA(insert OID = 1568 ( "##" PGNSP PGUID b f f 628 603 600 0 0 close_lb - - "---"));
DESCR("closest point to A on B");
-DATA(insert OID = 1577 ( "##" PGNSP PGUID b f f 628 601 600 0 0 close_ls - - ));
+DATA(insert OID = 1577 ( "##" PGNSP PGUID b f f 628 601 600 0 0 close_ls - - "---"));
DESCR("closest point to A on B");
-DATA(insert OID = 1578 ( "##" PGNSP PGUID b f f 601 601 600 0 0 close_lseg - - ));
+DATA(insert OID = 1578 ( "##" PGNSP PGUID b f f 601 601 600 0 0 close_lseg - - "---"));
DESCR("closest point to A on B");
-DATA(insert OID = 1583 ( "*" PGNSP PGUID b f f 1186 701 1186 1584 0 interval_mul - - ));
+DATA(insert OID = 1583 ( "*" PGNSP PGUID b f f 1186 701 1186 1584 0 interval_mul - - "---"));
DESCR("multiply");
-DATA(insert OID = 1584 ( "*" PGNSP PGUID b f f 701 1186 1186 1583 0 mul_d_interval - - ));
+DATA(insert OID = 1584 ( "*" PGNSP PGUID b f f 701 1186 1186 1583 0 mul_d_interval - - "---"));
DESCR("multiply");
-DATA(insert OID = 1585 ( "/" PGNSP PGUID b f f 1186 701 1186 0 0 interval_div - - ));
+DATA(insert OID = 1585 ( "/" PGNSP PGUID b f f 1186 701 1186 0 0 interval_div - - "---"));
DESCR("divide");
-DATA(insert OID = 1586 ( "<>" PGNSP PGUID b f f 601 601 16 1586 1535 lseg_ne neqsel neqjoinsel ));
+DATA(insert OID = 1586 ( "<>" PGNSP PGUID b f f 601 601 16 1586 1535 lseg_ne neqsel neqjoinsel "mhf"));
DESCR("not equal");
-DATA(insert OID = 1587 ( "<" PGNSP PGUID b f f 601 601 16 1589 1590 lseg_lt - - ));
+DATA(insert OID = 1587 ( "<" PGNSP PGUID b f f 601 601 16 1589 1590 lseg_lt - - "---"));
DESCR("less than by length");
-DATA(insert OID = 1588 ( "<=" PGNSP PGUID b f f 601 601 16 1590 1589 lseg_le - - ));
+DATA(insert OID = 1588 ( "<=" PGNSP PGUID b f f 601 601 16 1590 1589 lseg_le - - "---"));
DESCR("less than or equal by length");
-DATA(insert OID = 1589 ( ">" PGNSP PGUID b f f 601 601 16 1587 1588 lseg_gt - - ));
+DATA(insert OID = 1589 ( ">" PGNSP PGUID b f f 601 601 16 1587 1588 lseg_gt - - "---"));
DESCR("greater than by length");
-DATA(insert OID = 1590 ( ">=" PGNSP PGUID b f f 601 601 16 1588 1587 lseg_ge - - ));
+DATA(insert OID = 1590 ( ">=" PGNSP PGUID b f f 601 601 16 1588 1587 lseg_ge - - "---"));
DESCR("greater than or equal by length");
-DATA(insert OID = 1591 ( "@-@" PGNSP PGUID l f f 0 601 701 0 0 lseg_length - - ));
+DATA(insert OID = 1591 ( "@-@" PGNSP PGUID l f f 0 601 701 0 0 lseg_length - - "---"));
DESCR("distance between endpoints");
-DATA(insert OID = 1611 ( "?#" PGNSP PGUID b f f 628 628 16 1611 0 line_intersect - - ));
+DATA(insert OID = 1611 ( "?#" PGNSP PGUID b f f 628 628 16 1611 0 line_intersect - - "---"));
DESCR("intersect");
-DATA(insert OID = 1612 ( "?||" PGNSP PGUID b f f 628 628 16 1612 0 line_parallel - - ));
+DATA(insert OID = 1612 ( "?||" PGNSP PGUID b f f 628 628 16 1612 0 line_parallel - - "---"));
DESCR("parallel");
-DATA(insert OID = 1613 ( "?-|" PGNSP PGUID b f f 628 628 16 1613 0 line_perp - - ));
+DATA(insert OID = 1613 ( "?-|" PGNSP PGUID b f f 628 628 16 1613 0 line_perp - - "---"));
DESCR("perpendicular");
-DATA(insert OID = 1614 ( "?-" PGNSP PGUID l f f 0 628 16 0 0 line_horizontal - - ));
+DATA(insert OID = 1614 ( "?-" PGNSP PGUID l f f 0 628 16 0 0 line_horizontal - - "---"));
DESCR("horizontal");
-DATA(insert OID = 1615 ( "?|" PGNSP PGUID l f f 0 628 16 0 0 line_vertical - - ));
+DATA(insert OID = 1615 ( "?|" PGNSP PGUID l f f 0 628 16 0 0 line_vertical - - "---"));
DESCR("vertical");
-DATA(insert OID = 1616 ( "=" PGNSP PGUID b f f 628 628 16 1616 0 line_eq eqsel eqjoinsel ));
+DATA(insert OID = 1616 ( "=" PGNSP PGUID b f f 628 628 16 1616 0 line_eq eqsel eqjoinsel "mhf"));
DESCR("equal");
-DATA(insert OID = 1617 ( "#" PGNSP PGUID b f f 628 628 600 1617 0 line_interpt - - ));
+DATA(insert OID = 1617 ( "#" PGNSP PGUID b f f 628 628 600 1617 0 line_interpt - - "---"));
DESCR("intersection point");
/* MAC type */
-DATA(insert OID = 1220 ( "=" PGNSP PGUID b t t 829 829 16 1220 1221 macaddr_eq eqsel eqjoinsel ));
+DATA(insert OID = 1220 ( "=" PGNSP PGUID b t t 829 829 16 1220 1221 macaddr_eq eqsel eqjoinsel "mhf"));
DESCR("equal");
-DATA(insert OID = 1221 ( "<>" PGNSP PGUID b f f 829 829 16 1221 1220 macaddr_ne neqsel neqjoinsel ));
+DATA(insert OID = 1221 ( "<>" PGNSP PGUID b f f 829 829 16 1221 1220 macaddr_ne neqsel neqjoinsel "mhf"));
DESCR("not equal");
-DATA(insert OID = 1222 ( "<" PGNSP PGUID b f f 829 829 16 1224 1225 macaddr_lt scalarltsel scalarltjoinsel ));
+DATA(insert OID = 1222 ( "<" PGNSP PGUID b f f 829 829 16 1224 1225 macaddr_lt scalarltsel scalarltjoinsel "mh-"));
DESCR("less than");
-DATA(insert OID = 1223 ( "<=" PGNSP PGUID b f f 829 829 16 1225 1224 macaddr_le scalarltsel scalarltjoinsel ));
+DATA(insert OID = 1223 ( "<=" PGNSP PGUID b f f 829 829 16 1225 1224 macaddr_le scalarltsel scalarltjoinsel "mh-"));
DESCR("less than or equal");
-DATA(insert OID = 1224 ( ">" PGNSP PGUID b f f 829 829 16 1222 1223 macaddr_gt scalargtsel scalargtjoinsel ));
+DATA(insert OID = 1224 ( ">" PGNSP PGUID b f f 829 829 16 1222 1223 macaddr_gt scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than");
-DATA(insert OID = 1225 ( ">=" PGNSP PGUID b f f 829 829 16 1223 1222 macaddr_ge scalargtsel scalargtjoinsel ));
+DATA(insert OID = 1225 ( ">=" PGNSP PGUID b f f 829 829 16 1223 1222 macaddr_ge scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than or equal");
-DATA(insert OID = 3147 ( "~" PGNSP PGUID l f f 0 829 829 0 0 macaddr_not - - ));
+DATA(insert OID = 3147 ( "~" PGNSP PGUID l f f 0 829 829 0 0 macaddr_not - - "---"));
DESCR("bitwise not");
-DATA(insert OID = 3148 ( "&" PGNSP PGUID b f f 829 829 829 0 0 macaddr_and - - ));
+DATA(insert OID = 3148 ( "&" PGNSP PGUID b f f 829 829 829 0 0 macaddr_and - - "---"));
DESCR("bitwise and");
-DATA(insert OID = 3149 ( "|" PGNSP PGUID b f f 829 829 829 0 0 macaddr_or - - ));
+DATA(insert OID = 3149 ( "|" PGNSP PGUID b f f 829 829 829 0 0 macaddr_or - - "---"));
DESCR("bitwise or");
/* INET type (these also support CIDR via implicit cast) */
-DATA(insert OID = 1201 ( "=" PGNSP PGUID b t t 869 869 16 1201 1202 network_eq eqsel eqjoinsel ));
+DATA(insert OID = 1201 ( "=" PGNSP PGUID b t t 869 869 16 1201 1202 network_eq eqsel eqjoinsel "mhf"));
DESCR("equal");
-DATA(insert OID = 1202 ( "<>" PGNSP PGUID b f f 869 869 16 1202 1201 network_ne neqsel neqjoinsel ));
+DATA(insert OID = 1202 ( "<>" PGNSP PGUID b f f 869 869 16 1202 1201 network_ne neqsel neqjoinsel "mhf"));
DESCR("not equal");
-DATA(insert OID = 1203 ( "<" PGNSP PGUID b f f 869 869 16 1205 1206 network_lt scalarltsel scalarltjoinsel ));
+DATA(insert OID = 1203 ( "<" PGNSP PGUID b f f 869 869 16 1205 1206 network_lt scalarltsel scalarltjoinsel "mh-"));
DESCR("less than");
-DATA(insert OID = 1204 ( "<=" PGNSP PGUID b f f 869 869 16 1206 1205 network_le scalarltsel scalarltjoinsel ));
+DATA(insert OID = 1204 ( "<=" PGNSP PGUID b f f 869 869 16 1206 1205 network_le scalarltsel scalarltjoinsel "mh-"));
DESCR("less than or equal");
-DATA(insert OID = 1205 ( ">" PGNSP PGUID b f f 869 869 16 1203 1204 network_gt scalargtsel scalargtjoinsel ));
+DATA(insert OID = 1205 ( ">" PGNSP PGUID b f f 869 869 16 1203 1204 network_gt scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than");
-DATA(insert OID = 1206 ( ">=" PGNSP PGUID b f f 869 869 16 1204 1203 network_ge scalargtsel scalargtjoinsel ));
+DATA(insert OID = 1206 ( ">=" PGNSP PGUID b f f 869 869 16 1204 1203 network_ge scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than or equal");
-DATA(insert OID = 931 ( "<<" PGNSP PGUID b f f 869 869 16 933 0 network_sub networksel networkjoinsel ));
+DATA(insert OID = 931 ( "<<" PGNSP PGUID b f f 869 869 16 933 0 network_sub networksel networkjoinsel "---"));
DESCR("is subnet");
#define OID_INET_SUB_OP 931
-DATA(insert OID = 932 ( "<<=" PGNSP PGUID b f f 869 869 16 934 0 network_subeq networksel networkjoinsel ));
+DATA(insert OID = 932 ( "<<=" PGNSP PGUID b f f 869 869 16 934 0 network_subeq networksel networkjoinsel "---"));
DESCR("is subnet or equal");
#define OID_INET_SUBEQ_OP 932
-DATA(insert OID = 933 ( ">>" PGNSP PGUID b f f 869 869 16 931 0 network_sup networksel networkjoinsel ));
+DATA(insert OID = 933 ( ">>" PGNSP PGUID b f f 869 869 16 931 0 network_sup networksel networkjoinsel "---"));
DESCR("is supernet");
#define OID_INET_SUP_OP 933
-DATA(insert OID = 934 ( ">>=" PGNSP PGUID b f f 869 869 16 932 0 network_supeq networksel networkjoinsel ));
+DATA(insert OID = 934 ( ">>=" PGNSP PGUID b f f 869 869 16 932 0 network_supeq networksel networkjoinsel "---"));
DESCR("is supernet or equal");
#define OID_INET_SUPEQ_OP 934
-DATA(insert OID = 3552 ( "&&" PGNSP PGUID b f f 869 869 16 3552 0 network_overlap networksel networkjoinsel ));
+DATA(insert OID = 3552 ( "&&" PGNSP PGUID b f f 869 869 16 3552 0 network_overlap networksel networkjoinsel "---"));
DESCR("overlaps (is subnet or supernet)");
#define OID_INET_OVERLAP_OP 3552
-DATA(insert OID = 2634 ( "~" PGNSP PGUID l f f 0 869 869 0 0 inetnot - - ));
+DATA(insert OID = 2634 ( "~" PGNSP PGUID l f f 0 869 869 0 0 inetnot - - "---"));
DESCR("bitwise not");
-DATA(insert OID = 2635 ( "&" PGNSP PGUID b f f 869 869 869 0 0 inetand - - ));
+DATA(insert OID = 2635 ( "&" PGNSP PGUID b f f 869 869 869 0 0 inetand - - "---"));
DESCR("bitwise and");
-DATA(insert OID = 2636 ( "|" PGNSP PGUID b f f 869 869 869 0 0 inetor - - ));
+DATA(insert OID = 2636 ( "|" PGNSP PGUID b f f 869 869 869 0 0 inetor - - "---"));
DESCR("bitwise or");
-DATA(insert OID = 2637 ( "+" PGNSP PGUID b f f 869 20 869 2638 0 inetpl - - ));
+DATA(insert OID = 2637 ( "+" PGNSP PGUID b f f 869 20 869 2638 0 inetpl - - "---"));
DESCR("add");
-DATA(insert OID = 2638 ( "+" PGNSP PGUID b f f 20 869 869 2637 0 int8pl_inet - - ));
+DATA(insert OID = 2638 ( "+" PGNSP PGUID b f f 20 869 869 2637 0 int8pl_inet - - "---"));
DESCR("add");
-DATA(insert OID = 2639 ( "-" PGNSP PGUID b f f 869 20 869 0 0 inetmi_int8 - - ));
+DATA(insert OID = 2639 ( "-" PGNSP PGUID b f f 869 20 869 0 0 inetmi_int8 - - "---"));
DESCR("subtract");
-DATA(insert OID = 2640 ( "-" PGNSP PGUID b f f 869 869 20 0 0 inetmi - - ));
+DATA(insert OID = 2640 ( "-" PGNSP PGUID b f f 869 869 20 0 0 inetmi - - "---"));
DESCR("subtract");
/* case-insensitive LIKE hacks */
-DATA(insert OID = 1625 ( "~~*" PGNSP PGUID b f f 19 25 16 0 1626 nameiclike iclikesel iclikejoinsel ));
+DATA(insert OID = 1625 ( "~~*" PGNSP PGUID b f f 19 25 16 0 1626 nameiclike iclikesel iclikejoinsel "---"));
DESCR("matches LIKE expression, case-insensitive");
#define OID_NAME_ICLIKE_OP 1625
-DATA(insert OID = 1626 ( "!~~*" PGNSP PGUID b f f 19 25 16 0 1625 nameicnlike icnlikesel icnlikejoinsel ));
+DATA(insert OID = 1626 ( "!~~*" PGNSP PGUID b f f 19 25 16 0 1625 nameicnlike icnlikesel icnlikejoinsel "---"));
DESCR("does not match LIKE expression, case-insensitive");
-DATA(insert OID = 1627 ( "~~*" PGNSP PGUID b f f 25 25 16 0 1628 texticlike iclikesel iclikejoinsel ));
+DATA(insert OID = 1627 ( "~~*" PGNSP PGUID b f f 25 25 16 0 1628 texticlike iclikesel iclikejoinsel "---"));
DESCR("matches LIKE expression, case-insensitive");
#define OID_TEXT_ICLIKE_OP 1627
-DATA(insert OID = 1628 ( "!~~*" PGNSP PGUID b f f 25 25 16 0 1627 texticnlike icnlikesel icnlikejoinsel ));
+DATA(insert OID = 1628 ( "!~~*" PGNSP PGUID b f f 25 25 16 0 1627 texticnlike icnlikesel icnlikejoinsel "---"));
DESCR("does not match LIKE expression, case-insensitive");
-DATA(insert OID = 1629 ( "~~*" PGNSP PGUID b f f 1042 25 16 0 1630 bpchariclike iclikesel iclikejoinsel ));
+DATA(insert OID = 1629 ( "~~*" PGNSP PGUID b f f 1042 25 16 0 1630 bpchariclike iclikesel iclikejoinsel "---"));
DESCR("matches LIKE expression, case-insensitive");
#define OID_BPCHAR_ICLIKE_OP 1629
-DATA(insert OID = 1630 ( "!~~*" PGNSP PGUID b f f 1042 25 16 0 1629 bpcharicnlike icnlikesel icnlikejoinsel ));
+DATA(insert OID = 1630 ( "!~~*" PGNSP PGUID b f f 1042 25 16 0 1629 bpcharicnlike icnlikesel icnlikejoinsel "---"));
DESCR("does not match LIKE expression, case-insensitive");
/* NUMERIC type - OID's 1700-1799 */
-DATA(insert OID = 1751 ( "-" PGNSP PGUID l f f 0 1700 1700 0 0 numeric_uminus - - ));
+DATA(insert OID = 1751 ( "-" PGNSP PGUID l f f 0 1700 1700 0 0 numeric_uminus - - "---"));
DESCR("negate");
-DATA(insert OID = 1752 ( "=" PGNSP PGUID b t t 1700 1700 16 1752 1753 numeric_eq eqsel eqjoinsel ));
+DATA(insert OID = 1752 ( "=" PGNSP PGUID b t t 1700 1700 16 1752 1753 numeric_eq eqsel eqjoinsel "mhf"));
DESCR("equal");
-DATA(insert OID = 1753 ( "<>" PGNSP PGUID b f f 1700 1700 16 1753 1752 numeric_ne neqsel neqjoinsel ));
+DATA(insert OID = 1753 ( "<>" PGNSP PGUID b f f 1700 1700 16 1753 1752 numeric_ne neqsel neqjoinsel "mhf"));
DESCR("not equal");
-DATA(insert OID = 1754 ( "<" PGNSP PGUID b f f 1700 1700 16 1756 1757 numeric_lt scalarltsel scalarltjoinsel ));
+DATA(insert OID = 1754 ( "<" PGNSP PGUID b f f 1700 1700 16 1756 1757 numeric_lt scalarltsel scalarltjoinsel "mh-"));
DESCR("less than");
-DATA(insert OID = 1755 ( "<=" PGNSP PGUID b f f 1700 1700 16 1757 1756 numeric_le scalarltsel scalarltjoinsel ));
+DATA(insert OID = 1755 ( "<=" PGNSP PGUID b f f 1700 1700 16 1757 1756 numeric_le scalarltsel scalarltjoinsel "mh-"));
DESCR("less than or equal");
-DATA(insert OID = 1756 ( ">" PGNSP PGUID b f f 1700 1700 16 1754 1755 numeric_gt scalargtsel scalargtjoinsel ));
+DATA(insert OID = 1756 ( ">" PGNSP PGUID b f f 1700 1700 16 1754 1755 numeric_gt scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than");
-DATA(insert OID = 1757 ( ">=" PGNSP PGUID b f f 1700 1700 16 1755 1754 numeric_ge scalargtsel scalargtjoinsel ));
+DATA(insert OID = 1757 ( ">=" PGNSP PGUID b f f 1700 1700 16 1755 1754 numeric_ge scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than or equal");
-DATA(insert OID = 1758 ( "+" PGNSP PGUID b f f 1700 1700 1700 1758 0 numeric_add - - ));
+DATA(insert OID = 1758 ( "+" PGNSP PGUID b f f 1700 1700 1700 1758 0 numeric_add - - "---"));
DESCR("add");
-DATA(insert OID = 1759 ( "-" PGNSP PGUID b f f 1700 1700 1700 0 0 numeric_sub - - ));
+DATA(insert OID = 1759 ( "-" PGNSP PGUID b f f 1700 1700 1700 0 0 numeric_sub - - "---"));
DESCR("subtract");
-DATA(insert OID = 1760 ( "*" PGNSP PGUID b f f 1700 1700 1700 1760 0 numeric_mul - - ));
+DATA(insert OID = 1760 ( "*" PGNSP PGUID b f f 1700 1700 1700 1760 0 numeric_mul - - "---"));
DESCR("multiply");
-DATA(insert OID = 1761 ( "/" PGNSP PGUID b f f 1700 1700 1700 0 0 numeric_div - - ));
+DATA(insert OID = 1761 ( "/" PGNSP PGUID b f f 1700 1700 1700 0 0 numeric_div - - "---"));
DESCR("divide");
-DATA(insert OID = 1762 ( "%" PGNSP PGUID b f f 1700 1700 1700 0 0 numeric_mod - - ));
+DATA(insert OID = 1762 ( "%" PGNSP PGUID b f f 1700 1700 1700 0 0 numeric_mod - - "---"));
DESCR("modulus");
-DATA(insert OID = 1038 ( "^" PGNSP PGUID b f f 1700 1700 1700 0 0 numeric_power - - ));
+DATA(insert OID = 1038 ( "^" PGNSP PGUID b f f 1700 1700 1700 0 0 numeric_power - - "---"));
DESCR("exponentiation");
-DATA(insert OID = 1763 ( "@" PGNSP PGUID l f f 0 1700 1700 0 0 numeric_abs - - ));
+DATA(insert OID = 1763 ( "@" PGNSP PGUID l f f 0 1700 1700 0 0 numeric_abs - - "---"));
DESCR("absolute value");
-DATA(insert OID = 1784 ( "=" PGNSP PGUID b t f 1560 1560 16 1784 1785 biteq eqsel eqjoinsel ));
+DATA(insert OID = 1784 ( "=" PGNSP PGUID b t f 1560 1560 16 1784 1785 biteq eqsel eqjoinsel "mhf"));
DESCR("equal");
-DATA(insert OID = 1785 ( "<>" PGNSP PGUID b f f 1560 1560 16 1785 1784 bitne neqsel neqjoinsel ));
+DATA(insert OID = 1785 ( "<>" PGNSP PGUID b f f 1560 1560 16 1785 1784 bitne neqsel neqjoinsel "mhf"));
DESCR("not equal");
-DATA(insert OID = 1786 ( "<" PGNSP PGUID b f f 1560 1560 16 1787 1789 bitlt scalarltsel scalarltjoinsel ));
+DATA(insert OID = 1786 ( "<" PGNSP PGUID b f f 1560 1560 16 1787 1789 bitlt scalarltsel scalarltjoinsel "mh-"));
DESCR("less than");
-DATA(insert OID = 1787 ( ">" PGNSP PGUID b f f 1560 1560 16 1786 1788 bitgt scalargtsel scalargtjoinsel ));
+DATA(insert OID = 1787 ( ">" PGNSP PGUID b f f 1560 1560 16 1786 1788 bitgt scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than");
-DATA(insert OID = 1788 ( "<=" PGNSP PGUID b f f 1560 1560 16 1789 1787 bitle scalarltsel scalarltjoinsel ));
+DATA(insert OID = 1788 ( "<=" PGNSP PGUID b f f 1560 1560 16 1789 1787 bitle scalarltsel scalarltjoinsel "mh-"));
DESCR("less than or equal");
-DATA(insert OID = 1789 ( ">=" PGNSP PGUID b f f 1560 1560 16 1788 1786 bitge scalargtsel scalargtjoinsel ));
+DATA(insert OID = 1789 ( ">=" PGNSP PGUID b f f 1560 1560 16 1788 1786 bitge scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than or equal");
-DATA(insert OID = 1791 ( "&" PGNSP PGUID b f f 1560 1560 1560 1791 0 bitand - - ));
+DATA(insert OID = 1791 ( "&" PGNSP PGUID b f f 1560 1560 1560 1791 0 bitand - - "---"));
DESCR("bitwise and");
-DATA(insert OID = 1792 ( "|" PGNSP PGUID b f f 1560 1560 1560 1792 0 bitor - - ));
+DATA(insert OID = 1792 ( "|" PGNSP PGUID b f f 1560 1560 1560 1792 0 bitor - - "---"));
DESCR("bitwise or");
-DATA(insert OID = 1793 ( "#" PGNSP PGUID b f f 1560 1560 1560 1793 0 bitxor - - ));
+DATA(insert OID = 1793 ( "#" PGNSP PGUID b f f 1560 1560 1560 1793 0 bitxor - - "---"));
DESCR("bitwise exclusive or");
-DATA(insert OID = 1794 ( "~" PGNSP PGUID l f f 0 1560 1560 0 0 bitnot - - ));
+DATA(insert OID = 1794 ( "~" PGNSP PGUID l f f 0 1560 1560 0 0 bitnot - - "---"));
DESCR("bitwise not");
-DATA(insert OID = 1795 ( "<<" PGNSP PGUID b f f 1560 23 1560 0 0 bitshiftleft - - ));
+DATA(insert OID = 1795 ( "<<" PGNSP PGUID b f f 1560 23 1560 0 0 bitshiftleft - - "---"));
DESCR("bitwise shift left");
-DATA(insert OID = 1796 ( ">>" PGNSP PGUID b f f 1560 23 1560 0 0 bitshiftright - - ));
+DATA(insert OID = 1796 ( ">>" PGNSP PGUID b f f 1560 23 1560 0 0 bitshiftright - - "---"));
DESCR("bitwise shift right");
-DATA(insert OID = 1797 ( "||" PGNSP PGUID b f f 1562 1562 1562 0 0 bitcat - - ));
+DATA(insert OID = 1797 ( "||" PGNSP PGUID b f f 1562 1562 1562 0 0 bitcat - - "---"));
DESCR("concatenate");
-DATA(insert OID = 1800 ( "+" PGNSP PGUID b f f 1083 1186 1083 1849 0 time_pl_interval - - ));
+DATA(insert OID = 1800 ( "+" PGNSP PGUID b f f 1083 1186 1083 1849 0 time_pl_interval - - "---"));
DESCR("add");
-DATA(insert OID = 1801 ( "-" PGNSP PGUID b f f 1083 1186 1083 0 0 time_mi_interval - - ));
+DATA(insert OID = 1801 ( "-" PGNSP PGUID b f f 1083 1186 1083 0 0 time_mi_interval - - "---"));
DESCR("subtract");
-DATA(insert OID = 1802 ( "+" PGNSP PGUID b f f 1266 1186 1266 2552 0 timetz_pl_interval - - ));
+DATA(insert OID = 1802 ( "+" PGNSP PGUID b f f 1266 1186 1266 2552 0 timetz_pl_interval - - "---"));
DESCR("add");
-DATA(insert OID = 1803 ( "-" PGNSP PGUID b f f 1266 1186 1266 0 0 timetz_mi_interval - - ));
+DATA(insert OID = 1803 ( "-" PGNSP PGUID b f f 1266 1186 1266 0 0 timetz_mi_interval - - "---"));
DESCR("subtract");
-DATA(insert OID = 1804 ( "=" PGNSP PGUID b t f 1562 1562 16 1804 1805 varbiteq eqsel eqjoinsel ));
+DATA(insert OID = 1804 ( "=" PGNSP PGUID b t f 1562 1562 16 1804 1805 varbiteq eqsel eqjoinsel "mhf"));
DESCR("equal");
-DATA(insert OID = 1805 ( "<>" PGNSP PGUID b f f 1562 1562 16 1805 1804 varbitne neqsel neqjoinsel ));
+DATA(insert OID = 1805 ( "<>" PGNSP PGUID b f f 1562 1562 16 1805 1804 varbitne neqsel neqjoinsel "mhf"));
DESCR("not equal");
-DATA(insert OID = 1806 ( "<" PGNSP PGUID b f f 1562 1562 16 1807 1809 varbitlt scalarltsel scalarltjoinsel ));
+DATA(insert OID = 1806 ( "<" PGNSP PGUID b f f 1562 1562 16 1807 1809 varbitlt scalarltsel scalarltjoinsel "mh-"));
DESCR("less than");
-DATA(insert OID = 1807 ( ">" PGNSP PGUID b f f 1562 1562 16 1806 1808 varbitgt scalargtsel scalargtjoinsel ));
+DATA(insert OID = 1807 ( ">" PGNSP PGUID b f f 1562 1562 16 1806 1808 varbitgt scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than");
-DATA(insert OID = 1808 ( "<=" PGNSP PGUID b f f 1562 1562 16 1809 1807 varbitle scalarltsel scalarltjoinsel ));
+DATA(insert OID = 1808 ( "<=" PGNSP PGUID b f f 1562 1562 16 1809 1807 varbitle scalarltsel scalarltjoinsel "mh-"));
DESCR("less than or equal");
-DATA(insert OID = 1809 ( ">=" PGNSP PGUID b f f 1562 1562 16 1808 1806 varbitge scalargtsel scalargtjoinsel ));
+DATA(insert OID = 1809 ( ">=" PGNSP PGUID b f f 1562 1562 16 1808 1806 varbitge scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than or equal");
-DATA(insert OID = 1849 ( "+" PGNSP PGUID b f f 1186 1083 1083 1800 0 interval_pl_time - - ));
+DATA(insert OID = 1849 ( "+" PGNSP PGUID b f f 1186 1083 1083 1800 0 interval_pl_time - - "---"));
DESCR("add");
-DATA(insert OID = 1862 ( "=" PGNSP PGUID b t t 21 20 16 1868 1863 int28eq eqsel eqjoinsel ));
+DATA(insert OID = 1862 ( "=" PGNSP PGUID b t t 21 20 16 1868 1863 int28eq eqsel eqjoinsel "mhf"));
DESCR("equal");
-DATA(insert OID = 1863 ( "<>" PGNSP PGUID b f f 21 20 16 1869 1862 int28ne neqsel neqjoinsel ));
+DATA(insert OID = 1863 ( "<>" PGNSP PGUID b f f 21 20 16 1869 1862 int28ne neqsel neqjoinsel "mhf"));
DESCR("not equal");
-DATA(insert OID = 1864 ( "<" PGNSP PGUID b f f 21 20 16 1871 1867 int28lt scalarltsel scalarltjoinsel ));
+DATA(insert OID = 1864 ( "<" PGNSP PGUID b f f 21 20 16 1871 1867 int28lt scalarltsel scalarltjoinsel "mh-"));
DESCR("less than");
-DATA(insert OID = 1865 ( ">" PGNSP PGUID b f f 21 20 16 1870 1866 int28gt scalargtsel scalargtjoinsel ));
+DATA(insert OID = 1865 ( ">" PGNSP PGUID b f f 21 20 16 1870 1866 int28gt scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than");
-DATA(insert OID = 1866 ( "<=" PGNSP PGUID b f f 21 20 16 1873 1865 int28le scalarltsel scalarltjoinsel ));
+DATA(insert OID = 1866 ( "<=" PGNSP PGUID b f f 21 20 16 1873 1865 int28le scalarltsel scalarltjoinsel "mh-"));
DESCR("less than or equal");
-DATA(insert OID = 1867 ( ">=" PGNSP PGUID b f f 21 20 16 1872 1864 int28ge scalargtsel scalargtjoinsel ));
+DATA(insert OID = 1867 ( ">=" PGNSP PGUID b f f 21 20 16 1872 1864 int28ge scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than or equal");
-DATA(insert OID = 1868 ( "=" PGNSP PGUID b t t 20 21 16 1862 1869 int82eq eqsel eqjoinsel ));
+DATA(insert OID = 1868 ( "=" PGNSP PGUID b t t 20 21 16 1862 1869 int82eq eqsel eqjoinsel "mhf"));
DESCR("equal");
-DATA(insert OID = 1869 ( "<>" PGNSP PGUID b f f 20 21 16 1863 1868 int82ne neqsel neqjoinsel ));
+DATA(insert OID = 1869 ( "<>" PGNSP PGUID b f f 20 21 16 1863 1868 int82ne neqsel neqjoinsel "mhf"));
DESCR("not equal");
-DATA(insert OID = 1870 ( "<" PGNSP PGUID b f f 20 21 16 1865 1873 int82lt scalarltsel scalarltjoinsel ));
+DATA(insert OID = 1870 ( "<" PGNSP PGUID b f f 20 21 16 1865 1873 int82lt scalarltsel scalarltjoinsel "mh-"));
DESCR("less than");
-DATA(insert OID = 1871 ( ">" PGNSP PGUID b f f 20 21 16 1864 1872 int82gt scalargtsel scalargtjoinsel ));
+DATA(insert OID = 1871 ( ">" PGNSP PGUID b f f 20 21 16 1864 1872 int82gt scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than");
-DATA(insert OID = 1872 ( "<=" PGNSP PGUID b f f 20 21 16 1867 1871 int82le scalarltsel scalarltjoinsel ));
+DATA(insert OID = 1872 ( "<=" PGNSP PGUID b f f 20 21 16 1867 1871 int82le scalarltsel scalarltjoinsel "mh-"));
DESCR("less than or equal");
-DATA(insert OID = 1873 ( ">=" PGNSP PGUID b f f 20 21 16 1866 1870 int82ge scalargtsel scalargtjoinsel ));
+DATA(insert OID = 1873 ( ">=" PGNSP PGUID b f f 20 21 16 1866 1870 int82ge scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than or equal");
-DATA(insert OID = 1874 ( "&" PGNSP PGUID b f f 21 21 21 1874 0 int2and - - ));
+DATA(insert OID = 1874 ( "&" PGNSP PGUID b f f 21 21 21 1874 0 int2and - - "---"));
DESCR("bitwise and");
-DATA(insert OID = 1875 ( "|" PGNSP PGUID b f f 21 21 21 1875 0 int2or - - ));
+DATA(insert OID = 1875 ( "|" PGNSP PGUID b f f 21 21 21 1875 0 int2or - - "---"));
DESCR("bitwise or");
-DATA(insert OID = 1876 ( "#" PGNSP PGUID b f f 21 21 21 1876 0 int2xor - - ));
+DATA(insert OID = 1876 ( "#" PGNSP PGUID b f f 21 21 21 1876 0 int2xor - - "---"));
DESCR("bitwise exclusive or");
-DATA(insert OID = 1877 ( "~" PGNSP PGUID l f f 0 21 21 0 0 int2not - - ));
+DATA(insert OID = 1877 ( "~" PGNSP PGUID l f f 0 21 21 0 0 int2not - - "---"));
DESCR("bitwise not");
-DATA(insert OID = 1878 ( "<<" PGNSP PGUID b f f 21 23 21 0 0 int2shl - - ));
+DATA(insert OID = 1878 ( "<<" PGNSP PGUID b f f 21 23 21 0 0 int2shl - - "---"));
DESCR("bitwise shift left");
-DATA(insert OID = 1879 ( ">>" PGNSP PGUID b f f 21 23 21 0 0 int2shr - - ));
+DATA(insert OID = 1879 ( ">>" PGNSP PGUID b f f 21 23 21 0 0 int2shr - - "---"));
DESCR("bitwise shift right");
-DATA(insert OID = 1880 ( "&" PGNSP PGUID b f f 23 23 23 1880 0 int4and - - ));
+DATA(insert OID = 1880 ( "&" PGNSP PGUID b f f 23 23 23 1880 0 int4and - - "---"));
DESCR("bitwise and");
-DATA(insert OID = 1881 ( "|" PGNSP PGUID b f f 23 23 23 1881 0 int4or - - ));
+DATA(insert OID = 1881 ( "|" PGNSP PGUID b f f 23 23 23 1881 0 int4or - - "---"));
DESCR("bitwise or");
-DATA(insert OID = 1882 ( "#" PGNSP PGUID b f f 23 23 23 1882 0 int4xor - - ));
+DATA(insert OID = 1882 ( "#" PGNSP PGUID b f f 23 23 23 1882 0 int4xor - - "---"));
DESCR("bitwise exclusive or");
-DATA(insert OID = 1883 ( "~" PGNSP PGUID l f f 0 23 23 0 0 int4not - - ));
+DATA(insert OID = 1883 ( "~" PGNSP PGUID l f f 0 23 23 0 0 int4not - - "---"));
DESCR("bitwise not");
-DATA(insert OID = 1884 ( "<<" PGNSP PGUID b f f 23 23 23 0 0 int4shl - - ));
+DATA(insert OID = 1884 ( "<<" PGNSP PGUID b f f 23 23 23 0 0 int4shl - - "---"));
DESCR("bitwise shift left");
-DATA(insert OID = 1885 ( ">>" PGNSP PGUID b f f 23 23 23 0 0 int4shr - - ));
+DATA(insert OID = 1885 ( ">>" PGNSP PGUID b f f 23 23 23 0 0 int4shr - - "---"));
DESCR("bitwise shift right");
-DATA(insert OID = 1886 ( "&" PGNSP PGUID b f f 20 20 20 1886 0 int8and - - ));
+DATA(insert OID = 1886 ( "&" PGNSP PGUID b f f 20 20 20 1886 0 int8and - - "---"));
DESCR("bitwise and");
-DATA(insert OID = 1887 ( "|" PGNSP PGUID b f f 20 20 20 1887 0 int8or - - ));
+DATA(insert OID = 1887 ( "|" PGNSP PGUID b f f 20 20 20 1887 0 int8or - - "---"));
DESCR("bitwise or");
-DATA(insert OID = 1888 ( "#" PGNSP PGUID b f f 20 20 20 1888 0 int8xor - - ));
+DATA(insert OID = 1888 ( "#" PGNSP PGUID b f f 20 20 20 1888 0 int8xor - - "---"));
DESCR("bitwise exclusive or");
-DATA(insert OID = 1889 ( "~" PGNSP PGUID l f f 0 20 20 0 0 int8not - - ));
+DATA(insert OID = 1889 ( "~" PGNSP PGUID l f f 0 20 20 0 0 int8not - - "---"));
DESCR("bitwise not");
-DATA(insert OID = 1890 ( "<<" PGNSP PGUID b f f 20 23 20 0 0 int8shl - - ));
+DATA(insert OID = 1890 ( "<<" PGNSP PGUID b f f 20 23 20 0 0 int8shl - - "---"));
DESCR("bitwise shift left");
-DATA(insert OID = 1891 ( ">>" PGNSP PGUID b f f 20 23 20 0 0 int8shr - - ));
+DATA(insert OID = 1891 ( ">>" PGNSP PGUID b f f 20 23 20 0 0 int8shr - - "---"));
DESCR("bitwise shift right");
-DATA(insert OID = 1916 ( "+" PGNSP PGUID l f f 0 20 20 0 0 int8up - - ));
+DATA(insert OID = 1916 ( "+" PGNSP PGUID l f f 0 20 20 0 0 int8up - - "---"));
DESCR("unary plus");
-DATA(insert OID = 1917 ( "+" PGNSP PGUID l f f 0 21 21 0 0 int2up - - ));
+DATA(insert OID = 1917 ( "+" PGNSP PGUID l f f 0 21 21 0 0 int2up - - "---"));
DESCR("unary plus");
-DATA(insert OID = 1918 ( "+" PGNSP PGUID l f f 0 23 23 0 0 int4up - - ));
+DATA(insert OID = 1918 ( "+" PGNSP PGUID l f f 0 23 23 0 0 int4up - - "---"));
DESCR("unary plus");
-DATA(insert OID = 1919 ( "+" PGNSP PGUID l f f 0 700 700 0 0 float4up - - ));
+DATA(insert OID = 1919 ( "+" PGNSP PGUID l f f 0 700 700 0 0 float4up - - "---"));
DESCR("unary plus");
-DATA(insert OID = 1920 ( "+" PGNSP PGUID l f f 0 701 701 0 0 float8up - - ));
+DATA(insert OID = 1920 ( "+" PGNSP PGUID l f f 0 701 701 0 0 float8up - - "---"));
DESCR("unary plus");
-DATA(insert OID = 1921 ( "+" PGNSP PGUID l f f 0 1700 1700 0 0 numeric_uplus - - ));
+DATA(insert OID = 1921 ( "+" PGNSP PGUID l f f 0 1700 1700 0 0 numeric_uplus - - "---"));
DESCR("unary plus");
/* bytea operators */
-DATA(insert OID = 1955 ( "=" PGNSP PGUID b t t 17 17 16 1955 1956 byteaeq eqsel eqjoinsel ));
+DATA(insert OID = 1955 ( "=" PGNSP PGUID b t t 17 17 16 1955 1956 byteaeq eqsel eqjoinsel "mhf"));
DESCR("equal");
-DATA(insert OID = 1956 ( "<>" PGNSP PGUID b f f 17 17 16 1956 1955 byteane neqsel neqjoinsel ));
+DATA(insert OID = 1956 ( "<>" PGNSP PGUID b f f 17 17 16 1956 1955 byteane neqsel neqjoinsel "mhf"));
DESCR("not equal");
-DATA(insert OID = 1957 ( "<" PGNSP PGUID b f f 17 17 16 1959 1960 bytealt scalarltsel scalarltjoinsel ));
+DATA(insert OID = 1957 ( "<" PGNSP PGUID b f f 17 17 16 1959 1960 bytealt scalarltsel scalarltjoinsel "mh-"));
DESCR("less than");
-DATA(insert OID = 1958 ( "<=" PGNSP PGUID b f f 17 17 16 1960 1959 byteale scalarltsel scalarltjoinsel ));
+DATA(insert OID = 1958 ( "<=" PGNSP PGUID b f f 17 17 16 1960 1959 byteale scalarltsel scalarltjoinsel "mh-"));
DESCR("less than or equal");
-DATA(insert OID = 1959 ( ">" PGNSP PGUID b f f 17 17 16 1957 1958 byteagt scalargtsel scalargtjoinsel ));
+DATA(insert OID = 1959 ( ">" PGNSP PGUID b f f 17 17 16 1957 1958 byteagt scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than");
-DATA(insert OID = 1960 ( ">=" PGNSP PGUID b f f 17 17 16 1958 1957 byteage scalargtsel scalargtjoinsel ));
+DATA(insert OID = 1960 ( ">=" PGNSP PGUID b f f 17 17 16 1958 1957 byteage scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than or equal");
-DATA(insert OID = 2016 ( "~~" PGNSP PGUID b f f 17 17 16 0 2017 bytealike likesel likejoinsel ));
+DATA(insert OID = 2016 ( "~~" PGNSP PGUID b f f 17 17 16 0 2017 bytealike likesel likejoinsel "---"));
DESCR("matches LIKE expression");
#define OID_BYTEA_LIKE_OP 2016
-DATA(insert OID = 2017 ( "!~~" PGNSP PGUID b f f 17 17 16 0 2016 byteanlike nlikesel nlikejoinsel ));
+DATA(insert OID = 2017 ( "!~~" PGNSP PGUID b f f 17 17 16 0 2016 byteanlike nlikesel nlikejoinsel "---"));
DESCR("does not match LIKE expression");
-DATA(insert OID = 2018 ( "||" PGNSP PGUID b f f 17 17 17 0 0 byteacat - - ));
+DATA(insert OID = 2018 ( "||" PGNSP PGUID b f f 17 17 17 0 0 byteacat - - "---"));
DESCR("concatenate");
/* timestamp operators */
-DATA(insert OID = 2060 ( "=" PGNSP PGUID b t t 1114 1114 16 2060 2061 timestamp_eq eqsel eqjoinsel ));
+DATA(insert OID = 2060 ( "=" PGNSP PGUID b t t 1114 1114 16 2060 2061 timestamp_eq eqsel eqjoinsel "mhf"));
DESCR("equal");
-DATA(insert OID = 2061 ( "<>" PGNSP PGUID b f f 1114 1114 16 2061 2060 timestamp_ne neqsel neqjoinsel ));
+DATA(insert OID = 2061 ( "<>" PGNSP PGUID b f f 1114 1114 16 2061 2060 timestamp_ne neqsel neqjoinsel "mhf"));
DESCR("not equal");
-DATA(insert OID = 2062 ( "<" PGNSP PGUID b f f 1114 1114 16 2064 2065 timestamp_lt scalarltsel scalarltjoinsel ));
+DATA(insert OID = 2062 ( "<" PGNSP PGUID b f f 1114 1114 16 2064 2065 timestamp_lt scalarltsel scalarltjoinsel "mh-"));
DESCR("less than");
-DATA(insert OID = 2063 ( "<=" PGNSP PGUID b f f 1114 1114 16 2065 2064 timestamp_le scalarltsel scalarltjoinsel ));
+DATA(insert OID = 2063 ( "<=" PGNSP PGUID b f f 1114 1114 16 2065 2064 timestamp_le scalarltsel scalarltjoinsel "mh-"));
DESCR("less than or equal");
-DATA(insert OID = 2064 ( ">" PGNSP PGUID b f f 1114 1114 16 2062 2063 timestamp_gt scalargtsel scalargtjoinsel ));
+DATA(insert OID = 2064 ( ">" PGNSP PGUID b f f 1114 1114 16 2062 2063 timestamp_gt scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than");
-DATA(insert OID = 2065 ( ">=" PGNSP PGUID b f f 1114 1114 16 2063 2062 timestamp_ge scalargtsel scalargtjoinsel ));
+DATA(insert OID = 2065 ( ">=" PGNSP PGUID b f f 1114 1114 16 2063 2062 timestamp_ge scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than or equal");
-DATA(insert OID = 2066 ( "+" PGNSP PGUID b f f 1114 1186 1114 2553 0 timestamp_pl_interval - - ));
+DATA(insert OID = 2066 ( "+" PGNSP PGUID b f f 1114 1186 1114 2553 0 timestamp_pl_interval - - "---"));
DESCR("add");
-DATA(insert OID = 2067 ( "-" PGNSP PGUID b f f 1114 1114 1186 0 0 timestamp_mi - - ));
+DATA(insert OID = 2067 ( "-" PGNSP PGUID b f f 1114 1114 1186 0 0 timestamp_mi - - "---"));
DESCR("subtract");
-DATA(insert OID = 2068 ( "-" PGNSP PGUID b f f 1114 1186 1114 0 0 timestamp_mi_interval - - ));
+DATA(insert OID = 2068 ( "-" PGNSP PGUID b f f 1114 1186 1114 0 0 timestamp_mi_interval - - "---"));
DESCR("subtract");
/* character-by-character (not collation order) comparison operators for character types */
-DATA(insert OID = 2314 ( "~<~" PGNSP PGUID b f f 25 25 16 2318 2317 text_pattern_lt scalarltsel scalarltjoinsel ));
+DATA(insert OID = 2314 ( "~<~" PGNSP PGUID b f f 25 25 16 2318 2317 text_pattern_lt scalarltsel scalarltjoinsel "mh-"));
DESCR("less than");
-DATA(insert OID = 2315 ( "~<=~" PGNSP PGUID b f f 25 25 16 2317 2318 text_pattern_le scalarltsel scalarltjoinsel ));
+DATA(insert OID = 2315 ( "~<=~" PGNSP PGUID b f f 25 25 16 2317 2318 text_pattern_le scalarltsel scalarltjoinsel "mh-"));
DESCR("less than or equal");
-DATA(insert OID = 2317 ( "~>=~" PGNSP PGUID b f f 25 25 16 2315 2314 text_pattern_ge scalargtsel scalargtjoinsel ));
+DATA(insert OID = 2317 ( "~>=~" PGNSP PGUID b f f 25 25 16 2315 2314 text_pattern_ge scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than or equal");
-DATA(insert OID = 2318 ( "~>~" PGNSP PGUID b f f 25 25 16 2314 2315 text_pattern_gt scalargtsel scalargtjoinsel ));
+DATA(insert OID = 2318 ( "~>~" PGNSP PGUID b f f 25 25 16 2314 2315 text_pattern_gt scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than");
-DATA(insert OID = 2326 ( "~<~" PGNSP PGUID b f f 1042 1042 16 2330 2329 bpchar_pattern_lt scalarltsel scalarltjoinsel ));
+DATA(insert OID = 2326 ( "~<~" PGNSP PGUID b f f 1042 1042 16 2330 2329 bpchar_pattern_lt scalarltsel scalarltjoinsel "mh-"));
DESCR("less than");
-DATA(insert OID = 2327 ( "~<=~" PGNSP PGUID b f f 1042 1042 16 2329 2330 bpchar_pattern_le scalarltsel scalarltjoinsel ));
+DATA(insert OID = 2327 ( "~<=~" PGNSP PGUID b f f 1042 1042 16 2329 2330 bpchar_pattern_le scalarltsel scalarltjoinsel "mh-"));
DESCR("less than or equal");
-DATA(insert OID = 2329 ( "~>=~" PGNSP PGUID b f f 1042 1042 16 2327 2326 bpchar_pattern_ge scalargtsel scalargtjoinsel ));
+DATA(insert OID = 2329 ( "~>=~" PGNSP PGUID b f f 1042 1042 16 2327 2326 bpchar_pattern_ge scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than or equal");
-DATA(insert OID = 2330 ( "~>~" PGNSP PGUID b f f 1042 1042 16 2326 2327 bpchar_pattern_gt scalargtsel scalargtjoinsel ));
+DATA(insert OID = 2330 ( "~>~" PGNSP PGUID b f f 1042 1042 16 2326 2327 bpchar_pattern_gt scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than");
/* crosstype operations for date vs. timestamp and timestamptz */
-DATA(insert OID = 2345 ( "<" PGNSP PGUID b f f 1082 1114 16 2375 2348 date_lt_timestamp scalarltsel scalarltjoinsel ));
+DATA(insert OID = 2345 ( "<" PGNSP PGUID b f f 1082 1114 16 2375 2348 date_lt_timestamp scalarltsel scalarltjoinsel "mh-"));
DESCR("less than");
-DATA(insert OID = 2346 ( "<=" PGNSP PGUID b f f 1082 1114 16 2374 2349 date_le_timestamp scalarltsel scalarltjoinsel ));
+DATA(insert OID = 2346 ( "<=" PGNSP PGUID b f f 1082 1114 16 2374 2349 date_le_timestamp scalarltsel scalarltjoinsel "mh-"));
DESCR("less than or equal");
-DATA(insert OID = 2347 ( "=" PGNSP PGUID b t f 1082 1114 16 2373 2350 date_eq_timestamp eqsel eqjoinsel ));
+DATA(insert OID = 2347 ( "=" PGNSP PGUID b t f 1082 1114 16 2373 2350 date_eq_timestamp eqsel eqjoinsel "mhf"));
DESCR("equal");
-DATA(insert OID = 2348 ( ">=" PGNSP PGUID b f f 1082 1114 16 2372 2345 date_ge_timestamp scalargtsel scalargtjoinsel ));
+DATA(insert OID = 2348 ( ">=" PGNSP PGUID b f f 1082 1114 16 2372 2345 date_ge_timestamp scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than or equal");
-DATA(insert OID = 2349 ( ">" PGNSP PGUID b f f 1082 1114 16 2371 2346 date_gt_timestamp scalargtsel scalargtjoinsel ));
+DATA(insert OID = 2349 ( ">" PGNSP PGUID b f f 1082 1114 16 2371 2346 date_gt_timestamp scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than");
-DATA(insert OID = 2350 ( "<>" PGNSP PGUID b f f 1082 1114 16 2376 2347 date_ne_timestamp neqsel neqjoinsel ));
+DATA(insert OID = 2350 ( "<>" PGNSP PGUID b f f 1082 1114 16 2376 2347 date_ne_timestamp neqsel neqjoinsel "mhf"));
DESCR("not equal");
-DATA(insert OID = 2358 ( "<" PGNSP PGUID b f f 1082 1184 16 2388 2361 date_lt_timestamptz scalarltsel scalarltjoinsel ));
+DATA(insert OID = 2358 ( "<" PGNSP PGUID b f f 1082 1184 16 2388 2361 date_lt_timestamptz scalarltsel scalarltjoinsel "mh-"));
DESCR("less than");
-DATA(insert OID = 2359 ( "<=" PGNSP PGUID b f f 1082 1184 16 2387 2362 date_le_timestamptz scalarltsel scalarltjoinsel ));
+DATA(insert OID = 2359 ( "<=" PGNSP PGUID b f f 1082 1184 16 2387 2362 date_le_timestamptz scalarltsel scalarltjoinsel "mh-"));
DESCR("less than or equal");
-DATA(insert OID = 2360 ( "=" PGNSP PGUID b t f 1082 1184 16 2386 2363 date_eq_timestamptz eqsel eqjoinsel ));
+DATA(insert OID = 2360 ( "=" PGNSP PGUID b t f 1082 1184 16 2386 2363 date_eq_timestamptz eqsel eqjoinsel "mhf"));
DESCR("equal");
-DATA(insert OID = 2361 ( ">=" PGNSP PGUID b f f 1082 1184 16 2385 2358 date_ge_timestamptz scalargtsel scalargtjoinsel ));
+DATA(insert OID = 2361 ( ">=" PGNSP PGUID b f f 1082 1184 16 2385 2358 date_ge_timestamptz scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than or equal");
-DATA(insert OID = 2362 ( ">" PGNSP PGUID b f f 1082 1184 16 2384 2359 date_gt_timestamptz scalargtsel scalargtjoinsel ));
+DATA(insert OID = 2362 ( ">" PGNSP PGUID b f f 1082 1184 16 2384 2359 date_gt_timestamptz scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than");
-DATA(insert OID = 2363 ( "<>" PGNSP PGUID b f f 1082 1184 16 2389 2360 date_ne_timestamptz neqsel neqjoinsel ));
+DATA(insert OID = 2363 ( "<>" PGNSP PGUID b f f 1082 1184 16 2389 2360 date_ne_timestamptz neqsel neqjoinsel "mhf"));
DESCR("not equal");
-DATA(insert OID = 2371 ( "<" PGNSP PGUID b f f 1114 1082 16 2349 2374 timestamp_lt_date scalarltsel scalarltjoinsel ));
+DATA(insert OID = 2371 ( "<" PGNSP PGUID b f f 1114 1082 16 2349 2374 timestamp_lt_date scalarltsel scalarltjoinsel "mh-"));
DESCR("less than");
-DATA(insert OID = 2372 ( "<=" PGNSP PGUID b f f 1114 1082 16 2348 2375 timestamp_le_date scalarltsel scalarltjoinsel ));
+DATA(insert OID = 2372 ( "<=" PGNSP PGUID b f f 1114 1082 16 2348 2375 timestamp_le_date scalarltsel scalarltjoinsel "mh-"));
DESCR("less than or equal");
-DATA(insert OID = 2373 ( "=" PGNSP PGUID b t f 1114 1082 16 2347 2376 timestamp_eq_date eqsel eqjoinsel ));
+DATA(insert OID = 2373 ( "=" PGNSP PGUID b t f 1114 1082 16 2347 2376 timestamp_eq_date eqsel eqjoinsel "mhf"));
DESCR("equal");
-DATA(insert OID = 2374 ( ">=" PGNSP PGUID b f f 1114 1082 16 2346 2371 timestamp_ge_date scalargtsel scalargtjoinsel ));
+DATA(insert OID = 2374 ( ">=" PGNSP PGUID b f f 1114 1082 16 2346 2371 timestamp_ge_date scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than or equal");
-DATA(insert OID = 2375 ( ">" PGNSP PGUID b f f 1114 1082 16 2345 2372 timestamp_gt_date scalargtsel scalargtjoinsel ));
+DATA(insert OID = 2375 ( ">" PGNSP PGUID b f f 1114 1082 16 2345 2372 timestamp_gt_date scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than");
-DATA(insert OID = 2376 ( "<>" PGNSP PGUID b f f 1114 1082 16 2350 2373 timestamp_ne_date neqsel neqjoinsel ));
+DATA(insert OID = 2376 ( "<>" PGNSP PGUID b f f 1114 1082 16 2350 2373 timestamp_ne_date neqsel neqjoinsel "mhf"));
DESCR("not equal");
-DATA(insert OID = 2384 ( "<" PGNSP PGUID b f f 1184 1082 16 2362 2387 timestamptz_lt_date scalarltsel scalarltjoinsel ));
+DATA(insert OID = 2384 ( "<" PGNSP PGUID b f f 1184 1082 16 2362 2387 timestamptz_lt_date scalarltsel scalarltjoinsel "mh-"));
DESCR("less than");
-DATA(insert OID = 2385 ( "<=" PGNSP PGUID b f f 1184 1082 16 2361 2388 timestamptz_le_date scalarltsel scalarltjoinsel ));
+DATA(insert OID = 2385 ( "<=" PGNSP PGUID b f f 1184 1082 16 2361 2388 timestamptz_le_date scalarltsel scalarltjoinsel "mh-"));
DESCR("less than or equal");
-DATA(insert OID = 2386 ( "=" PGNSP PGUID b t f 1184 1082 16 2360 2389 timestamptz_eq_date eqsel eqjoinsel ));
+DATA(insert OID = 2386 ( "=" PGNSP PGUID b t f 1184 1082 16 2360 2389 timestamptz_eq_date eqsel eqjoinsel "mhf"));
DESCR("equal");
-DATA(insert OID = 2387 ( ">=" PGNSP PGUID b f f 1184 1082 16 2359 2384 timestamptz_ge_date scalargtsel scalargtjoinsel ));
+DATA(insert OID = 2387 ( ">=" PGNSP PGUID b f f 1184 1082 16 2359 2384 timestamptz_ge_date scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than or equal");
-DATA(insert OID = 2388 ( ">" PGNSP PGUID b f f 1184 1082 16 2358 2385 timestamptz_gt_date scalargtsel scalargtjoinsel ));
+DATA(insert OID = 2388 ( ">" PGNSP PGUID b f f 1184 1082 16 2358 2385 timestamptz_gt_date scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than");
-DATA(insert OID = 2389 ( "<>" PGNSP PGUID b f f 1184 1082 16 2363 2386 timestamptz_ne_date neqsel neqjoinsel ));
+DATA(insert OID = 2389 ( "<>" PGNSP PGUID b f f 1184 1082 16 2363 2386 timestamptz_ne_date neqsel neqjoinsel "mhf"));
DESCR("not equal");
/* crosstype operations for timestamp vs. timestamptz */
-DATA(insert OID = 2534 ( "<" PGNSP PGUID b f f 1114 1184 16 2544 2537 timestamp_lt_timestamptz scalarltsel scalarltjoinsel ));
+DATA(insert OID = 2534 ( "<" PGNSP PGUID b f f 1114 1184 16 2544 2537 timestamp_lt_timestamptz scalarltsel scalarltjoinsel "mh-"));
DESCR("less than");
-DATA(insert OID = 2535 ( "<=" PGNSP PGUID b f f 1114 1184 16 2543 2538 timestamp_le_timestamptz scalarltsel scalarltjoinsel ));
+DATA(insert OID = 2535 ( "<=" PGNSP PGUID b f f 1114 1184 16 2543 2538 timestamp_le_timestamptz scalarltsel scalarltjoinsel "mh-"));
DESCR("less than or equal");
-DATA(insert OID = 2536 ( "=" PGNSP PGUID b t f 1114 1184 16 2542 2539 timestamp_eq_timestamptz eqsel eqjoinsel ));
+DATA(insert OID = 2536 ( "=" PGNSP PGUID b t f 1114 1184 16 2542 2539 timestamp_eq_timestamptz eqsel eqjoinsel "mhf"));
DESCR("equal");
-DATA(insert OID = 2537 ( ">=" PGNSP PGUID b f f 1114 1184 16 2541 2534 timestamp_ge_timestamptz scalargtsel scalargtjoinsel ));
+DATA(insert OID = 2537 ( ">=" PGNSP PGUID b f f 1114 1184 16 2541 2534 timestamp_ge_timestamptz scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than or equal");
-DATA(insert OID = 2538 ( ">" PGNSP PGUID b f f 1114 1184 16 2540 2535 timestamp_gt_timestamptz scalargtsel scalargtjoinsel ));
+DATA(insert OID = 2538 ( ">" PGNSP PGUID b f f 1114 1184 16 2540 2535 timestamp_gt_timestamptz scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than");
-DATA(insert OID = 2539 ( "<>" PGNSP PGUID b f f 1114 1184 16 2545 2536 timestamp_ne_timestamptz neqsel neqjoinsel ));
+DATA(insert OID = 2539 ( "<>" PGNSP PGUID b f f 1114 1184 16 2545 2536 timestamp_ne_timestamptz neqsel neqjoinsel "mhf"));
DESCR("not equal");
-DATA(insert OID = 2540 ( "<" PGNSP PGUID b f f 1184 1114 16 2538 2543 timestamptz_lt_timestamp scalarltsel scalarltjoinsel ));
+DATA(insert OID = 2540 ( "<" PGNSP PGUID b f f 1184 1114 16 2538 2543 timestamptz_lt_timestamp scalarltsel scalarltjoinsel "mh-"));
DESCR("less than");
-DATA(insert OID = 2541 ( "<=" PGNSP PGUID b f f 1184 1114 16 2537 2544 timestamptz_le_timestamp scalarltsel scalarltjoinsel ));
+DATA(insert OID = 2541 ( "<=" PGNSP PGUID b f f 1184 1114 16 2537 2544 timestamptz_le_timestamp scalarltsel scalarltjoinsel "mh-"));
DESCR("less than or equal");
-DATA(insert OID = 2542 ( "=" PGNSP PGUID b t f 1184 1114 16 2536 2545 timestamptz_eq_timestamp eqsel eqjoinsel ));
+DATA(insert OID = 2542 ( "=" PGNSP PGUID b t f 1184 1114 16 2536 2545 timestamptz_eq_timestamp eqsel eqjoinsel "mhf"));
DESCR("equal");
-DATA(insert OID = 2543 ( ">=" PGNSP PGUID b f f 1184 1114 16 2535 2540 timestamptz_ge_timestamp scalargtsel scalargtjoinsel ));
+DATA(insert OID = 2543 ( ">=" PGNSP PGUID b f f 1184 1114 16 2535 2540 timestamptz_ge_timestamp scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than or equal");
-DATA(insert OID = 2544 ( ">" PGNSP PGUID b f f 1184 1114 16 2534 2541 timestamptz_gt_timestamp scalargtsel scalargtjoinsel ));
+DATA(insert OID = 2544 ( ">" PGNSP PGUID b f f 1184 1114 16 2534 2541 timestamptz_gt_timestamp scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than");
-DATA(insert OID = 2545 ( "<>" PGNSP PGUID b f f 1184 1114 16 2539 2542 timestamptz_ne_timestamp neqsel neqjoinsel ));
+DATA(insert OID = 2545 ( "<>" PGNSP PGUID b f f 1184 1114 16 2539 2542 timestamptz_ne_timestamp neqsel neqjoinsel "mhf"));
DESCR("not equal");
/* formerly-missing interval + datetime operators */
-DATA(insert OID = 2551 ( "+" PGNSP PGUID b f f 1186 1082 1114 1076 0 interval_pl_date - - ));
+DATA(insert OID = 2551 ( "+" PGNSP PGUID b f f 1186 1082 1114 1076 0 interval_pl_date - - "---"));
DESCR("add");
-DATA(insert OID = 2552 ( "+" PGNSP PGUID b f f 1186 1266 1266 1802 0 interval_pl_timetz - - ));
+DATA(insert OID = 2552 ( "+" PGNSP PGUID b f f 1186 1266 1266 1802 0 interval_pl_timetz - - "---"));
DESCR("add");
-DATA(insert OID = 2553 ( "+" PGNSP PGUID b f f 1186 1114 1114 2066 0 interval_pl_timestamp - - ));
+DATA(insert OID = 2553 ( "+" PGNSP PGUID b f f 1186 1114 1114 2066 0 interval_pl_timestamp - - "---"));
DESCR("add");
-DATA(insert OID = 2554 ( "+" PGNSP PGUID b f f 1186 1184 1184 1327 0 interval_pl_timestamptz - - ));
+DATA(insert OID = 2554 ( "+" PGNSP PGUID b f f 1186 1184 1184 1327 0 interval_pl_timestamptz - - "---"));
DESCR("add");
-DATA(insert OID = 2555 ( "+" PGNSP PGUID b f f 23 1082 1082 1100 0 integer_pl_date - - ));
+DATA(insert OID = 2555 ( "+" PGNSP PGUID b f f 23 1082 1082 1100 0 integer_pl_date - - "---"));
DESCR("add");
/* new operators for Y-direction rtree opfamilies */
-DATA(insert OID = 2570 ( "<<|" PGNSP PGUID b f f 603 603 16 0 0 box_below positionsel positionjoinsel ));
+DATA(insert OID = 2570 ( "<<|" PGNSP PGUID b f f 603 603 16 0 0 box_below positionsel positionjoinsel "---"));
DESCR("is below");
-DATA(insert OID = 2571 ( "&<|" PGNSP PGUID b f f 603 603 16 0 0 box_overbelow positionsel positionjoinsel ));
+DATA(insert OID = 2571 ( "&<|" PGNSP PGUID b f f 603 603 16 0 0 box_overbelow positionsel positionjoinsel "---"));
DESCR("overlaps or is below");
-DATA(insert OID = 2572 ( "|&>" PGNSP PGUID b f f 603 603 16 0 0 box_overabove positionsel positionjoinsel ));
+DATA(insert OID = 2572 ( "|&>" PGNSP PGUID b f f 603 603 16 0 0 box_overabove positionsel positionjoinsel "---"));
DESCR("overlaps or is above");
-DATA(insert OID = 2573 ( "|>>" PGNSP PGUID b f f 603 603 16 0 0 box_above positionsel positionjoinsel ));
+DATA(insert OID = 2573 ( "|>>" PGNSP PGUID b f f 603 603 16 0 0 box_above positionsel positionjoinsel "---"));
DESCR("is above");
-DATA(insert OID = 2574 ( "<<|" PGNSP PGUID b f f 604 604 16 0 0 poly_below positionsel positionjoinsel ));
+DATA(insert OID = 2574 ( "<<|" PGNSP PGUID b f f 604 604 16 0 0 poly_below positionsel positionjoinsel "---"));
DESCR("is below");
-DATA(insert OID = 2575 ( "&<|" PGNSP PGUID b f f 604 604 16 0 0 poly_overbelow positionsel positionjoinsel ));
+DATA(insert OID = 2575 ( "&<|" PGNSP PGUID b f f 604 604 16 0 0 poly_overbelow positionsel positionjoinsel "---"));
DESCR("overlaps or is below");
-DATA(insert OID = 2576 ( "|&>" PGNSP PGUID b f f 604 604 16 0 0 poly_overabove positionsel positionjoinsel ));
+DATA(insert OID = 2576 ( "|&>" PGNSP PGUID b f f 604 604 16 0 0 poly_overabove positionsel positionjoinsel "---"));
DESCR("overlaps or is above");
-DATA(insert OID = 2577 ( "|>>" PGNSP PGUID b f f 604 604 16 0 0 poly_above positionsel positionjoinsel ));
+DATA(insert OID = 2577 ( "|>>" PGNSP PGUID b f f 604 604 16 0 0 poly_above positionsel positionjoinsel "---"));
DESCR("is above");
-DATA(insert OID = 2589 ( "&<|" PGNSP PGUID b f f 718 718 16 0 0 circle_overbelow positionsel positionjoinsel ));
+DATA(insert OID = 2589 ( "&<|" PGNSP PGUID b f f 718 718 16 0 0 circle_overbelow positionsel positionjoinsel "---"));
DESCR("overlaps or is below");
-DATA(insert OID = 2590 ( "|&>" PGNSP PGUID b f f 718 718 16 0 0 circle_overabove positionsel positionjoinsel ));
+DATA(insert OID = 2590 ( "|&>" PGNSP PGUID b f f 718 718 16 0 0 circle_overabove positionsel positionjoinsel "---"));
DESCR("overlaps or is above");
/* overlap/contains/contained for arrays */
-DATA(insert OID = 2750 ( "&&" PGNSP PGUID b f f 2277 2277 16 2750 0 arrayoverlap arraycontsel arraycontjoinsel ));
+DATA(insert OID = 2750 ( "&&" PGNSP PGUID b f f 2277 2277 16 2750 0 arrayoverlap arraycontsel arraycontjoinsel "---"));
DESCR("overlaps");
#define OID_ARRAY_OVERLAP_OP 2750
-DATA(insert OID = 2751 ( "@>" PGNSP PGUID b f f 2277 2277 16 2752 0 arraycontains arraycontsel arraycontjoinsel ));
+DATA(insert OID = 2751 ( "@>" PGNSP PGUID b f f 2277 2277 16 2752 0 arraycontains arraycontsel arraycontjoinsel "---"));
DESCR("contains");
#define OID_ARRAY_CONTAINS_OP 2751
-DATA(insert OID = 2752 ( "<@" PGNSP PGUID b f f 2277 2277 16 2751 0 arraycontained arraycontsel arraycontjoinsel ));
+DATA(insert OID = 2752 ( "<@" PGNSP PGUID b f f 2277 2277 16 2751 0 arraycontained arraycontsel arraycontjoinsel "---"));
DESCR("is contained by");
#define OID_ARRAY_CONTAINED_OP 2752
/* capturing operators to preserve pre-8.3 behavior of text concatenation */
-DATA(insert OID = 2779 ( "||" PGNSP PGUID b f f 25 2776 25 0 0 textanycat - - ));
+DATA(insert OID = 2779 ( "||" PGNSP PGUID b f f 25 2776 25 0 0 textanycat - - "---"));
DESCR("concatenate");
-DATA(insert OID = 2780 ( "||" PGNSP PGUID b f f 2776 25 25 0 0 anytextcat - - ));
+DATA(insert OID = 2780 ( "||" PGNSP PGUID b f f 2776 25 25 0 0 anytextcat - - "---"));
DESCR("concatenate");
/* obsolete names for contains/contained-by operators; remove these someday */
-DATA(insert OID = 2860 ( "@" PGNSP PGUID b f f 604 604 16 2861 0 poly_contained contsel contjoinsel ));
+DATA(insert OID = 2860 ( "@" PGNSP PGUID b f f 604 604 16 2861 0 poly_contained contsel contjoinsel "---"));
DESCR("deprecated, use <@ instead");
-DATA(insert OID = 2861 ( "~" PGNSP PGUID b f f 604 604 16 2860 0 poly_contain contsel contjoinsel ));
+DATA(insert OID = 2861 ( "~" PGNSP PGUID b f f 604 604 16 2860 0 poly_contain contsel contjoinsel "---"));
DESCR("deprecated, use @> instead");
-DATA(insert OID = 2862 ( "@" PGNSP PGUID b f f 603 603 16 2863 0 box_contained contsel contjoinsel ));
+DATA(insert OID = 2862 ( "@" PGNSP PGUID b f f 603 603 16 2863 0 box_contained contsel contjoinsel "---"));
DESCR("deprecated, use <@ instead");
-DATA(insert OID = 2863 ( "~" PGNSP PGUID b f f 603 603 16 2862 0 box_contain contsel contjoinsel ));
+DATA(insert OID = 2863 ( "~" PGNSP PGUID b f f 603 603 16 2862 0 box_contain contsel contjoinsel "---"));
DESCR("deprecated, use @> instead");
-DATA(insert OID = 2864 ( "@" PGNSP PGUID b f f 718 718 16 2865 0 circle_contained contsel contjoinsel ));
+DATA(insert OID = 2864 ( "@" PGNSP PGUID b f f 718 718 16 2865 0 circle_contained contsel contjoinsel "---"));
DESCR("deprecated, use <@ instead");
-DATA(insert OID = 2865 ( "~" PGNSP PGUID b f f 718 718 16 2864 0 circle_contain contsel contjoinsel ));
+DATA(insert OID = 2865 ( "~" PGNSP PGUID b f f 718 718 16 2864 0 circle_contain contsel contjoinsel "---"));
DESCR("deprecated, use @> instead");
-DATA(insert OID = 2866 ( "@" PGNSP PGUID b f f 600 603 16 0 0 on_pb - - ));
+DATA(insert OID = 2866 ( "@" PGNSP PGUID b f f 600 603 16 0 0 on_pb - - "---"));
DESCR("deprecated, use <@ instead");
-DATA(insert OID = 2867 ( "@" PGNSP PGUID b f f 600 602 16 2868 0 on_ppath - - ));
+DATA(insert OID = 2867 ( "@" PGNSP PGUID b f f 600 602 16 2868 0 on_ppath - - "---"));
DESCR("deprecated, use <@ instead");
-DATA(insert OID = 2868 ( "~" PGNSP PGUID b f f 602 600 16 2867 0 path_contain_pt - - ));
+DATA(insert OID = 2868 ( "~" PGNSP PGUID b f f 602 600 16 2867 0 path_contain_pt - - "---"));
DESCR("deprecated, use @> instead");
-DATA(insert OID = 2869 ( "@" PGNSP PGUID b f f 600 604 16 2870 0 pt_contained_poly - - ));
+DATA(insert OID = 2869 ( "@" PGNSP PGUID b f f 600 604 16 2870 0 pt_contained_poly - - "---"));
DESCR("deprecated, use <@ instead");
-DATA(insert OID = 2870 ( "~" PGNSP PGUID b f f 604 600 16 2869 0 poly_contain_pt - - ));
+DATA(insert OID = 2870 ( "~" PGNSP PGUID b f f 604 600 16 2869 0 poly_contain_pt - - "---"));
DESCR("deprecated, use @> instead");
-DATA(insert OID = 2871 ( "@" PGNSP PGUID b f f 600 718 16 2872 0 pt_contained_circle - - ));
+DATA(insert OID = 2871 ( "@" PGNSP PGUID b f f 600 718 16 2872 0 pt_contained_circle - - "---"));
DESCR("deprecated, use <@ instead");
-DATA(insert OID = 2872 ( "~" PGNSP PGUID b f f 718 600 16 2871 0 circle_contain_pt - - ));
+DATA(insert OID = 2872 ( "~" PGNSP PGUID b f f 718 600 16 2871 0 circle_contain_pt - - "---"));
DESCR("deprecated, use @> instead");
-DATA(insert OID = 2873 ( "@" PGNSP PGUID b f f 600 628 16 0 0 on_pl - - ));
+DATA(insert OID = 2873 ( "@" PGNSP PGUID b f f 600 628 16 0 0 on_pl - - "---"));
DESCR("deprecated, use <@ instead");
-DATA(insert OID = 2874 ( "@" PGNSP PGUID b f f 600 601 16 0 0 on_ps - - ));
+DATA(insert OID = 2874 ( "@" PGNSP PGUID b f f 600 601 16 0 0 on_ps - - "---"));
DESCR("deprecated, use <@ instead");
-DATA(insert OID = 2875 ( "@" PGNSP PGUID b f f 601 628 16 0 0 on_sl - - ));
+DATA(insert OID = 2875 ( "@" PGNSP PGUID b f f 601 628 16 0 0 on_sl - - "---"));
DESCR("deprecated, use <@ instead");
-DATA(insert OID = 2876 ( "@" PGNSP PGUID b f f 601 603 16 0 0 on_sb - - ));
+DATA(insert OID = 2876 ( "@" PGNSP PGUID b f f 601 603 16 0 0 on_sb - - "---"));
DESCR("deprecated, use <@ instead");
-DATA(insert OID = 2877 ( "~" PGNSP PGUID b f f 1034 1033 16 0 0 aclcontains - - ));
+DATA(insert OID = 2877 ( "~" PGNSP PGUID b f f 1034 1033 16 0 0 aclcontains - - "---"));
DESCR("deprecated, use @> instead");
/* uuid operators */
-DATA(insert OID = 2972 ( "=" PGNSP PGUID b t t 2950 2950 16 2972 2973 uuid_eq eqsel eqjoinsel ));
+DATA(insert OID = 2972 ( "=" PGNSP PGUID b t t 2950 2950 16 2972 2973 uuid_eq eqsel eqjoinsel "mhf"));
DESCR("equal");
-DATA(insert OID = 2973 ( "<>" PGNSP PGUID b f f 2950 2950 16 2973 2972 uuid_ne neqsel neqjoinsel ));
+DATA(insert OID = 2973 ( "<>" PGNSP PGUID b f f 2950 2950 16 2973 2972 uuid_ne neqsel neqjoinsel "mhf"));
DESCR("not equal");
-DATA(insert OID = 2974 ( "<" PGNSP PGUID b f f 2950 2950 16 2975 2977 uuid_lt scalarltsel scalarltjoinsel ));
+DATA(insert OID = 2974 ( "<" PGNSP PGUID b f f 2950 2950 16 2975 2977 uuid_lt scalarltsel scalarltjoinsel "mh-"));
DESCR("less than");
-DATA(insert OID = 2975 ( ">" PGNSP PGUID b f f 2950 2950 16 2974 2976 uuid_gt scalargtsel scalargtjoinsel ));
+DATA(insert OID = 2975 ( ">" PGNSP PGUID b f f 2950 2950 16 2974 2976 uuid_gt scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than");
-DATA(insert OID = 2976 ( "<=" PGNSP PGUID b f f 2950 2950 16 2977 2975 uuid_le scalarltsel scalarltjoinsel ));
+DATA(insert OID = 2976 ( "<=" PGNSP PGUID b f f 2950 2950 16 2977 2975 uuid_le scalarltsel scalarltjoinsel "mh-"));
DESCR("less than or equal");
-DATA(insert OID = 2977 ( ">=" PGNSP PGUID b f f 2950 2950 16 2976 2974 uuid_ge scalargtsel scalargtjoinsel ));
+DATA(insert OID = 2977 ( ">=" PGNSP PGUID b f f 2950 2950 16 2976 2974 uuid_ge scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than or equal");
/* pg_lsn operators */
-DATA(insert OID = 3222 ( "=" PGNSP PGUID b t t 3220 3220 16 3222 3223 pg_lsn_eq eqsel eqjoinsel ));
+DATA(insert OID = 3222 ( "=" PGNSP PGUID b t t 3220 3220 16 3222 3223 pg_lsn_eq eqsel eqjoinsel "mhf"));
DESCR("equal");
-DATA(insert OID = 3223 ( "<>" PGNSP PGUID b f f 3220 3220 16 3223 3222 pg_lsn_ne neqsel neqjoinsel ));
+DATA(insert OID = 3223 ( "<>" PGNSP PGUID b f f 3220 3220 16 3223 3222 pg_lsn_ne neqsel neqjoinsel "mhf"));
DESCR("not equal");
-DATA(insert OID = 3224 ( "<" PGNSP PGUID b f f 3220 3220 16 3225 3227 pg_lsn_lt scalarltsel scalarltjoinsel ));
+DATA(insert OID = 3224 ( "<" PGNSP PGUID b f f 3220 3220 16 3225 3227 pg_lsn_lt scalarltsel scalarltjoinsel "mh-"));
DESCR("less than");
-DATA(insert OID = 3225 ( ">" PGNSP PGUID b f f 3220 3220 16 3224 3226 pg_lsn_gt scalargtsel scalargtjoinsel ));
+DATA(insert OID = 3225 ( ">" PGNSP PGUID b f f 3220 3220 16 3224 3226 pg_lsn_gt scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than");
-DATA(insert OID = 3226 ( "<=" PGNSP PGUID b f f 3220 3220 16 3227 3225 pg_lsn_le scalarltsel scalarltjoinsel ));
+DATA(insert OID = 3226 ( "<=" PGNSP PGUID b f f 3220 3220 16 3227 3225 pg_lsn_le scalarltsel scalarltjoinsel "mh-"));
DESCR("less than or equal");
-DATA(insert OID = 3227 ( ">=" PGNSP PGUID b f f 3220 3220 16 3226 3224 pg_lsn_ge scalargtsel scalargtjoinsel ));
+DATA(insert OID = 3227 ( ">=" PGNSP PGUID b f f 3220 3220 16 3226 3224 pg_lsn_ge scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than or equal");
-DATA(insert OID = 3228 ( "-" PGNSP PGUID b f f 3220 3220 1700 0 0 pg_lsn_mi - - ));
+DATA(insert OID = 3228 ( "-" PGNSP PGUID b f f 3220 3220 1700 0 0 pg_lsn_mi - - "---"));
DESCR("minus");
/* enum operators */
-DATA(insert OID = 3516 ( "=" PGNSP PGUID b t t 3500 3500 16 3516 3517 enum_eq eqsel eqjoinsel ));
+DATA(insert OID = 3516 ( "=" PGNSP PGUID b t t 3500 3500 16 3516 3517 enum_eq eqsel eqjoinsel "mhf"));
DESCR("equal");
-DATA(insert OID = 3517 ( "<>" PGNSP PGUID b f f 3500 3500 16 3517 3516 enum_ne neqsel neqjoinsel ));
+DATA(insert OID = 3517 ( "<>" PGNSP PGUID b f f 3500 3500 16 3517 3516 enum_ne neqsel neqjoinsel "mhf"));
DESCR("not equal");
-DATA(insert OID = 3518 ( "<" PGNSP PGUID b f f 3500 3500 16 3519 3521 enum_lt scalarltsel scalarltjoinsel ));
+DATA(insert OID = 3518 ( "<" PGNSP PGUID b f f 3500 3500 16 3519 3521 enum_lt scalarltsel scalarltjoinsel "mh-"));
DESCR("less than");
-DATA(insert OID = 3519 ( ">" PGNSP PGUID b f f 3500 3500 16 3518 3520 enum_gt scalargtsel scalargtjoinsel ));
+DATA(insert OID = 3519 ( ">" PGNSP PGUID b f f 3500 3500 16 3518 3520 enum_gt scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than");
-DATA(insert OID = 3520 ( "<=" PGNSP PGUID b f f 3500 3500 16 3521 3519 enum_le scalarltsel scalarltjoinsel ));
+DATA(insert OID = 3520 ( "<=" PGNSP PGUID b f f 3500 3500 16 3521 3519 enum_le scalarltsel scalarltjoinsel "mh-"));
DESCR("less than or equal");
-DATA(insert OID = 3521 ( ">=" PGNSP PGUID b f f 3500 3500 16 3520 3518 enum_ge scalargtsel scalargtjoinsel ));
+DATA(insert OID = 3521 ( ">=" PGNSP PGUID b f f 3500 3500 16 3520 3518 enum_ge scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than or equal");
/*
* tsearch operations
*/
-DATA(insert OID = 3627 ( "<" PGNSP PGUID b f f 3614 3614 16 3632 3631 tsvector_lt scalarltsel scalarltjoinsel ));
+DATA(insert OID = 3627 ( "<" PGNSP PGUID b f f 3614 3614 16 3632 3631 tsvector_lt scalarltsel scalarltjoinsel "mh-"));
DESCR("less than");
-DATA(insert OID = 3628 ( "<=" PGNSP PGUID b f f 3614 3614 16 3631 3632 tsvector_le scalarltsel scalarltjoinsel ));
+DATA(insert OID = 3628 ( "<=" PGNSP PGUID b f f 3614 3614 16 3631 3632 tsvector_le scalarltsel scalarltjoinsel "mh-"));
DESCR("less than or equal");
-DATA(insert OID = 3629 ( "=" PGNSP PGUID b t f 3614 3614 16 3629 3630 tsvector_eq eqsel eqjoinsel ));
+DATA(insert OID = 3629 ( "=" PGNSP PGUID b t f 3614 3614 16 3629 3630 tsvector_eq eqsel eqjoinsel "mhf"));
DESCR("equal");
-DATA(insert OID = 3630 ( "<>" PGNSP PGUID b f f 3614 3614 16 3630 3629 tsvector_ne neqsel neqjoinsel ));
+DATA(insert OID = 3630 ( "<>" PGNSP PGUID b f f 3614 3614 16 3630 3629 tsvector_ne neqsel neqjoinsel "mhf"));
DESCR("not equal");
-DATA(insert OID = 3631 ( ">=" PGNSP PGUID b f f 3614 3614 16 3628 3627 tsvector_ge scalargtsel scalargtjoinsel ));
+DATA(insert OID = 3631 ( ">=" PGNSP PGUID b f f 3614 3614 16 3628 3627 tsvector_ge scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than or equal");
-DATA(insert OID = 3632 ( ">" PGNSP PGUID b f f 3614 3614 16 3627 3628 tsvector_gt scalargtsel scalargtjoinsel ));
+DATA(insert OID = 3632 ( ">" PGNSP PGUID b f f 3614 3614 16 3627 3628 tsvector_gt scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than");
-DATA(insert OID = 3633 ( "||" PGNSP PGUID b f f 3614 3614 3614 0 0 tsvector_concat - - ));
+DATA(insert OID = 3633 ( "||" PGNSP PGUID b f f 3614 3614 3614 0 0 tsvector_concat - - "---"));
DESCR("concatenate");
-DATA(insert OID = 3636 ( "@@" PGNSP PGUID b f f 3614 3615 16 3637 0 ts_match_vq tsmatchsel tsmatchjoinsel ));
+DATA(insert OID = 3636 ( "@@" PGNSP PGUID b f f 3614 3615 16 3637 0 ts_match_vq tsmatchsel tsmatchjoinsel "---"));
DESCR("text search match");
-DATA(insert OID = 3637 ( "@@" PGNSP PGUID b f f 3615 3614 16 3636 0 ts_match_qv tsmatchsel tsmatchjoinsel ));
+DATA(insert OID = 3637 ( "@@" PGNSP PGUID b f f 3615 3614 16 3636 0 ts_match_qv tsmatchsel tsmatchjoinsel "---"));
DESCR("text search match");
-DATA(insert OID = 3660 ( "@@@" PGNSP PGUID b f f 3614 3615 16 3661 0 ts_match_vq tsmatchsel tsmatchjoinsel ));
+DATA(insert OID = 3660 ( "@@@" PGNSP PGUID b f f 3614 3615 16 3661 0 ts_match_vq tsmatchsel tsmatchjoinsel "---"));
DESCR("deprecated, use @@ instead");
-DATA(insert OID = 3661 ( "@@@" PGNSP PGUID b f f 3615 3614 16 3660 0 ts_match_qv tsmatchsel tsmatchjoinsel ));
+DATA(insert OID = 3661 ( "@@@" PGNSP PGUID b f f 3615 3614 16 3660 0 ts_match_qv tsmatchsel tsmatchjoinsel "---"));
DESCR("deprecated, use @@ instead");
-DATA(insert OID = 3674 ( "<" PGNSP PGUID b f f 3615 3615 16 3679 3678 tsquery_lt scalarltsel scalarltjoinsel ));
+DATA(insert OID = 3674 ( "<" PGNSP PGUID b f f 3615 3615 16 3679 3678 tsquery_lt scalarltsel scalarltjoinsel "mh-"));
DESCR("less than");
-DATA(insert OID = 3675 ( "<=" PGNSP PGUID b f f 3615 3615 16 3678 3679 tsquery_le scalarltsel scalarltjoinsel ));
+DATA(insert OID = 3675 ( "<=" PGNSP PGUID b f f 3615 3615 16 3678 3679 tsquery_le scalarltsel scalarltjoinsel "mh-"));
DESCR("less than or equal");
-DATA(insert OID = 3676 ( "=" PGNSP PGUID b t f 3615 3615 16 3676 3677 tsquery_eq eqsel eqjoinsel ));
+DATA(insert OID = 3676 ( "=" PGNSP PGUID b t f 3615 3615 16 3676 3677 tsquery_eq eqsel eqjoinsel "mhf"));
DESCR("equal");
-DATA(insert OID = 3677 ( "<>" PGNSP PGUID b f f 3615 3615 16 3677 3676 tsquery_ne neqsel neqjoinsel ));
+DATA(insert OID = 3677 ( "<>" PGNSP PGUID b f f 3615 3615 16 3677 3676 tsquery_ne neqsel neqjoinsel "mhf"));
DESCR("not equal");
-DATA(insert OID = 3678 ( ">=" PGNSP PGUID b f f 3615 3615 16 3675 3674 tsquery_ge scalargtsel scalargtjoinsel ));
+DATA(insert OID = 3678 ( ">=" PGNSP PGUID b f f 3615 3615 16 3675 3674 tsquery_ge scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than or equal");
-DATA(insert OID = 3679 ( ">" PGNSP PGUID b f f 3615 3615 16 3674 3675 tsquery_gt scalargtsel scalargtjoinsel ));
+DATA(insert OID = 3679 ( ">" PGNSP PGUID b f f 3615 3615 16 3674 3675 tsquery_gt scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than");
-DATA(insert OID = 3680 ( "&&" PGNSP PGUID b f f 3615 3615 3615 0 0 tsquery_and - - ));
+DATA(insert OID = 3680 ( "&&" PGNSP PGUID b f f 3615 3615 3615 0 0 tsquery_and - - "---"));
DESCR("AND-concatenate");
-DATA(insert OID = 3681 ( "||" PGNSP PGUID b f f 3615 3615 3615 0 0 tsquery_or - - ));
+DATA(insert OID = 3681 ( "||" PGNSP PGUID b f f 3615 3615 3615 0 0 tsquery_or - - "---"));
DESCR("OR-concatenate");
-DATA(insert OID = 3682 ( "!!" PGNSP PGUID l f f 0 3615 3615 0 0 tsquery_not - - ));
+DATA(insert OID = 3682 ( "!!" PGNSP PGUID l f f 0 3615 3615 0 0 tsquery_not - - "---"));
DESCR("NOT tsquery");
-DATA(insert OID = 3693 ( "@>" PGNSP PGUID b f f 3615 3615 16 3694 0 tsq_mcontains contsel contjoinsel ));
+DATA(insert OID = 3693 ( "@>" PGNSP PGUID b f f 3615 3615 16 3694 0 tsq_mcontains contsel contjoinsel "---"));
DESCR("contains");
-DATA(insert OID = 3694 ( "<@" PGNSP PGUID b f f 3615 3615 16 3693 0 tsq_mcontained contsel contjoinsel ));
+DATA(insert OID = 3694 ( "<@" PGNSP PGUID b f f 3615 3615 16 3693 0 tsq_mcontained contsel contjoinsel "---"));
DESCR("is contained by");
-DATA(insert OID = 3762 ( "@@" PGNSP PGUID b f f 25 25 16 0 0 ts_match_tt contsel contjoinsel ));
+DATA(insert OID = 3762 ( "@@" PGNSP PGUID b f f 25 25 16 0 0 ts_match_tt contsel contjoinsel "---"));
DESCR("text search match");
-DATA(insert OID = 3763 ( "@@" PGNSP PGUID b f f 25 3615 16 0 0 ts_match_tq contsel contjoinsel ));
+DATA(insert OID = 3763 ( "@@" PGNSP PGUID b f f 25 3615 16 0 0 ts_match_tq contsel contjoinsel "---"));
DESCR("text search match");
/* generic record comparison operators */
-DATA(insert OID = 2988 ( "=" PGNSP PGUID b t f 2249 2249 16 2988 2989 record_eq eqsel eqjoinsel ));
+DATA(insert OID = 2988 ( "=" PGNSP PGUID b t f 2249 2249 16 2988 2989 record_eq eqsel eqjoinsel "mhf"));
DESCR("equal");
#define RECORD_EQ_OP 2988
-DATA(insert OID = 2989 ( "<>" PGNSP PGUID b f f 2249 2249 16 2989 2988 record_ne neqsel neqjoinsel ));
+DATA(insert OID = 2989 ( "<>" PGNSP PGUID b f f 2249 2249 16 2989 2988 record_ne neqsel neqjoinsel "mhf"));
DESCR("not equal");
-DATA(insert OID = 2990 ( "<" PGNSP PGUID b f f 2249 2249 16 2991 2993 record_lt scalarltsel scalarltjoinsel ));
+DATA(insert OID = 2990 ( "<" PGNSP PGUID b f f 2249 2249 16 2991 2993 record_lt scalarltsel scalarltjoinsel "mh-"));
DESCR("less than");
#define RECORD_LT_OP 2990
-DATA(insert OID = 2991 ( ">" PGNSP PGUID b f f 2249 2249 16 2990 2992 record_gt scalargtsel scalargtjoinsel ));
+DATA(insert OID = 2991 ( ">" PGNSP PGUID b f f 2249 2249 16 2990 2992 record_gt scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than");
#define RECORD_GT_OP 2991
-DATA(insert OID = 2992 ( "<=" PGNSP PGUID b f f 2249 2249 16 2993 2991 record_le scalarltsel scalarltjoinsel ));
+DATA(insert OID = 2992 ( "<=" PGNSP PGUID b f f 2249 2249 16 2993 2991 record_le scalarltsel scalarltjoinsel "mh-"));
DESCR("less than or equal");
-DATA(insert OID = 2993 ( ">=" PGNSP PGUID b f f 2249 2249 16 2992 2990 record_ge scalargtsel scalargtjoinsel ));
+DATA(insert OID = 2993 ( ">=" PGNSP PGUID b f f 2249 2249 16 2992 2990 record_ge scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than or equal");
/* byte-oriented tests for identical rows and fast sorting */
-DATA(insert OID = 3188 ( "*=" PGNSP PGUID b t f 2249 2249 16 3188 3189 record_image_eq eqsel eqjoinsel ));
+DATA(insert OID = 3188 ( "*=" PGNSP PGUID b t f 2249 2249 16 3188 3189 record_image_eq eqsel eqjoinsel "mhf"));
DESCR("identical");
-DATA(insert OID = 3189 ( "*<>" PGNSP PGUID b f f 2249 2249 16 3189 3188 record_image_ne neqsel neqjoinsel ));
+DATA(insert OID = 3189 ( "*<>" PGNSP PGUID b f f 2249 2249 16 3189 3188 record_image_ne neqsel neqjoinsel "mhf"));
DESCR("not identical");
-DATA(insert OID = 3190 ( "*<" PGNSP PGUID b f f 2249 2249 16 3191 3193 record_image_lt scalarltsel scalarltjoinsel ));
+DATA(insert OID = 3190 ( "*<" PGNSP PGUID b f f 2249 2249 16 3191 3193 record_image_lt scalarltsel scalarltjoinsel "mh-"));
DESCR("less than");
-DATA(insert OID = 3191 ( "*>" PGNSP PGUID b f f 2249 2249 16 3190 3192 record_image_gt scalargtsel scalargtjoinsel ));
+DATA(insert OID = 3191 ( "*>" PGNSP PGUID b f f 2249 2249 16 3190 3192 record_image_gt scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than");
-DATA(insert OID = 3192 ( "*<=" PGNSP PGUID b f f 2249 2249 16 3193 3191 record_image_le scalarltsel scalarltjoinsel ));
+DATA(insert OID = 3192 ( "*<=" PGNSP PGUID b f f 2249 2249 16 3193 3191 record_image_le scalarltsel scalarltjoinsel "mh-"));
DESCR("less than or equal");
-DATA(insert OID = 3193 ( "*>=" PGNSP PGUID b f f 2249 2249 16 3192 3190 record_image_ge scalargtsel scalargtjoinsel ));
+DATA(insert OID = 3193 ( "*>=" PGNSP PGUID b f f 2249 2249 16 3192 3190 record_image_ge scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than or equal");
/* generic range type operators */
-DATA(insert OID = 3882 ( "=" PGNSP PGUID b t t 3831 3831 16 3882 3883 range_eq eqsel eqjoinsel ));
+DATA(insert OID = 3882 ( "=" PGNSP PGUID b t t 3831 3831 16 3882 3883 range_eq eqsel eqjoinsel "mhf"));
DESCR("equal");
-DATA(insert OID = 3883 ( "<>" PGNSP PGUID b f f 3831 3831 16 3883 3882 range_ne neqsel neqjoinsel ));
+DATA(insert OID = 3883 ( "<>" PGNSP PGUID b f f 3831 3831 16 3883 3882 range_ne neqsel neqjoinsel "mhf"));
DESCR("not equal");
-DATA(insert OID = 3884 ( "<" PGNSP PGUID b f f 3831 3831 16 3887 3886 range_lt rangesel scalarltjoinsel ));
+DATA(insert OID = 3884 ( "<" PGNSP PGUID b f f 3831 3831 16 3887 3886 range_lt rangesel scalarltjoinsel "---"));
DESCR("less than");
#define OID_RANGE_LESS_OP 3884
-DATA(insert OID = 3885 ( "<=" PGNSP PGUID b f f 3831 3831 16 3886 3887 range_le rangesel scalarltjoinsel ));
+DATA(insert OID = 3885 ( "<=" PGNSP PGUID b f f 3831 3831 16 3886 3887 range_le rangesel scalarltjoinsel "---"));
DESCR("less than or equal");
#define OID_RANGE_LESS_EQUAL_OP 3885
-DATA(insert OID = 3886 ( ">=" PGNSP PGUID b f f 3831 3831 16 3885 3884 range_ge rangesel scalargtjoinsel ));
+DATA(insert OID = 3886 ( ">=" PGNSP PGUID b f f 3831 3831 16 3885 3884 range_ge rangesel scalargtjoinsel "---"));
DESCR("greater than or equal");
#define OID_RANGE_GREATER_EQUAL_OP 3886
-DATA(insert OID = 3887 ( ">" PGNSP PGUID b f f 3831 3831 16 3884 3885 range_gt rangesel scalargtjoinsel ));
+DATA(insert OID = 3887 ( ">" PGNSP PGUID b f f 3831 3831 16 3884 3885 range_gt rangesel scalargtjoinsel "---"));
DESCR("greater than");
#define OID_RANGE_GREATER_OP 3887
-DATA(insert OID = 3888 ( "&&" PGNSP PGUID b f f 3831 3831 16 3888 0 range_overlaps rangesel areajoinsel ));
+DATA(insert OID = 3888 ( "&&" PGNSP PGUID b f f 3831 3831 16 3888 0 range_overlaps rangesel areajoinsel "---"));
DESCR("overlaps");
#define OID_RANGE_OVERLAP_OP 3888
-DATA(insert OID = 3889 ( "@>" PGNSP PGUID b f f 3831 2283 16 3891 0 range_contains_elem rangesel contjoinsel ));
+DATA(insert OID = 3889 ( "@>" PGNSP PGUID b f f 3831 2283 16 3891 0 range_contains_elem rangesel contjoinsel "---"));
DESCR("contains");
#define OID_RANGE_CONTAINS_ELEM_OP 3889
-DATA(insert OID = 3890 ( "@>" PGNSP PGUID b f f 3831 3831 16 3892 0 range_contains rangesel contjoinsel ));
+DATA(insert OID = 3890 ( "@>" PGNSP PGUID b f f 3831 3831 16 3892 0 range_contains rangesel contjoinsel "---"));
DESCR("contains");
#define OID_RANGE_CONTAINS_OP 3890
-DATA(insert OID = 3891 ( "<@" PGNSP PGUID b f f 2283 3831 16 3889 0 elem_contained_by_range rangesel contjoinsel ));
+DATA(insert OID = 3891 ( "<@" PGNSP PGUID b f f 2283 3831 16 3889 0 elem_contained_by_range rangesel contjoinsel "---"));
DESCR("is contained by");
#define OID_RANGE_ELEM_CONTAINED_OP 3891
-DATA(insert OID = 3892 ( "<@" PGNSP PGUID b f f 3831 3831 16 3890 0 range_contained_by rangesel contjoinsel ));
+DATA(insert OID = 3892 ( "<@" PGNSP PGUID b f f 3831 3831 16 3890 0 range_contained_by rangesel contjoinsel "---"));
DESCR("is contained by");
#define OID_RANGE_CONTAINED_OP 3892
-DATA(insert OID = 3893 ( "<<" PGNSP PGUID b f f 3831 3831 16 3894 0 range_before rangesel scalarltjoinsel ));
+DATA(insert OID = 3893 ( "<<" PGNSP PGUID b f f 3831 3831 16 3894 0 range_before rangesel scalarltjoinsel "---"));
DESCR("is left of");
#define OID_RANGE_LEFT_OP 3893
-DATA(insert OID = 3894 ( ">>" PGNSP PGUID b f f 3831 3831 16 3893 0 range_after rangesel scalargtjoinsel ));
+DATA(insert OID = 3894 ( ">>" PGNSP PGUID b f f 3831 3831 16 3893 0 range_after rangesel scalargtjoinsel "---"));
DESCR("is right of");
#define OID_RANGE_RIGHT_OP 3894
-DATA(insert OID = 3895 ( "&<" PGNSP PGUID b f f 3831 3831 16 0 0 range_overleft rangesel scalarltjoinsel ));
+DATA(insert OID = 3895 ( "&<" PGNSP PGUID b f f 3831 3831 16 0 0 range_overleft rangesel scalarltjoinsel "---"));
DESCR("overlaps or is left of");
#define OID_RANGE_OVERLAPS_LEFT_OP 3895
-DATA(insert OID = 3896 ( "&>" PGNSP PGUID b f f 3831 3831 16 0 0 range_overright rangesel scalargtjoinsel ));
+DATA(insert OID = 3896 ( "&>" PGNSP PGUID b f f 3831 3831 16 0 0 range_overright rangesel scalargtjoinsel "---"));
DESCR("overlaps or is right of");
#define OID_RANGE_OVERLAPS_RIGHT_OP 3896
-DATA(insert OID = 3897 ( "-|-" PGNSP PGUID b f f 3831 3831 16 3897 0 range_adjacent contsel contjoinsel ));
+DATA(insert OID = 3897 ( "-|-" PGNSP PGUID b f f 3831 3831 16 3897 0 range_adjacent contsel contjoinsel "---"));
DESCR("is adjacent to");
-DATA(insert OID = 3898 ( "+" PGNSP PGUID b f f 3831 3831 3831 3898 0 range_union - - ));
+DATA(insert OID = 3898 ( "+" PGNSP PGUID b f f 3831 3831 3831 3898 0 range_union - - "---"));
DESCR("range union");
-DATA(insert OID = 3899 ( "-" PGNSP PGUID b f f 3831 3831 3831 0 0 range_minus - - ));
+DATA(insert OID = 3899 ( "-" PGNSP PGUID b f f 3831 3831 3831 0 0 range_minus - - "---"));
DESCR("range difference");
-DATA(insert OID = 3900 ( "*" PGNSP PGUID b f f 3831 3831 3831 3900 0 range_intersect - - ));
+DATA(insert OID = 3900 ( "*" PGNSP PGUID b f f 3831 3831 3831 3900 0 range_intersect - - "---"));
DESCR("range intersection");
-DATA(insert OID = 3962 ( "->" PGNSP PGUID b f f 114 25 114 0 0 json_object_field - - ));
+DATA(insert OID = 3962 ( "->" PGNSP PGUID b f f 114 25 114 0 0 json_object_field - - "---"));
DESCR("get json object field");
-DATA(insert OID = 3963 ( "->>" PGNSP PGUID b f f 114 25 25 0 0 json_object_field_text - - ));
+DATA(insert OID = 3963 ( "->>" PGNSP PGUID b f f 114 25 25 0 0 json_object_field_text - - "---"));
DESCR("get json object field as text");
-DATA(insert OID = 3964 ( "->" PGNSP PGUID b f f 114 23 114 0 0 json_array_element - - ));
+DATA(insert OID = 3964 ( "->" PGNSP PGUID b f f 114 23 114 0 0 json_array_element - - "---"));
DESCR("get json array element");
-DATA(insert OID = 3965 ( "->>" PGNSP PGUID b f f 114 23 25 0 0 json_array_element_text - - ));
+DATA(insert OID = 3965 ( "->>" PGNSP PGUID b f f 114 23 25 0 0 json_array_element_text - - "---"));
DESCR("get json array element as text");
-DATA(insert OID = 3966 ( "#>" PGNSP PGUID b f f 114 1009 114 0 0 json_extract_path - - ));
+DATA(insert OID = 3966 ( "#>" PGNSP PGUID b f f 114 1009 114 0 0 json_extract_path - - "---"));
DESCR("get value from json with path elements");
-DATA(insert OID = 3967 ( "#>>" PGNSP PGUID b f f 114 1009 25 0 0 json_extract_path_text - - ));
+DATA(insert OID = 3967 ( "#>>" PGNSP PGUID b f f 114 1009 25 0 0 json_extract_path_text - - "---"));
DESCR("get value from json as text with path elements");
-DATA(insert OID = 3211 ( "->" PGNSP PGUID b f f 3802 25 3802 0 0 jsonb_object_field - - ));
+DATA(insert OID = 3211 ( "->" PGNSP PGUID b f f 3802 25 3802 0 0 jsonb_object_field - - "---"));
DESCR("get jsonb object field");
-DATA(insert OID = 3477 ( "->>" PGNSP PGUID b f f 3802 25 25 0 0 jsonb_object_field_text - - ));
+DATA(insert OID = 3477 ( "->>" PGNSP PGUID b f f 3802 25 25 0 0 jsonb_object_field_text - - "---"));
DESCR("get jsonb object field as text");
-DATA(insert OID = 3212 ( "->" PGNSP PGUID b f f 3802 23 3802 0 0 jsonb_array_element - - ));
+DATA(insert OID = 3212 ( "->" PGNSP PGUID b f f 3802 23 3802 0 0 jsonb_array_element - - "---"));
DESCR("get jsonb array element");
-DATA(insert OID = 3481 ( "->>" PGNSP PGUID b f f 3802 23 25 0 0 jsonb_array_element_text - - ));
+DATA(insert OID = 3481 ( "->>" PGNSP PGUID b f f 3802 23 25 0 0 jsonb_array_element_text - - "---"));
DESCR("get jsonb array element as text");
-DATA(insert OID = 3213 ( "#>" PGNSP PGUID b f f 3802 1009 3802 0 0 jsonb_extract_path - - ));
+DATA(insert OID = 3213 ( "#>" PGNSP PGUID b f f 3802 1009 3802 0 0 jsonb_extract_path - - "---"));
DESCR("get value from jsonb with path elements");
-DATA(insert OID = 3206 ( "#>>" PGNSP PGUID b f f 3802 1009 25 0 0 jsonb_extract_path_text - - ));
+DATA(insert OID = 3206 ( "#>>" PGNSP PGUID b f f 3802 1009 25 0 0 jsonb_extract_path_text - - "---"));
DESCR("get value from jsonb as text with path elements");
-DATA(insert OID = 3240 ( "=" PGNSP PGUID b t t 3802 3802 16 3240 3241 jsonb_eq eqsel eqjoinsel ));
+DATA(insert OID = 3240 ( "=" PGNSP PGUID b t t 3802 3802 16 3240 3241 jsonb_eq eqsel eqjoinsel "mhf"));
DESCR("equal");
-DATA(insert OID = 3241 ( "<>" PGNSP PGUID b f f 3802 3802 16 3241 3240 jsonb_ne neqsel neqjoinsel ));
+DATA(insert OID = 3241 ( "<>" PGNSP PGUID b f f 3802 3802 16 3241 3240 jsonb_ne neqsel neqjoinsel "mhf"));
DESCR("not equal");
-DATA(insert OID = 3242 ( "<" PGNSP PGUID b f f 3802 3802 16 3243 3245 jsonb_lt scalarltsel scalarltjoinsel ));
+DATA(insert OID = 3242 ( "<" PGNSP PGUID b f f 3802 3802 16 3243 3245 jsonb_lt scalarltsel scalarltjoinsel "mh-"));
DESCR("less than");
-DATA(insert OID = 3243 ( ">" PGNSP PGUID b f f 3802 3802 16 3242 3244 jsonb_gt scalargtsel scalargtjoinsel ));
+DATA(insert OID = 3243 ( ">" PGNSP PGUID b f f 3802 3802 16 3242 3244 jsonb_gt scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than");
-DATA(insert OID = 3244 ( "<=" PGNSP PGUID b f f 3802 3802 16 3245 3243 jsonb_le scalarltsel scalarltjoinsel ));
+DATA(insert OID = 3244 ( "<=" PGNSP PGUID b f f 3802 3802 16 3245 3243 jsonb_le scalarltsel scalarltjoinsel "mh-"));
DESCR("less than or equal");
-DATA(insert OID = 3245 ( ">=" PGNSP PGUID b f f 3802 3802 16 3244 3242 jsonb_ge scalargtsel scalargtjoinsel ));
+DATA(insert OID = 3245 ( ">=" PGNSP PGUID b f f 3802 3802 16 3244 3242 jsonb_ge scalargtsel scalargtjoinsel "mh-"));
DESCR("greater than or equal");
-DATA(insert OID = 3246 ( "@>" PGNSP PGUID b f f 3802 3802 16 3250 0 jsonb_contains contsel contjoinsel ));
+DATA(insert OID = 3246 ( "@>" PGNSP PGUID b f f 3802 3802 16 3250 0 jsonb_contains contsel contjoinsel "---"));
DESCR("contains");
-DATA(insert OID = 3247 ( "?" PGNSP PGUID b f f 3802 25 16 0 0 jsonb_exists contsel contjoinsel ));
+DATA(insert OID = 3247 ( "?" PGNSP PGUID b f f 3802 25 16 0 0 jsonb_exists contsel contjoinsel "---"));
DESCR("exists");
-DATA(insert OID = 3248 ( "?|" PGNSP PGUID b f f 3802 1009 16 0 0 jsonb_exists_any contsel contjoinsel ));
+DATA(insert OID = 3248 ( "?|" PGNSP PGUID b f f 3802 1009 16 0 0 jsonb_exists_any contsel contjoinsel "---"));
DESCR("exists any");
-DATA(insert OID = 3249 ( "?&" PGNSP PGUID b f f 3802 1009 16 0 0 jsonb_exists_all contsel contjoinsel ));
+DATA(insert OID = 3249 ( "?&" PGNSP PGUID b f f 3802 1009 16 0 0 jsonb_exists_all contsel contjoinsel "---"));
DESCR("exists all");
-DATA(insert OID = 3250 ( "<@" PGNSP PGUID b f f 3802 3802 16 3246 0 jsonb_contained contsel contjoinsel ));
+DATA(insert OID = 3250 ( "<@" PGNSP PGUID b f f 3802 3802 16 3246 0 jsonb_contained contsel contjoinsel "---"));
DESCR("is contained by");
-DATA(insert OID = 3284 ( "||" PGNSP PGUID b f f 3802 3802 3802 0 0 jsonb_concat - - ));
+DATA(insert OID = 3284 ( "||" PGNSP PGUID b f f 3802 3802 3802 0 0 jsonb_concat - - "---"));
DESCR("concatenate");
-DATA(insert OID = 3285 ( "-" PGNSP PGUID b f f 3802 25 3802 0 0 3302 - - ));
+DATA(insert OID = 3285 ( "-" PGNSP PGUID b f f 3802 25 3802 0 0 3302 - - "---"));
DESCR("delete object field");
-DATA(insert OID = 3286 ( "-" PGNSP PGUID b f f 3802 23 3802 0 0 3303 - - ));
+DATA(insert OID = 3286 ( "-" PGNSP PGUID b f f 3802 23 3802 0 0 3303 - - "---"));
DESCR("delete array element");
-DATA(insert OID = 3287 ( "#-" PGNSP PGUID b f f 3802 1009 3802 0 0 jsonb_delete_path - - ));
+DATA(insert OID = 3287 ( "#-" PGNSP PGUID b f f 3802 1009 3802 0 0 jsonb_delete_path - - "---"));
DESCR("delete path");
/*
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index 5a8d0ee..03935af 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -250,6 +250,7 @@ typedef enum NodeTag
T_MinMaxAggInfo,
T_PlannerParamItem,
T_MVStatisticInfo,
+ T_RestrictStatData,
/*
* TAGS FOR MEMORY NODES (memnodes.h)
diff --git a/src/include/nodes/relation.h b/src/include/nodes/relation.h
index 1979cdf..b78ee5d 100644
--- a/src/include/nodes/relation.h
+++ b/src/include/nodes/relation.h
@@ -15,12 +15,12 @@
#define RELATION_H
#include "access/sdir.h"
+#include "access/htup.h"
#include "lib/stringinfo.h"
#include "nodes/params.h"
#include "nodes/parsenodes.h"
#include "storage/block.h"
-
/*
* Relids
* Set of relation identifiers (indexes into the rangetable).
@@ -1341,6 +1341,26 @@ typedef struct RestrictInfo
Selectivity right_bucketsize; /* avg bucketsize of right side */
} RestrictInfo;
+typedef struct bm_mvstat
+{
+ Bitmapset *attrs;
+ MVStatisticInfo *stats;
+ int mvkind;
+} bm_mvstat;
+
+typedef struct RestrictStatData
+{
+ NodeTag type;
+ BoolExprType boolop;
+ Node *clause;
+ Node *mvclause;
+ Node *nonmvclause;
+ List *children;
+ List *mvstats;
+ Bitmapset *mvattrs;
+ List *unusedrinfos;
+} RestrictStatData;
+
/*
* Since mergejoinscansel() is a relatively expensive function, and would
* otherwise be invoked many times while planning a large join tree,
diff --git a/src/include/optimizer/cost.h b/src/include/optimizer/cost.h
index 1445f3f..dd43e45 100644
--- a/src/include/optimizer/cost.h
+++ b/src/include/optimizer/cost.h
@@ -184,13 +184,11 @@ extern Selectivity clauselist_selectivity(PlannerInfo *root,
List *clauses,
int varRelid,
JoinType jointype,
- SpecialJoinInfo *sjinfo,
- List *conditions);
+ SpecialJoinInfo *sjinfo);
extern Selectivity clause_selectivity(PlannerInfo *root,
Node *clause,
int varRelid,
JoinType jointype,
- SpecialJoinInfo *sjinfo,
- List *conditions);
+ SpecialJoinInfo *sjinfo);
#endif /* COST_H */
diff --git a/src/include/utils/lsyscache.h b/src/include/utils/lsyscache.h
index 9711538..6a9bec9 100644
--- a/src/include/utils/lsyscache.h
+++ b/src/include/utils/lsyscache.h
@@ -84,6 +84,7 @@ extern Oid get_commutator(Oid opno);
extern Oid get_negator(Oid opno);
extern RegProcedure get_oprrest(Oid opno);
extern RegProcedure get_oprjoin(Oid opno);
+extern int get_oprmvstat(Oid opno);
extern char *get_func_name(Oid funcid);
extern Oid get_func_namespace(Oid funcid);
extern Oid get_func_rettype(Oid funcid);
diff --git a/src/include/utils/mvstats.h b/src/include/utils/mvstats.h
index f2fbc11..a08fd58 100644
--- a/src/include/utils/mvstats.h
+++ b/src/include/utils/mvstats.h
@@ -34,6 +34,9 @@ extern int mvstat_search_type;
#define MVSTATS_MAX_DIMENSIONS 8 /* max number of attributes */
+#define MVSTATISTIC_MCV 1
+#define MVSTATISTIC_HIST 2
+#define MVSTATISTIC_FDEP 4
/*
* Functional dependencies, tracking column-level relationships (values
--
1.8.3.1
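A note on the new last column in the pg_operator DATA lines above: each
operator now carries a three-character string, which (judging from the
new MVSTATISTIC_* flags in mvstats.h and the get_oprmvstat() declaration
in lsyscache.h) presumably encodes the kinds of multivariate statistics
the operator can be estimated with. A hypothetical decoder, written only
to illustrate the assumed meaning - this is not code from the patch:

static int
decode_oprmvstat(const char *str)
{
    /* Assumed meaning of the per-operator string, e.g. "mhf" or "mh-":
     * 'm' = usable with MCV lists, 'h' = usable with histograms,
     * 'f' = usable with functional dependencies, '-' = not usable.
     * Uses the MVSTATISTIC_* flags added to mvstats.h above. */
    int kinds = 0;

    if (str[0] == 'm')
        kinds |= MVSTATISTIC_MCV;
    if (str[1] == 'h')
        kinds |= MVSTATISTIC_HIST;
    if (str[2] == 'f')
        kinds |= MVSTATISTIC_FDEP;
    return kinds;
}

Consistently with that reading, the equality operators above are marked
"mhf", the inequalities "mh-", and operators with no selectivity
functions "---".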
Hi,
On 07/30/2015 10:21 AM, Heikki Linnakangas wrote:
On 05/25/2015 11:43 PM, Tomas Vondra wrote:
There are 6 files attached, but only 0002-0006 are actually part of the
multivariate statistics patch itself.
All of these patches are huge. In order to review this in a reasonable
amount of time, we need to do this in several steps. So let's see what
would be the minimal set of these patches that could be reviewed and
committed, while still being useful.
The main patches are:
1. shared infrastructure and functional dependencies
2. clause reduction using functional dependencies
3. multivariate MCV lists
4. multivariate histograms
5. multi-statistics estimation
Would it make sense to commit only patches 1 and 2 first? Would that be
enough to get a benefit from this?
I agree that the patch can't be reviewed as a single chunk - that was
the idea when I split the original (single chunk) patch into multiple
smaller pieces.
And yes, I believe committing pieces 1&2 might be enough to get
something useful, which can then be improved by adding the "usual" MCV
and histogram stats on top of that.
I have some doubts about the clause reduction and functional
dependencies part of this. It seems to treat functional dependency as
a boolean property, but even with the classic zipcode and city case,
it's not always an all or nothing thing. At least in some countries,
there can be zipcodes that span multiple cities. So zipcode=X does
not completely imply city=Y, although there is a strong correlation
(if that's the right term). How strong does the correlation need to
be for this patch to decide that zipcode implies city? I couldn't
actually see a clear threshold stated anywhere.
So rather than treating functional dependence as a boolean, I think
it would make more sense to put a 0.0-1.0 number to it. That means
that you can't do clause reduction like it's done in this patch,
where you actually remove clauses from the query for cost esimation
purposes. Instead, you need to calculate the selectivity for each
clause independently, but instead of just multiplying the
selectivities together, apply the "dependence factor" to it.
Does that make sense? I haven't really looked at the MCV, histogram
and "multi-statistics estimation" patches yet. Do those patches make
the clause reduction patch obsolete? Should we forget about the
clause reduction and functional dependency patch, and focus on those
later patches instead?
Perhaps. It's true that most real-world data sets are not 100% valid
with respect to functional dependencies - either because of natural
imperfections (multiple cities with the same ZIP code) or just noise in
the data (incorrect entries ...). And it's even mentioned in the code
comments somewhere, I guess.
But there are two main reasons why I chose not to extend the functional
dependencies with the [0.0-1.0] value you propose.
Firstly, functional dependencies were meant to be the simplest possible
implementation, illustrating how the "infrastructure" is supposed to
work (which is the main topic of the first patch).
Secondly, all kinds of statistics are "simplifications" of the actual
data. So I think it's not incorrect to ignore the exceptions up to some
threshold.
I also don't think this will make the estimates globally better. Let's
say you have 1% of rows that contradict the functional dependency - you
may either ignore them and have good estimates for 99% of the values and
incorrect estimates for 1%, or tweak the rule a bit and make the
estimates worse for 99% (and possibly better for 1%).
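To make the proposal concrete, I understand it as something like the
following sketch (entirely hypothetical - the function name and the
exact combination formula are mine, not code from any of the patches):

typedef double Selectivity;     /* stand-in for the planner typedef */

/*
 * Combine the selectivities of clauses on A and B under a "soft"
 * functional dependency (A => B) with a degree in [0.0, 1.0].
 */
static Selectivity
dependency_combine(Selectivity sel_a, Selectivity sel_b, double degree)
{
    return sel_a * (degree + (1.0 - degree) * sel_b);
}

With degree = 1.0 this returns sel_a (the clause on B is fully implied,
which is exactly the clause reduction the patch does), and with
degree = 0.0 it degrades to sel_a * sel_b, i.e. the independence
assumption. So the boolean behavior is just a special case.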
That being said, I'm not against improving the functional dependencies.
I already do have some improvements on my TODO - like for example
dependencies on more columns (not just A=>B but [A,B]=>C and such), but
I think we should not squash this into those two patches.
And yet another point - ISTM these cases might easily be handled better
by the statistics based on ndistinct coefficients, as proposed by
Kyotaro-san some time ago. That is, compute and track
ndistinct(A) * ndistinct(B) / ndistinct(A,B)
for all pairs of columns (or possibly larger groups). That seems to be
similar to the coefficient you propose.
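For illustration, the coefficient for a pair of columns can be computed
with an ad-hoc query like this one (hypothetical, not part of any patch;
assumes a table t with columns a and b):

SELECT (COUNT(DISTINCT a)::numeric * COUNT(DISTINCT b))
       / COUNT(DISTINCT (a, b)) AS ndistinct_coefficient
FROM t;

Values close to 1.0 suggest the columns are more or less independent;
the larger the value, the stronger the relationship between the columns.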
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Hello,
On 07/30/2015 01:26 PM, Kyotaro HORIGUCHI wrote:
Hello, I certainly attached the file this time.
At Mon, 27 Jul 2015 23:54:08 +0200, Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote in <55B6A880.3050801@2ndquadrant.com>
The bottom-up would work too, probably - I mean, we could start from
leaves of the expression tree, and build the largest "subtree"
compatible with multivariate stats and then try to estimate it. I
don't see how we could pass conditions though, which works naturally
in the top-down approach.
By the way, the 'condition' looks to mean what will be received
by the parameter of clause(list)_selectivity with the same
name. But it is always NIL. Looking at the comment for
collect_mv_attnum, it is prepared for 'multitable statistics'. If
so, I think it's better removed from the current patch, because
it is useless now.
I don't think so. Conditions certainly are not meant for multitable
statistics only (I don't see any comment suggesting that at
collect_mv_attnums), but are actually used with the current code.
For example try this:
create table t (a int, b int, c int);
insert into t select i/100, i/100, i/100
from generate_series(1,100000) s(i);
alter table t add statistics (mcv) on (a,b);
analyze t;
select * from t where a<10 and b < 10 and (a < 50 or b < 50 or c < 50);
What will happen when estimating this query is this:
(1) clauselist_selectivity is called, and sees a list of three clauses:
(a<10)
(b<10)
(a<50 OR b<50 OR c<50)
But there's only a single statistics on columns [a,b] so at this
point we can process only the first two clauses. So we'll do that,
computing
P(a<10, b<10)
and we'll pass the OR-clause to the clause_selectivity() call, along
with the two already estimated clauses as conditions.
(2) clause_selectivity will receive (a<50 OR b<50 OR c<50) as a clause
to estimate, and the two clauses as conditions, computing
P(a<50 OR b<50 OR c<50 | a<10, b<10)
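The final estimate is then simply the product of the two, i.e.
P(a<10, b<10) * P(a<50 OR b<50 OR c<50 | a<10, b<10).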
The current estimate for the OR-clause is off, but I believe that's a
bug in the current implementation of clauselist_selectivity_or(), and
we've already discussed that some time ago.
The functional dependency code looks immature in both the
detection phase and application phase in comparison to MCV and
histogram. In addition to that, as the comment in dependencies.c
says, fdep is not as significant (as MCV/HIST) because it is
usually carefully avoided, and should be noticed and considered in
the design of the application or the whole system.
The code is certainly imperfect and needs improvements, no doubt about
that. I have certainly spent much more time on MCV/histograms.
I'm not sure about stating that functional dependencies are less
significant than MCV/HIST (I don't see any such statement in
dependencies.c). I might have thought that initially, when I opted to
implement fdeps as the simplest possible type of statistics, but I think
it's quite practical, actually.
I however disagree about the last point - it's true that in many cases
the databases are carefully normalized, which mostly makes functional
dependencies irrelevant. But this is only true for OLTP systems, while
the primary target of the patch is DSS/DWH systems. And in those
systems denormalization is a very common practice.
So I don't think fdeps are completely irrelevant - it's quite useful in
some scenarios, actually. Similarly to the ndistinct coefficient stats
that you proposed, for example.
Persisting in applying them all at once doesn't seem to be a good
strategy to adopt at this early stage.
Why?
Or perhaps it might be better to register the dependency itself
rather than registering incomplete information (only the set of columns
involved in the relationship) and trying to detect the relationship
from the given values. I suppose those who can register the
columnset know the precise nature of the dependency in advance.
I don't see how that could be done? I mean, you only have the constants
supplied in the query - how could you verify the functional dependency
based on just those values (or even decide the direction)?
What do you mean by "reconstruct the expression tree"? It's true I'm
walking the expression tree top-down, but how is that reconstructing?
For example clauselist_mv_split does. It separates mvclauses from
the original clauselist, applies mv-stats at once, and (perhaps) lets
the rest be processed in the 'normal' route. I called this
'reconstruction', which I tried to do explicitly and separately.
Ah, I see. Thanks for the explanation. I wouldn't call this
"reconstruction" though - I merely need to track which clauses to
estimate using multivariate stats (and which need to be estimated using
the regular stats). That's pretty much what RestrictStatData does, no?
I find your comments very valuable. I may not agree with some of
them, but I certainly appreciate your point of view. So thank you
very much for the time you spent reviewing this patch so far!

Yeah, thank you for your patience and kindness.
Likewise. It's very frustrating trying to understand complex code
written by someone else, and I appreciate your effort.
Regarding the complexity - I am not too worried about spending
more CPU cycles on this, as long as it does not impact the case
where people have no multivariate statistics at all. That's because
I expect people to use this for large DSS/DWH data sets with lots
of dependencies in the (often denormalized) tables and complex
conditions - in those cases the planning difference is negligible,
especially if the improved estimates make the query run in seconds
instead of hours.

I share the vision with you. If that is the case, the mv-stats
route should not intrude on the existing non-mv-stats route. I
feel you have intruded into clauselist_selectivity too much.

If that is the case, my mv-distinct code has a different objective
from yours. It aims to prevent misestimates from multicolumn
correlations, which occur more commonly in OLTP usage.
OK. Let's see if we can make it work for both use cases.
This is why I was so careful to entirely skip the expensive
processing when where were no multivariate stats, and why I don't
like the fact that your approach makes this skip more difficult (or
maybe impossible, I'm not sure).

My code totally skips this if transformRestrictionForEstimate returns
NULL, and runs clauselist_selectivity as usual. I think it behaves
almost the same as yours.
Ah, OK. Perhaps I missed that as I've had trouble applying the patch.
However, I believe we should not only skip the calculation but also
hide the additional code blocks, which currently overwhelm the normal
route. That is one of the major objectives of my approach.
My main concern at this point was planning time, so skipping the
calculation should be enough I believe. Hiding the additional code
blocks is a matter of aesthetics, and we can address that by moving it
to a separate method or such.
But sorry - I found that considering multiple stats at every level
cannot be done without exhaustively searching combinations among the
child clauses, and it needs an additional data structure. It needs
more thought. As mentioned later, top-down might be more suitable for
this optimization.
Do you think a combined approach - first bottom-up preprocessing, then
top-down optimization (using the results of the first phase to speed
things up) - might work?
Understood. As I explained above, I'm not all that concerned about
the performance impact, as long as we make sure it only applies to
people using the multivariate stats.

I also think a combined approach - first a bottom-up step
(identifying the largest compatible subtrees & caching the varnos),
then a top-down step (doing the same optimization as implemented
today) - might minimize the performance impact.

I'm almost reaching the same conclusion.
Ah, so the answer to my last question is "yes". Now we only need to
actually code it ;-)
kind regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 07/30/2015 03:55 PM, Tomas Vondra wrote:
On 07/30/2015 10:21 AM, Heikki Linnakangas wrote:
I have some doubts about the clause reduction and functional
dependencies part of this. It seems to treat functional dependency as
a boolean property, but even with the classic zipcode and city case,
it's not always an all or nothing thing. At least in some countries,
there can be zipcodes that span multiple cities. So zipcode=X does
not completely imply city=Y, although there is a strong correlation
(if that's the right term). How strong does the correlation need to
be for this patch to decide that zipcode implies city? I couldn't
actually see a clear threshold stated anywhere.

So rather than treating functional dependence as a boolean, I think
it would make more sense to put a 0.0-1.0 number to it. That means
that you can't do clause reduction like it's done in this patch,
where you actually remove clauses from the query for cost estimation
purposes. Instead, you need to calculate the selectivity for each
clause independently, but instead of just multiplying the
selectivities together, apply the "dependence factor" to it.

Does that make sense? I haven't really looked at the MCV, histogram
and "multi-statistics estimation" patches yet. Do those patches make
the clause reduction patch obsolete? Should we forget about the
clause reduction and functional dependency patch, and focus on those
later patches instead?

Perhaps. It's true that most real-world data sets are not 100% valid
with respect to functional dependencies - either because of natural
imperfections (multiple cities with the same ZIP code) or just noise in
the data (incorrect entries ...). And it's even mentioned in the code
comments somewhere, I guess.

But there are two main reasons why I chose not to extend the functional
dependencies with the [0.0-1.0] value you propose.

Firstly, functional dependencies were meant to be the simplest possible
implementation, illustrating how the "infrastructure" is supposed to
work (which is the main topic of the first patch).

Secondly, all kinds of statistics are "simplifications" of the actual
data. So I think it's not incorrect to ignore the exceptions up to some
threshold.
The problem with a threshold is that around that threshold, even a small
change in the data set can drastically change the produced estimates.
For example, imagine that we know from the stats that zip code implies
city. But then someone adds a single row to the table with an odd zip
code & city combination, which pushes the estimator over the threshold,
and the columns are no longer considered dependent, and the estimates
are now completely different. We should avoid steep cliffs like that.
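For illustration only (this formula is not from the patch), one smooth
way to apply such a degree f to two per-clause selectivities s1 and s2
would be

    s = s1 * (f + (1 - f) * s2)

which degenerates to s1 for f = 1.0 (perfect dependency) and to the
independence assumption s1 * s2 for f = 0.0, with no cliff in between.
A quick way to play with the numbers:

select s1 * (f + (1 - f) * s2) as combined
  from (values (0.01, 0.01, 0.9)) as v(s1, s2, f);

With f = 0.9 this gives 0.00901, instead of 0.0001 under independence.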
BTW, what is the threshold in the current patch?
- Heikki
Hi,
On 07/30/2015 06:58 PM, Heikki Linnakangas wrote:
The problem with a threshold is that around that threshold, even a
small change in the data set can drastically change the produced
estimates. For example, imagine that we know from the stats that zip
code implies city. But then someone adds a single row to the table
with an odd zip code & city combination, which pushes the estimator
over the threshold, and the columns are no longer considered
dependent, and the estimates are now completely different. We should
avoid steep cliffs like that.

BTW, what is the threshold in the current patch?
There's not a simple threshold - the algorithm mining the functional
dependencies is a bit more complicated. I tried to explain it in the
comment before build_mv_dependencies (in dependencies.c), but let me
briefly summarize it here.
To mine dependency [A => B], build_mv_dependencies does this:
(1) sort the sample by {A,B}
(2) split the sample into groups with the same value of A
(3) for each group, decide if it's consistent with the dependency
(a) if the group is too small (less than 3 rows), ignore it
(b) if the group is consistent, update
    n_supporting
    n_supporting_rows
(c) if the group is inconsistent, update
    n_contradicting
    n_contradicting_rows
(4) decide whether the dependency is "valid" by checking
n_supporting_rows >= n_contradicting_rows * 10
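For illustration, here is a rough SQL analogue of that decision rule,
mining [a => b] on the example table t from earlier in the thread. It
runs on the whole table instead of the ANALYZE sample, and treats a
group as consistent when it contains a single value of b:

select sum(case when ndist = 1 then cnt else 0 end) as n_supporting_rows,
       sum(case when ndist > 1 then cnt else 0 end) as n_contradicting_rows
  from (select a, count(distinct b) as ndist, count(*) as cnt
          from t
         group by a
        having count(*) >= 3) g;

The dependency is then accepted when the first count is at least 10x
the second one.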
The limit is rather arbitrary and yes - I can imagine a more complex
condition (e.g. looking at average number of tuples per group etc.), but
I haven't looked into that - the point was to use something very simple,
only to illustrate the infrastructure.
I think we might come up with some elaborate way of associating "degree"
with the functional dependency, but at that point we really lose the
simplicity, and also make it indistinguishable from the remaining
statistics (because it won't be possible to reduce the clauses like
this, before performing the regular estimation). Which is exactly what
makes the functional dependencies so neat and efficient, so I'm not
overly enthusiastic about doing that.
What seems more interesting is implementing the ndistinct coefficient
instead, as proposed by Kyotaro-san - that seems to have the nice
"smooth" behavior you desire, while keeping the simplicity.
Both statistics types (functional dependencies and ndistinct coeff) have
one weak point, though - they somehow assume the queries use
"compatible" values. For example if you use a query with
WHERE city = 'New York' AND zip = 'zip for Detroit'
they can't detect cases like this, because those statistics types are
oblivious to individual values. I don't see this as a fatal flaw, though
- it's rather a consequence of the nature of the stats. And I tend to
look at the functional dependencies the same way.
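Using the example table t from earlier in the thread (where a = b = c
on every row, so [a => b] is a perfect dependency), the issue is easy
to demonstrate:

select * from t where a = 1 and b = 999;

This returns no rows, but reducing the clauses per the dependency would
drop (b = 999) and estimate roughly P(a = 1), i.e. about 100 rows.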
If you need stats without these "issues" you'll have to use MCV list or
a histogram. Trying to fix the simple statistics types is futile, IMHO.
regards
Tomas
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Fri, Jul 31, 2015 at 6:28 AM, Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:
[series of arguments]
If you need stats without these "issues" you'll have to use MCV list or a
histogram. Trying to fix the simple statistics types is futile, IMHO.
Patch is marked as returned with feedback. There have been in-depth
discussions and reviews as well.
--
Michael
Tomas,
attached is v7 of the multivariate stats patch. The main improvement is
major refactoring of the clausesel.c portion - splitting the awfully
long spaghetti-style functions into smaller pieces, making it much more
understandable etc.
So presumably v7 handles varlena attributes as well, yes? I have a
destruction test case for correlated column stats, so I'd like to test
your patch on it.
--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
Hi,
On 09/24/2015 06:43 PM, Josh Berkus wrote:
Tomas,
attached is v7 of the multivariate stats patch. The main improvement is
major refactoring of the clausesel.c portion - splitting the awfully
long spaghetti-style functions into smaller pieces, making it much more
understandable etc.

So presumably v7 handles varlena attributes as well, yes? I have a
destruction test case for correlated column stats, so I'd like to test
your patch on it.
Yes, it should handle varlena OK. Let me know if you need help with
that, and I'd like to hear feedback - whether it fixed your test case or
not, etc.
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Hi,
attached is v8 of the multivariate statistics patch (or rather a patch
series). The patch currently has 7 parts, but 0001 is just a fix of the
pull_varnos issue (possibly incorrect/temporary), and 0007 is just an
attempt to add the "multicolumn distinctness" (experimental for now).
There are three noteworthy changes:
1) Correct estimation of OR-clauses - this turned out to be a rather
minor change, thanks to simply transforming the OR-clauses to
AND-clauses, see clauselist_selectivity_or() for details.
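(In other words, something along the lines of the usual identity
P(A OR B) = 1 - P(NOT A AND NOT B) - see the function for the exact
handling. Under independence, for example:

select 1 - (1 - s1) * (1 - s2) as p_or
  from (values (0.1, 0.2)) as v(s1, s2);

which gives 0.28 = 0.1 + 0.2 - 0.1 * 0.2.)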
2) Abandoning the ALTER TABLE ... ADD STATISTICS syntax and instead
adding separate commands CREATE STATISTICS / DROP STATISTICS, as
proposed in the "multicolumn distinctness" thread:
/messages/by-id/20150828.173334.114731693.horiguchi.kyotaro@lab.ntt.co.jp
This seems a better approach than the ALTER TABLE one - not only does
it nicely fix the grammar issues, it also naturally extends to
multi-table statistics (even though we don't know exactly how those
should work yet).
The syntax is this:
CREATE STATISTICS name ON table (columns) WITH (options);
DROP STATISTICS name;
and the 'name' is optional (and if absent, should be generated just
like for indexes, but that's not implemented yet).
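For example, with the 'dependencies' type implemented in part 0002 of
the series:

CREATE STATISTICS s1 ON t (a, b) WITH (dependencies);
DROP STATISTICS s1;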
The remaining question is how unique the statistics name should be.
My initial plan was to make it unique within a table, but that of
course does not work well with the DROP STATISTICS (it'd have to
specify the table name also), and it'd also not work with statistics
on multiple tables (which is one of the reasons for abandoning ALTER
TABLE stuff).
So I think it should be unique across tables. Statistics are hardly
a global object, so it should be unique within a schema. I thought
that simply using the schema of the table would work, but that of
course breaks with multiple tables in different schemas. So the only
solution seems to be an explicit schema for statistics.
3) I've also started hacking on adding the "multicolumn distinctness"
proposed by Horiguchi-san, but I haven't really got that working. It
seems to be a bit more complicated than I anticipated because of the
"only equality conditions" restriction. So the 0007 patch only
really adds basic syntax and trivial build.
I do have a bunch of ideas/questions about this statistics type. For
example, should we compute just a single coefficient or the exact
combination of columns specified in CREATE STATISTICS, or perhaps
for some additional subsets? I.e. with
CREATE STATISTICS ON t (a,b,c) WITH (ndistinct);
should we compute just the coefficient for (a,b,c), or maybe also
for (a,b), (b,c) and (a,c)? For N columns there are O(2^N) such
combinations, but perhaps it's acceptable.
Having the coefficient for just the single combination specified in
CREATE STATISTICS makes the estimation difficult when some of the
columns are not specified. For example, with coefficient just for
(a,b,c), what should happen for (WHERE a=1 AND b=2)?
Should we simply ignore the statistics, or apply it anyway and
somehow compensate for the missing columns?
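For the record, one plausible reading of the coefficient - just an
illustration, the exact definition is up to the 0007 part - compares
the product of per-column distinct counts with the combined distinct
count:

select (count(distinct a)::numeric * count(distinct b) * count(distinct c))
         / count(distinct (a, b, c)) as ndistinct_coeff
  from t;

On perfectly correlated data (like the example table t) this is very
large, while for independent columns (with enough rows) it stays close
to 1.0.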
I've also started working on something like a paper, hopefully
explaining the ideas and implementation more clearly and consistently
than possible on a mailing list (thanks to charts, figures and such).
It's available here (both the .tex source and .pdf with the current
version):
https://bitbucket.org/tvondra/mvstats-paper/src
It's not exactly short (~30 pages), and it's certainly incomplete with
plenty of TODO notes, but hopefully it's already useful and not entirely
bogus.
Comments and questions are welcome - both to the patch and paper.
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachments:
0001-teach-pull_-varno-varattno-_walker-about-RestrictInf.patch (text/x-diff)
From 537ef6c3889754aa9566cae21421371c345143d7 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@pgaddict.com>
Date: Tue, 28 Apr 2015 19:56:33 +0200
Subject: [PATCH 1/7] teach pull_(varno|varattno)_walker about RestrictInfo
otherwise pull_varnos fails when processing OR clauses
---
src/backend/optimizer/util/var.c | 16 ++++++++++++++++
1 file changed, 16 insertions(+)
diff --git a/src/backend/optimizer/util/var.c b/src/backend/optimizer/util/var.c
index 32038ce..141a491 100644
--- a/src/backend/optimizer/util/var.c
+++ b/src/backend/optimizer/util/var.c
@@ -197,6 +197,13 @@ pull_varnos_walker(Node *node, pull_varnos_context *context)
context->sublevels_up--;
return result;
}
+ if (IsA(node, RestrictInfo))
+ {
+ RestrictInfo *rinfo = (RestrictInfo*)node;
+ context->varnos = bms_add_members(context->varnos,
+ rinfo->clause_relids);
+ return false;
+ }
return expression_tree_walker(node, pull_varnos_walker,
(void *) context);
}
@@ -245,6 +252,15 @@ pull_varattnos_walker(Node *node, pull_varattnos_context *context)
return false;
}
+ if (IsA(node, RestrictInfo))
+ {
+ RestrictInfo *rinfo = (RestrictInfo *)node;
+
+ return expression_tree_walker((Node*)rinfo->clause,
+ pull_varattnos_walker,
+ (void*) context);
+ }
+
/* Should not find an unplanned subquery */
Assert(!IsA(node, Query));
--
2.1.0
0002-shared-infrastructure-and-functional-dependencies.patch (text/x-diff)
From cdbb6d854fc59b576603c25f4567aab831e3d5b3 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tv@fuzzy.cz>
Date: Sun, 11 Jan 2015 19:51:48 +0100
Subject: [PATCH 2/7] shared infrastructure and functional dependencies
Basic infrastructure shared by all kinds of multivariate
stats, most importantly:
- adds a new system catalog (pg_mv_statistic)
- CREATE STATISTICS name ON table (columns) WITH (options)
- DROP STATISTICS name
- implementation of functional dependencies (the simplest
type of multivariate statistics)
- building functional dependencies in ANALYZE
- updates regression tests (new catalog etc.)
This does not include any changes to the optimizer, i.e.
it does not influence the query planning (subject to
follow-up patches).
The current implementation requires a valid 'ltopr' for
the columns, so that we can sort the sample rows in various
ways, both in this patch and other kinds of statistics.
Maybe this restriction could be relaxed in the future,
requiring just 'eqopr' in case of stats not sorting the
data (e.g. functional dependencies and MCV lists).
Maybe some of the stats (functional dependencies and MCV
list with limited functionality) might be made to work
with hashes of the values, which is sufficient for equality
comparisons. But the queries would require the equality
operator anyway, so it's not really a weaker requirement.
The hashes might reduce space requirements, though.
The algorithm detecting the dependencies is rather simple
and probably needs improvements, so that it detects more
complicated dependencies, and also validation of the math.
The name 'functional dependencies' is more correct (than
'association rules') as it's exactly the name used in
relational theory (esp. Normal Forms) for tracking
column-level dependencies.
The multivariate statistics are automatically removed in
two situations
(a) after a DROP TABLE (obviously)
(b) after ALTER TABLE ... DROP COLUMN, if the statistics
would be defined on less than 2 columns (remaining)
If there are at least 2 columns remaining, we keep
the statistics but perform cleanup on the next ANALYZE.
The dropped columns are removed from stakeys, and the new
statistics is built on the smaller set.
We can't do this at DROP COLUMN, because that'd leave us
with invalid statistics, or we'd have to throw it away
although we can still use it. This lazy approach lets us
use the statistics although some of the columns are dead.
This also adds a simple list of statistics to \d in psql.
---
src/backend/catalog/Makefile | 1 +
src/backend/catalog/dependency.c | 11 +-
src/backend/catalog/heap.c | 102 +++++
src/backend/catalog/namespace.c | 49 +++
src/backend/catalog/objectaddress.c | 22 +
src/backend/catalog/system_views.sql | 11 +
src/backend/commands/Makefile | 6 +-
src/backend/commands/analyze.c | 21 +
src/backend/commands/dropcmds.c | 4 +
src/backend/commands/event_trigger.c | 3 +
src/backend/commands/statscmds.c | 299 ++++++++++++++
src/backend/commands/tablecmds.c | 8 +-
src/backend/nodes/copyfuncs.c | 16 +
src/backend/nodes/outfuncs.c | 18 +
src/backend/optimizer/util/plancat.c | 63 +++
src/backend/parser/gram.y | 71 +++-
src/backend/tcop/utility.c | 11 +
src/backend/utils/Makefile | 2 +-
src/backend/utils/cache/relcache.c | 59 +++
src/backend/utils/cache/syscache.c | 23 ++
src/backend/utils/mvstats/Makefile | 17 +
src/backend/utils/mvstats/common.c | 356 ++++++++++++++++
src/backend/utils/mvstats/common.h | 75 ++++
src/backend/utils/mvstats/dependencies.c | 638 +++++++++++++++++++++++++++++
src/bin/psql/describe.c | 42 ++
src/include/catalog/dependency.h | 5 +-
src/include/catalog/heap.h | 1 +
src/include/catalog/indexing.h | 7 +
src/include/catalog/namespace.h | 2 +
src/include/catalog/pg_mv_statistic.h | 71 ++++
src/include/catalog/pg_proc.h | 5 +
src/include/catalog/toasting.h | 1 +
src/include/commands/defrem.h | 4 +
src/include/nodes/nodes.h | 2 +
src/include/nodes/parsenodes.h | 11 +
src/include/nodes/relation.h | 28 ++
src/include/parser/kwlist.h | 2 +-
src/include/utils/mvstats.h | 69 ++++
src/include/utils/rel.h | 4 +
src/include/utils/relcache.h | 1 +
src/include/utils/syscache.h | 2 +
src/test/regress/expected/rules.out | 8 +
src/test/regress/expected/sanity_check.out | 1 +
43 files changed, 2139 insertions(+), 13 deletions(-)
create mode 100644 src/backend/commands/statscmds.c
create mode 100644 src/backend/utils/mvstats/Makefile
create mode 100644 src/backend/utils/mvstats/common.c
create mode 100644 src/backend/utils/mvstats/common.h
create mode 100644 src/backend/utils/mvstats/dependencies.c
create mode 100644 src/include/catalog/pg_mv_statistic.h
create mode 100644 src/include/utils/mvstats.h
diff --git a/src/backend/catalog/Makefile b/src/backend/catalog/Makefile
index 25130ec..058b8a9 100644
--- a/src/backend/catalog/Makefile
+++ b/src/backend/catalog/Makefile
@@ -32,6 +32,7 @@ POSTGRES_BKI_SRCS = $(addprefix $(top_srcdir)/src/include/catalog/,\
pg_attrdef.h pg_constraint.h pg_inherits.h pg_index.h pg_operator.h \
pg_opfamily.h pg_opclass.h pg_am.h pg_amop.h pg_amproc.h \
pg_language.h pg_largeobject_metadata.h pg_largeobject.h pg_aggregate.h \
+ pg_mv_statistic.h \
pg_statistic.h pg_rewrite.h pg_trigger.h pg_event_trigger.h pg_description.h \
pg_cast.h pg_enum.h pg_namespace.h pg_conversion.h pg_depend.h \
pg_database.h pg_db_role_setting.h pg_tablespace.h pg_pltemplate.h \
diff --git a/src/backend/catalog/dependency.c b/src/backend/catalog/dependency.c
index efca34c..32a9ee3 100644
--- a/src/backend/catalog/dependency.c
+++ b/src/backend/catalog/dependency.c
@@ -39,6 +39,7 @@
#include "catalog/pg_foreign_server.h"
#include "catalog/pg_language.h"
#include "catalog/pg_largeobject.h"
+#include "catalog/pg_mv_statistic.h"
#include "catalog/pg_namespace.h"
#include "catalog/pg_opclass.h"
#include "catalog/pg_operator.h"
@@ -159,7 +160,8 @@ static const Oid object_classes[] = {
ExtensionRelationId, /* OCLASS_EXTENSION */
EventTriggerRelationId, /* OCLASS_EVENT_TRIGGER */
PolicyRelationId, /* OCLASS_POLICY */
- TransformRelationId /* OCLASS_TRANSFORM */
+ TransformRelationId, /* OCLASS_TRANSFORM */
+ MvStatisticRelationId /* OCLASS_STATISTICS */
};
@@ -1271,6 +1273,10 @@ doDeletion(const ObjectAddress *object, int flags)
DropTransformById(object->objectId);
break;
+ case OCLASS_STATISTICS:
+ RemoveStatisticsById(object->objectId);
+ break;
+
default:
elog(ERROR, "unrecognized object class: %u",
object->classId);
@@ -2414,6 +2420,9 @@ getObjectClass(const ObjectAddress *object)
case TransformRelationId:
return OCLASS_TRANSFORM;
+
+ case MvStatisticRelationId:
+ return OCLASS_STATISTICS;
}
/* shouldn't get here */
diff --git a/src/backend/catalog/heap.c b/src/backend/catalog/heap.c
index 04c4f8f..5176f86 100644
--- a/src/backend/catalog/heap.c
+++ b/src/backend/catalog/heap.c
@@ -46,6 +46,7 @@
#include "catalog/pg_constraint.h"
#include "catalog/pg_foreign_table.h"
#include "catalog/pg_inherits.h"
+#include "catalog/pg_mv_statistic.h"
#include "catalog/pg_namespace.h"
#include "catalog/pg_statistic.h"
#include "catalog/pg_tablespace.h"
@@ -1612,7 +1613,10 @@ RemoveAttributeById(Oid relid, AttrNumber attnum)
heap_close(attr_rel, RowExclusiveLock);
if (attnum > 0)
+ {
RemoveStatistics(relid, attnum);
+ RemoveMVStatistics(relid, attnum);
+ }
relation_close(rel, NoLock);
}
@@ -1840,6 +1844,11 @@ heap_drop_with_catalog(Oid relid)
RemoveStatistics(relid, 0);
/*
+ * delete multi-variate statistics
+ */
+ RemoveMVStatistics(relid, 0);
+
+ /*
* delete attribute tuples
*/
DeleteAttributeTuples(relid);
@@ -2695,6 +2704,99 @@ RemoveStatistics(Oid relid, AttrNumber attnum)
/*
+ * RemoveMVStatistics --- remove entries in pg_mv_statistic for a rel
+ *
+ * If attnum is zero, remove all entries for rel; else remove only the one(s)
+ * for that column.
+ */
+void
+RemoveMVStatistics(Oid relid, AttrNumber attnum)
+{
+ Relation pgmvstatistic;
+ TupleDesc tupdesc = NULL;
+ SysScanDesc scan;
+ ScanKeyData key;
+ HeapTuple tuple;
+
+ /*
+ * When dropping a column, we'll drop statistics that would be left
+ * with a single remaining (undropped) column. To do that, we need the tuple
+ * descriptor.
+ *
+ * We already have the relation locked (as we're running ALTER
+ * TABLE ... DROP COLUMN), so we'll just get the descriptor here.
+ */
+ if (attnum != 0)
+ {
+ Relation rel = relation_open(relid, NoLock);
+
+ /* multivariate stats are supported on tables and matviews */
+ if (rel->rd_rel->relkind == RELKIND_RELATION ||
+ rel->rd_rel->relkind == RELKIND_MATVIEW)
+ tupdesc = RelationGetDescr(rel);
+
+ relation_close(rel, NoLock);
+ }
+
+ if (tupdesc == NULL)
+ return;
+
+ pgmvstatistic = heap_open(MvStatisticRelationId, RowExclusiveLock);
+
+ ScanKeyInit(&key,
+ Anum_pg_mv_statistic_starelid,
+ BTEqualStrategyNumber, F_OIDEQ,
+ ObjectIdGetDatum(relid));
+
+ scan = systable_beginscan(pgmvstatistic,
+ MvStatisticRelidIndexId,
+ true, NULL, 1, &key);
+
+ /* we must loop even when attnum != 0, in case of inherited stats */
+ while (HeapTupleIsValid(tuple = systable_getnext(scan)))
+ {
+ bool delete = true;
+
+ if (attnum != 0)
+ {
+ Datum adatum;
+ bool isnull;
+ int i;
+ int ncolumns = 0;
+ ArrayType *arr;
+ int16 *attnums;
+
+ /* get the columns */
+ adatum = SysCacheGetAttr(MVSTATOID, tuple,
+ Anum_pg_mv_statistic_stakeys, &isnull);
+ Assert(!isnull);
+
+ arr = DatumGetArrayTypeP(adatum);
+ attnums = (int16*)ARR_DATA_PTR(arr);
+
+ for (i = 0; i < ARR_DIMS(arr)[0]; i++)
+ {
+ /* count the column unless it has been / is being dropped */
+ if ((! tupdesc->attrs[attnums[i]-1]->attisdropped) &&
+ (attnums[i] != attnum))
+ ncolumns += 1;
+ }
+
+ /* delete if there are less than two attributes */
+ delete = (ncolumns < 2);
+ }
+
+ if (delete)
+ simple_heap_delete(pgmvstatistic, &tuple->t_self);
+ }
+
+ systable_endscan(scan);
+
+ heap_close(pgmvstatistic, RowExclusiveLock);
+}
+
+
+/*
* RelationTruncateIndexes - truncate all indexes associated
* with the heap relation to zero tuples.
*
diff --git a/src/backend/catalog/namespace.c b/src/backend/catalog/namespace.c
index 6644c6f..178f565 100644
--- a/src/backend/catalog/namespace.c
+++ b/src/backend/catalog/namespace.c
@@ -4201,3 +4201,52 @@ pg_is_other_temp_schema(PG_FUNCTION_ARGS)
PG_RETURN_BOOL(isOtherTempNamespace(oid));
}
+
+Oid
+get_statistics_oid(List *names, bool missing_ok)
+{
+ char *schemaname;
+ char *stats_name;
+ Oid namespaceId;
+ Oid stats_oid = InvalidOid;
+ ListCell *l;
+
+ /* deconstruct the name list */
+ DeconstructQualifiedName(names, &schemaname, &stats_name);
+
+ if (schemaname)
+ {
+ /* use exact schema given */
+ namespaceId = LookupExplicitNamespace(schemaname, missing_ok);
+ if (missing_ok && !OidIsValid(namespaceId))
+ stats_oid = InvalidOid;
+ else
+ stats_oid = GetSysCacheOid1(MVSTATNAME,
+ PointerGetDatum(stats_name));
+ }
+ else
+ {
+ /* search for it in search path */
+ recomputeNamespacePath();
+
+ foreach(l, activeSearchPath)
+ {
+ namespaceId = lfirst_oid(l);
+
+ if (namespaceId == myTempNamespace)
+ continue; /* do not look in temp namespace */
+ stats_oid = GetSysCacheOid1(MVSTATNAME,
+ PointerGetDatum(stats_name));
+ if (OidIsValid(stats_oid))
+ break;
+ }
+ }
+
+ if (!OidIsValid(stats_oid) && !missing_ok)
+ ereport(ERROR,
+ (errcode(ERRCODE_UNDEFINED_OBJECT),
+ errmsg("statistics \"%s\" does not exist",
+ NameListToString(names))));
+
+ return stats_oid;
+}
diff --git a/src/backend/catalog/objectaddress.c b/src/backend/catalog/objectaddress.c
index e44d7d0..b2bcf1f 100644
--- a/src/backend/catalog/objectaddress.c
+++ b/src/backend/catalog/objectaddress.c
@@ -37,6 +37,7 @@
#include "catalog/pg_language.h"
#include "catalog/pg_largeobject.h"
#include "catalog/pg_largeobject_metadata.h"
+#include "catalog/pg_mv_statistic.h"
#include "catalog/pg_namespace.h"
#include "catalog/pg_opclass.h"
#include "catalog/pg_opfamily.h"
@@ -436,9 +437,22 @@ static const ObjectPropertyType ObjectProperty[] =
Anum_pg_type_typacl,
ACL_KIND_TYPE,
true
+ },
+ {
+ MvStatisticRelationId,
+ MvStatisticOidIndexId,
+ MVSTATOID,
+ MVSTATNAME,
+ Anum_pg_mv_statistic_staname,
+ InvalidAttrNumber, /* FIXME probably should have namespace */
+ InvalidAttrNumber, /* XXX same owner as relation */
+ InvalidAttrNumber, /* no ACL (same as relation) */
+ -1, /* no ACL */
+ true
}
};
+
/*
* This struct maps the string object types as returned by
* getObjectTypeDescription into ObjType enum values. Note that some enum
@@ -911,6 +925,11 @@ get_object_address(ObjectType objtype, List *objname, List *objargs,
address = get_object_address_defacl(objname, objargs,
missing_ok);
break;
+ case OBJECT_STATISTICS:
+ address.classId = MvStatisticRelationId;
+ address.objectId = get_statistics_oid(objname, missing_ok);
+ address.objectSubId = 0;
+ break;
default:
elog(ERROR, "unrecognized objtype: %d", (int) objtype);
/* placate compiler, in case it thinks elog might return */
@@ -2183,6 +2202,9 @@ check_object_ownership(Oid roleid, ObjectType objtype, ObjectAddress address,
(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
errmsg("must be superuser")));
break;
+ case OBJECT_STATISTICS:
+ /* FIXME do the right owner checks here */
+ break;
default:
elog(ERROR, "unrecognized object type: %d",
(int) objtype);
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 536c805..e3f3387 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -158,6 +158,17 @@ CREATE VIEW pg_indexes AS
LEFT JOIN pg_tablespace T ON (T.oid = I.reltablespace)
WHERE C.relkind IN ('r', 'm') AND I.relkind = 'i';
+CREATE VIEW pg_mv_stats AS
+ SELECT
+ N.nspname AS schemaname,
+ C.relname AS tablename,
+ S.staname AS staname,
+ S.stakeys AS attnums,
+ length(S.stadeps) as depsbytes,
+ pg_mv_stats_dependencies_info(S.stadeps) as depsinfo
+ FROM (pg_mv_statistic S JOIN pg_class C ON (C.oid = S.starelid))
+ LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace);
+
CREATE VIEW pg_stats WITH (security_barrier) AS
SELECT
nspname AS schemaname,
diff --git a/src/backend/commands/Makefile b/src/backend/commands/Makefile
index b1ac704..5151001 100644
--- a/src/backend/commands/Makefile
+++ b/src/backend/commands/Makefile
@@ -18,8 +18,8 @@ OBJS = aggregatecmds.o alter.o analyze.o async.o cluster.o comment.o \
event_trigger.o explain.o extension.o foreigncmds.o functioncmds.o \
indexcmds.o lockcmds.o matview.o operatorcmds.o opclasscmds.o \
policy.o portalcmds.o prepare.o proclang.o \
- schemacmds.o seclabel.o sequence.o tablecmds.o tablespace.o trigger.o \
- tsearchcmds.o typecmds.o user.o vacuum.o vacuumlazy.o \
- variable.o view.o
+ schemacmds.o seclabel.o sequence.o statscmds.o \
+ tablecmds.o tablespace.o trigger.o tsearchcmds.o typecmds.o \
+ user.o vacuum.o vacuumlazy.o variable.o view.o
include $(top_srcdir)/src/backend/common.mk
diff --git a/src/backend/commands/analyze.c b/src/backend/commands/analyze.c
index ddb68ab..fa18903 100644
--- a/src/backend/commands/analyze.c
+++ b/src/backend/commands/analyze.c
@@ -27,6 +27,7 @@
#include "catalog/indexing.h"
#include "catalog/pg_collation.h"
#include "catalog/pg_inherits_fn.h"
+#include "catalog/pg_mv_statistic.h"
#include "catalog/pg_namespace.h"
#include "commands/dbcommands.h"
#include "commands/tablecmds.h"
@@ -55,7 +56,11 @@
#include "utils/syscache.h"
#include "utils/timestamp.h"
#include "utils/tqual.h"
+#include "utils/fmgroids.h"
+#include "utils/builtins.h"
+#include "utils/mvstats.h"
+#include "access/sysattr.h"
/* Per-index data for ANALYZE */
typedef struct AnlIndexData
@@ -460,6 +465,19 @@ do_analyze_rel(Relation onerel, int options, VacuumParams *params,
* all analyzable columns. We use a lower bound of 100 rows to avoid
* possible overflow in Vitter's algorithm. (Note: that will also be the
* target in the corner case where there are no analyzable columns.)
+ *
+ * FIXME This sample sizing is mostly OK when computing stats for
+ * individual columns, but when computing multivariate stats
+ * (histograms, mcv, ...) it's rather
+ * insufficient. For stats on multiple columns / complex stats
+ * we need larger sample sizes, because we need to build more
+ * detailed stats (more MCV items / histogram buckets) to get
+ * good accuracy. Maybe using a sample proportional to the table
+ * (say, 0.5% - 1%) instead of a fixed size would be more
+ * appropriate. Also, this should be
+ * bound to the requested statistics size - e.g. number of MCV
+ * items or histogram buckets should require several sample
+ * rows per item/bucket (so the sample should be k*size).
*/
targrows = 100;
for (i = 0; i < attr_cnt; i++)
@@ -562,6 +580,9 @@ do_analyze_rel(Relation onerel, int options, VacuumParams *params,
update_attstats(RelationGetRelid(Irel[ind]), false,
thisdata->attr_cnt, thisdata->vacattrstats);
}
+
+ /* Build multivariate stats (if there are any). */
+ build_mv_stats(onerel, numrows, rows, attr_cnt, vacattrstats);
}
/*
diff --git a/src/backend/commands/dropcmds.c b/src/backend/commands/dropcmds.c
index f04f4f5..7d6318d 100644
--- a/src/backend/commands/dropcmds.c
+++ b/src/backend/commands/dropcmds.c
@@ -292,6 +292,10 @@ does_not_exist_skipping(ObjectType objtype, List *objname, List *objargs)
msg = gettext_noop("schema \"%s\" does not exist, skipping");
name = NameListToString(objname);
break;
+ case OBJECT_STATISTICS:
+ msg = gettext_noop("statistics \"%s\" does not exist, skipping");
+ name = NameListToString(objname);
+ break;
case OBJECT_TSPARSER:
if (!schema_does_not_exist_skipping(objname, &msg, &name))
{
diff --git a/src/backend/commands/event_trigger.c b/src/backend/commands/event_trigger.c
index 3d1cb0b..baea9dd 100644
--- a/src/backend/commands/event_trigger.c
+++ b/src/backend/commands/event_trigger.c
@@ -110,6 +110,7 @@ static event_trigger_support_data event_trigger_support[] = {
{"SCHEMA", true},
{"SEQUENCE", true},
{"SERVER", true},
+ {"STATISTICS", true},
{"TABLE", true},
{"TABLESPACE", false},
{"TRANSFORM", true},
@@ -1106,6 +1107,7 @@ EventTriggerSupportsObjectType(ObjectType obtype)
case OBJECT_RULE:
case OBJECT_SCHEMA:
case OBJECT_SEQUENCE:
+ case OBJECT_STATISTICS:
case OBJECT_TABCONSTRAINT:
case OBJECT_TABLE:
case OBJECT_TRANSFORM:
@@ -1167,6 +1169,7 @@ EventTriggerSupportsObjectClass(ObjectClass objclass)
case OCLASS_DEFACL:
case OCLASS_EXTENSION:
case OCLASS_POLICY:
+ case OCLASS_STATISTICS:
return true;
}
diff --git a/src/backend/commands/statscmds.c b/src/backend/commands/statscmds.c
new file mode 100644
index 0000000..3790082
--- /dev/null
+++ b/src/backend/commands/statscmds.c
@@ -0,0 +1,299 @@
+/*-------------------------------------------------------------------------
+ *
+ * statscmds.c
+ * Commands for creating and altering multivariate statistics
+ *
+ * Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ * src/backend/commands/statscmds.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/genam.h"
+#include "access/heapam.h"
+#include "access/multixact.h"
+#include "access/reloptions.h"
+#include "access/relscan.h"
+#include "access/sysattr.h"
+#include "access/xact.h"
+#include "access/xlog.h"
+#include "catalog/catalog.h"
+#include "catalog/dependency.h"
+#include "catalog/heap.h"
+#include "catalog/index.h"
+#include "catalog/indexing.h"
+#include "catalog/namespace.h"
+#include "catalog/objectaccess.h"
+#include "catalog/pg_collation.h"
+#include "catalog/pg_constraint.h"
+#include "catalog/pg_depend.h"
+#include "catalog/pg_foreign_table.h"
+#include "catalog/pg_inherits.h"
+#include "catalog/pg_inherits_fn.h"
+#include "catalog/pg_mv_statistic.h"
+#include "catalog/pg_namespace.h"
+#include "catalog/pg_opclass.h"
+#include "catalog/pg_tablespace.h"
+#include "catalog/pg_trigger.h"
+#include "catalog/pg_type.h"
+#include "catalog/pg_type_fn.h"
+#include "catalog/storage.h"
+#include "catalog/toasting.h"
+#include "commands/cluster.h"
+#include "commands/comment.h"
+#include "commands/defrem.h"
+#include "commands/event_trigger.h"
+#include "commands/policy.h"
+#include "commands/sequence.h"
+#include "commands/tablecmds.h"
+#include "commands/tablespace.h"
+#include "commands/trigger.h"
+#include "commands/typecmds.h"
+#include "commands/user.h"
+#include "executor/executor.h"
+#include "foreign/foreign.h"
+#include "miscadmin.h"
+#include "nodes/makefuncs.h"
+#include "nodes/nodeFuncs.h"
+#include "nodes/parsenodes.h"
+#include "optimizer/clauses.h"
+#include "optimizer/planner.h"
+#include "parser/parse_clause.h"
+#include "parser/parse_coerce.h"
+#include "parser/parse_collate.h"
+#include "parser/parse_expr.h"
+#include "parser/parse_oper.h"
+#include "parser/parse_relation.h"
+#include "parser/parse_type.h"
+#include "parser/parse_utilcmd.h"
+#include "parser/parser.h"
+#include "pgstat.h"
+#include "rewrite/rewriteDefine.h"
+#include "rewrite/rewriteHandler.h"
+#include "rewrite/rewriteManip.h"
+#include "storage/bufmgr.h"
+#include "storage/lmgr.h"
+#include "storage/lock.h"
+#include "storage/predicate.h"
+#include "storage/smgr.h"
+#include "utils/acl.h"
+#include "utils/builtins.h"
+#include "utils/fmgroids.h"
+#include "utils/inval.h"
+#include "utils/lsyscache.h"
+#include "utils/memutils.h"
+#include "utils/relcache.h"
+#include "utils/ruleutils.h"
+#include "utils/snapmgr.h"
+#include "utils/syscache.h"
+#include "utils/tqual.h"
+#include "utils/typcache.h"
+#include "utils/mvstats.h"
+
+
+/* used for sorting the attnums in ExecCreateStatistics */
+static int compare_int16(const void *a, const void *b)
+{
+ return memcmp(a, b, sizeof(int16));
+}
+
+/*
+ * Implements the CREATE STATISTICS name ON table (columns) WITH (options)
+ *
+ * TODO Check that the types support sort, although maybe we can live
+ * without it (and only build MCV list / association rules).
+ *
+ * TODO This should probably check for duplicate stats (i.e. same
+ * keys, same options). Although maybe it's useful to have
+ * multiple stats on the same columns with different options
+ * (say, a detailed MCV-only stats for some queries, histogram
+ * for others, etc.)
+ */
+ObjectAddress
+CreateStatistics(CreateStatsStmt *stmt)
+{
+ int i, j;
+ ListCell *l;
+ int16 attnums[INDEX_MAX_KEYS];
+ int numcols = 0;
+ ObjectAddress address = InvalidObjectAddress;
+ NameData staname;
+ Oid statoid;
+
+ HeapTuple htup;
+ Datum values[Natts_pg_mv_statistic];
+ bool nulls[Natts_pg_mv_statistic];
+ int2vector *stakeys;
+ Relation mvstatrel;
+ Relation rel;
+ ObjectAddress parentobject, childobject;
+
+ /* by default build nothing */
+ bool build_dependencies = false;
+
+ Assert(IsA(stmt, CreateStatsStmt));
+
+ rel = heap_openrv(stmt->relation, AccessExclusiveLock);
+
+ /* transform the column names to attnum values */
+
+ foreach(l, stmt->keys)
+ {
+ char *attname = strVal(lfirst(l));
+ HeapTuple atttuple;
+
+ atttuple = SearchSysCacheAttName(RelationGetRelid(rel), attname);
+
+ if (!HeapTupleIsValid(atttuple))
+ ereport(ERROR,
+ (errcode(ERRCODE_UNDEFINED_COLUMN),
+ errmsg("column \"%s\" referenced in statistics does not exist",
+ attname)));
+
+ /* more than MVSTATS_MAX_DIMENSIONS columns not allowed */
+ if (numcols >= MVSTATS_MAX_DIMENSIONS)
+ ereport(ERROR,
+ (errcode(ERRCODE_TOO_MANY_COLUMNS),
+ errmsg("cannot have more than %d keys in a statistics",
+ MVSTATS_MAX_DIMENSIONS)));
+
+ attnums[numcols] = ((Form_pg_attribute) GETSTRUCT(atttuple))->attnum;
+ ReleaseSysCache(atttuple);
+ numcols++;
+ }
+
+ /*
+ * Check the lower bound (at least 2 columns), the upper bound was
+ * already checked in the loop.
+ */
+ if (numcols < 2)
+ ereport(ERROR,
+ (errcode(ERRCODE_TOO_MANY_COLUMNS),
+ errmsg("multivariate stats require 2 or more columns")));
+
+ /* look for duplicities */
+ for (i = 0; i < numcols; i++)
+ for (j = 0; j < numcols; j++)
+ if ((i != j) && (attnums[i] == attnums[j]))
+ ereport(ERROR,
+ (errcode(ERRCODE_UNDEFINED_COLUMN),
+ errmsg("duplicate column name in statistics definition")));
+
+ /* parse the statistics options */
+ foreach (l, stmt->options)
+ {
+ DefElem *opt = (DefElem*)lfirst(l);
+
+ if (strcmp(opt->defname, "dependencies") == 0)
+ build_dependencies = defGetBoolean(opt);
+ else
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("unrecognized STATISTICS option \"%s\"",
+ opt->defname)));
+ }
+
+ /* check that at least some statistics were requested */
+ if (! build_dependencies)
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("no statistics type (dependencies) was requested")));
+
+ /* sort the attnums and build int2vector */
+ qsort(attnums, numcols, sizeof(int16), compare_int16);
+ stakeys = buildint2vector(attnums, numcols);
+
+ namestrcpy(&staname, stmt->statsname);
+
+ /*
+ * Okay, let's create the pg_mv_statistic entry.
+ */
+ memset(values, 0, sizeof(values));
+ memset(nulls, false, sizeof(nulls));
+
+ /* no stats collected yet, so just the keys */
+ values[Anum_pg_mv_statistic_starelid-1] = ObjectIdGetDatum(RelationGetRelid(rel));
+ values[Anum_pg_mv_statistic_staname -1] = NameGetDatum(&staname);
+
+ values[Anum_pg_mv_statistic_stakeys -1] = PointerGetDatum(stakeys);
+
+ values[Anum_pg_mv_statistic_deps_enabled -1] = BoolGetDatum(build_dependencies);
+
+ nulls[Anum_pg_mv_statistic_stadeps -1] = true;
+
+ /* insert the tuple into pg_mv_statistic */
+ mvstatrel = heap_open(MvStatisticRelationId, RowExclusiveLock);
+
+ htup = heap_form_tuple(mvstatrel->rd_att, values, nulls);
+
+ simple_heap_insert(mvstatrel, htup);
+
+ CatalogUpdateIndexes(mvstatrel, htup);
+
+ statoid = HeapTupleGetOid(htup);
+
+ heap_freetuple(htup);
+
+
+ /*
+ * Store a dependency too, so that statistics are dropped on DROP TABLE
+ */
+ parentobject.classId = RelationRelationId;
+ parentobject.objectId = ObjectIdGetDatum(RelationGetRelid(rel));
+ parentobject.objectSubId = 0;
+ childobject.classId = MvStatisticRelationId;
+ childobject.objectId = statoid;
+ childobject.objectSubId = 0;
+
+ recordDependencyOn(&childobject, &parentobject, DEPENDENCY_AUTO);
+
+
+ heap_close(mvstatrel, RowExclusiveLock);
+
+ relation_close(rel, NoLock);
+
+ /*
+ * Invalidate relcache so that others see the new statistics.
+ */
+ CacheInvalidateRelcache(rel);
+
+ ObjectAddressSet(address, MvStatisticRelationId, statoid);
+
+ return address;
+}
+
+
+/*
+ * Implements the DROP STATISTICS
+ *
+ * DROP STATISTICS stats_name ON table_name
+ *
+ * The first one requires an exact match, the second one just drops
+ * all the statistics on a table.
+ */
+void
+RemoveStatisticsById(Oid statsOid)
+{
+ Relation relation;
+ HeapTuple tup;
+
+ /*
+ * Delete the pg_mv_statistic tuple.
+ */
+ relation = heap_open(MvStatisticRelationId, RowExclusiveLock);
+
+ tup = SearchSysCache1(MVSTATOID, ObjectIdGetDatum(statsOid));
+ if (!HeapTupleIsValid(tup)) /* should not happen */
+ elog(ERROR, "cache lookup failed for statistics %u", statsOid);
+
+ simple_heap_delete(relation, &tup->t_self);
+
+ ReleaseSysCache(tup);
+
+ heap_close(relation, RowExclusiveLock);
+}
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 56fed4d..f86d716 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -35,6 +35,7 @@
#include "catalog/pg_foreign_table.h"
#include "catalog/pg_inherits.h"
#include "catalog/pg_inherits_fn.h"
+#include "catalog/pg_mv_statistic.h"
#include "catalog/pg_namespace.h"
#include "catalog/pg_opclass.h"
#include "catalog/pg_tablespace.h"
@@ -93,7 +94,7 @@
#include "utils/syscache.h"
#include "utils/tqual.h"
#include "utils/typcache.h"
-
+#include "utils/mvstats.h"
/*
* ON COMMIT action list
@@ -141,8 +142,9 @@ static List *on_commits = NIL;
#define AT_PASS_ADD_COL 5 /* ADD COLUMN */
#define AT_PASS_ADD_INDEX 6 /* ADD indexes */
#define AT_PASS_ADD_CONSTR 7 /* ADD constraints, defaults */
-#define AT_PASS_MISC 8 /* other stuff */
-#define AT_NUM_PASSES 9
+#define AT_PASS_ADD_STATS 8 /* ADD statistics */
+#define AT_PASS_MISC 9 /* other stuff */
+#define AT_NUM_PASSES 10
typedef struct AlteredTableInfo
{
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index ba04b72..0ca2d35 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -4118,6 +4118,19 @@ _copyAlterPolicyStmt(const AlterPolicyStmt *from)
return newnode;
}
+static CreateStatsStmt *
+_copyCreateStatsStmt(const CreateStatsStmt *from)
+{
+ CreateStatsStmt *newnode = makeNode(CreateStatsStmt);
+
+ COPY_STRING_FIELD(statsname);
+ COPY_NODE_FIELD(relation);
+ COPY_NODE_FIELD(keys);
+ COPY_NODE_FIELD(options);
+
+ return newnode;
+}
+
/* ****************************************************************
* pg_list.h copy functions
* ****************************************************************
@@ -4965,6 +4978,9 @@ copyObject(const void *from)
case T_CommonTableExpr:
retval = _copyCommonTableExpr(from);
break;
+ case T_CreateStatsStmt:
+ retval = _copyCreateStatsStmt(from);
+ break;
case T_FuncWithArgs:
retval = _copyFuncWithArgs(from);
break;
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index 63fae82..cae21d0 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -1939,6 +1939,21 @@ _outIndexOptInfo(StringInfo str, const IndexOptInfo *node)
}
static void
+_outMVStatisticInfo(StringInfo str, const MVStatisticInfo *node)
+{
+ WRITE_NODE_TYPE("MVSTATISTICINFO");
+
+ /* NB: this isn't a complete set of fields */
+ WRITE_OID_FIELD(mvoid);
+
+ /* enabled statistics */
+ WRITE_BOOL_FIELD(deps_enabled);
+
+ /* built/available statistics */
+ WRITE_BOOL_FIELD(deps_built);
+}
+
+static void
_outEquivalenceClass(StringInfo str, const EquivalenceClass *node)
{
/*
@@ -3358,6 +3373,9 @@ _outNode(StringInfo str, const void *obj)
case T_PlannerParamItem:
_outPlannerParamItem(str, obj);
break;
+ case T_MVStatisticInfo:
+ _outMVStatisticInfo(str, obj);
+ break;
case T_CreateStmt:
_outCreateStmt(str, obj);
diff --git a/src/backend/optimizer/util/plancat.c b/src/backend/optimizer/util/plancat.c
index 9442e5f..60fd57f 100644
--- a/src/backend/optimizer/util/plancat.c
+++ b/src/backend/optimizer/util/plancat.c
@@ -27,6 +27,7 @@
#include "catalog/catalog.h"
#include "catalog/dependency.h"
#include "catalog/heap.h"
+#include "catalog/pg_mv_statistic.h"
#include "foreign/fdwapi.h"
#include "miscadmin.h"
#include "nodes/makefuncs.h"
@@ -39,7 +40,9 @@
#include "parser/parsetree.h"
#include "rewrite/rewriteManip.h"
#include "storage/bufmgr.h"
+#include "utils/builtins.h"
#include "utils/lsyscache.h"
+#include "utils/syscache.h"
#include "utils/rel.h"
#include "utils/snapmgr.h"
@@ -93,6 +96,7 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
Relation relation;
bool hasindex;
List *indexinfos = NIL;
+ List *stainfos = NIL;
/*
* We need not lock the relation since it was already locked, either by
@@ -381,6 +385,65 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
rel->indexlist = indexinfos;
+ if (true)
+ {
+ List *mvstatoidlist;
+ ListCell *l;
+
+ mvstatoidlist = RelationGetMVStatList(relation);
+
+ foreach(l, mvstatoidlist)
+ {
+ ArrayType *arr;
+ Datum adatum;
+ bool isnull;
+ Oid mvoid = lfirst_oid(l);
+ Form_pg_mv_statistic mvstat;
+ MVStatisticInfo *info;
+
+ HeapTuple htup = SearchSysCache1(MVSTATOID, ObjectIdGetDatum(mvoid));
+
+ /* XXX syscache contains OIDs of deleted stats (not invalidated) */
+ if (! HeapTupleIsValid(htup))
+ continue;
+
+ mvstat = (Form_pg_mv_statistic) GETSTRUCT(htup);
+
+ /* unavailable stats are not interesting for the planner */
+ if (mvstat->deps_built)
+ {
+ info = makeNode(MVStatisticInfo);
+
+ info->mvoid = mvoid;
+ info->rel = rel;
+
+ /* enabled statistics */
+ info->deps_enabled = mvstat->deps_enabled;
+
+ /* built/available statistics */
+ info->deps_built = mvstat->deps_built;
+
+ /* stakeys */
+ adatum = SysCacheGetAttr(MVSTATOID, htup,
+ Anum_pg_mv_statistic_stakeys, &isnull);
+ Assert(!isnull);
+
+ arr = DatumGetArrayTypeP(adatum);
+
+ info->stakeys = buildint2vector((int16 *) ARR_DATA_PTR(arr),
+ ARR_DIMS(arr)[0]);
+
+ stainfos = lcons(info, stainfos);
+ }
+
+ ReleaseSysCache(htup);
+ }
+
+ list_free(mvstatoidlist);
+ }
+
+ rel->mvstatlist = stainfos;
+
/* Grab foreign-table info using the relcache, while we have it */
if (relation->rd_rel->relkind == RELKIND_FOREIGN_TABLE)
{
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index c4bed8a..5446870 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -241,7 +241,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
ConstraintsSetStmt CopyStmt CreateAsStmt CreateCastStmt
CreateDomainStmt CreateExtensionStmt CreateGroupStmt CreateOpClassStmt
CreateOpFamilyStmt AlterOpFamilyStmt CreatePLangStmt
- CreateSchemaStmt CreateSeqStmt CreateStmt CreateTableSpaceStmt
+ CreateSchemaStmt CreateSeqStmt CreateStmt CreateStatsStmt CreateTableSpaceStmt
CreateFdwStmt CreateForeignServerStmt CreateForeignTableStmt
CreateAssertStmt CreateTransformStmt CreateTrigStmt CreateEventTrigStmt
CreateUserStmt CreateUserMappingStmt CreateRoleStmt CreatePolicyStmt
@@ -375,6 +375,12 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
%type <node> group_by_item empty_grouping_set rollup_clause cube_clause
%type <node> grouping_sets_clause
+%type <list> OptStatsOptions
+%type <str> opt_stats_name stats_name stats_options_name
+%type <node> stats_options_arg
+%type <defelt> stats_options_elem
+%type <list> stats_options_list
+
%type <list> opt_fdw_options fdw_options
%type <defelt> fdw_option
@@ -809,6 +815,7 @@ stmt :
| CreateSchemaStmt
| CreateSeqStmt
| CreateStmt
+ | CreateStatsStmt
| CreateTableSpaceStmt
| CreateTransformStmt
| CreateTrigStmt
@@ -3436,6 +3443,65 @@ OptConsTableSpace: USING INDEX TABLESPACE name { $$ = $4; }
ExistingIndex: USING INDEX index_name { $$ = $3; }
;
+/*****************************************************************************
+ *
+ * QUERY :
+ * CREATE STATISTICS stats_name ON relname (columns) WITH (options)
+ *
+ *****************************************************************************/
+
+
+CreateStatsStmt: CREATE STATISTICS opt_stats_name ON qualified_name '(' columnList ')' OptStatsOptions
+ {
+ CreateStatsStmt *n = makeNode(CreateStatsStmt);
+ n->statsname = $3;
+ n->relation = $5;
+ n->keys = $7;
+ n->options = $9;
+ $$ = (Node *)n;
+ }
+ ;
+
+opt_stats_name:
+ stats_name { $$ = $1; }
+ | /*EMPTY*/ { $$ = NULL; }
+ ;
+
+stats_name: ColId { $$ = $1; };
+
+OptStatsOptions:
+ WITH '(' stats_options_list ')' { $$ = $3; }
+ | /*EMPTY*/ { $$ = NIL; }
+ ;
+
+stats_options_list:
+ stats_options_elem
+ {
+ $$ = list_make1($1);
+ }
+ | stats_options_list ',' stats_options_elem
+ {
+ $$ = lappend($1, $3);
+ }
+ ;
+
+stats_options_elem:
+ stats_options_name stats_options_arg
+ {
+ $$ = makeDefElem($1, $2);
+ }
+ ;
+
+stats_options_name:
+ NonReservedWord { $$ = $1; }
+ ;
+
+stats_options_arg:
+ opt_boolean_or_string { $$ = (Node *) makeString($1); }
+ | NumericOnly { $$ = (Node *) $1; }
+ | /* EMPTY */ { $$ = NULL; }
+ ;
+
/*****************************************************************************
*
@@ -5621,6 +5687,7 @@ drop_type: TABLE { $$ = OBJECT_TABLE; }
| TEXT_P SEARCH DICTIONARY { $$ = OBJECT_TSDICTIONARY; }
| TEXT_P SEARCH TEMPLATE { $$ = OBJECT_TSTEMPLATE; }
| TEXT_P SEARCH CONFIGURATION { $$ = OBJECT_TSCONFIGURATION; }
+ | STATISTICS { $$ = OBJECT_STATISTICS; }
;
any_name_list:
@@ -13860,7 +13927,6 @@ unreserved_keyword:
| STANDALONE_P
| START
| STATEMENT
- | STATISTICS
| STDIN
| STDOUT
| STORAGE
@@ -14077,6 +14143,7 @@ reserved_keyword:
| SELECT
| SESSION_USER
| SOME
+ | STATISTICS
| SYMMETRIC
| TABLE
| THEN
diff --git a/src/backend/tcop/utility.c b/src/backend/tcop/utility.c
index e81bbc6..7029278 100644
--- a/src/backend/tcop/utility.c
+++ b/src/backend/tcop/utility.c
@@ -1520,6 +1520,10 @@ ProcessUtilitySlow(Node *parsetree,
address = ExecSecLabelStmt((SecLabelStmt *) parsetree);
break;
+ case T_CreateStatsStmt: /* CREATE STATISTICS */
+ address = CreateStatistics((CreateStatsStmt *) parsetree);
+ break;
+
default:
elog(ERROR, "unrecognized node type: %d",
(int) nodeTag(parsetree));
@@ -2160,6 +2164,9 @@ CreateCommandTag(Node *parsetree)
case OBJECT_TRANSFORM:
tag = "DROP TRANSFORM";
break;
+ case OBJECT_STATISTICS:
+ tag = "DROP STATISTICS";
+ break;
default:
tag = "???";
}
@@ -2527,6 +2534,10 @@ CreateCommandTag(Node *parsetree)
tag = "EXECUTE";
break;
+ case T_CreateStatsStmt:
+ tag = "CREATE STATISTICS";
+ break;
+
case T_DeallocateStmt:
{
DeallocateStmt *stmt = (DeallocateStmt *) parsetree;
diff --git a/src/backend/utils/Makefile b/src/backend/utils/Makefile
index 8374533..eba0352 100644
--- a/src/backend/utils/Makefile
+++ b/src/backend/utils/Makefile
@@ -9,7 +9,7 @@ top_builddir = ../../..
include $(top_builddir)/src/Makefile.global
OBJS = fmgrtab.o
-SUBDIRS = adt cache error fmgr hash init mb misc mmgr resowner sort time
+SUBDIRS = adt cache error fmgr hash init mb misc mmgr mvstats resowner sort time
# location of Catalog.pm
catalogdir = $(top_srcdir)/src/backend/catalog
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index 6b0c0b7..b6473bb 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -47,6 +47,7 @@
#include "catalog/pg_auth_members.h"
#include "catalog/pg_constraint.h"
#include "catalog/pg_database.h"
+#include "catalog/pg_mv_statistic.h"
#include "catalog/pg_namespace.h"
#include "catalog/pg_opclass.h"
#include "catalog/pg_proc.h"
@@ -3922,6 +3923,62 @@ RelationGetIndexList(Relation relation)
return result;
}
+
+List *
+RelationGetMVStatList(Relation relation)
+{
+ Relation indrel;
+ SysScanDesc indscan;
+ ScanKeyData skey;
+ HeapTuple htup;
+ List *result;
+ List *oldlist;
+ MemoryContext oldcxt;
+
+ /* Quick exit if we already computed the list. */
+ if (relation->rd_mvstatvalid != 0)
+ return list_copy(relation->rd_mvstatlist);
+
+ /*
+ * We build the list we intend to return (in the caller's context) while
+ * doing the scan. After successfully completing the scan, we copy that
+ * list into the relcache entry. This avoids cache-context memory leakage
+ * if we get some sort of error partway through.
+ */
+ result = NIL;
+
+ /* Prepare to scan pg_mv_statistic for entries having starelid = this rel. */
+ ScanKeyInit(&skey,
+ Anum_pg_mv_statistic_starelid,
+ BTEqualStrategyNumber, F_OIDEQ,
+ ObjectIdGetDatum(RelationGetRelid(relation)));
+
+ indrel = heap_open(MvStatisticRelationId, AccessShareLock);
+ indscan = systable_beginscan(indrel, MvStatisticRelidIndexId, true,
+ NULL, 1, &skey);
+
+ while (HeapTupleIsValid(htup = systable_getnext(indscan)))
+ /* TODO maybe include only already built statistics? */
+ result = insert_ordered_oid(result, HeapTupleGetOid(htup));
+
+ systable_endscan(indscan);
+
+ heap_close(indrel, AccessShareLock);
+
+ /* Now save a copy of the completed list in the relcache entry. */
+ oldcxt = MemoryContextSwitchTo(CacheMemoryContext);
+ oldlist = relation->rd_mvstatlist;
+ relation->rd_mvstatlist = list_copy(result);
+
+ relation->rd_mvstatvalid = true;
+ MemoryContextSwitchTo(oldcxt);
+
+ /* Don't leak the old list, if there is one */
+ list_free(oldlist);
+
+ return result;
+}
+
/*
* insert_ordered_oid
* Insert a new Oid into a sorted list of Oids, preserving ordering
@@ -4891,6 +4948,8 @@ load_relcache_init_file(bool shared)
rel->rd_indexattr = NULL;
rel->rd_keyattr = NULL;
rel->rd_idattr = NULL;
+ rel->rd_mvstatvalid = false;
+ rel->rd_mvstatlist = NIL;
rel->rd_createSubid = InvalidSubTransactionId;
rel->rd_newRelfilenodeSubid = InvalidSubTransactionId;
rel->rd_amcache = NULL;
diff --git a/src/backend/utils/cache/syscache.c b/src/backend/utils/cache/syscache.c
index efce7b9..ced92cd 100644
--- a/src/backend/utils/cache/syscache.c
+++ b/src/backend/utils/cache/syscache.c
@@ -43,6 +43,7 @@
#include "catalog/pg_foreign_server.h"
#include "catalog/pg_foreign_table.h"
#include "catalog/pg_language.h"
+#include "catalog/pg_mv_statistic.h"
#include "catalog/pg_namespace.h"
#include "catalog/pg_opclass.h"
#include "catalog/pg_operator.h"
@@ -501,6 +502,28 @@ static const struct cachedesc cacheinfo[] = {
},
4
},
+ {MvStatisticRelationId, /* MVSTATNAME */
+ MvStatisticNameIndexId,
+ 1,
+ {
+ Anum_pg_mv_statistic_staname,
+ 0,
+ 0,
+ 0
+ },
+ 4
+ },
+ {MvStatisticRelationId, /* MVSTATOID */
+ MvStatisticOidIndexId,
+ 1,
+ {
+ ObjectIdAttributeNumber,
+ 0,
+ 0,
+ 0
+ },
+ 4
+ },
{NamespaceRelationId, /* NAMESPACENAME */
NamespaceNameIndexId,
1,
diff --git a/src/backend/utils/mvstats/Makefile b/src/backend/utils/mvstats/Makefile
new file mode 100644
index 0000000..099f1ed
--- /dev/null
+++ b/src/backend/utils/mvstats/Makefile
@@ -0,0 +1,17 @@
+#-------------------------------------------------------------------------
+#
+# Makefile--
+# Makefile for utils/mvstats
+#
+# IDENTIFICATION
+# src/backend/utils/mvstats/Makefile
+#
+#-------------------------------------------------------------------------
+
+subdir = src/backend/utils/mvstats
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+
+OBJS = common.o dependencies.o
+
+include $(top_srcdir)/src/backend/common.mk
diff --git a/src/backend/utils/mvstats/common.c b/src/backend/utils/mvstats/common.c
new file mode 100644
index 0000000..a755c49
--- /dev/null
+++ b/src/backend/utils/mvstats/common.c
@@ -0,0 +1,356 @@
+/*-------------------------------------------------------------------------
+ *
+ * common.c
+ * POSTGRES multivariate statistics
+ *
+ *
+ * Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ * src/backend/utils/mvstats/common.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "common.h"
+
+static VacAttrStats ** lookup_var_attr_stats(int2vector *attrs,
+ int natts, VacAttrStats **vacattrstats);
+
+static List* list_mv_stats(Oid relid);
+
+
+/*
+ * Compute requested multivariate stats, using the rows sampled for the
+ * plain (single-column) stats.
+ *
+ * This fetches a list of stats from pg_mv_statistic, computes the stats
+ * and serializes them back into the catalog (as bytea values).
+ */
+void
+build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
+ int natts, VacAttrStats **vacattrstats)
+{
+ ListCell *lc;
+ List *mvstats;
+
+ TupleDesc tupdesc = RelationGetDescr(onerel);
+
+ /*
+ * Fetch defined MV groups from pg_mv_statistic, and then compute
+ * the MV statistics (functional dependencies for now).
+ */
+ mvstats = list_mv_stats(RelationGetRelid(onerel));
+
+ foreach (lc, mvstats)
+ {
+ int j;
+ MVStatisticInfo *stat = (MVStatisticInfo *)lfirst(lc);
+ MVDependencies deps = NULL;
+
+ VacAttrStats **stats = NULL;
+ int numatts = 0;
+
+ /* int2 vector of attnums the stats should be computed on */
+ int2vector * attrs = stat->stakeys;
+
+ /* see how many of the columns are not dropped */
+ for (j = 0; j < attrs->dim1; j++)
+ if (! tupdesc->attrs[attrs->values[j]-1]->attisdropped)
+ numatts += 1;
+
+ /* if there are dropped attributes, build a filtered int2vector */
+ if (numatts != attrs->dim1)
+ {
+ int16 *tmp = palloc0(numatts * sizeof(int16));
+ int attnum = 0;
+
+ for (j = 0; j < attrs->dim1; j++)
+ if (! tupdesc->attrs[attrs->values[j]-1]->attisdropped)
+ tmp[attnum++] = attrs->values[j];
+
+ pfree(attrs);
+ attrs = buildint2vector(tmp, numatts);
+ }
+
+ /* filter only the interesting vacattrstats records */
+ stats = lookup_var_attr_stats(attrs, natts, vacattrstats);
+
+ /* check allowed number of dimensions */
+ Assert((attrs->dim1 >= 2) && (attrs->dim1 <= MVSTATS_MAX_DIMENSIONS));
+
+ /*
+ * Analyze functional dependencies of columns.
+ */
+ deps = build_mv_dependencies(numrows, rows, attrs, stats);
+
+ /* store the functional dependencies in the catalog */
+ update_mv_stats(stat->mvoid, deps, attrs);
+ }
+}
+
+/*
+ * Lookup the VacAttrStats info for the selected columns, with indexes
+ * matching the attrs vector (to make it easy to work with when
+ * computing multivariate stats).
+ */
+static VacAttrStats **
+lookup_var_attr_stats(int2vector *attrs, int natts, VacAttrStats **vacattrstats)
+{
+ int i, j;
+ int numattrs = attrs->dim1;
+ VacAttrStats **stats = (VacAttrStats**)palloc0(numattrs * sizeof(VacAttrStats*));
+
+ /* lookup VacAttrStats info for the requested columns (same attnum) */
+ for (i = 0; i < numattrs; i++)
+ {
+ stats[i] = NULL;
+ for (j = 0; j < natts; j++)
+ {
+ if (attrs->values[i] == vacattrstats[j]->tupattnum)
+ {
+ stats[i] = vacattrstats[j];
+ break;
+ }
+ }
+
+ /*
+ * Check that we found the info, that the attnum matches, and
+ * that the requested 'lt' operator is available.
+ */
+ Assert(stats[i] != NULL);
+ Assert(stats[i]->tupattnum == attrs->values[i]);
+
+ /* FIXME This is a rather ugly way to check for 'ltopr' (which
+ * is only defined for 'scalar' attributes).
+ */
+ Assert(((StdAnalyzeData *)stats[i]->extra_data)->ltopr != InvalidOid);
+ }
+
+ return stats;
+}
+
+/*
+ * Fetch list of MV stats defined on a table, without the actual data
+ * for histograms, MCV lists etc.
+ */
+static List*
+list_mv_stats(Oid relid)
+{
+ Relation indrel;
+ SysScanDesc indscan;
+ ScanKeyData skey;
+ HeapTuple htup;
+ List *result = NIL;
+
+ /* Prepare to scan pg_mv_statistic for entries having starelid = this rel. */
+ ScanKeyInit(&skey,
+ Anum_pg_mv_statistic_starelid,
+ BTEqualStrategyNumber, F_OIDEQ,
+ ObjectIdGetDatum(relid));
+
+ indrel = heap_open(MvStatisticRelationId, AccessShareLock);
+ indscan = systable_beginscan(indrel, MvStatisticRelidIndexId, true,
+ NULL, 1, &skey);
+
+ while (HeapTupleIsValid(htup = systable_getnext(indscan)))
+ {
+ MVStatisticInfo *info = makeNode(MVStatisticInfo);
+ Form_pg_mv_statistic stats = (Form_pg_mv_statistic) GETSTRUCT(htup);
+
+ info->mvoid = HeapTupleGetOid(htup);
+ info->stakeys = buildint2vector(stats->stakeys.values, stats->stakeys.dim1);
+ info->deps_built = stats->deps_built;
+
+ result = lappend(result, info);
+ }
+
+ systable_endscan(indscan);
+
+ heap_close(indrel, AccessShareLock);
+
+ /* TODO maybe save the list into relcache, as in RelationGetIndexList
+ * (which served as inspiration for this function)? */
+
+ return result;
+}
+
+void
+update_mv_stats(Oid mvoid, MVDependencies dependencies, int2vector *attrs)
+{
+ HeapTuple stup,
+ oldtup;
+ Datum values[Natts_pg_mv_statistic];
+ bool nulls[Natts_pg_mv_statistic];
+ bool replaces[Natts_pg_mv_statistic];
+
+ Relation sd = heap_open(MvStatisticRelationId, RowExclusiveLock);
+
+ memset(nulls, 1, Natts_pg_mv_statistic * sizeof(bool));
+ memset(replaces, 0, Natts_pg_mv_statistic * sizeof(bool));
+ memset(values, 0, Natts_pg_mv_statistic * sizeof(Datum));
+
+ /*
+ * Construct a new pg_mv_statistic tuple - replace only the
+ * dependencies, depending on whether they were actually computed.
+ */
+ if (dependencies != NULL)
+ {
+ nulls[Anum_pg_mv_statistic_stadeps -1] = false;
+ values[Anum_pg_mv_statistic_stadeps - 1]
+ = PointerGetDatum(serialize_mv_dependencies(dependencies));
+ }
+
+ /* always replace the value (either by bytea or NULL) */
+ replaces[Anum_pg_mv_statistic_stadeps -1] = true;
+
+ /* always change the availability flags */
+ nulls[Anum_pg_mv_statistic_deps_built -1] = false;
+ nulls[Anum_pg_mv_statistic_stakeys-1] = false;
+
+ /* use the new attnums, in case we removed some dropped ones */
+ replaces[Anum_pg_mv_statistic_deps_built-1] = true;
+ replaces[Anum_pg_mv_statistic_stakeys -1] = true;
+
+ values[Anum_pg_mv_statistic_deps_built-1] = BoolGetDatum(dependencies != NULL);
+ values[Anum_pg_mv_statistic_stakeys -1] = PointerGetDatum(attrs);
+
+ /* Is there already a pg_mv_statistic tuple for this statistics OID? */
+ oldtup = SearchSysCache1(MVSTATOID,
+ ObjectIdGetDatum(mvoid));
+
+ if (HeapTupleIsValid(oldtup))
+ {
+ /* Yes, replace it */
+ stup = heap_modify_tuple(oldtup,
+ RelationGetDescr(sd),
+ values,
+ nulls,
+ replaces);
+ ReleaseSysCache(oldtup);
+ simple_heap_update(sd, &stup->t_self, stup);
+ }
+ else
+ elog(ERROR, "invalid pg_mv_statistic record (oid=%d)", mvoid);
+
+ /* update indexes too */
+ CatalogUpdateIndexes(sd, stup);
+
+ heap_freetuple(stup);
+
+ heap_close(sd, RowExclusiveLock);
+}
+
+/* multi-variate stats comparator */
+
+/*
+ * qsort_arg comparator for sorting Datums (MV stats)
+ *
+ * This does not maintain the tupnoLink array.
+ */
+int
+compare_scalars_simple(const void *a, const void *b, void *arg)
+{
+ Datum da = *(Datum*)a;
+ Datum db = *(Datum*)b;
+ SortSupport ssup= (SortSupport) arg;
+
+ return ApplySortComparator(da, false, db, false, ssup);
+}
+
+/*
+ * qsort_arg comparator for sorting data when partitioning a MV bucket
+ */
+int
+compare_scalars_partition(const void *a, const void *b, void *arg)
+{
+ Datum da = ((ScalarItem*)a)->value;
+ Datum db = ((ScalarItem*)b)->value;
+ SortSupport ssup= (SortSupport) arg;
+
+ return ApplySortComparator(da, false, db, false, ssup);
+}
+
+/* initialize multi-dimensional sort */
+MultiSortSupport
+multi_sort_init(int ndims)
+{
+ MultiSortSupport mss;
+
+ Assert(ndims >= 2);
+
+ mss = (MultiSortSupport)palloc0(offsetof(MultiSortSupportData, ssup)
+ + sizeof(SortSupportData)*ndims);
+
+ mss->ndims = ndims;
+
+ return mss;
+}
+
+/*
+ * add sort info for dimension 'dim' (index into vacattrstats) to mss,
+ * at position 'sortdim'
+ */
+void
+multi_sort_add_dimension(MultiSortSupport mss, int sortdim,
+ int dim, VacAttrStats **vacattrstats)
+{
+ /* first, lookup StdAnalyzeData for the dimension (attribute) */
+ SortSupportData ssup;
+ StdAnalyzeData *tmp = (StdAnalyzeData *)vacattrstats[dim]->extra_data;
+
+ Assert(mss != NULL);
+ Assert(sortdim < mss->ndims);
+
+ /* initialize sort support, etc. */
+ memset(&ssup, 0, sizeof(ssup));
+ ssup.ssup_cxt = CurrentMemoryContext;
+
+ /* We always use the default collation for statistics */
+ ssup.ssup_collation = DEFAULT_COLLATION_OID;
+ ssup.ssup_nulls_first = false;
+
+ PrepareSortSupportFromOrderingOp(tmp->ltopr, &ssup);
+
+ mss->ssup[sortdim] = ssup;
+}
+
+/* compare all the dimensions in the selected order */
+int
+multi_sort_compare(const void *a, const void *b, void *arg)
+{
+ int i;
+ SortItem *ia = (SortItem*)a;
+ SortItem *ib = (SortItem*)b;
+
+ MultiSortSupport mss = (MultiSortSupport)arg;
+
+ for (i = 0; i < mss->ndims; i++)
+ {
+ int compare;
+
+ compare = ApplySortComparator(ia->values[i], ia->isnull[i],
+ ib->values[i], ib->isnull[i],
+ &mss->ssup[i]);
+
+ if (compare != 0)
+ return compare;
+
+ }
+
+ /* equal by default */
+ return 0;
+}
+
+/* compare selected dimension */
+int
+multi_sort_compare_dim(int dim, const SortItem *a, const SortItem *b,
+ MultiSortSupport mss)
+{
+ return ApplySortComparator(a->values[dim], a->isnull[dim],
+ b->values[dim], b->isnull[dim],
+ &mss->ssup[dim]);
+}
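
To illustrate the multi-sort API defined above, here is a minimal
usage sketch (not part of the patch - it merely mirrors how
build_mv_dependencies drives the API, and assumes 'vacattrstats'
entries with a valid 'ltopr'):

    /* sort first by column A (dimension 0), then column B (dimension 1) */
    MultiSortSupport mss = multi_sort_init(2);

    multi_sort_add_dimension(mss, 0, dima, vacattrstats);
    multi_sort_add_dimension(mss, 1, dimb, vacattrstats);

    /* 'items' is a SortItem array with values[] / isnull[] filled in */
    qsort_arg((void *) items, numrows, sizeof(SortItem),
              multi_sort_compare, mss);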
diff --git a/src/backend/utils/mvstats/common.h b/src/backend/utils/mvstats/common.h
new file mode 100644
index 0000000..6d5465b
--- /dev/null
+++ b/src/backend/utils/mvstats/common.h
@@ -0,0 +1,75 @@
+/*-------------------------------------------------------------------------
+ *
+ * common.h
+ * POSTGRES multivariate statistics
+ *
+ *
+ * Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ * src/backend/utils/mvstats/common.h
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres.h"
+
+#include "access/tuptoaster.h"
+#include "catalog/indexing.h"
+#include "catalog/pg_collation.h"
+#include "catalog/pg_mv_statistic.h"
+#include "foreign/fdwapi.h"
+#include "postmaster/autovacuum.h"
+#include "storage/lmgr.h"
+#include "utils/datum.h"
+#include "utils/sortsupport.h"
+#include "utils/syscache.h"
+#include "utils/fmgroids.h"
+#include "utils/builtins.h"
+#include "access/sysattr.h"
+
+#include "utils/mvstats.h"
+
+/* FIXME private structure copied from analyze.c */
+
+typedef struct
+{
+ Oid eqopr; /* '=' operator for datatype, if any */
+ Oid eqfunc; /* and associated function */
+ Oid ltopr; /* '<' operator for datatype, if any */
+} StdAnalyzeData;
+
+typedef struct
+{
+ Datum value; /* a data value */
+ int tupno; /* position index for tuple it came from */
+} ScalarItem;
+
+/* multi-sort */
+typedef struct MultiSortSupportData {
+ int ndims; /* number of dimensions supported by the sort */
+ SortSupportData ssup[1]; /* sort support data for each dimension */
+} MultiSortSupportData;
+
+typedef MultiSortSupportData* MultiSortSupport;
+
+typedef struct SortItem {
+ Datum *values;
+ bool *isnull;
+} SortItem;
+
+MultiSortSupport multi_sort_init(int ndims);
+
+void multi_sort_add_dimension(MultiSortSupport mss, int sortdim,
+ int dim, VacAttrStats **vacattrstats);
+
+int multi_sort_compare(const void *a, const void *b, void *arg);
+
+int multi_sort_compare_dim(int dim, const SortItem *a,
+ const SortItem *b, MultiSortSupport mss);
+
+/* comparators, used when constructing multivariate stats */
+int compare_scalars_simple(const void *a, const void *b, void *arg);
+int compare_scalars_partition(const void *a, const void *b, void *arg);
diff --git a/src/backend/utils/mvstats/dependencies.c b/src/backend/utils/mvstats/dependencies.c
new file mode 100644
index 0000000..84b6561
--- /dev/null
+++ b/src/backend/utils/mvstats/dependencies.c
@@ -0,0 +1,638 @@
+/*-------------------------------------------------------------------------
+ *
+ * dependencies.c
+ * POSTGRES multivariate functional dependencies
+ *
+ *
+ * Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ * src/backend/utils/mvstats/dependencies.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "common.h"
+#include "utils/lsyscache.h"
+
+/*
+ * Mine functional dependencies between columns, in the form (A => B),
+ * meaning that a value in column 'A' determines value in 'B'. A simple
+ * artificial example may be a table created like this
+ *
+ * CREATE TABLE deptest (a INT, b INT)
+ * AS SELECT i, i/10 FROM generate_series(1,100000) s(i);
+ *
+ * Clearly, once we know the value for 'A' we can easily determine the
+ * value of 'B' by dividing (A/10). A more practical example may be
+ * addresses, where (ZIP code => city name), i.e. once we know the ZIP,
+ * we probably know which city it belongs to. Larger cities usually have
+ * multiple ZIP codes, so the dependency can't be reversed.
+ *
+ * Functional dependencies are a concept well described in relational
+ * theory, especially in definition of normalization and "normal forms".
+ * Wikipedia has a nice definition of a functional dependency [1]:
+ *
+ * In a given table, an attribute Y is said to have a functional
+ * dependency on a set of attributes X (written X -> Y) if and only
+ * if each X value is associated with precisely one Y value. For
+ * example, in an "Employee" table that includes the attributes
+ * "Employee ID" and "Employee Date of Birth", the functional
+ * dependency {Employee ID} -> {Employee Date of Birth} would hold.
+ * It follows from the previous two sentences that each {Employee ID}
+ * is associated with precisely one {Employee Date of Birth}.
+ *
+ * [1] http://en.wikipedia.org/wiki/Database_normalization
+ *
+ * Most datasets could be normalized not to contain any such functional
+ * dependencies, but sometimes that's not practical. In some cases it's
+ * actually a conscious choice to model the dataset in a denormalized
+ * way, either for performance or to make querying easier.
+ *
+ * The current implementation supports only dependencies between two
+ * columns, but that is merely a simplification in this initial patch.
+ * It's certainly useful to mine for dependencies involving multiple
+ * columns on the 'left' side, i.e. in the condition of the dependency.
+ * That is, dependencies [A,B] => C and so on.
+ *
+ * TODO The implementation may/should be smart enough not to mine both
+ * [A => B] and [A,C => B], because the second dependency is a
+ * consequence of the first one (if values of A determine values
+ * of B, adding another column won't change that). The ANALYZE
+ * should first analyze 1:1 dependencies, then 2:1 dependencies
+ * (and skip the already identified ones), etc.
+ *
+ * For example the dependency [city name => zip code] is much weaker
+ * than [city name, state name => zip code], because there may be
+ * multiple cities with the same name in various states. It's not
+ * perfect though - there are probably cities with the same name within
+ * the same state, but hopefully that's a relatively rare occurrence.
+ * More about this in the section about dependency mining.
+ *
+ * Handling multiple columns on the right side is not necessary, as such
+ * dependencies may be decomposed into a set of dependencies with
+ * the same meaning, one for each column on the right side. For example
+ *
+ * A => [B,C]
+ *
+ * is exactly the same as
+ *
+ * (A => B) & (A => C).
+ *
+ * Of course, storing (A => [B, C]) may be more efficient than storing
+ * the two dependencies (A => B) and (A => C) separately.
+ *
+ *
+ * Dependency mining (ANALYZE)
+ * ---------------------------
+ *
+ * The current build algorithm is rather simple - for each pair [A,B] of
+ * columns, the data are sorted lexicographically (first by A, then B),
+ * and then a number of metrics is computed by walking the sorted data.
+ *
+ * In general the algorithm counts distinct values of A (forming groups
+ * thanks to the sorting), supporting or contradicting the hypothesis
+ * that A => B (i.e. that values of B are predetermined by A). If there
+ * are multiple values of B for a single value of A, it's counted as
+ * contradicting.
+ *
+ * A group may be neither supporting nor contradicting. To be counted as
+ * supporting, the group has to have at least min_group_size(=3) rows.
+ * Smaller 'supporting' groups are counted as neutral.
+ *
+ * Finally, the number of rows in supporting and contradicting groups is
+ * compared, and if there is at least 10x more supporting rows, the
+ * dependency is considered valid.
+ *
+ *
+ * Real-world datasets are imperfect - there may be errors (e.g. due to
+ * data-entry mistakes), or factually correct records, yet contradicting
+ * the dependency (e.g. when a city splits into two, but both keep the
+ * same ZIP code). A strict ANALYZE implementation (where the functional
+ * dependencies are identified) would ignore dependencies on such noisy
+ * data, making the approach unusable in practice.
+ *
+ * The proposed implementation attempts to handle such noisy cases
+ * gracefully, by tolerating a small number of contradicting cases.
+ *
+ * In the future this might also perform some sort of test and decide
+ * whether it's worth building any other kind of multivariate stats,
+ * or whether the dependencies sufficiently describe the data. Or at
+ * least not build the MCV list / histogram on the implied columns.
+ * Such reduction would however make the 'verification' (see the next
+ * section) impossible.
+ *
+ *
+ * Clause reduction (planner/optimizer)
+ * ------------------------------------
+ *
+ * Applying the dependencies is quite simple - given a list of clauses,
+ * try to apply all the dependencies. For example given clause list
+ *
+ * (a = 1) AND (b = 1) AND (c = 1) AND (d < 100)
+ *
+ * and dependencies [a=>b] and [a=>d], this may be reduced to
+ *
+ * (a = 1) AND (c = 1) AND (d < 100)
+ *
+ * The (d<100) can't be reduced as it's not an equality clause, so the
+ * dependency [a=>d] can't be applied.
+ *
+ * See clauselist_apply_dependencies() for more details.
+ *
+ * The problem with the reduction is that the query may use conditions
+ * that are not redundant, but in fact contradictory - e.g. the user
+ * may search for a ZIP code and a city name not matching the ZIP code.
+ *
+ * In such cases the condition on the city name is not redundant but
+ * contradictory (making the result empty), and removing it while
+ * estimating the cardinality will make the estimate worse.
+ *
+ * The current estimation assuming independence (and multiplying the
+ * selectivities) works better in this case, but only by utter luck.
+ *
+ * In some cases this might be verified using the other multivariate
+ * statistics - MCV lists and histograms. For MCV lists the verification
+ * might be very simple - peek into the list to see whether there are
+ * any items matching the clause on the 'A' column (e.g. ZIP code), and
+ * if such an item is found, check whether the 'B' column matches the
+ * other clause. If it does not, the clauses are contradictory. We can't
+ * really conclude anything if no such item is found, except maybe
+ * restricting the selectivity using the MCV data (e.g. using min/max
+ * selectivity, or something like that).
+ *
+ * With histograms, it might work similarly - we can't check the values
+ * directly (because histograms use buckets, unlike MCV lists, which
+ * store the actual values). So we can only observe the buckets matching the
+ * clauses - if those buckets have very low frequency, it probably means
+ * the two clauses are incompatible.
+ *
+ * It's unclear what 'low frequency' is, but if one of the clauses is
+ * implied (automatically true because of the other clause), then
+ *
+ * selectivity[clause(A)] = selectivity[clause(A) & clause(B)]
+ *
+ * So we might compute selectivity of the first clause (on the column
+ * A in dependency [A=>B]) - for example using regular statistics.
+ * And then check if the selectivity computed from the histogram is
+ * about the same (or significantly lower).
+ *
+ * The problem is that histograms work well only when the data ordering
+ * matches the natural meaning. For values that serve as labels - like
+ * city names or ZIP codes, or even generated IDs, histograms really
+ * don't work all that well. For example sorting cities by name won't
+ * match the sorting of ZIP codes, rendering the histogram unusable.
+ *
+ * MCV lists are probably going to work much better, because they don't
+ * really assume any sort of ordering, which also makes them more
+ * appropriate for label-like data.
+ *
+ * TODO Support dependencies with multiple columns on left/right.
+ *
+ * TODO Investigate using histogram and MCV list to confirm the
+ * functional dependencies.
+ *
+ * TODO Investigate statistical testing of the distribution (to decide
+ * whether it makes sense to build the histogram/MCV list).
+ *
+ * TODO Using a min/max of selectivities would probably make more sense
+ * for the associated columns.
+ *
+ * TODO Consider eliminating the implied columns from the histogram and
+ * MCV lists (but maybe that's not a good idea, because that'd make
+ * it impossible to use these stats for non-equality clauses and
+ * also it wouldn't be possible to use the stats for verification
+ * of the dependencies as proposed in another TODO).
+ *
+ * TODO This builds a complete set of dependencies, i.e. including
+ * transitive dependencies - if we identify [A => B] and [B => C],
+ * we're likely to identify [A => C] too. It might be better to
+ * keep only the minimal set of dependencies, i.e. prune all the
+ * dependencies that we can recreate by transitivity.
+ *
+ * There are two conceptual ways to do that:
+ *
+ * (a) generate all the rules, and then prune the rules that may
+ * be recreated by combining other dependencies, or
+ *
+ * (b) performing the 'is combination of other dependencies' check
+ * before actually doing the work
+ *
+ * The second option has the advantage that we don't really need
+ * to perform the sort/count. It's not sufficient alone, though,
+ * because we may discover the dependencies in the wrong order.
+ * For example [A => B], [A => C] and then [B => C]. None of those
+ * dependencies is a combination of the already known ones, yet
+ * [A => C] is a combination of [A => B] and [B => C].
+ *
+ * FIXME Not sure the current NULL handling makes much sense. We assume
+ * that NULL is 0, so it's handled like a regular value
+ * (NULL == NULL), so all NULLs in a single column form a single
+ * group. Maybe that's not the right thing to do, especially with
+ * equality conditions - in that case NULLs are irrelevant. So
+ * maybe the right solution would be to just ignore NULL values?
+ *
+ * However simply "ignoring" the NULL values does not seem like
+ * a good idea - imagine columns A and B, where for each value of
+ * A, values in B are constant (same for the whole group) or NULL.
+ * Let's say only 10% of B values in each group are not NULL. Then
+ * ignoring the NULL values will result in 10x misestimate (and
+ * it's trivial to construct arbitrary errors). So maybe handling
+ * NULL values just like a regular value is the right thing here.
+ *
+ * Or maybe NULL values should be treated differently on each side
+ * of the dependency? E.g. as ignored on the left (condition) and
+ * as regular values on the right - this seems consistent with how
+ * equality clauses work, as equality clause means 'NOT NULL'.
+ * So if we say [A => B] then it may also imply "NOT NULL" on the
+ * right side.
+ */
+MVDependencies
+build_mv_dependencies(int numrows, HeapTuple *rows, int2vector *attrs,
+ VacAttrStats **stats)
+{
+ int i;
+ int numattrs = attrs->dim1;
+
+ /* result */
+ int ndeps = 0;
+ MVDependencies dependencies = NULL;
+ MultiSortSupport mss = multi_sort_init(2); /* 2 dimensions for now */
+
+ /* TODO Maybe this should be somehow related to the number of
+ * distinct values in the two columns we're currently analyzing.
+ * Assuming the distribution is uniform, we can estimate the
+ * average group size and use it as a threshold. Or something
+ * like that. Seems better than a static approach.
+ */
+ int min_group_size = 3;
+
+ /* dimension indexes we'll check for associations [a => b] */
+ int dima, dimb;
+
+ /*
+ * We'll reuse the same array for all the 2-column combinations.
+ *
+ * It's possible to sort the sample rows directly, but this seemed
+ * somewhat simpler / less error-prone. Another option would be to
+ * allocate the arrays for each SortItem separately, but that'd be
+ * significant overhead (not just CPU, but especially memory bloat).
+ */
+ SortItem * items = (SortItem*)palloc0(numrows * sizeof(SortItem));
+
+ Datum *values = (Datum*)palloc0(sizeof(Datum) * numrows * 2);
+ bool *isnull = (bool*)palloc0(sizeof(bool) * numrows * 2);
+
+ for (i = 0; i < numrows; i++)
+ {
+ items[i].values = &values[i * 2];
+ items[i].isnull = &isnull[i * 2];
+ }
+
+ Assert(numattrs >= 2);
+
+ /*
+ * Evaluate all possible combinations of [A => B], using a simple algorithm:
+ *
+ * (a) sort the data by [A,B]
+ * (b) split the data into groups by A (new group whenever a value changes)
+ * (c) count different values in the B column (again, value changes)
+ *
+ * TODO It should be rather simple to merge [A => B] and [A => C] into
+ * [A => B,C]. Just keep A constant, collect all the "implied" columns
+ * and you're done.
+ */
+ for (dima = 0; dima < numattrs; dima++)
+ {
+ /* prepare the sort function for the first dimension */
+ multi_sort_add_dimension(mss, 0, dima, stats);
+
+ for (dimb = 0; dimb < numattrs; dimb++)
+ {
+ SortItem current;
+
+ /* number of groups supporting / contradicting the dependency */
+ int n_supporting = 0;
+ int n_contradicting = 0;
+
+ /* counters valid within a group */
+ int group_size = 0;
+ int n_violations = 0;
+
+ int n_supporting_rows = 0;
+ int n_contradicting_rows = 0;
+
+ /* make sure the columns are different (skip A => A) */
+ if (dima == dimb)
+ continue;
+
+ /* prepare the sort function for the second dimension */
+ multi_sort_add_dimension(mss, 1, dimb, stats);
+
+ /* reset the values and isnull flags */
+ memset(values, 0, sizeof(Datum) * numrows * 2);
+ memset(isnull, 0, sizeof(bool) * numrows * 2);
+
+ /* accumulate all the data for both columns into an array and sort it */
+ for (i = 0; i < numrows; i++)
+ {
+ items[i].values[0]
+ = heap_getattr(rows[i], attrs->values[dima],
+ stats[dima]->tupDesc, &items[i].isnull[0]);
+
+ items[i].values[1]
+ = heap_getattr(rows[i], attrs->values[dimb],
+ stats[dimb]->tupDesc, &items[i].isnull[1]);
+ }
+
+ qsort_arg((void *) items, numrows, sizeof(SortItem),
+ multi_sort_compare, mss);
+
+ /*
+ * Walk through the array, split it into groups according to
+ * the A value, and count distinct values in the other one.
+ * If there's a single B value for the whole group, we count
+ * it as supporting the association, otherwise we count it
+ * as contradicting.
+ *
+ * Furthermore we require a group to have at least a certain
+ * number of rows to be considered useful for supporting the
+ * dependency. A contradicting group, however, counts regardless of size.
+ */
+
+ /* start with values from the first row */
+ current = items[0];
+ group_size = 1;
+
+ for (i = 1; i < numrows; i++)
+ {
+ /* end of the group */
+ if (multi_sort_compare_dim(0, &items[i], &current, mss) != 0)
+ {
+ /*
+ * If there are no contradicting rows, count it as
+ * supporting (otherwise contradicting), but only if
+ * the group is large enough.
+ *
+ * The requirement of a minimum group size makes it
+ * impossible to identify [unique,unique] cases, but
+ * that's probably a different case. This is more
+ * about [zip => city] associations etc.
+ *
+ * If there are violations, count the group/rows as
+ * a violation.
+ *
+ * It may be neither, if the group is too small (does
+ * not contain at least min_group_size rows).
+ */
+ if ((n_violations == 0) && (group_size >= min_group_size))
+ {
+ n_supporting += 1;
+ n_supporting_rows += group_size;
+ }
+ else if (n_violations > 0)
+ {
+ n_contradicting += 1;
+ n_contradicting_rows += group_size;
+ }
+
+ /* current values start a new group */
+ n_violations = 0;
+ group_size = 0;
+ }
+ /* mismatch of a B value is contradicting */
+ else if (multi_sort_compare_dim(1, &items[i], &current, mss) != 0)
+ {
+ n_violations += 1;
+ }
+
+ current = items[i];
+ group_size += 1;
+ }
+
+ /* handle the last group (just like above) */
+ if ((n_violations == 0) && (group_size >= min_group_size))
+ {
+ n_supporting += 1;
+ n_supporting_rows += group_size;
+ }
+ else if (n_violations)
+ {
+ n_contradicting += 1;
+ n_contradicting_rows += group_size;
+ }
+
+ /*
+ * See if the number of rows supporting the association is at least
+ * 10x the number of rows violating the hypothetical dependency.
+ *
+ * TODO This is rather arbitrary limit - I guess it's possible to do
+ * some math to come up with a better rule (e.g. testing a hypothesis
+ * 'this is due to randomness'). We can create a contingency table
+ * from the values and use it for testing. Possibly only when
+ * there are no contradicting rows?
+ *
+ * TODO Also, if (a => b) and (b => a) at the same time, it pretty much
+ * means there's a 1:1 relation (or one is a 'label'), making the
+ * conditions rather redundant. Although it's possible that the
+ * query uses incompatible combination of values.
+ */
+ if (n_supporting_rows > (n_contradicting_rows * 10))
+ {
+ if (dependencies == NULL)
+ {
+ dependencies = (MVDependencies)palloc0(sizeof(MVDependenciesData));
+ dependencies->magic = MVSTAT_DEPS_MAGIC;
+ }
+ else
+ dependencies = repalloc(dependencies, offsetof(MVDependenciesData, deps)
+ + sizeof(MVDependency) * (dependencies->ndeps + 1));
+
+ /* append the new dependency to the list */
+ dependencies->deps[ndeps] = (MVDependency)palloc0(sizeof(MVDependencyData));
+ dependencies->deps[ndeps]->a = attrs->values[dima];
+ dependencies->deps[ndeps]->b = attrs->values[dimb];
+
+ dependencies->ndeps = (++ndeps);
+ }
+ }
+ }
+
+ pfree(items);
+ pfree(values);
+ pfree(isnull);
+ pfree(stats);
+ pfree(mss);
+
+ return dependencies;
+}
+
+/*
+ * Store the dependencies into a bytea, so that it can be stored in the
+ * pg_mv_statistic catalog.
+ *
+ * Currently this only supports simple two-column rules, and stores them
+ * as a sequence of attnum pairs. In the future, this needs to be made
+ * more complex to support multiple columns on both sides of the
+ * implication (using AND on left, OR on right).
+ */
+bytea *
+serialize_mv_dependencies(MVDependencies dependencies)
+{
+ int i;
+
+ /* we need to store the struct header, plus 2 * int16 per dependency */
+ Size len = VARHDRSZ + offsetof(MVDependenciesData, deps)
+ + dependencies->ndeps * (sizeof(int16) * 2);
+
+ bytea * output = (bytea*)palloc0(len);
+
+ char * tmp = VARDATA(output);
+
+ SET_VARSIZE(output, len);
+
+ /* first, store the number of dimensions / items */
+ memcpy(tmp, dependencies, offsetof(MVDependenciesData, deps));
+ tmp += offsetof(MVDependenciesData, deps);
+
+ /* walk through the dependencies and copy both columns into the bytea */
+ for (i = 0; i < dependencies->ndeps; i++)
+ {
+ memcpy(tmp, &(dependencies->deps[i]->a), sizeof(int16));
+ tmp += sizeof(int16);
+
+ memcpy(tmp, &(dependencies->deps[i]->b), sizeof(int16));
+ tmp += sizeof(int16);
+ }
+
+ return output;
+}
+
+/*
+ * Reads serialized dependencies into MVDependencies structure.
+ */
+MVDependencies
+deserialize_mv_dependencies(bytea * data)
+{
+ int i;
+ Size expected_size;
+ MVDependencies dependencies;
+ char *tmp;
+
+ if (data == NULL)
+ return NULL;
+
+ if (VARSIZE_ANY_EXHDR(data) < offsetof(MVDependenciesData,deps))
+ elog(ERROR, "invalid MVDependencies size %ld (expected at least %ld)",
+ VARSIZE_ANY_EXHDR(data), offsetof(MVDependenciesData,deps));
+
+ /* read the MVDependencies header */
+ dependencies = (MVDependencies)palloc0(sizeof(MVDependenciesData));
+
+ /* initialize pointer to the data part (skip the varlena header) */
+ tmp = VARDATA(data);
+
+ /* get the header and perform basic sanity checks */
+ memcpy(dependencies, tmp, offsetof(MVDependenciesData, deps));
+ tmp += offsetof(MVDependenciesData, deps);
+
+ if (dependencies->magic != MVSTAT_DEPS_MAGIC)
+ {
+ pfree(dependencies);
+ elog(WARNING, "not a MV Dependencies (magic number mismatch)");
+ return NULL;
+ }
+
+ Assert(dependencies->ndeps > 0);
+
+ /* what bytea size do we expect for those parameters */
+ expected_size = offsetof(MVDependenciesData,deps) +
+ dependencies->ndeps * sizeof(int16) * 2;
+
+ if (VARSIZE_ANY_EXHDR(data) != expected_size)
+ elog(ERROR, "invalid dependencies size %ld (expected %ld)",
+ VARSIZE_ANY_EXHDR(data), expected_size);
+
+ /* allocate space for the dependency pointers */
+ dependencies = repalloc(dependencies, offsetof(MVDependenciesData,deps)
+ + (dependencies->ndeps * sizeof(MVDependency)));
+
+ for (i = 0; i < dependencies->ndeps; i++)
+ {
+ dependencies->deps[i] = (MVDependency)palloc0(sizeof(MVDependencyData));
+
+ memcpy(&(dependencies->deps[i]->a), tmp, sizeof(int16));
+ tmp += sizeof(int16);
+
+ memcpy(&(dependencies->deps[i]->b), tmp, sizeof(int16));
+ tmp += sizeof(int16);
+ }
+
+ return dependencies;
+}
+
+/* print some basic info about dependencies (number of dependencies) */
+Datum
+pg_mv_stats_dependencies_info(PG_FUNCTION_ARGS)
+{
+ bytea *data = PG_GETARG_BYTEA_P(0);
+ char *result;
+
+ MVDependencies dependencies = deserialize_mv_dependencies(data);
+
+ if (dependencies == NULL)
+ PG_RETURN_NULL();
+
+ result = palloc0(128);
+ snprintf(result, 128, "dependencies=%d", dependencies->ndeps);
+
+ /* FIXME free the deserialized data (pfree is not enough) */
+
+ PG_RETURN_TEXT_P(cstring_to_text(result));
+}
+
+/* print the dependencies
+ *
+ * TODO Would be nice if this knew the actual column names (instead of
+ * the attnums).
+ *
+ * FIXME This is ugly and does not properly check the lengths and
+ * strcpy/snprintf return values. Needs to be fixed.
+ */
+Datum
+pg_mv_stats_dependencies_show(PG_FUNCTION_ARGS)
+{
+ int i = 0;
+ bytea *data = PG_GETARG_BYTEA_P(0);
+ char *result = NULL;
+ int len = 0;
+
+ MVDependencies dependencies = deserialize_mv_dependencies(data);
+
+ if (dependencies == NULL)
+ PG_RETURN_NULL();
+
+ for (i = 0; i < dependencies->ndeps; i++)
+ {
+ MVDependency dependency = dependencies->deps[i];
+ char buffer[128];
+
+ int tmp = snprintf(buffer, 128, "%s%d => %d",
+ ((i == 0) ? "" : ", "), dependency->a, dependency->b);
+
+ if (tmp < 127)
+ {
+ if (result == NULL)
+ result = palloc0(len + tmp + 1);
+ else
+ result = repalloc(result, len + tmp + 1);
+
+ strcpy(result + len, buffer);
+ len += tmp;
+ }
+ }
+
+ PG_RETURN_TEXT_P(cstring_to_text(result));
+}
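
To make the supporting/contradicting bookkeeping in
build_mv_dependencies more concrete, consider a small worked example
(illustrative numbers only). With min_group_size = 3 and the sorted
(A,B) pairs

    (1,1) (1,1) (1,1) (2,5) (2,5) (2,6) (3,7) (3,7) (3,7) (3,7)

the algorithm sees three groups:

    A=1: 3 rows, single B value, size >= 3  => supporting (3 rows)
    A=2: 3 rows, B changes once             => contradicting (3 rows)
    A=3: 4 rows, single B value             => supporting (4 rows)

so n_supporting_rows = 7 and n_contradicting_rows = 3. The dependency
[A => B] is accepted only if 7 > 10 * 3, which fails here, so it gets
rejected.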
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index bb59bc2..f6d60ad 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -2104,6 +2104,48 @@ describeOneTableDetails(const char *schemaname,
PQclear(result);
}
+ /* print any multivariate statistics */
+ if (pset.sversion >= 90500)
+ {
+ printfPQExpBuffer(&buf,
+ "SELECT oid, staname, stakeys,\n"
+ " deps_enabled,\n"
+ " deps_built,\n"
+ " (SELECT string_agg(attname::text,', ')\n"
+ " FROM ((SELECT unnest(stakeys) AS attnum) s\n"
+ " JOIN pg_attribute a ON (starelid = a.attrelid and a.attnum = s.attnum))) AS attnums\n"
+ "FROM pg_mv_statistic stat WHERE starelid = '%s' ORDER BY 1;",
+ oid);
+
+ result = PSQLexec(buf.data);
+ if (!result)
+ goto error_return;
+ else
+ tuples = PQntuples(result);
+
+ if (tuples > 0)
+ {
+ printTableAddFooter(&cont, _("Statistics:"));
+ for (i = 0; i < tuples; i++)
+ {
+ printfPQExpBuffer(&buf, " ");
+
+ /* statistics name */
+ appendPQExpBuffer(&buf, "%s ", PQgetvalue(result, i, 1));
+
+ /* options */
+ if (!strcmp(PQgetvalue(result, i, 3), "t"))
+ appendPQExpBuffer(&buf, "(dependencies)");
+
+ appendPQExpBuffer(&buf, " ON (%s)",
+ PQgetvalue(result, i, 5));
+
+ printTableAddFooter(&cont, buf.data);
+ }
+ }
+ PQclear(result);
+ }
+
/* print rules */
if (tableinfo.hasrules && tableinfo.relkind != 'm')
{
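
With the describe.c change above, \d output for a table with
multivariate statistics gains a footer along these lines (a sketch
only - the statistics name is invented):

    Statistics:
        test_stats (dependencies) ON (a, b, c)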
diff --git a/src/include/catalog/dependency.h b/src/include/catalog/dependency.h
index fbcf904..9a5c397 100644
--- a/src/include/catalog/dependency.h
+++ b/src/include/catalog/dependency.h
@@ -153,10 +153,11 @@ typedef enum ObjectClass
OCLASS_EXTENSION, /* pg_extension */
OCLASS_EVENT_TRIGGER, /* pg_event_trigger */
OCLASS_POLICY, /* pg_policy */
- OCLASS_TRANSFORM /* pg_transform */
+ OCLASS_TRANSFORM, /* pg_transform */
+ OCLASS_STATISTICS /* pg_mv_statistics */
} ObjectClass;
-#define LAST_OCLASS OCLASS_TRANSFORM
+#define LAST_OCLASS OCLASS_STATISTICS
/* in dependency.c */
diff --git a/src/include/catalog/heap.h b/src/include/catalog/heap.h
index e6ac394..36debeb 100644
--- a/src/include/catalog/heap.h
+++ b/src/include/catalog/heap.h
@@ -119,6 +119,7 @@ extern void RemoveAttrDefault(Oid relid, AttrNumber attnum,
DropBehavior behavior, bool complain, bool internal);
extern void RemoveAttrDefaultById(Oid attrdefId);
extern void RemoveStatistics(Oid relid, AttrNumber attnum);
+extern void RemoveMVStatistics(Oid relid, AttrNumber attnum);
extern Form_pg_attribute SystemAttributeDefinition(AttrNumber attno,
bool relhasoids);
diff --git a/src/include/catalog/indexing.h b/src/include/catalog/indexing.h
index c38958d..e171ae6 100644
--- a/src/include/catalog/indexing.h
+++ b/src/include/catalog/indexing.h
@@ -173,6 +173,13 @@ DECLARE_UNIQUE_INDEX(pg_largeobject_loid_pn_index, 2683, on pg_largeobject using
DECLARE_UNIQUE_INDEX(pg_largeobject_metadata_oid_index, 2996, on pg_largeobject_metadata using btree(oid oid_ops));
#define LargeObjectMetadataOidIndexId 2996
+DECLARE_UNIQUE_INDEX(pg_mv_statistic_oid_index, 3380, on pg_mv_statistic using btree(oid oid_ops));
+#define MvStatisticOidIndexId 3380
+DECLARE_UNIQUE_INDEX(pg_mv_statistic_name_index, 3997, on pg_mv_statistic using btree(staname name_ops));
+#define MvStatisticNameIndexId 3997
+DECLARE_INDEX(pg_mv_statistic_relid_index, 3379, on pg_mv_statistic using btree(starelid oid_ops));
+#define MvStatisticRelidIndexId 3379
+
DECLARE_UNIQUE_INDEX(pg_namespace_nspname_index, 2684, on pg_namespace using btree(nspname name_ops));
#define NamespaceNameIndexId 2684
DECLARE_UNIQUE_INDEX(pg_namespace_oid_index, 2685, on pg_namespace using btree(oid oid_ops));
diff --git a/src/include/catalog/namespace.h b/src/include/catalog/namespace.h
index b6ad934..9bb59f9 100644
--- a/src/include/catalog/namespace.h
+++ b/src/include/catalog/namespace.h
@@ -137,6 +137,8 @@ extern Oid get_collation_oid(List *collname, bool missing_ok);
extern Oid get_conversion_oid(List *conname, bool missing_ok);
extern Oid FindDefaultConversionProc(int32 for_encoding, int32 to_encoding);
+extern Oid get_statistics_oid(List *names, bool missing_ok);
+
/* initialization & transaction cleanup code */
extern void InitializeSearchPath(void);
extern void AtEOXact_Namespace(bool isCommit, bool parallel);
diff --git a/src/include/catalog/pg_mv_statistic.h b/src/include/catalog/pg_mv_statistic.h
new file mode 100644
index 0000000..8c33a92
--- /dev/null
+++ b/src/include/catalog/pg_mv_statistic.h
@@ -0,0 +1,71 @@
+/*-------------------------------------------------------------------------
+ *
+ * pg_mv_statistic.h
+ * definition of the system "multivariate statistic" relation (pg_mv_statistic)
+ * along with the relation's initial contents.
+ *
+ *
+ * Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/catalog/pg_mv_statistic.h
+ *
+ * NOTES
+ * the genbki.pl script reads this file and generates .bki
+ * information from the DATA() statements.
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef PG_MV_STATISTIC_H
+#define PG_MV_STATISTIC_H
+
+#include "catalog/genbki.h"
+
+/* ----------------
+ * pg_mv_statistic definition. cpp turns this into
+ * typedef struct FormData_pg_mv_statistic
+ * ----------------
+ */
+#define MvStatisticRelationId 3381
+
+CATALOG(pg_mv_statistic,3381)
+{
+ /* These fields form the unique key for the entry: */
+ Oid starelid; /* relation containing attributes */
+ NameData staname; /* statistics name */
+
+ /* statistics requested to build */
+ bool deps_enabled; /* analyze dependencies? */
+
+ /* statistics that are available (if requested) */
+ bool deps_built; /* dependencies were built */
+
+ /* variable-length fields start here, but we allow direct access to stakeys */
+ int2vector stakeys; /* array of column keys */
+
+#ifdef CATALOG_VARLEN
+ bytea stadeps; /* dependencies (serialized) */
+#endif
+
+} FormData_pg_mv_statistic;
+
+/* ----------------
+ * Form_pg_mv_statistic corresponds to a pointer to a tuple with
+ * the format of pg_mv_statistic relation.
+ * ----------------
+ */
+typedef FormData_pg_mv_statistic *Form_pg_mv_statistic;
+
+/* ----------------
+ * compiler constants for pg_mv_statistic
+ * ----------------
+ */
+#define Natts_pg_mv_statistic 6
+#define Anum_pg_mv_statistic_starelid 1
+#define Anum_pg_mv_statistic_staname 2
+#define Anum_pg_mv_statistic_deps_enabled 3
+#define Anum_pg_mv_statistic_deps_built 4
+#define Anum_pg_mv_statistic_stakeys 5
+#define Anum_pg_mv_statistic_stadeps 6
+
+#endif /* PG_MV_STATISTIC_H */
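
As a sketch of how a consumer is expected to read this catalog, given
the OID of a pg_mv_statistic row (not part of the patch - it merely
combines the syscache entries added earlier with the deserialization
function from dependencies.c):

    HeapTuple htup;
    Form_pg_mv_statistic stat;
    MVDependencies deps = NULL;
    bool isnull;
    Datum val;

    htup = SearchSysCache1(MVSTATOID, ObjectIdGetDatum(mvoid));
    if (!HeapTupleIsValid(htup))
        elog(ERROR, "cache lookup failed for statistics %u", mvoid);

    stat = (Form_pg_mv_statistic) GETSTRUCT(htup);
    if (stat->deps_built)
    {
        /* stadeps is a varlena column, so fetch it through the syscache */
        val = SysCacheGetAttr(MVSTATOID, htup,
                              Anum_pg_mv_statistic_stadeps, &isnull);
        if (!isnull)
            deps = deserialize_mv_dependencies(DatumGetByteaP(val));
    }
    ReleaseSysCache(htup);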
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
index d8640db..85c638d 100644
--- a/src/include/catalog/pg_proc.h
+++ b/src/include/catalog/pg_proc.h
@@ -2739,6 +2739,11 @@ DESCR("current user privilege on any column by rel name");
DATA(insert OID = 3029 ( has_any_column_privilege PGNSP PGUID 12 10 0 0 0 f f f f t f s s 2 0 16 "26 25" _null_ _null_ _null_ _null_ _null_ has_any_column_privilege_id _null_ _null_ _null_ ));
DESCR("current user privilege on any column by rel oid");
+DATA(insert OID = 3998 ( pg_mv_stats_dependencies_info PGNSP PGUID 12 1 0 0 0 f f f f t f i s 1 0 25 "17" _null_ _null_ _null_ _null_ _null_ pg_mv_stats_dependencies_info _null_ _null_ _null_ ));
+DESCR("multivariate stats: functional dependencies info");
+DATA(insert OID = 3999 ( pg_mv_stats_dependencies_show PGNSP PGUID 12 1 0 0 0 f f f f t f i s 1 0 25 "17" _null_ _null_ _null_ _null_ _null_ pg_mv_stats_dependencies_show _null_ _null_ _null_ ));
+DESCR("multivariate stats: functional dependencies show");
+
DATA(insert OID = 1928 ( pg_stat_get_numscans PGNSP PGUID 12 1 0 0 0 f f f f t f s r 1 0 20 "26" _null_ _null_ _null_ _null_ _null_ pg_stat_get_numscans _null_ _null_ _null_ ));
DESCR("statistics: number of scans done for table/index");
DATA(insert OID = 1929 ( pg_stat_get_tuples_returned PGNSP PGUID 12 1 0 0 0 f f f f t f s r 1 0 20 "26" _null_ _null_ _null_ _null_ _null_ pg_stat_get_tuples_returned _null_ _null_ _null_ ));
diff --git a/src/include/catalog/toasting.h b/src/include/catalog/toasting.h
index fb2f035..b7c878d 100644
--- a/src/include/catalog/toasting.h
+++ b/src/include/catalog/toasting.h
@@ -49,6 +49,7 @@ extern void BootstrapToastTable(char *relName,
DECLARE_TOAST(pg_attrdef, 2830, 2831);
DECLARE_TOAST(pg_constraint, 2832, 2833);
DECLARE_TOAST(pg_description, 2834, 2835);
+DECLARE_TOAST(pg_mv_statistic, 3577, 3578);
DECLARE_TOAST(pg_proc, 2836, 2837);
DECLARE_TOAST(pg_rewrite, 2838, 2839);
DECLARE_TOAST(pg_seclabel, 3598, 3599);
diff --git a/src/include/commands/defrem.h b/src/include/commands/defrem.h
index adae296..3adb956 100644
--- a/src/include/commands/defrem.h
+++ b/src/include/commands/defrem.h
@@ -75,6 +75,10 @@ extern ObjectAddress DefineOperator(List *names, List *parameters);
extern void RemoveOperatorById(Oid operOid);
extern ObjectAddress AlterOperator(AlterOperatorStmt *stmt);
+/* commands/statscmds.c */
+extern ObjectAddress CreateStatistics(CreateStatsStmt *stmt);
+extern void RemoveStatisticsById(Oid statsOid);
+
/* commands/aggregatecmds.c */
extern ObjectAddress DefineAggregate(List *name, List *args, bool oldstyle,
List *parameters, const char *queryString);
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index 603edd3..ece0776 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -251,6 +251,7 @@ typedef enum NodeTag
T_PlaceHolderInfo,
T_MinMaxAggInfo,
T_PlannerParamItem,
+ T_MVStatisticInfo,
/*
* TAGS FOR MEMORY NODES (memnodes.h)
@@ -381,6 +382,7 @@ typedef enum NodeTag
T_CreatePolicyStmt,
T_AlterPolicyStmt,
T_CreateTransformStmt,
+ T_CreateStatsStmt,
/*
* TAGS FOR PARSE TREE NODES (parsenodes.h)
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 9142e94..3650897 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -596,6 +596,16 @@ typedef struct ColumnDef
int location; /* parse location, or -1 if none/unknown */
} ColumnDef;
+typedef struct CreateStatsStmt
+{
+ NodeTag type;
+ char *statsname; /* name of new statistics, or NULL for default */
+ RangeVar *relation; /* relation to build statistics on */
+ List *keys; /* String nodes naming referenced column(s) */
+ List *options; /* list of DefElem nodes */
+} CreateStatsStmt;
+
+
/*
* TableLikeClause - CREATE TABLE ( ... LIKE ... ) clause
*/
@@ -1405,6 +1415,7 @@ typedef enum ObjectType
OBJECT_RULE,
OBJECT_SCHEMA,
OBJECT_SEQUENCE,
+ OBJECT_STATISTICS,
OBJECT_TABCONSTRAINT,
OBJECT_TABLE,
OBJECT_TABLESPACE,
diff --git a/src/include/nodes/relation.h b/src/include/nodes/relation.h
index 5393005..baa0c88 100644
--- a/src/include/nodes/relation.h
+++ b/src/include/nodes/relation.h
@@ -479,6 +479,7 @@ typedef struct RelOptInfo
List *lateral_vars; /* LATERAL Vars and PHVs referenced by rel */
Relids lateral_referencers; /* rels that reference me laterally */
List *indexlist; /* list of IndexOptInfo */
+ List *mvstatlist; /* list of MVStatisticInfo */
BlockNumber pages; /* size estimates derived from pg_class */
double tuples;
double allvisfrac;
@@ -573,6 +574,33 @@ typedef struct IndexOptInfo
bool amhasgetbitmap; /* does AM have amgetbitmap interface? */
} IndexOptInfo;
+/*
+ * MVStatisticInfo
+ * Information about multivariate stats for planning/optimization
+ *
+ * This contains information about which columns are covered by the
+ * statistics (stakeys), which options were requested while adding the
+ * statistics (*_enabled), and which kinds of statistics were actually
+ * built and are available for the optimizer (*_built).
+ */
+typedef struct MVStatisticInfo
+{
+ NodeTag type;
+
+ Oid mvoid; /* OID of the statistics row */
+ RelOptInfo *rel; /* back-link to the statistics' table */
+
+ /* enabled statistics */
+ bool deps_enabled; /* functional dependencies enabled */
+
+ /* built/available statistics */
+ bool deps_built; /* functional dependencies built */
+
+ /* columns in the statistics (attnums) */
+ int2vector *stakeys; /* attnums of the columns covered */
+
+} MVStatisticInfo;
+
/*
* EquivalenceClasses
diff --git a/src/include/parser/kwlist.h b/src/include/parser/kwlist.h
index 812ca83..daefcef 100644
--- a/src/include/parser/kwlist.h
+++ b/src/include/parser/kwlist.h
@@ -361,7 +361,7 @@ PG_KEYWORD("stable", STABLE, UNRESERVED_KEYWORD)
PG_KEYWORD("standalone", STANDALONE_P, UNRESERVED_KEYWORD)
PG_KEYWORD("start", START, UNRESERVED_KEYWORD)
PG_KEYWORD("statement", STATEMENT, UNRESERVED_KEYWORD)
-PG_KEYWORD("statistics", STATISTICS, UNRESERVED_KEYWORD)
+PG_KEYWORD("statistics", STATISTICS, RESERVED_KEYWORD)
PG_KEYWORD("stdin", STDIN, UNRESERVED_KEYWORD)
PG_KEYWORD("stdout", STDOUT, UNRESERVED_KEYWORD)
PG_KEYWORD("storage", STORAGE, UNRESERVED_KEYWORD)
diff --git a/src/include/utils/mvstats.h b/src/include/utils/mvstats.h
new file mode 100644
index 0000000..411cd16
--- /dev/null
+++ b/src/include/utils/mvstats.h
@@ -0,0 +1,69 @@
+/*-------------------------------------------------------------------------
+ *
+ * mvstats.h
+ * Multivariate statistics and selectivity estimation functions.
+ *
+ *
+ * Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/utils/mvstats.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef MVSTATS_H
+#define MVSTATS_H
+
+#include "commands/vacuum.h"
+
+
+#define MVSTATS_MAX_DIMENSIONS 8 /* max number of attributes */
+
+/* An association rule, tracking the [a => b] dependency.
+ *
+ * TODO Make this work with multiple columns on both sides.
+ */
+typedef struct MVDependencyData {
+ int16 a;
+ int16 b;
+} MVDependencyData;
+
+typedef MVDependencyData* MVDependency;
+
+typedef struct MVDependenciesData {
+ uint32 magic; /* magic constant marker */
+ int32 ndeps; /* number of dependencies */
+ MVDependency deps[1]; /* XXX why not a pointer? */
+} MVDependenciesData;
+
+typedef MVDependenciesData* MVDependencies;
+
+#define MVSTAT_DEPS_MAGIC 0xB4549A2C /* marks serialized bytea */
+#define MVSTAT_DEPS_TYPE_BASIC 1 /* basic dependencies type */
+
+/*
+ * TODO Maybe fetching the histogram/MCV list separately is inefficient?
+ * Consider adding a single `fetch_stats` method, fetching all
+ * stats specified using flags (or something like that).
+ */
+
+bytea * serialize_mv_dependencies(MVDependencies dependencies);
+
+/* serialization/deserialization of stats to/from bytea */
+MVDependencies deserialize_mv_dependencies(bytea * data);
+
+/* FIXME these probably belong somewhere else (not to stats operations) */
+extern Datum pg_mv_stats_dependencies_info(PG_FUNCTION_ARGS);
+extern Datum pg_mv_stats_dependencies_show(PG_FUNCTION_ARGS);
+
+MVDependencies
+build_mv_dependencies(int numrows, HeapTuple *rows,
+ int2vector *attrs,
+ VacAttrStats **stats);
+
+void build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
+ int natts, VacAttrStats **vacattrstats);
+
+void update_mv_stats(Oid mvoid, MVDependencies dependencies, int2vector *attrs);
+
+#endif
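
One note on the 'deps[1]' member above: it acts as a trailing
flexible array, so every allocation has to size the struct explicitly,
the way build_mv_dependencies and deserialize_mv_dependencies do. A
minimal sketch of that pattern (illustrative only):

    /* room for the header plus n dependency pointers */
    MVDependencies d = (MVDependencies) palloc0(
        offsetof(MVDependenciesData, deps) + n * sizeof(MVDependency));

    d->magic = MVSTAT_DEPS_MAGIC;
    d->ndeps = n;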
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index 8a55a09..4d6edb6 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -79,6 +79,7 @@ typedef struct RelationData
bool rd_isvalid; /* relcache entry is valid */
char rd_indexvalid; /* state of rd_indexlist: 0 = not valid, 1 =
* valid, 2 = temporarily forced */
+ bool rd_mvstatvalid; /* is rd_mvstatlist valid? */
/*
* rd_createSubid is the ID of the highest subtransaction the rel has
@@ -111,6 +112,9 @@ typedef struct RelationData
List *rd_indexlist; /* list of OIDs of indexes on relation */
Oid rd_oidindex; /* OID of unique index on OID, if any */
Oid rd_replidindex; /* OID of replica identity index, if any */
+
+ /* data managed by RelationGetMVStatList: */
+ List *rd_mvstatlist; /* list of OIDs of multivariate stats */
/* data managed by RelationGetIndexAttrBitmap: */
Bitmapset *rd_indexattr; /* identifies columns used in indexes */
diff --git a/src/include/utils/relcache.h b/src/include/utils/relcache.h
index 6953281..77efeff 100644
--- a/src/include/utils/relcache.h
+++ b/src/include/utils/relcache.h
@@ -38,6 +38,7 @@ extern void RelationClose(Relation relation);
* Routines to compute/retrieve additional cached information
*/
extern List *RelationGetIndexList(Relation relation);
+extern List *RelationGetMVStatList(Relation relation);
extern Oid RelationGetOidIndex(Relation relation);
extern Oid RelationGetReplicaIndex(Relation relation);
extern List *RelationGetIndexExpressions(Relation relation);
diff --git a/src/include/utils/syscache.h b/src/include/utils/syscache.h
index 18404e2..bff702e 100644
--- a/src/include/utils/syscache.h
+++ b/src/include/utils/syscache.h
@@ -66,6 +66,8 @@ enum SysCacheIdentifier
INDEXRELID,
LANGNAME,
LANGOID,
+ MVSTATNAME,
+ MVSTATOID,
NAMESPACENAME,
NAMESPACEOID,
OPERNAMENSP,
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 80374e4..428b1e8 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1365,6 +1365,14 @@ pg_matviews| SELECT n.nspname AS schemaname,
LEFT JOIN pg_namespace n ON ((n.oid = c.relnamespace)))
LEFT JOIN pg_tablespace t ON ((t.oid = c.reltablespace)))
WHERE (c.relkind = 'm'::"char");
+pg_mv_stats| SELECT n.nspname AS schemaname,
+ c.relname AS tablename,
+ s.stakeys AS attnums,
+ length(s.stadeps) AS depsbytes,
+ pg_mv_stats_dependencies_info(s.stadeps) AS depsinfo
+ FROM ((pg_mv_statistic s
+ JOIN pg_class c ON ((c.oid = s.starelid)))
+ LEFT JOIN pg_namespace n ON ((n.oid = c.relnamespace)));
pg_policies| SELECT n.nspname AS schemaname,
c.relname AS tablename,
pol.polname AS policyname,
diff --git a/src/test/regress/expected/sanity_check.out b/src/test/regress/expected/sanity_check.out
index eb0bc88..92a0d8a 100644
--- a/src/test/regress/expected/sanity_check.out
+++ b/src/test/regress/expected/sanity_check.out
@@ -113,6 +113,7 @@ pg_inherits|t
pg_language|t
pg_largeobject|t
pg_largeobject_metadata|t
+pg_mv_statistic|t
pg_namespace|t
pg_opclass|t
pg_operator|t
--
2.1.0
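
For a quick check after ANALYZE, the pg_mv_stats system view (visible
in the rules.out diff above) can be queried directly. An output sketch
(the values are invented):

SELECT tablename, attnums, depsinfo FROM pg_mv_stats;

 tablename | attnums | depsinfo
-----------+---------+----------------
 test      | 1 2 3   | dependencies=6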
Attachment: 0003-clause-reduction-using-functional-dependencies.patch (text/x-diff)
>From 039046f31843f2747a4fef4ed49b830b492ee459 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@pgaddict.com>
Date: Mon, 6 Apr 2015 19:42:18 +0200
Subject: [PATCH 3/7] clause reduction using functional dependencies
During planning, use functional dependencies to decide which
clauses to skip during cardinality estimation. Initial and
rather simplistic implementation.
This only works with regular WHERE clauses, not with join
clauses.
Note: The clause_is_mv_compatible() needs to identify the
relation (so that we can fetch the list of multivariate stats
by OID). planner_rt_fetch() seems like the appropriate way to
get the relation OID, but apparently it only works with simple
vars. Maybe examine_variable() would make this work with more
complex vars too?
Includes regression tests analyzing functional dependencies
(part of ANALYZE) on several datasets (no dependencies, no
transitive dependencies, ...).
Checks that a query with conditions on two columns, where one (B)
is functionally dependent on the other one (A), correctly ignores
the clause on (B) and chooses a bitmap index scan instead of a plain
index scan (which is what it chooses otherwise, thanks to the
independence assumption).
Note: Functional dependencies only work with equality clauses,
not with inequalities etc.
---
src/backend/optimizer/path/clausesel.c | 912 +++++++++++++++++++++++++-
src/backend/utils/mvstats/common.c | 5 +-
src/backend/utils/mvstats/dependencies.c | 24 +
src/include/utils/mvstats.h | 16 +-
src/test/regress/expected/mv_dependencies.out | 172 +++++
src/test/regress/parallel_schedule | 3 +
src/test/regress/serial_schedule | 1 +
src/test/regress/sql/mv_dependencies.sql | 150 +++++
8 files changed, 1278 insertions(+), 5 deletions(-)
create mode 100644 src/test/regress/expected/mv_dependencies.out
create mode 100644 src/test/regress/sql/mv_dependencies.sql
diff --git a/src/backend/optimizer/path/clausesel.c b/src/backend/optimizer/path/clausesel.c
index 6ce2726..c7f17e3 100644
--- a/src/backend/optimizer/path/clausesel.c
+++ b/src/backend/optimizer/path/clausesel.c
@@ -14,14 +14,19 @@
*/
#include "postgres.h"
+#include "access/sysattr.h"
+#include "catalog/pg_operator.h"
#include "nodes/makefuncs.h"
#include "optimizer/clauses.h"
#include "optimizer/cost.h"
#include "optimizer/pathnode.h"
#include "optimizer/plancat.h"
+#include "optimizer/var.h"
#include "utils/fmgroids.h"
#include "utils/lsyscache.h"
+#include "utils/mvstats.h"
#include "utils/selfuncs.h"
+#include "utils/typcache.h"
/*
@@ -41,6 +46,44 @@ typedef struct RangeQueryClause
static void addRangeClause(RangeQueryClause **rqlist, Node *clause,
bool varonleft, bool isLTsel, Selectivity s2);
+#define MV_CLAUSE_TYPE_FDEP 0x01
+
+static bool clause_is_mv_compatible(PlannerInfo *root, Node *clause, Oid varRelid,
+ Index *relid, AttrNumber *attnum, SpecialJoinInfo *sjinfo);
+
+static Bitmapset *collect_mv_attnums(PlannerInfo *root, List *clauses,
+ Oid varRelid, Index *relid, SpecialJoinInfo *sjinfo);
+
+static List *clauselist_apply_dependencies(PlannerInfo *root, List *clauses,
+ Oid varRelid, List *stats,
+ SpecialJoinInfo *sjinfo);
+
+static bool has_stats(List *stats, int type);
+
+static List * find_stats(PlannerInfo *root, List *clauses,
+ Oid varRelid, Index *relid);
+
+static Bitmapset* fdeps_collect_attnums(List *stats);
+
+static int *make_idx_to_attnum_mapping(Bitmapset *attnums);
+static int *make_attnum_to_idx_mapping(Bitmapset *attnums);
+
+static bool *build_adjacency_matrix(List *stats, Bitmapset *attnums,
+ int *idx_to_attnum, int *attnum_to_idx);
+
+static void multiply_adjacency_matrix(bool *matrix, int natts);
+
+static List* fdeps_reduce_clauses(List *clauses,
+ Bitmapset *attnums, bool *matrix,
+ int *idx_to_attnum, int *attnum_to_idx,
+ Index relid);
+
+static Bitmapset *fdeps_filter_clauses(PlannerInfo *root,
+ List *clauses, Bitmapset *deps_attnums,
+ List **reduced_clauses, List **deps_clauses,
+ Oid varRelid, Index *relid, SpecialJoinInfo *sjinfo);
+
+static Bitmapset * get_varattnos(Node * node, Index relid);
/****************************************************************************
* ROUTINES TO COMPUTE SELECTIVITIES
@@ -60,7 +103,7 @@ static void addRangeClause(RangeQueryClause **rqlist, Node *clause,
* subclauses. However, that's only right if the subclauses have independent
* probabilities, and in reality they are often NOT independent. So,
* we want to be smarter where we can.
-
+ *
* Currently, the only extra smarts we have is to recognize "range queries",
* such as "x > 34 AND x < 42". Clauses are recognized as possible range
* query components if they are restriction opclauses whose operators have
@@ -87,6 +130,88 @@ static void addRangeClause(RangeQueryClause **rqlist, Node *clause,
*
* Of course this is all very dependent on the behavior of
* scalarltsel/scalargtsel; perhaps some day we can generalize the approach.
+ *
+ *
+ * Multivariate statistics
+ * -----------------------
+ * This also uses multivariate stats to estimate combinations of
+ * conditions, in a way (a) maximizing the estimate accuracy by using
+ * as many stats as possible, and (b) minimizing the overhead,
+ * especially when there are no suitable multivariate stats (so if you
+ * are not using multivariate stats, there's no additional overhead).
+ *
+ * The following checks are performed (in this order), and the optimizer
+ * falls back to regular stats on the first 'false'.
+ *
+ * NOTE: This explains how this works with all the patches applied, not
+ * just the functional dependencies.
+ *
+ * (0) check if there are multivariate stats on the relation
+ *
+ * If no, just skip all the following steps (directly to the
+ * original code).
+ *
+ * (1) check how many attributes there are in conditions compatible
+ * with functional dependencies
+ *
+ * Only simple equality clauses are considered compatible with
+ * functional dependencies (and that's unlikely to change, because
+ * that's the only case when functional dependencies are useful).
+ *
+ * If there are no conditions that might be handled by multivariate
+ * stats, or if the conditions reference just a single column, it
+ * makes no sense to use functional dependencies, so skip to (4).
+ *
+ * (2) reduce the clauses using functional dependencies
+ *
+ * This simply attempts to 'reduce' the clauses by applying functional
+ * dependencies. For example if there are two clauses:
+ *
+ * WHERE (a = 1) AND (b = 2)
+ *
+ * and we know that 'a' determines the value of 'b', we may remove
+ * the second condition (b = 2) when computing the selectivity.
+ * This is of course tricky - see mvstats/dependencies.c for details.
+ *
+ * After the reduction, step (1) is to be repeated.
+ *
+ * (3) check which conditions are compatible with MCV lists and
+ * histograms
+ *
+ * What conditions are compatible with multivariate stats is decided
+ * by clause_is_mv_compatible(). At this moment, only conditions
+ * of the form "column operator constant" (for simple comparison
+ * operators), IS [NOT] NULL and some AND/OR clauses are considered
+ * compatible with multivariate statistics.
+ *
+ * Again, see clause_is_mv_compatible() for details.
+ *
+ * (4) check how many attributes there are in conditions compatible
+ * with MCV lists and histograms
+ *
+ * If there are no conditions that might be handled by MCV lists
+ * or histograms, or if the conditions reference just a single
+ * column, it makes no sense to continue, so just skip to (7).
+ *
+ * (5) choose the stats matching the most columns
+ *
+ * If there are multiple instances of multivariate statistics (e.g.
+ * built on different sets of columns), we choose the stats covering
+ * the most columns from step (1). It may happen that all available
+ * stats match just a single column - for example with conditions
+ *
+ * WHERE a = 1 AND b = 2
+ *
+ * and statistics built on (a,c) and (b,c). In such a case just fall
+ * back to the regular stats because it makes no sense to use the
+ * multivariate statistics.
+ *
+ * For more details about how exactly we choose the stats, see
+ * choose_mv_statistics().
+ *
+ * (6) use the multivariate stats to estimate matching clauses
+ *
+ * (7) estimate the remaining clauses using the regular statistics
*/
Selectivity
clauselist_selectivity(PlannerInfo *root,
@@ -99,6 +224,16 @@ clauselist_selectivity(PlannerInfo *root,
RangeQueryClause *rqlist = NULL;
ListCell *l;
+ /* processing mv stats */
+ Oid relid = InvalidOid;
+
+ /* attributes in mv-compatible clauses */
+ Bitmapset *mvattnums = NULL;
+ List *stats = NIL;
+
+ /* use clauses (not conditions), because those are always non-empty */
+ stats = find_stats(root, clauses, varRelid, &relid);
+
/*
* If there's exactly one clause, then no use in trying to match up pairs,
* so just go directly to clause_selectivity().
@@ -108,6 +243,31 @@ clauselist_selectivity(PlannerInfo *root,
varRelid, jointype, sjinfo);
/*
+ * Check that there are some stats with functional dependencies
+ * built (by walking the stats list). We're going to find that
+ * anyway when trying to apply the functional dependencies, but
+ * this is probably a tad faster.
+ */
+ if (has_stats(stats, MV_CLAUSE_TYPE_FDEP))
+ {
+ /* collect attributes referenced by mv-compatible clauses */
+ mvattnums = collect_mv_attnums(root, clauses, varRelid, &relid, sjinfo);
+
+ /*
+ * If there are mv-compatible clauses, referencing at least two
+ * different columns (otherwise it makes no sense to use mv stats),
+ * try to reduce the clauses using functional dependencies, and
+ * recollect the attributes from the reduced list.
+ *
+ * We don't need to select a single statistics for this - we can
+ * apply all the functional dependencies we have.
+ */
+ if (bms_num_members(mvattnums) >= 2)
+ clauses = clauselist_apply_dependencies(root, clauses, varRelid,
+ stats, sjinfo);
+ }
+
+ /*
* Initial scan over clauses. Anything that doesn't look like a potential
* rangequery clause gets multiplied into s1 and forgotten. Anything that
* does gets inserted into an rqlist entry.
@@ -763,3 +923,753 @@ clause_selectivity(PlannerInfo *root,
return s1;
}
+
+/*
+ * Collect attributes from mv-compatible clauses.
+ */
+static Bitmapset *
+collect_mv_attnums(PlannerInfo *root, List *clauses, Oid varRelid,
+ Index *relid, SpecialJoinInfo *sjinfo)
+{
+ Bitmapset *attnums = NULL;
+ ListCell *l;
+
+ /*
+ * Walk through the clauses and identify the ones we can estimate
+ * using multivariate stats, and remember the relid/columns. We'll
+ * then cross-check if we have suitable stats, and only if needed
+ * we'll split the clauses into multivariate and regular lists.
+ *
+ * For now we're only interested in RestrictInfo nodes with nested
+ * OpExpr, using either a range or equality.
+ */
+ foreach (l, clauses)
+ {
+ AttrNumber attnum;
+ Node *clause = (Node *) lfirst(l);
+
+ /* ignore the result for now - we only need the info */
+ if (clause_is_mv_compatible(root, clause, varRelid, relid, &attnum, sjinfo))
+ attnums = bms_add_member(attnums, attnum);
+ }
+
+ /*
+ * If there are not at least two attributes referenced by the clause(s),
+ * we can throw everything out (as we'll revert to simple stats).
+ */
+ if (bms_num_members(attnums) <= 1)
+ {
+ if (attnums != NULL)
+ pfree(attnums);
+ attnums = NULL;
+ *relid = InvalidOid;
+ }
+
+ return attnums;
+}
+
+/*
+ * Determines whether the clause is compatible with multivariate stats,
+ * and if it is, returns some additional information - varno (index
+ * into simple_rte_array) and the attnum of the variable. This is then
+ * used to fetch related multivariate statistics.
+ *
+ * At this moment we only support basic conditions of the form
+ *
+ * variable OP constant
+ *
+ * where OP is one of [=,<,<=,>=,>] (determined by looking at the
+ * function associated with the operator for selectivity estimation,
+ * just like in the single-dimensional case).
+ *
+ * TODO Support 'OR clauses' - shouldn't be all that difficult to
+ * evaluate them using multivariate stats.
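+ *
+ * For illustration (examples made up, matching the checks below):
+ *
+ *     a = 1        compatible (Var op Const, estimated by eqsel)
+ *     1 = a        compatible (Const op Var, handled via varonleft)
+ *     a = b        not compatible (two Vars, treated as a join clause)
+ *     (a + 1) = 2  not compatible (left side is not a simple Var)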
+ */
+static bool
+clause_is_mv_compatible(PlannerInfo *root, Node *clause, Oid varRelid,
+ Index *relid, AttrNumber *attnum, SpecialJoinInfo *sjinfo)
+{
+
+ if (IsA(clause, RestrictInfo))
+ {
+ RestrictInfo *rinfo = (RestrictInfo *) clause;
+
+ /* Pseudoconstants are not really interesting here. */
+ if (rinfo->pseudoconstant)
+ return false;
+
+ /* no support for OR clauses at this point */
+ if (rinfo->orclause)
+ return false;
+
+ /* get the actual clause from the RestrictInfo (it's not an OR clause) */
+ clause = (Node*)rinfo->clause;
+
+ /* only simple opclauses are compatible with multivariate stats */
+ if (! is_opclause(clause))
+ return false;
+
+ /* we don't support join conditions at this moment */
+ if (treat_as_join_clause(clause, rinfo, varRelid, sjinfo))
+ return false;
+
+ /* is it 'variable op constant' ? */
+ if (list_length(((OpExpr *) clause)->args) == 2)
+ {
+ OpExpr *expr = (OpExpr *) clause;
+ bool varonleft = true;
+ bool ok;
+
+ ok = (bms_membership(rinfo->clause_relids) == BMS_SINGLETON) &&
+ (is_pseudo_constant_clause_relids(lsecond(expr->args),
+ rinfo->right_relids) ||
+ (varonleft = false,
+ is_pseudo_constant_clause_relids(linitial(expr->args),
+ rinfo->left_relids)));
+
+ if (ok)
+ {
+ Var * var = (varonleft) ? linitial(expr->args) : lsecond(expr->args);
+
+ /*
+ * Simple variables only - otherwise the planner_rt_fetch seems to fail
+ * (return NULL).
+ *
+ * TODO Maybe using examine_variable() would fix that?
+ */
+ if (! (IsA(var, Var) && (varRelid == 0 || varRelid == var->varno)))
+ return false;
+
+ /*
+ * Only consider this variable if (varRelid == 0) or when the varno
+ * matches varRelid (see explanation at clause_selectivity).
+ *
+ * FIXME I suspect this may not be really necessary. The (varRelid == 0)
+ * part seems to be enforced by treat_as_join_clause().
+ */
+ if (! ((varRelid == 0) || (varRelid == var->varno)))
+ return false;
+
+ /* Also skip special varno values, and system attributes ... */
+ if ((IS_SPECIAL_VARNO(var->varno)) || (! AttrNumberIsForUserDefinedAttr(var->varattno)))
+ return false;
+
+ *relid = var->varno;
+
+ /*
+ * If it's not a "<" or ">" or "=" operator, just ignore the
+ * clause. Otherwise note the relid and attnum for the variable.
+ * This looks at the function used for estimating selectivity, not
+ * at the operator directly (a bit awkward, but well ...).
+ */
+ switch (get_oprrest(expr->opno))
+ {
+ case F_EQSEL:
+ *attnum = var->varattno;
+ return true;
+ }
+ }
+ }
+ }
+
+ return false;
+
+}
+
+/*
+ * Performs reduction of clauses using functional dependencies, i.e.
+ * removes clauses that are considered redundant. It simply walks
+ * through dependencies, and checks whether the dependency 'matches'
+ * the clauses, i.e. if there's a clause matching the condition. If yes,
+ * all clauses matching the implied part of the dependency are removed
+ * from the list.
+ *
+ * This simply looks at attnums referenced by the clauses, not at the
+ * type of the operator (equality, inequality, ...). This may not be the
+ * right way to do it - it certainly works best for equalities, which is
+ * naturally consistent with functional dependencies (implications).
+ * It's not clear that other operators are handled sensibly - for
+ * example for inequalities, like
+ *
+ * WHERE (A >= 10) AND (B <= 20)
+ *
+ * and a trivial case where [A == B], resulting in a symmetric pair of
+ * rules [A => B] and [B => A], it's rather clear we can't remove either of
+ * those clauses.
+ *
+ * That only highlights that functional dependencies are most suitable
+ * for label-like data, where using non-equality operators is very rare.
+ * Using the common city/zipcode example, clauses like
+ *
+ * (zipcode <= 12345)
+ *
+ * or
+ *
+ * (cityname >= 'Washington')
+ *
+ * are rare. So restricting the reduction to equality should not harm
+ * the usefulness / applicability.
+ *
+ * Another caveat is that this assumes 'compatible' clauses. For
+ * example with a mismatched zip code and city name, this is unable
+ * to identify the discrepancy and still eliminates one of the clauses. The
+ * usual approach (multiplying both selectivities) thus produces a more
+ * accurate estimate, although mostly by luck - the multiplication
+ * comes from the assumption of statistical independence of the two
+ * conditions (which is not valid in this case), but moves the
+ * estimate in the right direction (towards 0%).
+ *
+ * This might be somewhat improved by cross-checking the selectivities
+ * against MCV and/or histogram.
+ *
+ * The implementation needs to be careful about cyclic rules, i.e. rules
+ * like [A => B] and [B => A] at the same time. This must not reduce
+ * clauses on both attributes at the same time.
+ *
+ * Technically we might consider selectivities here too, somehow. E.g.
+ * when (A => B) and (B => A), we might use the clause with the
+ * minimum selectivity.
+ *
+ * TODO Consider restricting the reduction to equality clauses. Or maybe
+ * use equality classes somehow?
+ *
+ * TODO Merge this docs to dependencies.c, as it's saying mostly the
+ * same things as the comments there.
+ *
+ * TODO Currently this is applied only to the top-level clauses, but
+ * maybe we could apply it to lists at subtrees too, e.g. to the
+ * two AND-clauses in
+ *
+ * (x=1 AND y=2) OR (z=3 AND q=10)
+ *
+ */
+static List *
+clauselist_apply_dependencies(PlannerInfo *root, List *clauses,
+ Oid varRelid, List *stats,
+ SpecialJoinInfo *sjinfo)
+{
+ List *reduced_clauses = NIL;
+ Index relid;
+
+ /*
+ * matrix of (natts x natts), 1 means x=>y
+ *
+ * This serves two purposes - first, it merges dependencies from all
+ * the statistics, second it makes generating all the transitive
+ * dependencies easier.
+ *
+ * We need to build this only for attributes from the dependencies,
+ * not for all attributes in the table.
+ *
+ * We can't do that only for attributes from the clauses, because we
+ * want to build transitive dependencies (including those going
+ * through attributes not listed in the stats).
+ *
+ * This only works for A=>B dependencies, not sure how to do that
+ * for complex dependencies.
+ */
+ bool *deps_matrix;
+ int deps_natts; /* size of the matrix */
+
+ /* mapping attnum <=> matrix index */
+ int *deps_idx_to_attnum;
+ int *deps_attnum_to_idx;
+
+ /* attnums in dependencies and clauses (and intersection) */
+ List *deps_clauses = NIL;
+ Bitmapset *deps_attnums = NULL;
+ Bitmapset *clause_attnums = NULL;
+ Bitmapset *intersect_attnums = NULL;
+
+ /*
+ * Is there at least one statistics with functional dependencies?
+ * If not, return the original clauses right away.
+ *
+ * XXX Isn't this pointless, thanks to exactly the same check in
+ * clauselist_selectivity()? Can we trigger the condition here?
+ */
+ if (! has_stats(stats, MV_CLAUSE_TYPE_FDEP))
+ return clauses;
+
+ /*
+ * Build the dependency matrix, i.e. attribute adjacency matrix,
+ * where 1 means (a=>b). Once we have the adjacency matrix, we'll
+ * multiply it by itself, to get transitive dependencies.
+ *
+ * Note: This is pretty much transitive closure from graph theory.
+ *
+ * First, let's see what attributes are covered by functional
+ * dependencies (sides of the adjacency matrix), and also the maximum
+ * attribute (size of the mapping to simple integer indexes).
+ */
+ deps_attnums = fdeps_collect_attnums(stats);
+
+ /*
+ * Walk through the clauses - clauses that are (one of)
+ *
+ * (a) not mv-compatible
+ * (b) using more than a single attnum
+ * (c) using an attnum not covered by functional dependencies
+ *
+ * may be copied directly to the result. The interesting clauses are
+ * kept in 'deps_clauses' and will be processed later.
+ */
+ clause_attnums = fdeps_filter_clauses(root, clauses, deps_attnums,
+ &reduced_clauses, &deps_clauses,
+ varRelid, &relid, sjinfo);
+
+ /*
+ * we need at least two clauses referencing two different attributes
+ * to do the reduction
+ */
+ if ((list_length(deps_clauses) < 2) || (bms_num_members(clause_attnums) < 2))
+ {
+ bms_free(clause_attnums);
+ list_free(reduced_clauses);
+ list_free(deps_clauses);
+
+ return clauses;
+ }
+
+
+ /*
+ * We need at least two matching attributes in the clauses and
+ * dependencies, otherwise we can't really reduce anything.
+ */
+ intersect_attnums = bms_intersect(clause_attnums, deps_attnums);
+ if (bms_num_members(intersect_attnums) < 2)
+ {
+ bms_free(clause_attnums);
+ bms_free(deps_attnums);
+ bms_free(intersect_attnums);
+
+ list_free(deps_clauses);
+ list_free(reduced_clauses);
+
+ return clauses;
+ }
+
+ /*
+ * Build mapping between matrix indexes and attnums, and then the
+ * adjacency matrix itself.
+ */
+ deps_idx_to_attnum = make_idx_to_attnum_mapping(deps_attnums);
+ deps_attnum_to_idx = make_attnum_to_idx_mapping(deps_attnums);
+
+ /* build the adjacency matrix */
+ deps_matrix = build_adjacency_matrix(stats, deps_attnums,
+ deps_idx_to_attnum,
+ deps_attnum_to_idx);
+
+ deps_natts = bms_num_members(deps_attnums);
+
+ /*
+ * Multiply the matrix N-times (N = size of the matrix), so that we
+ * get all the transitive dependencies. That makes the next step
+ * much easier and faster.
+ *
+ * This is essentially an adjacency matrix from graph theory, and
+ * by multiplying it we get transitive edges. We don't really care
+ * about the exact number (number of paths between vertices) though,
+ * so we can do the multiplication in-place (we don't care whether
+ * we found the dependency in this round or in the previous one).
+ *
+ * Track how many new dependencies were added, and stop when 0, but
+ * we can't multiply more than N-times (longest path in the graph).
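+ *
+ * A small illustration (not from the patch): with dependencies
+ * (a => b) and (b => c), the initial matrix has [a,b] and [b,c]
+ * set. The first pass sets [a,c] (the path through 'b'), the next
+ * pass adds nothing, so the loop terminates early.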
+ */
+ multiply_adjacency_matrix(deps_matrix, deps_natts);
+
+ /*
+ * Walk through the clauses, and see which other clauses we may
+ * reduce. The matrix contains all transitive dependencies, which
+ * makes this very fast.
+ *
+ * We have to be careful not to reduce the clause using itself, or
+ * reducing all clauses forming a cycle (so we have to skip already
+ * eliminated clauses).
+ *
+ * I'm not sure whether this guarantees finding the best solution,
+ * i.e. reducing the most clauses, but it probably does (thanks to
+ * having all the transitive dependencies).
+ */
+ deps_clauses = fdeps_reduce_clauses(deps_clauses,
+ deps_attnums, deps_matrix,
+ deps_idx_to_attnum,
+ deps_attnum_to_idx, relid);
+
+ /* join the two lists of clauses */
+ reduced_clauses = list_union(reduced_clauses, deps_clauses);
+
+ pfree(deps_matrix);
+ pfree(deps_idx_to_attnum);
+ pfree(deps_attnum_to_idx);
+
+ bms_free(deps_attnums);
+ bms_free(clause_attnums);
+ bms_free(intersect_attnums);
+
+ return reduced_clauses;
+}
+
+static bool
+has_stats(List *stats, int type)
+{
+ ListCell *s;
+
+ foreach (s, stats)
+ {
+ MVStatisticInfo *stat = (MVStatisticInfo *)lfirst(s);
+
+ if ((type & MV_CLAUSE_TYPE_FDEP) && stat->deps_built)
+ return true;
+ }
+
+ return false;
+}
+
+/*
+ * Determine the relid (either from varRelid or from the clauses) and
+ * then look up stats using the relid.
+ */
+static List *
+find_stats(PlannerInfo *root, List *clauses, Oid varRelid, Index *relid)
+{
+ /* unknown relid by default */
+ *relid = InvalidOid;
+
+ /*
+ * First we need to find the relid (index into simple_rel_array).
+ * If varRelid is not 0, we already have it, otherwise we have to
+ * look it up from the clauses.
+ */
+ if (varRelid != 0)
+ *relid = varRelid;
+ else
+ {
+ Relids relids = pull_varnos((Node*)clauses);
+
+ /*
+ * We only expect 0 or 1 members in the bitmapset. If there are
+ * no vars, we'll get an empty bitmapset, otherwise we'll get the
+ * relid as the single member.
+ *
+ * FIXME For some reason we can get 2 relids here (e.g. \d in
+ * psql does that).
+ */
+ if (bms_num_members(relids) == 1)
+ *relid = bms_singleton_member(relids);
+
+ bms_free(relids);
+ }
+
+ /*
+ * if we found the relid, we can get the stats from simple_rel_array
+ *
+ * This only gets stats that are already built, because that's how
+ * we load it into RelOptInfo (see get_relation_info), but we don't
+ * detoast the whole stats yet. That'll be done later, after we
+ * decide which stats to use.
+ */
+ if (*relid != InvalidOid)
+ return root->simple_rel_array[*relid]->mvstatlist;
+
+ return NIL;
+}
+
+static Bitmapset*
+fdeps_collect_attnums(List *stats)
+{
+ ListCell *lc;
+ Bitmapset *attnums = NULL;
+
+ foreach (lc, stats)
+ {
+ int j;
+ MVStatisticInfo *info = (MVStatisticInfo *)lfirst(lc);
+
+ int2vector *stakeys = info->stakeys;
+
+ /* skip stats without functional dependencies built */
+ if (! info->deps_built)
+ continue;
+
+ for (j = 0; j < stakeys->dim1; j++)
+ attnums = bms_add_member(attnums, stakeys->values[j]);
+ }
+
+ return attnums;
+}
+
+
+static int*
+make_idx_to_attnum_mapping(Bitmapset *attnums)
+{
+ int attidx = 0;
+ int attnum = -1;
+
+ int *mapping = (int*)palloc0(bms_num_members(attnums) * sizeof(int));
+
+ while ((attnum = bms_next_member(attnums, attnum)) >= 0)
+ mapping[attidx++] = attnum;
+
+ Assert(attidx == bms_num_members(attnums));
+
+ return mapping;
+}
+
+static int*
+make_attnum_to_idx_mapping(Bitmapset *attnums)
+{
+ int attidx = 0;
+ int attnum = -1;
+ int maxattnum = -1;
+ int *mapping;
+
+ while ((attnum = bms_next_member(attnums, attnum)) >= 0)
+ maxattnum = attnum;
+
+ mapping = (int*)palloc0((maxattnum+1) * sizeof(int));
+
+ attnum = -1;
+ while ((attnum = bms_next_member(attnums, attnum)) >= 0)
+ mapping[attnum] = attidx++;
+
+ Assert(attidx == bms_num_members(attnums));
+
+ return mapping;
+}
+
+static bool*
+build_adjacency_matrix(List *stats, Bitmapset *attnums,
+ int *idx_to_attnum, int *attnum_to_idx)
+{
+ ListCell *lc;
+ int natts = bms_num_members(attnums);
+ bool *matrix = (bool*)palloc0(natts * natts * sizeof(bool));
+
+ foreach (lc, stats)
+ {
+ int j;
+ MVStatisticInfo *stat = (MVStatisticInfo *)lfirst(lc);
+ MVDependencies dependencies = NULL;
+
+ /* skip stats without functional dependencies built */
+ if (! stat->deps_built)
+ continue;
+
+ /* fetch and deserialize dependencies */
+ dependencies = load_mv_dependencies(stat->mvoid);
+ if (dependencies == NULL)
+ {
+ elog(WARNING, "failed to deserialize func deps %d", stat->mvoid);
+ continue;
+ }
+
+ /* set matrix[a,b] to 'true' if 'a=>b' */
+ for (j = 0; j < dependencies->ndeps; j++)
+ {
+ int aidx = attnum_to_idx[dependencies->deps[j]->a];
+ int bidx = attnum_to_idx[dependencies->deps[j]->b];
+
+ /* a => b */
+ matrix[aidx * natts + bidx] = true;
+ }
+ }
+
+ return matrix;
+}
+
+static void
+multiply_adjacency_matrix(bool *matrix, int natts)
+{
+ int i;
+
+ for (i = 0; i < natts; i++)
+ {
+ int k, l, m;
+ int nchanges = 0;
+
+ /* k => l */
+ for (k = 0; k < natts; k++)
+ {
+ for (l = 0; l < natts; l++)
+ {
+ /* we already have this dependency */
+ if (matrix[k * natts + l])
+ continue;
+
+ /* we don't really care about the exact value, just 0/1 */
+ for (m = 0; m < natts; m++)
+ {
+ if (matrix[k * natts + m] * matrix[m * natts + l])
+ {
+ matrix[k * natts + l] = true;
+ nchanges += 1;
+ break;
+ }
+ }
+ }
+ }
+
+ /* no transitive dependency added here, so terminate */
+ if (nchanges == 0)
+ break;
+ }
+}
+
+static List*
+fdeps_reduce_clauses(List *clauses, Bitmapset *attnums, bool *matrix,
+ int *idx_to_attnum, int *attnum_to_idx, Index relid)
+{
+ int i;
+ ListCell *lc;
+ List *reduced_clauses = NIL;
+
+ int nmvclauses; /* size of the arrays */
+ bool *reduced;
+ AttrNumber *mvattnums;
+ Node **mvclauses;
+
+ int natts = bms_num_members(attnums);
+
+ /*
+ * Preallocate space for all clauses (the list only contains
+ * compatible clauses at this point). This makes it somewhat easier
+ * to access the stats / attnums randomly.
+ *
+ * XXX This assumes each clause references exactly one Var, so the
+ * arrays are sized accordingly - for functional dependencies
+ * this is safe, because it only works with Var=Const.
+ */
+ mvclauses = (Node**)palloc0(list_length(clauses) * sizeof(Node*));
+ mvattnums = (AttrNumber*)palloc0(list_length(clauses) * sizeof(AttrNumber));
+ reduced = (bool*)palloc0(list_length(clauses) * sizeof(bool));
+
+ /* fill the arrays */
+ nmvclauses = 0;
+ foreach (lc, clauses)
+ {
+ Node * clause = (Node*)lfirst(lc);
+ Bitmapset * clause_attnums = get_varattnos(clause, relid);
+
+ mvclauses[nmvclauses] = clause;
+ mvattnums[nmvclauses] = bms_singleton_member(clause_attnums);
+ nmvclauses++;
+ }
+
+ Assert(nmvclauses == list_length(clauses));
+
+ /* now try to reduce the clauses (using the dependencies) */
+ for (i = 0; i < nmvclauses; i++)
+ {
+ int j;
+
+ /* not covered by dependencies */
+ if (! bms_is_member(mvattnums[i], attnums))
+ continue;
+
+ /* this clause was already reduced, so let's skip it */
+ if (reduced[i])
+ continue;
+
+ /* walk the potentially 'implied' clauses */
+ for (j = 0; j < nmvclauses; j++)
+ {
+ int aidx, bidx;
+
+ /* not covered by dependencies */
+ if (! bms_is_member(mvattnums[j], attnums))
+ continue;
+
+ aidx = attnum_to_idx[mvattnums[i]];
+ bidx = attnum_to_idx[mvattnums[j]];
+
+ /* can't reduce the clause by itself, or if already reduced */
+ if ((i == j) || reduced[j])
+ continue;
+
+ /* mark the clause as reduced (if aidx => bidx) */
+ reduced[j] = matrix[aidx * natts + bidx];
+ }
+ }
+
+ /* now walk through the clauses, and keep only those not reduced */
+ for (i = 0; i < nmvclauses; i++)
+ if (! reduced[i])
+ reduced_clauses = lappend(reduced_clauses, mvclauses[i]);
+
+ pfree(reduced);
+ pfree(mvclauses);
+ pfree(mvattnums);
+
+ return reduced_clauses;
+}
+
+
+static Bitmapset *
+fdeps_filter_clauses(PlannerInfo *root,
+ List *clauses, Bitmapset *deps_attnums,
+ List **reduced_clauses, List **deps_clauses,
+ Oid varRelid, Index *relid, SpecialJoinInfo *sjinfo)
+{
+ ListCell *lc;
+ Bitmapset *clause_attnums = NULL;
+
+ foreach (lc, clauses)
+ {
+ AttrNumber attnum;
+ Node *clause = (Node *) lfirst(lc);
+
+ if (! clause_is_mv_compatible(root, clause, varRelid, relid,
+ &attnum, sjinfo))
+
+ /* clause incompatible with functional dependencies */
+ *reduced_clauses = lappend(*reduced_clauses, clause);
+
+ else if (! bms_is_member(attnum, deps_attnums))
+
+ /* clause not covered by the dependencies */
+ *reduced_clauses = lappend(*reduced_clauses, clause);
+
+ else
+ {
+ *deps_clauses = lappend(*deps_clauses, clause);
+ clause_attnums = bms_add_member(clause_attnums, attnum);
+ }
+ }
+
+ return clause_attnums;
+}
+
+/*
+ * Pull varattnos from the clauses, similarly to pull_varattnos() but:
+ *
+ * (a) only get attributes for a particular relation (relid)
+ * (b) ignore system attributes (we can't build stats on them anyway)
+ *
+ * This makes it possible to directly compare the result with attnum
+ * values from pg_attribute etc.
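+ *
+ * For example, pull_varattnos() stores varattno 'k' as the bitmap
+ * member (k - FirstLowInvalidHeapAttributeNumber); adding the offset
+ * back below recovers the plain attnum.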
+ */
+static Bitmapset *
+get_varattnos(Node * node, Index relid)
+{
+ int k;
+ Bitmapset *varattnos = NULL;
+ Bitmapset *result = NULL;
+
+ /* get the varattnos */
+ pull_varattnos(node, relid, &varattnos);
+
+ k = -1;
+ while ((k = bms_next_member(varattnos, k)) >= 0)
+ {
+ if (k + FirstLowInvalidHeapAttributeNumber > 0)
+ result = bms_add_member(result,
+ k + FirstLowInvalidHeapAttributeNumber);
+ }
+
+ bms_free(varattnos);
+
+ return result;
+}
diff --git a/src/backend/utils/mvstats/common.c b/src/backend/utils/mvstats/common.c
index a755c49..bd200bc 100644
--- a/src/backend/utils/mvstats/common.c
+++ b/src/backend/utils/mvstats/common.c
@@ -84,7 +84,8 @@ build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
/*
* Analyze functional dependencies of columns.
*/
- deps = build_mv_dependencies(numrows, rows, attrs, stats);
+ if (stat->deps_enabled)
+ deps = build_mv_dependencies(numrows, rows, attrs, stats);
/* store the histogram / MCV list in the catalog */
update_mv_stats(stat->mvoid, deps, attrs);
@@ -163,6 +164,7 @@ list_mv_stats(Oid relid)
info->mvoid = HeapTupleGetOid(htup);
info->stakeys = buildint2vector(stats->stakeys.values, stats->stakeys.dim1);
+ info->deps_enabled = stats->deps_enabled;
info->deps_built = stats->deps_built;
result = lappend(result, info);
@@ -274,6 +276,7 @@ compare_scalars_partition(const void *a, const void *b, void *arg)
return ApplySortComparator(da, false, db, false, ssup);
}
+
/* initialize multi-dimensional sort */
MultiSortSupport
multi_sort_init(int ndims)
diff --git a/src/backend/utils/mvstats/dependencies.c b/src/backend/utils/mvstats/dependencies.c
index 84b6561..0a08d12 100644
--- a/src/backend/utils/mvstats/dependencies.c
+++ b/src/backend/utils/mvstats/dependencies.c
@@ -636,3 +636,27 @@ pg_mv_stats_dependencies_show(PG_FUNCTION_ARGS)
PG_RETURN_TEXT_P(cstring_to_text(result));
}
+
+MVDependencies
+load_mv_dependencies(Oid mvoid)
+{
+ bool isnull = false;
+ Datum deps;
+
+ /* Fetch the pg_mv_statistic tuple for the statistics with this OID. */
+ HeapTuple htup = SearchSysCache1(MVSTATOID, ObjectIdGetDatum(mvoid));
+
+#ifdef USE_ASSERT_CHECKING
+ Form_pg_mv_statistic mvstat = (Form_pg_mv_statistic) GETSTRUCT(htup);
+ Assert(mvstat->deps_enabled && mvstat->deps_built);
+#endif
+
+ deps = SysCacheGetAttr(MVSTATOID, htup,
+ Anum_pg_mv_statistic_stadeps, &isnull);
+
+ Assert(!isnull);
+
+ ReleaseSysCache(htup);
+
+ return deserialize_mv_dependencies(DatumGetByteaP(deps));
+}
diff --git a/src/include/utils/mvstats.h b/src/include/utils/mvstats.h
index 411cd16..02a7dda 100644
--- a/src/include/utils/mvstats.h
+++ b/src/include/utils/mvstats.h
@@ -16,12 +16,20 @@
#include "commands/vacuum.h"
+/*
+ * Degree to which an MCV item / histogram bucket matches a clause.
+ * This is then considered when computing the selectivity.
+ */
+#define MVSTATS_MATCH_NONE 0 /* no match at all */
+#define MVSTATS_MATCH_PARTIAL 1 /* partial match */
+#define MVSTATS_MATCH_FULL 2 /* full match */
#define MVSTATS_MAX_DIMENSIONS 8 /* max number of attributes */
-/* An associative rule, tracking [a => b] dependency.
- *
- * TODO Make this work with multiple columns on both sides.
+
+/*
+ * Functional dependencies, tracking column-level relationships (values
+ * in one column determine values in another one).
*/
typedef struct MVDependencyData {
int16 a;
@@ -47,6 +55,8 @@ typedef MVDependenciesData* MVDependencies;
* stats specified using flags (or something like that).
*/
+MVDependencies load_mv_dependencies(Oid mvoid);
+
bytea * serialize_mv_dependencies(MVDependencies dependencies);
/* deserialization of stats (serialization is private to analyze) */
diff --git a/src/test/regress/expected/mv_dependencies.out b/src/test/regress/expected/mv_dependencies.out
new file mode 100644
index 0000000..e759997
--- /dev/null
+++ b/src/test/regress/expected/mv_dependencies.out
@@ -0,0 +1,172 @@
+-- data type passed by value
+CREATE TABLE functional_dependencies (
+ a INT,
+ b INT,
+ c INT
+);
+-- unknown column
+CREATE STATISTICS s1 ON functional_dependencies (unknown_column) WITH (dependencies);
+ERROR: column "unknown_column" referenced in statistics does not exist
+-- single column
+CREATE STATISTICS s1 ON functional_dependencies (a) WITH (dependencies);
+ERROR: multivariate stats require 2 or more columns
+-- single column, duplicated
+CREATE STATISTICS s1 ON functional_dependencies (a,a) WITH (dependencies);
+ERROR: duplicate column name in statistics definition
+-- two columns, one duplicated
+CREATE STATISTICS s1 ON functional_dependencies (a, a, b) WITH (dependencies);
+ERROR: duplicate column name in statistics definition
+-- unknown option
+CREATE STATISTICS s1 ON functional_dependencies (a, b, c) WITH (unknown_option);
+ERROR: unrecognized STATISTICS option "unknown_option"
+-- correct command
+CREATE STATISTICS s1 ON functional_dependencies (a, b, c) WITH (dependencies);
+-- random data (no functional dependencies)
+INSERT INTO functional_dependencies
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+ANALYZE functional_dependencies;
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+ deps_enabled | deps_built | pg_mv_stats_dependencies_show
+--------------+------------+-------------------------------
+ t | f |
+(1 row)
+
+TRUNCATE functional_dependencies;
+-- a => b, a => c, b => c
+INSERT INTO functional_dependencies
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE functional_dependencies;
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+ deps_enabled | deps_built | pg_mv_stats_dependencies_show
+--------------+------------+-------------------------------
+ t | t | 1 => 2, 1 => 3, 2 => 3
+(1 row)
+
+TRUNCATE functional_dependencies;
+-- a => b, a => c
+INSERT INTO functional_dependencies
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE functional_dependencies;
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+ deps_enabled | deps_built | pg_mv_stats_dependencies_show
+--------------+------------+-------------------------------
+ t | t | 1 => 2, 1 => 3
+(1 row)
+
+TRUNCATE functional_dependencies;
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO functional_dependencies
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX fdeps_idx ON functional_dependencies (a, b);
+ANALYZE functional_dependencies;
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+ deps_enabled | deps_built | pg_mv_stats_dependencies_show
+--------------+------------+-------------------------------
+ t | t | 1 => 2, 1 => 3, 2 => 3
+(1 row)
+
+EXPLAIN (COSTS off)
+ SELECT * FROM functional_dependencies WHERE a = 10 AND b = 5;
+ QUERY PLAN
+---------------------------------------------
+ Bitmap Heap Scan on functional_dependencies
+ Recheck Cond: ((a = 10) AND (b = 5))
+ -> Bitmap Index Scan on fdeps_idx
+ Index Cond: ((a = 10) AND (b = 5))
+(4 rows)
+
+DROP TABLE functional_dependencies;
+-- varlena type (text)
+CREATE TABLE functional_dependencies (
+ a TEXT,
+ b TEXT,
+ c TEXT
+);
+CREATE STATISTICS s2 ON functional_dependencies (a, b, c) WITH (dependencies);
+-- random data (no functional dependencies)
+INSERT INTO functional_dependencies
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+ANALYZE functional_dependencies;
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+ deps_enabled | deps_built | pg_mv_stats_dependencies_show
+--------------+------------+-------------------------------
+ t | f |
+(1 row)
+
+TRUNCATE functional_dependencies;
+-- a => b, a => c, b => c
+INSERT INTO functional_dependencies
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE functional_dependencies;
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+ deps_enabled | deps_built | pg_mv_stats_dependencies_show
+--------------+------------+-------------------------------
+ t | t | 1 => 2, 1 => 3, 2 => 3
+(1 row)
+
+TRUNCATE functional_dependencies;
+-- a => b, a => c
+INSERT INTO functional_dependencies
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE functional_dependencies;
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+ deps_enabled | deps_built | pg_mv_stats_dependencies_show
+--------------+------------+-------------------------------
+ t | t | 1 => 2, 1 => 3
+(1 row)
+
+TRUNCATE functional_dependencies;
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO functional_dependencies
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX fdeps_idx ON functional_dependencies (a, b);
+ANALYZE functional_dependencies;
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+ deps_enabled | deps_built | pg_mv_stats_dependencies_show
+--------------+------------+-------------------------------
+ t | t | 1 => 2, 1 => 3, 2 => 3
+(1 row)
+
+EXPLAIN (COSTS off)
+ SELECT * FROM functional_dependencies WHERE a = '10' AND b = '5';
+ QUERY PLAN
+------------------------------------------------------------
+ Bitmap Heap Scan on functional_dependencies
+ Recheck Cond: ((a = '10'::text) AND (b = '5'::text))
+ -> Bitmap Index Scan on fdeps_idx
+ Index Cond: ((a = '10'::text) AND (b = '5'::text))
+(4 rows)
+
+DROP TABLE functional_dependencies;
+-- NULL values (mix of int and text columns)
+CREATE TABLE functional_dependencies (
+ a INT,
+ b TEXT,
+ c INT,
+ d TEXT
+);
+CREATE STATISTICS s3 ON functional_dependencies (a, b, c, d) WITH (dependencies);
+INSERT INTO functional_dependencies
+ SELECT
+ mod(i, 100),
+ (CASE WHEN mod(i, 200) = 0 THEN NULL ELSE mod(i,200) END),
+ mod(i, 400),
+ (CASE WHEN mod(i, 300) = 0 THEN NULL ELSE mod(i,600) END)
+ FROM generate_series(1,10000) s(i);
+ANALYZE functional_dependencies;
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+ deps_enabled | deps_built | pg_mv_stats_dependencies_show
+--------------+------------+----------------------------------------
+ t | t | 2 => 1, 3 => 1, 3 => 2, 4 => 1, 4 => 2
+(1 row)
+
+DROP TABLE functional_dependencies;
diff --git a/src/test/regress/parallel_schedule b/src/test/regress/parallel_schedule
index b1bc7c7..81484f1 100644
--- a/src/test/regress/parallel_schedule
+++ b/src/test/regress/parallel_schedule
@@ -110,3 +110,6 @@ test: event_trigger
# run stats by itself because its delay may be insufficient under heavy load
test: stats
+
+# run tests of multivariate stats
+test: mv_dependencies
diff --git a/src/test/regress/serial_schedule b/src/test/regress/serial_schedule
index ade9ef1..14ea574 100644
--- a/src/test/regress/serial_schedule
+++ b/src/test/regress/serial_schedule
@@ -161,3 +161,4 @@ test: with
test: xml
test: event_trigger
test: stats
+test: mv_dependencies
diff --git a/src/test/regress/sql/mv_dependencies.sql b/src/test/regress/sql/mv_dependencies.sql
new file mode 100644
index 0000000..48dea4d
--- /dev/null
+++ b/src/test/regress/sql/mv_dependencies.sql
@@ -0,0 +1,150 @@
+-- data type passed by value
+CREATE TABLE functional_dependencies (
+ a INT,
+ b INT,
+ c INT
+);
+
+-- unknown column
+CREATE STATISTICS s1 ON functional_dependencies (unknown_column) WITH (dependencies);
+
+-- single column
+CREATE STATISTICS s1 ON functional_dependencies (a) WITH (dependencies);
+
+-- single column, duplicated
+CREATE STATISTICS s1 ON functional_dependencies (a,a) WITH (dependencies);
+
+-- two columns, one duplicated
+CREATE STATISTICS s1 ON functional_dependencies (a, a, b) WITH (dependencies);
+
+-- unknown option
+CREATE STATISTICS s1 ON functional_dependencies (a, b, c) WITH (unknown_option);
+
+-- correct command
+CREATE STATISTICS s1 ON functional_dependencies (a, b, c) WITH (dependencies);
+
+-- random data (no functional dependencies)
+INSERT INTO functional_dependencies
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+
+ANALYZE functional_dependencies;
+
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+
+TRUNCATE functional_dependencies;
+
+-- a => b, a => c, b => c
+INSERT INTO functional_dependencies
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+
+ANALYZE functional_dependencies;
+
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+
+TRUNCATE functional_dependencies;
+
+-- a => b, a => c
+INSERT INTO functional_dependencies
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE functional_dependencies;
+
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+
+TRUNCATE functional_dependencies;
+
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO functional_dependencies
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX fdeps_idx ON functional_dependencies (a, b);
+ANALYZE functional_dependencies;
+
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+
+EXPLAIN (COSTS off)
+ SELECT * FROM functional_dependencies WHERE a = 10 AND b = 5;
+
+DROP TABLE functional_dependencies;
+
+-- varlena type (text)
+CREATE TABLE functional_dependencies (
+ a TEXT,
+ b TEXT,
+ c TEXT
+);
+
+CREATE STATISTICS s2 ON functional_dependencies (a, b, c) WITH (dependencies);
+
+-- random data (no functional dependencies)
+INSERT INTO functional_dependencies
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+
+ANALYZE functional_dependencies;
+
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+
+TRUNCATE functional_dependencies;
+
+-- a => b, a => c, b => c
+INSERT INTO functional_dependencies
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+
+ANALYZE functional_dependencies;
+
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+
+TRUNCATE functional_dependencies;
+
+-- a => b, a => c
+INSERT INTO functional_dependencies
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE functional_dependencies;
+
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+
+TRUNCATE functional_dependencies;
+
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO functional_dependencies
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX fdeps_idx ON functional_dependencies (a, b);
+ANALYZE functional_dependencies;
+
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+
+EXPLAIN (COSTS off)
+ SELECT * FROM functional_dependencies WHERE a = '10' AND b = '5';
+
+DROP TABLE functional_dependencies;
+
+-- NULL values (mix of int and text columns)
+CREATE TABLE functional_dependencies (
+ a INT,
+ b TEXT,
+ c INT,
+ d TEXT
+);
+
+CREATE STATISTICS s3 ON functional_dependencies (a, b, c, d) WITH (dependencies);
+
+INSERT INTO functional_dependencies
+ SELECT
+ mod(i, 100),
+ (CASE WHEN mod(i, 200) = 0 THEN NULL ELSE mod(i,200) END),
+ mod(i, 400),
+ (CASE WHEN mod(i, 300) = 0 THEN NULL ELSE mod(i,600) END)
+ FROM generate_series(1,10000) s(i);
+
+ANALYZE functional_dependencies;
+
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+
+DROP TABLE functional_dependencies;
--
2.1.0
Attachment: 0004-multivariate-MCV-lists.patch (text/x-diff)
From 1ce724f8813c5e680be3b845a6a8d2d3cf8f3560 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@pgaddict.com>
Date: Mon, 6 Apr 2015 16:52:15 +0200
Subject: [PATCH 4/7] multivariate MCV lists
- extends the pg_mv_statistic catalog (add 'mcv' fields)
- building the MCV lists during ANALYZE
- simple estimation while planning the queries
Includes regression tests, mostly equal to regression tests for
functional dependencies.
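
A minimal usage sketch (hypothetical table, statistics name and item
count; the exact limits depend on MVSTAT_MCVLIST_MIN_ITEMS and
MVSTAT_MCVLIST_MAX_ITEMS, see statscmds.c below):

    CREATE TABLE t (a INT, b INT);
    CREATE STATISTICS s4 ON t (a, b) WITH (mcv, max_mcv_items = 1024);
    ANALYZE t;

    -- inspect the built MCV list via the extended pg_mv_stats view
    SELECT staname, mcvbytes, mcvinfo FROM pg_mv_stats;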
---
src/backend/catalog/system_views.sql | 4 +-
src/backend/commands/statscmds.c | 45 +-
src/backend/nodes/outfuncs.c | 2 +
src/backend/optimizer/path/clausesel.c | 1079 ++++++++++++++++++++++++++--
src/backend/optimizer/util/plancat.c | 4 +-
src/backend/utils/mvstats/Makefile | 2 +-
src/backend/utils/mvstats/common.c | 104 ++-
src/backend/utils/mvstats/common.h | 11 +-
src/backend/utils/mvstats/mcv.c | 1237 ++++++++++++++++++++++++++++++++
src/bin/psql/describe.c | 25 +-
src/include/catalog/pg_mv_statistic.h | 19 +-
src/include/catalog/pg_proc.h | 4 +
src/include/nodes/relation.h | 2 +
src/include/utils/mvstats.h | 69 +-
src/test/regress/expected/mv_mcv.out | 207 ++++++
src/test/regress/expected/rules.out | 4 +-
src/test/regress/parallel_schedule | 2 +-
src/test/regress/serial_schedule | 1 +
src/test/regress/sql/mv_mcv.sql | 178 +++++
19 files changed, 2898 insertions(+), 101 deletions(-)
create mode 100644 src/backend/utils/mvstats/mcv.c
create mode 100644 src/test/regress/expected/mv_mcv.out
create mode 100644 src/test/regress/sql/mv_mcv.sql
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index e3f3387..6482aa7 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -165,7 +165,9 @@ CREATE VIEW pg_mv_stats AS
S.staname AS staname,
S.stakeys AS attnums,
length(S.stadeps) as depsbytes,
- pg_mv_stats_dependencies_info(S.stadeps) as depsinfo
+ pg_mv_stats_dependencies_info(S.stadeps) as depsinfo,
+ length(S.stamcv) AS mcvbytes,
+ pg_mv_stats_mcvlist_info(S.stamcv) AS mcvinfo
FROM (pg_mv_statistic S JOIN pg_class C ON (C.oid = S.starelid))
LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace);
diff --git a/src/backend/commands/statscmds.c b/src/backend/commands/statscmds.c
index 3790082..f730253 100644
--- a/src/backend/commands/statscmds.c
+++ b/src/backend/commands/statscmds.c
@@ -134,7 +134,13 @@ CreateStatistics(CreateStatsStmt *stmt)
ObjectAddress parentobject, childobject;
/* by default build nothing */
- bool build_dependencies = false;
+ bool build_dependencies = false,
+ build_mcv = false;
+
+ int32 max_mcv_items = -1;
+
+ /* options required because of other options */
+ bool require_mcv = false;
Assert(IsA(stmt, CreateStatsStmt));
@@ -191,6 +197,29 @@ CreateStatistics(CreateStatsStmt *stmt)
if (strcmp(opt->defname, "dependencies") == 0)
build_dependencies = defGetBoolean(opt);
+ else if (strcmp(opt->defname, "mcv") == 0)
+ build_mcv = defGetBoolean(opt);
+ else if (strcmp(opt->defname, "max_mcv_items") == 0)
+ {
+ max_mcv_items = defGetInt32(opt);
+
+ /* this option requires 'mcv' to be enabled */
+ require_mcv = true;
+
+ /* sanity check */
+ if (max_mcv_items < MVSTAT_MCVLIST_MIN_ITEMS)
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("max number of MCV items must be at least %d",
+ MVSTAT_MCVLIST_MIN_ITEMS)));
+
+ else if (max_mcv_items > MVSTAT_MCVLIST_MAX_ITEMS)
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("max number of MCV items is %d",
+ MVSTAT_MCVLIST_MAX_ITEMS)));
+
+ }
else
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
@@ -199,10 +228,16 @@ CreateStatistics(CreateStatsStmt *stmt)
}
/* check that at least some statistics were requested */
- if (! build_dependencies)
+ if (! (build_dependencies || build_mcv))
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("no statistics type (dependencies, mcv) was requested")));
+
+ /* now do some checking of the options */
+ if (require_mcv && (! build_mcv))
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
- errmsg("no statistics type (dependencies) was requested")));
+ errmsg("option 'mcv' is required by other option(s)")));
/* sort the attnums and build int2vector */
qsort(attnums, numcols, sizeof(int16), compare_int16);
@@ -223,8 +258,12 @@ CreateStatistics(CreateStatsStmt *stmt)
values[Anum_pg_mv_statistic_stakeys -1] = PointerGetDatum(stakeys);
values[Anum_pg_mv_statistic_deps_enabled -1] = BoolGetDatum(build_dependencies);
+ values[Anum_pg_mv_statistic_mcv_enabled -1] = BoolGetDatum(build_mcv);
+
+ values[Anum_pg_mv_statistic_mcv_max_items -1] = Int32GetDatum(max_mcv_items);
nulls[Anum_pg_mv_statistic_stadeps -1] = true;
+ nulls[Anum_pg_mv_statistic_stamcv -1] = true;
/* insert the tuple into pg_mv_statistic */
mvstatrel = heap_open(MvStatisticRelationId, RowExclusiveLock);
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index cae21d0..0f58199 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -1948,9 +1948,11 @@ _outMVStatisticInfo(StringInfo str, const MVStatisticInfo *node)
/* enabled statistics */
WRITE_BOOL_FIELD(deps_enabled);
+ WRITE_BOOL_FIELD(mcv_enabled);
/* built/available statistics */
WRITE_BOOL_FIELD(deps_built);
+ WRITE_BOOL_FIELD(mcv_built);
}
static void
diff --git a/src/backend/optimizer/path/clausesel.c b/src/backend/optimizer/path/clausesel.c
index c7f17e3..f122045 100644
--- a/src/backend/optimizer/path/clausesel.c
+++ b/src/backend/optimizer/path/clausesel.c
@@ -15,6 +15,7 @@
#include "postgres.h"
#include "access/sysattr.h"
+#include "catalog/pg_collation.h"
#include "catalog/pg_operator.h"
#include "nodes/makefuncs.h"
#include "optimizer/clauses.h"
@@ -47,17 +48,38 @@ static void addRangeClause(RangeQueryClause **rqlist, Node *clause,
bool varonleft, bool isLTsel, Selectivity s2);
#define MV_CLAUSE_TYPE_FDEP 0x01
+#define MV_CLAUSE_TYPE_MCV 0x02
static bool clause_is_mv_compatible(PlannerInfo *root, Node *clause, Oid varRelid,
- Index *relid, AttrNumber *attnum, SpecialJoinInfo *sjinfo);
+ Index *relid, Bitmapset **attnums, SpecialJoinInfo *sjinfo,
+ int type);
static Bitmapset *collect_mv_attnums(PlannerInfo *root, List *clauses,
- Oid varRelid, Index *relid, SpecialJoinInfo *sjinfo);
+ Oid varRelid, Index *relid, SpecialJoinInfo *sjinfo,
+ int type);
static List *clauselist_apply_dependencies(PlannerInfo *root, List *clauses,
Oid varRelid, List *stats,
SpecialJoinInfo *sjinfo);
+static MVStatisticInfo *choose_mv_statistics(List *mvstats, Bitmapset *attnums);
+
+static List *clauselist_mv_split(PlannerInfo *root, SpecialJoinInfo *sjinfo,
+ List *clauses, Oid varRelid,
+ List **mvclauses, MVStatisticInfo *mvstats, int types);
+
+static Selectivity clauselist_mv_selectivity(PlannerInfo *root,
+ List *clauses, MVStatisticInfo *mvstats);
+static Selectivity clauselist_mv_selectivity_mcvlist(PlannerInfo *root,
+ List *clauses, MVStatisticInfo *mvstats,
+ bool *fullmatch, Selectivity *lowsel);
+
+static int update_match_bitmap_mcvlist(PlannerInfo *root, List *clauses,
+ int2vector *stakeys, MCVList mcvlist,
+ int nmatches, char * matches,
+ Selectivity *lowsel, bool *fullmatch,
+ bool is_or);
+
static bool has_stats(List *stats, int type);
static List * find_stats(PlannerInfo *root, List *clauses,
@@ -85,6 +107,13 @@ static Bitmapset *fdeps_filter_clauses(PlannerInfo *root,
static Bitmapset * get_varattnos(Node * node, Index relid);
+/* used for merging bitmaps - AND (min), OR (max) */
+#define MAX(x, y) (((x) > (y)) ? (x) : (y))
+#define MIN(x, y) (((x) < (y)) ? (x) : (y))
+
+#define UPDATE_RESULT(m,r,isor) \
+ (m) = (isor) ? (MAX(m,r)) : (MIN(m,r))
+
/****************************************************************************
* ROUTINES TO COMPUTE SELECTIVITIES
****************************************************************************/
@@ -250,8 +279,12 @@ clauselist_selectivity(PlannerInfo *root,
*/
if (has_stats(stats, MV_CLAUSE_TYPE_FDEP))
{
- /* collect attributes referenced by mv-compatible clauses */
- mvattnums = collect_mv_attnums(root, clauses, varRelid, &relid, sjinfo);
+ /*
+ * Collect attributes referenced by mv-compatible clauses (looking
+ * for clauses compatible with functional dependencies for now).
+ */
+ mvattnums = collect_mv_attnums(root, clauses, varRelid, &relid, sjinfo,
+ MV_CLAUSE_TYPE_FDEP);
/*
* If there are mv-compatible clauses, referencing at least two
@@ -268,6 +301,48 @@ clauselist_selectivity(PlannerInfo *root,
}
/*
+ * Check that there are statistics with an MCV list. If not, we don't
+ * need to waste time with the optimization.
+ */
+ if (has_stats(stats, MV_CLAUSE_TYPE_MCV))
+ {
+ /*
+ * Recollect attributes from mv-compatible clauses (maybe we've
+ * removed so many clauses we have a single mv-compatible attnum).
+ * From now on we're only interested in MCV-compatible clauses.
+ */
+ mvattnums = collect_mv_attnums(root, clauses, varRelid, &relid, sjinfo,
+ MV_CLAUSE_TYPE_MCV);
+
+ /*
+ * If there still are at least two columns, we'll try to select
+ * suitable multivariate statistics.
+ */
+ if (bms_num_members(mvattnums) >= 2)
+ {
+ /* see choose_mv_statistics() for details */
+ MVStatisticInfo *mvstat = choose_mv_statistics(stats, mvattnums);
+
+ if (mvstat != NULL) /* we have a matching stats */
+ {
+ /* clauses compatible with multi-variate stats */
+ List *mvclauses = NIL;
+
+ /* split the clauselist into regular and mv-clauses */
+ clauses = clauselist_mv_split(root, sjinfo, clauses,
+ varRelid, &mvclauses, mvstat,
+ MV_CLAUSE_TYPE_MCV);
+
+ /* we've chosen the statistics to match the clauses */
+ Assert(mvclauses != NIL);
+
+ /* compute the multivariate stats */
+ s1 *= clauselist_mv_selectivity(root, mvclauses, mvstat);
+ }
+ }
+ }
+
+ /*
* Initial scan over clauses. Anything that doesn't look like a potential
* rangequery clause gets multiplied into s1 and forgotten. Anything that
* does gets inserted into an rqlist entry.
@@ -924,12 +999,129 @@ clause_selectivity(PlannerInfo *root,
return s1;
}
+
+/*
+ * Estimate selectivity for the list of MV-compatible clauses, using
+ * multivariate statistics (combining a histogram and MCV list).
+ *
+ * This simply passes the estimation to the MCV list and then to the
+ * histogram, if available.
+ *
+ * TODO Clamp the selectivity by min of the per-clause selectivities
+ * (i.e. the selectivity of the most restrictive clause), because
+ * that's the maximum we can ever get from an ANDed list of clauses.
+ * This should help prevent issues with hitting too many buckets
+ * and low-precision histograms.
+ *
+ * TODO We may support some additional conditions, most importantly
+ * those matching multiple columns (e.g. "a = b" or "a < b").
+ * Ultimately we could track multi-table histograms for join
+ * cardinality estimation.
+ *
+ * TODO Further thoughts on processing equality clauses: Maybe it'd be
+ * better to look for stats (with MCV) covered by the equality
+ * clauses, because then we have a chance to find an exact match
+ * in the MCV list, which is pretty much the best we can do. We may
+ * also look at the least frequent MCV item, and use it as an upper
+ * boundary for the selectivity (had there been a more frequent
+ * item, it'd be in the MCV list).
+ *
+ * TODO There are several options for 'sanity clamping' the estimates.
+ *
+ * First, if we have selectivities for each condition, then
+ *
+ * P(A,B) <= MIN(P(A), P(B))
+ *
+ * Because additional conditions (connected by AND) can only lower
+ * the probability.
+ *
+ * So we can do some basic sanity checks using the single-variate
+ * stats (the ones we have right now).
+ *
+ * Second, when we have multivariate stats with a MCV list, then
+ *
+ * (a) if we have a full equality condition (one equality condition
+ * on each column) and we found a match in the MCV list, this is
+ * the selectivity (and it's supposed to be exact)
+ *
+ * (b) if we have a full equality condition and we haven't found a
+ * match in the MCV list, then the selectivity is below the
+ * lowest selectivity in the MCV list
+ *
+ * (c) if we have an equality condition (not full), we can still
+ * search the MCV for matches and use the sum of probabilities
+ * as a lower boundary for the histogram (if there are no
+ * matches in the MCV list, then we have no boundary)
+ *
+ * Third, if there are multiple (combinations of) multivariate
+ * stats for a set of clauses, we may compute all of them and then
+ * somehow aggregate them - e.g. by choosing the minimum, median or
+ * average. The stats are susceptible to overestimation (because
+ * we take 50% of the bucket for partial matches). Some stats may
+ * give better estimates than others, but it's very difficult to
+ * say that in advance which one is the best (it depends on the
+ * number of buckets, number of additional columns not referenced
+ * in the clauses, type of condition etc.).
+ *
+ * So we may compute them all and then choose a sane aggregation
+ * (minimum seems like a good approach). Of course, this may result
+ * in longer / more expensive estimation (CPU-wise), but it may be
+ * worth it.
+ *
+ * It's possible to add a GUC choosing between a 'simple' approach
+ * (using a single statistics expected to give the best estimate)
+ * and a 'complex' one (combining the multiple estimates).
+ *
+ * multivariate_estimates = (simple|full)
+ *
+ * Also, this might be enabled at a table level, by something like
+ *
+ * ALTER TABLE ... SET STATISTICS (simple|full)
+ *
+ * Which would make it possible to use this only for the tables
+ * where the simple approach does not work.
+ *
+ * Also, there are ways to optimize this algorithmically. E.g. we
+ * may try to get an estimate from a matching MCV list first, and
+ * if we happen to get a "full equality match" we may stop computing
+ * the estimates from other stats (for this condition) because
+ * that's probably the best estimate we can really get.
+ *
+ * TODO When applying the clauses to the histogram/MCV list, we can do
+ * that from the most selective clauses first, because that'll
+ * eliminate the buckets/items sooner (so we'll be able to skip
+ * them without inspection, which is more expensive). But this
+ * requires really knowing the per-clause selectivities in advance,
+ * and that's not what we do now.
+ */
+static Selectivity
+clauselist_mv_selectivity(PlannerInfo *root, List *clauses, MVStatisticInfo *mvstats)
+{
+ bool fullmatch = false;
+
+ /*
+ * Lowest frequency in the MCV list (may be used as an upper bound
+ * for full equality conditions that did not match any MCV item).
+ */
+ Selectivity mcv_low = 0.0;
+
+ /* TODO Evaluate simple 1D selectivities, use the smallest one as
+ * an upper bound, product as lower bound, and sort the
+ * clauses in ascending order by selectivity (to optimize the
+ * MCV/histogram evaluation).
+ */
+
+ /* Evaluate the MCV selectivity */
+ return clauselist_mv_selectivity_mcvlist(root, clauses, mvstats,
+ &fullmatch, &mcv_low);
+}
+
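For illustration, the clamping TODO above might end up looking roughly
like this sketch (not part of the attached patch; it simply reuses the
existing clause_selectivity() entry point to get per-clause estimates):

    /*
     * Sketch: clamp a multivariate estimate by the most restrictive
     * clause, because P(A and B) <= Min(P(A), P(B)).
     */
    static Selectivity
    clamp_by_min_selectivity(PlannerInfo *root, List *clauses,
                             int varRelid, JoinType jointype,
                             SpecialJoinInfo *sjinfo,
                             Selectivity mv_estimate)
    {
        ListCell   *l;
        Selectivity min_sel = 1.0;

        foreach(l, clauses)
        {
            Selectivity s = clause_selectivity(root, (Node *) lfirst(l),
                                               varRelid, jointype, sjinfo);

            min_sel = Min(min_sel, s);
        }

        return Min(mv_estimate, min_sel);
    }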
/*
* Collect attributes from mv-compatible clauses.
*/
static Bitmapset *
collect_mv_attnums(PlannerInfo *root, List *clauses, Oid varRelid,
- Index *relid, SpecialJoinInfo *sjinfo)
+ Index *relid, SpecialJoinInfo *sjinfo, int types)
{
Bitmapset *attnums = NULL;
ListCell *l;
@@ -945,12 +1137,11 @@ collect_mv_attnums(PlannerInfo *root, List *clauses, Oid varRelid,
*/
foreach (l, clauses)
{
- AttrNumber attnum;
Node *clause = (Node *) lfirst(l);
- /* ignore the result for now - we only need the info */
- if (clause_is_mv_compatible(root, clause, varRelid, relid, &attnum, sjinfo))
- attnums = bms_add_member(attnums, attnum);
+ /* ignore the result here - we only need the attnums */
+ clause_is_mv_compatible(root, clause, varRelid, relid, &attnums,
+ sjinfo, types);
}
/*
@@ -969,6 +1160,188 @@ collect_mv_attnums(PlannerInfo *root, List *clauses, Oid varRelid,
}
/*
+ * We're looking for statistics matching at least 2 attributes,
+ * referenced in the clauses compatible with multivariate statistics.
+ * The current selection criterion is very simple - we choose the
+ * statistics referencing the most attributes.
+ *
+ * If there are multiple statistics referencing the same number of
+ * columns (from the clauses), the one with fewer source columns
+ * (as listed in ADD STATISTICS when creating the statistics) wins.
+ * Otherwise the first one wins.
+ *
+ * This is a very simple criterion, and it has several weaknesses:
+ *
+ * (a) does not consider the accuracy of the statistics
+ *
+ * If there are two histograms built on the same set of columns,
+ * but one has 100 buckets and the other one has 1000 buckets (thus
+ * likely providing better estimates), this is not currently
+ * considered.
+ *
+ * (b) does not consider the type of statistics
+ *
+ * If there are three statistics - one containing just a MCV list,
+ * another one with just a histogram and a third one with both,
+ * this is not considered.
+ *
+ * (c) does not consider the number of clauses
+ *
+ * As explained, only the number of referenced attributes counts,
+ * so if there are multiple clauses on a single attribute, this
+ * still counts as a single attribute.
+ *
+ * (d) does not consider type of condition
+ *
+ * Some clauses may work better with some statistics - for example
+ * equality clauses probably work better with MCV lists than with
+ * histograms. But IS [NOT] NULL conditions may often work better
+ * with histograms (thanks to NULL-buckets).
+ *
+ * So for example with five WHERE conditions
+ *
+ * WHERE (a = 1) AND (b = 1) AND (c = 1) AND (d = 1) AND (e = 1)
+ *
+ * and statistics on (a,b), (a,b,e) and (a,b,c,d), the last one will be
+ * selected as it references the most columns.
+ *
+ * Once we have selected the multivariate statistics, we split the list
+ * of clauses into two parts - conditions that are compatible with the
+ * selected stats, and conditions that will be estimated using
+ * simple statistics.
+ *
+ * From the example above, conditions
+ *
+ * (a = 1) AND (b = 1) AND (c = 1) AND (d = 1)
+ *
+ * will be estimated using the multivariate statistics (a,b,c,d) while
+ * the last condition (e = 1) will get estimated using the regular ones.
+ *
+ * There are various alternative selection criteria (e.g. counting
+ * conditions instead of just referenced attributes), but eventually
+ * the best option should be to combine multiple statistics. But that's
+ * much harder to do correctly.
+ *
+ * TODO Select multiple statistics and combine them when computing
+ * the estimate.
+ *
+ * TODO This will probably have to consider compatibility of clauses,
+ * because 'dependencies' will probably work only with equality
+ * clauses.
+ */
+static MVStatisticInfo *
+choose_mv_statistics(List *stats, Bitmapset *attnums)
+{
+ int i;
+ ListCell *lc;
+
+ MVStatisticInfo *choice = NULL;
+
+ int current_matches = 1; /* goal #1: maximize */
+ int current_dims = (MVSTATS_MAX_DIMENSIONS+1); /* goal #2: minimize */
+
+ /*
+	 * Walk through the list of statistics, and for each one count the
+	 * attributes it shares with the clauses (encoded in the 'attnums'
+	 * bitmap).
+ */
+ foreach (lc, stats)
+ {
+ MVStatisticInfo *info = (MVStatisticInfo *)lfirst(lc);
+
+ /* columns matching this statistics */
+ int matches = 0;
+
+ int2vector * attrs = info->stakeys;
+ int numattrs = attrs->dim1;
+
+ /* skip dependencies-only stats */
+ if (! info->mcv_built)
+ continue;
+
+		/* count columns covered by these statistics */
+ for (i = 0; i < numattrs; i++)
+ if (bms_is_member(attrs->values[i], attnums))
+ matches++;
+
+ /*
+		 * Use this statistics if it improves the number of matches, or
+		 * if it matches the same number of attributes but is smaller.
+ */
+ if ((matches > current_matches) ||
+ ((matches == current_matches) && (current_dims > numattrs)))
+ {
+ choice = info;
+ current_matches = matches;
+ current_dims = numattrs;
+ }
+ }
+
+ return choice;
+}
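
For example, with statistics created as

    ALTER TABLE t ADD STATISTICS ON (a, b, c);
    ALTER TABLE t ADD STATISTICS ON (a, b, c, d, e);

and clauses referencing only (a, b, c), both statistics match three
attributes, so choose_mv_statistics() picks the first one, as it is
built on fewer source columns.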
+
+/*
+ * This splits the clause list into two parts - one containing clauses
+ * that will be evaluated using the chosen statistics, and the remaining
+ * clauses (either not mv-compatible, or not covered by the statistics).
+ */
+static List *
+clauselist_mv_split(PlannerInfo *root, SpecialJoinInfo *sjinfo,
+ List *clauses, Oid varRelid, List **mvclauses,
+ MVStatisticInfo *mvstats, int types)
+{
+ int i;
+ ListCell *l;
+ List *non_mvclauses = NIL;
+
+ /* FIXME is there a better way to get info on int2vector? */
+ int2vector * attrs = mvstats->stakeys;
+ int numattrs = mvstats->stakeys->dim1;
+
+ Bitmapset *mvattnums = NULL;
+
+ /* build bitmap of attributes covered by the stats, so we can
+ * do bms_is_subset later */
+ for (i = 0; i < numattrs; i++)
+ mvattnums = bms_add_member(mvattnums, attrs->values[i]);
+
+ /* erase the list of mv-compatible clauses */
+ *mvclauses = NIL;
+
+ foreach (l, clauses)
+ {
+ bool match = false; /* by default not mv-compatible */
+ Bitmapset *attnums = NULL;
+ Node *clause = (Node *) lfirst(l);
+
+ if (clause_is_mv_compatible(root, clause, varRelid, NULL,
+ &attnums, sjinfo, types))
+ {
+ /* are all the attributes part of the selected stats? */
+ if (bms_is_subset(attnums, mvattnums))
+ match = true;
+ }
+
+ /*
+ * The clause matches the selected stats, so put it to the list
+ * of mv-compatible clauses. Otherwise, keep it in the list of
+ * 'regular' clauses (that may be selected later).
+ */
+ if (match)
+ *mvclauses = lappend(*mvclauses, clause);
+ else
+ non_mvclauses = lappend(non_mvclauses, clause);
+ }
+
+ /*
+	 * Return the remaining clauses, to be estimated the regular way
+	 * (they are not compatible with the chosen statistics).
+	 */
+	return non_mvclauses;
+}
+
+/*
* Determines whether the clause is compatible with multivariate stats,
* and if it is, returns some additional information - varno (index
* into simple_rte_array) and a bitmap of attributes. This is then
@@ -987,8 +1360,12 @@ collect_mv_attnums(PlannerInfo *root, List *clauses, Oid varRelid,
*/
static bool
clause_is_mv_compatible(PlannerInfo *root, Node *clause, Oid varRelid,
- Index *relid, AttrNumber *attnum, SpecialJoinInfo *sjinfo)
+ Index *relid, Bitmapset **attnums, SpecialJoinInfo *sjinfo,
+ int types)
{
+ Relids clause_relids;
+ Relids left_relids;
+ Relids right_relids;
if (IsA(clause, RestrictInfo))
{
@@ -998,82 +1375,176 @@ clause_is_mv_compatible(PlannerInfo *root, Node *clause, Oid varRelid,
if (rinfo->pseudoconstant)
return false;
- /* no support for OR clauses at this point */
- if (rinfo->orclause)
- return false;
-
/* get the actual clause from the RestrictInfo */
clause = (Node*)rinfo->clause;
- /* only simple opclauses are compatible with multivariate stats */
- if (! is_opclause(clause))
- return false;
-
/* we don't support join conditions at this moment */
if (treat_as_join_clause(clause, rinfo, varRelid, sjinfo))
return false;
+ clause_relids = rinfo->clause_relids;
+ left_relids = rinfo->left_relids;
+ right_relids = rinfo->right_relids;
+ }
+ else if (is_opclause(clause) && list_length(((OpExpr *) clause)->args) == 2)
+ {
+ left_relids = pull_varnos(get_leftop((Expr*)clause));
+ right_relids = pull_varnos(get_rightop((Expr*)clause));
+
+ clause_relids = bms_union(left_relids,
+ right_relids);
+ }
+ else
+ {
+ /* Not a binary opclause, so mark left/right relid sets as empty */
+ left_relids = NULL;
+ right_relids = NULL;
+ /* and get the total relid set the hard way */
+ clause_relids = pull_varnos((Node *) clause);
+ }
+
+ /*
+ * Only simple opclauses and IS NULL tests are compatible with
+ * multivariate stats at this point.
+ */
+ if ((is_opclause(clause))
+ && (list_length(((OpExpr *) clause)->args) == 2))
+ {
+ OpExpr *expr = (OpExpr *) clause;
+ bool varonleft = true;
+ bool ok;
+
/* is it 'variable op constant' ? */
- if (list_length(((OpExpr *) clause)->args) == 2)
+
+ ok = (bms_membership(clause_relids) == BMS_SINGLETON) &&
+ (is_pseudo_constant_clause_relids(lsecond(expr->args),
+ right_relids) ||
+ (varonleft = false,
+ is_pseudo_constant_clause_relids(linitial(expr->args),
+ left_relids)));
+
+ if (ok)
{
- OpExpr *expr = (OpExpr *) clause;
- bool varonleft = true;
- bool ok;
+ Var * var = (varonleft) ? linitial(expr->args) : lsecond(expr->args);
- ok = (bms_membership(rinfo->clause_relids) == BMS_SINGLETON) &&
- (is_pseudo_constant_clause_relids(lsecond(expr->args),
- rinfo->right_relids) ||
- (varonleft = false,
- is_pseudo_constant_clause_relids(linitial(expr->args),
- rinfo->left_relids)));
+ /*
+ * Simple variables only - otherwise the planner_rt_fetch seems to fail
+			 * (returns NULL).
+			 *
+			 * TODO Maybe using examine_variable() would fix that?
+ */
+ if (! (IsA(var, Var) && (varRelid == 0 || varRelid == var->varno)))
+ return false;
- if (ok)
- {
- Var * var = (varonleft) ? linitial(expr->args) : lsecond(expr->args);
+ /*
+ * Only consider this variable if (varRelid == 0) or when the varno
+ * matches varRelid (see explanation at clause_selectivity).
+ *
+ * FIXME I suspect this may not be really necessary. The (varRelid == 0)
+ * part seems to be enforced by treat_as_join_clause().
+ */
+ if (! ((varRelid == 0) || (varRelid == var->varno)))
+ return false;
- /*
- * Simple variables only - otherwise the planner_rt_fetch seems to fail
- * (return NULL).
- *
- * TODO Maybe use examine_variable() would fix that?
- */
- if (! (IsA(var, Var) && (varRelid == 0 || varRelid == var->varno)))
- return false;
+ /* Also skip special varno values, and system attributes ... */
+ if ((IS_SPECIAL_VARNO(var->varno)) || (! AttrNumberIsForUserDefinedAttr(var->varattno)))
+ return false;
- /*
- * Only consider this variable if (varRelid == 0) or when the varno
- * matches varRelid (see explanation at clause_selectivity).
- *
- * FIXME I suspect this may not be really necessary. The (varRelid == 0)
- * part seems to be enforced by treat_as_join_clause().
- */
- if (! ((varRelid == 0) || (varRelid == var->varno)))
- return false;
+ /* Lookup info about the base relation (we need to pass the OID out) */
+ if (relid != NULL)
+ *relid = var->varno;
+
+ /*
+ * If it's not a "<" or ">" or "=" operator, just ignore the
+ * clause. Otherwise note the relid and attnum for the variable.
+			 * This looks at the function for estimating selectivity, not at
+			 * the operator directly (a bit awkward, but well ...).
+ */
+ switch (get_oprrest(expr->opno))
+ {
+ case F_SCALARLTSEL:
+ case F_SCALARGTSEL:
+ /* not compatible with functional dependencies */
+ if (types & MV_CLAUSE_TYPE_MCV)
+ {
+ *attnums = bms_add_member(*attnums, var->varattno);
+						return true;
+ }
+ return false;
+
+ case F_EQSEL:
+ *attnums = bms_add_member(*attnums, var->varattno);
+ return true;
+ }
+ }
+ }
+ else if (IsA(clause, NullTest)
+ && IsA(((NullTest*)clause)->arg, Var))
+ {
+ Var * var = (Var*)((NullTest*)clause)->arg;
+
+ /*
+ * Simple variables only - otherwise the planner_rt_fetch seems to fail
+		 * (returns NULL).
+		 *
+		 * TODO Maybe using examine_variable() would fix that?
+ */
+ if (! (IsA(var, Var) && (varRelid == 0 || varRelid == var->varno)))
+ return false;
+
+ /*
+ * Only consider this variable if (varRelid == 0) or when the varno
+ * matches varRelid (see explanation at clause_selectivity).
+ *
+ * FIXME I suspect this may not be really necessary. The (varRelid == 0)
+ * part seems to be enforced by treat_as_join_clause().
+ */
+ if (! ((varRelid == 0) || (varRelid == var->varno)))
+ return false;
- /* Also skip special varno values, and system attributes ... */
- if ((IS_SPECIAL_VARNO(var->varno)) || (! AttrNumberIsForUserDefinedAttr(var->varattno)))
- return false;
+ /* Also skip special varno values, and system attributes ... */
+ if ((IS_SPECIAL_VARNO(var->varno)) || (! AttrNumberIsForUserDefinedAttr(var->varattno)))
+ return false;
+ /* Lookup info about the base relation (we need to pass the OID out) */
+ if (relid != NULL)
*relid = var->varno;
- /*
- * If it's not a "<" or ">" or "=" operator, just ignore the
- * clause. Otherwise note the relid and attnum for the variable.
- * This uses the function for estimating selectivity, ont the
- * operator directly (a bit awkward, but well ...).
- */
- switch (get_oprrest(expr->opno))
- {
- case F_EQSEL:
- *attnum = var->varattno;
- return true;
- }
- }
+ *attnums = bms_add_member(*attnums, var->varattno);
+
+ return true;
+ }
+ else if (or_clause(clause) || and_clause(clause))
+ {
+ /*
+ * AND/OR-clauses are supported if all sub-clauses are supported
+ *
+ * TODO We might support mixed case, where some of the clauses
+ * are supported and some are not, and treat all supported
+		 *		subclauses as a single clause, compute its selectivity
+ * using mv stats, and compute the total selectivity using
+ * the current algorithm.
+ *
+ * TODO For RestrictInfo above an OR-clause, we might use the
+ * orclause with nested RestrictInfo - we won't have to
+ * call pull_varnos() for each clause, saving time.
+ */
+ Bitmapset *tmp = NULL;
+ ListCell *l;
+ foreach (l, ((BoolExpr*)clause)->args)
+ {
+ if (! clause_is_mv_compatible(root, (Node*)lfirst(l),
+ varRelid, relid, &tmp, sjinfo, types))
+ return false;
}
+
+		/* add the attnums from the AND/OR-clause to the set of attnums */
+ *attnums = bms_join(*attnums, tmp);
+
+ return true;
}
return false;
-
}
/*
@@ -1322,6 +1793,9 @@ has_stats(List *stats, int type)
if ((type & MV_CLAUSE_TYPE_FDEP) && stat->deps_built)
return true;
+
+ if ((type & MV_CLAUSE_TYPE_MCV) && stat->mcv_built)
+ return true;
}
return false;
@@ -1617,25 +2091,39 @@ fdeps_filter_clauses(PlannerInfo *root,
foreach (lc, clauses)
{
- AttrNumber attnum;
+ Bitmapset *attnums = NULL;
Node *clause = (Node *) lfirst(lc);
- if (! clause_is_mv_compatible(root, clause, varRelid, relid,
- &attnum, sjinfo))
+ if (! clause_is_mv_compatible(root, clause, varRelid, relid, &attnums,
+ sjinfo, MV_CLAUSE_TYPE_FDEP))
/* clause incompatible with functional dependencies */
*reduced_clauses = lappend(*reduced_clauses, clause);
- else if (! bms_is_member(attnum, deps_attnums))
+ else if (bms_num_members(attnums) > 1)
+
+ /*
+			 * clause referencing multiple attributes (strange - shouldn't
+			 * this be handled by clause_is_mv_compatible directly?)
+ */
+ *reduced_clauses = lappend(*reduced_clauses, clause);
+
+ else if (! bms_is_member(bms_singleton_member(attnums), deps_attnums))
/* clause not covered by the dependencies */
*reduced_clauses = lappend(*reduced_clauses, clause);
else
{
+ /* ok, clause compatible with existing dependencies */
+ Assert(bms_num_members(attnums) == 1);
+
*deps_clauses = lappend(*deps_clauses, clause);
- clause_attnums = bms_add_member(clause_attnums, attnum);
+ clause_attnums = bms_add_member(clause_attnums,
+ bms_singleton_member(attnums));
}
+
+ bms_free(attnums);
}
return clause_attnums;
@@ -1673,3 +2161,454 @@ get_varattnos(Node * node, Index relid)
return result;
}
+
+/*
+ * Estimate selectivity of clauses using a MCV list.
+ *
+ * If there's no MCV list for the stats, the function returns 0.0.
+ *
+ * While computing the estimate, the function checks whether all the
+ * columns were matched with an equality condition. If that's the case,
+ * we can skip processing the histogram, as there can be no rows in
+ * it with the same values - all the rows matching the condition are
+ * represented by the MCV item. This can only happen with equality
+ * on all the attributes.
+ *
+ * The algorithm works like this:
+ *
+ * 1) mark all items as 'match'
+ * 2) walk through all the clauses
+ * 3) for a particular clause, walk through all the items
+ * 4) skip items that are already 'no match'
+ * 5) check clause for items that still match
+ * 6) sum frequencies for items to get selectivity
+ *
+ * The function also returns the frequency of the least frequent item
+ * on the MCV list, which may be useful for clamping the estimate from
+ * the histogram (all items not in the MCV list are less frequent).
+ * This however seems useful only for cases with conditions on all
+ * attributes.
+ *
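+ * For example, with clauses (a = 1) AND (b < 2) the first clause
+ * may rule out most of the MCV items, and the second clause is then
+ * evaluated only on the items that survived - the result is the sum
+ * of frequencies of items matching both clauses.
+ *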
+ * TODO This only handles AND-ed clauses, but it might work for OR-ed
+ * lists too - it just needs to reverse the logic a bit. I.e. start
+ * with 'no match' for all items, and mark the items as a match
+ * as the clauses are processed (and skip items that are 'match').
+ */
+static Selectivity
+clauselist_mv_selectivity_mcvlist(PlannerInfo *root, List *clauses,
+ MVStatisticInfo *mvstats, bool *fullmatch,
+ Selectivity *lowsel)
+{
+ int i;
+ Selectivity s = 0.0;
+ Selectivity u = 0.0;
+
+ MCVList mcvlist = NULL;
+ int nmatches = 0;
+
+ /* match/mismatch bitmap for each MCV item */
+ char * matches = NULL;
+
+ Assert(clauses != NIL);
+ Assert(list_length(clauses) >= 2);
+
+ /* there's no MCV list built yet */
+ if (! mvstats->mcv_built)
+ return 0.0;
+
+ mcvlist = load_mv_mcvlist(mvstats->mvoid);
+
+ Assert(mcvlist != NULL);
+ Assert(mcvlist->nitems > 0);
+
+ /* by default all the MCV items match the clauses fully */
+ matches = palloc0(sizeof(char) * mcvlist->nitems);
+ memset(matches, MVSTATS_MATCH_FULL, sizeof(char)*mcvlist->nitems);
+
+ /* number of matching MCV items */
+ nmatches = mcvlist->nitems;
+
+ nmatches = update_match_bitmap_mcvlist(root, clauses,
+ mvstats->stakeys, mcvlist,
+ nmatches, matches,
+ lowsel, fullmatch, false);
+
+ /* sum frequencies for all the matching MCV items */
+ for (i = 0; i < mcvlist->nitems; i++)
+ {
+ /* used to 'scale' for MCV lists not covering all tuples */
+ u += mcvlist->items[i]->frequency;
+
+ if (matches[i] != MVSTATS_MATCH_NONE)
+ s += mcvlist->items[i]->frequency;
+ }
+
+ pfree(matches);
+ pfree(mcvlist);
+
+ return s*u;
+}
+
+/*
+ * Evaluate clauses using the MCV list, and update the match bitmap.
+ *
+ * The bitmap may be already partially set, so this is really a way to
+ * combine results of several clause lists - either when computing
+ * conditional probability P(A|B) or a combination of AND/OR clauses.
+ *
+ * TODO This works with 'bitmap' where each bit is represented as a char,
+ * which is slightly wasteful. Instead, we could use a regular
+ * bitmap, reducing the size to ~1/8. Another thing is merging the
+ * bitmaps using & and |, which might be faster than min/max.
+ */
+static int
+update_match_bitmap_mcvlist(PlannerInfo *root, List *clauses,
+ int2vector *stakeys, MCVList mcvlist,
+ int nmatches, char * matches,
+ Selectivity *lowsel, bool *fullmatch,
+ bool is_or)
+{
+ int i;
+ ListCell * l;
+
+ Bitmapset *eqmatches = NULL; /* attributes with equality matches */
+
+ /* The bitmap may be partially built. */
+ Assert(nmatches >= 0);
+ Assert(nmatches <= mcvlist->nitems);
+ Assert(clauses != NIL);
+ Assert(list_length(clauses) >= 1);
+ Assert(mcvlist != NULL);
+ Assert(mcvlist->nitems > 0);
+
+	/* The result can no longer change - no matches left (for AND-ed
+	 * clauses), or everything already matches (for OR-ed clauses). */
+ if (((nmatches == 0) && (! is_or)) ||
+ ((nmatches == mcvlist->nitems) && is_or))
+ return nmatches;
+
+ /* frequency of the lowest MCV item */
+ *lowsel = 1.0;
+
+ /*
+ * Loop through the list of clauses, and for each of them evaluate
+ * all the MCV items not yet eliminated by the preceding clauses.
+ *
+ * FIXME This would probably deserve a refactoring, I guess. Unify
+ * the two loops and put the checks inside, or something like
+ * that.
+ */
+ foreach (l, clauses)
+ {
+ Node * clause = (Node*)lfirst(l);
+
+ /* if it's a RestrictInfo, then extract the clause */
+ if (IsA(clause, RestrictInfo))
+ clause = (Node*)((RestrictInfo*)clause)->clause;
+
+		/* if the result can no longer change, we can stop */
+ if (((nmatches == 0) && (! is_or)) ||
+ ((nmatches == mcvlist->nitems) && is_or))
+ break;
+
+		/* it's either an OpExpr, a NullTest, or an AND/OR clause */
+ if (is_opclause(clause))
+ {
+ OpExpr * expr = (OpExpr*)clause;
+ bool varonleft = true;
+ bool ok;
+
+ /* operator */
+ FmgrInfo opproc;
+
+ /* get procedure computing operator selectivity */
+ RegProcedure oprrest = get_oprrest(expr->opno);
+
+ fmgr_info(get_opcode(expr->opno), &opproc);
+
+ ok = (NumRelids(clause) == 1) &&
+ (is_pseudo_constant_clause(lsecond(expr->args)) ||
+ (varonleft = false,
+ is_pseudo_constant_clause(linitial(expr->args))));
+
+ if (ok)
+ {
+
+ FmgrInfo ltproc, gtproc;
+ Var * var = (varonleft) ? linitial(expr->args) : lsecond(expr->args);
+ Const * cst = (varonleft) ? lsecond(expr->args) : linitial(expr->args);
+ bool isgt = (! varonleft);
+
+ /*
+ * TODO Fetch only when really needed (probably for equality only)
+ * TODO Technically either lt/gt is sufficient.
+ *
+ * FIXME The code in analyze.c creates histograms only for types
+ * with enough ordering (by calling get_sort_group_operators).
+ * Is this the same assumption, i.e. are we certain that we
+ * get the ltproc/gtproc every time we ask? Or are there types
+ * where get_sort_group_operators returns ltopr and here we
+ * get nothing?
+ */
+ TypeCacheEntry *typecache
+ = lookup_type_cache(var->vartype,
+ TYPECACHE_EQ_OPR | TYPECACHE_LT_OPR | TYPECACHE_GT_OPR);
+
+ /* FIXME proper matching attribute to dimension */
+ int idx = mv_get_index(var->varattno, stakeys);
+
+ fmgr_info(get_opcode(typecache->lt_opr), <proc);
+ fmgr_info(get_opcode(typecache->gt_opr), >proc);
+
+ /*
+ * Walk through the MCV items and evaluate the current clause. We can
+ * skip items that were already ruled out, and terminate if there are
+ * no remaining MCV items that might possibly match.
+ */
+ for (i = 0; i < mcvlist->nitems; i++)
+ {
+ bool mismatch = false;
+ MCVItem item = mcvlist->items[i];
+
+ /*
+					 * find the lowest selectivity in the MCV list
+					 * FIXME Maybe not the best place to do this (it runs for every clause).
+ */
+ if (item->frequency < *lowsel)
+ *lowsel = item->frequency;
+
+ /*
+ * If there are no more matches (AND) or no remaining unmatched
+ * items (OR), we can stop processing this clause.
+ */
+ if (((nmatches == 0) && (! is_or)) ||
+ ((nmatches == mcvlist->nitems) && is_or))
+ break;
+
+ /*
+ * For AND-lists, we can also mark NULL items as 'no match' (and
+ * then skip them). For OR-lists this is not possible.
+ */
+ if ((! is_or) && item->isnull[idx])
+ matches[i] = MVSTATS_MATCH_NONE;
+
+ /* skip MCV items that were already ruled out */
+ if ((! is_or) && (matches[i] == MVSTATS_MATCH_NONE))
+ continue;
+ else if (is_or && (matches[i] == MVSTATS_MATCH_FULL))
+ continue;
+
+ /* TODO consider bsearch here (list is sorted by values)
+ * TODO handle other operators too (LT, GT)
+ * TODO identify "full match" when the clauses fully
+ * match the whole MCV list (so that checking the
+ * histogram is not needed)
+ */
+ if (oprrest == F_EQSEL)
+ {
+ /*
+ * We don't care about isgt in equality, because it does not
+ * matter whether it's (var = const) or (const = var).
+ */
+ bool match = DatumGetBool(FunctionCall2Coll(&opproc,
+ DEFAULT_COLLATION_OID,
+ cst->constvalue,
+ item->values[idx]));
+
+ if (match)
+ eqmatches = bms_add_member(eqmatches, idx);
+
+ mismatch = (! match);
+ }
+ else if (oprrest == F_SCALARLTSEL) /* column < constant */
+ {
+
+ if (! isgt) /* (var < const) */
+ {
+ /*
+ * First check whether the constant is below the lower boundary (in that
+ * case we can skip the bucket, because there's no overlap).
+ */
+ mismatch = DatumGetBool(FunctionCall2Coll(&opproc,
+ DEFAULT_COLLATION_OID,
+ cst->constvalue,
+ item->values[idx]));
+
+ } /* (get_oprrest(expr->opno) == F_SCALARLTSEL) */
+ else /* (const < var) */
+ {
+ /*
+ * First check whether the constant is above the upper boundary (in that
+ * case we can skip the bucket, because there's no overlap).
+ */
+ mismatch = DatumGetBool(FunctionCall2Coll(&opproc,
+ DEFAULT_COLLATION_OID,
+ item->values[idx],
+ cst->constvalue));
+ }
+ }
+ else if (oprrest == F_SCALARGTSEL) /* column > constant */
+ {
+
+ if (! isgt) /* (var > const) */
+ {
+ /*
+ * First check whether the constant is above the upper boundary (in that
+ * case we can skip the bucket, because there's no overlap).
+ */
+ mismatch = DatumGetBool(FunctionCall2Coll(&opproc,
+ DEFAULT_COLLATION_OID,
+ cst->constvalue,
+ item->values[idx]));
+ }
+ else /* (const > var) */
+ {
+ /*
+ * First check whether the constant is below the lower boundary (in
+ * that case we can skip the bucket, because there's no overlap).
+ */
+ mismatch = DatumGetBool(FunctionCall2Coll(&opproc,
+ DEFAULT_COLLATION_OID,
+ item->values[idx],
+ cst->constvalue));
+ }
+
+ } /* (get_oprrest(expr->opno) == F_SCALARGTSEL) */
+
+ /* XXX The conditions on matches[i] are not needed, as we
+ * skip MCV items that can't become true/false, depending
+ * on the current flag. See beginning of the loop over
+ * MCV items.
+ */
+
+ if ((is_or) && (matches[i] == MVSTATS_MATCH_NONE) && (! mismatch))
+ {
+ /* OR - was MATCH_NONE, but will be MATCH_FULL */
+ matches[i] = MVSTATS_MATCH_FULL;
+ ++nmatches;
+ continue;
+ }
+ else if ((! is_or) && (matches[i] == MVSTATS_MATCH_FULL) && mismatch)
+ {
+						/* AND - was MATCH_FULL, but will be MATCH_NONE */
+ matches[i] = MVSTATS_MATCH_NONE;
+ --nmatches;
+ continue;
+ }
+
+ }
+ }
+ }
+ else if (IsA(clause, NullTest))
+ {
+ NullTest * expr = (NullTest*)clause;
+ Var * var = (Var*)(expr->arg);
+
+ /* FIXME proper matching attribute to dimension */
+ int idx = mv_get_index(var->varattno, stakeys);
+
+ /*
+ * Walk through the MCV items and evaluate the current clause. We can
+ * skip items that were already ruled out, and terminate if there are
+ * no remaining MCV items that might possibly match.
+ */
+ for (i = 0; i < mcvlist->nitems; i++)
+ {
+ MCVItem item = mcvlist->items[i];
+
+ /*
+				 * find the lowest selectivity in the MCV list
+				 * FIXME Maybe not the best place to do this (it runs for every clause).
+ */
+ if (item->frequency < *lowsel)
+ *lowsel = item->frequency;
+
+ /* if there are no more matches, we can stop processing this clause */
+ if (nmatches == 0)
+ break;
+
+ /* skip MCV items that were already ruled out */
+ if (matches[i] == MVSTATS_MATCH_NONE)
+ continue;
+
+ /* if the clause mismatches the MCV item, set it as MATCH_NONE */
+ if (((expr->nulltesttype == IS_NULL) && (! mcvlist->items[i]->isnull[idx])) ||
+ ((expr->nulltesttype == IS_NOT_NULL) && (mcvlist->items[i]->isnull[idx])))
+ {
+ matches[i] = MVSTATS_MATCH_NONE;
+ --nmatches;
+ }
+ }
+ }
+ else if (or_clause(clause) || and_clause(clause))
+ {
+ /* AND/OR clause, with all clauses compatible with the selected MV stat */
+
+ int i;
+ BoolExpr *orclause = ((BoolExpr*)clause);
+ List *orclauses = orclause->args;
+
+ /* match/mismatch bitmap for each MCV item */
+ int or_nmatches = 0;
+ char * or_matches = NULL;
+
+ Assert(orclauses != NIL);
+ Assert(list_length(orclauses) >= 2);
+
+ /* number of matching MCV items */
+ or_nmatches = mcvlist->nitems;
+
+ /* by default none of the MCV items matches the clauses */
+ or_matches = palloc0(sizeof(char) * or_nmatches);
+
+ if (or_clause(clause))
+ {
+ /* OR clauses assume nothing matches, initially */
+ memset(or_matches, MVSTATS_MATCH_NONE, sizeof(char)*or_nmatches);
+ or_nmatches = 0;
+ }
+ else
+ {
+				/* AND clauses assume everything matches, initially */
+ memset(or_matches, MVSTATS_MATCH_FULL, sizeof(char)*or_nmatches);
+ }
+
+			/* build the match bitmap for the sub-clauses */
+ or_nmatches = update_match_bitmap_mcvlist(root, orclauses,
+ stakeys, mcvlist,
+ or_nmatches, or_matches,
+ lowsel, fullmatch, or_clause(clause));
+
+			/* merge the bitmap into the existing one */
+ for (i = 0; i < mcvlist->nitems; i++)
+ {
+ /*
+ * To AND-merge the bitmaps, a MIN() semantics is used.
+ * For OR-merge, use MAX().
+ *
+ * FIXME this does not decrease the number of matches
+ */
+ UPDATE_RESULT(matches[i], or_matches[i], is_or);
+ }
+
+ pfree(or_matches);
+
+ }
+ else
+ {
+ elog(ERROR, "unknown clause type: %d", clause->type);
+ }
+ }
+
+ /*
+ * If all the columns were matched by equality, it's a full match.
+	 * In this case there can be just a single matching MCV item (two
+	 * distinct items cannot both match the same set of equality clauses).
+ */
+ *fullmatch = (bms_num_members(eqmatches) == mcvlist->ndimensions);
+
+ /* free the allocated pieces */
+ if (eqmatches)
+ pfree(eqmatches);
+
+ return nmatches;
+}
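
To illustrate the merge semantics at the end (a sketch only - the
actual UPDATE_RESULT macro is defined elsewhere in the patch), with
MVSTATS_MATCH_NONE < MVSTATS_MATCH_FULL the merge behaves like

    /* AND-merge keeps the minimum, OR-merge keeps the maximum */
    #define UPDATE_RESULT(m, r, is_or) \
        ((m) = (is_or) ? Max((m), (r)) : Min((m), (r)))

so an AND-merge can only demote items to 'no match', and an OR-merge
can only promote them to 'match'.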
diff --git a/src/backend/optimizer/util/plancat.c b/src/backend/optimizer/util/plancat.c
index 60fd57f..0da7ad9 100644
--- a/src/backend/optimizer/util/plancat.c
+++ b/src/backend/optimizer/util/plancat.c
@@ -410,7 +410,7 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
mvstat = (Form_pg_mv_statistic) GETSTRUCT(htup);
/* unavailable stats are not interesting for the planner */
- if (mvstat->deps_built)
+ if (mvstat->deps_built || mvstat->mcv_built)
{
info = makeNode(MVStatisticInfo);
@@ -419,9 +419,11 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
/* enabled statistics */
info->deps_enabled = mvstat->deps_enabled;
+ info->mcv_enabled = mvstat->mcv_enabled;
/* built/available statistics */
info->deps_built = mvstat->deps_built;
+ info->mcv_built = mvstat->mcv_built;
/* stakeys */
adatum = SysCacheGetAttr(MVSTATOID, htup,
diff --git a/src/backend/utils/mvstats/Makefile b/src/backend/utils/mvstats/Makefile
index 099f1ed..f9bf10c 100644
--- a/src/backend/utils/mvstats/Makefile
+++ b/src/backend/utils/mvstats/Makefile
@@ -12,6 +12,6 @@ subdir = src/backend/utils/mvstats
top_builddir = ../../../..
include $(top_builddir)/src/Makefile.global
-OBJS = common.o dependencies.o
+OBJS = common.o dependencies.o mcv.o
include $(top_srcdir)/src/backend/common.mk
diff --git a/src/backend/utils/mvstats/common.c b/src/backend/utils/mvstats/common.c
index bd200bc..d1da714 100644
--- a/src/backend/utils/mvstats/common.c
+++ b/src/backend/utils/mvstats/common.c
@@ -16,12 +16,14 @@
#include "common.h"
+#include "utils/array.h"
+
static VacAttrStats ** lookup_var_attr_stats(int2vector *attrs,
- int natts, VacAttrStats **vacattrstats);
+ int natts,
+ VacAttrStats **vacattrstats);
static List* list_mv_stats(Oid relid);
-
/*
* Compute requested multivariate stats, using the rows sampled for the
* plain (single-column) stats.
@@ -49,6 +51,8 @@ build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
int j;
MVStatisticInfo *stat = (MVStatisticInfo *)lfirst(lc);
MVDependencies deps = NULL;
+ MCVList mcvlist = NULL;
+ int numrows_filtered = 0;
VacAttrStats **stats = NULL;
int numatts = 0;
@@ -87,8 +91,12 @@ build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
if (stat->deps_enabled)
deps = build_mv_dependencies(numrows, rows, attrs, stats);
+ /* build the MCV list */
+ if (stat->mcv_enabled)
+ mcvlist = build_mv_mcvlist(numrows, rows, attrs, stats, &numrows_filtered);
+
/* store the histogram / MCV list in the catalog */
- update_mv_stats(stat->mvoid, deps, attrs);
+ update_mv_stats(stat->mvoid, deps, mcvlist, attrs, stats);
}
}
@@ -166,6 +174,8 @@ list_mv_stats(Oid relid)
info->stakeys = buildint2vector(stats->stakeys.values, stats->stakeys.dim1);
info->deps_enabled = stats->deps_enabled;
info->deps_built = stats->deps_built;
+ info->mcv_enabled = stats->mcv_enabled;
+ info->mcv_built = stats->mcv_built;
result = lappend(result, info);
}
@@ -180,8 +190,56 @@ list_mv_stats(Oid relid)
return result;
}
+
+/*
+ * Find attnums of MV stats using the mvoid.
+ */
+int2vector*
+find_mv_attnums(Oid mvoid, Oid *relid)
+{
+ ArrayType *arr;
+ Datum adatum;
+ bool isnull;
+ HeapTuple htup;
+ int2vector *keys;
+
+	/* Fetch the pg_mv_statistic tuple for the given mvoid. */
+ htup = SearchSysCache1(MVSTATOID,
+ ObjectIdGetDatum(mvoid));
+
+ /* XXX syscache contains OIDs of deleted stats (not invalidated) */
+ if (! HeapTupleIsValid(htup))
+ return NULL;
+
+ /* starelid */
+ adatum = SysCacheGetAttr(MVSTATOID, htup,
+ Anum_pg_mv_statistic_starelid, &isnull);
+ Assert(!isnull);
+
+ *relid = DatumGetObjectId(adatum);
+
+ /* stakeys */
+ adatum = SysCacheGetAttr(MVSTATOID, htup,
+ Anum_pg_mv_statistic_stakeys, &isnull);
+ Assert(!isnull);
+
+ arr = DatumGetArrayTypeP(adatum);
+
+ keys = buildint2vector((int16 *) ARR_DATA_PTR(arr),
+ ARR_DIMS(arr)[0]);
+ ReleaseSysCache(htup);
+
+	/* TODO maybe save the list into relcache, as in RelationGetIndexList
+	 * (which served as an inspiration for this one)? */
+
+ return keys;
+}
+
+
void
-update_mv_stats(Oid mvoid, MVDependencies dependencies, int2vector *attrs)
+update_mv_stats(Oid mvoid,
+ MVDependencies dependencies, MCVList mcvlist,
+ int2vector *attrs, VacAttrStats **stats)
{
HeapTuple stup,
oldtup;
@@ -206,18 +264,29 @@ update_mv_stats(Oid mvoid, MVDependencies dependencies, int2vector *attrs)
= PointerGetDatum(serialize_mv_dependencies(dependencies));
}
+ if (mcvlist != NULL)
+ {
+ bytea * data = serialize_mv_mcvlist(mcvlist, attrs, stats);
+ nulls[Anum_pg_mv_statistic_stamcv -1] = (data == NULL);
+ values[Anum_pg_mv_statistic_stamcv - 1] = PointerGetDatum(data);
+ }
+
/* always replace the value (either by bytea or NULL) */
replaces[Anum_pg_mv_statistic_stadeps -1] = true;
+ replaces[Anum_pg_mv_statistic_stamcv -1] = true;
/* always change the availability flags */
nulls[Anum_pg_mv_statistic_deps_built -1] = false;
+ nulls[Anum_pg_mv_statistic_mcv_built -1] = false;
nulls[Anum_pg_mv_statistic_stakeys-1] = false;
/* use the new attnums, in case we removed some dropped ones */
replaces[Anum_pg_mv_statistic_deps_built-1] = true;
+ replaces[Anum_pg_mv_statistic_mcv_built -1] = true;
replaces[Anum_pg_mv_statistic_stakeys -1] = true;
values[Anum_pg_mv_statistic_deps_built-1] = BoolGetDatum(dependencies != NULL);
+ values[Anum_pg_mv_statistic_mcv_built -1] = BoolGetDatum(mcvlist != NULL);
values[Anum_pg_mv_statistic_stakeys -1] = PointerGetDatum(attrs);
/* Is there already a pg_mv_statistic tuple for this attribute? */
@@ -246,6 +315,21 @@ update_mv_stats(Oid mvoid, MVDependencies dependencies, int2vector *attrs)
heap_close(sd, RowExclusiveLock);
}
+
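+/*
+ * Translate the attribute number into a dimension index within the
+ * statistics, i.e. count the stakeys lower than varattno (stakeys
+ * is expected to be sorted).
+ */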
+int
+mv_get_index(AttrNumber varattno, int2vector * stakeys)
+{
+ int i, idx = 0;
+ for (i = 0; i < stakeys->dim1; i++)
+ {
+ if (stakeys->values[i] < varattno)
+ idx += 1;
+ else
+ break;
+ }
+ return idx;
+}
+
/* multi-variate stats comparator */
/*
@@ -256,11 +340,15 @@ update_mv_stats(Oid mvoid, MVDependencies dependencies, int2vector *attrs)
int
compare_scalars_simple(const void *a, const void *b, void *arg)
{
- Datum da = *(Datum*)a;
- Datum db = *(Datum*)b;
- SortSupport ssup= (SortSupport) arg;
+ return compare_datums_simple(*(Datum*)a,
+ *(Datum*)b,
+ (SortSupport)arg);
+}
- return ApplySortComparator(da, false, db, false, ssup);
+int
+compare_datums_simple(Datum a, Datum b, SortSupport ssup)
+{
+ return ApplySortComparator(a, false, b, false, ssup);
}
/*
diff --git a/src/backend/utils/mvstats/common.h b/src/backend/utils/mvstats/common.h
index 6d5465b..f4309f7 100644
--- a/src/backend/utils/mvstats/common.h
+++ b/src/backend/utils/mvstats/common.h
@@ -46,7 +46,15 @@ typedef struct
Datum value; /* a data value */
int tupno; /* position index for tuple it came from */
} ScalarItem;
-
+
+/* (de)serialization info */
+typedef struct DimensionInfo {
+ int nvalues; /* number of deduplicated values */
+ int nbytes; /* number of bytes (serialized) */
+ int typlen; /* pg_type.typlen */
+ bool typbyval; /* pg_type.typbyval */
+} DimensionInfo;
+
/* multi-sort */
typedef struct MultiSortSupportData {
int ndims; /* number of dimensions supported by the multi-sort */
@@ -71,5 +79,6 @@ int multi_sort_compare_dim(int dim, const SortItem *a,
const SortItem *b, MultiSortSupport mss);
/* comparators, used when constructing multivariate stats */
+int compare_datums_simple(Datum a, Datum b, SortSupport ssup);
int compare_scalars_simple(const void *a, const void *b, void *arg);
int compare_scalars_partition(const void *a, const void *b, void *arg);
diff --git a/src/backend/utils/mvstats/mcv.c b/src/backend/utils/mvstats/mcv.c
new file mode 100644
index 0000000..670dbda
--- /dev/null
+++ b/src/backend/utils/mvstats/mcv.c
@@ -0,0 +1,1237 @@
+/*-------------------------------------------------------------------------
+ *
+ * mcv.c
+ * POSTGRES multivariate MCV lists
+ *
+ *
+ * Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ * src/backend/utils/mvstats/mcv.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres.h"
+
+#include "fmgr.h"
+#include "funcapi.h"
+
+#include "utils/lsyscache.h"
+
+#include "common.h"
+
+/*
+ * Multivariate MCVs (most-common values lists) are a straightforward
+ * extension of regular MCV list, tracking combinations of values for
+ * several attributes (columns), including NULL flags, and frequency
+ * of the combination.
+ *
+ * For columns with small number of distinct values, this works quite
+ * well and may represent the distribution very accurately. For columns
+ * with large number of distinct values (e.g. stored as FLOAT), this
+ * does not work that well. Especially if the distribution is mostly
+ * uniform, with no very common combinations.
+ *
+ * If we can represent the distribution as a MCV list, we can estimate
+ * some clauses (e.g. equality clauses) much accurately than using
+ * histograms for example.
+ *
+ * Another benefit of MCV lists (compared to histograms) is that they
+ * don't require sorting of the values, so that they work better for
+ * data types that either don't support sorting at all, or when the
+ * sorting does not really match the meaning. For example we know how to
+ * sort strings, but it's unlikely to make much sense for city names.
+ *
+ *
+ * Hashed MCV (not yet implemented)
+ * --------------------------------
+ * By restricting to MCV list and equality conditions, we may use hash
+ * values instead of the long varlena values. This significantly reduces
+ * the storage requirements, and we can still use it to estimate the
+ * equality conditions (assuming the collisions are rare enough).
+ *
+ * This however complicates matching the columns to available stats, as
+ * it requires matching clauses (not columns) to stats. And it may get
+ * quite complex - e.g. what if there are multiple clauses, each
+ * compatible with different stats subset?
+ *
+ *
+ * Selectivity estimation
+ * ----------------------
+ * The estimation, implemented in clauselist_mv_selectivity_mcvlist(),
+ * is quite simple in principle - walk through the MCV items and sum
+ * frequencies of all the items that match all the clauses.
+ *
+ * The current implementation uses MCV lists to estimate these types
+ * of clauses (think of WHERE conditions):
+ *
+ * (a) equality clauses WHERE (a = 1) AND (b = 2)
+ * (b) inequality clauses WHERE (a < 1) AND (b >= 2)
+ * (c) NULL clauses WHERE (a IS NULL) AND (b IS NOT NULL)
+ * (d) OR clauses WHERE (a < 1) OR (b >= 2)
+ *
+ * It's possible to add more clauses, for example:
+ *
+ * (e) multi-var clauses WHERE (a > b)
+ *
+ * and so on. These are tasks for the future, not yet implemented.
+ *
+ *
+ * Estimating equality clauses
+ * ---------------------------
+ * When computing selectivity estimate for equality clauses
+ *
+ * (a = 1) AND (b = 2)
+ *
+ * we can do this estimate pretty exactly assuming that two conditions
+ * are met:
+ *
+ * (1) there's an equality condition on each attribute
+ *
+ * (2) we find a matching item in the MCV list
+ *
+ * In that case we know the MCV item represents all the tuples matching
+ * the clauses, and the selectivity estimate is complete. This is what
+ * we call 'full match'.
+ *
+ * When only (1) holds, but there's no matching MCV item, we don't know
+ * whether there are no such rows or just are not very frequent. We can
+ * however use the frequency of the least frequent MCV item as an upper
+ * bound for the selectivity.
+ *
+ * If the equality conditions match only a subset of the attributes
+ * the MCV list is built on, we can't get a full match - we may get
+ * multiple MCV items matching the clauses, and even if we get a
+ * single match, there may be rows that did not get into the MCV
+ * list. But in this case we can still use the frequency of the least
+ * frequent MCV item to clamp the 'additional' selectivity not
+ * accounted for by the matching items.
+ *
+ * If there's no histogram, because the MCV list approximates the
+ * distribution accurately (not because the histogram was disabled),
+ * it does not really matter whether there are equality conditions on
+ * all the columns - we can do pretty accurate estimation using the MCV.
+ *
+ * TODO For a combination of equality conditions (not full-match case)
+ * we probably can clamp the selectivity by the minimum of
+ * selectivities for each condition. For example if we know the
+ * number of distinct values for each column, we can use 1/ndistinct
+ * as a per-column estimate. Or rather 1/ndistinct + selectivity
+ * derived from the MCV list.
+ *
+ * If we know the estimate of number of combinations of the columns
+ * (i.e. ndistinct(A,B)), we may estimate the average frequency of
+ * items in the remaining 10% as [10% / ndistinct(A,B)].
+ *
+ *
+ * Bounding estimates
+ * ------------------
+ * In general the MCV lists may not provide estimates as accurate as
+ * for the full-match equality case, but may provide some useful
+ * lower/upper boundaries for the estimation error.
+ *
+ * With equality clauses we can do a few more tricks to narrow this
+ * error range (see the previous section and TODO), but with inequality
+ * clauses (or generally non-equality clauses), it's rather difficult.
+ * There's nothing like a 'full match' - we have to consider both the
+ * MCV items and the remaining part every time. We can't use the minimum
+ * selectivity of MCV items, as the clauses may match multiple items.
+ *
+ * For example with a MCV list on columns (A, B), covering 90% of the
+ * table (computed while building the MCV list), about ~10% of the table
+ * is not represented by the MCV list. So even if the conditions match
+ * all the remaining rows (not represented by the MCV items), we can't
+ * get selectivity higher than those 10%. We may use 1/2 the remaining
+ * selectivity as an estimate (minimizing average error).
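+ *
+ * For example, if the matching MCV items sum to 15% and the MCV list
+ * covers 90% of the table, the actual selectivity lies between 15%
+ * and 25%, and using half of the remaining 10% yields 20%.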
+ *
+ * TODO Most of these ideas (error limiting) are not yet implemented.
+ *
+ *
+ * General TODO
+ * ------------
+ *
+ * FIXME Use max_mcv_items from ALTER TABLE ADD STATISTICS command.
+ *
+ * TODO Add support for clauses referencing multiple columns (a < b).
+ *
+ * TODO It's possible to build a special case of MCV list, storing not
+ * the actual values but only a 32/64-bit hash. This is only useful
+ * for estimating equality clauses and for large varlena types,
+ * which are very impractical for plain MCV list because of size.
+ * But for those data types we really want just the equality
+ * clauses, so it's actually a good solution.
+ *
+ * TODO Currently there's no logic to consider building only a MCV list
+ * (and not building the histogram at all), except for doing this
+ * decision manually in ADD STATISTICS.
+ */
+
+/*
+ * Each serialized item needs to store (in this order):
+ *
+ * - indexes (ndim * sizeof(uint16))
+ * - null flags (ndim * sizeof(bool))
+ * - frequency (sizeof(double))
+ *
+ * So in total:
+ *
+ * ndim * (sizeof(uint16) + sizeof(bool)) + sizeof(double)
+ *
+ * e.g. 3 * (2 + 1) + 8 = 17 bytes for ndim = 3.
+ */
+#define ITEM_SIZE(ndims) \
+ (ndims * (sizeof(uint16) + sizeof(bool)) + sizeof(double))
+
+/* pointers into a flat serialized item of ITEM_SIZE(n) bytes */
+#define ITEM_INDEXES(item) ((uint16*)item)
+#define ITEM_NULLS(item,ndims) ((bool*)(ITEM_INDEXES(item) + ndims))
+#define ITEM_FREQUENCY(item,ndims) ((double*)(ITEM_NULLS(item,ndims) + ndims))
+
+/*
+ * Builds MCV list from sample rows, and removes rows represented by
+ * the MCV list from the sample (the number of remaining sample rows is
+ * returned by the numrows_filtered parameter).
+ *
+ * The method is quite simple - in short it does about these steps:
+ *
+ * (1) sort the data (default collation, '<' for the data type)
+ *
+ * (2) count distinct groups, decide how many to keep
+ *
+ * (3) build the MCV list using the threshold determined in (2)
+ *
+ * (4) remove rows represented by the MCV from the sample
+ *
+ * For more details, see the comments in the code.
+ *
+ * FIXME Single-dimensional MCV is sorted by frequency (descending). We
+ * should do that too, because when walking through the list we
+ * want to check the most frequent items first.
+ *
+ * TODO We're using Datum (8B) even for narrower data types (e.g. int4
+ * or float4). Maybe we could save some space here, but the bytea
+ * compression should handle it just fine.
+ *
+ * TODO This probably should not use the ndistinct directly (as
+ * computed from the sample), but rather an estimate of the number
+ * of distinct values in the table, no?
+ */
+MCVList
+build_mv_mcvlist(int numrows, HeapTuple *rows, int2vector *attrs,
+ VacAttrStats **stats, int *numrows_filtered)
+{
+ int i, j;
+ int numattrs = attrs->dim1;
+ int ndistinct = 0;
+ int mcv_threshold = 0;
+ int count = 0;
+ int nitems = 0;
+
+ MCVList mcvlist = NULL;
+
+ /* Sort by multiple columns (using array of SortSupport) */
+ MultiSortSupport mss = multi_sort_init(numattrs);
+
+ /*
+ * Preallocate space for all the items as a single chunk, and point
+ * the items to the appropriate parts of the array.
+ */
+ SortItem *items = (SortItem*)palloc0(numrows * sizeof(SortItem));
+ Datum *values = (Datum*)palloc0(sizeof(Datum) * numrows * numattrs);
+ bool *isnull = (bool*)palloc0(sizeof(bool) * numrows * numattrs);
+
+ /* keep all the rows by default (as if there was no MCV list) */
+ *numrows_filtered = numrows;
+
+ for (i = 0; i < numrows; i++)
+ {
+ items[i].values = &values[i * numattrs];
+ items[i].isnull = &isnull[i * numattrs];
+ }
+
+ /* load the values/null flags from sample rows */
+ for (j = 0; j < numrows; j++)
+ for (i = 0; i < numattrs; i++)
+ items[j].values[i] = heap_getattr(rows[j], attrs->values[i],
+ stats[i]->tupDesc, &items[j].isnull[i]);
+
+ /* prepare the sort functions for all the attributes */
+ for (i = 0; i < numattrs; i++)
+ multi_sort_add_dimension(mss, i, i, stats);
+
+ /* do the sort, using the multi-sort */
+ qsort_arg((void *) items, numrows, sizeof(SortItem),
+ multi_sort_compare, mss);
+
+ /*
+ * Count the number of distinct groups - just walk through the
+ * sorted list and count the number of key changes. We use this to
+ * determine the threshold (125% of the average frequency).
+ */
+ ndistinct = 1;
+ for (i = 1; i < numrows; i++)
+ if (multi_sort_compare(&items[i], &items[i-1], mss) != 0)
+ ndistinct += 1;
+
+ /*
+ * Determine how many groups actually exceed the threshold, and then
+ * walk the array again and collect them into an array. We'll always
+ * require at least 4 rows per group.
+ *
+ * But if we can fit all the distinct values in the MCV list (i.e.
+	 * if there are fewer distinct groups than MVSTAT_MCVLIST_MAX_ITEMS),
+ * we'll require only 2 rows per group.
+ *
+ * TODO For now the threshold is the same as in the single-column
+ * case (average + 25%), but maybe that's worth revisiting
+ * for the multivariate case.
+ *
+ * TODO We can do this only if we believe we got all the distinct
+ * values of the table.
+ *
+ * FIXME This should really reference mcv_max_items (from catalog)
+ * instead of the constant MVSTAT_MCVLIST_MAX_ITEMS.
+ */
+ mcv_threshold = 1.25 * numrows / ndistinct;
+ mcv_threshold = (mcv_threshold < 4) ? 4 : mcv_threshold;
+
+ if (ndistinct <= MVSTAT_MCVLIST_MAX_ITEMS)
+ mcv_threshold = 2;
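+
+	/*
+	 * For example, with numrows = 30000 and ndistinct = 100, the average
+	 * group has 300 rows, so the initial threshold is 1.25 * 300 = 375.
+	 */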
+
+ /*
+ * Walk through the sorted data again, and see how many groups
+ * reach the mcv_threshold (and become an item in the MCV list).
+ */
+ count = 1;
+ for (i = 1; i <= numrows; i++)
+ {
+ /* last row or new group, so check if we exceed mcv_threshold */
+ if ((i == numrows) || (multi_sort_compare(&items[i], &items[i-1], mss) != 0))
+ {
+ /* group hits the threshold, count the group as MCV item */
+ if (count >= mcv_threshold)
+ nitems += 1;
+
+ count = 1;
+ }
+ else /* within group, so increase the number of items */
+ count += 1;
+ }
+
+ /* we know the number of MCV list items, so let's build the list */
+ if (nitems > 0)
+ {
+ /* allocate the MCV list structure, set parameters we know */
+ mcvlist = (MCVList)palloc0(sizeof(MCVListData));
+
+ mcvlist->magic = MVSTAT_MCV_MAGIC;
+ mcvlist->type = MVSTAT_MCV_TYPE_BASIC;
+ mcvlist->ndimensions = numattrs;
+ mcvlist->nitems = nitems;
+
+ /*
+		 * Preallocate Datum/isnull arrays - not as a single chunk, as
+		 * we'll pass this outside this method and thus it needs to be
+		 * easy to pfree() the data (we wouldn't know where the arrays
+		 * start).
+ *
+ * TODO Maybe the reasoning that we can't allocate a single
+ * piece because we're passing it out is bogus? Who'd
+ * free a single item of the MCV list, anyway?
+ *
+ * TODO Maybe with a proper encoding (stuffing all the values
+ * into a list-level array, this will be untrue)?
+ */
+ mcvlist->items = (MCVItem*)palloc0(sizeof(MCVItem)*nitems);
+
+ for (i = 0; i < nitems; i++)
+ {
+ mcvlist->items[i] = (MCVItem)palloc0(sizeof(MCVItemData));
+ mcvlist->items[i]->values = (Datum*)palloc0(sizeof(Datum)*numattrs);
+ mcvlist->items[i]->isnull = (bool*)palloc0(sizeof(bool)*numattrs);
+ }
+
+ /*
+ * Repeat the same loop as above, but this time copy the data
+ * into the MCV list (for items exceeding the threshold).
+ *
+ * TODO Maybe we could simply remember indexes of the last item
+ * in each group (from the previous loop)?
+ */
+ count = 1;
+ nitems = 0;
+ for (i = 1; i <= numrows; i++)
+ {
+ /* last row or a new group */
+ if ((i == numrows) || (multi_sort_compare(&items[i], &items[i-1], mss) != 0))
+ {
+ /* count the MCV item if exceeding the threshold (and copy into the array) */
+ if (count >= mcv_threshold)
+ {
+ /* just pointer to the proper place in the list */
+ MCVItem item = mcvlist->items[nitems];
+
+					/* copy values from the last item of the _previous_ group */
+ memcpy(item->values, items[(i-1)].values, sizeof(Datum) * numattrs);
+ memcpy(item->isnull, items[(i-1)].isnull, sizeof(bool) * numattrs);
+
+ /* and finally the group frequency */
+ item->frequency = (double)count / numrows;
+
+ /* next item */
+ nitems += 1;
+ }
+
+ count = 1;
+ }
+ else /* same group, just increase the number of items */
+ count += 1;
+ }
+
+ /* make sure the loops are consistent */
+ Assert(nitems == mcvlist->nitems);
+
+ /*
+ * Remove the rows matching the MCV list (i.e. keep only rows
+ * that are not represented by the MCV list).
+ *
+ * FIXME This implementation is rather naive, effectively O(N^2).
+ * As the MCV list grows, the check will take longer and
+ * longer. And as the number of sampled rows increases (by
+ * increasing statistics target), it will take longer and
+ * longer. One option is to sort the MCV items first and
+ * then perform a binary search.
+ *
+ * A better option would be keeping the ID of the row in
+ * the sort item, and then just walk through the items and
+ * mark rows to remove (in a bitmap of the same size).
+ * There's not space for that in SortItem at this moment,
+ * but it's trivial to add 'private' pointer, or just
+ * using another structure with extra field (starting with
+ * SortItem, so that the comparators etc. still work).
+ *
+ * Another option is to use the sorted array of items
+ * (because that's how we sorted the source data), and
+ * simply do a bsearch() into it. If we find a matching
+ * item, the row belongs to the MCV list.
+ */
+ if (nitems == ndistinct) /* all rows are covered by MCV items */
+ *numrows_filtered = 0;
+ else /* (nitems < ndistinct) && (nitems > 0) */
+ {
+ int nfiltered = 0;
+ HeapTuple *rows_filtered = (HeapTuple*)palloc0(sizeof(HeapTuple) * numrows);
+
+ /* used for the searches */
+			SortItem item, mcvitem;
+
+ item.values = (Datum*)palloc0(numattrs * sizeof(Datum));
+ item.isnull = (bool*)palloc0(numattrs * sizeof(bool));
+
+ /*
+ * FIXME we don't need to allocate this, we can reference
+ * the MCV item directly ...
+ */
+ mcvitem.values = (Datum*)palloc0(numattrs * sizeof(Datum));
+ mcvitem.isnull = (bool*)palloc0(numattrs * sizeof(bool));
+
+ /* walk through the tuples, compare the values to MCV items */
+ for (i = 0; i < numrows; i++)
+ {
+ bool match = false;
+
+ /* collect the key values from the row */
+ for (j = 0; j < numattrs; j++)
+ item.values[j] = heap_getattr(rows[i], attrs->values[j],
+ stats[j]->tupDesc, &item.isnull[j]);
+
+ /* scan through the MCV list for matches */
+ for (j = 0; j < mcvlist->nitems; j++)
+ {
+ /*
+ * TODO Create a SortItem/MCVItem comparator so that
+ * we don't need to do memcpy() like crazy.
+ */
+ memcpy(mcvitem.values, mcvlist->items[j]->values,
+ numattrs * sizeof(Datum));
+ memcpy(mcvitem.isnull, mcvlist->items[j]->isnull,
+ numattrs * sizeof(bool));
+
+ if (multi_sort_compare(&item, &mcvitem, mss) == 0)
+ {
+ match = true;
+ break;
+ }
+ }
+
+ /* if no match in the MCV list, copy the row into the filtered ones */
+ if (! match)
+ memcpy(&rows_filtered[nfiltered++], &rows[i], sizeof(HeapTuple));
+ }
+
+ /* replace the rows and remember how many rows we kept */
+ memcpy(rows, rows_filtered, sizeof(HeapTuple) * nfiltered);
+ *numrows_filtered = nfiltered;
+
+ /* free all the data used here */
+ pfree(rows_filtered);
+ pfree(item.values);
+ pfree(item.isnull);
+ pfree(mcvitem.values);
+ pfree(mcvitem.isnull);
+ }
+ }
+
+ pfree(values);
+ pfree(items);
+ pfree(isnull);
+
+ return mcvlist;
+}
+
+
+/* fetch the MCV list (as a bytea) from the pg_mv_statistic catalog */
+MCVList
+load_mv_mcvlist(Oid mvoid)
+{
+ bool isnull = false;
+ Datum mcvlist;
+
+#ifdef USE_ASSERT_CHECKING
+ Form_pg_mv_statistic mvstat;
+#endif
+
+ /* Fetch the pg_mv_statistic tuple for the given statistics OID. */
+ HeapTuple htup = SearchSysCache1(MVSTATOID, ObjectIdGetDatum(mvoid));
+
+ if (! HeapTupleIsValid(htup))
+ return NULL;
+
+#ifdef USE_ASSERT_CHECKING
+ mvstat = (Form_pg_mv_statistic) GETSTRUCT(htup);
+ Assert(mvstat->mcv_enabled && mvstat->mcv_built);
+#endif
+
+ mcvlist = SysCacheGetAttr(MVSTATOID, htup,
+ Anum_pg_mv_statistic_stamcv, &isnull);
+
+ Assert(!isnull);
+
+ ReleaseSysCache(htup);
+
+ return deserialize_mv_mcvlist(DatumGetByteaP(mcvlist));
+}
+
+/* print some basic info about the MCV list
+ *
+ * TODO Add info about what part of the table this covers.
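+ *
+ * Typically called on the stamcv column, as in the regression
+ * tests:
+ *
+ *     SELECT pg_mv_stats_mcvlist_info(stamcv)
+ *       FROM pg_mv_statistic;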
+ */
+Datum
+pg_mv_stats_mcvlist_info(PG_FUNCTION_ARGS)
+{
+ bytea *data = PG_GETARG_BYTEA_P(0);
+ char *result;
+
+ MCVList mcvlist = deserialize_mv_mcvlist(data);
+
+ result = palloc0(128);
+ snprintf(result, 128, "nitems=%d", mcvlist->nitems);
+
+ pfree(mcvlist);
+
+ PG_RETURN_TEXT_P(cstring_to_text(result));
+}
+
+/* used to pass context into bsearch() */
+static SortSupport ssup_private = NULL;
+
+static int bsearch_comparator(const void * a, const void * b);
+
+/*
+ * Serialize MCV list into a bytea value. The basic algorithm is simple:
+ *
+ * (1) perform deduplication for each attribute (separately)
+ * (a) collect all (non-NULL) attribute values from all MCV items
+ * (b) sort the data (using 'lt' from VacAttrStats)
+ * (c) remove duplicate values from the array
+ *
+ * (2) serialize the arrays into a bytea value
+ *
+ * (3) process all MCV list items
+ * (a) replace values with indexes into the arrays
+ *
+ * Each attribute has to be processed separately, because we're mixing
+ * different datatypes, and we don't know what equality means for them.
+ * We're also mixing pass-by-value and pass-by-ref types, and so on.
+ *
+ * We'll use uint16 values for the indexes in step (3), as we don't
+ * allow more than 8k MCV items (see MVSTAT_MCVLIST_MAX_ITEMS),
+ * although a uint16 could index up to 65535 items.
+ *
+ * We don't really expect compression as high as with histograms,
+ * because we're not doing any bucket splits etc. (which is the source
+ * of high redundancy there), but we need to do it anyway as we need
+ * to serialize varlena values etc. We might invent another way to
+ * serialize MCV lists, but let's keep it consistent.
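+ *
+ * As a small (made-up) example, with ndims=2 and MCV items
+ *
+ *     {1, 'a'}   frequency 0.50
+ *     {1, 'b'}   frequency 0.30
+ *     {2, 'a'}   frequency 0.20
+ *
+ * step (1) produces the deduplicated arrays [1, 2] and ['a', 'b'],
+ * and step (3) stores the items as the index pairs (0,0), (0,1)
+ * and (1,0), each with its null flags and frequency.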
+ *
+ * FIXME This probably leaks memory, or at least uses it inefficiently
+ * (many small palloc() calls instead of a large one).
+ *
+ * TODO Consider packing boolean flags (NULL) for each item into 'char'
+ * or a longer type (instead of using an array of bool items).
+ */
+bytea *
+serialize_mv_mcvlist(MCVList mcvlist, int2vector *attrs,
+ VacAttrStats **stats)
+{
+ int i, j;
+ int ndims = mcvlist->ndimensions;
+ int itemsize = ITEM_SIZE(ndims);
+
+ Size total_length = 0;
+
+ char *item = palloc0(itemsize);
+
+ /* serialized items (indexes into arrays, etc.) */
+ bytea *output;
+ char *data = NULL;
+
+ /* values per dimension (and number of non-NULL values) */
+ Datum **values = (Datum**)palloc0(sizeof(Datum*) * ndims);
+ int *counts = (int*)palloc0(sizeof(int) * ndims);
+
+ /* info about dimensions (for deserialize) */
+ DimensionInfo * info
+ = (DimensionInfo *)palloc0(sizeof(DimensionInfo)*ndims);
+
+ /* sort support data */
+ SortSupport ssup = (SortSupport)palloc0(sizeof(SortSupportData)*ndims);
+
+ /* collect and deduplicate values for each dimension */
+ for (i = 0; i < ndims; i++)
+ {
+ int count;
+ StdAnalyzeData *tmp = (StdAnalyzeData *)stats[i]->extra_data;
+
+ /* keep important info about the data type */
+ info[i].typlen = stats[i]->attrtype->typlen;
+ info[i].typbyval = stats[i]->attrtype->typbyval;
+
+ /* allocate space for all values, including NULLs (won't use them) */
+ values[i] = (Datum*)palloc0(sizeof(Datum) * mcvlist->nitems);
+
+ for (j = 0; j < mcvlist->nitems; j++)
+ {
+ if (! mcvlist->items[j]->isnull[i]) /* skip NULL values */
+ {
+ values[i][counts[i]] = mcvlist->items[j]->values[i];
+ counts[i] += 1;
+ }
+ }
+
+ /* there are just NULL values in this dimension */
+ if (counts[i] == 0)
+ continue;
+
+ /* sort and deduplicate */
+ ssup[i].ssup_cxt = CurrentMemoryContext;
+ ssup[i].ssup_collation = DEFAULT_COLLATION_OID;
+ ssup[i].ssup_nulls_first = false;
+
+ PrepareSortSupportFromOrderingOp(tmp->ltopr, &ssup[i]);
+
+ qsort_arg(values[i], counts[i], sizeof(Datum),
+ compare_scalars_simple, &ssup[i]);
+
+ /*
+ * Walk through the array and eliminate duplicate values, but
+ * keep the ordering (so that we can do bsearch later). We know
+ * there's at least 1 item, so we can skip the first element.
+ */
+ count = 1; /* number of deduplicated items */
+ for (j = 1; j < counts[i]; j++)
+ {
+ /* if it's different from the previous value, we need to keep it */
+ if (compare_datums_simple(values[i][j-1], values[i][j], &ssup[i]) != 0)
+ {
+ /* XXX: not needed if (count == j) */
+ values[i][count] = values[i][j];
+ count += 1;
+ }
+ }
+
+ /* do not exceed UINT16_MAX */
+ Assert(count <= UINT16_MAX);
+
+ /* keep info about the deduplicated count */
+ info[i].nvalues = count;
+
+ /* compute size of the serialized data */
+ if (info[i].typbyval || (info[i].typlen > 0))
+ /* passed by value, or by reference with a fixed length */
+ info[i].nbytes = info[i].nvalues * info[i].typlen;
+ else if (info[i].typlen == -1)
+ /* varlena, so just use VARSIZE_ANY */
+ for (j = 0; j < info[i].nvalues; j++)
+ info[i].nbytes += VARSIZE_ANY(values[i][j]);
+ else if (info[i].typlen == -2)
+ /* cstring, so strlen + 1 byte for the \0 terminator (as serialized) */
+ for (j = 0; j < info[i].nvalues; j++)
+ info[i].nbytes += strlen(DatumGetPointer(values[i][j])) + 1;
+ else
+ elog(ERROR, "unknown data type typbyval=%d typlen=%d",
+ info[i].typbyval, info[i].typlen);
+ }
+
+ /*
+ * Now we finally know how much space we'll need for the serialized
+ * MCV list, as it contains these fields:
+ *
+ * - length (4B) for varlena
+ * - magic (4B)
+ * - type (4B)
+ * - ndimensions (4B)
+ * - nitems (4B)
+ * - info (ndim * sizeof(DimensionInfo))
+ * - arrays of values for each dimension
+ * - serialized items (nitems * itemsize)
+ *
+ * So the 'header' size is 20B + ndim * sizeof(DimensionInfo) and
+ * then we'll place the data.
+ */
+ total_length = (sizeof(int32) + offsetof(MCVListData, items)
+ + ndims * sizeof(DimensionInfo)
+ + mcvlist->nitems * itemsize);
+
+ for (i = 0; i < ndims; i++)
+ total_length += info[i].nbytes;
+
+ /* enforce arbitrary limit of 1MB */
+ if (total_length > 1024 * 1024)
+ elog(ERROR, "serialized MCV exceeds 1MB (%ld)", total_length);
+
+ /* allocate space for the serialized MCV list, set header fields */
+ output = (bytea*)palloc0(total_length);
+ SET_VARSIZE(output, total_length);
+
+ /* we'll use 'data' to keep track of the place to write to */
+ data = VARDATA(output);
+
+ memcpy(data, mcvlist, offsetof(MCVListData, items));
+ data += offsetof(MCVListData, items);
+
+ memcpy(data, info, sizeof(DimensionInfo) * ndims);
+ data += sizeof(DimensionInfo) * ndims;
+
+ /* value array for each dimension */
+ for (i = 0; i < ndims; i++)
+ {
+#ifdef USE_ASSERT_CHECKING
+ char *tmp = data;
+#endif
+ for (j = 0; j < info[i].nvalues; j++)
+ {
+ if (info[i].typbyval)
+ {
+ /* passed by value / Datum */
+ memcpy(data, &values[i][j], info[i].typlen);
+ data += info[i].typlen;
+ }
+ else if (info[i].typlen > 0)
+ {
+ /* passed by reference, but fixed length (name, tid, ...) */
+ memcpy(data, &values[i][j], info[i].typlen);
+ data += info[i].typlen;
+ }
+ else if (info[i].typlen == -1)
+ {
+ /* varlena */
+ memcpy(data, DatumGetPointer(values[i][j]),
+ VARSIZE_ANY(values[i][j]));
+ data += VARSIZE_ANY(values[i][j]);
+ }
+ else if (info[i].typlen == -2)
+ {
+ /* cstring (don't forget the \0 terminator!) */
+ memcpy(data, DatumGetPointer(values[i][j]),
+ strlen(DatumGetPointer(values[i][j])) + 1);
+ data += strlen(DatumGetPointer(values[i][j])) + 1;
+ }
+ }
+ Assert((data - tmp) == info[i].nbytes);
+ }
+
+ /* and finally, the MCV items */
+ for (i = 0; i < mcvlist->nitems; i++)
+ {
+ /* don't write beyond the allocated space */
+ Assert(data <= (char*)output + total_length - itemsize);
+
+ /* reset the values for each item */
+ memset(item, 0, itemsize);
+
+ for (j = 0; j < ndims; j++)
+ {
+ /* do the lookup only for non-NULL values */
+ if (! mcvlist->items[i]->isnull[j])
+ {
+ Datum * v = NULL;
+ ssup_private = &ssup[j];
+
+ v = (Datum*)bsearch(&mcvlist->items[i]->values[j],
+ values[j], info[j].nvalues, sizeof(Datum),
+ bsearch_comparator);
+
+ if (v == NULL)
+ elog(ERROR, "value for dim %d not found in array", j);
+
+ /* compute index within the array */
+ ITEM_INDEXES(item)[j] = (v - values[j]);
+
+ /* check the index is within expected bounds */
+ Assert(ITEM_INDEXES(item)[j] >= 0);
+ Assert(ITEM_INDEXES(item)[j] < info[j].nvalues);
+ }
+ }
+
+ /* copy the null flags and the frequency into the item */
+ memcpy(ITEM_NULLS(item, ndims),
+ mcvlist->items[i]->isnull, sizeof(bool) * ndims);
+ memcpy(ITEM_FREQUENCY(item, ndims),
+ &mcvlist->items[i]->frequency, sizeof(double));
+
+ /* copy the item into the array */
+ memcpy(data, item, itemsize);
+
+ data += itemsize;
+ }
+
+ /* at this point we expect to match the total_length exactly */
+ Assert((data - (char*)output) == total_length);
+
+ return output;
+}
+
+/*
+ * Inverse to serialize_mv_mcvlist() - see the comment there.
+ *
+ * We'll do full deserialization, because we don't really expect high
+ * duplication of values, so caching would not be as efficient as with
+ * histograms.
+ */
+MCVList
+deserialize_mv_mcvlist(bytea * data)
+{
+ int i, j;
+ Size expected_size;
+ MCVList mcvlist;
+ char *tmp;
+
+ int ndims, nitems, itemsize;
+ DimensionInfo *info = NULL;
+
+ uint16 *indexes = NULL;
+ Datum **values = NULL;
+
+ /* local allocation buffer (used only for deserialization) */
+ int bufflen;
+ char *buff;
+ char *ptr;
+
+ /* buffer used for the result */
+ int rbufflen;
+ char *rbuff;
+ char *rptr;
+
+ if (data == NULL)
+ return NULL;
+
+ if (VARSIZE_ANY_EXHDR(data) < offsetof(MCVListData,items))
+ elog(ERROR, "invalid MCV Size %ld (expected at least %ld)",
+ VARSIZE_ANY_EXHDR(data), offsetof(MCVListData,items));
+
+ /* read the MCV list header */
+ mcvlist = (MCVList)palloc0(sizeof(MCVListData));
+
+ /* initialize pointer to the data part (skip the varlena header) */
+ tmp = VARDATA(data);
+
+ /* get the header and perform basic sanity checks */
+ memcpy(mcvlist, tmp, offsetof(MCVListData,items));
+ tmp += offsetof(MCVListData,items);
+
+ if (mcvlist->magic != MVSTAT_MCV_MAGIC)
+ elog(ERROR, "invalid MCV magic %d (expected %dd)",
+ mcvlist->magic, MVSTAT_MCV_MAGIC);
+
+ if (mcvlist->type != MVSTAT_MCV_TYPE_BASIC)
+ elog(ERROR, "invalid MCV type %d (expected %dd)",
+ mcvlist->type, MVSTAT_MCV_TYPE_BASIC);
+
+ nitems = mcvlist->nitems;
+ ndims = mcvlist->ndimensions;
+ itemsize = ITEM_SIZE(ndims);
+
+ Assert(nitems > 0);
+ Assert((ndims >= 2) && (ndims <= MVSTATS_MAX_DIMENSIONS));
+
+ /*
+ * What size do we expect with these parameters? It's incomplete,
+ * as we have yet to add the sizes of the value arrays (from the
+ * DimensionInfo records).
+ */
+ expected_size = offsetof(MCVListData,items) +
+ ndims * sizeof(DimensionInfo) +
+ (nitems * itemsize);
+
+ /* check that we have at least the DimensionInfo records */
+ if (VARSIZE_ANY_EXHDR(data) < expected_size)
+ elog(ERROR, "invalid MCV Size %ld (expected %ld)",
+ VARSIZE_ANY_EXHDR(data), expected_size);
+
+ info = (DimensionInfo*)(tmp);
+ tmp += ndims * sizeof(DimensionInfo);
+
+ /* account for the value arrays */
+ for (i = 0; i < ndims; i++)
+ expected_size += info[i].nbytes;
+
+ if (VARSIZE_ANY_EXHDR(data) != expected_size)
+ elog(ERROR, "invalid MCV Size %ld (expected %ld)",
+ VARSIZE_ANY_EXHDR(data), expected_size);
+
+ /* looks OK - not corrupted or something */
+
+ /*
+ * We'll allocate one large chunk of memory for the intermediate
+ * data, needed only for deserializing the MCV list, and we'll use
+ * a local dense allocation within it to minimize palloc overhead.
+ *
+ * Let's see how much space we'll actually need, and also include
+ * space for the array with pointers.
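+ *
+ * The buffer layout is simply:
+ *
+ *     Datum *values[ndims]    (pointers to per-dimension arrays)
+ *     Datum values of dim 0   (skipped when reusing serialized data)
+ *     ...
+ *     Datum values of dim (ndims-1)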
+ */
+ bufflen = sizeof(Datum*) * ndims; /* space for pointers */
+
+ for (i = 0; i < ndims; i++)
+ /* for full-size byval types, we reuse the serialized value */
+ if (! (info[i].typbyval && info[i].typlen == sizeof(Datum)))
+ bufflen += (sizeof(Datum) * info[i].nvalues);
+
+ buff = palloc(bufflen);
+ ptr = buff;
+
+ values = (Datum**)buff;
+ ptr += (sizeof(Datum*) * ndims);
+
+ /*
+ * FIXME This uses pointers into the original data array (for types
+ * not passed by value), so if someone frees the memory,
+ * e.g. by doing something like this:
+ *
+ * bytea * data = ... fetch the data from catalog ...
+ * MCVList mcvlist = deserialize_mv_mcvlist(data);
+ * pfree(data);
+ *
+ * then 'mcvlist' references the freed memory. This needs to
+ * copy the pieces.
+ */
+ for (i = 0; i < ndims; i++)
+ {
+ if (info[i].typbyval)
+ {
+ /* passed by value / Datum - simply reuse the array */
+ if (info[i].typlen == sizeof(Datum))
+ {
+ values[i] = (Datum*)tmp;
+ tmp += info[i].nbytes;
+ }
+ else
+ {
+ values[i] = (Datum*)ptr;
+ ptr += (sizeof(Datum) * info[i].nvalues);
+
+ for (j = 0; j < info[i].nvalues; j++)
+ {
+ /* copy the value from the serialized array */
+ memcpy(&values[i][j], tmp, info[i].typlen);
+ tmp += info[i].typlen;
+ }
+ }
+ }
+ else
+ {
+ /* all the by-reference data need a chunk from the buffer */
+ values[i] = (Datum*)ptr;
+ ptr += (sizeof(Datum) * info[i].nvalues);
+
+ /* passed by reference, but fixed length (name, tid, ...) */
+ if (info[i].typlen > 0)
+ {
+ for (j = 0; j < info[i].nvalues; j++)
+ {
+ /* just point into the array */
+ values[i][j] = PointerGetDatum(tmp);
+ tmp += info[i].typlen;
+ }
+ }
+ else if (info[i].typlen == -1)
+ {
+ /* varlena */
+ for (j = 0; j < info[i].nvalues; j++)
+ {
+ /* just point into the array */
+ values[i][j] = PointerGetDatum(tmp);
+ tmp += VARSIZE_ANY(tmp);
+ }
+ }
+ else if (info[i].typlen == -2)
+ {
+ /* cstring */
+ for (j = 0; j < info[i].nvalues; j++)
+ {
+ /* just point into the array */
+ values[i][j] = PointerGetDatum(tmp);
+ tmp += (strlen(tmp) + 1); /* don't forget the \0 */
+ }
+ }
+ }
+ }
+
+ /* we should exhaust the buffer exactly */
+ Assert((ptr - buff) == bufflen);
+
+ /* allocate space for the MCV items in a single piece */
+ rbufflen = (sizeof(MCVItem) + sizeof(MCVItemData) +
+ sizeof(Datum)*ndims + sizeof(bool)*ndims) * nitems;
+
+ rbuff = palloc(rbufflen);
+ rptr = rbuff;
+
+ mcvlist->items = (MCVItem*)rbuff;
+ rptr += (sizeof(MCVItem) * nitems);
+
+ for (i = 0; i < nitems; i++)
+ {
+ MCVItem item = (MCVItem)rptr;
+ rptr += (sizeof(MCVItemData));
+
+ item->values = (Datum*)rptr;
+ rptr += (sizeof(Datum)*ndims);
+
+ item->isnull = (bool*)rptr;
+ rptr += (sizeof(bool) *ndims);
+
+ /* just point to the right place */
+ indexes = ITEM_INDEXES(tmp);
+
+ memcpy(item->isnull, ITEM_NULLS(tmp, ndims), sizeof(bool) * ndims);
+ memcpy(&item->frequency, ITEM_FREQUENCY(tmp, ndims), sizeof(double));
+
+#ifdef USE_ASSERT_CHECKING
+ for (j = 0; j < ndims; j++)
+ Assert(indexes[j] <= UINT16_MAX);
+#endif
+
+ /* translate the values */
+ for (j = 0; j < ndims; j++)
+ if (! item->isnull[j])
+ item->values[j] = values[j][indexes[j]];
+
+ mcvlist->items[i] = item;
+
+ tmp += ITEM_SIZE(ndims);
+
+ Assert(tmp <= (char*)data + VARSIZE_ANY(data));
+ }
+
+ /* check that we processed all the data */
+ Assert(tmp == (char*)data + VARSIZE_ANY(data));
+
+ /* release the temporary buffer */
+ pfree(buff);
+
+ return mcvlist;
+}
+
+/*
+ * We need to pass the SortSupport to the comparator, but bsearch()
+ * has no 'context' parameter, so we use a global variable (ugly).
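+ *
+ * Callers have to set it right before the bsearch() call, the way
+ * serialize_mv_mcvlist() does ('key' being the Datum to look up):
+ *
+ *     ssup_private = &ssup[j];
+ *     v = (Datum*)bsearch(&key, values[j], info[j].nvalues,
+ *                         sizeof(Datum), bsearch_comparator);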
+ */
+static int
+bsearch_comparator(const void * a, const void * b)
+{
+ Assert(ssup_private != NULL);
+ return compare_scalars_simple(a, b, (void*)ssup_private);
+}
+
+/*
+ * SRF with details about items of an MCV list:
+ *
+ * - item ID (0...nitems)
+ * - values (string array)
+ * - null flags (boolean array)
+ * - frequency (double precision)
+ *
+ * The input is the OID of the statistics, and there are no rows
+ * returned if the statistics contains no MCV list.
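+ *
+ * Example usage (the OID has to reference a pg_mv_statistic row
+ * with a built MCV list):
+ *
+ *     SELECT * FROM pg_mv_mcv_items(
+ *         (SELECT oid FROM pg_mv_statistic WHERE staname = 's1'));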
+ */
+PG_FUNCTION_INFO_V1(pg_mv_mcv_items);
+
+Datum
+pg_mv_mcv_items(PG_FUNCTION_ARGS)
+{
+ FuncCallContext *funcctx;
+ int call_cntr;
+ int max_calls;
+ TupleDesc tupdesc;
+ AttInMetadata *attinmeta;
+
+ /* stuff done only on the first call of the function */
+ if (SRF_IS_FIRSTCALL())
+ {
+ MemoryContext oldcontext;
+ MCVList mcvlist;
+
+ /* create a function context for cross-call persistence */
+ funcctx = SRF_FIRSTCALL_INIT();
+
+ /* switch to memory context appropriate for multiple function calls */
+ oldcontext = MemoryContextSwitchTo(funcctx->multi_call_memory_ctx);
+
+ mcvlist = load_mv_mcvlist(PG_GETARG_OID(0));
+
+ funcctx->user_fctx = mcvlist;
+
+ /* total number of tuples to be returned */
+ funcctx->max_calls = 0;
+ if (funcctx->user_fctx != NULL)
+ funcctx->max_calls = mcvlist->nitems;
+
+ /* Build a tuple descriptor for our result type */
+ if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("function returning record called in context "
+ "that cannot accept type record")));
+
+ /*
+ * generate attribute metadata needed later to produce tuples
+ * from raw C strings
+ */
+ attinmeta = TupleDescGetAttInMetadata(tupdesc);
+ funcctx->attinmeta = attinmeta;
+
+ MemoryContextSwitchTo(oldcontext);
+ }
+
+ /* stuff done on every call of the function */
+ funcctx = SRF_PERCALL_SETUP();
+
+ call_cntr = funcctx->call_cntr;
+ max_calls = funcctx->max_calls;
+ attinmeta = funcctx->attinmeta;
+
+ if (call_cntr < max_calls) /* do when there is more left to send */
+ {
+ char **values;
+ HeapTuple tuple;
+ Datum result;
+ int2vector *stakeys;
+ Oid relid;
+
+ char *buff = palloc0(1024);
+ char *format;
+
+ int i;
+
+ Oid *outfuncs;
+ FmgrInfo *fmgrinfo;
+
+ MCVList mcvlist;
+ MCVItem item;
+
+ mcvlist = (MCVList)funcctx->user_fctx;
+
+ Assert(call_cntr < mcvlist->nitems);
+
+ item = mcvlist->items[call_cntr];
+
+ stakeys = find_mv_attnums(PG_GETARG_OID(0), &relid);
+
+ /*
+ * Prepare a values array for building the returned tuple.
+ * This should be an array of C strings which will
+ * be processed later by the type input functions.
+ */
+ values = (char **) palloc(4 * sizeof(char *));
+
+ values[0] = (char *) palloc(64 * sizeof(char));
+
+ /* arrays */
+ values[1] = (char *) palloc0(1024 * sizeof(char));
+ values[2] = (char *) palloc0(1024 * sizeof(char));
+
+ /* frequency */
+ values[3] = (char *) palloc(64 * sizeof(char));
+
+ outfuncs = (Oid*)palloc0(sizeof(Oid) * mcvlist->ndimensions);
+ fmgrinfo = (FmgrInfo*)palloc0(sizeof(FmgrInfo) * mcvlist->ndimensions);
+
+ for (i = 0; i < mcvlist->ndimensions; i++)
+ {
+ bool isvarlena;
+
+ getTypeOutputInfo(get_atttype(relid, stakeys->values[i]),
+ &outfuncs[i], &isvarlena);
+
+ fmgr_info(outfuncs[i], &fmgrinfo[i]);
+ }
+
+ snprintf(values[0], 64, "%d", call_cntr); /* item ID */
+
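+ /*
+ * Build the text arrays, e.g. "{1, 2, 3}" for the values and
+ * "{f, f, f}" for the NULL flags, one element per dimension.
+ */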
+ for (i = 0; i < mcvlist->ndimensions; i++)
+ {
+ Datum val, valout;
+
+ format = "%s, %s";
+ if (i == 0)
+ format = "{%s%s";
+ else if (i == mcvlist->ndimensions-1)
+ format = "%s, %s}";
+
+ val = item->values[i];
+ valout = FunctionCall1(&fmgrinfo[i], val);
+
+ snprintf(buff, 1024, format, values[1], DatumGetPointer(valout));
+ strncpy(values[1], buff, 1023);
+ buff[0] = '\0';
+
+ snprintf(buff, 1024, format, values[2], item->isnull[i] ? "t" : "f");
+ strncpy(values[2], buff, 1023);
+ buff[0] = '\0';
+ }
+
+ snprintf(values[3], 64, "%f", item->frequency); /* frequency */
+
+ /* build a tuple */
+ tuple = BuildTupleFromCStrings(attinmeta, values);
+
+ /* make the tuple into a datum */
+ result = HeapTupleGetDatum(tuple);
+
+ /* clean up (this is not really necessary) */
+ pfree(values[0]);
+ pfree(values[1]);
+ pfree(values[2]);
+ pfree(values[3]);
+
+ pfree(values);
+
+ SRF_RETURN_NEXT(funcctx, result);
+ }
+ else /* do when there is no more left */
+ {
+ SRF_RETURN_DONE(funcctx);
+ }
+}
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index f6d60ad..cd0ed01 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -2109,8 +2109,9 @@ describeOneTableDetails(const char *schemaname,
{
printfPQExpBuffer(&buf,
"SELECT oid, staname, stakeys,\n"
- " deps_enabled,\n"
- " deps_built,\n"
+ " deps_enabled, mcv_enabled,\n"
+ " deps_built, mcv_built,\n"
+ " mcv_max_items,\n"
" (SELECT string_agg(attname::text,', ')\n"
" FROM ((SELECT unnest(stakeys) AS attnum) s\n"
" JOIN pg_attribute a ON (starelid = a.attrelid and a.attnum = s.attnum))) AS attnums\n"
@@ -2128,6 +2129,8 @@ describeOneTableDetails(const char *schemaname,
printTableAddFooter(&cont, _("Statistics:"));
for (i = 0; i < tuples; i++)
{
+ bool first = true;
+
printfPQExpBuffer(&buf, " ");
/* statistics name */
@@ -2135,10 +2138,22 @@ describeOneTableDetails(const char *schemaname,
/* options */
if (!strcmp(PQgetvalue(result, i, 3), "t"))
- appendPQExpBuffer(&buf, "(dependencies)");
+ {
+ appendPQExpBuffer(&buf, "(dependencies");
+ first = false;
+ }
+
+ if (!strcmp(PQgetvalue(result, i, 4), "t"))
+ {
+ if (! first)
+ appendPQExpBuffer(&buf, ", mcv");
+ else
+ appendPQExpBuffer(&buf, "(mcv");
+ first = false;
+ }
- appendPQExpBuffer(&buf, " ON (%s)",
- PQgetvalue(result, i, 7));
+ appendPQExpBuffer(&buf, ") ON (%s)",
+ PQgetvalue(result, i, 9));
printTableAddFooter(&cont, buf.data);
}
diff --git a/src/include/catalog/pg_mv_statistic.h b/src/include/catalog/pg_mv_statistic.h
index 8c33a92..7be6223 100644
--- a/src/include/catalog/pg_mv_statistic.h
+++ b/src/include/catalog/pg_mv_statistic.h
@@ -36,15 +36,21 @@ CATALOG(pg_mv_statistic,3381)
/* statistics requested to build */
bool deps_enabled; /* analyze dependencies? */
+ bool mcv_enabled; /* build MCV list? */
+
+ /* MCV size */
+ int32 mcv_max_items; /* max MCV items */
/* statistics that are available (if requested) */
bool deps_built; /* dependencies were built */
+ bool mcv_built; /* MCV list was built */
/* variable-length fields start here, but we allow direct access to stakeys */
int2vector stakeys; /* array of column keys */
#ifdef CATALOG_VARLEN
bytea stadeps; /* dependencies (serialized) */
+ bytea stamcv; /* MCV list (serialized) */
#endif
} FormData_pg_mv_statistic;
@@ -60,12 +66,17 @@ typedef FormData_pg_mv_statistic *Form_pg_mv_statistic;
* compiler constants for pg_attrdef
* ----------------
*/
-#define Natts_pg_mv_statistic 6
+
+#define Natts_pg_mv_statistic 10
#define Anum_pg_mv_statistic_starelid 1
#define Anum_pg_mv_statistic_staname 2
#define Anum_pg_mv_statistic_deps_enabled 3
-#define Anum_pg_mv_statistic_deps_built 4
-#define Anum_pg_mv_statistic_stakeys 5
-#define Anum_pg_mv_statistic_stadeps 6
+#define Anum_pg_mv_statistic_mcv_enabled 4
+#define Anum_pg_mv_statistic_mcv_max_items 5
+#define Anum_pg_mv_statistic_deps_built 6
+#define Anum_pg_mv_statistic_mcv_built 7
+#define Anum_pg_mv_statistic_stakeys 8
+#define Anum_pg_mv_statistic_stadeps 9
+#define Anum_pg_mv_statistic_stamcv 10
#endif /* PG_MV_STATISTIC_H */
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
index 85c638d..b16f2a9 100644
--- a/src/include/catalog/pg_proc.h
+++ b/src/include/catalog/pg_proc.h
@@ -2743,6 +2743,10 @@ DATA(insert OID = 3998 ( pg_mv_stats_dependencies_info PGNSP PGUID 12 1 0 0
DESCR("multivariate stats: functional dependencies info");
DATA(insert OID = 3999 ( pg_mv_stats_dependencies_show PGNSP PGUID 12 1 0 0 0 f f f f t f i s 1 0 25 "17" _null_ _null_ _null_ _null_ _null_ pg_mv_stats_dependencies_show _null_ _null_ _null_ ));
DESCR("multivariate stats: functional dependencies show");
+DATA(insert OID = 3376 ( pg_mv_stats_mcvlist_info PGNSP PGUID 12 1 0 0 0 f f f f t f i s 1 0 25 "17" _null_ _null_ _null_ _null_ _null_ pg_mv_stats_mcvlist_info _null_ _null_ _null_ ));
+DESCR("multi-variate statistics: MCV list info");
+DATA(insert OID = 3373 ( pg_mv_mcv_items PGNSP PGUID 12 1 1000 0 0 f f f f t t i s 1 0 2249 "26" "{26,23,1009,1000,701}" "{i,o,o,o,o}" "{oid,index,values,nulls,frequency}" _null_ _null_ pg_mv_mcv_items _null_ _null_ _null_ ));
+DESCR("details about MCV list items");
DATA(insert OID = 1928 ( pg_stat_get_numscans PGNSP PGUID 12 1 0 0 0 f f f f t f s r 1 0 20 "26" _null_ _null_ _null_ _null_ _null_ pg_stat_get_numscans _null_ _null_ _null_ ));
DESCR("statistics: number of scans done for table/index");
diff --git a/src/include/nodes/relation.h b/src/include/nodes/relation.h
index baa0c88..7f2dc8a 100644
--- a/src/include/nodes/relation.h
+++ b/src/include/nodes/relation.h
@@ -592,9 +592,11 @@ typedef struct MVStatisticInfo
/* enabled statistics */
bool deps_enabled; /* functional dependencies enabled */
+ bool mcv_enabled; /* MCV list enabled */
/* built/available statistics */
bool deps_built; /* functional dependencies built */
+ bool mcv_built; /* MCV list built */
/* columns in the statistics (attnums) */
int2vector *stakeys; /* attnums of the columns covered */
diff --git a/src/include/utils/mvstats.h b/src/include/utils/mvstats.h
index 02a7dda..b028192 100644
--- a/src/include/utils/mvstats.h
+++ b/src/include/utils/mvstats.h
@@ -50,30 +50,89 @@ typedef MVDependenciesData* MVDependencies;
#define MVSTAT_DEPS_TYPE_BASIC 1 /* basic dependencies type */
/*
+ * Multivariate MCV (most-common value) lists
+ *
+ * A straightforward extension of MCV items - i.e. a list (array) of
+ * combinations of attribute values, together with a frequency and
+ * null flags.
+ */
+typedef struct MCVItemData {
+ double frequency; /* frequency of this combination */
+ bool *isnull; /* flags of NULL values (up to 32 columns) */
+ Datum *values; /* variable-length (ndimensions) */
+} MCVItemData;
+
+typedef MCVItemData *MCVItem;
+
+/* multivariate MCV list - essentially an array of MCV items */
+typedef struct MCVListData {
+ uint32 magic; /* magic constant marker */
+ uint32 type; /* type of MCV list (BASIC) */
+ uint32 ndimensions; /* number of dimensions */
+ uint32 nitems; /* number of MCV items in the array */
+ MCVItem *items; /* array of MCV items */
+} MCVListData;
+
+typedef MCVListData *MCVList;
+
+/* used to flag stats serialized to bytea */
+#define MVSTAT_MCV_MAGIC 0xE1A651C2 /* marks serialized bytea */
+#define MVSTAT_MCV_TYPE_BASIC 1 /* basic MCV list type */
+
+/*
+ * Limits used for mcv_max_items option, i.e. we're always guaranteed
+ * to have space for at least MVSTAT_MCVLIST_MIN_ITEMS, and we cannot
+ * have more than MVSTAT_MCVLIST_MAX_ITEMS items.
+ *
+ * This is just a boundary for the 'max' threshold - the actual list
+ * may of course contain fewer items than MVSTAT_MCVLIST_MIN_ITEMS.
+ */
+#define MVSTAT_MCVLIST_MIN_ITEMS 128 /* min items in MCV list */
+#define MVSTAT_MCVLIST_MAX_ITEMS 8192 /* max items in MCV list */
+
+/*
* TODO Maybe fetching the histogram/MCV list separately is inefficient?
* Consider adding a single `fetch_stats` method, fetching all
* stats specified using flags (or something like that).
*/
MVDependencies load_mv_dependencies(Oid mvoid);
+MCVList load_mv_mcvlist(Oid mvoid);
bytea * serialize_mv_dependencies(MVDependencies dependencies);
+bytea * serialize_mv_mcvlist(MCVList mcvlist, int2vector *attrs,
+ VacAttrStats **stats);
/* deserialization of stats (serialization is private to analyze) */
MVDependencies deserialize_mv_dependencies(bytea * data);
+MCVList deserialize_mv_mcvlist(bytea * data);
+
+/*
+ * Returns index of the attribute number within the vector (i.e. a
+ * dimension within the stats).
+ */
+int mv_get_index(AttrNumber varattno, int2vector * stakeys);
+
+int2vector* find_mv_attnums(Oid mvoid, Oid *relid);
/* FIXME this probably belongs somewhere else (not to operations stats) */
extern Datum pg_mv_stats_dependencies_info(PG_FUNCTION_ARGS);
extern Datum pg_mv_stats_dependencies_show(PG_FUNCTION_ARGS);
+extern Datum pg_mv_stats_mcvlist_info(PG_FUNCTION_ARGS);
+extern Datum pg_mv_mcv_items(PG_FUNCTION_ARGS);
MVDependencies
-build_mv_dependencies(int numrows, HeapTuple *rows,
- int2vector *attrs,
- VacAttrStats **stats);
+build_mv_dependencies(int numrows, HeapTuple *rows, int2vector *attrs,
+ VacAttrStats **stats);
+
+MCVList
+build_mv_mcvlist(int numrows, HeapTuple *rows, int2vector *attrs,
+ VacAttrStats **stats, int *numrows_filtered);
void build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
- int natts, VacAttrStats **vacattrstats);
+ int natts, VacAttrStats **vacattrstats);
-void update_mv_stats(Oid relid, MVDependencies dependencies, int2vector *attrs);
+void update_mv_stats(Oid relid, MVDependencies dependencies, MCVList mcvlist,
+ int2vector *attrs, VacAttrStats **stats);
#endif
diff --git a/src/test/regress/expected/mv_mcv.out b/src/test/regress/expected/mv_mcv.out
new file mode 100644
index 0000000..4958390
--- /dev/null
+++ b/src/test/regress/expected/mv_mcv.out
@@ -0,0 +1,207 @@
+-- data type passed by value
+CREATE TABLE mcv_list (
+ a INT,
+ b INT,
+ c INT
+);
+-- unknown column
+CREATE STATISTICS s1 ON mcv_list (unknown_column) WITH (mcv);
+ERROR: column "unknown_column" referenced in statistics does not exist
+-- single column
+CREATE STATISTICS s1 ON mcv_list (a) WITH (mcv);
+ERROR: multivariate stats require 2 or more columns
+-- single column, duplicated
+CREATE STATISTICS s1 ON mcv_list (a, a) WITH (mcv);
+ERROR: duplicate column name in statistics definition
+-- two columns, one duplicated
+CREATE STATISTICS s1 ON mcv_list (a, a, b) WITH (mcv);
+ERROR: duplicate column name in statistics definition
+-- unknown option
+CREATE STATISTICS s1 ON mcv_list (a, b, c) WITH (unknown_option);
+ERROR: unrecognized STATISTICS option "unknown_option"
+-- missing MCV statistics
+CREATE STATISTICS s1 ON mcv_list (a, b, c) WITH (dependencies, max_mcv_items 200);
+ERROR: option 'mcv' is required by other options(s)
+-- invalid max_mcv_items value / too low
+CREATE STATISTICS s1 ON mcv_list (a, b, c) WITH (mcv, max_mcv_items 10);
+ERROR: max number of MCV items must be at least 128
+-- invalid max_mcv_items value / too high
+CREATE STATISTICS s1 ON mcv_list (a, b, c) WITH (mcv, max_mcv_items 10000);
+ERROR: max number of MCV items is 8192
+-- correct command
+CREATE STATISTICS s1 ON mcv_list (a, b, c) WITH (mcv);
+-- random data
+INSERT INTO mcv_list
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+ANALYZE mcv_list;
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+ mcv_enabled | mcv_built | pg_mv_stats_mcvlist_info
+-------------+-----------+--------------------------
+ t | f |
+(1 row)
+
+TRUNCATE mcv_list;
+-- a => b, a => c, b => c
+INSERT INTO mcv_list
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE mcv_list;
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+ mcv_enabled | mcv_built | pg_mv_stats_mcvlist_info
+-------------+-----------+--------------------------
+ t | t | nitems=1000
+(1 row)
+
+TRUNCATE mcv_list;
+-- a => b, a => c
+INSERT INTO mcv_list
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE mcv_list;
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+ mcv_enabled | mcv_built | pg_mv_stats_mcvlist_info
+-------------+-----------+--------------------------
+ t | t | nitems=1000
+(1 row)
+
+TRUNCATE mcv_list;
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO mcv_list
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX mcv_idx ON mcv_list (a, b);
+ANALYZE mcv_list;
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+ mcv_enabled | mcv_built | pg_mv_stats_mcvlist_info
+-------------+-----------+--------------------------
+ t | t | nitems=100
+(1 row)
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mcv_list WHERE a = 10 AND b = 5;
+ QUERY PLAN
+--------------------------------------------
+ Bitmap Heap Scan on mcv_list
+ Recheck Cond: ((a = 10) AND (b = 5))
+ -> Bitmap Index Scan on mcv_idx
+ Index Cond: ((a = 10) AND (b = 5))
+(4 rows)
+
+DROP TABLE mcv_list;
+-- varlena type (text)
+CREATE TABLE mcv_list (
+ a TEXT,
+ b TEXT,
+ c TEXT
+);
+CREATE STATISTICS s2 ON mcv_list (a, b, c) WITH (mcv);
+-- random data
+INSERT INTO mcv_list
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+ANALYZE mcv_list;
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+ mcv_enabled | mcv_built | pg_mv_stats_mcvlist_info
+-------------+-----------+--------------------------
+ t | f |
+(1 row)
+
+TRUNCATE mcv_list;
+-- a => b, a => c, b => c
+INSERT INTO mcv_list
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE mcv_list;
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+ mcv_enabled | mcv_built | pg_mv_stats_mcvlist_info
+-------------+-----------+--------------------------
+ t | t | nitems=1000
+(1 row)
+
+TRUNCATE mcv_list;
+-- a => b, a => c
+INSERT INTO mcv_list
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE mcv_list;
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+ mcv_enabled | mcv_built | pg_mv_stats_mcvlist_info
+-------------+-----------+--------------------------
+ t | t | nitems=1000
+(1 row)
+
+TRUNCATE mcv_list;
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO mcv_list
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX mcv_idx ON mcv_list (a, b);
+ANALYZE mcv_list;
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+ mcv_enabled | mcv_built | pg_mv_stats_mcvlist_info
+-------------+-----------+--------------------------
+ t | t | nitems=100
+(1 row)
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mcv_list WHERE a = '10' AND b = '5';
+ QUERY PLAN
+------------------------------------------------------------
+ Bitmap Heap Scan on mcv_list
+ Recheck Cond: ((a = '10'::text) AND (b = '5'::text))
+ -> Bitmap Index Scan on mcv_idx
+ Index Cond: ((a = '10'::text) AND (b = '5'::text))
+(4 rows)
+
+TRUNCATE mcv_list;
+-- check explain (expect bitmap index scan, not plain index scan) with NULLs
+INSERT INTO mcv_list
+ SELECT
+ (CASE WHEN i/10000 = 0 THEN NULL ELSE i/10000 END),
+ (CASE WHEN i/20000 = 0 THEN NULL ELSE i/20000 END),
+ (CASE WHEN i/40000 = 0 THEN NULL ELSE i/40000 END)
+ FROM generate_series(1,1000000) s(i);
+ANALYZE mcv_list;
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+ mcv_enabled | mcv_built | pg_mv_stats_mcvlist_info
+-------------+-----------+--------------------------
+ t | t | nitems=100
+(1 row)
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mcv_list WHERE a IS NULL AND b IS NULL;
+ QUERY PLAN
+---------------------------------------------------
+ Bitmap Heap Scan on mcv_list
+ Recheck Cond: ((a IS NULL) AND (b IS NULL))
+ -> Bitmap Index Scan on mcv_idx
+ Index Cond: ((a IS NULL) AND (b IS NULL))
+(4 rows)
+
+DROP TABLE mcv_list;
+-- NULL values (mix of int and text columns)
+CREATE TABLE mcv_list (
+ a INT,
+ b TEXT,
+ c INT,
+ d TEXT
+);
+CREATE STATISTICS s3 ON mcv_list (a, b, c, d) WITH (mcv);
+INSERT INTO mcv_list
+ SELECT
+ mod(i, 100),
+ (CASE WHEN mod(i, 200) = 0 THEN NULL ELSE mod(i,200) END),
+ mod(i, 400),
+ (CASE WHEN mod(i, 300) = 0 THEN NULL ELSE mod(i,600) END)
+ FROM generate_series(1,10000) s(i);
+ANALYZE mcv_list;
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+ mcv_enabled | mcv_built | pg_mv_stats_mcvlist_info
+-------------+-----------+--------------------------
+ t | t | nitems=1200
+(1 row)
+
+DROP TABLE mcv_list;
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 428b1e8..50715db 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1369,7 +1369,9 @@ pg_mv_stats| SELECT n.nspname AS schemaname,
c.relname AS tablename,
s.stakeys AS attnums,
length(s.stadeps) AS depsbytes,
- pg_mv_stats_dependencies_info(s.stadeps) AS depsinfo
+ pg_mv_stats_dependencies_info(s.stadeps) AS depsinfo,
+ length(s.stamcv) AS mcvbytes,
+ pg_mv_stats_mcvlist_info(s.stamcv) AS mcvinfo
FROM ((pg_mv_statistic s
JOIN pg_class c ON ((c.oid = s.starelid)))
LEFT JOIN pg_namespace n ON ((n.oid = c.relnamespace)));
diff --git a/src/test/regress/parallel_schedule b/src/test/regress/parallel_schedule
index 81484f1..838c12b 100644
--- a/src/test/regress/parallel_schedule
+++ b/src/test/regress/parallel_schedule
@@ -112,4 +112,4 @@ test: event_trigger
test: stats
# run tests of multivariate stats
-test: mv_dependencies
+test: mv_dependencies mv_mcv
diff --git a/src/test/regress/serial_schedule b/src/test/regress/serial_schedule
index 14ea574..d97a0ec 100644
--- a/src/test/regress/serial_schedule
+++ b/src/test/regress/serial_schedule
@@ -162,3 +162,4 @@ test: xml
test: event_trigger
test: stats
test: mv_dependencies
+test: mv_mcv
diff --git a/src/test/regress/sql/mv_mcv.sql b/src/test/regress/sql/mv_mcv.sql
new file mode 100644
index 0000000..16d82cf
--- /dev/null
+++ b/src/test/regress/sql/mv_mcv.sql
@@ -0,0 +1,178 @@
+-- data type passed by value
+CREATE TABLE mcv_list (
+ a INT,
+ b INT,
+ c INT
+);
+
+-- unknown column
+CREATE STATISTICS s1 ON mcv_list (unknown_column) WITH (mcv);
+
+-- single column
+CREATE STATISTICS s1 ON mcv_list (a) WITH (mcv);
+
+-- single column, duplicated
+CREATE STATISTICS s1 ON mcv_list (a, a) WITH (mcv);
+
+-- two columns, one duplicated
+CREATE STATISTICS s1 ON mcv_list (a, a, b) WITH (mcv);
+
+-- unknown option
+CREATE STATISTICS s1 ON mcv_list (a, b, c) WITH (unknown_option);
+
+-- missing MCV statistics
+CREATE STATISTICS s1 ON mcv_list (a, b, c) WITH (dependencies, max_mcv_items 200);
+
+-- invalid max_mcv_items value / too low
+CREATE STATISTICS s1 ON mcv_list (a, b, c) WITH (mcv, max_mcv_items 10);
+
+-- invalid max_mcv_items value / too high
+CREATE STATISTICS s1 ON mcv_list (a, b, c) WITH (mcv, max_mcv_items 10000);
+
+-- correct command
+CREATE STATISTICS s1 ON mcv_list (a, b, c) WITH (mcv);
+
+-- random data
+INSERT INTO mcv_list
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+
+ANALYZE mcv_list;
+
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+
+TRUNCATE mcv_list;
+
+-- a => b, a => c, b => c
+INSERT INTO mcv_list
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+
+ANALYZE mcv_list;
+
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+
+TRUNCATE mcv_list;
+
+-- a => b, a => c
+INSERT INTO mcv_list
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+
+ANALYZE mcv_list;
+
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+
+TRUNCATE mcv_list;
+
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO mcv_list
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX mcv_idx ON mcv_list (a, b);
+
+ANALYZE mcv_list;
+
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mcv_list WHERE a = 10 AND b = 5;
+
+DROP TABLE mcv_list;
+
+-- varlena type (text)
+CREATE TABLE mcv_list (
+ a TEXT,
+ b TEXT,
+ c TEXT
+);
+
+CREATE STATISTICS s2 ON mcv_list (a, b, c) WITH (mcv);
+
+-- random data
+INSERT INTO mcv_list
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+
+ANALYZE mcv_list;
+
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+
+TRUNCATE mcv_list;
+
+-- a => b, a => c, b => c
+INSERT INTO mcv_list
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+
+ANALYZE mcv_list;
+
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+
+TRUNCATE mcv_list;
+
+-- a => b, a => c
+INSERT INTO mcv_list
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE mcv_list;
+
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+
+TRUNCATE mcv_list;
+
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO mcv_list
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX mcv_idx ON mcv_list (a, b);
+ANALYZE mcv_list;
+
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mcv_list WHERE a = '10' AND b = '5';
+
+TRUNCATE mcv_list;
+
+-- check explain (expect bitmap index scan, not plain index scan) with NULLs
+INSERT INTO mcv_list
+ SELECT
+ (CASE WHEN i/10000 = 0 THEN NULL ELSE i/10000 END),
+ (CASE WHEN i/20000 = 0 THEN NULL ELSE i/20000 END),
+ (CASE WHEN i/40000 = 0 THEN NULL ELSE i/40000 END)
+ FROM generate_series(1,1000000) s(i);
+ANALYZE mcv_list;
+
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mcv_list WHERE a IS NULL AND b IS NULL;
+
+DROP TABLE mcv_list;
+
+-- NULL values (mix of int and text columns)
+CREATE TABLE mcv_list (
+ a INT,
+ b TEXT,
+ c INT,
+ d TEXT
+);
+
+CREATE STATISTICS s3 ON mcv_list (a, b, c, d) WITH (mcv);
+
+INSERT INTO mcv_list
+ SELECT
+ mod(i, 100),
+ (CASE WHEN mod(i, 200) = 0 THEN NULL ELSE mod(i,200) END),
+ mod(i, 400),
+ (CASE WHEN mod(i, 300) = 0 THEN NULL ELSE mod(i,600) END)
+ FROM generate_series(1,10000) s(i);
+
+ANALYZE mcv_list;
+
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+
+DROP TABLE mcv_list;
--
2.1.0
0005-multivariate-histograms.patch
>From 133193c6e1546d2b3a595c04c0213400ea3c7990 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tv@fuzzy.cz>
Date: Sun, 11 Jan 2015 20:18:24 +0100
Subject: [PATCH 5/7] multivariate histograms
- extends the pg_mv_statistic catalog (adds 'hist' fields)
- builds the histograms during ANALYZE
- adds simple estimation while planning queries
Includes regression tests mostly equal to those for functional
dependencies / MCV lists.
---
src/backend/catalog/system_views.sql | 4 +-
src/backend/commands/statscmds.c | 44 +-
src/backend/nodes/outfuncs.c | 2 +
src/backend/optimizer/path/clausesel.c | 718 ++++++++-
src/backend/optimizer/util/plancat.c | 4 +-
src/backend/utils/mvstats/Makefile | 2 +-
src/backend/utils/mvstats/common.c | 37 +-
src/backend/utils/mvstats/histogram.c | 2316 ++++++++++++++++++++++++++++
src/bin/psql/describe.c | 17 +-
src/include/catalog/pg_mv_statistic.h | 25 +-
src/include/catalog/pg_proc.h | 4 +
src/include/nodes/relation.h | 2 +
src/include/utils/mvstats.h | 136 +-
src/test/regress/expected/mv_histogram.out | 207 +++
src/test/regress/expected/rules.out | 4 +-
src/test/regress/parallel_schedule | 2 +-
src/test/regress/serial_schedule | 1 +
src/test/regress/sql/mv_histogram.sql | 176 +++
18 files changed, 3662 insertions(+), 39 deletions(-)
create mode 100644 src/backend/utils/mvstats/histogram.c
create mode 100644 src/test/regress/expected/mv_histogram.out
create mode 100644 src/test/regress/sql/mv_histogram.sql
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 6482aa7..cb6eff3 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -167,7 +167,9 @@ CREATE VIEW pg_mv_stats AS
length(S.stadeps) as depsbytes,
pg_mv_stats_dependencies_info(S.stadeps) as depsinfo,
length(S.stamcv) AS mcvbytes,
- pg_mv_stats_mcvlist_info(S.stamcv) AS mcvinfo
+ pg_mv_stats_mcvlist_info(S.stamcv) AS mcvinfo,
+ length(S.stahist) AS histbytes,
+ pg_mv_stats_histogram_info(S.stahist) AS histinfo
FROM (pg_mv_statistic S JOIN pg_class C ON (C.oid = S.starelid))
LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace);
diff --git a/src/backend/commands/statscmds.c b/src/backend/commands/statscmds.c
index f730253..68e1685 100644
--- a/src/backend/commands/statscmds.c
+++ b/src/backend/commands/statscmds.c
@@ -135,12 +135,15 @@ CreateStatistics(CreateStatsStmt *stmt)
/* by default build nothing */
bool build_dependencies = false,
- build_mcv = false;
+ build_mcv = false,
+ build_histogram = false;
- int32 max_mcv_items = -1;
+ int32 max_buckets = -1,
+ max_mcv_items = -1;
/* options required because of other options */
- bool require_mcv = false;
+ bool require_mcv = false,
+ require_histogram = false;
Assert(IsA(stmt, CreateStatsStmt));
@@ -220,6 +223,29 @@ CreateStatistics(CreateStatsStmt *stmt)
MVSTAT_MCVLIST_MAX_ITEMS)));
}
+ else if (strcmp(opt->defname, "histogram") == 0)
+ build_histogram = defGetBoolean(opt);
+ else if (strcmp(opt->defname, "max_buckets") == 0)
+ {
+ max_buckets = defGetInt32(opt);
+
+ /* this option requires 'histogram' to be enabled */
+ require_histogram = true;
+
+ /* sanity check */
+ if (max_buckets < MVSTAT_HIST_MIN_BUCKETS)
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("minimum number of buckets is %d",
+ MVSTAT_HIST_MIN_BUCKETS)));
+
+ else if (max_buckets > MVSTAT_HIST_MAX_BUCKETS)
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("maximum number of buckets is %d",
+ MVSTAT_HIST_MAX_BUCKETS)));
+
+ }
else
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
@@ -228,10 +254,10 @@ CreateStatistics(CreateStatsStmt *stmt)
}
/* check that at least some statistics were requested */
- if (! (build_dependencies || build_mcv))
+ if (! (build_dependencies || build_mcv || build_histogram))
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
- errmsg("no statistics type (dependencies, mcv) was requested")));
+ errmsg("no statistics type (dependencies, mcv, histogram) was requested")));
/* now do some checking of the options */
if (require_mcv && (! build_mcv))
@@ -239,6 +265,11 @@ CreateStatistics(CreateStatsStmt *stmt)
(errcode(ERRCODE_SYNTAX_ERROR),
errmsg("option 'mcv' is required by other options(s)")));
+ if (require_histogram && (! build_histogram))
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("option 'histogram' is required by other options(s)")));
+
/* sort the attnums and build int2vector */
qsort(attnums, numcols, sizeof(int16), compare_int16);
stakeys = buildint2vector(attnums, numcols);
@@ -259,11 +290,14 @@ CreateStatistics(CreateStatsStmt *stmt)
values[Anum_pg_mv_statistic_deps_enabled -1] = BoolGetDatum(build_dependencies);
values[Anum_pg_mv_statistic_mcv_enabled -1] = BoolGetDatum(build_mcv);
+ values[Anum_pg_mv_statistic_hist_enabled -1] = BoolGetDatum(build_histogram);
values[Anum_pg_mv_statistic_mcv_max_items -1] = Int32GetDatum(max_mcv_items);
+ values[Anum_pg_mv_statistic_hist_max_buckets -1] = Int32GetDatum(max_buckets);
nulls[Anum_pg_mv_statistic_stadeps -1] = true;
nulls[Anum_pg_mv_statistic_stamcv -1] = true;
+ nulls[Anum_pg_mv_statistic_stahist -1] = true;
/* insert the tuple into pg_mv_statistic */
mvstatrel = heap_open(MvStatisticRelationId, RowExclusiveLock);
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index 0f58199..46463cc 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -1949,10 +1949,12 @@ _outMVStatisticInfo(StringInfo str, const MVStatisticInfo *node)
/* enabled statistics */
WRITE_BOOL_FIELD(deps_enabled);
WRITE_BOOL_FIELD(mcv_enabled);
+ WRITE_BOOL_FIELD(hist_enabled);
/* built/available statistics */
WRITE_BOOL_FIELD(deps_built);
WRITE_BOOL_FIELD(mcv_built);
+ WRITE_BOOL_FIELD(hist_built);
}
static void
diff --git a/src/backend/optimizer/path/clausesel.c b/src/backend/optimizer/path/clausesel.c
index f122045..6c99f02 100644
--- a/src/backend/optimizer/path/clausesel.c
+++ b/src/backend/optimizer/path/clausesel.c
@@ -49,6 +49,7 @@ static void addRangeClause(RangeQueryClause **rqlist, Node *clause,
#define MV_CLAUSE_TYPE_FDEP 0x01
#define MV_CLAUSE_TYPE_MCV 0x02
+#define MV_CLAUSE_TYPE_HIST 0x04
static bool clause_is_mv_compatible(PlannerInfo *root, Node *clause, Oid varRelid,
Index *relid, Bitmapset **attnums, SpecialJoinInfo *sjinfo,
@@ -73,6 +74,8 @@ static Selectivity clauselist_mv_selectivity(PlannerInfo *root,
static Selectivity clauselist_mv_selectivity_mcvlist(PlannerInfo *root,
List *clauses, MVStatisticInfo *mvstats,
bool *fullmatch, Selectivity *lowsel);
+static Selectivity clauselist_mv_selectivity_histogram(PlannerInfo *root,
+ List *clauses, MVStatisticInfo *mvstats);
static int update_match_bitmap_mcvlist(PlannerInfo *root, List *clauses,
int2vector *stakeys, MCVList mcvlist,
@@ -80,6 +83,12 @@ static int update_match_bitmap_mcvlist(PlannerInfo *root, List *clauses,
Selectivity *lowsel, bool *fullmatch,
bool is_or);
+static int update_match_bitmap_histogram(PlannerInfo *root, List *clauses,
+ int2vector *stakeys,
+ MVSerializedHistogram mvhist,
+ int nmatches, char * matches,
+ bool is_or);
+
static bool has_stats(List *stats, int type);
static List * find_stats(PlannerInfo *root, List *clauses,
@@ -114,6 +123,7 @@ static Bitmapset * get_varattnos(Node * node, Index relid);
#define UPDATE_RESULT(m,r,isor) \
(m) = (isor) ? (MAX(m,r)) : (MIN(m,r))
+
/****************************************************************************
* ROUTINES TO COMPUTE SELECTIVITIES
****************************************************************************/
@@ -304,7 +314,7 @@ clauselist_selectivity(PlannerInfo *root,
* Check that there are statistics with MCV list. If not, we don't
* need to waste time with the optimization.
*/
- if (has_stats(stats, MV_CLAUSE_TYPE_MCV))
+ if (has_stats(stats, MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST))
{
/*
* Recollect attributes from mv-compatible clauses (maybe we've
@@ -312,7 +322,7 @@ clauselist_selectivity(PlannerInfo *root,
* From now on we're only interested in MCV-compatible clauses.
*/
mvattnums = collect_mv_attnums(root, clauses, varRelid, &relid, sjinfo,
- MV_CLAUSE_TYPE_MCV);
+ (MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST));
/*
* If there still are at least two columns, we'll try to select
@@ -331,7 +341,7 @@ clauselist_selectivity(PlannerInfo *root,
/* split the clauselist into regular and mv-clauses */
clauses = clauselist_mv_split(root, sjinfo, clauses,
varRelid, &mvclauses, mvstat,
- MV_CLAUSE_TYPE_MCV);
+ (MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST));
/* we've chosen the histogram to match the clauses */
Assert(mvclauses != NIL);
@@ -1098,6 +1108,7 @@ static Selectivity
clauselist_mv_selectivity(PlannerInfo *root, List *clauses, MVStatisticInfo *mvstats)
{
bool fullmatch = false;
+ Selectivity s1 = 0.0, s2 = 0.0;
/*
* Lowest frequency in the MCV list (may be used as an upper bound
@@ -1111,9 +1122,24 @@ clauselist_mv_selectivity(PlannerInfo *root, List *clauses, MVStatisticInfo *mvs
* MCV/histogram evaluation).
*/
- /* Evaluate the MCV selectivity */
- return clauselist_mv_selectivity_mcvlist(root, clauses, mvstats,
+ /* Evaluate the MCV first. */
+ s1 = clauselist_mv_selectivity_mcvlist(root, clauses, mvstats,
&fullmatch, &mcv_low);
+
+ /*
+ * If we got a full equality match on the MCV list, we're done (and
+ * the estimate is pretty good).
+ */
+ if (fullmatch && (s1 > 0.0))
+ return s1;
+
+ /* FIXME if (fullmatch) without matching MCV item, use the mcv_low
+ * selectivity as upper bound */
+
+ s2 = clauselist_mv_selectivity_histogram(root, clauses, mvstats);
+
+ /* TODO clamp to <= 1.0 (or more strictly, when possible) */
+ return s1 + s2;
}
/*
@@ -1255,7 +1281,7 @@ choose_mv_statistics(List *stats, Bitmapset *attnums)
int numattrs = attrs->dim1;
/* skip dependencies-only stats */
- if (! info->mcv_built)
+ if (! (info->mcv_built || info->hist_built))
continue;
/* count columns covered by the histogram */
@@ -1415,7 +1441,6 @@ clause_is_mv_compatible(PlannerInfo *root, Node *clause, Oid varRelid,
bool ok;
/* is it 'variable op constant' ? */
-
ok = (bms_membership(clause_relids) == BMS_SINGLETON) &&
(is_pseudo_constant_clause_relids(lsecond(expr->args),
right_relids) ||
@@ -1465,10 +1490,10 @@ clause_is_mv_compatible(PlannerInfo *root, Node *clause, Oid varRelid,
case F_SCALARLTSEL:
case F_SCALARGTSEL:
/* not compatible with functional dependencies */
- if (types & MV_CLAUSE_TYPE_MCV)
+ if (types & (MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST))
{
*attnums = bms_add_member(*attnums, var->varattno);
- return (types & MV_CLAUSE_TYPE_MCV);
+ return (types & (MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST));
}
return false;
@@ -1796,6 +1821,9 @@ has_stats(List *stats, int type)
if ((type & MV_CLAUSE_TYPE_MCV) && stat->mcv_built)
return true;
+
+ if ((type & MV_CLAUSE_TYPE_HIST) && stat->hist_built)
+ return true;
}
return false;
@@ -2612,3 +2640,675 @@ update_match_bitmap_mcvlist(PlannerInfo *root, List *clauses,
return nmatches;
}
+
+/*
+ * Estimate selectivity of clauses using a histogram.
+ *
+ * If there's no histogram for the stats, the function returns 0.0.
+ *
+ * The general idea of this method is similar to how MCV lists are
+ * processed, except that this introduces the concept of a partial
+ * match (MCV only works with full match / mismatch).
+ *
+ * The algorithm works like this:
+ *
+ * 1) mark all buckets as 'full match'
+ * 2) walk through all the clauses
+ * 3) for a particular clause, walk through all the buckets
+ * 4) skip buckets that are already 'no match'
+ * 5) check clause for buckets that still match (at least partially)
+ * 6) sum frequencies for buckets to get selectivity
+ *
+ * Unlike MCV lists, histograms have a concept of a partial match. In
+ * that case we use 1/2 the bucket, to minimize the average error. The
+ * MV histograms are usually less detailed than the per-column ones,
+ * meaning the sum is often quite high (thanks to combining a lot of
+ * "partially hit" buckets).
+ *
+ * Maybe we could use per-bucket information with number of distinct
+ * values it contains (for each dimension), and then use that to correct
+ * the estimate (so with 10 distinct values, we'd use 1/10 of the bucket
+ * frequency). We might also scale the value depending on the actual
+ * ndistinct estimate (not just the values observed in the sample).
+ *
+ * Another option would be to multiply the selectivities, i.e. if we get
+ * 'partial match' for a bucket for multiple conditions, we might use
+ * 0.5^k (where k is the number of conditions), instead of 0.5. This
+ * probably does not minimize the average error, though.
+ *
+ * TODO This might use a similar shortcut to MCV lists - count buckets
+ * marked as partial/full match, and terminate once this drops to 0.
+ * Not sure if it's really worth it - for MCV lists a situation like
+ * this is not uncommon, but for histograms it's not that clear.
+ */
+static Selectivity
+clauselist_mv_selectivity_histogram(PlannerInfo *root, List *clauses,
+ MVStatisticInfo *mvstats)
+{
+ int i;
+ Selectivity s = 0.0;
+ Selectivity u = 0.0;
+
+ int nmatches = 0;
+ char *matches = NULL;
+
+ MVSerializedHistogram mvhist = NULL;
+
+ /* there's no histogram */
+ if (! mvstats->hist_built)
+ return 0.0;
+
+ /* There may be no histogram in the stats (check hist_built flag) */
+ mvhist = load_mv_histogram(mvstats->mvoid);
+
+ Assert (mvhist != NULL);
+ Assert (clauses != NIL);
+ Assert (list_length(clauses) >= 2);
+
+ /*
+ * Bitmap of bucket matches (mismatch, partial, full). By default
+ * all buckets fully match, and the clauses then eliminate them.
+ */
+ matches = palloc0(sizeof(char) * mvhist->nbuckets);
+ memset(matches, MVSTATS_MATCH_FULL, sizeof(char)*mvhist->nbuckets);
+
+ nmatches = mvhist->nbuckets;
+
+ /* build the match bitmap */
+ update_match_bitmap_histogram(root, clauses,
+ mvstats->stakeys, mvhist,
+ nmatches, matches, false);
+
+ /* now, walk through the buckets and sum the selectivities */
+ for (i = 0; i < mvhist->nbuckets; i++)
+ {
+ /*
+ * Find out what part of the data is covered by the histogram,
+ * so that we can 'scale' the selectivity properly (e.g. when
+ * only 50% of the sample got into the histogram, and the rest
+ * is in a MCV list).
+ *
+ * TODO This might be handled by keeping a global "frequency"
+ * for the whole histogram, which might save us some time
+ * spent accessing the not-matching part of the histogram.
+ * Although it's likely in a cache, so it's very fast.
+ */
+ u += mvhist->buckets[i]->ntuples;
+
+ if (matches[i] == MVSTATS_MATCH_FULL)
+ s += mvhist->buckets[i]->ntuples;
+ else if (matches[i] == MVSTATS_MATCH_PARTIAL)
+ s += 0.5 * mvhist->buckets[i]->ntuples;
+ }
+
+ /* release the allocated bitmap and deserialized histogram */
+ pfree(matches);
+ pfree(mvhist);
+
+ return s * u;
+}
+
+/*
+ * Evaluate clauses using the histogram, and update the match bitmap.
+ *
+ * The bitmap may be already partially set, so this is really a way to
+ * combine results of several clause lists - either when computing
+ * conditional probability P(A|B) or a combination of AND/OR clauses.
+ *
+ * Note: This is not a simple bitmap in the sense that there are more
+ * than two possible values for each item - no match, partial
+ * match and full match. So we need 2 bits per item.
+ *
+ * TODO This works with 'bitmap' where each item is represented as a
+ * char, which is slightly wasteful. Instead, we could use a bitmap
+ * with 2 bits per item, reducing the size to ~1/4. By using values
+ * 0, 1 and 3 (instead of 0, 1 and 2), the operations (merging etc.)
+ * might be performed just like for simple bitmap by using & and |,
+ * which might be faster than min/max.
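+ *
+ * For example, with the proposed encoding (0 = no match, 1 = partial
+ * match, 3 = full match), AND-merge becomes (a & b), e.g.
+ * 3 & 1 = 1 (partial), and OR-merge becomes (a | b), e.g.
+ * 3 | 1 = 3 (full), i.e. the same results as Min() and Max().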
+ */
+static int
+update_match_bitmap_histogram(PlannerInfo *root, List *clauses,
+ int2vector *stakeys,
+ MVSerializedHistogram mvhist,
+ int nmatches, char * matches,
+ bool is_or)
+{
+ int i;
+ ListCell * l;
+
+ /*
+ * Used for caching function calls, only once per deduplicated value.
+ *
+ * We may have up to (2 * nbuckets) values per dimension. It's
+ * probably overkill, but let's allocate that once for all clauses,
+ * to minimize overhead.
+ *
+ * Also, we only need two bits per value, but this allocates a byte
+ * per value. Might be worth optimizing.
+ *
+ * 0x00 - not yet called
+ * 0x01 - called, result is 'false'
+ * 0x03 - called, result is 'true'
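+ *
+ * so "already called" is (cache != 0x00), and the cached boolean
+ * result is extracted as (cache & 0x02).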
+ */
+ char *callcache = palloc(mvhist->nbuckets);
+
+ Assert(mvhist != NULL);
+ Assert(mvhist->nbuckets > 0);
+ Assert(nmatches >= 0);
+ Assert(nmatches <= mvhist->nbuckets);
+
+ Assert(clauses != NIL);
+ Assert(list_length(clauses) >= 1);
+
+ /* loop through the clauses and do the estimation */
+ foreach (l, clauses)
+ {
+ Node * clause = (Node*)lfirst(l);
+
+ /* if it's a RestrictInfo, then extract the clause */
+ if (IsA(clause, RestrictInfo))
+ clause = (Node*)((RestrictInfo*)clause)->clause;
+
+ /* it's either an operator clause, a NullTest, or an AND/OR clause */
+ if (is_opclause(clause))
+ {
+ OpExpr * expr = (OpExpr*)clause;
+ bool varonleft = true;
+ bool ok;
+
+ FmgrInfo opproc; /* operator */
+ fmgr_info(get_opcode(expr->opno), &opproc);
+
+ /* reset the cache (per clause) */
+ memset(callcache, 0, mvhist->nbuckets);
+
+ ok = (NumRelids(clause) == 1) &&
+ (is_pseudo_constant_clause(lsecond(expr->args)) ||
+ (varonleft = false,
+ is_pseudo_constant_clause(linitial(expr->args))));
+
+ if (ok)
+ {
+ FmgrInfo ltproc;
+ RegProcedure oprrest = get_oprrest(expr->opno);
+
+ Var * var = (varonleft) ? linitial(expr->args) : lsecond(expr->args);
+ Const * cst = (varonleft) ? lsecond(expr->args) : linitial(expr->args);
+ bool isgt = (! varonleft);
+
+ /*
+ * TODO Fetch only when really needed (probably for equality only)
+ *
+ * TODO Technically either lt/gt is sufficient.
+ *
+ * FIXME The code in analyze.c creates histograms only for types
+ * with enough ordering (by calling get_sort_group_operators).
+ * Is this the same assumption, i.e. are we certain that we
+ * get the ltproc/gtproc every time we ask? Or are there types
+ * where get_sort_group_operators returns ltopr and here we
+ * get nothing?
+ */
+ TypeCacheEntry *typecache
+ = lookup_type_cache(var->vartype, TYPECACHE_EQ_OPR | TYPECACHE_LT_OPR
+ | TYPECACHE_GT_OPR);
+
+ /* lookup dimension for the attribute */
+ int idx = mv_get_index(var->varattno, stakeys);
+
+ fmgr_info(get_opcode(typecache->lt_opr), <proc);
+
+ /*
+ * Check this for all buckets that still have "true" in the bitmap
+ *
+ * We already know the clauses use suitable operators (because that's
+ * how we filtered them).
+ */
+ for (i = 0; i < mvhist->nbuckets; i++)
+ {
+ bool tmp;
+ MVSerializedBucket bucket = mvhist->buckets[i];
+
+ /* histogram boundaries */
+ Datum minval, maxval;
+
+ /* values from the call cache */
+ char mincached, maxcached;
+
+ /*
+ * For AND-lists, we can also mark NULL buckets as 'no match'
+ * (and then skip them). For OR-lists this is not possible.
+ */
+ if ((! is_or) && bucket->nullsonly[idx])
+ matches[i] = MVSTATS_MATCH_NONE;
+
+ /*
+ * Skip buckets that were already eliminated - this is important
+ * considering how we update the info (we only lower the match).
+ * We can't really do anything about the MATCH_PARTIAL buckets.
+ */
+ if ((! is_or) && (matches[i] == MVSTATS_MATCH_NONE))
+ continue;
+ else if (is_or && (matches[i] == MVSTATS_MATCH_FULL))
+ continue;
+
+ /* lookup the values and cache of function calls */
+ minval = mvhist->values[idx][bucket->min[idx]];
+ maxval = mvhist->values[idx][bucket->max[idx]];
+
+ mincached = callcache[bucket->min[idx]];
+ maxcached = callcache[bucket->max[idx]];
+
+ /*
+ * TODO Maybe it's possible to add here a similar optimization
+ * as for the MCV lists:
+ *
+ * (nmatches == 0) && AND-list => all eliminated (FALSE)
+ * (nmatches == N) && OR-list => all eliminated (TRUE)
+ *
+ * But it's more complex because of the partial matches.
+ */
+
+ /*
+ * If it's not a "<" or ">" or "=" operator, just ignore the
+ * clause. Otherwise note the relid and attnum for the variable.
+ *
+ * TODO I'm really unsure whether the handling of the 'isgt' flag (that is, clauses
+ * with reverse order of variable/constant) is correct. I wouldn't
+ * be surprised if there was some mixup. Using the lt/gt operators
+ * instead of messing with the opproc could make it simpler.
+ * It would however be using a different operator than the query,
+ * although it's not any shadier than using the selectivity function
+ * as is done currently.
+ *
+ * FIXME Once the min/max values are deduplicated, we can easily minimize
+ * the number of calls to the comparator (assuming we keep the
+ * deduplicated structure). See the note on compression at MVBucket
+ * serialize/deserialize methods.
+ */
+ switch (oprrest)
+ {
+ case F_SCALARLTSEL: /* column < constant */
+
+ if (! isgt) /* (var < const) */
+ {
+ /*
+ * First check whether the constant is below the lower boundary (in that
+ * case we can skip the bucket, because there's no overlap).
+ */
+ if (! mincached)
+ {
+ tmp = DatumGetBool(FunctionCall2Coll(&opproc,
+ DEFAULT_COLLATION_OID,
+ cst->constvalue,
+ minval));
+
+ /*
+ * Update the cache, but with the inverse value, as we keep the
+ * cache for calls with (minval, constvalue).
+ */
+ callcache[bucket->min[idx]] = (tmp) ? 0x01 : 0x03;
+ }
+ else
+ tmp = !(mincached & 0x02); /* get call result from the cache (inverse) */
+
+ if (tmp)
+ {
+ /* no match */
+ UPDATE_RESULT(matches[i], MVSTATS_MATCH_NONE, is_or);
+ continue;
+ }
+
+ /*
+ * Now check whether the upper boundary is below the constant (in that
+ * case it's a partial match).
+ */
+ if (! maxcached)
+ {
+ tmp = DatumGetBool(FunctionCall2Coll(&opproc,
+ DEFAULT_COLLATION_OID,
+ cst->constvalue,
+ maxval));
+
+ /*
+ * Update the cache, but with the inverse value, as we keep the
+ * cache for calls with (minval, constvalue).
+ */
+ callcache[bucket->max[idx]] = (tmp) ? 0x01 : 0x03;
+ }
+ else
+ tmp = !(maxcached & 0x02); /* extract the result (reverse) */
+
+ if (tmp) /* partial match */
+ UPDATE_RESULT(matches[i], MVSTATS_MATCH_PARTIAL, is_or);
+
+ }
+ else /* (const < var) */
+ {
+ /*
+ * First check whether the constant is above the upper boundary (in that
+ * case we can skip the bucket, because there's no overlap).
+ */
+ if (! maxcached)
+ {
+ tmp = DatumGetBool(FunctionCall2Coll(&opproc,
+ DEFAULT_COLLATION_OID,
+ maxval,
+ cst->constvalue));
+
+ /* Update the cache. */
+ callcache[bucket->max[idx]] = (tmp) ? 0x03 : 0x01;
+ }
+ else
+ tmp = (maxcached & 0x02); /* extract the result */
+
+ if (tmp)
+ {
+ /* no match */
+ UPDATE_RESULT(matches[i], MVSTATS_MATCH_NONE, is_or);
+ continue;
+ }
+
+ /*
+ * Now check whether the lower boundary is below the constant (in that
+ * case it's a partial match).
+ */
+ if (! mincached)
+ {
+ tmp = DatumGetBool(FunctionCall2Coll(&opproc,
+ DEFAULT_COLLATION_OID,
+ minval,
+ cst->constvalue));
+
+ /* Update the cache. */
+ callcache[bucket->min[idx]] = (tmp) ? 0x03 : 0x01;
+ }
+ else
+ tmp = (mincached & 0x02); /* extract the result */
+
+ if (tmp) /* partial match */
+ UPDATE_RESULT(matches[i], MVSTATS_MATCH_PARTIAL, is_or);
+ }
+ break;
+
+ case F_SCALARGTSEL: /* column > constant */
+
+ if (! isgt) /* (var > const) */
+ {
+ /*
+ * First check whether the constant is above the upper boundary (in that
+ * case we can skip the bucket, because there's no overlap).
+ */
+ if (! maxcached)
+ {
+ tmp = DatumGetBool(FunctionCall2Coll(&opproc,
+ DEFAULT_COLLATION_OID,
+ cst->constvalue,
+ maxval));
+
+ /*
+ * Update the cache, but with the inverse value, as we keep the
+ * cache for calls with (val, constvalue).
+ */
+ callcache[bucket->max[idx]] = (tmp) ? 0x01 : 0x03;
+ }
+ else
+ tmp = !(maxcached & 0x02); /* extract the result */
+
+ if (tmp)
+ {
+ /* no match */
+ UPDATE_RESULT(matches[i], MVSTATS_MATCH_NONE, is_or);
+ continue;
+ }
+
+ /*
+ * Now check whether the lower boundary is below the constant (in that
+ * case it's a partial match).
+ */
+ if (! mincached)
+ {
+ tmp = DatumGetBool(FunctionCall2Coll(&opproc,
+ DEFAULT_COLLATION_OID,
+ cst->constvalue,
+ minval));
+
+ /*
+ * Update the cache, but with the inverse value, as we keep the
+ * cache for calls with (val, constvalue).
+ */
+ callcache[bucket->min[idx]] = (tmp) ? 0x01 : 0x03;
+ }
+ else
+ tmp = !(mincached & 0x02); /* extract the result */
+
+ if (tmp)
+ /* partial match */
+ UPDATE_RESULT(matches[i], MVSTATS_MATCH_PARTIAL, is_or);
+ }
+ else /* (const > var) */
+ {
+ /*
+ * First check whether the constant is below the lower boundary (in
+ * that case we can skip the bucket, because there's no overlap).
+ */
+ if (! mincached)
+ {
+ tmp = DatumGetBool(FunctionCall2Coll(&opproc,
+ DEFAULT_COLLATION_OID,
+ minval,
+ cst->constvalue));
+
+ /* Update the cache. */
+ callcache[bucket->min[idx]] = (tmp) ? 0x03 : 0x01;
+ }
+ else
+ tmp = (mincached & 0x02); /* extract the result */
+
+ if (tmp)
+ {
+ /* no match */
+ UPDATE_RESULT(matches[i], MVSTATS_MATCH_NONE, is_or);
+ continue;
+ }
+
+ /*
+ * Now check whether the upper boundary is below the constant (in that
+ * case it's a partial match).
+ */
+ if (! maxcached)
+ {
+ tmp = DatumGetBool(FunctionCall2Coll(&opproc,
+ DEFAULT_COLLATION_OID,
+ maxval,
+ cst->constvalue));
+
+ /* Update the cache. */
+ callcache[bucket->max[idx]] = (tmp) ? 0x03 : 0x01;
+ }
+ else
+ tmp = (maxcached & 0x02); /* extract the result */
+
+ if (tmp)
+ /* partial match */
+ UPDATE_RESULT(matches[i], MVSTATS_MATCH_PARTIAL, is_or);
+ }
+ break;
+
+ case F_EQSEL:
+
+ /*
+ * We only check whether the value is within the bucket, using the lt/gt
+ * operators fetched from type cache.
+ *
+ * TODO We'll use the default 50% estimate, but that's probably way off
+ * if there are multiple distinct values. Consider tweaking this a
+ * somehow, e.g. using only a part inversely proportional to the
+ * estimated number of distinct values in the bucket.
+ *
+ * TODO This does not handle inclusion flags at the moment, thus counting
+ * some buckets twice (when hitting the boundary).
+ *
+ * TODO Optimization is that if max[i] == min[i], it's effectively a MCV
+ * item and we can count the whole bucket as a complete match (thus
+ * using 100% bucket selectivity and not just 50%).
+ *
+ * TODO Technically some buckets may "degenerate" into single-value
+ * buckets (not necessarily for all the dimensions) - maybe this
+ * is better than keeping a separate MCV list (multi-dimensional).
+ * Update: Actually, that's unlikely to be better than a separate
+ * MCV list for two reasons - first, it requires ~2x the space
+ * (because of storing lower/upper boundaries) and second because
+ * the buckets are ranges - depending on the partitioning algorithm
+ * it may not even degenerate into a (min=max) bucket. For example
+ * the current partitioning algorithm never does that.
+ */
+ if (! mincached)
+ {
+ tmp = DatumGetBool(FunctionCall2Coll(<proc,
+ DEFAULT_COLLATION_OID,
+ cst->constvalue,
+ minval));
+
+ /* Update the cache. */
+ callcache[bucket->min[idx]] = (tmp) ? 0x03 : 0x01;
+ }
+ else
+ tmp = (mincached & 0x02); /* extract the result */
+
+ if (tmp)
+ {
+ /* no match */
+ UPDATE_RESULT(matches[i], MVSTATS_MATCH_NONE, is_or);
+ continue;
+ }
+
+ if (! maxcached)
+ {
+ tmp = DatumGetBool(FunctionCall2Coll(<proc,
+ DEFAULT_COLLATION_OID,
+ maxval,
+ cst->constvalue));
+
+ /* Update the cache. */
+ callcache[bucket->max[idx]] = (tmp) ? 0x03 : 0x01;
+ }
+ else
+ tmp = (maxcached & 0x02); /* extract the result */
+
+ if (tmp)
+ {
+ /* no match */
+ UPDATE_RESULT(matches[i], MVSTATS_MATCH_NONE, is_or);
+ continue;
+ }
+
+ /* partial match */
+ UPDATE_RESULT(matches[i], MVSTATS_MATCH_PARTIAL, is_or);
+
+ break;
+ }
+ }
+ }
+ }
+ else if (IsA(clause, NullTest))
+ {
+ NullTest * expr = (NullTest*)clause;
+ Var * var = (Var*)(expr->arg);
+
+ /* FIXME proper matching attribute to dimension */
+ int idx = mv_get_index(var->varattno, stakeys);
+
+ /*
+ * Walk through the buckets and evaluate the current clause. We can
+ * skip items that were already ruled out, and terminate if there are
+ * no remaining buckets that might possibly match.
+ */
+ for (i = 0; i < mvhist->nbuckets; i++)
+ {
+ MVSerializedBucket bucket = mvhist->buckets[i];
+
+ /*
+ * Skip buckets that were already eliminated - this is important
+ * considering how we update the info (we only lower the match)
+ */
+ if ((! is_or) && (matches[i] == MVSTATS_MATCH_NONE))
+ continue;
+ else if (is_or && (matches[i] == MVSTATS_MATCH_FULL))
+ continue;
+
+ /* if the clause mismatches the bucket, set it to MATCH_NONE */
+ if ((expr->nulltesttype == IS_NULL)
+ && (! bucket->nullsonly[idx]))
+ UPDATE_RESULT(matches[i], MVSTATS_MATCH_NONE, is_or);
+
+ else if ((expr->nulltesttype == IS_NOT_NULL) &&
+ (bucket->nullsonly[idx]))
+ UPDATE_RESULT(matches[i], MVSTATS_MATCH_NONE, is_or);
+ }
+ }
+ else if (or_clause(clause) || and_clause(clause))
+ {
+ /* AND/OR clause, with all clauses compatible with the selected MV stat */
+
+ int i;
+ BoolExpr *orclause = ((BoolExpr*)clause);
+ List *orclauses = orclause->args;
+
+ /* match/mismatch bitmap for each bucket */
+ int or_nmatches = 0;
+ char * or_matches = NULL;
+
+ Assert(orclauses != NIL);
+ Assert(list_length(orclauses) >= 2);
+
+ /* number of matching buckets */
+ or_nmatches = mvhist->nbuckets;
+
+ /* by default none of the buckets matches the clauses */
+ or_matches = palloc0(sizeof(char) * or_nmatches);
+
+ if (or_clause(clause))
+ {
+ /* OR clauses assume nothing matches, initially */
+ memset(or_matches, MVSTATS_MATCH_NONE, sizeof(char)*or_nmatches);
+ or_nmatches = 0;
+ }
+ else
+ {
+ /* AND clauses assume everything matches, initially */
+ memset(or_matches, MVSTATS_MATCH_FULL, sizeof(char)*or_nmatches);
+ }
+
+ /* build the match bitmap for the OR-clauses */
+ or_nmatches = update_match_bitmap_histogram(root, orclauses,
+ stakeys, mvhist,
+ or_nmatches, or_matches, or_clause(clause));
+
+ /* merge the bitmap into the existing one */
+ for (i = 0; i < mvhist->nbuckets; i++)
+ {
+ /*
+ * To AND-merge the bitmaps, a MIN() semantics is used.
+ * For OR-merge, use MAX().
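+ * E.g. AND-merging a FULL and a PARTIAL match yields PARTIAL,
+ * while OR-merging a NONE and a PARTIAL match also yields PARTIAL.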
+ *
+ * FIXME this does not decrease the number of matches
+ */
+ UPDATE_RESULT(matches[i], or_matches[i], is_or);
+ }
+
+ pfree(or_matches);
+
+ }
+ else
+ elog(ERROR, "unknown clause type: %d", clause->type);
+ }
+
+ /* free the call cache */
+ pfree(callcache);
+
+#ifdef DEBUG_MVHIST
+ debug_histogram_matches(mvhist, matches);
+#endif
+
+ return nmatches;
+}
diff --git a/src/backend/optimizer/util/plancat.c b/src/backend/optimizer/util/plancat.c
index 0da7ad9..9aded52 100644
--- a/src/backend/optimizer/util/plancat.c
+++ b/src/backend/optimizer/util/plancat.c
@@ -410,7 +410,7 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
mvstat = (Form_pg_mv_statistic) GETSTRUCT(htup);
/* unavailable stats are not interesting for the planner */
- if (mvstat->deps_built || mvstat->mcv_built)
+ if (mvstat->deps_built || mvstat->mcv_built || mvstat->hist_built)
{
info = makeNode(MVStatisticInfo);
@@ -420,10 +420,12 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
/* enabled statistics */
info->deps_enabled = mvstat->deps_enabled;
info->mcv_enabled = mvstat->mcv_enabled;
+ info->hist_enabled = mvstat->hist_enabled;
/* built/available statistics */
info->deps_built = mvstat->deps_built;
info->mcv_built = mvstat->mcv_built;
+ info->hist_built = mvstat->hist_built;
/* stakeys */
adatum = SysCacheGetAttr(MVSTATOID, htup,
diff --git a/src/backend/utils/mvstats/Makefile b/src/backend/utils/mvstats/Makefile
index f9bf10c..9dbb3b6 100644
--- a/src/backend/utils/mvstats/Makefile
+++ b/src/backend/utils/mvstats/Makefile
@@ -12,6 +12,6 @@ subdir = src/backend/utils/mvstats
top_builddir = ../../../..
include $(top_builddir)/src/Makefile.global
-OBJS = common.o dependencies.o mcv.o
+OBJS = common.o dependencies.o histogram.o mcv.o
include $(top_srcdir)/src/backend/common.mk
diff --git a/src/backend/utils/mvstats/common.c b/src/backend/utils/mvstats/common.c
index d1da714..ffb76f4 100644
--- a/src/backend/utils/mvstats/common.c
+++ b/src/backend/utils/mvstats/common.c
@@ -13,11 +13,11 @@
*
*-------------------------------------------------------------------------
*/
+#include "postgres.h"
+#include "utils/array.h"
#include "common.h"
-#include "utils/array.h"
-
static VacAttrStats ** lookup_var_attr_stats(int2vector *attrs,
int natts,
VacAttrStats **vacattrstats);
@@ -52,7 +52,8 @@ build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
MVStatisticInfo *stat = (MVStatisticInfo *)lfirst(lc);
MVDependencies deps = NULL;
MCVList mcvlist = NULL;
- int numrows_filtered = 0;
+ MVHistogram histogram = NULL;
+ int numrows_filtered = numrows;
VacAttrStats **stats = NULL;
int numatts = 0;
@@ -95,8 +96,12 @@ build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
if (stat->mcv_enabled)
mcvlist = build_mv_mcvlist(numrows, rows, attrs, stats, &numrows_filtered);
+ /* build a multivariate histogram on the columns */
+ if ((numrows_filtered > 0) && (stat->hist_enabled))
+ histogram = build_mv_histogram(numrows_filtered, rows, attrs, stats, numrows);
+
/* store the histogram / MCV list in the catalog */
- update_mv_stats(stat->mvoid, deps, mcvlist, attrs, stats);
+ update_mv_stats(stat->mvoid, deps, mcvlist, histogram, attrs, stats);
}
}
@@ -176,6 +181,8 @@ list_mv_stats(Oid relid)
info->deps_built = stats->deps_built;
info->mcv_enabled = stats->mcv_enabled;
info->mcv_built = stats->mcv_built;
+ info->hist_enabled = stats->hist_enabled;
+ info->hist_built = stats->hist_built;
result = lappend(result, info);
}
@@ -190,7 +197,6 @@ list_mv_stats(Oid relid)
return result;
}
-
/*
* Find attnums of MV stats using the mvoid.
*/
@@ -236,9 +242,16 @@ find_mv_attnums(Oid mvoid, Oid *relid)
}
+/*
+ * FIXME This adds statistics, but we need to drop statistics when the
+ * table is dropped. Not sure what to do when a column is dropped.
+ * Either we can (a) remove all stats on that column, (b) remove
+ * the column from defined stats and force rebuild, (c) remove the
+ * column on next ANALYZE. Or maybe something else?
+ */
void
update_mv_stats(Oid mvoid,
- MVDependencies dependencies, MCVList mcvlist,
+ MVDependencies dependencies, MCVList mcvlist, MVHistogram histogram,
int2vector *attrs, VacAttrStats **stats)
{
HeapTuple stup,
@@ -271,22 +284,34 @@ update_mv_stats(Oid mvoid,
values[Anum_pg_mv_statistic_stamcv - 1] = PointerGetDatum(data);
}
+ if (histogram != NULL)
+ {
+ bytea * data = serialize_mv_histogram(histogram, attrs, stats);
+ nulls[Anum_pg_mv_statistic_stahist-1] = (data == NULL);
+ values[Anum_pg_mv_statistic_stahist - 1]
+ = PointerGetDatum(data);
+ }
+
/* always replace the value (either by bytea or NULL) */
replaces[Anum_pg_mv_statistic_stadeps -1] = true;
replaces[Anum_pg_mv_statistic_stamcv -1] = true;
+ replaces[Anum_pg_mv_statistic_stahist-1] = true;
/* always change the availability flags */
nulls[Anum_pg_mv_statistic_deps_built -1] = false;
nulls[Anum_pg_mv_statistic_mcv_built -1] = false;
+ nulls[Anum_pg_mv_statistic_hist_built-1] = false;
nulls[Anum_pg_mv_statistic_stakeys-1] = false;
/* use the new attnums, in case we removed some dropped ones */
replaces[Anum_pg_mv_statistic_deps_built-1] = true;
replaces[Anum_pg_mv_statistic_mcv_built -1] = true;
+ replaces[Anum_pg_mv_statistic_hist_built -1] = true;
replaces[Anum_pg_mv_statistic_stakeys -1] = true;
values[Anum_pg_mv_statistic_deps_built-1] = BoolGetDatum(dependencies != NULL);
values[Anum_pg_mv_statistic_mcv_built -1] = BoolGetDatum(mcvlist != NULL);
+ values[Anum_pg_mv_statistic_hist_built -1] = BoolGetDatum(histogram != NULL);
values[Anum_pg_mv_statistic_stakeys -1] = PointerGetDatum(attrs);
/* Is there already a pg_mv_statistic tuple for this attribute? */
diff --git a/src/backend/utils/mvstats/histogram.c b/src/backend/utils/mvstats/histogram.c
new file mode 100644
index 0000000..933700f
--- /dev/null
+++ b/src/backend/utils/mvstats/histogram.c
@@ -0,0 +1,2316 @@
+/*-------------------------------------------------------------------------
+ *
+ * histogram.c
+ * POSTGRES multivariate histograms
+ *
+ *
+ * Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ * src/backend/utils/mvstats/histogram.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres.h"
+
+#include "fmgr.h"
+#include "funcapi.h"
+
+#include "utils/lsyscache.h"
+
+#include "common.h"
+#include <math.h>
+
+/*
+ * Multivariate histograms
+ * -----------------------
+ *
+ * Histograms are a collection of buckets, represented by n-dimensional
+ * rectangles. Each rectangle is delimited by a min/max value in each
+ * dimension, stored in an array, so that the bucket includes values
+ * fulfilling condition
+ *
+ * min[i] <= value[i] <= max[i]
+ *
+ * where 'i' is the dimension. In 1D this corresponds to a simple
+ * interval, in 2D to a rectangle, and in 3D to a block. If you can
+ * imagine this in 4D, congrats!
+ *
+ * In addition to the boundaries, each bucket tracks additional details:
+ *
+ * * frequency (fraction of tuples it matches)
+ * * whether the boundaries are inclusive or exclusive
+ * * whether the dimension contains only NULL values
+ * * number of distinct values in each dimension (for building)
+ *
+ * and possibly some additional information.
+ *
+ * We do expect to support multiple histogram types, with different
+ * features etc. The 'type' field is used to identify those types.
+ * Technically some histogram types might use completely different
+ * bucket representation, but that's not expected at the moment.
+ *
+ * Although the current implementation builds non-overlapping buckets,
+ * the code does not (and should not) rely on the non-overlapping
+ * nature - there are interesting types of histograms / histogram
+ * building algorithms producing overlapping buckets.
+ *
+ *
+ * NULL handling (create_null_buckets)
+ * -----------------------------------
+ * Another thing worth mentioning is handling of NULL values. It would
+ * be quite difficult to work with buckets containing NULL and non-NULL
+ * values for a single dimension. To work around this, the initial step
+ * in building a histogram is building a set of 'NULL-buckets', i.e.
+ * buckets with one or more NULL-only dimensions.
+ *
+ * After that, no buckets are mixing NULL and non-NULL values in one
+ * dimension, and the actual histogram building starts. As that only
+ * splits the buckets into smaller ones, the resulting buckets can't
+ * mix NULL and non-NULL values either.
+ *
+ * The maximum number of NULL-buckets is determined by the number of
+ * attributes the histogram is built on. For N-dimensional histogram,
+ * the maximum number of NULL-buckets is 2^N. So for 8 attributes
+ * (which is the current value of MVSTATS_MAX_DIMENSIONS), there may be
+ * up to 256 NULL-buckets.
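+ *
+ * For example with 2 columns there may be up to four such buckets,
+ * one per combination of NULL/non-NULL dimensions: (NULL, NULL),
+ * (NULL, non-NULL), (non-NULL, NULL) and (non-NULL, non-NULL).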
+ *
+ * Those buckets are only built if needed - if there are no NULL values
+ * in the data, no such buckets are built.
+ *
+ *
+ * Estimating selectivity
+ * ----------------------
+ * With histograms, we always "match" a whole bucket, not individual
+ * rows (or values), irrespective of the type of clause. Therefore we
+ * can't use the optimizations for equality clauses, as in MCV lists.
+ *
+ * The current implementation uses histograms to estimate these types
+ * of clauses (think of WHERE conditions):
+ *
+ * (a) equality clauses WHERE (a = 1) AND (b = 2)
+ * (b) inequality clauses WHERE (a < 1) AND (b >= 2)
+ * (c) NULL clauses WHERE (a IS NULL) AND (b IS NOT NULL)
+ * (d) OR-clauses WHERE (a = 1) OR (b = 2)
+ *
+ * It's possible to add more clauses, for example:
+ *
+ * (e) multi-var clauses WHERE (a > b)
+ *
+ * and so on. These are tasks for the future, not yet implemented.
+ *
+ * When used on low-cardinality data, histograms usually perform
+ * considerably worse than MCV lists (which are a good fit for this
+ * kind of data). This is especially true on categorical data, where
+ * ordering of the values is mostly unrelated to meaning of the data,
+ * as proper ordering is crucial for histograms.
+ *
+ * On high-cardinality data the histograms are usually a better choice,
+ * because MCV lists can't represent the distribution accurately enough.
+ *
+ * By evaluating a clause on a bucket, we may get one of three results:
+ *
+ * (a) FULL_MATCH - The bucket definitely matches the clause.
+ *
+ * (b) PARTIAL_MATCH - The bucket matches the clause, but not
+ * necessarily all the tuples it represents.
+ *
+ * (c) NO_MATCH - The bucket definitely does not match the clause.
+ *
+ * This may be illustrated using a range [1, 5], which is essentially
+ * a 1-D bucket. With clause
+ *
+ * WHERE (a < 10) => FULL_MATCH (all range values are below
+ * 10, so the whole bucket matches)
+ *
+ * WHERE (a < 3) => PARTIAL_MATCH (there may be values matching
+ * the clause, but we don't know how many)
+ *
+ * WHERE (a < 0) => NO_MATCH (the whole range is above 0, so
+ * no values from the bucket can match)
+ *
+ * Some clauses may produce only some of those results - for example
+ * equality clauses may never produce FULL_MATCH as we always hit only
+ * part of the bucket (we can't match both boundaries at the same time).
+ * This results in less accurate estimates compared to MCV lists, where
+ * we can hit an MCV item exactly (there's no PARTIAL match in MCV).
+ *
+ * There are clauses that may not produce any PARTIAL_MATCH results.
+ * A nice example of that is 'IS [NOT] NULL' clause, which either
+ * matches the bucket completely (FULL_MATCH) or not at all (NO_MATCH),
+ * thanks to how the NULL-buckets are constructed.
+ *
+ * Computing the total selectivity estimate is trivial - simply sum
+ * selectivities from all the FULL_MATCH and PARTIAL_MATCH buckets (but
+ * multiply the PARTIAL_MATCH buckets by 0.5 to minimize average error).
+ *
+ *
+ * Serialization
+ * -------------
+ * After building, the histogram is serialized into a more efficient
+ * form (dedup boundary values etc.). See serialize_mv_histogram() for
+ * more details about how it's done.
+ *
+ * Serialized histograms are marked with 'magic' constant, to make it
+ * easier to check the bytea value really is a serialized histogram.
+ *
+ * In the serialized form, values for each dimension are deduplicated,
+ * and referenced using an uint16 index. This saves a lot of space,
+ * because every time we split a bucket, we introduce a single new
+ * boundary value (to split the bucket by the selected dimension), but
+ * we actually copy all the boundary values for all dimensions. So for
+ * a histogram with 4 dimensions and 1000 buckets, we do have
+ *
+ * 1000 * 4 * 2 = 8000
+ *
+ * boundary values, but many of them are actually duplicated because
+ * the histogram started with a single bucket (8 boundary values) and
+ * then there were 999 splits (each introducing 1 new value):
+ *
+ * 8 + 999 = 1007
+ *
+ * So that's quite a large difference. Let's assume the Datum values are
+ * 8 bytes each. Storing the raw histogram would take ~ 64 kB, while
+ * with deduplication it's only ~18 kB.
+ *
+ * The difference may be removed by the transparent bytea compression,
+ * but the deduplication is also used to optimize the estimation. It's
+ * possible to process the deduplicated values, and then use this as
+ * a cache to minimize the actual function calls while checking the
+ * buckets. This significantly reduces the number of calls to the
+ * (often quite expensive) operator functions etc.
+ *
+ *
+ * The current limit on number of buckets (16384) is mostly arbitrary,
+ * but set so that it makes sure we don't exceed the number of distinct
+ * values indexable by uint16. In practice we could handle more buckets,
+ * because we index each dimension independently, and we do the splits
+ * over multiple dimensions.
+ *
+ * Histograms with more than 16k buckets are quite expensive to build
+ * and process, so the current limit is somewhat reasonable.
+ *
+ * The actual number of buckets is also related to statistics target,
+ * because we require MIN_BUCKET_ROWS (10) tuples per bucket before
+ * a split, so we can't have more than (2 * 300 * target / 10) buckets.
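+ * With the default statistics target (100) this means at most
+ * 2 * 300 * 100 / 10 = 6000 buckets.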
+ *
+ *
+ * TODO Maybe the distinct stats (both for combination of all columns
+ * and for combinations of various subsets of columns) should be
+ * moved to a separate structure (next to histogram/MCV/...) to
+ * make it useful even without a histogram computed etc.
+ *
+ * This would actually make mvcoeff (proposed by Kyotaro Horiguchi
+ * in [1]) possible. Seems like a good way to estimate GROUP BY
+ * cardinality, and also some other cases, pointed out by Kyotaro:
+ *
+ * [1] http://www.postgresql.org/message-id/20150515.152936.83796179.horiguchi.kyotaro@lab.ntt.co.jp
+ *
+ * This is not implemented at the moment, though. Also, Kyotaro's
+ * patch only works with pairs of columns, but maybe tracking all
+ * the combinations would be useful to handle more complex
+ * conditions. It only seems to handle equalities, though (but for
+ * GROUP BY estimation that's not a big deal).
+ */
+
+static MVBucket create_initial_mv_bucket(int numrows, HeapTuple *rows,
+ int2vector *attrs,
+ VacAttrStats **stats);
+
+static MVBucket select_bucket_to_partition(int nbuckets, MVBucket * buckets);
+
+static MVBucket partition_bucket(MVBucket bucket, int2vector *attrs,
+ VacAttrStats **stats,
+ int *ndistvalues, Datum **distvalues);
+
+static MVBucket copy_mv_bucket(MVBucket bucket, uint32 ndimensions);
+
+static void update_bucket_ndistinct(MVBucket bucket, int2vector *attrs,
+ VacAttrStats ** stats);
+
+static void update_dimension_ndistinct(MVBucket bucket, int dimension,
+ int2vector *attrs,
+ VacAttrStats ** stats,
+ bool update_boundaries);
+
+static void create_null_buckets(MVHistogram histogram, int bucket_idx,
+ int2vector *attrs, VacAttrStats ** stats);
+
+static int bsearch_comparator(const void * a, const void * b);
+
+/*
+ * Each serialized bucket needs to store (in this order):
+ *
+ * - number of tuples (float)
+ * - min inclusive flags (ndim * sizeof(bool))
+ * - max inclusive flags (ndim * sizeof(bool))
+ * - null dimension flags (ndim * sizeof(bool))
+ * - min boundary indexes (ndim * sizeof(uint16))
+ * - max boundary indexes (ndim * sizeof(uint16))
+ *
+ * So in total (the macro reserves a bit of slack in the index part):
+ *
+ * ndims * (4 * sizeof(uint16) + 3 * sizeof(bool)) + sizeof(float)
+ */
+#define BUCKET_SIZE(ndims) \
+ (ndims * (4 * sizeof(uint16) + 3 * sizeof(bool)) + sizeof(float))
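+
+/* e.g. BUCKET_SIZE(4) = 4 * (4 * 2 + 3 * 1) + 4 = 48 bytes per bucket */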
+
+/* pointers into a flat serialized bucket of BUCKET_SIZE(n) bytes */
+#define BUCKET_NTUPLES(b) ((float*)b)
+#define BUCKET_MIN_INCL(b,n) ((bool*)(b + sizeof(float)))
+#define BUCKET_MAX_INCL(b,n) (BUCKET_MIN_INCL(b,n) + n)
+#define BUCKET_NULLS_ONLY(b,n) (BUCKET_MAX_INCL(b,n) + n)
+#define BUCKET_MIN_INDEXES(b,n) ((uint16*)(BUCKET_NULLS_ONLY(b,n) + n))
+#define BUCKET_MAX_INDEXES(b,n) ((BUCKET_MIN_INDEXES(b,n) + n))
+
+/* can't split bucket with less than 10 rows */
+#define MIN_BUCKET_ROWS 10
+
+/*
+ * Data used while building the histogram.
+ */
+typedef struct HistogramBuildData {
+
+ float ndistinct; /* frequency of distinct values */
+
+ HeapTuple *rows; /* array of sample rows */
+ uint32 numrows; /* number of sample rows (array size) */
+
+ /*
+ * Number of distinct values in each dimension. This is used when
+ * building the histogram (and is not serialized/deserialized).
+ */
+ uint32 *ndistincts;
+
+} HistogramBuildData;
+
+typedef HistogramBuildData *HistogramBuild;
+
+/*
+ * Building a multivariate histogram. In short, it first creates a
+ * single bucket containing all the rows, and then repeatedly splits
+ * it, each time searching for the bucket / dimension most in need of
+ * a split.
+ *
+ * The current criterion is rather simple, chosen so that the algorithm
+ * produces buckets with about equal frequency and regular size.
+ *
+ * See the discussion at select_bucket_to_partition and partition_bucket
+ * for more details about the algorithm.
+ *
+ * The current algorithm works like this:
+ *
+ * build NULL-buckets (create_null_buckets)
+ *
+ * while [not reaching maximum number of buckets]
+ *
+ * choose bucket to partition (largest bucket)
+ * if no bucket to partition
+ * terminate the algorithm
+ *
+ * choose bucket dimension to partition (largest dimension)
+ * split the bucket into two buckets
+ */
+MVHistogram
+build_mv_histogram(int numrows, HeapTuple *rows, int2vector *attrs,
+ VacAttrStats **stats, int numrows_total)
+{
+ int i;
+ int numattrs = attrs->dim1;
+
+ int *ndistvalues;
+ Datum **distvalues;
+
+ MVHistogram histogram = (MVHistogram)palloc0(sizeof(MVHistogramData));
+
+ HeapTuple * rows_copy = (HeapTuple*)palloc0(numrows * sizeof(HeapTuple));
+ memcpy(rows_copy, rows, sizeof(HeapTuple) * numrows);
+
+ Assert((numattrs >= 2) && (numattrs <= MVSTATS_MAX_DIMENSIONS));
+
+ histogram->ndimensions = numattrs;
+
+ histogram->magic = MVSTAT_HIST_MAGIC;
+ histogram->type = MVSTAT_HIST_TYPE_BASIC;
+ histogram->nbuckets = 1;
+
+ /* create max buckets (better than repalloc for short-lived objects) */
+ histogram->buckets
+ = (MVBucket*)palloc0(MVSTAT_HIST_MAX_BUCKETS * sizeof(MVBucket));
+
+ /* create the initial bucket, covering the whole sample set */
+ histogram->buckets[0]
+ = create_initial_mv_bucket(numrows, rows_copy, attrs, stats);
+
+ /*
+ * Collect info on distinct values in each dimension (used later
+ * to select dimension to partition).
+ */
+ ndistvalues = (int*)palloc0(sizeof(int) * numattrs);
+ distvalues = (Datum**)palloc0(sizeof(Datum*) * numattrs);
+
+ for (i = 0; i < numattrs; i++)
+ {
+ int j;
+ int nvals;
+ Datum *tmp;
+
+ SortSupportData ssup;
+ StdAnalyzeData *mystats = (StdAnalyzeData *) stats[i]->extra_data;
+
+ /* initialize sort support, etc. */
+ memset(&ssup, 0, sizeof(ssup));
+ ssup.ssup_cxt = CurrentMemoryContext;
+
+ /* We always use the default collation for statistics */
+ ssup.ssup_collation = DEFAULT_COLLATION_OID;
+ ssup.ssup_nulls_first = false;
+
+ PrepareSortSupportFromOrderingOp(mystats->ltopr, &ssup);
+
+ nvals = 0;
+ tmp = (Datum*)palloc0(sizeof(Datum) * numrows);
+
+ for (j = 0; j < numrows; j++)
+ {
+ bool isnull;
+
+ /* fetch the attribute value for this sample row */
+ Datum value = heap_getattr(rows[j], attrs->values[i],
+ stats[i]->tupDesc, &isnull);
+
+ if (isnull)
+ continue;
+
+ tmp[nvals++] = value;
+ }
+
+ /* do the sort and stuff only if there are non-NULL values */
+ if (nvals > 0)
+ {
+ /* sort the array of values */
+ qsort_arg((void *) tmp, nvals, sizeof(Datum),
+ compare_scalars_simple, (void *) &ssup);
+
+ /* count distinct values */
+ ndistvalues[i] = 1;
+ for (j = 1; j < nvals; j++)
+ if (compare_scalars_simple(&tmp[j], &tmp[j-1], &ssup) != 0)
+ ndistvalues[i] += 1;
+
+ /* FIXME allocate only needed space (count ndistinct first) */
+ distvalues[i] = (Datum*)palloc0(sizeof(Datum) * ndistvalues[i]);
+
+ /* now collect distinct values into the array */
+ distvalues[i][0] = tmp[0];
+ ndistvalues[i] = 1;
+
+ for (j = 1; j < nvals; j++)
+ {
+ if (compare_scalars_simple(&tmp[j], &tmp[j-1], &ssup) != 0)
+ {
+ distvalues[i][ndistvalues[i]] = tmp[j];
+ ndistvalues[i] += 1;
+ }
+ }
+ }
+
+ pfree(tmp);
+ }
+
+ /*
+ * The initial bucket may contain NULL values, so we have to create
+ * buckets with NULL-only dimensions.
+ *
+ * FIXME We may need up to 2^ndims buckets - check that there are
+ * enough buckets (MVSTAT_HIST_MAX_BUCKETS >= 2^ndims).
+ */
+ create_null_buckets(histogram, 0, attrs, stats);
+
+ while (histogram->nbuckets < MVSTAT_HIST_MAX_BUCKETS)
+ {
+ MVBucket bucket = select_bucket_to_partition(histogram->nbuckets,
+ histogram->buckets);
+
+ /* no more buckets to partition */
+ if (bucket == NULL)
+ break;
+
+ histogram->buckets[histogram->nbuckets]
+ = partition_bucket(bucket, attrs, stats,
+ ndistvalues, distvalues);
+
+ histogram->nbuckets += 1;
+ }
+
+ /* finalize the frequencies etc. */
+ for (i = 0; i < histogram->nbuckets; i++)
+ {
+ HistogramBuild build_data
+ = ((HistogramBuild)histogram->buckets[i]->build_data);
+
+ /*
+ * The frequency has to be computed from the whole sample, in
+ * case some of the rows were used for MCV (and thus are missing
+ * from the histogram).
+ */
+ histogram->buckets[i]->ntuples
+ = (build_data->numrows * 1.0) / numrows_total;
+ }
+
+ return histogram;
+}
+
+/* fetch the histogram (as a bytea) from the pg_mv_statistic catalog */
+MVSerializedHistogram
+load_mv_histogram(Oid mvoid)
+{
+ bool isnull = false;
+ Datum histogram;
+
+#ifdef USE_ASSERT_CHECKING
+ Form_pg_mv_statistic mvstat;
+#endif
+
+ /* Fetch the pg_mv_statistic tuple for the given statistics OID. */
+ HeapTuple htup = SearchSysCache1(MVSTATOID, ObjectIdGetDatum(mvoid));
+
+ if (! HeapTupleIsValid(htup))
+ return NULL;
+
+#ifdef USE_ASSERT_CHECKING
+ mvstat = (Form_pg_mv_statistic) GETSTRUCT(htup);
+ Assert(mvstat->hist_enabled && mvstat->hist_built);
+#endif
+
+ histogram = SysCacheGetAttr(MVSTATOID, htup,
+ Anum_pg_mv_statistic_stahist, &isnull);
+
+ Assert(!isnull);
+
+ ReleaseSysCache(htup);
+
+ return deserialize_mv_histogram(DatumGetByteaP(histogram));
+}
+
+/* print some basic info about the histogram */
+Datum
+pg_mv_stats_histogram_info(PG_FUNCTION_ARGS)
+{
+ bytea *data = PG_GETARG_BYTEA_P(0);
+ char *result;
+
+ MVSerializedHistogram hist = deserialize_mv_histogram(data);
+
+ result = palloc0(128);
+ snprintf(result, 128, "nbuckets=%d", hist->nbuckets);
+
+ PG_RETURN_TEXT_P(cstring_to_text(result));
+}
+
+
+/* used to pass context into bsearch() */
+static SortSupport ssup_private = NULL;
+
+/*
+ * Serialize the MV histogram into a bytea value. The basic algorithm
+ * is simple, and mostly mimics the MCV serialization:
+ *
+ * (1) perform deduplication for each attribute (separately)
+ * (a) collect all (non-NULL) attribute values from all buckets
+ * (b) sort the data (using 'lt' from VacAttrStats)
+ * (c) remove duplicate values from the array
+ *
+ * (2) serialize the arrays into a bytea value
+ *
+ * (3) process all buckets
+ * (a) replace min/max values with indexes into the arrays
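+ *
+ * For example, if the deduplicated array for some dimension is
+ * {10, 20, 30}, a bucket with min = 20 and max = 30 in that
+ * dimension stores the indexes 1 and 2 instead of the two values.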
+ *
+ * Each attribute has to be processed separately, because we're mixing
+ * different datatypes, and we don't know what equality means for them.
+ * We're also mixing pass-by-value and pass-by-ref types, and so on.
+ *
+ * We'll use 16-bit (uint16) values for the indexes in step (3), as
+ * we don't allow more than 16k buckets in the histogram
+ * (MVSTAT_HIST_MAX_BUCKETS), so the number of distinct boundary
+ * values per dimension always fits into uint16. Most of the high
+ * index bytes will be 0x00, so the varlena compression should
+ * handle the result nicely anyway.
+ *
+ *
+ * Deduplication in serialization
+ * ------------------------------
+ * The deduplication is very effective and important here, because every
+ * time we split a bucket, we keep all the boundary values, except for
+ * the dimension that was used for the split. Another way to look at
+ * this is that each split introduces 1 new value (the value used to do
+ * the split). A histogram with M buckets was created by (M-1) splits
+ * of the initial bucket, and each bucket has 2*N boundary values. So
+ * assuming the initial bucket does not have any 'collapsed' dimensions,
+ * the number of distinct values is
+ *
+ * (2*N + (M-1))
+ *
+ * but the total number of boundary values is
+ *
+ * 2*N*M
+ *
+ * which is clearly much higher. For a histogram on two columns, with
+ * 1024 buckets, it's 1027 vs. 4096. Of course, we're not saving all
+ * the difference (because we'll use 32-bit indexes into the values).
+ * But with large values (e.g. stored as varlena), this saves a lot.
+ *
+ * An interesting feature is that the total number of distinct values
+ * does not really grow with the number of dimensions, except for the
+ * size of the initial bucket. After that it only depends on number of
+ * buckets (i.e. number of splits).
+ *
+ * XXX Of course this only holds for the current histogram building
+ * algorithm. Algorithms doing the splits differently (e.g.
+ * producing overlapping buckets) may behave differently.
+ *
+ * TODO This only confirms we can use the uint16 indexes. The worst
+ * that could happen is if all the splits happened by a single
+ * dimension. To exhaust the uint16 this would require ~64k
+ * splits (needs to be reflected in MVSTAT_HIST_MAX_BUCKETS).
+ *
+ * TODO We don't need to use a separate boolean for each flag, instead
+ * use a single char and set bits.
+ *
+ * TODO We might get a bit better compression by considering the actual
+ * data type length. The current implementation treats all data
+ * types passed by value as requiring 8B, but for INT it's actually
+ * just 4B etc.
+ *
+ * OTOH this is only related to the lookup table, and most of the
+ * space is occupied by the buckets (with uint16 indexes).
+ *
+ *
+ * Varlena compression
+ * -------------------
+ * This encoding may prevent automatic varlena compression (similarly
+ * to JSONB), because the first part of the serialized bytea will be
+ * an array of unique values (although sorted), and pglz decides
+ * whether to compress by trying to compress the first part (~1kB or
+ * so), which is likely to compress poorly due to the lack of
+ * repetition.
+ *
+ * One possible cure to that might be storing the buckets first, and
+ * then the deduplicated arrays. The buckets might be better suited
+ * for compression.
+ *
+ * On the other hand the encoding scheme is a context-aware compression,
+ * usually compressing to ~30% (or less, with large data types). So the
+ * lack of pglz compression may be OK.
+ *
+ * XXX But maybe we don't really want to compress this, to save on
+ * planning time?
+ *
+ * TODO Try storing the buckets / deduplicated arrays in reverse order,
+ * measure impact on compression.
+ *
+ *
+ * Deserialization
+ * ---------------
+ * The deserialization is currently implemented so that it reconstructs
+ * the histogram back into the same structures - this involves quite
+ * a few memcpy() and palloc() calls, but maybe we could create a special
+ * structure for the serialized histogram, and access the data directly,
+ * without the unpacking.
+ *
+ * Not only would it save some memory and CPU time, but it might
+ * actually work better with CPU caches (not polluting them).
+ *
+ * TODO Try to keep the compressed form, instead of deserializing it to
+ * MVHistogram/MVBucket.
+ *
+ *
+ * General TODOs
+ * -------------
+ * FIXME This probably leaks memory, or at least uses it inefficiently
+ * (many small palloc() calls instead of a large one).
+ *
+ * TODO Consider packing boolean flags (NULL) for each item into 'char'
+ * or a longer type (instead of using an array of bool items).
+ */
+bytea *
+serialize_mv_histogram(MVHistogram histogram, int2vector *attrs,
+ VacAttrStats **stats)
+{
+ int i = 0, j = 0;
+ Size total_length = 0;
+
+ bytea *output = NULL;
+ char *data = NULL;
+
+ int nbuckets = histogram->nbuckets;
+ int ndims = histogram->ndimensions;
+
+ /* allocated for serialized bucket data */
+ int bucketsize = BUCKET_SIZE(ndims);
+ char *bucket = palloc0(bucketsize);
+
+ /* values per dimension (and number of non-NULL values) */
+ Datum **values = (Datum**)palloc0(sizeof(Datum*) * ndims);
+ int *counts = (int*)palloc0(sizeof(int) * ndims);
+
+ /* info about dimensions (for deserialize) */
+ DimensionInfo * info
+ = (DimensionInfo *)palloc0(sizeof(DimensionInfo)*ndims);
+
+ /* sort support data */
+ SortSupport ssup = (SortSupport)palloc0(sizeof(SortSupportData)*ndims);
+
+ /* collect and deduplicate values for each dimension separately */
+ for (i = 0; i < ndims; i++)
+ {
+ int count;
+ StdAnalyzeData *tmp = (StdAnalyzeData *)stats[i]->extra_data;
+
+ /* keep important info about the data type */
+ info[i].typlen = stats[i]->attrtype->typlen;
+ info[i].typbyval = stats[i]->attrtype->typbyval;
+
+ /*
+ * Allocate space for all min/max values, including NULLs
+ * (we won't use them, but we don't know how many are there),
+ * and then collect all non-NULL values.
+ */
+ values[i] = (Datum*)palloc0(sizeof(Datum) * nbuckets * 2);
+
+ for (j = 0; j < histogram->nbuckets; j++)
+ {
+ /* skip buckets where this dimension is NULL-only */
+ if (! histogram->buckets[j]->nullsonly[i])
+ {
+ values[i][counts[i]] = histogram->buckets[j]->min[i];
+ counts[i] += 1;
+
+ values[i][counts[i]] = histogram->buckets[j]->max[i];
+ counts[i] += 1;
+ }
+ }
+
+ /* there are just NULL values in this dimension */
+ if (counts[i] == 0)
+ continue;
+
+ /* sort and deduplicate */
+ ssup[i].ssup_cxt = CurrentMemoryContext;
+ ssup[i].ssup_collation = DEFAULT_COLLATION_OID;
+ ssup[i].ssup_nulls_first = false;
+
+ PrepareSortSupportFromOrderingOp(tmp->ltopr, &ssup[i]);
+
+ qsort_arg(values[i], counts[i], sizeof(Datum),
+ compare_scalars_simple, &ssup[i]);
+
+ /*
+ * Walk through the array and eliminate duplicate values, but
+ * keep the ordering (so that we can do bsearch later). We know
+ * there's at least 1 item, so we can skip the first element.
+ */
+ count = 1; /* number of deduplicated items */
+ for (j = 1; j < counts[i]; j++)
+ {
+ /* if it's different from the previous value, we need to keep it */
+ if (compare_datums_simple(values[i][j-1], values[i][j], &ssup[i]) != 0)
+ {
+ /* XXX: not needed if (count == j) */
+ values[i][count] = values[i][j];
+ count += 1;
+ }
+ }
+
+ /* make sure we fit into uint16 */
+ Assert(count <= UINT16_MAX);
+
+ /* keep info about the deduplicated count */
+ info[i].nvalues = count;
+
+ /* compute size of the serialized data */
+ if (info[i].typlen > 0)
+ /* byval or byref, but with fixed length (name, tid, ...) */
+ info[i].nbytes = info[i].nvalues * info[i].typlen;
+ else if (info[i].typlen == -1)
+ /* varlena, so just use VARSIZE_ANY */
+ for (j = 0; j < info[i].nvalues; j++)
+ info[i].nbytes += VARSIZE_ANY(values[i][j]);
+ else if (info[i].typlen == -2)
+ /* cstring, so simply strlen */
+ for (j = 0; j < info[i].nvalues; j++)
+ info[i].nbytes += strlen(DatumGetPointer(values[i][j]));
+ else
+ elog(ERROR, "unknown data type typbyval=%d typlen=%d",
+ info[i].typbyval, info[i].typlen);
+ }
+
+ /*
+ * Now we finally know how much space we'll need for the serialized
+ * histogram, as it contains these fields:
+ *
+ * - length (4B) for varlena
+ * - magic (4B)
+ * - type (4B)
+ * - ndimensions (4B)
+ * - nbuckets (4B)
+ * - info (ndim * sizeof(DimensionInfo)
+ * - arrays of values for each dimension
+ * - serialized buckets (nbuckets * bucketsize)
+ *
+ * So the 'header' size is 20B + ndim * sizeof(DimensionInfo) and
+ * then we'll place the data (and buckets).
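+ *
+ * For example (illustrative), a 2-dimensional histogram with 1024
+ * buckets needs 20 + 2 * sizeof(DimensionInfo) bytes for the header,
+ * 1024 * BUCKET_SIZE(2) bytes for the buckets, and then the space
+ * for the deduplicated value arrays.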
+ */
+ total_length = (sizeof(int32) + offsetof(MVHistogramData, buckets)
+ + ndims * sizeof(DimensionInfo)
+ + nbuckets * bucketsize);
+
+ /* account for the deduplicated data */
+ for (i = 0; i < ndims; i++)
+ total_length += info[i].nbytes;
+
+ /* enforce arbitrary limit of 10MB */
+ if (total_length > (10 * 1024 * 1024))
+ elog(ERROR, "serialized histogram exceeds 10MB (%ld > %d)",
+ total_length, (10 * 1024 * 1024));
+
+ /* allocate space for the serialized histogram list, set header */
+ output = (bytea*)palloc0(total_length);
+ SET_VARSIZE(output, total_length);
+
+ /* we'll use 'data' to keep track of the place to write data */
+ data = VARDATA(output);
+
+ memcpy(data, histogram, offsetof(MVHistogramData, buckets));
+ data += offsetof(MVHistogramData, buckets);
+
+ memcpy(data, info, sizeof(DimensionInfo) * ndims);
+ data += sizeof(DimensionInfo) * ndims;
+
+ /* value array for each dimension */
+ for (i = 0; i < ndims; i++)
+ {
+#ifdef USE_ASSERT_CHECKING
+ char *tmp = data;
+#endif
+ for (j = 0; j < info[i].nvalues; j++)
+ {
+ if (info[i].typlen > 0)
+ {
+ /* passed by value or by reference, but fixed length */
+ memcpy(data, &values[i][j], info[i].typlen);
+ data += info[i].typlen;
+ }
+ else if (info[i].typlen == -1)
+ {
+ /* varlena */
+ memcpy(data, DatumGetPointer(values[i][j]),
+ VARSIZE_ANY(values[i][j]));
+ data += VARSIZE_ANY(values[i][j]);
+ }
+ else if (info[i].typlen == -2)
+ {
+ /* cstring (don't forget the \0 terminator!) */
+ memcpy(data, DatumGetPointer(values[i][j]),
+ strlen(DatumGetPointer(values[i][j])) + 1);
+ data += strlen(DatumGetPointer(values[i][j])) + 1;
+ }
+ }
+ Assert((data - tmp) == info[i].nbytes);
+ }
+
+ /* and finally, the histogram buckets */
+ for (i = 0; i < nbuckets; i++)
+ {
+ /* don't write beyond the allocated space */
+ Assert(data <= (char*)output + total_length - bucketsize);
+
+ /* reset the values for each item */
+ memset(bucket, 0, bucketsize);
+
+ *BUCKET_NTUPLES(bucket) = histogram->buckets[i]->ntuples;
+
+ for (j = 0; j < ndims; j++)
+ {
+ /* do the lookup only for non-NULL values */
+ if (! histogram->buckets[i]->nullsonly[j])
+ {
+ uint16 idx;
+ Datum * v = NULL;
+ ssup_private = &ssup[j];
+
+ /* min boundary */
+ v = (Datum*)bsearch(&histogram->buckets[i]->min[j],
+ values[j], info[j].nvalues, sizeof(Datum),
+ bsearch_comparator);
+
+ if (v == NULL)
+ elog(ERROR, "value for dim %d not found in array", j);
+
+ /* compute index within the array */
+ idx = (v - values[j]);
+
+ Assert((idx >= 0) && (idx < info[j].nvalues));
+
+ BUCKET_MIN_INDEXES(bucket, ndims)[j] = idx;
+
+ /* max boundary */
+ v = (Datum*)bsearch(&histogram->buckets[i]->max[j],
+ values[j], info[j].nvalues, sizeof(Datum),
+ bsearch_comparator);
+
+ if (v == NULL)
+ elog(ERROR, "value for dim %d not found in array", j);
+
+ /* compute index within the array */
+ idx = (v - values[j]);
+
+ Assert((idx >= 0) && (idx < info[j].nvalues));
+
+ BUCKET_MAX_INDEXES(bucket, ndims)[j] = idx;
+ }
+ }
+
+ /* copy flags (nulls, min/max inclusive) */
+ memcpy(BUCKET_NULLS_ONLY(bucket, ndims),
+ histogram->buckets[i]->nullsonly, sizeof(bool) * ndims);
+
+ memcpy(BUCKET_MIN_INCL(bucket, ndims),
+ histogram->buckets[i]->min_inclusive, sizeof(bool) * ndims);
+
+ memcpy(BUCKET_MAX_INCL(bucket, ndims),
+ histogram->buckets[i]->max_inclusive, sizeof(bool) * ndims);
+
+ /* copy the item into the array */
+ memcpy(data, bucket, bucketsize);
+
+ data += bucketsize;
+ }
+
+ /* at this point we expect to match the total_length exactly */
+ Assert((data - (char*)output) == total_length);
+
+ /* FIXME free the values/counts arrays here */
+
+ return output;
+}
+
+/*
+ * Returns histogram in a partially-serialized form (keeps the boundary
+ * values deduplicated, so that it's possible to optimize the estimation
+ * part by caching function call results between buckets etc.).
+ */
+MVSerializedHistogram
+deserialize_mv_histogram(bytea * data)
+{
+ int i = 0, j = 0;
+
+ Size expected_size;
+ char *tmp = NULL;
+
+ MVSerializedHistogram histogram;
+ DimensionInfo *info;
+
+ int nbuckets;
+ int ndims;
+ int bucketsize;
+
+ /* temporary deserialization buffer */
+ int bufflen;
+ char *buff;
+ char *ptr;
+
+ if (data == NULL)
+ return NULL;
+
+ if (VARSIZE_ANY_EXHDR(data) < offsetof(MVSerializedHistogramData,buckets))
+ elog(ERROR, "invalid histogram size %ld (expected at least %ld)",
+ VARSIZE_ANY_EXHDR(data), offsetof(MVSerializedHistogramData,buckets));
+
+ /* read the histogram header */
+ histogram
+ = (MVSerializedHistogram)palloc(sizeof(MVSerializedHistogramData));
+
+ /* initialize pointer to the data part (skip the varlena header) */
+ tmp = VARDATA(data);
+
+ /* get the header and perform basic sanity checks */
+ memcpy(histogram, tmp, offsetof(MVSerializedHistogramData, buckets));
+ tmp += offsetof(MVSerializedHistogramData, buckets);
+
+ if (histogram->magic != MVSTAT_HIST_MAGIC)
+ elog(ERROR, "invalid histogram magic %d (expected %dd)",
+ histogram->magic, MVSTAT_HIST_MAGIC);
+
+ if (histogram->type != MVSTAT_HIST_TYPE_BASIC)
+ elog(ERROR, "invalid histogram type %d (expected %dd)",
+ histogram->type, MVSTAT_HIST_TYPE_BASIC);
+
+ nbuckets = histogram->nbuckets;
+ ndims = histogram->ndimensions;
+ bucketsize = BUCKET_SIZE(ndims);
+
+ Assert((nbuckets > 0) && (nbuckets <= MVSTAT_HIST_MAX_BUCKETS));
+ Assert((ndims >= 2) && (ndims <= MVSTATS_MAX_DIMENSIONS));
+
+ /*
+ * What size do we expect with those parameters (it's incomplete,
+ * as we have yet to count the array sizes from the DimensionInfo
+ * records).
+ */
+ expected_size = offsetof(MVSerializedHistogramData,buckets) +
+ ndims * sizeof(DimensionInfo) +
+ (nbuckets * bucketsize);
+
+ /* check that we have at least the DimensionInfo records */
+ if (VARSIZE_ANY_EXHDR(data) < expected_size)
+ elog(ERROR, "invalid histogram size %ld (expected %ld)",
+ VARSIZE_ANY_EXHDR(data), expected_size);
+
+ info = (DimensionInfo*)(tmp);
+ tmp += ndims * sizeof(DimensionInfo);
+
+ /* account for the value arrays */
+ for (i = 0; i < ndims; i++)
+ expected_size += info[i].nbytes;
+
+ if (VARSIZE_ANY_EXHDR(data) != expected_size)
+ elog(ERROR, "invalid histogram size %ld (expected %ld)",
+ VARSIZE_ANY_EXHDR(data), expected_size);
+
+ /* looks OK - not corrupted or something */
+
+ /* now let's allocate a single buffer for all the values and counts */
+
+ bufflen = (sizeof(int) + sizeof(Datum*)) * ndims;
+ for (i = 0; i < ndims; i++)
+ {
+ /* don't allocate space for byval types, matching Datum */
+ if (! (info[i].typbyval && (info[i].typlen == sizeof(Datum))))
+ bufflen += (sizeof(Datum) * info[i].nvalues);
+ }
+
+ /* also, include space for the result, tracking the buckets */
+ bufflen += nbuckets * (
+ sizeof(MVSerializedBucket) + /* bucket pointer */
+ sizeof(MVSerializedBucketData)); /* bucket data */
+
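+ /*
+ * Buffer layout: per-dimension value counts, per-dimension pointer
+ * arrays, the Datum arrays themselves (only for types that can't
+ * reuse the serialized data in place), the array of bucket pointers,
+ * and finally the bucket structs.
+ */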
+ buff = palloc(bufflen);
+ ptr = buff;
+
+ histogram->nvalues = (int*)ptr;
+ ptr += (sizeof(int) * ndims);
+
+ histogram->values = (Datum**)ptr;
+ ptr += (sizeof(Datum*) * ndims);
+
+ /*
+ * FIXME This uses pointers to the original data array (the types
+ * not passed by value), so when someone frees the memory,
+ * e.g. by doing something like this:
+ *
+ * bytea * data = ... fetch the data from catalog ...
+ * MCVList mcvlist = deserialize_mcv_list(data);
+ * pfree(data);
+ *
+ * then 'mcvlist' references the freed memory. This needs to
+ * copy the pieces.
+ *
+ * TODO same as in MCV deserialization / consider moving to common.c
+ */
+ for (i = 0; i < ndims; i++)
+ {
+ histogram->nvalues[i] = info[i].nvalues;
+
+ if (info[i].typbyval && info[i].typlen == sizeof(Datum))
+ {
+ /* passed by value / Datum - simply reuse the array */
+ histogram->values[i] = (Datum*)tmp;
+ tmp += info[i].nbytes;
+ }
+ else
+ {
+ /* all the varlena data need a chunk from the buffer */
+ histogram->values[i] = (Datum*)ptr;
+ ptr += (sizeof(Datum) * info[i].nvalues);
+
+ if (info[i].typbyval)
+ {
+ /* passed by value, but smaller than Datum */
+ for (j = 0; j < info[i].nvalues; j++)
+ {
+ /* copy the value from the serialized data */
+ memcpy(&histogram->values[i][j], tmp, info[i].typlen);
+ tmp += info[i].typlen;
+ }
+ }
+ else if (info[i].typlen > 0)
+ {
+ /* passed by reference, but fixed length (name, tid, ...) */
+ for (j = 0; j < info[i].nvalues; j++)
+ {
+ /* just point into the array */
+ histogram->values[i][j] = PointerGetDatum(tmp);
+ tmp += info[i].typlen;
+ }
+ }
+ else if (info[i].typlen == -1)
+ {
+ /* varlena */
+ for (j = 0; j < info[i].nvalues; j++)
+ {
+ /* just point into the array */
+ histogram->values[i][j] = PointerGetDatum(tmp);
+ tmp += VARSIZE_ANY(tmp);
+ }
+ }
+ else if (info[i].typlen == -2)
+ {
+ /* cstring */
+ for (j = 0; j < info[i].nvalues; j++)
+ {
+ /* just point into the array */
+ histogram->values[i][j] = PointerGetDatum(tmp);
+ tmp += (strlen(tmp) + 1); /* don't forget the \0 */
+ }
+ }
+ }
+ }
+
+ histogram->buckets = (MVSerializedBucket*)ptr;
+ ptr += (sizeof(MVSerializedBucket) * nbuckets);
+
+ for (i = 0; i < nbuckets; i++)
+ {
+ MVSerializedBucket bucket = (MVSerializedBucket)ptr;
+ ptr += sizeof(MVSerializedBucketData);
+
+ bucket->ntuples = *BUCKET_NTUPLES(tmp);
+ bucket->nullsonly = BUCKET_NULLS_ONLY(tmp, ndims);
+ bucket->min_inclusive = BUCKET_MIN_INCL(tmp, ndims);
+ bucket->max_inclusive = BUCKET_MAX_INCL(tmp, ndims);
+
+ bucket->min = BUCKET_MIN_INDEXES(tmp, ndims);
+ bucket->max = BUCKET_MAX_INDEXES(tmp, ndims);
+
+ histogram->buckets[i] = bucket;
+
+ Assert(tmp <= (char*)data + VARSIZE_ANY(data));
+
+ tmp += bucketsize;
+ }
+
+ /* at this point we expect to match the total_length exactly */
+ Assert((tmp - VARDATA(data)) == expected_size);
+
+ /* we should exhaust the output buffer exactly */
+ Assert((ptr - buff) == bufflen);
+
+ return histogram;
+}
+
+/*
+ * Build the initial bucket, which will be then split into smaller ones.
+ */
+static MVBucket
+create_initial_mv_bucket(int numrows, HeapTuple *rows, int2vector *attrs,
+ VacAttrStats **stats)
+{
+ int i;
+ int numattrs = attrs->dim1;
+ HistogramBuild data = NULL;
+
+ /* TODO allocate bucket as a single piece, including all the fields. */
+ MVBucket bucket = (MVBucket)palloc0(sizeof(MVBucketData));
+
+ Assert(numrows > 0);
+ Assert(rows != NULL);
+ Assert((numattrs >= 2) && (numattrs <= MVSTATS_MAX_DIMENSIONS));
+
+ /* allocate the per-dimension arrays */
+
+ /* flags for null-only dimensions */
+ bucket->nullsonly = (bool*)palloc0(numattrs * sizeof(bool));
+
+ /* inclusiveness boundaries - lower/upper bounds */
+ bucket->min_inclusive = (bool*)palloc0(numattrs * sizeof(bool));
+ bucket->max_inclusive = (bool*)palloc0(numattrs * sizeof(bool));
+
+ /* lower/upper boundaries */
+ bucket->min = (Datum*)palloc0(numattrs * sizeof(Datum));
+ bucket->max = (Datum*)palloc0(numattrs * sizeof(Datum));
+
+ /* build-data */
+ data = (HistogramBuild)palloc0(sizeof(HistogramBuildData));
+
+ /* number of distinct values (per dimension) */
+ data->ndistincts = (uint32*)palloc0(numattrs * sizeof(uint32));
+
+ /* all the sample rows fall into the initial bucket */
+ data->numrows = numrows;
+ data->rows = rows;
+
+ bucket->build_data = data;
+
+ /*
+ * Update the number of distinct value combinations in the bucket
+ * (which we use when selecting the bucket to partition), and then
+ * the number of distinct values for each dimension (which we use
+ * when choosing which dimension to split).
+ */
+ update_bucket_ndistinct(bucket, attrs, stats);
+
+ /* Update ndistinct (and also set min/max) for all dimensions. */
+ for (i = 0; i < numattrs; i++)
+ update_dimension_ndistinct(bucket, i, attrs, stats, true);
+
+ return bucket;
+}
+
+/*
+ * Choose the bucket to partition next.
+ *
+ * The current criterion is rather simple, chosen so that the algorithm
+ * produces buckets with about equal frequency and regular size. We
+ * select the bucket with the highest number of distinct values, and
+ * then split it by the longest dimension.
+ *
+ * The distinct values are uniformly mapped to [0,1] interval, and this
+ * is used to compute length of the value range.
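+ *
+ * For illustration (hypothetical numbers): if a dimension has 1000
+ * deduplicated distinct values and the bucket boundaries fall at
+ * indexes 120 and 370 of that array, the normalized length of that
+ * dimension is (370 - 120) / 1000 = 0.25.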
+ *
+ * NOTE: This is not the same array used for deduplication, as this
+ * contains values for all the tuples from the sample, not just
+ * the boundary values.
+ *
+ * Returns either pointer to the bucket selected to be partitioned,
+ * or NULL if there are no buckets that may be split (i.e. all buckets
+ * contain a single distinct value).
+ *
+ * TODO Consider other partitioning criteria (v-optimal, maxdiff etc.).
+ * For example use the "bucket volume" (product of dimension
+ * lengths) to select the bucket.
+ *
+ * We need buckets containing about the same number of tuples (so
+ * about the same frequency), as that limits the error when we
+ * match the bucket partially (in that case we use 1/2 of the bucket).
+ *
+ * We also need buckets with "regular" size, i.e. not "narrow" in
+ * some dimensions and "wide" in the others, because that makes
+ * partial matches more likely and increases the estimation error,
+ * especially when the clauses match many buckets partially. This
+ * is especially serious for OR-clauses, because in that case any
+ * of them may add the bucket as a (partial) match. With AND-clauses
+ * all the clauses have to match the bucket, which makes this issue
+ * somewhat less pressing.
+ *
+ * For example this table:
+ *
+ * CREATE TABLE t AS SELECT i AS a, i AS b
+ * FROM generate_series(1,1000000) s(i);
+ * ALTER TABLE t ADD STATISTICS (histogram) ON (a,b);
+ * ANALYZE t;
+ *
+ * It's a very specific (and perhaps artificial) example, because
+ * every bucket always has exactly the same number of distinct
+ * values in all dimensions, which makes the partitioning tricky.
+ *
+ * Then:
+ *
+ * SELECT * FROM t WHERE a < 10 AND b < 10;
+ *
+ * is estimated to return ~120 rows, while in reality it returns 9.
+ *
+ * QUERY PLAN
+ * ----------------------------------------------------------------
+ * Seq Scan on t (cost=0.00..19425.00 rows=117 width=8)
+ * (actual time=0.185..270.774 rows=9 loops=1)
+ * Filter: ((a < 10) AND (b < 10))
+ * Rows Removed by Filter: 999991
+ *
+ * while the query using OR clauses is estimated like this:
+ *
+ * QUERY PLAN
+ * ----------------------------------------------------------------
+ * Seq Scan on t (cost=0.00..19425.00 rows=8100 width=8)
+ * (actual time=0.118..189.919 rows=9 loops=1)
+ * Filter: ((a < 10) OR (b < 10))
+ * Rows Removed by Filter: 999991
+ *
+ * which is clearly much worse. This happens because the histogram
+ * contains buckets like this:
+ *
+ * bucket 592 [3 30310] [30134 30593] => [0.000233]
+ *
+ * i.e. the length of "a" dimension is (30310-3)=30307, while the
+ * length of "b" is (30593-30134)=459. So the "b" dimension is much
+ * narrower than "a". Of course, there are buckets where "b" is the
+ * wider dimension.
+ *
+ * This is partially mitigated by selecting the "longest" dimension
+ * in partition_bucket() but that only happens after we already
+ * selected the bucket. So if we never select the bucket, we can't
+ * really fix it there.
+ *
+ * The other reason why this particular example behaves so poorly
+ * is due to the way we split the partition in partition_bucket().
+ * Currently we attempt to divide the bucket into two parts with
+ * the same number of sampled tuples (frequency), but that does not
+ * work well when all the tuples are squashed on one end of the
+ * bucket (e.g. exactly at the diagonal, as a=b). In that case we
+ * split the bucket into a tiny bucket on the diagonal, and a huge
+ * remaining part of the bucket, which is still going to be narrow
+ * and we're unlikely to fix that.
+ *
+ * So perhaps we need two partitioning strategies - one aiming to
+ * split buckets with high frequency (number of sampled rows), the
+ * other aiming to split "large" buckets. And alternating between
+ * them, somehow.
+ *
+ * TODO Allowing the bucket to degenerate to a single combination of
+ * values makes it a rather strange MCV list. Maybe we should use
+ * a higher lower boundary, or make the selection criteria more
+ * complex (e.g. consider the number of rows in the bucket, etc.).
+ *
+ * That however is different from buckets 'degenerated' only for
+ * some dimensions (e.g. half of them), which is perfectly
+ * appropriate for statistics on a combination of low and high
+ * cardinality columns.
+ */
+static MVBucket
+select_bucket_to_partition(int nbuckets, MVBucket * buckets)
+{
+ int i;
+ int numrows = 0;
+ MVBucket bucket = NULL;
+
+ for (i = 0; i < nbuckets; i++)
+ {
+ HistogramBuild data = (HistogramBuild)buckets[i]->build_data;
+ /* if the number of rows is higher, use this bucket */
+ if ((data->ndistinct > 2) &&
+ (data->numrows > numrows) &&
+ (data->numrows >= MIN_BUCKET_ROWS))
+ {
+ bucket = buckets[i];
+ numrows = data->numrows;
+ }
+ }
+
+ /* may be NULL if there are no buckets eligible for partitioning */
+ return bucket;
+}
+
+/*
+ * A simple bucket partitioning implementation - we choose the longest
+ * bucket dimension, measured using the array of distinct values built
+ * at the very beginning of the build.
+ *
+ * We map all the distinct values to a [0,1] interval, uniformly
+ * distributed, and then use this to measure length. It's essentially
+ * a number of distinct values within the range, normalized to [0,1].
+ *
+ * Then we choose a 'middle' value splitting the bucket into two parts
+ * with roughly the same frequency.
+ *
+ * This splits the bucket by tweaking the existing one, and returning
+ * the new bucket (essentially shrinking the existing one in-place and
+ * returning the other "half" as a new bucket). The caller is responsible
+ * for adding the new bucket into the list of buckets.
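+ *
+ * A sketch of the intended build loop (hypothetical caller, using
+ * the functions in this file):
+ *
+ * bucket = select_bucket_to_partition(nbuckets, buckets);
+ * if (bucket != NULL)
+ * buckets[nbuckets++] = partition_bucket(bucket, attrs, stats,
+ * ndistvalues, distvalues);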
+ *
+ * There are multiple histogram options, centered around the partitioning
+ * criteria, specifying both how to choose a bucket and the dimension
+ * most in need of a split. For a nice summary and general overview, see
+ * "rK-Hist : an R-Tree based histogram for multi-dimensional selectivity
+ * estimation" thesis by J. A. Lopez, Concordia University, p.34-37 (and
+ * possibly p. 32-34 for explanation of the terms).
+ *
+ * TODO It requires care to prevent splitting only one dimension and not
+ * splitting another one at all (which might happen easily in case
+ * of strongly dependent columns - e.g. y=x). The current algorithm
+ * minimizes this, but may still happen for perfectly dependent
+ * examples (when all the dimensions have equal length, the first
+ * one will be selected).
+ *
+ * TODO Should probably consider statistics target for the columns (e.g.
+ * to split dimensions with higher statistics target more frequently).
+ */
+static MVBucket
+partition_bucket(MVBucket bucket, int2vector *attrs,
+ VacAttrStats **stats,
+ int *ndistvalues, Datum **distvalues)
+{
+ int i;
+ int dimension;
+ int numattrs = attrs->dim1;
+
+ Datum split_value;
+ MVBucket new_bucket;
+ HistogramBuild new_data;
+
+ /* needed for sort, when looking for the split value */
+ bool isNull;
+ int nvalues = 0;
+ HistogramBuild data = (HistogramBuild)bucket->build_data;
+ StdAnalyzeData * mystats = NULL;
+ ScalarItem * values = (ScalarItem*)palloc0(data->numrows * sizeof(ScalarItem));
+ SortSupportData ssup;
+
+ /* looking for the split value */
+ int nrows = 1; /* number of rows below current value */
+ double delta;
+
+ /* needed when splitting the values */
+ HeapTuple * oldrows = data->rows;
+ int oldnrows = data->numrows;
+
+ /*
+ * We can't split buckets with a single distinct value (this also
+ * disqualifies NULL-only dimensions). Also, there have to be multiple
+ * sample rows (otherwise there couldn't be multiple distinct values).
+ */
+ Assert(data->ndistinct > 1);
+ Assert(data->numrows > 1);
+ Assert((numattrs >= 2) && (numattrs <= MVSTATS_MAX_DIMENSIONS));
+
+ /*
+ * Look for the next dimension to split.
+ */
+ delta = 0.0;
+ dimension = -1;
+
+ for (i = 0; i < numattrs; i++)
+ {
+ Datum *a, *b;
+
+ /* can't split NULL-only dimension */
+ if (bucket->nullsonly[i])
+ continue;
+
+ /* can't split dimension with a single distinct value */
+ if (data->ndistincts[i] <= 1)
+ continue;
+
+ mystats = (StdAnalyzeData *) stats[i]->extra_data;
+
+ /* initialize sort support, etc. */
+ memset(&ssup, 0, sizeof(ssup));
+ ssup.ssup_cxt = CurrentMemoryContext;
+
+ /* We always use the default collation for statistics */
+ ssup.ssup_collation = DEFAULT_COLLATION_OID;
+ ssup.ssup_nulls_first = false;
+
+ PrepareSortSupportFromOrderingOp(mystats->ltopr, &ssup);
+
+ /* sort support for the bsearch_comparator */
+ ssup_private = &ssup;
+
+ /* search for min boundary in the distinct list */
+ a = (Datum*)bsearch(&bucket->min[i],
+ distvalues[i], ndistvalues[i],
+ sizeof(Datum), bsearch_comparator);
+
+ b = (Datum*)bsearch(&bucket->max[i],
+ distvalues[i], ndistvalues[i],
+ sizeof(Datum), bsearch_comparator);
+
+ /* if this dimension is 'longer', partition by it */
+ if (((b-a)*1.0 / ndistvalues[i]) > delta)
+ {
+ delta = ((b-a)*1.0 / ndistvalues[i]);
+ dimension = i;
+ }
+ }
+
+ /*
+ * If we haven't found a dimension here, we've done something
+ * wrong in select_bucket_to_partition.
+ */
+ Assert(dimension != -1);
+
+ /*
+ * Walk through the selected dimension, collect and sort the values
+ * and then choose the value to use as the new boundary.
+ */
+ mystats = (StdAnalyzeData *) stats[dimension]->extra_data;
+
+ /* initialize sort support, etc. */
+ memset(&ssup, 0, sizeof(ssup));
+ ssup.ssup_cxt = CurrentMemoryContext;
+
+ /* We always use the default collation for statistics */
+ ssup.ssup_collation = DEFAULT_COLLATION_OID;
+ ssup.ssup_nulls_first = false;
+
+ PrepareSortSupportFromOrderingOp(mystats->ltopr, &ssup);
+
+ for (i = 0; i < data->numrows; i++)
+ {
+ /* remember the index of the sample row, to make the partitioning simpler */
+ values[nvalues].value = heap_getattr(data->rows[i], attrs->values[dimension],
+ stats[dimension]->tupDesc, &isNull);
+ values[nvalues].tupno = i;
+
+ /* no NULL values allowed here (we don't do splits by null-only dimensions) */
+ Assert(!isNull);
+
+ nvalues++;
+ }
+
+ /* sort the array of values */
+ qsort_arg((void *) values, nvalues, sizeof(ScalarItem),
+ compare_scalars_partition, (void *) &ssup);
+
+ /*
+ * We know there are data->ndistincts[dimension] distinct values
+ * in this dimension, and we want to split it in half, so walk
+ * through the array and stop once we see (ndistinct/2) values.
+ *
+ * We always choose the "next" value, i.e. (n/2+1)-th distinct value,
+ * and use it as an exclusive upper boundary (and inclusive lower
+ * boundary).
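+ *
+ * A worked example (hypothetical values): given the sorted values
+ * {1,1,1,2,2,3,3,3,3,3} (numrows = 10), the candidate boundaries
+ * are i=3 (value 2, distance |3-5| = 2) and i=5 (value 3, distance
+ * |5-5| = 0), so value 3 becomes the split value - the first 5 rows
+ * stay in the original bucket, the remaining 5 go to the new one.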
+ *
+ * TODO Maybe we should use "average" of the two middle distinct
+ * values (at least for even distinct counts), but that would
+ * require being able to do an average (which does not work
+ * for non-arithmetic types).
+ *
+ * TODO Another option is to look for a split that'd give about
+ * 50% tuples (not distinct values) in each partition. That
+ * might work better when there are a few very frequent
+ * values, and many rare ones.
+ */
+ delta = fabs(data->numrows);
+ split_value = values[0].value;
+
+ for (i = 1; i < data->numrows; i++)
+ {
+ if (values[i].value != values[i-1].value)
+ {
+ /* are we closer to splitting the bucket in half? */
+ if (fabs(i - data->numrows/2.0) < delta)
+ {
+ /* let's assume we'll use this value for the split */
+ split_value = values[i].value;
+ delta = fabs(i - data->numrows/2.0);
+ nrows = i;
+ }
+ }
+ }
+
+ Assert(nrows > 0);
+ Assert(nrows < data->numrows);
+
+ /* create the new bucket as an (incomplete) copy of the one being partitioned */
+ new_bucket = copy_mv_bucket(bucket, numattrs);
+ new_data = (HistogramBuild)new_bucket->build_data;
+
+ /*
+ * Do the actual split of the chosen dimension, using the split value as the
+ * upper bound for the existing bucket, and lower bound for the new one.
+ */
+ bucket->max[dimension] = split_value;
+ new_bucket->min[dimension] = split_value;
+
+ bucket->max_inclusive[dimension] = false;
+ new_bucket->min_inclusive[dimension] = true;
+
+ /*
+ * Redistribute the sample tuples using the 'ScalarItem->tupno'
+ * index. We know 'nrows' rows should remain in the original
+ * bucket and the rest goes to the new one.
+ */
+
+ data->rows = (HeapTuple*)palloc0(nrows * sizeof(HeapTuple));
+ new_data->rows = (HeapTuple*)palloc0((oldnrows - nrows) * sizeof(HeapTuple));
+
+ data->numrows = nrows;
+ new_data->numrows = (oldnrows - nrows);
+
+ /*
+ * The first nrows should go to the first bucket, the rest should
+ * go to the new one. Use the tupno field to get the actual HeapTuple
+ * row from the original array of sample rows.
+ */
+ for (i = 0; i < nrows; i++)
+ memcpy(&data->rows[i], &oldrows[values[i].tupno], sizeof(HeapTuple));
+
+ for (i = nrows; i < oldnrows; i++)
+ memcpy(&new_data->rows[i-nrows], &oldrows[values[i].tupno], sizeof(HeapTuple));
+
+ /* update ndistinct values for the buckets (total and per dimension) */
+ update_bucket_ndistinct(bucket, attrs, stats);
+ update_bucket_ndistinct(new_bucket, attrs, stats);
+
+ /*
+ * TODO We don't need to do this for the dimension we used for split,
+ * because we know how many distinct values went to each partition.
+ */
+ for (i = 0; i < numattrs; i++)
+ {
+ update_dimension_ndistinct(bucket, i, attrs, stats, false);
+ update_dimension_ndistinct(new_bucket, i, attrs, stats, false);
+ }
+
+ pfree(oldrows);
+ pfree(values);
+
+ return new_bucket;
+}
+
+/*
+ * Copy a histogram bucket. The copy does not include the build-time
+ * data, i.e. sampled rows etc.
+ */
+static MVBucket
+copy_mv_bucket(MVBucket bucket, uint32 ndimensions)
+{
+ /* TODO allocate as a single piece (including all the fields) */
+ MVBucket new_bucket = (MVBucket)palloc0(sizeof(MVBucketData));
+ HistogramBuild data = (HistogramBuild)palloc0(sizeof(HistogramBuildData));
+
+ /*
+ * Copy only the attributes that will stay the same after the split;
+ * the rest will be recomputed afterwards.
+ */
+
+ /* allocate the per-dimension arrays */
+ new_bucket->nullsonly = (bool*)palloc0(ndimensions * sizeof(bool));
+
+ /* inclusiveness boundaries - lower/upper bounds */
+ new_bucket->min_inclusive = (bool*)palloc0(ndimensions * sizeof(bool));
+ new_bucket->max_inclusive = (bool*)palloc0(ndimensions * sizeof(bool));
+
+ /* lower/upper boundaries */
+ new_bucket->min = (Datum*)palloc0(ndimensions * sizeof(Datum));
+ new_bucket->max = (Datum*)palloc0(ndimensions * sizeof(Datum));
+
+ /* copy data */
+ memcpy(new_bucket->nullsonly, bucket->nullsonly, ndimensions * sizeof(bool));
+
+ memcpy(new_bucket->min_inclusive, bucket->min_inclusive, ndimensions*sizeof(bool));
+ memcpy(new_bucket->min, bucket->min, ndimensions*sizeof(Datum));
+
+ memcpy(new_bucket->max_inclusive, bucket->max_inclusive, ndimensions*sizeof(bool));
+ memcpy(new_bucket->max, bucket->max, ndimensions*sizeof(Datum));
+
+ /* allocate and copy the interesting part of the build data */
+ data->ndistincts = (uint32*)palloc0(ndimensions * sizeof(uint32));
+
+ new_bucket->build_data = data;
+
+ return new_bucket;
+}
+
+/*
+ * Counts the number of distinct value combinations in the bucket. The
+ * values are collected into an array of SortItems, sorted using the
+ * multi-column sort support (one comparator per dimension), and then
+ * compared pairwise to count the distinct combinations.
+ *
+ * TODO This might evaluate and store the distinct counts for all
+ * possible attribute combinations. The assumption is this might be
+ * useful for estimating things like GROUP BY cardinalities (e.g.
+ * in cases when some buckets contain a lot of low-frequency
+ * combinations, and other buckets contain few high-frequency ones).
+ *
+ * But it's unclear whether it's worth the price. Computing this
+ * is actually quite cheap, because it may be evaluated at the very
+ * end, when the buckets are rather small (so sorting it in 2^N ways
+ * is not a big deal). Assuming the partitioning algorithm does not
+ * use these values to do the decisions, of course (the current
+ * algorithm does not).
+ *
+ * The overhead with storing, fetching and parsing the data is more
+ * concerning - adding 2^N values per bucket (even if it's just
+ * a 1B or 2B value) would significantly bloat the histogram, and
+ * thus the impact on optimizer. Which is not really desirable.
+ *
+ * TODO This only updates the ndistinct for the sample (or bucket), but
+ * we eventually need an estimate of the total number of distinct
+ * values in the dataset. It's possible to either use the current
+ * 1D approach (i.e., if it's more than 10% of the sample, assume
+ * it's proportional to the number of rows). Or it's possible to
+ * implement the estimator suggested in the article, supposedly
+ * giving 'optimal' estimates (w.r.t. probability of error).
+ */
+static void
+update_bucket_ndistinct(MVBucket bucket, int2vector *attrs, VacAttrStats ** stats)
+{
+ int i, j;
+ int numattrs = attrs->dim1;
+
+ HistogramBuild data = (HistogramBuild)bucket->build_data;
+ int numrows = data->numrows;
+
+ MultiSortSupport mss = multi_sort_init(numattrs);
+
+ /*
+ * We could collect this while the caller walks through all the
+ * attributes (as it is, heap_getattr ends up being called twice).
+ */
+ SortItem *items = (SortItem*)palloc0(numrows * sizeof(SortItem));
+ Datum *values = (Datum*)palloc0(numrows * sizeof(Datum) * numattrs);
+ bool *isnull = (bool*)palloc0(numrows * sizeof(bool) * numattrs);
+
+ for (i = 0; i < numrows; i++)
+ {
+ items[i].values = &values[i * numattrs];
+ items[i].isnull = &isnull[i * numattrs];
+ }
+
+ /* prepare the sort functions for all the dimensions */
+ for (i = 0; i < numattrs; i++)
+ multi_sort_add_dimension(mss, i, i, stats);
+
+ /* collect the values */
+ for (i = 0; i < numrows; i++)
+ for (j = 0; j < numattrs; j++)
+ items[i].values[j]
+ = heap_getattr(data->rows[i], attrs->values[j],
+ stats[j]->tupDesc, &items[i].isnull[j]);
+
+ qsort_arg((void *) items, numrows, sizeof(SortItem),
+ multi_sort_compare, mss);
+
+ data->ndistinct = 1;
+
+ for (i = 1; i < numrows; i++)
+ if (multi_sort_compare(&items[i], &items[i-1], mss) != 0)
+ data->ndistinct += 1;
+
+ pfree(items);
+ pfree(values);
+ pfree(isnull);
+}
+
+/*
+ * Count distinct values per bucket dimension.
+ */
+static void
+update_dimension_ndistinct(MVBucket bucket, int dimension, int2vector *attrs,
+ VacAttrStats ** stats, bool update_boundaries)
+{
+ int j;
+ int nvalues = 0;
+ bool isNull;
+ HistogramBuild data = (HistogramBuild)bucket->build_data;
+ Datum * values = (Datum*)palloc0(data->numrows * sizeof(Datum));
+ SortSupportData ssup;
+
+ StdAnalyzeData * mystats = (StdAnalyzeData *) stats[dimension]->extra_data;
+
+ /* we may already know this is a NULL-only dimension */
+ if (bucket->nullsonly[dimension])
+ data->ndistincts[dimension] = 1;
+
+ memset(&ssup, 0, sizeof(ssup));
+ ssup.ssup_cxt = CurrentMemoryContext;
+
+ /* We always use the default collation for statistics */
+ ssup.ssup_collation = DEFAULT_COLLATION_OID;
+ ssup.ssup_nulls_first = false;
+
+ PrepareSortSupportFromOrderingOp(mystats->ltopr, &ssup);
+
+ for (j = 0; j < data->numrows; j++)
+ {
+ values[nvalues] = heap_getattr(data->rows[j], attrs->values[dimension],
+ stats[dimension]->tupDesc, &isNull);
+
+ /* ignore NULL values */
+ if (! isNull)
+ nvalues++;
+ }
+
+ /* there's always at least 1 distinct value (may be NULL) */
+ data->ndistincts[dimension] = 1;
+
+ /*
+ * If there are only NULL values in the column, mark the dimension
+ * as NULL-only and bail out.
+ */
+ if (nvalues == 0)
+ {
+ pfree(values);
+ bucket->nullsonly[dimension] = true;
+ return;
+ }
+
+ /* sort the array of values (pass-by-value datums) */
+ qsort_arg((void *) values, nvalues, sizeof(Datum),
+ compare_scalars_simple, (void *) &ssup);
+
+ /*
+ * Update min/max boundaries to the smallest bounding box. Generally, this
+ * needs to be done only when constructing the initial bucket.
+ */
+ if (update_boundaries)
+ {
+ /* store the min/max values */
+ bucket->min[dimension] = values[0];
+ bucket->min_inclusive[dimension] = true;
+
+ bucket->max[dimension] = values[nvalues-1];
+ bucket->max_inclusive[dimension] = true;
+ }
+
+ /*
+ * Walk through the array and count distinct values by comparing
+ * succeeding values.
+ *
+ * FIXME This only works for pass-by-value types (i.e. not VARCHARs
+ * etc.). Although thanks to the deduplication it might work
+ * even for those types (equal values will get the same item
+ * in the deduplicated array).
+ */
+ for (j = 1; j < nvalues; j++)
+ {
+ if (values[j] != values[j-1])
+ data->ndistincts[dimension] += 1;
+ }
+
+ pfree(values);
+}
+
+/*
+ * A properly built histogram must not contain buckets mixing NULL and
+ * non-NULL values in a single dimension. Each dimension is either
+ * marked as 'nulls only', and thus contains only NULL values, or
+ * it contains no NULL values at all.
+ *
+ * Therefore, if the sample contains NULL values in any of the columns,
+ * it's necessary to build those NULL-buckets. This is done in an
+ * iterative way using this algorithm, operating on a single bucket:
+ *
+ * (1) Check that all dimensions are well-formed (not mixing NULL
+ * and non-NULL values).
+ *
+ * (2) If all dimensions are well-formed, terminate.
+ *
+ * (3) If the dimension contains only NULL values, but is not
+ * marked as NULL-only, mark it as NULL-only and run the
+ * algorithm again (on this bucket).
+ *
+ * (4) If the dimension mixes NULL and non-NULL values, split the
+ * bucket into two parts - one with NULL values, one with
+ * non-NULL values (replacing the current one). Then run
+ * the algorithm on both buckets.
+ *
+ * This is executed in a recursive manner, but the number of executions
+ * should be quite low - limited by the number of NULL-buckets. Also,
+ * in each branch the number of nested calls is limited by the number
+ * of dimensions (attributes) of the histogram.
+ *
+ * At the end, there should be buckets with no mixed dimensions. The
+ * number of buckets produced by this algorithm is rather limited - with
+ * N dimensions, there may be only 2^N such buckets (each dimension may
+ * be either NULL or non-NULL). So with 8 dimensions (current value of
+ * MVSTATS_MAX_DIMENSIONS) there may be only 256 such buckets.
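+ *
+ * For illustration (a hypothetical two-column case): a bucket mixing
+ * NULL and non-NULL values in both dimensions is gradually split
+ * into up to 2^2 = 4 buckets - (NULL, NULL), (NULL, non-NULL),
+ * (non-NULL, NULL) and (non-NULL, non-NULL).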
+ *
+ * After this, a 'regular' bucket-split algorithm shall run, further
+ * optimizing the histogram.
+ */
+static void
+create_null_buckets(MVHistogram histogram, int bucket_idx,
+ int2vector *attrs, VacAttrStats ** stats)
+{
+ int i, j;
+ int null_dim = -1;
+ int null_count = 0;
+ bool null_found = false;
+ MVBucket bucket, null_bucket;
+ int null_idx, curr_idx;
+ HistogramBuild data, null_data;
+
+ /* remember original values from the bucket */
+ int numrows;
+ HeapTuple *oldrows = NULL;
+
+ Assert(bucket_idx < histogram->nbuckets);
+ Assert(histogram->ndimensions == attrs->dim1);
+
+ bucket = histogram->buckets[bucket_idx];
+ data = (HistogramBuild)bucket->build_data;
+
+ numrows = data->numrows;
+ oldrows = data->rows;
+
+ /*
+ * Walk through all rows / dimensions, and stop once we find NULL
+ * in a dimension not yet marked as NULL-only.
+ */
+ for (i = 0; i < data->numrows; i++)
+ {
+ /*
+ * FIXME We don't need to start from the first attribute
+ * here - we can start from the last known dimension.
+ */
+ for (j = 0; j < histogram->ndimensions; j++)
+ {
+ /* Is this a NULL-only dimension? If yes, skip. */
+ if (bucket->nullsonly[j])
+ continue;
+
+ /* found a NULL in that dimension? */
+ if (heap_attisnull(data->rows[i], attrs->values[j]))
+ {
+ null_found = true;
+ null_dim = j;
+ break;
+ }
+ }
+
+ /* terminate if we found attribute with NULL values */
+ if (null_found)
+ break;
+ }
+
+ /* no regular dimension contains NULL values => we're done */
+ if (! null_found)
+ return;
+
+ /* walk through the rows again, count NULL values in 'null_dim' */
+ for (i = 0; i < data->numrows; i++)
+ {
+ if (heap_attisnull(data->rows[i], attrs->values[null_dim]))
+ null_count += 1;
+ }
+
+ Assert(null_count <= data->numrows);
+
+ /*
+ * If (null_count == numrows) the dimension already is NULL-only,
+ * but is not yet marked as such. It's enough to mark it and
+ * repeat the process recursively (until we run out of dimensions).
+ */
+ if (null_count == data->numrows)
+ {
+ bucket->nullsonly[null_dim] = true;
+ create_null_buckets(histogram, bucket_idx, attrs, stats);
+ return;
+ }
+
+ /*
+ * We have to split the bucket into two - one with NULL values in
+ * the dimension, one with non-NULL values. We don't need to sort
+ * the data or anything, but otherwise it's similar to what's done
+ * in partition_bucket().
+ */
+
+ /* create bucket with NULL-only dimension 'dim' */
+ null_bucket = copy_mv_bucket(bucket, histogram->ndimensions);
+ null_data = (HistogramBuild)null_bucket->build_data;
+
+ /* remember the current array info */
+ oldrows = data->rows;
+ numrows = data->numrows;
+
+ /* we'll keep non-NULL values in the current bucket */
+ data->numrows = (numrows - null_count);
+ data->rows
+ = (HeapTuple*)palloc0(data->numrows * sizeof(HeapTuple));
+
+ /* and the NULL values will go to the new one */
+ null_data->numrows = null_count;
+ null_data->rows
+ = (HeapTuple*)palloc0(null_data->numrows * sizeof(HeapTuple));
+
+ /* mark the dimension as NULL-only (in the new bucket) */
+ null_bucket->nullsonly[null_dim] = true;
+
+ /* walk through the sample rows and distribute them accordingly */
+ null_idx = 0;
+ curr_idx = 0;
+ for (i = 0; i < numrows; i++)
+ {
+ if (heap_attisnull(oldrows[i], attrs->values[null_dim]))
+ /* NULL => copy to the new bucket */
+ memcpy(&null_data->rows[null_idx++], &oldrows[i],
+ sizeof(HeapTuple));
+ else
+ memcpy(&data->rows[curr_idx++], &oldrows[i],
+ sizeof(HeapTuple));
+ }
+
+ /* update ndistinct values for the buckets (total and per dimension) */
+ update_bucket_ndistinct(bucket, attrs, stats);
+ update_bucket_ndistinct(null_bucket, attrs, stats);
+
+ /*
+ * TODO We don't need to do this for the dimension we used for split,
+ * because we know how many distinct values went to each
+ * bucket (NULL is not a value, so 0, and the other bucket got
+ * all the ndistinct values).
+ */
+ for (i = 0; i < histogram->ndimensions; i++)
+ {
+ update_dimension_ndistinct(bucket, i, attrs, stats, false);
+ update_dimension_ndistinct(null_bucket, i, attrs, stats, false);
+ }
+
+ pfree(oldrows);
+
+ /* add the NULL bucket to the histogram */
+ histogram->buckets[histogram->nbuckets++] = null_bucket;
+
+ /*
+ * And now run the function recursively on both buckets (the new
+ * one first, because the call may change number of buckets, and
+ * it's used as an index).
+ */
+ create_null_buckets(histogram, (histogram->nbuckets-1), attrs, stats);
+ create_null_buckets(histogram, bucket_idx, attrs, stats);
+}
+
+/*
+ * We need to pass the SortSupport to the comparator, but bsearch()
+ * has no 'context' parameter, so we use a global variable (ugly).
+ */
+static int
+bsearch_comparator(const void * a, const void * b)
+{
+ Assert(ssup_private != NULL);
+ return compare_scalars_simple(a, b, (void*)ssup_private);
+}
+
+/*
+ * SRF with details about buckets of a histogram:
+ *
+ * - bucket ID (0...nbuckets)
+ * - min values (string array)
+ * - max values (string array)
+ * - nulls only (boolean array)
+ * - min inclusive flags (boolean array)
+ * - max inclusive flags (boolean array)
+ * - frequency (double precision)
+ *
+ * The input is the OID of the statistics, and there are no rows
+ * returned if the statistics contains no histogram (or if there's no
+ * statistics for the OID).
+ *
+ * The second parameter (type) determines what values will be returned
+ * in the (minvals,maxvals). There are three possible values:
+ *
+ * 0 (actual values)
+ * -----------------
+ * - prints actual values
+ * - using the output function of the data type (as string)
+ * - handy for investigating the histogram
+ *
+ * 1 (distinct index)
+ * ------------------
+ * - prints index of the distinct value (into the serialized array)
+ * - makes it easier to spot neighbor buckets, etc.
+ * - handy for plotting the histogram
+ *
+ * 2 (normalized distinct index)
+ * -----------------------------
+ * - prints index of the distinct value, but normalized into [0,1]
+ * - similar to 1, but shows how 'long' the bucket range is
+ * - handy for plotting the histogram
+ *
+ * When plotting the histogram, be careful as the (1) and (2) options
+ * skew the lengths by distributing the distinct values uniformly. For
+ * data types without a clear meaning of 'distance' (e.g. strings) that
+ * is not a big deal, but for numbers it may be confusing.
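+ *
+ * A usage sketch (the statistics OID comes from pg_mv_statistic):
+ *
+ * SELECT * FROM pg_mv_histogram_buckets(
+ * (SELECT oid FROM pg_mv_statistic LIMIT 1), 0);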
+ */
+PG_FUNCTION_INFO_V1(pg_mv_histogram_buckets);
+
+Datum
+pg_mv_histogram_buckets(PG_FUNCTION_ARGS)
+{
+ FuncCallContext *funcctx;
+ int call_cntr;
+ int max_calls;
+ TupleDesc tupdesc;
+ AttInMetadata *attinmeta;
+
+ Oid mvoid = PG_GETARG_OID(0);
+ int otype = PG_GETARG_INT32(1);
+
+ if ((otype < 0) || (otype > 2))
+ elog(ERROR, "invalid output type specified");
+
+ /* stuff done only on the first call of the function */
+ if (SRF_IS_FIRSTCALL())
+ {
+ MemoryContext oldcontext;
+ MVSerializedHistogram histogram;
+
+ /* create a function context for cross-call persistence */
+ funcctx = SRF_FIRSTCALL_INIT();
+
+ /* switch to memory context appropriate for multiple function calls */
+ oldcontext = MemoryContextSwitchTo(funcctx->multi_call_memory_ctx);
+
+ histogram = load_mv_histogram(mvoid);
+
+ funcctx->user_fctx = histogram;
+
+ /* total number of tuples to be returned */
+ funcctx->max_calls = 0;
+ if (funcctx->user_fctx != NULL)
+ funcctx->max_calls = histogram->nbuckets;
+
+ /* Build a tuple descriptor for our result type */
+ if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("function returning record called in context "
+ "that cannot accept type record")));
+
+ /*
+ * generate attribute metadata needed later to produce tuples
+ * from raw C strings
+ */
+ attinmeta = TupleDescGetAttInMetadata(tupdesc);
+ funcctx->attinmeta = attinmeta;
+
+ MemoryContextSwitchTo(oldcontext);
+ }
+
+ /* stuff done on every call of the function */
+ funcctx = SRF_PERCALL_SETUP();
+
+ call_cntr = funcctx->call_cntr;
+ max_calls = funcctx->max_calls;
+ attinmeta = funcctx->attinmeta;
+
+ if (call_cntr < max_calls) /* do when there is more left to send */
+ {
+ char **values;
+ HeapTuple tuple;
+ Datum result;
+ int2vector *stakeys;
+ Oid relid;
+ double bucket_size = 1.0;
+
+ char *buff = palloc0(1024);
+ char *format;
+
+ int i;
+
+ Oid *outfuncs;
+ FmgrInfo *fmgrinfo;
+
+ MVSerializedHistogram histogram;
+ MVSerializedBucket bucket;
+
+ histogram = (MVSerializedHistogram)funcctx->user_fctx;
+
+ Assert(call_cntr < histogram->nbuckets);
+
+ bucket = histogram->buckets[call_cntr];
+
+ stakeys = find_mv_attnums(mvoid, &relid);
+
+ /*
+ * Prepare a values array for building the returned tuple.
+ * This should be an array of C strings which will
+ * be processed later by the type input functions.
+ */
+ values = (char **) palloc(9 * sizeof(char *));
+
+ values[0] = (char *) palloc(64 * sizeof(char));
+
+ /* arrays */
+ values[1] = (char *) palloc0(1024 * sizeof(char));
+ values[2] = (char *) palloc0(1024 * sizeof(char));
+ values[3] = (char *) palloc0(1024 * sizeof(char));
+ values[4] = (char *) palloc0(1024 * sizeof(char));
+ values[5] = (char *) palloc0(1024 * sizeof(char));
+
+ values[6] = (char *) palloc(64 * sizeof(char));
+ values[7] = (char *) palloc(64 * sizeof(char));
+ values[8] = (char *) palloc(64 * sizeof(char));
+
+ /* we need to do this only when printing the actual values */
+ outfuncs = (Oid*)palloc0(sizeof(Oid) * histogram->ndimensions);
+ fmgrinfo = (FmgrInfo*)palloc0(sizeof(FmgrInfo) * histogram->ndimensions);
+
+ for (i = 0; i < histogram->ndimensions; i++)
+ {
+ bool isvarlena;
+
+ getTypeOutputInfo(get_atttype(relid, stakeys->values[i]),
+ &outfuncs[i], &isvarlena);
+
+ fmgr_info(outfuncs[i], &fmgrinfo[i]);
+ }
+
+ snprintf(values[0], 64, "%d", call_cntr); /* bucket ID */
+
+ /*
+ * Print the min/max boundaries in the requested format - either the
+ * actual values (using the output function of the attribute type),
+ * the indexes into the deduplicated arrays, or the indexes
+ * normalized to [0,1]. The deduplicated values are sorted, so the
+ * bare indexes are quite useful on their own.
+ */
+
+ for (i = 0; i < histogram->ndimensions; i++)
+ {
+ bucket_size *= (bucket->max[i] - bucket->min[i]) * 1.0
+ / (histogram->nvalues[i]-1);
+
+ /* print the actual values, i.e. use output function etc. */
+ if (otype == 0)
+ {
+ Datum minval, maxval;
+ Datum minout, maxout;
+
+ format = "%s, %s";
+ if (i == 0)
+ format = "{%s%s";
+ else if (i == histogram->ndimensions-1)
+ format = "%s, %s}";
+
+ minval = histogram->values[i][bucket->min[i]];
+ minout = FunctionCall1(&fmgrinfo[i], minval);
+
+ maxval = histogram->values[i][bucket->max[i]];
+ maxout = FunctionCall1(&fmgrinfo[i], maxval);
+
+ snprintf(buff, 1024, format, values[1], DatumGetPointer(minout));
+ strncpy(values[1], buff, 1023);
+ buff[0] = '\0';
+
+ snprintf(buff, 1024, format, values[2], DatumGetPointer(maxout));
+ strncpy(values[2], buff, 1023);
+ buff[0] = '\0';
+ }
+ else if (otype == 1)
+ {
+ format = "%s, %d";
+ if (i == 0)
+ format = "{%s%d";
+ else if (i == histogram->ndimensions-1)
+ format = "%s, %d}";
+
+ snprintf(buff, 1024, format, values[1], bucket->min[i]);
+ strncpy(values[1], buff, 1023);
+ buff[0] = '\0';
+
+ snprintf(buff, 1024, format, values[2], bucket->max[i]);
+ strncpy(values[2], buff, 1023);
+ buff[0] = '\0';
+ }
+ else
+ {
+ format = "%s, %f";
+ if (i == 0)
+ format = "{%s%f";
+ else if (i == histogram->ndimensions-1)
+ format = "%s, %f}";
+
+ snprintf(buff, 1024, format, values[1],
+ bucket->min[i] * 1.0 / (histogram->nvalues[i]-1));
+ strncpy(values[1], buff, 1023);
+ buff[0] = '\0';
+
+ snprintf(buff, 1024, format, values[2],
+ bucket->max[i] * 1.0 / (histogram->nvalues[i]-1));
+ strncpy(values[2], buff, 1023);
+ buff[0] = '\0';
+ }
+
+ format = "%s, %s";
+ if (i == 0)
+ format = "{%s%s";
+ else if (i == histogram->ndimensions-1)
+ format = "%s, %s}";
+
+ snprintf(buff, 1024, format, values[3], bucket->nullsonly[i] ? "t" : "f");
+ strncpy(values[3], buff, 1023);
+ buff[0] = '\0';
+
+ snprintf(buff, 1024, format, values[4], bucket->min_inclusive[i] ? "t" : "f");
+ strncpy(values[4], buff, 1023);
+ buff[0] = '\0';
+
+ snprintf(buff, 1024, format, values[5], bucket->max_inclusive[i] ? "t" : "f");
+ strncpy(values[5], buff, 1023);
+ buff[0] = '\0';
+ }
+
+ snprintf(values[6], 64, "%f", bucket->ntuples); /* frequency */
+ snprintf(values[7], 64, "%f", bucket->ntuples / bucket_size); /* density */
+ snprintf(values[8], 64, "%f", bucket_size); /* bucket_size */
+
+ /* build a tuple */
+ tuple = BuildTupleFromCStrings(attinmeta, values);
+
+ /* make the tuple into a datum */
+ result = HeapTupleGetDatum(tuple);
+
+ /* clean up (not strictly necessary, but tidy) */
+ pfree(values[0]);
+ pfree(values[1]);
+ pfree(values[2]);
+ pfree(values[3]);
+ pfree(values[4]);
+ pfree(values[5]);
+ pfree(values[6]);
+ pfree(values[7]);
+ pfree(values[8]);
+
+ pfree(values);
+
+ SRF_RETURN_NEXT(funcctx, result);
+ }
+ else /* do when there is no more left */
+ {
+ SRF_RETURN_DONE(funcctx);
+ }
+}
+
+#ifdef DEBUG_MVHIST
+/*
+ * prints debugging info about matched histogram buckets (full/partial)
+ *
+ * XXX Currently works only for INT data type.
+ */
+void
+debug_histogram_matches(MVSerializedHistogram mvhist, char *matches)
+{
+ int i, j;
+
+ float ffull = 0, fpartial = 0;
+ int nfull = 0, npartial = 0;
+
+ for (i = 0; i < mvhist->nbuckets; i++)
+ {
+ MVSerializedBucket bucket = mvhist->buckets[i];
+
+ char ranges[1024];
+
+ if (! matches[i])
+ continue;
+
+ /* increment the counters */
+ nfull += (matches[i] == MVSTATS_MATCH_FULL) ? 1 : 0;
+ npartial += (matches[i] == MVSTATS_MATCH_PARTIAL) ? 1 : 0;
+
+ /* and also update the frequencies */
+ ffull += (matches[i] == MVSTATS_MATCH_FULL) ? bucket->ntuples : 0;
+ fpartial += (matches[i] == MVSTATS_MATCH_PARTIAL) ? bucket->ntuples : 0;
+
+ memset(ranges, 0, sizeof(ranges));
+
+ /* build ranges for all the dimensions (append to the buffer) */
+ for (j = 0; j < mvhist->ndimensions; j++)
+ {
+ int len = strlen(ranges);
+
+ snprintf(ranges + len, sizeof(ranges) - len, " [%d %d]",
+ DatumGetInt32(mvhist->values[j][bucket->min[j]]),
+ DatumGetInt32(mvhist->values[j][bucket->max[j]]));
+ }
+
+ elog(WARNING, "bucket %d %s => %d [%f]", i, ranges, matches[i], bucket->ntuples);
+ }
+
+ elog(WARNING, "full=%f partial=%f (%f)", ffull, fpartial, (ffull + 0.5 * fpartial));
+}
+#endif
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index cd0ed01..c630f96 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -2109,9 +2109,9 @@ describeOneTableDetails(const char *schemaname,
{
printfPQExpBuffer(&buf,
"SELECT oid, staname, stakeys,\n"
- " deps_enabled, mcv_enabled,\n"
- " deps_built, mcv_built,\n"
- " mcv_max_items,\n"
+ " deps_enabled, mcv_enabled, hist_enabled,\n"
+ " deps_built, mcv_built, hist_built,\n"
+ " mcv_max_items, hist_max_buckets,\n"
" (SELECT string_agg(attname::text,', ')\n"
" FROM ((SELECT unnest(stakeys) AS attnum) s\n"
" JOIN pg_attribute a ON (starelid = a.attrelid and a.attnum = s.attnum))) AS attnums\n"
@@ -2152,8 +2152,17 @@ describeOneTableDetails(const char *schemaname,
first = false;
}
+ if (!strcmp(PQgetvalue(result, i, 5), "t"))
+ {
+ if (! first)
+ appendPQExpBuffer(&buf, ", histogram");
+ else
+ appendPQExpBuffer(&buf, "(histogram");
+ first = false;
+ }
+
appendPQExpBuffer(&buf, ") ON (%s)",
- PQgetvalue(result, i, 9));
+ PQgetvalue(result, i, 10));
printTableAddFooter(&cont, buf.data);
}
diff --git a/src/include/catalog/pg_mv_statistic.h b/src/include/catalog/pg_mv_statistic.h
index 7be6223..df6a61c 100644
--- a/src/include/catalog/pg_mv_statistic.h
+++ b/src/include/catalog/pg_mv_statistic.h
@@ -37,13 +37,16 @@ CATALOG(pg_mv_statistic,3381)
/* statistics requested to build */
bool deps_enabled; /* analyze dependencies? */
bool mcv_enabled; /* build MCV list? */
+ bool hist_enabled; /* build histogram? */
- /* MCV size */
+ /* histogram / MCV size */
int32 mcv_max_items; /* max MCV items */
+ int32 hist_max_buckets; /* max histogram buckets */
/* statistics that are available (if requested) */
bool deps_built; /* dependencies were built */
bool mcv_built; /* MCV list was built */
+ bool hist_built; /* histogram was built */
/* variable-length fields start here, but we allow direct access to stakeys */
int2vector stakeys; /* array of column keys */
@@ -51,6 +54,7 @@ CATALOG(pg_mv_statistic,3381)
#ifdef CATALOG_VARLEN
bytea stadeps; /* dependencies (serialized) */
bytea stamcv; /* MCV list (serialized) */
+ bytea stahist; /* MV histogram (serialized) */
#endif
} FormData_pg_mv_statistic;
@@ -66,17 +70,20 @@ typedef FormData_pg_mv_statistic *Form_pg_mv_statistic;
* compiler constants for pg_attrdef
* ----------------
*/
-
-#define Natts_pg_mv_statistic 10
+#define Natts_pg_mv_statistic 14
#define Anum_pg_mv_statistic_starelid 1
#define Anum_pg_mv_statistic_staname 2
#define Anum_pg_mv_statistic_deps_enabled 3
#define Anum_pg_mv_statistic_mcv_enabled 4
-#define Anum_pg_mv_statistic_mcv_max_items 5
-#define Anum_pg_mv_statistic_deps_built 6
-#define Anum_pg_mv_statistic_mcv_built 7
-#define Anum_pg_mv_statistic_stakeys 8
-#define Anum_pg_mv_statistic_stadeps 9
-#define Anum_pg_mv_statistic_stamcv 10
+#define Anum_pg_mv_statistic_hist_enabled 5
+#define Anum_pg_mv_statistic_mcv_max_items 6
+#define Anum_pg_mv_statistic_hist_max_buckets 7
+#define Anum_pg_mv_statistic_deps_built 8
+#define Anum_pg_mv_statistic_mcv_built 9
+#define Anum_pg_mv_statistic_hist_built 10
+#define Anum_pg_mv_statistic_stakeys 11
+#define Anum_pg_mv_statistic_stadeps 12
+#define Anum_pg_mv_statistic_stamcv 13
+#define Anum_pg_mv_statistic_stahist 14
#endif /* PG_MV_STATISTIC_H */
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
index b16f2a9..9d20db5 100644
--- a/src/include/catalog/pg_proc.h
+++ b/src/include/catalog/pg_proc.h
@@ -2747,6 +2747,10 @@ DATA(insert OID = 3376 ( pg_mv_stats_mcvlist_info PGNSP PGUID 12 1 0 0 0 f f f
DESCR("multi-variate statistics: MCV list info");
DATA(insert OID = 3373 ( pg_mv_mcv_items PGNSP PGUID 12 1 1000 0 0 f f f f t t i s 1 0 2249 "26" "{26,23,1009,1000,701}" "{i,o,o,o,o}" "{oid,index,values,nulls,frequency}" _null_ _null_ pg_mv_mcv_items _null_ _null_ _null_ ));
DESCR("details about MCV list items");
+DATA(insert OID = 3375 ( pg_mv_stats_histogram_info PGNSP PGUID 12 1 0 0 0 f f f f t f i s 1 0 25 "17" _null_ _null_ _null_ _null_ _null_ pg_mv_stats_histogram_info _null_ _null_ _null_ ));
+DESCR("multi-variate statistics: histogram info");
+DATA(insert OID = 3374 ( pg_mv_histogram_buckets PGNSP PGUID 12 1 1000 0 0 f f f f t t i s 2 0 2249 "26 23" "{26,23,23,1009,1009,1000,1000,1000,701,701,701}" "{i,i,o,o,o,o,o,o,o,o,o}" "{oid,otype,index,minvals,maxvals,nullsonly,mininclusive,maxinclusive,frequency,density,bucket_size}" _null_ _null_ pg_mv_histogram_buckets _null_ _null_ _null_ ));
+DESCR("details about histogram buckets");
DATA(insert OID = 1928 ( pg_stat_get_numscans PGNSP PGUID 12 1 0 0 0 f f f f t f s r 1 0 20 "26" _null_ _null_ _null_ _null_ _null_ pg_stat_get_numscans _null_ _null_ _null_ ));
DESCR("statistics: number of scans done for table/index");
diff --git a/src/include/nodes/relation.h b/src/include/nodes/relation.h
index 7f2dc8a..3706525 100644
--- a/src/include/nodes/relation.h
+++ b/src/include/nodes/relation.h
@@ -593,10 +593,12 @@ typedef struct MVStatisticInfo
/* enabled statistics */
bool deps_enabled; /* functional dependencies enabled */
bool mcv_enabled; /* MCV list enabled */
+ bool hist_enabled; /* histogram enabled */
/* built/available statistics */
bool deps_built; /* functional dependencies built */
bool mcv_built; /* MCV list built */
+ bool hist_built; /* histogram built */
/* columns in the statistics (attnums) */
int2vector *stakeys; /* attnums of the columns covered */
diff --git a/src/include/utils/mvstats.h b/src/include/utils/mvstats.h
index b028192..aa07000 100644
--- a/src/include/utils/mvstats.h
+++ b/src/include/utils/mvstats.h
@@ -91,6 +91,123 @@ typedef MCVListData *MCVList;
#define MVSTAT_MCVLIST_MAX_ITEMS 8192 /* max items in MCV list */
/*
+ * Multivariate histograms
+ */
+typedef struct MVBucketData {
+
+ /* Frequency of this bucket. */
+ float ntuples; /* frequency of tuples */
+
+ /* Information about dimensions containing only NULL values. */
+ bool *nullsonly;
+
+ /* lower boundaries - values and information about the inequalities */
+ Datum *min;
+ bool *min_inclusive;
+
+ /* upper boundaries - values and information about the inequalities */
+ Datum *max;
+ bool *max_inclusive;
+
+ /* used when building the histogram (not serialized/deserialized) */
+ void *build_data;
+
+} MVBucketData;
+
+typedef MVBucketData *MVBucket;
+
+
+typedef struct MVHistogramData {
+
+ uint32 magic; /* magic constant marker */
+ uint32 type; /* type of histogram (BASIC) */
+ uint32 nbuckets; /* number of buckets (buckets array) */
+ uint32 ndimensions; /* number of dimensions */
+
+ MVBucket *buckets; /* array of buckets */
+
+} MVHistogramData;
+
+typedef MVHistogramData *MVHistogram;
+
+/*
+ * Histogram in a partially serialized form, with deduplicated boundary
+ * values etc. Each bucket stores uint16 indexes into the per-dimension
+ * arrays of deduplicated boundary values (nvalues/values in
+ * MVSerializedHistogramData), instead of the Datum values themselves.
+ */
+
+typedef struct MVSerializedBucketData {
+
+ /* Frequency of this bucket. */
+ float ntuples; /* frequency of tuples */
+
+ /* Information about dimensions containing only NULL values. */
+ bool *nullsonly;
+
+ /* indexes of lower boundaries - values and information about the
+ * inequalities (exclusive vs. inclusive) */
+ uint16 *min;
+ bool *min_inclusive;
+
+ /* indexes of upper boundaries - values and information about the
+ * inequalities (exclusive vs. inclusive) */
+ uint16 *max;
+ bool *max_inclusive;
+
+} MVSerializedBucketData;
+
+typedef MVSerializedBucketData *MVSerializedBucket;
+
+typedef struct MVSerializedHistogramData {
+
+ uint32 magic; /* magic constant marker */
+ uint32 type; /* type of histogram (BASIC) */
+ uint32 nbuckets; /* number of buckets (buckets array) */
+ uint32 ndimensions; /* number of dimensions */
+
+ /*
+ * Keep this at the same offset as in MVHistogramData, because the
+ * deserialization relies on it.
+ */
+ MVSerializedBucket *buckets; /* array of buckets */
+
+ /*
+ * serialized boundary values, one array per dimension, deduplicated
+ * (the min/max indexes point into these arrays)
+ */
+ int *nvalues;
+ Datum **values;
+
+} MVSerializedHistogramData;
+
+typedef MVSerializedHistogramData *MVSerializedHistogram;
+
+
+/* used to flag stats serialized to bytea */
+#define MVSTAT_HIST_MAGIC 0x7F8C5670 /* marks serialized bytea */
+#define MVSTAT_HIST_TYPE_BASIC 1 /* basic histogram type */
+
+/*
+ * Limits used for max_buckets option, i.e. we're always guaranteed
+ * to have space for at least MVSTAT_HIST_MIN_BUCKETS, and we cannot
+ * have more than MVSTAT_HIST_MAX_BUCKETS buckets.
+ *
+ * This is just a boundary for the 'max' threshold - the actual
+ * histogram may use fewer buckets than MVSTAT_HIST_MAX_BUCKETS.
+ *
+ * TODO The MVSTAT_HIST_MIN_BUCKETS should be related to the number of
+ * attributes (MVSTATS_MAX_DIMENSIONS) because of NULL-buckets.
+ * There should be at least 2^N buckets, otherwise we may be unable
+ * to build the NULL buckets.
+ */
+#define MVSTAT_HIST_MIN_BUCKETS 128 /* min number of buckets */
+#define MVSTAT_HIST_MAX_BUCKETS 16384 /* max number of buckets */
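+
+/*
+ * A usage sketch, based on the option syntax exercised by the
+ * regression tests (table and statistics names are hypothetical):
+ *
+ * CREATE STATISTICS s ON t (a, b) WITH (histogram, max_buckets 1024);
+ */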
+
+/*
* TODO Maybe fetching the histogram/MCV list separately is inefficient?
* Consider adding a single `fetch_stats` method, fetching all
* stats specified using flags (or something like that).
@@ -98,20 +215,25 @@ typedef MCVListData *MCVList;
MVDependencies load_mv_dependencies(Oid mvoid);
MCVList load_mv_mcvlist(Oid mvoid);
+MVSerializedHistogram load_mv_histogram(Oid mvoid);
bytea * serialize_mv_dependencies(MVDependencies dependencies);
bytea * serialize_mv_mcvlist(MCVList mcvlist, int2vector *attrs,
VacAttrStats **stats);
+bytea * serialize_mv_histogram(MVHistogram histogram, int2vector *attrs,
+ VacAttrStats **stats);
/* deserialization of stats (serialization is private to analyze) */
MVDependencies deserialize_mv_dependencies(bytea * data);
MCVList deserialize_mv_mcvlist(bytea * data);
+MVSerializedHistogram deserialize_mv_histogram(bytea * data);
/*
* Returns index of the attribute number within the vector (i.e. a
* dimension within the stats).
*/
int mv_get_index(AttrNumber varattno, int2vector * stakeys);
int2vector* find_mv_attnums(Oid mvoid, Oid *relid);
@@ -120,6 +242,8 @@ extern Datum pg_mv_stats_dependencies_info(PG_FUNCTION_ARGS);
extern Datum pg_mv_stats_dependencies_show(PG_FUNCTION_ARGS);
extern Datum pg_mv_stats_mcvlist_info(PG_FUNCTION_ARGS);
extern Datum pg_mv_mcvlist_items(PG_FUNCTION_ARGS);
+extern Datum pg_mv_stats_histogram_info(PG_FUNCTION_ARGS);
+extern Datum pg_mv_histogram_buckets(PG_FUNCTION_ARGS);
MVDependencies
build_mv_dependencies(int numrows, HeapTuple *rows, int2vector *attrs,
@@ -129,10 +253,20 @@ MCVList
build_mv_mcvlist(int numrows, HeapTuple *rows, int2vector *attrs,
VacAttrStats **stats, int *numrows_filtered);
+MVHistogram
+build_mv_histogram(int numrows, HeapTuple *rows, int2vector *attrs,
+ VacAttrStats **stats, int numrows_total);
+
void build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
int natts, VacAttrStats **vacattrstats);
-void update_mv_stats(Oid relid, MVDependencies dependencies, MCVList mcvlist,
+void update_mv_stats(Oid relid, MVDependencies dependencies,
+ MCVList mcvlist, MVHistogram histogram,
int2vector *attrs, VacAttrStats **stats);
+#ifdef DEBUG_MVHIST
+extern void debug_histogram_matches(MVSerializedHistogram mvhist, char *matches);
+#endif
+
#endif
diff --git a/src/test/regress/expected/mv_histogram.out b/src/test/regress/expected/mv_histogram.out
new file mode 100644
index 0000000..c3c5216
--- /dev/null
+++ b/src/test/regress/expected/mv_histogram.out
@@ -0,0 +1,207 @@
+-- data type passed by value
+CREATE TABLE mv_histogram (
+ a INT,
+ b INT,
+ c INT
+);
+-- unknown column
+CREATE STATISTICS s1 ON mv_histogram (unknown_column) WITH (histogram);
+ERROR: column "unknown_column" referenced in statistics does not exist
+-- single column
+CREATE STATISTICS s1 ON mv_histogram (a) WITH (histogram);
+ERROR: multivariate stats require 2 or more columns
+-- single column, duplicated
+CREATE STATISTICS s1 ON mv_histogram (a, a) WITH (histogram);
+ERROR: duplicate column name in statistics definition
+-- two columns, one duplicated
+CREATE STATISTICS s1 ON mv_histogram (a, a, b) WITH (histogram);
+ERROR: duplicate column name in statistics definition
+-- unknown option
+CREATE STATISTICS s1 ON mv_histogram (a, b, c) WITH (unknown_option);
+ERROR: unrecognized STATISTICS option "unknown_option"
+-- missing histogram statistics
+CREATE STATISTICS s1 ON mv_histogram (a, b, c) WITH (dependencies, max_buckets 200);
+ERROR: option 'histogram' is required by other options(s)
+-- invalid max_buckets value / too low
+CREATE STATISTICS s1 ON mv_histogram (a, b, c) WITH (mcv, max_buckets 10);
+ERROR: minimum number of buckets is 128
+-- invalid max_buckets value / too high
+CREATE STATISTICS s1 ON mv_histogram (a, b, c) WITH (mcv, max_buckets 100000);
+ERROR: maximum number of buckets is 16384
+-- correct command
+CREATE STATISTICS s1 ON mv_histogram (a, b, c) WITH (histogram);
+-- random data (no functional dependencies)
+INSERT INTO mv_histogram
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+ANALYZE mv_histogram;
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+ hist_enabled | hist_built
+--------------+------------
+ t | t
+(1 row)
+
+TRUNCATE mv_histogram;
+-- a => b, a => c, b => c
+INSERT INTO mv_histogram
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE mv_histogram;
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+ hist_enabled | hist_built
+--------------+------------
+ t | t
+(1 row)
+
+TRUNCATE mv_histogram;
+-- a => b, a => c
+INSERT INTO mv_histogram
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE mv_histogram;
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+ hist_enabled | hist_built
+--------------+------------
+ t | t
+(1 row)
+
+TRUNCATE mv_histogram;
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO mv_histogram
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX hist_idx ON mv_histogram (a, b);
+ANALYZE mv_histogram;
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+ hist_enabled | hist_built
+--------------+------------
+ t | t
+(1 row)
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mv_histogram WHERE a = 10 AND b = 5;
+ QUERY PLAN
+--------------------------------------------
+ Bitmap Heap Scan on mv_histogram
+ Recheck Cond: ((a = 10) AND (b = 5))
+ -> Bitmap Index Scan on hist_idx
+ Index Cond: ((a = 10) AND (b = 5))
+(4 rows)
+
+DROP TABLE mv_histogram;
+-- varlena type (text)
+CREATE TABLE mv_histogram (
+ a TEXT,
+ b TEXT,
+ c TEXT
+);
+CREATE STATISTICS s2 ON mv_histogram (a, b, c) WITH (histogram);
+-- random data (no functional dependencies)
+INSERT INTO mv_histogram
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+ANALYZE mv_histogram;
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+ hist_enabled | hist_built
+--------------+------------
+ t | t
+(1 row)
+
+TRUNCATE mv_histogram;
+-- a => b, a => c, b => c
+INSERT INTO mv_histogram
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE mv_histogram;
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+ hist_enabled | hist_built
+--------------+------------
+ t | t
+(1 row)
+
+TRUNCATE mv_histogram;
+-- a => b, a => c
+INSERT INTO mv_histogram
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE mv_histogram;
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+ hist_enabled | hist_built
+--------------+------------
+ t | t
+(1 row)
+
+TRUNCATE mv_histogram;
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO mv_histogram
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX hist_idx ON mv_histogram (a, b);
+ANALYZE mv_histogram;
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+ hist_enabled | hist_built
+--------------+------------
+ t | t
+(1 row)
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mv_histogram WHERE a = '10' AND b = '5';
+ QUERY PLAN
+------------------------------------------------------------
+ Bitmap Heap Scan on mv_histogram
+ Recheck Cond: ((a = '10'::text) AND (b = '5'::text))
+ -> Bitmap Index Scan on hist_idx
+ Index Cond: ((a = '10'::text) AND (b = '5'::text))
+(4 rows)
+
+TRUNCATE mv_histogram;
+-- check explain (expect bitmap index scan, not plain index scan) with NULLs
+INSERT INTO mv_histogram
+ SELECT
+ (CASE WHEN i/10000 = 0 THEN NULL ELSE i/10000 END),
+ (CASE WHEN i/20000 = 0 THEN NULL ELSE i/20000 END),
+ (CASE WHEN i/40000 = 0 THEN NULL ELSE i/40000 END)
+ FROM generate_series(1,1000000) s(i);
+ANALYZE mv_histogram;
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+ hist_enabled | hist_built
+--------------+------------
+ t | t
+(1 row)
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mv_histogram WHERE a IS NULL AND b IS NULL;
+ QUERY PLAN
+---------------------------------------------------
+ Bitmap Heap Scan on mv_histogram
+ Recheck Cond: ((a IS NULL) AND (b IS NULL))
+ -> Bitmap Index Scan on hist_idx
+ Index Cond: ((a IS NULL) AND (b IS NULL))
+(4 rows)
+
+DROP TABLE mv_histogram;
+-- NULL values (mix of int and text columns)
+CREATE TABLE mv_histogram (
+ a INT,
+ b TEXT,
+ c INT,
+ d TEXT
+);
+CREATE STATISTICS s3 ON mv_histogram (a, b, c, d) WITH (histogram);
+INSERT INTO mv_histogram
+ SELECT
+ mod(i, 100),
+ (CASE WHEN mod(i, 200) = 0 THEN NULL ELSE mod(i,200) END),
+ mod(i, 400),
+ (CASE WHEN mod(i, 300) = 0 THEN NULL ELSE mod(i,600) END)
+ FROM generate_series(1,10000) s(i);
+ANALYZE mv_histogram;
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+ hist_enabled | hist_built
+--------------+------------
+ t | t
+(1 row)
+
+DROP TABLE mv_histogram;
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 50715db..b08f977 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1371,7 +1371,9 @@ pg_mv_stats| SELECT n.nspname AS schemaname,
length(s.stadeps) AS depsbytes,
pg_mv_stats_dependencies_info(s.stadeps) AS depsinfo,
length(s.stamcv) AS mcvbytes,
- pg_mv_stats_mcvlist_info(s.stamcv) AS mcvinfo
+ pg_mv_stats_mcvlist_info(s.stamcv) AS mcvinfo,
+ length(s.stahist) AS histbytes,
+ pg_mv_stats_histogram_info(s.stahist) AS histinfo
FROM ((pg_mv_statistic s
JOIN pg_class c ON ((c.oid = s.starelid)))
LEFT JOIN pg_namespace n ON ((n.oid = c.relnamespace)));
diff --git a/src/test/regress/parallel_schedule b/src/test/regress/parallel_schedule
index 838c12b..fbed683 100644
--- a/src/test/regress/parallel_schedule
+++ b/src/test/regress/parallel_schedule
@@ -112,4 +112,4 @@ test: event_trigger
test: stats
# run tests of multivariate stats
-test: mv_dependencies mv_mcv
+test: mv_dependencies mv_mcv mv_histogram
diff --git a/src/test/regress/serial_schedule b/src/test/regress/serial_schedule
index d97a0ec..c60c0b2 100644
--- a/src/test/regress/serial_schedule
+++ b/src/test/regress/serial_schedule
@@ -163,3 +163,4 @@ test: event_trigger
test: stats
test: mv_dependencies
test: mv_mcv
+test: mv_histogram
diff --git a/src/test/regress/sql/mv_histogram.sql b/src/test/regress/sql/mv_histogram.sql
new file mode 100644
index 0000000..0ac21b8
--- /dev/null
+++ b/src/test/regress/sql/mv_histogram.sql
@@ -0,0 +1,176 @@
+-- data type passed by value
+CREATE TABLE mv_histogram (
+ a INT,
+ b INT,
+ c INT
+);
+
+-- unknown column
+CREATE STATISTICS s1 ON mv_histogram (unknown_column) WITH (histogram);
+
+-- single column
+CREATE STATISTICS s1 ON mv_histogram (a) WITH (histogram);
+
+-- single column, duplicated
+CREATE STATISTICS s1 ON mv_histogram (a, a) WITH (histogram);
+
+-- two columns, one duplicated
+CREATE STATISTICS s1 ON mv_histogram (a, a, b) WITH (histogram);
+
+-- unknown option
+CREATE STATISTICS s1 ON mv_histogram (a, b, c) WITH (unknown_option);
+
+-- missing histogram statistics
+CREATE STATISTICS s1 ON mv_histogram (a, b, c) WITH (dependencies, max_buckets 200);
+
+-- invalid max_buckets value / too low
+CREATE STATISTICS s1 ON mv_histogram (a, b, c) WITH (mcv, max_buckets 10);
+
+-- invalid max_buckets value / too high
+CREATE STATISTICS s1 ON mv_histogram (a, b, c) WITH (mcv, max_buckets 100000);
+
+-- correct command
+CREATE STATISTICS s1 ON mv_histogram (a, b, c) WITH (histogram);
+
+-- random data (no functional dependencies)
+INSERT INTO mv_histogram
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+
+ANALYZE mv_histogram;
+
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+
+TRUNCATE mv_histogram;
+
+-- a => b, a => c, b => c
+INSERT INTO mv_histogram
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+
+ANALYZE mv_histogram;
+
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+
+TRUNCATE mv_histogram;
+
+-- a => b, a => c
+INSERT INTO mv_histogram
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE mv_histogram;
+
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+
+TRUNCATE mv_histogram;
+
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO mv_histogram
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX hist_idx ON mv_histogram (a, b);
+ANALYZE mv_histogram;
+
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mv_histogram WHERE a = 10 AND b = 5;
+
+DROP TABLE mv_histogram;
+
+-- varlena type (text)
+CREATE TABLE mv_histogram (
+ a TEXT,
+ b TEXT,
+ c TEXT
+);
+
+CREATE STATISTICS s2 ON mv_histogram (a, b, c) WITH (histogram);
+
+-- random data (no functional dependencies)
+INSERT INTO mv_histogram
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+
+ANALYZE mv_histogram;
+
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+
+TRUNCATE mv_histogram;
+
+-- a => b, a => c, b => c
+INSERT INTO mv_histogram
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+
+ANALYZE mv_histogram;
+
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+
+TRUNCATE mv_histogram;
+
+-- a => b, a => c
+INSERT INTO mv_histogram
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE mv_histogram;
+
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+
+TRUNCATE mv_histogram;
+
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO mv_histogram
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX hist_idx ON mv_histogram (a, b);
+ANALYZE mv_histogram;
+
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mv_histogram WHERE a = '10' AND b = '5';
+
+TRUNCATE mv_histogram;
+
+-- check explain (expect bitmap index scan, not plain index scan) with NULLs
+INSERT INTO mv_histogram
+ SELECT
+ (CASE WHEN i/10000 = 0 THEN NULL ELSE i/10000 END),
+ (CASE WHEN i/20000 = 0 THEN NULL ELSE i/20000 END),
+ (CASE WHEN i/40000 = 0 THEN NULL ELSE i/40000 END)
+ FROM generate_series(1,1000000) s(i);
+ANALYZE mv_histogram;
+
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mv_histogram WHERE a IS NULL AND b IS NULL;
+
+DROP TABLE mv_histogram;
+
+-- NULL values (mix of int and text columns)
+CREATE TABLE mv_histogram (
+ a INT,
+ b TEXT,
+ c INT,
+ d TEXT
+);
+
+CREATE STATISTICS s3 ON mv_histogram (a, b, c, d) WITH (histogram);
+
+INSERT INTO mv_histogram
+ SELECT
+ mod(i, 100),
+ (CASE WHEN mod(i, 200) = 0 THEN NULL ELSE mod(i,200) END),
+ mod(i, 400),
+ (CASE WHEN mod(i, 300) = 0 THEN NULL ELSE mod(i,600) END)
+ FROM generate_series(1,10000) s(i);
+
+ANALYZE mv_histogram;
+
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+
+DROP TABLE mv_histogram;
--
2.1.0
Attachment: 0006-multi-statistics-estimation.patch (text/x-diff)
From f1b003fb1eabc654e102718945fc785da4a7f023 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@pgaddict.com>
Date: Fri, 6 Feb 2015 01:42:38 +0100
Subject: [PATCH 6/7] multi-statistics estimation
The general idea is that a probability (which
is what selectivity is) can be split into a product of
conditional probabilities like this:
P(A & B & C) = P(A & B) * P(C|A & B)
If we assume that C and B are conditionally independent (given A),
the last part may be simplified like this
P(A & B & C) = P(A & B) * P(C|A)
so we only need probabilities on [A,B] and [C,A] to compute
the original probability.
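
A quick worked example with made-up numbers: suppose columns A and B
are perfectly correlated, so P(A) = P(B) = P(A & B) = 0.1, and
P(C|A) = P(C) = 0.5. The decomposition gives

    P(A & B & C) = P(A & B) * P(C|A) = 0.1 * 0.5 = 0.05

while the plain independence assumption would give
P(A) * P(B) * P(C) = 0.1 * 0.1 * 0.5 = 0.005, a 10x underestimate.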
The implementation works in the other direction, though.
We know what probability P(A & B & C) we need to compute,
and also what statistics are available.
So we search for a combination of statistics, covering
the clauses in an optimal way (most clauses covered, most
dependencies exploited).
There are two possible approaches - exhaustive and greedy.
The exhaustive one walks through all permutations of
stats using dynamic programming, so it's guaranteed to
find the optimal solution, but it soon gets very slow as
it's roughly O(N!). The dynamic programming may improve
that a bit, but it's still far too expensive for large
numbers of statistics (on a single table).
The greedy algorithm is very simple - in every step it chooses
the locally best statistics. That may not yield the best solution
globally (but maybe it does?), but it only needs N steps
to find the solution, so it's very fast (processing the
selected stats is usually way more expensive).
There's a GUC for selecting the search algorithm
mvstat_search = {'greedy', 'exhaustive'}
The default value is 'greedy' as that's much safer (with
respect to runtime). See choose_mv_statistics().
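
For example, to try the exhaustive search instead (a sketch;
mvstat_search is the GUC added by this patch):

    SET mvstat_search = 'exhaustive';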
Once we have found a sequence of statistics, we apply
them to the clauses using the conditional probabilities.
We process the selected stats one by one, and for each
we select the estimated clauses and conditions. See
clauselist_selectivity() for more details.
Limitations
-----------
It's still true that each clause at a given level has to
be covered by a single MV statistics. So with this query
WHERE (clause1) AND (clause2) AND (clause3 OR clause4)
each parenthesized clause has to be covered by a single
multivariate statistics.
Clauses not covered by a single statistics at this level
will be passed to clause_selectivity() but this will treat
them as a collection of simpler clauses (connected by AND
or OR), and the clauses from the previous level will be
used as conditions.
So using the same example, the last clause will be passed
to clause_selectivity() with 'clause1' and 'clause2' as
conditions, and it will be processed using multivariate
stats if possible.
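
For illustration (a sketch with hypothetical statistics on (a,b)):

    WHERE (a = 1) AND (b = 2) AND (a < 10 OR d > 0)

here the OR-clause references column d, not covered by the (a,b)
statistics, so it is passed to clause_selectivity() with (a = 1)
and (b = 2) as conditions, and its parts may still be estimated
using the statistics where they cover the attributes.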
The other limitation is that all the expressions have to
be mv-compatible, i.e. there can't be a mix of mv-compatible
and incompatible expressions. If this is violated, the clause
may be passed to the next level (just like with a list of
clauses not covered by a single statistics), which splits it
into clauses handled by multivariate stats and clauses handled
by regular statistics.
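
For instance (hypothetical), in

    WHERE (a = 1) AND (b = 2 OR some_func(c) > 0)

the function call is not mv-compatible (only "variable OP constant"
conditions are supported at this point), so the whole OR-clause is
passed to the next level and split up there.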
rework clauselist_selectivity_or to handle OR-clauses correctly
---------------------------------------------------------------
We might invent a completely new set of functions here, resembling
clauselist_selectivity but adapting the ideas to OR-clauses.
But luckily we know that each OR-clause
(a OR b OR c)
may be rewritten as an equivalent AND-clause using negation:
NOT ((NOT a) AND (NOT b) AND (NOT c))
And that's something we can pass to clauselist_selectivity.
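
A quick sanity check with made-up independent selectivities
sel(a) = 0.2 and sel(b) = 0.3:

    1 - (1 - 0.2) * (1 - 0.3) = 1 - 0.56 = 0.44

which matches the old formula s1 + s2 - s1*s2 = 0.2 + 0.3 - 0.06.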
histogram call cache
--------------------
The call cache was removed because it did not initially work
well with OR clauses, but that was just a stupid thinko in the
implementation. This patch re-adds it, hopefully correctly.
The code in update_match_bitmap_histogram() is overly complex;
the branches handling the various inequality cases are redundant.
This needs to be simplified somehow.
---
contrib/file_fdw/file_fdw.c | 3 +-
contrib/postgres_fdw/postgres_fdw.c | 6 +-
src/backend/optimizer/path/clausesel.c | 2224 +++++++++++++++++++++++++++-----
src/backend/optimizer/path/costsize.c | 23 +-
src/backend/optimizer/util/orclauses.c | 4 +-
src/backend/utils/adt/selfuncs.c | 17 +-
src/backend/utils/misc/guc.c | 20 +
src/include/optimizer/cost.h | 6 +-
src/include/utils/mvstats.h | 8 +
9 files changed, 2003 insertions(+), 308 deletions(-)
diff --git a/contrib/file_fdw/file_fdw.c b/contrib/file_fdw/file_fdw.c
index 83bbfa1..1d1571c 100644
--- a/contrib/file_fdw/file_fdw.c
+++ b/contrib/file_fdw/file_fdw.c
@@ -954,7 +954,8 @@ estimate_size(PlannerInfo *root, RelOptInfo *baserel,
baserel->baserestrictinfo,
0,
JOIN_INNER,
- NULL);
+ NULL,
+ NIL);
nrows = clamp_row_est(nrows);
diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c
index 9a014d4..7d09fe3 100644
--- a/contrib/postgres_fdw/postgres_fdw.c
+++ b/contrib/postgres_fdw/postgres_fdw.c
@@ -454,7 +454,8 @@ postgresGetForeignRelSize(PlannerInfo *root,
fpinfo->local_conds,
baserel->relid,
JOIN_INNER,
- NULL);
+ NULL,
+ NIL);
cost_qual_eval(&fpinfo->local_conds_cost, fpinfo->local_conds, root);
@@ -1836,7 +1837,8 @@ estimate_path_cost_size(PlannerInfo *root,
local_join_conds,
baserel->relid,
JOIN_INNER,
- NULL);
+ NULL,
+ NIL);
local_sel *= fpinfo->local_conds_sel;
rows = clamp_row_est(rows * local_sel);
diff --git a/src/backend/optimizer/path/clausesel.c b/src/backend/optimizer/path/clausesel.c
index 6c99f02..8d15d3c8 100644
--- a/src/backend/optimizer/path/clausesel.c
+++ b/src/backend/optimizer/path/clausesel.c
@@ -29,6 +29,8 @@
#include "utils/selfuncs.h"
#include "utils/typcache.h"
+#include "miscadmin.h"
+
/*
* Data structure for accumulating info about possible range-query
@@ -44,6 +46,13 @@ typedef struct RangeQueryClause
Selectivity hibound; /* Selectivity of a var < something clause */
} RangeQueryClause;
+static Selectivity clauselist_selectivity_or(PlannerInfo *root,
+ List *clauses,
+ int varRelid,
+ JoinType jointype,
+ SpecialJoinInfo *sjinfo,
+ List *conditions);
+
static void addRangeClause(RangeQueryClause **rqlist, Node *clause,
bool varonleft, bool isLTsel, Selectivity s2);
@@ -59,23 +68,29 @@ static Bitmapset *collect_mv_attnums(PlannerInfo *root, List *clauses,
Oid varRelid, Index *relid, SpecialJoinInfo *sjinfo,
int type);
+static Bitmapset *clause_mv_get_attnums(PlannerInfo *root, Node *clause);
+
static List *clauselist_apply_dependencies(PlannerInfo *root, List *clauses,
Oid varRelid, List *stats,
SpecialJoinInfo *sjinfo);
-static MVStatisticInfo *choose_mv_statistics(List *mvstats, Bitmapset *attnums);
-
static List *clauselist_mv_split(PlannerInfo *root, SpecialJoinInfo *sjinfo,
List *clauses, Oid varRelid,
List **mvclauses, MVStatisticInfo *mvstats, int types);
static Selectivity clauselist_mv_selectivity(PlannerInfo *root,
- List *clauses, MVStatisticInfo *mvstats);
+ MVStatisticInfo *mvstats, List *clauses,
+ List *conditions, bool is_or);
+
static Selectivity clauselist_mv_selectivity_mcvlist(PlannerInfo *root,
- List *clauses, MVStatisticInfo *mvstats,
- bool *fullmatch, Selectivity *lowsel);
+ MVStatisticInfo *mvstats,
+ List *clauses, List *conditions,
+ bool is_or, bool *fullmatch,
+ Selectivity *lowsel);
static Selectivity clauselist_mv_selectivity_histogram(PlannerInfo *root,
- List *clauses, MVStatisticInfo *mvstats);
+ MVStatisticInfo *mvstats,
+ List *clauses, List *conditions,
+ bool is_or);
static int update_match_bitmap_mcvlist(PlannerInfo *root, List *clauses,
int2vector *stakeys, MCVList mcvlist,
@@ -89,11 +104,59 @@ static int update_match_bitmap_histogram(PlannerInfo *root, List *clauses,
int nmatches, char * matches,
bool is_or);
+/*
+ * Describes a combination of multiple statistics used to cover the
+ * attributes referenced by the clauses. The array 'stats' (with nstats
+ * elements) lists the statistics in the order they are applied, and
+ * the counters track how many clauses and conditions the solution
+ * covers.
+ *
+ * choose_mv_statistics_exhaustive() uses this to track both the current
+ * and the best solutions, while walking through the space of possible
+ * combinations.
+ */
+typedef struct mv_solution_t {
+ int nclauses; /* number of clauses covered */
+ int nconditions; /* number of conditions covered */
+ int nstats; /* number of stats applied */
+ int *stats; /* stats (in the apply order) */
+} mv_solution_t;
+
+static List *choose_mv_statistics(PlannerInfo *root,
+ List *mvstats,
+ List *clauses, List *conditions,
+ Oid varRelid,
+ SpecialJoinInfo *sjinfo);
+
+static List *filter_clauses(PlannerInfo *root, Oid varRelid,
+ SpecialJoinInfo *sjinfo, int type,
+ List *stats, List *clauses,
+ Bitmapset **attnums);
+
+static List *filter_stats(List *stats, Bitmapset *new_attnums,
+ Bitmapset *all_attnums);
+
+static Bitmapset **make_stats_attnums(MVStatisticInfo *mvstats,
+ int nmvstats);
+
+static MVStatisticInfo *make_stats_array(List *stats, int *nmvstats);
+
+static List* filter_redundant_stats(List *stats,
+ List *clauses, List *conditions);
+
+static Node** make_clauses_array(List *clauses, int *nclauses);
+
+static Bitmapset ** make_clauses_attnums(PlannerInfo *root, Oid varRelid,
+ SpecialJoinInfo *sjinfo, int type,
+ Node **clauses, int nclauses);
+
+static bool* make_cover_map(Bitmapset **stats_attnums, int nmvstats,
+ Bitmapset **clauses_attnums, int nclauses);
+
static bool has_stats(List *stats, int type);
static List * find_stats(PlannerInfo *root, List *clauses,
Oid varRelid, Index *relid);
-
+
static Bitmapset* fdeps_collect_attnums(List *stats);
static int *make_idx_to_attnum_mapping(Bitmapset *attnums);
@@ -116,6 +179,8 @@ static Bitmapset *fdeps_filter_clauses(PlannerInfo *root,
static Bitmapset * get_varattnos(Node * node, Index relid);
+int mvstat_search_type = MVSTAT_SEARCH_GREEDY;
+
/* used for merging bitmaps - AND (min), OR (max) */
#define MAX(x, y) (((x) > (y)) ? (x) : (y))
#define MIN(x, y) (((x) < (y)) ? (x) : (y))
@@ -257,14 +322,15 @@ clauselist_selectivity(PlannerInfo *root,
List *clauses,
int varRelid,
JoinType jointype,
- SpecialJoinInfo *sjinfo)
+ SpecialJoinInfo *sjinfo,
+ List *conditions)
{
Selectivity s1 = 1.0;
RangeQueryClause *rqlist = NULL;
ListCell *l;
/* processing mv stats */
- Oid relid = InvalidOid;
+ Index relid = InvalidOid;
/* attributes in mv-compatible clauses */
Bitmapset *mvattnums = NULL;
@@ -274,12 +340,13 @@ clauselist_selectivity(PlannerInfo *root,
stats = find_stats(root, clauses, varRelid, &relid);
/*
- * If there's exactly one clause, then no use in trying to match up pairs,
- * so just go directly to clause_selectivity().
+ * If there's exactly one clause, then no use in trying to match up
+ * pairs, or matching multivariate statistics, so just go directly
+ * to clause_selectivity().
*/
if (list_length(clauses) == 1)
return clause_selectivity(root, (Node *) linitial(clauses),
- varRelid, jointype, sjinfo);
+ varRelid, jointype, sjinfo, conditions);
/*
* Check that there are some stats with functional dependencies
@@ -311,8 +378,8 @@ clauselist_selectivity(PlannerInfo *root,
}
/*
- * Check that there are statistics with MCV list. If not, we don't
- * need to waste time with the optimization.
+ * Check that there are statistics with MCV list or histogram.
+ * If not, we don't need to waste time with the optimization.
*/
if (has_stats(stats, MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST))
{
@@ -326,33 +393,194 @@ clauselist_selectivity(PlannerInfo *root,
/*
* If there still are at least two columns, we'll try to select
- * a suitable multivariate stats.
+ * a suitable combination of multivariate stats. If there are
+ * multiple combinations, we'll try to choose the best one.
+ * See choose_mv_statistics for more details.
*/
if (bms_num_members(mvattnums) >= 2)
{
- /* see choose_mv_statistics() for details */
- MVStatisticInfo *mvstat = choose_mv_statistics(stats, mvattnums);
+ int k;
+ ListCell *s;
+
+ /*
+ * Copy the list of conditions, so that we can build a list
+ * of local conditions (and keep the original intact, for
+ * the other clauses at the same level).
+ */
+ List *conditions_local = list_copy(conditions);
+
+ /* find the best combination of statistics */
+ List *solution = choose_mv_statistics(root, stats,
+ clauses, conditions,
+ varRelid, sjinfo);
- if (mvstat != NULL) /* we have a matching stats */
+ /* we have a good solution (list of stats) */
+ foreach (s, solution)
{
+ MVStatisticInfo *mvstat = (MVStatisticInfo *)lfirst(s);
+
/* clauses compatible with multi-variate stats */
List *mvclauses = NIL;
+ List *mvclauses_new = NIL;
+ List *mvclauses_conditions = NIL;
+ Bitmapset *stat_attnums = NULL;
- /* split the clauselist into regular and mv-clauses */
- clauses = clauselist_mv_split(root, sjinfo, clauses,
+ /* build attnum bitmapset for this statistics */
+ for (k = 0; k < mvstat->stakeys->dim1; k++)
+ stat_attnums = bms_add_member(stat_attnums,
+ mvstat->stakeys->values[k]);
+
+ /*
+ * Append the compatible conditions (passed from above)
+ * to mvclauses_conditions.
+ */
+ foreach (l, conditions)
+ {
+ Node *c = (Node*)lfirst(l);
+ Bitmapset *tmp = clause_mv_get_attnums(root, c);
+
+ if (bms_is_subset(tmp, stat_attnums))
+ mvclauses_conditions
+ = lappend(mvclauses_conditions, c);
+
+ bms_free(tmp);
+ }
+
+ /* split the clauselist into regular and mv-clauses
+ *
+ * We keep the list of clauses (we don't remove the
+ * clauses yet, because we want to use the clauses
+ * as conditions of other clauses).
+ *
+ * FIXME Do this only once, i.e. filter the clauses
+ * once (selecting clauses covered by at least
+ * one statistics) and then convert them into
+ * smaller per-statistics lists of conditions
+ * and estimated clauses.
+ */
+ clauselist_mv_split(root, sjinfo, clauses,
varRelid, &mvclauses, mvstat,
(MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST));
- /* we've chosen the histogram to match the clauses */
+ /*
+ * We've chosen the statistics to match the clauses, so
+ * each statistics from the solution should have at least
+ * one new clause (not covered by the previous stats).
+ */
Assert(mvclauses != NIL);
+ /*
+ * Mvclauses now contains only clauses compatible
+ * with the currently selected stats, but we have to
+ * split that into conditions (already matched by
+ * the previous stats), and the new clauses we need
+ * to estimate using this stats.
+ */
+ foreach (l, mvclauses)
+ {
+ ListCell *p;
+ bool covered = false;
+ Node *clause = (Node *) lfirst(l);
+ Bitmapset *clause_attnums = clause_mv_get_attnums(root, clause);
+
+ /*
+ * If already covered by previous stats, add it to
+ * conditions.
+ *
+ * TODO Maybe this could be relaxed a bit? Because
+ * with complex and/or clauses, this might
+ * mean no statistics actually covers such
+ * complex clause.
+ */
+ foreach (p, solution)
+ {
+ int k;
+ Bitmapset *stat_attnums = NULL;
+
+ MVStatisticInfo *prev_stat
+ = (MVStatisticInfo *)lfirst(p);
+
+ /* break if we've run into the current statistics */
+ if (prev_stat == mvstat)
+ break;
+
+ for (k = 0; k < prev_stat->stakeys->dim1; k++)
+ stat_attnums = bms_add_member(stat_attnums,
+ prev_stat->stakeys->values[k]);
+
+ covered = bms_is_subset(clause_attnums, stat_attnums);
+
+ bms_free(stat_attnums);
+
+ if (covered)
+ break;
+ }
+
+ if (covered)
+ mvclauses_conditions
+ = lappend(mvclauses_conditions, clause);
+ else
+ mvclauses_new
+ = lappend(mvclauses_new, clause);
+ }
+
+ /*
+ * We need at least one new clause (not just conditions).
+ */
+ Assert(mvclauses_new != NIL);
+
/* compute the multivariate stats */
- s1 *= clauselist_mv_selectivity(root, mvclauses, mvstat);
+ s1 *= clauselist_mv_selectivity(root, mvstat,
+ mvclauses_new,
+ mvclauses_conditions,
+ false); /* AND */
+ }
+
+ /*
+ * And now finally remove all the mv-compatible clauses.
+ *
+ * This only repeats the same split as above, but this
+ * time we actually use the result list (and feed it to
+ * the next call).
+ */
+ foreach (s, solution)
+ {
+ /* clauses compatible with multi-variate stats */
+ List *mvclauses = NIL;
+
+ MVStatisticInfo *mvstat = (MVStatisticInfo *)lfirst(s);
+
+ /* split the list into regular and mv-clauses */
+ clauses = clauselist_mv_split(root, sjinfo, clauses,
+ varRelid, &mvclauses, mvstat,
+ (MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST));
+
+ /*
+ * Add the clauses to the conditions (to be passed
+ * to the regular clauses), irrespective of whether
+ * they were used as conditions or estimated clauses
+ * here.
+ *
+ * We only keep the remaining clauses in the list
+ * (what clauselist_mv_split returns), so each MV
+ * clause is added as a condition exactly once.
+ */
+ conditions_local = list_concat(conditions_local, mvclauses);
}
+
+ /* from now on, work with the 'local' list of conditions */
+ conditions = conditions_local;
}
}
/*
+ * If there's exactly one clause, then no use in trying to match up
+ * pairs, so just go directly to clause_selectivity().
+ */
+ if (list_length(clauses) == 1)
+ return clause_selectivity(root, (Node *) linitial(clauses),
+ varRelid, jointype, sjinfo, conditions);
+
+ /*
* Initial scan over clauses. Anything that doesn't look like a potential
* rangequery clause gets multiplied into s1 and forgotten. Anything that
* does gets inserted into an rqlist entry.
@@ -364,7 +592,8 @@ clauselist_selectivity(PlannerInfo *root,
Selectivity s2;
/* Always compute the selectivity using clause_selectivity */
- s2 = clause_selectivity(root, clause, varRelid, jointype, sjinfo);
+ s2 = clause_selectivity(root, clause, varRelid, jointype, sjinfo,
+ conditions);
/*
* Check for being passed a RestrictInfo.
@@ -523,6 +752,55 @@ clauselist_selectivity(PlannerInfo *root,
}
/*
+ * Similar to clauselist_selectivity(), but for OR-clauses. We can't
+ * simply apply exactly the same logic as to AND-clauses, because there
+ * are a few key differences:
+ *
+ * - functional dependencies don't really apply to OR-clauses
+ *
+ * - clauselist_selectivity() works by decomposing the selectivity
+ * into conditional selectivities (probabilities), but that can be
+ * done only for AND-clauses. That means problems with applying
+ * multiple statistics (and reusing clauses as conditions, etc.).
+ *
+ * We might invent a completely new set of functions here, resembling
+ * clauselist_selectivity but adapting the ideas to OR-clauses.
+ *
+ * But luckily we know that each OR-clause
+ *
+ * (a OR b OR c)
+ *
+ * may be rewritten as an equivalent AND-clause using negation:
+ *
+ * NOT ((NOT a) AND (NOT b) AND (NOT c))
+ *
+ * And that's something we can pass to clauselist_selectivity.
+ */
+static Selectivity
+clauselist_selectivity_or(PlannerInfo *root,
+ List *clauses,
+ int varRelid,
+ JoinType jointype,
+ SpecialJoinInfo *sjinfo,
+ List *conditions)
+{
+ List *args = NIL;
+ ListCell *l;
+ Expr *expr;
+
+ /* (NOT ...) */
+ foreach (l, clauses)
+ args = lappend(args, makeBoolExpr(NOT_EXPR, list_make1(lfirst(l)), -1));
+
+ /* ((NOT ...) AND (NOT ...)) */
+ expr = makeBoolExpr(AND_EXPR, args, -1);
+
+ /* NOT (... AND ...) */
+ return 1.0 - clauselist_selectivity(root, list_make1(expr), varRelid,
+ jointype, sjinfo, conditions);
+}
+
+/*
* addRangeClause --- add a new range clause for clauselist_selectivity
*
* Here is where we try to match up pairs of range-query clauses
@@ -729,7 +1007,8 @@ clause_selectivity(PlannerInfo *root,
Node *clause,
int varRelid,
JoinType jointype,
- SpecialJoinInfo *sjinfo)
+ SpecialJoinInfo *sjinfo,
+ List *conditions)
{
Selectivity s1 = 0.5; /* default for any unhandled clause type */
RestrictInfo *rinfo = NULL;
@@ -849,7 +1128,8 @@ clause_selectivity(PlannerInfo *root,
(Node *) get_notclausearg((Expr *) clause),
varRelid,
jointype,
- sjinfo);
+ sjinfo,
+ conditions);
}
else if (and_clause(clause))
{
@@ -858,29 +1138,18 @@ clause_selectivity(PlannerInfo *root,
((BoolExpr *) clause)->args,
varRelid,
jointype,
- sjinfo);
+ sjinfo,
+ conditions);
}
else if (or_clause(clause))
{
- /*
- * Selectivities for an OR clause are computed as s1+s2 - s1*s2 to
- * account for the probable overlap of selected tuple sets.
- *
- * XXX is this too conservative?
- */
- ListCell *arg;
-
- s1 = 0.0;
- foreach(arg, ((BoolExpr *) clause)->args)
- {
- Selectivity s2 = clause_selectivity(root,
- (Node *) lfirst(arg),
- varRelid,
- jointype,
- sjinfo);
-
- s1 = s1 + s2 - s1 * s2;
- }
+ /* just call to clauselist_selectivity_or() */
+ s1 = clauselist_selectivity_or(root,
+ ((BoolExpr *) clause)->args,
+ varRelid,
+ jointype,
+ sjinfo,
+ conditions);
}
else if (is_opclause(clause) || IsA(clause, DistinctExpr))
{
@@ -970,7 +1239,8 @@ clause_selectivity(PlannerInfo *root,
(Node *) ((RelabelType *) clause)->arg,
varRelid,
jointype,
- sjinfo);
+ sjinfo,
+ conditions);
}
else if (IsA(clause, CoerceToDomain))
{
@@ -979,7 +1249,8 @@ clause_selectivity(PlannerInfo *root,
(Node *) ((CoerceToDomain *) clause)->arg,
varRelid,
jointype,
- sjinfo);
+ sjinfo,
+ conditions);
}
else
{
@@ -1103,9 +1374,67 @@ clause_selectivity(PlannerInfo *root,
* them without inspection, which is more expensive). But this
* requires really knowing the per-clause selectivities in advance,
* and that's not what we do now.
+ *
+ * TODO All this is based on the assumption that the statistics represent
+ *      the necessary dependencies, i.e. that if two columns are not in
+ *      the same statistics, there's no dependency. If that's not the
+ *      case, we may get misestimates, just like before. For example
+ *      assume we have a table with three columns [a,b,c] with exactly
+ *      the same values, and statistics on [a,b] and [b,c]. So something
+ *      like this:
+ *
+ *      CREATE TABLE test AS SELECT i AS a, i AS b, i AS c
+ *                             FROM generate_series(1,1000) s(i);
+ *
+ * ALTER TABLE test ADD STATISTICS (mcv) ON (a,b);
+ * ALTER TABLE test ADD STATISTICS (mcv) ON (b,c);
+ *
+ * ANALYZE test;
+ *
+ * EXPLAIN ANALYZE SELECT * FROM test
+ * WHERE (a < 10) AND (b < 20) AND (c < 10);
+ *
+ * The problem here is that the only shared column between the two
+ * statistics is 'b' so the probability will be computed like this
+ *
+ * P[(a < 10) & (b < 20) & (c < 10)]
+ * = P[(a < 10) & (b < 20)] * P[(c < 10) | (a < 10) & (b < 20)]
+ * = P[(a < 10) & (b < 20)] * P[(c < 10) | (b < 20)]
+ *
+ * or like this
+ *
+ * P[(a < 10) & (b < 20) & (c < 10)]
+ * = P[(b < 20) & (c < 10)] * P[(a < 10) | (b < 20) & (c < 10)]
+ * = P[(b < 20) & (c < 10)] * P[(a < 10) | (b < 20)]
+ *
+ * In both cases the conditional probabilities will be evaluated as
+ * 0.5, because they lack the other column (which would make it 1.0).
+ *
+ * Theoretically it might be possible to transfer the dependency,
+ * e.g. by building bitmap for [a,b] and then combine it with [b,c]
+ * by doing something like this:
+ *
+ * 1) build bitmap on [a,b] using [(a<10) & (b < 20)]
+ * 2) for each element in [b,c] check the bitmap
+ *
+ * But that's certainly nontrivial - for example the statistics may
+ * be different (MCV list vs. histogram) and/or the items may not
+ * match (e.g. MCV items or histogram buckets will be built
+ * differently). Also, for one value of 'b' there might be multiple
+ * MCV items (because of the other column values) with different
+ * bitmap values (some will match, some won't) - so it's not exactly
+ * a bitmap but a partial match.
+ *
+ * Maybe a hash table with number of matches and mismatches (or
+ * maybe sums of frequencies) would work? The step (2) would then
+ * lookup the values and use that to weight the item somehow.
+ *
+ * Currently the only solution is to build statistics on all three
+ * columns.
*/
static Selectivity
-clauselist_mv_selectivity(PlannerInfo *root, List *clauses, MVStatisticInfo *mvstats)
+clauselist_mv_selectivity(PlannerInfo *root, MVStatisticInfo *mvstats,
+ List *clauses, List *conditions, bool is_or)
{
bool fullmatch = false;
Selectivity s1 = 0.0, s2 = 0.0;
@@ -1123,7 +1452,8 @@ clauselist_mv_selectivity(PlannerInfo *root, List *clauses, MVStatisticInfo *mvs
*/
/* Evaluate the MCV first. */
- s1 = clauselist_mv_selectivity_mcvlist(root, clauses, mvstats,
+ s1 = clauselist_mv_selectivity_mcvlist(root, mvstats,
+ clauses, conditions, is_or,
&fullmatch, &mcv_low);
/*
@@ -1136,7 +1466,8 @@ clauselist_mv_selectivity(PlannerInfo *root, List *clauses, MVStatisticInfo *mvs
/* FIXME if (fullmatch) without matching MCV item, use the mcv_low
* selectivity as upper bound */
- s2 = clauselist_mv_selectivity_histogram(root, clauses, mvstats);
+ s2 = clauselist_mv_selectivity_histogram(root, mvstats,
+ clauses, conditions, is_or);
/* TODO clamp to <= 1.0 (or more strictly, when possible) */
return s1 + s2;
@@ -1176,8 +1507,7 @@ collect_mv_attnums(PlannerInfo *root, List *clauses, Oid varRelid,
*/
if (bms_num_members(attnums) <= 1)
{
- if (attnums != NULL)
- pfree(attnums);
+ bms_free(attnums);
attnums = NULL;
*relid = InvalidOid;
}
@@ -1186,202 +1516,931 @@ collect_mv_attnums(PlannerInfo *root, List *clauses, Oid varRelid,
}
/*
- * We're looking for statistics matching at least 2 attributes,
- * referenced in the clauses compatible with multivariate statistics.
- * The current selection criteria is very simple - we choose the
- * statistics referencing the most attributes.
+ * Selects the best combination of multivariate statistics, in an
+ * exhaustive way, where 'best' means:
*
- * If there are multiple statistics referencing the same number of
- * columns (from the clauses), the one with less source columns
- * (as listed in the ADD STATISTICS when creating the statistics) wins.
- * Other wise the first one wins.
+ * (a) covering the most attributes (referenced by clauses)
+ * (b) using the least number of multivariate stats
+ * (c) using the most conditions to exploit dependency
*
- * This is a very simple criteria, and has several weaknesses:
+ * There may be other optimality criteria, not considered in the initial
+ * implementation (more on that in the 'Weaknesses' section below).
*
- * (a) does not consider the accuracy of the statistics
+ * This pretty much splits the probability of clauses (aka selectivity)
+ * into a sequence of conditional probabilities, like this
*
- * If there are two histograms built on the same set of columns,
- * but one has 100 buckets and the other one has 1000 buckets (thus
- * likely providing better estimates), this is not currently
- * considered.
+ * P(A,B,C,D) = P(A,B) * P(C|A,B) * P(D|A,B,C)
*
- * (b) does not consider the type of statistics
+ * and removing the attributes not referenced by the existing stats,
+ * under the assumption that there's no dependency (otherwise the DBA
+ * would create the stats).
*
- * If there are three statistics - one containing just a MCV list,
- * another one with just a histogram and a third one with both,
- * this is not considered.
+ * The last criterion means that when we have the choice to compute like
+ * this
*
- * (c) does not consider the number of clauses
+ * P(A,B,C,D) = P(A,B,C) * P(D|B,C)
*
- * As explained, only the number of referenced attributes counts,
- * so if there are multiple clauses on a single attribute, this
- * still counts as a single attribute.
+ * or like this
*
- * (d) does not consider type of condition
+ * P(A,B,C,D) = P(A,B,C) * P(D|C)
*
- * Some clauses may work better with some statistics - for example
- * equality clauses probably work better with MCV lists than with
- * histograms. But IS [NOT] NULL conditions may often work better
- * with histograms (thanks to NULL-buckets).
+ * we should use the first option, as that exploits more dependencies.
*
- * So for example with five WHERE conditions
+ * The order of statistics in the solution implicitly determines the
+ * order of estimation of clauses, because as we apply a statistics,
+ * we always use it to estimate all the clauses covered by it (and
+ * then we use those clauses as conditions for the next statistics).
*
- * WHERE (a = 1) AND (b = 1) AND (c = 1) AND (d = 1) AND (e = 1)
+ * Don't call this directly but through choose_mv_statistics().
*
- * and statistics on (a,b), (a,b,e) and (a,b,c,d), the last one will be
- * selected as it references the most columns.
*
- * Once we have selected the multivariate statistics, we split the list
- * of clauses into two parts - conditions that are compatible with the
- * selected stats, and conditions are estimated using simple statistics.
+ * Algorithm
+ * ---------
+ * The algorithm is a recursive implementation of backtracking, with
+ * maximum 'depth' equal to the number of multi-variate statistics
+ * available on the table.
*
- * From the example above, conditions
+ * It explores all the possible permutations of the stats.
+ *
+ * Whenever it considers adding the next statistics, the clauses it
+ * matches are divided into 'conditions' (clauses already matched by at
+ * least one previous statistics) and clauses that are estimated.
*
- * (a = 1) AND (b = 1) AND (c = 1) AND (d = 1)
+ * Then several checks are performed:
*
- * will be estimated using the multivariate statistics (a,b,c,d) while
- * the last condition (e = 1) will get estimated using the regular ones.
+ * (a) The statistics covers at least 2 columns, referenced in the
+ * estimated clauses (otherwise multi-variate stats are useless).
*
- * There are various alternative selection criteria (e.g. counting
- * conditions instead of just referenced attributes), but eventually
- * the best option should be to combine multiple statistics. But that's
- * much harder to do correctly.
+ * (b) The statistics covers at least 1 new column, i.e. a column not
+ *     referenced by the already used stats (and the new column has
+ *     to be referenced by the clauses, of course). Otherwise the
+ * statistics would not add any new information.
*
- * TODO Select multiple statistics and combine them when computing
- * the estimate.
+ * There are some other sanity checks (e.g. that the stats must not be
+ * used twice etc.).
*
- * TODO This will probably have to consider compatibility of clauses,
- * because 'dependencies' will probably work only with equality
- * clauses.
+ * Finally the new solution is compared to the currently best one, and
+ * if it's considered better, it's used instead.
+ *
+ *
+ * Weaknesses
+ * ----------
+ * The current implementation uses somewhat simple optimality criteria,
+ * suffering from the following weaknesses.
+ *
+ * (a) There may be multiple solutions with the same number of covered
+ * attributes and number of statistics (e.g. the same solution but
+ * with statistics in a different order). It's unclear which solution
+ * is the best one - in a sense all of them are equal.
+ *
+ * TODO It might be possible to compute estimate for each of those
+ * solutions, and then combine them to get the final estimate
+ * (e.g. by using average or median).
+ *
+ * (b) Does not consider that some types of stats are a better match for
+ *     some types of clauses (e.g. an MCV list is a better match for
+ *     equality clauses than a histogram).
+ *
+ * XXX Maybe MCV is almost always better / more accurate?
+ *
+ * But maybe this is pointless - generally, each column is either
+ * a label (no matter whether that's due to the data type or how
+ * it's used), or a value with an ordering that makes sense. So
+ * either an MCV list is more appropriate (labels) or a histogram
+ * (ordered values).
+ *
+ * Not sure what to do with statistics mixing columns of
+ * both types - maybe it'd be better to invent a new type of stats
+ * combining MCV list and histogram (keeping a small histogram for
+ * each MCV item, and a separate histogram for values not on the
+ * MCV list). But that's not implemented at this moment.
+ *
+ * TODO The algorithm should probably count number of Vars (not just
+ * attnums) when computing the 'score' of each solution. Computing
+ * the ratio of (num of all vars) / (num of condition vars) as a
+ * measure of how well the solution uses conditions might be
+ * useful.
*/
-static MVStatisticInfo *
-choose_mv_statistics(List *stats, Bitmapset *attnums)
+static void
+choose_mv_statistics_exhaustive(PlannerInfo *root, int step,
+ int nmvstats, MVStatisticInfo *mvstats, Bitmapset ** stats_attnums,
+ int nclauses, Node ** clauses, Bitmapset ** clauses_attnums,
+ int nconditions, Node ** conditions, Bitmapset ** conditions_attnums,
+ bool *cover_map, bool *condition_map, int *ruled_out,
+ mv_solution_t *current, mv_solution_t **best)
{
- int i;
- ListCell *lc;
+ int i, j;
- MVStatisticInfo *choice = NULL;
+ Assert(best != NULL);
+ Assert((step == 0 && current == NULL) || (step > 0 && current != NULL));
- int current_matches = 1; /* goal #1: maximize */
- int current_dims = (MVSTATS_MAX_DIMENSIONS+1); /* goal #2: minimize */
+ CHECK_FOR_INTERRUPTS();
+
+ if (current == NULL)
+ {
+ current = (mv_solution_t*)palloc0(sizeof(mv_solution_t));
+ current->stats = (int*)palloc0(sizeof(int)*nmvstats);
+ current->nstats = 0;
+ current->nclauses = 0;
+ current->nconditions = 0;
+ }
/*
- * Walk through the statistics (simple array with nmvstats elements)
- * and for each one count the referenced attributes (encoded in
- * the 'attnums' bitmap).
+ * Now try to apply each statistics, matching at least two attributes,
+ * unless it's already used in one of the previous steps.
*/
- foreach (lc, stats)
+ for (i = 0; i < nmvstats; i++)
{
- MVStatisticInfo *info = (MVStatisticInfo *)lfirst(lc);
+ int c;
- /* columns matching this statistics */
- int matches = 0;
+ int ncovered_clauses = 0; /* number of covered clauses */
+ int ncovered_conditions = 0; /* number of covered conditions */
+ int nattnums = 0; /* number of covered attributes */
- int2vector * attrs = info->stakeys;
- int numattrs = attrs->dim1;
+ Bitmapset *all_attnums = NULL;
+ Bitmapset *new_attnums = NULL;
- /* skip dependencies-only stats */
- if (! (info->mcv_built || info->hist_built))
+ /* skip statistics that were already used or eliminated */
+ if (ruled_out[i] != -1)
continue;
- /* count columns covered by the histogram */
- for (i = 0; i < numattrs; i++)
- if (bms_is_member(attrs->values[i], attnums))
- matches++;
-
/*
- * Use this statistics when it improves the number of matches or
- * when it matches the same number of attributes but is smaller.
+ * See if we have clauses covered by this statistics, but not
+ * yet covered by any of the preceding ones.
*/
- if ((matches > current_matches) ||
- ((matches == current_matches) && (current_dims > numattrs)))
+ for (c = 0; c < nclauses; c++)
{
- choice = info;
- current_matches = matches;
- current_dims = numattrs;
- }
- }
+ bool covered = false;
+ Bitmapset *clause_attnums = clauses_attnums[c];
+ Bitmapset *tmp = NULL;
- return choice;
-}
+ /*
+ * If this clause is not covered by this stats, we can't
+ * use the stats to estimate that at all.
+ */
+ if (! cover_map[i * nclauses + c])
+ continue;
+ /*
+ * Now we know we'll use this clause - either as a condition
+ * or as a new clause (the estimated one). So let's add the
+ * attributes to the attnums from all the clauses usable with
+ * this statistics.
+ */
+ tmp = bms_union(all_attnums, clause_attnums);
-/*
- * This splits the clauses list into two parts - one containing clauses
- * that will be evaluated using the chosen statistics, and the remaining
- * clauses (either non-mvcompatible, or not related to the histogram).
- */
-static List *
-clauselist_mv_split(PlannerInfo *root, SpecialJoinInfo *sjinfo,
- List *clauses, Oid varRelid, List **mvclauses,
- MVStatisticInfo *mvstats, int types)
-{
- int i;
- ListCell *l;
- List *non_mvclauses = NIL;
+ /* free the old bitmap */
+ bms_free(all_attnums);
+ all_attnums = tmp;
- /* FIXME is there a better way to get info on int2vector? */
- int2vector * attrs = mvstats->stakeys;
- int numattrs = mvstats->stakeys->dim1;
+ /* let's see if it's covered by any of the previous stats */
+ for (j = 0; j < step; j++)
+ {
+ /* already covered by the previous stats */
+ if (cover_map[current->stats[j] * nclauses + c])
+ covered = true;
- Bitmapset *mvattnums = NULL;
+ if (covered)
+ break;
+ }
- /* build bitmap of attributes covered by the stats, so we can
- * do bms_is_subset later */
- for (i = 0; i < numattrs; i++)
- mvattnums = bms_add_member(mvattnums, attrs->values[i]);
+ /* if already covered, continue with the next clause */
+ if (covered)
+ {
+ ncovered_conditions += 1;
+ continue;
+ }
- /* erase the list of mv-compatible clauses */
- *mvclauses = NIL;
+ /*
+ * OK, this clause is covered by this statistics (and not by
+ * any of the previous ones)
+ */
+ ncovered_clauses += 1;
- foreach (l, clauses)
- {
- bool match = false; /* by default not mv-compatible */
- Bitmapset *attnums = NULL;
- Node *clause = (Node *) lfirst(l);
+ /* add the attnums into attnums from 'new clauses' */
+ // new_attnums = bms_union(new_attnums, clause_attnums);
+ }
- if (clause_is_mv_compatible(root, clause, varRelid, NULL,
- &attnums, sjinfo, types))
+ /* can't have more new clauses than original clauses */
+ Assert(nclauses >= ncovered_clauses);
+ Assert(ncovered_clauses >= 0); /* mostly paranoia */
+
+ nattnums = bms_num_members(all_attnums);
+
+ /* free all the bitmapsets - we don't need them anymore */
+ bms_free(all_attnums);
+ bms_free(new_attnums);
+
+ all_attnums = NULL;
+ new_attnums = NULL;
+
+ /*
+ * Now walk through the conditions (passed from above) and
+ * count those covered by this statistics.
+ */
+ for (c = 0; c < nconditions; c++)
{
- /* are all the attributes part of the selected stats? */
- if (bms_is_subset(attnums, mvattnums))
- match = true;
+ Bitmapset *clause_attnums = conditions_attnums[c];
+ Bitmapset *tmp = NULL;
+
+ /*
+ * If this clause is not covered by this stats, we can't
+ * use the stats to estimate that at all.
+ */
+ if (! condition_map[i * nconditions + c])
+ continue;
+
+ /* count this as a condition */
+ ncovered_conditions += 1;
+
+ /*
+ * Now we know we'll use this clause - either as a condition
+ * or as a new clause (the estimated one). So let's add the
+ * attributes to the attnums from all the clauses usable with
+ * this statistics.
+ */
+ tmp = bms_union(all_attnums, clause_attnums);
+
+ /* free the old bitmap */
+ bms_free(all_attnums);
+ all_attnums = tmp;
}
/*
- * The clause matches the selected stats, so put it to the list
- * of mv-compatible clauses. Otherwise, keep it in the list of
- * 'regular' clauses (that may be selected later).
+ * Let's mark the statistics as 'ruled out' - either we'll use
+ * it (and proceed to the next step), or it's incompatible.
*/
- if (match)
- *mvclauses = lappend(*mvclauses, clause);
- else
- non_mvclauses = lappend(non_mvclauses, clause);
- }
+ ruled_out[i] = step;
- /*
- * Perform regular estimation using the clauses incompatible
- * with the chosen histogram (or MV stats in general).
- */
- return non_mvclauses;
+ /*
+ * There are no clauses usable with this statistics (not already
+ * covered by some of the previous stats).
+ *
+ * Similarly, if the clauses only use a single attribute, we
+ * can't really use that.
+ */
+ if ((ncovered_clauses == 0) || (nattnums < 2))
+ continue;
-}
+ /*
+ * TODO Not sure if it's possible to add a clause referencing
+ * only attributes already covered by previous stats?
+ * Introducing only some new dependency, not a new
+ * attribute. Couldn't come up with an example, though.
+ * Might be worth adding some assert.
+ */
-/*
- * Determines whether the clause is compatible with multivariate stats,
- * and if it is, returns some additional information - varno (index
- * into simple_rte_array) and a bitmap of attributes. This is then
- * used to fetch related multivariate statistics.
- *
- * At this moment we only support basic conditions of the form
- *
- * variable OP constant
- *
- * where OP is one of [=,<,<=,>=,>] (which is however determined by
- * looking at the associated function for estimating selectivity, just
- * like with the single-dimensional case).
- *
- * TODO Support 'OR clauses' - shouldn't be all that difficult to
+ /*
+ * got a suitable statistics - let's update the current solution,
+ * maybe use it as the best solution
+ */
+ current->nclauses += ncovered_clauses;
+ current->nconditions += ncovered_conditions;
+ current->nstats += 1;
+ current->stats[step] = i;
+
+ /*
+ * We can never cover more clauses, or use more stats that we
+ * actually have at the beginning.
+ */
+ Assert(nclauses >= current->nclauses);
+ Assert(nmvstats >= current->nstats);
+ Assert(step < nmvstats);
+
+ /* we can't get more conditions than clauses and conditions combined
+ *
+ * FIXME This assert does not work because we count the conditions
+ * repeatedly (once for each statistics covering it).
+ */
+ /* Assert((nconditions + nclauses) >= current->nconditions); */
+
+ if (*best == NULL)
+ {
+ *best = (mv_solution_t*)palloc0(sizeof(mv_solution_t));
+ (*best)->stats = (int*)palloc0(sizeof(int)*nmvstats);
+ (*best)->nstats = 0;
+ (*best)->nclauses = 0;
+ (*best)->nconditions = 0;
+ }
+
+ /* see if it's better than the current 'best' solution */
+ if ((current->nclauses > (*best)->nclauses) ||
+ ((current->nclauses == (*best)->nclauses) &&
+ ((current->nstats > (*best)->nstats))))
+ {
+ (*best)->nstats = current->nstats;
+ (*best)->nclauses = current->nclauses;
+ (*best)->nconditions = current->nconditions;
+ memcpy((*best)->stats, current->stats, nmvstats * sizeof(int));
+ }
+
+ /*
+ * The recursion only makes sense if we haven't covered all the
+ * attributes (then adding stats is not really possible).
+ */
+ if ((step + 1) < nmvstats)
+ choose_mv_statistics_exhaustive(root, step+1,
+ nmvstats, mvstats, stats_attnums,
+ nclauses, clauses, clauses_attnums,
+ nconditions, conditions, conditions_attnums,
+ cover_map, condition_map, ruled_out,
+ current, best);
+
+ /* reset the last step */
+ current->nclauses -= ncovered_clauses;
+ current->nconditions -= ncovered_conditions;
+ current->nstats -= 1;
+ current->stats[step] = 0;
+
+ /* mark the statistics as usable again */
+ ruled_out[i] = -1;
+
+ Assert(current->nclauses >= 0);
+ Assert(current->nstats >= 0);
+ }
+
+ /* reset all statistics eliminated in this step back to 'usable' */
+ for (i = 0; i < nmvstats; i++)
+ if (ruled_out[i] == step)
+ ruled_out[i] = -1;
+
+}
+
+/*
+ * Greedy search for a multivariate solution - a sequence of statistics
+ * covering the clauses. This chooses the "best" statistics at each step,
+ * so the resulting solution may not be the best solution globally, but
+ * this produces the solution in only N steps (where N is the number of
+ * statistics), while the exhaustive approach may have to walk through
+ * ~N! combinations (although some of those are terminated early).
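+ *
+ * For a rough sense of scale (illustrative numbers only): with 8
+ * statistics the exhaustive search may visit up to ~8! = 40320
+ * orderings, while the greedy search performs at most 8 steps.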
+ *
+ * See the comments at choose_mv_statistics_exhaustive() as this does
+ * the same thing (but in a different way).
+ *
+ * Don't call this directly, but through choose_mv_statistics().
+ *
+ * TODO There are probably other metrics we might use - e.g. using
+ * number of columns (num_cond_columns / num_cov_columns), which
+ * might work better with a mix of simple and complex clauses.
+ *
+ * TODO Also the choice at the very first step should be handled
+ * in a special way, because there will be 0 conditions at that
+ * moment, so there needs to be some other criteria - e.g. using
+ * the simplest (or most complex?) clause might be a good idea.
+ *
+ * TODO We might also select multiple stats using different criteria,
+ * and branch the search. This is however tricky, because if we
+ * choose k statistics at each step, we get k^N branches to
+ * walk through (with N steps). That's not really good with
+ * large number of stats (yet better than exhaustive search).
+ */
+static void
+choose_mv_statistics_greedy(PlannerInfo *root, int step,
+ int nmvstats, MVStatisticInfo *mvstats, Bitmapset ** stats_attnums,
+ int nclauses, Node ** clauses, Bitmapset ** clauses_attnums,
+ int nconditions, Node ** conditions, Bitmapset ** conditions_attnums,
+ bool *cover_map, bool *condition_map, int *ruled_out,
+ mv_solution_t *current, mv_solution_t **best)
+{
+ int i, j;
+ int best_stat = -1;
+ double gain, max_gain = -1.0;
+
+ /*
+ * Bitmap tracking which clauses are already covered (by the previous
+ * statistics) and may thus serve only as a condition in this step.
+ */
+ bool *covered_clauses = (bool*)palloc0(nclauses);
+
+ /*
+ * Number of clauses and columns covered by each statistics - this
+ * includes both conditions and clauses covered by the statistics for
+ * the first time. The number of columns may count some columns
+ * repeatedly - if a column is shared by multiple clauses, it will
+ * be counted once for each clause (covered by the statistics).
+ * So with two clauses [(a=1 OR b=2),(a<2 OR c>1)] the column "a"
+ * will be counted twice (if both clauses are covered).
+ *
+ * The values for ruled-out statistics (those that can't be
+ * applied) are not computed, because that'd be pointless.
+ */
+ int *num_cov_clauses = (int*)palloc0(sizeof(int) * nmvstats);
+ int *num_cov_columns = (int*)palloc0(sizeof(int) * nmvstats);
+
+ /*
+ * Same as above, but this only includes clauses that are already
+ * covered by the previous stats (and the current one).
+ */
+ int *num_cond_clauses = (int*)palloc0(sizeof(int) * nmvstats);
+ int *num_cond_columns = (int*)palloc0(sizeof(int) * nmvstats);
+
+ /*
+ * Number of attributes for each clause.
+ *
+ * TODO Might be computed in choose_mv_statistics() and then passed
+ * here, but then the function would not have the same signature
+ * as _exhaustive().
+ */
+ int *attnum_counts = (int*)palloc0(sizeof(int) * nclauses);
+ int *attnum_cond_counts = (int*)palloc0(sizeof(int) * nconditions);
+
+ CHECK_FOR_INTERRUPTS();
+
+ Assert(best != NULL);
+ Assert((step == 0 && current == NULL) || (step > 0 && current != NULL));
+
+ /* compute attributes (columns) for each clause */
+ for (i = 0; i < nclauses; i++)
+ attnum_counts[i] = bms_num_members(clauses_attnums[i]);
+
+ /* compute attributes (columns) for each condition */
+ for (i = 0; i < nconditions; i++)
+ attnum_cond_counts[i] = bms_num_members(conditions_attnums[i]);
+
+ /* see which clauses are already covered at this point (by previous stats) */
+ for (i = 0; i < step; i++)
+ for (j = 0; j < nclauses; j++)
+ covered_clauses[j] |= (cover_map[current->stats[i] * nclauses + j]);
+
+ /* which remaining statistics covers most clauses / uses most conditions? */
+ for (i = 0; i < nmvstats; i++)
+ {
+ Bitmapset *attnums_covered = NULL;
+ Bitmapset *attnums_conditions = NULL;
+
+ /* skip stats that are already ruled out (either used or inapplicable) */
+ if (ruled_out[i] != -1)
+ continue;
+
+ /* count covered clauses and conditions (for the statistics) */
+ for (j = 0; j < nclauses; j++)
+ {
+ if (cover_map[i * nclauses + j])
+ {
+ Bitmapset *attnums_new
+ = bms_union(attnums_covered, clauses_attnums[j]);
+
+ /* get rid of the old bitmap and keep the unified result */
+ bms_free(attnums_covered);
+ attnums_covered = attnums_new;
+
+ num_cov_clauses[i] += 1;
+ num_cov_columns[i] += attnum_counts[j];
+
+ /* is the clause already covered (i.e. a condition)? */
+ if (covered_clauses[j])
+ {
+ num_cond_clauses[i] += 1;
+ num_cond_columns[i] += attnum_counts[j];
+ attnums_new = bms_union(attnums_conditions,
+ clauses_attnums[j]);
+
+ bms_free(attnums_conditions);
+ attnums_conditions = attnums_new;
+ }
+ }
+ }
+
+ /* if all covered clauses are covered by prev stats (thus conditions) */
+ if (num_cov_clauses[i] == num_cond_clauses[i])
+ ruled_out[i] = step;
+
+ /* same if there are no new attributes */
+ else if (bms_num_members(attnums_conditions) == bms_num_members(attnums_covered))
+ ruled_out[i] = step;
+
+ bms_free(attnums_covered);
+ bms_free(attnums_conditions);
+
+ /* if the statistics is inapplicable, try the next one */
+ if (ruled_out[i] != -1)
+ continue;
+
+ /* now let's walk through conditions and count the covered */
+ for (j = 0; j < nconditions; j++)
+ {
+ if (condition_map[i * nconditions + j])
+ {
+ num_cond_clauses[i] += 1;
+ num_cond_columns[i] += attnum_cond_counts[j];
+ }
+ }
+
+ /* otherwise see if this statistics improves the gain metric */
+ gain = num_cond_columns[i] / (double)num_cov_columns[i];
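+
+ /*
+  * Worked example (made-up numbers): with num_cov_columns = 4 and
+  * num_cond_columns = 3, the gain is 3/4 = 0.75; a statistics that
+  * reuses no conditions at all has gain 0.0.
+  */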
+
+ if (gain > max_gain)
+ {
+ max_gain = gain;
+ best_stat = i;
+ }
+ }
+
+ /*
+ * Have we found a suitable statistics? Add it to the solution and
+ * try next step.
+ */
+ if (best_stat != -1)
+ {
+ /* mark the statistics, so that we skip it in next steps */
+ ruled_out[best_stat] = step;
+
+ /* allocate current solution if necessary */
+ if (current == NULL)
+ {
+ current = (mv_solution_t*)palloc0(sizeof(mv_solution_t));
+ current->stats = (int*)palloc0(sizeof(int)*nmvstats);
+ current->nstats = 0;
+ current->nclauses = 0;
+ current->nconditions = 0;
+ }
+
+ current->nclauses += num_cov_clauses[best_stat];
+ current->nconditions += num_cond_clauses[best_stat];
+ current->stats[step] = best_stat;
+ current->nstats++;
+
+ if (*best == NULL)
+ {
+ (*best) = (mv_solution_t*)palloc0(sizeof(mv_solution_t));
+ (*best)->nstats = current->nstats;
+ (*best)->nclauses = current->nclauses;
+ (*best)->nconditions = current->nconditions;
+
+ (*best)->stats = (int*)palloc0(sizeof(int)*nmvstats);
+ memcpy((*best)->stats, current->stats, nmvstats * sizeof(int));
+ }
+ else
+ {
+ /* see if this is a better solution */
+ double current_gain = (double)current->nconditions / current->nclauses;
+ double best_gain = (double)(*best)->nconditions / (*best)->nclauses;
+
+ if ((current_gain > best_gain) ||
+ ((current_gain == best_gain) && (current->nstats < (*best)->nstats)))
+ {
+ (*best)->nstats = current->nstats;
+ (*best)->nclauses = current->nclauses;
+ (*best)->nconditions = current->nconditions;
+ memcpy((*best)->stats, current->stats, nmvstats * sizeof(int));
+ }
+ }
+
+ /*
+ * The recursion only makes sense if we haven't covered all the
+ * attributes (then adding stats is not really possible).
+ */
+ if ((step + 1) < nmvstats)
+ choose_mv_statistics_greedy(root, step+1,
+ nmvstats, mvstats, stats_attnums,
+ nclauses, clauses, clauses_attnums,
+ nconditions, conditions, conditions_attnums,
+ cover_map, condition_map, ruled_out,
+ current, best);
+
+ /* reset the last step */
+ current->nclauses -= num_cov_clauses[best_stat];
+ current->nconditions -= num_cond_clauses[best_stat];
+ current->nstats -= 1;
+ current->stats[step] = 0;
+
+ /* mark the statistics as usable again */
+ ruled_out[best_stat] = -1;
+ }
+
+ /* reset all statistics eliminated in this step */
+ for (i = 0; i < nmvstats; i++)
+ if (ruled_out[i] == step)
+ ruled_out[i] = -1;
+
+ /* free everything allocated in this step */
+ pfree(covered_clauses);
+ pfree(attnum_counts);
+ pfree(num_cov_clauses);
+ pfree(num_cov_columns);
+ pfree(num_cond_clauses);
+ pfree(num_cond_columns);
+}
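
To make the gain metric used by the greedy step concrete, here is a
minimal standalone sketch (not part of the patch - the counts are made
up). Each remaining statistics is scored by the share of already-covered
('condition') columns among all columns of the clauses it covers, and
the highest score wins:

#include <stdio.h>

int
main(void)
{
    /* per-statistics counts, as accumulated by the greedy step */
    int     num_cond_columns[] = {2, 4, 1}; /* columns in covered conditions */
    int     num_cov_columns[] = {4, 5, 3};  /* columns in all covered clauses */
    int     nmvstats = 3;

    int     i;
    int     best_stat = -1;
    double  max_gain = -1.0;

    for (i = 0; i < nmvstats; i++)
    {
        /* prefer stats that reuse many conditions per covered column */
        double  gain = num_cond_columns[i] / (double) num_cov_columns[i];

        if (gain > max_gain)
        {
            max_gain = gain;
            best_stat = i;
        }
    }

    /* prints "best stat: 1 (gain 0.80)" */
    printf("best stat: %d (gain %.2f)\n", best_stat, max_gain);
    return 0;
}
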
+
+/*
+ * Chooses the combination of statistics optimal for estimating
+ * a particular clause list.
+ *
+ * This only handles the 'preparation' phase shared by the exhaustive
+ * and greedy implementations (see the previous functions), mostly
+ * trying to reduce the size of the problem (eliminating clauses and
+ * statistics that can't really be used in the solution).
+ *
+ * It also precomputes bitmaps for attributes covered by clauses and
+ * statistics, so that we don't need to do that over and over in the
+ * actual optimizations (as it's both CPU and memory intensive).
+ *
+ * TODO This will probably have to consider compatibility of clauses,
+ * because 'dependencies' will probably work only with equality
+ * clauses.
+ *
+ * TODO Another way to make the optimization problems smaller might
+ * be splitting the statistics into several disjoint subsets, i.e.
+ * if we can split the graph of statistics (after the elimination)
+ * into multiple components (so that stats in different components
+ * share no attributes), we can do the optimization for each
+ * component separately.
+ *
+ * TODO If we could compute what is a "perfect solution" maybe we could
+ * terminate the search after reaching ~90% of it? Say, if we knew
+ * that we can cover 10 clauses and reuse 8 dependencies, maybe
+ * covering 9 clauses and 7 dependencies would be OK?
+ */
+static List*
+choose_mv_statistics(PlannerInfo *root, List *stats,
+ List *clauses, List *conditions,
+ Oid varRelid, SpecialJoinInfo *sjinfo)
+{
+ int i;
+ mv_solution_t *best = NULL;
+ List *result = NIL;
+
+ int nmvstats;
+ MVStatisticInfo *mvstats;
+
+ /* we only work with MCV lists and histograms here */
+ int type = (MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST);
+
+ bool *clause_cover_map = NULL,
+ *condition_cover_map = NULL;
+ int *ruled_out = NULL;
+
+ /* build bitmapsets for all stats and clauses */
+ Bitmapset **stats_attnums;
+ Bitmapset **clauses_attnums;
+ Bitmapset **conditions_attnums;
+
+ int nclauses, nconditions;
+ Node ** clauses_array;
+ Node ** conditions_array;
+
+ /* copy lists, so that we can free them during elimination easily */
+ clauses = list_copy(clauses);
+ conditions = list_copy(conditions);
+ stats = list_copy(stats);
+
+ /*
+ * Reduce the optimization problem size as much as possible.
+ *
+ * Eliminate clauses and conditions not covered by any statistics,
+ * or statistics not matching at least two attributes (one of them
+ * has to be in a regular clause).
+ *
+ * It's possible that removing a statistics in one iteration
+ * eliminates a clause in the next one, so we'll repeat this until we
+ * eliminate no clauses/stats in that iteration.
+ *
+ * This can only happen after eliminating a statistics - clauses are
+ * eliminated first, so statistics always reflect that.
+ */
+ while (true)
+ {
+ List *tmp;
+
+ Bitmapset *compatible_attnums = NULL;
+ Bitmapset *condition_attnums = NULL;
+ Bitmapset *all_attnums = NULL;
+
+ /*
+ * Clauses
+ *
+ * Walk through clauses and keep only those covered by at least
+ * one of the statistics we still have. We'll also keep info
+ * about attnums in clauses (without conditions) so that we can
+ * ignore stats covering just conditions (which is pointless).
+ */
+ tmp = filter_clauses(root, varRelid, sjinfo, type,
+ stats, clauses, &compatible_attnums);
+
+ /* discard the original list */
+ list_free(clauses);
+ clauses = tmp;
+
+ /*
+ * Conditions
+ *
+ * Walk through clauses and keep only those covered by at least
+ * one of the statistics we still have. Also, collect bitmap of
+ * attributes so that we can make sure we add at least one new
+ * attribute (by comparing with clauses).
+ */
+ if (conditions != NIL)
+ {
+ tmp = filter_clauses(root, varRelid, sjinfo, type,
+ stats, conditions, &condition_attnums);
+
+ /* discard the original list */
+ list_free(conditions);
+ conditions = tmp;
+ }
+
+ /* get a union of attnums (from conditions and new clauses) */
+ all_attnums = bms_union(compatible_attnums, condition_attnums);
+
+ /*
+ * Statistics
+ *
+ * Walk through statistics and only keep those covering at least
+ * one new attribute (excluding conditions) and at least two attributes
+ * in both clauses and conditions.
+ */
+ tmp = filter_stats(stats, compatible_attnums, all_attnums);
+
+ /* if we've not eliminated anything, terminate */
+ if (list_length(stats) == list_length(tmp))
+ break;
+
+ /* work only with filtered statistics from now */
+ list_free(stats);
+ stats = tmp;
+ }
+
+ /* only do the optimization if we have clauses/statistics */
+ if ((list_length(stats) == 0) || (list_length(clauses) == 0))
+ return NULL;
+
+ /* remove redundant stats (stats covered by another stats) */
+ stats = filter_redundant_stats(stats, clauses, conditions);
+
+ /*
+ * TODO We should sort the stats to make the order deterministic,
+ * otherwise we may get different estimates on different
+ * executions - if there are multiple "equally good" solutions,
+ * we'll keep the first solution we see.
+ *
+ * Sorting by OID probably is not the right solution though,
+ * because we'd like it to be somehow reproducible,
+ * irrespective of the order of ADD STATISTICS commands.
+ * So maybe statkeys?
+ */
+ mvstats = make_stats_array(stats, &nmvstats);
+ stats_attnums = make_stats_attnums(mvstats, nmvstats);
+
+ /* collect clauses and a bitmap of attnums */
+ clauses_array = make_clauses_array(clauses, &nclauses);
+ clauses_attnums = make_clauses_attnums(root, varRelid, sjinfo, type,
+ clauses_array, nclauses);
+
+ /* collect conditions and bitmap of attnums */
+ conditions_array = make_clauses_array(conditions, &nconditions);
+ conditions_attnums = make_clauses_attnums(root, varRelid, sjinfo, type,
+ conditions_array, nconditions);
+
+ /*
+ * Build bitmaps with info about which clauses/conditions are
+ * covered by each statistics (so that we don't need to call the
+ * bms_is_subset over and over again).
+ */
+ clause_cover_map = make_cover_map(stats_attnums, nmvstats,
+ clauses_attnums, nclauses);
+
+ condition_cover_map = make_cover_map(stats_attnums, nmvstats,
+ conditions_attnums, nconditions);
+
+ ruled_out = (int*)palloc0(nmvstats * sizeof(int));
+
+ /* no stats are ruled out by default */
+ for (i = 0; i < nmvstats; i++)
+ ruled_out[i] = -1;
+
+ /* do the optimization itself */
+ if (mvstat_search_type == MVSTAT_SEARCH_EXHAUSTIVE)
+ choose_mv_statistics_exhaustive(root, 0,
+ nmvstats, mvstats, stats_attnums,
+ nclauses, clauses_array, clauses_attnums,
+ nconditions, conditions_array, conditions_attnums,
+ clause_cover_map, condition_cover_map,
+ ruled_out, NULL, &best);
+ else
+ choose_mv_statistics_greedy(root, 0,
+ nmvstats, mvstats, stats_attnums,
+ nclauses, clauses_array, clauses_attnums,
+ nconditions, conditions_array, conditions_attnums,
+ clause_cover_map, condition_cover_map,
+ ruled_out, NULL, &best);
+
+ /* create a list of statistics from the array */
+ if (best != NULL)
+ {
+ for (i = 0; i < best->nstats; i++)
+ {
+ MVStatisticInfo *info = makeNode(MVStatisticInfo);
+ memcpy(info, &mvstats[best->stats[i]], sizeof(MVStatisticInfo));
+ result = lappend(result, info);
+ }
+ pfree(best);
+ }
+
+ /* cleanup (maybe leave it up to the memory context?) */
+ for (i = 0; i < nmvstats; i++)
+ bms_free(stats_attnums[i]);
+
+ for (i = 0; i < nclauses; i++)
+ bms_free(clauses_attnums[i]);
+
+ for (i = 0; i < nconditions; i++)
+ bms_free(conditions_attnums[i]);
+
+ pfree(stats_attnums);
+ pfree(clauses_attnums);
+ pfree(conditions_attnums);
+
+ pfree(clauses_array);
+ pfree(conditions_array);
+ pfree(clause_cover_map);
+ pfree(condition_cover_map);
+ pfree(ruled_out);
+ pfree(mvstats);
+
+ list_free(clauses);
+ list_free(conditions);
+ list_free(stats);
+
+ return result;
+}
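
The reduction loop at the top of this function is probably easier to
see on a toy example. The following standalone sketch (not part of the
patch) uses plain bit masks as stand-ins for the attnum bitmapsets and
implements just the two elimination rules - drop clauses covered by no
remaining statistics, drop statistics matching fewer than two
attributes of the remaining clauses - repeated until a fixpoint:

#include <stdio.h>

#define NCLAUSES 4
#define NSTATS 3

static int
count_bits(unsigned x)
{
    int n = 0;
    while (x) { n += (x & 1); x >>= 1; }
    return n;
}

int
main(void)
{
    /* bit k set = "references column k" (stand-in for bitmapsets) */
    unsigned clauses[NCLAUSES] = {0x03, 0x04, 0x18, 0x20};
    unsigned stats[NSTATS] = {0x07, 0x18, 0x40};
    int c, s, changed = 1;

    while (changed)
    {
        changed = 0;

        /* keep only clauses fully covered by some remaining statistics */
        for (c = 0; c < NCLAUSES; c++)
        {
            int covered = 0;

            if (clauses[c] == 0)
                continue;

            for (s = 0; s < NSTATS; s++)
                if (stats[s] != 0 && (clauses[c] & ~stats[s]) == 0)
                    covered = 1;

            if (!covered)
            {
                clauses[c] = 0;
                changed = 1;
            }
        }

        /* keep only stats matching >= 2 attributes of remaining clauses */
        for (s = 0; s < NSTATS; s++)
        {
            unsigned live = 0;

            if (stats[s] == 0)
                continue;

            for (c = 0; c < NCLAUSES; c++)
                live |= clauses[c];

            if (count_bits(stats[s] & live) < 2)
            {
                stats[s] = 0;
                changed = 1;
            }
        }
    }

    /* prints: stat 0 kept / stat 1 kept / stat 2 eliminated */
    for (s = 0; s < NSTATS; s++)
        printf("stat %d %s\n", s, (stats[s] != 0) ? "kept" : "eliminated");

    return 0;
}
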
+
+
+/*
+ * This splits the clauses list into two parts - one containing clauses
+ * that will be evaluated using the chosen statistics, and the remaining
+ * clauses (either non-mvcompatible, or not related to the histogram).
+ */
+static List *
+clauselist_mv_split(PlannerInfo *root, SpecialJoinInfo *sjinfo,
+ List *clauses, Oid varRelid, List **mvclauses,
+ MVStatisticInfo *mvstats, int types)
+{
+ int i;
+ ListCell *l;
+ List *non_mvclauses = NIL;
+
+ /* FIXME is there a better way to get info on int2vector? */
+ int2vector * attrs = mvstats->stakeys;
+ int numattrs = mvstats->stakeys->dim1;
+
+ Bitmapset *mvattnums = NULL;
+
+ /* build bitmap of attributes covered by the stats, so we can
+ * do bms_is_subset later */
+ for (i = 0; i < numattrs; i++)
+ mvattnums = bms_add_member(mvattnums, attrs->values[i]);
+
+ /* erase the list of mv-compatible clauses */
+ *mvclauses = NIL;
+
+ foreach (l, clauses)
+ {
+ bool match = false; /* by default not mv-compatible */
+ Bitmapset *attnums = NULL;
+ Node *clause = (Node *) lfirst(l);
+
+ if (clause_is_mv_compatible(root, clause, varRelid, NULL,
+ &attnums, sjinfo, types))
+ {
+ /* are all the attributes part of the selected stats? */
+ if (bms_is_subset(attnums, mvattnums))
+ match = true;
+ }
+
+ /*
+ * The clause matches the selected stats, so put it to the list
+ * of mv-compatible clauses. Otherwise, keep it in the list of
+ * 'regular' clauses (that may be selected later).
+ */
+ if (match)
+ *mvclauses = lappend(*mvclauses, clause);
+ else
+ non_mvclauses = lappend(non_mvclauses, clause);
+ }
+
+ /*
+ * Perform regular estimation using the clauses incompatible
+ * with the chosen histogram (or MV stats in general).
+ */
+ return non_mvclauses;
+
+}
+
+/*
+ * Determines whether the clause is compatible with multivariate stats,
+ * and if it is, returns some additional information - varno (index
+ * into simple_rte_array) and a bitmap of attributes. This is then
+ * used to fetch related multivariate statistics.
+ *
+ * At this moment we only support basic conditions of the form
+ *
+ * variable OP constant
+ *
+ * where OP is one of [=,<,<=,>=,>] (which is however determined by
+ * looking at the associated function for estimating selectivity, just
+ * like with the single-dimensional case).
+ *
+ * TODO Support 'OR clauses' - shouldn't be all that difficult to
* evaluate them using multivariate stats.
*/
static bool
@@ -1539,10 +2598,10 @@ clause_is_mv_compatible(PlannerInfo *root, Node *clause, Oid varRelid,
return true;
}
- else if (or_clause(clause) || and_clause(clause))
+ else if (or_clause(clause) || and_clause(clause) || not_clause(clause))
{
/*
- * AND/OR-clauses are supported if all sub-clauses are supported
+ * AND/OR/NOT-clauses are supported if all sub-clauses are supported
*
* TODO We might support mixed case, where some of the clauses
* are supported and some are not, and treat all supported
@@ -1552,7 +2611,10 @@ clause_is_mv_compatible(PlannerInfo *root, Node *clause, Oid varRelid,
*
* TODO For RestrictInfo above an OR-clause, we might use the
* orclause with nested RestrictInfo - we won't have to
- * call pull_varnos() for each clause, saving time.
+ * call pull_varnos() for each clause, saving time.
+ *
+ * TODO Perhaps this needs a bit more thought for functional
+ * dependencies? Those don't quite work for NOT cases.
*/
Bitmapset *tmp = NULL;
ListCell *l;
@@ -1572,6 +2634,51 @@ clause_is_mv_compatible(PlannerInfo *root, Node *clause, Oid varRelid,
return false;
}
+
+static Bitmapset *
+clause_mv_get_attnums(PlannerInfo *root, Node *clause)
+{
+ Bitmapset * attnums = NULL;
+
+ /* Extract clause from restrict info, if needed. */
+ if (IsA(clause, RestrictInfo))
+ clause = (Node*)((RestrictInfo*)clause)->clause;
+
+ /*
+ * Only simple opclauses and IS NULL tests are compatible with
+ * multivariate stats at this point.
+ */
+ if ((is_opclause(clause))
+ && (list_length(((OpExpr *) clause)->args) == 2))
+ {
+ OpExpr *expr = (OpExpr *) clause;
+
+ if (IsA(linitial(expr->args), Var))
+ attnums = bms_add_member(attnums,
+ ((Var*)linitial(expr->args))->varattno);
+ else
+ attnums = bms_add_member(attnums,
+ ((Var*)lsecond(expr->args))->varattno);
+ }
+ else if (IsA(clause, NullTest)
+ && IsA(((NullTest*)clause)->arg, Var))
+ {
+ attnums = bms_add_member(attnums,
+ ((Var*)((NullTest*)clause)->arg)->varattno);
+ }
+ else if (or_clause(clause) || and_clause(clause) || not_clause(clause))
+ {
+ ListCell *l;
+ foreach (l, ((BoolExpr*)clause)->args)
+ {
+ attnums = bms_join(attnums,
+ clause_mv_get_attnums(root, (Node*)lfirst(l)));
+ }
+ }
+
+ return attnums;
+}
+
/*
* Performs reduction of clauses using functional dependencies, i.e.
* removes clauses that are considered redundant. It simply walks
@@ -2223,22 +3330,26 @@ get_varattnos(Node * node, Index relid)
* as the clauses are processed (and skip items that are 'match').
*/
static Selectivity
-clauselist_mv_selectivity_mcvlist(PlannerInfo *root, List *clauses,
- MVStatisticInfo *mvstats, bool *fullmatch,
- Selectivity *lowsel)
+clauselist_mv_selectivity_mcvlist(PlannerInfo *root, MVStatisticInfo *mvstats,
+ List *clauses, List *conditions, bool is_or,
+ bool *fullmatch, Selectivity *lowsel)
{
int i;
Selectivity s = 0.0;
+ Selectivity t = 0.0;
Selectivity u = 0.0;
MCVList mcvlist = NULL;
+
int nmatches = 0;
+ int nconditions = 0;
/* match/mismatch bitmap for each MCV item */
char * matches = NULL;
+ char * condition_matches = NULL;
Assert(clauses != NIL);
- Assert(list_length(clauses) >= 2);
+ Assert(list_length(clauses) >= 1);
/* there's no MCV list built yet */
if (! mvstats->mcv_built)
@@ -2249,32 +3360,85 @@ clauselist_mv_selectivity_mcvlist(PlannerInfo *root, List *clauses,
Assert(mcvlist != NULL);
Assert(mcvlist->nitems > 0);
- /* by default all the MCV items match the clauses fully */
- matches = palloc0(sizeof(char) * mcvlist->nitems);
- memset(matches, MVSTATS_MATCH_FULL, sizeof(char)*mcvlist->nitems);
-
/* number of matching MCV items */
nmatches = mcvlist->nitems;
+ nconditions = mcvlist->nitems;
+ /*
+ * Bitmap of bucket matches (mismatch, partial, full).
+ *
+ * For AND clauses all buckets match (and we'll eliminate them).
+ * For OR clauses no buckets match (and we'll add them).
+ *
+ * We only need to do the memset for AND clauses (for OR clauses
+ * it's already set correctly by the palloc0).
+ */
+ matches = palloc0(sizeof(char) * nmatches);
+
+ if (! is_or) /* AND-clause */
+ memset(matches, MVSTATS_MATCH_FULL, sizeof(char)*nmatches);
+
+ /* Conditions are treated as an AND clause, so they match by default. */
+ condition_matches = palloc0(sizeof(char) * nconditions);
+ memset(condition_matches, MVSTATS_MATCH_FULL, sizeof(char)*nconditions);
+
+ /*
+ * build the match bitmap for the conditions (conditions are always
+ * connected by AND)
+ */
+ if (conditions != NIL)
+ nconditions = update_match_bitmap_mcvlist(root, conditions,
+ mvstats->stakeys, mcvlist,
+ nconditions, condition_matches,
+ lowsel, fullmatch, false);
+
+ /*
+ * build the match bitmap for the estimated clauses
+ *
+ * TODO This evaluates the clauses for all MCV items, even those
+ * ruled out by the conditions. The final result should be the
+ * same, but it might be faster.
+ */
nmatches = update_match_bitmap_mcvlist(root, clauses,
mvstats->stakeys, mcvlist,
- nmatches, matches,
- lowsel, fullmatch, false);
+ ((is_or) ? 0 : nmatches), matches,
+ lowsel, fullmatch, is_or);
/* sum frequencies for all the matching MCV items */
for (i = 0; i < mcvlist->nitems; i++)
{
- /* used to 'scale' for MCV lists not covering all tuples */
+ /*
+ * Find out what part of the data is covered by the MCV list,
+ * so that we can 'scale' the selectivity properly (e.g. when
+ * only 50% of the sample items got into the MCV, and the rest
+ * is either in a histogram, or not covered by stats).
+ *
+ * TODO This might be handled by keeping a global "frequency"
+ * for the whole list, which might save us a bit of time
+ * spent on accessing the not-matching part of the MCV list.
+ * Although it's likely in a cache, so it's very fast.
+ */
u += mcvlist->items[i]->frequency;
+ /* skip MCV items not matching the conditions */
+ if (condition_matches[i] == MVSTATS_MATCH_NONE)
+ continue;
+
if (matches[i] != MVSTATS_MATCH_NONE)
s += mcvlist->items[i]->frequency;
+
+ t += mcvlist->items[i]->frequency;
}
pfree(matches);
+ pfree(condition_matches);
pfree(mcvlist);
- return s*u;
+ /* no condition matches */
+ if (t == 0.0)
+ return (Selectivity)0.0;
+
+ return (s / t) * u;
}
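
The final formula deserves a small worked example. The sketch below
(standalone, not part of the patch; frequencies and match flags are
made up) computes s, t and u the same way - s/t is the conditional
selectivity P(clauses | conditions), scaled by u, the fraction of the
data the MCV list covers:

#include <stdio.h>

int
main(void)
{
    /* toy MCV item frequencies and (pre-computed) match flags */
    double freq[4] = {0.30, 0.20, 0.10, 0.05};
    int cond_match[4] = {1, 1, 0, 1};
    int clause_match[4] = {1, 0, 0, 1};

    double s = 0.0, t = 0.0, u = 0.0;
    int i;

    for (i = 0; i < 4; i++)
    {
        u += freq[i];           /* total frequency covered by the MCV */

        if (!cond_match[i])
            continue;           /* ruled out by the conditions */

        t += freq[i];           /* matches the conditions */

        if (clause_match[i])
            s += freq[i];       /* matches conditions and clauses */
    }

    /* prints "selectivity = 0.4136" */
    printf("selectivity = %.4f\n", (t == 0.0) ? 0.0 : (s / t) * u);
    return 0;
}
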
/*
@@ -2567,64 +3731,57 @@ update_match_bitmap_mcvlist(PlannerInfo *root, List *clauses,
}
}
}
- else if (or_clause(clause) || and_clause(clause))
+ else if (or_clause(clause) || and_clause(clause) || not_clause(clause))
{
/* AND/OR clause, with all clauses compatible with the selected MV stat */
int i;
- BoolExpr *orclause = ((BoolExpr*)clause);
- List *orclauses = orclause->args;
+ List *tmp_clauses = ((BoolExpr*)clause)->args;
/* match/mismatch bitmap for each MCV item */
- int or_nmatches = 0;
- char * or_matches = NULL;
+ int tmp_nmatches = 0;
+ char * tmp_matches = NULL;
- Assert(orclauses != NIL);
- Assert(list_length(orclauses) >= 2);
+ Assert(tmp_clauses != NIL);
+ Assert((list_length(tmp_clauses) >= 2) || (not_clause(clause) && (list_length(tmp_clauses)==1)));
/* number of matching MCV items */
- or_nmatches = mcvlist->nitems;
+ tmp_nmatches = (or_clause(clause)) ? 0 : mcvlist->nitems;
/* by default none of the MCV items matches the clauses */
- or_matches = palloc0(sizeof(char) * or_nmatches);
+ tmp_matches = palloc0(sizeof(char) * mcvlist->nitems);
- if (or_clause(clause))
- {
- /* OR clauses assume nothing matches, initially */
- memset(or_matches, MVSTATS_MATCH_NONE, sizeof(char)*or_nmatches);
- or_nmatches = 0;
- }
- else
- {
- /* AND clauses assume nothing matches, initially */
- memset(or_matches, MVSTATS_MATCH_FULL, sizeof(char)*or_nmatches);
- }
+ /* AND (and NOT) clauses assume everything matches, initially */
+ if (! or_clause(clause))
+ memset(tmp_matches, MVSTATS_MATCH_FULL, sizeof(char)*mcvlist->nitems);
/* build the match bitmap for the OR-clauses */
- or_nmatches = update_match_bitmap_mcvlist(root, orclauses,
+ tmp_nmatches = update_match_bitmap_mcvlist(root, tmp_clauses,
stakeys, mcvlist,
- or_nmatches, or_matches,
+ tmp_nmatches, tmp_matches,
lowsel, fullmatch, or_clause(clause));
/* merge the bitmap into the existing one*/
for (i = 0; i < mcvlist->nitems; i++)
{
+ /* if this is a NOT clause, we need to invert the results first */
+ if (not_clause(clause))
+ tmp_matches[i] = (MVSTATS_MATCH_FULL - tmp_matches[i]);
+
/*
* To AND-merge the bitmaps, a MIN() semantics is used.
* For OR-merge, use MAX().
*
* FIXME this does not decrease the number of matches
*/
- UPDATE_RESULT(matches[i], or_matches[i], is_or);
+ UPDATE_RESULT(matches[i], tmp_matches[i], is_or);
}
- pfree(or_matches);
+ pfree(tmp_matches);
}
else
- {
elog(ERROR, "unknown clause type: %d", clause->type);
- }
}
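
The UPDATE_RESULT merging used above boils down to MIN() for AND and
MAX() for OR, with NOT handled by inverting the sub-clause result
first. A standalone sketch (not part of the patch; the numeric values
of the MVSTATS_MATCH_* constants are my assumption, chosen so that the
ordering NONE < PARTIAL < FULL holds):

#include <stdio.h>

#define MATCH_NONE      0
#define MATCH_PARTIAL   1
#define MATCH_FULL      2

/* AND-merge = MIN, OR-merge = MAX (mirrors UPDATE_RESULT) */
#define MERGE(dst, src, is_or) \
    ((dst) = (is_or) ? (((dst) > (src)) ? (dst) : (src)) \
                     : (((dst) < (src)) ? (dst) : (src)))

int
main(void)
{
    char matches[3] = {MATCH_FULL, MATCH_PARTIAL, MATCH_NONE};
    char sub_matches[3] = {MATCH_NONE, MATCH_FULL, MATCH_PARTIAL};
    int i;

    for (i = 0; i < 3; i++)
    {
        /* a NOT clause first inverts the sub-clause result */
        char inverted = MATCH_FULL - sub_matches[i];

        /* AND-merge the inverted bitmap into the outer one */
        MERGE(matches[i], inverted, 0);

        /* prints 2, 0, 0 */
        printf("item %d: %d\n", i, matches[i]);
    }
    return 0;
}
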
/*
@@ -2682,15 +3839,18 @@ update_match_bitmap_mcvlist(PlannerInfo *root, List *clauses,
* this is not uncommon, but for histograms it's not that clear.
*/
static Selectivity
-clauselist_mv_selectivity_histogram(PlannerInfo *root, List *clauses,
- MVStatisticInfo *mvstats)
+clauselist_mv_selectivity_histogram(PlannerInfo *root, MVStatisticInfo *mvstats,
+ List *clauses, List *conditions, bool is_or)
{
int i;
Selectivity s = 0.0;
+ Selectivity t = 0.0;
Selectivity u = 0.0;
int nmatches = 0;
+ int nconditions = 0;
char *matches = NULL;
+ char *condition_matches = NULL;
MVSerializedHistogram mvhist = NULL;
@@ -2701,27 +3861,57 @@ clauselist_mv_selectivity_histogram(PlannerInfo *root, List *clauses,
/* There may be no histogram in the stats (check hist_built flag) */
mvhist = load_mv_histogram(mvstats->mvoid);
- Assert (mvhist != NULL);
- Assert (clauses != NIL);
- Assert (list_length(clauses) >= 2);
+ Assert (mvhist != NULL);
+ Assert (clauses != NIL);
+ Assert (list_length(clauses) >= 1);
+
+ nmatches = mvhist->nbuckets;
+ nconditions = mvhist->nbuckets;
+
+ /*
+ * Bitmap of bucket matches (mismatch, partial, full).
+ *
+ * For AND clauses all buckets match (and we'll eliminate them).
+ * For OR clauses no buckets match (and we'll add them).
+ *
+ * We only need to do the memset for AND clauses (for OR clauses
+ * it's already set correctly by the palloc0).
+ */
+ matches = palloc0(sizeof(char) * nmatches);
+
+ if (! is_or) /* AND-clause */
+ memset(matches, MVSTATS_MATCH_FULL, sizeof(char)*nmatches);
+
+ /* Conditions are treated as an AND clause, so they match by default. */
+ condition_matches = palloc0(sizeof(char)*nconditions);
+ memset(condition_matches, MVSTATS_MATCH_FULL, sizeof(char)*nconditions);
/*
- * Bitmap of bucket matches (mismatch, partial, full). by default
- * all buckets fully match (and we'll eliminate them).
+ * build the match bitmap for the conditions (conditions are always
+ * connected by AND)
*/
- matches = palloc0(sizeof(char) * mvhist->nbuckets);
- memset(matches, MVSTATS_MATCH_FULL, sizeof(char)*mvhist->nbuckets);
-
- nmatches = mvhist->nbuckets;
+ if (conditions != NIL)
+ update_match_bitmap_histogram(root, conditions,
+ mvstats->stakeys, mvhist,
+ nconditions, condition_matches, false);
- /* build the match bitmap */
+ /*
+ * build the match bitmap for the estimated clauses
+ *
+ * TODO This evaluates the clauses for all buckets, even those
+ * ruled out by the conditions. The final result should be
+ * the same, but it might be faster.
+ */
update_match_bitmap_histogram(root, clauses,
mvstats->stakeys, mvhist,
- nmatches, matches, false);
+ ((is_or) ? 0 : nmatches), matches,
+ is_or);
/* now, walk through the buckets and sum the selectivities */
for (i = 0; i < mvhist->nbuckets; i++)
{
+ float coeff = 1.0;
+
/*
* Find out what part of the data is covered by the histogram,
* so that we can 'scale' the selectivity properly (e.g. when
@@ -2735,17 +3925,35 @@ clauselist_mv_selectivity_histogram(PlannerInfo *root, List *clauses,
*/
u += mvhist->buckets[i]->ntuples;
+ /* skip buckets not matching the conditions */
+ if (condition_matches[i] == MVSTATS_MATCH_NONE)
+ continue;
+ else if (condition_matches[i] == MVSTATS_MATCH_PARTIAL)
+ coeff = 0.5;
+
+ t += coeff * mvhist->buckets[i]->ntuples;
+
if (matches[i] == MVSTATS_MATCH_FULL)
- s += mvhist->buckets[i]->ntuples;
+ s += coeff * mvhist->buckets[i]->ntuples;
else if (matches[i] == MVSTATS_MATCH_PARTIAL)
- s += 0.5 * mvhist->buckets[i]->ntuples;
+ /*
+ * TODO If both conditions and clauses match partially, this
+ * will use a 0.25 match - not sure if that's the right
+ * solution, but it seems about right.
+ */
+ s += coeff * 0.5 * mvhist->buckets[i]->ntuples;
}
/* release the allocated bitmap and deserialized histogram */
pfree(matches);
+ pfree(condition_matches);
pfree(mvhist);
- return s * u;
+ /* no condition matches */
+ if (t == 0.0)
+ return (Selectivity)0.0;
+
+ return (s / t) * u;
}
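
The 0.5 / 0.25 weighting of partial matches is easiest to check on toy
data. A standalone sketch (not part of the patch; bucket frequencies
and match levels are made up) - a partially matching condition halves
a bucket's contribution, a partially matching clause halves it again:

#include <stdio.h>

#define MATCH_NONE      0
#define MATCH_PARTIAL   1
#define MATCH_FULL      2

int
main(void)
{
    /* toy bucket frequencies and (pre-computed) match levels */
    double ntuples[3] = {0.1, 0.2, 0.3};
    int cond[3] = {MATCH_FULL, MATCH_PARTIAL, MATCH_NONE};
    int clause[3] = {MATCH_PARTIAL, MATCH_PARTIAL, MATCH_FULL};

    double s = 0.0, t = 0.0, u = 0.0;
    int i;

    for (i = 0; i < 3; i++)
    {
        double coeff = 1.0;

        u += ntuples[i];

        if (cond[i] == MATCH_NONE)
            continue;
        else if (cond[i] == MATCH_PARTIAL)
            coeff = 0.5;        /* partially matching condition */

        t += coeff * ntuples[i];

        if (clause[i] == MATCH_FULL)
            s += coeff * ntuples[i];
        else if (clause[i] == MATCH_PARTIAL)
            s += coeff * 0.5 * ntuples[i];  /* 0.25 if both partial */
    }

    /* prints "selectivity = 0.3000" */
    printf("selectivity = %.4f\n", (t == 0.0) ? 0.0 : (s / t) * u);
    return 0;
}
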
/*
@@ -2775,7 +3983,7 @@ update_match_bitmap_histogram(PlannerInfo *root, List *clauses,
{
int i;
ListCell * l;
-
+
/*
* Used for caching function calls, only once per deduplicated value.
*
@@ -2818,7 +4026,7 @@ update_match_bitmap_histogram(PlannerInfo *root, List *clauses,
FmgrInfo opproc; /* operator */
fmgr_info(get_opcode(expr->opno), &opproc);
-
+
/* reset the cache (per clause) */
memset(callcache, 0, mvhist->nbuckets);
@@ -2870,7 +4078,7 @@ update_match_bitmap_histogram(PlannerInfo *root, List *clauses,
/* histogram boundaries */
Datum minval, maxval;
-
+
/* values from the call cache */
char mincached, maxcached;
@@ -2959,7 +4167,7 @@ update_match_bitmap_histogram(PlannerInfo *root, List *clauses,
}
/*
- * Now check whether the upper boundary is below the constant (in that
+ * Now check whether the constant is below the upper boundary (in that
* case it's a partial match).
*/
if (! maxcached)
@@ -2978,8 +4186,32 @@ update_match_bitmap_histogram(PlannerInfo *root, List *clauses,
else
tmp = !(maxcached & 0x02); /* extract the result (reverse) */
- if (tmp) /* partial match */
+ if (tmp)
+ {
+ /* partial match */
UPDATE_RESULT(matches[i], MVSTATS_MATCH_PARTIAL, is_or);
+ continue;
+ }
+
+
+ /*
+ * And finally check whether the constant is above the upper
+ * boundary (in that case it's a full match).
+ *
+ * XXX We need to do this because of the OR clauses (which start with no
+ * matches and we incrementally add more and more matches), but maybe
+ * we don't need to do the check and can just do UPDATE_RESULT?
+ */
+ tmp = DatumGetBool(FunctionCall2Coll(&opproc,
+ DEFAULT_COLLATION_OID,
+ maxval,
+ cst->constvalue));
+
+ if (tmp)
+ {
+ /* full match */
+ UPDATE_RESULT(matches[i], MVSTATS_MATCH_FULL, is_or);
+ }
}
else /* (const < var) */
@@ -3018,15 +4250,36 @@ update_match_bitmap_histogram(PlannerInfo *root, List *clauses,
DEFAULT_COLLATION_OID,
minval,
cst->constvalue));
-
/* Update the cache. */
callcache[bucket->min[idx]] = (tmp) ? 0x03 : 0x01;
- }
+ }
else
tmp = (mincached & 0x02); /* extract the result */
- if (tmp) /* partial match */
+ if (tmp)
+ {
+ /* partial match */
UPDATE_RESULT(matches[i], MVSTATS_MATCH_PARTIAL, is_or);
+ continue;
+ }
+
+ /*
+ * Now check whether the constant is below the lower boundary (in
+ * that case it's a full match).
+ *
+ * XXX We need to do this because of the OR clauses (which start with no
+ * matches and we incrementally add more and more matches), but maybe
+ * we don't need to do the check and can just do UPDATE_RESULT?
+ */
+ tmp = DatumGetBool(FunctionCall2Coll(&opproc,
+ DEFAULT_COLLATION_OID,
+ cst->constvalue,
+ minval));
+
+ if (tmp)
+ /* full match */
+ UPDATE_RESULT(matches[i], MVSTATS_MATCH_FULL, is_or);
+
}
break;
@@ -3082,8 +4335,29 @@ update_match_bitmap_histogram(PlannerInfo *root, List *clauses,
tmp = !(mincached & 0x02); /* extract the result */
if (tmp)
+ {
/* partial match */
UPDATE_RESULT(matches[i], MVSTATS_MATCH_PARTIAL, is_or);
+ continue;
+ }
+
+ /*
+ * Now check whether the lower boundary is below the constant (in that
+ * case it's a full match).
+ *
+ * XXX We need to do this because of the OR clauses (which start with no
+ * matches and we incrementally add more and more matches), but maybe
+ * we don't need to do the check and can just do UPDATE_RESULT?
+ */
+ tmp = DatumGetBool(FunctionCall2Coll(&opproc,
+ DEFAULT_COLLATION_OID,
+ minval,
+ cst->constvalue));
+
+ if (tmp)
+ /* full match */
+ UPDATE_RESULT(matches[i], MVSTATS_MATCH_FULL, is_or);
+
}
else /* (const > var) */
{
@@ -3129,8 +4403,30 @@ update_match_bitmap_histogram(PlannerInfo *root, List *clauses,
tmp = (maxcached & 0x02); /* extract the result */
if (tmp)
+ {
/* partial match */
UPDATE_RESULT(matches[i], MVSTATS_MATCH_PARTIAL, is_or);
+ continue;
+ }
+
+ /*
+ * Now check whether the upper boundary is below the constant (in that
+ * case it's a full match).
+ *
+ * XXX We need to do this because of the OR clauses (which start with no
+ * matches and we incrementally add more and more matches), but maybe
+ * we don't need to do the check and can just do UPDATE_RESULT?
+ */
+ tmp = DatumGetBool(FunctionCall2Coll(&opproc,
+ DEFAULT_COLLATION_OID,
+ cst->constvalue,
+ maxval));
+
+ if (tmp)
+ /* full match */
+ UPDATE_RESULT(matches[i], MVSTATS_MATCH_FULL, is_or);
+
}
break;
@@ -3195,6 +4491,7 @@ update_match_bitmap_histogram(PlannerInfo *root, List *clauses,
else
tmp = (maxcached & 0x02); /* extract the result */
+
if (tmp)
{
/* no match */
@@ -3246,64 +4543,57 @@ update_match_bitmap_histogram(PlannerInfo *root, List *clauses,
UPDATE_RESULT(matches[i], MVSTATS_MATCH_NONE, is_or);
}
}
- else if (or_clause(clause) || and_clause(clause))
+ else if (or_clause(clause) || and_clause(clause) || not_clause(clause))
{
/* AND/OR clause, with all clauses compatible with the selected MV stat */
int i;
- BoolExpr *orclause = ((BoolExpr*)clause);
- List *orclauses = orclause->args;
+ List *tmp_clauses = ((BoolExpr*)clause)->args;
/* match/mismatch bitmap for each bucket */
- int or_nmatches = 0;
- char * or_matches = NULL;
+ int tmp_nmatches = 0;
+ char * tmp_matches = NULL;
- Assert(orclauses != NIL);
- Assert(list_length(orclauses) >= 2);
+ Assert(tmp_clauses != NIL);
+ Assert((list_length(tmp_clauses) >= 2) || (not_clause(clause) && (list_length(tmp_clauses)==1)));
/* number of matching buckets */
- or_nmatches = mvhist->nbuckets;
+ tmp_nmatches = (or_clause(clause)) ? 0 : mvhist->nbuckets;
- /* by default none of the buckets matches the clauses */
- or_matches = palloc0(sizeof(char) * or_nmatches);
+ /* by default none of the buckets matches the clauses (OR clause) */
+ tmp_matches = palloc0(sizeof(char) * mvhist->nbuckets);
- if (or_clause(clause))
- {
- /* OR clauses assume nothing matches, initially */
- memset(or_matches, MVSTATS_MATCH_NONE, sizeof(char)*or_nmatches);
- or_nmatches = 0;
- }
- else
- {
- /* AND clauses assume nothing matches, initially */
- memset(or_matches, MVSTATS_MATCH_FULL, sizeof(char)*or_nmatches);
- }
+ /* but AND (and NOT) clauses assume everything matches, initially */
+ if (! or_clause(clause))
+ memset(tmp_matches, MVSTATS_MATCH_FULL, sizeof(char)*mvhist->nbuckets);
/* build the match bitmap for the OR-clauses */
- or_nmatches = update_match_bitmap_histogram(root, orclauses,
+ tmp_nmatches = update_match_bitmap_histogram(root, tmp_clauses,
stakeys, mvhist,
- or_nmatches, or_matches, or_clause(clause));
+ tmp_nmatches, tmp_matches, or_clause(clause));
/* merge the bitmap into the existing one*/
for (i = 0; i < mvhist->nbuckets; i++)
{
+ /* if this is a NOT clause, we need to invert the results first */
+ if (not_clause(clause))
+ tmp_matches[i] = (MVSTATS_MATCH_FULL - tmp_matches[i]);
+
/*
* To AND-merge the bitmaps, a MIN() semantics is used.
* For OR-merge, use MAX().
*
* FIXME this does not decrease the number of matches
*/
- UPDATE_RESULT(matches[i], or_matches[i], is_or);
+ UPDATE_RESULT(matches[i], tmp_matches[i], is_or);
}
- pfree(or_matches);
-
+ pfree(tmp_matches);
}
else
elog(ERROR, "unknown clause type: %d", clause->type);
}
- /* free the call cache */
pfree(callcache);
#ifdef DEBUG_MVHIST
@@ -3312,3 +4602,363 @@ update_match_bitmap_histogram(PlannerInfo *root, List *clauses,
return nmatches;
}
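
The bucket classification that the boundary checks above implement can
be summarized on plain integers. This standalone sketch (not part of
the patch) replaces the fmgr operator calls with ordinary comparisons
and glosses over the inclusive/exclusive handling of equal boundaries,
but shows the three outcomes for a 'var < const' clause:

#include <stdio.h>

typedef struct
{
    int minval;
    int maxval;
} Bucket;

/* classify one bucket against "var < const" */
static const char *
classify(Bucket b, int cst)
{
    if (cst > b.maxval)
        return "full";      /* whole bucket below the constant */
    if (cst > b.minval)
        return "partial";   /* constant falls inside the bucket */
    return "none";          /* whole bucket above the constant */
}

int
main(void)
{
    Bucket buckets[3] = {{0, 9}, {10, 19}, {20, 29}};
    int i, cst = 15;

    /* prints: full match, partial match, none match */
    for (i = 0; i < 3; i++)
        printf("bucket [%d,%d]: %s match\n",
               buckets[i].minval, buckets[i].maxval,
               classify(buckets[i], cst));
    return 0;
}
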
+
+/*
+ * Walk through clauses and keep only those covered by at least
+ * one of the statistics.
+ */
+static List *
+filter_clauses(PlannerInfo *root, Oid varRelid, SpecialJoinInfo *sjinfo,
+ int type, List *stats, List *clauses, Bitmapset **attnums)
+{
+ ListCell *c;
+ ListCell *s;
+
+ /* results (list of compatible clauses, attnums) */
+ List *rclauses = NIL;
+
+ foreach (c, clauses)
+ {
+ Node *clause = (Node*)lfirst(c);
+ Bitmapset *clause_attnums = NULL;
+ Index relid;
+
+ /*
+ * The clause has to be mv-compatible (suitable operators etc.).
+ */
+ if (! clause_is_mv_compatible(root, clause, varRelid,
+ &relid, &clause_attnums, sjinfo, type))
+ elog(ERROR, "should not get a non-mv-compatible clause");
+
+ /* is there a statistics covering this clause? */
+ foreach (s, stats)
+ {
+ int k, matches = 0;
+ MVStatisticInfo *stat = (MVStatisticInfo *)lfirst(s);
+
+ for (k = 0; k < stat->stakeys->dim1; k++)
+ {
+ if (bms_is_member(stat->stakeys->values[k],
+ clause_attnums))
+ matches += 1;
+ }
+
+ /*
+ * The clause is compatible if all attributes it references
+ * are covered by the statistics.
+ */
+ if (bms_num_members(clause_attnums) == matches)
+ {
+ *attnums = bms_union(*attnums, clause_attnums);
+ rclauses = lappend(rclauses, clause);
+ break;
+ }
+ }
+
+ bms_free(clause_attnums);
+ }
+
+ /* we can't have more compatible clauses than source clauses */
+ Assert(list_length(clauses) >= list_length(rclauses));
+
+ return rclauses;
+}
+
+
+/*
+ * Walk through statistics and only keep those covering at least
+ * one new attribute (excluding conditions) and at least two attributes
+ * in both clauses and conditions.
+ *
+ * This check might be made more strict by checking against individual
+ * clauses, because by using the bitmapsets of all attnums we may
+ * actually use attnums from clauses that are not covered by the
+ * statistics. For example, we may have a condition
+ *
+ * (a=1 AND b=2)
+ *
+ * and a new clause
+ *
+ * (c=1 AND d=1)
+ *
+ * With only bitmapsets, statistics on [b,c] will pass through this
+ * (assuming there are some statistics covering both clauses).
+ *
+ * TODO Do the more strict check.
+ */
+static List *
+filter_stats(List *stats, Bitmapset *new_attnums, Bitmapset *all_attnums)
+{
+ ListCell *s;
+ List *stats_filtered = NIL;
+
+ foreach (s, stats)
+ {
+ int k;
+ int matches_new = 0,
+ matches_all = 0;
+
+ MVStatisticInfo *stat = (MVStatisticInfo *)lfirst(s);
+
+ /* see how many attributes the statistics covers */
+ for (k = 0; k < stat->stakeys->dim1; k++)
+ {
+ /* attributes from new clauses */
+ if (bms_is_member(stat->stakeys->values[k], new_attnums))
+ matches_new += 1;
+
+ /* attributes from conditions */
+ if (bms_is_member(stat->stakeys->values[k], all_attnums))
+ matches_all += 1;
+ }
+
+ /* check we have enough attributes for this statistics */
+ if ((matches_new >= 1) && (matches_all >= 2))
+ stats_filtered = lappend(stats_filtered, stat);
+ }
+
+ /* we can't have more useful stats than we had originally */
+ Assert(list_length(stats) >= list_length(stats_filtered));
+
+ return stats_filtered;
+}
+
+static MVStatisticInfo *
+make_stats_array(List *stats, int *nmvstats)
+{
+ int i;
+ ListCell *l;
+
+ MVStatisticInfo *mvstats = NULL;
+ *nmvstats = list_length(stats);
+
+ mvstats
+ = (MVStatisticInfo*)palloc0((*nmvstats) * sizeof(MVStatisticInfo));
+
+ i = 0;
+ foreach (l, stats)
+ {
+ MVStatisticInfo *stat = (MVStatisticInfo *)lfirst(l);
+ memcpy(&mvstats[i++], stat, sizeof(MVStatisticInfo));
+ }
+
+ return mvstats;
+}
+
+static Bitmapset **
+make_stats_attnums(MVStatisticInfo *mvstats, int nmvstats)
+{
+ int i, j;
+ Bitmapset **stats_attnums = NULL;
+
+ Assert(nmvstats > 0);
+
+ /* build bitmaps of attnums for the stats (easier to compare) */
+ stats_attnums = (Bitmapset **)palloc0(nmvstats * sizeof(Bitmapset*));
+
+ for (i = 0; i < nmvstats; i++)
+ for (j = 0; j < mvstats[i].stakeys->dim1; j++)
+ stats_attnums[i]
+ = bms_add_member(stats_attnums[i],
+ mvstats[i].stakeys->values[j]);
+
+ return stats_attnums;
+}
+
+
+/*
+ * Now let's remove redundant statistics, covering the same columns
+ * as some other stats, when restricted to the attributes from
+ * remaining clauses.
+ *
+ * If statistics S1 covers S2 (covers S2 attributes and possibly
+ * some more), we can probably remove S2. What actually matters are
+ * attributes from covered clauses (not all the attributes). This
+ * might however prefer larger, and thus less accurate, statistics.
+ *
+ * When a redundancy is detected, we simply keep the smaller
+ * statistics (fewer columns), on the assumption that it's
+ * more accurate and faster to process. That might be incorrect for
+ * two reasons - first, the accuracy really depends on number of
+ * buckets/MCV items, not the number of columns. Second, we might
+ * prefer MCV lists over histograms or something like that.
+ */
+static List*
+filter_redundant_stats(List *stats, List *clauses, List *conditions)
+{
+ int i, j, nmvstats;
+
+ MVStatisticInfo *mvstats;
+ bool *redundant;
+ Bitmapset **stats_attnums;
+ Bitmapset *varattnos;
+ Index relid;
+
+ Assert(list_length(stats) > 0);
+ Assert(list_length(clauses) > 0);
+
+ /*
+ * We'll convert the list of statistics into an array now, because
+ * the reduction of redundant statistics is easier to do that way
+ * (we can mark previous stats as redundant, etc.).
+ */
+ mvstats = make_stats_array(stats, &nmvstats);
+ stats_attnums = make_stats_attnums(mvstats, nmvstats);
+
+ /* by default, none of the stats is redundant (so palloc0) */
+ redundant = palloc0(nmvstats * sizeof(bool));
+
+ /*
+ * We only expect a single relid here, and also we should get the
+ * same relid from clauses and conditions (but we get it from
+ * clauses, because those are certainly non-empty).
+ */
+ relid = bms_singleton_member(pull_varnos((Node*)clauses));
+
+ /*
+ * Get the varattnos from both conditions and clauses.
+ *
+ * This skips system attributes, although that should be impossible
+ * thanks to previous filtering out of incompatible clauses.
+ *
+ * XXX Is that really true?
+ */
+ varattnos = bms_union(get_varattnos((Node*)clauses, relid),
+ get_varattnos((Node*)conditions, relid));
+
+ for (i = 1; i < nmvstats; i++)
+ {
+ /* intersect with current statistics */
+ Bitmapset *curr = bms_intersect(stats_attnums[i], varattnos);
+
+ /* walk through 'previous' stats and check redundancy */
+ for (j = 0; j < i; j++)
+ {
+ /* intersect with current statistics */
+ Bitmapset *prev;
+
+ /* skip stats already identified as redundant */
+ if (redundant[j])
+ continue;
+
+ prev = bms_intersect(stats_attnums[j], varattnos);
+
+ switch (bms_subset_compare(curr, prev))
+ {
+ case BMS_EQUAL:
+ /*
+ * Use the smaller one (hopefully more accurate).
+ * If both have the same size, use the first one.
+ */
+ if (mvstats[i].stakeys->dim1 >= mvstats[j].stakeys->dim1)
+ redundant[i] = TRUE;
+ else
+ redundant[j] = TRUE;
+
+ break;
+
+ case BMS_SUBSET1: /* curr is subset of prev */
+ redundant[i] = TRUE;
+ break;
+
+ case BMS_SUBSET2: /* prev is subset of curr */
+ redundant[j] = TRUE;
+ break;
+
+ case BMS_DIFFERENT:
+ /* do nothing - keep both stats */
+ break;
+ }
+
+ bms_free(prev);
+ }
+
+ bms_free(curr);
+ }
+
+ /* can't reduce all statistics (at least one has to remain) */
+ Assert(nmvstats > 0);
+
+ /* now, let's remove the reduced statistics from the arrays */
+ list_free(stats);
+ stats = NIL;
+
+ for (i = 0; i < nmvstats; i++)
+ {
+ MVStatisticInfo *info;
+
+ pfree(stats_attnums[i]);
+
+ if (redundant[i])
+ continue;
+
+ info = makeNode(MVStatisticInfo);
+ memcpy(info, &mvstats[i], sizeof(MVStatisticInfo));
+
+ stats = lappend(stats, info);
+ }
+
+ pfree(mvstats);
+ pfree(stats_attnums);
+ pfree(redundant);
+
+ return stats;
+}
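
The redundancy check is essentially bms_subset_compare on the
statistics' attnum sets restricted to the clause attributes, with a
size tiebreak for the EQUAL case. A standalone sketch (not part of the
patch; bit masks stand in for bitmapsets):

#include <stdio.h>

typedef enum { EQUAL, SUBSET1, SUBSET2, DIFFERENT } SubsetResult;

static int
count_bits(unsigned x)
{
    int n = 0;
    while (x) { n += (x & 1); x >>= 1; }
    return n;
}

/* same semantics as bms_subset_compare, on plain masks */
static SubsetResult
subset_compare(unsigned a, unsigned b)
{
    if (a == b)
        return EQUAL;
    if ((a & ~b) == 0)
        return SUBSET1;     /* a is a subset of b */
    if ((b & ~a) == 0)
        return SUBSET2;     /* b is a subset of a */
    return DIFFERENT;
}

int
main(void)
{
    unsigned varattnos = 0x0F;              /* columns used by clauses */
    unsigned stats[3] = {0x33, 0x03, 0x0C}; /* stats' column sets */
    int redundant[3] = {0, 0, 0};
    int i, j;

    for (i = 1; i < 3; i++)
        for (j = 0; j < i; j++)
        {
            if (redundant[j])
                continue;

            switch (subset_compare(stats[i] & varattnos,
                                   stats[j] & varattnos))
            {
                case EQUAL:
                    /* keep the smaller statistics (fewer columns) */
                    if (count_bits(stats[i]) >= count_bits(stats[j]))
                        redundant[i] = 1;
                    else
                        redundant[j] = 1;
                    break;
                case SUBSET1:
                    redundant[i] = 1;
                    break;
                case SUBSET2:
                    redundant[j] = 1;
                    break;
                case DIFFERENT:
                    break;
            }
        }

    /* prints: stat 0 redundant / stat 1 kept / stat 2 kept */
    for (i = 0; i < 3; i++)
        printf("stat %d: %s\n", i, redundant[i] ? "redundant" : "kept");
    return 0;
}
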
+
+static Node**
+make_clauses_array(List *clauses, int *nclauses)
+{
+ int i;
+ ListCell *l;
+
+ Node** clauses_array;
+
+ *nclauses = list_length(clauses);
+ clauses_array = (Node **)palloc0((*nclauses) * sizeof(Node *));
+
+ i = 0;
+ foreach (l, clauses)
+ clauses_array[i++] = (Node *)lfirst(l);
+
+ *nclauses = i;
+
+ return clauses_array;
+}
+
+static Bitmapset **
+make_clauses_attnums(PlannerInfo *root, Oid varRelid, SpecialJoinInfo *sjinfo,
+ int type, Node **clauses, int nclauses)
+{
+ int i;
+ Index relid;
+ Bitmapset **clauses_attnums
+ = (Bitmapset **)palloc0(nclauses * sizeof(Bitmapset *));
+
+ for (i = 0; i < nclauses; i++)
+ {
+ Bitmapset * attnums = NULL;
+
+ if (! clause_is_mv_compatible(root, clauses[i], varRelid,
+ &relid, &attnums, sjinfo, type))
+ elog(ERROR, "should not get a non-mv-compatible clause");
+
+ clauses_attnums[i] = attnums;
+ }
+
+ return clauses_attnums;
+}
+
+static bool*
+make_cover_map(Bitmapset **stats_attnums, int nmvstats,
+ Bitmapset **clauses_attnums, int nclauses)
+{
+ int i, j;
+ bool *cover_map = (bool*)palloc0(nclauses * nmvstats * sizeof(bool));
+
+ for (i = 0; i < nmvstats; i++)
+ for (j = 0; j < nclauses; j++)
+ cover_map[i * nclauses + j]
+ = bms_is_subset(clauses_attnums[j], stats_attnums[i]);
+
+ return cover_map;
+}
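
For completeness, the cover map is just a flattened 2D boolean array -
one row per statistics, one column per clause, indexed as
i * nclauses + j. A standalone sketch (not part of the patch; masks
stand in for bitmapsets):

#include <stdio.h>

int
main(void)
{
    int nmvstats = 2, nclauses = 3;
    unsigned stat_attnums[2] = {0x03, 0x06};
    unsigned clause_attnums[3] = {0x01, 0x02, 0x04};
    char cover_map[2 * 3];
    int i, j;

    /* clause j is covered by stats i if its attnums are a subset */
    for (i = 0; i < nmvstats; i++)
        for (j = 0; j < nclauses; j++)
            cover_map[i * nclauses + j]
                = ((clause_attnums[j] & ~stat_attnums[i]) == 0);

    for (i = 0; i < nmvstats; i++)
        for (j = 0; j < nclauses; j++)
            printf("stat %d covers clause %d: %d\n",
                   i, j, cover_map[i * nclauses + j]);
    return 0;
}
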
diff --git a/src/backend/optimizer/path/costsize.c b/src/backend/optimizer/path/costsize.c
index 990486c..9e001ee 100644
--- a/src/backend/optimizer/path/costsize.c
+++ b/src/backend/optimizer/path/costsize.c
@@ -3431,7 +3431,8 @@ compute_semi_anti_join_factors(PlannerInfo *root,
joinquals,
0,
jointype,
- sjinfo);
+ sjinfo,
+ NIL);
/*
* Also get the normal inner-join selectivity of the join clauses.
@@ -3454,7 +3455,8 @@ compute_semi_anti_join_factors(PlannerInfo *root,
joinquals,
0,
JOIN_INNER,
- &norm_sjinfo);
+ &norm_sjinfo,
+ NIL);
/* Avoid leaking a lot of ListCells */
if (jointype == JOIN_ANTI)
@@ -3621,7 +3623,7 @@ approx_tuple_count(PlannerInfo *root, JoinPath *path, List *quals)
Node *qual = (Node *) lfirst(l);
/* Note that clause_selectivity will be able to cache its result */
- selec *= clause_selectivity(root, qual, 0, JOIN_INNER, &sjinfo);
+ selec *= clause_selectivity(root, qual, 0, JOIN_INNER, &sjinfo, NIL);
}
/* Apply it to the input relation sizes */
@@ -3657,7 +3659,8 @@ set_baserel_size_estimates(PlannerInfo *root, RelOptInfo *rel)
rel->baserestrictinfo,
0,
JOIN_INNER,
- NULL);
+ NULL,
+ NIL);
rel->rows = clamp_row_est(nrows);
@@ -3694,7 +3697,8 @@ get_parameterized_baserel_size(PlannerInfo *root, RelOptInfo *rel,
allclauses,
rel->relid, /* do not use 0! */
JOIN_INNER,
- NULL);
+ NULL,
+ NIL);
nrows = clamp_row_est(nrows);
/* For safety, make sure result is not more than the base estimate */
if (nrows > rel->rows)
@@ -3832,12 +3836,14 @@ calc_joinrel_size_estimate(PlannerInfo *root,
joinquals,
0,
jointype,
- sjinfo);
+ sjinfo,
+ NIL);
pselec = clauselist_selectivity(root,
pushedquals,
0,
jointype,
- sjinfo);
+ sjinfo,
+ NIL);
/* Avoid leaking a lot of ListCells */
list_free(joinquals);
@@ -3849,7 +3855,8 @@ calc_joinrel_size_estimate(PlannerInfo *root,
restrictlist,
0,
jointype,
- sjinfo);
+ sjinfo,
+ NIL);
pselec = 0.0; /* not used, keep compiler quiet */
}
diff --git a/src/backend/optimizer/util/orclauses.c b/src/backend/optimizer/util/orclauses.c
index f0acc14..e41508b 100644
--- a/src/backend/optimizer/util/orclauses.c
+++ b/src/backend/optimizer/util/orclauses.c
@@ -280,7 +280,7 @@ consider_new_or_clause(PlannerInfo *root, RelOptInfo *rel,
* saving work later.)
*/
or_selec = clause_selectivity(root, (Node *) or_rinfo,
- 0, JOIN_INNER, NULL);
+ 0, JOIN_INNER, NULL, NIL);
/*
* The clause is only worth adding to the query if it rejects a useful
@@ -342,7 +342,7 @@ consider_new_or_clause(PlannerInfo *root, RelOptInfo *rel,
/* Compute inner-join size */
orig_selec = clause_selectivity(root, (Node *) join_or_rinfo,
- 0, JOIN_INNER, &sjinfo);
+ 0, JOIN_INNER, &sjinfo, NIL);
/* And hack cached selectivity so join size remains the same */
join_or_rinfo->norm_selec = orig_selec / or_selec;
diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index 15121bc..7341cd6 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -1625,13 +1625,15 @@ booltestsel(PlannerInfo *root, BoolTestType booltesttype, Node *arg,
case IS_NOT_FALSE:
selec = (double) clause_selectivity(root, arg,
varRelid,
- jointype, sjinfo);
+ jointype, sjinfo,
+ NIL);
break;
case IS_FALSE:
case IS_NOT_TRUE:
selec = 1.0 - (double) clause_selectivity(root, arg,
varRelid,
- jointype, sjinfo);
+ jointype, sjinfo,
+ NIL);
break;
default:
elog(ERROR, "unrecognized booltesttype: %d",
@@ -6257,7 +6259,8 @@ genericcostestimate(PlannerInfo *root,
indexSelectivity = clauselist_selectivity(root, selectivityQuals,
index->rel->relid,
JOIN_INNER,
- NULL);
+ NULL,
+ NIL);
/*
* If caller didn't give us an estimate, estimate the number of index
@@ -6582,7 +6585,8 @@ btcostestimate(PG_FUNCTION_ARGS)
btreeSelectivity = clauselist_selectivity(root, selectivityQuals,
index->rel->relid,
JOIN_INNER,
- NULL);
+ NULL,
+ NIL);
numIndexTuples = btreeSelectivity * index->rel->tuples;
/*
@@ -7343,7 +7347,8 @@ gincostestimate(PG_FUNCTION_ARGS)
*indexSelectivity = clauselist_selectivity(root, selectivityQuals,
index->rel->relid,
JOIN_INNER,
- NULL);
+ NULL,
+ NIL);
/* fetch estimated page cost for tablespace containing index */
get_tablespace_page_costs(index->reltablespace,
@@ -7575,7 +7580,7 @@ brincostestimate(PG_FUNCTION_ARGS)
*indexSelectivity =
clauselist_selectivity(root, indexQuals,
path->indexinfo->rel->relid,
- JOIN_INNER, NULL);
+ JOIN_INNER, NULL, NIL);
*indexCorrelation = 1;
/*
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index a185749..909c2c7 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -75,6 +75,7 @@
#include "utils/bytea.h"
#include "utils/guc_tables.h"
#include "utils/memutils.h"
+#include "utils/mvstats.h"
#include "utils/pg_locale.h"
#include "utils/plancache.h"
#include "utils/portal.h"
@@ -380,6 +381,15 @@ static const struct config_enum_entry huge_pages_options[] = {
};
/*
+ * Search algorithm for multivariate stats.
+ */
+static const struct config_enum_entry mvstat_search_options[] = {
+ {"greedy", MVSTAT_SEARCH_GREEDY, false},
+ {"exhaustive", MVSTAT_SEARCH_EXHAUSTIVE, false},
+ {NULL, 0, false}
+};
+
+/*
* Options for enum values stored in other modules
*/
extern const struct config_enum_entry wal_level_options[];
@@ -3672,6 +3682,16 @@ static struct config_enum ConfigureNamesEnum[] =
NULL, NULL, NULL
},
+ {
+ {"mvstat_search", PGC_USERSET, QUERY_TUNING_OTHER,
+ gettext_noop("Sets the algorithm used for combining multivariate stats."),
+ NULL
+ },
+ &mvstat_search_type,
+ MVSTAT_SEARCH_GREEDY, mvstat_search_options,
+ NULL, NULL, NULL
+ },
+
/* End-of-list marker */
{
{NULL, 0, 0, NULL, NULL}, NULL, 0, NULL, NULL, NULL, NULL
diff --git a/src/include/optimizer/cost.h b/src/include/optimizer/cost.h
index ac21a3a..2431751 100644
--- a/src/include/optimizer/cost.h
+++ b/src/include/optimizer/cost.h
@@ -191,11 +191,13 @@ extern Selectivity clauselist_selectivity(PlannerInfo *root,
List *clauses,
int varRelid,
JoinType jointype,
- SpecialJoinInfo *sjinfo);
+ SpecialJoinInfo *sjinfo,
+ List *conditions);
extern Selectivity clause_selectivity(PlannerInfo *root,
Node *clause,
int varRelid,
JoinType jointype,
- SpecialJoinInfo *sjinfo);
+ SpecialJoinInfo *sjinfo,
+ List *conditions);
#endif /* COST_H */
diff --git a/src/include/utils/mvstats.h b/src/include/utils/mvstats.h
index aa07000..9fd1314 100644
--- a/src/include/utils/mvstats.h
+++ b/src/include/utils/mvstats.h
@@ -16,6 +16,14 @@
#include "commands/vacuum.h"
+typedef enum MVStatSearchType
+{
+ MVSTAT_SEARCH_EXHAUSTIVE, /* exhaustive search */
+ MVSTAT_SEARCH_GREEDY /* greedy search */
+} MVStatSearchType;
+
+extern int mvstat_search_type;
+
/*
* Degree of how much MCV item / histogram bucket matches a clause.
* This is then considered when computing the selectivity.
--
2.1.0
Attachment: 0007-initial-version-of-ndistinct-conefficient-statistics.patch (text/x-diff)
From 55211180c650c22924a4b0a261e7d36ec83c0d8c Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@pgaddict.com>
Date: Wed, 23 Dec 2015 02:07:58 +0100
Subject: [PATCH 7/7] initial version of ndistinct conefficient statistics
---
src/backend/commands/statscmds.c | 11 ++-
src/backend/optimizer/path/clausesel.c | 7 ++
src/backend/optimizer/util/plancat.c | 4 +-
src/backend/utils/mvstats/Makefile | 2 +-
src/backend/utils/mvstats/common.c | 20 ++++-
src/backend/utils/mvstats/mvdist.c | 147 +++++++++++++++++++++++++++++++++
src/include/catalog/pg_mv_statistic.h | 26 +++---
src/include/nodes/relation.h | 2 +
src/include/utils/mvstats.h | 6 ++
9 files changed, 208 insertions(+), 17 deletions(-)
create mode 100644 src/backend/utils/mvstats/mvdist.c
diff --git a/src/backend/commands/statscmds.c b/src/backend/commands/statscmds.c
index 68e1685..de140b4 100644
--- a/src/backend/commands/statscmds.c
+++ b/src/backend/commands/statscmds.c
@@ -136,7 +136,8 @@ CreateStatistics(CreateStatsStmt *stmt)
/* by default build nothing */
bool build_dependencies = false,
build_mcv = false,
- build_histogram = false;
+ build_histogram = false,
+ build_ndistinct = false;
int32 max_buckets = -1,
max_mcv_items = -1;
@@ -200,6 +201,8 @@ CreateStatistics(CreateStatsStmt *stmt)
if (strcmp(opt->defname, "dependencies") == 0)
build_dependencies = defGetBoolean(opt);
+ else if (strcmp(opt->defname, "ndistinct") == 0)
+ build_ndistinct = defGetBoolean(opt);
else if (strcmp(opt->defname, "mcv") == 0)
build_mcv = defGetBoolean(opt);
else if (strcmp(opt->defname, "max_mcv_items") == 0)
@@ -254,10 +257,10 @@ CreateStatistics(CreateStatsStmt *stmt)
}
/* check that at least some statistics were requested */
- if (! (build_dependencies || build_mcv || build_histogram))
+ if (! (build_dependencies || build_mcv || build_histogram || build_ndistinct))
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
- errmsg("no statistics type (dependencies, mcv, histogram) was requested")));
+ errmsg("no statistics type (dependencies, mcv, histogram, ndistinct) was requested")));
/* now do some checking of the options */
if (require_mcv && (! build_mcv))
@@ -291,6 +294,7 @@ CreateStatistics(CreateStatsStmt *stmt)
values[Anum_pg_mv_statistic_deps_enabled -1] = BoolGetDatum(build_dependencies);
values[Anum_pg_mv_statistic_mcv_enabled -1] = BoolGetDatum(build_mcv);
values[Anum_pg_mv_statistic_hist_enabled -1] = BoolGetDatum(build_histogram);
+ values[Anum_pg_mv_statistic_ndist_enabled-1] = BoolGetDatum(build_ndistinct);
values[Anum_pg_mv_statistic_mcv_max_items -1] = Int32GetDatum(max_mcv_items);
values[Anum_pg_mv_statistic_hist_max_buckets -1] = Int32GetDatum(max_buckets);
@@ -298,6 +302,7 @@ CreateStatistics(CreateStatsStmt *stmt)
nulls[Anum_pg_mv_statistic_stadeps -1] = true;
nulls[Anum_pg_mv_statistic_stamcv -1] = true;
nulls[Anum_pg_mv_statistic_stahist -1] = true;
+ nulls[Anum_pg_mv_statistic_standist -1] = true;
/* insert the tuple into pg_mv_statistic */
mvstatrel = heap_open(MvStatisticRelationId, RowExclusiveLock);
diff --git a/src/backend/optimizer/path/clausesel.c b/src/backend/optimizer/path/clausesel.c
index 8d15d3c8..c717f96 100644
--- a/src/backend/optimizer/path/clausesel.c
+++ b/src/backend/optimizer/path/clausesel.c
@@ -59,6 +59,7 @@ static void addRangeClause(RangeQueryClause **rqlist, Node *clause,
#define MV_CLAUSE_TYPE_FDEP 0x01
#define MV_CLAUSE_TYPE_MCV 0x02
#define MV_CLAUSE_TYPE_HIST 0x04
+#define MV_CLAUSE_TYPE_NDIST 0x08
static bool clause_is_mv_compatible(PlannerInfo *root, Node *clause, Oid varRelid,
Index *relid, Bitmapset **attnums, SpecialJoinInfo *sjinfo,
@@ -377,6 +378,9 @@ clauselist_selectivity(PlannerInfo *root,
stats, sjinfo);
}
+ if (has_stats(stats, MV_CLAUSE_TYPE_NDIST))
+ elog(WARNING, "has ndistinct coefficient stats");
+
/*
* Check that there are statistics with MCV list or histogram.
* If not, we don't need to waste time with the optimization.
@@ -2931,6 +2935,9 @@ has_stats(List *stats, int type)
if ((type & MV_CLAUSE_TYPE_HIST) && stat->hist_built)
return true;
+
+ if ((type & MV_CLAUSE_TYPE_NDIST) && stat->ndist_built)
+ return true;
}
return false;
diff --git a/src/backend/optimizer/util/plancat.c b/src/backend/optimizer/util/plancat.c
index 9aded52..f4edfe6 100644
--- a/src/backend/optimizer/util/plancat.c
+++ b/src/backend/optimizer/util/plancat.c
@@ -410,7 +410,7 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
mvstat = (Form_pg_mv_statistic) GETSTRUCT(htup);
/* unavailable stats are not interesting for the planner */
- if (mvstat->deps_built || mvstat->mcv_built || mvstat->hist_built)
+ if (mvstat->deps_built || mvstat->mcv_built || mvstat->hist_built || mvstat->ndist_built)
{
info = makeNode(MVStatisticInfo);
@@ -421,11 +421,13 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
info->deps_enabled = mvstat->deps_enabled;
info->mcv_enabled = mvstat->mcv_enabled;
info->hist_enabled = mvstat->hist_enabled;
+ info->ndist_enabled = mvstat->ndist_enabled;
/* built/available statistics */
info->deps_built = mvstat->deps_built;
info->mcv_built = mvstat->mcv_built;
info->hist_built = mvstat->hist_built;
+ info->ndist_built = mvstat->ndist_built;
/* stakeys */
adatum = SysCacheGetAttr(MVSTATOID, htup,
diff --git a/src/backend/utils/mvstats/Makefile b/src/backend/utils/mvstats/Makefile
index 9dbb3b6..d4b88e9 100644
--- a/src/backend/utils/mvstats/Makefile
+++ b/src/backend/utils/mvstats/Makefile
@@ -12,6 +12,6 @@ subdir = src/backend/utils/mvstats
top_builddir = ../../../..
include $(top_builddir)/src/Makefile.global
-OBJS = common.o dependencies.o histogram.o mcv.o
+OBJS = common.o dependencies.o histogram.o mcv.o mvdist.o
include $(top_srcdir)/src/backend/common.mk
diff --git a/src/backend/utils/mvstats/common.c b/src/backend/utils/mvstats/common.c
index ffb76f4..c42ca8f 100644
--- a/src/backend/utils/mvstats/common.c
+++ b/src/backend/utils/mvstats/common.c
@@ -53,6 +53,7 @@ build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
MVDependencies deps = NULL;
MCVList mcvlist = NULL;
MVHistogram histogram = NULL;
+ double ndist = -1;
int numrows_filtered = numrows;
VacAttrStats **stats = NULL;
@@ -92,6 +93,9 @@ build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
if (stat->deps_enabled)
deps = build_mv_dependencies(numrows, rows, attrs, stats);
+ if (stat->ndist_enabled)
+ ndist = build_mv_ndistinct(numrows, rows, attrs, stats);
+
/* build the MCV list */
if (stat->mcv_enabled)
mcvlist = build_mv_mcvlist(numrows, rows, attrs, stats, &numrows_filtered);
@@ -101,7 +105,7 @@ build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
histogram = build_mv_histogram(numrows_filtered, rows, attrs, stats, numrows);
/* store the histogram / MCV list in the catalog */
- update_mv_stats(stat->mvoid, deps, mcvlist, histogram, attrs, stats);
+ update_mv_stats(stat->mvoid, deps, mcvlist, histogram, ndist, attrs, stats);
}
}
@@ -183,6 +187,8 @@ list_mv_stats(Oid relid)
info->mcv_built = stats->mcv_built;
info->hist_enabled = stats->hist_enabled;
info->hist_built = stats->hist_built;
+ info->ndist_enabled = stats->ndist_enabled;
+ info->ndist_built = stats->ndist_built;
result = lappend(result, info);
}
@@ -252,7 +258,7 @@ find_mv_attnums(Oid mvoid, Oid *relid)
void
update_mv_stats(Oid mvoid,
MVDependencies dependencies, MCVList mcvlist, MVHistogram histogram,
- int2vector *attrs, VacAttrStats **stats)
+ double ndistcoeff, int2vector *attrs, VacAttrStats **stats)
{
HeapTuple stup,
oldtup;
@@ -292,26 +298,36 @@ update_mv_stats(Oid mvoid,
= PointerGetDatum(data);
}
+ if (ndistcoeff > 1.0)
+ {
+ nulls[Anum_pg_mv_statistic_standist -1] = false;
+ values[Anum_pg_mv_statistic_standist-1] = Float8GetDatum(ndistcoeff);
+ }
+
/* always replace the value (either by bytea or NULL) */
replaces[Anum_pg_mv_statistic_stadeps -1] = true;
replaces[Anum_pg_mv_statistic_stamcv -1] = true;
replaces[Anum_pg_mv_statistic_stahist-1] = true;
+ replaces[Anum_pg_mv_statistic_standist-1] = true;
/* always change the availability flags */
nulls[Anum_pg_mv_statistic_deps_built -1] = false;
nulls[Anum_pg_mv_statistic_mcv_built -1] = false;
nulls[Anum_pg_mv_statistic_hist_built-1] = false;
+ nulls[Anum_pg_mv_statistic_ndist_built-1] = false;
nulls[Anum_pg_mv_statistic_stakeys-1] = false;
/* use the new attnums, in case we removed some dropped ones */
replaces[Anum_pg_mv_statistic_deps_built-1] = true;
replaces[Anum_pg_mv_statistic_mcv_built -1] = true;
+ replaces[Anum_pg_mv_statistic_ndist_built-1] = true;
replaces[Anum_pg_mv_statistic_hist_built -1] = true;
replaces[Anum_pg_mv_statistic_stakeys -1] = true;
values[Anum_pg_mv_statistic_deps_built-1] = BoolGetDatum(dependencies != NULL);
values[Anum_pg_mv_statistic_mcv_built -1] = BoolGetDatum(mcvlist != NULL);
values[Anum_pg_mv_statistic_hist_built -1] = BoolGetDatum(histogram != NULL);
+ values[Anum_pg_mv_statistic_ndist_built-1] = BoolGetDatum(ndistcoeff > 1.0);
values[Anum_pg_mv_statistic_stakeys -1] = PointerGetDatum(attrs);
/* Is there already a pg_mv_statistic tuple for this attribute? */
diff --git a/src/backend/utils/mvstats/mvdist.c b/src/backend/utils/mvstats/mvdist.c
new file mode 100644
index 0000000..6df7411
--- /dev/null
+++ b/src/backend/utils/mvstats/mvdist.c
@@ -0,0 +1,147 @@
+/*-------------------------------------------------------------------------
+ *
+ * mvdist.c
+ * POSTGRES multivariate distinct coefficients
+ *
+ *
+ * Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ * src/backend/utils/mvstats/mvdist.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "common.h"
+#include "utils/lsyscache.h"
+
+/*
+ * Compute the ndistinct coefficient for a combination of columns,
+ * i.e. the product of the per-column ndistinct estimates divided by
+ * the ndistinct estimate for the whole combination (all computed
+ * from the sample rows).
+ */
+double
+build_mv_ndistinct(int numrows, HeapTuple *rows, int2vector *attrs,
+ VacAttrStats **stats)
+{
+ int i, j;
+ int numattrs = attrs->dim1;
+ MultiSortSupport mss = multi_sort_init(numattrs);
+ int ndistinct;
+ double result;
+
+ /*
+ * It's possible to sort the sample rows directly, but this seemed
+ * somewhat simpler / less error prone. Another option would be to
+ * allocate the arrays for each SortItem separately, but that'd be
+ * significant overhead (not just CPU, but especially memory bloat).
+ */
+ SortItem * items = (SortItem*)palloc0(numrows * sizeof(SortItem));
+
+ Datum *values = (Datum*)palloc0(sizeof(Datum) * numrows * numattrs);
+ bool *isnull = (bool*)palloc0(sizeof(bool) * numrows * numattrs);
+
+ for (i = 0; i < numrows; i++)
+ {
+ items[i].values = &values[i * numattrs];
+ items[i].isnull = &isnull[i * numattrs];
+ }
+
+ Assert(numattrs >= 2);
+
+ for (i = 0; i < numattrs; i++)
+ {
+ /* prepare the sort function for this dimension */
+ multi_sort_add_dimension(mss, i, i, stats);
+
+ /* accumulate all the data into the array and sort it */
+ for (j = 0; j < numrows; j++)
+ {
+ items[j].values[i]
+ = heap_getattr(rows[j], attrs->values[i],
+ stats[i]->tupDesc, &items[j].isnull[i]);
+ }
+ }
+
+ qsort_arg((void *) items, numrows, sizeof(SortItem),
+ multi_sort_compare, mss);
+
+ /* count number of distinct combinations */
+
+ ndistinct = 1;
+ for (i = 1; i < numrows; i++)
+ {
+ if (multi_sort_compare(&items[i], &items[i-1], mss) != 0)
+ ndistinct++;
+ }
+
+ result = 1 / (double)ndistinct;	/* i.e. 1 / ndistinct(a,b,...) */
+
+ /*
+ * Now count distinct values for each attribute separately, and
+ * incrementally compute the coefficient, i.e. the ratio
+ *
+ *    (ndistinct(a) * ndistinct(b) * ...) / ndistinct(a,b,...)
+ */
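+ /*
+ * For example (hypothetical numbers, for illustration only): with
+ * ndistinct(a) = 100 and ndistinct(b) = 100, perfectly correlated
+ * columns have ndistinct(a,b) = 100 and thus a coefficient of 100,
+ * while independent columns have ndistinct(a,b) close to 10000 and
+ * thus a coefficient close to 1.0.
+ */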
+ for (i = 0; i < numattrs; i++)
+ {
+ SortSupportData ssup;
+ StdAnalyzeData *tmp = (StdAnalyzeData *)stats[i]->extra_data;
+
+ /* initialize sort support, etc. */
+ memset(&ssup, 0, sizeof(ssup));
+ ssup.ssup_cxt = CurrentMemoryContext;
+
+ /* We always use the default collation for statistics */
+ ssup.ssup_collation = DEFAULT_COLLATION_OID;
+ ssup.ssup_nulls_first = false;
+
+ PrepareSortSupportFromOrderingOp(tmp->ltopr, &ssup);
+
+ memset(values, 0, sizeof(Datum) * numrows);
+
+ /* accumulate all the data into the array and sort it */
+ for (j = 0; j < numrows; j++)
+ {
+ bool isnull;
+ values[j] = heap_getattr(rows[j], attrs->values[i],
+ stats[i]->tupDesc, &isnull);
+ }
+
+ qsort_arg((void *)values, numrows, sizeof(Datum),
+ compare_scalars_simple, &ssup);
+
+ ndistinct = 1;
+ for (j = 1; j < numrows; j++)
+ {
+ if (compare_scalars_simple(&values[j], &values[j-1], &ssup) != 0)
+ ndistinct++;
+ }
+
+ result *= ndistinct;
+ }
+
+ return result;
+}
+
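+/*
+ * load_mv_ndistinct
+ *		Load the ndistinct coefficient for a pg_mv_statistic entry.
+ */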
+double
+load_mv_ndistinct(Oid mvoid)
+{
+ bool isnull = false;
+ Datum ndist;
+
+ /* fetch the pg_mv_statistic tuple for this statistics OID */
+ HeapTuple htup = SearchSysCache1(MVSTATOID, ObjectIdGetDatum(mvoid));
+
+#ifdef USE_ASSERT_CHECKING
+ Form_pg_mv_statistic mvstat = (Form_pg_mv_statistic) GETSTRUCT(htup);
+ Assert(mvstat->ndist_enabled && mvstat->ndist_built);
+#endif
+
+ ndist = SysCacheGetAttr(MVSTATOID, htup,
+ Anum_pg_mv_statistic_standist, &isnull);
+
+ Assert(!isnull);
+
+ ReleaseSysCache(htup);
+
+ return DatumGetFloat8(ndist);
+}
diff --git a/src/include/catalog/pg_mv_statistic.h b/src/include/catalog/pg_mv_statistic.h
index df6a61c..fb9ee22 100644
--- a/src/include/catalog/pg_mv_statistic.h
+++ b/src/include/catalog/pg_mv_statistic.h
@@ -38,6 +38,7 @@ CATALOG(pg_mv_statistic,3381)
bool deps_enabled; /* analyze dependencies? */
bool mcv_enabled; /* build MCV list? */
bool hist_enabled; /* build histogram? */
+ bool ndist_enabled; /* build ndist coefficient? */
/* histogram / MCV size */
int32 mcv_max_items; /* max MCV items */
@@ -47,6 +48,7 @@ CATALOG(pg_mv_statistic,3381)
bool deps_built; /* dependencies were built */
bool mcv_built; /* MCV list was built */
bool hist_built; /* histogram was built */
+ bool ndist_built; /* ndistinct coeff built */
/* variable-length fields start here, but we allow direct access to stakeys */
int2vector stakeys; /* array of column keys */
@@ -55,6 +57,7 @@ CATALOG(pg_mv_statistic,3381)
bytea stadeps; /* dependencies (serialized) */
bytea stamcv; /* MCV list (serialized) */
bytea stahist; /* MV histogram (serialized) */
+ float8 standist; /* ndistinct coefficient */
#endif
} FormData_pg_mv_statistic;
@@ -70,20 +73,23 @@ typedef FormData_pg_mv_statistic *Form_pg_mv_statistic;
* compiler constants for pg_attrdef
* ----------------
*/
-#define Natts_pg_mv_statistic 14
+#define Natts_pg_mv_statistic 17
#define Anum_pg_mv_statistic_starelid 1
#define Anum_pg_mv_statistic_staname 2
#define Anum_pg_mv_statistic_deps_enabled 3
#define Anum_pg_mv_statistic_mcv_enabled 4
#define Anum_pg_mv_statistic_hist_enabled 5
-#define Anum_pg_mv_statistic_mcv_max_items 6
-#define Anum_pg_mv_statistic_hist_max_buckets 7
-#define Anum_pg_mv_statistic_deps_built 8
-#define Anum_pg_mv_statistic_mcv_built 9
-#define Anum_pg_mv_statistic_hist_built 10
-#define Anum_pg_mv_statistic_stakeys 11
-#define Anum_pg_mv_statistic_stadeps 12
-#define Anum_pg_mv_statistic_stamcv 13
-#define Anum_pg_mv_statistic_stahist 14
+#define Anum_pg_mv_statistic_ndist_enabled 6
+#define Anum_pg_mv_statistic_mcv_max_items 7
+#define Anum_pg_mv_statistic_hist_max_buckets 8
+#define Anum_pg_mv_statistic_deps_built 9
+#define Anum_pg_mv_statistic_mcv_built 10
+#define Anum_pg_mv_statistic_hist_built 11
+#define Anum_pg_mv_statistic_ndist_built 12
+#define Anum_pg_mv_statistic_stakeys 13
+#define Anum_pg_mv_statistic_stadeps 14
+#define Anum_pg_mv_statistic_stamcv 15
+#define Anum_pg_mv_statistic_stahist 16
+#define Anum_pg_mv_statistic_standist 17
#endif /* PG_MV_STATISTIC_H */
diff --git a/src/include/nodes/relation.h b/src/include/nodes/relation.h
index 3706525..6ecbc4e 100644
--- a/src/include/nodes/relation.h
+++ b/src/include/nodes/relation.h
@@ -594,11 +594,13 @@ typedef struct MVStatisticInfo
bool deps_enabled; /* functional dependencies enabled */
bool mcv_enabled; /* MCV list enabled */
bool hist_enabled; /* histogram enabled */
+ bool ndist_enabled; /* ndistinct coefficient enabled */
/* built/available statistics */
bool deps_built; /* functional dependencies built */
bool mcv_built; /* MCV list built */
bool hist_built; /* histogram built */
+ bool ndist_built; /* ndistinct coefficient built */
/* columns in the statistics (attnums) */
int2vector *stakeys; /* attnums of the columns covered */
diff --git a/src/include/utils/mvstats.h b/src/include/utils/mvstats.h
index 9fd1314..d3f9de3 100644
--- a/src/include/utils/mvstats.h
+++ b/src/include/utils/mvstats.h
@@ -224,6 +224,7 @@ typedef MVSerializedHistogramData *MVSerializedHistogram;
MVDependencies load_mv_dependencies(Oid mvoid);
MCVList load_mv_mcvlist(Oid mvoid);
MVSerializedHistogram load_mv_histogram(Oid mvoid);
+double load_mv_ndistinct(Oid mvoid);
bytea * serialize_mv_dependencies(MVDependencies dependencies);
bytea * serialize_mv_mcvlist(MCVList mcvlist, int2vector *attrs,
@@ -265,11 +266,16 @@ MVHistogram
build_mv_histogram(int numrows, HeapTuple *rows, int2vector *attrs,
VacAttrStats **stats, int numrows_total);
+double
+build_mv_ndistinct(int numrows, HeapTuple *rows, int2vector *attrs,
+ VacAttrStats **stats);
+
void build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
int natts, VacAttrStats **vacattrstats);
void update_mv_stats(Oid relid, MVDependencies dependencies,
MCVList mcvlist, MVHistogram histogram,
+ double ndistcoeff,
int2vector *attrs, VacAttrStats **stats);
#ifdef DEBUG_MVHIST
--
2.1.0
Hi,
attached is v9 of the patch series, including mostly these changes:
1) CREATE STATISTICS cleanup
Firstly, the STATISTICS keyword is unreserved again (I forgot to do
that in the previous version).
I've also removed additional stuff from the grammar that turned out
to be unnecessary / could be replaced with existing pieces.
2) making statistics schema-specific
Similarly to other objects (e.g. types), statistics names are now
unique within a schema. This also means a statistics may be created
using a qualified name, and may belong to a different schema than
the table it is defined on (see the sketch after this list).
It seems to me we probably also need to track the owner, and only
allow the owner (or a superuser / the schema owner) to manipulate
the statistics. The initial intention was to inherit all this from
the parent table, but as we're designing this for the multi-table
case, that no longer works.
3) adding IF [NOT] EXISTS to DROP STATISTICS / CREATE STATISTICS
4) basic documentation of the DDL commands
It's really simple at this point and some of the paragraphs are
still empty. I also think that we'll have to add stuff explaining
how to use statistics, not just docs for the DDL commands.
5) various fixes of the regression tests, related to the above
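To make (2) and (3) concrete, here's a minimal sketch of the DDL (the
schema, table and statistics names are made up for illustration):

CREATE SCHEMA stats;
CREATE TABLE public.t (a INT, b INT);

-- the statistics may live in a different schema than the table
CREATE STATISTICS stats.s1 ON public.t (a, b) WITH (dependencies = true);

-- IF NOT EXISTS merely issues a notice when the name is already taken
CREATE STATISTICS IF NOT EXISTS stats.s1 ON public.t (a, b)
WITH (dependencies = true);

DROP STATISTICS IF EXISTS stats.s1;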
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachments:
0001-teach-pull_-varno-varattno-_walker-about-RestrictInf.patch
From 55c1eee0e734e6e36da2e6f705b70228c2fce67c Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@pgaddict.com>
Date: Tue, 28 Apr 2015 19:56:33 +0200
Subject: [PATCH 1/7] teach pull_(varno|varattno)_walker about RestrictInfo
otherwise pull_varnos fails when processing OR clauses
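For context, the failure mode involves clauses like (table name made
up for illustration):

SELECT * FROM t WHERE (a = 1) OR (b = 2);

The multivariate statistics patches later in the series pass clause
trees that still contain RestrictInfo nodes (including ones nested
below an OR) to pull_varnos / pull_varattnos, so the walkers need to
look through them.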
---
src/backend/optimizer/util/var.c | 16 ++++++++++++++++
1 file changed, 16 insertions(+)
diff --git a/src/backend/optimizer/util/var.c b/src/backend/optimizer/util/var.c
index dff52c4..80d01bd 100644
--- a/src/backend/optimizer/util/var.c
+++ b/src/backend/optimizer/util/var.c
@@ -197,6 +197,13 @@ pull_varnos_walker(Node *node, pull_varnos_context *context)
context->sublevels_up--;
return result;
}
+ if (IsA(node, RestrictInfo))
+ {
+ RestrictInfo *rinfo = (RestrictInfo*)node;
+ context->varnos = bms_add_members(context->varnos,
+ rinfo->clause_relids);
+ return false;
+ }
return expression_tree_walker(node, pull_varnos_walker,
(void *) context);
}
@@ -245,6 +252,15 @@ pull_varattnos_walker(Node *node, pull_varattnos_context *context)
return false;
}
+ if (IsA(node, RestrictInfo))
+ {
+ RestrictInfo *rinfo = (RestrictInfo *)node;
+
+ return expression_tree_walker((Node*)rinfo->clause,
+ pull_varattnos_walker,
+ (void*) context);
+ }
+
/* Should not find an unplanned subquery */
Assert(!IsA(node, Query));
--
2.1.0
0002-shared-infrastructure-and-functional-dependencies.patch
From 5b0f45b77134f3b8db327f76f0351dc6119a0417 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tv@fuzzy.cz>
Date: Sun, 11 Jan 2015 19:51:48 +0100
Subject: [PATCH 2/7] shared infrastructure and functional dependencies
Basic infrastructure shared by all kinds of multivariate
stats, most importantly:
- adds a new system catalog (pg_mv_statistic)
- CREATE STATISTICS name ON table (columns) WITH (options)
- DROP STATISTICS name
- implementation of functional dependencies (the simplest
type of multivariate statistics)
- building functional dependencies in ANALYZE
- updates regression tests (new catalog etc.)
This does not include any changes to the optimizer, i.e.
it does not influence the query planning (subject to
follow-up patches).
The current implementation requires a valid 'ltopr' for
the columns, so that we can sort the sample rows in various
ways, both in this patch and other kinds of statistics.
Maybe this restriction could be relaxed in the future,
requiring just 'eqopr' in case of stats not sorting the
data (e.g. functional dependencies and MCV lists).
Maybe some of the stats (functional dependencies and MCV
list with limited functionality) might be made to work
with hashes of the values, which is sufficient for equality
comparisons. But the queries would require the equality
operator anyway, so it's not really a weaker requirement.
The hashes might reduce space requirements, though.
The algorithm detecting the dependencies is rather simple and
probably needs improvement, both to detect more complicated
dependencies and to validate the math.
The name 'functional dependencies' is more correct (than
'association rules') as it's exactly the name used in
relational theory (esp. Normal Forms) for tracking
column-level dependencies.
The multivariate statistics are automatically removed in
two situations:
(a) after a DROP TABLE (obviously)
(b) after ALTER TABLE ... DROP COLUMN, if the statistics
would be left with fewer than 2 remaining columns
If there are at least 2 columns remaining, we keep
the statistics but perform cleanup on the next ANALYZE.
The dropped columns are removed from stakeys, and the new
statistics is built on the smaller set.
We can't do this at DROP COLUMN time, because that'd leave us
with invalid statistics, or we'd have to throw away statistics
we could still use. The lazy approach lets us keep using the
statistics even though some of the columns are dead (see the
example below).
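For illustration (names made up):

CREATE TABLE t (a INT, b INT, c INT);
CREATE STATISTICS s2 ON t (a, b, c) WITH (dependencies = true);

-- two columns remain, so the statistics survives, and the next
-- ANALYZE rebuilds it on just (a, b)
ALTER TABLE t DROP COLUMN c;
ANALYZE t;

-- only one column would remain, so the statistics gets dropped
ALTER TABLE t DROP COLUMN b;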
This also adds a simple list of statistics to \d in psql.
This means the statistics are created within a schema by
using a qualified name (or using the default schema)
CREATE STATISTICS schema.statistics ON ...
and then dropped by specifying qualified name
DROP STATISTICS schema.statistics
or searching through search_path (just like with other objects).
The commands also include IF [NOT] EXISTS clauses, similarly
to the other DDL commands.
I'm not entirely sure making statistics schema-specific is such
a great idea. Maybe they should be "global", but that does not seem
right (e.g. it makes multi-tenant systems based on schemas more
difficult to manage, because tenants would interact).
Includes basic SGML documentation for the DDL commands, although
some of the sections are empty at the moment. In the end, there
should probably be a separate section about statistics elsewhere
in the documentation, explaining how to use the stats.
---
doc/src/sgml/ref/allfiles.sgml | 2 +
doc/src/sgml/ref/create_statistics.sgml | 174 ++++++++
doc/src/sgml/ref/drop_statistics.sgml | 90 ++++
doc/src/sgml/reference.sgml | 2 +
src/backend/catalog/Makefile | 1 +
src/backend/catalog/dependency.c | 11 +-
src/backend/catalog/heap.c | 102 +++++
src/backend/catalog/namespace.c | 51 +++
src/backend/catalog/objectaddress.c | 22 +
src/backend/catalog/system_views.sql | 11 +
src/backend/commands/Makefile | 6 +-
src/backend/commands/analyze.c | 21 +
src/backend/commands/dropcmds.c | 4 +
src/backend/commands/event_trigger.c | 3 +
src/backend/commands/statscmds.c | 331 +++++++++++++++
src/backend/commands/tablecmds.c | 8 +-
src/backend/nodes/copyfuncs.c | 16 +
src/backend/nodes/outfuncs.c | 18 +
src/backend/optimizer/util/plancat.c | 63 +++
src/backend/parser/gram.y | 34 +-
src/backend/tcop/utility.c | 11 +
src/backend/utils/Makefile | 2 +-
src/backend/utils/cache/relcache.c | 59 +++
src/backend/utils/cache/syscache.c | 23 ++
src/backend/utils/mvstats/Makefile | 17 +
src/backend/utils/mvstats/common.c | 356 ++++++++++++++++
src/backend/utils/mvstats/common.h | 75 ++++
src/backend/utils/mvstats/dependencies.c | 638 +++++++++++++++++++++++++++++
src/bin/psql/describe.c | 44 ++
src/include/catalog/dependency.h | 5 +-
src/include/catalog/heap.h | 1 +
src/include/catalog/indexing.h | 7 +
src/include/catalog/namespace.h | 2 +
src/include/catalog/pg_mv_statistic.h | 73 ++++
src/include/catalog/pg_proc.h | 5 +
src/include/catalog/toasting.h | 1 +
src/include/commands/defrem.h | 4 +
src/include/nodes/nodes.h | 2 +
src/include/nodes/parsenodes.h | 12 +
src/include/nodes/relation.h | 28 ++
src/include/utils/mvstats.h | 70 ++++
src/include/utils/rel.h | 4 +
src/include/utils/relcache.h | 1 +
src/include/utils/syscache.h | 2 +
src/test/regress/expected/rules.out | 8 +
src/test/regress/expected/sanity_check.out | 1 +
46 files changed, 2410 insertions(+), 11 deletions(-)
create mode 100644 doc/src/sgml/ref/create_statistics.sgml
create mode 100644 doc/src/sgml/ref/drop_statistics.sgml
create mode 100644 src/backend/commands/statscmds.c
create mode 100644 src/backend/utils/mvstats/Makefile
create mode 100644 src/backend/utils/mvstats/common.c
create mode 100644 src/backend/utils/mvstats/common.h
create mode 100644 src/backend/utils/mvstats/dependencies.c
create mode 100644 src/include/catalog/pg_mv_statistic.h
create mode 100644 src/include/utils/mvstats.h
diff --git a/doc/src/sgml/ref/allfiles.sgml b/doc/src/sgml/ref/allfiles.sgml
index bf95453..c0f7653 100644
--- a/doc/src/sgml/ref/allfiles.sgml
+++ b/doc/src/sgml/ref/allfiles.sgml
@@ -76,6 +76,7 @@ Complete list of usable sgml source files in this directory.
<!ENTITY createSchema SYSTEM "create_schema.sgml">
<!ENTITY createSequence SYSTEM "create_sequence.sgml">
<!ENTITY createServer SYSTEM "create_server.sgml">
+<!ENTITY createStatistics SYSTEM "create_statistics.sgml">
<!ENTITY createTable SYSTEM "create_table.sgml">
<!ENTITY createTableAs SYSTEM "create_table_as.sgml">
<!ENTITY createTableSpace SYSTEM "create_tablespace.sgml">
@@ -119,6 +120,7 @@ Complete list of usable sgml source files in this directory.
<!ENTITY dropSchema SYSTEM "drop_schema.sgml">
<!ENTITY dropSequence SYSTEM "drop_sequence.sgml">
<!ENTITY dropServer SYSTEM "drop_server.sgml">
+<!ENTITY dropStatistics SYSTEM "drop_statistics.sgml">
<!ENTITY dropTable SYSTEM "drop_table.sgml">
<!ENTITY dropTableSpace SYSTEM "drop_tablespace.sgml">
<!ENTITY dropTransform SYSTEM "drop_transform.sgml">
diff --git a/doc/src/sgml/ref/create_statistics.sgml b/doc/src/sgml/ref/create_statistics.sgml
new file mode 100644
index 0000000..a86eae3
--- /dev/null
+++ b/doc/src/sgml/ref/create_statistics.sgml
@@ -0,0 +1,174 @@
+<!--
+doc/src/sgml/ref/create_statistics.sgml
+PostgreSQL documentation
+-->
+
+<refentry id="SQL-CREATESTATISTICS">
+ <indexterm zone="sql-createstatistics">
+ <primary>CREATE STATISTICS</primary>
+ </indexterm>
+
+ <refmeta>
+ <refentrytitle>CREATE STATISTICS</refentrytitle>
+ <manvolnum>7</manvolnum>
+ <refmiscinfo>SQL - Language Statements</refmiscinfo>
+ </refmeta>
+
+ <refnamediv>
+ <refname>CREATE STATISTICS</refname>
+ <refpurpose>define a new statistics</refpurpose>
+ </refnamediv>
+
+ <refsynopsisdiv>
+<synopsis>
+CREATE STATISTICS [ IF NOT EXISTS ] <replaceable class="PARAMETER">statistics_name</replaceable>
+    ON <replaceable class="PARAMETER">table_name</replaceable> ( <replaceable class="PARAMETER">column_name</replaceable> [, ...] )
+    [ WITH ( <replaceable class="PARAMETER">statistics_parameter</replaceable> [= <replaceable class="PARAMETER">value</replaceable>] [, ... ] ) ]
+</synopsis>
+
+ </refsynopsisdiv>
+
+ <refsect1 id="SQL-CREATESTATISTICS-description">
+ <title>Description</title>
+
+ <para>
+ <command>CREATE STATISTICS</command> will create a new multivariate
+ statistics on the table. The statistics will be created in the
+ current database and will be owned by the user issuing the command.
+ </para>
+
+ <para>
+ If a schema name is given (for example, <literal>CREATE STATISTICS
+ myschema.mystat ...</>) then the statistics is created in the specified
+ schema. Otherwise it is created in the current schema. The name of
+ the statistics must be distinct from the name of any other statistics
+ in the same schema.
+ </para>
+
+ <para>
+ To be able to create statistics, you must own the table the
+ statistics are defined on.
+ </para>
+ </refsect1>
+
+ <refsect1>
+ <title>Parameters</title>
+
+ <variablelist>
+
+ <varlistentry>
+ <term><literal>IF NOT EXISTS</></term>
+ <listitem>
+ <para>
+ Do not throw an error if a statistics with the same name already exists.
+ A notice is issued in this case. Note that there is no guarantee that
+ the existing statistics is anything like the one that would have been
+ created.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><replaceable class="PARAMETER">statistics_name</replaceable></term>
+ <listitem>
+ <para>
+ The name (optionally schema-qualified) of the statistics to be created.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><replaceable class="PARAMETER">table_name</replaceable></term>
+ <listitem>
+ <para>
+ The name (optionally schema-qualified) of the table the statistics should
+ be created on.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><replaceable class="PARAMETER">column_name</replaceable></term>
+ <listitem>
+ <para>
+ The name of a column to be included in the statistics.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><literal>WITH ( <replaceable class="PARAMETER">statistics_parameter</replaceable> [= <replaceable class="PARAMETER">value</replaceable>] [, ... ] )</literal></term>
+ <listitem>
+ <para>
+ ...
+ </para>
+ </listitem>
+ </varlistentry>
+
+ </variablelist>
+
+ <refsect2 id="SQL-CREATESTATISTICS-parameters">
+ <title id="SQL-CREATESTATISTICS-parameters-title">Statistics Parameters</title>
+
+ <indexterm zone="sql-createstatistics-parameters">
+ <primary>statistics parameters</primary>
+ </indexterm>
+
+ <para>
+ The <literal>WITH</> clause can specify <firstterm>statistics
+ parameters</>. The currently available parameters are listed below.
+ </para>
+
+ <variablelist>
+
+ <varlistentry>
+ <term><literal>dependencies</> (<type>boolean</>)</term>
+ <listitem>
+ <para>
+ Enables functional dependencies for the statistics.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ </variablelist>
+
+ </refsect2>
+ </refsect1>
+
+ <refsect1 id="SQL-CREATESTATISTICS-notes">
+ <title>Notes</title>
+
+ <para>
+ ...
+ </para>
+
+ </refsect1>
+
+
+ <refsect1 id="SQL-CREATESTATISTICS-examples">
+ <title>Examples</title>
+
+ <para>
+ ...
+ </para>
+
+ </refsect1>
+
+ <refsect1>
+ <title>Compatibility</title>
+
+ <para>
+ There's no <command>CREATE STATISTICS</command> command in the SQL standard.
+ </para>
+ </refsect1>
+
+ <refsect1>
+ <title>See Also</title>
+
+ <simplelist type="inline">
+ <member><xref linkend="sql-dropstatistics"></member>
+ </simplelist>
+ </refsect1>
+</refentry>
diff --git a/doc/src/sgml/ref/drop_statistics.sgml b/doc/src/sgml/ref/drop_statistics.sgml
new file mode 100644
index 0000000..4cc0b70
--- /dev/null
+++ b/doc/src/sgml/ref/drop_statistics.sgml
@@ -0,0 +1,90 @@
+<!--
+doc/src/sgml/ref/drop_statistics.sgml
+PostgreSQL documentation
+-->
+
+<refentry id="SQL-DROPSTATISTICS">
+ <indexterm zone="sql-dropstatistics">
+ <primary>DROP STATISTICS</primary>
+ </indexterm>
+
+ <refmeta>
+ <refentrytitle>DROP STATISTICS</refentrytitle>
+ <manvolnum>7</manvolnum>
+ <refmiscinfo>SQL - Language Statements</refmiscinfo>
+ </refmeta>
+
+ <refnamediv>
+ <refname>DROP STATISTICS</refname>
+ <refpurpose>remove a statistics</refpurpose>
+ </refnamediv>
+
+ <refsynopsisdiv>
+<synopsis>
+DROP STATISTICS [ IF EXISTS ] <replaceable class="PARAMETER">name</replaceable> [, ...]
+</synopsis>
+ </refsynopsisdiv>
+
+ <refsect1>
+ <title>Description</title>
+
+ <para>
+ <command>DROP STATISTICS</command> removes statistics from the database.
+ Only the statistics owner, the schema owner, or a superuser can drop
+ a statistics.
+ </para>
+
+ </refsect1>
+
+ <refsect1>
+ <title>Parameters</title>
+
+ <variablelist>
+ <varlistentry>
+ <term><literal>IF EXISTS</literal></term>
+ <listitem>
+ <para>
+ Do not throw an error if the statistics does not exist. A notice is
+ issued in this case.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><replaceable class="PARAMETER">name</replaceable></term>
+ <listitem>
+ <para>
+ The name (optionally schema-qualified) of the statistics to drop.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ </variablelist>
+ </refsect1>
+
+ <refsect1>
+ <title>Examples</title>
+
+ <para>
+ ...
+ </para>
+
+ </refsect1>
+
+ <refsect1>
+ <title>Compatibility</title>
+
+ <para>
+ There's no <command>DROP STATISTICS</command> command in the SQL standard.
+ </para>
+ </refsect1>
+
+ <refsect1>
+ <title>See Also</title>
+
+ <simplelist type="inline">
+ <member><xref linkend="sql-createstatistics"></member>
+ </simplelist>
+ </refsect1>
+
+</refentry>
diff --git a/doc/src/sgml/reference.sgml b/doc/src/sgml/reference.sgml
index 03020df..2b07b2d 100644
--- a/doc/src/sgml/reference.sgml
+++ b/doc/src/sgml/reference.sgml
@@ -104,6 +104,7 @@
&createSchema;
&createSequence;
&createServer;
+ &createStatistics;
&createTable;
&createTableAs;
&createTableSpace;
@@ -147,6 +148,7 @@
&dropSchema;
&dropSequence;
&dropServer;
+ &dropStatistics;
&dropTable;
&dropTableSpace;
&dropTSConfig;
diff --git a/src/backend/catalog/Makefile b/src/backend/catalog/Makefile
index 25130ec..058b8a9 100644
--- a/src/backend/catalog/Makefile
+++ b/src/backend/catalog/Makefile
@@ -32,6 +32,7 @@ POSTGRES_BKI_SRCS = $(addprefix $(top_srcdir)/src/include/catalog/,\
pg_attrdef.h pg_constraint.h pg_inherits.h pg_index.h pg_operator.h \
pg_opfamily.h pg_opclass.h pg_am.h pg_amop.h pg_amproc.h \
pg_language.h pg_largeobject_metadata.h pg_largeobject.h pg_aggregate.h \
+ pg_mv_statistic.h \
pg_statistic.h pg_rewrite.h pg_trigger.h pg_event_trigger.h pg_description.h \
pg_cast.h pg_enum.h pg_namespace.h pg_conversion.h pg_depend.h \
pg_database.h pg_db_role_setting.h pg_tablespace.h pg_pltemplate.h \
diff --git a/src/backend/catalog/dependency.c b/src/backend/catalog/dependency.c
index d657c20..8b72d88 100644
--- a/src/backend/catalog/dependency.c
+++ b/src/backend/catalog/dependency.c
@@ -39,6 +39,7 @@
#include "catalog/pg_foreign_server.h"
#include "catalog/pg_language.h"
#include "catalog/pg_largeobject.h"
+#include "catalog/pg_mv_statistic.h"
#include "catalog/pg_namespace.h"
#include "catalog/pg_opclass.h"
#include "catalog/pg_operator.h"
@@ -159,7 +160,8 @@ static const Oid object_classes[] = {
ExtensionRelationId, /* OCLASS_EXTENSION */
EventTriggerRelationId, /* OCLASS_EVENT_TRIGGER */
PolicyRelationId, /* OCLASS_POLICY */
- TransformRelationId /* OCLASS_TRANSFORM */
+ TransformRelationId, /* OCLASS_TRANSFORM */
+ MvStatisticRelationId /* OCLASS_STATISTICS */
};
@@ -1271,6 +1273,10 @@ doDeletion(const ObjectAddress *object, int flags)
DropTransformById(object->objectId);
break;
+ case OCLASS_STATISTICS:
+ RemoveStatisticsById(object->objectId);
+ break;
+
default:
elog(ERROR, "unrecognized object class: %u",
object->classId);
@@ -2414,6 +2420,9 @@ getObjectClass(const ObjectAddress *object)
case TransformRelationId:
return OCLASS_TRANSFORM;
+
+ case MvStatisticRelationId:
+ return OCLASS_STATISTICS;
}
/* shouldn't get here */
diff --git a/src/backend/catalog/heap.c b/src/backend/catalog/heap.c
index d14cbb7..82c3632 100644
--- a/src/backend/catalog/heap.c
+++ b/src/backend/catalog/heap.c
@@ -46,6 +46,7 @@
#include "catalog/pg_constraint.h"
#include "catalog/pg_foreign_table.h"
#include "catalog/pg_inherits.h"
+#include "catalog/pg_mv_statistic.h"
#include "catalog/pg_namespace.h"
#include "catalog/pg_statistic.h"
#include "catalog/pg_tablespace.h"
@@ -1612,7 +1613,10 @@ RemoveAttributeById(Oid relid, AttrNumber attnum)
heap_close(attr_rel, RowExclusiveLock);
if (attnum > 0)
+ {
RemoveStatistics(relid, attnum);
+ RemoveMVStatistics(relid, attnum);
+ }
relation_close(rel, NoLock);
}
@@ -1840,6 +1844,11 @@ heap_drop_with_catalog(Oid relid)
RemoveStatistics(relid, 0);
/*
+ * delete multi-variate statistics
+ */
+ RemoveMVStatistics(relid, 0);
+
+ /*
* delete attribute tuples
*/
DeleteAttributeTuples(relid);
@@ -2695,6 +2704,99 @@ RemoveStatistics(Oid relid, AttrNumber attnum)
/*
+ * RemoveMVStatistics --- remove entries in pg_mv_statistic for a rel
+ *
+ * If attnum is zero, remove all entries for rel; else remove only the one(s)
+ * for that column.
+ */
+void
+RemoveMVStatistics(Oid relid, AttrNumber attnum)
+{
+ Relation pgmvstatistic;
+ TupleDesc tupdesc = NULL;
+ SysScanDesc scan;
+ ScanKeyData key;
+ HeapTuple tuple;
+
+ /*
+ * When dropping a column, we only drop statistics that would be left
+ * with fewer than two (undropped) columns. To check that, we need
+ * the tuple descriptor.
+ *
+ * We already have the relation locked (as we're running ALTER
+ * TABLE ... DROP COLUMN), so we'll just get the descriptor here.
+ */
+ if (attnum != 0)
+ {
+ Relation rel = relation_open(relid, NoLock);
+
+ /* multivariate stats are supported on tables and matviews */
+ if (rel->rd_rel->relkind == RELKIND_RELATION ||
+ rel->rd_rel->relkind == RELKIND_MATVIEW)
+ tupdesc = RelationGetDescr(rel);
+
+ relation_close(rel, NoLock);
+
+ /* bail out if we don't have a usable tuple descriptor */
+ if (tupdesc == NULL)
+ return;
+ }
+
+ pgmvstatistic = heap_open(MvStatisticRelationId, RowExclusiveLock);
+
+ ScanKeyInit(&key,
+ Anum_pg_mv_statistic_starelid,
+ BTEqualStrategyNumber, F_OIDEQ,
+ ObjectIdGetDatum(relid));
+
+ scan = systable_beginscan(pgmvstatistic,
+ MvStatisticRelidIndexId,
+ true, NULL, 1, &key);
+
+ /* we must loop even when attnum != 0, in case of inherited stats */
+ while (HeapTupleIsValid(tuple = systable_getnext(scan)))
+ {
+ bool delete = true;
+
+ if (attnum != 0)
+ {
+ Datum adatum;
+ bool isnull;
+ int i;
+ int ncolumns = 0;
+ ArrayType *arr;
+ int16 *attnums;
+
+ /* get the columns */
+ adatum = SysCacheGetAttr(MVSTATOID, tuple,
+ Anum_pg_mv_statistic_stakeys, &isnull);
+ Assert(!isnull);
+
+ arr = DatumGetArrayTypeP(adatum);
+ attnums = (int16*)ARR_DATA_PTR(arr);
+
+ for (i = 0; i < ARR_DIMS(arr)[0]; i++)
+ {
+ /* count the column unless it has been / is being dropped */
+ if ((! tupdesc->attrs[attnums[i]-1]->attisdropped) &&
+ (attnums[i] != attnum))
+ ncolumns += 1;
+ }
+
+ /* delete if fewer than two columns remain */
+ delete = (ncolumns < 2);
+ }
+
+ if (delete)
+ simple_heap_delete(pgmvstatistic, &tuple->t_self);
+ }
+
+ systable_endscan(scan);
+
+ heap_close(pgmvstatistic, RowExclusiveLock);
+}
+
+
+/*
* RelationTruncateIndexes - truncate all indexes associated
* with the heap relation to zero tuples.
*
diff --git a/src/backend/catalog/namespace.c b/src/backend/catalog/namespace.c
index 446b2ac..dfd5bef 100644
--- a/src/backend/catalog/namespace.c
+++ b/src/backend/catalog/namespace.c
@@ -4201,3 +4201,54 @@ pg_is_other_temp_schema(PG_FUNCTION_ARGS)
PG_RETURN_BOOL(isOtherTempNamespace(oid));
}
+
+Oid
+get_statistics_oid(List *names, bool missing_ok)
+{
+ char *schemaname;
+ char *stats_name;
+ Oid namespaceId;
+ Oid stats_oid = InvalidOid;
+ ListCell *l;
+
+ /* deconstruct the name list */
+ DeconstructQualifiedName(names, &schemaname, &stats_name);
+
+ if (schemaname)
+ {
+ /* use exact schema given */
+ namespaceId = LookupExplicitNamespace(schemaname, missing_ok);
+ if (missing_ok && !OidIsValid(namespaceId))
+ stats_oid = InvalidOid;
+ else
+ stats_oid = GetSysCacheOid2(MVSTATNAMENSP,
+ PointerGetDatum(stats_name),
+ ObjectIdGetDatum(namespaceId));
+ }
+ else
+ {
+ /* search for it in search path */
+ recomputeNamespacePath();
+
+ foreach(l, activeSearchPath)
+ {
+ namespaceId = lfirst_oid(l);
+
+ if (namespaceId == myTempNamespace)
+ continue; /* do not look in temp namespace */
+ stats_oid = GetSysCacheOid2(MVSTATNAMENSP,
+ PointerGetDatum(stats_name),
+ ObjectIdGetDatum(namespaceId));
+ if (OidIsValid(stats_oid))
+ break;
+ }
+ }
+
+ if (!OidIsValid(stats_oid) && !missing_ok)
+ ereport(ERROR,
+ (errcode(ERRCODE_UNDEFINED_OBJECT),
+ errmsg("statistics \"%s\" does not exist",
+ NameListToString(names))));
+
+ return stats_oid;
+}
diff --git a/src/backend/catalog/objectaddress.c b/src/backend/catalog/objectaddress.c
index 65cf3ed..080c33c 100644
--- a/src/backend/catalog/objectaddress.c
+++ b/src/backend/catalog/objectaddress.c
@@ -37,6 +37,7 @@
#include "catalog/pg_language.h"
#include "catalog/pg_largeobject.h"
#include "catalog/pg_largeobject_metadata.h"
+#include "catalog/pg_mv_statistic.h"
#include "catalog/pg_namespace.h"
#include "catalog/pg_opclass.h"
#include "catalog/pg_opfamily.h"
@@ -436,9 +437,22 @@ static const ObjectPropertyType ObjectProperty[] =
Anum_pg_type_typacl,
ACL_KIND_TYPE,
true
+ },
+ {
+ MvStatisticRelationId,
+ MvStatisticOidIndexId,
+ MVSTATOID,
+ MVSTATNAMENSP,
+ Anum_pg_mv_statistic_staname,
+ Anum_pg_mv_statistic_stanamespace,
+ InvalidAttrNumber, /* XXX same owner as relation */
+ InvalidAttrNumber, /* no ACL (same as relation) */
+ -1, /* no ACL */
+ true
}
};
+
/*
* This struct maps the string object types as returned by
* getObjectTypeDescription into ObjType enum values. Note that some enum
@@ -911,6 +925,11 @@ get_object_address(ObjectType objtype, List *objname, List *objargs,
address = get_object_address_defacl(objname, objargs,
missing_ok);
break;
+ case OBJECT_STATISTICS:
+ address.classId = MvStatisticRelationId;
+ address.objectId = get_statistics_oid(objname, missing_ok);
+ address.objectSubId = 0;
+ break;
default:
elog(ERROR, "unrecognized objtype: %d", (int) objtype);
/* placate compiler, in case it thinks elog might return */
@@ -2183,6 +2202,9 @@ check_object_ownership(Oid roleid, ObjectType objtype, ObjectAddress address,
(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
errmsg("must be superuser")));
break;
+ case OBJECT_STATISTICS:
+ /* FIXME do the right owner checks here */
+ break;
default:
elog(ERROR, "unrecognized object type: %d",
(int) objtype);
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 923fe58..2423985 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -158,6 +158,17 @@ CREATE VIEW pg_indexes AS
LEFT JOIN pg_tablespace T ON (T.oid = I.reltablespace)
WHERE C.relkind IN ('r', 'm') AND I.relkind = 'i';
+CREATE VIEW pg_mv_stats AS
+ SELECT
+ N.nspname AS schemaname,
+ C.relname AS tablename,
+ S.staname AS staname,
+ S.stakeys AS attnums,
+ length(S.stadeps) as depsbytes,
+ pg_mv_stats_dependencies_info(S.stadeps) as depsinfo
+ FROM (pg_mv_statistic S JOIN pg_class C ON (C.oid = S.starelid))
+ LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace);
+
CREATE VIEW pg_stats WITH (security_barrier) AS
SELECT
nspname AS schemaname,
diff --git a/src/backend/commands/Makefile b/src/backend/commands/Makefile
index b1ac704..5151001 100644
--- a/src/backend/commands/Makefile
+++ b/src/backend/commands/Makefile
@@ -18,8 +18,8 @@ OBJS = aggregatecmds.o alter.o analyze.o async.o cluster.o comment.o \
event_trigger.o explain.o extension.o foreigncmds.o functioncmds.o \
indexcmds.o lockcmds.o matview.o operatorcmds.o opclasscmds.o \
policy.o portalcmds.o prepare.o proclang.o \
- schemacmds.o seclabel.o sequence.o tablecmds.o tablespace.o trigger.o \
- tsearchcmds.o typecmds.o user.o vacuum.o vacuumlazy.o \
- variable.o view.o
+ schemacmds.o seclabel.o sequence.o statscmds.o \
+ tablecmds.o tablespace.o trigger.o tsearchcmds.o typecmds.o \
+ user.o vacuum.o vacuumlazy.o variable.o view.o
include $(top_srcdir)/src/backend/common.mk
diff --git a/src/backend/commands/analyze.c b/src/backend/commands/analyze.c
index 070df29..cbaa4e1 100644
--- a/src/backend/commands/analyze.c
+++ b/src/backend/commands/analyze.c
@@ -27,6 +27,7 @@
#include "catalog/indexing.h"
#include "catalog/pg_collation.h"
#include "catalog/pg_inherits_fn.h"
+#include "catalog/pg_mv_statistic.h"
#include "catalog/pg_namespace.h"
#include "commands/dbcommands.h"
#include "commands/tablecmds.h"
@@ -55,7 +56,11 @@
#include "utils/syscache.h"
#include "utils/timestamp.h"
#include "utils/tqual.h"
+#include "utils/fmgroids.h"
+#include "utils/builtins.h"
+#include "utils/mvstats.h"
+#include "access/sysattr.h"
/* Per-index data for ANALYZE */
typedef struct AnlIndexData
@@ -460,6 +465,19 @@ do_analyze_rel(Relation onerel, int options, VacuumParams *params,
* all analyzable columns. We use a lower bound of 100 rows to avoid
* possible overflow in Vitter's algorithm. (Note: that will also be the
* target in the corner case where there are no analyzable columns.)
+ *
+ * FIXME This sample sizing is mostly OK when computing stats for
+ * individual columns, but it's rather insufficient for multivariate
+ * stats (histograms, MCV lists, ...). For stats on multiple columns
+ * we need larger samples, because we need to build more detailed
+ * stats (more MCV items / histogram buckets) to get good accuracy.
+ * Maybe using a sample proportional to the table size (say,
+ * 0.5% - 1%) instead of a fixed size would be more appropriate.
+ * Also, the sample size should be bound to the requested statistics
+ * size - e.g. the number of MCV items or histogram buckets should
+ * require several sample rows per item/bucket (so the sample
+ * should be k*size).
targrows = 100;
for (i = 0; i < attr_cnt; i++)
@@ -562,6 +580,9 @@ do_analyze_rel(Relation onerel, int options, VacuumParams *params,
update_attstats(RelationGetRelid(Irel[ind]), false,
thisdata->attr_cnt, thisdata->vacattrstats);
}
+
+ /* Build multivariate stats (if there are any). */
+ build_mv_stats(onerel, numrows, rows, attr_cnt, vacattrstats);
}
/*
diff --git a/src/backend/commands/dropcmds.c b/src/backend/commands/dropcmds.c
index 522027a..cd65b58 100644
--- a/src/backend/commands/dropcmds.c
+++ b/src/backend/commands/dropcmds.c
@@ -292,6 +292,10 @@ does_not_exist_skipping(ObjectType objtype, List *objname, List *objargs)
msg = gettext_noop("schema \"%s\" does not exist, skipping");
name = NameListToString(objname);
break;
+ case OBJECT_STATISTICS:
+ msg = gettext_noop("statistics \"%s\" does not exist, skipping");
+ name = NameListToString(objname);
+ break;
case OBJECT_TSPARSER:
if (!schema_does_not_exist_skipping(objname, &msg, &name))
{
diff --git a/src/backend/commands/event_trigger.c b/src/backend/commands/event_trigger.c
index 9e32f8d..09061bb 100644
--- a/src/backend/commands/event_trigger.c
+++ b/src/backend/commands/event_trigger.c
@@ -110,6 +110,7 @@ static event_trigger_support_data event_trigger_support[] = {
{"SCHEMA", true},
{"SEQUENCE", true},
{"SERVER", true},
+ {"STATISTICS", true},
{"TABLE", true},
{"TABLESPACE", false},
{"TRANSFORM", true},
@@ -1106,6 +1107,7 @@ EventTriggerSupportsObjectType(ObjectType obtype)
case OBJECT_RULE:
case OBJECT_SCHEMA:
case OBJECT_SEQUENCE:
+ case OBJECT_STATISTICS:
case OBJECT_TABCONSTRAINT:
case OBJECT_TABLE:
case OBJECT_TRANSFORM:
@@ -1167,6 +1169,7 @@ EventTriggerSupportsObjectClass(ObjectClass objclass)
case OCLASS_DEFACL:
case OCLASS_EXTENSION:
case OCLASS_POLICY:
+ case OCLASS_STATISTICS:
return true;
}
diff --git a/src/backend/commands/statscmds.c b/src/backend/commands/statscmds.c
new file mode 100644
index 0000000..84a8b13
--- /dev/null
+++ b/src/backend/commands/statscmds.c
@@ -0,0 +1,331 @@
+/*-------------------------------------------------------------------------
+ *
+ * statscmds.c
+ * Commands for creating and altering multivariate statistics
+ *
+ * Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ * src/backend/commands/statscmds.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/genam.h"
+#include "access/heapam.h"
+#include "access/multixact.h"
+#include "access/reloptions.h"
+#include "access/relscan.h"
+#include "access/sysattr.h"
+#include "access/xact.h"
+#include "access/xlog.h"
+#include "catalog/catalog.h"
+#include "catalog/dependency.h"
+#include "catalog/heap.h"
+#include "catalog/index.h"
+#include "catalog/indexing.h"
+#include "catalog/namespace.h"
+#include "catalog/objectaccess.h"
+#include "catalog/pg_collation.h"
+#include "catalog/pg_constraint.h"
+#include "catalog/pg_depend.h"
+#include "catalog/pg_foreign_table.h"
+#include "catalog/pg_inherits.h"
+#include "catalog/pg_inherits_fn.h"
+#include "catalog/pg_mv_statistic.h"
+#include "catalog/pg_namespace.h"
+#include "catalog/pg_opclass.h"
+#include "catalog/pg_tablespace.h"
+#include "catalog/pg_trigger.h"
+#include "catalog/pg_type.h"
+#include "catalog/pg_type_fn.h"
+#include "catalog/storage.h"
+#include "catalog/toasting.h"
+#include "commands/cluster.h"
+#include "commands/comment.h"
+#include "commands/defrem.h"
+#include "commands/event_trigger.h"
+#include "commands/policy.h"
+#include "commands/sequence.h"
+#include "commands/tablecmds.h"
+#include "commands/tablespace.h"
+#include "commands/trigger.h"
+#include "commands/typecmds.h"
+#include "commands/user.h"
+#include "executor/executor.h"
+#include "foreign/foreign.h"
+#include "miscadmin.h"
+#include "nodes/makefuncs.h"
+#include "nodes/nodeFuncs.h"
+#include "nodes/parsenodes.h"
+#include "optimizer/clauses.h"
+#include "optimizer/planner.h"
+#include "parser/parse_clause.h"
+#include "parser/parse_coerce.h"
+#include "parser/parse_collate.h"
+#include "parser/parse_expr.h"
+#include "parser/parse_oper.h"
+#include "parser/parse_relation.h"
+#include "parser/parse_type.h"
+#include "parser/parse_utilcmd.h"
+#include "parser/parser.h"
+#include "pgstat.h"
+#include "rewrite/rewriteDefine.h"
+#include "rewrite/rewriteHandler.h"
+#include "rewrite/rewriteManip.h"
+#include "storage/bufmgr.h"
+#include "storage/lmgr.h"
+#include "storage/lock.h"
+#include "storage/predicate.h"
+#include "storage/smgr.h"
+#include "utils/acl.h"
+#include "utils/builtins.h"
+#include "utils/fmgroids.h"
+#include "utils/inval.h"
+#include "utils/lsyscache.h"
+#include "utils/memutils.h"
+#include "utils/relcache.h"
+#include "utils/ruleutils.h"
+#include "utils/snapmgr.h"
+#include "utils/syscache.h"
+#include "utils/tqual.h"
+#include "utils/typcache.h"
+#include "utils/mvstats.h"
+
+
+/* used for sorting the attnums in CreateStatistics */
+static int compare_int16(const void *a, const void *b)
+{
+ return memcmp(a, b, sizeof(int16));
+}
+
+/*
+ * Implements the CREATE STATISTICS name ON table (columns) WITH (options)
+ *
+ * TODO Check that the types support sort, although maybe we can live
+ * without it (and only build MCV list / association rules).
+ *
+ * TODO This should probably check for duplicate stats (i.e. same
+ * keys, same options). Although maybe it's useful to have
+ * multiple stats on the same columns with different options
+ * (say, a detailed MCV-only stats for some queries, histogram
+ * for others, etc.)
+ */
+ObjectAddress
+CreateStatistics(CreateStatsStmt *stmt)
+{
+ int i, j;
+ ListCell *l;
+ int16 attnums[INDEX_MAX_KEYS];
+ int numcols = 0;
+ ObjectAddress address = InvalidObjectAddress;
+ char *namestr;
+ NameData staname;
+ Oid statoid;
+ Oid namespaceId;
+
+ HeapTuple htup;
+ Datum values[Natts_pg_mv_statistic];
+ bool nulls[Natts_pg_mv_statistic];
+ int2vector *stakeys;
+ Relation mvstatrel;
+ Relation rel;
+ ObjectAddress parentobject, childobject;
+
+ /* by default build nothing */
+ bool build_dependencies = false;
+
+ Assert(IsA(stmt, CreateStatsStmt));
+
+ /* resolve the pieces of the name (namespace etc.) */
+ namespaceId = QualifiedNameGetCreationNamespace(stmt->defnames, &namestr);
+ namestrcpy(&staname, namestr);
+
+ /*
+ * If if_not_exists was given and the statistics already exists, bail out.
+ */
+ if (stmt->if_not_exists &&
+ SearchSysCacheExists2(MVSTATNAMENSP,
+ PointerGetDatum(&staname),
+ ObjectIdGetDatum(namespaceId)))
+ {
+ ereport(NOTICE,
+ (errcode(ERRCODE_DUPLICATE_OBJECT),
+ errmsg("statistics \"%s\" already exists, skipping",
+ namestr)));
+ return InvalidObjectAddress;
+ }
+
+ rel = heap_openrv(stmt->relation, AccessExclusiveLock);
+
+ /* transform the column names to attnum values */
+
+ foreach(l, stmt->keys)
+ {
+ char *attname = strVal(lfirst(l));
+ HeapTuple atttuple;
+
+ atttuple = SearchSysCacheAttName(RelationGetRelid(rel), attname);
+
+ if (!HeapTupleIsValid(atttuple))
+ ereport(ERROR,
+ (errcode(ERRCODE_UNDEFINED_COLUMN),
+ errmsg("column \"%s\" referenced in statistics does not exist",
+ attname)));
+
+ /* more than MVHIST_MAX_DIMENSIONS columns not allowed */
+ if (numcols >= MVSTATS_MAX_DIMENSIONS)
+ ereport(ERROR,
+ (errcode(ERRCODE_TOO_MANY_COLUMNS),
+ errmsg("cannot have more than %d keys in a statistics",
+ MVSTATS_MAX_DIMENSIONS)));
+
+ attnums[numcols] = ((Form_pg_attribute) GETSTRUCT(atttuple))->attnum;
+ ReleaseSysCache(atttuple);
+ numcols++;
+ }
+
+ /*
+ * Check the lower bound (at least 2 columns), the upper bound was
+ * already checked in the loop.
+ */
+ if (numcols < 2)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
+ errmsg("multivariate stats require 2 or more columns")));
+
+ /* look for duplicate columns */
+ for (i = 0; i < numcols; i++)
+ for (j = 0; j < numcols; j++)
+ if ((i != j) && (attnums[i] == attnums[j]))
+ ereport(ERROR,
+ (errcode(ERRCODE_DUPLICATE_COLUMN),
+ errmsg("duplicate column name in statistics definition")));
+
+ /* parse the statistics options */
+ foreach (l, stmt->options)
+ {
+ DefElem *opt = (DefElem*)lfirst(l);
+
+ if (strcmp(opt->defname, "dependencies") == 0)
+ build_dependencies = defGetBoolean(opt);
+ else
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("unrecognized STATISTICS option \"%s\"",
+ opt->defname)));
+ }
+
+ /* check that at least some statistics were requested */
+ if (! build_dependencies)
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("no statistics type (dependencies) was requested")));
+
+ /* sort the attnums and build int2vector */
+ qsort(attnums, numcols, sizeof(int16), compare_int16);
+ stakeys = buildint2vector(attnums, numcols);
+
+ /*
+ * Okay, let's create the pg_mv_statistic entry.
+ */
+ memset(values, 0, sizeof(values));
+ memset(nulls, false, sizeof(nulls));
+
+ /* no stats collected yet, so just the keys */
+ values[Anum_pg_mv_statistic_starelid-1] = ObjectIdGetDatum(RelationGetRelid(rel));
+ values[Anum_pg_mv_statistic_staname -1] = NameGetDatum(&staname);
+ values[Anum_pg_mv_statistic_stanamespace -1] = ObjectIdGetDatum(namespaceId);
+
+ values[Anum_pg_mv_statistic_stakeys -1] = PointerGetDatum(stakeys);
+
+ values[Anum_pg_mv_statistic_deps_enabled -1] = BoolGetDatum(build_dependencies);
+
+ nulls[Anum_pg_mv_statistic_stadeps -1] = true;
+
+ /* insert the tuple into pg_mv_statistic */
+ mvstatrel = heap_open(MvStatisticRelationId, RowExclusiveLock);
+
+ htup = heap_form_tuple(mvstatrel->rd_att, values, nulls);
+
+ simple_heap_insert(mvstatrel, htup);
+
+ CatalogUpdateIndexes(mvstatrel, htup);
+
+ statoid = HeapTupleGetOid(htup);
+
+ heap_freetuple(htup);
+
+
+ /*
+ * Store a dependency too, so that statistics are dropped on DROP TABLE
+ */
+ parentobject.classId = RelationRelationId;
+ parentobject.objectId = RelationGetRelid(rel);
+ parentobject.objectSubId = 0;
+ childobject.classId = MvStatisticRelationId;
+ childobject.objectId = statoid;
+ childobject.objectSubId = 0;
+
+ recordDependencyOn(&childobject, &parentobject, DEPENDENCY_AUTO);
+
+ /*
+ * Also record dependency on the schema (to drop statistics on DROP SCHEMA)
+ */
+ parentobject.classId = NamespaceRelationId;
+ parentobject.objectId = namespaceId;
+ parentobject.objectSubId = 0;
+ childobject.classId = MvStatisticRelationId;
+ childobject.objectId = statoid;
+ childobject.objectSubId = 0;
+
+ recordDependencyOn(&childobject, &parentobject, DEPENDENCY_AUTO);
+
+
+ heap_close(mvstatrel, RowExclusiveLock);
+
+ relation_close(rel, NoLock);
+
+ /*
+ * Invalidate relcache so that others see the new statistics.
+ */
+ CacheInvalidateRelcache(rel);
+
+ ObjectAddressSet(address, MvStatisticRelationId, statoid);
+
+ return address;
+}
+
+
+/*
+ * RemoveStatisticsById
+ *		Remove a statistics entry from pg_mv_statistic by OID.
+ *
+ * This implements the guts of DROP STATISTICS, and is also invoked
+ * by the dependency machinery (e.g. on DROP TABLE or DROP SCHEMA).
+ */
+void
+RemoveStatisticsById(Oid statsOid)
+{
+ Relation relation;
+ HeapTuple tup;
+
+ /*
+ * Delete the pg_mv_statistic tuple.
+ */
+ relation = heap_open(MvStatisticRelationId, RowExclusiveLock);
+
+ tup = SearchSysCache1(MVSTATOID, ObjectIdGetDatum(statsOid));
+ if (!HeapTupleIsValid(tup)) /* should not happen */
+ elog(ERROR, "cache lookup failed for statistics %u", statsOid);
+
+ simple_heap_delete(relation, &tup->t_self);
+
+ ReleaseSysCache(tup);
+
+ heap_close(relation, RowExclusiveLock);
+}
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 0b4a334..5f4220d 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -35,6 +35,7 @@
#include "catalog/pg_foreign_table.h"
#include "catalog/pg_inherits.h"
#include "catalog/pg_inherits_fn.h"
+#include "catalog/pg_mv_statistic.h"
#include "catalog/pg_namespace.h"
#include "catalog/pg_opclass.h"
#include "catalog/pg_tablespace.h"
@@ -93,7 +94,7 @@
#include "utils/syscache.h"
#include "utils/tqual.h"
#include "utils/typcache.h"
-
+#include "utils/mvstats.h"
/*
* ON COMMIT action list
@@ -141,8 +142,9 @@ static List *on_commits = NIL;
#define AT_PASS_ADD_COL 5 /* ADD COLUMN */
#define AT_PASS_ADD_INDEX 6 /* ADD indexes */
#define AT_PASS_ADD_CONSTR 7 /* ADD constraints, defaults */
-#define AT_PASS_MISC 8 /* other stuff */
-#define AT_NUM_PASSES 9
+#define AT_PASS_ADD_STATS 8 /* ADD statistics */
+#define AT_PASS_MISC 9 /* other stuff */
+#define AT_NUM_PASSES 10
typedef struct AlteredTableInfo
{
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index f47e0da..407b4a0 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -4119,6 +4119,19 @@ _copyAlterPolicyStmt(const AlterPolicyStmt *from)
return newnode;
}
+static CreateStatsStmt *
+_copyCreateStatsStmt(const CreateStatsStmt *from)
+{
+ CreateStatsStmt *newnode = makeNode(CreateStatsStmt);
+
+ COPY_NODE_FIELD(defnames);
+ COPY_NODE_FIELD(relation);
+ COPY_NODE_FIELD(keys);
+ COPY_NODE_FIELD(options);
+
+ return newnode;
+}
+
/* ****************************************************************
* pg_list.h copy functions
* ****************************************************************
@@ -4966,6 +4979,9 @@ copyObject(const void *from)
case T_CommonTableExpr:
retval = _copyCommonTableExpr(from);
break;
+ case T_CreateStatsStmt:
+ retval = _copyCreateStatsStmt(from);
+ break;
case T_FuncWithArgs:
retval = _copyFuncWithArgs(from);
break;
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index d95e151..5ecc9ef 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -1939,6 +1939,21 @@ _outIndexOptInfo(StringInfo str, const IndexOptInfo *node)
}
static void
+_outMVStatisticInfo(StringInfo str, const MVStatisticInfo *node)
+{
+ WRITE_NODE_TYPE("MVSTATISTICINFO");
+
+ /* NB: this isn't a complete set of fields */
+ WRITE_OID_FIELD(mvoid);
+
+ /* enabled statistics */
+ WRITE_BOOL_FIELD(deps_enabled);
+
+ /* built/available statistics */
+ WRITE_BOOL_FIELD(deps_built);
+}
+
+static void
_outEquivalenceClass(StringInfo str, const EquivalenceClass *node)
{
/*
@@ -3359,6 +3374,9 @@ _outNode(StringInfo str, const void *obj)
case T_PlannerParamItem:
_outPlannerParamItem(str, obj);
break;
+ case T_MVStatisticInfo:
+ _outMVStatisticInfo(str, obj);
+ break;
case T_CreateStmt:
_outCreateStmt(str, obj);
diff --git a/src/backend/optimizer/util/plancat.c b/src/backend/optimizer/util/plancat.c
index d5528e0..83bd85c 100644
--- a/src/backend/optimizer/util/plancat.c
+++ b/src/backend/optimizer/util/plancat.c
@@ -27,6 +27,7 @@
#include "catalog/catalog.h"
#include "catalog/dependency.h"
#include "catalog/heap.h"
+#include "catalog/pg_mv_statistic.h"
#include "foreign/fdwapi.h"
#include "miscadmin.h"
#include "nodes/makefuncs.h"
@@ -39,7 +40,9 @@
#include "parser/parsetree.h"
#include "rewrite/rewriteManip.h"
#include "storage/bufmgr.h"
+#include "utils/builtins.h"
#include "utils/lsyscache.h"
+#include "utils/syscache.h"
#include "utils/rel.h"
#include "utils/snapmgr.h"
@@ -93,6 +96,7 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
Relation relation;
bool hasindex;
List *indexinfos = NIL;
+ List *stainfos = NIL;
/*
* We need not lock the relation since it was already locked, either by
@@ -381,6 +385,65 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
rel->indexlist = indexinfos;
+ /* Load the multivariate statistics defined on this relation. */
+ {
+ List *mvstatoidlist;
+ ListCell *l;
+
+ mvstatoidlist = RelationGetMVStatList(relation);
+
+ foreach(l, mvstatoidlist)
+ {
+ ArrayType *arr;
+ Datum adatum;
+ bool isnull;
+ Oid mvoid = lfirst_oid(l);
+ Form_pg_mv_statistic mvstat;
+ MVStatisticInfo *info;
+
+ HeapTuple htup = SearchSysCache1(MVSTATOID, ObjectIdGetDatum(mvoid));
+
+ /* XXX syscache contains OIDs of deleted stats (not invalidated) */
+ if (! HeapTupleIsValid(htup))
+ continue;
+
+ mvstat = (Form_pg_mv_statistic) GETSTRUCT(htup);
+
+ /* unavailable stats are not interesting for the planner */
+ if (mvstat->deps_built)
+ {
+ info = makeNode(MVStatisticInfo);
+
+ info->mvoid = mvoid;
+ info->rel = rel;
+
+ /* enabled statistics */
+ info->deps_enabled = mvstat->deps_enabled;
+
+ /* built/available statistics */
+ info->deps_built = mvstat->deps_built;
+
+ /* stakeys */
+ adatum = SysCacheGetAttr(MVSTATOID, htup,
+ Anum_pg_mv_statistic_stakeys, &isnull);
+ Assert(!isnull);
+
+ arr = DatumGetArrayTypeP(adatum);
+
+ info->stakeys = buildint2vector((int16 *) ARR_DATA_PTR(arr),
+ ARR_DIMS(arr)[0]);
+
+ stainfos = lcons(info, stainfos);
+ }
+
+ ReleaseSysCache(htup);
+ }
+
+ list_free(mvstatoidlist);
+ }
+
+ rel->mvstatlist = stainfos;
+
/* Grab foreign-table info using the relcache, while we have it */
if (relation->rd_rel->relkind == RELKIND_FOREIGN_TABLE)
{
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index b307b48..3be3f02 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -241,7 +241,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
ConstraintsSetStmt CopyStmt CreateAsStmt CreateCastStmt
CreateDomainStmt CreateExtensionStmt CreateGroupStmt CreateOpClassStmt
CreateOpFamilyStmt AlterOpFamilyStmt CreatePLangStmt
- CreateSchemaStmt CreateSeqStmt CreateStmt CreateTableSpaceStmt
+ CreateSchemaStmt CreateSeqStmt CreateStmt CreateStatsStmt CreateTableSpaceStmt
CreateFdwStmt CreateForeignServerStmt CreateForeignTableStmt
CreateAssertStmt CreateTransformStmt CreateTrigStmt CreateEventTrigStmt
CreateUserStmt CreateUserMappingStmt CreateRoleStmt CreatePolicyStmt
@@ -809,6 +809,7 @@ stmt :
| CreateSchemaStmt
| CreateSeqStmt
| CreateStmt
+ | CreateStatsStmt
| CreateTableSpaceStmt
| CreateTransformStmt
| CreateTrigStmt
@@ -3436,6 +3437,36 @@ OptConsTableSpace: USING INDEX TABLESPACE name { $$ = $4; }
ExistingIndex: USING INDEX index_name { $$ = $3; }
;
+/*****************************************************************************
+ *
+ * QUERY :
+ * CREATE STATISTICS stats_name ON relname (columns) WITH (options)
+ *
+ *****************************************************************************/
+
+
+CreateStatsStmt: CREATE STATISTICS any_name ON qualified_name '(' columnList ')' opt_reloptions
+ {
+ CreateStatsStmt *n = makeNode(CreateStatsStmt);
+ n->defnames = $3;
+ n->relation = $5;
+ n->keys = $7;
+ n->options = $9;
+ n->if_not_exists = false;
+ $$ = (Node *)n;
+ }
+ | CREATE STATISTICS IF_P NOT EXISTS any_name ON qualified_name '(' columnList ')' opt_reloptions
+ {
+ CreateStatsStmt *n = makeNode(CreateStatsStmt);
+ n->defnames = $6;
+ n->relation = $8;
+ n->keys = $10;
+ n->options = $12;
+ n->if_not_exists = true;
+ $$ = (Node *)n;
+ }
+ ;
+
/*****************************************************************************
*
@@ -5621,6 +5652,7 @@ drop_type: TABLE { $$ = OBJECT_TABLE; }
| TEXT_P SEARCH DICTIONARY { $$ = OBJECT_TSDICTIONARY; }
| TEXT_P SEARCH TEMPLATE { $$ = OBJECT_TSTEMPLATE; }
| TEXT_P SEARCH CONFIGURATION { $$ = OBJECT_TSCONFIGURATION; }
+ | STATISTICS { $$ = OBJECT_STATISTICS; }
;
any_name_list:
diff --git a/src/backend/tcop/utility.c b/src/backend/tcop/utility.c
index 045f7f0..2ba88e2 100644
--- a/src/backend/tcop/utility.c
+++ b/src/backend/tcop/utility.c
@@ -1520,6 +1520,10 @@ ProcessUtilitySlow(Node *parsetree,
address = ExecSecLabelStmt((SecLabelStmt *) parsetree);
break;
+ case T_CreateStatsStmt: /* CREATE STATISTICS */
+ address = CreateStatistics((CreateStatsStmt *) parsetree);
+ break;
+
default:
elog(ERROR, "unrecognized node type: %d",
(int) nodeTag(parsetree));
@@ -2160,6 +2164,9 @@ CreateCommandTag(Node *parsetree)
case OBJECT_TRANSFORM:
tag = "DROP TRANSFORM";
break;
+ case OBJECT_STATISTICS:
+ tag = "DROP STATISTICS";
+ break;
default:
tag = "???";
}
@@ -2527,6 +2534,10 @@ CreateCommandTag(Node *parsetree)
tag = "EXECUTE";
break;
+ case T_CreateStatsStmt:
+ tag = "CREATE STATISTICS";
+ break;
+
case T_DeallocateStmt:
{
DeallocateStmt *stmt = (DeallocateStmt *) parsetree;
diff --git a/src/backend/utils/Makefile b/src/backend/utils/Makefile
index 8374533..eba0352 100644
--- a/src/backend/utils/Makefile
+++ b/src/backend/utils/Makefile
@@ -9,7 +9,7 @@ top_builddir = ../../..
include $(top_builddir)/src/Makefile.global
OBJS = fmgrtab.o
-SUBDIRS = adt cache error fmgr hash init mb misc mmgr resowner sort time
+SUBDIRS = adt cache error fmgr hash init mb misc mmgr mvstats resowner sort time
# location of Catalog.pm
catalogdir = $(top_srcdir)/src/backend/catalog
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index fc5b9d9..1e41cac 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -47,6 +47,7 @@
#include "catalog/pg_auth_members.h"
#include "catalog/pg_constraint.h"
#include "catalog/pg_database.h"
+#include "catalog/pg_mv_statistic.h"
#include "catalog/pg_namespace.h"
#include "catalog/pg_opclass.h"
#include "catalog/pg_proc.h"
@@ -3930,6 +3931,62 @@ RelationGetIndexList(Relation relation)
return result;
}
+
+List *
+RelationGetMVStatList(Relation relation)
+{
+ Relation indrel;
+ SysScanDesc indscan;
+ ScanKeyData skey;
+ HeapTuple htup;
+ List *result;
+ List *oldlist;
+ MemoryContext oldcxt;
+
+ /* Quick exit if we already computed the list. */
+ if (relation->rd_mvstatvalid != 0)
+ return list_copy(relation->rd_mvstatlist);
+
+ /*
+ * We build the list we intend to return (in the caller's context) while
+ * doing the scan. After successfully completing the scan, we copy that
+ * list into the relcache entry. This avoids cache-context memory leakage
+ * if we get some sort of error partway through.
+ */
+ result = NIL;
+
+ /* Prepare to scan pg_mv_statistic for entries having starelid = this rel. */
+ ScanKeyInit(&skey,
+ Anum_pg_mv_statistic_starelid,
+ BTEqualStrategyNumber, F_OIDEQ,
+ ObjectIdGetDatum(RelationGetRelid(relation)));
+
+ indrel = heap_open(MvStatisticRelationId, AccessShareLock);
+ indscan = systable_beginscan(indrel, MvStatisticRelidIndexId, true,
+ NULL, 1, &skey);
+
+ while (HeapTupleIsValid(htup = systable_getnext(indscan)))
+ /* TODO maybe include only already built statistics? */
+ result = insert_ordered_oid(result, HeapTupleGetOid(htup));
+
+ systable_endscan(indscan);
+
+ heap_close(indrel, AccessShareLock);
+
+ /* Now save a copy of the completed list in the relcache entry. */
+ oldcxt = MemoryContextSwitchTo(CacheMemoryContext);
+ oldlist = relation->rd_mvstatlist;
+ relation->rd_mvstatlist = list_copy(result);
+
+ relation->rd_mvstatvalid = true;
+ MemoryContextSwitchTo(oldcxt);
+
+ /* Don't leak the old list, if there is one */
+ list_free(oldlist);
+
+ return result;
+}
+
/*
* insert_ordered_oid
* Insert a new Oid into a sorted list of Oids, preserving ordering
@@ -4899,6 +4956,8 @@ load_relcache_init_file(bool shared)
rel->rd_indexattr = NULL;
rel->rd_keyattr = NULL;
rel->rd_idattr = NULL;
+ rel->rd_mvstatvalid = false;
+ rel->rd_mvstatlist = NIL;
rel->rd_createSubid = InvalidSubTransactionId;
rel->rd_newRelfilenodeSubid = InvalidSubTransactionId;
rel->rd_amcache = NULL;
diff --git a/src/backend/utils/cache/syscache.c b/src/backend/utils/cache/syscache.c
index 6eb2ac6..0331da7 100644
--- a/src/backend/utils/cache/syscache.c
+++ b/src/backend/utils/cache/syscache.c
@@ -43,6 +43,7 @@
#include "catalog/pg_foreign_server.h"
#include "catalog/pg_foreign_table.h"
#include "catalog/pg_language.h"
+#include "catalog/pg_mv_statistic.h"
#include "catalog/pg_namespace.h"
#include "catalog/pg_opclass.h"
#include "catalog/pg_operator.h"
@@ -501,6 +502,28 @@ static const struct cachedesc cacheinfo[] = {
},
4
},
+ {MvStatisticRelationId, /* MVSTATNAMENSP */
+ MvStatisticNameIndexId,
+ 2,
+ {
+ Anum_pg_mv_statistic_staname,
+ Anum_pg_mv_statistic_stanamespace,
+ 0,
+ 0
+ },
+ 4
+ },
+ {MvStatisticRelationId, /* MVSTATOID */
+ MvStatisticOidIndexId,
+ 1,
+ {
+ ObjectIdAttributeNumber,
+ 0,
+ 0,
+ 0
+ },
+ 4
+ },
{NamespaceRelationId, /* NAMESPACENAME */
NamespaceNameIndexId,
1,
diff --git a/src/backend/utils/mvstats/Makefile b/src/backend/utils/mvstats/Makefile
new file mode 100644
index 0000000..099f1ed
--- /dev/null
+++ b/src/backend/utils/mvstats/Makefile
@@ -0,0 +1,17 @@
+#-------------------------------------------------------------------------
+#
+# Makefile--
+# Makefile for utils/mvstats
+#
+# IDENTIFICATION
+# src/backend/utils/mvstats/Makefile
+#
+#-------------------------------------------------------------------------
+
+subdir = src/backend/utils/mvstats
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+
+OBJS = common.o dependencies.o
+
+include $(top_srcdir)/src/backend/common.mk
diff --git a/src/backend/utils/mvstats/common.c b/src/backend/utils/mvstats/common.c
new file mode 100644
index 0000000..a755c49
--- /dev/null
+++ b/src/backend/utils/mvstats/common.c
@@ -0,0 +1,356 @@
+/*-------------------------------------------------------------------------
+ *
+ * common.c
+ * POSTGRES multivariate statistics
+ *
+ *
+ * Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ * src/backend/utils/mvstats/common.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "common.h"
+
+static VacAttrStats ** lookup_var_attr_stats(int2vector *attrs,
+ int natts, VacAttrStats **vacattrstats);
+
+static List* list_mv_stats(Oid relid);
+
+
+/*
+ * Compute requested multivariate stats, using the rows sampled for the
+ * plain (single-column) stats.
+ *
+ * This fetches a list of stats from pg_mv_statistic, computes the stats
+ * and serializes them back into the catalog (as bytea values).
+ */
+void
+build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
+ int natts, VacAttrStats **vacattrstats)
+{
+ ListCell *lc;
+ List *mvstats;
+
+ TupleDesc tupdesc = RelationGetDescr(onerel);
+
+ /*
+ * Fetch defined MV groups from pg_mv_statistic, and then compute
+ * the MV statistics (functional dependencies for now).
+ */
+ mvstats = list_mv_stats(RelationGetRelid(onerel));
+
+ foreach (lc, mvstats)
+ {
+ int j;
+ MVStatisticInfo *stat = (MVStatisticInfo *)lfirst(lc);
+ MVDependencies deps = NULL;
+
+ VacAttrStats **stats = NULL;
+ int numatts = 0;
+
+ /* int2 vector of attnums the stats should be computed on */
+ int2vector * attrs = stat->stakeys;
+
+ /* see how many of the columns are not dropped */
+ for (j = 0; j < attrs->dim1; j++)
+ if (! tupdesc->attrs[attrs->values[j]-1]->attisdropped)
+ numatts += 1;
+
+ /* if there are dropped attributes, build a filtered int2vector */
+ if (numatts != attrs->dim1)
+ {
+ int16 *tmp = palloc0(numatts * sizeof(int16));
+ int attnum = 0;
+
+ for (j = 0; j < attrs->dim1; j++)
+ if (! tupdesc->attrs[attrs->values[j]-1]->attisdropped)
+ tmp[attnum++] = attrs->values[j];
+
+ pfree(attrs);
+ attrs = buildint2vector(tmp, numatts);
+ }
+
+ /* filter only the interesting vacattrstats records */
+ stats = lookup_var_attr_stats(attrs, natts, vacattrstats);
+
+ /* check allowed number of dimensions */
+ Assert((attrs->dim1 >= 2) && (attrs->dim1 <= MVSTATS_MAX_DIMENSIONS));
+
+ /*
+ * Analyze functional dependencies of columns.
+ */
+ deps = build_mv_dependencies(numrows, rows, attrs, stats);
+
+ /* store the functional dependencies in the catalog */
+ update_mv_stats(stat->mvoid, deps, attrs);
+ }
+}
+
+/*
+ * Lookup the VacAttrStats info for the selected columns, with indexes
+ * matching the attrs vector (to make it easy to work with when
+ * computing multivariate stats).
+ */
+static VacAttrStats **
+lookup_var_attr_stats(int2vector *attrs, int natts, VacAttrStats **vacattrstats)
+{
+ int i, j;
+ int numattrs = attrs->dim1;
+ VacAttrStats **stats = (VacAttrStats**)palloc0(numattrs * sizeof(VacAttrStats*));
+
+ /* lookup VacAttrStats info for the requested columns (same attnum) */
+ for (i = 0; i < numattrs; i++)
+ {
+ stats[i] = NULL;
+ for (j = 0; j < natts; j++)
+ {
+ if (attrs->values[i] == vacattrstats[j]->tupattnum)
+ {
+ stats[i] = vacattrstats[j];
+ break;
+ }
+ }
+
+ /*
+ * Check that we found the info, that the attnum matches, and
+ * that the requested 'lt' operator is actually available.
+ */
+ Assert(stats[i] != NULL);
+ Assert(stats[i]->tupattnum == attrs->values[i]);
+
+ /* FIXME This is a rather ugly way to check for 'ltopr' (which
+ * is defined for 'scalar' attributes).
+ */
+ Assert(((StdAnalyzeData *)stats[i]->extra_data)->ltopr != InvalidOid);
+ }
+
+ return stats;
+}
+
+/*
+ * Fetch list of MV stats defined on a table, without the actual data
+ * for histograms, MCV lists etc.
+ */
+static List*
+list_mv_stats(Oid relid)
+{
+ Relation indrel;
+ SysScanDesc indscan;
+ ScanKeyData skey;
+ HeapTuple htup;
+ List *result = NIL;
+
+ /* Prepare to scan pg_mv_statistic for entries having starelid = this rel. */
+ ScanKeyInit(&skey,
+ Anum_pg_mv_statistic_starelid,
+ BTEqualStrategyNumber, F_OIDEQ,
+ ObjectIdGetDatum(relid));
+
+ indrel = heap_open(MvStatisticRelationId, AccessShareLock);
+ indscan = systable_beginscan(indrel, MvStatisticRelidIndexId, true,
+ NULL, 1, &skey);
+
+ while (HeapTupleIsValid(htup = systable_getnext(indscan)))
+ {
+ MVStatisticInfo *info = makeNode(MVStatisticInfo);
+ Form_pg_mv_statistic stats = (Form_pg_mv_statistic) GETSTRUCT(htup);
+
+ info->mvoid = HeapTupleGetOid(htup);
+ info->stakeys = buildint2vector(stats->stakeys.values, stats->stakeys.dim1);
+ info->deps_built = stats->deps_built;
+
+ result = lappend(result, info);
+ }
+
+ systable_endscan(indscan);
+
+ heap_close(indrel, AccessShareLock);
+
+ /* TODO maybe save the list into relcache, as in RelationGetIndexList
+ * (which served as the inspiration for this one)? */
+
+ return result;
+}
+
+void
+update_mv_stats(Oid mvoid, MVDependencies dependencies, int2vector *attrs)
+{
+ HeapTuple stup,
+ oldtup;
+ Datum values[Natts_pg_mv_statistic];
+ bool nulls[Natts_pg_mv_statistic];
+ bool replaces[Natts_pg_mv_statistic];
+
+ Relation sd = heap_open(MvStatisticRelationId, RowExclusiveLock);
+
+ memset(nulls, 1, Natts_pg_mv_statistic * sizeof(bool));
+ memset(replaces, 0, Natts_pg_mv_statistic * sizeof(bool));
+ memset(values, 0, Natts_pg_mv_statistic * sizeof(Datum));
+
+ /*
+ * Construct a new pg_mv_statistic tuple - replace only the functional
+ * dependencies, depending on whether they were actually computed.
+ */
+ if (dependencies != NULL)
+ {
+ nulls[Anum_pg_mv_statistic_stadeps - 1] = false;
+ values[Anum_pg_mv_statistic_stadeps - 1]
+ = PointerGetDatum(serialize_mv_dependencies(dependencies));
+ }
+
+ /* always replace the value (either by bytea or NULL) */
+ replaces[Anum_pg_mv_statistic_stadeps - 1] = true;
+
+ /* always change the availability flags */
+ nulls[Anum_pg_mv_statistic_deps_built - 1] = false;
+ nulls[Anum_pg_mv_statistic_stakeys - 1] = false;
+
+ /* use the new attnums, in case we removed some dropped ones */
+ replaces[Anum_pg_mv_statistic_deps_built - 1] = true;
+ replaces[Anum_pg_mv_statistic_stakeys - 1] = true;
+
+ values[Anum_pg_mv_statistic_deps_built - 1] = BoolGetDatum(dependencies != NULL);
+ values[Anum_pg_mv_statistic_stakeys - 1] = PointerGetDatum(attrs);
+
+ /* Is there already a pg_mv_statistic tuple for these statistics? */
+ oldtup = SearchSysCache1(MVSTATOID,
+ ObjectIdGetDatum(mvoid));
+
+ if (HeapTupleIsValid(oldtup))
+ {
+ /* Yes, replace it */
+ stup = heap_modify_tuple(oldtup,
+ RelationGetDescr(sd),
+ values,
+ nulls,
+ replaces);
+ ReleaseSysCache(oldtup);
+ simple_heap_update(sd, &stup->t_self, stup);
+ }
+ else
+ elog(ERROR, "invalid pg_mv_statistic record (oid=%d)", mvoid);
+
+ /* update indexes too */
+ CatalogUpdateIndexes(sd, stup);
+
+ heap_freetuple(stup);
+
+ heap_close(sd, RowExclusiveLock);
+}
+
+/* multi-variate stats comparator */
+
+/*
+ * qsort_arg comparator for sorting Datums (MV stats)
+ *
+ * This does not maintain the tupnoLink array.
+ */
+int
+compare_scalars_simple(const void *a, const void *b, void *arg)
+{
+ Datum da = *(Datum*)a;
+ Datum db = *(Datum*)b;
+ SortSupport ssup= (SortSupport) arg;
+
+ return ApplySortComparator(da, false, db, false, ssup);
+}
+
+/*
+ * qsort_arg comparator for sorting data when partitioning a MV bucket
+ */
+int
+compare_scalars_partition(const void *a, const void *b, void *arg)
+{
+ Datum da = ((ScalarItem*)a)->value;
+ Datum db = ((ScalarItem*)b)->value;
+ SortSupport ssup= (SortSupport) arg;
+
+ return ApplySortComparator(da, false, db, false, ssup);
+}
+
+/* initialize multi-dimensional sort */
+MultiSortSupport
+multi_sort_init(int ndims)
+{
+ MultiSortSupport mss;
+
+ Assert(ndims >= 2);
+
+ mss = (MultiSortSupport)palloc0(offsetof(MultiSortSupportData, ssup)
+ + sizeof(SortSupportData)*ndims);
+
+ mss->ndims = ndims;
+
+ return mss;
+}
+
+/*
+ * add sort info for dimension 'dim' (index into vacattrstats) to mss,
+ * at position 'sortdim'
+ */
+void
+multi_sort_add_dimension(MultiSortSupport mss, int sortdim,
+ int dim, VacAttrStats **vacattrstats)
+{
+ /* first, lookup StdAnalyzeData for the dimension (attribute) */
+ SortSupportData ssup;
+ StdAnalyzeData *tmp = (StdAnalyzeData *)vacattrstats[dim]->extra_data;
+
+ Assert(mss != NULL);
+ Assert(sortdim < mss->ndims);
+
+ /* initialize sort support, etc. */
+ memset(&ssup, 0, sizeof(ssup));
+ ssup.ssup_cxt = CurrentMemoryContext;
+
+ /* We always use the default collation for statistics */
+ ssup.ssup_collation = DEFAULT_COLLATION_OID;
+ ssup.ssup_nulls_first = false;
+
+ PrepareSortSupportFromOrderingOp(tmp->ltopr, &ssup);
+
+ mss->ssup[sortdim] = ssup;
+}
+
+/* compare all the dimensions in the selected order */
+int
+multi_sort_compare(const void *a, const void *b, void *arg)
+{
+ int i;
+ SortItem *ia = (SortItem*)a;
+ SortItem *ib = (SortItem*)b;
+
+ MultiSortSupport mss = (MultiSortSupport)arg;
+
+ for (i = 0; i < mss->ndims; i++)
+ {
+ int compare;
+
+ compare = ApplySortComparator(ia->values[i], ia->isnull[i],
+ ib->values[i], ib->isnull[i],
+ &mss->ssup[i]);
+
+ if (compare != 0)
+ return compare;
+
+ }
+
+ /* equal by default */
+ return 0;
+}
+
+/* compare selected dimension */
+int
+multi_sort_compare_dim(int dim, const SortItem *a, const SortItem *b,
+ MultiSortSupport mss)
+{
+ return ApplySortComparator(a->values[dim], a->isnull[dim],
+ b->values[dim], b->isnull[dim],
+ &mss->ssup[dim]);
+}
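
A quick way to exercise build_mv_stats() end to end is to define the
statistics and re-run ANALYZE, then check the catalog flags (a sketch -
the reloption name "dependencies" is an assumption here, the actual
option parsing lives in commands/statscmds.c):

CREATE STATISTICS s1 ON test (a, b) WITH (dependencies);
ANALYZE test;
SELECT staname, deps_enabled, deps_built
  FROM pg_mv_statistic WHERE starelid = 'test'::regclass;
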
diff --git a/src/backend/utils/mvstats/common.h b/src/backend/utils/mvstats/common.h
new file mode 100644
index 0000000..6d5465b
--- /dev/null
+++ b/src/backend/utils/mvstats/common.h
@@ -0,0 +1,75 @@
+/*-------------------------------------------------------------------------
+ *
+ * common.h
+ * POSTGRES multivariate statistics
+ *
+ *
+ * Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ * src/backend/utils/mvstats/common.h
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres.h"
+
+#include "access/tuptoaster.h"
+#include "catalog/indexing.h"
+#include "catalog/pg_collation.h"
+#include "catalog/pg_mv_statistic.h"
+#include "foreign/fdwapi.h"
+#include "postmaster/autovacuum.h"
+#include "storage/lmgr.h"
+#include "utils/datum.h"
+#include "utils/sortsupport.h"
+#include "utils/syscache.h"
+#include "utils/fmgroids.h"
+#include "utils/builtins.h"
+#include "access/sysattr.h"
+
+#include "utils/mvstats.h"
+
+/* FIXME private structure copied from analyze.c */
+
+typedef struct
+{
+ Oid eqopr; /* '=' operator for datatype, if any */
+ Oid eqfunc; /* and associated function */
+ Oid ltopr; /* '<' operator for datatype, if any */
+} StdAnalyzeData;
+
+typedef struct
+{
+ Datum value; /* a data value */
+ int tupno; /* position index for tuple it came from */
+} ScalarItem;
+
+/* multi-sort */
+typedef struct MultiSortSupportData {
+ int ndims; /* number of dimensions supported by the sort */
+ SortSupportData ssup[1]; /* sort support data for each dimension */
+} MultiSortSupportData;
+
+typedef MultiSortSupportData* MultiSortSupport;
+
+typedef struct SortItem {
+ Datum *values;
+ bool *isnull;
+} SortItem;
+
+MultiSortSupport multi_sort_init(int ndims);
+
+void multi_sort_add_dimension(MultiSortSupport mss, int sortdim,
+ int dim, VacAttrStats **vacattrstats);
+
+int multi_sort_compare(const void *a, const void *b, void *arg);
+
+int multi_sort_compare_dim(int dim, const SortItem *a,
+ const SortItem *b, MultiSortSupport mss);
+
+/* comparators, used when constructing multivariate stats */
+int compare_scalars_simple(const void *a, const void *b, void *arg);
+int compare_scalars_partition(const void *a, const void *b, void *arg);
diff --git a/src/backend/utils/mvstats/dependencies.c b/src/backend/utils/mvstats/dependencies.c
new file mode 100644
index 0000000..84b6561
--- /dev/null
+++ b/src/backend/utils/mvstats/dependencies.c
@@ -0,0 +1,638 @@
+/*-------------------------------------------------------------------------
+ *
+ * dependencies.c
+ * POSTGRES multivariate functional dependencies
+ *
+ *
+ * Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ * src/backend/utils/mvstats/dependencies.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "common.h"
+#include "utils/lsyscache.h"
+
+/*
+ * Mine functional dependencies between columns, in the form (A => B),
+ * meaning that a value in column 'A' determines the value in 'B'. A simple
+ * artificial example may be a table created like this
+ *
+ * CREATE TABLE deptest (a, b)
+ * AS SELECT i, i/10 FROM generate_series(1,100000) s(i);
+ *
+ * Clearly, once we know the value for 'A' we can easily determine the
+ * value of 'B' by computing A/10. A more practical example may be
+ * addresses, where (ZIP code => city name), i.e. once we know the ZIP,
+ * we probably know which city it belongs to. Larger cities usually have
+ * multiple ZIP codes, so the dependency can't be reversed.
+ *
+ * Functional dependencies are a concept well described in relational
+ * theory, especially in definition of normalization and "normal forms".
+ * Wikipedia has a nice definition of a functional dependency [1]:
+ *
+ * In a given table, an attribute Y is said to have a functional
+ * dependency on a set of attributes X (written X -> Y) if and only
+ * if each X value is associated with precisely one Y value. For
+ * example, in an "Employee" table that includes the attributes
+ * "Employee ID" and "Employee Date of Birth", the functional
+ * dependency {Employee ID} -> {Employee Date of Birth} would hold.
+ * It follows from the previous two sentences that each {Employee ID}
+ * is associated with precisely one {Employee Date of Birth}.
+ *
+ * [1] http://en.wikipedia.org/wiki/Database_normalization
+ *
+ * Most datasets might be normalized not to contain any such functional
+ * dependencies, but sometimes it's not practical. In some cases it's
+ * actually a conscious choice to model the dataset in a denormalized
+ * way, either for performance or to make querying easier.
+ *
+ * The current implementation supports only dependencies between two
+ * columns, but this is merely a simplification of the initial patch.
+ * It's certainly useful to mine for dependencies involving multiple
+ * columns on the 'left' side, i.e. a condition for the dependency.
+ * That is, dependencies like [A,B] => C and so on.
+ *
+ * TODO The implementation may/should be smart enough not to mine both
+ * [A => B] and [A,C => B], because the second dependency is a
+ * consequence of the first one (if values of A determine values
+ * of B, adding another column won't change that). The ANALYZE
+ * should first analyze 1:1 dependencies, then 2:1 dependencies
+ * (and skip the already identified ones), etc.
+ *
+ * For example the dependency [city name => zip code] is much weaker
+ * than [city name, state name => zip code], because there may be
+ * multiple cities with the same name in various states. It's not
+ * perfect though - there are probably cities with the same name within
+ * the same state, but that is hopefully a relatively rare occurrence.
+ * More about this in the section about dependency mining.
+ *
+ * Handling multiple columns on the right side is not necessary, as such
+ * dependencies may be decomposed into a set of dependencies with
+ * the same meaning, one for each column on the right side. For example
+ *
+ * A => [B,C]
+ *
+ * is exactly the same as
+ *
+ * (A => B) & (A => C).
+ *
+ * Of course, storing (A => [B, C]) may be more efficient than storing
+ * the two dependencies (A => B) and (A => C) separately.
+ *
+ *
+ * Dependency mining (ANALYZE)
+ * ---------------------------
+ *
+ * The current build algorithm is rather simple - for each pair [A,B] of
+ * columns, the data are sorted lexicographically (first by A, then B),
+ * and then a number of metrics is computed by walking the sorted data.
+ *
+ * In general the algorithm counts distinct values of A (forming groups
+ * thanks to the sorting), supporting or contradicting the hypothesis
+ * that A => B (i.e. that values of B are predetermined by A). If there
+ * are multiple values of B for a single value of A, it's counted as
+ * contradicting.
+ *
+ * A group may be neither supporting nor contradicting. To be counted as
+ * supporting, the group has to have at least min_group_size(=3) rows.
+ * Smaller 'supporting' groups are counted as neutral.
+ *
+ * Finally, the number of rows in supporting and contradicting groups is
+ * compared, and if there is at least 10x more supporting rows, the
+ * dependency is considered valid.
+ *
+ *
+ * Real-world datasets are imperfect - there may be errors (e.g. due to
+ * data-entry mistakes), or factually correct records, yet contradicting
+ * the dependency (e.g. when a city splits into two, but both keep the
+ * same ZIP code). A strict ANALYZE implementation (where the functional
+ * dependencies are identified) would ignore dependencies on such noisy
+ * data, making the approach unusable in practice.
+ *
+ * The proposed implementation attempts to handle such noisy cases
+ * gracefully, by tolerating a small number of contradicting cases.
+ *
+ * In the future this might also perform some sort of test and decide
+ * whether it's worth building any other kind of multivariate stats,
+ * or whether the dependencies sufficiently describe the data. Or at
+ * least not build the MCV list / histogram on the implied columns.
+ * Such reduction would however make the 'verification' (see the next
+ * section) impossible.
+ *
+ *
+ * Clause reduction (planner/optimizer)
+ * ------------------------------------
+ *
+ * Applying the dependencies is quite simple - given a list of clauses,
+ * try to apply all the dependencies. For example given clause list
+ *
+ * (a = 1) AND (b = 1) AND (c = 1) AND (d < 100)
+ *
+ * and dependencies [a=>b] and [a=>d], this may be reduced to
+ *
+ * (a = 1) AND (c = 1) AND (d < 100)
+ *
+ * The (d<100) can't be reduced as it's not an equality clause, so the
+ * dependency [a=>d] can't be applied.
+ *
+ * See clauselist_apply_dependencies() for more details.
+ *
+ * The problem with the reduction is that the query may use conditions
+ * that are not redundant, but in fact contradictory - e.g. the user
+ * may search for a ZIP code and a city name not matching the ZIP code.
+ *
+ * In such cases, the condition on the city name is not redundant
+ * but contradictory (making the result empty), and
+ * removing it while estimating the cardinality will make the estimate
+ * worse.
+ *
+ * The current estimation assuming independence (and multiplying the
+ * selectivities) works better in this case, but only by utter luck.
+ *
+ * In some cases this might be verified using the other multivariate
+ * statistics - MCV lists and histograms. For MCV lists the verification
+ * might be very simple - peek into the list if there are any items
+ * matching the clause on the 'A' column (e.g. ZIP code), and if such
+ * item is found, check that the 'B' column matches the other clause.
+ * If it does not, the clauses are contradictory. We can't really say
+ * anything if no such item is found, except maybe restricting the selectivity
+ * using the MCV data (e.g. using min/max selectivity, or something).
+ *
+ * With histograms, it might work similarly - we can't check the values
+ * directly (because histograms use buckets, unlike MCV lists, which store
+ * the actual values). So we can only observe the buckets matching the
+ * clauses - if those buckets have very low frequency, it probably means
+ * the two clauses are incompatible.
+ *
+ * It's unclear what 'low frequency' is, but if one of the clauses is
+ * implied (automatically true because of the other clause), then
+ *
+ * selectivity[clause(A)] = selectivity[clause(A) & clause(B)]
+ *
+ * So we might compute selectivity of the first clause (on the column
+ * A in dependency [A=>B]) - for example using regular statistics.
+ * And then check if the selectivity computed from the histogram is
+ * about the same (or significantly lower).
+ *
+ * The problem is that histograms work well only when the data ordering
+ * matches the natural meaning. For values that serve as labels - like
+ * city names or ZIP codes, or even generated IDs, histograms really
+ * don't work all that well. For example sorting cities by name won't
+ * match the sorting of ZIP codes, rendering the histogram unusable.
+ *
+ * The MCV lists are probably going to work much better, because they
+ * don't really assume any sort of ordering, and they're probably more
+ * appropriate for label-like data.
+ *
+ * TODO Support dependencies with multiple columns on left/right.
+ *
+ * TODO Investigate using histogram and MCV list to confirm the
+ * functional dependencies.
+ *
+ * TODO Investigate statistical testing of the distribution (to decide
+ * whether it makes sense to build the histogram/MCV list).
+ *
+ * TODO Using a min/max of selectivities would probably make more sense
+ * for the associated columns.
+ *
+ * TODO Consider eliminating the implied columns from the histogram and
+ * MCV lists (but maybe that's not a good idea, because that'd make
+ * it impossible to use these stats for non-equality clauses and
+ * also it wouldn't be possible to use the stats for verification
+ * of the dependencies as proposed in another TODO).
+ *
+ * TODO This builds a complete set of dependencies, i.e. including
+ * transitive dependencies - if we identify [A => B] and [B => C],
+ * we're likely to identify [A => C] too. It might be better to
+ * keep only the minimal set of dependencies, i.e. prune all the
+ * dependencies that we can recreate by transitivity.
+ *
+ * There are two conceptual ways to do that:
+ *
+ * (a) generate all the rules, and then prune the rules that may
+ * be recreated by combining other dependencies, or
+ *
+ * (b) perform the 'is a combination of other dependencies' check
+ * before actually doing the work
+ *
+ * The second option has the advantage that we don't really need
+ * to perform the sort/count. It's not sufficient alone, though,
+ * because we may discover the dependencies in the wrong order.
+ * For example [A => B], [A => C] and then [B => C]. None of those
+ * dependencies is a combination of the already known ones, yet
+ * [A => C] is a combination of [A => B] and [B => C].
+ *
+ * FIXME Not sure the current NULL handling makes much sense. We assume
+ * that NULL is 0, so it's handled like a regular value
+ * (NULL == NULL), so all NULLs in a single column form a single
+ * group. Maybe that's not the right thing to do, especially with
+ * equality conditions - in that case NULLs are irrelevant. So
+ * maybe the right solution would be to just ignore NULL values?
+ *
+ * However simply "ignoring" the NULL values does not seem like
+ * a good idea - imagine columns A and B, where for each value of
+ * A, values in B are constant (same for the whole group) or NULL.
+ * Let's say only 10% of the B values in each group are not NULL. Then
+ * ignoring the NULL values will result in 10x misestimate (and
+ * it's trivial to construct arbitrary errors). So maybe handling
+ * NULL values just like a regular value is the right thing here.
+ *
+ * Or maybe NULL values should be treated differently on each side
+ * of the dependency? E.g. as ignored on the left (condition) and
+ * as regular values on the right - this seems consistent with how
+ * equality clauses work, as equality clause means 'NOT NULL'.
+ * So if we say [A => B] then it may also imply "NOT NULL" on the
+ * right side.
+ */
+MVDependencies
+build_mv_dependencies(int numrows, HeapTuple *rows, int2vector *attrs,
+ VacAttrStats **stats)
+{
+ int i;
+ int numattrs = attrs->dim1;
+
+ /* result */
+ int ndeps = 0;
+ MVDependencies dependencies = NULL;
+ MultiSortSupport mss = multi_sort_init(2); /* 2 dimensions for now */
+
+ /* TODO Maybe this should be somehow related to the number of
+ * distinct values in the two columns we're currently analyzing.
+ * Assuming the distribution is uniform, we can estimate the
+ * average group size and use it as a threshold. Or something
+ * like that. Seems better than a static approach.
+ */
+ int min_group_size = 3;
+
+ /* dimension indexes we'll check for associations [a => b] */
+ int dima, dimb;
+
+ /*
+ * We'll reuse the same array for all the 2-column combinations.
+ *
+ * It's possible to sort the sample rows directly, but this seemed
+ * somewhat simpler / less error-prone. Another option would be to
+ * allocate the arrays for each SortItem separately, but that'd be
+ * significant overhead (not just CPU, but especially memory bloat).
+ */
+ SortItem * items = (SortItem*)palloc0(numrows * sizeof(SortItem));
+
+ Datum *values = (Datum*)palloc0(sizeof(Datum) * numrows * 2);
+ bool *isnull = (bool*)palloc0(sizeof(bool) * numrows * 2);
+
+ for (i = 0; i < numrows; i++)
+ {
+ items[i].values = &values[i * 2];
+ items[i].isnull = &isnull[i * 2];
+ }
+
+ Assert(numattrs >= 2);
+
+ /*
+ * Evaluate all possible combinations of [A => B], using a simple algorithm:
+ *
+ * (a) sort the data by [A,B]
+ * (b) split the data into groups by A (new group whenever a value changes)
+ * (c) count different values in the B column (again, value changes)
+ *
+ * TODO It should be rather simple to merge [A => B] and [A => C] into
+ * [A => B,C]. Just keep A constant, collect all the "implied" columns
+ * and you're done.
+ */
+ for (dima = 0; dima < numattrs; dima++)
+ {
+ /* prepare the sort function for the first dimension */
+ multi_sort_add_dimension(mss, 0, dima, stats);
+
+ for (dimb = 0; dimb < numattrs; dimb++)
+ {
+ SortItem current;
+
+ /* number of groups supporting / contradicting the dependency */
+ int n_supporting = 0;
+ int n_contradicting = 0;
+
+ /* counters valid within a group */
+ int group_size = 0;
+ int n_violations = 0;
+
+ int n_supporting_rows = 0;
+ int n_contradicting_rows = 0;
+
+ /* make sure the columns are different (skip the A => A case) */
+ if (dima == dimb)
+ continue;
+
+ /* prepare the sort function for the second dimension */
+ multi_sort_add_dimension(mss, 1, dimb, stats);
+
+ /* reset the values and isnull flags */
+ memset(values, 0, sizeof(Datum) * numrows * 2);
+ memset(isnull, 0, sizeof(bool) * numrows * 2);
+
+ /* accumulate all the data for both columns into an array and sort it */
+ for (i = 0; i < numrows; i++)
+ {
+ items[i].values[0]
+ = heap_getattr(rows[i], attrs->values[dima],
+ stats[dima]->tupDesc, &items[i].isnull[0]);
+
+ items[i].values[1]
+ = heap_getattr(rows[i], attrs->values[dimb],
+ stats[dimb]->tupDesc, &items[i].isnull[1]);
+ }
+
+ qsort_arg((void *) items, numrows, sizeof(SortItem),
+ multi_sort_compare, mss);
+
+ /*
+ * Walk through the array, split it into groups according to
+ * the A value, and count distinct values in the other one.
+ * If there's a single B value for the whole group, we count
+ * it as supporting the association, otherwise we count it
+ * as contradicting.
+ *
+ * Furthermore we require a group to have at least a certain
+ * number of rows to be considered useful for supporting the
+ * dependency. But a contradicting group always counts.
+ */
+
+ /* start with values from the first row */
+ current = items[0];
+ group_size = 1;
+
+ for (i = 1; i < numrows; i++)
+ {
+ /* end of the group */
+ if (multi_sort_compare_dim(0, &items[i], ¤t, mss) != 0)
+ {
+ /*
+ * If there are no contradicting rows, count it as
+ * supporting (otherwise contradicting), but only if
+ * the group is large enough.
+ *
+ * The requirement of a minimum group size makes it
+ * impossible to identify [unique,unique] cases, but
+ * that's probably a different case. This is more
+ * about [zip => city] associations etc.
+ *
+ * If there are violations, count the group/rows as
+ * a violation.
+ *
+ * It may be neither, if the group is too small (does
+ * not contain at least min_group_size rows).
+ */
+ if ((n_violations == 0) && (group_size >= min_group_size))
+ {
+ n_supporting += 1;
+ n_supporting_rows += group_size;
+ }
+ else if (n_violations > 0)
+ {
+ n_contradicting += 1;
+ n_contradicting_rows += group_size;
+ }
+
+ /* current values start a new group */
+ n_violations = 0;
+ group_size = 0;
+ }
+ /* mismatch of a B value is contradicting */
+ else if (multi_sort_compare_dim(1, &items[i], ¤t, mss) != 0)
+ {
+ n_violations += 1;
+ }
+
+ current = items[i];
+ group_size += 1;
+ }
+
+ /* handle the last group (just like above) */
+ if ((n_violations == 0) && (group_size >= min_group_size))
+ {
+ n_supporting += 1;
+ n_supporting_rows += group_size;
+ }
+ else if (n_violations)
+ {
+ n_contradicting += 1;
+ n_contradicting_rows += group_size;
+ }
+
+ /*
+ * See if the number of rows supporting the association is at least
+ * 10x the number of rows violating the hypothetical dependency.
+ *
+ * TODO This is a rather arbitrary limit - I guess it's possible to do
+ * some math to come up with a better rule (e.g. testing a hypothesis
+ * 'this is due to randomness'). We can create a contingency table
+ * from the values and use it for testing. Possibly only when
+ * there are no contradicting rows?
+ *
+ * TODO Also, if (a => b) and (b => a) at the same time, it pretty much
+ * means there's a 1:1 relation (or one is a 'label'), making the
+ * conditions rather redundant. Although it's possible that the
+ * query uses incompatible combination of values.
+ */
+ if (n_supporting_rows > (n_contradicting_rows * 10))
+ {
+ if (dependencies == NULL)
+ {
+ dependencies = (MVDependencies)palloc0(sizeof(MVDependenciesData));
+ dependencies->magic = MVSTAT_DEPS_MAGIC;
+ }
+ else
+ dependencies = repalloc(dependencies, offsetof(MVDependenciesData, deps)
+ + sizeof(MVDependency) * (dependencies->ndeps + 1));
+
+ /* add the new dependency [dima => dimb] */
+ dependencies->deps[ndeps] = (MVDependency)palloc0(sizeof(MVDependencyData));
+ dependencies->deps[ndeps]->a = attrs->values[dima];
+ dependencies->deps[ndeps]->b = attrs->values[dimb];
+
+ dependencies->ndeps = (++ndeps);
+ }
+ }
+ }
+
+ pfree(items);
+ pfree(values);
+ pfree(isnull);
+ pfree(stats);
+ pfree(mss);
+
+ return dependencies;
+}
+
+/*
+ * Store the dependencies into a bytea, so that it can be stored in the
+ * pg_mv_statistic catalog.
+ *
+ * Currently this only supports simple two-column rules, and stores them
+ * as a sequence of attnum pairs. In the future, this needs to be made
+ * more complex to support multiple columns on both sides of the
+ * implication (using AND on left, OR on right).
+ */
+bytea *
+serialize_mv_dependencies(MVDependencies dependencies)
+{
+ int i;
+
+ /* we need the varlena header, the struct header, and 2 * int16 per dependency */
+ Size len = VARHDRSZ + offsetof(MVDependenciesData, deps)
+ + dependencies->ndeps * (sizeof(int16) * 2);
+
+ bytea * output = (bytea*)palloc0(len);
+
+ char * tmp = VARDATA(output);
+
+ SET_VARSIZE(output, len);
+
+ /* first, store the header (magic number and number of dependencies) */
+ memcpy(tmp, dependencies, offsetof(MVDependenciesData, deps));
+ tmp += offsetof(MVDependenciesData, deps);
+
+ /* walk through the dependencies and copy both columns into the bytea */
+ for (i = 0; i < dependencies->ndeps; i++)
+ {
+ memcpy(tmp, &(dependencies->deps[i]->a), sizeof(int16));
+ tmp += sizeof(int16);
+
+ memcpy(tmp, &(dependencies->deps[i]->b), sizeof(int16));
+ tmp += sizeof(int16);
+ }
+
+ return output;
+}
+
+/*
+ * Reads serialized dependencies into MVDependencies structure.
+ */
+MVDependencies
+deserialize_mv_dependencies(bytea * data)
+{
+ int i;
+ Size expected_size;
+ MVDependencies dependencies;
+ char *tmp;
+
+ if (data == NULL)
+ return NULL;
+
+ if (VARSIZE_ANY_EXHDR(data) < offsetof(MVDependenciesData,deps))
+ elog(ERROR, "invalid MVDependencies size %ld (expected at least %ld)",
+ VARSIZE_ANY_EXHDR(data), offsetof(MVDependenciesData,deps));
+
+ /* read the MVDependencies header */
+ dependencies = (MVDependencies)palloc0(sizeof(MVDependenciesData));
+
+ /* initialize pointer to the data part (skip the varlena header) */
+ tmp = VARDATA(data);
+
+ /* get the header and perform basic sanity checks */
+ memcpy(dependencies, tmp, offsetof(MVDependenciesData, deps));
+ tmp += offsetof(MVDependenciesData, deps);
+
+ if (dependencies->magic != MVSTAT_DEPS_MAGIC)
+ {
+ pfree(dependencies);
+ elog(WARNING, "not a MV Dependencies (magic number mismatch)");
+ return NULL;
+ }
+
+ Assert(dependencies->ndeps > 0);
+
+ /* what bytea size do we expect for those parameters */
+ expected_size = offsetof(MVDependenciesData,deps) +
+ dependencies->ndeps * sizeof(int16) * 2;
+
+ if (VARSIZE_ANY_EXHDR(data) != expected_size)
+ elog(ERROR, "invalid dependencies size %ld (expected %ld)",
+ VARSIZE_ANY_EXHDR(data), expected_size);
+
+ /* allocate space for the dependency items */
+ dependencies = repalloc(dependencies, offsetof(MVDependenciesData,deps)
+ + (dependencies->ndeps * sizeof(MVDependency)));
+
+ for (i = 0; i < dependencies->ndeps; i++)
+ {
+ dependencies->deps[i] = (MVDependency)palloc0(sizeof(MVDependencyData));
+
+ memcpy(&(dependencies->deps[i]->a), tmp, sizeof(int16));
+ tmp += sizeof(int16);
+
+ memcpy(&(dependencies->deps[i]->b), tmp, sizeof(int16));
+ tmp += sizeof(int16);
+ }
+
+ return dependencies;
+}
+
+/* print some basic info about dependencies (number of dependencies) */
+Datum
+pg_mv_stats_dependencies_info(PG_FUNCTION_ARGS)
+{
+ bytea *data = PG_GETARG_BYTEA_P(0);
+ char *result;
+
+ MVDependencies dependencies = deserialize_mv_dependencies(data);
+
+ if (dependencies == NULL)
+ PG_RETURN_NULL();
+
+ result = palloc0(128);
+ snprintf(result, 128, "dependencies=%d", dependencies->ndeps);
+
+ /* FIXME free the deserialized data (pfree is not enough) */
+
+ PG_RETURN_TEXT_P(cstring_to_text(result));
+}
+
+/* print the dependencies
+ *
+ * TODO Would be nice if this knew the actual column names (instead of
+ * the attnums).
+ *
+ * FIXME This is really ugly and does not really check the lengths and
+ * strcpy/snprintf return values properly. Needs to be fixed.
+ */
+Datum
+pg_mv_stats_dependencies_show(PG_FUNCTION_ARGS)
+{
+ int i = 0;
+ bytea *data = PG_GETARG_BYTEA_P(0);
+ char *result = NULL;
+ int len = 0;
+
+ MVDependencies dependencies = deserialize_mv_dependencies(data);
+
+ if (dependencies == NULL)
+ PG_RETURN_NULL();
+
+ for (i = 0; i < dependencies->ndeps; i++)
+ {
+ MVDependency dependency = dependencies->deps[i];
+ char buffer[128];
+
+ int tmp = snprintf(buffer, 128, "%s%d => %d",
+ ((i == 0) ? "" : ", "), dependency->a, dependency->b);
+
+ if (tmp < 127)
+ {
+ if (result == NULL)
+ result = palloc0(len + tmp + 1);
+ else
+ result = repalloc(result, len + tmp + 1);
+
+ strcpy(result + len, buffer);
+ len += tmp;
+ }
+ }
+
+ PG_RETURN_TEXT_P(cstring_to_text(result));
+}
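
The grouping logic above can be approximated in plain SQL, which is
handy for eyeballing a dataset before defining the statistics (a sketch
only - the "addresses" table is hypothetical, and the actual code works
on the ANALYZE sample rows, not the whole table):

SELECT zip, COUNT(DISTINCT city) AS nvalues, COUNT(*) AS nrows
  FROM addresses GROUP BY zip;

-- groups with nvalues = 1 and nrows >= 3 (min_group_size) support
-- (zip => city), groups with nvalues > 1 contradict it; the dependency
-- is accepted when supporting rows outnumber contradicting rows 10:1
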
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 85e3aa5..590cd51 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -2104,6 +2104,50 @@ describeOneTableDetails(const char *schemaname,
PQclear(result);
}
+ /* print any multivariate statistics */
+ if (pset.sversion >= 90500)
+ {
+ printfPQExpBuffer(&buf,
+ "SELECT oid, stanamespace::regnamespace AS nsp, staname, stakeys,\n"
+ " deps_enabled,\n"
+ " deps_built,\n"
+ " (SELECT string_agg(attname::text,', ')\n"
+ " FROM ((SELECT unnest(stakeys) AS attnum) s\n"
+ " JOIN pg_attribute a ON (starelid = a.attrelid and a.attnum = s.attnum))) AS attnums\n"
+ "FROM pg_mv_statistic stat WHERE starelid = '%s' ORDER BY 1;",
+ oid);
+
+ result = PSQLexec(buf.data);
+ if (!result)
+ goto error_return;
+ else
+ tuples = PQntuples(result);
+
+ if (tuples > 0)
+ {
+ printTableAddFooter(&cont, _("Statistics:"));
+ for (i = 0; i < tuples; i++)
+ {
+ printfPQExpBuffer(&buf, " ");
+
+ /* statistics name (qualified with namespace) */
+ appendPQExpBuffer(&buf, "\"%s.%s\" ",
+ PQgetvalue(result, i, 1),
+ PQgetvalue(result, i, 2));
+
+ /* options */
+ if (!strcmp(PQgetvalue(result, i, 4), "t"))
+ appendPQExpBuffer(&buf, "(dependencies)");
+
+ appendPQExpBuffer(&buf, " ON (%s)",
+ PQgetvalue(result, i, 6));
+
+ printTableAddFooter(&cont, buf.data);
+ }
+ }
+ PQclear(result);
+ }
+
/* print rules */
if (tableinfo.hasrules && tableinfo.relkind != 'm')
{
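
With a statistics object named s1 on columns (a, b), the new footer
should come out roughly like this (sketched from the code above, not
actual psql output):

Statistics:
    "public.s1" (dependencies) ON (a, b)
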
diff --git a/src/include/catalog/dependency.h b/src/include/catalog/dependency.h
index 049bf9f..12211fe 100644
--- a/src/include/catalog/dependency.h
+++ b/src/include/catalog/dependency.h
@@ -153,10 +153,11 @@ typedef enum ObjectClass
OCLASS_EXTENSION, /* pg_extension */
OCLASS_EVENT_TRIGGER, /* pg_event_trigger */
OCLASS_POLICY, /* pg_policy */
- OCLASS_TRANSFORM /* pg_transform */
+ OCLASS_TRANSFORM, /* pg_transform */
+ OCLASS_STATISTICS /* pg_mv_statistics */
} ObjectClass;
-#define LAST_OCLASS OCLASS_TRANSFORM
+#define LAST_OCLASS OCLASS_STATISTICS
/* in dependency.c */
diff --git a/src/include/catalog/heap.h b/src/include/catalog/heap.h
index b80d8d8..5ae42f7 100644
--- a/src/include/catalog/heap.h
+++ b/src/include/catalog/heap.h
@@ -119,6 +119,7 @@ extern void RemoveAttrDefault(Oid relid, AttrNumber attnum,
DropBehavior behavior, bool complain, bool internal);
extern void RemoveAttrDefaultById(Oid attrdefId);
extern void RemoveStatistics(Oid relid, AttrNumber attnum);
+extern void RemoveMVStatistics(Oid relid, AttrNumber attnum);
extern Form_pg_attribute SystemAttributeDefinition(AttrNumber attno,
bool relhasoids);
diff --git a/src/include/catalog/indexing.h b/src/include/catalog/indexing.h
index ab2c1a8..a768bb5 100644
--- a/src/include/catalog/indexing.h
+++ b/src/include/catalog/indexing.h
@@ -173,6 +173,13 @@ DECLARE_UNIQUE_INDEX(pg_largeobject_loid_pn_index, 2683, on pg_largeobject using
DECLARE_UNIQUE_INDEX(pg_largeobject_metadata_oid_index, 2996, on pg_largeobject_metadata using btree(oid oid_ops));
#define LargeObjectMetadataOidIndexId 2996
+DECLARE_UNIQUE_INDEX(pg_mv_statistic_oid_index, 3380, on pg_mv_statistic using btree(oid oid_ops));
+#define MvStatisticOidIndexId 3380
+DECLARE_UNIQUE_INDEX(pg_mv_statistic_name_index, 3997, on pg_mv_statistic using btree(staname name_ops, stanamespace oid_ops));
+#define MvStatisticNameIndexId 3997
+DECLARE_INDEX(pg_mv_statistic_relid_index, 3379, on pg_mv_statistic using btree(starelid oid_ops));
+#define MvStatisticRelidIndexId 3379
+
DECLARE_UNIQUE_INDEX(pg_namespace_nspname_index, 2684, on pg_namespace using btree(nspname name_ops));
#define NamespaceNameIndexId 2684
DECLARE_UNIQUE_INDEX(pg_namespace_oid_index, 2685, on pg_namespace using btree(oid oid_ops));
diff --git a/src/include/catalog/namespace.h b/src/include/catalog/namespace.h
index 2ccb3a7..44cf9c6 100644
--- a/src/include/catalog/namespace.h
+++ b/src/include/catalog/namespace.h
@@ -137,6 +137,8 @@ extern Oid get_collation_oid(List *collname, bool missing_ok);
extern Oid get_conversion_oid(List *conname, bool missing_ok);
extern Oid FindDefaultConversionProc(int32 for_encoding, int32 to_encoding);
+extern Oid get_statistics_oid(List *names, bool missing_ok);
+
/* initialization & transaction cleanup code */
extern void InitializeSearchPath(void);
extern void AtEOXact_Namespace(bool isCommit, bool parallel);
diff --git a/src/include/catalog/pg_mv_statistic.h b/src/include/catalog/pg_mv_statistic.h
new file mode 100644
index 0000000..a568a07
--- /dev/null
+++ b/src/include/catalog/pg_mv_statistic.h
@@ -0,0 +1,73 @@
+/*-------------------------------------------------------------------------
+ *
+ * pg_mv_statistic.h
+ * definition of the system "multivariate statistic" relation (pg_mv_statistic)
+ * along with the relation's initial contents.
+ *
+ *
+ * Portions Copyright (c) 1996-2014, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/catalog/pg_mv_statistic.h
+ *
+ * NOTES
+ * the genbki.pl script reads this file and generates .bki
+ * information from the DATA() statements.
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef PG_MV_STATISTIC_H
+#define PG_MV_STATISTIC_H
+
+#include "catalog/genbki.h"
+
+/* ----------------
+ * pg_mv_statistic definition. cpp turns this into
+ * typedef struct FormData_pg_mv_statistic
+ * ----------------
+ */
+#define MvStatisticRelationId 3381
+
+CATALOG(pg_mv_statistic,3381)
+{
+ /* These fields form the unique key for the entry: */
+ Oid starelid; /* relation containing attributes */
+ NameData staname; /* statistics name */
+ Oid stanamespace; /* OID of namespace containing this statistics */
+
+ /* statistics requested to build */
+ bool deps_enabled; /* analyze dependencies? */
+
+ /* statistics that are available (if requested) */
+ bool deps_built; /* dependencies were built */
+
+ /* variable-length fields start here, but we allow direct access to stakeys */
+ int2vector stakeys; /* array of column keys */
+
+#ifdef CATALOG_VARLEN
+ bytea stadeps; /* dependencies (serialized) */
+#endif
+
+} FormData_pg_mv_statistic;
+
+/* ----------------
+ * Form_pg_mv_statistic corresponds to a pointer to a tuple with
+ * the format of pg_mv_statistic relation.
+ * ----------------
+ */
+typedef FormData_pg_mv_statistic *Form_pg_mv_statistic;
+
+/* ----------------
+ * compiler constants for pg_mv_statistic
+ * ----------------
+ */
+#define Natts_pg_mv_statistic 7
+#define Anum_pg_mv_statistic_starelid 1
+#define Anum_pg_mv_statistic_staname 2
+#define Anum_pg_mv_statistic_stanamespace 3
+#define Anum_pg_mv_statistic_deps_enabled 4
+#define Anum_pg_mv_statistic_deps_built 5
+#define Anum_pg_mv_statistic_stakeys 6
+#define Anum_pg_mv_statistic_stadeps 7
+
+#endif /* PG_MV_STATISTIC_H */
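
For a quick look at what the catalog rows contain (a sketch):

SELECT starelid::regclass, staname, stanamespace::regnamespace,
       deps_enabled, deps_built, stakeys
  FROM pg_mv_statistic;
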
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
index f58672e..76e054d 100644
--- a/src/include/catalog/pg_proc.h
+++ b/src/include/catalog/pg_proc.h
@@ -2741,6 +2741,11 @@ DESCR("current user privilege on any column by rel name");
DATA(insert OID = 3029 ( has_any_column_privilege PGNSP PGUID 12 10 0 0 0 f f f f t f s s 2 0 16 "26 25" _null_ _null_ _null_ _null_ _null_ has_any_column_privilege_id _null_ _null_ _null_ ));
DESCR("current user privilege on any column by rel oid");
+DATA(insert OID = 3998 ( pg_mv_stats_dependencies_info PGNSP PGUID 12 1 0 0 0 f f f f t f i s 1 0 25 "17" _null_ _null_ _null_ _null_ _null_ pg_mv_stats_dependencies_info _null_ _null_ _null_ ));
+DESCR("multivariate stats: functional dependencies info");
+DATA(insert OID = 3999 ( pg_mv_stats_dependencies_show PGNSP PGUID 12 1 0 0 0 f f f f t f i s 1 0 25 "17" _null_ _null_ _null_ _null_ _null_ pg_mv_stats_dependencies_show _null_ _null_ _null_ ));
+DESCR("multivariate stats: functional dependencies show");
+
DATA(insert OID = 1928 ( pg_stat_get_numscans PGNSP PGUID 12 1 0 0 0 f f f f t f s r 1 0 20 "26" _null_ _null_ _null_ _null_ _null_ pg_stat_get_numscans _null_ _null_ _null_ ));
DESCR("statistics: number of scans done for table/index");
DATA(insert OID = 1929 ( pg_stat_get_tuples_returned PGNSP PGUID 12 1 0 0 0 f f f f t f s r 1 0 20 "26" _null_ _null_ _null_ _null_ _null_ pg_stat_get_tuples_returned _null_ _null_ _null_ ));
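
The two functions registered here make it possible to inspect the
serialized dependencies from SQL (a sketch):

SELECT staname,
       pg_mv_stats_dependencies_info(stadeps),
       pg_mv_stats_dependencies_show(stadeps)
  FROM pg_mv_statistic WHERE deps_built;
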
diff --git a/src/include/catalog/toasting.h b/src/include/catalog/toasting.h
index b7a38ce..a52096b 100644
--- a/src/include/catalog/toasting.h
+++ b/src/include/catalog/toasting.h
@@ -49,6 +49,7 @@ extern void BootstrapToastTable(char *relName,
DECLARE_TOAST(pg_attrdef, 2830, 2831);
DECLARE_TOAST(pg_constraint, 2832, 2833);
DECLARE_TOAST(pg_description, 2834, 2835);
+DECLARE_TOAST(pg_mv_statistic, 3577, 3578);
DECLARE_TOAST(pg_proc, 2836, 2837);
DECLARE_TOAST(pg_rewrite, 2838, 2839);
DECLARE_TOAST(pg_seclabel, 3598, 3599);
diff --git a/src/include/commands/defrem.h b/src/include/commands/defrem.h
index 54f67e9..99a6a62 100644
--- a/src/include/commands/defrem.h
+++ b/src/include/commands/defrem.h
@@ -75,6 +75,10 @@ extern ObjectAddress DefineOperator(List *names, List *parameters);
extern void RemoveOperatorById(Oid operOid);
extern ObjectAddress AlterOperator(AlterOperatorStmt *stmt);
+/* commands/statscmds.c */
+extern ObjectAddress CreateStatistics(CreateStatsStmt *stmt);
+extern void RemoveStatisticsById(Oid statsOid);
+
/* commands/aggregatecmds.c */
extern ObjectAddress DefineAggregate(List *name, List *args, bool oldstyle,
List *parameters, const char *queryString);
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index 2b73483..0329472 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -251,6 +251,7 @@ typedef enum NodeTag
T_PlaceHolderInfo,
T_MinMaxAggInfo,
T_PlannerParamItem,
+ T_MVStatisticInfo,
/*
* TAGS FOR MEMORY NODES (memnodes.h)
@@ -381,6 +382,7 @@ typedef enum NodeTag
T_CreatePolicyStmt,
T_AlterPolicyStmt,
T_CreateTransformStmt,
+ T_CreateStatsStmt,
/*
* TAGS FOR PARSE TREE NODES (parsenodes.h)
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 2fd0629..e1807fb 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -601,6 +601,17 @@ typedef struct ColumnDef
int location; /* parse location, or -1 if none/unknown */
} ColumnDef;
+typedef struct CreateStatsStmt
+{
+ NodeTag type;
+ List *defnames; /* qualified name (list of Value strings) */
+ RangeVar *relation; /* relation to build statistics on */
+ List *keys; /* String nodes naming referenced column(s) */
+ List *options; /* list of DefElem nodes */
+ bool if_not_exists; /* just do nothing if statistics already exists? */
+} CreateStatsStmt;
+
+
/*
* TableLikeClause - CREATE TABLE ( ... LIKE ... ) clause
*/
@@ -1410,6 +1421,7 @@ typedef enum ObjectType
OBJECT_RULE,
OBJECT_SCHEMA,
OBJECT_SEQUENCE,
+ OBJECT_STATISTICS,
OBJECT_TABCONSTRAINT,
OBJECT_TABLE,
OBJECT_TABLESPACE,
diff --git a/src/include/nodes/relation.h b/src/include/nodes/relation.h
index 61519bb..7ae0f9e 100644
--- a/src/include/nodes/relation.h
+++ b/src/include/nodes/relation.h
@@ -479,6 +479,7 @@ typedef struct RelOptInfo
List *lateral_vars; /* LATERAL Vars and PHVs referenced by rel */
Relids lateral_referencers; /* rels that reference me laterally */
List *indexlist; /* list of IndexOptInfo */
+ List *mvstatlist; /* list of MVStatisticInfo */
BlockNumber pages; /* size estimates derived from pg_class */
double tuples;
double allvisfrac;
@@ -573,6 +574,33 @@ typedef struct IndexOptInfo
bool amhasgetbitmap; /* does AM have amgetbitmap interface? */
} IndexOptInfo;
+/*
+ * MVStatisticInfo
+ * Information about multivariate stats for planning/optimization
+ *
+ * This contains information about which columns are covered by the
+ * statistics (stakeys), which options were requested while adding the
+ * statistics (*_enabled), and which kinds of statistics were actually
+ * built and are available for the optimizer (*_built).
+ */
+typedef struct MVStatisticInfo
+{
+ NodeTag type;
+
+ Oid mvoid; /* OID of the statistics row */
+ RelOptInfo *rel; /* back-link to index's table */
+
+ /* enabled statistics */
+ bool deps_enabled; /* functional dependencies enabled */
+
+ /* built/available statistics */
+ bool deps_built; /* functional dependencies built */
+
+ /* columns in the statistics (attnums) */
+ int2vector *stakeys; /* attnums of the columns covered */
+
+} MVStatisticInfo;
+
/*
* EquivalenceClasses
diff --git a/src/include/utils/mvstats.h b/src/include/utils/mvstats.h
new file mode 100644
index 0000000..7ebd961
--- /dev/null
+++ b/src/include/utils/mvstats.h
@@ -0,0 +1,70 @@
+/*-------------------------------------------------------------------------
+ *
+ * mvstats.h
+ * Multivariate statistics and selectivity estimation functions.
+ *
+ *
+ * Portions Copyright (c) 1996-2014, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/utils/mvstats.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef MVSTATS_H
+#define MVSTATS_H
+
+#include "fmgr.h"
+#include "commands/vacuum.h"
+
+
+#define MVSTATS_MAX_DIMENSIONS 8 /* max number of attributes */
+
+/* An associative rule, tracking [a => b] dependency.
+ *
+ * TODO Make this work with multiple columns on both sides.
+ */
+typedef struct MVDependencyData {
+ int16 a;
+ int16 b;
+} MVDependencyData;
+
+typedef MVDependencyData* MVDependency;
+
+typedef struct MVDependenciesData {
+ uint32 magic; /* magic constant marker */
+ int32 ndeps; /* number of dependencies */
+ MVDependency deps[1]; /* XXX why not a pointer? */
+} MVDependenciesData;
+
+typedef MVDependenciesData* MVDependencies;
+
+#define MVSTAT_DEPS_MAGIC 0xB4549A2C /* marks serialized bytea */
+#define MVSTAT_DEPS_TYPE_BASIC 1 /* basic dependencies type */
+
+/*
+ * TODO Maybe fetching the histogram/MCV list separately is inefficient?
+ * Consider adding a single `fetch_stats` method, fetching all
+ * stats specified using flags (or something like that).
+ */
+
+bytea * serialize_mv_dependencies(MVDependencies dependencies);
+
+/* (de)serialization of the dependencies (stored as bytea in the catalog) */
+MVDependencies deserialize_mv_dependencies(bytea * data);
+
+/* FIXME this probably belongs somewhere else (not to operations stats) */
+extern Datum pg_mv_stats_dependencies_info(PG_FUNCTION_ARGS);
+extern Datum pg_mv_stats_dependencies_show(PG_FUNCTION_ARGS);
+
+MVDependencies
+build_mv_dependencies(int numrows, HeapTuple *rows,
+ int2vector *attrs,
+ VacAttrStats **stats);
+
+void build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
+ int natts, VacAttrStats **vacattrstats);
+
+void update_mv_stats(Oid mvoid, MVDependencies dependencies, int2vector *attrs);
+
+#endif
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index ff5672d..26c7f85 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -79,6 +79,7 @@ typedef struct RelationData
bool rd_isvalid; /* relcache entry is valid */
char rd_indexvalid; /* state of rd_indexlist: 0 = not valid, 1 =
* valid, 2 = temporarily forced */
+ bool rd_mvstatvalid; /* state of rd_mvstatlist: true/false */
/*
* rd_createSubid is the ID of the highest subtransaction the rel has
@@ -111,6 +112,9 @@ typedef struct RelationData
List *rd_indexlist; /* list of OIDs of indexes on relation */
Oid rd_oidindex; /* OID of unique index on OID, if any */
Oid rd_replidindex; /* OID of replica identity index, if any */
+
+ /* data managed by RelationGetMVStatList: */
+ List *rd_mvstatlist; /* list of OIDs of multivariate stats */
/* data managed by RelationGetIndexAttrBitmap: */
Bitmapset *rd_indexattr; /* identifies columns used in indexes */
diff --git a/src/include/utils/relcache.h b/src/include/utils/relcache.h
index 1b48304..9f03c8d 100644
--- a/src/include/utils/relcache.h
+++ b/src/include/utils/relcache.h
@@ -38,6 +38,7 @@ extern void RelationClose(Relation relation);
* Routines to compute/retrieve additional cached information
*/
extern List *RelationGetIndexList(Relation relation);
+extern List *RelationGetMVStatList(Relation relation);
extern Oid RelationGetOidIndex(Relation relation);
extern Oid RelationGetReplicaIndex(Relation relation);
extern List *RelationGetIndexExpressions(Relation relation);
diff --git a/src/include/utils/syscache.h b/src/include/utils/syscache.h
index 256615b..0e0658d 100644
--- a/src/include/utils/syscache.h
+++ b/src/include/utils/syscache.h
@@ -66,6 +66,8 @@ enum SysCacheIdentifier
INDEXRELID,
LANGNAME,
LANGOID,
+ MVSTATNAMENSP,
+ MVSTATOID,
NAMESPACENAME,
NAMESPACEOID,
OPERNAMENSP,
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 28b061f..2e2df8e 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1365,6 +1365,14 @@ pg_matviews| SELECT n.nspname AS schemaname,
LEFT JOIN pg_namespace n ON ((n.oid = c.relnamespace)))
LEFT JOIN pg_tablespace t ON ((t.oid = c.reltablespace)))
WHERE (c.relkind = 'm'::"char");
+pg_mv_stats| SELECT n.nspname AS schemaname,
+ c.relname AS tablename,
+ s.stakeys AS attnums,
+ length(s.stadeps) AS depsbytes,
+ pg_mv_stats_dependencies_info(s.stadeps) AS depsinfo
+ FROM ((pg_mv_statistic s
+ JOIN pg_class c ON ((c.oid = s.starelid)))
+ LEFT JOIN pg_namespace n ON ((n.oid = c.relnamespace)));
pg_policies| SELECT n.nspname AS schemaname,
c.relname AS tablename,
pol.polname AS policyname,
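
The pg_mv_stats view added above then gives a more readable summary of
the same data (a sketch):

SELECT tablename, attnums, depsinfo FROM pg_mv_stats;
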
diff --git a/src/test/regress/expected/sanity_check.out b/src/test/regress/expected/sanity_check.out
index eb0bc88..92a0d8a 100644
--- a/src/test/regress/expected/sanity_check.out
+++ b/src/test/regress/expected/sanity_check.out
@@ -113,6 +113,7 @@ pg_inherits|t
pg_language|t
pg_largeobject|t
pg_largeobject_metadata|t
+pg_mv_statistic|t
pg_namespace|t
pg_opclass|t
pg_operator|t
--
2.1.0
Attachment: 0003-clause-reduction-using-functional-dependencies.patch (application/x-patch)
From ebae50c43eb5c6fdd24efd726489dc40672ac184 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@pgaddict.com>
Date: Mon, 6 Apr 2015 19:42:18 +0200
Subject: [PATCH 3/7] clause reduction using functional dependencies
During planning, use functional dependencies to decide which
clauses to skip during cardinality estimation. Initial and
rather simplistic implementation.
This only works with regular WHERE clauses, not with join
clauses.
Note: clause_is_mv_compatible() needs to identify the
relation (so that we can fetch the list of multivariate stats
by OID). planner_rt_fetch() seems like the appropriate way to
get the relation OID, but apparently it only works with simple
vars. Maybe examine_variable() would make this work with more
complex vars too?
Includes regression tests analyzing functional dependencies
(part of ANALYZE) on several datasets (no dependencies, no
transitive dependencies, ...).
Checks that a query with conditions on two columns, where one (B)
is functionally dependent on the other one (A), correctly ignores
the clause on (B) and chooses bitmap index scan instead of plain
index scan (which is what happens otherwise, thanks to the
assumption of independence).
Note: Functional dependencies only work with equality clauses,
no inequalities etc.
---
src/backend/optimizer/path/clausesel.c | 912 +++++++++++++++++++++++++-
src/backend/utils/mvstats/common.c | 5 +-
src/backend/utils/mvstats/dependencies.c | 24 +
src/include/utils/mvstats.h | 16 +-
src/test/regress/expected/mv_dependencies.out | 172 +++++
src/test/regress/parallel_schedule | 3 +
src/test/regress/serial_schedule | 1 +
src/test/regress/sql/mv_dependencies.sql | 150 +++++
8 files changed, 1278 insertions(+), 5 deletions(-)
create mode 100644 src/test/regress/expected/mv_dependencies.out
create mode 100644 src/test/regress/sql/mv_dependencies.sql
diff --git a/src/backend/optimizer/path/clausesel.c b/src/backend/optimizer/path/clausesel.c
index 02660c2..e834722 100644
--- a/src/backend/optimizer/path/clausesel.c
+++ b/src/backend/optimizer/path/clausesel.c
@@ -14,14 +14,19 @@
*/
#include "postgres.h"
+#include "access/sysattr.h"
+#include "catalog/pg_operator.h"
#include "nodes/makefuncs.h"
#include "optimizer/clauses.h"
#include "optimizer/cost.h"
#include "optimizer/pathnode.h"
#include "optimizer/plancat.h"
+#include "optimizer/var.h"
#include "utils/fmgroids.h"
#include "utils/lsyscache.h"
+#include "utils/mvstats.h"
#include "utils/selfuncs.h"
+#include "utils/typcache.h"
/*
@@ -41,6 +46,44 @@ typedef struct RangeQueryClause
static void addRangeClause(RangeQueryClause **rqlist, Node *clause,
bool varonleft, bool isLTsel, Selectivity s2);
+#define MV_CLAUSE_TYPE_FDEP 0x01
+
+static bool clause_is_mv_compatible(PlannerInfo *root, Node *clause, Oid varRelid,
+ Index *relid, AttrNumber *attnum, SpecialJoinInfo *sjinfo);
+
+static Bitmapset *collect_mv_attnums(PlannerInfo *root, List *clauses,
+ Oid varRelid, Index *relid, SpecialJoinInfo *sjinfo);
+
+static List *clauselist_apply_dependencies(PlannerInfo *root, List *clauses,
+ Oid varRelid, List *stats,
+ SpecialJoinInfo *sjinfo);
+
+static bool has_stats(List *stats, int type);
+
+static List * find_stats(PlannerInfo *root, List *clauses,
+ Oid varRelid, Index *relid);
+
+static Bitmapset* fdeps_collect_attnums(List *stats);
+
+static int *make_idx_to_attnum_mapping(Bitmapset *attnums);
+static int *make_attnum_to_idx_mapping(Bitmapset *attnums);
+
+static bool *build_adjacency_matrix(List *stats, Bitmapset *attnums,
+ int *idx_to_attnum, int *attnum_to_idx);
+
+static void multiply_adjacency_matrix(bool *matrix, int natts);
+
+static List* fdeps_reduce_clauses(List *clauses,
+ Bitmapset *attnums, bool *matrix,
+ int *idx_to_attnum, int *attnum_to_idx,
+ Index relid);
+
+static Bitmapset *fdeps_filter_clauses(PlannerInfo *root,
+ List *clauses, Bitmapset *deps_attnums,
+ List **reduced_clauses, List **deps_clauses,
+ Oid varRelid, Index *relid, SpecialJoinInfo *sjinfo);
+
+static Bitmapset * get_varattnos(Node * node, Index relid);
/****************************************************************************
* ROUTINES TO COMPUTE SELECTIVITIES
@@ -60,7 +103,7 @@ static void addRangeClause(RangeQueryClause **rqlist, Node *clause,
* subclauses. However, that's only right if the subclauses have independent
* probabilities, and in reality they are often NOT independent. So,
* we want to be smarter where we can.
-
+ *
* Currently, the only extra smarts we have is to recognize "range queries",
* such as "x > 34 AND x < 42". Clauses are recognized as possible range
* query components if they are restriction opclauses whose operators have
@@ -87,6 +130,88 @@ static void addRangeClause(RangeQueryClause **rqlist, Node *clause,
*
* Of course this is all very dependent on the behavior of
* scalarltsel/scalargtsel; perhaps some day we can generalize the approach.
+ *
+ *
+ * Multivariate statistics
+ * -----------------------
+ * This also uses multivariate stats to estimate combinations of
+ * conditions, in a way (a) maximizing the estimate accuracy by using
+ * as many stats as possible, and (b) minimizing the overhead,
+ * especially when there are no suitable multivariate stats (so if you
+ * are not using multivariate stats, there's no additional overhead).
+ *
+ * The following checks are performed (in this order), and the optimizer
+ * falls back to regular stats on the first 'false'.
+ *
+ * NOTE: This explains how this works with all the patches applied, not
+ * just the functional dependencies.
+ *
+ * (0) check if there are multivariate stats on the relation
+ *
+ * If not, just skip all the following steps (directly to the
+ * original code).
+ *
+ * (1) check how many attributes there are in conditions compatible
+ * with functional dependencies
+ *
+ * Only simple equality clauses are considered compatible with
+ * functional dependencies (and that's unlikely to change, because
+ * that's the only case when functional dependencies are useful).
+ *
+ * If there are no conditions that might be handled by multivariate
+ * stats, or if the conditions reference just a single column, it
+ * makes no sense to use functional dependencies, so skip to (4).
+ *
+ * (2) reduce the clauses using functional dependencies
+ *
+ * This simply attempts to 'reduce' the clauses by applying functional
+ * dependencies. For example if there are two clauses:
+ *
+ * WHERE (a = 1) AND (b = 2)
+ *
+ * and we know that 'a' determines the value of 'b', we may remove
+ * the second condition (b = 2) when computing the selectivity.
+ * This is of course tricky - see mvstats/dependencies.c for details.
+ *
+ * After the reduction, step (1) is to be repeated.
+ *
+ * (3) check which conditions are compatible
+ * with MCV lists and histograms
+ *
+ * What conditions are compatible with multivariate stats is decided
+ * by clause_is_mv_compatible(). At this moment, only conditions
+ * of the form "column operator constant" (for simple comparison
+ * operators), IS [NOT] NULL and some AND/OR clauses are considered
+ * compatible with multivariate statistics.
+ *
+ * Again, see clause_is_mv_compatible() for details.
+ *
+ * (4) check how many attributes there are in conditions compatible
+ * with MCV lists and histograms
+ *
+ * If there are no conditions that might be handled by MCV lists
+ * or histograms, or if the conditions reference just a single
+ * column, it makes no sense to continue, so just skip to (7).
+ *
+ * (5) choose the stats matching the most columns
+ *
+ * If there are multiple instances of multivariate statistics (e.g.
+ * built on different sets of columns), we choose the stats covering
+ * the most columns from step (1). It may happen that all available
+ * stats match just a single column - for example with conditions
+ *
+ * WHERE a = 1 AND b = 2
+ *
+ * and statistics built on (a,c) and (b,c). In such case just fall
+ * back to the regular stats because it makes no sense to use the
+ * multivariate statistics.
+ *
+ * For more details about how exactly we choose the stats, see
+ * choose_mv_statistics().
+ *
+ * (6) use the multivariate stats to estimate matching clauses
+ *
+ * (7) estimate the remaining clauses using the regular statistics
*/
Selectivity
clauselist_selectivity(PlannerInfo *root,
@@ -99,6 +224,16 @@ clauselist_selectivity(PlannerInfo *root,
RangeQueryClause *rqlist = NULL;
ListCell *l;
+ /* processing mv stats */
+ Oid relid = InvalidOid;
+
+ /* attributes in mv-compatible clauses */
+ Bitmapset *mvattnums = NULL;
+ List *stats = NIL;
+
+ /* use clauses (not conditions), because those are always non-empty */
+ stats = find_stats(root, clauses, varRelid, &relid);
+
/*
* If there's exactly one clause, then no use in trying to match up pairs,
* so just go directly to clause_selectivity().
@@ -108,6 +243,31 @@ clauselist_selectivity(PlannerInfo *root,
varRelid, jointype, sjinfo);
/*
+ * Check that there are some stats with functional dependencies
+ * built (by walking the stats list). We're going to find that
+ * anyway when trying to apply the functional dependencies, but
+ * this is probably a tad faster.
+ */
+ if (has_stats(stats, MV_CLAUSE_TYPE_FDEP))
+ {
+ /* collect attributes referenced by mv-compatible clauses */
+ mvattnums = collect_mv_attnums(root, clauses, varRelid, &relid, sjinfo);
+
+ /*
+ * If there are mv-compatible clauses, referencing at least two
+ * different columns (otherwise it makes no sense to use mv stats),
+ * try to reduce the clauses using functional dependencies, and
+ * recollect the attributes from the reduced list.
+ *
+ * We don't need to select a single statistics for this - we can
+ * apply all the functional dependencies we have.
+ */
+ if (bms_num_members(mvattnums) >= 2)
+ clauses = clauselist_apply_dependencies(root, clauses, varRelid,
+ stats, sjinfo);
+ }
+
+ /*
* Initial scan over clauses. Anything that doesn't look like a potential
* rangequery clause gets multiplied into s1 and forgotten. Anything that
* does gets inserted into an rqlist entry.
@@ -763,3 +923,753 @@ clause_selectivity(PlannerInfo *root,
return s1;
}
+
+/*
+ * Collect attributes from mv-compatible clauses.
+ */
+static Bitmapset *
+collect_mv_attnums(PlannerInfo *root, List *clauses, Oid varRelid,
+ Index *relid, SpecialJoinInfo *sjinfo)
+{
+ Bitmapset *attnums = NULL;
+ ListCell *l;
+
+ /*
+ * Walk through the clauses and identify the ones we can estimate
+ * using multivariate stats, and remember the relid/columns. We'll
+ * then cross-check if we have suitable stats, and only if needed
+ * we'll split the clauses into multivariate and regular lists.
+ *
+ * For now we're only interested in RestrictInfo nodes with nested
+ * OpExpr, using either a range or equality.
+ */
+ foreach (l, clauses)
+ {
+ AttrNumber attnum;
+ Node *clause = (Node *) lfirst(l);
+
+ /* ignore the result for now - we only need the info */
+ if (clause_is_mv_compatible(root, clause, varRelid, relid, &attnum, sjinfo))
+ attnums = bms_add_member(attnums, attnum);
+ }
+
+ /*
+ * If there are not at least two attributes referenced by the clause(s),
+ * we can throw everything out (as we'll revert to simple stats).
+ */
+ if (bms_num_members(attnums) <= 1)
+ {
+ if (attnums != NULL)
+ pfree(attnums);
+ attnums = NULL;
+ *relid = InvalidOid;
+ }
+
+ return attnums;
+}
+
+/*
+ * Determines whether the clause is compatible with multivariate stats,
+ * and if it is, returns some additional information - varno (index
+ * into simple_rte_array) and a bitmap of attributes. This is then
+ * used to fetch related multivariate statistics.
+ *
+ * At this moment we only support basic conditions of the form
+ *
+ * variable OP constant
+ *
+ * where OP is (for now) the equality operator, which is determined
+ * by looking at the associated function for estimating selectivity,
+ * just like in the single-dimensional case.
+ *
+ * TODO Support 'OR clauses' - shouldn't be all that difficult to
+ * evaluate them using multivariate stats.
+ */
+static bool
+clause_is_mv_compatible(PlannerInfo *root, Node *clause, Oid varRelid,
+ Index *relid, AttrNumber *attnum, SpecialJoinInfo *sjinfo)
+{
+
+ if (IsA(clause, RestrictInfo))
+ {
+ RestrictInfo *rinfo = (RestrictInfo *) clause;
+
+ /* Pseudoconstants are not really interesting here. */
+ if (rinfo->pseudoconstant)
+ return false;
+
+ /* no support for OR clauses at this point */
+ if (rinfo->orclause)
+ return false;
+
+ /* get the actual clause from the RestrictInfo (it's not an OR clause) */
+ clause = (Node*)rinfo->clause;
+
+ /* only simple opclauses are compatible with multivariate stats */
+ if (! is_opclause(clause))
+ return false;
+
+ /* we don't support join conditions at this moment */
+ if (treat_as_join_clause(clause, rinfo, varRelid, sjinfo))
+ return false;
+
+ /* is it 'variable op constant' ? */
+ if (list_length(((OpExpr *) clause)->args) == 2)
+ {
+ OpExpr *expr = (OpExpr *) clause;
+ bool varonleft = true;
+ bool ok;
+
+ ok = (bms_membership(rinfo->clause_relids) == BMS_SINGLETON) &&
+ (is_pseudo_constant_clause_relids(lsecond(expr->args),
+ rinfo->right_relids) ||
+ (varonleft = false,
+ is_pseudo_constant_clause_relids(linitial(expr->args),
+ rinfo->left_relids)));
+
+ if (ok)
+ {
+ Var * var = (varonleft) ? linitial(expr->args) : lsecond(expr->args);
+
+ /*
+ * Simple variables only - otherwise the planner_rt_fetch seems to fail
+ * (return NULL).
+ *
+ * TODO Maybe using examine_variable() would fix that?
+ */
+ if (! (IsA(var, Var) && (varRelid == 0 || varRelid == var->varno)))
+ return false;
+
+ /*
+ * Only consider this variable if (varRelid == 0) or when the varno
+ * matches varRelid (see explanation at clause_selectivity).
+ *
+ * FIXME I suspect this may not be really necessary. The (varRelid == 0)
+ * part seems to be enforced by treat_as_join_clause().
+ */
+ if (! ((varRelid == 0) || (varRelid == var->varno)))
+ return false;
+
+ /* Also skip special varno values, and system attributes ... */
+ if ((IS_SPECIAL_VARNO(var->varno)) || (! AttrNumberIsForUserDefinedAttr(var->varattno)))
+ return false;
+
+ *relid = var->varno;
+
+ /*
+ * If it's not a "<" or ">" or "=" operator, just ignore the
+ * clause. Otherwise note the relid and attnum for the variable.
+ * This uses the function for estimating selectivity, not the
+ * operator directly (a bit awkward, but well ...).
+ */
+ switch (get_oprrest(expr->opno))
+ {
+ case F_EQSEL:
+ *attnum = var->varattno;
+ return true;
+ }
+ }
+ }
+ }
+
+ return false;
+
+}
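+
+/*
+ * Illustration (not part of the patch) of what the check above
+ * accepts, given columns 'a' and 'b':
+ *
+ *   WHERE a = 1   -> compatible (Var = Const, estimated by F_EQSEL)
+ *   WHERE 1 = a   -> compatible (Const = Var is handled via varonleft)
+ *   WHERE a < 1   -> not compatible yet (only F_EQSEL is accepted)
+ *   WHERE a = b   -> not compatible (neither side is a pseudo-constant)
+ */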
+
+/*
+ * Performs reduction of clauses using functional dependencies, i.e.
+ * removes clauses that are considered redundant. It simply walks
+ * through dependencies, and checks whether the dependency 'matches'
+ * the clauses, i.e. if there's a clause matching the condition. If yes,
+ * all clauses matching the implied part of the dependency are removed
+ * from the list.
+ *
+ * This simply looks at attnums referenced by the clauses, not at the
+ * type of the operator (equality, inequality, ...). This may not be the
+ * right way to do it - it certainly works best for equalities, which is
+ * naturally consistent with functional dependencies (implications).
+ * It's not clear that other operators are handled sensibly - for
+ * example for inequalities, like
+ *
+ * WHERE (A >= 10) AND (B <= 20)
+ *
+ * and a trivial case where [A == B], resulting in a symmetric pair of
+ * rules [A => B], [B => A], it's rather clear we can't remove either of
+ * those clauses.
+ *
+ * That only highlights that functional dependencies are most suitable
+ * for label-like data, where using non-equality operators is very rare.
+ * Using the common city/zipcode example, clauses like
+ *
+ * (zipcode <= 12345)
+ *
+ * or
+ *
+ * (cityname >= 'Washington')
+ *
+ * are rare. So restricting the reduction to equality should not harm
+ * the usefulness / applicability.
+ *
+ * Another assumption is that the clauses are 'compatible'. For
+ * example with a mismatching zip code and city name, this is unable
+ * to identify the discrepancy and still eliminates one of the clauses.
+ * The usual approach (multiplying both selectivities) thus produces a
+ * more accurate estimate, although mostly by luck - the multiplication
+ * comes from the assumption of statistical independence of the two
+ * conditions (which is not valid in this case), but moves the
+ * estimate in the right direction (towards 0%).
+ *
+ * This might be somewhat improved by cross-checking the selectivities
+ * against MCV and/or histogram.
+ *
+ * The implementation needs to be careful about cyclic rules, i.e. rules
+ * like [A => B] and [B => A] at the same time. This must not reduce
+ * clauses on both attributes at the same time.
+ *
+ * Technically we might consider selectivities here too, somehow. E.g.
+ * when (A => B) and (B => A), we might use the clauses with minimum
+ * selectivity.
+ *
+ * TODO Consider restricting the reduction to equality clauses. Or maybe
+ * use equality classes somehow?
+ *
+ * TODO Merge these docs into dependencies.c, as they say mostly the
+ * same things as the comments there.
+ *
+ * TODO Currently this is applied only to the top-level clauses, but
+ * maybe we could apply it to lists at subtrees too, e.g. to the
+ * two AND-clauses in
+ *
+ * (x=1 AND y=2) OR (z=3 AND q=10)
+ *
+ */
+static List *
+clauselist_apply_dependencies(PlannerInfo *root, List *clauses,
+ Oid varRelid, List *stats,
+ SpecialJoinInfo *sjinfo)
+{
+ List *reduced_clauses = NIL;
+ Index relid;
+
+ /*
+ * matrix of (natts x natts), 1 means x=>y
+ *
+ * This serves two purposes - first, it merges dependencies from all
+ * the statistics, second it makes generating all the transitive
+ * dependencies easier.
+ *
+ * We need to build this only for attributes from the dependencies,
+ * not for all attributes in the table.
+ *
+ * We can't do that only for attributes from the clauses, because we
+ * want to build transitive dependencies (including those going
+ * through attributes not listed in the stats).
+ *
+ * This only works for A=>B dependencies, not sure how to do that
+ * for complex dependencies.
+ */
+ bool *deps_matrix;
+ int deps_natts; /* size of the matrix */
+
+ /* mapping attnum <=> matrix index */
+ int *deps_idx_to_attnum;
+ int *deps_attnum_to_idx;
+
+ /* attnums in dependencies and clauses (and intersection) */
+ List *deps_clauses = NIL;
+ Bitmapset *deps_attnums = NULL;
+ Bitmapset *clause_attnums = NULL;
+ Bitmapset *intersect_attnums = NULL;
+
+ /*
+ * Is there at least one statistics with functional dependencies?
+ * If not, return the original clauses right away.
+ *
+ * XXX Isn't this pointless, thanks to exactly the same check in
+ * clauselist_selectivity()? Can we trigger the condition here?
+ */
+ if (! has_stats(stats, MV_CLAUSE_TYPE_FDEP))
+ return clauses;
+
+ /*
+ * Build the dependency matrix, i.e. attribute adjacency matrix,
+ * where 1 means (a=>b). Once we have the adjacency matrix, we'll
+ * multiply it by itself, to get transitive dependencies.
+ *
+ * Note: This is pretty much transitive closure from graph theory.
+ *
+ * First, let's see what attributes are covered by functional
+ * dependencies (sides of the adjacency matrix), and also a maximum
+ * attribute (size of mapping to simple integer indexes);
+ */
+ deps_attnums = fdeps_collect_attnums(stats);
+
+ /*
+ * Walk through the clauses - clauses that are (one of)
+ *
+ * (a) not mv-compatible
+ * (b) are using more than a single attnum
+ * (c) using attnum not covered by functional dependencies
+ *
+ * may be copied directly to the result. The interesting clauses are
+ * kept in 'deps_clauses' and will be processed later.
+ */
+ clause_attnums = fdeps_filter_clauses(root, clauses, deps_attnums,
+ &reduced_clauses, &deps_clauses,
+ varRelid, &relid, sjinfo);
+
+ /*
+ * we need at least two clauses referencing two different attributes
+ * to do the reduction
+ */
+ if ((list_length(deps_clauses) < 2) || (bms_num_members(clause_attnums) < 2))
+ {
+ bms_free(clause_attnums);
+ list_free(reduced_clauses);
+ list_free(deps_clauses);
+
+ return clauses;
+ }
+
+
+ /*
+ * We need at least two matching attributes in the clauses and
+ * dependencies, otherwise we can't really reduce anything.
+ */
+ intersect_attnums = bms_intersect(clause_attnums, deps_attnums);
+ if (bms_num_members(intersect_attnums) < 2)
+ {
+ bms_free(clause_attnums);
+ bms_free(deps_attnums);
+ bms_free(intersect_attnums);
+
+ list_free(deps_clauses);
+ list_free(reduced_clauses);
+
+ return clauses;
+ }
+
+ /*
+ * Build mapping between matrix indexes and attnums, and then the
+ * adjacency matrix itself.
+ */
+ deps_idx_to_attnum = make_idx_to_attnum_mapping(deps_attnums);
+ deps_attnum_to_idx = make_attnum_to_idx_mapping(deps_attnums);
+
+ /* build the adjacency matrix */
+ deps_matrix = build_adjacency_matrix(stats, deps_attnums,
+ deps_idx_to_attnum,
+ deps_attnum_to_idx);
+
+ deps_natts = bms_num_members(deps_attnums);
+
+ /*
+ * Multiply the matrix N-times (N = size of the matrix), so that we
+ * get all the transitive dependencies. That makes the next step
+ * much easier and faster.
+ *
+ * This is essentially an adjacency matrix from graph theory, and
+ * by multiplying it we get transitive edges. We don't really care
+ * about the exact number (number of paths between vertices) though,
+ * so we can do the multiplication in-place (we don't care whether
+ * we found the dependency in this round or in the previous one).
+ *
+ * Track how many new dependencies were added, and stop when 0, but
+ * we can't multiply more than N-times (longest path in the graph).
+ */
+ multiply_adjacency_matrix(deps_matrix, deps_natts);
+
+ /*
+ * Walk through the clauses, and see which other clauses we may
+ * reduce. The matrix contains all transitive dependencies, which
+ * makes this very fast.
+ *
+ * We have to be careful not to reduce the clause using itself, or
+ * reducing all clauses forming a cycle (so we have to skip already
+ * eliminated clauses).
+ *
+ * I'm not sure whether this guarantees finding the best solution,
+ * i.e. reducing the most clauses, but it probably does (thanks to
+ * having all the transitive dependencies).
+ */
+ deps_clauses = fdeps_reduce_clauses(deps_clauses,
+ deps_attnums, deps_matrix,
+ deps_idx_to_attnum,
+ deps_attnum_to_idx, relid);
+
+ /* join the two lists of clauses */
+ reduced_clauses = list_union(reduced_clauses, deps_clauses);
+
+ pfree(deps_matrix);
+ pfree(deps_idx_to_attnum);
+ pfree(deps_attnum_to_idx);
+
+ bms_free(deps_attnums);
+ bms_free(clause_attnums);
+ bms_free(intersect_attnums);
+
+ return reduced_clauses;
+}
+
+static bool
+has_stats(List *stats, int type)
+{
+ ListCell *s;
+
+ foreach (s, stats)
+ {
+ MVStatisticInfo *stat = (MVStatisticInfo *)lfirst(s);
+
+ if ((type & MV_CLAUSE_TYPE_FDEP) && stat->deps_built)
+ return true;
+ }
+
+ return false;
+}
+
+/*
+ * Determine the relid (either from varRelid or from the clauses) and
+ * then look up the stats using that relid.
+ */
+static List *
+find_stats(PlannerInfo *root, List *clauses, Oid varRelid, Index *relid)
+{
+ /* unknown relid by default */
+ *relid = InvalidOid;
+
+ /*
+ * First we need to find the relid (index into simple_rel_array).
+ * If varRelid is not 0, we already have it, otherwise we have to
+ * look it up from the clauses.
+ */
+ if (varRelid != 0)
+ *relid = varRelid;
+ else
+ {
+ Relids relids = pull_varnos((Node*)clauses);
+
+ /*
+ * We only expect 0 or 1 members in the bitmapset. If there are
+ * no vars, we'll get an empty bitmapset, otherwise we'll get the
+ * relid as the single member.
+ *
+ * FIXME For some reason we can get 2 relids here (e.g. \d in
+ * psql does that).
+ */
+ if (bms_num_members(relids) == 1)
+ *relid = bms_singleton_member(relids);
+
+ bms_free(relids);
+ }
+
+ /*
+ * if we found the relid, we can get the stats from simple_rel_array
+ *
+ * This only gets stats that are already built, because that's how
+ * we load it into RelOptInfo (see get_relation_info), but we don't
+ * detoast the whole stats yet. That'll be done later, after we
+ * decide which stats to use.
+ */
+ if (*relid != InvalidOid)
+ return root->simple_rel_array[*relid]->mvstatlist;
+
+ return NIL;
+}
+
+static Bitmapset*
+fdeps_collect_attnums(List *stats)
+{
+ ListCell *lc;
+ Bitmapset *attnums = NULL;
+
+ foreach (lc, stats)
+ {
+ int j;
+ MVStatisticInfo *info = (MVStatisticInfo *)lfirst(lc);
+
+ int2vector *stakeys = info->stakeys;
+
+ /* skip stats without functional dependencies built */
+ if (! info->deps_built)
+ continue;
+
+ for (j = 0; j < stakeys->dim1; j++)
+ attnums = bms_add_member(attnums, stakeys->values[j]);
+ }
+
+ return attnums;
+}
+
+
+static int*
+make_idx_to_attnum_mapping(Bitmapset *attnums)
+{
+ int attidx = 0;
+ int attnum = -1;
+
+ int *mapping = (int*)palloc0(bms_num_members(attnums) * sizeof(int));
+
+ while ((attnum = bms_next_member(attnums, attnum)) >= 0)
+ mapping[attidx++] = attnum;
+
+ Assert(attidx == bms_num_members(attnums));
+
+ return mapping;
+}
+
+static int*
+make_attnum_to_idx_mapping(Bitmapset *attnums)
+{
+ int attidx = 0;
+ int attnum = -1;
+ int maxattnum = -1;
+ int *mapping;
+
+ while ((attnum = bms_next_member(attnums, attnum)) >= 0)
+ maxattnum = attnum;
+
+ mapping = (int*)palloc0((maxattnum+1) * sizeof(int));
+
+ attnum = -1;
+ while ((attnum = bms_next_member(attnums, attnum)) >= 0)
+ mapping[attnum] = attidx++;
+
+ Assert(attidx == bms_num_members(attnums));
+
+ return mapping;
+}
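+
+/*
+ * Illustration (not part of the patch): for attnums {2, 5, 7} the two
+ * mappings produced above look like this:
+ *
+ *   idx_to_attnum = [2, 5, 7]                 (dense, natts entries)
+ *   attnum_to_idx = [-, -, 0, -, -, 1, -, 2]  (sparse, indexed by attnum)
+ *
+ * so attnum_to_idx[idx_to_attnum[i]] == i for every index i.
+ */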
+
+static bool*
+build_adjacency_matrix(List *stats, Bitmapset *attnums,
+ int *idx_to_attnum, int *attnum_to_idx)
+{
+ ListCell *lc;
+ int natts = bms_num_members(attnums);
+ bool *matrix = (bool*)palloc0(natts * natts * sizeof(bool));
+
+ foreach (lc, stats)
+ {
+ int j;
+ MVStatisticInfo *stat = (MVStatisticInfo *)lfirst(lc);
+ MVDependencies dependencies = NULL;
+
+ /* skip stats without functional dependencies built */
+ if (! stat->deps_built)
+ continue;
+
+ /* fetch and deserialize dependencies */
+ dependencies = load_mv_dependencies(stat->mvoid);
+ if (dependencies == NULL)
+ {
+ elog(WARNING, "failed to deserialize func deps %d", stat->mvoid);
+ continue;
+ }
+
+ /* set matrix[a,b] to 'true' if 'a=>b' */
+ for (j = 0; j < dependencies->ndeps; j++)
+ {
+ int aidx = attnum_to_idx[dependencies->deps[j]->a];
+ int bidx = attnum_to_idx[dependencies->deps[j]->b];
+
+ /* a=> b */
+ matrix[aidx * natts + bidx] = true;
+ }
+ }
+
+ return matrix;
+}
+
+static void
+multiply_adjacency_matrix(bool *matrix, int natts)
+{
+ int i;
+
+ for (i = 0; i < natts; i++)
+ {
+ int k, l, m;
+ int nchanges = 0;
+
+ /* k => l */
+ for (k = 0; k < natts; k++)
+ {
+ for (l = 0; l < natts; l++)
+ {
+ /* we already have this dependency */
+ if (matrix[k * natts + l])
+ continue;
+
+ /* we don't really care about the exact value, just 0/1 */
+ for (m = 0; m < natts; m++)
+ {
+ if (matrix[k * natts + m] * matrix[m * natts + l])
+ {
+ matrix[k * natts + l] = true;
+ nchanges += 1;
+ break;
+ }
+ }
+ }
+ }
+
+ /* no transitive dependency added here, so terminate */
+ if (nchanges == 0)
+ break;
+ }
+}
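+
+/*
+ * Illustration (not part of the patch), with attributes a, b, c mapped
+ * to indexes 0, 1, 2 and initial dependencies [a => b] and [b => c]:
+ *
+ *        initial              after the first pass
+ *      a  b  c                     a  b  c
+ *   a  .  T  .                  a  .  T  T    (transitive a => c)
+ *   b  .  .  T                  b  .  .  T
+ *   c  .  .  .                  c  .  .  .
+ *
+ * The second pass adds nothing (nchanges == 0), so the loop stops
+ * early instead of running all natts iterations.
+ */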
+
+static List*
+fdeps_reduce_clauses(List *clauses, Bitmapset *attnums, bool *matrix,
+ int *idx_to_attnum, int *attnum_to_idx, Index relid)
+{
+ int i;
+ ListCell *lc;
+ List *reduced_clauses = NIL;
+
+ int nmvclauses; /* size of the arrays */
+ bool *reduced;
+ AttrNumber *mvattnums;
+ Node **mvclauses;
+
+ int natts = bms_num_members(attnums);
+
+ /*
+ * Preallocate space for all clauses (the list only contains
+ * compatible clauses at this point). This makes it somewhat easier
+ * to access the stats / attnums randomly.
+ *
+ * XXX This assumes each clause references exactly one Var, so the
+ * arrays are sized accordingly - for functional dependencies
+ * this is safe, because it only works with Var=Const.
+ */
+ mvclauses = (Node**)palloc0(list_length(clauses) * sizeof(Node*));
+ mvattnums = (AttrNumber*)palloc0(list_length(clauses) * sizeof(AttrNumber));
+ reduced = (bool*)palloc0(list_length(clauses) * sizeof(bool));
+
+ /* fill the arrays */
+ nmvclauses = 0;
+ foreach (lc, clauses)
+ {
+ Node * clause = (Node*)lfirst(lc);
+ Bitmapset * attnums = get_varattnos(clause, relid);
+
+ mvclauses[nmvclauses] = clause;
+ mvattnums[nmvclauses] = bms_singleton_member(attnums);
+ nmvclauses++;
+ }
+
+ Assert(nmvclauses == list_length(clauses));
+
+ /* now try to reduce the clauses (using the dependencies) */
+ for (i = 0; i < nmvclauses; i++)
+ {
+ int j;
+
+ /* not covered by dependencies */
+ if (! bms_is_member(mvattnums[i], attnums))
+ continue;
+
+ /* this clause was already reduced, so let's skip it */
+ if (reduced[i])
+ continue;
+
+ /* walk the potentially 'implied' clauses */
+ for (j = 0; j < nmvclauses; j++)
+ {
+ int aidx, bidx;
+
+ /* not covered by dependencies */
+ if (! bms_is_member(mvattnums[j], attnums))
+ continue;
+
+ aidx = attnum_to_idx[mvattnums[i]];
+ bidx = attnum_to_idx[mvattnums[j]];
+
+ /* can't reduce the clause by itself, or if already reduced */
+ if ((i == j) || reduced[j])
+ continue;
+
+ /* mark the clause as reduced (if aidx => bidx) */
+ reduced[j] = matrix[aidx * natts + bidx];
+ }
+ }
+
+ /* now walk through the clauses, and keep only those not reduced */
+ for (i = 0; i < nmvclauses; i++)
+ if (! reduced[i])
+ reduced_clauses = lappend(reduced_clauses, mvclauses[i]);
+
+ pfree(reduced);
+ pfree(mvclauses);
+ pfree(mvattnums);
+
+ return reduced_clauses;
+}
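+
+/*
+ * Illustration (not part of the patch): with clauses on attributes A
+ * and B and a symmetric (cyclic) pair of dependencies [A => B] and
+ * [B => A], the loops above behave like this:
+ *
+ *   i = 0 (clause on A): j = 1, [A => B] holds and reduced[1] is not
+ *                        set yet, so reduced[1] = true
+ *   i = 1 (clause on B): skipped entirely, because reduced[1] is set
+ *
+ * Only the clause on A survives - the cycle cannot eliminate both.
+ */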
+
+
+static Bitmapset *
+fdeps_filter_clauses(PlannerInfo *root,
+ List *clauses, Bitmapset *deps_attnums,
+ List **reduced_clauses, List **deps_clauses,
+ Oid varRelid, Index *relid, SpecialJoinInfo *sjinfo)
+{
+ ListCell *lc;
+ Bitmapset *clause_attnums = NULL;
+
+ foreach (lc, clauses)
+ {
+ AttrNumber attnum;
+ Node *clause = (Node *) lfirst(lc);
+
+ if (! clause_is_mv_compatible(root, clause, varRelid, relid,
+ &attnum, sjinfo))
+
+ /* clause incompatible with functional dependencies */
+ *reduced_clauses = lappend(*reduced_clauses, clause);
+
+ else if (! bms_is_member(attnum, deps_attnums))
+
+ /* clause not covered by the dependencies */
+ *reduced_clauses = lappend(*reduced_clauses, clause);
+
+ else
+ {
+ *deps_clauses = lappend(*deps_clauses, clause);
+ clause_attnums = bms_add_member(clause_attnums, attnum);
+ }
+ }
+
+ return clause_attnums;
+}
+
+/*
+ * Pull varattnos from the clauses, similarly to pull_varattnos() but:
+ *
+ * (a) only get attributes for a particular relation (relid)
+ * (b) ignore system attributes (we can't build stats on them anyway)
+ *
+ * This makes it possible to directly compare the result with attnum
+ * values from pg_attribute etc.
+ */
+static Bitmapset *
+get_varattnos(Node * node, Index relid)
+{
+ int k;
+ Bitmapset *varattnos = NULL;
+ Bitmapset *result = NULL;
+
+ /* get the varattnos */
+ pull_varattnos(node, relid, &varattnos);
+
+ k = -1;
+ while ((k = bms_next_member(varattnos, k)) >= 0)
+ {
+ if (k + FirstLowInvalidHeapAttributeNumber > 0)
+ result = bms_add_member(result,
+ k + FirstLowInvalidHeapAttributeNumber);
+ }
+
+ bms_free(varattnos);
+
+ return result;
+}
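+
+/*
+ * Illustration (not part of the patch): pull_varattnos() stores each
+ * attnum shifted, i.e.
+ *
+ *   member k = varattno - FirstLowInvalidHeapAttributeNumber
+ *
+ * so that system attributes (negative varattnos, e.g. ctid) still map
+ * to positive bitmapset members. Adding the offset back recovers the
+ * original varattno, and the (k + FirstLowInvalidHeapAttributeNumber > 0)
+ * test above keeps only regular user columns.
+ */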
diff --git a/src/backend/utils/mvstats/common.c b/src/backend/utils/mvstats/common.c
index a755c49..bd200bc 100644
--- a/src/backend/utils/mvstats/common.c
+++ b/src/backend/utils/mvstats/common.c
@@ -84,7 +84,8 @@ build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
/*
* Analyze functional dependencies of columns.
*/
- deps = build_mv_dependencies(numrows, rows, attrs, stats);
+ if (stat->deps_enabled)
+ deps = build_mv_dependencies(numrows, rows, attrs, stats);
/* store the histogram / MCV list in the catalog */
update_mv_stats(stat->mvoid, deps, attrs);
@@ -163,6 +164,7 @@ list_mv_stats(Oid relid)
info->mvoid = HeapTupleGetOid(htup);
info->stakeys = buildint2vector(stats->stakeys.values, stats->stakeys.dim1);
+ info->deps_enabled = stats->deps_enabled;
info->deps_built = stats->deps_built;
result = lappend(result, info);
@@ -274,6 +276,7 @@ compare_scalars_partition(const void *a, const void *b, void *arg)
return ApplySortComparator(da, false, db, false, ssup);
}
+
/* initialize multi-dimensional sort */
MultiSortSupport
multi_sort_init(int ndims)
diff --git a/src/backend/utils/mvstats/dependencies.c b/src/backend/utils/mvstats/dependencies.c
index 84b6561..0a08d12 100644
--- a/src/backend/utils/mvstats/dependencies.c
+++ b/src/backend/utils/mvstats/dependencies.c
@@ -636,3 +636,27 @@ pg_mv_stats_dependencies_show(PG_FUNCTION_ARGS)
PG_RETURN_TEXT_P(cstring_to_text(result));
}
+
+MVDependencies
+load_mv_dependencies(Oid mvoid)
+{
+ bool isnull = false;
+ Datum deps;
+
+ /* Fetch the pg_mv_statistic tuple for the requested statistics OID. */
+ HeapTuple htup = SearchSysCache1(MVSTATOID, ObjectIdGetDatum(mvoid));
+
+#ifdef USE_ASSERT_CHECKING
+ Form_pg_mv_statistic mvstat = (Form_pg_mv_statistic) GETSTRUCT(htup);
+ Assert(mvstat->deps_enabled && mvstat->deps_built);
+#endif
+
+ deps = SysCacheGetAttr(MVSTATOID, htup,
+ Anum_pg_mv_statistic_stadeps, &isnull);
+
+ Assert(!isnull);
+
+ ReleaseSysCache(htup);
+
+ return deserialize_mv_dependencies(DatumGetByteaP(deps));
+}
diff --git a/src/include/utils/mvstats.h b/src/include/utils/mvstats.h
index 7ebd961..cc43a79 100644
--- a/src/include/utils/mvstats.h
+++ b/src/include/utils/mvstats.h
@@ -17,12 +17,20 @@
#include "fmgr.h"
#include "commands/vacuum.h"
+/*
+ * Degree to which an MCV item / histogram bucket matches a clause.
+ * This is then considered when computing the selectivity.
+ */
+#define MVSTATS_MATCH_NONE 0 /* no match at all */
+#define MVSTATS_MATCH_PARTIAL 1 /* partial match */
+#define MVSTATS_MATCH_FULL 2 /* full match */
#define MVSTATS_MAX_DIMENSIONS 8 /* max number of attributes */
-/* An associative rule, tracking [a => b] dependency.
- *
- * TODO Make this work with multiple columns on both sides.
+
+/*
+ * Functional dependencies, tracking column-level relationships (values
+ * in one column determine values in another one).
*/
typedef struct MVDependencyData {
int16 a;
@@ -48,6 +56,8 @@ typedef MVDependenciesData* MVDependencies;
* stats specified using flags (or something like that).
*/
+MVDependencies load_mv_dependencies(Oid mvoid);
+
bytea * serialize_mv_dependencies(MVDependencies dependencies);
/* deserialization of stats (serialization is private to analyze) */
diff --git a/src/test/regress/expected/mv_dependencies.out b/src/test/regress/expected/mv_dependencies.out
new file mode 100644
index 0000000..e759997
--- /dev/null
+++ b/src/test/regress/expected/mv_dependencies.out
@@ -0,0 +1,172 @@
+-- data type passed by value
+CREATE TABLE functional_dependencies (
+ a INT,
+ b INT,
+ c INT
+);
+-- unknown column
+CREATE STATISTICS s1 ON functional_dependencies (unknown_column) WITH (dependencies);
+ERROR: column "unknown_column" referenced in statistics does not exist
+-- single column
+CREATE STATISTICS s1 ON functional_dependencies (a) WITH (dependencies);
+ERROR: multivariate stats require 2 or more columns
+-- single column, duplicated
+CREATE STATISTICS s1 ON functional_dependencies (a,a) WITH (dependencies);
+ERROR: duplicate column name in statistics definition
+-- two columns, one duplicated
+CREATE STATISTICS s1 ON functional_dependencies (a, a, b) WITH (dependencies);
+ERROR: duplicate column name in statistics definition
+-- unknown option
+CREATE STATISTICS s1 ON functional_dependencies (a, b, c) WITH (unknown_option);
+ERROR: unrecognized STATISTICS option "unknown_option"
+-- correct command
+CREATE STATISTICS s1 ON functional_dependencies (a, b, c) WITH (dependencies);
+-- random data (no functional dependencies)
+INSERT INTO functional_dependencies
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+ANALYZE functional_dependencies;
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+ deps_enabled | deps_built | pg_mv_stats_dependencies_show
+--------------+------------+-------------------------------
+ t | f |
+(1 row)
+
+TRUNCATE functional_dependencies;
+-- a => b, a => c, b => c
+INSERT INTO functional_dependencies
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE functional_dependencies;
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+ deps_enabled | deps_built | pg_mv_stats_dependencies_show
+--------------+------------+-------------------------------
+ t | t | 1 => 2, 1 => 3, 2 => 3
+(1 row)
+
+TRUNCATE functional_dependencies;
+-- a => b, a => c
+INSERT INTO functional_dependencies
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE functional_dependencies;
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+ deps_enabled | deps_built | pg_mv_stats_dependencies_show
+--------------+------------+-------------------------------
+ t | t | 1 => 2, 1 => 3
+(1 row)
+
+TRUNCATE functional_dependencies;
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO functional_dependencies
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX fdeps_idx ON functional_dependencies (a, b);
+ANALYZE functional_dependencies;
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+ deps_enabled | deps_built | pg_mv_stats_dependencies_show
+--------------+------------+-------------------------------
+ t | t | 1 => 2, 1 => 3, 2 => 3
+(1 row)
+
+EXPLAIN (COSTS off)
+ SELECT * FROM functional_dependencies WHERE a = 10 AND b = 5;
+ QUERY PLAN
+---------------------------------------------
+ Bitmap Heap Scan on functional_dependencies
+ Recheck Cond: ((a = 10) AND (b = 5))
+ -> Bitmap Index Scan on fdeps_idx
+ Index Cond: ((a = 10) AND (b = 5))
+(4 rows)
+
+DROP TABLE functional_dependencies;
+-- varlena type (text)
+CREATE TABLE functional_dependencies (
+ a TEXT,
+ b TEXT,
+ c TEXT
+);
+CREATE STATISTICS s2 ON functional_dependencies (a, b, c) WITH (dependencies);
+-- random data (no functional dependencies)
+INSERT INTO functional_dependencies
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+ANALYZE functional_dependencies;
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+ deps_enabled | deps_built | pg_mv_stats_dependencies_show
+--------------+------------+-------------------------------
+ t | f |
+(1 row)
+
+TRUNCATE functional_dependencies;
+-- a => b, a => c, b => c
+INSERT INTO functional_dependencies
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE functional_dependencies;
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+ deps_enabled | deps_built | pg_mv_stats_dependencies_show
+--------------+------------+-------------------------------
+ t | t | 1 => 2, 1 => 3, 2 => 3
+(1 row)
+
+TRUNCATE functional_dependencies;
+-- a => b, a => c
+INSERT INTO functional_dependencies
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE functional_dependencies;
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+ deps_enabled | deps_built | pg_mv_stats_dependencies_show
+--------------+------------+-------------------------------
+ t | t | 1 => 2, 1 => 3
+(1 row)
+
+TRUNCATE functional_dependencies;
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO functional_dependencies
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX fdeps_idx ON functional_dependencies (a, b);
+ANALYZE functional_dependencies;
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+ deps_enabled | deps_built | pg_mv_stats_dependencies_show
+--------------+------------+-------------------------------
+ t | t | 1 => 2, 1 => 3, 2 => 3
+(1 row)
+
+EXPLAIN (COSTS off)
+ SELECT * FROM functional_dependencies WHERE a = '10' AND b = '5';
+ QUERY PLAN
+------------------------------------------------------------
+ Bitmap Heap Scan on functional_dependencies
+ Recheck Cond: ((a = '10'::text) AND (b = '5'::text))
+ -> Bitmap Index Scan on fdeps_idx
+ Index Cond: ((a = '10'::text) AND (b = '5'::text))
+(4 rows)
+
+DROP TABLE functional_dependencies;
+-- NULL values (mix of int and text columns)
+CREATE TABLE functional_dependencies (
+ a INT,
+ b TEXT,
+ c INT,
+ d TEXT
+);
+CREATE STATISTICS s3 ON functional_dependencies (a, b, c, d) WITH (dependencies);
+INSERT INTO functional_dependencies
+ SELECT
+ mod(i, 100),
+ (CASE WHEN mod(i, 200) = 0 THEN NULL ELSE mod(i,200) END),
+ mod(i, 400),
+ (CASE WHEN mod(i, 300) = 0 THEN NULL ELSE mod(i,600) END)
+ FROM generate_series(1,10000) s(i);
+ANALYZE functional_dependencies;
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+ deps_enabled | deps_built | pg_mv_stats_dependencies_show
+--------------+------------+----------------------------------------
+ t | t | 2 => 1, 3 => 1, 3 => 2, 4 => 1, 4 => 2
+(1 row)
+
+DROP TABLE functional_dependencies;
diff --git a/src/test/regress/parallel_schedule b/src/test/regress/parallel_schedule
index b1bc7c7..81484f1 100644
--- a/src/test/regress/parallel_schedule
+++ b/src/test/regress/parallel_schedule
@@ -110,3 +110,6 @@ test: event_trigger
# run stats by itself because its delay may be insufficient under heavy load
test: stats
+
+# run tests of multivariate stats
+test: mv_dependencies
diff --git a/src/test/regress/serial_schedule b/src/test/regress/serial_schedule
index ade9ef1..14ea574 100644
--- a/src/test/regress/serial_schedule
+++ b/src/test/regress/serial_schedule
@@ -161,3 +161,4 @@ test: with
test: xml
test: event_trigger
test: stats
+test: mv_dependencies
diff --git a/src/test/regress/sql/mv_dependencies.sql b/src/test/regress/sql/mv_dependencies.sql
new file mode 100644
index 0000000..48dea4d
--- /dev/null
+++ b/src/test/regress/sql/mv_dependencies.sql
@@ -0,0 +1,150 @@
+-- data type passed by value
+CREATE TABLE functional_dependencies (
+ a INT,
+ b INT,
+ c INT
+);
+
+-- unknown column
+CREATE STATISTICS s1 ON functional_dependencies (unknown_column) WITH (dependencies);
+
+-- single column
+CREATE STATISTICS s1 ON functional_dependencies (a) WITH (dependencies);
+
+-- single column, duplicated
+CREATE STATISTICS s1 ON functional_dependencies (a,a) WITH (dependencies);
+
+-- two columns, one duplicated
+CREATE STATISTICS s1 ON functional_dependencies (a, a, b) WITH (dependencies);
+
+-- unknown option
+CREATE STATISTICS s1 ON functional_dependencies (a, b, c) WITH (unknown_option);
+
+-- correct command
+CREATE STATISTICS s1 ON functional_dependencies (a, b, c) WITH (dependencies);
+
+-- random data (no functional dependencies)
+INSERT INTO functional_dependencies
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+
+ANALYZE functional_dependencies;
+
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+
+TRUNCATE functional_dependencies;
+
+-- a => b, a => c, b => c
+INSERT INTO functional_dependencies
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+
+ANALYZE functional_dependencies;
+
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+
+TRUNCATE functional_dependencies;
+
+-- a => b, a => c
+INSERT INTO functional_dependencies
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE functional_dependencies;
+
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+
+TRUNCATE functional_dependencies;
+
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO functional_dependencies
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX fdeps_idx ON functional_dependencies (a, b);
+ANALYZE functional_dependencies;
+
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+
+EXPLAIN (COSTS off)
+ SELECT * FROM functional_dependencies WHERE a = 10 AND b = 5;
+
+DROP TABLE functional_dependencies;
+
+-- varlena type (text)
+CREATE TABLE functional_dependencies (
+ a TEXT,
+ b TEXT,
+ c TEXT
+);
+
+CREATE STATISTICS s2 ON functional_dependencies (a, b, c) WITH (dependencies);
+
+-- random data (no functional dependencies)
+INSERT INTO functional_dependencies
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+
+ANALYZE functional_dependencies;
+
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+
+TRUNCATE functional_dependencies;
+
+-- a => b, a => c, b => c
+INSERT INTO functional_dependencies
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+
+ANALYZE functional_dependencies;
+
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+
+TRUNCATE functional_dependencies;
+
+-- a => b, a => c
+INSERT INTO functional_dependencies
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE functional_dependencies;
+
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+
+TRUNCATE functional_dependencies;
+
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO functional_dependencies
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX fdeps_idx ON functional_dependencies (a, b);
+ANALYZE functional_dependencies;
+
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+
+EXPLAIN (COSTS off)
+ SELECT * FROM functional_dependencies WHERE a = '10' AND b = '5';
+
+DROP TABLE functional_dependencies;
+
+-- NULL values (mix of int and text columns)
+CREATE TABLE functional_dependencies (
+ a INT,
+ b TEXT,
+ c INT,
+ d TEXT
+);
+
+CREATE STATISTICS s3 ON functional_dependencies (a, b, c, d) WITH (dependencies);
+
+INSERT INTO functional_dependencies
+ SELECT
+ mod(i, 100),
+ (CASE WHEN mod(i, 200) = 0 THEN NULL ELSE mod(i,200) END),
+ mod(i, 400),
+ (CASE WHEN mod(i, 300) = 0 THEN NULL ELSE mod(i,600) END)
+ FROM generate_series(1,10000) s(i);
+
+ANALYZE functional_dependencies;
+
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+
+DROP TABLE functional_dependencies;
--
2.1.0
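To see the whole reduction outside the planner, here is a minimal
self-contained sketch (illustration only - plain C, no PostgreSQL
headers, and the NATTS / clauses / matrix names are made up for the
example). It builds the dependencies a => b and b => c, computes the
transitive closure the same way multiply_adjacency_matrix does, and
then reduces the clause list with the same skeleton as
fdeps_reduce_clauses:

#include <stdio.h>
#include <stdbool.h>

#define NATTS 3                 /* attributes a, b, c as indexes 0, 1, 2 */

int
main(void)
{
    bool matrix[NATTS][NATTS] = {{false}};
    bool reduced[NATTS] = {false};
    int clauses[] = {0, 1, 2};  /* one equality clause per attribute */
    int nclauses = 3;

    /* dependencies discovered by ANALYZE: a => b, b => c */
    matrix[0][1] = true;
    matrix[1][2] = true;

    /* in-place transitive closure, stopping when nothing changes */
    for (int i = 0; i < NATTS; i++)
    {
        int nchanges = 0;

        for (int k = 0; k < NATTS; k++)
            for (int l = 0; l < NATTS; l++)
            {
                if (matrix[k][l])
                    continue;
                for (int m = 0; m < NATTS; m++)
                    if (matrix[k][m] && matrix[m][l])
                    {
                        matrix[k][l] = true;
                        nchanges++;
                        break;
                    }
            }

        if (nchanges == 0)
            break;
    }

    /* drop clauses implied by a clause we keep */
    for (int i = 0; i < nclauses; i++)
    {
        if (reduced[i])
            continue;
        for (int j = 0; j < nclauses; j++)
        {
            if (i == j || reduced[j])
                continue;
            reduced[j] = matrix[clauses[i]][clauses[j]];
        }
    }

    for (int i = 0; i < nclauses; i++)
        if (!reduced[i])
            printf("kept clause on attribute %d\n", clauses[i]);
    return 0;
}

Compiled with any C99 compiler, this prints just "kept clause on
attribute 0" - the clauses on b and c are implied via a => b and the
transitive a => c.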
[Attachment: 0004-multivariate-MCV-lists.patch (application/x-patch)]
From 7ce09934eddfc08315b623fa498f9548f9150ec3 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@pgaddict.com>
Date: Mon, 6 Apr 2015 16:52:15 +0200
Subject: [PATCH 4/7] multivariate MCV lists
- extends the pg_mv_statistic catalog (add 'mcv' fields)
- building the MCV lists during ANALYZE
- simple estimation while planning the queries
Includes regression tests, mostly equal to regression tests for
functional dependencies.
---
doc/src/sgml/ref/create_statistics.sgml | 18 +
src/backend/catalog/system_views.sql | 4 +-
src/backend/commands/statscmds.c | 45 +-
src/backend/nodes/outfuncs.c | 2 +
src/backend/optimizer/path/clausesel.c | 1079 +++++++++++++++++++++++++--
src/backend/optimizer/util/plancat.c | 4 +-
src/backend/utils/mvstats/Makefile | 2 +-
src/backend/utils/mvstats/common.c | 104 ++-
src/backend/utils/mvstats/common.h | 11 +-
src/backend/utils/mvstats/mcv.c | 1237 +++++++++++++++++++++++++++++++
src/bin/psql/describe.c | 25 +-
src/include/catalog/pg_mv_statistic.h | 18 +-
src/include/catalog/pg_proc.h | 4 +
src/include/nodes/relation.h | 2 +
src/include/utils/mvstats.h | 69 +-
src/test/regress/expected/mv_mcv.out | 207 ++++++
src/test/regress/expected/rules.out | 4 +-
src/test/regress/parallel_schedule | 2 +-
src/test/regress/serial_schedule | 1 +
src/test/regress/sql/mv_mcv.sql | 178 +++++
20 files changed, 2915 insertions(+), 101 deletions(-)
create mode 100644 src/backend/utils/mvstats/mcv.c
create mode 100644 src/test/regress/expected/mv_mcv.out
create mode 100644 src/test/regress/sql/mv_mcv.sql
diff --git a/doc/src/sgml/ref/create_statistics.sgml b/doc/src/sgml/ref/create_statistics.sgml
index a86eae3..193e4b0 100644
--- a/doc/src/sgml/ref/create_statistics.sgml
+++ b/doc/src/sgml/ref/create_statistics.sgml
@@ -132,6 +132,24 @@ CREATE STATISTICS [ IF NOT EXISTS ] <replaceable class="PARAMETER">statistics_na
</listitem>
</varlistentry>
+ <varlistentry>
+ <term><literal>max_mcv_items</> (<type>integer</>)</term>
+ <listitem>
+ <para>
+ Maximum number of MCV list items.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><literal>mcv</> (<type>boolean</>)</term>
+ <listitem>
+ <para>
+ Enables MCV list for the statistics.
+ </para>
+ </listitem>
+ </varlistentry>
+
</variablelist>
</refsect2>
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 2423985..5488061 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -165,7 +165,9 @@ CREATE VIEW pg_mv_stats AS
S.staname AS staname,
S.stakeys AS attnums,
length(S.stadeps) as depsbytes,
- pg_mv_stats_dependencies_info(S.stadeps) as depsinfo
+ pg_mv_stats_dependencies_info(S.stadeps) as depsinfo,
+ length(S.stamcv) AS mcvbytes,
+ pg_mv_stats_mcvlist_info(S.stamcv) AS mcvinfo
FROM (pg_mv_statistic S JOIN pg_class C ON (C.oid = S.starelid))
LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace);
diff --git a/src/backend/commands/statscmds.c b/src/backend/commands/statscmds.c
index 84a8b13..90bfaed 100644
--- a/src/backend/commands/statscmds.c
+++ b/src/backend/commands/statscmds.c
@@ -136,7 +136,13 @@ CreateStatistics(CreateStatsStmt *stmt)
ObjectAddress parentobject, childobject;
/* by default build nothing */
- bool build_dependencies = false;
+ bool build_dependencies = false,
+ build_mcv = false;
+
+ int32 max_mcv_items = -1;
+
+ /* options required because of other options */
+ bool require_mcv = false;
Assert(IsA(stmt, CreateStatsStmt));
@@ -212,6 +218,29 @@ CreateStatistics(CreateStatsStmt *stmt)
if (strcmp(opt->defname, "dependencies") == 0)
build_dependencies = defGetBoolean(opt);
+ else if (strcmp(opt->defname, "mcv") == 0)
+ build_mcv = defGetBoolean(opt);
+ else if (strcmp(opt->defname, "max_mcv_items") == 0)
+ {
+ max_mcv_items = defGetInt32(opt);
+
+ /* this option requires 'mcv' to be enabled */
+ require_mcv = true;
+
+ /* sanity check */
+ if (max_mcv_items < MVSTAT_MCVLIST_MIN_ITEMS)
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("max number of MCV items must be at least %d",
+ MVSTAT_MCVLIST_MIN_ITEMS)));
+
+ else if (max_mcv_items > MVSTAT_MCVLIST_MAX_ITEMS)
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("max number of MCV items is %d",
+ MVSTAT_MCVLIST_MAX_ITEMS)));
+
+ }
else
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
@@ -220,10 +249,16 @@ CreateStatistics(CreateStatsStmt *stmt)
}
/* check that at least some statistics were requested */
- if (! build_dependencies)
+ if (! (build_dependencies || build_mcv))
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("no statistics type (dependencies, mcv) was requested")));
+
+ /* now do some checking of the options */
+ if (require_mcv && (! build_mcv))
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
- errmsg("no statistics type (dependencies) was requested")));
+ errmsg("option 'mcv' is required by other options(s)")));
/* sort the attnums and build int2vector */
qsort(attnums, numcols, sizeof(int16), compare_int16);
@@ -243,8 +278,12 @@ CreateStatistics(CreateStatsStmt *stmt)
values[Anum_pg_mv_statistic_stakeys -1] = PointerGetDatum(stakeys);
values[Anum_pg_mv_statistic_deps_enabled -1] = BoolGetDatum(build_dependencies);
+ values[Anum_pg_mv_statistic_mcv_enabled -1] = BoolGetDatum(build_mcv);
+
+ values[Anum_pg_mv_statistic_mcv_max_items -1] = Int32GetDatum(max_mcv_items);
nulls[Anum_pg_mv_statistic_stadeps -1] = true;
+ nulls[Anum_pg_mv_statistic_stamcv -1] = true;
/* insert the tuple into pg_mv_statistic */
mvstatrel = heap_open(MvStatisticRelationId, RowExclusiveLock);
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index 5ecc9ef..9e029ef 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -1948,9 +1948,11 @@ _outMVStatisticInfo(StringInfo str, const MVStatisticInfo *node)
/* enabled statistics */
WRITE_BOOL_FIELD(deps_enabled);
+ WRITE_BOOL_FIELD(mcv_enabled);
/* built/available statistics */
WRITE_BOOL_FIELD(deps_built);
+ WRITE_BOOL_FIELD(mcv_built);
}
static void
diff --git a/src/backend/optimizer/path/clausesel.c b/src/backend/optimizer/path/clausesel.c
index e834722..d194551 100644
--- a/src/backend/optimizer/path/clausesel.c
+++ b/src/backend/optimizer/path/clausesel.c
@@ -15,6 +15,7 @@
#include "postgres.h"
#include "access/sysattr.h"
+#include "catalog/pg_collation.h"
#include "catalog/pg_operator.h"
#include "nodes/makefuncs.h"
#include "optimizer/clauses.h"
@@ -47,17 +48,38 @@ static void addRangeClause(RangeQueryClause **rqlist, Node *clause,
bool varonleft, bool isLTsel, Selectivity s2);
#define MV_CLAUSE_TYPE_FDEP 0x01
+#define MV_CLAUSE_TYPE_MCV 0x02
static bool clause_is_mv_compatible(PlannerInfo *root, Node *clause, Oid varRelid,
- Index *relid, AttrNumber *attnum, SpecialJoinInfo *sjinfo);
+ Index *relid, Bitmapset **attnums, SpecialJoinInfo *sjinfo,
+ int type);
static Bitmapset *collect_mv_attnums(PlannerInfo *root, List *clauses,
- Oid varRelid, Index *relid, SpecialJoinInfo *sjinfo);
+ Oid varRelid, Index *relid, SpecialJoinInfo *sjinfo,
+ int type);
static List *clauselist_apply_dependencies(PlannerInfo *root, List *clauses,
Oid varRelid, List *stats,
SpecialJoinInfo *sjinfo);
+static MVStatisticInfo *choose_mv_statistics(List *mvstats, Bitmapset *attnums);
+
+static List *clauselist_mv_split(PlannerInfo *root, SpecialJoinInfo *sjinfo,
+ List *clauses, Oid varRelid,
+ List **mvclauses, MVStatisticInfo *mvstats, int types);
+
+static Selectivity clauselist_mv_selectivity(PlannerInfo *root,
+ List *clauses, MVStatisticInfo *mvstats);
+static Selectivity clauselist_mv_selectivity_mcvlist(PlannerInfo *root,
+ List *clauses, MVStatisticInfo *mvstats,
+ bool *fullmatch, Selectivity *lowsel);
+
+static int update_match_bitmap_mcvlist(PlannerInfo *root, List *clauses,
+ int2vector *stakeys, MCVList mcvlist,
+ int nmatches, char * matches,
+ Selectivity *lowsel, bool *fullmatch,
+ bool is_or);
+
static bool has_stats(List *stats, int type);
static List * find_stats(PlannerInfo *root, List *clauses,
@@ -85,6 +107,13 @@ static Bitmapset *fdeps_filter_clauses(PlannerInfo *root,
static Bitmapset * get_varattnos(Node * node, Index relid);
+/* used for merging bitmaps - AND (min), OR (max) */
+#define MAX(x, y) (((x) > (y)) ? (x) : (y))
+#define MIN(x, y) (((x) < (y)) ? (x) : (y))
+
+#define UPDATE_RESULT(m,r,isor) \
+ (m) = (isor) ? (MAX(m,r)) : (MIN(m,r))
+
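+/*
+ * Illustration (not part of the patch): merging two match degrees
+ * (see MVSTATS_MATCH_* in mvstats.h: NONE = 0 < PARTIAL = 1 < FULL = 2)
+ * with m = MVSTATS_MATCH_FULL and r = MVSTATS_MATCH_PARTIAL:
+ *
+ *   AND: UPDATE_RESULT(m, r, false) sets m = MVSTATS_MATCH_PARTIAL (MIN)
+ *   OR:  UPDATE_RESULT(m, r, true)  sets m = MVSTATS_MATCH_FULL    (MAX)
+ *
+ * i.e. a tuple matches an AND-list only as well as its worst clause,
+ * and an OR-list as well as its best clause.
+ */
+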
/****************************************************************************
* ROUTINES TO COMPUTE SELECTIVITIES
****************************************************************************/
@@ -250,8 +279,12 @@ clauselist_selectivity(PlannerInfo *root,
*/
if (has_stats(stats, MV_CLAUSE_TYPE_FDEP))
{
- /* collect attributes referenced by mv-compatible clauses */
- mvattnums = collect_mv_attnums(root, clauses, varRelid, &relid, sjinfo);
+ /*
+ * Collect attributes referenced by mv-compatible clauses (looking
+ * for clauses compatible with functional dependencies for now).
+ */
+ mvattnums = collect_mv_attnums(root, clauses, varRelid, &relid, sjinfo,
+ MV_CLAUSE_TYPE_FDEP);
/*
* If there are mv-compatible clauses, referencing at least two
@@ -268,6 +301,48 @@ clauselist_selectivity(PlannerInfo *root,
}
/*
+ * Check that there are statistics with MCV list. If not, we don't
+ * need to waste time with the optimization.
+ */
+ if (has_stats(stats, MV_CLAUSE_TYPE_MCV))
+ {
+ /*
+ * Recollect attributes from mv-compatible clauses (maybe we've
+ * removed so many clauses that a single mv-compatible attnum remains).
+ * From now on we're only interested in MCV-compatible clauses.
+ */
+ mvattnums = collect_mv_attnums(root, clauses, varRelid, &relid, sjinfo,
+ MV_CLAUSE_TYPE_MCV);
+
+ /*
+ * If there still are at least two columns, we'll try to select
+ * suitable multivariate statistics.
+ */
+ if (bms_num_members(mvattnums) >= 2)
+ {
+ /* see choose_mv_statistics() for details */
+ MVStatisticInfo *mvstat = choose_mv_statistics(stats, mvattnums);
+
+ if (mvstat != NULL) /* we have a matching stats */
+ {
+ /* clauses compatible with multi-variate stats */
+ List *mvclauses = NIL;
+
+ /* split the clauselist into regular and mv-clauses */
+ clauses = clauselist_mv_split(root, sjinfo, clauses,
+ varRelid, &mvclauses, mvstat,
+ MV_CLAUSE_TYPE_MCV);
+
+ /* we've chosen the stats to match the clauses */
+ Assert(mvclauses != NIL);
+
+ /* compute the multivariate stats */
+ s1 *= clauselist_mv_selectivity(root, mvclauses, mvstat);
+ }
+ }
+ }
+
+ /*
* Initial scan over clauses. Anything that doesn't look like a potential
* rangequery clause gets multiplied into s1 and forgotten. Anything that
* does gets inserted into an rqlist entry.
@@ -924,12 +999,129 @@ clause_selectivity(PlannerInfo *root,
return s1;
}
+
+/*
+ * Estimate selectivity for the list of MV-compatible clauses, using
+ * multivariate statistics (combining a histogram and MCV list).
+ *
+ * This simply passes the estimation to the MCV list and then to the
+ * histogram, if available.
+ *
+ * TODO Clamp the selectivity by the minimum of the per-clause
+ * selectivities (i.e. the selectivity of the most restrictive
+ * clause), because that's the maximum we can ever get from an
+ * ANDed list of clauses. This should help prevent issues with
+ * hitting too many buckets and low-precision histograms.
+ *
+ * TODO We may support some additional conditions, most importantly
+ * those matching multiple columns (e.g. "a = b" or "a < b").
+ * Ultimately we could track multi-table histograms for join
+ * cardinality estimation.
+ *
+ * TODO Further thoughts on processing equality clauses: Maybe it'd be
+ * better to look for stats (with MCV) covered by the equality
+ * clauses, because then we have a chance to find an exact match
+ * in the MCV list, which is pretty much the best we can do. We may
+ * also look at the least frequent MCV item, and use it as an upper
+ * boundary for the selectivity (had there been a more frequent
+ * item, it'd be in the MCV list).
+ *
+ * TODO There are several options for 'sanity clamping' the estimates.
+ *
+ * First, if we have selectivities for each condition, then
+ *
+ * P(A,B) <= MIN(P(A), P(B))
+ *
+ * Because additional conditions (connected by AND) can only lower
+ * the probability.
+ *
+ * So we can do some basic sanity checks using the single-variate
+ * stats (the ones we have right now).
+ *
+ * Second, when we have multivariate stats with a MCV list, then
+ *
+ * (a) if we have a full equality condition (one equality condition
+ * on each column) and we found a match in the MCV list, this is
+ * the selectivity (and it's supposed to be exact)
+ *
+ * (b) if we have a full equality condition and we haven't found a
+ * match in the MCV list, then the selectivity is below the
+ * lowest selectivity in the MCV list
+ *
+ * (c) if we have an equality condition (not full), we can still
+ * search the MCV for matches and use the sum of probabilities
+ * as a lower boundary for the histogram (if there are no
+ * matches in the MCV list, then we have no boundary)
+ *
+ * Third, if there are multiple (combinations of) multivariate
+ * stats for a set of clauses, we may compute all of them and then
+ * somehow aggregate them - e.g. by choosing the minimum, median or
+ * average. The stats are susceptible to overestimation (because
+ * we take 50% of the bucket for partial matches). Some stats may
+ * give better estimates than others, but it's very difficult to
+ * say in advance which one is the best (it depends on the
+ * number of buckets, number of additional columns not referenced
+ * in the clauses, type of condition etc.).
+ *
+ * So we may compute them all and then choose a sane aggregation
+ * (minimum seems like a good approach). Of course, this may result
+ * in longer / more expensive estimation (CPU-wise), but it may be
+ * worth it.
+ *
+ * It's possible to add a GUC choosing between a 'simple' estimation
+ * (using the single statistics expected to give the best estimate)
+ * and a 'full' one (combining the multiple estimates).
+ *
+ * multivariate_estimates = (simple|full)
+ *
+ * Also, this might be enabled at a table level, by something like
+ *
+ * ALTER TABLE ... SET STATISTICS (simple|full)
+ *
+ * Which would make it possible to use this only for the tables
+ * where the simple approach does not work.
+ *
+ * Also, there are ways to optimize this algorithmically. E.g. we
+ * may try to get an estimate from a matching MCV list first, and
+ * if we happen to get a "full equality match" we may stop computing
+ * the estimates from other stats (for this condition) because
+ * that's probably the best estimate we can really get.
+ *
+ * TODO When applying the clauses to the histogram/MCV list, we can do
+ * that from the most selective clauses first, because that'll
+ * eliminate the buckets/items sooner (so we'll be able to skip
+ * them without the more expensive inspection). But this requires
+ * knowing the per-clause selectivities in advance, and that's not
+ * what we do now.
+ */
+static Selectivity
+clauselist_mv_selectivity(PlannerInfo *root, List *clauses, MVStatisticInfo *mvstats)
+{
+ bool fullmatch = false;
+
+ /*
+ * Lowest frequency in the MCV list (may be used as an upper bound
+ * for full equality conditions that did not match any MCV item).
+ */
+ Selectivity mcv_low = 0.0;
+
+ /* TODO Evaluate simple 1D selectivities, use the smallest one as
+ * an upper bound, product as lower bound, and sort the
+ * clauses in ascending order by selectivity (to optimize the
+ * MCV/histogram evaluation).
+ */
+
+ /* Evaluate the MCV selectivity */
+ return clauselist_mv_selectivity_mcvlist(root, clauses, mvstats,
+ &fullmatch, &mcv_low);
+}
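+
+/*
+ * A minimal sketch of the clamping TODO above (not implemented here):
+ * assuming 'sel1d' held the per-clause single-column selectivities, the
+ * MV estimate could be clamped by the most restrictive clause like this:
+ *
+ *   Selectivity upper = 1.0;
+ *
+ *   for (i = 0; i < nclauses; i++)
+ *       upper = Min(upper, sel1d[i]);
+ *
+ *   return Min(mv_sel, upper);
+ */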
+
/*
* Collect attributes from mv-compatible clauses.
*/
static Bitmapset *
collect_mv_attnums(PlannerInfo *root, List *clauses, Oid varRelid,
- Index *relid, SpecialJoinInfo *sjinfo)
+ Index *relid, SpecialJoinInfo *sjinfo, int types)
{
Bitmapset *attnums = NULL;
ListCell *l;
@@ -945,12 +1137,11 @@ collect_mv_attnums(PlannerInfo *root, List *clauses, Oid varRelid,
*/
foreach (l, clauses)
{
- AttrNumber attnum;
Node *clause = (Node *) lfirst(l);
- /* ignore the result for now - we only need the info */
- if (clause_is_mv_compatible(root, clause, varRelid, relid, &attnum, sjinfo))
- attnums = bms_add_member(attnums, attnum);
+ /* ignore the result here - we only need the attnums */
+ clause_is_mv_compatible(root, clause, varRelid, relid, &attnums,
+ sjinfo, types);
}
/*
@@ -969,6 +1160,188 @@ collect_mv_attnums(PlannerInfo *root, List *clauses, Oid varRelid,
}
/*
+ * We're looking for statistics matching at least 2 attributes,
+ * referenced in the clauses compatible with multivariate statistics.
+ * The current selection criterion is very simple - we choose the
+ * statistics referencing the most attributes.
+ *
+ * If there are multiple statistics referencing the same number of
+ * columns (from the clauses), the one with fewer source columns
+ * (as listed in the ADD STATISTICS when creating the statistics) wins.
+ * Otherwise the first one wins.
+ *
+ * This is a very simple criterion, and it has several weaknesses:
+ *
+ * (a) does not consider the accuracy of the statistics
+ *
+ * If there are two histograms built on the same set of columns,
+ * but one has 100 buckets and the other one has 1000 buckets (thus
+ * likely providing better estimates), this is not currently
+ * considered.
+ *
+ * (b) does not consider the type of statistics
+ *
+ * If there are three statistics - one containing just a MCV list,
+ * another one with just a histogram and a third one with both,
+ * this is not considered.
+ *
+ * (c) does not consider the number of clauses
+ *
+ * As explained, only the number of referenced attributes counts,
+ * so if there are multiple clauses on a single attribute, this
+ * still counts as a single attribute.
+ *
+ * (d) does not consider type of condition
+ *
+ * Some clauses may work better with some statistics - for example
+ * equality clauses probably work better with MCV lists than with
+ * histograms. But IS [NOT] NULL conditions may often work better
+ * with histograms (thanks to NULL-buckets).
+ *
+ * So for example with five WHERE conditions
+ *
+ * WHERE (a = 1) AND (b = 1) AND (c = 1) AND (d = 1) AND (e = 1)
+ *
+ * and statistics on (a,b), (a,b,e) and (a,b,c,d), the last one will be
+ * selected as it references the most columns.
+ *
+ * Once we have selected the multivariate statistics, we split the list
+ * of clauses into two parts - conditions that are compatible with the
+ * selected stats, and conditions that will be estimated using simple
+ * statistics.
+ *
+ * From the example above, conditions
+ *
+ * (a = 1) AND (b = 1) AND (c = 1) AND (d = 1)
+ *
+ * will be estimated using the multivariate statistics (a,b,c,d) while
+ * the last condition (e = 1) will get estimated using the regular ones.
+ *
+ * There are various alternative selection criteria (e.g. counting
+ * conditions instead of just referenced attributes), but eventually
+ * the best option should be to combine multiple statistics. But that's
+ * much harder to do correctly.
+ *
+ * TODO Select multiple statistics and combine them when computing
+ * the estimate.
+ *
+ * TODO This will probably have to consider compatibility of clauses,
+ * because 'dependencies' will probably work only with equality
+ * clauses.
+ */
+static MVStatisticInfo *
+choose_mv_statistics(List *stats, Bitmapset *attnums)
+{
+ int i;
+ ListCell *lc;
+
+ MVStatisticInfo *choice = NULL;
+
+ int current_matches = 1; /* goal #1: maximize */
+ int current_dims = (MVSTATS_MAX_DIMENSIONS+1); /* goal #2: minimize */
+
+ /*
+ * Walk through the list of statistics, and for each one count the
+ * referenced attributes (encoded in the 'attnums' bitmap).
+ */
+ foreach (lc, stats)
+ {
+ MVStatisticInfo *info = (MVStatisticInfo *)lfirst(lc);
+
+ /* columns matching this statistics */
+ int matches = 0;
+
+ int2vector * attrs = info->stakeys;
+ int numattrs = attrs->dim1;
+
+ /* skip dependencies-only stats */
+ if (! info->mcv_built)
+ continue;
+
+ /* count columns covered by this statistics */
+ for (i = 0; i < numattrs; i++)
+ if (bms_is_member(attrs->values[i], attnums))
+ matches++;
+
+ /*
+ * Use this statistics when it improves the number of matches or
+ * when it matches the same number of attributes but is smaller.
+ */
+ if ((matches > current_matches) ||
+ ((matches == current_matches) && (current_dims > numattrs)))
+ {
+ choice = info;
+ current_matches = matches;
+ current_dims = numattrs;
+ }
+ }
+
+ return choice;
+}
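+
+/*
+ * Example (taken from the comment above): with compatible clauses on
+ * columns {a, b, c, d, e} and statistics on (a,b), (a,b,e) and (a,b,c,d),
+ * the loop sees matches = 2, 3 and 4 respectively, so (a,b,c,d) wins.
+ */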
+
+
+/*
+ * This splits the clauses list into two parts - one containing clauses
+ * that will be evaluated using the chosen statistics, and the remaining
+ * clauses (either non-mvcompatible, or not covered by the chosen stats).
+ */
+static List *
+clauselist_mv_split(PlannerInfo *root, SpecialJoinInfo *sjinfo,
+ List *clauses, Oid varRelid, List **mvclauses,
+ MVStatisticInfo *mvstats, int types)
+{
+ int i;
+ ListCell *l;
+ List *non_mvclauses = NIL;
+
+ /* FIXME is there a better way to get info on int2vector? */
+ int2vector * attrs = mvstats->stakeys;
+ int numattrs = mvstats->stakeys->dim1;
+
+ Bitmapset *mvattnums = NULL;
+
+ /* build bitmap of attributes covered by the stats, so we can
+ * do bms_is_subset later */
+ for (i = 0; i < numattrs; i++)
+ mvattnums = bms_add_member(mvattnums, attrs->values[i]);
+
+ /* erase the list of mv-compatible clauses */
+ *mvclauses = NIL;
+
+ foreach (l, clauses)
+ {
+ bool match = false; /* by default not mv-compatible */
+ Bitmapset *attnums = NULL;
+ Node *clause = (Node *) lfirst(l);
+
+ if (clause_is_mv_compatible(root, clause, varRelid, NULL,
+ &attnums, sjinfo, types))
+ {
+ /* are all the attributes part of the selected stats? */
+ if (bms_is_subset(attnums, mvattnums))
+ match = true;
+ }
+
+ /*
+ * The clause matches the selected stats, so put it into the list
+ * of mv-compatible clauses. Otherwise, keep it in the list of
+ * 'regular' clauses (to be estimated the usual way later).
+ */
+ if (match)
+ *mvclauses = lappend(*mvclauses, clause);
+ else
+ non_mvclauses = lappend(non_mvclauses, clause);
+ }
+
+ /*
+ * Return the remaining clauses - those will be estimated the
+ * regular way (they're incompatible with the chosen stats, or
+ * not covered by them).
+ */
+ return non_mvclauses;
+
+}
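+
+/*
+ * Example (illustrative, reusing the example above): with statistics on
+ * (a,b,c,d) and WHERE (a = 1) AND (b = 1) AND (c = 1) AND (d = 1) AND (e = 1),
+ * the first four clauses end up in *mvclauses, while (e = 1) is returned
+ * to be estimated using per-column statistics.
+ */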
+
+/*
* Determines whether the clause is compatible with multivariate stats,
* and if it is, returns some additional information - varno (index
* into simple_rte_array) and a bitmap of attributes. This is then
@@ -987,8 +1360,12 @@ collect_mv_attnums(PlannerInfo *root, List *clauses, Oid varRelid,
*/
static bool
clause_is_mv_compatible(PlannerInfo *root, Node *clause, Oid varRelid,
- Index *relid, AttrNumber *attnum, SpecialJoinInfo *sjinfo)
+ Index *relid, Bitmapset **attnums, SpecialJoinInfo *sjinfo,
+ int types)
{
+ Relids clause_relids;
+ Relids left_relids;
+ Relids right_relids;
if (IsA(clause, RestrictInfo))
{
@@ -998,82 +1375,176 @@ clause_is_mv_compatible(PlannerInfo *root, Node *clause, Oid varRelid,
if (rinfo->pseudoconstant)
return false;
- /* no support for OR clauses at this point */
- if (rinfo->orclause)
- return false;
-
/* get the actual clause from the RestrictInfo (it's not an OR clause) */
clause = (Node*)rinfo->clause;
- /* only simple opclauses are compatible with multivariate stats */
- if (! is_opclause(clause))
- return false;
-
/* we don't support join conditions at this moment */
if (treat_as_join_clause(clause, rinfo, varRelid, sjinfo))
return false;
+ clause_relids = rinfo->clause_relids;
+ left_relids = rinfo->left_relids;
+ right_relids = rinfo->right_relids;
+ }
+ else if (is_opclause(clause) && list_length(((OpExpr *) clause)->args) == 2)
+ {
+ left_relids = pull_varnos(get_leftop((Expr*)clause));
+ right_relids = pull_varnos(get_rightop((Expr*)clause));
+
+ clause_relids = bms_union(left_relids,
+ right_relids);
+ }
+ else
+ {
+ /* Not a binary opclause, so mark left/right relid sets as empty */
+ left_relids = NULL;
+ right_relids = NULL;
+ /* and get the total relid set the hard way */
+ clause_relids = pull_varnos((Node *) clause);
+ }
+
+ /*
+ * Only simple opclauses and IS NULL tests are compatible with
+ * multivariate stats at this point.
+ */
+ if ((is_opclause(clause))
+ && (list_length(((OpExpr *) clause)->args) == 2))
+ {
+ OpExpr *expr = (OpExpr *) clause;
+ bool varonleft = true;
+ bool ok;
+
/* is it 'variable op constant' ? */
- if (list_length(((OpExpr *) clause)->args) == 2)
+
+ ok = (bms_membership(clause_relids) == BMS_SINGLETON) &&
+ (is_pseudo_constant_clause_relids(lsecond(expr->args),
+ right_relids) ||
+ (varonleft = false,
+ is_pseudo_constant_clause_relids(linitial(expr->args),
+ left_relids)));
+
+ if (ok)
{
- OpExpr *expr = (OpExpr *) clause;
- bool varonleft = true;
- bool ok;
+ Var * var = (varonleft) ? linitial(expr->args) : lsecond(expr->args);
- ok = (bms_membership(rinfo->clause_relids) == BMS_SINGLETON) &&
- (is_pseudo_constant_clause_relids(lsecond(expr->args),
- rinfo->right_relids) ||
- (varonleft = false,
- is_pseudo_constant_clause_relids(linitial(expr->args),
- rinfo->left_relids)));
+ /*
+ * Simple variables only - otherwise the planner_rt_fetch seems to fail
+ * (return NULL).
+ *
+ * TODO Maybe using examine_variable() would fix that?
+ */
+ if (! (IsA(var, Var) && (varRelid == 0 || varRelid == var->varno)))
+ return false;
- if (ok)
- {
- Var * var = (varonleft) ? linitial(expr->args) : lsecond(expr->args);
+ /*
+ * Only consider this variable if (varRelid == 0) or when the varno
+ * matches varRelid (see explanation at clause_selectivity).
+ *
+ * FIXME I suspect this may not be really necessary. The (varRelid == 0)
+ * part seems to be enforced by treat_as_join_clause().
+ */
+ if (! ((varRelid == 0) || (varRelid == var->varno)))
+ return false;
- /*
- * Simple variables only - otherwise the planner_rt_fetch seems to fail
- * (return NULL).
- *
- * TODO Maybe use examine_variable() would fix that?
- */
- if (! (IsA(var, Var) && (varRelid == 0 || varRelid == var->varno)))
- return false;
+ /* Also skip special varno values, and system attributes ... */
+ if ((IS_SPECIAL_VARNO(var->varno)) || (! AttrNumberIsForUserDefinedAttr(var->varattno)))
+ return false;
- /*
- * Only consider this variable if (varRelid == 0) or when the varno
- * matches varRelid (see explanation at clause_selectivity).
- *
- * FIXME I suspect this may not be really necessary. The (varRelid == 0)
- * part seems to be enforced by treat_as_join_clause().
- */
- if (! ((varRelid == 0) || (varRelid == var->varno)))
- return false;
+ /* Lookup info about the base relation (we need to pass the relid out) */
+ if (relid != NULL)
+ *relid = var->varno;
+
+ /*
+ * If it's not a "<" or ">" or "=" operator, just ignore the
+ * clause. Otherwise note the relid and attnum for the variable.
+ * This uses the function for estimating selectivity, not the
+ * operator directly (a bit awkward, but well ...).
+ */
+ switch (get_oprrest(expr->opno))
+ {
+ case F_SCALARLTSEL:
+ case F_SCALARGTSEL:
+ /* not compatible with functional dependencies */
+ if (types & MV_CLAUSE_TYPE_MCV)
+ {
+ *attnums = bms_add_member(*attnums, var->varattno);
+ return true;
+ }
+ return false;
+
+ case F_EQSEL:
+ *attnums = bms_add_member(*attnums, var->varattno);
+ return true;
+ }
+ }
+ }
+ else if (IsA(clause, NullTest)
+ && IsA(((NullTest*)clause)->arg, Var))
+ {
+ Var * var = (Var*)((NullTest*)clause)->arg;
+
+ /*
+ * Simple variables only - otherwise the planner_rt_fetch seems to fail
+ * (return NULL).
+ *
+ * TODO Maybe using examine_variable() would fix that?
+ */
+ if (! (IsA(var, Var) && (varRelid == 0 || varRelid == var->varno)))
+ return false;
+
+ /*
+ * Only consider this variable if (varRelid == 0) or when the varno
+ * matches varRelid (see explanation at clause_selectivity).
+ *
+ * FIXME I suspect this may not be really necessary. The (varRelid == 0)
+ * part seems to be enforced by treat_as_join_clause().
+ */
+ if (! ((varRelid == 0) || (varRelid == var->varno)))
+ return false;
- /* Also skip special varno values, and system attributes ... */
- if ((IS_SPECIAL_VARNO(var->varno)) || (! AttrNumberIsForUserDefinedAttr(var->varattno)))
- return false;
+ /* Also skip special varno values, and system attributes ... */
+ if ((IS_SPECIAL_VARNO(var->varno)) || (! AttrNumberIsForUserDefinedAttr(var->varattno)))
+ return false;
+ /* Lookup info about the base relation (we need to pass the relid out) */
+ if (relid != NULL)
*relid = var->varno;
- /*
- * If it's not a "<" or ">" or "=" operator, just ignore the
- * clause. Otherwise note the relid and attnum for the variable.
- * This uses the function for estimating selectivity, ont the
- * operator directly (a bit awkward, but well ...).
- */
- switch (get_oprrest(expr->opno))
- {
- case F_EQSEL:
- *attnum = var->varattno;
- return true;
- }
- }
+ *attnums = bms_add_member(*attnums, var->varattno);
+
+ return true;
+ }
+ else if (or_clause(clause) || and_clause(clause))
+ {
+ /*
+ * AND/OR-clauses are supported if all sub-clauses are supported
+ *
+ * TODO We might support the mixed case, where some of the clauses
+ * are supported and some are not - treat all supported
+ * subclauses as a single clause, compute its selectivity
+ * using mv stats, and compute the total selectivity using
+ * the current algorithm.
+ *
+ * TODO For RestrictInfo above an OR-clause, we might use the
+ * orclause with nested RestrictInfo - we won't have to
+ * call pull_varnos() for each clause, saving time.
+ */
+ Bitmapset *tmp = NULL;
+ ListCell *l;
+ foreach (l, ((BoolExpr*)clause)->args)
+ {
+ if (! clause_is_mv_compatible(root, (Node*)lfirst(l),
+ varRelid, relid, &tmp, sjinfo, types))
+ return false;
}
+
+ /* add the attnums from the AND/OR-clause to the set of attnums */
+ *attnums = bms_join(*attnums, tmp);
+
+ return true;
}
return false;
-
}
/*
@@ -1322,6 +1793,9 @@ has_stats(List *stats, int type)
if ((type & MV_CLAUSE_TYPE_FDEP) && stat->deps_built)
return true;
+
+ if ((type & MV_CLAUSE_TYPE_MCV) && stat->mcv_built)
+ return true;
}
return false;
@@ -1617,25 +2091,39 @@ fdeps_filter_clauses(PlannerInfo *root,
foreach (lc, clauses)
{
- AttrNumber attnum;
+ Bitmapset *attnums = NULL;
Node *clause = (Node *) lfirst(lc);
- if (! clause_is_mv_compatible(root, clause, varRelid, relid,
- &attnum, sjinfo))
+ if (! clause_is_mv_compatible(root, clause, varRelid, relid, &attnums,
+ sjinfo, MV_CLAUSE_TYPE_FDEP))
/* clause incompatible with functional dependencies */
*reduced_clauses = lappend(*reduced_clauses, clause);
- else if (! bms_is_member(attnum, deps_attnums))
+ else if (bms_num_members(attnums) > 1)
+
+ /*
+ * Clause referencing multiple attributes (strange - shouldn't
+ * this be handled by clause_is_mv_compatible directly?).
+ */
+ *reduced_clauses = lappend(*reduced_clauses, clause);
+
+ else if (! bms_is_member(bms_singleton_member(attnums), deps_attnums))
/* clause not covered by the dependencies */
*reduced_clauses = lappend(*reduced_clauses, clause);
else
{
+ /* ok, clause compatible with existing dependencies */
+ Assert(bms_num_members(attnums) == 1);
+
*deps_clauses = lappend(*deps_clauses, clause);
- clause_attnums = bms_add_member(clause_attnums, attnum);
+ clause_attnums = bms_add_member(clause_attnums,
+ bms_singleton_member(attnums));
}
+
+ bms_free(attnums);
}
return clause_attnums;
@@ -1673,3 +2161,454 @@ get_varattnos(Node * node, Index relid)
return result;
}
+
+/*
+ * Estimate selectivity of clauses using a MCV list.
+ *
+ * If there's no MCV list for the stats, the function returns 0.0.
+ *
+ * While computing the estimate, the function checks whether all the
+ * columns were matched with an equality condition. If that's the case,
+ * we can skip processing the histogram, as there can be no rows in
+ * it with the same values - all the rows matching the condition are
+ * represented by the MCV item. This can only happen with equality
+ * on all the attributes.
+ *
+ * The algorithm works like this:
+ *
+ * 1) mark all items as 'match'
+ * 2) walk through all the clauses
+ * 3) for a particular clause, walk through all the items
+ * 4) skip items that are already 'no match'
+ * 5) check clause for items that still match
+ * 6) sum frequencies for items to get selectivity
+ *
+ * The function also returns the frequency of the least frequent item
+ * on the MCV list, which may be useful for clamping the estimate from the
+ * histogram (all items not present in the MCV list are less frequent).
+ * This however seems useful only for cases with conditions on all
+ * attributes.
+ *
+ * TODO This only handles AND-ed clauses, but it might work for OR-ed
+ * lists too - it just needs to reverse the logic a bit. I.e. start
+ * with 'no match' for all items, and mark the items as a match
+ * as the clauses are processed (and skip items that are 'match').
+ */
+static Selectivity
+clauselist_mv_selectivity_mcvlist(PlannerInfo *root, List *clauses,
+ MVStatisticInfo *mvstats, bool *fullmatch,
+ Selectivity *lowsel)
+{
+ int i;
+ Selectivity s = 0.0;
+ Selectivity u = 0.0;
+
+ MCVList mcvlist = NULL;
+ int nmatches = 0;
+
+ /* match/mismatch bitmap for each MCV item */
+ char * matches = NULL;
+
+ Assert(clauses != NIL);
+ Assert(list_length(clauses) >= 2);
+
+ /* there's no MCV list built yet */
+ if (! mvstats->mcv_built)
+ return 0.0;
+
+ mcvlist = load_mv_mcvlist(mvstats->mvoid);
+
+ Assert(mcvlist != NULL);
+ Assert(mcvlist->nitems > 0);
+
+ /* by default all the MCV items match the clauses fully */
+ matches = palloc0(sizeof(char) * mcvlist->nitems);
+ memset(matches, MVSTATS_MATCH_FULL, sizeof(char)*mcvlist->nitems);
+
+ /* number of matching MCV items */
+ nmatches = mcvlist->nitems;
+
+ nmatches = update_match_bitmap_mcvlist(root, clauses,
+ mvstats->stakeys, mcvlist,
+ nmatches, matches,
+ lowsel, fullmatch, false);
+
+ /* sum frequencies for all the matching MCV items */
+ for (i = 0; i < mcvlist->nitems; i++)
+ {
+ /* used to 'scale' for MCV lists not covering all tuples */
+ u += mcvlist->items[i]->frequency;
+
+ if (matches[i] != MVSTATS_MATCH_NONE)
+ s += mcvlist->items[i]->frequency;
+ }
+
+ pfree(matches);
+ pfree(mcvlist);
+
+ return s*u;
+}
+
+/*
+ * Evaluate clauses using the MCV list, and update the match bitmap.
+ *
+ * The bitmap may already be partially set, so this is really a way to
+ * combine results of several clause lists - either when computing
+ * conditional probability P(A|B) or a combination of AND/OR clauses.
+ *
+ * TODO This works with 'bitmap' where each bit is represented as a char,
+ * which is slightly wasteful. Instead, we could use a regular
+ * bitmap, reducing the size to ~1/8. Another thing is merging the
+ * bitmaps using & and |, which might be faster than min/max.
+ */
+static int
+update_match_bitmap_mcvlist(PlannerInfo *root, List *clauses,
+ int2vector *stakeys, MCVList mcvlist,
+ int nmatches, char * matches,
+ Selectivity *lowsel, bool *fullmatch,
+ bool is_or)
+{
+ int i;
+ ListCell * l;
+
+ Bitmapset *eqmatches = NULL; /* attributes with equality matches */
+
+ /* The bitmap may be partially built. */
+ Assert(nmatches >= 0);
+ Assert(nmatches <= mcvlist->nitems);
+ Assert(clauses != NIL);
+ Assert(list_length(clauses) >= 1);
+ Assert(mcvlist != NULL);
+ Assert(mcvlist->nitems > 0);
+
+ /* No more matches possible (AND), or everything already matched (OR) */
+ if (((nmatches == 0) && (! is_or)) ||
+ ((nmatches == mcvlist->nitems) && is_or))
+ return nmatches;
+
+ /* frequency of the lowest MCV item */
+ *lowsel = 1.0;
+
+ /*
+ * Loop through the list of clauses, and for each of them evaluate
+ * all the MCV items not yet eliminated by the preceding clauses.
+ *
+ * FIXME This would probably deserve a refactoring, I guess. Unify
+ * the two loops and put the checks inside, or something like
+ * that.
+ */
+ foreach (l, clauses)
+ {
+ Node * clause = (Node*)lfirst(l);
+
+ /* if it's a RestrictInfo, then extract the clause */
+ if (IsA(clause, RestrictInfo))
+ clause = (Node*)((RestrictInfo*)clause)->clause;
+
+ /* if there are no remaining matches possible, we can stop */
+ if (((nmatches == 0) && (! is_or)) ||
+ ((nmatches == mcvlist->nitems) && is_or))
+ break;
+
+ /* it's an OpExpr, a NullTest, or an AND/OR clause */
+ if (is_opclause(clause))
+ {
+ OpExpr * expr = (OpExpr*)clause;
+ bool varonleft = true;
+ bool ok;
+
+ /* operator */
+ FmgrInfo opproc;
+
+ /* get procedure computing operator selectivity */
+ RegProcedure oprrest = get_oprrest(expr->opno);
+
+ fmgr_info(get_opcode(expr->opno), &opproc);
+
+ ok = (NumRelids(clause) == 1) &&
+ (is_pseudo_constant_clause(lsecond(expr->args)) ||
+ (varonleft = false,
+ is_pseudo_constant_clause(linitial(expr->args))));
+
+ if (ok)
+ {
+
+ FmgrInfo ltproc, gtproc;
+ Var * var = (varonleft) ? linitial(expr->args) : lsecond(expr->args);
+ Const * cst = (varonleft) ? lsecond(expr->args) : linitial(expr->args);
+ bool isgt = (! varonleft);
+
+ /*
+ * TODO Fetch only when really needed (probably for equality only)
+ * TODO Technically either lt/gt is sufficient.
+ *
+ * FIXME The code in analyze.c creates histograms only for types
+ * with enough ordering (by calling get_sort_group_operators).
+ * Is this the same assumption, i.e. are we certain that we
+ * get the ltproc/gtproc every time we ask? Or are there types
+ * where get_sort_group_operators returns ltopr and here we
+ * get nothing?
+ */
+ TypeCacheEntry *typecache
+ = lookup_type_cache(var->vartype,
+ TYPECACHE_EQ_OPR | TYPECACHE_LT_OPR | TYPECACHE_GT_OPR);
+
+ /* FIXME proper matching attribute to dimension */
+ int idx = mv_get_index(var->varattno, stakeys);
+
+ fmgr_info(get_opcode(typecache->lt_opr), <proc);
+ fmgr_info(get_opcode(typecache->gt_opr), >proc);
+
+ /*
+ * Walk through the MCV items and evaluate the current clause. We can
+ * skip items that were already ruled out, and terminate if there are
+ * no remaining MCV items that might possibly match.
+ */
+ for (i = 0; i < mcvlist->nitems; i++)
+ {
+ bool mismatch = false;
+ MCVItem item = mcvlist->items[i];
+
+ /*
+ * find the lowest selectivity in the MCV
+ * FIXME Maybe not the best place to do this (it runs for every clause).
+ */
+ if (item->frequency < *lowsel)
+ *lowsel = item->frequency;
+
+ /*
+ * If there are no more matches (AND) or no remaining unmatched
+ * items (OR), we can stop processing this clause.
+ */
+ if (((nmatches == 0) && (! is_or)) ||
+ ((nmatches == mcvlist->nitems) && is_or))
+ break;
+
+ /*
+ * For AND-lists, we can also mark NULL items as 'no match' (and
+ * then skip them). For OR-lists this is not possible.
+ */
+ if ((! is_or) && item->isnull[idx])
+ matches[i] = MVSTATS_MATCH_NONE;
+
+ /* skip MCV items that were already ruled out */
+ if ((! is_or) && (matches[i] == MVSTATS_MATCH_NONE))
+ continue;
+ else if (is_or && (matches[i] == MVSTATS_MATCH_FULL))
+ continue;
+
+ /* TODO consider bsearch here (list is sorted by values)
+ * TODO handle other operators too (LT, GT)
+ * TODO identify "full match" when the clauses fully
+ * match the whole MCV list (so that checking the
+ * histogram is not needed)
+ */
+ if (oprrest == F_EQSEL)
+ {
+ /*
+ * We don't care about isgt in equality, because it does not
+ * matter whether it's (var = const) or (const = var).
+ */
+ bool match = DatumGetBool(FunctionCall2Coll(&opproc,
+ DEFAULT_COLLATION_OID,
+ cst->constvalue,
+ item->values[idx]));
+
+ if (match)
+ eqmatches = bms_add_member(eqmatches, idx);
+
+ mismatch = (! match);
+ }
+ else if (oprrest == F_SCALARLTSEL) /* column < constant */
+ {
+
+ if (! isgt) /* (var < const) */
+ {
+ /*
+ * Check whether the constant is below the item's value - in that
+ * case the item can't match the (var < const) clause.
+ */
+ mismatch = DatumGetBool(FunctionCall2Coll(&opproc,
+ DEFAULT_COLLATION_OID,
+ cst->constvalue,
+ item->values[idx]));
+
+ } /* (get_oprrest(expr->opno) == F_SCALARLTSEL) */
+ else /* (const < var) */
+ {
+ /*
+ * Check whether the item's value is below the constant - in that
+ * case the item can't match the (const < var) clause.
+ */
+ mismatch = DatumGetBool(FunctionCall2Coll(&opproc,
+ DEFAULT_COLLATION_OID,
+ item->values[idx],
+ cst->constvalue));
+ }
+ }
+ else if (oprrest == F_SCALARGTSEL) /* column > constant */
+ {
+
+ if (! isgt) /* (var > const) */
+ {
+ /*
+ * Check whether the constant is above the item's value - in that
+ * case the item can't match the (var > const) clause.
+ */
+ mismatch = DatumGetBool(FunctionCall2Coll(&opproc,
+ DEFAULT_COLLATION_OID,
+ cst->constvalue,
+ item->values[idx]));
+ }
+ else /* (const > var) */
+ {
+ /*
+ * Check whether the item's value is above the constant - in that
+ * case the item can't match the (const > var) clause.
+ */
+ mismatch = DatumGetBool(FunctionCall2Coll(&opproc,
+ DEFAULT_COLLATION_OID,
+ item->values[idx],
+ cst->constvalue));
+ }
+
+ } /* (get_oprrest(expr->opno) == F_SCALARGTSEL) */
+
+ /* XXX The conditions on matches[i] are not needed, as we
+ * skip MCV items that can't become true/false, depending
+ * on the current flag. See beginning of the loop over
+ * MCV items.
+ */
+
+ if ((is_or) && (matches[i] == MVSTATS_MATCH_NONE) && (! mismatch))
+ {
+ /* OR - was MATCH_NONE, but will be MATCH_FULL */
+ matches[i] = MVSTATS_MATCH_FULL;
+ ++nmatches;
+ continue;
+ }
+ else if ((! is_or) && (matches[i] == MVSTATS_MATCH_FULL) && mismatch)
+ {
+ /* AND - was MATCH_FULL, but will be MATCH_NONE */
+ matches[i] = MVSTATS_MATCH_NONE;
+ --nmatches;
+ continue;
+ }
+
+ }
+ }
+ }
+ else if (IsA(clause, NullTest))
+ {
+ NullTest * expr = (NullTest*)clause;
+ Var * var = (Var*)(expr->arg);
+
+ /* FIXME proper matching attribute to dimension */
+ int idx = mv_get_index(var->varattno, stakeys);
+
+ /*
+ * Walk through the MCV items and evaluate the current clause. We can
+ * skip items that were already ruled out, and terminate if there are
+ * no remaining MCV items that might possibly match.
+ */
+ for (i = 0; i < mcvlist->nitems; i++)
+ {
+ MCVItem item = mcvlist->items[i];
+
+ /*
+ * find the lowest selectivity in the MCV
+ * FIXME Maybe not the best place to do this (it runs for every clause).
+ */
+ if (item->frequency < *lowsel)
+ *lowsel = item->frequency;
+
+ /* if there are no more matches, we can stop processing this clause */
+ if (nmatches == 0)
+ break;
+
+ /* skip MCV items that were already ruled out */
+ if (matches[i] == MVSTATS_MATCH_NONE)
+ continue;
+
+ /* if the clause mismatches the MCV item, set it as MATCH_NONE */
+ if (((expr->nulltesttype == IS_NULL) && (! mcvlist->items[i]->isnull[idx])) ||
+ ((expr->nulltesttype == IS_NOT_NULL) && (mcvlist->items[i]->isnull[idx])))
+ {
+ matches[i] = MVSTATS_MATCH_NONE;
+ --nmatches;
+ }
+ }
+ }
+ else if (or_clause(clause) || and_clause(clause))
+ {
+ /* AND/OR clause, with all clauses compatible with the selected MV stat */
+
+ int i;
+ BoolExpr *orclause = ((BoolExpr*)clause);
+ List *orclauses = orclause->args;
+
+ /* match/mismatch bitmap for each MCV item */
+ int or_nmatches = 0;
+ char * or_matches = NULL;
+
+ Assert(orclauses != NIL);
+ Assert(list_length(orclauses) >= 2);
+
+ /* number of matching MCV items */
+ or_nmatches = mcvlist->nitems;
+
+ /* by default none of the MCV items matches the clauses */
+ or_matches = palloc0(sizeof(char) * or_nmatches);
+
+ if (or_clause(clause))
+ {
+ /* OR clauses assume nothing matches, initially */
+ memset(or_matches, MVSTATS_MATCH_NONE, sizeof(char)*or_nmatches);
+ or_nmatches = 0;
+ }
+ else
+ {
+ /* AND clauses assume everything matches, initially */
+ memset(or_matches, MVSTATS_MATCH_FULL, sizeof(char)*or_nmatches);
+ }
+
+ /* build the match bitmap for the sub-clauses */
+ or_nmatches = update_match_bitmap_mcvlist(root, orclauses,
+ stakeys, mcvlist,
+ or_nmatches, or_matches,
+ lowsel, fullmatch, or_clause(clause));
+
+ /* merge the bitmap into the existing one */
+ for (i = 0; i < mcvlist->nitems; i++)
+ {
+ /*
+ * To AND-merge the bitmaps, a MIN() semantics is used.
+ * For OR-merge, use MAX().
+ *
+ * FIXME this does not update the nmatches counter
+ */
+ UPDATE_RESULT(matches[i], or_matches[i], is_or);
+ }
+
+ pfree(or_matches);
+
+ }
+ else
+ {
+ elog(ERROR, "unknown clause type: %d", clause->type);
+ }
+ }
+
+ /*
+ * If all the columns were matched by equality, it's a full match.
+ * In this case at most a single MCV item can match the clauses
+ * (two matching items would have to contain exactly the same values).
+ */
+ *fullmatch = (bms_num_members(eqmatches) == mcvlist->ndimensions);
+
+ /* free the allocated pieces */
+ if (eqmatches)
+ pfree(eqmatches);
+
+ return nmatches;
+}
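+
+/*
+ * Example (illustrative): for WHERE (a = 1) AND ((b < 2) OR (b > 10)),
+ * the outer call processes (a = 1) directly and recurses for the
+ * OR-clause with a fresh all-MATCH_NONE bitmap; the OR result is then
+ * MIN-merged (AND semantics) into the caller's bitmap via UPDATE_RESULT.
+ */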
diff --git a/src/backend/optimizer/util/plancat.c b/src/backend/optimizer/util/plancat.c
index 83bd85c..0cb4063 100644
--- a/src/backend/optimizer/util/plancat.c
+++ b/src/backend/optimizer/util/plancat.c
@@ -410,7 +410,7 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
mvstat = (Form_pg_mv_statistic) GETSTRUCT(htup);
/* unavailable stats are not interesting for the planner */
- if (mvstat->deps_built)
+ if (mvstat->deps_built || mvstat->mcv_built)
{
info = makeNode(MVStatisticInfo);
@@ -419,9 +419,11 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
/* enabled statistics */
info->deps_enabled = mvstat->deps_enabled;
+ info->mcv_enabled = mvstat->mcv_enabled;
/* built/available statistics */
info->deps_built = mvstat->deps_built;
+ info->mcv_built = mvstat->mcv_built;
/* stakeys */
adatum = SysCacheGetAttr(MVSTATOID, htup,
diff --git a/src/backend/utils/mvstats/Makefile b/src/backend/utils/mvstats/Makefile
index 099f1ed..f9bf10c 100644
--- a/src/backend/utils/mvstats/Makefile
+++ b/src/backend/utils/mvstats/Makefile
@@ -12,6 +12,6 @@ subdir = src/backend/utils/mvstats
top_builddir = ../../../..
include $(top_builddir)/src/Makefile.global
-OBJS = common.o dependencies.o
+OBJS = common.o dependencies.o mcv.o
include $(top_srcdir)/src/backend/common.mk
diff --git a/src/backend/utils/mvstats/common.c b/src/backend/utils/mvstats/common.c
index bd200bc..d1da714 100644
--- a/src/backend/utils/mvstats/common.c
+++ b/src/backend/utils/mvstats/common.c
@@ -16,12 +16,14 @@
#include "common.h"
+#include "utils/array.h"
+
static VacAttrStats ** lookup_var_attr_stats(int2vector *attrs,
- int natts, VacAttrStats **vacattrstats);
+ int natts,
+ VacAttrStats **vacattrstats);
static List* list_mv_stats(Oid relid);
-
/*
* Compute requested multivariate stats, using the rows sampled for the
* plain (single-column) stats.
@@ -49,6 +51,8 @@ build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
int j;
MVStatisticInfo *stat = (MVStatisticInfo *)lfirst(lc);
MVDependencies deps = NULL;
+ MCVList mcvlist = NULL;
+ int numrows_filtered = 0;
VacAttrStats **stats = NULL;
int numatts = 0;
@@ -87,8 +91,12 @@ build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
if (stat->deps_enabled)
deps = build_mv_dependencies(numrows, rows, attrs, stats);
+ /* build the MCV list */
+ if (stat->mcv_enabled)
+ mcvlist = build_mv_mcvlist(numrows, rows, attrs, stats, &numrows_filtered);
+
/* store the histogram / MCV list in the catalog */
- update_mv_stats(stat->mvoid, deps, attrs);
+ update_mv_stats(stat->mvoid, deps, mcvlist, attrs, stats);
}
}
@@ -166,6 +174,8 @@ list_mv_stats(Oid relid)
info->stakeys = buildint2vector(stats->stakeys.values, stats->stakeys.dim1);
info->deps_enabled = stats->deps_enabled;
info->deps_built = stats->deps_built;
+ info->mcv_enabled = stats->mcv_enabled;
+ info->mcv_built = stats->mcv_built;
result = lappend(result, info);
}
@@ -180,8 +190,56 @@ list_mv_stats(Oid relid)
return result;
}
+
+/*
+ * Find attnums of MV stats using the mvoid.
+ */
+int2vector*
+find_mv_attnums(Oid mvoid, Oid *relid)
+{
+ ArrayType *arr;
+ Datum adatum;
+ bool isnull;
+ HeapTuple htup;
+ int2vector *keys;
+
+ /* Fetch the pg_mv_statistic tuple for the given mvoid. */
+ htup = SearchSysCache1(MVSTATOID,
+ ObjectIdGetDatum(mvoid));
+
+ /* XXX syscache contains OIDs of deleted stats (not invalidated) */
+ if (! HeapTupleIsValid(htup))
+ return NULL;
+
+ /* starelid */
+ adatum = SysCacheGetAttr(MVSTATOID, htup,
+ Anum_pg_mv_statistic_starelid, &isnull);
+ Assert(!isnull);
+
+ *relid = DatumGetObjectId(adatum);
+
+ /* stakeys */
+ adatum = SysCacheGetAttr(MVSTATOID, htup,
+ Anum_pg_mv_statistic_stakeys, &isnull);
+ Assert(!isnull);
+
+ arr = DatumGetArrayTypeP(adatum);
+
+ keys = buildint2vector((int16 *) ARR_DATA_PTR(arr),
+ ARR_DIMS(arr)[0]);
+ ReleaseSysCache(htup);
+
+ /* TODO Maybe save the list into relcache, as in RelationGetIndexList
+ * (which was used as an inspiration for this one)? */
+
+ return keys;
+}
+
+
void
-update_mv_stats(Oid mvoid, MVDependencies dependencies, int2vector *attrs)
+update_mv_stats(Oid mvoid,
+ MVDependencies dependencies, MCVList mcvlist,
+ int2vector *attrs, VacAttrStats **stats)
{
HeapTuple stup,
oldtup;
@@ -206,18 +264,29 @@ update_mv_stats(Oid mvoid, MVDependencies dependencies, int2vector *attrs)
= PointerGetDatum(serialize_mv_dependencies(dependencies));
}
+ if (mcvlist != NULL)
+ {
+ bytea * data = serialize_mv_mcvlist(mcvlist, attrs, stats);
+ nulls[Anum_pg_mv_statistic_stamcv -1] = (data == NULL);
+ values[Anum_pg_mv_statistic_stamcv - 1] = PointerGetDatum(data);
+ }
+
/* always replace the value (either by bytea or NULL) */
replaces[Anum_pg_mv_statistic_stadeps -1] = true;
+ replaces[Anum_pg_mv_statistic_stamcv -1] = true;
/* always change the availability flags */
nulls[Anum_pg_mv_statistic_deps_built -1] = false;
+ nulls[Anum_pg_mv_statistic_mcv_built -1] = false;
nulls[Anum_pg_mv_statistic_stakeys-1] = false;
/* use the new attnums, in case we removed some dropped ones */
replaces[Anum_pg_mv_statistic_deps_built-1] = true;
+ replaces[Anum_pg_mv_statistic_mcv_built -1] = true;
replaces[Anum_pg_mv_statistic_stakeys -1] = true;
values[Anum_pg_mv_statistic_deps_built-1] = BoolGetDatum(dependencies != NULL);
+ values[Anum_pg_mv_statistic_mcv_built -1] = BoolGetDatum(mcvlist != NULL);
values[Anum_pg_mv_statistic_stakeys -1] = PointerGetDatum(attrs);
/* Is there already a pg_mv_statistic tuple for this attribute? */
@@ -246,6 +315,21 @@ update_mv_stats(Oid mvoid, MVDependencies dependencies, int2vector *attrs)
heap_close(sd, RowExclusiveLock);
}
+
+int
+mv_get_index(AttrNumber varattno, int2vector * stakeys)
+{
+ int i, idx = 0;
+ for (i = 0; i < stakeys->dim1; i++)
+ {
+ if (stakeys->values[i] < varattno)
+ idx += 1;
+ else
+ break;
+ }
+ return idx;
+}
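+
+/*
+ * Example (illustrative): with stakeys = {2, 5, 7}, mv_get_index(5, keys)
+ * returns 1, i.e. the second dimension of the statistics. Note this
+ * assumes the stakeys values are sorted in ascending order.
+ */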
+
/* multi-variate stats comparator */
/*
@@ -256,11 +340,15 @@ update_mv_stats(Oid mvoid, MVDependencies dependencies, int2vector *attrs)
int
compare_scalars_simple(const void *a, const void *b, void *arg)
{
- Datum da = *(Datum*)a;
- Datum db = *(Datum*)b;
- SortSupport ssup= (SortSupport) arg;
+ return compare_datums_simple(*(Datum*)a,
+ *(Datum*)b,
+ (SortSupport)arg);
+}
- return ApplySortComparator(da, false, db, false, ssup);
+int
+compare_datums_simple(Datum a, Datum b, SortSupport ssup)
+{
+ return ApplySortComparator(a, false, b, false, ssup);
}
/*
diff --git a/src/backend/utils/mvstats/common.h b/src/backend/utils/mvstats/common.h
index 6d5465b..f4309f7 100644
--- a/src/backend/utils/mvstats/common.h
+++ b/src/backend/utils/mvstats/common.h
@@ -46,7 +46,15 @@ typedef struct
Datum value; /* a data value */
int tupno; /* position index for tuple it came from */
} ScalarItem;
-
+
+/* (de)serialization info */
+typedef struct DimensionInfo {
+ int nvalues; /* number of deduplicated values */
+ int nbytes; /* number of bytes (serialized) */
+ int typlen; /* pg_type.typlen */
+ bool typbyval; /* pg_type.typbyval */
+} DimensionInfo;
+
/* multi-sort */
typedef struct MultiSortSupportData {
int ndims; /* number of dimensions supported by the */
@@ -71,5 +79,6 @@ int multi_sort_compare_dim(int dim, const SortItem *a,
const SortItem *b, MultiSortSupport mss);
/* comparators, used when constructing multivariate stats */
+int compare_datums_simple(Datum a, Datum b, SortSupport ssup);
int compare_scalars_simple(const void *a, const void *b, void *arg);
int compare_scalars_partition(const void *a, const void *b, void *arg);
diff --git a/src/backend/utils/mvstats/mcv.c b/src/backend/utils/mvstats/mcv.c
new file mode 100644
index 0000000..670dbda
--- /dev/null
+++ b/src/backend/utils/mvstats/mcv.c
@@ -0,0 +1,1237 @@
+/*-------------------------------------------------------------------------
+ *
+ * mcv.c
+ * POSTGRES multivariate MCV lists
+ *
+ *
+ * Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ * src/backend/utils/mvstats/mcv.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres.h"
+
+#include "fmgr.h"
+#include "funcapi.h"
+
+#include "utils/lsyscache.h"
+
+#include "common.h"
+
+/*
+ * Multivariate MCVs (most-common values lists) are a straightforward
+ * extension of regular MCV list, tracking combinations of values for
+ * several attributes (columns), including NULL flags, and frequency
+ * of the combination.
+ *
+ * For columns with a small number of distinct values, this works quite
+ * well and may represent the distribution very accurately. For columns
+ * with a large number of distinct values (e.g. stored as FLOAT), this
+ * does not work that well. Especially if the distribution is mostly
+ * uniform, with no very common combinations.
+ *
+ * If we can represent the distribution as a MCV list, we can estimate
+ * some clauses (e.g. equality clauses) much more accurately than using
+ * histograms, for example.
+ *
+ * Another benefit of MCV lists (compared to histograms) is that they
+ * don't require sorting of the values, so that they work better for
+ * data types that either don't support sorting at all, or when the
+ * sorting does not really match the meaning. For example we know how to
+ * sort strings, but it's unlikely to make much sense for city names.
+ *
+ *
+ * Hashed MCV (not yet implemented)
+ * --------------------------------
+ * By restricting to MCV list and equality conditions, we may use hash
+ * values instead of the long varlena values. This significantly reduces
+ * the storage requirements, and we can still use it to estimate the
+ * equality conditions (assuming the collisions are rare enough).
+ *
+ * This however complicates matching the columns to available stats, as
+ * it requires matching clauses (not columns) to stats. And it may get
+ * quite complex - e.g. what if there are multiple clauses, each
+ * compatible with a different subset of the stats?
+ *
+ *
+ * Selectivity estimation
+ * ----------------------
+ * The estimation, implemented in clauselist_mv_selectivity_mcvlist(),
+ * is quite simple in principle - walk through the MCV items and sum
+ * frequencies of all the items that match all the clauses.
+ *
+ * The current implementation uses MCV lists to estimate these types
+ * of clauses (think of WHERE conditions):
+ *
+ * (a) equality clauses WHERE (a = 1) AND (b = 2)
+ * (b) inequality clauses WHERE (a < 1) AND (b >= 2)
+ * (c) NULL clauses WHERE (a IS NULL) AND (b IS NOT NULL)
+ * (d) OR clauses WHERE (a < 1) OR (b >= 2)
+ *
+ * It's possible to add more clauses, for example:
+ *
+ * (e) multi-var clauses WHERE (a > b)
+ *
+ * and so on. These are tasks for the future, not yet implemented.
+ *
+ *
+ * Estimating equality clauses
+ * ---------------------------
+ * When computing selectivity estimate for equality clauses
+ *
+ * (a = 1) AND (b = 2)
+ *
+ * we can do this estimate pretty exactly assuming that two conditions
+ * are met:
+ *
+ * (1) there's an equality condition on each attribute
+ *
+ * (2) we find a matching item in the MCV list
+ *
+ * In that case we know the MCV item represents all the tuples matching
+ * the clauses, and the selectivity estimate is complete. This is what
+ * we call 'full match'.
+ *
+ * When only (1) holds, but there's no matching MCV item, we don't know
+ * whether there are no such rows at all, or they're just not frequent
+ * enough. We can however use the frequency of the least frequent MCV
+ * item as an upper bound for the selectivity.
+ *
+ * If the equality conditions match only a subset of the attributes
+ * the MCV list is built on, we can't get a full match - we may get
+ * multiple MCV items matching the clauses, and even with a single
+ * match there may be rows that did not get into the MCV list. But in
+ * this case we can still use the frequency of the last MCV item to clamp
+ * the 'additional' selectivity not accounted for by the matching items.
+ *
+ * If there's no histogram, because the MCV list approximates the
+ * distribution accurately (not because the histogram was disabled),
+ * it does not really matter whether there are equality conditions on
+ * all the columns - we can do pretty accurate estimation using the MCV.
+ *
+ * TODO For a combination of equality conditions (not full-match case)
+ * we probably can clamp the selectivity by the minimum of
+ * selectivities for each condition. For example if we know the
+ * number of distinct values for each column, we can use 1/ndistinct
+ * as a per-column estimate. Or rather 1/ndistinct + selectivity
+ * derived from the MCV list.
+ *
+ * If we know the estimate of number of combinations of the columns
+ * (i.e. ndistinct(A,B)), we may estimate the average frequency of
+ * items in the remaining 10% as [10% / ndistinct(A,B)].
+ *
+ *
+ * Bounding estimates
+ * ------------------
+ * In general the MCV lists may not provide estimates as accurate as
+ * for the full-match equality case, but may provide some useful
+ * lower/upper boundaries for the estimation error.
+ *
+ * With equality clauses we can do a few more tricks to narrow this
+ * error range (see the previous section and TODO), but with inequality
+ * clauses (or generally non-equality clauses), it's rather difficult.
+ * There's nothing like a 'full match' - we have to consider both the
+ * MCV items and the remaining part every time. We can't use the minimum
+ * selectivity of MCV items, as the clauses may match multiple items.
+ *
+ * For example with a MCV list on columns (A, B), covering 90% of the
+ * table (computed while building the MCV list), roughly 10% of the table
+ * is not represented by the MCV list. So even if the conditions match
+ * all the remaining rows (not represented by the MCV items), we can't
+ * get selectivity higher than those 10%. We may use half of the
+ * remaining selectivity as the estimate (minimizing the average error).
+ *
+ * TODO Most of these ideas (error limiting) are not yet implemented.
+ *
+ *
+ * General TODO
+ * ------------
+ *
+ * FIXME Use max_mcv_items from ALTER TABLE ADD STATISTICS command.
+ *
+ * TODO Add support for clauses referencing multiple columns (a < b).
+ *
+ * TODO It's possible to build a special case of MCV list, storing not
+ * the actual values but only a 32/64-bit hash. This is only useful
+ * for estimating equality clauses and for large varlena types,
+ * which are impractical for a plain MCV list because of their size.
+ * But for those data types we really want just the equality
+ * clauses, so it's actually a good solution.
+ *
+ * TODO Currently there's no logic to consider building only a MCV list
+ * (and not building the histogram at all), except for making this
+ * decision manually in ADD STATISTICS.
+ */
+
+/*
+ * Each serialized item needs to store (in this order):
+ *
+ * - indexes (ndim * sizeof(uint16))
+ * - null flags (ndim * sizeof(bool))
+ * - frequency (sizeof(double))
+ *
+ * So in total:
+ *
+ * ndim * (sizeof(uint16) + sizeof(bool)) + sizeof(double)
+ */
+#define ITEM_SIZE(ndims) \
+ (ndims * (sizeof(uint16) + sizeof(bool)) + sizeof(double))
+
+/* pointers into a flat serialized item of ITEM_SIZE(n) bytes */
+#define ITEM_INDEXES(item) ((uint16*)item)
+#define ITEM_NULLS(item,ndims) ((bool*)(ITEM_INDEXES(item) + ndims))
+#define ITEM_FREQUENCY(item,ndims) ((double*)(ITEM_NULLS(item,ndims) + ndims))
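+
+/*
+ * Example (illustrative): for ndims = 3 this is 3 * (2 + 1) + 8 = 17 bytes
+ * per item - the indexes first, then the null flags, then the frequency.
+ */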
+
+/*
+ * Builds MCV list from sample rows, and removes rows represented by
+ * the MCV list from the sample (the number of remaining sample rows is
+ * returned by the numrows_filtered parameter).
+ *
+ * The method is quite simple - in short it does about these steps:
+ *
+ * (1) sort the data (default collation, '<' for the data type)
+ *
+ * (2) count distinct groups, decide how many to keep
+ *
+ * (3) build the MCV list using the threshold determined in (2)
+ *
+ * (4) remove rows represented by the MCV from the sample
+ *
+ * For more details, see the comments in the code.
+ *
+ * FIXME Single-dimensional MCV is sorted by frequency (descending). We
+ * should do that too, because when walking through the list we
+ * want to check the most frequent items first.
+ *
+ * TODO We're using Datum (8B), even for narrower data types (e.g. int4
+ * or float4). Maybe we could save some space here, but the bytea
+ * compression should handle it just fine.
+ *
+ * TODO This probably should not use the ndistinct directly (as computed
+ * from the sample), but rather an estimate of the number of
+ * distinct values in the whole table, no?
+ */
+MCVList
+build_mv_mcvlist(int numrows, HeapTuple *rows, int2vector *attrs,
+ VacAttrStats **stats, int *numrows_filtered)
+{
+ int i, j;
+ int numattrs = attrs->dim1;
+ int ndistinct = 0;
+ int mcv_threshold = 0;
+ int count = 0;
+ int nitems = 0;
+
+ MCVList mcvlist = NULL;
+
+ /* Sort by multiple columns (using array of SortSupport) */
+ MultiSortSupport mss = multi_sort_init(numattrs);
+
+ /*
+ * Preallocate space for all the items as a single chunk, and point
+ * the items to the appropriate parts of the array.
+ */
+ SortItem *items = (SortItem*)palloc0(numrows * sizeof(SortItem));
+ Datum *values = (Datum*)palloc0(sizeof(Datum) * numrows * numattrs);
+ bool *isnull = (bool*)palloc0(sizeof(bool) * numrows * numattrs);
+
+ /* keep all the rows by default (as if there was no MCV list) */
+ *numrows_filtered = numrows;
+
+ for (i = 0; i < numrows; i++)
+ {
+ items[i].values = &values[i * numattrs];
+ items[i].isnull = &isnull[i * numattrs];
+ }
+
+ /* load the values/null flags from sample rows */
+ for (j = 0; j < numrows; j++)
+ for (i = 0; i < numattrs; i++)
+ items[j].values[i] = heap_getattr(rows[j], attrs->values[i],
+ stats[i]->tupDesc, &items[j].isnull[i]);
+
+ /* prepare the sort functions for all the attributes */
+ for (i = 0; i < numattrs; i++)
+ multi_sort_add_dimension(mss, i, i, stats);
+
+ /* do the sort, using the multi-sort */
+ qsort_arg((void *) items, numrows, sizeof(SortItem),
+ multi_sort_compare, mss);
+
+ /*
+ * Count the number of distinct groups - just walk through the
+ * sorted list and count the number of key changes. We use this to
+ * determine the threshold (125% of the average frequency).
+ */
+ ndistinct = 1;
+ for (i = 1; i < numrows; i++)
+ if (multi_sort_compare(&items[i], &items[i-1], mss) != 0)
+ ndistinct += 1;
+
+ /*
+ * Determine how many groups actually exceed the threshold, and then
+ * walk the array again and collect them into an array. We'll always
+ * require at least 4 rows per group.
+ *
+ * But if we can fit all the distinct values in the MCV list (i.e.
+ * if there are fewer distinct groups than MVSTAT_MCVLIST_MAX_ITEMS),
+ * we'll require only 2 rows per group.
+ *
+ * TODO For now the threshold is the same as in the single-column
+ * case (average + 25%), but maybe that's worth revisiting
+ * for the multivariate case.
+ *
+ * TODO We can do this only if we believe we got all the distinct
+ * values of the table.
+ *
+ * FIXME This should really reference mcv_max_items (from catalog)
+ * instead of the constant MVSTAT_MCVLIST_MAX_ITEMS.
+ */
+ mcv_threshold = 1.25 * numrows / ndistinct;
+ mcv_threshold = (mcv_threshold < 4) ? 4 : mcv_threshold;
+
+ if (ndistinct <= MVSTAT_MCVLIST_MAX_ITEMS)
+ mcv_threshold = 2;
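+
+ /*
+ * Example (illustrative): with numrows = 30000 and ndistinct = 100,
+ * the average group has 300 rows, so mcv_threshold = 375. With few
+ * enough distinct groups (the branch above), the threshold drops to 2.
+ */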
+
+ /*
+ * Walk through the sorted data again, and see how many groups
+ * reach the mcv_threshold (and become an item in the MCV list).
+ */
+ count = 1;
+ for (i = 1; i <= numrows; i++)
+ {
+ /* last row or new group, so check if we exceed mcv_threshold */
+ if ((i == numrows) || (multi_sort_compare(&items[i], &items[i-1], mss) != 0))
+ {
+ /* group hits the threshold, count the group as MCV item */
+ if (count >= mcv_threshold)
+ nitems += 1;
+
+ count = 1;
+ }
+ else /* within group, so increase the number of items */
+ count += 1;
+ }
+
+ /* we know the number of MCV list items, so let's build the list */
+ if (nitems > 0)
+ {
+ /* allocate the MCV list structure, set parameters we know */
+ mcvlist = (MCVList)palloc0(sizeof(MCVListData));
+
+ mcvlist->magic = MVSTAT_MCV_MAGIC;
+ mcvlist->type = MVSTAT_MCV_TYPE_BASIC;
+ mcvlist->ndimensions = numattrs;
+ mcvlist->nitems = nitems;
+
+ /*
+ * Preallocate Datum/isnull arrays (not as a single chunk, as
+ * we'll pass this outside this method and thus it needs to be
+ * easy to pfree() the data - we wouldn't know where the
+ * arrays start).
+ *
+ * TODO Maybe the reasoning that we can't allocate a single
+ * piece because we're passing it out is bogus? Who'd
+ * free a single item of the MCV list, anyway?
+ *
+ * TODO Maybe with a proper encoding (stuffing all the values
+ * into a list-level array), this will be untrue?
+ */
+ mcvlist->items = (MCVItem*)palloc0(sizeof(MCVItem)*nitems);
+
+ for (i = 0; i < nitems; i++)
+ {
+ mcvlist->items[i] = (MCVItem)palloc0(sizeof(MCVItemData));
+ mcvlist->items[i]->values = (Datum*)palloc0(sizeof(Datum)*numattrs);
+ mcvlist->items[i]->isnull = (bool*)palloc0(sizeof(bool)*numattrs);
+ }
+
+ /*
+ * Repeat the same loop as above, but this time copy the data
+ * into the MCV list (for items exceeding the threshold).
+ *
+ * TODO Maybe we could simply remember indexes of the last item
+ * in each group (from the previous loop)?
+ */
+ count = 1;
+ nitems = 0;
+ for (i = 1; i <= numrows; i++)
+ {
+ /* last row or a new group */
+ if ((i == numrows) || (multi_sort_compare(&items[i], &items[i-1], mss) != 0))
+ {
+ /* count the MCV item if exceeding the threshold (and copy into the array) */
+ if (count >= mcv_threshold)
+ {
+ /* just pointer to the proper place in the list */
+ MCVItem item = mcvlist->items[nitems];
+
+ /* copy values from the _previous_ group (its last item) */
+ memcpy(item->values, items[(i-1)].values, sizeof(Datum) * numattrs);
+ memcpy(item->isnull, items[(i-1)].isnull, sizeof(bool) * numattrs);
+
+
+ /* and finally the group frequency */
+ item->frequency = (double)count / numrows;
+
+ /* next item */
+ nitems += 1;
+ }
+
+ count = 1;
+ }
+ else /* same group, just increase the number of items */
+ count += 1;
+ }
+
+ /* make sure the loops are consistent */
+ Assert(nitems == mcvlist->nitems);
+
+ /*
+ * Remove the rows matching the MCV list (i.e. keep only rows
+ * that are not represented by the MCV list).
+ *
+ * FIXME This implementation is rather naive, effectively O(N^2).
+ * As the MCV list grows, the check will take longer and
+ * longer. And as the number of sampled rows increases (by
+ * increasing statistics target), it will take longer and
+ * longer. One option is to sort the MCV items first and
+ * then perform a binary search.
+ *
+ * A better option would be keeping the ID of the row in
+ * the sort item, and then just walking through the items and
+ * marking rows to remove (in a bitmap of the same size).
+ * There's no space for that in SortItem at the moment, but
+ * it's trivial to add a 'private' pointer, or to use another
+ * structure with an extra field (starting with SortItem, so
+ * that the comparators etc. still work).
+ *
+ * Another option is to use the sorted array of items
+ * (because that's how we sorted the source data), and
+ * simply do a bsearch() into it. If we find a matching
+ * item, the row belongs to the MCV list.
+ */
+ if (nitems == ndistinct) /* all rows are covered by MCV items */
+ *numrows_filtered = 0;
+ else /* (nitems < ndistinct) && (nitems > 0) */
+ {
+ int nfiltered = 0;
+ HeapTuple *rows_filtered = (HeapTuple*)palloc0(sizeof(HeapTuple) * numrows);
+
+ /* used for the searches */
+ SortItem item, mcvitem;
+
+ item.values = (Datum*)palloc0(numattrs * sizeof(Datum));
+ item.isnull = (bool*)palloc0(numattrs * sizeof(bool));
+
+ /*
+ * FIXME we don't need to allocate this, we can reference
+ * the MCV item directly ...
+ */
+ mcvitem.values = (Datum*)palloc0(numattrs * sizeof(Datum));
+ mcvitem.isnull = (bool*)palloc0(numattrs * sizeof(bool));
+
+ /* walk through the tuples, compare the values to MCV items */
+ for (i = 0; i < numrows; i++)
+ {
+ bool match = false;
+
+ /* collect the key values from the row */
+ for (j = 0; j < numattrs; j++)
+ item.values[j] = heap_getattr(rows[i], attrs->values[j],
+ stats[j]->tupDesc, &item.isnull[j]);
+
+ /* scan through the MCV list for matches */
+ for (j = 0; j < mcvlist->nitems; j++)
+ {
+ /*
+ * TODO Create a SortItem/MCVItem comparator so that
+ * we don't need to do memcpy() like crazy.
+ */
+ memcpy(mcvitem.values, mcvlist->items[j]->values,
+ numattrs * sizeof(Datum));
+ memcpy(mcvitem.isnull, mcvlist->items[j]->isnull,
+ numattrs * sizeof(bool));
+
+ if (multi_sort_compare(&item, &mcvitem, mss) == 0)
+ {
+ match = true;
+ break;
+ }
+ }
+
+ /* if no match in the MCV list, copy the row into the filtered ones */
+ if (! match)
+ memcpy(&rows_filtered[nfiltered++], &rows[i], sizeof(HeapTuple));
+ }
+
+ /* replace the rows and remember how many rows we kept */
+ memcpy(rows, rows_filtered, sizeof(HeapTuple) * nfiltered);
+ *numrows_filtered = nfiltered;
+
+ /* free all the data used here */
+ pfree(rows_filtered);
+ pfree(item.values);
+ pfree(item.isnull);
+ pfree(mcvitem.values);
+ pfree(mcvitem.isnull);
+ }
+ }
+
+ pfree(values);
+ pfree(items);
+ pfree(isnull);
+
+ return mcvlist;
+}
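+
+/*
+ * XXX A sketch of the bitmap-based filtering suggested in the FIXME
+ * above (not part of the patch, just to illustrate the idea). It
+ * assumes SortItem grows a hypothetical 'rownum' field remembering
+ * the index of the source row. When a group of 'count' rows ending
+ * at sorted index (i-1) becomes an MCV item, we can mark the covered
+ * rows directly:
+ *
+ *   bool *covered = (bool *) palloc0(sizeof(bool) * numrows);
+ *
+ *   ...
+ *
+ *   for (j = i - count; j < i; j++)
+ *       covered[items[j].rownum] = true;
+ *
+ * The filtering pass then simply keeps the rows with !covered[r],
+ * making it O(N) instead of O(N^2).
+ */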
+
+/* fetch the MCV list (as a bytea) from the pg_mv_statistic catalog */
+MCVList
+load_mv_mcvlist(Oid mvoid)
+{
+ bool isnull = false;
+ Datum mcvlist;
+
+#ifdef USE_ASSERT_CHECKING
+ Form_pg_mv_statistic mvstat;
+#endif
+
+ /* Fetch the pg_mv_statistic tuple for the given OID from the syscache. */
+ HeapTuple htup = SearchSysCache1(MVSTATOID, ObjectIdGetDatum(mvoid));
+
+ if (! HeapTupleIsValid(htup))
+ return NULL;
+
+#ifdef USE_ASSERT_CHECKING
+ mvstat = (Form_pg_mv_statistic) GETSTRUCT(htup);
+ Assert(mvstat->mcv_enabled && mvstat->mcv_built);
+#endif
+
+ mcvlist = SysCacheGetAttr(MVSTATOID, htup,
+ Anum_pg_mv_statistic_stamcv, &isnull);
+
+ Assert(!isnull);
+
+ ReleaseSysCache(htup);
+
+ return deserialize_mv_mcvlist(DatumGetByteaP(mcvlist));
+}
+
+/* print some basic info about the MCV list
+ *
+ * TODO Add info about what part of the table this covers.
+ */
+Datum
+pg_mv_stats_mcvlist_info(PG_FUNCTION_ARGS)
+{
+ bytea *data = PG_GETARG_BYTEA_P(0);
+ char *result;
+
+ MCVList mcvlist = deserialize_mv_mcvlist(data);
+
+ result = palloc0(128);
+ snprintf(result, 128, "nitems=%d", mcvlist->nitems);
+
+ pfree(mcvlist);
+
+ PG_RETURN_TEXT_P(cstring_to_text(result));
+}
+
+/* used to pass context into bsearch() */
+static SortSupport ssup_private = NULL;
+
+static int bsearch_comparator(const void * a, const void * b);
+
+/*
+ * Serialize MCV list into a bytea value. The basic algorithm is simple:
+ *
+ * (1) perform deduplication for each attribute (separately)
+ * (a) collect all (non-NULL) attribute values from all MCV items
+ * (b) sort the data (using 'lt' from VacAttrStats)
+ * (c) remove duplicate values from the array
+ *
+ * (2) serialize the arrays into a bytea value
+ *
+ * (3) process all MCV list items
+ * (a) replace values with indexes into the arrays
+ *
+ * Each attribute has to be processed separately, because we're mixing
+ * different datatypes, and we don't know what equality means for them.
+ * We're also mixing pass-by-value and pass-by-ref types, and so on.
+ *
+ * We'll use uint16 values for the indexes in step (3), as we don't
+ * allow more than 8k MCV items (see the max_mcv_items limit). We
+ * might increase the limit up to 65535 items and still fit into uint16.
+ *
+ * We don't really expect as high a compression as with histograms,
+ * because we're not doing any bucket splits etc. (which is the
+ * source of the high redundancy there), but we need to do this
+ * anyway, as we need to serialize varlena values etc. We might
+ * invent another way to serialize MCV lists, but let's keep it
+ * consistent.
+ *
+ * FIXME This probably leaks memory, or at least uses it inefficiently
+ * (many small palloc() calls instead of a large one).
+ *
+ * TODO Consider using 16-bit values for the indexes in step (3).
+ *
+ * TODO Consider packing boolean flags (NULL) for each item into 'char'
+ * or a longer type (instead of using an array of bool items).
+ */
+bytea *
+serialize_mv_mcvlist(MCVList mcvlist, int2vector *attrs,
+ VacAttrStats **stats)
+{
+ int i, j;
+ int ndims = mcvlist->ndimensions;
+ int itemsize = ITEM_SIZE(ndims);
+
+ Size total_length = 0;
+
+ char *item = palloc0(itemsize);
+
+ /* serialized items (indexes into arrays, etc.) */
+ bytea *output;
+ char *data = NULL;
+
+ /* values per dimension (and number of non-NULL values) */
+ Datum **values = (Datum**)palloc0(sizeof(Datum*) * ndims);
+ int *counts = (int*)palloc0(sizeof(int) * ndims);
+
+ /* info about dimensions (for deserialize) */
+ DimensionInfo * info
+ = (DimensionInfo *)palloc0(sizeof(DimensionInfo)*ndims);
+
+ /* sort support data */
+ SortSupport ssup = (SortSupport)palloc0(sizeof(SortSupportData)*ndims);
+
+ /* collect and deduplicate values for each dimension */
+ for (i = 0; i < ndims; i++)
+ {
+ int count;
+ StdAnalyzeData *tmp = (StdAnalyzeData *)stats[i]->extra_data;
+
+ /* keep important info about the data type */
+ info[i].typlen = stats[i]->attrtype->typlen;
+ info[i].typbyval = stats[i]->attrtype->typbyval;
+
+ /* allocate space for all values, including NULLs (won't use them) */
+ values[i] = (Datum*)palloc0(sizeof(Datum) * mcvlist->nitems);
+
+ for (j = 0; j < mcvlist->nitems; j++)
+ {
+ if (! mcvlist->items[j]->isnull[i]) /* skip NULL values */
+ {
+ values[i][counts[i]] = mcvlist->items[j]->values[i];
+ counts[i] += 1;
+ }
+ }
+
+ /* there are just NULL values in this dimension */
+ if (counts[i] == 0)
+ continue;
+
+ /* sort and deduplicate */
+ ssup[i].ssup_cxt = CurrentMemoryContext;
+ ssup[i].ssup_collation = DEFAULT_COLLATION_OID;
+ ssup[i].ssup_nulls_first = false;
+
+ PrepareSortSupportFromOrderingOp(tmp->ltopr, &ssup[i]);
+
+ qsort_arg(values[i], counts[i], sizeof(Datum),
+ compare_scalars_simple, &ssup[i]);
+
+ /*
+ * Walk through the array and eliminate duplicate values, but
+ * keep the ordering (so that we can do bsearch later). We know
+ * there's at least 1 item, so we can skip the first element.
+ */
+ count = 1; /* number of deduplicated items */
+ for (j = 1; j < counts[i]; j++)
+ {
+ /* if it's different from the previous value, we need to keep it */
+ if (compare_datums_simple(values[i][j-1], values[i][j], &ssup[i]) != 0)
+ {
+ /* XXX: not needed if (count == j) */
+ values[i][count] = values[i][j];
+ count += 1;
+ }
+ }
+
+ /* do not exceed UINT16_MAX */
+ Assert(count <= UINT16_MAX);
+
+ /* keep info about the deduplicated count */
+ info[i].nvalues = count;
+
+ /* compute size of the serialized data */
+ if (info[i].typbyval || (info[i].typlen > 0))
+ /* passed by value, or by reference but with a fixed length */
+ info[i].nbytes = info[i].nvalues * info[i].typlen;
+ else if (info[i].typlen == -1)
+ /* varlena, so just use VARSIZE_ANY */
+ for (j = 0; j < info[i].nvalues; j++)
+ info[i].nbytes += VARSIZE_ANY(values[i][j]);
+ else if (info[i].typlen == -2)
+ /* cstring, so simply strlen */
+ for (j = 0; j < info[i].nvalues; j++)
+ info[i].nbytes += strlen(DatumGetPointer(values[i][j]));
+ else
+ elog(ERROR, "unknown data type typbyval=%d typlen=%d",
+ info[i].typbyval, info[i].typlen);
+ }
+
+ /*
+ * Now we finally know how much space we'll need for the serialized
+ * MCV list, as it contains these fields:
+ *
+ * - length (4B) for varlena
+ * - magic (4B)
+ * - type (4B)
+ * - ndimensions (4B)
+ * - nitems (4B)
+ * - info (ndim * sizeof(DimensionInfo)
+ * - arrays of values for each dimension
+ * - serialized items (nitems * itemsize)
+ *
+ * So the 'header' size is 20B + ndim * sizeof(DimensionInfo) and
+ * then we'll place the data.
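+ *
+ * For example, with ndims = 2 and nitems = 10, where both dimensions
+ * are int4 with 10 distinct values each, that is a 20B header, plus
+ * 2 * sizeof(DimensionInfo), plus 2 * 10 * 4B of value arrays, plus
+ * 10 * ITEM_SIZE(2) bytes of serialized items.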
+ */
+ total_length = (sizeof(int32) + offsetof(MCVListData, items)
+ + ndims * sizeof(DimensionInfo)
+ + mcvlist->nitems * itemsize);
+
+ for (i = 0; i < ndims; i++)
+ total_length += info[i].nbytes;
+
+ /* enforce arbitrary limit of 1MB */
+ if (total_length > 1024 * 1024)
+ elog(ERROR, "serialized MCV exceeds 1MB (%ld)", total_length);
+
+ /* allocate space for the serialized MCV list, set header fields */
+ output = (bytea*)palloc0(total_length);
+ SET_VARSIZE(output, total_length);
+
+ /* we'll use 'data' to keep track of the place to write to */
+ data = VARDATA(output);
+
+ memcpy(data, mcvlist, offsetof(MCVListData, items));
+ data += offsetof(MCVListData, items);
+
+ memcpy(data, info, sizeof(DimensionInfo) * ndims);
+ data += sizeof(DimensionInfo) * ndims;
+
+ /* value array for each dimension */
+ for (i = 0; i < ndims; i++)
+ {
+#ifdef USE_ASSERT_CHECKING
+ char *tmp = data;
+#endif
+ for (j = 0; j < info[i].nvalues; j++)
+ {
+ if (info[i].typbyval)
+ {
+ /* passed by value / Datum */
+ memcpy(data, &values[i][j], info[i].typlen);
+ data += info[i].typlen;
+ }
+ else if (info[i].typlen > 0)
+ {
+ /* passed by reference, but fixed length (name, tid, ...) */
+ memcpy(data, &values[i][j], info[i].typlen);
+ data += info[i].typlen;
+ }
+ else if (info[i].typlen == -1)
+ {
+ /* varlena */
+ memcpy(data, DatumGetPointer(values[i][j]),
+ VARSIZE_ANY(values[i][j]));
+ data += VARSIZE_ANY(values[i][j]);
+ }
+ else if (info[i].typlen == -2)
+ {
+ /* cstring (don't forget the \0 terminator!) */
+ memcpy(data, DatumGetPointer(values[i][j]),
+ strlen(DatumGetPointer(values[i][j])) + 1);
+ data += strlen(DatumGetPointer(values[i][j])) + 1;
+ }
+ }
+ Assert((data - tmp) == info[i].nbytes);
+ }
+
+ /* and finally, the MCV items */
+ for (i = 0; i < mcvlist->nitems; i++)
+ {
+ /* don't write beyond the allocated space */
+ Assert(data <= (char*)output + total_length - itemsize);
+
+ /* reset the values for each item */
+ memset(item, 0, itemsize);
+
+ for (j = 0; j < ndims; j++)
+ {
+ /* do the lookup only for non-NULL values */
+ if (! mcvlist->items[i]->isnull[j])
+ {
+ Datum * v = NULL;
+ ssup_private = &ssup[j];
+
+ v = (Datum*)bsearch(&mcvlist->items[i]->values[j],
+ values[j], info[j].nvalues, sizeof(Datum),
+ bsearch_comparator);
+
+ if (v == NULL)
+ elog(ERROR, "value for dim %d not found in array", j);
+
+ /* compute index within the array */
+ ITEM_INDEXES(item)[j] = (v - values[j]);
+
+ /* check the index is within expected bounds */
+ Assert(ITEM_INDEXES(item)[j] >= 0);
+ Assert(ITEM_INDEXES(item)[j] < info[j].nvalues);
+ }
+ }
+
+ /* copy NULL and frequency flags into the item */
+ memcpy(ITEM_NULLS(item, ndims),
+ mcvlist->items[i]->isnull, sizeof(bool) * ndims);
+ memcpy(ITEM_FREQUENCY(item, ndims),
+ &mcvlist->items[i]->frequency, sizeof(double));
+
+ /* copy the item into the array */
+ memcpy(data, item, itemsize);
+
+ data += itemsize;
+ }
+
+ /* at this point we expect to match the total_length exactly */
+ Assert((data - (char*)output) == total_length);
+
+ return output;
+}
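+
+/*
+ * XXX A worked example of the deduplication in serialize_mv_mcvlist()
+ * above, for a single int4 dimension across 5 MCV items:
+ *
+ *   collected:    {30, 10, 30, 20, 10}
+ *   sorted:       {10, 10, 20, 30, 30}
+ *   deduplicated: {10, 20, 30}
+ *
+ * The per-item uint16 indexes stored in step (3) are then
+ * {2, 0, 2, 1, 0}, pointing into the deduplicated array.
+ */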
+
+/*
+ * Inverse to serialize_mv_mcvlist() - see the comment there.
+ *
+ * We'll do a full deserialization, because we don't really expect
+ * a high duplication of values, so caching would not be as
+ * efficient as with histograms.
+ */
+MCVList
+deserialize_mv_mcvlist(bytea * data)
+{
+ int i, j;
+ Size expected_size;
+ MCVList mcvlist;
+ char *tmp;
+
+ int ndims, nitems, itemsize;
+ DimensionInfo *info = NULL;
+
+ uint16 *indexes = NULL;
+ Datum **values = NULL;
+
+ /* local allocation buffer (used only for deserialization) */
+ int bufflen;
+ char *buff;
+ char *ptr;
+
+ /* buffer used for the result */
+ int rbufflen;
+ char *rbuff;
+ char *rptr;
+
+ if (data == NULL)
+ return NULL;
+
+ if (VARSIZE_ANY_EXHDR(data) < offsetof(MCVListData,items))
+ elog(ERROR, "invalid MCV Size %ld (expected at least %ld)",
+ VARSIZE_ANY_EXHDR(data), offsetof(MCVListData,items));
+
+ /* read the MCV list header */
+ mcvlist = (MCVList)palloc0(sizeof(MCVListData));
+
+ /* initialize pointer to the data part (skip the varlena header) */
+ tmp = VARDATA(data);
+
+ /* get the header and perform basic sanity checks */
+ memcpy(mcvlist, tmp, offsetof(MCVListData,items));
+ tmp += offsetof(MCVListData,items);
+
+ if (mcvlist->magic != MVSTAT_MCV_MAGIC)
+ elog(ERROR, "invalid MCV magic %d (expected %dd)",
+ mcvlist->magic, MVSTAT_MCV_MAGIC);
+
+ if (mcvlist->type != MVSTAT_MCV_TYPE_BASIC)
+ elog(ERROR, "invalid MCV type %d (expected %dd)",
+ mcvlist->type, MVSTAT_MCV_TYPE_BASIC);
+
+ nitems = mcvlist->nitems;
+ ndims = mcvlist->ndimensions;
+ itemsize = ITEM_SIZE(ndims);
+
+ Assert(nitems > 0);
+ Assert((ndims >= 2) && (ndims <= MVSTATS_MAX_DIMENSIONS));
+
+ /*
+ * Check what size we expect with these parameters. It's still
+ * incomplete, as we have yet to add the sizes of the value
+ * arrays (from the DimensionInfo records).
+ */
+ expected_size = offsetof(MCVListData,items) +
+ ndims * sizeof(DimensionInfo) +
+ (nitems * itemsize);
+
+ /* check that we have at least the DimensionInfo records */
+ if (VARSIZE_ANY_EXHDR(data) < expected_size)
+ elog(ERROR, "invalid MCV Size %ld (expected %ld)",
+ VARSIZE_ANY_EXHDR(data), expected_size);
+
+ info = (DimensionInfo*)(tmp);
+ tmp += ndims * sizeof(DimensionInfo);
+
+ /* account for the value arrays */
+ for (i = 0; i < ndims; i++)
+ expected_size += info[i].nbytes;
+
+ if (VARSIZE_ANY_EXHDR(data) != expected_size)
+ elog(ERROR, "invalid MCV Size %ld (expected %ld)",
+ VARSIZE_ANY_EXHDR(data), expected_size);
+
+ /* looks OK - not corrupted or something */
+
+ /*
+ * We'll allocate one large chunk of memory for the intermediate
+ * data, needed only for deserializing the MCV list, and we'll
+ * use a local dense allocation to minimize the palloc() overhead.
+ *
+ * Let's see how much space we'll actually need, and also include
+ * space for the array with pointers.
+ */
+ bufflen = sizeof(Datum*) * ndims; /* space for pointers */
+
+ for (i = 0; i < ndims; i++)
+ /* for full-size byval types, we reuse the serialized value */
+ if (! (info[i].typbyval && info[i].typlen == sizeof(Datum)))
+ bufflen += (sizeof(Datum) * info[i].nvalues);
+
+ buff = palloc(bufflen);
+ ptr = buff;
+
+ values = (Datum**)buff;
+ ptr += (sizeof(Datum*) * ndims);
+
+ /*
+ * FIXME This uses pointers to the original data array (the types
+ * not passed by value), so when someone frees the memory,
+ * e.g. by doing something like this:
+ *
+ * bytea * data = ... fetch the data from catalog ...
+ * MCVList mcvlist = deserialize_mv_mcvlist(data);
+ * pfree(data);
+ *
+ * then 'mcvlist' references the freed memory. This needs to
+ * copy the pieces.
+ */
+ for (i = 0; i < ndims; i++)
+ {
+ if (info[i].typbyval)
+ {
+ /* passed by value / Datum - simply reuse the array */
+ if (info[i].typlen == sizeof(Datum))
+ {
+ values[i] = (Datum*)tmp;
+ tmp += info[i].nbytes;
+ }
+ else
+ {
+ values[i] = (Datum*)ptr;
+ ptr += (sizeof(Datum) * info[i].nvalues);
+
+ for (j = 0; j < info[i].nvalues; j++)
+ {
+ /* copy the value from the serialized array */
+ memcpy(&values[i][j], tmp, info[i].typlen);
+ tmp += info[i].typlen;
+ }
+ }
+ }
+ else
+ {
+ /* all the varlena data need a chunk from the buffer */
+ values[i] = (Datum*)ptr;
+ ptr += (sizeof(Datum) * info[i].nvalues);
+
+ /* passed by reference, but fixed length (name, tid, ...) */
+ if (info[i].typlen > 0)
+ {
+ for (j = 0; j < info[i].nvalues; j++)
+ {
+ /* just point into the array */
+ values[i][j] = PointerGetDatum(tmp);
+ tmp += info[i].typlen;
+ }
+ }
+ else if (info[i].typlen == -1)
+ {
+ /* varlena */
+ for (j = 0; j < info[i].nvalues; j++)
+ {
+ /* just point into the array */
+ values[i][j] = PointerGetDatum(tmp);
+ tmp += VARSIZE_ANY(tmp);
+ }
+ }
+ else if (info[i].typlen == -2)
+ {
+ /* cstring */
+ for (j = 0; j < info[i].nvalues; j++)
+ {
+ /* just point into the array */
+ values[i][j] = PointerGetDatum(tmp);
+ tmp += (strlen(tmp) + 1); /* don't forget the \0 */
+ }
+ }
+ }
+ }
+
+ /* we should exhaust the buffer exactly */
+ Assert((ptr - buff) == bufflen);
+
+ /* allocate space for the MCV items in a single piece */
+ rbufflen = (sizeof(MCVItem) + sizeof(MCVItemData) +
+ sizeof(Datum)*ndims + sizeof(bool)*ndims) * nitems;
+
+ rbuff = palloc(rbufflen);
+ rptr = rbuff;
+
+ mcvlist->items = (MCVItem*)rbuff;
+ rptr += (sizeof(MCVItem) * nitems);
+
+ for (i = 0; i < nitems; i++)
+ {
+ MCVItem item = (MCVItem)rptr;
+ rptr += (sizeof(MCVItemData));
+
+ item->values = (Datum*)rptr;
+ rptr += (sizeof(Datum)*ndims);
+
+ item->isnull = (bool*)rptr;
+ rptr += (sizeof(bool) *ndims);
+
+ /* just point to the right place */
+ indexes = ITEM_INDEXES(tmp);
+
+ memcpy(item->isnull, ITEM_NULLS(tmp, ndims), sizeof(bool) * ndims);
+ memcpy(&item->frequency, ITEM_FREQUENCY(tmp, ndims), sizeof(double));
+
+#ifdef USE_ASSERT_CHECKING
+ for (j = 0; j < ndims; j++)
+ Assert(indexes[j] <= UINT16_MAX);
+#endif
+
+ /* translate the values */
+ for (j = 0; j < ndims; j++)
+ if (! item->isnull[j])
+ item->values[j] = values[j][indexes[j]];
+
+ mcvlist->items[i] = item;
+
+ tmp += ITEM_SIZE(ndims);
+
+ Assert(tmp <= (char*)data + VARSIZE_ANY(data));
+ }
+
+ /* check that we processed all the data */
+ Assert(tmp == (char*)data + VARSIZE_ANY(data));
+
+ /* release the temporary buffer */
+ pfree(buff);
+
+ return mcvlist;
+}
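+
+/*
+ * XXX The dense allocation pattern used above, in isolation (just a
+ * sketch with a single dimension and made-up sizes): compute the
+ * total size first, palloc() once, then carve the buffer up with a
+ * moving pointer:
+ *
+ *   Size   len  = sizeof(Datum *) * ndims + sizeof(Datum) * nvalues;
+ *   char  *buff = palloc(len);
+ *   char  *ptr  = buff;
+ *
+ *   Datum **values = (Datum **) ptr;
+ *   ptr += sizeof(Datum *) * ndims;
+ *
+ *   values[0] = (Datum *) ptr;
+ *   ptr += sizeof(Datum) * nvalues;
+ *
+ *   Assert(ptr == buff + len);
+ *
+ * A single pfree(buff) then releases all the pieces at once.
+ */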
+
+/*
+ * We need to pass the SortSupport to the comparator, but bsearch()
+ * has no 'context' parameter, so we use a global variable (ugly).
+ */
+static int
+bsearch_comparator(const void * a, const void * b)
+{
+ Assert(ssup_private != NULL);
+ return compare_scalars_simple(a, b, (void*)ssup_private);
+}
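+
+/*
+ * XXX A reentrant alternative to the global variable (a sketch, not
+ * part of the patch): open-code the binary search and pass the
+ * SortSupport explicitly:
+ *
+ *   static Datum *
+ *   bsearch_ssup(Datum *key, Datum *arr, int nvalues, SortSupport ssup)
+ *   {
+ *       int lo = 0, hi = nvalues - 1;
+ *
+ *       while (lo <= hi)
+ *       {
+ *           int mid = lo + (hi - lo) / 2;
+ *           int cmp = compare_scalars_simple(key, &arr[mid], (void *) ssup);
+ *
+ *           if (cmp == 0)
+ *               return &arr[mid];
+ *           else if (cmp < 0)
+ *               hi = mid - 1;
+ *           else
+ *               lo = mid + 1;
+ *       }
+ *       return NULL;
+ *   }
+ */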
+
+/*
+ * SRF with details about items of an MCV list:
+ *
+ * - item ID (0...nitems)
+ * - values (string array)
+ * - null flags (boolean array)
+ * - frequency (double precision)
+ *
+ * The input is the OID of the statistics, and no rows are returned
+ * if the statistics contains no MCV list.
+ */
+PG_FUNCTION_INFO_V1(pg_mv_mcv_items);
+
+Datum
+pg_mv_mcv_items(PG_FUNCTION_ARGS)
+{
+ FuncCallContext *funcctx;
+ int call_cntr;
+ int max_calls;
+ TupleDesc tupdesc;
+ AttInMetadata *attinmeta;
+
+ /* stuff done only on the first call of the function */
+ if (SRF_IS_FIRSTCALL())
+ {
+ MemoryContext oldcontext;
+ MCVList mcvlist;
+
+ /* create a function context for cross-call persistence */
+ funcctx = SRF_FIRSTCALL_INIT();
+
+ /* switch to memory context appropriate for multiple function calls */
+ oldcontext = MemoryContextSwitchTo(funcctx->multi_call_memory_ctx);
+
+ mcvlist = load_mv_mcvlist(PG_GETARG_OID(0));
+
+ funcctx->user_fctx = mcvlist;
+
+ /* total number of tuples to be returned */
+ funcctx->max_calls = 0;
+ if (funcctx->user_fctx != NULL)
+ funcctx->max_calls = mcvlist->nitems;
+
+ /* Build a tuple descriptor for our result type */
+ if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("function returning record called in context "
+ "that cannot accept type record")));
+
+ /*
+ * generate attribute metadata needed later to produce tuples
+ * from raw C strings
+ */
+ attinmeta = TupleDescGetAttInMetadata(tupdesc);
+ funcctx->attinmeta = attinmeta;
+
+ MemoryContextSwitchTo(oldcontext);
+ }
+
+ /* stuff done on every call of the function */
+ funcctx = SRF_PERCALL_SETUP();
+
+ call_cntr = funcctx->call_cntr;
+ max_calls = funcctx->max_calls;
+ attinmeta = funcctx->attinmeta;
+
+ if (call_cntr < max_calls) /* do when there is more left to send */
+ {
+ char **values;
+ HeapTuple tuple;
+ Datum result;
+ int2vector *stakeys;
+ Oid relid;
+
+ char *buff = palloc0(1024);
+ char *format;
+
+ int i;
+
+ Oid *outfuncs;
+ FmgrInfo *fmgrinfo;
+
+ MCVList mcvlist;
+ MCVItem item;
+
+ mcvlist = (MCVList)funcctx->user_fctx;
+
+ Assert(call_cntr < mcvlist->nitems);
+
+ item = mcvlist->items[call_cntr];
+
+ stakeys = find_mv_attnums(PG_GETARG_OID(0), &relid);
+
+ /*
+ * Prepare a values array for building the returned tuple.
+ * This should be an array of C strings which will
+ * be processed later by the type input functions.
+ */
+ values = (char **) palloc(4 * sizeof(char *));
+
+ values[0] = (char *) palloc(64 * sizeof(char));
+
+ /* arrays */
+ values[1] = (char *) palloc0(1024 * sizeof(char));
+ values[2] = (char *) palloc0(1024 * sizeof(char));
+
+ /* frequency */
+ values[3] = (char *) palloc(64 * sizeof(char));
+
+ outfuncs = (Oid*)palloc0(sizeof(Oid) * mcvlist->ndimensions);
+ fmgrinfo = (FmgrInfo*)palloc0(sizeof(FmgrInfo) * mcvlist->ndimensions);
+
+ for (i = 0; i < mcvlist->ndimensions; i++)
+ {
+ bool isvarlena;
+
+ getTypeOutputInfo(get_atttype(relid, stakeys->values[i]),
+ &outfuncs[i], &isvarlena);
+
+ fmgr_info(outfuncs[i], &fmgrinfo[i]);
+ }
+
+ snprintf(values[0], 64, "%d", call_cntr); /* item ID */
+
+ for (i = 0; i < mcvlist->ndimensions; i++)
+ {
+ Datum val, valout;
+
+ format = "%s, %s";
+ if (i == 0)
+ format = "{%s%s";
+ else if (i == mcvlist->ndimensions-1)
+ format = "%s, %s}";
+
+ val = item->values[i];
+ valout = FunctionCall1(&fmgrinfo[i], val);
+
+ snprintf(buff, 1024, format, values[1], DatumGetPointer(valout));
+ strncpy(values[1], buff, 1023);
+ buff[0] = '\0';
+
+ snprintf(buff, 1024, format, values[2], item->isnull[i] ? "t" : "f");
+ strncpy(values[2], buff, 1023);
+ buff[0] = '\0';
+ }
+
+ snprintf(values[3], 64, "%f", item->frequency); /* frequency */
+
+ /* build a tuple */
+ tuple = BuildTupleFromCStrings(attinmeta, values);
+
+ /* make the tuple into a datum */
+ result = HeapTupleGetDatum(tuple);
+
+ /* clean up (this is not really necessary) */
+ pfree(values[0]);
+ pfree(values[1]);
+ pfree(values[2]);
+ pfree(values[3]);
+
+ pfree(values);
+
+ SRF_RETURN_NEXT(funcctx, result);
+ }
+ else /* do when there is no more left */
+ {
+ SRF_RETURN_DONE(funcctx);
+ }
+}
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 590cd51..7d13a38 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -2109,8 +2109,9 @@ describeOneTableDetails(const char *schemaname,
{
printfPQExpBuffer(&buf,
"SELECT oid, stanamespace::regnamespace AS nsp, staname, stakeys,\n"
- " deps_enabled,\n"
- " deps_built,\n"
+ " deps_enabled, mcv_enabled,\n"
+ " deps_built, mcv_built,\n"
+ " mcv_max_items,\n"
" (SELECT string_agg(attname::text,', ')\n"
" FROM ((SELECT unnest(stakeys) AS attnum) s\n"
" JOIN pg_attribute a ON (starelid = a.attrelid and a.attnum = s.attnum))) AS attnums\n"
@@ -2128,6 +2129,8 @@ describeOneTableDetails(const char *schemaname,
printTableAddFooter(&cont, _("Statistics:"));
for (i = 0; i < tuples; i++)
{
+ bool first = true;
+
printfPQExpBuffer(&buf, " ");
/* statistics name (qualified with namespace) */
@@ -2137,10 +2140,22 @@ describeOneTableDetails(const char *schemaname,
/* options */
if (!strcmp(PQgetvalue(result, i, 4), "t"))
- appendPQExpBuffer(&buf, "(dependencies)");
+ {
+ appendPQExpBuffer(&buf, "(dependencies");
+ first = false;
+ }
+
+ if (!strcmp(PQgetvalue(result, i, 5), "t"))
+ {
+ if (! first)
+ appendPQExpBuffer(&buf, ", mcv");
+ else
+ appendPQExpBuffer(&buf, "(mcv");
+ first = false;
+ }
- appendPQExpBuffer(&buf, " ON (%s)",
- PQgetvalue(result, i, 6));
+ appendPQExpBuffer(&buf, ") ON (%s)",
+ PQgetvalue(result, i, 9));
printTableAddFooter(&cont, buf.data);
}
diff --git a/src/include/catalog/pg_mv_statistic.h b/src/include/catalog/pg_mv_statistic.h
index a568a07..fd7107d 100644
--- a/src/include/catalog/pg_mv_statistic.h
+++ b/src/include/catalog/pg_mv_statistic.h
@@ -37,15 +37,21 @@ CATALOG(pg_mv_statistic,3381)
/* statistics requested to build */
bool deps_enabled; /* analyze dependencies? */
+ bool mcv_enabled; /* build MCV list? */
+
+ /* MCV size */
+ int32 mcv_max_items; /* max MCV items */
/* statistics that are available (if requested) */
bool deps_built; /* dependencies were built */
+ bool mcv_built; /* MCV list was built */
/* variable-length fields start here, but we allow direct access to stakeys */
int2vector stakeys; /* array of column keys */
#ifdef CATALOG_VARLEN
bytea stadeps; /* dependencies (serialized) */
+ bytea stamcv; /* MCV list (serialized) */
#endif
} FormData_pg_mv_statistic;
@@ -61,13 +67,17 @@ typedef FormData_pg_mv_statistic *Form_pg_mv_statistic;
* compiler constants for pg_mv_statistic
* ----------------
*/
-#define Natts_pg_mv_statistic 7
+#define Natts_pg_mv_statistic 11
#define Anum_pg_mv_statistic_starelid 1
#define Anum_pg_mv_statistic_staname 2
#define Anum_pg_mv_statistic_stanamespace 3
#define Anum_pg_mv_statistic_deps_enabled 4
-#define Anum_pg_mv_statistic_deps_built 5
-#define Anum_pg_mv_statistic_stakeys 6
-#define Anum_pg_mv_statistic_stadeps 7
+#define Anum_pg_mv_statistic_mcv_enabled 5
+#define Anum_pg_mv_statistic_mcv_max_items 6
+#define Anum_pg_mv_statistic_deps_built 7
+#define Anum_pg_mv_statistic_mcv_built 8
+#define Anum_pg_mv_statistic_stakeys 9
+#define Anum_pg_mv_statistic_stadeps 10
+#define Anum_pg_mv_statistic_stamcv 11
#endif /* PG_MV_STATISTIC_H */
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
index 76e054d..1875e26 100644
--- a/src/include/catalog/pg_proc.h
+++ b/src/include/catalog/pg_proc.h
@@ -2745,6 +2745,10 @@ DATA(insert OID = 3998 ( pg_mv_stats_dependencies_info PGNSP PGUID 12 1 0 0
DESCR("multivariate stats: functional dependencies info");
DATA(insert OID = 3999 ( pg_mv_stats_dependencies_show PGNSP PGUID 12 1 0 0 0 f f f f t f i s 1 0 25 "17" _null_ _null_ _null_ _null_ _null_ pg_mv_stats_dependencies_show _null_ _null_ _null_ ));
DESCR("multivariate stats: functional dependencies show");
+DATA(insert OID = 3376 ( pg_mv_stats_mcvlist_info PGNSP PGUID 12 1 0 0 0 f f f f t f i s 1 0 25 "17" _null_ _null_ _null_ _null_ _null_ pg_mv_stats_mcvlist_info _null_ _null_ _null_ ));
+DESCR("multi-variate statistics: MCV list info");
+DATA(insert OID = 3373 ( pg_mv_mcv_items PGNSP PGUID 12 1 1000 0 0 f f f f t t i s 1 0 2249 "26" "{26,23,1009,1000,701}" "{i,o,o,o,o}" "{oid,index,values,nulls,frequency}" _null_ _null_ pg_mv_mcv_items _null_ _null_ _null_ ));
+DESCR("details about MCV list items");
DATA(insert OID = 1928 ( pg_stat_get_numscans PGNSP PGUID 12 1 0 0 0 f f f f t f s r 1 0 20 "26" _null_ _null_ _null_ _null_ _null_ pg_stat_get_numscans _null_ _null_ _null_ ));
DESCR("statistics: number of scans done for table/index");
diff --git a/src/include/nodes/relation.h b/src/include/nodes/relation.h
index 7ae0f9e..d3c9898 100644
--- a/src/include/nodes/relation.h
+++ b/src/include/nodes/relation.h
@@ -592,9 +592,11 @@ typedef struct MVStatisticInfo
/* enabled statistics */
bool deps_enabled; /* functional dependencies enabled */
+ bool mcv_enabled; /* MCV list enabled */
/* built/available statistics */
bool deps_built; /* functional dependencies built */
+ bool mcv_built; /* MCV list built */
/* columns in the statistics (attnums) */
int2vector *stakeys; /* attnums of the columns covered */
diff --git a/src/include/utils/mvstats.h b/src/include/utils/mvstats.h
index cc43a79..4535db7 100644
--- a/src/include/utils/mvstats.h
+++ b/src/include/utils/mvstats.h
@@ -51,30 +51,89 @@ typedef MVDependenciesData* MVDependencies;
#define MVSTAT_DEPS_TYPE_BASIC 1 /* basic dependencies type */
/*
+ * Multivariate MCV (most-common value) lists
+ *
+ * A straight-forward extension of MCV items - i.e. a list (array) of
+ * combinations of attribute values, together with a frequency and
+ * null flags.
+ */
+typedef struct MCVItemData {
+ double frequency; /* frequency of this combination */
+ bool *isnull; /* flags of NULL values (up to 32 columns) */
+ Datum *values; /* variable-length (ndimensions) */
+} MCVItemData;
+
+typedef MCVItemData *MCVItem;
+
+/* multivariate MCV list - essentially an array of MCV items */
+typedef struct MCVListData {
+ uint32 magic; /* magic constant marker */
+ uint32 type; /* type of MCV list (BASIC) */
+ uint32 ndimensions; /* number of dimensions */
+ uint32 nitems; /* number of MCV items in the array */
+ MCVItem *items; /* array of MCV items */
+} MCVListData;
+
+typedef MCVListData *MCVList;
+
+/* used to flag stats serialized to bytea */
+#define MVSTAT_MCV_MAGIC 0xE1A651C2 /* marks serialized bytea */
+#define MVSTAT_MCV_TYPE_BASIC 1 /* basic MCV list type */
+
+/*
+ * Limits used for mcv_max_items option, i.e. we're always guaranteed
+ * to have space for at least MVSTAT_MCVLIST_MIN_ITEMS, and we cannot
+ * have more than MVSTAT_MCVLIST_MAX_ITEMS items.
+ *
+ * This is just a boundary for the 'max' threshold - the actual list
+ * may of course contain fewer items than MVSTAT_MCVLIST_MIN_ITEMS.
+ */
+#define MVSTAT_MCVLIST_MIN_ITEMS 128 /* min items in MCV list */
+#define MVSTAT_MCVLIST_MAX_ITEMS 8192 /* max items in MCV list */
+
+/*
* TODO Maybe fetching the histogram/MCV list separately is inefficient?
* Consider adding a single `fetch_stats` method, fetching all
* stats specified using flags (or something like that).
*/
MVDependencies load_mv_dependencies(Oid mvoid);
+MCVList load_mv_mcvlist(Oid mvoid);
bytea * serialize_mv_dependencies(MVDependencies dependencies);
+bytea * serialize_mv_mcvlist(MCVList mcvlist, int2vector *attrs,
+ VacAttrStats **stats);
/* deserialization of stats (serialization is private to analyze) */
MVDependencies deserialize_mv_dependencies(bytea * data);
+MCVList deserialize_mv_mcvlist(bytea * data);
+
+/*
+ * Returns index of the attribute number within the vector (i.e. a
+ * dimension within the stats).
+ */
+int mv_get_index(AttrNumber varattno, int2vector * stakeys);
+
+int2vector* find_mv_attnums(Oid mvoid, Oid *relid);
/* FIXME this probably belongs somewhere else (not to operations stats) */
extern Datum pg_mv_stats_dependencies_info(PG_FUNCTION_ARGS);
extern Datum pg_mv_stats_dependencies_show(PG_FUNCTION_ARGS);
+extern Datum pg_mv_stats_mcvlist_info(PG_FUNCTION_ARGS);
+extern Datum pg_mv_mcv_items(PG_FUNCTION_ARGS);
MVDependencies
-build_mv_dependencies(int numrows, HeapTuple *rows,
- int2vector *attrs,
- VacAttrStats **stats);
+build_mv_dependencies(int numrows, HeapTuple *rows, int2vector *attrs,
+ VacAttrStats **stats);
+
+MCVList
+build_mv_mcvlist(int numrows, HeapTuple *rows, int2vector *attrs,
+ VacAttrStats **stats, int *numrows_filtered);
void build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
- int natts, VacAttrStats **vacattrstats);
+ int natts, VacAttrStats **vacattrstats);
-void update_mv_stats(Oid relid, MVDependencies dependencies, int2vector *attrs);
+void update_mv_stats(Oid relid, MVDependencies dependencies, MCVList mcvlist,
+ int2vector *attrs, VacAttrStats **stats);
#endif
diff --git a/src/test/regress/expected/mv_mcv.out b/src/test/regress/expected/mv_mcv.out
new file mode 100644
index 0000000..56748e3
--- /dev/null
+++ b/src/test/regress/expected/mv_mcv.out
@@ -0,0 +1,207 @@
+-- data type passed by value
+CREATE TABLE mcv_list (
+ a INT,
+ b INT,
+ c INT
+);
+-- unknown column
+CREATE STATISTICS s1 ON mcv_list (unknown_column) WITH (mcv);
+ERROR: column "unknown_column" referenced in statistics does not exist
+-- single column
+CREATE STATISTICS s1 ON mcv_list (a) WITH (mcv);
+ERROR: multivariate stats require 2 or more columns
+-- single column, duplicated
+CREATE STATISTICS s1 ON mcv_list (a, a) WITH (mcv);
+ERROR: duplicate column name in statistics definition
+-- two columns, one duplicated
+CREATE STATISTICS s1 ON mcv_list (a, a, b) WITH (mcv);
+ERROR: duplicate column name in statistics definition
+-- unknown option
+CREATE STATISTICS s1 ON mcv_list (a, b, c) WITH (unknown_option);
+ERROR: unrecognized STATISTICS option "unknown_option"
+-- missing MCV statistics
+CREATE STATISTICS s1 ON mcv_list (a, b, c) WITH (dependencies, max_mcv_items=200);
+ERROR: option 'mcv' is required by other options(s)
+-- invalid mcv_max_items value / too low
+CREATE STATISTICS s1 ON mcv_list (a, b, c) WITH (mcv, max_mcv_items=10);
+ERROR: max number of MCV items must be at least 128
+-- invalid mcv_max_items value / too high
+CREATE STATISTICS s1 ON mcv_list (a, b, c) WITH (mcv, max_mcv_items=10000);
+ERROR: max number of MCV items is 8192
+-- correct command
+CREATE STATISTICS s1 ON mcv_list (a, b, c) WITH (mcv);
+-- random data
+INSERT INTO mcv_list
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+ANALYZE mcv_list;
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+ mcv_enabled | mcv_built | pg_mv_stats_mcvlist_info
+-------------+-----------+--------------------------
+ t | f |
+(1 row)
+
+TRUNCATE mcv_list;
+-- a => b, a => c, b => c
+INSERT INTO mcv_list
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE mcv_list;
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+ mcv_enabled | mcv_built | pg_mv_stats_mcvlist_info
+-------------+-----------+--------------------------
+ t | t | nitems=1000
+(1 row)
+
+TRUNCATE mcv_list;
+-- a => b, a => c
+INSERT INTO mcv_list
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE mcv_list;
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+ mcv_enabled | mcv_built | pg_mv_stats_mcvlist_info
+-------------+-----------+--------------------------
+ t | t | nitems=1000
+(1 row)
+
+TRUNCATE mcv_list;
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO mcv_list
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX mcv_idx ON mcv_list (a, b);
+ANALYZE mcv_list;
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+ mcv_enabled | mcv_built | pg_mv_stats_mcvlist_info
+-------------+-----------+--------------------------
+ t | t | nitems=100
+(1 row)
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mcv_list WHERE a = 10 AND b = 5;
+ QUERY PLAN
+--------------------------------------------
+ Bitmap Heap Scan on mcv_list
+ Recheck Cond: ((a = 10) AND (b = 5))
+ -> Bitmap Index Scan on mcv_idx
+ Index Cond: ((a = 10) AND (b = 5))
+(4 rows)
+
+DROP TABLE mcv_list;
+-- varlena type (text)
+CREATE TABLE mcv_list (
+ a TEXT,
+ b TEXT,
+ c TEXT
+);
+CREATE STATISTICS s2 ON mcv_list (a, b, c) WITH (mcv);
+-- random data
+INSERT INTO mcv_list
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+ANALYZE mcv_list;
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+ mcv_enabled | mcv_built | pg_mv_stats_mcvlist_info
+-------------+-----------+--------------------------
+ t | f |
+(1 row)
+
+TRUNCATE mcv_list;
+-- a => b, a => c, b => c
+INSERT INTO mcv_list
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE mcv_list;
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+ mcv_enabled | mcv_built | pg_mv_stats_mcvlist_info
+-------------+-----------+--------------------------
+ t | t | nitems=1000
+(1 row)
+
+TRUNCATE mcv_list;
+-- a => b, a => c
+INSERT INTO mcv_list
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE mcv_list;
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+ mcv_enabled | mcv_built | pg_mv_stats_mcvlist_info
+-------------+-----------+--------------------------
+ t | t | nitems=1000
+(1 row)
+
+TRUNCATE mcv_list;
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO mcv_list
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX mcv_idx ON mcv_list (a, b);
+ANALYZE mcv_list;
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+ mcv_enabled | mcv_built | pg_mv_stats_mcvlist_info
+-------------+-----------+--------------------------
+ t | t | nitems=100
+(1 row)
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mcv_list WHERE a = '10' AND b = '5';
+ QUERY PLAN
+------------------------------------------------------------
+ Bitmap Heap Scan on mcv_list
+ Recheck Cond: ((a = '10'::text) AND (b = '5'::text))
+ -> Bitmap Index Scan on mcv_idx
+ Index Cond: ((a = '10'::text) AND (b = '5'::text))
+(4 rows)
+
+TRUNCATE mcv_list;
+-- check explain (expect bitmap index scan, not plain index scan) with NULLs
+INSERT INTO mcv_list
+ SELECT
+ (CASE WHEN i/10000 = 0 THEN NULL ELSE i/10000 END),
+ (CASE WHEN i/20000 = 0 THEN NULL ELSE i/20000 END),
+ (CASE WHEN i/40000 = 0 THEN NULL ELSE i/40000 END)
+ FROM generate_series(1,1000000) s(i);
+ANALYZE mcv_list;
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+ mcv_enabled | mcv_built | pg_mv_stats_mcvlist_info
+-------------+-----------+--------------------------
+ t | t | nitems=100
+(1 row)
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mcv_list WHERE a IS NULL AND b IS NULL;
+ QUERY PLAN
+---------------------------------------------------
+ Bitmap Heap Scan on mcv_list
+ Recheck Cond: ((a IS NULL) AND (b IS NULL))
+ -> Bitmap Index Scan on mcv_idx
+ Index Cond: ((a IS NULL) AND (b IS NULL))
+(4 rows)
+
+DROP TABLE mcv_list;
+-- NULL values (mix of int and text columns)
+CREATE TABLE mcv_list (
+ a INT,
+ b TEXT,
+ c INT,
+ d TEXT
+);
+CREATE STATISTICS s3 ON mcv_list (a, b, c, d) WITH (mcv);
+INSERT INTO mcv_list
+ SELECT
+ mod(i, 100),
+ (CASE WHEN mod(i, 200) = 0 THEN NULL ELSE mod(i,200) END),
+ mod(i, 400),
+ (CASE WHEN mod(i, 300) = 0 THEN NULL ELSE mod(i,600) END)
+ FROM generate_series(1,10000) s(i);
+ANALYZE mcv_list;
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+ mcv_enabled | mcv_built | pg_mv_stats_mcvlist_info
+-------------+-----------+--------------------------
+ t | t | nitems=1200
+(1 row)
+
+DROP TABLE mcv_list;
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 2e2df8e..ac5007e 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1369,7 +1369,9 @@ pg_mv_stats| SELECT n.nspname AS schemaname,
c.relname AS tablename,
s.stakeys AS attnums,
length(s.stadeps) AS depsbytes,
- pg_mv_stats_dependencies_info(s.stadeps) AS depsinfo
+ pg_mv_stats_dependencies_info(s.stadeps) AS depsinfo,
+ length(s.stamcv) AS mcvbytes,
+ pg_mv_stats_mcvlist_info(s.stamcv) AS mcvinfo
FROM ((pg_mv_statistic s
JOIN pg_class c ON ((c.oid = s.starelid)))
LEFT JOIN pg_namespace n ON ((n.oid = c.relnamespace)));
diff --git a/src/test/regress/parallel_schedule b/src/test/regress/parallel_schedule
index 81484f1..838c12b 100644
--- a/src/test/regress/parallel_schedule
+++ b/src/test/regress/parallel_schedule
@@ -112,4 +112,4 @@ test: event_trigger
test: stats
# run tests of multivariate stats
-test: mv_dependencies
+test: mv_dependencies mv_mcv
diff --git a/src/test/regress/serial_schedule b/src/test/regress/serial_schedule
index 14ea574..d97a0ec 100644
--- a/src/test/regress/serial_schedule
+++ b/src/test/regress/serial_schedule
@@ -162,3 +162,4 @@ test: xml
test: event_trigger
test: stats
test: mv_dependencies
+test: mv_mcv
diff --git a/src/test/regress/sql/mv_mcv.sql b/src/test/regress/sql/mv_mcv.sql
new file mode 100644
index 0000000..af4c9f4
--- /dev/null
+++ b/src/test/regress/sql/mv_mcv.sql
@@ -0,0 +1,178 @@
+-- data type passed by value
+CREATE TABLE mcv_list (
+ a INT,
+ b INT,
+ c INT
+);
+
+-- unknown column
+CREATE STATISTICS s1 ON mcv_list (unknown_column) WITH (mcv);
+
+-- single column
+CREATE STATISTICS s1 ON mcv_list (a) WITH (mcv);
+
+-- single column, duplicated
+CREATE STATISTICS s1 ON mcv_list (a, a) WITH (mcv);
+
+-- two columns, one duplicated
+CREATE STATISTICS s1 ON mcv_list (a, a, b) WITH (mcv);
+
+-- unknown option
+CREATE STATISTICS s1 ON mcv_list (a, b, c) WITH (unknown_option);
+
+-- missing MCV statistics
+CREATE STATISTICS s1 ON mcv_list (a, b, c) WITH (dependencies, max_mcv_items=200);
+
+-- invalid mcv_max_items value / too low
+CREATE STATISTICS s1 ON mcv_list (a, b, c) WITH (mcv, max_mcv_items=10);
+
+-- invalid mcv_max_items value / too high
+CREATE STATISTICS s1 ON mcv_list (a, b, c) WITH (mcv, max_mcv_items=10000);
+
+-- correct command
+CREATE STATISTICS s1 ON mcv_list (a, b, c) WITH (mcv);
+
+-- random data
+INSERT INTO mcv_list
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+
+ANALYZE mcv_list;
+
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+
+TRUNCATE mcv_list;
+
+-- a => b, a => c, b => c
+INSERT INTO mcv_list
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+
+ANALYZE mcv_list;
+
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+
+TRUNCATE mcv_list;
+
+-- a => b, a => c
+INSERT INTO mcv_list
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+
+ANALYZE mcv_list;
+
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+
+TRUNCATE mcv_list;
+
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO mcv_list
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX mcv_idx ON mcv_list (a, b);
+
+ANALYZE mcv_list;
+
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mcv_list WHERE a = 10 AND b = 5;
+
+DROP TABLE mcv_list;
+
+-- varlena type (text)
+CREATE TABLE mcv_list (
+ a TEXT,
+ b TEXT,
+ c TEXT
+);
+
+CREATE STATISTICS s2 ON mcv_list (a, b, c) WITH (mcv);
+
+-- random data
+INSERT INTO mcv_list
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+
+ANALYZE mcv_list;
+
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+
+TRUNCATE mcv_list;
+
+-- a => b, a => c, b => c
+INSERT INTO mcv_list
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+
+ANALYZE mcv_list;
+
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+
+TRUNCATE mcv_list;
+
+-- a => b, a => c
+INSERT INTO mcv_list
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE mcv_list;
+
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+
+TRUNCATE mcv_list;
+
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO mcv_list
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX mcv_idx ON mcv_list (a, b);
+ANALYZE mcv_list;
+
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mcv_list WHERE a = '10' AND b = '5';
+
+TRUNCATE mcv_list;
+
+-- check explain (expect bitmap index scan, not plain index scan) with NULLs
+INSERT INTO mcv_list
+ SELECT
+ (CASE WHEN i/10000 = 0 THEN NULL ELSE i/10000 END),
+ (CASE WHEN i/20000 = 0 THEN NULL ELSE i/20000 END),
+ (CASE WHEN i/40000 = 0 THEN NULL ELSE i/40000 END)
+ FROM generate_series(1,1000000) s(i);
+ANALYZE mcv_list;
+
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mcv_list WHERE a IS NULL AND b IS NULL;
+
+DROP TABLE mcv_list;
+
+-- NULL values (mix of int and text columns)
+CREATE TABLE mcv_list (
+ a INT,
+ b TEXT,
+ c INT,
+ d TEXT
+);
+
+CREATE STATISTICS s3 ON mcv_list (a, b, c, d) WITH (mcv);
+
+INSERT INTO mcv_list
+ SELECT
+ mod(i, 100),
+ (CASE WHEN mod(i, 200) = 0 THEN NULL ELSE mod(i,200) END),
+ mod(i, 400),
+ (CASE WHEN mod(i, 300) = 0 THEN NULL ELSE mod(i,600) END)
+ FROM generate_series(1,10000) s(i);
+
+ANALYZE mcv_list;
+
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+
+DROP TABLE mcv_list;
--
2.1.0
Attachment: 0005-multivariate-histograms.patch (application/x-patch)
From ff5b8b94fc19654a7fe98b0701d89af668388313 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tv@fuzzy.cz>
Date: Sun, 11 Jan 2015 20:18:24 +0100
Subject: [PATCH 5/7] multivariate histograms
- extends the pg_mv_statistic catalog (add 'hist' fields)
- building the histograms during ANALYZE
- simple estimation while planning the queries
Includes regression tests mostly equal to those for functional
dependencies / MCV lists.
---
doc/src/sgml/ref/create_statistics.sgml | 18 +
src/backend/catalog/system_views.sql | 4 +-
src/backend/commands/statscmds.c | 44 +-
src/backend/nodes/outfuncs.c | 2 +
src/backend/optimizer/path/clausesel.c | 718 ++++++++-
src/backend/optimizer/util/plancat.c | 4 +-
src/backend/utils/mvstats/Makefile | 2 +-
src/backend/utils/mvstats/common.c | 37 +-
src/backend/utils/mvstats/histogram.c | 2316 ++++++++++++++++++++++++++++
src/bin/psql/describe.c | 17 +-
src/include/catalog/pg_mv_statistic.h | 24 +-
src/include/catalog/pg_proc.h | 4 +
src/include/nodes/relation.h | 2 +
src/include/utils/mvstats.h | 136 +-
src/test/regress/expected/mv_histogram.out | 207 +++
src/test/regress/expected/rules.out | 4 +-
src/test/regress/parallel_schedule | 2 +-
src/test/regress/serial_schedule | 1 +
src/test/regress/sql/mv_histogram.sql | 176 +++
19 files changed, 3680 insertions(+), 38 deletions(-)
create mode 100644 src/backend/utils/mvstats/histogram.c
create mode 100644 src/test/regress/expected/mv_histogram.out
create mode 100644 src/test/regress/sql/mv_histogram.sql
diff --git a/doc/src/sgml/ref/create_statistics.sgml b/doc/src/sgml/ref/create_statistics.sgml
index 193e4b0..fd3382e 100644
--- a/doc/src/sgml/ref/create_statistics.sgml
+++ b/doc/src/sgml/ref/create_statistics.sgml
@@ -133,6 +133,24 @@ CREATE STATISTICS [ IF NOT EXISTS ] <replaceable class="PARAMETER">statistics_na
</varlistentry>
<varlistentry>
+ <term><literal>histogram</> (<type>boolean</>)</term>
+ <listitem>
+ <para>
+ Enables histogram for the statistics.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><literal>max_buckets</> (<type>integer</>)</term>
+ <listitem>
+ <para>
+ Maximum number of histogram buckets.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
<term><literal>max_mcv_items</> (<type>integer</>)</term>
<listitem>
<para>
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 5488061..0fbdfa5 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -167,7 +167,9 @@ CREATE VIEW pg_mv_stats AS
length(S.stadeps) as depsbytes,
pg_mv_stats_dependencies_info(S.stadeps) as depsinfo,
length(S.stamcv) AS mcvbytes,
- pg_mv_stats_mcvlist_info(S.stamcv) AS mcvinfo
+ pg_mv_stats_mcvlist_info(S.stamcv) AS mcvinfo,
+ length(S.stahist) AS histbytes,
+ pg_mv_stats_histogram_info(S.stahist) AS histinfo
FROM (pg_mv_statistic S JOIN pg_class C ON (C.oid = S.starelid))
LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace);
diff --git a/src/backend/commands/statscmds.c b/src/backend/commands/statscmds.c
index 90bfaed..b974655 100644
--- a/src/backend/commands/statscmds.c
+++ b/src/backend/commands/statscmds.c
@@ -137,12 +137,15 @@ CreateStatistics(CreateStatsStmt *stmt)
/* by default build nothing */
bool build_dependencies = false,
- build_mcv = false;
+ build_mcv = false,
+ build_histogram = false;
- int32 max_mcv_items = -1;
+ int32 max_buckets = -1,
+ max_mcv_items = -1;
/* options required because of other options */
- bool require_mcv = false;
+ bool require_mcv = false,
+ require_histogram = false;
Assert(IsA(stmt, CreateStatsStmt));
@@ -241,6 +244,29 @@ CreateStatistics(CreateStatsStmt *stmt)
MVSTAT_MCVLIST_MAX_ITEMS)));
}
+ else if (strcmp(opt->defname, "histogram") == 0)
+ build_histogram = defGetBoolean(opt);
+ else if (strcmp(opt->defname, "max_buckets") == 0)
+ {
+ max_buckets = defGetInt32(opt);
+
+ /* this option requires 'histogram' to be enabled */
+ require_histogram = true;
+
+ /* sanity check */
+ if (max_buckets < MVSTAT_HIST_MIN_BUCKETS)
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("minimum number of buckets is %d",
+ MVSTAT_HIST_MIN_BUCKETS)));
+
+ else if (max_buckets > MVSTAT_HIST_MAX_BUCKETS)
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("maximum number of buckets is %d",
+ MVSTAT_HIST_MAX_BUCKETS)));
+
+ }
else
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
@@ -249,10 +275,10 @@ CreateStatistics(CreateStatsStmt *stmt)
}
/* check that at least some statistics were requested */
- if (! (build_dependencies || build_mcv))
+ if (! (build_dependencies || build_mcv || build_histogram))
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
- errmsg("no statistics type (dependencies, mcv) was requested")));
+ errmsg("no statistics type (dependencies, mcv, histogram) was requested")));
/* now do some checking of the options */
if (require_mcv && (! build_mcv))
@@ -260,6 +286,11 @@ CreateStatistics(CreateStatsStmt *stmt)
(errcode(ERRCODE_SYNTAX_ERROR),
errmsg("option 'mcv' is required by other options(s)")));
+ if (require_histogram && (! build_histogram))
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("option 'histogram' is required by other options(s)")));
+
/* sort the attnums and build int2vector */
qsort(attnums, numcols, sizeof(int16), compare_int16);
stakeys = buildint2vector(attnums, numcols);
@@ -279,11 +310,14 @@ CreateStatistics(CreateStatsStmt *stmt)
values[Anum_pg_mv_statistic_deps_enabled -1] = BoolGetDatum(build_dependencies);
values[Anum_pg_mv_statistic_mcv_enabled -1] = BoolGetDatum(build_mcv);
+ values[Anum_pg_mv_statistic_hist_enabled -1] = BoolGetDatum(build_histogram);
values[Anum_pg_mv_statistic_mcv_max_items -1] = Int32GetDatum(max_mcv_items);
+ values[Anum_pg_mv_statistic_hist_max_buckets -1] = Int32GetDatum(max_buckets);
nulls[Anum_pg_mv_statistic_stadeps -1] = true;
nulls[Anum_pg_mv_statistic_stamcv -1] = true;
+ nulls[Anum_pg_mv_statistic_stahist -1] = true;
/* insert the tuple into pg_mv_statistic */
mvstatrel = heap_open(MvStatisticRelationId, RowExclusiveLock);
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index 9e029ef..0edc839 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -1949,10 +1949,12 @@ _outMVStatisticInfo(StringInfo str, const MVStatisticInfo *node)
/* enabled statistics */
WRITE_BOOL_FIELD(deps_enabled);
WRITE_BOOL_FIELD(mcv_enabled);
+ WRITE_BOOL_FIELD(hist_enabled);
/* built/available statistics */
WRITE_BOOL_FIELD(deps_built);
WRITE_BOOL_FIELD(mcv_built);
+ WRITE_BOOL_FIELD(hist_built);
}
static void
diff --git a/src/backend/optimizer/path/clausesel.c b/src/backend/optimizer/path/clausesel.c
index d194551..5b2d92a 100644
--- a/src/backend/optimizer/path/clausesel.c
+++ b/src/backend/optimizer/path/clausesel.c
@@ -49,6 +49,7 @@ static void addRangeClause(RangeQueryClause **rqlist, Node *clause,
#define MV_CLAUSE_TYPE_FDEP 0x01
#define MV_CLAUSE_TYPE_MCV 0x02
+#define MV_CLAUSE_TYPE_HIST 0x04
static bool clause_is_mv_compatible(PlannerInfo *root, Node *clause, Oid varRelid,
Index *relid, Bitmapset **attnums, SpecialJoinInfo *sjinfo,
@@ -73,6 +74,8 @@ static Selectivity clauselist_mv_selectivity(PlannerInfo *root,
static Selectivity clauselist_mv_selectivity_mcvlist(PlannerInfo *root,
List *clauses, MVStatisticInfo *mvstats,
bool *fullmatch, Selectivity *lowsel);
+static Selectivity clauselist_mv_selectivity_histogram(PlannerInfo *root,
+ List *clauses, MVStatisticInfo *mvstats);
static int update_match_bitmap_mcvlist(PlannerInfo *root, List *clauses,
int2vector *stakeys, MCVList mcvlist,
@@ -80,6 +83,12 @@ static int update_match_bitmap_mcvlist(PlannerInfo *root, List *clauses,
Selectivity *lowsel, bool *fullmatch,
bool is_or);
+static int update_match_bitmap_histogram(PlannerInfo *root, List *clauses,
+ int2vector *stakeys,
+ MVSerializedHistogram mvhist,
+ int nmatches, char * matches,
+ bool is_or);
+
static bool has_stats(List *stats, int type);
static List * find_stats(PlannerInfo *root, List *clauses,
@@ -114,6 +123,7 @@ static Bitmapset * get_varattnos(Node * node, Index relid);
#define UPDATE_RESULT(m,r,isor) \
(m) = (isor) ? (MAX(m,r)) : (MIN(m,r))
+
/****************************************************************************
* ROUTINES TO COMPUTE SELECTIVITIES
****************************************************************************/
@@ -304,7 +314,7 @@ clauselist_selectivity(PlannerInfo *root,
* Check that there are statistics with MCV list. If not, we don't
* need to waste time with the optimization.
*/
- if (has_stats(stats, MV_CLAUSE_TYPE_MCV))
+ if (has_stats(stats, MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST))
{
/*
* Recollect attributes from mv-compatible clauses (maybe we've
@@ -312,7 +322,7 @@ clauselist_selectivity(PlannerInfo *root,
* From now on we're only interested in MCV-compatible clauses.
*/
mvattnums = collect_mv_attnums(root, clauses, varRelid, &relid, sjinfo,
- MV_CLAUSE_TYPE_MCV);
+ (MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST));
/*
* If there still are at least two columns, we'll try to select
@@ -331,7 +341,7 @@ clauselist_selectivity(PlannerInfo *root,
/* split the clauselist into regular and mv-clauses */
clauses = clauselist_mv_split(root, sjinfo, clauses,
varRelid, &mvclauses, mvstat,
- MV_CLAUSE_TYPE_MCV);
+ (MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST));
/* we've chosen the histogram to match the clauses */
Assert(mvclauses != NIL);
@@ -1098,6 +1108,7 @@ static Selectivity
clauselist_mv_selectivity(PlannerInfo *root, List *clauses, MVStatisticInfo *mvstats)
{
bool fullmatch = false;
+ Selectivity s1 = 0.0, s2 = 0.0;
/*
* Lowest frequency in the MCV list (may be used as an upper bound
@@ -1111,9 +1122,24 @@ clauselist_mv_selectivity(PlannerInfo *root, List *clauses, MVStatisticInfo *mvs
* MCV/histogram evaluation).
*/
- /* Evaluate the MCV selectivity */
- return clauselist_mv_selectivity_mcvlist(root, clauses, mvstats,
+ /* Evaluate the MCV first. */
+ s1 = clauselist_mv_selectivity_mcvlist(root, clauses, mvstats,
&fullmatch, &mcv_low);
+
+ /*
+ * If we got a full equality match on the MCV list, we're done (and
+ * the estimate is pretty good).
+ */
+ if (fullmatch && (s1 > 0.0))
+ return s1;
+
+ /* FIXME if (fullmatch) without matching MCV item, use the mcv_low
+ * selectivity as upper bound */
+
+ s2 = clauselist_mv_selectivity_histogram(root, clauses, mvstats);
+
+ /* TODO clamp to <= 1.0 (or more strictly, when possible) */
+ return s1 + s2;
}
/*
@@ -1255,7 +1281,7 @@ choose_mv_statistics(List *stats, Bitmapset *attnums)
int numattrs = attrs->dim1;
/* skip dependencies-only stats */
- if (! info->mcv_built)
+ if (! (info->mcv_built || info->hist_built))
continue;
/* count columns covered by the histogram */
@@ -1415,7 +1441,6 @@ clause_is_mv_compatible(PlannerInfo *root, Node *clause, Oid varRelid,
bool ok;
/* is it 'variable op constant' ? */
-
ok = (bms_membership(clause_relids) == BMS_SINGLETON) &&
(is_pseudo_constant_clause_relids(lsecond(expr->args),
right_relids) ||
@@ -1465,10 +1490,10 @@ clause_is_mv_compatible(PlannerInfo *root, Node *clause, Oid varRelid,
case F_SCALARLTSEL:
case F_SCALARGTSEL:
/* not compatible with functional dependencies */
- if (types & MV_CLAUSE_TYPE_MCV)
+ if (types & (MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST))
{
*attnums = bms_add_member(*attnums, var->varattno);
- return (types & MV_CLAUSE_TYPE_MCV);
+ return (types & (MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST));
}
return false;
@@ -1796,6 +1821,9 @@ has_stats(List *stats, int type)
if ((type & MV_CLAUSE_TYPE_MCV) && stat->mcv_built)
return true;
+
+ if ((type & MV_CLAUSE_TYPE_HIST) && stat->hist_built)
+ return true;
}
return false;
@@ -2612,3 +2640,675 @@ update_match_bitmap_mcvlist(PlannerInfo *root, List *clauses,
return nmatches;
}
+
+/*
+ * Estimate selectivity of clauses using a histogram.
+ *
+ * If there's no histogram for the stats, the function returns 0.0.
+ *
+ * The general idea of this method is similar to how MCV lists are
+ * processed, except that this introduces the concept of a partial
+ * match (MCV only works with full match / mismatch).
+ *
+ * The algorithm works like this:
+ *
+ * 1) mark all buckets as 'full match'
+ * 2) walk through all the clauses
+ * 3) for a particular clause, walk through all the buckets
+ * 4) skip buckets that are already 'no match'
+ * 5) check clause for buckets that still match (at least partially)
+ * 6) sum frequencies for buckets to get selectivity
+ *
+ * Unlike MCV lists, histograms have a concept of a partial match. In
+ * that case we use 1/2 the bucket, to minimize the average error. The
+ * MV histograms are usually less detailed than the per-column ones,
+ * meaning the sum is often quite high (thanks to combining a lot of
+ * "partially hit" buckets).
+ *
+ * Maybe we could use per-bucket information with number of distinct
+ * values it contains (for each dimension), and then use that to correct
+ * the estimate (so with 10 distinct values, we'd use 1/10 of the bucket
+ * frequency). We might also scale the value depending on the actual
+ * ndistinct estimate (not just the values observed in the sample).
+ *
+ * Another option would be to multiply the selectivities, i.e. if we get
+ * 'partial match' for a bucket for multiple conditions, we might use
+ * 0.5^k (where k is the number of conditions), instead of 0.5. This
+ * probably does not minimize the average error, though.
+ *
+ * TODO This might use a similar shortcut to MCV lists - count buckets
+ * marked as partial/full match, and terminate once this drops to 0.
+ * Not sure if it's really worth it - for MCV lists a situation like
+ * this is not uncommon, but for histograms it's not that clear.
+ */
+static Selectivity
+clauselist_mv_selectivity_histogram(PlannerInfo *root, List *clauses,
+ MVStatisticInfo *mvstats)
+{
+ int i;
+ Selectivity s = 0.0;
+
+ int nmatches = 0;
+ char *matches = NULL;
+
+ MVSerializedHistogram mvhist = NULL;
+
+ /* there's no histogram */
+ if (! mvstats->hist_built)
+ return 0.0;
+
+ /* the histogram was built (per the hist_built check above), so load it */
+ mvhist = load_mv_histogram(mvstats->mvoid);
+
+ Assert (mvhist != NULL);
+ Assert (clauses != NIL);
+ Assert (list_length(clauses) >= 2);
+
+ /*
+ * Bitmap of bucket matches (mismatch, partial, full). By default
+ * all buckets fully match (and we'll gradually eliminate them).
+ */
+ matches = palloc0(sizeof(char) * mvhist->nbuckets);
+ memset(matches, MVSTATS_MATCH_FULL, sizeof(char)*mvhist->nbuckets);
+
+ nmatches = mvhist->nbuckets;
+
+ /* build the match bitmap */
+ update_match_bitmap_histogram(root, clauses,
+ mvstats->stakeys, mvhist,
+ nmatches, matches, false);
+
+ /*
+ * Now walk through the buckets and sum the selectivities. The bucket
+ * frequencies are already stated as fractions of the whole sample
+ * (see build_mv_histogram), so this correctly accounts for the part
+ * of the data that went into the MCV list instead of the histogram,
+ * and no further scaling is necessary.
+ *
+ * TODO This might be handled by keeping a global "frequency" for
+ * the whole histogram, which might save us some time spent
+ * accessing the not-matching part of the histogram. Although
+ * it's likely in a cache, so it's very fast.
+ */
+ for (i = 0; i < mvhist->nbuckets; i++)
+ {
+ if (matches[i] == MVSTATS_MATCH_FULL)
+ s += mvhist->buckets[i]->ntuples;
+ else if (matches[i] == MVSTATS_MATCH_PARTIAL)
+ s += 0.5 * mvhist->buckets[i]->ntuples;
+ }
+
+ /* release the allocated bitmap and deserialized histogram */
+ pfree(matches);
+ pfree(mvhist);
+
+ return s;
+}
+
+/*
+ * Evaluate clauses using the histogram, and update the match bitmap.
+ *
+ * The bitmap may be already partially set, so this is really a way to
+ * combine results of several clause lists - either when computing
+ * conditional probability P(A|B) or a combination of AND/OR clauses.
+ *
+ * Note: This is not a simple bitmap in the sense that there are more
+ * than two possible values for each item - no match, partial
+ * match and full match. So we need 2 bits per item.
+ *
+ * TODO This works with 'bitmap' where each item is represented as a
+ * char, which is slightly wasteful. Instead, we could use a bitmap
+ * with 2 bits per item, reducing the size to ~1/4. By using values
+ * 0, 1 and 3 (instead of 0, 1 and 2), the operations (merging etc.)
+ * might be performed just like for simple bitmap by using & and |,
+ * which might be faster than min/max.
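+ *
+ * A quick sketch of that encoding (an idea only, not what the
+ * code below implements): with NONE = 0x00, PARTIAL = 0x01 and
+ * FULL = 0x03, AND-merge becomes (a & b) and OR-merge (a | b),
+ * e.g. FULL & PARTIAL = PARTIAL and FULL | PARTIAL = FULL.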
+ */
+static int
+update_match_bitmap_histogram(PlannerInfo *root, List *clauses,
+ int2vector *stakeys,
+ MVSerializedHistogram mvhist,
+ int nmatches, char * matches,
+ bool is_or)
+{
+ int i;
+ ListCell * l;
+
+ /*
+ * Cache of function call results, so that we only evaluate each
+ * deduplicated value once per clause.
+ *
+ * There may be up to (2 * nbuckets) distinct values per dimension.
+ * It's probably overkill, but let's allocate that once for all
+ * clauses, to minimize the overhead.
+ *
+ * Also, we only need two bits per value, but this allocates a full
+ * byte per value. Might be worth optimizing.
+ *
+ * 0x00 - not yet called
+ * 0x01 - called, result is 'false'
+ * 0x03 - called, result is 'true'
+ *
+ * This way (cached & 0x02) extracts the result, and (cached != 0x00)
+ * says whether the call happened at all.
+ */
+ char *callcache = palloc(2 * mvhist->nbuckets);
+
+ Assert(mvhist != NULL);
+ Assert(mvhist->nbuckets > 0);
+ Assert(nmatches >= 0);
+ Assert(nmatches <= mvhist->nbuckets);
+
+ Assert(clauses != NIL);
+ Assert(list_length(clauses) >= 1);
+
+ /* loop through the clauses and do the estimation */
+ foreach (l, clauses)
+ {
+ Node * clause = (Node*)lfirst(l);
+
+ /* if it's a RestrictInfo, then extract the clause */
+ if (IsA(clause, RestrictInfo))
+ clause = (Node*)((RestrictInfo*)clause)->clause;
+
+ /* it's either OpClause, or NullTest */
+ if (is_opclause(clause))
+ {
+ OpExpr * expr = (OpExpr*)clause;
+ bool varonleft = true;
+ bool ok;
+
+ FmgrInfo opproc; /* operator */
+ fmgr_info(get_opcode(expr->opno), &opproc);
+
+ /* reset the call cache (per clause) */
+ memset(callcache, 0, 2 * mvhist->nbuckets);
+
+ ok = (NumRelids(clause) == 1) &&
+ (is_pseudo_constant_clause(lsecond(expr->args)) ||
+ (varonleft = false,
+ is_pseudo_constant_clause(linitial(expr->args))));
+
+ if (ok)
+ {
+ FmgrInfo ltproc;
+ RegProcedure oprrest = get_oprrest(expr->opno);
+
+ Var * var = (varonleft) ? linitial(expr->args) : lsecond(expr->args);
+ Const * cst = (varonleft) ? lsecond(expr->args) : linitial(expr->args);
+ bool isgt = (! varonleft);
+
+ /*
+ * TODO Fetch only when really needed (probably for equality only)
+ *
+ * TODO Technically either lt/gt is sufficient.
+ *
+ * FIXME The code in analyze.c creates histograms only for types
+ * with enough ordering (by calling get_sort_group_operators).
+ * Is this the same assumption, i.e. are we certain that we
+ * get the ltproc/gtproc every time we ask? Or are there types
+ * where get_sort_group_operators returns ltopr and here we
+ * get nothing?
+ */
+ TypeCacheEntry *typecache
+ = lookup_type_cache(var->vartype, TYPECACHE_EQ_OPR | TYPECACHE_LT_OPR
+ | TYPECACHE_GT_OPR);
+
+ /* lookup dimension for the attribute */
+ int idx = mv_get_index(var->varattno, stakeys);
+
+ fmgr_info(get_opcode(typecache->lt_opr), &ltproc);
+
+ /*
+ * Check this for all buckets that still have "true" in the bitmap
+ *
+ * We already know the clauses use suitable operators (because that's
+ * how we filtered them).
+ */
+ for (i = 0; i < mvhist->nbuckets; i++)
+ {
+ bool tmp;
+ MVSerializedBucket bucket = mvhist->buckets[i];
+
+ /* histogram boundaries */
+ Datum minval, maxval;
+
+ /* values from the call cache */
+ char mincached, maxcached;
+
+ /*
+ * For AND-lists, we can also mark NULL buckets as 'no match'
+ * (and then skip them). For OR-lists this is not possible.
+ */
+ if ((! is_or) && bucket->nullsonly[idx])
+ matches[i] = MVSTATS_MATCH_NONE;
+
+ /*
+ * Skip buckets that were already eliminated - this is important
+ * considering how we update the info (we only lower the match).
+ * We can't really do anything about the MATCH_PARTIAL buckets.
+ */
+ if ((! is_or) && (matches[i] == MVSTATS_MATCH_NONE))
+ continue;
+ else if (is_or && (matches[i] == MVSTATS_MATCH_FULL))
+ continue;
+
+ /* lookup the values and cache of function calls */
+ minval = mvhist->values[idx][bucket->min[idx]];
+ maxval = mvhist->values[idx][bucket->max[idx]];
+
+ mincached = callcache[bucket->min[idx]];
+ maxcached = callcache[bucket->max[idx]];
+
+ /*
+ * TODO Maybe it's possible to add here a similar optimization
+ * as for the MCV lists:
+ *
+ * (nmatches == 0) && AND-list => all eliminated (FALSE)
+ * (nmatches == N) && OR-list => all eliminated (TRUE)
+ *
+ * But it's more complex because of the partial matches.
+ */
+
+ /*
+ * If it's not a "<" or ">" or "=" operator, just ignore the
+ * clause. Otherwise note the relid and attnum for the variable.
+ *
+ * TODO I'm really unsure whether the handling of the 'isgt' flag (i.e.
+ * clauses with variable and constant in reverse order) is correct. I wouldn't
+ * be surprised if there was some mixup. Using the lt/gt operators
+ * instead of messing with the opproc could make it simpler.
+ * It would however be using a different operator than the query,
+ * although it's not any shadier than using the selectivity function
+ * as is done currently.
+ *
+ * FIXME Once the min/max values are deduplicated, we can easily minimize
+ * the number of calls to the comparator (assuming we keep the
+ * deduplicated structure). See the note on compression at MVBucket
+ * serialize/deserialize methods.
+ */
+ switch (oprrest)
+ {
+ case F_SCALARLTSEL: /* column < constant */
+
+ if (! isgt) /* (var < const) */
+ {
+ /*
+ * First check whether the constant is below the lower boundary (in that
+ * case we can skip the bucket, because there's no overlap).
+ */
+ if (! mincached)
+ {
+ tmp = DatumGetBool(FunctionCall2Coll(&opproc,
+ DEFAULT_COLLATION_OID,
+ cst->constvalue,
+ minval));
+
+ /*
+ * Update the cache, but with the inverse value, as we keep the
+ * cache for calls with (minval, constvalue).
+ */
+ callcache[bucket->min[idx]] = (tmp) ? 0x01 : 0x03;
+ }
+ else
+ tmp = !(mincached & 0x02); /* get call result from the cache (inverse) */
+
+ if (tmp)
+ {
+ /* no match */
+ UPDATE_RESULT(matches[i], MVSTATS_MATCH_NONE, is_or);
+ continue;
+ }
+
+ /*
+ * Now check whether the upper boundary is below the constant (in that
+ * case it's a partial match).
+ */
+ if (! maxcached)
+ {
+ tmp = DatumGetBool(FunctionCall2Coll(&opproc,
+ DEFAULT_COLLATION_OID,
+ cst->constvalue,
+ maxval));
+
+ /*
+ * Update the cache, but with the inverse value, as we keep the
+ * cache for calls with (minval, constvalue).
+ */
+ callcache[bucket->max[idx]] = (tmp) ? 0x01 : 0x03;
+ }
+ else
+ tmp = !(maxcached & 0x02); /* extract the result (reverse) */
+
+ if (tmp) /* partial match */
+ UPDATE_RESULT(matches[i], MVSTATS_MATCH_PARTIAL, is_or);
+
+ }
+ else /* (const < var) */
+ {
+ /*
+ * First check whether the constant is above the upper boundary (in that
+ * case we can skip the bucket, because there's no overlap).
+ */
+ if (! maxcached)
+ {
+ tmp = DatumGetBool(FunctionCall2Coll(&opproc,
+ DEFAULT_COLLATION_OID,
+ maxval,
+ cst->constvalue));
+
+ /* Update the cache. */
+ callcache[bucket->max[idx]] = (tmp) ? 0x03 : 0x01;
+ }
+ else
+ tmp = (maxcached & 0x02); /* extract the result */
+
+ if (tmp)
+ {
+ /* no match */
+ UPDATE_RESULT(matches[i], MVSTATS_MATCH_NONE, is_or);
+ continue;
+ }
+
+ /*
+ * Now check whether the lower boundary is below the constant (in that
+ * case it's a partial match).
+ */
+ if (! mincached)
+ {
+ tmp = DatumGetBool(FunctionCall2Coll(&opproc,
+ DEFAULT_COLLATION_OID,
+ minval,
+ cst->constvalue));
+
+ /* Update the cache. */
+ callcache[bucket->min[idx]] = (tmp) ? 0x03 : 0x01;
+ }
+ else
+ tmp = (mincached & 0x02); /* extract the result */
+
+ if (tmp) /* partial match */
+ UPDATE_RESULT(matches[i], MVSTATS_MATCH_PARTIAL, is_or);
+ }
+ break;
+
+ case F_SCALARGTSEL: /* column > constant */
+
+ if (! isgt) /* (var > const) */
+ {
+ /*
+ * First check whether the constant is above the upper boundary (in that
+ * case we can skip the bucket, because there's no overlap).
+ */
+ if (! maxcached)
+ {
+ tmp = DatumGetBool(FunctionCall2Coll(&opproc,
+ DEFAULT_COLLATION_OID,
+ cst->constvalue,
+ maxval));
+
+ /*
+ * Update the cache, but with the inverse value, as we keep the
+ * cache for calls with (val, constvalue).
+ */
+ callcache[bucket->max[idx]] = (tmp) ? 0x01 : 0x03;
+ }
+ else
+ tmp = !(maxcached & 0x02); /* extract the result */
+
+ if (tmp)
+ {
+ /* no match */
+ UPDATE_RESULT(matches[i], MVSTATS_MATCH_NONE, is_or);
+ continue;
+ }
+
+ /*
+ * Now check whether the lower boundary is below the constant (in that
+ * case it's a partial match).
+ */
+ if (! mincached)
+ {
+ tmp = DatumGetBool(FunctionCall2Coll(&opproc,
+ DEFAULT_COLLATION_OID,
+ cst->constvalue,
+ minval));
+
+ /*
+ * Update the cache, but with the inverse value, as we keep the
+ * cache for calls with (val, constvalue).
+ */
+ callcache[bucket->min[idx]] = (tmp) ? 0x01 : 0x03;
+ }
+ else
+ tmp = !(mincached & 0x02); /* extract the result */
+
+ if (tmp)
+ /* partial match */
+ UPDATE_RESULT(matches[i], MVSTATS_MATCH_PARTIAL, is_or);
+ }
+ else /* (const > var) */
+ {
+ /*
+ * First check whether the constant is below the lower boundary (in
+ * that case we can skip the bucket, because there's no overlap).
+ */
+ if (! mincached)
+ {
+ tmp = DatumGetBool(FunctionCall2Coll(&opproc,
+ DEFAULT_COLLATION_OID,
+ minval,
+ cst->constvalue));
+
+ /* Update the cache. */
+ callcache[bucket->min[idx]] = (tmp) ? 0x03 : 0x01;
+ }
+ else
+ tmp = (mincached & 0x02); /* extract the result */
+
+ if (tmp)
+ {
+ /* no match */
+ UPDATE_RESULT(matches[i], MVSTATS_MATCH_NONE, is_or);
+ continue;
+ }
+
+ /*
+ * Now check whether the upper boundary is below the constant (in that
+ * case it's a partial match).
+ */
+ if (! maxcached)
+ {
+ tmp = DatumGetBool(FunctionCall2Coll(&opproc,
+ DEFAULT_COLLATION_OID,
+ maxval,
+ cst->constvalue));
+
+ /* Update the cache. */
+ callcache[bucket->max[idx]] = (tmp) ? 0x03 : 0x01;
+ }
+ else
+ tmp = (maxcached & 0x02); /* extract the result */
+
+ if (tmp)
+ /* partial match */
+ UPDATE_RESULT(matches[i], MVSTATS_MATCH_PARTIAL, is_or);
+ }
+ break;
+
+ case F_EQSEL:
+
+ /*
+ * We only check whether the value is within the bucket, using the lt/gt
+ * operators fetched from type cache.
+ *
+ * TODO We'll use the default 50% estimate, but that's probably way off
+ * if there are multiple distinct values. Consider tweaking this
+ * somehow, e.g. using only a part inversely proportional to the
+ * estimated number of distinct values in the bucket.
+ *
+ * TODO This does not handle inclusion flags at the moment, thus counting
+ * some buckets twice (when hitting the boundary).
+ *
+ * TODO One possible optimization: if max[i] == min[i], it's effectively an MCV
+ * item and we can count the whole bucket as a complete match (thus
+ * using 100% bucket selectivity and not just 50%).
+ *
+ * TODO Technically some buckets may "degenerate" into single-value
+ * buckets (not necessarily for all the dimensions) - maybe this
+ * is better than keeping a separate MCV list (multi-dimensional).
+ * Update: Actually, that's unlikely to be better than a separate
+ * MCV list for two reasons - first, it requires ~2x the space
+ * (because of storing lower/upper boundaries) and second because
+ * the buckets are ranges - depending on the partitioning algorithm
+ * it may not even degenerate into a (min=max) bucket. For example,
+ * the current partitioning algorithm never does that.
+ */
+ if (! mincached)
+ {
+ tmp = DatumGetBool(FunctionCall2Coll(&ltproc,
+ DEFAULT_COLLATION_OID,
+ cst->constvalue,
+ minval));
+
+ /* Update the cache. */
+ callcache[bucket->min[idx]] = (tmp) ? 0x03 : 0x01;
+ }
+ else
+ tmp = (mincached & 0x02); /* extract the result */
+
+ if (tmp)
+ {
+ /* no match */
+ UPDATE_RESULT(matches[i], MVSTATS_MATCH_NONE, is_or);
+ continue;
+ }
+
+ if (! maxcached)
+ {
+ tmp = DatumGetBool(FunctionCall2Coll(&ltproc,
+ DEFAULT_COLLATION_OID,
+ maxval,
+ cst->constvalue));
+
+ /* Update the cache. */
+ callcache[bucket->max[idx]] = (tmp) ? 0x03 : 0x01;
+ }
+ else
+ tmp = (maxcached & 0x02); /* extract the result */
+
+ if (tmp)
+ {
+ /* no match */
+ UPDATE_RESULT(matches[i], MVSTATS_MATCH_NONE, is_or);
+ continue;
+ }
+
+ /* partial match */
+ UPDATE_RESULT(matches[i], MVSTATS_MATCH_PARTIAL, is_or);
+
+ break;
+ }
+ }
+ }
+ }
+ else if (IsA(clause, NullTest))
+ {
+ NullTest * expr = (NullTest*)clause;
+ Var * var = (Var*)(expr->arg);
+
+ /* FIXME proper matching attribute to dimension */
+ int idx = mv_get_index(var->varattno, stakeys);
+
+ /*
+ * Walk through the buckets and evaluate the current clause. We can
+ * skip items that were already ruled out, and terminate if there are
+ * no remaining buckets that might possibly match.
+ */
+ for (i = 0; i < mvhist->nbuckets; i++)
+ {
+ MVSerializedBucket bucket = mvhist->buckets[i];
+
+ /*
+ * Skip buckets that were already eliminated - this is important
+ * considering how we update the info (we only lower the match)
+ */
+ if ((! is_or) && (matches[i] == MVSTATS_MATCH_NONE))
+ continue;
+ else if (is_or && (matches[i] == MVSTATS_MATCH_FULL))
+ continue;
+
+ /* if the clause mismatches the bucket, mark it as MATCH_NONE */
+ if ((expr->nulltesttype == IS_NULL)
+ && (! bucket->nullsonly[idx]))
+ UPDATE_RESULT(matches[i], MVSTATS_MATCH_NONE, is_or);
+
+ else if ((expr->nulltesttype == IS_NOT_NULL) &&
+ (bucket->nullsonly[idx]))
+ UPDATE_RESULT(matches[i], MVSTATS_MATCH_NONE, is_or);
+ }
+ }
+ else if (or_clause(clause) || and_clause(clause))
+ {
+ /* AND/OR clause, with all clauses compatible with the selected MV stat */
+
+ int i;
+ BoolExpr *orclause = ((BoolExpr*)clause);
+ List *orclauses = orclause->args;
+
+ /* match/mismatch bitmap for each bucket */
+ int or_nmatches = 0;
+ char * or_matches = NULL;
+
+ Assert(orclauses != NIL);
+ Assert(list_length(orclauses) >= 2);
+
+ /* number of matching buckets */
+ or_nmatches = mvhist->nbuckets;
+
+ /* by default none of the buckets matches the clauses */
+ or_matches = palloc0(sizeof(char) * or_nmatches);
+
+ if (or_clause(clause))
+ {
+ /* OR clauses assume nothing matches, initially */
+ memset(or_matches, MVSTATS_MATCH_NONE, sizeof(char)*or_nmatches);
+ or_nmatches = 0;
+ }
+ else
+ {
+ /* AND clauses assume everything matches, initially */
+ memset(or_matches, MVSTATS_MATCH_FULL, sizeof(char)*or_nmatches);
+ }
+
+ /* build the match bitmap for the OR-clauses */
+ or_nmatches = update_match_bitmap_histogram(root, orclauses,
+ stakeys, mvhist,
+ or_nmatches, or_matches, or_clause(clause));
+
+ /* merge the bitmap into the existing one */
+ for (i = 0; i < mvhist->nbuckets; i++)
+ {
+ /*
+ * To AND-merge the bitmaps, a MIN() semantics is used.
+ * For OR-merge, use MAX().
+ *
+ * FIXME this does not decrease the number of matches
+ */
+ UPDATE_RESULT(matches[i], or_matches[i], is_or);
+ }
+
+ pfree(or_matches);
+
+ }
+ else
+ elog(ERROR, "unknown clause type: %d", clause->type);
+ }
+
+ /* free the call cache */
+ pfree(callcache);
+
+#ifdef DEBUG_MVHIST
+ debug_histogram_matches(mvhist, matches);
+#endif
+
+ return nmatches;
+}
diff --git a/src/backend/optimizer/util/plancat.c b/src/backend/optimizer/util/plancat.c
index 0cb4063..963d26e 100644
--- a/src/backend/optimizer/util/plancat.c
+++ b/src/backend/optimizer/util/plancat.c
@@ -410,7 +410,7 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
mvstat = (Form_pg_mv_statistic) GETSTRUCT(htup);
/* unavailable stats are not interesting for the planner */
- if (mvstat->deps_built || mvstat->mcv_built)
+ if (mvstat->deps_built || mvstat->mcv_built || mvstat->hist_built)
{
info = makeNode(MVStatisticInfo);
@@ -420,10 +420,12 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
/* enabled statistics */
info->deps_enabled = mvstat->deps_enabled;
info->mcv_enabled = mvstat->mcv_enabled;
+ info->hist_enabled = mvstat->hist_enabled;
/* built/available statistics */
info->deps_built = mvstat->deps_built;
info->mcv_built = mvstat->mcv_built;
+ info->hist_built = mvstat->hist_built;
/* stakeys */
adatum = SysCacheGetAttr(MVSTATOID, htup,
diff --git a/src/backend/utils/mvstats/Makefile b/src/backend/utils/mvstats/Makefile
index f9bf10c..9dbb3b6 100644
--- a/src/backend/utils/mvstats/Makefile
+++ b/src/backend/utils/mvstats/Makefile
@@ -12,6 +12,6 @@ subdir = src/backend/utils/mvstats
top_builddir = ../../../..
include $(top_builddir)/src/Makefile.global
-OBJS = common.o dependencies.o mcv.o
+OBJS = common.o dependencies.o histogram.o mcv.o
include $(top_srcdir)/src/backend/common.mk
diff --git a/src/backend/utils/mvstats/common.c b/src/backend/utils/mvstats/common.c
index d1da714..ffb76f4 100644
--- a/src/backend/utils/mvstats/common.c
+++ b/src/backend/utils/mvstats/common.c
@@ -13,11 +13,11 @@
*
*-------------------------------------------------------------------------
*/
+#include "postgres.h"
+#include "utils/array.h"
#include "common.h"
-#include "utils/array.h"
-
static VacAttrStats ** lookup_var_attr_stats(int2vector *attrs,
int natts,
VacAttrStats **vacattrstats);
@@ -52,7 +52,8 @@ build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
MVStatisticInfo *stat = (MVStatisticInfo *)lfirst(lc);
MVDependencies deps = NULL;
MCVList mcvlist = NULL;
- int numrows_filtered = 0;
+ MVHistogram histogram = NULL;
+ int numrows_filtered = numrows;
VacAttrStats **stats = NULL;
int numatts = 0;
@@ -95,8 +96,12 @@ build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
if (stat->mcv_enabled)
mcvlist = build_mv_mcvlist(numrows, rows, attrs, stats, &numrows_filtered);
+ /* build a multivariate histogram on the columns */
+ if ((numrows_filtered > 0) && (stat->hist_enabled))
+ histogram = build_mv_histogram(numrows_filtered, rows, attrs, stats, numrows);
+
/* store the histogram / MCV list in the catalog */
- update_mv_stats(stat->mvoid, deps, mcvlist, attrs, stats);
+ update_mv_stats(stat->mvoid, deps, mcvlist, histogram, attrs, stats);
}
}
@@ -176,6 +181,8 @@ list_mv_stats(Oid relid)
info->deps_built = stats->deps_built;
info->mcv_enabled = stats->mcv_enabled;
info->mcv_built = stats->mcv_built;
+ info->hist_enabled = stats->hist_enabled;
+ info->hist_built = stats->hist_built;
result = lappend(result, info);
}
@@ -190,7 +197,6 @@ list_mv_stats(Oid relid)
return result;
}
-
/*
* Find attnims of MV stats using the mvoid.
*/
@@ -236,9 +242,16 @@ find_mv_attnums(Oid mvoid, Oid *relid)
}
+/*
+ * FIXME This adds statistics, but we need to drop statistics when the
+ * table is dropped. Not sure what to do when a column is dropped.
+ * Either we can (a) remove all stats on that column, (b) remove
+ * the column from defined stats and force rebuild, (c) remove the
+ * column on next ANALYZE. Or maybe something else?
+ */
void
update_mv_stats(Oid mvoid,
- MVDependencies dependencies, MCVList mcvlist,
+ MVDependencies dependencies, MCVList mcvlist, MVHistogram histogram,
int2vector *attrs, VacAttrStats **stats)
{
HeapTuple stup,
@@ -271,22 +284,34 @@ update_mv_stats(Oid mvoid,
values[Anum_pg_mv_statistic_stamcv - 1] = PointerGetDatum(data);
}
+ if (histogram != NULL)
+ {
+ bytea * data = serialize_mv_histogram(histogram, attrs, stats);
+ nulls[Anum_pg_mv_statistic_stahist-1] = (data == NULL);
+ values[Anum_pg_mv_statistic_stahist - 1]
+ = PointerGetDatum(data);
+ }
+
/* always replace the value (either by bytea or NULL) */
replaces[Anum_pg_mv_statistic_stadeps -1] = true;
replaces[Anum_pg_mv_statistic_stamcv -1] = true;
+ replaces[Anum_pg_mv_statistic_stahist-1] = true;
/* always change the availability flags */
nulls[Anum_pg_mv_statistic_deps_built -1] = false;
nulls[Anum_pg_mv_statistic_mcv_built -1] = false;
+ nulls[Anum_pg_mv_statistic_hist_built-1] = false;
nulls[Anum_pg_mv_statistic_stakeys-1] = false;
/* use the new attnums, in case we removed some dropped ones */
replaces[Anum_pg_mv_statistic_deps_built-1] = true;
replaces[Anum_pg_mv_statistic_mcv_built -1] = true;
+ replaces[Anum_pg_mv_statistic_hist_built -1] = true;
replaces[Anum_pg_mv_statistic_stakeys -1] = true;
values[Anum_pg_mv_statistic_deps_built-1] = BoolGetDatum(dependencies != NULL);
values[Anum_pg_mv_statistic_mcv_built -1] = BoolGetDatum(mcvlist != NULL);
+ values[Anum_pg_mv_statistic_hist_built -1] = BoolGetDatum(histogram != NULL);
values[Anum_pg_mv_statistic_stakeys -1] = PointerGetDatum(attrs);
/* Is there already a pg_mv_statistic tuple for this attribute? */
diff --git a/src/backend/utils/mvstats/histogram.c b/src/backend/utils/mvstats/histogram.c
new file mode 100644
index 0000000..933700f
--- /dev/null
+++ b/src/backend/utils/mvstats/histogram.c
@@ -0,0 +1,2316 @@
+/*-------------------------------------------------------------------------
+ *
+ * histogram.c
+ * POSTGRES multivariate histograms
+ *
+ *
+ * Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ * src/backend/utils/mvstats/histogram.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres.h"
+
+#include "fmgr.h"
+#include "funcapi.h"
+
+#include "utils/lsyscache.h"
+
+#include "common.h"
+#include <math.h>
+
+/*
+ * Multivariate histograms
+ * -----------------------
+ *
+ * Histograms are a collection of buckets, represented by n-dimensional
+ * rectangles. Each rectangle is delimited by a min/max value in each
+ * dimension, stored in an array, so that the bucket includes values
+ * fulfilling condition
+ *
+ * min[i] <= value[i] <= max[i]
+ *
+ * where 'i' is the dimension. In 1D this corresponds to a simple
+ * interval, in 2D to a rectangle, and in 3D to a block. If you can
+ * imagine this in 4D, congrats!
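+ *
+ * For example (hypothetical values), a 2-D bucket with min = (1, 10)
+ * and max = (5, 20) matches the value (3, 15), but not (3, 25) or
+ * (7, 15) - the condition has to hold in every dimension.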
+ *
+ * In addition to the boundaries, each bucket tracks additional details:
+ *
+ * * frequency (fraction of tuples it matches)
+ * * whether the boundaries are inclusive or exclusive
+ * * whether the dimension contains only NULL values
+ * * number of distinct values in each dimension (for building)
+ *
+ * and possibly some additional information.
+ *
+ * We do expect to support multiple histogram types, with different
+ * features etc. The 'type' field is used to identify those types.
+ * Technically some histogram types might use completely different
+ * bucket representation, but that's not expected at the moment.
+ *
+ * Although the current implementation builds non-overlapping buckets,
+ * the code does not (and should not) rely on the non-overlapping
+ * nature - there are interesting types of histograms / histogram
+ * building algorithms producing overlapping buckets.
+ *
+ *
+ * NULL handling (create_null_buckets)
+ * -----------------------------------
+ * Another thing worth mentioning is handling of NULL values. It would
+ * be quite difficult to work with buckets containing NULL and non-NULL
+ * values for a single dimension. To work around this, the initial step
+ * in building a histogram is building a set of 'NULL-buckets', i.e.
+ * buckets with one or more NULL-only dimensions.
+ *
+ * After that, no buckets are mixing NULL and non-NULL values in one
+ * dimension, and the actual histogram building starts. As that only
+ * splits the buckets into smaller ones, the resulting buckets can't
+ * mix NULL and non-NULL values either.
+ *
+ * The maximum number of NULL-buckets is determined by the number of
+ * attributes the histogram is built on. For N-dimensional histogram,
+ * the maximum number of NULL-buckets is 2^N. So for 8 attributes
+ * (which is the current value of MVSTATS_MAX_DIMENSIONS), there may be
+ * up to 256 NULL-buckets.
+ *
+ * Those buckets are only built if needed - if there are no NULL values
+ * in the data, no such buckets are built.
+ *
+ *
+ * Estimating selectivity
+ * ----------------------
+ * With histograms, we always "match" a whole bucket, not individual
+ * rows (or values), irrespective of the type of clause. Therefore we
+ * can't use the optimizations for equality clauses, as in MCV lists.
+ *
+ * The current implementation uses histograms to estimate these types
+ * of clauses (think of WHERE conditions):
+ *
+ * (a) equality clauses WHERE (a = 1) AND (b = 2)
+ * (b) inequality clauses WHERE (a < 1) AND (b >= 2)
+ * (c) NULL clauses WHERE (a IS NULL) AND (b IS NOT NULL)
+ * (d) OR-clauses WHERE (a = 1) OR (b = 2)
+ *
+ * It's possible to add more clauses, for example:
+ *
+ * (e) multi-var clauses WHERE (a > b)
+ *
+ * and so on. These are tasks for the future, not yet implemented.
+ *
+ * When used on low-cardinality data, histograms usually perform
+ * considerably worse than MCV lists (which are a good fit for this
+ * kind of data). This is especially true on categorical data, where
+ * ordering of the values is mostly unrelated to meaning of the data,
+ * as proper ordering is crucial for histograms.
+ *
+ * On high-cardinality data the histograms are usually a better choice,
+ * because MCV lists can't represent the distribution accurately enough.
+ *
+ * By evaluating a clause on a bucket, we may get one of three results:
+ *
+ * (a) FULL_MATCH - The bucket definitely matches the clause.
+ *
+ * (b) PARTIAL_MATCH - The bucket matches the clause, but not
+ * necessarily all the tuples it represents.
+ *
+ * (c) NO_MATCH - The bucket definitely does not match the clause.
+ *
+ * This may be illustrated using a range [1, 5], which is essentially
+ * a 1-D bucket. With clause
+ *
+ * WHERE (a < 10) => FULL_MATCH (all range values are below
+ * 10, so the whole bucket matches)
+ *
+ * WHERE (a < 3) => PARTIAL_MATCH (there may be values matching
+ * the clause, but we don't know how many)
+ *
+ * WHERE (a < 0) => NO_MATCH (the whole range is above 0, so
+ * no values from the bucket can match)
+ *
+ * Some clauses may produce only some of those results - for example
+ * equality clauses may never produce FULL_MATCH as we always hit only
+ * part of the bucket (we can't match both boundaries at the same time).
+ * This results in less accurate estimates compared to MCV lists, where
+ * we can hit an MCV item exactly (there's no PARTIAL match in MCV).
+ *
+ * There are clauses that may not produce any PARTIAL_MATCH results.
+ * A nice example of that is 'IS [NOT] NULL' clause, which either
+ * matches the bucket completely (FULL_MATCH) or not at all (NO_MATCH),
+ * thanks to how the NULL-buckets are constructed.
+ *
+ * Computing the total selectivity estimate is trivial - simply sum
+ * selectivities from all the FULL_MATCH and PARTIAL_MATCH buckets (but
+ * multiply the PARTIAL_MATCH buckets by 0.5 to minimize average error).
+ *
+ *
+ * Serialization
+ * -------------
+ * After building, the histogram is serialized into a more efficient
+ * form (dedup boundary values etc.). See serialize_mv_histogram() for
+ * more details about how it's done.
+ *
+ * Serialized histograms are marked with 'magic' constant, to make it
+ * easier to check the bytea value really is a serialized histogram.
+ *
+ * In the serialized form, values for each dimension are deduplicated,
+ * and referenced using an uint16 index. This saves a lot of space,
+ * because every time we split a bucket, we introduce a single new
+ * boundary value (to split the bucket by the selected dimension), but
+ * we actually copy all the boundary values for all dimensions. So for
+ * a histogram with 4 dimensions and 1000 buckets, we do have
+ *
+ * 1000 * 4 * 2 = 8000
+ *
+ * boundary values, but many of them are actually duplicated because
+ * the histogram started with a single bucket (8 boundary values) and
+ * then there were 999 splits (each introducing 1 new value):
+ *
+ * 8 + 999 = 1007
+ *
+ * So that's quite a large difference. Let's assume the Datum values are
+ * 8 bytes each. Storing the raw histogram would take ~ 64 kB, while
+ * with deduplication it's only ~18 kB.
+ *
+ * The difference may be removed by the transparent bytea compression,
+ * but the deduplication is also used to optimize the estimation. It's
+ * possible to process the deduplicated values, and then use this as
+ * a cache to minimize the actual function calls while checking the
+ * buckets. This significantly reduces the number of calls to the
+ * (often quite expensive) operator functions etc.
+ *
+ *
+ * The current limit on number of buckets (16384) is mostly arbitrary,
+ * but set so that it makes sure we don't exceed the number of distinct
+ * values indexable by uint16. In practice we could handle more buckets,
+ * because we index each dimension independently, and we do the splits
+ * over multiple dimensions.
+ *
+ * Histograms with more than 16k buckets are quite expensive to build
+ * and process, so the current limit is somewhat reasonable.
+ *
+ * The actual number of buckets is also related to statistics target,
+ * because we require MIN_BUCKET_ROWS (10) tuples per bucket before
+ * a split, so we can't have more than (2 * 300 * target / 10) buckets.
+ *
+ *
+ * TODO Maybe the distinct stats (both for combination of all columns
+ * and for combinations of various subsets of columns) should be
+ * moved to a separate structure (next to histogram/MCV/...) to
+ * make it useful even without a histogram computed etc.
+ *
+ * This would actually make mvcoeff (proposed by Kyotaro Horiguchi
+ * in [1]) possible. Seems like a good way to estimate GROUP BY
+ * cardinality, and also some other cases, pointed out by Kyotaro:
+ *
+ * [1] http://www.postgresql.org/message-id/20150515.152936.83796179.horiguchi.kyotaro@lab.ntt.co.jp
+ *
+ * This is not implemented at the moment, though. Also, Kyotaro's
+ * patch only works with pairs of columns, but maybe tracking all
+ * the combinations would be useful to handle more complex
+ * conditions. It only seems to handle equalities, though (but for
+ * GROUP BY estimation that's not a big deal).
+ */
+
+static MVBucket create_initial_mv_bucket(int numrows, HeapTuple *rows,
+ int2vector *attrs,
+ VacAttrStats **stats);
+
+static MVBucket select_bucket_to_partition(int nbuckets, MVBucket * buckets);
+
+static MVBucket partition_bucket(MVBucket bucket, int2vector *attrs,
+ VacAttrStats **stats,
+ int *ndistvalues, Datum **distvalues);
+
+static MVBucket copy_mv_bucket(MVBucket bucket, uint32 ndimensions);
+
+static void update_bucket_ndistinct(MVBucket bucket, int2vector *attrs,
+ VacAttrStats ** stats);
+
+static void update_dimension_ndistinct(MVBucket bucket, int dimension,
+ int2vector *attrs,
+ VacAttrStats ** stats,
+ bool update_boundaries);
+
+static void create_null_buckets(MVHistogram histogram, int bucket_idx,
+ int2vector *attrs, VacAttrStats ** stats);
+
+static int bsearch_comparator(const void * a, const void * b);
+
+/*
+ * Each serialized bucket needs to store (in this order):
+ *
+ * - number of tuples (float)
+ * - min inclusive flags (ndim * sizeof(bool))
+ * - max inclusive flags (ndim * sizeof(bool))
+ * - null dimension flags (ndim * sizeof(bool))
+ * - min boundary indexes (ndim * sizeof(uint16))
+ * - max boundary indexes (ndim * sizeof(uint16))
+ *
+ * The macro currently reserves twice the space actually needed for
+ * the uint16 indexes (probably a leftover from 32-bit indexes), so
+ * in total:
+ *
+ * ndims * (4 * sizeof(uint16) + 3 * sizeof(bool)) + sizeof(float)
+ */
+#define BUCKET_SIZE(ndims) \
+ (ndims * (4 * sizeof(uint16) + 3 * sizeof(bool)) + sizeof(float))
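+
+/*
+ * For example, with ndims = 4 (and the usual 2-byte uint16, 1-byte
+ * bool and 4-byte float) this works out to
+ *
+ * 4 * (4 * 2 + 3 * 1) + 4 = 48 bytes per serialized bucket
+ */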
+
+/* pointers into a flat serialized bucket of BUCKET_SIZE(n) bytes */
+#define BUCKET_NTUPLES(b) ((float*)b)
+#define BUCKET_MIN_INCL(b,n) ((bool*)(b + sizeof(float)))
+#define BUCKET_MAX_INCL(b,n) (BUCKET_MIN_INCL(b,n) + n)
+#define BUCKET_NULLS_ONLY(b,n) (BUCKET_MAX_INCL(b,n) + n)
+#define BUCKET_MIN_INDEXES(b,n) ((uint16*)(BUCKET_NULLS_ONLY(b,n) + n))
+#define BUCKET_MAX_INDEXES(b,n) ((BUCKET_MIN_INDEXES(b,n) + n))
+
+/* can't split bucket with less than 10 rows */
+#define MIN_BUCKET_ROWS 10
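+
+/*
+ * For illustration, using the (2 * 300 * target / 10) bound mentioned
+ * above: with the default statistics target of 100 the sample has
+ * 300 * 100 = 30000 rows, giving at most 6000 buckets - safely below
+ * MVSTAT_HIST_MAX_BUCKETS.
+ */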
+
+/*
+ * Data used while building the histogram.
+ */
+typedef struct HistogramBuildData {
+
+ float ndistinct; /* frequency of distinct values */
+
+ HeapTuple *rows; /* array of sample rows */
+ uint32 numrows; /* number of sample rows (array size) */
+
+ /*
+ * Number of distinct values in each dimension. This is used when
+ * building the histogram (and is not serialized/deserialized).
+ */
+ uint32 *ndistincts;
+
+} HistogramBuildData;
+
+typedef HistogramBuildData *HistogramBuild;
+
+/*
+ * Building a multivariate histogram. In short, this first creates a
+ * single bucket containing all the rows, and then repeatedly splits it,
+ * each time searching for the bucket / dimension most in need of a split.
+ *
+ * The current criteria is rather simple, chosen so that the algorithm
+ * produces buckets with about equal frequency and regular size.
+ *
+ * See the discussion at select_bucket_to_partition and partition_bucket
+ * for more details about the algorithm.
+ *
+ * The current algorithm works like this:
+ *
+ * build NULL-buckets (create_null_buckets)
+ *
+ * while [not reaching maximum number of buckets]
+ *
+ * choose bucket to partition (largest bucket)
+ * if no bucket to partition
+ * terminate the algorithm
+ *
+ * choose bucket dimension to partition (largest dimension)
+ * split the bucket into two buckets
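+ *
+ * Since each iteration adds exactly one bucket, a histogram with M
+ * buckets is the result of (M - 1) splits (plus the NULL-buckets
+ * created up front), bounded by MVSTAT_HIST_MAX_BUCKETS.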
+ */
+MVHistogram
+build_mv_histogram(int numrows, HeapTuple *rows, int2vector *attrs,
+ VacAttrStats **stats, int numrows_total)
+{
+ int i;
+ int numattrs = attrs->dim1;
+
+ int *ndistvalues;
+ Datum **distvalues;
+
+ MVHistogram histogram = (MVHistogram)palloc0(sizeof(MVHistogramData));
+
+ HeapTuple * rows_copy = (HeapTuple*)palloc0(numrows * sizeof(HeapTuple));
+ memcpy(rows_copy, rows, sizeof(HeapTuple) * numrows);
+
+ Assert((numattrs >= 2) && (numattrs <= MVSTATS_MAX_DIMENSIONS));
+
+ histogram->ndimensions = numattrs;
+
+ histogram->magic = MVSTAT_HIST_MAGIC;
+ histogram->type = MVSTAT_HIST_TYPE_BASIC;
+ histogram->nbuckets = 1;
+
+ /* create max buckets (better than repalloc for short-lived objects) */
+ histogram->buckets
+ = (MVBucket*)palloc0(MVSTAT_HIST_MAX_BUCKETS * sizeof(MVBucket));
+
+ /* create the initial bucket, covering the whole sample set */
+ histogram->buckets[0]
+ = create_initial_mv_bucket(numrows, rows_copy, attrs, stats);
+
+ /*
+ * Collect info on distinct values in each dimension (used later
+ * to select dimension to partition).
+ */
+ ndistvalues = (int*)palloc0(sizeof(int) * numattrs);
+ distvalues = (Datum**)palloc0(sizeof(Datum*) * numattrs);
+
+ for (i = 0; i < numattrs; i++)
+ {
+ int j;
+ int nvals;
+ Datum *tmp;
+
+ SortSupportData ssup;
+ StdAnalyzeData *mystats = (StdAnalyzeData *) stats[i]->extra_data;
+
+ /* initialize sort support, etc. */
+ memset(&ssup, 0, sizeof(ssup));
+ ssup.ssup_cxt = CurrentMemoryContext;
+
+ /* We always use the default collation for statistics */
+ ssup.ssup_collation = DEFAULT_COLLATION_OID;
+ ssup.ssup_nulls_first = false;
+
+ PrepareSortSupportFromOrderingOp(mystats->ltopr, &ssup);
+
+ nvals = 0;
+ tmp = (Datum*)palloc0(sizeof(Datum) * numrows);
+
+ for (j = 0; j < numrows; j++)
+ {
+ bool isnull;
+
+ /* fetch the value of the attribute from the sample row */
+ Datum value = heap_getattr(rows[j], attrs->values[i],
+ stats[i]->tupDesc, &isnull);
+
+ if (isnull)
+ continue;
+
+ tmp[nvals++] = value;
+ }
+
+ /* do the sort and stuff only if there are non-NULL values */
+ if (nvals > 0)
+ {
+ /* sort the array of values */
+ qsort_arg((void *) tmp, nvals, sizeof(Datum),
+ compare_scalars_simple, (void *) &ssup);
+
+ /* count distinct values */
+ ndistvalues[i] = 1;
+ for (j = 1; j < nvals; j++)
+ if (compare_scalars_simple(&tmp[j], &tmp[j-1], &ssup) != 0)
+ ndistvalues[i] += 1;
+
+ /* allocate exactly the space needed (we counted ndistinct above) */
+ distvalues[i] = (Datum*)palloc0(sizeof(Datum) * ndistvalues[i]);
+
+ /* now collect distinct values into the array */
+ distvalues[i][0] = tmp[0];
+ ndistvalues[i] = 1;
+
+ for (j = 1; j < nvals; j++)
+ {
+ if (compare_scalars_simple(&tmp[j], &tmp[j-1], &ssup) != 0)
+ {
+ distvalues[i][ndistvalues[i]] = tmp[j];
+ ndistvalues[i] += 1;
+ }
+ }
+ }
+
+ pfree(tmp);
+ }
+
+ /*
+ * The initial bucket may contain NULL values, so we have to create
+ * buckets with NULL-only dimensions.
+ *
+ * FIXME We may need up to 2^ndims buckets - check that there are
+ * enough buckets (MVSTAT_HIST_MAX_BUCKETS >= 2^ndims).
+ */
+ create_null_buckets(histogram, 0, attrs, stats);
+
+ while (histogram->nbuckets < MVSTAT_HIST_MAX_BUCKETS)
+ {
+ MVBucket bucket = select_bucket_to_partition(histogram->nbuckets,
+ histogram->buckets);
+
+ /* no more buckets to partition */
+ if (bucket == NULL)
+ break;
+
+ histogram->buckets[histogram->nbuckets]
+ = partition_bucket(bucket, attrs, stats,
+ ndistvalues, distvalues);
+
+ histogram->nbuckets += 1;
+ }
+
+ /* finalize the frequencies etc. */
+ for (i = 0; i < histogram->nbuckets; i++)
+ {
+ HistogramBuild build_data
+ = ((HistogramBuild)histogram->buckets[i]->build_data);
+
+ /*
+ * The frequency has to be computed from the whole sample, in
+ * case some of the rows were used for MCV (and thus are missing
+ * from the histogram).
+ */
+ histogram->buckets[i]->ntuples
+ = (build_data->numrows * 1.0) / numrows_total;
+ }
+
+ return histogram;
+}
+
+/* fetch the histogram (as a bytea) from the pg_mv_statistic catalog */
+MVSerializedHistogram
+load_mv_histogram(Oid mvoid)
+{
+ bool isnull = false;
+ Datum histogram;
+
+#ifdef USE_ASSERT_CHECKING
+ Form_pg_mv_statistic mvstat;
+#endif
+
+ /* fetch the pg_mv_statistic tuple for the given statistics OID */
+ HeapTuple htup = SearchSysCache1(MVSTATOID, ObjectIdGetDatum(mvoid));
+
+ if (! HeapTupleIsValid(htup))
+ return NULL;
+
+#ifdef USE_ASSERT_CHECKING
+ mvstat = (Form_pg_mv_statistic) GETSTRUCT(htup);
+ Assert(mvstat->hist_enabled && mvstat->hist_built);
+#endif
+
+ histogram = SysCacheGetAttr(MVSTATOID, htup,
+ Anum_pg_mv_statistic_stahist, &isnull);
+
+ Assert(!isnull);
+
+ ReleaseSysCache(htup);
+
+ return deserialize_mv_histogram(DatumGetByteaP(histogram));
+}
+
+/* print some basic info about the histogram */
+Datum
+pg_mv_stats_histogram_info(PG_FUNCTION_ARGS)
+{
+ bytea *data = PG_GETARG_BYTEA_P(0);
+ char *result;
+
+ MVSerializedHistogram hist = deserialize_mv_histogram(data);
+
+ result = palloc0(128);
+ snprintf(result, 128, "nbuckets=%d", hist->nbuckets);
+
+ PG_RETURN_TEXT_P(cstring_to_text(result));
+}
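+
+/*
+ * A minimal usage sketch for the function above (assuming it is
+ * exposed as a SQL function, and that pg_mv_statistic.stahist holds
+ * the serialized histogram):
+ *
+ * SELECT pg_mv_stats_histogram_info(stahist) FROM pg_mv_statistic;
+ */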
+
+
+/* used to pass context into bsearch() */
+static SortSupport ssup_private = NULL;
+
+/*
+ * Serialize the MV histogram into a bytea value. The basic algorithm
+ * is simple, and mostly mimics the MCV serialization:
+ *
+ * (1) perform deduplication for each attribute (separately)
+ * (a) collect all (non-NULL) attribute values from all buckets
+ * (b) sort the data (using 'lt' from VacAttrStats)
+ * (c) remove duplicate values from the array
+ *
+ * (2) serialize the arrays into a bytea value
+ *
+ * (3) process all buckets
+ * (a) replace min/max values with indexes into the arrays
+ *
+ * Each attribute has to be processed separately, because we're mixing
+ * different datatypes, and we don't know what equality means for them.
+ * We're also mixing pass-by-value and pass-by-ref types, and so on.
+ *
+ * We'll use 16-bit (uint16) values for the indexes in step (3), which
+ * is sufficient because we don't allow more than 16k buckets in the
+ * histogram (see MVSTAT_HIST_MAX_BUCKETS). We also rely on the varlena
+ * compression to kick in - most of the high bytes will be 0x00, so it
+ * should work nicely.
+ *
+ *
+ * Deduplication in serialization
+ * ------------------------------
+ * The deduplication is very effective and important here, because every
+ * time we split a bucket, we keep all the boundary values, except for
+ * the dimension that was used for the split. Another way to look at
+ * this is that each split introduces 1 new value (the value used to do
+ * the split). A histogram with M buckets was created by (M-1) splits
+ * of the initial bucket, and each bucket has 2*N boundary values. So
+ * assuming the initial bucket does not have any 'collapsed' dimensions,
+ * the number of distinct values is
+ *
+ * (2*N + (M-1))
+ *
+ * but the total number of boundary values is
+ *
+ * 2*N*M
+ *
+ * which is clearly much higher. For a histogram on two columns, with
+ * 1024 buckets, it's 1027 vs. 4096. Of course, we're not saving all
+ * the difference (because we'll use 32-bit indexes into the values).
+ * But with large values (e.g. stored as varlena), this saves a lot.
+ *
+ * An interesting feature is that the total number of distinct values
+ * does not really grow with the number of dimensions, except for the
+ * size of the initial bucket. After that it only depends on number of
+ * buckets (i.e. number of splits).
+ *
+ * XXX Of course this only holds for the current histogram building
+ * algorithm. Algorithms doing the splits differently (e.g.
+ * producing overlapping buckets) may behave differently.
+ *
+ * TODO This only confirms we can use the uint16 indexes. The worst
+ * that could happen is if all the splits happened by a single
+ * dimension. To exhaust the uint16 this would require ~64k
+ * splits (needs to be reflected in MVSTAT_HIST_MAX_BUCKETS).
+ *
+ * TODO We don't need to use a separate boolean for each flag, instead
+ * use a single char and set bits.
+ *
+ * TODO We might get a bit better compression by considering the actual
+ * data type length. The current implementation treats all data
+ * types passed by value as requiring 8B, but for INT it's actually
+ * just 4B etc.
+ *
+ * OTOH this is only related to the lookup table, and most of the
+ * space is occupied by the buckets (with int16 indexes).
+ *
+ *
+ * Varlena compression
+ * -------------------
+ * This encoding may prevent automatic varlena compression (similarly
+ * to JSONB), because first part of the serialized bytea will be an
+ * array of unique values (although sorted), and pglz decides whether
+ * to compress by trying to compress the first part (~1kB or so). Which
+ * is likely to be poor, due to the lack of repetition.
+ *
+ * One possible cure to that might be storing the buckets first, and
+ * then the deduplicated arrays. The buckets might be better suited
+ * for compression.
+ *
+ * On the other hand the encoding scheme is a context-aware compression,
+ * usually compressing to ~30% (or less, with large data types). So the
+ * lack of pglz compression may be OK.
+ *
+ * XXX But maybe we don't really want to compress this, to save on
+ * planning time?
+ *
+ * TODO Try storing the buckets / deduplicated arrays in reverse order,
+ * measure impact on compression.
+ *
+ *
+ * Deserialization
+ * ---------------
+ * The deserialization is currently implemented so that it reconstructs
+ * the histogram back into the same structures - this involves quite
+ * a few of memcpy() and palloc(), but maybe we could create a special
+ * structure for the serialized histogram, and access the data directly,
+ * without the unpacking.
+ *
+ * Not only it would save some memory and CPU time, but might actually
+ * work better with CPU caches (not polluting the caches).
+ *
+ * TODO Try to keep the compressed form, instead of deserializing it to
+ * MVHistogram/MVBucket.
+ *
+ *
+ * General TODOs
+ * -------------
+ * FIXME This probably leaks memory, or at least uses it inefficiently
+ * (many small palloc() calls instead of a large one).
+ *
+ * TODO Consider packing boolean flags (NULL) for each item into 'char'
+ * or a longer type (instead of using an array of bool items).
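+ *
+ * For orientation, the serialized layout produced below looks roughly
+ * like this (a sketch, not an authoritative format spec):
+ *
+ * [varlena header]
+ * [magic][type][ndimensions][nbuckets]
+ * [DimensionInfo x ndims]
+ * [deduplicated value arrays, dim 0 .. ndims-1]
+ * [serialized buckets, nbuckets x BUCKET_SIZE(ndims)]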
+ */
+bytea *
+serialize_mv_histogram(MVHistogram histogram, int2vector *attrs,
+ VacAttrStats **stats)
+{
+ int i = 0, j = 0;
+ Size total_length = 0;
+
+ bytea *output = NULL;
+ char *data = NULL;
+
+ int nbuckets = histogram->nbuckets;
+ int ndims = histogram->ndimensions;
+
+ /* allocated for serialized bucket data */
+ int bucketsize = BUCKET_SIZE(ndims);
+ char *bucket = palloc0(bucketsize);
+
+ /* values per dimension (and number of non-NULL values) */
+ Datum **values = (Datum**)palloc0(sizeof(Datum*) * ndims);
+ int *counts = (int*)palloc0(sizeof(int) * ndims);
+
+ /* info about dimensions (for deserialize) */
+ DimensionInfo * info
+ = (DimensionInfo *)palloc0(sizeof(DimensionInfo)*ndims);
+
+ /* sort support data */
+ SortSupport ssup = (SortSupport)palloc0(sizeof(SortSupportData)*ndims);
+
+ /* collect and deduplicate values for each dimension separately */
+ for (i = 0; i < ndims; i++)
+ {
+ int count;
+ StdAnalyzeData *tmp = (StdAnalyzeData *)stats[i]->extra_data;
+
+ /* keep important info about the data type */
+ info[i].typlen = stats[i]->attrtype->typlen;
+ info[i].typbyval = stats[i]->attrtype->typbyval;
+
+ /*
+ * Allocate space for all min/max values, including NULLs
+ * (we won't use them, but we don't know how many are there),
+ * and then collect all non-NULL values.
+ */
+ values[i] = (Datum*)palloc0(sizeof(Datum) * nbuckets * 2);
+
+ for (j = 0; j < histogram->nbuckets; j++)
+ {
+ /* skip buckets where this dimension is NULL-only */
+ if (! histogram->buckets[j]->nullsonly[i])
+ {
+ values[i][counts[i]] = histogram->buckets[j]->min[i];
+ counts[i] += 1;
+
+ values[i][counts[i]] = histogram->buckets[j]->max[i];
+ counts[i] += 1;
+ }
+ }
+
+ /* there are just NULL values in this dimension */
+ if (counts[i] == 0)
+ continue;
+
+ /* sort and deduplicate */
+ ssup[i].ssup_cxt = CurrentMemoryContext;
+ ssup[i].ssup_collation = DEFAULT_COLLATION_OID;
+ ssup[i].ssup_nulls_first = false;
+
+ PrepareSortSupportFromOrderingOp(tmp->ltopr, &ssup[i]);
+
+ qsort_arg(values[i], counts[i], sizeof(Datum),
+ compare_scalars_simple, &ssup[i]);
+
+ /*
+ * Walk through the array and eliminate duplicate values, but
+ * keep the ordering (so that we can do bsearch later). We know
+ * there's at least 1 item, so we can skip the first element.
+ */
+ count = 1; /* number of deduplicated items */
+ for (j = 1; j < counts[i]; j++)
+ {
+ /* if it's different from the previous value, we need to keep it */
+ if (compare_datums_simple(values[i][j-1], values[i][j], &ssup[i]) != 0)
+ {
+ /* XXX: not needed if (count == j) */
+ values[i][count] = values[i][j];
+ count += 1;
+ }
+ }
+
+ /* make sure we fit into uint16 */
+ Assert(count <= UINT16_MAX);
+
+ /* keep info about the deduplicated count */
+ info[i].nvalues = count;
+
+ /* compute size of the serialized data */
+ if (info[i].typlen > 0)
+ /* byval or byref, but with fixed length (name, tid, ...) */
+ info[i].nbytes = info[i].nvalues * info[i].typlen;
+ else if (info[i].typlen == -1)
+ /* varlena, so just use VARSIZE_ANY */
+ for (j = 0; j < info[i].nvalues; j++)
+ info[i].nbytes += VARSIZE_ANY(values[i][j]);
+ else if (info[i].typlen == -2)
+ /* cstring, so strlen + 1 byte for the '\0' terminator */
+ for (j = 0; j < info[i].nvalues; j++)
+ info[i].nbytes += strlen(DatumGetPointer(values[i][j])) + 1;
+ else
+ elog(ERROR, "unknown data type typbyval=%d typlen=%d",
+ info[i].typbyval, info[i].typlen);
+ }
+
+ /*
+ * Now we finally know how much space we'll need for the serialized
+ * histogram, as it contains these fields:
+ *
+ * - length (4B) for varlena
+ * - magic (4B)
+ * - type (4B)
+ * - ndimensions (4B)
+ * - nbuckets (4B)
+ * - info (ndim * sizeof(DimensionInfo)
+ * - arrays of values for each dimension
+ * - serialized buckets (nbuckets * bucketsize)
+ *
+ * So the 'header' size is 20B + ndim * sizeof(DimensionInfo) and
+ * then we'll place the data (and buckets).
+ */
+ total_length = (sizeof(int32) + offsetof(MVHistogramData, buckets)
+ + ndims * sizeof(DimensionInfo)
+ + nbuckets * bucketsize);
+
+ /* account for the deduplicated data */
+ for (i = 0; i < ndims; i++)
+ total_length += info[i].nbytes;
+
+ /* enforce an arbitrary limit of 10MB */
+ if (total_length > (10 * 1024 * 1024))
+ elog(ERROR, "serialized histogram exceeds 10MB (%ld > %d)",
+ total_length, (10 * 1024 * 1024));
+
+ /* allocate space for the serialized histogram list, set header */
+ output = (bytea*)palloc0(total_length);
+ SET_VARSIZE(output, total_length);
+
+ /* we'll use 'data' to keep track of the place to write data */
+ data = VARDATA(output);
+
+ memcpy(data, histogram, offsetof(MVHistogramData, buckets));
+ data += offsetof(MVHistogramData, buckets);
+
+ memcpy(data, info, sizeof(DimensionInfo) * ndims);
+ data += sizeof(DimensionInfo) * ndims;
+
+ /* value array for each dimension */
+ for (i = 0; i < ndims; i++)
+ {
+#ifdef USE_ASSERT_CHECKING
+ char *tmp = data;
+#endif
+ for (j = 0; j < info[i].nvalues; j++)
+ {
+ if (info[i].typlen > 0)
+ {
+ /* passed by value or by reference, but with fixed length */
+ memcpy(data, &values[i][j], info[i].typlen);
+ data += info[i].typlen;
+ }
+ else if (info[i].typlen == -1)
+ {
+ /* varlena */
+ memcpy(data, DatumGetPointer(values[i][j]),
+ VARSIZE_ANY(values[i][j]));
+ data += VARSIZE_ANY(values[i][j]);
+ }
+ else if (info[i].typlen == -2)
+ {
+ /* cstring (don't forget the \0 terminator!) */
+ memcpy(data, DatumGetPointer(values[i][j]),
+ strlen(DatumGetPointer(values[i][j])) + 1);
+ data += strlen(DatumGetPointer(values[i][j])) + 1;
+ }
+ }
+ Assert((data - tmp) == info[i].nbytes);
+ }
+
+ /* and finally, the histogram buckets */
+ for (i = 0; i < nbuckets; i++)
+ {
+ /* don't write beyond the allocated space */
+ Assert(data <= (char*)output + total_length - bucketsize);
+
+ /* reset the values for each item */
+ memset(bucket, 0, bucketsize);
+
+ *BUCKET_NTUPLES(bucket) = histogram->buckets[i]->ntuples;
+
+ for (j = 0; j < ndims; j++)
+ {
+ /* do the lookup only for non-NULL values */
+ if (! histogram->buckets[i]->nullsonly[j])
+ {
+ uint16 idx;
+ Datum * v = NULL;
+ ssup_private = &ssup[j];
+
+ /* min boundary */
+ v = (Datum*)bsearch(&histogram->buckets[i]->min[j],
+ values[j], info[j].nvalues, sizeof(Datum),
+ bsearch_comparator);
+
+ if (v == NULL)
+ elog(ERROR, "value for dim %d not found in array", j);
+
+ /* compute index within the array */
+ idx = (v - values[j]);
+
+ Assert((idx >= 0) && (idx < info[j].nvalues));
+
+ BUCKET_MIN_INDEXES(bucket, ndims)[j] = idx;
+
+ /* max boundary */
+ v = (Datum*)bsearch(&histogram->buckets[i]->max[j],
+ values[j], info[j].nvalues, sizeof(Datum),
+ bsearch_comparator);
+
+ if (v == NULL)
+ elog(ERROR, "value for dim %d not found in array", j);
+
+ /* compute index within the array */
+ idx = (v - values[j]);
+
+ Assert((idx >= 0) && (idx < info[j].nvalues));
+
+ BUCKET_MAX_INDEXES(bucket, ndims)[j] = idx;
+ }
+ }
+
+ /* copy flags (nulls, min/max inclusive) */
+ memcpy(BUCKET_NULLS_ONLY(bucket, ndims),
+ histogram->buckets[i]->nullsonly, sizeof(bool) * ndims);
+
+ memcpy(BUCKET_MIN_INCL(bucket, ndims),
+ histogram->buckets[i]->min_inclusive, sizeof(bool) * ndims);
+
+ memcpy(BUCKET_MAX_INCL(bucket, ndims),
+ histogram->buckets[i]->max_inclusive, sizeof(bool) * ndims);
+
+ /* copy the item into the array */
+ memcpy(data, bucket, bucketsize);
+
+ data += bucketsize;
+ }
+
+ /* at this point we expect to match the total_length exactly */
+ Assert((data - (char*)output) == total_length);
+
+ /* FIXME free the values/counts arrays here */
+
+ return output;
+}
+
+/*
+ * Returns histogram in a partially-serialized form (keeps the boundary
+ * values deduplicated, so that it's possible to optimize the estimation
+ * part by caching function call results between buckets etc.).
+ */
+MVSerializedHistogram
+deserialize_mv_histogram(bytea * data)
+{
+ int i = 0, j = 0;
+
+ Size expected_size;
+ char *tmp = NULL;
+
+ MVSerializedHistogram histogram;
+ DimensionInfo *info;
+
+ int nbuckets;
+ int ndims;
+ int bucketsize;
+
+ /* temporary deserialization buffer */
+ int bufflen;
+ char *buff;
+ char *ptr;
+
+ if (data == NULL)
+ return NULL;
+
+ if (VARSIZE_ANY_EXHDR(data) < offsetof(MVSerializedHistogramData,buckets))
+ elog(ERROR, "invalid histogram size %ld (expected at least %ld)",
+ VARSIZE_ANY_EXHDR(data), offsetof(MVSerializedHistogramData,buckets));
+
+ /* read the histogram header */
+ histogram
+ = (MVSerializedHistogram)palloc(sizeof(MVSerializedHistogramData));
+
+ /* initialize pointer to the data part (skip the varlena header) */
+ tmp = VARDATA(data);
+
+ /* get the header and perform basic sanity checks */
+ memcpy(histogram, tmp, offsetof(MVSerializedHistogramData, buckets));
+ tmp += offsetof(MVSerializedHistogramData, buckets);
+
+ if (histogram->magic != MVSTAT_HIST_MAGIC)
+ elog(ERROR, "invalid histogram magic %d (expected %dd)",
+ histogram->magic, MVSTAT_HIST_MAGIC);
+
+ if (histogram->type != MVSTAT_HIST_TYPE_BASIC)
+ elog(ERROR, "invalid histogram type %d (expected %dd)",
+ histogram->type, MVSTAT_HIST_TYPE_BASIC);
+
+ nbuckets = histogram->nbuckets;
+ ndims = histogram->ndimensions;
+ bucketsize = BUCKET_SIZE(ndims);
+
+ Assert((nbuckets > 0) && (nbuckets <= MVSTAT_HIST_MAX_BUCKETS));
+ Assert((ndims >= 2) && (ndims <= MVSTATS_MAX_DIMENSIONS));
+
+ /*
+ * Compute the size we expect for these parameters (incomplete at
+ * this point - we have yet to add the sizes of the value arrays,
+ * stored in the DimensionInfo records).
+ */
+ expected_size = offsetof(MVSerializedHistogramData,buckets) +
+ ndims * sizeof(DimensionInfo) +
+ (nbuckets * bucketsize);
+
+ /* check that we have at least the DimensionInfo records */
+ if (VARSIZE_ANY_EXHDR(data) < expected_size)
+ elog(ERROR, "invalid histogram size %ld (expected %ld)",
+ VARSIZE_ANY_EXHDR(data), expected_size);
+
+ info = (DimensionInfo*)(tmp);
+ tmp += ndims * sizeof(DimensionInfo);
+
+ /* account for the value arrays */
+ for (i = 0; i < ndims; i++)
+ expected_size += info[i].nbytes;
+
+ if (VARSIZE_ANY_EXHDR(data) != expected_size)
+ elog(ERROR, "invalid histogram size %ld (expected %ld)",
+ VARSIZE_ANY_EXHDR(data), expected_size);
+
+ /* looks OK - not corrupted or something */
+
+ /* now let's allocate a single buffer for all the values and counts */
+
+ bufflen = (sizeof(int) + sizeof(Datum*)) * ndims;
+ for (i = 0; i < ndims; i++)
+ {
+ /* no need to allocate space for by-value types with length matching Datum */
+ if (! (info[i].typbyval && (info[i].typlen == sizeof(Datum))))
+ bufflen += (sizeof(Datum) * info[i].nvalues);
+ }
+
+ /* also, include space for the result, tracking the buckets */
+ bufflen += nbuckets * (
+ sizeof(MVSerializedBucket) + /* bucket pointer */
+ sizeof(MVSerializedBucketData)); /* bucket data */
+
+ buff = palloc(bufflen);
+ ptr = buff;
+
+ histogram->nvalues = (int*)ptr;
+ ptr += (sizeof(int) * ndims);
+
+ histogram->values = (Datum**)ptr;
+ ptr += (sizeof(Datum*) * ndims);
+
+ /*
+ * FIXME This uses pointers to the original data array (the types
+ * not passed by value), so when someone frees the memory,
+ * e.g. by doing something like this:
+ *
+ * bytea * data = ... fetch the data from catalog ...
+ * MCVList mcvlist = deserialize_mcv_list(data);
+ * pfree(data);
+ *
+ * then 'mcvlist' references the freed memory. This needs to
+ * copy the pieces.
+ *
+ * TODO same as in MCV deserialization / consider moving to common.c
+ */
+ for (i = 0; i < ndims; i++)
+ {
+ histogram->nvalues[i] = info[i].nvalues;
+
+ if (info[i].typbyval && info[i].typlen == sizeof(Datum))
+ {
+ /* passed by value / Datum - simply reuse the array */
+ histogram->values[i] = (Datum*)tmp;
+ tmp += info[i].nbytes;
+ }
+ else
+ {
+ /* all the varlena data need a chunk from the buffer */
+ histogram->values[i] = (Datum*)ptr;
+ ptr += (sizeof(Datum) * info[i].nvalues);
+
+ if (info[i].typbyval)
+ {
+ /* passed by value, but smaller than Datum */
+ for (j = 0; j < info[i].nvalues; j++)
+ {
+ /* copy the value into the Datum array */
+ memcpy(&histogram->values[i][j], tmp, info[i].typlen);
+ tmp += info[i].typlen;
+ }
+ }
+ else if (info[i].typlen > 0)
+ {
+ /* passed by reference, but fixed length (name, tid, ...) */
+ for (j = 0; j < info[i].nvalues; j++)
+ {
+ /* just point into the array */
+ histogram->values[i][j] = PointerGetDatum(tmp);
+ tmp += info[i].typlen;
+ }
+ }
+ else if (info[i].typlen == -1)
+ {
+ /* varlena */
+ for (j = 0; j < info[i].nvalues; j++)
+ {
+ /* just point into the array */
+ histogram->values[i][j] = PointerGetDatum(tmp);
+ tmp += VARSIZE_ANY(tmp);
+ }
+ }
+ else if (info[i].typlen == -2)
+ {
+ /* cstring */
+ for (j = 0; j < info[i].nvalues; j++)
+ {
+ /* just point into the array */
+ histogram->values[i][j] = PointerGetDatum(tmp);
+ tmp += (strlen(tmp) + 1); /* don't forget the \0 */
+ }
+ }
+ }
+ }
+
+ histogram->buckets = (MVSerializedBucket*)ptr;
+ ptr += (sizeof(MVSerializedBucket) * nbuckets);
+
+ for (i = 0; i < nbuckets; i++)
+ {
+ MVSerializedBucket bucket = (MVSerializedBucket)ptr;
+ ptr += sizeof(MVSerializedBucketData);
+
+ bucket->ntuples = *BUCKET_NTUPLES(tmp);
+ bucket->nullsonly = BUCKET_NULLS_ONLY(tmp, ndims);
+ bucket->min_inclusive = BUCKET_MIN_INCL(tmp, ndims);
+ bucket->max_inclusive = BUCKET_MAX_INCL(tmp, ndims);
+
+ bucket->min = BUCKET_MIN_INDEXES(tmp, ndims);
+ bucket->max = BUCKET_MAX_INDEXES(tmp, ndims);
+
+ histogram->buckets[i] = bucket;
+
+ Assert(tmp <= (char*)data + VARSIZE_ANY(data));
+
+ tmp += bucketsize;
+ }
+
+ /* at this point we expect to match the total_length exactly */
+ Assert((tmp - VARDATA(data)) == expected_size);
+
+ /* we should exhaust the output buffer exactly */
+ Assert((ptr - buff) == bufflen);
+
+ return histogram;
+}
+
+/*
+ * Build the initial bucket, which will be then split into smaller ones.
+ */
+static MVBucket
+create_initial_mv_bucket(int numrows, HeapTuple *rows, int2vector *attrs,
+ VacAttrStats **stats)
+{
+ int i;
+ int numattrs = attrs->dim1;
+ HistogramBuild data = NULL;
+
+ /* TODO allocate bucket as a single piece, including all the fields. */
+ MVBucket bucket = (MVBucket)palloc0(sizeof(MVBucketData));
+
+ Assert(numrows > 0);
+ Assert(rows != NULL);
+ Assert((numattrs >= 2) && (numattrs <= MVSTATS_MAX_DIMENSIONS));
+
+ /* allocate the per-dimension arrays */
+
+ /* flags for null-only dimensions */
+ bucket->nullsonly = (bool*)palloc0(numattrs * sizeof(bool));
+
+ /* inclusiveness boundaries - lower/upper bounds */
+ bucket->min_inclusive = (bool*)palloc0(numattrs * sizeof(bool));
+ bucket->max_inclusive = (bool*)palloc0(numattrs * sizeof(bool));
+
+ /* lower/upper boundaries */
+ bucket->min = (Datum*)palloc0(numattrs * sizeof(Datum));
+ bucket->max = (Datum*)palloc0(numattrs * sizeof(Datum));
+
+ /* build-data */
+ data = (HistogramBuild)palloc0(sizeof(HistogramBuildData));
+
+ /* number of distinct values (per dimension) */
+ data->ndistincts = (uint32*)palloc0(numattrs * sizeof(uint32));
+
+ /* all the sample rows fall into the initial bucket */
+ data->numrows = numrows;
+ data->rows = rows;
+
+ bucket->build_data = data;
+
+ /*
+ * Update the number of ndistinct combinations in the bucket (which
+ * we use when selecting bucket to partition), and then number of
+ * distinct values for each partition (which we use when choosing
+ * which dimension to split).
+ */
+ update_bucket_ndistinct(bucket, attrs, stats);
+
+ /* Update ndistinct (and also set min/max) for all dimensions. */
+ for (i = 0; i < numattrs; i++)
+ update_dimension_ndistinct(bucket, i, attrs, stats, true);
+
+ return bucket;
+}
+
+/*
+ * Choose the bucket to partition next.
+ *
+ * The current criterion is rather simple, chosen so that the algorithm
+ * produces buckets with about equal frequency and regular size. We
+ * select the bucket containing the most sampled rows (among those
+ * that may still be split), and then split it by the longest dimension.
+ *
+ * The distinct values are uniformly mapped to the [0,1] interval, and
+ * this is used to compute the length of the value range.
+ *
+ * NOTE: This is not the same array used for deduplication, as this
+ * contains values for all the tuples from the sample, not just
+ * the boundary values.
+ *
+ * Returns either pointer to the bucket selected to be partitioned,
+ * or NULL if there are no buckets that may be split (i.e. all buckets
+ * contain a single distinct value).
+ *
+ * TODO Consider other partitioning criteria (v-optimal, maxdiff etc.).
+ * For example use the "bucket volume" (product of dimension
+ * lengths) to select the bucket.
+ *
+ * We need buckets containing about the same number of tuples (so
+ * about the same frequency), as that limits the error when we
+ * match the bucket partially (in that case use 1/2 the bucket).
+ *
+ * We also need buckets with "regular" size, i.e. not "narrow" in
+ * some dimensions and "wide" in the others, because that makes
+ * partial matches more likely and increases the estimation error,
+ * especially when the clauses match many buckets partially. This
+ * is especially serious for OR-clauses, because in that case any
+ * of them may add the bucket as a (partial) match. With AND-clauses
+ * all the clauses have to match the bucket, which makes this issue
+ * somewhat less pressing.
+ *
+ * For example this table:
+ *
+ * CREATE TABLE t AS SELECT i AS a, i AS b
+ * FROM generate_series(1,1000000) s(i);
+ * CREATE STATISTICS s ON t (a, b) WITH (histogram);
+ * ANALYZE t;
+ *
+ * It's a very specific (and perhaps artificial) example, because
+ * every bucket always has exactly the same number of distinct
+ * values in all dimensions, which makes the partitioning tricky.
+ *
+ * Then:
+ *
+ * SELECT * FROM t WHERE a < 10 AND b < 10;
+ *
+ * is estimated to return ~120 rows, while in reality it returns 9.
+ *
+ * QUERY PLAN
+ * ----------------------------------------------------------------
+ * Seq Scan on t (cost=0.00..19425.00 rows=117 width=8)
+ * (actual time=0.185..270.774 rows=9 loops=1)
+ * Filter: ((a < 10) AND (b < 10))
+ * Rows Removed by Filter: 999991
+ *
+ * while the query using OR clauses is estimated like this:
+ *
+ * QUERY PLAN
+ * ----------------------------------------------------------------
+ * Seq Scan on t (cost=0.00..19425.00 rows=8100 width=8)
+ * (actual time=0.118..189.919 rows=9 loops=1)
+ * Filter: ((a < 10) OR (b < 10))
+ * Rows Removed by Filter: 999991
+ *
+ * which is clearly much worse. This happens because the histogram
+ * contains buckets like this:
+ *
+ * bucket 592 [3 30310] [30134 30593] => [0.000233]
+ *
+ * i.e. the length of "a" dimension is (30310-3)=30307, while the
+ * length of "b" is (30593-30134)=459. So the "b" dimension is much
+ * narrower than "a". Of course, there are buckets where "b" is the
+ * wider dimension.
+ *
+ * This is partially mitigated by selecting the "longest" dimension
+ * in partition_bucket() but that only happens after we already
+ * selected the bucket. So if we never select the bucket, we can't
+ * really fix it there.
+ *
+ * The other reason why this particular example behaves so poorly
+ * is due to the way we split the partition in partition_bucket().
+ * Currently we attempt to divide the bucket into two parts with
+ * the same number of sampled tuples (frequency), but that does not
+ * work well when all the tuples are squashed on one end of the
+ * bucket (e.g. exactly at the diagonal, as a=b). In that case we
+ * split the bucket into a tiny bucket on the diagonal, and a huge
+ * remaining part of the bucket, which is still going to be narrow
+ * and we're unlikely to fix that.
+ *
+ * So perhaps we need two partitioning strategies - one aiming to
+ * split buckets with high frequency (number of sampled rows), the
+ * other aiming to split "large" buckets. And alternating between
+ * them, somehow.
+ *
+ * TODO Allowing the bucket to degenerate to a single combination of
+ * values makes it a rather strange MCV list. Maybe we should use
+ * a higher lower boundary, or maybe make the selection criteria
+ * more complex (e.g. consider number of rows in the bucket, etc.).
+ *
+ * That however is different from buckets 'degenerated' only for
+ * some dimensions (e.g. half of them), which is perfectly
+ * appropriate for statistics on a combination of low and high
+ * cardinality columns.
+ */
+static MVBucket
+select_bucket_to_partition(int nbuckets, MVBucket * buckets)
+{
+ int i;
+ int numrows = 0;
+ MVBucket bucket = NULL;
+
+ for (i = 0; i < nbuckets; i++)
+ {
+ HistogramBuild data = (HistogramBuild)buckets[i]->build_data;
+ /* if the bucket is splittable and contains more rows, use it */
+ if ((data->ndistinct > 2) &&
+ (data->numrows > numrows) &&
+ (data->numrows >= MIN_BUCKET_ROWS))
+ {
+ bucket = buckets[i];
+ numrows = data->numrows;
+ }
+ }
+
+ /* may be NULL if there are no buckets eligible for partitioning */
+ return bucket;
+}
+
+/*
+ * A simple bucket partitioning implementation - we choose the longest
+ * bucket dimension, measured using the array of distinct values built
+ * at the very beginning of the build.
+ *
+ * We map all the distinct values to a [0,1] interval, uniformly
+ * distributed, and then use this to measure length. It's essentially
+ * the number of distinct values within the range, normalized to [0,1].
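+ *
+ * For example (a hypothetical illustration): if a dimension has 100
+ * deduplicated distinct values and a bucket spans the 20th through
+ * the 70th of them, the normalized length of that dimension is
+ * (70 - 20) / 100 = 0.5.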
+ *
+ * Then we choose a 'middle' value splitting the bucket into two parts
+ * with roughly the same frequency.
+ *
+ * This splits the bucket by tweaking the existing one, and returning
+ * the new bucket (essentially shrinking the existing one in-place and
+ * returning the other "half" as a new bucket). The caller is responsible
+ * for adding the new bucket into the list of buckets.
+ *
+ * There are multiple ways to build the histogram, mostly differing in
+ * the partitioning criteria - i.e. in how to choose the bucket to split
+ * and the dimension most in need of a split. For a nice summary and
+ * general overview, see
+ * "rK-Hist : an R-Tree based histogram for multi-dimensional selectivity
+ * estimation" thesis by J. A. Lopez, Concordia University, p.34-37 (and
+ * possibly p. 32-34 for explanation of the terms).
+ *
+ * TODO It requires care to prevent splitting only one dimension and not
+ * splitting another one at all (which might happen easily in case
+ * of strongly dependent columns - e.g. y=x). The current algorithm
+ * minimizes this, but it may still happen for perfectly dependent
+ * examples (when all the dimensions have equal length, the first
+ * one will be selected).
+ *
+ * TODO Should probably consider statistics target for the columns (e.g.
+ * to split dimensions with higher statistics target more frequently).
+ */
+static MVBucket
+partition_bucket(MVBucket bucket, int2vector *attrs,
+ VacAttrStats **stats,
+ int *ndistvalues, Datum **distvalues)
+{
+ int i;
+ int dimension;
+ int numattrs = attrs->dim1;
+
+ Datum split_value;
+ MVBucket new_bucket;
+ HistogramBuild new_data;
+
+ /* needed for sort, when looking for the split value */
+ bool isNull;
+ int nvalues = 0;
+ HistogramBuild data = (HistogramBuild)bucket->build_data;
+ StdAnalyzeData * mystats = NULL;
+ ScalarItem * values = (ScalarItem*)palloc0(data->numrows * sizeof(ScalarItem));
+ SortSupportData ssup;
+
+ /* looking for the split value */
+ int nrows = 1; /* number of rows below current value */
+ double delta;
+
+ /* needed when splitting the values */
+ HeapTuple * oldrows = data->rows;
+ int oldnrows = data->numrows;
+
+ /*
+ * We can't split buckets with a single distinct value (this also
+ * disqualifies NULL-only dimensions). Also, there have to be multiple
+ * sample rows (otherwise there could not be multiple distinct values).
+ */
+ Assert(data->ndistinct > 1);
+ Assert(data->numrows > 1);
+ Assert((numattrs >= 2) && (numattrs <= MVSTATS_MAX_DIMENSIONS));
+
+ /*
+ * Look for the next dimension to split.
+ */
+ delta = 0.0;
+ dimension = -1;
+
+ for (i = 0; i < numattrs; i++)
+ {
+ Datum *a, *b;
+
+ mystats = (StdAnalyzeData *) stats[i]->extra_data;
+
+ /* initialize sort support, etc. */
+ memset(&ssup, 0, sizeof(ssup));
+ ssup.ssup_cxt = CurrentMemoryContext;
+
+ /* We always use the default collation for statistics */
+ ssup.ssup_collation = DEFAULT_COLLATION_OID;
+ ssup.ssup_nulls_first = false;
+
+ PrepareSortSupportFromOrderingOp(mystats->ltopr, &ssup);
+
+ /* can't split NULL-only dimension */
+ if (bucket->nullsonly[i])
+ continue;
+
+ /* can't split dimension with a single ndistinct value */
+ if (data->ndistincts[i] <= 1)
+ continue;
+
+ /* sort support for the bsearch_comparator */
+ ssup_private = &ssup;
+
+ /* search for min boundary in the distinct list */
+ a = (Datum*)bsearch(&bucket->min[i],
+ distvalues[i], ndistvalues[i],
+ sizeof(Datum), bsearch_comparator);
+
+ b = (Datum*)bsearch(&bucket->max[i],
+ distvalues[i], ndistvalues[i],
+ sizeof(Datum), bsearch_comparator);
+
+ /* if this dimension is 'longer', partition by it */
+ if (((b-a)*1.0 / ndistvalues[i]) > delta)
+ {
+ delta = ((b-a)*1.0 / ndistvalues[i]);
+ dimension = i;
+ }
+ }
+
+ /*
+ * If we haven't found a dimension here, we've done something
+ * wrong in select_bucket_to_partition.
+ */
+ Assert(dimension != -1);
+
+ /*
+ * Walk through the selected dimension, collect and sort the values
+ * and then choose the value to use as the new boundary.
+ */
+ mystats = (StdAnalyzeData *) stats[dimension]->extra_data;
+
+ /* initialize sort support, etc. */
+ memset(&ssup, 0, sizeof(ssup));
+ ssup.ssup_cxt = CurrentMemoryContext;
+
+ /* We always use the default collation for statistics */
+ ssup.ssup_collation = DEFAULT_COLLATION_OID;
+ ssup.ssup_nulls_first = false;
+
+ PrepareSortSupportFromOrderingOp(mystats->ltopr, &ssup);
+
+ for (i = 0; i < data->numrows; i++)
+ {
+ /* remember the index of the sample row, to make the partitioning simpler */
+ values[nvalues].value = heap_getattr(data->rows[i], attrs->values[dimension],
+ stats[dimension]->tupDesc, &isNull);
+ values[nvalues].tupno = i;
+
+ /* no NULL values allowed here (we don't do splits by null-only dimensions) */
+ Assert(!isNull);
+
+ nvalues++;
+ }
+
+ /* sort the array of values */
+ qsort_arg((void *) values, nvalues, sizeof(ScalarItem),
+ compare_scalars_partition, (void *) &ssup);
+
+ /*
+ * We know there are bucket->ndistincts[dimension] distinct values
+ * in this dimension, and we want to split this into half, so walk
+ * through the array and stop once we see (ndistinct/2) values.
+ *
+ * We always choose the "next" value, i.e. (n/2+1)-th distinct value,
+ * and use it as an exclusive upper boundary (and inclusive lower
+ * boundary).
+ *
+ * TODO Maybe we should use "average" of the two middle distinct
+ * values (at least for even distinct counts), but that would
+ * require being able to compute an average (which does not work
+ * for non-arithmetic types).
+ *
+ * TODO Another option is to look for a split that'd give about
+ * 50% tuples (not distinct values) in each partition. That
+ * might work better when there are a few very frequent
+ * values, and many rare ones.
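+ *
+ * A small worked example (hypothetical data): with sorted values
+ * {1,1,2,2,2,3,3,8,8,9} (numrows = 10), the value changes at
+ * positions i = 2, 5, 7 and 9. The loop below picks the change
+ * closest to numrows/2 = 5, i.e. i = 5 with split_value = 3, so
+ * 5 rows stay in the current bucket and 5 move to the new one.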
+ */
+ delta = fabs(data->numrows);
+ split_value = values[0].value;
+
+ for (i = 1; i < data->numrows; i++)
+ {
+ if (values[i].value != values[i-1].value)
+ {
+ /* are we closer to splitting the bucket in half? */
+ if (fabs(i - data->numrows/2.0) < delta)
+ {
+ /* let's assume we'll use this value for the split */
+ split_value = values[i].value;
+ delta = fabs(i - data->numrows/2.0);
+ nrows = i;
+ }
+ }
+ }
+
+ Assert(nrows > 0);
+ Assert(nrows < data->numrows);
+
+ /* create the new bucket as an (incomplete) copy of the one being partitioned */
+ new_bucket = copy_mv_bucket(bucket, numattrs);
+ new_data = (HistogramBuild)new_bucket->build_data;
+
+ /*
+ * Do the actual split of the chosen dimension, using the split value as the
+ * upper bound for the existing bucket, and lower bound for the new one.
+ */
+ bucket->max[dimension] = split_value;
+ new_bucket->min[dimension] = split_value;
+
+ bucket->max_inclusive[dimension] = false;
+ new_bucket->min_inclusive[dimension] = true;
+
+ /*
+ * Redistribute the sample tuples using the 'ScalarItem->tupno'
+ * index. We know 'nrows' rows should remain in the original
+ * bucket and the rest goes to the new one.
+ */
+
+ data->rows = (HeapTuple*)palloc0(nrows * sizeof(HeapTuple));
+ new_data->rows = (HeapTuple*)palloc0((oldnrows - nrows) * sizeof(HeapTuple));
+
+ data->numrows = nrows;
+ new_data->numrows = (oldnrows - nrows);
+
+ /*
+ * The first nrows should go to the first bucket, the rest should
+ * go to the new one. Use the tupno field to get the actual HeapTuple
+ * row from the original array of sample rows.
+ */
+ for (i = 0; i < nrows; i++)
+ memcpy(&data->rows[i], &oldrows[values[i].tupno], sizeof(HeapTuple));
+
+ for (i = nrows; i < oldnrows; i++)
+ memcpy(&new_data->rows[i-nrows], &oldrows[values[i].tupno], sizeof(HeapTuple));
+
+ /* update ndistinct values for the buckets (total and per dimension) */
+ update_bucket_ndistinct(bucket, attrs, stats);
+ update_bucket_ndistinct(new_bucket, attrs, stats);
+
+ /*
+ * TODO We don't need to do this for the dimension we used for split,
+ * because we know how many distinct values went to each partition.
+ */
+ for (i = 0; i < numattrs; i++)
+ {
+ update_dimension_ndistinct(bucket, i, attrs, stats, false);
+ update_dimension_ndistinct(new_bucket, i, attrs, stats, false);
+ }
+
+ pfree(oldrows);
+ pfree(values);
+
+ return new_bucket;
+}
+
+/*
+ * Copy a histogram bucket. The copy does not include the build-time
+ * data, i.e. sampled rows etc.
+ */
+static MVBucket
+copy_mv_bucket(MVBucket bucket, uint32 ndimensions)
+{
+ /* TODO allocate as a single piece (including all the fields) */
+ MVBucket new_bucket = (MVBucket)palloc0(sizeof(MVBucketData));
+ HistogramBuild data = (HistogramBuild)palloc0(sizeof(HistogramBuildData));
+
+ /* Copy only the attributes that will stay the same after the split -
+ * the rest will be recomputed afterwards. */
+
+ /* allocate the per-dimension arrays */
+ new_bucket->nullsonly = (bool*)palloc0(ndimensions * sizeof(bool));
+
+ /* inclusiveness boundaries - lower/upper bounds */
+ new_bucket->min_inclusive = (bool*)palloc0(ndimensions * sizeof(bool));
+ new_bucket->max_inclusive = (bool*)palloc0(ndimensions * sizeof(bool));
+
+ /* lower/upper boundaries */
+ new_bucket->min = (Datum*)palloc0(ndimensions * sizeof(Datum));
+ new_bucket->max = (Datum*)palloc0(ndimensions * sizeof(Datum));
+
+ /* copy data */
+ memcpy(new_bucket->nullsonly, bucket->nullsonly, ndimensions * sizeof(bool));
+
+ memcpy(new_bucket->min_inclusive, bucket->min_inclusive, ndimensions*sizeof(bool));
+ memcpy(new_bucket->min, bucket->min, ndimensions*sizeof(Datum));
+
+ memcpy(new_bucket->max_inclusive, bucket->max_inclusive, ndimensions*sizeof(bool));
+ memcpy(new_bucket->max, bucket->max, ndimensions*sizeof(Datum));
+
+ /* allocate and copy the interesting part of the build data */
+ data->ndistincts = (uint32*)palloc0(ndimensions * sizeof(uint32));
+
+ new_bucket->build_data = data;
+
+ return new_bucket;
+}
+
+/*
+ * Counts the number of distinct value combinations in the bucket. The
+ * values are collected into an array of sort items, sorted using a
+ * multi-column comparator, and neighboring items are then compared to
+ * count the distinct combinations.
+ *
+ * TODO This might evaluate and store the distinct counts for all
+ * possible attribute combinations. The assumption is this might be
+ * useful for estimating things like GROUP BY cardinalities (e.g.
+ * in cases when some buckets contain a lot of low-frequency
+ * combinations, and other buckets contain few high-frequency ones).
+ *
+ * But it's unclear whether it's worth the price. Computing this
+ * is actually quite cheap, because it may be evaluated at the very
+ * end, when the buckets are rather small (so sorting it in 2^N ways
+ * is not a big deal). Assuming the partitioning algorithm does not
+ * use these values to make its decisions, of course (the current
+ * algorithm does not).
+ *
+ * The overhead with storing, fetching and parsing the data is more
+ * concerning - adding 2^N values per bucket (even if it's just
+ * a 1B or 2B value) would significantly bloat the histogram, and
+ * thus the impact on optimizer. Which is not really desirable.
+ *
+ * TODO This only updates the ndistinct for the sample (or bucket), but
+ * we eventually need an estimate of the total number of distinct
+ * values in the dataset. It's possible to either use the current
+ * 1D approach (i.e., if it's more than 10% of the sample, assume
+ * it's proportional to the number of rows), or to implement the
+ * estimator suggested in the article, supposedly
+ * giving 'optimal' estimates (w.r.t. probability of error).
+ */
+static void
+update_bucket_ndistinct(MVBucket bucket, int2vector *attrs, VacAttrStats ** stats)
+{
+ int i, j;
+ int numattrs = attrs->dim1;
+
+ HistogramBuild data = (HistogramBuild)bucket->build_data;
+ int numrows = data->numrows;
+
+ MultiSortSupport mss = multi_sort_init(numattrs);
+
+ /*
+ * We could collect this while walking through the attributes
+ * elsewhere (as it is, we have to call heap_getattr twice).
+ */
+ SortItem *items = (SortItem*)palloc0(numrows * sizeof(SortItem));
+ Datum *values = (Datum*)palloc0(numrows * sizeof(Datum) * numattrs);
+ bool *isnull = (bool*)palloc0(numrows * sizeof(bool) * numattrs);
+
+ for (i = 0; i < numrows; i++)
+ {
+ items[i].values = &values[i * numattrs];
+ items[i].isnull = &isnull[i * numattrs];
+ }
+
+ /* prepare the sort functions for all the dimensions */
+ for (i = 0; i < numattrs; i++)
+ multi_sort_add_dimension(mss, i, i, stats);
+
+ /* collect the values */
+ for (i = 0; i < numrows; i++)
+ for (j = 0; j < numattrs; j++)
+ items[i].values[j]
+ = heap_getattr(data->rows[i], attrs->values[j],
+ stats[j]->tupDesc, &items[i].isnull[j]);
+
+ qsort_arg((void *) items, numrows, sizeof(SortItem),
+ multi_sort_compare, mss);
+
+ data->ndistinct = 1;
+
+ for (i = 1; i < numrows; i++)
+ if (multi_sort_compare(&items[i], &items[i-1], mss) != 0)
+ data->ndistinct += 1;
+
+ pfree(items);
+ pfree(values);
+ pfree(isnull);
+}
+
+/*
+ * Count distinct values per bucket dimension.
+ */
+static void
+update_dimension_ndistinct(MVBucket bucket, int dimension, int2vector *attrs,
+ VacAttrStats ** stats, bool update_boundaries)
+{
+ int j;
+ int nvalues = 0;
+ bool isNull;
+ HistogramBuild data = (HistogramBuild)bucket->build_data;
+ Datum * values = (Datum*)palloc0(data->numrows * sizeof(Datum));
+ SortSupportData ssup;
+
+ StdAnalyzeData * mystats = (StdAnalyzeData *) stats[dimension]->extra_data;
+
+ /* we may already know this is a NULL-only dimension */
+ if (bucket->nullsonly[dimension])
+ data->ndistincts[dimension] = 1;
+
+ memset(&ssup, 0, sizeof(ssup));
+ ssup.ssup_cxt = CurrentMemoryContext;
+
+ /* We always use the default collation for statistics */
+ ssup.ssup_collation = DEFAULT_COLLATION_OID;
+ ssup.ssup_nulls_first = false;
+
+ PrepareSortSupportFromOrderingOp(mystats->ltopr, &ssup);
+
+ for (j = 0; j < data->numrows; j++)
+ {
+ values[nvalues] = heap_getattr(data->rows[j], attrs->values[dimension],
+ stats[dimension]->tupDesc, &isNull);
+
+ /* ignore NULL values */
+ if (! isNull)
+ nvalues++;
+ }
+
+ /* there's always at least 1 distinct value (may be NULL) */
+ data->ndistincts[dimension] = 1;
+
+ /* if there are only NULL values in the dimension, mark it as
+ * NULL-only and we're done */
+ if (nvalues == 0)
+ {
+ pfree(values);
+ bucket->nullsonly[dimension] = true;
+ return;
+ }
+
+ /* sort the array (pass-by-value datums) */
+ qsort_arg((void *) values, nvalues, sizeof(Datum),
+ compare_scalars_simple, (void *) &ssup);
+
+ /*
+ * Update min/max boundaries to the smallest bounding box. Generally, this
+ * needs to be done only when constructing the initial bucket.
+ */
+ if (update_boundaries)
+ {
+ /* store the min/max values */
+ bucket->min[dimension] = values[0];
+ bucket->min_inclusive[dimension] = true;
+
+ bucket->max[dimension] = values[nvalues-1];
+ bucket->max_inclusive[dimension] = true;
+ }
+
+ /*
+ * Walk through the array and count distinct values by comparing
+ * succeeding values.
+ *
+ * FIXME This only works for pass-by-value types (i.e. not VARCHARs
+ * etc.). Although thanks to the deduplication it might work
+ * even for those types (equal values will get the same item
+ * in the deduplicated array).
+ */
+ for (j = 1; j < nvalues; j++)
+ {
+ if (values[j] != values[j-1])
+ data->ndistincts[dimension] += 1;
+ }
+
+ pfree(values);
+}
+
+/*
+ * A properly built histogram must not contain buckets mixing NULL and
+ * non-NULL values in a single dimension. Each dimension may either be
+ * marked as 'nulls only', and thus containing only NULL values, or
+ * it must not contain any NULL values.
+ *
+ * Therefore, if the sample contains NULL values in any of the columns,
+ * it's necessary to build those NULL-buckets. This is done in an
+ * iterative way using this algorithm, operating on a single bucket:
+ *
+ * (1) Check that all dimensions are well-formed (not mixing NULL
+ * and non-NULL values).
+ *
+ * (2) If all dimensions are well-formed, terminate.
+ *
+ * (3) If the dimension contains only NULL values, but is not
+ * marked as NULL-only, mark it as NULL-only and run the
+ * algorithm again (on this bucket).
+ *
+ * (4) If the dimension mixes NULL and non-NULL values, split the
+ * bucket into two parts - one with NULL values, one with
+ * non-NULL values (replacing the current one). Then run
+ * the algorithm on both buckets.
+ *
+ * This is executed in a recursive manner, but the number of executions
+ * should be quite low - limited by the number of NULL-buckets. Also,
+ * in each branch the number of nested calls is limited by the number
+ * of dimensions (attributes) of the histogram.
+ *
+ * At the end, there should be buckets with no mixed dimensions. The
+ * number of buckets produced by this algorithm is rather limited - with
+ * N dimensions, there may be only 2^N such buckets (each dimension may
+ * be either NULL or non-NULL). So with 8 dimensions (current value of
+ * MVSTATS_MAX_DIMENSIONS) there may be only 256 such buckets.
+ *
+ * After this, a 'regular' bucket-split algorithm shall run, further
+ * optimizing the histogram.
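+ *
+ * For example (a hypothetical two-dimensional case): if dimension
+ * "b" mixes NULL and non-NULL values, step (4) splits the bucket
+ * into one bucket with the "b IS NOT NULL" rows and another with
+ * the "b IS NULL" rows (marked as NULL-only), and the algorithm
+ * then recurses into both of them.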
+ */
+static void
+create_null_buckets(MVHistogram histogram, int bucket_idx,
+ int2vector *attrs, VacAttrStats ** stats)
+{
+ int i, j;
+ int null_dim = -1;
+ int null_count = 0;
+ bool null_found = false;
+ MVBucket bucket, null_bucket;
+ int null_idx, curr_idx;
+ HistogramBuild data, null_data;
+
+ /* remember original values from the bucket */
+ int numrows;
+ HeapTuple *oldrows = NULL;
+
+ Assert(bucket_idx < histogram->nbuckets);
+ Assert(histogram->ndimensions == attrs->dim1);
+
+ bucket = histogram->buckets[bucket_idx];
+ data = (HistogramBuild)bucket->build_data;
+
+ numrows = data->numrows;
+ oldrows = data->rows;
+
+ /*
+ * Walk through all rows / dimensions, and stop once we find NULL
+ * in a dimension not yet marked as NULL-only.
+ */
+ for (i = 0; i < data->numrows; i++)
+ {
+ /*
+ * FIXME We don't need to start from the first attribute
+ * here - we can start from the last known dimension.
+ */
+ for (j = 0; j < histogram->ndimensions; j++)
+ {
+ /* Is this a NULL-only dimension? If yes, skip. */
+ if (bucket->nullsonly[j])
+ continue;
+
+ /* found a NULL in that dimension? */
+ if (heap_attisnull(data->rows[i], attrs->values[j]))
+ {
+ null_found = true;
+ null_dim = j;
+ break;
+ }
+ }
+
+ /* terminate if we found attribute with NULL values */
+ if (null_found)
+ break;
+ }
+
+ /* no regular dimension contains NULL values => we're done */
+ if (! null_found)
+ return;
+
+ /* walk through the rows again, count NULL values in 'null_dim' */
+ for (i = 0; i < data->numrows; i++)
+ {
+ if (heap_attisnull(data->rows[i], attrs->values[null_dim]))
+ null_count += 1;
+ }
+
+ Assert(null_count <= data->numrows);
+
+ /*
+ * If (null_count == numrows) the dimension already is NULL-only,
+ * but is not yet marked like that. It's enough to mark it and
+ * repeat the process recursively (until we run out of dimensions).
+ */
+ if (null_count == data->numrows)
+ {
+ bucket->nullsonly[null_dim] = true;
+ create_null_buckets(histogram, bucket_idx, attrs, stats);
+ return;
+ }
+
+ /*
+ * We have to split the bucket into two - one with NULL values in
+ * the dimension, one with non-NULL values. We don't need to sort
+ * the data or anything, but otherwise it's similar to what's done
+ * in partition_bucket().
+ */
+
+ /* create bucket with NULL-only dimension 'dim' */
+ null_bucket = copy_mv_bucket(bucket, histogram->ndimensions);
+ null_data = (HistogramBuild)null_bucket->build_data;
+
+ /* remember the current array info */
+ oldrows = data->rows;
+ numrows = data->numrows;
+
+ /* we'll keep non-NULL values in the current bucket */
+ data->numrows = (numrows - null_count);
+ data->rows
+ = (HeapTuple*)palloc0(data->numrows * sizeof(HeapTuple));
+
+ /* and the NULL values will go to the new one */
+ null_data->numrows = null_count;
+ null_data->rows
+ = (HeapTuple*)palloc0(null_data->numrows * sizeof(HeapTuple));
+
+ /* mark the dimension as NULL-only (in the new bucket) */
+ null_bucket->nullsonly[null_dim] = true;
+
+ /* walk through the sample rows and distribute them accordingly */
+ null_idx = 0;
+ curr_idx = 0;
+ for (i = 0; i < numrows; i++)
+ {
+ if (heap_attisnull(oldrows[i], attrs->values[null_dim]))
+ /* NULL => copy to the new bucket */
+ memcpy(&null_data->rows[null_idx++], &oldrows[i],
+ sizeof(HeapTuple));
+ else
+ memcpy(&data->rows[curr_idx++], &oldrows[i],
+ sizeof(HeapTuple));
+ }
+
+ /* update ndistinct values for the buckets (total and per dimension) */
+ update_bucket_ndistinct(bucket, attrs, stats);
+ update_bucket_ndistinct(null_bucket, attrs, stats);
+
+ /*
+ * TODO We don't need to do this for the dimension we used for split,
+ * because we know how many distinct values went to each
+ * bucket (NULL is not a value, so 0, and the other bucket got
+ * all the ndistinct values).
+ */
+ for (i = 0; i < histogram->ndimensions; i++)
+ {
+ update_dimension_ndistinct(bucket, i, attrs, stats, false);
+ update_dimension_ndistinct(null_bucket, i, attrs, stats, false);
+ }
+
+ pfree(oldrows);
+
+ /* add the NULL bucket to the histogram */
+ histogram->buckets[histogram->nbuckets++] = null_bucket;
+
+ /*
+ * And now run the function recursively on both buckets (the new
+ * one first, because the call may change number of buckets, and
+ * it's used as an index).
+ */
+ create_null_buckets(histogram, (histogram->nbuckets-1), attrs, stats);
+ create_null_buckets(histogram, bucket_idx, attrs, stats);
+}
+
+/*
+ * We need to pass the SortSupport to the comparator, but bsearch()
+ * has no 'context' parameter, so we use a global variable (ugly).
+ */
+static int
+bsearch_comparator(const void * a, const void * b)
+{
+ Assert(ssup_private != NULL);
+ return compare_scalars_simple(a, b, (void*)ssup_private);
+}
+
+/*
+ * SRF with details about buckets of a histogram:
+ *
+ * - bucket ID (0...nbuckets-1)
+ * - min values (string array)
+ * - max values (string array)
+ * - nulls only (boolean array)
+ * - min inclusive flags (boolean array)
+ * - max inclusive flags (boolean array)
+ * - frequency (double precision)
+ *
+ * The input is the OID of the statistics, and there are no rows
+ * returned if the statistics contains no histogram (or if there's no
+ * statistics for the OID).
+ *
+ * The second parameter (type) determines what values will be returned
+ * in the (minvals,maxvals). There are three possible values:
+ *
+ * 0 (actual values)
+ * -----------------
+ * - prints actual values
+ * - using the output function of the data type (as string)
+ * - handy for investigating the histogram
+ *
+ * 1 (distinct index)
+ * ------------------
+ * - prints index of the distinct value (into the serialized array)
+ * - makes it easier to spot neighbor buckets, etc.
+ * - handy for plotting the histogram
+ *
+ * 2 (normalized distinct index)
+ * -----------------------------
+ * - prints index of the distinct value, but normalized into [0,1]
+ * - similar to 1, but shows how 'long' the bucket range is
+ * - handy for plotting the histogram
+ *
+ * When plotting the histogram, be careful as the (1) and (2) options
+ * skew the lengths by distributing the distinct values uniformly. For
+ * data types without a clear meaning of 'distance' (e.g. strings) that
+ * is not a big deal, but for numbers it may be confusing.
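+ *
+ * A usage sketch (assuming statistics with a histogram built on some
+ * table "t" - the subselect is just one way to obtain the OID):
+ *
+ * SELECT * FROM pg_mv_histogram_buckets(
+ * (SELECT oid FROM pg_mv_statistic
+ * WHERE starelid = 't'::regclass), 0);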
+ */
+PG_FUNCTION_INFO_V1(pg_mv_histogram_buckets);
+
+Datum
+pg_mv_histogram_buckets(PG_FUNCTION_ARGS)
+{
+ FuncCallContext *funcctx;
+ int call_cntr;
+ int max_calls;
+ TupleDesc tupdesc;
+ AttInMetadata *attinmeta;
+
+ Oid mvoid = PG_GETARG_OID(0);
+ int otype = PG_GETARG_INT32(1);
+
+ if ((otype < 0) || (otype > 2))
+ elog(ERROR, "invalid output type specified");
+
+ /* stuff done only on the first call of the function */
+ if (SRF_IS_FIRSTCALL())
+ {
+ MemoryContext oldcontext;
+ MVSerializedHistogram histogram;
+
+ /* create a function context for cross-call persistence */
+ funcctx = SRF_FIRSTCALL_INIT();
+
+ /* switch to memory context appropriate for multiple function calls */
+ oldcontext = MemoryContextSwitchTo(funcctx->multi_call_memory_ctx);
+
+ histogram = load_mv_histogram(mvoid);
+
+ funcctx->user_fctx = histogram;
+
+ /* total number of tuples to be returned */
+ funcctx->max_calls = 0;
+ if (funcctx->user_fctx != NULL)
+ funcctx->max_calls = histogram->nbuckets;
+
+ /* Build a tuple descriptor for our result type */
+ if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("function returning record called in context "
+ "that cannot accept type record")));
+
+ /*
+ * generate attribute metadata needed later to produce tuples
+ * from raw C strings
+ */
+ attinmeta = TupleDescGetAttInMetadata(tupdesc);
+ funcctx->attinmeta = attinmeta;
+
+ MemoryContextSwitchTo(oldcontext);
+ }
+
+ /* stuff done on every call of the function */
+ funcctx = SRF_PERCALL_SETUP();
+
+ call_cntr = funcctx->call_cntr;
+ max_calls = funcctx->max_calls;
+ attinmeta = funcctx->attinmeta;
+
+ if (call_cntr < max_calls) /* do when there is more left to send */
+ {
+ char **values;
+ HeapTuple tuple;
+ Datum result;
+ int2vector *stakeys;
+ Oid relid;
+ double bucket_size = 1.0;
+
+ char *buff = palloc0(1024);
+ char *format;
+
+ int i;
+
+ Oid *outfuncs;
+ FmgrInfo *fmgrinfo;
+
+ MVSerializedHistogram histogram;
+ MVSerializedBucket bucket;
+
+ histogram = (MVSerializedHistogram)funcctx->user_fctx;
+
+ Assert(call_cntr < histogram->nbuckets);
+
+ bucket = histogram->buckets[call_cntr];
+
+ stakeys = find_mv_attnums(mvoid, &relid);
+
+ /*
+ * Prepare a values array for building the returned tuple.
+ * This should be an array of C strings which will
+ * be processed later by the type input functions.
+ */
+ values = (char **) palloc(9 * sizeof(char *));
+
+ values[0] = (char *) palloc(64 * sizeof(char));
+
+ /* arrays */
+ values[1] = (char *) palloc0(1024 * sizeof(char));
+ values[2] = (char *) palloc0(1024 * sizeof(char));
+ values[3] = (char *) palloc0(1024 * sizeof(char));
+ values[4] = (char *) palloc0(1024 * sizeof(char));
+ values[5] = (char *) palloc0(1024 * sizeof(char));
+
+ values[6] = (char *) palloc(64 * sizeof(char));
+ values[7] = (char *) palloc(64 * sizeof(char));
+ values[8] = (char *) palloc(64 * sizeof(char));
+
+ /* we need to do this only when printing the actual values */
+ outfuncs = (Oid*)palloc0(sizeof(Oid) * histogram->ndimensions);
+ fmgrinfo = (FmgrInfo*)palloc0(sizeof(FmgrInfo) * histogram->ndimensions);
+
+ for (i = 0; i < histogram->ndimensions; i++)
+ {
+ bool isvarlena;
+
+ getTypeOutputInfo(get_atttype(relid, stakeys->values[i]),
+ &outfuncs[i], &isvarlena);
+
+ fmgr_info(outfuncs[i], &fmgrinfo[i]);
+ }
+
+ snprintf(values[0], 64, "%d", call_cntr); /* bucket ID */
+
+ /*
+ * Print the boundary values either as the actual values (using the
+ * output function of the attribute type), or as indexes into the
+ * deduplicated arrays (which are sorted, so even the indexes are
+ * quite useful).
+ */
+
+ for (i = 0; i < histogram->ndimensions; i++)
+ {
+ bucket_size *= (bucket->max[i] - bucket->min[i]) * 1.0
+ / (histogram->nvalues[i]-1);
+
+ /* print the actual values, i.e. use output function etc. */
+ if (otype == 0)
+ {
+ Datum minval, maxval;
+ Datum minout, maxout;
+
+ format = "%s, %s";
+ if (i == 0)
+ format = "{%s%s";
+ else if (i == histogram->ndimensions-1)
+ format = "%s, %s}";
+
+ minval = histogram->values[i][bucket->min[i]];
+ minout = FunctionCall1(&fmgrinfo[i], minval);
+
+ maxval = histogram->values[i][bucket->max[i]];
+ maxout = FunctionCall1(&fmgrinfo[i], maxval);
+
+ snprintf(buff, 1024, format, values[1], DatumGetPointer(minout));
+ strncpy(values[1], buff, 1023);
+ buff[0] = '\0';
+
+ snprintf(buff, 1024, format, values[2], DatumGetPointer(maxout));
+ strncpy(values[2], buff, 1023);
+ buff[0] = '\0';
+ }
+ else if (otype == 1)
+ {
+ format = "%s, %d";
+ if (i == 0)
+ format = "{%s%d";
+ else if (i == histogram->ndimensions-1)
+ format = "%s, %d}";
+
+ snprintf(buff, 1024, format, values[1], bucket->min[i]);
+ strncpy(values[1], buff, 1023);
+ buff[0] = '\0';
+
+ snprintf(buff, 1024, format, values[2], bucket->max[i]);
+ strncpy(values[2], buff, 1023);
+ buff[0] = '\0';
+ }
+ else
+ {
+ format = "%s, %f";
+ if (i == 0)
+ format = "{%s%f";
+ else if (i == histogram->ndimensions-1)
+ format = "%s, %f}";
+
+ snprintf(buff, 1024, format, values[1],
+ bucket->min[i] * 1.0 / (histogram->nvalues[i]-1));
+ strncpy(values[1], buff, 1023);
+ buff[0] = '\0';
+
+ snprintf(buff, 1024, format, values[2],
+ bucket->max[i] * 1.0 / (histogram->nvalues[i]-1));
+ strncpy(values[2], buff, 1023);
+ buff[0] = '\0';
+ }
+
+ format = "%s, %s";
+ if (i == 0)
+ format = "{%s%s";
+ else if (i == histogram->ndimensions-1)
+ format = "%s, %s}";
+
+ snprintf(buff, 1024, format, values[3], bucket->nullsonly[i] ? "t" : "f");
+ strncpy(values[3], buff, 1023);
+ buff[0] = '\0';
+
+ snprintf(buff, 1024, format, values[4], bucket->min_inclusive[i] ? "t" : "f");
+ strncpy(values[4], buff, 1023);
+ buff[0] = '\0';
+
+ snprintf(buff, 1024, format, values[5], bucket->max_inclusive[i] ? "t" : "f");
+ strncpy(values[5], buff, 1023);
+ buff[0] = '\0';
+ }
+
+ snprintf(values[6], 64, "%f", bucket->ntuples); /* frequency */
+ snprintf(values[7], 64, "%f", bucket->ntuples / bucket_size); /* density */
+ snprintf(values[8], 64, "%f", bucket_size); /* bucket_size */
+
+ /* build a tuple */
+ tuple = BuildTupleFromCStrings(attinmeta, values);
+
+ /* make the tuple into a datum */
+ result = HeapTupleGetDatum(tuple);
+
+ /* clean up (this is not really necessary) */
+ pfree(values[0]);
+ pfree(values[1]);
+ pfree(values[2]);
+ pfree(values[3]);
+ pfree(values[4]);
+ pfree(values[5]);
+ pfree(values[6]);
+ pfree(values[7]);
+ pfree(values[8]);
+
+ pfree(values);
+
+ SRF_RETURN_NEXT(funcctx, result);
+ }
+ else /* do when there is no more left */
+ {
+ SRF_RETURN_DONE(funcctx);
+ }
+}
+
+#ifdef DEBUG_MVHIST
+/*
+ * prints debugging info about matched histogram buckets (full/partial)
+ *
+ * XXX Currently works only for INT data type.
+ */
+void
+debug_histogram_matches(MVSerializedHistogram mvhist, char *matches)
+{
+ int i, j;
+
+ float ffull = 0, fpartial = 0;
+ int nfull = 0, npartial = 0;
+
+ for (i = 0; i < mvhist->nbuckets; i++)
+ {
+ MVSerializedBucket bucket = mvhist->buckets[i];
+
+ char ranges[1024];
+
+ if (! matches[i])
+ continue;
+
+ /* increment the counters */
+ nfull += (matches[i] == MVSTATS_MATCH_FULL) ? 1 : 0;
+ npartial += (matches[i] == MVSTATS_MATCH_PARTIAL) ? 1 : 0;
+
+ /* and also update the frequencies */
+ ffull += (matches[i] == MVSTATS_MATCH_FULL) ? bucket->ntuples : 0;
+ fpartial += (matches[i] == MVSTATS_MATCH_PARTIAL) ? bucket->ntuples : 0;
+
+ memset(ranges, 0, sizeof(ranges));
+
+ /* build ranges for all the dimensions */
+ for (j = 0; j < mvhist->ndimensions; j++)
+ {
+ sprintf(ranges, "%s [%d %d]", ranges,
+ DatumGetInt32(mvhist->values[j][bucket->min[j]]),
+ DatumGetInt32(mvhist->values[j][bucket->max[j]]));
+ }
+
+ elog(WARNING, "bucket %d %s => %d [%f]", i, ranges, matches[i], bucket->ntuples);
+ }
+
+ elog(WARNING, "full=%f partial=%f (%f)", ffull, fpartial, (ffull + 0.5 * fpartial));
+}
+#endif
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 7d13a38..942b779 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -2109,9 +2109,9 @@ describeOneTableDetails(const char *schemaname,
{
printfPQExpBuffer(&buf,
"SELECT oid, stanamespace::regnamespace AS nsp, staname, stakeys,\n"
- " deps_enabled, mcv_enabled,\n"
- " deps_built, mcv_built,\n"
- " mcv_max_items,\n"
+ " deps_enabled, mcv_enabled, hist_enabled,\n"
+ " deps_built, mcv_built, hist_built,\n"
+ " mcv_max_items, hist_max_buckets,\n"
" (SELECT string_agg(attname::text,', ')\n"
" FROM ((SELECT unnest(stakeys) AS attnum) s\n"
" JOIN pg_attribute a ON (starelid = a.attrelid and a.attnum = s.attnum))) AS attnums\n"
@@ -2154,8 +2154,17 @@ describeOneTableDetails(const char *schemaname,
first = false;
}
+ if (!strcmp(PQgetvalue(result, i, 6), "t"))
+ {
+ if (! first)
+ appendPQExpBuffer(&buf, ", histogram");
+ else
+ appendPQExpBuffer(&buf, "(histogram");
+ first = false;
+ }
+
appendPQExpBuffer(&buf, ") ON (%s)",
- PQgetvalue(result, i, 9));
+ PQgetvalue(result, i, 12));
printTableAddFooter(&cont, buf.data);
}
diff --git a/src/include/catalog/pg_mv_statistic.h b/src/include/catalog/pg_mv_statistic.h
index fd7107d..a5945af 100644
--- a/src/include/catalog/pg_mv_statistic.h
+++ b/src/include/catalog/pg_mv_statistic.h
@@ -38,13 +38,16 @@ CATALOG(pg_mv_statistic,3381)
/* statistics requested to build */
bool deps_enabled; /* analyze dependencies? */
bool mcv_enabled; /* build MCV list? */
+ bool hist_enabled; /* build histogram? */
- /* MCV size */
+ /* histogram / MCV size */
int32 mcv_max_items; /* max MCV items */
+ int32 hist_max_buckets; /* max histogram buckets */
/* statistics that are available (if requested) */
bool deps_built; /* dependencies were built */
bool mcv_built; /* MCV list was built */
+ bool hist_built; /* histogram was built */
/* variable-length fields start here, but we allow direct access to stakeys */
int2vector stakeys; /* array of column keys */
@@ -52,6 +55,7 @@ CATALOG(pg_mv_statistic,3381)
#ifdef CATALOG_VARLEN
bytea stadeps; /* dependencies (serialized) */
bytea stamcv; /* MCV list (serialized) */
+ bytea stahist; /* MV histogram (serialized) */
#endif
} FormData_pg_mv_statistic;
@@ -67,17 +71,21 @@ typedef FormData_pg_mv_statistic *Form_pg_mv_statistic;
* compiler constants for pg_mv_statistic
* ----------------
*/
-#define Natts_pg_mv_statistic 11
+#define Natts_pg_mv_statistic 15
#define Anum_pg_mv_statistic_starelid 1
#define Anum_pg_mv_statistic_staname 2
#define Anum_pg_mv_statistic_stanamespace 3
#define Anum_pg_mv_statistic_deps_enabled 4
#define Anum_pg_mv_statistic_mcv_enabled 5
-#define Anum_pg_mv_statistic_mcv_max_items 6
-#define Anum_pg_mv_statistic_deps_built 7
-#define Anum_pg_mv_statistic_mcv_built 8
-#define Anum_pg_mv_statistic_stakeys 9
-#define Anum_pg_mv_statistic_stadeps 10
-#define Anum_pg_mv_statistic_stamcv 11
+#define Anum_pg_mv_statistic_hist_enabled 6
+#define Anum_pg_mv_statistic_mcv_max_items 7
+#define Anum_pg_mv_statistic_hist_max_buckets 8
+#define Anum_pg_mv_statistic_deps_built 9
+#define Anum_pg_mv_statistic_mcv_built 10
+#define Anum_pg_mv_statistic_hist_built 11
+#define Anum_pg_mv_statistic_stakeys 12
+#define Anum_pg_mv_statistic_stadeps 13
+#define Anum_pg_mv_statistic_stamcv 14
+#define Anum_pg_mv_statistic_stahist 15
#endif /* PG_MV_STATISTIC_H */
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
index 1875e26..2eb16f4 100644
--- a/src/include/catalog/pg_proc.h
+++ b/src/include/catalog/pg_proc.h
@@ -2749,6 +2749,10 @@ DATA(insert OID = 3376 ( pg_mv_stats_mcvlist_info PGNSP PGUID 12 1 0 0 0 f f f
DESCR("multi-variate statistics: MCV list info");
DATA(insert OID = 3373 ( pg_mv_mcv_items PGNSP PGUID 12 1 1000 0 0 f f f f t t i s 1 0 2249 "26" "{26,23,1009,1000,701}" "{i,o,o,o,o}" "{oid,index,values,nulls,frequency}" _null_ _null_ pg_mv_mcv_items _null_ _null_ _null_ ));
DESCR("details about MCV list items");
+DATA(insert OID = 3375 ( pg_mv_stats_histogram_info PGNSP PGUID 12 1 0 0 0 f f f f t f i s 1 0 25 "17" _null_ _null_ _null_ _null_ _null_ pg_mv_stats_histogram_info _null_ _null_ _null_ ));
+DESCR("multi-variate statistics: histogram info");
+DATA(insert OID = 3374 ( pg_mv_histogram_buckets PGNSP PGUID 12 1 1000 0 0 f f f f t t i s 2 0 2249 "26 23" "{26,23,23,1009,1009,1000,1000,1000,701,701,701}" "{i,i,o,o,o,o,o,o,o,o,o}" "{oid,otype,index,minvals,maxvals,nullsonly,mininclusive,maxinclusive,frequency,density,bucket_size}" _null_ _null_ pg_mv_histogram_buckets _null_ _null_ _null_ ));
+DESCR("details about histogram buckets");
DATA(insert OID = 1928 ( pg_stat_get_numscans PGNSP PGUID 12 1 0 0 0 f f f f t f s r 1 0 20 "26" _null_ _null_ _null_ _null_ _null_ pg_stat_get_numscans _null_ _null_ _null_ ));
DESCR("statistics: number of scans done for table/index");
diff --git a/src/include/nodes/relation.h b/src/include/nodes/relation.h
index d3c9898..1298c42 100644
--- a/src/include/nodes/relation.h
+++ b/src/include/nodes/relation.h
@@ -593,10 +593,12 @@ typedef struct MVStatisticInfo
/* enabled statistics */
bool deps_enabled; /* functional dependencies enabled */
bool mcv_enabled; /* MCV list enabled */
+ bool hist_enabled; /* histogram enabled */
/* built/available statistics */
bool deps_built; /* functional dependencies built */
bool mcv_built; /* MCV list built */
+ bool hist_built; /* histogram built */
/* columns in the statistics (attnums) */
int2vector *stakeys; /* attnums of the columns covered */
diff --git a/src/include/utils/mvstats.h b/src/include/utils/mvstats.h
index 4535db7..f05a517 100644
--- a/src/include/utils/mvstats.h
+++ b/src/include/utils/mvstats.h
@@ -92,6 +92,123 @@ typedef MCVListData *MCVList;
#define MVSTAT_MCVLIST_MAX_ITEMS 8192 /* max items in MCV list */
/*
+ * Multivariate histograms
+ */
+typedef struct MVBucketData {
+
+ /* Frequencies of this bucket. */
+ float ntuples; /* frequency (fraction) of tuples in this bucket */
+
+ /*
+ * Information about dimensions being NULL-only. Not yet used.
+ */
+ bool *nullsonly;
+
+ /* lower boundaries - values and information about the inequalities */
+ Datum *min;
+ bool *min_inclusive;
+
+ /* upper boundaries - values and information about the inequalities */
+ Datum *max;
+ bool *max_inclusive;
+
+ /* used when building the histogram (not serialized/deserialized) */
+ void *build_data;
+
+} MVBucketData;
+
+typedef MVBucketData *MVBucket;
+
+
+typedef struct MVHistogramData {
+
+ uint32 magic; /* magic constant marker */
+ uint32 type; /* type of histogram (BASIC) */
+ uint32 nbuckets; /* number of buckets (buckets array) */
+ uint32 ndimensions; /* number of dimensions */
+
+ MVBucket *buckets; /* array of buckets */
+
+} MVHistogramData;
+
+typedef MVHistogramData *MVHistogram;
+
+/*
+ * Histogram in a partially serialized form, with deduplicated boundary
+ * values etc.
+ *
+ * TODO add more detailed description here
+ */
+
+typedef struct MVSerializedBucketData {
+
+ /* Frequencies of this bucket. */
+ float ntuples; /* frequency (fraction) of tuples in this bucket */
+
+ /*
+ * Information about dimensions being NULL-only. Not yet used.
+ */
+ bool *nullsonly;
+
+ /* indexes of lower boundaries - values and information about the
+ * inequalities (exclusive vs. inclusive) */
+ uint16 *min;
+ bool *min_inclusive;
+
+ /* indexes of upper boundaries - values and information about the
+ * inequalities (exclusive vs. inclusive) */
+ uint16 *max;
+ bool *max_inclusive;
+
+} MVSerializedBucketData;
+
+typedef MVSerializedBucketData *MVSerializedBucket;
+
+typedef struct MVSerializedHistogramData {
+
+ uint32 magic; /* magic constant marker */
+ uint32 type; /* type of histogram (BASIC) */
+ uint32 nbuckets; /* number of buckets (buckets array) */
+ uint32 ndimensions; /* number of dimensions */
+
+ /*
+ * keep the layout in sync with MVHistogramData, because the
+ * deserialization relies on the buckets field being at the same offset
+ */
+ MVSerializedBucket *buckets; /* array of buckets */
+
+ /*
+ * serialized boundary values, one array per dimension, deduplicated
+ * (the min/max indexes point into these arrays)
+ */
+ int *nvalues;
+ Datum **values;
+
+} MVSerializedHistogramData;
+
+typedef MVSerializedHistogramData *MVSerializedHistogram;
+
+
+/* used to flag stats serialized to bytea */
+#define MVSTAT_HIST_MAGIC 0x7F8C5670 /* marks serialized bytea */
+#define MVSTAT_HIST_TYPE_BASIC 1 /* basic histogram type */
+
+/*
+ * Limits used for max_buckets option, i.e. we're always guaranteed
+ * to have space for at least MVSTAT_HIST_MIN_BUCKETS, and we cannot
+ * have more than MVSTAT_HIST_MAX_BUCKETS buckets.
+ *
+ * This is just a boundary for the 'max' threshold - the actual
+ * histogram may use fewer buckets than MVSTAT_HIST_MAX_BUCKETS.
+ *
+ * TODO The MVSTAT_HIST_MIN_BUCKETS should be related to the number of
+ * attributes (MVSTATS_MAX_DIMENSIONS) because of NULL-buckets.
+ * There should be at least 2^N buckets, otherwise we may be unable
+ * to build the NULL buckets.
+ */
+#define MVSTAT_HIST_MIN_BUCKETS 128 /* min number of buckets */
+#define MVSTAT_HIST_MAX_BUCKETS 16384 /* max number of buckets */
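+
+/*
+ * For example (mirroring the regression tests - the particular value
+ * of max_buckets is just an illustration), the limits apply to the
+ * max_buckets option of CREATE STATISTICS:
+ *
+ * CREATE STATISTICS s1 ON t (a, b, c)
+ * WITH (histogram, max_buckets = 1024);
+ *
+ * where values outside the [128, 16384] range are rejected.
+ */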
+
+/*
* TODO Maybe fetching the histogram/MCV list separately is inefficient?
* Consider adding a single `fetch_stats` method, fetching all
* stats specified using flags (or something like that).
@@ -99,20 +216,25 @@ typedef MCVListData *MCVList;
MVDependencies load_mv_dependencies(Oid mvoid);
MCVList load_mv_mcvlist(Oid mvoid);
+MVSerializedHistogram load_mv_histogram(Oid mvoid);
bytea * serialize_mv_dependencies(MVDependencies dependencies);
bytea * serialize_mv_mcvlist(MCVList mcvlist, int2vector *attrs,
VacAttrStats **stats);
+bytea * serialize_mv_histogram(MVHistogram histogram, int2vector *attrs,
+ VacAttrStats **stats);
/* deserialization of stats (serialization is private to analyze) */
MVDependencies deserialize_mv_dependencies(bytea * data);
MCVList deserialize_mv_mcvlist(bytea * data);
+MVSerializedHistogram deserialize_mv_histogram(bytea * data);
/*
* Returns index of the attribute number within the vector (i.e. a
* dimension within the stats).
*/
int mv_get_index(AttrNumber varattno, int2vector * stakeys);
int2vector* find_mv_attnums(Oid mvoid, Oid *relid);
@@ -121,6 +243,8 @@ extern Datum pg_mv_stats_dependencies_info(PG_FUNCTION_ARGS);
extern Datum pg_mv_stats_dependencies_show(PG_FUNCTION_ARGS);
extern Datum pg_mv_stats_mcvlist_info(PG_FUNCTION_ARGS);
extern Datum pg_mv_mcvlist_items(PG_FUNCTION_ARGS);
+extern Datum pg_mv_stats_histogram_info(PG_FUNCTION_ARGS);
+extern Datum pg_mv_histogram_buckets(PG_FUNCTION_ARGS);
MVDependencies
build_mv_dependencies(int numrows, HeapTuple *rows, int2vector *attrs,
@@ -130,10 +254,20 @@ MCVList
build_mv_mcvlist(int numrows, HeapTuple *rows, int2vector *attrs,
VacAttrStats **stats, int *numrows_filtered);
+MVHistogram
+build_mv_histogram(int numrows, HeapTuple *rows, int2vector *attrs,
+ VacAttrStats **stats, int numrows_total);
+
void build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
int natts, VacAttrStats **vacattrstats);
-void update_mv_stats(Oid relid, MVDependencies dependencies, MCVList mcvlist,
+void update_mv_stats(Oid relid, MVDependencies dependencies,
+ MCVList mcvlist, MVHistogram histogram,
int2vector *attrs, VacAttrStats **stats);
+#ifdef DEBUG_MVHIST
+extern void debug_histogram_matches(MVSerializedHistogram mvhist, char *matches);
+#endif
+
#endif
diff --git a/src/test/regress/expected/mv_histogram.out b/src/test/regress/expected/mv_histogram.out
new file mode 100644
index 0000000..a34edb8
--- /dev/null
+++ b/src/test/regress/expected/mv_histogram.out
@@ -0,0 +1,207 @@
+-- data type passed by value
+CREATE TABLE mv_histogram (
+ a INT,
+ b INT,
+ c INT
+);
+-- unknown column
+CREATE STATISTICS s1 ON mv_histogram (unknown_column) WITH (histogram);
+ERROR: column "unknown_column" referenced in statistics does not exist
+-- single column
+CREATE STATISTICS s1 ON mv_histogram (a) WITH (histogram);
+ERROR: multivariate stats require 2 or more columns
+-- single column, duplicated
+CREATE STATISTICS s1 ON mv_histogram (a, a) WITH (histogram);
+ERROR: duplicate column name in statistics definition
+-- two columns, one duplicated
+CREATE STATISTICS s1 ON mv_histogram (a, a, b) WITH (histogram);
+ERROR: duplicate column name in statistics definition
+-- unknown option
+CREATE STATISTICS s1 ON mv_histogram (a, b, c) WITH (unknown_option);
+ERROR: unrecognized STATISTICS option "unknown_option"
+-- missing histogram statistics
+CREATE STATISTICS s1 ON mv_histogram (a, b, c) WITH (dependencies, max_buckets=200);
+ERROR: option 'histogram' is required by other options(s)
+-- invalid max_buckets value / too low
+CREATE STATISTICS s1 ON mv_histogram (a, b, c) WITH (mcv, max_buckets=10);
+ERROR: minimum number of buckets is 128
+-- invalid max_buckets value / too high
+CREATE STATISTICS s1 ON mv_histogram (a, b, c) WITH (mcv, max_buckets=100000);
+ERROR: maximum number of buckets is 16384
+-- correct command
+CREATE STATISTICS s1 ON mv_histogram (a, b, c) WITH (histogram);
+-- random data (no functional dependencies)
+INSERT INTO mv_histogram
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+ANALYZE mv_histogram;
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+ hist_enabled | hist_built
+--------------+------------
+ t | t
+(1 row)
+
+TRUNCATE mv_histogram;
+-- a => b, a => c, b => c
+INSERT INTO mv_histogram
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE mv_histogram;
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+ hist_enabled | hist_built
+--------------+------------
+ t | t
+(1 row)
+
+TRUNCATE mv_histogram;
+-- a => b, a => c
+INSERT INTO mv_histogram
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE mv_histogram;
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+ hist_enabled | hist_built
+--------------+------------
+ t | t
+(1 row)
+
+TRUNCATE mv_histogram;
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO mv_histogram
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX hist_idx ON mv_histogram (a, b);
+ANALYZE mv_histogram;
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+ hist_enabled | hist_built
+--------------+------------
+ t | t
+(1 row)
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mv_histogram WHERE a = 10 AND b = 5;
+ QUERY PLAN
+--------------------------------------------
+ Bitmap Heap Scan on mv_histogram
+ Recheck Cond: ((a = 10) AND (b = 5))
+ -> Bitmap Index Scan on hist_idx
+ Index Cond: ((a = 10) AND (b = 5))
+(4 rows)
+
+DROP TABLE mv_histogram;
+-- varlena type (text)
+CREATE TABLE mv_histogram (
+ a TEXT,
+ b TEXT,
+ c TEXT
+);
+CREATE STATISTICS s2 ON mv_histogram (a, b, c) WITH (histogram);
+-- random data (no functional dependencies)
+INSERT INTO mv_histogram
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+ANALYZE mv_histogram;
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+ hist_enabled | hist_built
+--------------+------------
+ t | t
+(1 row)
+
+TRUNCATE mv_histogram;
+-- a => b, a => c, b => c
+INSERT INTO mv_histogram
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE mv_histogram;
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+ hist_enabled | hist_built
+--------------+------------
+ t | t
+(1 row)
+
+TRUNCATE mv_histogram;
+-- a => b, a => c
+INSERT INTO mv_histogram
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE mv_histogram;
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+ hist_enabled | hist_built
+--------------+------------
+ t | t
+(1 row)
+
+TRUNCATE mv_histogram;
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO mv_histogram
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX hist_idx ON mv_histogram (a, b);
+ANALYZE mv_histogram;
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+ hist_enabled | hist_built
+--------------+------------
+ t | t
+(1 row)
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mv_histogram WHERE a = '10' AND b = '5';
+ QUERY PLAN
+------------------------------------------------------------
+ Bitmap Heap Scan on mv_histogram
+ Recheck Cond: ((a = '10'::text) AND (b = '5'::text))
+ -> Bitmap Index Scan on hist_idx
+ Index Cond: ((a = '10'::text) AND (b = '5'::text))
+(4 rows)
+
+TRUNCATE mv_histogram;
+-- check explain (expect bitmap index scan, not plain index scan) with NULLs
+INSERT INTO mv_histogram
+ SELECT
+ (CASE WHEN i/10000 = 0 THEN NULL ELSE i/10000 END),
+ (CASE WHEN i/20000 = 0 THEN NULL ELSE i/20000 END),
+ (CASE WHEN i/40000 = 0 THEN NULL ELSE i/40000 END)
+ FROM generate_series(1,1000000) s(i);
+ANALYZE mv_histogram;
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+ hist_enabled | hist_built
+--------------+------------
+ t | t
+(1 row)
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mv_histogram WHERE a IS NULL AND b IS NULL;
+ QUERY PLAN
+---------------------------------------------------
+ Bitmap Heap Scan on mv_histogram
+ Recheck Cond: ((a IS NULL) AND (b IS NULL))
+ -> Bitmap Index Scan on hist_idx
+ Index Cond: ((a IS NULL) AND (b IS NULL))
+(4 rows)
+
+DROP TABLE mv_histogram;
+-- NULL values (mix of int and text columns)
+CREATE TABLE mv_histogram (
+ a INT,
+ b TEXT,
+ c INT,
+ d TEXT
+);
+CREATE STATISTICS s3 ON mv_histogram (a, b, c, d) WITH (histogram);
+INSERT INTO mv_histogram
+ SELECT
+ mod(i, 100),
+ (CASE WHEN mod(i, 200) = 0 THEN NULL ELSE mod(i,200) END),
+ mod(i, 400),
+ (CASE WHEN mod(i, 300) = 0 THEN NULL ELSE mod(i,600) END)
+ FROM generate_series(1,10000) s(i);
+ANALYZE mv_histogram;
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+ hist_enabled | hist_built
+--------------+------------
+ t | t
+(1 row)
+
+DROP TABLE mv_histogram;
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index ac5007e..9db1913 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1371,7 +1371,9 @@ pg_mv_stats| SELECT n.nspname AS schemaname,
length(s.stadeps) AS depsbytes,
pg_mv_stats_dependencies_info(s.stadeps) AS depsinfo,
length(s.stamcv) AS mcvbytes,
- pg_mv_stats_mcvlist_info(s.stamcv) AS mcvinfo
+ pg_mv_stats_mcvlist_info(s.stamcv) AS mcvinfo,
+ length(s.stahist) AS histbytes,
+ pg_mv_stats_histogram_info(s.stahist) AS histinfo
FROM ((pg_mv_statistic s
JOIN pg_class c ON ((c.oid = s.starelid)))
LEFT JOIN pg_namespace n ON ((n.oid = c.relnamespace)));
diff --git a/src/test/regress/parallel_schedule b/src/test/regress/parallel_schedule
index 838c12b..fbed683 100644
--- a/src/test/regress/parallel_schedule
+++ b/src/test/regress/parallel_schedule
@@ -112,4 +112,4 @@ test: event_trigger
test: stats
# run tests of multivariate stats
-test: mv_dependencies mv_mcv
+test: mv_dependencies mv_mcv mv_histogram
diff --git a/src/test/regress/serial_schedule b/src/test/regress/serial_schedule
index d97a0ec..c60c0b2 100644
--- a/src/test/regress/serial_schedule
+++ b/src/test/regress/serial_schedule
@@ -163,3 +163,4 @@ test: event_trigger
test: stats
test: mv_dependencies
test: mv_mcv
+test: mv_histogram
diff --git a/src/test/regress/sql/mv_histogram.sql b/src/test/regress/sql/mv_histogram.sql
new file mode 100644
index 0000000..02f49b4
--- /dev/null
+++ b/src/test/regress/sql/mv_histogram.sql
@@ -0,0 +1,176 @@
+-- data type passed by value
+CREATE TABLE mv_histogram (
+ a INT,
+ b INT,
+ c INT
+);
+
+-- unknown column
+CREATE STATISTICS s1 ON mv_histogram (unknown_column) WITH (histogram);
+
+-- single column
+CREATE STATISTICS s1 ON mv_histogram (a) WITH (histogram);
+
+-- single column, duplicated
+CREATE STATISTICS s1 ON mv_histogram (a, a) WITH (histogram);
+
+-- two columns, one duplicated
+CREATE STATISTICS s1 ON mv_histogram (a, a, b) WITH (histogram);
+
+-- unknown option
+CREATE STATISTICS s1 ON mv_histogram (a, b, c) WITH (unknown_option);
+
+-- missing histogram statistics
+CREATE STATISTICS s1 ON mv_histogram (a, b, c) WITH (dependencies, max_buckets=200);
+
+-- invalid max_buckets value / too low
+CREATE STATISTICS s1 ON mv_histogram (a, b, c) WITH (mcv, max_buckets=10);
+
+-- invalid max_buckets value / too high
+CREATE STATISTICS s1 ON mv_histogram (a, b, c) WITH (mcv, max_buckets=100000);
+
+-- correct command
+CREATE STATISTICS s1 ON mv_histogram (a, b, c) WITH (histogram);
+
+-- random data (no functional dependencies)
+INSERT INTO mv_histogram
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+
+ANALYZE mv_histogram;
+
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+
+TRUNCATE mv_histogram;
+
+-- a => b, a => c, b => c
+INSERT INTO mv_histogram
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+
+ANALYZE mv_histogram;
+
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+
+TRUNCATE mv_histogram;
+
+-- a => b, a => c
+INSERT INTO mv_histogram
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE mv_histogram;
+
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+
+TRUNCATE mv_histogram;
+
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO mv_histogram
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX hist_idx ON mv_histogram (a, b);
+ANALYZE mv_histogram;
+
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mv_histogram WHERE a = 10 AND b = 5;
+
+DROP TABLE mv_histogram;
+
+-- varlena type (text)
+CREATE TABLE mv_histogram (
+ a TEXT,
+ b TEXT,
+ c TEXT
+);
+
+CREATE STATISTICS s2 ON mv_histogram (a, b, c) WITH (histogram);
+
+-- random data (no functional dependencies)
+INSERT INTO mv_histogram
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+
+ANALYZE mv_histogram;
+
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+
+TRUNCATE mv_histogram;
+
+-- a => b, a => c, b => c
+INSERT INTO mv_histogram
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+
+ANALYZE mv_histogram;
+
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+
+TRUNCATE mv_histogram;
+
+-- a => b, a => c
+INSERT INTO mv_histogram
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE mv_histogram;
+
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+
+TRUNCATE mv_histogram;
+
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO mv_histogram
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX hist_idx ON mv_histogram (a, b);
+ANALYZE mv_histogram;
+
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mv_histogram WHERE a = '10' AND b = '5';
+
+TRUNCATE mv_histogram;
+
+-- check explain (expect bitmap index scan, not plain index scan) with NULLs
+INSERT INTO mv_histogram
+ SELECT
+ (CASE WHEN i/10000 = 0 THEN NULL ELSE i/10000 END),
+ (CASE WHEN i/20000 = 0 THEN NULL ELSE i/20000 END),
+ (CASE WHEN i/40000 = 0 THEN NULL ELSE i/40000 END)
+ FROM generate_series(1,1000000) s(i);
+ANALYZE mv_histogram;
+
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mv_histogram WHERE a IS NULL AND b IS NULL;
+
+DROP TABLE mv_histogram;
+
+-- NULL values (mix of int and text columns)
+CREATE TABLE mv_histogram (
+ a INT,
+ b TEXT,
+ c INT,
+ d TEXT
+);
+
+CREATE STATISTICS s3 ON mv_histogram (a, b, c, d) WITH (histogram);
+
+INSERT INTO mv_histogram
+ SELECT
+ mod(i, 100),
+ (CASE WHEN mod(i, 200) = 0 THEN NULL ELSE mod(i,200) END),
+ mod(i, 400),
+ (CASE WHEN mod(i, 300) = 0 THEN NULL ELSE mod(i,600) END)
+ FROM generate_series(1,10000) s(i);
+
+ANALYZE mv_histogram;
+
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+
+DROP TABLE mv_histogram;
--
2.1.0
0006-multi-statistics-estimation.patch (application/x-patch)
From c0983a6079d9b7f4617fb3d31bce53690a35e9d6 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@pgaddict.com>
Date: Fri, 6 Feb 2015 01:42:38 +0100
Subject: [PATCH 6/7] multi-statistics estimation
The general idea is that a probability (which
is what selectivity is) can be split into a product of
conditional probabilities like this:
P(A & B & C) = P(A & B) * P(C|A & B)
If we assume that B and C are independent given A, the last
term may be simplified like this:
P(A & B & C) = P(A & B) * P(C|A)
so we only need probabilities on [A,B] and [C,A] to compute
the original probability.
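To illustrate with made-up numbers: if each of A, B and C
matches 10% of the rows and the three columns are perfectly
correlated, then P(A & B) = 0.1 and P(C|A) = 1.0, so
P(A & B & C) = 0.1 * 1.0 = 0.1
while the independence assumption would give
P(A) * P(B) * P(C) = 0.1 * 0.1 * 0.1 = 0.001
i.e. a 100x underestimate.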
The implementation works in the other direction, though.
We know what probability P(A & B & C) we need to compute,
and also what statistics are available.
So we search for a combination of statistics covering
the clauses in an optimal way (most clauses covered, most
dependencies exploited).
There are two possible approaches - exhaustive and greedy.
The exhaustive one walks through all permutations of the
stats, so it's guaranteed to find the optimal solution, but
it soon gets very slow, as it's roughly O(N!). Dynamic
programming may improve that a bit, but it's still far too
expensive for large numbers of statistics (on a single
table).
The greedy algorithm is very simple - at every step it picks
the locally best statistics. That may not guarantee the best
solution globally (but maybe it does?), but it only needs N steps
to find the solution, so it's very fast (processing the
selected stats is usually way more expensive).
There's a GUC for selecting the search algorithm:
mvstat_search = {'greedy', 'exhaustive'}
The default value is 'greedy' as that's much safer (with
respect to runtime). See choose_mv_statistics().
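So to try the exhaustive search on a query (the table and
query below are just made-up examples), something like this
should work:

SET mvstat_search = 'exhaustive';
EXPLAIN SELECT * FROM test WHERE (a = 10) AND (b = 10);

and SET mvstat_search = 'greedy' switches back to the default.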
Once we have found a sequence of statistics, we apply
them to the clauses using the conditional probabilities.
We process the selected stats one by one, and for each
we select the estimated clauses and conditions. See
clauselist_selectivity() for more details.
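For illustration, assume (hypothetically) statistics S1 on
(a,b) and S2 on (b,c), and clauses (a=1), (b=1) and (c=1).
Processing the sequence [S1, S2] then works like this:

1) S1 estimates P[(a=1) & (b=1)]

2) S2 estimates P[(c=1) | (b=1)], i.e. (c=1) is the newly
estimated clause and (b=1) - already covered by S1 - is
used as a condition

and the two results are multiplied to get the selectivity.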
Limitations
-----------
It's still true that each clause at a given level has to
be covered by a single MV statistics. So with this query
WHERE (clause1) AND (clause2) AND (clause3 OR clause4)
each parenthesized clause has to be covered by a single
multivariate statistics.
Clauses not covered by a single statistics at this level
will be passed to clause_selectivity() but this will treat
them as a collection of simpler clauses (connected by AND
or OR), and the clauses from the previous level will be
used as conditions.
So using the same example, the last clause will be passed
to clause_selectivity() with 'clause1' and 'clause2' as
conditions, and it will be processed using multivariate
stats if possible.
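A concrete (made-up) example, with statistics only on (a,b):

CREATE STATISTICS s1 ON t (a, b) WITH (histogram);

SELECT * FROM t WHERE (a = 1) AND (b = 1) AND (b = 2 OR c = 2);

The first two clauses are estimated together using s1, but the
OR-clause also references column "c" (not covered by s1), so it
gets passed to clause_selectivity() with (a = 1) and (b = 1) as
conditions.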
The other limitation is that all the expressions have to be
mv-compatible, i.e. a single clause can't mix mv-compatible
and incompatible expressions. If this is violated, the clause
may be passed to the next level (just like a list of clauses
not covered by a single statistics), which splits it into
clauses handled by multivariate stats and clauses handled by
regular statistics.
rework clauselist_selectivity_or to handle OR-clauses correctly
---------------------------------------------------------------
We might invent a completely new set of functions here, resembling
clauselist_selectivity but adapting the ideas to OR-clauses.
But luckily we know that each OR-clause
(a OR b OR c)
may be rewritten as an equivalent AND-clause using negation:
NOT ((NOT a) AND (NOT b) AND (NOT c))
And that's something we can pass to clauselist_selectivity.
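In terms of selectivities, that is

s(a OR b OR c) = 1 - s((NOT a) AND (NOT b) AND (NOT c))

which is exactly what the new clauselist_selectivity_or()
computes.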
histogram call cache
--------------------
The call cache was removed because it did not initially work
well with OR clauses, but that was just a stupid thinko in the
implementation. This patch re-adds it, hopefully correctly.
The code in update_match_bitmap_histogram() is overly complex;
the branches handling the various inequality cases are redundant.
This needs to be simplified somehow.
---
contrib/file_fdw/file_fdw.c | 3 +-
contrib/postgres_fdw/postgres_fdw.c | 6 +-
src/backend/optimizer/path/clausesel.c | 2224 +++++++++++++++++++++++++++-----
src/backend/optimizer/path/costsize.c | 23 +-
src/backend/optimizer/util/orclauses.c | 4 +-
src/backend/utils/adt/selfuncs.c | 17 +-
src/backend/utils/misc/guc.c | 20 +
src/include/optimizer/cost.h | 6 +-
src/include/utils/mvstats.h | 8 +
9 files changed, 2003 insertions(+), 308 deletions(-)
diff --git a/contrib/file_fdw/file_fdw.c b/contrib/file_fdw/file_fdw.c
index f13316b..e25870f 100644
--- a/contrib/file_fdw/file_fdw.c
+++ b/contrib/file_fdw/file_fdw.c
@@ -954,7 +954,8 @@ estimate_size(PlannerInfo *root, RelOptInfo *baserel,
baserel->baserestrictinfo,
0,
JOIN_INNER,
- NULL);
+ NULL,
+ NIL);
nrows = clamp_row_est(nrows);
diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c
index 374faf5..8f05a02 100644
--- a/contrib/postgres_fdw/postgres_fdw.c
+++ b/contrib/postgres_fdw/postgres_fdw.c
@@ -457,7 +457,8 @@ postgresGetForeignRelSize(PlannerInfo *root,
fpinfo->local_conds,
baserel->relid,
JOIN_INNER,
- NULL);
+ NULL,
+ NIL);
cost_qual_eval(&fpinfo->local_conds_cost, fpinfo->local_conds, root);
@@ -2000,7 +2001,8 @@ estimate_path_cost_size(PlannerInfo *root,
local_join_conds,
baserel->relid,
JOIN_INNER,
- NULL);
+ NULL,
+ NIL);
local_sel *= fpinfo->local_conds_sel;
rows = clamp_row_est(rows * local_sel);
diff --git a/src/backend/optimizer/path/clausesel.c b/src/backend/optimizer/path/clausesel.c
index 5b2d92a..3d4d136 100644
--- a/src/backend/optimizer/path/clausesel.c
+++ b/src/backend/optimizer/path/clausesel.c
@@ -29,6 +29,8 @@
#include "utils/selfuncs.h"
#include "utils/typcache.h"
+#include "miscadmin.h"
+
/*
* Data structure for accumulating info about possible range-query
@@ -44,6 +46,13 @@ typedef struct RangeQueryClause
Selectivity hibound; /* Selectivity of a var < something clause */
} RangeQueryClause;
+static Selectivity clauselist_selectivity_or(PlannerInfo *root,
+ List *clauses,
+ int varRelid,
+ JoinType jointype,
+ SpecialJoinInfo *sjinfo,
+ List *conditions);
+
static void addRangeClause(RangeQueryClause **rqlist, Node *clause,
bool varonleft, bool isLTsel, Selectivity s2);
@@ -59,23 +68,29 @@ static Bitmapset *collect_mv_attnums(PlannerInfo *root, List *clauses,
Oid varRelid, Index *relid, SpecialJoinInfo *sjinfo,
int type);
+static Bitmapset *clause_mv_get_attnums(PlannerInfo *root, Node *clause);
+
static List *clauselist_apply_dependencies(PlannerInfo *root, List *clauses,
Oid varRelid, List *stats,
SpecialJoinInfo *sjinfo);
-static MVStatisticInfo *choose_mv_statistics(List *mvstats, Bitmapset *attnums);
-
static List *clauselist_mv_split(PlannerInfo *root, SpecialJoinInfo *sjinfo,
List *clauses, Oid varRelid,
List **mvclauses, MVStatisticInfo *mvstats, int types);
static Selectivity clauselist_mv_selectivity(PlannerInfo *root,
- List *clauses, MVStatisticInfo *mvstats);
+ MVStatisticInfo *mvstats, List *clauses,
+ List *conditions, bool is_or);
+
static Selectivity clauselist_mv_selectivity_mcvlist(PlannerInfo *root,
- List *clauses, MVStatisticInfo *mvstats,
- bool *fullmatch, Selectivity *lowsel);
+ MVStatisticInfo *mvstats,
+ List *clauses, List *conditions,
+ bool is_or, bool *fullmatch,
+ Selectivity *lowsel);
static Selectivity clauselist_mv_selectivity_histogram(PlannerInfo *root,
- List *clauses, MVStatisticInfo *mvstats);
+ MVStatisticInfo *mvstats,
+ List *clauses, List *conditions,
+ bool is_or);
static int update_match_bitmap_mcvlist(PlannerInfo *root, List *clauses,
int2vector *stakeys, MCVList mcvlist,
@@ -89,11 +104,59 @@ static int update_match_bitmap_histogram(PlannerInfo *root, List *clauses,
int nmatches, char * matches,
bool is_or);
+/*
+ * Describes a combination of multiple statistics to cover attributes
+ * referenced by the clauses. The array 'stats' (with nstats elements)
+ * lists the chosen statistics (in the order in which they are
+ * applied), along with the number of clauses and conditions covered
+ * by this solution.
+ *
+ * choose_mv_statistics_exhaustive() uses this to track both the current
+ * and the best solutions while walking through the space of possible
+ * combinations.
+ */
+typedef struct mv_solution_t {
+ int nclauses; /* number of clauses covered */
+ int nconditions; /* number of conditions covered */
+ int nstats; /* number of stats applied */
+ int *stats; /* stats (in the apply order) */
+} mv_solution_t;
+
+static List *choose_mv_statistics(PlannerInfo *root,
+ List *mvstats,
+ List *clauses, List *conditions,
+ Oid varRelid,
+ SpecialJoinInfo *sjinfo);
+
+static List *filter_clauses(PlannerInfo *root, Oid varRelid,
+ SpecialJoinInfo *sjinfo, int type,
+ List *stats, List *clauses,
+ Bitmapset **attnums);
+
+static List *filter_stats(List *stats, Bitmapset *new_attnums,
+ Bitmapset *all_attnums);
+
+static Bitmapset **make_stats_attnums(MVStatisticInfo *mvstats,
+ int nmvstats);
+
+static MVStatisticInfo *make_stats_array(List *stats, int *nmvstats);
+
+static List* filter_redundant_stats(List *stats,
+ List *clauses, List *conditions);
+
+static Node** make_clauses_array(List *clauses, int *nclauses);
+
+static Bitmapset ** make_clauses_attnums(PlannerInfo *root, Oid varRelid,
+ SpecialJoinInfo *sjinfo, int type,
+ Node **clauses, int nclauses);
+
+static bool* make_cover_map(Bitmapset **stats_attnums, int nmvstats,
+ Bitmapset **clauses_attnums, int nclauses);
+
static bool has_stats(List *stats, int type);
static List * find_stats(PlannerInfo *root, List *clauses,
Oid varRelid, Index *relid);
-
+
static Bitmapset* fdeps_collect_attnums(List *stats);
static int *make_idx_to_attnum_mapping(Bitmapset *attnums);
@@ -116,6 +179,8 @@ static Bitmapset *fdeps_filter_clauses(PlannerInfo *root,
static Bitmapset * get_varattnos(Node * node, Index relid);
+int mvstat_search_type = MVSTAT_SEARCH_GREEDY;
+
/* used for merging bitmaps - AND (min), OR (max) */
#define MAX(x, y) (((x) > (y)) ? (x) : (y))
#define MIN(x, y) (((x) < (y)) ? (x) : (y))
@@ -257,14 +322,15 @@ clauselist_selectivity(PlannerInfo *root,
List *clauses,
int varRelid,
JoinType jointype,
- SpecialJoinInfo *sjinfo)
+ SpecialJoinInfo *sjinfo,
+ List *conditions)
{
Selectivity s1 = 1.0;
RangeQueryClause *rqlist = NULL;
ListCell *l;
/* processing mv stats */
- Oid relid = InvalidOid;
+ Index relid = InvalidOid;
/* attributes in mv-compatible clauses */
Bitmapset *mvattnums = NULL;
@@ -274,12 +340,13 @@ clauselist_selectivity(PlannerInfo *root,
stats = find_stats(root, clauses, varRelid, &relid);
/*
- * If there's exactly one clause, then no use in trying to match up pairs,
- * so just go directly to clause_selectivity().
+ * If there's exactly one clause, then no use in trying to match up
+ * pairs, or matching multivariate statistics, so just go directly
+ * to clause_selectivity().
*/
if (list_length(clauses) == 1)
return clause_selectivity(root, (Node *) linitial(clauses),
- varRelid, jointype, sjinfo);
+ varRelid, jointype, sjinfo, conditions);
/*
* Check that there are some stats with functional dependencies
@@ -311,8 +378,8 @@ clauselist_selectivity(PlannerInfo *root,
}
/*
- * Check that there are statistics with MCV list. If not, we don't
- * need to waste time with the optimization.
+ * Check that there are statistics with MCV list or histogram.
+ * If not, we don't need to waste time with the optimization.
*/
if (has_stats(stats, MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST))
{
@@ -326,33 +393,194 @@ clauselist_selectivity(PlannerInfo *root,
/*
* If there still are at least two columns, we'll try to select
- * a suitable multivariate stats.
+ * a suitable combination of multivariate stats. If there are
+ * multiple combinations, we'll try to choose the best one.
+ * See choose_mv_statistics for more details.
*/
if (bms_num_members(mvattnums) >= 2)
{
- /* see choose_mv_statistics() for details */
- MVStatisticInfo *mvstat = choose_mv_statistics(stats, mvattnums);
+ int k;
+ ListCell *s;
+
+ /*
+ * Copy the list of conditions, so that we can build a list
+ * of local conditions (and keep the original intact, for
+ * the other clauses at the same level).
+ */
+ List *conditions_local = list_copy(conditions);
+
+ /* find the best combination of statistics */
+ List *solution = choose_mv_statistics(root, stats,
+ clauses, conditions,
+ varRelid, sjinfo);
- if (mvstat != NULL) /* we have a matching stats */
+ /* we have a good solution (list of stats) */
+ foreach (s, solution)
{
+ MVStatisticInfo *mvstat = (MVStatisticInfo *)lfirst(s);
+
/* clauses compatible with multi-variate stats */
List *mvclauses = NIL;
+ List *mvclauses_new = NIL;
+ List *mvclauses_conditions = NIL;
+ Bitmapset *stat_attnums = NULL;
- /* split the clauselist into regular and mv-clauses */
- clauses = clauselist_mv_split(root, sjinfo, clauses,
+ /* build attnum bitmapset for this statistics */
+ for (k = 0; k < mvstat->stakeys->dim1; k++)
+ stat_attnums = bms_add_member(stat_attnums,
+ mvstat->stakeys->values[k]);
+
+ /*
+ * Append the compatible conditions (passed from above)
+ * to mvclauses_conditions.
+ */
+ foreach (l, conditions)
+ {
+ Node *c = (Node*)lfirst(l);
+ Bitmapset *tmp = clause_mv_get_attnums(root, c);
+
+ if (bms_is_subset(tmp, stat_attnums))
+ mvclauses_conditions
+ = lappend(mvclauses_conditions, c);
+
+ bms_free(tmp);
+ }
+
+ /* split the clauselist into regular and mv-clauses
+ *
+ * We don't remove the matched clauses from the list
+ * yet, because they may still be needed as conditions
+ * for other clauses.
+ *
+ * FIXME Do this only once, i.e. filter the clauses
+ * once (selecting clauses covered by at least
+ * one statistics) and then convert them into
+ * smaller per-statistics lists of conditions
+ * and estimated clauses.
+ */
+ clauselist_mv_split(root, sjinfo, clauses,
varRelid, &mvclauses, mvstat,
(MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST));
- /* we've chosen the histogram to match the clauses */
+ /*
+ * We've chosen the statistics to match the clauses, so
+ * each statistics from the solution should have at least
+ * one new clause (not covered by the previous stats).
+ */
Assert(mvclauses != NIL);
+ /*
+ * Mvclauses now contains only clauses compatible
+ * with the currently selected stats, but we have to
+ * split that into conditions (already matched by
+ * the previous stats), and the new clauses we need
+ * to estimate using this stats.
+ */
+ foreach (l, mvclauses)
+ {
+ ListCell *p;
+ bool covered = false;
+ Node *clause = (Node *) lfirst(l);
+ Bitmapset *clause_attnums = clause_mv_get_attnums(root, clause);
+
+ /*
+ * If already covered by previous stats, add it to
+ * conditions.
+ *
+ * TODO Maybe this could be relaxed a bit? Because
+ * with complex and/or clauses, this might
+ * mean no statistics actually covers such
+ * complex clause.
+ */
+ foreach (p, solution)
+ {
+ int k;
+ Bitmapset *stat_attnums = NULL;
+
+ MVStatisticInfo *prev_stat
+ = (MVStatisticInfo *)lfirst(p);
+
+ /* break if we've run into the current statistics */
+ if (prev_stat == mvstat)
+ break;
+
+ for (k = 0; k < prev_stat->stakeys->dim1; k++)
+ stat_attnums = bms_add_member(stat_attnums,
+ prev_stat->stakeys->values[k]);
+
+ covered = bms_is_subset(clause_attnums, stat_attnums);
+
+ bms_free(stat_attnums);
+
+ if (covered)
+ break;
+ }
+
+ if (covered)
+ mvclauses_conditions
+ = lappend(mvclauses_conditions, clause);
+ else
+ mvclauses_new
+ = lappend(mvclauses_new, clause);
+ }
+
+ /*
+ * We need at least one new clause (not just conditions).
+ */
+ Assert(mvclauses_new != NIL);
+
/* compute the multivariate stats */
- s1 *= clauselist_mv_selectivity(root, mvclauses, mvstat);
+ s1 *= clauselist_mv_selectivity(root, mvstat,
+ mvclauses_new,
+ mvclauses_conditions,
+ false); /* AND */
+ }
+
+ /*
+ * And now finally remove all the mv-compatible clauses.
+ *
+ * This only repeats the same split as above, but this
+ * time we actually use the result list (and feed it to
+ * the next call).
+ */
+ foreach (s, solution)
+ {
+ /* clauses compatible with multi-variate stats */
+ List *mvclauses = NIL;
+
+ MVStatisticInfo *mvstat = (MVStatisticInfo *)lfirst(s);
+
+ /* split the list into regular and mv-clauses */
+ clauses = clauselist_mv_split(root, sjinfo, clauses,
+ varRelid, &mvclauses, mvstat,
+ (MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST));
+
+ /*
+ * Add the clauses to the conditions (to be passed
+ * to regular clauses), irrespective of whether they
+ * will be used as a condition or a clause here.
+ *
+ * We only keep the remaining conditions in the
+ * clauses (we keep what clauselist_mv_split returns)
+ * so we add each MV condition exactly once.
+ */
+ conditions_local = list_concat(conditions_local, mvclauses);
}
+
+ /* from now on, work with the 'local' list of conditions */
+ conditions = conditions_local;
}
}
/*
+ * If there's exactly one clause, then no use in trying to match up
+ * pairs, so just go directly to clause_selectivity().
+ */
+ if (list_length(clauses) == 1)
+ return clause_selectivity(root, (Node *) linitial(clauses),
+ varRelid, jointype, sjinfo, conditions);
+
+ /*
* Initial scan over clauses. Anything that doesn't look like a potential
* rangequery clause gets multiplied into s1 and forgotten. Anything that
* does gets inserted into an rqlist entry.
@@ -364,7 +592,8 @@ clauselist_selectivity(PlannerInfo *root,
Selectivity s2;
/* Always compute the selectivity using clause_selectivity */
- s2 = clause_selectivity(root, clause, varRelid, jointype, sjinfo);
+ s2 = clause_selectivity(root, clause, varRelid, jointype, sjinfo,
+ conditions);
/*
* Check for being passed a RestrictInfo.
@@ -523,6 +752,55 @@ clauselist_selectivity(PlannerInfo *root,
}
/*
+ * Similar to clauselist_selectivity(), but for OR-clauses. We can't
+ * simply apply exactly the same logic as to AND-clauses, because there
+ * are a few key differences:
+ *
+ * - functional dependencies don't really apply to OR-clauses
+ *
+ * - clauselist_selectivity() works by decomposing the selectivity
+ * into conditional selectivities (probabilities), but that can be
+ * done only for AND-clauses. That means problems with applying
+ * multiple statistics (and reusing clauses as conditions, etc.).
+ *
+ * We might invent a completely new set of functions here, resembling
+ * clauselist_selectivity but adapting the ideas to OR-clauses.
+ *
+ * But luckily we know that each OR-clause
+ *
+ * (a OR b OR c)
+ *
+ * may be rewritten as an equivalent AND-clause using negation:
+ *
+ * NOT ((NOT a) AND (NOT b) AND (NOT c))
+ *
+ * And that's something we can pass to clauselist_selectivity.
+ */
+static Selectivity
+clauselist_selectivity_or(PlannerInfo *root,
+ List *clauses,
+ int varRelid,
+ JoinType jointype,
+ SpecialJoinInfo *sjinfo,
+ List *conditions)
+{
+ List *args = NIL;
+ ListCell *l;
+ Expr *expr;
+
+ /* (NOT ...) */
+ foreach (l, clauses)
+ args = lappend(args, makeBoolExpr(NOT_EXPR, list_make1(lfirst(l)), -1));
+
+ /* ((NOT ...) AND (NOT ...)) */
+ expr = makeBoolExpr(AND_EXPR, args, -1);
+
+ /* NOT (... AND ...) */
+ return 1.0 - clauselist_selectivity(root, list_make1(expr), varRelid,
+ jointype, sjinfo, conditions);
+}
+
+/*
* addRangeClause --- add a new range clause for clauselist_selectivity
*
* Here is where we try to match up pairs of range-query clauses
@@ -729,7 +1007,8 @@ clause_selectivity(PlannerInfo *root,
Node *clause,
int varRelid,
JoinType jointype,
- SpecialJoinInfo *sjinfo)
+ SpecialJoinInfo *sjinfo,
+ List *conditions)
{
Selectivity s1 = 0.5; /* default for any unhandled clause type */
RestrictInfo *rinfo = NULL;
@@ -849,7 +1128,8 @@ clause_selectivity(PlannerInfo *root,
(Node *) get_notclausearg((Expr *) clause),
varRelid,
jointype,
- sjinfo);
+ sjinfo,
+ conditions);
}
else if (and_clause(clause))
{
@@ -858,29 +1138,18 @@ clause_selectivity(PlannerInfo *root,
((BoolExpr *) clause)->args,
varRelid,
jointype,
- sjinfo);
+ sjinfo,
+ conditions);
}
else if (or_clause(clause))
{
- /*
- * Selectivities for an OR clause are computed as s1+s2 - s1*s2 to
- * account for the probable overlap of selected tuple sets.
- *
- * XXX is this too conservative?
- */
- ListCell *arg;
-
- s1 = 0.0;
- foreach(arg, ((BoolExpr *) clause)->args)
- {
- Selectivity s2 = clause_selectivity(root,
- (Node *) lfirst(arg),
- varRelid,
- jointype,
- sjinfo);
-
- s1 = s1 + s2 - s1 * s2;
- }
+ /* just call to clauselist_selectivity_or() */
+ s1 = clauselist_selectivity_or(root,
+ ((BoolExpr *) clause)->args,
+ varRelid,
+ jointype,
+ sjinfo,
+ conditions);
}
else if (is_opclause(clause) || IsA(clause, DistinctExpr))
{
@@ -970,7 +1239,8 @@ clause_selectivity(PlannerInfo *root,
(Node *) ((RelabelType *) clause)->arg,
varRelid,
jointype,
- sjinfo);
+ sjinfo,
+ conditions);
}
else if (IsA(clause, CoerceToDomain))
{
@@ -979,7 +1249,8 @@ clause_selectivity(PlannerInfo *root,
(Node *) ((CoerceToDomain *) clause)->arg,
varRelid,
jointype,
- sjinfo);
+ sjinfo,
+ conditions);
}
else
{
@@ -1103,9 +1374,67 @@ clause_selectivity(PlannerInfo *root,
* them without inspection, which is more expensive). But this
* requires really knowing the per-clause selectivities in advance,
* and that's not what we do now.
+ *
+ * TODO All this is based on the assumption that the statistics represent
+ * the necessary dependencies, i.e. that if two columns are not in
+ * the same statistics, there's no dependency. If that's not the
+ * case, we may get misestimates, just like before. For example
+ * assume we have a table with three columns [a,b,c] with exactly
+ * the same values, and statistics on [a,b] and [b,c]. So something
+ * like this:
+ *
+ * CREATE TABLE test AS SELECT i AS a, i AS b, i AS c
+ * FROM generate_series(1,1000) s(i);
+ *
+ * CREATE STATISTICS s1 ON test (a, b) WITH (mcv);
+ * CREATE STATISTICS s2 ON test (b, c) WITH (mcv);
+ *
+ * ANALYZE test;
+ *
+ * EXPLAIN ANALYZE SELECT * FROM test
+ * WHERE (a < 10) AND (b < 20) AND (c < 10);
+ *
+ * The problem here is that the only shared column between the two
+ * statistics is 'b' so the probability will be computed like this
+ *
+ * P[(a < 10) & (b < 20) & (c < 10)]
+ * = P[(a < 10) & (b < 20)] * P[(c < 10) | (a < 10) & (b < 20)]
+ * = P[(a < 10) & (b < 20)] * P[(c < 10) | (b < 20)]
+ *
+ * or like this
+ *
+ * P[(a < 10) & (b < 20) & (c < 10)]
+ * = P[(b < 20) & (c < 10)] * P[(a < 10) | (b < 20) & (c < 10)]
+ * = P[(b < 20) & (c < 10)] * P[(a < 10) | (b < 20)]
+ *
+ * In both cases the conditional probabilities will be evaluated as
+ * 0.5, because they lack the other column (which would make it 1.0).
+ *
+ * Theoretically it might be possible to transfer the dependency,
+ * e.g. by building bitmap for [a,b] and then combine it with [b,c]
+ * by doing something like this:
+ *
+ * 1) build bitmap on [a,b] using [(a<10) & (b < 20)]
+ * 2) for each element in [b,c] check the bitmap
+ *
+ * But that's certainly nontrivial - for example the statistics may
+ * be different (MCV list vs. histogram) and/or the items may not
+ * match (e.g. MCV items or histogram buckets will be built
+ * differently). Also, for one value of 'b' there might be multiple
+ * MCV items (because of the other column values) with different
+ * bitmap values (some will match, some won't) - so it's not exactly
+ * bitmap but a partial match.
+ *
+ * Maybe a hash table with number of matches and mismatches (or
+ * maybe sums of frequencies) would work? The step (2) would then
+ * lookup the values and use that to weight the item somehow.
+ *
+ * Currently the only solution is to build statistics on all three
+ * columns.
*/
static Selectivity
-clauselist_mv_selectivity(PlannerInfo *root, List *clauses, MVStatisticInfo *mvstats)
+clauselist_mv_selectivity(PlannerInfo *root, MVStatisticInfo *mvstats,
+ List *clauses, List *conditions, bool is_or)
{
bool fullmatch = false;
Selectivity s1 = 0.0, s2 = 0.0;
@@ -1123,7 +1452,8 @@ clauselist_mv_selectivity(PlannerInfo *root, List *clauses, MVStatisticInfo *mvs
*/
/* Evaluate the MCV first. */
- s1 = clauselist_mv_selectivity_mcvlist(root, clauses, mvstats,
+ s1 = clauselist_mv_selectivity_mcvlist(root, mvstats,
+ clauses, conditions, is_or,
&fullmatch, &mcv_low);
/*
@@ -1136,7 +1466,8 @@ clauselist_mv_selectivity(PlannerInfo *root, List *clauses, MVStatisticInfo *mvs
/* FIXME if (fullmatch) without matching MCV item, use the mcv_low
* selectivity as upper bound */
- s2 = clauselist_mv_selectivity_histogram(root, clauses, mvstats);
+ s2 = clauselist_mv_selectivity_histogram(root, mvstats,
+ clauses, conditions, is_or);
/* TODO clamp to <= 1.0 (or more strictly, when possible) */
return s1 + s2;
@@ -1176,8 +1507,7 @@ collect_mv_attnums(PlannerInfo *root, List *clauses, Oid varRelid,
*/
if (bms_num_members(attnums) <= 1)
{
- if (attnums != NULL)
- pfree(attnums);
+ bms_free(attnums);
attnums = NULL;
*relid = InvalidOid;
}
@@ -1186,202 +1516,931 @@ collect_mv_attnums(PlannerInfo *root, List *clauses, Oid varRelid,
}
/*
- * We're looking for statistics matching at least 2 attributes,
- * referenced in the clauses compatible with multivariate statistics.
- * The current selection criteria is very simple - we choose the
- * statistics referencing the most attributes.
+ * Selects the best combination of multivariate statistics, in an
+ * exhaustive way, where 'best' means:
*
- * If there are multiple statistics referencing the same number of
- * columns (from the clauses), the one with less source columns
- * (as listed in the ADD STATISTICS when creating the statistics) wins.
- * Other wise the first one wins.
+ * (a) covering the most attributes (referenced by clauses)
+ * (b) using the least number of multivariate stats
+ * (c) using the most conditions to exploit dependency
*
- * This is a very simple criteria, and has several weaknesses:
+ * There may be other optimality criteria, not considered in the initial
+ * implementation (more on that 'weaknesses' section).
*
- * (a) does not consider the accuracy of the statistics
+ * This pretty much splits the probability of clauses (aka selectivity)
+ * into a sequence of conditional probabilities, like this
*
- * If there are two histograms built on the same set of columns,
- * but one has 100 buckets and the other one has 1000 buckets (thus
- * likely providing better estimates), this is not currently
- * considered.
+ * P(A,B,C,D) = P(A,B) * P(C|A,B) * P(D|A,B,C)
*
- * (b) does not consider the type of statistics
+ * and removing the attributes not referenced by the existing stats,
+ * under the assumption that there's no dependency (otherwise the DBA
+ * would create the stats).
*
- * If there are three statistics - one containing just a MCV list,
- * another one with just a histogram and a third one with both,
- * this is not considered.
+ * The last criterion means that when we have the choice to compute like
+ * this
*
- * (c) does not consider the number of clauses
+ * P(A,B,C,D) = P(A,B,C) * P(D|B,C)
*
- * As explained, only the number of referenced attributes counts,
- * so if there are multiple clauses on a single attribute, this
- * still counts as a single attribute.
+ * or like this
*
- * (d) does not consider type of condition
+ * P(A,B,C,D) = P(A,B,C) * P(D|C)
*
- * Some clauses may work better with some statistics - for example
- * equality clauses probably work better with MCV lists than with
- * histograms. But IS [NOT] NULL conditions may often work better
- * with histograms (thanks to NULL-buckets).
+ * we should use the first option, as that exploits more dependencies.
*
- * So for example with five WHERE conditions
+ * The order of statistics in the solution implicitly determines the
+ * order of estimation of clauses, because as we apply a statistics,
+ * we always use it to estimate all the clauses covered by it (and
+ * then we use those clauses as conditions for the next statistics).
*
- * WHERE (a = 1) AND (b = 1) AND (c = 1) AND (d = 1) AND (e = 1)
+ * Don't call this directly but through choose_mv_statistics().
*
- * and statistics on (a,b), (a,b,e) and (a,b,c,d), the last one will be
- * selected as it references the most columns.
*
- * Once we have selected the multivariate statistics, we split the list
- * of clauses into two parts - conditions that are compatible with the
- * selected stats, and conditions are estimated using simple statistics.
+ * Algorithm
+ * ---------
+ * The algorithm is a recursive implementation of backtracking, with
+ * maximum 'depth' equal to the number of multi-variate statistics
+ * available on the table.
*
- * From the example above, conditions
+ * It explores all the possible permutations of the stats.
+ *
+ * Whenever it considers adding the next statistics, the clauses it
+ * matches are divided into 'conditions' (clauses already matched by at
+ * least one previous statistics) and clauses that are estimated.
*
- * (a = 1) AND (b = 1) AND (c = 1) AND (d = 1)
+ * Then several checks are performed:
*
- * will be estimated using the multivariate statistics (a,b,c,d) while
- * the last condition (e = 1) will get estimated using the regular ones.
+ * (a) The statistics covers at least 2 columns, referenced in the
+ * estimated clauses (otherwise multi-variate stats are useless).
*
- * There are various alternative selection criteria (e.g. counting
- * conditions instead of just referenced attributes), but eventually
- * the best option should be to combine multiple statistics. But that's
- * much harder to do correctly.
+ * (b) The statistics covers at least 1 new column, i.e. column not
+ * referenced by the already used stats (and the new column has
+ * to be referenced by the clauses, of course). Otherwise the
+ * statistics would not add any new information.
*
- * TODO Select multiple statistics and combine them when computing
- * the estimate.
+ * There are some other sanity checks (e.g. that the stats must not be
+ * used twice etc.).
*
- * TODO This will probably have to consider compatibility of clauses,
- * because 'dependencies' will probably work only with equality
- * clauses.
+ * Finally the new solution is compared to the currently best one, and
+ * if it's considered better, it's used instead.
+ *
+ *
+ * Weaknesses
+ * ----------
+ * The current implementation uses somewhat simplistic optimality
+ * criteria, suffering from the following weaknesses.
+ *
+ * (a) There may be multiple solutions with the same number of covered
+ * attributes and number of statistics (e.g. the same solution but
+ * with statistics in a different order). It's unclear which solution
+ * is the best one - in a sense all of them are equal.
+ *
+ * TODO It might be possible to compute estimate for each of those
+ * solutions, and then combine them to get the final estimate
+ * (e.g. by using average or median).
+ *
+ * (b) Does not consider that some types of stats are a better match for
+ * some types of clauses (e.g. an MCV list is a better match for
+ * equality clauses than a histogram).
+ *
+ * XXX Maybe MCV is almost always better / more accurate?
+ *
+ * But maybe this is pointless - generally, each column is either
+ * a label (whether because of the data type or how it's used),
+ * or a value with an ordering that makes sense. So either an
+ * MCV list is more appropriate (labels) or a histogram (ordered
+ * values).
+ *
+ * Not sure what to do with statistics mixing columns of
+ * both types - maybe it'd be better to invent a new type of stats
+ * combining MCV list and histogram (keeping a small histogram for
+ * each MCV item, and a separate histogram for values not on the
+ * MCV list). But that's not implemented at this moment.
+ *
+ * TODO The algorithm should probably count number of Vars (not just
+ * attnums) when computing the 'score' of each solution. Computing
+ * the ratio of (num of all vars) / (num of condition vars) as a
+ * measure of how well the solution uses conditions might be
+ * useful.
*/
-static MVStatisticInfo *
-choose_mv_statistics(List *stats, Bitmapset *attnums)
+static void
+choose_mv_statistics_exhaustive(PlannerInfo *root, int step,
+ int nmvstats, MVStatisticInfo *mvstats, Bitmapset ** stats_attnums,
+ int nclauses, Node ** clauses, Bitmapset ** clauses_attnums,
+ int nconditions, Node ** conditions, Bitmapset ** conditions_attnums,
+ bool *cover_map, bool *condition_map, int *ruled_out,
+ mv_solution_t *current, mv_solution_t **best)
{
- int i;
- ListCell *lc;
+ int i, j;
- MVStatisticInfo *choice = NULL;
+ Assert(best != NULL);
+ Assert((step == 0 && current == NULL) || (step > 0 && current != NULL));
- int current_matches = 1; /* goal #1: maximize */
- int current_dims = (MVSTATS_MAX_DIMENSIONS+1); /* goal #2: minimize */
+ CHECK_FOR_INTERRUPTS();
+
+ if (current == NULL)
+ {
+ current = (mv_solution_t*)palloc0(sizeof(mv_solution_t));
+ current->stats = (int*)palloc0(sizeof(int)*nmvstats);
+ current->nstats = 0;
+ current->nclauses = 0;
+ current->nconditions = 0;
+ }
/*
- * Walk through the statistics (simple array with nmvstats elements)
- * and for each one count the referenced attributes (encoded in
- * the 'attnums' bitmap).
+ * Now try to apply each statistics, matching at least two attributes,
+ * unless it's already used in one of the previous steps.
*/
- foreach (lc, stats)
+ for (i = 0; i < nmvstats; i++)
{
- MVStatisticInfo *info = (MVStatisticInfo *)lfirst(lc);
+ int c;
- /* columns matching this statistics */
- int matches = 0;
+ int ncovered_clauses = 0; /* number of covered clauses */
+ int ncovered_conditions = 0; /* number of covered conditions */
+ int nattnums = 0; /* number of covered attributes */
- int2vector * attrs = info->stakeys;
- int numattrs = attrs->dim1;
+ Bitmapset *all_attnums = NULL;
+ Bitmapset *new_attnums = NULL;
- /* skip dependencies-only stats */
- if (! (info->mcv_built || info->hist_built))
+ /* skip statistics that were already used or eliminated */
+ if (ruled_out[i] != -1)
continue;
- /* count columns covered by the histogram */
- for (i = 0; i < numattrs; i++)
- if (bms_is_member(attrs->values[i], attnums))
- matches++;
-
/*
- * Use this statistics when it improves the number of matches or
- * when it matches the same number of attributes but is smaller.
+ * See if we have clauses covered by this statistics, but not
+ * yet covered by any of the preceding ones.
*/
- if ((matches > current_matches) ||
- ((matches == current_matches) && (current_dims > numattrs)))
+ for (c = 0; c < nclauses; c++)
{
- choice = info;
- current_matches = matches;
- current_dims = numattrs;
- }
- }
+ bool covered = false;
+ Bitmapset *clause_attnums = clauses_attnums[c];
+ Bitmapset *tmp = NULL;
- return choice;
-}
+ /*
+ * If this clause is not covered by this stats, we can't
+ * use the stats to estimate that at all.
+ */
+ if (! cover_map[i * nclauses + c])
+ continue;
+ /*
+ * Now we know we'll use this clause - either as a condition
+ * or as a new clause (the estimated one). So let's add the
+ * attributes to the attnums from all the clauses usable with
+ * this statistics.
+ */
+ tmp = bms_union(all_attnums, clause_attnums);
-/*
- * This splits the clauses list into two parts - one containing clauses
- * that will be evaluated using the chosen statistics, and the remaining
- * clauses (either non-mvcompatible, or not related to the histogram).
- */
-static List *
-clauselist_mv_split(PlannerInfo *root, SpecialJoinInfo *sjinfo,
- List *clauses, Oid varRelid, List **mvclauses,
- MVStatisticInfo *mvstats, int types)
-{
- int i;
- ListCell *l;
- List *non_mvclauses = NIL;
+ /* free the old bitmap */
+ bms_free(all_attnums);
+ all_attnums = tmp;
- /* FIXME is there a better way to get info on int2vector? */
- int2vector * attrs = mvstats->stakeys;
- int numattrs = mvstats->stakeys->dim1;
+ /* let's see if it's covered by any of the previous stats */
+ for (j = 0; j < step; j++)
+ {
+ /* already covered by the previous stats */
+ if (cover_map[current->stats[j] * nclauses + c])
+ covered = true;
- Bitmapset *mvattnums = NULL;
+ if (covered)
+ break;
+ }
- /* build bitmap of attributes covered by the stats, so we can
- * do bms_is_subset later */
- for (i = 0; i < numattrs; i++)
- mvattnums = bms_add_member(mvattnums, attrs->values[i]);
+ /* if already covered, continue with the next clause */
+ if (covered)
+ {
+ ncovered_conditions += 1;
+ continue;
+ }
- /* erase the list of mv-compatible clauses */
- *mvclauses = NIL;
+ /*
+ * OK, this clause is covered by this statistics (and not by
+ * any of the previous ones)
+ */
+ ncovered_clauses += 1;
- foreach (l, clauses)
- {
- bool match = false; /* by default not mv-compatible */
- Bitmapset *attnums = NULL;
- Node *clause = (Node *) lfirst(l);
+ /* add the attnums to the 'new clauses' set (currently disabled) */
+ /* new_attnums = bms_union(new_attnums, clause_attnums); */
+ }
- if (clause_is_mv_compatible(root, clause, varRelid, NULL,
- &attnums, sjinfo, types))
+ /* can't have more new clauses than original clauses */
+ Assert(nclauses >= ncovered_clauses);
+ Assert(ncovered_clauses >= 0); /* mostly paranoia */
+
+ nattnums = bms_num_members(all_attnums);
+
+ /* free all the bitmapsets - we don't need them anymore */
+ bms_free(all_attnums);
+ bms_free(new_attnums);
+
+ all_attnums = NULL;
+ new_attnums = NULL;
+
+ /*
+ * Now walk through the conditions and see which of them
+ * are covered by this statistics.
+ */
+ for (c = 0; c < nconditions; c++)
{
- /* are all the attributes part of the selected stats? */
- if (bms_is_subset(attnums, mvattnums))
- match = true;
+ Bitmapset *clause_attnums = conditions_attnums[c];
+ Bitmapset *tmp = NULL;
+
+ /*
+ * If this clause is not covered by this stats, we can't
+ * use the stats to estimate that at all.
+ */
+ if (! condition_map[i * nconditions + c])
+ continue;
+
+ /* count this as a condition */
+ ncovered_conditions += 1;
+
+ /*
+ * Now we know we'll use this clause - either as a condition
+ * or as a new clause (the estimated one). So let's add the
+ * attributes to the attnums from all the clauses usable with
+ * this statistics.
+ */
+ tmp = bms_union(all_attnums, clause_attnums);
+
+ /* free the old bitmap */
+ bms_free(all_attnums);
+ all_attnums = tmp;
}
/*
- * The clause matches the selected stats, so put it to the list
- * of mv-compatible clauses. Otherwise, keep it in the list of
- * 'regular' clauses (that may be selected later).
+ * Let's mark the statistics as 'ruled out' - either we'll use
+ * it (and proceed to the next step), or it's incompatible.
*/
- if (match)
- *mvclauses = lappend(*mvclauses, clause);
- else
- non_mvclauses = lappend(non_mvclauses, clause);
- }
+ ruled_out[i] = step;
- /*
- * Perform regular estimation using the clauses incompatible
- * with the chosen histogram (or MV stats in general).
- */
- return non_mvclauses;
+ /*
+ * There are no clauses usable with this statistics (i.e. none
+ * that aren't already covered by some of the previous stats).
+ *
+ * Similarly, if the clauses only use a single attribute, we
+ * can't really use that.
+ */
+ if ((ncovered_clauses == 0) || (nattnums < 2))
+ continue;
-}
+ /*
+ * TODO Not sure if it's possible to add a clause referencing
+ * only attributes already covered by previous stats,
+ * introducing a new dependency but no new attribute.
+ * Couldn't come up with an example, though. Might be
+ * worth adding an assert.
+ */
-/*
- * Determines whether the clause is compatible with multivariate stats,
- * and if it is, returns some additional information - varno (index
- * into simple_rte_array) and a bitmap of attributes. This is then
- * used to fetch related multivariate statistics.
- *
- * At this moment we only support basic conditions of the form
- *
- * variable OP constant
- *
- * where OP is one of [=,<,<=,>=,>] (which is however determined by
- * looking at the associated function for estimating selectivity, just
- * like with the single-dimensional case).
- *
- * TODO Support 'OR clauses' - shouldn't be all that difficult to
+ /*
+ * got a suitable statistics - let's update the current solution,
+ * maybe use it as the best solution
+ */
+ current->nclauses += ncovered_clauses;
+ current->nconditions += ncovered_conditions;
+ current->nstats += 1;
+ current->stats[step] = i;
+
+ /*
+ * We can never cover more clauses, or use more stats, than we
+ * actually had at the beginning.
+ */
+ Assert(nclauses >= current->nclauses);
+ Assert(nmvstats >= current->nstats);
+ Assert(step < nmvstats);
+
+ /* we can't get more conditions than clauses and conditions combined
+ *
+ * FIXME This assert does not work because we count the conditions
+ * repeatedly (once for each statistics covering it).
+ */
+ /* Assert((nconditions + nclauses) >= current->nconditions); */
+
+ if (*best == NULL)
+ {
+ *best = (mv_solution_t*)palloc0(sizeof(mv_solution_t));
+ (*best)->stats = (int*)palloc0(sizeof(int)*nmvstats);
+ (*best)->nstats = 0;
+ (*best)->nclauses = 0;
+ (*best)->nconditions = 0;
+ }
+
+ /* see if it's better than the current 'best' solution */
+ if ((current->nclauses > (*best)->nclauses) ||
+ ((current->nclauses == (*best)->nclauses) &&
+ ((current->nstats > (*best)->nstats))))
+ {
+ (*best)->nstats = current->nstats;
+ (*best)->nclauses = current->nclauses;
+ (*best)->nconditions = current->nconditions;
+ memcpy((*best)->stats, current->stats, nmvstats * sizeof(int));
+ }
+
+ /*
+ * The recursion only makes sense if there are still unused
+ * statistics left (otherwise there's nothing to add).
+ */
+ if ((step + 1) < nmvstats)
+ choose_mv_statistics_exhaustive(root, step+1,
+ nmvstats, mvstats, stats_attnums,
+ nclauses, clauses, clauses_attnums,
+ nconditions, conditions, conditions_attnums,
+ cover_map, condition_map, ruled_out,
+ current, best);
+
+ /* reset the last step */
+ current->nclauses -= ncovered_clauses;
+ current->nconditions -= ncovered_conditions;
+ current->nstats -= 1;
+ current->stats[step] = 0;
+
+ /* mark the statistics as usable again */
+ ruled_out[i] = -1;
+
+ Assert(current->nclauses >= 0);
+ Assert(current->nstats >= 0);
+ }
+
+ /* reset all statistics as 'incompatible' in this step */
+ for (i = 0; i < nmvstats; i++)
+ if (ruled_out[i] == step)
+ ruled_out[i] = -1;
+
+}
+
+/*
+ * Greedy search for a multivariate solution - a sequence of statistics
+ * covering the clauses. This chooses the "best" statistics at each step,
+ * so the resulting solution may not be the best solution globally, but
+ * this produces the solution in only N steps (where N is the number of
+ * statistics), while the exhaustive approach may have to walk through
+ * ~N! combinations (although some of those are terminated early).
+ *
+ * See the comments at choose_mv_statistics_exhaustive() as this does
+ * the same thing (but in a different way).
+ *
+ * Don't call this directly, but through choose_mv_statistics().
+ *
+ * TODO There are probably other metrics we might use - e.g. using
+ * number of columns (num_cond_columns / num_cov_columns), which
+ * might work better with a mix of simple and complex clauses.
+ *
+ * TODO Also the choice at the very first step should be handled
+ * in a special way, because there will be 0 conditions at that
+ * moment, so there needs to be some other criteria - e.g. using
+ * the simplest (or most complex?) clause might be a good idea.
+ *
+ * TODO We might also select multiple stats using different criteria,
+ * and branch the search. This is however tricky, because if we
+ * choose k statistics at each step, we get k^N branches to
+ * walk through (with N steps). That's not really good with
+ * large number of stats (yet better than exhaustive search).
+ */
+static void
+choose_mv_statistics_greedy(PlannerInfo *root, int step,
+ int nmvstats, MVStatisticInfo *mvstats, Bitmapset ** stats_attnums,
+ int nclauses, Node ** clauses, Bitmapset ** clauses_attnums,
+ int nconditions, Node ** conditions, Bitmapset ** conditions_attnums,
+ bool *cover_map, bool *condition_map, int *ruled_out,
+ mv_solution_t *current, mv_solution_t **best)
+{
+ int i, j;
+ int best_stat = -1;
+ double gain, max_gain = -1.0;
+
+ /*
+ * Bitmap tracking which clauses are already covered (by the previous
+ * statistics) and may thus serve only as a condition in this step.
+ */
+ bool *covered_clauses = (bool *) palloc0(nclauses * sizeof(bool));
+
+ /*
+ * Number of clauses and columns covered by each statistics - this
+ * includes both conditions and clauses covered by the statistics for
+ * the first time. The number of columns may count some columns
+ * repeatedly - if a column is shared by multiple clauses, it will
+ * be counted once for each clause (covered by the statistics).
+ * So with two clauses [(a=1 OR b=2),(a<2 OR c>1)] the column "a"
+ * will be counted twice (if both clauses are covered).
+ *
+ * The values for ruled-out statistics (that can't be applied) are
+ * not computed, because that'd be pointless.
+ */
+ int *num_cov_clauses = (int*)palloc0(sizeof(int) * nmvstats);
+ int *num_cov_columns = (int*)palloc0(sizeof(int) * nmvstats);
+
+ /*
+ * Same as above, but this only includes clauses that are already
+ * covered by the previous stats (and the current one).
+ */
+ int *num_cond_clauses = (int*)palloc0(sizeof(int) * nmvstats);
+ int *num_cond_columns = (int*)palloc0(sizeof(int) * nmvstats);
+
+ /*
+ * Number of attributes for each clause.
+ *
+ * TODO Might be computed in choose_mv_statistics() and then passed
+ * here, but then the function would not have the same signature
+ * as _exhaustive().
+ */
+ int *attnum_counts = (int*)palloc0(sizeof(int) * nclauses);
+ int *attnum_cond_counts = (int*)palloc0(sizeof(int) * nconditions);
+
+ CHECK_FOR_INTERRUPTS();
+
+ Assert(best != NULL);
+ Assert((step == 0 && current == NULL) || (step > 0 && current != NULL));
+
+ /* compute attributes (columns) for each clause */
+ for (i = 0; i < nclauses; i++)
+ attnum_counts[i] = bms_num_members(clauses_attnums[i]);
+
+ /* compute attributes (columns) for each condition */
+ for (i = 0; i < nconditions; i++)
+ attnum_cond_counts[i] = bms_num_members(conditions_attnums[i]);
+
+ /* see which clauses are already covered at this point (by previous stats) */
+ for (i = 0; i < step; i++)
+ for (j = 0; j < nclauses; j++)
+ covered_clauses[j] |= (cover_map[current->stats[i] * nclauses + j]);
+
+ /* which remaining statistics covers most clauses / uses most conditions? */
+ for (i = 0; i < nmvstats; i++)
+ {
+ Bitmapset *attnums_covered = NULL;
+ Bitmapset *attnums_conditions = NULL;
+
+ /* skip stats that are already ruled out (either used or inapplicable) */
+ if (ruled_out[i] != -1)
+ continue;
+
+ /* count covered clauses and conditions (for the statistics) */
+ for (j = 0; j < nclauses; j++)
+ {
+ if (cover_map[i * nclauses + j])
+ {
+ Bitmapset *attnums_new
+ = bms_union(attnums_covered, clauses_attnums[j]);
+
+ /* get rid of the old bitmap and keep the unified result */
+ bms_free(attnums_covered);
+ attnums_covered = attnums_new;
+
+ num_cov_clauses[i] += 1;
+ num_cov_columns[i] += attnum_counts[j];
+
+ /* is the clause already covered (i.e. a condition)? */
+ if (covered_clauses[j])
+ {
+ num_cond_clauses[i] += 1;
+ num_cond_columns[i] += attnum_counts[j];
+ attnums_new = bms_union(attnums_conditions,
+ clauses_attnums[j]);
+
+ bms_free(attnums_conditions);
+ attnums_conditions = attnums_new;
+ }
+ }
+ }
+
+ /* if all covered clauses are covered by prev stats (thus conditions) */
+ if (num_cov_clauses[i] == num_cond_clauses[i])
+ ruled_out[i] = step;
+
+ /* same if there are no new attributes */
+ else if (bms_num_members(attnums_conditions) == bms_num_members(attnums_covered))
+ ruled_out[i] = step;
+
+ bms_free(attnums_covered);
+ bms_free(attnums_conditions);
+
+ /* if the statistics is inapplicable, try the next one */
+ if (ruled_out[i] != -1)
+ continue;
+
+ /* now let's walk through conditions and count the covered */
+ for (j = 0; j < nconditions; j++)
+ {
+ if (condition_map[i * nconditions + j])
+ {
+ num_cond_clauses[i] += 1;
+ num_cond_columns[i] += attnum_cond_counts[j];
+ }
+ }
+
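+ /*
+ * The gain is the ratio of condition columns (columns from clauses
+ * already estimated by previous statistics) to all columns covered
+ * by this statistics, so statistics that reuse more of the
+ * already-estimated columns rank higher.
+ */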
+ /* compute the gain and keep track of the best statistics so far */
+ gain = num_cond_columns[i] / (double)num_cov_columns[i];
+
+ if (gain > max_gain)
+ {
+ max_gain = gain;
+ best_stat = i;
+ }
+ }
+
+ /*
+ * Have we found a suitable statistics? Add it to the solution and
+ * try next step.
+ */
+ if (best_stat != -1)
+ {
+ /* mark the statistics, so that we skip it in next steps */
+ ruled_out[best_stat] = step;
+
+ /* allocate current solution if necessary */
+ if (current == NULL)
+ {
+ current = (mv_solution_t*)palloc0(sizeof(mv_solution_t));
+ current->stats = (int*)palloc0(sizeof(int)*nmvstats);
+ current->nstats = 0;
+ current->nclauses = 0;
+ current->nconditions = 0;
+ }
+
+ current->nclauses += num_cov_clauses[best_stat];
+ current->nconditions += num_cond_clauses[best_stat];
+ current->stats[step] = best_stat;
+ current->nstats++;
+
+ if (*best == NULL)
+ {
+ (*best) = (mv_solution_t*)palloc0(sizeof(mv_solution_t));
+ (*best)->nstats = current->nstats;
+ (*best)->nclauses = current->nclauses;
+ (*best)->nconditions = current->nconditions;
+
+ (*best)->stats = (int*)palloc0(sizeof(int)*nmvstats);
+ memcpy((*best)->stats, current->stats, nmvstats * sizeof(int));
+ }
+ else
+ {
+ /* see if this is a better solution */
+ double current_gain = (double)current->nconditions / current->nclauses;
+ double best_gain = (double)(*best)->nconditions / (*best)->nclauses;
+
+ if ((current_gain > best_gain) ||
+ ((current_gain == best_gain) && (current->nstats < (*best)->nstats)))
+ {
+ (*best)->nstats = current->nstats;
+ (*best)->nclauses = current->nclauses;
+ (*best)->nconditions = current->nconditions;
+ memcpy((*best)->stats, current->stats, nmvstats * sizeof(int));
+ }
+ }
+
+ /*
+ * The recursion only makes sense if there are statistics left to
+ * add (each step consumes one of the nmvstats statistics).
+ */
+ if ((step + 1) < nmvstats)
+ choose_mv_statistics_greedy(root, step+1,
+ nmvstats, mvstats, stats_attnums,
+ nclauses, clauses, clauses_attnums,
+ nconditions, conditions, conditions_attnums,
+ cover_map, condition_map, ruled_out,
+ current, best);
+
+ /* reset the last step */
+ current->nclauses -= num_cov_clauses[best_stat];
+ current->nconditions -= num_cond_clauses[best_stat];
+ current->nstats -= 1;
+ current->stats[step] = 0;
+
+ /* mark the statistics as usable again */
+ ruled_out[best_stat] = -1;
+ }
+
+ /* reset all statistics eliminated in this step */
+ for (i = 0; i < nmvstats; i++)
+ if (ruled_out[i] == step)
+ ruled_out[i] = -1;
+
+ /* free everything allocated in this step */
+ pfree(covered_clauses);
+ pfree(attnum_counts);
+ pfree(num_cov_clauses);
+ pfree(num_cov_columns);
+ pfree(num_cond_clauses);
+ pfree(num_cond_columns);
+}
+
+/*
+ * Chooses the combination of statistics, optimal for estimation of
+ * a particular clause list.
+ *
+ * This only handles a 'preparation' shared by the exhaustive and greedy
+ * implementations (see the previous methods), mostly trying to reduce
+ * the size of the problem (eliminate clauses/statistics that can't be
+ * really used in the solution).
+ *
+ * It also precomputes bitmaps for attributes covered by clauses and
+ * statistics, so that we don't need to do that over and over in the
+ * actual optimizations (as it's both CPU and memory intensive).
+ *
+ * TODO This will probably have to consider compatibility of clauses,
+ * because 'dependencies' will probably work only with equality
+ * clauses.
+ *
+ * TODO Another way to make the optimization problems smaller might
+ * be splitting the statistics into several disjoint subsets, i.e.
+ * if we can split the graph of statistics (after the elimination)
+ * into multiple components (so that stats in different components
+ * share no attributes), we can do the optimization for each
+ * component separately.
+ *
+ * TODO If we could compute what is a "perfect solution" maybe we could
+ * terminate the search after reaching ~90% of it? Say, if we knew
+ * that we can cover 10 clauses and reuse 8 dependencies, maybe
+ * covering 9 clauses and 7 dependencies would be OK?
+ */
+static List*
+choose_mv_statistics(PlannerInfo *root, List *stats,
+ List *clauses, List *conditions,
+ Oid varRelid, SpecialJoinInfo *sjinfo)
+{
+ int i;
+ mv_solution_t *best = NULL;
+ List *result = NIL;
+
+ int nmvstats;
+ MVStatisticInfo *mvstats;
+
+ /* we only work with MCV lists and histograms here */
+ int type = (MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST);
+
+ bool *clause_cover_map = NULL,
+ *condition_cover_map = NULL;
+ int *ruled_out = NULL;
+
+ /* build bitmapsets for all stats and clauses */
+ Bitmapset **stats_attnums;
+ Bitmapset **clauses_attnums;
+ Bitmapset **conditions_attnums;
+
+ int nclauses, nconditions;
+ Node ** clauses_array;
+ Node ** conditions_array;
+
+ /* copy lists, so that we can free them during elimination easily */
+ clauses = list_copy(clauses);
+ conditions = list_copy(conditions);
+ stats = list_copy(stats);
+
+ /*
+ * Reduce the optimization problem size as much as possible.
+ *
+ * Eliminate clauses and conditions not covered by any statistics,
+ * or statistics not matching at least two attributes (one of them
+ * has to be in a regular clause).
+ *
+ * It's possible that removing a statistics in one iteration
+ * eliminates a clause in the next one, so we repeat this until
+ * an iteration eliminates no clauses or statistics.
+ *
+ * This can only happen after eliminating a statistics - clauses are
+ * eliminated first, so statistics always reflect that.
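+ *
+ * For example, with clauses (a=1), (b=2) and (d=1) and statistics
+ * on [a,b] and [c,d], the statistics [c,d] matches just one
+ * attribute (d) among the clauses and is eliminated; in the next
+ * iteration the clause (d=1) is no longer covered by any
+ * statistics and is eliminated too.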
+ */
+ while (true)
+ {
+ List *tmp;
+
+ Bitmapset *compatible_attnums = NULL;
+ Bitmapset *condition_attnums = NULL;
+ Bitmapset *all_attnums = NULL;
+
+ /*
+ * Clauses
+ *
+ * Walk through clauses and keep only those covered by at least
+ * one of the statistics we still have. We'll also keep info
+ * about attnums in clauses (without conditions) so that we can
+ * ignore stats covering just conditions (which is pointless).
+ */
+ tmp = filter_clauses(root, varRelid, sjinfo, type,
+ stats, clauses, &compatible_attnums);
+
+ /* discard the original list */
+ list_free(clauses);
+ clauses = tmp;
+
+ /*
+ * Conditions
+ *
+ * Walk through clauses and keep only those covered by at least
+ * one of the statistics we still have. Also, collect bitmap of
+ * attributes so that we can make sure we add at least one new
+ * attribute (by comparing with clauses).
+ */
+ if (conditions != NIL)
+ {
+ tmp = filter_clauses(root, varRelid, sjinfo, type,
+ stats, conditions, &condition_attnums);
+
+ /* discard the original list */
+ list_free(conditions);
+ conditions = tmp;
+ }
+
+ /* get a union of attnums (from conditions and new clauses) */
+ all_attnums = bms_union(compatible_attnums, condition_attnums);
+
+ /*
+ * Statistics
+ *
+ * Walk through statistics and only keep those covering at least
+ * one new attribute (excluding conditions) and at least two
+ * attributes from clauses and conditions combined.
+ */
+ tmp = filter_stats(stats, compatible_attnums, all_attnums);
+
+ /* if we've not eliminated anything, terminate */
+ if (list_length(stats) == list_length(tmp))
+ break;
+
+ /* work only with filtered statistics from now */
+ list_free(stats);
+ stats = tmp;
+ }
+
+ /* only do the optimization if we have clauses/statistics */
+ if ((list_length(stats) == 0) || (list_length(clauses) == 0))
+ return NIL;
+
+ /* remove redundant stats (stats covered by another stats) */
+ stats = filter_redundant_stats(stats, clauses, conditions);
+
+ /*
+ * TODO We should sort the stats to make the order deterministic,
+ * otherwise we may get different estimates on different
+ * executions - if there are multiple "equally good" solutions,
+ * we'll keep the first solution we see.
+ *
+ * Sorting by OID probably is not the right solution though,
+ * because we'd like it to be somehow reproducible,
+ * irrespective of the order of ADD STATISTICS commands.
+ * So maybe statkeys?
+ */
+ mvstats = make_stats_array(stats, &nmvstats);
+ stats_attnums = make_stats_attnums(mvstats, nmvstats);
+
+ /* collect clauses and bitmaps of attnums */
+ clauses_array = make_clauses_array(clauses, &nclauses);
+ clauses_attnums = make_clauses_attnums(root, varRelid, sjinfo, type,
+ clauses_array, nclauses);
+
+ /* collect conditions and bitmap of attnums */
+ conditions_array = make_clauses_array(conditions, &nconditions);
+ conditions_attnums = make_clauses_attnums(root, varRelid, sjinfo, type,
+ conditions_array, nconditions);
+
+ /*
+ * Build bitmaps with info about which clauses/conditions are
+ * covered by each statistics (so that we don't need to call the
+ * bms_is_subset over and over again).
+ */
+ clause_cover_map = make_cover_map(stats_attnums, nmvstats,
+ clauses_attnums, nclauses);
+
+ condition_cover_map = make_cover_map(stats_attnums, nmvstats,
+ conditions_attnums, nconditions);
+
+ ruled_out = (int*)palloc0(nmvstats * sizeof(int));
+
+ /* no stats are ruled out by default */
+ for (i = 0; i < nmvstats; i++)
+ ruled_out[i] = -1;
+
+ /* do the optimization itself */
+ if (mvstat_search_type == MVSTAT_SEARCH_EXHAUSTIVE)
+ choose_mv_statistics_exhaustive(root, 0,
+ nmvstats, mvstats, stats_attnums,
+ nclauses, clauses_array, clauses_attnums,
+ nconditions, conditions_array, conditions_attnums,
+ clause_cover_map, condition_cover_map,
+ ruled_out, NULL, &best);
+ else
+ choose_mv_statistics_greedy(root, 0,
+ nmvstats, mvstats, stats_attnums,
+ nclauses, clauses_array, clauses_attnums,
+ nconditions, conditions_array, conditions_attnums,
+ clause_cover_map, condition_cover_map,
+ ruled_out, NULL, &best);
+
+ /* create a list of statistics from the array */
+ if (best != NULL)
+ {
+ for (i = 0; i < best->nstats; i++)
+ {
+ MVStatisticInfo *info = makeNode(MVStatisticInfo);
+ memcpy(info, &mvstats[best->stats[i]], sizeof(MVStatisticInfo));
+ result = lappend(result, info);
+ }
+ pfree(best);
+ }
+
+ /* cleanup (maybe leave it up to the memory context?) */
+ for (i = 0; i < nmvstats; i++)
+ bms_free(stats_attnums[i]);
+
+ for (i = 0; i < nclauses; i++)
+ bms_free(clauses_attnums[i]);
+
+ for (i = 0; i < nconditions; i++)
+ bms_free(conditions_attnums[i]);
+
+ pfree(stats_attnums);
+ pfree(clauses_attnums);
+ pfree(conditions_attnums);
+
+ pfree(clauses_array);
+ pfree(conditions_array);
+ pfree(clause_cover_map);
+ pfree(condition_cover_map);
+ pfree(ruled_out);
+ pfree(mvstats);
+
+ list_free(clauses);
+ list_free(conditions);
+ list_free(stats);
+
+ return result;
+}
+
+
+/*
+ * This splits the clauses list into two parts - one containing clauses
+ * that will be evaluated using the chosen statistics, and the remaining
+ * clauses (either non-mvcompatible, or not related to the histogram).
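+ *
+ * For example, with statistics on [a,b] and clauses
+ * [(a=1),(b<2),(c=3)], the first two clauses are put into
+ * *mvclauses and (c=3) is returned for regular estimation.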
+ */
+static List *
+clauselist_mv_split(PlannerInfo *root, SpecialJoinInfo *sjinfo,
+ List *clauses, Oid varRelid, List **mvclauses,
+ MVStatisticInfo *mvstats, int types)
+{
+ int i;
+ ListCell *l;
+ List *non_mvclauses = NIL;
+
+ /* FIXME is there a better way to get info on int2vector? */
+ int2vector * attrs = mvstats->stakeys;
+ int numattrs = mvstats->stakeys->dim1;
+
+ Bitmapset *mvattnums = NULL;
+
+ /* build bitmap of attributes covered by the stats, so we can
+ * do bms_is_subset later */
+ for (i = 0; i < numattrs; i++)
+ mvattnums = bms_add_member(mvattnums, attrs->values[i]);
+
+ /* erase the list of mv-compatible clauses */
+ *mvclauses = NIL;
+
+ foreach (l, clauses)
+ {
+ bool match = false; /* by default not mv-compatible */
+ Bitmapset *attnums = NULL;
+ Node *clause = (Node *) lfirst(l);
+
+ if (clause_is_mv_compatible(root, clause, varRelid, NULL,
+ &attnums, sjinfo, types))
+ {
+ /* are all the attributes part of the selected stats? */
+ if (bms_is_subset(attnums, mvattnums))
+ match = true;
+ }
+
+ /*
+ * The clause matches the selected stats, so put it to the list
+ * of mv-compatible clauses. Otherwise, keep it in the list of
+ * 'regular' clauses (that may be selected later).
+ */
+ if (match)
+ *mvclauses = lappend(*mvclauses, clause);
+ else
+ non_mvclauses = lappend(non_mvclauses, clause);
+ }
+
+ /*
+ * Perform regular estimation using the clauses incompatible
+ * with the chosen histogram (or MV stats in general).
+ */
+ return non_mvclauses;
+
+}
+
+/*
+ * Determines whether the clause is compatible with multivariate stats,
+ * and if it is, returns some additional information - varno (index
+ * into simple_rte_array) and a bitmap of attributes. This is then
+ * used to fetch related multivariate statistics.
+ *
+ * At this moment we only support basic conditions of the form
+ *
+ * variable OP constant
+ *
+ * where OP is one of [=,<,<=,>=,>] (which is however determined by
+ * looking at the associated function for estimating selectivity, just
+ * like with the single-dimensional case).
+ *
+ * TODO Support 'OR clauses' - shouldn't be all that difficult to
* evaluate them using multivariate stats.
*/
static bool
@@ -1539,10 +2598,10 @@ clause_is_mv_compatible(PlannerInfo *root, Node *clause, Oid varRelid,
return true;
}
- else if (or_clause(clause) || and_clause(clause))
+ else if (or_clause(clause) || and_clause(clause) || not_clause(clause))
{
/*
- * AND/OR-clauses are supported if all sub-clauses are supported
+ * AND/OR/NOT-clauses are supported if all sub-clauses are supported
*
* TODO We might support mixed case, where some of the clauses
* are supported and some are not, and treat all supported
@@ -1552,7 +2611,10 @@ clause_is_mv_compatible(PlannerInfo *root, Node *clause, Oid varRelid,
*
* TODO For RestrictInfo above an OR-clause, we might use the
* orclause with nested RestrictInfo - we won't have to
- * call pull_varnos() for each clause, saving time.
+ * call pull_varnos() for each clause, saving time.
+ *
+ * TODO Perhaps this needs a bit more thought for functional
+ * dependencies? Those don't quite work for NOT cases.
*/
Bitmapset *tmp = NULL;
ListCell *l;
@@ -1572,6 +2634,51 @@ clause_is_mv_compatible(PlannerInfo *root, Node *clause, Oid varRelid,
return false;
}
+
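+/*
+ * Collects attribute numbers referenced by a clause - simple
+ * opclauses, IS NULL tests, and AND/OR/NOT clauses (which are
+ * walked recursively).
+ */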
+static Bitmapset *
+clause_mv_get_attnums(PlannerInfo *root, Node *clause)
+{
+ Bitmapset * attnums = NULL;
+
+ /* Extract clause from restrict info, if needed. */
+ if (IsA(clause, RestrictInfo))
+ clause = (Node*)((RestrictInfo*)clause)->clause;
+
+ /*
+ * Only simple opclauses and IS NULL tests are compatible with
+ * multivariate stats at this point.
+ */
+ if ((is_opclause(clause))
+ && (list_length(((OpExpr *) clause)->args) == 2))
+ {
+ OpExpr *expr = (OpExpr *) clause;
+
+ if (IsA(linitial(expr->args), Var))
+ attnums = bms_add_member(attnums,
+ ((Var*)linitial(expr->args))->varattno);
+ else
+ attnums = bms_add_member(attnums,
+ ((Var*)lsecond(expr->args))->varattno);
+ }
+ else if (IsA(clause, NullTest)
+ && IsA(((NullTest*)clause)->arg, Var))
+ {
+ attnums = bms_add_member(attnums,
+ ((Var*)((NullTest*)clause)->arg)->varattno);
+ }
+ else if (or_clause(clause) || and_clause(clause) || not_clause(clause))
+ {
+ ListCell *l;
+ foreach (l, ((BoolExpr*)clause)->args)
+ {
+ attnums = bms_join(attnums,
+ clause_mv_get_attnums(root, (Node*)lfirst(l)));
+ }
+ }
+
+ return attnums;
+}
+
/*
* Performs reduction of clauses using functional dependencies, i.e.
* removes clauses that are considered redundant. It simply walks
@@ -2223,22 +3330,26 @@ get_varattnos(Node * node, Index relid)
* as the clauses are processed (and skip items that are 'match').
*/
static Selectivity
-clauselist_mv_selectivity_mcvlist(PlannerInfo *root, List *clauses,
- MVStatisticInfo *mvstats, bool *fullmatch,
- Selectivity *lowsel)
+clauselist_mv_selectivity_mcvlist(PlannerInfo *root, MVStatisticInfo *mvstats,
+ List *clauses, List *conditions, bool is_or,
+ bool *fullmatch, Selectivity *lowsel)
{
int i;
Selectivity s = 0.0;
+ Selectivity t = 0.0;
Selectivity u = 0.0;
MCVList mcvlist = NULL;
+
int nmatches = 0;
+ int nconditions = 0;
/* match/mismatch bitmap for each MCV item */
char * matches = NULL;
+ char * condition_matches = NULL;
Assert(clauses != NIL);
- Assert(list_length(clauses) >= 2);
+ Assert(list_length(clauses) >= 1);
/* there's no MCV list built yet */
if (! mvstats->mcv_built)
@@ -2249,32 +3360,85 @@ clauselist_mv_selectivity_mcvlist(PlannerInfo *root, List *clauses,
Assert(mcvlist != NULL);
Assert(mcvlist->nitems > 0);
- /* by default all the MCV items match the clauses fully */
- matches = palloc0(sizeof(char) * mcvlist->nitems);
- memset(matches, MVSTATS_MATCH_FULL, sizeof(char)*mcvlist->nitems);
-
/* number of matching MCV items */
nmatches = mcvlist->nitems;
+ nconditions = mcvlist->nitems;
+ /*
+ * Bitmap of MCV item matches (mismatch, partial, full).
+ *
+ * For AND clauses all items match initially (and we'll eliminate them).
+ * For OR clauses no items match initially (and we'll add them).
+ *
+ * We only need to do the memset for AND clauses (for OR clauses
+ * it's already set correctly by the palloc0).
+ */
+ matches = palloc0(sizeof(char) * nmatches);
+
+ if (! is_or) /* AND-clause */
+ memset(matches, MVSTATS_MATCH_FULL, sizeof(char)*nmatches);
+
+ /* Conditions are treated as AND clause, so match by default. */
+ condition_matches = palloc0(sizeof(char) * nconditions);
+ memset(condition_matches, MVSTATS_MATCH_FULL, sizeof(char)*nconditions);
+
+ /*
+ * build the match bitmap for the conditions (conditions are always
+ * connected by AND)
+ */
+ if (conditions != NIL)
+ nconditions = update_match_bitmap_mcvlist(root, conditions,
+ mvstats->stakeys, mcvlist,
+ nconditions, condition_matches,
+ lowsel, fullmatch, false);
+
+ /*
+ * build the match bitmap for the estimated clauses
+ *
+ * TODO This evaluates the clauses for all MCV items, even those
+ * ruled out by the conditions. The final result should be the
+ * same, but it might be faster.
+ */
nmatches = update_match_bitmap_mcvlist(root, clauses,
mvstats->stakeys, mcvlist,
- nmatches, matches,
- lowsel, fullmatch, false);
+ ((is_or) ? 0 : nmatches), matches,
+ lowsel, fullmatch, is_or);
/* sum frequencies for all the matching MCV items */
for (i = 0; i < mcvlist->nitems; i++)
{
- /* used to 'scale' for MCV lists not covering all tuples */
+ /*
+ * Find out what part of the data is covered by the MCV list,
+ * so that we can 'scale' the selectivity properly (e.g. when
+ * only 50% of the sample items got into the MCV, and the rest
+ * is either in a histogram, or not covered by stats).
+ *
+ * TODO This might be handled by keeping a global "frequency"
+ * for the whole list, which might save us a bit of time
+ * spent on accessing the not-matching part of the MCV list.
+ * Although it's likely in a cache, so it's very fast.
+ */
u += mcvlist->items[i]->frequency;
+ /* skip MCV items not matching the conditions */
+ if (condition_matches[i] == MVSTATS_MATCH_NONE)
+ continue;
+
if (matches[i] != MVSTATS_MATCH_NONE)
s += mcvlist->items[i]->frequency;
+
+ t += mcvlist->items[i]->frequency;
}
pfree(matches);
+ pfree(condition_matches);
pfree(mcvlist);
- return s*u;
+ /* no condition matches */
+ if (t == 0.0)
+ return (Selectivity)0.0;
+
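+ /*
+ * Example: if the MCV list covers 60% of the data (u = 0.6),
+ * items matching the conditions amount to t = 0.3 and items
+ * matching both the conditions and the clauses to s = 0.15,
+ * the result is (0.15 / 0.3) * 0.6 = 0.3.
+ */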
+ return (s / t) * u;
}
/*
@@ -2567,64 +3731,57 @@ update_match_bitmap_mcvlist(PlannerInfo *root, List *clauses,
}
}
}
- else if (or_clause(clause) || and_clause(clause))
+ else if (or_clause(clause) || and_clause(clause) || not_clause(clause))
{
/* AND/OR clause, with all clauses compatible with the selected MV stat */
int i;
- BoolExpr *orclause = ((BoolExpr*)clause);
- List *orclauses = orclause->args;
+ List *tmp_clauses = ((BoolExpr*)clause)->args;
/* match/mismatch bitmap for each MCV item */
- int or_nmatches = 0;
- char * or_matches = NULL;
+ int tmp_nmatches = 0;
+ char * tmp_matches = NULL;
- Assert(orclauses != NIL);
- Assert(list_length(orclauses) >= 2);
+ Assert(tmp_clauses != NIL);
+ Assert((list_length(tmp_clauses) >= 2) || (not_clause(clause) && (list_length(tmp_clauses)==1)));
/* number of matching MCV items */
- or_nmatches = mcvlist->nitems;
+ tmp_nmatches = (or_clause(clause)) ? 0 : mcvlist->nitems;
/* by default none of the MCV items matches the clauses */
- or_matches = palloc0(sizeof(char) * or_nmatches);
+ tmp_matches = palloc0(sizeof(char) * mcvlist->nitems);
- if (or_clause(clause))
- {
- /* OR clauses assume nothing matches, initially */
- memset(or_matches, MVSTATS_MATCH_NONE, sizeof(char)*or_nmatches);
- or_nmatches = 0;
- }
- else
- {
- /* AND clauses assume nothing matches, initially */
- memset(or_matches, MVSTATS_MATCH_FULL, sizeof(char)*or_nmatches);
- }
+ /* AND (and NOT) clauses assume everything matches, initially */
+ if (! or_clause(clause))
+ memset(tmp_matches, MVSTATS_MATCH_FULL, sizeof(char)*mcvlist->nitems);
/* build the match bitmap for the OR-clauses */
- or_nmatches = update_match_bitmap_mcvlist(root, orclauses,
+ tmp_nmatches = update_match_bitmap_mcvlist(root, tmp_clauses,
stakeys, mcvlist,
- or_nmatches, or_matches,
+ tmp_nmatches, tmp_matches,
lowsel, fullmatch, or_clause(clause));
/* merge the bitmap into the existing one*/
for (i = 0; i < mcvlist->nitems; i++)
{
+ /* if this is a NOT clause, we need to invert the results first */
+ if (not_clause(clause))
+ tmp_matches[i] = (MVSTATS_MATCH_FULL - tmp_matches[i]);
+
/*
* To AND-merge the bitmaps, a MIN() semantics is used.
* For OR-merge, use MAX().
*
* FIXME this does not decrease the number of matches
*/
- UPDATE_RESULT(matches[i], or_matches[i], is_or);
+ UPDATE_RESULT(matches[i], tmp_matches[i], is_or);
}
- pfree(or_matches);
+ pfree(tmp_matches);
}
else
- {
elog(ERROR, "unknown clause type: %d", clause->type);
- }
}
/*
@@ -2682,15 +3839,18 @@ update_match_bitmap_mcvlist(PlannerInfo *root, List *clauses,
* this is not uncommon, but for histograms it's not that clear.
*/
static Selectivity
-clauselist_mv_selectivity_histogram(PlannerInfo *root, List *clauses,
- MVStatisticInfo *mvstats)
+clauselist_mv_selectivity_histogram(PlannerInfo *root, MVStatisticInfo *mvstats,
+ List *clauses, List *conditions, bool is_or)
{
int i;
Selectivity s = 0.0;
+ Selectivity t = 0.0;
Selectivity u = 0.0;
int nmatches = 0;
+ int nconditions = 0;
char *matches = NULL;
+ char *condition_matches = NULL;
MVSerializedHistogram mvhist = NULL;
@@ -2701,27 +3861,57 @@ clauselist_mv_selectivity_histogram(PlannerInfo *root, List *clauses,
/* There may be no histogram in the stats (check hist_built flag) */
mvhist = load_mv_histogram(mvstats->mvoid);
- Assert (mvhist != NULL);
- Assert (clauses != NIL);
- Assert (list_length(clauses) >= 2);
+ Assert (mvhist != NULL);
+ Assert (clauses != NIL);
+ Assert (list_length(clauses) >= 1);
+
+ nmatches = mvhist->nbuckets;
+ nconditions = mvhist->nbuckets;
+
+ /*
+ * Bitmap of bucket matches (mismatch, partial, full).
+ *
+ * For AND clauses all buckets match (and we'll eliminate them).
+ * For OR clauses no buckets match (and we'll add them).
+ *
+ * We only need to do the memset for AND clauses (for OR clauses
+ * it's already set correctly by the palloc0).
+ */
+ matches = palloc0(sizeof(char) * nmatches);
+
+ if (! is_or) /* AND-clause */
+ memset(matches, MVSTATS_MATCH_FULL, sizeof(char)*nmatches);
+
+ /* Conditions are treated as AND clause, so match by default. */
+ condition_matches = palloc0(sizeof(char)*nconditions);
+ memset(condition_matches, MVSTATS_MATCH_FULL, sizeof(char)*nconditions);
/*
- * Bitmap of bucket matches (mismatch, partial, full). by default
- * all buckets fully match (and we'll eliminate them).
+ * build the match bitmap for the conditions (conditions are always
+ * connected by AND)
*/
- matches = palloc0(sizeof(char) * mvhist->nbuckets);
- memset(matches, MVSTATS_MATCH_FULL, sizeof(char)*mvhist->nbuckets);
-
- nmatches = mvhist->nbuckets;
+ if (conditions != NIL)
+ update_match_bitmap_histogram(root, conditions,
+ mvstats->stakeys, mvhist,
+ nconditions, condition_matches, false);
- /* build the match bitmap */
+ /*
+ * build the match bitmap for the estimated clauses
+ *
+ * TODO This evaluates the clauses for all buckets, even those
+ * ruled out by the conditions. The final result should be
+ * the same, but it might be faster.
+ */
update_match_bitmap_histogram(root, clauses,
mvstats->stakeys, mvhist,
- nmatches, matches, false);
+ ((is_or) ? 0 : nmatches), matches,
+ is_or);
/* now, walk through the buckets and sum the selectivities */
for (i = 0; i < mvhist->nbuckets; i++)
{
+ float coeff = 1.0;
+
/*
* Find out what part of the data is covered by the histogram,
* so that we can 'scale' the selectivity properly (e.g. when
@@ -2735,17 +3925,35 @@ clauselist_mv_selectivity_histogram(PlannerInfo *root, List *clauses,
*/
u += mvhist->buckets[i]->ntuples;
+ /* skip buckets not matching the conditions */
+ if (condition_matches[i] == MVSTATS_MATCH_NONE)
+ continue;
+ else if (condition_matches[i] == MVSTATS_MATCH_PARTIAL)
+ coeff = 0.5;
+
+ t += coeff * mvhist->buckets[i]->ntuples;
+
if (matches[i] == MVSTATS_MATCH_FULL)
- s += mvhist->buckets[i]->ntuples;
+ s += coeff * mvhist->buckets[i]->ntuples;
else if (matches[i] == MVSTATS_MATCH_PARTIAL)
- s += 0.5 * mvhist->buckets[i]->ntuples;
+ /*
+ * TODO If both conditions and clauses match partially, this
+ * will use a 0.25 match - not sure if that's the right
+ * solution, but it seems about right.
+ */
+ s += coeff * 0.5 * mvhist->buckets[i]->ntuples;
}
/* release the allocated bitmap and deserialized histogram */
pfree(matches);
+ pfree(condition_matches);
pfree(mvhist);
- return s * u;
+ /* no condition matches */
+ if (t == 0.0)
+ return (Selectivity)0.0;
+
+ return (s / t) * u;
}
/*
@@ -2775,7 +3983,7 @@ update_match_bitmap_histogram(PlannerInfo *root, List *clauses,
{
int i;
ListCell * l;
-
+
/*
* Used for caching function calls, only once per deduplicated value.
*
@@ -2818,7 +4026,7 @@ update_match_bitmap_histogram(PlannerInfo *root, List *clauses,
FmgrInfo opproc; /* operator */
fmgr_info(get_opcode(expr->opno), &opproc);
-
+
/* reset the cache (per clause) */
memset(callcache, 0, mvhist->nbuckets);
@@ -2870,7 +4078,7 @@ update_match_bitmap_histogram(PlannerInfo *root, List *clauses,
/* histogram boundaries */
Datum minval, maxval;
-
+
/* values from the call cache */
char mincached, maxcached;
@@ -2959,7 +4167,7 @@ update_match_bitmap_histogram(PlannerInfo *root, List *clauses,
}
/*
- * Now check whether the upper boundary is below the constant (in that
+ * Now check whether the constant is below the upper boundary (in that
* case it's a partial match).
*/
if (! maxcached)
@@ -2978,8 +4186,32 @@ update_match_bitmap_histogram(PlannerInfo *root, List *clauses,
else
tmp = !(maxcached & 0x02); /* extract the result (reverse) */
- if (tmp) /* partial match */
+ if (tmp)
+ {
+ /* partial match */
UPDATE_RESULT(matches[i], MVSTATS_MATCH_PARTIAL, is_or);
+ continue;
+ }
+
+
+ /*
+ * And finally check whether the constant is above the upper
+ * boundary (in that case it's a full match).
+ *
+ * XXX We need to do this because of the OR clauses (which start with no
+ * matches and we incrementally add more and more matches), but maybe
+ * we don't need to do the check and can just do UPDATE_RESULT?
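+ *
+ * For example, with (a < 42) and a bucket spanning [10, 30], the
+ * upper boundary satisfies the operator, so the whole bucket is
+ * a full match.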
+ */
+ tmp = DatumGetBool(FunctionCall2Coll(&opproc,
+ DEFAULT_COLLATION_OID,
+ maxval,
+ cst->constvalue));
+
+ if (tmp)
+ {
+ /* full match */
+ UPDATE_RESULT(matches[i], MVSTATS_MATCH_FULL, is_or);
+ }
}
else /* (const < var) */
@@ -3018,15 +4250,36 @@ update_match_bitmap_histogram(PlannerInfo *root, List *clauses,
DEFAULT_COLLATION_OID,
minval,
cst->constvalue));
-
/* Update the cache. */
callcache[bucket->min[idx]] = (tmp) ? 0x03 : 0x01;
- }
+ }
else
tmp = (mincached & 0x02); /* extract the result */
- if (tmp) /* partial match */
+ if (tmp)
+ {
+ /* partial match */
UPDATE_RESULT(matches[i], MVSTATS_MATCH_PARTIAL, is_or);
+ continue;
+ }
+
+ /*
+ * Now check whether the constant is below the lower boundary (in
+ * that case it's a full match).
+ *
+ * XXX We need to do this because of the OR clauses (which start with no
+ * matches and we incrementally add more and more matches), but maybe
+ * we don't need to do the check and can just do UPDATE_RESULT?
+ */
+ tmp = DatumGetBool(FunctionCall2Coll(&opproc,
+ DEFAULT_COLLATION_OID,
+ cst->constvalue,
+ minval));
+
+ if (tmp)
+ /* full match */
+ UPDATE_RESULT(matches[i], MVSTATS_MATCH_FULL, is_or);
+
}
break;
@@ -3082,8 +4335,29 @@ update_match_bitmap_histogram(PlannerInfo *root, List *clauses,
tmp = !(mincached & 0x02); /* extract the result */
if (tmp)
+ {
/* partial match */
UPDATE_RESULT(matches[i], MVSTATS_MATCH_PARTIAL, is_or);
+ continue;
+ }
+
+ /*
+ * Now check whether the lower boundary is above the constant (in
+ * that case it's a full match).
+ *
+ * XXX We need to do this because of the OR clauses (which start with no
+ * matches and we incrementally add more and more matches), but maybe
+ * we don't need to do the check and can just do UPDATE_RESULT?
+ */
+ tmp = DatumGetBool(FunctionCall2Coll(&opproc,
+ DEFAULT_COLLATION_OID,
+ minval,
+ cst->constvalue));
+
+ if (tmp)
+ /* full match */
+ UPDATE_RESULT(matches[i], MVSTATS_MATCH_FULL, is_or);
+
}
else /* (const > var) */
{
@@ -3129,8 +4403,30 @@ update_match_bitmap_histogram(PlannerInfo *root, List *clauses,
tmp = (maxcached & 0x02); /* extract the result */
if (tmp)
+ {
/* partial match */
UPDATE_RESULT(matches[i], MVSTATS_MATCH_PARTIAL, is_or);
+ continue;
+ }
+
+ /*
+ * Now check whether the constant is above the upper boundary (in
+ * that case it's a full match).
+ *
+ * XXX We need to do this because of the OR clauses (which start with no
+ * matches and we incrementally add more and more matches), but maybe
+ * we don't need to do the check and can just do UPDATE_RESULT?
+ */
+ tmp = DatumGetBool(FunctionCall2Coll(&opproc,
+ DEFAULT_COLLATION_OID,
+ cst->constvalue,
+ maxval));
+
+ if (tmp)
+ {
+ /* full match */
+ UPDATE_RESULT(matches[i], MVSTATS_MATCH_FULL, is_or);
+ }
+
}
break;
@@ -3195,6 +4491,7 @@ update_match_bitmap_histogram(PlannerInfo *root, List *clauses,
else
tmp = (maxcached & 0x02); /* extract the result */
+
if (tmp)
{
/* no match */
@@ -3246,64 +4543,57 @@ update_match_bitmap_histogram(PlannerInfo *root, List *clauses,
UPDATE_RESULT(matches[i], MVSTATS_MATCH_NONE, is_or);
}
}
- else if (or_clause(clause) || and_clause(clause))
+ else if (or_clause(clause) || and_clause(clause) || not_clause(clause))
{
/* AND/OR clause, with all clauses compatible with the selected MV stat */
int i;
- BoolExpr *orclause = ((BoolExpr*)clause);
- List *orclauses = orclause->args;
+ List *tmp_clauses = ((BoolExpr*)clause)->args;
/* match/mismatch bitmap for each bucket */
- int or_nmatches = 0;
- char * or_matches = NULL;
+ int tmp_nmatches = 0;
+ char * tmp_matches = NULL;
- Assert(orclauses != NIL);
- Assert(list_length(orclauses) >= 2);
+ Assert(tmp_clauses != NIL);
+ Assert((list_length(tmp_clauses) >= 2) || (not_clause(clause) && (list_length(tmp_clauses)==1)));
/* number of matching buckets */
- or_nmatches = mvhist->nbuckets;
+ tmp_nmatches = (or_clause(clause)) ? 0 : mvhist->nbuckets;
- /* by default none of the buckets matches the clauses */
- or_matches = palloc0(sizeof(char) * or_nmatches);
+ /* by default none of the buckets matches the clauses (OR clause) */
+ tmp_matches = palloc0(sizeof(char) * mvhist->nbuckets);
- if (or_clause(clause))
- {
- /* OR clauses assume nothing matches, initially */
- memset(or_matches, MVSTATS_MATCH_NONE, sizeof(char)*or_nmatches);
- or_nmatches = 0;
- }
- else
- {
- /* AND clauses assume nothing matches, initially */
- memset(or_matches, MVSTATS_MATCH_FULL, sizeof(char)*or_nmatches);
- }
+ /* but AND (and NOT) clauses assume everything matches, initially */
+ if (! or_clause(clause))
+ memset(tmp_matches, MVSTATS_MATCH_FULL, sizeof(char)*mvhist->nbuckets);
/* build the match bitmap for the OR-clauses */
- or_nmatches = update_match_bitmap_histogram(root, orclauses,
+ tmp_nmatches = update_match_bitmap_histogram(root, tmp_clauses,
stakeys, mvhist,
- or_nmatches, or_matches, or_clause(clause));
+ tmp_nmatches, tmp_matches, or_clause(clause));
/* merge the bitmap into the existing one*/
for (i = 0; i < mvhist->nbuckets; i++)
{
+ /* if this is a NOT clause, we need to invert the results first */
+ if (not_clause(clause))
+ tmp_matches[i] = (MVSTATS_MATCH_FULL - tmp_matches[i]);
+
/*
* To AND-merge the bitmaps, a MIN() semantics is used.
* For OR-merge, use MAX().
*
* FIXME this does not decrease the number of matches
*/
- UPDATE_RESULT(matches[i], or_matches[i], is_or);
+ UPDATE_RESULT(matches[i], tmp_matches[i], is_or);
}
- pfree(or_matches);
-
+ pfree(tmp_matches);
}
else
elog(ERROR, "unknown clause type: %d", clause->type);
}
- /* free the call cache */
pfree(callcache);
#ifdef DEBUG_MVHIST
@@ -3312,3 +4602,363 @@ update_match_bitmap_histogram(PlannerInfo *root, List *clauses,
return nmatches;
}
+
+/*
+ * Walk through clauses and keep only those covered by at least
+ * one of the statistics.
+ */
+static List *
+filter_clauses(PlannerInfo *root, Oid varRelid, SpecialJoinInfo *sjinfo,
+ int type, List *stats, List *clauses, Bitmapset **attnums)
+{
+ ListCell *c;
+ ListCell *s;
+
+ /* results (list of compatible clauses, attnums) */
+ List *rclauses = NIL;
+
+ foreach (c, clauses)
+ {
+ Node *clause = (Node*)lfirst(c);
+ Bitmapset *clause_attnums = NULL;
+ Index relid;
+
+ /*
+ * The clause has to be mv-compatible (suitable operators etc.).
+ */
+ if (! clause_is_mv_compatible(root, clause, varRelid,
+ &relid, &clause_attnums, sjinfo, type))
+ elog(ERROR, "should not get non-mv-compatible cluase");
+
+ /* is there a statistics covering this clause? */
+ foreach (s, stats)
+ {
+ int k, matches = 0;
+ MVStatisticInfo *stat = (MVStatisticInfo *)lfirst(s);
+
+ for (k = 0; k < stat->stakeys->dim1; k++)
+ {
+ if (bms_is_member(stat->stakeys->values[k],
+ clause_attnums))
+ matches += 1;
+ }
+
+ /*
+ * The clause is compatible if all attributes it references
+ * are covered by the statistics.
+ */
+ if (bms_num_members(clause_attnums) == matches)
+ {
+ *attnums = bms_union(*attnums, clause_attnums);
+ rclauses = lappend(rclauses, clause);
+ break;
+ }
+ }
+
+ bms_free(clause_attnums);
+ }
+
+ /* we can't have more compatible conditions than source conditions */
+ Assert(list_length(clauses) >= list_length(rclauses));
+
+ return rclauses;
+}
+
+
+/*
+ * Walk through statistics and only keep those covering at least
+ * one new attribute (excluding conditions) and at least two
+ * attributes from clauses and conditions combined.
+ *
+ * This check might be made more strict by checking against individual
+ * clauses, because by using the bitmapsets of all attnums we may
+ * actually use attnums from clauses that are not covered by the
+ * statistics. For example, we may have a condition
+ *
+ * (a=1 AND b=2)
+ *
+ * and a new clause
+ *
+ * (c=1 AND d=1)
+ *
+ * With only bitmapsets, statistics on [b,c] will pass through this
+ * (assuming there are some statistics covering both clauses).
+ *
+ * TODO Do the more strict check.
+ */
+static List *
+filter_stats(List *stats, Bitmapset *new_attnums, Bitmapset *all_attnums)
+{
+ ListCell *s;
+ List *stats_filtered = NIL;
+
+ foreach (s, stats)
+ {
+ int k;
+ int matches_new = 0,
+ matches_all = 0;
+
+ MVStatisticInfo *stat = (MVStatisticInfo *)lfirst(s);
+
+ /* see how many attributes the statistics covers */
+ for (k = 0; k < stat->stakeys->dim1; k++)
+ {
+ /* attributes from new clauses */
+ if (bms_is_member(stat->stakeys->values[k], new_attnums))
+ matches_new += 1;
+
+ /* attributes from conditions and clauses */
+ if (bms_is_member(stat->stakeys->values[k], all_attnums))
+ matches_all += 1;
+ }
+
+ /* check we have enough attributes for this statistics */
+ if ((matches_new >= 1) && (matches_all >= 2))
+ stats_filtered = lappend(stats_filtered, stat);
+ }
+
+ /* we can't have more useful stats than we had originally */
+ Assert(list_length(stats) >= list_length(stats_filtered));
+
+ return stats_filtered;
+}
+
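+/*
+ * Flattens a list of MVStatisticInfo into an array, so that the
+ * optimization can address the statistics by index.
+ */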
+static MVStatisticInfo *
+make_stats_array(List *stats, int *nmvstats)
+{
+ int i;
+ ListCell *l;
+
+ MVStatisticInfo *mvstats = NULL;
+ *nmvstats = list_length(stats);
+
+ mvstats
+ = (MVStatisticInfo*)palloc0((*nmvstats) * sizeof(MVStatisticInfo));
+
+ i = 0;
+ foreach (l, stats)
+ {
+ MVStatisticInfo *stat = (MVStatisticInfo *)lfirst(l);
+ memcpy(&mvstats[i++], stat, sizeof(MVStatisticInfo));
+ }
+
+ return mvstats;
+}
+
+static Bitmapset **
+make_stats_attnums(MVStatisticInfo *mvstats, int nmvstats)
+{
+ int i, j;
+ Bitmapset **stats_attnums = NULL;
+
+ Assert(nmvstats > 0);
+
+ /* build bitmaps of attnums for the stats (easier to compare) */
+ stats_attnums = (Bitmapset **)palloc0(nmvstats * sizeof(Bitmapset*));
+
+ for (i = 0; i < nmvstats; i++)
+ for (j = 0; j < mvstats[i].stakeys->dim1; j++)
+ stats_attnums[i]
+ = bms_add_member(stats_attnums[i],
+ mvstats[i].stakeys->values[j]);
+
+ return stats_attnums;
+}
+
+
+/*
+ * Now let's remove redundant statistics, covering the same columns
+ * as some other stats, when restricted to the attributes from
+ * remaining clauses.
+ *
+ * If statistics S1 covers S2 (covers S2 attributes and possibly
+ * some more), we can probably remove S2. What actually matters are
+ * attributes from covered clauses (not all the attributes). This
+ * might however prefer larger, and thus less accurate, statistics.
+ *
+ * When a redundancy is detected, we simply keep the smaller
+ * statistics (fewer columns), on the assumption that it's
+ * more accurate and faster to process. That might be incorrect for
+ * two reasons - first, the accuracy really depends on number of
+ * buckets/MCV items, not the number of columns. Second, we might
+ * prefer MCV lists over histograms or something like that.
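+ *
+ * For example, with clauses referencing only columns (a,b),
+ * statistics on [a,b] and on [a,b,c] restrict to the same set
+ * {a,b}, so the wider [a,b,c] is dropped as redundant.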
+ */
+static List*
+filter_redundant_stats(List *stats, List *clauses, List *conditions)
+{
+ int i, j, nmvstats;
+
+ MVStatisticInfo *mvstats;
+ bool *redundant;
+ Bitmapset **stats_attnums;
+ Bitmapset *varattnos;
+ Index relid;
+
+ Assert(list_length(stats) > 0);
+ Assert(list_length(clauses) > 0);
+
+ /*
+ * We'll convert the list of statistics into an array now, because
+ * the reduction of redundant statistics is easier to do that way
+ * (we can mark previous stats as redundant, etc.).
+ */
+ mvstats = make_stats_array(stats, &nmvstats);
+ stats_attnums = make_stats_attnums(mvstats, nmvstats);
+
+ /* by default, none of the stats is redundant (so palloc0) */
+ redundant = palloc0(nmvstats * sizeof(bool));
+
+ /*
+ * We only expect a single relid here, and also we should get the
+ * same relid from clauses and conditions (but we get it from
+ * clauses, because those are certainly non-empty).
+ */
+ relid = bms_singleton_member(pull_varnos((Node*)clauses));
+
+ /*
+ * Get the varattnos from both conditions and clauses.
+ *
+ * This skips system attributes, although that should be impossible
+ * thanks to previous filtering out of incompatible clauses.
+ *
+ * XXX Is that really true?
+ */
+ varattnos = bms_union(get_varattnos((Node*)clauses, relid),
+ get_varattnos((Node*)conditions, relid));
+
+ for (i = 1; i < nmvstats; i++)
+ {
+ /* intersect with current statistics */
+ Bitmapset *curr = bms_intersect(stats_attnums[i], varattnos);
+
+ /* walk through 'previous' stats and check redundancy */
+ for (j = 0; j < i; j++)
+ {
+ /* intersect with current statistics */
+ Bitmapset *prev;
+
+ /* skip stats already identified as redundant */
+ if (redundant[j])
+ continue;
+
+ prev = bms_intersect(stats_attnums[j], varattnos);
+
+ switch (bms_subset_compare(curr, prev))
+ {
+ case BMS_EQUAL:
+ /*
+ * Use the smaller one (hopefully more accurate).
+ * If both have the same size, use the first one.
+ */
+ if (mvstats[i].stakeys->dim1 >= mvstats[j].stakeys->dim1)
+ redundant[i] = TRUE;
+ else
+ redundant[j] = TRUE;
+
+ break;
+
+ case BMS_SUBSET1: /* curr is subset of prev */
+ redundant[i] = TRUE;
+ break;
+
+ case BMS_SUBSET2: /* prev is subset of curr */
+ redundant[j] = TRUE;
+ break;
+
+ case BMS_DIFFERENT:
+ /* do nothing - keep both stats */
+ break;
+ }
+
+ bms_free(prev);
+ }
+
+ bms_free(curr);
+ }
+
+ /* can't reduce all statistics (at least one has to remain) */
+ Assert(nmvstats > 0);
+
+ /* now, let's remove the reduced statistics from the arrays */
+ list_free(stats);
+ stats = NIL;
+
+ for (i = 0; i < nmvstats; i++)
+ {
+ MVStatisticInfo *info;
+
+ pfree(stats_attnums[i]);
+
+ if (redundant[i])
+ continue;
+
+ info = makeNode(MVStatisticInfo);
+ memcpy(info, &mvstats[i], sizeof(MVStatisticInfo));
+
+ stats = lappend(stats, info);
+ }
+
+ pfree(mvstats);
+ pfree(stats_attnums);
+ pfree(redundant);
+
+ return stats;
+}
+
+static Node**
+make_clauses_array(List *clauses, int *nclauses)
+{
+ int i;
+ ListCell *l;
+
+ Node** clauses_array;
+
+ *nclauses = list_length(clauses);
+ clauses_array = (Node **)palloc0((*nclauses) * sizeof(Node *));
+
+ i = 0;
+ foreach (l, clauses)
+ clauses_array[i++] = (Node *)lfirst(l);
+
+ *nclauses = i;
+
+ return clauses_array;
+}
+
+static Bitmapset **
+make_clauses_attnums(PlannerInfo *root, Oid varRelid, SpecialJoinInfo *sjinfo,
+ int type, Node **clauses, int nclauses)
+{
+ int i;
+ Index relid;
+ Bitmapset **clauses_attnums
+ = (Bitmapset **)palloc0(nclauses * sizeof(Bitmapset *));
+
+ for (i = 0; i < nclauses; i++)
+ {
+ Bitmapset * attnums = NULL;
+
+ if (! clause_is_mv_compatible(root, clauses[i], varRelid,
+ &relid, &attnums, sjinfo, type))
+ elog(ERROR, "should not get non-mv-compatible cluase");
+
+ clauses_attnums[i] = attnums;
+ }
+
+ return clauses_attnums;
+}
+
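+/*
+ * Builds a flattened [nmvstats x nclauses] boolean matrix, where
+ * cover_map[i * nclauses + j] says whether clause j references
+ * only attributes covered by statistics i.
+ */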
+static bool*
+make_cover_map(Bitmapset **stats_attnums, int nmvstats,
+ Bitmapset **clauses_attnums, int nclauses)
+{
+ int i, j;
+ bool *cover_map = (bool*)palloc0(nclauses * nmvstats * sizeof(bool));
+
+ for (i = 0; i < nmvstats; i++)
+ for (j = 0; j < nclauses; j++)
+ cover_map[i * nclauses + j]
+ = bms_is_subset(clauses_attnums[j], stats_attnums[i]);
+
+ return cover_map;
+}
diff --git a/src/backend/optimizer/path/costsize.c b/src/backend/optimizer/path/costsize.c
index 459368e..d5bb819 100644
--- a/src/backend/optimizer/path/costsize.c
+++ b/src/backend/optimizer/path/costsize.c
@@ -3431,7 +3431,8 @@ compute_semi_anti_join_factors(PlannerInfo *root,
joinquals,
0,
jointype,
- sjinfo);
+ sjinfo,
+ NIL);
/*
* Also get the normal inner-join selectivity of the join clauses.
@@ -3454,7 +3455,8 @@ compute_semi_anti_join_factors(PlannerInfo *root,
joinquals,
0,
JOIN_INNER,
- &norm_sjinfo);
+ &norm_sjinfo,
+ NIL);
/* Avoid leaking a lot of ListCells */
if (jointype == JOIN_ANTI)
@@ -3621,7 +3623,7 @@ approx_tuple_count(PlannerInfo *root, JoinPath *path, List *quals)
Node *qual = (Node *) lfirst(l);
/* Note that clause_selectivity will be able to cache its result */
- selec *= clause_selectivity(root, qual, 0, JOIN_INNER, &sjinfo);
+ selec *= clause_selectivity(root, qual, 0, JOIN_INNER, &sjinfo, NIL);
}
/* Apply it to the input relation sizes */
@@ -3657,7 +3659,8 @@ set_baserel_size_estimates(PlannerInfo *root, RelOptInfo *rel)
rel->baserestrictinfo,
0,
JOIN_INNER,
- NULL);
+ NULL,
+ NIL);
rel->rows = clamp_row_est(nrows);
@@ -3694,7 +3697,8 @@ get_parameterized_baserel_size(PlannerInfo *root, RelOptInfo *rel,
allclauses,
rel->relid, /* do not use 0! */
JOIN_INNER,
- NULL);
+ NULL,
+ NIL);
nrows = clamp_row_est(nrows);
/* For safety, make sure result is not more than the base estimate */
if (nrows > rel->rows)
@@ -3832,12 +3836,14 @@ calc_joinrel_size_estimate(PlannerInfo *root,
joinquals,
0,
jointype,
- sjinfo);
+ sjinfo,
+ NIL);
pselec = clauselist_selectivity(root,
pushedquals,
0,
jointype,
- sjinfo);
+ sjinfo,
+ NIL);
/* Avoid leaking a lot of ListCells */
list_free(joinquals);
@@ -3849,7 +3855,8 @@ calc_joinrel_size_estimate(PlannerInfo *root,
restrictlist,
0,
jointype,
- sjinfo);
+ sjinfo,
+ NIL);
pselec = 0.0; /* not used, keep compiler quiet */
}
diff --git a/src/backend/optimizer/util/orclauses.c b/src/backend/optimizer/util/orclauses.c
index ea831f5..6299e75 100644
--- a/src/backend/optimizer/util/orclauses.c
+++ b/src/backend/optimizer/util/orclauses.c
@@ -280,7 +280,7 @@ consider_new_or_clause(PlannerInfo *root, RelOptInfo *rel,
* saving work later.)
*/
or_selec = clause_selectivity(root, (Node *) or_rinfo,
- 0, JOIN_INNER, NULL);
+ 0, JOIN_INNER, NULL, NIL);
/*
* The clause is only worth adding to the query if it rejects a useful
@@ -342,7 +342,7 @@ consider_new_or_clause(PlannerInfo *root, RelOptInfo *rel,
/* Compute inner-join size */
orig_selec = clause_selectivity(root, (Node *) join_or_rinfo,
- 0, JOIN_INNER, &sjinfo);
+ 0, JOIN_INNER, &sjinfo, NIL);
/* And hack cached selectivity so join size remains the same */
join_or_rinfo->norm_selec = orig_selec / or_selec;
diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index ebb03aa..3c58d42 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -1625,13 +1625,15 @@ booltestsel(PlannerInfo *root, BoolTestType booltesttype, Node *arg,
case IS_NOT_FALSE:
selec = (double) clause_selectivity(root, arg,
varRelid,
- jointype, sjinfo);
+ jointype, sjinfo,
+ NIL);
break;
case IS_FALSE:
case IS_NOT_TRUE:
selec = 1.0 - (double) clause_selectivity(root, arg,
varRelid,
- jointype, sjinfo);
+ jointype, sjinfo,
+ NIL);
break;
default:
elog(ERROR, "unrecognized booltesttype: %d",
@@ -6257,7 +6259,8 @@ genericcostestimate(PlannerInfo *root,
indexSelectivity = clauselist_selectivity(root, selectivityQuals,
index->rel->relid,
JOIN_INNER,
- NULL);
+ NULL,
+ NIL);
/*
* If caller didn't give us an estimate, estimate the number of index
@@ -6582,7 +6585,8 @@ btcostestimate(PG_FUNCTION_ARGS)
btreeSelectivity = clauselist_selectivity(root, selectivityQuals,
index->rel->relid,
JOIN_INNER,
- NULL);
+ NULL,
+ NIL);
numIndexTuples = btreeSelectivity * index->rel->tuples;
/*
@@ -7361,7 +7365,8 @@ gincostestimate(PG_FUNCTION_ARGS)
*indexSelectivity = clauselist_selectivity(root, selectivityQuals,
index->rel->relid,
JOIN_INNER,
- NULL);
+ NULL,
+ NIL);
/* fetch estimated page cost for tablespace containing index */
get_tablespace_page_costs(index->reltablespace,
@@ -7598,7 +7603,7 @@ brincostestimate(PG_FUNCTION_ARGS)
*indexSelectivity =
clauselist_selectivity(root, indexQuals,
path->indexinfo->rel->relid,
- JOIN_INNER, NULL);
+ JOIN_INNER, NULL, NIL);
*indexCorrelation = 1;
/*
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 38ba82f..861601f 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -75,6 +75,7 @@
#include "utils/bytea.h"
#include "utils/guc_tables.h"
#include "utils/memutils.h"
+#include "utils/mvstats.h"
#include "utils/pg_locale.h"
#include "utils/plancache.h"
#include "utils/portal.h"
@@ -380,6 +381,15 @@ static const struct config_enum_entry huge_pages_options[] = {
};
/*
+ * Search algorithm for multivariate stats.
+ */
+static const struct config_enum_entry mvstat_search_options[] = {
+ {"greedy", MVSTAT_SEARCH_GREEDY, false},
+ {"exhaustive", MVSTAT_SEARCH_EXHAUSTIVE, false},
+ {NULL, 0, false}
+};
+
+/*
* Options for enum values stored in other modules
*/
extern const struct config_enum_entry wal_level_options[];
@@ -3672,6 +3682,16 @@ static struct config_enum ConfigureNamesEnum[] =
NULL, NULL, NULL
},
+ {
+ {"mvstat_search", PGC_USERSET, QUERY_TUNING_OTHER,
+ gettext_noop("Sets the algorithm used for combining multivariate stats."),
+ NULL
+ },
+ &mvstat_search_type,
+ MVSTAT_SEARCH_GREEDY, mvstat_search_options,
+ NULL, NULL, NULL
+ },
+
/* End-of-list marker */
{
{NULL, 0, 0, NULL, NULL}, NULL, 0, NULL, NULL, NULL, NULL
diff --git a/src/include/optimizer/cost.h b/src/include/optimizer/cost.h
index 9999ca3..c1f7787 100644
--- a/src/include/optimizer/cost.h
+++ b/src/include/optimizer/cost.h
@@ -191,11 +191,13 @@ extern Selectivity clauselist_selectivity(PlannerInfo *root,
List *clauses,
int varRelid,
JoinType jointype,
- SpecialJoinInfo *sjinfo);
+ SpecialJoinInfo *sjinfo,
+ List *conditions);
extern Selectivity clause_selectivity(PlannerInfo *root,
Node *clause,
int varRelid,
JoinType jointype,
- SpecialJoinInfo *sjinfo);
+ SpecialJoinInfo *sjinfo,
+ List *conditions);
#endif /* COST_H */
diff --git a/src/include/utils/mvstats.h b/src/include/utils/mvstats.h
index f05a517..35b2f8e 100644
--- a/src/include/utils/mvstats.h
+++ b/src/include/utils/mvstats.h
@@ -17,6 +17,14 @@
#include "fmgr.h"
#include "commands/vacuum.h"
+typedef enum MVStatSearchType
+{
+ MVSTAT_SEARCH_EXHAUSTIVE, /* exhaustive search */
+ MVSTAT_SEARCH_GREEDY /* greedy search */
+} MVStatSearchType;
+
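+/* selected via the mvstat_search GUC - "greedy" (the default) or "exhaustive" */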
+extern int mvstat_search_type;
+
/*
* Degree of how much MCV item / histogram bucket matches a clause.
* This is then considered when computing the selectivity.
--
2.1.0
Attachment: 0007-initial-version-of-ndistinct-conefficient-statistics.patch (application/x-patch)
From f750038f659d74ff88be575b1c5c92ad0f745f1d Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@pgaddict.com>
Date: Wed, 23 Dec 2015 02:07:58 +0100
Subject: [PATCH 7/7] initial version of ndistinct coefficient statistics
---
doc/src/sgml/ref/create_statistics.sgml | 9 ++
src/backend/commands/statscmds.c | 11 ++-
src/backend/optimizer/path/clausesel.c | 7 ++
src/backend/optimizer/util/plancat.c | 4 +-
src/backend/utils/mvstats/Makefile | 2 +-
src/backend/utils/mvstats/common.c | 20 ++++-
src/backend/utils/mvstats/mvdist.c | 147 ++++++++++++++++++++++++++++++++
src/include/catalog/pg_mv_statistic.h | 26 +++---
src/include/nodes/relation.h | 2 +
src/include/utils/mvstats.h | 6 ++
10 files changed, 217 insertions(+), 17 deletions(-)
create mode 100644 src/backend/utils/mvstats/mvdist.c
diff --git a/doc/src/sgml/ref/create_statistics.sgml b/doc/src/sgml/ref/create_statistics.sgml
index fd3382e..80360a6 100644
--- a/doc/src/sgml/ref/create_statistics.sgml
+++ b/doc/src/sgml/ref/create_statistics.sgml
@@ -168,6 +168,15 @@ CREATE STATISTICS [ IF NOT EXISTS ] <replaceable class="PARAMETER">statistics_na
</listitem>
</varlistentry>
+ <varlistentry>
+ <term><literal>ndistinct</> (<type>boolean</>)</term>
+ <listitem>
+ <para>
+ Enables ndistinct coefficients for the statistics.
+ </para>
+ </listitem>
+ </varlistentry>
+
</variablelist>
</refsect2>
diff --git a/src/backend/commands/statscmds.c b/src/backend/commands/statscmds.c
index b974655..6ea0e13 100644
--- a/src/backend/commands/statscmds.c
+++ b/src/backend/commands/statscmds.c
@@ -138,7 +138,8 @@ CreateStatistics(CreateStatsStmt *stmt)
/* by default build nothing */
bool build_dependencies = false,
build_mcv = false,
- build_histogram = false;
+ build_histogram = false,
+ build_ndistinct = false;
int32 max_buckets = -1,
max_mcv_items = -1;
@@ -221,6 +222,8 @@ CreateStatistics(CreateStatsStmt *stmt)
if (strcmp(opt->defname, "dependencies") == 0)
build_dependencies = defGetBoolean(opt);
+ else if (strcmp(opt->defname, "ndistinct") == 0)
+ build_ndistinct = defGetBoolean(opt);
else if (strcmp(opt->defname, "mcv") == 0)
build_mcv = defGetBoolean(opt);
else if (strcmp(opt->defname, "max_mcv_items") == 0)
@@ -275,10 +278,10 @@ CreateStatistics(CreateStatsStmt *stmt)
}
/* check that at least some statistics were requested */
- if (! (build_dependencies || build_mcv || build_histogram))
+ if (! (build_dependencies || build_mcv || build_histogram || build_ndistinct))
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
- errmsg("no statistics type (dependencies, mcv, histogram) was requested")));
+ errmsg("no statistics type (dependencies, mcv, histogram, ndistinct) was requested")));
/* now do some checking of the options */
if (require_mcv && (! build_mcv))
@@ -311,6 +314,7 @@ CreateStatistics(CreateStatsStmt *stmt)
values[Anum_pg_mv_statistic_deps_enabled -1] = BoolGetDatum(build_dependencies);
values[Anum_pg_mv_statistic_mcv_enabled -1] = BoolGetDatum(build_mcv);
values[Anum_pg_mv_statistic_hist_enabled -1] = BoolGetDatum(build_histogram);
+ values[Anum_pg_mv_statistic_ndist_enabled-1] = BoolGetDatum(build_ndistinct);
values[Anum_pg_mv_statistic_mcv_max_items -1] = Int32GetDatum(max_mcv_items);
values[Anum_pg_mv_statistic_hist_max_buckets -1] = Int32GetDatum(max_buckets);
@@ -318,6 +322,7 @@ CreateStatistics(CreateStatsStmt *stmt)
nulls[Anum_pg_mv_statistic_stadeps -1] = true;
nulls[Anum_pg_mv_statistic_stamcv -1] = true;
nulls[Anum_pg_mv_statistic_stahist -1] = true;
+ nulls[Anum_pg_mv_statistic_standist -1] = true;
/* insert the tuple into pg_mv_statistic */
mvstatrel = heap_open(MvStatisticRelationId, RowExclusiveLock);
diff --git a/src/backend/optimizer/path/clausesel.c b/src/backend/optimizer/path/clausesel.c
index 3d4d136..720ff87 100644
--- a/src/backend/optimizer/path/clausesel.c
+++ b/src/backend/optimizer/path/clausesel.c
@@ -59,6 +59,7 @@ static void addRangeClause(RangeQueryClause **rqlist, Node *clause,
#define MV_CLAUSE_TYPE_FDEP 0x01
#define MV_CLAUSE_TYPE_MCV 0x02
#define MV_CLAUSE_TYPE_HIST 0x04
+#define MV_CLAUSE_TYPE_NDIST 0x08
static bool clause_is_mv_compatible(PlannerInfo *root, Node *clause, Oid varRelid,
Index *relid, Bitmapset **attnums, SpecialJoinInfo *sjinfo,
@@ -377,6 +378,9 @@ clauselist_selectivity(PlannerInfo *root,
stats, sjinfo);
}
+ if (has_stats(stats, MV_CLAUSE_TYPE_NDIST))
+ elog(WARNING, "has ndistinct coefficient stats");
+
/*
* Check that there are statistics with MCV list or histogram.
* If not, we don't need to waste time with the optimization.
@@ -2931,6 +2935,9 @@ has_stats(List *stats, int type)
if ((type & MV_CLAUSE_TYPE_HIST) && stat->hist_built)
return true;
+
+ if ((type & MV_CLAUSE_TYPE_NDIST) && stat->ndist_built)
+ return true;
}
return false;
diff --git a/src/backend/optimizer/util/plancat.c b/src/backend/optimizer/util/plancat.c
index 963d26e..a319246 100644
--- a/src/backend/optimizer/util/plancat.c
+++ b/src/backend/optimizer/util/plancat.c
@@ -410,7 +410,7 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
mvstat = (Form_pg_mv_statistic) GETSTRUCT(htup);
/* unavailable stats are not interesting for the planner */
- if (mvstat->deps_built || mvstat->mcv_built || mvstat->hist_built)
+ if (mvstat->deps_built || mvstat->mcv_built || mvstat->hist_built || mvstat->ndist_built)
{
info = makeNode(MVStatisticInfo);
@@ -421,11 +421,13 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
info->deps_enabled = mvstat->deps_enabled;
info->mcv_enabled = mvstat->mcv_enabled;
info->hist_enabled = mvstat->hist_enabled;
+ info->ndist_enabled = mvstat->ndist_enabled;
/* built/available statistics */
info->deps_built = mvstat->deps_built;
info->mcv_built = mvstat->mcv_built;
info->hist_built = mvstat->hist_built;
+ info->ndist_built = mvstat->ndist_built;
/* stakeys */
adatum = SysCacheGetAttr(MVSTATOID, htup,
diff --git a/src/backend/utils/mvstats/Makefile b/src/backend/utils/mvstats/Makefile
index 9dbb3b6..d4b88e9 100644
--- a/src/backend/utils/mvstats/Makefile
+++ b/src/backend/utils/mvstats/Makefile
@@ -12,6 +12,6 @@ subdir = src/backend/utils/mvstats
top_builddir = ../../../..
include $(top_builddir)/src/Makefile.global
-OBJS = common.o dependencies.o histogram.o mcv.o
+OBJS = common.o dependencies.o histogram.o mcv.o mvdist.o
include $(top_srcdir)/src/backend/common.mk
diff --git a/src/backend/utils/mvstats/common.c b/src/backend/utils/mvstats/common.c
index ffb76f4..c42ca8f 100644
--- a/src/backend/utils/mvstats/common.c
+++ b/src/backend/utils/mvstats/common.c
@@ -53,6 +53,7 @@ build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
MVDependencies deps = NULL;
MCVList mcvlist = NULL;
MVHistogram histogram = NULL;
+ double ndist = -1;
int numrows_filtered = numrows;
VacAttrStats **stats = NULL;
@@ -92,6 +93,9 @@ build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
if (stat->deps_enabled)
deps = build_mv_dependencies(numrows, rows, attrs, stats);
+ if (stat->ndist_enabled)
+ ndist = build_mv_ndistinct(numrows, rows, attrs, stats);
+
/* build the MCV list */
if (stat->mcv_enabled)
mcvlist = build_mv_mcvlist(numrows, rows, attrs, stats, &numrows_filtered);
@@ -101,7 +105,7 @@ build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
histogram = build_mv_histogram(numrows_filtered, rows, attrs, stats, numrows);
/* store the histogram / MCV list in the catalog */
- update_mv_stats(stat->mvoid, deps, mcvlist, histogram, attrs, stats);
+ update_mv_stats(stat->mvoid, deps, mcvlist, histogram, ndist, attrs, stats);
}
}
@@ -183,6 +187,8 @@ list_mv_stats(Oid relid)
info->mcv_built = stats->mcv_built;
info->hist_enabled = stats->hist_enabled;
info->hist_built = stats->hist_built;
+ info->ndist_enabled = stats->ndist_enabled;
+ info->ndist_built = stats->ndist_built;
result = lappend(result, info);
}
@@ -252,7 +258,7 @@ find_mv_attnums(Oid mvoid, Oid *relid)
void
update_mv_stats(Oid mvoid,
MVDependencies dependencies, MCVList mcvlist, MVHistogram histogram,
- int2vector *attrs, VacAttrStats **stats)
+ double ndistcoeff, int2vector *attrs, VacAttrStats **stats)
{
HeapTuple stup,
oldtup;
@@ -292,26 +298,36 @@ update_mv_stats(Oid mvoid,
= PointerGetDatum(data);
}
+ if (ndistcoeff > 1.0)
+ {
+ nulls[Anum_pg_mv_statistic_standist -1] = false;
+ values[Anum_pg_mv_statistic_standist-1] = Float8GetDatum(ndistcoeff);
+ }
+
/* always replace the value (either by bytea or NULL) */
replaces[Anum_pg_mv_statistic_stadeps -1] = true;
replaces[Anum_pg_mv_statistic_stamcv -1] = true;
replaces[Anum_pg_mv_statistic_stahist-1] = true;
+ replaces[Anum_pg_mv_statistic_standist-1] = true;
/* always change the availability flags */
nulls[Anum_pg_mv_statistic_deps_built -1] = false;
nulls[Anum_pg_mv_statistic_mcv_built -1] = false;
nulls[Anum_pg_mv_statistic_hist_built-1] = false;
+ nulls[Anum_pg_mv_statistic_ndist_built-1] = false;
nulls[Anum_pg_mv_statistic_stakeys-1] = false;
/* use the new attnums, in case we removed some dropped ones */
replaces[Anum_pg_mv_statistic_deps_built-1] = true;
replaces[Anum_pg_mv_statistic_mcv_built -1] = true;
+ replaces[Anum_pg_mv_statistic_ndist_built-1] = true;
replaces[Anum_pg_mv_statistic_hist_built -1] = true;
replaces[Anum_pg_mv_statistic_stakeys -1] = true;
values[Anum_pg_mv_statistic_deps_built-1] = BoolGetDatum(dependencies != NULL);
values[Anum_pg_mv_statistic_mcv_built -1] = BoolGetDatum(mcvlist != NULL);
values[Anum_pg_mv_statistic_hist_built -1] = BoolGetDatum(histogram != NULL);
+ values[Anum_pg_mv_statistic_ndist_built-1] = BoolGetDatum(ndistcoeff > 1.0);
values[Anum_pg_mv_statistic_stakeys -1] = PointerGetDatum(attrs);
/* Is there already a pg_mv_statistic tuple for this attribute? */
diff --git a/src/backend/utils/mvstats/mvdist.c b/src/backend/utils/mvstats/mvdist.c
new file mode 100644
index 0000000..6df7411
--- /dev/null
+++ b/src/backend/utils/mvstats/mvdist.c
@@ -0,0 +1,147 @@
+/*-------------------------------------------------------------------------
+ *
+ * mvdist.c
+ * POSTGRES multivariate distinct coefficients
+ *
+ *
+ * Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ * src/backend/utils/mvstats/mvdist.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "common.h"
+#include "utils/lsyscache.h"
+
+/*
+ * build_mv_ndistinct
+ *	Estimate the ndistinct coefficient from the sample rows, i.e. the
+ *	ratio (ndistinct(a) * ndistinct(b) * ...) / ndistinct(a,b,...).
+ */
+double
+build_mv_ndistinct(int numrows, HeapTuple *rows, int2vector *attrs,
+ VacAttrStats **stats)
+{
+ int i, j;
+ int numattrs = attrs->dim1;
+ MultiSortSupport mss = multi_sort_init(numattrs);
+ int ndistinct;
+ double result;
+
+ /*
+ * It's possible to sort the sample rows directly, but this seemed
+ * somehow simpler / less error prone. Another option would be to
+ * allocate the arrays for each SortItem separately, but that'd be
+ * significant overhead (not just CPU, but especially memory bloat).
+ */
+ SortItem * items = (SortItem*)palloc0(numrows * sizeof(SortItem));
+
+ Datum *values = (Datum*)palloc0(sizeof(Datum) * numrows * numattrs);
+ bool *isnull = (bool*)palloc0(sizeof(bool) * numrows * numattrs);
+
+ for (i = 0; i < numrows; i++)
+ {
+ items[i].values = &values[i * numattrs];
+ items[i].isnull = &isnull[i * numattrs];
+ }
+
+ Assert(numattrs >= 2);
+
+ for (i = 0; i < numattrs; i++)
+ {
+ /* prepare the sort function for this dimension */
+ multi_sort_add_dimension(mss, i, i, stats);
+
+ /* accumulate the data for this dimension into the array */
+ for (j = 0; j < numrows; j++)
+ {
+ items[j].values[i]
+ = heap_getattr(rows[j], attrs->values[i],
+ stats[i]->tupDesc, &items[j].isnull[i]);
+ }
+ }
+
+ qsort_arg((void *) items, numrows, sizeof(SortItem),
+ multi_sort_compare, mss);
+
+ /* count number of distinct combinations */
+
+ ndistinct = 1;
+ for (i = 1; i < numrows; i++)
+ {
+ if (multi_sort_compare(&items[i], &items[i-1], mss) != 0)
+ ndistinct++;
+ }
+
+ result = 1 / (double)ndistinct;
+
+ /*
+ * now count distinct values for each attribute and incrementally
+ * compute (ndistinct(a) * ndistinct(b)) / ndistinct(a,b)
+ */
+ for (i = 0; i < numattrs; i++)
+ {
+ SortSupportData ssup;
+ StdAnalyzeData *tmp = (StdAnalyzeData *)stats[i]->extra_data;
+
+ /* initialize sort support, etc. */
+ memset(&ssup, 0, sizeof(ssup));
+ ssup.ssup_cxt = CurrentMemoryContext;
+
+ /* We always use the default collation for statistics */
+ ssup.ssup_collation = DEFAULT_COLLATION_OID;
+ ssup.ssup_nulls_first = false;
+
+ PrepareSortSupportFromOrderingOp(tmp->ltopr, &ssup);
+
+ memset(values, 0, sizeof(Datum) * numrows);
+
+ /* accumulate all the data into the array and sort it */
+ for (j = 0; j < numrows; j++)
+ {
+ bool isnull;
+ values[j] = heap_getattr(rows[j], attrs->values[i],
+ stats[i]->tupDesc, &isnull);
+ }
+
+ qsort_arg((void *)values, numrows, sizeof(Datum),
+ compare_scalars_simple, &ssup);
+
+ ndistinct = 1;
+ for (j = 1; j < numrows; j++)
+ {
+ if (compare_scalars_simple(&values[j], &values[j-1], &ssup) != 0)
+ ndistinct++;
+ }
+
+ result *= ndistinct;
+ }
+
+ return result;
+}
+
+double
+load_mv_ndistinct(Oid mvoid)
+{
+ bool isnull = false;
+ Datum ndist;
+
+ /* fetch the pg_mv_statistic tuple for this statistics object */
+ HeapTuple htup = SearchSysCache1(MVSTATOID, ObjectIdGetDatum(mvoid));
+
+ Assert(HeapTupleIsValid(htup));
+
+#ifdef USE_ASSERT_CHECKING
+ Form_pg_mv_statistic mvstat = (Form_pg_mv_statistic) GETSTRUCT(htup);
+ Assert(mvstat->ndist_enabled && mvstat->ndist_built);
+#endif
+
+ ndist = SysCacheGetAttr(MVSTATOID, htup,
+ Anum_pg_mv_statistic_standist, &isnull);
+
+ Assert(!isnull);
+
+ ReleaseSysCache(htup);
+
+ return DatumGetFloat8(ndist);
+}
diff --git a/src/include/catalog/pg_mv_statistic.h b/src/include/catalog/pg_mv_statistic.h
index a5945af..ee353da 100644
--- a/src/include/catalog/pg_mv_statistic.h
+++ b/src/include/catalog/pg_mv_statistic.h
@@ -39,6 +39,7 @@ CATALOG(pg_mv_statistic,3381)
bool deps_enabled; /* analyze dependencies? */
bool mcv_enabled; /* build MCV list? */
bool hist_enabled; /* build histogram? */
+ bool ndist_enabled; /* build ndist coefficient? */
/* histogram / MCV size */
int32 mcv_max_items; /* max MCV items */
@@ -48,6 +49,7 @@ CATALOG(pg_mv_statistic,3381)
bool deps_built; /* dependencies were built */
bool mcv_built; /* MCV list was built */
bool hist_built; /* histogram was built */
+ bool ndist_built; /* ndistinct coeff built */
/* variable-length fields start here, but we allow direct access to stakeys */
int2vector stakeys; /* array of column keys */
@@ -56,6 +58,7 @@ CATALOG(pg_mv_statistic,3381)
bytea stadeps; /* dependencies (serialized) */
bytea stamcv; /* MCV list (serialized) */
bytea stahist; /* MV histogram (serialized) */
+ float8 standist; /* ndistinct coefficient */
#endif
} FormData_pg_mv_statistic;
@@ -71,21 +74,24 @@ typedef FormData_pg_mv_statistic *Form_pg_mv_statistic;
* compiler constants for pg_mv_statistic
* ----------------
*/
-#define Natts_pg_mv_statistic 15
+#define Natts_pg_mv_statistic 18
#define Anum_pg_mv_statistic_starelid 1
#define Anum_pg_mv_statistic_staname 2
#define Anum_pg_mv_statistic_stanamespace 3
#define Anum_pg_mv_statistic_deps_enabled 4
#define Anum_pg_mv_statistic_mcv_enabled 5
#define Anum_pg_mv_statistic_hist_enabled 6
-#define Anum_pg_mv_statistic_mcv_max_items 7
-#define Anum_pg_mv_statistic_hist_max_buckets 8
-#define Anum_pg_mv_statistic_deps_built 9
-#define Anum_pg_mv_statistic_mcv_built 10
-#define Anum_pg_mv_statistic_hist_built 11
-#define Anum_pg_mv_statistic_stakeys 12
-#define Anum_pg_mv_statistic_stadeps 13
-#define Anum_pg_mv_statistic_stamcv 14
-#define Anum_pg_mv_statistic_stahist 15
+#define Anum_pg_mv_statistic_ndist_enabled 7
+#define Anum_pg_mv_statistic_mcv_max_items 8
+#define Anum_pg_mv_statistic_hist_max_buckets 9
+#define Anum_pg_mv_statistic_deps_built 10
+#define Anum_pg_mv_statistic_mcv_built 11
+#define Anum_pg_mv_statistic_hist_built 12
+#define Anum_pg_mv_statistic_ndist_built 13
+#define Anum_pg_mv_statistic_stakeys 14
+#define Anum_pg_mv_statistic_stadeps 15
+#define Anum_pg_mv_statistic_stamcv 16
+#define Anum_pg_mv_statistic_stahist 17
+#define Anum_pg_mv_statistic_standist 18
#endif /* PG_MV_STATISTIC_H */
diff --git a/src/include/nodes/relation.h b/src/include/nodes/relation.h
index 1298c42..97d74e9 100644
--- a/src/include/nodes/relation.h
+++ b/src/include/nodes/relation.h
@@ -594,11 +594,13 @@ typedef struct MVStatisticInfo
bool deps_enabled; /* functional dependencies enabled */
bool mcv_enabled; /* MCV list enabled */
bool hist_enabled; /* histogram enabled */
+ bool ndist_enabled; /* ndistinct coefficient enabled */
/* built/available statistics */
bool deps_built; /* functional dependencies built */
bool mcv_built; /* MCV list built */
bool hist_built; /* histogram built */
+ bool ndist_built; /* ndistinct coefficient built */
/* columns in the statistics (attnums) */
int2vector *stakeys; /* attnums of the columns covered */
diff --git a/src/include/utils/mvstats.h b/src/include/utils/mvstats.h
index 35b2f8e..a154cd9 100644
--- a/src/include/utils/mvstats.h
+++ b/src/include/utils/mvstats.h
@@ -225,6 +225,7 @@ typedef MVSerializedHistogramData *MVSerializedHistogram;
MVDependencies load_mv_dependencies(Oid mvoid);
MCVList load_mv_mcvlist(Oid mvoid);
MVSerializedHistogram load_mv_histogram(Oid mvoid);
+double load_mv_ndistinct(Oid mvoid);
bytea * serialize_mv_dependencies(MVDependencies dependencies);
bytea * serialize_mv_mcvlist(MCVList mcvlist, int2vector *attrs,
@@ -266,11 +267,16 @@ MVHistogram
build_mv_histogram(int numrows, HeapTuple *rows, int2vector *attrs,
VacAttrStats **stats, int numrows_total);
+double
+build_mv_ndistinct(int numrows, HeapTuple *rows, int2vector *attrs,
+ VacAttrStats **stats);
+
void build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
int natts, VacAttrStats **vacattrstats);
void update_mv_stats(Oid relid, MVDependencies dependencies,
MCVList mcvlist, MVHistogram histogram,
+ double ndistcoeff,
int2vector *attrs, VacAttrStats **stats);
#ifdef DEBUG_MVHIST
--
2.1.0
On 12/12/14 05:53, Heikki Linnakangas wrote:
On 10/13/2014 01:00 AM, Tomas Vondra wrote:
Hi,
attached is a WIP patch implementing multivariate statistics.
Great! Really glad to see you working on this.
+ * FIXME This sample sizing is mostly OK when computing stats for
+ * individual columns, but when computing multi-variate stats
+ * for multivariate stats (histograms, mcv, ...) it's rather
+ * insufficient. For small number of dimensions it works, but
+ * for complex stats it'd be nice use sample proportional to
+ * the table (say, 0.5% - 1%) instead of a fixed size.

I don't think a fraction of the table is appropriate. As long as the
sample is random, the accuracy of a sample doesn't depend much on the
size of the population. For example, if you sample 1,000 rows from a
table with 100,000 rows, or 1000 rows from a table with 100,000,000
rows, the accuracy is pretty much the same. That doesn't change when
you go from a single variable to multiple variables.

You do need a bigger sample with multiple variables, however. My gut
feeling is that if you sample N rows for a single variable, with two
variables you need to sample N^2 rows to get the same accuracy. But
it's not proportional to the table size. (I have no proof for that,
but I'm sure there is literature on this.)
[...]
I did stage III statistics at University many moons ago...
The accuracy of the sample only depends on the value of N, not the total
size of the population, with the obvious constraint that N <= population
size.
The standard error in a random sample is inversely proportional to the
square root of N. So using N = 100 would give a standard error of about
10%, and to reduce it to 5% you would need N = 400.
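To make that concrete, here is a quick self-contained check anyone can
run (the table and column names are made up for illustration, they are
not from the patch) - the error of a proportion estimated from a
fixed-size random sample depends only on N, not on the table size:

CREATE TABLE sample_demo AS
    SELECT (random() < 0.5) AS flag
    FROM generate_series(1, 1000000);

-- the exact proportion, about 0.5
SELECT avg(flag::int) FROM sample_demo;

-- the proportion estimated from N = 1000 random rows; the standard
-- error is roughly sqrt(0.25 / 1000) ~ 0.016, and stays about the
-- same even if the table is 100x larger
SELECT avg(flag::int)
FROM (SELECT flag FROM sample_demo ORDER BY random() LIMIT 1000) s;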
For multiple variables, it will also be a function of N - I don't recall
precisely how, but I suspect it might be M * N, where M is the number of
parameters (though I'm not certain). I think M^N might be needed if you
want all the possible correlations between sets of variables to be
reasonably significant - but I'm mostly just guessing here.
So using a % of table size is somewhat silly, looking at the above.
However, if you want to detect frequencies that occur at the 1% level,
then you will need to sample 1% of the table or greater. So which
approach is 'best', depends on what you are trying to determine. The
sample size is more useful when you need to decide between 2 different
hypotheses.
The sampling methodology, is far more important than the ratio of N to
population size - consider the bias imposed by using random telephone
numbers, even before the advent of mobile phones!
Cheers,
Gavin
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Wed, Dec 23, 2015 at 2:07 PM, Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:
The remaining question is how unique the statistics name should be.
My initial plan was to make it unique within a table, but that of
course does not work well with the DROP STATISTICS (it'd have to
specify the table name also), and it'd also not work with statistics
on multiple tables (which is one of the reasons for abandoning ALTER
TABLE stuff).

So I think it should be unique across tables. Statistics are hardly
a global object, so it should be unique within a schema. I thought
that simply using the schema of the table would work, but that of
course breaks with multiple tables in different schemas. So the only
solution seems to be explicit schema for statistics.
That solution seems good to me.
(with apologies for not having looked at the rest of this much at all)
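To illustrate what is being proposed (a sketch only - the statistics
and table names are made up, the syntax follows the CREATE STATISTICS
command from the patch):

CREATE STATISTICS myschema.mystats ON mytable (a, b) WITH (dependencies);

DROP STATISTICS myschema.mystats;

Without an explicit schema, the name would be created in (and resolved
through) the search_path, just like other schema-qualified objects.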
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Wed, Jan 20, 2016 at 02:20:38PM -0500, Robert Haas wrote:
On Wed, Dec 23, 2015 at 2:07 PM, Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:

The remaining question is how unique the statistics name should be.
My initial plan was to make it unique within a table, but that of
course does not work well with the DROP STATISTICS (it'd have to
specify the table name also), and it'd also not work with statistics
on multiple tables (which is one of the reasons for abandoning ALTER
TABLE stuff).

So I think it should be unique across tables. Statistics are hardly
a global object, so it should be unique within a schema. I thought
that simply using the schema of the table would work, but that of
course breaks with multiple tables in different schemas. So the only
solution seems to be explicit schema for statistics.

That solution seems good to me.
(with apologies for not having looked at the rest of this much at all)
Woh, this will be an optimizer game-changer, from the user perspective!
--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com
+ As you are, so once was I. As I am, so you will be. +
+ Roman grave inscription +
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Bruce Momjian wrote:
On Wed, Jan 20, 2016 at 02:20:38PM -0500, Robert Haas wrote:
On Wed, Dec 23, 2015 at 2:07 PM, Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:

The remaining question is how unique the statistics name should be.
My initial plan was to make it unique within a table, but that of
course does not work well with the DROP STATISTICS (it'd have to
specify the table name also), and it'd also not work with statistics
on multiple tables (which is one of the reasons for abandoning ALTER
TABLE stuff).

So I think it should be unique across tables. Statistics are hardly
a global object, so it should be unique within a schema. I thought
that simply using the schema of the table would work, but that of
course breaks with multiple tables in different schemas. So the only
solution seems to be explicit schema for statistics.

That solution seems good to me.
(with apologies for not having looked at the rest of this much at all)
Woh, this will be an optimizer game-changer, from the user perspective!
That is the intent. The patch is huge, though -- any reviewing help is
welcome.
--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 01/20/2016 10:54 PM, Alvaro Herrera wrote:
Bruce Momjian wrote:
On Wed, Jan 20, 2016 at 02:20:38PM -0500, Robert Haas wrote:
On Wed, Dec 23, 2015 at 2:07 PM, Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:

The remaining question is how unique the statistics name should be.
My initial plan was to make it unique within a table, but that of
course does not work well with the DROP STATISTICS (it'd have to
specify the table name also), and it'd also not work with statistics
on multiple tables (which is one of the reasons for abandoning ALTER
TABLE stuff).

So I think it should be unique across tables. Statistics are hardly
a global object, so it should be unique within a schema. I thought
that simply using the schema of the table would work, but that of
course breaks with multiple tables in different schemas. So the only
solution seems to be explicit schema for statistics.

That solution seems good to me.
(with apologies for not having looked at the rest of this much at all)
Woh, this will be an optimizer game-changer, from the user perspective!
That is the intent. The patch is huge, though -- any reviewing help
is welcome.
It's also true that a significant fraction of the size is documentation
(in the form of comments). However even after stripping them the patch
is not exactly small ...
I'm afraid it may be rather difficult to understand the general idea of
the patch. So if anyone is interested in discussing the patch in
Brussels next week, I'm available.
Also, in December I've posted a link to a "paper" I started writing
about the stats:
https://bitbucket.org/tvondra/mvstats-paper/src
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Hi,
Attached is v10 of the patch series. There are 9 parts at the moment:
0001-teach-pull_-varno-varattno-_walker-about-RestrictInf.patch
0002-shared-infrastructure-and-functional-dependencies.patch
0003-clause-reduction-using-functional-dependencies.patch
0004-multivariate-MCV-lists.patch
0005-multivariate-histograms.patch
0006-multi-statistics-estimation.patch
0007-multivariate-ndistinct-coefficients.patch
0008-change-how-we-apply-selectivity-to-number-of-groups-.patch
0009-fixup-of-regression-tests-plans-changes-by-group-by-.patch
However, the first one is still just a temporary workaround that I plan
to address next, and the last 3 are all dealing with the ndistinct
coefficients (and shall be squashed into a single chunk).
README docs
-----------
Aside from fixing a few bugs, there are several major improvements, the
main one being that I've moved most of the comments explaining how it
all works into a set of regular README files, located in
src/backend/utils/mvstats:
1) README.stats - Overview of available types of statistics, what
clauses can be estimated, how multiple statistics are combined etc.
This is probably the right place to start.
2) docs for each type of statistics currently available
README.dependencies - soft functional dependencies
README.mcv - MCV lists
README.histogram - histograms
README.ndistinct - ndistinct coefficients
The READMEs are added and modified through the patch series, so the best
thing to do is apply all the patches and start reading.
I have not improved the user-oriented SGML documentation in this patch,
that's one of the tasks I'd like to work on next. But the READMEs should
give you a good idea how it's supposed to work, and there are some
examples of use in the regression tests.
Significantly simplified places
-------------------------------
The patch version also significantly simplifies several places that were
needlessly complex in the previous ones - firstly the function
evaluating clauses on multivariate histograms was rather needlessly
bloated, so I've simplified it a lot. Similarly for the code in
clauselist_selectivity() that combines multiple statistics to estimate a list
of clauses - that's much simpler now too. And various other pieces.
That being said, I still think the code in clausesel.c can be
simplified. I feel there's a lot of cruft, mostly due to unknowingly
implementing something that could be solved by an existing function.
A prime example of that is inspecting the expression tree to check if we
know how to estimate the clauses using the multivariate statistics. That
sounds like a nice match for expression walker, but currently is done by
custom code. I plan to look at that next.
Also, I'm not quite sure I understand what the varRelid parameter of
clauselist_selectivity is for, so the code may be handling that wrong
(seems to be working though).
ndistinct coefficients
----------------------
The one new piece in this patch is the GROUP BY estimation, based on the
ndistinct coefficients. So for example you can do this:
CREATE TABLE t AS SELECT mod(i,1000) AS a, mod(i,1000) AS b
FROM generate_series(1,1000000) s(i);
ANALYZE t;
EXPLAIN SELECT * FROM t GROUP BY a, b;
which currently does this:
QUERY PLAN
-----------------------------------------------------------------------
Group (cost=127757.34..135257.34 rows=99996 width=8)
Group Key: a, b
-> Sort (cost=127757.34..130257.34 rows=1000000 width=8)
Sort Key: a, b
-> Seq Scan on t (cost=0.00..14425.00 rows=1000000 width=8)
(5 rows)
but we know that there are only 1000 groups because the columns are
correlated. So let's create ndistinct statistics on the two columns:
CREATE STATISTICS s1 ON t (a,b) WITH (ndistinct);
ANALYZE t;
which results in estimates like this:
QUERY PLAN
-----------------------------------------------------------------
HashAggregate (cost=19425.00..19435.00 rows=1000 width=8)
Group Key: a, b
-> Seq Scan on t (cost=0.00..14425.00 rows=1000000 width=8)
(3 rows)
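Just to illustrate where the 1000 comes from, both the actual number of
groups and the ndistinct coefficient the estimate is derived from can be
checked by hand (computed from the full table here, while ANALYZE of
course works from a sample):

-- the true number of groups: 1000
SELECT count(*) FROM (SELECT DISTINCT a, b FROM t) s;

-- the coefficient, ndistinct(a) * ndistinct(b) / ndistinct(a,b),
-- here 1000 * 1000 / 1000 = 1000
SELECT count(DISTINCT a) * count(DISTINCT b)
       / (SELECT count(*) FROM (SELECT DISTINCT a, b FROM t) x) AS coeff
FROM t;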
I'm not quite sure how to combine this type of statistics with MCV lists
and histograms, so for now it's used only for GROUP BY.
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachments:
0001-teach-pull_-varno-varattno-_walker-about-RestrictInf.patch
From 19defa4e8c1e578f3cf4099b0729357ecc333c5a Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@pgaddict.com>
Date: Tue, 28 Apr 2015 19:56:33 +0200
Subject: [PATCH 1/9] teach pull_(varno|varattno)_walker about RestrictInfo
otherwise pull_varnos fails when processing OR clauses
---
src/backend/optimizer/util/var.c | 16 ++++++++++++++++
1 file changed, 16 insertions(+)
diff --git a/src/backend/optimizer/util/var.c b/src/backend/optimizer/util/var.c
index dff52c4..80d01bd 100644
--- a/src/backend/optimizer/util/var.c
+++ b/src/backend/optimizer/util/var.c
@@ -197,6 +197,13 @@ pull_varnos_walker(Node *node, pull_varnos_context *context)
context->sublevels_up--;
return result;
}
+ if (IsA(node, RestrictInfo))
+ {
+ RestrictInfo *rinfo = (RestrictInfo*)node;
+ context->varnos = bms_add_members(context->varnos,
+ rinfo->clause_relids);
+ return false;
+ }
return expression_tree_walker(node, pull_varnos_walker,
(void *) context);
}
@@ -245,6 +252,15 @@ pull_varattnos_walker(Node *node, pull_varattnos_context *context)
return false;
}
+ if (IsA(node, RestrictInfo))
+ {
+ RestrictInfo *rinfo = (RestrictInfo *)node;
+
+ return expression_tree_walker((Node*)rinfo->clause,
+ pull_varattnos_walker,
+ (void*) context);
+ }
+
/* Should not find an unplanned subquery */
Assert(!IsA(node, Query));
--
2.1.0
0002-shared-infrastructure-and-functional-dependencies.patch
From 8aa6a738260ece48b31e9abc955d0c326fbf8a9a Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tv@fuzzy.cz>
Date: Sun, 11 Jan 2015 19:51:48 +0100
Subject: [PATCH 2/9] shared infrastructure and functional dependencies
Basic infrastructure shared by all kinds of multivariate
stats, most importantly:
- adds a new system catalog (pg_mv_statistic)
- CREATE STATISTICS name ON table (columns) WITH (options)
- DROP STATISTICS name
- implementation of functional dependencies (the simplest
type of multivariate statistics)
- building functional dependencies in ANALYZE
- updates regression tests (new catalog etc.)
This does not include any changes to the optimizer, i.e.
it does not influence the query planning (subject to
follow-up patches).
The current implementation requires a valid 'ltopr' for
the columns, so that we can sort the sample rows in various
ways, both in this patch and other kinds of statistics.
Maybe this restriction could be relaxed in the future,
requiring just 'eqopr' in case of stats not sorting the
data (e.g. functional dependencies and MCV lists).
Maybe some of the stats (functional dependencies and MCV
list with limited functionality) might be made to work
with hashes of the values, which is sufficient for equality
comparisons. But the queries would require the equality
operator anyway, so it's not really a weaker requirement.
The hashes might reduce space requirements, though.
The algorithm detecting the dependencies is rather simple
and probably needs improvements, so that it detects more
complicated dependencies, and also validation of the math.
The name 'functional dependencies' is more correct (than
'association rules') as it's exactly the name used in
relational theory (esp. Normal Forms) for tracking
column-level dependencies.
The multivariate statistics are automatically removed in
two situations
(a) after a DROP TABLE (obviously)
(b) after ALTER TABLE ... DROP COLUMN, if the statistics
would be defined on less than 2 columns (remaining)
If there are at least 2 columns remaining, we keep
the statistics but perform cleanup on the next ANALYZE.
The dropped columns are removed from stakeys, and the new
statistics is built on the smaller set.
We can't do this at DROP COLUMN, because that'd leave us
with invalid statistics, or we'd have to throw it away
although we can still use it. This lazy approach lets us
use the statistics although some of the columns are dead.
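For illustration (a sketch with made-up names):

    CREATE STATISTICS s ON t1 (a, b, c) WITH (dependencies);

    -- two columns remain, so the statistics survives; stakeys
    -- are pruned at the next ANALYZE
    ALTER TABLE t1 DROP COLUMN c;

    -- only one column would remain, so the statistics is removed
    ALTER TABLE t1 DROP COLUMN b;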
This also adds a simple list of statistics to \d in psql.
This means the statistics are created within a schema by
using a qualified name (or using the default schema)
CREATE STATISTICS schema.statistics ON ...
and then dropped by specifying qualified name
DROP STATISTICS schema.statistics
or searching through search_path (just like with other objects).
This also gets rid of the "(opt_)stats_name" definitions in gram.y
and instead replaces them with just "opt_any_name", although the
optional case is not really handled currently - there's no generated
name yet (so either we should drop it or implement it).
I'm not entirely sure making statistics schema-specific is that
a great idea. Maybe it should be "global", but that does not seem
right (e.g. it makes multi-tenant systems based on schemas more
difficult to manage, because tenants would interact).
---
doc/src/sgml/ref/allfiles.sgml | 2 +
doc/src/sgml/ref/create_statistics.sgml | 174 ++++++++++
doc/src/sgml/ref/drop_statistics.sgml | 90 ++++++
doc/src/sgml/reference.sgml | 2 +
src/backend/catalog/Makefile | 1 +
src/backend/catalog/dependency.c | 11 +-
src/backend/catalog/heap.c | 102 ++++++
src/backend/catalog/namespace.c | 51 +++
src/backend/catalog/objectaddress.c | 22 ++
src/backend/catalog/system_views.sql | 11 +
src/backend/commands/Makefile | 6 +-
src/backend/commands/analyze.c | 21 ++
src/backend/commands/dropcmds.c | 4 +
src/backend/commands/event_trigger.c | 3 +
src/backend/commands/statscmds.c | 331 +++++++++++++++++++
src/backend/commands/tablecmds.c | 8 +-
src/backend/nodes/copyfuncs.c | 16 +
src/backend/nodes/outfuncs.c | 18 ++
src/backend/optimizer/util/plancat.c | 63 ++++
src/backend/parser/gram.y | 34 +-
src/backend/tcop/utility.c | 11 +
src/backend/utils/Makefile | 2 +-
src/backend/utils/cache/relcache.c | 59 ++++
src/backend/utils/cache/syscache.c | 23 ++
src/backend/utils/mvstats/Makefile | 17 +
src/backend/utils/mvstats/README.dependencies | 222 +++++++++++++
src/backend/utils/mvstats/common.c | 356 +++++++++++++++++++++
src/backend/utils/mvstats/common.h | 75 +++++
src/backend/utils/mvstats/dependencies.c | 437 ++++++++++++++++++++++++++
src/bin/psql/describe.c | 44 +++
src/include/catalog/dependency.h | 5 +-
src/include/catalog/heap.h | 1 +
src/include/catalog/indexing.h | 7 +
src/include/catalog/namespace.h | 2 +
src/include/catalog/pg_mv_statistic.h | 73 +++++
src/include/catalog/pg_proc.h | 5 +
src/include/catalog/toasting.h | 1 +
src/include/commands/defrem.h | 4 +
src/include/nodes/nodes.h | 2 +
src/include/nodes/parsenodes.h | 12 +
src/include/nodes/relation.h | 28 ++
src/include/utils/mvstats.h | 70 +++++
src/include/utils/rel.h | 4 +
src/include/utils/relcache.h | 1 +
src/include/utils/syscache.h | 2 +
src/test/regress/expected/rules.out | 9 +
src/test/regress/expected/sanity_check.out | 1 +
47 files changed, 2432 insertions(+), 11 deletions(-)
create mode 100644 doc/src/sgml/ref/create_statistics.sgml
create mode 100644 doc/src/sgml/ref/drop_statistics.sgml
create mode 100644 src/backend/commands/statscmds.c
create mode 100644 src/backend/utils/mvstats/Makefile
create mode 100644 src/backend/utils/mvstats/README.dependencies
create mode 100644 src/backend/utils/mvstats/common.c
create mode 100644 src/backend/utils/mvstats/common.h
create mode 100644 src/backend/utils/mvstats/dependencies.c
create mode 100644 src/include/catalog/pg_mv_statistic.h
create mode 100644 src/include/utils/mvstats.h
diff --git a/doc/src/sgml/ref/allfiles.sgml b/doc/src/sgml/ref/allfiles.sgml
index bf95453..c0f7653 100644
--- a/doc/src/sgml/ref/allfiles.sgml
+++ b/doc/src/sgml/ref/allfiles.sgml
@@ -76,6 +76,7 @@ Complete list of usable sgml source files in this directory.
<!ENTITY createSchema SYSTEM "create_schema.sgml">
<!ENTITY createSequence SYSTEM "create_sequence.sgml">
<!ENTITY createServer SYSTEM "create_server.sgml">
+<!ENTITY createStatistics SYSTEM "create_statistics.sgml">
<!ENTITY createTable SYSTEM "create_table.sgml">
<!ENTITY createTableAs SYSTEM "create_table_as.sgml">
<!ENTITY createTableSpace SYSTEM "create_tablespace.sgml">
@@ -119,6 +120,7 @@ Complete list of usable sgml source files in this directory.
<!ENTITY dropSchema SYSTEM "drop_schema.sgml">
<!ENTITY dropSequence SYSTEM "drop_sequence.sgml">
<!ENTITY dropServer SYSTEM "drop_server.sgml">
+<!ENTITY dropStatistics SYSTEM "drop_statistics.sgml">
<!ENTITY dropTable SYSTEM "drop_table.sgml">
<!ENTITY dropTableSpace SYSTEM "drop_tablespace.sgml">
<!ENTITY dropTransform SYSTEM "drop_transform.sgml">
diff --git a/doc/src/sgml/ref/create_statistics.sgml b/doc/src/sgml/ref/create_statistics.sgml
new file mode 100644
index 0000000..a86eae3
--- /dev/null
+++ b/doc/src/sgml/ref/create_statistics.sgml
@@ -0,0 +1,174 @@
+<!--
+doc/src/sgml/ref/create_statistics.sgml
+PostgreSQL documentation
+-->
+
+<refentry id="SQL-CREATESTATISTICS">
+ <indexterm zone="sql-createstatistics">
+ <primary>CREATE STATISTICS</primary>
+ </indexterm>
+
+ <refmeta>
+ <refentrytitle>CREATE STATISTICS</refentrytitle>
+ <manvolnum>7</manvolnum>
+ <refmiscinfo>SQL - Language Statements</refmiscinfo>
+ </refmeta>
+
+ <refnamediv>
+ <refname>CREATE STATISTICS</refname>
+ <refpurpose>define a new statistics</refpurpose>
+ </refnamediv>
+
+ <refsynopsisdiv>
+<synopsis>
+CREATE STATISTICS [ IF NOT EXISTS ] <replaceable class="PARAMETER">statistics_name</replaceable> ON <replaceable class="PARAMETER">table_name</replaceable>
+ ( <replaceable class="PARAMETER">column_name</replaceable> [, ...] )
+[ WITH ( <replaceable class="PARAMETER">statistics_parameter</replaceable> [= <replaceable class="PARAMETER">value</replaceable>] [, ... ] ) ]
+</synopsis>
+
+ </refsynopsisdiv>
+
+ <refsect1 id="SQL-CREATESTATISTICS-description">
+ <title>Description</title>
+
+ <para>
+ <command>CREATE STATISTICS</command> will create a new multivariate
+ statistics on the table. The statistics will be created in the
+ current database. The statistics will be owned by the user issuing
+ the command.
+ </para>
+
+ <para>
+ If a schema name is given (for example, <literal>CREATE STATISTICS
+ myschema.mystat ...</>) then the statistics is created in the specified
+ schema. Otherwise it is created in the current schema. The name of
+ the statistics must be distinct from the name of any other statistics in the
+ same schema.
+ </para>
+
+ <para>
+ To be able to create statistics, you must have <literal>USAGE</literal>
+ privilege on the types of all the columns included in the
+ statistics.
+ </para>
+ </refsect1>
+
+ <refsect1>
+ <title>Parameters</title>
+
+ <variablelist>
+
+ <varlistentry>
+ <term><literal>IF NOT EXISTS</></term>
+ <listitem>
+ <para>
+ Do not throw an error if a statistics with the same name already exists.
+ A notice is issued in this case. Note that there is no guarantee that
+ the existing statistics is anything like the one that would have been
+ created.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><replaceable class="PARAMETER">statistics_name</replaceable></term>
+ <listitem>
+ <para>
+ The name (optionally schema-qualified) of the statistics to be created.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><replaceable class="PARAMETER">table_name</replaceable></term>
+ <listitem>
+ <para>
+ The name (optionally schema-qualified) of the table the statistics should
+ be created on.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><replaceable class="PARAMETER">column_name</replaceable></term>
+ <listitem>
+ <para>
+ The name of a column to be included in the statistics.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><literal>WITH ( <replaceable class="PARAMETER">statistics_parameter</replaceable> [= <replaceable class="PARAMETER">value</replaceable>] [, ... ] )</literal></term>
+ <listitem>
+ <para>
+ ...
+ </para>
+ </listitem>
+ </varlistentry>
+
+ </variablelist>
+
+ <refsect2 id="SQL-CREATESTATISTICS-parameters">
+ <title id="SQL-CREATESTATISTICS-parameters-title">Statistics Parameters</title>
+
+ <indexterm zone="sql-createstatistics-parameters">
+ <primary>statistics parameters</primary>
+ </indexterm>
+
+ <para>
+ The <literal>WITH</> clause can specify <firstterm>statistics parameters</>
+ for statistics. The currently available parameters are listed below.
+ </para>
+
+ <variablelist>
+
+ <varlistentry>
+ <term><literal>dependencies</> (<type>boolean</>)</term>
+ <listitem>
+ <para>
+ Enables functional dependencies for the statistics.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ </variablelist>
+
+ </refsect2>
+ </refsect1>
+
+ <refsect1 id="SQL-CREATESTATISTICS-notes">
+ <title>Notes</title>
+
+ <para>
+ ...
+ </para>
+
+ </refsect1>
+
+
+ <refsect1 id="SQL-CREATESTATISTICS-examples">
+ <title>Examples</title>
+
+ <para>
+ ...
+ </para>
+
+ </refsect1>
+
+ <refsect1>
+ <title>Compatibility</title>
+
+ <para>
+ There's no <command>CREATE STATISTICS</command> command in the SQL standard.
+ </para>
+ </refsect1>
+
+ <refsect1>
+ <title>See Also</title>
+
+ <simplelist type="inline">
+ <member><xref linkend="sql-dropstatistics"></member>
+ </simplelist>
+ </refsect1>
+</refentry>
diff --git a/doc/src/sgml/ref/drop_statistics.sgml b/doc/src/sgml/ref/drop_statistics.sgml
new file mode 100644
index 0000000..4cc0b70
--- /dev/null
+++ b/doc/src/sgml/ref/drop_statistics.sgml
@@ -0,0 +1,90 @@
+<!--
+doc/src/sgml/ref/drop_statistics.sgml
+PostgreSQL documentation
+-->
+
+<refentry id="SQL-DROPSTATISTICS">
+ <indexterm zone="sql-dropstatistics">
+ <primary>DROP STATISTICS</primary>
+ </indexterm>
+
+ <refmeta>
+ <refentrytitle>DROP STATISTICS</refentrytitle>
+ <manvolnum>7</manvolnum>
+ <refmiscinfo>SQL - Language Statements</refmiscinfo>
+ </refmeta>
+
+ <refnamediv>
+ <refname>DROP STATISTICS</refname>
+ <refpurpose>remove a statistics</refpurpose>
+ </refnamediv>
+
+ <refsynopsisdiv>
+<synopsis>
+DROP STATISTICS [ IF EXISTS ] <replaceable class="PARAMETER">name</replaceable> [, ...]
+</synopsis>
+ </refsynopsisdiv>
+
+ <refsect1>
+ <title>Description</title>
+
+ <para>
+ <command>DROP STATISTICS</command> removes statistics from the database.
+ Only the statistics owner, the schema owner, and a superuser can drop a
+ statistics.
+ </para>
+
+ </refsect1>
+
+ <refsect1>
+ <title>Parameters</title>
+
+ <variablelist>
+ <varlistentry>
+ <term><literal>IF EXISTS</literal></term>
+ <listitem>
+ <para>
+ Do not throw an error if the statistics does not exist. A notice is
+ issued in this case.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><replaceable class="PARAMETER">name</replaceable></term>
+ <listitem>
+ <para>
+ The name (optionally schema-qualified) of the statistics to drop.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ </variablelist>
+ </refsect1>
+
+ <refsect1>
+ <title>Examples</title>
+
+ <para>
+ ...
+ </para>
+
+ </refsect1>
+
+ <refsect1>
+ <title>Compatibility</title>
+
+ <para>
+ There's no <command>DROP STATISTICS</command> command in the SQL standard.
+ </para>
+ </refsect1>
+
+ <refsect1>
+ <title>See Also</title>
+
+ <simplelist type="inline">
+ <member><xref linkend="sql-createstatistics"></member>
+ </simplelist>
+ </refsect1>
+
+</refentry>
diff --git a/doc/src/sgml/reference.sgml b/doc/src/sgml/reference.sgml
index 03020df..2b07b2d 100644
--- a/doc/src/sgml/reference.sgml
+++ b/doc/src/sgml/reference.sgml
@@ -104,6 +104,7 @@
&createSchema;
&createSequence;
&createServer;
+ &createStatistics;
&createTable;
&createTableAs;
&createTableSpace;
@@ -147,6 +148,7 @@
&dropSchema;
&dropSequence;
&dropServer;
+ &dropStatistics;
&dropTable;
&dropTableSpace;
&dropTSConfig;
diff --git a/src/backend/catalog/Makefile b/src/backend/catalog/Makefile
index 25130ec..058b8a9 100644
--- a/src/backend/catalog/Makefile
+++ b/src/backend/catalog/Makefile
@@ -32,6 +32,7 @@ POSTGRES_BKI_SRCS = $(addprefix $(top_srcdir)/src/include/catalog/,\
pg_attrdef.h pg_constraint.h pg_inherits.h pg_index.h pg_operator.h \
pg_opfamily.h pg_opclass.h pg_am.h pg_amop.h pg_amproc.h \
pg_language.h pg_largeobject_metadata.h pg_largeobject.h pg_aggregate.h \
+ pg_mv_statistic.h \
pg_statistic.h pg_rewrite.h pg_trigger.h pg_event_trigger.h pg_description.h \
pg_cast.h pg_enum.h pg_namespace.h pg_conversion.h pg_depend.h \
pg_database.h pg_db_role_setting.h pg_tablespace.h pg_pltemplate.h \
diff --git a/src/backend/catalog/dependency.c b/src/backend/catalog/dependency.c
index c48e37b..8200454 100644
--- a/src/backend/catalog/dependency.c
+++ b/src/backend/catalog/dependency.c
@@ -40,6 +40,7 @@
#include "catalog/pg_foreign_server.h"
#include "catalog/pg_language.h"
#include "catalog/pg_largeobject.h"
+#include "catalog/pg_mv_statistic.h"
#include "catalog/pg_namespace.h"
#include "catalog/pg_opclass.h"
#include "catalog/pg_operator.h"
@@ -160,7 +161,8 @@ static const Oid object_classes[] = {
ExtensionRelationId, /* OCLASS_EXTENSION */
EventTriggerRelationId, /* OCLASS_EVENT_TRIGGER */
PolicyRelationId, /* OCLASS_POLICY */
- TransformRelationId /* OCLASS_TRANSFORM */
+ TransformRelationId, /* OCLASS_TRANSFORM */
+ MvStatisticRelationId /* OCLASS_STATISTICS */
};
@@ -1272,6 +1274,10 @@ doDeletion(const ObjectAddress *object, int flags)
DropTransformById(object->objectId);
break;
+ case OCLASS_STATISTICS:
+ RemoveStatisticsById(object->objectId);
+ break;
+
default:
elog(ERROR, "unrecognized object class: %u",
object->classId);
@@ -2415,6 +2421,9 @@ getObjectClass(const ObjectAddress *object)
case TransformRelationId:
return OCLASS_TRANSFORM;
+
+ case MvStatisticRelationId:
+ return OCLASS_STATISTICS;
}
/* shouldn't get here */
diff --git a/src/backend/catalog/heap.c b/src/backend/catalog/heap.c
index 6a4a9d9..e7d9aaa 100644
--- a/src/backend/catalog/heap.c
+++ b/src/backend/catalog/heap.c
@@ -47,6 +47,7 @@
#include "catalog/pg_constraint_fn.h"
#include "catalog/pg_foreign_table.h"
#include "catalog/pg_inherits.h"
+#include "catalog/pg_mv_statistic.h"
#include "catalog/pg_namespace.h"
#include "catalog/pg_statistic.h"
#include "catalog/pg_tablespace.h"
@@ -1613,7 +1614,10 @@ RemoveAttributeById(Oid relid, AttrNumber attnum)
heap_close(attr_rel, RowExclusiveLock);
if (attnum > 0)
+ {
RemoveStatistics(relid, attnum);
+ RemoveMVStatistics(relid, attnum);
+ }
relation_close(rel, NoLock);
}
@@ -1841,6 +1845,11 @@ heap_drop_with_catalog(Oid relid)
RemoveStatistics(relid, 0);
/*
+ * delete multi-variate statistics
+ */
+ RemoveMVStatistics(relid, 0);
+
+ /*
* delete attribute tuples
*/
DeleteAttributeTuples(relid);
@@ -2696,6 +2705,99 @@ RemoveStatistics(Oid relid, AttrNumber attnum)
/*
+ * RemoveMVStatistics --- remove entries in pg_mv_statistic for a rel
+ *
+ * If attnum is zero, remove all entries for rel; else remove only the one(s)
+ * for that column.
+ */
+void
+RemoveMVStatistics(Oid relid, AttrNumber attnum)
+{
+ Relation pgmvstatistic;
+ TupleDesc tupdesc = NULL;
+ SysScanDesc scan;
+ ScanKeyData key;
+ HeapTuple tuple;
+
+ /*
+ * When dropping a column, we'll drop statistics with a single
+ * remaining (undropped column). To do that, we need the tuple
+ * descriptor.
+ *
+ * We already have the relation locked (as we're running ALTER
+ * TABLE ... DROP COLUMN), so we'll just get the descriptor here.
+ */
+ if (attnum != 0)
+ {
+ Relation rel = relation_open(relid, NoLock);
+
+ /* multivariate stats are supported on tables and matviews */
+ if (rel->rd_rel->relkind == RELKIND_RELATION ||
+ rel->rd_rel->relkind == RELKIND_MATVIEW)
+ tupdesc = RelationGetDescr(rel);
+
+ relation_close(rel, NoLock);
+ }
+
+ if (tupdesc == NULL)
+ return;
+
+ pgmvstatistic = heap_open(MvStatisticRelationId, RowExclusiveLock);
+
+ ScanKeyInit(&key,
+ Anum_pg_mv_statistic_starelid,
+ BTEqualStrategyNumber, F_OIDEQ,
+ ObjectIdGetDatum(relid));
+
+ scan = systable_beginscan(pgmvstatistic,
+ MvStatisticRelidIndexId,
+ true, NULL, 1, &key);
+
+ /* we must loop even when attnum != 0, in case of inherited stats */
+ while (HeapTupleIsValid(tuple = systable_getnext(scan)))
+ {
+ bool delete = true;
+
+ if (attnum != 0)
+ {
+ Datum adatum;
+ bool isnull;
+ int i;
+ int ncolumns = 0;
+ ArrayType *arr;
+ int16 *attnums;
+
+ /* get the columns */
+ adatum = SysCacheGetAttr(MVSTATOID, tuple,
+ Anum_pg_mv_statistic_stakeys, &isnull);
+ Assert(!isnull);
+
+ arr = DatumGetArrayTypeP(adatum);
+ attnums = (int16*)ARR_DATA_PTR(arr);
+
+ for (i = 0; i < ARR_DIMS(arr)[0]; i++)
+ {
+ /* count the column unless it has been / is being dropped */
+ if ((! tupdesc->attrs[attnums[i]-1]->attisdropped) &&
+ (attnums[i] != attnum))
+ ncolumns += 1;
+ }
+
+ /* delete if there are less than two attributes */
+ delete = (ncolumns < 2);
+ }
+
+ if (delete)
+ simple_heap_delete(pgmvstatistic, &tuple->t_self);
+ }
+
+ systable_endscan(scan);
+
+ heap_close(pgmvstatistic, RowExclusiveLock);
+}
+
+
+/*
* RelationTruncateIndexes - truncate all indexes associated
* with the heap relation to zero tuples.
*
diff --git a/src/backend/catalog/namespace.c b/src/backend/catalog/namespace.c
index 446b2ac..dfd5bef 100644
--- a/src/backend/catalog/namespace.c
+++ b/src/backend/catalog/namespace.c
@@ -4201,3 +4201,54 @@ pg_is_other_temp_schema(PG_FUNCTION_ARGS)
PG_RETURN_BOOL(isOtherTempNamespace(oid));
}
+
+Oid
+get_statistics_oid(List *names, bool missing_ok)
+{
+ char *schemaname;
+ char *stats_name;
+ Oid namespaceId;
+ Oid stats_oid = InvalidOid;
+ ListCell *l;
+
+ /* deconstruct the name list */
+ DeconstructQualifiedName(names, &schemaname, &stats_name);
+
+ if (schemaname)
+ {
+ /* use exact schema given */
+ namespaceId = LookupExplicitNamespace(schemaname, missing_ok);
+ if (missing_ok && !OidIsValid(namespaceId))
+ stats_oid = InvalidOid;
+ else
+ stats_oid = GetSysCacheOid2(MVSTATNAMENSP,
+ PointerGetDatum(stats_name),
+ ObjectIdGetDatum(namespaceId));
+ }
+ else
+ {
+ /* search for it in search path */
+ recomputeNamespacePath();
+
+ foreach(l, activeSearchPath)
+ {
+ namespaceId = lfirst_oid(l);
+
+ if (namespaceId == myTempNamespace)
+ continue; /* do not look in temp namespace */
+ stats_oid = GetSysCacheOid2(MVSTATNAMENSP,
+ PointerGetDatum(stats_name),
+ ObjectIdGetDatum(namespaceId));
+ if (OidIsValid(stats_oid))
+ break;
+ }
+ }
+
+ if (!OidIsValid(stats_oid) && !missing_ok)
+ ereport(ERROR,
+ (errcode(ERRCODE_UNDEFINED_OBJECT),
+ errmsg("statistics \"%s\" does not exist",
+ NameListToString(names))));
+
+ return stats_oid;
+}
diff --git a/src/backend/catalog/objectaddress.c b/src/backend/catalog/objectaddress.c
index d2aaa6d..3a6a0b0 100644
--- a/src/backend/catalog/objectaddress.c
+++ b/src/backend/catalog/objectaddress.c
@@ -39,6 +39,7 @@
#include "catalog/pg_language.h"
#include "catalog/pg_largeobject.h"
#include "catalog/pg_largeobject_metadata.h"
+#include "catalog/pg_mv_statistic.h"
#include "catalog/pg_namespace.h"
#include "catalog/pg_opclass.h"
#include "catalog/pg_opfamily.h"
@@ -438,9 +439,22 @@ static const ObjectPropertyType ObjectProperty[] =
Anum_pg_type_typacl,
ACL_KIND_TYPE,
true
+ },
+ {
+ MvStatisticRelationId,
+ MvStatisticOidIndexId,
+ MVSTATOID,
+ MVSTATNAMENSP,
+ Anum_pg_mv_statistic_staname,
+ Anum_pg_mv_statistic_stanamespace,
+ InvalidAttrNumber, /* XXX same owner as relation */
+ InvalidAttrNumber, /* no ACL (same as relation) */
+ -1, /* no ACL */
+ true
}
};
+
/*
* This struct maps the string object types as returned by
* getObjectTypeDescription into ObjType enum values. Note that some enum
@@ -913,6 +927,11 @@ get_object_address(ObjectType objtype, List *objname, List *objargs,
address = get_object_address_defacl(objname, objargs,
missing_ok);
break;
+ case OBJECT_STATISTICS:
+ address.classId = MvStatisticRelationId;
+ address.objectId = get_statistics_oid(objname, missing_ok);
+ address.objectSubId = 0;
+ break;
default:
elog(ERROR, "unrecognized objtype: %d", (int) objtype);
/* placate compiler, in case it thinks elog might return */
@@ -2185,6 +2204,9 @@ check_object_ownership(Oid roleid, ObjectType objtype, ObjectAddress address,
(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
errmsg("must be superuser")));
break;
+ case OBJECT_STATISTICS:
+ /* FIXME do the right owner checks here */
+ break;
default:
elog(ERROR, "unrecognized object type: %d",
(int) objtype);
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index abf9a70..b8a264e 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -158,6 +158,17 @@ CREATE VIEW pg_indexes AS
LEFT JOIN pg_tablespace T ON (T.oid = I.reltablespace)
WHERE C.relkind IN ('r', 'm') AND I.relkind = 'i';
+CREATE VIEW pg_mv_stats AS
+ SELECT
+ N.nspname AS schemaname,
+ C.relname AS tablename,
+ S.staname AS staname,
+ S.stakeys AS attnums,
+ length(S.stadeps) as depsbytes,
+ pg_mv_stats_dependencies_info(S.stadeps) as depsinfo
+ FROM (pg_mv_statistic S JOIN pg_class C ON (C.oid = S.starelid))
+ LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace);
+
CREATE VIEW pg_stats WITH (security_barrier) AS
SELECT
nspname AS schemaname,
diff --git a/src/backend/commands/Makefile b/src/backend/commands/Makefile
index b1ac704..5151001 100644
--- a/src/backend/commands/Makefile
+++ b/src/backend/commands/Makefile
@@ -18,8 +18,8 @@ OBJS = aggregatecmds.o alter.o analyze.o async.o cluster.o comment.o \
event_trigger.o explain.o extension.o foreigncmds.o functioncmds.o \
indexcmds.o lockcmds.o matview.o operatorcmds.o opclasscmds.o \
policy.o portalcmds.o prepare.o proclang.o \
- schemacmds.o seclabel.o sequence.o tablecmds.o tablespace.o trigger.o \
- tsearchcmds.o typecmds.o user.o vacuum.o vacuumlazy.o \
- variable.o view.o
+ schemacmds.o seclabel.o sequence.o statscmds.o \
+ tablecmds.o tablespace.o trigger.o tsearchcmds.o typecmds.o \
+ user.o vacuum.o vacuumlazy.o variable.o view.o
include $(top_srcdir)/src/backend/common.mk
diff --git a/src/backend/commands/analyze.c b/src/backend/commands/analyze.c
index 070df29..cbaa4e1 100644
--- a/src/backend/commands/analyze.c
+++ b/src/backend/commands/analyze.c
@@ -27,6 +27,7 @@
#include "catalog/indexing.h"
#include "catalog/pg_collation.h"
#include "catalog/pg_inherits_fn.h"
+#include "catalog/pg_mv_statistic.h"
#include "catalog/pg_namespace.h"
#include "commands/dbcommands.h"
#include "commands/tablecmds.h"
@@ -55,7 +56,11 @@
#include "utils/syscache.h"
#include "utils/timestamp.h"
#include "utils/tqual.h"
+#include "utils/fmgroids.h"
+#include "utils/builtins.h"
+#include "utils/mvstats.h"
+#include "access/sysattr.h"
/* Per-index data for ANALYZE */
typedef struct AnlIndexData
@@ -460,6 +465,19 @@ do_analyze_rel(Relation onerel, int options, VacuumParams *params,
* all analyzable columns. We use a lower bound of 100 rows to avoid
* possible overflow in Vitter's algorithm. (Note: that will also be the
* target in the corner case where there are no analyzable columns.)
+ *
+ * FIXME This sample sizing is mostly OK when computing stats for
+ * individual columns, but when computing multi-variate stats
+ * for multivariate stats (histograms, mcv, ...) it's rather
+ * insufficient. For stats on multiple columns / complex stats
+ * we need larger sample sizes, because we need to build more
+ * detailed stats (more MCV items / histogram buckets) to get
+ * good accuracy. Maybe a sample proportional to the table
+ * (say, 0.5% - 1%) instead of a fixed size might be more
+ * appropriate. Also, this should be
+ * bound to the requested statistics size - e.g. number of MCV
+ * items or histogram buckets should require several sample
+ * rows per item/bucket (so the sample should be k*size).
*/
targrows = 100;
for (i = 0; i < attr_cnt; i++)
@@ -562,6 +580,9 @@ do_analyze_rel(Relation onerel, int options, VacuumParams *params,
update_attstats(RelationGetRelid(Irel[ind]), false,
thisdata->attr_cnt, thisdata->vacattrstats);
}
+
+ /* Build multivariate stats (if there are any). */
+ build_mv_stats(onerel, numrows, rows, attr_cnt, vacattrstats);
}
/*
diff --git a/src/backend/commands/dropcmds.c b/src/backend/commands/dropcmds.c
index 522027a..cd65b58 100644
--- a/src/backend/commands/dropcmds.c
+++ b/src/backend/commands/dropcmds.c
@@ -292,6 +292,10 @@ does_not_exist_skipping(ObjectType objtype, List *objname, List *objargs)
msg = gettext_noop("schema \"%s\" does not exist, skipping");
name = NameListToString(objname);
break;
+ case OBJECT_STATISTICS:
+ msg = gettext_noop("statistics \"%s\" does not exist, skipping");
+ name = NameListToString(objname);
+ break;
case OBJECT_TSPARSER:
if (!schema_does_not_exist_skipping(objname, &msg, &name))
{
diff --git a/src/backend/commands/event_trigger.c b/src/backend/commands/event_trigger.c
index 9e32f8d..09061bb 100644
--- a/src/backend/commands/event_trigger.c
+++ b/src/backend/commands/event_trigger.c
@@ -110,6 +110,7 @@ static event_trigger_support_data event_trigger_support[] = {
{"SCHEMA", true},
{"SEQUENCE", true},
{"SERVER", true},
+ {"STATISTICS", true},
{"TABLE", true},
{"TABLESPACE", false},
{"TRANSFORM", true},
@@ -1106,6 +1107,7 @@ EventTriggerSupportsObjectType(ObjectType obtype)
case OBJECT_RULE:
case OBJECT_SCHEMA:
case OBJECT_SEQUENCE:
+ case OBJECT_STATISTICS:
case OBJECT_TABCONSTRAINT:
case OBJECT_TABLE:
case OBJECT_TRANSFORM:
@@ -1167,6 +1169,7 @@ EventTriggerSupportsObjectClass(ObjectClass objclass)
case OCLASS_DEFACL:
case OCLASS_EXTENSION:
case OCLASS_POLICY:
+ case OCLASS_STATISTICS:
return true;
}
diff --git a/src/backend/commands/statscmds.c b/src/backend/commands/statscmds.c
new file mode 100644
index 0000000..84a8b13
--- /dev/null
+++ b/src/backend/commands/statscmds.c
@@ -0,0 +1,331 @@
+/*-------------------------------------------------------------------------
+ *
+ * statscmds.c
+ * Commands for creating and altering multivariate statistics
+ *
+ * Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ * src/backend/commands/statscmds.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/genam.h"
+#include "access/heapam.h"
+#include "access/multixact.h"
+#include "access/reloptions.h"
+#include "access/relscan.h"
+#include "access/sysattr.h"
+#include "access/xact.h"
+#include "access/xlog.h"
+#include "catalog/catalog.h"
+#include "catalog/dependency.h"
+#include "catalog/heap.h"
+#include "catalog/index.h"
+#include "catalog/indexing.h"
+#include "catalog/namespace.h"
+#include "catalog/objectaccess.h"
+#include "catalog/pg_collation.h"
+#include "catalog/pg_constraint.h"
+#include "catalog/pg_depend.h"
+#include "catalog/pg_foreign_table.h"
+#include "catalog/pg_inherits.h"
+#include "catalog/pg_inherits_fn.h"
+#include "catalog/pg_mv_statistic.h"
+#include "catalog/pg_namespace.h"
+#include "catalog/pg_opclass.h"
+#include "catalog/pg_tablespace.h"
+#include "catalog/pg_trigger.h"
+#include "catalog/pg_type.h"
+#include "catalog/pg_type_fn.h"
+#include "catalog/storage.h"
+#include "catalog/toasting.h"
+#include "commands/cluster.h"
+#include "commands/comment.h"
+#include "commands/defrem.h"
+#include "commands/event_trigger.h"
+#include "commands/policy.h"
+#include "commands/sequence.h"
+#include "commands/tablecmds.h"
+#include "commands/tablespace.h"
+#include "commands/trigger.h"
+#include "commands/typecmds.h"
+#include "commands/user.h"
+#include "executor/executor.h"
+#include "foreign/foreign.h"
+#include "miscadmin.h"
+#include "nodes/makefuncs.h"
+#include "nodes/nodeFuncs.h"
+#include "nodes/parsenodes.h"
+#include "optimizer/clauses.h"
+#include "optimizer/planner.h"
+#include "parser/parse_clause.h"
+#include "parser/parse_coerce.h"
+#include "parser/parse_collate.h"
+#include "parser/parse_expr.h"
+#include "parser/parse_oper.h"
+#include "parser/parse_relation.h"
+#include "parser/parse_type.h"
+#include "parser/parse_utilcmd.h"
+#include "parser/parser.h"
+#include "pgstat.h"
+#include "rewrite/rewriteDefine.h"
+#include "rewrite/rewriteHandler.h"
+#include "rewrite/rewriteManip.h"
+#include "storage/bufmgr.h"
+#include "storage/lmgr.h"
+#include "storage/lock.h"
+#include "storage/predicate.h"
+#include "storage/smgr.h"
+#include "utils/acl.h"
+#include "utils/builtins.h"
+#include "utils/fmgroids.h"
+#include "utils/inval.h"
+#include "utils/lsyscache.h"
+#include "utils/memutils.h"
+#include "utils/relcache.h"
+#include "utils/ruleutils.h"
+#include "utils/snapmgr.h"
+#include "utils/syscache.h"
+#include "utils/tqual.h"
+#include "utils/typcache.h"
+#include "utils/mvstats.h"
+
+
+/* qsort comparator for sorting the attnums in CreateStatistics */
+static int compare_int16(const void *a, const void *b)
+{
+ int av = *(const int16 *) a;
+ int bv = *(const int16 *) b;
+
+ /* memcmp would depend on byte order; compare numerically instead */
+ return (av == bv) ? 0 : ((av < bv) ? -1 : 1);
+}
+
+/*
+ * Implements the CREATE STATISTICS name ON table (columns) WITH (options)
+ *
+ * TODO Check that the types support sort, although maybe we can live
+ * without it (and only build MCV list / association rules).
+ *
+ * TODO This should probably check for duplicate stats (i.e. same
+ * keys, same options). Although maybe it's useful to have
+ * multiple stats on the same columns with different options
+ * (say, a detailed MCV-only stats for some queries, histogram
+ * for others, etc.)
+ */
+ObjectAddress
+CreateStatistics(CreateStatsStmt *stmt)
+{
+ int i, j;
+ ListCell *l;
+ int16 attnums[INDEX_MAX_KEYS];
+ int numcols = 0;
+ ObjectAddress address = InvalidObjectAddress;
+ char *namestr;
+ NameData staname;
+ Oid statoid;
+ Oid namespaceId;
+
+ HeapTuple htup;
+ Datum values[Natts_pg_mv_statistic];
+ bool nulls[Natts_pg_mv_statistic];
+ int2vector *stakeys;
+ Relation mvstatrel;
+ Relation rel;
+ ObjectAddress parentobject, childobject;
+
+ /* by default build nothing */
+ bool build_dependencies = false;
+
+ Assert(IsA(stmt, CreateStatsStmt));
+
+ /* resolve the pieces of the name (namespace etc.) */
+ namespaceId = QualifiedNameGetCreationNamespace(stmt->defnames, &namestr);
+ namestrcpy(&staname, namestr);
+
+ /*
+ * If if_not_exists was given and the statistics already exists, bail out.
+ */
+ if (stmt->if_not_exists &&
+ SearchSysCacheExists2(MVSTATNAMENSP,
+ PointerGetDatum(&staname),
+ ObjectIdGetDatum(namespaceId)))
+ {
+ ereport(NOTICE,
+ (errcode(ERRCODE_DUPLICATE_OBJECT),
+ errmsg("statistics \"%s\" already exists, skipping",
+ namestr)));
+ return InvalidObjectAddress;
+ }
+
+ rel = heap_openrv(stmt->relation, AccessExclusiveLock);
+
+ /* transform the column names to attnum values */
+
+ foreach(l, stmt->keys)
+ {
+ char *attname = strVal(lfirst(l));
+ HeapTuple atttuple;
+
+ atttuple = SearchSysCacheAttName(RelationGetRelid(rel), attname);
+
+ if (!HeapTupleIsValid(atttuple))
+ ereport(ERROR,
+ (errcode(ERRCODE_UNDEFINED_COLUMN),
+ errmsg("column \"%s\" referenced in statistics does not exist",
+ attname)));
+
+ /* more than MVSTATS_MAX_DIMENSIONS columns not allowed */
+ if (numcols >= MVSTATS_MAX_DIMENSIONS)
+ ereport(ERROR,
+ (errcode(ERRCODE_TOO_MANY_COLUMNS),
+ errmsg("cannot have more than %d keys in a statistics",
+ MVSTATS_MAX_DIMENSIONS)));
+
+ attnums[numcols] = ((Form_pg_attribute) GETSTRUCT(atttuple))->attnum;
+ ReleaseSysCache(atttuple);
+ numcols++;
+ }
+
+ /*
+ * Check the lower bound (at least 2 columns), the upper bound was
+ * already checked in the loop.
+ */
+ if (numcols < 2)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
+ errmsg("multivariate statistics require at least 2 columns")));
+
+ /* look for duplicate columns */
+ for (i = 0; i < numcols; i++)
+ for (j = 0; j < numcols; j++)
+ if ((i != j) && (attnums[i] == attnums[j]))
+ ereport(ERROR,
+ (errcode(ERRCODE_DUPLICATE_COLUMN),
+ errmsg("duplicate column name in statistics definition")));
+
+ /* parse the statistics options */
+ foreach (l, stmt->options)
+ {
+ DefElem *opt = (DefElem*)lfirst(l);
+
+ if (strcmp(opt->defname, "dependencies") == 0)
+ build_dependencies = defGetBoolean(opt);
+ else
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("unrecognized STATISTICS option \"%s\"",
+ opt->defname)));
+ }
+
+ /* check that at least some statistics were requested */
+ if (! build_dependencies)
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("no statistics type (dependencies) was requested")));
+
+ /* sort the attnums and build int2vector */
+ qsort(attnums, numcols, sizeof(int16), compare_int16);
+ stakeys = buildint2vector(attnums, numcols);
+
+ /*
+ * Okay, let's create the pg_mv_statistic entry.
+ */
+ memset(values, 0, sizeof(values));
+ memset(nulls, false, sizeof(nulls));
+
+ /* no stats collected yet, so just the keys */
+ values[Anum_pg_mv_statistic_starelid-1] = ObjectIdGetDatum(RelationGetRelid(rel));
+ values[Anum_pg_mv_statistic_staname -1] = NameGetDatum(&staname);
+ values[Anum_pg_mv_statistic_stanamespace -1] = ObjectIdGetDatum(namespaceId);
+
+ values[Anum_pg_mv_statistic_stakeys -1] = PointerGetDatum(stakeys);
+
+ values[Anum_pg_mv_statistic_deps_enabled -1] = BoolGetDatum(build_dependencies);
+
+ nulls[Anum_pg_mv_statistic_stadeps -1] = true;
+
+ /* insert the tuple into pg_mv_statistic */
+ mvstatrel = heap_open(MvStatisticRelationId, RowExclusiveLock);
+
+ htup = heap_form_tuple(mvstatrel->rd_att, values, nulls);
+
+ simple_heap_insert(mvstatrel, htup);
+
+ CatalogUpdateIndexes(mvstatrel, htup);
+
+ statoid = HeapTupleGetOid(htup);
+
+ heap_freetuple(htup);
+
+
+ /*
+ * Store a dependency too, so that statistics are dropped on DROP TABLE
+ */
+ parentobject.classId = RelationRelationId;
+ parentobject.objectId = RelationGetRelid(rel);
+ parentobject.objectSubId = 0;
+ childobject.classId = MvStatisticRelationId;
+ childobject.objectId = statoid;
+ childobject.objectSubId = 0;
+
+ recordDependencyOn(&childobject, &parentobject, DEPENDENCY_AUTO);
+
+ /*
+ * Also record dependency on the schema (to drop statistics on DROP SCHEMA)
+ */
+ parentobject.classId = NamespaceRelationId;
+ parentobject.objectId = namespaceId;
+ parentobject.objectSubId = 0;
+ childobject.classId = MvStatisticRelationId;
+ childobject.objectId = statoid;
+ childobject.objectSubId = 0;
+
+ recordDependencyOn(&childobject, &parentobject, DEPENDENCY_AUTO);
+
+
+ heap_close(mvstatrel, RowExclusiveLock);
+
+ /*
+ * Invalidate the relcache while we still have the relation open, so
+ * that others see the new statistics.
+ */
+ CacheInvalidateRelcache(rel);
+
+ relation_close(rel, NoLock);
+
+ ObjectAddressSet(address, MvStatisticRelationId, statoid);
+
+ return address;
+}
+
+
+/*
+ * Implements DROP STATISTICS stats_name
+ *
+ * This removes the pg_mv_statistic row with the given OID; it is called
+ * when the statistics are dropped, either directly or through a
+ * dependency (e.g. DROP TABLE).
+ */
+void
+RemoveStatisticsById(Oid statsOid)
+{
+ Relation relation;
+ HeapTuple tup;
+
+ /*
+ * Delete the pg_mv_statistic tuple.
+ */
+ relation = heap_open(MvStatisticRelationId, RowExclusiveLock);
+
+ tup = SearchSysCache1(MVSTATOID, ObjectIdGetDatum(statsOid));
+ if (!HeapTupleIsValid(tup)) /* should not happen */
+ elog(ERROR, "cache lookup failed for statistics %u", statsOid);
+
+ simple_heap_delete(relation, &tup->t_self);
+
+ ReleaseSysCache(tup);
+
+ heap_close(relation, RowExclusiveLock);
+}
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 96dc923..96ab02f 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -37,6 +37,7 @@
#include "catalog/pg_foreign_table.h"
#include "catalog/pg_inherits.h"
#include "catalog/pg_inherits_fn.h"
+#include "catalog/pg_mv_statistic.h"
#include "catalog/pg_namespace.h"
#include "catalog/pg_opclass.h"
#include "catalog/pg_tablespace.h"
@@ -95,7 +96,7 @@
#include "utils/syscache.h"
#include "utils/tqual.h"
#include "utils/typcache.h"
-
+#include "utils/mvstats.h"
/*
* ON COMMIT action list
@@ -143,8 +144,9 @@ static List *on_commits = NIL;
#define AT_PASS_ADD_COL 5 /* ADD COLUMN */
#define AT_PASS_ADD_INDEX 6 /* ADD indexes */
#define AT_PASS_ADD_CONSTR 7 /* ADD constraints, defaults */
-#define AT_PASS_MISC 8 /* other stuff */
-#define AT_NUM_PASSES 9
+#define AT_PASS_ADD_STATS 8 /* ADD statistics */
+#define AT_PASS_MISC 9 /* other stuff */
+#define AT_NUM_PASSES 10
typedef struct AlteredTableInfo
{
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index a9e9cc3..1a04024 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -4124,6 +4124,19 @@ _copyAlterPolicyStmt(const AlterPolicyStmt *from)
return newnode;
}
+static CreateStatsStmt *
+_copyCreateStatsStmt(const CreateStatsStmt *from)
+{
+ CreateStatsStmt *newnode = makeNode(CreateStatsStmt);
+
+ COPY_NODE_FIELD(defnames);
+ COPY_NODE_FIELD(relation);
+ COPY_NODE_FIELD(keys);
+ COPY_NODE_FIELD(options);
+
+ return newnode;
+}
+
/* ****************************************************************
* pg_list.h copy functions
* ****************************************************************
@@ -4999,6 +5012,9 @@ copyObject(const void *from)
case T_CommonTableExpr:
retval = _copyCommonTableExpr(from);
break;
+ case T_CreateStatsStmt:
+ retval = _copyCreateStatsStmt(from);
+ break;
case T_FuncWithArgs:
retval = _copyFuncWithArgs(from);
break;
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index 85acce8..474d2c7 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -1968,6 +1968,21 @@ _outIndexOptInfo(StringInfo str, const IndexOptInfo *node)
}
static void
+_outMVStatisticInfo(StringInfo str, const MVStatisticInfo *node)
+{
+ WRITE_NODE_TYPE("MVSTATISTICINFO");
+
+ /* NB: this isn't a complete set of fields */
+ WRITE_OID_FIELD(mvoid);
+
+ /* enabled statistics */
+ WRITE_BOOL_FIELD(deps_enabled);
+
+ /* built/available statistics */
+ WRITE_BOOL_FIELD(deps_built);
+}
+
+static void
_outEquivalenceClass(StringInfo str, const EquivalenceClass *node)
{
/*
@@ -3409,6 +3424,9 @@ _outNode(StringInfo str, const void *obj)
case T_PlannerParamItem:
_outPlannerParamItem(str, obj);
break;
+ case T_MVStatisticInfo:
+ _outMVStatisticInfo(str, obj);
+ break;
case T_ExtensibleNode:
_outExtensibleNode(str, obj);
diff --git a/src/backend/optimizer/util/plancat.c b/src/backend/optimizer/util/plancat.c
index 0ea9fcf..b9de71d 100644
--- a/src/backend/optimizer/util/plancat.c
+++ b/src/backend/optimizer/util/plancat.c
@@ -28,6 +28,7 @@
#include "catalog/dependency.h"
#include "catalog/heap.h"
#include "catalog/pg_am.h"
+#include "catalog/pg_mv_statistic.h"
#include "foreign/fdwapi.h"
#include "miscadmin.h"
#include "nodes/makefuncs.h"
@@ -40,7 +41,9 @@
#include "parser/parsetree.h"
#include "rewrite/rewriteManip.h"
#include "storage/bufmgr.h"
+#include "utils/builtins.h"
#include "utils/lsyscache.h"
+#include "utils/syscache.h"
#include "utils/rel.h"
#include "utils/snapmgr.h"
@@ -94,6 +97,7 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
Relation relation;
bool hasindex;
List *indexinfos = NIL;
+ List *stainfos = NIL;
/*
* We need not lock the relation since it was already locked, either by
@@ -387,6 +391,65 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
rel->indexlist = indexinfos;
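+
+ /* XXX "if (true)" is just a placeholder condition - for now we always
+ * fetch the list of multivariate statistics defined on the relation */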
+ if (true)
+ {
+ List *mvstatoidlist;
+ ListCell *l;
+
+ mvstatoidlist = RelationGetMVStatList(relation);
+
+ foreach(l, mvstatoidlist)
+ {
+ ArrayType *arr;
+ Datum adatum;
+ bool isnull;
+ Oid mvoid = lfirst_oid(l);
+ Form_pg_mv_statistic mvstat;
+ MVStatisticInfo *info;
+
+ HeapTuple htup = SearchSysCache1(MVSTATOID, ObjectIdGetDatum(mvoid));
+
+ /* XXX syscache contains OIDs of deleted stats (not invalidated) */
+ if (! HeapTupleIsValid(htup))
+ continue;
+
+ mvstat = (Form_pg_mv_statistic) GETSTRUCT(htup);
+
+ /* unavailable stats are not interesting for the planner */
+ if (mvstat->deps_built)
+ {
+ info = makeNode(MVStatisticInfo);
+
+ info->mvoid = mvoid;
+ info->rel = rel;
+
+ /* enabled statistics */
+ info->deps_enabled = mvstat->deps_enabled;
+
+ /* built/available statistics */
+ info->deps_built = mvstat->deps_built;
+
+ /* stakeys */
+ adatum = SysCacheGetAttr(MVSTATOID, htup,
+ Anum_pg_mv_statistic_stakeys, &isnull);
+ Assert(!isnull);
+
+ arr = DatumGetArrayTypeP(adatum);
+
+ info->stakeys = buildint2vector((int16 *) ARR_DATA_PTR(arr),
+ ARR_DIMS(arr)[0]);
+
+ stainfos = lcons(info, stainfos);
+ }
+
+ ReleaseSysCache(htup);
+ }
+
+ list_free(mvstatoidlist);
+ }
+
+ rel->mvstatlist = stainfos;
+
/* Grab foreign-table info using the relcache, while we have it */
if (relation->rd_rel->relkind == RELKIND_FOREIGN_TABLE)
{
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index b307b48..3be3f02 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -241,7 +241,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
ConstraintsSetStmt CopyStmt CreateAsStmt CreateCastStmt
CreateDomainStmt CreateExtensionStmt CreateGroupStmt CreateOpClassStmt
CreateOpFamilyStmt AlterOpFamilyStmt CreatePLangStmt
- CreateSchemaStmt CreateSeqStmt CreateStmt CreateTableSpaceStmt
+ CreateSchemaStmt CreateSeqStmt CreateStmt CreateStatsStmt CreateTableSpaceStmt
CreateFdwStmt CreateForeignServerStmt CreateForeignTableStmt
CreateAssertStmt CreateTransformStmt CreateTrigStmt CreateEventTrigStmt
CreateUserStmt CreateUserMappingStmt CreateRoleStmt CreatePolicyStmt
@@ -809,6 +809,7 @@ stmt :
| CreateSchemaStmt
| CreateSeqStmt
| CreateStmt
+ | CreateStatsStmt
| CreateTableSpaceStmt
| CreateTransformStmt
| CreateTrigStmt
@@ -3436,6 +3437,36 @@ OptConsTableSpace: USING INDEX TABLESPACE name { $$ = $4; }
ExistingIndex: USING INDEX index_name { $$ = $3; }
;
+/*****************************************************************************
+ *
+ * QUERY :
+ * CREATE STATISTICS stats_name ON relname (columns) WITH (options)
+ *
+ *****************************************************************************/
+
+
+CreateStatsStmt: CREATE STATISTICS any_name ON qualified_name '(' columnList ')' opt_reloptions
+ {
+ CreateStatsStmt *n = makeNode(CreateStatsStmt);
+ n->defnames = $3;
+ n->relation = $5;
+ n->keys = $7;
+ n->options = $9;
+ n->if_not_exists = false;
+ $$ = (Node *)n;
+ }
+ | CREATE STATISTICS IF_P NOT EXISTS any_name ON qualified_name '(' columnList ')' opt_reloptions
+ {
+ CreateStatsStmt *n = makeNode(CreateStatsStmt);
+ n->defnames = $6;
+ n->relation = $8;
+ n->keys = $10;
+ n->options = $12;
+ n->if_not_exists = true;
+ $$ = (Node *)n;
+ }
+ ;
+
/*****************************************************************************
*
@@ -5621,6 +5652,7 @@ drop_type: TABLE { $$ = OBJECT_TABLE; }
| TEXT_P SEARCH DICTIONARY { $$ = OBJECT_TSDICTIONARY; }
| TEXT_P SEARCH TEMPLATE { $$ = OBJECT_TSTEMPLATE; }
| TEXT_P SEARCH CONFIGURATION { $$ = OBJECT_TSCONFIGURATION; }
+ | STATISTICS { $$ = OBJECT_STATISTICS; }
;
any_name_list:
diff --git a/src/backend/tcop/utility.c b/src/backend/tcop/utility.c
index 045f7f0..2ba88e2 100644
--- a/src/backend/tcop/utility.c
+++ b/src/backend/tcop/utility.c
@@ -1520,6 +1520,10 @@ ProcessUtilitySlow(Node *parsetree,
address = ExecSecLabelStmt((SecLabelStmt *) parsetree);
break;
+ case T_CreateStatsStmt: /* CREATE STATISTICS */
+ address = CreateStatistics((CreateStatsStmt *) parsetree);
+ break;
+
default:
elog(ERROR, "unrecognized node type: %d",
(int) nodeTag(parsetree));
@@ -2160,6 +2164,9 @@ CreateCommandTag(Node *parsetree)
case OBJECT_TRANSFORM:
tag = "DROP TRANSFORM";
break;
+ case OBJECT_STATISTICS:
+ tag = "DROP STATISTICS";
+ break;
default:
tag = "???";
}
@@ -2527,6 +2534,10 @@ CreateCommandTag(Node *parsetree)
tag = "EXECUTE";
break;
+ case T_CreateStatsStmt:
+ tag = "CREATE STATISTICS";
+ break;
+
case T_DeallocateStmt:
{
DeallocateStmt *stmt = (DeallocateStmt *) parsetree;
diff --git a/src/backend/utils/Makefile b/src/backend/utils/Makefile
index 8374533..eba0352 100644
--- a/src/backend/utils/Makefile
+++ b/src/backend/utils/Makefile
@@ -9,7 +9,7 @@ top_builddir = ../../..
include $(top_builddir)/src/Makefile.global
OBJS = fmgrtab.o
-SUBDIRS = adt cache error fmgr hash init mb misc mmgr resowner sort time
+SUBDIRS = adt cache error fmgr hash init mb misc mmgr mvstats resowner sort time
# location of Catalog.pm
catalogdir = $(top_srcdir)/src/backend/catalog
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index 130c06d..3bc4c8a 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -47,6 +47,7 @@
#include "catalog/pg_auth_members.h"
#include "catalog/pg_constraint.h"
#include "catalog/pg_database.h"
+#include "catalog/pg_mv_statistic.h"
#include "catalog/pg_namespace.h"
#include "catalog/pg_opclass.h"
#include "catalog/pg_proc.h"
@@ -3956,6 +3957,62 @@ RelationGetIndexList(Relation relation)
return result;
}
+
+List *
+RelationGetMVStatList(Relation relation)
+{
+ Relation indrel;
+ SysScanDesc indscan;
+ ScanKeyData skey;
+ HeapTuple htup;
+ List *result;
+ List *oldlist;
+ MemoryContext oldcxt;
+
+ /* Quick exit if we already computed the list. */
+ if (relation->rd_mvstatvalid)
+ return list_copy(relation->rd_mvstatlist);
+
+ /*
+ * We build the list we intend to return (in the caller's context) while
+ * doing the scan. After successfully completing the scan, we copy that
+ * list into the relcache entry. This avoids cache-context memory leakage
+ * if we get some sort of error partway through.
+ */
+ result = NIL;
+
+ /* Prepare to scan pg_mv_statistic for entries having starelid = this rel. */
+ ScanKeyInit(&skey,
+ Anum_pg_mv_statistic_starelid,
+ BTEqualStrategyNumber, F_OIDEQ,
+ ObjectIdGetDatum(RelationGetRelid(relation)));
+
+ indrel = heap_open(MvStatisticRelationId, AccessShareLock);
+ indscan = systable_beginscan(indrel, MvStatisticRelidIndexId, true,
+ NULL, 1, &skey);
+
+ while (HeapTupleIsValid(htup = systable_getnext(indscan)))
+ /* TODO maybe include only already built statistics? */
+ result = insert_ordered_oid(result, HeapTupleGetOid(htup));
+
+ systable_endscan(indscan);
+
+ heap_close(indrel, AccessShareLock);
+
+ /* Now save a copy of the completed list in the relcache entry. */
+ oldcxt = MemoryContextSwitchTo(CacheMemoryContext);
+ oldlist = relation->rd_mvstatlist;
+ relation->rd_mvstatlist = list_copy(result);
+
+ relation->rd_mvstatvalid = true;
+ MemoryContextSwitchTo(oldcxt);
+
+ /* Don't leak the old list, if there is one */
+ list_free(oldlist);
+
+ return result;
+}
+
/*
* insert_ordered_oid
* Insert a new Oid into a sorted list of Oids, preserving ordering
@@ -4920,6 +4977,8 @@ load_relcache_init_file(bool shared)
rel->rd_indexattr = NULL;
rel->rd_keyattr = NULL;
rel->rd_idattr = NULL;
+ rel->rd_mvstatvalid = false;
+ rel->rd_mvstatlist = NIL;
rel->rd_createSubid = InvalidSubTransactionId;
rel->rd_newRelfilenodeSubid = InvalidSubTransactionId;
rel->rd_amcache = NULL;
diff --git a/src/backend/utils/cache/syscache.c b/src/backend/utils/cache/syscache.c
index 65ffe84..3c1bc4b 100644
--- a/src/backend/utils/cache/syscache.c
+++ b/src/backend/utils/cache/syscache.c
@@ -44,6 +44,7 @@
#include "catalog/pg_foreign_server.h"
#include "catalog/pg_foreign_table.h"
#include "catalog/pg_language.h"
+#include "catalog/pg_mv_statistic.h"
#include "catalog/pg_namespace.h"
#include "catalog/pg_opclass.h"
#include "catalog/pg_operator.h"
@@ -502,6 +503,28 @@ static const struct cachedesc cacheinfo[] = {
},
4
},
+ {MvStatisticRelationId, /* MVSTATNAMENSP */
+ MvStatisticNameIndexId,
+ 2,
+ {
+ Anum_pg_mv_statistic_staname,
+ Anum_pg_mv_statistic_stanamespace,
+ 0,
+ 0
+ },
+ 4
+ },
+ {MvStatisticRelationId, /* MVSTATOID */
+ MvStatisticOidIndexId,
+ 1,
+ {
+ ObjectIdAttributeNumber,
+ 0,
+ 0,
+ 0
+ },
+ 4
+ },
{NamespaceRelationId, /* NAMESPACENAME */
NamespaceNameIndexId,
1,
diff --git a/src/backend/utils/mvstats/Makefile b/src/backend/utils/mvstats/Makefile
new file mode 100644
index 0000000..099f1ed
--- /dev/null
+++ b/src/backend/utils/mvstats/Makefile
@@ -0,0 +1,17 @@
+#-------------------------------------------------------------------------
+#
+# Makefile--
+# Makefile for utils/mvstats
+#
+# IDENTIFICATION
+# src/backend/utils/mvstats/Makefile
+#
+#-------------------------------------------------------------------------
+
+subdir = src/backend/utils/mvstats
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+
+OBJS = common.o dependencies.o
+
+include $(top_srcdir)/src/backend/common.mk
diff --git a/src/backend/utils/mvstats/README.dependencies b/src/backend/utils/mvstats/README.dependencies
new file mode 100644
index 0000000..1f96fbc
--- /dev/null
+++ b/src/backend/utils/mvstats/README.dependencies
@@ -0,0 +1,222 @@
+Soft functional dependencies
+============================
+
+A type of multivariate statistics used to capture cases when one column (or
+possibly a combination of columns) determines values in another column. We may
+also say that one column implies the other one.
+
+A simple artificial example may be a table with two columns, created like this
+
+ CREATE TABLE t (a INT, b INT)
+ AS SELECT i, i/10 FROM generate_series(1,100000) s(i);
+
+Clearly, once we know the value for column 'a' the value for 'b' is trivially
+determined, as it's simply (a/10). A more practical example may be addresses,
+where (ZIP code -> city name), i.e. once we know the ZIP, we probably know the
+city it belongs to, as ZIP codes are usually assigned to one city. Larger cities
+may have multiple ZIP codes, so the dependency can't be reversed.
+
+Functional dependencies are a concept well described in relational theory,
+particularly in definition of normalization and "normal forms". Wikipedia has a
+nice definition of a functional dependency [1]:
+
+ In a given table, an attribute Y is said to have a functional dependency on
+ a set of attributes X (written X -> Y) if and only if each X value is
+ associated with precisely one Y value. For example, in an "Employee" table
+ that includes the attributes "Employee ID" and "Employee Date of Birth", the
+ functional dependency {Employee ID} -> {Employee Date of Birth} would hold.
+ It follows from the previous two sentences that each {Employee ID} is
+ associated with precisely one {Employee Date of Birth}.
+
+ [1] http://en.wikipedia.org/wiki/Database_normalization
+
+Many datasets might be normalized not to contain such dependencies, but often
+it's not practical for various reasons. In some cases it's actually a conscious
+design choice to model the dataset in denormalized way, either because of
+performance or to make querying easier.
+
+The functional dependencies are called 'soft' because the implementation is
+meant to allow a small number of rows contradicting the dependency. Many actual
+data sets contain some sort of errors, either because of data entry mistakes
+(user mistyping the ZIP code) or issues in generating the data (e.g. a ZIP code
+mistakenly assigned to two cities in different states). A strict implementation
+would ignore dependencies on such noisy data, rendering the approach unusable on
+such data sets.
+
+
+Mining dependencies (ANALYZE)
+-----------------------------
+
+The current build algorithm is rather simple - for each pair (a,b) of columns,
+the data are sorted lexicographically (first by 'a', then by 'b'). Then for each
+group (rows with the same 'a' value) we decide whether the group is neutral,
+supporting or contradicting the dependency (a->b).
+
+A group is considered neutral when it's too small - e.g. when there's a single
+row in the group, there can't possibly be multiple values in 'b'. For this
+reason we ignore groups smaller than a threshold (currently 3 rows).
+
+For sufficiently large groups (3 rows or more), we count the number of distinct
+values in 'b'. When there's a single 'b' value, the group is considered to
+support the dependency (a->b), otherwise it's considered to contradict it.
+
+At the end, we compare the number of rows in supporting and contradicting groups,
+and if there are at least 10x as many supporting rows, we consider the
+functional dependency to be valid.
+
+
+A negative property of this approach is that the algorithm is a bit fragile
+with respect to the sample - there may be data sets producing quite different
+results for each ANALYZE execution (as even a single row may change the
+outcome of the final 10x test).
+
+It was proposed to make the dependencies "fuzzy" - e.g. track some coefficient
+between [0,1] determining how much the dependency holds. That would however mean
+we have to keep all the dependencies, as eliminating them based on the value of
+the coefficient (e.g. throw away dependencies <= 0.5) would result in exactly
+the same fragility issues. This would also make it more complicated to combine
+dependencies. So this does not seem like a practical approach.
+
+A better approach might be to replace the constants (min_group_size=3 and 10x)
+with values somehow related to the particular data set.
+
+
+Clause reduction (planner/optimizer)
+------------------------------------
+
+Applying the functional dependencies is quite simple - given a list of equality
+clauses, check which clauses are redundant (i.e. implied by some other clause).
+For example, given the clause list
+
+ (a = 1) AND (b = 2) AND (c = 3)
+
+and the dependency (a->b), the list of clauses may be simplified to
+
+ (a = 1) AND (c = 3)
+
+Functional dependencies may only be applied to equality clauses, all other types
+of clauses are ignored. See clauselist_apply_dependencies() for more details.
+
+
+Compatibility of clauses
+------------------------
+
+The reduction assumes the clauses really are redundant, i.e. that the value in
+the reduced clause (b=2) is the value determined by (a=1). If that's not the
+case and the values are "incompatible", the result will be an over-estimation.
+
+This may happen for example when using conditions on ZIP and city name with
+mismatching values (ZIP for a different city), etc. In such a case the result
+set will be empty, but we'll estimate the selectivity using the ZIP condition.
+
+In this case the default estimation, based on the attribute value independence
+assumption, happens to work better, but mostly by chance.
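+
+For example (with made-up values), a query like
+
+ SELECT * FROM addresses WHERE zip = '12345' AND city = 'Wrong City';
+
+returns no rows when the ZIP actually belongs to a different city, yet after
+the reduction it gets estimated as if only the ZIP condition was present.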
+
+
+Dependencies vs. MCV/histogram
+------------------------------
+
+In some cases the "compatibility" of the conditions might be verified using the
+other types of multivariate stats - MCV lists and histograms.
+
+For MCV lists the verification might be very simple - peek into the list to see
+whether there are any items matching the clause on the 'a' column (e.g. ZIP
+code), and if such an item is found, check that the 'b' column matches the
+other clause. If it does not, the clauses are contradictory. If no such item is
+found, we can't really conclude anything, except maybe restricting the
+selectivity using the MCV data (e.g. using min/max selectivity, or something).
+
+With histograms, it might work similarly - we can't check the values directly
+(because histograms use buckets, unlike MCV lists, which store the actual values).
+So we can only observe the buckets matching the clauses - if those buckets have
+very low frequency, it probably means the two clauses are incompatible.
+
+It's unclear what 'low frequency' is, but if one of the clauses is implied
+(automatically true because of the other clause), then
+
+ selectivity[clause(A)] = selectivity[clause(A) & clause(B)]
+
+So we might compute selectivity of the first clause - for example using regular
+statistics. And then check if the selectivity computed from the histogram is
+about the same (or significantly lower).
+
+The problem is that histograms work well only when the data ordering matches the
+natural meaning. For values that serve as labels - like city names or ZIP codes,
+or even generated IDs, histograms really don't work all that well. For example
+sorting cities by name won't match the sorting of ZIP codes, rendering the
+histogram unusable.
+
+So MCVs are probably going to work much better, because they don't really assume
+any sort of ordering. And it's probably more appropriate for the label-like data.
+
+A good question however is why even use functional dependencies in such cases
+and not simply use the MCV/histogram instead. One reason is that the functional
+dependencies allow fallback to regular stats, and often produce more accurate
+estimates - especially compared to histograms, which are quite bad at
+estimating equality clauses.
+
+
+Limitations
+-----------
+
+Let's look at the main limitations of functional dependencies, especially those
+related to the current implementation.
+
+The current implementation supports only dependencies between two columns, but
+this is merely a simplification of the initial implementation. It's certainly
+useful to mine for dependencies involving multiple columns on the 'left' side,
+i.e. the condition side of the dependency - that is, dependencies like (a,b -> c).
+
+The implementation may/should be smart enough not to mine redundant conditions,
+e.g. (a->b) and (a,c -> b), because the latter is a trivial consequence of the
+former one (if values of 'a' determine 'b', adding another column won't change
+that relationship). The ANALYZE should first analyze 1:1 dependencies, then 2:1
+dependencies (and skip the already identified ones), etc.
+
+For example the dependency
+
+ (city name -> zip code)
+
+is much stronger, i.e. whenever it holds, then
+
+ (city name, state name -> zip code)
+
+holds too. But in case there are cities with the same name in different states,
+then only the latter dependency will be valid.
+
+Of course, there probably are cities with the same name within a single state,
+but hopefully this is a relatively rare occurrence (and thus we'll still detect
+the 'soft' dependency).
+
+Handling multiple columns on the right side of the dependency is not necessary,
+as those dependencies may be simply decomposed into a set of dependencies with
+the same meaning, one for each column on the right side. For example
+
+ (a -> b,c)
+
+is exactly the same as
+
+ (a -> b) & (a -> c)
+
+Of course, storing the first form may be more efficient than storing multiple
+'simple' dependencies separately.
+
+
+TODO Support dependencies with multiple columns on left/right.
+
+TODO Investigate using histogram and MCV list to verify the dependencies.
+
+TODO Investigate statistical testing of the distribution (to decide whether it
+ makes sense to build the histogram/MCV list).
+
+TODO Using a min/max of selectivities would probably make more sense for the
+ associated columns.
+
+TODO Consider eliminating the implied columns from the histogram and MCV lists
+ (but maybe that's not a good idea, because that'd make it impossible to use
+ these stats for non-equality clauses and also it wouldn't be possible to
+ use the stats for verification of the dependencies).
+
+TODO The reduction could probably be extended to also handle IS NULL clauses,
+ assuming we fix the ANALYZE to properly handle NULL values. We however
+ won't be able to reduce IS NOT NULL (unless I'm missing something).
diff --git a/src/backend/utils/mvstats/common.c b/src/backend/utils/mvstats/common.c
new file mode 100644
index 0000000..a755c49
--- /dev/null
+++ b/src/backend/utils/mvstats/common.c
@@ -0,0 +1,356 @@
+/*-------------------------------------------------------------------------
+ *
+ * common.c
+ * POSTGRES multivariate statistics
+ *
+ *
+ * Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ * src/backend/utils/mvstats/common.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "common.h"
+
+static VacAttrStats ** lookup_var_attr_stats(int2vector *attrs,
+ int natts, VacAttrStats **vacattrstats);
+
+static List* list_mv_stats(Oid relid);
+
+
+/*
+ * Compute requested multivariate stats, using the rows sampled for the
+ * plain (single-column) stats.
+ *
+ * This fetches a list of stats from pg_mv_statistic, computes the stats
+ * and serializes them back into the catalog (as bytea values).
+ */
+void
+build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
+ int natts, VacAttrStats **vacattrstats)
+{
+ ListCell *lc;
+ List *mvstats;
+
+ TupleDesc tupdesc = RelationGetDescr(onerel);
+
+ /*
+ * Fetch defined MV groups from pg_mv_statistic, and then compute
+ * the MV statistics (functional dependencies for now).
+ */
+ mvstats = list_mv_stats(RelationGetRelid(onerel));
+
+ foreach (lc, mvstats)
+ {
+ int j;
+ MVStatisticInfo *stat = (MVStatisticInfo *)lfirst(lc);
+ MVDependencies deps = NULL;
+
+ VacAttrStats **stats = NULL;
+ int numatts = 0;
+
+ /* int2 vector of attnums the stats should be computed on */
+ int2vector * attrs = stat->stakeys;
+
+ /* see how many of the columns are not dropped */
+ for (j = 0; j < attrs->dim1; j++)
+ if (! tupdesc->attrs[attrs->values[j]-1]->attisdropped)
+ numatts += 1;
+
+ /* if there are dropped attributes, build a filtered int2vector */
+ if (numatts != attrs->dim1)
+ {
+ int16 *tmp = palloc0(numatts * sizeof(int16));
+ int attnum = 0;
+
+ for (j = 0; j < attrs->dim1; j++)
+ if (! tupdesc->attrs[attrs->values[j]-1]->attisdropped)
+ tmp[attnum++] = attrs->values[j];
+
+ pfree(attrs);
+ attrs = buildint2vector(tmp, numatts);
+ }
+
+ /* filter only the interesting vacattrstats records */
+ stats = lookup_var_attr_stats(attrs, natts, vacattrstats);
+
+ /* check allowed number of dimensions */
+ Assert((attrs->dim1 >= 2) && (attrs->dim1 <= MVSTATS_MAX_DIMENSIONS));
+
+ /*
+ * Analyze functional dependencies of columns.
+ */
+ deps = build_mv_dependencies(numrows, rows, attrs, stats);
+
+ /* store the computed dependencies in the catalog */
+ update_mv_stats(stat->mvoid, deps, attrs);
+ }
+}
+
+/*
+ * Lookup the VacAttrStats info for the selected columns, with indexes
+ * matching the attrs vector (to make it easy to work with when
+ * computing multivariate stats).
+ */
+static VacAttrStats **
+lookup_var_attr_stats(int2vector *attrs, int natts, VacAttrStats **vacattrstats)
+{
+ int i, j;
+ int numattrs = attrs->dim1;
+ VacAttrStats **stats = (VacAttrStats**)palloc0(numattrs * sizeof(VacAttrStats*));
+
+ /* lookup VacAttrStats info for the requested columns (same attnum) */
+ for (i = 0; i < numattrs; i++)
+ {
+ stats[i] = NULL;
+ for (j = 0; j < natts; j++)
+ {
+ if (attrs->values[i] == vacattrstats[j]->tupattnum)
+ {
+ stats[i] = vacattrstats[j];
+ break;
+ }
+ }
+
+ /*
+ * Check that we found the info, that the attnum matches, and
+ * that the requested 'lt' operator exists.
+ */
+ Assert(stats[i] != NULL);
+ Assert(stats[i]->tupattnum == attrs->values[i]);
+
+ /* FIXME This is rather ugly way to check for 'ltopr' (which
+ * is defined for 'scalar' attributes).
+ */
+ Assert(((StdAnalyzeData *)stats[i]->extra_data)->ltopr != InvalidOid);
+ }
+
+ return stats;
+}
+
+/*
+ * Fetch list of MV stats defined on a table, without the actual data
+ * for histograms, MCV lists etc.
+ */
+static List*
+list_mv_stats(Oid relid)
+{
+ Relation indrel;
+ SysScanDesc indscan;
+ ScanKeyData skey;
+ HeapTuple htup;
+ List *result = NIL;
+
+ /* Prepare to scan pg_mv_statistic for entries having indrelid = this rel. */
+ ScanKeyInit(&skey,
+ Anum_pg_mv_statistic_starelid,
+ BTEqualStrategyNumber, F_OIDEQ,
+ ObjectIdGetDatum(relid));
+
+ indrel = heap_open(MvStatisticRelationId, AccessShareLock);
+ indscan = systable_beginscan(indrel, MvStatisticRelidIndexId, true,
+ NULL, 1, &skey);
+
+ while (HeapTupleIsValid(htup = systable_getnext(indscan)))
+ {
+ MVStatisticInfo *info = makeNode(MVStatisticInfo);
+ Form_pg_mv_statistic stats = (Form_pg_mv_statistic) GETSTRUCT(htup);
+
+ info->mvoid = HeapTupleGetOid(htup);
+ info->stakeys = buildint2vector(stats->stakeys.values, stats->stakeys.dim1);
+ info->deps_built = stats->deps_built;
+
+ result = lappend(result, info);
+ }
+
+ systable_endscan(indscan);
+
+ heap_close(indrel, AccessShareLock);
+
+ /* TODO maybe save the list into the relcache, as in RelationGetIndexList
+ * (which served as an inspiration for this one)? */
+
+ return result;
+}
+
+void
+update_mv_stats(Oid mvoid, MVDependencies dependencies, int2vector *attrs)
+{
+ HeapTuple stup,
+ oldtup;
+ Datum values[Natts_pg_mv_statistic];
+ bool nulls[Natts_pg_mv_statistic];
+ bool replaces[Natts_pg_mv_statistic];
+
+ Relation sd = heap_open(MvStatisticRelationId, RowExclusiveLock);
+
+ memset(nulls, 1, Natts_pg_mv_statistic * sizeof(bool));
+ memset(replaces, 0, Natts_pg_mv_statistic * sizeof(bool));
+ memset(values, 0, Natts_pg_mv_statistic * sizeof(Datum));
+
+ /*
+ * Construct a new pg_mv_statistic tuple - replace only the dependencies
+ * value, depending on whether it actually was computed.
+ */
+ if (dependencies != NULL)
+ {
+ nulls[Anum_pg_mv_statistic_stadeps -1] = false;
+ values[Anum_pg_mv_statistic_stadeps - 1]
+ = PointerGetDatum(serialize_mv_dependencies(dependencies));
+ }
+
+ /* always replace the value (either by bytea or NULL) */
+ replaces[Anum_pg_mv_statistic_stadeps -1] = true;
+
+ /* always change the availability flags */
+ nulls[Anum_pg_mv_statistic_deps_built -1] = false;
+ nulls[Anum_pg_mv_statistic_stakeys-1] = false;
+
+ /* use the new attnums, in case we removed some dropped ones */
+ replaces[Anum_pg_mv_statistic_deps_built-1] = true;
+ replaces[Anum_pg_mv_statistic_stakeys -1] = true;
+
+ values[Anum_pg_mv_statistic_deps_built-1] = BoolGetDatum(dependencies != NULL);
+ values[Anum_pg_mv_statistic_stakeys -1] = PointerGetDatum(attrs);
+
+ /* Is there already a pg_mv_statistic tuple for these statistics? */
+ oldtup = SearchSysCache1(MVSTATOID,
+ ObjectIdGetDatum(mvoid));
+
+ if (HeapTupleIsValid(oldtup))
+ {
+ /* Yes, replace it */
+ stup = heap_modify_tuple(oldtup,
+ RelationGetDescr(sd),
+ values,
+ nulls,
+ replaces);
+ ReleaseSysCache(oldtup);
+ simple_heap_update(sd, &stup->t_self, stup);
+ }
+ else
+ elog(ERROR, "invalid pg_mv_statistic record (oid=%d)", mvoid);
+
+ /* update indexes too */
+ CatalogUpdateIndexes(sd, stup);
+
+ heap_freetuple(stup);
+
+ heap_close(sd, RowExclusiveLock);
+}
+
+/* multi-variate stats comparator */
+
+/*
+ * qsort_arg comparator for sorting Datums (MV stats)
+ *
+ * This does not maintain the tupnoLink array.
+ */
+int
+compare_scalars_simple(const void *a, const void *b, void *arg)
+{
+ Datum da = *(Datum*)a;
+ Datum db = *(Datum*)b;
+ SortSupport ssup = (SortSupport) arg;
+
+ return ApplySortComparator(da, false, db, false, ssup);
+}
+
+/*
+ * qsort_arg comparator for sorting data when partitioning a MV bucket
+ */
+int
+compare_scalars_partition(const void *a, const void *b, void *arg)
+{
+ Datum da = ((ScalarItem*)a)->value;
+ Datum db = ((ScalarItem*)b)->value;
+ SortSupport ssup = (SortSupport) arg;
+
+ return ApplySortComparator(da, false, db, false, ssup);
+}
+
+/* initialize multi-dimensional sort */
+MultiSortSupport
+multi_sort_init(int ndims)
+{
+ MultiSortSupport mss;
+
+ Assert(ndims >= 2);
+
+ mss = (MultiSortSupport)palloc0(offsetof(MultiSortSupportData, ssup)
+ + sizeof(SortSupportData)*ndims);
+
+ mss->ndims = ndims;
+
+ return mss;
+}
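+
+/*
+ * Typical usage (see build_mv_dependencies):
+ *
+ * mss = multi_sort_init(2);
+ * multi_sort_add_dimension(mss, 0, dima, stats);
+ * multi_sort_add_dimension(mss, 1, dimb, stats);
+ * qsort_arg(items, numrows, sizeof(SortItem), multi_sort_compare, mss);
+ */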
+
+/*
+ * add sort info for dimension 'dim' (index into vacattrstats) to mss,
+ * at position 'sortdim'
+ */
+void
+multi_sort_add_dimension(MultiSortSupport mss, int sortdim,
+ int dim, VacAttrStats **vacattrstats)
+{
+ /* first, lookup StdAnalyzeData for the dimension (attribute) */
+ SortSupportData ssup;
+ StdAnalyzeData *tmp = (StdAnalyzeData *)vacattrstats[dim]->extra_data;
+
+ Assert(mss != NULL);
+ Assert(sortdim < mss->ndims);
+
+ /* initialize sort support, etc. */
+ memset(&ssup, 0, sizeof(ssup));
+ ssup.ssup_cxt = CurrentMemoryContext;
+
+ /* We always use the default collation for statistics */
+ ssup.ssup_collation = DEFAULT_COLLATION_OID;
+ ssup.ssup_nulls_first = false;
+
+ PrepareSortSupportFromOrderingOp(tmp->ltopr, &ssup);
+
+ mss->ssup[sortdim] = ssup;
+}
+
+/* compare all the dimensions in the selected order */
+int
+multi_sort_compare(const void *a, const void *b, void *arg)
+{
+ int i;
+ SortItem *ia = (SortItem*)a;
+ SortItem *ib = (SortItem*)b;
+
+ MultiSortSupport mss = (MultiSortSupport)arg;
+
+ for (i = 0; i < mss->ndims; i++)
+ {
+ int compare;
+
+ compare = ApplySortComparator(ia->values[i], ia->isnull[i],
+ ib->values[i], ib->isnull[i],
+ &mss->ssup[i]);
+
+ if (compare != 0)
+ return compare;
+
+ }
+
+ /* equal by default */
+ return 0;
+}
+
+/* compare selected dimension */
+int
+multi_sort_compare_dim(int dim, const SortItem *a, const SortItem *b,
+ MultiSortSupport mss)
+{
+ return ApplySortComparator(a->values[dim], a->isnull[dim],
+ b->values[dim], b->isnull[dim],
+ &mss->ssup[dim]);
+}
diff --git a/src/backend/utils/mvstats/common.h b/src/backend/utils/mvstats/common.h
new file mode 100644
index 0000000..6d5465b
--- /dev/null
+++ b/src/backend/utils/mvstats/common.h
@@ -0,0 +1,75 @@
+/*-------------------------------------------------------------------------
+ *
+ * common.h
+ * POSTGRES multivariate statistics
+ *
+ *
+ * Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ * src/backend/utils/mvstats/common.h
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres.h"
+
+#include "access/tuptoaster.h"
+#include "catalog/indexing.h"
+#include "catalog/pg_collation.h"
+#include "catalog/pg_mv_statistic.h"
+#include "foreign/fdwapi.h"
+#include "postmaster/autovacuum.h"
+#include "storage/lmgr.h"
+#include "utils/datum.h"
+#include "utils/sortsupport.h"
+#include "utils/syscache.h"
+#include "utils/fmgroids.h"
+#include "utils/builtins.h"
+#include "access/sysattr.h"
+
+#include "utils/mvstats.h"
+
+/* FIXME private structure copied from analyze.c */
+
+typedef struct
+{
+ Oid eqopr; /* '=' operator for datatype, if any */
+ Oid eqfunc; /* and associated function */
+ Oid ltopr; /* '<' operator for datatype, if any */
+} StdAnalyzeData;
+
+typedef struct
+{
+ Datum value; /* a data value */
+ int tupno; /* position index for tuple it came from */
+} ScalarItem;
+
+/* multi-sort */
+typedef struct MultiSortSupportData {
+ int ndims; /* number of dimensions supported by the sort */
+ SortSupportData ssup[1]; /* sort support data for each dimension */
+} MultiSortSupportData;
+
+typedef MultiSortSupportData* MultiSortSupport;
+
+typedef struct SortItem {
+ Datum *values;
+ bool *isnull;
+} SortItem;
+
+MultiSortSupport multi_sort_init(int ndims);
+
+void multi_sort_add_dimension(MultiSortSupport mss, int sortdim,
+ int dim, VacAttrStats **vacattrstats);
+
+int multi_sort_compare(const void *a, const void *b, void *arg);
+
+int multi_sort_compare_dim(int dim, const SortItem *a,
+ const SortItem *b, MultiSortSupport mss);
+
+/* comparators, used when constructing multivariate stats */
+int compare_scalars_simple(const void *a, const void *b, void *arg);
+int compare_scalars_partition(const void *a, const void *b, void *arg);
diff --git a/src/backend/utils/mvstats/dependencies.c b/src/backend/utils/mvstats/dependencies.c
new file mode 100644
index 0000000..2a064a0
--- /dev/null
+++ b/src/backend/utils/mvstats/dependencies.c
@@ -0,0 +1,437 @@
+/*-------------------------------------------------------------------------
+ *
+ * dependencies.c
+ * POSTGRES multivariate functional dependencies
+ *
+ *
+ * Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ * src/backend/utils/mvstats/dependencies.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "common.h"
+#include "utils/lsyscache.h"
+
+/*
+ * Detect functional dependencies between columns.
+ *
+ * TODO This builds a complete set of dependencies, i.e. including transitive
+ * dependencies - if we identify [A => B] and [B => C], we're likely to
+ * identify [A => C] too. It might be better to keep only the minimal set
+ * of dependencies, i.e. prune all the dependencies that we can recreate
+ * by transitivity.
+ *
+ * There are two conceptual ways to do that:
+ *
+ * (a) generate all the rules, and then prune the rules that may be
+ * recreated by combining other dependencies, or
+ *
+ * (b) perform the 'is a combination of other dependencies' check before
+ * actually doing the work
+ *
+ * The second option has the advantage that we don't really need to perform
+ * the sort/count. It's not sufficient alone, though, because we may
+ * discover the dependencies in the wrong order. For example we may find
+ *
+ * (a -> b), (a -> c) and then (b -> c)
+ *
+ * None of those dependencies is a combination of the already known ones,
+ * yet (a -> c) is a combination of (a -> b) and (b -> c).
+ *
+ *
+ * FIXME Currently we simply replace NULL values with 0 and then handle them
+ * as regular values, but that groups NULLs with actual 0 values. That's
+ * clearly incorrect - we need to handle NULL values as a separate value.
+ */
+MVDependencies
+build_mv_dependencies(int numrows, HeapTuple *rows, int2vector *attrs,
+ VacAttrStats **stats)
+{
+ int i;
+ int numattrs = attrs->dim1;
+
+ /* result */
+ int ndeps = 0;
+ MVDependencies dependencies = NULL;
+ MultiSortSupport mss = multi_sort_init(2); /* 2 dimensions for now */
+
+ /* TODO Maybe this should be somehow related to the number of
+ * distinct values in the two columns we're currently analyzing.
+ * Assuming the distribution is uniform, we can estimate the
+ * average group size and use it as a threshold. Or something
+ * like that. Seems better than a static approach.
+ */
+ int min_group_size = 3;
+
+ /* dimension indexes we'll check for associations [a => b] */
+ int dima, dimb;
+
+ /*
+ * We'll reuse the same array for all the 2-column combinations.
+ *
+ * It's possible to sort the sample rows directly, but this seemed
+ * somewhat simpler / less error-prone. Another option would be to
+ * allocate the arrays for each SortItem separately, but that'd be
+ * significant overhead (not just CPU, but especially memory bloat).
+ */
+ SortItem * items = (SortItem*)palloc0(numrows * sizeof(SortItem));
+
+ Datum *values = (Datum*)palloc0(sizeof(Datum) * numrows * 2);
+ bool *isnull = (bool*)palloc0(sizeof(bool) * numrows * 2);
+
+ for (i = 0; i < numrows; i++)
+ {
+ items[i].values = &values[i * 2];
+ items[i].isnull = &isnull[i * 2];
+ }
+
+ Assert(numattrs >= 2);
+
+ /*
+ * Evaluate all possible combinations of [A => B], using a simple algorithm:
+ *
+ * (a) sort the data by [A,B]
+ * (b) split the data into groups by A (new group whenever a value changes)
+ * (c) count different values in the B column (again, value changes)
+ *
+ * TODO It should be rather simple to merge [A => B] and [A => C] into
+ * [A => B,C]. Just keep A constant, collect all the "implied" columns
+ * and you're done.
+ */
+ for (dima = 0; dima < numattrs; dima++)
+ {
+ /* prepare the sort function for the first dimension */
+ multi_sort_add_dimension(mss, 0, dima, stats);
+
+ for (dimb = 0; dimb < numattrs; dimb++)
+ {
+ SortItem current;
+
+ /* number of groups supporting / contradicting the dependency */
+ int n_supporting = 0;
+ int n_contradicting = 0;
+
+ /* counters valid within a group */
+ int group_size = 0;
+ int n_violations = 0;
+
+ int n_supporting_rows = 0;
+ int n_contradicting_rows = 0;
+
+ /* make sure the columns are different (skip the trivial A => A case) */
+ if (dima == dimb)
+ continue;
+
+ /* prepare the sort function for the second dimension */
+ multi_sort_add_dimension(mss, 1, dimb, stats);
+
+ /* reset the values and isnull flags */
+ memset(values, 0, sizeof(Datum) * numrows * 2);
+ memset(isnull, 0, sizeof(bool) * numrows * 2);
+
+ /* accumulate all the data for both columns into an array and sort it */
+ for (i = 0; i < numrows; i++)
+ {
+ items[i].values[0]
+ = heap_getattr(rows[i], attrs->values[dima],
+ stats[dima]->tupDesc, &items[i].isnull[0]);
+
+ items[i].values[1]
+ = heap_getattr(rows[i], attrs->values[dimb],
+ stats[dimb]->tupDesc, &items[i].isnull[1]);
+ }
+
+ qsort_arg((void *) items, numrows, sizeof(SortItem),
+ multi_sort_compare, mss);
+
+ /*
+ * Walk through the array, split it into groups according to
+ * the A value, and count distinct values in the B column.
+ * If there's a single B value for the whole group, we count
+ * it as supporting the association, otherwise we count it
+ * as contradicting.
+ *
+ * Furthermore we require a group to have at least a certain
+ * number of rows to be considered useful for supporting the
+ * dependency. A contradicting group however always counts.
+ */
+
+ /* start with values from the first row */
+ current = items[0];
+ group_size = 1;
+
+ for (i = 1; i < numrows; i++)
+ {
+ /* end of the group */
+ if (multi_sort_compare_dim(0, &items[i], ¤t, mss) != 0)
+ {
+ /*
+ * If there are no contradicting rows, count it as
+ * supporting (otherwise contradicting), but only if
+ * the group is large enough.
+ *
+ * The requirement of a minimum group size makes it
+ * impossible to identify [unique,unique] cases, but
+ * that's probably a different case. This is more
+ * about [zip => city] associations etc.
+ *
+ * If there are violations, count the group/rows as
+ * a violation.
+ *
+ * It may be neither, if the group is too small (does
+ * not contain at least min_group_size rows).
+ */
+ if ((n_violations == 0) && (group_size >= min_group_size))
+ {
+ n_supporting += 1;
+ n_supporting_rows += group_size;
+ }
+ else if (n_violations > 0)
+ {
+ n_contradicting += 1;
+ n_contradicting_rows += group_size;
+ }
+
+ /* current values start a new group */
+ n_violations = 0;
+ group_size = 0;
+ }
+ /* mismatch of a B value is contradicting */
+ else if (multi_sort_compare_dim(1, &items[i], ¤t, mss) != 0)
+ {
+ n_violations += 1;
+ }
+
+ current = items[i];
+ group_size += 1;
+ }
+
+ /* handle the last group (just like above) */
+ if ((n_violations == 0) && (group_size >= min_group_size))
+ {
+ n_supporting += 1;
+ n_supporting_rows += group_size;
+ }
+ else if (n_violations)
+ {
+ n_contradicting += 1;
+ n_contradicting_rows += group_size;
+ }
+
+ /*
+ * See if the number of rows supporting the association is at least
+ * 10x the number of rows violating the hypothetical dependency.
+ *
+ * TODO This is a rather arbitrary limit - I guess it's possible to do
+ * some math to come up with a better rule (e.g. testing a hypothesis
+ * 'this is due to randomness'). We can create a contingency table
+ * from the values and use it for testing. Possibly only when
+ * there are no contradicting rows?
+ *
+ * TODO Also, if (a => b) and (b => a) at the same time, it pretty much
+ * means there's a 1:1 relation (or one is a 'label'), making the
+ * conditions rather redundant. Although it's possible that the
+ * query uses incompatible combination of values.
+ */
+ if (n_supporting_rows > (n_contradicting_rows * 10))
+ {
+ if (dependencies == NULL)
+ {
+ dependencies = (MVDependencies)palloc0(sizeof(MVDependenciesData));
+ dependencies->magic = MVSTAT_DEPS_MAGIC;
+ }
+ else
+ dependencies = repalloc(dependencies, offsetof(MVDependenciesData, deps)
+ + sizeof(MVDependency) * (dependencies->ndeps + 1));
+
+ /* add the new dependency to the list */
+ dependencies->deps[ndeps] = (MVDependency)palloc0(sizeof(MVDependencyData));
+ dependencies->deps[ndeps]->a = attrs->values[dima];
+ dependencies->deps[ndeps]->b = attrs->values[dimb];
+
+ dependencies->ndeps = (++ndeps);
+ }
+ }
+ }
+
+ pfree(items);
+ pfree(values);
+ pfree(isnull);
+ pfree(stats);
+ pfree(mss);
+
+ return dependencies;
+}
+
+/*
+ * Store the dependencies into a bytea, so that it can be stored in the
+ * pg_mv_statistic catalog.
+ *
+ * Currently this only supports simple two-column rules, and stores them
+ * as a sequence of attnum pairs. In the future, this needs to be made
+ * more complex to support multiple columns on both sides of the
+ * implication (using AND on left, OR on right).
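+ *
+ * The serialized layout is simply the struct header (magic number and
+ * ndeps), followed by ndeps pairs of int16 attnums: [a1,b1][a2,b2]...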
+ */
+bytea *
+serialize_mv_dependencies(MVDependencies dependencies)
+{
+ int i;
+
+ /* we need to store the header, plus 2 * int16 per dependency */
+ Size len = VARHDRSZ + offsetof(MVDependenciesData, deps)
+ + dependencies->ndeps * (sizeof(int16) * 2);
+
+ bytea * output = (bytea*)palloc0(len);
+
+ char * tmp = VARDATA(output);
+
+ SET_VARSIZE(output, len);
+
+ /* first, store the header (magic number and number of dependencies) */
+ memcpy(tmp, dependencies, offsetof(MVDependenciesData, deps));
+ tmp += offsetof(MVDependenciesData, deps);
+
+ /* walk through the dependencies and copy both columns into the bytea */
+ for (i = 0; i < dependencies->ndeps; i++)
+ {
+ memcpy(tmp, &(dependencies->deps[i]->a), sizeof(int16));
+ tmp += sizeof(int16);
+
+ memcpy(tmp, &(dependencies->deps[i]->b), sizeof(int16));
+ tmp += sizeof(int16);
+ }
+
+ return output;
+}
+
+/*
+ * Reads serialized dependencies into MVDependencies structure.
+ */
+MVDependencies
+deserialize_mv_dependencies(bytea * data)
+{
+ int i;
+ Size expected_size;
+ MVDependencies dependencies;
+ char *tmp;
+
+ if (data == NULL)
+ return NULL;
+
+ if (VARSIZE_ANY_EXHDR(data) < offsetof(MVDependenciesData,deps))
+ elog(ERROR, "invalid MVDependencies size %ld (expected at least %ld)",
+ VARSIZE_ANY_EXHDR(data), offsetof(MVDependenciesData,deps));
+
+ /* read the MVDependencies header */
+ dependencies = (MVDependencies)palloc0(sizeof(MVDependenciesData));
+
+ /* initialize pointer to the data part (skip the varlena header) */
+ tmp = VARDATA(data);
+
+ /* get the header and perform basic sanity checks */
+ memcpy(dependencies, tmp, offsetof(MVDependenciesData, deps));
+ tmp += offsetof(MVDependenciesData, deps);
+
+ if (dependencies->magic != MVSTAT_DEPS_MAGIC)
+ {
+ pfree(dependencies);
+ elog(WARNING, "not a MV Dependencies (magic number mismatch)");
+ return NULL;
+ }
+
+ Assert(dependencies->ndeps > 0);
+
+ /* what bytea size do we expect for those parameters */
+ expected_size = offsetof(MVDependenciesData,deps) +
+ dependencies->ndeps * sizeof(int16) * 2;
+
+ if (VARSIZE_ANY_EXHDR(data) != expected_size)
+ elog(ERROR, "invalid dependencies size %ld (expected %ld)",
+ VARSIZE_ANY_EXHDR(data), expected_size);
+
+ /* allocate space for the dependencies */
+ dependencies = repalloc(dependencies, offsetof(MVDependenciesData,deps)
+ + (dependencies->ndeps * sizeof(MVDependency)));
+
+ for (i = 0; i < dependencies->ndeps; i++)
+ {
+ dependencies->deps[i] = (MVDependency)palloc0(sizeof(MVDependencyData));
+
+ memcpy(&(dependencies->deps[i]->a), tmp, sizeof(int16));
+ tmp += sizeof(int16);
+
+ memcpy(&(dependencies->deps[i]->b), tmp, sizeof(int16));
+ tmp += sizeof(int16);
+ }
+
+ return dependencies;
+}
+
+/* print some basic info about dependencies (number of dependencies) */
+Datum
+pg_mv_stats_dependencies_info(PG_FUNCTION_ARGS)
+{
+ bytea *data = PG_GETARG_BYTEA_P(0);
+ char *result;
+
+ MVDependencies dependencies = deserialize_mv_dependencies(data);
+
+ if (dependencies == NULL)
+ PG_RETURN_NULL();
+
+ result = palloc0(128);
+ snprintf(result, 128, "dependencies=%d", dependencies->ndeps);
+
+ /* FIXME free the deserialized data (pfree is not enough) */
+
+ PG_RETURN_TEXT_P(cstring_to_text(result));
+}
+
+/* print the dependencies
+ *
+ * TODO Would be nice if this knew the actual column names (instead of
+ * the attnums).
+ *
+ * FIXME This is really ugly and does not really check the lengths and
+ * strcpy/snprintf return values properly. Needs to be fixed.
+ */
+Datum
+pg_mv_stats_dependencies_show(PG_FUNCTION_ARGS)
+{
+ int i = 0;
+ bytea *data = PG_GETARG_BYTEA_P(0);
+ char *result = NULL;
+ int len = 0;
+
+ MVDependencies dependencies = deserialize_mv_dependencies(data);
+
+ if (dependencies == NULL)
+ PG_RETURN_NULL();
+
+ for (i = 0; i < dependencies->ndeps; i++)
+ {
+ MVDependency dependency = dependencies->deps[i];
+ char buffer[128];
+
+ int tmp = snprintf(buffer, 128, "%s%d => %d",
+ ((i == 0) ? "" : ", "), dependency->a, dependency->b);
+
+ if (tmp < 127)
+ {
+ if (result == NULL)
+ result = palloc0(len + tmp + 1);
+ else
+ result = repalloc(result, len + tmp + 1);
+
+ strcpy(result + len, buffer);
+ len += tmp;
+ }
+ }
+
+ PG_RETURN_TEXT_P(cstring_to_text(result));
+}
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index fd8dc91..4f106c3 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -2104,6 +2104,50 @@ describeOneTableDetails(const char *schemaname,
PQclear(result);
}
+ /* print any multivariate statistics */
+ if (pset.sversion >= 90500)
+ {
+ printfPQExpBuffer(&buf,
+ "SELECT oid, stanamespace::regnamespace AS nsp, staname, stakeys,\n"
+ " deps_enabled,\n"
+ " deps_built,\n"
+ " (SELECT string_agg(attname::text,', ')\n"
+ " FROM ((SELECT unnest(stakeys) AS attnum) s\n"
+ " JOIN pg_attribute a ON (starelid = a.attrelid and a.attnum = s.attnum))) AS attnums\n"
+ "FROM pg_mv_statistic stat WHERE starelid = '%s' ORDER BY 1;",
+ oid);
+
+ result = PSQLexec(buf.data);
+ if (!result)
+ goto error_return;
+ else
+ tuples = PQntuples(result);
+
+ if (tuples > 0)
+ {
+ printTableAddFooter(&cont, _("Statistics:"));
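+
+ /*
+ * Each statistics footer line ends up looking like (illustrative):
+ * "public.s1" (dependencies) ON (a, b)
+ */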
+ for (i = 0; i < tuples; i++)
+ {
+ printfPQExpBuffer(&buf, " ");
+
+ /* statistics name (qualified with namespace) */
+ appendPQExpBuffer(&buf, "\"%s.%s\" ",
+ PQgetvalue(result, i, 1),
+ PQgetvalue(result, i, 2));
+
+ /* options */
+ if (!strcmp(PQgetvalue(result, i, 4), "t"))
+ appendPQExpBuffer(&buf, "(dependencies)");
+
+ appendPQExpBuffer(&buf, " ON (%s)",
+ PQgetvalue(result, i, 6));
+
+ printTableAddFooter(&cont, buf.data);
+ }
+ }
+ PQclear(result);
+ }
+
/* print rules */
if (tableinfo.hasrules && tableinfo.relkind != 'm')
{
diff --git a/src/include/catalog/dependency.h b/src/include/catalog/dependency.h
index 049bf9f..12211fe 100644
--- a/src/include/catalog/dependency.h
+++ b/src/include/catalog/dependency.h
@@ -153,10 +153,11 @@ typedef enum ObjectClass
OCLASS_EXTENSION, /* pg_extension */
OCLASS_EVENT_TRIGGER, /* pg_event_trigger */
OCLASS_POLICY, /* pg_policy */
- OCLASS_TRANSFORM /* pg_transform */
+ OCLASS_TRANSFORM, /* pg_transform */
+ OCLASS_STATISTICS /* pg_mv_statistics */
} ObjectClass;
-#define LAST_OCLASS OCLASS_TRANSFORM
+#define LAST_OCLASS OCLASS_STATISTICS
/* in dependency.c */
diff --git a/src/include/catalog/heap.h b/src/include/catalog/heap.h
index b80d8d8..5ae42f7 100644
--- a/src/include/catalog/heap.h
+++ b/src/include/catalog/heap.h
@@ -119,6 +119,7 @@ extern void RemoveAttrDefault(Oid relid, AttrNumber attnum,
DropBehavior behavior, bool complain, bool internal);
extern void RemoveAttrDefaultById(Oid attrdefId);
extern void RemoveStatistics(Oid relid, AttrNumber attnum);
+extern void RemoveMVStatistics(Oid relid, AttrNumber attnum);
extern Form_pg_attribute SystemAttributeDefinition(AttrNumber attno,
bool relhasoids);
diff --git a/src/include/catalog/indexing.h b/src/include/catalog/indexing.h
index ab2c1a8..a768bb5 100644
--- a/src/include/catalog/indexing.h
+++ b/src/include/catalog/indexing.h
@@ -173,6 +173,13 @@ DECLARE_UNIQUE_INDEX(pg_largeobject_loid_pn_index, 2683, on pg_largeobject using
DECLARE_UNIQUE_INDEX(pg_largeobject_metadata_oid_index, 2996, on pg_largeobject_metadata using btree(oid oid_ops));
#define LargeObjectMetadataOidIndexId 2996
+DECLARE_UNIQUE_INDEX(pg_mv_statistic_oid_index, 3380, on pg_mv_statistic using btree(oid oid_ops));
+#define MvStatisticOidIndexId 3380
+DECLARE_UNIQUE_INDEX(pg_mv_statistic_name_index, 3997, on pg_mv_statistic using btree(staname name_ops, stanamespace oid_ops));
+#define MvStatisticNameIndexId 3997
+DECLARE_INDEX(pg_mv_statistic_relid_index, 3379, on pg_mv_statistic using btree(starelid oid_ops));
+#define MvStatisticRelidIndexId 3379
+
DECLARE_UNIQUE_INDEX(pg_namespace_nspname_index, 2684, on pg_namespace using btree(nspname name_ops));
#define NamespaceNameIndexId 2684
DECLARE_UNIQUE_INDEX(pg_namespace_oid_index, 2685, on pg_namespace using btree(oid oid_ops));
diff --git a/src/include/catalog/namespace.h b/src/include/catalog/namespace.h
index 2ccb3a7..44cf9c6 100644
--- a/src/include/catalog/namespace.h
+++ b/src/include/catalog/namespace.h
@@ -137,6 +137,8 @@ extern Oid get_collation_oid(List *collname, bool missing_ok);
extern Oid get_conversion_oid(List *conname, bool missing_ok);
extern Oid FindDefaultConversionProc(int32 for_encoding, int32 to_encoding);
+extern Oid get_statistics_oid(List *names, bool missing_ok);
+
/* initialization & transaction cleanup code */
extern void InitializeSearchPath(void);
extern void AtEOXact_Namespace(bool isCommit, bool parallel);
diff --git a/src/include/catalog/pg_mv_statistic.h b/src/include/catalog/pg_mv_statistic.h
new file mode 100644
index 0000000..a568a07
--- /dev/null
+++ b/src/include/catalog/pg_mv_statistic.h
@@ -0,0 +1,73 @@
+/*-------------------------------------------------------------------------
+ *
+ * pg_mv_statistic.h
+ * definition of the system "multivariate statistic" relation (pg_mv_statistic)
+ * along with the relation's initial contents.
+ *
+ *
+ * Portions Copyright (c) 1996-2014, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/catalog/pg_mv_statistic.h
+ *
+ * NOTES
+ * the genbki.pl script reads this file and generates .bki
+ * information from the DATA() statements.
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef PG_MV_STATISTIC_H
+#define PG_MV_STATISTIC_H
+
+#include "catalog/genbki.h"
+
+/* ----------------
+ * pg_mv_statistic definition. cpp turns this into
+ * typedef struct FormData_pg_mv_statistic
+ * ----------------
+ */
+#define MvStatisticRelationId 3381
+
+CATALOG(pg_mv_statistic,3381)
+{
+ /* These fields form the unique key for the entry: */
+ Oid starelid; /* relation containing attributes */
+ NameData staname; /* statistics name */
+ Oid stanamespace; /* OID of namespace containing this statistics */
+
+ /* statistics requested to build */
+ bool deps_enabled; /* analyze dependencies? */
+
+ /* statistics that are available (if requested) */
+ bool deps_built; /* dependencies were built */
+
+ /* variable-length fields start here, but we allow direct access to stakeys */
+ int2vector stakeys; /* array of column keys */
+
+#ifdef CATALOG_VARLEN
+ bytea stadeps; /* dependencies (serialized) */
+#endif
+
+} FormData_pg_mv_statistic;
+
+/* ----------------
+ * Form_pg_mv_statistic corresponds to a pointer to a tuple with
+ * the format of pg_mv_statistic relation.
+ * ----------------
+ */
+typedef FormData_pg_mv_statistic *Form_pg_mv_statistic;
+
+/* ----------------
+ * compiler constants for pg_mv_statistic
+ * ----------------
+ */
+#define Natts_pg_mv_statistic 7
+#define Anum_pg_mv_statistic_starelid 1
+#define Anum_pg_mv_statistic_staname 2
+#define Anum_pg_mv_statistic_stanamespace 3
+#define Anum_pg_mv_statistic_deps_enabled 4
+#define Anum_pg_mv_statistic_deps_built 5
+#define Anum_pg_mv_statistic_stakeys 6
+#define Anum_pg_mv_statistic_stadeps 7
+
+#endif /* PG_MV_STATISTIC_H */
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
index 62b9125..20d565c 100644
--- a/src/include/catalog/pg_proc.h
+++ b/src/include/catalog/pg_proc.h
@@ -2666,6 +2666,11 @@ DESCR("current user privilege on any column by rel name");
DATA(insert OID = 3029 ( has_any_column_privilege PGNSP PGUID 12 10 0 0 0 f f f f t f s s 2 0 16 "26 25" _null_ _null_ _null_ _null_ _null_ has_any_column_privilege_id _null_ _null_ _null_ ));
DESCR("current user privilege on any column by rel oid");
+DATA(insert OID = 3998 ( pg_mv_stats_dependencies_info PGNSP PGUID 12 1 0 0 0 f f f f t f i s 1 0 25 "17" _null_ _null_ _null_ _null_ _null_ pg_mv_stats_dependencies_info _null_ _null_ _null_ ));
+DESCR("multivariate stats: functional dependencies info");
+DATA(insert OID = 3999 ( pg_mv_stats_dependencies_show PGNSP PGUID 12 1 0 0 0 f f f f t f i s 1 0 25 "17" _null_ _null_ _null_ _null_ _null_ pg_mv_stats_dependencies_show _null_ _null_ _null_ ));
+DESCR("multivariate stats: functional dependencies show");
+
DATA(insert OID = 1928 ( pg_stat_get_numscans PGNSP PGUID 12 1 0 0 0 f f f f t f s r 1 0 20 "26" _null_ _null_ _null_ _null_ _null_ pg_stat_get_numscans _null_ _null_ _null_ ));
DESCR("statistics: number of scans done for table/index");
DATA(insert OID = 1929 ( pg_stat_get_tuples_returned PGNSP PGUID 12 1 0 0 0 f f f f t f s r 1 0 20 "26" _null_ _null_ _null_ _null_ _null_ pg_stat_get_tuples_returned _null_ _null_ _null_ ));
diff --git a/src/include/catalog/toasting.h b/src/include/catalog/toasting.h
index b7a38ce..a52096b 100644
--- a/src/include/catalog/toasting.h
+++ b/src/include/catalog/toasting.h
@@ -49,6 +49,7 @@ extern void BootstrapToastTable(char *relName,
DECLARE_TOAST(pg_attrdef, 2830, 2831);
DECLARE_TOAST(pg_constraint, 2832, 2833);
DECLARE_TOAST(pg_description, 2834, 2835);
+DECLARE_TOAST(pg_mv_statistic, 3577, 3578);
DECLARE_TOAST(pg_proc, 2836, 2837);
DECLARE_TOAST(pg_rewrite, 2838, 2839);
DECLARE_TOAST(pg_seclabel, 3598, 3599);
diff --git a/src/include/commands/defrem.h b/src/include/commands/defrem.h
index 54f67e9..99a6a62 100644
--- a/src/include/commands/defrem.h
+++ b/src/include/commands/defrem.h
@@ -75,6 +75,10 @@ extern ObjectAddress DefineOperator(List *names, List *parameters);
extern void RemoveOperatorById(Oid operOid);
extern ObjectAddress AlterOperator(AlterOperatorStmt *stmt);
+/* commands/statscmds.c */
+extern ObjectAddress CreateStatistics(CreateStatsStmt *stmt);
+extern void RemoveStatisticsById(Oid statsOid);
+
/* commands/aggregatecmds.c */
extern ObjectAddress DefineAggregate(List *name, List *args, bool oldstyle,
List *parameters, const char *queryString);
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index c407fa2..2226aad 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -251,6 +251,7 @@ typedef enum NodeTag
T_PlaceHolderInfo,
T_MinMaxAggInfo,
T_PlannerParamItem,
+ T_MVStatisticInfo,
/*
* TAGS FOR MEMORY NODES (memnodes.h)
@@ -386,6 +387,7 @@ typedef enum NodeTag
T_CreatePolicyStmt,
T_AlterPolicyStmt,
T_CreateTransformStmt,
+ T_CreateStatsStmt,
/*
* TAGS FOR PARSE TREE NODES (parsenodes.h)
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 2fd0629..e1807fb 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -601,6 +601,17 @@ typedef struct ColumnDef
int location; /* parse location, or -1 if none/unknown */
} ColumnDef;
+typedef struct CreateStatsStmt
+{
+ NodeTag type;
+ List *defnames; /* qualified name (list of Value strings) */
+ RangeVar *relation; /* relation to build statistics on */
+ List *keys; /* String nodes naming referenced column(s) */
+ List *options; /* list of DefElem nodes */
+ bool if_not_exists; /* do nothing if statistics already exist? */
+} CreateStatsStmt;
+
+
/*
* TableLikeClause - CREATE TABLE ( ... LIKE ... ) clause
*/
@@ -1410,6 +1421,7 @@ typedef enum ObjectType
OBJECT_RULE,
OBJECT_SCHEMA,
OBJECT_SEQUENCE,
+ OBJECT_STATISTICS,
OBJECT_TABCONSTRAINT,
OBJECT_TABLE,
OBJECT_TABLESPACE,
diff --git a/src/include/nodes/relation.h b/src/include/nodes/relation.h
index af8cb6b..de86d01 100644
--- a/src/include/nodes/relation.h
+++ b/src/include/nodes/relation.h
@@ -503,6 +503,7 @@ typedef struct RelOptInfo
List *lateral_vars; /* LATERAL Vars and PHVs referenced by rel */
Relids lateral_referencers; /* rels that reference me laterally */
List *indexlist; /* list of IndexOptInfo */
+ List *mvstatlist; /* list of MVStatisticInfo */
BlockNumber pages; /* size estimates derived from pg_class */
double tuples;
double allvisfrac;
@@ -600,6 +601,33 @@ typedef struct IndexOptInfo
void (*amcostestimate) (); /* AM's cost estimator */
} IndexOptInfo;
+/*
+ * MVStatisticInfo
+ * Information about multivariate stats for planning/optimization
+ *
+ * This contains information about which columns are covered by the
+ * statistics (stakeys), which options were requested while adding the
+ * statistics (*_enabled), and which kinds of statistics were actually
+ * built and are available for the optimizer (*_built).
+ */
+typedef struct MVStatisticInfo
+{
+ NodeTag type;
+
+ Oid mvoid; /* OID of the statistics row */
+ RelOptInfo *rel; /* back-link to the statistics' relation */
+
+ /* enabled statistics */
+ bool deps_enabled; /* functional dependencies enabled */
+
+ /* built/available statistics */
+ bool deps_built; /* functional dependencies built */
+
+ /* columns in the statistics (attnums) */
+ int2vector *stakeys; /* attnums of the columns covered */
+
+} MVStatisticInfo;
+
/*
* EquivalenceClasses
diff --git a/src/include/utils/mvstats.h b/src/include/utils/mvstats.h
new file mode 100644
index 0000000..7ebd961
--- /dev/null
+++ b/src/include/utils/mvstats.h
@@ -0,0 +1,70 @@
+/*-------------------------------------------------------------------------
+ *
+ * mvstats.h
+ * Multivariate statistics and selectivity estimation functions.
+ *
+ *
+ * Portions Copyright (c) 1996-2014, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/utils/mvstats.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef MVSTATS_H
+#define MVSTATS_H
+
+#include "fmgr.h"
+#include "commands/vacuum.h"
+
+
+#define MVSTATS_MAX_DIMENSIONS 8 /* max number of attributes */
+
+/* An associative rule, tracking [a => b] dependency.
+ *
+ * TODO Make this work with multiple columns on both sides.
+ */
+typedef struct MVDependencyData {
+ int16 a;
+ int16 b;
+} MVDependencyData;
+
+typedef MVDependencyData* MVDependency;
+
+typedef struct MVDependenciesData {
+ uint32 magic; /* magic constant marker */
+ int32 ndeps; /* number of dependencies */
+ MVDependency deps[1]; /* XXX why not a pointer? */
+} MVDependenciesData;
+
+typedef MVDependenciesData* MVDependencies;
+
+#define MVSTAT_DEPS_MAGIC 0xB4549A2C /* marks serialized bytea */
+#define MVSTAT_DEPS_TYPE_BASIC 1 /* basic dependencies type */
+
+/*
+ * TODO Maybe fetching the histogram/MCV list separately is inefficient?
+ * Consider adding a single `fetch_stats` method, fetching all
+ * stats specified using flags (or something like that).
+ */
+
+bytea * serialize_mv_dependencies(MVDependencies dependencies);
+
+/* deserialization of stats (serialization is private to analyze) */
+MVDependencies deserialize_mv_dependencies(bytea * data);
+
+/* FIXME this probably belongs somewhere else (not to operations stats) */
+extern Datum pg_mv_stats_dependencies_info(PG_FUNCTION_ARGS);
+extern Datum pg_mv_stats_dependencies_show(PG_FUNCTION_ARGS);
+
+MVDependencies
+build_mv_dependencies(int numrows, HeapTuple *rows,
+ int2vector *attrs,
+ VacAttrStats **stats);
+
+void build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
+ int natts, VacAttrStats **vacattrstats);
+
+void update_mv_stats(Oid relid, MVDependencies dependencies, int2vector *attrs);
+
+#endif
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index f2bebf2..8771f9c 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -61,6 +61,7 @@ typedef struct RelationData
bool rd_isvalid; /* relcache entry is valid */
char rd_indexvalid; /* state of rd_indexlist: 0 = not valid, 1 =
* valid, 2 = temporarily forced */
+ bool rd_mvstatvalid; /* state of rd_mvstatlist: true/false */
/*
* rd_createSubid is the ID of the highest subtransaction the rel has
@@ -93,6 +94,9 @@ typedef struct RelationData
List *rd_indexlist; /* list of OIDs of indexes on relation */
Oid rd_oidindex; /* OID of unique index on OID, if any */
Oid rd_replidindex; /* OID of replica identity index, if any */
+
+ /* data managed by RelationGetMVStatList: */
+ List *rd_mvstatlist; /* list of OIDs of multivariate stats */
/* data managed by RelationGetIndexAttrBitmap: */
Bitmapset *rd_indexattr; /* identifies columns used in indexes */
diff --git a/src/include/utils/relcache.h b/src/include/utils/relcache.h
index 1b48304..9f03c8d 100644
--- a/src/include/utils/relcache.h
+++ b/src/include/utils/relcache.h
@@ -38,6 +38,7 @@ extern void RelationClose(Relation relation);
* Routines to compute/retrieve additional cached information
*/
extern List *RelationGetIndexList(Relation relation);
+extern List *RelationGetMVStatList(Relation relation);
extern Oid RelationGetOidIndex(Relation relation);
extern Oid RelationGetReplicaIndex(Relation relation);
extern List *RelationGetIndexExpressions(Relation relation);
diff --git a/src/include/utils/syscache.h b/src/include/utils/syscache.h
index 256615b..0e0658d 100644
--- a/src/include/utils/syscache.h
+++ b/src/include/utils/syscache.h
@@ -66,6 +66,8 @@ enum SysCacheIdentifier
INDEXRELID,
LANGNAME,
LANGOID,
+ MVSTATNAMENSP,
+ MVSTATOID,
NAMESPACENAME,
NAMESPACEOID,
OPERNAMENSP,
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 81bc5c9..84b4425 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1368,6 +1368,15 @@ pg_matviews| SELECT n.nspname AS schemaname,
LEFT JOIN pg_namespace n ON ((n.oid = c.relnamespace)))
LEFT JOIN pg_tablespace t ON ((t.oid = c.reltablespace)))
WHERE (c.relkind = 'm'::"char");
+pg_mv_stats| SELECT n.nspname AS schemaname,
+ c.relname AS tablename,
+ s.staname,
+ s.stakeys AS attnums,
+ length(s.stadeps) AS depsbytes,
+ pg_mv_stats_dependencies_info(s.stadeps) AS depsinfo
+ FROM ((pg_mv_statistic s
+ JOIN pg_class c ON ((c.oid = s.starelid)))
+ LEFT JOIN pg_namespace n ON ((n.oid = c.relnamespace)));
pg_policies| SELECT n.nspname AS schemaname,
c.relname AS tablename,
pol.polname AS policyname,
diff --git a/src/test/regress/expected/sanity_check.out b/src/test/regress/expected/sanity_check.out
index eb0bc88..92a0d8a 100644
--- a/src/test/regress/expected/sanity_check.out
+++ b/src/test/regress/expected/sanity_check.out
@@ -113,6 +113,7 @@ pg_inherits|t
pg_language|t
pg_largeobject|t
pg_largeobject_metadata|t
+pg_mv_statistic|t
pg_namespace|t
pg_opclass|t
pg_operator|t
--
2.1.0
Attachment: 0003-clause-reduction-using-functional-dependencies.patch (binary/octet-stream)
From 730f652aa01850d09586e354fe37c1478cc22e46 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@pgaddict.com>
Date: Mon, 6 Apr 2015 19:42:18 +0200
Subject: [PATCH 3/9] clause reduction using functional dependencies
During planning, use functional dependencies to decide which
clauses to skip during cardinality estimation. Initial and
rather simplistic implementation.
This only works with regular WHERE clauses, not with join clauses.
Note: clause_is_mv_compatible() needs to identify the relation
(so that we can fetch the list of multivariate stats by OID).
planner_rt_fetch() seems like the appropriate way to get the
relation OID, but apparently it only works with simple Vars.
Maybe examine_variable() would make this work with more complex
Vars too?
Includes regression tests analyzing functional dependencies
(part of ANALYZE) on several datasets (no dependencies, no
transitive dependencies, ...).
Checks that a query with conditions on two columns, where one (B)
is functionally dependent on the other (A), correctly ignores the
clause on (B) and chooses a bitmap index scan instead of a plain
index scan (which is what happens otherwise, thanks to the
assumption of independence).
Note: Functional dependencies only work with equality clauses,
not with inequalities etc.
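
For example, with a dependency (a => b), a condition

    (a = 10) AND (b = 5)

is estimated using only (a = 10), because the clause on (b) is
considered implied by the clause on (a).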
---
src/backend/optimizer/path/clausesel.c | 830 +++++++++++++++++++++++++-
src/backend/utils/mvstats/README.stats | 36 ++
src/backend/utils/mvstats/common.c | 5 +-
src/backend/utils/mvstats/dependencies.c | 24 +
src/include/utils/mvstats.h | 16 +-
src/test/regress/expected/mv_dependencies.out | 172 ++++++
src/test/regress/parallel_schedule | 3 +
src/test/regress/serial_schedule | 1 +
src/test/regress/sql/mv_dependencies.sql | 150 +++++
9 files changed, 1232 insertions(+), 5 deletions(-)
create mode 100644 src/backend/utils/mvstats/README.stats
create mode 100644 src/test/regress/expected/mv_dependencies.out
create mode 100644 src/test/regress/sql/mv_dependencies.sql
diff --git a/src/backend/optimizer/path/clausesel.c b/src/backend/optimizer/path/clausesel.c
index 02660c2..c11aa3b 100644
--- a/src/backend/optimizer/path/clausesel.c
+++ b/src/backend/optimizer/path/clausesel.c
@@ -14,14 +14,19 @@
*/
#include "postgres.h"
+#include "access/sysattr.h"
+#include "catalog/pg_operator.h"
#include "nodes/makefuncs.h"
#include "optimizer/clauses.h"
#include "optimizer/cost.h"
#include "optimizer/pathnode.h"
#include "optimizer/plancat.h"
+#include "optimizer/var.h"
#include "utils/fmgroids.h"
#include "utils/lsyscache.h"
+#include "utils/mvstats.h"
#include "utils/selfuncs.h"
+#include "utils/typcache.h"
/*
@@ -41,6 +46,47 @@ typedef struct RangeQueryClause
static void addRangeClause(RangeQueryClause **rqlist, Node *clause,
bool varonleft, bool isLTsel, Selectivity s2);
+#define MV_CLAUSE_TYPE_FDEP 0x01
+
+static bool clause_is_mv_compatible(Node *clause, Oid varRelid,
+ Index *relid, AttrNumber *attnum, SpecialJoinInfo *sjinfo);
+
+static Bitmapset *collect_mv_attnums(List *clauses,
+ Oid varRelid, Index *relid, SpecialJoinInfo *sjinfo);
+
+static int count_mv_attnums(List *clauses, Oid varRelid,
+ SpecialJoinInfo *sjinfo);
+
+static List *clauselist_apply_dependencies(PlannerInfo *root, List *clauses,
+ Oid varRelid, List *stats,
+ SpecialJoinInfo *sjinfo);
+
+static bool has_stats(List *stats, int type);
+
+static List * find_stats(PlannerInfo *root, List *clauses,
+ Oid varRelid, Index *relid);
+
+static Bitmapset* fdeps_collect_attnums(List *stats);
+
+static int *make_idx_to_attnum_mapping(Bitmapset *attnums);
+static int *make_attnum_to_idx_mapping(Bitmapset *attnums);
+
+static bool *build_adjacency_matrix(List *stats, Bitmapset *attnums,
+ int *idx_to_attnum, int *attnum_to_idx);
+
+static void multiply_adjacency_matrix(bool *matrix, int natts);
+
+static List* fdeps_reduce_clauses(List *clauses,
+ Bitmapset *attnums, bool *matrix,
+ int *idx_to_attnum, int *attnum_to_idx,
+ Index relid);
+
+static Bitmapset *fdeps_filter_clauses(PlannerInfo *root,
+ List *clauses, Bitmapset *deps_attnums,
+ List **reduced_clauses, List **deps_clauses,
+ Oid varRelid, Index *relid, SpecialJoinInfo *sjinfo);
+
+static Bitmapset * get_varattnos(Node * node, Index relid);
/****************************************************************************
* ROUTINES TO COMPUTE SELECTIVITIES
@@ -60,7 +106,19 @@ static void addRangeClause(RangeQueryClause **rqlist, Node *clause,
* subclauses. However, that's only right if the subclauses have independent
* probabilities, and in reality they are often NOT independent. So,
* we want to be smarter where we can.
-
+ *
+ * The first thing we try is applying multivariate statistics, in a way
+ * that aims to minimize the overhead when there are no multivariate
+ * stats on the relation. Thus we do several simple (and inexpensive)
+ * checks first, to verify that suitable multivariate statistics exist.
+ *
+ * If we identify suitable multivariate statistics, we try to apply them.
+ * Currently we only have (soft) functional dependencies, so we try to
+ * reduce the list of clauses.
+ *
+ * Then we remove the clauses estimated using multivariate stats, and
+ * process the rest of the clauses using the regular per-column stats.
+ *
* Currently, the only extra smarts we have is to recognize "range queries",
* such as "x > 34 AND x < 42". Clauses are recognized as possible range
* query components if they are restriction opclauses whose operators have
@@ -99,6 +157,15 @@ clauselist_selectivity(PlannerInfo *root,
RangeQueryClause *rqlist = NULL;
ListCell *l;
+ /* processing mv stats */
+ Oid relid = InvalidOid;
+
+ /* list of multivariate stats on the relation */
+ List *stats = NIL;
+
+ /* use clauses (not conditions), because those are always non-empty */
+ stats = find_stats(root, clauses, varRelid, &relid);
+
/*
* If there's exactly one clause, then no use in trying to match up pairs,
* so just go directly to clause_selectivity().
@@ -108,6 +175,25 @@ clauselist_selectivity(PlannerInfo *root,
varRelid, jointype, sjinfo);
/*
+ * Apply functional dependencies, but first check that there are some stats
+ * with functional dependencies built (by simply walking the stats list),
+ * and that there are two or more attributes referenced by clauses that
+ * may be reduced using functional dependencies.
+ *
+ * We would find that anyway when trying to actually apply the functional
+ * dependencies, but let's do the cheap checks first.
+ *
+ * After applying the functional dependencies we get the remaining clauses
+ * that need to be estimated by other types of stats (MCV, histograms etc).
+ */
+ if (has_stats(stats, MV_CLAUSE_TYPE_FDEP) &&
+ (count_mv_attnums(clauses, varRelid, sjinfo) >= 2))
+ {
+ clauses = clauselist_apply_dependencies(root, clauses, varRelid,
+ stats, sjinfo);
+ }
+
+ /*
* Initial scan over clauses. Anything that doesn't look like a potential
* rangequery clause gets multiplied into s1 and forgotten. Anything that
* does gets inserted into an rqlist entry.
@@ -763,3 +849,745 @@ clause_selectivity(PlannerInfo *root,
return s1;
}
+
+/*
+ * Collect attributes from mv-compatible clauses.
+ */
+static Bitmapset *
+collect_mv_attnums(List *clauses, Oid varRelid,
+ Index *relid, SpecialJoinInfo *sjinfo)
+{
+ Bitmapset *attnums = NULL;
+ ListCell *l;
+
+ /*
+ * Walk through the clauses and identify the ones we can estimate
+ * using multivariate stats, and remember the relid/columns. We'll
+ * then cross-check if we have suitable stats, and only if needed
+ * we'll split the clauses into multivariate and regular lists.
+ *
+ * For now we're only interested in RestrictInfo nodes with nested
+ * OpExpr, using either a range or equality.
+ */
+ foreach (l, clauses)
+ {
+ AttrNumber attnum;
+ Node *clause = (Node *) lfirst(l);
+
+ /* ignore the result for now - we only need the info */
+ if (clause_is_mv_compatible(clause, varRelid, relid, &attnum, sjinfo))
+ attnums = bms_add_member(attnums, attnum);
+ }
+
+ /*
+ * If there are not at least two attributes referenced by the clause(s),
+ * we can throw everything out (as we'll revert to simple stats).
+ */
+ if (bms_num_members(attnums) <= 1)
+ {
+ if (attnums != NULL)
+ pfree(attnums);
+ attnums = NULL;
+ *relid = InvalidOid;
+ }
+
+ return attnums;
+}
+
+/*
+ * Count the number of attributes in clauses compatible with multivariate stats.
+ */
+static int
+count_mv_attnums(List *clauses, Oid varRelid, SpecialJoinInfo *sjinfo)
+{
+ int c;
+ Bitmapset *attnums = collect_mv_attnums(clauses, varRelid,
+ NULL, sjinfo);
+
+ c = bms_num_members(attnums);
+
+ bms_free(attnums);
+
+ return c;
+}
+
+/*
+ * Determines whether the clause is compatible with multivariate stats,
+ * and if it is, returns some additional information - varno (index
+ * into simple_rte_array) and a bitmap of attributes. This is then
+ * used to fetch related multivariate statistics.
+ *
+ * At this moment we only support basic conditions of the form
+ *
+ * variable OP constant
+ *
+ * where OP is one of [=,<,<=,>=,>] (determined by looking at the
+ * associated selectivity estimator function, just like in the
+ * single-column case).
+ *
+ * TODO Support 'OR clauses' - shouldn't be all that difficult to
+ * evaluate them using multivariate stats.
+ */
+static bool
+clause_is_mv_compatible(Node *clause, Oid varRelid,
+ Index *relid, AttrNumber *attnum, SpecialJoinInfo *sjinfo)
+{
+
+ if (IsA(clause, RestrictInfo))
+ {
+ RestrictInfo *rinfo = (RestrictInfo *) clause;
+
+ /* Pseudoconstants are not really interesting here. */
+ if (rinfo->pseudoconstant)
+ return false;
+
+ /* no support for OR clauses at this point */
+ if (rinfo->orclause)
+ return false;
+
+ /* get the actual clause from the RestrictInfo (it's not an OR clause) */
+ clause = (Node*)rinfo->clause;
+
+ /* only simple opclauses are compatible with multivariate stats */
+ if (! is_opclause(clause))
+ return false;
+
+ /* we don't support join conditions at this moment */
+ if (treat_as_join_clause(clause, rinfo, varRelid, sjinfo))
+ return false;
+
+ /* is it 'variable op constant' ? */
+ if (list_length(((OpExpr *) clause)->args) == 2)
+ {
+ OpExpr *expr = (OpExpr *) clause;
+ bool varonleft = true;
+ bool ok;
+
+ ok = (bms_membership(rinfo->clause_relids) == BMS_SINGLETON) &&
+ (is_pseudo_constant_clause_relids(lsecond(expr->args),
+ rinfo->right_relids) ||
+ (varonleft = false,
+ is_pseudo_constant_clause_relids(linitial(expr->args),
+ rinfo->left_relids)));
+
+ if (ok)
+ {
+ Var * var = (varonleft) ? linitial(expr->args) : lsecond(expr->args);
+
+ /*
+ * Simple variables only - otherwise the planner_rt_fetch seems to fail
+ * (return NULL).
+ *
+ * TODO Maybe using examine_variable() would fix that?
+ */
+ if (! (IsA(var, Var) && (varRelid == 0 || varRelid == var->varno)))
+ return false;
+
+ /*
+ * Only consider this variable if (varRelid == 0) or when the varno
+ * matches varRelid (see explanation at clause_selectivity).
+ *
+ * FIXME I suspect this may not be really necessary. The (varRelid == 0)
+ * part seems to be enforced by treat_as_join_clause().
+ */
+ if (! ((varRelid == 0) || (varRelid == var->varno)))
+ return false;
+
+ /* Also skip special varno values, and system attributes ... */
+ if ((IS_SPECIAL_VARNO(var->varno)) || (! AttrNumberIsForUserDefinedAttr(var->varattno)))
+ return false;
+
+ if (relid)
+ *relid = var->varno;
+
+ /*
+ * If it's not a "<" or ">" or "=" operator, just ignore the
+ * clause. Otherwise note the relid and attnum for the variable.
+ * This uses the function for estimating selectivity, not the
+ * operator directly (a bit awkward, but well ...).
+ */
+ switch (get_oprrest(expr->opno))
+ {
+ case F_EQSEL:
+ *attnum = var->varattno;
+ return true;
+ }
+ }
+ }
+ }
+
+ return false;
+
+}
+
+/*
+ * reduce list of equality clauses using soft functional dependencies
+ *
+ * We simply walk through list of functional dependencies, and for each one we
+ * check whether the dependency 'matches' the clauses, i.e. if there's a clause
+ * matching the condition. If yes, we attempt to remove all clauses matching
+ * the implied part of the dependency from the list.
+ *
+ * This only reduces equality clauses, and ignores all the other types. We might
+ * extend it to handle IS NULL clauses in the future.
+ *
+ * We also assume the equality clauses are 'compatible'. For example we can't
+ * identify when the clauses use a mismatching zip code and city name. In such
+ * case the usual approach (product of selectivities) would produce a better
+ * estimate, although mostly by chance.
+ *
+ * The implementation needs to be careful about cyclic dependencies, e.g. when
+ *
+ * (a -> b) and (b -> a)
+ *
+ * at the same time, which means there's a 1:1 relationship between the columns.
+ * In this case we must not reduce clauses on both attributes at the same time.
+ *
+ * TODO Currently we only apply functional dependencies at the same level, but
+ * maybe we could transfer the clauses from upper levels to the subtrees?
+ * For example let's say we have (a->b) dependency, and condition
+ *
+ * (a=1) AND (b=2 OR c=3)
+ *
+ * Currently, we won't be able to perform any reduction, because we'll
+ * consider (a=1) and (b=2 OR c=3) independently. But maybe we could pass
+ * (a=1) into the other expression, and only check it against conditions
+ * of the functional dependencies?
+ *
+ * In this case we'd end up with
+ *
+ * (a=1)
+ *
+ * as we'd consider (b=2) implied thanks to the rule, rendering the whole
+ * OR clause valid.
+ */
+static List *
+clauselist_apply_dependencies(PlannerInfo *root, List *clauses,
+ Oid varRelid, List *stats,
+ SpecialJoinInfo *sjinfo)
+{
+ List *reduced_clauses = NIL;
+ Index relid;
+
+ /*
+ * matrix of (natts x natts), 1 means x=>y
+ *
+ * This serves two purposes - first, it merges dependencies from all
+ * the statistics, second it makes generating all the transitive
+ * dependencies easier.
+ *
+ * We need to build this only for attributes from the dependencies,
+ * not for all attributes in the table.
+ *
+ * We can't do that only for attributes from the clauses, because we
+ * want to build transitive dependencies (including those going
+ * through attributes not listed in the stats).
+ *
+ * This only works for A=>B dependencies, not sure how to do that
+ * for complex dependencies.
+ */
+ bool *deps_matrix;
+ int deps_natts; /* size of the matrix */
+
+ /* mapping attnum <=> matrix index */
+ int *deps_idx_to_attnum;
+ int *deps_attnum_to_idx;
+
+ /* attnums in dependencies and clauses (and intersection) */
+ List *deps_clauses = NIL;
+ Bitmapset *deps_attnums = NULL;
+ Bitmapset *clause_attnums = NULL;
+ Bitmapset *intersect_attnums = NULL;
+
+ /*
+ * Is there at least one statistics with functional dependencies?
+ * If not, return the original clauses right away.
+ *
+ * XXX Isn't this pointless, thanks to exactly the same check in
+ * clauselist_selectivity()? Can we trigger the condition here?
+ */
+ if (! has_stats(stats, MV_CLAUSE_TYPE_FDEP))
+ return clauses;
+
+ /*
+ * Build the dependency matrix, i.e. attribute adjacency matrix,
+ * where 1 means (a=>b). Once we have the adjacency matrix, we'll
+ * multiply it by itself, to get transitive dependencies.
+ *
+ * Note: This is pretty much transitive closure from graph theory.
+ *
+ * First, let's see what attributes are covered by functional
+ * dependencies (sides of the adjacency matrix), and also the maximum
+ * attribute number (sizing the mapping to simple integer indexes).
+ */
+ deps_attnums = fdeps_collect_attnums(stats);
+
+ /*
+ * Walk through the clauses - clauses that are (one of)
+ *
+ * (a) not mv-compatible
+ * (b) are using more than a single attnum
+ * (c) using attnum not covered by functional dependencies
+ *
+ * may be copied directly to the result. The interesting clauses are
+ * kept in 'deps_clauses' and will be processed later.
+ */
+ clause_attnums = fdeps_filter_clauses(root, clauses, deps_attnums,
+ &reduced_clauses, &deps_clauses,
+ varRelid, &relid, sjinfo);
+
+ /*
+ * we need at least two clauses referencing two different attributes
+ * to do the reduction
+ */
+ if ((list_length(deps_clauses) < 2) || (bms_num_members(clause_attnums) < 2))
+ {
+ bms_free(clause_attnums);
+ list_free(reduced_clauses);
+ list_free(deps_clauses);
+
+ return clauses;
+ }
+
+
+ /*
+ * We need at least two matching attributes in the clauses and
+ * dependencies, otherwise we can't really reduce anything.
+ */
+ intersect_attnums = bms_intersect(clause_attnums, deps_attnums);
+ if (bms_num_members(intersect_attnums) < 2)
+ {
+ bms_free(clause_attnums);
+ bms_free(deps_attnums);
+ bms_free(intersect_attnums);
+
+ list_free(deps_clauses);
+ list_free(reduced_clauses);
+
+ return clauses;
+ }
+
+ /*
+ * Build mapping between matrix indexes and attnums, and then the
+ * adjacency matrix itself.
+ */
+ deps_idx_to_attnum = make_idx_to_attnum_mapping(deps_attnums);
+ deps_attnum_to_idx = make_attnum_to_idx_mapping(deps_attnums);
+
+ /* build the adjacency matrix */
+ deps_matrix = build_adjacency_matrix(stats, deps_attnums,
+ deps_idx_to_attnum,
+ deps_attnum_to_idx);
+
+ deps_natts = bms_num_members(deps_attnums);
+
+ /*
+ * Multiply the matrix N times (N = size of the matrix), so that we
+ * get all the transitive dependencies. That makes the next step
+ * much easier and faster.
+ *
+ * This is essentially an adjacency matrix from graph theory, and
+ * by multiplying it we get transitive edges. We don't really care
+ * about the exact number (number of paths between vertices) though,
+ * so we can do the multiplication in-place (we don't care whether
+ * we found the dependency in this round or in the previous one).
+ *
+ * Track how many new dependencies were added, and stop when 0, but
+ * we can't multiply more than N times (the longest path in the graph).
+ */
+ multiply_adjacency_matrix(deps_matrix, deps_natts);
+
+ /*
+ * Walk through the clauses, and see which other clauses we may
+ * reduce. The matrix contains all transitive dependencies, which
+ * makes this very fast.
+ *
+ * We have to be careful not to reduce a clause using itself, or to
+ * reduce all clauses forming a cycle (so we have to skip already
+ * eliminated clauses).
+ *
+ * I'm not sure whether this guarantees finding the best solution,
+ * i.e. reducing the most clauses, but it probably does (thanks to
+ * having all the transitive dependencies).
+ */
+ deps_clauses = fdeps_reduce_clauses(deps_clauses,
+ deps_attnums, deps_matrix,
+ deps_idx_to_attnum,
+ deps_attnum_to_idx, relid);
+
+ /* join the two lists of clauses */
+ reduced_clauses = list_union(reduced_clauses, deps_clauses);
+
+ pfree(deps_matrix);
+ pfree(deps_idx_to_attnum);
+ pfree(deps_attnum_to_idx);
+
+ bms_free(deps_attnums);
+ bms_free(clause_attnums);
+ bms_free(intersect_attnums);
+
+ return reduced_clauses;
+}
+
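+/*
+ * Check whether there's at least one statistics of the given type
+ * (currently only functional dependencies) that was actually built.
+ */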
+static bool
+has_stats(List *stats, int type)
+{
+ ListCell *s;
+
+ foreach (s, stats)
+ {
+ MVStatisticInfo *stat = (MVStatisticInfo *)lfirst(s);
+
+ if ((type & MV_CLAUSE_TYPE_FDEP) && stat->deps_built)
+ return true;
+ }
+
+ return false;
+}
+
+/*
+ * Determine the relid (either from varRelid or from the clauses) and
+ * then look up stats using the relid.
+ */
+static List *
+find_stats(PlannerInfo *root, List *clauses, Oid varRelid, Index *relid)
+{
+ /* unknown relid by default */
+ *relid = InvalidOid;
+
+ /*
+ * First we need to find the relid (index into simple_rel_array).
+ * If varRelid is not 0, we already have it, otherwise we have to
+ * look it up from the clauses.
+ */
+ if (varRelid != 0)
+ *relid = varRelid;
+ else
+ {
+ Relids relids = pull_varnos((Node*)clauses);
+
+ /*
+ * We only expect 0 or 1 members in the bitmapset. If there are
+ * no vars, we'll get an empty bitmapset, otherwise we'll get the
+ * relid as the single member.
+ *
+ * FIXME For some reason we can get 2 relids here (e.g. \d in
+ * psql does that).
+ */
+ if (bms_num_members(relids) == 1)
+ *relid = bms_singleton_member(relids);
+
+ bms_free(relids);
+ }
+
+ /*
+ * if we found the relid, we can get the stats from simple_rel_array
+ *
+ * This only gets stats that are already built, because that's how
+ * we load them into RelOptInfo (see get_relation_info), but we don't
+ * detoast the whole stats yet. That'll be done later, after we
+ * decide which stats to use.
+ */
+ if (*relid != InvalidOid)
+ return root->simple_rel_array[*relid]->mvstatlist;
+
+ return NIL;
+}
+
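+/*
+ * Collect attnums covered by statistics that have functional
+ * dependencies built (other statistics are ignored).
+ */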
+static Bitmapset*
+fdeps_collect_attnums(List *stats)
+{
+ ListCell *lc;
+ Bitmapset *attnums = NULL;
+
+ foreach (lc, stats)
+ {
+ int j;
+ MVStatisticInfo *info = (MVStatisticInfo *)lfirst(lc);
+
+ int2vector *stakeys = info->stakeys;
+
+ /* skip stats without functional dependencies built */
+ if (! info->deps_built)
+ continue;
+
+ for (j = 0; j < stakeys->dim1; j++)
+ attnums = bms_add_member(attnums, stakeys->values[j]);
+ }
+
+ return attnums;
+}
+
+
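+/*
+ * Build an array mapping matrix indexes (0 .. natts-1) to attnums,
+ * in the order the attnums appear in the bitmapset.
+ */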
+static int*
+make_idx_to_attnum_mapping(Bitmapset *attnums)
+{
+ int attidx = 0;
+ int attnum = -1;
+
+ int *mapping = (int*)palloc0(bms_num_members(attnums) * sizeof(int));
+
+ while ((attnum = bms_next_member(attnums, attnum)) >= 0)
+ mapping[attidx++] = attnum;
+
+ Assert(attidx == bms_num_members(attnums));
+
+ return mapping;
+}
+
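+/*
+ * Build the inverse mapping, from attnums to matrix indexes. The
+ * array is sized by the maximum attnum, so it may be sparse.
+ */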
+static int*
+make_attnum_to_idx_mapping(Bitmapset *attnums)
+{
+ int attidx = 0;
+ int attnum = -1;
+ int maxattnum = -1;
+ int *mapping;
+
+ while ((attnum = bms_next_member(attnums, attnum)) >= 0)
+ maxattnum = attnum;
+
+ mapping = (int*)palloc0((maxattnum+1) * sizeof(int));
+
+ attnum = -1;
+ while ((attnum = bms_next_member(attnums, attnum)) >= 0)
+ mapping[attnum] = attidx++;
+
+ Assert(attidx == bms_num_members(attnums));
+
+ return mapping;
+}
+
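+/*
+ * Build the (natts x natts) adjacency matrix, merging functional
+ * dependencies from all the statistics; matrix[a,b] is set to true
+ * whenever (a => b).
+ */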
+static bool*
+build_adjacency_matrix(List *stats, Bitmapset *attnums,
+ int *idx_to_attnum, int *attnum_to_idx)
+{
+ ListCell *lc;
+ int natts = bms_num_members(attnums);
+ bool *matrix = (bool*)palloc0(natts * natts * sizeof(bool));
+
+ foreach (lc, stats)
+ {
+ int j;
+ MVStatisticInfo *stat = (MVStatisticInfo *)lfirst(lc);
+ MVDependencies dependencies = NULL;
+
+ /* skip stats without functional dependencies built */
+ if (! stat->deps_built)
+ continue;
+
+ /* fetch and deserialize dependencies */
+ dependencies = load_mv_dependencies(stat->mvoid);
+ if (dependencies == NULL)
+ {
+ elog(WARNING, "failed to deserialize func deps %d", stat->mvoid);
+ continue;
+ }
+
+ /* set matrix[a,b] to 'true' if 'a=>b' */
+ for (j = 0; j < dependencies->ndeps; j++)
+ {
+ int aidx = attnum_to_idx[dependencies->deps[j]->a];
+ int bidx = attnum_to_idx[dependencies->deps[j]->b];
+
+ /* a=> b */
+ matrix[aidx * natts + bidx] = true;
+ }
+ }
+
+ return matrix;
+}
+
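+/*
+ * Compute the transitive closure of the adjacency matrix by repeated
+ * in-place multiplication, e.g. deriving (a => c) from (a => b) and
+ * (b => c). We stop as soon as a pass adds no new edge, and never
+ * need more than natts passes (the longest possible path).
+ */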
+static void
+multiply_adjacency_matrix(bool *matrix, int natts)
+{
+ int i;
+
+ for (i = 0; i < natts; i++)
+ {
+ int k, l, m;
+ int nchanges = 0;
+
+ /* k => l */
+ for (k = 0; k < natts; k++)
+ {
+ for (l = 0; l < natts; l++)
+ {
+ /* we already have this dependency */
+ if (matrix[k * natts + l])
+ continue;
+
+ /* we don't really care about the exact value, just 0/1 */
+ for (m = 0; m < natts; m++)
+ {
+ if (matrix[k * natts + m] * matrix[m * natts + l])
+ {
+ matrix[k * natts + l] = true;
+ nchanges += 1;
+ break;
+ }
+ }
+ }
+ }
+
+ /* no transitive dependency added here, so terminate */
+ if (nchanges == 0)
+ break;
+ }
+}
+
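+/*
+ * Walk the clauses and drop those implied by another clause through
+ * a (possibly transitive) functional dependency, returning the list
+ * of clauses that remain.
+ */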
+static List*
+fdeps_reduce_clauses(List *clauses, Bitmapset *attnums, bool *matrix,
+ int *idx_to_attnum, int *attnum_to_idx, Index relid)
+{
+ int i;
+ ListCell *lc;
+ List *reduced_clauses = NIL;
+
+ int nmvclauses; /* size of the arrays */
+ bool *reduced;
+ AttrNumber *mvattnums;
+ Node **mvclauses;
+
+ int natts = bms_num_members(attnums);
+
+ /*
+ * Preallocate space for all clauses (the list only contains
+ * compatible clauses at this point). This makes it somewhat easier
+ * to access the stats / attnums randomly.
+ *
+ * XXX This assumes each clause references exactly one Var, so the
+ * arrays are sized accordingly - for functional dependencies
+ * this is safe, because it only works with Var=Const.
+ */
+ mvclauses = (Node**)palloc0(list_length(clauses) * sizeof(Node*));
+ mvattnums = (AttrNumber*)palloc0(list_length(clauses) * sizeof(AttrNumber));
+ reduced = (bool*)palloc0(list_length(clauses) * sizeof(bool));
+
+ /* fill the arrays */
+ nmvclauses = 0;
+ foreach (lc, clauses)
+ {
+ Node * clause = (Node*)lfirst(lc);
+ Bitmapset * attnums = get_varattnos(clause, relid);
+
+ mvclauses[nmvclauses] = clause;
+ mvattnums[nmvclauses] = bms_singleton_member(attnums);
+ nmvclauses++;
+ }
+
+ Assert(nmvclauses == list_length(clauses));
+
+ /* now try to reduce the clauses (using the dependencies) */
+ for (i = 0; i < nmvclauses; i++)
+ {
+ int j;
+
+ /* not covered by dependencies */
+ if (! bms_is_member(mvattnums[i], attnums))
+ continue;
+
+ /* this clause was already reduced, so let's skip it */
+ if (reduced[i])
+ continue;
+
+ /* walk the potentially 'implied' clauses */
+ for (j = 0; j < nmvclauses; j++)
+ {
+ int aidx, bidx;
+
+ /* not covered by dependencies */
+ if (! bms_is_member(mvattnums[j], attnums))
+ continue;
+
+ aidx = attnum_to_idx[mvattnums[i]];
+ bidx = attnum_to_idx[mvattnums[j]];
+
+ /* can't reduce the clause by itself, or if already reduced */
+ if ((i == j) || reduced[j])
+ continue;
+
+ /* mark the clause as reduced (if aidx => bidx) */
+ reduced[j] = matrix[aidx * natts + bidx];
+ }
+ }
+
+ /* now walk through the clauses, and keep only those not reduced */
+ for (i = 0; i < nmvclauses; i++)
+ if (! reduced[i])
+ reduced_clauses = lappend(reduced_clauses, mvclauses[i]);
+
+ pfree(reduced);
+ pfree(mvclauses);
+ pfree(mvattnums);
+
+ return reduced_clauses;
+}
+
+
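+/*
+ * Split the clauses into those that may be reduced using functional
+ * dependencies (returned in deps_clauses, their attnums form the
+ * result bitmapset) and the rest (returned in reduced_clauses).
+ */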
+static Bitmapset *
+fdeps_filter_clauses(PlannerInfo *root,
+ List *clauses, Bitmapset *deps_attnums,
+ List **reduced_clauses, List **deps_clauses,
+ Oid varRelid, Index *relid, SpecialJoinInfo *sjinfo)
+{
+ ListCell *lc;
+ Bitmapset *clause_attnums = NULL;
+
+ foreach (lc, clauses)
+ {
+ AttrNumber attnum;
+ Node *clause = (Node *) lfirst(lc);
+
+ if (! clause_is_mv_compatible(clause, varRelid, relid,
+ &attnum, sjinfo))
+
+ /* clause incompatible with functional dependencies */
+ *reduced_clauses = lappend(*reduced_clauses, clause);
+
+ else if (! bms_is_member(attnum, deps_attnums))
+
+ /* clause not covered by the dependencies */
+ *reduced_clauses = lappend(*reduced_clauses, clause);
+
+ else
+ {
+ *deps_clauses = lappend(*deps_clauses, clause);
+ clause_attnums = bms_add_member(clause_attnums, attnum);
+ }
+ }
+
+ return clause_attnums;
+}
+
+/*
+ * Pull varattnos from the clauses, similarly to pull_varattnos() but:
+ *
+ * (a) only get attributes for a particular relation (relid)
+ * (b) ignore system attributes (we can't build stats on them anyway)
+ *
+ * This makes it possible to directly compare the result with attnum
+ * values from pg_attribute etc.
+ */
+static Bitmapset *
+get_varattnos(Node * node, Index relid)
+{
+ int k;
+ Bitmapset *varattnos = NULL;
+ Bitmapset *result = NULL;
+
+ /* get the varattnos */
+ pull_varattnos(node, relid, &varattnos);
+
+ k = -1;
+ while ((k = bms_next_member(varattnos, k)) >= 0)
+ {
+ if (k + FirstLowInvalidHeapAttributeNumber > 0)
+ result = bms_add_member(result,
+ k + FirstLowInvalidHeapAttributeNumber);
+ }
+
+ bms_free(varattnos);
+
+ return result;
+}
diff --git a/src/backend/utils/mvstats/README.stats b/src/backend/utils/mvstats/README.stats
new file mode 100644
index 0000000..a38ea7b
--- /dev/null
+++ b/src/backend/utils/mvstats/README.stats
@@ -0,0 +1,36 @@
+Multivariate statistics
+=======================
+
+When estimating various quantities (e.g. condition selectivities) the default
+approach relies on the assumption of independence. In practice that's often
+not true, resulting in estimation errors.
+
+Multivariate stats track different types of dependencies between the columns,
+hopefully improving the estimates.
+
+Currently we only have one kind of multivariate statistics, soft functional
+dependencies, which we use to improve estimates of equality clauses. See
+README.dependencies for details.
+
+
+Selectivity estimation
+----------------------
+
+When estimating selectivity, we aim to achieve several things:
+
+ (a) maximize the estimate accuracy
+
+ (b) minimize the overhead, especially when no suitable multivariate stats
+ exist (so if you are not using multivariate stats, there's no overhead)
+
+Thus clauselist_selectivity() performs several inexpensive checks first,
+before even attempting the more expensive estimation.
+
+ (1) check if there are multivariate stats on the relation
+
+ (2) check there are at least two attributes referenced by clauses compatible
+ with multivariate statistics (equality clauses for func. dependencies)
+
+ (3) perform reduction of equality clauses using func. dependencies
+
+ (4) estimate the reduced list of clauses using regular statistics
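+
+For example, with a dependency (a => b) and clauses (a = 1) AND (b = 2),
+steps (1) and (2) are cheap sanity checks, step (3) drops (b = 2) as
+implied by (a = 1), and step (4) estimates only the remaining clause
+(a = 1) using the per-column statistics.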
diff --git a/src/backend/utils/mvstats/common.c b/src/backend/utils/mvstats/common.c
index a755c49..bd200bc 100644
--- a/src/backend/utils/mvstats/common.c
+++ b/src/backend/utils/mvstats/common.c
@@ -84,7 +84,8 @@ build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
/*
* Analyze functional dependencies of columns.
*/
- deps = build_mv_dependencies(numrows, rows, attrs, stats);
+ if (stat->deps_enabled)
+ deps = build_mv_dependencies(numrows, rows, attrs, stats);
/* store the histogram / MCV list in the catalog */
update_mv_stats(stat->mvoid, deps, attrs);
@@ -163,6 +164,7 @@ list_mv_stats(Oid relid)
info->mvoid = HeapTupleGetOid(htup);
info->stakeys = buildint2vector(stats->stakeys.values, stats->stakeys.dim1);
+ info->deps_enabled = stats->deps_enabled;
info->deps_built = stats->deps_built;
result = lappend(result, info);
@@ -274,6 +276,7 @@ compare_scalars_partition(const void *a, const void *b, void *arg)
return ApplySortComparator(da, false, db, false, ssup);
}
+
/* initialize multi-dimensional sort */
MultiSortSupport
multi_sort_init(int ndims)
diff --git a/src/backend/utils/mvstats/dependencies.c b/src/backend/utils/mvstats/dependencies.c
index 2a064a0..c80ba33 100644
--- a/src/backend/utils/mvstats/dependencies.c
+++ b/src/backend/utils/mvstats/dependencies.c
@@ -435,3 +435,27 @@ pg_mv_stats_dependencies_show(PG_FUNCTION_ARGS)
PG_RETURN_TEXT_P(cstring_to_text(result));
}
+
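+/*
+ * Fetch the serialized functional dependencies for the given
+ * statistics OID from the syscache, and deserialize them.
+ */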
+MVDependencies
+load_mv_dependencies(Oid mvoid)
+{
+ bool isnull = false;
+ Datum deps;
+
+ /* Fetch the pg_mv_statistic tuple for the given OID from the syscache. */
+ HeapTuple htup = SearchSysCache1(MVSTATOID, ObjectIdGetDatum(mvoid));
+
+#ifdef USE_ASSERT_CHECKING
+ Form_pg_mv_statistic mvstat = (Form_pg_mv_statistic) GETSTRUCT(htup);
+ Assert(mvstat->deps_enabled && mvstat->deps_built);
+#endif
+
+ deps = SysCacheGetAttr(MVSTATOID, htup,
+ Anum_pg_mv_statistic_stadeps, &isnull);
+
+ Assert(!isnull);
+
+ ReleaseSysCache(htup);
+
+ return deserialize_mv_dependencies(DatumGetByteaP(deps));
+}
diff --git a/src/include/utils/mvstats.h b/src/include/utils/mvstats.h
index 7ebd961..cc43a79 100644
--- a/src/include/utils/mvstats.h
+++ b/src/include/utils/mvstats.h
@@ -17,12 +17,20 @@
#include "fmgr.h"
#include "commands/vacuum.h"
+/*
+ * Degree of how much MCV item / histogram bucket matches a clause.
+ * This is then considered when computing the selectivity.
+ */
+#define MVSTATS_MATCH_NONE 0 /* no match at all */
+#define MVSTATS_MATCH_PARTIAL 1 /* partial match */
+#define MVSTATS_MATCH_FULL 2 /* full match */
#define MVSTATS_MAX_DIMENSIONS 8 /* max number of attributes */
-/* An associative rule, tracking [a => b] dependency.
- *
- * TODO Make this work with multiple columns on both sides.
+
+/*
+ * Functional dependencies, tracking column-level relationships (values
+ * in one column determine values in another one).
*/
typedef struct MVDependencyData {
int16 a;
@@ -48,6 +56,8 @@ typedef MVDependenciesData* MVDependencies;
* stats specified using flags (or something like that).
*/
+MVDependencies load_mv_dependencies(Oid mvoid);
+
bytea * serialize_mv_dependencies(MVDependencies dependencies);
/* deserialization of stats (serialization is private to analyze) */
diff --git a/src/test/regress/expected/mv_dependencies.out b/src/test/regress/expected/mv_dependencies.out
new file mode 100644
index 0000000..e759997
--- /dev/null
+++ b/src/test/regress/expected/mv_dependencies.out
@@ -0,0 +1,172 @@
+-- data type passed by value
+CREATE TABLE functional_dependencies (
+ a INT,
+ b INT,
+ c INT
+);
+-- unknown column
+CREATE STATISTICS s1 ON functional_dependencies (unknown_column) WITH (dependencies);
+ERROR: column "unknown_column" referenced in statistics does not exist
+-- single column
+CREATE STATISTICS s1 ON functional_dependencies (a) WITH (dependencies);
+ERROR: multivariate stats require 2 or more columns
+-- single column, duplicated
+CREATE STATISTICS s1 ON functional_dependencies (a,a) WITH (dependencies);
+ERROR: duplicate column name in statistics definition
+-- two columns, one duplicated
+CREATE STATISTICS s1 ON functional_dependencies (a, a, b) WITH (dependencies);
+ERROR: duplicate column name in statistics definition
+-- unknown option
+CREATE STATISTICS s1 ON functional_dependencies (a, b, c) WITH (unknown_option);
+ERROR: unrecognized STATISTICS option "unknown_option"
+-- correct command
+CREATE STATISTICS s1 ON functional_dependencies (a, b, c) WITH (dependencies);
+-- random data (no functional dependencies)
+INSERT INTO functional_dependencies
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+ANALYZE functional_dependencies;
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+ deps_enabled | deps_built | pg_mv_stats_dependencies_show
+--------------+------------+-------------------------------
+ t | f |
+(1 row)
+
+TRUNCATE functional_dependencies;
+-- a => b, a => c, b => c
+INSERT INTO functional_dependencies
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE functional_dependencies;
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+ deps_enabled | deps_built | pg_mv_stats_dependencies_show
+--------------+------------+-------------------------------
+ t | t | 1 => 2, 1 => 3, 2 => 3
+(1 row)
+
+TRUNCATE functional_dependencies;
+-- a => b, a => c
+INSERT INTO functional_dependencies
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE functional_dependencies;
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+ deps_enabled | deps_built | pg_mv_stats_dependencies_show
+--------------+------------+-------------------------------
+ t | t | 1 => 2, 1 => 3
+(1 row)
+
+TRUNCATE functional_dependencies;
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO functional_dependencies
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX fdeps_idx ON functional_dependencies (a, b);
+ANALYZE functional_dependencies;
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+ deps_enabled | deps_built | pg_mv_stats_dependencies_show
+--------------+------------+-------------------------------
+ t | t | 1 => 2, 1 => 3, 2 => 3
+(1 row)
+
+EXPLAIN (COSTS off)
+ SELECT * FROM functional_dependencies WHERE a = 10 AND b = 5;
+ QUERY PLAN
+---------------------------------------------
+ Bitmap Heap Scan on functional_dependencies
+ Recheck Cond: ((a = 10) AND (b = 5))
+ -> Bitmap Index Scan on fdeps_idx
+ Index Cond: ((a = 10) AND (b = 5))
+(4 rows)
+
+DROP TABLE functional_dependencies;
+-- varlena type (text)
+CREATE TABLE functional_dependencies (
+ a TEXT,
+ b TEXT,
+ c TEXT
+);
+CREATE STATISTICS s2 ON functional_dependencies (a, b, c) WITH (dependencies);
+-- random data (no functional dependencies)
+INSERT INTO functional_dependencies
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+ANALYZE functional_dependencies;
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+ deps_enabled | deps_built | pg_mv_stats_dependencies_show
+--------------+------------+-------------------------------
+ t | f |
+(1 row)
+
+TRUNCATE functional_dependencies;
+-- a => b, a => c, b => c
+INSERT INTO functional_dependencies
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE functional_dependencies;
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+ deps_enabled | deps_built | pg_mv_stats_dependencies_show
+--------------+------------+-------------------------------
+ t | t | 1 => 2, 1 => 3, 2 => 3
+(1 row)
+
+TRUNCATE functional_dependencies;
+-- a => b, a => c
+INSERT INTO functional_dependencies
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE functional_dependencies;
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+ deps_enabled | deps_built | pg_mv_stats_dependencies_show
+--------------+------------+-------------------------------
+ t | t | 1 => 2, 1 => 3
+(1 row)
+
+TRUNCATE functional_dependencies;
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO functional_dependencies
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX fdeps_idx ON functional_dependencies (a, b);
+ANALYZE functional_dependencies;
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+ deps_enabled | deps_built | pg_mv_stats_dependencies_show
+--------------+------------+-------------------------------
+ t | t | 1 => 2, 1 => 3, 2 => 3
+(1 row)
+
+EXPLAIN (COSTS off)
+ SELECT * FROM functional_dependencies WHERE a = '10' AND b = '5';
+ QUERY PLAN
+------------------------------------------------------------
+ Bitmap Heap Scan on functional_dependencies
+ Recheck Cond: ((a = '10'::text) AND (b = '5'::text))
+ -> Bitmap Index Scan on fdeps_idx
+ Index Cond: ((a = '10'::text) AND (b = '5'::text))
+(4 rows)
+
+DROP TABLE functional_dependencies;
+-- NULL values (mix of int and text columns)
+CREATE TABLE functional_dependencies (
+ a INT,
+ b TEXT,
+ c INT,
+ d TEXT
+);
+CREATE STATISTICS s3 ON functional_dependencies (a, b, c, d) WITH (dependencies);
+INSERT INTO functional_dependencies
+ SELECT
+ mod(i, 100),
+ (CASE WHEN mod(i, 200) = 0 THEN NULL ELSE mod(i,200) END),
+ mod(i, 400),
+ (CASE WHEN mod(i, 300) = 0 THEN NULL ELSE mod(i,600) END)
+ FROM generate_series(1,10000) s(i);
+ANALYZE functional_dependencies;
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+ deps_enabled | deps_built | pg_mv_stats_dependencies_show
+--------------+------------+----------------------------------------
+ t | t | 2 => 1, 3 => 1, 3 => 2, 4 => 1, 4 => 2
+(1 row)
+
+DROP TABLE functional_dependencies;
diff --git a/src/test/regress/parallel_schedule b/src/test/regress/parallel_schedule
index bec0316..4f2ffb8 100644
--- a/src/test/regress/parallel_schedule
+++ b/src/test/regress/parallel_schedule
@@ -110,3 +110,6 @@ test: event_trigger
# run stats by itself because its delay may be insufficient under heavy load
test: stats
+
+# run tests of multivariate stats
+test: mv_dependencies
diff --git a/src/test/regress/serial_schedule b/src/test/regress/serial_schedule
index 7e9b319..097a04f 100644
--- a/src/test/regress/serial_schedule
+++ b/src/test/regress/serial_schedule
@@ -162,3 +162,4 @@ test: with
test: xml
test: event_trigger
test: stats
+test: mv_dependencies
diff --git a/src/test/regress/sql/mv_dependencies.sql b/src/test/regress/sql/mv_dependencies.sql
new file mode 100644
index 0000000..48dea4d
--- /dev/null
+++ b/src/test/regress/sql/mv_dependencies.sql
@@ -0,0 +1,150 @@
+-- data type passed by value
+CREATE TABLE functional_dependencies (
+ a INT,
+ b INT,
+ c INT
+);
+
+-- unknown column
+CREATE STATISTICS s1 ON functional_dependencies (unknown_column) WITH (dependencies);
+
+-- single column
+CREATE STATISTICS s1 ON functional_dependencies (a) WITH (dependencies);
+
+-- single column, duplicated
+CREATE STATISTICS s1 ON functional_dependencies (a,a) WITH (dependencies);
+
+-- two columns, one duplicated
+CREATE STATISTICS s1 ON functional_dependencies (a, a, b) WITH (dependencies);
+
+-- unknown option
+CREATE STATISTICS s1 ON functional_dependencies (a, b, c) WITH (unknown_option);
+
+-- correct command
+CREATE STATISTICS s1 ON functional_dependencies (a, b, c) WITH (dependencies);
+
+-- random data (no functional dependencies)
+INSERT INTO functional_dependencies
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+
+ANALYZE functional_dependencies;
+
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+
+TRUNCATE functional_dependencies;
+
+-- a => b, a => c, b => c
+INSERT INTO functional_dependencies
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+
+ANALYZE functional_dependencies;
+
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+
+TRUNCATE functional_dependencies;
+
+-- a => b, a => c
+INSERT INTO functional_dependencies
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE functional_dependencies;
+
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+
+TRUNCATE functional_dependencies;
+
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO functional_dependencies
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX fdeps_idx ON functional_dependencies (a, b);
+ANALYZE functional_dependencies;
+
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+
+EXPLAIN (COSTS off)
+ SELECT * FROM functional_dependencies WHERE a = 10 AND b = 5;
+
+DROP TABLE functional_dependencies;
+
+-- varlena type (text)
+CREATE TABLE functional_dependencies (
+ a TEXT,
+ b TEXT,
+ c TEXT
+);
+
+CREATE STATISTICS s2 ON functional_dependencies (a, b, c) WITH (dependencies);
+
+-- random data (no functional dependencies)
+INSERT INTO functional_dependencies
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+
+ANALYZE functional_dependencies;
+
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+
+TRUNCATE functional_dependencies;
+
+-- a => b, a => c, b => c
+INSERT INTO functional_dependencies
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+
+ANALYZE functional_dependencies;
+
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+
+TRUNCATE functional_dependencies;
+
+-- a => b, a => c
+INSERT INTO functional_dependencies
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE functional_dependencies;
+
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+
+TRUNCATE functional_dependencies;
+
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO functional_dependencies
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX fdeps_idx ON functional_dependencies (a, b);
+ANALYZE functional_dependencies;
+
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+
+EXPLAIN (COSTS off)
+ SELECT * FROM functional_dependencies WHERE a = '10' AND b = '5';
+
+DROP TABLE functional_dependencies;
+
+-- NULL values (mix of int and text columns)
+CREATE TABLE functional_dependencies (
+ a INT,
+ b TEXT,
+ c INT,
+ d TEXT
+);
+
+CREATE STATISTICS s3 ON functional_dependencies (a, b, c, d) WITH (dependencies);
+
+INSERT INTO functional_dependencies
+ SELECT
+ mod(i, 100),
+ (CASE WHEN mod(i, 200) = 0 THEN NULL ELSE mod(i,200) END),
+ mod(i, 400),
+ (CASE WHEN mod(i, 300) = 0 THEN NULL ELSE mod(i,600) END)
+ FROM generate_series(1,10000) s(i);
+
+ANALYZE functional_dependencies;
+
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+
+DROP TABLE functional_dependencies;
--
2.1.0
Attachment: 0004-multivariate-MCV-lists.patch
From c63618ac6f0696f6c863cfb1b048b6ecdc611b97 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@pgaddict.com>
Date: Mon, 6 Apr 2015 16:52:15 +0200
Subject: [PATCH 4/9] multivariate MCV lists
- extends the pg_mv_statistic catalog (add 'mcv' fields)
- building the MCV lists during ANALYZE
- simple estimation while planning the queries
Includes regression tests, mostly mirroring the regression tests for
functional dependencies.
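
Usage sketch (the table, the statistics name and the max_mcv_items value
are illustrative only; the value must lie within the allowed min/max range):

    CREATE STATISTICS s1 ON t (a, b) WITH (mcv, max_mcv_items = 100);
    ANALYZE t;

    -- clauses on (a, b) may now be estimated using the MCV list
    SELECT * FROM t WHERE a = 1 AND b = 2;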
---
doc/src/sgml/ref/create_statistics.sgml | 18 +
src/backend/catalog/system_views.sql | 4 +-
src/backend/commands/statscmds.c | 45 +-
src/backend/nodes/outfuncs.c | 2 +
src/backend/optimizer/path/clausesel.c | 1032 ++++++++++++++++++++++++++---
src/backend/optimizer/util/plancat.c | 4 +-
src/backend/utils/mvstats/Makefile | 2 +-
src/backend/utils/mvstats/README.mcv | 137 ++++
src/backend/utils/mvstats/README.stats | 89 ++-
src/backend/utils/mvstats/common.c | 104 ++-
src/backend/utils/mvstats/common.h | 11 +-
src/backend/utils/mvstats/mcv.c | 1094 +++++++++++++++++++++++++++++++
src/bin/psql/describe.c | 25 +-
src/include/catalog/pg_mv_statistic.h | 18 +-
src/include/catalog/pg_proc.h | 4 +
src/include/nodes/relation.h | 2 +
src/include/utils/mvstats.h | 69 +-
src/test/regress/expected/mv_mcv.out | 207 ++++++
src/test/regress/expected/rules.out | 4 +-
src/test/regress/parallel_schedule | 2 +-
src/test/regress/serial_schedule | 1 +
src/test/regress/sql/mv_mcv.sql | 178 +++++
22 files changed, 2936 insertions(+), 116 deletions(-)
create mode 100644 src/backend/utils/mvstats/README.mcv
create mode 100644 src/backend/utils/mvstats/mcv.c
create mode 100644 src/test/regress/expected/mv_mcv.out
create mode 100644 src/test/regress/sql/mv_mcv.sql
diff --git a/doc/src/sgml/ref/create_statistics.sgml b/doc/src/sgml/ref/create_statistics.sgml
index a86eae3..193e4b0 100644
--- a/doc/src/sgml/ref/create_statistics.sgml
+++ b/doc/src/sgml/ref/create_statistics.sgml
@@ -132,6 +132,24 @@ CREATE STATISTICS [ IF NOT EXISTS ] <replaceable class="PARAMETER">statistics_na
</listitem>
</varlistentry>
+ <varlistentry>
+ <term><literal>max_mcv_items</> (<type>integer</>)</term>
+ <listitem>
+ <para>
+ Maximum number of MCV list items.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><literal>mcv</> (<type>boolean</>)</term>
+ <listitem>
+ <para>
+ Enables MCV list for the statistics.
+ </para>
+ </listitem>
+ </varlistentry>
+
</variablelist>
</refsect2>
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index b8a264e..2d570ee 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -165,7 +165,9 @@ CREATE VIEW pg_mv_stats AS
S.staname AS staname,
S.stakeys AS attnums,
length(S.stadeps) as depsbytes,
- pg_mv_stats_dependencies_info(S.stadeps) as depsinfo
+ pg_mv_stats_dependencies_info(S.stadeps) as depsinfo,
+ length(S.stamcv) AS mcvbytes,
+ pg_mv_stats_mcvlist_info(S.stamcv) AS mcvinfo
FROM (pg_mv_statistic S JOIN pg_class C ON (C.oid = S.starelid))
LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace);
diff --git a/src/backend/commands/statscmds.c b/src/backend/commands/statscmds.c
index 84a8b13..90bfaed 100644
--- a/src/backend/commands/statscmds.c
+++ b/src/backend/commands/statscmds.c
@@ -136,7 +136,13 @@ CreateStatistics(CreateStatsStmt *stmt)
ObjectAddress parentobject, childobject;
/* by default build nothing */
- bool build_dependencies = false;
+ bool build_dependencies = false,
+ build_mcv = false;
+
+ int32 max_mcv_items = -1;
+
+ /* options required because of other options */
+ bool require_mcv = false;
Assert(IsA(stmt, CreateStatsStmt));
@@ -212,6 +218,29 @@ CreateStatistics(CreateStatsStmt *stmt)
if (strcmp(opt->defname, "dependencies") == 0)
build_dependencies = defGetBoolean(opt);
+ else if (strcmp(opt->defname, "mcv") == 0)
+ build_mcv = defGetBoolean(opt);
+ else if (strcmp(opt->defname, "max_mcv_items") == 0)
+ {
+ max_mcv_items = defGetInt32(opt);
+
+ /* this option requires 'mcv' to be enabled */
+ require_mcv = true;
+
+ /* sanity check */
+ if (max_mcv_items < MVSTAT_MCVLIST_MIN_ITEMS)
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("max number of MCV items must be at least %d",
+ MVSTAT_MCVLIST_MIN_ITEMS)));
+
+ else if (max_mcv_items > MVSTAT_MCVLIST_MAX_ITEMS)
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("max number of MCV items is %d",
+ MVSTAT_MCVLIST_MAX_ITEMS)));
+
+ }
else
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
@@ -220,10 +249,16 @@ CreateStatistics(CreateStatsStmt *stmt)
}
/* check that at least some statistics were requested */
- if (! build_dependencies)
+ if (! (build_dependencies || build_mcv))
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("no statistics type (dependencies, mcv) was requested")));
+
+ /* now do some checking of the options */
+ if (require_mcv && (! build_mcv))
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
- errmsg("no statistics type (dependencies) was requested")));
+ errmsg("option 'mcv' is required by other options(s)")));
/* sort the attnums and build int2vector */
qsort(attnums, numcols, sizeof(int16), compare_int16);
@@ -243,8 +278,12 @@ CreateStatistics(CreateStatsStmt *stmt)
values[Anum_pg_mv_statistic_stakeys -1] = PointerGetDatum(stakeys);
values[Anum_pg_mv_statistic_deps_enabled -1] = BoolGetDatum(build_dependencies);
+ values[Anum_pg_mv_statistic_mcv_enabled -1] = BoolGetDatum(build_mcv);
+
+ values[Anum_pg_mv_statistic_mcv_max_items -1] = Int32GetDatum(max_mcv_items);
nulls[Anum_pg_mv_statistic_stadeps -1] = true;
+ nulls[Anum_pg_mv_statistic_stamcv -1] = true;
/* insert the tuple into pg_mv_statistic */
mvstatrel = heap_open(MvStatisticRelationId, RowExclusiveLock);
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index 474d2c7..e3983fd 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -1977,9 +1977,11 @@ _outMVStatisticInfo(StringInfo str, const MVStatisticInfo *node)
/* enabled statistics */
WRITE_BOOL_FIELD(deps_enabled);
+ WRITE_BOOL_FIELD(mcv_enabled);
/* built/available statistics */
WRITE_BOOL_FIELD(deps_built);
+ WRITE_BOOL_FIELD(mcv_built);
}
static void
diff --git a/src/backend/optimizer/path/clausesel.c b/src/backend/optimizer/path/clausesel.c
index c11aa3b..ce7d231 100644
--- a/src/backend/optimizer/path/clausesel.c
+++ b/src/backend/optimizer/path/clausesel.c
@@ -15,6 +15,7 @@
#include "postgres.h"
#include "access/sysattr.h"
+#include "catalog/pg_collation.h"
#include "catalog/pg_operator.h"
#include "nodes/makefuncs.h"
#include "optimizer/clauses.h"
@@ -47,20 +48,41 @@ static void addRangeClause(RangeQueryClause **rqlist, Node *clause,
bool varonleft, bool isLTsel, Selectivity s2);
#define MV_CLAUSE_TYPE_FDEP 0x01
+#define MV_CLAUSE_TYPE_MCV 0x02
static bool clause_is_mv_compatible(Node *clause, Oid varRelid,
- Index *relid, AttrNumber *attnum, SpecialJoinInfo *sjinfo);
+ Index *relid, Bitmapset **attnums, SpecialJoinInfo *sjinfo,
+ int type);
static Bitmapset *collect_mv_attnums(List *clauses,
- Oid varRelid, Index *relid, SpecialJoinInfo *sjinfo);
+ Oid varRelid, Index *relid, SpecialJoinInfo *sjinfo,
+ int type);
static int count_mv_attnums(List *clauses, Oid varRelid,
- SpecialJoinInfo *sjinfo);
+ SpecialJoinInfo *sjinfo, int type);
static List *clauselist_apply_dependencies(PlannerInfo *root, List *clauses,
Oid varRelid, List *stats,
SpecialJoinInfo *sjinfo);
+static MVStatisticInfo *choose_mv_statistics(List *mvstats, Bitmapset *attnums);
+
+static List *clauselist_mv_split(PlannerInfo *root, SpecialJoinInfo *sjinfo,
+ List *clauses, Oid varRelid,
+ List **mvclauses, MVStatisticInfo *mvstats, int types);
+
+static Selectivity clauselist_mv_selectivity(PlannerInfo *root,
+ List *clauses, MVStatisticInfo *mvstats);
+static Selectivity clauselist_mv_selectivity_mcvlist(PlannerInfo *root,
+ List *clauses, MVStatisticInfo *mvstats,
+ bool *fullmatch, Selectivity *lowsel);
+
+static int update_match_bitmap_mcvlist(PlannerInfo *root, List *clauses,
+ int2vector *stakeys, MCVList mcvlist,
+ int nmatches, char * matches,
+ Selectivity *lowsel, bool *fullmatch,
+ bool is_or);
+
static bool has_stats(List *stats, int type);
static List * find_stats(PlannerInfo *root, List *clauses,
@@ -88,6 +110,13 @@ static Bitmapset *fdeps_filter_clauses(PlannerInfo *root,
static Bitmapset * get_varattnos(Node * node, Index relid);
+/* used for merging bitmaps - AND (min), OR (max) */
+#define MAX(x, y) (((x) > (y)) ? (x) : (y))
+#define MIN(x, y) (((x) < (y)) ? (x) : (y))
+
+#define UPDATE_RESULT(m,r,isor) \
+ (m) = (isor) ? (MAX(m,r)) : (MIN(m,r))
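+
+/*
+ * For example, merging a sub-result 'r' into 'm' for an OR-clause keeps
+ * the maximum (a match wins), while for an AND-clause it keeps the
+ * minimum (a mismatch wins).
+ */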
+
/****************************************************************************
* ROUTINES TO COMPUTE SELECTIVITIES
****************************************************************************/
@@ -113,11 +142,13 @@ static Bitmapset * get_varattnos(Node * node, Index relid);
* to verify that suitable multivariate statistics exist.
*
* If we identify such multivariate statistics apply, we try to apply them.
- * Currently we only have (soft) functional dependencies, so we try to reduce
- * the list of clauses.
*
- * Then we remove the clauses estimated using multivariate stats, and process
- * the rest of the clauses using the regular per-column stats.
+ * First we try to reduce the list of clauses by applying (soft) functional
+ * dependencies, and then we try to estimate the selectivity of the reduced
+ * list of clauses using the multivariate MCV list.
+ *
+ * Finally we remove the portion of clauses estimated using multivariate stats,
+ * and process the rest of the clauses using the regular per-column stats.
*
* Currently, the only extra smarts we have is to recognize "range queries",
* such as "x > 34 AND x < 42". Clauses are recognized as possible range
@@ -187,13 +218,49 @@ clauselist_selectivity(PlannerInfo *root,
* that need to be estimated by other types of stats (MCV, histograms etc).
*/
if (has_stats(stats, MV_CLAUSE_TYPE_FDEP) &&
- (count_mv_attnums(clauses, varRelid, sjinfo) >= 2))
+ (count_mv_attnums(clauses, varRelid, sjinfo, MV_CLAUSE_TYPE_FDEP) >= 2))
{
clauses = clauselist_apply_dependencies(root, clauses, varRelid,
stats, sjinfo);
}
/*
+ * Check that there are statistics with MCV list or histogram, and also the
+ * number of attributes covered by these types of statistics.
+ *
+ * If there are no such stats or not enough attributes, don't waste time
+ * with the multivariate code and simply skip to estimation using the
+ * regular per-column stats.
+ */
+ if (has_stats(stats, MV_CLAUSE_TYPE_MCV) &&
+ (count_mv_attnums(clauses, varRelid, sjinfo, MV_CLAUSE_TYPE_MCV) >= 2))
+ {
+ /* collect attributes from the compatible conditions */
+ Bitmapset *mvattnums = collect_mv_attnums(clauses, varRelid, NULL, sjinfo,
+ MV_CLAUSE_TYPE_MCV);
+
+ /* and search for the statistic covering the most attributes */
+ MVStatisticInfo *mvstat = choose_mv_statistics(stats, mvattnums);
+
+ if (mvstat != NULL) /* we have a matching stats */
+ {
+ /* clauses compatible with multi-variate stats */
+ List *mvclauses = NIL;
+
+ /* split the clauselist into regular and mv-clauses */
+ clauses = clauselist_mv_split(root, sjinfo, clauses,
+ varRelid, &mvclauses, mvstat,
+ MV_CLAUSE_TYPE_MCV);
+
+ /* we've chosen the statistics to match the clauses */
+ Assert(mvclauses != NIL);
+
+ /* compute the selectivity using the multivariate stats */
+ s1 *= clauselist_mv_selectivity(root, mvclauses, mvstat);
+ }
+ }
+
+ /*
* Initial scan over clauses. Anything that doesn't look like a potential
* rangequery clause gets multiplied into s1 and forgotten. Anything that
* does gets inserted into an rqlist entry.
@@ -850,12 +917,75 @@ clause_selectivity(PlannerInfo *root,
return s1;
}
+
+/*
+ * estimate selectivity of clauses using multivariate statistic
+ *
+ * Perform estimation of the clauses using a MCV list.
+ *
+ * This assumes all the clauses are compatible with the selected statistics
+ * (e.g. only reference columns covered by the statistics, use supported
+ * operator, etc.).
+ *
+ * TODO We may support some additional conditions, most importantly those
+ * matching multiple columns (e.g. "a = b" or "a < b").
+ *
+ * TODO Clamp the selectivity by min of the per-clause selectivities (i.e. the
+ * selectivity of the most restrictive clause), because that's the maximum
+ * we can ever get from ANDed list of clauses. This may probably prevent
+ * issues with hitting too many buckets and low precision histograms.
+ *
+ * TODO We may remember the lowest frequency in the MCV list, and then later use
+ * it as an upper boundary for the selectivity (had there been a more
+ * frequent item, it'd be in the MCV list). This might improve cases with
+ * low-detail histograms.
+ *
+ * TODO We may also derive some additional boundaries for the selectivity from
+ * the MCV list, because
+ *
+ * (a) if we have a "full equality condition" (one equality condition on
+ * each column of the statistic) and we found a match in the MCV list,
+ * then this is the final selectivity (and pretty accurate),
+ *
+ * (b) if we have a "full equality condition" and we haven't found a match
+ * in the MCV list, then the selectivity is below the lowest frequency
+ * found in the MCV list,
+ *
+ * TODO When applying the clauses to the histogram/MCV list, we can do
+ * that from the most selective clauses first, because that'll
+ * eliminate the buckets/items sooner (so we'll be able to skip
+ * them without inspection, which is more expensive). But this
+ * requires really knowing the per-clause selectivities in advance,
+ * and that's not what we do now.
+ */
+static Selectivity
+clauselist_mv_selectivity(PlannerInfo *root, List *clauses, MVStatisticInfo *mvstats)
+{
+ bool fullmatch = false;
+
+ /*
+ * Lowest frequency in the MCV list (may be used as an upper bound
+ * for full equality conditions that did not match any MCV item).
+ */
+ Selectivity mcv_low = 0.0;
+
+ /* TODO Evaluate simple 1D selectivities, use the smallest one as
+ * an upper bound, product as lower bound, and sort the
+ * clauses in ascending order by selectivity (to optimize the
+ * MCV/histogram evaluation).
+ */
+
+ /* Evaluate the MCV selectivity */
+ return clauselist_mv_selectivity_mcvlist(root, clauses, mvstats,
+ &fullmatch, &mcv_low);
+}
+
/*
* Collect attributes from mv-compatible clauses.
*/
static Bitmapset *
collect_mv_attnums(List *clauses, Oid varRelid,
- Index *relid, SpecialJoinInfo *sjinfo)
+ Index *relid, SpecialJoinInfo *sjinfo, int types)
{
Bitmapset *attnums = NULL;
ListCell *l;
@@ -871,12 +1001,11 @@ collect_mv_attnums(List *clauses, Oid varRelid,
*/
foreach (l, clauses)
{
- AttrNumber attnum;
Node *clause = (Node *) lfirst(l);
- /* ignore the result for now - we only need the info */
- if (clause_is_mv_compatible(clause, varRelid, relid, &attnum, sjinfo))
- attnums = bms_add_member(attnums, attnum);
+ /* ignore the result here - we only need the attnums */
+ clause_is_mv_compatible(clause, varRelid, relid, &attnums,
+ sjinfo, types);
}
/*
@@ -898,11 +1027,11 @@ collect_mv_attnums(List *clauses, Oid varRelid,
* Count the number of attributes in clauses compatible with multivariate stats.
*/
static int
-count_mv_attnums(List *clauses, Oid varRelid, SpecialJoinInfo *sjinfo)
+count_mv_attnums(List *clauses, Oid varRelid, SpecialJoinInfo *sjinfo, int type)
{
int c;
Bitmapset *attnums = collect_mv_attnums(clauses, varRelid,
- NULL, sjinfo);
+ NULL, sjinfo, type);
c = bms_num_members(attnums);
@@ -912,6 +1041,188 @@ count_mv_attnums(List *clauses, Oid varRelid, SpecialJoinInfo *sjinfo)
}
/*
+ * We're looking for statistics matching at least 2 attributes,
+ * referenced in the clauses compatible with multivariate statistics.
+ * The current selection criterion is very simple - we choose the
+ * statistics referencing the most attributes.
+ *
+ * If there are multiple statistics referencing the same number of
+ * columns (from the clauses), the one with fewer source columns
+ * (as listed in the ADD STATISTICS when creating the statistics) wins.
+ * Otherwise the first one wins.
+ *
+ * This is a very simple criterion, and it has several weaknesses:
+ *
+ * (a) does not consider the accuracy of the statistics
+ *
+ * If there are two histograms built on the same set of columns,
+ * but one has 100 buckets and the other one has 1000 buckets (thus
+ * likely providing better estimates), this is not currently
+ * considered.
+ *
+ * (b) does not consider the type of statistics
+ *
+ * If there are three statistics - one containing just a MCV list,
+ * another one with just a histogram and a third one with both,
+ * this is not considered.
+ *
+ * (c) does not consider the number of clauses
+ *
+ * As explained, only the number of referenced attributes counts,
+ * so if there are multiple clauses on a single attribute, this
+ * still counts as a single attribute.
+ *
+ * (d) does not consider type of condition
+ *
+ * Some clauses may work better with some statistics - for example
+ * equality clauses probably work better with MCV lists than with
+ * histograms. But IS [NOT] NULL conditions may often work better
+ * with histograms (thanks to NULL-buckets).
+ *
+ * So for example with five WHERE conditions
+ *
+ * WHERE (a = 1) AND (b = 1) AND (c = 1) AND (d = 1) AND (e = 1)
+ *
+ * and statistics on (a,b), (a,b,e) and (a,b,c,d), the last one will be
+ * selected as it references the most columns.
+ *
+ * Once we have selected the multivariate statistics, we split the list
+ * of clauses into two parts - conditions that are compatible with the
+ * selected stats, and conditions that will be estimated using simple statistics.
+ *
+ * From the example above, conditions
+ *
+ * (a = 1) AND (b = 1) AND (c = 1) AND (d = 1)
+ *
+ * will be estimated using the multivariate statistics (a,b,c,d) while
+ * the last condition (e = 1) will get estimated using the regular ones.
+ *
+ * There are various alternative selection criteria (e.g. counting
+ * conditions instead of just referenced attributes), but eventually
+ * the best option should be to combine multiple statistics. But that's
+ * much harder to do correctly.
+ *
+ * TODO Select multiple statistics and combine them when computing
+ * the estimate.
+ *
+ * TODO This will probably have to consider compatibility of clauses,
+ * because 'dependencies' will probably work only with equality
+ * clauses.
+ */
+static MVStatisticInfo *
+choose_mv_statistics(List *stats, Bitmapset *attnums)
+{
+ int i;
+ ListCell *lc;
+
+ MVStatisticInfo *choice = NULL;
+
+ int current_matches = 1; /* goal #1: maximize */
+ int current_dims = (MVSTATS_MAX_DIMENSIONS+1); /* goal #2: minimize */
+
+ /*
+ * Walk through the statistics (simple array with nmvstats elements)
+ * and for each one count the referenced attributes (encoded in
+ * the 'attnums' bitmap).
+ */
+ foreach (lc, stats)
+ {
+ MVStatisticInfo *info = (MVStatisticInfo *)lfirst(lc);
+
+ /* columns matching this statistics */
+ int matches = 0;
+
+ int2vector * attrs = info->stakeys;
+ int numattrs = attrs->dim1;
+
+ /* skip dependencies-only stats */
+ if (! info->mcv_built)
+ continue;
+
+ /* count columns covered by the statistics */
+ for (i = 0; i < numattrs; i++)
+ if (bms_is_member(attrs->values[i], attnums))
+ matches++;
+
+ /*
+ * Use this statistics when it improves the number of matches or
+ * when it matches the same number of attributes but is smaller.
+ */
+ if ((matches > current_matches) ||
+ ((matches == current_matches) && (current_dims > numattrs)))
+ {
+ choice = info;
+ current_matches = matches;
+ current_dims = numattrs;
+ }
+ }
+
+ return choice;
+}
+
+
+/*
+ * This splits the clause list into two parts - one containing clauses
+ * that will be evaluated using the chosen statistics, and the remaining
+ * clauses (either not mv-compatible, or not related to the chosen stats).
+ */
+static List *
+clauselist_mv_split(PlannerInfo *root, SpecialJoinInfo *sjinfo,
+ List *clauses, Oid varRelid, List **mvclauses,
+ MVStatisticInfo *mvstats, int types)
+{
+ int i;
+ ListCell *l;
+ List *non_mvclauses = NIL;
+
+ /* FIXME is there a better way to get info on int2vector? */
+ int2vector * attrs = mvstats->stakeys;
+ int numattrs = mvstats->stakeys->dim1;
+
+ Bitmapset *mvattnums = NULL;
+
+ /* build bitmap of attributes covered by the stats, so we can
+ * do bms_is_subset later */
+ for (i = 0; i < numattrs; i++)
+ mvattnums = bms_add_member(mvattnums, attrs->values[i]);
+
+ /* erase the list of mv-compatible clauses */
+ *mvclauses = NIL;
+
+ foreach (l, clauses)
+ {
+ bool match = false; /* by default not mv-compatible */
+ Bitmapset *attnums = NULL;
+ Node *clause = (Node *) lfirst(l);
+
+ if (clause_is_mv_compatible(clause, varRelid, NULL,
+ &attnums, sjinfo, types))
+ {
+ /* are all the attributes part of the selected stats? */
+ if (bms_is_subset(attnums, mvattnums))
+ match = true;
+ }
+
+ /*
+ * The clause matches the selected stats, so put it on the list
+ * of mv-compatible clauses. Otherwise, keep it in the list of
+ * 'regular' clauses (that may be selected later).
+ */
+ if (match)
+ *mvclauses = lappend(*mvclauses, clause);
+ else
+ non_mvclauses = lappend(non_mvclauses, clause);
+ }
+
+ /*
+ * Perform regular estimation using the clauses incompatible
+ * with the chosen histogram (or MV stats in general).
+ */
+ return non_mvclauses;
+
+}
+
+/*
* Determines whether the clause is compatible with multivariate stats,
* and if it is, returns some additional information - varno (index
* into simple_rte_array) and a bitmap of attributes. This is then
@@ -930,8 +1241,12 @@ count_mv_attnums(List *clauses, Oid varRelid, SpecialJoinInfo *sjinfo)
*/
static bool
clause_is_mv_compatible(Node *clause, Oid varRelid,
- Index *relid, AttrNumber *attnum, SpecialJoinInfo *sjinfo)
+ Index *relid, Bitmapset **attnums, SpecialJoinInfo *sjinfo,
+ int types)
{
+ Relids clause_relids;
+ Relids left_relids;
+ Relids right_relids;
if (IsA(clause, RestrictInfo))
{
@@ -941,83 +1256,176 @@ clause_is_mv_compatible(Node *clause, Oid varRelid,
if (rinfo->pseudoconstant)
return false;
- /* no support for OR clauses at this point */
- if (rinfo->orclause)
- return false;
-
/* get the actual clause from the RestrictInfo (it's not an OR clause) */
clause = (Node*)rinfo->clause;
- /* only simple opclauses are compatible with multivariate stats */
- if (! is_opclause(clause))
- return false;
-
/* we don't support join conditions at this moment */
if (treat_as_join_clause(clause, rinfo, varRelid, sjinfo))
return false;
+ clause_relids = rinfo->clause_relids;
+ left_relids = rinfo->left_relids;
+ right_relids = rinfo->right_relids;
+ }
+ else if (is_opclause(clause) && list_length(((OpExpr *) clause)->args) == 2)
+ {
+ left_relids = pull_varnos(get_leftop((Expr*)clause));
+ right_relids = pull_varnos(get_rightop((Expr*)clause));
+
+ clause_relids = bms_union(left_relids,
+ right_relids);
+ }
+ else
+ {
+ /* Not a binary opclause, so mark left/right relid sets as empty */
+ left_relids = NULL;
+ right_relids = NULL;
+ /* and get the total relid set the hard way */
+ clause_relids = pull_varnos((Node *) clause);
+ }
+
+ /*
+ * Only simple opclauses and IS NULL tests are compatible with
+ * multivariate stats at this point.
+ */
+ if ((is_opclause(clause))
+ && (list_length(((OpExpr *) clause)->args) == 2))
+ {
+ OpExpr *expr = (OpExpr *) clause;
+ bool varonleft = true;
+ bool ok;
+
/* is it 'variable op constant' ? */
- if (list_length(((OpExpr *) clause)->args) == 2)
+
+ ok = (bms_membership(clause_relids) == BMS_SINGLETON) &&
+ (is_pseudo_constant_clause_relids(lsecond(expr->args),
+ right_relids) ||
+ (varonleft = false,
+ is_pseudo_constant_clause_relids(linitial(expr->args),
+ left_relids)));
+
+ if (ok)
{
- OpExpr *expr = (OpExpr *) clause;
- bool varonleft = true;
- bool ok;
+ Var * var = (varonleft) ? linitial(expr->args) : lsecond(expr->args);
- ok = (bms_membership(rinfo->clause_relids) == BMS_SINGLETON) &&
- (is_pseudo_constant_clause_relids(lsecond(expr->args),
- rinfo->right_relids) ||
- (varonleft = false,
- is_pseudo_constant_clause_relids(linitial(expr->args),
- rinfo->left_relids)));
+ /*
+ * Simple variables only - otherwise the planner_rt_fetch seems to fail
+ * (return NULL).
+ *
+ * TODO Maybe using examine_variable() would fix that?
+ */
+ if (! (IsA(var, Var) && (varRelid == 0 || varRelid == var->varno)))
+ return false;
- if (ok)
- {
- Var * var = (varonleft) ? linitial(expr->args) : lsecond(expr->args);
+ /*
+ * Only consider this variable if (varRelid == 0) or when the varno
+ * matches varRelid (see explanation at clause_selectivity).
+ *
+ * FIXME I suspect this may not be really necessary. The (varRelid == 0)
+ * part seems to be enforced by treat_as_join_clause().
+ */
+ if (! ((varRelid == 0) || (varRelid == var->varno)))
+ return false;
- /*
- * Simple variables only - otherwise the planner_rt_fetch seems to fail
- * (return NULL).
- *
- * TODO Maybe use examine_variable() would fix that?
- */
- if (! (IsA(var, Var) && (varRelid == 0 || varRelid == var->varno)))
- return false;
+ /* Also skip special varno values, and system attributes ... */
+ if ((IS_SPECIAL_VARNO(var->varno)) || (! AttrNumberIsForUserDefinedAttr(var->varattno)))
+ return false;
- /*
- * Only consider this variable if (varRelid == 0) or when the varno
- * matches varRelid (see explanation at clause_selectivity).
- *
- * FIXME I suspect this may not be really necessary. The (varRelid == 0)
- * part seems to be enforced by treat_as_join_clause().
- */
- if (! ((varRelid == 0) || (varRelid == var->varno)))
- return false;
+ /* Lookup info about the base relation (we need to pass the OID out) */
+ if (relid != NULL)
+ *relid = var->varno;
- /* Also skip special varno values, and system attributes ... */
- if ((IS_SPECIAL_VARNO(var->varno)) || (! AttrNumberIsForUserDefinedAttr(var->varattno)))
- return false;
+ /*
+ * If it's not a "<" or ">" or "=" operator, just ignore the
+ * clause. Otherwise note the relid and attnum for the variable.
+ * This uses the function for estimating selectivity, not the
+ * operator directly (a bit awkward, but well ...).
+ */
+ switch (get_oprrest(expr->opno))
+ {
+ case F_SCALARLTSEL:
+ case F_SCALARGTSEL:
+ /* not compatible with functional dependencies */
+ if (types & MV_CLAUSE_TYPE_MCV)
+ {
+ *attnums = bms_add_member(*attnums, var->varattno);
+ return (types & MV_CLAUSE_TYPE_MCV);
+ }
+ return false;
+
+ case F_EQSEL:
+ *attnums = bms_add_member(*attnums, var->varattno);
+ return true;
+ }
+ }
+ }
+ else if (IsA(clause, NullTest)
+ && IsA(((NullTest*)clause)->arg, Var))
+ {
+ Var * var = (Var*)((NullTest*)clause)->arg;
- if (relid)
- *relid = var->varno;
+ /*
+ * Simple variables only - otherwise the planner_rt_fetch seems to fail
+ * (return NULL).
+ *
+ * TODO Maybe using examine_variable() would fix that?
+ */
+ if (! (IsA(var, Var) && (varRelid == 0 || varRelid == var->varno)))
+ return false;
- /*
- * If it's not a "<" or ">" or "=" operator, just ignore the
- * clause. Otherwise note the relid and attnum for the variable.
- * This uses the function for estimating selectivity, ont the
- * operator directly (a bit awkward, but well ...).
- */
- switch (get_oprrest(expr->opno))
- {
- case F_EQSEL:
- *attnum = var->varattno;
- return true;
- }
- }
+ /*
+ * Only consider this variable if (varRelid == 0) or when the varno
+ * matches varRelid (see explanation at clause_selectivity).
+ *
+ * FIXME I suspect this may not be really necessary. The (varRelid == 0)
+ * part seems to be enforced by treat_as_join_clause().
+ */
+ if (! ((varRelid == 0) || (varRelid == var->varno)))
+ return false;
+
+ /* Also skip special varno values, and system attributes ... */
+ if ((IS_SPECIAL_VARNO(var->varno)) || (! AttrNumberIsForUserDefinedAttr(var->varattno)))
+ return false;
+
+ /* Lookup info about the base relation (we need to pass the OID out) */
+ if (relid != NULL)
+ *relid = var->varno;
+
+ *attnums = bms_add_member(*attnums, var->varattno);
+
+ return true;
+ }
+ else if (or_clause(clause) || and_clause(clause))
+ {
+ /*
+ * AND/OR-clauses are supported if all sub-clauses are supported
+ *
+ * TODO We might support mixed case, where some of the clauses
+ * are supported and some are not, and treat all supported
+ * subclauses as a single clause, compute its selectivity
+ * using mv stats, and compute the total selectivity using
+ * the current algorithm.
+ *
+ * TODO For RestrictInfo above an OR-clause, we might use the
+ * orclause with nested RestrictInfo - we won't have to
+ * call pull_varnos() for each clause, saving time.
+ */
+ Bitmapset *tmp = NULL;
+ ListCell *l;
+ foreach (l, ((BoolExpr*)clause)->args)
+ {
+ if (! clause_is_mv_compatible((Node*)lfirst(l),
+ varRelid, relid, &tmp, sjinfo, types))
+ return false;
}
+
+ /* add the attnums from the AND/OR-clause to the set of attnums */
+ *attnums = bms_join(*attnums, tmp);
+
+ return true;
}
return false;
-
}
/*
@@ -1240,6 +1648,9 @@ has_stats(List *stats, int type)
if ((type & MV_CLAUSE_TYPE_FDEP) && stat->deps_built)
return true;
+
+ if ((type & MV_CLAUSE_TYPE_MCV) && stat->mcv_built)
+ return true;
}
return false;
@@ -1535,25 +1946,39 @@ fdeps_filter_clauses(PlannerInfo *root,
foreach (lc, clauses)
{
- AttrNumber attnum;
+ Bitmapset *attnums = NULL;
Node *clause = (Node *) lfirst(lc);
- if (! clause_is_mv_compatible(clause, varRelid, relid,
- &attnum, sjinfo))
+ if (! clause_is_mv_compatible(clause, varRelid, relid, &attnums,
+ sjinfo, MV_CLAUSE_TYPE_FDEP))
/* clause incompatible with functional dependencies */
*reduced_clauses = lappend(*reduced_clauses, clause);
- else if (! bms_is_member(attnum, deps_attnums))
+ else if (bms_num_members(attnums) > 1)
+
+ /*
+ * clause referencing multiple attributes (strange - shouldn't
+ * this be handled by clause_is_mv_compatible directly?)
+ */
+ *reduced_clauses = lappend(*reduced_clauses, clause);
+
+ else if (! bms_is_member(bms_singleton_member(attnums), deps_attnums))
/* clause not covered by the dependencies */
*reduced_clauses = lappend(*reduced_clauses, clause);
else
{
+ /* ok, clause compatible with existing dependencies */
+ Assert(bms_num_members(attnums) == 1);
+
*deps_clauses = lappend(*deps_clauses, clause);
- clause_attnums = bms_add_member(clause_attnums, attnum);
+ clause_attnums = bms_add_member(clause_attnums,
+ bms_singleton_member(attnums));
}
+
+ bms_free(attnums);
}
return clause_attnums;
@@ -1591,3 +2016,454 @@ get_varattnos(Node * node, Index relid)
return result;
}
+
+/*
+ * Estimate selectivity of clauses using a MCV list.
+ *
+ * If there's no MCV list for the stats, the function returns 0.0.
+ *
+ * While computing the estimate, the function checks whether all the
+ * columns were matched with an equality condition. If that's the case,
+ * we can skip processing the histogram, as there can be no rows in
+ * it with the same values - all the rows matching the condition are
+ * represented by the MCV item. This can only happen with equality
+ * on all the attributes.
+ *
+ * The algorithm works like this:
+ *
+ * 1) mark all items as 'match'
+ * 2) walk through all the clauses
+ * 3) for a particular clause, walk through all the items
+ * 4) skip items that are already 'no match'
+ * 5) check clause for items that still match
+ * 6) sum frequencies for items to get selectivity
+ *
+ * The function also returns the frequency of the least frequent item
+ * on the MCV list, which may be useful for clamping estimate from the
+ * histogram (all items not present in the MCV list are less frequent).
+ * This however seems useful only for cases with conditions on all
+ * attributes.
+ *
+ * TODO This only handles AND-ed clauses, but it might work for OR-ed
+ * lists too - it just needs to reverse the logic a bit. I.e. start
+ * with 'no match' for all items, and mark the items as a match
+ * as the clauses are processed (and skip items that are 'match').
+ */
+static Selectivity
+clauselist_mv_selectivity_mcvlist(PlannerInfo *root, List *clauses,
+ MVStatisticInfo *mvstats, bool *fullmatch,
+ Selectivity *lowsel)
+{
+ int i;
+ Selectivity s = 0.0;
+ Selectivity u = 0.0;
+
+ MCVList mcvlist = NULL;
+ int nmatches = 0;
+
+ /* match/mismatch bitmap for each MCV item */
+ char * matches = NULL;
+
+ Assert(clauses != NIL);
+ Assert(list_length(clauses) >= 2);
+
+ /* there's no MCV list built yet */
+ if (! mvstats->mcv_built)
+ return 0.0;
+
+ mcvlist = load_mv_mcvlist(mvstats->mvoid);
+
+ Assert(mcvlist != NULL);
+ Assert(mcvlist->nitems > 0);
+
+ /* by default all the MCV items match the clauses fully */
+ matches = palloc0(sizeof(char) * mcvlist->nitems);
+ memset(matches, MVSTATS_MATCH_FULL, sizeof(char)*mcvlist->nitems);
+
+ /* number of matching MCV items */
+ nmatches = mcvlist->nitems;
+
+ nmatches = update_match_bitmap_mcvlist(root, clauses,
+ mvstats->stakeys, mcvlist,
+ nmatches, matches,
+ lowsel, fullmatch, false);
+
+ /* sum frequencies for all the matching MCV items */
+ for (i = 0; i < mcvlist->nitems; i++)
+ {
+ /* used to 'scale' for MCV lists not covering all tuples */
+ u += mcvlist->items[i]->frequency;
+
+ if (matches[i] != MVSTATS_MATCH_NONE)
+ s += mcvlist->items[i]->frequency;
+ }
+
+ pfree(matches);
+ pfree(mcvlist);
+
+ return s*u;
+}
+
+/*
+ * Evaluate clauses using the MCV list, and update the match bitmap.
+ *
+ * The bitmap may be already partially set, so this is really a way to
+ * combine results of several clause lists - either when computing
+ * conditional probability P(A|B) or a combination of AND/OR clauses.
+ *
+ * TODO This works with 'bitmap' where each bit is represented as a char,
+ * which is slightly wasteful. Instead, we could use a regular
+ * bitmap, reducing the size to ~1/8. Another thing is merging the
+ * bitmaps using & and |, which might be faster than min/max.
+ */
+static int
+update_match_bitmap_mcvlist(PlannerInfo *root, List *clauses,
+ int2vector *stakeys, MCVList mcvlist,
+ int nmatches, char * matches,
+ Selectivity *lowsel, bool *fullmatch,
+ bool is_or)
+{
+ int i;
+ ListCell * l;
+
+ Bitmapset *eqmatches = NULL; /* attributes with equality matches */
+
+ /* The bitmap may be partially built. */
+ Assert(nmatches >= 0);
+ Assert(nmatches <= mcvlist->nitems);
+ Assert(clauses != NIL);
+ Assert(list_length(clauses) >= 1);
+ Assert(mcvlist != NULL);
+ Assert(mcvlist->nitems > 0);
+
+ /* nothing to do - no matches left (AND) or everything matches (OR) */
+ if (((nmatches == 0) && (! is_or)) ||
+ ((nmatches == mcvlist->nitems) && is_or))
+ return nmatches;
+
+ /* frequency of the lowest MCV item */
+ *lowsel = 1.0;
+
+ /*
+ * Loop through the list of clauses, and for each of them evaluate
+ * all the MCV items not yet eliminated by the preceding clauses.
+ *
+ * FIXME This would probably deserve a refactoring, I guess. Unify
+ * the two loops and put the checks inside, or something like
+ * that.
+ */
+ foreach (l, clauses)
+ {
+ Node * clause = (Node*)lfirst(l);
+
+ /* if it's a RestrictInfo, then extract the clause */
+ if (IsA(clause, RestrictInfo))
+ clause = (Node*)((RestrictInfo*)clause)->clause;
+
+ /* if there are no remaining matches possible, we can stop */
+ if (((nmatches == 0) && (! is_or)) ||
+ ((nmatches == mcvlist->nitems) && is_or))
+ break;
+
+ /* it's either an opclause, a NullTest, or an AND/OR clause */
+ if (is_opclause(clause))
+ {
+ OpExpr * expr = (OpExpr*)clause;
+ bool varonleft = true;
+ bool ok;
+
+ /* operator */
+ FmgrInfo opproc;
+
+ /* get procedure computing operator selectivity */
+ RegProcedure oprrest = get_oprrest(expr->opno);
+
+ fmgr_info(get_opcode(expr->opno), &opproc);
+
+ ok = (NumRelids(clause) == 1) &&
+ (is_pseudo_constant_clause(lsecond(expr->args)) ||
+ (varonleft = false,
+ is_pseudo_constant_clause(linitial(expr->args))));
+
+ if (ok)
+ {
+
+ FmgrInfo ltproc, gtproc;
+ Var * var = (varonleft) ? linitial(expr->args) : lsecond(expr->args);
+ Const * cst = (varonleft) ? lsecond(expr->args) : linitial(expr->args);
+ bool isgt = (! varonleft);
+
+ /*
+ * TODO Fetch only when really needed (probably for equality only)
+ * TODO Technically either lt/gt is sufficient.
+ *
+ * FIXME The code in analyze.c creates histograms only for types
+ * with enough ordering (by calling get_sort_group_operators).
+ * Is this the same assumption, i.e. are we certain that we
+ * get the ltproc/gtproc every time we ask? Or are there types
+ * where get_sort_group_operators returns ltopr and here we
+ * get nothing?
+ */
+ TypeCacheEntry *typecache
+ = lookup_type_cache(var->vartype,
+ TYPECACHE_EQ_OPR | TYPECACHE_LT_OPR | TYPECACHE_GT_OPR);
+
+ /* FIXME proper matching attribute to dimension */
+ int idx = mv_get_index(var->varattno, stakeys);
+
+ fmgr_info(get_opcode(typecache->lt_opr), <proc);
+ fmgr_info(get_opcode(typecache->gt_opr), >proc);
+
+ /*
+ * Walk through the MCV items and evaluate the current clause. We can
+ * skip items that were already ruled out, and terminate if there are
+ * no remaining MCV items that might possibly match.
+ */
+ for (i = 0; i < mcvlist->nitems; i++)
+ {
+ bool mismatch = false;
+ MCVItem item = mcvlist->items[i];
+
+ /*
+ * find the lowest selectivity in the MCV
+ * FIXME Maybe not the best place to do this (it runs for every clause).
+ */
+ if (item->frequency < *lowsel)
+ *lowsel = item->frequency;
+
+ /*
+ * If there are no more matches (AND) or no remaining unmatched
+ * items (OR), we can stop processing this clause.
+ */
+ if (((nmatches == 0) && (! is_or)) ||
+ ((nmatches == mcvlist->nitems) && is_or))
+ break;
+
+ /*
+ * For AND-lists, we can also mark NULL items as 'no match' (and
+ * then skip them). For OR-lists this is not possible.
+ */
+ if ((! is_or) && item->isnull[idx])
+ matches[i] = MVSTATS_MATCH_NONE;
+
+ /* skip MCV items that were already ruled out */
+ if ((! is_or) && (matches[i] == MVSTATS_MATCH_NONE))
+ continue;
+ else if (is_or && (matches[i] == MVSTATS_MATCH_FULL))
+ continue;
+
+ /* TODO consider bsearch here (list is sorted by values)
+ * TODO handle other operators too (LT, GT)
+ * TODO identify "full match" when the clauses fully
+ * match the whole MCV list (so that checking the
+ * histogram is not needed)
+ */
+ if (oprrest == F_EQSEL)
+ {
+ /*
+ * We don't care about isgt in equality, because it does not
+ * matter whether it's (var = const) or (const = var).
+ */
+ bool match = DatumGetBool(FunctionCall2Coll(&opproc,
+ DEFAULT_COLLATION_OID,
+ cst->constvalue,
+ item->values[idx]));
+
+ if (match)
+ eqmatches = bms_add_member(eqmatches, idx);
+
+ mismatch = (! match);
+ }
+ else if (oprrest == F_SCALARLTSEL) /* column < constant */
+ {
+
+ if (! isgt) /* (var < const) */
+ {
+ /*
+ * First check whether the constant is below the lower boundary (in that
+ * case we can skip the bucket, because there's no overlap).
+ */
+ mismatch = DatumGetBool(FunctionCall2Coll(&opproc,
+ DEFAULT_COLLATION_OID,
+ cst->constvalue,
+ item->values[idx]));
+
+ } /* (get_oprrest(expr->opno) == F_SCALARLTSEL) */
+ else /* (const < var) */
+ {
+ /*
+ * First check whether the constant is above the upper boundary (in that
+ * case we can skip the bucket, because there's no overlap).
+ */
+ mismatch = DatumGetBool(FunctionCall2Coll(&opproc,
+ DEFAULT_COLLATION_OID,
+ item->values[idx],
+ cst->constvalue));
+ }
+ }
+ else if (oprrest == F_SCALARGTSEL) /* column > constant */
+ {
+
+ if (! isgt) /* (var > const) */
+ {
+ /*
+ * First check whether the constant is above the upper boundary (in that
+ * case we can skip the bucket, because there's no overlap).
+ */
+ mismatch = DatumGetBool(FunctionCall2Coll(&opproc,
+ DEFAULT_COLLATION_OID,
+ cst->constvalue,
+ item->values[idx]));
+ }
+ else /* (const > var) */
+ {
+ /*
+ * First check whether the constant is below the lower boundary (in
+ * that case we can skip the bucket, because there's no overlap).
+ */
+ mismatch = DatumGetBool(FunctionCall2Coll(&opproc,
+ DEFAULT_COLLATION_OID,
+ item->values[idx],
+ cst->constvalue));
+ }
+
+ } /* (get_oprrest(expr->opno) == F_SCALARGTSEL) */
+
+ /* XXX The conditions on matches[i] are not needed, as we
+ * skip MCV items that can't become true/false, depending
+ * on the current flag. See beginning of the loop over
+ * MCV items.
+ */
+
+ if ((is_or) && (matches[i] == MVSTATS_MATCH_NONE) && (! mismatch))
+ {
+ /* OR - was MATCH_NONE, but will be MATCH_FULL */
+ matches[i] = MVSTATS_MATCH_FULL;
+ ++nmatches;
+ continue;
+ }
+ else if ((! is_or) && (matches[i] == MVSTATS_MATCH_FULL) && mismatch)
+ {
+ /* AND - was MATCH_FULL, but will be MATCH_NONE */
+ matches[i] = MVSTATS_MATCH_NONE;
+ --nmatches;
+ continue;
+ }
+
+ }
+ }
+ }
+ else if (IsA(clause, NullTest))
+ {
+ NullTest * expr = (NullTest*)clause;
+ Var * var = (Var*)(expr->arg);
+
+ /* FIXME proper matching attribute to dimension */
+ int idx = mv_get_index(var->varattno, stakeys);
+
+ /*
+ * Walk through the MCV items and evaluate the current clause. We can
+ * skip items that were already ruled out, and terminate if there are
+ * no remaining MCV items that might possibly match.
+ */
+ for (i = 0; i < mcvlist->nitems; i++)
+ {
+ MCVItem item = mcvlist->items[i];
+
+ /*
+ * find the lowest selectivity in the MCV
+ * FIXME Maybe not the best place to do this (it runs for every clause).
+ */
+ if (item->frequency < *lowsel)
+ *lowsel = item->frequency;
+
+ /* if there are no more matches, we can stop processing this clause */
+ if (nmatches == 0)
+ break;
+
+ /* skip MCV items that were already ruled out */
+ if (matches[i] == MVSTATS_MATCH_NONE)
+ continue;
+
+ /* if the clause mismatches the MCV item, set it as MATCH_NONE */
+ if (((expr->nulltesttype == IS_NULL) && (! mcvlist->items[i]->isnull[idx])) ||
+ ((expr->nulltesttype == IS_NOT_NULL) && (mcvlist->items[i]->isnull[idx])))
+ {
+ matches[i] = MVSTATS_MATCH_NONE;
+ --nmatches;
+ }
+ }
+ }
+ else if (or_clause(clause) || and_clause(clause))
+ {
+ /* AND/OR clause, with all clauses compatible with the selected MV stat */
+
+ int i;
+ BoolExpr *orclause = ((BoolExpr*)clause);
+ List *orclauses = orclause->args;
+
+ /* match/mismatch bitmap for each MCV item */
+ int or_nmatches = 0;
+ char * or_matches = NULL;
+
+ Assert(orclauses != NIL);
+ Assert(list_length(orclauses) >= 2);
+
+ /* number of matching MCV items */
+ or_nmatches = mcvlist->nitems;
+
+ /* by default none of the MCV items matches the clauses */
+ or_matches = palloc0(sizeof(char) * or_nmatches);
+
+ if (or_clause(clause))
+ {
+ /* OR clauses assume nothing matches, initially */
+ memset(or_matches, MVSTATS_MATCH_NONE, sizeof(char)*or_nmatches);
+ or_nmatches = 0;
+ }
+ else
+ {
+ /* AND clauses assume everything matches, initially */
+ memset(or_matches, MVSTATS_MATCH_FULL, sizeof(char)*or_nmatches);
+ }
+
+ /* build the match bitmap for the OR-clauses */
+ or_nmatches = update_match_bitmap_mcvlist(root, orclauses,
+ stakeys, mcvlist,
+ or_nmatches, or_matches,
+ lowsel, fullmatch, or_clause(clause));
+
+ /* merge the bitmap into the existing one */
+ for (i = 0; i < mcvlist->nitems; i++)
+ {
+ /*
+ * To AND-merge the bitmaps, a MIN() semantics is used.
+ * For OR-merge, use MAX().
+ *
+ * FIXME this does not decrease the number of matches
+ */
+ UPDATE_RESULT(matches[i], or_matches[i], is_or);
+ }
+
+ pfree(or_matches);
+
+ }
+ else
+ {
+ elog(ERROR, "unknown clause type: %d", clause->type);
+ }
+ }
+
+ /*
+ * If all the columns were matched by equality, it's a full match.
+ * In this case at most a single MCV item can match the clauses
+ * (two matching items would have to be duplicates of each other).
+ */
+ *fullmatch = (bms_num_members(eqmatches) == mcvlist->ndimensions);
+
+ /* free the allocated pieces */
+ if (eqmatches)
+ pfree(eqmatches);
+
+ return nmatches;
+}
diff --git a/src/backend/optimizer/util/plancat.c b/src/backend/optimizer/util/plancat.c
index b9de71d..a92f889 100644
--- a/src/backend/optimizer/util/plancat.c
+++ b/src/backend/optimizer/util/plancat.c
@@ -416,7 +416,7 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
mvstat = (Form_pg_mv_statistic) GETSTRUCT(htup);
/* unavailable stats are not interesting for the planner */
- if (mvstat->deps_built)
+ if (mvstat->deps_built || mvstat->mcv_built)
{
info = makeNode(MVStatisticInfo);
@@ -425,9 +425,11 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
/* enabled statistics */
info->deps_enabled = mvstat->deps_enabled;
+ info->mcv_enabled = mvstat->mcv_enabled;
/* built/available statistics */
info->deps_built = mvstat->deps_built;
+ info->mcv_built = mvstat->mcv_built;
/* stakeys */
adatum = SysCacheGetAttr(MVSTATOID, htup,
diff --git a/src/backend/utils/mvstats/Makefile b/src/backend/utils/mvstats/Makefile
index 099f1ed..f9bf10c 100644
--- a/src/backend/utils/mvstats/Makefile
+++ b/src/backend/utils/mvstats/Makefile
@@ -12,6 +12,6 @@ subdir = src/backend/utils/mvstats
top_builddir = ../../../..
include $(top_builddir)/src/Makefile.global
-OBJS = common.o dependencies.o
+OBJS = common.o dependencies.o mcv.o
include $(top_srcdir)/src/backend/common.mk
diff --git a/src/backend/utils/mvstats/README.mcv b/src/backend/utils/mvstats/README.mcv
new file mode 100644
index 0000000..e93cfe4
--- /dev/null
+++ b/src/backend/utils/mvstats/README.mcv
@@ -0,0 +1,137 @@
+MCV lists
+=========
+
+Multivariate MCV (most-common values) lists are a straightforward extension of
+regular MCV lists, tracking the most frequent combinations of values for a group of
+attributes.
+
+This works particularly well for columns with a small number of distinct values,
+as the list may include all the combinations and approximate the distribution
+very accurately.
+
+For columns with a large number of distinct values (e.g. those with continuous
+domains), the list will only track the most frequent combinations. If the
+distribution is mostly uniform (all combinations about equally frequent), the
+MCV list will be empty.
+
+Estimates of some clauses (e.g. equality) based on MCV lists are more accurate
+than when using histograms.
+
+Also, MCV lists don't necessarily require sorting of the values (the fact that
+we use sorting when building them is an implementation detail), but even more
+importantly the ordering is not built into the approximation (while histograms
+are built on ordering). So MCV lists work well even for attributes where the
+ordering of the data type is disconnected from the meaning of the data. For
+example we know how to sort strings, but it's unlikely to make much sense for
+city names (or other label-like attributes).
+
+
+Selectivity estimation
+----------------------
+
+The estimation, implemented in clauselist_mv_selectivity_mcvlist(), is quite
+simple in principle - we need to identify MCV items matching all the clauses
+and sum frequencies of all those items.
+
+Currently MCV lists support estimation of the following clause types:
+
+ (a) equality clauses WHERE (a = 1) AND (b = 2)
+ (b) inequality clauses WHERE (a < 1) AND (b >= 2)
+ (c) NULL clauses WHERE (a IS NULL) AND (b IS NOT NULL)
+ (d) OR clauses WHERE (a < 1) OR (b >= 2)
+
+It's possible to add support for additional clauses, for example:
+
+ (e) multi-var clauses WHERE (a > b)
+
+and possibly others. These are tasks for the future, not yet implemented.
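+
+As a minimal sketch (table and statistics names are illustrative only), the
+four supported clause types might be exercised like this:
+
+    CREATE STATISTICS s ON t (a, b) WITH (mcv);
+    ANALYZE t;
+
+    SELECT * FROM t WHERE (a = 1) AND (b >= 2);            -- (a) and (b)
+    SELECT * FROM t WHERE (a IS NULL) OR (b IS NOT NULL);  -- (c) and (d)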
+
+
+Estimating equality clauses
+---------------------------
+
+When computing selectivity estimate for equality clauses
+
+ (a = 1) AND (b = 2)
+
+we can compute this estimate quite accurately, assuming that two conditions are met:
+
+ (1) there's an equality condition on all attributes of the statistic
+
+ (2) we find a matching item in the MCV list
+
+In this case we know the MCV item represents all tuples matching the clauses,
+and the selectivity estimate is complete (i.e. we don't need to perform
+estimation using the histogram). This is what we call 'full match'.
+
+When only (1) holds and there's no matching MCV item, we don't know whether
+there are no such rows, or whether they are just not frequent enough. We can
+however use the frequency of the least frequent MCV item as an upper bound
+for the selectivity.
+
+For a combination of equality conditions (not full-match case) we can clamp the
+selectivity by the minimum of selectivities for each condition. For example if
+we know the number of distinct values for each column, we can use 1/ndistinct
+as a per-column estimate. Or rather 1/ndistinct + selectivity derived from the
+MCV list.
+
+We should also probably use only the 'residual ndistinct', excluding the items
+included in the MCV list (and also residual frequency):
+
+ f = (1.0 - sum(MCV frequencies)) / (ndistinct - ndistinct(MCV list))
+
+but it's worth pointing out the ndistinct values are multi-variate for the
+columns referenced by the equality conditions.
+
+Note: Only the "full match" limit is currently implemented.
+
+
+Hashed MCV (not yet implemented)
+--------------------------------
+
+Regular MCV lists have to include actual values for each item, so if those items
+are large the list may be quite large. This is especially true for multi-variate
+MCV lists, although the current implementation partially mitigates this by
+de-duplicating the values before storing them on disk.
+
+It's possible to only store hashes (32-bit values) instead of the actual values,
+significantly reducing the space requirements. Obviously, this would only make
+the MCV lists useful for estimating equality conditions (assuming the 32-bit
+hashes make the collisions rare enough).
+
+This might also complicate matching the columns to available stats.
+
+
+TODO Consider implementing hashed MCV list, storing just 32-bit hashes instead
+ of the actual values. This type of MCV list will be useful only for
+ estimating equality clauses, and will reduce space requirements for large
+ varlena types (in such cases we usually only want equality anyway).
+
+TODO Currently there's no logic to consider building only a MCV list (and not
+ building the histogram at all), except for making this decision manually in
+ ADD STATISTICS.
+
+
+Inspecting the MCV list
+-----------------------
+
+Inspecting the regular (per-attribute) MCV lists is trivial, as it's enough
+to select the columns from pg_stats - the data is encoded as anyarrays, so we
+simply get the text representation of the arrays.
+
+With multivariate MCV lists it's not that simple due to the possible mix of
+data types. It might be possible to produce similar array-like representation,
+but that'd unnecessarily complicate further processing and analysis of the MCV
+list. Instead, there's a SRF function providing values, frequencies etc.
+
+ SELECT * FROM pg_mv_mcv_items(oid);
+
+It has a single input parameter:
+
+ oid - OID of the MCV list (pg_mv_statistic.staoid)
+
+and produces a table with these columns:
+
+ - item ID (0...nitems-1)
+ - values (string array)
+ - nulls only (boolean array)
+ - frequency (double precision)
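+
+A usage sketch (assuming a statistics defined on table 't'; looking up the
+statistics OID through pg_mv_statistic is illustrative):
+
+    SELECT * FROM pg_mv_mcv_items(
+        (SELECT staoid FROM pg_mv_statistic
+          WHERE starelid = 't'::regclass));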
diff --git a/src/backend/utils/mvstats/README.stats b/src/backend/utils/mvstats/README.stats
index a38ea7b..5c5c59a 100644
--- a/src/backend/utils/mvstats/README.stats
+++ b/src/backend/utils/mvstats/README.stats
@@ -8,9 +8,50 @@ not true, resulting in estimation errors.
Multivariate stats track different types of dependencies between the columns,
hopefully improving the estimates.
-Currently we only have one kind of multivariate statistics - soft functional
-dependencies, and we use it to improve estimates of equality clauses. See
-README.dependencies for details.
+
+Types of statistics
+-------------------
+
+Currently we have only two kinds of multivariate statistics:
+
+ (a) soft functional dependencies (README.dependencies)
+
+ (b) MCV lists (README.mcv)
+
+
+Compatible clause types
+-----------------------
+
+Each type of statistics may be used to estimate some subset of clause types.
+
+ (a) functional dependencies - equality clauses (AND), possibly IS NULL
+
+ (b) MCV list - equality and inequality clauses, IS [NOT] NULL, AND/OR
+
+Currently only simple operator clauses (Var op Const) are supported, but it's
+possible to support more complex clause types, e.g. (Var op Var).
+
+
+Complex clauses
+---------------
+
+We also support estimating more complex clauses - essentially AND/OR clauses
+with (Var op Const) as leaves, as long as all the referenced attributes are
+covered by a single statistics.
+
+For example this condition
+
+ (a=1) AND ((b=2) OR ((c=3) AND (d=4)))
+
+may be estimated using statistics on (a,b,c,d). If we only have statistics on
+(b,c,d) we may estimate the second part, and estimate (a=1) using simple stats.
+
+If we only have statistics on (a,b,c) we can't apply it at all at this point,
+but it's worth pointing out that clauselist_selectivity() works recursively, and
+when handling the second part (the OR-clause) we'll be able to apply the
+statistics.
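+
+A sketch of the first case (assuming a table 't' with these four columns):
+
+    CREATE STATISTICS s ON t (a, b, c, d) WITH (mcv);
+    ANALYZE t;
+
+    SELECT * FROM t WHERE (a=1) AND ((b=2) OR ((c=3) AND (d=4)));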
+
+Note: The multi-statistics estimation patch also makes it possible to pass some
+clauses as 'conditions' into the deeper parts of the expression tree.
Selectivity estimation
@@ -23,14 +64,48 @@ When estimating selectivity, we aim to achieve several things:
(b) minimize the overhead, especially when no suitable multivariate stats
exist (so if you are not using multivariate stats, there's no overhead)
-This clauselist_selectivity() performs several inexpensive checks first, before
+Thus clauselist_selectivity() performs several inexpensive checks first, before
even attempting to do the more expensive estimation.
(1) check if there are multivariate stats on the relation
- (2) check there are at least two attributes referenced by clauses compatible
- with multivariate statistics (equality clauses for func. dependencies)
+ (2) check that there are functional dependencies on the table, and that
+ there are at least two attributes referenced by compatible clauses
+ (equality clauses for func. dependencies)
(3) perform reduction of equality clauses using func. dependencies
- (4) estimate the reduced list of clauses using regular statistics
+ (4) check that there are multivariate MCV lists on the table, and that
+ there are at least two attributes referenced by compatible clauses
+ (equalities, inequalities, etc.)
+
+ (5) find the best multivariate statistics (matching the most conditions)
+ and use it to compute the estimate
+
+ (6) estimate the remaining clauses (not estimated using multivariate stats)
+ using the regular per-column statistics
+
+Whenever we find there are no suitable stats, we skip the expensive steps.
+
+
+Further (possibly crazy) ideas
+------------------------------
+
+Currently the clauses are only estimated using a single statistics, even if
+there are multiple candidate statistics - for example assume we have statistics
+on (a,b,c) and (b,c,d), and estimate conditions
+
+ (b = 1) AND (c = 2)
+
+Then both statistics may be used, but we only use one of them. Maybe we could
+compute estimates using all the candidate stats, and somehow aggregate them
+into the final estimate, e.g. using the average or median.
+
+Some stats may give better estimates than others, but it's very difficult to say
+in advance which stats are the best (it depends on the number of buckets, number
+of additional columns not referenced in the clauses, type of condition etc.).
+
+But of course, this may result in expensive estimation (CPU-wise).
+
+So we might add a GUC to choose between the simple (single-statistics) and the
+multi-statistic estimation, possibly as a table-level parameter (ALTER TABLE ...).
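+
+Purely hypothetical syntax, just to illustrate the idea (neither the GUC nor
+the table-level parameter exists in this patch):
+
+    SET multivariate_estimation_mode = 'multi';  -- or 'single'
+    ALTER TABLE t SET (multivariate_estimation_mode = 'multi');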
diff --git a/src/backend/utils/mvstats/common.c b/src/backend/utils/mvstats/common.c
index bd200bc..d1da714 100644
--- a/src/backend/utils/mvstats/common.c
+++ b/src/backend/utils/mvstats/common.c
@@ -16,12 +16,14 @@
#include "common.h"
+#include "utils/array.h"
+
static VacAttrStats ** lookup_var_attr_stats(int2vector *attrs,
- int natts, VacAttrStats **vacattrstats);
+ int natts,
+ VacAttrStats **vacattrstats);
static List* list_mv_stats(Oid relid);
-
/*
* Compute requested multivariate stats, using the rows sampled for the
* plain (single-column) stats.
@@ -49,6 +51,8 @@ build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
int j;
MVStatisticInfo *stat = (MVStatisticInfo *)lfirst(lc);
MVDependencies deps = NULL;
+ MCVList mcvlist = NULL;
+ int numrows_filtered = 0;
VacAttrStats **stats = NULL;
int numatts = 0;
@@ -87,8 +91,12 @@ build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
if (stat->deps_enabled)
deps = build_mv_dependencies(numrows, rows, attrs, stats);
+ /* build the MCV list */
+ if (stat->mcv_enabled)
+ mcvlist = build_mv_mcvlist(numrows, rows, attrs, stats, &numrows_filtered);
+
/* store the histogram / MCV list in the catalog */
- update_mv_stats(stat->mvoid, deps, attrs);
+ update_mv_stats(stat->mvoid, deps, mcvlist, attrs, stats);
}
}
@@ -166,6 +174,8 @@ list_mv_stats(Oid relid)
info->stakeys = buildint2vector(stats->stakeys.values, stats->stakeys.dim1);
info->deps_enabled = stats->deps_enabled;
info->deps_built = stats->deps_built;
+ info->mcv_enabled = stats->mcv_enabled;
+ info->mcv_built = stats->mcv_built;
result = lappend(result, info);
}
@@ -180,8 +190,56 @@ list_mv_stats(Oid relid)
return result;
}
+
+/*
+ * Find attnums of MV stats using the mvoid.
+ */
+int2vector*
+find_mv_attnums(Oid mvoid, Oid *relid)
+{
+ ArrayType *arr;
+ Datum adatum;
+ bool isnull;
+ HeapTuple htup;
+ int2vector *keys;
+
+ /* Fetch the pg_mv_statistic tuple for the given OID. */
+ htup = SearchSysCache1(MVSTATOID,
+ ObjectIdGetDatum(mvoid));
+
+ /* XXX syscache contains OIDs of deleted stats (not invalidated) */
+ if (! HeapTupleIsValid(htup))
+ return NULL;
+
+ /* starelid */
+ adatum = SysCacheGetAttr(MVSTATOID, htup,
+ Anum_pg_mv_statistic_starelid, &isnull);
+ Assert(!isnull);
+
+ *relid = DatumGetObjectId(adatum);
+
+ /* stakeys */
+ adatum = SysCacheGetAttr(MVSTATOID, htup,
+ Anum_pg_mv_statistic_stakeys, &isnull);
+ Assert(!isnull);
+
+ arr = DatumGetArrayTypeP(adatum);
+
+ keys = buildint2vector((int16 *) ARR_DATA_PTR(arr),
+ ARR_DIMS(arr)[0]);
+ ReleaseSysCache(htup);
+
+ /* TODO maybe save the list into relcache, as in RelationGetIndexList
+ * (which served as an inspiration for this function)? */
+
+ return keys;
+}
+
+
void
-update_mv_stats(Oid mvoid, MVDependencies dependencies, int2vector *attrs)
+update_mv_stats(Oid mvoid,
+ MVDependencies dependencies, MCVList mcvlist,
+ int2vector *attrs, VacAttrStats **stats)
{
HeapTuple stup,
oldtup;
@@ -206,18 +264,29 @@ update_mv_stats(Oid mvoid, MVDependencies dependencies, int2vector *attrs)
= PointerGetDatum(serialize_mv_dependencies(dependencies));
}
+ if (mcvlist != NULL)
+ {
+ bytea * data = serialize_mv_mcvlist(mcvlist, attrs, stats);
+ nulls[Anum_pg_mv_statistic_stamcv -1] = (data == NULL);
+ values[Anum_pg_mv_statistic_stamcv - 1] = PointerGetDatum(data);
+ }
+
/* always replace the value (either by bytea or NULL) */
replaces[Anum_pg_mv_statistic_stadeps -1] = true;
+ replaces[Anum_pg_mv_statistic_stamcv -1] = true;
/* always change the availability flags */
nulls[Anum_pg_mv_statistic_deps_built -1] = false;
+ nulls[Anum_pg_mv_statistic_mcv_built -1] = false;
nulls[Anum_pg_mv_statistic_stakeys-1] = false;
/* use the new attnums, in case we removed some dropped ones */
replaces[Anum_pg_mv_statistic_deps_built-1] = true;
+ replaces[Anum_pg_mv_statistic_mcv_built -1] = true;
replaces[Anum_pg_mv_statistic_stakeys -1] = true;
values[Anum_pg_mv_statistic_deps_built-1] = BoolGetDatum(dependencies != NULL);
+ values[Anum_pg_mv_statistic_mcv_built -1] = BoolGetDatum(mcvlist != NULL);
values[Anum_pg_mv_statistic_stakeys -1] = PointerGetDatum(attrs);
/* Is there already a pg_mv_statistic tuple for this attribute? */
@@ -246,6 +315,21 @@ update_mv_stats(Oid mvoid, MVDependencies dependencies, int2vector *attrs)
heap_close(sd, RowExclusiveLock);
}
+
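+/*
+ * mv_get_index
+ *		Returns the index of the attnum within the (sorted) stakeys vector,
+ *		i.e. the dimension of the statistics the attribute maps to (computed
+ *		by counting the keys lower than varattno).
+ */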
+int
+mv_get_index(AttrNumber varattno, int2vector * stakeys)
+{
+ int i, idx = 0;
+ for (i = 0; i < stakeys->dim1; i++)
+ {
+ if (stakeys->values[i] < varattno)
+ idx += 1;
+ else
+ break;
+ }
+ return idx;
+}
+
/* multi-variate stats comparator */
/*
@@ -256,11 +340,15 @@ update_mv_stats(Oid mvoid, MVDependencies dependencies, int2vector *attrs)
int
compare_scalars_simple(const void *a, const void *b, void *arg)
{
- Datum da = *(Datum*)a;
- Datum db = *(Datum*)b;
- SortSupport ssup= (SortSupport) arg;
+ return compare_datums_simple(*(Datum*)a,
+ *(Datum*)b,
+ (SortSupport)arg);
+}
- return ApplySortComparator(da, false, db, false, ssup);
+int
+compare_datums_simple(Datum a, Datum b, SortSupport ssup)
+{
+ return ApplySortComparator(a, false, b, false, ssup);
}
/*
diff --git a/src/backend/utils/mvstats/common.h b/src/backend/utils/mvstats/common.h
index 6d5465b..f4309f7 100644
--- a/src/backend/utils/mvstats/common.h
+++ b/src/backend/utils/mvstats/common.h
@@ -46,7 +46,15 @@ typedef struct
Datum value; /* a data value */
int tupno; /* position index for tuple it came from */
} ScalarItem;
-
+
+/* (de)serialization info */
+typedef struct DimensionInfo {
+ int nvalues; /* number of deduplicated values */
+ int nbytes; /* number of bytes (serialized) */
+ int typlen; /* pg_type.typlen */
+ bool typbyval; /* pg_type.typbyval */
+} DimensionInfo;
+
/* multi-sort */
typedef struct MultiSortSupportData {
int ndims; /* number of dimensions supported by the */
@@ -71,5 +79,6 @@ int multi_sort_compare_dim(int dim, const SortItem *a,
const SortItem *b, MultiSortSupport mss);
/* comparators, used when constructing multivariate stats */
+int compare_datums_simple(Datum a, Datum b, SortSupport ssup);
int compare_scalars_simple(const void *a, const void *b, void *arg);
int compare_scalars_partition(const void *a, const void *b, void *arg);
diff --git a/src/backend/utils/mvstats/mcv.c b/src/backend/utils/mvstats/mcv.c
new file mode 100644
index 0000000..551c934
--- /dev/null
+++ b/src/backend/utils/mvstats/mcv.c
@@ -0,0 +1,1094 @@
+/*-------------------------------------------------------------------------
+ *
+ * mcv.c
+ * POSTGRES multivariate MCV lists
+ *
+ *
+ * Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ * src/backend/utils/mvstats/mcv.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres.h"
+
+#include "fmgr.h"
+#include "funcapi.h"
+
+#include "utils/lsyscache.h"
+
+#include "common.h"
+
+/*
+ * Each serialized item needs to store (in this order):
+ *
+ * - indexes (ndim * sizeof(uint16))
+ * - null flags (ndim * sizeof(bool))
+ * - frequency (sizeof(double))
+ *
+ * So in total:
+ *
+ * ndim * (sizeof(uint16) + sizeof(bool)) + sizeof(double)
+ */
+#define ITEM_SIZE(ndims) \
+ (ndims * (sizeof(uint16) + sizeof(bool)) + sizeof(double))
+
+/* pointers into a flat serialized item of ITEM_SIZE(n) bytes */
+#define ITEM_INDEXES(item) ((uint16*)item)
+#define ITEM_NULLS(item,ndims) ((bool*)(ITEM_INDEXES(item) + ndims))
+#define ITEM_FREQUENCY(item,ndims) ((double*)(ITEM_NULLS(item,ndims) + ndims))
+
+/*
+ * Builds MCV list from sample rows, and removes rows represented by
+ * the MCV list from the sample (the number of remaining sample rows is
+ * returned by the numrows_filtered parameter).
+ *
+ * The method is quite simple - in short, it performs these steps:
+ *
+ * (1) sort the data (default collation, '<' for the data type)
+ *
+ * (2) count distinct groups, decide how many to keep
+ *
+ * (3) build the MCV list using the threshold determined in (2)
+ *
+ * (4) remove rows represented by the MCV from the sample
+ *
+ * For more details, see the comments in the code.
+ *
+ * FIXME Use max_mcv_items from ALTER TABLE ADD STATISTICS command.
+ *
+ * FIXME Single-dimensional MCV is sorted by frequency (descending). We
+ * should do that too, because when walking through the list we
+ * want to check the most frequent items first.
+ *
+ * TODO We're using Datum (8B), even for smaller data types (e.g. int4
+ * or float4). Maybe we could save some space here, but the bytea
+ * compression should handle it just fine.
+ *
+ * TODO This probably should not use ndistinct as computed from the
+ * sample directly, but rather an estimate of the number of
+ * distinct values in the whole table, no?
+ */
+MCVList
+build_mv_mcvlist(int numrows, HeapTuple *rows, int2vector *attrs,
+ VacAttrStats **stats, int *numrows_filtered)
+{
+ int i, j;
+ int numattrs = attrs->dim1;
+ int ndistinct = 0;
+ int mcv_threshold = 0;
+ int count = 0;
+ int nitems = 0;
+
+ MCVList mcvlist = NULL;
+
+ /* Sort by multiple columns (using array of SortSupport) */
+ MultiSortSupport mss = multi_sort_init(numattrs);
+
+ /*
+ * Preallocate space for all the items as a single chunk, and point
+ * the items to the appropriate parts of the array.
+ */
+ SortItem *items = (SortItem*)palloc0(numrows * sizeof(SortItem));
+ Datum *values = (Datum*)palloc0(sizeof(Datum) * numrows * numattrs);
+ bool *isnull = (bool*)palloc0(sizeof(bool) * numrows * numattrs);
+
+ /* keep all the rows by default (as if there was no MCV list) */
+ *numrows_filtered = numrows;
+
+ for (i = 0; i < numrows; i++)
+ {
+ items[i].values = &values[i * numattrs];
+ items[i].isnull = &isnull[i * numattrs];
+ }
+
+ /* load the values/null flags from sample rows */
+ for (j = 0; j < numrows; j++)
+ for (i = 0; i < numattrs; i++)
+ items[j].values[i] = heap_getattr(rows[j], attrs->values[i],
+ stats[i]->tupDesc, &items[j].isnull[i]);
+
+ /* prepare the sort functions for all the attributes */
+ for (i = 0; i < numattrs; i++)
+ multi_sort_add_dimension(mss, i, i, stats);
+
+ /* do the sort, using the multi-sort */
+ qsort_arg((void *) items, numrows, sizeof(SortItem),
+ multi_sort_compare, mss);
+
+ /*
+ * Count the number of distinct groups - just walk through the
+ * sorted list and count the number of key changes. We use this to
+ * determine the threshold (125% of the average frequency).
+ */
+ ndistinct = 1;
+ for (i = 1; i < numrows; i++)
+ if (multi_sort_compare(&items[i], &items[i-1], mss) != 0)
+ ndistinct += 1;
+
+ /*
+ * Determine how many groups actually exceed the threshold, and then
+ * walk the array again and collect them into an array. We'll always
+ * require at least 4 rows per group.
+ *
+ * But if we can fit all the distinct values in the MCV list (i.e.
+ * if there are fewer distinct groups than MVSTAT_MCVLIST_MAX_ITEMS),
+ * we'll require only 2 rows per group.
+ *
+ * TODO For now the threshold is the same as in the single-column
+ * case (average + 25%), but maybe that's worth revisiting
+ * for the multivariate case.
+ *
+ * TODO We can do this only if we believe we got all the distinct
+ * values of the table.
+ *
+ * FIXME This should really reference mcv_max_items (from catalog)
+ * instead of the constant MVSTAT_MCVLIST_MAX_ITEMS.
+ */
+ mcv_threshold = 1.25 * numrows / ndistinct;
+ mcv_threshold = (mcv_threshold < 4) ? 4 : mcv_threshold;
+
+ if (ndistinct <= MVSTAT_MCVLIST_MAX_ITEMS)
+ mcv_threshold = 2;
+
+ /*
+ * Walk through the sorted data again, and see how many groups
+ * reach the mcv_threshold (and become an item in the MCV list).
+ */
+ count = 1;
+ for (i = 1; i <= numrows; i++)
+ {
+ /* last row or new group, so check if we exceed mcv_threshold */
+ if ((i == numrows) || (multi_sort_compare(&items[i], &items[i-1], mss) != 0))
+ {
+ /* group hits the threshold, count the group as MCV item */
+ if (count >= mcv_threshold)
+ nitems += 1;
+
+ count = 1;
+ }
+ else /* within group, so increase the number of items */
+ count += 1;
+ }
+
+ /* we know the number of MCV list items, so let's build the list */
+ if (nitems > 0)
+ {
+ /* allocate the MCV list structure, set parameters we know */
+ mcvlist = (MCVList)palloc0(sizeof(MCVListData));
+
+ mcvlist->magic = MVSTAT_MCV_MAGIC;
+ mcvlist->type = MVSTAT_MCV_TYPE_BASIC;
+ mcvlist->ndimensions = numattrs;
+ mcvlist->nitems = nitems;
+
+ /*
+ * Preallocate the Datum/isnull arrays (not as a single chunk, as
+ * we'll pass this outside this method, so it needs to be easy to
+ * pfree() the data - and we wouldn't know where the arrays start).
+ *
+ * TODO Maybe the reasoning that we can't allocate a single
+ * piece because we're passing it out is bogus? Who'd
+ * free a single item of the MCV list, anyway?
+ *
+ * TODO Maybe with a proper encoding (stuffing all the values
+ * into a list-level array), this will no longer be true?
+ */
+ mcvlist->items = (MCVItem*)palloc0(sizeof(MCVItem)*nitems);
+
+ for (i = 0; i < nitems; i++)
+ {
+ mcvlist->items[i] = (MCVItem)palloc0(sizeof(MCVItemData));
+ mcvlist->items[i]->values = (Datum*)palloc0(sizeof(Datum)*numattrs);
+ mcvlist->items[i]->isnull = (bool*)palloc0(sizeof(bool)*numattrs);
+ }
+
+ /*
+ * Repeat the same loop as above, but this time copy the data
+ * into the MCV list (for items exceeding the threshold).
+ *
+ * TODO Maybe we could simply remember indexes of the last item
+ * in each group (from the previous loop)?
+ */
+ count = 1;
+ nitems = 0;
+ for (i = 1; i <= numrows; i++)
+ {
+ /* last row or a new group */
+ if ((i == numrows) || (multi_sort_compare(&items[i], &items[i-1], mss) != 0))
+ {
+ /* count the MCV item if exceeding the threshold (and copy into the array) */
+ if (count >= mcv_threshold)
+ {
+ /* just pointer to the proper place in the list */
+ MCVItem item = mcvlist->items[nitems];
+
+ /* copy values from the _previous_ group (last item of) */
+ memcpy(item->values, items[(i-1)].values, sizeof(Datum) * numattrs);
+ memcpy(item->isnull, items[(i-1)].isnull, sizeof(bool) * numattrs);
+
+
+ /* and finally the group frequency */
+ item->frequency = (double)count / numrows;
+
+ /* next item */
+ nitems += 1;
+ }
+
+ count = 1;
+ }
+ else /* same group, just increase the number of items */
+ count += 1;
+ }
+
+ /* make sure the loops are consistent */
+ Assert(nitems == mcvlist->nitems);
+
+ /*
+ * Remove the rows matching the MCV list (i.e. keep only rows
+ * that are not represented by the MCV list).
+ *
+ * FIXME This implementation is rather naive, effectively O(N^2).
+ * As the MCV list grows, the check will take longer and
+ * longer. And as the number of sampled rows increases (by
+ * increasing statistics target), it will take longer and
+ * longer. One option is to sort the MCV items first and
+ * then perform a binary search.
+ *
+ * A better option would be keeping the ID of the row in
+ * the sort item, and then just walk through the items and
+ * mark rows to remove (in a bitmap of the same size).
+ * There's no space for that in SortItem at this moment,
+ * but it's trivial to add a 'private' pointer, or to use
+ * another structure with an extra field (starting with
+ * SortItem, so that the comparators etc. still work).
+ *
+ * Another option is to use the sorted array of items
+ * (because that's how we sorted the source data), and
+ * simply do a bsearch() into it. If we find a matching
+ * item, the row belongs to the MCV list.
+ */
+ if (nitems == ndistinct) /* all rows are covered by MCV items */
+ *numrows_filtered = 0;
+ else /* (nitems < ndistinct) && (nitems > 0) */
+ {
+ int nfiltered = 0;
+ HeapTuple *rows_filtered = (HeapTuple*)palloc0(sizeof(HeapTuple) * numrows);
+
+ /* used for the searches */
+ SortItem item, mcvitem;
+
+ item.values = (Datum*)palloc0(numattrs * sizeof(Datum));
+ item.isnull = (bool*)palloc0(numattrs * sizeof(bool));
+
+ /*
+ * FIXME we don't need to allocate this, we can reference
+ * the MCV item directly ...
+ */
+ mcvitem.values = (Datum*)palloc0(numattrs * sizeof(Datum));
+ mcvitem.isnull = (bool*)palloc0(numattrs * sizeof(bool));
+
+ /* walk through the tuples, compare the values to MCV items */
+ for (i = 0; i < numrows; i++)
+ {
+ bool match = false;
+
+ /* collect the key values from the row */
+ for (j = 0; j < numattrs; j++)
+ item.values[j] = heap_getattr(rows[i], attrs->values[j],
+ stats[j]->tupDesc, &item.isnull[j]);
+
+ /* scan through the MCV list for matches */
+ for (j = 0; j < mcvlist->nitems; j++)
+ {
+ /*
+ * TODO Create a SortItem/MCVItem comparator so that
+ * we don't need to do memcpy() like crazy.
+ */
+ memcpy(mcvitem.values, mcvlist->items[j]->values,
+ numattrs * sizeof(Datum));
+ memcpy(mcvitem.isnull, mcvlist->items[j]->isnull,
+ numattrs * sizeof(bool));
+
+ if (multi_sort_compare(&item, &mcvitem, mss) == 0)
+ {
+ match = true;
+ break;
+ }
+ }
+
+ /* if no match in the MCV list, copy the row into the filtered ones */
+ if (! match)
+ memcpy(&rows_filtered[nfiltered++], &rows[i], sizeof(HeapTuple));
+ }
+
+ /* replace the rows and remember how many rows we kept */
+ memcpy(rows, rows_filtered, sizeof(HeapTuple) * nfiltered);
+ *numrows_filtered = nfiltered;
+
+ /* free all the data used here */
+ pfree(rows_filtered);
+ pfree(item.values);
+ pfree(item.isnull);
+ pfree(mcvitem.values);
+ pfree(mcvitem.isnull);
+ }
+ }
+
+ pfree(values);
+ pfree(items);
+ pfree(isnull);
+
+ return mcvlist;
+}
+
+
+/* fetch the MCV list (as a bytea) from the pg_mv_statistic catalog */
+MCVList
+load_mv_mcvlist(Oid mvoid)
+{
+ bool isnull = false;
+ Datum mcvlist;
+
+#ifdef USE_ASSERT_CHECKING
+ Form_pg_mv_statistic mvstat;
+#endif
+
+ /* Prepare to scan pg_mv_statistic for entries having indrelid = this rel. */
+ HeapTuple htup = SearchSysCache1(MVSTATOID, ObjectIdGetDatum(mvoid));
+
+ if (! HeapTupleIsValid(htup))
+ return NULL;
+
+#ifdef USE_ASSERT_CHECKING
+ mvstat = (Form_pg_mv_statistic) GETSTRUCT(htup);
+ Assert(mvstat->mcv_enabled && mvstat->mcv_built);
+#endif
+
+ mcvlist = SysCacheGetAttr(MVSTATOID, htup,
+ Anum_pg_mv_statistic_stamcv, &isnull);
+
+ Assert(!isnull);
+
+ ReleaseSysCache(htup);
+
+ return deserialize_mv_mcvlist(DatumGetByteaP(mcvlist));
+}
+
+/* print some basic info about the MCV list
+ *
+ * TODO Add info about what part of the table this covers.
+ */
+Datum
+pg_mv_stats_mcvlist_info(PG_FUNCTION_ARGS)
+{
+ bytea *data = PG_GETARG_BYTEA_P(0);
+ char *result;
+
+ MCVList mcvlist = deserialize_mv_mcvlist(data);
+
+ result = palloc0(128);
+ snprintf(result, 128, "nitems=%d", mcvlist->nitems);
+
+ pfree(mcvlist);
+
+ PG_RETURN_TEXT_P(cstring_to_text(result));
+}
+
+/* used to pass context into bsearch() */
+static SortSupport ssup_private = NULL;
+
+static int bsearch_comparator(const void * a, const void * b);
+
+/*
+ * Serialize MCV list into a bytea value. The basic algorithm is simple:
+ *
+ * (1) perform deduplication for each attribute (separately)
+ * (a) collect all (non-NULL) attribute values from all MCV items
+ * (b) sort the data (using 'lt' from VacAttrStats)
+ * (c) remove duplicate values from the array
+ *
+ * (2) serialize the arrays into a bytea value
+ *
+ * (3) process all MCV list items
+ * (a) replace values with indexes into the arrays
+ *
+ * Each attribute has to be processed separately, because we're mixing
+ * different datatypes, and we don't know what equality means for them.
+ * We're also mixing pass-by-value and pass-by-ref types, and so on.
+ *
+ * We'll use uint16 values for the indexes in step (3), as we don't
+ * allow more than 8k MCV items (see the max_mcv_items limit). We might
+ * increase this to 65k and still fit into uint16.
+ *
+ * We don't really expect compression as high as with histograms,
+ * because we're not doing any bucket splits etc. (which is the source
+ * of high redundancy there), but we need to do it anyway as we need
+ * to serialize varlena values etc. We might invent another way to
+ * serialize MCV lists, but let's keep it consistent.
+ *
+ * FIXME This probably leaks memory, or at least uses it inefficiently
+ * (many small palloc() calls instead of a large one).
+ *
+ * TODO Consider packing boolean flags (NULL) for each item into 'char'
+ * or a longer type (instead of using an array of bool items).
+ */
+bytea *
+serialize_mv_mcvlist(MCVList mcvlist, int2vector *attrs,
+ VacAttrStats **stats)
+{
+ int i, j;
+ int ndims = mcvlist->ndimensions;
+ int itemsize = ITEM_SIZE(ndims);
+
+ Size total_length = 0;
+
+ char *item = palloc0(itemsize);
+
+ /* serialized items (indexes into arrays, etc.) */
+ bytea *output;
+ char *data = NULL;
+
+ /* values per dimension (and number of non-NULL values) */
+ Datum **values = (Datum**)palloc0(sizeof(Datum*) * ndims);
+ int *counts = (int*)palloc0(sizeof(int) * ndims);
+
+ /* info about dimensions (for deserialize) */
+ DimensionInfo * info
+ = (DimensionInfo *)palloc0(sizeof(DimensionInfo)*ndims);
+
+ /* sort support data */
+ SortSupport ssup = (SortSupport)palloc0(sizeof(SortSupportData)*ndims);
+
+ /* collect and deduplicate values for each dimension */
+ for (i = 0; i < ndims; i++)
+ {
+ int count;
+ StdAnalyzeData *tmp = (StdAnalyzeData *)stats[i]->extra_data;
+
+ /* keep important info about the data type */
+ info[i].typlen = stats[i]->attrtype->typlen;
+ info[i].typbyval = stats[i]->attrtype->typbyval;
+
+ /* allocate space for all values, including NULLs (won't use them) */
+ values[i] = (Datum*)palloc0(sizeof(Datum) * mcvlist->nitems);
+
+ for (j = 0; j < mcvlist->nitems; j++)
+ {
+ if (! mcvlist->items[j]->isnull[i]) /* skip NULL values */
+ {
+ values[i][counts[i]] = mcvlist->items[j]->values[i];
+ counts[i] += 1;
+ }
+ }
+
+ /* there are just NULL values in this dimension */
+ if (counts[i] == 0)
+ continue;
+
+ /* sort and deduplicate */
+ ssup[i].ssup_cxt = CurrentMemoryContext;
+ ssup[i].ssup_collation = DEFAULT_COLLATION_OID;
+ ssup[i].ssup_nulls_first = false;
+
+ PrepareSortSupportFromOrderingOp(tmp->ltopr, &ssup[i]);
+
+ qsort_arg(values[i], counts[i], sizeof(Datum),
+ compare_scalars_simple, &ssup[i]);
+
+ /*
+ * Walk through the array and eliminate duplicate values, but
+ * keep the ordering (so that we can do bsearch later). We know
+ * there's at least 1 item, so we can skip the first element.
+ */
+ count = 1; /* number of deduplicated items */
+ for (j = 1; j < counts[i]; j++)
+ {
+ /* if it's different from the previous value, we need to keep it */
+ if (compare_datums_simple(values[i][j-1], values[i][j], &ssup[i]) != 0)
+ {
+ /* XXX: not needed if (count == j) */
+ values[i][count] = values[i][j];
+ count += 1;
+ }
+ }
+
+ /* do not exceed UINT16_MAX */
+ Assert(count <= UINT16_MAX);
+
+ /* keep info about the deduplicated count */
+ info[i].nvalues = count;
+
+ /* compute size of the serialized data */
+ if (info[i].typbyval || (info[i].typlen > 0))
+ /* passed by value, or by reference with a fixed length */
+ info[i].nbytes = info[i].nvalues * info[i].typlen;
+ else if (info[i].typlen == -1)
+ /* varlena, so just use VARSIZE_ANY */
+ for (j = 0; j < info[i].nvalues; j++)
+ info[i].nbytes += VARSIZE_ANY(values[i][j]);
+ else if (info[i].typlen == -2)
+ /* cstring, so simply strlen */
+ for (j = 0; j < info[i].nvalues; j++)
+ info[i].nbytes += strlen(DatumGetPointer(values[i][j]));
+ else
+ elog(ERROR, "unknown data type typbyval=%d typlen=%d",
+ info[i].typbyval, info[i].typlen);
+ }
+
+ /*
+ * Now we finally know how much space we'll need for the serialized
+ * MCV list, as it contains these fields:
+ *
+ * - length (4B) for varlena
+ * - magic (4B)
+ * - type (4B)
+ * - ndimensions (4B)
+ * - nitems (4B)
+ * - info (ndim * sizeof(DimensionInfo)
+ * - arrays of values for each dimension
+ * - serialized items (nitems * itemsize)
+ *
+ * So the 'header' size is 20B + ndim * sizeof(DimensionInfo) and
+ * then we'll place the data.
+ */
+ total_length = (sizeof(int32) + offsetof(MCVListData, items)
+ + ndims * sizeof(DimensionInfo)
+ + mcvlist->nitems * itemsize);
+
+ for (i = 0; i < ndims; i++)
+ total_length += info[i].nbytes;
+
+ /* enforce arbitrary limit of 1MB */
+ if (total_length > 1024 * 1024)
+ elog(ERROR, "serialized MCV exceeds 1MB (%ld)", total_length);
+
+ /* allocate space for the serialized MCV list, set header fields */
+ output = (bytea*)palloc0(total_length);
+ SET_VARSIZE(output, total_length);
+
+ /* we'll use 'data' to keep track of the place to write to */
+ data = VARDATA(output);
+
+ memcpy(data, mcvlist, offsetof(MCVListData, items));
+ data += offsetof(MCVListData, items);
+
+ memcpy(data, info, sizeof(DimensionInfo) * ndims);
+ data += sizeof(DimensionInfo) * ndims;
+
+ /* value array for each dimension */
+ for (i = 0; i < ndims; i++)
+ {
+#ifdef USE_ASSERT_CHECKING
+ char *tmp = data;
+#endif
+ for (j = 0; j < info[i].nvalues; j++)
+ {
+ if (info[i].typbyval)
+ {
+ /* passed by value / Datum */
+ memcpy(data, &values[i][j], info[i].typlen);
+ data += info[i].typlen;
+ }
+ else if (info[i].typlen > 0)
+ {
+ /* passed by reference, but fixed length (name, tid, ...) */
+ memcpy(data, &values[i][j], info[i].typlen);
+ data += info[i].typlen;
+ }
+ else if (info[i].typlen == -1)
+ {
+ /* varlena */
+ memcpy(data, DatumGetPointer(values[i][j]),
+ VARSIZE_ANY(values[i][j]));
+ data += VARSIZE_ANY(values[i][j]);
+ }
+ else if (info[i].typlen == -2)
+ {
+ /* cstring (don't forget the \0 terminator!) */
+ memcpy(data, DatumGetPointer(values[i][j]),
+ strlen(DatumGetPointer(values[i][j])) + 1);
+ data += strlen(DatumGetPointer(values[i][j])) + 1;
+ }
+ }
+ Assert((data - tmp) == info[i].nbytes);
+ }
+
+ /* and finally, the MCV items */
+ for (i = 0; i < mcvlist->nitems; i++)
+ {
+ /* don't write beyond the allocated space */
+ Assert(data <= (char*)output + total_length - itemsize);
+
+ /* reset the values for each item */
+ memset(item, 0, itemsize);
+
+ for (j = 0; j < ndims; j++)
+ {
+ /* do the lookup only for non-NULL values */
+ if (! mcvlist->items[i]->isnull[j])
+ {
+ Datum * v = NULL;
+ ssup_private = &ssup[j];
+
+ v = (Datum*)bsearch(&mcvlist->items[i]->values[j],
+ values[j], info[j].nvalues, sizeof(Datum),
+ bsearch_comparator);
+
+ if (v == NULL)
+ elog(ERROR, "value for dim %d not found in array", j);
+
+ /* compute index within the array */
+ ITEM_INDEXES(item)[j] = (v - values[j]);
+
+ /* check the index is within expected bounds */
+ Assert(ITEM_INDEXES(item)[j] >= 0);
+ Assert(ITEM_INDEXES(item)[j] < info[j].nvalues);
+ }
+ }
+
+ /* copy NULL and frequency flags into the item */
+ memcpy(ITEM_NULLS(item, ndims),
+ mcvlist->items[i]->isnull, sizeof(bool) * ndims);
+ memcpy(ITEM_FREQUENCY(item, ndims),
+ &mcvlist->items[i]->frequency, sizeof(double));
+
+ /* copy the item into the array */
+ memcpy(data, item, itemsize);
+
+ data += itemsize;
+ }
+
+ /* at this point we expect to match the total_length exactly */
+ Assert((data - (char*)output) == total_length);
+
+ return output;
+}
+
+/*
+ * Inverse to serialize_mv_mcvlist() - see the comment there.
+ *
+ * We'll do a full deserialization, because we don't really expect high
+ * duplication of values, so caching may not be as efficient as with
+ * histograms.
+ */
+MCVList
+deserialize_mv_mcvlist(bytea * data)
+{
+ int i, j;
+ Size expected_size;
+ MCVList mcvlist;
+ char *tmp;
+
+ int ndims, nitems, itemsize;
+ DimensionInfo *info = NULL;
+
+ uint16 *indexes = NULL;
+ Datum **values = NULL;
+
+ /* local allocation buffer (used only for deserialization) */
+ int bufflen;
+ char *buff;
+ char *ptr;
+
+ /* buffer used for the result */
+ int rbufflen;
+ char *rbuff;
+ char *rptr;
+
+ if (data == NULL)
+ return NULL;
+
+ if (VARSIZE_ANY_EXHDR(data) < offsetof(MCVListData,items))
+ elog(ERROR, "invalid MCV Size %ld (expected at least %ld)",
+ VARSIZE_ANY_EXHDR(data), offsetof(MCVListData,items));
+
+ /* read the MCV list header */
+ mcvlist = (MCVList)palloc0(sizeof(MCVListData));
+
+ /* initialize pointer to the data part (skip the varlena header) */
+ tmp = VARDATA(data);
+
+ /* get the header and perform basic sanity checks */
+ memcpy(mcvlist, tmp, offsetof(MCVListData,items));
+ tmp += offsetof(MCVListData,items);
+
+ if (mcvlist->magic != MVSTAT_MCV_MAGIC)
+ elog(ERROR, "invalid MCV magic %d (expected %dd)",
+ mcvlist->magic, MVSTAT_MCV_MAGIC);
+
+ if (mcvlist->type != MVSTAT_MCV_TYPE_BASIC)
+ elog(ERROR, "invalid MCV type %d (expected %dd)",
+ mcvlist->type, MVSTAT_MCV_TYPE_BASIC);
+
+ nitems = mcvlist->nitems;
+ ndims = mcvlist->ndimensions;
+ itemsize = ITEM_SIZE(ndims);
+
+ Assert(nitems > 0);
+ Assert((ndims >= 2) && (ndims <= MVSTATS_MAX_DIMENSIONS));
+
+ /*
+ * What size do we expect with these parameters? (It's incomplete, as
+ * we have yet to add the array sizes from the DimensionInfo
+ * records.)
+ */
+ expected_size = offsetof(MCVListData,items) +
+ ndims * sizeof(DimensionInfo) +
+ (nitems * itemsize);
+
+ /* check that we have at least the DimensionInfo records */
+ if (VARSIZE_ANY_EXHDR(data) < expected_size)
+ elog(ERROR, "invalid MCV Size %ld (expected %ld)",
+ VARSIZE_ANY_EXHDR(data), expected_size);
+
+ info = (DimensionInfo*)(tmp);
+ tmp += ndims * sizeof(DimensionInfo);
+
+ /* account for the value arrays */
+ for (i = 0; i < ndims; i++)
+ expected_size += info[i].nbytes;
+
+ if (VARSIZE_ANY_EXHDR(data) != expected_size)
+ elog(ERROR, "invalid MCV Size %ld (expected %ld)",
+ VARSIZE_ANY_EXHDR(data), expected_size);
+
+ /* looks OK - not corrupted or something */
+
+ /*
+ * We'll allocate one large chunk of memory for the intermediate
+ * data, needed only for deserializing the MCV list, and we'll use
+ * a local dense allocation to minimize the palloc overhead.
+ *
+ * Let's see how much space we'll actually need, and also include
+ * space for the array with pointers.
+ */
+ bufflen = sizeof(Datum*) * ndims; /* space for pointers */
+
+ for (i = 0; i < ndims; i++)
+ /* for full-size byval types, we reuse the serialized value */
+ if (! (info[i].typbyval && info[i].typlen == sizeof(Datum)))
+ bufflen += (sizeof(Datum) * info[i].nvalues);
+
+ buff = palloc0(bufflen);
+ ptr = buff;
+
+ values = (Datum**)buff;
+ ptr += (sizeof(Datum*) * ndims);
+
+ /*
+ * FIXME This uses pointers to the original data array (the types
+ * not passed by value), so when someone frees the memory,
+ * e.g. by doing something like this:
+ *
+ * bytea * data = ... fetch the data from catalog ...
+ * MCVList mcvlist = deserialize_mcv_list(data);
+ * pfree(data);
+ *
+ * then 'mcvlist' references the freed memory. This needs to
+ * copy the pieces.
+ */
+ for (i = 0; i < ndims; i++)
+ {
+ if (info[i].typbyval)
+ {
+ /* passed by value / Datum - simply reuse the array */
+ if (info[i].typlen == sizeof(Datum))
+ {
+ values[i] = (Datum*)tmp;
+ tmp += info[i].nbytes;
+ }
+ else
+ {
+ values[i] = (Datum*)ptr;
+ ptr += (sizeof(Datum) * info[i].nvalues);
+
+ for (j = 0; j < info[i].nvalues; j++)
+ {
+ /* copy the value from the serialized array */
+ memcpy(&values[i][j], tmp, info[i].typlen);
+ tmp += info[i].typlen;
+ }
+ }
+ }
+ else
+ {
+ /* all the varlena data need a chunk from the buffer */
+ values[i] = (Datum*)ptr;
+ ptr += (sizeof(Datum) * info[i].nvalues);
+
+ /* pased by reference, but fixed length (name, tid, ...) */
+ if (info[i].typlen > 0)
+ {
+ for (j = 0; j < info[i].nvalues; j++)
+ {
+ /* just point into the array */
+ values[i][j] = PointerGetDatum(tmp);
+ tmp += info[i].typlen;
+ }
+ }
+ else if (info[i].typlen == -1)
+ {
+ /* varlena */
+ for (j = 0; j < info[i].nvalues; j++)
+ {
+ /* just point into the array */
+ values[i][j] = PointerGetDatum(tmp);
+ tmp += VARSIZE_ANY(tmp);
+ }
+ }
+ else if (info[i].typlen == -2)
+ {
+ /* cstring */
+ for (j = 0; j < info[i].nvalues; j++)
+ {
+ /* just point into the array */
+ values[i][j] = PointerGetDatum(tmp);
+ tmp += (strlen(tmp) + 1); /* don't forget the \0 */
+ }
+ }
+ }
+ }
+
+ /* we should exhaust the buffer exactly */
+ Assert((ptr - buff) == bufflen);
+
+ /* allocate space for the MCV items in a single piece */
+ rbufflen = (sizeof(MCVItem) + sizeof(MCVItemData) +
+ sizeof(Datum)*ndims + sizeof(bool)*ndims) * nitems;
+
+ rbuff = palloc(rbufflen);
+ rptr = rbuff;
+
+ mcvlist->items = (MCVItem*)rbuff;
+ rptr += (sizeof(MCVItem) * nitems);
+
+ for (i = 0; i < nitems; i++)
+ {
+ MCVItem item = (MCVItem)rptr;
+ rptr += (sizeof(MCVItemData));
+
+ item->values = (Datum*)rptr;
+ rptr += (sizeof(Datum)*ndims);
+
+ item->isnull = (bool*)rptr;
+ rptr += (sizeof(bool) *ndims);
+
+ /* just point to the right place */
+ indexes = ITEM_INDEXES(tmp);
+
+ memcpy(item->isnull, ITEM_NULLS(tmp, ndims), sizeof(bool) * ndims);
+ memcpy(&item->frequency, ITEM_FREQUENCY(tmp, ndims), sizeof(double));
+
+#ifdef USE_ASSERT_CHECKING
+ for (j = 0; j < ndims; j++)
+ Assert(indexes[j] <= UINT16_MAX);
+#endif
+
+ /* translate the values */
+ for (j = 0; j < ndims; j++)
+ if (! item->isnull[j])
+ item->values[j] = values[j][indexes[j]];
+
+ mcvlist->items[i] = item;
+
+ tmp += ITEM_SIZE(ndims);
+
+ Assert(tmp <= (char*)data + VARSIZE_ANY(data));
+ }
+
+ /* check that we processed all the data */
+ Assert(tmp == (char*)data + VARSIZE_ANY(data));
+
+ /* release the temporary buffer */
+ pfree(buff);
+
+ return mcvlist;
+}
+
+/*
+ * We need to pass the SortSupport to the comparator, but bsearch()
+ * has no 'context' parameter, so we use a global variable (ugly).
+ */
+static int
+bsearch_comparator(const void * a, const void * b)
+{
+ Assert(ssup_private != NULL);
+ return compare_scalars_simple(a, b, (void*)ssup_private);
+}
+
+/*
+ * SRF with details about the items of an MCV list:
+ *
+ * - item ID (0...nitems-1)
+ * - values (string array)
+ * - nulls only (boolean array)
+ * - frequency (double precision)
+ *
+ * The input is the OID of the statistics, and no rows are returned if
+ * the statistics contains no MCV list.
+ */
+PG_FUNCTION_INFO_V1(pg_mv_mcv_items);
+
+Datum
+pg_mv_mcv_items(PG_FUNCTION_ARGS)
+{
+ FuncCallContext *funcctx;
+ int call_cntr;
+ int max_calls;
+ TupleDesc tupdesc;
+ AttInMetadata *attinmeta;
+
+ /* stuff done only on the first call of the function */
+ if (SRF_IS_FIRSTCALL())
+ {
+ MemoryContext oldcontext;
+ MCVList mcvlist;
+
+ /* create a function context for cross-call persistence */
+ funcctx = SRF_FIRSTCALL_INIT();
+
+ /* switch to memory context appropriate for multiple function calls */
+ oldcontext = MemoryContextSwitchTo(funcctx->multi_call_memory_ctx);
+
+ mcvlist = load_mv_mcvlist(PG_GETARG_OID(0));
+
+ funcctx->user_fctx = mcvlist;
+
+ /* total number of tuples to be returned */
+ funcctx->max_calls = 0;
+ if (funcctx->user_fctx != NULL)
+ funcctx->max_calls = mcvlist->nitems;
+
+ /* Build a tuple descriptor for our result type */
+ if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("function returning record called in context "
+ "that cannot accept type record")));
+
+ /*
+ * generate attribute metadata needed later to produce tuples
+ * from raw C strings
+ */
+ attinmeta = TupleDescGetAttInMetadata(tupdesc);
+ funcctx->attinmeta = attinmeta;
+
+ MemoryContextSwitchTo(oldcontext);
+ }
+
+ /* stuff done on every call of the function */
+ funcctx = SRF_PERCALL_SETUP();
+
+ call_cntr = funcctx->call_cntr;
+ max_calls = funcctx->max_calls;
+ attinmeta = funcctx->attinmeta;
+
+ if (call_cntr < max_calls) /* do when there is more left to send */
+ {
+ char **values;
+ HeapTuple tuple;
+ Datum result;
+ int2vector *stakeys;
+ Oid relid;
+
+ char *buff = palloc0(1024);
+ char *format;
+
+ int i;
+
+ Oid *outfuncs;
+ FmgrInfo *fmgrinfo;
+
+ MCVList mcvlist;
+ MCVItem item;
+
+ mcvlist = (MCVList)funcctx->user_fctx;
+
+ Assert(call_cntr < mcvlist->nitems);
+
+ item = mcvlist->items[call_cntr];
+
+ stakeys = find_mv_attnums(PG_GETARG_OID(0), &relid);
+
+ /*
+ * Prepare a values array for building the returned tuple.
+ * This should be an array of C strings which will
+ * be processed later by the type input functions.
+ */
+ values = (char **) palloc(4 * sizeof(char *));
+
+ values[0] = (char *) palloc(64 * sizeof(char));
+
+ /* arrays */
+ values[1] = (char *) palloc0(1024 * sizeof(char));
+ values[2] = (char *) palloc0(1024 * sizeof(char));
+
+ /* frequency */
+ values[3] = (char *) palloc(64 * sizeof(char));
+
+ outfuncs = (Oid*)palloc0(sizeof(Oid) * mcvlist->ndimensions);
+ fmgrinfo = (FmgrInfo*)palloc0(sizeof(FmgrInfo) * mcvlist->ndimensions);
+
+ for (i = 0; i < mcvlist->ndimensions; i++)
+ {
+ bool isvarlena;
+
+ getTypeOutputInfo(get_atttype(relid, stakeys->values[i]),
+ &outfuncs[i], &isvarlena);
+
+ fmgr_info(outfuncs[i], &fmgrinfo[i]);
+ }
+
+ snprintf(values[0], 64, "%d", call_cntr); /* item ID */
+
+ for (i = 0; i < mcvlist->ndimensions; i++)
+ {
+ Datum val, valout;
+
+ format = "%s, %s";
+ if (i == 0)
+ format = "{%s%s";
+ else if (i == mcvlist->ndimensions-1)
+ format = "%s, %s}";
+
+ val = item->values[i];
+ valout = FunctionCall1(&fmgrinfo[i], val);
+
+ snprintf(buff, 1024, format, values[1], DatumGetPointer(valout));
+ strncpy(values[1], buff, 1023);
+ buff[0] = '\0';
+
+ snprintf(buff, 1024, format, values[2], item->isnull[i] ? "t" : "f");
+ strncpy(values[2], buff, 1023);
+ buff[0] = '\0';
+ }
+
+ snprintf(values[3], 64, "%f", item->frequency); /* frequency */
+
+ /* build a tuple */
+ tuple = BuildTupleFromCStrings(attinmeta, values);
+
+ /* make the tuple into a datum */
+ result = HeapTupleGetDatum(tuple);
+
+ /* clean up (this is not really necessary) */
+ pfree(values[0]);
+ pfree(values[1]);
+ pfree(values[2]);
+ pfree(values[3]);
+
+ pfree(values);
+
+ SRF_RETURN_NEXT(funcctx, result);
+ }
+ else /* do when there is no more left */
+ {
+ SRF_RETURN_DONE(funcctx);
+ }
+}
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 4f106c3..6339631 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -2109,8 +2109,9 @@ describeOneTableDetails(const char *schemaname,
{
printfPQExpBuffer(&buf,
"SELECT oid, stanamespace::regnamespace AS nsp, staname, stakeys,\n"
- " deps_enabled,\n"
- " deps_built,\n"
+ " deps_enabled, mcv_enabled,\n"
+ " deps_built, mcv_built,\n"
+ " mcv_max_items,\n"
" (SELECT string_agg(attname::text,', ')\n"
" FROM ((SELECT unnest(stakeys) AS attnum) s\n"
" JOIN pg_attribute a ON (starelid = a.attrelid and a.attnum = s.attnum))) AS attnums\n"
@@ -2128,6 +2129,8 @@ describeOneTableDetails(const char *schemaname,
printTableAddFooter(&cont, _("Statistics:"));
for (i = 0; i < tuples; i++)
{
+ bool first = true;
+
printfPQExpBuffer(&buf, " ");
/* statistics name (qualified with namespace) */
@@ -2137,10 +2140,22 @@ describeOneTableDetails(const char *schemaname,
/* options */
if (!strcmp(PQgetvalue(result, i, 4), "t"))
- appendPQExpBuffer(&buf, "(dependencies)");
+ {
+ appendPQExpBuffer(&buf, "(dependencies");
+ first = false;
+ }
+
+ if (!strcmp(PQgetvalue(result, i, 5), "t"))
+ {
+ if (! first)
+ appendPQExpBuffer(&buf, ", mcv");
+ else
+ appendPQExpBuffer(&buf, "(mcv");
+ first = false;
+ }
- appendPQExpBuffer(&buf, " ON (%s)",
- PQgetvalue(result, i, 6));
+ appendPQExpBuffer(&buf, ") ON (%s)",
+ PQgetvalue(result, i, 9));
printTableAddFooter(&cont, buf.data);
}
diff --git a/src/include/catalog/pg_mv_statistic.h b/src/include/catalog/pg_mv_statistic.h
index a568a07..fd7107d 100644
--- a/src/include/catalog/pg_mv_statistic.h
+++ b/src/include/catalog/pg_mv_statistic.h
@@ -37,15 +37,21 @@ CATALOG(pg_mv_statistic,3381)
/* statistics requested to build */
bool deps_enabled; /* analyze dependencies? */
+ bool mcv_enabled; /* build MCV list? */
+
+ /* MCV size */
+ int32 mcv_max_items; /* max MCV items */
/* statistics that are available (if requested) */
bool deps_built; /* dependencies were built */
+ bool mcv_built; /* MCV list was built */
/* variable-length fields start here, but we allow direct access to stakeys */
int2vector stakeys; /* array of column keys */
#ifdef CATALOG_VARLEN
bytea stadeps; /* dependencies (serialized) */
+ bytea stamcv; /* MCV list (serialized) */
#endif
} FormData_pg_mv_statistic;
@@ -61,13 +67,17 @@ typedef FormData_pg_mv_statistic *Form_pg_mv_statistic;
* compiler constants for pg_mv_statistic
* ----------------
*/
-#define Natts_pg_mv_statistic 7
+#define Natts_pg_mv_statistic 11
#define Anum_pg_mv_statistic_starelid 1
#define Anum_pg_mv_statistic_staname 2
#define Anum_pg_mv_statistic_stanamespace 3
#define Anum_pg_mv_statistic_deps_enabled 4
-#define Anum_pg_mv_statistic_deps_built 5
-#define Anum_pg_mv_statistic_stakeys 6
-#define Anum_pg_mv_statistic_stadeps 7
+#define Anum_pg_mv_statistic_mcv_enabled 5
+#define Anum_pg_mv_statistic_mcv_max_items 6
+#define Anum_pg_mv_statistic_deps_built 7
+#define Anum_pg_mv_statistic_mcv_built 8
+#define Anum_pg_mv_statistic_stakeys 9
+#define Anum_pg_mv_statistic_stadeps 10
+#define Anum_pg_mv_statistic_stamcv 11
#endif /* PG_MV_STATISTIC_H */
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
index 20d565c..66b4bcd 100644
--- a/src/include/catalog/pg_proc.h
+++ b/src/include/catalog/pg_proc.h
@@ -2670,6 +2670,10 @@ DATA(insert OID = 3998 ( pg_mv_stats_dependencies_info PGNSP PGUID 12 1 0 0
DESCR("multivariate stats: functional dependencies info");
DATA(insert OID = 3999 ( pg_mv_stats_dependencies_show PGNSP PGUID 12 1 0 0 0 f f f f t f i s 1 0 25 "17" _null_ _null_ _null_ _null_ _null_ pg_mv_stats_dependencies_show _null_ _null_ _null_ ));
DESCR("multivariate stats: functional dependencies show");
+DATA(insert OID = 3376 ( pg_mv_stats_mcvlist_info PGNSP PGUID 12 1 0 0 0 f f f f t f i s 1 0 25 "17" _null_ _null_ _null_ _null_ _null_ pg_mv_stats_mcvlist_info _null_ _null_ _null_ ));
+DESCR("multi-variate statistics: MCV list info");
+DATA(insert OID = 3373 ( pg_mv_mcv_items PGNSP PGUID 12 1 1000 0 0 f f f f t t i s 1 0 2249 "26" "{26,23,1009,1000,701}" "{i,o,o,o,o}" "{oid,index,values,nulls,frequency}" _null_ _null_ pg_mv_mcv_items _null_ _null_ _null_ ));
+DESCR("details about MCV list items");
DATA(insert OID = 1928 ( pg_stat_get_numscans PGNSP PGUID 12 1 0 0 0 f f f f t f s r 1 0 20 "26" _null_ _null_ _null_ _null_ _null_ pg_stat_get_numscans _null_ _null_ _null_ ));
DESCR("statistics: number of scans done for table/index");
diff --git a/src/include/nodes/relation.h b/src/include/nodes/relation.h
index de86d01..5ae6b3c 100644
--- a/src/include/nodes/relation.h
+++ b/src/include/nodes/relation.h
@@ -619,9 +619,11 @@ typedef struct MVStatisticInfo
/* enabled statistics */
bool deps_enabled; /* functional dependencies enabled */
+ bool mcv_enabled; /* MCV list enabled */
/* built/available statistics */
bool deps_built; /* functional dependencies built */
+ bool mcv_built; /* MCV list built */
/* columns in the statistics (attnums) */
int2vector *stakeys; /* attnums of the columns covered */
diff --git a/src/include/utils/mvstats.h b/src/include/utils/mvstats.h
index cc43a79..4535db7 100644
--- a/src/include/utils/mvstats.h
+++ b/src/include/utils/mvstats.h
@@ -51,30 +51,89 @@ typedef MVDependenciesData* MVDependencies;
#define MVSTAT_DEPS_TYPE_BASIC 1 /* basic dependencies type */
/*
+ * Multivariate MCV (most-common value) lists
+ *
+ * A straightforward extension of MCV items - i.e. a list (array) of
+ * combinations of attribute values, together with a frequency and
+ * null flags.
+ */
+typedef struct MCVItemData {
+ double frequency; /* frequency of this combination */
+ bool *isnull; /* flags of NULL values (up to 32 columns) */
+ Datum *values; /* variable-length (ndimensions) */
+} MCVItemData;
+
+typedef MCVItemData *MCVItem;
+
+/* multivariate MCV list - essentially an array of MCV items */
+typedef struct MCVListData {
+ uint32 magic; /* magic constant marker */
+ uint32 type; /* type of MCV list (BASIC) */
+ uint32 ndimensions; /* number of dimensions */
+ uint32 nitems; /* number of MCV items in the array */
+ MCVItem *items; /* array of MCV items */
+} MCVListData;
+
+typedef MCVListData *MCVList;
+
+/* used to flag stats serialized to bytea */
+#define MVSTAT_MCV_MAGIC 0xE1A651C2 /* marks serialized bytea */
+#define MVSTAT_MCV_TYPE_BASIC 1 /* basic MCV list type */
+
+/*
+ * Limits used for mcv_max_items option, i.e. we're always guaranteed
+ * to have space for at least MVSTAT_MCVLIST_MIN_ITEMS, and we cannot
+ * have more than MVSTAT_MCVLIST_MAX_ITEMS items.
+ *
+ * This is just a boundary for the 'max' threshold - the actual list
+ * may of course contain fewer items than MVSTAT_MCVLIST_MIN_ITEMS.
+ */
+#define MVSTAT_MCVLIST_MIN_ITEMS 128 /* min items in MCV list */
+#define MVSTAT_MCVLIST_MAX_ITEMS 8192 /* max items in MCV list */
+
+/*
* TODO Maybe fetching the histogram/MCV list separately is inefficient?
* Consider adding a single `fetch_stats` method, fetching all
* stats specified using flags (or something like that).
*/
MVDependencies load_mv_dependencies(Oid mvoid);
+MCVList load_mv_mcvlist(Oid mvoid);
bytea * serialize_mv_dependencies(MVDependencies dependencies);
+bytea * serialize_mv_mcvlist(MCVList mcvlist, int2vector *attrs,
+ VacAttrStats **stats);
/* deserialization of stats (serialization is private to analyze) */
MVDependencies deserialize_mv_dependencies(bytea * data);
+MCVList deserialize_mv_mcvlist(bytea * data);
+
+/*
+ * Returns the index of the attribute number within the vector (i.e. the
+ * dimension within the stats).
+ */
+int mv_get_index(AttrNumber varattno, int2vector * stakeys);
+
+int2vector* find_mv_attnums(Oid mvoid, Oid *relid);
/* FIXME this probably belongs somewhere else (not to operations stats) */
extern Datum pg_mv_stats_dependencies_info(PG_FUNCTION_ARGS);
extern Datum pg_mv_stats_dependencies_show(PG_FUNCTION_ARGS);
+extern Datum pg_mv_stats_mcvlist_info(PG_FUNCTION_ARGS);
+extern Datum pg_mv_mcv_items(PG_FUNCTION_ARGS);
MVDependencies
-build_mv_dependencies(int numrows, HeapTuple *rows,
- int2vector *attrs,
- VacAttrStats **stats);
+build_mv_dependencies(int numrows, HeapTuple *rows, int2vector *attrs,
+ VacAttrStats **stats);
+
+MCVList
+build_mv_mcvlist(int numrows, HeapTuple *rows, int2vector *attrs,
+ VacAttrStats **stats, int *numrows_filtered);
void build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
- int natts, VacAttrStats **vacattrstats);
+ int natts, VacAttrStats **vacattrstats);
-void update_mv_stats(Oid relid, MVDependencies dependencies, int2vector *attrs);
+void update_mv_stats(Oid relid, MVDependencies dependencies, MCVList mcvlist,
+ int2vector *attrs, VacAttrStats **stats);
#endif
diff --git a/src/test/regress/expected/mv_mcv.out b/src/test/regress/expected/mv_mcv.out
new file mode 100644
index 0000000..56748e3
--- /dev/null
+++ b/src/test/regress/expected/mv_mcv.out
@@ -0,0 +1,207 @@
+-- data type passed by value
+CREATE TABLE mcv_list (
+ a INT,
+ b INT,
+ c INT
+);
+-- unknown column
+CREATE STATISTICS s1 ON mcv_list (unknown_column) WITH (mcv);
+ERROR: column "unknown_column" referenced in statistics does not exist
+-- single column
+CREATE STATISTICS s1 ON mcv_list (a) WITH (mcv);
+ERROR: multivariate stats require 2 or more columns
+-- single column, duplicated
+CREATE STATISTICS s1 ON mcv_list (a, a) WITH (mcv);
+ERROR: duplicate column name in statistics definition
+-- two columns, one duplicated
+CREATE STATISTICS s1 ON mcv_list (a, a, b) WITH (mcv);
+ERROR: duplicate column name in statistics definition
+-- unknown option
+CREATE STATISTICS s1 ON mcv_list (a, b, c) WITH (unknown_option);
+ERROR: unrecognized STATISTICS option "unknown_option"
+-- missing MCV statistics
+CREATE STATISTICS s1 ON mcv_list (a, b, c) WITH (dependencies, max_mcv_items=200);
+ERROR: option 'mcv' is required by other options(s)
+-- invalid max_mcv_items value / too low
+CREATE STATISTICS s1 ON mcv_list (a, b, c) WITH (mcv, max_mcv_items=10);
+ERROR: max number of MCV items must be at least 128
+-- invalid max_mcv_items value / too high
+CREATE STATISTICS s1 ON mcv_list (a, b, c) WITH (mcv, max_mcv_items=10000);
+ERROR: max number of MCV items is 8192
+-- correct command
+CREATE STATISTICS s1 ON mcv_list (a, b, c) WITH (mcv);
+-- random data
+INSERT INTO mcv_list
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+ANALYZE mcv_list;
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+ mcv_enabled | mcv_built | pg_mv_stats_mcvlist_info
+-------------+-----------+--------------------------
+ t | f |
+(1 row)
+
+TRUNCATE mcv_list;
+-- a => b, a => c, b => c
+INSERT INTO mcv_list
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE mcv_list;
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+ mcv_enabled | mcv_built | pg_mv_stats_mcvlist_info
+-------------+-----------+--------------------------
+ t | t | nitems=1000
+(1 row)
+
+TRUNCATE mcv_list;
+-- a => b, a => c
+INSERT INTO mcv_list
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE mcv_list;
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+ mcv_enabled | mcv_built | pg_mv_stats_mcvlist_info
+-------------+-----------+--------------------------
+ t | t | nitems=1000
+(1 row)
+
+TRUNCATE mcv_list;
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO mcv_list
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX mcv_idx ON mcv_list (a, b);
+ANALYZE mcv_list;
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+ mcv_enabled | mcv_built | pg_mv_stats_mcvlist_info
+-------------+-----------+--------------------------
+ t | t | nitems=100
+(1 row)
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mcv_list WHERE a = 10 AND b = 5;
+ QUERY PLAN
+--------------------------------------------
+ Bitmap Heap Scan on mcv_list
+ Recheck Cond: ((a = 10) AND (b = 5))
+ -> Bitmap Index Scan on mcv_idx
+ Index Cond: ((a = 10) AND (b = 5))
+(4 rows)
+
+DROP TABLE mcv_list;
+-- varlena type (text)
+CREATE TABLE mcv_list (
+ a TEXT,
+ b TEXT,
+ c TEXT
+);
+CREATE STATISTICS s2 ON mcv_list (a, b, c) WITH (mcv);
+-- random data
+INSERT INTO mcv_list
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+ANALYZE mcv_list;
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+ mcv_enabled | mcv_built | pg_mv_stats_mcvlist_info
+-------------+-----------+--------------------------
+ t | f |
+(1 row)
+
+TRUNCATE mcv_list;
+-- a => b, a => c, b => c
+INSERT INTO mcv_list
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE mcv_list;
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+ mcv_enabled | mcv_built | pg_mv_stats_mcvlist_info
+-------------+-----------+--------------------------
+ t | t | nitems=1000
+(1 row)
+
+TRUNCATE mcv_list;
+-- a => b, a => c
+INSERT INTO mcv_list
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE mcv_list;
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+ mcv_enabled | mcv_built | pg_mv_stats_mcvlist_info
+-------------+-----------+--------------------------
+ t | t | nitems=1000
+(1 row)
+
+TRUNCATE mcv_list;
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO mcv_list
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX mcv_idx ON mcv_list (a, b);
+ANALYZE mcv_list;
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+ mcv_enabled | mcv_built | pg_mv_stats_mcvlist_info
+-------------+-----------+--------------------------
+ t | t | nitems=100
+(1 row)
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mcv_list WHERE a = '10' AND b = '5';
+ QUERY PLAN
+------------------------------------------------------------
+ Bitmap Heap Scan on mcv_list
+ Recheck Cond: ((a = '10'::text) AND (b = '5'::text))
+ -> Bitmap Index Scan on mcv_idx
+ Index Cond: ((a = '10'::text) AND (b = '5'::text))
+(4 rows)
+
+TRUNCATE mcv_list;
+-- check explain (expect bitmap index scan, not plain index scan) with NULLs
+INSERT INTO mcv_list
+ SELECT
+ (CASE WHEN i/10000 = 0 THEN NULL ELSE i/10000 END),
+ (CASE WHEN i/20000 = 0 THEN NULL ELSE i/20000 END),
+ (CASE WHEN i/40000 = 0 THEN NULL ELSE i/40000 END)
+ FROM generate_series(1,1000000) s(i);
+ANALYZE mcv_list;
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+ mcv_enabled | mcv_built | pg_mv_stats_mcvlist_info
+-------------+-----------+--------------------------
+ t | t | nitems=100
+(1 row)
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mcv_list WHERE a IS NULL AND b IS NULL;
+ QUERY PLAN
+---------------------------------------------------
+ Bitmap Heap Scan on mcv_list
+ Recheck Cond: ((a IS NULL) AND (b IS NULL))
+ -> Bitmap Index Scan on mcv_idx
+ Index Cond: ((a IS NULL) AND (b IS NULL))
+(4 rows)
+
+DROP TABLE mcv_list;
+-- NULL values (mix of int and text columns)
+CREATE TABLE mcv_list (
+ a INT,
+ b TEXT,
+ c INT,
+ d TEXT
+);
+CREATE STATISTICS s3 ON mcv_list (a, b, c, d) WITH (mcv);
+INSERT INTO mcv_list
+ SELECT
+ mod(i, 100),
+ (CASE WHEN mod(i, 200) = 0 THEN NULL ELSE mod(i,200) END),
+ mod(i, 400),
+ (CASE WHEN mod(i, 300) = 0 THEN NULL ELSE mod(i,600) END)
+ FROM generate_series(1,10000) s(i);
+ANALYZE mcv_list;
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+ mcv_enabled | mcv_built | pg_mv_stats_mcvlist_info
+-------------+-----------+--------------------------
+ t | t | nitems=1200
+(1 row)
+
+DROP TABLE mcv_list;
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 84b4425..66071d8 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1373,7 +1373,9 @@ pg_mv_stats| SELECT n.nspname AS schemaname,
s.staname,
s.stakeys AS attnums,
length(s.stadeps) AS depsbytes,
- pg_mv_stats_dependencies_info(s.stadeps) AS depsinfo
+ pg_mv_stats_dependencies_info(s.stadeps) AS depsinfo,
+ length(s.stamcv) AS mcvbytes,
+ pg_mv_stats_mcvlist_info(s.stamcv) AS mcvinfo
FROM ((pg_mv_statistic s
JOIN pg_class c ON ((c.oid = s.starelid)))
LEFT JOIN pg_namespace n ON ((n.oid = c.relnamespace)));
diff --git a/src/test/regress/parallel_schedule b/src/test/regress/parallel_schedule
index 4f2ffb8..85d94f1 100644
--- a/src/test/regress/parallel_schedule
+++ b/src/test/regress/parallel_schedule
@@ -112,4 +112,4 @@ test: event_trigger
test: stats
# run tests of multivariate stats
-test: mv_dependencies
+test: mv_dependencies mv_mcv
diff --git a/src/test/regress/serial_schedule b/src/test/regress/serial_schedule
index 097a04f..6584d73 100644
--- a/src/test/regress/serial_schedule
+++ b/src/test/regress/serial_schedule
@@ -163,3 +163,4 @@ test: xml
test: event_trigger
test: stats
test: mv_dependencies
+test: mv_mcv
diff --git a/src/test/regress/sql/mv_mcv.sql b/src/test/regress/sql/mv_mcv.sql
new file mode 100644
index 0000000..af4c9f4
--- /dev/null
+++ b/src/test/regress/sql/mv_mcv.sql
@@ -0,0 +1,178 @@
+-- data type passed by value
+CREATE TABLE mcv_list (
+ a INT,
+ b INT,
+ c INT
+);
+
+-- unknown column
+CREATE STATISTICS s1 ON mcv_list (unknown_column) WITH (mcv);
+
+-- single column
+CREATE STATISTICS s1 ON mcv_list (a) WITH (mcv);
+
+-- single column, duplicated
+CREATE STATISTICS s1 ON mcv_list (a, a) WITH (mcv);
+
+-- two columns, one duplicated
+CREATE STATISTICS s1 ON mcv_list (a, a, b) WITH (mcv);
+
+-- unknown option
+CREATE STATISTICS s1 ON mcv_list (a, b, c) WITH (unknown_option);
+
+-- missing MCV statistics
+CREATE STATISTICS s1 ON mcv_list (a, b, c) WITH (dependencies, max_mcv_items=200);
+
+-- invalid max_mcv_items value / too low
+CREATE STATISTICS s1 ON mcv_list (a, b, c) WITH (mcv, max_mcv_items=10);
+
+-- invalid max_mcv_items value / too high
+CREATE STATISTICS s1 ON mcv_list (a, b, c) WITH (mcv, max_mcv_items=10000);
+
+-- correct command
+CREATE STATISTICS s1 ON mcv_list (a, b, c) WITH (mcv);
+
+-- random data
+INSERT INTO mcv_list
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+
+ANALYZE mcv_list;
+
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+
+TRUNCATE mcv_list;
+
+-- a => b, a => c, b => c
+INSERT INTO mcv_list
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+
+ANALYZE mcv_list;
+
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+
+TRUNCATE mcv_list;
+
+-- a => b, a => c
+INSERT INTO mcv_list
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+
+ANALYZE mcv_list;
+
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+
+TRUNCATE mcv_list;
+
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO mcv_list
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX mcv_idx ON mcv_list (a, b);
+
+ANALYZE mcv_list;
+
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mcv_list WHERE a = 10 AND b = 5;
+
+DROP TABLE mcv_list;
+
+-- varlena type (text)
+CREATE TABLE mcv_list (
+ a TEXT,
+ b TEXT,
+ c TEXT
+);
+
+CREATE STATISTICS s2 ON mcv_list (a, b, c) WITH (mcv);
+
+-- random data
+INSERT INTO mcv_list
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+
+ANALYZE mcv_list;
+
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+
+TRUNCATE mcv_list;
+
+-- a => b, a => c, b => c
+INSERT INTO mcv_list
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+
+ANALYZE mcv_list;
+
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+
+TRUNCATE mcv_list;
+
+-- a => b, a => c
+INSERT INTO mcv_list
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE mcv_list;
+
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+
+TRUNCATE mcv_list;
+
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO mcv_list
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX mcv_idx ON mcv_list (a, b);
+ANALYZE mcv_list;
+
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mcv_list WHERE a = '10' AND b = '5';
+
+TRUNCATE mcv_list;
+
+-- check explain (expect bitmap index scan, not plain index scan) with NULLs
+INSERT INTO mcv_list
+ SELECT
+ (CASE WHEN i/10000 = 0 THEN NULL ELSE i/10000 END),
+ (CASE WHEN i/20000 = 0 THEN NULL ELSE i/20000 END),
+ (CASE WHEN i/40000 = 0 THEN NULL ELSE i/40000 END)
+ FROM generate_series(1,1000000) s(i);
+ANALYZE mcv_list;
+
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mcv_list WHERE a IS NULL AND b IS NULL;
+
+DROP TABLE mcv_list;
+
+-- NULL values (mix of int and text columns)
+CREATE TABLE mcv_list (
+ a INT,
+ b TEXT,
+ c INT,
+ d TEXT
+);
+
+CREATE STATISTICS s3 ON mcv_list (a, b, c, d) WITH (mcv);
+
+INSERT INTO mcv_list
+ SELECT
+ mod(i, 100),
+ (CASE WHEN mod(i, 200) = 0 THEN NULL ELSE mod(i,200) END),
+ mod(i, 400),
+ (CASE WHEN mod(i, 300) = 0 THEN NULL ELSE mod(i,600) END)
+ FROM generate_series(1,10000) s(i);
+
+ANALYZE mcv_list;
+
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+
+DROP TABLE mcv_list;
--
2.1.0
0005-multivariate-histograms.patch
From 820bbb5d00ee143d32dac16f48776346a2ddd81c Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tv@fuzzy.cz>
Date: Sun, 11 Jan 2015 20:18:24 +0100
Subject: [PATCH 5/9] multivariate histograms
- extends the pg_mv_statistic catalog (add 'hist' fields)
- building the histograms during ANALYZE
- simple estimation while planning the queries
Includes regression tests mostly equal to those for functional
dependencies / MCV lists.
---
doc/src/sgml/ref/create_statistics.sgml | 18 +
src/backend/catalog/system_views.sql | 4 +-
src/backend/commands/statscmds.c | 44 +-
src/backend/nodes/outfuncs.c | 2 +
src/backend/optimizer/path/clausesel.c | 606 ++++++++-
src/backend/optimizer/util/plancat.c | 4 +-
src/backend/utils/mvstats/Makefile | 2 +-
src/backend/utils/mvstats/README.histogram | 287 ++++
src/backend/utils/mvstats/README.stats | 2 +
src/backend/utils/mvstats/common.c | 37 +-
src/backend/utils/mvstats/histogram.c | 2032 ++++++++++++++++++++++++++++
src/bin/psql/describe.c | 17 +-
src/include/catalog/pg_mv_statistic.h | 24 +-
src/include/catalog/pg_proc.h | 4 +
src/include/nodes/relation.h | 2 +
src/include/utils/mvstats.h | 136 +-
src/test/regress/expected/mv_histogram.out | 207 +++
src/test/regress/expected/rules.out | 4 +-
src/test/regress/parallel_schedule | 2 +-
src/test/regress/serial_schedule | 1 +
src/test/regress/sql/mv_histogram.sql | 176 +++
21 files changed, 3570 insertions(+), 41 deletions(-)
create mode 100644 src/backend/utils/mvstats/README.histogram
create mode 100644 src/backend/utils/mvstats/histogram.c
create mode 100644 src/test/regress/expected/mv_histogram.out
create mode 100644 src/test/regress/sql/mv_histogram.sql
diff --git a/doc/src/sgml/ref/create_statistics.sgml b/doc/src/sgml/ref/create_statistics.sgml
index 193e4b0..fd3382e 100644
--- a/doc/src/sgml/ref/create_statistics.sgml
+++ b/doc/src/sgml/ref/create_statistics.sgml
@@ -133,6 +133,24 @@ CREATE STATISTICS [ IF NOT EXISTS ] <replaceable class="PARAMETER">statistics_na
</varlistentry>
<varlistentry>
+ <term><literal>histogram</> (<type>boolean</>)</term>
+ <listitem>
+ <para>
+ Enables histogram for the statistics.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><literal>max_buckets</> (<type>integer</>)</term>
+ <listitem>
+ <para>
+ Maximum number of histogram buckets.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
<term><literal>max_mcv_items</> (<type>integer</>)</term>
<listitem>
<para>
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 2d570ee..6afdee0 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -167,7 +167,9 @@ CREATE VIEW pg_mv_stats AS
length(S.stadeps) as depsbytes,
pg_mv_stats_dependencies_info(S.stadeps) as depsinfo,
length(S.stamcv) AS mcvbytes,
- pg_mv_stats_mcvlist_info(S.stamcv) AS mcvinfo
+ pg_mv_stats_mcvlist_info(S.stamcv) AS mcvinfo,
+ length(S.stahist) AS histbytes,
+ pg_mv_stats_histogram_info(S.stahist) AS histinfo
FROM (pg_mv_statistic S JOIN pg_class C ON (C.oid = S.starelid))
LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace);
diff --git a/src/backend/commands/statscmds.c b/src/backend/commands/statscmds.c
index 90bfaed..b974655 100644
--- a/src/backend/commands/statscmds.c
+++ b/src/backend/commands/statscmds.c
@@ -137,12 +137,15 @@ CreateStatistics(CreateStatsStmt *stmt)
/* by default build nothing */
bool build_dependencies = false,
- build_mcv = false;
+ build_mcv = false,
+ build_histogram = false;
- int32 max_mcv_items = -1;
+ int32 max_buckets = -1,
+ max_mcv_items = -1;
/* options required because of other options */
- bool require_mcv = false;
+ bool require_mcv = false,
+ require_histogram = false;
Assert(IsA(stmt, CreateStatsStmt));
@@ -241,6 +244,29 @@ CreateStatistics(CreateStatsStmt *stmt)
MVSTAT_MCVLIST_MAX_ITEMS)));
}
+ else if (strcmp(opt->defname, "histogram") == 0)
+ build_histogram = defGetBoolean(opt);
+ else if (strcmp(opt->defname, "max_buckets") == 0)
+ {
+ max_buckets = defGetInt32(opt);
+
+ /* this option requires 'histogram' to be enabled */
+ require_histogram = true;
+
+ /* sanity check */
+ if (max_buckets < MVSTAT_HIST_MIN_BUCKETS)
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("minimum number of buckets is %d",
+ MVSTAT_HIST_MIN_BUCKETS)));
+
+ else if (max_buckets > MVSTAT_HIST_MAX_BUCKETS)
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("maximum number of buckets is %d",
+ MVSTAT_HIST_MAX_BUCKETS)));
+
+ }
else
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
@@ -249,10 +275,10 @@ CreateStatistics(CreateStatsStmt *stmt)
}
/* check that at least some statistics were requested */
- if (! (build_dependencies || build_mcv))
+ if (! (build_dependencies || build_mcv || build_histogram))
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
- errmsg("no statistics type (dependencies, mcv) was requested")));
+ errmsg("no statistics type (dependencies, mcv, histogram) was requested")));
/* now do some checking of the options */
if (require_mcv && (! build_mcv))
@@ -260,6 +286,11 @@ CreateStatistics(CreateStatsStmt *stmt)
(errcode(ERRCODE_SYNTAX_ERROR),
errmsg("option 'mcv' is required by other options(s)")));
+ if (require_histogram && (! build_histogram))
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("option 'histogram' is required by other options(s)")));
+
/* sort the attnums and build int2vector */
qsort(attnums, numcols, sizeof(int16), compare_int16);
stakeys = buildint2vector(attnums, numcols);
@@ -279,11 +310,14 @@ CreateStatistics(CreateStatsStmt *stmt)
values[Anum_pg_mv_statistic_deps_enabled -1] = BoolGetDatum(build_dependencies);
values[Anum_pg_mv_statistic_mcv_enabled -1] = BoolGetDatum(build_mcv);
+ values[Anum_pg_mv_statistic_hist_enabled -1] = BoolGetDatum(build_histogram);
values[Anum_pg_mv_statistic_mcv_max_items -1] = Int32GetDatum(max_mcv_items);
+ values[Anum_pg_mv_statistic_hist_max_buckets -1] = Int32GetDatum(max_buckets);
nulls[Anum_pg_mv_statistic_stadeps -1] = true;
nulls[Anum_pg_mv_statistic_stamcv -1] = true;
+ nulls[Anum_pg_mv_statistic_stahist -1] = true;
/* insert the tuple into pg_mv_statistic */
mvstatrel = heap_open(MvStatisticRelationId, RowExclusiveLock);
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index e3983fd..d3a96f0 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -1978,10 +1978,12 @@ _outMVStatisticInfo(StringInfo str, const MVStatisticInfo *node)
/* enabled statistics */
WRITE_BOOL_FIELD(deps_enabled);
WRITE_BOOL_FIELD(mcv_enabled);
+ WRITE_BOOL_FIELD(hist_enabled);
/* built/available statistics */
WRITE_BOOL_FIELD(deps_built);
WRITE_BOOL_FIELD(mcv_built);
+ WRITE_BOOL_FIELD(hist_built);
}
static void
diff --git a/src/backend/optimizer/path/clausesel.c b/src/backend/optimizer/path/clausesel.c
index ce7d231..647212a 100644
--- a/src/backend/optimizer/path/clausesel.c
+++ b/src/backend/optimizer/path/clausesel.c
@@ -49,6 +49,7 @@ static void addRangeClause(RangeQueryClause **rqlist, Node *clause,
#define MV_CLAUSE_TYPE_FDEP 0x01
#define MV_CLAUSE_TYPE_MCV 0x02
+#define MV_CLAUSE_TYPE_HIST 0x04
static bool clause_is_mv_compatible(Node *clause, Oid varRelid,
Index *relid, Bitmapset **attnums, SpecialJoinInfo *sjinfo,
@@ -76,6 +77,8 @@ static Selectivity clauselist_mv_selectivity(PlannerInfo *root,
static Selectivity clauselist_mv_selectivity_mcvlist(PlannerInfo *root,
List *clauses, MVStatisticInfo *mvstats,
bool *fullmatch, Selectivity *lowsel);
+static Selectivity clauselist_mv_selectivity_histogram(PlannerInfo *root,
+ List *clauses, MVStatisticInfo *mvstats);
static int update_match_bitmap_mcvlist(PlannerInfo *root, List *clauses,
int2vector *stakeys, MCVList mcvlist,
@@ -83,6 +86,12 @@ static int update_match_bitmap_mcvlist(PlannerInfo *root, List *clauses,
Selectivity *lowsel, bool *fullmatch,
bool is_or);
+static int update_match_bitmap_histogram(PlannerInfo *root, List *clauses,
+ int2vector *stakeys,
+ MVSerializedHistogram mvhist,
+ int nmatches, char * matches,
+ bool is_or);
+
static bool has_stats(List *stats, int type);
static List * find_stats(PlannerInfo *root, List *clauses,
@@ -117,6 +126,7 @@ static Bitmapset * get_varattnos(Node * node, Index relid);
#define UPDATE_RESULT(m,r,isor) \
(m) = (isor) ? (MAX(m,r)) : (MIN(m,r))
+
/****************************************************************************
* ROUTINES TO COMPUTE SELECTIVITIES
****************************************************************************/
@@ -145,7 +155,7 @@ static Bitmapset * get_varattnos(Node * node, Index relid);
*
* First we try to reduce the list of clauses by applying (soft) functional
* dependencies, and then we try to estimate the selectivity of the reduced
- * list of clauses using the multivariate MCV list.
+ * list of clauses using the multivariate MCV list and histograms.
*
* Finally we remove the portion of clauses estimated using multivariate stats,
* and process the rest of the clauses using the regular per-column stats.
@@ -232,12 +242,13 @@ clauselist_selectivity(PlannerInfo *root,
* with the multivariate code and simply skip to estimation using the
* regular per-column stats.
*/
- if (has_stats(stats, MV_CLAUSE_TYPE_MCV) &&
- (count_mv_attnums(clauses, varRelid, sjinfo, MV_CLAUSE_TYPE_MCV) >= 2))
+ if (has_stats(stats, MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST) &&
+ (count_mv_attnums(clauses, varRelid, sjinfo,
+ MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST) >= 2))
{
/* collect attributes from the compatible conditions */
Bitmapset *mvattnums = collect_mv_attnums(clauses, varRelid, NULL, sjinfo,
- MV_CLAUSE_TYPE_MCV);
+ MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST);
/* and search for the statistic covering the most attributes */
MVStatisticInfo *mvstat = choose_mv_statistics(stats, mvattnums);
@@ -249,8 +260,8 @@ clauselist_selectivity(PlannerInfo *root,
/* split the clauselist into regular and mv-clauses */
clauses = clauselist_mv_split(root, sjinfo, clauses,
- varRelid, &mvclauses, mvstat,
- MV_CLAUSE_TYPE_MCV);
+ varRelid, &mvclauses, mvstat,
+ (MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST));
/* we've chosen the histogram to match the clauses */
Assert(mvclauses != NIL);
@@ -962,6 +973,7 @@ static Selectivity
clauselist_mv_selectivity(PlannerInfo *root, List *clauses, MVStatisticInfo *mvstats)
{
bool fullmatch = false;
+ Selectivity s1 = 0.0, s2 = 0.0;
/*
* Lowest frequency in the MCV list (may be used as an upper bound
@@ -975,9 +987,24 @@ clauselist_mv_selectivity(PlannerInfo *root, List *clauses, MVStatisticInfo *mvs
* MCV/histogram evaluation).
*/
- /* Evaluate the MCV selectivity */
- return clauselist_mv_selectivity_mcvlist(root, clauses, mvstats,
+ /* Evaluate the MCV first. */
+ s1 = clauselist_mv_selectivity_mcvlist(root, clauses, mvstats,
&fullmatch, &mcv_low);
+
+ /*
+ * If we got a full equality match on the MCV list, we're done (and
+ * the estimate is pretty good).
+ */
+ if (fullmatch && (s1 > 0.0))
+ return s1;
+
+ /* FIXME if (fullmatch) without matching MCV item, use the mcv_low
+ * selectivity as upper bound */
+
+ s2 = clauselist_mv_selectivity_histogram(root, clauses, mvstats);
+
+ /* TODO clamp to <= 1.0 (or more strictly, when possible) */
+ return s1 + s2;
}
/*
@@ -1136,7 +1163,7 @@ choose_mv_statistics(List *stats, Bitmapset *attnums)
int numattrs = attrs->dim1;
/* skip dependencies-only stats */
- if (! info->mcv_built)
+ if (! (info->mcv_built || info->hist_built))
continue;
/* count columns covered by the histogram */
@@ -1296,7 +1323,6 @@ clause_is_mv_compatible(Node *clause, Oid varRelid,
bool ok;
/* is it 'variable op constant' ? */
-
ok = (bms_membership(clause_relids) == BMS_SINGLETON) &&
(is_pseudo_constant_clause_relids(lsecond(expr->args),
right_relids) ||
@@ -1346,10 +1372,10 @@ clause_is_mv_compatible(Node *clause, Oid varRelid,
case F_SCALARLTSEL:
case F_SCALARGTSEL:
/* not compatible with functional dependencies */
- if (types & MV_CLAUSE_TYPE_MCV)
+ if (types & (MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST))
{
*attnums = bms_add_member(*attnums, var->varattno);
- return (types & MV_CLAUSE_TYPE_MCV);
+ return (types & (MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST));
}
return false;
@@ -1651,6 +1677,9 @@ has_stats(List *stats, int type)
if ((type & MV_CLAUSE_TYPE_MCV) && stat->mcv_built)
return true;
+
+ if ((type & MV_CLAUSE_TYPE_HIST) && stat->hist_built)
+ return true;
}
return false;
@@ -2467,3 +2496,556 @@ update_match_bitmap_mcvlist(PlannerInfo *root, List *clauses,
return nmatches;
}
+
+/*
+ * Estimate selectivity of clauses using a histogram.
+ *
+ * If there's no histogram for the stats, the function returns 0.0.
+ *
+ * The general idea of this method is similar to how MCV lists are
+ * processed, except that this introduces the concept of a partial
+ * match (MCV only works with full match / mismatch).
+ *
+ * The algorithm works like this:
+ *
+ * 1) mark all buckets as 'full match'
+ * 2) walk through all the clauses
+ * 3) for a particular clause, walk through all the buckets
+ * 4) skip buckets that are already 'no match'
+ * 5) check clause for buckets that still match (at least partially)
+ * 6) sum frequencies for buckets to get selectivity
+ *
+ * Unlike MCV lists, histograms have a concept of a partial match. In
+ * that case we use 1/2 the bucket, to minimize the average error. The
+ * MV histograms are usually less detailed than the per-column ones,
+ * meaning the sum is often quite high (thanks to combining a lot of
+ * "partially hit" buckets).
+ *
+ * Maybe we could use per-bucket information with number of distinct
+ * values it contains (for each dimension), and then use that to correct
+ * the estimate (so with 10 distinct values, we'd use 1/10 of the bucket
+ * frequency). We might also scale the value depending on the actual
+ * ndistinct estimate (not just the values observed in the sample).
+ *
+ * Another option would be to multiply the selectivities, i.e. if we get
+ * 'partial match' for a bucket for multiple conditions, we might use
+ * 0.5^k (where k is the number of conditions), instead of 0.5. This
+ * probably does not minimize the average error, though.
+ *
+ * TODO This might use a similar shortcut to MCV lists - count buckets
+ * marked as partial/full match, and terminate once this drops to 0.
+ * Not sure if it's really worth it - for MCV lists a situation like
+ * this is not uncommon, but for histograms it's not that clear.
+ */
+static Selectivity
+clauselist_mv_selectivity_histogram(PlannerInfo *root, List *clauses,
+ MVStatisticInfo *mvstats)
+{
+ int i;
+ Selectivity s = 0.0;
+ Selectivity u = 0.0;
+
+ int nmatches = 0;
+ char *matches = NULL;
+
+ MVSerializedHistogram mvhist = NULL;
+
+ /* there's no histogram */
+ if (! mvstats->hist_built)
+ return 0.0;
+
+ /* There may be no histogram in the stats (check hist_built flag) */
+ mvhist = load_mv_histogram(mvstats->mvoid);
+
+ Assert (mvhist != NULL);
+ Assert (clauses != NIL);
+ Assert (list_length(clauses) >= 2);
+
+ /*
+ * Bitmap of bucket matches (mismatch, partial, full). By default
+ * all buckets fully match, and the clauses gradually eliminate them.
+ */
+ matches = palloc0(sizeof(char) * mvhist->nbuckets);
+ memset(matches, MVSTATS_MATCH_FULL, sizeof(char)*mvhist->nbuckets);
+
+ nmatches = mvhist->nbuckets;
+
+ /* build the match bitmap */
+ update_match_bitmap_histogram(root, clauses,
+ mvstats->stakeys, mvhist,
+ nmatches, matches, false);
+
+ /* now, walk through the buckets and sum the selectivities */
+ for (i = 0; i < mvhist->nbuckets; i++)
+ {
+ /*
+ * Find out what part of the data is covered by the histogram,
+ * so that we can 'scale' the selectivity properly (e.g. when
+ * only 50% of the sample got into the histogram, and the rest
+ * is in an MCV list).
+ *
+ * TODO This might be handled by keeping a global "frequency"
+ * for the whole histogram, which might save us some time
+ * spent accessing the not-matching part of the histogram.
+ * Although it's likely in a cache, so it's very fast.
+ */
+ u += mvhist->buckets[i]->ntuples;
+
+ if (matches[i] == MVSTATS_MATCH_FULL)
+ s += mvhist->buckets[i]->ntuples;
+ else if (matches[i] == MVSTATS_MATCH_PARTIAL)
+ s += 0.5 * mvhist->buckets[i]->ntuples;
+ }
+
+#ifdef DEBUG_MVHIST
+ debug_histogram_matches(mvhist, matches);
+#endif
+
+ /* release the allocated bitmap and deserialized histogram */
+ pfree(matches);
+ pfree(mvhist);
+
+ return s * u;
+}
+
+#define HIST_CACHE_NOT_FOUND 0x00
+#define HIST_CACHE_FALSE 0x01
+#define HIST_CACHE_TRUE 0x03
+#define HIST_CACHE_MASK 0x02
+
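+/*
+ * Illustrative sketch (matching the defines above) of how one call-cache
+ * byte decodes: the HIST_CACHE_FALSE bit says "already called", the
+ * HIST_CACHE_MASK bit carries the cached result.
+ *
+ *     if (cached == HIST_CACHE_NOT_FOUND)
+ *         ... call the operator, store HIST_CACHE_TRUE or HIST_CACHE_FALSE ...
+ *     else
+ *         result = ((cached & HIST_CACHE_MASK) != 0);
+ */
+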
+static char bucket_contains_value(FmgrInfo ltproc, Datum constvalue,
+ Datum min_value, Datum max_value,
+ int min_index, int max_index,
+ bool min_include, bool max_include,
+ char * callcache);
+
+static char bucket_is_smaller_than_value(FmgrInfo opproc, Datum constvalue,
+ Datum min_value, Datum max_value,
+ int min_index, int max_index,
+ bool min_include, bool max_include,
+ char * callcache, bool isgt);
+
+/*
+ * Evaluate clauses using the histogram, and update the match bitmap.
+ *
+ * The bitmap may be already partially set, so this is really a way to
+ * combine results of several clause lists - either when computing
+ * conditional probability P(A|B) or a combination of AND/OR clauses.
+ *
+ * Note: This is not a simple bitmap in the sense that there are more
+ * than two possible values for each item - no match, partial
+ * match and full match. So we need 2 bits per item.
+ *
+ * TODO This works with 'bitmap' where each item is represented as a
+ * char, which is slightly wasteful. Instead, we could use a bitmap
+ * with 2 bits per item, reducing the size to ~1/4. By using values
+ * 0, 1 and 3 (instead of 0, 1 and 2), the operations (merging etc.)
+ * might be performed just like for simple bitmap by using & and |,
+ * which might be faster than min/max.
+ */
+static int
+update_match_bitmap_histogram(PlannerInfo *root, List *clauses,
+ int2vector *stakeys,
+ MVSerializedHistogram mvhist,
+ int nmatches, char * matches,
+ bool is_or)
+{
+ int i;
+ ListCell * l;
+
+ /*
+ * Used for caching function calls, only once per deduplicated value.
+ *
+ * We may have up to (2 * nbuckets) values per dimension. It's
+ * probably overkill, but let's allocate that once for all clauses,
+ * to minimize overhead.
+ *
+ * Also, we only need two bits per value, but this allocates a byte
+ * per value. Might be worth optimizing.
+ *
+ * 0x00 - not yet called
+ * 0x01 - called, result is 'false'
+ * 0x03 - called, result is 'true'
+ */
+ char *callcache = palloc(2 * mvhist->nbuckets);
+
+ Assert(mvhist != NULL);
+ Assert(mvhist->nbuckets > 0);
+ Assert(nmatches >= 0);
+ Assert(nmatches <= mvhist->nbuckets);
+
+ Assert(clauses != NIL);
+ Assert(list_length(clauses) >= 1);
+
+ /* loop through the clauses and do the estimation */
+ foreach (l, clauses)
+ {
+ Node * clause = (Node*)lfirst(l);
+
+ /* if it's a RestrictInfo, then extract the clause */
+ if (IsA(clause, RestrictInfo))
+ clause = (Node*)((RestrictInfo*)clause)->clause;
+
+ /* it's either OpClause, or NullTest */
+ if (is_opclause(clause))
+ {
+ OpExpr * expr = (OpExpr*)clause;
+ bool varonleft = true;
+ bool ok;
+
+ FmgrInfo opproc; /* operator */
+ fmgr_info(get_opcode(expr->opno), &opproc);
+
+ /* reset the cache (per clause) */
+ memset(callcache, 0, 2 * mvhist->nbuckets);
+
+ ok = (NumRelids(clause) == 1) &&
+ (is_pseudo_constant_clause(lsecond(expr->args)) ||
+ (varonleft = false,
+ is_pseudo_constant_clause(linitial(expr->args))));
+
+ if (ok)
+ {
+ FmgrInfo ltproc;
+ RegProcedure oprrest = get_oprrest(expr->opno);
+
+ Var * var = (varonleft) ? linitial(expr->args) : lsecond(expr->args);
+ Const * cst = (varonleft) ? lsecond(expr->args) : linitial(expr->args);
+ bool isgt = (! varonleft);
+
+ /*
+ * TODO Fetch only when really needed (probably for equality only)
+ *
+ * TODO Technically either lt/gt is sufficient.
+ *
+ * FIXME The code in analyze.c creates histograms only for types
+ * with enough ordering (by calling get_sort_group_operators).
+ * Is this the same assumption, i.e. are we certain that we
+ * get the ltproc/gtproc every time we ask? Or are there types
+ * where get_sort_group_operators returns ltopr and here we
+ * get nothing?
+ */
+ TypeCacheEntry *typecache
+ = lookup_type_cache(var->vartype, TYPECACHE_EQ_OPR | TYPECACHE_LT_OPR
+ | TYPECACHE_GT_OPR);
+
+ /* lookup dimension for the attribute */
+ int idx = mv_get_index(var->varattno, stakeys);
+
+ fmgr_info(get_opcode(typecache->lt_opr), <proc);
+
+ /*
+ * Check this for all buckets that still have "true" in the bitmap
+ *
+ * We already know the clauses use suitable operators (because that's
+ * how we filtered them).
+ */
+ for (i = 0; i < mvhist->nbuckets; i++)
+ {
+ char res = MVSTATS_MATCH_NONE;
+
+ MVSerializedBucket bucket = mvhist->buckets[i];
+
+ /* histogram boundaries */
+ Datum minval, maxval;
+ bool mininclude, maxinclude;
+ int minidx, maxidx;
+
+ /*
+ * For AND-lists, we can also mark NULL buckets as 'no match'
+ * (and then skip them). For OR-lists this is not possible.
+ */
+ if ((! is_or) && bucket->nullsonly[idx])
+ matches[i] = MVSTATS_MATCH_NONE;
+
+ /*
+ * Skip buckets that were already eliminated - this is important
+ * considering how we update the info (we only lower the match).
+ * We can't really do anything about the MATCH_PARTIAL buckets.
+ */
+ if ((! is_or) && (matches[i] == MVSTATS_MATCH_NONE))
+ continue;
+ else if (is_or && (matches[i] == MVSTATS_MATCH_FULL))
+ continue;
+
+ /* lookup the values and cache of function calls */
+ minidx = bucket->min[idx];
+ maxidx = bucket->max[idx];
+
+ minval = mvhist->values[idx][bucket->min[idx]];
+ maxval = mvhist->values[idx][bucket->max[idx]];
+
+ mininclude = bucket->min_inclusive[idx];
+ maxinclude = bucket->max_inclusive[idx];
+
+ /*
+ * TODO Maybe it's possible to add here a similar optimization
+ * as for the MCV lists:
+ *
+ * (nmatches == 0) && AND-list => all eliminated (FALSE)
+ * (nmatches == N) && OR-list => all eliminated (TRUE)
+ *
+ * But it's more complex because of the partial matches.
+ */
+
+ /*
+ * If it's not a "<" or ">" or "=" operator, just ignore the
+ * clause. Otherwise note the relid and attnum for the variable.
+ *
+ * TODO I'm really unsure whether the handling of the 'isgt' flag (that is, clauses
+ * with reverse order of variable/constant) is correct. I wouldn't
+ * be surprised if there was some mixup. Using the lt/gt operators
+ * instead of messing with the opproc could make it simpler.
+ * It would however be using a different operator than the query,
+ * although it's not any shadier than using the selectivity function
+ * as is done currently.
+ */
+ switch (oprrest)
+ {
+ case F_SCALARLTSEL: /* Var < Const */
+
+ res = bucket_is_smaller_than_value(opproc, cst->constvalue,
+ minval, maxval,
+ minidx, maxidx,
+ mininclude, maxinclude,
+ callcache, isgt);
+ break;
+
+ case F_SCALARGTSEL: /* Const < Var */
+
+ res = bucket_is_smaller_than_value(opproc, cst->constvalue,
+ minval, maxval,
+ minidx, maxidx,
+ mininclude, maxinclude,
+ callcache, isgt);
+ break;
+
+ case F_EQSEL:
+
+ /*
+ * We only check whether the value is within the bucket, using the
+ * lt operator, and we also check for equality with the boundaries.
+ */
+
+ res = bucket_contains_value(ltproc, cst->constvalue,
+ minval, maxval,
+ minidx, maxidx,
+ mininclude, maxinclude,
+ callcache);
+ break;
+ }
+
+ UPDATE_RESULT(matches[i], res, is_or);
+
+ }
+ }
+ }
+ else if (IsA(clause, NullTest))
+ {
+ NullTest * expr = (NullTest*)clause;
+ Var * var = (Var*)(expr->arg);
+
+ /* FIXME proper matching attribute to dimension */
+ int idx = mv_get_index(var->varattno, stakeys);
+
+ /*
+ * Walk through the buckets and evaluate the current clause. We can
+ * skip items that were already ruled out, and terminate if there are
+ * no remaining buckets that might possibly match.
+ */
+ for (i = 0; i < mvhist->nbuckets; i++)
+ {
+ MVSerializedBucket bucket = mvhist->buckets[i];
+
+ /*
+ * Skip buckets that were already eliminated - this is important
+ * considering how we update the info (we only lower the match)
+ */
+ if ((! is_or) && (matches[i] == MVSTATS_MATCH_NONE))
+ continue;
+ else if (is_or && (matches[i] == MVSTATS_MATCH_FULL))
+ continue;
+
+ /* if the clause mismatches the MCV item, set it as MATCH_NONE */
+ if ((expr->nulltesttype == IS_NULL)
+ && (! bucket->nullsonly[idx]))
+ UPDATE_RESULT(matches[i], MVSTATS_MATCH_NONE, is_or);
+
+ else if ((expr->nulltesttype == IS_NOT_NULL) &&
+ (bucket->nullsonly[idx]))
+ UPDATE_RESULT(matches[i], MVSTATS_MATCH_NONE, is_or);
+ }
+ }
+ else if (or_clause(clause) || and_clause(clause))
+ {
+ /* AND/OR clause, with all clauses compatible with the selected MV stat */
+
+ int i;
+ BoolExpr *orclause = ((BoolExpr*)clause);
+ List *orclauses = orclause->args;
+
+ /* match/mismatch bitmap for each bucket */
+ int or_nmatches = 0;
+ char * or_matches = NULL;
+
+ Assert(orclauses != NIL);
+ Assert(list_length(orclauses) >= 2);
+
+ /* number of matching buckets */
+ or_nmatches = mvhist->nbuckets;
+
+ /* by default none of the buckets matches the clauses */
+ or_matches = palloc0(sizeof(char) * or_nmatches);
+
+ if (or_clause(clause))
+ {
+ /* OR clauses assume nothing matches, initially */
+ memset(or_matches, MVSTATS_MATCH_NONE, sizeof(char)*or_nmatches);
+ or_nmatches = 0;
+ }
+ else
+ {
+ /* AND clauses assume everything matches, initially */
+ memset(or_matches, MVSTATS_MATCH_FULL, sizeof(char)*or_nmatches);
+ }
+
+ /* build the match bitmap for the OR-clauses */
+ or_nmatches = update_match_bitmap_histogram(root, orclauses,
+ stakeys, mvhist,
+ or_nmatches, or_matches, or_clause(clause));
+
+ /* merge the bitmap into the existing one */
+ for (i = 0; i < mvhist->nbuckets; i++)
+ {
+ /*
+ * To AND-merge the bitmaps, a MIN() semantics is used.
+ * For OR-merge, use MAX().
+ *
+ * FIXME this does not decrease the number of matches
+ */
+ UPDATE_RESULT(matches[i], or_matches[i], is_or);
+ }
+
+ pfree(or_matches);
+
+ }
+ else
+ elog(ERROR, "unknown clause type: %d", clause->type);
+ }
+
+ /* free the call cache */
+ pfree(callcache);
+
+ return nmatches;
+}
+
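+/*
+ * A minimal sketch of the 2-bit encoding suggested in the TODO above
+ * update_match_bitmap_histogram(): with MATCH_NONE = 0 (binary 00),
+ * MATCH_PARTIAL = 1 (binary 01) and MATCH_FULL = 3 (binary 11), the
+ * AND-merge (MIN) becomes bitwise AND and the OR-merge (MAX) becomes
+ * bitwise OR:
+ *
+ *     merged = is_or ? (m | r) : (m & r);
+ *
+ * e.g. FULL & PARTIAL = 11 & 01 = 01 = PARTIAL, and NONE | PARTIAL
+ * = 00 | 01 = 01 = PARTIAL, which matches the MIN/MAX semantics of
+ * UPDATE_RESULT.
+ */
+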
+static char
+bucket_contains_value(FmgrInfo ltproc, Datum constvalue,
+ Datum min_value, Datum max_value,
+ int min_index, int max_index,
+ bool min_include, bool max_include,
+ char * callcache)
+{
+ bool a, b;
+
+ char min_cached = callcache[min_index];
+ char max_cached = callcache[max_index];
+
+ /*
+ * First some quick checks on equality - if any of the boundaries equals,
+ * we have a partial match (so no need to call the comparator).
+ */
+ if (((min_value == constvalue) && (min_include)) ||
+ ((max_value == constvalue) && (max_include)))
+ return MVSTATS_MATCH_PARTIAL;
+
+ /* Keep the values 0/1 because of the XOR at the end. */
+ a = ((min_cached & HIST_CACHE_MASK) >> 1);
+ b = ((max_cached & HIST_CACHE_MASK) >> 1);
+
+ /*
+ * If the result for the bucket lower bound is not in the cache, evaluate
+ * the function and store the result in the cache.
+ */
+ if (! min_cached)
+ {
+ a = DatumGetBool(FunctionCall2Coll(<proc,
+ DEFAULT_COLLATION_OID,
+ constvalue, min_value));
+ /* remember the result */
+ callcache[min_index] = (a) ? HIST_CACHE_TRUE : HIST_CACHE_FALSE;
+ }
+
+ /* And do the same for the upper bound. */
+ if (! max_cached)
+ {
+ b = DatumGetBool(FunctionCall2Coll(<proc,
+ DEFAULT_COLLATION_OID,
+ constvalue, max_value));
+ /* remember the result */
+ callcache[max_index] = (b) ? HIST_CACHE_TRUE : HIST_CACHE_FALSE;
+ }
+
+ return (a ^ b) ? MVSTATS_MATCH_PARTIAL : MVSTATS_MATCH_NONE;
+}
+
+static char
+bucket_is_smaller_than_value(FmgrInfo opproc, Datum constvalue,
+ Datum min_value, Datum max_value,
+ int min_index, int max_index,
+ bool min_include, bool max_include,
+ char * callcache, bool isgt)
+{
+ char min_cached = callcache[min_index];
+ char max_cached = callcache[max_index];
+
+ /* Keep the values 0/1 because of the XOR at the end. */
+ bool a = ((min_cached & HIST_CACHE_MASK) >> 1);
+ bool b = ((max_cached & HIST_CACHE_MASK) >> 1);
+
+ if (! min_cached)
+ {
+ a = DatumGetBool(FunctionCall2Coll(&opproc,
+ DEFAULT_COLLATION_OID,
+ min_value,
+ constvalue));
+ /* remember the result */
+ callcache[min_index] = (a) ? HIST_CACHE_TRUE : HIST_CACHE_FALSE;
+ }
+
+ if (! max_cached)
+ {
+ b = DatumGetBool(FunctionCall2Coll(&opproc,
+ DEFAULT_COLLATION_OID,
+ max_value,
+ constvalue));
+ /* remember the result */
+ callcache[max_index] = (b) ? HIST_CACHE_TRUE : HIST_CACHE_FALSE;
+ }
+
+ /*
+ * Now, we need to combine both results into the final answer, and we need
+ * to be careful about the 'isgt' variable which kinda inverts the meaning.
+ *
+ * First, we handle the case when each boundary returns different results.
+ * In that case the outcome can only be 'partial' match.
+ */
+ if (a != b)
+ return MVSTATS_MATCH_PARTIAL;
+
+ /*
+ * When the results are the same, then it depends on the 'isgt' value. There
+ * are four options:
+ *
+ * isgt=false a=b=true => full match
+ * isgt=false a=b=false => empty
+ * isgt=true a=b=true => empty
+ * isgt=true a=b=false => full match
+ *
+ * We'll cheat a bit, because we know that (a=b) so we'll use just one of them.
+ */
+ if (isgt)
+ return (!a) ? MVSTATS_MATCH_FULL : MVSTATS_MATCH_NONE;
+ else
+ return ( a) ? MVSTATS_MATCH_FULL : MVSTATS_MATCH_NONE;
+}
diff --git a/src/backend/optimizer/util/plancat.c b/src/backend/optimizer/util/plancat.c
index a92f889..d46aed2 100644
--- a/src/backend/optimizer/util/plancat.c
+++ b/src/backend/optimizer/util/plancat.c
@@ -416,7 +416,7 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
mvstat = (Form_pg_mv_statistic) GETSTRUCT(htup);
/* unavailable stats are not interesting for the planner */
- if (mvstat->deps_built || mvstat->mcv_built)
+ if (mvstat->deps_built || mvstat->mcv_built || mvstat->hist_built)
{
info = makeNode(MVStatisticInfo);
@@ -426,10 +426,12 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
/* enabled statistics */
info->deps_enabled = mvstat->deps_enabled;
info->mcv_enabled = mvstat->mcv_enabled;
+ info->hist_enabled = mvstat->hist_enabled;
/* built/available statistics */
info->deps_built = mvstat->deps_built;
info->mcv_built = mvstat->mcv_built;
+ info->hist_built = mvstat->hist_built;
/* stakeys */
adatum = SysCacheGetAttr(MVSTATOID, htup,
diff --git a/src/backend/utils/mvstats/Makefile b/src/backend/utils/mvstats/Makefile
index f9bf10c..9dbb3b6 100644
--- a/src/backend/utils/mvstats/Makefile
+++ b/src/backend/utils/mvstats/Makefile
@@ -12,6 +12,6 @@ subdir = src/backend/utils/mvstats
top_builddir = ../../../..
include $(top_builddir)/src/Makefile.global
-OBJS = common.o dependencies.o mcv.o
+OBJS = common.o dependencies.o histogram.o mcv.o
include $(top_srcdir)/src/backend/common.mk
diff --git a/src/backend/utils/mvstats/README.histogram b/src/backend/utils/mvstats/README.histogram
new file mode 100644
index 0000000..8234d2c
--- /dev/null
+++ b/src/backend/utils/mvstats/README.histogram
@@ -0,0 +1,287 @@
+Multivariate histograms
+=======================
+
+Histograms on individual attributes consist of buckets represented by ranges,
+covering the domain of the attribute. That is, each bucket is a [min,max]
+interval, and contains all values in this range. The histogram is built in such
+a way that all buckets have about the same frequency.
+
+Multivariate histograms are an extension into n-dimensional space - the buckets
+are n-dimensional intervals (i.e. n-dimensional rectangles), covering the domain
+of the combination of attributes. That is, each bucket has a vector of lower
+and upper boundaries, denoted min[i] and max[i] (where i = 1..n).
+
+In addition to the boundaries, each bucket tracks additional info:
+
+ * frequency (fraction of tuples in the bucket)
+ * whether the boundaries are inclusive or exclusive
+ * whether the dimension contains only NULL values
+ * number of distinct values in each dimension (for building only)
+
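+As a rough illustration (the names here are made up for this README - the
+actual structs live in src/include/utils/mvstats.h), a bucket might look
+like this:
+
+    typedef struct ExampleBucket
+    {
+        float   ntuples;        /* frequency (fraction of sample rows) */
+        bool   *nullsonly;      /* per dimension: NULL-only dimension? */
+        bool   *min_inclusive;  /* per dimension: lower bound inclusive? */
+        bool   *max_inclusive;  /* per dimension: upper bound inclusive? */
+        Datum  *min;            /* per dimension: lower boundary value */
+        Datum  *max;            /* per dimension: upper boundary value */
+    } ExampleBucket;
+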
+It's possible that in the future we'll have multiple histogram types, with
+different features. We do however expect all the types to share the same
+representation (buckets as ranges) and only differ in how we build them.
+
+The current implementation builds non-overlapping buckets, but that may not be
+true for all histogram types, so the code should not rely on this assumption.
+There are interesting types of histograms (or algorithms) with overlapping
+buckets.
+
+When used on low-cardinality data, histograms usually perform considerably worse
+than MCV lists (which are a good fit for this kind of data). This is especially
+true for label-like values, where the ordering of the values is mostly
+unrelated to the meaning of the data, as proper ordering is crucial for
+histograms.
+
+On high-cardinality data the histograms are usually a better choice, because MCV
+lists can't represent the distribution accurately enough.
+
+
+Selectivity estimation
+----------------------
+
+The estimation is implemented in clauselist_mv_selectivity_histogram(), and
+works very similarly to clauselist_mv_selectivity_mcvlist.
+
+The main difference is that while MCV lists support exact matches, histograms
+often result in approximate matches - e.g. with equality we can only say if
+the constant would be part of the bucket, but not whether it really is there
+or what fraction of the bucket it corresponds to. In this case we rely on
+some defaults just like in the per-column histograms.
+
+The current implementation uses histograms to estimate these types of clauses
+(think of WHERE conditions):
+
+ (a) equality clauses WHERE (a = 1) AND (b = 2)
+ (b) inequality clauses WHERE (a < 1) AND (b >= 2)
+ (c) NULL clauses WHERE (a IS NULL) AND (b IS NOT NULL)
+ (d) OR-clauses WHERE (a = 1) OR (b = 2)
+
+Similarly to MCV lists, it's possible to add support for additional types of
+clauses, for example:
+
+ (e) multi-var clauses WHERE (a > b)
+
+and so on. These are tasks for the future, not yet implemented.
+
+
+When evaluating a clause on a bucket, we may get one of three results:
+
+ (a) FULL_MATCH - The bucket definitely matches the clause.
+
+ (b) PARTIAL_MATCH - The bucket matches the clause, but not necessarily all
+ the tuples it represents.
+
+ (c) NO_MATCH - The bucket definitely does not match the clause.
+
+This may be illustrated using a range [1, 5], which is essentially a 1-D
+bucket. With these clauses:
+
+    WHERE (a < 10)  =>  FULL_MATCH     (all values in the range are below
+                                        10, so the whole bucket matches)
+
+    WHERE (a < 3)   =>  PARTIAL_MATCH  (there may be values matching the
+                                        clause, but we don't know how many)
+
+    WHERE (a < 0)   =>  NO_MATCH       (all values in the range are >= 1,
+                                        so none of them can match)
+
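+In code, classifying an inequality clause against one dimension boils down
+to comparing the constant to both boundaries. A simplified sketch of what
+bucket_is_smaller_than_value() in clausesel.c does (ignoring the inclusive
+flags and the function-call cache):
+
+    /* classify (var < cst) against the bucket's [min, max] range */
+    bool  min_lt = DatumGetBool(FunctionCall2Coll(&lt, collation, min, cst));
+    bool  max_lt = DatumGetBool(FunctionCall2Coll(&lt, collation, max, cst));
+
+    if (min_lt && max_lt)
+        return FULL_MATCH;      /* the whole bucket is below the constant */
+    else if (min_lt != max_lt)
+        return PARTIAL_MATCH;   /* the constant falls inside the bucket */
+    else
+        return NO_MATCH;        /* the whole bucket is above the constant */
+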
+Some clauses may produce only some of those results - for example equality
+clauses may never produce FULL_MATCH as we always hit only part of the bucket
+(we can't match both boundaries at the same time). This results in less accurate
+estimates compared to MCV lists, where we can hit an MCV item exactly (there's
+no PARTIAL match for MCV lists).
+
+There are also clauses that may not produce any PARTIAL_MATCH results. A nice
+example of that is the 'IS [NOT] NULL' clause, which either matches the bucket
+completely (FULL_MATCH) or not at all (NO_MATCH), thanks to how the NULL-buckets
+are constructed.
+
+Computing the total selectivity estimate is trivial - simply sum selectivities
+from all the FULL_MATCH and PARTIAL_MATCH buckets (but for buckets marked with
+PARTIAL_MATCH, multiply the frequency by 0.5 to minimize the average error).
+
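+In code, this is a single pass over the match bitmap (this is what
+clauselist_mv_selectivity_histogram() in clausesel.c does):
+
+    Selectivity s = 0.0;
+
+    for (i = 0; i < nbuckets; i++)
+    {
+        if (matches[i] == MVSTATS_MATCH_FULL)
+            s += buckets[i]->ntuples;
+        else if (matches[i] == MVSTATS_MATCH_PARTIAL)
+            s += 0.5 * buckets[i]->ntuples;
+    }
+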
+
+Building a histogram
+---------------------
+
+The algorithm of building a histogram in general is quite simple:
+
+ (a) create an initial bucket (containing all sample rows)
+
+ (b) create NULL buckets (by splitting the initial bucket)
+
+ (c) repeat
+
+ (1) choose bucket to split next
+
+ (2) terminate if no bucket that might be split is found, or if we've
+ reached the maximum number of buckets (16384)
+
+ (3) choose dimension to partition the bucket by
+
+ (4) partition the bucket by the selected dimension
+
+The main complexity is hidden in steps (c.1) and (c.3), i.e. how we choose the
+bucket and dimension for the split.
+
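+Condensed into code (a simplified sketch of the loop in build_mv_histogram()
+in histogram.c), the partitioning phase looks like this:
+
+    while (histogram->nbuckets < MVSTAT_HIST_MAX_BUCKETS)
+    {
+        MVBucket bucket = select_bucket_to_partition(histogram->nbuckets,
+                                                     histogram->buckets);
+
+        /* no bucket eligible for partitioning, so terminate */
+        if (bucket == NULL)
+            break;
+
+        /* split the bucket, adding one new bucket to the histogram */
+        histogram->buckets[histogram->nbuckets++]
+            = partition_bucket(bucket, attrs, stats, ndistvalues, distvalues);
+    }
+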
+Similarly to one-dimensional histograms, we want to produce buckets with roughly
+the same frequency. We also need to produce "regular" buckets, because buckets
+with one "side" much longer than the others are very likely to match a lot of
+conditions (which increases error, even if the bucket frequency is very low).
+
+To achieve this, we choose the largest bucket (containing the most sample rows),
+but we only choose buckets that can actually be split (have at least 3 different
+combinations of values).
+
+Then we choose the "longest" dimension of the bucket, which is computed by using
+the distinct values in the sample as a measure.
+
+For details see functions select_bucket_to_partition() and partition_bucket().
+
+The current limit on number of buckets (16384) is mostly arbitrary, but chosen
+so that it guarantees we don't exceed the number of distinct values indexable by
+uint16 in any of the dimensions. In practice we could handle more buckets as we
+index each dimension separately and the splits should use the dimensions evenly.
+
+Also, histograms this large (with 16k values in multiple dimensions) would be
+quite expensive to build and process, so the 16k limit is rather reasonable.
+
+The actual number of buckets is also related to statistics target, because we
+require MIN_BUCKET_ROWS (10) tuples per bucket before a split, so we can't have
+more than (2 * 300 * target / 10) buckets. For the default target (100) this
+evaluates to ~6k.
+
+
+NULL handling (create_null_buckets)
+-----------------------------------
+
+When building histograms on a single attribute, we first filter out NULL values.
+In the multivariate case, we can't really do that because the rows may contain
+a mix of NULL and non-NULL values in different columns (so we can't simply
+filter all of them out).
+
+For this reason, the histograms are built so that in each bucket, each
+dimension contains either only NULL or only non-NULL values. Building the
+NULL-buckets happens as the first step of the build, in the
+create_null_buckets() function. The number of buckets produced by this step
+has a clear upper bound of 2^N, where N is the number of dimensions
+(attributes the histogram is built on) - e.g. for two columns there may be
+up to four buckets: (NULL, NULL), (NULL, not NULL), (not NULL, NULL) and
+(not NULL, not NULL). Or rather 2^K, where K is the number of attributes
+that are not marked as NOT NULL.
+
+The buckets with NULL dimensions are then subject to the same build algorithm
+(i.e. may be split into smaller buckets) just like any other bucket, but may
+only be split by a non-NULL dimension.
+
+
+Serialization
+-------------
+
+To store the histogram in the pg_mv_statistic catalog, it is serialized into
+a more efficient form. We also use this representation for estimation, i.e.
+we don't fully deserialize the histogram.
+
+For example the boundary values are deduplicated to minimize the required space.
+How much redundancy is there, actually? Let's assume there are no NULL values,
+so we start with a single bucket - in that case we have 2*N boundaries. Each
+time we split a bucket we introduce one new value (in the "middle" of one of
+the dimensions), and keep boundaries for all the other dimensions. So after K
+splits, we have up to
+
+ 2*N + K
+
+unique boundary values (we may have fewer values, if the same value is used for
+several splits). But after K splits we do have (K+1) buckets, so
+
+ (K+1) * 2 * N
+
+boundary values. Using e.g. N=4 and K=999, we arrive at these numbers:
+
+ 2*N + K = 1007
+ (K+1) * 2 * N = 8000
+
+which means a lot of redundancy. It's somewhat counter-intuitive that the number
+of distinct values does not really depend on the number of dimensions (except
+for the initial bucket, but that's negligible compared to the total).
+
+By deduplicating the values and replacing them with 16-bit indexes (uint16), we
+reduce the required space to
+
+ 1007 * 8 + 8000 * 2 ~= 24kB
+
+which is significantly less than 64kB required for the 'raw' histogram (assuming
+the values are 8B).
+
+While the bytea compression (pglz) might achieve the same reduction of space,
+the deduplicated representation is used to optimize the estimation by caching
+results of function calls for already visited values. This significantly
+reduces the number of calls to (often quite expensive) operators.
+
+Note: Of course, this reasoning only holds for histograms built by the algorithm
+that simply splits the buckets in half. Other histogram types (e.g. containing
+overlapping buckets) may behave differently and require different serialization.
+
+Serialized histograms are marked with a 'magic' constant, to make it easier to
+check the bytea value really is a serialized histogram.
+
+
+varlena compression
+-------------------
+
+This serialization may however disable automatic varlena compression, because
+the array of unique values is placed at the beginning of the serialized form,
+which is exactly the chunk used by pglz to check whether the data is
+compressible - and it will probably decide it's not very compressible. This
+is similar to the issue we had with JSONB initially.
+
+Maybe storing buckets first would make it work, as the buckets may be better
+compressible.
+
+On the other hand the serialization is actually a context-aware compression,
+usually compressing to ~30% (or even less, with large data types). So the lack
+of additional pglz compression may be acceptable.
+
+
+Deserialization
+---------------
+
+The deserialization is not a perfect inverse of the serialization, as we keep
+the deduplicated arrays. This reduces the amount of memory and also allows
+optimizations during estimation (e.g. we can cache results for the distinct
+values, saving expensive function calls).
+
+
+Inspecting the histogram
+------------------------
+
+Inspecting the regular (per-attribute) histograms is trivial, as it's enough
+to select the columns from pg_stats - the data is encoded as anyarray, so we
+simply get the text representation of the array.
+
+With multivariate histograms it's not that simple due to the possible mix of
+data types in the histogram. It might be possible to produce similar array-like
+text representation, but that'd unnecessarily complicate further processing
+and analysis of the histogram. Instead, there's an SRF function that allows
+access to lower/upper boundaries, frequencies etc.
+
+ SELECT * FROM pg_mv_histogram_buckets(oid, otype);
+
+It has two input parameters:
+
+ oid - OID of the histogram (pg_mv_statistic.staoid)
+ otype - type of output
+
+and produces a table with these columns:
+
+ - bucket ID (0...nbuckets-1)
+ - lower bucket boundaries (string array)
+ - upper bucket boundaries (string array)
+ - nulls only dimensions (boolean array)
+ - lower boundary inclusive (boolean array)
+ - upper boundary inclusive (boolean array)
+ - frequency (double precision)
+
+The 'otype' accepts three values, determining what will be returned in the
+lower/upper boundary arrays:
+
+ - 0 - values stored in the histogram, encoded as text
+ - 1 - indexes into the deduplicated arrays
+ - 2 - indexes into the deduplicated arrays, scaled to [0,1]
diff --git a/src/backend/utils/mvstats/README.stats b/src/backend/utils/mvstats/README.stats
index 5c5c59a..3e4f4d1 100644
--- a/src/backend/utils/mvstats/README.stats
+++ b/src/backend/utils/mvstats/README.stats
@@ -18,6 +18,8 @@ Currently we only have two kinds of multivariate statistics
(b) MCV lists (README.mcv)
+ (c) multivariate histograms (README.histogram)
+
Compatible clause types
-----------------------
diff --git a/src/backend/utils/mvstats/common.c b/src/backend/utils/mvstats/common.c
index d1da714..ffb76f4 100644
--- a/src/backend/utils/mvstats/common.c
+++ b/src/backend/utils/mvstats/common.c
@@ -13,11 +13,11 @@
*
*-------------------------------------------------------------------------
*/
+#include "postgres.h"
+#include "utils/array.h"
#include "common.h"
-#include "utils/array.h"
-
static VacAttrStats ** lookup_var_attr_stats(int2vector *attrs,
int natts,
VacAttrStats **vacattrstats);
@@ -52,7 +52,8 @@ build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
MVStatisticInfo *stat = (MVStatisticInfo *)lfirst(lc);
MVDependencies deps = NULL;
MCVList mcvlist = NULL;
- int numrows_filtered = 0;
+ MVHistogram histogram = NULL;
+ int numrows_filtered = numrows;
VacAttrStats **stats = NULL;
int numatts = 0;
@@ -95,8 +96,12 @@ build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
if (stat->mcv_enabled)
mcvlist = build_mv_mcvlist(numrows, rows, attrs, stats, &numrows_filtered);
+ /* build a multivariate histogram on the columns */
+ if ((numrows_filtered > 0) && (stat->hist_enabled))
+ histogram = build_mv_histogram(numrows_filtered, rows, attrs, stats, numrows);
+
/* store the histogram / MCV list in the catalog */
- update_mv_stats(stat->mvoid, deps, mcvlist, attrs, stats);
+ update_mv_stats(stat->mvoid, deps, mcvlist, histogram, attrs, stats);
}
}
@@ -176,6 +181,8 @@ list_mv_stats(Oid relid)
info->deps_built = stats->deps_built;
info->mcv_enabled = stats->mcv_enabled;
info->mcv_built = stats->mcv_built;
+ info->hist_enabled = stats->hist_enabled;
+ info->hist_built = stats->hist_built;
result = lappend(result, info);
}
@@ -190,7 +197,6 @@ list_mv_stats(Oid relid)
return result;
}
-
/*
* Find attnums of MV stats using the mvoid.
*/
@@ -236,9 +242,16 @@ find_mv_attnums(Oid mvoid, Oid *relid)
}
+/*
+ * FIXME This adds statistics, but we need to drop statistics when the
+ * table is dropped. Not sure what to do when a column is dropped.
+ * Either we can (a) remove all stats on that column, (b) remove
+ * the column from defined stats and force rebuild, (c) remove the
+ * column on next ANALYZE. Or maybe something else?
+ */
void
update_mv_stats(Oid mvoid,
- MVDependencies dependencies, MCVList mcvlist,
+ MVDependencies dependencies, MCVList mcvlist, MVHistogram histogram,
int2vector *attrs, VacAttrStats **stats)
{
HeapTuple stup,
@@ -271,22 +284,34 @@ update_mv_stats(Oid mvoid,
values[Anum_pg_mv_statistic_stamcv - 1] = PointerGetDatum(data);
}
+ if (histogram != NULL)
+ {
+ bytea * data = serialize_mv_histogram(histogram, attrs, stats);
+ nulls[Anum_pg_mv_statistic_stahist-1] = (data == NULL);
+ values[Anum_pg_mv_statistic_stahist - 1]
+ = PointerGetDatum(data);
+ }
+
/* always replace the value (either by bytea or NULL) */
replaces[Anum_pg_mv_statistic_stadeps -1] = true;
replaces[Anum_pg_mv_statistic_stamcv -1] = true;
+ replaces[Anum_pg_mv_statistic_stahist-1] = true;
/* always change the availability flags */
nulls[Anum_pg_mv_statistic_deps_built -1] = false;
nulls[Anum_pg_mv_statistic_mcv_built -1] = false;
+ nulls[Anum_pg_mv_statistic_hist_built-1] = false;
nulls[Anum_pg_mv_statistic_stakeys-1] = false;
/* use the new attnums, in case we removed some dropped ones */
replaces[Anum_pg_mv_statistic_deps_built-1] = true;
replaces[Anum_pg_mv_statistic_mcv_built -1] = true;
+ replaces[Anum_pg_mv_statistic_hist_built -1] = true;
replaces[Anum_pg_mv_statistic_stakeys -1] = true;
values[Anum_pg_mv_statistic_deps_built-1] = BoolGetDatum(dependencies != NULL);
values[Anum_pg_mv_statistic_mcv_built -1] = BoolGetDatum(mcvlist != NULL);
+ values[Anum_pg_mv_statistic_hist_built -1] = BoolGetDatum(histogram != NULL);
values[Anum_pg_mv_statistic_stakeys -1] = PointerGetDatum(attrs);
/* Is there already a pg_mv_statistic tuple for this attribute? */
diff --git a/src/backend/utils/mvstats/histogram.c b/src/backend/utils/mvstats/histogram.c
new file mode 100644
index 0000000..9e5620a
--- /dev/null
+++ b/src/backend/utils/mvstats/histogram.c
@@ -0,0 +1,2032 @@
+/*-------------------------------------------------------------------------
+ *
+ * histogram.c
+ * POSTGRES multivariate histograms
+ *
+ *
+ * Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ * src/backend/utils/mvstats/histogram.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres.h"
+
+#include "fmgr.h"
+#include "funcapi.h"
+
+#include "utils/lsyscache.h"
+
+#include "common.h"
+#include <math.h>
+
+
+static MVBucket create_initial_mv_bucket(int numrows, HeapTuple *rows,
+ int2vector *attrs,
+ VacAttrStats **stats);
+
+static MVBucket select_bucket_to_partition(int nbuckets, MVBucket * buckets);
+
+static MVBucket partition_bucket(MVBucket bucket, int2vector *attrs,
+ VacAttrStats **stats,
+ int *ndistvalues, Datum **distvalues);
+
+static MVBucket copy_mv_bucket(MVBucket bucket, uint32 ndimensions);
+
+static void update_bucket_ndistinct(MVBucket bucket, int2vector *attrs,
+ VacAttrStats ** stats);
+
+static void update_dimension_ndistinct(MVBucket bucket, int dimension,
+ int2vector *attrs,
+ VacAttrStats ** stats,
+ bool update_boundaries);
+
+static void create_null_buckets(MVHistogram histogram, int bucket_idx,
+ int2vector *attrs, VacAttrStats ** stats);
+
+static int bsearch_comparator(const void * a, const void * b);
+
+/*
+ * Each serialized bucket needs to store (in this order):
+ *
+ * - number of tuples (float)
+ * - min inclusive flags (ndim * sizeof(bool))
+ * - max inclusive flags (ndim * sizeof(bool))
+ * - null dimension flags (ndim * sizeof(bool))
+ * - min boundary indexes (2 * ndim * sizeof(uint16))
+ * - max boundary indexes (2 * ndim * sizeof(uint16))
+ *
+ * So in total:
+ *
+ * ndim * (4 * sizeof(uint16) + 3 * sizeof(bool)) +
+ * sizeof(float)
+ */
+#define BUCKET_SIZE(ndims) \
+ (ndims * (4 * sizeof(uint16) + 3 * sizeof(bool)) + sizeof(float))
+
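+/*
+ * For example, with ndims = 4 (and assuming 1-byte bool and 4-byte float)
+ * this is 4 * (4 * 2 + 3 * 1) + 4 = 48 bytes per serialized bucket.
+ */
+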
+/* pointers into a flat serialized bucket of BUCKET_SIZE(n) bytes */
+#define BUCKET_NTUPLES(b) ((float*)b)
+#define BUCKET_MIN_INCL(b,n) ((bool*)(b + sizeof(float)))
+#define BUCKET_MAX_INCL(b,n) (BUCKET_MIN_INCL(b,n) + n)
+#define BUCKET_NULLS_ONLY(b,n) (BUCKET_MAX_INCL(b,n) + n)
+#define BUCKET_MIN_INDEXES(b,n) ((uint16*)(BUCKET_NULLS_ONLY(b,n) + n))
+#define BUCKET_MAX_INDEXES(b,n) ((BUCKET_MIN_INDEXES(b,n) + n))
+
+/* can't split bucket with less than 10 rows */
+#define MIN_BUCKET_ROWS 10
+
+/*
+ * Data used while building the histogram.
+ */
+typedef struct HistogramBuildData {
+
+ float ndistinct; /* frequency of distinct values */
+
+ HeapTuple *rows; /* array of sample rows */
+ uint32 numrows; /* number of sample rows (array size) */
+
+ /*
+ * Number of distinct values in each dimension. This is used when
+ * building the histogram (and is not serialized/deserialized).
+ */
+ uint32 *ndistincts;
+
+} HistogramBuildData;
+
+typedef HistogramBuildData *HistogramBuild;
+
+/*
+ * Build a multivariate histogram. In short, it first creates a single
+ * bucket containing all the rows, and then repeatedly splits it, by
+ * first searching for the bucket / dimension most in need of a split.
+ *
+ * The current criterion is rather simple, chosen so that the algorithm
+ * produces buckets with about equal frequency and regular size.
+ *
+ * See the discussion at select_bucket_to_partition and partition_bucket
+ * for more details about the algorithm.
+ *
+ * The current algorithm works like this:
+ *
+ * build NULL-buckets (create_null_buckets)
+ *
+ * while [not reaching maximum number of buckets]
+ *
+ * choose bucket to partition (largest bucket)
+ * if no bucket to partition
+ * terminate the algorithm
+ *
+ * choose bucket dimension to partition (largest dimension)
+ * split the bucket into two buckets
+ */
+MVHistogram
+build_mv_histogram(int numrows, HeapTuple *rows, int2vector *attrs,
+ VacAttrStats **stats, int numrows_total)
+{
+ int i;
+ int numattrs = attrs->dim1;
+
+ int *ndistvalues;
+ Datum **distvalues;
+
+ MVHistogram histogram = (MVHistogram)palloc0(sizeof(MVHistogramData));
+
+ HeapTuple * rows_copy = (HeapTuple*)palloc0(numrows * sizeof(HeapTuple));
+ memcpy(rows_copy, rows, sizeof(HeapTuple) * numrows);
+
+ Assert((numattrs >= 2) && (numattrs <= MVSTATS_MAX_DIMENSIONS));
+
+ histogram->ndimensions = numattrs;
+
+ histogram->magic = MVSTAT_HIST_MAGIC;
+ histogram->type = MVSTAT_HIST_TYPE_BASIC;
+ histogram->nbuckets = 1;
+
+ /* pre-allocate the maximum number of buckets (easier than repalloc for short-lived objects) */
+ histogram->buckets
+ = (MVBucket*)palloc0(MVSTAT_HIST_MAX_BUCKETS * sizeof(MVBucket));
+
+ /* create the initial bucket, covering the whole sample set */
+ histogram->buckets[0]
+ = create_initial_mv_bucket(numrows, rows_copy, attrs, stats);
+
+ /*
+ * Collect info on distinct values in each dimension (used later
+ * to select dimension to partition).
+ */
+ ndistvalues = (int*)palloc0(sizeof(int) * numattrs);
+ distvalues = (Datum**)palloc0(sizeof(Datum*) * numattrs);
+
+ for (i = 0; i < numattrs; i++)
+ {
+ int j;
+ int nvals;
+ Datum *tmp;
+
+ SortSupportData ssup;
+ StdAnalyzeData *mystats = (StdAnalyzeData *) stats[i]->extra_data;
+
+ /* initialize sort support, etc. */
+ memset(&ssup, 0, sizeof(ssup));
+ ssup.ssup_cxt = CurrentMemoryContext;
+
+ /* We always use the default collation for statistics */
+ ssup.ssup_collation = DEFAULT_COLLATION_OID;
+ ssup.ssup_nulls_first = false;
+
+ PrepareSortSupportFromOrderingOp(mystats->ltopr, &ssup);
+
+ nvals = 0;
+ tmp = (Datum*)palloc0(sizeof(Datum) * numrows);
+
+ for (j = 0; j < numrows; j++)
+ {
+ bool isnull;
+
+ /* fetch the value of the attribute (NULLs are skipped below) */
+ Datum value = heap_getattr(rows[j], attrs->values[i],
+ stats[i]->tupDesc, &isnull);
+
+ if (isnull)
+ continue;
+
+ tmp[nvals++] = value;
+ }
+
+ /* do the sort and stuff only if there are non-NULL values */
+ if (nvals > 0)
+ {
+ /* sort the array of values */
+ qsort_arg((void *) tmp, nvals, sizeof(Datum),
+ compare_scalars_simple, (void *) &ssup);
+
+ /* count distinct values */
+ ndistvalues[i] = 1;
+ for (j = 1; j < nvals; j++)
+ if (compare_scalars_simple(&tmp[j], &tmp[j-1], &ssup) != 0)
+ ndistvalues[i] += 1;
+
+ /* allocate space for the distinct values (counted above) */
+ distvalues[i] = (Datum*)palloc0(sizeof(Datum) * ndistvalues[i]);
+
+ /* now collect distinct values into the array */
+ distvalues[i][0] = tmp[0];
+ ndistvalues[i] = 1;
+
+ for (j = 1; j < nvals; j++)
+ {
+ if (compare_scalars_simple(&tmp[j], &tmp[j-1], &ssup) != 0)
+ {
+ distvalues[i][ndistvalues[i]] = tmp[j];
+ ndistvalues[i] += 1;
+ }
+ }
+ }
+
+ pfree(tmp);
+ }
+
+ /*
+ * The initial bucket may contain NULL values, so we have to create
+ * buckets with NULL-only dimensions.
+ *
+ * FIXME We may need up to 2^ndims buckets - check that there are
+ * enough buckets (MVSTAT_HIST_MAX_BUCKETS >= 2^ndims).
+ */
+ create_null_buckets(histogram, 0, attrs, stats);
+
+ while (histogram->nbuckets < MVSTAT_HIST_MAX_BUCKETS)
+ {
+ MVBucket bucket = select_bucket_to_partition(histogram->nbuckets,
+ histogram->buckets);
+
+ /* no more buckets to partition */
+ if (bucket == NULL)
+ break;
+
+ histogram->buckets[histogram->nbuckets]
+ = partition_bucket(bucket, attrs, stats,
+ ndistvalues, distvalues);
+
+ histogram->nbuckets += 1;
+ }
+
+ /* finalize the frequencies etc. */
+ for (i = 0; i < histogram->nbuckets; i++)
+ {
+ HistogramBuild build_data
+ = ((HistogramBuild)histogram->buckets[i]->build_data);
+
+ /*
+ * The frequency has to be computed from the whole sample, in
+ * case some of the rows were used for MCV (and thus are missing
+ * from the histogram).
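+ *
+ * For example (just illustrating the formula below): if the whole
+ * sample has numrows_total = 30000 rows and this bucket holds 300
+ * of them, the bucket frequency is 300 / 30000 = 0.01.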
+ */
+ histogram->buckets[i]->ntuples
+ = (build_data->numrows * 1.0) / numrows_total;
+ }
+
+ return histogram;
+}
+
+/* fetch the histogram (as a bytea) from the pg_mv_statistic catalog */
+MVSerializedHistogram
+load_mv_histogram(Oid mvoid)
+{
+ bool isnull = false;
+ Datum histogram;
+
+#ifdef USE_ASSERT_CHECKING
+ Form_pg_mv_statistic mvstat;
+#endif
+
+ /* Fetch the pg_mv_statistic tuple for the given statistics OID. */
+ HeapTuple htup = SearchSysCache1(MVSTATOID, ObjectIdGetDatum(mvoid));
+
+ if (! HeapTupleIsValid(htup))
+ return NULL;
+
+#ifdef USE_ASSERT_CHECKING
+ mvstat = (Form_pg_mv_statistic) GETSTRUCT(htup);
+ Assert(mvstat->hist_enabled && mvstat->hist_built);
+#endif
+
+ histogram = SysCacheGetAttr(MVSTATOID, htup,
+ Anum_pg_mv_statistic_stahist, &isnull);
+
+ Assert(!isnull);
+
+ ReleaseSysCache(htup);
+
+ return deserialize_mv_histogram(DatumGetByteaP(histogram));
+}
+
+/* print some basic info about the histogram */
+Datum
+pg_mv_stats_histogram_info(PG_FUNCTION_ARGS)
+{
+ bytea *data = PG_GETARG_BYTEA_P(0);
+ char *result;
+
+ MVSerializedHistogram hist = deserialize_mv_histogram(data);
+
+ result = palloc0(128);
+ snprintf(result, 128, "nbuckets=%d", hist->nbuckets);
+
+ PG_RETURN_TEXT_P(cstring_to_text(result));
+}
+
+
+/* used to pass context into bsearch() */
+static SortSupport ssup_private = NULL;
+
+/*
+ * Serialize the MV histogram into a bytea value. The basic algorithm is quite
+ * simple, and mostly mimics the MCV serialization:
+ *
+ * (1) perform deduplication for each attribute (separately)
+ *
+ * (a) collect all (non-NULL) attribute values from all buckets
+ * (b) sort the data (using 'lt' from VacAttrStats)
+ * (c) remove duplicate values from the array
+ *
+ * (2) serialize the arrays into a bytea value
+ *
+ * (3) process all buckets
+ *
+ * (a) replace min/max values with indexes into the arrays
+ *
+ * Each attribute has to be processed separately, as we're mixing different
+ * datatypes, and we need to use the right operators to compare/sort them.
+ * We're also mixing pass-by-value and pass-by-ref types, and so on.
+ *
+ *
+ * FIXME This probably leaks memory, or at least uses it inefficiently
+ * (many small palloc() calls instead of a large one).
+ *
+ * TODO Consider packing boolean flags (NULL) for each item into 'char'
+ * or a longer type (instead of using an array of bool items).
+ */
+bytea *
+serialize_mv_histogram(MVHistogram histogram, int2vector *attrs,
+ VacAttrStats **stats)
+{
+ int i = 0, j = 0;
+ Size total_length = 0;
+
+ bytea *output = NULL;
+ char *data = NULL;
+
+ int nbuckets = histogram->nbuckets;
+ int ndims = histogram->ndimensions;
+
+ /* allocated for serialized bucket data */
+ int bucketsize = BUCKET_SIZE(ndims);
+ char *bucket = palloc0(bucketsize);
+
+ /* values per dimension (and number of non-NULL values) */
+ Datum **values = (Datum**)palloc0(sizeof(Datum*) * ndims);
+ int *counts = (int*)palloc0(sizeof(int) * ndims);
+
+ /* info about dimensions (for deserialize) */
+ DimensionInfo * info
+ = (DimensionInfo *)palloc0(sizeof(DimensionInfo)*ndims);
+
+ /* sort support data */
+ SortSupport ssup = (SortSupport)palloc0(sizeof(SortSupportData)*ndims);
+
+ /* collect and deduplicate values for each dimension separately */
+ for (i = 0; i < ndims; i++)
+ {
+ int count;
+ StdAnalyzeData *tmp = (StdAnalyzeData *)stats[i]->extra_data;
+
+ /* keep important info about the data type */
+ info[i].typlen = stats[i]->attrtype->typlen;
+ info[i].typbyval = stats[i]->attrtype->typbyval;
+
+ /*
+ * Allocate space for all min/max values, including NULLs
+ * (we won't use them, but we don't know how many there are),
+ * and then collect all non-NULL values.
+ */
+ values[i] = (Datum*)palloc0(sizeof(Datum) * nbuckets * 2);
+
+ for (j = 0; j < histogram->nbuckets; j++)
+ {
+ /* skip buckets where this dimension is NULL-only */
+ if (! histogram->buckets[j]->nullsonly[i])
+ {
+ values[i][counts[i]] = histogram->buckets[j]->min[i];
+ counts[i] += 1;
+
+ values[i][counts[i]] = histogram->buckets[j]->max[i];
+ counts[i] += 1;
+ }
+ }
+
+ /* there are just NULL values in this dimension */
+ if (counts[i] == 0)
+ continue;
+
+ /* sort and deduplicate */
+ ssup[i].ssup_cxt = CurrentMemoryContext;
+ ssup[i].ssup_collation = DEFAULT_COLLATION_OID;
+ ssup[i].ssup_nulls_first = false;
+
+ PrepareSortSupportFromOrderingOp(tmp->ltopr, &ssup[i]);
+
+ qsort_arg(values[i], counts[i], sizeof(Datum),
+ compare_scalars_simple, &ssup[i]);
+
+ /*
+ * Walk through the array and eliminate duplicate values, but
+ * keep the ordering (so that we can do bsearch later). We know
+ * there's at least 1 item, so we can skip the first element.
+ */
+ count = 1; /* number of deduplicated items */
+ for (j = 1; j < counts[i]; j++)
+ {
+ /* if it's different from the previous value, we need to keep it */
+ if (compare_datums_simple(values[i][j-1], values[i][j], &ssup[i]) != 0)
+ {
+ /* XXX: not needed if (count == j) */
+ values[i][count] = values[i][j];
+ count += 1;
+ }
+ }
+
+ /* make sure we fit into uint16 */
+ Assert(count <= UINT16_MAX);
+
+ /* keep info about the deduplicated count */
+ info[i].nvalues = count;
+
+ /* compute size of the serialized data */
+ if (info[i].typlen > 0)
+ /* byval or byref, but with fixed length (name, tid, ...) */
+ info[i].nbytes = info[i].nvalues * info[i].typlen;
+ else if (info[i].typlen == -1)
+ /* varlena, so just use VARSIZE_ANY */
+ for (j = 0; j < info[i].nvalues; j++)
+ info[i].nbytes += VARSIZE_ANY(values[i][j]);
+ else if (info[i].typlen == -2)
+ /* cstring, so simply strlen */
+ for (j = 0; j < info[i].nvalues; j++)
+ info[i].nbytes += strlen(DatumGetPointer(values[i][j]));
+ else
+ elog(ERROR, "unknown data type typbyval=%d typlen=%d",
+ info[i].typbyval, info[i].typlen);
+ }
+
+ /*
+ * Now we finally know how much space we'll need for the serialized
+ * histogram, as it contains these fields:
+ *
+ * - length (4B) for varlena
+ * - magic (4B)
+ * - type (4B)
+ * - ndimensions (4B)
+ * - nbuckets (4B)
+ * - info (ndim * sizeof(DimensionInfo))
+ * - arrays of values for each dimension
+ * - serialized buckets (nbuckets * bucketsize)
+ *
+ * So the 'header' size is 20B + ndim * sizeof(DimensionInfo) and
+ * then we'll place the data (and buckets).
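+ *
+ * For example (a rough sketch): with ndims = 2 and nbuckets = 100 this
+ * is 20B + 2 * sizeof(DimensionInfo) + 100 * BUCKET_SIZE(2) bytes,
+ * plus whatever the deduplicated boundary values take.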
+ */
+ total_length = (sizeof(int32) + offsetof(MVHistogramData, buckets)
+ + ndims * sizeof(DimensionInfo)
+ + nbuckets * bucketsize);
+
+ /* account for the deduplicated data */
+ for (i = 0; i < ndims; i++)
+ total_length += info[i].nbytes;
+
+ /* enforce an arbitrary limit of 10MB */
+ if (total_length > (10 * 1024 * 1024))
+ elog(ERROR, "serialized histogram exceeds 10MB (%ld > %d)",
+ total_length, (10 * 1024 * 1024));
+
+ /* allocate space for the serialized histogram, set the header */
+ output = (bytea*)palloc0(total_length);
+ SET_VARSIZE(output, total_length);
+
+ /* we'll use 'data' to keep track of the place to write data */
+ data = VARDATA(output);
+
+ memcpy(data, histogram, offsetof(MVHistogramData, buckets));
+ data += offsetof(MVHistogramData, buckets);
+
+ memcpy(data, info, sizeof(DimensionInfo) * ndims);
+ data += sizeof(DimensionInfo) * ndims;
+
+ /* value array for each dimension */
+ for (i = 0; i < ndims; i++)
+ {
+#ifdef USE_ASSERT_CHECKING
+ char *tmp = data;
+#endif
+ for (j = 0; j < info[i].nvalues; j++)
+ {
+ if (info[i].typlen > 0)
+ {
+ /* passed by value or by reference, but fixed length */
+ memcpy(data, &values[i][j], info[i].typlen);
+ data += info[i].typlen;
+ }
+ else if (info[i].typlen == -1)
+ {
+ /* varlena */
+ memcpy(data, DatumGetPointer(values[i][j]),
+ VARSIZE_ANY(values[i][j]));
+ data += VARSIZE_ANY(values[i][j]);
+ }
+ else if (info[i].typlen == -2)
+ {
+ /* cstring (don't forget the \0 terminator!) */
+ memcpy(data, DatumGetPointer(values[i][j]),
+ strlen(DatumGetPointer(values[i][j])) + 1);
+ data += strlen(DatumGetPointer(values[i][j])) + 1;
+ }
+ }
+ Assert((data - tmp) == info[i].nbytes);
+ }
+
+ /* and finally, the histogram buckets */
+ for (i = 0; i < nbuckets; i++)
+ {
+ /* don't write beyond the allocated space */
+ Assert(data <= (char*)output + total_length - bucketsize);
+
+ /* reset the values for each item */
+ memset(bucket, 0, bucketsize);
+
+ *BUCKET_NTUPLES(bucket) = histogram->buckets[i]->ntuples;
+
+ for (j = 0; j < ndims; j++)
+ {
+ /* do the lookup only for non-NULL values */
+ if (! histogram->buckets[i]->nullsonly[j])
+ {
+ uint16 idx;
+ Datum * v = NULL;
+ ssup_private = &ssup[j];
+
+ /* min boundary */
+ v = (Datum*)bsearch(&histogram->buckets[i]->min[j],
+ values[j], info[j].nvalues, sizeof(Datum),
+ bsearch_comparator);
+
+ if (v == NULL)
+ elog(ERROR, "value for dim %d not found in array", j);
+
+ /* compute index within the array */
+ idx = (v - values[j]);
+
+ Assert((idx >= 0) && (idx < info[j].nvalues));
+
+ BUCKET_MIN_INDEXES(bucket, ndims)[j] = idx;
+
+ /* max boundary */
+ v = (Datum*)bsearch(&histogram->buckets[i]->max[j],
+ values[j], info[j].nvalues, sizeof(Datum),
+ bsearch_comparator);
+
+ if (v == NULL)
+ elog(ERROR, "value for dim %d not found in array", j);
+
+ /* compute index within the array */
+ idx = (v - values[j]);
+
+ Assert((idx >= 0) && (idx < info[j].nvalues));
+
+ BUCKET_MAX_INDEXES(bucket, ndims)[j] = idx;
+ }
+ }
+
+ /* copy flags (nulls, min/max inclusive) */
+ memcpy(BUCKET_NULLS_ONLY(bucket, ndims),
+ histogram->buckets[i]->nullsonly, sizeof(bool) * ndims);
+
+ memcpy(BUCKET_MIN_INCL(bucket, ndims),
+ histogram->buckets[i]->min_inclusive, sizeof(bool) * ndims);
+
+ memcpy(BUCKET_MAX_INCL(bucket, ndims),
+ histogram->buckets[i]->max_inclusive, sizeof(bool) * ndims);
+
+ /* copy the item into the array */
+ memcpy(data, bucket, bucketsize);
+
+ data += bucketsize;
+ }
+
+ /* at this point we expect to match the total_length exactly */
+ Assert((data - (char*)output) == total_length);
+
+ /* FIXME free the values/counts arrays here */
+
+ return output;
+}
+
+/*
+ * Returns histogram in a partially-serialized form (keeps the boundary
+ * values deduplicated, so that it's possible to optimize the estimation
+ * part by caching function call results between buckets etc.).
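+ *
+ * A sketch of how callers may access the deduplicated boundary values
+ * (the min boundary of bucket 'b' in dimension 'd'):
+ *
+ * MVSerializedHistogram h = load_mv_histogram(mvoid);
+ * Datum lo = h->values[d][h->buckets[b]->min[d]];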
+ */
+MVSerializedHistogram
+deserialize_mv_histogram(bytea * data)
+{
+ int i = 0, j = 0;
+
+ Size expected_size;
+ char *tmp = NULL;
+
+ MVSerializedHistogram histogram;
+ DimensionInfo *info;
+
+ int nbuckets;
+ int ndims;
+ int bucketsize;
+
+ /* temporary deserialization buffer */
+ int bufflen;
+ char *buff;
+ char *ptr;
+
+ if (data == NULL)
+ return NULL;
+
+ if (VARSIZE_ANY_EXHDR(data) < offsetof(MVSerializedHistogramData,buckets))
+ elog(ERROR, "invalid histogram size %ld (expected at least %ld)",
+ VARSIZE_ANY_EXHDR(data), offsetof(MVSerializedHistogramData,buckets));
+
+ /* read the histogram header */
+ histogram
+ = (MVSerializedHistogram)palloc(sizeof(MVSerializedHistogramData));
+
+ /* initialize pointer to the data part (skip the varlena header) */
+ tmp = VARDATA(data);
+
+ /* get the header and perform basic sanity checks */
+ memcpy(histogram, tmp, offsetof(MVSerializedHistogramData, buckets));
+ tmp += offsetof(MVSerializedHistogramData, buckets);
+
+ if (histogram->magic != MVSTAT_HIST_MAGIC)
+ elog(ERROR, "invalid histogram magic %d (expected %dd)",
+ histogram->magic, MVSTAT_HIST_MAGIC);
+
+ if (histogram->type != MVSTAT_HIST_TYPE_BASIC)
+ elog(ERROR, "invalid histogram type %d (expected %dd)",
+ histogram->type, MVSTAT_HIST_TYPE_BASIC);
+
+ nbuckets = histogram->nbuckets;
+ ndims = histogram->ndimensions;
+ bucketsize = BUCKET_SIZE(ndims);
+
+ Assert((nbuckets > 0) && (nbuckets <= MVSTAT_HIST_MAX_BUCKETS));
+ Assert((ndims >= 2) && (ndims <= MVSTATS_MAX_DIMENSIONS));
+
+ /*
+ * Compute the expected size for these parameters. It's incomplete at
+ * this point, as we have yet to add the sizes of the value arrays
+ * (from the DimensionInfo records).
+ */
+ expected_size = offsetof(MVSerializedHistogramData,buckets) +
+ ndims * sizeof(DimensionInfo) +
+ (nbuckets * bucketsize);
+
+ /* check that we have at least the DimensionInfo records */
+ if (VARSIZE_ANY_EXHDR(data) < expected_size)
+ elog(ERROR, "invalid histogram size %ld (expected %ld)",
+ VARSIZE_ANY_EXHDR(data), expected_size);
+
+ info = (DimensionInfo*)(tmp);
+ tmp += ndims * sizeof(DimensionInfo);
+
+ /* account for the value arrays */
+ for (i = 0; i < ndims; i++)
+ expected_size += info[i].nbytes;
+
+ if (VARSIZE_ANY_EXHDR(data) != expected_size)
+ elog(ERROR, "invalid histogram size %ld (expected %ld)",
+ VARSIZE_ANY_EXHDR(data), expected_size);
+
+ /* looks OK - not corrupted or something */
+
+ /* now let's allocate a single buffer for all the values and counts */
+
+ bufflen = (sizeof(int) + sizeof(Datum*)) * ndims;
+ for (i = 0; i < ndims; i++)
+ {
+ /* no extra space needed for byval types with length matching Datum */
+ if (! (info[i].typbyval && (info[i].typlen == sizeof(Datum))))
+ bufflen += (sizeof(Datum) * info[i].nvalues);
+ }
+
+ /* also, include space for the result, tracking the buckets */
+ bufflen += nbuckets * (
+ sizeof(MVSerializedBucket) + /* bucket pointer */
+ sizeof(MVSerializedBucketData)); /* bucket data */
+
+ buff = palloc0(bufflen);
+ ptr = buff;
+
+ histogram->nvalues = (int*)ptr;
+ ptr += (sizeof(int) * ndims);
+
+ histogram->values = (Datum**)ptr;
+ ptr += (sizeof(Datum*) * ndims);
+
+ /*
+ * FIXME This uses pointers to the original data array (for types
+ * not passed by value), so if someone frees the memory,
+ * e.g. by doing something like this:
+ *
+ * bytea *data = ... fetch the data from catalog ...
+ * MVSerializedHistogram hist = deserialize_mv_histogram(data);
+ * pfree(data);
+ *
+ * then 'hist' references the freed memory. This needs to
+ * copy the pieces.
+ *
+ * TODO same as in MCV deserialization / consider moving to common.c
+ */
+ for (i = 0; i < ndims; i++)
+ {
+ histogram->nvalues[i] = info[i].nvalues;
+
+ if (info[i].typbyval && info[i].typlen == sizeof(Datum))
+ {
+ /* passed by value / Datum - simply reuse the array */
+ histogram->values[i] = (Datum*)tmp;
+ tmp += info[i].nbytes;
+ }
+ else
+ {
+ /* all other types need a chunk of the buffer for the Datum array */
+ histogram->values[i] = (Datum*)ptr;
+ ptr += (sizeof(Datum) * info[i].nvalues);
+
+ if (info[i].typbyval)
+ {
+ /* passed by value, but smaller than Datum */
+ for (j = 0; j < info[i].nvalues; j++)
+ {
+ /* copy the value into the Datum array */
+ memcpy(&histogram->values[i][j], tmp, info[i].typlen);
+ tmp += info[i].typlen;
+ }
+ }
+ else if (info[i].typlen > 0)
+ {
+ /* passed by reference, but fixed length (name, tid, ...) */
+ for (j = 0; j < info[i].nvalues; j++)
+ {
+ /* just point into the array */
+ histogram->values[i][j] = PointerGetDatum(tmp);
+ tmp += info[i].typlen;
+ }
+ }
+ else if (info[i].typlen == -1)
+ {
+ /* varlena */
+ for (j = 0; j < info[i].nvalues; j++)
+ {
+ /* just point into the array */
+ histogram->values[i][j] = PointerGetDatum(tmp);
+ tmp += VARSIZE_ANY(tmp);
+ }
+ }
+ else if (info[i].typlen == -2)
+ {
+ /* cstring */
+ for (j = 0; j < info[i].nvalues; j++)
+ {
+ /* just point into the array */
+ histogram->values[i][j] = PointerGetDatum(tmp);
+ tmp += (strlen(tmp) + 1); /* don't forget the \0 */
+ }
+ }
+ }
+ }
+
+ histogram->buckets = (MVSerializedBucket*)ptr;
+ ptr += (sizeof(MVSerializedBucket) * nbuckets);
+
+ for (i = 0; i < nbuckets; i++)
+ {
+ MVSerializedBucket bucket = (MVSerializedBucket)ptr;
+ ptr += sizeof(MVSerializedBucketData);
+
+ bucket->ntuples = *BUCKET_NTUPLES(tmp);
+ bucket->nullsonly = BUCKET_NULLS_ONLY(tmp, ndims);
+ bucket->min_inclusive = BUCKET_MIN_INCL(tmp, ndims);
+ bucket->max_inclusive = BUCKET_MAX_INCL(tmp, ndims);
+
+ bucket->min = BUCKET_MIN_INDEXES(tmp, ndims);
+ bucket->max = BUCKET_MAX_INDEXES(tmp, ndims);
+
+ histogram->buckets[i] = bucket;
+
+ Assert(tmp <= (char*)data + VARSIZE_ANY(data));
+
+ tmp += bucketsize;
+ }
+
+ /* at this point we expect to match the expected_size exactly */
+ Assert((tmp - VARDATA(data)) == expected_size);
+
+ /* we should exhaust the output buffer exactly */
+ Assert((ptr - buff) == bufflen);
+
+ return histogram;
+}
+
+/*
+ * Build the initial bucket, which will be then split into smaller ones.
+ */
+static MVBucket
+create_initial_mv_bucket(int numrows, HeapTuple *rows, int2vector *attrs,
+ VacAttrStats **stats)
+{
+ int i;
+ int numattrs = attrs->dim1;
+ HistogramBuild data = NULL;
+
+ /* TODO allocate bucket as a single piece, including all the fields. */
+ MVBucket bucket = (MVBucket)palloc0(sizeof(MVBucketData));
+
+ Assert(numrows > 0);
+ Assert(rows != NULL);
+ Assert((numattrs >= 2) && (numattrs <= MVSTATS_MAX_DIMENSIONS));
+
+ /* allocate the per-dimension arrays */
+
+ /* flags for null-only dimensions */
+ bucket->nullsonly = (bool*)palloc0(numattrs * sizeof(bool));
+
+ /* inclusiveness boundaries - lower/upper bounds */
+ bucket->min_inclusive = (bool*)palloc0(numattrs * sizeof(bool));
+ bucket->max_inclusive = (bool*)palloc0(numattrs * sizeof(bool));
+
+ /* lower/upper boundaries */
+ bucket->min = (Datum*)palloc0(numattrs * sizeof(Datum));
+ bucket->max = (Datum*)palloc0(numattrs * sizeof(Datum));
+
+ /* build-data */
+ data = (HistogramBuild)palloc0(sizeof(HistogramBuildData));
+
+ /* number of distinct values (per dimension) */
+ data->ndistincts = (uint32*)palloc0(numattrs * sizeof(uint32));
+
+ /* all the sample rows fall into the initial bucket */
+ data->numrows = numrows;
+ data->rows = rows;
+
+ bucket->build_data = data;
+
+ /*
+ * Update the number of distinct value combinations in the bucket
+ * (which we use when selecting the bucket to partition), and then the
+ * number of distinct values for each dimension (which we use when choosing
+ * which dimension to split).
+ */
+ update_bucket_ndistinct(bucket, attrs, stats);
+
+ /* Update ndistinct (and also set min/max) for all dimensions. */
+ for (i = 0; i < numattrs; i++)
+ update_dimension_ndistinct(bucket, i, attrs, stats, true);
+
+ return bucket;
+}
+
+/*
+ * Choose the bucket to partition next.
+ *
+ * The current criterion is rather simple, chosen so that the algorithm
+ * produces buckets with about equal frequency and regular size. We
+ * select the bucket with the highest number of distinct values, and
+ * then split it by the longest dimension.
+ *
+ * The distinct values are uniformly mapped to [0,1] interval, and this
+ * is used to compute length of the value range.
+ *
+ * NOTE: This is not the same array used for deduplication, as this
+ * contains values for all the tuples from the sample, not just
+ * the boundary values.
+ *
+ * Returns either pointer to the bucket selected to be partitioned,
+ * or NULL if there are no buckets that may be split (i.e. all buckets
+ * contain a single distinct value).
+ *
+ * TODO Consider other partitioning criteria (v-optimal, maxdiff etc.).
+ * For example use the "bucket volume" (product of dimension
+ * lengths) to select the bucket.
+ *
+ * We need buckets containing about the same number of tuples (so
+ * about the same frequency), as that limits the error when we
+ * match the bucket partially (in that case use 1/2 the bucket).
+ *
+ * We also need buckets with "regular" size, i.e. not "narrow" in
+ * some dimensions and "wide" in the others, because that makes
+ * partial matches more likely and increases the estimation error,
+ * especially when the clauses match many buckets partially. This
+ * is especially serious for OR-clauses, because in that case any
+ * of them may add the bucket as a (partial) match. With AND-clauses
+ * all the clauses have to match the bucket, which makes this issue
+ * somewhat less pressing.
+ *
+ * For example this table:
+ *
+ * CREATE TABLE t AS SELECT i AS a, i AS b
+ * FROM generate_series(1,1000000) s(i);
+ * ALTER TABLE t ADD STATISTICS (histogram) ON (a,b);
+ * ANALYZE t;
+ *
+ * It's a very specific (and perhaps artificial) example, because
+ * every bucket always has exactly the same number of distinct
+ * values in all dimensions, which makes the partitioning tricky.
+ *
+ * Then:
+ *
+ * SELECT * FROM t WHERE a < 10 AND b < 10;
+ *
+ * is estimated to return ~120 rows, while in reality it returns 9.
+ *
+ * QUERY PLAN
+ * ----------------------------------------------------------------
+ * Seq Scan on t (cost=0.00..19425.00 rows=117 width=8)
+ * (actual time=0.185..270.774 rows=9 loops=1)
+ * Filter: ((a < 10) AND (b < 10))
+ * Rows Removed by Filter: 999991
+ *
+ * while the query using OR clauses is estimated like this:
+ *
+ * QUERY PLAN
+ * ----------------------------------------------------------------
+ * Seq Scan on t (cost=0.00..19425.00 rows=8100 width=8)
+ * (actual time=0.118..189.919 rows=9 loops=1)
+ * Filter: ((a < 10) OR (b < 10))
+ * Rows Removed by Filter: 999991
+ *
+ * which is clearly much worse. This happens because the histogram
+ * contains buckets like this:
+ *
+ * bucket 592 [3 30310] [30134 30593] => [0.000233]
+ *
+ * i.e. the length of "a" dimension is (30310-3)=30307, while the
+ * length of "b" is (30593-30134)=459. So the "b" dimension is much
+ * narrower than "a". Of course, there are buckets where "b" is the
+ * wider dimension.
+ *
+ * This is partially mitigated by selecting the "longest" dimension
+ * in partition_bucket() but that only happens after we already
+ * selected the bucket. So if we never select the bucket, we can't
+ * really fix it there.
+ *
+ * The other reason why this particular example behaves so poorly
+ * is due to the way we split the partition in partition_bucket().
+ * Currently we attempt to divide the bucket into two parts with
+ * the same number of sampled tuples (frequency), but that does not
+ * work well when all the tuples are squashed on one end of the
+ * bucket (e.g. exactly at the diagonal, as a=b). In that case we
+ * split the bucket into a tiny bucket on the diagonal, and a huge
+ * remaining part of the bucket, which is still going to be narrow
+ * and we're unlikely to fix that.
+ *
+ * So perhaps we need two partitioning strategies - one aiming to
+ * split buckets with high frequency (number of sampled rows), the
+ * other aiming to split "large" buckets. And alternating between
+ * them, somehow.
+ *
+ * TODO Allowing the bucket to degenerate to a single combination of
+ * values makes it a rather strange MCV list. Maybe we should use
+ * a higher lower boundary, or make the selection criteria more
+ * complex (e.g. consider the number of rows in the bucket, etc.).
+ *
+ * That however is different from buckets 'degenerated' only for
+ * some dimensions (e.g. half of them), which is perfectly
+ * appropriate for statistics on a combination of low and high
+ * cardinality columns.
+ *
+ * TODO Consider using similar lower boundary for row count as for simple
+ * histograms, i.e. 300 tuples per bucket.
+ */
+static MVBucket
+select_bucket_to_partition(int nbuckets, MVBucket * buckets)
+{
+ int i;
+ int numrows = 0;
+ MVBucket bucket = NULL;
+
+ for (i = 0; i < nbuckets; i++)
+ {
+ HistogramBuild data = (HistogramBuild)buckets[i]->build_data;
+ /* if the bucket can be split and has more rows than the best so far, use it */
+ if ((data->ndistinct > 2) &&
+ (data->numrows > numrows) &&
+ (data->numrows >= MIN_BUCKET_ROWS))
+ {
+ bucket = buckets[i];
+ numrows = data->numrows;
+ }
+ }
+
+ /* may be NULL if there are no buckets eligible for splitting */
+ return bucket;
+}
+
+/*
+ * A simple bucket partitioning implementation - we choose the longest
+ * bucket dimension, measured using the array of distinct values built
+ * at the very beginning of the build.
+ *
+ * We map all the distinct values to a [0,1] interval, uniformly
+ * distributed, and then use this to measure length. It's essentially
+ * a number of distinct values within the range, normalized to [0,1].
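+ *
+ * For example, if a dimension has 1000 distinct values in the sample
+ * and the bucket boundaries sit at positions 200 and 700 of the
+ * sorted distinct array, the normalized length of the bucket in this
+ * dimension is (700 - 200) / 1000 = 0.5.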
+ *
+ * Then we choose a 'middle' value splitting the bucket into two parts
+ * with roughly the same frequency.
+ *
+ * This splits the bucket by tweaking the existing one, and returning
+ * the new bucket (essentially shrinking the existing one in-place and
+ * returning the other "half" as a new bucket). The caller is responsible
+ * for adding the new bucket into the list of buckets.
+ *
+ * There are multiple histogram options, centered around the partitioning
+ * criteria, specifying both how to choose a bucket and the dimension
+ * most in need of a split. For a nice summary and general overview, see
+ * "rK-Hist : an R-Tree based histogram for multi-dimensional selectivity
+ * estimation" thesis by J. A. Lopez, Concordia University, p.34-37 (and
+ * possibly p. 32-34 for explanation of the terms).
+ *
+ * TODO It requires care to prevent splitting only one dimension and not
+ * splitting another one at all (which might happen easily in case
+ * of strongly dependent columns - e.g. y=x). The current algorithm
+ * minimizes this, but may still happen for perfectly dependent
+ * examples (when all the dimensions have equal length, the first
+ * one will be selected).
+ *
+ * TODO Should probably consider statistics target for the columns (e.g.
+ * to split dimensions with higher statistics target more frequently).
+ */
+static MVBucket
+partition_bucket(MVBucket bucket, int2vector *attrs,
+ VacAttrStats **stats,
+ int *ndistvalues, Datum **distvalues)
+{
+ int i;
+ int dimension;
+ int numattrs = attrs->dim1;
+
+ Datum split_value;
+ MVBucket new_bucket;
+ HistogramBuild new_data;
+
+ /* needed for sort, when looking for the split value */
+ bool isNull;
+ int nvalues = 0;
+ HistogramBuild data = (HistogramBuild)bucket->build_data;
+ StdAnalyzeData * mystats = NULL;
+ ScalarItem * values = (ScalarItem*)palloc0(data->numrows * sizeof(ScalarItem));
+ SortSupportData ssup;
+
+ /* looking for the split value */
+ int nrows = 1; /* number of rows below current value */
+ double delta;
+
+ /* needed when splitting the values */
+ HeapTuple * oldrows = data->rows;
+ int oldnrows = data->numrows;
+
+ /*
+ * We can't split buckets with a single distinct value (this also
+ * disqualifies NULL-only dimensions). Also, there has to be multiple
+ * sample rows (otherwise, how could there be more distinct values).
+ */
+ Assert(data->ndistinct > 1);
+ Assert(data->numrows > 1);
+ Assert((numattrs >= 2) && (numattrs <= MVSTATS_MAX_DIMENSIONS));
+
+ /*
+ * Look for the next dimension to split.
+ */
+ delta = 0.0;
+ dimension = -1;
+
+ for (i = 0; i < numattrs; i++)
+ {
+ Datum *a, *b;
+
+ mystats = (StdAnalyzeData *) stats[i]->extra_data;
+
+ /* initialize sort support, etc. */
+ memset(&ssup, 0, sizeof(ssup));
+ ssup.ssup_cxt = CurrentMemoryContext;
+
+ /* We always use the default collation for statistics */
+ ssup.ssup_collation = DEFAULT_COLLATION_OID;
+ ssup.ssup_nulls_first = false;
+
+ PrepareSortSupportFromOrderingOp(mystats->ltopr, &ssup);
+
+ /* can't split NULL-only dimension */
+ if (bucket->nullsonly[i])
+ continue;
+
+ /* can't split dimension with a single ndistinct value */
+ if (data->ndistincts[i] <= 1)
+ continue;
+
+ /* sort support for the bsearch_comparator */
+ ssup_private = &ssup;
+
+ /* search for min boundary in the distinct list */
+ a = (Datum*)bsearch(&bucket->min[i],
+ distvalues[i], ndistvalues[i],
+ sizeof(Datum), bsearch_comparator);
+
+ b = (Datum*)bsearch(&bucket->max[i],
+ distvalues[i], ndistvalues[i],
+ sizeof(Datum), bsearch_comparator);
+
+ /* if this dimension is 'longer', use it for the split */
+ if (((b-a)*1.0 / ndistvalues[i]) > delta)
+ {
+ delta = ((b-a)*1.0 / ndistvalues[i]);
+ dimension = i;
+ }
+ }
+
+ /*
+ * If we haven't found a dimension here, we've done something
+ * wrong in select_bucket_to_partition.
+ */
+ Assert(dimension != -1);
+
+ /*
+ * Walk through the selected dimension, collect and sort the values
+ * and then choose the value to use as the new boundary.
+ */
+ mystats = (StdAnalyzeData *) stats[dimension]->extra_data;
+
+ /* initialize sort support, etc. */
+ memset(&ssup, 0, sizeof(ssup));
+ ssup.ssup_cxt = CurrentMemoryContext;
+
+ /* We always use the default collation for statistics */
+ ssup.ssup_collation = DEFAULT_COLLATION_OID;
+ ssup.ssup_nulls_first = false;
+
+ PrepareSortSupportFromOrderingOp(mystats->ltopr, &ssup);
+
+ for (i = 0; i < data->numrows; i++)
+ {
+ /* remember the index of the sample row, to make the partitioning simpler */
+ values[nvalues].value = heap_getattr(data->rows[i], attrs->values[dimension],
+ stats[dimension]->tupDesc, &isNull);
+ values[nvalues].tupno = i;
+
+ /* no NULL values allowed here (we don't do splits by null-only dimensions) */
+ Assert(!isNull);
+
+ nvalues++;
+ }
+
+ /* sort the array of values */
+ qsort_arg((void *) values, nvalues, sizeof(ScalarItem),
+ compare_scalars_partition, (void *) &ssup);
+
+ /*
+ * We know there are data->ndistincts[dimension] distinct values
+ * in this dimension, and we want to split this into half, so walk
+ * through the array and stop once we see (ndistinct/2) values.
+ *
+ * We always choose the "next" value, i.e. (n/2+1)-th distinct value,
+ * and use it as an exclusive upper boundary (and inclusive lower
+ * boundary).
+ *
+ * TODO Maybe we should use "average" of the two middle distinct
+ * values (at least for even distinct counts), but that would
+ * require being able to do an average (which does not work
+ * for non-arithmetic types).
+ *
+ * TODO Another option is to look for a split that'd give about
+ * 50% tuples (not distinct values) in each partition. That
+ * might work better when there are a few very frequent
+ * values, and many rare ones.
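+ *
+ * A small example of the search below: for sorted values
+ * {1, 1, 1, 2, 2, 3} (numrows = 6), the candidate split points are
+ * i = 3 (value 2) and i = 5 (value 3); i = 3 is closest to
+ * numrows/2 = 3, so we pick split_value = 2 and keep nrows = 3
+ * rows in the current bucket.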
+ */
+ delta = fabs(data->numrows);
+ split_value = values[0].value;
+
+ for (i = 1; i < data->numrows; i++)
+ {
+ if (values[i].value != values[i-1].value)
+ {
+ /* are we closer to splitting the bucket in half? */
+ if (fabs(i - data->numrows/2.0) < delta)
+ {
+ /* let's assume we'll use this value for the split */
+ split_value = values[i].value;
+ delta = fabs(i - data->numrows/2.0);
+ nrows = i;
+ }
+ }
+ }
+
+ Assert(nrows > 0);
+ Assert(nrows < data->numrows);
+
+ /* create the new bucket as an (incomplete) copy of the one being partitioned */
+ new_bucket = copy_mv_bucket(bucket, numattrs);
+ new_data = (HistogramBuild)new_bucket->build_data;
+
+ /*
+ * Do the actual split of the chosen dimension, using the split value
+ * as the (exclusive) upper bound for the existing bucket and as the
+ * (inclusive) lower bound for the new one. The upper bound of the new
+ * bucket and its inclusive flag are inherited from the bucket being
+ * split.
+ */
+ bucket->max[dimension] = split_value;
+ new_bucket->min[dimension] = split_value;
+
+ bucket->max_inclusive[dimension] = false;
+ new_bucket->min_inclusive[dimension] = true;
+
+ /*
+ * Redistribute the sample tuples using the 'ScalarItem->tupno'
+ * index. We know 'nrows' rows should remain in the original
+ * bucket and the rest goes to the new one.
+ */
+
+ data->rows = (HeapTuple*)palloc0(nrows * sizeof(HeapTuple));
+ new_data->rows = (HeapTuple*)palloc0((oldnrows - nrows) * sizeof(HeapTuple));
+
+ data->numrows = nrows;
+ new_data->numrows = (oldnrows - nrows);
+
+ /*
+ * The first nrows should go to the first bucket, the rest should
+ * go to the new one. Use the tupno field to get the actual HeapTuple
+ * row from the original array of sample rows.
+ */
+ for (i = 0; i < nrows; i++)
+ memcpy(&data->rows[i], &oldrows[values[i].tupno], sizeof(HeapTuple));
+
+ for (i = nrows; i < oldnrows; i++)
+ memcpy(&new_data->rows[i-nrows], &oldrows[values[i].tupno], sizeof(HeapTuple));
+
+ /* update ndistinct values for the buckets (total and per dimension) */
+ update_bucket_ndistinct(bucket, attrs, stats);
+ update_bucket_ndistinct(new_bucket, attrs, stats);
+
+ /*
+ * TODO We don't need to do this for the dimension we used for split,
+ * because we know how many distinct values went to each partition.
+ */
+ for (i = 0; i < numattrs; i++)
+ {
+ update_dimension_ndistinct(bucket, i, attrs, stats, false);
+ update_dimension_ndistinct(new_bucket, i, attrs, stats, false);
+ }
+
+ pfree(oldrows);
+ pfree(values);
+
+ return new_bucket;
+}
+
+/*
+ * Copy a histogram bucket. The copy does not include the build-time
+ * data, i.e. sampled rows etc.
+ */
+static MVBucket
+copy_mv_bucket(MVBucket bucket, uint32 ndimensions)
+{
+ /* TODO allocate as a single piece (including all the fields) */
+ MVBucket new_bucket = (MVBucket)palloc0(sizeof(MVBucketData));
+ HistogramBuild data = (HistogramBuild)palloc0(sizeof(HistogramBuildData));
+
+ /*
+ * Copy only the attributes that will stay the same after the split;
+ * the rest will be recomputed afterwards.
+ */
+
+ /* allocate the per-dimension arrays */
+ new_bucket->nullsonly = (bool*)palloc0(ndimensions * sizeof(bool));
+
+ /* inclusiveness boundaries - lower/upper bounds */
+ new_bucket->min_inclusive = (bool*)palloc0(ndimensions * sizeof(bool));
+ new_bucket->max_inclusive = (bool*)palloc0(ndimensions * sizeof(bool));
+
+ /* lower/upper boundaries */
+ new_bucket->min = (Datum*)palloc0(ndimensions * sizeof(Datum));
+ new_bucket->max = (Datum*)palloc0(ndimensions * sizeof(Datum));
+
+ /* copy data */
+ memcpy(new_bucket->nullsonly, bucket->nullsonly, ndimensions * sizeof(bool));
+
+ memcpy(new_bucket->min_inclusive, bucket->min_inclusive, ndimensions*sizeof(bool));
+ memcpy(new_bucket->min, bucket->min, ndimensions*sizeof(Datum));
+
+ memcpy(new_bucket->max_inclusive, bucket->max_inclusive, ndimensions*sizeof(bool));
+ memcpy(new_bucket->max, bucket->max, ndimensions*sizeof(Datum));
+
+ /* allocate and copy the interesting part of the build data */
+ data->ndistincts = (uint32*)palloc0(ndimensions * sizeof(uint32));
+
+ new_bucket->build_data = data;
+
+ return new_bucket;
+}
+
+/*
+ * Counts the number of distinct value combinations in the bucket. The
+ * values are copied into an array of SortItems, sorted by all the
+ * dimensions (using multi_sort_compare), and then neighboring items
+ * are compared to count the distinct combinations.
+ *
+ * TODO This might evaluate and store the distinct counts for all
+ * possible attribute combinations. The assumption is this might be
+ * useful for estimating things like GROUP BY cardinalities (e.g.
+ * in cases when some buckets contain a lot of low-frequency
+ * combinations, and other buckets contain few high-frequency ones).
+ *
+ * But it's unclear whether it's worth the price. Computing this
+ * is actually quite cheap, because it may be evaluated at the very
+ * end, when the buckets are rather small (so sorting it in 2^N ways
+ * is not a big deal). Assuming the partitioning algorithm does not
+ * use these values to do the decisions, of course (the current
+ * algorithm does not).
+ *
+ * The overhead with storing, fetching and parsing the data is more
+ * concerning - adding 2^N values per bucket (even if it's just
+ * a 1B or 2B value) would significantly bloat the histogram, and
+ * thus the impact on the optimizer, which is not desirable.
+ *
+ * TODO This only updates the ndistinct for the sample (or bucket), but
+ * we eventually need an estimate of the total number of distinct
+ * values in the dataset. It's possible to either use the current
+ * 1D approach (i.e., if it's more than 10% of the sample, assume
+ * it's proportional to the number of rows). Or it's possible to
+ * implement the estimator suggested in the article, supposedly
+ * giving 'optimal' estimates (w.r.t. probability of error).
+ */
+static void
+update_bucket_ndistinct(MVBucket bucket, int2vector *attrs, VacAttrStats ** stats)
+{
+ int i, j;
+ int numattrs = attrs->dim1;
+
+ HistogramBuild data = (HistogramBuild)bucket->build_data;
+ int numrows = data->numrows;
+
+ MultiSortSupport mss = multi_sort_init(numattrs);
+
+ /*
+ * XXX We could collect this while walking through the attributes
+ * elsewhere (as it is, we end up calling heap_getattr twice).
+ */
+ SortItem *items = (SortItem*)palloc0(numrows * sizeof(SortItem));
+ Datum *values = (Datum*)palloc0(numrows * sizeof(Datum) * numattrs);
+ bool *isnull = (bool*)palloc0(numrows * sizeof(bool) * numattrs);
+
+ for (i = 0; i < numrows; i++)
+ {
+ items[i].values = &values[i * numattrs];
+ items[i].isnull = &isnull[i * numattrs];
+ }
+
+ /* prepare the sort functions for all the dimensions */
+ for (i = 0; i < numattrs; i++)
+ multi_sort_add_dimension(mss, i, i, stats);
+
+ /* collect the values */
+ for (i = 0; i < numrows; i++)
+ for (j = 0; j < numattrs; j++)
+ items[i].values[j]
+ = heap_getattr(data->rows[i], attrs->values[j],
+ stats[j]->tupDesc, &items[i].isnull[j]);
+
+ qsort_arg((void *) items, numrows, sizeof(SortItem),
+ multi_sort_compare, mss);
+
+ data->ndistinct = 1;
+
+ for (i = 1; i < numrows; i++)
+ if (multi_sort_compare(&items[i], &items[i-1], mss) != 0)
+ data->ndistinct += 1;
+
+ pfree(items);
+ pfree(values);
+ pfree(isnull);
+}
+
+/*
+ * Count distinct values per bucket dimension.
+ */
+static void
+update_dimension_ndistinct(MVBucket bucket, int dimension, int2vector *attrs,
+ VacAttrStats ** stats, bool update_boundaries)
+{
+ int j;
+ int nvalues = 0;
+ bool isNull;
+ HistogramBuild data = (HistogramBuild)bucket->build_data;
+ Datum * values = (Datum*)palloc0(data->numrows * sizeof(Datum));
+ SortSupportData ssup;
+
+ StdAnalyzeData * mystats = (StdAnalyzeData *) stats[dimension]->extra_data;
+
+ /* we may already know this is a NULL-only dimension */
+ if (bucket->nullsonly[dimension])
+ data->ndistincts[dimension] = 1;
+
+ memset(&ssup, 0, sizeof(ssup));
+ ssup.ssup_cxt = CurrentMemoryContext;
+
+ /* We always use the default collation for statistics */
+ ssup.ssup_collation = DEFAULT_COLLATION_OID;
+ ssup.ssup_nulls_first = false;
+
+ PrepareSortSupportFromOrderingOp(mystats->ltopr, &ssup);
+
+ for (j = 0; j < data->numrows; j++)
+ {
+ values[nvalues] = heap_getattr(data->rows[j], attrs->values[dimension],
+ stats[dimension]->tupDesc, &isNull);
+
+ /* ignore NULL values */
+ if (! isNull)
+ nvalues++;
+ }
+
+ /* there's always at least 1 distinct value (may be NULL) */
+ data->ndistincts[dimension] = 1;
+
+ /*
+ * If there are only NULL values in the column, mark the dimension
+ * as NULL-only and bail out.
+ */
+ if (nvalues == 0)
+ {
+ pfree(values);
+ bucket->nullsonly[dimension] = true;
+ return;
+ }
+
+ /* sort the array (of pass-by-value datums) */
+ qsort_arg((void *) values, nvalues, sizeof(Datum),
+ compare_scalars_simple, (void *) &ssup);
+
+ /*
+ * Update min/max boundaries to the smallest bounding box. Generally, this
+ * needs to be done only when constructing the initial bucket.
+ */
+ if (update_boundaries)
+ {
+ /* store the min/max values */
+ bucket->min[dimension] = values[0];
+ bucket->min_inclusive[dimension] = true;
+
+ bucket->max[dimension] = values[nvalues-1];
+ bucket->max_inclusive[dimension] = true;
+ }
+
+ /*
+ * Walk through the array and count distinct values by comparing
+ * succeeding values.
+ *
+ * FIXME This only works for pass-by-value types (i.e. not VARCHARs
+ * etc.). Although thanks to the deduplication it might work
+ * even for those types (equal values will get the same item
+ * in the deduplicated array).
+ */
+ for (j = 1; j < nvalues; j++)
+ if (values[j] != values[j-1])
+ data->ndistincts[dimension] += 1;
+
+ pfree(values);
+}
+
+/*
+ * A properly built histogram must not contain buckets mixing NULL and
+ * non-NULL values in a single dimension. Each dimension may either be
+ * marked as 'nulls only', and thus containing only NULL values, or
+ * it must not contain any NULL values.
+ *
+ * Therefore, if the sample contains NULL values in any of the columns,
+ * it's necessary to build those NULL-buckets. This is done in an
+ * iterative way using this algorithm, operating on a single bucket:
+ *
+ * (1) Check that all dimensions are well-formed (not mixing NULL
+ * and non-NULL values).
+ *
+ * (2) If all dimensions are well-formed, terminate.
+ *
+ * (3) If the dimension contains only NULL values, but is not
+ * marked as NULL-only, mark it as NULL-only and run the
+ * algorithm again (on this bucket).
+ *
+ * (4) If the dimension mixes NULL and non-NULL values, split the
+ * bucket into two parts - one with NULL values, one with
+ * non-NULL values (replacing the current one). Then run
+ * the algorithm on both buckets.
+ *
+ * This is executed in a recursive manner, but the number of executions
+ * should be quite low - limited by the number of NULL-buckets. Also,
+ * in each branch the number of nested calls is limited by the number
+ * of dimensions (attributes) of the histogram.
+ *
+ * At the end, there should be buckets with no mixed dimensions. The
+ * number of buckets produced by this algorithm is rather limited - with
+ * N dimensions, there may be only 2^N such buckets (each dimension may
+ * be either NULL or non-NULL). So with 8 dimensions (current value of
+ * MVSTATS_MAX_DIMENSIONS) there may be only 256 such buckets.
+ *
+ * After this, a 'regular' bucket-split algorithm shall run, further
+ * optimizing the histogram.
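+ *
+ * For example, with a 2-D histogram where both columns contain NULL
+ * values, the initial bucket may be split into up to four buckets:
+ * (NULL, NULL), (NULL, non-NULL), (non-NULL, NULL) and a bucket with
+ * no NULLs in either dimension.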
+ */
+static void
+create_null_buckets(MVHistogram histogram, int bucket_idx,
+ int2vector *attrs, VacAttrStats ** stats)
+{
+ int i, j;
+ int null_dim = -1;
+ int null_count = 0;
+ bool null_found = false;
+ MVBucket bucket, null_bucket;
+ int null_idx, curr_idx;
+ HistogramBuild data, null_data;
+
+ /* remember original values from the bucket */
+ int numrows;
+ HeapTuple *oldrows = NULL;
+
+ Assert(bucket_idx < histogram->nbuckets);
+ Assert(histogram->ndimensions == attrs->dim1);
+
+ bucket = histogram->buckets[bucket_idx];
+ data = (HistogramBuild)bucket->build_data;
+
+ numrows = data->numrows;
+ oldrows = data->rows;
+
+ /*
+ * Walk through all rows / dimensions, and stop once we find NULL
+ * in a dimension not yet marked as NULL-only.
+ */
+ for (i = 0; i < data->numrows; i++)
+ {
+ /*
+ * FIXME We don't need to start from the first attribute
+ * here - we can start from the last known dimension.
+ */
+ for (j = 0; j < histogram->ndimensions; j++)
+ {
+ /* Is this a NULL-only dimension? If yes, skip. */
+ if (bucket->nullsonly[j])
+ continue;
+
+ /* found a NULL in that dimension? */
+ if (heap_attisnull(data->rows[i], attrs->values[j]))
+ {
+ null_found = true;
+ null_dim = j;
+ break;
+ }
+ }
+
+ /* terminate if we found an attribute with NULL values */
+ if (null_found)
+ break;
+ }
+
+ /* no regular dimension contains NULL values => we're done */
+ if (! null_found)
+ return;
+
+ /* walk through the rows again, count NULL values in 'null_dim' */
+ for (i = 0; i < data->numrows; i++)
+ {
+ if (heap_attisnull(data->rows[i], attrs->values[null_dim]))
+ null_count += 1;
+ }
+
+ Assert(null_count <= data->numrows);
+
+ /*
+ * If (null_count == numrows) the dimension already is NULL-only,
+ * but is not yet marked like that. It's enough to mark it and
+ * repeat the process recursively (until we run out of dimensions).
+ */
+ if (null_count == data->numrows)
+ {
+ bucket->nullsonly[null_dim] = true;
+ create_null_buckets(histogram, bucket_idx, attrs, stats);
+ return;
+ }
+
+ /*
+ * We have to split the bucket into two - one with NULL values in
+ * the dimension, one with non-NULL values. We don't need to sort
+ * the data or anything, but otherwise it's similar to what's done
+ * in partition_bucket().
+ */
+
+ /* create bucket with NULL-only dimension 'dim' */
+ null_bucket = copy_mv_bucket(bucket, histogram->ndimensions);
+ null_data = (HistogramBuild)null_bucket->build_data;
+
+ /* remember the current array info */
+ oldrows = data->rows;
+ numrows = data->numrows;
+
+ /* we'll keep non-NULL values in the current bucket */
+ data->numrows = (numrows - null_count);
+ data->rows
+ = (HeapTuple*)palloc0(data->numrows * sizeof(HeapTuple));
+
+ /* and the NULL values will go to the new one */
+ null_data->numrows = null_count;
+ null_data->rows
+ = (HeapTuple*)palloc0(null_data->numrows * sizeof(HeapTuple));
+
+ /* mark the dimension as NULL-only (in the new bucket) */
+ null_bucket->nullsonly[null_dim] = true;
+
+ /* walk through the sample rows and distribute them accordingly */
+ null_idx = 0;
+ curr_idx = 0;
+ for (i = 0; i < numrows; i++)
+ {
+ if (heap_attisnull(oldrows[i], attrs->values[null_dim]))
+ /* NULL => copy to the new bucket */
+ memcpy(&null_data->rows[null_idx++], &oldrows[i],
+ sizeof(HeapTuple));
+ else
+ memcpy(&data->rows[curr_idx++], &oldrows[i],
+ sizeof(HeapTuple));
+ }
+
+ /* update ndistinct values for the buckets (total and per dimension) */
+ update_bucket_ndistinct(bucket, attrs, stats);
+ update_bucket_ndistinct(null_bucket, attrs, stats);
+
+ /*
+ * TODO We don't need to do this for the dimension we used for split,
+ * because we know how many distinct values went to each
+ * bucket (NULL is not a value, so 0, and the other bucket got
+ * all the ndistinct values).
+ */
+ for (i = 0; i < histogram->ndimensions; i++)
+ {
+ update_dimension_ndistinct(bucket, i, attrs, stats, false);
+ update_dimension_ndistinct(null_bucket, i, attrs, stats, false);
+ }
+
+ pfree(oldrows);
+
+ /* add the NULL bucket to the histogram */
+ histogram->buckets[histogram->nbuckets++] = null_bucket;
+
+ /*
+ * And now run the function recursively on both buckets (the new
+ * one first, because the call may change the number of buckets, and
+ * it's used as an index).
+ */
+ create_null_buckets(histogram, (histogram->nbuckets-1), attrs, stats);
+ create_null_buckets(histogram, bucket_idx, attrs, stats);
+}
+
+/*
+ * We need to pass the SortSupport to the comparator, but bsearch()
+ * has no 'context' parameter, so we use a global variable (ugly).
+ */
+static int
+bsearch_comparator(const void * a, const void * b)
+{
+ Assert(ssup_private != NULL);
+ return compare_scalars_simple(a, b, (void*)ssup_private);
+}
+
+/*
+ * SRF with details about buckets of a histogram:
+ *
+ * - bucket ID (0...nbuckets)
+ * - min values (string array)
+ * - max values (string array)
+ * - nulls only (boolean array)
+ * - min inclusive flags (boolean array)
+ * - max inclusive flags (boolean array)
+ * - frequency (double precision)
+ * - density (double precision)
+ * - bucket size (double precision)
+ *
+ * The input is the OID of the statistics, and there are no rows
+ * returned if the statistics contains no histogram (or if there's no
+ * statistics for the OID).
+ *
+ * The second parameter (type) determines what values will be returned
+ * in the (minvals,maxvals). There are three possible values:
+ *
+ * 0 (actual values)
+ * -----------------
+ * - prints actual values
+ * - using the output function of the data type (as string)
+ * - handy for investigating the histogram
+ *
+ * 1 (distinct index)
+ * ------------------
+ * - prints index of the distinct value (into the serialized array)
+ * - makes it easier to spot neighbor buckets, etc.
+ * - handy for plotting the histogram
+ *
+ * 2 (normalized distinct index)
+ * -----------------------------
+ * - prints index of the distinct value, but normalized into [0,1]
+ * - similar to 1, but shows how 'long' the bucket range is
+ * - handy for plotting the histogram
+ *
+ * When plotting the histogram, be careful as the (1) and (2) options
+ * skew the lengths by distributing the distinct values uniformly. For
+ * data types without a clear meaning of 'distance' (e.g. strings) that
+ * is not a big deal, but for numbers it may be confusing.
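+ *
+ * A hypothetical usage example (16400 stands in for an actual
+ * pg_mv_statistic OID), printing the actual boundary values:
+ *
+ * SELECT * FROM pg_mv_histogram_buckets(16400, 0);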
+ */
+PG_FUNCTION_INFO_V1(pg_mv_histogram_buckets);
+
+Datum
+pg_mv_histogram_buckets(PG_FUNCTION_ARGS)
+{
+ FuncCallContext *funcctx;
+ int call_cntr;
+ int max_calls;
+ TupleDesc tupdesc;
+ AttInMetadata *attinmeta;
+
+ Oid mvoid = PG_GETARG_OID(0);
+ int otype = PG_GETARG_INT32(1);
+
+ if ((otype < 0) || (otype > 2))
+ elog(ERROR, "invalid output type specified");
+
+ /* stuff done only on the first call of the function */
+ if (SRF_IS_FIRSTCALL())
+ {
+ MemoryContext oldcontext;
+ MVSerializedHistogram histogram;
+
+ /* create a function context for cross-call persistence */
+ funcctx = SRF_FIRSTCALL_INIT();
+
+ /* switch to memory context appropriate for multiple function calls */
+ oldcontext = MemoryContextSwitchTo(funcctx->multi_call_memory_ctx);
+
+ histogram = load_mv_histogram(mvoid);
+
+ funcctx->user_fctx = histogram;
+
+ /* total number of tuples to be returned */
+ funcctx->max_calls = 0;
+ if (funcctx->user_fctx != NULL)
+ funcctx->max_calls = histogram->nbuckets;
+
+ /* Build a tuple descriptor for our result type */
+ if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("function returning record called in context "
+ "that cannot accept type record")));
+
+ /*
+ * generate attribute metadata needed later to produce tuples
+ * from raw C strings
+ */
+ attinmeta = TupleDescGetAttInMetadata(tupdesc);
+ funcctx->attinmeta = attinmeta;
+
+ MemoryContextSwitchTo(oldcontext);
+ }
+
+ /* stuff done on every call of the function */
+ funcctx = SRF_PERCALL_SETUP();
+
+ call_cntr = funcctx->call_cntr;
+ max_calls = funcctx->max_calls;
+ attinmeta = funcctx->attinmeta;
+
+ if (call_cntr < max_calls) /* do when there is more left to send */
+ {
+ char **values;
+ HeapTuple tuple;
+ Datum result;
+ int2vector *stakeys;
+ Oid relid;
+ double bucket_size = 1.0;
+
+ char *buff = palloc0(1024);
+ char *format;
+
+ int i;
+
+ Oid *outfuncs;
+ FmgrInfo *fmgrinfo;
+
+ MVSerializedHistogram histogram;
+ MVSerializedBucket bucket;
+
+ histogram = (MVSerializedHistogram)funcctx->user_fctx;
+
+ Assert(call_cntr < histogram->nbuckets);
+
+ bucket = histogram->buckets[call_cntr];
+
+ stakeys = find_mv_attnums(mvoid, &relid);
+
+ /*
+ * Prepare a values array for building the returned tuple.
+ * This should be an array of C strings which will
+ * be processed later by the type input functions.
+ */
+ values = (char **) palloc(9 * sizeof(char *));
+
+ values[0] = (char *) palloc(64 * sizeof(char));
+
+ /* arrays */
+ values[1] = (char *) palloc0(1024 * sizeof(char));
+ values[2] = (char *) palloc0(1024 * sizeof(char));
+ values[3] = (char *) palloc0(1024 * sizeof(char));
+ values[4] = (char *) palloc0(1024 * sizeof(char));
+ values[5] = (char *) palloc0(1024 * sizeof(char));
+
+ values[6] = (char *) palloc(64 * sizeof(char));
+ values[7] = (char *) palloc(64 * sizeof(char));
+ values[8] = (char *) palloc(64 * sizeof(char));
+
+ /* we need to do this only when printing the actual values */
+ outfuncs = (Oid*)palloc0(sizeof(Oid) * histogram->ndimensions);
+ fmgrinfo = (FmgrInfo*)palloc0(sizeof(FmgrInfo) * histogram->ndimensions);
+
+ for (i = 0; i < histogram->ndimensions; i++)
+ {
+ bool isvarlena;
+
+ getTypeOutputInfo(get_atttype(relid, stakeys->values[i]),
+ &outfuncs[i], &isvarlena);
+
+ fmgr_info(outfuncs[i], &fmgrinfo[i]);
+ }
+
+ snprintf(values[0], 64, "%d", call_cntr); /* bucket ID */
+
+ /*
+ * For otype 0 we print the actual min/max values (using the output
+ * function of the attribute type), otherwise we print indexes into
+ * the deduplicated arrays - those arrays are sorted, so the indexes
+ * are useful e.g. for plotting the histogram
+ */
+
+ for (i = 0; i < histogram->ndimensions; i++)
+ {
+ bucket_size *= (bucket->max[i] - bucket->min[i]) * 1.0
+ / (histogram->nvalues[i]-1);
+
+ /* print the actual values, i.e. use output function etc. */
+ if (otype == 0)
+ {
+ Datum minval, maxval;
+ Datum minout, maxout;
+
+ format = "%s, %s";
+ if (i == 0)
+ format = "{%s%s";
+ else if (i == histogram->ndimensions-1)
+ format = "%s, %s}";
+
+ minval = histogram->values[i][bucket->min[i]];
+ minout = FunctionCall1(&fmgrinfo[i], minval);
+
+ maxval = histogram->values[i][bucket->max[i]];
+ maxout = FunctionCall1(&fmgrinfo[i], maxval);
+
+ snprintf(buff, 1024, format, values[1], DatumGetPointer(minout));
+ strncpy(values[1], buff, 1023);
+ buff[0] = '\0';
+
+ snprintf(buff, 1024, format, values[2], DatumGetPointer(maxout));
+ strncpy(values[2], buff, 1023);
+ buff[0] = '\0';
+ }
+ else if (otype == 1)
+ {
+ format = "%s, %d";
+ if (i == 0)
+ format = "{%s%d";
+ else if (i == histogram->ndimensions-1)
+ format = "%s, %d}";
+
+ snprintf(buff, 1024, format, values[1], bucket->min[i]);
+ strncpy(values[1], buff, 1023);
+ buff[0] = '\0';
+
+ snprintf(buff, 1024, format, values[2], bucket->max[i]);
+ strncpy(values[2], buff, 1023);
+ buff[0] = '\0';
+ }
+ else
+ {
+ format = "%s, %f";
+ if (i == 0)
+ format = "{%s%f";
+ else if (i == histogram->ndimensions-1)
+ format = "%s, %f}";
+
+ snprintf(buff, 1024, format, values[1],
+ bucket->min[i] * 1.0 / (histogram->nvalues[i]-1));
+ strncpy(values[1], buff, 1023);
+ buff[0] = '\0';
+
+ snprintf(buff, 1024, format, values[2],
+ bucket->max[i] * 1.0 / (histogram->nvalues[i]-1));
+ strncpy(values[2], buff, 1023);
+ buff[0] = '\0';
+ }
+
+ format = "%s, %s";
+ if (i == 0)
+ format = "{%s%s";
+ else if (i == histogram->ndimensions-1)
+ format = "%s, %s}";
+
+ snprintf(buff, 1024, format, values[3], bucket->nullsonly[i] ? "t" : "f");
+ strncpy(values[3], buff, 1023);
+ buff[0] = '\0';
+
+ snprintf(buff, 1024, format, values[4], bucket->min_inclusive[i] ? "t" : "f");
+ strncpy(values[4], buff, 1023);
+ buff[0] = '\0';
+
+ snprintf(buff, 1024, format, values[5], bucket->max_inclusive[i] ? "t" : "f");
+ strncpy(values[5], buff, 1023);
+ buff[0] = '\0';
+ }
+
+ snprintf(values[6], 64, "%f", bucket->ntuples); /* frequency */
+ snprintf(values[7], 64, "%f", bucket->ntuples / bucket_size); /* density */
+ snprintf(values[8], 64, "%f", bucket_size); /* bucket_size */
+
+ /* build a tuple */
+ tuple = BuildTupleFromCStrings(attinmeta, values);
+
+ /* make the tuple into a datum */
+ result = HeapTupleGetDatum(tuple);
+
+ /* clean up (this is not really necessary) */
+ pfree(values[0]);
+ pfree(values[1]);
+ pfree(values[2]);
+ pfree(values[3]);
+ pfree(values[4]);
+ pfree(values[5]);
+ pfree(values[6]);
+ pfree(values[7]);
+ pfree(values[8]);
+
+ pfree(values);
+
+ SRF_RETURN_NEXT(funcctx, result);
+ }
+ else /* do when there is no more left */
+ {
+ SRF_RETURN_DONE(funcctx);
+ }
+}
+
+#ifdef DEBUG_MVHIST
+/*
+ * prints debugging info about matched histogram buckets (full/partial)
+ *
+ * XXX Currently works only for INT data type.
+ */
+void
+debug_histogram_matches(MVSerializedHistogram mvhist, char *matches)
+{
+ int i, j;
+
+ float ffull = 0, fpartial = 0;
+ int nfull = 0, npartial = 0;
+
+ for (i = 0; i < mvhist->nbuckets; i++)
+ {
+ MVSerializedBucket bucket = mvhist->buckets[i];
+
+ char ranges[1024];
+
+ if (! matches[i])
+ continue;
+
+ /* increment the counters */
+ nfull += (matches[i] == MVSTATS_MATCH_FULL) ? 1 : 0;
+ npartial += (matches[i] == MVSTATS_MATCH_PARTIAL) ? 1 : 0;
+
+ /* and also update the frequencies */
+ ffull += (matches[i] == MVSTATS_MATCH_FULL) ? bucket->ntuples : 0;
+ fpartial += (matches[i] == MVSTATS_MATCH_PARTIAL) ? bucket->ntuples : 0;
+
+ memset(ranges, 0, sizeof(ranges));
+
+ /* build ranges for all the dimensions */
+ for (j = 0; j < mvhist->ndimensions; j++)
+ {
+ /* append to the buffer (sprintf with the buffer as its own source is undefined) */
+ snprintf(ranges + strlen(ranges), sizeof(ranges) - strlen(ranges), " [%d %d]",
+ DatumGetInt32(mvhist->values[j][bucket->min[j]]),
+ DatumGetInt32(mvhist->values[j][bucket->max[j]]));
+ }
+
+ elog(WARNING, "bucket %d %s => %d [%f]", i, ranges, matches[i], bucket->ntuples);
+ }
+
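+ /* count partially matched buckets with a 0.5 weight (half of each such bucket is assumed to match) */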
+ elog(WARNING, "full=%f partial=%f (%f)", ffull, fpartial, (ffull + 0.5 * fpartial));
+}
+#endif
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 6339631..3543239 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -2109,9 +2109,9 @@ describeOneTableDetails(const char *schemaname,
{
printfPQExpBuffer(&buf,
"SELECT oid, stanamespace::regnamespace AS nsp, staname, stakeys,\n"
- " deps_enabled, mcv_enabled,\n"
- " deps_built, mcv_built,\n"
- " mcv_max_items,\n"
+ " deps_enabled, mcv_enabled, hist_enabled,\n"
+ " deps_built, mcv_built, hist_built,\n"
+ " mcv_max_items, hist_max_buckets,\n"
" (SELECT string_agg(attname::text,', ')\n"
" FROM ((SELECT unnest(stakeys) AS attnum) s\n"
" JOIN pg_attribute a ON (starelid = a.attrelid and a.attnum = s.attnum))) AS attnums\n"
@@ -2154,8 +2154,17 @@ describeOneTableDetails(const char *schemaname,
first = false;
}
+ if (!strcmp(PQgetvalue(result, i, 6), "t"))
+ {
+ if (! first)
+ appendPQExpBuffer(&buf, ", histogram");
+ else
+ appendPQExpBuffer(&buf, "(histogram");
+ first = false;
+ }
+
appendPQExpBuffer(&buf, ") ON (%s)",
- PQgetvalue(result, i, 9));
+ PQgetvalue(result, i, 12));
printTableAddFooter(&cont, buf.data);
}
diff --git a/src/include/catalog/pg_mv_statistic.h b/src/include/catalog/pg_mv_statistic.h
index fd7107d..a5945af 100644
--- a/src/include/catalog/pg_mv_statistic.h
+++ b/src/include/catalog/pg_mv_statistic.h
@@ -38,13 +38,16 @@ CATALOG(pg_mv_statistic,3381)
/* statistics requested to build */
bool deps_enabled; /* analyze dependencies? */
bool mcv_enabled; /* build MCV list? */
+ bool hist_enabled; /* build histogram? */
- /* MCV size */
+ /* histogram / MCV size */
int32 mcv_max_items; /* max MCV items */
+ int32 hist_max_buckets; /* max histogram buckets */
/* statistics that are available (if requested) */
bool deps_built; /* dependencies were built */
bool mcv_built; /* MCV list was built */
+ bool hist_built; /* histogram was built */
/* variable-length fields start here, but we allow direct access to stakeys */
int2vector stakeys; /* array of column keys */
@@ -52,6 +55,7 @@ CATALOG(pg_mv_statistic,3381)
#ifdef CATALOG_VARLEN
bytea stadeps; /* dependencies (serialized) */
bytea stamcv; /* MCV list (serialized) */
+ bytea stahist; /* MV histogram (serialized) */
#endif
} FormData_pg_mv_statistic;
@@ -67,17 +71,21 @@ typedef FormData_pg_mv_statistic *Form_pg_mv_statistic;
* compiler constants for pg_mv_statistic
* ----------------
*/
-#define Natts_pg_mv_statistic 11
+#define Natts_pg_mv_statistic 15
#define Anum_pg_mv_statistic_starelid 1
#define Anum_pg_mv_statistic_staname 2
#define Anum_pg_mv_statistic_stanamespace 3
#define Anum_pg_mv_statistic_deps_enabled 4
#define Anum_pg_mv_statistic_mcv_enabled 5
-#define Anum_pg_mv_statistic_mcv_max_items 6
-#define Anum_pg_mv_statistic_deps_built 7
-#define Anum_pg_mv_statistic_mcv_built 8
-#define Anum_pg_mv_statistic_stakeys 9
-#define Anum_pg_mv_statistic_stadeps 10
-#define Anum_pg_mv_statistic_stamcv 11
+#define Anum_pg_mv_statistic_hist_enabled 6
+#define Anum_pg_mv_statistic_mcv_max_items 7
+#define Anum_pg_mv_statistic_hist_max_buckets 8
+#define Anum_pg_mv_statistic_deps_built 9
+#define Anum_pg_mv_statistic_mcv_built 10
+#define Anum_pg_mv_statistic_hist_built 11
+#define Anum_pg_mv_statistic_stakeys 12
+#define Anum_pg_mv_statistic_stadeps 13
+#define Anum_pg_mv_statistic_stamcv 14
+#define Anum_pg_mv_statistic_stahist 15
#endif /* PG_MV_STATISTIC_H */
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
index 66b4bcd..7e915bd 100644
--- a/src/include/catalog/pg_proc.h
+++ b/src/include/catalog/pg_proc.h
@@ -2674,6 +2674,10 @@ DATA(insert OID = 3376 ( pg_mv_stats_mcvlist_info PGNSP PGUID 12 1 0 0 0 f f f
DESCR("multi-variate statistics: MCV list info");
DATA(insert OID = 3373 ( pg_mv_mcv_items PGNSP PGUID 12 1 1000 0 0 f f f f t t i s 1 0 2249 "26" "{26,23,1009,1000,701}" "{i,o,o,o,o}" "{oid,index,values,nulls,frequency}" _null_ _null_ pg_mv_mcv_items _null_ _null_ _null_ ));
DESCR("details about MCV list items");
+DATA(insert OID = 3375 ( pg_mv_stats_histogram_info PGNSP PGUID 12 1 0 0 0 f f f f t f i s 1 0 25 "17" _null_ _null_ _null_ _null_ _null_ pg_mv_stats_histogram_info _null_ _null_ _null_ ));
+DESCR("multi-variate statistics: histogram info");
+DATA(insert OID = 3374 ( pg_mv_histogram_buckets PGNSP PGUID 12 1 1000 0 0 f f f f t t i s 2 0 2249 "26 23" "{26,23,23,1009,1009,1000,1000,1000,701,701,701}" "{i,i,o,o,o,o,o,o,o,o,o}" "{oid,otype,index,minvals,maxvals,nullsonly,mininclusive,maxinclusive,frequency,density,bucket_size}" _null_ _null_ pg_mv_histogram_buckets _null_ _null_ _null_ ));
+DESCR("details about histogram buckets");
DATA(insert OID = 1928 ( pg_stat_get_numscans PGNSP PGUID 12 1 0 0 0 f f f f t f s r 1 0 20 "26" _null_ _null_ _null_ _null_ _null_ pg_stat_get_numscans _null_ _null_ _null_ ));
DESCR("statistics: number of scans done for table/index");
diff --git a/src/include/nodes/relation.h b/src/include/nodes/relation.h
index 5ae6b3c..46bece6 100644
--- a/src/include/nodes/relation.h
+++ b/src/include/nodes/relation.h
@@ -620,10 +620,12 @@ typedef struct MVStatisticInfo
/* enabled statistics */
bool deps_enabled; /* functional dependencies enabled */
bool mcv_enabled; /* MCV list enabled */
+ bool hist_enabled; /* histogram enabled */
/* built/available statistics */
bool deps_built; /* functional dependencies built */
bool mcv_built; /* MCV list built */
+ bool hist_built; /* histogram built */
/* columns in the statistics (attnums) */
int2vector *stakeys; /* attnums of the columns covered */
diff --git a/src/include/utils/mvstats.h b/src/include/utils/mvstats.h
index 4535db7..f05a517 100644
--- a/src/include/utils/mvstats.h
+++ b/src/include/utils/mvstats.h
@@ -92,6 +92,123 @@ typedef MCVListData *MCVList;
#define MVSTAT_MCVLIST_MAX_ITEMS 8192 /* max items in MCV list */
/*
+ * Multivariate histograms
+ */
+typedef struct MVBucketData {
+
+ /* Frequency of this bucket. */
+ float ntuples; /* frequency of tuples in this bucket */
+
+ /*
+ * Information about dimensions being NULL-only. Not yet used.
+ */
+ bool *nullsonly;
+
+ /* lower boundaries - values and information about the inequalities */
+ Datum *min;
+ bool *min_inclusive;
+
+ /* upper boundaries - values and information about the inequalities */
+ Datum *max;
+ bool *max_inclusive;
+
+ /* used when building the histogram (not serialized/deserialized) */
+ void *build_data;
+
+} MVBucketData;
+
+typedef MVBucketData *MVBucket;
+
+
+typedef struct MVHistogramData {
+
+ uint32 magic; /* magic constant marker */
+ uint32 type; /* type of histogram (BASIC) */
+ uint32 nbuckets; /* number of buckets (buckets array) */
+ uint32 ndimensions; /* number of dimensions */
+
+ MVBucket *buckets; /* array of buckets */
+
+} MVHistogramData;
+
+typedef MVHistogramData *MVHistogram;
+
+/*
+ * Histogram in a partially serialized form, with deduplicated boundary
+ * values etc.
+ *
+ * TODO add more detailed description here
+ */
+
+typedef struct MVSerializedBucketData {
+
+ /* Frequency of this bucket. */
+ float ntuples; /* frequency of tuples in this bucket */
+
+ /*
+ * Information about dimensions being NULL-only. Not yet used.
+ */
+ bool *nullsonly;
+
+ /* indexes of lower boundaries, and information about the
+ * inequalities (exclusive vs. inclusive) */
+ uint16 *min;
+ bool *min_inclusive;
+
+ /* indexes of upper boundaries, and information about the
+ * inequalities (exclusive vs. inclusive) */
+ uint16 *max;
+ bool *max_inclusive;
+
+} MVSerializedBucketData;
+
+typedef MVSerializedBucketData *MVSerializedBucket;
+
+typedef struct MVSerializedHistogramData {
+
+ uint32 magic; /* magic constant marker */
+ uint32 type; /* type of histogram (BASIC) */
+ uint32 nbuckets; /* number of buckets (buckets array) */
+ uint32 ndimensions; /* number of dimensions */
+
+ /*
+ * Keep this in sync with MVHistogramData - the deserialization
+ * code relies on this field being at the same offset.
+ */
+ MVSerializedBucket *buckets; /* array of buckets */
+
+ /*
+ * serialized boundary values, one array per dimension, deduplicated
+ * (the min/max indexes point into these arrays)
+ */
+ int *nvalues;
+ Datum **values;
+
+} MVSerializedHistogramData;
+
+typedef MVSerializedHistogramData *MVSerializedHistogram;
+
+
+/* used to flag stats serialized to bytea */
+#define MVSTAT_HIST_MAGIC 0x7F8C5670 /* marks serialized bytea */
+#define MVSTAT_HIST_TYPE_BASIC 1 /* basic histogram type */
+
+/*
+ * Limits used for max_buckets option, i.e. we're always guaranteed
+ * to have space for at least MVSTAT_HIST_MIN_BUCKETS, and we cannot
+ * have more than MVSTAT_HIST_MAX_BUCKETS buckets.
+ *
+ * This is just a boundary for the 'max' threshold - the actual
+ * histogram may use fewer buckets than MVSTAT_HIST_MAX_BUCKETS.
+ *
+ * TODO The MVSTAT_HIST_MIN_BUCKETS should be related to the number of
+ * attributes (MVSTATS_MAX_DIMENSIONS) because of NULL-buckets.
+ * There should be at least 2^N buckets, otherwise we may be unable
+ * to build the NULL buckets.
+ */
+#define MVSTAT_HIST_MIN_BUCKETS 128 /* min number of buckets */
+#define MVSTAT_HIST_MAX_BUCKETS 16384 /* max number of buckets */
+
+/*
* TODO Maybe fetching the histogram/MCV list separately is inefficient?
* Consider adding a single `fetch_stats` method, fetching all
* stats specified using flags (or something like that).
@@ -99,20 +216,25 @@ typedef MCVListData *MCVList;
MVDependencies load_mv_dependencies(Oid mvoid);
MCVList load_mv_mcvlist(Oid mvoid);
+MVSerializedHistogram load_mv_histogram(Oid mvoid);
bytea * serialize_mv_dependencies(MVDependencies dependencies);
bytea * serialize_mv_mcvlist(MCVList mcvlist, int2vector *attrs,
VacAttrStats **stats);
+bytea * serialize_mv_histogram(MVHistogram histogram, int2vector *attrs,
+ VacAttrStats **stats);
/* deserialization of stats (serialization is private to analyze) */
MVDependencies deserialize_mv_dependencies(bytea * data);
MCVList deserialize_mv_mcvlist(bytea * data);
+MVSerializedHistogram deserialize_mv_histogram(bytea * data);
/*
* Returns index of the attribute number within the vector (i.e. a
* dimension within the stats).
*/
int mv_get_index(AttrNumber varattno, int2vector * stakeys);
int2vector* find_mv_attnums(Oid mvoid, Oid *relid);
@@ -121,6 +243,8 @@ extern Datum pg_mv_stats_dependencies_info(PG_FUNCTION_ARGS);
extern Datum pg_mv_stats_dependencies_show(PG_FUNCTION_ARGS);
extern Datum pg_mv_stats_mcvlist_info(PG_FUNCTION_ARGS);
extern Datum pg_mv_mcvlist_items(PG_FUNCTION_ARGS);
+extern Datum pg_mv_stats_histogram_info(PG_FUNCTION_ARGS);
+extern Datum pg_mv_histogram_buckets(PG_FUNCTION_ARGS);
MVDependencies
build_mv_dependencies(int numrows, HeapTuple *rows, int2vector *attrs,
@@ -130,10 +254,20 @@ MCVList
build_mv_mcvlist(int numrows, HeapTuple *rows, int2vector *attrs,
VacAttrStats **stats, int *numrows_filtered);
+MVHistogram
+build_mv_histogram(int numrows, HeapTuple *rows, int2vector *attrs,
+ VacAttrStats **stats, int numrows_total);
+
void build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
int natts, VacAttrStats **vacattrstats);
-void update_mv_stats(Oid relid, MVDependencies dependencies, MCVList mcvlist,
+void update_mv_stats(Oid relid, MVDependencies dependencies,
+ MCVList mcvlist, MVHistogram histogram,
int2vector *attrs, VacAttrStats **stats);
+#ifdef DEBUG_MVHIST
+extern void debug_histogram_matches(MVSerializedHistogram mvhist, char *matches);
+#endif
+
+
#endif
diff --git a/src/test/regress/expected/mv_histogram.out b/src/test/regress/expected/mv_histogram.out
new file mode 100644
index 0000000..a34edb8
--- /dev/null
+++ b/src/test/regress/expected/mv_histogram.out
@@ -0,0 +1,207 @@
+-- data type passed by value
+CREATE TABLE mv_histogram (
+ a INT,
+ b INT,
+ c INT
+);
+-- unknown column
+CREATE STATISTICS s1 ON mv_histogram (unknown_column) WITH (histogram);
+ERROR: column "unknown_column" referenced in statistics does not exist
+-- single column
+CREATE STATISTICS s1 ON mv_histogram (a) WITH (histogram);
+ERROR: multivariate stats require 2 or more columns
+-- single column, duplicated
+CREATE STATISTICS s1 ON mv_histogram (a, a) WITH (histogram);
+ERROR: duplicate column name in statistics definition
+-- two columns, one duplicated
+CREATE STATISTICS s1 ON mv_histogram (a, a, b) WITH (histogram);
+ERROR: duplicate column name in statistics definition
+-- unknown option
+CREATE STATISTICS s1 ON mv_histogram (a, b, c) WITH (unknown_option);
+ERROR: unrecognized STATISTICS option "unknown_option"
+-- missing histogram statistics
+CREATE STATISTICS s1 ON mv_histogram (a, b, c) WITH (dependencies, max_buckets=200);
+ERROR: option 'histogram' is required by other option(s)
+-- invalid max_buckets value / too low
+CREATE STATISTICS s1 ON mv_histogram (a, b, c) WITH (mcv, max_buckets=10);
+ERROR: minimum number of buckets is 128
+-- invalid max_buckets value / too high
+CREATE STATISTICS s1 ON mv_histogram (a, b, c) WITH (mcv, max_buckets=100000);
+ERROR: maximum number of buckets is 16384
+-- correct command
+CREATE STATISTICS s1 ON mv_histogram (a, b, c) WITH (histogram);
+-- random data (no functional dependencies)
+INSERT INTO mv_histogram
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+ANALYZE mv_histogram;
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+ hist_enabled | hist_built
+--------------+------------
+ t | t
+(1 row)
+
+TRUNCATE mv_histogram;
+-- a => b, a => c, b => c
+INSERT INTO mv_histogram
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE mv_histogram;
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+ hist_enabled | hist_built
+--------------+------------
+ t | t
+(1 row)
+
+TRUNCATE mv_histogram;
+-- a => b, a => c
+INSERT INTO mv_histogram
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE mv_histogram;
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+ hist_enabled | hist_built
+--------------+------------
+ t | t
+(1 row)
+
+TRUNCATE mv_histogram;
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO mv_histogram
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX hist_idx ON mv_histogram (a, b);
+ANALYZE mv_histogram;
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+ hist_enabled | hist_built
+--------------+------------
+ t | t
+(1 row)
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mv_histogram WHERE a = 10 AND b = 5;
+ QUERY PLAN
+--------------------------------------------
+ Bitmap Heap Scan on mv_histogram
+ Recheck Cond: ((a = 10) AND (b = 5))
+ -> Bitmap Index Scan on hist_idx
+ Index Cond: ((a = 10) AND (b = 5))
+(4 rows)
+
+DROP TABLE mv_histogram;
+-- varlena type (text)
+CREATE TABLE mv_histogram (
+ a TEXT,
+ b TEXT,
+ c TEXT
+);
+CREATE STATISTICS s2 ON mv_histogram (a, b, c) WITH (histogram);
+-- random data (no functional dependencies)
+INSERT INTO mv_histogram
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+ANALYZE mv_histogram;
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+ hist_enabled | hist_built
+--------------+------------
+ t | t
+(1 row)
+
+TRUNCATE mv_histogram;
+-- a => b, a => c, b => c
+INSERT INTO mv_histogram
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE mv_histogram;
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+ hist_enabled | hist_built
+--------------+------------
+ t | t
+(1 row)
+
+TRUNCATE mv_histogram;
+-- a => b, a => c
+INSERT INTO mv_histogram
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE mv_histogram;
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+ hist_enabled | hist_built
+--------------+------------
+ t | t
+(1 row)
+
+TRUNCATE mv_histogram;
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO mv_histogram
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX hist_idx ON mv_histogram (a, b);
+ANALYZE mv_histogram;
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+ hist_enabled | hist_built
+--------------+------------
+ t | t
+(1 row)
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mv_histogram WHERE a = '10' AND b = '5';
+ QUERY PLAN
+------------------------------------------------------------
+ Bitmap Heap Scan on mv_histogram
+ Recheck Cond: ((a = '10'::text) AND (b = '5'::text))
+ -> Bitmap Index Scan on hist_idx
+ Index Cond: ((a = '10'::text) AND (b = '5'::text))
+(4 rows)
+
+TRUNCATE mv_histogram;
+-- check explain (expect bitmap index scan, not plain index scan) with NULLs
+INSERT INTO mv_histogram
+ SELECT
+ (CASE WHEN i/10000 = 0 THEN NULL ELSE i/10000 END),
+ (CASE WHEN i/20000 = 0 THEN NULL ELSE i/20000 END),
+ (CASE WHEN i/40000 = 0 THEN NULL ELSE i/40000 END)
+ FROM generate_series(1,1000000) s(i);
+ANALYZE mv_histogram;
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+ hist_enabled | hist_built
+--------------+------------
+ t | t
+(1 row)
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mv_histogram WHERE a IS NULL AND b IS NULL;
+ QUERY PLAN
+---------------------------------------------------
+ Bitmap Heap Scan on mv_histogram
+ Recheck Cond: ((a IS NULL) AND (b IS NULL))
+ -> Bitmap Index Scan on hist_idx
+ Index Cond: ((a IS NULL) AND (b IS NULL))
+(4 rows)
+
+DROP TABLE mv_histogram;
+-- NULL values (mix of int and text columns)
+CREATE TABLE mv_histogram (
+ a INT,
+ b TEXT,
+ c INT,
+ d TEXT
+);
+CREATE STATISTICS s3 ON mv_histogram (a, b, c, d) WITH (histogram);
+INSERT INTO mv_histogram
+ SELECT
+ mod(i, 100),
+ (CASE WHEN mod(i, 200) = 0 THEN NULL ELSE mod(i,200) END),
+ mod(i, 400),
+ (CASE WHEN mod(i, 300) = 0 THEN NULL ELSE mod(i,600) END)
+ FROM generate_series(1,10000) s(i);
+ANALYZE mv_histogram;
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+ hist_enabled | hist_built
+--------------+------------
+ t | t
+(1 row)
+
+DROP TABLE mv_histogram;
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 66071d8..1a1a4ca 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1375,7 +1375,9 @@ pg_mv_stats| SELECT n.nspname AS schemaname,
length(s.stadeps) AS depsbytes,
pg_mv_stats_dependencies_info(s.stadeps) AS depsinfo,
length(s.stamcv) AS mcvbytes,
- pg_mv_stats_mcvlist_info(s.stamcv) AS mcvinfo
+ pg_mv_stats_mcvlist_info(s.stamcv) AS mcvinfo,
+ length(s.stahist) AS histbytes,
+ pg_mv_stats_histogram_info(s.stahist) AS histinfo
FROM ((pg_mv_statistic s
JOIN pg_class c ON ((c.oid = s.starelid)))
LEFT JOIN pg_namespace n ON ((n.oid = c.relnamespace)));
diff --git a/src/test/regress/parallel_schedule b/src/test/regress/parallel_schedule
index 85d94f1..a885235 100644
--- a/src/test/regress/parallel_schedule
+++ b/src/test/regress/parallel_schedule
@@ -112,4 +112,4 @@ test: event_trigger
test: stats
# run tests of multivariate stats
-test: mv_dependencies mv_mcv
+test: mv_dependencies mv_mcv mv_histogram
diff --git a/src/test/regress/serial_schedule b/src/test/regress/serial_schedule
index 6584d73..2efdcd7 100644
--- a/src/test/regress/serial_schedule
+++ b/src/test/regress/serial_schedule
@@ -164,3 +164,4 @@ test: event_trigger
test: stats
test: mv_dependencies
test: mv_mcv
+test: mv_histogram
diff --git a/src/test/regress/sql/mv_histogram.sql b/src/test/regress/sql/mv_histogram.sql
new file mode 100644
index 0000000..02f49b4
--- /dev/null
+++ b/src/test/regress/sql/mv_histogram.sql
@@ -0,0 +1,176 @@
+-- data type passed by value
+CREATE TABLE mv_histogram (
+ a INT,
+ b INT,
+ c INT
+);
+
+-- unknown column
+CREATE STATISTICS s1 ON mv_histogram (unknown_column) WITH (histogram);
+
+-- single column
+CREATE STATISTICS s1 ON mv_histogram (a) WITH (histogram);
+
+-- single column, duplicated
+CREATE STATISTICS s1 ON mv_histogram (a, a) WITH (histogram);
+
+-- two columns, one duplicated
+CREATE STATISTICS s1 ON mv_histogram (a, a, b) WITH (histogram);
+
+-- unknown option
+CREATE STATISTICS s1 ON mv_histogram (a, b, c) WITH (unknown_option);
+
+-- missing histogram statistics
+CREATE STATISTICS s1 ON mv_histogram (a, b, c) WITH (dependencies, max_buckets=200);
+
+-- invalid max_buckets value / too low
+CREATE STATISTICS s1 ON mv_histogram (a, b, c) WITH (mcv, max_buckets=10);
+
+-- invalid max_buckets value / too high
+CREATE STATISTICS s1 ON mv_histogram (a, b, c) WITH (mcv, max_buckets=100000);
+
+-- correct command
+CREATE STATISTICS s1 ON mv_histogram (a, b, c) WITH (histogram);
+
+-- random data (no functional dependencies)
+INSERT INTO mv_histogram
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+
+ANALYZE mv_histogram;
+
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+
+TRUNCATE mv_histogram;
+
+-- a => b, a => c, b => c
+INSERT INTO mv_histogram
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+
+ANALYZE mv_histogram;
+
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+
+TRUNCATE mv_histogram;
+
+-- a => b, a => c
+INSERT INTO mv_histogram
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE mv_histogram;
+
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+
+TRUNCATE mv_histogram;
+
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO mv_histogram
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX hist_idx ON mv_histogram (a, b);
+ANALYZE mv_histogram;
+
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mv_histogram WHERE a = 10 AND b = 5;
+
+DROP TABLE mv_histogram;
+
+-- varlena type (text)
+CREATE TABLE mv_histogram (
+ a TEXT,
+ b TEXT,
+ c TEXT
+);
+
+CREATE STATISTICS s2 ON mv_histogram (a, b, c) WITH (histogram);
+
+-- random data (no functional dependencies)
+INSERT INTO mv_histogram
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+
+ANALYZE mv_histogram;
+
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+
+TRUNCATE mv_histogram;
+
+-- a => b, a => c, b => c
+INSERT INTO mv_histogram
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+
+ANALYZE mv_histogram;
+
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+
+TRUNCATE mv_histogram;
+
+-- a => b, a => c
+INSERT INTO mv_histogram
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE mv_histogram;
+
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+
+TRUNCATE mv_histogram;
+
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO mv_histogram
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX hist_idx ON mv_histogram (a, b);
+ANALYZE mv_histogram;
+
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mv_histogram WHERE a = '10' AND b = '5';
+
+TRUNCATE mv_histogram;
+
+-- check explain (expect bitmap index scan, not plain index scan) with NULLs
+INSERT INTO mv_histogram
+ SELECT
+ (CASE WHEN i/10000 = 0 THEN NULL ELSE i/10000 END),
+ (CASE WHEN i/20000 = 0 THEN NULL ELSE i/20000 END),
+ (CASE WHEN i/40000 = 0 THEN NULL ELSE i/40000 END)
+ FROM generate_series(1,1000000) s(i);
+ANALYZE mv_histogram;
+
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mv_histogram WHERE a IS NULL AND b IS NULL;
+
+DROP TABLE mv_histogram;
+
+-- NULL values (mix of int and text columns)
+CREATE TABLE mv_histogram (
+ a INT,
+ b TEXT,
+ c INT,
+ d TEXT
+);
+
+CREATE STATISTICS s3 ON mv_histogram (a, b, c, d) WITH (histogram);
+
+INSERT INTO mv_histogram
+ SELECT
+ mod(i, 100),
+ (CASE WHEN mod(i, 200) = 0 THEN NULL ELSE mod(i,200) END),
+ mod(i, 400),
+ (CASE WHEN mod(i, 300) = 0 THEN NULL ELSE mod(i,600) END)
+ FROM generate_series(1,10000) s(i);
+
+ANALYZE mv_histogram;
+
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+
+DROP TABLE mv_histogram;
--
2.1.0
[Attachment: 0006-multi-statistics-estimation.patch]
From 0e10f5f26e546d835b493f84a3ebe2c904390228 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@pgaddict.com>
Date: Fri, 6 Feb 2015 01:42:38 +0100
Subject: [PATCH 6/9] multi-statistics estimation
The general idea is that a probability (which
is what selectivity is) can be split into a product of
conditional probabilities like this:
P(A & B & C) = P(A & B) * P(C|A & B)
If we assume that B and C are conditionally independent (given A),
the last factor may be simplified like this
P(A & B & C) = P(A & B) * P(C|A)
so we only need probabilities on [A,B] and [A,C] to compute
the original probability.
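For illustration, take three perfectly correlated conditions with
P(A) = P(B) = P(C) = 0.01. The independence assumption gives
P(A & B & C) = 0.01 * 0.01 * 0.01 = 0.000001
while the decomposition, using P(C|A) = 1.0, gives
P(A & B & C) = P(A & B) * P(C|A) = 0.01 * 1.0 = 0.01
which is the actual probability.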
The implementation works in the other direction, though.
We know what probability P(A & B & C) we need to compute,
and also what statistics are available.
So we search for a combination of statistics, covering
the clauses in an optimal way (most clauses covered, most
dependencies exploited).
There are two possible approaches - exhaustive and greedy.
The exhaustive one walks through all permutations of
stats using dynamic programming, so it's guaranteed to
find the optimal solution, but it soon gets very slow as
it's roughly O(N!). The dynamic programming may improve
that a bit, but it's still far too expensive for large
numbers of statistics (on a single table).
The greedy algorithm is very simple - in every step it chooses
the locally best statistics. That may not guarantee the globally
optimal solution (but maybe it does?), but it only needs N steps
to find a solution, so it's very fast (processing the selected
stats is usually way more expensive).
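To make the greedy step concrete, here is a simplified,
self-contained sketch (not the patch code - attribute sets are
plain bit masks and the score is just the number of newly covered
clause attributes; the real algorithm also weighs conditions and
requires at least two covered attributes per statistics):

#include <stdio.h>
#include <stdint.h>

/* count bits set in a mask (i.e. attributes covered) */
static int popcount(uint32_t x) { int n = 0; while (x) { n += x & 1; x >>= 1; } return n; }

int main(void)
{
	/* attributes a..e referenced by the clauses, as bits 0..4 */
	uint32_t clause_attnums = 0x1F;
	/* available stats: (a,b), (a,b,e), (a,b,c,d) */
	uint32_t stats[3] = {0x03, 0x13, 0x0F};
	int used[3] = {0, 0, 0};
	uint32_t covered = 0;

	while (1)
	{
		int i, best = -1, best_gain = 0;

		/* greedy step: pick the stats covering most uncovered attributes */
		for (i = 0; i < 3; i++)
		{
			int gain = popcount(stats[i] & clause_attnums & ~covered);

			if (!used[i] && gain > best_gain)
			{
				best = i;
				best_gain = gain;
			}
		}

		if (best < 0)
			break;	/* nothing useful left */

		used[best] = 1;
		covered |= (stats[best] & clause_attnums);
		printf("applying stats %d (new attributes: %d)\n", best, best_gain);
	}

	return 0;
}

With the stats (a,b), (a,b,e) and (a,b,c,d) this picks (a,b,c,d)
first and then (a,b,e), covering all five attributes in two steps.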
There's a GUC for selecting the search algorithm
mvstat_search = {'greedy', 'exhaustive'}
The default value is 'greedy' as that's much safer (with
respect to runtime). See choose_mv_statistics().
Once we have found a sequence of statistics, we apply
them to the clauses using the conditional probabilities.
We process the selected stats one by one, and for each
we select the estimated clauses and conditions. See
clauselist_selectivity() for more details.
Limitations
-----------
It's still true that each clause at a given level has to
be covered by a single MV statistics. So with this query
WHERE (clause1) AND (clause2) AND (clause3 OR clause4)
each parenthesized clause has to be covered by a single
multivariate statistics.
Clauses not covered by a single statistics at this level
will be passed to clause_selectivity() but this will treat
them as a collection of simpler clauses (connected by AND
or OR), and the clauses from the previous level will be
used as conditions.
So using the same example, the last clause will be passed
to clause_selectivity() with 'clause1' and 'clause2' as
conditions, and it will be processed using multivariate
stats if possible.
The other limitation is that all the expressions in a clause have
to be mv-compatible, i.e. a single clause can't mix compatible
and incompatible expressions. If this is violated, the clause may
be passed to the next level (just like with lists of clauses not
covered by a single statistics), which splits it into clauses
handled by multivariate stats and clauses handled by regular
statistics.
rework clauselist_selectivity_or to handle OR-clauses correctly
We might invent a completely new set of functions here, resembling
clauselist_selectivity but adapting the ideas to OR-clauses.
But luckily we know that each OR-clause
(a OR b OR c)
may be rewritten as an equivalent AND-clause using negation:
NOT ((NOT a) AND (NOT b) AND (NOT c))
And that's something we can pass to clauselist_selectivity.
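In terms of selectivities, the identity the code relies on is
P(a OR b OR c) = 1 - P((NOT a) AND (NOT b) AND (NOT c))
so instead of building the top-level NOT we can simply estimate
the AND-clause of the negated args and subtract the result from 1.0.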
histogram call cache
--------------------
The call cache was removed because it did not initially work
well with OR clauses, but that was just a stupid thinko in the
implementation. This patch re-adds it, hopefully correctly.
The code in update_match_bitmap_histogram() is overly complex,
the branches handling various inequality cases are redundant.
This needs to be simplified somehow.
---
contrib/file_fdw/file_fdw.c | 3 +-
contrib/postgres_fdw/postgres_fdw.c | 6 +-
src/backend/optimizer/path/clausesel.c | 1875 +++++++++++++++++++++++++++-----
src/backend/optimizer/path/costsize.c | 23 +-
src/backend/optimizer/util/orclauses.c | 4 +-
src/backend/utils/adt/selfuncs.c | 17 +-
src/backend/utils/misc/guc.c | 20 +
src/backend/utils/mvstats/README.stats | 166 +++
src/include/optimizer/cost.h | 6 +-
src/include/utils/mvstats.h | 8 +
10 files changed, 1833 insertions(+), 295 deletions(-)
diff --git a/contrib/file_fdw/file_fdw.c b/contrib/file_fdw/file_fdw.c
index dc035d7..8f11b7a 100644
--- a/contrib/file_fdw/file_fdw.c
+++ b/contrib/file_fdw/file_fdw.c
@@ -969,7 +969,8 @@ estimate_size(PlannerInfo *root, RelOptInfo *baserel,
baserel->baserestrictinfo,
0,
JOIN_INNER,
- NULL);
+ NULL,
+ NIL);
nrows = clamp_row_est(nrows);
diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c
index d79e4cc..2f4af21 100644
--- a/contrib/postgres_fdw/postgres_fdw.c
+++ b/contrib/postgres_fdw/postgres_fdw.c
@@ -498,7 +498,8 @@ postgresGetForeignRelSize(PlannerInfo *root,
fpinfo->local_conds,
baserel->relid,
JOIN_INNER,
- NULL);
+ NULL,
+ NIL);
cost_qual_eval(&fpinfo->local_conds_cost, fpinfo->local_conds, root);
@@ -2149,7 +2150,8 @@ estimate_path_cost_size(PlannerInfo *root,
local_param_join_conds,
foreignrel->relid,
JOIN_INNER,
- NULL);
+ NULL,
+ NIL);
local_sel *= fpinfo->local_conds_sel;
rows = clamp_row_est(rows * local_sel);
diff --git a/src/backend/optimizer/path/clausesel.c b/src/backend/optimizer/path/clausesel.c
index 647212a..d239488 100644
--- a/src/backend/optimizer/path/clausesel.c
+++ b/src/backend/optimizer/path/clausesel.c
@@ -29,6 +29,8 @@
#include "utils/selfuncs.h"
#include "utils/typcache.h"
+#include "miscadmin.h"
+
/*
* Data structure for accumulating info about possible range-query
@@ -44,6 +46,13 @@ typedef struct RangeQueryClause
Selectivity hibound; /* Selectivity of a var < something clause */
} RangeQueryClause;
+static Selectivity clauselist_selectivity_or(PlannerInfo *root,
+ List *clauses,
+ int varRelid,
+ JoinType jointype,
+ SpecialJoinInfo *sjinfo,
+ List *conditions);
+
static void addRangeClause(RangeQueryClause **rqlist, Node *clause,
bool varonleft, bool isLTsel, Selectivity s2);
@@ -62,23 +71,27 @@ static Bitmapset *collect_mv_attnums(List *clauses,
static int count_mv_attnums(List *clauses, Oid varRelid,
SpecialJoinInfo *sjinfo, int type);
+static List *clauses_matching_statistic(List **clauses, MVStatisticInfo *statistic,
+ int varRelid, SpecialJoinInfo *sjinfo, int types,
+ bool remove);
+
static List *clauselist_apply_dependencies(PlannerInfo *root, List *clauses,
Oid varRelid, List *stats,
SpecialJoinInfo *sjinfo);
-static MVStatisticInfo *choose_mv_statistics(List *mvstats, Bitmapset *attnums);
-
-static List *clauselist_mv_split(PlannerInfo *root, SpecialJoinInfo *sjinfo,
- List *clauses, Oid varRelid,
- List **mvclauses, MVStatisticInfo *mvstats, int types);
-
static Selectivity clauselist_mv_selectivity(PlannerInfo *root,
- List *clauses, MVStatisticInfo *mvstats);
+ MVStatisticInfo *mvstats, List *clauses,
+ List *conditions, bool is_or);
+
static Selectivity clauselist_mv_selectivity_mcvlist(PlannerInfo *root,
- List *clauses, MVStatisticInfo *mvstats,
- bool *fullmatch, Selectivity *lowsel);
+ MVStatisticInfo *mvstats,
+ List *clauses, List *conditions,
+ bool is_or, bool *fullmatch,
+ Selectivity *lowsel);
static Selectivity clauselist_mv_selectivity_histogram(PlannerInfo *root,
- List *clauses, MVStatisticInfo *mvstats);
+ MVStatisticInfo *mvstats,
+ List *clauses, List *conditions,
+ bool is_or);
static int update_match_bitmap_mcvlist(PlannerInfo *root, List *clauses,
int2vector *stakeys, MCVList mcvlist,
@@ -92,11 +105,59 @@ static int update_match_bitmap_histogram(PlannerInfo *root, List *clauses,
int nmatches, char * matches,
bool is_or);
+/*
+ * Describes a combination of multiple statistics covering the attributes
+ * referenced by the clauses. The array 'stats' (with nstats elements)
+ * lists the statistics in the order they are applied, along with the
+ * number of clauses and conditions covered by this solution.
+ *
+ * choose_mv_statistics_exhaustive() uses this to track both the current
+ * and the best solution while walking through the space of possible
+ * combinations.
+ */
+typedef struct mv_solution_t {
+ int nclauses; /* number of clauses covered */
+ int nconditions; /* number of conditions covered */
+ int nstats; /* number of stats applied */
+ int *stats; /* stats (in the apply order) */
+} mv_solution_t;
+
+static List *choose_mv_statistics(PlannerInfo *root,
+ List *mvstats,
+ List *clauses, List *conditions,
+ Oid varRelid,
+ SpecialJoinInfo *sjinfo);
+
+static List *filter_clauses(PlannerInfo *root, Oid varRelid,
+ SpecialJoinInfo *sjinfo, int type,
+ List *stats, List *clauses,
+ Bitmapset **attnums);
+
+static List *filter_stats(List *stats, Bitmapset *new_attnums,
+ Bitmapset *all_attnums);
+
+static Bitmapset **make_stats_attnums(MVStatisticInfo *mvstats,
+ int nmvstats);
+
+static MVStatisticInfo *make_stats_array(List *stats, int *nmvstats);
+
+static List* filter_redundant_stats(List *stats,
+ List *clauses, List *conditions);
+
+static Node** make_clauses_array(List *clauses, int *nclauses);
+
+static Bitmapset ** make_clauses_attnums(PlannerInfo *root, Oid varRelid,
+ SpecialJoinInfo *sjinfo, int type,
+ Node **clauses, int nclauses);
+
+static bool* make_cover_map(Bitmapset **stats_attnums, int nmvstats,
+ Bitmapset **clauses_attnums, int nclauses);
+
static bool has_stats(List *stats, int type);
static List * find_stats(PlannerInfo *root, List *clauses,
Oid varRelid, Index *relid);
-
+
static Bitmapset* fdeps_collect_attnums(List *stats);
static int *make_idx_to_attnum_mapping(Bitmapset *attnums);
@@ -119,6 +180,8 @@ static Bitmapset *fdeps_filter_clauses(PlannerInfo *root,
static Bitmapset * get_varattnos(Node * node, Index relid);
+int mvstat_search_type = MVSTAT_SEARCH_GREEDY;
+
/* used for merging bitmaps - AND (min), OR (max) */
#define MAX(x, y) (((x) > (y)) ? (x) : (y))
#define MIN(x, y) (((x) < (y)) ? (x) : (y))
@@ -192,14 +255,15 @@ clauselist_selectivity(PlannerInfo *root,
List *clauses,
int varRelid,
JoinType jointype,
- SpecialJoinInfo *sjinfo)
+ SpecialJoinInfo *sjinfo,
+ List *conditions)
{
Selectivity s1 = 1.0;
RangeQueryClause *rqlist = NULL;
ListCell *l;
/* processing mv stats */
- Oid relid = InvalidOid;
+ Index relid = InvalidOid;
/* list of multivariate stats on the relation */
List *stats = NIL;
@@ -208,12 +272,13 @@ clauselist_selectivity(PlannerInfo *root,
stats = find_stats(root, clauses, varRelid, &relid);
/*
- * If there's exactly one clause, then no use in trying to match up pairs,
- * so just go directly to clause_selectivity().
+ * If there's exactly one clause, then no use in trying to match up
+ * pairs, or matching multivariate statistics, so just go directly
+ * to clause_selectivity().
*/
if (list_length(clauses) == 1)
return clause_selectivity(root, (Node *) linitial(clauses),
- varRelid, jointype, sjinfo);
+ varRelid, jointype, sjinfo, conditions);
/*
* Apply functional dependencies, but first check that there are some stats
@@ -246,32 +311,101 @@ clauselist_selectivity(PlannerInfo *root,
(count_mv_attnums(clauses, varRelid, sjinfo,
MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST) >= 2))
{
- /* collect attributes from the compatible conditions */
- Bitmapset *mvattnums = collect_mv_attnums(clauses, varRelid, NULL, sjinfo,
- MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST);
+ ListCell *s;
+
+ /*
+ * Copy the conditions we got from the upper part of the expression tree
+ * so that we can add local conditions to it (we need to keep the
+ * original list intact, for sibling expressions - other expressions
+ * at the same level).
+ */
+ List *conditions_local = list_copy(conditions);
- /* and search for the statistic covering the most attributes */
- MVStatisticInfo *mvstat = choose_mv_statistics(stats, mvattnums);
+ /* find the best combination of statistics */
+ List *solution = choose_mv_statistics(root, stats,
+ clauses, conditions,
+ varRelid, sjinfo);
- if (mvstat != NULL) /* we have a matching stats */
+ /*
+ * We have a good solution, which is merely a list of statistics that
+ * we need to apply. We'll apply the statistics one by one (in the order
+ * as they appear in the list), and for each statistic we'll
+ *
+ * (1) find clauses compatible with the statistic (and remove them
+ * from the list)
+ *
+ * (2) find local conditions compatible with the statistic
+ *
+ * (3) do the estimation P(clauses | conditions)
+ *
+ * (4) append the estimated clauses to the local conditions, so that
+ * the subsequent statistics can use them
+ */
+ foreach (s, solution)
{
- /* clauses compatible with multi-variate stats */
- List *mvclauses = NIL;
+ MVStatisticInfo *mvstat = (MVStatisticInfo *)lfirst(s);
+
+ /* clauses compatible with the statistic we're applying right now */
+ List *stat_clauses = NIL;
+ List *stat_conditions = NIL;
+
+ /*
+ * Find clauses and conditions matching the statistic - the clauses
+ * need to be removed from the list, while conditions should remain
+ * there (so that we can apply them repeatedly).
+ *
+ * FIXME Perhaps this should also check compatibility with the type
+ * of stats (i.e. MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST).
+ */
+ stat_clauses
+ = clauses_matching_statistic(&clauses, mvstat, varRelid, sjinfo,
+ MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST,
+ true);
- /* split the clauselist into regular and mv-clauses */
- clauses = clauselist_mv_split(root, sjinfo, clauses,
- varRelid, &mvclauses, mvstat,
- (MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST));
+ stat_conditions
+ = clauses_matching_statistic(&conditions_local, mvstat, varRelid, sjinfo,
+ MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST,
+ false);
- /* we've chosen the histogram to match the clauses */
- Assert(mvclauses != NIL);
+ /*
+ * If we got no clauses to estimate, we've done something wrong -
+ * either during the optimization, when detecting compatible
+ * clauses, or somewhere else.
+ *
+ * Also, we need at least two attributes in clauses and conditions.
+ */
+ Assert(stat_clauses != NIL);
+ Assert(count_mv_attnums(list_union(stat_clauses, stat_conditions),
+ varRelid, sjinfo,
+ MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST) >= 2);
/* compute the multivariate stats */
- s1 *= clauselist_mv_selectivity(root, mvclauses, mvstat);
+ s1 *= clauselist_mv_selectivity(root, mvstat,
+ stat_clauses, stat_conditions,
+ false); /* AND */
+
+ /*
+ * Add the new clauses to the local conditions, so that we can use
+ * them for the subsequent statistics. We only add the clauses,
+ * because the conditions are already there (or should be).
+ */
+ conditions_local = list_concat(conditions_local, stat_clauses);
}
+
+ /* from now on, work only with the 'local' list of conditions */
+ conditions = conditions_local;
}
/*
+ * If there's exactly one clause, then no use in trying to match up
+ * pairs, so just go directly to clause_selectivity().
+ */
+ if (list_length(clauses) == 1)
+ return s1 * clause_selectivity(root, (Node *) linitial(clauses),
+ varRelid, jointype, sjinfo, conditions);
+
+ /*
* Initial scan over clauses. Anything that doesn't look like a potential
* rangequery clause gets multiplied into s1 and forgotten. Anything that
* does gets inserted into an rqlist entry.
@@ -283,7 +417,8 @@ clauselist_selectivity(PlannerInfo *root,
Selectivity s2;
/* Always compute the selectivity using clause_selectivity */
- s2 = clause_selectivity(root, clause, varRelid, jointype, sjinfo);
+ s2 = clause_selectivity(root, clause, varRelid, jointype, sjinfo,
+ conditions);
/*
* Check for being passed a RestrictInfo.
@@ -442,6 +577,55 @@ clauselist_selectivity(PlannerInfo *root,
}
/*
+ * Similar to clauselist_selectivity(), but for OR-clauses. We can't simply use
+ * the same multi-statistic estimation logic for AND-clauses, at least not
+ * directly, because there are a few key differences:
+ *
+ * - functional dependencies don't really apply to OR-clauses
+ *
+ * - clauselist_selectivity() is based on decomposing the selectivity into
+ * a sequence of conditional probabilities (selectivities), but that can
+ * be done only for AND-clauses
+ *
+ * We might invent a similar infrastructure for optimizing OR-clauses, doing
+ * something similar to what clauselist_selectivity does for AND-clauses, but
+ * luckily we know that each disjunction (aka OR-clause)
+ *
+ * (a OR b OR c)
+ *
+ * may be rewritten as an equivalent conjunction (aka AND-clause)
+ * by using negation:
+ *
+ * NOT ((NOT a) AND (NOT b) AND (NOT c))
+ *
+ * And that's something we can pass to clauselist_selectivity and let it do
+ * all the heavy lifting.
+ */
+static Selectivity
+clauselist_selectivity_or(PlannerInfo *root,
+ List *clauses,
+ int varRelid,
+ JoinType jointype,
+ SpecialJoinInfo *sjinfo,
+ List *conditions)
+{
+ List *args = NIL;
+ ListCell *l;
+ Expr *expr;
+
+ /* build arguments for the AND-clause by negating args of the OR-clause */
+ foreach (l, clauses)
+ args = lappend(args, makeBoolExpr(NOT_EXPR, list_make1(lfirst(l)), -1));
+
+ /* and then build the AND-clause from the negated args */
+ expr = makeBoolExpr(AND_EXPR, args, -1);
+
+ /* instead of constructing NOT expression, just do (1.0 - s) */
+ return 1.0 - clauselist_selectivity(root, list_make1(expr), varRelid,
+ jointype, sjinfo, conditions);
+}
+
+/*
* addRangeClause --- add a new range clause for clauselist_selectivity
*
* Here is where we try to match up pairs of range-query clauses
@@ -648,7 +832,8 @@ clause_selectivity(PlannerInfo *root,
Node *clause,
int varRelid,
JoinType jointype,
- SpecialJoinInfo *sjinfo)
+ SpecialJoinInfo *sjinfo,
+ List *conditions)
{
Selectivity s1 = 0.5; /* default for any unhandled clause type */
RestrictInfo *rinfo = NULL;
@@ -768,7 +953,8 @@ clause_selectivity(PlannerInfo *root,
(Node *) get_notclausearg((Expr *) clause),
varRelid,
jointype,
- sjinfo);
+ sjinfo,
+ conditions);
}
else if (and_clause(clause))
{
@@ -777,29 +963,18 @@ clause_selectivity(PlannerInfo *root,
((BoolExpr *) clause)->args,
varRelid,
jointype,
- sjinfo);
+ sjinfo,
+ conditions);
}
else if (or_clause(clause))
{
- /*
- * Selectivities for an OR clause are computed as s1+s2 - s1*s2 to
- * account for the probable overlap of selected tuple sets.
- *
- * XXX is this too conservative?
- */
- ListCell *arg;
-
- s1 = 0.0;
- foreach(arg, ((BoolExpr *) clause)->args)
- {
- Selectivity s2 = clause_selectivity(root,
- (Node *) lfirst(arg),
- varRelid,
- jointype,
- sjinfo);
-
- s1 = s1 + s2 - s1 * s2;
- }
+ /* just call to clauselist_selectivity_or() */
+ s1 = clauselist_selectivity_or(root,
+ ((BoolExpr *) clause)->args,
+ varRelid,
+ jointype,
+ sjinfo,
+ conditions);
}
else if (is_opclause(clause) || IsA(clause, DistinctExpr))
{
@@ -889,7 +1064,8 @@ clause_selectivity(PlannerInfo *root,
(Node *) ((RelabelType *) clause)->arg,
varRelid,
jointype,
- sjinfo);
+ sjinfo,
+ conditions);
}
else if (IsA(clause, CoerceToDomain))
{
@@ -898,7 +1074,8 @@ clause_selectivity(PlannerInfo *root,
(Node *) ((CoerceToDomain *) clause)->arg,
varRelid,
jointype,
- sjinfo);
+ sjinfo,
+ conditions);
}
else
{
@@ -962,15 +1139,16 @@ clause_selectivity(PlannerInfo *root,
* in the MCV list, then the selectivity is below the lowest frequency
* found in the MCV list,
*
- * TODO When applying the clauses to the histogram/MCV list, we can do
- * that from the most selective clauses first, because that'll
- * eliminate the buckets/items sooner (so we'll be able to skip
- * them without inspection, which is more expensive). But this
- * requires really knowing the per-clause selectivities in advance,
- * and that's not what we do now.
+ * TODO When applying the clauses to the histogram/MCV list, we can do that from
+ * the most selective clauses first, because that'll eliminate the
+ * buckets/items sooner (so we'll be able to skip them without inspection,
+ * which is more expensive). But this requires really knowing the
+ * per-clause selectivities in advance, and that's not what we do now.
+ *
*/
static Selectivity
-clauselist_mv_selectivity(PlannerInfo *root, List *clauses, MVStatisticInfo *mvstats)
+clauselist_mv_selectivity(PlannerInfo *root, MVStatisticInfo *mvstats,
+ List *clauses, List *conditions, bool is_or)
{
bool fullmatch = false;
Selectivity s1 = 0.0, s2 = 0.0;
@@ -988,7 +1166,8 @@ clauselist_mv_selectivity(PlannerInfo *root, List *clauses, MVStatisticInfo *mvs
*/
/* Evaluate the MCV first. */
- s1 = clauselist_mv_selectivity_mcvlist(root, clauses, mvstats,
+ s1 = clauselist_mv_selectivity_mcvlist(root, mvstats,
+ clauses, conditions, is_or,
&fullmatch, &mcv_low);
/*
@@ -1001,7 +1180,8 @@ clauselist_mv_selectivity(PlannerInfo *root, List *clauses, MVStatisticInfo *mvs
/* FIXME if (fullmatch) without matching MCV item, use the mcv_low
* selectivity as upper bound */
- s2 = clauselist_mv_selectivity_histogram(root, clauses, mvstats);
+ s2 = clauselist_mv_selectivity_histogram(root, mvstats,
+ clauses, conditions, is_or);
/* TODO clamp to <= 1.0 (or more strictly, when possible) */
return s1 + s2;
@@ -1041,8 +1221,7 @@ collect_mv_attnums(List *clauses, Oid varRelid,
*/
if (bms_num_members(attnums) <= 1)
{
- if (attnums != NULL)
- pfree(attnums);
+ bms_free(attnums);
attnums = NULL;
*relid = InvalidOid;
}
@@ -1067,186 +1246,876 @@ count_mv_attnums(List *clauses, Oid varRelid, SpecialJoinInfo *sjinfo, int type)
return c;
}
+static List *
+clauses_matching_statistic(List **clauses, MVStatisticInfo *statistic,
+ int varRelid, SpecialJoinInfo *sjinfo, int types,
+ bool remove)
+{
+ int i;
+ Bitmapset *stat_attnums = NULL;
+ List *matching_clauses = NIL;
+ ListCell *lc;
+
+ /* build attnum bitmapset for this statistics */
+ for (i = 0; i < statistic->stakeys->dim1; i++)
+ stat_attnums = bms_add_member(stat_attnums,
+ statistic->stakeys->values[i]);
+
+ /*
+ * We can't use foreach here, because we may need to remove some of the
+ * clauses if (remove=true).
+ */
+ lc = list_head(*clauses);
+ while (lc)
+ {
+ Node *clause = (Node*)lfirst(lc);
+ Bitmapset *attnums = NULL;
+
+ /* must advance lc before list_delete possibly pfree's it */
+ lc = lnext(lc);
+
+ /*
+ * skip clauses that are not compatible with stats (just leave them
+ * in the original list)
+ *
+ * FIXME Perhaps this should check what stats are actually available in
+ * the statistics (not a big deal now, because MCV and histograms
+ * handle the same types of conditions).
+ */
+ if (! clause_is_mv_compatible(clause, varRelid, NULL, &attnums, sjinfo,
+ types))
+ {
+ bms_free(attnums);
+ continue;
+ }
+
+ /* if the clause is covered by the statistic, add it to the list */
+ if (bms_is_subset(attnums, stat_attnums))
+ {
+ matching_clauses = lappend(matching_clauses, clause);
+
+ /* if remove=true, remove the matching item from the main list */
+ if (remove)
+ *clauses = list_delete_ptr(*clauses, clause);
+ }
+
+ bms_free(attnums);
+ }
+
+ bms_free(stat_attnums);
+
+ return matching_clauses;
+}
+
/*
- * We're looking for statistics matching at least 2 attributes,
- * referenced in the clauses compatible with multivariate statistics.
- * The current selection criteria is very simple - we choose the
- * statistics referencing the most attributes.
+ * Selects the best combination of multivariate statistics, in an exhaustive
+ * way, where 'best' means:
*
- * If there are multiple statistics referencing the same number of
- * columns (from the clauses), the one with less source columns
- * (as listed in the ADD STATISTICS when creating the statistics) wins.
- * Other wise the first one wins.
+ * (a) covering the most attributes (referenced by clauses)
+ * (b) using the least number of multivariate stats
+ * (c) using the most conditions to exploit dependency
*
- * This is a very simple criteria, and has several weaknesses:
+ * Don't call this directly but through choose_mv_statistics(), which does some
+ * additional tricks to minimize the runtime.
*
- * (a) does not consider the accuracy of the statistics
*
- * If there are two histograms built on the same set of columns,
- * but one has 100 buckets and the other one has 1000 buckets (thus
- * likely providing better estimates), this is not currently
- * considered.
+ * Algorithm
+ * ---------
+ * The algorithm is a recursive implementation of backtracking, with maximum
+ * depth equal to the number of multi-variate statistics available on the table.
+ * It actually explores all valid combinations of stats.
+ *
+ * Whenever it considers adding the next statistics, the clauses it matches
+ * are divided into 'conditions' (clauses already matched by at least one
+ * previously applied statistics) and clauses that are yet to be estimated.
*
- * (b) does not consider the type of statistics
+ * Then several checks are performed:
*
- * If there are three statistics - one containing just a MCV list,
- * another one with just a histogram and a third one with both,
- * this is not considered.
+ * (a) The statistics covers at least 2 columns, referenced in the estimated
+ * clauses (otherwise multi-variate stats are useless).
*
- * (c) does not consider the number of clauses
+ * (b) The statistics covers at least 1 new column, i.e. a column not referenced
+ * by the already used stats (and the new column has to be referenced by
+ * the clauses, of course). Otherwise the statistics would not add any new
+ * information.
*
- * As explained, only the number of referenced attributes counts,
- * so if there are multiple clauses on a single attribute, this
- * still counts as a single attribute.
+ * There are some other sanity checks (e.g. stats must not be used twice etc.).
*
- * (d) does not consider type of condition
*
- * Some clauses may work better with some statistics - for example
- * equality clauses probably work better with MCV lists than with
- * histograms. But IS [NOT] NULL conditions may often work better
- * with histograms (thanks to NULL-buckets).
+ * Weaknesses
+ * ----------
+ * The current implementation uses a rather simple optimality criterion, so it
+ * may not make the best choice when
*
- * So for example with five WHERE conditions
+ * (a) There may be multiple solutions with the same number of covered
+ * attributes and number of statistics (e.g. the same solution but with
+ * statistics in a different order). It's unclear which solution is the best
+ * one - in a sense all of them are equal.
*
- * WHERE (a = 1) AND (b = 1) AND (c = 1) AND (d = 1) AND (e = 1)
+ * TODO It might be possible to compute estimate for each of those solutions,
+ * and then combine them to get the final estimate (e.g. by using average
+ * or median).
*
- * and statistics on (a,b), (a,b,e) and (a,b,c,d), the last one will be
- * selected as it references the most columns.
+ * (b) Does not consider that some types of stats are a better match for some
+ * types of clauses (e.g. MCV list is generally a better match for equality
+ * conditions than a histogram).
*
- * Once we have selected the multivariate statistics, we split the list
- * of clauses into two parts - conditions that are compatible with the
- * selected stats, and conditions are estimated using simple statistics.
+ * But maybe this is pointless - generally, each column is either a label
+ * (it's not important whether because of the data type or how it's used),
+ * or a value with ordering that makes sense. So either a MCV list is more
+ * appropriate (labels) or a histogram (values with orderings).
*
- * From the example above, conditions
+ * Not sure what to do with statistics on columns mixing both types of data
+ * (some columns would work best with MCVs, some with histograms). Maybe we
+ * could invent a new type of statistics combining MCV list and histogram
+ * (keeping a small histogram for each MCV item, and a separate histogram
+ * for values not on the MCV list).
*
- * (a = 1) AND (b = 1) AND (c = 1) AND (d = 1)
+ * TODO The algorithm should probably count number of Vars (not just attnums)
+ * when computing the 'score' of each solution. Computing the ratio of
+ * (num of all vars) / (num of condition vars) as a measure of how well
+ * the solution uses conditions might be useful.
+ */
+static void
+choose_mv_statistics_exhaustive(PlannerInfo *root, int step,
+ int nmvstats, MVStatisticInfo *mvstats, Bitmapset ** stats_attnums,
+ int nclauses, Node ** clauses, Bitmapset ** clauses_attnums,
+ int nconditions, Node ** conditions, Bitmapset ** conditions_attnums,
+ bool *cover_map, bool *condition_map, int *ruled_out,
+ mv_solution_t *current, mv_solution_t **best)
+{
+ int i, j;
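+
+ /*
+ * cover_map and condition_map are boolean matrices flattened into
+ * arrays - cover_map[i * nclauses + c] says whether statistics i
+ * covers clause c (condition_map is indexed the same way, with
+ * nconditions columns).
+ */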
+
+ Assert(best != NULL);
+ Assert((step == 0 && current == NULL) || (step > 0 && current != NULL));
+
+ /* this may run for a long time, so let's make it interruptible */
+ CHECK_FOR_INTERRUPTS();
+
+ if (current == NULL)
+ {
+ current = (mv_solution_t*)palloc0(sizeof(mv_solution_t));
+ current->stats = (int*)palloc0(sizeof(int)*nmvstats);
+ current->nstats = 0;
+ current->nclauses = 0;
+ current->nconditions = 0;
+ }
+
+ /*
+ * Now try to apply each statistics, matching at least two attributes,
+ * unless it's already used in one of the previous steps.
+ */
+ for (i = 0; i < nmvstats; i++)
+ {
+ int c;
+
+ int ncovered_clauses = 0; /* number of covered clauses */
+ int ncovered_conditions = 0; /* number of covered conditions */
+ int nattnums = 0; /* number of covered attributes */
+
+ Bitmapset *all_attnums = NULL;
+ Bitmapset *new_attnums = NULL;
+
+ /* skip statistics that were already used or eliminated */
+ if (ruled_out[i] != -1)
+ continue;
+
+ /*
+ * See if we have clauses covered by this statistics, but not
+ * yet covered by any of the preceding ones.
+ */
+ for (c = 0; c < nclauses; c++)
+ {
+ bool covered = false;
+ Bitmapset *clause_attnums = clauses_attnums[c];
+ Bitmapset *tmp = NULL;
+
+ /*
+ * If this clause is not covered by this stats, we can't
+ * use the stats to estimate that at all.
+ */
+ if (! cover_map[i * nclauses + c])
+ continue;
+
+ /*
+ * Now we know we'll use this clause - either as a condition
+ * or as a new clause (the estimated one). So let's add the
+ * attributes to the attnums from all the clauses usable with
+ * this statistics.
+ */
+ tmp = bms_union(all_attnums, clause_attnums);
+
+ /* free the old bitmap */
+ bms_free(all_attnums);
+ all_attnums = tmp;
+
+ /* let's see if it's covered by any of the previous stats */
+ for (j = 0; j < step; j++)
+ {
+ /* already covered by the previous stats */
+ if (cover_map[current->stats[j] * nclauses + c])
+ covered = true;
+
+ if (covered)
+ break;
+ }
+
+ /* if already covered, continue with the next clause */
+ if (covered)
+ {
+ ncovered_conditions += 1;
+ continue;
+ }
+
+ /*
+ * OK, this clause is covered by this statistics (and not by
+ * any of the previous ones)
+ */
+ ncovered_clauses += 1;
+
+ /* add the attnums into attnums from 'new clauses' */
+ // new_attnums = bms_union(new_attnums, clause_attnums);
+ }
+
+ /* can't have more new clauses than original clauses */
+ Assert(nclauses >= ncovered_clauses);
+ Assert(ncovered_clauses >= 0); /* mostly paranoia */
+
+ nattnums = bms_num_members(all_attnums);
+
+ /* free all the bitmapsets - we don't need them anymore */
+ bms_free(all_attnums);
+ bms_free(new_attnums);
+
+ all_attnums = NULL;
+ new_attnums = NULL;
+
+ /*
+ * See which of the conditions are covered by this statistics
+ * (each covered one counts towards the conditions reused here).
+ */
+ for (c = 0; c < nconditions; c++)
+ {
+ Bitmapset *clause_attnums = conditions_attnums[c];
+ Bitmapset *tmp = NULL;
+
+ /*
+ * If this condition is not covered by this statistics, we can't
+ * use the statistics to estimate it at all.
+ */
+ if (! condition_map[i * nconditions + c])
+ continue;
+
+ /* count this as a condition */
+ ncovered_conditions += 1;
+
+ /*
+ * Now we know we'll use this condition, so let's add its
+ * attributes to the attnums from all the clauses usable with
+ * this statistics.
+ */
+ tmp = bms_union(all_attnums, clause_attnums);
+
+ /* free the old bitmap */
+ bms_free(all_attnums);
+ all_attnums = tmp;
+ }
+
+ /*
+ * Let's mark the statistics as 'ruled out' - either we'll use
+ * it (and proceed to the next step), or it's incompatible.
+ */
+ ruled_out[i] = step;
+
+ /*
+ * There are no clauses usable with this statistics (not already
+ * covered by some of the previous stats).
+ *
+ * Similarly, if the clauses only use a single attribute, we
+ * can't really use that.
+ */
+ if ((ncovered_clauses == 0) || (nattnums < 2))
+ continue;
+
+ /*
+ * TODO Not sure if it's possible to add a clause referencing
+ * only attributes already covered by previous stats?
+ * Introducing only some new dependency, not a new
+ * attribute. Couldn't come up with an example, though.
+ * Might be worth adding some assert.
+ */
+
+ /*
+ * got a suitable statistics - let's update the current solution,
+ * maybe use it as the best solution
+ */
+ current->nclauses += ncovered_clauses;
+ current->nconditions += ncovered_conditions;
+ current->nstats += 1;
+ current->stats[step] = i;
+
+ /*
+ * We can never cover more clauses, or use more stats, than we
+ * actually have at the beginning.
+ */
+ Assert(nclauses >= current->nclauses);
+ Assert(nmvstats >= current->nstats);
+ Assert(step < nmvstats);
+
+ /* we can't get more conditions than clauses and conditions combined
+ *
+ * FIXME This assert does not work because we count the conditions
+ * repeatedly (once for each statistics covering it).
+ */
+ /* Assert((nconditions + nclauses) >= current->nconditions); */
+
+ if (*best == NULL)
+ {
+ *best = (mv_solution_t*)palloc0(sizeof(mv_solution_t));
+ (*best)->stats = (int*)palloc0(sizeof(int)*nmvstats);
+ (*best)->nstats = 0;
+ (*best)->nclauses = 0;
+ (*best)->nconditions = 0;
+ }
+
+ /* see if it's better than the current 'best' solution */
+ if ((current->nclauses > (*best)->nclauses) ||
+ ((current->nclauses == (*best)->nclauses) &&
+ ((current->nstats > (*best)->nstats))))
+ {
+ (*best)->nstats = current->nstats;
+ (*best)->nclauses = current->nclauses;
+ (*best)->nconditions = current->nconditions;
+ memcpy((*best)->stats, current->stats, nmvstats * sizeof(int));
+ }
+
+ /*
+ * The recursion only makes sense if we haven't covered all the
+ * attributes (then adding stats is not really possible).
+ */
+ if ((step + 1) < nmvstats)
+ choose_mv_statistics_exhaustive(root, step+1,
+ nmvstats, mvstats, stats_attnums,
+ nclauses, clauses, clauses_attnums,
+ nconditions, conditions, conditions_attnums,
+ cover_map, condition_map, ruled_out,
+ current, best);
+
+ /* reset the last step */
+ current->nclauses -= ncovered_clauses;
+ current->nconditions -= ncovered_conditions;
+ current->nstats -= 1;
+ current->stats[step] = 0;
+
+ /* mark the statistics as usable again */
+ ruled_out[i] = -1;
+
+ Assert(current->nclauses >= 0);
+ Assert(current->nstats >= 0);
+ }
+
+ /* reset all statistics as 'incompatible' in this step */
+ for (i = 0; i < nmvstats; i++)
+ if (ruled_out[i] == step)
+ ruled_out[i] = -1;
+
+}
+
+/*
+ * Greedy search for a multivariate solution - a sequence of statistics covering
+ * the clauses. This chooses the "best" statistics at each step, so the
+ * resulting solution may not be the best solution globally, but this produces
+ * the solution in only N steps (where N is the number of statistics), while
+ * the exhaustive approach may have to walk through ~N! combinations (although
+ * some of those are terminated early).
+ *
+ * See the comments at choose_mv_statistics_exhaustive() as this does the same
+ * thing (but in a different way).
*
- * will be estimated using the multivariate statistics (a,b,c,d) while
- * the last condition (e = 1) will get estimated using the regular ones.
+ * Don't call this directly, but through choose_mv_statistics().
*
- * There are various alternative selection criteria (e.g. counting
- * conditions instead of just referenced attributes), but eventually
- * the best option should be to combine multiple statistics. But that's
- * much harder to do correctly.
+ * TODO There are probably other metrics we might use - e.g. using number of
+ * columns (num_cond_columns / num_cov_columns), which might work better
+ * with a mix of simple and complex clauses.
*
- * TODO Select multiple statistics and combine them when computing
- * the estimate.
+ * TODO Also the choice at the very first step should be handled in a special
+ * way, because there will be 0 conditions at that moment, so there needs
+ * to be some other criterion - e.g. using the simplest (or most complex?)
+ * clause might be a good idea.
*
- * TODO This will probably have to consider compatibility of clauses,
- * because 'dependencies' will probably work only with equality
- * clauses.
+ * TODO We might also select multiple stats using different criteria, and branch
+ * the search. This is however tricky, because if we choose k statistics at
+ * each step, we get k^N branches to walk through (with N steps). That's
+ * not really good with a large number of stats (yet better than exhaustive
+ * search).
*/
-static MVStatisticInfo *
-choose_mv_statistics(List *stats, Bitmapset *attnums)
+static void
+choose_mv_statistics_greedy(PlannerInfo *root, int step,
+ int nmvstats, MVStatisticInfo *mvstats, Bitmapset ** stats_attnums,
+ int nclauses, Node ** clauses, Bitmapset ** clauses_attnums,
+ int nconditions, Node ** conditions, Bitmapset ** conditions_attnums,
+ bool *cover_map, bool *condition_map, int *ruled_out,
+ mv_solution_t *current, mv_solution_t **best)
{
- int i;
- ListCell *lc;
+ int i, j;
+ int best_stat = -1;
+ double gain, max_gain = -1.0;
- MVStatisticInfo *choice = NULL;
+ /*
+ * Bitmap tracking which clauses are already covered (by the previous
+ * statistics) and may thus serve only as a condition in this step.
+ */
+ bool *covered_clauses = (bool*)palloc0(nclauses * sizeof(bool));
- int current_matches = 1; /* goal #1: maximize */
- int current_dims = (MVSTATS_MAX_DIMENSIONS+1); /* goal #2: minimize */
+ /*
+ * Number of clauses and columns covered by each statistics - this
+ * includes both conditions and clauses covered by the statistics for
+ * the first time. The number of columns may count some columns
+ * repeatedly - if a column is shared by multiple clauses, it will
+ * be counted once for each clause (covered by the statistics).
+ * So with two clauses [(a=1 OR b=2),(a<2 OR c>1)] the column "a"
+ * will be counted twice (if both clauses are covered).
+ *
+ * The values for ruled-out statistics (that can't be applied) are
+ * not computed, because that'd be pointless.
+ */
+ int *num_cov_clauses = (int*)palloc0(sizeof(int) * nmvstats);
+ int *num_cov_columns = (int*)palloc0(sizeof(int) * nmvstats);
/*
- * Walk through the statistics (simple array with nmvstats elements)
- * and for each one count the referenced attributes (encoded in
- * the 'attnums' bitmap).
+ * Same as above, but this only includes clauses that are already
+ * covered by the previous stats (and the current one).
*/
- foreach (lc, stats)
+ int *num_cond_clauses = (int*)palloc0(sizeof(int) * nmvstats);
+ int *num_cond_columns = (int*)palloc0(sizeof(int) * nmvstats);
+
+ /*
+ * Number of attributes for each clause.
+ *
+ * TODO Might be computed in choose_mv_statistics() and then passed
+ * here, but then the function would not have the same signature
+ * as _exhaustive().
+ */
+ int *attnum_counts = (int*)palloc0(sizeof(int) * nclauses);
+ int *attnum_cond_counts = (int*)palloc0(sizeof(int) * nconditions);
+
+ CHECK_FOR_INTERRUPTS();
+
+ Assert(best != NULL);
+ Assert((step == 0 && current == NULL) || (step > 0 && current != NULL));
+
+ /* compute attributes (columns) for each clause */
+ for (i = 0; i < nclauses; i++)
+ attnum_counts[i] = bms_num_members(clauses_attnums[i]);
+
+ /* compute attributes (columns) for each condition */
+ for (i = 0; i < nconditions; i++)
+ attnum_cond_counts[i] = bms_num_members(conditions_attnums[i]);
+
+ /* see which clauses are already covered at this point (by previous stats) */
+ for (i = 0; i < step; i++)
+ for (j = 0; j < nclauses; j++)
+ covered_clauses[j] |= (cover_map[current->stats[i] * nclauses + j]);
+
+ /* which remaining statistics covers most clauses / uses most conditions? */
+ for (i = 0; i < nmvstats; i++)
{
- MVStatisticInfo *info = (MVStatisticInfo *)lfirst(lc);
+ Bitmapset *attnums_covered = NULL;
+ Bitmapset *attnums_conditions = NULL;
+
+ /* skip stats that are already ruled out (either used or inapplicable) */
+ if (ruled_out[i] != -1)
+ continue;
+
+ /* count covered clauses and conditions (for the statistics) */
+ for (j = 0; j < nclauses; j++)
+ {
+ if (cover_map[i * nclauses + j])
+ {
+ Bitmapset *attnums_new
+ = bms_union(attnums_covered, clauses_attnums[j]);
- /* columns matching this statistics */
- int matches = 0;
+ /* get rid of the old bitmap and keep the unified result */
+ bms_free(attnums_covered);
+ attnums_covered = attnums_new;
- int2vector * attrs = info->stakeys;
- int numattrs = attrs->dim1;
+ num_cov_clauses[i] += 1;
+ num_cov_columns[i] += attnum_counts[j];
- /* skip dependencies-only stats */
- if (! (info->mcv_built || info->hist_built))
+ /* is the clause already covered (i.e. a condition)? */
+ if (covered_clauses[j])
+ {
+ num_cond_clauses[i] += 1;
+ num_cond_columns[i] += attnum_counts[j];
+ attnums_new = bms_union(attnums_conditions,
+ clauses_attnums[j]);
+
+ bms_free(attnums_conditions);
+ attnums_conditions = attnums_new;
+ }
+ }
+ }
+
+ /* if all covered clauses are covered by prev stats (thus conditions) */
+ if (num_cov_clauses[i] == num_cond_clauses[i])
+ ruled_out[i] = step;
+
+ /* same if there are no new attributes */
+ else if (bms_num_members(attnums_conditions) == bms_num_members(attnums_covered))
+ ruled_out[i] = step;
+
+ bms_free(attnums_covered);
+ bms_free(attnums_conditions);
+
+ /* if the statistics is inapplicable, try the next one */
+ if (ruled_out[i] != -1)
continue;
- /* count columns covered by the histogram */
- for (i = 0; i < numattrs; i++)
- if (bms_is_member(attrs->values[i], attnums))
- matches++;
+ /* now let's walk through conditions and count the covered */
+ for (j = 0; j < nconditions; j++)
+ {
+ if (condition_map[i * nconditions + j])
+ {
+ num_cond_clauses[i] += 1;
+ num_cond_columns[i] += attnum_cond_counts[j];
+ }
+ }
+
+ /*
+ * See if this statistics improves the metric we optimize: the ratio
+ * of condition columns (from clauses covered earlier and from the
+ * explicit conditions) to the columns of all clauses covered by
+ * this statistics - i.e. prefer stats reusing many conditions.
+ */
+ gain = num_cond_columns[i] / (double)num_cov_columns[i];
- /*
- * Use this statistics when it improves the number of matches or
- * when it matches the same number of attributes but is smaller.
- */
- if ((matches > current_matches) ||
- ((matches == current_matches) && (current_dims > numattrs)))
+ if (gain > max_gain)
{
- choice = info;
- current_matches = matches;
- current_dims = numattrs;
+ max_gain = gain;
+ best_stat = i;
}
}
- return choice;
-}
+ /*
+ * Have we found a suitable statistics? Add it to the solution and
+ * try next step.
+ */
+ if (best_stat != -1)
+ {
+ /* mark the statistics, so that we skip it in next steps */
+ ruled_out[best_stat] = step;
+ /* allocate current solution if necessary */
+ if (current == NULL)
+ {
+ current = (mv_solution_t*)palloc0(sizeof(mv_solution_t));
+ current->stats = (int*)palloc0(sizeof(int)*nmvstats);
+ current->nstats = 0;
+ current->nclauses = 0;
+ current->nconditions = 0;
+ }
+
+ current->nclauses += num_cov_clauses[best_stat];
+ current->nconditions += num_cond_clauses[best_stat];
+ current->stats[step] = best_stat;
+ current->nstats++;
+
+ if (*best == NULL)
+ {
+ (*best) = (mv_solution_t*)palloc0(sizeof(mv_solution_t));
+ (*best)->nstats = current->nstats;
+ (*best)->nclauses = current->nclauses;
+ (*best)->nconditions = current->nconditions;
+
+ (*best)->stats = (int*)palloc0(sizeof(int)*nmvstats);
+ memcpy((*best)->stats, current->stats, nmvstats * sizeof(int));
+ }
+ else
+ {
+ /* see if this is a better solution */
+ double current_gain = (double)current->nconditions / current->nclauses;
+ double best_gain = (double)(*best)->nconditions / (*best)->nclauses;
+
+ if ((current_gain > best_gain) ||
+ ((current_gain == best_gain) && (current->nstats < (*best)->nstats)))
+ {
+ (*best)->nstats = current->nstats;
+ (*best)->nclauses = current->nclauses;
+ (*best)->nconditions = current->nconditions;
+ memcpy((*best)->stats, current->stats, nmvstats * sizeof(int));
+ }
+ }
+
+ /*
+ * The recursion only makes sense if we haven't covered all the
+ * attributes (then adding stats is not really possible).
+ */
+ if ((step + 1) < nmvstats)
+ choose_mv_statistics_greedy(root, step+1,
+ nmvstats, mvstats, stats_attnums,
+ nclauses, clauses, clauses_attnums,
+ nconditions, conditions, conditions_attnums,
+ cover_map, condition_map, ruled_out,
+ current, best);
+
+ /* reset the last step */
+ current->nclauses -= num_cov_clauses[best_stat];
+ current->nconditions -= num_cond_clauses[best_stat];
+ current->nstats -= 1;
+ current->stats[step] = 0;
+
+ /* mark the statistics as usable again */
+ ruled_out[best_stat] = -1;
+ }
+
+ /* reset all statistics eliminated in this step */
+ for (i = 0; i < nmvstats; i++)
+ if (ruled_out[i] == step)
+ ruled_out[i] = -1;
+
+ /* free everything allocated in this step */
+ pfree(covered_clauses);
+ pfree(attnum_counts);
+ pfree(attnum_cond_counts);
+ pfree(num_cov_clauses);
+ pfree(num_cov_columns);
+ pfree(num_cond_clauses);
+ pfree(num_cond_columns);
+}
/*
- * This splits the clauses list into two parts - one containing clauses
- * that will be evaluated using the chosen statistics, and the remaining
- * clauses (either non-mvcompatible, or not related to the histogram).
+ * Chooses the combination of statistics optimal for estimating a particular
+ * clause list.
+ *
+ * This only handles a 'preparation' shared by the exhaustive and greedy
+ * implementations (see the previous methods), mostly trying to reduce the size
+ * of the problem (eliminate clauses/statistics that can't be really used in
+ * the solution).
+ *
+ * It also precomputes bitmaps for attributes covered by clauses and statistics,
+ * so that we don't need to do that over and over in the actual optimizations
+ * (as it's both CPU and memory intensive).
+ *
+ * TODO This will probably have to consider compatibility of clauses, because
+ * 'dependencies' will probably work only with equality clauses.
+ *
+ * TODO Another way to make the optimization problems smaller might be splitting
+ * the statistics into several disjoint subsets, i.e. if we can split the
+ * graph of statistics (after the elimination) into multiple components
+ * (so that stats in different components share no attributes), we can do
+ * the optimization for each component separately.
+ *
+ * TODO If we could compute what is a "perfect solution" maybe we could
+ * terminate the search after reaching ~90% of it? Say, if we knew that we
+ * can cover 10 clauses and reuse 8 dependencies, maybe covering 9 clauses
+ * and 7 dependencies would be OK?
*/
-static List *
-clauselist_mv_split(PlannerInfo *root, SpecialJoinInfo *sjinfo,
- List *clauses, Oid varRelid, List **mvclauses,
- MVStatisticInfo *mvstats, int types)
+static List*
+choose_mv_statistics(PlannerInfo *root, List *stats,
+ List *clauses, List *conditions,
+ Oid varRelid, SpecialJoinInfo *sjinfo)
{
int i;
- ListCell *l;
- List *non_mvclauses = NIL;
+ mv_solution_t *best = NULL;
+ List *result = NIL;
+
+ int nmvstats;
+ MVStatisticInfo *mvstats;
+
+ /* we only work with MCV lists and histograms here */
+ int type = (MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST);
+
+ bool *clause_cover_map = NULL,
+ *condition_cover_map = NULL;
+ int *ruled_out = NULL;
+
+ /* build bitmapsets for all stats and clauses */
+ Bitmapset **stats_attnums;
+ Bitmapset **clauses_attnums;
+ Bitmapset **conditions_attnums;
+
+ int nclauses, nconditions;
+ Node ** clauses_array;
+ Node ** conditions_array;
+
+ /* copy lists, so that we can free them during elimination easily */
+ clauses = list_copy(clauses);
+ conditions = list_copy(conditions);
+ stats = list_copy(stats);
+
+ /*
+ * Reduce the optimization problem size as much as possible.
+ *
+ * Eliminate clauses and conditions not covered by any statistics,
+ * or statistics not matching at least two attributes (one of them
+ * has to be in a regular clause).
+ *
+ * It's possible that removing a statistics in one iteration
+ * eliminates a clause in the next one, so we'll repeat this until an
+ * iteration eliminates no clauses/stats.
+ *
+ * This can only happen after eliminating a statistics - clauses are
+ * eliminated first, so statistics always reflect that.
+ */
+ while (true)
+ {
+ List *tmp;
+
+ Bitmapset *compatible_attnums = NULL;
+ Bitmapset *condition_attnums = NULL;
+ Bitmapset *all_attnums = NULL;
+
+ /*
+ * Clauses
+ *
+ * Walk through clauses and keep only those covered by at least
+ * one of the statistics we still have. We'll also keep info
+ * about attnums in clauses (without conditions) so that we can
+ * ignore stats covering just conditions (which is pointless).
+ */
+ tmp = filter_clauses(root, varRelid, sjinfo, type,
+ stats, clauses, &compatible_attnums);
+
+ /* discard the original list */
+ list_free(clauses);
+ clauses = tmp;
+
+ /*
+ * Conditions
+ *
+ * Walk through conditions and keep only those covered by at least
+ * one of the statistics we still have. Also, collect bitmap of
+ * attributes so that we can make sure we add at least one new
+ * attribute (by comparing with clauses).
+ */
+ if (conditions != NIL)
+ {
+ tmp = filter_clauses(root, varRelid, sjinfo, type,
+ stats, conditions, &condition_attnums);
+
+ /* discard the original list */
+ list_free(conditions);
+ conditions = tmp;
+ }
+
+ /* get a union of attnums (from conditions and new clauses) */
+ all_attnums = bms_union(compatible_attnums, condition_attnums);
+
+ /*
+ * Statistics
+ *
+ * Walk through statistics and only keep those covering at least
+ * one new attribute (excluding conditions) and at least two attributes
+ * in both clauses and conditions.
+ */
+ tmp = filter_stats(stats, compatible_attnums, all_attnums);
- /* FIXME is there a better way to get info on int2vector? */
- int2vector * attrs = mvstats->stakeys;
- int numattrs = mvstats->stakeys->dim1;
+ /* if we've not eliminated anything, terminate */
+ if (list_length(stats) == list_length(tmp))
+ break;
- Bitmapset *mvattnums = NULL;
+ /* work only with filtered statistics from now */
+ list_free(stats);
+ stats = tmp;
+ }
- /* build bitmap of attributes covered by the stats, so we can
- * do bms_is_subset later */
- for (i = 0; i < numattrs; i++)
- mvattnums = bms_add_member(mvattnums, attrs->values[i]);
+ /* only do the optimization if we have clauses/statistics */
+ if ((list_length(stats) == 0) || (list_length(clauses) == 0))
+ return NULL;
- /* erase the list of mv-compatible clauses */
- *mvclauses = NIL;
+ /* remove redundant stats (stats covered by another stats) */
+ stats = filter_redundant_stats(stats, clauses, conditions);
- foreach (l, clauses)
- {
- bool match = false; /* by default not mv-compatible */
- Bitmapset *attnums = NULL;
- Node *clause = (Node *) lfirst(l);
+ /*
+ * TODO We should sort the stats to make the order deterministic,
+ * otherwise we may get different estimates on different
+ * executions - if there are multiple "equally good" solutions,
+ * we'll keep the first solution we see.
+ *
+ * Sorting by OID probably is not the right solution though,
+ * because we'd like it to be somehow reproducible,
+ * irrespective of the order of ADD STATISTICS commands.
+ * So maybe statkeys?
+ */
+ mvstats = make_stats_array(stats, &nmvstats);
+ stats_attnums = make_stats_attnums(mvstats, nmvstats);
- if (clause_is_mv_compatible(clause, varRelid, NULL,
- &attnums, sjinfo, types))
- {
- /* are all the attributes part of the selected stats? */
- if (bms_is_subset(attnums, mvattnums))
- match = true;
- }
+ /* collect clauses and bitmaps of attnums */
+ clauses_array = make_clauses_array(clauses, &nclauses);
+ clauses_attnums = make_clauses_attnums(root, varRelid, sjinfo, type,
+ clauses_array, nclauses);
- /*
- * The clause matches the selected stats, so put it to the list
- * of mv-compatible clauses. Otherwise, keep it in the list of
- * 'regular' clauses (that may be selected later).
- */
- if (match)
- *mvclauses = lappend(*mvclauses, clause);
- else
- non_mvclauses = lappend(non_mvclauses, clause);
- }
+ /* collect conditions and bitmaps of attnums */
+ conditions_array = make_clauses_array(conditions, &nconditions);
+ conditions_attnums = make_clauses_attnums(root, varRelid, sjinfo, type,
+ conditions_array, nconditions);
/*
- * Perform regular estimation using the clauses incompatible
- * with the chosen histogram (or MV stats in general).
+ * Build bitmaps with info about which clauses/conditions are
+ * covered by each statistics (so that we don't need to call the
+ * bms_is_subset over and over again).
*/
- return non_mvclauses;
+ clause_cover_map = make_cover_map(stats_attnums, nmvstats,
+ clauses_attnums, nclauses);
+
+ condition_cover_map = make_cover_map(stats_attnums, nmvstats,
+ conditions_attnums, nconditions);
+
+ ruled_out = (int*)palloc0(nmvstats * sizeof(int));
+
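+ /*
+ * ruled_out[i] stays -1 while statistics i is still available; the search
+ * functions set it to the step at which the statistics got used or
+ * eliminated, and reset it back to -1 when backtracking.
+ */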
+ /* no stats are ruled out by default */
+ for (i = 0; i < nmvstats; i++)
+ ruled_out[i] = -1;
+
+ /* do the optimization itself */
+ if (mvstat_search_type == MVSTAT_SEARCH_EXHAUSTIVE)
+ choose_mv_statistics_exhaustive(root, 0,
+ nmvstats, mvstats, stats_attnums,
+ nclauses, clauses_array, clauses_attnums,
+ nconditions, conditions_array, conditions_attnums,
+ clause_cover_map, condition_cover_map,
+ ruled_out, NULL, &best);
+ else
+ choose_mv_statistics_greedy(root, 0,
+ nmvstats, mvstats, stats_attnums,
+ nclauses, clauses_array, clauses_attnums,
+ nconditions, conditions_array, conditions_attnums,
+ clause_cover_map, condition_cover_map,
+ ruled_out, NULL, &best);
+
+ /* create a list of statistics from the array */
+ if (best != NULL)
+ {
+ for (i = 0; i < best->nstats; i++)
+ {
+ MVStatisticInfo *info = makeNode(MVStatisticInfo);
+ memcpy(info, &mvstats[best->stats[i]], sizeof(MVStatisticInfo));
+ result = lappend(result, info);
+ }
+ pfree(best);
+ }
+
+ /* cleanup (maybe leave it up to the memory context?) */
+ for (i = 0; i < nmvstats; i++)
+ bms_free(stats_attnums[i]);
+
+ for (i = 0; i < nclauses; i++)
+ bms_free(clauses_attnums[i]);
+
+ for (i = 0; i < nconditions; i++)
+ bms_free(conditions_attnums[i]);
+
+ pfree(stats_attnums);
+ pfree(clauses_attnums);
+ pfree(conditions_attnums);
+
+ pfree(clauses_array);
+ pfree(conditions_array);
+ pfree(clause_cover_map);
+ pfree(condition_cover_map);
+ pfree(ruled_out);
+ pfree(mvstats);
+ list_free(clauses);
+ list_free(conditions);
+ list_free(stats);
+
+ return result;
}
/*
@@ -1421,10 +2290,10 @@ clause_is_mv_compatible(Node *clause, Oid varRelid,
return true;
}
- else if (or_clause(clause) || and_clause(clause))
+ else if (or_clause(clause) || and_clause(clause) || not_clause(clause))
{
/*
- * AND/OR-clauses are supported if all sub-clauses are supported
+ * AND/OR/NOT-clauses are supported if all sub-clauses are supported
*
* TODO We might support mixed case, where some of the clauses
* are supported and some are not, and treat all supported
@@ -1434,7 +2303,10 @@ clause_is_mv_compatible(Node *clause, Oid varRelid,
*
* TODO For RestrictInfo above an OR-clause, we might use the
* orclause with nested RestrictInfo - we won't have to
- * call pull_varnos() for each clause, saving time.
+ * call pull_varnos() for each clause, saving time.
+ *
+ * TODO Perhaps this needs a bit more thought for functional
+ * dependencies? Those don't quite work for NOT cases.
*/
Bitmapset *tmp = NULL;
ListCell *l;
@@ -1454,6 +2326,7 @@ clause_is_mv_compatible(Node *clause, Oid varRelid,
return false;
}
+
/*
* reduce list of equality clauses using soft functional dependencies
*
@@ -2079,22 +2952,26 @@ get_varattnos(Node * node, Index relid)
* as the clauses are processed (and skip items that are 'match').
*/
static Selectivity
-clauselist_mv_selectivity_mcvlist(PlannerInfo *root, List *clauses,
- MVStatisticInfo *mvstats, bool *fullmatch,
- Selectivity *lowsel)
+clauselist_mv_selectivity_mcvlist(PlannerInfo *root, MVStatisticInfo *mvstats,
+ List *clauses, List *conditions, bool is_or,
+ bool *fullmatch, Selectivity *lowsel)
{
int i;
Selectivity s = 0.0;
+ Selectivity t = 0.0;
Selectivity u = 0.0;
MCVList mcvlist = NULL;
+
int nmatches = 0;
+ int nconditions = 0;
/* match/mismatch bitmap for each MCV item */
char * matches = NULL;
+ char * condition_matches = NULL;
Assert(clauses != NIL);
- Assert(list_length(clauses) >= 2);
+ Assert(list_length(clauses) >= 1);
/* there's no MCV list built yet */
if (! mvstats->mcv_built)
@@ -2105,32 +2982,85 @@ clauselist_mv_selectivity_mcvlist(PlannerInfo *root, List *clauses,
Assert(mcvlist != NULL);
Assert(mcvlist->nitems > 0);
- /* by default all the MCV items match the clauses fully */
- matches = palloc0(sizeof(char) * mcvlist->nitems);
- memset(matches, MVSTATS_MATCH_FULL, sizeof(char)*mcvlist->nitems);
-
/* number of matching MCV items */
nmatches = mcvlist->nitems;
+ nconditions = mcvlist->nitems;
+
+ /*
+ * Bitmap of MCV item matches (mismatch, partial, full).
+ *
+ * For AND clauses all items match initially (and we'll eliminate them).
+ * For OR clauses no items match initially (and we'll add them).
+ *
+ * We only need to do the memset for AND clauses (for OR clauses
+ * it's already set correctly by the palloc0).
+ */
+ matches = palloc0(sizeof(char) * nmatches);
+
+ if (! is_or) /* AND-clause */
+ memset(matches, MVSTATS_MATCH_FULL, sizeof(char)*nmatches);
+
+ /* Conditions are treated as an AND clause, so they match by default. */
+ condition_matches = palloc0(sizeof(char) * nconditions);
+ memset(condition_matches, MVSTATS_MATCH_FULL, sizeof(char)*nconditions);
+
+ /*
+ * build the match bitmap for the conditions (conditions are always
+ * connected by AND)
+ */
+ if (conditions != NIL)
+ nconditions = update_match_bitmap_mcvlist(root, conditions,
+ mvstats->stakeys, mcvlist,
+ nconditions, condition_matches,
+ lowsel, fullmatch, false);
+ /*
+ * build the match bitmap for the estimated clauses
+ *
+ * TODO This evaluates the clauses for all MCV items, even those
+ * ruled out by the conditions. The final result should be the
+ * same, but it might be faster.
+ */
nmatches = update_match_bitmap_mcvlist(root, clauses,
mvstats->stakeys, mcvlist,
- nmatches, matches,
- lowsel, fullmatch, false);
+ ((is_or) ? 0 : nmatches), matches,
+ lowsel, fullmatch, is_or);
/* sum frequencies for all the matching MCV items */
for (i = 0; i < mcvlist->nitems; i++)
{
- /* used to 'scale' for MCV lists not covering all tuples */
+ /*
+ * Find out what part of the data is covered by the MCV list,
+ * so that we can 'scale' the selectivity properly (e.g. when
+ * only 50% of the sample items got into the MCV, and the rest
+ * is either in a histogram, or not covered by stats).
+ *
+ * TODO This might be handled by keeping a global "frequency"
+ * for the whole list, which might save us a bit of time
+ * spent on accessing the not-matching part of the MCV list.
+ * Although it's likely in a cache, so it's very fast.
+ */
u += mcvlist->items[i]->frequency;
+ /* skip MCV items not matching the conditions */
+ if (condition_matches[i] == MVSTATS_MATCH_NONE)
+ continue;
+
if (matches[i] != MVSTATS_MATCH_NONE)
s += mcvlist->items[i]->frequency;
+
+ t += mcvlist->items[i]->frequency;
}
pfree(matches);
+ pfree(condition_matches);
pfree(mcvlist);
- return s*u;
+ /* no condition matches */
+ if (t == 0.0)
+ return (Selectivity)0.0;
+
+ return (s / t) * u;
}
/*
@@ -2423,64 +3353,57 @@ update_match_bitmap_mcvlist(PlannerInfo *root, List *clauses,
}
}
}
- else if (or_clause(clause) || and_clause(clause))
+ else if (or_clause(clause) || and_clause(clause) || not_clause(clause))
{
/* AND/OR clause, with all clauses compatible with the selected MV stat */
int i;
- BoolExpr *orclause = ((BoolExpr*)clause);
- List *orclauses = orclause->args;
+ List *tmp_clauses = ((BoolExpr*)clause)->args;
/* match/mismatch bitmap for each MCV item */
- int or_nmatches = 0;
- char * or_matches = NULL;
+ int tmp_nmatches = 0;
+ char * tmp_matches = NULL;
- Assert(orclauses != NIL);
- Assert(list_length(orclauses) >= 2);
+ Assert(tmp_clauses != NIL);
+ Assert((list_length(tmp_clauses) >= 2) || (not_clause(clause) && (list_length(tmp_clauses)==1)));
/* number of matching MCV items */
- or_nmatches = mcvlist->nitems;
+ tmp_nmatches = (or_clause(clause)) ? 0 : mcvlist->nitems;
/* by default none of the MCV items matches the clauses */
- or_matches = palloc0(sizeof(char) * or_nmatches);
+ tmp_matches = palloc0(sizeof(char) * mcvlist->nitems);
- if (or_clause(clause))
- {
- /* OR clauses assume nothing matches, initially */
- memset(or_matches, MVSTATS_MATCH_NONE, sizeof(char)*or_nmatches);
- or_nmatches = 0;
- }
- else
- {
- /* AND clauses assume nothing matches, initially */
- memset(or_matches, MVSTATS_MATCH_FULL, sizeof(char)*or_nmatches);
- }
+ /* AND (and NOT) clauses assume everything matches, initially */
+ if (! or_clause(clause))
+ memset(tmp_matches, MVSTATS_MATCH_FULL, sizeof(char)*mcvlist->nitems);
/* build the match bitmap for the OR-clauses */
- or_nmatches = update_match_bitmap_mcvlist(root, orclauses,
+ tmp_nmatches = update_match_bitmap_mcvlist(root, tmp_clauses,
stakeys, mcvlist,
- or_nmatches, or_matches,
+ tmp_nmatches, tmp_matches,
lowsel, fullmatch, or_clause(clause));
/* merge the bitmap into the existing one*/
for (i = 0; i < mcvlist->nitems; i++)
{
+ /* if this is a NOT clause, we need to invert the results first */
+ if (not_clause(clause))
+ tmp_matches[i] = (MVSTATS_MATCH_FULL - tmp_matches[i]);
+
/*
* To AND-merge the bitmaps, a MIN() semantics is used.
* For OR-merge, use MAX().
*
* FIXME this does not decrease the number of matches
*/
- UPDATE_RESULT(matches[i], or_matches[i], is_or);
+ UPDATE_RESULT(matches[i], tmp_matches[i], is_or);
}
- pfree(or_matches);
+ pfree(tmp_matches);
}
else
- {
elog(ERROR, "unknown clause type: %d", clause->type);
- }
}
/*
@@ -2538,15 +3461,18 @@ update_match_bitmap_mcvlist(PlannerInfo *root, List *clauses,
* this is not uncommon, but for histograms it's not that clear.
*/
static Selectivity
-clauselist_mv_selectivity_histogram(PlannerInfo *root, List *clauses,
- MVStatisticInfo *mvstats)
+clauselist_mv_selectivity_histogram(PlannerInfo *root, MVStatisticInfo *mvstats,
+ List *clauses, List *conditions, bool is_or)
{
int i;
Selectivity s = 0.0;
+ Selectivity t = 0.0;
Selectivity u = 0.0;
int nmatches = 0;
+ int nconditions = 0;
char *matches = NULL;
+ char *condition_matches = NULL;
MVSerializedHistogram mvhist = NULL;
@@ -2559,25 +3485,55 @@ clauselist_mv_selectivity_histogram(PlannerInfo *root, List *clauses,
Assert (mvhist != NULL);
Assert (clauses != NIL);
- Assert (list_length(clauses) >= 2);
+ Assert (list_length(clauses) >= 1);
+
+ nmatches = mvhist->nbuckets;
+ nconditions = mvhist->nbuckets;
/*
- * Bitmap of bucket matches (mismatch, partial, full). by default
- * all buckets fully match (and we'll eliminate them).
+ * Bitmap of bucket matches (mismatch, partial, full).
+ *
+ * For AND clauses all buckets match (and we'll eliminate them).
+ * For OR clauses no buckets match (and we'll add them).
+ *
+ * We only need to do the memset for AND clauses (for OR clauses
+ * it's already set correctly by the palloc0).
*/
- matches = palloc0(sizeof(char) * mvhist->nbuckets);
- memset(matches, MVSTATS_MATCH_FULL, sizeof(char)*mvhist->nbuckets);
+ matches = palloc0(sizeof(char) * nmatches);
- nmatches = mvhist->nbuckets;
+ if (! is_or) /* AND-clause */
+ memset(matches, MVSTATS_MATCH_FULL, sizeof(char)*nmatches);
+
+ /* Conditions are treated as an AND clause, so they match by default. */
+ condition_matches = palloc0(sizeof(char)*nconditions);
+ memset(condition_matches, MVSTATS_MATCH_FULL, sizeof(char)*nconditions);
+
+ /*
+ * build the match bitmap for the conditions (conditions are always
+ * connected by AND)
+ */
+ if (conditions != NIL)
+ update_match_bitmap_histogram(root, conditions,
+ mvstats->stakeys, mvhist,
+ nconditions, condition_matches, false);
- /* build the match bitmap */
+ /*
+ * build the match bitmap for the estimated clauses
+ *
+ * TODO This evaluates the clauses for all buckets, even those
+ * ruled out by the conditions. The final result should be
+ * the same, but it might be faster.
+ */
update_match_bitmap_histogram(root, clauses,
mvstats->stakeys, mvhist,
- nmatches, matches, false);
+ ((is_or) ? 0 : nmatches), matches,
+ is_or);
/* now, walk through the buckets and sum the selectivities */
for (i = 0; i < mvhist->nbuckets; i++)
{
+ float coeff = 1.0;
+
/*
* Find out what part of the data is covered by the histogram,
* so that we can 'scale' the selectivity properly (e.g. when
@@ -2591,10 +3547,23 @@ clauselist_mv_selectivity_histogram(PlannerInfo *root, List *clauses,
*/
u += mvhist->buckets[i]->ntuples;
+ /* skip buckets not matching the conditions */
+ if (condition_matches[i] == MVSTATS_MATCH_NONE)
+ continue;
+ else if (condition_matches[i] == MVSTATS_MATCH_PARTIAL)
+ coeff = 0.5;
+
+ t += coeff * mvhist->buckets[i]->ntuples;
+
if (matches[i] == MVSTATS_MATCH_FULL)
- s += mvhist->buckets[i]->ntuples;
+ s += coeff * mvhist->buckets[i]->ntuples;
else if (matches[i] == MVSTATS_MATCH_PARTIAL)
- s += 0.5 * mvhist->buckets[i]->ntuples;
+ /*
+ * TODO If both conditions and clauses match partially, this
+ * will use a 0.25 match - not sure if that's the right
+ * solution, but it seems about right.
+ */
+ s += coeff * 0.5 * mvhist->buckets[i]->ntuples;
}
#ifdef DEBUG_MVHIST
@@ -2603,9 +3572,14 @@ clauselist_mv_selectivity_histogram(PlannerInfo *root, List *clauses,
/* release the allocated bitmap and deserialized histogram */
pfree(matches);
+ pfree(condition_matches);
pfree(mvhist);
- return s * u;
+ /* no condition matches */
+ if (t == 0.0)
+ return (Selectivity)0.0;
+
+ return (s / t) * u;
}
#define HIST_CACHE_NOT_FOUND 0x00
@@ -2652,7 +3626,7 @@ update_match_bitmap_histogram(PlannerInfo *root, List *clauses,
{
int i;
ListCell * l;
-
+
/*
* Used for caching function calls, only once per deduplicated value.
*
@@ -2695,7 +3669,7 @@ update_match_bitmap_histogram(PlannerInfo *root, List *clauses,
FmgrInfo opproc; /* operator */
fmgr_info(get_opcode(expr->opno), &opproc);
-
+
/* reset the cache (per clause) */
memset(callcache, 0, mvhist->nbuckets);
@@ -2876,64 +3850,57 @@ update_match_bitmap_histogram(PlannerInfo *root, List *clauses,
UPDATE_RESULT(matches[i], MVSTATS_MATCH_NONE, is_or);
}
}
- else if (or_clause(clause) || and_clause(clause))
+ else if (or_clause(clause) || and_clause(clause) || not_clause(clause))
{
/* AND/OR clause, with all clauses compatible with the selected MV stat */
int i;
- BoolExpr *orclause = ((BoolExpr*)clause);
- List *orclauses = orclause->args;
+ List *tmp_clauses = ((BoolExpr*)clause)->args;
/* match/mismatch bitmap for each bucket */
- int or_nmatches = 0;
- char * or_matches = NULL;
+ int tmp_nmatches = 0;
+ char * tmp_matches = NULL;
- Assert(orclauses != NIL);
- Assert(list_length(orclauses) >= 2);
+ Assert(tmp_clauses != NIL);
+ Assert((list_length(tmp_clauses) >= 2) || (not_clause(clause) && (list_length(tmp_clauses)==1)));
/* number of matching buckets */
- or_nmatches = mvhist->nbuckets;
+ tmp_nmatches = (or_clause(clause)) ? 0 : mvhist->nbuckets;
- /* by default none of the buckets matches the clauses */
- or_matches = palloc0(sizeof(char) * or_nmatches);
+ /* by default none of the buckets matches the clauses (OR clause) */
+ tmp_matches = palloc0(sizeof(char) * mvhist->nbuckets);
- if (or_clause(clause))
- {
- /* OR clauses assume nothing matches, initially */
- memset(or_matches, MVSTATS_MATCH_NONE, sizeof(char)*or_nmatches);
- or_nmatches = 0;
- }
- else
- {
- /* AND clauses assume nothing matches, initially */
- memset(or_matches, MVSTATS_MATCH_FULL, sizeof(char)*or_nmatches);
- }
+ /* but AND (and NOT) clauses assume everything matches, initially */
+ if (! or_clause(clause))
+ memset(tmp_matches, MVSTATS_MATCH_FULL, sizeof(char)*mvhist->nbuckets);
/* build the match bitmap for the OR-clauses */
- or_nmatches = update_match_bitmap_histogram(root, orclauses,
+ tmp_nmatches = update_match_bitmap_histogram(root, tmp_clauses,
stakeys, mvhist,
- or_nmatches, or_matches, or_clause(clause));
+ tmp_nmatches, tmp_matches, or_clause(clause));
/* merge the bitmap into the existing one*/
for (i = 0; i < mvhist->nbuckets; i++)
{
+ /* if this is a NOT clause, we need to invert the results first */
+ if (not_clause(clause))
+ tmp_matches[i] = (MVSTATS_MATCH_FULL - tmp_matches[i]);
+
/*
* To AND-merge the bitmaps, a MIN() semantics is used.
* For OR-merge, use MAX().
*
* FIXME this does not decrease the number of matches
*/
- UPDATE_RESULT(matches[i], or_matches[i], is_or);
+ UPDATE_RESULT(matches[i], tmp_matches[i], is_or);
}
- pfree(or_matches);
-
+ pfree(tmp_matches);
}
else
elog(ERROR, "unknown clause type: %d", clause->type);
}
- /* free the call cache */
pfree(callcache);
return nmatches;
@@ -3049,3 +4016,363 @@ bucket_is_smaller_than_value(FmgrInfo opproc, Datum constvalue,
else
return ( a) ? MVSTATS_MATCH_FULL : MVSTATS_MATCH_NONE;
}
+
+/*
+ * Walk through clauses and keep only those covered by at least
+ * one of the statistics.
+ */
+static List *
+filter_clauses(PlannerInfo *root, Oid varRelid, SpecialJoinInfo *sjinfo,
+ int type, List *stats, List *clauses, Bitmapset **attnums)
+{
+ ListCell *c;
+ ListCell *s;
+
+ /* results (list of compatible clauses, attnums) */
+ List *rclauses = NIL;
+
+ foreach (c, clauses)
+ {
+ Node *clause = (Node*)lfirst(c);
+ Bitmapset *clause_attnums = NULL;
+ Index relid;
+
+ /*
+ * The clause has to be mv-compatible (suitable operators etc.).
+ */
+ if (! clause_is_mv_compatible(clause, varRelid,
+ &relid, &clause_attnums, sjinfo, type))
+ elog(ERROR, "should not get non-mv-compatible clause");
+
+ /* is there a statistics covering this clause? */
+ foreach (s, stats)
+ {
+ int k, matches = 0;
+ MVStatisticInfo *stat = (MVStatisticInfo *)lfirst(s);
+
+ for (k = 0; k < stat->stakeys->dim1; k++)
+ {
+ if (bms_is_member(stat->stakeys->values[k],
+ clause_attnums))
+ matches += 1;
+ }
+
+ /*
+ * The clause is compatible if all attributes it references
+ * are covered by the statistics.
+ */
+ if (bms_num_members(clause_attnums) == matches)
+ {
+ *attnums = bms_union(*attnums, clause_attnums);
+ rclauses = lappend(rclauses, clause);
+ break;
+ }
+ }
+
+ bms_free(clause_attnums);
+ }
+
+ /* we can't end up with more compatible clauses than we started with */
+ Assert(list_length(clauses) >= list_length(rclauses));
+
+ return rclauses;
+}
+
+
+/*
+ * Walk through statistics and only keep those covering at least
+ * one new attribute (excluding conditions) and at least two attributes
+ * in both clauses and conditions.
+ *
+ * This check might be made more strict by checking against individual
+ * clauses, because by using the bitmapsets of all attnums we may
+ * actually use attnums from clauses that are not covered by the
+ * statistics. For example, we may have a condition
+ *
+ * (a=1 AND b=2)
+ *
+ * and a new clause
+ *
+ * (c=1 AND d=1)
+ *
+ * With only bitmapsets, statistics on [b,c] will pass through this
+ * (assuming there are some statistics covering both clauses).
+ *
+ * TODO Do the more strict check.
+ */
+static List *
+filter_stats(List *stats, Bitmapset *new_attnums, Bitmapset *all_attnums)
+{
+ ListCell *s;
+ List *stats_filtered = NIL;
+
+ foreach (s, stats)
+ {
+ int k;
+ int matches_new = 0,
+ matches_all = 0;
+
+ MVStatisticInfo *stat = (MVStatisticInfo *)lfirst(s);
+
+ /* see how many attributes the statistics covers */
+ for (k = 0; k < stat->stakeys->dim1; k++)
+ {
+ /* attributes from new clauses */
+ if (bms_is_member(stat->stakeys->values[k], new_attnums))
+ matches_new += 1;
+
+ /* attributes from conditions */
+ if (bms_is_member(stat->stakeys->values[k], all_attnums))
+ matches_all += 1;
+ }
+
+ /* check we have enough attributes for this statistics */
+ if ((matches_new >= 1) && (matches_all >= 2))
+ stats_filtered = lappend(stats_filtered, stat);
+ }
+
+ /* we can't have more useful stats than we had originally */
+ Assert(list_length(stats) >= list_length(stats_filtered));
+
+ return stats_filtered;
+}
+
+static MVStatisticInfo *
+make_stats_array(List *stats, int *nmvstats)
+{
+ int i;
+ ListCell *l;
+
+ MVStatisticInfo *mvstats = NULL;
+ *nmvstats = list_length(stats);
+
+ mvstats
+ = (MVStatisticInfo*)palloc0((*nmvstats) * sizeof(MVStatisticInfo));
+
+ i = 0;
+ foreach (l, stats)
+ {
+ MVStatisticInfo *stat = (MVStatisticInfo *)lfirst(l);
+ memcpy(&mvstats[i++], stat, sizeof(MVStatisticInfo));
+ }
+
+ return mvstats;
+}
+
+static Bitmapset **
+make_stats_attnums(MVStatisticInfo *mvstats, int nmvstats)
+{
+ int i, j;
+ Bitmapset **stats_attnums = NULL;
+
+ Assert(nmvstats > 0);
+
+ /* build bitmaps of attnums for the stats (easier to compare) */
+ stats_attnums = (Bitmapset **)palloc0(nmvstats * sizeof(Bitmapset*));
+
+ for (i = 0; i < nmvstats; i++)
+ for (j = 0; j < mvstats[i].stakeys->dim1; j++)
+ stats_attnums[i]
+ = bms_add_member(stats_attnums[i],
+ mvstats[i].stakeys->values[j]);
+
+ return stats_attnums;
+}
+
+
+/*
+ * Now let's remove redundant statistics, covering the same columns
+ * as some other stats, when restricted to the attributes from
+ * remaining clauses.
+ *
+ * If statistics S1 covers S2 (covers S2 attributes and possibly
+ * some more), we can probably remove S2. What actually matters are
+ * attributes from covered clauses (not all the attributes). This
+ * might however prefer larger, and thus less accurate, statistics.
+ *
+ * When a redundancy is detected, we simply keep the smaller
+ * statistics (fewer columns), on the assumption that it's
+ * more accurate and faster to process. That might be incorrect for
+ * two reasons - first, the accuracy really depends on number of
+ * buckets/MCV items, not the number of columns. Second, we might
+ * prefer MCV lists over histograms or something like that.
+ */
+static List*
+filter_redundant_stats(List *stats, List *clauses, List *conditions)
+{
+ int i, j, nmvstats;
+
+ MVStatisticInfo *mvstats;
+ bool *redundant;
+ Bitmapset **stats_attnums;
+ Bitmapset *varattnos;
+ Index relid;
+
+ Assert(list_length(stats) > 0);
+ Assert(list_length(clauses) > 0);
+
+ /*
+ * We'll convert the list of statistics into an array now, because
+ * the reduction of redundant statistics is easier to do that way
+ * (we can mark previous stats as redundant, etc.).
+ */
+ mvstats = make_stats_array(stats, &nmvstats);
+ stats_attnums = make_stats_attnums(mvstats, nmvstats);
+
+ /* by default, none of the stats is redundant (so palloc0) */
+ redundant = palloc0(nmvstats * sizeof(bool));
+
+ /*
+ * We only expect a single relid here, and also we should get the
+ * same relid from clauses and conditions (but we get it from
+ * clauses, because those are certainly non-empty).
+ */
+ relid = bms_singleton_member(pull_varnos((Node*)clauses));
+
+ /*
+ * Get the varattnos from both conditions and clauses.
+ *
+ * This skips system attributes, although that should be impossible
+ * thanks to previous filtering out of incompatible clauses.
+ *
+ * XXX Is that really true?
+ */
+ varattnos = bms_union(get_varattnos((Node*)clauses, relid),
+ get_varattnos((Node*)conditions, relid));
+
+ for (i = 1; i < nmvstats; i++)
+ {
+ /* intersect with current statistics */
+ Bitmapset *curr = bms_intersect(stats_attnums[i], varattnos);
+
+ /* walk through 'previous' stats and check redundancy */
+ for (j = 0; j < i; j++)
+ {
+ /* intersect with current statistics */
+ Bitmapset *prev;
+
+ /* skip stats already identified as redundant */
+ if (redundant[j])
+ continue;
+
+ prev = bms_intersect(stats_attnums[j], varattnos);
+
+ switch (bms_subset_compare(curr, prev))
+ {
+ case BMS_EQUAL:
+ /*
+ * Use the smaller one (hopefully more accurate).
+ * If both have the same size, use the first one.
+ */
+ if (mvstats[i].stakeys->dim1 >= mvstats[j].stakeys->dim1)
+ redundant[i] = TRUE;
+ else
+ redundant[j] = TRUE;
+
+ break;
+
+ case BMS_SUBSET1: /* curr is subset of prev */
+ redundant[i] = TRUE;
+ break;
+
+ case BMS_SUBSET2: /* prev is subset of curr */
+ redundant[j] = TRUE;
+ break;
+
+ case BMS_DIFFERENT:
+ /* do nothing - keep both stats */
+ break;
+ }
+
+ bms_free(prev);
+ }
+
+ bms_free(curr);
+ }
+
+ /* can't reduce all statistics (at least one has to remain) */
+ Assert(nmvstats > 0);
+
+ /* now, let's remove the reduced statistics from the arrays */
+ list_free(stats);
+ stats = NIL;
+
+ for (i = 0; i < nmvstats; i++)
+ {
+ MVStatisticInfo *info;
+
+ pfree(stats_attnums[i]);
+
+ if (redundant[i])
+ continue;
+
+ info = makeNode(MVStatisticInfo);
+ memcpy(info, &mvstats[i], sizeof(MVStatisticInfo));
+
+ stats = lappend(stats, info);
+ }
+
+ pfree(mvstats);
+ pfree(stats_attnums);
+ pfree(redundant);
+
+ return stats;
+}
+
+static Node**
+make_clauses_array(List *clauses, int *nclauses)
+{
+ int i;
+ ListCell *l;
+
+ Node** clauses_array;
+
+ *nclauses = list_length(clauses);
+ clauses_array = (Node **)palloc0((*nclauses) * sizeof(Node *));
+
+ i = 0;
+ foreach (l, clauses)
+ clauses_array[i++] = (Node *)lfirst(l);
+
+ *nclauses = i;
+
+ return clauses_array;
+}
+
+static Bitmapset **
+make_clauses_attnums(PlannerInfo *root, Oid varRelid, SpecialJoinInfo *sjinfo,
+ int type, Node **clauses, int nclauses)
+{
+ int i;
+ Index relid;
+ Bitmapset **clauses_attnums
+ = (Bitmapset **)palloc0(nclauses * sizeof(Bitmapset *));
+
+ for (i = 0; i < nclauses; i++)
+ {
+ Bitmapset * attnums = NULL;
+
+ if (! clause_is_mv_compatible(clauses[i], varRelid,
+ &relid, &attnums, sjinfo, type))
+ elog(ERROR, "should not get non-mv-compatible clause");
+
+ clauses_attnums[i] = attnums;
+ }
+
+ return clauses_attnums;
+}
+
+static bool*
+make_cover_map(Bitmapset **stats_attnums, int nmvstats,
+ Bitmapset **clauses_attnums, int nclauses)
+{
+ int i, j;
+ bool *cover_map = (bool*)palloc0(nclauses * nmvstats * sizeof(bool));
+
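+ /*
+ * Row-major layout - entry [i * nclauses + j] says whether statistics i
+ * covers all attributes referenced by clause j.
+ */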
+ for (i = 0; i < nmvstats; i++)
+ for (j = 0; j < nclauses; j++)
+ cover_map[i * nclauses + j]
+ = bms_is_subset(clauses_attnums[j], stats_attnums[i]);
+
+ return cover_map;
+}
diff --git a/src/backend/optimizer/path/costsize.c b/src/backend/optimizer/path/costsize.c
index 5fc2f9c..7384cb8 100644
--- a/src/backend/optimizer/path/costsize.c
+++ b/src/backend/optimizer/path/costsize.c
@@ -3520,7 +3520,8 @@ compute_semi_anti_join_factors(PlannerInfo *root,
joinquals,
0,
jointype,
- sjinfo);
+ sjinfo,
+ NIL);
/*
* Also get the normal inner-join selectivity of the join clauses.
@@ -3543,7 +3544,8 @@ compute_semi_anti_join_factors(PlannerInfo *root,
joinquals,
0,
JOIN_INNER,
- &norm_sjinfo);
+ &norm_sjinfo,
+ NIL);
/* Avoid leaking a lot of ListCells */
if (jointype == JOIN_ANTI)
@@ -3710,7 +3712,7 @@ approx_tuple_count(PlannerInfo *root, JoinPath *path, List *quals)
Node *qual = (Node *) lfirst(l);
/* Note that clause_selectivity will be able to cache its result */
- selec *= clause_selectivity(root, qual, 0, JOIN_INNER, &sjinfo);
+ selec *= clause_selectivity(root, qual, 0, JOIN_INNER, &sjinfo, NIL);
}
/* Apply it to the input relation sizes */
@@ -3746,7 +3748,8 @@ set_baserel_size_estimates(PlannerInfo *root, RelOptInfo *rel)
rel->baserestrictinfo,
0,
JOIN_INNER,
- NULL);
+ NULL,
+ NIL);
rel->rows = clamp_row_est(nrows);
@@ -3783,7 +3786,8 @@ get_parameterized_baserel_size(PlannerInfo *root, RelOptInfo *rel,
allclauses,
rel->relid, /* do not use 0! */
JOIN_INNER,
- NULL);
+ NULL,
+ NIL);
nrows = clamp_row_est(nrows);
/* For safety, make sure result is not more than the base estimate */
if (nrows > rel->rows)
@@ -3921,12 +3925,14 @@ calc_joinrel_size_estimate(PlannerInfo *root,
joinquals,
0,
jointype,
- sjinfo);
+ sjinfo,
+ NIL);
pselec = clauselist_selectivity(root,
pushedquals,
0,
jointype,
- sjinfo);
+ sjinfo,
+ NIL);
/* Avoid leaking a lot of ListCells */
list_free(joinquals);
@@ -3938,7 +3944,8 @@ calc_joinrel_size_estimate(PlannerInfo *root,
restrictlist,
0,
jointype,
- sjinfo);
+ sjinfo,
+ NIL);
pselec = 0.0; /* not used, keep compiler quiet */
}
diff --git a/src/backend/optimizer/util/orclauses.c b/src/backend/optimizer/util/orclauses.c
index ea831f5..6299e75 100644
--- a/src/backend/optimizer/util/orclauses.c
+++ b/src/backend/optimizer/util/orclauses.c
@@ -280,7 +280,7 @@ consider_new_or_clause(PlannerInfo *root, RelOptInfo *rel,
* saving work later.)
*/
or_selec = clause_selectivity(root, (Node *) or_rinfo,
- 0, JOIN_INNER, NULL);
+ 0, JOIN_INNER, NULL, NIL);
/*
* The clause is only worth adding to the query if it rejects a useful
@@ -342,7 +342,7 @@ consider_new_or_clause(PlannerInfo *root, RelOptInfo *rel,
/* Compute inner-join size */
orig_selec = clause_selectivity(root, (Node *) join_or_rinfo,
- 0, JOIN_INNER, &sjinfo);
+ 0, JOIN_INNER, &sjinfo, NIL);
/* And hack cached selectivity so join size remains the same */
join_or_rinfo->norm_selec = orig_selec / or_selec;
diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index 46c95b0..7d0a3a1 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -1627,13 +1627,15 @@ booltestsel(PlannerInfo *root, BoolTestType booltesttype, Node *arg,
case IS_NOT_FALSE:
selec = (double) clause_selectivity(root, arg,
varRelid,
- jointype, sjinfo);
+ jointype, sjinfo,
+ NIL);
break;
case IS_FALSE:
case IS_NOT_TRUE:
selec = 1.0 - (double) clause_selectivity(root, arg,
varRelid,
- jointype, sjinfo);
+ jointype, sjinfo,
+ NIL);
break;
default:
elog(ERROR, "unrecognized booltesttype: %d",
@@ -6259,7 +6261,8 @@ genericcostestimate(PlannerInfo *root,
indexSelectivity = clauselist_selectivity(root, selectivityQuals,
index->rel->relid,
JOIN_INNER,
- NULL);
+ NULL,
+ NIL);
/*
* If caller didn't give us an estimate, estimate the number of index
@@ -6579,7 +6582,8 @@ btcostestimate(PlannerInfo *root, IndexPath *path, double loop_count,
btreeSelectivity = clauselist_selectivity(root, selectivityQuals,
index->rel->relid,
JOIN_INNER,
- NULL);
+ NULL,
+ NIL);
numIndexTuples = btreeSelectivity * index->rel->tuples;
/*
@@ -7330,7 +7334,8 @@ gincostestimate(PlannerInfo *root, IndexPath *path, double loop_count,
*indexSelectivity = clauselist_selectivity(root, selectivityQuals,
index->rel->relid,
JOIN_INNER,
- NULL);
+ NULL,
+ NIL);
/* fetch estimated page cost for tablespace containing index */
get_tablespace_page_costs(index->reltablespace,
@@ -7560,7 +7565,7 @@ brincostestimate(PlannerInfo *root, IndexPath *path, double loop_count,
*indexSelectivity =
clauselist_selectivity(root, indexQuals,
path->indexinfo->rel->relid,
- JOIN_INNER, NULL);
+ JOIN_INNER, NULL, NIL);
*indexCorrelation = 1;
/*
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index ea5a09a..27a8de5 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -75,6 +75,7 @@
#include "utils/bytea.h"
#include "utils/guc_tables.h"
#include "utils/memutils.h"
+#include "utils/mvstats.h"
#include "utils/pg_locale.h"
#include "utils/plancache.h"
#include "utils/portal.h"
@@ -393,6 +394,15 @@ static const struct config_enum_entry force_parallel_mode_options[] = {
};
/*
+ * Search algorithm for multivariate stats.
+ */
+static const struct config_enum_entry mvstat_search_options[] = {
+ {"greedy", MVSTAT_SEARCH_GREEDY, false},
+ {"exhaustive", MVSTAT_SEARCH_EXHAUSTIVE, false},
+ {NULL, 0, false}
+};
+
+/*
* Options for enum values stored in other modules
*/
extern const struct config_enum_entry wal_level_options[];
@@ -3707,6 +3717,16 @@ static struct config_enum ConfigureNamesEnum[] =
NULL, NULL, NULL
},
+ {
+ {"mvstat_search", PGC_USERSET, QUERY_TUNING_OTHER,
+ gettext_noop("Sets the algorithm used for combining multivariate stats."),
+ NULL
+ },
+ &mvstat_search_type,
+ MVSTAT_SEARCH_GREEDY, mvstat_search_options,
+ NULL, NULL, NULL
+ },
+
/* End-of-list marker */
{
{NULL, 0, 0, NULL, NULL}, NULL, 0, NULL, NULL, NULL, NULL
diff --git a/src/backend/utils/mvstats/README.stats b/src/backend/utils/mvstats/README.stats
index 3e4f4d1..d404914 100644
--- a/src/backend/utils/mvstats/README.stats
+++ b/src/backend/utils/mvstats/README.stats
@@ -90,6 +90,137 @@ even attempting to do the more expensive estimation.
Whenever we find there are no suitable stats, we skip the expensive steps.
+Combining multiple statistics
+-----------------------------
+
+When estimating selectivity of a list of clauses, there may exist no statistics
+covering all of them. If there are multiple statistics, each covering some
+subset of the attributes, the optimizer needs to figure out which of those
+statistics to apply.
+
+When the statistics do not overlap, the solution is trivial - we can simply
+split the conditions into groups by the matching statistics, and then multiply the
+selectivities. For example assume multivariate statistics on (b,c) and (d,e),
+and a condition like this:
+
+ (a=1) AND (b=2) AND (c=3) AND (d=4) AND (e=5)
+
+Then (a=1) is not covered by any of the statistics, so it will be estimated
+using the regular per-column statistics. The clauses ((b=2) AND (c=3)) will be
+estimated using the (b,c) statistics, and ((d=4) AND (e=5)) will be estimated
+using the (d,e) statistics. The resulting selectivities are then multiplied:
+
+    P(a=1) * P(b=2 & c=3) * P(d=4 & e=5)
+
+Now, what if the statistics overlap? For example assume the same condition as
+above, but let's say we have statistics on (a,b,c) and (a,c,d,e). What then?
+
+As selectivity is just a probability that the condition holds for a random row,
+we can write the selectivity like this:
+
+ P(a=1 & b=2 & c=3 & d=4 & e=5)
+
+and we can rewrite it using conditional probability like this
+
+ P(a=1 & b=2 & c=3) * P(d=4 & e=5 | a=1 & b=2 & c=3)
+
+Notice that the first part already matches to (a,b,c) statistics. If we assume
+that columns that are not referenced by the same statistics are independent, we
+may rewrite the second half like this
+
+ P(d=4 & e=5 | a=1 & b=2 & c=3) = P(d=4 & e=5 | a=1 & c=3)
+
+which corresponds to the statistics on (a,c,d,e).
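+
+This conditioning is essentially what the (s / t) * u formula in
+clauselist_mv_selectivity_mcvlist() computes from a MCV list. A minimal sketch
+of the idea (the item_matches_* helpers are illustrative only, not functions
+from the patch):
+
+    Selectivity s = 0.0;    /* conditions AND estimated clauses match */
+    Selectivity t = 0.0;    /* conditions match */
+    Selectivity u = 0.0;    /* total frequency covered by the MCV list */
+
+    for (i = 0; i < mcvlist->nitems; i++)
+    {
+        u += mcvlist->items[i]->frequency;
+
+        if (! item_matches_conditions(mcvlist->items[i]))
+            continue;
+
+        t += mcvlist->items[i]->frequency;
+
+        if (item_matches_clauses(mcvlist->items[i]))
+            s += mcvlist->items[i]->frequency;
+    }
+
+    /* scale the conditional estimate (s / t) back to the covered part */
+    return (t > 0.0) ? (s / t) * u : (Selectivity) 0.0;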
+
+If there are multiple statistics defined on a table, it's not difficult to come
+up with examples where there are multiple ways to combine them to cover a list of
+clauses. We need a way to find the best combination of statistics.
+
+This is the purpose of choose_mv_statistics(). It searches through the possible
+combinations of statistics, looking for the combination that
+
+ (a) covers the most clauses of the list
+
+ (b) reuses the maximum number of clauses as conditions
+ (in conditional probabilities)
+
+While criterion (a) seems natural, (b) may seem a bit awkward at first. The
+idea is that conditions are a way of transferring information about
+dependencies between the statistics. In the example above, the clauses on
+(a) and (c) are shared by both statistics, and conditioning on them is what
+carries the dependency information from (a,b,c) over to (a,c,d,e).
+
+There are two alternative implementations of choose_mv_statistics() - greedy
+and exhaustive. Exhaustive actually searches through all possible combinations
+of statistics, and for larger numbers of statistics may get quite expensive
+(as it, unsurprisingly, has exponential cost). Greedy terminates in at most
+N steps (where N is the number of statistics), and in each step chooses the
+best next statistics. I've been unable to come up with an example where those two
+approaches would produce different combinations.
+
+It's possible to choose the algorithm using the mvstat_search GUC, with either
+'greedy' or 'exhaustive' values (default is 'greedy'):
+
+    SET mvstat_search = 'exhaustive';
+
+Note: This is meant mostly for experimentation. I do expect we'll choose one of
+the algorithms and remove the GUC before commit.
+
+
+Limitations of combining statistics
+-----------------------------------
+
+As described in the section 'Combining multiple statistics', the current approach
+is based on transferring information between statistics by means of conditional
+probabilities. This is a relatively cheap and efficient approach, but it is
+based on two assumptions:
+
+ (1) The overlap between the statistics needs to be sufficiently large, i.e.
+ there needs to be enough columns shared by the statistics to transfer
+ information about dependencies between the remaining columns.
+
+ (2) The query needs to include sufficient clauses on the shared columns.
+
+How a violation of those assumptions may be a problem can be illustrated by
+a simple example. Assume a table with three columns (a,b,c) containing exactly
+the same values, and statistics on (a,b) and (b,c):
+
+ CREATE TABLE test AS SELECT i AS a, i AS b, i AS c
+ FROM generate_series(1,1000) s(i);
+
+ CREATE STATISTICS s1 ON test (a,b) WITH (mcv);
+ CREATE STATISTICS s2 ON test (b,c) WITH (mcv);
+
+ ANALYZE test;
+
+First, let's estimate this query:
+
+ SELECT * FROM test WHERE (a < 10) AND (c < 10);
+
+Clearly, there are no conditions on 'b' (which is the only column shared by the
+two statistics), so we'll end up with an estimate based on the assumption of
+independence:
+
+ P(a < 10) * P(c < 10) = 0.01 * 0.01 = 0.0001
+
+This is a significant under-estimate, as the actual selectivity is 0.01.
+
+But let's estimate another query:
+
+ SELECT * FROM test WHERE (a < 10) AND (b < 500) AND (c < 10);
+
+In this case, the estimate may be computed for example like this:
+
+ P[(a < 10) & (b < 500) & (c < 10)]
+ = P[(a < 10) & (b < 500)] * P[(c < 10) | (a < 10) & (b < 500)]
+ = P[(a < 10) & (b < 500)] * P[(c < 10) | (b < 500)]
+
+The trouble is that P(c < 10 | b < 500) evaluates to 0.02 - we have assumed
+(a) and (c) are independent because there is no statistic containing both
+these columns, and the condition on (b) does not transfer sufficient
+information between the two statistics.
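+
+To spell out the arithmetic (remember the three columns contain exactly the
+same values, so (c < 10) implies (b < 500)):
+
+  P(c < 10 & b < 500) = P(c < 10) = 0.01
+  P(b < 500) = 0.5
+
+  P(c < 10 | b < 500) = 0.01 / 0.5 = 0.02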
+
+Currently, the only solution is to build statistics on all three columns, but
+see the 'Combining stats using convolution' section below for ideas on how to
+improve this.
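+
+A sketch of that workaround, continuing the example above:
+
+  DROP STATISTICS s1;
+  DROP STATISTICS s2;
+
+  CREATE STATISTICS s3 ON test (a,b,c) WITH (mcv);
+  ANALYZE test;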
+
+
Further (possibly crazy) ideas
------------------------------
@@ -111,3 +242,38 @@ But of course, this may result in expensive estimation (CPU-wise).
So we might add a GUC to choose between the simple (single statistics) and this
multi-statistic estimation, possibly as a table-level parameter (ALTER TABLE ...).
+
+
+Combining stats using convolution
+---------------------------------
+
+The current approach for combining statistics is based on conditional
+probabilities, and thus only works when the query includes conditions on the
+overlapping parts of the statistics. There may however be other ways to combine
+statistics, relaxing this requirement.
+
+Let's assume two histograms H1 and H2 - then combining them might work about
+like this:
+
+
+ for (buckets of H1, satisfying local conditions)
+ {
+ for (buckets of H2, overlapping with H1 bucket)
+ {
+ mark H2 bucket as 'valid'
+ }
+ }
+
+ s1 = s2 = 0.0
+ for (buckets of H2 marked as valid)
+ {
+ s1 += frequency
+
+ if (bucket satisfies local conditions)
+ s2 += frequency
+ }
+
+ s = (s2 / s1) /* final selectivity estimate */
+
+However this may quickly get non-trivial, e.g. when combining two statistics
+of different types (histogram vs. MCV).
diff --git a/src/include/optimizer/cost.h b/src/include/optimizer/cost.h
index 78c7cae..a5ac088 100644
--- a/src/include/optimizer/cost.h
+++ b/src/include/optimizer/cost.h
@@ -191,11 +191,13 @@ extern Selectivity clauselist_selectivity(PlannerInfo *root,
List *clauses,
int varRelid,
JoinType jointype,
- SpecialJoinInfo *sjinfo);
+ SpecialJoinInfo *sjinfo,
+ List *conditions);
extern Selectivity clause_selectivity(PlannerInfo *root,
Node *clause,
int varRelid,
JoinType jointype,
- SpecialJoinInfo *sjinfo);
+ SpecialJoinInfo *sjinfo,
+ List *conditions);
#endif /* COST_H */
diff --git a/src/include/utils/mvstats.h b/src/include/utils/mvstats.h
index f05a517..35b2f8e 100644
--- a/src/include/utils/mvstats.h
+++ b/src/include/utils/mvstats.h
@@ -17,6 +17,14 @@
#include "fmgr.h"
#include "commands/vacuum.h"
+typedef enum MVStatSearchType
+{
+ MVSTAT_SEARCH_EXHAUSTIVE, /* exhaustive search */
+ MVSTAT_SEARCH_GREEDY /* greedy search */
+} MVStatSearchType;
+
+extern int mvstat_search_type;
+
/*
* Degree of how much MCV item / histogram bucket matches a clause.
* This is then considered when computing the selectivity.
--
2.1.0
0007-multivariate-ndistinct-coefficients.patch
From ca8e799b8392541ef46c9427bef431175ae8f84e Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@pgaddict.com>
Date: Wed, 23 Dec 2015 02:07:58 +0100
Subject: [PATCH 7/9] multivariate ndistinct coefficients
---
doc/src/sgml/ref/create_statistics.sgml | 9 ++
src/backend/catalog/system_views.sql | 3 +-
src/backend/commands/analyze.c | 2 +-
src/backend/commands/statscmds.c | 11 +-
src/backend/optimizer/path/clausesel.c | 4 +
src/backend/optimizer/util/plancat.c | 4 +-
src/backend/utils/adt/selfuncs.c | 93 +++++++++++++++-
src/backend/utils/mvstats/Makefile | 2 +-
src/backend/utils/mvstats/README.ndistinct | 83 ++++++++++++++
src/backend/utils/mvstats/README.stats | 2 +
src/backend/utils/mvstats/common.c | 23 +++-
src/backend/utils/mvstats/mvdist.c | 171 +++++++++++++++++++++++++++++
src/include/catalog/pg_mv_statistic.h | 26 +++--
src/include/nodes/relation.h | 2 +
src/include/utils/mvstats.h | 9 +-
src/test/regress/expected/rules.out | 3 +-
16 files changed, 424 insertions(+), 23 deletions(-)
create mode 100644 src/backend/utils/mvstats/README.ndistinct
create mode 100644 src/backend/utils/mvstats/mvdist.c
diff --git a/doc/src/sgml/ref/create_statistics.sgml b/doc/src/sgml/ref/create_statistics.sgml
index fd3382e..80360a6 100644
--- a/doc/src/sgml/ref/create_statistics.sgml
+++ b/doc/src/sgml/ref/create_statistics.sgml
@@ -168,6 +168,15 @@ CREATE STATISTICS [ IF NOT EXISTS ] <replaceable class="PARAMETER">statistics_na
</listitem>
</varlistentry>
+ <varlistentry>
+ <term><literal>ndistinct</> (<type>boolean</>)</term>
+ <listitem>
+ <para>
+ Enables ndistinct coefficients for the statistics.
+ </para>
+ </listitem>
+ </varlistentry>
+
</variablelist>
</refsect2>
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 6afdee0..a550141 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -169,7 +169,8 @@ CREATE VIEW pg_mv_stats AS
length(S.stamcv) AS mcvbytes,
pg_mv_stats_mcvlist_info(S.stamcv) AS mcvinfo,
length(S.stahist) AS histbytes,
- pg_mv_stats_histogram_info(S.stahist) AS histinfo
+ pg_mv_stats_histogram_info(S.stahist) AS histinfo,
+ standcoeff AS ndcoeff
FROM (pg_mv_statistic S JOIN pg_class C ON (C.oid = S.starelid))
LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace);
diff --git a/src/backend/commands/analyze.c b/src/backend/commands/analyze.c
index cbaa4e1..0f6db77 100644
--- a/src/backend/commands/analyze.c
+++ b/src/backend/commands/analyze.c
@@ -582,7 +582,7 @@ do_analyze_rel(Relation onerel, int options, VacuumParams *params,
}
/* Build multivariate stats (if there are any). */
- build_mv_stats(onerel, numrows, rows, attr_cnt, vacattrstats);
+ build_mv_stats(onerel, totalrows, numrows, rows, attr_cnt, vacattrstats);
}
/*
diff --git a/src/backend/commands/statscmds.c b/src/backend/commands/statscmds.c
index b974655..6ea0e13 100644
--- a/src/backend/commands/statscmds.c
+++ b/src/backend/commands/statscmds.c
@@ -138,7 +138,8 @@ CreateStatistics(CreateStatsStmt *stmt)
/* by default build nothing */
bool build_dependencies = false,
build_mcv = false,
- build_histogram = false;
+ build_histogram = false,
+ build_ndistinct = false;
int32 max_buckets = -1,
max_mcv_items = -1;
@@ -221,6 +222,8 @@ CreateStatistics(CreateStatsStmt *stmt)
if (strcmp(opt->defname, "dependencies") == 0)
build_dependencies = defGetBoolean(opt);
+ else if (strcmp(opt->defname, "ndistinct") == 0)
+ build_ndistinct = defGetBoolean(opt);
else if (strcmp(opt->defname, "mcv") == 0)
build_mcv = defGetBoolean(opt);
else if (strcmp(opt->defname, "max_mcv_items") == 0)
@@ -275,10 +278,10 @@ CreateStatistics(CreateStatsStmt *stmt)
}
/* check that at least some statistics were requested */
- if (! (build_dependencies || build_mcv || build_histogram))
+ if (! (build_dependencies || build_mcv || build_histogram || build_ndistinct))
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
- errmsg("no statistics type (dependencies, mcv, histogram) was requested")));
+ errmsg("no statistics type (dependencies, mcv, histogram, ndistinct) was requested")));
/* now do some checking of the options */
if (require_mcv && (! build_mcv))
@@ -311,6 +314,7 @@ CreateStatistics(CreateStatsStmt *stmt)
values[Anum_pg_mv_statistic_deps_enabled -1] = BoolGetDatum(build_dependencies);
values[Anum_pg_mv_statistic_mcv_enabled -1] = BoolGetDatum(build_mcv);
values[Anum_pg_mv_statistic_hist_enabled -1] = BoolGetDatum(build_histogram);
+ values[Anum_pg_mv_statistic_ndist_enabled-1] = BoolGetDatum(build_ndistinct);
values[Anum_pg_mv_statistic_mcv_max_items -1] = Int32GetDatum(max_mcv_items);
values[Anum_pg_mv_statistic_hist_max_buckets -1] = Int32GetDatum(max_buckets);
@@ -318,6 +322,7 @@ CreateStatistics(CreateStatsStmt *stmt)
nulls[Anum_pg_mv_statistic_stadeps -1] = true;
nulls[Anum_pg_mv_statistic_stamcv -1] = true;
nulls[Anum_pg_mv_statistic_stahist -1] = true;
+ nulls[Anum_pg_mv_statistic_standist -1] = true;
/* insert the tuple into pg_mv_statistic */
mvstatrel = heap_open(MvStatisticRelationId, RowExclusiveLock);
diff --git a/src/backend/optimizer/path/clausesel.c b/src/backend/optimizer/path/clausesel.c
index d239488..3c2aefd 100644
--- a/src/backend/optimizer/path/clausesel.c
+++ b/src/backend/optimizer/path/clausesel.c
@@ -59,6 +59,7 @@ static void addRangeClause(RangeQueryClause **rqlist, Node *clause,
#define MV_CLAUSE_TYPE_FDEP 0x01
#define MV_CLAUSE_TYPE_MCV 0x02
#define MV_CLAUSE_TYPE_HIST 0x04
+#define MV_CLAUSE_TYPE_NDIST 0x08
static bool clause_is_mv_compatible(Node *clause, Oid varRelid,
Index *relid, Bitmapset **attnums, SpecialJoinInfo *sjinfo,
@@ -2553,6 +2554,9 @@ has_stats(List *stats, int type)
if ((type & MV_CLAUSE_TYPE_HIST) && stat->hist_built)
return true;
+
+ if ((type & MV_CLAUSE_TYPE_NDIST) && stat->ndist_built)
+ return true;
}
return false;
diff --git a/src/backend/optimizer/util/plancat.c b/src/backend/optimizer/util/plancat.c
index d46aed2..bd2c306 100644
--- a/src/backend/optimizer/util/plancat.c
+++ b/src/backend/optimizer/util/plancat.c
@@ -416,7 +416,7 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
mvstat = (Form_pg_mv_statistic) GETSTRUCT(htup);
/* unavailable stats are not interesting for the planner */
- if (mvstat->deps_built || mvstat->mcv_built || mvstat->hist_built)
+ if (mvstat->deps_built || mvstat->mcv_built || mvstat->hist_built || mvstat->ndist_built)
{
info = makeNode(MVStatisticInfo);
@@ -427,11 +427,13 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
info->deps_enabled = mvstat->deps_enabled;
info->mcv_enabled = mvstat->mcv_enabled;
info->hist_enabled = mvstat->hist_enabled;
+ info->ndist_enabled = mvstat->ndist_enabled;
/* built/available statistics */
info->deps_built = mvstat->deps_built;
info->mcv_built = mvstat->mcv_built;
info->hist_built = mvstat->hist_built;
+ info->ndist_built = mvstat->ndist_built;
/* stakeys */
adatum = SysCacheGetAttr(MVSTATOID, htup,
diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index 7d0a3a1..a84dd2b 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -132,6 +132,7 @@
#include "utils/fmgroids.h"
#include "utils/index_selfuncs.h"
#include "utils/lsyscache.h"
+#include "utils/mvstats.h"
#include "utils/nabstime.h"
#include "utils/pg_locale.h"
#include "utils/rel.h"
@@ -206,6 +207,7 @@ static Const *string_to_const(const char *str, Oid datatype);
static Const *string_to_bytea_const(const char *str, size_t str_len);
static List *add_predicate_to_quals(IndexOptInfo *index, List *indexQuals);
+static Oid find_ndistinct_coeff(PlannerInfo *root, RelOptInfo *rel, List *varinfos);
/*
* eqsel - Selectivity of "=" for any data types.
@@ -3422,12 +3424,26 @@ estimate_num_groups(PlannerInfo *root, List *groupExprs, double input_rows,
* don't know by how much. We should never clamp to less than the
* largest ndistinct value for any of the Vars, though, since
* there will surely be at least that many groups.
+ *
+ * However we don't need to do this if we have ndistinct stats on
+ * the columns - in that case we can simply use the coefficient
+ * to get the (probably way more accurate) estimate.
+ *
+ * XXX Probably needs refactoring (I don't like mixing the clamp
+ * and the coefficient at the same time).
*/
double clamp = rel->tuples;
+ double coeff = 1.0;
if (relvarcount > 1)
{
- clamp *= 0.1;
+ Oid oid = find_ndistinct_coeff(root, rel, varinfos);
+
+ if (oid != InvalidOid)
+ coeff = load_mv_ndistinct(oid);
+ else
+ clamp *= 0.1;
+
if (clamp < relmaxndistinct)
{
clamp = relmaxndistinct;
@@ -3436,6 +3452,13 @@ estimate_num_groups(PlannerInfo *root, List *groupExprs, double input_rows,
clamp = rel->tuples;
}
}
+
+ /*
+ * Apply the ndistinct coefficient from multivariate stats (we must
+ * do this before clamping the estimate in any way).
+ */
+ reldistinct /= coeff;
+
if (reldistinct > clamp)
reldistinct = clamp;
@@ -7582,3 +7605,71 @@ brincostestimate(PlannerInfo *root, IndexPath *path, double loop_count,
/* XXX what about pages_per_range? */
}
+
+/*
+ * Find ndistinct statistics applicable to the grouping columns, used to
+ * correct the estimate (which is otherwise a simple product of per-column
+ * ndistinct values).
+ *
+ * Currently we only look for a perfect match, i.e. a single ndistinct
+ * statistics exactly matching all the grouped columns.
+ */
+static Oid
+find_ndistinct_coeff(PlannerInfo *root, RelOptInfo *rel, List *varinfos)
+{
+ ListCell *lc;
+ Bitmapset *attnums = NULL;
+ VariableStatData vardata;
+
+ foreach(lc, varinfos)
+ {
+ GroupVarInfo *varinfo = (GroupVarInfo *) lfirst(lc);
+
+ if (varinfo->rel != rel)
+ continue;
+
+ /* FIXME handle general expressions, not only plain Vars */
+
+ /*
+ * Examine the variable (or expression) so that we know which
+ * attribute we're dealing with - we need this for matching the
+ * ndistinct coefficient.
+ *
+ * FIXME We could probably remember this from estimate_num_groups.
+ */
+ examine_variable(root, varinfo->var, 0, &vardata);
+
+ if (HeapTupleIsValid(vardata.statsTuple))
+ {
+ Form_pg_statistic stats
+ = (Form_pg_statistic) GETSTRUCT(vardata.statsTuple);
+
+ attnums = bms_add_member(attnums, stats->staattnum);
+
+ ReleaseVariableStats(vardata);
+ }
+ }
+
+ /* look for a matching ndistinct statistics */
+ foreach (lc, rel->mvstatlist)
+ {
+ int i;
+ MVStatisticInfo *info = (MVStatisticInfo *)lfirst(lc);
+
+ /* skip statistics without ndistinct coefficient built */
+ if (!info->ndist_built)
+ continue;
+
+ /* only exact matches for now (same set of columns) */
+ if (bms_num_members(attnums) != info->stakeys->dim1)
+ continue;
+
+ /* check that all the columns of this statistics are grouped on */
+ for (i = 0; i < info->stakeys->dim1; i++)
+ if (!bms_is_member(info->stakeys->values[i], attnums))
+ break;
+
+ /* if the loop ran to completion, all the columns matched */
+ if (i == info->stakeys->dim1)
+ return info->mvoid;
+ }
+
+ return InvalidOid;
+}
diff --git a/src/backend/utils/mvstats/Makefile b/src/backend/utils/mvstats/Makefile
index 9dbb3b6..d4b88e9 100644
--- a/src/backend/utils/mvstats/Makefile
+++ b/src/backend/utils/mvstats/Makefile
@@ -12,6 +12,6 @@ subdir = src/backend/utils/mvstats
top_builddir = ../../../..
include $(top_builddir)/src/Makefile.global
-OBJS = common.o dependencies.o histogram.o mcv.o
+OBJS = common.o dependencies.o histogram.o mcv.o mvdist.o
include $(top_srcdir)/src/backend/common.mk
diff --git a/src/backend/utils/mvstats/README.ndistinct b/src/backend/utils/mvstats/README.ndistinct
new file mode 100644
index 0000000..32d1624
--- /dev/null
+++ b/src/backend/utils/mvstats/README.ndistinct
@@ -0,0 +1,83 @@
+ndistinct coefficients
+======================
+
+Estimating the number of distinct groups in a combination of columns is tricky,
+and the estimation error is often significant. By ndistinct coefficient we
+mean a ratio
+
+ q = ndistinct(a) * ndistinct(b) / ndistinct(a,b)
+
+where 'a' and 'b' are columns, and ndistinct(a) is (an estimate of) the number
+of distinct values in column 'a'. And ndistinct(a,b) is the same thing for the
+pair of columns.
+
+The meaning of the coefficient may be illustrated by answering the following
+question: Given a combination of columns (a,b), how many distinct values of 'b'
+match a chosen value of 'a' on average?
+
+Let's assume we know ndistinct(a) and ndistinct(a,b). Then the answer to the
+question clearly is
+
+ ndistinct(a,b) / ndistinct(a)
+
+and by using 'q' we may rewrite this as
+
+ ndistinct(b) / q
+
+so 'q' may be considered as a correction factor of the ndistinct estimate given
+a condition on one of the columns.
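+
+For example (made-up numbers): if 'a' and 'b' each have 100 distinct values
+and the columns are perfectly correlated, then ndistinct(a,b) = 100 and
+
+  q = (100 * 100) / 100 = 100
+
+so a value of 'a' matches ndistinct(b) / q = 1 distinct value of 'b' on
+average. For independent columns we'd get ndistinct(a,b) = 10000 and q = 1,
+i.e. each value of 'a' matches all 100 distinct values of 'b'.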
+
+This may be generalized to a combination of 'n' columns
+
+ [ndistinct(c1) * ... * ndistinct(cn)] / ndistinct(c1, ..., cn)
+
+and the meaning is very similar, except that we need to use conditions on (n-1)
+of the columns.
+
+
+Selectivity estimation
+----------------------
+
+As explained in the previous section, ndistinct coefficients may be used to
+estimate the cardinality of a column, given some a priori knowledge. Let's
+assume we need to estimate the selectivity of a condition
+
+ (a=1) AND (b=2)
+
+which we can expand like this
+
+ P(a=1 & b=2) = P(a=1) * P(b=2 | a=1)
+
+Let's also assume that the distributions are uniform, i.e. that
+
+ P(a=1) = 1/ndistinct(a)
+ P(b=2) = 1/ndistinct(b)
+ P(a=1 & b=2) = 1/ndistinct(a,b)
+
+ P(b=2 | a=1) = ndistinct(a) / ndistinct(a,b)
+
+which may be rewritten like
+
+ P(b=2 | a=1)
+ = ndistinct(a) / ndistinct(a,b)
+ = (1/ndistinct(b)) * [(ndistinct(a) * ndistinct(b)) / ndistinct(a,b)]
+ = (1/ndistinct(b)) * q
+
+and therefore
+
+ P(a=1 & b=2) = (1/ndistinct(a)) * (1/ndistinct(b)) * q
+
+This also illustrates 'q' as a correction coefficient.
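+
+Plugging in the numbers from the previous example (perfectly correlated
+columns with 100 distinct values each, i.e. q = 100):
+
+  P(a=1 & b=2) = (1/100) * (1/100) * 100 = 1/100
+
+which matches the expected 1/ndistinct(a,b) = 1/100, while the independence
+assumption alone would give 1/10000.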
+
+It also explains why we store the coefficient and not simply ndistinct(a,b).
+This way we can estimate the individual clauses and then correct
+the estimate by multiplying the result with 'q' - we don't have to mess with
+ndistinct estimates at all.
+
+Naturally, as the coefficient is derived from ndistinct(a,b), it may also be
+used to estimate GROUP BY clauses on the combination of columns, replacing the
+existing heuristics in estimate_num_groups().
+
+Note: Currently only the GROUP BY estimation is implemented. It's a bit unclear
+how to implement the clause estimation when there are other statistics (esp.
+MCV lists and/or functional dependencies) available.
diff --git a/src/backend/utils/mvstats/README.stats b/src/backend/utils/mvstats/README.stats
index d404914..6d4b09b 100644
--- a/src/backend/utils/mvstats/README.stats
+++ b/src/backend/utils/mvstats/README.stats
@@ -20,6 +20,8 @@ Currently we only have two kinds of multivariate statistics
(c) multivariate histograms (README.histogram)
+ (d) ndistinct coefficients
+
Compatible clause types
-----------------------
diff --git a/src/backend/utils/mvstats/common.c b/src/backend/utils/mvstats/common.c
index ffb76f4..2be980d 100644
--- a/src/backend/utils/mvstats/common.c
+++ b/src/backend/utils/mvstats/common.c
@@ -32,7 +32,8 @@ static List* list_mv_stats(Oid relid);
* and serializes them back into the catalog (as bytea values).
*/
void
-build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
+build_mv_stats(Relation onerel, double totalrows,
+ int numrows, HeapTuple *rows,
int natts, VacAttrStats **vacattrstats)
{
ListCell *lc;
@@ -53,6 +54,7 @@ build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
MVDependencies deps = NULL;
MCVList mcvlist = NULL;
MVHistogram histogram = NULL;
+ double ndist = -1;
int numrows_filtered = numrows;
VacAttrStats **stats = NULL;
@@ -92,6 +94,9 @@ build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
if (stat->deps_enabled)
deps = build_mv_dependencies(numrows, rows, attrs, stats);
+ if (stat->ndist_enabled)
+ ndist = build_mv_ndistinct(totalrows, numrows, rows, attrs, stats);
+
/* build the MCV list */
if (stat->mcv_enabled)
mcvlist = build_mv_mcvlist(numrows, rows, attrs, stats, &numrows_filtered);
@@ -101,7 +106,7 @@ build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
histogram = build_mv_histogram(numrows_filtered, rows, attrs, stats, numrows);
/* store the histogram / MCV list in the catalog */
- update_mv_stats(stat->mvoid, deps, mcvlist, histogram, attrs, stats);
+ update_mv_stats(stat->mvoid, deps, mcvlist, histogram, ndist, attrs, stats);
}
}
@@ -183,6 +188,8 @@ list_mv_stats(Oid relid)
info->mcv_built = stats->mcv_built;
info->hist_enabled = stats->hist_enabled;
info->hist_built = stats->hist_built;
+ info->ndist_enabled = stats->ndist_enabled;
+ info->ndist_built = stats->ndist_built;
result = lappend(result, info);
}
@@ -252,7 +259,7 @@ find_mv_attnums(Oid mvoid, Oid *relid)
void
update_mv_stats(Oid mvoid,
MVDependencies dependencies, MCVList mcvlist, MVHistogram histogram,
- int2vector *attrs, VacAttrStats **stats)
+ double ndistcoeff, int2vector *attrs, VacAttrStats **stats)
{
HeapTuple stup,
oldtup;
@@ -292,26 +299,36 @@ update_mv_stats(Oid mvoid,
= PointerGetDatum(data);
}
+ if (ndistcoeff > 1.0)
+ {
+ nulls[Anum_pg_mv_statistic_standist -1] = false;
+ values[Anum_pg_mv_statistic_standist-1] = Float8GetDatum(ndistcoeff);
+ }
+
/* always replace the value (either by bytea or NULL) */
replaces[Anum_pg_mv_statistic_stadeps -1] = true;
replaces[Anum_pg_mv_statistic_stamcv -1] = true;
replaces[Anum_pg_mv_statistic_stahist-1] = true;
+ replaces[Anum_pg_mv_statistic_standist-1] = true;
/* always change the availability flags */
nulls[Anum_pg_mv_statistic_deps_built -1] = false;
nulls[Anum_pg_mv_statistic_mcv_built -1] = false;
nulls[Anum_pg_mv_statistic_hist_built-1] = false;
+ nulls[Anum_pg_mv_statistic_ndist_built-1] = false;
nulls[Anum_pg_mv_statistic_stakeys-1] = false;
/* use the new attnums, in case we removed some dropped ones */
replaces[Anum_pg_mv_statistic_deps_built-1] = true;
replaces[Anum_pg_mv_statistic_mcv_built -1] = true;
+ replaces[Anum_pg_mv_statistic_ndist_built-1] = true;
replaces[Anum_pg_mv_statistic_hist_built -1] = true;
replaces[Anum_pg_mv_statistic_stakeys -1] = true;
values[Anum_pg_mv_statistic_deps_built-1] = BoolGetDatum(dependencies != NULL);
values[Anum_pg_mv_statistic_mcv_built -1] = BoolGetDatum(mcvlist != NULL);
values[Anum_pg_mv_statistic_hist_built -1] = BoolGetDatum(histogram != NULL);
+ values[Anum_pg_mv_statistic_ndist_built-1] = BoolGetDatum(ndistcoeff > 1.0);
values[Anum_pg_mv_statistic_stakeys -1] = PointerGetDatum(attrs);
/* Is there already a pg_mv_statistic tuple for this attribute? */
diff --git a/src/backend/utils/mvstats/mvdist.c b/src/backend/utils/mvstats/mvdist.c
new file mode 100644
index 0000000..59b8358
--- /dev/null
+++ b/src/backend/utils/mvstats/mvdist.c
@@ -0,0 +1,171 @@
+/*-------------------------------------------------------------------------
+ *
+ * mvdist.c
+ * POSTGRES multivariate distinct coefficients
+ *
+ *
+ * Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ * src/backend/utils/mvstats/mvdist.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include <math.h>
+
+#include "common.h"
+#include "utils/lsyscache.h"
+
+static double estimate_ndistinct(double totalrows, int numrows, int d, int f1);
+
+/*
+ * Compute ndistinct coefficient for the combination of attributes. This
+ * computes the ndistinct estimate using the same estimator used in analyze.c
+ * and then computes the coefficient.
+ */
+double
+build_mv_ndistinct(double totalrows, int numrows, HeapTuple *rows,
+ int2vector *attrs, VacAttrStats **stats)
+{
+ int i, j;
+ int f1, cnt, d;
+ int nmultiple = 0, summultiple = 0;
+ int numattrs = attrs->dim1;
+ MultiSortSupport mss = multi_sort_init(numattrs);
+ double ndistcoeff;
+
+ /*
+ * It's possible to sort the sample rows directly, but this seemed
+ * somewhat simpler / less error prone. Another option would be to
+ * allocate the arrays for each SortItem separately, but that'd be
+ * significant overhead (not just CPU, but especially memory bloat).
+ */
+ SortItem * items = (SortItem*)palloc0(numrows * sizeof(SortItem));
+
+ Datum *values = (Datum*)palloc0(sizeof(Datum) * numrows * numattrs);
+ bool *isnull = (bool*)palloc0(sizeof(bool) * numrows * numattrs);
+
+ for (i = 0; i < numrows; i++)
+ {
+ items[i].values = &values[i * numattrs];
+ items[i].isnull = &isnull[i * numattrs];
+ }
+
+ Assert(numattrs >= 2);
+
+ for (i = 0; i < numattrs; i++)
+ {
+ /* prepare the sort function for this dimension */
+ multi_sort_add_dimension(mss, i, i, stats);
+
+ /* accumulate the data for this dimension into the array */
+ for (j = 0; j < numrows; j++)
+ {
+ items[j].values[i]
+ = heap_getattr(rows[j], attrs->values[i],
+ stats[i]->tupDesc, &items[j].isnull[i]);
+ }
+ }
+
+ qsort_arg((void *) items, numrows, sizeof(SortItem),
+ multi_sort_compare, mss);
+
+ /* count number of distinct combinations */
+
+ f1 = 0;
+ cnt = 1;
+ d = 1;
+ for (i = 1; i < numrows; i++)
+ {
+ if (multi_sort_compare(&items[i], &items[i-1], mss) != 0)
+ {
+ if (cnt == 1)
+ f1 += 1;
+ else
+ {
+ nmultiple += 1;
+ summultiple += cnt;
+ }
+
+ d++;
+ cnt = 0;
+ }
+
+ cnt += 1;
+ }
+
+ if (cnt == 1)
+ f1 += 1;
+ else
+ {
+ nmultiple += 1;
+ summultiple += cnt;
+ }
+
+ ndistcoeff = 1 / estimate_ndistinct(totalrows, numrows, d, f1);
+
+ /*
+ * now use the per-attribute ndistinct estimates and incrementally
+ * compute q = (ndistinct(a) * ndistinct(b)) / ndistinct(a,b)
+ *
+ * FIXME Probably need to handle cases when one of the ndistinct
+ * estimates is negative, and also check that the combined
+ * ndistinct is greater than any of those partial values.
+ */
+ for (i = 0; i < numattrs; i++)
+ ndistcoeff *= stats[i]->stadistinct;
+
+ return ndistcoeff;
+}
+
+double
+load_mv_ndistinct(Oid mvoid)
+{
+ bool isnull = false;
+ Datum ndist;
+
+ /* Fetch the pg_mv_statistic tuple for the given statistics OID. */
+ HeapTuple htup = SearchSysCache1(MVSTATOID, ObjectIdGetDatum(mvoid));
+
+#ifdef USE_ASSERT_CHECKING
+ Form_pg_mv_statistic mvstat = (Form_pg_mv_statistic) GETSTRUCT(htup);
+ Assert(mvstat->ndist_enabled && mvstat->ndist_built);
+#endif
+
+ ndist = SysCacheGetAttr(MVSTATOID, htup,
+ Anum_pg_mv_statistic_standist, &isnull);
+
+ Assert(!isnull);
+
+ ReleaseSysCache(htup);
+
+ return DatumGetFloat8(ndist);
+}
+
+/* The Duj1 estimator (already used in analyze.c). */
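+/*
+ * A worked example with made-up numbers: with totalrows = 1,000,000, a
+ * sample of numrows = 30,000 rows, and d = 1,000 distinct combinations
+ * of which f1 = 500 were seen exactly once, we get
+ *
+ *   numer = 30000 * 1000 = 30,000,000
+ *   denom = (30000 - 500) + 500 * 30000 / 1000000 = 29500 + 15 = 29515
+ *
+ *   ndistinct = 30,000,000 / 29515 ~= 1016
+ */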
+static double
+estimate_ndistinct(double totalrows, int numrows, int d, int f1)
+{
+ double numer,
+ denom,
+ ndistinct;
+
+ numer = (double) numrows *(double) d;
+
+ denom = (double) (numrows - f1) +
+ (double) f1 * (double) numrows / totalrows;
+
+ ndistinct = numer / denom;
+
+ /* Clamp to sane range in case of roundoff error */
+ if (ndistinct < (double) d)
+ ndistinct = (double) d;
+
+ if (ndistinct > totalrows)
+ ndistinct = totalrows;
+
+ return floor(ndistinct + 0.5);
+}
diff --git a/src/include/catalog/pg_mv_statistic.h b/src/include/catalog/pg_mv_statistic.h
index a5945af..ee353da 100644
--- a/src/include/catalog/pg_mv_statistic.h
+++ b/src/include/catalog/pg_mv_statistic.h
@@ -39,6 +39,7 @@ CATALOG(pg_mv_statistic,3381)
bool deps_enabled; /* analyze dependencies? */
bool mcv_enabled; /* build MCV list? */
bool hist_enabled; /* build histogram? */
+ bool ndist_enabled; /* build ndist coefficient? */
/* histogram / MCV size */
int32 mcv_max_items; /* max MCV items */
@@ -48,6 +49,7 @@ CATALOG(pg_mv_statistic,3381)
bool deps_built; /* dependencies were built */
bool mcv_built; /* MCV list was built */
bool hist_built; /* histogram was built */
+ bool ndist_built; /* ndistinct coeff built */
/* variable-length fields start here, but we allow direct access to stakeys */
int2vector stakeys; /* array of column keys */
@@ -56,6 +58,7 @@ CATALOG(pg_mv_statistic,3381)
bytea stadeps; /* dependencies (serialized) */
bytea stamcv; /* MCV list (serialized) */
bytea stahist; /* MV histogram (serialized) */
+ float8 standcoeff; /* ndistinct coeff (serialized) */
#endif
} FormData_pg_mv_statistic;
@@ -71,21 +74,24 @@ typedef FormData_pg_mv_statistic *Form_pg_mv_statistic;
* compiler constants for pg_mv_statistic
* ----------------
*/
-#define Natts_pg_mv_statistic 15
+#define Natts_pg_mv_statistic 18
#define Anum_pg_mv_statistic_starelid 1
#define Anum_pg_mv_statistic_staname 2
#define Anum_pg_mv_statistic_stanamespace 3
#define Anum_pg_mv_statistic_deps_enabled 4
#define Anum_pg_mv_statistic_mcv_enabled 5
#define Anum_pg_mv_statistic_hist_enabled 6
-#define Anum_pg_mv_statistic_mcv_max_items 7
-#define Anum_pg_mv_statistic_hist_max_buckets 8
-#define Anum_pg_mv_statistic_deps_built 9
-#define Anum_pg_mv_statistic_mcv_built 10
-#define Anum_pg_mv_statistic_hist_built 11
-#define Anum_pg_mv_statistic_stakeys 12
-#define Anum_pg_mv_statistic_stadeps 13
-#define Anum_pg_mv_statistic_stamcv 14
-#define Anum_pg_mv_statistic_stahist 15
+#define Anum_pg_mv_statistic_ndist_enabled 7
+#define Anum_pg_mv_statistic_mcv_max_items 8
+#define Anum_pg_mv_statistic_hist_max_buckets 9
+#define Anum_pg_mv_statistic_deps_built 10
+#define Anum_pg_mv_statistic_mcv_built 11
+#define Anum_pg_mv_statistic_hist_built 12
+#define Anum_pg_mv_statistic_ndist_built 13
+#define Anum_pg_mv_statistic_stakeys 14
+#define Anum_pg_mv_statistic_stadeps 15
+#define Anum_pg_mv_statistic_stamcv 16
+#define Anum_pg_mv_statistic_stahist 17
+#define Anum_pg_mv_statistic_standist 18
#endif /* PG_MV_STATISTIC_H */
diff --git a/src/include/nodes/relation.h b/src/include/nodes/relation.h
index 46bece6..a2fafd2 100644
--- a/src/include/nodes/relation.h
+++ b/src/include/nodes/relation.h
@@ -621,11 +621,13 @@ typedef struct MVStatisticInfo
bool deps_enabled; /* functional dependencies enabled */
bool mcv_enabled; /* MCV list enabled */
bool hist_enabled; /* histogram enabled */
+ bool ndist_enabled; /* ndistinct coefficient enabled */
/* built/available statistics */
bool deps_built; /* functional dependencies built */
bool mcv_built; /* MCV list built */
bool hist_built; /* histogram built */
+ bool ndist_built; /* ndistinct coefficient built */
/* columns in the statistics (attnums) */
int2vector *stakeys; /* attnums of the columns covered */
diff --git a/src/include/utils/mvstats.h b/src/include/utils/mvstats.h
index 35b2f8e..fb2c5d8 100644
--- a/src/include/utils/mvstats.h
+++ b/src/include/utils/mvstats.h
@@ -225,6 +225,7 @@ typedef MVSerializedHistogramData *MVSerializedHistogram;
MVDependencies load_mv_dependencies(Oid mvoid);
MCVList load_mv_mcvlist(Oid mvoid);
MVSerializedHistogram load_mv_histogram(Oid mvoid);
+double load_mv_ndistinct(Oid mvoid);
bytea * serialize_mv_dependencies(MVDependencies dependencies);
bytea * serialize_mv_mcvlist(MCVList mcvlist, int2vector *attrs,
@@ -266,11 +267,17 @@ MVHistogram
build_mv_histogram(int numrows, HeapTuple *rows, int2vector *attrs,
VacAttrStats **stats, int numrows_total);
-void build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
+double
+build_mv_ndistinct(double totalrows, int numrows, HeapTuple *rows,
+ int2vector *attrs, VacAttrStats **stats);
+
+void build_mv_stats(Relation onerel, double totalrows,
+ int numrows, HeapTuple *rows,
int natts, VacAttrStats **vacattrstats);
void update_mv_stats(Oid relid, MVDependencies dependencies,
MCVList mcvlist, MVHistogram histogram,
+ double ndistcoeff,
int2vector *attrs, VacAttrStats **stats);
#ifdef DEBUG_MVHIST
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 1a1a4ca..0ad935e 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1377,7 +1377,8 @@ pg_mv_stats| SELECT n.nspname AS schemaname,
length(s.stamcv) AS mcvbytes,
pg_mv_stats_mcvlist_info(s.stamcv) AS mcvinfo,
length(s.stahist) AS histbytes,
- pg_mv_stats_histogram_info(s.stahist) AS histinfo
+ pg_mv_stats_histogram_info(s.stahist) AS histinfo,
+ s.standcoeff AS ndcoeff
FROM ((pg_mv_statistic s
JOIN pg_class c ON ((c.oid = s.starelid)))
LEFT JOIN pg_namespace n ON ((n.oid = c.relnamespace)));
--
2.1.0
0008-change-how-we-apply-selectivity-to-number-of-groups-.patch
From 050ab11a67b89383211c870e7d32259b1368f689 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@pgaddict.com>
Date: Tue, 26 Jan 2016 18:14:33 +0100
Subject: [PATCH 8/9] change how we apply selectivity to number of groups
estimate
Instead of simply multiplying the ndistinct estimate with selectivity,
we instead use the formula for the expected number of distinct values
observed in 'k' rows when there are 'd' distinct values in the bin
d * (1 - ((d - 1) / d)^k)
This is the 'with replacement' variant, which seems appropriate for this use,
and it mostly assumes a uniform distribution of the distinct values. So if the
distribution is not uniform (e.g. there are very frequent groups) this
may be less accurate than the current algorithm in some cases, giving
over-estimates. But that's probably better than OOM.
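
A quick sanity check of the formula: with d = 1000 distinct values in the bin
and k = 1000 rows selected, the expected number of observed groups is

    1000 * (1 - (999/1000)^1000) ~= 1000 * (1 - 0.368) ~= 632

rather than the full 1000.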
---
src/backend/utils/adt/selfuncs.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index a84dd2b..ce3ad19 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -3465,7 +3465,7 @@ estimate_num_groups(PlannerInfo *root, List *groupExprs, double input_rows,
/*
* Multiply by restriction selectivity.
*/
- reldistinct *= rel->rows / rel->tuples;
+ reldistinct = reldistinct * (1 - powl((reldistinct - 1) / reldistinct, rel->rows));
/*
* Update estimate of total distinct groups.
--
2.1.0
0009-fixup-of-regression-tests-plans-changes-by-group-by-.patch
From acb9b004e5e6a75e33f66b6d2f261f575fc515cb Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@pgaddict.com>
Date: Sun, 28 Feb 2016 21:16:40 +0100
Subject: [PATCH 9/9] fixup of regression tests (plans changes by group by
estimation)
---
src/test/regress/expected/join.out | 20 ++++++++++----------
src/test/regress/expected/subselect.out | 25 +++++++++++--------------
src/test/regress/expected/union.out | 16 ++++++++--------
3 files changed, 29 insertions(+), 32 deletions(-)
diff --git a/src/test/regress/expected/join.out b/src/test/regress/expected/join.out
index 59d7877..d9dd5ca 100644
--- a/src/test/regress/expected/join.out
+++ b/src/test/regress/expected/join.out
@@ -3951,17 +3951,17 @@ select d.* from d left join (select * from b group by b.id, b.c_id) s
on d.a = s.id;
QUERY PLAN
---------------------------------------
- Merge Left Join
- Merge Cond: (d.a = s.id)
- -> Sort
- Sort Key: d.a
- -> Seq Scan on d
+ Merge Right Join
+ Merge Cond: (s.id = d.a)
-> Sort
Sort Key: s.id
-> Subquery Scan on s
-> HashAggregate
Group Key: b.id
-> Seq Scan on b
+ -> Sort
+ Sort Key: d.a
+ -> Seq Scan on d
(11 rows)
-- similarly, but keying off a DISTINCT clause
@@ -3970,17 +3970,17 @@ select d.* from d left join (select distinct * from b) s
on d.a = s.id;
QUERY PLAN
---------------------------------------------
- Merge Left Join
- Merge Cond: (d.a = s.id)
- -> Sort
- Sort Key: d.a
- -> Seq Scan on d
+ Merge Right Join
+ Merge Cond: (s.id = d.a)
-> Sort
Sort Key: s.id
-> Subquery Scan on s
-> HashAggregate
Group Key: b.id, b.c_id
-> Seq Scan on b
+ -> Sort
+ Sort Key: d.a
+ -> Seq Scan on d
(11 rows)
-- check join removal works when uniqueness of the join condition is enforced
diff --git a/src/test/regress/expected/subselect.out b/src/test/regress/expected/subselect.out
index de64ca7..0fc93d9 100644
--- a/src/test/regress/expected/subselect.out
+++ b/src/test/regress/expected/subselect.out
@@ -807,27 +807,24 @@ select * from int4_tbl where
explain (verbose, costs off)
select * from int4_tbl o where (f1, f1) in
(select f1, generate_series(1,2) / 10 g from int4_tbl i group by f1);
- QUERY PLAN
-----------------------------------------------------------------------
- Hash Join
+ QUERY PLAN
+----------------------------------------------------------------
+ Hash Semi Join
Output: o.f1
Hash Cond: (o.f1 = "ANY_subquery".f1)
-> Seq Scan on public.int4_tbl o
Output: o.f1
-> Hash
Output: "ANY_subquery".f1, "ANY_subquery".g
- -> HashAggregate
+ -> Subquery Scan on "ANY_subquery"
Output: "ANY_subquery".f1, "ANY_subquery".g
- Group Key: "ANY_subquery".f1, "ANY_subquery".g
- -> Subquery Scan on "ANY_subquery"
- Output: "ANY_subquery".f1, "ANY_subquery".g
- Filter: ("ANY_subquery".f1 = "ANY_subquery".g)
- -> HashAggregate
- Output: i.f1, (generate_series(1, 2) / 10)
- Group Key: i.f1
- -> Seq Scan on public.int4_tbl i
- Output: i.f1
-(18 rows)
+ Filter: ("ANY_subquery".f1 = "ANY_subquery".g)
+ -> HashAggregate
+ Output: i.f1, (generate_series(1, 2) / 10)
+ Group Key: i.f1
+ -> Seq Scan on public.int4_tbl i
+ Output: i.f1
+(15 rows)
select * from int4_tbl o where (f1, f1) in
(select f1, generate_series(1,2) / 10 g from int4_tbl i group by f1);
diff --git a/src/test/regress/expected/union.out b/src/test/regress/expected/union.out
index 016571b..f2e297e 100644
--- a/src/test/regress/expected/union.out
+++ b/src/test/regress/expected/union.out
@@ -263,16 +263,16 @@ ORDER BY 1;
SELECT q2 FROM int8_tbl INTERSECT SELECT q1 FROM int8_tbl;
q2
------------------
- 4567890123456789
123
+ 4567890123456789
(2 rows)
SELECT q2 FROM int8_tbl INTERSECT ALL SELECT q1 FROM int8_tbl;
q2
------------------
+ 123
4567890123456789
4567890123456789
- 123
(3 rows)
SELECT q2 FROM int8_tbl EXCEPT SELECT q1 FROM int8_tbl ORDER BY 1;
@@ -305,16 +305,16 @@ SELECT q1 FROM int8_tbl EXCEPT SELECT q2 FROM int8_tbl;
SELECT q1 FROM int8_tbl EXCEPT ALL SELECT q2 FROM int8_tbl;
q1
------------------
- 4567890123456789
123
+ 4567890123456789
(2 rows)
SELECT q1 FROM int8_tbl EXCEPT ALL SELECT DISTINCT q2 FROM int8_tbl;
q1
------------------
+ 123
4567890123456789
4567890123456789
- 123
(3 rows)
SELECT q1 FROM int8_tbl EXCEPT ALL SELECT q1 FROM int8_tbl FOR NO KEY UPDATE;
@@ -343,8 +343,8 @@ SELECT f1 FROM float8_tbl EXCEPT SELECT f1 FROM int4_tbl ORDER BY 1;
SELECT q1 FROM int8_tbl INTERSECT SELECT q2 FROM int8_tbl UNION ALL SELECT q2 FROM int8_tbl;
q1
-------------------
- 4567890123456789
123
+ 4567890123456789
456
4567890123456789
123
@@ -355,15 +355,15 @@ SELECT q1 FROM int8_tbl INTERSECT SELECT q2 FROM int8_tbl UNION ALL SELECT q2 FR
SELECT q1 FROM int8_tbl INTERSECT (((SELECT q2 FROM int8_tbl UNION ALL SELECT q2 FROM int8_tbl)));
q1
------------------
- 4567890123456789
123
+ 4567890123456789
(2 rows)
(((SELECT q1 FROM int8_tbl INTERSECT SELECT q2 FROM int8_tbl))) UNION ALL SELECT q2 FROM int8_tbl;
q1
-------------------
- 4567890123456789
123
+ 4567890123456789
456
4567890123456789
123
@@ -419,8 +419,8 @@ HINT: There is a column named "q2" in table "*SELECT* 2", but it cannot be refe
SELECT q1 FROM int8_tbl EXCEPT (((SELECT q2 FROM int8_tbl ORDER BY q2 LIMIT 1)));
q1
------------------
- 4567890123456789
123
+ 4567890123456789
(2 rows)
--
--
2.1.0
On 2 March 2016 at 14:56, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
Hi,
Attached is v10 of the patch series. There are 9 parts at the moment:
0001-teach-pull_-varno-varattno-_walker-about-RestrictInf.patch
0002-shared-infrastructure-and-functional-dependencies.patch
0003-clause-reduction-using-functional-dependencies.patch
0004-multivariate-MCV-lists.patch
0005-multivariate-histograms.patch
0006-multi-statistics-estimation.patch
0007-multivariate-ndistinct-coefficients.patch
0008-change-how-we-apply-selectivity-to-number-of-groups-.patch
0009-fixup-of-regression-tests-plans-changes-by-group-by-.patch

However, the first one is still just a temporary workaround that I plan to
address next, and the last 3 are all dealing with the ndistinct coefficients
(and shall be squashed into a single chunk).
README docs
-----------

Aside from fixing a few bugs, there are several major improvements, the main one being that I've moved most of the comments explaining how it all works into a set of regular README files, located in src/backend/utils/mvstats:
1) README.stats - Overview of available types of statistics, what
clauses can be estimated, how multiple statistics are combined etc.
This is probably the right place to start.

2) docs for each type of statistics currently available
README.dependencies - soft functional dependencies
README.mcv - MCV lists
README.histogram - histograms
README.ndistinct - ndistinct coefficients

The READMEs are added and modified through the patch series, so the best thing to do is apply all the patches and start reading.
I have not improved the user-oriented SGML documentation in this patch, that's one of the tasks I'd like to work on next. But the READMEs should give you a good idea how it's supposed to work, and there are some examples of use in the regression tests.
Significantly simplified places
-------------------------------

This patch version also significantly simplifies several places that were needlessly complex in the previous ones - firstly, the function evaluating clauses on multivariate histograms was rather bloated, so I've simplified it a lot. Similarly for the code in clauselist_selectivity() that combines multiple statistics to estimate a list of clauses - that's much simpler now too. And various other pieces.
That being said, I still think the code in clausesel.c can be simplified. I feel there's a lot of cruft, mostly due to unknowingly implementing something that could be solved by an existing function.
A prime example of that is inspecting the expression tree to check if we know how to estimate the clauses using the multivariate statistics. That sounds like a nice match for expression walker, but currently is done by custom code. I plan to look at that next.
Also, I'm not quite sure I understand what the varRelid parameter of clauselist_selectivity is for, so the code may be handling that wrong (seems to be working though).
ndistinct coefficients
----------------------

The one new piece in this patch is the GROUP BY estimation, based on the ndistinct coefficients. So for example you can do this:
CREATE TABLE t AS SELECT mod(i,1000) AS a, mod(i,1000) AS b
FROM generate_series(1,1000000) s(i);
ANALYZE t;
EXPLAIN SELECT * FROM t GROUP BY a, b;

which currently does this:
QUERY PLAN
-----------------------------------------------------------------------
Group (cost=127757.34..135257.34 rows=99996 width=8)
Group Key: a, b
-> Sort (cost=127757.34..130257.34 rows=1000000 width=8)
Sort Key: a, b
-> Seq Scan on t (cost=0.00..14425.00 rows=1000000 width=8)
(5 rows)

but we know that there are only 1000 groups because the columns are correlated. So let's create ndistinct statistics on the two columns:
CREATE STATISTICS s1 ON t (a,b) WITH (ndistinct);
ANALYZE t;

which results in estimates like this:
QUERY PLAN
-----------------------------------------------------------------
HashAggregate (cost=19425.00..19435.00 rows=1000 width=8)
Group Key: a, b
-> Seq Scan on t (cost=0.00..14425.00 rows=1000000 width=8)
(3 rows)

I'm not quite sure how to combine this type of statistics with MCV lists and histograms, so for now it's used only for GROUP BY.
Well, firstly, the patches all apply.
But I have a question (which is coming really late, but I'll ask it
anyway). Is it intended that CREATE STATISTICS will only be for
multivariate statistics? Or do you think we could add support for
expression statistics in future too?
e.g.
CREATE STATISTICS stats_comment_length ON comments (length(comment));
I also note that the docs contain this:
CREATE STATISTICS [ IF NOT EXISTS ] statistics_name ON table_name ( [
{ column_name } ] [, ...])
[ WITH ( statistics_parameter [= value] [, ... ] )
The open square bracket before WITH doesn't get closed. Also, it
indicates that columns are entirely optional, so () would be valid, but
that's not the case. Also, a space is missing after the first
ellipsis. So I think this should read:
CREATE STATISTICS [ IF NOT EXISTS ] statistics_name ON table_name (
{ column_name } [, ... ])
[ WITH ( statistics_parameter [= value] [, ... ] ) ]
Regards
Thom
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Hi,
On 03/02/2016 05:17 PM, Thom Brown wrote:
...
Well, firstly, the patches all apply.
But I have a question (which is coming really late, but I'll ask it
anyway). Is it intended that CREATE STATISTICS will only be for
multivariate statistics? Or do you think we could add support for
expression statistics in future too?

e.g.
CREATE STATISTICS stats_comment_length ON comments (length(comment));
Hmmm, that's not a use case I had in mind while working on the patch,
but it sounds interesting. I don't see why the syntax would not support
this - I'd like to add support for expressions into the multivariate
patch, but that will still require at least 2 columns to build
multivariate statistics. But perhaps it'd be possible to relax the "at
least 2 columns" requirement, and collect regular statistics somewhere.
So I don't see why the syntax could not work for that case too, but I'm
not going to work on that.
I also note that the docs contain this:
CREATE STATISTICS [ IF NOT EXISTS ] statistics_name ON table_name ( [
{ column_name } ] [, ...])
[ WITH ( statistics_parameter [= value] [, ... ] )

The open square bracket doesn't get closed. Also, it
indicates that columns are entirely options, so () would be valid, but
that's not the case. Also, a space is missing after the first
ellipsis. So I think this should read:

CREATE STATISTICS [ IF NOT EXISTS ] statistics_name ON table_name (
{ column_name } [, ... ])
[ WITH ( statistics_parameter [= value] [, ... ] ) ]
Yeah, will fix.
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Hi,
attached is v11 of the patch - this is mostly a cleanup of v10, removing
redundant code, adding missing comments, removing obsolete FIXME/TODOs
and so on. Overall this shaves ~20kB from the patch (not a primary
objective, though).
The one thing this (hopefully) fixes is the handling of varRelid. Apparently
I got that slightly wrong in the previous versions.
One thing I'm not quite sure about is the schema of the new system catalog.
The existing catalog pg_statistic uses generic design with stakindN,
stanumbersN and stavaluesN columns, while the new catalog uses dedicated
columns for each type of stats (MCV, histogram, ...). Not sure whether
it's desirable to switch to the pg_statistic approach or not.
There are a few things I plan to look into next:
* possibly more cleanups in clausesel.c (I'm wondering if some pieces
should be moved to utils/mvstats/*.c)
* a few FIXMEs in the infrastructure (e.g. deriving a name when not
specified in CREATE STATISTICS)
* move the ndistinct coefficients after functional dependencies in
the patch series (but only use them for GROUP BY for now)
* extend the functional dependencies to handle multiple columns on
the left side (condition), i.e. dependencies like (a,b) -> c
* address a few remaining FIXMEs in MCV/histograms building
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachments:
0001-teach-pull_-varno-varattno-_walker-about-RestrictInf.patch
From 19defa4e8c1e578f3cf4099b0729357ecc333c5a Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@pgaddict.com>
Date: Tue, 28 Apr 2015 19:56:33 +0200
Subject: [PATCH 1/9] teach pull_(varno|varattno)_walker about RestrictInfo
otherwise pull_varnos fails when processing OR clauses
---
src/backend/optimizer/util/var.c | 16 ++++++++++++++++
1 file changed, 16 insertions(+)
diff --git a/src/backend/optimizer/util/var.c b/src/backend/optimizer/util/var.c
index dff52c4..80d01bd 100644
--- a/src/backend/optimizer/util/var.c
+++ b/src/backend/optimizer/util/var.c
@@ -197,6 +197,13 @@ pull_varnos_walker(Node *node, pull_varnos_context *context)
context->sublevels_up--;
return result;
}
+ if (IsA(node, RestrictInfo))
+ {
+ RestrictInfo *rinfo = (RestrictInfo*)node;
+ context->varnos = bms_add_members(context->varnos,
+ rinfo->clause_relids);
+ return false;
+ }
return expression_tree_walker(node, pull_varnos_walker,
(void *) context);
}
@@ -245,6 +252,15 @@ pull_varattnos_walker(Node *node, pull_varattnos_context *context)
return false;
}
+ if (IsA(node, RestrictInfo))
+ {
+ RestrictInfo *rinfo = (RestrictInfo *)node;
+
+ return expression_tree_walker((Node*)rinfo->clause,
+ pull_varattnos_walker,
+ (void*) context);
+ }
+
/* Should not find an unplanned subquery */
Assert(!IsA(node, Query));
--
2.1.0
0002-shared-infrastructure-and-functional-dependencies.patch
From 48412732b6e1c667fd6f0f7d025b941ad0e7c1c1 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tv@fuzzy.cz>
Date: Sun, 11 Jan 2015 19:51:48 +0100
Subject: [PATCH 2/9] shared infrastructure and functional dependencies
Basic infrastructure shared by all kinds of multivariate stats, most
importantly:
- adds a new system catalog (pg_mv_statistic)
- CREATE STATISTICS name ON table (columns) WITH (options)
- DROP STATISTICS name
- implementation of functional dependencies (the simplest type of
multivariate statistics)
- building functional dependencies in ANALYZE
- updates regression tests (new catalog etc.)
This does not include any changes to the optimizer, i.e. it does not
influence the query planning (subject to follow-up patches).
The current implementation requires a valid 'ltopr' for the columns, so
that we can sort the sample rows in various ways, both in this patch
and in other kinds of statistics. Maybe this restriction could be relaxed
in the future, requiring just 'eqopr' in case of stats not sorting the
data (e.g. functional dependencies and MCV lists).
Maybe some of the stats (functional dependencies and MCV list with
limited functionality) might be made to work with hashes of the values,
which is sufficient for equality comparisons. But the queries would
require the equality operator anyway, so it's not really a weaker
requirement. The hashes might reduce space requirements, though.
The algorithm detecting the dependencies is rather simple and probably
needs improvements, so that it detects more complicated dependencies,
and also validation of the math.
The name 'functional dependencies' is more correct (than 'association
rules') as it's exactly the name used in relational theory (esp. Normal
Forms) for tracking column-level dependencies.
The multivariate statistics are automatically removed in two situations
(a) after a DROP TABLE (obviously)
(b) after ALTER TABLE ... DROP COLUMN, if the statistics would be
defined on less than 2 columns (remaining)
If there are at least two remaining columns, we keep the
statistics but perform cleanup on the next ANALYZE. The dropped columns
are removed from stakeys, and the new statistics is built on the
smaller set.
We can't do this at DROP COLUMN, because that'd leave us with invalid
statistics, or we'd have to throw it away although we can still use it.
This lazy approach lets us use the statistics although some of the
columns are dead.
This also adds a simple list of statistics to \d in psql.
This means the statistics are created within a schema by using a
qualified name (or using the default schema)
CREATE STATISTICS schema.statistics ON ...
and then dropped by specifying qualified name
DROP STATISTICS schema.statistics
or searching through search_path (just like with other objects).
This also gets rid of the "(opt_)stats_name" definitions in gram.y and
instead replaces them with just "opt_any_name", although the optional
case is not really handled currently - there's no generated name yet
(so either we should drop it or implement it).
I'm not entirely sure making statistics schema-specific is such a great
idea. Maybe it should be "global", but that does not seem right (e.g.
it makes multi-tenant systems based on schemas more difficult to
manage, because tenants would interact).
---
doc/src/sgml/ref/allfiles.sgml | 2 +
doc/src/sgml/ref/create_statistics.sgml | 174 ++++++++++
doc/src/sgml/ref/drop_statistics.sgml | 90 ++++++
doc/src/sgml/reference.sgml | 2 +
src/backend/catalog/Makefile | 1 +
src/backend/catalog/dependency.c | 11 +-
src/backend/catalog/heap.c | 102 ++++++
src/backend/catalog/namespace.c | 51 +++
src/backend/catalog/objectaddress.c | 22 ++
src/backend/catalog/system_views.sql | 11 +
src/backend/commands/Makefile | 6 +-
src/backend/commands/analyze.c | 21 ++
src/backend/commands/dropcmds.c | 4 +
src/backend/commands/event_trigger.c | 3 +
src/backend/commands/statscmds.c | 331 +++++++++++++++++++
src/backend/commands/tablecmds.c | 8 +-
src/backend/nodes/copyfuncs.c | 16 +
src/backend/nodes/outfuncs.c | 18 ++
src/backend/optimizer/util/plancat.c | 63 ++++
src/backend/parser/gram.y | 34 +-
src/backend/tcop/utility.c | 11 +
src/backend/utils/Makefile | 2 +-
src/backend/utils/cache/relcache.c | 59 ++++
src/backend/utils/cache/syscache.c | 23 ++
src/backend/utils/mvstats/Makefile | 17 +
src/backend/utils/mvstats/README.dependencies | 222 +++++++++++++
src/backend/utils/mvstats/common.c | 356 +++++++++++++++++++++
src/backend/utils/mvstats/common.h | 75 +++++
src/backend/utils/mvstats/dependencies.c | 437 ++++++++++++++++++++++++++
src/bin/psql/describe.c | 44 +++
src/include/catalog/dependency.h | 5 +-
src/include/catalog/heap.h | 1 +
src/include/catalog/indexing.h | 7 +
src/include/catalog/namespace.h | 2 +
src/include/catalog/pg_mv_statistic.h | 73 +++++
src/include/catalog/pg_proc.h | 5 +
src/include/catalog/toasting.h | 1 +
src/include/commands/defrem.h | 4 +
src/include/nodes/nodes.h | 2 +
src/include/nodes/parsenodes.h | 12 +
src/include/nodes/relation.h | 28 ++
src/include/utils/mvstats.h | 70 +++++
src/include/utils/rel.h | 4 +
src/include/utils/relcache.h | 1 +
src/include/utils/syscache.h | 2 +
src/test/regress/expected/rules.out | 9 +
src/test/regress/expected/sanity_check.out | 1 +
47 files changed, 2432 insertions(+), 11 deletions(-)
create mode 100644 doc/src/sgml/ref/create_statistics.sgml
create mode 100644 doc/src/sgml/ref/drop_statistics.sgml
create mode 100644 src/backend/commands/statscmds.c
create mode 100644 src/backend/utils/mvstats/Makefile
create mode 100644 src/backend/utils/mvstats/README.dependencies
create mode 100644 src/backend/utils/mvstats/common.c
create mode 100644 src/backend/utils/mvstats/common.h
create mode 100644 src/backend/utils/mvstats/dependencies.c
create mode 100644 src/include/catalog/pg_mv_statistic.h
create mode 100644 src/include/utils/mvstats.h
diff --git a/doc/src/sgml/ref/allfiles.sgml b/doc/src/sgml/ref/allfiles.sgml
index bf95453..c0f7653 100644
--- a/doc/src/sgml/ref/allfiles.sgml
+++ b/doc/src/sgml/ref/allfiles.sgml
@@ -76,6 +76,7 @@ Complete list of usable sgml source files in this directory.
<!ENTITY createSchema SYSTEM "create_schema.sgml">
<!ENTITY createSequence SYSTEM "create_sequence.sgml">
<!ENTITY createServer SYSTEM "create_server.sgml">
+<!ENTITY createStatistics SYSTEM "create_statistics.sgml">
<!ENTITY createTable SYSTEM "create_table.sgml">
<!ENTITY createTableAs SYSTEM "create_table_as.sgml">
<!ENTITY createTableSpace SYSTEM "create_tablespace.sgml">
@@ -119,6 +120,7 @@ Complete list of usable sgml source files in this directory.
<!ENTITY dropSchema SYSTEM "drop_schema.sgml">
<!ENTITY dropSequence SYSTEM "drop_sequence.sgml">
<!ENTITY dropServer SYSTEM "drop_server.sgml">
+<!ENTITY dropStatistics SYSTEM "drop_statistics.sgml">
<!ENTITY dropTable SYSTEM "drop_table.sgml">
<!ENTITY dropTableSpace SYSTEM "drop_tablespace.sgml">
<!ENTITY dropTransform SYSTEM "drop_transform.sgml">
diff --git a/doc/src/sgml/ref/create_statistics.sgml b/doc/src/sgml/ref/create_statistics.sgml
new file mode 100644
index 0000000..a86eae3
--- /dev/null
+++ b/doc/src/sgml/ref/create_statistics.sgml
@@ -0,0 +1,174 @@
+<!--
+doc/src/sgml/ref/create_statistics.sgml
+PostgreSQL documentation
+-->
+
+<refentry id="SQL-CREATESTATISTICS">
+ <indexterm zone="sql-createstatistics">
+ <primary>CREATE STATISTICS</primary>
+ </indexterm>
+
+ <refmeta>
+ <refentrytitle>CREATE STATISTICS</refentrytitle>
+ <manvolnum>7</manvolnum>
+ <refmiscinfo>SQL - Language Statements</refmiscinfo>
+ </refmeta>
+
+ <refnamediv>
+ <refname>CREATE STATISTICS</refname>
+  <refpurpose>define new multivariate statistics</refpurpose>
+ </refnamediv>
+
+ <refsynopsisdiv>
+<synopsis>
+CREATE STATISTICS [ IF NOT EXISTS ] <replaceable class="PARAMETER">statistics_name</replaceable> ON <replaceable class="PARAMETER">table_name</replaceable>
+    ( <replaceable class="PARAMETER">column_name</replaceable> [, ...] )
+    [ WITH ( <replaceable class="PARAMETER">statistics_parameter</replaceable> [= <replaceable class="PARAMETER">value</replaceable>] [, ... ] ) ]
+</synopsis>
+
+ </refsynopsisdiv>
+
+ <refsect1 id="SQL-CREATESTATISTICS-description">
+ <title>Description</title>
+
+ <para>
+ <command>CREATE STATISTICS</command> will create a new multivariate
+   statistics on the table. The statistics will be created in the
+ current database. The statistics will be owned by the user issuing
+ the command.
+ </para>
+
+ <para>
+ If a schema name is given (for example, <literal>CREATE STATISTICS
+ myschema.mystat ...</>) then the statistics is created in the specified
+   schema. Otherwise it is created in the current schema. The name of
+   the statistics must be distinct from the name of any other statistics
+   in the same schema.
+ </para>
+
+ <para>
+   To be able to create statistics on a table, you must own that table.
+ </para>
+ </refsect1>
+
+ <refsect1>
+ <title>Parameters</title>
+
+ <variablelist>
+
+ <varlistentry>
+ <term><literal>IF NOT EXISTS</></term>
+ <listitem>
+ <para>
+ Do not throw an error if a statistics with the same name already exists.
+ A notice is issued in this case. Note that there is no guarantee that
+ the existing statistics is anything like the one that would have been
+ created.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><replaceable class="PARAMETER">statistics_name</replaceable></term>
+ <listitem>
+ <para>
+ The name (optionally schema-qualified) of the statistics to be created.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><replaceable class="PARAMETER">table_name</replaceable></term>
+ <listitem>
+ <para>
+ The name (optionally schema-qualified) of the table the statistics should
+ be created on.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><replaceable class="PARAMETER">column_name</replaceable></term>
+ <listitem>
+ <para>
+ The name of a column to be included in the statistics.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><literal>WITH ( <replaceable class="PARAMETER">statistics_parameter</replaceable> [= <replaceable class="PARAMETER">value</replaceable>] [, ... ] )</literal></term>
+ <listitem>
+ <para>
+ ...
+ </para>
+ </listitem>
+ </varlistentry>
+
+ </variablelist>
+
+ <refsect2 id="SQL-CREATESTATISTICS-parameters">
+ <title id="SQL-CREATESTATISTICS-parameters-title">Statistics Parameters</title>
+
+ <indexterm zone="sql-createstatistics-parameters">
+ <primary>statistics parameters</primary>
+ </indexterm>
+
+ <para>
+ The <literal>WITH</> clause can specify <firstterm>statistics parameters</>
+ for statistics. The currently available parameters are listed below.
+ </para>
+
+ <variablelist>
+
+ <varlistentry>
+ <term><literal>dependencies</> (<type>boolean</>)</term>
+ <listitem>
+ <para>
+ Enables functional dependencies for the statistics.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ </variablelist>
+
+ </refsect2>
+ </refsect1>
+
+ <refsect1 id="SQL-CREATESTATISTICS-notes">
+ <title>Notes</title>
+
+ <para>
+ ...
+ </para>
+
+ </refsect1>
+
+
+ <refsect1 id="SQL-CREATESTATISTICS-examples">
+ <title>Examples</title>
+
+ <para>
+ ...
+ </para>
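+
+  <para>
+   For example (a sketch only - the statistics, table and column names below
+   are illustrative), define statistics on two correlated columns and build
+   them with <command>ANALYZE</command>:
+<programlisting>
+CREATE STATISTICS stats_tab ON tab (a, b) WITH (dependencies = true);
+ANALYZE tab;
+</programlisting>
+  </para>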
+
+ </refsect1>
+
+ <refsect1>
+ <title>Compatibility</title>
+
+ <para>
+ There's no <command>CREATE STATISTICS</command> command in the SQL standard.
+ </para>
+ </refsect1>
+
+ <refsect1>
+ <title>See Also</title>
+
+ <simplelist type="inline">
+ <member><xref linkend="sql-dropstatistics"></member>
+ </simplelist>
+ </refsect1>
+</refentry>
diff --git a/doc/src/sgml/ref/drop_statistics.sgml b/doc/src/sgml/ref/drop_statistics.sgml
new file mode 100644
index 0000000..4cc0b70
--- /dev/null
+++ b/doc/src/sgml/ref/drop_statistics.sgml
@@ -0,0 +1,90 @@
+<!--
+doc/src/sgml/ref/drop_statistics.sgml
+PostgreSQL documentation
+-->
+
+<refentry id="SQL-DROPSTATISTICS">
+ <indexterm zone="sql-dropstatistics">
+ <primary>DROP STATISTICS</primary>
+ </indexterm>
+
+ <refmeta>
+ <refentrytitle>DROP STATISTICS</refentrytitle>
+ <manvolnum>7</manvolnum>
+ <refmiscinfo>SQL - Language Statements</refmiscinfo>
+ </refmeta>
+
+ <refnamediv>
+ <refname>DROP STATISTICS</refname>
+  <refpurpose>remove multivariate statistics</refpurpose>
+ </refnamediv>
+
+ <refsynopsisdiv>
+<synopsis>
+DROP STATISTICS [ IF EXISTS ] <replaceable class="PARAMETER">name</replaceable> [, ...]
+</synopsis>
+ </refsynopsisdiv>
+
+ <refsect1>
+ <title>Description</title>
+
+ <para>
+ <command>DROP STATISTICS</command> removes statistics from the database.
+   Only the statistics owner, the schema owner, and a superuser can drop
+   statistics.
+ </para>
+
+ </refsect1>
+
+ <refsect1>
+ <title>Parameters</title>
+
+ <variablelist>
+ <varlistentry>
+ <term><literal>IF EXISTS</literal></term>
+ <listitem>
+ <para>
+ Do not throw an error if the statistics does not exist. A notice is
+ issued in this case.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><replaceable class="PARAMETER">name</replaceable></term>
+ <listitem>
+ <para>
+ The name (optionally schema-qualified) of the statistics to drop.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ </variablelist>
+ </refsect1>
+
+ <refsect1>
+ <title>Examples</title>
+
+ <para>
+ ...
+ </para>
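+
+  <para>
+   For example, to remove hypothetical statistics named
+   <literal>stats_tab</literal>:
+<programlisting>
+DROP STATISTICS stats_tab;
+</programlisting>
+  </para>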
+
+ </refsect1>
+
+ <refsect1>
+ <title>Compatibility</title>
+
+ <para>
+ There's no <command>DROP STATISTICS</command> command in the SQL standard.
+ </para>
+ </refsect1>
+
+ <refsect1>
+ <title>See Also</title>
+
+ <simplelist type="inline">
+ <member><xref linkend="sql-createstatistics"></member>
+ </simplelist>
+ </refsect1>
+
+</refentry>
diff --git a/doc/src/sgml/reference.sgml b/doc/src/sgml/reference.sgml
index 03020df..2b07b2d 100644
--- a/doc/src/sgml/reference.sgml
+++ b/doc/src/sgml/reference.sgml
@@ -104,6 +104,7 @@
&createSchema;
&createSequence;
&createServer;
+ &createStatistics;
&createTable;
&createTableAs;
&createTableSpace;
@@ -147,6 +148,7 @@
&dropSchema;
&dropSequence;
&dropServer;
+ &dropStatistics;
&dropTable;
&dropTableSpace;
&dropTSConfig;
diff --git a/src/backend/catalog/Makefile b/src/backend/catalog/Makefile
index 25130ec..058b8a9 100644
--- a/src/backend/catalog/Makefile
+++ b/src/backend/catalog/Makefile
@@ -32,6 +32,7 @@ POSTGRES_BKI_SRCS = $(addprefix $(top_srcdir)/src/include/catalog/,\
pg_attrdef.h pg_constraint.h pg_inherits.h pg_index.h pg_operator.h \
pg_opfamily.h pg_opclass.h pg_am.h pg_amop.h pg_amproc.h \
pg_language.h pg_largeobject_metadata.h pg_largeobject.h pg_aggregate.h \
+ pg_mv_statistic.h \
pg_statistic.h pg_rewrite.h pg_trigger.h pg_event_trigger.h pg_description.h \
pg_cast.h pg_enum.h pg_namespace.h pg_conversion.h pg_depend.h \
pg_database.h pg_db_role_setting.h pg_tablespace.h pg_pltemplate.h \
diff --git a/src/backend/catalog/dependency.c b/src/backend/catalog/dependency.c
index c48e37b..8200454 100644
--- a/src/backend/catalog/dependency.c
+++ b/src/backend/catalog/dependency.c
@@ -40,6 +40,7 @@
#include "catalog/pg_foreign_server.h"
#include "catalog/pg_language.h"
#include "catalog/pg_largeobject.h"
+#include "catalog/pg_mv_statistic.h"
#include "catalog/pg_namespace.h"
#include "catalog/pg_opclass.h"
#include "catalog/pg_operator.h"
@@ -160,7 +161,8 @@ static const Oid object_classes[] = {
ExtensionRelationId, /* OCLASS_EXTENSION */
EventTriggerRelationId, /* OCLASS_EVENT_TRIGGER */
PolicyRelationId, /* OCLASS_POLICY */
- TransformRelationId /* OCLASS_TRANSFORM */
+ TransformRelationId, /* OCLASS_TRANSFORM */
+ MvStatisticRelationId /* OCLASS_STATISTICS */
};
@@ -1272,6 +1274,10 @@ doDeletion(const ObjectAddress *object, int flags)
DropTransformById(object->objectId);
break;
+ case OCLASS_STATISTICS:
+ RemoveStatisticsById(object->objectId);
+ break;
+
default:
elog(ERROR, "unrecognized object class: %u",
object->classId);
@@ -2415,6 +2421,9 @@ getObjectClass(const ObjectAddress *object)
case TransformRelationId:
return OCLASS_TRANSFORM;
+
+ case MvStatisticRelationId:
+ return OCLASS_STATISTICS;
}
/* shouldn't get here */
diff --git a/src/backend/catalog/heap.c b/src/backend/catalog/heap.c
index 6a4a9d9..e7d9aaa 100644
--- a/src/backend/catalog/heap.c
+++ b/src/backend/catalog/heap.c
@@ -47,6 +47,7 @@
#include "catalog/pg_constraint_fn.h"
#include "catalog/pg_foreign_table.h"
#include "catalog/pg_inherits.h"
+#include "catalog/pg_mv_statistic.h"
#include "catalog/pg_namespace.h"
#include "catalog/pg_statistic.h"
#include "catalog/pg_tablespace.h"
@@ -1613,7 +1614,10 @@ RemoveAttributeById(Oid relid, AttrNumber attnum)
heap_close(attr_rel, RowExclusiveLock);
if (attnum > 0)
+ {
RemoveStatistics(relid, attnum);
+ RemoveMVStatistics(relid, attnum);
+ }
relation_close(rel, NoLock);
}
@@ -1841,6 +1845,11 @@ heap_drop_with_catalog(Oid relid)
RemoveStatistics(relid, 0);
/*
+ * delete multi-variate statistics
+ */
+ RemoveMVStatistics(relid, 0);
+
+ /*
* delete attribute tuples
*/
DeleteAttributeTuples(relid);
@@ -2696,6 +2705,99 @@ RemoveStatistics(Oid relid, AttrNumber attnum)
/*
+ * RemoveMVStatistics --- remove entries in pg_mv_statistic for a rel
+ *
+ * If attnum is zero, remove all entries for rel; else remove only the one(s)
+ * for that column.
+ */
+void
+RemoveMVStatistics(Oid relid, AttrNumber attnum)
+{
+ Relation pgmvstatistic;
+ TupleDesc tupdesc = NULL;
+ SysScanDesc scan;
+ ScanKeyData key;
+ HeapTuple tuple;
+
+ /*
+	 * When dropping a column, we'll also drop statistics that would be
+	 * left with just a single remaining (undropped) column. To check
+	 * that, we need the tuple descriptor.
+ *
+ * We already have the relation locked (as we're running ALTER
+ * TABLE ... DROP COLUMN), so we'll just get the descriptor here.
+ */
+ if (attnum != 0)
+ {
+ Relation rel = relation_open(relid, NoLock);
+
+ /* multivariate stats are supported on tables and matviews */
+ if (rel->rd_rel->relkind == RELKIND_RELATION ||
+ rel->rd_rel->relkind == RELKIND_MATVIEW)
+ tupdesc = RelationGetDescr(rel);
+
+ relation_close(rel, NoLock);
+ }
+
+	/* with (attnum == 0) we're removing all statistics for the relation */
+	if ((attnum != 0) && (tupdesc == NULL))
+		return;
+
+ pgmvstatistic = heap_open(MvStatisticRelationId, RowExclusiveLock);
+
+ ScanKeyInit(&key,
+ Anum_pg_mv_statistic_starelid,
+ BTEqualStrategyNumber, F_OIDEQ,
+ ObjectIdGetDatum(relid));
+
+ scan = systable_beginscan(pgmvstatistic,
+ MvStatisticRelidIndexId,
+ true, NULL, 1, &key);
+
+ /* we must loop even when attnum != 0, in case of inherited stats */
+ while (HeapTupleIsValid(tuple = systable_getnext(scan)))
+ {
+ bool delete = true;
+
+ if (attnum != 0)
+ {
+ Datum adatum;
+ bool isnull;
+ int i;
+ int ncolumns = 0;
+ ArrayType *arr;
+ int16 *attnums;
+
+ /* get the columns */
+ adatum = SysCacheGetAttr(MVSTATOID, tuple,
+ Anum_pg_mv_statistic_stakeys, &isnull);
+ Assert(!isnull);
+
+ arr = DatumGetArrayTypeP(adatum);
+ attnums = (int16*)ARR_DATA_PTR(arr);
+
+ for (i = 0; i < ARR_DIMS(arr)[0]; i++)
+ {
+				/* count the column unless it has been / is being dropped */
+ if ((! tupdesc->attrs[attnums[i]-1]->attisdropped) &&
+ (attnums[i] != attnum))
+ ncolumns += 1;
+ }
+
+			/* delete if fewer than two columns remain */
+ delete = (ncolumns < 2);
+ }
+
+ if (delete)
+ simple_heap_delete(pgmvstatistic, &tuple->t_self);
+ }
+
+ systable_endscan(scan);
+
+ heap_close(pgmvstatistic, RowExclusiveLock);
+}
+
+
+/*
* RelationTruncateIndexes - truncate all indexes associated
* with the heap relation to zero tuples.
*
diff --git a/src/backend/catalog/namespace.c b/src/backend/catalog/namespace.c
index 446b2ac..dfd5bef 100644
--- a/src/backend/catalog/namespace.c
+++ b/src/backend/catalog/namespace.c
@@ -4201,3 +4201,54 @@ pg_is_other_temp_schema(PG_FUNCTION_ARGS)
PG_RETURN_BOOL(isOtherTempNamespace(oid));
}
+
+Oid
+get_statistics_oid(List *names, bool missing_ok)
+{
+ char *schemaname;
+ char *stats_name;
+ Oid namespaceId;
+ Oid stats_oid = InvalidOid;
+ ListCell *l;
+
+ /* deconstruct the name list */
+ DeconstructQualifiedName(names, &schemaname, &stats_name);
+
+ if (schemaname)
+ {
+ /* use exact schema given */
+ namespaceId = LookupExplicitNamespace(schemaname, missing_ok);
+ if (missing_ok && !OidIsValid(namespaceId))
+ stats_oid = InvalidOid;
+ else
+ stats_oid = GetSysCacheOid2(MVSTATNAMENSP,
+ PointerGetDatum(stats_name),
+ ObjectIdGetDatum(namespaceId));
+ }
+ else
+ {
+ /* search for it in search path */
+ recomputeNamespacePath();
+
+ foreach(l, activeSearchPath)
+ {
+ namespaceId = lfirst_oid(l);
+
+ if (namespaceId == myTempNamespace)
+ continue; /* do not look in temp namespace */
+ stats_oid = GetSysCacheOid2(MVSTATNAMENSP,
+ PointerGetDatum(stats_name),
+ ObjectIdGetDatum(namespaceId));
+ if (OidIsValid(stats_oid))
+ break;
+ }
+ }
+
+ if (!OidIsValid(stats_oid) && !missing_ok)
+ ereport(ERROR,
+ (errcode(ERRCODE_UNDEFINED_OBJECT),
+ errmsg("statistics \"%s\" does not exist",
+ NameListToString(names))));
+
+ return stats_oid;
+}
diff --git a/src/backend/catalog/objectaddress.c b/src/backend/catalog/objectaddress.c
index d2aaa6d..3a6a0b0 100644
--- a/src/backend/catalog/objectaddress.c
+++ b/src/backend/catalog/objectaddress.c
@@ -39,6 +39,7 @@
#include "catalog/pg_language.h"
#include "catalog/pg_largeobject.h"
#include "catalog/pg_largeobject_metadata.h"
+#include "catalog/pg_mv_statistic.h"
#include "catalog/pg_namespace.h"
#include "catalog/pg_opclass.h"
#include "catalog/pg_opfamily.h"
@@ -438,9 +439,22 @@ static const ObjectPropertyType ObjectProperty[] =
Anum_pg_type_typacl,
ACL_KIND_TYPE,
true
+ },
+ {
+ MvStatisticRelationId,
+ MvStatisticOidIndexId,
+ MVSTATOID,
+ MVSTATNAMENSP,
+ Anum_pg_mv_statistic_staname,
+ Anum_pg_mv_statistic_stanamespace,
+ InvalidAttrNumber, /* XXX same owner as relation */
+ InvalidAttrNumber, /* no ACL (same as relation) */
+ -1, /* no ACL */
+ true
}
};
+
/*
* This struct maps the string object types as returned by
* getObjectTypeDescription into ObjType enum values. Note that some enum
@@ -913,6 +927,11 @@ get_object_address(ObjectType objtype, List *objname, List *objargs,
address = get_object_address_defacl(objname, objargs,
missing_ok);
break;
+ case OBJECT_STATISTICS:
+ address.classId = MvStatisticRelationId;
+ address.objectId = get_statistics_oid(objname, missing_ok);
+ address.objectSubId = 0;
+ break;
default:
elog(ERROR, "unrecognized objtype: %d", (int) objtype);
/* placate compiler, in case it thinks elog might return */
@@ -2185,6 +2204,9 @@ check_object_ownership(Oid roleid, ObjectType objtype, ObjectAddress address,
(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
errmsg("must be superuser")));
break;
+ case OBJECT_STATISTICS:
+ /* FIXME do the right owner checks here */
+ break;
default:
elog(ERROR, "unrecognized object type: %d",
(int) objtype);
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index abf9a70..b8a264e 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -158,6 +158,17 @@ CREATE VIEW pg_indexes AS
LEFT JOIN pg_tablespace T ON (T.oid = I.reltablespace)
WHERE C.relkind IN ('r', 'm') AND I.relkind = 'i';
+CREATE VIEW pg_mv_stats AS
+ SELECT
+ N.nspname AS schemaname,
+ C.relname AS tablename,
+ S.staname AS staname,
+ S.stakeys AS attnums,
+ length(S.stadeps) as depsbytes,
+ pg_mv_stats_dependencies_info(S.stadeps) as depsinfo
+ FROM (pg_mv_statistic S JOIN pg_class C ON (C.oid = S.starelid))
+ LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace);
+
CREATE VIEW pg_stats WITH (security_barrier) AS
SELECT
nspname AS schemaname,
diff --git a/src/backend/commands/Makefile b/src/backend/commands/Makefile
index b1ac704..5151001 100644
--- a/src/backend/commands/Makefile
+++ b/src/backend/commands/Makefile
@@ -18,8 +18,8 @@ OBJS = aggregatecmds.o alter.o analyze.o async.o cluster.o comment.o \
event_trigger.o explain.o extension.o foreigncmds.o functioncmds.o \
indexcmds.o lockcmds.o matview.o operatorcmds.o opclasscmds.o \
policy.o portalcmds.o prepare.o proclang.o \
- schemacmds.o seclabel.o sequence.o tablecmds.o tablespace.o trigger.o \
- tsearchcmds.o typecmds.o user.o vacuum.o vacuumlazy.o \
- variable.o view.o
+ schemacmds.o seclabel.o sequence.o statscmds.o \
+ tablecmds.o tablespace.o trigger.o tsearchcmds.o typecmds.o \
+ user.o vacuum.o vacuumlazy.o variable.o view.o
include $(top_srcdir)/src/backend/common.mk
diff --git a/src/backend/commands/analyze.c b/src/backend/commands/analyze.c
index 070df29..cbaa4e1 100644
--- a/src/backend/commands/analyze.c
+++ b/src/backend/commands/analyze.c
@@ -27,6 +27,7 @@
#include "catalog/indexing.h"
#include "catalog/pg_collation.h"
#include "catalog/pg_inherits_fn.h"
+#include "catalog/pg_mv_statistic.h"
#include "catalog/pg_namespace.h"
#include "commands/dbcommands.h"
#include "commands/tablecmds.h"
@@ -55,7 +56,11 @@
#include "utils/syscache.h"
#include "utils/timestamp.h"
#include "utils/tqual.h"
+#include "utils/fmgroids.h"
+#include "utils/builtins.h"
+#include "utils/mvstats.h"
+#include "access/sysattr.h"
/* Per-index data for ANALYZE */
typedef struct AnlIndexData
@@ -460,6 +465,19 @@ do_analyze_rel(Relation onerel, int options, VacuumParams *params,
* all analyzable columns. We use a lower bound of 100 rows to avoid
* possible overflow in Vitter's algorithm. (Note: that will also be the
* target in the corner case where there are no analyzable columns.)
+ *
+	 * FIXME This sample sizing is mostly OK when computing stats for
+	 * individual columns, but for multivariate stats (histograms,
+	 * MCV lists, ...) it's rather insufficient. For stats on multiple
+	 * columns we need larger samples, because we build more detailed
+	 * stats (more MCV items / histogram buckets) to get good accuracy.
+	 * Maybe a sample proportional to the table size (say, 0.5% - 1%)
+	 * would be more appropriate than a fixed size. Also, the sample
+	 * size should be bound to the requested statistics size - e.g.
+	 * the number of MCV items or histogram buckets should require
+	 * several sample rows per item/bucket (so a sample of k*size).
*/
targrows = 100;
for (i = 0; i < attr_cnt; i++)
@@ -562,6 +580,9 @@ do_analyze_rel(Relation onerel, int options, VacuumParams *params,
update_attstats(RelationGetRelid(Irel[ind]), false,
thisdata->attr_cnt, thisdata->vacattrstats);
}
+
+ /* Build multivariate stats (if there are any). */
+ build_mv_stats(onerel, numrows, rows, attr_cnt, vacattrstats);
}
/*
diff --git a/src/backend/commands/dropcmds.c b/src/backend/commands/dropcmds.c
index 522027a..cd65b58 100644
--- a/src/backend/commands/dropcmds.c
+++ b/src/backend/commands/dropcmds.c
@@ -292,6 +292,10 @@ does_not_exist_skipping(ObjectType objtype, List *objname, List *objargs)
msg = gettext_noop("schema \"%s\" does not exist, skipping");
name = NameListToString(objname);
break;
+ case OBJECT_STATISTICS:
+ msg = gettext_noop("statistics \"%s\" does not exist, skipping");
+ name = NameListToString(objname);
+ break;
case OBJECT_TSPARSER:
if (!schema_does_not_exist_skipping(objname, &msg, &name))
{
diff --git a/src/backend/commands/event_trigger.c b/src/backend/commands/event_trigger.c
index 9e32f8d..09061bb 100644
--- a/src/backend/commands/event_trigger.c
+++ b/src/backend/commands/event_trigger.c
@@ -110,6 +110,7 @@ static event_trigger_support_data event_trigger_support[] = {
{"SCHEMA", true},
{"SEQUENCE", true},
{"SERVER", true},
+ {"STATISTICS", true},
{"TABLE", true},
{"TABLESPACE", false},
{"TRANSFORM", true},
@@ -1106,6 +1107,7 @@ EventTriggerSupportsObjectType(ObjectType obtype)
case OBJECT_RULE:
case OBJECT_SCHEMA:
case OBJECT_SEQUENCE:
+ case OBJECT_STATISTICS:
case OBJECT_TABCONSTRAINT:
case OBJECT_TABLE:
case OBJECT_TRANSFORM:
@@ -1167,6 +1169,7 @@ EventTriggerSupportsObjectClass(ObjectClass objclass)
case OCLASS_DEFACL:
case OCLASS_EXTENSION:
case OCLASS_POLICY:
+ case OCLASS_STATISTICS:
return true;
}
diff --git a/src/backend/commands/statscmds.c b/src/backend/commands/statscmds.c
new file mode 100644
index 0000000..84a8b13
--- /dev/null
+++ b/src/backend/commands/statscmds.c
@@ -0,0 +1,331 @@
+/*-------------------------------------------------------------------------
+ *
+ * statscmds.c
+ * Commands for creating and altering multivariate statistics
+ *
+ * Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ * src/backend/commands/statscmds.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/genam.h"
+#include "access/heapam.h"
+#include "access/multixact.h"
+#include "access/reloptions.h"
+#include "access/relscan.h"
+#include "access/sysattr.h"
+#include "access/xact.h"
+#include "access/xlog.h"
+#include "catalog/catalog.h"
+#include "catalog/dependency.h"
+#include "catalog/heap.h"
+#include "catalog/index.h"
+#include "catalog/indexing.h"
+#include "catalog/namespace.h"
+#include "catalog/objectaccess.h"
+#include "catalog/pg_collation.h"
+#include "catalog/pg_constraint.h"
+#include "catalog/pg_depend.h"
+#include "catalog/pg_foreign_table.h"
+#include "catalog/pg_inherits.h"
+#include "catalog/pg_inherits_fn.h"
+#include "catalog/pg_mv_statistic.h"
+#include "catalog/pg_namespace.h"
+#include "catalog/pg_opclass.h"
+#include "catalog/pg_tablespace.h"
+#include "catalog/pg_trigger.h"
+#include "catalog/pg_type.h"
+#include "catalog/pg_type_fn.h"
+#include "catalog/storage.h"
+#include "catalog/toasting.h"
+#include "commands/cluster.h"
+#include "commands/comment.h"
+#include "commands/defrem.h"
+#include "commands/event_trigger.h"
+#include "commands/policy.h"
+#include "commands/sequence.h"
+#include "commands/tablecmds.h"
+#include "commands/tablespace.h"
+#include "commands/trigger.h"
+#include "commands/typecmds.h"
+#include "commands/user.h"
+#include "executor/executor.h"
+#include "foreign/foreign.h"
+#include "miscadmin.h"
+#include "nodes/makefuncs.h"
+#include "nodes/nodeFuncs.h"
+#include "nodes/parsenodes.h"
+#include "optimizer/clauses.h"
+#include "optimizer/planner.h"
+#include "parser/parse_clause.h"
+#include "parser/parse_coerce.h"
+#include "parser/parse_collate.h"
+#include "parser/parse_expr.h"
+#include "parser/parse_oper.h"
+#include "parser/parse_relation.h"
+#include "parser/parse_type.h"
+#include "parser/parse_utilcmd.h"
+#include "parser/parser.h"
+#include "pgstat.h"
+#include "rewrite/rewriteDefine.h"
+#include "rewrite/rewriteHandler.h"
+#include "rewrite/rewriteManip.h"
+#include "storage/bufmgr.h"
+#include "storage/lmgr.h"
+#include "storage/lock.h"
+#include "storage/predicate.h"
+#include "storage/smgr.h"
+#include "utils/acl.h"
+#include "utils/builtins.h"
+#include "utils/fmgroids.h"
+#include "utils/inval.h"
+#include "utils/lsyscache.h"
+#include "utils/memutils.h"
+#include "utils/relcache.h"
+#include "utils/ruleutils.h"
+#include "utils/snapmgr.h"
+#include "utils/syscache.h"
+#include "utils/tqual.h"
+#include "utils/typcache.h"
+#include "utils/mvstats.h"
+
+
+/* used for sorting the attnums in CreateStatistics */
+static int
+compare_int16(const void *a, const void *b)
+{
+	/* don't use memcmp - byte order is not value order on little-endian */
+	return (int) *(const int16 *) a - (int) *(const int16 *) b;
+}
+
+/*
+ * Implements the CREATE STATISTICS name ON table (columns) WITH (options)
+ *
+ * TODO Check that the types support sort, although maybe we can live
+ * without it (and only build MCV list / association rules).
+ *
+ * TODO This should probably check for duplicate stats (i.e. same
+ * keys, same options). Although maybe it's useful to have
+ * multiple stats on the same columns with different options
+ * (say, a detailed MCV-only stats for some queries, histogram
+ * for others, etc.)
+ */
+ObjectAddress
+CreateStatistics(CreateStatsStmt *stmt)
+{
+ int i, j;
+ ListCell *l;
+ int16 attnums[INDEX_MAX_KEYS];
+ int numcols = 0;
+ ObjectAddress address = InvalidObjectAddress;
+ char *namestr;
+ NameData staname;
+ Oid statoid;
+ Oid namespaceId;
+
+ HeapTuple htup;
+ Datum values[Natts_pg_mv_statistic];
+ bool nulls[Natts_pg_mv_statistic];
+ int2vector *stakeys;
+ Relation mvstatrel;
+ Relation rel;
+ ObjectAddress parentobject, childobject;
+
+ /* by default build nothing */
+ bool build_dependencies = false;
+
+ Assert(IsA(stmt, CreateStatsStmt));
+
+ /* resolve the pieces of the name (namespace etc.) */
+ namespaceId = QualifiedNameGetCreationNamespace(stmt->defnames, &namestr);
+ namestrcpy(&staname, namestr);
+
+ /*
+ * If if_not_exists was given and the statistics already exists, bail out.
+ */
+ if (stmt->if_not_exists &&
+ SearchSysCacheExists2(MVSTATNAMENSP,
+ PointerGetDatum(&staname),
+ ObjectIdGetDatum(namespaceId)))
+ {
+ ereport(NOTICE,
+ (errcode(ERRCODE_DUPLICATE_OBJECT),
+ errmsg("statistics \"%s\" already exists, skipping",
+ namestr)));
+ return InvalidObjectAddress;
+ }
+
+ rel = heap_openrv(stmt->relation, AccessExclusiveLock);
+
+ /* transform the column names to attnum values */
+
+ foreach(l, stmt->keys)
+ {
+ char *attname = strVal(lfirst(l));
+ HeapTuple atttuple;
+
+ atttuple = SearchSysCacheAttName(RelationGetRelid(rel), attname);
+
+ if (!HeapTupleIsValid(atttuple))
+ ereport(ERROR,
+ (errcode(ERRCODE_UNDEFINED_COLUMN),
+ errmsg("column \"%s\" referenced in statistics does not exist",
+ attname)));
+
+		/* more than MVSTATS_MAX_DIMENSIONS columns not allowed */
+ if (numcols >= MVSTATS_MAX_DIMENSIONS)
+ ereport(ERROR,
+ (errcode(ERRCODE_TOO_MANY_COLUMNS),
+ errmsg("cannot have more than %d keys in a statistics",
+ MVSTATS_MAX_DIMENSIONS)));
+
+ attnums[numcols] = ((Form_pg_attribute) GETSTRUCT(atttuple))->attnum;
+ ReleaseSysCache(atttuple);
+ numcols++;
+ }
+
+ /*
+	 * Check the lower bound (at least 2 columns); the upper bound was
+	 * already checked in the loop above.
+ */
+ if (numcols < 2)
+ ereport(ERROR,
+ (errcode(ERRCODE_TOO_MANY_COLUMNS),
+ errmsg("multivariate stats require 2 or more columns")));
+
+	/* look for duplicate columns */
+ for (i = 0; i < numcols; i++)
+ for (j = 0; j < numcols; j++)
+ if ((i != j) && (attnums[i] == attnums[j]))
+ ereport(ERROR,
+ (errcode(ERRCODE_UNDEFINED_COLUMN),
+ errmsg("duplicate column name in statistics definition")));
+
+ /* parse the statistics options */
+ foreach (l, stmt->options)
+ {
+ DefElem *opt = (DefElem*)lfirst(l);
+
+ if (strcmp(opt->defname, "dependencies") == 0)
+ build_dependencies = defGetBoolean(opt);
+ else
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("unrecognized STATISTICS option \"%s\"",
+ opt->defname)));
+ }
+
+ /* check that at least some statistics were requested */
+ if (! build_dependencies)
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("no statistics type (dependencies) was requested")));
+
+ /* sort the attnums and build int2vector */
+ qsort(attnums, numcols, sizeof(int16), compare_int16);
+ stakeys = buildint2vector(attnums, numcols);
+
+ /*
+ * Okay, let's create the pg_mv_statistic entry.
+ */
+ memset(values, 0, sizeof(values));
+ memset(nulls, false, sizeof(nulls));
+
+ /* no stats collected yet, so just the keys */
+ values[Anum_pg_mv_statistic_starelid-1] = ObjectIdGetDatum(RelationGetRelid(rel));
+ values[Anum_pg_mv_statistic_staname -1] = NameGetDatum(&staname);
+ values[Anum_pg_mv_statistic_stanamespace -1] = ObjectIdGetDatum(namespaceId);
+
+ values[Anum_pg_mv_statistic_stakeys -1] = PointerGetDatum(stakeys);
+
+ values[Anum_pg_mv_statistic_deps_enabled -1] = BoolGetDatum(build_dependencies);
+
+ nulls[Anum_pg_mv_statistic_stadeps -1] = true;
+
+ /* insert the tuple into pg_mv_statistic */
+ mvstatrel = heap_open(MvStatisticRelationId, RowExclusiveLock);
+
+ htup = heap_form_tuple(mvstatrel->rd_att, values, nulls);
+
+ simple_heap_insert(mvstatrel, htup);
+
+ CatalogUpdateIndexes(mvstatrel, htup);
+
+ statoid = HeapTupleGetOid(htup);
+
+ heap_freetuple(htup);
+
+
+ /*
+ * Store a dependency too, so that statistics are dropped on DROP TABLE
+ */
+ parentobject.classId = RelationRelationId;
+	parentobject.objectId = RelationGetRelid(rel);
+ parentobject.objectSubId = 0;
+ childobject.classId = MvStatisticRelationId;
+ childobject.objectId = statoid;
+ childobject.objectSubId = 0;
+
+ recordDependencyOn(&childobject, &parentobject, DEPENDENCY_AUTO);
+
+ /*
+ * Also record dependency on the schema (to drop statistics on DROP SCHEMA)
+ */
+ parentobject.classId = NamespaceRelationId;
+	parentobject.objectId = namespaceId;
+ parentobject.objectSubId = 0;
+ childobject.classId = MvStatisticRelationId;
+ childobject.objectId = statoid;
+ childobject.objectSubId = 0;
+
+ recordDependencyOn(&childobject, &parentobject, DEPENDENCY_AUTO);
+
+
+	heap_close(mvstatrel, RowExclusiveLock);
+
+	/*
+	 * Invalidate relcache so that others see the new statistics. Do
+	 * this before closing the relation, while the Relation is valid.
+	 */
+	CacheInvalidateRelcache(rel);
+
+	relation_close(rel, NoLock);
+
+ ObjectAddressSet(address, MvStatisticRelationId, statoid);
+
+ return address;
+}
+
+
+/*
+ * Implements DROP STATISTICS, i.e. removes the pg_mv_statistic row
+ * with the given OID. The name is resolved to the OID by the generic
+ * object-drop machinery, which then calls this function (doDeletion).
+ */
+void
+RemoveStatisticsById(Oid statsOid)
+{
+ Relation relation;
+ HeapTuple tup;
+
+	/*
+	 * Delete the pg_mv_statistic tuple.
+	 */
+ relation = heap_open(MvStatisticRelationId, RowExclusiveLock);
+
+ tup = SearchSysCache1(MVSTATOID, ObjectIdGetDatum(statsOid));
+ if (!HeapTupleIsValid(tup)) /* should not happen */
+ elog(ERROR, "cache lookup failed for statistics %u", statsOid);
+
+ simple_heap_delete(relation, &tup->t_self);
+
+ ReleaseSysCache(tup);
+
+ heap_close(relation, RowExclusiveLock);
+}
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 96dc923..96ab02f 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -37,6 +37,7 @@
#include "catalog/pg_foreign_table.h"
#include "catalog/pg_inherits.h"
#include "catalog/pg_inherits_fn.h"
+#include "catalog/pg_mv_statistic.h"
#include "catalog/pg_namespace.h"
#include "catalog/pg_opclass.h"
#include "catalog/pg_tablespace.h"
@@ -95,7 +96,7 @@
#include "utils/syscache.h"
#include "utils/tqual.h"
#include "utils/typcache.h"
-
+#include "utils/mvstats.h"
/*
* ON COMMIT action list
@@ -143,8 +144,9 @@ static List *on_commits = NIL;
#define AT_PASS_ADD_COL 5 /* ADD COLUMN */
#define AT_PASS_ADD_INDEX 6 /* ADD indexes */
#define AT_PASS_ADD_CONSTR 7 /* ADD constraints, defaults */
-#define AT_PASS_MISC 8 /* other stuff */
-#define AT_NUM_PASSES 9
+#define AT_PASS_ADD_STATS 8 /* ADD statistics */
+#define AT_PASS_MISC 9 /* other stuff */
+#define AT_NUM_PASSES 10
typedef struct AlteredTableInfo
{
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index a9e9cc3..1a04024 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -4124,6 +4124,19 @@ _copyAlterPolicyStmt(const AlterPolicyStmt *from)
return newnode;
}
+static CreateStatsStmt *
+_copyCreateStatsStmt(const CreateStatsStmt *from)
+{
+ CreateStatsStmt *newnode = makeNode(CreateStatsStmt);
+
+ COPY_NODE_FIELD(defnames);
+ COPY_NODE_FIELD(relation);
+ COPY_NODE_FIELD(keys);
+	COPY_NODE_FIELD(options);
+	COPY_SCALAR_FIELD(if_not_exists);
+
+ return newnode;
+}
+
/* ****************************************************************
* pg_list.h copy functions
* ****************************************************************
@@ -4999,6 +5012,9 @@ copyObject(const void *from)
case T_CommonTableExpr:
retval = _copyCommonTableExpr(from);
break;
+ case T_CreateStatsStmt:
+ retval = _copyCreateStatsStmt(from);
+ break;
case T_FuncWithArgs:
retval = _copyFuncWithArgs(from);
break;
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index 85acce8..474d2c7 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -1968,6 +1968,21 @@ _outIndexOptInfo(StringInfo str, const IndexOptInfo *node)
}
static void
+_outMVStatisticInfo(StringInfo str, const MVStatisticInfo *node)
+{
+ WRITE_NODE_TYPE("MVSTATISTICINFO");
+
+ /* NB: this isn't a complete set of fields */
+ WRITE_OID_FIELD(mvoid);
+
+ /* enabled statistics */
+ WRITE_BOOL_FIELD(deps_enabled);
+
+ /* built/available statistics */
+ WRITE_BOOL_FIELD(deps_built);
+}
+
+static void
_outEquivalenceClass(StringInfo str, const EquivalenceClass *node)
{
/*
@@ -3409,6 +3424,9 @@ _outNode(StringInfo str, const void *obj)
case T_PlannerParamItem:
_outPlannerParamItem(str, obj);
break;
+ case T_MVStatisticInfo:
+ _outMVStatisticInfo(str, obj);
+ break;
case T_ExtensibleNode:
_outExtensibleNode(str, obj);
diff --git a/src/backend/optimizer/util/plancat.c b/src/backend/optimizer/util/plancat.c
index 0ea9fcf..b9de71d 100644
--- a/src/backend/optimizer/util/plancat.c
+++ b/src/backend/optimizer/util/plancat.c
@@ -28,6 +28,7 @@
#include "catalog/dependency.h"
#include "catalog/heap.h"
#include "catalog/pg_am.h"
+#include "catalog/pg_mv_statistic.h"
#include "foreign/fdwapi.h"
#include "miscadmin.h"
#include "nodes/makefuncs.h"
@@ -40,7 +41,9 @@
#include "parser/parsetree.h"
#include "rewrite/rewriteManip.h"
#include "storage/bufmgr.h"
+#include "utils/builtins.h"
#include "utils/lsyscache.h"
+#include "utils/syscache.h"
#include "utils/rel.h"
#include "utils/snapmgr.h"
@@ -94,6 +97,7 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
Relation relation;
bool hasindex;
List *indexinfos = NIL;
+ List *stainfos = NIL;
/*
* We need not lock the relation since it was already locked, either by
@@ -387,6 +391,65 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
rel->indexlist = indexinfos;
+ if (true)
+ {
+ List *mvstatoidlist;
+ ListCell *l;
+
+ mvstatoidlist = RelationGetMVStatList(relation);
+
+ foreach(l, mvstatoidlist)
+ {
+ ArrayType *arr;
+ Datum adatum;
+ bool isnull;
+ Oid mvoid = lfirst_oid(l);
+ Form_pg_mv_statistic mvstat;
+ MVStatisticInfo *info;
+
+ HeapTuple htup = SearchSysCache1(MVSTATOID, ObjectIdGetDatum(mvoid));
+
+ /* XXX syscache contains OIDs of deleted stats (not invalidated) */
+ if (! HeapTupleIsValid(htup))
+ continue;
+
+ mvstat = (Form_pg_mv_statistic) GETSTRUCT(htup);
+
+ /* unavailable stats are not interesting for the planner */
+ if (mvstat->deps_built)
+ {
+ info = makeNode(MVStatisticInfo);
+
+ info->mvoid = mvoid;
+ info->rel = rel;
+
+ /* enabled statistics */
+ info->deps_enabled = mvstat->deps_enabled;
+
+ /* built/available statistics */
+ info->deps_built = mvstat->deps_built;
+
+ /* stakeys */
+ adatum = SysCacheGetAttr(MVSTATOID, htup,
+ Anum_pg_mv_statistic_stakeys, &isnull);
+ Assert(!isnull);
+
+ arr = DatumGetArrayTypeP(adatum);
+
+ info->stakeys = buildint2vector((int16 *) ARR_DATA_PTR(arr),
+ ARR_DIMS(arr)[0]);
+
+ stainfos = lcons(info, stainfos);
+ }
+
+ ReleaseSysCache(htup);
+ }
+
+ list_free(mvstatoidlist);
+ }
+
+ rel->mvstatlist = stainfos;
+
/* Grab foreign-table info using the relcache, while we have it */
if (relation->rd_rel->relkind == RELKIND_FOREIGN_TABLE)
{
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index b307b48..3be3f02 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -241,7 +241,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
ConstraintsSetStmt CopyStmt CreateAsStmt CreateCastStmt
CreateDomainStmt CreateExtensionStmt CreateGroupStmt CreateOpClassStmt
CreateOpFamilyStmt AlterOpFamilyStmt CreatePLangStmt
- CreateSchemaStmt CreateSeqStmt CreateStmt CreateTableSpaceStmt
+ CreateSchemaStmt CreateSeqStmt CreateStmt CreateStatsStmt CreateTableSpaceStmt
CreateFdwStmt CreateForeignServerStmt CreateForeignTableStmt
CreateAssertStmt CreateTransformStmt CreateTrigStmt CreateEventTrigStmt
CreateUserStmt CreateUserMappingStmt CreateRoleStmt CreatePolicyStmt
@@ -809,6 +809,7 @@ stmt :
| CreateSchemaStmt
| CreateSeqStmt
| CreateStmt
+ | CreateStatsStmt
| CreateTableSpaceStmt
| CreateTransformStmt
| CreateTrigStmt
@@ -3436,6 +3437,36 @@ OptConsTableSpace: USING INDEX TABLESPACE name { $$ = $4; }
ExistingIndex: USING INDEX index_name { $$ = $3; }
;
+/*****************************************************************************
+ *
+ * QUERY :
+ * CREATE STATISTICS stats_name ON relname (columns) WITH (options)
+ *
+ *****************************************************************************/
+
+
+CreateStatsStmt: CREATE STATISTICS any_name ON qualified_name '(' columnList ')' opt_reloptions
+ {
+ CreateStatsStmt *n = makeNode(CreateStatsStmt);
+ n->defnames = $3;
+ n->relation = $5;
+ n->keys = $7;
+ n->options = $9;
+ n->if_not_exists = false;
+ $$ = (Node *)n;
+ }
+ | CREATE STATISTICS IF_P NOT EXISTS any_name ON qualified_name '(' columnList ')' opt_reloptions
+ {
+ CreateStatsStmt *n = makeNode(CreateStatsStmt);
+ n->defnames = $6;
+ n->relation = $8;
+ n->keys = $10;
+ n->options = $12;
+ n->if_not_exists = true;
+ $$ = (Node *)n;
+ }
+ ;
+
/*****************************************************************************
*
@@ -5621,6 +5652,7 @@ drop_type: TABLE { $$ = OBJECT_TABLE; }
| TEXT_P SEARCH DICTIONARY { $$ = OBJECT_TSDICTIONARY; }
| TEXT_P SEARCH TEMPLATE { $$ = OBJECT_TSTEMPLATE; }
| TEXT_P SEARCH CONFIGURATION { $$ = OBJECT_TSCONFIGURATION; }
+ | STATISTICS { $$ = OBJECT_STATISTICS; }
;
any_name_list:
diff --git a/src/backend/tcop/utility.c b/src/backend/tcop/utility.c
index 045f7f0..2ba88e2 100644
--- a/src/backend/tcop/utility.c
+++ b/src/backend/tcop/utility.c
@@ -1520,6 +1520,10 @@ ProcessUtilitySlow(Node *parsetree,
address = ExecSecLabelStmt((SecLabelStmt *) parsetree);
break;
+ case T_CreateStatsStmt: /* CREATE STATISTICS */
+ address = CreateStatistics((CreateStatsStmt *) parsetree);
+ break;
+
default:
elog(ERROR, "unrecognized node type: %d",
(int) nodeTag(parsetree));
@@ -2160,6 +2164,9 @@ CreateCommandTag(Node *parsetree)
case OBJECT_TRANSFORM:
tag = "DROP TRANSFORM";
break;
+ case OBJECT_STATISTICS:
+ tag = "DROP STATISTICS";
+ break;
default:
tag = "???";
}
@@ -2527,6 +2534,10 @@ CreateCommandTag(Node *parsetree)
tag = "EXECUTE";
break;
+ case T_CreateStatsStmt:
+ tag = "CREATE STATISTICS";
+ break;
+
case T_DeallocateStmt:
{
DeallocateStmt *stmt = (DeallocateStmt *) parsetree;
diff --git a/src/backend/utils/Makefile b/src/backend/utils/Makefile
index 8374533..eba0352 100644
--- a/src/backend/utils/Makefile
+++ b/src/backend/utils/Makefile
@@ -9,7 +9,7 @@ top_builddir = ../../..
include $(top_builddir)/src/Makefile.global
OBJS = fmgrtab.o
-SUBDIRS = adt cache error fmgr hash init mb misc mmgr resowner sort time
+SUBDIRS = adt cache error fmgr hash init mb misc mmgr mvstats resowner sort time
# location of Catalog.pm
catalogdir = $(top_srcdir)/src/backend/catalog
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index 130c06d..3bc4c8a 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -47,6 +47,7 @@
#include "catalog/pg_auth_members.h"
#include "catalog/pg_constraint.h"
#include "catalog/pg_database.h"
+#include "catalog/pg_mv_statistic.h"
#include "catalog/pg_namespace.h"
#include "catalog/pg_opclass.h"
#include "catalog/pg_proc.h"
@@ -3956,6 +3957,62 @@ RelationGetIndexList(Relation relation)
return result;
}
+
+List *
+RelationGetMVStatList(Relation relation)
+{
+ Relation indrel;
+ SysScanDesc indscan;
+ ScanKeyData skey;
+ HeapTuple htup;
+ List *result;
+ List *oldlist;
+ MemoryContext oldcxt;
+
+ /* Quick exit if we already computed the list. */
+ if (relation->rd_mvstatvalid != 0)
+ return list_copy(relation->rd_mvstatlist);
+
+ /*
+ * We build the list we intend to return (in the caller's context) while
+ * doing the scan. After successfully completing the scan, we copy that
+ * list into the relcache entry. This avoids cache-context memory leakage
+ * if we get some sort of error partway through.
+ */
+ result = NIL;
+
+	/* Prepare to scan pg_mv_statistic for entries having starelid = this rel. */
+ ScanKeyInit(&skey,
+ Anum_pg_mv_statistic_starelid,
+ BTEqualStrategyNumber, F_OIDEQ,
+ ObjectIdGetDatum(RelationGetRelid(relation)));
+
+ indrel = heap_open(MvStatisticRelationId, AccessShareLock);
+ indscan = systable_beginscan(indrel, MvStatisticRelidIndexId, true,
+ NULL, 1, &skey);
+
+ while (HeapTupleIsValid(htup = systable_getnext(indscan)))
+ /* TODO maybe include only already built statistics? */
+ result = insert_ordered_oid(result, HeapTupleGetOid(htup));
+
+ systable_endscan(indscan);
+
+ heap_close(indrel, AccessShareLock);
+
+ /* Now save a copy of the completed list in the relcache entry. */
+ oldcxt = MemoryContextSwitchTo(CacheMemoryContext);
+ oldlist = relation->rd_mvstatlist;
+ relation->rd_mvstatlist = list_copy(result);
+
+ relation->rd_mvstatvalid = true;
+ MemoryContextSwitchTo(oldcxt);
+
+ /* Don't leak the old list, if there is one */
+ list_free(oldlist);
+
+ return result;
+}
+
/*
* insert_ordered_oid
* Insert a new Oid into a sorted list of Oids, preserving ordering
@@ -4920,6 +4977,8 @@ load_relcache_init_file(bool shared)
rel->rd_indexattr = NULL;
rel->rd_keyattr = NULL;
rel->rd_idattr = NULL;
+ rel->rd_mvstatvalid = false;
+ rel->rd_mvstatlist = NIL;
rel->rd_createSubid = InvalidSubTransactionId;
rel->rd_newRelfilenodeSubid = InvalidSubTransactionId;
rel->rd_amcache = NULL;
diff --git a/src/backend/utils/cache/syscache.c b/src/backend/utils/cache/syscache.c
index 65ffe84..3c1bc4b 100644
--- a/src/backend/utils/cache/syscache.c
+++ b/src/backend/utils/cache/syscache.c
@@ -44,6 +44,7 @@
#include "catalog/pg_foreign_server.h"
#include "catalog/pg_foreign_table.h"
#include "catalog/pg_language.h"
+#include "catalog/pg_mv_statistic.h"
#include "catalog/pg_namespace.h"
#include "catalog/pg_opclass.h"
#include "catalog/pg_operator.h"
@@ -502,6 +503,28 @@ static const struct cachedesc cacheinfo[] = {
},
4
},
+ {MvStatisticRelationId, /* MVSTATNAMENSP */
+ MvStatisticNameIndexId,
+ 2,
+ {
+ Anum_pg_mv_statistic_staname,
+ Anum_pg_mv_statistic_stanamespace,
+ 0,
+ 0
+ },
+ 4
+ },
+ {MvStatisticRelationId, /* MVSTATOID */
+ MvStatisticOidIndexId,
+ 1,
+ {
+ ObjectIdAttributeNumber,
+ 0,
+ 0,
+ 0
+ },
+ 4
+ },
{NamespaceRelationId, /* NAMESPACENAME */
NamespaceNameIndexId,
1,
diff --git a/src/backend/utils/mvstats/Makefile b/src/backend/utils/mvstats/Makefile
new file mode 100644
index 0000000..099f1ed
--- /dev/null
+++ b/src/backend/utils/mvstats/Makefile
@@ -0,0 +1,17 @@
+#-------------------------------------------------------------------------
+#
+# Makefile--
+# Makefile for utils/mvstats
+#
+# IDENTIFICATION
+# src/backend/utils/mvstats/Makefile
+#
+#-------------------------------------------------------------------------
+
+subdir = src/backend/utils/mvstats
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+
+OBJS = common.o dependencies.o
+
+include $(top_srcdir)/src/backend/common.mk
diff --git a/src/backend/utils/mvstats/README.dependencies b/src/backend/utils/mvstats/README.dependencies
new file mode 100644
index 0000000..1f96fbc
--- /dev/null
+++ b/src/backend/utils/mvstats/README.dependencies
@@ -0,0 +1,222 @@
+Soft functional dependencies
+============================
+
+A type of multivariate statistics used to capture cases when one column (or
+possibly a combination of columns) determines values in another column. We may
+also say that one column implies the other one.
+
+A simple artificial example may be a table with two columns, created like this
+
+ CREATE TABLE t (a INT, b INT)
+ AS SELECT i, i/10 FROM generate_series(1,100000) s(i);
+
+Clearly, once we know the value for column 'a', the value for 'b' is trivially
+determined, as it's simply (a/10). A more practical example may be addresses,
+where (ZIP code -> city name), i.e. once we know the ZIP, we probably know the
+city it belongs to, as ZIP codes are usually assigned to one city. Larger cities
+may have multiple ZIP codes, so the dependency can't be reversed.
+
+Functional dependencies are a concept well described in relational theory,
+particularly in definition of normalization and "normal forms". Wikipedia has a
+nice definition of a functional dependency [1]:
+
+ In a given table, an attribute Y is said to have a functional dependency on
+ a set of attributes X (written X -> Y) if and only if each X value is
+ associated with precisely one Y value. For example, in an "Employee" table
+ that includes the attributes "Employee ID" and "Employee Date of Birth", the
+ functional dependency {Employee ID} -> {Employee Date of Birth} would hold.
+ It follows from the previous two sentences that each {Employee ID} is
+ associated with precisely one {Employee Date of Birth}.
+
+ [1] http://en.wikipedia.org/wiki/Database_normalization
+
+Many datasets might be normalized not to contain such dependencies, but often
+it's not practical for various reasons. In some cases it's actually a conscious
+design choice to model the dataset in a denormalized way, either because of
+performance or to make querying easier.
+
+The functional dependencies are called 'soft' because the implementation is
+meant to allow a small number of rows contradicting the dependency. Many actual
+data sets contain some errors, either because of data entry mistakes
+(a user mistyping the ZIP code) or issues in generating the data (e.g. a ZIP code
+mistakenly assigned to two cities in different states). A strict implementation
+would ignore dependencies on such noisy data, rendering the approach unusable on
+such data sets.
+
+
+Mining dependencies (ANALYZE)
+-----------------------------
+
+The current build algorithm is rather simple - for each pair (a,b) of columns,
+the data are sorted lexicographically (first by 'a', then by 'b'). Then for each
+group (rows with the same 'a' value) we decide whether the group is neutral,
+supporting or contradicting the dependency (a->b).
+
+A group is considered neutral when it's too small - e.g. when there's a single
+row in the group, there can't possibly be multiple values in 'b'. For this
+reason we ignore groups smaller than a threshold (currently 3 rows).
+
+For sufficiently large groups (3 rows or more), we count the number of distinct
+values in 'b'. When there's a single 'b' value, the group is considered to
+support the dependency (a->b); otherwise it's considered to contradict it.
+
+At the end, we compare the number of rows in supporting and contradicting groups,
+and if there are at least 10x as many supporting rows, we consider the
+functional dependency to be valid.
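+
+For illustration, here's a rough SQL sketch of the same test for the dependency
+(a -> b), run against the full table 't' from the example above (ANALYZE of
+course applies it to the sampled rows only):
+
+  SELECT supports, SUM(nrows) AS total_rows
+    FROM (SELECT a,
+                 COUNT(*) AS nrows,
+                 (COUNT(DISTINCT b) = 1) AS supports
+            FROM t
+           GROUP BY a
+          HAVING COUNT(*) >= 3) g
+   GROUP BY supports;
+
+Groups below the size threshold (3 rows) are neutral, hence the HAVING clause;
+the dependency is accepted if the supporting row count is at least 10x the
+contradicting one.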
+
+
+A negative property of this algorithm is that it's a bit fragile with respect
+to the sample - there may be data sets producing quite different results for
+each ANALYZE execution (as even a single row may change the outcome of the
+final 10x test).
+
+It was proposed to make the dependencies "fuzzy" - e.g. track some coefficient
+between [0,1] determining how much the dependency holds. That would however mean
+we have to keep all the dependencies, as eliminating them based on the value of
+the coefficient (e.g. throwing away dependencies <= 0.5) would result in exactly
+the same fragility issues. This would also make it more complicated to combine
+dependencies. So this does not seem like a practical approach.
+
+A better approach might be to replace the constants (min_group_size=3 and 10x)
+with values somehow related to the particular data set.
+
+
+Clause reduction (planner/optimizer)
+------------------------------------
+
+Applying the functional dependencies is quite simple - given a list of equality
+clauses, check which clauses are redundant (i.e. implied by some other clause).
+For example, given the clause list
+
+  (a = 1) AND (b = 2) AND (c = 3)
+
+and the dependency (a->b), the list of clauses may be simplified to
+
+  (a = 1) AND (c = 3)
+
+Functional dependencies may only be applied to equality clauses; all other types
+of clauses are ignored. See clauselist_apply_dependencies() for more details.
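+
+For example, a sketch of the whole flow using the table 't' from above (where
+b = a/10, so the two clauses below are compatible; the statistics name 's1'
+is illustrative):
+
+  CREATE STATISTICS s1 ON t (a, b) WITH (dependencies = true);
+  ANALYZE t;
+
+  -- with (a -> b) detected, the estimate should be driven by the
+  -- selectivity of (a = 1) alone, not the product of both selectivities
+  EXPLAIN SELECT * FROM t WHERE (a = 1) AND (b = 0);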
+
+
+Compatibility of clauses
+------------------------
+
+The reduction assumes the clauses really are redundant, i.e. that the value in
+the reduced clause (b=2) is the value determined by (a=1). If that's not the
+case and the values are "incompatible", the result will be an over-estimate.
+
+This may happen for example when using conditions on ZIP and city name with
+mismatching values (a ZIP for a different city), etc. In such a case the result
+set will be empty, but we'll estimate the selectivity using the ZIP condition.
+
+In this case the default estimation, based on the AVIA principle (attribute
+value independence assumption), happens to work better - but mostly by chance.
+
+
+Dependencies vs. MCV/histogram
+------------------------------
+
+In some cases the "compatibility" of the conditions might be verified using the
+other types of multivariate stats - MCV lists and histograms.
+
+For MCV lists the verification might be very simple - peek into the list for
+items matching the clause on the 'a' column (e.g. ZIP code), and if such an
+item is found, check that the 'b' column matches the other clause. If it does
+not, the clauses are contradictory. If no such item is found, we can't really
+say anything, except maybe restricting the selectivity using the MCV data
+(e.g. using min/max selectivity, or something like that).
+
+With histograms, it might work similarly - we can't check the values directly
+(because histograms use buckets, unlike MCV lists, which store the actual
+values). So we can only observe the buckets matching the clauses - if those
+buckets have very low frequency, it probably means the two clauses are
+incompatible.
+
+It's unclear what 'low frequency' is, but if one of the clauses is implied
+(automatically true because of the other clause), then
+
+ selectivity[clause(A)] = selectivity[clause(A) & clause(B)]
+
+So we might compute the selectivity of the first clause - for example using
+regular statistics - and then check whether the selectivity computed from the
+histogram is about the same (or significantly lower).
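+
+In SQL terms, the idea is roughly the following (using a hypothetical
+'addresses' table; if (zip -> city) holds and the two clauses are compatible,
+the two counts should be about the same):
+
+  SELECT COUNT(*) FROM addresses WHERE zip = '12345';
+  SELECT COUNT(*) FROM addresses WHERE zip = '12345' AND city = 'Springfield';
+
+A significantly lower second count suggests the clauses are incompatible.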
+
+The problem is that histograms work well only when the data ordering matches the
+natural meaning. For values that serve as labels - like city names or ZIP codes,
+or even generated IDs, histograms really don't work all that well. For example,
+sorting cities by name won't match the sorting of ZIP codes, rendering the
+histogram unusable.
+
+So MCVs are probably going to work much better, because they don't really assume
+any sort of ordering. And it's probably more appropriate for the label-like data.
+
+A good question however is why even use functional dependencies in such cases
+and not simply use the MCV/histogram instead. One reason is that the functional
+dependencies allow falling back to regular stats, and often produce more
+accurate estimates - especially compared to histograms, which are quite bad
+at estimating equality clauses.
+
+
+Limitations
+-----------
+
+Let's look at the main limitations of functional dependencies, especially
+those related to the current implementation.
+
+The current implementation supports only dependencies between two columns, but
+this is merely a simplification of the initial implementation. It's certainly
+useful to mine for dependencies involving multiple columns on the 'left' side,
+i.e. the condition of the dependency - that is, dependencies like (a,b -> c).
+
+The implementation may/should be smart enough not to mine redundant dependencies,
+e.g. (a->b) and (a,c -> b), because the latter is a trivial consequence of the
+former one (if values of 'a' determine 'b', adding another column won't change
+that relationship). The ANALYZE should first analyze 1:1 dependencies, then 2:1
+dependencies (and skip the already identified ones), etc.
+
+For example the dependency
+
+ (city name -> zip code)
+
+is much stronger, i.e. whenever it holds, then
+
+ (city name, state name -> zip code)
+
+holds too. But in case there are cities with the same name in different states,
+then only the latter dependency will be valid.
+
+Of course, there probably are cities with the same name within a single state,
+but hopefully this is a relatively rare occurrence (and thus we'll still detect
+the 'soft' dependency).
+
+Handling multiple columns on the right side of the dependency is not necessary,
+as those dependencies may simply be decomposed into a set of dependencies with
+the same meaning, one for each column on the right side. For example
+
+ (a -> b,c)
+
+is exactly the same as
+
+ (a -> b) & (a -> c)
+
+Of course, storing the first form may be more efficient than storing multiple
+'simple' dependencies separately.
+
+
+TODO Support dependencies with multiple columns on left/right.
+
+TODO Investigate using histogram and MCV list to verify the dependencies.
+
+TODO Investigate statistical testing of the distribution (to decide whether it
+ makes sense to build the histogram/MCV list).
+
+TODO Using a min/max of selectivities would probably make more sense for the
+ associated columns.
+
+TODO Consider eliminating the implied columns from the histogram and MCV lists
+ (but maybe that's not a good idea, because that'd make it impossible to use
+ these stats for non-equality clauses and also it wouldn't be possible to
+ use the stats for verification of the dependencies).
+
+TODO The reduction probably might be extended to also handle IS NULL clauses,
+ assuming we fix the ANALYZE to properly handle NULL values. We however
+ won't be able to reduce IS NOT NULL (unless I'm missing something).
diff --git a/src/backend/utils/mvstats/common.c b/src/backend/utils/mvstats/common.c
new file mode 100644
index 0000000..a755c49
--- /dev/null
+++ b/src/backend/utils/mvstats/common.c
@@ -0,0 +1,356 @@
+/*-------------------------------------------------------------------------
+ *
+ * common.c
+ * POSTGRES multivariate statistics
+ *
+ *
+ * Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ * src/backend/utils/mvstats/common.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "common.h"
+
+static VacAttrStats ** lookup_var_attr_stats(int2vector *attrs,
+ int natts, VacAttrStats **vacattrstats);
+
+static List* list_mv_stats(Oid relid);
+
+
+/*
+ * Compute requested multivariate stats, using the rows sampled for the
+ * plain (single-column) stats.
+ *
+ * This fetches a list of stats from pg_mv_statistic, computes the stats
+ * and serializes them back into the catalog (as bytea values).
+ */
+void
+build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
+ int natts, VacAttrStats **vacattrstats)
+{
+ ListCell *lc;
+ List *mvstats;
+
+ TupleDesc tupdesc = RelationGetDescr(onerel);
+
+ /*
+	 * Fetch defined MV stats from pg_mv_statistic, and then compute
+	 * the MV statistics (functional dependencies for now).
+ */
+ mvstats = list_mv_stats(RelationGetRelid(onerel));
+
+ foreach (lc, mvstats)
+ {
+ int j;
+ MVStatisticInfo *stat = (MVStatisticInfo *)lfirst(lc);
+ MVDependencies deps = NULL;
+
+ VacAttrStats **stats = NULL;
+ int numatts = 0;
+
+ /* int2 vector of attnums the stats should be computed on */
+ int2vector * attrs = stat->stakeys;
+
+ /* see how many of the columns are not dropped */
+ for (j = 0; j < attrs->dim1; j++)
+ if (! tupdesc->attrs[attrs->values[j]-1]->attisdropped)
+ numatts += 1;
+
+ /* if there are dropped attributes, build a filtered int2vector */
+ if (numatts != attrs->dim1)
+ {
+ int16 *tmp = palloc0(numatts * sizeof(int16));
+ int attnum = 0;
+
+ for (j = 0; j < attrs->dim1; j++)
+ if (! tupdesc->attrs[attrs->values[j]-1]->attisdropped)
+ tmp[attnum++] = attrs->values[j];
+
+ pfree(attrs);
+ attrs = buildint2vector(tmp, numatts);
+ }
+
+ /* filter only the interesting vacattrstats records */
+ stats = lookup_var_attr_stats(attrs, natts, vacattrstats);
+
+ /* check allowed number of dimensions */
+ Assert((attrs->dim1 >= 2) && (attrs->dim1 <= MVSTATS_MAX_DIMENSIONS));
+
+ /*
+ * Analyze functional dependencies of columns.
+ */
+ deps = build_mv_dependencies(numrows, rows, attrs, stats);
+
+		/* store the functional dependencies in the catalog */
+ update_mv_stats(stat->mvoid, deps, attrs);
+ }
+}
+
+/*
+ * Lookup the VacAttrStats info for the selected columns, with indexes
+ * matching the attrs vector (to make it easy to work with when
+ * computing multivariate stats).
+ */
+static VacAttrStats **
+lookup_var_attr_stats(int2vector *attrs, int natts, VacAttrStats **vacattrstats)
+{
+ int i, j;
+ int numattrs = attrs->dim1;
+ VacAttrStats **stats = (VacAttrStats**)palloc0(numattrs * sizeof(VacAttrStats*));
+
+ /* lookup VacAttrStats info for the requested columns (same attnum) */
+ for (i = 0; i < numattrs; i++)
+ {
+ stats[i] = NULL;
+ for (j = 0; j < natts; j++)
+ {
+ if (attrs->values[i] == vacattrstats[j]->tupattnum)
+ {
+ stats[i] = vacattrstats[j];
+ break;
+ }
+ }
+
+ /*
+ * Check that we found the info, that the attnum matches and
+ * that there's the requested 'lt' operator and that the type
+ * is 'passed-by-value'.
+ */
+ Assert(stats[i] != NULL);
+ Assert(stats[i]->tupattnum == attrs->values[i]);
+
+ /* FIXME This is a rather ugly way to check for 'ltopr' (which
+ * is defined for 'scalar' attributes).
+ */
+ Assert(((StdAnalyzeData *)stats[i]->extra_data)->ltopr != InvalidOid);
+ }
+
+ return stats;
+}
+
+/*
+ * Fetch list of MV stats defined on a table, without the actual data
+ * for histograms, MCV lists etc.
+ */
+static List*
+list_mv_stats(Oid relid)
+{
+ Relation indrel;
+ SysScanDesc indscan;
+ ScanKeyData skey;
+ HeapTuple htup;
+ List *result = NIL;
+
+ /* Prepare to scan pg_mv_statistic for entries having indrelid = this rel. */
+ ScanKeyInit(&skey,
+ Anum_pg_mv_statistic_starelid,
+ BTEqualStrategyNumber, F_OIDEQ,
+ ObjectIdGetDatum(relid));
+
+ indrel = heap_open(MvStatisticRelationId, AccessShareLock);
+ indscan = systable_beginscan(indrel, MvStatisticRelidIndexId, true,
+ NULL, 1, &skey);
+
+ while (HeapTupleIsValid(htup = systable_getnext(indscan)))
+ {
+ MVStatisticInfo *info = makeNode(MVStatisticInfo);
+ Form_pg_mv_statistic stats = (Form_pg_mv_statistic) GETSTRUCT(htup);
+
+ info->mvoid = HeapTupleGetOid(htup);
+ info->stakeys = buildint2vector(stats->stakeys.values, stats->stakeys.dim1);
+ info->deps_built = stats->deps_built;
+
+ result = lappend(result, info);
+ }
+
+ systable_endscan(indscan);
+
+ heap_close(indrel, AccessShareLock);
+
+ /* TODO maybe save the list into relcache, as in RelationGetIndexList
+ * (which was used as an inspiration for this one)? */
+
+ return result;
+}
+
+void
+update_mv_stats(Oid mvoid, MVDependencies dependencies, int2vector *attrs)
+{
+ HeapTuple stup,
+ oldtup;
+ Datum values[Natts_pg_mv_statistic];
+ bool nulls[Natts_pg_mv_statistic];
+ bool replaces[Natts_pg_mv_statistic];
+
+ Relation sd = heap_open(MvStatisticRelationId, RowExclusiveLock);
+
+ memset(nulls, 1, Natts_pg_mv_statistic * sizeof(bool));
+ memset(replaces, 0, Natts_pg_mv_statistic * sizeof(bool));
+ memset(values, 0, Natts_pg_mv_statistic * sizeof(Datum));
+
+ /*
+ * Construct a new pg_mv_statistic tuple - replace only the dependencies
+ * value, depending on whether it actually was computed.
+ */
+ if (dependencies != NULL)
+ {
+ nulls[Anum_pg_mv_statistic_stadeps -1] = false;
+ values[Anum_pg_mv_statistic_stadeps - 1]
+ = PointerGetDatum(serialize_mv_dependencies(dependencies));
+ }
+
+ /* always replace the value (either by bytea or NULL) */
+ replaces[Anum_pg_mv_statistic_stadeps -1] = true;
+
+ /* always change the availability flags */
+ nulls[Anum_pg_mv_statistic_deps_built -1] = false;
+ nulls[Anum_pg_mv_statistic_stakeys-1] = false;
+
+ /* use the new attnums, in case we removed some dropped ones */
+ replaces[Anum_pg_mv_statistic_deps_built-1] = true;
+ replaces[Anum_pg_mv_statistic_stakeys -1] = true;
+
+ values[Anum_pg_mv_statistic_deps_built-1] = BoolGetDatum(dependencies != NULL);
+ values[Anum_pg_mv_statistic_stakeys -1] = PointerGetDatum(attrs);
+
+ /* Is there already a pg_mv_statistic tuple for this attribute? */
+ oldtup = SearchSysCache1(MVSTATOID,
+ ObjectIdGetDatum(mvoid));
+
+ if (HeapTupleIsValid(oldtup))
+ {
+ /* Yes, replace it */
+ stup = heap_modify_tuple(oldtup,
+ RelationGetDescr(sd),
+ values,
+ nulls,
+ replaces);
+ ReleaseSysCache(oldtup);
+ simple_heap_update(sd, &stup->t_self, stup);
+ }
+ else
+ elog(ERROR, "invalid pg_mv_statistic record (oid=%d)", mvoid);
+
+ /* update indexes too */
+ CatalogUpdateIndexes(sd, stup);
+
+ heap_freetuple(stup);
+
+ heap_close(sd, RowExclusiveLock);
+}
+
+/* multi-variate stats comparator */
+
+/*
+ * qsort_arg comparator for sorting Datums (MV stats)
+ *
+ * This does not maintain the tupnoLink array.
+ */
+int
+compare_scalars_simple(const void *a, const void *b, void *arg)
+{
+ Datum da = *(Datum*)a;
+ Datum db = *(Datum*)b;
+ SortSupport ssup = (SortSupport) arg;
+
+ return ApplySortComparator(da, false, db, false, ssup);
+}
+
+/*
+ * qsort_arg comparator for sorting data when partitioning a MV bucket
+ */
+int
+compare_scalars_partition(const void *a, const void *b, void *arg)
+{
+ Datum da = ((ScalarItem*)a)->value;
+ Datum db = ((ScalarItem*)b)->value;
+ SortSupport ssup = (SortSupport) arg;
+
+ return ApplySortComparator(da, false, db, false, ssup);
+}
+
+/* initialize multi-dimensional sort */
+MultiSortSupport
+multi_sort_init(int ndims)
+{
+ MultiSortSupport mss;
+
+ Assert(ndims >= 2);
+
+ mss = (MultiSortSupport)palloc0(offsetof(MultiSortSupportData, ssup)
+ + sizeof(SortSupportData)*ndims);
+
+ mss->ndims = ndims;
+
+ return mss;
+}
+
+/*
+ * add sort info for dimension 'dim' (index into vacattrstats) to mss,
+ * at the position 'sortdim'
+ */
+void
+multi_sort_add_dimension(MultiSortSupport mss, int sortdim,
+ int dim, VacAttrStats **vacattrstats)
+{
+ /* first, lookup StdAnalyzeData for the dimension (attribute) */
+ SortSupportData ssup;
+ StdAnalyzeData *tmp = (StdAnalyzeData *)vacattrstats[dim]->extra_data;
+
+ Assert(mss != NULL);
+ Assert(sortdim < mss->ndims);
+
+ /* initialize sort support, etc. */
+ memset(&ssup, 0, sizeof(ssup));
+ ssup.ssup_cxt = CurrentMemoryContext;
+
+ /* We always use the default collation for statistics */
+ ssup.ssup_collation = DEFAULT_COLLATION_OID;
+ ssup.ssup_nulls_first = false;
+
+ PrepareSortSupportFromOrderingOp(tmp->ltopr, &ssup);
+
+ mss->ssup[sortdim] = ssup;
+}
+
+/* compare all the dimensions in the selected order */
+int
+multi_sort_compare(const void *a, const void *b, void *arg)
+{
+ int i;
+ SortItem *ia = (SortItem*)a;
+ SortItem *ib = (SortItem*)b;
+
+ MultiSortSupport mss = (MultiSortSupport)arg;
+
+ for (i = 0; i < mss->ndims; i++)
+ {
+ int compare;
+
+ compare = ApplySortComparator(ia->values[i], ia->isnull[i],
+ ib->values[i], ib->isnull[i],
+ &mss->ssup[i]);
+
+ if (compare != 0)
+ return compare;
+ }
+
+ /* equal by default */
+ return 0;
+}
+
+/* compare selected dimension */
+int
+multi_sort_compare_dim(int dim, const SortItem *a, const SortItem *b,
+ MultiSortSupport mss)
+{
+ return ApplySortComparator(a->values[dim], a->isnull[dim],
+ b->values[dim], b->isnull[dim],
+ &mss->ssup[dim]);
+}
diff --git a/src/backend/utils/mvstats/common.h b/src/backend/utils/mvstats/common.h
new file mode 100644
index 0000000..6d5465b
--- /dev/null
+++ b/src/backend/utils/mvstats/common.h
@@ -0,0 +1,75 @@
+/*-------------------------------------------------------------------------
+ *
+ * common.h
+ * POSTGRES multivariate statistics
+ *
+ *
+ * Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ * src/backend/utils/mvstats/common.h
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres.h"
+
+#include "access/tuptoaster.h"
+#include "catalog/indexing.h"
+#include "catalog/pg_collation.h"
+#include "catalog/pg_mv_statistic.h"
+#include "foreign/fdwapi.h"
+#include "postmaster/autovacuum.h"
+#include "storage/lmgr.h"
+#include "utils/datum.h"
+#include "utils/sortsupport.h"
+#include "utils/syscache.h"
+#include "utils/fmgroids.h"
+#include "utils/builtins.h"
+#include "access/sysattr.h"
+
+#include "utils/mvstats.h"
+
+/* FIXME private structure copied from analyze.c */
+
+typedef struct
+{
+ Oid eqopr; /* '=' operator for datatype, if any */
+ Oid eqfunc; /* and associated function */
+ Oid ltopr; /* '<' operator for datatype, if any */
+} StdAnalyzeData;
+
+typedef struct
+{
+ Datum value; /* a data value */
+ int tupno; /* position index for tuple it came from */
+} ScalarItem;
+
+/* multi-sort */
+typedef struct MultiSortSupportData {
+ int ndims; /* number of dimensions supported by the sort */
+ SortSupportData ssup[1]; /* sort support data for each dimension */
+} MultiSortSupportData;
+
+typedef MultiSortSupportData* MultiSortSupport;
+
+typedef struct SortItem {
+ Datum *values;
+ bool *isnull;
+} SortItem;
+
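+/*
+ * Illustrative usage of the multi-sort support, mirroring what
+ * build_mv_dependencies does for a pair of columns (names illustrative):
+ *
+ *    MultiSortSupport mss = multi_sort_init(2);
+ *
+ *    multi_sort_add_dimension(mss, 0, dima, vacattrstats);
+ *    multi_sort_add_dimension(mss, 1, dimb, vacattrstats);
+ *
+ *    qsort_arg((void *) items, numrows, sizeof(SortItem),
+ *              multi_sort_compare, mss);
+ */
+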
+MultiSortSupport multi_sort_init(int ndims);
+
+void multi_sort_add_dimension(MultiSortSupport mss, int sortdim,
+ int dim, VacAttrStats **vacattrstats);
+
+int multi_sort_compare(const void *a, const void *b, void *arg);
+
+int multi_sort_compare_dim(int dim, const SortItem *a,
+ const SortItem *b, MultiSortSupport mss);
+
+/* comparators, used when constructing multivariate stats */
+int compare_scalars_simple(const void *a, const void *b, void *arg);
+int compare_scalars_partition(const void *a, const void *b, void *arg);
diff --git a/src/backend/utils/mvstats/dependencies.c b/src/backend/utils/mvstats/dependencies.c
new file mode 100644
index 0000000..2a064a0
--- /dev/null
+++ b/src/backend/utils/mvstats/dependencies.c
@@ -0,0 +1,437 @@
+/*-------------------------------------------------------------------------
+ *
+ * dependencies.c
+ * POSTGRES multivariate functional dependencies
+ *
+ *
+ * Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ * src/backend/utils/mvstats/dependencies.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "common.h"
+#include "utils/lsyscache.h"
+
+/*
+ * Detect functional dependencies between columns.
+ *
+ * TODO This builds a complete set of dependencies, i.e. including transitive
+ * dependencies - if we identify [A => B] and [B => C], we're likely to
+ * identify [A => C] too. It might be better to keep only the minimal set
+ * of dependencies, i.e. prune all the dependencies that we can recreate
+ * by transitivity.
+ *
+ * There are two conceptual ways to do that:
+ *
+ * (a) generate all the rules, and then prune the rules that may be
+ * recreated by combining other dependencies, or
+ *
+ * (b) perform the 'is combination of other dependencies' check before
+ * actually doing the work
+ *
+ * The second option has the advantage that we don't really need to perform
+ * the sort/count. It's not sufficient alone, though, because we may
+ * discover the dependencies in the wrong order. For example we may find
+ *
+ * (a -> b), (a -> c) and then (b -> c)
+ *
+ * None of those dependencies is a combination of the already known ones,
+ * yet (a -> c) is a combination of (a -> b) and (b -> c).
+ *
+ *
+ * FIXME Currently we simply replace NULL values with 0 and then handle it as
+ * a regular value, but that lumps NULLs together with actual 0 values. That's
+ * clearly incorrect - we need to handle NULL values as a separate value.
+ */
+MVDependencies
+build_mv_dependencies(int numrows, HeapTuple *rows, int2vector *attrs,
+ VacAttrStats **stats)
+{
+ int i;
+ int numattrs = attrs->dim1;
+
+ /* result */
+ int ndeps = 0;
+ MVDependencies dependencies = NULL;
+ MultiSortSupport mss = multi_sort_init(2); /* 2 dimensions for now */
+
+ /* TODO Maybe this should be somehow related to the number of
+ * distinct values in the two columns we're currently analyzing.
+ * Assuming the distribution is uniform, we can estimate the
+ * average group size and use it as a threshold. Or something
+ * like that. Seems better than a static approach.
+ */
+ int min_group_size = 3;
+
+ /* dimension indexes we'll check for associations [a => b] */
+ int dima, dimb;
+
+ /*
+ * We'll reuse the same array for all the 2-column combinations.
+ *
+ * It's possible to sort the sample rows directly, but this seemed
+ * somewhat simpler / less error prone. Another option would be to
+ * allocate the arrays for each SortItem separately, but that'd be
+ * significant overhead (not just CPU, but especially memory bloat).
+ */
+ SortItem * items = (SortItem*)palloc0(numrows * sizeof(SortItem));
+
+ Datum *values = (Datum*)palloc0(sizeof(Datum) * numrows * 2);
+ bool *isnull = (bool*)palloc0(sizeof(bool) * numrows * 2);
+
+ for (i = 0; i < numrows; i++)
+ {
+ items[i].values = &values[i * 2];
+ items[i].isnull = &isnull[i * 2];
+ }
+
+ Assert(numattrs >= 2);
+
+ /*
+ * Evaluate all possible combinations of [A => B], using a simple algorithm:
+ *
+ * (a) sort the data by [A,B]
+ * (b) split the data into groups by A (new group whenever a value changes)
+ * (c) count different values in the B column (again, value changes)
+ *
+ * TODO It should be rather simple to merge [A => B] and [A => C] into
+ * [A => B,C]. Just keep A constant, collect all the "implied" columns
+ * and you're done.
+ */
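+ /*
+ * A worked example (illustration only): with min_group_size = 3, the
+ * data sorted by [A,B] may look like
+ *
+ *     A:  1 1 1 2 2 2 3 3
+ *     B:  x x x y y z w w
+ *
+ * The group A=1 has a single B value and at least min_group_size rows,
+ * so it supports [A => B]. The group A=2 mixes two B values, so it (and
+ * its rows) count as contradicting. The group A=3 has no violations but
+ * is too small, so it counts as neither.
+ */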
+ for (dima = 0; dima < numattrs; dima++)
+ {
+ /* prepare the sort function for the first dimension */
+ multi_sort_add_dimension(mss, 0, dima, stats);
+
+ for (dimb = 0; dimb < numattrs; dimb++)
+ {
+ SortItem current;
+
+ /* number of groups supporting / contradicting the dependency */
+ int n_supporting = 0;
+ int n_contradicting = 0;
+
+ /* counters valid within a group */
+ int group_size = 0;
+ int n_violations = 0;
+
+ int n_supporting_rows = 0;
+ int n_contradicting_rows = 0;
+
+ /* make sure the columns are different (i.e. skip A => A) */
+ if (dima == dimb)
+ continue;
+
+ /* prepare the sort function for the second dimension */
+ multi_sort_add_dimension(mss, 1, dimb, stats);
+
+ /* reset the values and isnull flags */
+ memset(values, 0, sizeof(Datum) * numrows * 2);
+ memset(isnull, 0, sizeof(bool) * numrows * 2);
+
+ /* accumulate all the data for both columns into an array and sort it */
+ for (i = 0; i < numrows; i++)
+ {
+ items[i].values[0]
+ = heap_getattr(rows[i], attrs->values[dima],
+ stats[dima]->tupDesc, &items[i].isnull[0]);
+
+ items[i].values[1]
+ = heap_getattr(rows[i], attrs->values[dimb],
+ stats[dimb]->tupDesc, &items[i].isnull[1]);
+ }
+
+ qsort_arg((void *) items, numrows, sizeof(SortItem),
+ multi_sort_compare, mss);
+
+ /*
+ * Walk through the array, split it into rows according to
+ * the A value, and count distinct values in the other one.
+ * If there's a single B value for the whole group, we count
+ * it as supporting the association, otherwise we count it
+ * as contradicting.
+ *
+ * Furthermore we require a group to have at least a certain
+ * number of rows to be considered useful for supporting the
+ * dependency. But when a group is contradicting, we always count it.
+ */
+
+ /* start with values from the first row */
+ current = items[0];
+ group_size = 1;
+
+ for (i = 1; i < numrows; i++)
+ {
+ /* end of the group */
+ if (multi_sort_compare_dim(0, &items[i], &current, mss) != 0)
+ {
+ /*
+ * If there are no contradicting rows, count it as
+ * supporting (otherwise contradicting), but only if
+ * the group is large enough.
+ *
+ * The requirement of a minimum group size makes it
+ * impossible to identify [unique,unique] cases, but
+ * that's probably a different case. This is more
+ * about [zip => city] associations etc.
+ *
+ * If there are violations, count the group/rows as
+ * a violation.
+ *
+ * It may be neither, if the group is too small (does
+ * not contain at least min_group_size rows).
+ */
+ if ((n_violations == 0) && (group_size >= min_group_size))
+ {
+ n_supporting += 1;
+ n_supporting_rows += group_size;
+ }
+ else if (n_violations > 0)
+ {
+ n_contradicting += 1;
+ n_contradicting_rows += group_size;
+ }
+
+ /* current values start a new group */
+ n_violations = 0;
+ group_size = 0;
+ }
+ /* mismatch of a B value is contradicting */
+ else if (multi_sort_compare_dim(1, &items[i], &current, mss) != 0)
+ {
+ n_violations += 1;
+ }
+
+ current = items[i];
+ group_size += 1;
+ }
+
+ /* handle the last group (just like above) */
+ if ((n_violations == 0) && (group_size >= min_group_size))
+ {
+ n_supporting += 1;
+ n_supporting_rows += group_size;
+ }
+ else if (n_violations)
+ {
+ n_contradicting += 1;
+ n_contradicting_rows += group_size;
+ }
+
+ /*
+ * See if the number of rows supporting the association is at least
+ * 10x the number of rows violating the hypothetical dependency.
+ *
+ * TODO This is a rather arbitrary limit - I guess it's possible to do
+ * some math to come up with a better rule (e.g. testing a hypothesis
+ * 'this is due to randomness'). We can create a contingency table
+ * from the values and use it for testing. Possibly only when
+ * there are no contradicting rows?
+ *
+ * TODO Also, if (a => b) and (b => a) at the same time, it pretty much
+ * means there's a 1:1 relation (or one is a 'label'), making the
+ * conditions rather redundant. Although it's possible that the
+ * query uses an incompatible combination of values.
+ */
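+ /*
+ * For instance, for the example data sketched before the loop above,
+ * this test fails (3 supporting vs. 3 contradicting rows), so no
+ * dependency gets stored.
+ */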
+ if (n_supporting_rows > (n_contradicting_rows * 10))
+ {
+ if (dependencies == NULL)
+ {
+ dependencies = (MVDependencies)palloc0(sizeof(MVDependenciesData));
+ dependencies->magic = MVSTAT_DEPS_MAGIC;
+ }
+ else
+ dependencies = repalloc(dependencies, offsetof(MVDependenciesData, deps)
+ + sizeof(MVDependency) * (dependencies->ndeps + 1));
+
+ /* add the new dependency to the list */
+ dependencies->deps[ndeps] = (MVDependency)palloc0(sizeof(MVDependencyData));
+ dependencies->deps[ndeps]->a = attrs->values[dima];
+ dependencies->deps[ndeps]->b = attrs->values[dimb];
+
+ dependencies->ndeps = (++ndeps);
+ }
+ }
+ }
+
+ pfree(items);
+ pfree(values);
+ pfree(isnull);
+ pfree(stats);
+ pfree(mss);
+
+ return dependencies;
+}
+
+/*
+ * Store the dependencies into a bytea, so that it can be stored in the
+ * pg_mv_statistic catalog.
+ *
+ * Currently this only supports simple two-column rules, and stores them
+ * as a sequence of attnum pairs. In the future, this needs to be made
+ * more complex to support multiple columns on both sides of the
+ * implication (using AND on left, OR on right).
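+ *
+ * The resulting layout is thus (assuming, as on common platforms, no
+ * padding between the magic and ndeps fields):
+ *
+ *    varlena header | magic (uint32) | ndeps (int32) | ndeps pairs of int16 (a,b)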
+ */
+bytea *
+serialize_mv_dependencies(MVDependencies dependencies)
+{
+ int i;
+
+ /* we need to store ndeps, and each needs 2 * int16 */
+ Size len = VARHDRSZ + offsetof(MVDependenciesData, deps)
+ + dependencies->ndeps * (sizeof(int16) * 2);
+
+ bytea * output = (bytea*)palloc0(len);
+
+ char * tmp = VARDATA(output);
+
+ SET_VARSIZE(output, len);
+
+ /* first, store the number of dimensions / items */
+ memcpy(tmp, dependencies, offsetof(MVDependenciesData, deps));
+ tmp += offsetof(MVDependenciesData, deps);
+
+ /* walk through the dependencies and copy both columns into the bytea */
+ for (i = 0; i < dependencies->ndeps; i++)
+ {
+ memcpy(tmp, &(dependencies->deps[i]->a), sizeof(int16));
+ tmp += sizeof(int16);
+
+ memcpy(tmp, &(dependencies->deps[i]->b), sizeof(int16));
+ tmp += sizeof(int16);
+ }
+
+ return output;
+}
+
+/*
+ * Reads serialized dependencies into MVDependencies structure.
+ */
+MVDependencies
+deserialize_mv_dependencies(bytea * data)
+{
+ int i;
+ Size expected_size;
+ MVDependencies dependencies;
+ char *tmp;
+
+ if (data == NULL)
+ return NULL;
+
+ if (VARSIZE_ANY_EXHDR(data) < offsetof(MVDependenciesData,deps))
+ elog(ERROR, "invalid MVDependencies size %ld (expected at least %ld)",
+ VARSIZE_ANY_EXHDR(data), offsetof(MVDependenciesData,deps));
+
+ /* read the MVDependencies header */
+ dependencies = (MVDependencies)palloc0(sizeof(MVDependenciesData));
+
+ /* initialize pointer to the data part (skip the varlena header) */
+ tmp = VARDATA(data);
+
+ /* get the header and perform basic sanity checks */
+ memcpy(dependencies, tmp, offsetof(MVDependenciesData, deps));
+ tmp += offsetof(MVDependenciesData, deps);
+
+ if (dependencies->magic != MVSTAT_DEPS_MAGIC)
+ {
+ pfree(dependencies);
+ elog(WARNING, "not a MV Dependencies (magic number mismatch)");
+ return NULL;
+ }
+
+ Assert(dependencies->ndeps > 0);
+
+ /* what bytea size do we expect for those parameters */
+ expected_size = offsetof(MVDependenciesData,deps) +
+ dependencies->ndeps * sizeof(int16) * 2;
+
+ if (VARSIZE_ANY_EXHDR(data) != expected_size)
+ elog(ERROR, "invalid dependencies size %ld (expected %ld)",
+ VARSIZE_ANY_EXHDR(data), expected_size);
+
+ /* allocate space for the dependency pointers */
+ dependencies = repalloc(dependencies, offsetof(MVDependenciesData,deps)
+ + (dependencies->ndeps * sizeof(MVDependency)));
+
+ for (i = 0; i < dependencies->ndeps; i++)
+ {
+ dependencies->deps[i] = (MVDependency)palloc0(sizeof(MVDependencyData));
+
+ memcpy(&(dependencies->deps[i]->a), tmp, sizeof(int16));
+ tmp += sizeof(int16);
+
+ memcpy(&(dependencies->deps[i]->b), tmp, sizeof(int16));
+ tmp += sizeof(int16);
+ }
+
+ return dependencies;
+}
+
+/* print some basic info about dependencies (number of dependencies) */
+Datum
+pg_mv_stats_dependencies_info(PG_FUNCTION_ARGS)
+{
+ bytea *data = PG_GETARG_BYTEA_P(0);
+ char *result;
+
+ MVDependencies dependencies = deserialize_mv_dependencies(data);
+
+ if (dependencies == NULL)
+ PG_RETURN_NULL();
+
+ result = palloc0(128);
+ snprintf(result, 128, "dependencies=%d", dependencies->ndeps);
+
+ /* FIXME free the deserialized data (pfree is not enough) */
+
+ PG_RETURN_TEXT_P(cstring_to_text(result));
+}
+
+/*
+ * print the dependencies
+ *
+ * TODO Would be nice if this knew the actual column names (instead of
+ * the attnums).
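+ *
+ * The result is a comma-separated list of "a => b" attnum pairs, e.g.
+ * "2 => 3, 3 => 2" for two mutually dependent columns.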
+ *
+ * FIXME This is really ugly and does not really check the lengths and
+ * strcpy/snprintf return values properly. Needs to be fixed.
+ */
+Datum
+pg_mv_stats_dependencies_show(PG_FUNCTION_ARGS)
+{
+ int i = 0;
+ bytea *data = PG_GETARG_BYTEA_P(0);
+ char *result = NULL;
+ int len = 0;
+
+ MVDependencies dependencies = deserialize_mv_dependencies(data);
+
+ if (dependencies == NULL)
+ PG_RETURN_NULL();
+
+ for (i = 0; i < dependencies->ndeps; i++)
+ {
+ MVDependency dependency = dependencies->deps[i];
+ char buffer[128];
+
+ int tmp = snprintf(buffer, 128, "%s%d => %d",
+ ((i == 0) ? "" : ", "), dependency->a, dependency->b);
+
+ if (tmp < 127)
+ {
+ if (result == NULL)
+ result = palloc0(len + tmp + 1);
+ else
+ result = repalloc(result, len + tmp + 1);
+
+ strcpy(result + len, buffer);
+ len += tmp;
+ }
+ }
+
+ PG_RETURN_TEXT_P(cstring_to_text(result));
+}
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index fd8dc91..4f106c3 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -2104,6 +2104,50 @@ describeOneTableDetails(const char *schemaname,
PQclear(result);
}
+ /* print any multivariate statistics */
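+ /*
+ * The resulting \d footer looks like this (hypothetical statistics
+ * name, assuming functional dependencies were requested):
+ *
+ *     Statistics:
+ *         "public.stats_a_b" (dependencies) ON (a, b)
+ */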
+ if (pset.sversion >= 90500)
+ {
+ printfPQExpBuffer(&buf,
+ "SELECT oid, stanamespace::regnamespace AS nsp, staname, stakeys,\n"
+ " deps_enabled,\n"
+ " deps_built,\n"
+ " (SELECT string_agg(attname::text,', ')\n"
+ " FROM ((SELECT unnest(stakeys) AS attnum) s\n"
+ " JOIN pg_attribute a ON (starelid = a.attrelid and a.attnum = s.attnum))) AS attnums\n"
+ "FROM pg_mv_statistic stat WHERE starelid = '%s' ORDER BY 1;",
+ oid);
+
+ result = PSQLexec(buf.data);
+ if (!result)
+ goto error_return;
+ else
+ tuples = PQntuples(result);
+
+ if (tuples > 0)
+ {
+ printTableAddFooter(&cont, _("Statistics:"));
+ for (i = 0; i < tuples; i++)
+ {
+ printfPQExpBuffer(&buf, " ");
+
+ /* statistics name (qualified with namespace) */
+ appendPQExpBuffer(&buf, "\"%s.%s\" ",
+ PQgetvalue(result, i, 1),
+ PQgetvalue(result, i, 2));
+
+ /* options */
+ if (!strcmp(PQgetvalue(result, i, 4), "t"))
+ appendPQExpBuffer(&buf, "(dependencies)");
+
+ appendPQExpBuffer(&buf, " ON (%s)",
+ PQgetvalue(result, i, 6));
+
+ printTableAddFooter(&cont, buf.data);
+ }
+ }
+ PQclear(result);
+ }
+
/* print rules */
if (tableinfo.hasrules && tableinfo.relkind != 'm')
{
diff --git a/src/include/catalog/dependency.h b/src/include/catalog/dependency.h
index 049bf9f..12211fe 100644
--- a/src/include/catalog/dependency.h
+++ b/src/include/catalog/dependency.h
@@ -153,10 +153,11 @@ typedef enum ObjectClass
OCLASS_EXTENSION, /* pg_extension */
OCLASS_EVENT_TRIGGER, /* pg_event_trigger */
OCLASS_POLICY, /* pg_policy */
- OCLASS_TRANSFORM /* pg_transform */
+ OCLASS_TRANSFORM, /* pg_transform */
+ OCLASS_STATISTICS /* pg_mv_statistics */
} ObjectClass;
-#define LAST_OCLASS OCLASS_TRANSFORM
+#define LAST_OCLASS OCLASS_STATISTICS
/* in dependency.c */
diff --git a/src/include/catalog/heap.h b/src/include/catalog/heap.h
index b80d8d8..5ae42f7 100644
--- a/src/include/catalog/heap.h
+++ b/src/include/catalog/heap.h
@@ -119,6 +119,7 @@ extern void RemoveAttrDefault(Oid relid, AttrNumber attnum,
DropBehavior behavior, bool complain, bool internal);
extern void RemoveAttrDefaultById(Oid attrdefId);
extern void RemoveStatistics(Oid relid, AttrNumber attnum);
+extern void RemoveMVStatistics(Oid relid, AttrNumber attnum);
extern Form_pg_attribute SystemAttributeDefinition(AttrNumber attno,
bool relhasoids);
diff --git a/src/include/catalog/indexing.h b/src/include/catalog/indexing.h
index ab2c1a8..a768bb5 100644
--- a/src/include/catalog/indexing.h
+++ b/src/include/catalog/indexing.h
@@ -173,6 +173,13 @@ DECLARE_UNIQUE_INDEX(pg_largeobject_loid_pn_index, 2683, on pg_largeobject using
DECLARE_UNIQUE_INDEX(pg_largeobject_metadata_oid_index, 2996, on pg_largeobject_metadata using btree(oid oid_ops));
#define LargeObjectMetadataOidIndexId 2996
+DECLARE_UNIQUE_INDEX(pg_mv_statistic_oid_index, 3380, on pg_mv_statistic using btree(oid oid_ops));
+#define MvStatisticOidIndexId 3380
+DECLARE_UNIQUE_INDEX(pg_mv_statistic_name_index, 3997, on pg_mv_statistic using btree(staname name_ops, stanamespace oid_ops));
+#define MvStatisticNameIndexId 3997
+DECLARE_INDEX(pg_mv_statistic_relid_index, 3379, on pg_mv_statistic using btree(starelid oid_ops));
+#define MvStatisticRelidIndexId 3379
+
DECLARE_UNIQUE_INDEX(pg_namespace_nspname_index, 2684, on pg_namespace using btree(nspname name_ops));
#define NamespaceNameIndexId 2684
DECLARE_UNIQUE_INDEX(pg_namespace_oid_index, 2685, on pg_namespace using btree(oid oid_ops));
diff --git a/src/include/catalog/namespace.h b/src/include/catalog/namespace.h
index 2ccb3a7..44cf9c6 100644
--- a/src/include/catalog/namespace.h
+++ b/src/include/catalog/namespace.h
@@ -137,6 +137,8 @@ extern Oid get_collation_oid(List *collname, bool missing_ok);
extern Oid get_conversion_oid(List *conname, bool missing_ok);
extern Oid FindDefaultConversionProc(int32 for_encoding, int32 to_encoding);
+extern Oid get_statistics_oid(List *names, bool missing_ok);
+
/* initialization & transaction cleanup code */
extern void InitializeSearchPath(void);
extern void AtEOXact_Namespace(bool isCommit, bool parallel);
diff --git a/src/include/catalog/pg_mv_statistic.h b/src/include/catalog/pg_mv_statistic.h
new file mode 100644
index 0000000..a568a07
--- /dev/null
+++ b/src/include/catalog/pg_mv_statistic.h
@@ -0,0 +1,73 @@
+/*-------------------------------------------------------------------------
+ *
+ * pg_mv_statistic.h
+ * definition of the system "multivariate statistic" relation (pg_mv_statistic)
+ * along with the relation's initial contents.
+ *
+ *
+ * Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/catalog/pg_mv_statistic.h
+ *
+ * NOTES
+ * the genbki.pl script reads this file and generates .bki
+ * information from the DATA() statements.
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef PG_MV_STATISTIC_H
+#define PG_MV_STATISTIC_H
+
+#include "catalog/genbki.h"
+
+/* ----------------
+ * pg_mv_statistic definition. cpp turns this into
+ * typedef struct FormData_pg_mv_statistic
+ * ----------------
+ */
+#define MvStatisticRelationId 3381
+
+CATALOG(pg_mv_statistic,3381)
+{
+ /* These fields form the unique key for the entry: */
+ Oid starelid; /* relation containing attributes */
+ NameData staname; /* statistics name */
+ Oid stanamespace; /* OID of namespace containing this statistics */
+
+ /* statistics requested to build */
+ bool deps_enabled; /* analyze dependencies? */
+
+ /* statistics that are available (if requested) */
+ bool deps_built; /* dependencies were built */
+
+ /* variable-length fields start here, but we allow direct access to stakeys */
+ int2vector stakeys; /* array of column keys */
+
+#ifdef CATALOG_VARLEN
+ bytea stadeps; /* dependencies (serialized) */
+#endif
+
+} FormData_pg_mv_statistic;
+
+/* ----------------
+ * Form_pg_mv_statistic corresponds to a pointer to a tuple with
+ * the format of pg_mv_statistic relation.
+ * ----------------
+ */
+typedef FormData_pg_mv_statistic *Form_pg_mv_statistic;
+
+/* ----------------
+ * compiler constants for pg_mv_statistic
+ * ----------------
+ */
+#define Natts_pg_mv_statistic 7
+#define Anum_pg_mv_statistic_starelid 1
+#define Anum_pg_mv_statistic_staname 2
+#define Anum_pg_mv_statistic_stanamespace 3
+#define Anum_pg_mv_statistic_deps_enabled 4
+#define Anum_pg_mv_statistic_deps_built 5
+#define Anum_pg_mv_statistic_stakeys 6
+#define Anum_pg_mv_statistic_stadeps 7
+
+#endif /* PG_MV_STATISTIC_H */
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
index 62b9125..20d565c 100644
--- a/src/include/catalog/pg_proc.h
+++ b/src/include/catalog/pg_proc.h
@@ -2666,6 +2666,11 @@ DESCR("current user privilege on any column by rel name");
DATA(insert OID = 3029 ( has_any_column_privilege PGNSP PGUID 12 10 0 0 0 f f f f t f s s 2 0 16 "26 25" _null_ _null_ _null_ _null_ _null_ has_any_column_privilege_id _null_ _null_ _null_ ));
DESCR("current user privilege on any column by rel oid");
+DATA(insert OID = 3998 ( pg_mv_stats_dependencies_info PGNSP PGUID 12 1 0 0 0 f f f f t f i s 1 0 25 "17" _null_ _null_ _null_ _null_ _null_ pg_mv_stats_dependencies_info _null_ _null_ _null_ ));
+DESCR("multivariate stats: functional dependencies info");
+DATA(insert OID = 3999 ( pg_mv_stats_dependencies_show PGNSP PGUID 12 1 0 0 0 f f f f t f i s 1 0 25 "17" _null_ _null_ _null_ _null_ _null_ pg_mv_stats_dependencies_show _null_ _null_ _null_ ));
+DESCR("multivariate stats: functional dependencies show");
+
DATA(insert OID = 1928 ( pg_stat_get_numscans PGNSP PGUID 12 1 0 0 0 f f f f t f s r 1 0 20 "26" _null_ _null_ _null_ _null_ _null_ pg_stat_get_numscans _null_ _null_ _null_ ));
DESCR("statistics: number of scans done for table/index");
DATA(insert OID = 1929 ( pg_stat_get_tuples_returned PGNSP PGUID 12 1 0 0 0 f f f f t f s r 1 0 20 "26" _null_ _null_ _null_ _null_ _null_ pg_stat_get_tuples_returned _null_ _null_ _null_ ));
diff --git a/src/include/catalog/toasting.h b/src/include/catalog/toasting.h
index b7a38ce..a52096b 100644
--- a/src/include/catalog/toasting.h
+++ b/src/include/catalog/toasting.h
@@ -49,6 +49,7 @@ extern void BootstrapToastTable(char *relName,
DECLARE_TOAST(pg_attrdef, 2830, 2831);
DECLARE_TOAST(pg_constraint, 2832, 2833);
DECLARE_TOAST(pg_description, 2834, 2835);
+DECLARE_TOAST(pg_mv_statistic, 3577, 3578);
DECLARE_TOAST(pg_proc, 2836, 2837);
DECLARE_TOAST(pg_rewrite, 2838, 2839);
DECLARE_TOAST(pg_seclabel, 3598, 3599);
diff --git a/src/include/commands/defrem.h b/src/include/commands/defrem.h
index 54f67e9..99a6a62 100644
--- a/src/include/commands/defrem.h
+++ b/src/include/commands/defrem.h
@@ -75,6 +75,10 @@ extern ObjectAddress DefineOperator(List *names, List *parameters);
extern void RemoveOperatorById(Oid operOid);
extern ObjectAddress AlterOperator(AlterOperatorStmt *stmt);
+/* commands/statscmds.c */
+extern ObjectAddress CreateStatistics(CreateStatsStmt *stmt);
+extern void RemoveStatisticsById(Oid statsOid);
+
/* commands/aggregatecmds.c */
extern ObjectAddress DefineAggregate(List *name, List *args, bool oldstyle,
List *parameters, const char *queryString);
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index c407fa2..2226aad 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -251,6 +251,7 @@ typedef enum NodeTag
T_PlaceHolderInfo,
T_MinMaxAggInfo,
T_PlannerParamItem,
+ T_MVStatisticInfo,
/*
* TAGS FOR MEMORY NODES (memnodes.h)
@@ -386,6 +387,7 @@ typedef enum NodeTag
T_CreatePolicyStmt,
T_AlterPolicyStmt,
T_CreateTransformStmt,
+ T_CreateStatsStmt,
/*
* TAGS FOR PARSE TREE NODES (parsenodes.h)
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 2fd0629..e1807fb 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -601,6 +601,17 @@ typedef struct ColumnDef
int location; /* parse location, or -1 if none/unknown */
} ColumnDef;
+typedef struct CreateStatsStmt
+{
+ NodeTag type;
+ List *defnames; /* qualified name (list of Value strings) */
+ RangeVar *relation; /* relation to build statistics on */
+ List *keys; /* String nodes naming referenced column(s) */
+ List *options; /* list of DefElem nodes */
+ bool if_not_exists; /* just do nothing if statistics already exists? */
+} CreateStatsStmt;
+
+
/*
* TableLikeClause - CREATE TABLE ( ... LIKE ... ) clause
*/
@@ -1410,6 +1421,7 @@ typedef enum ObjectType
OBJECT_RULE,
OBJECT_SCHEMA,
OBJECT_SEQUENCE,
+ OBJECT_STATISTICS,
OBJECT_TABCONSTRAINT,
OBJECT_TABLE,
OBJECT_TABLESPACE,
diff --git a/src/include/nodes/relation.h b/src/include/nodes/relation.h
index af8cb6b..de86d01 100644
--- a/src/include/nodes/relation.h
+++ b/src/include/nodes/relation.h
@@ -503,6 +503,7 @@ typedef struct RelOptInfo
List *lateral_vars; /* LATERAL Vars and PHVs referenced by rel */
Relids lateral_referencers; /* rels that reference me laterally */
List *indexlist; /* list of IndexOptInfo */
+ List *mvstatlist; /* list of MVStatisticInfo */
BlockNumber pages; /* size estimates derived from pg_class */
double tuples;
double allvisfrac;
@@ -600,6 +601,33 @@ typedef struct IndexOptInfo
void (*amcostestimate) (); /* AM's cost estimator */
} IndexOptInfo;
+/*
+ * MVStatisticInfo
+ * Information about multivariate stats for planning/optimization
+ *
+ * This contains information about which columns are covered by the
+ * statistics (stakeys), which options were requested while adding the
+ * statistics (*_enabled), and which kinds of statistics were actually
+ * built and are available for the optimizer (*_built).
+ */
+typedef struct MVStatisticInfo
+{
+ NodeTag type;
+
+ Oid mvoid; /* OID of the statistics row */
+ RelOptInfo *rel; /* back-link to the statistics' table */
+
+ /* enabled statistics */
+ bool deps_enabled; /* functional dependencies enabled */
+
+ /* built/available statistics */
+ bool deps_built; /* functional dependencies built */
+
+ /* columns in the statistics (attnums) */
+ int2vector *stakeys; /* attnums of the columns covered */
+
+} MVStatisticInfo;
+
/*
* EquivalenceClasses
diff --git a/src/include/utils/mvstats.h b/src/include/utils/mvstats.h
new file mode 100644
index 0000000..7ebd961
--- /dev/null
+++ b/src/include/utils/mvstats.h
@@ -0,0 +1,70 @@
+/*-------------------------------------------------------------------------
+ *
+ * mvstats.h
+ * Multivariate statistics and selectivity estimation functions.
+ *
+ *
+ * Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/utils/mvstats.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef MVSTATS_H
+#define MVSTATS_H
+
+#include "fmgr.h"
+#include "commands/vacuum.h"
+
+
+#define MVSTATS_MAX_DIMENSIONS 8 /* max number of attributes */
+
+/* An associative rule, tracking [a => b] dependency.
+ *
+ * TODO Make this work with multiple columns on both sides.
+ */
+typedef struct MVDependencyData {
+ int16 a;
+ int16 b;
+} MVDependencyData;
+
+typedef MVDependencyData* MVDependency;
+
+typedef struct MVDependenciesData {
+ uint32 magic; /* magic constant marker */
+ int32 ndeps; /* number of dependencies */
+ MVDependency deps[1]; /* XXX why not a pointer? */
+} MVDependenciesData;
+
+typedef MVDependenciesData* MVDependencies;
+
+#define MVSTAT_DEPS_MAGIC 0xB4549A2C /* marks serialized bytea */
+#define MVSTAT_DEPS_TYPE_BASIC 1 /* basic dependencies type */
+
+/*
+ * TODO Maybe fetching the histogram/MCV list separately is inefficient?
+ * Consider adding a single `fetch_stats` method, fetching all
+ * stats specified using flags (or something like that).
+ */
+
+bytea * serialize_mv_dependencies(MVDependencies dependencies);
+
+/* (de)serialization of stats to/from bytea */
+MVDependencies deserialize_mv_dependencies(bytea * data);
+
+/* FIXME this probably belongs somewhere else (not to operations stats) */
+extern Datum pg_mv_stats_dependencies_info(PG_FUNCTION_ARGS);
+extern Datum pg_mv_stats_dependencies_show(PG_FUNCTION_ARGS);
+
+MVDependencies
+build_mv_dependencies(int numrows, HeapTuple *rows,
+ int2vector *attrs,
+ VacAttrStats **stats);
+
+void build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
+ int natts, VacAttrStats **vacattrstats);
+
+void update_mv_stats(Oid relid, MVDependencies dependencies, int2vector *attrs);
+
+#endif
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index f2bebf2..8771f9c 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -61,6 +61,7 @@ typedef struct RelationData
bool rd_isvalid; /* relcache entry is valid */
char rd_indexvalid; /* state of rd_indexlist: 0 = not valid, 1 =
* valid, 2 = temporarily forced */
+ bool rd_mvstatvalid; /* state of rd_mvstatlist: true/false */
/*
* rd_createSubid is the ID of the highest subtransaction the rel has
@@ -93,6 +94,9 @@ typedef struct RelationData
List *rd_indexlist; /* list of OIDs of indexes on relation */
Oid rd_oidindex; /* OID of unique index on OID, if any */
Oid rd_replidindex; /* OID of replica identity index, if any */
+
+ /* data managed by RelationGetMVStatList: */
+ List *rd_mvstatlist; /* list of OIDs of multivariate stats */
/* data managed by RelationGetIndexAttrBitmap: */
Bitmapset *rd_indexattr; /* identifies columns used in indexes */
diff --git a/src/include/utils/relcache.h b/src/include/utils/relcache.h
index 1b48304..9f03c8d 100644
--- a/src/include/utils/relcache.h
+++ b/src/include/utils/relcache.h
@@ -38,6 +38,7 @@ extern void RelationClose(Relation relation);
* Routines to compute/retrieve additional cached information
*/
extern List *RelationGetIndexList(Relation relation);
+extern List *RelationGetMVStatList(Relation relation);
extern Oid RelationGetOidIndex(Relation relation);
extern Oid RelationGetReplicaIndex(Relation relation);
extern List *RelationGetIndexExpressions(Relation relation);
diff --git a/src/include/utils/syscache.h b/src/include/utils/syscache.h
index 256615b..0e0658d 100644
--- a/src/include/utils/syscache.h
+++ b/src/include/utils/syscache.h
@@ -66,6 +66,8 @@ enum SysCacheIdentifier
INDEXRELID,
LANGNAME,
LANGOID,
+ MVSTATNAMENSP,
+ MVSTATOID,
NAMESPACENAME,
NAMESPACEOID,
OPERNAMENSP,
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 81bc5c9..84b4425 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1368,6 +1368,15 @@ pg_matviews| SELECT n.nspname AS schemaname,
LEFT JOIN pg_namespace n ON ((n.oid = c.relnamespace)))
LEFT JOIN pg_tablespace t ON ((t.oid = c.reltablespace)))
WHERE (c.relkind = 'm'::"char");
+pg_mv_stats| SELECT n.nspname AS schemaname,
+ c.relname AS tablename,
+ s.staname,
+ s.stakeys AS attnums,
+ length(s.stadeps) AS depsbytes,
+ pg_mv_stats_dependencies_info(s.stadeps) AS depsinfo
+ FROM ((pg_mv_statistic s
+ JOIN pg_class c ON ((c.oid = s.starelid)))
+ LEFT JOIN pg_namespace n ON ((n.oid = c.relnamespace)));
pg_policies| SELECT n.nspname AS schemaname,
c.relname AS tablename,
pol.polname AS policyname,
diff --git a/src/test/regress/expected/sanity_check.out b/src/test/regress/expected/sanity_check.out
index eb0bc88..92a0d8a 100644
--- a/src/test/regress/expected/sanity_check.out
+++ b/src/test/regress/expected/sanity_check.out
@@ -113,6 +113,7 @@ pg_inherits|t
pg_language|t
pg_largeobject|t
pg_largeobject_metadata|t
+pg_mv_statistic|t
pg_namespace|t
pg_opclass|t
pg_operator|t
--
2.1.0
Attachment: 0003-clause-reduction-using-functional-dependencies.patch
From 282579eef3de01e0d31ed5f7067045a4f97fbfb8 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@pgaddict.com>
Date: Mon, 6 Apr 2015 19:42:18 +0200
Subject: [PATCH 3/9] clause reduction using functional dependencies
During planning, use functional dependencies to decide which clauses to
skip during cardinality estimation. Initial and rather simplistic
implementation.
This only works with regular WHERE clauses, not with clauses used as
join conditions.
Note: clause_is_mv_compatible() needs to identify the relation (so
that we can fetch the list of multivariate stats by OID).
planner_rt_fetch() seems like the appropriate way to get the relation
OID, but apparently it only works with simple vars. Maybe
examine_variable() would make this work with more complex vars too?
Includes regression tests analyzing functional dependencies (part of
ANALYZE) on several datasets (no dependencies, no transitive
dependencies, ...).
The tests check that a query with conditions on two columns, where one
(B) is functionally dependent on the other (A), correctly ignores the
clause on (B) and chooses a bitmap index scan instead of a plain index
scan (which is what happens otherwise, thanks to the assumption of
independence).
Note: Functional dependencies only work with equality clauses, no
inequalities etc.
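
A quick illustration (hypothetical table and column names; the DDL
syntax comes from the earlier patches in this series):

    ALTER TABLE zips ADD STATISTICS (dependencies) ON (zip, city);
    ANALYZE zips;

    -- the clause on "city" is implied by (zip => city), so only the
    -- "zip" clause is used for the estimate
    EXPLAIN SELECT * FROM zips WHERE zip = 12345 AND city = 678;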
---
src/backend/optimizer/path/clausesel.c | 891 +++++++++++++++++++++++++-
src/backend/utils/mvstats/README.stats | 36 ++
src/backend/utils/mvstats/common.c | 5 +-
src/backend/utils/mvstats/dependencies.c | 24 +
src/include/utils/mvstats.h | 16 +-
src/test/regress/expected/mv_dependencies.out | 172 +++++
src/test/regress/parallel_schedule | 3 +
src/test/regress/serial_schedule | 1 +
src/test/regress/sql/mv_dependencies.sql | 150 +++++
9 files changed, 1293 insertions(+), 5 deletions(-)
create mode 100644 src/backend/utils/mvstats/README.stats
create mode 100644 src/test/regress/expected/mv_dependencies.out
create mode 100644 src/test/regress/sql/mv_dependencies.sql
diff --git a/src/backend/optimizer/path/clausesel.c b/src/backend/optimizer/path/clausesel.c
index 02660c2..80708fe 100644
--- a/src/backend/optimizer/path/clausesel.c
+++ b/src/backend/optimizer/path/clausesel.c
@@ -14,14 +14,19 @@
*/
#include "postgres.h"
+#include "access/sysattr.h"
+#include "catalog/pg_operator.h"
#include "nodes/makefuncs.h"
#include "optimizer/clauses.h"
#include "optimizer/cost.h"
#include "optimizer/pathnode.h"
#include "optimizer/plancat.h"
+#include "optimizer/var.h"
#include "utils/fmgroids.h"
#include "utils/lsyscache.h"
+#include "utils/mvstats.h"
#include "utils/selfuncs.h"
+#include "utils/typcache.h"
/*
@@ -41,6 +46,23 @@ typedef struct RangeQueryClause
static void addRangeClause(RangeQueryClause **rqlist, Node *clause,
bool varonleft, bool isLTsel, Selectivity s2);
+#define MV_CLAUSE_TYPE_FDEP 0x01
+
+static bool clause_is_mv_compatible(Node *clause, Index relid, AttrNumber *attnum);
+
+static Bitmapset *collect_mv_attnums(List *clauses, Index relid);
+
+static int count_mv_attnums(List *clauses, Index relid);
+
+static int count_varnos(List *clauses, Index *relid);
+
+static List *clauselist_apply_dependencies(PlannerInfo *root, List *clauses,
+ Index relid, List *stats);
+
+static bool has_stats(List *stats, int type);
+
+static List * find_stats(PlannerInfo *root, Index relid);
+
/****************************************************************************
* ROUTINES TO COMPUTE SELECTIVITIES
@@ -60,7 +82,19 @@ static void addRangeClause(RangeQueryClause **rqlist, Node *clause,
* subclauses. However, that's only right if the subclauses have independent
* probabilities, and in reality they are often NOT independent. So,
* we want to be smarter where we can.
-
+ *
+ * The first thing we try is to apply multivariate statistics, in a way that
+ * minimizes the overhead when there are no multivariate stats on the
+ * relation. Thus we do several simple (and inexpensive) checks first, to
+ * verify that suitable multivariate statistics exist.
+ *
+ * If we identify suitable multivariate statistics, we try to apply them.
+ * Currently we only have (soft) functional dependencies, so we try to reduce
+ * the list of clauses.
+ *
+ * Then we remove the clauses estimated using multivariate stats, and process
+ * the rest of the clauses using the regular per-column stats.
+ *
* Currently, the only extra smarts we have is to recognize "range queries",
* such as "x > 34 AND x < 42". Clauses are recognized as possible range
* query components if they are restriction opclauses whose operators have
@@ -99,6 +133,22 @@ clauselist_selectivity(PlannerInfo *root,
RangeQueryClause *rqlist = NULL;
ListCell *l;
+ /* varno of the single relation referenced by the clauses (for mv stats) */
+ Index relid = 0;
+
+ /* list of multivariate stats on the relation */
+ List *stats = NIL;
+
+ /*
+ * To fetch the statistics, we first need to determine the rel. Currently
+ * we only support estimates of simple restrictions with all Vars
+ * referencing a single baserel. However set_baserel_size_estimates() sets
+ * varRelid=0, so we have to actually inspect the clauses using pull_varnos
+ * and see if there's just a single varno referenced.
+ */
+ if ((count_varnos(clauses, &relid) == 1) && ((varRelid == 0) || (varRelid == relid)))
+ stats = find_stats(root, relid);
+
/*
* If there's exactly one clause, then no use in trying to match up pairs,
* so just go directly to clause_selectivity().
@@ -108,6 +158,24 @@ clauselist_selectivity(PlannerInfo *root,
varRelid, jointype, sjinfo);
/*
+ * Apply functional dependencies, but first check that there are some stats
+ * with functional dependencies built (by simply walking the stats list),
+ * and that there are two or more attributes referenced by clauses that
+ * may be reduced using functional dependencies.
+ *
+ * We would find that anyway when trying to actually apply the functional
+ * dependencies, but let's do the cheap checks first.
+ *
+ * After applying the functional dependencies we get the remaining clauses
+ * that need to be estimated by other types of stats (MCV, histograms etc).
+ */
+ if (has_stats(stats, MV_CLAUSE_TYPE_FDEP) &&
+ (count_mv_attnums(clauses, relid) >= 2))
+ {
+ clauses = clauselist_apply_dependencies(root, clauses, relid, stats);
+ }
+
+ /*
* Initial scan over clauses. Anything that doesn't look like a potential
* rangequery clause gets multiplied into s1 and forgotten. Anything that
* does gets inserted into an rqlist entry.
@@ -763,3 +831,824 @@ clause_selectivity(PlannerInfo *root,
return s1;
}
+
+/*
+ * Pull varattnos from the clauses, similarly to pull_varattnos() but:
+ *
+ * (a) only get attributes for a particular relation (relid)
+ * (b) ignore system attributes (we can't build stats on them anyway)
+ *
+ * This makes it possible to directly compare the result with attnum
+ * values from pg_attribute etc.
+ */
+static Bitmapset *
+get_varattnos(Node * node, Index relid)
+{
+ int k;
+ Bitmapset *varattnos = NULL;
+ Bitmapset *result = NULL;
+
+ /* get the varattnos */
+ pull_varattnos(node, relid, &varattnos);
+
+ k = -1;
+ while ((k = bms_next_member(varattnos, k)) >= 0)
+ {
+ if (k + FirstLowInvalidHeapAttributeNumber > 0)
+ result = bms_add_member(result, k + FirstLowInvalidHeapAttributeNumber);
+ }
+
+ bms_free(varattnos);
+
+ return result;
+}
+
+/*
+ * Collect attributes from mv-compatible clauses.
+ */
+static Bitmapset *
+collect_mv_attnums(List *clauses, Index relid)
+{
+ Bitmapset *attnums = NULL;
+ ListCell *l;
+
+ /*
+ * Walk through the clauses and identify the ones we can estimate
+ * using multivariate stats, and remember the relid/columns. We'll
+ * then cross-check if we have suitable stats, and only if needed
+ * we'll split the clauses into multivariate and regular lists.
+ *
+ * For now we're only interested in RestrictInfo nodes with nested
+ * OpExpr, using either a range or equality.
+ */
+ foreach (l, clauses)
+ {
+ AttrNumber attnum;
+ Node *clause = (Node *) lfirst(l);
+
+ /* ignore the result for now - we only need the info */
+ if (clause_is_mv_compatible(clause, relid, &attnum))
+ attnums = bms_add_member(attnums, attnum);
+ }
+
+ /*
+ * If there are not at least two attributes referenced by the clause(s),
+ * we can throw everything out (as we'll revert to simple stats).
+ */
+ if (bms_num_members(attnums) <= 1)
+ {
+ if (attnums != NULL)
+ pfree(attnums);
+ attnums = NULL;
+ }
+
+ return attnums;
+}
+
+/*
+ * Count the number of attributes in clauses compatible with multivariate stats.
+ */
+static int
+count_mv_attnums(List *clauses, Index relid)
+{
+ int c;
+ Bitmapset *attnums = collect_mv_attnums(clauses, relid);
+
+ c = bms_num_members(attnums);
+
+ bms_free(attnums);
+
+ return c;
+}
+
+/*
+ * Count varnos referenced in the clauses, and if there's a single varno then
+ * return the index in 'relid'.
+ */
+static int
+count_varnos(List *clauses, Index *relid)
+{
+ int cnt;
+ Bitmapset *varnos = NULL;
+
+ varnos = pull_varnos((Node *) clauses);
+ cnt = bms_num_members(varnos);
+
+ /* if there's a single varno in the clauses, remember it */
+ if (cnt == 1)
+ *relid = bms_singleton_member(varnos);
+
+ bms_free(varnos);
+
+ return cnt;
+}
+
+typedef struct
+{
+ Index varno; /* relid we're interested in */
+ Bitmapset *varattnos; /* attnums referenced by the clauses */
+} mv_compatible_context;
+
+/*
+ * Recursive walker that checks compatibility of the clause with multivariate
+ * statistics, and collects attnums from the Vars.
+ *
+ * XXX The original idea was to combine this with expression_tree_walker, but
+ * I've been unable to make that work - it seems that it does not quite allow
+ * checking the structure. Hence the explicit calls to the walker.
+ */
+static bool
+mv_compatible_walker(Node *node, mv_compatible_context *context)
+{
+ if (node == NULL)
+ return false;
+
+ if (IsA(node, RestrictInfo))
+ {
+ RestrictInfo *rinfo = (RestrictInfo *) node;
+
+ /* Pseudoconstants are not really interesting here. */
+ if (rinfo->pseudoconstant)
+ return true;
+
+ /* clauses referencing multiple varnos are incompatible */
+ if (bms_membership(rinfo->clause_relids) != BMS_SINGLETON)
+ return true;
+
+ /* check the clause inside the RestrictInfo */
+ return mv_compatible_walker((Node*)rinfo->clause, (void *) context);
+ }
+
+ if (IsA(node, Var))
+ {
+ Var * var = (Var*)node;
+
+ /*
+ * Also, the variable needs to reference the right relid (this might be
+ * unnecessary given the other checks, but let's be sure).
+ */
+ if (var->varno != context->varno)
+ return true;
+
+ /* Also skip system attributes (we don't allow stats on those). */
+ if (! AttrNumberIsForUserDefinedAttr(var->varattno))
+ return true;
+
+ /* Seems fine, so let's remember the attnum. */
+ context->varattnos = bms_add_member(context->varattnos, var->varattno);
+
+ return false;
+ }
+
+ /*
+ * And finally the operator expressions - we only allow simple expressions
+ * with two arguments, where one is a Var and the other is a constant, and
+ * it's a simple comparison (which we detect using estimator function).
+ */
+ if (is_opclause(node))
+ {
+ OpExpr *expr = (OpExpr *) node;
+ Var *var;
+ bool varonleft = true;
+ bool ok;
+
+ /*
+ * Only expressions with two arguments are considered compatible.
+ *
+ * XXX Possibly unnecessary (can OpExpr have different arg count?).
+ */
+ if (list_length(expr->args) != 2)
+ return true;
+
+ /* see if it actually has the right shape (one Var, one Const) */
+ ok = (NumRelids((Node*)expr) == 1) &&
+ (is_pseudo_constant_clause(lsecond(expr->args)) ||
+ (varonleft = false,
+ is_pseudo_constant_clause(linitial(expr->args))));
+
+ /* unsupported structure (two variables or so) */
+ if (! ok)
+ return true;
+
+ /*
+ * If it's not a "<" or ">" or "=" operator, just ignore the clause.
+ * Otherwise note the relid and attnum for the variable. This uses the
+ * function for estimating selectivity, ont the operator directly (a bit
+ * awkward, but well ...).
+ */
+ switch (get_oprrest(expr->opno))
+ {
+ case F_EQSEL:
+
+ /* equality conditions are compatible with all statistics */
+ break;
+
+ default:
+
+ /* unknown estimator */
+ return true;
+ }
+
+ var = (varonleft) ? linitial(expr->args) : lsecond(expr->args);
+
+ return mv_compatible_walker((Node *) var, context);
+ }
+
+ /* Node not explicitly supported, so terminate */
+ return true;
+}
+
+/*
+ * Determines whether the clause is compatible with multivariate stats,
+ * and if it is, returns some additional information - varno (index
+ * into simple_rte_array) and a bitmap of attributes. This is then
+ * used to fetch related multivariate statistics.
+ *
+ * At this moment we only support basic conditions of the form
+ *
+ * variable OP constant
+ *
+ * where OP is currently just the "=" operator (which is determined by
+ * looking at the associated function for estimating selectivity, just
+ * like with the single-dimensional case).
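+ *
+ * So for example (a = 1) is compatible, while (a = b), (a + b = 3) or
+ * (a < 10) currently are not.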
+ *
+ * TODO Support 'OR clauses' - shouldn't be all that difficult to
+ * evaluate them using multivariate stats.
+ */
+static bool
+clause_is_mv_compatible(Node *clause, Index relid, AttrNumber *attnum)
+{
+ mv_compatible_context context;
+
+ context.varno = relid;
+ context.varattnos = NULL; /* no attnums */
+
+ if (mv_compatible_walker(clause, (void *) &context))
+ return false;
+
+ /* remember the newly collected attnums */
+ *attnum = bms_singleton_member(context.varattnos);
+
+ return true;
+}
+
+/*
+ * collect attnums from functional dependencies
+ *
+ * Walk through all statistics on the relation, and collect attnums covered
+ * by those with functional dependencies. We only look at columns specified
+ * when creating the statistics, not at columns actually referenced by the
+ * dependencies (which may only be a subset of the attributes).
+ */
+static Bitmapset*
+fdeps_collect_attnums(List *stats)
+{
+ ListCell *lc;
+ Bitmapset *attnums = NULL;
+
+ foreach (lc, stats)
+ {
+ int j;
+ MVStatisticInfo *info = (MVStatisticInfo *)lfirst(lc);
+
+ int2vector *stakeys = info->stakeys;
+
+ /* skip stats without functional dependencies built */
+ if (! info->deps_built)
+ continue;
+
+ for (j = 0; j < stakeys->dim1; j++)
+ attnums = bms_add_member(attnums, stakeys->values[j]);
+ }
+
+ return attnums;
+}
+
+/* transforms bitmapset into an array (index => value) */
+static int*
+make_idx_to_attnum_mapping(Bitmapset *attnums)
+{
+ int attidx = 0;
+ int attnum;
+
+ int *mapping = (int*)palloc0(bms_num_members(attnums) * sizeof(int));
+
+ attnum = -1;
+ while ((attnum = bms_next_member(attnums, attnum)) >= 0)
+ mapping[attidx++] = attnum;
+
+ Assert(attidx == bms_num_members(attnums));
+
+ return mapping;
+}
+
+/* transforms bitmapset into an array (value => index) */
+static int*
+make_attnum_to_idx_mapping(Bitmapset *attnums)
+{
+ int attidx = 0;
+ int attnum;
+ int maxattnum = -1;
+ int *mapping;
+
+ attnum = -1;
+ while ((attnum = bms_next_member(attnums, attnum)) >= 0)
+ maxattnum = attnum;
+
+ mapping = (int*)palloc0((maxattnum+1) * sizeof(int));
+
+ attnum = -1;
+ while ((attnum = bms_next_member(attnums, attnum)) >= 0)
+ mapping[attnum] = attidx++;
+
+ Assert(attidx == bms_num_members(attnums));
+
+ return mapping;
+}
+
+/* build adjacency matrix for the dependencies */
+static bool*
+build_adjacency_matrix(List *stats, Bitmapset *attnums,
+ int *idx_to_attnum, int *attnum_to_idx)
+{
+ ListCell *lc;
+ int natts = bms_num_members(attnums);
+ bool *matrix = (bool*)palloc0(natts * natts * sizeof(bool));
+
+ foreach (lc, stats)
+ {
+ int j;
+ MVStatisticInfo *stat = (MVStatisticInfo *)lfirst(lc);
+ MVDependencies dependencies = NULL;
+
+ /* skip stats without functional dependencies built */
+ if (! stat->deps_built)
+ continue;
+
+ /* fetch and deserialize dependencies */
+ dependencies = load_mv_dependencies(stat->mvoid);
+ if (dependencies == NULL)
+ {
+ elog(WARNING, "failed to deserialize func deps %d", stat->mvoid);
+ continue;
+ }
+
+ /* set matrix[a,b] to 'true' if 'a=>b' */
+ for (j = 0; j < dependencies->ndeps; j++)
+ {
+ int aidx = attnum_to_idx[dependencies->deps[j]->a];
+ int bidx = attnum_to_idx[dependencies->deps[j]->b];
+
+ /* a=> b */
+ matrix[aidx * natts + bidx] = true;
+ }
+ }
+
+ return matrix;
+}
+
+/*
+ * multiply the adjacency matrix
+ *
+ * By multiplying the adjacency matrix, we derive dependencies implied by those
+ * stored in the catalog (but possibly in several separate rows). We need to
+ * repeat the multiplication until no new dependencies are discovered. The
+ * maximum number of multiplications is equal to the number of attributes.
+ *
+ * This is based on modeling the functional dependencies as edges in a directed
+ * graph with attributes as vertices.
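+ *
+ * For example, if one statistics entry stores (a => b) and another stores
+ * (b => c), one round of multiplication derives the transitive (a => c):
+ *
+ *        a b c            a b c
+ *    a [ . T . ]      a [ . T T ]
+ *    b [ . . T ]  =>  b [ . . T ]
+ *    c [ . . . ]      c [ . . . ]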
+ */
+static void
+multiply_adjacency_matrix(bool *matrix, int natts)
+{
+ int i;
+
+ /* repeat the multiplication up to natts-times */
+ for (i = 0; i < natts; i++)
+ {
+ bool changed = false; /* no changes in this round */
+ int k, l, m;
+
+ /* k => l */
+ for (k = 0; k < natts; k++)
+ {
+ for (l = 0; l < natts; l++)
+ {
+ /* skip already known dependencies */
+ if (matrix[k * natts + l])
+ continue;
+
+ /*
+ * compute (k,l) in the multiplied matrix
+ *
+ * We don't really care about the exact value, just true/false,
+ * so terminate the loop once we get a hit. Also, this makes it
+ * safe to modify the matrix in-place.
+ */
+ for (m = 0; m < natts; m++)
+ {
+ if (matrix[k * natts + m] && matrix[m * natts + l])
+ {
+ matrix[k * natts + l] = true;
+ changed = true;
+ break;
+ }
+ }
+ }
+ }
+
+ /* no transitive dependency added in this round, so terminate */
+ if (! changed)
+ break;
+ }
+}
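
For illustration, here is a self-contained sketch of the same fixed-point
iteration, deriving the transitive dependency (a => c) from stored
(a => b) and (b => c). The explicit iteration bound is dropped, as the
loop terminates at the fixed point anyway:

#include <stdbool.h>
#include <stdio.h>

#define NATTS 3

int main(void)
{
    /* attributes 0=a, 1=b, 2=c; stored dependencies a=>b and b=>c */
    bool matrix[NATTS][NATTS] = {{false}};
    bool changed = true;
    int k, l, m;

    matrix[0][1] = true;    /* a => b */
    matrix[1][2] = true;    /* b => c */

    while (changed)
    {
        changed = false;
        for (k = 0; k < NATTS; k++)
            for (l = 0; l < NATTS; l++)
                for (m = 0; m < NATTS && !matrix[k][l]; m++)
                    if (matrix[k][m] && matrix[m][l])
                    {
                        matrix[k][l] = true;
                        changed = true;
                    }
    }

    printf("a => c derived: %d\n", matrix[0][2]);   /* prints 1 */
    return 0;
}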
+
+/*
+ * Reduce clauses using functional dependencies
+ *
+ * Walk through clauses and eliminate the redundant ones (implied by other
+ * clauses). This is done by first deriving a transitive closure of all the
+ * functional dependencies (by multiplying the adjacency matrix).
+ */
+static List*
+fdeps_reduce_clauses(List *clauses, Bitmapset *attnums, bool *matrix,
+ int *idx_to_attnum, int *attnum_to_idx, Index relid)
+{
+ int i;
+ ListCell *lc;
+ List *reduced_clauses = NIL;
+
+ int nmvclauses; /* size of the arrays */
+ bool *reduced;
+ AttrNumber *mvattnums;
+ Node **mvclauses;
+
+ int natts = bms_num_members(attnums);
+
+ /*
+ * Preallocate space for all clauses (the list only contains
+ * compatible clauses at this point). This makes it somewhat easier
+ * to access the stats / attnums randomly.
+ *
+ * XXX This assumes each clause references exactly one Var, so the
+ * arrays are sized accordingly - for functional dependencies
+ * this is safe, because it only works with Var=Const.
+ */
+ mvclauses = (Node**)palloc0(list_length(clauses) * sizeof(Node*));
+ mvattnums = (AttrNumber*)palloc0(list_length(clauses) * sizeof(AttrNumber));
+ reduced = (bool*)palloc0(list_length(clauses) * sizeof(bool));
+
+ /* fill the arrays */
+ nmvclauses = 0;
+ foreach (lc, clauses)
+ {
+ Node *clause = (Node *) lfirst(lc);
+ Bitmapset *varattnos = get_varattnos(clause, relid);
+
+ mvclauses[nmvclauses] = clause;
+ mvattnums[nmvclauses] = bms_singleton_member(varattnos);
+ nmvclauses++;
+ }
+
+ Assert(nmvclauses == list_length(clauses));
+
+ /* now try to reduce the clauses (using the dependencies) */
+ for (i = 0; i < nmvclauses; i++)
+ {
+ int j;
+
+ /* not covered by dependencies */
+ if (! bms_is_member(mvattnums[i], attnums))
+ continue;
+
+ /* this clause was already reduced, so let's skip it */
+ if (reduced[i])
+ continue;
+
+ /* walk the potentially 'implied' clauses */
+ for (j = 0; j < nmvclauses; j++)
+ {
+ int aidx, bidx;
+
+ /* not covered by dependencies */
+ if (! bms_is_member(mvattnums[j], attnums))
+ continue;
+
+ aidx = attnum_to_idx[mvattnums[i]];
+ bidx = attnum_to_idx[mvattnums[j]];
+
+ /* can't reduce the clause by itself, or if already reduced */
+ if ((i == j) || reduced[j])
+ continue;
+
+ /* mark the clause as reduced (if aidx => bidx) */
+ reduced[j] = matrix[aidx * natts + bidx];
+ }
+ }
+
+ /* now walk through the clauses, and keep only those not reduced */
+ for (i = 0; i < nmvclauses; i++)
+ if (! reduced[i])
+ reduced_clauses = lappend(reduced_clauses, mvclauses[i]);
+
+ pfree(reduced);
+ pfree(mvclauses);
+ pfree(mvattnums);
+
+ return reduced_clauses;
+}
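
To see why the 'reduced' bookkeeping handles cycles correctly, consider a
standalone sketch with two attributes, one equality clause on each, and
the cyclic dependencies (a => b) and (b => a) - exactly one of the two
clauses gets reduced:

#include <stdbool.h>
#include <stdio.h>

int main(void)
{
    /* attributes 0=a, 1=b; cyclic dependencies a=>b and b=>a */
    bool matrix[2][2] = {{false, true}, {true, false}};
    bool reduced[2] = {false, false};   /* one clause per attribute */
    int i, j;

    for (i = 0; i < 2; i++)
    {
        if (reduced[i])
            continue;       /* an eliminated clause can't reduce others */

        for (j = 0; j < 2; j++)
        {
            if ((i == j) || reduced[j])
                continue;
            if (matrix[i][j])
                reduced[j] = true;
        }
    }

    /* prints "0 1" - the clause on a survives, the clause on b is gone */
    printf("%d %d\n", reduced[0], reduced[1]);
    return 0;
}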
+
+/*
+ * filter clauses that are interesting for the reduction step
+ *
+ * Functional dependencies can only work with equality clauses with attributes
+ * covered by at least one of the statistics, so we walk through the clauses
+ * and copy the uninteresting ones directly to the result (reduced) clauses.
+ *
+ * That includes clauses that:
+ * (a) are not mv-compatible
+ * (b) reference more than a single attnum
+ * (c) use attnum not covered by functional dependencies
+ *
+ * The clauses interesting for the reduction step are copied to deps_clauses.
+ *
+ * root - planner root
+ * clauses - list of clauses (input)
+ * deps_attnums - attributes covered by dependencies
+ * reduced_clauses - resulting clauses (not subject to reduction step)
+ * deps_clauses - clauses to be processed by reduction
+ * relid - relid of the baserel
+ *
+ * The return value is a bitmap of attnums referenced by deps_clauses.
+ */
+static Bitmapset *
+fdeps_filter_clauses(PlannerInfo *root,
+ List *clauses, Bitmapset *deps_attnums,
+ List **reduced_clauses, List **deps_clauses,
+ Index relid)
+{
+ ListCell *lc;
+ Bitmapset *clause_attnums = NULL;
+
+ foreach (lc, clauses)
+ {
+ AttrNumber attnum;
+ Node *clause = (Node *) lfirst(lc);
+
+ if (! clause_is_mv_compatible(clause, relid, &attnum))
+
+ /* clause incompatible with functional dependencies */
+ *reduced_clauses = lappend(*reduced_clauses, clause);
+
+ else if (! bms_is_member(attnum, deps_attnums))
+
+ /* clause not covered by the dependencies */
+ *reduced_clauses = lappend(*reduced_clauses, clause);
+
+ else
+ {
+ *deps_clauses = lappend(*deps_clauses, clause);
+ clause_attnums = bms_add_member(clause_attnums, attnum);
+ }
+ }
+
+ return clause_attnums;
+}
+
+/*
+ * reduce list of equality clauses using soft functional dependencies
+ *
+ * We simply walk through the list of functional dependencies, and for each one
+ * check whether the dependency 'matches' the clauses, i.e. if there's a clause
+ * matching the condition. If yes, we attempt to remove all clauses matching
+ * the implied part of the dependency from the list.
+ *
+ * This only reduces equality clauses, and ignores all the other types. We might
+ * extend it to handle IS NULL clauses in the future.
+ *
+ * We also assume the equality clauses are 'compatible'. For example we can't
+ * identify when the clauses use a mismatching zip code and city name. In such a
+ * case the usual approach (product of selectivities) would produce a better
+ * estimate, although mostly by chance.
+ *
+ * The implementation needs to be careful about cyclic dependencies, e.g. when
+ *
+ * (a -> b) and (b -> a)
+ *
+ * at the same time, which means there's a 1:1 relationship between the columns.
+ * In this case we must not reduce clauses on both attributes at the same time.
+ *
+ * TODO Currently we only apply functional dependencies at the same level, but
+ * maybe we could transfer the clauses from upper levels to the subtrees?
+ * For example let's say we have (a->b) dependency, and condition
+ *
+ * (a=1) AND (b=2 OR c=3)
+ *
+ * Currently, we won't be able to perform any reduction, because we'll
+ * consider (a=1) and (b=2 OR c=3) independently. But maybe we could pass
+ * (a=1) into the other expression, and only check it against conditions
+ * of the functional dependencies?
+ *
+ * In this case we'd end up with
+ *
+ * (a=1)
+ *
+ * as we'd consider (b=2) implied thanks to the rule, rendering the whole
+ * OR clause valid.
+ */
+static List *
+clauselist_apply_dependencies(PlannerInfo *root, List *clauses,
+ Index relid, List *stats)
+{
+ List *reduced_clauses = NIL;
+
+ /*
+ * matrix of (natts x natts), 1 means x=>y
+ *
+ * This serves two purposes - first, it merges dependencies from all
+ * the statistics, second it makes generating all the transitive
+ * dependencies easier.
+ *
+ * We need to build this only for attributes from the dependencies,
+ * not for all attributes in the table.
+ *
+ * We can't do that only for attributes from the clauses, because we
+ * want to build transitive dependencies (including those going
+ * through attributes not listed in the stats).
+ *
+ * This only works for A=>B dependencies, not sure how to do that
+ * for complex dependencies.
+ */
+ bool *deps_matrix;
+ int deps_natts; /* size of the matrix */
+
+ /* mapping attnum <=> matrix index */
+ int *deps_idx_to_attnum;
+ int *deps_attnum_to_idx;
+
+ /* attnums in dependencies and clauses (and intersection) */
+ List *deps_clauses = NIL;
+ Bitmapset *deps_attnums = NULL;
+ Bitmapset *clause_attnums = NULL;
+ Bitmapset *intersect_attnums = NULL;
+
+ /*
+ * Is there at least one statistics with functional dependencies?
+ * If not, return the original clauses right away.
+ *
+ * XXX Isn't this pointless, thanks to exactly the same check in
+ * clauselist_selectivity()? Can we trigger the condition here?
+ */
+ if (! has_stats(stats, MV_CLAUSE_TYPE_FDEP))
+ return clauses;
+
+ /*
+ * Build the dependency matrix, i.e. attribute adjacency matrix,
+ * where 1 means (a=>b). Once we have the adjacency matrix, we'll
+ * multiply it by itself, to get transitive dependencies.
+ *
+ * Note: This is pretty much transitive closure from graph theory.
+ *
+ * First, let's see what attributes are covered by functional
+ * dependencies (sides of the adjacency matrix), and also the maximum
+ * attnum (which determines the size of the attnum => index mapping).
+ */
+ deps_attnums = fdeps_collect_attnums(stats);
+
+ /*
+ * Walk through the clauses - clauses that are (one of)
+ *
+ * (a) not mv-compatible
+ * (b) are using more than a single attnum
+ * (c) using attnum not covered by functional dependencies
+ *
+ * may be copied directly to the result. The interesting clauses are
+ * kept in 'deps_clauses' and will be processed later.
+ */
+ clause_attnums = fdeps_filter_clauses(root, clauses, deps_attnums,
+ &reduced_clauses, &deps_clauses, relid);
+
+ /*
+ * We need at least two clauses referencing two different attributes
+ * to perform the reduction.
+ */
+ if ((list_length(deps_clauses) < 2) || (bms_num_members(clause_attnums) < 2))
+ {
+ bms_free(clause_attnums);
+ list_free(reduced_clauses);
+ list_free(deps_clauses);
+
+ return clauses;
+ }
+
+ /*
+ * We need at least two matching attributes in the clauses and
+ * dependencies, otherwise we can't really reduce anything.
+ */
+ intersect_attnums = bms_intersect(clause_attnums, deps_attnums);
+ if (bms_num_members(intersect_attnums) < 2)
+ {
+ bms_free(clause_attnums);
+ bms_free(deps_attnums);
+ bms_free(intersect_attnums);
+
+ list_free(deps_clauses);
+ list_free(reduced_clauses);
+
+ return clauses;
+ }
+
+ /*
+ * Build mapping between matrix indexes and attnums, and then the
+ * adjacency matrix itself.
+ */
+ deps_idx_to_attnum = make_idx_to_attnum_mapping(deps_attnums);
+ deps_attnum_to_idx = make_attnum_to_idx_mapping(deps_attnums);
+
+ /* build the adjacency matrix */
+ deps_matrix = build_adjacency_matrix(stats, deps_attnums,
+ deps_idx_to_attnum,
+ deps_attnum_to_idx);
+
+ deps_natts = bms_num_members(deps_attnums);
+
+ /*
+ * Multiply the matrix N-times (N = size of the matrix), so that we
+ * get all the transitive dependencies. That makes the next step
+ * much easier and faster.
+ *
+ * This is essentially an adjacency matrix from graph theory, and
+ * by multiplying it we get transitive edges. We don't really care
+ * about the exact number (number of paths between vertices) though,
+ * so we can do the multiplication in-place (we don't care whether
+ * we found the dependency in this round or in the previous one).
+ *
+ * Track how many new dependencies were added, and stop when 0, but
+ * we can't multiply more than N-times (longest path in the graph).
+ */
+ multiply_adjacency_matrix(deps_matrix, deps_natts);
+
+ /*
+ * Walk through the clauses, and see which other clauses we may
+ * reduce. The matrix contains all transitive dependencies, which
+ * makes this very fast.
+ *
+ * We have to be careful not to reduce the clause using itself, or
+ * reducing all clauses forming a cycle (so we have to skip already
+ * eliminated clauses).
+ *
+ * I'm not sure whether this guarantees finding the best solution,
+ * i.e. reducing the most clauses, but it probably does (thanks to
+ * having all the transitive dependencies).
+ */
+ deps_clauses = fdeps_reduce_clauses(deps_clauses,
+ deps_attnums, deps_matrix,
+ deps_idx_to_attnum,
+ deps_attnum_to_idx, relid);
+
+ /* join the two lists of clauses */
+ reduced_clauses = list_union(reduced_clauses, deps_clauses);
+
+ pfree(deps_matrix);
+ pfree(deps_idx_to_attnum);
+ pfree(deps_attnum_to_idx);
+
+ bms_free(deps_attnums);
+ bms_free(clause_attnums);
+ bms_free(intersect_attnums);
+
+ return reduced_clauses;
+}
+
+/*
+ * Check whether there are stats with at least one of the requested types.
+ */
+static bool
+has_stats(List *stats, int type)
+{
+ ListCell *s;
+
+ foreach (s, stats)
+ {
+ MVStatisticInfo *stat = (MVStatisticInfo *)lfirst(s);
+
+ if ((type & MV_CLAUSE_TYPE_FDEP) && stat->deps_built)
+ return true;
+ }
+
+ return false;
+}
+
+/*
+ * Look up statistics for a given baserel.
+ */
+static List *
+find_stats(PlannerInfo *root, Index relid)
+{
+ Assert(root->simple_rel_array[relid] != NULL);
+
+ return root->simple_rel_array[relid]->mvstatlist;
+}
diff --git a/src/backend/utils/mvstats/README.stats b/src/backend/utils/mvstats/README.stats
new file mode 100644
index 0000000..a38ea7b
--- /dev/null
+++ b/src/backend/utils/mvstats/README.stats
@@ -0,0 +1,36 @@
+Multivariate statistics
+=======================
+
+When estimating various quantities (e.g. condition selectivities) the default
+approach relies on the assumption of independence. In practice that's often
+not true, resulting in estimation errors.
+
+Multivariate stats track different types of dependencies between the columns,
+hopefully improving the estimates.
+
+Currently we only have one kind of multivariate statistics - soft functional
+dependencies, and we use it to improve estimates of equality clauses. See
+README.dependencies for details.
+
+
+Selectivity estimation
+----------------------
+
+When estimating selectivity, we aim to achieve several things:
+
+ (a) maximize the estimate accuracy
+
+ (b) minimize the overhead, especially when no suitable multivariate stats
+ exist (so if you are not using multivariate stats, there's no overhead)
+
+Thus clauselist_selectivity() performs several inexpensive checks first, before
+attempting the more expensive estimation:
+
+ (1) check if there are multivariate stats on the relation
+
+ (2) check that there are at least two attributes referenced by clauses compatible
+ with multivariate statistics (equality clauses for func. dependencies)
+
+ (3) perform reduction of equality clauses using func. dependencies
+
+ (4) estimate the reduced list of clauses using regular statistics
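
To illustrate steps (3) and (4) with hypothetical numbers, take the zip
code / city example, where the zip code functionally determines the city:

    WHERE zip = '12345' AND city = 'Prague'

With per-column selectivities P(zip = '12345') = 0.001 and
P(city = 'Prague') = 0.01, the independence assumption yields
0.001 * 0.01 = 0.00001, i.e. a 100x underestimate. Step (3) drops the
implied city clause, so step (4) estimates just P(zip = '12345') = 0.001.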
diff --git a/src/backend/utils/mvstats/common.c b/src/backend/utils/mvstats/common.c
index a755c49..bd200bc 100644
--- a/src/backend/utils/mvstats/common.c
+++ b/src/backend/utils/mvstats/common.c
@@ -84,7 +84,8 @@ build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
/*
* Analyze functional dependencies of columns.
*/
- deps = build_mv_dependencies(numrows, rows, attrs, stats);
+ if (stat->deps_enabled)
+ deps = build_mv_dependencies(numrows, rows, attrs, stats);
/* store the histogram / MCV list in the catalog */
update_mv_stats(stat->mvoid, deps, attrs);
@@ -163,6 +164,7 @@ list_mv_stats(Oid relid)
info->mvoid = HeapTupleGetOid(htup);
info->stakeys = buildint2vector(stats->stakeys.values, stats->stakeys.dim1);
+ info->deps_enabled = stats->deps_enabled;
info->deps_built = stats->deps_built;
result = lappend(result, info);
@@ -274,6 +276,7 @@ compare_scalars_partition(const void *a, const void *b, void *arg)
return ApplySortComparator(da, false, db, false, ssup);
}
+
/* initialize multi-dimensional sort */
MultiSortSupport
multi_sort_init(int ndims)
diff --git a/src/backend/utils/mvstats/dependencies.c b/src/backend/utils/mvstats/dependencies.c
index 2a064a0..c80ba33 100644
--- a/src/backend/utils/mvstats/dependencies.c
+++ b/src/backend/utils/mvstats/dependencies.c
@@ -435,3 +435,27 @@ pg_mv_stats_dependencies_show(PG_FUNCTION_ARGS)
PG_RETURN_TEXT_P(cstring_to_text(result));
}
+
+MVDependencies
+load_mv_dependencies(Oid mvoid)
+{
+ bool isnull = false;
+ Datum deps;
+
+ /* Fetch the pg_mv_statistic tuple for the requested statistics OID. */
+ HeapTuple htup = SearchSysCache1(MVSTATOID, ObjectIdGetDatum(mvoid));
+
+#ifdef USE_ASSERT_CHECKING
+ Form_pg_mv_statistic mvstat = (Form_pg_mv_statistic) GETSTRUCT(htup);
+ Assert(mvstat->deps_enabled && mvstat->deps_built);
+#endif
+
+ deps = SysCacheGetAttr(MVSTATOID, htup,
+ Anum_pg_mv_statistic_stadeps, &isnull);
+
+ Assert(!isnull);
+
+ ReleaseSysCache(htup);
+
+ return deserialize_mv_dependencies(DatumGetByteaP(deps));
+}
diff --git a/src/include/utils/mvstats.h b/src/include/utils/mvstats.h
index 7ebd961..cc43a79 100644
--- a/src/include/utils/mvstats.h
+++ b/src/include/utils/mvstats.h
@@ -17,12 +17,20 @@
#include "fmgr.h"
#include "commands/vacuum.h"
+/*
+ * Degree of how much MCV item / histogram bucket matches a clause.
+ * This is then considered when computing the selectivity.
+ */
+#define MVSTATS_MATCH_NONE 0 /* no match at all */
+#define MVSTATS_MATCH_PARTIAL 1 /* partial match */
+#define MVSTATS_MATCH_FULL 2 /* full match */
#define MVSTATS_MAX_DIMENSIONS 8 /* max number of attributes */
-/* An associative rule, tracking [a => b] dependency.
- *
- * TODO Make this work with multiple columns on both sides.
+
+/*
+ * Functional dependencies, tracking column-level relationships (values
+ * in one column determine values in another one).
*/
typedef struct MVDependencyData {
int16 a;
@@ -48,6 +56,8 @@ typedef MVDependenciesData* MVDependencies;
* stats specified using flags (or something like that).
*/
+MVDependencies load_mv_dependencies(Oid mvoid);
+
bytea * serialize_mv_dependencies(MVDependencies dependencies);
/* deserialization of stats (serialization is private to analyze) */
diff --git a/src/test/regress/expected/mv_dependencies.out b/src/test/regress/expected/mv_dependencies.out
new file mode 100644
index 0000000..e759997
--- /dev/null
+++ b/src/test/regress/expected/mv_dependencies.out
@@ -0,0 +1,172 @@
+-- data type passed by value
+CREATE TABLE functional_dependencies (
+ a INT,
+ b INT,
+ c INT
+);
+-- unknown column
+CREATE STATISTICS s1 ON functional_dependencies (unknown_column) WITH (dependencies);
+ERROR: column "unknown_column" referenced in statistics does not exist
+-- single column
+CREATE STATISTICS s1 ON functional_dependencies (a) WITH (dependencies);
+ERROR: multivariate stats require 2 or more columns
+-- single column, duplicated
+CREATE STATISTICS s1 ON functional_dependencies (a,a) WITH (dependencies);
+ERROR: duplicate column name in statistics definition
+-- two columns, one duplicated
+CREATE STATISTICS s1 ON functional_dependencies (a, a, b) WITH (dependencies);
+ERROR: duplicate column name in statistics definition
+-- unknown option
+CREATE STATISTICS s1 ON functional_dependencies (a, b, c) WITH (unknown_option);
+ERROR: unrecognized STATISTICS option "unknown_option"
+-- correct command
+CREATE STATISTICS s1 ON functional_dependencies (a, b, c) WITH (dependencies);
+-- random data (no functional dependencies)
+INSERT INTO functional_dependencies
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+ANALYZE functional_dependencies;
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+ deps_enabled | deps_built | pg_mv_stats_dependencies_show
+--------------+------------+-------------------------------
+ t | f |
+(1 row)
+
+TRUNCATE functional_dependencies;
+-- a => b, a => c, b => c
+INSERT INTO functional_dependencies
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE functional_dependencies;
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+ deps_enabled | deps_built | pg_mv_stats_dependencies_show
+--------------+------------+-------------------------------
+ t | t | 1 => 2, 1 => 3, 2 => 3
+(1 row)
+
+TRUNCATE functional_dependencies;
+-- a => b, a => c
+INSERT INTO functional_dependencies
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE functional_dependencies;
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+ deps_enabled | deps_built | pg_mv_stats_dependencies_show
+--------------+------------+-------------------------------
+ t | t | 1 => 2, 1 => 3
+(1 row)
+
+TRUNCATE functional_dependencies;
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO functional_dependencies
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX fdeps_idx ON functional_dependencies (a, b);
+ANALYZE functional_dependencies;
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+ deps_enabled | deps_built | pg_mv_stats_dependencies_show
+--------------+------------+-------------------------------
+ t | t | 1 => 2, 1 => 3, 2 => 3
+(1 row)
+
+EXPLAIN (COSTS off)
+ SELECT * FROM functional_dependencies WHERE a = 10 AND b = 5;
+ QUERY PLAN
+---------------------------------------------
+ Bitmap Heap Scan on functional_dependencies
+ Recheck Cond: ((a = 10) AND (b = 5))
+ -> Bitmap Index Scan on fdeps_idx
+ Index Cond: ((a = 10) AND (b = 5))
+(4 rows)
+
+DROP TABLE functional_dependencies;
+-- varlena type (text)
+CREATE TABLE functional_dependencies (
+ a TEXT,
+ b TEXT,
+ c TEXT
+);
+CREATE STATISTICS s2 ON functional_dependencies (a, b, c) WITH (dependencies);
+-- random data (no functional dependencies)
+INSERT INTO functional_dependencies
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+ANALYZE functional_dependencies;
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+ deps_enabled | deps_built | pg_mv_stats_dependencies_show
+--------------+------------+-------------------------------
+ t | f |
+(1 row)
+
+TRUNCATE functional_dependencies;
+-- a => b, a => c, b => c
+INSERT INTO functional_dependencies
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE functional_dependencies;
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+ deps_enabled | deps_built | pg_mv_stats_dependencies_show
+--------------+------------+-------------------------------
+ t | t | 1 => 2, 1 => 3, 2 => 3
+(1 row)
+
+TRUNCATE functional_dependencies;
+-- a => b, a => c
+INSERT INTO functional_dependencies
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE functional_dependencies;
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+ deps_enabled | deps_built | pg_mv_stats_dependencies_show
+--------------+------------+-------------------------------
+ t | t | 1 => 2, 1 => 3
+(1 row)
+
+TRUNCATE functional_dependencies;
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO functional_dependencies
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX fdeps_idx ON functional_dependencies (a, b);
+ANALYZE functional_dependencies;
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+ deps_enabled | deps_built | pg_mv_stats_dependencies_show
+--------------+------------+-------------------------------
+ t | t | 1 => 2, 1 => 3, 2 => 3
+(1 row)
+
+EXPLAIN (COSTS off)
+ SELECT * FROM functional_dependencies WHERE a = '10' AND b = '5';
+ QUERY PLAN
+------------------------------------------------------------
+ Bitmap Heap Scan on functional_dependencies
+ Recheck Cond: ((a = '10'::text) AND (b = '5'::text))
+ -> Bitmap Index Scan on fdeps_idx
+ Index Cond: ((a = '10'::text) AND (b = '5'::text))
+(4 rows)
+
+DROP TABLE functional_dependencies;
+-- NULL values (mix of int and text columns)
+CREATE TABLE functional_dependencies (
+ a INT,
+ b TEXT,
+ c INT,
+ d TEXT
+);
+CREATE STATISTICS s3 ON functional_dependencies (a, b, c, d) WITH (dependencies);
+INSERT INTO functional_dependencies
+ SELECT
+ mod(i, 100),
+ (CASE WHEN mod(i, 200) = 0 THEN NULL ELSE mod(i,200) END),
+ mod(i, 400),
+ (CASE WHEN mod(i, 300) = 0 THEN NULL ELSE mod(i,600) END)
+ FROM generate_series(1,10000) s(i);
+ANALYZE functional_dependencies;
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+ deps_enabled | deps_built | pg_mv_stats_dependencies_show
+--------------+------------+----------------------------------------
+ t | t | 2 => 1, 3 => 1, 3 => 2, 4 => 1, 4 => 2
+(1 row)
+
+DROP TABLE functional_dependencies;
diff --git a/src/test/regress/parallel_schedule b/src/test/regress/parallel_schedule
index bec0316..4f2ffb8 100644
--- a/src/test/regress/parallel_schedule
+++ b/src/test/regress/parallel_schedule
@@ -110,3 +110,6 @@ test: event_trigger
# run stats by itself because its delay may be insufficient under heavy load
test: stats
+
+# run tests of multivariate stats
+test: mv_dependencies
diff --git a/src/test/regress/serial_schedule b/src/test/regress/serial_schedule
index 7e9b319..097a04f 100644
--- a/src/test/regress/serial_schedule
+++ b/src/test/regress/serial_schedule
@@ -162,3 +162,4 @@ test: with
test: xml
test: event_trigger
test: stats
+test: mv_dependencies
diff --git a/src/test/regress/sql/mv_dependencies.sql b/src/test/regress/sql/mv_dependencies.sql
new file mode 100644
index 0000000..48dea4d
--- /dev/null
+++ b/src/test/regress/sql/mv_dependencies.sql
@@ -0,0 +1,150 @@
+-- data type passed by value
+CREATE TABLE functional_dependencies (
+ a INT,
+ b INT,
+ c INT
+);
+
+-- unknown column
+CREATE STATISTICS s1 ON functional_dependencies (unknown_column) WITH (dependencies);
+
+-- single column
+CREATE STATISTICS s1 ON functional_dependencies (a) WITH (dependencies);
+
+-- single column, duplicated
+CREATE STATISTICS s1 ON functional_dependencies (a,a) WITH (dependencies);
+
+-- two columns, one duplicated
+CREATE STATISTICS s1 ON functional_dependencies (a, a, b) WITH (dependencies);
+
+-- unknown option
+CREATE STATISTICS s1 ON functional_dependencies (a, b, c) WITH (unknown_option);
+
+-- correct command
+CREATE STATISTICS s1 ON functional_dependencies (a, b, c) WITH (dependencies);
+
+-- random data (no functional dependencies)
+INSERT INTO functional_dependencies
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+
+ANALYZE functional_dependencies;
+
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+
+TRUNCATE functional_dependencies;
+
+-- a => b, a => c, b => c
+INSERT INTO functional_dependencies
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+
+ANALYZE functional_dependencies;
+
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+
+TRUNCATE functional_dependencies;
+
+-- a => b, a => c
+INSERT INTO functional_dependencies
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE functional_dependencies;
+
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+
+TRUNCATE functional_dependencies;
+
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO functional_dependencies
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX fdeps_idx ON functional_dependencies (a, b);
+ANALYZE functional_dependencies;
+
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+
+EXPLAIN (COSTS off)
+ SELECT * FROM functional_dependencies WHERE a = 10 AND b = 5;
+
+DROP TABLE functional_dependencies;
+
+-- varlena type (text)
+CREATE TABLE functional_dependencies (
+ a TEXT,
+ b TEXT,
+ c TEXT
+);
+
+CREATE STATISTICS s2 ON functional_dependencies (a, b, c) WITH (dependencies);
+
+-- random data (no functional dependencies)
+INSERT INTO functional_dependencies
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+
+ANALYZE functional_dependencies;
+
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+
+TRUNCATE functional_dependencies;
+
+-- a => b, a => c, b => c
+INSERT INTO functional_dependencies
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+
+ANALYZE functional_dependencies;
+
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+
+TRUNCATE functional_dependencies;
+
+-- a => b, a => c
+INSERT INTO functional_dependencies
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE functional_dependencies;
+
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+
+TRUNCATE functional_dependencies;
+
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO functional_dependencies
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX fdeps_idx ON functional_dependencies (a, b);
+ANALYZE functional_dependencies;
+
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+
+EXPLAIN (COSTS off)
+ SELECT * FROM functional_dependencies WHERE a = '10' AND b = '5';
+
+DROP TABLE functional_dependencies;
+
+-- NULL values (mix of int and text columns)
+CREATE TABLE functional_dependencies (
+ a INT,
+ b TEXT,
+ c INT,
+ d TEXT
+);
+
+CREATE STATISTICS s3 ON functional_dependencies (a, b, c, d) WITH (dependencies);
+
+INSERT INTO functional_dependencies
+ SELECT
+ mod(i, 100),
+ (CASE WHEN mod(i, 200) = 0 THEN NULL ELSE mod(i,200) END),
+ mod(i, 400),
+ (CASE WHEN mod(i, 300) = 0 THEN NULL ELSE mod(i,600) END)
+ FROM generate_series(1,10000) s(i);
+
+ANALYZE functional_dependencies;
+
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+
+DROP TABLE functional_dependencies;
--
2.1.0
Attachment: 0004-multivariate-MCV-lists.patch
From c15fa03dbc0be00f80f12545b1468a8ca55a57f5 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@pgaddict.com>
Date: Mon, 6 Apr 2015 16:52:15 +0200
Subject: [PATCH 4/9] multivariate MCV lists
- extends the pg_mv_statistic catalog (add 'mcv' fields)
- building the MCV lists during ANALYZE
- simple estimation while planning the queries
Includes regression tests, largely mirroring those for functional
dependencies.
---
doc/src/sgml/ref/create_statistics.sgml | 18 +
src/backend/catalog/system_views.sql | 4 +-
src/backend/commands/statscmds.c | 45 +-
src/backend/nodes/outfuncs.c | 2 +
src/backend/optimizer/path/clausesel.c | 829 ++++++++++++++++++++++-
src/backend/optimizer/util/plancat.c | 4 +-
src/backend/utils/mvstats/Makefile | 2 +-
src/backend/utils/mvstats/README.mcv | 137 ++++
src/backend/utils/mvstats/README.stats | 89 ++-
src/backend/utils/mvstats/common.c | 104 ++-
src/backend/utils/mvstats/common.h | 11 +-
src/backend/utils/mvstats/mcv.c | 1094 +++++++++++++++++++++++++++++++
src/bin/psql/describe.c | 25 +-
src/include/catalog/pg_mv_statistic.h | 18 +-
src/include/catalog/pg_proc.h | 4 +
src/include/nodes/relation.h | 2 +
src/include/utils/mvstats.h | 69 +-
src/test/regress/expected/mv_mcv.out | 207 ++++++
src/test/regress/expected/rules.out | 4 +-
src/test/regress/parallel_schedule | 2 +-
src/test/regress/serial_schedule | 1 +
src/test/regress/sql/mv_mcv.sql | 178 +++++
22 files changed, 2776 insertions(+), 73 deletions(-)
create mode 100644 src/backend/utils/mvstats/README.mcv
create mode 100644 src/backend/utils/mvstats/mcv.c
create mode 100644 src/test/regress/expected/mv_mcv.out
create mode 100644 src/test/regress/sql/mv_mcv.sql
diff --git a/doc/src/sgml/ref/create_statistics.sgml b/doc/src/sgml/ref/create_statistics.sgml
index a86eae3..193e4b0 100644
--- a/doc/src/sgml/ref/create_statistics.sgml
+++ b/doc/src/sgml/ref/create_statistics.sgml
@@ -132,6 +132,24 @@ CREATE STATISTICS [ IF NOT EXISTS ] <replaceable class="PARAMETER">statistics_na
</listitem>
</varlistentry>
+ <varlistentry>
+ <term><literal>max_mcv_items</> (<type>integer</>)</term>
+ <listitem>
+ <para>
+ Maximum number of MCV list items.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><literal>mcv</> (<type>boolean</>)</term>
+ <listitem>
+ <para>
+ Enables MCV list for the statistics.
+ </para>
+ </listitem>
+ </varlistentry>
+
</variablelist>
</refsect2>
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index b8a264e..2d570ee 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -165,7 +165,9 @@ CREATE VIEW pg_mv_stats AS
S.staname AS staname,
S.stakeys AS attnums,
length(S.stadeps) as depsbytes,
- pg_mv_stats_dependencies_info(S.stadeps) as depsinfo
+ pg_mv_stats_dependencies_info(S.stadeps) as depsinfo,
+ length(S.stamcv) AS mcvbytes,
+ pg_mv_stats_mcvlist_info(S.stamcv) AS mcvinfo
FROM (pg_mv_statistic S JOIN pg_class C ON (C.oid = S.starelid))
LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace);
diff --git a/src/backend/commands/statscmds.c b/src/backend/commands/statscmds.c
index 84a8b13..90bfaed 100644
--- a/src/backend/commands/statscmds.c
+++ b/src/backend/commands/statscmds.c
@@ -136,7 +136,13 @@ CreateStatistics(CreateStatsStmt *stmt)
ObjectAddress parentobject, childobject;
/* by default build nothing */
- bool build_dependencies = false;
+ bool build_dependencies = false,
+ build_mcv = false;
+
+ int32 max_mcv_items = -1;
+
+ /* options required because of other options */
+ bool require_mcv = false;
Assert(IsA(stmt, CreateStatsStmt));
@@ -212,6 +218,29 @@ CreateStatistics(CreateStatsStmt *stmt)
if (strcmp(opt->defname, "dependencies") == 0)
build_dependencies = defGetBoolean(opt);
+ else if (strcmp(opt->defname, "mcv") == 0)
+ build_mcv = defGetBoolean(opt);
+ else if (strcmp(opt->defname, "max_mcv_items") == 0)
+ {
+ max_mcv_items = defGetInt32(opt);
+
+ /* this option requires 'mcv' to be enabled */
+ require_mcv = true;
+
+ /* sanity check */
+ if (max_mcv_items < MVSTAT_MCVLIST_MIN_ITEMS)
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("max number of MCV items must be at least %d",
+ MVSTAT_MCVLIST_MIN_ITEMS)));
+
+ else if (max_mcv_items > MVSTAT_MCVLIST_MAX_ITEMS)
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("max number of MCV items is %d",
+ MVSTAT_MCVLIST_MAX_ITEMS)));
+
+ }
else
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
@@ -220,10 +249,16 @@ CreateStatistics(CreateStatsStmt *stmt)
}
/* check that at least some statistics were requested */
- if (! build_dependencies)
+ if (! (build_dependencies || build_mcv))
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("no statistics type (dependencies, mcv) was requested")));
+
+ /* now do some checking of the options */
+ if (require_mcv && (! build_mcv))
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
- errmsg("no statistics type (dependencies) was requested")));
+ errmsg("option 'mcv' is required by other options(s)")));
/* sort the attnums and build int2vector */
qsort(attnums, numcols, sizeof(int16), compare_int16);
@@ -243,8 +278,12 @@ CreateStatistics(CreateStatsStmt *stmt)
values[Anum_pg_mv_statistic_stakeys -1] = PointerGetDatum(stakeys);
values[Anum_pg_mv_statistic_deps_enabled -1] = BoolGetDatum(build_dependencies);
+ values[Anum_pg_mv_statistic_mcv_enabled -1] = BoolGetDatum(build_mcv);
+
+ values[Anum_pg_mv_statistic_mcv_max_items -1] = Int32GetDatum(max_mcv_items);
nulls[Anum_pg_mv_statistic_stadeps -1] = true;
+ nulls[Anum_pg_mv_statistic_stamcv -1] = true;
/* insert the tuple into pg_mv_statistic */
mvstatrel = heap_open(MvStatisticRelationId, RowExclusiveLock);
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index 474d2c7..e3983fd 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -1977,9 +1977,11 @@ _outMVStatisticInfo(StringInfo str, const MVStatisticInfo *node)
/* enabled statistics */
WRITE_BOOL_FIELD(deps_enabled);
+ WRITE_BOOL_FIELD(mcv_enabled);
/* built/available statistics */
WRITE_BOOL_FIELD(deps_built);
+ WRITE_BOOL_FIELD(mcv_built);
}
static void
diff --git a/src/backend/optimizer/path/clausesel.c b/src/backend/optimizer/path/clausesel.c
index 80708fe..977f88e 100644
--- a/src/backend/optimizer/path/clausesel.c
+++ b/src/backend/optimizer/path/clausesel.c
@@ -15,6 +15,7 @@
#include "postgres.h"
#include "access/sysattr.h"
+#include "catalog/pg_collation.h"
#include "catalog/pg_operator.h"
#include "nodes/makefuncs.h"
#include "optimizer/clauses.h"
@@ -47,23 +48,51 @@ static void addRangeClause(RangeQueryClause **rqlist, Node *clause,
bool varonleft, bool isLTsel, Selectivity s2);
#define MV_CLAUSE_TYPE_FDEP 0x01
+#define MV_CLAUSE_TYPE_MCV 0x02
-static bool clause_is_mv_compatible(Node *clause, Index relid, AttrNumber *attnum);
+static bool clause_is_mv_compatible(Node *clause, Index relid, Bitmapset **attnums,
+ int type);
-static Bitmapset *collect_mv_attnums(List *clauses, Index relid);
+static Bitmapset *collect_mv_attnums(List *clauses, Index relid, int type);
-static int count_mv_attnums(List *clauses, Index relid);
+static int count_mv_attnums(List *clauses, Index relid, int type);
static int count_varnos(List *clauses, Index *relid);
static List *clauselist_apply_dependencies(PlannerInfo *root, List *clauses,
Index relid, List *stats);
+static MVStatisticInfo *choose_mv_statistics(List *mvstats, Bitmapset *attnums);
+
+static List *clauselist_mv_split(PlannerInfo *root, Index relid,
+ List *clauses, List **mvclauses,
+ MVStatisticInfo *mvstats, int types);
+
+static Selectivity clauselist_mv_selectivity(PlannerInfo *root,
+ List *clauses, MVStatisticInfo *mvstats);
+
+static Selectivity clauselist_mv_selectivity_mcvlist(PlannerInfo *root,
+ List *clauses, MVStatisticInfo *mvstats,
+ bool *fullmatch, Selectivity *lowsel);
+
+static int update_match_bitmap_mcvlist(PlannerInfo *root, List *clauses,
+ int2vector *stakeys, MCVList mcvlist,
+ int nmatches, char * matches,
+ Selectivity *lowsel, bool *fullmatch,
+ bool is_or);
+
static bool has_stats(List *stats, int type);
static List * find_stats(PlannerInfo *root, Index relid);
+/* used for merging bitmaps - AND (min), OR (max) */
+#define MAX(x, y) (((x) > (y)) ? (x) : (y))
+#define MIN(x, y) (((x) < (y)) ? (x) : (y))
+
+#define UPDATE_RESULT(m,r,isor) \
+ (m) = (isor) ? (MAX(m,r)) : (MIN(m,r))
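
To illustrate the intended semantics of UPDATE_RESULT, here is a
standalone sketch reusing the same definitions - an AND-ed list starts at
'full match' and each new clause result can only weaken it, while an
OR-ed list starts at 'no match' and can only get stronger:

#include <stdio.h>

#define MVSTATS_MATCH_NONE    0
#define MVSTATS_MATCH_PARTIAL 1
#define MVSTATS_MATCH_FULL    2

#define MAX(x, y) (((x) > (y)) ? (x) : (y))
#define MIN(x, y) (((x) < (y)) ? (x) : (y))

#define UPDATE_RESULT(m,r,isor) \
    (m) = (isor) ? (MAX(m,r)) : (MIN(m,r))

int main(void)
{
    char and_match = MVSTATS_MATCH_FULL;
    char or_match = MVSTATS_MATCH_NONE;

    /* merge a partial match into both running results */
    UPDATE_RESULT(and_match, MVSTATS_MATCH_PARTIAL, false);  /* min */
    UPDATE_RESULT(or_match, MVSTATS_MATCH_PARTIAL, true);    /* max */

    printf("AND: %d, OR: %d\n", and_match, or_match);  /* "AND: 1, OR: 1" */
    return 0;
}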
+
/****************************************************************************
* ROUTINES TO COMPUTE SELECTIVITIES
****************************************************************************/
@@ -89,11 +118,13 @@ static List * find_stats(PlannerInfo *root, Index relid);
* to verify that suitable multivariate statistics exist.
*
* If we identify such multivariate statistics apply, we try to apply them.
- * Currently we only have (soft) functional dependencies, so we try to reduce
- * the list of clauses.
*
- * Then we remove the clauses estimated using multivariate stats, and process
- * the rest of the clauses using the regular per-column stats.
+ * First we try to reduce the list of clauses by applying (soft) functional
+ * dependencies, and then we try to estimate the selectivity of the reduced
+ * list of clauses using the multivariate MCV list.
+ *
+ * Finally we remove the portion of clauses estimated using multivariate stats,
+ * and process the rest of the clauses using the regular per-column stats.
*
* Currently, the only extra smarts we have is to recognize "range queries",
* such as "x > 34 AND x < 42". Clauses are recognized as possible range
@@ -170,12 +201,46 @@ clauselist_selectivity(PlannerInfo *root,
* that need to be estimated by other types of stats (MCV, histograms etc).
*/
if (has_stats(stats, MV_CLAUSE_TYPE_FDEP) &&
- (count_mv_attnums(clauses, relid) >= 2))
+ (count_mv_attnums(clauses, relid, MV_CLAUSE_TYPE_FDEP) >= 2))
{
clauses = clauselist_apply_dependencies(root, clauses, relid, stats);
}
/*
+ * Check that there are statistics with MCV list or histogram, and also the
+ * number of attributes covered by these types of statistics.
+ *
+ * If there are no such stats or not enough attributes, don't waste time
+ * with the multivariate code and simply skip to estimation using the
+ * regular per-column stats.
+ */
+ if (has_stats(stats, MV_CLAUSE_TYPE_MCV) &&
+ (count_mv_attnums(clauses, relid, MV_CLAUSE_TYPE_MCV) >= 2))
+ {
+ /* collect attributes from the compatible conditions */
+ Bitmapset *mvattnums = collect_mv_attnums(clauses, relid, MV_CLAUSE_TYPE_MCV);
+
+ /* and search for the statistic covering the most attributes */
+ MVStatisticInfo *mvstat = choose_mv_statistics(stats, mvattnums);
+
+ if (mvstat != NULL) /* we have a matching stats */
+ {
+ /* clauses compatible with multi-variate stats */
+ List *mvclauses = NIL;
+
+ /* split the clauselist into regular and mv-clauses */
+ clauses = clauselist_mv_split(root, relid, clauses, &mvclauses,
+ mvstat, MV_CLAUSE_TYPE_MCV);
+
+ /* we've chosen the statistics to match the clauses */
+ Assert(mvclauses != NIL);
+
+ /* compute the multivariate stats */
+ s1 *= clauselist_mv_selectivity(root, mvclauses, mvstat);
+ }
+ }
+
+ /*
* Initial scan over clauses. Anything that doesn't look like a potential
* rangequery clause gets multiplied into s1 and forgotten. Anything that
* does gets inserted into an rqlist entry.
@@ -832,6 +897,69 @@ clause_selectivity(PlannerInfo *root,
return s1;
}
+
+/*
+ * estimate selectivity of clauses using multivariate statistic
+ *
+ * Perform estimation of the clauses using a MCV list.
+ *
+ * This assumes all the clauses are compatible with the selected statistics
+ * (e.g. only reference columns covered by the statistics, use supported
+ * operator, etc.).
+ *
+ * TODO We may support some additional conditions, most importantly those
+ * matching multiple columns (e.g. "a = b" or "a < b").
+ *
+ * TODO Clamp the selectivity by min of the per-clause selectivities (i.e. the
+ * selectivity of the most restrictive clause), because that's the maximum
+ * we can ever get from an ANDed list of clauses. This would probably prevent
+ * issues with hitting too many buckets in low-precision histograms.
+ *
+ * TODO We may remember the lowest frequency in the MCV list, and then later use
+ * it as an upper bound for the selectivity (had there been a more
+ * frequent item, it'd be in the MCV list). This might improve cases with
+ * low-detail histograms.
+ *
+ * TODO We may also derive some additional boundaries for the selectivity from
+ * the MCV list, because
+ *
+ * (a) if we have a "full equality condition" (one equality condition on
+ * each column of the statistic) and we found a match in the MCV list,
+ * then this is the final selectivity (and pretty accurate),
+ *
+ * (b) if we have a "full equality condition" and we haven't found a match
+ * in the MCV list, then the selectivity is below the lowest frequency
+ * found in the MCV list,
+ *
+ * TODO When applying the clauses to the histogram/MCV list, we can do
+ * that from the most selective clauses first, because that'll
+ * eliminate the buckets/items sooner (so we'll be able to skip
+ * them without inspection, which is more expensive). But this
+ * requires really knowing the per-clause selectivities in advance,
+ * and that's not what we do now.
+ */
+static Selectivity
+clauselist_mv_selectivity(PlannerInfo *root, List *clauses, MVStatisticInfo *mvstats)
+{
+ bool fullmatch = false;
+
+ /*
+ * Lowest frequency in the MCV list (may be used as an upper bound
+ * for full equality conditions that did not match any MCV item).
+ */
+ Selectivity mcv_low = 0.0;
+
+ /* TODO Evaluate simple 1D selectivities, use the smallest one as
+ * an upper bound, product as lower bound, and sort the
+ * clauses in ascending order by selectivity (to optimize the
+ * MCV/histogram evaluation).
+ */
+
+ /* Evaluate the MCV selectivity */
+ return clauselist_mv_selectivity_mcvlist(root, clauses, mvstats,
+ &fullmatch, &mcv_low);
+}
+
/*
* Pull varattnos from the clauses, similarly to pull_varattnos() but:
*
@@ -869,28 +997,26 @@ get_varattnos(Node * node, Index relid)
* Collect attributes from mv-compatible clauses.
*/
static Bitmapset *
-collect_mv_attnums(List *clauses, Index relid)
+collect_mv_attnums(List *clauses, Index relid, int types)
{
Bitmapset *attnums = NULL;
ListCell *l;
/*
- * Walk through the clauses and identify the ones we can estimate
- * using multivariate stats, and remember the relid/columns. We'll
- * then cross-check if we have suitable stats, and only if needed
- * we'll split the clauses into multivariate and regular lists.
+ * Walk through the clauses and identify the ones we can estimate using
+ * multivariate stats, and remember the relid/columns. We'll then
+ * cross-check if we have suitable stats, and only if needed we'll split
+ * the clauses into multivariate and regular lists.
*
- * For now we're only interested in RestrictInfo nodes with nested
- * OpExpr, using either a range or equality.
+ * For now we're only interested in RestrictInfo nodes with nested OpExpr,
+ * using either a range or equality.
*/
foreach (l, clauses)
{
- AttrNumber attnum;
Node *clause = (Node *) lfirst(l);
- /* ignore the result for now - we only need the info */
- if (clause_is_mv_compatible(clause, relid, &attnum))
- attnums = bms_add_member(attnums, attnum);
+ /* ignore the result here - we only need the attnums */
+ clause_is_mv_compatible(clause, relid, &attnums, types);
}
/*
@@ -911,10 +1037,10 @@ collect_mv_attnums(List *clauses, Index relid)
* Count the number of attributes in clauses compatible with multivariate stats.
*/
static int
-count_mv_attnums(List *clauses, Index relid)
+count_mv_attnums(List *clauses, Index relid, int type)
{
int c;
- Bitmapset *attnums = collect_mv_attnums(clauses, relid);
+ Bitmapset *attnums = collect_mv_attnums(clauses, relid, type);
c = bms_num_members(attnums);
@@ -944,9 +1070,183 @@ count_varnos(List *clauses, Index *relid)
return cnt;
}
+
+/*
+ * We're looking for statistics matching at least 2 attributes, referenced in
+ * clauses compatible with multivariate statistics. The current selection
+ * criteria is very simple - we choose the statistics referencing the most
+ * attributes.
+ *
+ * If there are multiple statistics referencing the same number of columns
+ * (from the clauses), the one with fewer source columns (as listed in the
+ * CREATE STATISTICS command) wins; otherwise the first one wins.
+ *
+ * This is a very simple criterion, and has several weaknesses:
+ *
+ * (a) does not consider the accuracy of the statistics
+ *
+ * If there are two histograms built on the same set of columns, but one
+ * has 100 buckets and the other one has 1000 buckets (thus likely
+ * providing better estimates), this is not currently considered.
+ *
+ * (b) does not consider the type of statistics
+ *
+ * If there are three statistics - one containing just a MCV list, another
+ * one with just a histogram and a third one with both, we treat them equally.
+ *
+ * (c) does not consider the number of clauses
+ *
+ * As explained, only the number of referenced attributes counts, so if
+ * there are multiple clauses on a single attribute, this still counts as
+ * a single attribute.
+ *
+ * (d) does not consider type of condition
+ *
+ * Some clauses may work better with some statistics - for example equality
+ * clauses probably work better with MCV lists than with histograms. But
+ * IS [NOT] NULL conditions may often work better with histograms (thanks
+ * to NULL-buckets).
+ *
+ * So for example with five WHERE conditions
+ *
+ * WHERE (a = 1) AND (b = 1) AND (c = 1) AND (d = 1) AND (e = 1)
+ *
+ * and statistics on (a,b), (a,b,e) and (a,b,c,d), the last one will be selected
+ * as it references the most columns.
+ *
+ * Once we have selected the multivariate statistics, we split the list of
+ * clauses into two parts - conditions that are compatible with the selected
+ * stats, and conditions that will be estimated using simple statistics.
+ *
+ * From the example above, conditions
+ *
+ * (a = 1) AND (b = 1) AND (c = 1) AND (d = 1)
+ *
+ * will be estimated using the multivariate statistics (a,b,c,d) while the last
+ * condition (e = 1) will get estimated using the regular ones.
+ *
+ * There are various alternative selection criteria (e.g. counting conditions
+ * instead of just referenced attributes), but eventually the best option should
+ * be to combine multiple statistics. But that's much harder to do correctly.
+ *
+ * TODO Select multiple statistics and combine them when computing the estimate.
+ *
+ * TODO This will probably have to consider compatibility of clauses, because
+ * 'dependencies' will probably work only with equality clauses.
+ */
+static MVStatisticInfo *
+choose_mv_statistics(List *stats, Bitmapset *attnums)
+{
+ int i;
+ ListCell *lc;
+
+ MVStatisticInfo *choice = NULL;
+
+ int current_matches = 1; /* goal #1: maximize */
+ int current_dims = (MVSTATS_MAX_DIMENSIONS+1); /* goal #2: minimize */
+
+ /*
+ * Walk through the list of statistics, and for each one count the referenced
+ * attributes (encoded in the 'attnums' bitmap).
+ */
+ foreach (lc, stats)
+ {
+ MVStatisticInfo *info = (MVStatisticInfo *)lfirst(lc);
+
+ /* columns matching this statistics */
+ int matches = 0;
+
+ int2vector * attrs = info->stakeys;
+ int numattrs = attrs->dim1;
+
+ /* skip dependencies-only stats */
+ if (! info->mcv_built)
+ continue;
+
+ /* count columns covered by the statistics */
+ for (i = 0; i < numattrs; i++)
+ if (bms_is_member(attrs->values[i], attnums))
+ matches++;
+
+ /*
+ * Use this statistics when it increases the number of matched
+ * attributes, or matches the same number of attributes with fewer columns.
+ */
+ if ((matches > current_matches) ||
+ ((matches == current_matches) && (current_dims > numattrs)))
+ {
+ choice = info;
+ current_matches = matches;
+ current_dims = numattrs;
+ }
+ }
+
+ return choice;
+}
+
+
+/*
+ * This splits the clauses list into two parts - one containing clauses that
+ * will be evaluated using the chosen statistics, and the remaining clauses
+ * (either non-mvcompatible, or not related to the histogram).
+ */
+static List *
+clauselist_mv_split(PlannerInfo *root, Index relid,
+ List *clauses, List **mvclauses,
+ MVStatisticInfo *mvstats, int types)
+{
+ int i;
+ ListCell *l;
+ List *non_mvclauses = NIL;
+
+ /* FIXME is there a better way to get info on int2vector? */
+ int2vector * attrs = mvstats->stakeys;
+ int numattrs = mvstats->stakeys->dim1;
+
+ Bitmapset *mvattnums = NULL;
+
+ /* build bitmap of attributes, so we can do bms_is_subset later */
+ for (i = 0; i < numattrs; i++)
+ mvattnums = bms_add_member(mvattnums, attrs->values[i]);
+
+ /* erase the list of mv-compatible clauses */
+ *mvclauses = NIL;
+
+ foreach (l, clauses)
+ {
+ bool match = false; /* by default not mv-compatible */
+ Bitmapset *attnums = NULL;
+ Node *clause = (Node *) lfirst(l);
+
+ if (clause_is_mv_compatible(clause, relid, &attnums, types))
+ {
+ /* are all the attributes part of the selected stats? */
+ if (bms_is_subset(attnums, mvattnums))
+ match = true;
+ }
+
+ /*
+ * The clause matches the selected stats, so put it to the list of
+ * mv-compatible clauses. Otherwise, keep it in the list of 'regular'
+ * clauses (that may be selected later).
+ */
+ if (match)
+ *mvclauses = lappend(*mvclauses, clause);
+ else
+ non_mvclauses = lappend(non_mvclauses, clause);
+ }
+
+ /*
+ * Return the remaining clauses, to be estimated using the regular
+ * per-column statistics.
+ */
+ return non_mvclauses;
+
+}
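
The split boils down to a subset test on attribute sets. A minimal sketch
of the bms_is_subset() logic, using plain bitmasks and hypothetical
columns a..d numbered 1..4, with statistics on (a,b,c):

#include <stdbool.h>
#include <stdio.h>

int main(void)
{
    /* attnums encoded as bitmasks (bit n = attnum n) */
    unsigned stats_attnums   = (1u << 1) | (1u << 2) | (1u << 3);  /* a,b,c */
    unsigned clause_attnums1 = (1u << 1) | (1u << 2);              /* a,b */
    unsigned clause_attnums2 = (1u << 1) | (1u << 4);              /* a,d */

    /* subset test: no bit may be set outside the statistics' columns */
    bool match1 = ((clause_attnums1 & ~stats_attnums) == 0);   /* true */
    bool match2 = ((clause_attnums2 & ~stats_attnums) == 0);   /* false */

    printf("%d %d\n", match1, match2);  /* prints "1 0" */
    return 0;
}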
typedef struct
{
+ int types; /* types of statistics to consider */
Index varno; /* relid we're interested in */
Bitmapset *varattnos; /* attnums referenced by the clauses */
} mv_compatible_context;
@@ -964,23 +1264,66 @@ mv_compatible_walker(Node *node, mv_compatible_context *context)
{
if (node == NULL)
return false;
-
+
if (IsA(node, RestrictInfo))
{
RestrictInfo *rinfo = (RestrictInfo *) node;
-
+
/* Pseudoconstants are not really interesting here. */
if (rinfo->pseudoconstant)
return true;
-
+
/* clauses referencing multiple varnos are incompatible */
if (bms_membership(rinfo->clause_relids) != BMS_SINGLETON)
return true;
-
+
/* check the clause inside the RestrictInfo */
return mv_compatible_walker((Node*)rinfo->clause, (void *) context);
}
+ if (or_clause(node) || and_clause(node) || not_clause(node))
+ {
+ /*
+ * AND/OR/NOT-clauses are supported if all sub-clauses are supported
+ *
+ * TODO We might support mixed case, where some of the clauses are
+ * supported and some are not, and treat all supported subclauses
+ * as a single clause, compute its selectivity using mv stats,
+ * and compute the total selectivity using the current algorithm.
+ *
+ * TODO For RestrictInfo above an OR-clause, we might use the orclause
+ * with nested RestrictInfo - we won't have to call pull_varnos()
+ * for each clause, saving time.
+ *
+ * TODO Perhaps this needs a bit more thought for functional
+ * dependencies? Those don't quite work for NOT cases.
+ */
+ BoolExpr *expr = (BoolExpr *) node;
+ ListCell *lc;
+
+ foreach (lc, expr->args)
+ {
+ if (mv_compatible_walker((Node *) lfirst(lc), context))
+ return true;
+ }
+
+ return false;
+ }
+
+ if (IsA(node, NullTest))
+ {
+ NullTest* nt = (NullTest*)node;
+
+ /*
+ * Only simple (Var IS NULL) expressions are supported for now. Maybe we
+ * could use examine_variable() to relax this?
+ */
+ if (! IsA(nt->arg, Var))
+ return true;
+
+ return mv_compatible_walker((Node*)(nt->arg), context);
+ }
+
if (IsA(node, Var))
{
Var * var = (Var*)node;
@@ -1031,7 +1374,7 @@ mv_compatible_walker(Node *node, mv_compatible_context *context)
/* unsupported structure (two variables or so) */
if (! ok)
return true;
-
+
/*
* If it's not a "<" or ">" or "=" operator, just ignore the clause.
* Otherwise note the relid and attnum for the variable. This uses the
@@ -1041,10 +1384,18 @@ mv_compatible_walker(Node *node, mv_compatible_context *context)
switch (get_oprrest(expr->opno))
{
case F_EQSEL:
-
/* equality conditions are compatible with all statistics */
break;
+ case F_SCALARLTSEL:
+ case F_SCALARGTSEL:
+
+ /* not compatible with functional dependencies */
+ if (! (context->types & MV_CLAUSE_TYPE_MCV))
+ return true; /* terminate */
+
+ break;
+
default:
/* unknown estimator */
@@ -1055,11 +1406,11 @@ mv_compatible_walker(Node *node, mv_compatible_context *context)
return mv_compatible_walker((Node *) var, context);
}
-
+
/* Node not explicitly supported, so terminate */
return true;
}
-
+
/*
* Determines whether the clause is compatible with multivariate stats,
* and if it is, returns some additional information - varno (index
@@ -1078,10 +1429,11 @@ mv_compatible_walker(Node *node, mv_compatible_context *context)
* evaluate them using multivariate stats.
*/
static bool
-clause_is_mv_compatible(Node *clause, Index relid, AttrNumber *attnum)
+clause_is_mv_compatible(Node *clause, Index relid, Bitmapset **attnums, int types)
{
mv_compatible_context context;
+ context.types = types;
context.varno = relid;
context.varattnos = NULL; /* no attnums */
@@ -1089,7 +1441,7 @@ clause_is_mv_compatible(Node *clause, Index relid, AttrNumber *attnum)
return false;
/* remember the newly collected attnums */
- *attnum = bms_singleton_member(context.varattnos);
+ *attnums = bms_add_members(*attnums, context.varattnos);
return true;
}
@@ -1394,24 +1746,39 @@ fdeps_filter_clauses(PlannerInfo *root,
foreach (lc, clauses)
{
- AttrNumber attnum;
+ Bitmapset *attnums = NULL;
Node *clause = (Node *) lfirst(lc);
- if (! clause_is_mv_compatible(clause, relid, &attnum))
+ if (! clause_is_mv_compatible(clause, relid, &attnums,
+ MV_CLAUSE_TYPE_FDEP))
/* clause incompatible with functional dependencies */
*reduced_clauses = lappend(*reduced_clauses, clause);
- else if (! bms_is_member(attnum, deps_attnums))
+ else if (bms_num_members(attnums) > 1)
+
+ /*
+ * clause referencing multiple attributes (strange; should
+ * this be handled by clause_is_mv_compatible directly?)
+ */
+ *reduced_clauses = lappend(*reduced_clauses, clause);
+
+ else if (! bms_is_member(bms_singleton_member(attnums), deps_attnums))
/* clause not covered by the dependencies */
*reduced_clauses = lappend(*reduced_clauses, clause);
else
{
+ /* ok, clause compatible with existing dependencies */
+ Assert(bms_num_members(attnums) == 1);
+
*deps_clauses = lappend(*deps_clauses, clause);
- clause_attnums = bms_add_member(clause_attnums, attnum);
+ clause_attnums = bms_add_member(clause_attnums,
+ bms_singleton_member(attnums));
}
+
+ bms_free(attnums);
}
return clause_attnums;
@@ -1637,6 +2004,9 @@ has_stats(List *stats, int type)
if ((type & MV_CLAUSE_TYPE_FDEP) && stat->deps_built)
return true;
+
+ if ((type & MV_CLAUSE_TYPE_MCV) && stat->mcv_built)
+ return true;
}
return false;
@@ -1652,3 +2022,392 @@ find_stats(PlannerInfo *root, Index relid)
return root->simple_rel_array[relid]->mvstatlist;
}
+
+/*
+ * Estimate selectivity of clauses using a MCV list.
+ *
+ * If there's no MCV list for the stats, the function returns 0.0.
+ *
+ * While computing the estimate, the function checks whether all the
+ * columns were matched with an equality condition. If that's the case,
+ * we can skip processing the histogram, as there can be no rows in
+ * it with the same values - all the rows matching the condition are
+ * represented by the MCV item. This can only happen with equality
+ * on all the attributes.
+ *
+ * The algorithm works like this:
+ *
+ * 1) mark all items as 'match'
+ * 2) walk through all the clauses
+ * 3) for a particular clause, walk through all the items
+ * 4) skip items that are already 'no match'
+ * 5) check clause for items that still match
+ * 6) sum frequencies for items to get selectivity
+ *
+ * The function also returns the frequency of the least frequent item
+ * on the MCV list, which may be useful for clamping the estimate from the
+ * histogram (all items not present in the MCV list are less frequent).
+ * This however seems useful only for cases with conditions on all
+ * attributes.
+ *
+ * TODO This only handles AND-ed clauses, but it might work for OR-ed
+ * lists too - it just needs to reverse the logic a bit. I.e. start
+ * with 'no match' for all items, and mark the items as a match
+ * as the clauses are processed (and skip items that are 'match').
+ */
+static Selectivity
+clauselist_mv_selectivity_mcvlist(PlannerInfo *root, List *clauses,
+ MVStatisticInfo *mvstats, bool *fullmatch,
+ Selectivity *lowsel)
+{
+ int i;
+ Selectivity s = 0.0;
+ Selectivity u = 0.0;
+
+ MCVList mcvlist = NULL;
+ int nmatches = 0;
+
+ /* match/mismatch bitmap for each MCV item */
+ char * matches = NULL;
+
+ Assert(clauses != NIL);
+ Assert(list_length(clauses) >= 2);
+
+ /* there's no MCV list built yet */
+ if (! mvstats->mcv_built)
+ return 0.0;
+
+ mcvlist = load_mv_mcvlist(mvstats->mvoid);
+
+ Assert(mcvlist != NULL);
+ Assert(mcvlist->nitems > 0);
+
+ /* by default all the MCV items match the clauses fully */
+ matches = palloc0(sizeof(char) * mcvlist->nitems);
+ memset(matches, MVSTATS_MATCH_FULL, sizeof(char)*mcvlist->nitems);
+
+ /* number of matching MCV items */
+ nmatches = mcvlist->nitems;
+
+ nmatches = update_match_bitmap_mcvlist(root, clauses,
+ mvstats->stakeys, mcvlist,
+ nmatches, matches,
+ lowsel, fullmatch, false);
+
+ /* sum frequencies for all the matching MCV items */
+ for (i = 0; i < mcvlist->nitems; i++)
+ {
+ /* used to 'scale' for MCV lists not covering all tuples */
+ u += mcvlist->items[i]->frequency;
+
+ if (matches[i] != MVSTATS_MATCH_NONE)
+ s += mcvlist->items[i]->frequency;
+ }
+
+ pfree(matches);
+ pfree(mcvlist);
+
+ return s*u;
+}
+
+/*
+ * Evaluate clauses using the MCV list, and update the match bitmap.
+ *
+ * The bitmap may already be partially set, so this is really a way to
+ * combine results of several clause lists - either when computing
+ * conditional probability P(A|B) or a combination of AND/OR clauses.
+ *
+ * TODO This works with 'bitmap' where each bit is represented as a char,
+ * which is slightly wasteful. Instead, we could use a regular
+ * bitmap, reducing the size to ~1/8. Another thing is merging the
+ * bitmaps using & and |, which might be faster than min/max.
+ */
+static int
+update_match_bitmap_mcvlist(PlannerInfo *root, List *clauses,
+ int2vector *stakeys, MCVList mcvlist,
+ int nmatches, char * matches,
+ Selectivity *lowsel, bool *fullmatch,
+ bool is_or)
+{
+ int i;
+ ListCell * l;
+
+ Bitmapset *eqmatches = NULL; /* attributes with equality matches */
+
+ /* The bitmap may be partially built. */
+ Assert(nmatches >= 0);
+ Assert(nmatches <= mcvlist->nitems);
+ Assert(clauses != NIL);
+ Assert(list_length(clauses) >= 1);
+ Assert(mcvlist != NULL);
+ Assert(mcvlist->nitems > 0);
+
+ /* No more matches possible (AND), or everything already matches (OR). */
+ if (((nmatches == 0) && (! is_or)) ||
+ ((nmatches == mcvlist->nitems) && is_or))
+ return nmatches;
+
+ /*
+ * find the lowest frequency in the MCV list
+ *
+ * We need to do that here, because we do various tricks in the following
+ * code - skipping items already ruled out, etc.
+ *
+ * XXX A loop is necessary because the MCV list is not sorted by frequency.
+ */
+ *lowsel = 1.0;
+ for (i = 0; i < mcvlist->nitems; i++)
+ {
+ MCVItem item = mcvlist->items[i];
+
+ if (item->frequency < *lowsel)
+ *lowsel = item->frequency;
+ }
+
+ /*
+ * Loop through the list of clauses, and for each of them evaluate
+ * all the MCV items not yet eliminated by the preceding clauses.
+ */
+ foreach (l, clauses)
+ {
+ Node * clause = (Node*)lfirst(l);
+
+ /* if it's a RestrictInfo, then extract the clause */
+ if (IsA(clause, RestrictInfo))
+ clause = (Node*)((RestrictInfo*)clause)->clause;
+
+ /* if there are no remaining matches possible, we can stop */
+ if (((nmatches == 0) && (! is_or)) ||
+ ((nmatches == mcvlist->nitems) && is_or))
+ break;
+
+ /* it's either an OpExpr, a NullTest, or an AND/OR clause */
+ if (is_opclause(clause))
+ {
+ OpExpr *expr = (OpExpr*)clause;
+ bool varonleft = true;
+ bool ok;
+ FmgrInfo opproc;
+
+ /* get procedure computing operator selectivity */
+ RegProcedure oprrest = get_oprrest(expr->opno);
+
+ fmgr_info(get_opcode(expr->opno), &opproc);
+
+ ok = (NumRelids(clause) == 1) &&
+ (is_pseudo_constant_clause(lsecond(expr->args)) ||
+ (varonleft = false,
+ is_pseudo_constant_clause(linitial(expr->args))));
+
+ if (ok)
+ {
+
+ FmgrInfo gtproc;
+ Var * var = (varonleft) ? linitial(expr->args) : lsecond(expr->args);
+ Const * cst = (varonleft) ? lsecond(expr->args) : linitial(expr->args);
+ bool isgt = (! varonleft);
+
+ TypeCacheEntry *typecache
+ = lookup_type_cache(var->vartype, TYPECACHE_GT_OPR);
+
+ /* FIXME proper matching attribute to dimension */
+ int idx = mv_get_index(var->varattno, stakeys);
+
+ fmgr_info(get_opcode(typecache->gt_opr), &gtproc);
+
+ /*
+ * Walk through the MCV items and evaluate the current clause. We can
+ * skip items that were already ruled out, and terminate if there are
+ * no remaining MCV items that might possibly match.
+ */
+ for (i = 0; i < mcvlist->nitems; i++)
+ {
+ bool mismatch = false;
+ MCVItem item = mcvlist->items[i];
+
+ /*
+ * If there are no more matches (AND) or no remaining unmatched
+ * items (OR), we can stop processing this clause.
+ */
+ if (((nmatches == 0) && (! is_or)) ||
+ ((nmatches == mcvlist->nitems) && is_or))
+ break;
+
+ /*
+ * For AND-lists, we can also mark NULL items as 'no match' (and
+ * then skip them). For OR-lists this is not possible.
+ */
+ if ((! is_or) && item->isnull[idx])
+ matches[i] = MVSTATS_MATCH_NONE;
+
+ /* skip MCV items that were already ruled out */
+ if ((! is_or) && (matches[i] == MVSTATS_MATCH_NONE))
+ continue;
+ else if (is_or && (matches[i] == MVSTATS_MATCH_FULL))
+ continue;
+
+ switch (oprrest)
+ {
+ case F_EQSEL:
+ /*
+ * We don't care about isgt in equality, because it does not
+ * matter whether it's (var = const) or (const = var).
+ */
+ mismatch = ! DatumGetBool(FunctionCall2Coll(&opproc,
+ DEFAULT_COLLATION_OID,
+ cst->constvalue,
+ item->values[idx]));
+
+ if (! mismatch)
+ eqmatches = bms_add_member(eqmatches, idx);
+
+ break;
+
+ case F_SCALARLTSEL: /* column < constant */
+ case F_SCALARGTSEL: /* column > constant */
+
+ /*
+ * Evaluate the operator with the constant on the left, i.e. as
+ * (const op value) - if that evaluates to true, the MCV item
+ * does not match the clause (var op const).
+ */
+ mismatch = DatumGetBool(FunctionCall2Coll(&opproc,
+ DEFAULT_COLLATION_OID,
+ cst->constvalue,
+ item->values[idx]));
+
+ /* invert the result if isgt=true */
+ mismatch = (isgt) ? (! mismatch) : mismatch;
+ break;
+ }
+
+ /* XXX The conditions on matches[i] are not needed, as we
+ * skip MCV items that can't become true/false, depending
+ * on the current flag. See beginning of the loop over
+ * MCV items.
+ */
+
+ if ((is_or) && (matches[i] == MVSTATS_MATCH_NONE) && (! mismatch))
+ {
+ /* OR - was MATCH_NONE, but will be MATCH_FULL */
+ matches[i] = MVSTATS_MATCH_FULL;
+ ++nmatches;
+ continue;
+ }
+ else if ((! is_or) && (matches[i] == MVSTATS_MATCH_FULL) && mismatch)
+ {
+ /* AND - was MATCH_FULL, but will be MATCH_NONE */
+ matches[i] = MVSTATS_MATCH_NONE;
+ --nmatches;
+ continue;
+ }
+
+ }
+ }
+ }
+ else if (IsA(clause, NullTest))
+ {
+ NullTest * expr = (NullTest*)clause;
+ Var * var = (Var*)(expr->arg);
+
+ /* FIXME proper matching attribute to dimension */
+ int idx = mv_get_index(var->varattno, stakeys);
+
+ /*
+ * Walk through the MCV items and evaluate the current clause. We can
+ * skip items that were already ruled out, and terminate if there are
+ * no remaining MCV items that might possibly match.
+ */
+ for (i = 0; i < mcvlist->nitems; i++)
+ {
+ MCVItem item = mcvlist->items[i];
+
+ /* if there are no more matches, we can stop processing this clause */
+ if (nmatches == 0)
+ break;
+
+ /* skip MCV items that were already ruled out */
+ if (matches[i] == MVSTATS_MATCH_NONE)
+ continue;
+
+ /* if the clause mismatches the MCV item, set it as MATCH_NONE */
+ if (((expr->nulltesttype == IS_NULL) && (! item->isnull[idx])) ||
+ ((expr->nulltesttype == IS_NOT_NULL) && (item->isnull[idx])))
+ {
+ matches[i] = MVSTATS_MATCH_NONE;
+ --nmatches;
+ }
+ }
+ }
+ else if (or_clause(clause) || and_clause(clause))
+ {
+ /* AND/OR clause, with all clauses compatible with the selected MV stat */
+
+ int i;
+ BoolExpr *orclause = ((BoolExpr*)clause);
+ List *orclauses = orclause->args;
+
+ /* match/mismatch bitmap for each MCV item */
+ int or_nmatches = 0;
+ char * or_matches = NULL;
+
+ Assert(orclauses != NIL);
+ Assert(list_length(orclauses) >= 2);
+
+ /* number of matching MCV items */
+ or_nmatches = mcvlist->nitems;
+
+ /* by default none of the MCV items matches the clauses */
+ or_matches = palloc0(sizeof(char) * or_nmatches);
+
+ if (or_clause(clause))
+ {
+ /* OR clauses assume nothing matches, initially */
+ memset(or_matches, MVSTATS_MATCH_NONE, sizeof(char)*or_nmatches);
+ or_nmatches = 0;
+ }
+ else
+ {
+ /* AND clauses assume everything matches, initially */
+ memset(or_matches, MVSTATS_MATCH_FULL, sizeof(char)*or_nmatches);
+ }
+
+ /* build the match bitmap for the OR-clauses */
+ or_nmatches = update_match_bitmap_mcvlist(root, orclauses,
+ stakeys, mcvlist,
+ or_nmatches, or_matches,
+ lowsel, fullmatch, or_clause(clause));
+
+ /* merge the bitmap into the existing one */
+ nmatches = 0;
+ for (i = 0; i < mcvlist->nitems; i++)
+ {
+ /*
+ * To AND-merge the bitmaps, a MIN() semantics is used.
+ * For OR-merge, use MAX().
+ */
+ UPDATE_RESULT(matches[i], or_matches[i], is_or);
+
+ /* recount the matches, as the merge may have changed the bitmap */
+ if (matches[i] == MVSTATS_MATCH_FULL)
+ nmatches++;
+ }
+
+ pfree(or_matches);
+
+ }
+ else
+ {
+ elog(ERROR, "unknown clause type: %d", clause->type);
+ }
+ }
+
+ /*
+ * If all the columns were matched by equality, it's a full match.
+ * In this case at most one MCV item can match the clauses (if two
+ * items matched, they would have to be duplicates of each other).
+ */
+ *fullmatch = (bms_num_members(eqmatches) == mcvlist->ndimensions);
+
+ /* free the allocated pieces */
+ if (eqmatches)
+ pfree(eqmatches);
+
+ return nmatches;
+}
diff --git a/src/backend/optimizer/util/plancat.c b/src/backend/optimizer/util/plancat.c
index b9de71d..a92f889 100644
--- a/src/backend/optimizer/util/plancat.c
+++ b/src/backend/optimizer/util/plancat.c
@@ -416,7 +416,7 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
mvstat = (Form_pg_mv_statistic) GETSTRUCT(htup);
/* unavailable stats are not interesting for the planner */
- if (mvstat->deps_built)
+ if (mvstat->deps_built || mvstat->mcv_built)
{
info = makeNode(MVStatisticInfo);
@@ -425,9 +425,11 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
/* enabled statistics */
info->deps_enabled = mvstat->deps_enabled;
+ info->mcv_enabled = mvstat->mcv_enabled;
/* built/available statistics */
info->deps_built = mvstat->deps_built;
+ info->mcv_built = mvstat->mcv_built;
/* stakeys */
adatum = SysCacheGetAttr(MVSTATOID, htup,
diff --git a/src/backend/utils/mvstats/Makefile b/src/backend/utils/mvstats/Makefile
index 099f1ed..f9bf10c 100644
--- a/src/backend/utils/mvstats/Makefile
+++ b/src/backend/utils/mvstats/Makefile
@@ -12,6 +12,6 @@ subdir = src/backend/utils/mvstats
top_builddir = ../../../..
include $(top_builddir)/src/Makefile.global
-OBJS = common.o dependencies.o
+OBJS = common.o dependencies.o mcv.o
include $(top_srcdir)/src/backend/common.mk
diff --git a/src/backend/utils/mvstats/README.mcv b/src/backend/utils/mvstats/README.mcv
new file mode 100644
index 0000000..e93cfe4
--- /dev/null
+++ b/src/backend/utils/mvstats/README.mcv
@@ -0,0 +1,137 @@
+MCV lists
+=========
+
+Multivariate MCV (most-common values) lists are a straightforward extension of
+regular MCV list, tracking most frequent combinations of values for a group of
+attributes.
+
+This works particularly well for columns with a small number of distinct values,
+as the list may include all the combinations and approximate the distribution
+very accurately.
+
+For columns with a large number of distinct values (e.g. those with continuous
+domains), the list will only track the most frequent combinations. If the
+distribution is mostly uniform (all combinations about equally frequent), the
+MCV list will be empty.
+
+Estimates of some clauses (e.g. equality) based on MCV lists are more accurate
+than when using histograms.
+
+Also, MCV lists don't necessarily require sorting of the values (the fact that
+we use sorting when building them is an implementation detail), but even more
+importantly the ordering is not built into the approximation (while histograms
+are built on ordering). So MCV lists work well even for attributes where the
+ordering of the data type is disconnected from the meaning of the data. For
+example we know how to sort strings, but it's unlikely to make much sense for
+city names (or other label-like attributes).
+
+
+Selectivity estimation
+----------------------
+
+The estimation, implemented in clauselist_mv_selectivity_mcvlist(), is quite
+simple in principle - we need to identify MCV items matching all the clauses
+and sum frequencies of all those items.
+
+Currently MCV lists support estimation of the following clause types:
+
+ (a) equality clauses WHERE (a = 1) AND (b = 2)
+ (b) inequality clauses WHERE (a < 1) AND (b >= 2)
+ (c) NULL clauses WHERE (a IS NULL) AND (b IS NOT NULL)
+ (d) OR clauses WHERE (a < 1) OR (b >= 2)
+
+It's possible to add support for additional clauses, for example:
+
+ (e) multi-var clauses WHERE (a > b)
+
+and possibly others. These are tasks for the future, not yet implemented.
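+
+As a concrete (purely illustrative) example, with statistics on (a,b,c) a
+single MCV list may be used to estimate all the clauses in
+
+ WHERE (a = 1) AND ((b < 10) OR (c IS NOT NULL))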
+
+
+Estimating equality clauses
+---------------------------
+
+When computing selectivity estimate for equality clauses
+
+ (a = 1) AND (b = 2)
+
+we can compute the estimate almost exactly, provided two conditions are met:
+
+ (1) there's an equality condition on all attributes of the statistic
+
+ (2) we find a matching item in the MCV list
+
+In this case we know the MCV item represents all tuples matching the clauses,
+and the selectivity estimate is complete (i.e. we don't need to perform
+estimation using the histogram). This is what we call 'full match'.
+
+When only (1) holds, but there's no matching MCV item, we don't know whether
+there are no such rows or whether they are just not frequent enough. We can
+however use the frequency of the least frequent MCV item as an upper bound
+for the selectivity.
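+
+For example, if the least frequent MCV item has frequency 0.001, the
+selectivity of such a non-matching combination can be clamped to 0.001 -
+a more frequent combination would have made it into the list.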
+
+For a combination of equality conditions (not full-match case) we can clamp the
+selectivity by the minimum of selectivities for each condition. For example if
+we know the number of distinct values for each column, we can use 1/ndistinct
+as a per-column estimate. Or rather 1/ndistinct + selectivity derived from the
+MCV list.
+
+We should probably also use the 'residual ndistinct', i.e. exclude the items
+included in the MCV list (and likewise the residual frequency):
+
+ f = (1.0 - sum(MCV frequencies)) / (ndistinct - ndistinct(MCV list))
+
+but it's worth pointing out the ndistinct values here are multivariate, i.e.
+for the combination of columns referenced by the equality conditions.
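+
+For example (numbers purely illustrative), with an MCV list covering 80% of
+the rows and containing 100 items, and a multivariate ndistinct of 1100:
+
+ f = (1.0 - 0.8) / (1100 - 100) = 0.0002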
+
+Note: Only the "full match" limit is currently implemented.
+
+
+Hashed MCV (not yet implemented)
+--------------------------------
+
+Regular MCV lists have to include actual values for each item, so if those items
+are large the list may be quite large. This is especially true for multi-variate
+MCV lists, although the current implementation partially mitigates this by
+de-duplicating the values before storing them on disk.
+
+It's possible to only store hashes (32-bit values) instead of the actual values,
+significantly reducing the space requirements. Obviously, this would only make
+the MCV lists useful for estimating equality conditions (assuming the 32-bit
+hashes make the collisions rare enough).
+
+This might also complicate matching the columns to available stats.
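+
+A minimal sketch of how the hashing might look for a varlena value (using
+hash_any from access/hash.h; the function name is hypothetical):
+
+ /* hypothetical: reduce a varlena Datum to a 32-bit hash */
+ static uint32
+ mcv_hash_varlena(Datum value)
+ {
+ return DatumGetUInt32(hash_any((unsigned char *) VARDATA_ANY(value),
+ VARSIZE_ANY_EXHDR(value)));
+ }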
+
+
+TODO Consider implementing hashed MCV list, storing just 32-bit hashes instead
+ of the actual values. This type of MCV list will be useful only for
+ estimating equality clauses, and will reduce space requirements for large
+ varlena types (in such cases we usually only want equality anyway).
+
+TODO Currently there's no logic to consider building only a MCV list (and not
+ building the histogram at all), except for making this decision manually in
+ ADD STATISTICS.
+
+
+Inspecting the MCV list
+-----------------------
+
+Inspecting the regular (per-attribute) MCV lists is trivial, as it's enough
+to select the columns from pg_stats - the data is encoded as anyarrays, so we
+simply get the text representation of the arrays.
+
+With multivariate MCV lists it's not that simple due to the possible mix of
+data types. It might be possible to produce similar array-like representation,
+but that'd unnecessarily complicate further processing and analysis of the MCV
+list. Instead, there's a SRF function providing values, frequencies etc.
+
+ SELECT * FROM pg_mv_mcv_items(oid);
+
+It has a single input parameter:
+
+ oid - OID of the MCV list (pg_mv_statistic.staoid)
+
+and produces a table with these columns:
+
+ - item ID (0...nitems-1)
+ - values (string array)
+ - nulls only (boolean array)
+ - frequency (double precision)
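+
+For example (the OID value is hypothetical - look it up in pg_mv_statistic
+first):
+
+ SELECT * FROM pg_mv_mcv_items(16412);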
diff --git a/src/backend/utils/mvstats/README.stats b/src/backend/utils/mvstats/README.stats
index a38ea7b..5c5c59a 100644
--- a/src/backend/utils/mvstats/README.stats
+++ b/src/backend/utils/mvstats/README.stats
@@ -8,9 +8,50 @@ not true, resulting in estimation errors.
Multivariate stats track different types of dependencies between the columns,
hopefully improving the estimates.
-Currently we only have one kind of multivariate statistics - soft functional
-dependencies, and we use it to improve estimates of equality clauses. See
-README.dependencies for details.
+
+Types of statistics
+-------------------
+
+Currently there are two kinds of multivariate statistics:
+
+ (a) soft functional dependencies (README.dependencies)
+
+ (b) MCV lists (README.mcv)
+
+
+Compatible clause types
+-----------------------
+
+Each type of statistics may be used to estimate some subset of clause types.
+
+ (a) functional dependencies - equality clauses (AND), possibly IS NULL
+
+ (b) MCV list - equality and inequality clauses, IS [NOT] NULL, AND/OR
+
+Currently only simple operator clauses (Var op Const) are supported, but it's
+possible to support more complex clause types, e.g. (Var op Var).
+
+
+Complex clauses
+---------------
+
+We also support estimating more complex clauses - essentially AND/OR clauses
+with (Var op Const) as leaves, as long as all the referenced attributes are
+covered by a single statistics.
+
+For example this condition
+
+ (a=1) AND ((b=2) OR ((c=3) AND (d=4)))
+
+may be estimated using statistics on (a,b,c,d). If we only have statistics on
+(b,c,d) we may estimate the second part, and estimate (a=1) using simple stats.
+
+If we only have statistics on (a,b,c), we can't apply them at this point, but
+it's worth pointing out that clauselist_selectivity() works recursively - when
+handling the second part (the OR-clause), we'll be able to apply the statistics.
+
+Note: The multi-statistics estimation patch also makes it possible to pass some
+clauses as 'conditions' into the deeper parts of the expression tree.
Selectivity estimation
@@ -23,14 +64,48 @@ When estimating selectivity, we aim to achieve several things:
(b) minimize the overhead, especially when no suitable multivariate stats
exist (so if you are not using multivariate stats, there's no overhead)
-This clauselist_selectivity() performs several inexpensive checks first, before
+Thus clauselist_selectivity() performs several inexpensive checks first, before
even attempting to do the more expensive estimation.
(1) check if there are multivariate stats on the relation
- (2) check there are at least two attributes referenced by clauses compatible
- with multivariate statistics (equality clauses for func. dependencies)
+ (2) check that there are functional dependencies on the table, and that
+ there are at least two attributes referenced by compatible clauses
+ (equality clauses for func. dependencies)
(3) perform reduction of equality clauses using func. dependencies
- (4) estimate the reduced list of clauses using regular statistics
+ (4) check that there are multivariate MCV lists on the table, and that
+ there are at least two attributes referenced by compatible clauses
+ (equalities, inequalities, etc.)
+
+ (5) find the best multivariate statistics (matching the most conditions)
+ and use it to compute the estimate
+
+ (6) estimate the remaining clauses (not estimated using multivariate stats)
+ using the regular per-column statistics
+
+Whenever we find there are no suitable stats, we skip the expensive steps.
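+
+For example (hypothetical), with statistics on (a,b) and a query
+
+ WHERE (a = 1) AND (b = 2) AND (c < 10)
+
+the first two clauses are estimated using the multivariate MCV list in step
+(5), while (c < 10) falls through to step (6) and uses per-column statistics.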
+
+
+Further (possibly crazy) ideas
+------------------------------
+
+Currently the clauses are only estimated using a single statistics, even if
+there are multiple candidate statistics - for example assume we have statistics
+on (a,b,c) and (b,c,d), and estimate conditions
+
+ (b = 1) AND (c = 2)
+
+Then both statistics may be used, but we only use one of them. Maybe we could
+compute estimates using all the candidate stats, and somehow aggregate them
+into the final estimate by using average or median.
+
+Some stats may give better estimates than others, but it's very difficult to say
+in advance which stats are the best (it depends on the number of buckets, number
+of additional columns not referenced in the clauses, type of condition etc.).
+
+But of course, this may result in expensive estimation (CPU-wise).
+
+So we might add a GUC to choose between the simple (single statistics) and the
+multi-statistic estimation, possibly as a table-level parameter (ALTER TABLE ...).
diff --git a/src/backend/utils/mvstats/common.c b/src/backend/utils/mvstats/common.c
index bd200bc..d1da714 100644
--- a/src/backend/utils/mvstats/common.c
+++ b/src/backend/utils/mvstats/common.c
@@ -16,12 +16,14 @@
#include "common.h"
+#include "utils/array.h"
+
static VacAttrStats ** lookup_var_attr_stats(int2vector *attrs,
- int natts, VacAttrStats **vacattrstats);
+ int natts,
+ VacAttrStats **vacattrstats);
static List* list_mv_stats(Oid relid);
-
/*
* Compute requested multivariate stats, using the rows sampled for the
* plain (single-column) stats.
@@ -49,6 +51,8 @@ build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
int j;
MVStatisticInfo *stat = (MVStatisticInfo *)lfirst(lc);
MVDependencies deps = NULL;
+ MCVList mcvlist = NULL;
+ int numrows_filtered = 0;
VacAttrStats **stats = NULL;
int numatts = 0;
@@ -87,8 +91,12 @@ build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
if (stat->deps_enabled)
deps = build_mv_dependencies(numrows, rows, attrs, stats);
+ /* build the MCV list */
+ if (stat->mcv_enabled)
+ mcvlist = build_mv_mcvlist(numrows, rows, attrs, stats, &numrows_filtered);
+
/* store the histogram / MCV list in the catalog */
- update_mv_stats(stat->mvoid, deps, attrs);
+ update_mv_stats(stat->mvoid, deps, mcvlist, attrs, stats);
}
}
@@ -166,6 +174,8 @@ list_mv_stats(Oid relid)
info->stakeys = buildint2vector(stats->stakeys.values, stats->stakeys.dim1);
info->deps_enabled = stats->deps_enabled;
info->deps_built = stats->deps_built;
+ info->mcv_enabled = stats->mcv_enabled;
+ info->mcv_built = stats->mcv_built;
result = lappend(result, info);
}
@@ -180,8 +190,56 @@ list_mv_stats(Oid relid)
return result;
}
+
+/*
+ * Find attnums of MV stats using the mvoid.
+ */
+int2vector*
+find_mv_attnums(Oid mvoid, Oid *relid)
+{
+ ArrayType *arr;
+ Datum adatum;
+ bool isnull;
+ HeapTuple htup;
+ int2vector *keys;
+
+ /* fetch the pg_mv_statistic tuple for the given mvoid */
+ htup = SearchSysCache1(MVSTATOID,
+ ObjectIdGetDatum(mvoid));
+
+ /* XXX syscache contains OIDs of deleted stats (not invalidated) */
+ if (! HeapTupleIsValid(htup))
+ return NULL;
+
+ /* starelid */
+ adatum = SysCacheGetAttr(MVSTATOID, htup,
+ Anum_pg_mv_statistic_starelid, &isnull);
+ Assert(!isnull);
+
+ *relid = DatumGetObjectId(adatum);
+
+ /* stakeys */
+ adatum = SysCacheGetAttr(MVSTATOID, htup,
+ Anum_pg_mv_statistic_stakeys, &isnull);
+ Assert(!isnull);
+
+ arr = DatumGetArrayTypeP(adatum);
+
+ keys = buildint2vector((int16 *) ARR_DATA_PTR(arr),
+ ARR_DIMS(arr)[0]);
+ ReleaseSysCache(htup);
+
+ /* TODO maybe save the list into relcache, as in RelationGetIndexList
+ * (which served as an inspiration for this one). */
+
+ return keys;
+}
+
+
void
-update_mv_stats(Oid mvoid, MVDependencies dependencies, int2vector *attrs)
+update_mv_stats(Oid mvoid,
+ MVDependencies dependencies, MCVList mcvlist,
+ int2vector *attrs, VacAttrStats **stats)
{
HeapTuple stup,
oldtup;
@@ -206,18 +264,29 @@ update_mv_stats(Oid mvoid, MVDependencies dependencies, int2vector *attrs)
= PointerGetDatum(serialize_mv_dependencies(dependencies));
}
+ if (mcvlist != NULL)
+ {
+ bytea * data = serialize_mv_mcvlist(mcvlist, attrs, stats);
+ nulls[Anum_pg_mv_statistic_stamcv -1] = (data == NULL);
+ values[Anum_pg_mv_statistic_stamcv - 1] = PointerGetDatum(data);
+ }
+
/* always replace the value (either by bytea or NULL) */
replaces[Anum_pg_mv_statistic_stadeps -1] = true;
+ replaces[Anum_pg_mv_statistic_stamcv -1] = true;
/* always change the availability flags */
nulls[Anum_pg_mv_statistic_deps_built -1] = false;
+ nulls[Anum_pg_mv_statistic_mcv_built -1] = false;
nulls[Anum_pg_mv_statistic_stakeys-1] = false;
/* use the new attnums, in case we removed some dropped ones */
replaces[Anum_pg_mv_statistic_deps_built-1] = true;
+ replaces[Anum_pg_mv_statistic_mcv_built -1] = true;
replaces[Anum_pg_mv_statistic_stakeys -1] = true;
values[Anum_pg_mv_statistic_deps_built-1] = BoolGetDatum(dependencies != NULL);
+ values[Anum_pg_mv_statistic_mcv_built -1] = BoolGetDatum(mcvlist != NULL);
values[Anum_pg_mv_statistic_stakeys -1] = PointerGetDatum(attrs);
/* Is there already a pg_mv_statistic tuple for this attribute? */
@@ -246,6 +315,21 @@ update_mv_stats(Oid mvoid, MVDependencies dependencies, int2vector *attrs)
heap_close(sd, RowExclusiveLock);
}
+
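+/*
+ * Returns the index (dimension) of the attribute within the stakeys
+ * vector, by counting the entries preceding it. This assumes the
+ * vector is sorted and actually contains varattno.
+ */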
+int
+mv_get_index(AttrNumber varattno, int2vector * stakeys)
+{
+ int i, idx = 0;
+ for (i = 0; i < stakeys->dim1; i++)
+ {
+ if (stakeys->values[i] < varattno)
+ idx += 1;
+ else
+ break;
+ }
+ return idx;
+}
+
/* multi-variate stats comparator */
/*
@@ -256,11 +340,15 @@ update_mv_stats(Oid mvoid, MVDependencies dependencies, int2vector *attrs)
int
compare_scalars_simple(const void *a, const void *b, void *arg)
{
- Datum da = *(Datum*)a;
- Datum db = *(Datum*)b;
- SortSupport ssup= (SortSupport) arg;
+ return compare_datums_simple(*(Datum*)a,
+ *(Datum*)b,
+ (SortSupport)arg);
+}
- return ApplySortComparator(da, false, db, false, ssup);
+int
+compare_datums_simple(Datum a, Datum b, SortSupport ssup)
+{
+ return ApplySortComparator(a, false, b, false, ssup);
}
/*
diff --git a/src/backend/utils/mvstats/common.h b/src/backend/utils/mvstats/common.h
index 6d5465b..f4309f7 100644
--- a/src/backend/utils/mvstats/common.h
+++ b/src/backend/utils/mvstats/common.h
@@ -46,7 +46,15 @@ typedef struct
Datum value; /* a data value */
int tupno; /* position index for tuple it came from */
} ScalarItem;
-
+
+/* (de)serialization info */
+typedef struct DimensionInfo {
+ int nvalues; /* number of deduplicated values */
+ int nbytes; /* number of bytes (serialized) */
+ int typlen; /* pg_type.typlen */
+ bool typbyval; /* pg_type.typbyval */
+} DimensionInfo;
+
/* multi-sort */
typedef struct MultiSortSupportData {
int ndims; /* number of dimensions supported by the */
@@ -71,5 +79,6 @@ int multi_sort_compare_dim(int dim, const SortItem *a,
const SortItem *b, MultiSortSupport mss);
/* comparators, used when constructing multivariate stats */
+int compare_datums_simple(Datum a, Datum b, SortSupport ssup);
int compare_scalars_simple(const void *a, const void *b, void *arg);
int compare_scalars_partition(const void *a, const void *b, void *arg);
diff --git a/src/backend/utils/mvstats/mcv.c b/src/backend/utils/mvstats/mcv.c
new file mode 100644
index 0000000..551c934
--- /dev/null
+++ b/src/backend/utils/mvstats/mcv.c
@@ -0,0 +1,1094 @@
+/*-------------------------------------------------------------------------
+ *
+ * mcv.c
+ * POSTGRES multivariate MCV lists
+ *
+ *
+ * Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ * src/backend/utils/mvstats/mcv.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres.h"
+
+#include "fmgr.h"
+#include "funcapi.h"
+
+#include "utils/lsyscache.h"
+
+#include "common.h"
+
+/*
+ * Each serialized item needs to store (in this order):
+ *
+ * - indexes (ndim * sizeof(uint16))
+ * - null flags (ndim * sizeof(bool))
+ * - frequency (sizeof(double))
+ *
+ * So in total:
+ *
+ * ndim * (sizeof(uint16) + sizeof(bool)) + sizeof(double)
+ */
+#define ITEM_SIZE(ndims) \
+ ((ndims) * (sizeof(uint16) + sizeof(bool)) + sizeof(double))
+
+/* pointers into a flat serialized item of ITEM_SIZE(n) bytes */
+#define ITEM_INDEXES(item) ((uint16*)item)
+#define ITEM_NULLS(item,ndims) ((bool*)(ITEM_INDEXES(item) + ndims))
+#define ITEM_FREQUENCY(item,ndims) ((double*)(ITEM_NULLS(item,ndims) + ndims))
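+
+/*
+ * For example, with ndims = 2 a serialized item is laid out like this
+ * (assuming the usual 1B bool and 8B double):
+ *
+ * bytes 0-3 two uint16 value indexes (one per dimension)
+ * bytes 4-5 two bool NULL flags
+ * bytes 6-13 the double frequency
+ *
+ * i.e. ITEM_SIZE(2) = 2 * (2 + 1) + 8 = 14 bytes.
+ */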
+
+/*
+ * Builds MCV list from sample rows, and removes rows represented by
+ * the MCV list from the sample (the number of remaining sample rows is
+ * returned by the numrows_filtered parameter).
+ *
+ * The method is quite simple - in short, it performs these steps:
+ *
+ * (1) sort the data (default collation, '<' for the data type)
+ *
+ * (2) count distinct groups, decide how many to keep
+ *
+ * (3) build the MCV list using the threshold determined in (2)
+ *
+ * (4) remove rows represented by the MCV from the sample
+ *
+ * For more details, see the comments in the code.
+ *
+ * FIXME Use max_mcv_items from ALTER TABLE ADD STATISTICS command.
+ *
+ * FIXME Single-dimensional MCV is sorted by frequency (descending). We
+ * should do that too, because when walking through the list we
+ * want to check the most frequent items first.
+ *
+ * TODO We're using Datum (8B), even for narrower data types (e.g. int4 or
+ * float4). Maybe we could save some space here, but the bytea
+ * compression should handle it just fine.
+ *
+ * TODO This probably should not use the ndistinct computed from the
+ * sample directly, but rather an estimate of the number of
+ * distinct values in the whole table, no?
+ */
+MCVList
+build_mv_mcvlist(int numrows, HeapTuple *rows, int2vector *attrs,
+ VacAttrStats **stats, int *numrows_filtered)
+{
+ int i, j;
+ int numattrs = attrs->dim1;
+ int ndistinct = 0;
+ int mcv_threshold = 0;
+ int count = 0;
+ int nitems = 0;
+
+ MCVList mcvlist = NULL;
+
+ /* Sort by multiple columns (using array of SortSupport) */
+ MultiSortSupport mss = multi_sort_init(numattrs);
+
+ /*
+ * Preallocate space for all the items as a single chunk, and point
+ * the items to the appropriate parts of the array.
+ */
+ SortItem *items = (SortItem*)palloc0(numrows * sizeof(SortItem));
+ Datum *values = (Datum*)palloc0(sizeof(Datum) * numrows * numattrs);
+ bool *isnull = (bool*)palloc0(sizeof(bool) * numrows * numattrs);
+
+ /* keep all the rows by default (as if there was no MCV list) */
+ *numrows_filtered = numrows;
+
+ for (i = 0; i < numrows; i++)
+ {
+ items[i].values = &values[i * numattrs];
+ items[i].isnull = &isnull[i * numattrs];
+ }
+
+ /* load the values/null flags from sample rows */
+ for (j = 0; j < numrows; j++)
+ for (i = 0; i < numattrs; i++)
+ items[j].values[i] = heap_getattr(rows[j], attrs->values[i],
+ stats[i]->tupDesc, &items[j].isnull[i]);
+
+ /* prepare the sort functions for all the attributes */
+ for (i = 0; i < numattrs; i++)
+ multi_sort_add_dimension(mss, i, i, stats);
+
+ /* do the sort, using the multi-sort */
+ qsort_arg((void *) items, numrows, sizeof(SortItem),
+ multi_sort_compare, mss);
+
+ /*
+ * Count the number of distinct groups - just walk through the
+ * sorted list and count the number of key changes. We use this to
+ * determine the threshold (125% of the average frequency).
+ */
+ ndistinct = 1;
+ for (i = 1; i < numrows; i++)
+ if (multi_sort_compare(&items[i], &items[i-1], mss) != 0)
+ ndistinct += 1;
+
+ /*
+ * Determine how many groups actually exceed the threshold, and then
+ * walk the array again and collect them into an array. We'll always
+ * require at least 4 rows per group.
+ *
+ * But if we can fit all the distinct values in the MCV list (i.e.
+ * if there are fewer distinct groups than MVSTAT_MCVLIST_MAX_ITEMS),
+ * we'll require only 2 rows per group.
+ *
+ * TODO For now the threshold is the same as in the single-column
+ * case (average + 25%), but maybe that's worth revisiting
+ * for the multivariate case.
+ *
+ * TODO We can do this only if we believe we got all the distinct
+ * values of the table.
+ *
+ * FIXME This should really reference mcv_max_items (from catalog)
+ * instead of the constant MVSTAT_MCVLIST_MAX_ITEMS.
+ */
+ mcv_threshold = 1.25 * numrows / ndistinct;
+ mcv_threshold = (mcv_threshold < 4) ? 4 : mcv_threshold;
+
+ if (ndistinct <= MVSTAT_MCVLIST_MAX_ITEMS)
+ mcv_threshold = 2;
+
+ /*
+ * Walk through the sorted data again, and see how many groups
+ * reach the mcv_threshold (and become an item in the MCV list).
+ */
+ count = 1;
+ for (i = 1; i <= numrows; i++)
+ {
+ /* last row or new group, so check if we exceed mcv_threshold */
+ if ((i == numrows) || (multi_sort_compare(&items[i], &items[i-1], mss) != 0))
+ {
+ /* group hits the threshold, count the group as MCV item */
+ if (count >= mcv_threshold)
+ nitems += 1;
+
+ count = 1;
+ }
+ else /* within group, so increase the number of items */
+ count += 1;
+ }
+
+ /* we know the number of MCV list items, so let's build the list */
+ if (nitems > 0)
+ {
+ /* allocate the MCV list structure, set parameters we know */
+ mcvlist = (MCVList)palloc0(sizeof(MCVListData));
+
+ mcvlist->magic = MVSTAT_MCV_MAGIC;
+ mcvlist->type = MVSTAT_MCV_TYPE_BASIC;
+ mcvlist->ndimensions = numattrs;
+ mcvlist->nitems = nitems;
+
+ /*
+ * Preallocate the Datum/isnull arrays (not as a single chunk, as
+ * we'll pass this outside this method and thus it needs to be
+ * easy to pfree() the data - we wouldn't know where the arrays
+ * start otherwise).
+ *
+ * TODO Maybe the reasoning that we can't allocate a single
+ * piece because we're passing it out is bogus? Who'd
+ * free a single item of the MCV list, anyway?
+ *
+ * TODO Maybe with a proper encoding (stuffing all the values
+ * into a list-level array), this will no longer be true?
+ */
+ mcvlist->items = (MCVItem*)palloc0(sizeof(MCVItem)*nitems);
+
+ for (i = 0; i < nitems; i++)
+ {
+ mcvlist->items[i] = (MCVItem)palloc0(sizeof(MCVItemData));
+ mcvlist->items[i]->values = (Datum*)palloc0(sizeof(Datum)*numattrs);
+ mcvlist->items[i]->isnull = (bool*)palloc0(sizeof(bool)*numattrs);
+ }
+
+ /*
+ * Repeat the same loop as above, but this time copy the data
+ * into the MCV list (for items exceeding the threshold).
+ *
+ * TODO Maybe we could simply remember indexes of the last item
+ * in each group (from the previous loop)?
+ */
+ count = 1;
+ nitems = 0;
+ for (i = 1; i <= numrows; i++)
+ {
+ /* last row or a new group */
+ if ((i == numrows) || (multi_sort_compare(&items[i], &items[i-1], mss) != 0))
+ {
+ /* count the MCV item if exceeding the threshold (and copy into the array) */
+ if (count >= mcv_threshold)
+ {
+ /* just pointer to the proper place in the list */
+ MCVItem item = mcvlist->items[nitems];
+
+ /* copy values from the _previous_ group (its last item) */
+ memcpy(item->values, items[(i-1)].values, sizeof(Datum) * numattrs);
+ memcpy(item->isnull, items[(i-1)].isnull, sizeof(bool) * numattrs);
+
+ /* and finally the group frequency */
+ item->frequency = (double)count / numrows;
+
+ /* next item */
+ nitems += 1;
+ }
+
+ count = 1;
+ }
+ else /* same group, just increase the number of items */
+ count += 1;
+ }
+
+ /* make sure the loops are consistent */
+ Assert(nitems == mcvlist->nitems);
+
+ /*
+ * Remove the rows matching the MCV list (i.e. keep only rows
+ * that are not represented by the MCV list).
+ *
+ * FIXME This implementation is rather naive, effectively O(N^2).
+ * As the MCV list grows, the check will take longer and
+ * longer. And as the number of sampled rows increases (by
+ * increasing statistics target), it will take longer and
+ * longer. One option is to sort the MCV items first and
+ * then perform a binary search.
+ *
+ * A better option would be keeping the ID of the row in
+ * the sort item, and then just walk through the items and
+ * mark rows to remove (in a bitmap of the same size).
+ * There's not space for that in SortItem at this moment,
+ * but it's trivial to add 'private' pointer, or just
+ * using another structure with extra field (starting with
+ * SortItem, so that the comparators etc. still work).
+ *
+ * Another option is to use the sorted array of items
+ * (because that's how we sorted the source data), and
+ * simply do a bsearch() into it. If we find a matching
+ * item, the row belongs to the MCV list.
+ */
+ if (nitems == ndistinct) /* all rows are covered by MCV items */
+ *numrows_filtered = 0;
+ else /* (nitems < ndistinct) && (nitems > 0) */
+ {
+ int nfiltered = 0;
+ HeapTuple *rows_filtered = (HeapTuple*)palloc0(sizeof(HeapTuple) * numrows);
+
+ /* used for the searches */
+ SortItem item, mcvitem;
+
+ item.values = (Datum*)palloc0(numattrs * sizeof(Datum));
+ item.isnull = (bool*)palloc0(numattrs * sizeof(bool));
+
+ /*
+ * FIXME we don't need to allocate this, we can reference
+ * the MCV item directly ...
+ */
+ mcvitem.values = (Datum*)palloc0(numattrs * sizeof(Datum));
+ mcvitem.isnull = (bool*)palloc0(numattrs * sizeof(bool));
+
+ /* walk through the tuples, compare the values to MCV items */
+ for (i = 0; i < numrows; i++)
+ {
+ bool match = false;
+
+ /* collect the key values from the row */
+ for (j = 0; j < numattrs; j++)
+ item.values[j] = heap_getattr(rows[i], attrs->values[j],
+ stats[j]->tupDesc, &item.isnull[j]);
+
+ /* scan through the MCV list for matches */
+ for (j = 0; j < mcvlist->nitems; j++)
+ {
+ /*
+ * TODO Create a SortItem/MCVItem comparator so that
+ * we don't need to do memcpy() like crazy.
+ */
+ memcpy(mcvitem.values, mcvlist->items[j]->values,
+ numattrs * sizeof(Datum));
+ memcpy(mcvitem.isnull, mcvlist->items[j]->isnull,
+ numattrs * sizeof(bool));
+
+ if (multi_sort_compare(&item, &mcvitem, mss) == 0)
+ {
+ match = true;
+ break;
+ }
+ }
+
+ /* if no match in the MCV list, copy the row into the filtered ones */
+ if (! match)
+ memcpy(&rows_filtered[nfiltered++], &rows[i], sizeof(HeapTuple));
+ }
+
+ /* replace the rows and remember how many rows we kept */
+ memcpy(rows, rows_filtered, sizeof(HeapTuple) * nfiltered);
+ *numrows_filtered = nfiltered;
+
+ /* free all the data used here */
+ pfree(rows_filtered);
+ pfree(item.values);
+ pfree(item.isnull);
+ pfree(mcvitem.values);
+ pfree(mcvitem.isnull);
+ }
+ }
+
+ pfree(values);
+ pfree(items);
+ pfree(isnull);
+
+ return mcvlist;
+}
+
+
+/* fetch the MCV list (as a bytea) from the pg_mv_statistic catalog */
+MCVList
+load_mv_mcvlist(Oid mvoid)
+{
+ bool isnull = false;
+ Datum mcvlist;
+
+#ifdef USE_ASSERT_CHECKING
+ Form_pg_mv_statistic mvstat;
+#endif
+
+ /* fetch the pg_mv_statistic tuple for the given mvoid */
+ HeapTuple htup = SearchSysCache1(MVSTATOID, ObjectIdGetDatum(mvoid));
+
+ if (! HeapTupleIsValid(htup))
+ return NULL;
+
+#ifdef USE_ASSERT_CHECKING
+ mvstat = (Form_pg_mv_statistic) GETSTRUCT(htup);
+ Assert(mvstat->mcv_enabled && mvstat->mcv_built);
+#endif
+
+ mcvlist = SysCacheGetAttr(MVSTATOID, htup,
+ Anum_pg_mv_statistic_stamcv, &isnull);
+
+ Assert(!isnull);
+
+ ReleaseSysCache(htup);
+
+ return deserialize_mv_mcvlist(DatumGetByteaP(mcvlist));
+}
+
+/* print some basic info about the MCV list
+ *
+ * TODO Add info about what part of the table this covers.
+ */
+Datum
+pg_mv_stats_mcvlist_info(PG_FUNCTION_ARGS)
+{
+ bytea *data = PG_GETARG_BYTEA_P(0);
+ char *result;
+
+ MCVList mcvlist = deserialize_mv_mcvlist(data);
+
+ result = palloc0(128);
+ snprintf(result, 128, "nitems=%d", mcvlist->nitems);
+
+ pfree(mcvlist);
+
+ PG_RETURN_TEXT_P(cstring_to_text(result));
+}
+
+/* used to pass context into bsearch() */
+static SortSupport ssup_private = NULL;
+
+static int bsearch_comparator(const void * a, const void * b);
+
+/*
+ * Serialize MCV list into a bytea value. The basic algorithm is simple:
+ *
+ * (1) perform deduplication for each attribute (separately)
+ * (a) collect all (non-NULL) attribute values from all MCV items
+ * (b) sort the data (using 'lt' from VacAttrStats)
+ * (c) remove duplicate values from the array
+ *
+ * (2) serialize the arrays into a bytea value
+ *
+ * (3) process all MCV list items
+ * (a) replace values with indexes into the arrays
+ *
+ * Each attribute has to be processed separately, because we're mixing
+ * different datatypes, and we don't know what equality means for them.
+ * We're also mixing pass-by-value and pass-by-ref types, and so on.
+ *
+ * We'll use uint16 values for the indexes in step (3), as we don't
+ * allow more than 8k MCV items (see list max_mcv_items). We might
+ * increase this to 65k and still fit into uint16.
+ *
+ * We don't really expect compression as high as with histograms,
+ * because we're not doing any bucket splits etc. (which is the source
+ * of high redundancy there), but we need to do it anyway as we need
+ * to serialize varlena values etc. We might invent another way to
+ * serialize MCV lists, but let's keep it consistent.
+ *
+ * FIXME This probably leaks memory, or at least uses it inefficiently
+ * (many small palloc() calls instead of a large one).
+ *
+ * TODO Consider packing boolean flags (NULL) for each item into 'char'
+ * or a longer type (instead of using an array of bool items).
+ */
+bytea *
+serialize_mv_mcvlist(MCVList mcvlist, int2vector *attrs,
+ VacAttrStats **stats)
+{
+ int i, j;
+ int ndims = mcvlist->ndimensions;
+ int itemsize = ITEM_SIZE(ndims);
+
+ Size total_length = 0;
+
+ char *item = palloc0(itemsize);
+
+ /* serialized items (indexes into arrays, etc.) */
+ bytea *output;
+ char *data = NULL;
+
+ /* values per dimension (and number of non-NULL values) */
+ Datum **values = (Datum**)palloc0(sizeof(Datum*) * ndims);
+ int *counts = (int*)palloc0(sizeof(int) * ndims);
+
+ /* info about dimensions (for deserialize) */
+ DimensionInfo * info
+ = (DimensionInfo *)palloc0(sizeof(DimensionInfo)*ndims);
+
+ /* sort support data */
+ SortSupport ssup = (SortSupport)palloc0(sizeof(SortSupportData)*ndims);
+
+ /* collect and deduplicate values for each dimension */
+ for (i = 0; i < ndims; i++)
+ {
+ int count;
+ StdAnalyzeData *tmp = (StdAnalyzeData *)stats[i]->extra_data;
+
+ /* keep important info about the data type */
+ info[i].typlen = stats[i]->attrtype->typlen;
+ info[i].typbyval = stats[i]->attrtype->typbyval;
+
+ /* allocate space for all values, including NULLs (won't use them) */
+ values[i] = (Datum*)palloc0(sizeof(Datum) * mcvlist->nitems);
+
+ for (j = 0; j < mcvlist->nitems; j++)
+ {
+ if (! mcvlist->items[j]->isnull[i]) /* skip NULL values */
+ {
+ values[i][counts[i]] = mcvlist->items[j]->values[i];
+ counts[i] += 1;
+ }
+ }
+
+ /* there are just NULL values in this dimension */
+ if (counts[i] == 0)
+ continue;
+
+ /* sort and deduplicate */
+ ssup[i].ssup_cxt = CurrentMemoryContext;
+ ssup[i].ssup_collation = DEFAULT_COLLATION_OID;
+ ssup[i].ssup_nulls_first = false;
+
+ PrepareSortSupportFromOrderingOp(tmp->ltopr, &ssup[i]);
+
+ qsort_arg(values[i], counts[i], sizeof(Datum),
+ compare_scalars_simple, &ssup[i]);
+
+ /*
+ * Walk through the array and eliminate duplicate values, but
+ * keep the ordering (so that we can do bsearch later). We know
+ * there's at least 1 item, so we can skip the first element.
+ */
+ count = 1; /* number of deduplicated items */
+ for (j = 1; j < counts[i]; j++)
+ {
+ /* if it's different from the previous value, we need to keep it */
+ if (compare_datums_simple(values[i][j-1], values[i][j], &ssup[i]) != 0)
+ {
+ /* XXX: not needed if (count == j) */
+ values[i][count] = values[i][j];
+ count += 1;
+ }
+ }
+
+ /* do not exceed UINT16_MAX */
+ Assert(count <= UINT16_MAX);
+
+ /* keep info about the deduplicated count */
+ info[i].nvalues = count;
+
+ /* compute size of the serialized data */
+ if (info[i].typbyval || (info[i].typlen > 0))
+ /* passed by value, or by reference with a fixed length */
+ info[i].nbytes = info[i].nvalues * info[i].typlen;
+ else if (info[i].typlen == -1)
+ /* varlena, so just use VARSIZE_ANY */
+ for (j = 0; j < info[i].nvalues; j++)
+ info[i].nbytes += VARSIZE_ANY(values[i][j]);
+ else if (info[i].typlen == -2)
+ /* cstring, so simply strlen */
+ for (j = 0; j < info[i].nvalues; j++)
+ info[i].nbytes += strlen(DatumGetPointer(values[i][j]));
+ else
+ elog(ERROR, "unknown data type typbyval=%d typlen=%d",
+ info[i].typbyval, info[i].typlen);
+ }
+
+ /*
+ * Now we finally know how much space we'll need for the serialized
+ * MCV list, as it contains these fields:
+ *
+ * - length (4B) for varlena
+ * - magic (4B)
+ * - type (4B)
+ * - ndimensions (4B)
+ * - nitems (4B)
+ * - info (ndim * sizeof(DimensionInfo)
+ * - arrays of values for each dimension
+ * - serialized items (nitems * itemsize)
+ *
+ * So the 'header' size is 20B + ndim * sizeof(DimensionInfo) and
+ * then we'll place the data.
+ */
+ total_length = (sizeof(int32) + offsetof(MCVListData, items)
+ + ndims * sizeof(DimensionInfo)
+ + mcvlist->nitems * itemsize);
+
+ for (i = 0; i < ndims; i++)
+ total_length += info[i].nbytes;
+
+ /* enforce arbitrary limit of 1MB */
+ if (total_length > 1024 * 1024)
+ elog(ERROR, "serialized MCV exceeds 1MB (%ld)", total_length);
+
+ /* allocate space for the serialized MCV list, set header fields */
+ output = (bytea*)palloc0(total_length);
+ SET_VARSIZE(output, total_length);
+
+ /* we'll use 'data' to keep track of where to write next */
+ data = VARDATA(output);
+
+ memcpy(data, mcvlist, offsetof(MCVListData, items));
+ data += offsetof(MCVListData, items);
+
+ memcpy(data, info, sizeof(DimensionInfo) * ndims);
+ data += sizeof(DimensionInfo) * ndims;
+
+ /* value array for each dimension */
+ for (i = 0; i < ndims; i++)
+ {
+#ifdef USE_ASSERT_CHECKING
+ char *tmp = data;
+#endif
+ for (j = 0; j < info[i].nvalues; j++)
+ {
+ if (info[i].typbyval)
+ {
+ /* passed by value / Datum */
+ memcpy(data, &values[i][j], info[i].typlen);
+ data += info[i].typlen;
+ }
+ else if (info[i].typlen > 0)
+ {
+ /* passed by reference, but fixed length (name, tid, ...) */
+ memcpy(data, &values[i][j], info[i].typlen);
+ data += info[i].typlen;
+ }
+ else if (info[i].typlen == -1)
+ {
+ /* varlena */
+ memcpy(data, DatumGetPointer(values[i][j]),
+ VARSIZE_ANY(values[i][j]));
+ data += VARSIZE_ANY(values[i][j]);
+ }
+ else if (info[i].typlen == -2)
+ {
+ /* cstring (don't forget the \0 terminator!) */
+ memcpy(data, DatumGetPointer(values[i][j]),
+ strlen(DatumGetPointer(values[i][j])) + 1);
+ data += strlen(DatumGetPointer(values[i][j])) + 1;
+ }
+ }
+ Assert((data - tmp) == info[i].nbytes);
+ }
+
+ /* and finally, the MCV items */
+ for (i = 0; i < mcvlist->nitems; i++)
+ {
+ /* don't write beyond the allocated space */
+ Assert(data <= (char*)output + total_length - itemsize);
+
+ /* reset the values for each item */
+ memset(item, 0, itemsize);
+
+ for (j = 0; j < ndims; j++)
+ {
+ /* do the lookup only for non-NULL values */
+ if (! mcvlist->items[i]->isnull[j])
+ {
+ Datum * v = NULL;
+ ssup_private = &ssup[j];
+
+ v = (Datum*)bsearch(&mcvlist->items[i]->values[j],
+ values[j], info[j].nvalues, sizeof(Datum),
+ bsearch_comparator);
+
+ if (v == NULL)
+ elog(ERROR, "value for dim %d not found in array", j);
+
+ /* compute index within the array */
+ ITEM_INDEXES(item)[j] = (v - values[j]);
+
+ /* check the index is within expected bounds */
+ Assert(ITEM_INDEXES(item)[j] >= 0);
+ Assert(ITEM_INDEXES(item)[j] < info[j].nvalues);
+ }
+ }
+
+ /* copy NULL and frequency flags into the item */
+ memcpy(ITEM_NULLS(item, ndims),
+ mcvlist->items[i]->isnull, sizeof(bool) * ndims);
+ memcpy(ITEM_FREQUENCY(item, ndims),
+ &mcvlist->items[i]->frequency, sizeof(double));
+
+ /* copy the item into the array */
+ memcpy(data, item, itemsize);
+
+ data += itemsize;
+ }
+
+ /* at this point we expect to match the total_length exactly */
+ Assert((data - (char*)output) == total_length);
+
+ return output;
+}
+
+/*
+ * Inverse to serialize_mv_mcvlist() - see the comment there.
+ *
+ * We'll do full deserialization, because we don't really expect high
+ * duplication of values, so caching would not be as efficient as with
+ * histograms.
+ */
+MCVList
+deserialize_mv_mcvlist(bytea * data)
+{
+ int i, j;
+ Size expected_size;
+ MCVList mcvlist;
+ char *tmp;
+
+ int ndims, nitems, itemsize;
+ DimensionInfo *info = NULL;
+
+ uint16 *indexes = NULL;
+ Datum **values = NULL;
+
+ /* local allocation buffer (used only for deserialization) */
+ int bufflen;
+ char *buff;
+ char *ptr;
+
+ /* buffer used for the result */
+ int rbufflen;
+ char *rbuff;
+ char *rptr;
+
+ if (data == NULL)
+ return NULL;
+
+ if (VARSIZE_ANY_EXHDR(data) < offsetof(MCVListData,items))
+ elog(ERROR, "invalid MCV Size %ld (expected at least %ld)",
+ VARSIZE_ANY_EXHDR(data), offsetof(MCVListData,items));
+
+ /* read the MCV list header */
+ mcvlist = (MCVList)palloc0(sizeof(MCVListData));
+
+ /* initialize pointer to the data part (skip the varlena header) */
+ tmp = VARDATA(data);
+
+ /* get the header and perform basic sanity checks */
+ memcpy(mcvlist, tmp, offsetof(MCVListData,items));
+ tmp += offsetof(MCVListData,items);
+
+ if (mcvlist->magic != MVSTAT_MCV_MAGIC)
+ elog(ERROR, "invalid MCV magic %d (expected %dd)",
+ mcvlist->magic, MVSTAT_MCV_MAGIC);
+
+ if (mcvlist->type != MVSTAT_MCV_TYPE_BASIC)
+ elog(ERROR, "invalid MCV type %d (expected %dd)",
+ mcvlist->type, MVSTAT_MCV_TYPE_BASIC);
+
+ nitems = mcvlist->nitems;
+ ndims = mcvlist->ndimensions;
+ itemsize = ITEM_SIZE(ndims);
+
+ Assert(nitems > 0);
+ Assert((ndims >= 2) && (ndims <= MVSTATS_MAX_DIMENSIONS));
+
+ /*
+ * Compute the expected size for those parameters (it's incomplete
+ * at this point, as we have yet to add the sizes of the value
+ * arrays from the DimensionInfo records).
+ */
+ expected_size = offsetof(MCVListData,items) +
+ ndims * sizeof(DimensionInfo) +
+ (nitems * itemsize);
+
+ /* check that we have at least the DimensionInfo records */
+ if (VARSIZE_ANY_EXHDR(data) < expected_size)
+ elog(ERROR, "invalid MCV Size %ld (expected %ld)",
+ VARSIZE_ANY_EXHDR(data), expected_size);
+
+ info = (DimensionInfo*)(tmp);
+ tmp += ndims * sizeof(DimensionInfo);
+
+ /* account for the value arrays */
+ for (i = 0; i < ndims; i++)
+ expected_size += info[i].nbytes;
+
+ if (VARSIZE_ANY_EXHDR(data) != expected_size)
+ elog(ERROR, "invalid MCV Size %ld (expected %ld)",
+ VARSIZE_ANY_EXHDR(data), expected_size);
+
+ /* looks OK - not corrupted or something */
+
+ /*
+ * We'll allocate one large chunk of memory for the intermediate
+ * data, needed only for deserializing the MCV list, and we'll use
+ * a local dense allocation to minimize the palloc overhead.
+ *
+ * Let's see how much space we'll actually need, and also include
+ * space for the array with pointers.
+ */
+ bufflen = sizeof(Datum*) * ndims; /* space for pointers */
+
+ for (i = 0; i < ndims; i++)
+ /* for full-size byval types, we reuse the serialized value */
+ if (! (info[i].typbyval && info[i].typlen == sizeof(Datum)))
+ bufflen += (sizeof(Datum) * info[i].nvalues);
+
+ buff = palloc0(bufflen);
+ ptr = buff;
+
+ values = (Datum**)buff;
+ ptr += (sizeof(Datum*) * ndims);
+
+ /*
+ * FIXME This uses pointers to the original data array (the types
+ * not passed by value), so when someone frees the memory,
+ * e.g. by doing something like this:
+ *
+ * bytea * data = ... fetch the data from catalog ...
+ * MCVList mcvlist = deserialize_mcv_list(data);
+ * pfree(data);
+ *
+ * then 'mcvlist' references the freed memory. This needs to
+ * copy the pieces.
+ */
+ for (i = 0; i < ndims; i++)
+ {
+ if (info[i].typbyval)
+ {
+ /* passed by value / Datum - simply reuse the array */
+ if (info[i].typlen == sizeof(Datum))
+ {
+ values[i] = (Datum*)tmp;
+ tmp += info[i].nbytes;
+ }
+ else
+ {
+ values[i] = (Datum*)ptr;
+ ptr += (sizeof(Datum) * info[i].nvalues);
+
+ for (j = 0; j < info[i].nvalues; j++)
+ {
+ /* copy the value into the local array */
+ memcpy(&values[i][j], tmp, info[i].typlen);
+ tmp += info[i].typlen;
+ }
+ }
+ }
+ else
+ {
+ /* all the varlena data need a chunk from the buffer */
+ values[i] = (Datum*)ptr;
+ ptr += (sizeof(Datum) * info[i].nvalues);
+
+ /* passed by reference, but fixed length (name, tid, ...) */
+ if (info[i].typlen > 0)
+ {
+ for (j = 0; j < info[i].nvalues; j++)
+ {
+ /* just point into the array */
+ values[i][j] = PointerGetDatum(tmp);
+ tmp += info[i].typlen;
+ }
+ }
+ else if (info[i].typlen == -1)
+ {
+ /* varlena */
+ for (j = 0; j < info[i].nvalues; j++)
+ {
+ /* just point into the array */
+ values[i][j] = PointerGetDatum(tmp);
+ tmp += VARSIZE_ANY(tmp);
+ }
+ }
+ else if (info[i].typlen == -2)
+ {
+ /* cstring */
+ for (j = 0; j < info[i].nvalues; j++)
+ {
+ /* just point into the array */
+ values[i][j] = PointerGetDatum(tmp);
+ tmp += (strlen(tmp) + 1); /* don't forget the \0 */
+ }
+ }
+ }
+ }
+
+ /* we should exhaust the buffer exactly */
+ Assert((ptr - buff) == bufflen);
+
+ /* allocate space for the MCV items in a single piece */
+ rbufflen = (sizeof(MCVItem) + sizeof(MCVItemData) +
+ sizeof(Datum)*ndims + sizeof(bool)*ndims) * nitems;
+
+ rbuff = palloc(rbufflen);
+ rptr = rbuff;
+
+ mcvlist->items = (MCVItem*)rbuff;
+ rptr += (sizeof(MCVItem) * nitems);
+
+ for (i = 0; i < nitems; i++)
+ {
+ MCVItem item = (MCVItem)rptr;
+ rptr += (sizeof(MCVItemData));
+
+ item->values = (Datum*)rptr;
+ rptr += (sizeof(Datum)*ndims);
+
+ item->isnull = (bool*)rptr;
+ rptr += (sizeof(bool) *ndims);
+
+ /* just point to the right place */
+ indexes = ITEM_INDEXES(tmp);
+
+ memcpy(item->isnull, ITEM_NULLS(tmp, ndims), sizeof(bool) * ndims);
+ memcpy(&item->frequency, ITEM_FREQUENCY(tmp, ndims), sizeof(double));
+
+#ifdef USE_ASSERT_CHECKING
+ for (j = 0; j < ndims; j++)
+ Assert(indexes[j] <= UINT16_MAX);
+#endif
+
+ /* translate the values */
+ for (j = 0; j < ndims; j++)
+ if (! item->isnull[j])
+ item->values[j] = values[j][indexes[j]];
+
+ mcvlist->items[i] = item;
+
+ tmp += ITEM_SIZE(ndims);
+
+ Assert(tmp <= (char*)data + VARSIZE_ANY(data));
+ }
+
+ /* check that we processed all the data */
+ Assert(tmp == (char*)data + VARSIZE_ANY(data));
+
+ /* release the temporary buffer */
+ pfree(buff);
+
+ return mcvlist;
+}
+
+/*
+ * We need to pass the SortSupport to the comparator, but bsearch()
+ * has no 'context' parameter, so we use a global variable (ugly).
+ */
+static int
+bsearch_comparator(const void * a, const void * b)
+{
+ Assert(ssup_private != NULL);
+ return compare_scalars_simple(a, b, (void*)ssup_private);
+}
+
+/*
+ * SRF with details about the items of a MCV list:
+ *
+ * - item ID (0...nitems-1)
+ * - values (string array)
+ * - nulls only (boolean array)
+ * - frequency (double precision)
+ *
+ * The input is the OID of the statistics, and no rows are returned
+ * if the statistics contains no MCV list.
+ */
+PG_FUNCTION_INFO_V1(pg_mv_mcv_items);
+
+Datum
+pg_mv_mcv_items(PG_FUNCTION_ARGS)
+{
+ FuncCallContext *funcctx;
+ int call_cntr;
+ int max_calls;
+ TupleDesc tupdesc;
+ AttInMetadata *attinmeta;
+
+ /* stuff done only on the first call of the function */
+ if (SRF_IS_FIRSTCALL())
+ {
+ MemoryContext oldcontext;
+ MCVList mcvlist;
+
+ /* create a function context for cross-call persistence */
+ funcctx = SRF_FIRSTCALL_INIT();
+
+ /* switch to memory context appropriate for multiple function calls */
+ oldcontext = MemoryContextSwitchTo(funcctx->multi_call_memory_ctx);
+
+ mcvlist = load_mv_mcvlist(PG_GETARG_OID(0));
+
+ funcctx->user_fctx = mcvlist;
+
+ /* total number of tuples to be returned */
+ funcctx->max_calls = 0;
+ if (funcctx->user_fctx != NULL)
+ funcctx->max_calls = mcvlist->nitems;
+
+ /* Build a tuple descriptor for our result type */
+ if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("function returning record called in context "
+ "that cannot accept type record")));
+
+ /*
+ * generate attribute metadata needed later to produce tuples
+ * from raw C strings
+ */
+ attinmeta = TupleDescGetAttInMetadata(tupdesc);
+ funcctx->attinmeta = attinmeta;
+
+ MemoryContextSwitchTo(oldcontext);
+ }
+
+ /* stuff done on every call of the function */
+ funcctx = SRF_PERCALL_SETUP();
+
+ call_cntr = funcctx->call_cntr;
+ max_calls = funcctx->max_calls;
+ attinmeta = funcctx->attinmeta;
+
+ if (call_cntr < max_calls) /* do when there is more left to send */
+ {
+ char **values;
+ HeapTuple tuple;
+ Datum result;
+ int2vector *stakeys;
+ Oid relid;
+
+ char *buff = palloc0(1024);
+ char *format;
+
+ int i;
+
+ Oid *outfuncs;
+ FmgrInfo *fmgrinfo;
+
+ MCVList mcvlist;
+ MCVItem item;
+
+ mcvlist = (MCVList)funcctx->user_fctx;
+
+ Assert(call_cntr < mcvlist->nitems);
+
+ item = mcvlist->items[call_cntr];
+
+ stakeys = find_mv_attnums(PG_GETARG_OID(0), &relid);
+
+ /*
+ * Prepare a values array for building the returned tuple.
+ * This should be an array of C strings which will
+ * be processed later by the type input functions.
+ */
+ values = (char **) palloc(4 * sizeof(char *));
+
+ values[0] = (char *) palloc(64 * sizeof(char));
+
+ /* arrays */
+ values[1] = (char *) palloc0(1024 * sizeof(char));
+ values[2] = (char *) palloc0(1024 * sizeof(char));
+
+ /* frequency */
+ values[3] = (char *) palloc(64 * sizeof(char));
+
+ outfuncs = (Oid*)palloc0(sizeof(Oid) * mcvlist->ndimensions);
+ fmgrinfo = (FmgrInfo*)palloc0(sizeof(FmgrInfo) * mcvlist->ndimensions);
+
+ for (i = 0; i < mcvlist->ndimensions; i++)
+ {
+ bool isvarlena;
+
+ getTypeOutputInfo(get_atttype(relid, stakeys->values[i]),
+ &outfuncs[i], &isvarlena);
+
+ fmgr_info(outfuncs[i], &fmgrinfo[i]);
+ }
+
+ snprintf(values[0], 64, "%d", call_cntr); /* item ID */
+
+ for (i = 0; i < mcvlist->ndimensions; i++)
+ {
+ Datum val, valout;
+
+ format = "%s, %s";
+ if (i == 0)
+ format = "{%s%s";
+ else if (i == mcvlist->ndimensions-1)
+ format = "%s, %s}";
+
+ val = item->values[i];
+ valout = FunctionCall1(&fmgrinfo[i], val);
+
+ snprintf(buff, 1024, format, values[1], DatumGetPointer(valout));
+ strncpy(values[1], buff, 1023);
+ buff[0] = '\0';
+
+ snprintf(buff, 1024, format, values[2], item->isnull[i] ? "t" : "f");
+ strncpy(values[2], buff, 1023);
+ buff[0] = '\0';
+ }
+
+ snprintf(values[3], 64, "%f", item->frequency); /* frequency */
+
+ /* build a tuple */
+ tuple = BuildTupleFromCStrings(attinmeta, values);
+
+ /* make the tuple into a datum */
+ result = HeapTupleGetDatum(tuple);
+
+ /* clean up (this is not really necessary) */
+ pfree(values[0]);
+ pfree(values[1]);
+ pfree(values[2]);
+ pfree(values[3]);
+
+ pfree(values);
+
+ SRF_RETURN_NEXT(funcctx, result);
+ }
+ else /* do when there is no more left */
+ {
+ SRF_RETURN_DONE(funcctx);
+ }
+}
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 4f106c3..6339631 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -2109,8 +2109,9 @@ describeOneTableDetails(const char *schemaname,
{
printfPQExpBuffer(&buf,
"SELECT oid, stanamespace::regnamespace AS nsp, staname, stakeys,\n"
- " deps_enabled,\n"
- " deps_built,\n"
+ " deps_enabled, mcv_enabled,\n"
+ " deps_built, mcv_built,\n"
+ " mcv_max_items,\n"
" (SELECT string_agg(attname::text,', ')\n"
" FROM ((SELECT unnest(stakeys) AS attnum) s\n"
" JOIN pg_attribute a ON (starelid = a.attrelid and a.attnum = s.attnum))) AS attnums\n"
@@ -2128,6 +2129,8 @@ describeOneTableDetails(const char *schemaname,
printTableAddFooter(&cont, _("Statistics:"));
for (i = 0; i < tuples; i++)
{
+ bool first = true;
+
printfPQExpBuffer(&buf, " ");
/* statistics name (qualified with namespace) */
@@ -2137,10 +2140,22 @@ describeOneTableDetails(const char *schemaname,
/* options */
if (!strcmp(PQgetvalue(result, i, 4), "t"))
- appendPQExpBuffer(&buf, "(dependencies)");
+ {
+ appendPQExpBuffer(&buf, "(dependencies");
+ first = false;
+ }
+
+ if (!strcmp(PQgetvalue(result, i, 5), "t"))
+ {
+ if (! first)
+ appendPQExpBuffer(&buf, ", mcv");
+ else
+ appendPQExpBuffer(&buf, "(mcv");
+ first = false;
+ }
- appendPQExpBuffer(&buf, " ON (%s)",
- PQgetvalue(result, i, 6));
+ appendPQExpBuffer(&buf, ") ON (%s)",
+ PQgetvalue(result, i, 9));
printTableAddFooter(&cont, buf.data);
}
diff --git a/src/include/catalog/pg_mv_statistic.h b/src/include/catalog/pg_mv_statistic.h
index a568a07..fd7107d 100644
--- a/src/include/catalog/pg_mv_statistic.h
+++ b/src/include/catalog/pg_mv_statistic.h
@@ -37,15 +37,21 @@ CATALOG(pg_mv_statistic,3381)
/* statistics requested to build */
bool deps_enabled; /* analyze dependencies? */
+ bool mcv_enabled; /* build MCV list? */
+
+ /* MCV size */
+ int32 mcv_max_items; /* max MCV items */
/* statistics that are available (if requested) */
bool deps_built; /* dependencies were built */
+ bool mcv_built; /* MCV list was built */
/* variable-length fields start here, but we allow direct access to stakeys */
int2vector stakeys; /* array of column keys */
#ifdef CATALOG_VARLEN
bytea stadeps; /* dependencies (serialized) */
+ bytea stamcv; /* MCV list (serialized) */
#endif
} FormData_pg_mv_statistic;
@@ -61,13 +67,17 @@ typedef FormData_pg_mv_statistic *Form_pg_mv_statistic;
* compiler constants for pg_mv_statistic
* ----------------
*/
-#define Natts_pg_mv_statistic 7
+#define Natts_pg_mv_statistic 11
#define Anum_pg_mv_statistic_starelid 1
#define Anum_pg_mv_statistic_staname 2
#define Anum_pg_mv_statistic_stanamespace 3
#define Anum_pg_mv_statistic_deps_enabled 4
-#define Anum_pg_mv_statistic_deps_built 5
-#define Anum_pg_mv_statistic_stakeys 6
-#define Anum_pg_mv_statistic_stadeps 7
+#define Anum_pg_mv_statistic_mcv_enabled 5
+#define Anum_pg_mv_statistic_mcv_max_items 6
+#define Anum_pg_mv_statistic_deps_built 7
+#define Anum_pg_mv_statistic_mcv_built 8
+#define Anum_pg_mv_statistic_stakeys 9
+#define Anum_pg_mv_statistic_stadeps 10
+#define Anum_pg_mv_statistic_stamcv 11
#endif /* PG_MV_STATISTIC_H */
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
index 20d565c..66b4bcd 100644
--- a/src/include/catalog/pg_proc.h
+++ b/src/include/catalog/pg_proc.h
@@ -2670,6 +2670,10 @@ DATA(insert OID = 3998 ( pg_mv_stats_dependencies_info PGNSP PGUID 12 1 0 0
DESCR("multivariate stats: functional dependencies info");
DATA(insert OID = 3999 ( pg_mv_stats_dependencies_show PGNSP PGUID 12 1 0 0 0 f f f f t f i s 1 0 25 "17" _null_ _null_ _null_ _null_ _null_ pg_mv_stats_dependencies_show _null_ _null_ _null_ ));
DESCR("multivariate stats: functional dependencies show");
+DATA(insert OID = 3376 ( pg_mv_stats_mcvlist_info PGNSP PGUID 12 1 0 0 0 f f f f t f i s 1 0 25 "17" _null_ _null_ _null_ _null_ _null_ pg_mv_stats_mcvlist_info _null_ _null_ _null_ ));
+DESCR("multi-variate statistics: MCV list info");
+DATA(insert OID = 3373 ( pg_mv_mcv_items PGNSP PGUID 12 1 1000 0 0 f f f f t t i s 1 0 2249 "26" "{26,23,1009,1000,701}" "{i,o,o,o,o}" "{oid,index,values,nulls,frequency}" _null_ _null_ pg_mv_mcv_items _null_ _null_ _null_ ));
+DESCR("details about MCV list items");
DATA(insert OID = 1928 ( pg_stat_get_numscans PGNSP PGUID 12 1 0 0 0 f f f f t f s r 1 0 20 "26" _null_ _null_ _null_ _null_ _null_ pg_stat_get_numscans _null_ _null_ _null_ ));
DESCR("statistics: number of scans done for table/index");
diff --git a/src/include/nodes/relation.h b/src/include/nodes/relation.h
index de86d01..5ae6b3c 100644
--- a/src/include/nodes/relation.h
+++ b/src/include/nodes/relation.h
@@ -619,9 +619,11 @@ typedef struct MVStatisticInfo
/* enabled statistics */
bool deps_enabled; /* functional dependencies enabled */
+ bool mcv_enabled; /* MCV list enabled */
/* built/available statistics */
bool deps_built; /* functional dependencies built */
+ bool mcv_built; /* MCV list built */
/* columns in the statistics (attnums) */
int2vector *stakeys; /* attnums of the columns covered */
diff --git a/src/include/utils/mvstats.h b/src/include/utils/mvstats.h
index cc43a79..4535db7 100644
--- a/src/include/utils/mvstats.h
+++ b/src/include/utils/mvstats.h
@@ -51,30 +51,89 @@ typedef MVDependenciesData* MVDependencies;
#define MVSTAT_DEPS_TYPE_BASIC 1 /* basic dependencies type */
/*
+ * Multivariate MCV (most-common value) lists
+ *
+ * A straightforward extension of MCV items - i.e. a list (array) of
+ * combinations of attribute values, together with a frequency and
+ * null flags.
+ */
+typedef struct MCVItemData {
+ double frequency; /* frequency of this combination */
+ bool *isnull; /* flags of NULL values (up to 32 columns) */
+ Datum *values; /* variable-length (ndimensions) */
+} MCVItemData;
+
+typedef MCVItemData *MCVItem;
+
+/* multivariate MCV list - essentially an array of MCV items */
+typedef struct MCVListData {
+ uint32 magic; /* magic constant marker */
+ uint32 type; /* type of MCV list (BASIC) */
+ uint32 ndimensions; /* number of dimensions */
+ uint32 nitems; /* number of MCV items in the array */
+ MCVItem *items; /* array of MCV items */
+} MCVListData;
+
+typedef MCVListData *MCVList;
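+
+/*
+ * A minimal sketch of inspecting a deserialized MCV list (assuming
+ * 'data' holds the bytea fetched from the stamcv catalog column):
+ *
+ * MCVList mcvlist = deserialize_mv_mcvlist(data);
+ * for (i = 0; i < mcvlist->nitems; i++)
+ * {
+ * MCVItem item = mcvlist->items[i];
+ * ... use item->frequency, item->values[j], item->isnull[j] ...
+ * }
+ */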
+
+/* used to flag stats serialized to bytea */
+#define MVSTAT_MCV_MAGIC 0xE1A651C2 /* marks serialized bytea */
+#define MVSTAT_MCV_TYPE_BASIC 1 /* basic MCV list type */
+
+/*
+ * Limits used for mcv_max_items option, i.e. we're always guaranteed
+ * to have space for at least MVSTAT_MCVLIST_MIN_ITEMS, and we cannot
+ * have more than MVSTAT_MCVLIST_MAX_ITEMS items.
+ *
+ * This is just a boundary for the 'max' threshold - the actual list
+ * may of course contain fewer items than MVSTAT_MCVLIST_MIN_ITEMS.
+ */
+#define MVSTAT_MCVLIST_MIN_ITEMS 128 /* min items in MCV list */
+#define MVSTAT_MCVLIST_MAX_ITEMS 8192 /* max items in MCV list */
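+
+/*
+ * A usage sketch (with a hypothetical table 't'), as exercised by the
+ * regression tests:
+ *
+ * CREATE STATISTICS s1 ON t (a, b, c) WITH (mcv, max_mcv_items = 1024);
+ */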
+
+/*
* TODO Maybe fetching the histogram/MCV list separately is inefficient?
* Consider adding a single `fetch_stats` method, fetching all
* stats specified using flags (or something like that).
*/
MVDependencies load_mv_dependencies(Oid mvoid);
+MCVList load_mv_mcvlist(Oid mvoid);
bytea * serialize_mv_dependencies(MVDependencies dependencies);
+bytea * serialize_mv_mcvlist(MCVList mcvlist, int2vector *attrs,
+ VacAttrStats **stats);
/* deserialization of stats (serialization is private to analyze) */
MVDependencies deserialize_mv_dependencies(bytea * data);
+MCVList deserialize_mv_mcvlist(bytea * data);
+
+/*
+ * Returns index of the attribute number within the vector (i.e. a
+ * dimension within the stats).
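+ *
+ * For example, with stakeys = (2, 5, 7), mv_get_index(5, stakeys) is
+ * expected to return 1 (the second dimension, 0-based).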
+ */
+int mv_get_index(AttrNumber varattno, int2vector * stakeys);
+
+int2vector* find_mv_attnums(Oid mvoid, Oid *relid);
/* FIXME this probably belongs somewhere else (not to operations stats) */
extern Datum pg_mv_stats_dependencies_info(PG_FUNCTION_ARGS);
extern Datum pg_mv_stats_dependencies_show(PG_FUNCTION_ARGS);
+extern Datum pg_mv_stats_mcvlist_info(PG_FUNCTION_ARGS);
+extern Datum pg_mv_mcv_items(PG_FUNCTION_ARGS);
MVDependencies
-build_mv_dependencies(int numrows, HeapTuple *rows,
- int2vector *attrs,
- VacAttrStats **stats);
+build_mv_dependencies(int numrows, HeapTuple *rows, int2vector *attrs,
+ VacAttrStats **stats);
+
+MCVList
+build_mv_mcvlist(int numrows, HeapTuple *rows, int2vector *attrs,
+ VacAttrStats **stats, int *numrows_filtered);
void build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
- int natts, VacAttrStats **vacattrstats);
+ int natts, VacAttrStats **vacattrstats);
-void update_mv_stats(Oid relid, MVDependencies dependencies, int2vector *attrs);
+void update_mv_stats(Oid relid, MVDependencies dependencies, MCVList mcvlist,
+ int2vector *attrs, VacAttrStats **stats);
#endif
diff --git a/src/test/regress/expected/mv_mcv.out b/src/test/regress/expected/mv_mcv.out
new file mode 100644
index 0000000..56748e3
--- /dev/null
+++ b/src/test/regress/expected/mv_mcv.out
@@ -0,0 +1,207 @@
+-- data type passed by value
+CREATE TABLE mcv_list (
+ a INT,
+ b INT,
+ c INT
+);
+-- unknown column
+CREATE STATISTICS s1 ON mcv_list (unknown_column) WITH (mcv);
+ERROR: column "unknown_column" referenced in statistics does not exist
+-- single column
+CREATE STATISTICS s1 ON mcv_list (a) WITH (mcv);
+ERROR: multivariate stats require 2 or more columns
+-- single column, duplicated
+CREATE STATISTICS s1 ON mcv_list (a, a) WITH (mcv);
+ERROR: duplicate column name in statistics definition
+-- two columns, one duplicated
+CREATE STATISTICS s1 ON mcv_list (a, a, b) WITH (mcv);
+ERROR: duplicate column name in statistics definition
+-- unknown option
+CREATE STATISTICS s1 ON mcv_list (a, b, c) WITH (unknown_option);
+ERROR: unrecognized STATISTICS option "unknown_option"
+-- missing MCV statistics
+CREATE STATISTICS s1 ON mcv_list (a, b, c) WITH (dependencies, max_mcv_items=200);
+ERROR: option 'mcv' is required by other options(s)
+-- invalid mcv_max_items value / too low
+CREATE STATISTICS s1 ON mcv_list (a, b, c) WITH (mcv, max_mcv_items=10);
+ERROR: max number of MCV items must be at least 128
+-- invalid mcv_max_items value / too high
+CREATE STATISTICS s1 ON mcv_list (a, b, c) WITH (mcv, max_mcv_items=10000);
+ERROR: max number of MCV items is 8192
+-- correct command
+CREATE STATISTICS s1 ON mcv_list (a, b, c) WITH (mcv);
+-- random data
+INSERT INTO mcv_list
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+ANALYZE mcv_list;
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+ mcv_enabled | mcv_built | pg_mv_stats_mcvlist_info
+-------------+-----------+--------------------------
+ t | f |
+(1 row)
+
+TRUNCATE mcv_list;
+-- a => b, a => c, b => c
+INSERT INTO mcv_list
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE mcv_list;
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+ mcv_enabled | mcv_built | pg_mv_stats_mcvlist_info
+-------------+-----------+--------------------------
+ t | t | nitems=1000
+(1 row)
+
+TRUNCATE mcv_list;
+-- a => b, a => c
+INSERT INTO mcv_list
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE mcv_list;
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+ mcv_enabled | mcv_built | pg_mv_stats_mcvlist_info
+-------------+-----------+--------------------------
+ t | t | nitems=1000
+(1 row)
+
+TRUNCATE mcv_list;
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO mcv_list
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX mcv_idx ON mcv_list (a, b);
+ANALYZE mcv_list;
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+ mcv_enabled | mcv_built | pg_mv_stats_mcvlist_info
+-------------+-----------+--------------------------
+ t | t | nitems=100
+(1 row)
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mcv_list WHERE a = 10 AND b = 5;
+ QUERY PLAN
+--------------------------------------------
+ Bitmap Heap Scan on mcv_list
+ Recheck Cond: ((a = 10) AND (b = 5))
+ -> Bitmap Index Scan on mcv_idx
+ Index Cond: ((a = 10) AND (b = 5))
+(4 rows)
+
+DROP TABLE mcv_list;
+-- varlena type (text)
+CREATE TABLE mcv_list (
+ a TEXT,
+ b TEXT,
+ c TEXT
+);
+CREATE STATISTICS s2 ON mcv_list (a, b, c) WITH (mcv);
+-- random data
+INSERT INTO mcv_list
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+ANALYZE mcv_list;
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+ mcv_enabled | mcv_built | pg_mv_stats_mcvlist_info
+-------------+-----------+--------------------------
+ t | f |
+(1 row)
+
+TRUNCATE mcv_list;
+-- a => b, a => c, b => c
+INSERT INTO mcv_list
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE mcv_list;
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+ mcv_enabled | mcv_built | pg_mv_stats_mcvlist_info
+-------------+-----------+--------------------------
+ t | t | nitems=1000
+(1 row)
+
+TRUNCATE mcv_list;
+-- a => b, a => c
+INSERT INTO mcv_list
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE mcv_list;
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+ mcv_enabled | mcv_built | pg_mv_stats_mcvlist_info
+-------------+-----------+--------------------------
+ t | t | nitems=1000
+(1 row)
+
+TRUNCATE mcv_list;
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO mcv_list
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX mcv_idx ON mcv_list (a, b);
+ANALYZE mcv_list;
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+ mcv_enabled | mcv_built | pg_mv_stats_mcvlist_info
+-------------+-----------+--------------------------
+ t | t | nitems=100
+(1 row)
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mcv_list WHERE a = '10' AND b = '5';
+ QUERY PLAN
+------------------------------------------------------------
+ Bitmap Heap Scan on mcv_list
+ Recheck Cond: ((a = '10'::text) AND (b = '5'::text))
+ -> Bitmap Index Scan on mcv_idx
+ Index Cond: ((a = '10'::text) AND (b = '5'::text))
+(4 rows)
+
+TRUNCATE mcv_list;
+-- check explain (expect bitmap index scan, not plain index scan) with NULLs
+INSERT INTO mcv_list
+ SELECT
+ (CASE WHEN i/10000 = 0 THEN NULL ELSE i/10000 END),
+ (CASE WHEN i/20000 = 0 THEN NULL ELSE i/20000 END),
+ (CASE WHEN i/40000 = 0 THEN NULL ELSE i/40000 END)
+ FROM generate_series(1,1000000) s(i);
+ANALYZE mcv_list;
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+ mcv_enabled | mcv_built | pg_mv_stats_mcvlist_info
+-------------+-----------+--------------------------
+ t | t | nitems=100
+(1 row)
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mcv_list WHERE a IS NULL AND b IS NULL;
+ QUERY PLAN
+---------------------------------------------------
+ Bitmap Heap Scan on mcv_list
+ Recheck Cond: ((a IS NULL) AND (b IS NULL))
+ -> Bitmap Index Scan on mcv_idx
+ Index Cond: ((a IS NULL) AND (b IS NULL))
+(4 rows)
+
+DROP TABLE mcv_list;
+-- NULL values (mix of int and text columns)
+CREATE TABLE mcv_list (
+ a INT,
+ b TEXT,
+ c INT,
+ d TEXT
+);
+CREATE STATISTICS s3 ON mcv_list (a, b, c, d) WITH (mcv);
+INSERT INTO mcv_list
+ SELECT
+ mod(i, 100),
+ (CASE WHEN mod(i, 200) = 0 THEN NULL ELSE mod(i,200) END),
+ mod(i, 400),
+ (CASE WHEN mod(i, 300) = 0 THEN NULL ELSE mod(i,600) END)
+ FROM generate_series(1,10000) s(i);
+ANALYZE mcv_list;
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+ mcv_enabled | mcv_built | pg_mv_stats_mcvlist_info
+-------------+-----------+--------------------------
+ t | t | nitems=1200
+(1 row)
+
+DROP TABLE mcv_list;
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 84b4425..66071d8 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1373,7 +1373,9 @@ pg_mv_stats| SELECT n.nspname AS schemaname,
s.staname,
s.stakeys AS attnums,
length(s.stadeps) AS depsbytes,
- pg_mv_stats_dependencies_info(s.stadeps) AS depsinfo
+ pg_mv_stats_dependencies_info(s.stadeps) AS depsinfo,
+ length(s.stamcv) AS mcvbytes,
+ pg_mv_stats_mcvlist_info(s.stamcv) AS mcvinfo
FROM ((pg_mv_statistic s
JOIN pg_class c ON ((c.oid = s.starelid)))
LEFT JOIN pg_namespace n ON ((n.oid = c.relnamespace)));
diff --git a/src/test/regress/parallel_schedule b/src/test/regress/parallel_schedule
index 4f2ffb8..85d94f1 100644
--- a/src/test/regress/parallel_schedule
+++ b/src/test/regress/parallel_schedule
@@ -112,4 +112,4 @@ test: event_trigger
test: stats
# run tests of multivariate stats
-test: mv_dependencies
+test: mv_dependencies mv_mcv
diff --git a/src/test/regress/serial_schedule b/src/test/regress/serial_schedule
index 097a04f..6584d73 100644
--- a/src/test/regress/serial_schedule
+++ b/src/test/regress/serial_schedule
@@ -163,3 +163,4 @@ test: xml
test: event_trigger
test: stats
test: mv_dependencies
+test: mv_mcv
diff --git a/src/test/regress/sql/mv_mcv.sql b/src/test/regress/sql/mv_mcv.sql
new file mode 100644
index 0000000..af4c9f4
--- /dev/null
+++ b/src/test/regress/sql/mv_mcv.sql
@@ -0,0 +1,178 @@
+-- data type passed by value
+CREATE TABLE mcv_list (
+ a INT,
+ b INT,
+ c INT
+);
+
+-- unknown column
+CREATE STATISTICS s1 ON mcv_list (unknown_column) WITH (mcv);
+
+-- single column
+CREATE STATISTICS s1 ON mcv_list (a) WITH (mcv);
+
+-- single column, duplicated
+CREATE STATISTICS s1 ON mcv_list (a, a) WITH (mcv);
+
+-- two columns, one duplicated
+CREATE STATISTICS s1 ON mcv_list (a, a, b) WITH (mcv);
+
+-- unknown option
+CREATE STATISTICS s1 ON mcv_list (a, b, c) WITH (unknown_option);
+
+-- missing MCV statistics
+CREATE STATISTICS s1 ON mcv_list (a, b, c) WITH (dependencies, max_mcv_items=200);
+
+-- invalid mcv_max_items value / too low
+CREATE STATISTICS s1 ON mcv_list (a, b, c) WITH (mcv, max_mcv_items=10);
+
+-- invalid mcv_max_items value / too high
+CREATE STATISTICS s1 ON mcv_list (a, b, c) WITH (mcv, max_mcv_items=10000);
+
+-- correct command
+CREATE STATISTICS s1 ON mcv_list (a, b, c) WITH (mcv);
+
+-- random data
+INSERT INTO mcv_list
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+
+ANALYZE mcv_list;
+
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+
+TRUNCATE mcv_list;
+
+-- a => b, a => c, b => c
+INSERT INTO mcv_list
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+
+ANALYZE mcv_list;
+
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+
+TRUNCATE mcv_list;
+
+-- a => b, a => c
+INSERT INTO mcv_list
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+
+ANALYZE mcv_list;
+
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+
+TRUNCATE mcv_list;
+
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO mcv_list
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX mcv_idx ON mcv_list (a, b);
+
+ANALYZE mcv_list;
+
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mcv_list WHERE a = 10 AND b = 5;
+
+DROP TABLE mcv_list;
+
+-- varlena type (text)
+CREATE TABLE mcv_list (
+ a TEXT,
+ b TEXT,
+ c TEXT
+);
+
+CREATE STATISTICS s2 ON mcv_list (a, b, c) WITH (mcv);
+
+-- random data
+INSERT INTO mcv_list
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+
+ANALYZE mcv_list;
+
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+
+TRUNCATE mcv_list;
+
+-- a => b, a => c, b => c
+INSERT INTO mcv_list
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+
+ANALYZE mcv_list;
+
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+
+TRUNCATE mcv_list;
+
+-- a => b, a => c
+INSERT INTO mcv_list
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE mcv_list;
+
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+
+TRUNCATE mcv_list;
+
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO mcv_list
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX mcv_idx ON mcv_list (a, b);
+ANALYZE mcv_list;
+
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mcv_list WHERE a = '10' AND b = '5';
+
+TRUNCATE mcv_list;
+
+-- check explain (expect bitmap index scan, not plain index scan) with NULLs
+INSERT INTO mcv_list
+ SELECT
+ (CASE WHEN i/10000 = 0 THEN NULL ELSE i/10000 END),
+ (CASE WHEN i/20000 = 0 THEN NULL ELSE i/20000 END),
+ (CASE WHEN i/40000 = 0 THEN NULL ELSE i/40000 END)
+ FROM generate_series(1,1000000) s(i);
+ANALYZE mcv_list;
+
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mcv_list WHERE a IS NULL AND b IS NULL;
+
+DROP TABLE mcv_list;
+
+-- NULL values (mix of int and text columns)
+CREATE TABLE mcv_list (
+ a INT,
+ b TEXT,
+ c INT,
+ d TEXT
+);
+
+CREATE STATISTICS s3 ON mcv_list (a, b, c, d) WITH (mcv);
+
+INSERT INTO mcv_list
+ SELECT
+ mod(i, 100),
+ (CASE WHEN mod(i, 200) = 0 THEN NULL ELSE mod(i,200) END),
+ mod(i, 400),
+ (CASE WHEN mod(i, 300) = 0 THEN NULL ELSE mod(i,600) END)
+ FROM generate_series(1,10000) s(i);
+
+ANALYZE mcv_list;
+
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+
+DROP TABLE mcv_list;
--
2.1.0
0005-multivariate-histograms.patch
From 31ff6cd36727d73e72aaa5fa1a0c52da460dae5b Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tv@fuzzy.cz>
Date: Sun, 11 Jan 2015 20:18:24 +0100
Subject: [PATCH 5/9] multivariate histograms
- extends the pg_mv_statistic catalog (add 'hist' fields)
- building the histograms during ANALYZE
- simple estimation while planning the queries
Includes regression tests mostly equal to those for functional
dependencies / MCV lists.
---
doc/src/sgml/ref/create_statistics.sgml | 18 +
src/backend/catalog/system_views.sql | 4 +-
src/backend/commands/statscmds.c | 44 +-
src/backend/nodes/outfuncs.c | 2 +
src/backend/optimizer/path/clausesel.c | 571 +++++++-
src/backend/optimizer/util/plancat.c | 4 +-
src/backend/utils/mvstats/Makefile | 2 +-
src/backend/utils/mvstats/README.histogram | 287 ++++
src/backend/utils/mvstats/README.stats | 2 +
src/backend/utils/mvstats/common.c | 37 +-
src/backend/utils/mvstats/histogram.c | 2032 ++++++++++++++++++++++++++++
src/bin/psql/describe.c | 17 +-
src/include/catalog/pg_mv_statistic.h | 24 +-
src/include/catalog/pg_proc.h | 4 +
src/include/nodes/relation.h | 2 +
src/include/utils/mvstats.h | 136 +-
src/test/regress/expected/mv_histogram.out | 207 +++
src/test/regress/expected/rules.out | 4 +-
src/test/regress/parallel_schedule | 2 +-
src/test/regress/serial_schedule | 1 +
src/test/regress/sql/mv_histogram.sql | 176 +++
21 files changed, 3538 insertions(+), 38 deletions(-)
create mode 100644 src/backend/utils/mvstats/README.histogram
create mode 100644 src/backend/utils/mvstats/histogram.c
create mode 100644 src/test/regress/expected/mv_histogram.out
create mode 100644 src/test/regress/sql/mv_histogram.sql
diff --git a/doc/src/sgml/ref/create_statistics.sgml b/doc/src/sgml/ref/create_statistics.sgml
index 193e4b0..fd3382e 100644
--- a/doc/src/sgml/ref/create_statistics.sgml
+++ b/doc/src/sgml/ref/create_statistics.sgml
@@ -133,6 +133,24 @@ CREATE STATISTICS [ IF NOT EXISTS ] <replaceable class="PARAMETER">statistics_na
</varlistentry>
<varlistentry>
+ <term><literal>histogram</> (<type>boolean</>)</term>
+ <listitem>
+ <para>
+ Enables histogram for the statistics.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><literal>max_buckets</> (<type>integer</>)</term>
+ <listitem>
+ <para>
+ Maximum number of histogram buckets.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
<term><literal>max_mcv_items</> (<type>integer</>)</term>
<listitem>
<para>
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 2d570ee..6afdee0 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -167,7 +167,9 @@ CREATE VIEW pg_mv_stats AS
length(S.stadeps) as depsbytes,
pg_mv_stats_dependencies_info(S.stadeps) as depsinfo,
length(S.stamcv) AS mcvbytes,
- pg_mv_stats_mcvlist_info(S.stamcv) AS mcvinfo
+ pg_mv_stats_mcvlist_info(S.stamcv) AS mcvinfo,
+ length(S.stahist) AS histbytes,
+ pg_mv_stats_histogram_info(S.stahist) AS histinfo
FROM (pg_mv_statistic S JOIN pg_class C ON (C.oid = S.starelid))
LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace);
diff --git a/src/backend/commands/statscmds.c b/src/backend/commands/statscmds.c
index 90bfaed..b974655 100644
--- a/src/backend/commands/statscmds.c
+++ b/src/backend/commands/statscmds.c
@@ -137,12 +137,15 @@ CreateStatistics(CreateStatsStmt *stmt)
/* by default build nothing */
bool build_dependencies = false,
- build_mcv = false;
+ build_mcv = false,
+ build_histogram = false;
- int32 max_mcv_items = -1;
+ int32 max_buckets = -1,
+ max_mcv_items = -1;
/* options required because of other options */
- bool require_mcv = false;
+ bool require_mcv = false,
+ require_histogram = false;
Assert(IsA(stmt, CreateStatsStmt));
@@ -241,6 +244,29 @@ CreateStatistics(CreateStatsStmt *stmt)
MVSTAT_MCVLIST_MAX_ITEMS)));
}
+ else if (strcmp(opt->defname, "histogram") == 0)
+ build_histogram = defGetBoolean(opt);
+ else if (strcmp(opt->defname, "max_buckets") == 0)
+ {
+ max_buckets = defGetInt32(opt);
+
+ /* this option requires 'histogram' to be enabled */
+ require_histogram = true;
+
+ /* sanity check */
+ if (max_buckets < MVSTAT_HIST_MIN_BUCKETS)
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("minimum number of buckets is %d",
+ MVSTAT_HIST_MIN_BUCKETS)));
+
+ else if (max_buckets > MVSTAT_HIST_MAX_BUCKETS)
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("maximum number of buckets is %d",
+ MVSTAT_HIST_MAX_BUCKETS)));
+
+ }
else
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
@@ -249,10 +275,10 @@ CreateStatistics(CreateStatsStmt *stmt)
}
/* check that at least some statistics were requested */
- if (! (build_dependencies || build_mcv))
+ if (! (build_dependencies || build_mcv || build_histogram))
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
- errmsg("no statistics type (dependencies, mcv) was requested")));
+ errmsg("no statistics type (dependencies, mcv, histogram) was requested")));
/* now do some checking of the options */
if (require_mcv && (! build_mcv))
@@ -260,6 +286,11 @@ CreateStatistics(CreateStatsStmt *stmt)
(errcode(ERRCODE_SYNTAX_ERROR),
errmsg("option 'mcv' is required by other options(s)")));
+ if (require_histogram && (! build_histogram))
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("option 'histogram' is required by other options(s)")));
+
/* sort the attnums and build int2vector */
qsort(attnums, numcols, sizeof(int16), compare_int16);
stakeys = buildint2vector(attnums, numcols);
@@ -279,11 +310,14 @@ CreateStatistics(CreateStatsStmt *stmt)
values[Anum_pg_mv_statistic_deps_enabled -1] = BoolGetDatum(build_dependencies);
values[Anum_pg_mv_statistic_mcv_enabled -1] = BoolGetDatum(build_mcv);
+ values[Anum_pg_mv_statistic_hist_enabled -1] = BoolGetDatum(build_histogram);
values[Anum_pg_mv_statistic_mcv_max_items -1] = Int32GetDatum(max_mcv_items);
+ values[Anum_pg_mv_statistic_hist_max_buckets -1] = Int32GetDatum(max_buckets);
nulls[Anum_pg_mv_statistic_stadeps -1] = true;
nulls[Anum_pg_mv_statistic_stamcv -1] = true;
+ nulls[Anum_pg_mv_statistic_stahist -1] = true;
/* insert the tuple into pg_mv_statistic */
mvstatrel = heap_open(MvStatisticRelationId, RowExclusiveLock);
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index e3983fd..d3a96f0 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -1978,10 +1978,12 @@ _outMVStatisticInfo(StringInfo str, const MVStatisticInfo *node)
/* enabled statistics */
WRITE_BOOL_FIELD(deps_enabled);
WRITE_BOOL_FIELD(mcv_enabled);
+ WRITE_BOOL_FIELD(hist_enabled);
/* built/available statistics */
WRITE_BOOL_FIELD(deps_built);
WRITE_BOOL_FIELD(mcv_built);
+ WRITE_BOOL_FIELD(hist_built);
}
static void
diff --git a/src/backend/optimizer/path/clausesel.c b/src/backend/optimizer/path/clausesel.c
index 977f88e..0de2418 100644
--- a/src/backend/optimizer/path/clausesel.c
+++ b/src/backend/optimizer/path/clausesel.c
@@ -49,6 +49,7 @@ static void addRangeClause(RangeQueryClause **rqlist, Node *clause,
#define MV_CLAUSE_TYPE_FDEP 0x01
#define MV_CLAUSE_TYPE_MCV 0x02
+#define MV_CLAUSE_TYPE_HIST 0x04
static bool clause_is_mv_compatible(Node *clause, Index relid, Bitmapset **attnums,
int type);
@@ -74,6 +75,8 @@ static Selectivity clauselist_mv_selectivity(PlannerInfo *root,
static Selectivity clauselist_mv_selectivity_mcvlist(PlannerInfo *root,
List *clauses, MVStatisticInfo *mvstats,
bool *fullmatch, Selectivity *lowsel);
+static Selectivity clauselist_mv_selectivity_histogram(PlannerInfo *root,
+ List *clauses, MVStatisticInfo *mvstats);
static int update_match_bitmap_mcvlist(PlannerInfo *root, List *clauses,
int2vector *stakeys, MCVList mcvlist,
@@ -81,6 +84,12 @@ static int update_match_bitmap_mcvlist(PlannerInfo *root, List *clauses,
Selectivity *lowsel, bool *fullmatch,
bool is_or);
+static int update_match_bitmap_histogram(PlannerInfo *root, List *clauses,
+ int2vector *stakeys,
+ MVSerializedHistogram mvhist,
+ int nmatches, char * matches,
+ bool is_or);
+
static bool has_stats(List *stats, int type);
static List * find_stats(PlannerInfo *root, Index relid);
@@ -93,6 +102,7 @@ static List * find_stats(PlannerInfo *root, Index relid);
#define UPDATE_RESULT(m,r,isor) \
(m) = (isor) ? (MAX(m,r)) : (MIN(m,r))
+
/****************************************************************************
* ROUTINES TO COMPUTE SELECTIVITIES
****************************************************************************/
@@ -121,7 +131,7 @@ static List * find_stats(PlannerInfo *root, Index relid);
*
* First we try to reduce the list of clauses by applying (soft) functional
* dependencies, and then we try to estimate the selectivity of the reduced
- * list of clauses using the multivariate MCV list.
+ * list of clauses using the multivariate MCV list and histograms.
*
* Finally we remove the portion of clauses estimated using multivariate stats,
* and process the rest of the clauses using the regular per-column stats.
@@ -214,11 +224,13 @@ clauselist_selectivity(PlannerInfo *root,
* with the multivariate code and simply skip to estimation using the
* regular per-column stats.
*/
- if (has_stats(stats, MV_CLAUSE_TYPE_MCV) &&
- (count_mv_attnums(clauses, relid, MV_CLAUSE_TYPE_MCV) >= 2))
+ if (has_stats(stats, MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST) &&
+ (count_mv_attnums(clauses, relid,
+ MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST) >= 2))
{
/* collect attributes from the compatible conditions */
- Bitmapset *mvattnums = collect_mv_attnums(clauses, relid, MV_CLAUSE_TYPE_MCV);
+ Bitmapset *mvattnums = collect_mv_attnums(clauses, relid,
+ MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST);
/* and search for the statistic covering the most attributes */
MVStatisticInfo *mvstat = choose_mv_statistics(stats, mvattnums);
@@ -230,7 +242,7 @@ clauselist_selectivity(PlannerInfo *root,
/* split the clauselist into regular and mv-clauses */
clauses = clauselist_mv_split(root, relid, clauses, &mvclauses,
- mvstat, MV_CLAUSE_TYPE_MCV);
+ mvstat, MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST);
/* we've chosen the histogram to match the clauses */
Assert(mvclauses != NIL);
@@ -942,6 +954,7 @@ static Selectivity
clauselist_mv_selectivity(PlannerInfo *root, List *clauses, MVStatisticInfo *mvstats)
{
bool fullmatch = false;
+ Selectivity s1 = 0.0, s2 = 0.0;
/*
* Lowest frequency in the MCV list (may be used as an upper bound
@@ -955,9 +968,24 @@ clauselist_mv_selectivity(PlannerInfo *root, List *clauses, MVStatisticInfo *mvs
* MCV/histogram evaluation).
*/
- /* Evaluate the MCV selectivity */
- return clauselist_mv_selectivity_mcvlist(root, clauses, mvstats,
+ /* Evaluate the MCV first. */
+ s1 = clauselist_mv_selectivity_mcvlist(root, clauses, mvstats,
&fullmatch, &mcv_low);
+
+ /*
+ * If we got a full equality match on the MCV list, we're done (and
+ * the estimate is pretty good).
+ */
+ if (fullmatch && (s1 > 0.0))
+ return s1;
+
+ /* TODO if (fullmatch) without matching MCV item, use the mcv_low
+ * selectivity as upper bound */
+
+ s2 = clauselist_mv_selectivity_histogram(root, clauses, mvstats);
+
+ /* TODO clamp to <= 1.0 (or more strictly, when possible) */
+ return s1 + s2;
}
/*
@@ -1160,7 +1188,7 @@ choose_mv_statistics(List *stats, Bitmapset *attnums)
int numattrs = attrs->dim1;
/* skip dependencies-only stats */
- if (! info->mcv_built)
+ if (! (info->mcv_built || info->hist_built))
continue;
/* count columns covered by the histogram */
@@ -1391,7 +1419,7 @@ mv_compatible_walker(Node *node, mv_compatible_context *context)
case F_SCALARGTSEL:
/* not compatible with functional dependencies */
- if (! (context->types & MV_CLAUSE_TYPE_MCV))
+ if (! (context->types & (MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST)))
return true; /* terminate */
break;
@@ -2007,6 +2035,9 @@ has_stats(List *stats, int type)
if ((type & MV_CLAUSE_TYPE_MCV) && stat->mcv_built)
return true;
+
+ if ((type & MV_CLAUSE_TYPE_HIST) && stat->hist_built)
+ return true;
}
return false;
@@ -2411,3 +2442,525 @@ update_match_bitmap_mcvlist(PlannerInfo *root, List *clauses,
return nmatches;
}
+
+/*
+ * Estimate selectivity of clauses using a histogram.
+ *
+ * If there's no histogram for the stats, the function returns 0.0.
+ *
+ * The general idea of this method is similar to how MCV lists are
+ * processed, except that this introduces the concept of a partial
+ * match (MCV only works with full match / mismatch).
+ *
+ * The algorithm works like this:
+ *
+ * 1) mark all buckets as 'full match'
+ * 2) walk through all the clauses
+ * 3) for a particular clause, walk through all the buckets
+ * 4) skip buckets that are already 'no match'
+ * 5) check clause for buckets that still match (at least partially)
+ * 6) sum frequencies for buckets to get selectivity
+ *
+ * Unlike MCV lists, histograms have a concept of a partial match. In
+ * that case we count 1/2 of the bucket frequency, to minimize the
+ * average error. The MV histograms are usually less detailed than the
+ * per-column ones, meaning the sum is often quite high (thanks to
+ * combining a lot of "partially hit" buckets).
+ *
+ * Maybe we could use per-bucket information with number of distinct
+ * values it contains (for each dimension), and then use that to correct
+ * the estimate (so with 10 distinct values, we'd use 1/10 of the bucket
+ * frequency). We might also scale the value depending on the actual
+ * ndistinct estimate (not just the values observed in the sample).
+ *
+ * Another option would be to multiply the selectivities, i.e. if we get
+ * 'partial match' for a bucket for multiple conditions, we might use
+ * 0.5^k (where k is the number of conditions), instead of 0.5. This
+ * probably does not minimize the average error, though.
+ *
+ * TODO This might use a similar shortcut to MCV lists - count buckets
+ * marked as partial/full match, and terminate once this drops to 0.
+ * Not sure if it's really worth it - for MCV lists a situation like
+ * this is not uncommon, but for histograms it's not that clear.
+ */
+static Selectivity
+clauselist_mv_selectivity_histogram(PlannerInfo *root, List *clauses,
+ MVStatisticInfo *mvstats)
+{
+ int i;
+ Selectivity s = 0.0;
+ Selectivity u = 0.0;
+
+ int nmatches = 0;
+ char *matches = NULL;
+
+ MVSerializedHistogram mvhist = NULL;
+
+ /* there's no histogram */
+ if (! mvstats->hist_built)
+ return 0.0;
+
+ /* load the histogram (the hist_built check above guarantees it exists) */
+ mvhist = load_mv_histogram(mvstats->mvoid);
+
+ Assert (mvhist != NULL);
+ Assert (clauses != NIL);
+ Assert (list_length(clauses) >= 2);
+
+ /*
+ * Bitmap of bucket matches (mismatch, partial, full). By default
+ * all buckets fully match, and the clauses gradually eliminate them.
+ */
+ matches = palloc0(sizeof(char) * mvhist->nbuckets);
+ memset(matches, MVSTATS_MATCH_FULL, sizeof(char)*mvhist->nbuckets);
+
+ nmatches = mvhist->nbuckets;
+
+ /* build the match bitmap */
+ update_match_bitmap_histogram(root, clauses,
+ mvstats->stakeys, mvhist,
+ nmatches, matches, false);
+
+ /* now, walk through the buckets and sum the selectivities */
+ for (i = 0; i < mvhist->nbuckets; i++)
+ {
+ /*
+ * Find out what part of the data is covered by the histogram,
+ * so that we can 'scale' the selectivity properly (e.g. when
+ * only 50% of the sample got into the histogram, and the rest
+ * is in a MCV list).
+ *
+ * TODO This might be handled by keeping a global "frequency"
+ * for the whole histogram, which might save us some time
+ * spent accessing the not-matching part of the histogram.
+ * Although it's likely in a cache, so it's very fast.
+ */
+ u += mvhist->buckets[i]->ntuples;
+
+ if (matches[i] == MVSTATS_MATCH_FULL)
+ s += mvhist->buckets[i]->ntuples;
+ else if (matches[i] == MVSTATS_MATCH_PARTIAL)
+ s += 0.5 * mvhist->buckets[i]->ntuples;
+ }
+
+#ifdef DEBUG_MVHIST
+ debug_histogram_matches(mvhist, matches);
+#endif
+
+ /* release the allocated bitmap and deserialized histogram */
+ pfree(matches);
+ pfree(mvhist);
+
+ return s * u;
+}
+
+/* cached result of bucket boundary comparison for a single dimension */
+
+#define HIST_CACHE_NOT_FOUND 0x00
+#define HIST_CACHE_FALSE 0x01
+#define HIST_CACHE_TRUE 0x03
+#define HIST_CACHE_MASK 0x02
+
+static char
+bucket_contains_value(FmgrInfo ltproc, Datum constvalue,
+ Datum min_value, Datum max_value,
+ int min_index, int max_index,
+ bool min_include, bool max_include,
+ char * callcache)
+{
+ bool a, b;
+
+ char min_cached = callcache[min_index];
+ char max_cached = callcache[max_index];
+
+ /*
+ * First some quick checks on equality - if either (inclusive) boundary
+ * equals the constant, we have a partial match (so no need to call the
+ * comparator).
+ */
+ if (((min_value == constvalue) && (min_include)) ||
+ ((max_value == constvalue) && (max_include)))
+ return MVSTATS_MATCH_PARTIAL;
+
+ /* Keep the values 0/1 because of the XOR at the end. */
+ a = ((min_cached & HIST_CACHE_MASK) >> 1);
+ b = ((max_cached & HIST_CACHE_MASK) >> 1);
+
+ /*
+ * If the result for the bucket lower bound is not in the cache, evaluate
+ * the function and store the result in the cache.
+ */
+ if (! min_cached)
+ {
+ a = DatumGetBool(FunctionCall2Coll(&ltproc,
+ DEFAULT_COLLATION_OID,
+ constvalue, min_value));
+ /* remember the result */
+ callcache[min_index] = (a) ? HIST_CACHE_TRUE : HIST_CACHE_FALSE;
+ }
+
+ /* And do the same for the upper bound. */
+ if (! max_cached)
+ {
+ b = DatumGetBool(FunctionCall2Coll(&ltproc,
+ DEFAULT_COLLATION_OID,
+ constvalue, max_value));
+ /* remember the result */
+ callcache[max_index] = (b) ? HIST_CACHE_TRUE : HIST_CACHE_FALSE;
+ }
+
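+ /*
+ * For example, for a bucket [10, 20] and constant 15, (15 < 10) is
+ * false while (15 < 20) is true, so the XOR yields a partial match.
+ * For constant 30 both comparisons are false, so the bucket can't
+ * match at all.
+ */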
+ return (a ^ b) ? MVSTATS_MATCH_PARTIAL : MVSTATS_MATCH_NONE;
+}
+
+static char
+bucket_is_smaller_than_value(FmgrInfo opproc, Datum constvalue,
+ Datum min_value, Datum max_value,
+ int min_index, int max_index,
+ bool min_include, bool max_include,
+ char * callcache, bool isgt)
+{
+ char min_cached = callcache[min_index];
+ char max_cached = callcache[max_index];
+
+ /* Keep the values 0/1 because of the XOR at the end. */
+ bool a = ((min_cached & HIST_CACHE_MASK) >> 1);
+ bool b = ((max_cached & HIST_CACHE_MASK) >> 1);
+
+ if (! min_cached)
+ {
+ a = DatumGetBool(FunctionCall2Coll(&opproc,
+ DEFAULT_COLLATION_OID,
+ min_value,
+ constvalue));
+ /* remember the result */
+ callcache[min_index] = (a) ? HIST_CACHE_TRUE : HIST_CACHE_FALSE;
+ }
+
+ if (! max_cached)
+ {
+ b = DatumGetBool(FunctionCall2Coll(&opproc,
+ DEFAULT_COLLATION_OID,
+ max_value,
+ constvalue));
+ /* remember the result */
+ callcache[max_index] = (b) ? HIST_CACHE_TRUE : HIST_CACHE_FALSE;
+ }
+
+ /*
+ * Now, we need to combine both results into the final answer, and we need
+ * to be careful about the 'isgt' flag, which effectively inverts the meaning.
+ *
+ * First, we handle the case when each boundary returns different results.
+ * In that case the outcome can only be 'partial' match.
+ */
+ if (a != b)
+ return MVSTATS_MATCH_PARTIAL;
+
+ /*
+ * When the results are the same, then it depends on the 'isgt' value. There
+ * are four options:
+ *
+ * isgt=false a=b=true => full match
+ * isgt=false a=b=false => empty
+ * isgt=true a=b=true => empty
+ * isgt=true a=b=false => full match
+ *
+ * We can cheat a bit - since we know that (a == b), we just use 'a'.
+ */
+ if (isgt)
+ return (!a) ? MVSTATS_MATCH_FULL : MVSTATS_MATCH_NONE;
+ else
+ return ( a) ? MVSTATS_MATCH_FULL : MVSTATS_MATCH_NONE;
+}
+
+/*
+ * Evaluate clauses using the histogram, and update the match bitmap.
+ *
+ * The bitmap may be already partially set, so this is really a way to
+ * combine results of several clause lists - either when computing
+ * conditional probability P(A|B) or a combination of AND/OR clauses.
+ *
+ * Note: This is not a simple bitmap in the sense that there are more
+ * than two possible values for each item - no match, partial
+ * match and full match. So we need 2 bits per item.
+ *
+ * TODO This works with 'bitmap' where each item is represented as a
+ * char, which is slightly wasteful. Instead, we could use a bitmap
+ * with 2 bits per item, reducing the size to ~1/4. By using values
+ * 0, 1 and 3 (instead of 0, 1 and 2), the operations (merging etc.)
+ * might be performed just like for simple bitmap by using & and |,
+ * which might be faster than min/max.
+ */
+static int
+update_match_bitmap_histogram(PlannerInfo *root, List *clauses,
+ int2vector *stakeys,
+ MVSerializedHistogram mvhist,
+ int nmatches, char * matches,
+ bool is_or)
+{
+ int i;
+ ListCell * l;
+
+ /*
+ * Used for caching function calls, only once per deduplicated value.
+ *
+ * We know may have up to (2 * nbuckets) values per dimension. It's
+ * probably overkill, but let's allocate that once for all clauses,
+ * to minimize overhead.
+ *
+ * Also, we only need two bits per value, but this allocates byte
+ * per value. Might be worth optimizing.
+ *
+ * 0x00 - not yet called
+ * 0x01 - called, result is 'false'
+ * 0x03 - called, result is 'true'
+ */
+ char *callcache = palloc(2 * mvhist->nbuckets);
+
+ Assert(mvhist != NULL);
+ Assert(mvhist->nbuckets > 0);
+ Assert(nmatches >= 0);
+ Assert(nmatches <= mvhist->nbuckets);
+
+ Assert(clauses != NIL);
+ Assert(list_length(clauses) >= 1);
+
+ /* loop through the clauses and do the estimation */
+ foreach (l, clauses)
+ {
+ Node * clause = (Node*)lfirst(l);
+
+ /* if it's a RestrictInfo, then extract the clause */
+ if (IsA(clause, RestrictInfo))
+ clause = (Node*)((RestrictInfo*)clause)->clause;
+
+ /* it's either OpClause, or NullTest */
+ if (is_opclause(clause))
+ {
+ OpExpr * expr = (OpExpr*)clause;
+ bool varonleft = true;
+ bool ok;
+
+ FmgrInfo opproc; /* operator */
+ fmgr_info(get_opcode(expr->opno), &opproc);
+
+ /* reset the cache (per clause) */
+ memset(callcache, 0, 2 * mvhist->nbuckets);
+
+ ok = (NumRelids(clause) == 1) &&
+ (is_pseudo_constant_clause(lsecond(expr->args)) ||
+ (varonleft = false,
+ is_pseudo_constant_clause(linitial(expr->args))));
+
+ if (ok)
+ {
+ FmgrInfo ltproc;
+ RegProcedure oprrest = get_oprrest(expr->opno);
+
+ Var * var = (varonleft) ? linitial(expr->args) : lsecond(expr->args);
+ Const * cst = (varonleft) ? lsecond(expr->args) : linitial(expr->args);
+ bool isgt = (! varonleft);
+
+ TypeCacheEntry *typecache
+ = lookup_type_cache(var->vartype, TYPECACHE_LT_OPR);
+
+ /* lookup dimension for the attribute */
+ int idx = mv_get_index(var->varattno, stakeys);
+
+ fmgr_info(get_opcode(typecache->lt_opr), &ltproc);
+
+ /*
+ * Check this for all buckets that still have "true" in the bitmap
+ *
+ * We already know the clauses use suitable operators (because that's
+ * how we filtered them).
+ */
+ for (i = 0; i < mvhist->nbuckets; i++)
+ {
+ char res = MVSTATS_MATCH_NONE;
+
+ MVSerializedBucket bucket = mvhist->buckets[i];
+
+ /* histogram boundaries */
+ Datum minval, maxval;
+ bool mininclude, maxinclude;
+ int minidx, maxidx;
+
+ /*
+ * For AND-lists, we can also mark NULL buckets as 'no match'
+ * (and then skip them). For OR-lists this is not possible.
+ */
+ if ((! is_or) && bucket->nullsonly[idx])
+ matches[i] = MVSTATS_MATCH_NONE;
+
+ /*
+ * Skip buckets that were already eliminated - this is important
+ * considering how we update the info (we only lower the match).
+ * We can't really do anything about the MATCH_PARTIAL buckets.
+ */
+ if ((! is_or) && (matches[i] == MVSTATS_MATCH_NONE))
+ continue;
+ else if (is_or && (matches[i] == MVSTATS_MATCH_FULL))
+ continue;
+
+ /* lookup the values and cache of function calls */
+ minidx = bucket->min[idx];
+ maxidx = bucket->max[idx];
+
+ minval = mvhist->values[idx][bucket->min[idx]];
+ maxval = mvhist->values[idx][bucket->max[idx]];
+
+ mininclude = bucket->min_inclusive[idx];
+ maxinclude = bucket->max_inclusive[idx];
+
+ /*
+ * TODO Maybe it's possible to add here a similar optimization
+ * as for the MCV lists:
+ *
+ * (nmatches == 0) && AND-list => all eliminated (FALSE)
+ * (nmatches == N) && OR-list => all eliminated (TRUE)
+ *
+ * But it's more complex because of the partial matches.
+ */
+
+ /*
+ * If it's not a "<" or ">" or "=" operator, just ignore the
+ * clause. Otherwise note the relid and attnum for the variable.
+ *
+ * TODO I'm really unsure whether the handling of the 'isgt' flag (that is,
+ * clauses with reverse order of variable/constant) is correct. I wouldn't
+ * be surprised if there was some mixup. Using the lt/gt operators
+ * instead of messing with the opproc could make it simpler.
+ * It would however be using a different operator than the query,
+ * although it's not any shadier than using the selectivity function
+ * as is done currently.
+ */
+ switch (oprrest)
+ {
+ case F_SCALARLTSEL: /* Var < Const */
+ case F_SCALARGTSEL: /* Var > Const */
+
+ res = bucket_is_smaller_than_value(opproc, cst->constvalue,
+ minval, maxval,
+ minidx, maxidx,
+ mininclude, maxinclude,
+ callcache, isgt);
+ break;
+
+ case F_EQSEL:
+
+ /*
+ * We only check whether the value is within the bucket, using the
+ * lt operator, and we also check for equality with the boundaries.
+ */
+
+ res = bucket_contains_value(ltproc, cst->constvalue,
+ minval, maxval,
+ minidx, maxidx,
+ mininclude, maxinclude,
+ callcache);
+ break;
+ }
+
+ UPDATE_RESULT(matches[i], res, is_or);
+
+ }
+ }
+ }
+ else if (IsA(clause, NullTest))
+ {
+ NullTest * expr = (NullTest*)clause;
+ Var * var = (Var*)(expr->arg);
+
+ /* FIXME proper matching attribute to dimension */
+ int idx = mv_get_index(var->varattno, stakeys);
+
+ /*
+ * Walk through the buckets and evaluate the current clause. We can
+ * skip items that were already ruled out, and terminate if there are
+ * no remaining buckets that might possibly match.
+ */
+ for (i = 0; i < mvhist->nbuckets; i++)
+ {
+ MVSerializedBucket bucket = mvhist->buckets[i];
+
+ /*
+ * Skip buckets that were already eliminated - this is important
+ * considering how we update the info (we only lower the match).
+ */
+ if ((! is_or) && (matches[i] == MVSTATS_MATCH_NONE))
+ continue;
+ else if (is_or && (matches[i] == MVSTATS_MATCH_FULL))
+ continue;
+
+ /* if the clause mismatches the bucket, set it as MATCH_NONE */
+ if ((expr->nulltesttype == IS_NULL)
+ && (! bucket->nullsonly[idx]))
+ UPDATE_RESULT(matches[i], MVSTATS_MATCH_NONE, is_or);
+
+ else if ((expr->nulltesttype == IS_NOT_NULL) &&
+ (bucket->nullsonly[idx]))
+ UPDATE_RESULT(matches[i], MVSTATS_MATCH_NONE, is_or);
+ }
+ }
+ else if (or_clause(clause) || and_clause(clause))
+ {
+ /* AND/OR clause, with all clauses compatible with the selected MV stat */
+
+ int i;
+ BoolExpr *orclause = ((BoolExpr*)clause);
+ List *orclauses = orclause->args;
+
+ /* match/mismatch bitmap for each bucket */
+ int or_nmatches = 0;
+ char * or_matches = NULL;
+
+ Assert(orclauses != NIL);
+ Assert(list_length(orclauses) >= 2);
+
+ /* number of matching buckets */
+ or_nmatches = mvhist->nbuckets;
+
+ /* by default none of the buckets matches the clauses */
+ or_matches = palloc0(sizeof(char) * or_nmatches);
+
+ if (or_clause(clause))
+ {
+ /* OR clauses assume nothing matches, initially */
+ memset(or_matches, MVSTATS_MATCH_NONE, sizeof(char)*or_nmatches);
+ or_nmatches = 0;
+ }
+ else
+ {
+ /* AND clauses assume everything matches, initially */
+ memset(or_matches, MVSTATS_MATCH_FULL, sizeof(char)*or_nmatches);
+ }
+
+ /* build the match bitmap for the OR-clauses */
+ or_nmatches = update_match_bitmap_histogram(root, orclauses,
+ stakeys, mvhist,
+ or_nmatches, or_matches, or_clause(clause));
+
+ /* merge the bitmap into the existing one */
+ for (i = 0; i < mvhist->nbuckets; i++)
+ {
+ /*
+ * To AND-merge the bitmaps, a MIN() semantics is used.
+ * For OR-merge, use MAX().
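+ * E.g. AND-merging a full match with a partial match yields a partial
+ * match (MIN), while OR-merging them yields the full match (MAX).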
+ *
+ * FIXME this does not decrease the number of matches
+ */
+ UPDATE_RESULT(matches[i], or_matches[i], is_or);
+ }
+
+ pfree(or_matches);
+
+ }
+ else
+ elog(ERROR, "unknown clause type: %d", clause->type);
+ }
+
+ /* free the call cache */
+ pfree(callcache);
+
+ return nmatches;
+}
diff --git a/src/backend/optimizer/util/plancat.c b/src/backend/optimizer/util/plancat.c
index a92f889..d46aed2 100644
--- a/src/backend/optimizer/util/plancat.c
+++ b/src/backend/optimizer/util/plancat.c
@@ -416,7 +416,7 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
mvstat = (Form_pg_mv_statistic) GETSTRUCT(htup);
/* unavailable stats are not interesting for the planner */
- if (mvstat->deps_built || mvstat->mcv_built)
+ if (mvstat->deps_built || mvstat->mcv_built || mvstat->hist_built)
{
info = makeNode(MVStatisticInfo);
@@ -426,10 +426,12 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
/* enabled statistics */
info->deps_enabled = mvstat->deps_enabled;
info->mcv_enabled = mvstat->mcv_enabled;
+ info->hist_enabled = mvstat->hist_enabled;
/* built/available statistics */
info->deps_built = mvstat->deps_built;
info->mcv_built = mvstat->mcv_built;
+ info->hist_built = mvstat->hist_built;
/* stakeys */
adatum = SysCacheGetAttr(MVSTATOID, htup,
diff --git a/src/backend/utils/mvstats/Makefile b/src/backend/utils/mvstats/Makefile
index f9bf10c..9dbb3b6 100644
--- a/src/backend/utils/mvstats/Makefile
+++ b/src/backend/utils/mvstats/Makefile
@@ -12,6 +12,6 @@ subdir = src/backend/utils/mvstats
top_builddir = ../../../..
include $(top_builddir)/src/Makefile.global
-OBJS = common.o dependencies.o mcv.o
+OBJS = common.o dependencies.o histogram.o mcv.o
include $(top_srcdir)/src/backend/common.mk
diff --git a/src/backend/utils/mvstats/README.histogram b/src/backend/utils/mvstats/README.histogram
new file mode 100644
index 0000000..8234d2c
--- /dev/null
+++ b/src/backend/utils/mvstats/README.histogram
@@ -0,0 +1,287 @@
+Multivariate histograms
+=======================
+
+Histograms on individual attributes consist of buckets represented by ranges,
+covering the domain of the attribute. That is, each bucket is a [min,max]
+interval, and contains all values in this range. The histogram is built in such
+a way that all buckets have about the same frequency.
+
+Multivariate histograms are an extension into n-dimensional space - the buckets
+are n-dimensional intervals (i.e. n-dimensional rectagles), covering the domain
+of the combination of attributes. That is, each bucket has a vector of lower
+and upper boundaries, denoted min[i] and max[i] (where i = 1..n).
+
+In addition to the boundaries, each bucket tracks additional info:
+
+ * frequency (fraction of tuples in the bucket)
+ * whether the boundaries are inclusive or exclusive
+ * whether the dimension contains only NULL values
+ * number of distinct values in each dimension (for building only)
+
+It's possible that in the future we'll multiple histogram types, with different
+features. We do however expect all the types to share the same representation
+(buckets as ranges) and only differ in how we build them.
+
+The current implementation builds non-overlapping buckets, but that may not be
+true for other histogram types, so the code should not rely on this assumption.
+There are interesting types of histograms (and algorithms) with overlapping
+buckets.
+
+When used on low-cardinality data, histograms usually perform considerably worse
+than MCV lists (which are a good fit for this kind of data). This is especially
+true for label-like values, where the ordering of the values is mostly
+unrelated to the meaning of the data, while proper ordering is crucial for
+histograms.
+
+On high-cardinality data the histograms are usually a better choice, because MCV
+lists can't represent the distribution accurately enough.
+
+
+Selectivity estimation
+----------------------
+
+The estimation is implemented in clauselist_mv_selectivity_histogram(), and
+works very similarly to clauselist_mv_selectivity_mcvlist().
+
+The main difference is that while MCV lists support exact matches, histograms
+often result in approximate matches - e.g. for equality we can only say whether
+the constant falls into the bucket's range, not whether it's actually present
+or what fraction of the bucket it represents. In this case we rely on
+some defaults just like in the per-column histograms.
+
+The current implementation uses histograms to estimate these types of clauses
+(think of WHERE conditions):
+
+ (a) equality clauses WHERE (a = 1) AND (b = 2)
+ (b) inequality clauses WHERE (a < 1) AND (b >= 2)
+ (c) NULL clauses WHERE (a IS NULL) AND (b IS NOT NULL)
+ (d) OR-clauses WHERE (a = 1) OR (b = 2)
+
+Similarly to MCV lists, it's possible to add support for additional types of
+clauses, for example:
+
+ (e) multi-var clauses WHERE (a > b)
+
+and so on. These are tasks for the future, not yet implemented.
+
+
+When evaluating a clause on a bucket, we may get one of three results:
+
+ (a) FULL_MATCH - The bucket definitely matches the clause.
+
+ (b) PARTIAL_MATCH - The bucket matches the clause, but not necessarily all
+ the tuples it represents.
+
+ (c) NO_MATCH - The bucket definitely does not match the clause.
+
+This may be illustrated using a range [1, 5], which is essentially a 1-D bucket.
+With a few example clauses:
+
+ WHERE (a < 10) => FULL_MATCH (all range values are below
+ 10, so the whole bucket matches)
+
+ WHERE (a < 3) => PARTIAL_MATCH (there may be values matching
+ the clause, but we don't know how many)
+
+ WHERE (a < 0) => NO_MATCH (the whole range is above 0, so
+ no values from the bucket can match)
+
+Some clauses can produce only a subset of these results - for example equality
+clauses never produce FULL_MATCH, as we always hit only part of the bucket
+(we can't match both boundaries at the same time). This results in less accurate
+estimates compared to MCV lists, where we can hit an MCV item exactly (there's
+no PARTIAL match in MCV).
+
+There are also clauses that may not produce any PARTIAL_MATCH results. A nice
+example is the 'IS [NOT] NULL' clause, which either matches the bucket
+completely (FULL_MATCH) or not at all (NO_MATCH), thanks to how the NULL-buckets
+are constructed.
+
+Computing the total selectivity estimate is trivial - simply sum selectivities
+from all the FULL_MATCH and PARTIAL_MATCH buckets (but for buckets marked with
+PARTIAL_MATCH, multiply the frequency by 0.5 to minimize the average error).
+
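+A minimal sketch of that final summation (the function name and the
+MVSTATS_MATCH_PARTIAL constant are illustrative assumptions, not
+necessarily the identifiers used by the patch):
+
+    static Selectivity
+    histogram_selectivity(MVSerializedHistogram mvhist, char *matches)
+    {
+        int         i;
+        Selectivity s = 0.0;
+
+        for (i = 0; i < mvhist->nbuckets; i++)
+        {
+            if (matches[i] == MVSTATS_MATCH_FULL)
+                s += mvhist->buckets[i]->ntuples;
+            else if (matches[i] == MVSTATS_MATCH_PARTIAL)
+                s += 0.5 * mvhist->buckets[i]->ntuples;
+        }
+
+        return s;
+    }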
+
+Building a histogram
+---------------------
+
+The algorithm for building a histogram is, in general, quite simple:
+
+ (a) create an initial bucket (containing all sample rows)
+
+ (b) create NULL buckets (by splitting the initial bucket)
+
+ (c) repeat
+
+ (1) choose bucket to split next
+
+ (2) terminate if no bucket that can be split is found, or if we've
+ reached the maximum number of buckets (16384)
+
+ (3) choose dimension to partition the bucket by
+
+ (4) partition the bucket by the selected dimension
+
+The main complexity is hidden in steps (c.1) and (c.3), i.e. how we choose the
+bucket and dimension for the split.
+
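+In code, step (c) is essentially this loop (a slightly simplified version
+of the loop in build_mv_histogram):
+
+    while (histogram->nbuckets < MVSTAT_HIST_MAX_BUCKETS)
+    {
+        MVBucket bucket = select_bucket_to_partition(histogram->nbuckets,
+                                                     histogram->buckets);
+
+        /* no bucket eligible for a split, terminate */
+        if (bucket == NULL)
+            break;
+
+        histogram->buckets[histogram->nbuckets++]
+            = partition_bucket(bucket, attrs, stats,
+                               ndistvalues, distvalues);
+    }
+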
+Similarly to one-dimensional histograms, we want to produce buckets with roughly
+the same frequency. We also need to produce "regular" buckets, because buckets
+with one "side" much longer than the others are very likely to match a lot of
+conditions (which increases error, even if the bucket frequency is very low).
+
+To achieve this, we choose the largest bucket (containing the most sample rows),
+but we only choose buckets that can actually be split (have at least 3 different
+combinations of values).
+
+Then we choose the "longest" dimension of the bucket, with length measured by
+the number of distinct values in the sample.
+
+For details see functions select_bucket_to_partition() and partition_bucket().
+
+The current limit on the number of buckets (16384) is mostly arbitrary, but chosen
+so that it guarantees we don't exceed the number of distinct values indexable by
+uint16 in any of the dimensions. In practice we could handle more buckets as we
+index each dimension separately and the splits should use the dimensions evenly.
+
+Also, histograms this large (with 16k values in multiple dimensions) would be
+quite expensive to build and process, so the 16k limit is rather reasonable.
+
+The actual number of buckets is also related to statistics target, because we
+require MIN_BUCKET_ROWS (10) tuples per bucket before a split, so we can't have
+more than (2 * 300 * target / 10) buckets. For the default target (100) this
+evaluates to ~6k.
+
+
+NULL handling (create_null_buckets)
+-----------------------------------
+
+When building histograms on a single attribute, we first filter out NULL values.
+In the multivariate case, we can't really do that because the rows may contain
+a mix of NULL and non-NULL values in different columns (so we can't simply
+filter all of them out).
+
+For this reason, the histograms are built so that in each bucket, every
+dimension contains either only NULL or only non-NULL values. Building the
+NULL-buckets happens as the first step of the build, in the
+create_null_buckets() function. The number of NULL buckets produced by this
+function has a clear upper bound of 2^N, where N is the number of dimensions
+(attributes the histogram is built on) - or rather 2^K, where K is the number
+of attributes not marked as NOT NULL. For example with N=2 attributes (a,b)
+there may be up to four such buckets: (NULL, NULL), (NULL, non-NULL),
+(non-NULL, NULL) and (non-NULL, non-NULL).
+
+The buckets with NULL dimensions are then subject to the same build algorithm
+(i.e. may be split into smaller buckets) just like any other bucket, but may
+only be split by a non-NULL dimension.
+
+
+Serialization
+-------------
+
+To store the histogram in the pg_mv_statistic catalog, it is serialized into a
+more efficient form. We also use this serialized representation during
+estimation, i.e. we don't fully deserialize the histogram.
+
+For example the boundary values are deduplicated to minimize the required space.
+How much redundancy is there, actually? Let's assume there are no NULL values,
+so we start with a single bucket - in that case we have 2*N boundaries. Each
+time we split a bucket we introduce one new value (in the "middle" of one of
+the dimensions), and keep boundaries for all the other dimensions. So after K
+splits, we have up to
+
+ 2*N + K
+
+unique boundary values (we may have fewer values, if the same value is used for
+several splits). But after K splits we do have (K+1) buckets, so
+
+ (K+1) * 2 * N
+
+boundary values. Using e.g. N=4 and K=999, we arrive at these numbers:
+
+ 2*N + K = 1007
+ (K+1) * 2 * N = 8000
+
+which means a lot of redundancy. It's somewhat counter-intuitive that the number
+of distinct values does not really depend on the number of dimensions (except
+for the initial bucket, but that's negligible compared to the total).
+
+By deduplicating the values and replacing them with 16-bit indexes (uint16), we
+reduce the required space to
+
+ 1007 * 8 + 8000 * 2 ~= 24kB
+
+which is significantly less than 64kB required for the 'raw' histogram (assuming
+the values are 8B).
+
+While the bytea compression (pglz) might achieve the same reduction of space,
+the deduplicated representation is used to optimize the estimation by caching
+results of function calls for already visited values. This significantly
+reduces the number of calls to (often quite expensive) operators.
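+
+For example (an illustrative sketch, not the actual code - opproc, constval,
+dim and idx stand for pieces of the surrounding estimation code): because
+each boundary is a uint16 index into the deduplicated array, the result of
+evaluating an operator on a boundary value can be cached in an array indexed
+the same way:
+
+    /* one slot per deduplicated value; 0 = not evaluated, 1 = true, 2 = false */
+    char *cache = palloc0(mvhist->nvalues[dim]);
+
+    if (cache[idx] == 0)
+        cache[idx] = DatumGetBool(FunctionCall2Coll(&opproc,
+                                      DEFAULT_COLLATION_OID,
+                                      mvhist->values[dim][idx],
+                                      constval)) ? 1 : 2;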
+
+Note: Of course, this reasoning only holds for histograms built by the algorithm
+that simply splits the buckets in half. Other histograms types (e.g. containing
+overlapping buckets) may behave differently and require different serialization.
+
+Serialized histograms are marked with 'magic' constant, to make it easier to
+check the bytea value really is a serialized histogram.
+
+
+varlena compression
+-------------------
+
+This serialization may however disable automatic varlena compression, because
+the array of unique values is placed at the beginning of the serialized form.
+That is exactly the chunk pglz examines to decide whether the data is
+compressible, and it will probably conclude it's not very compressible. This is
+similar to the issue we initially had with JSONB.
+
+Maybe storing buckets first would make it work, as the buckets may be better
+compressible.
+
+On the other hand the serialization is actually a context-aware compression,
+usually compressing to ~30% (or even less, with large data types). So the lack
+of additional pglz compression may be acceptable.
+
+
+Deserialization
+---------------
+
+The deserialization is not a perfect inverse of the serialization, as we keep
+the deduplicated arrays. This reduces the amount of memory and also allows
+optimizations during estimation (e.g. we can cache results for the distinct
+values, saving expensive function calls).
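+
+So to get the actual lower boundary of a bucket in dimension 'dim', a single
+extra array lookup is needed (using the names from histogram.c):
+
+    Datum minval = histogram->values[dim][bucket->min[dim]];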
+
+
+Inspecting the histogram
+------------------------
+
+Inspecting the regular (per-attribute) histograms is trivial, as it's enough
+to select the columns from pg_stats - the data is encoded as anyarray, so we
+simply get the text representation of the array.
+
+With multivariate histograms it's not that simple due to the possible mix of
+data types in the histogram. It might be possible to produce similar array-like
+text representation, but that'd unnecessarily complicate further processing
+and analysis of the histogram. Instead, there's a SRF function that allows
+access to lower/upper boundaries, frequencies etc.
+
+ SELECT * FROM pg_mv_histogram_buckets(oid, otype);
+
+It has two input parameters:
+
+ oid - OID of the histogram (pg_mv_statistic.staoid)
+ otype - type of output
+
+and produces a table with these columns:
+
+ - bucket ID (0...nbuckets-1)
+ - lower bucket boundaries (string array)
+ - upper bucket boundaries (string array)
+ - nulls only dimensions (boolean array)
+ - lower boundary inclusive (boolean array)
+ - upper boundary inclusive (boolean array)
+ - frequency (double precision)
+
+The 'otype' accepts three values, determining what will be returned in the
+lower/upper boundary arrays:
+
+ - 0 - values stored in the histogram, encoded as text
+ - 1 - indexes into the deduplicated arrays
+ - 2 - indexes into the deduplicated arrays, scaled to [0,1]
diff --git a/src/backend/utils/mvstats/README.stats b/src/backend/utils/mvstats/README.stats
index 5c5c59a..3e4f4d1 100644
--- a/src/backend/utils/mvstats/README.stats
+++ b/src/backend/utils/mvstats/README.stats
@@ -18,6 +18,8 @@ Currently we only have two kinds of multivariate statistics
(b) MCV lists (README.mcv)
+ (c) multivariate histograms (README.histogram)
+
Compatible clause types
-----------------------
diff --git a/src/backend/utils/mvstats/common.c b/src/backend/utils/mvstats/common.c
index d1da714..ffb76f4 100644
--- a/src/backend/utils/mvstats/common.c
+++ b/src/backend/utils/mvstats/common.c
@@ -13,11 +13,11 @@
*
*-------------------------------------------------------------------------
*/
+#include "postgres.h"
+#include "utils/array.h"
#include "common.h"
-#include "utils/array.h"
-
static VacAttrStats ** lookup_var_attr_stats(int2vector *attrs,
int natts,
VacAttrStats **vacattrstats);
@@ -52,7 +52,8 @@ build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
MVStatisticInfo *stat = (MVStatisticInfo *)lfirst(lc);
MVDependencies deps = NULL;
MCVList mcvlist = NULL;
- int numrows_filtered = 0;
+ MVHistogram histogram = NULL;
+ int numrows_filtered = numrows;
VacAttrStats **stats = NULL;
int numatts = 0;
@@ -95,8 +96,12 @@ build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
if (stat->mcv_enabled)
mcvlist = build_mv_mcvlist(numrows, rows, attrs, stats, &numrows_filtered);
+ /* build a multivariate histogram on the columns */
+ if ((numrows_filtered > 0) && (stat->hist_enabled))
+ histogram = build_mv_histogram(numrows_filtered, rows, attrs, stats, numrows);
+
/* store the histogram / MCV list in the catalog */
- update_mv_stats(stat->mvoid, deps, mcvlist, attrs, stats);
+ update_mv_stats(stat->mvoid, deps, mcvlist, histogram, attrs, stats);
}
}
@@ -176,6 +181,8 @@ list_mv_stats(Oid relid)
info->deps_built = stats->deps_built;
info->mcv_enabled = stats->mcv_enabled;
info->mcv_built = stats->mcv_built;
+ info->hist_enabled = stats->hist_enabled;
+ info->hist_built = stats->hist_built;
result = lappend(result, info);
}
@@ -190,7 +197,6 @@ list_mv_stats(Oid relid)
return result;
}
-
/*
* Find attnims of MV stats using the mvoid.
*/
@@ -236,9 +242,16 @@ find_mv_attnums(Oid mvoid, Oid *relid)
}
+/*
+ * FIXME This adds statistics, but we need to drop statistics when the
+ * table is dropped. Not sure what to do when a column is dropped.
+ * Either we can (a) remove all stats on that column, (b) remove
+ * the column from defined stats and force rebuild, (c) remove the
+ * column on next ANALYZE. Or maybe something else?
+ */
void
update_mv_stats(Oid mvoid,
- MVDependencies dependencies, MCVList mcvlist,
+ MVDependencies dependencies, MCVList mcvlist, MVHistogram histogram,
int2vector *attrs, VacAttrStats **stats)
{
HeapTuple stup,
@@ -271,22 +284,34 @@ update_mv_stats(Oid mvoid,
values[Anum_pg_mv_statistic_stamcv - 1] = PointerGetDatum(data);
}
+ if (histogram != NULL)
+ {
+ bytea * data = serialize_mv_histogram(histogram, attrs, stats);
+ nulls[Anum_pg_mv_statistic_stahist-1] = (data == NULL);
+ values[Anum_pg_mv_statistic_stahist - 1]
+ = PointerGetDatum(data);
+ }
+
/* always replace the value (either by bytea or NULL) */
replaces[Anum_pg_mv_statistic_stadeps -1] = true;
replaces[Anum_pg_mv_statistic_stamcv -1] = true;
+ replaces[Anum_pg_mv_statistic_stahist-1] = true;
/* always change the availability flags */
nulls[Anum_pg_mv_statistic_deps_built -1] = false;
nulls[Anum_pg_mv_statistic_mcv_built -1] = false;
+ nulls[Anum_pg_mv_statistic_hist_built-1] = false;
nulls[Anum_pg_mv_statistic_stakeys-1] = false;
/* use the new attnums, in case we removed some dropped ones */
replaces[Anum_pg_mv_statistic_deps_built-1] = true;
replaces[Anum_pg_mv_statistic_mcv_built -1] = true;
+ replaces[Anum_pg_mv_statistic_hist_built -1] = true;
replaces[Anum_pg_mv_statistic_stakeys -1] = true;
values[Anum_pg_mv_statistic_deps_built-1] = BoolGetDatum(dependencies != NULL);
values[Anum_pg_mv_statistic_mcv_built -1] = BoolGetDatum(mcvlist != NULL);
+ values[Anum_pg_mv_statistic_hist_built -1] = BoolGetDatum(histogram != NULL);
values[Anum_pg_mv_statistic_stakeys -1] = PointerGetDatum(attrs);
/* Is there already a pg_mv_statistic tuple for this attribute? */
diff --git a/src/backend/utils/mvstats/histogram.c b/src/backend/utils/mvstats/histogram.c
new file mode 100644
index 0000000..9e5620a
--- /dev/null
+++ b/src/backend/utils/mvstats/histogram.c
@@ -0,0 +1,2032 @@
+/*-------------------------------------------------------------------------
+ *
+ * histogram.c
+ * POSTGRES multivariate histograms
+ *
+ *
+ * Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ * src/backend/utils/mvstats/histogram.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres.h"
+
+#include "fmgr.h"
+#include "funcapi.h"
+
+#include "utils/lsyscache.h"
+
+#include "common.h"
+#include <math.h>
+
+
+static MVBucket create_initial_mv_bucket(int numrows, HeapTuple *rows,
+ int2vector *attrs,
+ VacAttrStats **stats);
+
+static MVBucket select_bucket_to_partition(int nbuckets, MVBucket * buckets);
+
+static MVBucket partition_bucket(MVBucket bucket, int2vector *attrs,
+ VacAttrStats **stats,
+ int *ndistvalues, Datum **distvalues);
+
+static MVBucket copy_mv_bucket(MVBucket bucket, uint32 ndimensions);
+
+static void update_bucket_ndistinct(MVBucket bucket, int2vector *attrs,
+ VacAttrStats ** stats);
+
+static void update_dimension_ndistinct(MVBucket bucket, int dimension,
+ int2vector *attrs,
+ VacAttrStats ** stats,
+ bool update_boundaries);
+
+static void create_null_buckets(MVHistogram histogram, int bucket_idx,
+ int2vector *attrs, VacAttrStats ** stats);
+
+static int bsearch_comparator(const void * a, const void * b);
+
+/*
+ * Each serialized bucket needs to store (in this order):
+ *
+ * - number of tuples (float)
+ * - min inclusive flags (ndim * sizeof(bool))
+ * - max inclusive flags (ndim * sizeof(bool))
+ * - null dimension flags (ndim * sizeof(bool))
+ * - min boundary indexes (ndim * sizeof(uint16))
+ * - max boundary indexes (ndim * sizeof(uint16))
+ *
+ * So in total:
+ *
+ * ndim * (2 * sizeof(uint16) + 3 * sizeof(bool)) + sizeof(float)
+ */
+#define BUCKET_SIZE(ndims) \
+ (ndims * (2 * sizeof(uint16) + 3 * sizeof(bool)) + sizeof(float))
+
+/* pointers into a flat serialized bucket of BUCKET_SIZE(n) bytes */
+#define BUCKET_NTUPLES(b) ((float*)b)
+#define BUCKET_MIN_INCL(b,n) ((bool*)(b + sizeof(float)))
+#define BUCKET_MAX_INCL(b,n) (BUCKET_MIN_INCL(b,n) + n)
+#define BUCKET_NULLS_ONLY(b,n) (BUCKET_MAX_INCL(b,n) + n)
+#define BUCKET_MIN_INDEXES(b,n) ((uint16*)(BUCKET_NULLS_ONLY(b,n) + n))
+#define BUCKET_MAX_INDEXES(b,n) ((BUCKET_MIN_INDEXES(b,n) + n))
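+
+/*
+ * For example, with ndims=2 a serialized bucket has this layout:
+ *
+ * float  ntuples       (bucket frequency)
+ * bool   min_incl[2]   (min boundaries inclusive?)
+ * bool   max_incl[2]   (max boundaries inclusive?)
+ * bool   nulls_only[2] (NULL-only dimensions?)
+ * uint16 min_idx[2]    (indexes of min boundaries)
+ * uint16 max_idx[2]    (indexes of max boundaries)
+ */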
+
+/* can't split bucket with less than 10 rows */
+#define MIN_BUCKET_ROWS 10
+
+/*
+ * Data used while building the histogram.
+ */
+typedef struct HistogramBuildData {
+
+ float ndistinct; /* frequency of distinct values */
+
+ HeapTuple *rows; /* array of sample rows */
+ uint32 numrows; /* number of sample rows (array size) */
+
+ /*
+ * Number of distinct values in each dimension. This is used when
+ * building the histogram (and is not serialized/deserialized).
+ */
+ uint32 *ndistincts;
+
+} HistogramBuildData;
+
+typedef HistogramBuildData *HistogramBuild;
+
+/*
+ * Building a multivariate histogram. In short, we first create a single
+ * bucket containing all the rows, and then repeatedly split it, each time
+ * searching for the bucket / dimension most in need of a split.
+ *
+ * The current criterion is rather simple, chosen so that the algorithm
+ * produces buckets of about equal frequency and regular size.
+ *
+ * See the discussion at select_bucket_to_partition and partition_bucket
+ * for more details about the algorithm.
+ *
+ * The current algorithm works like this:
+ *
+ * build NULL-buckets (create_null_buckets)
+ *
+ * while [not reaching maximum number of buckets]
+ *
+ * choose bucket to partition (largest bucket)
+ * if no bucket to partition
+ * terminate the algorithm
+ *
+ * choose bucket dimension to partition (largest dimension)
+ * split the bucket into two buckets
+ */
+MVHistogram
+build_mv_histogram(int numrows, HeapTuple *rows, int2vector *attrs,
+ VacAttrStats **stats, int numrows_total)
+{
+ int i;
+ int numattrs = attrs->dim1;
+
+ int *ndistvalues;
+ Datum **distvalues;
+
+ MVHistogram histogram = (MVHistogram)palloc0(sizeof(MVHistogramData));
+
+ HeapTuple * rows_copy = (HeapTuple*)palloc0(numrows * sizeof(HeapTuple));
+ memcpy(rows_copy, rows, sizeof(HeapTuple) * numrows);
+
+ Assert((numattrs >= 2) && (numattrs <= MVSTATS_MAX_DIMENSIONS));
+
+ histogram->ndimensions = numattrs;
+
+ histogram->magic = MVSTAT_HIST_MAGIC;
+ histogram->type = MVSTAT_HIST_TYPE_BASIC;
+ histogram->nbuckets = 1;
+
+ /* allocate the maximum number of buckets up front (simpler than repalloc for short-lived objects) */
+ histogram->buckets
+ = (MVBucket*)palloc0(MVSTAT_HIST_MAX_BUCKETS * sizeof(MVBucket));
+
+ /* create the initial bucket, covering the whole sample set */
+ histogram->buckets[0]
+ = create_initial_mv_bucket(numrows, rows_copy, attrs, stats);
+
+ /*
+ * Collect info on distinct values in each dimension (used later
+ * to select dimension to partition).
+ */
+ ndistvalues = (int*)palloc0(sizeof(int) * numattrs);
+ distvalues = (Datum**)palloc0(sizeof(Datum*) * numattrs);
+
+ for (i = 0; i < numattrs; i++)
+ {
+ int j;
+ int nvals;
+ Datum *tmp;
+
+ SortSupportData ssup;
+ StdAnalyzeData *mystats = (StdAnalyzeData *) stats[i]->extra_data;
+
+ /* initialize sort support, etc. */
+ memset(&ssup, 0, sizeof(ssup));
+ ssup.ssup_cxt = CurrentMemoryContext;
+
+ /* We always use the default collation for statistics */
+ ssup.ssup_collation = DEFAULT_COLLATION_OID;
+ ssup.ssup_nulls_first = false;
+
+ PrepareSortSupportFromOrderingOp(mystats->ltopr, &ssup);
+
+ nvals = 0;
+ tmp = (Datum*)palloc0(sizeof(Datum) * numrows);
+
+ for (j = 0; j < numrows; j++)
+ {
+ bool isnull;
+
+ /* fetch the value, skipping rows with NULL in this attribute */
+ Datum value = heap_getattr(rows[j], attrs->values[i],
+ stats[i]->tupDesc, &isnull);
+
+ if (isnull)
+ continue;
+
+ tmp[nvals++] = value;
+ }
+
+ /* do the sort and stuff only if there are non-NULL values */
+ if (nvals > 0)
+ {
+ /* sort the array of values */
+ qsort_arg((void *) tmp, nvals, sizeof(Datum),
+ compare_scalars_simple, (void *) &ssup);
+
+ /* count distinct values */
+ ndistvalues[i] = 1;
+ for (j = 1; j < nvals; j++)
+ if (compare_scalars_simple(&tmp[j], &tmp[j-1], &ssup) != 0)
+ ndistvalues[i] += 1;
+
+ /* we counted the distinct values above, so allocate exactly that many */
+ distvalues[i] = (Datum*)palloc0(sizeof(Datum) * ndistvalues[i]);
+
+ /* now collect distinct values into the array */
+ distvalues[i][0] = tmp[0];
+ ndistvalues[i] = 1;
+
+ for (j = 1; j < nvals; j++)
+ {
+ if (compare_scalars_simple(&tmp[j], &tmp[j-1], &ssup) != 0)
+ {
+ distvalues[i][ndistvalues[i]] = tmp[j];
+ ndistvalues[i] += 1;
+ }
+ }
+ }
+
+ pfree(tmp);
+ }
+
+ /*
+ * The initial bucket may contain NULL values, so we have to create
+ * buckets with NULL-only dimensions.
+ *
+ * FIXME We may need up to 2^ndims buckets - check that there are
+ * enough buckets (MVSTAT_HIST_MAX_BUCKETS >= 2^ndims).
+ */
+ create_null_buckets(histogram, 0, attrs, stats);
+
+ while (histogram->nbuckets < MVSTAT_HIST_MAX_BUCKETS)
+ {
+ MVBucket bucket = select_bucket_to_partition(histogram->nbuckets,
+ histogram->buckets);
+
+ /* no more buckets to partition */
+ if (bucket == NULL)
+ break;
+
+ histogram->buckets[histogram->nbuckets]
+ = partition_bucket(bucket, attrs, stats,
+ ndistvalues, distvalues);
+
+ histogram->nbuckets += 1;
+ }
+
+ /* finalize the frequencies etc. */
+ for (i = 0; i < histogram->nbuckets; i++)
+ {
+ HistogramBuild build_data
+ = ((HistogramBuild)histogram->buckets[i]->build_data);
+
+ /*
+ * The frequency has to be computed from the whole sample, in
+ * case some of the rows were used for MCV (and thus are missing
+ * from the histogram).
+ */
+ histogram->buckets[i]->ntuples
+ = (build_data->numrows * 1.0) / numrows_total;
+ }
+
+ return histogram;
+}
+
+/* fetch the histogram (as a bytea) from the pg_mv_statistic catalog */
+MVSerializedHistogram
+load_mv_histogram(Oid mvoid)
+{
+ bool isnull = false;
+ Datum histogram;
+
+#ifdef USE_ASSERT_CHECKING
+ Form_pg_mv_statistic mvstat;
+#endif
+
+ /* Fetch the pg_mv_statistic tuple for the given statistics OID. */
+ HeapTuple htup = SearchSysCache1(MVSTATOID, ObjectIdGetDatum(mvoid));
+
+ if (! HeapTupleIsValid(htup))
+ return NULL;
+
+#ifdef USE_ASSERT_CHECKING
+ mvstat = (Form_pg_mv_statistic) GETSTRUCT(htup);
+ Assert(mvstat->hist_enabled && mvstat->hist_built);
+#endif
+
+ histogram = SysCacheGetAttr(MVSTATOID, htup,
+ Anum_pg_mv_statistic_stahist, &isnull);
+
+ Assert(!isnull);
+
+ ReleaseSysCache(htup);
+
+ return deserialize_mv_histogram(DatumGetByteaP(histogram));
+}
+
+/* print some basic info about the histogram */
+Datum
+pg_mv_stats_histogram_info(PG_FUNCTION_ARGS)
+{
+ bytea *data = PG_GETARG_BYTEA_P(0);
+ char *result;
+
+ MVSerializedHistogram hist = deserialize_mv_histogram(data);
+
+ result = palloc0(128);
+ snprintf(result, 128, "nbuckets=%d", hist->nbuckets);
+
+ PG_RETURN_TEXT_P(cstring_to_text(result));
+}
+
+
+/* used to pass sort-support context into bsearch_comparator() */
+static SortSupport ssup_private = NULL;
+
+/*
+ * Serialize the MV histogram into a bytea value. The basic algorithm is quite
+ * simple, and mostly mimics the MCV serialization:
+ *
+ * (1) perform deduplication for each attribute (separately)
+ *
+ * (a) collect all (non-NULL) attribute values from all buckets
+ * (b) sort the data (using 'lt' from VacAttrStats)
+ * (c) remove duplicate values from the array
+ *
+ * (2) serialize the arrays into a bytea value
+ *
+ * (3) process all buckets
+ *
+ * (a) replace min/max values with indexes into the arrays
+ *
+ * Each attribute has to be processed separately, as we're mixing different
+ * datatypes, and we need to use the right operators to compare/sort them.
+ * We're also mixing pass-by-value and pass-by-ref types, and so on.
+ *
+ *
+ * FIXME This probably leaks memory, or at least uses it inefficiently
+ * (many small palloc() calls instead of a large one).
+ *
+ * TODO Consider packing boolean flags (NULL) for each item into 'char'
+ * or a longer type (instead of using an array of bool items).
+ */
+bytea *
+serialize_mv_histogram(MVHistogram histogram, int2vector *attrs,
+ VacAttrStats **stats)
+{
+ int i = 0, j = 0;
+ Size total_length = 0;
+
+ bytea *output = NULL;
+ char *data = NULL;
+
+ int nbuckets = histogram->nbuckets;
+ int ndims = histogram->ndimensions;
+
+ /* buffer for a single serialized bucket */
+ int bucketsize = BUCKET_SIZE(ndims);
+ char *bucket = palloc0(bucketsize);
+
+ /* values per dimension (and number of non-NULL values) */
+ Datum **values = (Datum**)palloc0(sizeof(Datum*) * ndims);
+ int *counts = (int*)palloc0(sizeof(int) * ndims);
+
+ /* info about dimensions (for deserialize) */
+ DimensionInfo * info
+ = (DimensionInfo *)palloc0(sizeof(DimensionInfo)*ndims);
+
+ /* sort support data */
+ SortSupport ssup = (SortSupport)palloc0(sizeof(SortSupportData)*ndims);
+
+ /* collect and deduplicate values for each dimension separately */
+ for (i = 0; i < ndims; i++)
+ {
+ int count;
+ StdAnalyzeData *tmp = (StdAnalyzeData *)stats[i]->extra_data;
+
+ /* keep important info about the data type */
+ info[i].typlen = stats[i]->attrtype->typlen;
+ info[i].typbyval = stats[i]->attrtype->typbyval;
+
+ /*
+ * Allocate space for all min/max values, including NULLs
+ * (we won't use them, but we don't know how many there are),
+ * and then collect all non-NULL values.
+ */
+ values[i] = (Datum*)palloc0(sizeof(Datum) * nbuckets * 2);
+
+ for (j = 0; j < histogram->nbuckets; j++)
+ {
+ /* skip buckets where this dimension is NULL-only */
+ if (! histogram->buckets[j]->nullsonly[i])
+ {
+ values[i][counts[i]] = histogram->buckets[j]->min[i];
+ counts[i] += 1;
+
+ values[i][counts[i]] = histogram->buckets[j]->max[i];
+ counts[i] += 1;
+ }
+ }
+
+ /* there are just NULL values in this dimension */
+ if (counts[i] == 0)
+ continue;
+
+ /* sort and deduplicate */
+ ssup[i].ssup_cxt = CurrentMemoryContext;
+ ssup[i].ssup_collation = DEFAULT_COLLATION_OID;
+ ssup[i].ssup_nulls_first = false;
+
+ PrepareSortSupportFromOrderingOp(tmp->ltopr, &ssup[i]);
+
+ qsort_arg(values[i], counts[i], sizeof(Datum),
+ compare_scalars_simple, &ssup[i]);
+
+ /*
+ * Walk through the array and eliminate duplicate values, but
+ * keep the ordering (so that we can do bsearch later). We know
+ * there's at least 1 item, so we can skip the first element.
+ */
+ count = 1; /* number of deduplicated items */
+ for (j = 1; j < counts[i]; j++)
+ {
+ /* if it's different from the previous value, we need to keep it */
+ if (compare_datums_simple(values[i][j-1], values[i][j], &ssup[i]) != 0)
+ {
+ /* XXX: not needed if (count == j) */
+ values[i][count] = values[i][j];
+ count += 1;
+ }
+ }
+
+ /* make sure we fit into uint16 */
+ Assert(count <= UINT16_MAX);
+
+ /* keep info about the deduplicated count */
+ info[i].nvalues = count;
+
+ /* compute size of the serialized data */
+ if (info[i].typlen > 0)
+ /* byval or byref, but with fixed length (name, tid, ...) */
+ info[i].nbytes = info[i].nvalues * info[i].typlen;
+ else if (info[i].typlen == -1)
+ /* varlena, so just use VARSIZE_ANY */
+ for (j = 0; j < info[i].nvalues; j++)
+ info[i].nbytes += VARSIZE_ANY(values[i][j]);
+ else if (info[i].typlen == -2)
+ /* cstring, so simply strlen */
+ for (j = 0; j < info[i].nvalues; j++)
+ info[i].nbytes += strlen(DatumGetPointer(values[i][j]));
+ else
+ elog(ERROR, "unknown data type typbyval=%d typlen=%d",
+ info[i].typbyval, info[i].typlen);
+ }
+
+ /*
+ * Now we finally know how much space we'll need for the serialized
+ * histogram, as it contains these fields:
+ *
+ * - length (4B) for varlena
+ * - magic (4B)
+ * - type (4B)
+ * - ndimensions (4B)
+ * - nbuckets (4B)
+ * - info (ndim * sizeof(DimensionInfo))
+ * - arrays of values for each dimension
+ * - serialized buckets (nbuckets * bucketsize)
+ *
+ * So the 'header' size is 20B + ndim * sizeof(DimensionInfo) and
+ * then we'll place the data (and buckets).
+ */
+ total_length = (sizeof(int32) + offsetof(MVHistogramData, buckets)
+ + ndims * sizeof(DimensionInfo)
+ + nbuckets * bucketsize);
+
+ /* account for the deduplicated data */
+ for (i = 0; i < ndims; i++)
+ total_length += info[i].nbytes;
+
+ /* enforce arbitrary limit of 10MB */
+ if (total_length > (10 * 1024 * 1024))
+ elog(ERROR, "serialized histogram exceeds 10MB (%ld > %d)",
+ total_length, (10 * 1024 * 1024));
+
+ /* allocate space for the serialized histogram list, set header */
+ output = (bytea*)palloc0(total_length);
+ SET_VARSIZE(output, total_length);
+
+ /* we'll use 'data' to keep track of the place to write data */
+ data = VARDATA(output);
+
+ memcpy(data, histogram, offsetof(MVHistogramData, buckets));
+ data += offsetof(MVHistogramData, buckets);
+
+ memcpy(data, info, sizeof(DimensionInfo) * ndims);
+ data += sizeof(DimensionInfo) * ndims;
+
+ /* value array for each dimension */
+ for (i = 0; i < ndims; i++)
+ {
+#ifdef USE_ASSERT_CHECKING
+ char *tmp = data;
+#endif
+ for (j = 0; j < info[i].nvalues; j++)
+ {
+ if (info[i].typlen > 0)
+ {
+ /* passed by value or by reference, but fixed length */
+ memcpy(data, &values[i][j], info[i].typlen);
+ data += info[i].typlen;
+ }
+ else if (info[i].typlen == -1)
+ {
+ /* varlena */
+ memcpy(data, DatumGetPointer(values[i][j]),
+ VARSIZE_ANY(values[i][j]));
+ data += VARSIZE_ANY(values[i][j]);
+ }
+ else if (info[i].typlen == -2)
+ {
+ /* cstring (don't forget the \0 terminator!) */
+ memcpy(data, DatumGetPointer(values[i][j]),
+ strlen(DatumGetPointer(values[i][j])) + 1);
+ data += strlen(DatumGetPointer(values[i][j])) + 1;
+ }
+ }
+ Assert((data - tmp) == info[i].nbytes);
+ }
+
+ /* and finally, the histogram buckets */
+ for (i = 0; i < nbuckets; i++)
+ {
+ /* don't write beyond the allocated space */
+ Assert(data <= (char*)output + total_length - bucketsize);
+
+ /* reset the values for each item */
+ memset(bucket, 0, bucketsize);
+
+ *BUCKET_NTUPLES(bucket) = histogram->buckets[i]->ntuples;
+
+ for (j = 0; j < ndims; j++)
+ {
+ /* do the lookup only for non-NULL values */
+ if (! histogram->buckets[i]->nullsonly[j])
+ {
+ uint16 idx;
+ Datum * v = NULL;
+ ssup_private = &ssup[j];
+
+ /* min boundary */
+ v = (Datum*)bsearch(&histogram->buckets[i]->min[j],
+ values[j], info[j].nvalues, sizeof(Datum),
+ bsearch_comparator);
+
+ if (v == NULL)
+ elog(ERROR, "value for dim %d not found in array", j);
+
+ /* compute index within the array */
+ idx = (v - values[j]);
+
+ Assert((idx >= 0) && (idx < info[j].nvalues));
+
+ BUCKET_MIN_INDEXES(bucket, ndims)[j] = idx;
+
+ /* max boundary */
+ v = (Datum*)bsearch(&histogram->buckets[i]->max[j],
+ values[j], info[j].nvalues, sizeof(Datum),
+ bsearch_comparator);
+
+ if (v == NULL)
+ elog(ERROR, "value for dim %d not found in array", j);
+
+ /* compute index within the array */
+ idx = (v - values[j]);
+
+ Assert((idx >= 0) && (idx < info[j].nvalues));
+
+ BUCKET_MAX_INDEXES(bucket, ndims)[j] = idx;
+ }
+ }
+
+ /* copy flags (nulls, min/max inclusive) */
+ memcpy(BUCKET_NULLS_ONLY(bucket, ndims),
+ histogram->buckets[i]->nullsonly, sizeof(bool) * ndims);
+
+ memcpy(BUCKET_MIN_INCL(bucket, ndims),
+ histogram->buckets[i]->min_inclusive, sizeof(bool) * ndims);
+
+ memcpy(BUCKET_MAX_INCL(bucket, ndims),
+ histogram->buckets[i]->max_inclusive, sizeof(bool) * ndims);
+
+ /* copy the item into the array */
+ memcpy(data, bucket, bucketsize);
+
+ data += bucketsize;
+ }
+
+ /* at this point we expect to match the total_length exactly */
+ Assert((data - (char*)output) == total_length);
+
+ /* FIXME free the values/counts arrays here */
+
+ return output;
+}
+
+/*
+ * Returns histogram in a partially-serialized form (keeps the boundary
+ * values deduplicated, so that it's possible to optimize the estimation
+ * part by caching function call results between buckets etc.).
+ */
+MVSerializedHistogram
+deserialize_mv_histogram(bytea * data)
+{
+ int i = 0, j = 0;
+
+ Size expected_size;
+ char *tmp = NULL;
+
+ MVSerializedHistogram histogram;
+ DimensionInfo *info;
+
+ int nbuckets;
+ int ndims;
+ int bucketsize;
+
+ /* temporary deserialization buffer */
+ int bufflen;
+ char *buff;
+ char *ptr;
+
+ if (data == NULL)
+ return NULL;
+
+ if (VARSIZE_ANY_EXHDR(data) < offsetof(MVSerializedHistogramData,buckets))
+ elog(ERROR, "invalid histogram size %ld (expected at least %ld)",
+ VARSIZE_ANY_EXHDR(data), offsetof(MVSerializedHistogramData,buckets));
+
+ /* read the histogram header */
+ histogram
+ = (MVSerializedHistogram)palloc(sizeof(MVSerializedHistogramData));
+
+ /* initialize pointer to the data part (skip the varlena header) */
+ tmp = VARDATA(data);
+
+ /* get the header and perform basic sanity checks */
+ memcpy(histogram, tmp, offsetof(MVSerializedHistogramData, buckets));
+ tmp += offsetof(MVSerializedHistogramData, buckets);
+
+ if (histogram->magic != MVSTAT_HIST_MAGIC)
+ elog(ERROR, "invalid histogram magic %d (expected %dd)",
+ histogram->magic, MVSTAT_HIST_MAGIC);
+
+ if (histogram->type != MVSTAT_HIST_TYPE_BASIC)
+ elog(ERROR, "invalid histogram type %d (expected %dd)",
+ histogram->type, MVSTAT_HIST_TYPE_BASIC);
+
+ nbuckets = histogram->nbuckets;
+ ndims = histogram->ndimensions;
+ bucketsize = BUCKET_SIZE(ndims);
+
+ Assert((nbuckets > 0) && (nbuckets <= MVSTAT_HIST_MAX_BUCKETS));
+ Assert((ndims >= 2) && (ndims <= MVSTATS_MAX_DIMENSIONS));
+
+ /*
+ * Compute the size we expect with these parameters. It's incomplete
+ * at this point, as we still have to add the sizes of the value
+ * arrays (from the DimensionInfo records).
+ */
+ expected_size = offsetof(MVSerializedHistogramData,buckets) +
+ ndims * sizeof(DimensionInfo) +
+ (nbuckets * bucketsize);
+
+ /* check that we have at least the DimensionInfo records */
+ if (VARSIZE_ANY_EXHDR(data) < expected_size)
+ elog(ERROR, "invalid histogram size %ld (expected %ld)",
+ VARSIZE_ANY_EXHDR(data), expected_size);
+
+ info = (DimensionInfo*)(tmp);
+ tmp += ndims * sizeof(DimensionInfo);
+
+ /* account for the value arrays */
+ for (i = 0; i < ndims; i++)
+ expected_size += info[i].nbytes;
+
+ if (VARSIZE_ANY_EXHDR(data) != expected_size)
+ elog(ERROR, "invalid histogram size %ld (expected %ld)",
+ VARSIZE_ANY_EXHDR(data), expected_size);
+
+ /* looks OK - the data does not seem corrupted */
+
+ /* now let's allocate a single buffer for all the values and counts */
+
+ bufflen = (sizeof(int) + sizeof(Datum*)) * ndims;
+ for (i = 0; i < ndims; i++)
+ {
+ /* don't allocate space for byval types with size matching Datum */
+ if (! (info[i].typbyval && (info[i].typlen == sizeof(Datum))))
+ bufflen += (sizeof(Datum) * info[i].nvalues);
+ }
+
+ /* also, include space for the result, tracking the buckets */
+ bufflen += nbuckets * (
+ sizeof(MVSerializedBucket) + /* bucket pointer */
+ sizeof(MVSerializedBucketData)); /* bucket data */
+
+ buff = palloc0(bufflen);
+ ptr = buff;
+
+ histogram->nvalues = (int*)ptr;
+ ptr += (sizeof(int) * ndims);
+
+ histogram->values = (Datum**)ptr;
+ ptr += (sizeof(Datum*) * ndims);
+
+ /*
+ * FIXME This uses pointers into the original data array (for types
+ * not passed by value), so if someone frees the memory,
+ * e.g. by doing something like this:
+ *
+ * bytea * data = ... fetch the data from catalog ...
+ * MVSerializedHistogram hist = deserialize_mv_histogram(data);
+ * pfree(data);
+ *
+ * then 'hist' references the freed memory. This needs to
+ * copy the pieces.
+ *
+ * TODO same as in MCV deserialization / consider moving to common.c
+ */
+ for (i = 0; i < ndims; i++)
+ {
+ histogram->nvalues[i] = info[i].nvalues;
+
+ if (info[i].typbyval && info[i].typlen == sizeof(Datum))
+ {
+ /* passed by value / Datum - simply reuse the array */
+ histogram->values[i] = (Datum*)tmp;
+ tmp += info[i].nbytes;
+ }
+ else
+ {
+ /* all the varlena data need a chunk from the buffer */
+ histogram->values[i] = (Datum*)ptr;
+ ptr += (sizeof(Datum) * info[i].nvalues);
+
+ if (info[i].typbyval)
+ {
+ /* passed by value, but smaller than Datum */
+ for (j = 0; j < info[i].nvalues; j++)
+ {
+ /* copy the value, as it's smaller than Datum */
+ memcpy(&histogram->values[i][j], tmp, info[i].typlen);
+ tmp += info[i].typlen;
+ }
+ }
+ else if (info[i].typlen > 0)
+ {
+ /* passed by reference, but fixed length (name, tid, ...) */
+ for (j = 0; j < info[i].nvalues; j++)
+ {
+ /* just point into the array */
+ histogram->values[i][j] = PointerGetDatum(tmp);
+ tmp += info[i].typlen;
+ }
+ }
+ else if (info[i].typlen == -1)
+ {
+ /* varlena */
+ for (j = 0; j < info[i].nvalues; j++)
+ {
+ /* just point into the array */
+ histogram->values[i][j] = PointerGetDatum(tmp);
+ tmp += VARSIZE_ANY(tmp);
+ }
+ }
+ else if (info[i].typlen == -2)
+ {
+ /* cstring */
+ for (j = 0; j < info[i].nvalues; j++)
+ {
+ /* just point into the array */
+ histogram->values[i][j] = PointerGetDatum(tmp);
+ tmp += (strlen(tmp) + 1); /* don't forget the \0 */
+ }
+ }
+ }
+ }
+
+ histogram->buckets = (MVSerializedBucket*)ptr;
+ ptr += (sizeof(MVSerializedBucket) * nbuckets);
+
+ for (i = 0; i < nbuckets; i++)
+ {
+ MVSerializedBucket bucket = (MVSerializedBucket)ptr;
+ ptr += sizeof(MVSerializedBucketData);
+
+ bucket->ntuples = *BUCKET_NTUPLES(tmp);
+ bucket->nullsonly = BUCKET_NULLS_ONLY(tmp, ndims);
+ bucket->min_inclusive = BUCKET_MIN_INCL(tmp, ndims);
+ bucket->max_inclusive = BUCKET_MAX_INCL(tmp, ndims);
+
+ bucket->min = BUCKET_MIN_INDEXES(tmp, ndims);
+ bucket->max = BUCKET_MAX_INDEXES(tmp, ndims);
+
+ histogram->buckets[i] = bucket;
+
+ Assert(tmp <= (char*)data + VARSIZE_ANY(data));
+
+ tmp += bucketsize;
+ }
+
+ /* at this point we expect to have consumed expected_size exactly */
+ Assert((tmp - VARDATA(data)) == expected_size);
+
+ /* we should exhaust the output buffer exactly */
+ Assert((ptr - buff) == bufflen);
+
+ return histogram;
+}
+
+/*
+ * Build the initial bucket, which will be then split into smaller ones.
+ */
+static MVBucket
+create_initial_mv_bucket(int numrows, HeapTuple *rows, int2vector *attrs,
+ VacAttrStats **stats)
+{
+ int i;
+ int numattrs = attrs->dim1;
+ HistogramBuild data = NULL;
+
+ /* TODO allocate bucket as a single piece, including all the fields. */
+ MVBucket bucket = (MVBucket)palloc0(sizeof(MVBucketData));
+
+ Assert(numrows > 0);
+ Assert(rows != NULL);
+ Assert((numattrs >= 2) && (numattrs <= MVSTATS_MAX_DIMENSIONS));
+
+ /* allocate the per-dimension arrays */
+
+ /* flags for null-only dimensions */
+ bucket->nullsonly = (bool*)palloc0(numattrs * sizeof(bool));
+
+ /* inclusiveness boundaries - lower/upper bounds */
+ bucket->min_inclusive = (bool*)palloc0(numattrs * sizeof(bool));
+ bucket->max_inclusive = (bool*)palloc0(numattrs * sizeof(bool));
+
+ /* lower/upper boundaries */
+ bucket->min = (Datum*)palloc0(numattrs * sizeof(Datum));
+ bucket->max = (Datum*)palloc0(numattrs * sizeof(Datum));
+
+ /* build-data */
+ data = (HistogramBuild)palloc0(sizeof(HistogramBuildData));
+
+ /* number of distinct values (per dimension) */
+ data->ndistincts = (uint32*)palloc0(numattrs * sizeof(uint32));
+
+ /* all the sample rows fall into the initial bucket */
+ data->numrows = numrows;
+ data->rows = rows;
+
+ bucket->build_data = data;
+
+ /*
+ * Update the number of ndistinct combinations in the bucket (which
+ * we use when selecting bucket to partition), and then number of
+ * distinct values for each partition (which we use when choosing
+ * which dimension to split).
+ */
+ update_bucket_ndistinct(bucket, attrs, stats);
+
+ /* Update ndistinct (and also set min/max) for all dimensions. */
+ for (i = 0; i < numattrs; i++)
+ update_dimension_ndistinct(bucket, i, attrs, stats, true);
+
+ return bucket;
+}
+
+/*
+ * Choose the bucket to partition next.
+ *
+ * The current criterion is rather simple, chosen so that the algorithm
+ * produces buckets of about equal frequency and regular size. We
+ * select the bucket containing the most sample rows (among those that
+ * can still be split), and then split it by the longest dimension.
+ *
+ * The distinct values are uniformly mapped to the [0,1] interval, and
+ * this is used to compute the length of each value range.
+ *
+ * NOTE: This is not the same array used for deduplication, as this
+ * contains values for all the tuples from the sample, not just
+ * the boundary values.
+ *
+ * Returns either pointer to the bucket selected to be partitioned,
+ * or NULL if there are no buckets that may be split (i.e. all buckets
+ * contain a single distinct value).
+ *
+ * TODO Consider other partitioning criteria (v-optimal, maxdiff etc.).
+ * For example use the "bucket volume" (product of dimension
+ * lengths) to select the bucket.
+ *
+ * We need buckets containing about the same number of tuples (so
+ * about the same frequency), as that limits the error when we
+ * match the bucket partially (in that case use 1/2 the bucket).
+ *
+ * We also need buckets with "regular" size, i.e. not "narrow" in
+ * some dimensions and "wide" in the others, because that makes
+ * partial matches more likely and increases the estimation error,
+ * especially when the clauses match many buckets partially. This
+ * is especially serious for OR-clauses, because in that case any
+ * of them may add the bucket as a (partial) match. With AND-clauses
+ * all the clauses have to match the bucket, which makes this issue
+ * somewhat less pressing.
+ *
+ * For example this table:
+ *
+ * CREATE TABLE t AS SELECT i AS a, i AS b
+ * FROM generate_series(1,1000000) s(i);
+ * ALTER TABLE t ADD STATISTICS (histogram) ON (a,b);
+ * ANALYZE t;
+ *
+ * It's a very specific (and perhaps artificial) example, because
+ * every bucket always has exactly the same number of distinct
+ * values in all dimensions, which makes the partitioning tricky.
+ *
+ * Then:
+ *
+ * SELECT * FROM t WHERE a < 10 AND b < 10;
+ *
+ * is estimated to return ~120 rows, while in reality it returns 9.
+ *
+ * QUERY PLAN
+ * ----------------------------------------------------------------
+ * Seq Scan on t (cost=0.00..19425.00 rows=117 width=8)
+ * (actual time=0.185..270.774 rows=9 loops=1)
+ * Filter: ((a < 10) AND (b < 10))
+ * Rows Removed by Filter: 999991
+ *
+ * while the query using OR clauses is estimated like this:
+ *
+ * QUERY PLAN
+ * ----------------------------------------------------------------
+ * Seq Scan on t (cost=0.00..19425.00 rows=8100 width=8)
+ * (actual time=0.118..189.919 rows=9 loops=1)
+ * Filter: ((a < 10) OR (b < 10))
+ * Rows Removed by Filter: 999991
+ *
+ * which is clearly much worse. This happens because the histogram
+ * contains buckets like this:
+ *
+ * bucket 592 [3 30310] [30134 30593] => [0.000233]
+ *
+ * i.e. the length of "a" dimension is (30310-3)=30307, while the
+ * length of "b" is (30593-30134)=459. So the "b" dimension is much
+ * narrower than "a". Of course, there are buckets where "b" is the
+ * wider dimension.
+ *
+ * This is partially mitigated by selecting the "longest" dimension
+ * in partition_bucket() but that only happens after we already
+ * selected the bucket. So if we never select the bucket, we can't
+ * really fix it there.
+ *
+ * The other reason why this particular example behaves so poorly
+ * is due to the way we split the partition in partition_bucket().
+ * Currently we attempt to divide the bucket into two parts with
+ * the same number of sampled tuples (frequency), but that does not
+ * work well when all the tuples are squashed on one end of the
+ * bucket (e.g. exactly at the diagonal, as a=b). In that case we
+ * split the bucket into a tiny bucket on the diagonal, and a huge
+ * remaining part of the bucket, which is still going to be narrow
+ * and we're unlikely to fix that.
+ *
+ * So perhaps we need two partitioning strategies - one aiming to
+ * split buckets with high frequency (number of sampled rows), the
+ * other aiming to split "large" buckets. And alternating between
+ * them, somehow.
+ *
+ * TODO Allowing the bucket to degenerate to a single combination of
+ * values makes it rather strange MCV list. Maybe we should use
+ * higher lower boundary, or maybe make the selection criteria
+ * more complex (e.g. consider number of rows in the bucket, etc.).
+ *
+ * That however is different from buckets 'degenerated' only for
+ * some dimensions (e.g. half of them), which is perfectly
+ * appropriate for statistics on a combination of low and high
+ * cardinality columns.
+ *
+ * TODO Consider using similar lower boundary for row count as for simple
+ * histograms, i.e. 300 tuples per bucket.
+ */
+static MVBucket
+select_bucket_to_partition(int nbuckets, MVBucket * buckets)
+{
+ int i;
+ int numrows = 0;
+ MVBucket bucket = NULL;
+
+ for (i = 0; i < nbuckets; i++)
+ {
+ HistogramBuild data = (HistogramBuild)buckets[i]->build_data;
+ /* if the number of rows is higher, use this bucket */
+ if ((data->ndistinct > 2) &&
+ (data->numrows > numrows) &&
+ (data->numrows >= MIN_BUCKET_ROWS))
+ {
+ bucket = buckets[i];
+ numrows = data->numrows;
+ }
+ }
+
+ /* may be NULL if there are no buckets with (ndistinct > 2) */
+ return bucket;
+}
+
+/*
+ * A simple bucket partitioning implementation - we choose the longest
+ * bucket dimension, measured using the array of distinct values built
+ * at the very beginning of the build.
+ *
+ * We map all the distinct values to the [0,1] interval, uniformly
+ * distributed, and then use this to measure length. It's essentially
+ * the number of distinct values within the range, normalized to [0,1].
+ *
+ * Then we choose a 'middle' value splitting the bucket into two parts
+ * with roughly the same frequency.
+ *
+ * This splits the bucket by tweaking the existing one, and returning
+ * the new bucket (essentially shrinking the existing one in-place and
+ * returning the other "half" as a new bucket). The caller is responsible
+ * for adding the new bucket into the list of buckets.
+ *
+ * There are multiple histogram variants, differing mainly in the
+ * partitioning criteria, i.e. in how to choose the bucket and the
+ * dimension most in need of a split.
+ * "rK-Hist : an R-Tree based histogram for multi-dimensional selectivity
+ * estimation" thesis by J. A. Lopez, Concordia University, p.34-37 (and
+ * possibly p. 32-34 for explanation of the terms).
+ *
+ * TODO It requires care to prevent splitting only one dimension and not
+ * splitting another one at all (which might happen easily in case
+ * of strongly dependent columns - e.g. y=x). The current algorithm
+ * minimizes this, but may still happen for perfectly dependent
+ * examples (when all the dimensions have equal length, the first
+ * one will be selected).
+ *
+ * TODO Should probably consider statistics target for the columns (e.g.
+ * to split dimensions with higher statistics target more frequently).
+ */
+static MVBucket
+partition_bucket(MVBucket bucket, int2vector *attrs,
+ VacAttrStats **stats,
+ int *ndistvalues, Datum **distvalues)
+{
+ int i;
+ int dimension;
+ int numattrs = attrs->dim1;
+
+ Datum split_value;
+ MVBucket new_bucket;
+ HistogramBuild new_data;
+
+ /* needed for sort, when looking for the split value */
+ bool isNull;
+ int nvalues = 0;
+ HistogramBuild data = (HistogramBuild)bucket->build_data;
+ StdAnalyzeData * mystats = NULL;
+ ScalarItem * values = (ScalarItem*)palloc0(data->numrows * sizeof(ScalarItem));
+ SortSupportData ssup;
+
+ /* looking for the split value */
+ int nrows = 1; /* number of rows below current value */
+ double delta;
+
+ /* needed when splitting the values */
+ HeapTuple * oldrows = data->rows;
+ int oldnrows = data->numrows;
+
+ /*
+ * We can't split buckets with a single distinct value (this also
+ * disqualifies NULL-only dimensions). Also, there have to be multiple
+ * sample rows (otherwise there couldn't be multiple distinct values).
+ */
+ Assert(data->ndistinct > 1);
+ Assert(data->numrows > 1);
+ Assert((numattrs >= 2) && (numattrs <= MVSTATS_MAX_DIMENSIONS));
+
+ /*
+ * Look for the next dimension to split.
+ */
+ delta = 0.0;
+ dimension = -1;
+
+ for (i = 0; i < numattrs; i++)
+ {
+ Datum *a, *b;
+
+ mystats = (StdAnalyzeData *) stats[i]->extra_data;
+
+ /* initialize sort support, etc. */
+ memset(&ssup, 0, sizeof(ssup));
+ ssup.ssup_cxt = CurrentMemoryContext;
+
+ /* We always use the default collation for statistics */
+ ssup.ssup_collation = DEFAULT_COLLATION_OID;
+ ssup.ssup_nulls_first = false;
+
+ PrepareSortSupportFromOrderingOp(mystats->ltopr, &ssup);
+
+ /* can't split NULL-only dimension */
+ if (bucket->nullsonly[i])
+ continue;
+
+ /* can't split dimension with a single ndistinct value */
+ if (data->ndistincts[i] <= 1)
+ continue;
+
+ /* sort support for the bsearch_comparator */
+ ssup_private = &ssup;
+
+ /* search for min boundary in the distinct list */
+ a = (Datum*)bsearch(&bucket->min[i],
+ distvalues[i], ndistvalues[i],
+ sizeof(Datum), bsearch_comparator);
+
+ b = (Datum*)bsearch(&bucket->max[i],
+ distvalues[i], ndistvalues[i],
+ sizeof(Datum), bsearch_comparator);
+
+ /* if this dimension is 'longer', partition by it */
+ if (((b-a)*1.0 / ndistvalues[i]) > delta)
+ {
+ delta = ((b-a)*1.0 / ndistvalues[i]);
+ dimension = i;
+ }
+ }
+
+ /*
+ * If we haven't found a dimension here, we've done something
+ * wrong in select_bucket_to_partition.
+ */
+ Assert(dimension != -1);
+
+ /*
+ * Walk through the selected dimension, collect and sort the values
+ * and then choose the value to use as the new boundary.
+ */
+ mystats = (StdAnalyzeData *) stats[dimension]->extra_data;
+
+ /* initialize sort support, etc. */
+ memset(&ssup, 0, sizeof(ssup));
+ ssup.ssup_cxt = CurrentMemoryContext;
+
+ /* We always use the default collation for statistics */
+ ssup.ssup_collation = DEFAULT_COLLATION_OID;
+ ssup.ssup_nulls_first = false;
+
+ PrepareSortSupportFromOrderingOp(mystats->ltopr, &ssup);
+
+ for (i = 0; i < data->numrows; i++)
+ {
+ /* remember the index of the sample row, to make the partitioning simpler */
+ values[nvalues].value = heap_getattr(data->rows[i], attrs->values[dimension],
+ stats[dimension]->tupDesc, &isNull);
+ values[nvalues].tupno = i;
+
+ /* no NULL values allowed here (we don't do splits by null-only dimensions) */
+ Assert(!isNull);
+
+ nvalues++;
+ }
+
+ /* sort the array of values */
+ qsort_arg((void *) values, nvalues, sizeof(ScalarItem),
+ compare_scalars_partition, (void *) &ssup);
+
+ /*
+ * We know there are data->ndistincts[dimension] distinct values
+ * in this dimension, and we want to split this into half, so walk
+ * through the array and stop once we see (ndistinct/2) values.
+ *
+ * We always choose the "next" value, i.e. (n/2+1)-th distinct value,
+ * and use it as an exclusive upper boundary (and inclusive lower
+ * boundary).
+ *
+ * TODO Maybe we should use "average" of the two middle distinct
+ * values (at least for even distinct counts), but that would
+ * require being able to do an average (which does not work
+ * for non-arithmetic types).
+ *
+ * TODO Another option is to look for a split that'd give about
+ * 50% tuples (not distinct values) in each partition. That
+ * might work better when there are a few very frequent
+ * values, and many rare ones.
+ */
+ delta = fabs(data->numrows);
+ split_value = values[0].value;
+
+ for (i = 1; i < data->numrows; i++)
+ {
+ if (compare_datums_simple(values[i-1].value, values[i].value, &ssup) != 0)
+ {
+ /* are we closer to splitting the bucket in half? */
+ if (fabs(i - data->numrows/2.0) < delta)
+ {
+ /* let's assume we'll use this value for the split */
+ split_value = values[i].value;
+ delta = fabs(i - data->numrows/2.0);
+ nrows = i;
+ }
+ }
+ }
+
+ Assert(nrows > 0);
+ Assert(nrows < data->numrows);
+
+ /* create the new bucket as an (incomplete) copy of the one being partitioned */
+ new_bucket = copy_mv_bucket(bucket, numattrs);
+ new_data = (HistogramBuild)new_bucket->build_data;
+
+ /*
+ * Do the actual split of the chosen dimension, using the split value as the
+ * upper bound for the existing bucket, and lower bound for the new one.
+ */
+ bucket->max[dimension] = split_value;
+ new_bucket->min[dimension] = split_value;
+
+ bucket->max_inclusive[dimension] = false;
+ new_bucket->min_inclusive[dimension] = true;
+
+ /*
+ * Redistribute the sample tuples using the 'ScalarItem->tupno'
+ * index. We know 'nrows' rows should remain in the original
+ * bucket and the rest goes to the new one.
+ */
+
+ data->rows = (HeapTuple*)palloc0(nrows * sizeof(HeapTuple));
+ new_data->rows = (HeapTuple*)palloc0((oldnrows - nrows) * sizeof(HeapTuple));
+
+ data->numrows = nrows;
+ new_data->numrows = (oldnrows - nrows);
+
+ /*
+ * The first nrows should go to the first bucket, the rest should
+ * go to the new one. Use the tupno field to get the actual HeapTuple
+ * row from the original array of sample rows.
+ */
+ for (i = 0; i < nrows; i++)
+ memcpy(&data->rows[i], &oldrows[values[i].tupno], sizeof(HeapTuple));
+
+ for (i = nrows; i < oldnrows; i++)
+ memcpy(&new_data->rows[i-nrows], &oldrows[values[i].tupno], sizeof(HeapTuple));
+
+ /* update ndistinct values for the buckets (total and per dimension) */
+ update_bucket_ndistinct(bucket, attrs, stats);
+ update_bucket_ndistinct(new_bucket, attrs, stats);
+
+ /*
+ * TODO We don't need to do this for the dimension we used for split,
+ * because we know how many distinct values went to each partition.
+ */
+ for (i = 0; i < numattrs; i++)
+ {
+ update_dimension_ndistinct(bucket, i, attrs, stats, false);
+ update_dimension_ndistinct(new_bucket, i, attrs, stats, false);
+ }
+
+ pfree(oldrows);
+ pfree(values);
+
+ return new_bucket;
+}
+
+/*
+ * Copy a histogram bucket. The copy does not include the build-time
+ * data, i.e. sampled rows etc.
+ */
+static MVBucket
+copy_mv_bucket(MVBucket bucket, uint32 ndimensions)
+{
+ /* TODO allocate as a single piece (including all the fields) */
+ MVBucket new_bucket = (MVBucket)palloc0(sizeof(MVBucketData));
+ HistogramBuild data = (HistogramBuild)palloc0(sizeof(HistogramBuildData));
+
+ /*
+ * Copy only the attributes that will stay the same after the split;
+ * we'll recompute the rest once the split is done.
+ */
+
+ /* allocate the per-dimension arrays */
+ new_bucket->nullsonly = (bool*)palloc0(ndimensions * sizeof(bool));
+
+ /* inclusiveness boundaries - lower/upper bounds */
+ new_bucket->min_inclusive = (bool*)palloc0(ndimensions * sizeof(bool));
+ new_bucket->max_inclusive = (bool*)palloc0(ndimensions * sizeof(bool));
+
+ /* lower/upper boundaries */
+ new_bucket->min = (Datum*)palloc0(ndimensions * sizeof(Datum));
+ new_bucket->max = (Datum*)palloc0(ndimensions * sizeof(Datum));
+
+ /* copy data */
+ memcpy(new_bucket->nullsonly, bucket->nullsonly, ndimensions * sizeof(bool));
+
+ memcpy(new_bucket->min_inclusive, bucket->min_inclusive, ndimensions*sizeof(bool));
+ memcpy(new_bucket->min, bucket->min, ndimensions*sizeof(Datum));
+
+ memcpy(new_bucket->max_inclusive, bucket->max_inclusive, ndimensions*sizeof(bool));
+ memcpy(new_bucket->max, bucket->max, ndimensions*sizeof(Datum));
+
+ /* allocate and copy the interesting part of the build data */
+ data->ndistincts = (uint32*)palloc0(ndimensions * sizeof(uint32));
+
+ new_bucket->build_data = data;
+
+ return new_bucket;
+}
+
+/*
+ * Counts the number of distinct combinations of values in the bucket.
+ * The values are copied into an array of SortItems and sorted using the
+ * per-dimension sort support (multi_sort_compare), so the proper
+ * comparison operators are used even for pass-by-reference types.
+ *
+ * TODO This might evaluate and store the distinct counts for all
+ * possible attribute combinations. The assumption is this might be
+ * useful for estimating things like GROUP BY cardinalities (e.g.
+ * in cases when some buckets contain a lot of low-frequency
+ * combinations, and other buckets contain few high-frequency ones).
+ *
+ * But it's unclear whether it's worth the price. Computing this
+ * is actually quite cheap, because it may be evaluated at the very
+ * end, when the buckets are rather small (so sorting it in 2^N ways
+ * is not a big deal). Assuming the partitioning algorithm does not
+ * use these values to do the decisions, of course (the current
+ * algorithm does not).
+ *
+ * The overhead with storing, fetching and parsing the data is more
+ * concerning - adding 2^N values per bucket (even if it's just
+ * a 1B or 2B value) would significantly bloat the histogram, and
+ * thus its impact on the optimizer, which is not really desirable.
+ *
+ * TODO This only updates the ndistinct for the sample (or bucket), but
+ * we eventually need an estimate of the total number of distinct
+ * values in the dataset. It's possible to either use the current
+ * 1D approach (i.e., if it's more than 10% of the sample, assume
+ * it's proportional to the number of rows), or to implement the
+ * estimator suggested in the article, supposedly giving 'optimal'
+ * estimates (w.r.t. probability of error).
+ */
+static void
+update_bucket_ndistinct(MVBucket bucket, int2vector *attrs, VacAttrStats ** stats)
+{
+ int i, j;
+ int numattrs = attrs->dim1;
+
+ HistogramBuild data = (HistogramBuild)bucket->build_data;
+ int numrows = data->numrows;
+
+ MultiSortSupport mss = multi_sort_init(numattrs);
+
+ /*
+ * XXX We might collect these values while first processing the
+ * sample rows, saving the extra heap_getattr() calls done here.
+ */
+ SortItem *items = (SortItem*)palloc0(numrows * sizeof(SortItem));
+ Datum *values = (Datum*)palloc0(numrows * sizeof(Datum) * numattrs);
+ bool *isnull = (bool*)palloc0(numrows * sizeof(bool) * numattrs);
+
+ for (i = 0; i < numrows; i++)
+ {
+ items[i].values = &values[i * numattrs];
+ items[i].isnull = &isnull[i * numattrs];
+ }
+
+ /* prepare the sort functions for all the dimensions */
+ for (i = 0; i < numattrs; i++)
+ multi_sort_add_dimension(mss, i, i, stats);
+
+ /* collect the values */
+ for (i = 0; i < numrows; i++)
+ for (j = 0; j < numattrs; j++)
+ items[i].values[j]
+ = heap_getattr(data->rows[i], attrs->values[j],
+ stats[j]->tupDesc, &items[i].isnull[j]);
+
+ qsort_arg((void *) items, numrows, sizeof(SortItem),
+ multi_sort_compare, mss);
+
+ data->ndistinct = 1;
+
+ for (i = 1; i < numrows; i++)
+ if (multi_sort_compare(&items[i], &items[i-1], mss) != 0)
+ data->ndistinct += 1;
+
+ pfree(items);
+ pfree(values);
+ pfree(isnull);
+}
+
+/*
+ * Count distinct values per bucket dimension.
+ */
+static void
+update_dimension_ndistinct(MVBucket bucket, int dimension, int2vector *attrs,
+ VacAttrStats ** stats, bool update_boundaries)
+{
+ int j;
+ int nvalues = 0;
+ bool isNull;
+ HistogramBuild data = (HistogramBuild)bucket->build_data;
+ Datum * values = (Datum*)palloc0(data->numrows * sizeof(Datum));
+ SortSupportData ssup;
+
+ StdAnalyzeData * mystats = (StdAnalyzeData *) stats[dimension]->extra_data;
+
+ /* if this is a known NULL-only dimension, there's a single distinct value */
+ if (bucket->nullsonly[dimension])
+ {
+ data->ndistincts[dimension] = 1;
+ pfree(values);
+ return;
+ }
+
+ memset(&ssup, 0, sizeof(ssup));
+ ssup.ssup_cxt = CurrentMemoryContext;
+
+ /* We always use the default collation for statistics */
+ ssup.ssup_collation = DEFAULT_COLLATION_OID;
+ ssup.ssup_nulls_first = false;
+
+ PrepareSortSupportFromOrderingOp(mystats->ltopr, &ssup);
+
+ for (j = 0; j < data->numrows; j++)
+ {
+ values[nvalues] = heap_getattr(data->rows[j], attrs->values[dimension],
+ stats[dimension]->tupDesc, &isNull);
+
+ /* ignore NULL values */
+ if (! isNull)
+ nvalues++;
+ }
+
+ /* there's always at least 1 distinct value (may be NULL) */
+ data->ndistincts[dimension] = 1;
+
+ /*
+ * If there are only NULL values in the column, mark the dimension
+ * as NULL-only and bail out.
+ */
+ if (nvalues == 0)
+ {
+ pfree(values);
+ bucket->nullsonly[dimension] = true;
+ return;
+ }
+
+ /* sort the array of (pass-by-value) datums */
+ qsort_arg((void *) values, nvalues, sizeof(Datum),
+ compare_scalars_simple, (void *) &ssup);
+
+ /*
+ * Update min/max boundaries to the smallest bounding box. Generally, this
+ * needs to be done only when constructing the initial bucket.
+ */
+ if (update_boundaries)
+ {
+ /* store the min/max values */
+ bucket->min[dimension] = values[0];
+ bucket->min_inclusive[dimension] = true;
+
+ bucket->max[dimension] = values[nvalues-1];
+ bucket->max_inclusive[dimension] = true;
+ }
+
+ /*
+ * Walk through the array and count distinct values by comparing
+ * succeeding values.
+ *
+ * FIXME This only works for pass-by-value types (i.e. not VARCHARs
+ * etc.). Although thanks to the deduplication it might work
+ * even for those types (equal values will get the same item
+ * in the deduplicated array).
+ */
+ for (j = 1; j < nvalues; j++)
+ {
+ if (values[j] != values[j-1])
+ data->ndistincts[dimension] += 1;
+ }
+
+ pfree(values);
+}
+
+/*
+ * A properly built histogram must not contain buckets mixing NULL and
+ * non-NULL values in a single dimension. Each dimension is either
+ * marked as 'nulls only' (and then contains only NULL values), or it
+ * contains no NULL values at all.
+ *
+ * Therefore, if the sample contains NULL values in any of the columns,
+ * it's necessary to build those NULL-buckets. This is done in an
+ * iterative way using this algorithm, operating on a single bucket:
+ *
+ * (1) Check that all dimensions are well-formed (not mixing NULL
+ * and non-NULL values).
+ *
+ * (2) If all dimensions are well-formed, terminate.
+ *
+ * (3) If the dimension contains only NULL values, but is not
+ * marked as NULL-only, mark it as NULL-only and run the
+ * algorithm again (on this bucket).
+ *
+ * (4) If the dimension mixes NULL and non-NULL values, split the
+ * bucket into two parts - one with NULL values, one with
+ * non-NULL values (replacing the current one). Then run
+ * the algorithm on both buckets.
+ *
+ * This is executed in a recursive manner, but the number of executions
+ * should be quite low - limited by the number of NULL-buckets. Also,
+ * in each branch the number of nested calls is limited by the number
+ * of dimensions (attributes) of the histogram.
+ *
+ * At the end, there should be buckets with no mixed dimensions. The
+ * number of buckets produced by this algorithm is rather limited - with
+ * N dimensions, there may be only 2^N such buckets (each dimension may
+ * be either NULL or non-NULL). So with 8 dimensions (current value of
+ * MVSTATS_MAX_DIMENSIONS) there may be only 256 such buckets.
+ *
+ * After this, a 'regular' bucket-split algorithm shall run, further
+ * optimizing the histogram.
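+ *
+ * As an illustration: with 2 dimensions and NULL values present in
+ * both columns, a single initial bucket may be split into up to four
+ * buckets - (NULL, NULL), (NULL, x), (x, NULL) and (x, y) - before
+ * the regular partitioning starts.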
+ */
+static void
+create_null_buckets(MVHistogram histogram, int bucket_idx,
+ int2vector *attrs, VacAttrStats ** stats)
+{
+ int i, j;
+ int null_dim = -1;
+ int null_count = 0;
+ bool null_found = false;
+ MVBucket bucket, null_bucket;
+ int null_idx, curr_idx;
+ HistogramBuild data, null_data;
+
+ /* remember original values from the bucket */
+ int numrows;
+ HeapTuple *oldrows = NULL;
+
+ Assert(bucket_idx < histogram->nbuckets);
+ Assert(histogram->ndimensions == attrs->dim1);
+
+ bucket = histogram->buckets[bucket_idx];
+ data = (HistogramBuild)bucket->build_data;
+
+
+ /*
+ * Walk through all rows / dimensions, and stop once we find NULL
+ * in a dimension not yet marked as NULL-only.
+ */
+ for (i = 0; i < data->numrows; i++)
+ {
+ /*
+ * FIXME We don't need to start from the first attribute
+ * here - we can start from the last known dimension.
+ */
+ for (j = 0; j < histogram->ndimensions; j++)
+ {
+ /* Is this a NULL-only dimension? If yes, skip. */
+ if (bucket->nullsonly[j])
+ continue;
+
+ /* found a NULL in that dimension? */
+ if (heap_attisnull(data->rows[i], attrs->values[j]))
+ {
+ null_found = true;
+ null_dim = j;
+ break;
+ }
+ }
+
+ /* terminate if we found attribute with NULL values */
+ if (null_found)
+ break;
+ }
+
+ /* no regular dimension contains NULL values => we're done */
+ if (! null_found)
+ return;
+
+ /* walk through the rows again, count NULL values in 'null_dim' */
+ for (i = 0; i < data->numrows; i++)
+ {
+ if (heap_attisnull(data->rows[i], attrs->values[null_dim]))
+ null_count += 1;
+ }
+
+ Assert(null_count <= data->numrows);
+
+ /*
+ * If (null_count == numrows) the dimension already is NULL-only,
+ * but is not yet marked as such. It's enough to mark it and
+ * repeat the process recursively (until we run out of dimensions).
+ */
+ if (null_count == data->numrows)
+ {
+ bucket->nullsonly[null_dim] = true;
+ create_null_buckets(histogram, bucket_idx, attrs, stats);
+ return;
+ }
+
+ /*
+ * We have to split the bucket into two - one with NULL values in
+ * the dimension, one with non-NULL values. We don't need to sort
+ * the data or anything, but otherwise it's similar to what's done
+ * in partition_bucket().
+ */
+
+ /* create bucket with NULL-only dimension 'dim' */
+ null_bucket = copy_mv_bucket(bucket, histogram->ndimensions);
+ null_data = (HistogramBuild)null_bucket->build_data;
+
+ /* remember the current array info */
+ oldrows = data->rows;
+ numrows = data->numrows;
+
+ /* we'll keep non-NULL values in the current bucket */
+ data->numrows = (numrows - null_count);
+ data->rows
+ = (HeapTuple*)palloc0(data->numrows * sizeof(HeapTuple));
+
+ /* and the NULL values will go to the new one */
+ null_data->numrows = null_count;
+ null_data->rows
+ = (HeapTuple*)palloc0(null_data->numrows * sizeof(HeapTuple));
+
+ /* mark the dimension as NULL-only (in the new bucket) */
+ null_bucket->nullsonly[null_dim] = true;
+
+ /* walk through the sample rows and distribute them accordingly */
+ null_idx = 0;
+ curr_idx = 0;
+ for (i = 0; i < numrows; i++)
+ {
+ if (heap_attisnull(oldrows[i], attrs->values[null_dim]))
+ /* NULL => copy to the new bucket */
+ memcpy(&null_data->rows[null_idx++], &oldrows[i],
+ sizeof(HeapTuple));
+ else
+ memcpy(&data->rows[curr_idx++], &oldrows[i],
+ sizeof(HeapTuple));
+ }
+
+ /* update ndistinct values for the buckets (total and per dimension) */
+ update_bucket_ndistinct(bucket, attrs, stats);
+ update_bucket_ndistinct(null_bucket, attrs, stats);
+
+ /*
+ * TODO We don't need to do this for the dimension we used for split,
+ * because we know how many distinct values went to each
+ * bucket (NULL is not a value, so 0, and the other bucket got
+ * all the ndistinct values).
+ */
+ for (i = 0; i < histogram->ndimensions; i++)
+ {
+ update_dimension_ndistinct(bucket, i, attrs, stats, false);
+ update_dimension_ndistinct(null_bucket, i, attrs, stats, false);
+ }
+
+ pfree(oldrows);
+
+ /* add the NULL bucket to the histogram */
+ histogram->buckets[histogram->nbuckets++] = null_bucket;
+
+ /*
+ * And now run the function recursively on both buckets (the new
+ * one first, because the call may change number of buckets, and
+ * it's used as an index).
+ */
+ create_null_buckets(histogram, (histogram->nbuckets-1), attrs, stats);
+ create_null_buckets(histogram, bucket_idx, attrs, stats);
+}
+
+/*
+ * We need to pass the SortSupport to the comparator, but bsearch()
+ * has no 'context' parameter, so we use a global variable (ugly).
+ */
+static int
+bsearch_comparator(const void * a, const void * b)
+{
+ Assert(ssup_private != NULL);
+ return compare_scalars_simple(a, b, (void*)ssup_private);
+}
+
+/*
+ * SRF with details about buckets of a histogram:
+ *
+ * - bucket ID (0...nbuckets-1)
+ * - min values (string array)
+ * - max values (string array)
+ * - nulls only (boolean array)
+ * - min inclusive flags (boolean array)
+ * - max inclusive flags (boolean array)
+ * - frequency (double precision)
+ *
+ * The first input is the OID of the statistics; no rows are returned
+ * if the statistics contain no histogram (or if there are no
+ * statistics for the given OID).
+ *
+ * The second parameter (type) determines what values will be returned
+ * in the (minvals,maxvals). There are three possible values:
+ *
+ * 0 (actual values)
+ * -----------------
+ * - prints actual values
+ * - using the output function of the data type (as string)
+ * - handy for investigating the histogram
+ *
+ * 1 (distinct index)
+ * ------------------
+ * - prints index of the distinct value (into the serialized array)
+ * - makes it easier to spot neighbor buckets, etc.
+ * - handy for plotting the histogram
+ *
+ * 2 (normalized distinct index)
+ * -----------------------------
+ * - prints index of the distinct value, but normalized into [0,1]
+ * - similar to 1, but shows how 'long' the bucket range is
+ * - handy for plotting the histogram
+ *
+ * When plotting the histogram, be careful as the (1) and (2) options
+ * skew the lengths by distributing the distinct values uniformly. For
+ * data types without a clear meaning of 'distance' (e.g. strings) that
+ * is not a big deal, but for numbers it may be confusing.
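+ *
+ * Example usage (illustrative, assuming a statistics object named 's1'):
+ *
+ * SELECT * FROM pg_mv_histogram_buckets(
+ * (SELECT oid FROM pg_mv_statistic WHERE staname = 's1'), 0);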
+ */
+PG_FUNCTION_INFO_V1(pg_mv_histogram_buckets);
+
+Datum
+pg_mv_histogram_buckets(PG_FUNCTION_ARGS)
+{
+ FuncCallContext *funcctx;
+ int call_cntr;
+ int max_calls;
+ TupleDesc tupdesc;
+ AttInMetadata *attinmeta;
+
+ Oid mvoid = PG_GETARG_OID(0);
+ int otype = PG_GETARG_INT32(1);
+
+ if ((otype < 0) || (otype > 2))
+ elog(ERROR, "invalid output type specified");
+
+ /* stuff done only on the first call of the function */
+ if (SRF_IS_FIRSTCALL())
+ {
+ MemoryContext oldcontext;
+ MVSerializedHistogram histogram;
+
+ /* create a function context for cross-call persistence */
+ funcctx = SRF_FIRSTCALL_INIT();
+
+ /* switch to memory context appropriate for multiple function calls */
+ oldcontext = MemoryContextSwitchTo(funcctx->multi_call_memory_ctx);
+
+ histogram = load_mv_histogram(mvoid);
+
+ funcctx->user_fctx = histogram;
+
+ /* total number of tuples to be returned */
+ funcctx->max_calls = 0;
+ if (funcctx->user_fctx != NULL)
+ funcctx->max_calls = histogram->nbuckets;
+
+ /* Build a tuple descriptor for our result type */
+ if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("function returning record called in context "
+ "that cannot accept type record")));
+
+ /*
+ * generate attribute metadata needed later to produce tuples
+ * from raw C strings
+ */
+ attinmeta = TupleDescGetAttInMetadata(tupdesc);
+ funcctx->attinmeta = attinmeta;
+
+ MemoryContextSwitchTo(oldcontext);
+ }
+
+ /* stuff done on every call of the function */
+ funcctx = SRF_PERCALL_SETUP();
+
+ call_cntr = funcctx->call_cntr;
+ max_calls = funcctx->max_calls;
+ attinmeta = funcctx->attinmeta;
+
+ if (call_cntr < max_calls) /* do when there is more left to send */
+ {
+ char **values;
+ HeapTuple tuple;
+ Datum result;
+ int2vector *stakeys;
+ Oid relid;
+ double bucket_size = 1.0;
+
+ char *buff = palloc0(1024);
+ char *format;
+
+ int i;
+
+ Oid *outfuncs;
+ FmgrInfo *fmgrinfo;
+
+ MVSerializedHistogram histogram;
+ MVSerializedBucket bucket;
+
+ histogram = (MVSerializedHistogram)funcctx->user_fctx;
+
+ Assert(call_cntr < histogram->nbuckets);
+
+ bucket = histogram->buckets[call_cntr];
+
+ stakeys = find_mv_attnums(mvoid, &relid);
+
+ /*
+ * Prepare a values array for building the returned tuple.
+ * This should be an array of C strings which will
+ * be processed later by the type input functions.
+ */
+ values = (char **) palloc(9 * sizeof(char *));
+
+ values[0] = (char *) palloc(64 * sizeof(char));
+
+ /* arrays */
+ values[1] = (char *) palloc0(1024 * sizeof(char));
+ values[2] = (char *) palloc0(1024 * sizeof(char));
+ values[3] = (char *) palloc0(1024 * sizeof(char));
+ values[4] = (char *) palloc0(1024 * sizeof(char));
+ values[5] = (char *) palloc0(1024 * sizeof(char));
+
+ values[6] = (char *) palloc(64 * sizeof(char));
+ values[7] = (char *) palloc(64 * sizeof(char));
+ values[8] = (char *) palloc(64 * sizeof(char));
+
+ /* XXX only needed when printing the actual values (otype == 0) */
+ outfuncs = (Oid*)palloc0(sizeof(Oid) * histogram->ndimensions);
+ fmgrinfo = (FmgrInfo*)palloc0(sizeof(FmgrInfo) * histogram->ndimensions);
+
+ for (i = 0; i < histogram->ndimensions; i++)
+ {
+ bool isvarlena;
+
+ getTypeOutputInfo(get_atttype(relid, stakeys->values[i]),
+ &outfuncs[i], &isvarlena);
+
+ fmgr_info(outfuncs[i], &fmgrinfo[i]);
+ }
+
+ snprintf(values[0], 64, "%d", call_cntr); /* bucket ID */
+
+ /*
+ * Print the bucket boundaries - either as the actual values (using
+ * the output function of the attribute type), or as indexes into
+ * the deduplicated arrays (which are sorted, so the indexes are
+ * meaningful), depending on the requested output type.
+ */
+
+ for (i = 0; i < histogram->ndimensions; i++)
+ {
+ bucket_size *= (bucket->max[i] - bucket->min[i]) * 1.0
+ / (histogram->nvalues[i]-1);
+
+ /* print the actual values, i.e. use output function etc. */
+ if (otype == 0)
+ {
+ Datum minval, maxval;
+ Datum minout, maxout;
+
+ format = "%s, %s";
+ if (i == 0)
+ format = "{%s%s";
+ else if (i == histogram->ndimensions-1)
+ format = "%s, %s}";
+
+ minval = histogram->values[i][bucket->min[i]];
+ minout = FunctionCall1(&fmgrinfo[i], minval);
+
+ maxval = histogram->values[i][bucket->max[i]];
+ maxout = FunctionCall1(&fmgrinfo[i], maxval);
+
+ snprintf(buff, 1024, format, values[1], DatumGetCString(minout));
+ strncpy(values[1], buff, 1023);
+ buff[0] = '\0';
+
+ snprintf(buff, 1024, format, values[2], DatumGetCString(maxout));
+ strncpy(values[2], buff, 1023);
+ buff[0] = '\0';
+ }
+ else if (otype == 1)
+ {
+ format = "%s, %d";
+ if (i == 0)
+ format = "{%s%d";
+ else if (i == histogram->ndimensions-1)
+ format = "%s, %d}";
+
+ snprintf(buff, 1024, format, values[1], bucket->min[i]);
+ strncpy(values[1], buff, 1023);
+ buff[0] = '\0';
+
+ snprintf(buff, 1024, format, values[2], bucket->max[i]);
+ strncpy(values[2], buff, 1023);
+ buff[0] = '\0';
+ }
+ else
+ {
+ format = "%s, %f";
+ if (i == 0)
+ format = "{%s%f";
+ else if (i == histogram->ndimensions-1)
+ format = "%s, %f}";
+
+ snprintf(buff, 1024, format, values[1],
+ bucket->min[i] * 1.0 / (histogram->nvalues[i]-1));
+ strncpy(values[1], buff, 1023);
+ buff[0] = '\0';
+
+ snprintf(buff, 1024, format, values[2],
+ bucket->max[i] * 1.0 / (histogram->nvalues[i]-1));
+ strncpy(values[2], buff, 1023);
+ buff[0] = '\0';
+ }
+
+ format = "%s, %s";
+ if (i == 0)
+ format = "{%s%s";
+ else if (i == histogram->ndimensions-1)
+ format = "%s, %s}";
+
+ snprintf(buff, 1024, format, values[3], bucket->nullsonly[i] ? "t" : "f");
+ strncpy(values[3], buff, 1023);
+ buff[0] = '\0';
+
+ snprintf(buff, 1024, format, values[4], bucket->min_inclusive[i] ? "t" : "f");
+ strncpy(values[4], buff, 1023);
+ buff[0] = '\0';
+
+ snprintf(buff, 1024, format, values[5], bucket->max_inclusive[i] ? "t" : "f");
+ strncpy(values[5], buff, 1023);
+ buff[0] = '\0';
+ }
+
+ snprintf(values[6], 64, "%f", bucket->ntuples); /* frequency */
+ snprintf(values[7], 64, "%f", bucket->ntuples / bucket_size); /* density */
+ snprintf(values[8], 64, "%f", bucket_size); /* bucket_size */
+
+ /* build a tuple */
+ tuple = BuildTupleFromCStrings(attinmeta, values);
+
+ /* make the tuple into a datum */
+ result = HeapTupleGetDatum(tuple);
+
+ /* clean up (not strictly necessary, the memory context will free it) */
+ pfree(values[0]);
+ pfree(values[1]);
+ pfree(values[2]);
+ pfree(values[3]);
+ pfree(values[4]);
+ pfree(values[5]);
+ pfree(values[6]);
+ pfree(values[7]);
+ pfree(values[8]);
+
+ pfree(values);
+
+ SRF_RETURN_NEXT(funcctx, result);
+ }
+ else /* do when there is no more left */
+ {
+ SRF_RETURN_DONE(funcctx);
+ }
+}
+
+#ifdef DEBUG_MVHIST
+/*
+ * prints debugging info about matched histogram buckets (full/partial)
+ *
+ * XXX Currently works only for INT data type.
+ */
+void
+debug_histogram_matches(MVSerializedHistogram mvhist, char *matches)
+{
+ int i, j;
+
+ float ffull = 0, fpartial = 0;
+ int nfull = 0, npartial = 0;
+
+ for (i = 0; i < mvhist->nbuckets; i++)
+ {
+ MVSerializedBucket bucket = mvhist->buckets[i];
+
+ char ranges[1024];
+
+ if (! matches[i])
+ continue;
+
+ /* increment the counters */
+ nfull += (matches[i] == MVSTATS_MATCH_FULL) ? 1 : 0;
+ npartial += (matches[i] == MVSTATS_MATCH_PARTIAL) ? 1 : 0;
+
+ /* and also update the frequencies */
+ ffull += (matches[i] == MVSTATS_MATCH_FULL) ? bucket->ntuples : 0;
+ fpartial += (matches[i] == MVSTATS_MATCH_PARTIAL) ? bucket->ntuples : 0;
+
+ memset(ranges, 0, sizeof(ranges));
+
+ /* build ranges for all the dimensions */
+ for (j = 0; j < mvhist->ndimensions; j++)
+ {
+ /* append to the buffer (sprintf from/to the same buffer is undefined) */
+ snprintf(ranges + strlen(ranges), sizeof(ranges) - strlen(ranges),
+ " [%d %d]",
+ DatumGetInt32(mvhist->values[j][bucket->min[j]]),
+ DatumGetInt32(mvhist->values[j][bucket->max[j]]));
+ }
+
+ elog(WARNING, "bucket %d %s => %d [%f]", i, ranges, matches[i], bucket->ntuples);
+ }
+
+ elog(WARNING, "full=%f partial=%f (%f)", ffull, fpartial, (ffull + 0.5 * fpartial));
+}
+#endif
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 6339631..3543239 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -2109,9 +2109,9 @@ describeOneTableDetails(const char *schemaname,
{
printfPQExpBuffer(&buf,
"SELECT oid, stanamespace::regnamespace AS nsp, staname, stakeys,\n"
- " deps_enabled, mcv_enabled,\n"
- " deps_built, mcv_built,\n"
- " mcv_max_items,\n"
+ " deps_enabled, mcv_enabled, hist_enabled,\n"
+ " deps_built, mcv_built, hist_built,\n"
+ " mcv_max_items, hist_max_buckets,\n"
" (SELECT string_agg(attname::text,', ')\n"
" FROM ((SELECT unnest(stakeys) AS attnum) s\n"
" JOIN pg_attribute a ON (starelid = a.attrelid and a.attnum = s.attnum))) AS attnums\n"
@@ -2154,8 +2154,17 @@ describeOneTableDetails(const char *schemaname,
first = false;
}
+ if (!strcmp(PQgetvalue(result, i, 6), "t"))
+ {
+ if (! first)
+ appendPQExpBuffer(&buf, ", histogram");
+ else
+ appendPQExpBuffer(&buf, "(histogram");
+ first = false;
+ }
+
appendPQExpBuffer(&buf, ") ON (%s)",
- PQgetvalue(result, i, 9));
+ PQgetvalue(result, i, 12));
printTableAddFooter(&cont, buf.data);
}
diff --git a/src/include/catalog/pg_mv_statistic.h b/src/include/catalog/pg_mv_statistic.h
index fd7107d..a5945af 100644
--- a/src/include/catalog/pg_mv_statistic.h
+++ b/src/include/catalog/pg_mv_statistic.h
@@ -38,13 +38,16 @@ CATALOG(pg_mv_statistic,3381)
/* statistics requested to build */
bool deps_enabled; /* analyze dependencies? */
bool mcv_enabled; /* build MCV list? */
+ bool hist_enabled; /* build histogram? */
- /* MCV size */
+ /* histogram / MCV size */
int32 mcv_max_items; /* max MCV items */
+ int32 hist_max_buckets; /* max histogram buckets */
/* statistics that are available (if requested) */
bool deps_built; /* dependencies were built */
bool mcv_built; /* MCV list was built */
+ bool hist_built; /* histogram was built */
/* variable-length fields start here, but we allow direct access to stakeys */
int2vector stakeys; /* array of column keys */
@@ -52,6 +55,7 @@ CATALOG(pg_mv_statistic,3381)
#ifdef CATALOG_VARLEN
bytea stadeps; /* dependencies (serialized) */
bytea stamcv; /* MCV list (serialized) */
+ bytea stahist; /* MV histogram (serialized) */
#endif
} FormData_pg_mv_statistic;
@@ -67,17 +71,21 @@ typedef FormData_pg_mv_statistic *Form_pg_mv_statistic;
* compiler constants for pg_mv_statistic
* ----------------
*/
-#define Natts_pg_mv_statistic 11
+#define Natts_pg_mv_statistic 15
#define Anum_pg_mv_statistic_starelid 1
#define Anum_pg_mv_statistic_staname 2
#define Anum_pg_mv_statistic_stanamespace 3
#define Anum_pg_mv_statistic_deps_enabled 4
#define Anum_pg_mv_statistic_mcv_enabled 5
-#define Anum_pg_mv_statistic_mcv_max_items 6
-#define Anum_pg_mv_statistic_deps_built 7
-#define Anum_pg_mv_statistic_mcv_built 8
-#define Anum_pg_mv_statistic_stakeys 9
-#define Anum_pg_mv_statistic_stadeps 10
-#define Anum_pg_mv_statistic_stamcv 11
+#define Anum_pg_mv_statistic_hist_enabled 6
+#define Anum_pg_mv_statistic_mcv_max_items 7
+#define Anum_pg_mv_statistic_hist_max_buckets 8
+#define Anum_pg_mv_statistic_deps_built 9
+#define Anum_pg_mv_statistic_mcv_built 10
+#define Anum_pg_mv_statistic_hist_built 11
+#define Anum_pg_mv_statistic_stakeys 12
+#define Anum_pg_mv_statistic_stadeps 13
+#define Anum_pg_mv_statistic_stamcv 14
+#define Anum_pg_mv_statistic_stahist 15
#endif /* PG_MV_STATISTIC_H */
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
index 66b4bcd..7e915bd 100644
--- a/src/include/catalog/pg_proc.h
+++ b/src/include/catalog/pg_proc.h
@@ -2674,6 +2674,10 @@ DATA(insert OID = 3376 ( pg_mv_stats_mcvlist_info PGNSP PGUID 12 1 0 0 0 f f f
DESCR("multi-variate statistics: MCV list info");
DATA(insert OID = 3373 ( pg_mv_mcv_items PGNSP PGUID 12 1 1000 0 0 f f f f t t i s 1 0 2249 "26" "{26,23,1009,1000,701}" "{i,o,o,o,o}" "{oid,index,values,nulls,frequency}" _null_ _null_ pg_mv_mcv_items _null_ _null_ _null_ ));
DESCR("details about MCV list items");
+DATA(insert OID = 3375 ( pg_mv_stats_histogram_info PGNSP PGUID 12 1 0 0 0 f f f f t f i s 1 0 25 "17" _null_ _null_ _null_ _null_ _null_ pg_mv_stats_histogram_info _null_ _null_ _null_ ));
+DESCR("multi-variate statistics: histogram info");
+DATA(insert OID = 3374 ( pg_mv_histogram_buckets PGNSP PGUID 12 1 1000 0 0 f f f f t t i s 2 0 2249 "26 23" "{26,23,23,1009,1009,1000,1000,1000,701,701,701}" "{i,i,o,o,o,o,o,o,o,o,o}" "{oid,otype,index,minvals,maxvals,nullsonly,mininclusive,maxinclusive,frequency,density,bucket_size}" _null_ _null_ pg_mv_histogram_buckets _null_ _null_ _null_ ));
+DESCR("details about histogram buckets");
DATA(insert OID = 1928 ( pg_stat_get_numscans PGNSP PGUID 12 1 0 0 0 f f f f t f s r 1 0 20 "26" _null_ _null_ _null_ _null_ _null_ pg_stat_get_numscans _null_ _null_ _null_ ));
DESCR("statistics: number of scans done for table/index");
diff --git a/src/include/nodes/relation.h b/src/include/nodes/relation.h
index 5ae6b3c..46bece6 100644
--- a/src/include/nodes/relation.h
+++ b/src/include/nodes/relation.h
@@ -620,10 +620,12 @@ typedef struct MVStatisticInfo
/* enabled statistics */
bool deps_enabled; /* functional dependencies enabled */
bool mcv_enabled; /* MCV list enabled */
+ bool hist_enabled; /* histogram enabled */
/* built/available statistics */
bool deps_built; /* functional dependencies built */
bool mcv_built; /* MCV list built */
+ bool hist_built; /* histogram built */
/* columns in the statistics (attnums) */
int2vector *stakeys; /* attnums of the columns covered */
diff --git a/src/include/utils/mvstats.h b/src/include/utils/mvstats.h
index 4535db7..f05a517 100644
--- a/src/include/utils/mvstats.h
+++ b/src/include/utils/mvstats.h
@@ -92,6 +92,123 @@ typedef MCVListData *MCVList;
#define MVSTAT_MCVLIST_MAX_ITEMS 8192 /* max items in MCV list */
/*
+ * Multivariate histograms
+ */
+typedef struct MVBucketData {
+
+ /* Frequency of this bucket. */
+ float ntuples; /* frequency (fraction of sample tuples) */
+
+ /*
+ * Information about dimensions containing only NULL values.
+ */
+ bool *nullsonly;
+
+ /* lower boundaries - values and information about the inequalities */
+ Datum *min;
+ bool *min_inclusive;
+
+ /* upper boundaries - values and information about the inequalities */
+ Datum *max;
+ bool *max_inclusive;
+
+ /* used when building the histogram (not serialized/deserialized) */
+ void *build_data;
+
+} MVBucketData;
+
+typedef MVBucketData *MVBucket;
+
+
+typedef struct MVHistogramData {
+
+ uint32 magic; /* magic constant marker */
+ uint32 type; /* type of histogram (BASIC) */
+ uint32 nbuckets; /* number of buckets (buckets array) */
+ uint32 ndimensions; /* number of dimensions */
+
+ MVBucket *buckets; /* array of buckets */
+
+} MVHistogramData;
+
+typedef MVHistogramData *MVHistogram;
+
+/*
+ * Histogram in a partially serialized form, with deduplicated boundary
+ * values etc.
+ *
+ * TODO add more detailed description here
+ */
+
+typedef struct MVSerializedBucketData {
+
+ /* Frequency of this bucket. */
+ float ntuples; /* frequency (fraction of sample tuples) */
+
+ /*
+ * Information about dimensions containing only NULL values.
+ */
+ bool *nullsonly;
+
+ /* indexes of lower boundaries - values and information about the
+ * inequalities (exclusive vs. inclusive) */
+ uint16 *min;
+ bool *min_inclusive;
+
+ /* indexes of upper boundaries - values and information about the
+ * inequalities (exclusive vs. inclusive) */
+ uint16 *max;
+ bool *max_inclusive;
+
+} MVSerializedBucketData;
+
+typedef MVSerializedBucketData *MVSerializedBucket;
+
+typedef struct MVSerializedHistogramData {
+
+ uint32 magic; /* magic constant marker */
+ uint32 type; /* type of histogram (BASIC) */
+ uint32 nbuckets; /* number of buckets (buckets array) */
+ uint32 ndimensions; /* number of dimensions */
+
+ /*
+ * Keep this the same as in MVHistogramData, because deserialization
+ * relies on the fields being at the same offsets.
+ */
+ MVSerializedBucket *buckets; /* array of buckets */
+
+ /*
+ * serialized boundary values, one array per dimension, deduplicated
+ * (the min/max indexes point into these arrays)
+ */
+ int *nvalues;
+ Datum **values;
+
+} MVSerializedHistogramData;
+
+typedef MVSerializedHistogramData *MVSerializedHistogram;
+
+
+/* used to flag stats serialized to bytea */
+#define MVSTAT_HIST_MAGIC 0x7F8C5670 /* marks serialized bytea */
+#define MVSTAT_HIST_TYPE_BASIC 1 /* basic histogram type */
+
+/*
+ * Limits used for max_buckets option, i.e. we're always guaranteed
+ * to have space for at least MVSTAT_HIST_MIN_BUCKETS, and we cannot
+ * have more than MVSTAT_HIST_MAX_BUCKETS buckets.
+ *
+ * This is just a boundary for the 'max' threshold - the actual
+ * histogram may use fewer buckets than MVSTAT_HIST_MAX_BUCKETS.
+ *
+ * TODO The MVSTAT_HIST_MIN_BUCKETS should be related to the number of
+ * attributes (MVSTATS_MAX_DIMENSIONS) because of NULL-buckets.
+ * There should be at least 2^N buckets, otherwise we may be unable
+ * to build the NULL buckets.
+ */
+#define MVSTAT_HIST_MIN_BUCKETS 128 /* min number of buckets */
+#define MVSTAT_HIST_MAX_BUCKETS 16384 /* max number of buckets */
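+
+/*
+ * For example (illustrative only), the threshold may be tuned like this:
+ *
+ * CREATE STATISTICS s ON t (a, b) WITH (histogram, max_buckets = 1024);
+ */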
+
+/*
* TODO Maybe fetching the histogram/MCV list separately is inefficient?
* Consider adding a single `fetch_stats` method, fetching all
* stats specified using flags (or something like that).
@@ -99,20 +216,25 @@ typedef MCVListData *MCVList;
MVDependencies load_mv_dependencies(Oid mvoid);
MCVList load_mv_mcvlist(Oid mvoid);
+MVSerializedHistogram load_mv_histogram(Oid mvoid);
bytea * serialize_mv_dependencies(MVDependencies dependencies);
bytea * serialize_mv_mcvlist(MCVList mcvlist, int2vector *attrs,
VacAttrStats **stats);
+bytea * serialize_mv_histogram(MVHistogram histogram, int2vector *attrs,
+ VacAttrStats **stats);
/* deserialization of stats (serialization is private to analyze) */
MVDependencies deserialize_mv_dependencies(bytea * data);
MCVList deserialize_mv_mcvlist(bytea * data);
+MVSerializedHistogram deserialize_mv_histogram(bytea * data);
/*
* Returns index of the attribute number within the vector (i.e. a
* dimension within the stats).
*/
int mv_get_index(AttrNumber varattno, int2vector * stakeys);
int2vector* find_mv_attnums(Oid mvoid, Oid *relid);
@@ -121,6 +243,8 @@ extern Datum pg_mv_stats_dependencies_info(PG_FUNCTION_ARGS);
extern Datum pg_mv_stats_dependencies_show(PG_FUNCTION_ARGS);
extern Datum pg_mv_stats_mcvlist_info(PG_FUNCTION_ARGS);
extern Datum pg_mv_mcvlist_items(PG_FUNCTION_ARGS);
+extern Datum pg_mv_stats_histogram_info(PG_FUNCTION_ARGS);
+extern Datum pg_mv_histogram_buckets(PG_FUNCTION_ARGS);
MVDependencies
build_mv_dependencies(int numrows, HeapTuple *rows, int2vector *attrs,
@@ -130,10 +254,20 @@ MCVList
build_mv_mcvlist(int numrows, HeapTuple *rows, int2vector *attrs,
VacAttrStats **stats, int *numrows_filtered);
+MVHistogram
+build_mv_histogram(int numrows, HeapTuple *rows, int2vector *attrs,
+ VacAttrStats **stats, int numrows_total);
+
void build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
int natts, VacAttrStats **vacattrstats);
-void update_mv_stats(Oid relid, MVDependencies dependencies, MCVList mcvlist,
+void update_mv_stats(Oid relid, MVDependencies dependencies,
+ MCVList mcvlist, MVHistogram histogram,
int2vector *attrs, VacAttrStats **stats);
+#ifdef DEBUG_MVHIST
+extern void debug_histogram_matches(MVSerializedHistogram mvhist, char *matches);
+#endif
+
#endif
diff --git a/src/test/regress/expected/mv_histogram.out b/src/test/regress/expected/mv_histogram.out
new file mode 100644
index 0000000..a34edb8
--- /dev/null
+++ b/src/test/regress/expected/mv_histogram.out
@@ -0,0 +1,207 @@
+-- data type passed by value
+CREATE TABLE mv_histogram (
+ a INT,
+ b INT,
+ c INT
+);
+-- unknown column
+CREATE STATISTICS s1 ON mv_histogram (unknown_column) WITH (histogram);
+ERROR: column "unknown_column" referenced in statistics does not exist
+-- single column
+CREATE STATISTICS s1 ON mv_histogram (a) WITH (histogram);
+ERROR: multivariate stats require 2 or more columns
+-- single column, duplicated
+CREATE STATISTICS s1 ON mv_histogram (a, a) WITH (histogram);
+ERROR: duplicate column name in statistics definition
+-- two columns, one duplicated
+CREATE STATISTICS s1 ON mv_histogram (a, a, b) WITH (histogram);
+ERROR: duplicate column name in statistics definition
+-- unknown option
+CREATE STATISTICS s1 ON mv_histogram (a, b, c) WITH (unknown_option);
+ERROR: unrecognized STATISTICS option "unknown_option"
+-- missing histogram statistics
+CREATE STATISTICS s1 ON mv_histogram (a, b, c) WITH (dependencies, max_buckets=200);
+ERROR: option 'histogram' is required by other options(s)
+-- invalid max_buckets value / too low
+CREATE STATISTICS s1 ON mv_histogram (a, b, c) WITH (mcv, max_buckets=10);
+ERROR: minimum number of buckets is 128
+-- invalid max_buckets value / too high
+CREATE STATISTICS s1 ON mv_histogram (a, b, c) WITH (mcv, max_buckets=100000);
+ERROR: maximum number of buckets is 16384
+-- correct command
+CREATE STATISTICS s1 ON mv_histogram (a, b, c) WITH (histogram);
+-- random data (no functional dependencies)
+INSERT INTO mv_histogram
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+ANALYZE mv_histogram;
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+ hist_enabled | hist_built
+--------------+------------
+ t | t
+(1 row)
+
+TRUNCATE mv_histogram;
+-- a => b, a => c, b => c
+INSERT INTO mv_histogram
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE mv_histogram;
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+ hist_enabled | hist_built
+--------------+------------
+ t | t
+(1 row)
+
+TRUNCATE mv_histogram;
+-- a => b, a => c
+INSERT INTO mv_histogram
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE mv_histogram;
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+ hist_enabled | hist_built
+--------------+------------
+ t | t
+(1 row)
+
+TRUNCATE mv_histogram;
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO mv_histogram
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX hist_idx ON mv_histogram (a, b);
+ANALYZE mv_histogram;
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+ hist_enabled | hist_built
+--------------+------------
+ t | t
+(1 row)
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mv_histogram WHERE a = 10 AND b = 5;
+ QUERY PLAN
+--------------------------------------------
+ Bitmap Heap Scan on mv_histogram
+ Recheck Cond: ((a = 10) AND (b = 5))
+ -> Bitmap Index Scan on hist_idx
+ Index Cond: ((a = 10) AND (b = 5))
+(4 rows)
+
+DROP TABLE mv_histogram;
+-- varlena type (text)
+CREATE TABLE mv_histogram (
+ a TEXT,
+ b TEXT,
+ c TEXT
+);
+CREATE STATISTICS s2 ON mv_histogram (a, b, c) WITH (histogram);
+-- random data (no functional dependencies)
+INSERT INTO mv_histogram
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+ANALYZE mv_histogram;
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+ hist_enabled | hist_built
+--------------+------------
+ t | t
+(1 row)
+
+TRUNCATE mv_histogram;
+-- a => b, a => c, b => c
+INSERT INTO mv_histogram
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE mv_histogram;
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+ hist_enabled | hist_built
+--------------+------------
+ t | t
+(1 row)
+
+TRUNCATE mv_histogram;
+-- a => b, a => c
+INSERT INTO mv_histogram
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE mv_histogram;
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+ hist_enabled | hist_built
+--------------+------------
+ t | t
+(1 row)
+
+TRUNCATE mv_histogram;
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO mv_histogram
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX hist_idx ON mv_histogram (a, b);
+ANALYZE mv_histogram;
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+ hist_enabled | hist_built
+--------------+------------
+ t | t
+(1 row)
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mv_histogram WHERE a = '10' AND b = '5';
+ QUERY PLAN
+------------------------------------------------------------
+ Bitmap Heap Scan on mv_histogram
+ Recheck Cond: ((a = '10'::text) AND (b = '5'::text))
+ -> Bitmap Index Scan on hist_idx
+ Index Cond: ((a = '10'::text) AND (b = '5'::text))
+(4 rows)
+
+TRUNCATE mv_histogram;
+-- check explain (expect bitmap index scan, not plain index scan) with NULLs
+INSERT INTO mv_histogram
+ SELECT
+ (CASE WHEN i/10000 = 0 THEN NULL ELSE i/10000 END),
+ (CASE WHEN i/20000 = 0 THEN NULL ELSE i/20000 END),
+ (CASE WHEN i/40000 = 0 THEN NULL ELSE i/40000 END)
+ FROM generate_series(1,1000000) s(i);
+ANALYZE mv_histogram;
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+ hist_enabled | hist_built
+--------------+------------
+ t | t
+(1 row)
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mv_histogram WHERE a IS NULL AND b IS NULL;
+ QUERY PLAN
+---------------------------------------------------
+ Bitmap Heap Scan on mv_histogram
+ Recheck Cond: ((a IS NULL) AND (b IS NULL))
+ -> Bitmap Index Scan on hist_idx
+ Index Cond: ((a IS NULL) AND (b IS NULL))
+(4 rows)
+
+DROP TABLE mv_histogram;
+-- NULL values (mix of int and text columns)
+CREATE TABLE mv_histogram (
+ a INT,
+ b TEXT,
+ c INT,
+ d TEXT
+);
+CREATE STATISTICS s3 ON mv_histogram (a, b, c, d) WITH (histogram);
+INSERT INTO mv_histogram
+ SELECT
+ mod(i, 100),
+ (CASE WHEN mod(i, 200) = 0 THEN NULL ELSE mod(i,200) END),
+ mod(i, 400),
+ (CASE WHEN mod(i, 300) = 0 THEN NULL ELSE mod(i,600) END)
+ FROM generate_series(1,10000) s(i);
+ANALYZE mv_histogram;
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+ hist_enabled | hist_built
+--------------+------------
+ t | t
+(1 row)
+
+DROP TABLE mv_histogram;
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 66071d8..1a1a4ca 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1375,7 +1375,9 @@ pg_mv_stats| SELECT n.nspname AS schemaname,
length(s.stadeps) AS depsbytes,
pg_mv_stats_dependencies_info(s.stadeps) AS depsinfo,
length(s.stamcv) AS mcvbytes,
- pg_mv_stats_mcvlist_info(s.stamcv) AS mcvinfo
+ pg_mv_stats_mcvlist_info(s.stamcv) AS mcvinfo,
+ length(s.stahist) AS histbytes,
+ pg_mv_stats_histogram_info(s.stahist) AS histinfo
FROM ((pg_mv_statistic s
JOIN pg_class c ON ((c.oid = s.starelid)))
LEFT JOIN pg_namespace n ON ((n.oid = c.relnamespace)));
diff --git a/src/test/regress/parallel_schedule b/src/test/regress/parallel_schedule
index 85d94f1..a885235 100644
--- a/src/test/regress/parallel_schedule
+++ b/src/test/regress/parallel_schedule
@@ -112,4 +112,4 @@ test: event_trigger
test: stats
# run tests of multivariate stats
-test: mv_dependencies mv_mcv
+test: mv_dependencies mv_mcv mv_histogram
diff --git a/src/test/regress/serial_schedule b/src/test/regress/serial_schedule
index 6584d73..2efdcd7 100644
--- a/src/test/regress/serial_schedule
+++ b/src/test/regress/serial_schedule
@@ -164,3 +164,4 @@ test: event_trigger
test: stats
test: mv_dependencies
test: mv_mcv
+test: mv_histogram
diff --git a/src/test/regress/sql/mv_histogram.sql b/src/test/regress/sql/mv_histogram.sql
new file mode 100644
index 0000000..02f49b4
--- /dev/null
+++ b/src/test/regress/sql/mv_histogram.sql
@@ -0,0 +1,176 @@
+-- data type passed by value
+CREATE TABLE mv_histogram (
+ a INT,
+ b INT,
+ c INT
+);
+
+-- unknown column
+CREATE STATISTICS s1 ON mv_histogram (unknown_column) WITH (histogram);
+
+-- single column
+CREATE STATISTICS s1 ON mv_histogram (a) WITH (histogram);
+
+-- single column, duplicated
+CREATE STATISTICS s1 ON mv_histogram (a, a) WITH (histogram);
+
+-- two columns, one duplicated
+CREATE STATISTICS s1 ON mv_histogram (a, a, b) WITH (histogram);
+
+-- unknown option
+CREATE STATISTICS s1 ON mv_histogram (a, b, c) WITH (unknown_option);
+
+-- missing histogram statistics
+CREATE STATISTICS s1 ON mv_histogram (a, b, c) WITH (dependencies, max_buckets=200);
+
+-- invalid max_buckets value / too low
+CREATE STATISTICS s1 ON mv_histogram (a, b, c) WITH (mcv, max_buckets=10);
+
+-- invalid max_buckets value / too high
+CREATE STATISTICS s1 ON mv_histogram (a, b, c) WITH (mcv, max_buckets=100000);
+
+-- correct command
+CREATE STATISTICS s1 ON mv_histogram (a, b, c) WITH (histogram);
+
+-- random data (no functional dependencies)
+INSERT INTO mv_histogram
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+
+ANALYZE mv_histogram;
+
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+
+TRUNCATE mv_histogram;
+
+-- a => b, a => c, b => c
+INSERT INTO mv_histogram
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+
+ANALYZE mv_histogram;
+
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+
+TRUNCATE mv_histogram;
+
+-- a => b, a => c
+INSERT INTO mv_histogram
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE mv_histogram;
+
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+
+TRUNCATE mv_histogram;
+
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO mv_histogram
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX hist_idx ON mv_histogram (a, b);
+ANALYZE mv_histogram;
+
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mv_histogram WHERE a = 10 AND b = 5;
+
+DROP TABLE mv_histogram;
+
+-- varlena type (text)
+CREATE TABLE mv_histogram (
+ a TEXT,
+ b TEXT,
+ c TEXT
+);
+
+CREATE STATISTICS s2 ON mv_histogram (a, b, c) WITH (histogram);
+
+-- random data (no functional dependencies)
+INSERT INTO mv_histogram
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+
+ANALYZE mv_histogram;
+
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+
+TRUNCATE mv_histogram;
+
+-- a => b, a => c, b => c
+INSERT INTO mv_histogram
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+
+ANALYZE mv_histogram;
+
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+
+TRUNCATE mv_histogram;
+
+-- a => b, a => c
+INSERT INTO mv_histogram
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE mv_histogram;
+
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+
+TRUNCATE mv_histogram;
+
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO mv_histogram
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX hist_idx ON mv_histogram (a, b);
+ANALYZE mv_histogram;
+
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mv_histogram WHERE a = '10' AND b = '5';
+
+TRUNCATE mv_histogram;
+
+-- check explain (expect bitmap index scan, not plain index scan) with NULLs
+INSERT INTO mv_histogram
+ SELECT
+ (CASE WHEN i/10000 = 0 THEN NULL ELSE i/10000 END),
+ (CASE WHEN i/20000 = 0 THEN NULL ELSE i/20000 END),
+ (CASE WHEN i/40000 = 0 THEN NULL ELSE i/40000 END)
+ FROM generate_series(1,1000000) s(i);
+ANALYZE mv_histogram;
+
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mv_histogram WHERE a IS NULL AND b IS NULL;
+
+DROP TABLE mv_histogram;
+
+-- NULL values (mix of int and text columns)
+CREATE TABLE mv_histogram (
+ a INT,
+ b TEXT,
+ c INT,
+ d TEXT
+);
+
+CREATE STATISTICS s3 ON mv_histogram (a, b, c, d) WITH (histogram);
+
+INSERT INTO mv_histogram
+ SELECT
+ mod(i, 100),
+ (CASE WHEN mod(i, 200) = 0 THEN NULL ELSE mod(i,200) END),
+ mod(i, 400),
+ (CASE WHEN mod(i, 300) = 0 THEN NULL ELSE mod(i,600) END)
+ FROM generate_series(1,10000) s(i);
+
+ANALYZE mv_histogram;
+
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+
+DROP TABLE mv_histogram;
--
2.1.0
Attachment: 0006-multi-statistics-estimation.patch (text/x-patch)
From dec65426b12adcceb6303692b07bb4f5c3e564e2 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@pgaddict.com>
Date: Fri, 6 Feb 2015 01:42:38 +0100
Subject: [PATCH 6/9] multi-statistics estimation
The general idea is that a probability (which is what selectivity is)
can be split into a product of conditional probabilities like this:
P(A & B & C) = P(A & B) * P(C|A & B)
If we assume that C is independent of B given A, the last term may be
simplified like this:
P(A & B & C) = P(A & B) * P(C|A)
so we only need probabilities on [A,B] and [A,C] to compute the original
probability.
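For example (illustrative only), with two statistics on the same table
CREATE STATISTICS s_ab ON t (a, b) WITH (mcv);
CREATE STATISTICS s_ac ON t (a, c) WITH (mcv);
a condition WHERE (a = 1) AND (b = 1) AND (c = 1) may be estimated as
P(a=1 & b=1) * P(c=1 | a=1), using s_ab for the first term and s_ac for
the conditional one.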
The implementation works in the other direction, though. We know what
probability P(A & B & C) we need to compute, and also what statistics
are available.
So we search for a combination of statistics covering the clauses in
an optimal way (most clauses covered, most dependencies exploited).
There are two possible approaches - exhaustive and greedy. The
exhaustive one walks through all permutations of the statistics, so
it's guaranteed to find the optimal solution, but it soon gets very
slow as it's roughly O(N!). Dynamic programming may improve that a
bit, but it's still far too expensive for large numbers of statistics
(on a single table).
The greedy algorithm is very simple - in every step it chooses the
locally best statistic. That may not guarantee the globally optimal
solution (but maybe it does?), yet it only needs N steps to find a
solution, so it's very fast (processing the selected stats is usually
way more expensive).
There's a GUC for selecting the search algorithm
mvstat_search = {'greedy', 'exhaustive'}
The default value is 'greedy' as that's much safer (with respect to
runtime). See choose_mv_statistics().
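For example, using the GUC as described above, the exhaustive search
may be tried in a session like this:
SET mvstat_search = 'exhaustive';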
Once we have found a sequence of statistics, we apply them to the
clauses using the conditional probabilities. We process the selected
stats one by one, and for each we select the estimated clauses and
conditions. See clauselist_selectivity() for more details.
Limitations
-----------
It's still true that each clause at a given level has to be covered by
a single MV statistics. So with this query
WHERE (clause1) AND (clause2) AND (clause3 OR clause4)
each parenthesized clause has to be covered by a single multivariate
statistics.
Clauses not covered by a single statistics at this level will be passed
to clause_selectivity() but this will treat them as a collection of
simpler clauses (connected by AND or OR), and the clauses from the
previous level will be used as conditions.
So using the same example, the last clause will be passed to
clause_selectivity() with 'clause1' and 'clause2' as conditions, and it
will be processed using multivariate stats if possible.
The other limitation is that all the expressions within a clause have
to be mv-compatible - there can't be a mix of mv-compatible and
incompatible expressions. If this is violated, the clause may be passed
to the next level (just like a list of clauses not covered by a single
statistics), which splits it into clauses handled by multivariate stats
and clauses handled by regular statistics.
rework clauselist_selectivity_or to handle OR-clauses correctly
We might invent a completely new set of functions here, resembling
clauselist_selectivity but adapting the ideas to OR-clauses.
But luckily we know that each OR-clause
(a OR b OR c)
may be rewritten as an equivalent AND-clause using negation:
NOT ((NOT a) AND (NOT b) AND (NOT c))
And that's something we can pass to clauselist_selectivity.
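In terms of selectivities this means computing
P(a OR b OR c) = 1 - P((NOT a) AND (NOT b) AND (NOT c))
where the right-hand side is handled by the existing AND-clause
machinery.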
---
contrib/file_fdw/file_fdw.c | 3 +-
contrib/postgres_fdw/postgres_fdw.c | 6 +-
src/backend/optimizer/path/clausesel.c | 1990 ++++++++++++++++++++++++++------
src/backend/optimizer/path/costsize.c | 23 +-
src/backend/optimizer/util/orclauses.c | 4 +-
src/backend/utils/adt/selfuncs.c | 17 +-
src/backend/utils/misc/guc.c | 20 +
src/backend/utils/mvstats/README.stats | 166 +++
src/include/optimizer/cost.h | 6 +-
src/include/utils/mvstats.h | 8 +
10 files changed, 1887 insertions(+), 356 deletions(-)
diff --git a/contrib/file_fdw/file_fdw.c b/contrib/file_fdw/file_fdw.c
index dc035d7..8f11b7a 100644
--- a/contrib/file_fdw/file_fdw.c
+++ b/contrib/file_fdw/file_fdw.c
@@ -969,7 +969,8 @@ estimate_size(PlannerInfo *root, RelOptInfo *baserel,
baserel->baserestrictinfo,
0,
JOIN_INNER,
- NULL);
+ NULL,
+ NIL);
nrows = clamp_row_est(nrows);
diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c
index d79e4cc..2f4af21 100644
--- a/contrib/postgres_fdw/postgres_fdw.c
+++ b/contrib/postgres_fdw/postgres_fdw.c
@@ -498,7 +498,8 @@ postgresGetForeignRelSize(PlannerInfo *root,
fpinfo->local_conds,
baserel->relid,
JOIN_INNER,
- NULL);
+ NULL,
+ NIL);
cost_qual_eval(&fpinfo->local_conds_cost, fpinfo->local_conds, root);
@@ -2149,7 +2150,8 @@ estimate_path_cost_size(PlannerInfo *root,
local_param_join_conds,
foreignrel->relid,
JOIN_INNER,
- NULL);
+ NULL,
+ NIL);
local_sel *= fpinfo->local_conds_sel;
rows = clamp_row_est(rows * local_sel);
diff --git a/src/backend/optimizer/path/clausesel.c b/src/backend/optimizer/path/clausesel.c
index 0de2418..c1b8999 100644
--- a/src/backend/optimizer/path/clausesel.c
+++ b/src/backend/optimizer/path/clausesel.c
@@ -29,6 +29,8 @@
#include "utils/selfuncs.h"
#include "utils/typcache.h"
+#include "miscadmin.h"
+
/*
* Data structure for accumulating info about possible range-query
@@ -44,6 +46,13 @@ typedef struct RangeQueryClause
Selectivity hibound; /* Selectivity of a var < something clause */
} RangeQueryClause;
+static Selectivity clauselist_selectivity_or(PlannerInfo *root,
+ List *clauses,
+ int varRelid,
+ JoinType jointype,
+ SpecialJoinInfo *sjinfo,
+ List *conditions);
+
static void addRangeClause(RangeQueryClause **rqlist, Node *clause,
bool varonleft, bool isLTsel, Selectivity s2);
@@ -60,23 +69,25 @@ static int count_mv_attnums(List *clauses, Index relid, int type);
static int count_varnos(List *clauses, Index *relid);
+static List *clauses_matching_statistic(List **clauses, MVStatisticInfo *statistic,
+ Index relid, int types, bool remove);
+
static List *clauselist_apply_dependencies(PlannerInfo *root, List *clauses,
Index relid, List *stats);
-static MVStatisticInfo *choose_mv_statistics(List *mvstats, Bitmapset *attnums);
-
-static List *clauselist_mv_split(PlannerInfo *root, Index relid,
- List *clauses, List **mvclauses,
- MVStatisticInfo *mvstats, int types);
-
static Selectivity clauselist_mv_selectivity(PlannerInfo *root,
- List *clauses, MVStatisticInfo *mvstats);
+ MVStatisticInfo *mvstats, List *clauses,
+ List *conditions, bool is_or);
static Selectivity clauselist_mv_selectivity_mcvlist(PlannerInfo *root,
- List *clauses, MVStatisticInfo *mvstats,
- bool *fullmatch, Selectivity *lowsel);
+ MVStatisticInfo *mvstats,
+ List *clauses, List *conditions,
+ bool is_or, bool *fullmatch,
+ Selectivity *lowsel);
static Selectivity clauselist_mv_selectivity_histogram(PlannerInfo *root,
- List *clauses, MVStatisticInfo *mvstats);
+ MVStatisticInfo *mvstats,
+ List *clauses, List *conditions,
+ bool is_or);
static int update_match_bitmap_mcvlist(PlannerInfo *root, List *clauses,
int2vector *stakeys, MCVList mcvlist,
@@ -90,10 +101,33 @@ static int update_match_bitmap_histogram(PlannerInfo *root, List *clauses,
int nmatches, char * matches,
bool is_or);
+/*
+ * Describes a combination of multiple statistics to cover attributes
+ * referenced by the clauses. The array 'stats' (with nstats elements)
+ * lists attributes (in the order as they are applied), and number of
+ * clause attributes covered by this solution.
+ *
+ * choose_mv_statistics_exhaustive() uses this to track both the current
+ * and the best solutions, while walking through the state of possible
+ * combination.
+ */
+typedef struct mv_solution_t {
+ int nclauses; /* number of clauses covered */
+ int nconditions; /* number of conditions covered */
+ int nstats; /* number of stats applied */
+ int *stats; /* stats (in the apply order) */
+} mv_solution_t;
+
+static List *choose_mv_statistics(PlannerInfo *root, Index relid,
+ List *mvstats, List *clauses, List *conditions);
+
static bool has_stats(List *stats, int type);
static List * find_stats(PlannerInfo *root, Index relid);
+static bool stats_type_matches(MVStatisticInfo *stat, int type);
+
+int mvstat_search_type = MVSTAT_SEARCH_GREEDY;
/* used for merging bitmaps - AND (min), OR (max) */
#define MAX(x, y) (((x) > (y)) ? (x) : (y))
@@ -168,14 +202,15 @@ clauselist_selectivity(PlannerInfo *root,
List *clauses,
int varRelid,
JoinType jointype,
- SpecialJoinInfo *sjinfo)
+ SpecialJoinInfo *sjinfo,
+ List *conditions)
{
Selectivity s1 = 1.0;
RangeQueryClause *rqlist = NULL;
ListCell *l;
/* processing mv stats */
- Oid relid = InvalidOid;
+ Index relid = InvalidOid;
/* list of multivariate stats on the relation */
List *stats = NIL;
@@ -191,12 +226,13 @@ clauselist_selectivity(PlannerInfo *root,
stats = find_stats(root, relid);
/*
- * If there's exactly one clause, then no use in trying to match up pairs,
- * so just go directly to clause_selectivity().
+ * If there's exactly one clause, then no use in trying to match up
+ * pairs, or matching multivariate statistics, so just go directly
+ * to clause_selectivity().
*/
if (list_length(clauses) == 1)
return clause_selectivity(root, (Node *) linitial(clauses),
- varRelid, jointype, sjinfo);
+ varRelid, jointype, sjinfo, conditions);
/*
* Apply functional dependencies, but first check that there are some stats
@@ -228,31 +264,96 @@ clauselist_selectivity(PlannerInfo *root,
(count_mv_attnums(clauses, relid,
MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST) >= 2))
{
- /* collect attributes from the compatible conditions */
- Bitmapset *mvattnums = collect_mv_attnums(clauses, relid,
- MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST);
+ ListCell *s;
+
+ /*
+ * Copy the conditions we got from the upper part of the expression tree
+ * so that we can add local conditions to it (we need to keep the
+ * original list intact, for sibling expressions - other expressions
+ * at the same level).
+ */
+ List *conditions_local = list_copy(conditions);
- /* and search for the statistic covering the most attributes */
- MVStatisticInfo *mvstat = choose_mv_statistics(stats, mvattnums);
+ /* find the best combination of statistics */
+ List *solution = choose_mv_statistics(root, relid, stats,
+ clauses, conditions);
- if (mvstat != NULL) /* we have a matching stats */
+ /*
+ * We have a good solution, which is merely a list of statistics that
+ * we need to apply. We'll apply the statistics one by one (in the order
+ * as they appear in the list), and for each statistic we'll
+ *
+ * (1) find clauses compatible with the statistic (and remove them
+ * from the list)
+ *
+ * (2) find local conditions compatible with the statistic
+ *
+ * (3) do the estimation P(clauses | conditions)
+ *
+ * (4) append the estimated clauses to local conditions
+ *
+ * continuously modify
+ */
+ foreach (s, solution)
{
- /* clauses compatible with multi-variate stats */
- List *mvclauses = NIL;
+ MVStatisticInfo *mvstat = (MVStatisticInfo *)lfirst(s);
- /* split the clauselist into regular and mv-clauses */
- clauses = clauselist_mv_split(root, relid, clauses, &mvclauses,
- mvstat, MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST);
+ /* clauses compatible with the statistic we're applying right now */
+ List *stat_clauses = NIL;
+ List *stat_conditions = NIL;
- /* we've chosen the histogram to match the clauses */
- Assert(mvclauses != NIL);
+ /*
+ * Find clauses and conditions matching the statistic - the clauses
+ * need to be removed from the list, while conditions should remain
+ * there (so that we can apply them repeatedly).
+ */
+ stat_clauses
+ = clauses_matching_statistic(&clauses, mvstat, relid,
+ MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST,
+ true);
+
+ stat_conditions
+ = clauses_matching_statistic(&conditions_local, mvstat, relid,
+ MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST,
+ false);
+
+ /*
+ * If we got no clauses to estimate, we've done something wrong,
+ * either during the optimization, while detecting compatible clauses,
+ * or somewhere else.
+ *
+ * Also, we need at least two attributes in clauses and conditions.
+ */
+ Assert(stat_clauses != NIL);
+ Assert(count_mv_attnums(list_union(stat_clauses, stat_conditions),
+ relid, MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST) >= 2);
/* compute the multivariate stats */
- s1 *= clauselist_mv_selectivity(root, mvclauses, mvstat);
+ s1 *= clauselist_mv_selectivity(root, mvstat,
+ stat_clauses, stat_conditions,
+ false); /* AND */
+
+ /*
+ * Add the new clauses to the local conditions, so that we can use
+ * them for the subsequent statistics. We only add the clauses,
+ * because the conditions are already there (or should be).
+ */
+ conditions_local = list_concat(conditions_local, stat_clauses);
}
+
+ /* from now on, work only with the 'local' list of conditions */
+ conditions = conditions_local;
}
/*
+ * If the multivariate estimation left exactly one clause, there's no use
+ * in trying to match up pairs, so just go directly to clause_selectivity().
+ */
+ if (list_length(clauses) == 1)
+ return s1 * clause_selectivity(root, (Node *) linitial(clauses),
+ varRelid, jointype, sjinfo, conditions);
+
+ /*
* Initial scan over clauses. Anything that doesn't look like a potential
* rangequery clause gets multiplied into s1 and forgotten. Anything that
* does gets inserted into an rqlist entry.
@@ -264,7 +365,8 @@ clauselist_selectivity(PlannerInfo *root,
Selectivity s2;
/* Always compute the selectivity using clause_selectivity */
- s2 = clause_selectivity(root, clause, varRelid, jointype, sjinfo);
+ s2 = clause_selectivity(root, clause, varRelid, jointype, sjinfo,
+ conditions);
/*
* Check for being passed a RestrictInfo.
@@ -423,6 +525,55 @@ clauselist_selectivity(PlannerInfo *root,
}
/*
+ * Similar to clauselist_selectivity(), but for OR-clauses. We can't simply
+ * reuse the multi-statistic estimation logic used for AND-clauses, at least
+ * not directly, because there are a few key differences:
+ *
+ * - functional dependencies don't really apply to OR-clauses
+ *
+ * - clauselist_selectivity() is based on decomposing the selectivity into
+ * a sequence of conditional probabilities (selectivities), but that can
+ * be done only for AND-clauses
+ *
+ * We might invent a similar infrastructure for optimizing OR-clauses, doing
+ * something similar to what clauselist_selectivity does for AND-clauses, but
+ * luckily, thanks to De Morgan's laws, we know that each OR-clause
+ *
+ * (a OR b OR c)
+ *
+ * may be rewritten as an equivalent negated AND-clause:
+ *
+ * NOT ((NOT a) AND (NOT b) AND (NOT c))
+ *
+ * And that's something we can pass to clauselist_selectivity and let it do
+ * all the heavy lifting.
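+ *
+ * In terms of selectivities, that means computing
+ *
+ * P(a OR b OR c) = 1.0 - P((NOT a) AND (NOT b) AND (NOT c))
+ *
+ * so e.g. (hypothetical numbers) if the AND of the negated clauses is
+ * estimated as 0.4, the OR-clause gets estimated as 1.0 - 0.4 = 0.6.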
+ */
+static Selectivity
+clauselist_selectivity_or(PlannerInfo *root,
+ List *clauses,
+ int varRelid,
+ JoinType jointype,
+ SpecialJoinInfo *sjinfo,
+ List *conditions)
+{
+ List *args = NIL;
+ ListCell *l;
+ Expr *expr;
+
+ /* build arguments for the AND-clause by negating args of the OR-clause */
+ foreach (l, clauses)
+ args = lappend(args, makeBoolExpr(NOT_EXPR, list_make1(lfirst(l)), -1));
+
+ /* and then build an AND-clause over the negated args */
+ expr = makeBoolExpr(AND_EXPR, args, -1);
+
+ /* instead of constructing NOT expression, just do (1.0 - s) */
+ return 1.0 - clauselist_selectivity(root, list_make1(expr), varRelid,
+ jointype, sjinfo, conditions);
+}
+
+/*
* addRangeClause --- add a new range clause for clauselist_selectivity
*
* Here is where we try to match up pairs of range-query clauses
@@ -629,7 +780,8 @@ clause_selectivity(PlannerInfo *root,
Node *clause,
int varRelid,
JoinType jointype,
- SpecialJoinInfo *sjinfo)
+ SpecialJoinInfo *sjinfo,
+ List *conditions)
{
Selectivity s1 = 0.5; /* default for any unhandled clause type */
RestrictInfo *rinfo = NULL;
@@ -749,7 +901,8 @@ clause_selectivity(PlannerInfo *root,
(Node *) get_notclausearg((Expr *) clause),
varRelid,
jointype,
- sjinfo);
+ sjinfo,
+ conditions);
}
else if (and_clause(clause))
{
@@ -758,29 +911,18 @@ clause_selectivity(PlannerInfo *root,
((BoolExpr *) clause)->args,
varRelid,
jointype,
- sjinfo);
+ sjinfo,
+ conditions);
}
else if (or_clause(clause))
{
- /*
- * Selectivities for an OR clause are computed as s1+s2 - s1*s2 to
- * account for the probable overlap of selected tuple sets.
- *
- * XXX is this too conservative?
- */
- ListCell *arg;
-
- s1 = 0.0;
- foreach(arg, ((BoolExpr *) clause)->args)
- {
- Selectivity s2 = clause_selectivity(root,
- (Node *) lfirst(arg),
- varRelid,
- jointype,
- sjinfo);
-
- s1 = s1 + s2 - s1 * s2;
- }
+ /* just call to clauselist_selectivity_or() */
+ s1 = clauselist_selectivity_or(root,
+ ((BoolExpr *) clause)->args,
+ varRelid,
+ jointype,
+ sjinfo,
+ conditions);
}
else if (is_opclause(clause) || IsA(clause, DistinctExpr))
{
@@ -870,7 +1012,8 @@ clause_selectivity(PlannerInfo *root,
(Node *) ((RelabelType *) clause)->arg,
varRelid,
jointype,
- sjinfo);
+ sjinfo,
+ conditions);
}
else if (IsA(clause, CoerceToDomain))
{
@@ -879,7 +1022,8 @@ clause_selectivity(PlannerInfo *root,
(Node *) ((CoerceToDomain *) clause)->arg,
varRelid,
jointype,
- sjinfo);
+ sjinfo,
+ conditions);
}
else
{
@@ -943,15 +1087,16 @@ clause_selectivity(PlannerInfo *root,
* in the MCV list, then the selectivity is below the lowest frequency
* found in the MCV list,
*
- * TODO When applying the clauses to the histogram/MCV list, we can do
- * that from the most selective clauses first, because that'll
- * eliminate the buckets/items sooner (so we'll be able to skip
- * them without inspection, which is more expensive). But this
- * requires really knowing the per-clause selectivities in advance,
- * and that's not what we do now.
+ * TODO When applying the clauses to the histogram/MCV list, we can do that from
+ * the most selective clauses first, because that'll eliminate the
+ * buckets/items sooner (so we'll be able to skip them without inspection,
+ * which is more expensive). But this requires really knowing the
+ * per-clause selectivities in advance, and that's not what we do now.
*/
static Selectivity
-clauselist_mv_selectivity(PlannerInfo *root, List *clauses, MVStatisticInfo *mvstats)
+clauselist_mv_selectivity(PlannerInfo *root, MVStatisticInfo *mvstats,
+ List *clauses, List *conditions, bool is_or)
{
bool fullmatch = false;
Selectivity s1 = 0.0, s2 = 0.0;
@@ -969,7 +1114,8 @@ clauselist_mv_selectivity(PlannerInfo *root, List *clauses, MVStatisticInfo *mvs
*/
/* Evaluate the MCV first. */
- s1 = clauselist_mv_selectivity_mcvlist(root, clauses, mvstats,
+ s1 = clauselist_mv_selectivity_mcvlist(root, mvstats,
+ clauses, conditions, is_or,
&fullmatch, &mcv_low);
/*
@@ -982,7 +1128,8 @@ clauselist_mv_selectivity(PlannerInfo *root, List *clauses, MVStatisticInfo *mvs
/* TODO if (fullmatch) without matching MCV item, use the mcv_low
* selectivity as upper bound */
- s2 = clauselist_mv_selectivity_histogram(root, clauses, mvstats);
+ s2 = clauselist_mv_selectivity_histogram(root, mvstats,
+ clauses, conditions, is_or);
/* TODO clamp to <= 1.0 (or more strictly, when possible) */
return s1 + s2;
@@ -1016,260 +1163,1325 @@ get_varattnos(Node * node, Index relid)
k + FirstLowInvalidHeapAttributeNumber);
}
- bms_free(varattnos);
+ bms_free(varattnos);
+
+ return result;
+}
+
+/*
+ * Collect attributes from mv-compatible clauses.
+ */
+static Bitmapset *
+collect_mv_attnums(List *clauses, Index relid, int types)
+{
+ Bitmapset *attnums = NULL;
+ ListCell *l;
+
+ /*
+ * Walk through the clauses and identify the ones we can estimate
+ * using multivariate stats, and remember the relid/columns. We'll
+ * then cross-check if we have suitable stats, and only if needed
+ * we'll split the clauses into multivariate and regular lists.
+ *
+ * For now we're only interested in RestrictInfo nodes with nested
+ * OpExpr, using either a range or equality.
+ */
+ foreach (l, clauses)
+ {
+ Node *clause = (Node *) lfirst(l);
+
+ /* ignore the result here - we only need the attnums */
+ clause_is_mv_compatible(clause, relid, &attnums, types);
+ }
+
+ /*
+ * If there are not at least two attributes referenced by the clause(s),
+ * we can throw everything out (as we'll revert to simple stats).
+ */
+ if (bms_num_members(attnums) <= 1)
+ {
+ bms_free(attnums);
+ attnums = NULL;
+ }
+
+ return attnums;
+}
+
+/*
+ * Count the number of attributes in clauses compatible with multivariate stats.
+ */
+static int
+count_mv_attnums(List *clauses, Index relid, int type)
+{
+ int c;
+ Bitmapset *attnums = collect_mv_attnums(clauses, relid, type);
+
+ c = bms_num_members(attnums);
+
+ bms_free(attnums);
+
+ return c;
+}
+
+/*
+ * Count varnos referenced in the clauses, and if there's a single varno then
+ * store it in 'relid'.
+ */
+static int
+count_varnos(List *clauses, Index *relid)
+{
+ int cnt;
+ Bitmapset *varnos = NULL;
+
+ varnos = pull_varnos((Node *) clauses);
+ cnt = bms_num_members(varnos);
+
+ /* if there's a single varno in the clauses, remember it */
+ if (bms_num_members(varnos) == 1)
+ *relid = bms_singleton_member(varnos);
+
+ bms_free(varnos);
+
+ return cnt;
+}
+
+static List *
+clauses_matching_statistic(List **clauses, MVStatisticInfo *statistic,
+ Index relid, int types, bool remove)
+{
+ int i;
+ Bitmapset *stat_attnums = NULL;
+ List *matching_clauses = NIL;
+ ListCell *lc;
+
+ /* build attnum bitmapset for this statistics */
+ for (i = 0; i < statistic->stakeys->dim1; i++)
+ stat_attnums = bms_add_member(stat_attnums,
+ statistic->stakeys->values[i]);
+
+ /*
+ * We can't use foreach here, because we may need to remove some of the
+ * clauses if (remove=true).
+ */
+ lc = list_head(*clauses);
+ while (lc)
+ {
+ Node *clause = (Node*)lfirst(lc);
+ Bitmapset *attnums = NULL;
+
+ /* must advance lc before list_delete possibly pfree's it */
+ lc = lnext(lc);
+
+ /*
+ * skip clauses that are not compatible with stats (just leave them
+ * in the original list)
+ *
+ * XXX Perhaps this should check what stats are actually available in
+ * the statistics (not a big deal now, because MCV and histograms
+ * handle the same types of conditions).
+ */
+ if (! clause_is_mv_compatible(clause, relid, &attnums, types))
+ {
+ bms_free(attnums);
+ continue;
+ }
+
+ /* if the clause is covered by the statistic, add it to the list */
+ if (bms_is_subset(attnums, stat_attnums))
+ {
+ matching_clauses = lappend(matching_clauses, clause);
+
+ /* if remove=true, remove the matching item from the main list */
+ if (remove)
+ *clauses = list_delete_ptr(*clauses, clause);
+ }
+
+ bms_free(attnums);
+ }
+
+ bms_free(stat_attnums);
+
+ return matching_clauses;
+}
+
+/*
+ * Selects the best combination of multivariate statistics, in an exhaustive
+ * way, where 'best' means:
+ *
+ * (a) covering the most attributes (referenced by clauses)
+ * (b) using the least number of multivariate stats
+ * (c) using the most conditions to exploit dependency
+ *
+ * Don't call this directly but through choose_mv_statistics(), which does some
+ * additional tricks to minimize the runtime.
+ *
+ *
+ * Algorithm
+ * ---------
+ * The algorithm is a recursive implementation of backtracking, with maximum
+ * depth equal to the number of multi-variate statistics available on the table.
+ * It actually explores all valid combinations of stats.
+ *
+ * Whenever it considers adding the next statistics, the clauses it matches are
+ * divided into 'conditions' (clauses already matched by at least one previous
+ * statistics) and clauses that are estimated.
+ *
+ * Then several checks are performed:
+ *
+ * (a) The statistics covers at least 2 columns, referenced in the estimated
+ * clauses (otherwise multi-variate stats are useless).
+ *
+ * (b) The statistics covers at least 1 new column, i.e. a column not referenced
+ * by the already used stats (and the new column has to be referenced by
+ * the clauses, of course). Otherwise the statistics would not add any new
+ * information.
+ *
+ * There are some other sanity checks (e.g. stats must not be used twice etc.).
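+ *
+ * For example (hypothetical case), with clauses on columns (a,b,c) and
+ * statistics S1 on (a,b), S2 on (b,c) and S3 on (a,b,c), the backtracking
+ * explores solutions like [S3], [S1], [S1,S2], [S2], [S2,S1] and keeps
+ * the one covering the most clauses with the fewest statistics (here
+ * [S3], covering all three columns with a single statistics).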
+ *
+ *
+ * Weaknesses
+ * ----------
+ * The current implementation uses a rather simple optimality criterion, so it
+ * may not make the best choice when
+ *
+ * (a) There may be multiple solutions with the same number of covered
+ * attributes and number of statistics (e.g. the same solution but with
+ * statistics in a different order). It's unclear which solution is the best
+ * one - in a sense all of them are equal.
+ *
+ * TODO It might be possible to compute estimate for each of those solutions,
+ * and then combine them to get the final estimate (e.g. by using average
+ * or median).
+ *
+ * (b) Does not consider that some types of stats are a better match for some
+ * types of clauses (e.g. MCV list is generally a better match for equality
+ * conditions than a histogram).
+ *
+ * But maybe this is pointless - generally, each column is either a label
+ * (it's not important whether because of the data type or how it's used),
+ * or a value with ordering that makes sense. So either a MCV list is more
+ * appropriate (labels) or a histogram (values with orderings).
+ *
+ * Not sure what to do with statistics on columns mixing both types of data
+ * (some columns would work best with MCVs, some with histograms). Maybe we
+ * could invent a new type of statistics combining MCV list and histogram
+ * (keeping a small histogram for each MCV item, and a separate histogram
+ * for values not on the MCV list).
+ *
+ * TODO The algorithm should probably count number of Vars (not just attnums)
+ * when computing the 'score' of each solution. Computing the ratio of
+ * (num of all vars) / (num of condition vars) as a measure of how well
+ * the solution uses conditions might be useful.
+ */
+static void
+choose_mv_statistics_exhaustive(PlannerInfo *root, int step,
+ int nmvstats, MVStatisticInfo *mvstats, Bitmapset ** stats_attnums,
+ int nclauses, Node ** clauses, Bitmapset ** clauses_attnums,
+ int nconditions, Node ** conditions, Bitmapset ** conditions_attnums,
+ bool *cover_map, bool *condition_map, int *ruled_out,
+ mv_solution_t *current, mv_solution_t **best)
+{
+ int i, j;
+
+ Assert(best != NULL);
+ Assert((step == 0 && current == NULL) || (step > 0 && current != NULL));
+
+ /* this may run for a long time, so let's make it interruptible */
+ CHECK_FOR_INTERRUPTS();
+
+ if (current == NULL)
+ {
+ current = (mv_solution_t*)palloc0(sizeof(mv_solution_t));
+ current->stats = (int*)palloc0(sizeof(int)*nmvstats);
+ current->nstats = 0;
+ current->nclauses = 0;
+ current->nconditions = 0;
+ }
+
+ /*
+ * Now try to apply each statistics, matching at least two attributes,
+ * unless it's already used in one of the previous steps.
+ */
+ for (i = 0; i < nmvstats; i++)
+ {
+ int c;
+
+ int ncovered_clauses = 0; /* number of covered clauses */
+ int ncovered_conditions = 0; /* number of covered conditions */
+ int nattnums = 0; /* number of covered attributes */
+
+ Bitmapset *all_attnums = NULL;
+ Bitmapset *new_attnums = NULL;
+
+ /* skip statistics that were already used or eliminated */
+ if (ruled_out[i] != -1)
+ continue;
+
+ /*
+ * See if we have clauses covered by this statistics, but not
+ * yet covered by any of the preceding ones.
+ */
+ for (c = 0; c < nclauses; c++)
+ {
+ bool covered = false;
+ Bitmapset *clause_attnums = clauses_attnums[c];
+ Bitmapset *tmp = NULL;
+
+ /*
+ * If this clause is not covered by this stats, we can't
+ * use the stats to estimate that at all.
+ */
+ if (! cover_map[i * nclauses + c])
+ continue;
+
+ /*
+ * Now we know we'll use this clause - either as a condition
+ * or as a new clause (the estimated one). So let's add the
+ * attributes to the attnums from all the clauses usable with
+ * this statistics.
+ */
+ tmp = bms_union(all_attnums, clause_attnums);
+
+ /* free the old bitmap */
+ bms_free(all_attnums);
+ all_attnums = tmp;
+
+ /* let's see if it's covered by any of the previous stats */
+ for (j = 0; j < step; j++)
+ {
+ /* already covered by the previous stats */
+ if (cover_map[current->stats[j] * nclauses + c])
+ covered = true;
+
+ if (covered)
+ break;
+ }
+
+ /* if already covered, continue with the next clause */
+ if (covered)
+ {
+ ncovered_conditions += 1;
+ continue;
+ }
+
+ /*
+ * OK, this clause is covered by this statistics (and not by
+ * any of the previous ones)
+ */
+ ncovered_clauses += 1;
+
+ /* add the attnums into attnums from 'new clauses' */
+ // new_attnums = bms_union(new_attnums, clause_attnums);
+ }
+
+ /* can't have more new clauses than original clauses */
+ Assert(nclauses >= ncovered_clauses);
+ Assert(ncovered_clauses >= 0); /* mostly paranoia */
+
+ nattnums = bms_num_members(all_attnums);
+
+ /* free all the bitmapsets - we don't need them anymore */
+ bms_free(all_attnums);
+ bms_free(new_attnums);
+
+ all_attnums = NULL;
+ new_attnums = NULL;
+
+ /*
+ * Now do the same for the conditions - see which of them are
+ * covered by this statistics.
+ */
+ for (c = 0; c < nconditions; c++)
+ {
+ Bitmapset *clause_attnums = conditions_attnums[c];
+ Bitmapset *tmp = NULL;
+
+ /*
+ * If this clause is not covered by this stats, we can't
+ * use the stats to estimate that at all.
+ */
+ if (! condition_map[i * nconditions + c])
+ continue;
+
+ /* count this as a condition */
+ ncovered_conditions += 1;
+
+ /*
+ * Now we know we'll use this clause - either as a condition
+ * or as a new clause (the estimated one). So let's add the
+ * attributes to the attnums from all the clauses usable with
+ * this statistics.
+ */
+ tmp = bms_union(all_attnums, clause_attnums);
+
+ /* free the old bitmap */
+ bms_free(all_attnums);
+ all_attnums = tmp;
+ }
+
+ /*
+ * Let's mark the statistics as 'ruled out' - either we'll use
+ * it (and proceed to the next step), or it's incompatible.
+ */
+ ruled_out[i] = step;
+
+ /*
+ * There are no clauses usable with this statistics (not already
+ * covered by some of the previous stats).
+ *
+ * Similarly, if the clauses only use a single attribute, we
+ * can't really use that.
+ */
+ if ((ncovered_clauses == 0) || (nattnums < 2))
+ continue;
+
+ /*
+ * TODO Not sure if it's possible to add a clause referencing
+ * only attributes already covered by previous stats?
+ * Introducing only some new dependency, not a new
+ * attribute. Couldn't come up with an example, though.
+ * Might be worth adding some assert.
+ */
+
+ /*
+ * got a suitable statistics - let's update the current solution,
+ * maybe use it as the best solution
+ */
+ current->nclauses += ncovered_clauses;
+ current->nconditions += ncovered_conditions;
+ current->nstats += 1;
+ current->stats[step] = i;
+
+ /*
+ * We can never cover more clauses, or use more stats, than we
+ * actually have at the beginning.
+ */
+ Assert(nclauses >= current->nclauses);
+ Assert(nmvstats >= current->nstats);
+ Assert(step < nmvstats);
+
+ if (*best == NULL)
+ {
+ *best = (mv_solution_t*)palloc0(sizeof(mv_solution_t));
+ (*best)->stats = (int*)palloc0(sizeof(int)*nmvstats);
+ (*best)->nstats = 0;
+ (*best)->nclauses = 0;
+ (*best)->nconditions = 0;
+ }
+
+ /*
+ * See if it's better than the current 'best' solution - i.e. covers
+ * more clauses, or the same number of clauses with fewer statistics
+ * (as per the optimality criteria described above).
+ */
+ if ((current->nclauses > (*best)->nclauses) ||
+ ((current->nclauses == (*best)->nclauses) &&
+ ((current->nstats < (*best)->nstats))))
+ {
+ (*best)->nstats = current->nstats;
+ (*best)->nclauses = current->nclauses;
+ (*best)->nconditions = current->nconditions;
+ memcpy((*best)->stats, current->stats, nmvstats * sizeof(int));
+ }
+
+ /*
+ * The recursion only makes sense if we haven't covered all the
+ * attributes (then adding stats is not really possible).
+ */
+ if ((step + 1) < nmvstats)
+ choose_mv_statistics_exhaustive(root, step+1,
+ nmvstats, mvstats, stats_attnums,
+ nclauses, clauses, clauses_attnums,
+ nconditions, conditions, conditions_attnums,
+ cover_map, condition_map, ruled_out,
+ current, best);
+
+ /* reset the last step */
+ current->nclauses -= ncovered_clauses;
+ current->nconditions -= ncovered_conditions;
+ current->nstats -= 1;
+ current->stats[step] = 0;
+
+ /* mark the statistics as usable again */
+ ruled_out[i] = -1;
+
+ Assert(current->nclauses >= 0);
+ Assert(current->nstats >= 0);
+ }
+
+ /* reset all statistics as 'incompatible' in this step */
+ for (i = 0; i < nmvstats; i++)
+ if (ruled_out[i] == step)
+ ruled_out[i] = -1;
+
+}
+
+/*
+ * Greedy search for a multivariate solution - a sequence of statistics covering
+ * the clauses. This chooses the "best" statistics at each step, so the
+ * resulting solution may not be the best solution globally, but this produces
+ * the solution in only N steps (where N is the number of statistics), while
+ * the exhaustive approach may have to walk through ~N! combinations (although
+ * some of those are terminated early).
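+ *
+ * At each step this picks the statistics maximizing the gain
+ *
+ * gain = num_cond_columns / num_cov_columns
+ *
+ * i.e. the fraction of covered column references that are already
+ * covered by conditions (and thus exploitable as dependencies). So,
+ * with hypothetical numbers, a statistics covering 4 column references
+ * with 2 of them in conditions (gain 0.5) beats a statistics covering
+ * 3 references with none in conditions (gain 0.0).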
+ *
+ * See the comments at choose_mv_statistics_exhaustive() as this does the same
+ * thing (but in a different way).
+ *
+ * Don't call this directly, but through choose_mv_statistics().
+ *
+ * TODO There are probably other metrics we might use - e.g. using number of
+ * columns (num_cond_columns / num_cov_columns), which might work better
+ * with a mix of simple and complex clauses.
+ *
+ * TODO Also the choice at the very first step should be handled in a special
+ * way, because there will be 0 conditions at that moment, so there needs
+ * to be some other criteria - e.g. using the simplest (or most complex?)
+ * clause might be a good idea.
+ *
+ * TODO We might also select multiple stats using different criteria, and branch
+ * the search. This is however tricky, because if we choose k statistics at
+ * each step, we get k^N branches to walk through (with N steps). That's
+ * not really good with a large number of stats (yet better than exhaustive
+ * search).
+ */
+static void
+choose_mv_statistics_greedy(PlannerInfo *root, int step,
+ int nmvstats, MVStatisticInfo *mvstats, Bitmapset ** stats_attnums,
+ int nclauses, Node ** clauses, Bitmapset ** clauses_attnums,
+ int nconditions, Node ** conditions, Bitmapset ** conditions_attnums,
+ bool *cover_map, bool *condition_map, int *ruled_out,
+ mv_solution_t *current, mv_solution_t **best)
+{
+ int i, j;
+ int best_stat = -1;
+ double gain, max_gain = -1.0;
+
+ /*
+ * Bitmap tracking which clauses are already covered (by the previous
+ * statistics) and may thus serve only as a condition in this step.
+ */
+ bool *covered_clauses = (bool*)palloc0(nclauses);
+
+ /*
+ * Number of clauses and columns covered by each statistics - this
+ * includes both conditions and clauses covered by the statistics for
+ * the first time. The number of columns may count some columns
+ * repeatedly - if a column is shared by multiple clauses, it will
+ * be counted once for each clause (covered by the statistics).
+ * So with two clauses [(a=1 OR b=2),(a<2 OR c>1)] the column "a"
+ * will be counted twice (if both clauses are covered).
+ *
+ * The values for ruled-out statistics (that can't be applied) are
+ * not computed, because that'd be pointless.
+ */
+ int *num_cov_clauses = (int*)palloc0(sizeof(int) * nmvstats);
+ int *num_cov_columns = (int*)palloc0(sizeof(int) * nmvstats);
+
+ /*
+ * Same as above, but this only includes clauses that are already
+ * covered by the previous stats (and the current one).
+ */
+ int *num_cond_clauses = (int*)palloc0(sizeof(int) * nmvstats);
+ int *num_cond_columns = (int*)palloc0(sizeof(int) * nmvstats);
+
+ /*
+ * Number of attributes for each clause.
+ *
+ * TODO Might be computed in choose_mv_statistics() and then passed
+ * here, but then the function would not have the same signature
+ * as _exhaustive().
+ */
+ int *attnum_counts = (int*)palloc0(sizeof(int) * nclauses);
+ int *attnum_cond_counts = (int*)palloc0(sizeof(int) * nconditions);
+
+ CHECK_FOR_INTERRUPTS();
+
+ Assert(best != NULL);
+ Assert((step == 0 && current == NULL) || (step > 0 && current != NULL));
+
+ /* compute attributes (columns) for each clause */
+ for (i = 0; i < nclauses; i++)
+ attnum_counts[i] = bms_num_members(clauses_attnums[i]);
+
+ /* compute attributes (columns) for each condition */
+ for (i = 0; i < nconditions; i++)
+ attnum_cond_counts[i] = bms_num_members(conditions_attnums[i]);
+
+ /* see which clauses are already covered at this point (by previous stats) */
+ for (i = 0; i < step; i++)
+ for (j = 0; j < nclauses; j++)
+ covered_clauses[j] |= (cover_map[current->stats[i] * nclauses + j]);
+
+ /* which remaining statistics covers most clauses / uses most conditions? */
+ for (i = 0; i < nmvstats; i++)
+ {
+ Bitmapset *attnums_covered = NULL;
+ Bitmapset *attnums_conditions = NULL;
+
+ /* skip stats that are already ruled out (either used or inapplicable) */
+ if (ruled_out[i] != -1)
+ continue;
+
+ /* count covered clauses and conditions (for the statistics) */
+ for (j = 0; j < nclauses; j++)
+ {
+ if (cover_map[i * nclauses + j])
+ {
+ Bitmapset *attnums_new
+ = bms_union(attnums_covered, clauses_attnums[j]);
+
+ /* get rid of the old bitmap and keep the unified result */
+ bms_free(attnums_covered);
+ attnums_covered = attnums_new;
+
+ num_cov_clauses[i] += 1;
+ num_cov_columns[i] += attnum_counts[j];
+
+ /* is the clause already covered (i.e. a condition)? */
+ if (covered_clauses[j])
+ {
+ num_cond_clauses[i] += 1;
+ num_cond_columns[i] += attnum_counts[j];
+ attnums_new = bms_union(attnums_conditions,
+ clauses_attnums[j]);
+
+ bms_free(attnums_conditions);
+ attnums_conditions = attnums_new;
+ }
+ }
+ }
+
+ /* if all covered clauses are covered by prev stats (thus conditions) */
+ if (num_cov_clauses[i] == num_cond_clauses[i])
+ ruled_out[i] = step;
+
+ /* same if there are no new attributes */
+ else if (bms_num_members(attnums_conditions) == bms_num_members(attnums_covered))
+ ruled_out[i] = step;
+
+ bms_free(attnums_covered);
+ bms_free(attnums_conditions);
+
+ /* if the statistics is inapplicable, try the next one */
+ if (ruled_out[i] != -1)
+ continue;
+
+ /* now let's walk through conditions and count the covered */
+ for (j = 0; j < nconditions; j++)
+ {
+ if (condition_map[i * nconditions + j])
+ {
+ num_cond_clauses[i] += 1;
+ num_cond_columns[i] += attnum_cond_counts[j];
+ }
+ }
+
+ /* otherwise see if this improves the interesting metrics */
+ gain = num_cond_columns[i] / (double)num_cov_columns[i];
+
+ if (gain > max_gain)
+ {
+ max_gain = gain;
+ best_stat = i;
+ }
+ }
+
+ /*
+ * Have we found a suitable statistics? Add it to the solution and
+ * try next step.
+ */
+ if (best_stat != -1)
+ {
+ /* mark the statistics, so that we skip it in next steps */
+ ruled_out[best_stat] = step;
+
+ /* allocate current solution if necessary */
+ if (current == NULL)
+ {
+ current = (mv_solution_t*)palloc0(sizeof(mv_solution_t));
+ current->stats = (int*)palloc0(sizeof(int)*nmvstats);
+ current->nstats = 0;
+ current->nclauses = 0;
+ current->nconditions = 0;
+ }
+
+ current->nclauses += num_cov_clauses[best_stat];
+ current->nconditions += num_cond_clauses[best_stat];
+ current->stats[step] = best_stat;
+ current->nstats++;
+
+ if (*best == NULL)
+ {
+ (*best) = (mv_solution_t*)palloc0(sizeof(mv_solution_t));
+ (*best)->nstats = current->nstats;
+ (*best)->nclauses = current->nclauses;
+ (*best)->nconditions = current->nconditions;
+
+ (*best)->stats = (int*)palloc0(sizeof(int)*nmvstats);
+ memcpy((*best)->stats, current->stats, nmvstats * sizeof(int));
+ }
+ else
+ {
+ /* see if this is a better solution */
+ double current_gain = (double)current->nconditions / current->nclauses;
+ double best_gain = (double)(*best)->nconditions / (*best)->nclauses;
+
+ if ((current_gain > best_gain) ||
+ ((current_gain == best_gain) && (current->nstats < (*best)->nstats)))
+ {
+ (*best)->nstats = current->nstats;
+ (*best)->nclauses = current->nclauses;
+ (*best)->nconditions = current->nconditions;
+ memcpy((*best)->stats, current->stats, nmvstats * sizeof(int));
+ }
+ }
+
+ /*
+ * The recursion only makes sense if we haven't covered all the
+ * attributes (then adding stats is not really possible).
+ */
+ if ((step + 1) < nmvstats)
+ choose_mv_statistics_greedy(root, step+1,
+ nmvstats, mvstats, stats_attnums,
+ nclauses, clauses, clauses_attnums,
+ nconditions, conditions, conditions_attnums,
+ cover_map, condition_map, ruled_out,
+ current, best);
+
+ /* reset the last step */
+ current->nclauses -= num_cov_clauses[best_stat];
+ current->nconditions -= num_cond_clauses[best_stat];
+ current->nstats -= 1;
+ current->stats[step] = 0;
+
+ /* mark the statistics as usable again */
+ ruled_out[best_stat] = -1;
+ }
+
+ /* reset all statistics eliminated in this step */
+ for (i = 0; i < nmvstats; i++)
+ if (ruled_out[i] == step)
+ ruled_out[i] = -1;
+
+ /* free everything allocated in this step */
+ pfree(covered_clauses);
+ pfree(attnum_counts);
+ pfree(num_cov_clauses);
+ pfree(num_cov_columns);
+ pfree(num_cond_clauses);
+ pfree(num_cond_columns);
+}
+
+/*
+ * Remove clauses not covered by any of the available statistics
+ *
+ * This helps us to reduce the amount of work done in choose_mv_statistics()
+ * by not having to deal with clauses that can't possibly be useful.
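+ *
+ * For example, if the only remaining statistics is on columns (a,b),
+ * a clause (c = 1) can't possibly be covered by any multivariate
+ * statistics, so there's no point in keeping it in the optimization.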
+ */
+static List *
+filter_clauses(PlannerInfo *root, Index relid, int type,
+ List *stats, List *clauses, Bitmapset **attnums)
+{
+ ListCell *c;
+ ListCell *s;
+
+ /* results (list of compatible clauses, attnums) */
+ List *rclauses = NIL;
+
+ foreach (c, clauses)
+ {
+ Node *clause = (Node*)lfirst(c);
+ Bitmapset *clause_attnums = NULL;
+
+ /*
+ * We do assume that thanks to previous checks, we should not run into
+ * clauses that are incompatible with multivariate stats here. We also
+ * need to collect the attnums for the clause.
+ *
+ * XXX Maybe turn this into an assert?
+ */
+ if (! clause_is_mv_compatible(clause, relid, &clause_attnums, type))
+ elog(ERROR, "should not get non-mv-compatible clause");
+
+ /* Is there a multivariate statistics covering the clause? */
+ foreach (s, stats)
+ {
+ int k, matches = 0;
+ MVStatisticInfo *stat = (MVStatisticInfo *)lfirst(s);
+
+ /* skip statistics not matching the required type */
+ if (! stats_type_matches(stat, type))
+ continue;
+
+ /*
+ * see if all clause attributes are covered by the statistic
+ *
+ * We'll do that in the opposite direction, i.e. we'll see how many
+ * attributes of the statistic are referenced in the clause, and then
+ * compare the counts.
+ */
+ for (k = 0; k < stat->stakeys->dim1; k++)
+ if (bms_is_member(stat->stakeys->values[k], clause_attnums))
+ matches += 1;
+
+ /*
+ * If the number of matches is equal to attributes referenced by the
+ * clause, then the clause is covered by the statistic.
+ */
+ if (bms_num_members(clause_attnums) == matches)
+ {
+ *attnums = bms_union(*attnums, clause_attnums);
+ rclauses = lappend(rclauses, clause);
+ break;
+ }
+ }
+
+ bms_free(clause_attnums);
+ }
+
+ /* we can't have more compatible conditions than source conditions */
+ Assert(list_length(clauses) >= list_length(rclauses));
+
+ return rclauses;
+}
+
+/*
+ * Remove statistics not covering any new clauses
+ *
+ * Statistics not covering any new clauses (conditions don't count) are not
+ * really useful, so let's ignore them. Also, we need the statistics to
+ * reference at least two different attributes (both in conditions and clauses
+ * combined), and at least one of them in the clauses alone.
+ *
+ * This check might be made more strict by checking against individual clauses,
+ * because by using the bitmapsets of all attnums we may actually use attnums
+ * from clauses that are not covered by the statistics. For example, we may
+ * have a condition
+ *
+ * (a=1 AND b=2)
+ *
+ * and a new clause
+ *
+ * (c=1 AND d=1)
+ *
+ * With only bitmapsets, statistics on [b,c] will pass through this (assuming
+ * there are some statistics covering both clauses).
+ *
+ * Parameters:
+ *
+ * stats - list of statistics to filter
+ * new_attnums - attnums referenced in new clauses
+ * all_attnums - attnums referenced by conditions and new clauses combined
+ *
+ * Returns filtered list of statistics.
+ *
+ * TODO Do the more strict check, i.e. walk through individual clauses and
+ * conditions and only use those covered by the statistics.
+ */
+static List *
+filter_stats(List *stats, Bitmapset *new_attnums, Bitmapset *all_attnums)
+{
+ ListCell *s;
+ List *stats_filtered = NIL;
+
+ foreach (s, stats)
+ {
+ int k;
+ int matches_new = 0,
+ matches_all = 0;
+
+ MVStatisticInfo *stat = (MVStatisticInfo *)lfirst(s);
+
+ /* see how many attributes the statistics covers */
+ for (k = 0; k < stat->stakeys->dim1; k++)
+ {
+ /* attributes from new clauses */
+ if (bms_is_member(stat->stakeys->values[k], new_attnums))
+ matches_new += 1;
+
+ /* attributes from conditions and new clauses combined */
+ if (bms_is_member(stat->stakeys->values[k], all_attnums))
+ matches_all += 1;
+ }
+
+ /* check we have enough attributes for this statistics */
+ if ((matches_new >= 1) && (matches_all >= 2))
+ stats_filtered = lappend(stats_filtered, stat);
+ }
+
+ /* we can't have more useful stats than we had originally */
+ Assert(list_length(stats) >= list_length(stats_filtered));
+
+ return stats_filtered;
+}
+
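+/*
+ * Convert a list of MVStatisticInfo elements into a plain array (which
+ * is easier to index during the optimization), setting *nmvstats to
+ * the number of elements.
+ */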
+static MVStatisticInfo *
+make_stats_array(List *stats, int *nmvstats)
+{
+ int i;
+ ListCell *l;
+
+ MVStatisticInfo *mvstats = NULL;
+ *nmvstats = list_length(stats);
+
+ mvstats
+ = (MVStatisticInfo*)palloc0((*nmvstats) * sizeof(MVStatisticInfo));
+
+ i = 0;
+ foreach (l, stats)
+ {
+ MVStatisticInfo *stat = (MVStatisticInfo *)lfirst(l);
+ memcpy(&mvstats[i++], stat, sizeof(MVStatisticInfo));
+ }
+
+ return mvstats;
+}
+
+static Bitmapset **
+make_stats_attnums(MVStatisticInfo *mvstats, int nmvstats)
+{
+ int i, j;
+ Bitmapset **stats_attnums = NULL;
+
+ Assert(nmvstats > 0);
- return result;
+ /* build bitmaps of attnums for the stats (easier to compare) */
+ stats_attnums = (Bitmapset **)palloc0(nmvstats * sizeof(Bitmapset*));
+
+ for (i = 0; i < nmvstats; i++)
+ for (j = 0; j < mvstats[i].stakeys->dim1; j++)
+ stats_attnums[i]
+ = bms_add_member(stats_attnums[i],
+ mvstats[i].stakeys->values[j]);
+
+ return stats_attnums;
}
+
/*
- * Collect attributes from mv-compatible clauses.
+ * Remove redundant statistics
+ *
+ * If there are multiple statistics covering the same set of columns (counting
+ * only those referenced by clauses and conditions), we only need to apply
+ * one of them, further reducing the size of the optimization problem.
+ *
+ * Thus when redundant stats are detected, we keep the smaller one (the one with
+ * fewer columns), based on the assumption that it's more accurate and also
+ * faster to process. That may be untrue for two reasons - first, the accuracy
+ * really depends on number of buckets/MCV items, not the number of columns.
+ * Second, some types of statistics may work better for certain types of clauses
+ * (e.g. MCV lists for equality conditions) etc.
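+ *
+ * For example, if the clauses only reference columns (a,b), then
+ * statistics on (a,b) and on (a,b,c) cover exactly the same attributes
+ * here, and we keep just the (a,b) one (the smaller of the two).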
*/
-static Bitmapset *
-collect_mv_attnums(List *clauses, Index relid, int types)
+static List*
+filter_redundant_stats(List *stats, List *clauses, List *conditions)
{
- Bitmapset *attnums = NULL;
- ListCell *l;
+ int i, j, nmvstats;
+
+ MVStatisticInfo *mvstats;
+ bool *redundant;
+ Bitmapset **stats_attnums;
+ Bitmapset *varattnos;
+ Index relid;
+
+ Assert(list_length(stats) > 0);
+ Assert(list_length(clauses) > 0);
/*
- * Walk through the clauses and identify the ones we can estimate using
- * multivariate stats, and remember the relid/columns. We'll then
- * cross-check if we have suitable stats, and only if needed we'll split
- * the clauses into multivariate and regular lists.
+ * We'll convert the list of statistics into an array now, because
+ * the reduction of redundant statistics is easier to do that way
+ * (we can mark previous stats as redundant, etc.).
+ */
+ mvstats = make_stats_array(stats, &nmvstats);
+ stats_attnums = make_stats_attnums(mvstats, nmvstats);
+
+ /* by default, none of the stats is redundant (so palloc0) */
+ redundant = palloc0(nmvstats * sizeof(bool));
+
+ /*
+ * We only expect a single relid here, and also we should get the
+ * same relid from clauses and conditions (but we get it from
+ * clauses, because those are certainly non-empty).
+ */
+ relid = bms_singleton_member(pull_varnos((Node*)clauses));
+
+ /*
+ * Get the varattnos from both conditions and clauses.
+ *
+ * This skips system attributes, although that should be impossible
+ * thanks to previous filtering out of incompatible clauses.
*
- * For now we're only interested in RestrictInfo nodes with nested OpExpr,
- * using either a range or equality.
+ * XXX Is that really true?
*/
- foreach (l, clauses)
+ varattnos = bms_union(get_varattnos((Node*)clauses, relid),
+ get_varattnos((Node*)conditions, relid));
+
+ for (i = 1; i < nmvstats; i++)
{
- Node *clause = (Node *) lfirst(l);
+ /* intersect with current statistics */
+ Bitmapset *curr = bms_intersect(stats_attnums[i], varattnos);
- /* ignore the result here - we only need the attnums */
- clause_is_mv_compatible(clause, relid, &attnums, types);
+ /* walk through 'previous' stats and check redundancy */
+ for (j = 0; j < i; j++)
+ {
+ /* intersect with current statistics */
+ Bitmapset *prev;
+
+ /* skip stats already identified as redundant */
+ if (redundant[j])
+ continue;
+
+ prev = bms_intersect(stats_attnums[j], varattnos);
+
+ switch (bms_subset_compare(curr, prev))
+ {
+ case BMS_EQUAL:
+ /*
+ * Use the smaller one (hopefully more accurate).
+ * If both have the same size, use the first one.
+ */
+ if (mvstats[i].stakeys->dim1 >= mvstats[j].stakeys->dim1)
+ redundant[i] = TRUE;
+ else
+ redundant[j] = TRUE;
+
+ break;
+
+ case BMS_SUBSET1: /* curr is subset of prev */
+ redundant[i] = TRUE;
+ break;
+
+ case BMS_SUBSET2: /* prev is subset of curr */
+ redundant[j] = TRUE;
+ break;
+
+ case BMS_DIFFERENT:
+ /* do nothing - keep both stats */
+ break;
+ }
+
+ bms_free(prev);
+ }
+
+ bms_free(curr);
}
- /*
- * If there are not at least two attributes referenced by the clause(s),
- * we can throw everything out (as we'll revert to simple stats).
- */
- if (bms_num_members(attnums) <= 1)
+ /* can't reduce all statistics (at least one has to remain) */
+ Assert(nmvstats > 0);
+
+ /* now, let's remove the reduced statistics from the arrays */
+ list_free(stats);
+ stats = NIL;
+
+ for (i = 0; i < nmvstats; i++)
{
- if (attnums != NULL)
- pfree(attnums);
- attnums = NULL;
+ MVStatisticInfo *info;
+
+ pfree(stats_attnums[i]);
+
+ if (redundant[i])
+ continue;
+
+ info = makeNode(MVStatisticInfo);
+ memcpy(info, &mvstats[i], sizeof(MVStatisticInfo));
+
+ stats = lappend(stats, info);
}
- return attnums;
+ pfree(mvstats);
+ pfree(stats_attnums);
+ pfree(redundant);
+
+ return stats;
}
-/*
- * Count the number of attributes in clauses compatible with multivariate stats.
- */
-static int
-count_mv_attnums(List *clauses, Index relid, int type)
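+/*
+ * Convert a list of clauses into a plain array of Node pointers,
+ * setting *nclauses to the number of elements.
+ */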
+static Node**
+make_clauses_array(List *clauses, int *nclauses)
{
- int c;
- Bitmapset *attnums = collect_mv_attnums(clauses, relid, type);
+ int i;
+ ListCell *l;
- c = bms_num_members(attnums);
+ Node** clauses_array;
- bms_free(attnums);
+ *nclauses = list_length(clauses);
+ clauses_array = (Node **)palloc0((*nclauses) * sizeof(Node *));
- return c;
+ i = 0;
+ foreach (l, clauses)
+ clauses_array[i++] = (Node *)lfirst(l);
+
+ *nclauses = i;
+
+ return clauses_array;
}
-/*
- * Count varnos referenced in the clauses, and if there's a single varno then
- * return the index in 'relid'.
- */
-static int
-count_varnos(List *clauses, Index *relid)
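+/*
+ * Build a bitmapset of attnums for each clause. Incompatible clauses
+ * were already filtered out, so hitting one here is an error.
+ */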
+static Bitmapset **
+make_clauses_attnums(PlannerInfo *root, Index relid,
+ int type, Node **clauses, int nclauses)
{
- int cnt;
- Bitmapset *varnos = NULL;
+ int i;
+ Bitmapset **clauses_attnums
+ = (Bitmapset **)palloc0(nclauses * sizeof(Bitmapset *));
- varnos = pull_varnos((Node *) clauses);
- cnt = bms_num_members(varnos);
+ for (i = 0; i < nclauses; i++)
+ {
+ Bitmapset * attnums = NULL;
- /* if there's a single varno in the clauses, remember it */
- if (bms_num_members(varnos) == 1)
- *relid = bms_singleton_member(varnos);
+ if (! clause_is_mv_compatible(clauses[i], relid, &attnums, type))
+ elog(ERROR, "should not get non-mv-compatible clause");
- bms_free(varnos);
+ clauses_attnums[i] = attnums;
+ }
- return cnt;
+ return clauses_attnums;
}
-
+
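+/*
+ * Build a (nmvstats x nclauses) map of flags, where the (i,j) element
+ * says whether clause 'j' references only attributes covered by
+ * statistics 'i' (i.e. whether the statistics can estimate the clause).
+ */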
+static bool*
+make_cover_map(Bitmapset **stats_attnums, int nmvstats,
+ Bitmapset **clauses_attnums, int nclauses)
+{
+ int i, j;
+ bool *cover_map = (bool*)palloc0(nclauses * nmvstats);
+
+ for (i = 0; i < nmvstats; i++)
+ for (j = 0; j < nclauses; j++)
+ cover_map[i * nclauses + j]
+ = bms_is_subset(clauses_attnums[j], stats_attnums[i]);
+
+ return cover_map;
+}
+
/*
- * We're looking for statistics matching at least 2 attributes, referenced in
- * clauses compatible with multivariate statistics. The current selection
- * criteria is very simple - we choose the statistics referencing the most
- * attributes.
- *
- * If there are multiple statistics referencing the same number of columns
- * (from the clauses), the one with less source columns (as listed in the
- * ADD STATISTICS when creating the statistics) wins. Else the first one wins.
- *
- * This is a very simple criteria, and has several weaknesses:
- *
- * (a) does not consider the accuracy of the statistics
- *
- * If there are two histograms built on the same set of columns, but one
- * has 100 buckets and the other one has 1000 buckets (thus likely
- * providing better estimates), this is not currently considered.
- *
- * (b) does not consider the type of statistics
- *
- * If there are three statistics - one containing just a MCV list, another
- * one with just a histogram and a third one with both, we treat them equally.
+ * Chooses the combination of statistics optimal for estimating a particular
+ * clause list.
*
- * (c) does not consider the number of clauses
+ * This only handles a 'preparation' shared by the exhaustive and greedy
+ * implementations (see the previous methods), mostly trying to reduce the size
+ * of the problem (eliminate clauses/statistics that can't really be used in
+ * the solution).
*
- * As explained, only the number of referenced attributes counts, so if
- * there are multiple clauses on a single attribute, this still counts as
- * a single attribute.
+ * It also precomputes bitmaps for attributes covered by clauses and statistics,
+ * so that we don't need to do that over and over in the actual optimizations
+ * (as it's both CPU and memory intensive).
*
- * (d) does not consider type of condition
*
- * Some clauses may work better with some statistics - for example equality
- * clauses probably work better with MCV lists than with histograms. But
- * IS [NOT] NULL conditions may often work better with histograms (thanks
- * to NULL-buckets).
+ * TODO Another way to make the optimization problems smaller might be splitting
+ * the statistics into several disjoint subsets, i.e. if we can split the
+ * graph of statistics (after the elimination) into multiple components
+ * (so that stats in different components share no attributes), we can do
+ * the optimization for each component separately.
*
- * So for example with five WHERE conditions
- *
- * WHERE (a = 1) AND (b = 1) AND (c = 1) AND (d = 1) AND (e = 1)
- *
- * and statistics on (a,b), (a,b,e) and (a,b,c,d), the last one will be selected
- * as it references the most columns.
- *
- * Once we have selected the multivariate statistics, we split the list of
- * clauses into two parts - conditions that are compatible with the selected
- * stats, and conditions are estimated using simple statistics.
- *
- * From the example above, conditions
- *
- * (a = 1) AND (b = 1) AND (c = 1) AND (d = 1)
- *
- * will be estimated using the multivariate statistics (a,b,c,d) while the last
- * condition (e = 1) will get estimated using the regular ones.
- *
- * There are various alternative selection criteria (e.g. counting conditions
- * instead of just referenced attributes), but eventually the best option should
- * be to combine multiple statistics. But that's much harder to do correctly.
- *
- * TODO Select multiple statistics and combine them when computing the estimate.
- *
- * TODO This will probably have to consider compatibility of clauses, because
- * 'dependencies' will probably work only with equality clauses.
+ * TODO If we could compute what is a "perfect solution" maybe we could
+ * terminate the search after reaching ~90% of it? Say, if we knew that we
+ * can cover 10 clauses and reuse 8 dependencies, maybe covering 9 clauses
+ * and 7 dependencies would be OK?
*/
-static MVStatisticInfo *
-choose_mv_statistics(List *stats, Bitmapset *attnums)
+static List*
+choose_mv_statistics(PlannerInfo *root, Index relid, List *stats,
+ List *clauses, List *conditions)
{
int i;
- ListCell *lc;
+ mv_solution_t *best = NULL;
+ List *result = NIL;
+
+ int nmvstats;
+ MVStatisticInfo *mvstats;
+
+ /* we only work with MCV lists and histograms here */
+ int type = (MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST);
+
+ bool *clause_cover_map = NULL,
+ *condition_cover_map = NULL;
+ int *ruled_out = NULL;
+
+ /* build bitmapsets for all stats and clauses */
+ Bitmapset **stats_attnums;
+ Bitmapset **clauses_attnums;
+ Bitmapset **conditions_attnums;
- MVStatisticInfo *choice = NULL;
+ int nclauses, nconditions;
+ Node ** clauses_array;
+ Node ** conditions_array;
- int current_matches = 1; /* goal #1: maximize */
- int current_dims = (MVSTATS_MAX_DIMENSIONS+1); /* goal #2: minimize */
+ /* copy lists, so that we can free them during elimination easily */
+ clauses = list_copy(clauses);
+ conditions = list_copy(conditions);
+ stats = list_copy(stats);
/*
- * Walk through the statistics (simple array with nmvstats elements) and for
- * each one count the referenced attributes (encoded in the 'attnums' bitmap).
+ * Reduce the optimization problem size as much as possible.
+ *
+ * Eliminate clauses and conditions not covered by any statistics,
+ * or statistics not matching at least two attributes (one of them
+ * has to be in a regular clause).
+ *
+ * It's possible that removing a statistics in one iteration
+ * eliminates a clause in the next one, so we'll repeat this until
+ * an iteration eliminates no clauses/stats at all.
+ *
+ * This can only happen after eliminating a statistics - clauses are
+ * eliminated first, so statistics always reflect that.
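+ *
+ * For example (hypothetical case), eliminating a statistics on (a,b)
+ * may leave a clause on (a,b) covered by no remaining statistics, so
+ * the clause gets eliminated in the next iteration, which in turn may
+ * make yet another statistics useless.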
*/
- foreach (lc, stats)
+ while (true)
{
- MVStatisticInfo *info = (MVStatisticInfo *)lfirst(lc);
-
- /* columns matching this statistics */
- int matches = 0;
+ List *tmp;
- int2vector * attrs = info->stakeys;
- int numattrs = attrs->dim1;
+ Bitmapset *compatible_attnums = NULL;
+ Bitmapset *condition_attnums = NULL;
+ Bitmapset *all_attnums = NULL;
- /* skip dependencies-only stats */
- if (! (info->mcv_built || info->hist_built))
- continue;
+ /*
+ * Clauses
+ *
+ * Walk through clauses and keep only those covered by at least
+ * one of the statistics we still have. We'll also keep info
+ * about attnums in clauses (without conditions) so that we can
+ * ignore stats covering just conditions (which is pointless).
+ */
+ tmp = filter_clauses(root, relid, type,
+ stats, clauses, &compatible_attnums);
- /* count columns covered by the histogram */
- for (i = 0; i < numattrs; i++)
- if (bms_is_member(attrs->values[i], attnums))
- matches++;
+ /* discard the original list */
+ list_free(clauses);
+ clauses = tmp;
/*
- * Use this statistics when it improves the number of matches or
- * when it matches the same number of attributes but is smaller.
+ * Conditions
+ *
+ * Walk through clauses and keep only those covered by at least
+ * one of the statistics we still have. Also, collect bitmap of
+ * attributes so that we can make sure we add at least one new
+ * attribute (by comparing with clauses).
*/
- if ((matches > current_matches) ||
- ((matches == current_matches) && (current_dims > numattrs)))
+ if (conditions != NIL)
{
- choice = info;
- current_matches = matches;
- current_dims = numattrs;
+ tmp = filter_clauses(root, relid, type,
+ stats, conditions, &condition_attnums);
+
+ /* discard the original list */
+ list_free(conditions);
+ conditions = tmp;
}
- }
- return choice;
-}
+ /* get a union of attnums (from conditions and new clauses) */
+ all_attnums = bms_union(compatible_attnums, condition_attnums);
+
+ /*
+ * Statistics
+ *
+ * Walk through statistics and only keep those covering at least
+ * one new attribute (excluding conditions) and at least two attributes
+ * in clauses and conditions combined.
+ */
+ tmp = filter_stats(stats, compatible_attnums, all_attnums);
+ /* if we've not eliminated anything, terminate */
+ if (list_length(stats) == list_length(tmp))
+ break;
-/*
- * This splits the clauses list into two parts - one containing clauses that
- * will be evaluated using the chosen statistics, and the remaining clauses
- * (either non-mvcompatible, or not related to the histogram).
- */
-static List *
-clauselist_mv_split(PlannerInfo *root, Index relid,
- List *clauses, List **mvclauses,
- MVStatisticInfo *mvstats, int types)
-{
- int i;
- ListCell *l;
- List *non_mvclauses = NIL;
+ /* work only with filtered statistics from now */
+ list_free(stats);
+ stats = tmp;
+ }
- /* FIXME is there a better way to get info on int2vector? */
- int2vector * attrs = mvstats->stakeys;
- int numattrs = mvstats->stakeys->dim1;
+ /* only do the optimization if we have clauses/statistics */
+ if ((list_length(stats) == 0) || (list_length(clauses) == 0))
+ return NULL;
- Bitmapset *mvattnums = NULL;
+ /* remove redundant stats (stats covered by another stats) */
+ stats = filter_redundant_stats(stats, clauses, conditions);
- /* build bitmap of attributes, so we can do bms_is_subset later */
- for (i = 0; i < numattrs; i++)
- mvattnums = bms_add_member(mvattnums, attrs->values[i]);
+ /*
+ * TODO We should sort the stats to make the order deterministic,
+ * otherwise we may get different estimates on different
+ * executions - if there are multiple "equally good" solutions,
+ * we'll keep the first solution we see.
+ *
+ * Sorting by OID probably is not the right solution though,
+ * because we'd like it to be somehow reproducible,
+ * irrespective of the order of ADD STATISTICS commands.
+ * So maybe statkeys?
+ */
+ mvstats = make_stats_array(stats, &nmvstats);
+ stats_attnums = make_stats_attnums(mvstats, nmvstats);
- /* erase the list of mv-compatible clauses */
- *mvclauses = NIL;
+ /* collect clauses and a bitmap of attnums for each */
+ clauses_array = make_clauses_array(clauses, &nclauses);
+ clauses_attnums = make_clauses_attnums(root, relid, type,
+ clauses_array, nclauses);
- foreach (l, clauses)
- {
- bool match = false; /* by default not mv-compatible */
- Bitmapset *attnums = NULL;
- Node *clause = (Node *) lfirst(l);
+ /* collect conditions and bitmap of attnums */
+ conditions_array = make_clauses_array(conditions, &nconditions);
+ conditions_attnums = make_clauses_attnums(root, relid, type,
+ conditions_array, nconditions);
- if (clause_is_mv_compatible(clause, relid, &attnums, types))
+ /*
+ * Build bitmaps with info about which clauses/conditions are
+ * covered by each statistics (so that we don't need to call
+ * bms_is_subset over and over again).
+ */
+ clause_cover_map = make_cover_map(stats_attnums, nmvstats,
+ clauses_attnums, nclauses);
+
+ condition_cover_map = make_cover_map(stats_attnums, nmvstats,
+ conditions_attnums, nconditions);
+
+ ruled_out = (int*)palloc0(nmvstats * sizeof(int));
+
+ /* no stats are ruled out by default */
+ for (i = 0; i < nmvstats; i++)
+ ruled_out[i] = -1;
+
+ /* do the optimization itself */
+ if (mvstat_search_type == MVSTAT_SEARCH_EXHAUSTIVE)
+ choose_mv_statistics_exhaustive(root, 0,
+ nmvstats, mvstats, stats_attnums,
+ nclauses, clauses_array, clauses_attnums,
+ nconditions, conditions_array, conditions_attnums,
+ clause_cover_map, condition_cover_map,
+ ruled_out, NULL, &best);
+ else
+ choose_mv_statistics_greedy(root, 0,
+ nmvstats, mvstats, stats_attnums,
+ nclauses, clauses_array, clauses_attnums,
+ nconditions, conditions_array, conditions_attnums,
+ clause_cover_map, condition_cover_map,
+ ruled_out, NULL, &best);
+
+ /* create a list of statistics from the array */
+ if (best != NULL)
+ {
+ for (i = 0; i < best->nstats; i++)
{
- /* are all the attributes part of the selected stats? */
- if (bms_is_subset(attnums, mvattnums))
- match = true;
+ MVStatisticInfo *info = makeNode(MVStatisticInfo);
+ memcpy(info, &mvstats[best->stats[i]], sizeof(MVStatisticInfo));
+ result = lappend(result, info);
}
- /*
- * The clause matches the selected stats, so put it to the list of
- * mv-compatible clauses. Otherwise, keep it in the list of 'regular'
- * clauses (that may be selected later).
- */
- if (match)
- *mvclauses = lappend(*mvclauses, clause);
- else
- non_mvclauses = lappend(non_mvclauses, clause);
+ pfree(best);
}
- /*
- * Perform regular estimation using the clauses incompatible with the chosen
- * histogram (or MV stats in general).
- */
- return non_mvclauses;
+ /* cleanup (maybe leave it up to the memory context?) */
+ for (i = 0; i < nmvstats; i++)
+ bms_free(stats_attnums[i]);
+
+ for (i = 0; i < nclauses; i++)
+ bms_free(clauses_attnums[i]);
+
+ for (i = 0; i < nconditions; i++)
+ bms_free(conditions_attnums[i]);
+
+ pfree(stats_attnums);
+ pfree(clauses_attnums);
+ pfree(conditions_attnums);
+ pfree(clauses_array);
+ pfree(conditions_array);
+ pfree(clause_cover_map);
+ pfree(condition_cover_map);
+ pfree(ruled_out);
+ pfree(mvstats);
+
+ list_free(clauses);
+ list_free(conditions);
+ list_free(stats);
+
+ return result;
}
typedef struct
@@ -1474,6 +2686,7 @@ clause_is_mv_compatible(Node *clause, Index relid, Bitmapset **attnums, int type
return true;
}
+
/*
* collect attnums from functional dependencies
*
@@ -2022,6 +3235,24 @@ clauselist_apply_dependencies(PlannerInfo *root, List *clauses,
- * Check that there are stats with at least one of the requested types.
+ * Check whether the statistics matches at least one of the requested types.
*/
static bool
+stats_type_matches(MVStatisticInfo *stat, int type)
+{
+ if ((type & MV_CLAUSE_TYPE_FDEP) && stat->deps_built)
+ return true;
+
+ if ((type & MV_CLAUSE_TYPE_MCV) && stat->mcv_built)
+ return true;
+
+ if ((type & MV_CLAUSE_TYPE_HIST) && stat->hist_built)
+ return true;
+
+ return false;
+}
+
+/*
+ * Check that there are stats with at least one of the requested types.
+ */
+static bool
has_stats(List *stats, int type)
{
ListCell *s;
@@ -2030,13 +3261,8 @@ has_stats(List *stats, int type)
{
MVStatisticInfo *stat = (MVStatisticInfo *)lfirst(s);
- if ((type & MV_CLAUSE_TYPE_FDEP) && stat->deps_built)
- return true;
-
- if ((type & MV_CLAUSE_TYPE_MCV) && stat->mcv_built)
- return true;
-
- if ((type & MV_CLAUSE_TYPE_HIST) && stat->hist_built)
+ /* terminate if we've found at least one matching statistics */
+ if (stats_type_matches(stat, type))
return true;
}
@@ -2087,22 +3313,26 @@ find_stats(PlannerInfo *root, Index relid)
* as the clauses are processed (and skip items that are 'match').
*/
static Selectivity
-clauselist_mv_selectivity_mcvlist(PlannerInfo *root, List *clauses,
- MVStatisticInfo *mvstats, bool *fullmatch,
- Selectivity *lowsel)
+clauselist_mv_selectivity_mcvlist(PlannerInfo *root, MVStatisticInfo *mvstats,
+ List *clauses, List *conditions, bool is_or,
+ bool *fullmatch, Selectivity *lowsel)
{
int i;
Selectivity s = 0.0;
+ Selectivity t = 0.0;
Selectivity u = 0.0;
MCVList mcvlist = NULL;
+
int nmatches = 0;
+ int nconditions = 0;
/* match/mismatch bitmap for each MCV item */
char * matches = NULL;
+ char * condition_matches = NULL;
Assert(clauses != NIL);
- Assert(list_length(clauses) >= 2);
+ Assert(list_length(clauses) >= 1);
/* there's no MCV list built yet */
if (! mvstats->mcv_built)
@@ -2113,32 +3343,85 @@ clauselist_mv_selectivity_mcvlist(PlannerInfo *root, List *clauses,
Assert(mcvlist != NULL);
Assert(mcvlist->nitems > 0);
- /* by default all the MCV items match the clauses fully */
- matches = palloc0(sizeof(char) * mcvlist->nitems);
- memset(matches, MVSTATS_MATCH_FULL, sizeof(char)*mcvlist->nitems);
-
/* number of matching MCV items */
nmatches = mcvlist->nitems;
+ nconditions = mcvlist->nitems;
+
+ /*
+ * Bitmap of MCV item matches (mismatch, partial, full).
+ *
+ * For AND clauses all items match initially (and we'll eliminate them).
+ * For OR clauses no items match initially (and we'll add them).
+ *
+ * We only need to do the memset for AND clauses (for OR clauses
+ * it's already set correctly by the palloc0).
+ */
+ matches = palloc0(sizeof(char) * nmatches);
+
+ if (! is_or) /* AND-clause */
+ memset(matches, MVSTATS_MATCH_FULL, sizeof(char)*nmatches);
+ /* Conditions are treated as an AND clause, so all items match by default. */
+ condition_matches = palloc0(sizeof(char) * nconditions);
+ memset(condition_matches, MVSTATS_MATCH_FULL, sizeof(char)*nconditions);
+
+ /*
+ * build the match bitmap for the conditions (conditions are always
+ * connected by AND)
+ */
+ if (conditions != NIL)
+ nconditions = update_match_bitmap_mcvlist(root, conditions,
+ mvstats->stakeys, mcvlist,
+ nconditions, condition_matches,
+ lowsel, fullmatch, false);
+
+ /*
+ * build the match bitmap for the estimated clauses
+ *
+ * TODO This evaluates the clauses for all MCV items, even those
+ * ruled out by the conditions. The final result should be the
+ * same, but skipping the ruled-out items might be faster.
+ */
nmatches = update_match_bitmap_mcvlist(root, clauses,
mvstats->stakeys, mcvlist,
- nmatches, matches,
- lowsel, fullmatch, false);
+ ((is_or) ? 0 : nmatches), matches,
+ lowsel, fullmatch, is_or);
/* sum frequencies for all the matching MCV items */
for (i = 0; i < mcvlist->nitems; i++)
{
- /* used to 'scale' for MCV lists not covering all tuples */
+ /*
+ * Find out what part of the data is covered by the MCV list,
+ * so that we can 'scale' the selectivity properly (e.g. when
+ * only 50% of the sample items got into the MCV, and the rest
+ * is either in a histogram, or not covered by stats).
+ *
+ * TODO This might be handled by keeping a global "frequency"
+ * for the whole list, which might save us a bit of time
+ * spent on accessing the not-matching part of the MCV list.
+ * Although it's likely in a cache, so it's very fast.
+ */
u += mcvlist->items[i]->frequency;
+ /* skip MCV items not matching the conditions */
+ if (condition_matches[i] == MVSTATS_MATCH_NONE)
+ continue;
+
if (matches[i] != MVSTATS_MATCH_NONE)
s += mcvlist->items[i]->frequency;
+
+ t += mcvlist->items[i]->frequency;
}
pfree(matches);
+ pfree(condition_matches);
pfree(mcvlist);
- return s*u;
+ /* no condition matches */
+ if (t == 0.0)
+ return (Selectivity)0.0;
+
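+ /*
+ * (s / t) is the selectivity of the estimated clauses conditional on
+ * the conditions, computed only from the MCV items, while u scales
+ * the result to the fraction of the data the MCV list represents.
+ */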
+ return (s / t) * u;
}
/*
@@ -2369,64 +3652,57 @@ update_match_bitmap_mcvlist(PlannerInfo *root, List *clauses,
}
}
}
- else if (or_clause(clause) || and_clause(clause))
+ else if (or_clause(clause) || and_clause(clause) || not_clause(clause))
{
/* AND/OR clause, with all clauses compatible with the selected MV stat */
int i;
- BoolExpr *orclause = ((BoolExpr*)clause);
- List *orclauses = orclause->args;
+ List *tmp_clauses = ((BoolExpr*)clause)->args;
/* match/mismatch bitmap for each MCV item */
- int or_nmatches = 0;
- char * or_matches = NULL;
+ int tmp_nmatches = 0;
+ char * tmp_matches = NULL;
- Assert(orclauses != NIL);
- Assert(list_length(orclauses) >= 2);
+ Assert(tmp_clauses != NIL);
+ Assert((list_length(tmp_clauses) >= 2) || (not_clause(clause) && (list_length(tmp_clauses)==1)));
/* number of matching MCV items */
- or_nmatches = mcvlist->nitems;
+ tmp_nmatches = (or_clause(clause)) ? 0 : mcvlist->nitems;
/* by default none of the MCV items matches the clauses */
- or_matches = palloc0(sizeof(char) * or_nmatches);
+ tmp_matches = palloc0(sizeof(char) * mcvlist->nitems);
- if (or_clause(clause))
- {
- /* OR clauses assume nothing matches, initially */
- memset(or_matches, MVSTATS_MATCH_NONE, sizeof(char)*or_nmatches);
- or_nmatches = 0;
- }
- else
- {
- /* AND clauses assume nothing matches, initially */
- memset(or_matches, MVSTATS_MATCH_FULL, sizeof(char)*or_nmatches);
- }
+ /* AND (and NOT) clauses assume everything matches, initially */
+ if (! or_clause(clause))
+ memset(tmp_matches, MVSTATS_MATCH_FULL, sizeof(char)*mcvlist->nitems);
/* build the match bitmap for the OR-clauses */
- or_nmatches = update_match_bitmap_mcvlist(root, orclauses,
+ tmp_nmatches = update_match_bitmap_mcvlist(root, tmp_clauses,
stakeys, mcvlist,
- or_nmatches, or_matches,
+ tmp_nmatches, tmp_matches,
lowsel, fullmatch, or_clause(clause));
/* merge the bitmap into the existing one*/
for (i = 0; i < mcvlist->nitems; i++)
{
+ /* if this is a NOT clause, we need to invert the results first */
+ if (not_clause(clause))
+ tmp_matches[i] = (MVSTATS_MATCH_FULL - tmp_matches[i]);
+
/*
* To AND-merge the bitmaps, a MIN() semantics is used.
* For OR-merge, use MAX().
*
* FIXME this does not decrease the number of matches
*/
- UPDATE_RESULT(matches[i], or_matches[i], is_or);
+ UPDATE_RESULT(matches[i], tmp_matches[i], is_or);
}
- pfree(or_matches);
+ pfree(tmp_matches);
}
else
- {
elog(ERROR, "unknown clause type: %d", clause->type);
- }
}
/*
@@ -2484,15 +3760,18 @@ update_match_bitmap_mcvlist(PlannerInfo *root, List *clauses,
* this is not uncommon, but for histograms it's not that clear.
*/
static Selectivity
-clauselist_mv_selectivity_histogram(PlannerInfo *root, List *clauses,
- MVStatisticInfo *mvstats)
+clauselist_mv_selectivity_histogram(PlannerInfo *root, MVStatisticInfo *mvstats,
+ List *clauses, List *conditions, bool is_or)
{
int i;
Selectivity s = 0.0;
+ Selectivity t = 0.0;
Selectivity u = 0.0;
int nmatches = 0;
+ int nconditions = 0;
char *matches = NULL;
+ char *condition_matches = NULL;
MVSerializedHistogram mvhist = NULL;
@@ -2505,25 +3784,55 @@ clauselist_mv_selectivity_histogram(PlannerInfo *root, List *clauses,
Assert (mvhist != NULL);
Assert (clauses != NIL);
- Assert (list_length(clauses) >= 2);
+ Assert (list_length(clauses) >= 1);
+
+ nmatches = mvhist->nbuckets;
+ nconditions = mvhist->nbuckets;
/*
- * Bitmap of bucket matches (mismatch, partial, full). by default
- * all buckets fully match (and we'll eliminate them).
+ * Bitmap of bucket matches (mismatch, partial, full).
+ *
+ * For AND clauses all buckets match (and we'll eliminate them).
+ * For OR clauses no buckets match (and we'll add them).
+ *
+ * We only need to do the memset for AND clauses (for OR clauses
+ * it's already set correctly by the palloc0).
*/
- matches = palloc0(sizeof(char) * mvhist->nbuckets);
- memset(matches, MVSTATS_MATCH_FULL, sizeof(char)*mvhist->nbuckets);
+ matches = palloc0(sizeof(char) * nmatches);
- nmatches = mvhist->nbuckets;
+ if (! is_or) /* AND-clause */
+ memset(matches, MVSTATS_MATCH_FULL, sizeof(char)*nmatches);
+
+ /* Conditions are treated as an AND clause, so all buckets match by default. */
+ condition_matches = palloc0(sizeof(char)*nconditions);
+ memset(condition_matches, MVSTATS_MATCH_FULL, sizeof(char)*nconditions);
+
+ /*
+ * build the match bitmap for the conditions (conditions are always
+ * connected by AND)
+ */
+ if (conditions != NIL)
+ update_match_bitmap_histogram(root, conditions,
+ mvstats->stakeys, mvhist,
+ nconditions, condition_matches, false);
- /* build the match bitmap */
+ /*
+ * build the match bitmap for the estimated clauses
+ *
+ * TODO This evaluates the clauses for all buckets, even those
+ * ruled out by the conditions. The final result should be
+ * the same, but skipping the ruled-out buckets might be faster.
+ */
update_match_bitmap_histogram(root, clauses,
mvstats->stakeys, mvhist,
- nmatches, matches, false);
+ ((is_or) ? 0 : nmatches), matches,
+ is_or);
/* now, walk through the buckets and sum the selectivities */
for (i = 0; i < mvhist->nbuckets; i++)
{
+ float coeff = 1.0;
+
/*
* Find out what part of the data is covered by the histogram,
* so that we can 'scale' the selectivity properly (e.g. when
@@ -2537,10 +3846,23 @@ clauselist_mv_selectivity_histogram(PlannerInfo *root, List *clauses,
*/
u += mvhist->buckets[i]->ntuples;
+ /* skip buckets not matching the conditions */
+ if (condition_matches[i] == MVSTATS_MATCH_NONE)
+ continue;
+ else if (condition_matches[i] == MVSTATS_MATCH_PARTIAL)
+ coeff = 0.5;
+
+ t += coeff * mvhist->buckets[i]->ntuples;
+
if (matches[i] == MVSTATS_MATCH_FULL)
- s += mvhist->buckets[i]->ntuples;
+ s += coeff * mvhist->buckets[i]->ntuples;
else if (matches[i] == MVSTATS_MATCH_PARTIAL)
- s += 0.5 * mvhist->buckets[i]->ntuples;
+ /*
+ * TODO If both conditions and clauses match partially, this
+ * will use a 0.25 match - not sure if that's the right
+ * solution, but it seems about right.
+ */
+ s += coeff * 0.5 * mvhist->buckets[i]->ntuples;
}
#ifdef DEBUG_MVHIST
@@ -2549,9 +3871,14 @@ clauselist_mv_selectivity_histogram(PlannerInfo *root, List *clauses,
/* release the allocated bitmap and deserialized histogram */
pfree(matches);
+ pfree(condition_matches);
pfree(mvhist);
- return s * u;
+ /* no condition matches */
+ if (t == 0.0)
+ return (Selectivity)0.0;
+
+ return (s / t) * u;
}
/* cached result of bucket boundary comparison for a single dimension */
@@ -2699,7 +4026,7 @@ update_match_bitmap_histogram(PlannerInfo *root, List *clauses,
{
int i;
ListCell * l;
-
+
/*
* Used for caching function calls, only once per deduplicated value.
*
@@ -2742,7 +4069,7 @@ update_match_bitmap_histogram(PlannerInfo *root, List *clauses,
FmgrInfo opproc; /* operator */
fmgr_info(get_opcode(expr->opno), &opproc);
-
+
/* reset the cache (per clause) */
memset(callcache, 0, mvhist->nbuckets);
@@ -2902,64 +4229,57 @@ update_match_bitmap_histogram(PlannerInfo *root, List *clauses,
UPDATE_RESULT(matches[i], MVSTATS_MATCH_NONE, is_or);
}
}
- else if (or_clause(clause) || and_clause(clause))
+ else if (or_clause(clause) || and_clause(clause) || not_clause(clause))
{
/* AND/OR clause, with all clauses compatible with the selected MV stat */
int i;
- BoolExpr *orclause = ((BoolExpr*)clause);
- List *orclauses = orclause->args;
+ List *tmp_clauses = ((BoolExpr*)clause)->args;
/* match/mismatch bitmap for each bucket */
- int or_nmatches = 0;
- char * or_matches = NULL;
+ int tmp_nmatches = 0;
+ char * tmp_matches = NULL;
- Assert(orclauses != NIL);
- Assert(list_length(orclauses) >= 2);
+ Assert(tmp_clauses != NIL);
+ Assert((list_length(tmp_clauses) >= 2) || (not_clause(clause) && (list_length(tmp_clauses)==1)));
/* number of matching buckets */
- or_nmatches = mvhist->nbuckets;
+ tmp_nmatches = (or_clause(clause)) ? 0 : mvhist->nbuckets;
- /* by default none of the buckets matches the clauses */
- or_matches = palloc0(sizeof(char) * or_nmatches);
+ /* by default none of the buckets matches the clauses (OR clause) */
+ tmp_matches = palloc0(sizeof(char) * mvhist->nbuckets);
- if (or_clause(clause))
- {
- /* OR clauses assume nothing matches, initially */
- memset(or_matches, MVSTATS_MATCH_NONE, sizeof(char)*or_nmatches);
- or_nmatches = 0;
- }
- else
- {
- /* AND clauses assume nothing matches, initially */
- memset(or_matches, MVSTATS_MATCH_FULL, sizeof(char)*or_nmatches);
- }
+ /* but AND (and NOT) clauses assume everything matches, initially */
+ if (! or_clause(clause))
+ memset(tmp_matches, MVSTATS_MATCH_FULL, sizeof(char)*mvhist->nbuckets);
/* build the match bitmap for the OR-clauses */
- or_nmatches = update_match_bitmap_histogram(root, orclauses,
+ tmp_nmatches = update_match_bitmap_histogram(root, tmp_clauses,
stakeys, mvhist,
- or_nmatches, or_matches, or_clause(clause));
+ tmp_nmatches, tmp_matches, or_clause(clause));
/* merge the bitmap into the existing one*/
for (i = 0; i < mvhist->nbuckets; i++)
{
+ /* if this is a NOT clause, we need to invert the results first */
+ if (not_clause(clause))
+ tmp_matches[i] = (MVSTATS_MATCH_FULL - tmp_matches[i]);
+
/*
* To AND-merge the bitmaps, a MIN() semantics is used.
* For OR-merge, use MAX().
*
* FIXME this does not decrease the number of matches
*/
- UPDATE_RESULT(matches[i], or_matches[i], is_or);
+ UPDATE_RESULT(matches[i], tmp_matches[i], is_or);
}
- pfree(or_matches);
-
+ pfree(tmp_matches);
}
else
elog(ERROR, "unknown clause type: %d", clause->type);
}
- /* free the call cache */
pfree(callcache);
return nmatches;
diff --git a/src/backend/optimizer/path/costsize.c b/src/backend/optimizer/path/costsize.c
index 5fc2f9c..7384cb8 100644
--- a/src/backend/optimizer/path/costsize.c
+++ b/src/backend/optimizer/path/costsize.c
@@ -3520,7 +3520,8 @@ compute_semi_anti_join_factors(PlannerInfo *root,
joinquals,
0,
jointype,
- sjinfo);
+ sjinfo,
+ NIL);
/*
* Also get the normal inner-join selectivity of the join clauses.
@@ -3543,7 +3544,8 @@ compute_semi_anti_join_factors(PlannerInfo *root,
joinquals,
0,
JOIN_INNER,
- &norm_sjinfo);
+ &norm_sjinfo,
+ NIL);
/* Avoid leaking a lot of ListCells */
if (jointype == JOIN_ANTI)
@@ -3710,7 +3712,7 @@ approx_tuple_count(PlannerInfo *root, JoinPath *path, List *quals)
Node *qual = (Node *) lfirst(l);
/* Note that clause_selectivity will be able to cache its result */
- selec *= clause_selectivity(root, qual, 0, JOIN_INNER, &sjinfo);
+ selec *= clause_selectivity(root, qual, 0, JOIN_INNER, &sjinfo, NIL);
}
/* Apply it to the input relation sizes */
@@ -3746,7 +3748,8 @@ set_baserel_size_estimates(PlannerInfo *root, RelOptInfo *rel)
rel->baserestrictinfo,
0,
JOIN_INNER,
- NULL);
+ NULL,
+ NIL);
rel->rows = clamp_row_est(nrows);
@@ -3783,7 +3786,8 @@ get_parameterized_baserel_size(PlannerInfo *root, RelOptInfo *rel,
allclauses,
rel->relid, /* do not use 0! */
JOIN_INNER,
- NULL);
+ NULL,
+ NIL);
nrows = clamp_row_est(nrows);
/* For safety, make sure result is not more than the base estimate */
if (nrows > rel->rows)
@@ -3921,12 +3925,14 @@ calc_joinrel_size_estimate(PlannerInfo *root,
joinquals,
0,
jointype,
- sjinfo);
+ sjinfo,
+ NIL);
pselec = clauselist_selectivity(root,
pushedquals,
0,
jointype,
- sjinfo);
+ sjinfo,
+ NIL);
/* Avoid leaking a lot of ListCells */
list_free(joinquals);
@@ -3938,7 +3944,8 @@ calc_joinrel_size_estimate(PlannerInfo *root,
restrictlist,
0,
jointype,
- sjinfo);
+ sjinfo,
+ NIL);
pselec = 0.0; /* not used, keep compiler quiet */
}
diff --git a/src/backend/optimizer/util/orclauses.c b/src/backend/optimizer/util/orclauses.c
index ea831f5..6299e75 100644
--- a/src/backend/optimizer/util/orclauses.c
+++ b/src/backend/optimizer/util/orclauses.c
@@ -280,7 +280,7 @@ consider_new_or_clause(PlannerInfo *root, RelOptInfo *rel,
* saving work later.)
*/
or_selec = clause_selectivity(root, (Node *) or_rinfo,
- 0, JOIN_INNER, NULL);
+ 0, JOIN_INNER, NULL, NIL);
/*
* The clause is only worth adding to the query if it rejects a useful
@@ -342,7 +342,7 @@ consider_new_or_clause(PlannerInfo *root, RelOptInfo *rel,
/* Compute inner-join size */
orig_selec = clause_selectivity(root, (Node *) join_or_rinfo,
- 0, JOIN_INNER, &sjinfo);
+ 0, JOIN_INNER, &sjinfo, NIL);
/* And hack cached selectivity so join size remains the same */
join_or_rinfo->norm_selec = orig_selec / or_selec;
diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index 46c95b0..7d0a3a1 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -1627,13 +1627,15 @@ booltestsel(PlannerInfo *root, BoolTestType booltesttype, Node *arg,
case IS_NOT_FALSE:
selec = (double) clause_selectivity(root, arg,
varRelid,
- jointype, sjinfo);
+ jointype, sjinfo,
+ NIL);
break;
case IS_FALSE:
case IS_NOT_TRUE:
selec = 1.0 - (double) clause_selectivity(root, arg,
varRelid,
- jointype, sjinfo);
+ jointype, sjinfo,
+ NIL);
break;
default:
elog(ERROR, "unrecognized booltesttype: %d",
@@ -6259,7 +6261,8 @@ genericcostestimate(PlannerInfo *root,
indexSelectivity = clauselist_selectivity(root, selectivityQuals,
index->rel->relid,
JOIN_INNER,
- NULL);
+ NULL,
+ NIL);
/*
* If caller didn't give us an estimate, estimate the number of index
@@ -6579,7 +6582,8 @@ btcostestimate(PlannerInfo *root, IndexPath *path, double loop_count,
btreeSelectivity = clauselist_selectivity(root, selectivityQuals,
index->rel->relid,
JOIN_INNER,
- NULL);
+ NULL,
+ NIL);
numIndexTuples = btreeSelectivity * index->rel->tuples;
/*
@@ -7330,7 +7334,8 @@ gincostestimate(PlannerInfo *root, IndexPath *path, double loop_count,
*indexSelectivity = clauselist_selectivity(root, selectivityQuals,
index->rel->relid,
JOIN_INNER,
- NULL);
+ NULL,
+ NIL);
/* fetch estimated page cost for tablespace containing index */
get_tablespace_page_costs(index->reltablespace,
@@ -7560,7 +7565,7 @@ brincostestimate(PlannerInfo *root, IndexPath *path, double loop_count,
*indexSelectivity =
clauselist_selectivity(root, indexQuals,
path->indexinfo->rel->relid,
- JOIN_INNER, NULL);
+ JOIN_INNER, NULL, NIL);
*indexCorrelation = 1;
/*
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index ea5a09a..27a8de5 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -75,6 +75,7 @@
#include "utils/bytea.h"
#include "utils/guc_tables.h"
#include "utils/memutils.h"
+#include "utils/mvstats.h"
#include "utils/pg_locale.h"
#include "utils/plancache.h"
#include "utils/portal.h"
@@ -393,6 +394,15 @@ static const struct config_enum_entry force_parallel_mode_options[] = {
};
/*
+ * Search algorithm for multivariate stats.
+ */
+static const struct config_enum_entry mvstat_search_options[] = {
+ {"greedy", MVSTAT_SEARCH_GREEDY, false},
+ {"exhaustive", MVSTAT_SEARCH_EXHAUSTIVE, false},
+ {NULL, 0, false}
+};
+
+/*
* Options for enum values stored in other modules
*/
extern const struct config_enum_entry wal_level_options[];
@@ -3707,6 +3717,16 @@ static struct config_enum ConfigureNamesEnum[] =
NULL, NULL, NULL
},
+ {
+ {"mvstat_search", PGC_USERSET, QUERY_TUNING_OTHER,
+ gettext_noop("Sets the algorithm used for combining multivariate stats."),
+ NULL
+ },
+ &mvstat_search_type,
+ MVSTAT_SEARCH_GREEDY, mvstat_search_options,
+ NULL, NULL, NULL
+ },
+
/* End-of-list marker */
{
{NULL, 0, 0, NULL, NULL}, NULL, 0, NULL, NULL, NULL, NULL
diff --git a/src/backend/utils/mvstats/README.stats b/src/backend/utils/mvstats/README.stats
index 3e4f4d1..d404914 100644
--- a/src/backend/utils/mvstats/README.stats
+++ b/src/backend/utils/mvstats/README.stats
@@ -90,6 +90,137 @@ even attempting to do the more expensive estimation.
Whenever we find there are no suitable stats, we skip the expensive steps.
+Combining multiple statistics
+-----------------------------
+
+When estimating selectivity of a list of clauses, there may exist no statistics
+covering all of them. If there are multiple statistics, each covering some
+subset of the attributes, the optimizer needs to figure out which of those
+statistics to apply.
+
+When the statistics do not overlap, the solution is trivial - we can simply
+split the conditions into groups by the matching statistics, and then multiply
+the group selectivities. For example assume multivariate statistics on (b,c) and (d,e),
+and a condition like this:
+
+ (a=1) AND (b=2) AND (c=3) AND (d=4) AND (e=5)
+
+Then (a=1) is not covered by any of the statistics, so it will be estimated
+using the regular per-column statistics. The two conditions ((b=2) AND (c=3))
+will be estimated using the (b,c) statistics, and ((d=4) AND (e=5)) using the
+(d,e) statistics. The resulting selectivities are then multiplied together.
+
+Now, what if the statistics overlap? For example assume the same condition as
+above, but let's say we have statistics on (a,b,c) and (a,c,d,e). What then?
+
+As selectivity is just a probability that the condition holds for a random row,
+we can write the selectivity like this:
+
+ P(a=1 & b=2 & c=3 & d=4 & e=5)
+
+and we can rewrite it using conditional probability like this
+
+ P(a=1 & b=2 & c=3) * P(d=4 & e=5 | a=1 & b=2 & c=3)
+
+Notice that the first part already matches the (a,b,c) statistics. If we assume
+that columns that are not referenced by the same statistics are independent, we
+may rewrite the second half like this
+
+ P(d=4 & e=5 | a=1 & b=2 & c=3) = P(d=4 & e=5 | a=1 & c=3)
+
+which corresponds to the statistics on (a,c,d,e).
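+
+For illustration, with made-up numbers: if the (a,b,c) statistics estimate
+P(a=1 & b=2 & c=3) = 0.01, and the (a,c,d,e) statistics estimate the
+conditional probability P(d=4 & e=5 | a=1 & c=3) = 0.5, the combined
+selectivity is
+
+ 0.01 * 0.5 = 0.005
+
+instead of the plain product of the per-clause selectivities.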
+
+If there are multiple statistics defined on a table, it's not difficult to come
+up with examples where there are multiple ways to combine them to cover a list of
+clauses. We need a way to find the best combination of statistics.
+
+This is the purpose of choose_mv_statistics(). It searches through the possible
+combinations of statistics, looking for the combination that
+
+ (a) covers the most clauses of the list
+
+ (b) reuses the maximum number of clauses as conditions
+ (in conditional probabilities)
+
+While criterion (a) seems natural, (b) may seem a bit awkward at first. The
+idea is that conditions are a way of transferring information about dependencies
+between statistics.
+
+There are two alternative implementations of choose_mv_statistics() - greedy
+and exhaustive. Exhaustive actually searches through all possible combinations
+of statistics, and for larger numbers of statistics may get quite expensive
+(as it, unsurprisingly, has exponential cost). Greedy terminates in less than
+K steps (when K is the number of clauses), and in each step chooses the best
+next statistics. I've been unable to come up with an example where those two
+approaches would produce different combinations.
+
+It's possible to choose the algorithm using the mvstat_search_type GUC, with either
+'greedy' or 'exhaustive' values (default is 'greedy').
+
+ SET mvstat_search_type = 'exhaustive';
+
+Note: This is meant mostly for experimentation. I do expect we'll choose one of
+the algorithms and remove the GUC before commit.
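+
+For illustration, the greedy variant works roughly like this (a simplified
+sketch, not the exact choose_mv_statistics() implementation):
+
+ remaining = all compatible clauses
+ conditions = {}
+
+ while (remaining is not empty)
+ {
+ pick the statistics covering the most clauses in 'remaining',
+ using the already covered clauses as conditions
+
+ if (no statistics covers at least one remaining clause)
+ break
+
+ move the newly covered clauses from 'remaining' to 'conditions'
+ }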
+
+
+Limitations of combining statistics
+-----------------------------------
+
+As described in the section 'Combining multiple statistics', the current approach
+is based on transferring information between statistics by means of conditional
+probabilities. This is a relatively cheap and efficient approach, but it is
+based on two assumptions:
+
+ (1) The overlap between the statistics needs to be sufficiently large, i.e.
+ there need to be enough columns shared by the statistics to transfer
+ information about dependencies between the remaining columns.
+
+ (2) The query needs to include sufficient clauses on the shared columns.
+
+Why violating those assumptions is a problem can be illustrated with a simple
+example. Assume a table with three columns (a,b,c) containing exactly the same
+values, and statistics on (a,b) and (b,c):
+
+ CREATE TABLE test AS SELECT i AS a, i AS b, i AS c
+ FROM generate_series(1,1000) s(i);
+
+ CREATE STATISTICS s1 ON test (a,b) WITH (mcv);
+ CREATE STATISTICS s2 ON test (b,c) WITH (mcv);
+
+ ANALYZE test;
+
+First, let's estimate this query:
+
+ SELECT * FROM test WHERE (a < 10) AND (c < 10);
+
+Clearly, there are no conditions on 'b' (which is the only column shared by the
+two statistics), so we'll end up with an estimate based on the assumption of
+independence:
+
+ P(a < 10) * P(c < 10) = 0.01 * 0.01 = 0.0001
+
+That is a significant under-estimate, as the actual selectivity is about 0.01.
+
+But let's estimate another query:
+
+ SELECT * FROM test WHERE (a < 10) AND (b < 500) AND (c < 10);
+
+In this case, the estimate may be computed for example like this:
+
+ P[(a < 10) & (b < 500) & (c < 10)]
+ = P[(a < 10) & (b < 500)] * P[(c < 10) | (a < 10) & (b < 500)]
+ = P[(a < 10) & (b < 500)] * P[(c < 10) | (b < 500)]
+
+The trouble is that P(c < 10 | b < 500) evaluates to 0.02 - there is no
+statistic containing both (a) and (c), so they are assumed independent, and
+the condition on (b) does not transfer a sufficient amount of information
+between the two statistics. The resulting estimate is therefore roughly
+
+ 0.01 * 0.02 = 0.0002
+
+about 50x below the actual selectivity of 0.01.
+
+Currently, the only solution is to build statistics on all three columns, but
+see the 'Combining stats using convolution' section for ideas on how to
+improve this.
+
+
Further (possibly crazy) ideas
------------------------------
@@ -111,3 +242,38 @@ But of course, this may result in expensive estimation (CPU-wise).
So we might add a GUC to choose between a simple (single statistics) and thus
multi-statistic estimation, possibly table-level parameter (ALTER TABLE ...).
+
+
+Combining stats using convolution
+---------------------------------
+
+The current approach for combining statistics is based on conditional
+probabilities, and thus only works when the query includes conditions on the
+overlapping parts of the statistics. There may however be other ways to
+combine statistics, relaxing this requirement.
+
+Let's assume two histograms H1 and H2 - then combining them might work about
+like this:
+
+
+ for (buckets of H1, satisfying local conditions)
+ {
+ for (buckets of H2, overlapping with H1 bucket)
+ {
+ mark H2 bucket as 'valid'
+ }
+ }
+
+ s1 = s2 = 0.0
+ for (buckets of H2 marked as valid)
+ {
+ s1 += frequency
+
+ if (bucket satisfies local conditions)
+ s2 += frequency
+ }
+
+ s = (s2 / s1) /* final selectivity estimate */
+
+However this may quickly get non-trivial, e.g. when combining two statistics
+of different types (histogram vs. MCV).
diff --git a/src/include/optimizer/cost.h b/src/include/optimizer/cost.h
index 78c7cae..a5ac088 100644
--- a/src/include/optimizer/cost.h
+++ b/src/include/optimizer/cost.h
@@ -191,11 +191,13 @@ extern Selectivity clauselist_selectivity(PlannerInfo *root,
List *clauses,
int varRelid,
JoinType jointype,
- SpecialJoinInfo *sjinfo);
+ SpecialJoinInfo *sjinfo,
+ List *conditions);
extern Selectivity clause_selectivity(PlannerInfo *root,
Node *clause,
int varRelid,
JoinType jointype,
- SpecialJoinInfo *sjinfo);
+ SpecialJoinInfo *sjinfo,
+ List *conditions);
#endif /* COST_H */
diff --git a/src/include/utils/mvstats.h b/src/include/utils/mvstats.h
index f05a517..35b2f8e 100644
--- a/src/include/utils/mvstats.h
+++ b/src/include/utils/mvstats.h
@@ -17,6 +17,14 @@
#include "fmgr.h"
#include "commands/vacuum.h"
+typedef enum MVStatSearchType
+{
+ MVSTAT_SEARCH_EXHAUSTIVE, /* exhaustive search */
+ MVSTAT_SEARCH_GREEDY /* greedy search */
+} MVStatSearchType;
+
+extern int mvstat_search_type;
+
/*
* Degree of how much MCV item / histogram bucket matches a clause.
* This is then considered when computing the selectivity.
--
2.1.0
Attachment: 0007-multivariate-ndistinct-coefficients.patch (text/x-patch)
From e42a2efeb060692d0a1ebe23f28c654130b26dcd Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@pgaddict.com>
Date: Wed, 23 Dec 2015 02:07:58 +0100
Subject: [PATCH 7/9] multivariate ndistinct coefficients
---
doc/src/sgml/ref/create_statistics.sgml | 9 ++
src/backend/catalog/system_views.sql | 3 +-
src/backend/commands/analyze.c | 2 +-
src/backend/commands/statscmds.c | 11 +-
src/backend/optimizer/path/clausesel.c | 4 +
src/backend/optimizer/util/plancat.c | 4 +-
src/backend/utils/adt/selfuncs.c | 93 +++++++++++++++-
src/backend/utils/mvstats/Makefile | 2 +-
src/backend/utils/mvstats/README.ndistinct | 83 ++++++++++++++
src/backend/utils/mvstats/README.stats | 2 +
src/backend/utils/mvstats/common.c | 23 +++-
src/backend/utils/mvstats/mvdist.c | 171 +++++++++++++++++++++++++++++
src/include/catalog/pg_mv_statistic.h | 26 +++--
src/include/nodes/relation.h | 2 +
src/include/utils/mvstats.h | 9 +-
src/test/regress/expected/rules.out | 3 +-
16 files changed, 424 insertions(+), 23 deletions(-)
create mode 100644 src/backend/utils/mvstats/README.ndistinct
create mode 100644 src/backend/utils/mvstats/mvdist.c
diff --git a/doc/src/sgml/ref/create_statistics.sgml b/doc/src/sgml/ref/create_statistics.sgml
index fd3382e..80360a6 100644
--- a/doc/src/sgml/ref/create_statistics.sgml
+++ b/doc/src/sgml/ref/create_statistics.sgml
@@ -168,6 +168,15 @@ CREATE STATISTICS [ IF NOT EXISTS ] <replaceable class="PARAMETER">statistics_na
</listitem>
</varlistentry>
+ <varlistentry>
+ <term><literal>ndistinct</> (<type>boolean</>)</term>
+ <listitem>
+ <para>
+ Enables ndistinct coefficients for the statistics.
+ </para>
+ </listitem>
+ </varlistentry>
+
</variablelist>
</refsect2>
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 6afdee0..a550141 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -169,7 +169,8 @@ CREATE VIEW pg_mv_stats AS
length(S.stamcv) AS mcvbytes,
pg_mv_stats_mcvlist_info(S.stamcv) AS mcvinfo,
length(S.stahist) AS histbytes,
- pg_mv_stats_histogram_info(S.stahist) AS histinfo
+ pg_mv_stats_histogram_info(S.stahist) AS histinfo,
+ standcoeff AS ndcoeff
FROM (pg_mv_statistic S JOIN pg_class C ON (C.oid = S.starelid))
LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace);
diff --git a/src/backend/commands/analyze.c b/src/backend/commands/analyze.c
index cbaa4e1..0f6db77 100644
--- a/src/backend/commands/analyze.c
+++ b/src/backend/commands/analyze.c
@@ -582,7 +582,7 @@ do_analyze_rel(Relation onerel, int options, VacuumParams *params,
}
/* Build multivariate stats (if there are any). */
- build_mv_stats(onerel, numrows, rows, attr_cnt, vacattrstats);
+ build_mv_stats(onerel, totalrows, numrows, rows, attr_cnt, vacattrstats);
}
/*
diff --git a/src/backend/commands/statscmds.c b/src/backend/commands/statscmds.c
index b974655..6ea0e13 100644
--- a/src/backend/commands/statscmds.c
+++ b/src/backend/commands/statscmds.c
@@ -138,7 +138,8 @@ CreateStatistics(CreateStatsStmt *stmt)
/* by default build nothing */
bool build_dependencies = false,
build_mcv = false,
- build_histogram = false;
+ build_histogram = false,
+ build_ndistinct = false;
int32 max_buckets = -1,
max_mcv_items = -1;
@@ -221,6 +222,8 @@ CreateStatistics(CreateStatsStmt *stmt)
if (strcmp(opt->defname, "dependencies") == 0)
build_dependencies = defGetBoolean(opt);
+ else if (strcmp(opt->defname, "ndistinct") == 0)
+ build_ndistinct = defGetBoolean(opt);
else if (strcmp(opt->defname, "mcv") == 0)
build_mcv = defGetBoolean(opt);
else if (strcmp(opt->defname, "max_mcv_items") == 0)
@@ -275,10 +278,10 @@ CreateStatistics(CreateStatsStmt *stmt)
}
/* check that at least some statistics were requested */
- if (! (build_dependencies || build_mcv || build_histogram))
+ if (! (build_dependencies || build_mcv || build_histogram || build_ndistinct))
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
- errmsg("no statistics type (dependencies, mcv, histogram) was requested")));
+ errmsg("no statistics type (dependencies, mcv, histogram, ndistinct) was requested")));
/* now do some checking of the options */
if (require_mcv && (! build_mcv))
@@ -311,6 +314,7 @@ CreateStatistics(CreateStatsStmt *stmt)
values[Anum_pg_mv_statistic_deps_enabled -1] = BoolGetDatum(build_dependencies);
values[Anum_pg_mv_statistic_mcv_enabled -1] = BoolGetDatum(build_mcv);
values[Anum_pg_mv_statistic_hist_enabled -1] = BoolGetDatum(build_histogram);
+ values[Anum_pg_mv_statistic_ndist_enabled-1] = BoolGetDatum(build_ndistinct);
values[Anum_pg_mv_statistic_mcv_max_items -1] = Int32GetDatum(max_mcv_items);
values[Anum_pg_mv_statistic_hist_max_buckets -1] = Int32GetDatum(max_buckets);
@@ -318,6 +322,7 @@ CreateStatistics(CreateStatsStmt *stmt)
nulls[Anum_pg_mv_statistic_stadeps -1] = true;
nulls[Anum_pg_mv_statistic_stamcv -1] = true;
nulls[Anum_pg_mv_statistic_stahist -1] = true;
+ nulls[Anum_pg_mv_statistic_standist -1] = true;
/* insert the tuple into pg_mv_statistic */
mvstatrel = heap_open(MvStatisticRelationId, RowExclusiveLock);
diff --git a/src/backend/optimizer/path/clausesel.c b/src/backend/optimizer/path/clausesel.c
index c1b8999..2540da9 100644
--- a/src/backend/optimizer/path/clausesel.c
+++ b/src/backend/optimizer/path/clausesel.c
@@ -59,6 +59,7 @@ static void addRangeClause(RangeQueryClause **rqlist, Node *clause,
#define MV_CLAUSE_TYPE_FDEP 0x01
#define MV_CLAUSE_TYPE_MCV 0x02
#define MV_CLAUSE_TYPE_HIST 0x04
+#define MV_CLAUSE_TYPE_NDIST 0x08
static bool clause_is_mv_compatible(Node *clause, Index relid, Bitmapset **attnums,
int type);
@@ -3246,6 +3247,9 @@ stats_type_matches(MVStatisticInfo *stat, int type)
if ((type & MV_CLAUSE_TYPE_HIST) && stat->hist_built)
return true;
+ if ((type & MV_CLAUSE_TYPE_NDIST) && stat->ndist_built)
+ return true;
+
return false;
}
diff --git a/src/backend/optimizer/util/plancat.c b/src/backend/optimizer/util/plancat.c
index d46aed2..bd2c306 100644
--- a/src/backend/optimizer/util/plancat.c
+++ b/src/backend/optimizer/util/plancat.c
@@ -416,7 +416,7 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
mvstat = (Form_pg_mv_statistic) GETSTRUCT(htup);
/* unavailable stats are not interesting for the planner */
- if (mvstat->deps_built || mvstat->mcv_built || mvstat->hist_built)
+ if (mvstat->deps_built || mvstat->mcv_built || mvstat->hist_built || mvstat->ndist_built)
{
info = makeNode(MVStatisticInfo);
@@ -427,11 +427,13 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
info->deps_enabled = mvstat->deps_enabled;
info->mcv_enabled = mvstat->mcv_enabled;
info->hist_enabled = mvstat->hist_enabled;
+ info->ndist_enabled = mvstat->ndist_enabled;
/* built/available statistics */
info->deps_built = mvstat->deps_built;
info->mcv_built = mvstat->mcv_built;
info->hist_built = mvstat->hist_built;
+ info->ndist_built = mvstat->ndist_built;
/* stakeys */
adatum = SysCacheGetAttr(MVSTATOID, htup,
diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index 7d0a3a1..a84dd2b 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -132,6 +132,7 @@
#include "utils/fmgroids.h"
#include "utils/index_selfuncs.h"
#include "utils/lsyscache.h"
+#include "utils/mvstats.h"
#include "utils/nabstime.h"
#include "utils/pg_locale.h"
#include "utils/rel.h"
@@ -206,6 +207,7 @@ static Const *string_to_const(const char *str, Oid datatype);
static Const *string_to_bytea_const(const char *str, size_t str_len);
static List *add_predicate_to_quals(IndexOptInfo *index, List *indexQuals);
+static Oid find_ndistinct_coeff(PlannerInfo *root, RelOptInfo *rel, List *varinfos);
/*
* eqsel - Selectivity of "=" for any data types.
@@ -3422,12 +3424,26 @@ estimate_num_groups(PlannerInfo *root, List *groupExprs, double input_rows,
* don't know by how much. We should never clamp to less than the
* largest ndistinct value for any of the Vars, though, since
* there will surely be at least that many groups.
+ *
+ * However we don't need to do this if we have ndistinct stats on
+ * the columns - in that case we can simply use the coefficient
+ * to get the (probably way more accurate) estimate.
+ *
+ * XXX Probably needs refactoring (don't like mixing the clamp
+ * and the coeff logic like this).
*/
double clamp = rel->tuples;
+ double coeff = 1.0;
if (relvarcount > 1)
{
- clamp *= 0.1;
+ Oid oid = find_ndistinct_coeff(root, rel, varinfos);
+
+ if (oid != InvalidOid)
+ coeff = load_mv_ndistinct(oid);
+ else
+ clamp *= 0.1;
+
if (clamp < relmaxndistinct)
{
clamp = relmaxndistinct;
@@ -3436,6 +3452,13 @@ estimate_num_groups(PlannerInfo *root, List *groupExprs, double input_rows,
clamp = rel->tuples;
}
}
+
+ /*
+ * Apply the ndistinct coefficient from the multivariate stats (we
+ * must do this before clamping the estimate in any way).
+ */
+ reldistinct /= coeff;
+
if (reldistinct > clamp)
reldistinct = clamp;
@@ -7582,3 +7605,71 @@ brincostestimate(PlannerInfo *root, IndexPath *path, double loop_count,
/* XXX what about pages_per_range? */
}
+
+/*
+ * Find applicable ndistinct statistics and compute the coefficient to
+ * correct the estimate (simply a product of per-column ndistincts).
+ *
+ * Currently we only look for a perfect match, i.e. a single ndistinct
+ * statistics covering exactly the grouped columns.
+ */
+static Oid
+find_ndistinct_coeff(PlannerInfo *root, RelOptInfo *rel, List *varinfos)
+{
+ ListCell *lc;
+ Bitmapset *attnums = NULL;
+ VariableStatData vardata;
+
+ foreach(lc, varinfos)
+ {
+ GroupVarInfo *varinfo = (GroupVarInfo *) lfirst(lc);
+
+ if (varinfo->rel != rel)
+ continue;
+
+ /* FIXME handle general expressions, not only simple Vars */
+
+ /*
+ * examine the variable (or expression) so that we know which
+ * attribute we're dealing with - we need this for matching the
+ * ndistinct coefficient
+ *
+ * FIXME we could probably remember this from estimate_num_groups
+ */
+ examine_variable(root, varinfo->var, 0, &vardata);
+
+ if (HeapTupleIsValid(vardata.statsTuple))
+ {
+ Form_pg_statistic stats
+ = (Form_pg_statistic) GETSTRUCT(vardata.statsTuple);
+
+ attnums = bms_add_member(attnums, stats->staattnum);
+
+ ReleaseVariableStats(vardata);
+ }
+ }
+
+ /* look for a matching ndistinct statistics */
+ foreach (lc, rel->mvstatlist)
+ {
+ int i;
+ MVStatisticInfo *info = (MVStatisticInfo *)lfirst(lc);
+
+ /* skip statistics without ndistinct coefficient built */
+ if (!info->ndist_built)
+ continue;
+
+ /* only exact matches for now (same set of columns) */
+ if (bms_num_members(attnums) != info->stakeys->dim1)
+ continue;
+
+ /* check that all the columns match */
+ for (i = 0; i < info->stakeys->dim1; i++)
+ if (! bms_is_member(info->stakeys->values[i], attnums))
+ break;
+
+ /* some column is not covered by this statistics, try the next one */
+ if (i < info->stakeys->dim1)
+ continue;
+
+ return info->mvoid;
+ }
+
+ return InvalidOid;
+}
diff --git a/src/backend/utils/mvstats/Makefile b/src/backend/utils/mvstats/Makefile
index 9dbb3b6..d4b88e9 100644
--- a/src/backend/utils/mvstats/Makefile
+++ b/src/backend/utils/mvstats/Makefile
@@ -12,6 +12,6 @@ subdir = src/backend/utils/mvstats
top_builddir = ../../../..
include $(top_builddir)/src/Makefile.global
-OBJS = common.o dependencies.o histogram.o mcv.o
+OBJS = common.o dependencies.o histogram.o mcv.o mvdist.o
include $(top_srcdir)/src/backend/common.mk
diff --git a/src/backend/utils/mvstats/README.ndistinct b/src/backend/utils/mvstats/README.ndistinct
new file mode 100644
index 0000000..32d1624
--- /dev/null
+++ b/src/backend/utils/mvstats/README.ndistinct
@@ -0,0 +1,83 @@
+ndistinct coefficients
+======================
+
+Estimating the number of distinct groups in a combination of columns is tricky,
+and the estimation error is often significant. By ndistinct coefficient we
+mean a ratio
+
+ q = ndistinct(a) * ndistinct(b) / ndistinct(a,b)
+
+where 'a' and 'b' are columns, ndistinct(a) is (an estimate of) the number of
+distinct values in column 'a', and ndistinct(a,b) is the same thing for the
+pair of columns.
+
+The meaning of the coefficient may be illustrated by answering the following
+question: Given a combination of columns (a,b), how many distinct values of 'b'
+match a chosen value of 'a' on average?
+
+Let's assume we know ndistinct(a) and ndistinct(a,b). Then the answer to the
+question clearly is
+
+ ndistinct(a,b) / ndistinct(a)
+
+and by using 'q' we may rewrite this as
+
+ ndistinct(b) / q
+
+so 'q' may be considered as a correction factor of the ndistinct estimate given
+a condition on one of the columns.
+
+This may be generalized to a combination of 'n' columns
+
+ [ndistinct(c1) * ... * ndistinct(cn)] / ndistinct(c1, ..., cn)
+
+and the meaning is very similar, except that we need to use conditions on (n-1)
+of the columns.
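+
+For example, with made-up numbers: if ndistinct(a) = 100, ndistinct(b) = 100
+and ndistinct(a,b) = 1000, then
+
+ q = (100 * 100) / 1000 = 10
+
+i.e. on average each value of 'a' co-occurs with ndistinct(b) / q = 10
+distinct values of 'b', not with all 100 of them.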
+
+
+Selectivity estimation
+----------------------
+
+As explained in the previous section, ndistinct coefficients may be used to
+estimate the cardinality of a column, given some a priori knowledge. Let's assume
+we need to estimate selectivity of a condition
+
+ (a=1) AND (b=2)
+
+which we can expand like this
+
+ P(a=1 & b=2) = P(a=1) * P(b=2 | a=1)
+
+Let's also assume that the distributions of the columns are uniform, i.e. that
+
+ P(a=1) = 1/ndistinct(a)
+ P(b=2) = 1/ndistinct(b)
+ P(a=1 & b=2) = 1/ndistinct(a,b)
+
+ P(b=2 | a=1) = ndistinct(a) / ndistinct(a,b)
+
+which may be rewritten like
+
+ P(b=2 | a=1)
+ = ndistinct(a) / ndistinct(a,b)
+ = (1/ndistinct(b)) * [(ndistinct(a) * ndistinct(b)) / ndistinct(a,b)]
+ = (1/ndistinct(b)) * q
+
+and therefore
+
+ P(a=1 & b=2) = (1/ndistinct(a)) * (1/ndistinct(b)) * q
+
+This also illustrates 'q' as a correction coefficient.
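+
+With the made-up numbers used before (ndistinct(a) = ndistinct(b) = 100,
+q = 10), this gives
+
+ P(a=1 & b=2) = (1/100) * (1/100) * 10 = 0.001
+
+ten times more than the 0.0001 produced by the independence assumption.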
+
+It also explains why we store the coefficient and not simply ndistinct(a,b).
+This way we can estimate the individual clauses as usual and then correct
+the estimate by multiplying the result with 'q' - we don't have to mess with
+the ndistinct estimates at all.
+
+Naturally, as the coefficient is derived from ndistinct(a,b), it may also be
+used to estimate GROUP BY clauses on the combination of columns, replacing the
+existing heuristics in estimate_num_groups().
+
+Note: Currently only the GROUP BY estimation is implemented. It's a bit unclear
+how to implement the clause estimation when there are other statistics (esp.
+MCV lists and/or functional dependencies) available.
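+
+A usage example (on a hypothetical table 't', mirroring the CREATE STATISTICS
+syntax used in README.stats):
+
+ CREATE STATISTICS s3 ON t (a,b) WITH (ndistinct);
+ ANALYZE t;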
diff --git a/src/backend/utils/mvstats/README.stats b/src/backend/utils/mvstats/README.stats
index d404914..6d4b09b 100644
--- a/src/backend/utils/mvstats/README.stats
+++ b/src/backend/utils/mvstats/README.stats
@@ -20,6 +20,8 @@ Currently we only have two kinds of multivariate statistics
(c) multivariate histograms (README.histogram)
+ (d) ndistinct coefficients
+
Compatible clause types
-----------------------
diff --git a/src/backend/utils/mvstats/common.c b/src/backend/utils/mvstats/common.c
index ffb76f4..2be980d 100644
--- a/src/backend/utils/mvstats/common.c
+++ b/src/backend/utils/mvstats/common.c
@@ -32,7 +32,8 @@ static List* list_mv_stats(Oid relid);
* and serializes them back into the catalog (as bytea values).
*/
void
-build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
+build_mv_stats(Relation onerel, double totalrows,
+ int numrows, HeapTuple *rows,
int natts, VacAttrStats **vacattrstats)
{
ListCell *lc;
@@ -53,6 +54,7 @@ build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
MVDependencies deps = NULL;
MCVList mcvlist = NULL;
MVHistogram histogram = NULL;
+ double ndist = -1;
int numrows_filtered = numrows;
VacAttrStats **stats = NULL;
@@ -92,6 +94,9 @@ build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
if (stat->deps_enabled)
deps = build_mv_dependencies(numrows, rows, attrs, stats);
+ if (stat->ndist_enabled)
+ ndist = build_mv_ndistinct(totalrows, numrows, rows, attrs, stats);
+
/* build the MCV list */
if (stat->mcv_enabled)
mcvlist = build_mv_mcvlist(numrows, rows, attrs, stats, &numrows_filtered);
@@ -101,7 +106,7 @@ build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
histogram = build_mv_histogram(numrows_filtered, rows, attrs, stats, numrows);
/* store the histogram / MCV list in the catalog */
- update_mv_stats(stat->mvoid, deps, mcvlist, histogram, attrs, stats);
+ update_mv_stats(stat->mvoid, deps, mcvlist, histogram, ndist, attrs, stats);
}
}
@@ -183,6 +188,8 @@ list_mv_stats(Oid relid)
info->mcv_built = stats->mcv_built;
info->hist_enabled = stats->hist_enabled;
info->hist_built = stats->hist_built;
+ info->ndist_enabled = stats->ndist_enabled;
+ info->ndist_built = stats->ndist_built;
result = lappend(result, info);
}
@@ -252,7 +259,7 @@ find_mv_attnums(Oid mvoid, Oid *relid)
void
update_mv_stats(Oid mvoid,
MVDependencies dependencies, MCVList mcvlist, MVHistogram histogram,
- int2vector *attrs, VacAttrStats **stats)
+ double ndistcoeff, int2vector *attrs, VacAttrStats **stats)
{
HeapTuple stup,
oldtup;
@@ -292,26 +299,36 @@ update_mv_stats(Oid mvoid,
= PointerGetDatum(data);
}
+ if (ndistcoeff > 1.0)
+ {
+ nulls[Anum_pg_mv_statistic_standist -1] = false;
+ values[Anum_pg_mv_statistic_standist-1] = Float8GetDatum(ndistcoeff);
+ }
+
/* always replace the value (either by bytea or NULL) */
replaces[Anum_pg_mv_statistic_stadeps -1] = true;
replaces[Anum_pg_mv_statistic_stamcv -1] = true;
replaces[Anum_pg_mv_statistic_stahist-1] = true;
+ replaces[Anum_pg_mv_statistic_standist-1] = true;
/* always change the availability flags */
nulls[Anum_pg_mv_statistic_deps_built -1] = false;
nulls[Anum_pg_mv_statistic_mcv_built -1] = false;
nulls[Anum_pg_mv_statistic_hist_built-1] = false;
+ nulls[Anum_pg_mv_statistic_ndist_built-1] = false;
nulls[Anum_pg_mv_statistic_stakeys-1] = false;
/* use the new attnums, in case we removed some dropped ones */
replaces[Anum_pg_mv_statistic_deps_built-1] = true;
replaces[Anum_pg_mv_statistic_mcv_built -1] = true;
+ replaces[Anum_pg_mv_statistic_ndist_built-1] = true;
replaces[Anum_pg_mv_statistic_hist_built -1] = true;
replaces[Anum_pg_mv_statistic_stakeys -1] = true;
values[Anum_pg_mv_statistic_deps_built-1] = BoolGetDatum(dependencies != NULL);
values[Anum_pg_mv_statistic_mcv_built -1] = BoolGetDatum(mcvlist != NULL);
values[Anum_pg_mv_statistic_hist_built -1] = BoolGetDatum(histogram != NULL);
+ values[Anum_pg_mv_statistic_ndist_built-1] = BoolGetDatum(ndistcoeff > 1.0);
values[Anum_pg_mv_statistic_stakeys -1] = PointerGetDatum(attrs);
/* Is there already a pg_mv_statistic tuple for this attribute? */
diff --git a/src/backend/utils/mvstats/mvdist.c b/src/backend/utils/mvstats/mvdist.c
new file mode 100644
index 0000000..59b8358
--- /dev/null
+++ b/src/backend/utils/mvstats/mvdist.c
@@ -0,0 +1,171 @@
+/*-------------------------------------------------------------------------
+ *
+ * mvdist.c
+ * POSTGRES multivariate distinct coefficients
+ *
+ *
+ * Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ * src/backend/utils/mvstats/mvdist.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include <math.h>
+
+#include "common.h"
+#include "utils/lsyscache.h"
+
+static double estimate_ndistinct(double totalrows, int numrows, int d, int f1);
+
+/*
+ * Compute ndistinct coefficient for the combination of attributes. This
+ * computes the ndistinct estimate using the same estimator used in analyze.c
+ * and then computes the coefficient.
+ */
+double
+build_mv_ndistinct(double totalrows, int numrows, HeapTuple *rows,
+ int2vector *attrs, VacAttrStats **stats)
+{
+ int i, j;
+ int f1, cnt, d;
+ int nmultiple = 0, summultiple = 0;
+ int numattrs = attrs->dim1;
+ MultiSortSupport mss = multi_sort_init(numattrs);
+ double ndistcoeff;
+
+ /*
+ * It's possible to sort the sample rows directly, but this seemed
+ * somewhat simpler / less error prone. Another option would be to
+ * allocate the arrays for each SortItem separately, but that'd be
+ * significant overhead (not just CPU, but especially memory bloat).
+ */
+ SortItem * items = (SortItem*)palloc0(numrows * sizeof(SortItem));
+
+ Datum *values = (Datum*)palloc0(sizeof(Datum) * numrows * numattrs);
+ bool *isnull = (bool*)palloc0(sizeof(bool) * numrows * numattrs);
+
+ for (i = 0; i < numrows; i++)
+ {
+ items[i].values = &values[i * numattrs];
+ items[i].isnull = &isnull[i * numattrs];
+ }
+
+ Assert(numattrs >= 2);
+
+ for (i = 0; i < numattrs; i++)
+ {
+ /* prepare the sort function for this dimension */
+ multi_sort_add_dimension(mss, i, i, stats);
+
+ /* accumulate all the data into the array and sort it */
+ for (j = 0; j < numrows; j++)
+ {
+ items[j].values[i]
+ = heap_getattr(rows[j], attrs->values[i],
+ stats[i]->tupDesc, &items[j].isnull[i]);
+ }
+ }
+
+ qsort_arg((void *) items, numrows, sizeof(SortItem),
+ multi_sort_compare, mss);
+
+ /* count number of distinct combinations */
+
+ f1 = 0;
+ cnt = 1;
+ d = 1;
+ for (i = 1; i < numrows; i++)
+ {
+ if (multi_sort_compare(&items[i], &items[i-1], mss) != 0)
+ {
+ if (cnt == 1)
+ f1 += 1;
+ else
+ {
+ nmultiple += 1;
+ summultiple += cnt;
+ }
+
+ d++;
+ cnt = 0;
+ }
+
+ cnt += 1;
+ }
+
+ if (cnt == 1)
+ f1 += 1;
+ else
+ {
+ nmultiple += 1;
+ summultiple += cnt;
+ }
+
+ ndistcoeff = 1 / estimate_ndistinct(totalrows, numrows, d, f1);
+
+ /*
+ * now count distinct values for each attribute and incrementally
+ * compute ndistinct(a,b) / (ndistinct(a) * ndistinct(b))
+ *
+ * FIXME Probably need to handle cases when one of the ndistinct
+ * estimates is negative, and also check that the combined
+ * ndistinct is greater than any of those partial values.
+ */
+ for (i = 0; i < numattrs; i++)
+ ndistcoeff *= stats[i]->stadistinct;
+
+ return ndistcoeff;
+}
+
+double
+load_mv_ndistinct(Oid mvoid)
+{
+ bool isnull = false;
+ Datum ndist;
+
+ /* Fetch the pg_mv_statistic tuple for the given statistics OID. */
+ HeapTuple htup = SearchSysCache1(MVSTATOID, ObjectIdGetDatum(mvoid));
+
+#ifdef USE_ASSERT_CHECKING
+ Form_pg_mv_statistic mvstat = (Form_pg_mv_statistic) GETSTRUCT(htup);
+ Assert(mvstat->ndist_enabled && mvstat->ndist_built);
+#endif
+
+ ndist = SysCacheGetAttr(MVSTATOID, htup,
+ Anum_pg_mv_statistic_standist, &isnull);
+
+ Assert(!isnull);
+
+ ReleaseSysCache(htup);
+
+ return DatumGetFloat8(ndist);
+}
+
+/* The Duj1 estimator (already used in analyze.c). */
+static double
+estimate_ndistinct(double totalrows, int numrows, int d, int f1)
+{
+ double numer,
+ denom,
+ ndistinct;
+
+ numer = (double) numrows *(double) d;
+
+ denom = (double) (numrows - f1) +
+ (double) f1 * (double) numrows / totalrows;
+
+ ndistinct = numer / denom;
+
+ /* Clamp to sane range in case of roundoff error */
+ if (ndistinct < (double) d)
+ ndistinct = (double) d;
+
+ if (ndistinct > totalrows)
+ ndistinct = totalrows;
+
+ return floor(ndistinct + 0.5);
+}
diff --git a/src/include/catalog/pg_mv_statistic.h b/src/include/catalog/pg_mv_statistic.h
index a5945af..ee353da 100644
--- a/src/include/catalog/pg_mv_statistic.h
+++ b/src/include/catalog/pg_mv_statistic.h
@@ -39,6 +39,7 @@ CATALOG(pg_mv_statistic,3381)
bool deps_enabled; /* analyze dependencies? */
bool mcv_enabled; /* build MCV list? */
bool hist_enabled; /* build histogram? */
+ bool ndist_enabled; /* build ndist coefficient? */
/* histogram / MCV size */
int32 mcv_max_items; /* max MCV items */
@@ -48,6 +49,7 @@ CATALOG(pg_mv_statistic,3381)
bool deps_built; /* dependencies were built */
bool mcv_built; /* MCV list was built */
bool hist_built; /* histogram was built */
+ bool ndist_built; /* ndistinct coeff built */
/* variable-length fields start here, but we allow direct access to stakeys */
int2vector stakeys; /* array of column keys */
@@ -56,6 +58,7 @@ CATALOG(pg_mv_statistic,3381)
bytea stadeps; /* dependencies (serialized) */
bytea stamcv; /* MCV list (serialized) */
bytea stahist; /* MV histogram (serialized) */
+ float8 standcoeff; /* ndistinct coefficient */
#endif
} FormData_pg_mv_statistic;
@@ -71,21 +74,24 @@ typedef FormData_pg_mv_statistic *Form_pg_mv_statistic;
* compiler constants for pg_mv_statistic
* ----------------
*/
-#define Natts_pg_mv_statistic 15
+#define Natts_pg_mv_statistic 18
#define Anum_pg_mv_statistic_starelid 1
#define Anum_pg_mv_statistic_staname 2
#define Anum_pg_mv_statistic_stanamespace 3
#define Anum_pg_mv_statistic_deps_enabled 4
#define Anum_pg_mv_statistic_mcv_enabled 5
#define Anum_pg_mv_statistic_hist_enabled 6
-#define Anum_pg_mv_statistic_mcv_max_items 7
-#define Anum_pg_mv_statistic_hist_max_buckets 8
-#define Anum_pg_mv_statistic_deps_built 9
-#define Anum_pg_mv_statistic_mcv_built 10
-#define Anum_pg_mv_statistic_hist_built 11
-#define Anum_pg_mv_statistic_stakeys 12
-#define Anum_pg_mv_statistic_stadeps 13
-#define Anum_pg_mv_statistic_stamcv 14
-#define Anum_pg_mv_statistic_stahist 15
+#define Anum_pg_mv_statistic_ndist_enabled 7
+#define Anum_pg_mv_statistic_mcv_max_items 8
+#define Anum_pg_mv_statistic_hist_max_buckets 9
+#define Anum_pg_mv_statistic_deps_built 10
+#define Anum_pg_mv_statistic_mcv_built 11
+#define Anum_pg_mv_statistic_hist_built 12
+#define Anum_pg_mv_statistic_ndist_built 13
+#define Anum_pg_mv_statistic_stakeys 14
+#define Anum_pg_mv_statistic_stadeps 15
+#define Anum_pg_mv_statistic_stamcv 16
+#define Anum_pg_mv_statistic_stahist 17
+#define Anum_pg_mv_statistic_standist 18
#endif /* PG_MV_STATISTIC_H */
diff --git a/src/include/nodes/relation.h b/src/include/nodes/relation.h
index 46bece6..a2fafd2 100644
--- a/src/include/nodes/relation.h
+++ b/src/include/nodes/relation.h
@@ -621,11 +621,13 @@ typedef struct MVStatisticInfo
bool deps_enabled; /* functional dependencies enabled */
bool mcv_enabled; /* MCV list enabled */
bool hist_enabled; /* histogram enabled */
+ bool ndist_enabled; /* ndistinct coefficient enabled */
/* built/available statistics */
bool deps_built; /* functional dependencies built */
bool mcv_built; /* MCV list built */
bool hist_built; /* histogram built */
+ bool ndist_built; /* ndistinct coefficient built */
/* columns in the statistics (attnums) */
int2vector *stakeys; /* attnums of the columns covered */
diff --git a/src/include/utils/mvstats.h b/src/include/utils/mvstats.h
index 35b2f8e..fb2c5d8 100644
--- a/src/include/utils/mvstats.h
+++ b/src/include/utils/mvstats.h
@@ -225,6 +225,7 @@ typedef MVSerializedHistogramData *MVSerializedHistogram;
MVDependencies load_mv_dependencies(Oid mvoid);
MCVList load_mv_mcvlist(Oid mvoid);
MVSerializedHistogram load_mv_histogram(Oid mvoid);
+double load_mv_ndistinct(Oid mvoid);
bytea * serialize_mv_dependencies(MVDependencies dependencies);
bytea * serialize_mv_mcvlist(MCVList mcvlist, int2vector *attrs,
@@ -266,11 +267,17 @@ MVHistogram
build_mv_histogram(int numrows, HeapTuple *rows, int2vector *attrs,
VacAttrStats **stats, int numrows_total);
-void build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
+double
+build_mv_ndistinct(double totalrows, int numrows, HeapTuple *rows,
+ int2vector *attrs, VacAttrStats **stats);
+
+void build_mv_stats(Relation onerel, double totalrows,
+ int numrows, HeapTuple *rows,
int natts, VacAttrStats **vacattrstats);
void update_mv_stats(Oid relid, MVDependencies dependencies,
MCVList mcvlist, MVHistogram histogram,
+ double ndistcoeff,
int2vector *attrs, VacAttrStats **stats);
#ifdef DEBUG_MVHIST
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 1a1a4ca..0ad935e 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1377,7 +1377,8 @@ pg_mv_stats| SELECT n.nspname AS schemaname,
length(s.stamcv) AS mcvbytes,
pg_mv_stats_mcvlist_info(s.stamcv) AS mcvinfo,
length(s.stahist) AS histbytes,
- pg_mv_stats_histogram_info(s.stahist) AS histinfo
+ pg_mv_stats_histogram_info(s.stahist) AS histinfo,
+ s.standcoeff AS ndcoeff
FROM ((pg_mv_statistic s
JOIN pg_class c ON ((c.oid = s.starelid)))
LEFT JOIN pg_namespace n ON ((n.oid = c.relnamespace)));
--
2.1.0
Attachment: 0008-change-how-we-apply-selectivity-to-number-of-groups-.patch (text/x-patch)
From 16df0859ba9478af4d93fc8fe45f17b4f255e1a8 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@pgaddict.com>
Date: Tue, 26 Jan 2016 18:14:33 +0100
Subject: [PATCH 8/9] change how we apply selectivity to number of groups
estimate
Instead of simply multiplying the ndistinct estimate by the selectivity,
use the formula for the expected number of distinct values observed in
'k' rows when there are 'd' distinct values in the bin

    d * (1 - ((d - 1) / d)^k)

This samples 'with replacement', which seems appropriate here, and it
mostly assumes uniform distribution of the distinct values. So if the
distribution is not uniform (e.g. there are very frequent groups) this
may be less accurate than the current algorithm in some cases, giving
over-estimates. But that's probably better than OOM.
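
For illustration, with made-up numbers: for d = 100 distinct values in
the bin, observing k = 10 rows is expected to yield

    100 * (1 - (99/100)^10) ~= 9.6

distinct values, while k = 1000 rows already yields ~100, i.e. the
estimate approaches 'd' as the number of observed rows grows.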
---
src/backend/utils/adt/selfuncs.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index a84dd2b..ce3ad19 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -3465,7 +3465,7 @@ estimate_num_groups(PlannerInfo *root, List *groupExprs, double input_rows,
/*
* Multiply by restriction selectivity.
*/
- reldistinct *= rel->rows / rel->tuples;
+ reldistinct = reldistinct * (1 - powl((reldistinct - 1) / reldistinct, rel->rows));
/*
* Update estimate of total distinct groups.
--
2.1.0
Attachment: 0009-fixup-of-regression-tests-plans-changes-by-group-by-.patch (text/x-patch)
From 60ab2e6675b5d43f5cebccb7fd06c7e7387992f3 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@pgaddict.com>
Date: Sun, 28 Feb 2016 21:16:40 +0100
Subject: [PATCH 9/9] fixup of regression tests (plans changes by group by
estimation)
---
src/test/regress/expected/join.out | 20 ++++++++++----------
src/test/regress/expected/subselect.out | 25 +++++++++++--------------
src/test/regress/expected/union.out | 16 ++++++++--------
3 files changed, 29 insertions(+), 32 deletions(-)
diff --git a/src/test/regress/expected/join.out b/src/test/regress/expected/join.out
index 59d7877..d9dd5ca 100644
--- a/src/test/regress/expected/join.out
+++ b/src/test/regress/expected/join.out
@@ -3951,17 +3951,17 @@ select d.* from d left join (select * from b group by b.id, b.c_id) s
on d.a = s.id;
QUERY PLAN
---------------------------------------
- Merge Left Join
- Merge Cond: (d.a = s.id)
- -> Sort
- Sort Key: d.a
- -> Seq Scan on d
+ Merge Right Join
+ Merge Cond: (s.id = d.a)
-> Sort
Sort Key: s.id
-> Subquery Scan on s
-> HashAggregate
Group Key: b.id
-> Seq Scan on b
+ -> Sort
+ Sort Key: d.a
+ -> Seq Scan on d
(11 rows)
-- similarly, but keying off a DISTINCT clause
@@ -3970,17 +3970,17 @@ select d.* from d left join (select distinct * from b) s
on d.a = s.id;
QUERY PLAN
---------------------------------------------
- Merge Left Join
- Merge Cond: (d.a = s.id)
- -> Sort
- Sort Key: d.a
- -> Seq Scan on d
+ Merge Right Join
+ Merge Cond: (s.id = d.a)
-> Sort
Sort Key: s.id
-> Subquery Scan on s
-> HashAggregate
Group Key: b.id, b.c_id
-> Seq Scan on b
+ -> Sort
+ Sort Key: d.a
+ -> Seq Scan on d
(11 rows)
-- check join removal works when uniqueness of the join condition is enforced
diff --git a/src/test/regress/expected/subselect.out b/src/test/regress/expected/subselect.out
index de64ca7..0fc93d9 100644
--- a/src/test/regress/expected/subselect.out
+++ b/src/test/regress/expected/subselect.out
@@ -807,27 +807,24 @@ select * from int4_tbl where
explain (verbose, costs off)
select * from int4_tbl o where (f1, f1) in
(select f1, generate_series(1,2) / 10 g from int4_tbl i group by f1);
- QUERY PLAN
-----------------------------------------------------------------------
- Hash Join
+ QUERY PLAN
+----------------------------------------------------------------
+ Hash Semi Join
Output: o.f1
Hash Cond: (o.f1 = "ANY_subquery".f1)
-> Seq Scan on public.int4_tbl o
Output: o.f1
-> Hash
Output: "ANY_subquery".f1, "ANY_subquery".g
- -> HashAggregate
+ -> Subquery Scan on "ANY_subquery"
Output: "ANY_subquery".f1, "ANY_subquery".g
- Group Key: "ANY_subquery".f1, "ANY_subquery".g
- -> Subquery Scan on "ANY_subquery"
- Output: "ANY_subquery".f1, "ANY_subquery".g
- Filter: ("ANY_subquery".f1 = "ANY_subquery".g)
- -> HashAggregate
- Output: i.f1, (generate_series(1, 2) / 10)
- Group Key: i.f1
- -> Seq Scan on public.int4_tbl i
- Output: i.f1
-(18 rows)
+ Filter: ("ANY_subquery".f1 = "ANY_subquery".g)
+ -> HashAggregate
+ Output: i.f1, (generate_series(1, 2) / 10)
+ Group Key: i.f1
+ -> Seq Scan on public.int4_tbl i
+ Output: i.f1
+(15 rows)
select * from int4_tbl o where (f1, f1) in
(select f1, generate_series(1,2) / 10 g from int4_tbl i group by f1);
diff --git a/src/test/regress/expected/union.out b/src/test/regress/expected/union.out
index 016571b..f2e297e 100644
--- a/src/test/regress/expected/union.out
+++ b/src/test/regress/expected/union.out
@@ -263,16 +263,16 @@ ORDER BY 1;
SELECT q2 FROM int8_tbl INTERSECT SELECT q1 FROM int8_tbl;
q2
------------------
- 4567890123456789
123
+ 4567890123456789
(2 rows)
SELECT q2 FROM int8_tbl INTERSECT ALL SELECT q1 FROM int8_tbl;
q2
------------------
+ 123
4567890123456789
4567890123456789
- 123
(3 rows)
SELECT q2 FROM int8_tbl EXCEPT SELECT q1 FROM int8_tbl ORDER BY 1;
@@ -305,16 +305,16 @@ SELECT q1 FROM int8_tbl EXCEPT SELECT q2 FROM int8_tbl;
SELECT q1 FROM int8_tbl EXCEPT ALL SELECT q2 FROM int8_tbl;
q1
------------------
- 4567890123456789
123
+ 4567890123456789
(2 rows)
SELECT q1 FROM int8_tbl EXCEPT ALL SELECT DISTINCT q2 FROM int8_tbl;
q1
------------------
+ 123
4567890123456789
4567890123456789
- 123
(3 rows)
SELECT q1 FROM int8_tbl EXCEPT ALL SELECT q1 FROM int8_tbl FOR NO KEY UPDATE;
@@ -343,8 +343,8 @@ SELECT f1 FROM float8_tbl EXCEPT SELECT f1 FROM int4_tbl ORDER BY 1;
SELECT q1 FROM int8_tbl INTERSECT SELECT q2 FROM int8_tbl UNION ALL SELECT q2 FROM int8_tbl;
q1
-------------------
- 4567890123456789
123
+ 4567890123456789
456
4567890123456789
123
@@ -355,15 +355,15 @@ SELECT q1 FROM int8_tbl INTERSECT SELECT q2 FROM int8_tbl UNION ALL SELECT q2 FR
SELECT q1 FROM int8_tbl INTERSECT (((SELECT q2 FROM int8_tbl UNION ALL SELECT q2 FROM int8_tbl)));
q1
------------------
- 4567890123456789
123
+ 4567890123456789
(2 rows)
(((SELECT q1 FROM int8_tbl INTERSECT SELECT q2 FROM int8_tbl))) UNION ALL SELECT q2 FROM int8_tbl;
q1
-------------------
- 4567890123456789
123
+ 4567890123456789
456
4567890123456789
123
@@ -419,8 +419,8 @@ HINT: There is a column named "q2" in table "*SELECT* 2", but it cannot be refe
SELECT q1 FROM int8_tbl EXCEPT (((SELECT q2 FROM int8_tbl ORDER BY q2 LIMIT 1)));
q1
------------------
- 4567890123456789
123
+ 4567890123456789
(2 rows)
--
--
2.1.0
On Tue, Mar 8, 2016 at 12:13 PM, Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:
Hi,
attached is v11 of the patch - this is mostly a cleanup of v10, removing
redundant code, adding missing comments, removing obsolete FIXME/TODOs
and so on. Overall this shaves ~20kB from the patch (not a primary
objective, though).
This has some conflicts with the pathification commit, in the
regression tests.
To avoid that, I applied it to the commit before that, 3fc6e2d7f5b652b417fa6^
Having done that, in my hands it fails its own regression tests.
Diff attached.
It breaks contrib postgres_fdw; I'll look into that when I get a
chance, if no one beats me to it.
postgres_fdw.c: In function 'postgresGetForeignJoinPaths':
postgres_fdw.c:3623: error: too few arguments to function
'clauselist_selectivity'
postgres_fdw.c:3642: error: too few arguments to function
'clauselist_selectivity'
Cheers,
Jeff
Attachments:
regression.diffs (application/octet-stream)
*** /home/jjanes/pgsql/git/src/test/regress/expected/mv_dependencies.out 2016-03-08 18:08:45.275328461 -0800
--- /home/jjanes/pgsql/git/src/test/regress/results/mv_dependencies.out 2016-03-08 18:17:34.914707058 -0800
***************
*** 21,26 ****
--- 21,28 ----
ERROR: unrecognized STATISTICS option "unknown_option"
-- correct command
CREATE STATISTICS s1 ON functional_dependencies (a, b, c) WITH (dependencies);
+ ERROR: duplicate key value violates unique constraint "pg_mv_statistic_name_index"
+ DETAIL: Key (staname, stanamespace)=(s1, 2200) already exists.
-- random data (no functional dependencies)
INSERT INTO functional_dependencies
SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
***************
*** 29,36 ****
FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
deps_enabled | deps_built | pg_mv_stats_dependencies_show
--------------+------------+-------------------------------
! t | f |
! (1 row)
TRUNCATE functional_dependencies;
-- a => b, a => c, b => c
--- 31,37 ----
FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
deps_enabled | deps_built | pg_mv_stats_dependencies_show
--------------+------------+-------------------------------
! (0 rows)
TRUNCATE functional_dependencies;
-- a => b, a => c, b => c
***************
*** 41,48 ****
FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
deps_enabled | deps_built | pg_mv_stats_dependencies_show
--------------+------------+-------------------------------
! t | t | 1 => 2, 1 => 3, 2 => 3
! (1 row)
TRUNCATE functional_dependencies;
-- a => b, a => c
--- 42,48 ----
FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
deps_enabled | deps_built | pg_mv_stats_dependencies_show
--------------+------------+-------------------------------
! (0 rows)
TRUNCATE functional_dependencies;
-- a => b, a => c
***************
*** 53,60 ****
FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
deps_enabled | deps_built | pg_mv_stats_dependencies_show
--------------+------------+-------------------------------
! t | t | 1 => 2, 1 => 3
! (1 row)
TRUNCATE functional_dependencies;
-- check explain (expect bitmap index scan, not plain index scan)
--- 53,59 ----
FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
deps_enabled | deps_built | pg_mv_stats_dependencies_show
--------------+------------+-------------------------------
! (0 rows)
TRUNCATE functional_dependencies;
-- check explain (expect bitmap index scan, not plain index scan)
***************
*** 66,83 ****
FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
deps_enabled | deps_built | pg_mv_stats_dependencies_show
--------------+------------+-------------------------------
! t | t | 1 => 2, 1 => 3, 2 => 3
! (1 row)
EXPLAIN (COSTS off)
SELECT * FROM functional_dependencies WHERE a = 10 AND b = 5;
! QUERY PLAN
! ---------------------------------------------
! Bitmap Heap Scan on functional_dependencies
! Recheck Cond: ((a = 10) AND (b = 5))
! -> Bitmap Index Scan on fdeps_idx
! Index Cond: ((a = 10) AND (b = 5))
! (4 rows)
DROP TABLE functional_dependencies;
-- varlena type (text)
--- 65,79 ----
FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
deps_enabled | deps_built | pg_mv_stats_dependencies_show
--------------+------------+-------------------------------
! (0 rows)
EXPLAIN (COSTS off)
SELECT * FROM functional_dependencies WHERE a = 10 AND b = 5;
! QUERY PLAN
! -------------------------------------------------------
! Index Scan using fdeps_idx on functional_dependencies
! Index Cond: ((a = 10) AND (b = 5))
! (2 rows)
DROP TABLE functional_dependencies;
-- varlena type (text)
***************
*** 103,172 ****
INSERT INTO functional_dependencies
SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
ANALYZE functional_dependencies;
! SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
! FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
! deps_enabled | deps_built | pg_mv_stats_dependencies_show
! --------------+------------+-------------------------------
! t | t | 1 => 2, 1 => 3, 2 => 3
! (1 row)
!
! TRUNCATE functional_dependencies;
! -- a => b, a => c
! INSERT INTO functional_dependencies
! SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
! ANALYZE functional_dependencies;
! SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
! FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
! deps_enabled | deps_built | pg_mv_stats_dependencies_show
! --------------+------------+-------------------------------
! t | t | 1 => 2, 1 => 3
! (1 row)
!
! TRUNCATE functional_dependencies;
! -- check explain (expect bitmap index scan, not plain index scan)
! INSERT INTO functional_dependencies
! SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
! CREATE INDEX fdeps_idx ON functional_dependencies (a, b);
! ANALYZE functional_dependencies;
! SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
! FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
! deps_enabled | deps_built | pg_mv_stats_dependencies_show
! --------------+------------+-------------------------------
! t | t | 1 => 2, 1 => 3, 2 => 3
! (1 row)
!
! EXPLAIN (COSTS off)
! SELECT * FROM functional_dependencies WHERE a = '10' AND b = '5';
! QUERY PLAN
! ------------------------------------------------------------
! Bitmap Heap Scan on functional_dependencies
! Recheck Cond: ((a = '10'::text) AND (b = '5'::text))
! -> Bitmap Index Scan on fdeps_idx
! Index Cond: ((a = '10'::text) AND (b = '5'::text))
! (4 rows)
!
! DROP TABLE functional_dependencies;
! -- NULL values (mix of int and text columns)
! CREATE TABLE functional_dependencies (
! a INT,
! b TEXT,
! c INT,
! d TEXT
! );
! CREATE STATISTICS s3 ON functional_dependencies (a, b, c, d) WITH (dependencies);
! INSERT INTO functional_dependencies
! SELECT
! mod(i, 100),
! (CASE WHEN mod(i, 200) = 0 THEN NULL ELSE mod(i,200) END),
! mod(i, 400),
! (CASE WHEN mod(i, 300) = 0 THEN NULL ELSE mod(i,600) END)
! FROM generate_series(1,10000) s(i);
! ANALYZE functional_dependencies;
! SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
! FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
! deps_enabled | deps_built | pg_mv_stats_dependencies_show
! --------------+------------+----------------------------------------
! t | t | 2 => 1, 3 => 1, 3 => 2, 4 => 1, 4 => 2
! (1 row)
!
! DROP TABLE functional_dependencies;
--- 99,108 ----
INSERT INTO functional_dependencies
SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
ANALYZE functional_dependencies;
! WARNING: terminating connection because of crash of another server process
! DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
! HINT: In a moment you should be able to reconnect to the database and repeat your command.
! server closed the connection unexpectedly
! This probably means the server terminated abnormally
! before or while processing the request.
! connection to server was lost
======================================================================
*** /home/jjanes/pgsql/git/src/test/regress/expected/mv_mcv.out 2016-03-08 18:08:45.299328161 -0800
--- /home/jjanes/pgsql/git/src/test/regress/results/mv_mcv.out 2016-03-08 18:17:34.643710446 -0800
***************
*** 80,207 ****
EXPLAIN (COSTS off)
SELECT * FROM mcv_list WHERE a = 10 AND b = 5;
! QUERY PLAN
! --------------------------------------------
! Bitmap Heap Scan on mcv_list
! Recheck Cond: ((a = 10) AND (b = 5))
! -> Bitmap Index Scan on mcv_idx
! Index Cond: ((a = 10) AND (b = 5))
! (4 rows)
!
! DROP TABLE mcv_list;
! -- varlena type (text)
! CREATE TABLE mcv_list (
! a TEXT,
! b TEXT,
! c TEXT
! );
! CREATE STATISTICS s2 ON mcv_list (a, b, c) WITH (mcv);
! -- random data
! INSERT INTO mcv_list
! SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
! ANALYZE mcv_list;
! SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
! FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
! mcv_enabled | mcv_built | pg_mv_stats_mcvlist_info
! -------------+-----------+--------------------------
! t | f |
! (1 row)
!
! TRUNCATE mcv_list;
! -- a => b, a => c, b => c
! INSERT INTO mcv_list
! SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
! ANALYZE mcv_list;
! SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
! FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
! mcv_enabled | mcv_built | pg_mv_stats_mcvlist_info
! -------------+-----------+--------------------------
! t | t | nitems=1000
! (1 row)
!
! TRUNCATE mcv_list;
! -- a => b, a => c
! INSERT INTO mcv_list
! SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
! ANALYZE mcv_list;
! SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
! FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
! mcv_enabled | mcv_built | pg_mv_stats_mcvlist_info
! -------------+-----------+--------------------------
! t | t | nitems=1000
! (1 row)
!
! TRUNCATE mcv_list;
! -- check explain (expect bitmap index scan, not plain index scan)
! INSERT INTO mcv_list
! SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
! CREATE INDEX mcv_idx ON mcv_list (a, b);
! ANALYZE mcv_list;
! SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
! FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
! mcv_enabled | mcv_built | pg_mv_stats_mcvlist_info
! -------------+-----------+--------------------------
! t | t | nitems=100
! (1 row)
!
! EXPLAIN (COSTS off)
! SELECT * FROM mcv_list WHERE a = '10' AND b = '5';
! QUERY PLAN
! ------------------------------------------------------------
! Bitmap Heap Scan on mcv_list
! Recheck Cond: ((a = '10'::text) AND (b = '5'::text))
! -> Bitmap Index Scan on mcv_idx
! Index Cond: ((a = '10'::text) AND (b = '5'::text))
! (4 rows)
!
! TRUNCATE mcv_list;
! -- check explain (expect bitmap index scan, not plain index scan) with NULLs
! INSERT INTO mcv_list
! SELECT
! (CASE WHEN i/10000 = 0 THEN NULL ELSE i/10000 END),
! (CASE WHEN i/20000 = 0 THEN NULL ELSE i/20000 END),
! (CASE WHEN i/40000 = 0 THEN NULL ELSE i/40000 END)
! FROM generate_series(1,1000000) s(i);
! ANALYZE mcv_list;
! SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
! FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
! mcv_enabled | mcv_built | pg_mv_stats_mcvlist_info
! -------------+-----------+--------------------------
! t | t | nitems=100
! (1 row)
!
! EXPLAIN (COSTS off)
! SELECT * FROM mcv_list WHERE a IS NULL AND b IS NULL;
! QUERY PLAN
! ---------------------------------------------------
! Bitmap Heap Scan on mcv_list
! Recheck Cond: ((a IS NULL) AND (b IS NULL))
! -> Bitmap Index Scan on mcv_idx
! Index Cond: ((a IS NULL) AND (b IS NULL))
! (4 rows)
!
! DROP TABLE mcv_list;
! -- NULL values (mix of int and text columns)
! CREATE TABLE mcv_list (
! a INT,
! b TEXT,
! c INT,
! d TEXT
! );
! CREATE STATISTICS s3 ON mcv_list (a, b, c, d) WITH (mcv);
! INSERT INTO mcv_list
! SELECT
! mod(i, 100),
! (CASE WHEN mod(i, 200) = 0 THEN NULL ELSE mod(i,200) END),
! mod(i, 400),
! (CASE WHEN mod(i, 300) = 0 THEN NULL ELSE mod(i,600) END)
! FROM generate_series(1,10000) s(i);
! ANALYZE mcv_list;
! SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
! FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
! mcv_enabled | mcv_built | pg_mv_stats_mcvlist_info
! -------------+-----------+--------------------------
! t | t | nitems=1200
! (1 row)
!
! DROP TABLE mcv_list;
--- 80,86 ----
EXPLAIN (COSTS off)
SELECT * FROM mcv_list WHERE a = 10 AND b = 5;
! server closed the connection unexpectedly
! This probably means the server terminated abnormally
! before or while processing the request.
! connection to server was lost
======================================================================
*** /home/jjanes/pgsql/git/src/test/regress/expected/mv_histogram.out 2016-03-08 18:08:45.373327236 -0800
--- /home/jjanes/pgsql/git/src/test/regress/results/mv_histogram.out 2016-03-08 18:17:34.920706983 -0800
***************
*** 30,35 ****
--- 30,37 ----
ERROR: maximum number of buckets is 16384
-- correct command
CREATE STATISTICS s1 ON mv_histogram (a, b, c) WITH (histogram);
+ ERROR: duplicate key value violates unique constraint "pg_mv_statistic_name_index"
+ DETAIL: Key (staname, stanamespace)=(s1, 2200) already exists.
-- random data (no functional dependencies)
INSERT INTO mv_histogram
SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
***************
*** 38,45 ****
FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
hist_enabled | hist_built
--------------+------------
! t | t
! (1 row)
TRUNCATE mv_histogram;
-- a => b, a => c, b => c
--- 40,46 ----
FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
hist_enabled | hist_built
--------------+------------
! (0 rows)
TRUNCATE mv_histogram;
-- a => b, a => c, b => c
***************
*** 50,57 ****
FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
hist_enabled | hist_built
--------------+------------
! t | t
! (1 row)
TRUNCATE mv_histogram;
-- a => b, a => c
--- 51,57 ----
FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
hist_enabled | hist_built
--------------+------------
! (0 rows)
TRUNCATE mv_histogram;
-- a => b, a => c
***************
*** 62,69 ****
FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
hist_enabled | hist_built
--------------+------------
! t | t
! (1 row)
TRUNCATE mv_histogram;
-- check explain (expect bitmap index scan, not plain index scan)
--- 62,68 ----
FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
hist_enabled | hist_built
--------------+------------
! (0 rows)
TRUNCATE mv_histogram;
-- check explain (expect bitmap index scan, not plain index scan)
***************
*** 75,92 ****
FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
hist_enabled | hist_built
--------------+------------
! t | t
! (1 row)
EXPLAIN (COSTS off)
SELECT * FROM mv_histogram WHERE a = 10 AND b = 5;
! QUERY PLAN
! --------------------------------------------
! Bitmap Heap Scan on mv_histogram
! Recheck Cond: ((a = 10) AND (b = 5))
! -> Bitmap Index Scan on hist_idx
! Index Cond: ((a = 10) AND (b = 5))
! (4 rows)
DROP TABLE mv_histogram;
-- varlena type (text)
--- 74,88 ----
FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
hist_enabled | hist_built
--------------+------------
! (0 rows)
EXPLAIN (COSTS off)
SELECT * FROM mv_histogram WHERE a = 10 AND b = 5;
! QUERY PLAN
! -------------------------------------------
! Index Scan using hist_idx on mv_histogram
! Index Cond: ((a = 10) AND (b = 5))
! (2 rows)
DROP TABLE mv_histogram;
-- varlena type (text)
***************
*** 96,101 ****
--- 92,99 ----
c TEXT
);
CREATE STATISTICS s2 ON mv_histogram (a, b, c) WITH (histogram);
+ ERROR: duplicate key value violates unique constraint "pg_mv_statistic_name_index"
+ DETAIL: Key (staname, stanamespace)=(s2, 2200) already exists.
-- random data (no functional dependencies)
INSERT INTO mv_histogram
SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
***************
*** 104,111 ****
FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
hist_enabled | hist_built
--------------+------------
! t | t
! (1 row)
TRUNCATE mv_histogram;
-- a => b, a => c, b => c
--- 102,108 ----
FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
hist_enabled | hist_built
--------------+------------
! (0 rows)
TRUNCATE mv_histogram;
-- a => b, a => c, b => c
***************
*** 116,123 ****
FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
hist_enabled | hist_built
--------------+------------
! t | t
! (1 row)
TRUNCATE mv_histogram;
-- a => b, a => c
--- 113,119 ----
FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
hist_enabled | hist_built
--------------+------------
! (0 rows)
TRUNCATE mv_histogram;
-- a => b, a => c
***************
*** 128,207 ****
FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
hist_enabled | hist_built
--------------+------------
! t | t
! (1 row)
TRUNCATE mv_histogram;
-- check explain (expect bitmap index scan, not plain index scan)
INSERT INTO mv_histogram
SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
! CREATE INDEX hist_idx ON mv_histogram (a, b);
! ANALYZE mv_histogram;
! SELECT hist_enabled, hist_built
! FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
! hist_enabled | hist_built
! --------------+------------
! t | t
! (1 row)
!
! EXPLAIN (COSTS off)
! SELECT * FROM mv_histogram WHERE a = '10' AND b = '5';
! QUERY PLAN
! ------------------------------------------------------------
! Bitmap Heap Scan on mv_histogram
! Recheck Cond: ((a = '10'::text) AND (b = '5'::text))
! -> Bitmap Index Scan on hist_idx
! Index Cond: ((a = '10'::text) AND (b = '5'::text))
! (4 rows)
!
! TRUNCATE mv_histogram;
! -- check explain (expect bitmap index scan, not plain index scan) with NULLs
! INSERT INTO mv_histogram
! SELECT
! (CASE WHEN i/10000 = 0 THEN NULL ELSE i/10000 END),
! (CASE WHEN i/20000 = 0 THEN NULL ELSE i/20000 END),
! (CASE WHEN i/40000 = 0 THEN NULL ELSE i/40000 END)
! FROM generate_series(1,1000000) s(i);
! ANALYZE mv_histogram;
! SELECT hist_enabled, hist_built
! FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
! hist_enabled | hist_built
! --------------+------------
! t | t
! (1 row)
!
! EXPLAIN (COSTS off)
! SELECT * FROM mv_histogram WHERE a IS NULL AND b IS NULL;
! QUERY PLAN
! ---------------------------------------------------
! Bitmap Heap Scan on mv_histogram
! Recheck Cond: ((a IS NULL) AND (b IS NULL))
! -> Bitmap Index Scan on hist_idx
! Index Cond: ((a IS NULL) AND (b IS NULL))
! (4 rows)
!
! DROP TABLE mv_histogram;
! -- NULL values (mix of int and text columns)
! CREATE TABLE mv_histogram (
! a INT,
! b TEXT,
! c INT,
! d TEXT
! );
! CREATE STATISTICS s3 ON mv_histogram (a, b, c, d) WITH (histogram);
! INSERT INTO mv_histogram
! SELECT
! mod(i, 100),
! (CASE WHEN mod(i, 200) = 0 THEN NULL ELSE mod(i,200) END),
! mod(i, 400),
! (CASE WHEN mod(i, 300) = 0 THEN NULL ELSE mod(i,600) END)
! FROM generate_series(1,10000) s(i);
! ANALYZE mv_histogram;
! SELECT hist_enabled, hist_built
! FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
! hist_enabled | hist_built
! --------------+------------
! t | t
! (1 row)
!
! DROP TABLE mv_histogram;
--- 124,139 ----
FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
hist_enabled | hist_built
--------------+------------
! (0 rows)
TRUNCATE mv_histogram;
-- check explain (expect bitmap index scan, not plain index scan)
INSERT INTO mv_histogram
SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
! WARNING: terminating connection because of crash of another server process
! DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
! HINT: In a moment you should be able to reconnect to the database and repeat your command.
! server closed the connection unexpectedly
! This probably means the server terminated abnormally
! before or while processing the request.
! connection to server was lost
======================================================================
Hi,
thanks for looking at the patch. Sorry for the issues; attached is
version v13, which should fix them (or most of them).
On Tue, 2016-03-08 at 18:24 -0800, Jeff Janes wrote:
On Tue, Mar 8, 2016 at 12:13 PM, Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:
Hi,
attached is v11 of the patch - this is mostly a cleanup of v10, removing
redundant code, adding missing comments, removing obsolete FIXME/TODOs
and so on. Overall this shaves ~20kB from the patch (not a primary
objective, though).
This has some conflicts with the pathification commit, in the
regression tests.
Yeah, there was one join plan difference, due to the ndistinct
estimation patch. Meh. Fixed.
To avoid that, I applied it to the commit before that, 3fc6e2d7f5b652b417fa6^
Rebased to 51c0f63e.
Having done that, in my hands it fails its own regression tests.
Diff attached.
Fixed. This was caused by making names of the statistics unique across
tables, thus the regression tests started to fail when executed through
'make check' (but 'make installcheck' was still fine).
The diff however also includes a segfault, apparently in processing of
functional dependencies somewhere in ANALYZE. Sadly I've been unable to
reproduce any such failure, despite running the tests many times (even
when applied on the same commit). Is there any chance this might be due
to a broken build, or something like that? If not, can you try
reproducing it and investigating a bit (enable core dumps etc.)?
It breaks contrib postgres_fdw; I'll look into that when I get a
chance, if no one beats me to it.
postgres_fdw.c: In function 'postgresGetForeignJoinPaths':
postgres_fdw.c:3623: error: too few arguments to function
'clauselist_selectivity'
postgres_fdw.c:3642: error: too few arguments to function
'clauselist_selectivity'
Yeah, apparently there are two new calls to clauselist_selectivity, so I
had to add NIL as the list of conditions.
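For the archives, the shape of that fix is roughly the following (a
hypothetical sketch, not the actual postgres_fdw hunk - the variable
names are illustrative, and I'm assuming the new List argument is
simply appended last):

    - sel = clauselist_selectivity(root, clauses, 0, jointype, sjinfo);
    + sel = clauselist_selectivity(root, clauses, 0, jointype, sjinfo, NIL);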
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachments:
0001-teach-pull_-varno-varattno-_walker-about-RestrictInf.patch (text/x-patch)
From 5c28e5ca8feb2c2010d98bc69de952355bd6f3a5 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@pgaddict.com>
Date: Tue, 28 Apr 2015 19:56:33 +0200
Subject: [PATCH 1/9] teach pull_(varno|varattno)_walker about RestrictInfo
otherwise pull_varnos fails when processing OR clauses
---
src/backend/optimizer/util/var.c | 16 ++++++++++++++++
1 file changed, 16 insertions(+)
diff --git a/src/backend/optimizer/util/var.c b/src/backend/optimizer/util/var.c
index dff52c4..80d01bd 100644
--- a/src/backend/optimizer/util/var.c
+++ b/src/backend/optimizer/util/var.c
@@ -197,6 +197,13 @@ pull_varnos_walker(Node *node, pull_varnos_context *context)
context->sublevels_up--;
return result;
}
+ if (IsA(node, RestrictInfo))
+ {
+ RestrictInfo *rinfo = (RestrictInfo*)node;
+ context->varnos = bms_add_members(context->varnos,
+ rinfo->clause_relids);
+ return false;
+ }
return expression_tree_walker(node, pull_varnos_walker,
(void *) context);
}
@@ -245,6 +252,15 @@ pull_varattnos_walker(Node *node, pull_varattnos_context *context)
return false;
}
+ if (IsA(node, RestrictInfo))
+ {
+ RestrictInfo *rinfo = (RestrictInfo *)node;
+
+ return expression_tree_walker((Node*)rinfo->clause,
+ pull_varattnos_walker,
+ (void*) context);
+ }
+
/* Should not find an unplanned subquery */
Assert(!IsA(node, Query));
--
2.1.0
0002-shared-infrastructure-and-functional-dependencies.patch (text/x-patch)
From 414281cc51fe5a548b334531a1bfa8562375c681 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tv@fuzzy.cz>
Date: Sun, 11 Jan 2015 19:51:48 +0100
Subject: [PATCH 2/9] shared infrastructure and functional dependencies
Basic infrastructure shared by all kinds of multivariate stats, most
importantly:
- adds a new system catalog (pg_mv_statistic)
- CREATE STATISTICS name ON table (columns) WITH (options)
- DROP STATISTICS name
- implementation of functional dependencies (the simplest type of
multivariate statistics)
- building functional dependencies in ANALYZE
- updates regression tests (new catalog etc.)
This does not include any changes to the optimizer, i.e. it does not
influence the query planning (subject to follow-up patches).
The current implementation requires a valid 'ltopr' for the columns, so
that we can sort the sample rows in various ways, both in this patch
and other kinds of statistics. Maybe this restriction could be relaxed
in the future, requiring just 'eqopr' in case of stats not sorting the
data (e.g. functional dependencies and MCV lists).
Maybe some of the stats (functional dependencies and MCV list with
limited functionality) might be made to work with hashes of the values,
which is sufficient for equality comparisons. But the queries would
require the equality operator anyway, so it's not really a weaker
requirement. The hashes might reduce space requirements, though.
The algorithm detecting the dependencies is rather simple and probably
needs improvements, so that it detects more complicated dependencies;
the math also needs validation.
The name 'functional dependencies' is more correct (than 'association
rules') as it's exactly the name used in relational theory (esp. Normal
Forms) for tracking column-level dependencies.
The multivariate statistics are automatically removed in two situations
(a) after a DROP TABLE (obviously)
(b) after ALTER TABLE ... DROP COLUMN, if the statistics would be
defined on fewer than 2 remaining columns
If there are at least two remaining columns, we keep the
statistics but perform cleanup on the next ANALYZE. The dropped columns
are removed from stakeys, and the new statistics is built on the
smaller set.
We can't do this at DROP COLUMN, because that'd leave us with invalid
statistics, or we'd have to throw it away although we can still use it.
This lazy approach lets us use the statistics although some of the
columns are dead.
This also adds a simple list of statistics to \d in psql.
Statistics are schema objects, created within a schema by using a
qualified name (or using the default schema)
CREATE STATISTICS schema.statistics ON ...
and then dropped by specifying qualified name
DROP STATISTICS schema.statistics
or searching through search_path (just like with other objects).
This also gets rid of the "(opt_)stats_name" definitions in gram.y,
replacing them with just "opt_any_name", although the optional
case is not really handled currently - there's no generated name yet
(so either we should drop it or implement it).
I'm not entirely sure making statistics schema-specific is such a great
idea. Maybe it should be "global", but that does not seem right (e.g.
it makes multi-tenant systems based on schemas more difficult to
manage, because tenants would interact).
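For illustration, a typical life cycle of such statistics, using the
syntax introduced by this patch (table, schema and statistics names
are made up):

    CREATE STATISTICS myschema.stats1 ON mytable (a, b) WITH (dependencies);
    ANALYZE mytable;                  -- builds the functional dependencies
    DROP STATISTICS myschema.stats1;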
---
doc/src/sgml/ref/allfiles.sgml | 2 +
doc/src/sgml/ref/create_statistics.sgml | 174 ++++++++++
doc/src/sgml/ref/drop_statistics.sgml | 90 ++++++
doc/src/sgml/reference.sgml | 2 +
src/backend/catalog/Makefile | 1 +
src/backend/catalog/dependency.c | 11 +-
src/backend/catalog/heap.c | 102 ++++++
src/backend/catalog/namespace.c | 51 +++
src/backend/catalog/objectaddress.c | 22 ++
src/backend/catalog/system_views.sql | 11 +
src/backend/commands/Makefile | 6 +-
src/backend/commands/analyze.c | 21 ++
src/backend/commands/dropcmds.c | 4 +
src/backend/commands/event_trigger.c | 3 +
src/backend/commands/statscmds.c | 331 +++++++++++++++++++
src/backend/commands/tablecmds.c | 8 +-
src/backend/nodes/copyfuncs.c | 16 +
src/backend/nodes/outfuncs.c | 18 ++
src/backend/optimizer/util/plancat.c | 63 ++++
src/backend/parser/gram.y | 34 +-
src/backend/tcop/utility.c | 11 +
src/backend/utils/Makefile | 2 +-
src/backend/utils/cache/relcache.c | 59 ++++
src/backend/utils/cache/syscache.c | 23 ++
src/backend/utils/mvstats/Makefile | 17 +
src/backend/utils/mvstats/README.dependencies | 222 +++++++++++++
src/backend/utils/mvstats/common.c | 356 +++++++++++++++++++++
src/backend/utils/mvstats/common.h | 75 +++++
src/backend/utils/mvstats/dependencies.c | 437 ++++++++++++++++++++++++++
src/bin/psql/describe.c | 44 +++
src/include/catalog/dependency.h | 5 +-
src/include/catalog/heap.h | 1 +
src/include/catalog/indexing.h | 7 +
src/include/catalog/namespace.h | 2 +
src/include/catalog/pg_mv_statistic.h | 73 +++++
src/include/catalog/pg_proc.h | 5 +
src/include/catalog/toasting.h | 1 +
src/include/commands/defrem.h | 4 +
src/include/nodes/nodes.h | 2 +
src/include/nodes/parsenodes.h | 12 +
src/include/nodes/relation.h | 28 ++
src/include/utils/mvstats.h | 70 +++++
src/include/utils/rel.h | 4 +
src/include/utils/relcache.h | 1 +
src/include/utils/syscache.h | 2 +
src/test/regress/expected/rules.out | 9 +
src/test/regress/expected/sanity_check.out | 1 +
47 files changed, 2432 insertions(+), 11 deletions(-)
create mode 100644 doc/src/sgml/ref/create_statistics.sgml
create mode 100644 doc/src/sgml/ref/drop_statistics.sgml
create mode 100644 src/backend/commands/statscmds.c
create mode 100644 src/backend/utils/mvstats/Makefile
create mode 100644 src/backend/utils/mvstats/README.dependencies
create mode 100644 src/backend/utils/mvstats/common.c
create mode 100644 src/backend/utils/mvstats/common.h
create mode 100644 src/backend/utils/mvstats/dependencies.c
create mode 100644 src/include/catalog/pg_mv_statistic.h
create mode 100644 src/include/utils/mvstats.h
diff --git a/doc/src/sgml/ref/allfiles.sgml b/doc/src/sgml/ref/allfiles.sgml
index bf95453..c0f7653 100644
--- a/doc/src/sgml/ref/allfiles.sgml
+++ b/doc/src/sgml/ref/allfiles.sgml
@@ -76,6 +76,7 @@ Complete list of usable sgml source files in this directory.
<!ENTITY createSchema SYSTEM "create_schema.sgml">
<!ENTITY createSequence SYSTEM "create_sequence.sgml">
<!ENTITY createServer SYSTEM "create_server.sgml">
+<!ENTITY createStatistics SYSTEM "create_statistics.sgml">
<!ENTITY createTable SYSTEM "create_table.sgml">
<!ENTITY createTableAs SYSTEM "create_table_as.sgml">
<!ENTITY createTableSpace SYSTEM "create_tablespace.sgml">
@@ -119,6 +120,7 @@ Complete list of usable sgml source files in this directory.
<!ENTITY dropSchema SYSTEM "drop_schema.sgml">
<!ENTITY dropSequence SYSTEM "drop_sequence.sgml">
<!ENTITY dropServer SYSTEM "drop_server.sgml">
+<!ENTITY dropStatistics SYSTEM "drop_statistics.sgml">
<!ENTITY dropTable SYSTEM "drop_table.sgml">
<!ENTITY dropTableSpace SYSTEM "drop_tablespace.sgml">
<!ENTITY dropTransform SYSTEM "drop_transform.sgml">
diff --git a/doc/src/sgml/ref/create_statistics.sgml b/doc/src/sgml/ref/create_statistics.sgml
new file mode 100644
index 0000000..a86eae3
--- /dev/null
+++ b/doc/src/sgml/ref/create_statistics.sgml
@@ -0,0 +1,174 @@
+<!--
+doc/src/sgml/ref/create_statistics.sgml
+PostgreSQL documentation
+-->
+
+<refentry id="SQL-CREATESTATISTICS">
+ <indexterm zone="sql-createstatistics">
+ <primary>CREATE STATISTICS</primary>
+ </indexterm>
+
+ <refmeta>
+ <refentrytitle>CREATE STATISTICS</refentrytitle>
+ <manvolnum>7</manvolnum>
+ <refmiscinfo>SQL - Language Statements</refmiscinfo>
+ </refmeta>
+
+ <refnamediv>
+ <refname>CREATE STATISTICS</refname>
+ <refpurpose>define a new statistics</refpurpose>
+ </refnamediv>
+
+ <refsynopsisdiv>
+<synopsis>
+CREATE STATISTICS [ IF NOT EXISTS ] <replaceable class="PARAMETER">statistics_name</replaceable> ON <replaceable class="PARAMETER">table_name</replaceable> ( [
+ { <replaceable class="PARAMETER">column_name</replaceable> } ] [, ...])
+[ WITH ( <replaceable class="PARAMETER">statistics_parameter</replaceable> [= <replaceable class="PARAMETER">value</replaceable>] [, ... ] )
+</synopsis>
+
+ </refsynopsisdiv>
+
+ <refsect1 id="SQL-CREATESTATISTICS-description">
+ <title>Description</title>
+
+ <para>
+ <command>CREATE STATISTICS</command> will create a new multivariate
+ statistics on the table. The statistics will be created in the
+ current database. The statistics will be owned by the user issuing
+ the command.
+ </para>
+
+ <para>
+ If a schema name is given (for example, <literal>CREATE STATISTICS
+ myschema.mystat ...</>) then the statistics is created in the specified
+ schema. Otherwise it is created in the current schema. The name of
+ the statistics must be distinct from the name of any other statistics
+ in the same schema.
+ </para>
+
+ <para>
+ To be able to create statistics, you must have <literal>USAGE</literal>
+ privilege on the types of all the columns included in the statistics.
+ </para>
+ </refsect1>
+
+ <refsect1>
+ <title>Parameters</title>
+
+ <variablelist>
+
+ <varlistentry>
+ <term><literal>IF NOT EXISTS</></term>
+ <listitem>
+ <para>
+ Do not throw an error if a statistics with the same name already exists.
+ A notice is issued in this case. Note that there is no guarantee that
+ the existing statistics is anything like the one that would have been
+ created.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><replaceable class="PARAMETER">statistics_name</replaceable></term>
+ <listitem>
+ <para>
+ The name (optionally schema-qualified) of the statistics to be created.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><replaceable class="PARAMETER">table_name</replaceable></term>
+ <listitem>
+ <para>
+ The name (optionally schema-qualified) of the table the statistics should
+ be created on.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><replaceable class="PARAMETER">column_name</replaceable></term>
+ <listitem>
+ <para>
+ The name of a column to be included in the statistics.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><literal>WITH ( <replaceable class="PARAMETER">statistics_parameter</replaceable> [= <replaceable class="PARAMETER">value</replaceable>] [, ... ] )</literal></term>
+ <listitem>
+ <para>
+ ...
+ </para>
+ </listitem>
+ </varlistentry>
+
+ </variablelist>
+
+ <refsect2 id="SQL-CREATESTATISTICS-parameters">
+ <title id="SQL-CREATESTATISTICS-parameters-title">Statistics Parameters</title>
+
+ <indexterm zone="sql-createstatistics-parameters">
+ <primary>statistics parameters</primary>
+ </indexterm>
+
+ <para>
+ The <literal>WITH</> clause can specify <firstterm>statistics parameters</>
+ for statistics. The currently available parameters are listed below.
+ </para>
+
+ <variablelist>
+
+ <varlistentry>
+ <term><literal>dependencies</> (<type>boolean</>)</term>
+ <listitem>
+ <para>
+ Enables functional dependencies for the statistics.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ </variablelist>
+
+ </refsect2>
+ </refsect1>
+
+ <refsect1 id="SQL-CREATESTATISTICS-notes">
+ <title>Notes</title>
+
+ <para>
+ ...
+ </para>
+
+ </refsect1>
+
+
+ <refsect1 id="SQL-CREATESTATISTICS-examples">
+ <title>Examples</title>
+
+ <para>
+ ...
+ </para>
+
+ </refsect1>
+
+ <refsect1>
+ <title>Compatibility</title>
+
+ <para>
+ There's no <command>CREATE STATISTICS</command> command in the SQL standard.
+ </para>
+ </refsect1>
+
+ <refsect1>
+ <title>See Also</title>
+
+ <simplelist type="inline">
+ <member><xref linkend="sql-dropstatistics"></member>
+ </simplelist>
+ </refsect1>
+</refentry>
diff --git a/doc/src/sgml/ref/drop_statistics.sgml b/doc/src/sgml/ref/drop_statistics.sgml
new file mode 100644
index 0000000..4cc0b70
--- /dev/null
+++ b/doc/src/sgml/ref/drop_statistics.sgml
@@ -0,0 +1,90 @@
+<!--
+doc/src/sgml/ref/drop_statistics.sgml
+PostgreSQL documentation
+-->
+
+<refentry id="SQL-DROPSTATISTICS">
+ <indexterm zone="sql-dropstatistics">
+ <primary>DROP STATISTICS</primary>
+ </indexterm>
+
+ <refmeta>
+ <refentrytitle>DROP STATISTICS</refentrytitle>
+ <manvolnum>7</manvolnum>
+ <refmiscinfo>SQL - Language Statements</refmiscinfo>
+ </refmeta>
+
+ <refnamediv>
+ <refname>DROP STATISTICS</refname>
+ <refpurpose>remove a statistics</refpurpose>
+ </refnamediv>
+
+ <refsynopsisdiv>
+<synopsis>
+DROP STATISTICS [ IF EXISTS ] <replaceable class="PARAMETER">name</replaceable> [, ...]
+</synopsis>
+ </refsynopsisdiv>
+
+ <refsect1>
+ <title>Description</title>
+
+ <para>
+ <command>DROP STATISTICS</command> removes statistics from the database.
+ Only the statistics owner, the schema owner, or a superuser can drop
+ a statistics.
+ </para>
+
+ </refsect1>
+
+ <refsect1>
+ <title>Parameters</title>
+
+ <variablelist>
+ <varlistentry>
+ <term><literal>IF EXISTS</literal></term>
+ <listitem>
+ <para>
+ Do not throw an error if the statistics does not exist. A notice is
+ issued in this case.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><replaceable class="PARAMETER">name</replaceable></term>
+ <listitem>
+ <para>
+ The name (optionally schema-qualified) of the statistics to drop.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ </variablelist>
+ </refsect1>
+
+ <refsect1>
+ <title>Examples</title>
+
+ <para>
+ ...
+ </para>
+
+ </refsect1>
+
+ <refsect1>
+ <title>Compatibility</title>
+
+ <para>
+ There's no <command>DROP STATISTICS</command> command in the SQL standard.
+ </para>
+ </refsect1>
+
+ <refsect1>
+ <title>See Also</title>
+
+ <simplelist type="inline">
+ <member><xref linkend="sql-createstatistics"></member>
+ </simplelist>
+ </refsect1>
+
+</refentry>
diff --git a/doc/src/sgml/reference.sgml b/doc/src/sgml/reference.sgml
index 03020df..2b07b2d 100644
--- a/doc/src/sgml/reference.sgml
+++ b/doc/src/sgml/reference.sgml
@@ -104,6 +104,7 @@
&createSchema;
&createSequence;
&createServer;
+ &createStatistics;
&createTable;
&createTableAs;
&createTableSpace;
@@ -147,6 +148,7 @@
&dropSchema;
&dropSequence;
&dropServer;
+ &dropStatistics;
&dropTable;
&dropTableSpace;
&dropTSConfig;
diff --git a/src/backend/catalog/Makefile b/src/backend/catalog/Makefile
index 25130ec..058b8a9 100644
--- a/src/backend/catalog/Makefile
+++ b/src/backend/catalog/Makefile
@@ -32,6 +32,7 @@ POSTGRES_BKI_SRCS = $(addprefix $(top_srcdir)/src/include/catalog/,\
pg_attrdef.h pg_constraint.h pg_inherits.h pg_index.h pg_operator.h \
pg_opfamily.h pg_opclass.h pg_am.h pg_amop.h pg_amproc.h \
pg_language.h pg_largeobject_metadata.h pg_largeobject.h pg_aggregate.h \
+ pg_mv_statistic.h \
pg_statistic.h pg_rewrite.h pg_trigger.h pg_event_trigger.h pg_description.h \
pg_cast.h pg_enum.h pg_namespace.h pg_conversion.h pg_depend.h \
pg_database.h pg_db_role_setting.h pg_tablespace.h pg_pltemplate.h \
diff --git a/src/backend/catalog/dependency.c b/src/backend/catalog/dependency.c
index c48e37b..8200454 100644
--- a/src/backend/catalog/dependency.c
+++ b/src/backend/catalog/dependency.c
@@ -40,6 +40,7 @@
#include "catalog/pg_foreign_server.h"
#include "catalog/pg_language.h"
#include "catalog/pg_largeobject.h"
+#include "catalog/pg_mv_statistic.h"
#include "catalog/pg_namespace.h"
#include "catalog/pg_opclass.h"
#include "catalog/pg_operator.h"
@@ -160,7 +161,8 @@ static const Oid object_classes[] = {
ExtensionRelationId, /* OCLASS_EXTENSION */
EventTriggerRelationId, /* OCLASS_EVENT_TRIGGER */
PolicyRelationId, /* OCLASS_POLICY */
- TransformRelationId /* OCLASS_TRANSFORM */
+ TransformRelationId, /* OCLASS_TRANSFORM */
+ MvStatisticRelationId /* OCLASS_STATISTICS */
};
@@ -1272,6 +1274,10 @@ doDeletion(const ObjectAddress *object, int flags)
DropTransformById(object->objectId);
break;
+ case OCLASS_STATISTICS:
+ RemoveStatisticsById(object->objectId);
+ break;
+
default:
elog(ERROR, "unrecognized object class: %u",
object->classId);
@@ -2415,6 +2421,9 @@ getObjectClass(const ObjectAddress *object)
case TransformRelationId:
return OCLASS_TRANSFORM;
+
+ case MvStatisticRelationId:
+ return OCLASS_STATISTICS;
}
/* shouldn't get here */
diff --git a/src/backend/catalog/heap.c b/src/backend/catalog/heap.c
index 6a4a9d9..e7d9aaa 100644
--- a/src/backend/catalog/heap.c
+++ b/src/backend/catalog/heap.c
@@ -47,6 +47,7 @@
#include "catalog/pg_constraint_fn.h"
#include "catalog/pg_foreign_table.h"
#include "catalog/pg_inherits.h"
+#include "catalog/pg_mv_statistic.h"
#include "catalog/pg_namespace.h"
#include "catalog/pg_statistic.h"
#include "catalog/pg_tablespace.h"
@@ -1613,7 +1614,10 @@ RemoveAttributeById(Oid relid, AttrNumber attnum)
heap_close(attr_rel, RowExclusiveLock);
if (attnum > 0)
+ {
RemoveStatistics(relid, attnum);
+ RemoveMVStatistics(relid, attnum);
+ }
relation_close(rel, NoLock);
}
@@ -1841,6 +1845,11 @@ heap_drop_with_catalog(Oid relid)
RemoveStatistics(relid, 0);
/*
+ * delete multi-variate statistics
+ */
+ RemoveMVStatistics(relid, 0);
+
+ /*
* delete attribute tuples
*/
DeleteAttributeTuples(relid);
@@ -2696,6 +2705,99 @@ RemoveStatistics(Oid relid, AttrNumber attnum)
/*
+ * RemoveMVStatistics --- remove entries in pg_mv_statistic for a rel
+ *
+ * If attnum is zero, remove all entries for rel; else remove only the one(s)
+ * for that column.
+ */
+void
+RemoveMVStatistics(Oid relid, AttrNumber attnum)
+{
+ Relation pgmvstatistic;
+ TupleDesc tupdesc = NULL;
+ SysScanDesc scan;
+ ScanKeyData key;
+ HeapTuple tuple;
+
+ /*
+ * When dropping a column, we'll drop statistics with a single
+ * remaining (undropped) column. To do that, we need the tuple
+ * descriptor.
+ *
+ * We already have the relation locked (as we're running ALTER
+ * TABLE ... DROP COLUMN), so we'll just get the descriptor here.
+ */
+ if (attnum != 0)
+ {
+ Relation rel = relation_open(relid, NoLock);
+
+ /* multivariate stats are supported on tables and matviews */
+ if (rel->rd_rel->relkind == RELKIND_RELATION ||
+ rel->rd_rel->relkind == RELKIND_MATVIEW)
+ tupdesc = RelationGetDescr(rel);
+
+ relation_close(rel, NoLock);
+ }
+
+ if (tupdesc == NULL)
+ return;
+
+ pgmvstatistic = heap_open(MvStatisticRelationId, RowExclusiveLock);
+
+ ScanKeyInit(&key,
+ Anum_pg_mv_statistic_starelid,
+ BTEqualStrategyNumber, F_OIDEQ,
+ ObjectIdGetDatum(relid));
+
+ scan = systable_beginscan(pgmvstatistic,
+ MvStatisticRelidIndexId,
+ true, NULL, 1, &key);
+
+ /* we must loop even when attnum != 0, in case of inherited stats */
+ while (HeapTupleIsValid(tuple = systable_getnext(scan)))
+ {
+ bool delete = true;
+
+ if (attnum != 0)
+ {
+ Datum adatum;
+ bool isnull;
+ int i;
+ int ncolumns = 0;
+ ArrayType *arr;
+ int16 *attnums;
+
+ /* get the columns */
+ adatum = SysCacheGetAttr(MVSTATOID, tuple,
+ Anum_pg_mv_statistic_stakeys, &isnull);
+ Assert(!isnull);
+
+ arr = DatumGetArrayTypeP(adatum);
+ attnums = (int16*)ARR_DATA_PTR(arr);
+
+ for (i = 0; i < ARR_DIMS(arr)[0]; i++)
+ {
+ /* count the column unless it has been / is being dropped */
+ if ((! tupdesc->attrs[attnums[i]-1]->attisdropped) &&
+ (attnums[i] != attnum))
+ ncolumns += 1;
+ }
+
+ /* delete if there are less than two attributes */
+ delete = (ncolumns < 2);
+ }
+
+ if (delete)
+ simple_heap_delete(pgmvstatistic, &tuple->t_self);
+ }
+
+ systable_endscan(scan);
+
+ heap_close(pgmvstatistic, RowExclusiveLock);
+}
+
+
+/*
* RelationTruncateIndexes - truncate all indexes associated
* with the heap relation to zero tuples.
*
diff --git a/src/backend/catalog/namespace.c b/src/backend/catalog/namespace.c
index 446b2ac..dfd5bef 100644
--- a/src/backend/catalog/namespace.c
+++ b/src/backend/catalog/namespace.c
@@ -4201,3 +4201,54 @@ pg_is_other_temp_schema(PG_FUNCTION_ARGS)
PG_RETURN_BOOL(isOtherTempNamespace(oid));
}
+
+Oid
+get_statistics_oid(List *names, bool missing_ok)
+{
+ char *schemaname;
+ char *stats_name;
+ Oid namespaceId;
+ Oid stats_oid = InvalidOid;
+ ListCell *l;
+
+ /* deconstruct the name list */
+ DeconstructQualifiedName(names, &schemaname, &stats_name);
+
+ if (schemaname)
+ {
+ /* use exact schema given */
+ namespaceId = LookupExplicitNamespace(schemaname, missing_ok);
+ if (missing_ok && !OidIsValid(namespaceId))
+ stats_oid = InvalidOid;
+ else
+ stats_oid = GetSysCacheOid2(MVSTATNAMENSP,
+ PointerGetDatum(stats_name),
+ ObjectIdGetDatum(namespaceId));
+ }
+ else
+ {
+ /* search for it in search path */
+ recomputeNamespacePath();
+
+ foreach(l, activeSearchPath)
+ {
+ namespaceId = lfirst_oid(l);
+
+ if (namespaceId == myTempNamespace)
+ continue; /* do not look in temp namespace */
+ stats_oid = GetSysCacheOid2(MVSTATNAMENSP,
+ PointerGetDatum(stats_name),
+ ObjectIdGetDatum(namespaceId));
+ if (OidIsValid(stats_oid))
+ break;
+ }
+ }
+
+ if (!OidIsValid(stats_oid) && !missing_ok)
+ ereport(ERROR,
+ (errcode(ERRCODE_UNDEFINED_OBJECT),
+ errmsg("statistics \"%s\" does not exist",
+ NameListToString(names))));
+
+ return stats_oid;
+}
diff --git a/src/backend/catalog/objectaddress.c b/src/backend/catalog/objectaddress.c
index d2aaa6d..3a6a0b0 100644
--- a/src/backend/catalog/objectaddress.c
+++ b/src/backend/catalog/objectaddress.c
@@ -39,6 +39,7 @@
#include "catalog/pg_language.h"
#include "catalog/pg_largeobject.h"
#include "catalog/pg_largeobject_metadata.h"
+#include "catalog/pg_mv_statistic.h"
#include "catalog/pg_namespace.h"
#include "catalog/pg_opclass.h"
#include "catalog/pg_opfamily.h"
@@ -438,9 +439,22 @@ static const ObjectPropertyType ObjectProperty[] =
Anum_pg_type_typacl,
ACL_KIND_TYPE,
true
+ },
+ {
+ MvStatisticRelationId,
+ MvStatisticOidIndexId,
+ MVSTATOID,
+ MVSTATNAMENSP,
+ Anum_pg_mv_statistic_staname,
+ Anum_pg_mv_statistic_stanamespace,
+ InvalidAttrNumber, /* XXX same owner as relation */
+ InvalidAttrNumber, /* no ACL (same as relation) */
+ -1, /* no ACL */
+ true
}
};
+
/*
* This struct maps the string object types as returned by
* getObjectTypeDescription into ObjType enum values. Note that some enum
@@ -913,6 +927,11 @@ get_object_address(ObjectType objtype, List *objname, List *objargs,
address = get_object_address_defacl(objname, objargs,
missing_ok);
break;
+ case OBJECT_STATISTICS:
+ address.classId = MvStatisticRelationId;
+ address.objectId = get_statistics_oid(objname, missing_ok);
+ address.objectSubId = 0;
+ break;
default:
elog(ERROR, "unrecognized objtype: %d", (int) objtype);
/* placate compiler, in case it thinks elog might return */
@@ -2185,6 +2204,9 @@ check_object_ownership(Oid roleid, ObjectType objtype, ObjectAddress address,
(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
errmsg("must be superuser")));
break;
+ case OBJECT_STATISTICS:
+ /* FIXME do the right owner checks here */
+ break;
default:
elog(ERROR, "unrecognized object type: %d",
(int) objtype);
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index abf9a70..b8a264e 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -158,6 +158,17 @@ CREATE VIEW pg_indexes AS
LEFT JOIN pg_tablespace T ON (T.oid = I.reltablespace)
WHERE C.relkind IN ('r', 'm') AND I.relkind = 'i';
+CREATE VIEW pg_mv_stats AS
+ SELECT
+ N.nspname AS schemaname,
+ C.relname AS tablename,
+ S.staname AS staname,
+ S.stakeys AS attnums,
+ length(S.stadeps) as depsbytes,
+ pg_mv_stats_dependencies_info(S.stadeps) as depsinfo
+ FROM (pg_mv_statistic S JOIN pg_class C ON (C.oid = S.starelid))
+ LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace);
+
CREATE VIEW pg_stats WITH (security_barrier) AS
SELECT
nspname AS schemaname,
diff --git a/src/backend/commands/Makefile b/src/backend/commands/Makefile
index b1ac704..5151001 100644
--- a/src/backend/commands/Makefile
+++ b/src/backend/commands/Makefile
@@ -18,8 +18,8 @@ OBJS = aggregatecmds.o alter.o analyze.o async.o cluster.o comment.o \
event_trigger.o explain.o extension.o foreigncmds.o functioncmds.o \
indexcmds.o lockcmds.o matview.o operatorcmds.o opclasscmds.o \
policy.o portalcmds.o prepare.o proclang.o \
- schemacmds.o seclabel.o sequence.o tablecmds.o tablespace.o trigger.o \
- tsearchcmds.o typecmds.o user.o vacuum.o vacuumlazy.o \
- variable.o view.o
+ schemacmds.o seclabel.o sequence.o statscmds.o \
+ tablecmds.o tablespace.o trigger.o tsearchcmds.o typecmds.o \
+ user.o vacuum.o vacuumlazy.o variable.o view.o
include $(top_srcdir)/src/backend/common.mk
diff --git a/src/backend/commands/analyze.c b/src/backend/commands/analyze.c
index 8a5f07c..8ac9915 100644
--- a/src/backend/commands/analyze.c
+++ b/src/backend/commands/analyze.c
@@ -27,6 +27,7 @@
#include "catalog/indexing.h"
#include "catalog/pg_collation.h"
#include "catalog/pg_inherits_fn.h"
+#include "catalog/pg_mv_statistic.h"
#include "catalog/pg_namespace.h"
#include "commands/dbcommands.h"
#include "commands/tablecmds.h"
@@ -55,7 +56,11 @@
#include "utils/syscache.h"
#include "utils/timestamp.h"
#include "utils/tqual.h"
+#include "utils/fmgroids.h"
+#include "utils/builtins.h"
+#include "utils/mvstats.h"
+#include "access/sysattr.h"
/* Per-index data for ANALYZE */
typedef struct AnlIndexData
@@ -460,6 +465,19 @@ do_analyze_rel(Relation onerel, int options, VacuumParams *params,
* all analyzable columns. We use a lower bound of 100 rows to avoid
* possible overflow in Vitter's algorithm. (Note: that will also be the
* target in the corner case where there are no analyzable columns.)
+ *
+ * FIXME This sample sizing is mostly OK when computing stats for
+ * individual columns, but for multivariate stats (histograms,
+ * MCV lists, ...) it's rather insufficient. For stats on
+ * multiple columns / complex stats we need larger sample
+ * sizes, because we need to build more detailed stats (more
+ * MCV items / histogram buckets) to get good accuracy. Maybe
+ * samples proportional to the table size (say, 0.5% - 1%)
+ * would be more appropriate than a fixed size. Also, this
+ * should be bound to the requested statistics size - e.g. the
+ * number of MCV items or histogram buckets should require
+ * several sample rows per item/bucket (so the sample should
+ * be k*size).
*/
targrows = 100;
for (i = 0; i < attr_cnt; i++)
@@ -562,6 +580,9 @@ do_analyze_rel(Relation onerel, int options, VacuumParams *params,
update_attstats(RelationGetRelid(Irel[ind]), false,
thisdata->attr_cnt, thisdata->vacattrstats);
}
+
+ /* Build multivariate stats (if there are any). */
+ build_mv_stats(onerel, numrows, rows, attr_cnt, vacattrstats);
}
/*
diff --git a/src/backend/commands/dropcmds.c b/src/backend/commands/dropcmds.c
index 522027a..cd65b58 100644
--- a/src/backend/commands/dropcmds.c
+++ b/src/backend/commands/dropcmds.c
@@ -292,6 +292,10 @@ does_not_exist_skipping(ObjectType objtype, List *objname, List *objargs)
msg = gettext_noop("schema \"%s\" does not exist, skipping");
name = NameListToString(objname);
break;
+ case OBJECT_STATISTICS:
+ msg = gettext_noop("statistics \"%s\" does not exist, skipping");
+ name = NameListToString(objname);
+ break;
case OBJECT_TSPARSER:
if (!schema_does_not_exist_skipping(objname, &msg, &name))
{
diff --git a/src/backend/commands/event_trigger.c b/src/backend/commands/event_trigger.c
index 9e32f8d..09061bb 100644
--- a/src/backend/commands/event_trigger.c
+++ b/src/backend/commands/event_trigger.c
@@ -110,6 +110,7 @@ static event_trigger_support_data event_trigger_support[] = {
{"SCHEMA", true},
{"SEQUENCE", true},
{"SERVER", true},
+ {"STATISTICS", true},
{"TABLE", true},
{"TABLESPACE", false},
{"TRANSFORM", true},
@@ -1106,6 +1107,7 @@ EventTriggerSupportsObjectType(ObjectType obtype)
case OBJECT_RULE:
case OBJECT_SCHEMA:
case OBJECT_SEQUENCE:
+ case OBJECT_STATISTICS:
case OBJECT_TABCONSTRAINT:
case OBJECT_TABLE:
case OBJECT_TRANSFORM:
@@ -1167,6 +1169,7 @@ EventTriggerSupportsObjectClass(ObjectClass objclass)
case OCLASS_DEFACL:
case OCLASS_EXTENSION:
case OCLASS_POLICY:
+ case OCLASS_STATISTICS:
return true;
}
diff --git a/src/backend/commands/statscmds.c b/src/backend/commands/statscmds.c
new file mode 100644
index 0000000..84a8b13
--- /dev/null
+++ b/src/backend/commands/statscmds.c
@@ -0,0 +1,331 @@
+/*-------------------------------------------------------------------------
+ *
+ * statscmds.c
+ * Commands for creating and altering multivariate statistics
+ *
+ * Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ * src/backend/commands/statscmds.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/genam.h"
+#include "access/heapam.h"
+#include "access/multixact.h"
+#include "access/reloptions.h"
+#include "access/relscan.h"
+#include "access/sysattr.h"
+#include "access/xact.h"
+#include "access/xlog.h"
+#include "catalog/catalog.h"
+#include "catalog/dependency.h"
+#include "catalog/heap.h"
+#include "catalog/index.h"
+#include "catalog/indexing.h"
+#include "catalog/namespace.h"
+#include "catalog/objectaccess.h"
+#include "catalog/pg_collation.h"
+#include "catalog/pg_constraint.h"
+#include "catalog/pg_depend.h"
+#include "catalog/pg_foreign_table.h"
+#include "catalog/pg_inherits.h"
+#include "catalog/pg_inherits_fn.h"
+#include "catalog/pg_mv_statistic.h"
+#include "catalog/pg_namespace.h"
+#include "catalog/pg_opclass.h"
+#include "catalog/pg_tablespace.h"
+#include "catalog/pg_trigger.h"
+#include "catalog/pg_type.h"
+#include "catalog/pg_type_fn.h"
+#include "catalog/storage.h"
+#include "catalog/toasting.h"
+#include "commands/cluster.h"
+#include "commands/comment.h"
+#include "commands/defrem.h"
+#include "commands/event_trigger.h"
+#include "commands/policy.h"
+#include "commands/sequence.h"
+#include "commands/tablecmds.h"
+#include "commands/tablespace.h"
+#include "commands/trigger.h"
+#include "commands/typecmds.h"
+#include "commands/user.h"
+#include "executor/executor.h"
+#include "foreign/foreign.h"
+#include "miscadmin.h"
+#include "nodes/makefuncs.h"
+#include "nodes/nodeFuncs.h"
+#include "nodes/parsenodes.h"
+#include "optimizer/clauses.h"
+#include "optimizer/planner.h"
+#include "parser/parse_clause.h"
+#include "parser/parse_coerce.h"
+#include "parser/parse_collate.h"
+#include "parser/parse_expr.h"
+#include "parser/parse_oper.h"
+#include "parser/parse_relation.h"
+#include "parser/parse_type.h"
+#include "parser/parse_utilcmd.h"
+#include "parser/parser.h"
+#include "pgstat.h"
+#include "rewrite/rewriteDefine.h"
+#include "rewrite/rewriteHandler.h"
+#include "rewrite/rewriteManip.h"
+#include "storage/bufmgr.h"
+#include "storage/lmgr.h"
+#include "storage/lock.h"
+#include "storage/predicate.h"
+#include "storage/smgr.h"
+#include "utils/acl.h"
+#include "utils/builtins.h"
+#include "utils/fmgroids.h"
+#include "utils/inval.h"
+#include "utils/lsyscache.h"
+#include "utils/memutils.h"
+#include "utils/relcache.h"
+#include "utils/ruleutils.h"
+#include "utils/snapmgr.h"
+#include "utils/syscache.h"
+#include "utils/tqual.h"
+#include "utils/typcache.h"
+#include "utils/mvstats.h"
+
+
+/* used for sorting the attnums in ExecCreateStatistics */
+static int compare_int16(const void *a, const void *b)
+{
+ return memcmp(a, b, sizeof(int16));
+}
+
+/*
+ * Implements CREATE STATISTICS name ON table (columns) WITH (options).
+ *
+ * TODO Check that the types support sort, although maybe we can live
+ * without it (and only build MCV list / association rules).
+ *
+ * TODO This should probably check for duplicate stats (i.e. same
+ * keys, same options). Although maybe it's useful to have
+ * multiple stats on the same columns with different options
+ * (say, detailed MCV-only stats for some queries, a histogram
+ * for others, etc.)
+ */
+ObjectAddress
+CreateStatistics(CreateStatsStmt *stmt)
+{
+ int i, j;
+ ListCell *l;
+ int16 attnums[INDEX_MAX_KEYS];
+ int numcols = 0;
+ ObjectAddress address = InvalidObjectAddress;
+ char *namestr;
+ NameData staname;
+ Oid statoid;
+ Oid namespaceId;
+
+ HeapTuple htup;
+ Datum values[Natts_pg_mv_statistic];
+ bool nulls[Natts_pg_mv_statistic];
+ int2vector *stakeys;
+ Relation mvstatrel;
+ Relation rel;
+ ObjectAddress parentobject, childobject;
+
+ /* by default build nothing */
+ bool build_dependencies = false;
+
+ Assert(IsA(stmt, CreateStatsStmt));
+
+ /* resolve the pieces of the name (namespace etc.) */
+ namespaceId = QualifiedNameGetCreationNamespace(stmt->defnames, &namestr);
+ namestrcpy(&staname, namestr);
+
+ /*
+ * If if_not_exists was given and the statistics already exists, bail out.
+ */
+ if (stmt->if_not_exists &&
+ SearchSysCacheExists2(MVSTATNAMENSP,
+ PointerGetDatum(&staname),
+ ObjectIdGetDatum(namespaceId)))
+ {
+ ereport(NOTICE,
+ (errcode(ERRCODE_DUPLICATE_OBJECT),
+ errmsg("statistics \"%s\" already exists, skipping",
+ namestr)));
+ return InvalidObjectAddress;
+ }
+
+ rel = heap_openrv(stmt->relation, AccessExclusiveLock);
+
+ /* transform the column names to attnum values */
+
+ foreach(l, stmt->keys)
+ {
+ char *attname = strVal(lfirst(l));
+ HeapTuple atttuple;
+
+ atttuple = SearchSysCacheAttName(RelationGetRelid(rel), attname);
+
+ if (!HeapTupleIsValid(atttuple))
+ ereport(ERROR,
+ (errcode(ERRCODE_UNDEFINED_COLUMN),
+ errmsg("column \"%s\" referenced in statistics does not exist",
+ attname)));
+
+ /* more than MVSTATS_MAX_DIMENSIONS columns not allowed */
+ if (numcols >= MVSTATS_MAX_DIMENSIONS)
+ ereport(ERROR,
+ (errcode(ERRCODE_TOO_MANY_COLUMNS),
+ errmsg("cannot have more than %d keys in a statistics",
+ MVSTATS_MAX_DIMENSIONS)));
+
+ attnums[numcols] = ((Form_pg_attribute) GETSTRUCT(atttuple))->attnum;
+ ReleaseSysCache(atttuple);
+ numcols++;
+ }
+
+ /*
+ * Check the lower bound (at least 2 columns), the upper bound was
+ * already checked in the loop.
+ */
+ if (numcols < 2)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
+ errmsg("multivariate stats require 2 or more columns")));
+
+ /* look for duplicate columns */
+ for (i = 0; i < numcols; i++)
+ for (j = 0; j < numcols; j++)
+ if ((i != j) && (attnums[i] == attnums[j]))
+ ereport(ERROR,
+ (errcode(ERRCODE_DUPLICATE_COLUMN),
+ errmsg("duplicate column name in statistics definition")));
+
+ /* parse the statistics options */
+ foreach (l, stmt->options)
+ {
+ DefElem *opt = (DefElem*)lfirst(l);
+
+ if (strcmp(opt->defname, "dependencies") == 0)
+ build_dependencies = defGetBoolean(opt);
+ else
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("unrecognized STATISTICS option \"%s\"",
+ opt->defname)));
+ }
+
+ /* check that at least some statistics were requested */
+ if (! build_dependencies)
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("no statistics type (dependencies) was requested")));
+
+ /* sort the attnums and build int2vector */
+ qsort(attnums, numcols, sizeof(int16), compare_int16);
+ stakeys = buildint2vector(attnums, numcols);
+
+ /*
+ * Okay, let's create the pg_mv_statistic entry.
+ */
+ memset(values, 0, sizeof(values));
+ memset(nulls, false, sizeof(nulls));
+
+ /* no stats collected yet, so just the keys */
+ values[Anum_pg_mv_statistic_starelid-1] = ObjectIdGetDatum(RelationGetRelid(rel));
+ values[Anum_pg_mv_statistic_staname -1] = NameGetDatum(&staname);
+ values[Anum_pg_mv_statistic_stanamespace -1] = ObjectIdGetDatum(namespaceId);
+
+ values[Anum_pg_mv_statistic_stakeys -1] = PointerGetDatum(stakeys);
+
+ values[Anum_pg_mv_statistic_deps_enabled -1] = BoolGetDatum(build_dependencies);
+
+ nulls[Anum_pg_mv_statistic_stadeps -1] = true;
+
+ /* insert the tuple into pg_mv_statistic */
+ mvstatrel = heap_open(MvStatisticRelationId, RowExclusiveLock);
+
+ htup = heap_form_tuple(mvstatrel->rd_att, values, nulls);
+
+ simple_heap_insert(mvstatrel, htup);
+
+ CatalogUpdateIndexes(mvstatrel, htup);
+
+ statoid = HeapTupleGetOid(htup);
+
+ heap_freetuple(htup);
+
+
+ /*
+ * Store a dependency too, so that statistics are dropped on DROP TABLE
+ */
+ parentobject.classId = RelationRelationId;
+ parentobject.objectId = RelationGetRelid(rel);
+ parentobject.objectSubId = 0;
+ childobject.classId = MvStatisticRelationId;
+ childobject.objectId = statoid;
+ childobject.objectSubId = 0;
+
+ recordDependencyOn(&childobject, &parentobject, DEPENDENCY_AUTO);
+
+ /*
+ * Also record dependency on the schema (to drop statistics on DROP SCHEMA)
+ */
+ parentobject.classId = NamespaceRelationId;
+ parentobject.objectId = namespaceId;
+ parentobject.objectSubId = 0;
+ childobject.classId = MvStatisticRelationId;
+ childobject.objectId = statoid;
+ childobject.objectSubId = 0;
+
+ recordDependencyOn(&childobject, &parentobject, DEPENDENCY_AUTO);
+
+
+ heap_close(mvstatrel, RowExclusiveLock);
+
+ /*
+ * Invalidate relcache so that others see the new statistics
+ * (do this before closing the relation).
+ */
+ CacheInvalidateRelcache(rel);
+
+ relation_close(rel, NoLock);
+
+ ObjectAddressSet(address, MvStatisticRelationId, statoid);
+
+ return address;
+}
+
+
+/*
+ * Guts of statistics deletion - removes the pg_mv_statistic row
+ * with the given OID. Called by DROP STATISTICS and when dropping
+ * objects the statistics depend on (table, schema).
+ */
+void
+RemoveStatisticsById(Oid statsOid)
+{
+ Relation relation;
+ HeapTuple tup;
+
+ /*
+ * Delete the pg_mv_statistic tuple.
+ */
+ relation = heap_open(MvStatisticRelationId, RowExclusiveLock);
+
+ tup = SearchSysCache1(MVSTATOID, ObjectIdGetDatum(statsOid));
+ if (!HeapTupleIsValid(tup)) /* should not happen */
+ elog(ERROR, "cache lookup failed for statistics %u", statsOid);
+
+ simple_heap_delete(relation, &tup->t_self);
+
+ ReleaseSysCache(tup);
+
+ heap_close(relation, RowExclusiveLock);
+}
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 96dc923..96ab02f 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -37,6 +37,7 @@
#include "catalog/pg_foreign_table.h"
#include "catalog/pg_inherits.h"
#include "catalog/pg_inherits_fn.h"
+#include "catalog/pg_mv_statistic.h"
#include "catalog/pg_namespace.h"
#include "catalog/pg_opclass.h"
#include "catalog/pg_tablespace.h"
@@ -95,7 +96,7 @@
#include "utils/syscache.h"
#include "utils/tqual.h"
#include "utils/typcache.h"
-
+#include "utils/mvstats.h"
/*
* ON COMMIT action list
@@ -143,8 +144,9 @@ static List *on_commits = NIL;
#define AT_PASS_ADD_COL 5 /* ADD COLUMN */
#define AT_PASS_ADD_INDEX 6 /* ADD indexes */
#define AT_PASS_ADD_CONSTR 7 /* ADD constraints, defaults */
-#define AT_PASS_MISC 8 /* other stuff */
-#define AT_NUM_PASSES 9
+#define AT_PASS_ADD_STATS 8 /* ADD statistics */
+#define AT_PASS_MISC 9 /* other stuff */
+#define AT_NUM_PASSES 10
typedef struct AlteredTableInfo
{
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index df7c2fa..fce46cb 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -4124,6 +4124,19 @@ _copyAlterPolicyStmt(const AlterPolicyStmt *from)
return newnode;
}
+static CreateStatsStmt *
+_copyCreateStatsStmt(const CreateStatsStmt *from)
+{
+ CreateStatsStmt *newnode = makeNode(CreateStatsStmt);
+
+ COPY_NODE_FIELD(defnames);
+ COPY_NODE_FIELD(relation);
+ COPY_NODE_FIELD(keys);
+ COPY_NODE_FIELD(options);
+ COPY_SCALAR_FIELD(if_not_exists);
+
+ return newnode;
+}
+
/* ****************************************************************
* pg_list.h copy functions
* ****************************************************************
@@ -4999,6 +5012,9 @@ copyObject(const void *from)
case T_CommonTableExpr:
retval = _copyCommonTableExpr(from);
break;
+ case T_CreateStatsStmt:
+ retval = _copyCreateStatsStmt(from);
+ break;
case T_FuncWithArgs:
retval = _copyFuncWithArgs(from);
break;
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index eb0fc1e..07206d7 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -2153,6 +2153,21 @@ _outIndexOptInfo(StringInfo str, const IndexOptInfo *node)
}
static void
+_outMVStatisticInfo(StringInfo str, const MVStatisticInfo *node)
+{
+ WRITE_NODE_TYPE("MVSTATISTICINFO");
+
+ /* NB: this isn't a complete set of fields */
+ WRITE_OID_FIELD(mvoid);
+
+ /* enabled statistics */
+ WRITE_BOOL_FIELD(deps_enabled);
+
+ /* built/available statistics */
+ WRITE_BOOL_FIELD(deps_built);
+}
+
+static void
_outEquivalenceClass(StringInfo str, const EquivalenceClass *node)
{
/*
@@ -3636,6 +3651,9 @@ _outNode(StringInfo str, const void *obj)
case T_PlannerParamItem:
_outPlannerParamItem(str, obj);
break;
+ case T_MVStatisticInfo:
+ _outMVStatisticInfo(str, obj);
+ break;
case T_ExtensibleNode:
_outExtensibleNode(str, obj);
diff --git a/src/backend/optimizer/util/plancat.c b/src/backend/optimizer/util/plancat.c
index ad715bb..31939dd 100644
--- a/src/backend/optimizer/util/plancat.c
+++ b/src/backend/optimizer/util/plancat.c
@@ -28,6 +28,7 @@
#include "catalog/dependency.h"
#include "catalog/heap.h"
#include "catalog/pg_am.h"
+#include "catalog/pg_mv_statistic.h"
#include "foreign/fdwapi.h"
#include "miscadmin.h"
#include "nodes/makefuncs.h"
@@ -40,7 +41,9 @@
#include "parser/parsetree.h"
#include "rewrite/rewriteManip.h"
#include "storage/bufmgr.h"
+#include "utils/builtins.h"
#include "utils/lsyscache.h"
+#include "utils/syscache.h"
#include "utils/rel.h"
#include "utils/snapmgr.h"
@@ -94,6 +97,7 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
Relation relation;
bool hasindex;
List *indexinfos = NIL;
+ List *stainfos = NIL;
/*
* We need not lock the relation since it was already locked, either by
@@ -387,6 +391,65 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
rel->indexlist = indexinfos;
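+
+ /*
+ * XXX The always-true condition merely scopes the local variables;
+ * it may eventually become a real condition (e.g. skip the lookup
+ * when the table has no multivariate statistics at all).
+ */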
+ if (true)
+ {
+ List *mvstatoidlist;
+ ListCell *l;
+
+ mvstatoidlist = RelationGetMVStatList(relation);
+
+ foreach(l, mvstatoidlist)
+ {
+ ArrayType *arr;
+ Datum adatum;
+ bool isnull;
+ Oid mvoid = lfirst_oid(l);
+ Form_pg_mv_statistic mvstat;
+ MVStatisticInfo *info;
+
+ HeapTuple htup = SearchSysCache1(MVSTATOID, ObjectIdGetDatum(mvoid));
+
+ /* XXX syscache contains OIDs of deleted stats (not invalidated) */
+ if (! HeapTupleIsValid(htup))
+ continue;
+
+ mvstat = (Form_pg_mv_statistic) GETSTRUCT(htup);
+
+ /* unavailable stats are not interesting for the planner */
+ if (mvstat->deps_built)
+ {
+ info = makeNode(MVStatisticInfo);
+
+ info->mvoid = mvoid;
+ info->rel = rel;
+
+ /* enabled statistics */
+ info->deps_enabled = mvstat->deps_enabled;
+
+ /* built/available statistics */
+ info->deps_built = mvstat->deps_built;
+
+ /* stakeys */
+ adatum = SysCacheGetAttr(MVSTATOID, htup,
+ Anum_pg_mv_statistic_stakeys, &isnull);
+ Assert(!isnull);
+
+ arr = DatumGetArrayTypeP(adatum);
+
+ info->stakeys = buildint2vector((int16 *) ARR_DATA_PTR(arr),
+ ARR_DIMS(arr)[0]);
+
+ stainfos = lcons(info, stainfos);
+ }
+
+ ReleaseSysCache(htup);
+ }
+
+ list_free(mvstatoidlist);
+ }
+
+ rel->mvstatlist = stainfos;
+
/* Grab foreign-table info using the relcache, while we have it */
if (relation->rd_rel->relkind == RELKIND_FOREIGN_TABLE)
{
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index b307b48..3be3f02 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -241,7 +241,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
ConstraintsSetStmt CopyStmt CreateAsStmt CreateCastStmt
CreateDomainStmt CreateExtensionStmt CreateGroupStmt CreateOpClassStmt
CreateOpFamilyStmt AlterOpFamilyStmt CreatePLangStmt
- CreateSchemaStmt CreateSeqStmt CreateStmt CreateTableSpaceStmt
+ CreateSchemaStmt CreateSeqStmt CreateStmt CreateStatsStmt CreateTableSpaceStmt
CreateFdwStmt CreateForeignServerStmt CreateForeignTableStmt
CreateAssertStmt CreateTransformStmt CreateTrigStmt CreateEventTrigStmt
CreateUserStmt CreateUserMappingStmt CreateRoleStmt CreatePolicyStmt
@@ -809,6 +809,7 @@ stmt :
| CreateSchemaStmt
| CreateSeqStmt
| CreateStmt
+ | CreateStatsStmt
| CreateTableSpaceStmt
| CreateTransformStmt
| CreateTrigStmt
@@ -3436,6 +3437,36 @@ OptConsTableSpace: USING INDEX TABLESPACE name { $$ = $4; }
ExistingIndex: USING INDEX index_name { $$ = $3; }
;
+/*****************************************************************************
+ *
+ * QUERY :
+ * CREATE STATISTICS stats_name ON relname (columns) WITH (options)
+ *
+ *****************************************************************************/
+
+
+CreateStatsStmt: CREATE STATISTICS any_name ON qualified_name '(' columnList ')' opt_reloptions
+ {
+ CreateStatsStmt *n = makeNode(CreateStatsStmt);
+ n->defnames = $3;
+ n->relation = $5;
+ n->keys = $7;
+ n->options = $9;
+ n->if_not_exists = false;
+ $$ = (Node *)n;
+ }
+ | CREATE STATISTICS IF_P NOT EXISTS any_name ON qualified_name '(' columnList ')' opt_reloptions
+ {
+ CreateStatsStmt *n = makeNode(CreateStatsStmt);
+ n->defnames = $6;
+ n->relation = $8;
+ n->keys = $10;
+ n->options = $12;
+ n->if_not_exists = true;
+ $$ = (Node *)n;
+ }
+ ;
+
/*****************************************************************************
*
@@ -5621,6 +5652,7 @@ drop_type: TABLE { $$ = OBJECT_TABLE; }
| TEXT_P SEARCH DICTIONARY { $$ = OBJECT_TSDICTIONARY; }
| TEXT_P SEARCH TEMPLATE { $$ = OBJECT_TSTEMPLATE; }
| TEXT_P SEARCH CONFIGURATION { $$ = OBJECT_TSCONFIGURATION; }
+ | STATISTICS { $$ = OBJECT_STATISTICS; }
;
any_name_list:
diff --git a/src/backend/tcop/utility.c b/src/backend/tcop/utility.c
index 045f7f0..2ba88e2 100644
--- a/src/backend/tcop/utility.c
+++ b/src/backend/tcop/utility.c
@@ -1520,6 +1520,10 @@ ProcessUtilitySlow(Node *parsetree,
address = ExecSecLabelStmt((SecLabelStmt *) parsetree);
break;
+ case T_CreateStatsStmt: /* CREATE STATISTICS */
+ address = CreateStatistics((CreateStatsStmt *) parsetree);
+ break;
+
default:
elog(ERROR, "unrecognized node type: %d",
(int) nodeTag(parsetree));
@@ -2160,6 +2164,9 @@ CreateCommandTag(Node *parsetree)
case OBJECT_TRANSFORM:
tag = "DROP TRANSFORM";
break;
+ case OBJECT_STATISTICS:
+ tag = "DROP STATISTICS";
+ break;
default:
tag = "???";
}
@@ -2527,6 +2534,10 @@ CreateCommandTag(Node *parsetree)
tag = "EXECUTE";
break;
+ case T_CreateStatsStmt:
+ tag = "CREATE STATISTICS";
+ break;
+
case T_DeallocateStmt:
{
DeallocateStmt *stmt = (DeallocateStmt *) parsetree;
diff --git a/src/backend/utils/Makefile b/src/backend/utils/Makefile
index 8374533..eba0352 100644
--- a/src/backend/utils/Makefile
+++ b/src/backend/utils/Makefile
@@ -9,7 +9,7 @@ top_builddir = ../../..
include $(top_builddir)/src/Makefile.global
OBJS = fmgrtab.o
-SUBDIRS = adt cache error fmgr hash init mb misc mmgr resowner sort time
+SUBDIRS = adt cache error fmgr hash init mb misc mmgr mvstats resowner sort time
# location of Catalog.pm
catalogdir = $(top_srcdir)/src/backend/catalog
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index 130c06d..3bc4c8a 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -47,6 +47,7 @@
#include "catalog/pg_auth_members.h"
#include "catalog/pg_constraint.h"
#include "catalog/pg_database.h"
+#include "catalog/pg_mv_statistic.h"
#include "catalog/pg_namespace.h"
#include "catalog/pg_opclass.h"
#include "catalog/pg_proc.h"
@@ -3956,6 +3957,62 @@ RelationGetIndexList(Relation relation)
return result;
}
+
+List *
+RelationGetMVStatList(Relation relation)
+{
+ Relation indrel;
+ SysScanDesc indscan;
+ ScanKeyData skey;
+ HeapTuple htup;
+ List *result;
+ List *oldlist;
+ MemoryContext oldcxt;
+
+ /* Quick exit if we already computed the list. */
+ if (relation->rd_mvstatvalid != 0)
+ return list_copy(relation->rd_mvstatlist);
+
+ /*
+ * We build the list we intend to return (in the caller's context) while
+ * doing the scan. After successfully completing the scan, we copy that
+ * list into the relcache entry. This avoids cache-context memory leakage
+ * if we get some sort of error partway through.
+ */
+ result = NIL;
+
+ /* Prepare to scan pg_mv_statistic for entries having starelid = this rel. */
+ ScanKeyInit(&skey,
+ Anum_pg_mv_statistic_starelid,
+ BTEqualStrategyNumber, F_OIDEQ,
+ ObjectIdGetDatum(RelationGetRelid(relation)));
+
+ indrel = heap_open(MvStatisticRelationId, AccessShareLock);
+ indscan = systable_beginscan(indrel, MvStatisticRelidIndexId, true,
+ NULL, 1, &skey);
+
+ while (HeapTupleIsValid(htup = systable_getnext(indscan)))
+ /* TODO maybe include only already built statistics? */
+ result = insert_ordered_oid(result, HeapTupleGetOid(htup));
+
+ systable_endscan(indscan);
+
+ heap_close(indrel, AccessShareLock);
+
+ /* Now save a copy of the completed list in the relcache entry. */
+ oldcxt = MemoryContextSwitchTo(CacheMemoryContext);
+ oldlist = relation->rd_mvstatlist;
+ relation->rd_mvstatlist = list_copy(result);
+
+ relation->rd_mvstatvalid = true;
+ MemoryContextSwitchTo(oldcxt);
+
+ /* Don't leak the old list, if there is one */
+ list_free(oldlist);
+
+ return result;
+}
+
/*
* insert_ordered_oid
* Insert a new Oid into a sorted list of Oids, preserving ordering
@@ -4920,6 +4977,8 @@ load_relcache_init_file(bool shared)
rel->rd_indexattr = NULL;
rel->rd_keyattr = NULL;
rel->rd_idattr = NULL;
+ rel->rd_mvstatvalid = false;
+ rel->rd_mvstatlist = NIL;
rel->rd_createSubid = InvalidSubTransactionId;
rel->rd_newRelfilenodeSubid = InvalidSubTransactionId;
rel->rd_amcache = NULL;
diff --git a/src/backend/utils/cache/syscache.c b/src/backend/utils/cache/syscache.c
index 65ffe84..3c1bc4b 100644
--- a/src/backend/utils/cache/syscache.c
+++ b/src/backend/utils/cache/syscache.c
@@ -44,6 +44,7 @@
#include "catalog/pg_foreign_server.h"
#include "catalog/pg_foreign_table.h"
#include "catalog/pg_language.h"
+#include "catalog/pg_mv_statistic.h"
#include "catalog/pg_namespace.h"
#include "catalog/pg_opclass.h"
#include "catalog/pg_operator.h"
@@ -502,6 +503,28 @@ static const struct cachedesc cacheinfo[] = {
},
4
},
+ {MvStatisticRelationId, /* MVSTATNAMENSP */
+ MvStatisticNameIndexId,
+ 2,
+ {
+ Anum_pg_mv_statistic_staname,
+ Anum_pg_mv_statistic_stanamespace,
+ 0,
+ 0
+ },
+ 4
+ },
+ {MvStatisticRelationId, /* MVSTATOID */
+ MvStatisticOidIndexId,
+ 1,
+ {
+ ObjectIdAttributeNumber,
+ 0,
+ 0,
+ 0
+ },
+ 4
+ },
{NamespaceRelationId, /* NAMESPACENAME */
NamespaceNameIndexId,
1,
diff --git a/src/backend/utils/mvstats/Makefile b/src/backend/utils/mvstats/Makefile
new file mode 100644
index 0000000..099f1ed
--- /dev/null
+++ b/src/backend/utils/mvstats/Makefile
@@ -0,0 +1,17 @@
+#-------------------------------------------------------------------------
+#
+# Makefile--
+# Makefile for utils/mvstats
+#
+# IDENTIFICATION
+# src/backend/utils/mvstats/Makefile
+#
+#-------------------------------------------------------------------------
+
+subdir = src/backend/utils/mvstats
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+
+OBJS = common.o dependencies.o
+
+include $(top_srcdir)/src/backend/common.mk
diff --git a/src/backend/utils/mvstats/README.dependencies b/src/backend/utils/mvstats/README.dependencies
new file mode 100644
index 0000000..1f96fbc
--- /dev/null
+++ b/src/backend/utils/mvstats/README.dependencies
@@ -0,0 +1,222 @@
+Soft functional dependencies
+============================
+
+A type of multivariate statistics used to capture cases when one column (or
+possibly a combination of columns) determines values in another column. We may
+also say that one column implies the other one.
+
+A simple artificial example may be a table with two columns, created like this
+
+ CREATE TABLE t (a INT, b INT)
+ AS SELECT i, i/10 FROM generate_series(1,100000) s(i);
+
+Clearly, once we know the value for column 'a' the value for 'b' is trivially
+determined, as it's simply (a/10). A more practical example may be addresses,
+where (ZIP code -> city name), i.e. once we know the ZIP, we probably know the
+city it belongs to, as ZIP codes are usually assigned to one city. Larger cities
+may have multiple ZIP codes, so the dependency can't be reversed.
+
+Functional dependencies are a concept well described in relational theory,
+particularly in definition of normalization and "normal forms". Wikipedia has a
+nice definition of a functional dependency [1]:
+
+ In a given table, an attribute Y is said to have a functional dependency on
+ a set of attributes X (written X -> Y) if and only if each X value is
+ associated with precisely one Y value. For example, in an "Employee" table
+ that includes the attributes "Employee ID" and "Employee Date of Birth", the
+ functional dependency {Employee ID} -> {Employee Date of Birth} would hold.
+ It follows from the previous two sentences that each {Employee ID} is
+ associated with precisely one {Employee Date of Birth}.
+
+ [1] http://en.wikipedia.org/wiki/Database_normalization
+
+Many datasets might be normalized not to contain such dependencies, but often
+it's not practical for various reasons. In some cases it's actually a conscious
+design choice to model the dataset in denormalized way, either because of
+performance or to make querying easier.
+
+The functional dependencies are called 'soft' because the implementation is
+meant to allow small number of rows contradicting the dependency. Many actual
+data sets contain some sort of errors, either because of data entry mistakes
+(user mistyping the ZIP code) or issues in generating the data (e.g. a ZIP code
+mistakenly assigned to two cities in different states). A strict implementation
+would ignore dependencies on such noisy data, rendering the approach unusable on
+such data sets.
+
+
+Mining dependencies (ANALYZE)
+-----------------------------
+
+The current build algorithm is rather simple - for each pair (a,b) of columns,
+the data are sorted lexicographically (first by 'a', then by 'b'). Then for each
+group (rows with the same 'a' value) we decide whether the group is neutral,
+supporting or contradicting the dependency (a->b).
+
+A group is considered neutral when it's too small - e.g. when there's a single
+row in the group, there can't possibly be multiple values in 'b'. For this
+reason we ignore groups smaller than a threshold (currently 3 rows).
+
+For sufficiently large groups (3 rows or more), we count the number of distinct
+values in 'b'. When there's a single 'b' value, the group is considered to
+support the dependency (a->b), otherwise it's considered to contradict it.
+
+At the end, we compare the number of rows in supporting and contradicting groups,
+and if there are at least 10x as many supporting rows, we consider the
+functional dependency to be valid.
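+
+For illustration, take a tiny artificial sample of (a,b) values, sorted by 'a':
+
+ (1,10) (1,10) (1,10) -- group a=1: 3 rows, single 'b' value => supporting
+ (2,20) (2,21) -- group a=2: two 'b' values => contradicting
+ (3,30) (3,30) -- group a=3: only 2 rows => neutral (too small)
+
+Here 3 rows support the dependency (a->b) and 2 rows contradict it, so the
+10x test fails and the dependency is not accepted.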
+
+
+A negative property of this approach is that the algorithm is a bit
+fragile with respect to the sample - there may be data sets producing quite
+different results for each ANALYZE execution (as even a single row may change
+the outcome of the final 10x test).
+
+It was proposed to make the dependencies "fuzzy" - e.g. track some coefficient
+between [0,1] determining how much the dependency holds. That would however mean
+we have to keep all the dependencies, as eliminating them based on the value of
+the coefficient (e.g. throw away dependencies <= 0.5) would result in exactly
+the same fragility issues. This would also make it more complicated to combine
+dependencies. So this does not seem like a practical approach.
+
+A better approach might be to replace the constants (min_group_size=3 and 10x)
+with values somehow related to the particular data set.
+
+
+Clause reduction (planner/optimizer)
+------------------------------------
+
+Applying the functional dependencies is quite simple - given a list of equality
+clauses, check which clauses are redundant (i.e. implied by some other clause).
+For example, given the clause list
+
+ (a = 1) AND (b = 2) AND (c = 3)
+
+and the dependency (a->b), the list of clauses may be simplified to
+
+ (a = 1) AND (c = 3)
+
+Functional dependencies may only be applied to equality clauses, all other types
+of clauses are ignored. See clauselist_apply_dependencies() for more details.
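+
+As a usage sketch (using the syntax added by this patch; the table and
+statistics names are made up):
+
+ CREATE TABLE t2 (a INT, b INT)
+ AS SELECT i/100, i/1000 FROM generate_series(1,10000) s(i);
+
+ CREATE STATISTICS deps_stats ON t2 (a, b) WITH (dependencies = true);
+ ANALYZE t2;
+
+ -- with the dependency (a->b) detected, the planner may reduce
+ -- (a = 50) AND (b = 5) to just (a = 50)
+ EXPLAIN SELECT * FROM t2 WHERE a = 50 AND b = 5;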
+
+
+Compatibility of clauses
+------------------------
+
+The reduction assumes the clauses really are redundant, and the value in the
+reduced clause (b=2) is the value determined by (a=1). If that's not the case
+and the values are "incompatible", the result will be an over-estimation.
+
+This may happen for example when using conditions on ZIP and city name with
+mismatching values (a ZIP code from a different city), etc. In such a case the result
+set will be empty, but we'll estimate the selectivity using the ZIP condition.
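+
+For example (a hypothetical 'addresses' table, with a mined dependency
+zip -> city):
+
+ -- returns no rows when the ZIP belongs to a different city, yet the
+ -- reduction estimates it as if only the ZIP clause were present
+ SELECT * FROM addresses WHERE zip = '99501' AND city = 'Boston';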
+
+In this case the default estimation, based on the AVIA principle (attribute
+value independence assumption), happens to work
+better, but mostly by chance.
+
+
+Dependencies vs. MCV/histogram
+------------------------------
+
+In some cases the "compatibility" of the conditions might be verified using the
+other types of multivariate stats - MCV lists and histograms.
+
+For MCV lists the verification might be very simple - peek into the list to see
+whether there are any items matching the clause on the 'a' column (e.g. ZIP
+code), and if such an item is found, check that the 'b' column matches the other
+clause. If it does not, the clauses are contradictory. We can't really conclude
+anything when no such item is found, except maybe restricting the selectivity
+using the MCV data (e.g. using min/max selectivity, or something).
+
+With histograms, it might work similarly - we can't check the values directly
+(because histograms use buckets, unlike MCV lists, which store the actual values).
+So we can only observe the buckets matching the clauses - if those buckets have
+very low frequency, it probably means the two clauses are incompatible.
+
+It's unclear what 'low frequency' is, but if one of the clauses is implied
+(automatically true because of the other clause), then
+
+ selectivity[clause(A)] = selectivity[clause(A) & clause(B)]
+
+So we might compute selectivity of the first clause - for example using regular
+statistics. And then check if the selectivity computed from the histogram is
+about the same (or significantly lower).
+
+The problem is that histograms work well only when the data ordering matches the
+natural meaning. For values that serve as labels - like city names or ZIP codes,
+or even generated IDs, histograms really don't work all that well. For example
+sorting cities by name won't match the sorting of ZIP codes, rendering the
+histogram unusable.
+
+So MCVs are probably going to work much better, because they don't really assume
+any sort of ordering. And they're probably more appropriate for label-like data.
+
+A good question however is why even use functional dependencies in such cases
+and not simply use the MCV/histogram instead. One reason is that the functional
+dependencies allow fallback to regular stats, and often produce more accurate
+estimates - especially compared to histograms, which are quite bad at estimating
+equality clauses.
+
+
+Limitations
+-----------
+
+Let's see the main limitations of functional dependencies, especially those
+related to the current implementation.
+
+The current implementation supports only dependencies between two columns, but
+this is merely a simplification of the initial implementation. It's certainly
+useful to mine for dependencies involving multiple columns on the 'left' side,
+i.e. the condition side of the dependency - that is, dependencies like (a,b -> c).
+
+The implementation may/should be smart enough not to mine redundant conditions,
+e.g. (a->b) and (a,c -> b), because the latter is a trivial consequence of the
+former one (if values of 'a' determine 'b', adding another column won't change
+that relationship). The ANALYZE should first analyze 1:1 dependencies, then 2:1
+dependencies (and skip the already identified ones), etc.
+
+For example the dependency
+
+ (city name -> zip code)
+
+is much stronger, i.e. whenever it holds, then
+
+ (city name, state name -> zip code)
+
+holds too. But in case there are cities with the same name in different states,
+then only the latter dependency will be valid.
+
+Of course, there probably are cities with the same name within a single state,
+but hopefully this is a relatively rare occurrence (and thus we'll still detect
+the 'soft' dependency).
+
+Handling multiple columns on the right side of the dependency is not necessary,
+as those dependencies may be simply decomposed into a set of dependencies with
+the same meaning, one for each column on the right side. For example
+
+ (a -> b,c)
+
+is exactly the same as
+
+ (a -> b) & (a -> c)
+
+Of course, storing the first form may be more efficient than storing multiple
+'simple' dependencies separately.
+
+
+TODO Support dependencies with multiple columns on left/right.
+
+TODO Investigate using histogram and MCV list to verify the dependencies.
+
+TODO Investigate statistical testing of the distribution (to decide whether it
+ makes sense to build the histogram/MCV list).
+
+TODO Using a min/max of selectivities would probably make more sense for the
+ associated columns.
+
+TODO Consider eliminating the implied columns from the histogram and MCV lists
+ (but maybe that's not a good idea, because that'd make it impossible to use
+ these stats for non-equality clauses and also it wouldn't be possible to
+ use the stats for verification of the dependencies).
+
+TODO The reduction probably might be extended to also handle IS NULL clauses,
+ assuming we fix the ANALYZE to properly handle NULL values. We however
+ won't be able to reduce IS NOT NULL (unless I'm missing something).
diff --git a/src/backend/utils/mvstats/common.c b/src/backend/utils/mvstats/common.c
new file mode 100644
index 0000000..a755c49
--- /dev/null
+++ b/src/backend/utils/mvstats/common.c
@@ -0,0 +1,356 @@
+/*-------------------------------------------------------------------------
+ *
+ * common.c
+ * POSTGRES multivariate statistics
+ *
+ *
+ * Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ * src/backend/utils/mvstats/common.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "common.h"
+
+static VacAttrStats ** lookup_var_attr_stats(int2vector *attrs,
+ int natts, VacAttrStats **vacattrstats);
+
+static List* list_mv_stats(Oid relid);
+
+
+/*
+ * Compute requested multivariate stats, using the rows sampled for the
+ * plain (single-column) stats.
+ *
+ * This fetches a list of stats from pg_mv_statistic, computes the stats
+ * and serializes them back into the catalog (as bytea values).
+ */
+void
+build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
+ int natts, VacAttrStats **vacattrstats)
+{
+ ListCell *lc;
+ List *mvstats;
+
+ TupleDesc tupdesc = RelationGetDescr(onerel);
+
+ /*
+ * Fetch defined MV groups from pg_mv_statistic, and then compute
+ * the MV statistics (functional dependencies for now).
+ */
+ mvstats = list_mv_stats(RelationGetRelid(onerel));
+
+ foreach (lc, mvstats)
+ {
+ int j;
+ MVStatisticInfo *stat = (MVStatisticInfo *)lfirst(lc);
+ MVDependencies deps = NULL;
+
+ VacAttrStats **stats = NULL;
+ int numatts = 0;
+
+ /* int2 vector of attnums the stats should be computed on */
+ int2vector * attrs = stat->stakeys;
+
+ /* see how many of the columns are not dropped */
+ for (j = 0; j < attrs->dim1; j++)
+ if (! tupdesc->attrs[attrs->values[j]-1]->attisdropped)
+ numatts += 1;
+
+ /* if there are dropped attributes, build a filtered int2vector */
+ if (numatts != attrs->dim1)
+ {
+ int16 *tmp = palloc0(numatts * sizeof(int16));
+ int attnum = 0;
+
+ for (j = 0; j < attrs->dim1; j++)
+ if (! tupdesc->attrs[attrs->values[j]-1]->attisdropped)
+ tmp[attnum++] = attrs->values[j];
+
+ pfree(attrs);
+ attrs = buildint2vector(tmp, numatts);
+ }
+
+ /* filter only the interesting vacattrstats records */
+ stats = lookup_var_attr_stats(attrs, natts, vacattrstats);
+
+ /* check allowed number of dimensions */
+ Assert((attrs->dim1 >= 2) && (attrs->dim1 <= MVSTATS_MAX_DIMENSIONS));
+
+ /*
+ * Analyze functional dependencies of columns.
+ */
+ deps = build_mv_dependencies(numrows, rows, attrs, stats);
+
+ /* store the computed dependencies in the catalog */
+ update_mv_stats(stat->mvoid, deps, attrs);
+ }
+}
+
+/*
+ * Lookup the VacAttrStats info for the selected columns, with indexes
+ * matching the attrs vector (to make it easy to work with when
+ * computing multivariate stats).
+ */
+static VacAttrStats **
+lookup_var_attr_stats(int2vector *attrs, int natts, VacAttrStats **vacattrstats)
+{
+ int i, j;
+ int numattrs = attrs->dim1;
+ VacAttrStats **stats = (VacAttrStats**)palloc0(numattrs * sizeof(VacAttrStats*));
+
+ /* lookup VacAttrStats info for the requested columns (same attnum) */
+ for (i = 0; i < numattrs; i++)
+ {
+ stats[i] = NULL;
+ for (j = 0; j < natts; j++)
+ {
+ if (attrs->values[i] == vacattrstats[j]->tupattnum)
+ {
+ stats[i] = vacattrstats[j];
+ break;
+ }
+ }
+
+ /*
+ * Check that we found the info, that the attnum matches and
+ * that there's the requested 'lt' operator and that the type
+ * is 'passed-by-value'.
+ */
+ Assert(stats[i] != NULL);
+ Assert(stats[i]->tupattnum == attrs->values[i]);
+
+ /* FIXME This is a rather ugly way to check for 'ltopr' (which
+ * is defined for 'scalar' attributes).
+ */
+ Assert(((StdAnalyzeData *)stats[i]->extra_data)->ltopr != InvalidOid);
+ }
+
+ return stats;
+}
+
+/*
+ * Fetch list of MV stats defined on a table, without the actual data
+ * for histograms, MCV lists etc.
+ */
+static List*
+list_mv_stats(Oid relid)
+{
+ Relation indrel;
+ SysScanDesc indscan;
+ ScanKeyData skey;
+ HeapTuple htup;
+ List *result = NIL;
+
+ /* Prepare to scan pg_mv_statistic for entries having starelid = this rel. */
+ ScanKeyInit(&skey,
+ Anum_pg_mv_statistic_starelid,
+ BTEqualStrategyNumber, F_OIDEQ,
+ ObjectIdGetDatum(relid));
+
+ indrel = heap_open(MvStatisticRelationId, AccessShareLock);
+ indscan = systable_beginscan(indrel, MvStatisticRelidIndexId, true,
+ NULL, 1, &skey);
+
+ while (HeapTupleIsValid(htup = systable_getnext(indscan)))
+ {
+ MVStatisticInfo *info = makeNode(MVStatisticInfo);
+ Form_pg_mv_statistic stats = (Form_pg_mv_statistic) GETSTRUCT(htup);
+
+ info->mvoid = HeapTupleGetOid(htup);
+ info->stakeys = buildint2vector(stats->stakeys.values, stats->stakeys.dim1);
+ info->deps_built = stats->deps_built;
+
+ result = lappend(result, info);
+ }
+
+ systable_endscan(indscan);
+
+ heap_close(indrel, AccessShareLock);
+
+ /* TODO Maybe save the list into the relcache, as RelationGetIndexList
+ * (which served as inspiration for this function) does. */
+
+ return result;
+}
+
+void
+update_mv_stats(Oid mvoid, MVDependencies dependencies, int2vector *attrs)
+{
+ HeapTuple stup,
+ oldtup;
+ Datum values[Natts_pg_mv_statistic];
+ bool nulls[Natts_pg_mv_statistic];
+ bool replaces[Natts_pg_mv_statistic];
+
+ Relation sd = heap_open(MvStatisticRelationId, RowExclusiveLock);
+
+ memset(nulls, 1, Natts_pg_mv_statistic * sizeof(bool));
+ memset(replaces, 0, Natts_pg_mv_statistic * sizeof(bool));
+ memset(values, 0, Natts_pg_mv_statistic * sizeof(Datum));
+
+ /*
+ * Construct a new pg_mv_statistic tuple - replace only the dependencies
+ * value, depending on whether it actually was computed.
+ */
+ if (dependencies != NULL)
+ {
+ nulls[Anum_pg_mv_statistic_stadeps -1] = false;
+ values[Anum_pg_mv_statistic_stadeps - 1]
+ = PointerGetDatum(serialize_mv_dependencies(dependencies));
+ }
+
+ /* always replace the value (either by bytea or NULL) */
+ replaces[Anum_pg_mv_statistic_stadeps -1] = true;
+
+ /* always change the availability flags */
+ nulls[Anum_pg_mv_statistic_deps_built -1] = false;
+ nulls[Anum_pg_mv_statistic_stakeys-1] = false;
+
+ /* use the new attnums, in case we removed some dropped ones */
+ replaces[Anum_pg_mv_statistic_deps_built-1] = true;
+ replaces[Anum_pg_mv_statistic_stakeys -1] = true;
+
+ values[Anum_pg_mv_statistic_deps_built-1] = BoolGetDatum(dependencies != NULL);
+ values[Anum_pg_mv_statistic_stakeys -1] = PointerGetDatum(attrs);
+
+ /* Is there already a pg_mv_statistic tuple for these statistics? */
+ oldtup = SearchSysCache1(MVSTATOID,
+ ObjectIdGetDatum(mvoid));
+
+ if (HeapTupleIsValid(oldtup))
+ {
+ /* Yes, replace it */
+ stup = heap_modify_tuple(oldtup,
+ RelationGetDescr(sd),
+ values,
+ nulls,
+ replaces);
+ ReleaseSysCache(oldtup);
+ simple_heap_update(sd, &stup->t_self, stup);
+ }
+ else
+ elog(ERROR, "invalid pg_mv_statistic record (oid=%d)", mvoid);
+
+ /* update indexes too */
+ CatalogUpdateIndexes(sd, stup);
+
+ heap_freetuple(stup);
+
+ heap_close(sd, RowExclusiveLock);
+}
+
+/* multi-variate stats comparator */
+
+/*
+ * qsort_arg comparator for sorting Datums (MV stats)
+ *
+ * This does not maintain the tupnoLink array.
+ */
+int
+compare_scalars_simple(const void *a, const void *b, void *arg)
+{
+ Datum da = *(Datum*)a;
+ Datum db = *(Datum*)b;
+ SortSupport ssup= (SortSupport) arg;
+
+ return ApplySortComparator(da, false, db, false, ssup);
+}
+
+/*
+ * qsort_arg comparator for sorting data when partitioning a MV bucket
+ */
+int
+compare_scalars_partition(const void *a, const void *b, void *arg)
+{
+ Datum da = ((ScalarItem*)a)->value;
+ Datum db = ((ScalarItem*)b)->value;
+ SortSupport ssup= (SortSupport) arg;
+
+ return ApplySortComparator(da, false, db, false, ssup);
+}
+
+/* initialize multi-dimensional sort */
+MultiSortSupport
+multi_sort_init(int ndims)
+{
+ MultiSortSupport mss;
+
+ Assert(ndims >= 2);
+
+ mss = (MultiSortSupport)palloc0(offsetof(MultiSortSupportData, ssup)
+ + sizeof(SortSupportData)*ndims);
+
+ mss->ndims = ndims;
+
+ return mss;
+}
+
+/*
+ * add sort support for dimension 'dim' (index into vacattrstats) to mss,
+ * at the position 'sortdim'
+ */
+void
+multi_sort_add_dimension(MultiSortSupport mss, int sortdim,
+ int dim, VacAttrStats **vacattrstats)
+{
+ /* first, lookup StdAnalyzeData for the dimension (attribute) */
+ SortSupportData ssup;
+ StdAnalyzeData *tmp = (StdAnalyzeData *)vacattrstats[dim]->extra_data;
+
+ Assert(mss != NULL);
+ Assert(sortdim < mss->ndims);
+
+ /* initialize sort support, etc. */
+ memset(&ssup, 0, sizeof(ssup));
+ ssup.ssup_cxt = CurrentMemoryContext;
+
+ /* We always use the default collation for statistics */
+ ssup.ssup_collation = DEFAULT_COLLATION_OID;
+ ssup.ssup_nulls_first = false;
+
+ PrepareSortSupportFromOrderingOp(tmp->ltopr, &ssup);
+
+ mss->ssup[sortdim] = ssup;
+}
+
+/* compare all the dimensions in the selected order */
+int
+multi_sort_compare(const void *a, const void *b, void *arg)
+{
+ int i;
+ SortItem *ia = (SortItem*)a;
+ SortItem *ib = (SortItem*)b;
+
+ MultiSortSupport mss = (MultiSortSupport)arg;
+
+ for (i = 0; i < mss->ndims; i++)
+ {
+ int compare;
+
+ compare = ApplySortComparator(ia->values[i], ia->isnull[i],
+ ib->values[i], ib->isnull[i],
+ &mss->ssup[i]);
+
+ if (compare != 0)
+ return compare;
+
+ }
+
+ /* equal by default */
+ return 0;
+}
+
+/* compare selected dimension */
+int
+multi_sort_compare_dim(int dim, const SortItem *a, const SortItem *b,
+ MultiSortSupport mss)
+{
+ return ApplySortComparator(a->values[dim], a->isnull[dim],
+ b->values[dim], b->isnull[dim],
+ &mss->ssup[dim]);
+}
diff --git a/src/backend/utils/mvstats/common.h b/src/backend/utils/mvstats/common.h
new file mode 100644
index 0000000..6d5465b
--- /dev/null
+++ b/src/backend/utils/mvstats/common.h
@@ -0,0 +1,75 @@
+/*-------------------------------------------------------------------------
+ *
+ * common.h
+ * POSTGRES multivariate statistics
+ *
+ *
+ * Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ * src/backend/utils/mvstats/common.h
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres.h"
+
+#include "access/tuptoaster.h"
+#include "catalog/indexing.h"
+#include "catalog/pg_collation.h"
+#include "catalog/pg_mv_statistic.h"
+#include "foreign/fdwapi.h"
+#include "postmaster/autovacuum.h"
+#include "storage/lmgr.h"
+#include "utils/datum.h"
+#include "utils/sortsupport.h"
+#include "utils/syscache.h"
+#include "utils/fmgroids.h"
+#include "utils/builtins.h"
+#include "access/sysattr.h"
+
+#include "utils/mvstats.h"
+
+/* FIXME private structure copied from analyze.c */
+
+typedef struct
+{
+ Oid eqopr; /* '=' operator for datatype, if any */
+ Oid eqfunc; /* and associated function */
+ Oid ltopr; /* '<' operator for datatype, if any */
+} StdAnalyzeData;
+
+typedef struct
+{
+ Datum value; /* a data value */
+ int tupno; /* position index for tuple it came from */
+} ScalarItem;
+
+/* multi-sort */
+typedef struct MultiSortSupportData {
+ int ndims; /* number of dimensions supported by the sort */
+ SortSupportData ssup[1]; /* sort support data for each dimension */
+} MultiSortSupportData;
+
+typedef MultiSortSupportData* MultiSortSupport;
+
+typedef struct SortItem {
+ Datum *values;
+ bool *isnull;
+} SortItem;
+
+MultiSortSupport multi_sort_init(int ndims);
+
+void multi_sort_add_dimension(MultiSortSupport mss, int sortdim,
+ int dim, VacAttrStats **vacattrstats);
+
+int multi_sort_compare(const void *a, const void *b, void *arg);
+
+int multi_sort_compare_dim(int dim, const SortItem *a,
+ const SortItem *b, MultiSortSupport mss);
+
+/* comparators, used when constructing multivariate stats */
+int compare_scalars_simple(const void *a, const void *b, void *arg);
+int compare_scalars_partition(const void *a, const void *b, void *arg);
diff --git a/src/backend/utils/mvstats/dependencies.c b/src/backend/utils/mvstats/dependencies.c
new file mode 100644
index 0000000..2a064a0
--- /dev/null
+++ b/src/backend/utils/mvstats/dependencies.c
@@ -0,0 +1,437 @@
+/*-------------------------------------------------------------------------
+ *
+ * dependencies.c
+ * POSTGRES multivariate functional dependencies
+ *
+ *
+ * Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ * src/backend/utils/mvstats/dependencies.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "common.h"
+#include "utils/lsyscache.h"
+
+/*
+ * Detect functional dependencies between columns.
+ *
+ * TODO This builds a complete set of dependencies, i.e. including transitive
+ * dependencies - if we identify [A => B] and [B => C], we're likely to
+ * identify [A => C] too. It might be better to keep only the minimal set
+ * of dependencies, i.e. prune all the dependencies that we can recreate
+ * by transitivity.
+ *
+ * There are two conceptual ways to do that:
+ *
+ * (a) generate all the rules, and then prune the rules that may be
+ * recreated by combining other dependencies, or
+ *
+ * (b) perform the 'is a combination of other dependencies' check before
+ * actually doing the work
+ *
+ * The second option has the advantage that we don't really need to perform
+ * the sort/count. It's not sufficient alone, though, because we may
+ * discover the dependencies in the wrong order. For example we may find
+ *
+ * (a -> b), (a -> c) and then (b -> c)
+ *
+ * None of those dependencies is a combination of the already known ones,
+ * yet (a -> c) is a combination of (a -> b) and (b -> c).
+ *
+ *
+ * FIXME Currently we simply replace NULL values with 0 and then handle it as
+ * a regular value, but that groups NULLs together with actual 0 values. That's
+ * clearly incorrect - we need to handle NULL values as a separate value.
+ */
+MVDependencies
+build_mv_dependencies(int numrows, HeapTuple *rows, int2vector *attrs,
+ VacAttrStats **stats)
+{
+ int i;
+ int numattrs = attrs->dim1;
+
+ /* result */
+ int ndeps = 0;
+ MVDependencies dependencies = NULL;
+ MultiSortSupport mss = multi_sort_init(2); /* 2 dimensions for now */
+
+ /* TODO Maybe this should be somehow related to the number of
+ * distinct values in the two columns we're currently analyzing.
+ * Assuming the distribution is uniform, we can estimate the
+ * average group size and use it as a threshold. Or something
+ * like that. Seems better than a static approach.
+ */
+ int min_group_size = 3;
+
+ /* dimension indexes we'll check for associations [a => b] */
+ int dima, dimb;
+
+ /*
+ * We'll reuse the same array for all the 2-column combinations.
+ *
+ * It's possible to sort the sample rows directly, but this seemed
+ * somewhat simpler / less error prone. Another option would be to
+ * allocate the arrays for each SortItem separately, but that'd be
+ * significant overhead (not just CPU, but especially memory bloat).
+ */
+ SortItem * items = (SortItem*)palloc0(numrows * sizeof(SortItem));
+
+ Datum *values = (Datum*)palloc0(sizeof(Datum) * numrows * 2);
+ bool *isnull = (bool*)palloc0(sizeof(bool) * numrows * 2);
+
+ for (i = 0; i < numrows; i++)
+ {
+ items[i].values = &values[i * 2];
+ items[i].isnull = &isnull[i * 2];
+ }
+
+ Assert(numattrs >= 2);
+
+ /*
+ * Evaluate all possible combinations of [A => B], using a simple algorithm:
+ *
+ * (a) sort the data by [A,B]
+ * (b) split the data into groups by A (new group whenever a value changes)
+ * (c) count different values in the B column (again, value changes)
+ *
+ * TODO It should be rather simple to merge [A => B] and [A => C] into
+ * [A => B,C]. Just keep A constant, collect all the "implied" columns
+ * and you're done.
+ */
+ for (dima = 0; dima < numattrs; dima++)
+ {
+ /* prepare the sort function for the first dimension */
+ multi_sort_add_dimension(mss, 0, dima, stats);
+
+ for (dimb = 0; dimb < numattrs; dimb++)
+ {
+ SortItem current;
+
+ /* number of groups supporting / contradicting the dependency */
+ int n_supporting = 0;
+ int n_contradicting = 0;
+
+ /* counters valid within a group */
+ int group_size = 0;
+ int n_violations = 0;
+
+ int n_supporting_rows = 0;
+ int n_contradicting_rows = 0;
+
+ /* skip the trivial case of identical columns (A => A) */
+ if (dima == dimb)
+ continue;
+
+ /* prepare the sort function for the second dimension */
+ multi_sort_add_dimension(mss, 1, dimb, stats);
+
+ /* reset the values and isnull flags */
+ memset(values, 0, sizeof(Datum) * numrows * 2);
+ memset(isnull, 0, sizeof(bool) * numrows * 2);
+
+ /* accumulate all the data for both columns into an array and sort it */
+ for (i = 0; i < numrows; i++)
+ {
+ items[i].values[0]
+ = heap_getattr(rows[i], attrs->values[dima],
+ stats[dima]->tupDesc, &items[i].isnull[0]);
+
+ items[i].values[1]
+ = heap_getattr(rows[i], attrs->values[dimb],
+ stats[dimb]->tupDesc, &items[i].isnull[1]);
+ }
+
+ qsort_arg((void *) items, numrows, sizeof(SortItem),
+ multi_sort_compare, mss);
+
+ /*
+ * Walk through the array, split it into rows according to
+ * the A value, and count distinct values in the other one.
+ * If there's a single B value for the whole group, we count
+ * it as supporting the association, otherwise we count it
+ * as contradicting.
+ *
+ * Furthermore we require a group to have at least a certain
+ * number of rows to be considered useful for supporting the
+ * dependency. A contradicting group, however, always counts.
+ */
+
+ /* start with values from the first row */
+ current = items[0];
+ group_size = 1;
+
+ for (i = 1; i < numrows; i++)
+ {
+ /* end of the group */
+ if (multi_sort_compare_dim(0, &items[i], ¤t, mss) != 0)
+ {
+ /*
+ * If there are no contradicting rows, count it as
+ * supporting (otherwise contradicting), but only if
+ * the group is large enough.
+ *
+ * The requirement of a minimum group size makes it
+ * impossible to identify [unique,unique] cases, but
+ * that's probably a different case. This is more
+ * about [zip => city] associations etc.
+ *
+ * If there are violations, count the group/rows as
+ * a violation.
+ *
+ * It may be neither, if the group is too small (does
+ * not contain at least min_group_size rows).
+ */
+ if ((n_violations == 0) && (group_size >= min_group_size))
+ {
+ n_supporting += 1;
+ n_supporting_rows += group_size;
+ }
+ else if (n_violations > 0)
+ {
+ n_contradicting += 1;
+ n_contradicting_rows += group_size;
+ }
+
+ /* current values start a new group */
+ n_violations = 0;
+ group_size = 0;
+ }
+ /* mismatch of a B value is contradicting */
+ else if (multi_sort_compare_dim(1, &items[i], ¤t, mss) != 0)
+ {
+ n_violations += 1;
+ }
+
+ current = items[i];
+ group_size += 1;
+ }
+
+ /* handle the last group (just like above) */
+ if ((n_violations == 0) && (group_size >= min_group_size))
+ {
+ n_supporting += 1;
+ n_supporting_rows += group_size;
+ }
+ else if (n_violations)
+ {
+ n_contradicting += 1;
+ n_contradicting_rows += group_size;
+ }
+
+ /*
+ * See if the number of rows supporting the association is at least
+ * 10x the number of rows violating the hypothetical dependency.
+ *
+ * TODO This is a rather arbitrary limit - I guess it's possible to do
+ * some math to come up with a better rule (e.g. testing a hypothesis
+ * 'this is due to randomness'). We can create a contingency table
+ * from the values and use it for testing. Possibly only when
+ * there are no contradicting rows?
+ *
+ * TODO Also, if (a => b) and (b => a) at the same time, it pretty much
+ * means there's a 1:1 relation (or one is a 'label'), making the
+ * conditions rather redundant. Although it's possible that the
+ * query uses incompatible combination of values.
+ */
+ if (n_supporting_rows > (n_contradicting_rows * 10))
+ {
+ if (dependencies == NULL)
+ {
+ dependencies = (MVDependencies)palloc0(sizeof(MVDependenciesData));
+ dependencies->magic = MVSTAT_DEPS_MAGIC;
+ }
+ else
+ dependencies = repalloc(dependencies, offsetof(MVDependenciesData, deps)
+ + sizeof(MVDependency) * (dependencies->ndeps + 1));
+
+ /* add the new dependency to the list */
+ dependencies->deps[ndeps] = (MVDependency)palloc0(sizeof(MVDependencyData));
+ dependencies->deps[ndeps]->a = attrs->values[dima];
+ dependencies->deps[ndeps]->b = attrs->values[dimb];
+
+ dependencies->ndeps = (++ndeps);
+ }
+ }
+ }
+
+ pfree(items);
+ pfree(values);
+ pfree(isnull);
+ pfree(stats);
+ pfree(mss);
+
+ return dependencies;
+}
+
+/*
+ * Serialize the dependencies into a bytea, so that they can be stored
+ * in the pg_mv_statistic catalog.
+ *
+ * Currently this only supports simple two-column rules, and stores them
+ * as a sequence of attnum pairs. In the future, this needs to be made
+ * more complex to support multiple columns on both sides of the
+ * implication (using AND on left, OR on right).
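+ *
+ * The resulting layout (as produced by the memcpy calls below) is:
+ *
+ *   varlena header | header (magic, ndeps) | ndeps x (a, b)
+ *
+ * where 'a' and 'b' are the int16 attnums of the implying / implied
+ * columns.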
+ */
+bytea *
+serialize_mv_dependencies(MVDependencies dependencies)
+{
+ int i;
+
+ /* we need to store ndeps, and each dependency needs 2 * int16 */
+ Size len = VARHDRSZ + offsetof(MVDependenciesData, deps)
+ + dependencies->ndeps * (sizeof(int16) * 2);
+
+ bytea * output = (bytea*)palloc0(len);
+
+ char * tmp = VARDATA(output);
+
+ SET_VARSIZE(output, len);
+
+ /* first, store the number of dimensions / items */
+ memcpy(tmp, dependencies, offsetof(MVDependenciesData, deps));
+ tmp += offsetof(MVDependenciesData, deps);
+
+ /* walk through the dependencies and copy both columns into the bytea */
+ for (i = 0; i < dependencies->ndeps; i++)
+ {
+ memcpy(tmp, &(dependencies->deps[i]->a), sizeof(int16));
+ tmp += sizeof(int16);
+
+ memcpy(tmp, &(dependencies->deps[i]->b), sizeof(int16));
+ tmp += sizeof(int16);
+ }
+
+ return output;
+}
+
+/*
+ * Reads serialized dependencies into MVDependencies structure.
+ */
+MVDependencies
+deserialize_mv_dependencies(bytea * data)
+{
+ int i;
+ Size expected_size;
+ MVDependencies dependencies;
+ char *tmp;
+
+ if (data == NULL)
+ return NULL;
+
+ if (VARSIZE_ANY_EXHDR(data) < offsetof(MVDependenciesData,deps))
+ elog(ERROR, "invalid MVDependencies size %ld (expected at least %ld)",
+ VARSIZE_ANY_EXHDR(data), offsetof(MVDependenciesData,deps));
+
+ /* read the MVDependencies header */
+ dependencies = (MVDependencies)palloc0(sizeof(MVDependenciesData));
+
+ /* initialize pointer to the data part (skip the varlena header) */
+ tmp = VARDATA(data);
+
+ /* get the header and perform basic sanity checks */
+ memcpy(dependencies, tmp, offsetof(MVDependenciesData, deps));
+ tmp += offsetof(MVDependenciesData, deps);
+
+ if (dependencies->magic != MVSTAT_DEPS_MAGIC)
+ {
+ pfree(dependencies);
+ elog(WARNING, "not a MV Dependencies (magic number mismatch)");
+ return NULL;
+ }
+
+ Assert(dependencies->ndeps > 0);
+
+ /* what bytea size do we expect for those parameters */
+ expected_size = offsetof(MVDependenciesData,deps) +
+ dependencies->ndeps * sizeof(int16) * 2;
+
+ if (VARSIZE_ANY_EXHDR(data) != expected_size)
+ elog(ERROR, "invalid dependencies size %ld (expected %ld)",
+ VARSIZE_ANY_EXHDR(data), expected_size);
+
+ /* allocate space for the dependency pointers */
+ dependencies = repalloc(dependencies, offsetof(MVDependenciesData,deps)
+ + (dependencies->ndeps * sizeof(MVDependency)));
+
+ for (i = 0; i < dependencies->ndeps; i++)
+ {
+ dependencies->deps[i] = (MVDependency)palloc0(sizeof(MVDependencyData));
+
+ memcpy(&(dependencies->deps[i]->a), tmp, sizeof(int16));
+ tmp += sizeof(int16);
+
+ memcpy(&(dependencies->deps[i]->b), tmp, sizeof(int16));
+ tmp += sizeof(int16);
+ }
+
+ return dependencies;
+}
+
+/* print some basic info about dependencies (number of dependencies) */
+Datum
+pg_mv_stats_dependencies_info(PG_FUNCTION_ARGS)
+{
+ bytea *data = PG_GETARG_BYTEA_P(0);
+ char *result;
+
+ MVDependencies dependencies = deserialize_mv_dependencies(data);
+
+ if (dependencies == NULL)
+ PG_RETURN_NULL();
+
+ result = palloc0(128);
+ snprintf(result, 128, "dependencies=%d", dependencies->ndeps);
+
+ /* FIXME free the deserialized data (pfree is not enough) */
+
+ PG_RETURN_TEXT_P(cstring_to_text(result));
+}
+
+/* print the dependencies
+ *
+ * TODO Would be nice if this knew the actual column names (instead of
+ * the attnums).
+ *
+ * FIXME This is really ugly and does not really check the lengths and
+ * strcpy/snprintf return values properly. Needs to be fixed.
+ */
+Datum
+pg_mv_stats_dependencies_show(PG_FUNCTION_ARGS)
+{
+ int i = 0;
+ bytea *data = PG_GETARG_BYTEA_P(0);
+ char *result = NULL;
+ int len = 0;
+
+ MVDependencies dependencies = deserialize_mv_dependencies(data);
+
+ if (dependencies == NULL)
+ PG_RETURN_NULL();
+
+ for (i = 0; i < dependencies->ndeps; i++)
+ {
+ MVDependency dependency = dependencies->deps[i];
+ char buffer[128];
+
+ int tmp = snprintf(buffer, 128, "%s%d => %d",
+ ((i == 0) ? "" : ", "), dependency->a, dependency->b);
+
+ if (tmp < 127)
+ {
+ if (result == NULL)
+ result = palloc0(len + tmp + 1);
+ else
+ result = repalloc(result, len + tmp + 1);
+
+ strcpy(result + len, buffer);
+ len += tmp;
+ }
+ }
+
+ PG_RETURN_TEXT_P(cstring_to_text(result));
+}
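
One way to fix the FIXME above might be to build the string using StringInfo,
which grows the buffer and tracks lengths automatically. A hypothetical
sketch of the same loop (format_dependencies is a made-up name):

    #include "postgres.h"
    #include "lib/stringinfo.h"
    #include "utils/mvstats.h"

    static char *
    format_dependencies(MVDependencies dependencies)
    {
        StringInfoData str;
        int         i;

        initStringInfo(&str);

        /* same "a => b, c => d" output, without manual repalloc/strcpy */
        for (i = 0; i < dependencies->ndeps; i++)
            appendStringInfo(&str, "%s%d => %d",
                             (i == 0) ? "" : ", ",
                             dependencies->deps[i]->a,
                             dependencies->deps[i]->b);

        return str.data;
    }
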
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index fd8dc91..4f106c3 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -2104,6 +2104,50 @@ describeOneTableDetails(const char *schemaname,
PQclear(result);
}
+ /* print any multivariate statistics */
+ if (pset.sversion >= 90500)
+ {
+ printfPQExpBuffer(&buf,
+ "SELECT oid, stanamespace::regnamespace AS nsp, staname, stakeys,\n"
+ " deps_enabled,\n"
+ " deps_built,\n"
+ " (SELECT string_agg(attname::text,', ')\n"
+ " FROM ((SELECT unnest(stakeys) AS attnum) s\n"
+ " JOIN pg_attribute a ON (starelid = a.attrelid and a.attnum = s.attnum))) AS attnums\n"
+ "FROM pg_mv_statistic stat WHERE starelid = '%s' ORDER BY 1;",
+ oid);
+
+ result = PSQLexec(buf.data);
+ if (!result)
+ goto error_return;
+ else
+ tuples = PQntuples(result);
+
+ if (tuples > 0)
+ {
+ printTableAddFooter(&cont, _("Statistics:"));
+ for (i = 0; i < tuples; i++)
+ {
+ printfPQExpBuffer(&buf, " ");
+
+ /* statistics name (qualified with namespace) */
+ appendPQExpBuffer(&buf, "\"%s.%s\" ",
+ PQgetvalue(result, i, 1),
+ PQgetvalue(result, i, 2));
+
+ /* options */
+ if (!strcmp(PQgetvalue(result, i, 4), "t"))
+ appendPQExpBuffer(&buf, "(dependencies)");
+
+ appendPQExpBuffer(&buf, " ON (%s)",
+ PQgetvalue(result, i, 6));
+
+ printTableAddFooter(&cont, buf.data);
+ }
+ }
+ PQclear(result);
+ }
+
/* print rules */
if (tableinfo.hasrules && tableinfo.relkind != 'm')
{
diff --git a/src/include/catalog/dependency.h b/src/include/catalog/dependency.h
index 049bf9f..12211fe 100644
--- a/src/include/catalog/dependency.h
+++ b/src/include/catalog/dependency.h
@@ -153,10 +153,11 @@ typedef enum ObjectClass
OCLASS_EXTENSION, /* pg_extension */
OCLASS_EVENT_TRIGGER, /* pg_event_trigger */
OCLASS_POLICY, /* pg_policy */
- OCLASS_TRANSFORM /* pg_transform */
+ OCLASS_TRANSFORM, /* pg_transform */
+ OCLASS_STATISTICS /* pg_mv_statistics */
} ObjectClass;
-#define LAST_OCLASS OCLASS_TRANSFORM
+#define LAST_OCLASS OCLASS_STATISTICS
/* in dependency.c */
diff --git a/src/include/catalog/heap.h b/src/include/catalog/heap.h
index b80d8d8..5ae42f7 100644
--- a/src/include/catalog/heap.h
+++ b/src/include/catalog/heap.h
@@ -119,6 +119,7 @@ extern void RemoveAttrDefault(Oid relid, AttrNumber attnum,
DropBehavior behavior, bool complain, bool internal);
extern void RemoveAttrDefaultById(Oid attrdefId);
extern void RemoveStatistics(Oid relid, AttrNumber attnum);
+extern void RemoveMVStatistics(Oid relid, AttrNumber attnum);
extern Form_pg_attribute SystemAttributeDefinition(AttrNumber attno,
bool relhasoids);
diff --git a/src/include/catalog/indexing.h b/src/include/catalog/indexing.h
index ab2c1a8..a768bb5 100644
--- a/src/include/catalog/indexing.h
+++ b/src/include/catalog/indexing.h
@@ -173,6 +173,13 @@ DECLARE_UNIQUE_INDEX(pg_largeobject_loid_pn_index, 2683, on pg_largeobject using
DECLARE_UNIQUE_INDEX(pg_largeobject_metadata_oid_index, 2996, on pg_largeobject_metadata using btree(oid oid_ops));
#define LargeObjectMetadataOidIndexId 2996
+DECLARE_UNIQUE_INDEX(pg_mv_statistic_oid_index, 3380, on pg_mv_statistic using btree(oid oid_ops));
+#define MvStatisticOidIndexId 3380
+DECLARE_UNIQUE_INDEX(pg_mv_statistic_name_index, 3997, on pg_mv_statistic using btree(staname name_ops, stanamespace oid_ops));
+#define MvStatisticNameIndexId 3997
+DECLARE_INDEX(pg_mv_statistic_relid_index, 3379, on pg_mv_statistic using btree(starelid oid_ops));
+#define MvStatisticRelidIndexId 3379
+
DECLARE_UNIQUE_INDEX(pg_namespace_nspname_index, 2684, on pg_namespace using btree(nspname name_ops));
#define NamespaceNameIndexId 2684
DECLARE_UNIQUE_INDEX(pg_namespace_oid_index, 2685, on pg_namespace using btree(oid oid_ops));
diff --git a/src/include/catalog/namespace.h b/src/include/catalog/namespace.h
index 2ccb3a7..44cf9c6 100644
--- a/src/include/catalog/namespace.h
+++ b/src/include/catalog/namespace.h
@@ -137,6 +137,8 @@ extern Oid get_collation_oid(List *collname, bool missing_ok);
extern Oid get_conversion_oid(List *conname, bool missing_ok);
extern Oid FindDefaultConversionProc(int32 for_encoding, int32 to_encoding);
+extern Oid get_statistics_oid(List *names, bool missing_ok);
+
/* initialization & transaction cleanup code */
extern void InitializeSearchPath(void);
extern void AtEOXact_Namespace(bool isCommit, bool parallel);
diff --git a/src/include/catalog/pg_mv_statistic.h b/src/include/catalog/pg_mv_statistic.h
new file mode 100644
index 0000000..a568a07
--- /dev/null
+++ b/src/include/catalog/pg_mv_statistic.h
@@ -0,0 +1,73 @@
+/*-------------------------------------------------------------------------
+ *
+ * pg_mv_statistic.h
+ * definition of the system "multivariate statistic" relation (pg_mv_statistic)
+ * along with the relation's initial contents.
+ *
+ *
+ * Portions Copyright (c) 1996-2014, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/catalog/pg_mv_statistic.h
+ *
+ * NOTES
+ * the genbki.pl script reads this file and generates .bki
+ * information from the DATA() statements.
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef PG_MV_STATISTIC_H
+#define PG_MV_STATISTIC_H
+
+#include "catalog/genbki.h"
+
+/* ----------------
+ * pg_mv_statistic definition. cpp turns this into
+ * typedef struct FormData_pg_mv_statistic
+ * ----------------
+ */
+#define MvStatisticRelationId 3381
+
+CATALOG(pg_mv_statistic,3381)
+{
+ /* These fields form the unique key for the entry: */
+ Oid starelid; /* relation containing attributes */
+ NameData staname; /* statistics name */
+ Oid stanamespace; /* OID of namespace containing this statistics */
+
+ /* statistics requested to build */
+ bool deps_enabled; /* analyze dependencies? */
+
+ /* statistics that are available (if requested) */
+ bool deps_built; /* dependencies were built */
+
+ /* variable-length fields start here, but we allow direct access to stakeys */
+ int2vector stakeys; /* array of column keys */
+
+#ifdef CATALOG_VARLEN
+ bytea stadeps; /* dependencies (serialized) */
+#endif
+
+} FormData_pg_mv_statistic;
+
+/* ----------------
+ * Form_pg_mv_statistic corresponds to a pointer to a tuple with
+ * the format of pg_mv_statistic relation.
+ * ----------------
+ */
+typedef FormData_pg_mv_statistic *Form_pg_mv_statistic;
+
+/* ----------------
+ * compiler constants for pg_mv_statistic
+ * ----------------
+ */
+#define Natts_pg_mv_statistic 7
+#define Anum_pg_mv_statistic_starelid 1
+#define Anum_pg_mv_statistic_staname 2
+#define Anum_pg_mv_statistic_stanamespace 3
+#define Anum_pg_mv_statistic_deps_enabled 4
+#define Anum_pg_mv_statistic_deps_built 5
+#define Anum_pg_mv_statistic_stakeys 6
+#define Anum_pg_mv_statistic_stadeps 7
+
+#endif /* PG_MV_STATISTIC_H */
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
index cbbb883..eecce40 100644
--- a/src/include/catalog/pg_proc.h
+++ b/src/include/catalog/pg_proc.h
@@ -2666,6 +2666,11 @@ DESCR("current user privilege on any column by rel name");
DATA(insert OID = 3029 ( has_any_column_privilege PGNSP PGUID 12 10 0 0 0 f f f f t f s s 2 0 16 "26 25" _null_ _null_ _null_ _null_ _null_ has_any_column_privilege_id _null_ _null_ _null_ ));
DESCR("current user privilege on any column by rel oid");
+DATA(insert OID = 3998 ( pg_mv_stats_dependencies_info PGNSP PGUID 12 1 0 0 0 f f f f t f i s 1 0 25 "17" _null_ _null_ _null_ _null_ _null_ pg_mv_stats_dependencies_info _null_ _null_ _null_ ));
+DESCR("multivariate stats: functional dependencies info");
+DATA(insert OID = 3999 ( pg_mv_stats_dependencies_show PGNSP PGUID 12 1 0 0 0 f f f f t f i s 1 0 25 "17" _null_ _null_ _null_ _null_ _null_ pg_mv_stats_dependencies_show _null_ _null_ _null_ ));
+DESCR("multivariate stats: functional dependencies show");
+
DATA(insert OID = 1928 ( pg_stat_get_numscans PGNSP PGUID 12 1 0 0 0 f f f f t f s r 1 0 20 "26" _null_ _null_ _null_ _null_ _null_ pg_stat_get_numscans _null_ _null_ _null_ ));
DESCR("statistics: number of scans done for table/index");
DATA(insert OID = 1929 ( pg_stat_get_tuples_returned PGNSP PGUID 12 1 0 0 0 f f f f t f s r 1 0 20 "26" _null_ _null_ _null_ _null_ _null_ pg_stat_get_tuples_returned _null_ _null_ _null_ ));
diff --git a/src/include/catalog/toasting.h b/src/include/catalog/toasting.h
index b7a38ce..a52096b 100644
--- a/src/include/catalog/toasting.h
+++ b/src/include/catalog/toasting.h
@@ -49,6 +49,7 @@ extern void BootstrapToastTable(char *relName,
DECLARE_TOAST(pg_attrdef, 2830, 2831);
DECLARE_TOAST(pg_constraint, 2832, 2833);
DECLARE_TOAST(pg_description, 2834, 2835);
+DECLARE_TOAST(pg_mv_statistic, 3577, 3578);
DECLARE_TOAST(pg_proc, 2836, 2837);
DECLARE_TOAST(pg_rewrite, 2838, 2839);
DECLARE_TOAST(pg_seclabel, 3598, 3599);
diff --git a/src/include/commands/defrem.h b/src/include/commands/defrem.h
index 54f67e9..99a6a62 100644
--- a/src/include/commands/defrem.h
+++ b/src/include/commands/defrem.h
@@ -75,6 +75,10 @@ extern ObjectAddress DefineOperator(List *names, List *parameters);
extern void RemoveOperatorById(Oid operOid);
extern ObjectAddress AlterOperator(AlterOperatorStmt *stmt);
+/* commands/statscmds.c */
+extern ObjectAddress CreateStatistics(CreateStatsStmt *stmt);
+extern void RemoveStatisticsById(Oid statsOid);
+
/* commands/aggregatecmds.c */
extern ObjectAddress DefineAggregate(List *name, List *args, bool oldstyle,
List *parameters, const char *queryString);
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index fad9988..545b62a 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -266,6 +266,7 @@ typedef enum NodeTag
T_PlaceHolderInfo,
T_MinMaxAggInfo,
T_PlannerParamItem,
+ T_MVStatisticInfo,
/*
* TAGS FOR MEMORY NODES (memnodes.h)
@@ -401,6 +402,7 @@ typedef enum NodeTag
T_CreatePolicyStmt,
T_AlterPolicyStmt,
T_CreateTransformStmt,
+ T_CreateStatsStmt,
/*
* TAGS FOR PARSE TREE NODES (parsenodes.h)
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 2fd0629..e1807fb 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -601,6 +601,17 @@ typedef struct ColumnDef
int location; /* parse location, or -1 if none/unknown */
} ColumnDef;
+typedef struct CreateStatsStmt
+{
+ NodeTag type;
+ List *defnames; /* qualified name (list of Value strings) */
+ RangeVar *relation; /* relation to build statistics on */
+ List *keys; /* String nodes naming referenced column(s) */
+ List *options; /* list of DefElem nodes */
+ bool if_not_exists; /* just do nothing if statistics already exists? */
+} CreateStatsStmt;
+
+
/*
* TableLikeClause - CREATE TABLE ( ... LIKE ... ) clause
*/
@@ -1410,6 +1421,7 @@ typedef enum ObjectType
OBJECT_RULE,
OBJECT_SCHEMA,
OBJECT_SEQUENCE,
+ OBJECT_STATISTICS,
OBJECT_TABCONSTRAINT,
OBJECT_TABLE,
OBJECT_TABLESPACE,
diff --git a/src/include/nodes/relation.h b/src/include/nodes/relation.h
index 641728b..e10dcf1 100644
--- a/src/include/nodes/relation.h
+++ b/src/include/nodes/relation.h
@@ -539,6 +539,7 @@ typedef struct RelOptInfo
List *lateral_vars; /* LATERAL Vars and PHVs referenced by rel */
Relids lateral_referencers; /* rels that reference me laterally */
List *indexlist; /* list of IndexOptInfo */
+ List *mvstatlist; /* list of MVStatisticInfo */
BlockNumber pages; /* size estimates derived from pg_class */
double tuples;
double allvisfrac;
@@ -634,6 +635,33 @@ typedef struct IndexOptInfo
void (*amcostestimate) (); /* AM's cost estimator */
} IndexOptInfo;
+/*
+ * MVStatisticInfo
+ * Information about multivariate stats for planning/optimization
+ *
+ * This contains information about which columns are covered by the
+ * statistics (stakeys), which options were requested while adding the
+ * statistics (*_enabled), and which kinds of statistics were actually
+ * built and are available for the optimizer (*_built).
+ */
+typedef struct MVStatisticInfo
+{
+ NodeTag type;
+
+ Oid mvoid; /* OID of the statistics row */
+ RelOptInfo *rel; /* back-link to the owning relation */
+
+ /* enabled statistics */
+ bool deps_enabled; /* functional dependencies enabled */
+
+ /* built/available statistics */
+ bool deps_built; /* functional dependencies built */
+
+ /* columns in the statistics (attnums) */
+ int2vector *stakeys; /* attnums of the columns covered */
+
+} MVStatisticInfo;
+
/*
* EquivalenceClasses
diff --git a/src/include/utils/mvstats.h b/src/include/utils/mvstats.h
new file mode 100644
index 0000000..7ebd961
--- /dev/null
+++ b/src/include/utils/mvstats.h
@@ -0,0 +1,70 @@
+/*-------------------------------------------------------------------------
+ *
+ * mvstats.h
+ * Multivariate statistics and selectivity estimation functions.
+ *
+ *
+ * Portions Copyright (c) 1996-2014, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/utils/mvstats.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef MVSTATS_H
+#define MVSTATS_H
+
+#include "fmgr.h"
+#include "commands/vacuum.h"
+
+
+#define MVSTATS_MAX_DIMENSIONS 8 /* max number of attributes */
+
+/* An associative rule, tracking [a => b] dependency.
+ *
+ * TODO Make this work with multiple columns on both sides.
+ */
+typedef struct MVDependencyData {
+ int16 a;
+ int16 b;
+} MVDependencyData;
+
+typedef MVDependencyData* MVDependency;
+
+typedef struct MVDependenciesData {
+ uint32 magic; /* magic constant marker */
+ int32 ndeps; /* number of dependencies */
+ MVDependency deps[1]; /* XXX why not a pointer? */
+} MVDependenciesData;
+
+typedef MVDependenciesData* MVDependencies;
+
+#define MVSTAT_DEPS_MAGIC 0xB4549A2C /* marks serialized bytea */
+#define MVSTAT_DEPS_TYPE_BASIC 1 /* basic dependencies type */
+
+/*
+ * TODO Maybe fetching the histogram/MCV list separately is inefficient?
+ * Consider adding a single `fetch_stats` method, fetching all
+ * stats specified using flags (or something like that).
+ */
+
+bytea * serialize_mv_dependencies(MVDependencies dependencies);
+
+/* deserialization of stats (serialization is private to analyze) */
+MVDependencies deserialize_mv_dependencies(bytea * data);
+
+/* FIXME this probably belongs somewhere else (not to operations stats) */
+extern Datum pg_mv_stats_dependencies_info(PG_FUNCTION_ARGS);
+extern Datum pg_mv_stats_dependencies_show(PG_FUNCTION_ARGS);
+
+MVDependencies
+build_mv_dependencies(int numrows, HeapTuple *rows,
+ int2vector *attrs,
+ VacAttrStats **stats);
+
+void build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
+ int natts, VacAttrStats **vacattrstats);
+
+void update_mv_stats(Oid relid, MVDependencies dependencies, int2vector *attrs);
+
+#endif
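
For illustration, a hypothetical sketch of building this structure for a
single [a => b] dependency (assumes postgres.h and a backend memory context;
make_single_dependency is a made-up name):

    static MVDependencies
    make_single_dependency(int16 a, int16 b)
    {
        /* header (magic, ndeps) plus room for one pointer in deps[] */
        MVDependencies deps = (MVDependencies)
            palloc0(offsetof(MVDependenciesData, deps) + sizeof(MVDependency));

        deps->magic = MVSTAT_DEPS_MAGIC;
        deps->ndeps = 1;

        deps->deps[0] = (MVDependency) palloc0(sizeof(MVDependencyData));
        deps->deps[0]->a = a;   /* determining column (attnum) */
        deps->deps[0]->b = b;   /* implied column (attnum) */

        return deps;
    }
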
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index f2bebf2..8771f9c 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -61,6 +61,7 @@ typedef struct RelationData
bool rd_isvalid; /* relcache entry is valid */
char rd_indexvalid; /* state of rd_indexlist: 0 = not valid, 1 =
* valid, 2 = temporarily forced */
+ bool rd_mvstatvalid; /* state of rd_mvstatlist: true/false */
/*
* rd_createSubid is the ID of the highest subtransaction the rel has
@@ -93,6 +94,9 @@ typedef struct RelationData
List *rd_indexlist; /* list of OIDs of indexes on relation */
Oid rd_oidindex; /* OID of unique index on OID, if any */
Oid rd_replidindex; /* OID of replica identity index, if any */
+
+ /* data managed by RelationGetMVStatList: */
+ List *rd_mvstatlist; /* list of OIDs of multivariate stats */
/* data managed by RelationGetIndexAttrBitmap: */
Bitmapset *rd_indexattr; /* identifies columns used in indexes */
diff --git a/src/include/utils/relcache.h b/src/include/utils/relcache.h
index 1b48304..9f03c8d 100644
--- a/src/include/utils/relcache.h
+++ b/src/include/utils/relcache.h
@@ -38,6 +38,7 @@ extern void RelationClose(Relation relation);
* Routines to compute/retrieve additional cached information
*/
extern List *RelationGetIndexList(Relation relation);
+extern List *RelationGetMVStatList(Relation relation);
extern Oid RelationGetOidIndex(Relation relation);
extern Oid RelationGetReplicaIndex(Relation relation);
extern List *RelationGetIndexExpressions(Relation relation);
diff --git a/src/include/utils/syscache.h b/src/include/utils/syscache.h
index 256615b..0e0658d 100644
--- a/src/include/utils/syscache.h
+++ b/src/include/utils/syscache.h
@@ -66,6 +66,8 @@ enum SysCacheIdentifier
INDEXRELID,
LANGNAME,
LANGOID,
+ MVSTATNAMENSP,
+ MVSTATOID,
NAMESPACENAME,
NAMESPACEOID,
OPERNAMENSP,
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 81bc5c9..84b4425 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1368,6 +1368,15 @@ pg_matviews| SELECT n.nspname AS schemaname,
LEFT JOIN pg_namespace n ON ((n.oid = c.relnamespace)))
LEFT JOIN pg_tablespace t ON ((t.oid = c.reltablespace)))
WHERE (c.relkind = 'm'::"char");
+pg_mv_stats| SELECT n.nspname AS schemaname,
+ c.relname AS tablename,
+ s.staname,
+ s.stakeys AS attnums,
+ length(s.stadeps) AS depsbytes,
+ pg_mv_stats_dependencies_info(s.stadeps) AS depsinfo
+ FROM ((pg_mv_statistic s
+ JOIN pg_class c ON ((c.oid = s.starelid)))
+ LEFT JOIN pg_namespace n ON ((n.oid = c.relnamespace)));
pg_policies| SELECT n.nspname AS schemaname,
c.relname AS tablename,
pol.polname AS policyname,
diff --git a/src/test/regress/expected/sanity_check.out b/src/test/regress/expected/sanity_check.out
index eb0bc88..92a0d8a 100644
--- a/src/test/regress/expected/sanity_check.out
+++ b/src/test/regress/expected/sanity_check.out
@@ -113,6 +113,7 @@ pg_inherits|t
pg_language|t
pg_largeobject|t
pg_largeobject_metadata|t
+pg_mv_statistic|t
pg_namespace|t
pg_opclass|t
pg_operator|t
--
2.1.0
Attachment: 0003-clause-reduction-using-functional-dependencies.patch
From 6f359af9ce78fd21bde74b76e45508364da992b2 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@pgaddict.com>
Date: Mon, 6 Apr 2015 19:42:18 +0200
Subject: [PATCH 3/9] clause reduction using functional dependencies
During planning, use functional dependencies to decide which clauses to
skip during cardinality estimation. Initial and rather simplistic
implementation.
This only works with regular WHERE clauses, not with join clauses.
Note: The clause_is_mv_compatible() needs to identify the relation (so
that we can fetch the list of multivariate stats by OID).
planner_rt_fetch() seems like the appropriate way to get the relation
OID, but apparently it only works with simple vars. Maybe
examine_variable() would make this work with more complex vars too?
Includes regression tests analyzing functional dependencies (part of
ANALYZE) on several datasets (no dependencies, no transitive
dependencies, ...).
Checks that a query with conditions on two columns, where one (B) is
functionally dependent on the other one (A), correctly ignores the
clause on (B) and chooses bitmap index scan instead of plain index scan
(which is what happens otherwise, thanks to the assumption of
independence).
Note: Functional dependencies only work with equality clauses, no
inequalities etc.
---
src/backend/optimizer/path/clausesel.c | 891 +++++++++++++++++++++++++-
src/backend/utils/mvstats/README.stats | 36 ++
src/backend/utils/mvstats/common.c | 5 +-
src/backend/utils/mvstats/dependencies.c | 24 +
src/include/utils/mvstats.h | 16 +-
src/test/regress/expected/mv_dependencies.out | 172 +++++
src/test/regress/parallel_schedule | 3 +
src/test/regress/serial_schedule | 1 +
src/test/regress/sql/mv_dependencies.sql | 150 +++++
9 files changed, 1293 insertions(+), 5 deletions(-)
create mode 100644 src/backend/utils/mvstats/README.stats
create mode 100644 src/test/regress/expected/mv_dependencies.out
create mode 100644 src/test/regress/sql/mv_dependencies.sql
diff --git a/src/backend/optimizer/path/clausesel.c b/src/backend/optimizer/path/clausesel.c
index 02660c2..80708fe 100644
--- a/src/backend/optimizer/path/clausesel.c
+++ b/src/backend/optimizer/path/clausesel.c
@@ -14,14 +14,19 @@
*/
#include "postgres.h"
+#include "access/sysattr.h"
+#include "catalog/pg_operator.h"
#include "nodes/makefuncs.h"
#include "optimizer/clauses.h"
#include "optimizer/cost.h"
#include "optimizer/pathnode.h"
#include "optimizer/plancat.h"
+#include "optimizer/var.h"
#include "utils/fmgroids.h"
#include "utils/lsyscache.h"
+#include "utils/mvstats.h"
#include "utils/selfuncs.h"
+#include "utils/typcache.h"
/*
@@ -41,6 +46,23 @@ typedef struct RangeQueryClause
static void addRangeClause(RangeQueryClause **rqlist, Node *clause,
bool varonleft, bool isLTsel, Selectivity s2);
+#define MV_CLAUSE_TYPE_FDEP 0x01
+
+static bool clause_is_mv_compatible(Node *clause, Index relid, AttrNumber *attnum);
+
+static Bitmapset *collect_mv_attnums(List *clauses, Index relid);
+
+static int count_mv_attnums(List *clauses, Index relid);
+
+static int count_varnos(List *clauses, Index *relid);
+
+static List *clauselist_apply_dependencies(PlannerInfo *root, List *clauses,
+ Index relid, List *stats);
+
+static bool has_stats(List *stats, int type);
+
+static List * find_stats(PlannerInfo *root, Index relid);
+
/****************************************************************************
* ROUTINES TO COMPUTE SELECTIVITIES
@@ -60,7 +82,19 @@ static void addRangeClause(RangeQueryClause **rqlist, Node *clause,
* subclauses. However, that's only right if the subclauses have independent
* probabilities, and in reality they are often NOT independent. So,
* we want to be smarter where we can.
-
+ *
+ * The first thing we try to do is apply multivariate statistics, in a way
+ * that minimizes the overhead when there are no multivariate stats on the
+ * relation. Thus we do several simple (and inexpensive) checks first, to
+ * verify that suitable multivariate statistics exist.
+ *
+ * If we find suitable multivariate statistics, we try to apply them.
+ * Currently we only have (soft) functional dependencies, so we try to reduce
+ * the list of clauses.
+ *
+ * Then we remove the clauses estimated using multivariate stats, and process
+ * the rest of the clauses using the regular per-column stats.
+ *
* Currently, the only extra smarts we have is to recognize "range queries",
* such as "x > 34 AND x < 42". Clauses are recognized as possible range
* query components if they are restriction opclauses whose operators have
@@ -99,6 +133,22 @@ clauselist_selectivity(PlannerInfo *root,
RangeQueryClause *rqlist = NULL;
ListCell *l;
+ /* processing mv stats */
+ Oid relid = InvalidOid;
+
+ /* list of multivariate stats on the relation */
+ List *stats = NIL;
+
+ /*
+ * To fetch the statistics, we first need to determine the rel. Currently
+ * we only support estimates of simple restrictions with all Vars
+ * referencing a single baserel. However set_baserel_size_estimates() sets
+ * varRelid=0, so we have to actually inspect the clauses using pull_varnos
+ * and see if there's just a single varno referenced.
+ */
+ if ((count_varnos(clauses, &relid) == 1) && ((varRelid == 0) || (varRelid == relid)))
+ stats = find_stats(root, relid);
+
/*
* If there's exactly one clause, then no use in trying to match up pairs,
* so just go directly to clause_selectivity().
@@ -108,6 +158,24 @@ clauselist_selectivity(PlannerInfo *root,
varRelid, jointype, sjinfo);
/*
+ * Apply functional dependencies, but first check that there are some stats
+ * with functional dependencies built (by simply walking the stats list),
+ * and that there are two or more attributes referenced by clauses that
+ * may be reduced using functional dependencies.
+ *
+ * We would find that anyway when trying to actually apply the functional
+ * dependencies, but let's do the cheap checks first.
+ *
+ * After applying the functional dependencies we get the remaining clauses
+ * that need to be estimated by other types of stats (MCV, histograms etc).
+ */
+ if (has_stats(stats, MV_CLAUSE_TYPE_FDEP) &&
+ (count_mv_attnums(clauses, relid) >= 2))
+ {
+ clauses = clauselist_apply_dependencies(root, clauses, relid, stats);
+ }
+
+ /*
* Initial scan over clauses. Anything that doesn't look like a potential
* rangequery clause gets multiplied into s1 and forgotten. Anything that
* does gets inserted into an rqlist entry.
@@ -763,3 +831,824 @@ clause_selectivity(PlannerInfo *root,
return s1;
}
+
+/*
+ * Pull varattnos from the clauses, similarly to pull_varattnos() but:
+ *
+ * (a) only get attributes for a particular relation (relid)
+ * (b) ignore system attributes (we can't build stats on them anyway)
+ *
+ * This makes it possible to directly compare the result with attnum
+ * values from pg_attribute etc.
+ */
+static Bitmapset *
+get_varattnos(Node * node, Index relid)
+{
+ int k;
+ Bitmapset *varattnos = NULL;
+ Bitmapset *result = NULL;
+
+ /* get the varattnos */
+ pull_varattnos(node, relid, &varattnos);
+
+ k = -1;
+ while ((k = bms_next_member(varattnos, k)) >= 0)
+ {
+ if (k + FirstLowInvalidHeapAttributeNumber > 0)
+ result
+ = bms_add_member(result,
+ k + FirstLowInvalidHeapAttributeNumber);
+ }
+
+ bms_free(varattnos);
+
+ return result;
+}
+
+/*
+ * Collect attributes from mv-compatible clauses.
+ */
+static Bitmapset *
+collect_mv_attnums(List *clauses, Index relid)
+{
+ Bitmapset *attnums = NULL;
+ ListCell *l;
+
+ /*
+ * Walk through the clauses and identify the ones we can estimate
+ * using multivariate stats, and remember the relid/columns. We'll
+ * then cross-check if we have suitable stats, and only if needed
+ * we'll split the clauses into multivariate and regular lists.
+ *
+ * For now we're only interested in RestrictInfo nodes with nested
+ * OpExpr, using either a range or equality.
+ */
+ foreach (l, clauses)
+ {
+ AttrNumber attnum;
+ Node *clause = (Node *) lfirst(l);
+
+ /* ignore the result for now - we only need the info */
+ if (clause_is_mv_compatible(clause, relid, &attnum))
+ attnums = bms_add_member(attnums, attnum);
+ }
+
+ /*
+ * If there are not at least two attributes referenced by the clause(s),
+ * we can throw everything out (as we'll revert to simple stats).
+ */
+ if (bms_num_members(attnums) <= 1)
+ {
+ if (attnums != NULL)
+ pfree(attnums);
+ attnums = NULL;
+ }
+
+ return attnums;
+}
+
+/*
+ * Count the number of attributes in clauses compatible with multivariate stats.
+ */
+static int
+count_mv_attnums(List *clauses, Index relid)
+{
+ int c;
+ Bitmapset *attnums = collect_mv_attnums(clauses, relid);
+
+ c = bms_num_members(attnums);
+
+ bms_free(attnums);
+
+ return c;
+}
+
+/*
+ * Count varnos referenced in the clauses, and if there's a single varno then
+ * return the index in 'relid'.
+ */
+static int
+count_varnos(List *clauses, Index *relid)
+{
+ int cnt;
+ Bitmapset *varnos = NULL;
+
+ varnos = pull_varnos((Node *) clauses);
+ cnt = bms_num_members(varnos);
+
+ /* if there's a single varno in the clauses, remember it */
+ if (bms_num_members(varnos) == 1)
+ *relid = bms_singleton_member(varnos);
+
+ bms_free(varnos);
+
+ return cnt;
+}
+
+typedef struct
+{
+ Index varno; /* relid we're interested in */
+ Bitmapset *varattnos; /* attnums referenced by the clauses */
+} mv_compatible_context;
+
+/*
+ * Recursive walker that checks compatibility of the clause with multivariate
+ * statistics, and collects attnums from the Vars.
+ *
+ * XXX The original idea was to combine this with expression_tree_walker, but
+ * I've been unable to make that work - it seems it does not quite allow
+ * checking the structure. Hence the explicit calls to the walker.
+ */
+static bool
+mv_compatible_walker(Node *node, mv_compatible_context *context)
+{
+ if (node == NULL)
+ return false;
+
+ if (IsA(node, RestrictInfo))
+ {
+ RestrictInfo *rinfo = (RestrictInfo *) node;
+
+ /* Pseudoconstants are not really interesting here. */
+ if (rinfo->pseudoconstant)
+ return true;
+
+ /* clauses referencing multiple varnos are incompatible */
+ if (bms_membership(rinfo->clause_relids) != BMS_SINGLETON)
+ return true;
+
+ /* check the clause inside the RestrictInfo */
+ return mv_compatible_walker((Node*)rinfo->clause, (void *) context);
+ }
+
+ if (IsA(node, Var))
+ {
+ Var * var = (Var*)node;
+
+ /*
+ * Also, the variable needs to reference the right relid (this might be
+ * unnecessary given the other checks, but let's be sure).
+ */
+ if (var->varno != context->varno)
+ return true;
+
+ /* Also skip system attributes (we don't allow stats on those). */
+ if (! AttrNumberIsForUserDefinedAttr(var->varattno))
+ return true;
+
+ /* Seems fine, so let's remember the attnum. */
+ context->varattnos = bms_add_member(context->varattnos, var->varattno);
+
+ return false;
+ }
+
+ /*
+ * And finally the operator expressions - we only allow simple expressions
+ * with two arguments, where one is a Var and the other is a constant, and
+ * it's a simple comparison (which we detect using estimator function).
+ */
+ if (is_opclause(node))
+ {
+ OpExpr *expr = (OpExpr *) node;
+ Var *var;
+ bool varonleft = true;
+ bool ok;
+
+ /*
+ * Only expressions with two arguments are considered compatible.
+ *
+ * XXX Possibly unnecessary (can OpExpr have different arg count?).
+ */
+ if (list_length(expr->args) != 2)
+ return true;
+
+ /* see if it actually has the right shape (one Var, one pseudo-constant) */
+ ok = (NumRelids((Node*)expr) == 1) &&
+ (is_pseudo_constant_clause(lsecond(expr->args)) ||
+ (varonleft = false,
+ is_pseudo_constant_clause(linitial(expr->args))));
+
+ /* unsupported structure (two variables or so) */
+ if (! ok)
+ return true;
+
+ /*
+ * If it's not a "<" or ">" or "=" operator, just ignore the clause.
+ * Otherwise note the relid and attnum for the variable. This uses the
+ * function for estimating selectivity, not the operator directly (a bit
+ * awkward, but well ...).
+ */
+ switch (get_oprrest(expr->opno))
+ {
+ case F_EQSEL:
+
+ /* equality conditions are compatible with all statistics */
+ break;
+
+ default:
+
+ /* unknown estimator */
+ return true;
+ }
+
+ var = (varonleft) ? linitial(expr->args) : lsecond(expr->args);
+
+ return mv_compatible_walker((Node *) var, context);
+ }
+
+ /* Node not explicitly supported, so terminate */
+ return true;
+}
+
+/*
+ * Determines whether the clause is compatible with multivariate stats,
+ * and if it is, returns some additional information - varno (index
+ * into simple_rte_array) and a bitmap of attributes. This is then
+ * used to fetch related multivariate statistics.
+ *
+ * At this moment we only support basic conditions of the form
+ *
+ * variable OP constant
+ *
+ * where OP is one of [=,<,<=,>=,>] (which is however determined by
+ * looking at the associated function for estimating selectivity, just
+ * like with the single-dimensional case).
+ *
+ * TODO Support 'OR clauses' - shouldn't be all that difficult to
+ * evaluate them using multivariate stats.
+ */
+static bool
+clause_is_mv_compatible(Node *clause, Index relid, AttrNumber *attnum)
+{
+ mv_compatible_context context;
+
+ context.varno = relid;
+ context.varattnos = NULL; /* no attnums */
+
+ if (mv_compatible_walker(clause, (void *) &context))
+ return false;
+
+ /* remember the newly collected attnums */
+ *attnum = bms_singleton_member(context.varattnos);
+
+ return true;
+}
+
+/*
+ * collect attnums from functional dependencies
+ *
+ * Walk through all statistics on the relation, and collect attnums covered
+ * by those with functional dependencies. We only look at columns specified
+ * when creating the statistics, not at columns actually referenced by the
+ * dependencies (which may only be a subset of the attributes).
+ */
+static Bitmapset*
+fdeps_collect_attnums(List *stats)
+{
+ ListCell *lc;
+ Bitmapset *attnums = NULL;
+
+ foreach (lc, stats)
+ {
+ int j;
+ MVStatisticInfo *info = (MVStatisticInfo *)lfirst(lc);
+
+ int2vector *stakeys = info->stakeys;
+
+ /* skip stats without functional dependencies built */
+ if (! info->deps_built)
+ continue;
+
+ for (j = 0; j < stakeys->dim1; j++)
+ attnums = bms_add_member(attnums, stakeys->values[j]);
+ }
+
+ return attnums;
+}
+
+/* transforms bitmapset into an array (index => value) */
+static int*
+make_idx_to_attnum_mapping(Bitmapset *attnums)
+{
+ int attidx = 0;
+ int attnum;
+
+ int *mapping = (int*)palloc0(bms_num_members(attnums) * sizeof(int));
+
+ attnum = -1;
+ while ((attnum = bms_next_member(attnums, attnum)) >= 0)
+ mapping[attidx++] = attnum;
+
+ Assert(attidx == bms_num_members(attnums));
+
+ return mapping;
+}
+
+/* transforms bitmapset into an array (value => index) */
+static int*
+make_attnum_to_idx_mapping(Bitmapset *attnums)
+{
+ int attidx = 0;
+ int attnum;
+ int maxattnum = -1;
+ int *mapping;
+
+ attnum = -1;
+ while ((attnum = bms_next_member(attnums, attnum)) >= 0)
+ maxattnum = attnum;
+
+ mapping = (int*)palloc0((maxattnum+1) * sizeof(int));
+
+ attnum = -1;
+ while ((attnum = bms_next_member(attnums, attnum)) >= 0)
+ mapping[attnum] = attidx++;
+
+ Assert(attidx == bms_num_members(attnums));
+
+ return mapping;
+}
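
A small worked example of the two mappings (a standalone sketch, using plain
arrays instead of a Bitmapset): for attnums {3, 5, 8} the forward mapping is
{3, 5, 8} and the reverse array is sized by the maximum attnum:

    #include <assert.h>

    int
    main(void)
    {
        int     idx_to_attnum[] = {3, 5, 8};    /* index  => attnum */
        int     attnum_to_idx[8 + 1] = {0};     /* attnum => index */
        int     i;

        for (i = 0; i < 3; i++)
            attnum_to_idx[idx_to_attnum[i]] = i;

        assert(attnum_to_idx[5] == 1);
        assert(idx_to_attnum[attnum_to_idx[8]] == 8);
        return 0;
    }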
+
+/* build adjacency matrix for the dependencies */
+static bool*
+build_adjacency_matrix(List *stats, Bitmapset *attnums,
+ int *idx_to_attnum, int *attnum_to_idx)
+{
+ ListCell *lc;
+ int natts = bms_num_members(attnums);
+ bool *matrix = (bool*)palloc0(natts * natts * sizeof(bool));
+
+ foreach (lc, stats)
+ {
+ int j;
+ MVStatisticInfo *stat = (MVStatisticInfo *)lfirst(lc);
+ MVDependencies dependencies = NULL;
+
+ /* skip stats without functional dependencies built */
+ if (! stat->deps_built)
+ continue;
+
+ /* fetch and deserialize dependencies */
+ dependencies = load_mv_dependencies(stat->mvoid);
+ if (dependencies == NULL)
+ {
+ elog(WARNING, "failed to deserialize func deps %d", stat->mvoid);
+ continue;
+ }
+
+ /* set matrix[a,b] to 'true' if 'a=>b' */
+ for (j = 0; j < dependencies->ndeps; j++)
+ {
+ int aidx = attnum_to_idx[dependencies->deps[j]->a];
+ int bidx = attnum_to_idx[dependencies->deps[j]->b];
+
+ /* a => b */
+ matrix[aidx * natts + bidx] = true;
+ }
+ }
+
+ return matrix;
+}
+
+/*
+ * multiply the adjacency matrix
+ *
+ * By multiplying the adjacency matrix, we derive dependencies implied by those
+ * stored in the catalog (but possibly in several separate rows). We need to
+ * repeat the multiplication until no new dependencies are discovered. The
+ * maximum number of multiplications is equal to the number of attributes.
+ *
+ * This is based on modeling the functional dependencies as edges in a directed
+ * graph with attributes as vertices.
+ */
+static void
+multiply_adjacency_matrix(bool *matrix, int natts)
+{
+ int i;
+
+ /* repeat the multiplication up to natts-times */
+ for (i = 0; i < natts; i++)
+ {
+ bool changed = false; /* no changes in this round */
+ int k, l, m;
+
+ /* k => l */
+ for (k = 0; k < natts; k++)
+ {
+ for (l = 0; l < natts; l++)
+ {
+ /* skip already known dependencies */
+ if (matrix[k * natts + l])
+ continue;
+
+ /*
+ * compute (k,l) in the multiplied matrix
+ *
+ * We don't really care about the exact value, just true/false,
+ * so terminate the loop once we get a hit. Also, this makes it
+ * safe to modify the matrix in-place.
+ */
+ for (m = 0; m < natts; m++)
+ {
+ if (matrix[k * natts + m] && matrix[m * natts + l])
+ {
+ matrix[k * natts + l] = true;
+ changed = true;
+ break;
+ }
+ }
+ }
+ }
+
+ /* no transitive dependency added in this round, so terminate */
+ if (! changed)
+ break;
+ }
+}
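
To see what the multiplication derives, consider a standalone example (not
part of the patch) with three attributes and stored dependencies a => b and
b => c; the closure adds the transitive a => c:

    #include <stdbool.h>
    #include <stdio.h>

    int
    main(void)
    {
        bool    m[3][3] = {{false}};
        int     round, k, l, i;

        m[0][1] = true;     /* a => b */
        m[1][2] = true;     /* b => c */

        /* the same boolean "multiplication", up to natts rounds */
        for (round = 0; round < 3; round++)
            for (k = 0; k < 3; k++)
                for (l = 0; l < 3; l++)
                    for (i = 0; i < 3; i++)
                        if (m[k][i] && m[i][l])
                            m[k][l] = true;

        printf("a => c: %s\n", m[0][2] ? "yes" : "no");     /* yes */
        return 0;
    }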
+
+/*
+ * Reduce clauses using functional dependencies
+ *
+ * Walk through clauses and eliminate the redundant ones (implied by other
+ * clauses). This is done by first deriving a transitive closure of all the
+ * functional dependencies (by multiplying the adjacency matrix).
+ */
+static List*
+fdeps_reduce_clauses(List *clauses, Bitmapset *attnums, bool *matrix,
+ int *idx_to_attnum, int *attnum_to_idx, Index relid)
+{
+ int i;
+ ListCell *lc;
+ List *reduced_clauses = NIL;
+
+ int nmvclauses; /* size of the arrays */
+ bool *reduced;
+ AttrNumber *mvattnums;
+ Node **mvclauses;
+
+ int natts = bms_num_members(attnums);
+
+ /*
+ * Preallocate space for all clauses (the list only contains
+ * compatible clauses at this point). This makes it somewhat easier
+ * to access the stats / attnums randomly.
+ *
+ * XXX This assumes each clause references exactly one Var, so the
+ * arrays are sized accordingly - for functional dependencies
+ * this is safe, because it only works with Var=Const.
+ */
+ mvclauses = (Node**)palloc0(list_length(clauses) * sizeof(Node*));
+ mvattnums = (AttrNumber*)palloc0(list_length(clauses) * sizeof(AttrNumber));
+ reduced = (bool*)palloc0(list_length(clauses) * sizeof(bool));
+
+ /* fill the arrays */
+ nmvclauses = 0;
+ foreach (lc, clauses)
+ {
+ Node *clause = (Node *) lfirst(lc);
+ Bitmapset *clause_attnums = get_varattnos(clause, relid);
+
+ mvclauses[nmvclauses] = clause;
+ mvattnums[nmvclauses] = bms_singleton_member(clause_attnums);
+ nmvclauses++;
+ }
+
+ Assert(nmvclauses == list_length(clauses));
+
+ /* now try to reduce the clauses (using the dependencies) */
+ for (i = 0; i < nmvclauses; i++)
+ {
+ int j;
+
+ /* not covered by dependencies */
+ if (! bms_is_member(mvattnums[i], attnums))
+ continue;
+
+ /* this clause was already reduced, so let's skip it */
+ if (reduced[i])
+ continue;
+
+ /* walk the potentially 'implied' clauses */
+ for (j = 0; j < nmvclauses; j++)
+ {
+ int aidx, bidx;
+
+ /* not covered by dependencies */
+ if (! bms_is_member(mvattnums[j], attnums))
+ continue;
+
+ aidx = attnum_to_idx[mvattnums[i]];
+ bidx = attnum_to_idx[mvattnums[j]];
+
+ /* can't reduce the clause by itself, or if already reduced */
+ if ((i == j) || reduced[j])
+ continue;
+
+ /* mark the clause as reduced (if aidx => bidx) */
+ reduced[j] = matrix[aidx * natts + bidx];
+ }
+ }
+
+ /* now walk through the clauses, and keep only those not reduced */
+ for (i = 0; i < nmvclauses; i++)
+ if (! reduced[i])
+ reduced_clauses = lappend(reduced_clauses, mvclauses[i]);
+
+ pfree(reduced);
+ pfree(mvclauses);
+ pfree(mvattnums);
+
+ return reduced_clauses;
+}
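
Continuing the closure example from multiply_adjacency_matrix(): with the
closed matrix for a => b, b => c (and the derived a => c) and one equality
clause per attribute, only the clause on a survives. A standalone sketch of
the elimination loop, with clause i referencing attribute i:

    #include <stdbool.h>
    #include <stdio.h>

    int
    main(void)
    {
        /* transitive closure of a => b, b => c */
        bool    matrix[3][3] = {
            {false, true,  true },
            {false, false, true },
            {false, false, false}
        };
        bool    reduced[3] = {false, false, false};
        int     i, j;

        for (i = 0; i < 3; i++)
        {
            if (reduced[i])     /* same skip rules as above */
                continue;
            for (j = 0; j < 3; j++)
                if (i != j && !reduced[j])
                    reduced[j] = matrix[i][j];
        }

        for (i = 0; i < 3; i++)
            printf("clause %d: %s\n", i,
                   reduced[i] ? "reduced" : "kept");    /* only 0 kept */
        return 0;
    }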
+
+/*
+ * filter clauses that are interesting for the reduction step
+ *
+ * Functional dependencies can only work with equality clauses with attributes
+ * covered by at least one of the statistics, so we walk through the clauses
+ * and copy the uninteresting ones directly to the result (reduced) clauses.
+ *
+ * That includes clauses that:
+ * (a) are not mv-compatible
+ * (b) reference more than a single attnum
+ * (c) use an attnum not covered by functional dependencies
+ *
+ * The clauses interesting for the reduction step are copied to deps_clauses.
+ *
+ * root - planner root
+ * clauses - list of clauses (input)
+ * deps_attnums - attributes covered by dependencies
+ * reduced_clauses - resulting clauses (not subject to reduction step)
+ * deps_clauses - clauses to be processed by reduction
+ * relid - relid of the baserel
+ *
+ * The return value is a bitmap of attnums referenced by deps_clauses.
+ */
+static Bitmapset *
+fdeps_filter_clauses(PlannerInfo *root,
+ List *clauses, Bitmapset *deps_attnums,
+ List **reduced_clauses, List **deps_clauses,
+ Index relid)
+{
+ ListCell *lc;
+ Bitmapset *clause_attnums = NULL;
+
+ foreach (lc, clauses)
+ {
+ AttrNumber attnum;
+ Node *clause = (Node *) lfirst(lc);
+
+ if (! clause_is_mv_compatible(clause, relid, &attnum))
+
+ /* clause incompatible with functional dependencies */
+ *reduced_clauses = lappend(*reduced_clauses, clause);
+
+ else if (! bms_is_member(attnum, deps_attnums))
+
+ /* clause not covered by the dependencies */
+ *reduced_clauses = lappend(*reduced_clauses, clause);
+
+ else
+ {
+ *deps_clauses = lappend(*deps_clauses, clause);
+ clause_attnums = bms_add_member(clause_attnums, attnum);
+ }
+ }
+
+ return clause_attnums;
+}
+
+/*
+ * reduce list of equality clauses using soft functional dependencies
+ *
+ * We simply walk through list of functional dependencies, and for each one we
+ * check whether the dependency 'matches' the clauses, i.e. if there's a clause
+ * matching the condition. If yes, we attempt to remove all clauses matching
+ * the implied part of the dependency from the list.
+ *
+ * This only reduces equality clauses, and ignores all the other types. We might
+ * extend it to handle IS NULL clauses in the future.
+ *
+ * We also assume the equality clauses are 'compatible'. For example we can't
+ * identify when the clauses use a mismatching zip code and city name. In such
+ * case the usual approach (product of selectivities) would produce a better
+ * estimate, although mostly by chance.
+ *
+ * The implementation needs to be careful about cyclic dependencies, e.g. when
+ *
+ * (a -> b) and (b -> a)
+ *
+ * at the same time, which means there's a 1:1 relationship between the columns.
+ * In this case we must not reduce clauses on both attributes at the same time.
+ *
+ * TODO Currently we only apply functional dependencies at the same level, but
+ * maybe we could transfer the clauses from upper levels to the subtrees?
+ * For example let's say we have (a->b) dependency, and condition
+ *
+ * (a=1) AND (b=2 OR c=3)
+ *
+ * Currently, we won't be able to perform any reduction, because we'll
+ * consider (a=1) and (b=2 OR c=3) independently. But maybe we could pass
+ * (a=1) into the other expression, and only check it against conditions
+ * of the functional dependencies?
+ *
+ * In this case we'd end up with
+ *
+ * (a=1)
+ *
+ * as we'd consider (b=2) implied thanks to the rule, rendering the whole
+ * OR clause valid.
+ */
+static List *
+clauselist_apply_dependencies(PlannerInfo *root, List *clauses,
+ Index relid, List *stats)
+{
+ List *reduced_clauses = NIL;
+
+ /*
+ * matrix of (natts x natts), 1 means x=>y
+ *
+ * This serves two purposes - first, it merges dependencies from all
+ * the statistics, second it makes generating all the transitive
+ * dependencies easier.
+ *
+ * We need to build this only for attributes from the dependencies,
+ * not for all attributes in the table.
+ *
+ * We can't do that only for attributes from the clauses, because we
+ * want to build transitive dependencies (including those going
+ * through attributes not listed in the stats).
+ *
+ * This only works for A=>B dependencies, not sure how to do that
+ * for complex dependencies.
+ */
+ bool *deps_matrix;
+ int deps_natts; /* size of the matrix */
+
+ /* mapping attnum <=> matrix index */
+ int *deps_idx_to_attnum;
+ int *deps_attnum_to_idx;
+
+ /* attnums in dependencies and clauses (and intersection) */
+ List *deps_clauses = NIL;
+ Bitmapset *deps_attnums = NULL;
+ Bitmapset *clause_attnums = NULL;
+ Bitmapset *intersect_attnums = NULL;
+
+ /*
+ * Is there at least one statistics with functional dependencies?
+ * If not, return the original clauses right away.
+ *
+ * XXX Isn't this pointless, thanks to exactly the same check in
+ * clauselist_selectivity()? Can we trigger the condition here?
+ */
+ if (! has_stats(stats, MV_CLAUSE_TYPE_FDEP))
+ return clauses;
+
+ /*
+ * Build the dependency matrix, i.e. attribute adjacency matrix,
+ * where 1 means (a=>b). Once we have the adjacency matrix, we'll
+ * multiply it by itself, to get transitive dependencies.
+ *
+ * Note: This is pretty much transitive closure from graph theory.
+ *
+ * First, let's see what attributes are covered by functional
+ * dependencies (sides of the adjacency matrix), and also a maximum
+ * attribute (needed to size the attnum => index mapping).
+ */
+ deps_attnums = fdeps_collect_attnums(stats);
+
+ /*
+ * Walk through the clauses - clauses that are (one of)
+ *
+ * (a) not mv-compatible
+ * (b) are using more than a single attnum
+ * (c) using attnum not covered by functional depencencies
+ *
+ * may be copied directly to the result. The interesting clauses are
+ * kept in 'deps_clauses' and will be processed later.
+ */
+ clause_attnums = fdeps_filter_clauses(root, clauses, deps_attnums,
+ &reduced_clauses, &deps_clauses, relid);
+
+ /*
+ * we need at least two clauses referencing two different attributes
+ * referencing to do the reduction
+ */
+ if ((list_length(deps_clauses) < 2) || (bms_num_members(clause_attnums) < 2))
+ {
+ bms_free(clause_attnums);
+ list_free(reduced_clauses);
+ list_free(deps_clauses);
+
+ return clauses;
+ }
+
+
+ /*
+ * We need at least two matching attributes in the clauses and
+ * dependencies, otherwise we can't really reduce anything.
+ */
+ intersect_attnums = bms_intersect(clause_attnums, deps_attnums);
+ if (bms_num_members(intersect_attnums) < 2)
+ {
+ bms_free(clause_attnums);
+ bms_free(deps_attnums);
+ bms_free(intersect_attnums);
+
+ list_free(deps_clauses);
+ list_free(reduced_clauses);
+
+ return clauses;
+ }
+
+ /*
+ * Build mapping between matrix indexes and attnums, and then the
+ * adjacency matrix itself.
+ */
+ deps_idx_to_attnum = make_idx_to_attnum_mapping(deps_attnums);
+ deps_attnum_to_idx = make_attnum_to_idx_mapping(deps_attnums);
+
+ /* build the adjacency matrix */
+ deps_matrix = build_adjacency_matrix(stats, deps_attnums,
+ deps_idx_to_attnum,
+ deps_attnum_to_idx);
+
+ deps_natts = bms_num_members(deps_attnums);
+
+ /*
+ * Multiply the matrix N-times (N = size of the matrix), so that we
+ * get all the transitive dependencies. That makes the next step
+ * much easier and faster.
+ *
+ * This is essentially an adjacency matrix from graph theory, and
+ * by multiplying it we get transitive edges. We don't really care
+ * about the exact number (number of paths between vertices) though,
+ * so we can do the multiplication in-place (we don't care whether
+ * we found the dependency in this round or in the previous one).
+ *
+ * Track how many new dependencies were added, and stop when 0, but
+ * we can't multiply more than N-times (longest path in the graph).
+ */
+ multiply_adjacency_matrix(deps_matrix, deps_natts);
+
+ /*
+ * Walk through the clauses, and see which other clauses we may
+ * reduce. The matrix contains all transitive dependencies, which
+ * makes this very fast.
+ *
+ * We have to be careful not to reduce the clause using itself, or
+ * reducing all clauses forming a cycle (so we have to skip already
+ * eliminated clauses).
+ *
+ * I'm not sure whether this guarantees finding the best solution,
+ * i.e. reducing the most clauses, but it probably does (thanks to
+ * having all the transitive dependencies).
+ */
+ deps_clauses = fdeps_reduce_clauses(deps_clauses,
+ deps_attnums, deps_matrix,
+ deps_idx_to_attnum,
+ deps_attnum_to_idx, relid);
+
+ /* join the two lists of clauses */
+ reduced_clauses = list_union(reduced_clauses, deps_clauses);
+
+ pfree(deps_matrix);
+ pfree(deps_idx_to_attnum);
+ pfree(deps_attnum_to_idx);
+
+ bms_free(deps_attnums);
+ bms_free(clause_attnums);
+ bms_free(intersect_attnums);
+
+ return reduced_clauses;
+}
+
+/*
+ * Check that there are stats with at least one of the requested types.
+ */
+static bool
+has_stats(List *stats, int type)
+{
+ ListCell *s;
+
+ foreach (s, stats)
+ {
+ MVStatisticInfo *stat = (MVStatisticInfo *)lfirst(s);
+
+ if ((type & MV_CLAUSE_TYPE_FDEP) && stat->deps_built)
+ return true;
+ }
+
+ return false;
+}
+
+/*
+ * Look up stats for a given baserel.
+ */
+static List *
+find_stats(PlannerInfo *root, Index relid)
+{
+ Assert(root->simple_rel_array[relid] != NULL);
+
+ return root->simple_rel_array[relid]->mvstatlist;
+}
diff --git a/src/backend/utils/mvstats/README.stats b/src/backend/utils/mvstats/README.stats
new file mode 100644
index 0000000..a38ea7b
--- /dev/null
+++ b/src/backend/utils/mvstats/README.stats
@@ -0,0 +1,36 @@
+Multivariate statistics
+=======================
+
+When estimating various quantities (e.g. condition selectivities) the default
+approach relies on the assumption of independence. In practice that's often
+not true, resulting in estimation errors.
+
+Multivariate stats track different types of dependencies between the columns,
+hopefully improving the estimates.
+
+Currently we only have one kind of multivariate statistics - soft functional
+dependencies, and we use it to improve estimates of equality clauses. See
+README.dependencies for details.
+
+
+Selectivity estimation
+----------------------
+
+When estimating selectivity, we aim to achieve several things:
+
+ (a) maximize the estimate accuracy
+
+ (b) minimize the overhead, especially when no suitable multivariate stats
+ exist (so if you are not using multivariate stats, there's no overhead)
+
+Thus clauselist_selectivity() performs several inexpensive checks first, before
+even attempting the more expensive estimation.
+
+ (1) check if there are multivariate stats on the relation
+
+ (2) check there are at least two attributes referenced by clauses compatible
+ with multivariate statistics (equality clauses for func. dependencies)
+
+ (3) perform reduction of equality clauses using func. dependencies
+
+ (4) estimate the reduced list of clauses using regular statistics
diff --git a/src/backend/utils/mvstats/common.c b/src/backend/utils/mvstats/common.c
index a755c49..bd200bc 100644
--- a/src/backend/utils/mvstats/common.c
+++ b/src/backend/utils/mvstats/common.c
@@ -84,7 +84,8 @@ build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
/*
* Analyze functional dependencies of columns.
*/
- deps = build_mv_dependencies(numrows, rows, attrs, stats);
+ if (stat->deps_enabled)
+ deps = build_mv_dependencies(numrows, rows, attrs, stats);
/* store the histogram / MCV list in the catalog */
update_mv_stats(stat->mvoid, deps, attrs);
@@ -163,6 +164,7 @@ list_mv_stats(Oid relid)
info->mvoid = HeapTupleGetOid(htup);
info->stakeys = buildint2vector(stats->stakeys.values, stats->stakeys.dim1);
+ info->deps_enabled = stats->deps_enabled;
info->deps_built = stats->deps_built;
result = lappend(result, info);
@@ -274,6 +276,7 @@ compare_scalars_partition(const void *a, const void *b, void *arg)
return ApplySortComparator(da, false, db, false, ssup);
}
+
/* initialize multi-dimensional sort */
MultiSortSupport
multi_sort_init(int ndims)
diff --git a/src/backend/utils/mvstats/dependencies.c b/src/backend/utils/mvstats/dependencies.c
index 2a064a0..c80ba33 100644
--- a/src/backend/utils/mvstats/dependencies.c
+++ b/src/backend/utils/mvstats/dependencies.c
@@ -435,3 +435,27 @@ pg_mv_stats_dependencies_show(PG_FUNCTION_ARGS)
PG_RETURN_TEXT_P(cstring_to_text(result));
}
+
+MVDependencies
+load_mv_dependencies(Oid mvoid)
+{
+ bool isnull = false;
+ Datum deps;
+
+ /* fetch the pg_mv_statistic tuple for this statistics OID */
+ HeapTuple htup = SearchSysCache1(MVSTATOID, ObjectIdGetDatum(mvoid));
+
+#ifdef USE_ASSERT_CHECKING
+ Form_pg_mv_statistic mvstat = (Form_pg_mv_statistic) GETSTRUCT(htup);
+ Assert(mvstat->deps_enabled && mvstat->deps_built);
+#endif
+
+ deps = SysCacheGetAttr(MVSTATOID, htup,
+ Anum_pg_mv_statistic_stadeps, &isnull);
+
+ Assert(!isnull);
+
+ ReleaseSysCache(htup);
+
+ return deserialize_mv_dependencies(DatumGetByteaP(deps));
+}
diff --git a/src/include/utils/mvstats.h b/src/include/utils/mvstats.h
index 7ebd961..cc43a79 100644
--- a/src/include/utils/mvstats.h
+++ b/src/include/utils/mvstats.h
@@ -17,12 +17,20 @@
#include "fmgr.h"
#include "commands/vacuum.h"
+/*
+ * Degree of how much MCV item / histogram bucket matches a clause.
+ * This is then considered when computing the selectivity.
+ */
+#define MVSTATS_MATCH_NONE 0 /* no match at all */
+#define MVSTATS_MATCH_PARTIAL 1 /* partial match */
+#define MVSTATS_MATCH_FULL 2 /* full match */
#define MVSTATS_MAX_DIMENSIONS 8 /* max number of attributes */
-/* An associative rule, tracking [a => b] dependency.
- *
- * TODO Make this work with multiple columns on both sides.
+
+/*
+ * Functional dependencies, tracking column-level relationships (values
+ * in one column determine values in another one).
*/
typedef struct MVDependencyData {
int16 a;
@@ -48,6 +56,8 @@ typedef MVDependenciesData* MVDependencies;
* stats specified using flags (or something like that).
*/
+MVDependencies load_mv_dependencies(Oid mvoid);
+
bytea * serialize_mv_dependencies(MVDependencies dependencies);
/* deserialization of stats (serialization is private to analyze) */
diff --git a/src/test/regress/expected/mv_dependencies.out b/src/test/regress/expected/mv_dependencies.out
new file mode 100644
index 0000000..e759997
--- /dev/null
+++ b/src/test/regress/expected/mv_dependencies.out
@@ -0,0 +1,172 @@
+-- data type passed by value
+CREATE TABLE functional_dependencies (
+ a INT,
+ b INT,
+ c INT
+);
+-- unknown column
+CREATE STATISTICS s1 ON functional_dependencies (unknown_column) WITH (dependencies);
+ERROR: column "unknown_column" referenced in statistics does not exist
+-- single column
+CREATE STATISTICS s1 ON functional_dependencies (a) WITH (dependencies);
+ERROR: multivariate stats require 2 or more columns
+-- single column, duplicated
+CREATE STATISTICS s1 ON functional_dependencies (a,a) WITH (dependencies);
+ERROR: duplicate column name in statistics definition
+-- two columns, one duplicated
+CREATE STATISTICS s1 ON functional_dependencies (a, a, b) WITH (dependencies);
+ERROR: duplicate column name in statistics definition
+-- unknown option
+CREATE STATISTICS s1 ON functional_dependencies (a, b, c) WITH (unknown_option);
+ERROR: unrecognized STATISTICS option "unknown_option"
+-- correct command
+CREATE STATISTICS s1 ON functional_dependencies (a, b, c) WITH (dependencies);
+-- random data (no functional dependencies)
+INSERT INTO functional_dependencies
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+ANALYZE functional_dependencies;
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+ deps_enabled | deps_built | pg_mv_stats_dependencies_show
+--------------+------------+-------------------------------
+ t | f |
+(1 row)
+
+TRUNCATE functional_dependencies;
+-- a => b, a => c, b => c
+INSERT INTO functional_dependencies
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE functional_dependencies;
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+ deps_enabled | deps_built | pg_mv_stats_dependencies_show
+--------------+------------+-------------------------------
+ t | t | 1 => 2, 1 => 3, 2 => 3
+(1 row)
+
+TRUNCATE functional_dependencies;
+-- a => b, a => c
+INSERT INTO functional_dependencies
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE functional_dependencies;
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+ deps_enabled | deps_built | pg_mv_stats_dependencies_show
+--------------+------------+-------------------------------
+ t | t | 1 => 2, 1 => 3
+(1 row)
+
+TRUNCATE functional_dependencies;
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO functional_dependencies
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX fdeps_idx ON functional_dependencies (a, b);
+ANALYZE functional_dependencies;
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+ deps_enabled | deps_built | pg_mv_stats_dependencies_show
+--------------+------------+-------------------------------
+ t | t | 1 => 2, 1 => 3, 2 => 3
+(1 row)
+
+EXPLAIN (COSTS off)
+ SELECT * FROM functional_dependencies WHERE a = 10 AND b = 5;
+ QUERY PLAN
+---------------------------------------------
+ Bitmap Heap Scan on functional_dependencies
+ Recheck Cond: ((a = 10) AND (b = 5))
+ -> Bitmap Index Scan on fdeps_idx
+ Index Cond: ((a = 10) AND (b = 5))
+(4 rows)
+
+DROP TABLE functional_dependencies;
+-- varlena type (text)
+CREATE TABLE functional_dependencies (
+ a TEXT,
+ b TEXT,
+ c TEXT
+);
+CREATE STATISTICS s2 ON functional_dependencies (a, b, c) WITH (dependencies);
+-- random data (no functional dependencies)
+INSERT INTO functional_dependencies
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+ANALYZE functional_dependencies;
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+ deps_enabled | deps_built | pg_mv_stats_dependencies_show
+--------------+------------+-------------------------------
+ t | f |
+(1 row)
+
+TRUNCATE functional_dependencies;
+-- a => b, a => c, b => c
+INSERT INTO functional_dependencies
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE functional_dependencies;
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+ deps_enabled | deps_built | pg_mv_stats_dependencies_show
+--------------+------------+-------------------------------
+ t | t | 1 => 2, 1 => 3, 2 => 3
+(1 row)
+
+TRUNCATE functional_dependencies;
+-- a => b, a => c
+INSERT INTO functional_dependencies
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE functional_dependencies;
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+ deps_enabled | deps_built | pg_mv_stats_dependencies_show
+--------------+------------+-------------------------------
+ t | t | 1 => 2, 1 => 3
+(1 row)
+
+TRUNCATE functional_dependencies;
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO functional_dependencies
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX fdeps_idx ON functional_dependencies (a, b);
+ANALYZE functional_dependencies;
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+ deps_enabled | deps_built | pg_mv_stats_dependencies_show
+--------------+------------+-------------------------------
+ t | t | 1 => 2, 1 => 3, 2 => 3
+(1 row)
+
+EXPLAIN (COSTS off)
+ SELECT * FROM functional_dependencies WHERE a = '10' AND b = '5';
+ QUERY PLAN
+------------------------------------------------------------
+ Bitmap Heap Scan on functional_dependencies
+ Recheck Cond: ((a = '10'::text) AND (b = '5'::text))
+ -> Bitmap Index Scan on fdeps_idx
+ Index Cond: ((a = '10'::text) AND (b = '5'::text))
+(4 rows)
+
+DROP TABLE functional_dependencies;
+-- NULL values (mix of int and text columns)
+CREATE TABLE functional_dependencies (
+ a INT,
+ b TEXT,
+ c INT,
+ d TEXT
+);
+CREATE STATISTICS s3 ON functional_dependencies (a, b, c, d) WITH (dependencies);
+INSERT INTO functional_dependencies
+ SELECT
+ mod(i, 100),
+ (CASE WHEN mod(i, 200) = 0 THEN NULL ELSE mod(i,200) END),
+ mod(i, 400),
+ (CASE WHEN mod(i, 300) = 0 THEN NULL ELSE mod(i,600) END)
+ FROM generate_series(1,10000) s(i);
+ANALYZE functional_dependencies;
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+ deps_enabled | deps_built | pg_mv_stats_dependencies_show
+--------------+------------+----------------------------------------
+ t | t | 2 => 1, 3 => 1, 3 => 2, 4 => 1, 4 => 2
+(1 row)
+
+DROP TABLE functional_dependencies;
diff --git a/src/test/regress/parallel_schedule b/src/test/regress/parallel_schedule
index bec0316..4f2ffb8 100644
--- a/src/test/regress/parallel_schedule
+++ b/src/test/regress/parallel_schedule
@@ -110,3 +110,6 @@ test: event_trigger
# run stats by itself because its delay may be insufficient under heavy load
test: stats
+
+# run tests of multivariate stats
+test: mv_dependencies
diff --git a/src/test/regress/serial_schedule b/src/test/regress/serial_schedule
index 7e9b319..097a04f 100644
--- a/src/test/regress/serial_schedule
+++ b/src/test/regress/serial_schedule
@@ -162,3 +162,4 @@ test: with
test: xml
test: event_trigger
test: stats
+test: mv_dependencies
diff --git a/src/test/regress/sql/mv_dependencies.sql b/src/test/regress/sql/mv_dependencies.sql
new file mode 100644
index 0000000..48dea4d
--- /dev/null
+++ b/src/test/regress/sql/mv_dependencies.sql
@@ -0,0 +1,150 @@
+-- data type passed by value
+CREATE TABLE functional_dependencies (
+ a INT,
+ b INT,
+ c INT
+);
+
+-- unknown column
+CREATE STATISTICS s1 ON functional_dependencies (unknown_column) WITH (dependencies);
+
+-- single column
+CREATE STATISTICS s1 ON functional_dependencies (a) WITH (dependencies);
+
+-- single column, duplicated
+CREATE STATISTICS s1 ON functional_dependencies (a,a) WITH (dependencies);
+
+-- two columns, one duplicated
+CREATE STATISTICS s1 ON functional_dependencies (a, a, b) WITH (dependencies);
+
+-- unknown option
+CREATE STATISTICS s1 ON functional_dependencies (a, b, c) WITH (unknown_option);
+
+-- correct command
+CREATE STATISTICS s1 ON functional_dependencies (a, b, c) WITH (dependencies);
+
+-- random data (no functional dependencies)
+INSERT INTO functional_dependencies
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+
+ANALYZE functional_dependencies;
+
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+
+TRUNCATE functional_dependencies;
+
+-- a => b, a => c, b => c
+INSERT INTO functional_dependencies
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+
+ANALYZE functional_dependencies;
+
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+
+TRUNCATE functional_dependencies;
+
+-- a => b, a => c
+INSERT INTO functional_dependencies
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE functional_dependencies;
+
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+
+TRUNCATE functional_dependencies;
+
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO functional_dependencies
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX fdeps_idx ON functional_dependencies (a, b);
+ANALYZE functional_dependencies;
+
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+
+EXPLAIN (COSTS off)
+ SELECT * FROM functional_dependencies WHERE a = 10 AND b = 5;
+
+DROP TABLE functional_dependencies;
+
+-- varlena type (text)
+CREATE TABLE functional_dependencies (
+ a TEXT,
+ b TEXT,
+ c TEXT
+);
+
+CREATE STATISTICS s2 ON functional_dependencies (a, b, c) WITH (dependencies);
+
+-- random data (no functional dependencies)
+INSERT INTO functional_dependencies
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+
+ANALYZE functional_dependencies;
+
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+
+TRUNCATE functional_dependencies;
+
+-- a => b, a => c, b => c
+INSERT INTO functional_dependencies
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+
+ANALYZE functional_dependencies;
+
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+
+TRUNCATE functional_dependencies;
+
+-- a => b, a => c
+INSERT INTO functional_dependencies
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE functional_dependencies;
+
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+
+TRUNCATE functional_dependencies;
+
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO functional_dependencies
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX fdeps_idx ON functional_dependencies (a, b);
+ANALYZE functional_dependencies;
+
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+
+EXPLAIN (COSTS off)
+ SELECT * FROM functional_dependencies WHERE a = '10' AND b = '5';
+
+DROP TABLE functional_dependencies;
+
+-- NULL values (mix of int and text columns)
+CREATE TABLE functional_dependencies (
+ a INT,
+ b TEXT,
+ c INT,
+ d TEXT
+);
+
+CREATE STATISTICS s3 ON functional_dependencies (a, b, c, d) WITH (dependencies);
+
+INSERT INTO functional_dependencies
+ SELECT
+ mod(i, 100),
+ (CASE WHEN mod(i, 200) = 0 THEN NULL ELSE mod(i,200) END),
+ mod(i, 400),
+ (CASE WHEN mod(i, 300) = 0 THEN NULL ELSE mod(i,600) END)
+ FROM generate_series(1,10000) s(i);
+
+ANALYZE functional_dependencies;
+
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+
+DROP TABLE functional_dependencies;
--
2.1.0
Attachment: 0004-multivariate-MCV-lists.patch (text/x-patch)
From eea437d2d84469974efc8fbf2fddd926acbbd426 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@pgaddict.com>
Date: Mon, 6 Apr 2015 16:52:15 +0200
Subject: [PATCH 4/9] multivariate MCV lists
- extends the pg_mv_statistic catalog (add 'mcv' fields)
- building the MCV lists during ANALYZE
- simple estimation while planning the queries
Includes regression tests, mostly equal to regression tests for
functional dependencies.
---
doc/src/sgml/ref/create_statistics.sgml | 18 +
src/backend/catalog/system_views.sql | 4 +-
src/backend/commands/statscmds.c | 45 +-
src/backend/nodes/outfuncs.c | 2 +
src/backend/optimizer/path/clausesel.c | 829 ++++++++++++++++++++++-
src/backend/optimizer/util/plancat.c | 4 +-
src/backend/utils/mvstats/Makefile | 2 +-
src/backend/utils/mvstats/README.mcv | 137 ++++
src/backend/utils/mvstats/README.stats | 89 ++-
src/backend/utils/mvstats/common.c | 104 ++-
src/backend/utils/mvstats/common.h | 11 +-
src/backend/utils/mvstats/mcv.c | 1094 +++++++++++++++++++++++++++++++
src/bin/psql/describe.c | 25 +-
src/include/catalog/pg_mv_statistic.h | 18 +-
src/include/catalog/pg_proc.h | 4 +
src/include/nodes/relation.h | 2 +
src/include/utils/mvstats.h | 69 +-
src/test/regress/expected/mv_mcv.out | 207 ++++++
src/test/regress/expected/rules.out | 4 +-
src/test/regress/parallel_schedule | 2 +-
src/test/regress/serial_schedule | 1 +
src/test/regress/sql/mv_mcv.sql | 178 +++++
22 files changed, 2776 insertions(+), 73 deletions(-)
create mode 100644 src/backend/utils/mvstats/README.mcv
create mode 100644 src/backend/utils/mvstats/mcv.c
create mode 100644 src/test/regress/expected/mv_mcv.out
create mode 100644 src/test/regress/sql/mv_mcv.sql
diff --git a/doc/src/sgml/ref/create_statistics.sgml b/doc/src/sgml/ref/create_statistics.sgml
index a86eae3..193e4b0 100644
--- a/doc/src/sgml/ref/create_statistics.sgml
+++ b/doc/src/sgml/ref/create_statistics.sgml
@@ -132,6 +132,24 @@ CREATE STATISTICS [ IF NOT EXISTS ] <replaceable class="PARAMETER">statistics_na
</listitem>
</varlistentry>
+ <varlistentry>
+ <term><literal>max_mcv_items</> (<type>integer</>)</term>
+ <listitem>
+ <para>
+ Maximum number of MCV list items.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><literal>mcv</> (<type>boolean</>)</term>
+ <listitem>
+ <para>
+ Enables MCV list for the statistics.
+ </para>
+ </listitem>
+ </varlistentry>
+
</variablelist>
</refsect2>
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index b8a264e..2d570ee 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -165,7 +165,9 @@ CREATE VIEW pg_mv_stats AS
S.staname AS staname,
S.stakeys AS attnums,
length(S.stadeps) as depsbytes,
- pg_mv_stats_dependencies_info(S.stadeps) as depsinfo
+ pg_mv_stats_dependencies_info(S.stadeps) as depsinfo,
+ length(S.stamcv) AS mcvbytes,
+ pg_mv_stats_mcvlist_info(S.stamcv) AS mcvinfo
FROM (pg_mv_statistic S JOIN pg_class C ON (C.oid = S.starelid))
LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace);
diff --git a/src/backend/commands/statscmds.c b/src/backend/commands/statscmds.c
index 84a8b13..90bfaed 100644
--- a/src/backend/commands/statscmds.c
+++ b/src/backend/commands/statscmds.c
@@ -136,7 +136,13 @@ CreateStatistics(CreateStatsStmt *stmt)
ObjectAddress parentobject, childobject;
/* by default build nothing */
- bool build_dependencies = false;
+ bool build_dependencies = false,
+ build_mcv = false;
+
+ int32 max_mcv_items = -1;
+
+	/* set when an option requires 'mcv' to be enabled */
+ bool require_mcv = false;
Assert(IsA(stmt, CreateStatsStmt));
@@ -212,6 +218,29 @@ CreateStatistics(CreateStatsStmt *stmt)
if (strcmp(opt->defname, "dependencies") == 0)
build_dependencies = defGetBoolean(opt);
+ else if (strcmp(opt->defname, "mcv") == 0)
+ build_mcv = defGetBoolean(opt);
+ else if (strcmp(opt->defname, "max_mcv_items") == 0)
+ {
+ max_mcv_items = defGetInt32(opt);
+
+ /* this option requires 'mcv' to be enabled */
+ require_mcv = true;
+
+ /* sanity check */
+ if (max_mcv_items < MVSTAT_MCVLIST_MIN_ITEMS)
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("max number of MCV items must be at least %d",
+ MVSTAT_MCVLIST_MIN_ITEMS)));
+
+ else if (max_mcv_items > MVSTAT_MCVLIST_MAX_ITEMS)
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("max number of MCV items is %d",
+ MVSTAT_MCVLIST_MAX_ITEMS)));
+
+ }
else
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
@@ -220,10 +249,16 @@ CreateStatistics(CreateStatsStmt *stmt)
}
/* check that at least some statistics were requested */
- if (! build_dependencies)
+ if (! (build_dependencies || build_mcv))
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("no statistics type (dependencies, mcv) was requested")));
+
+ /* now do some checking of the options */
+ if (require_mcv && (! build_mcv))
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
- errmsg("no statistics type (dependencies) was requested")));
+ errmsg("option 'mcv' is required by other options(s)")));
/* sort the attnums and build int2vector */
qsort(attnums, numcols, sizeof(int16), compare_int16);
@@ -243,8 +278,12 @@ CreateStatistics(CreateStatsStmt *stmt)
values[Anum_pg_mv_statistic_stakeys -1] = PointerGetDatum(stakeys);
values[Anum_pg_mv_statistic_deps_enabled -1] = BoolGetDatum(build_dependencies);
+ values[Anum_pg_mv_statistic_mcv_enabled -1] = BoolGetDatum(build_mcv);
+
+ values[Anum_pg_mv_statistic_mcv_max_items -1] = Int32GetDatum(max_mcv_items);
nulls[Anum_pg_mv_statistic_stadeps -1] = true;
+ nulls[Anum_pg_mv_statistic_stamcv -1] = true;
/* insert the tuple into pg_mv_statistic */
mvstatrel = heap_open(MvStatisticRelationId, RowExclusiveLock);
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index 07206d7..333e24b 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -2162,9 +2162,11 @@ _outMVStatisticInfo(StringInfo str, const MVStatisticInfo *node)
/* enabled statistics */
WRITE_BOOL_FIELD(deps_enabled);
+ WRITE_BOOL_FIELD(mcv_enabled);
/* built/available statistics */
WRITE_BOOL_FIELD(deps_built);
+ WRITE_BOOL_FIELD(mcv_built);
}
static void
diff --git a/src/backend/optimizer/path/clausesel.c b/src/backend/optimizer/path/clausesel.c
index 80708fe..977f88e 100644
--- a/src/backend/optimizer/path/clausesel.c
+++ b/src/backend/optimizer/path/clausesel.c
@@ -15,6 +15,7 @@
#include "postgres.h"
#include "access/sysattr.h"
+#include "catalog/pg_collation.h"
#include "catalog/pg_operator.h"
#include "nodes/makefuncs.h"
#include "optimizer/clauses.h"
@@ -47,23 +48,51 @@ static void addRangeClause(RangeQueryClause **rqlist, Node *clause,
bool varonleft, bool isLTsel, Selectivity s2);
#define MV_CLAUSE_TYPE_FDEP 0x01
+#define MV_CLAUSE_TYPE_MCV 0x02
-static bool clause_is_mv_compatible(Node *clause, Index relid, AttrNumber *attnum);
+static bool clause_is_mv_compatible(Node *clause, Index relid, Bitmapset **attnums,
+ int type);
-static Bitmapset *collect_mv_attnums(List *clauses, Index relid);
+static Bitmapset *collect_mv_attnums(List *clauses, Index relid, int type);
-static int count_mv_attnums(List *clauses, Index relid);
+static int count_mv_attnums(List *clauses, Index relid, int type);
static int count_varnos(List *clauses, Index *relid);
static List *clauselist_apply_dependencies(PlannerInfo *root, List *clauses,
Index relid, List *stats);
+static MVStatisticInfo *choose_mv_statistics(List *mvstats, Bitmapset *attnums);
+
+static List *clauselist_mv_split(PlannerInfo *root, Index relid,
+ List *clauses, List **mvclauses,
+ MVStatisticInfo *mvstats, int types);
+
+static Selectivity clauselist_mv_selectivity(PlannerInfo *root,
+ List *clauses, MVStatisticInfo *mvstats);
+
+static Selectivity clauselist_mv_selectivity_mcvlist(PlannerInfo *root,
+ List *clauses, MVStatisticInfo *mvstats,
+ bool *fullmatch, Selectivity *lowsel);
+
+static int update_match_bitmap_mcvlist(PlannerInfo *root, List *clauses,
+ int2vector *stakeys, MCVList mcvlist,
+ int nmatches, char * matches,
+ Selectivity *lowsel, bool *fullmatch,
+ bool is_or);
+
static bool has_stats(List *stats, int type);
static List * find_stats(PlannerInfo *root, Index relid);
+/* used for merging bitmaps - AND (min), OR (max) */
+#define MAX(x, y) (((x) > (y)) ? (x) : (y))
+#define MIN(x, y) (((x) < (y)) ? (x) : (y))
+
+#define UPDATE_RESULT(m,r,isor) \
+ (m) = (isor) ? (MAX(m,r)) : (MIN(m,r))
+
/****************************************************************************
* ROUTINES TO COMPUTE SELECTIVITIES
****************************************************************************/
@@ -89,11 +118,13 @@ static List * find_stats(PlannerInfo *root, Index relid);
* to verify that suitable multivariate statistics exist.
*
* If we identify such multivariate statistics apply, we try to apply them.
- * Currently we only have (soft) functional dependencies, so we try to reduce
- * the list of clauses.
*
- * Then we remove the clauses estimated using multivariate stats, and process
- * the rest of the clauses using the regular per-column stats.
+ * First we try to reduce the list of clauses by applying (soft) functional
+ * dependencies, and then we try to estimate the selectivity of the reduced
+ * list of clauses using the multivariate MCV list.
+ *
+ * Finally we remove the portion of clauses estimated using multivariate stats,
+ * and process the rest of the clauses using the regular per-column stats.
*
* Currently, the only extra smarts we have is to recognize "range queries",
* such as "x > 34 AND x < 42". Clauses are recognized as possible range
@@ -170,12 +201,46 @@ clauselist_selectivity(PlannerInfo *root,
* that need to be estimated by other types of stats (MCV, histograms etc).
*/
if (has_stats(stats, MV_CLAUSE_TYPE_FDEP) &&
- (count_mv_attnums(clauses, relid) >= 2))
+ (count_mv_attnums(clauses, relid, MV_CLAUSE_TYPE_FDEP) >= 2))
{
clauses = clauselist_apply_dependencies(root, clauses, relid, stats);
}
/*
+	 * Check that there are statistics with a MCV list or a histogram, and
+	 * count the attributes covered by these types of statistics.
+ *
+ * If there are no such stats or not enough attributes, don't waste time
+ * with the multivariate code and simply skip to estimation using the
+ * regular per-column stats.
+ */
+ if (has_stats(stats, MV_CLAUSE_TYPE_MCV) &&
+ (count_mv_attnums(clauses, relid, MV_CLAUSE_TYPE_MCV) >= 2))
+ {
+ /* collect attributes from the compatible conditions */
+ Bitmapset *mvattnums = collect_mv_attnums(clauses, relid, MV_CLAUSE_TYPE_MCV);
+
+ /* and search for the statistic covering the most attributes */
+ MVStatisticInfo *mvstat = choose_mv_statistics(stats, mvattnums);
+
+		if (mvstat != NULL)	/* we have matching stats */
+ {
+ /* clauses compatible with multi-variate stats */
+ List *mvclauses = NIL;
+
+ /* split the clauselist into regular and mv-clauses */
+ clauses = clauselist_mv_split(root, relid, clauses, &mvclauses,
+ mvstat, MV_CLAUSE_TYPE_MCV);
+
+			/* we've chosen the statistics to match the clauses */
+ Assert(mvclauses != NIL);
+
+ /* compute the multivariate stats */
+ s1 *= clauselist_mv_selectivity(root, mvclauses, mvstat);
+ }
+ }
+
+ /*
* Initial scan over clauses. Anything that doesn't look like a potential
* rangequery clause gets multiplied into s1 and forgotten. Anything that
* does gets inserted into an rqlist entry.
@@ -832,6 +897,69 @@ clause_selectivity(PlannerInfo *root,
return s1;
}
+
+/*
+ * estimate selectivity of clauses using multivariate statistic
+ *
+ * Perform estimation of the clauses using a MCV list.
+ *
+ * This assumes all the clauses are compatible with the selected statistics
+ * (e.g. only reference columns covered by the statistics, use supported
+ * operator, etc.).
+ *
+ * TODO We may support some additional conditions, most importantly those
+ * matching multiple columns (e.g. "a = b" or "a < b").
+ *
+ * TODO Clamp the selectivity by the min of the per-clause selectivities (i.e.
+ *      the selectivity of the most restrictive clause), because that's the
+ *      maximum we can ever get from an ANDed list of clauses. This should help
+ *      prevent issues with hitting too many buckets in low-precision histograms.
+ *
+ * TODO We may remember the lowest frequency in the MCV list, and then later use
+ *      it as an upper boundary for the selectivity (had there been a more
+ *      frequent item, it'd be in the MCV list). This might improve cases with
+ *      low-detail histograms.
+ *
+ * TODO We may also derive some additional boundaries for the selectivity from
+ * the MCV list, because
+ *
+ * (a) if we have a "full equality condition" (one equality condition on
+ * each column of the statistic) and we found a match in the MCV list,
+ * then this is the final selectivity (and pretty accurate),
+ *
+ * (b) if we have a "full equality condition" and we haven't found a match
+ * in the MCV list, then the selectivity is below the lowest frequency
+ * found in the MCV list,
+ *
+ * TODO When applying the clauses to the histogram/MCV list, we can do
+ * that from the most selective clauses first, because that'll
+ * eliminate the buckets/items sooner (so we'll be able to skip
+ * them without inspection, which is more expensive). But this
+ * requires really knowing the per-clause selectivities in advance,
+ * and that's not what we do now.
+ */
+static Selectivity
+clauselist_mv_selectivity(PlannerInfo *root, List *clauses, MVStatisticInfo *mvstats)
+{
+ bool fullmatch = false;
+
+ /*
+ * Lowest frequency in the MCV list (may be used as an upper bound
+ * for full equality conditions that did not match any MCV item).
+ */
+ Selectivity mcv_low = 0.0;
+
+ /* TODO Evaluate simple 1D selectivities, use the smallest one as
+ * an upper bound, product as lower bound, and sort the
+ * clauses in ascending order by selectivity (to optimize the
+ * MCV/histogram evaluation).
+ */
+
+ /* Evaluate the MCV selectivity */
+ return clauselist_mv_selectivity_mcvlist(root, clauses, mvstats,
+ &fullmatch, &mcv_low);
+}
+
/*
* Pull varattnos from the clauses, similarly to pull_varattnos() but:
*
@@ -869,28 +997,26 @@ get_varattnos(Node * node, Index relid)
* Collect attributes from mv-compatible clauses.
*/
static Bitmapset *
-collect_mv_attnums(List *clauses, Index relid)
+collect_mv_attnums(List *clauses, Index relid, int types)
{
Bitmapset *attnums = NULL;
ListCell *l;
/*
- * Walk through the clauses and identify the ones we can estimate
- * using multivariate stats, and remember the relid/columns. We'll
- * then cross-check if we have suitable stats, and only if needed
- * we'll split the clauses into multivariate and regular lists.
+ * Walk through the clauses and identify the ones we can estimate using
+ * multivariate stats, and remember the relid/columns. We'll then
+ * cross-check if we have suitable stats, and only if needed we'll split
+ * the clauses into multivariate and regular lists.
*
- * For now we're only interested in RestrictInfo nodes with nested
- * OpExpr, using either a range or equality.
+ * For now we're only interested in RestrictInfo nodes with nested OpExpr,
+ * using either a range or equality.
*/
foreach (l, clauses)
{
- AttrNumber attnum;
Node *clause = (Node *) lfirst(l);
- /* ignore the result for now - we only need the info */
- if (clause_is_mv_compatible(clause, relid, &attnum))
- attnums = bms_add_member(attnums, attnum);
+ /* ignore the result here - we only need the attnums */
+ clause_is_mv_compatible(clause, relid, &attnums, types);
}
/*
@@ -911,10 +1037,10 @@ collect_mv_attnums(List *clauses, Index relid)
* Count the number of attributes in clauses compatible with multivariate stats.
*/
static int
-count_mv_attnums(List *clauses, Index relid)
+count_mv_attnums(List *clauses, Index relid, int type)
{
int c;
- Bitmapset *attnums = collect_mv_attnums(clauses, relid);
+ Bitmapset *attnums = collect_mv_attnums(clauses, relid, type);
c = bms_num_members(attnums);
@@ -944,9 +1070,183 @@ count_varnos(List *clauses, Index *relid)
return cnt;
}
+
+/*
+ * We're looking for statistics matching at least 2 attributes, referenced in
+ * clauses compatible with multivariate statistics. The current selection
+ * criterion is very simple - we choose the statistics referencing the most
+ * attributes.
+ *
+ * If there are multiple statistics referencing the same number of columns
+ * (from the clauses), the one with fewer source columns (as listed in ADD
+ * STATISTICS when creating the statistics) wins. Otherwise the first one wins.
+ *
+ * This is a very simple criterion, and it has several weaknesses:
+ *
+ * (a) does not consider the accuracy of the statistics
+ *
+ * If there are two histograms built on the same set of columns, but one
+ * has 100 buckets and the other one has 1000 buckets (thus likely
+ * providing better estimates), this is not currently considered.
+ *
+ * (b) does not consider the type of statistics
+ *
+ * If there are three statistics - one containing just a MCV list, another
+ * one with just a histogram and a third one with both, we treat them equally.
+ *
+ * (c) does not consider the number of clauses
+ *
+ * As explained, only the number of referenced attributes counts, so if
+ * there are multiple clauses on a single attribute, this still counts as
+ * a single attribute.
+ *
+ * (d) does not consider type of condition
+ *
+ * Some clauses may work better with some statistics - for example equality
+ * clauses probably work better with MCV lists than with histograms. But
+ * IS [NOT] NULL conditions may often work better with histograms (thanks
+ * to NULL-buckets).
+ *
+ * So for example with five WHERE conditions
+ *
+ * WHERE (a = 1) AND (b = 1) AND (c = 1) AND (d = 1) AND (e = 1)
+ *
+ * and statistics on (a,b), (a,b,e) and (a,b,c,d), the last one will be selected
+ * as it references the most columns.
+ *
+ * Once we have selected the multivariate statistics, we split the list of
+ * clauses into two parts - conditions that are compatible with the selected
+ * stats, and conditions that will be estimated using simple statistics.
+ *
+ * From the example above, conditions
+ *
+ * (a = 1) AND (b = 1) AND (c = 1) AND (d = 1)
+ *
+ * will be estimated using the multivariate statistics (a,b,c,d) while the last
+ * condition (e = 1) will get estimated using the regular ones.
+ *
+ * There are various alternative selection criteria (e.g. counting conditions
+ * instead of just referenced attributes), but eventually the best option should
+ * be to combine multiple statistics. But that's much harder to do correctly.
+ *
+ * TODO Select multiple statistics and combine them when computing the estimate.
+ *
+ * TODO This will probably have to consider compatibility of clauses, because
+ * 'dependencies' will probably work only with equality clauses.
+ */
+static MVStatisticInfo *
+choose_mv_statistics(List *stats, Bitmapset *attnums)
+{
+ int i;
+ ListCell *lc;
+
+ MVStatisticInfo *choice = NULL;
+
+ int current_matches = 1; /* goal #1: maximize */
+ int current_dims = (MVSTATS_MAX_DIMENSIONS+1); /* goal #2: minimize */
+
+ /*
+ * Walk through the statistics (simple array with nmvstats elements) and for
+ * each one count the referenced attributes (encoded in the 'attnums' bitmap).
+ */
+ foreach (lc, stats)
+ {
+ MVStatisticInfo *info = (MVStatisticInfo *)lfirst(lc);
+
+ /* columns matching this statistics */
+ int matches = 0;
+
+ int2vector * attrs = info->stakeys;
+ int numattrs = attrs->dim1;
+
+ /* skip dependencies-only stats */
+ if (! info->mcv_built)
+ continue;
+
+		/* count columns covered by the statistics */
+ for (i = 0; i < numattrs; i++)
+ if (bms_is_member(attrs->values[i], attnums))
+ matches++;
+
+ /*
+ * Use this statistics when it improves the number of matches or
+ * when it matches the same number of attributes but is smaller.
+ */
+ if ((matches > current_matches) ||
+ ((matches == current_matches) && (current_dims > numattrs)))
+ {
+ choice = info;
+ current_matches = matches;
+ current_dims = numattrs;
+ }
+ }
+
+ return choice;
+}
+
+
+/*
+ * This splits the clauses list into two parts - one containing clauses that
+ * will be evaluated using the chosen statistics, and the remaining clauses
+ * (either not mv-compatible, or not covered by the chosen statistics).
+ */
+static List *
+clauselist_mv_split(PlannerInfo *root, Index relid,
+ List *clauses, List **mvclauses,
+ MVStatisticInfo *mvstats, int types)
+{
+ int i;
+ ListCell *l;
+ List *non_mvclauses = NIL;
+
+ /* FIXME is there a better way to get info on int2vector? */
+ int2vector * attrs = mvstats->stakeys;
+ int numattrs = mvstats->stakeys->dim1;
+
+ Bitmapset *mvattnums = NULL;
+
+ /* build bitmap of attributes, so we can do bms_is_subset later */
+ for (i = 0; i < numattrs; i++)
+ mvattnums = bms_add_member(mvattnums, attrs->values[i]);
+
+ /* erase the list of mv-compatible clauses */
+ *mvclauses = NIL;
+
+ foreach (l, clauses)
+ {
+ bool match = false; /* by default not mv-compatible */
+ Bitmapset *attnums = NULL;
+ Node *clause = (Node *) lfirst(l);
+
+ if (clause_is_mv_compatible(clause, relid, &attnums, types))
+ {
+ /* are all the attributes part of the selected stats? */
+ if (bms_is_subset(attnums, mvattnums))
+ match = true;
+ }
+
+ /*
+ * The clause matches the selected stats, so put it to the list of
+ * mv-compatible clauses. Otherwise, keep it in the list of 'regular'
+ * clauses (that may be selected later).
+ */
+ if (match)
+ *mvclauses = lappend(*mvclauses, clause);
+ else
+ non_mvclauses = lappend(non_mvclauses, clause);
+ }
+
+	/*
+	 * The remaining clauses will be estimated using the regular per-column
+	 * statistics.
+	 */
+ return non_mvclauses;
+
+}
typedef struct
{
+	int types;			/* types of statistics to consider */
Index varno; /* relid we're interested in */
Bitmapset *varattnos; /* attnums referenced by the clauses */
} mv_compatible_context;
@@ -964,23 +1264,66 @@ mv_compatible_walker(Node *node, mv_compatible_context *context)
{
if (node == NULL)
return false;
-
+
if (IsA(node, RestrictInfo))
{
RestrictInfo *rinfo = (RestrictInfo *) node;
-
+
/* Pseudoconstants are not really interesting here. */
if (rinfo->pseudoconstant)
return true;
-
+
/* clauses referencing multiple varnos are incompatible */
if (bms_membership(rinfo->clause_relids) != BMS_SINGLETON)
return true;
-
+
/* check the clause inside the RestrictInfo */
return mv_compatible_walker((Node*)rinfo->clause, (void *) context);
}
+ if (or_clause(node) || and_clause(node) || not_clause(node))
+ {
+ /*
+ * AND/OR/NOT-clauses are supported if all sub-clauses are supported
+ *
+ * TODO We might support mixed case, where some of the clauses are
+ * supported and some are not, and treat all supported subclauses
+	 *		as a single clause, compute its selectivity using mv stats,
+ * and compute the total selectivity using the current algorithm.
+ *
+ * TODO For RestrictInfo above an OR-clause, we might use the orclause
+ * with nested RestrictInfo - we won't have to call pull_varnos()
+ * for each clause, saving time.
+ *
+ * TODO Perhaps this needs a bit more thought for functional
+ * dependencies? Those don't quite work for NOT cases.
+ */
+ BoolExpr *expr = (BoolExpr *) node;
+ ListCell *lc;
+
+ foreach (lc, expr->args)
+ {
+ if (mv_compatible_walker((Node *) lfirst(lc), context))
+ return true;
+ }
+
+ return false;
+ }
+
+ if (IsA(node, NullTest))
+ {
+ NullTest* nt = (NullTest*)node;
+
+ /*
+ * Only simple (Var IS NULL) expressions supported for now. Maybe we could
+ * use examine_variable to fix this?
+ */
+ if (! IsA(nt->arg, Var))
+ return true;
+
+ return mv_compatible_walker((Node*)(nt->arg), context);
+ }
+
if (IsA(node, Var))
{
Var * var = (Var*)node;
@@ -1031,7 +1374,7 @@ mv_compatible_walker(Node *node, mv_compatible_context *context)
/* unsupported structure (two variables or so) */
if (! ok)
return true;
-
+
/*
* If it's not a "<" or ">" or "=" operator, just ignore the clause.
* Otherwise note the relid and attnum for the variable. This uses the
@@ -1041,10 +1384,18 @@ mv_compatible_walker(Node *node, mv_compatible_context *context)
switch (get_oprrest(expr->opno))
{
case F_EQSEL:
-
/* equality conditions are compatible with all statistics */
break;
+ case F_SCALARLTSEL:
+ case F_SCALARGTSEL:
+
+ /* not compatible with functional dependencies */
+ if (! (context->types & MV_CLAUSE_TYPE_MCV))
+ return true; /* terminate */
+
+ break;
+
default:
/* unknown estimator */
@@ -1055,11 +1406,11 @@ mv_compatible_walker(Node *node, mv_compatible_context *context)
return mv_compatible_walker((Node *) var, context);
}
-
+
/* Node not explicitly supported, so terminate */
return true;
}
-
+
/*
* Determines whether the clause is compatible with multivariate stats,
* and if it is, returns some additional information - varno (index
@@ -1078,10 +1429,11 @@ mv_compatible_walker(Node *node, mv_compatible_context *context)
* evaluate them using multivariate stats.
*/
static bool
-clause_is_mv_compatible(Node *clause, Index relid, AttrNumber *attnum)
+clause_is_mv_compatible(Node *clause, Index relid, Bitmapset **attnums, int types)
{
mv_compatible_context context;
+ context.types = types;
context.varno = relid;
context.varattnos = NULL; /* no attnums */
@@ -1089,7 +1441,7 @@ clause_is_mv_compatible(Node *clause, Index relid, AttrNumber *attnum)
return false;
/* remember the newly collected attnums */
- *attnum = bms_singleton_member(context.varattnos);
+ *attnums = bms_add_members(*attnums, context.varattnos);
return true;
}
@@ -1394,24 +1746,39 @@ fdeps_filter_clauses(PlannerInfo *root,
foreach (lc, clauses)
{
- AttrNumber attnum;
+ Bitmapset *attnums = NULL;
Node *clause = (Node *) lfirst(lc);
- if (! clause_is_mv_compatible(clause, relid, &attnum))
+ if (! clause_is_mv_compatible(clause, relid, &attnums,
+ MV_CLAUSE_TYPE_FDEP))
/* clause incompatible with functional dependencies */
*reduced_clauses = lappend(*reduced_clauses, clause);
- else if (! bms_is_member(attnum, deps_attnums))
+ else if (bms_num_members(attnums) > 1)
+
+ /*
+			 * clause referencing multiple attributes (strange - shouldn't
+			 * this be handled by clause_is_mv_compatible directly?)
+ */
+ *reduced_clauses = lappend(*reduced_clauses, clause);
+
+ else if (! bms_is_member(bms_singleton_member(attnums), deps_attnums))
/* clause not covered by the dependencies */
*reduced_clauses = lappend(*reduced_clauses, clause);
else
{
+ /* ok, clause compatible with existing dependencies */
+ Assert(bms_num_members(attnums) == 1);
+
*deps_clauses = lappend(*deps_clauses, clause);
- clause_attnums = bms_add_member(clause_attnums, attnum);
+ clause_attnums = bms_add_member(clause_attnums,
+ bms_singleton_member(attnums));
}
+
+ bms_free(attnums);
}
return clause_attnums;
@@ -1637,6 +2004,9 @@ has_stats(List *stats, int type)
if ((type & MV_CLAUSE_TYPE_FDEP) && stat->deps_built)
return true;
+
+ if ((type & MV_CLAUSE_TYPE_MCV) && stat->mcv_built)
+ return true;
}
return false;
@@ -1652,3 +2022,392 @@ find_stats(PlannerInfo *root, Index relid)
return root->simple_rel_array[relid]->mvstatlist;
}
+
+/*
+ * Estimate selectivity of clauses using a MCV list.
+ *
+ * If there's no MCV list for the stats, the function returns 0.0.
+ *
+ * While computing the estimate, the function checks whether all the
+ * columns were matched with an equality condition. If that's the case,
+ * we can skip processing the histogram, as there can be no rows in
+ * it with the same values - all the rows matching the condition are
+ * represented by the MCV item. This can only happen with equality
+ * on all the attributes.
+ *
+ * The algorithm works like this:
+ *
+ * 1) mark all items as 'match'
+ * 2) walk through all the clauses
+ * 3) for a particular clause, walk through all the items
+ * 4) skip items that are already 'no match'
+ * 5) check clause for items that still match
+ * 6) sum frequencies for items to get selectivity
+ *
+ * The function also returns the frequency of the least frequent item
+ * on the MCV list, which may be useful for clamping estimate from the
+ * histogram (all items not present in the MCV list are less frequent).
+ * This however seems useful only for cases with conditions on all
+ * attributes.
+ *
+ * TODO This only handles AND-ed clauses, but it might work for OR-ed
+ * lists too - it just needs to reverse the logic a bit. I.e. start
+ * with 'no match' for all items, and mark the items as a match
+ * as the clauses are processed (and skip items that are 'match').
+ */
+static Selectivity
+clauselist_mv_selectivity_mcvlist(PlannerInfo *root, List *clauses,
+ MVStatisticInfo *mvstats, bool *fullmatch,
+ Selectivity *lowsel)
+{
+ int i;
+ Selectivity s = 0.0;
+ Selectivity u = 0.0;
+
+ MCVList mcvlist = NULL;
+ int nmatches = 0;
+
+ /* match/mismatch bitmap for each MCV item */
+ char * matches = NULL;
+
+ Assert(clauses != NIL);
+ Assert(list_length(clauses) >= 2);
+
+ /* there's no MCV list built yet */
+ if (! mvstats->mcv_built)
+ return 0.0;
+
+ mcvlist = load_mv_mcvlist(mvstats->mvoid);
+
+ Assert(mcvlist != NULL);
+ Assert(mcvlist->nitems > 0);
+
+ /* by default all the MCV items match the clauses fully */
+ matches = palloc0(sizeof(char) * mcvlist->nitems);
+ memset(matches, MVSTATS_MATCH_FULL, sizeof(char)*mcvlist->nitems);
+
+ /* number of matching MCV items */
+ nmatches = mcvlist->nitems;
+
+ nmatches = update_match_bitmap_mcvlist(root, clauses,
+ mvstats->stakeys, mcvlist,
+ nmatches, matches,
+ lowsel, fullmatch, false);
+
+ /* sum frequencies for all the matching MCV items */
+ for (i = 0; i < mcvlist->nitems; i++)
+ {
+ /* used to 'scale' for MCV lists not covering all tuples */
+ u += mcvlist->items[i]->frequency;
+
+ if (matches[i] != MVSTATS_MATCH_NONE)
+ s += mcvlist->items[i]->frequency;
+ }
+
+ pfree(matches);
+ pfree(mcvlist);
+
+ return s*u;
+}
+
+/*
+ * Evaluate clauses using the MCV list, and update the match bitmap.
+ *
+ * The bitmap may be already partially set, so this is really a way to
+ * combine results of several clause lists - either when computing
+ * conditional probability P(A|B) or a combination of AND/OR clauses.
+ *
+ * TODO This works with 'bitmap' where each bit is represented as a char,
+ * which is slightly wasteful. Instead, we could use a regular
+ * bitmap, reducing the size to ~1/8. Another thing is merging the
+ * bitmaps using & and |, which might be faster than min/max.
+ */
+static int
+update_match_bitmap_mcvlist(PlannerInfo *root, List *clauses,
+ int2vector *stakeys, MCVList mcvlist,
+ int nmatches, char * matches,
+ Selectivity *lowsel, bool *fullmatch,
+ bool is_or)
+{
+ int i;
+ ListCell * l;
+
+ Bitmapset *eqmatches = NULL; /* attributes with equality matches */
+
+ /* The bitmap may be partially built. */
+ Assert(nmatches >= 0);
+ Assert(nmatches <= mcvlist->nitems);
+ Assert(clauses != NIL);
+ Assert(list_length(clauses) >= 1);
+ Assert(mcvlist != NULL);
+ Assert(mcvlist->nitems > 0);
+
+	/* no more matches possible (AND), or all items already matching (OR) */
+ if (((nmatches == 0) && (! is_or)) ||
+ ((nmatches == mcvlist->nitems) && is_or))
+ return nmatches;
+
+ /*
+ * find the lowest frequency in the MCV list
+ *
+ * We need to do that here, because we do various tricks in the following
+ * code - skipping items already ruled out, etc.
+ *
+ * XXX A loop is necessary because the MCV list is not sorted by frequency.
+ */
+ *lowsel = 1.0;
+ for (i = 0; i < mcvlist->nitems; i++)
+ {
+ MCVItem item = mcvlist->items[i];
+
+ if (item->frequency < *lowsel)
+ *lowsel = item->frequency;
+ }
+
+ /*
+ * Loop through the list of clauses, and for each of them evaluate
+ * all the MCV items not yet eliminated by the preceding clauses.
+ */
+ foreach (l, clauses)
+ {
+ Node * clause = (Node*)lfirst(l);
+
+ /* if it's a RestrictInfo, then extract the clause */
+ if (IsA(clause, RestrictInfo))
+ clause = (Node*)((RestrictInfo*)clause)->clause;
+
+ /* if there are no remaining matches possible, we can stop */
+ if (((nmatches == 0) && (! is_or)) ||
+ ((nmatches == mcvlist->nitems) && is_or))
+ break;
+
+ /* it's either OpClause, or NullTest */
+ if (is_opclause(clause))
+ {
+ OpExpr *expr = (OpExpr*)clause;
+ bool varonleft = true;
+ bool ok;
+ FmgrInfo opproc;
+
+ /* get procedure computing operator selectivity */
+ RegProcedure oprrest = get_oprrest(expr->opno);
+
+ fmgr_info(get_opcode(expr->opno), &opproc);
+
+ ok = (NumRelids(clause) == 1) &&
+ (is_pseudo_constant_clause(lsecond(expr->args)) ||
+ (varonleft = false,
+ is_pseudo_constant_clause(linitial(expr->args))));
+
+ if (ok)
+ {
+
+ FmgrInfo gtproc;
+ Var * var = (varonleft) ? linitial(expr->args) : lsecond(expr->args);
+ Const * cst = (varonleft) ? lsecond(expr->args) : linitial(expr->args);
+ bool isgt = (! varonleft);
+
+ TypeCacheEntry *typecache
+ = lookup_type_cache(var->vartype, TYPECACHE_GT_OPR);
+
+				/* FIXME properly match the attribute to the dimension */
+ int idx = mv_get_index(var->varattno, stakeys);
+
+				fmgr_info(get_opcode(typecache->gt_opr), &gtproc);
+
+ /*
+ * Walk through the MCV items and evaluate the current clause. We can
+ * skip items that were already ruled out, and terminate if there are
+ * no remaining MCV items that might possibly match.
+ */
+ for (i = 0; i < mcvlist->nitems; i++)
+ {
+ bool mismatch = false;
+ MCVItem item = mcvlist->items[i];
+
+ /*
+ * If there are no more matches (AND) or no remaining unmatched
+ * items (OR), we can stop processing this clause.
+ */
+ if (((nmatches == 0) && (! is_or)) ||
+ ((nmatches == mcvlist->nitems) && is_or))
+ break;
+
+ /*
+ * For AND-lists, we can also mark NULL items as 'no match' (and
+ * then skip them). For OR-lists this is not possible.
+ */
+ if ((! is_or) && item->isnull[idx])
+ matches[i] = MVSTATS_MATCH_NONE;
+
+ /* skip MCV items that were already ruled out */
+ if ((! is_or) && (matches[i] == MVSTATS_MATCH_NONE))
+ continue;
+ else if (is_or && (matches[i] == MVSTATS_MATCH_FULL))
+ continue;
+
+ switch (oprrest)
+ {
+ case F_EQSEL:
+ /*
+ * We don't care about isgt in equality, because it does not
+ * matter whether it's (var = const) or (const = var).
+ */
+ mismatch = ! DatumGetBool(FunctionCall2Coll(&opproc,
+ DEFAULT_COLLATION_OID,
+ cst->constvalue,
+ item->values[idx]));
+
+ if (! mismatch)
+ eqmatches = bms_add_member(eqmatches, idx);
+
+ break;
+
+ case F_SCALARLTSEL: /* column < constant */
+ case F_SCALARGTSEL: /* column > constant */
+
+					/*
+					 * Check whether the constant is below the MCV item's
+					 * value (in that case the item cannot match this clause).
+					 */
+ mismatch = DatumGetBool(FunctionCall2Coll(&opproc,
+ DEFAULT_COLLATION_OID,
+ cst->constvalue,
+ item->values[idx]));
+
+ /* invert the result if isgt=true */
+ mismatch = (isgt) ? (! mismatch) : mismatch;
+ break;
+ }
+
+ /* XXX The conditions on matches[i] are not needed, as we
+ * skip MCV items that can't become true/false, depending
+ * on the current flag. See beginning of the loop over
+ * MCV items.
+ */
+
+ if ((is_or) && (matches[i] == MVSTATS_MATCH_NONE) && (! mismatch))
+ {
+ /* OR - was MATCH_NONE, but will be MATCH_FULL */
+ matches[i] = MVSTATS_MATCH_FULL;
+ ++nmatches;
+ continue;
+ }
+ else if ((! is_or) && (matches[i] == MVSTATS_MATCH_FULL) && mismatch)
+ {
+					/* AND - was MATCH_FULL, but will be MATCH_NONE */
+ matches[i] = MVSTATS_MATCH_NONE;
+ --nmatches;
+ continue;
+ }
+
+ }
+ }
+ }
+ else if (IsA(clause, NullTest))
+ {
+ NullTest * expr = (NullTest*)clause;
+ Var * var = (Var*)(expr->arg);
+
+			/* FIXME properly match the attribute to the dimension */
+ int idx = mv_get_index(var->varattno, stakeys);
+
+ /*
+ * Walk through the MCV items and evaluate the current clause. We can
+ * skip items that were already ruled out, and terminate if there are
+ * no remaining MCV items that might possibly match.
+ */
+ for (i = 0; i < mcvlist->nitems; i++)
+ {
+ MCVItem item = mcvlist->items[i];
+
+ /* if there are no more matches, we can stop processing this clause */
+ if (nmatches == 0)
+ break;
+
+ /* skip MCV items that were already ruled out */
+ if (matches[i] == MVSTATS_MATCH_NONE)
+ continue;
+
+ /* if the clause mismatches the MCV item, set it as MATCH_NONE */
+ if (((expr->nulltesttype == IS_NULL) && (! item->isnull[idx])) ||
+ ((expr->nulltesttype == IS_NOT_NULL) && (item->isnull[idx])))
+ {
+ matches[i] = MVSTATS_MATCH_NONE;
+ --nmatches;
+ }
+ }
+ }
+ else if (or_clause(clause) || and_clause(clause))
+ {
+ /* AND/OR clause, with all clauses compatible with the selected MV stat */
+
+ int i;
+ BoolExpr *orclause = ((BoolExpr*)clause);
+ List *orclauses = orclause->args;
+
+ /* match/mismatch bitmap for each MCV item */
+ int or_nmatches = 0;
+ char * or_matches = NULL;
+
+ Assert(orclauses != NIL);
+ Assert(list_length(orclauses) >= 2);
+
+ /* number of matching MCV items */
+ or_nmatches = mcvlist->nitems;
+
+			/* match bitmap, initialized below depending on AND/OR semantics */
+ or_matches = palloc0(sizeof(char) * or_nmatches);
+
+ if (or_clause(clause))
+ {
+ /* OR clauses assume nothing matches, initially */
+ memset(or_matches, MVSTATS_MATCH_NONE, sizeof(char)*or_nmatches);
+ or_nmatches = 0;
+ }
+ else
+ {
+				/* AND clauses assume everything matches, initially */
+ memset(or_matches, MVSTATS_MATCH_FULL, sizeof(char)*or_nmatches);
+ }
+
+ /* build the match bitmap for the OR-clauses */
+ or_nmatches = update_match_bitmap_mcvlist(root, orclauses,
+ stakeys, mcvlist,
+ or_nmatches, or_matches,
+ lowsel, fullmatch, or_clause(clause));
+
+			/* merge the bitmap into the existing one */
+ for (i = 0; i < mcvlist->nitems; i++)
+ {
+ /*
+ * To AND-merge the bitmaps, a MIN() semantics is used.
+ * For OR-merge, use MAX().
+ *
+ * FIXME this does not decrease the number of matches
+ */
+ UPDATE_RESULT(matches[i], or_matches[i], is_or);
+ }
+
+ pfree(or_matches);
+
+ }
+ else
+ {
+ elog(ERROR, "unknown clause type: %d", clause->type);
+ }
+ }
+
+ /*
+	 * If all the columns were matched by equality, it's a full match.
+	 * In this case there can be at most a single matching MCV item
+	 * (two such items would have to have exactly the same values).
+ */
+ *fullmatch = (bms_num_members(eqmatches) == mcvlist->ndimensions);
+
+ /* free the allocated pieces */
+ if (eqmatches)
+ pfree(eqmatches);
+
+ return nmatches;
+}
diff --git a/src/backend/optimizer/util/plancat.c b/src/backend/optimizer/util/plancat.c
index 31939dd..d807dc7 100644
--- a/src/backend/optimizer/util/plancat.c
+++ b/src/backend/optimizer/util/plancat.c
@@ -416,7 +416,7 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
mvstat = (Form_pg_mv_statistic) GETSTRUCT(htup);
/* unavailable stats are not interesting for the planner */
- if (mvstat->deps_built)
+ if (mvstat->deps_built || mvstat->mcv_built)
{
info = makeNode(MVStatisticInfo);
@@ -425,9 +425,11 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
/* enabled statistics */
info->deps_enabled = mvstat->deps_enabled;
+ info->mcv_enabled = mvstat->mcv_enabled;
/* built/available statistics */
info->deps_built = mvstat->deps_built;
+ info->mcv_built = mvstat->mcv_built;
/* stakeys */
adatum = SysCacheGetAttr(MVSTATOID, htup,
diff --git a/src/backend/utils/mvstats/Makefile b/src/backend/utils/mvstats/Makefile
index 099f1ed..f9bf10c 100644
--- a/src/backend/utils/mvstats/Makefile
+++ b/src/backend/utils/mvstats/Makefile
@@ -12,6 +12,6 @@ subdir = src/backend/utils/mvstats
top_builddir = ../../../..
include $(top_builddir)/src/Makefile.global
-OBJS = common.o dependencies.o
+OBJS = common.o dependencies.o mcv.o
include $(top_srcdir)/src/backend/common.mk
diff --git a/src/backend/utils/mvstats/README.mcv b/src/backend/utils/mvstats/README.mcv
new file mode 100644
index 0000000..e93cfe4
--- /dev/null
+++ b/src/backend/utils/mvstats/README.mcv
@@ -0,0 +1,137 @@
+MCV lists
+=========
+
+Multivariate MCV (most-common values) lists are a straightforward extension of
+the regular per-column MCV lists, tracking the most frequent combinations of
+values for a group of attributes.
+
+This works particularly well for columns with a small number of distinct values,
+as the list may include all the combinations and approximate the distribution
+very accurately.
+
+For columns with a large number of distinct values (e.g. those with continuous
+domains), the list will only track the most frequent combinations. If the
+distribution is mostly uniform (all combinations about equally frequent), the
+MCV list will be empty.
+
+Estimates of some clauses (e.g. equality) based on MCV lists are more accurate
+than when using histograms.
+
+Also, MCV lists don't necessarily require sorting of the values (the fact that
+we use sorting when building them is an implementation detail), but even more
+importantly the ordering is not built into the approximation (while histograms
+are built on ordering). So MCV lists work well even for attributes where the
+ordering of the data type is disconnected from the meaning of the data. For
+example we know how to sort strings, but it's unlikely to make much sense for
+city names (or other label-like attributes).
+
+
+Selectivity estimation
+----------------------
+
+The estimation, implemented in clauselist_mv_selectivity_mcvlist(), is quite
+simple in principle - we need to identify MCV items matching all the clauses
+and sum frequencies of all those items.
+
+Currently MCV lists support estimation of the following clause types:
+
+ (a) equality clauses WHERE (a = 1) AND (b = 2)
+ (b) inequality clauses WHERE (a < 1) AND (b >= 2)
+ (c) NULL clauses WHERE (a IS NULL) AND (b IS NOT NULL)
+ (d) OR clauses WHERE (a < 1) OR (b >= 2)
+
+It's possible to add support for additional clauses, for example:
+
+ (e) multi-var clauses WHERE (a > b)
+
+and possibly others. These are tasks for the future, not yet implemented.
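+
+For example, all of these clause types might be estimated from a single MCV
+list on (a, b) - a sketch, with a hypothetical table t:
+
+    SELECT * FROM t WHERE (a = 1) AND (b < 10);
+
+    SELECT * FROM t WHERE (a IS NULL) OR (b >= 2);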
+
+
+Estimating equality clauses
+---------------------------
+
+When computing selectivity estimate for equality clauses
+
+ (a = 1) AND (b = 2)
+
+we can do this estimate pretty exactly assuming that two conditions are met:
+
+ (1) there's an equality condition on all attributes of the statistic
+
+ (2) we find a matching item in the MCV list
+
+In this case we know the MCV item represents all tuples matching the clauses,
+and the selectivity estimate is complete (i.e. we don't need to perform
+estimation using the histogram). This is what we call 'full match'.
+
+When only (1) holds, but there's no matching MCV item, we don't know whether
+there are no such rows, or whether they are just not very frequent. We can
+however use the frequency of the least frequent MCV item as an upper bound for
+the selectivity.
+
+For a combination of equality conditions (not full-match case) we can clamp the
+selectivity by the minimum of selectivities for each condition. For example if
+we know the number of distinct values for each column, we can use 1/ndistinct
+as a per-column estimate. Or rather 1/ndistinct + selectivity derived from the
+MCV list.
+
+We should also probably only use the 'residual ndistinct' by excluding the items
+included in the MCV list (and also the residual frequency):
+
+ f = (1.0 - sum(MCV frequencies)) / (ndistinct - ndistinct(MCV list))
+
+but it's worth pointing out the ndistinct values are multi-variate for the
+columns referenced by the equality conditions.
+
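+For illustration, with made-up numbers: if the MCV list covers 80% of the rows
+(sum of frequencies 0.8) and 100 out of 1100 distinct combinations, then
+
+    f = (1.0 - 0.8) / (1100 - 100) = 0.0002
+
+i.e. each combination not on the list would be estimated as 0.02% of the table.
+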
+Note: Only the "full match" limit is currently implemented.
+
+
+Hashed MCV (not yet implemented)
+--------------------------------
+
+Regular MCV lists have to include actual values for each item, so if those items
+are large the list may be quite large. This is especially true for multi-variate
+MCV lists, although the current implementation partially mitigates this by
+de-duplicating the values before storing them on disk.
+
+It's possible to only store hashes (32-bit values) instead of the actual values,
+significantly reducing the space requirements. Obviously, this would only make
+the MCV lists useful for estimating equality conditions (assuming the 32-bit
+hashes make the collisions rare enough).
+
+This might also complicate matching the columns to available stats.
+
+
+TODO Consider implementing hashed MCV list, storing just 32-bit hashes instead
+ of the actual values. This type of MCV list will be useful only for
+ estimating equality clauses, and will reduce space requirements for large
+ varlena types (in such cases we usually only want equality anyway).
+
+TODO Currently there's no logic to consider building only a MCV list (and not
+ building the histogram at all), except for doing this decision manually in
+ ADD STATISTICS.
+
+
+Inspecting the MCV list
+-----------------------
+
+Inspecting the regular (per-attribute) MCV lists is trivial, as it's enough
+to select the columns from pg_stats - the data is encoded as anyarrays, so we
+simply get the text representation of the arrays.
+
+With multivariate MCV lists it's not that simple due to the possible mix of
+data types. It might be possible to produce a similar array-like representation,
+but that'd unnecessarily complicate further processing and analysis of the MCV
+list. Instead, there's an SRF function providing the values, frequencies etc.
+
+ SELECT * FROM pg_mv_mcv_items();
+
+It has a single input parameter:
+
+ oid - OID of the MCV list (pg_mv_statistic.staoid)
+
+and produces a table with these columns:
+
+ - item ID (0...nitems-1)
+ - values (string array)
+ - nulls only (boolean array)
+ - frequency (double precision)
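+
+For example (a sketch - this assumes the statistics OID is available as the
+oid column of the pg_mv_statistic catalog):
+
+    SELECT * FROM pg_mv_mcv_items(
+        (SELECT oid FROM pg_mv_statistic
+          WHERE starelid = 'some_table'::regclass));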
diff --git a/src/backend/utils/mvstats/README.stats b/src/backend/utils/mvstats/README.stats
index a38ea7b..5c5c59a 100644
--- a/src/backend/utils/mvstats/README.stats
+++ b/src/backend/utils/mvstats/README.stats
@@ -8,9 +8,50 @@ not true, resulting in estimation errors.
Multivariate stats track different types of dependencies between the columns,
hopefully improving the estimates.
-Currently we only have one kind of multivariate statistics - soft functional
-dependencies, and we use it to improve estimates of equality clauses. See
-README.dependencies for details.
+
+Types of statistics
+-------------------
+
+Currently we only have two kinds of multivariate statistics
+
+ (a) soft functional dependencies (README.dependencies)
+
+ (b) MCV lists (README.mcv)
+
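+Both kinds may be requested in a single definition - a sketch, with
+hypothetical table/statistics names (the MCV list size may be tuned using the
+max_mcv_items option):
+
+    CREATE STATISTICS s ON t (a, b) WITH (dependencies, mcv);
+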
+
+Compatible clause types
+-----------------------
+
+Each type of statistics may be used to estimate some subset of clause types.
+
+ (a) functional dependencies - equality clauses (AND), possibly IS NULL
+
+ (b) MCV list - equality and inequality clauses, IS [NOT] NULL, AND/OR
+
+Currently only simple operator clauses (Var op Const) are supported, but it's
+possible to support more complex clause types, e.g. (Var op Var).
+
+
+Complex clauses
+---------------
+
+We also support estimating more complex clauses - essentially AND/OR clauses
+with (Var op Const) as leaves, as long as all the referenced attributes are
+covered by a single statistics.
+
+For example this condition
+
+ (a=1) AND ((b=2) OR ((c=3) AND (d=4)))
+
+may be estimated using statistics on (a,b,c,d). If we only have statistics on
+(b,c,d) we may estimate the second part, and estimate (a=1) using simple stats.
+
+If we only have statistics on (a,b,c) we can't apply it at all at this point,
+but it's worth pointing out clauselist_selectivity() works recursively and when
+handling the second part (the OR-clause), we'll be able to apply the statistics.
+
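+A sketch of defining statistics suitable for the example above (hypothetical
+table/statistics names):
+
+    CREATE STATISTICS s ON t (a, b, c, d) WITH (mcv);
+    ANALYZE t;
+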
+Note: The multi-statistics estimation patch also makes it possible to pass some
+clauses as 'conditions' into the deeper parts of the expression tree.
Selectivity estimation
@@ -23,14 +64,48 @@ When estimating selectivity, we aim to achieve several things:
(b) minimize the overhead, especially when no suitable multivariate stats
exist (so if you are not using multivariate stats, there's no overhead)
-This clauselist_selectivity() performs several inexpensive checks first, before
+Thus clauselist_selectivity() performs several inexpensive checks first, before
even attempting to do the more expensive estimation.
(1) check if there are multivariate stats on the relation
- (2) check there are at least two attributes referenced by clauses compatible
- with multivariate statistics (equality clauses for func. dependencies)
+ (2) check that there are functional dependencies on the table, and that
+ there are at least two attributes referenced by compatible clauses
+ (equality clauses for func. dependencies)
(3) perform reduction of equality clauses using func. dependencies
- (4) estimate the reduced list of clauses using regular statistics
+ (4) check that there are multivariate MCV lists on the table, and that
+ there are at least two attributes referenced by compatible clauses
+ (equalities, inequalities, etc.)
+
+ (5) find the best multivariate statistics (matching the most conditions)
+ and use it to compute the estimate
+
+ (6) estimate the remaining clauses (not estimated using multivariate stats)
+ using the regular per-column statistics
+
+Whenever we find there are no suitable stats, we skip the expensive steps.
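+
+A minimal pseudocode sketch of that control flow (illustrative only - the
+helper names here are made up, not actual functions from the patch):
+
+    clauselist_selectivity(clauses)
+    {
+        if (!has_mv_stats(rel))                        /* step (1) */
+            return estimate_per_column(clauses);
+
+        clauses = reduce_with_dependencies(clauses);   /* steps (2), (3) */
+
+        stats = choose_best_mcv_stats(rel, clauses);   /* steps (4), (5) */
+        if (stats == NULL)
+            return estimate_per_column(clauses);
+
+        s = estimate_with_mcv(stats, clauses, &remaining);
+        return s * estimate_per_column(remaining);     /* step (6) */
+    }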
+
+
+Further (possibly crazy) ideas
+------------------------------
+
+Currently the clauses are only estimated using a single statistics, even if
+there are multiple candidate statistics - for example assume we have statistics
+on (a,b,c) and (b,c,d), and estimate conditions
+
+ (b = 1) AND (c = 2)
+
+Then both statistics may be used, but we only use one of them. Maybe we could
+compute estimates using all the candidate stats, and somehow aggregate them
+into the final estimate, e.g. by using the average or median.
+
+Some stats may give better estimates than others, but it's very difficult to say
+in advance which stats are the best (it depends on the number of buckets, number
+of additional columns not referenced in the clauses, type of condition etc.).
+
+But of course, this may result in expensive estimation (CPU-wise).
+
+So we might add a GUC to choose between the simple (single statistics) and
+the multi-statistic estimation, possibly with a table-level parameter
+(ALTER TABLE ...).
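+
+A purely illustrative sketch of the median variant (no such code exists in
+the patch; compare_selectivity is a made-up comparator):
+
+    /* 'estimates' holds selectivities computed from each candidate stats */
+    qsort(estimates, nestimates, sizeof(Selectivity), compare_selectivity);
+    result = estimates[nestimates / 2];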
diff --git a/src/backend/utils/mvstats/common.c b/src/backend/utils/mvstats/common.c
index bd200bc..d1da714 100644
--- a/src/backend/utils/mvstats/common.c
+++ b/src/backend/utils/mvstats/common.c
@@ -16,12 +16,14 @@
#include "common.h"
+#include "utils/array.h"
+
static VacAttrStats ** lookup_var_attr_stats(int2vector *attrs,
- int natts, VacAttrStats **vacattrstats);
+ int natts,
+ VacAttrStats **vacattrstats);
static List* list_mv_stats(Oid relid);
-
/*
* Compute requested multivariate stats, using the rows sampled for the
* plain (single-column) stats.
@@ -49,6 +51,8 @@ build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
int j;
MVStatisticInfo *stat = (MVStatisticInfo *)lfirst(lc);
MVDependencies deps = NULL;
+ MCVList mcvlist = NULL;
+ int numrows_filtered = 0;
VacAttrStats **stats = NULL;
int numatts = 0;
@@ -87,8 +91,12 @@ build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
if (stat->deps_enabled)
deps = build_mv_dependencies(numrows, rows, attrs, stats);
+ /* build the MCV list */
+ if (stat->mcv_enabled)
+ mcvlist = build_mv_mcvlist(numrows, rows, attrs, stats, &numrows_filtered);
+
/* store the histogram / MCV list in the catalog */
- update_mv_stats(stat->mvoid, deps, attrs);
+ update_mv_stats(stat->mvoid, deps, mcvlist, attrs, stats);
}
}
@@ -166,6 +174,8 @@ list_mv_stats(Oid relid)
info->stakeys = buildint2vector(stats->stakeys.values, stats->stakeys.dim1);
info->deps_enabled = stats->deps_enabled;
info->deps_built = stats->deps_built;
+ info->mcv_enabled = stats->mcv_enabled;
+ info->mcv_built = stats->mcv_built;
result = lappend(result, info);
}
@@ -180,8 +190,56 @@ list_mv_stats(Oid relid)
return result;
}
+
+/*
+ * Find attnums of MV stats using the mvoid.
+ */
+int2vector*
+find_mv_attnums(Oid mvoid, Oid *relid)
+{
+ ArrayType *arr;
+ Datum adatum;
+ bool isnull;
+ HeapTuple htup;
+ int2vector *keys;
+
+ /* Fetch the pg_mv_statistic tuple for the given OID. */
+ htup = SearchSysCache1(MVSTATOID,
+ ObjectIdGetDatum(mvoid));
+
+ /* XXX syscache contains OIDs of deleted stats (not invalidated) */
+ if (! HeapTupleIsValid(htup))
+ return NULL;
+
+ /* starelid */
+ adatum = SysCacheGetAttr(MVSTATOID, htup,
+ Anum_pg_mv_statistic_starelid, &isnull);
+ Assert(!isnull);
+
+ *relid = DatumGetObjectId(adatum);
+
+ /* stakeys */
+ adatum = SysCacheGetAttr(MVSTATOID, htup,
+ Anum_pg_mv_statistic_stakeys, &isnull);
+ Assert(!isnull);
+
+ arr = DatumGetArrayTypeP(adatum);
+
+ keys = buildint2vector((int16 *) ARR_DATA_PTR(arr),
+ ARR_DIMS(arr)[0]);
+ ReleaseSysCache(htup);
+
+ /* TODO Maybe save the list into relcache, as in RelationGetIndexList
+ * (which served as inspiration for this function)? */
+
+ return keys;
+}
+
+
void
-update_mv_stats(Oid mvoid, MVDependencies dependencies, int2vector *attrs)
+update_mv_stats(Oid mvoid,
+ MVDependencies dependencies, MCVList mcvlist,
+ int2vector *attrs, VacAttrStats **stats)
{
HeapTuple stup,
oldtup;
@@ -206,18 +264,29 @@ update_mv_stats(Oid mvoid, MVDependencies dependencies, int2vector *attrs)
= PointerGetDatum(serialize_mv_dependencies(dependencies));
}
+ if (mcvlist != NULL)
+ {
+ bytea * data = serialize_mv_mcvlist(mcvlist, attrs, stats);
+ nulls[Anum_pg_mv_statistic_stamcv -1] = (data == NULL);
+ values[Anum_pg_mv_statistic_stamcv - 1] = PointerGetDatum(data);
+ }
+
/* always replace the value (either by bytea or NULL) */
replaces[Anum_pg_mv_statistic_stadeps -1] = true;
+ replaces[Anum_pg_mv_statistic_stamcv -1] = true;
/* always change the availability flags */
nulls[Anum_pg_mv_statistic_deps_built -1] = false;
+ nulls[Anum_pg_mv_statistic_mcv_built -1] = false;
nulls[Anum_pg_mv_statistic_stakeys-1] = false;
/* use the new attnums, in case we removed some dropped ones */
replaces[Anum_pg_mv_statistic_deps_built-1] = true;
+ replaces[Anum_pg_mv_statistic_mcv_built -1] = true;
replaces[Anum_pg_mv_statistic_stakeys -1] = true;
values[Anum_pg_mv_statistic_deps_built-1] = BoolGetDatum(dependencies != NULL);
+ values[Anum_pg_mv_statistic_mcv_built -1] = BoolGetDatum(mcvlist != NULL);
values[Anum_pg_mv_statistic_stakeys -1] = PointerGetDatum(attrs);
/* Is there already a pg_mv_statistic tuple for this attribute? */
@@ -246,6 +315,21 @@ update_mv_stats(Oid mvoid, MVDependencies dependencies, int2vector *attrs)
heap_close(sd, RowExclusiveLock);
}
+
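+/*
+ * mv_get_index
+ *		Returns the index of a column (dimension) within the stats.
+ *
+ * This assumes the stakeys vector is sorted, so the index is simply
+ * the number of keys smaller than varattno.
+ */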
+int
+mv_get_index(AttrNumber varattno, int2vector * stakeys)
+{
+ int i, idx = 0;
+ for (i = 0; i < stakeys->dim1; i++)
+ {
+ if (stakeys->values[i] < varattno)
+ idx += 1;
+ else
+ break;
+ }
+ return idx;
+}
+
/* multi-variate stats comparator */
/*
@@ -256,11 +340,15 @@ update_mv_stats(Oid mvoid, MVDependencies dependencies, int2vector *attrs)
int
compare_scalars_simple(const void *a, const void *b, void *arg)
{
- Datum da = *(Datum*)a;
- Datum db = *(Datum*)b;
- SortSupport ssup= (SortSupport) arg;
+ return compare_datums_simple(*(Datum*)a,
+ *(Datum*)b,
+ (SortSupport)arg);
+}
- return ApplySortComparator(da, false, db, false, ssup);
+int
+compare_datums_simple(Datum a, Datum b, SortSupport ssup)
+{
+ return ApplySortComparator(a, false, b, false, ssup);
}
/*
diff --git a/src/backend/utils/mvstats/common.h b/src/backend/utils/mvstats/common.h
index 6d5465b..f4309f7 100644
--- a/src/backend/utils/mvstats/common.h
+++ b/src/backend/utils/mvstats/common.h
@@ -46,7 +46,15 @@ typedef struct
Datum value; /* a data value */
int tupno; /* position index for tuple it came from */
} ScalarItem;
-
+
+/* (de)serialization info */
+typedef struct DimensionInfo {
+ int nvalues; /* number of deduplicated values */
+ int nbytes; /* number of bytes (serialized) */
+ int typlen; /* pg_type.typlen */
+ bool typbyval; /* pg_type.typbyval */
+} DimensionInfo;
+
/* multi-sort */
typedef struct MultiSortSupportData {
int ndims; /* number of dimensions supported by the */
@@ -71,5 +79,6 @@ int multi_sort_compare_dim(int dim, const SortItem *a,
const SortItem *b, MultiSortSupport mss);
/* comparators, used when constructing multivariate stats */
+int compare_datums_simple(Datum a, Datum b, SortSupport ssup);
int compare_scalars_simple(const void *a, const void *b, void *arg);
int compare_scalars_partition(const void *a, const void *b, void *arg);
diff --git a/src/backend/utils/mvstats/mcv.c b/src/backend/utils/mvstats/mcv.c
new file mode 100644
index 0000000..551c934
--- /dev/null
+++ b/src/backend/utils/mvstats/mcv.c
@@ -0,0 +1,1094 @@
+/*-------------------------------------------------------------------------
+ *
+ * mcv.c
+ * POSTGRES multivariate MCV lists
+ *
+ *
+ * Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ * src/backend/utils/mvstats/mcv.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres.h"
+
+#include "fmgr.h"
+#include "funcapi.h"
+
+#include "utils/lsyscache.h"
+
+#include "common.h"
+
+/*
+ * Each serialized item needs to store (in this order):
+ *
+ * - indexes (ndim * sizeof(uint16))
+ * - null flags (ndim * sizeof(bool))
+ * - frequency (sizeof(double))
+ *
+ * So in total:
+ *
+ * ndim * (sizeof(uint16) + sizeof(bool)) + sizeof(double)
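+ *
+ * e.g. for ndim = 3 that is 3 * (2 + 1) + 8 = 17 bytes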
+ */
+#define ITEM_SIZE(ndims) \
+ (ndims * (sizeof(uint16) + sizeof(bool)) + sizeof(double))
+
+/* pointers into a flat serialized item of ITEM_SIZE(n) bytes */
+#define ITEM_INDEXES(item) ((uint16*)item)
+#define ITEM_NULLS(item,ndims) ((bool*)(ITEM_INDEXES(item) + ndims))
+#define ITEM_FREQUENCY(item,ndims) ((double*)(ITEM_NULLS(item,ndims) + ndims))
+
+/*
+ * Builds MCV list from sample rows, and removes rows represented by
+ * the MCV list from the sample (the number of remaining sample rows is
+ * returned by the numrows_filtered parameter).
+ *
+ * The method is quite simple - in short it does about these steps:
+ *
+ * (1) sort the data (default collation, '<' for the data type)
+ *
+ * (2) count distinct groups, decide how many to keep
+ *
+ * (3) build the MCV list using the threshold determined in (2)
+ *
+ * (4) remove rows represented by the MCV from the sample
+ *
+ * For more details, see the comments in the code.
+ *
+ * FIXME Use max_mcv_items from ALTER TABLE ADD STATISTICS command.
+ *
+ * FIXME Single-dimensional MCV is sorted by frequency (descending). We
+ * should do that too, because when walking through the list we
+ * want to check the most frequent items first.
+ *
+ * TODO We're using Datum (8B) even for narrower data types (e.g. int4
+ * or float4). Maybe we could save some space here, but the bytea
+ * compression should handle it just fine.
+ *
+ * TODO This probably should not use the ndistinct (as computed from
+ * the sample) directly, but rather an estimate of the number of
+ * distinct values in the whole table, no?
+ */
+MCVList
+build_mv_mcvlist(int numrows, HeapTuple *rows, int2vector *attrs,
+ VacAttrStats **stats, int *numrows_filtered)
+{
+ int i, j;
+ int numattrs = attrs->dim1;
+ int ndistinct = 0;
+ int mcv_threshold = 0;
+ int count = 0;
+ int nitems = 0;
+
+ MCVList mcvlist = NULL;
+
+ /* Sort by multiple columns (using array of SortSupport) */
+ MultiSortSupport mss = multi_sort_init(numattrs);
+
+ /*
+ * Preallocate space for all the items as a single chunk, and point
+ * the items to the appropriate parts of the array.
+ */
+ SortItem *items = (SortItem*)palloc0(numrows * sizeof(SortItem));
+ Datum *values = (Datum*)palloc0(sizeof(Datum) * numrows * numattrs);
+ bool *isnull = (bool*)palloc0(sizeof(bool) * numrows * numattrs);
+
+ /* keep all the rows by default (as if there was no MCV list) */
+ *numrows_filtered = numrows;
+
+ for (i = 0; i < numrows; i++)
+ {
+ items[i].values = &values[i * numattrs];
+ items[i].isnull = &isnull[i * numattrs];
+ }
+
+ /* load the values/null flags from sample rows */
+ for (j = 0; j < numrows; j++)
+ for (i = 0; i < numattrs; i++)
+ items[j].values[i] = heap_getattr(rows[j], attrs->values[i],
+ stats[i]->tupDesc, &items[j].isnull[i]);
+
+ /* prepare the sort functions for all the attributes */
+ for (i = 0; i < numattrs; i++)
+ multi_sort_add_dimension(mss, i, i, stats);
+
+ /* do the sort, using the multi-sort */
+ qsort_arg((void *) items, numrows, sizeof(SortItem),
+ multi_sort_compare, mss);
+
+ /*
+ * Count the number of distinct groups - just walk through the
+ * sorted list and count the number of key changes. We use this to
+ * determine the threshold (125% of the average frequency).
+ */
+ ndistinct = 1;
+ for (i = 1; i < numrows; i++)
+ if (multi_sort_compare(&items[i], &items[i-1], mss) != 0)
+ ndistinct += 1;
+
+ /*
+ * Determine how many groups actually exceed the threshold, and then
+ * walk the array again and collect them into an array. We'll always
+ * require at least 4 rows per group.
+ *
+ * But if we can fit all the distinct values in the MCV list (i.e.
+ * if there are fewer distinct groups than MVSTAT_MCVLIST_MAX_ITEMS),
+ * we'll require only 2 rows per group.
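+ *
+ * For example, with 30000 sample rows and 100 distinct groups the
+ * average group has 300 rows, so the threshold works out to
+ * 1.25 * 300 = 375 rows.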
+ *
+ * TODO For now the threshold is the same as in the single-column
+ * case (average + 25%), but maybe that's worth revisiting
+ * for the multivariate case.
+ *
+ * TODO We can do this only if we believe we got all the distinct
+ * values of the table.
+ *
+ * FIXME This should really reference mcv_max_items (from catalog)
+ * instead of the constant MVSTAT_MCVLIST_MAX_ITEMS.
+ */
+ mcv_threshold = 1.25 * numrows / ndistinct;
+ mcv_threshold = (mcv_threshold < 4) ? 4 : mcv_threshold;
+
+ if (ndistinct <= MVSTAT_MCVLIST_MAX_ITEMS)
+ mcv_threshold = 2;
+
+ /*
+ * Walk through the sorted data again, and see how many groups
+ * reach the mcv_threshold (and become an item in the MCV list).
+ */
+ count = 1;
+ for (i = 1; i <= numrows; i++)
+ {
+ /* last row or new group, so check if we exceed mcv_threshold */
+ if ((i == numrows) || (multi_sort_compare(&items[i], &items[i-1], mss) != 0))
+ {
+ /* group hits the threshold, count the group as MCV item */
+ if (count >= mcv_threshold)
+ nitems += 1;
+
+ count = 1;
+ }
+ else /* within group, so increase the number of items */
+ count += 1;
+ }
+
+ /* we know the number of MCV list items, so let's build the list */
+ if (nitems > 0)
+ {
+ /* allocate the MCV list structure, set parameters we know */
+ mcvlist = (MCVList)palloc0(sizeof(MCVListData));
+
+ mcvlist->magic = MVSTAT_MCV_MAGIC;
+ mcvlist->type = MVSTAT_MCV_TYPE_BASIC;
+ mcvlist->ndimensions = numattrs;
+ mcvlist->nitems = nitems;
+
+ /*
+ * Preallocate Datum/isnull arrays (not as a single chunk, as
+ * we'll pass this outside this method and thus it needs to be
+ * easy to pfree() the data - we wouldn't know where the
+ * arrays start).
+ *
+ * TODO Maybe the reasoning that we can't allocate a single
+ * piece because we're passing it out is bogus? Who'd
+ * free a single item of the MCV list, anyway?
+ *
+ * TODO Maybe with a proper encoding (stuffing all the values
+ * into a list-level array) this will no longer be true?
+ */
+ mcvlist->items = (MCVItem*)palloc0(sizeof(MCVItem)*nitems);
+
+ for (i = 0; i < nitems; i++)
+ {
+ mcvlist->items[i] = (MCVItem)palloc0(sizeof(MCVItemData));
+ mcvlist->items[i]->values = (Datum*)palloc0(sizeof(Datum)*numattrs);
+ mcvlist->items[i]->isnull = (bool*)palloc0(sizeof(bool)*numattrs);
+ }
+
+ /*
+ * Repeat the same loop as above, but this time copy the data
+ * into the MCV list (for items exceeding the threshold).
+ *
+ * TODO Maybe we could simply remember indexes of the last item
+ * in each group (from the previous loop)?
+ */
+ count = 1;
+ nitems = 0;
+ for (i = 1; i <= numrows; i++)
+ {
+ /* last row or a new group */
+ if ((i == numrows) || (multi_sort_compare(&items[i], &items[i-1], mss) != 0))
+ {
+ /* count the MCV item if exceeding the threshold (and copy into the array) */
+ if (count >= mcv_threshold)
+ {
+ /* just pointer to the proper place in the list */
+ MCVItem item = mcvlist->items[nitems];
+
+ /* copy values from the _previous_ group (its last item) */
+ memcpy(item->values, items[(i-1)].values, sizeof(Datum) * numattrs);
+ memcpy(item->isnull, items[(i-1)].isnull, sizeof(bool) * numattrs);
+
+ /* and finally the group frequency */
+ item->frequency = (double)count / numrows;
+
+ /* next item */
+ nitems += 1;
+ }
+
+ count = 1;
+ }
+ else /* same group, just increase the number of items */
+ count += 1;
+ }
+
+ /* make sure the loops are consistent */
+ Assert(nitems == mcvlist->nitems);
+
+ /*
+ * Remove the rows matching the MCV list (i.e. keep only rows
+ * that are not represented by the MCV list).
+ *
+ * FIXME This implementation is rather naive, effectively O(N^2).
+ * As the MCV list grows, the check will take longer and
+ * longer. And as the number of sampled rows increases (by
+ * increasing statistics target), it will take longer and
+ * longer. One option is to sort the MCV items first and
+ * then perform a binary search.
+ *
+ * A better option would be keeping the ID of the row in
+ * the sort item, and then just walk through the items and
+ * mark rows to remove (in a bitmap of the same size).
+ * There's not space for that in SortItem at this moment,
+ * but it's trivial to add 'private' pointer, or just
+ * using another structure with extra field (starting with
+ * SortItem, so that the comparators etc. still work).
+ *
+ * Another option is to use the sorted array of items
+ * (because that's how we sorted the source data), and
+ * simply do a bsearch() into it. If we find a matching
+ * item, the row belongs to the MCV list.
+ */
+ if (nitems == ndistinct) /* all rows are covered by MCV items */
+ *numrows_filtered = 0;
+ else /* (nitems < ndistinct) && (nitems > 0) */
+ {
+ int nfiltered = 0;
+ HeapTuple *rows_filtered = (HeapTuple*)palloc0(sizeof(HeapTuple) * numrows);
+
+ /* used for the searches */
+ SortItem item, mcvitem;
+
+ item.values = (Datum*)palloc0(numattrs * sizeof(Datum));
+ item.isnull = (bool*)palloc0(numattrs * sizeof(bool));
+
+ /*
+ * FIXME we don't need to allocate this, we can reference
+ * the MCV item directly ...
+ */
+ mcvitem.values = (Datum*)palloc0(numattrs * sizeof(Datum));
+ mcvitem.isnull = (bool*)palloc0(numattrs * sizeof(bool));
+
+ /* walk through the tuples, compare the values to MCV items */
+ for (i = 0; i < numrows; i++)
+ {
+ bool match = false;
+
+ /* collect the key values from the row */
+ for (j = 0; j < numattrs; j++)
+ item.values[j] = heap_getattr(rows[i], attrs->values[j],
+ stats[j]->tupDesc, &item.isnull[j]);
+
+ /* scan through the MCV list for matches */
+ for (j = 0; j < mcvlist->nitems; j++)
+ {
+ /*
+ * TODO Create a SortItem/MCVItem comparator so that
+ * we don't need to do memcpy() like crazy.
+ */
+ memcpy(mcvitem.values, mcvlist->items[j]->values,
+ numattrs * sizeof(Datum));
+ memcpy(mcvitem.isnull, mcvlist->items[j]->isnull,
+ numattrs * sizeof(bool));
+
+ if (multi_sort_compare(&item, &mcvitem, mss) == 0)
+ {
+ match = true;
+ break;
+ }
+ }
+
+ /* if no match in the MCV list, copy the row into the filtered ones */
+ if (! match)
+ memcpy(&rows_filtered[nfiltered++], &rows[i], sizeof(HeapTuple));
+ }
+
+ /* replace the rows and remember how many rows we kept */
+ memcpy(rows, rows_filtered, sizeof(HeapTuple) * nfiltered);
+ *numrows_filtered = nfiltered;
+
+ /* free all the data used here */
+ pfree(rows_filtered);
+ pfree(item.values);
+ pfree(item.isnull);
+ pfree(mcvitem.values);
+ pfree(mcvitem.isnull);
+ }
+ }
+
+ pfree(values);
+ pfree(items);
+ pfree(isnull);
+
+ return mcvlist;
+}
+
+
+/* fetch the MCV list (as a bytea) from the pg_mv_statistic catalog */
+MCVList
+load_mv_mcvlist(Oid mvoid)
+{
+ bool isnull = false;
+ Datum mcvlist;
+
+#ifdef USE_ASSERT_CHECKING
+ Form_pg_mv_statistic mvstat;
+#endif
+
+ /* Fetch the pg_mv_statistic tuple for the given OID. */
+ HeapTuple htup = SearchSysCache1(MVSTATOID, ObjectIdGetDatum(mvoid));
+
+ if (! HeapTupleIsValid(htup))
+ return NULL;
+
+#ifdef USE_ASSERT_CHECKING
+ mvstat = (Form_pg_mv_statistic) GETSTRUCT(htup);
+ Assert(mvstat->mcv_enabled && mvstat->mcv_built);
+#endif
+
+ mcvlist = SysCacheGetAttr(MVSTATOID, htup,
+ Anum_pg_mv_statistic_stamcv, &isnull);
+
+ Assert(!isnull);
+
+ ReleaseSysCache(htup);
+
+ return deserialize_mv_mcvlist(DatumGetByteaP(mcvlist));
+}
+
+/* print some basic info about the MCV list
+ *
+ * TODO Add info about what part of the table this covers.
+ */
+Datum
+pg_mv_stats_mcvlist_info(PG_FUNCTION_ARGS)
+{
+ bytea *data = PG_GETARG_BYTEA_P(0);
+ char *result;
+
+ MCVList mcvlist = deserialize_mv_mcvlist(data);
+
+ result = palloc0(128);
+ snprintf(result, 128, "nitems=%d", mcvlist->nitems);
+
+ pfree(mcvlist);
+
+ PG_RETURN_TEXT_P(cstring_to_text(result));
+}
+
+/* used to pass context into bsearch() */
+static SortSupport ssup_private = NULL;
+
+static int bsearch_comparator(const void * a, const void * b);
+
+/*
+ * Serialize MCV list into a bytea value. The basic algorithm is simple:
+ *
+ * (1) perform deduplication for each attribute (separately)
+ * (a) collect all (non-NULL) attribute values from all MCV items
+ * (b) sort the data (using 'lt' from VacAttrStats)
+ * (c) remove duplicate values from the array
+ *
+ * (2) serialize the arrays into a bytea value
+ *
+ * (3) process all MCV list items
+ * (a) replace values with indexes into the arrays
+ *
+ * Each attribute has to be processed separately, because we're mixing
+ * different datatypes, and we don't know what equality means for them.
+ * We're also mixing pass-by-value and pass-by-ref types, and so on.
+ *
+ * We'll use uint16 values for the indexes in step (3), as we don't
+ * allow more than 8k MCV items (see the max_mcv_items option). We might
+ * increase this to 65k and still fit into uint16.
+ *
+ * We don't really expect compression as high as with histograms,
+ * because we're not doing any bucket splits etc. (which are the source
+ * of high redundancy there), but we need to do this anyway as we need
+ * to serialize varlena values etc. We might invent another way to
+ * serialize MCV lists, but let's keep it consistent.
+ *
+ * FIXME This probably leaks memory, or at least uses it inefficiently
+ * (many small palloc() calls instead of a large one).
+ *
+ * TODO Consider packing boolean flags (NULL) for each item into 'char'
+ * or a longer type (instead of using an array of bool items).
+ */
+bytea *
+serialize_mv_mcvlist(MCVList mcvlist, int2vector *attrs,
+ VacAttrStats **stats)
+{
+ int i, j;
+ int ndims = mcvlist->ndimensions;
+ int itemsize = ITEM_SIZE(ndims);
+
+ Size total_length = 0;
+
+ char *item = palloc0(itemsize);
+
+ /* serialized items (indexes into arrays, etc.) */
+ bytea *output;
+ char *data = NULL;
+
+ /* values per dimension (and number of non-NULL values) */
+ Datum **values = (Datum**)palloc0(sizeof(Datum*) * ndims);
+ int *counts = (int*)palloc0(sizeof(int) * ndims);
+
+ /* info about dimensions (for deserialize) */
+ DimensionInfo * info
+ = (DimensionInfo *)palloc0(sizeof(DimensionInfo)*ndims);
+
+ /* sort support data */
+ SortSupport ssup = (SortSupport)palloc0(sizeof(SortSupportData)*ndims);
+
+ /* collect and deduplicate values for each dimension */
+ for (i = 0; i < ndims; i++)
+ {
+ int count;
+ StdAnalyzeData *tmp = (StdAnalyzeData *)stats[i]->extra_data;
+
+ /* keep important info about the data type */
+ info[i].typlen = stats[i]->attrtype->typlen;
+ info[i].typbyval = stats[i]->attrtype->typbyval;
+
+ /* allocate space for all values, including NULLs (won't use them) */
+ values[i] = (Datum*)palloc0(sizeof(Datum) * mcvlist->nitems);
+
+ for (j = 0; j < mcvlist->nitems; j++)
+ {
+ if (! mcvlist->items[j]->isnull[i]) /* skip NULL values */
+ {
+ values[i][counts[i]] = mcvlist->items[j]->values[i];
+ counts[i] += 1;
+ }
+ }
+
+ /* there are just NULL values in this dimension */
+ if (counts[i] == 0)
+ continue;
+
+ /* sort and deduplicate */
+ ssup[i].ssup_cxt = CurrentMemoryContext;
+ ssup[i].ssup_collation = DEFAULT_COLLATION_OID;
+ ssup[i].ssup_nulls_first = false;
+
+ PrepareSortSupportFromOrderingOp(tmp->ltopr, &ssup[i]);
+
+ qsort_arg(values[i], counts[i], sizeof(Datum),
+ compare_scalars_simple, &ssup[i]);
+
+ /*
+ * Walk through the array and eliminate duplicate values, but
+ * keep the ordering (so that we can do bsearch later). We know
+ * there's at least 1 item, so we can skip the first element.
+ */
+ count = 1; /* number of deduplicated items */
+ for (j = 1; j < counts[i]; j++)
+ {
+ /* if it's different from the previous value, we need to keep it */
+ if (compare_datums_simple(values[i][j-1], values[i][j], &ssup[i]) != 0)
+ {
+ /* XXX: not needed if (count == j) */
+ values[i][count] = values[i][j];
+ count += 1;
+ }
+ }
+
+ /* do not exceed UINT16_MAX */
+ Assert(count <= UINT16_MAX);
+
+ /* keep info about the deduplicated count */
+ info[i].nvalues = count;
+
+ /* compute size of the serialized data */
+ if (info[i].typbyval || (info[i].typlen > 0))
+ /* passed by value, or by reference with a fixed length */
+ info[i].nbytes = info[i].nvalues * info[i].typlen;
+ else if (info[i].typlen == -1)
+ /* varlena, so just use VARSIZE_ANY */
+ for (j = 0; j < info[i].nvalues; j++)
+ info[i].nbytes += VARSIZE_ANY(values[i][j]);
+ else if (info[i].typlen == -2)
+ /* cstring, so simply strlen */
+ for (j = 0; j < info[i].nvalues; j++)
+ info[i].nbytes += strlen(DatumGetPointer(values[i][j]));
+ else
+ elog(ERROR, "unknown data type typbyval=%d typlen=%d",
+ info[i].typbyval, info[i].typlen);
+ }
+
+ /*
+ * Now we finally know how much space we'll need for the serialized
+ * MCV list, as it contains these fields:
+ *
+ * - length (4B) for varlena
+ * - magic (4B)
+ * - type (4B)
+ * - ndimensions (4B)
+ * - nitems (4B)
+ * - info (ndim * sizeof(DimensionInfo))
+ * - arrays of values for each dimension
+ * - serialized items (nitems * itemsize)
+ *
+ * So the 'header' size is 20B + ndim * sizeof(DimensionInfo) and
+ * then we'll place the data.
+ */
+ total_length = (sizeof(int32) + offsetof(MCVListData, items)
+ + ndims * sizeof(DimensionInfo)
+ + mcvlist->nitems * itemsize);
+
+ for (i = 0; i < ndims; i++)
+ total_length += info[i].nbytes;
+
+ /* enforce arbitrary limit of 1MB */
+ if (total_length > 1024 * 1024)
+ elog(ERROR, "serialized MCV exceeds 1MB (%ld)", total_length);
+
+ /* allocate space for the serialized MCV list, set header fields */
+ output = (bytea*)palloc0(total_length);
+ SET_VARSIZE(output, total_length);
+
+ /* we'll use 'data' to keep track of the position where we write */
+ data = VARDATA(output);
+
+ memcpy(data, mcvlist, offsetof(MCVListData, items));
+ data += offsetof(MCVListData, items);
+
+ memcpy(data, info, sizeof(DimensionInfo) * ndims);
+ data += sizeof(DimensionInfo) * ndims;
+
+ /* value array for each dimension */
+ for (i = 0; i < ndims; i++)
+ {
+#ifdef USE_ASSERT_CHECKING
+ char *tmp = data;
+#endif
+ for (j = 0; j < info[i].nvalues; j++)
+ {
+ if (info[i].typbyval)
+ {
+ /* passed by value / Datum */
+ memcpy(data, &values[i][j], info[i].typlen);
+ data += info[i].typlen;
+ }
+ else if (info[i].typlen > 0)
+ {
+ /* passed by reference, but fixed length (name, tid, ...) */
+ memcpy(data, &values[i][j], info[i].typlen);
+ data += info[i].typlen;
+ }
+ else if (info[i].typlen == -1)
+ {
+ /* varlena */
+ memcpy(data, DatumGetPointer(values[i][j]),
+ VARSIZE_ANY(values[i][j]));
+ data += VARSIZE_ANY(values[i][j]);
+ }
+ else if (info[i].typlen == -2)
+ {
+ /* cstring (don't forget the \0 terminator!) */
+ memcpy(data, DatumGetPointer(values[i][j]),
+ strlen(DatumGetPointer(values[i][j])) + 1);
+ data += strlen(DatumGetPointer(values[i][j])) + 1;
+ }
+ }
+ Assert((data - tmp) == info[i].nbytes);
+ }
+
+ /* and finally, the MCV items */
+ for (i = 0; i < mcvlist->nitems; i++)
+ {
+ /* don't write beyond the allocated space */
+ Assert(data <= (char*)output + total_length - itemsize);
+
+ /* reset the values for each item */
+ memset(item, 0, itemsize);
+
+ for (j = 0; j < ndims; j++)
+ {
+ /* do the lookup only for non-NULL values */
+ if (! mcvlist->items[i]->isnull[j])
+ {
+ Datum * v = NULL;
+ ssup_private = &ssup[j];
+
+ v = (Datum*)bsearch(&mcvlist->items[i]->values[j],
+ values[j], info[j].nvalues, sizeof(Datum),
+ bsearch_comparator);
+
+ if (v == NULL)
+ elog(ERROR, "value for dim %d not found in array", j);
+
+ /* compute index within the array */
+ ITEM_INDEXES(item)[j] = (v - values[j]);
+
+ /* check the index is within expected bounds */
+ Assert(ITEM_INDEXES(item)[j] >= 0);
+ Assert(ITEM_INDEXES(item)[j] < info[j].nvalues);
+ }
+ }
+
+ /* copy NULL and frequency flags into the item */
+ memcpy(ITEM_NULLS(item, ndims),
+ mcvlist->items[i]->isnull, sizeof(bool) * ndims);
+ memcpy(ITEM_FREQUENCY(item, ndims),
+ &mcvlist->items[i]->frequency, sizeof(double));
+
+ /* copy the item into the array */
+ memcpy(data, item, itemsize);
+
+ data += itemsize;
+ }
+
+ /* at this point we expect to match the total_length exactly */
+ Assert((data - (char*)output) == total_length);
+
+ return output;
+}
+
+/*
+ * Inverse to serialize_mv_mcvlist() - see the comment there.
+ *
+ * We'll do full deserialization, because we don't really expect a high
+ * level of duplication among the values, so the caching may not be as
+ * efficient as with histograms.
+ */
+MCVList
+deserialize_mv_mcvlist(bytea * data)
+{
+ int i, j;
+ Size expected_size;
+ MCVList mcvlist;
+ char *tmp;
+
+ int ndims, nitems, itemsize;
+ DimensionInfo *info = NULL;
+
+ uint16 *indexes = NULL;
+ Datum **values = NULL;
+
+ /* local allocation buffer (used only for deserialization) */
+ int bufflen;
+ char *buff;
+ char *ptr;
+
+ /* buffer used for the result */
+ int rbufflen;
+ char *rbuff;
+ char *rptr;
+
+ if (data == NULL)
+ return NULL;
+
+ if (VARSIZE_ANY_EXHDR(data) < offsetof(MCVListData,items))
+ elog(ERROR, "invalid MCV Size %ld (expected at least %ld)",
+ VARSIZE_ANY_EXHDR(data), offsetof(MCVListData,items));
+
+ /* read the MCV list header */
+ mcvlist = (MCVList)palloc0(sizeof(MCVListData));
+
+ /* initialize pointer to the data part (skip the varlena header) */
+ tmp = VARDATA(data);
+
+ /* get the header and perform basic sanity checks */
+ memcpy(mcvlist, tmp, offsetof(MCVListData,items));
+ tmp += offsetof(MCVListData,items);
+
+ if (mcvlist->magic != MVSTAT_MCV_MAGIC)
+ elog(ERROR, "invalid MCV magic %d (expected %dd)",
+ mcvlist->magic, MVSTAT_MCV_MAGIC);
+
+ if (mcvlist->type != MVSTAT_MCV_TYPE_BASIC)
+ elog(ERROR, "invalid MCV type %d (expected %dd)",
+ mcvlist->type, MVSTAT_MCV_TYPE_BASIC);
+
+ nitems = mcvlist->nitems;
+ ndims = mcvlist->ndimensions;
+ itemsize = ITEM_SIZE(ndims);
+
+ Assert(nitems > 0);
+ Assert((ndims >= 2) && (ndims <= MVSTATS_MAX_DIMENSIONS));
+
+ /*
+ * Compute the size we expect with those parameters (it's incomplete
+ * at this point, as we have yet to add the sizes of the value arrays
+ * from the DimensionInfo records).
+ */
+ expected_size = offsetof(MCVListData,items) +
+ ndims * sizeof(DimensionInfo) +
+ (nitems * itemsize);
+
+ /* check that we have at least the DimensionInfo records */
+ if (VARSIZE_ANY_EXHDR(data) < expected_size)
+ elog(ERROR, "invalid MCV Size %ld (expected %ld)",
+ VARSIZE_ANY_EXHDR(data), expected_size);
+
+ info = (DimensionInfo*)(tmp);
+ tmp += ndims * sizeof(DimensionInfo);
+
+ /* account for the value arrays */
+ for (i = 0; i < ndims; i++)
+ expected_size += info[i].nbytes;
+
+ if (VARSIZE_ANY_EXHDR(data) != expected_size)
+ elog(ERROR, "invalid MCV Size %ld (expected %ld)",
+ VARSIZE_ANY_EXHDR(data), expected_size);
+
+ /* looks OK - not corrupted or something */
+
+ /*
+ * We'll allocate one large chunk of memory for the intermediate
+ * data, needed only for deserializing the MCV list, and we'll use
+ * a local dense allocation to minimize the palloc overhead.
+ *
+ * Let's see how much space we'll actually need, and also include
+ * space for the array with pointers.
+ */
+ bufflen = sizeof(Datum*) * ndims; /* space for pointers */
+
+ for (i = 0; i < ndims; i++)
+ /* for full-size byval types, we reuse the serialized value */
+ if (! (info[i].typbyval && info[i].typlen == sizeof(Datum)))
+ bufflen += (sizeof(Datum) * info[i].nvalues);
+
+ buff = palloc0(bufflen);
+ ptr = buff;
+
+ values = (Datum**)buff;
+ ptr += (sizeof(Datum*) * ndims);
+
+ /*
+ * FIXME This uses pointers to the original data array (the types
+ * not passed by value), so when someone frees the memory,
+ * e.g. by doing something like this:
+ *
+ * bytea * data = ... fetch the data from catalog ...
+ * MCVList mcvlist = deserialize_mv_mcvlist(data);
+ * pfree(data);
+ *
+ * then 'mcvlist' references the freed memory. This needs to
+ * copy the pieces.
+ */
+ for (i = 0; i < ndims; i++)
+ {
+ if (info[i].typbyval)
+ {
+ /* passed by value / Datum - simply reuse the array */
+ if (info[i].typlen == sizeof(Datum))
+ {
+ values[i] = (Datum*)tmp;
+ tmp += info[i].nbytes;
+ }
+ else
+ {
+ values[i] = (Datum*)ptr;
+ ptr += (sizeof(Datum) * info[i].nvalues);
+
+ for (j = 0; j < info[i].nvalues; j++)
+ {
+ /* copy the value into the array (it may be narrower than Datum) */
+ memcpy(&values[i][j], tmp, info[i].typlen);
+ tmp += info[i].typlen;
+ }
+ }
+ }
+ else
+ {
+ /* all the varlena data need a chunk from the buffer */
+ values[i] = (Datum*)ptr;
+ ptr += (sizeof(Datum) * info[i].nvalues);
+
+ /* passed by reference, but fixed length (name, tid, ...) */
+ if (info[i].typlen > 0)
+ {
+ for (j = 0; j < info[i].nvalues; j++)
+ {
+ /* just point into the array */
+ values[i][j] = PointerGetDatum(tmp);
+ tmp += info[i].typlen;
+ }
+ }
+ else if (info[i].typlen == -1)
+ {
+ /* varlena */
+ for (j = 0; j < info[i].nvalues; j++)
+ {
+ /* just point into the array */
+ values[i][j] = PointerGetDatum(tmp);
+ tmp += VARSIZE_ANY(tmp);
+ }
+ }
+ else if (info[i].typlen == -2)
+ {
+ /* cstring */
+ for (j = 0; j < info[i].nvalues; j++)
+ {
+ /* just point into the array */
+ values[i][j] = PointerGetDatum(tmp);
+ tmp += (strlen(tmp) + 1); /* don't forget the \0 */
+ }
+ }
+ }
+ }
+
+ /* we should exhaust the buffer exactly */
+ Assert((ptr - buff) == bufflen);
+
+ /* allocate space for the MCV items in a single piece */
+ rbufflen = (sizeof(MCVItem) + sizeof(MCVItemData) +
+ sizeof(Datum)*ndims + sizeof(bool)*ndims) * nitems;
+
+ rbuff = palloc(rbufflen);
+ rptr = rbuff;
+
+ mcvlist->items = (MCVItem*)rbuff;
+ rptr += (sizeof(MCVItem) * nitems);
+
+ for (i = 0; i < nitems; i++)
+ {
+ MCVItem item = (MCVItem)rptr;
+ rptr += (sizeof(MCVItemData));
+
+ item->values = (Datum*)rptr;
+ rptr += (sizeof(Datum)*ndims);
+
+ item->isnull = (bool*)rptr;
+ rptr += (sizeof(bool) *ndims);
+
+ /* just point to the right place */
+ indexes = ITEM_INDEXES(tmp);
+
+ memcpy(item->isnull, ITEM_NULLS(tmp, ndims), sizeof(bool) * ndims);
+ memcpy(&item->frequency, ITEM_FREQUENCY(tmp, ndims), sizeof(double));
+
+#ifdef USE_ASSERT_CHECKING
+ for (j = 0; j < ndims; j++)
+ Assert(indexes[j] <= UINT16_MAX);
+#endif
+
+ /* translate the values */
+ for (j = 0; j < ndims; j++)
+ if (! item->isnull[j])
+ item->values[j] = values[j][indexes[j]];
+
+ mcvlist->items[i] = item;
+
+ tmp += ITEM_SIZE(ndims);
+
+ Assert(tmp <= (char*)data + VARSIZE_ANY(data));
+ }
+
+ /* check that we processed all the data */
+ Assert(tmp == (char*)data + VARSIZE_ANY(data));
+
+ /* release the temporary buffer */
+ pfree(buff);
+
+ return mcvlist;
+}
+
+/*
+ * We need to pass the SortSupport to the comparator, but bsearch()
+ * has no 'context' parameter, so we use a global variable (ugly).
+ */
+static int
+bsearch_comparator(const void * a, const void * b)
+{
+ Assert(ssup_private != NULL);
+ return compare_scalars_simple(a, b, (void*)ssup_private);
+}
+
+/*
+ * SRF with details about the items of an MCV list:
+ *
+ * - item ID (0...nitems-1)
+ * - values (string array)
+ * - nulls only (boolean array)
+ * - frequency (double precision)
+ *
+ * The input is the OID of the statistics, and no rows are returned
+ * if the statistics contains no MCV list.
+ */
+PG_FUNCTION_INFO_V1(pg_mv_mcv_items);
+
+Datum
+pg_mv_mcv_items(PG_FUNCTION_ARGS)
+{
+ FuncCallContext *funcctx;
+ int call_cntr;
+ int max_calls;
+ TupleDesc tupdesc;
+ AttInMetadata *attinmeta;
+
+ /* stuff done only on the first call of the function */
+ if (SRF_IS_FIRSTCALL())
+ {
+ MemoryContext oldcontext;
+ MCVList mcvlist;
+
+ /* create a function context for cross-call persistence */
+ funcctx = SRF_FIRSTCALL_INIT();
+
+ /* switch to memory context appropriate for multiple function calls */
+ oldcontext = MemoryContextSwitchTo(funcctx->multi_call_memory_ctx);
+
+ mcvlist = load_mv_mcvlist(PG_GETARG_OID(0));
+
+ funcctx->user_fctx = mcvlist;
+
+ /* total number of tuples to be returned */
+ funcctx->max_calls = 0;
+ if (funcctx->user_fctx != NULL)
+ funcctx->max_calls = mcvlist->nitems;
+
+ /* Build a tuple descriptor for our result type */
+ if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("function returning record called in context "
+ "that cannot accept type record")));
+
+ /*
+ * generate attribute metadata needed later to produce tuples
+ * from raw C strings
+ */
+ attinmeta = TupleDescGetAttInMetadata(tupdesc);
+ funcctx->attinmeta = attinmeta;
+
+ MemoryContextSwitchTo(oldcontext);
+ }
+
+ /* stuff done on every call of the function */
+ funcctx = SRF_PERCALL_SETUP();
+
+ call_cntr = funcctx->call_cntr;
+ max_calls = funcctx->max_calls;
+ attinmeta = funcctx->attinmeta;
+
+ if (call_cntr < max_calls) /* do when there is more left to send */
+ {
+ char **values;
+ HeapTuple tuple;
+ Datum result;
+ int2vector *stakeys;
+ Oid relid;
+
+ char *buff = palloc0(1024);
+ char *format;
+
+ int i;
+
+ Oid *outfuncs;
+ FmgrInfo *fmgrinfo;
+
+ MCVList mcvlist;
+ MCVItem item;
+
+ mcvlist = (MCVList)funcctx->user_fctx;
+
+ Assert(call_cntr < mcvlist->nitems);
+
+ item = mcvlist->items[call_cntr];
+
+ stakeys = find_mv_attnums(PG_GETARG_OID(0), &relid);
+
+ /*
+ * Prepare a values array for building the returned tuple.
+ * This should be an array of C strings which will
+ * be processed later by the type input functions.
+ */
+ values = (char **) palloc(4 * sizeof(char *));
+
+ values[0] = (char *) palloc(64 * sizeof(char));
+
+ /* arrays */
+ values[1] = (char *) palloc0(1024 * sizeof(char));
+ values[2] = (char *) palloc0(1024 * sizeof(char));
+
+ /* frequency */
+ values[3] = (char *) palloc(64 * sizeof(char));
+
+ outfuncs = (Oid*)palloc0(sizeof(Oid) * mcvlist->ndimensions);
+ fmgrinfo = (FmgrInfo*)palloc0(sizeof(FmgrInfo) * mcvlist->ndimensions);
+
+ for (i = 0; i < mcvlist->ndimensions; i++)
+ {
+ bool isvarlena;
+
+ getTypeOutputInfo(get_atttype(relid, stakeys->values[i]),
+ &outfuncs[i], &isvarlena);
+
+ fmgr_info(outfuncs[i], &fmgrinfo[i]);
+ }
+
+ snprintf(values[0], 64, "%d", call_cntr); /* item ID */
+
+ for (i = 0; i < mcvlist->ndimensions; i++)
+ {
+ Datum val, valout;
+
+ format = "%s, %s";
+ if (i == 0)
+ format = "{%s%s";
+ else if (i == mcvlist->ndimensions-1)
+ format = "%s, %s}";
+
+ val = item->values[i];
+ valout = FunctionCall1(&fmgrinfo[i], val);
+
+ snprintf(buff, 1024, format, values[1], DatumGetPointer(valout));
+ strncpy(values[1], buff, 1023);
+ buff[0] = '\0';
+
+ snprintf(buff, 1024, format, values[2], item->isnull[i] ? "t" : "f");
+ strncpy(values[2], buff, 1023);
+ buff[0] = '\0';
+ }
+
+ snprintf(values[3], 64, "%f", item->frequency); /* frequency */
+
+ /* build a tuple */
+ tuple = BuildTupleFromCStrings(attinmeta, values);
+
+ /* make the tuple into a datum */
+ result = HeapTupleGetDatum(tuple);
+
+ /* clean up (this is not really necessary) */
+ pfree(values[0]);
+ pfree(values[1]);
+ pfree(values[2]);
+ pfree(values[3]);
+
+ pfree(values);
+
+ SRF_RETURN_NEXT(funcctx, result);
+ }
+ else /* do when there is no more left */
+ {
+ SRF_RETURN_DONE(funcctx);
+ }
+}
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 4f106c3..6339631 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -2109,8 +2109,9 @@ describeOneTableDetails(const char *schemaname,
{
printfPQExpBuffer(&buf,
"SELECT oid, stanamespace::regnamespace AS nsp, staname, stakeys,\n"
- " deps_enabled,\n"
- " deps_built,\n"
+ " deps_enabled, mcv_enabled,\n"
+ " deps_built, mcv_built,\n"
+ " mcv_max_items,\n"
" (SELECT string_agg(attname::text,', ')\n"
" FROM ((SELECT unnest(stakeys) AS attnum) s\n"
" JOIN pg_attribute a ON (starelid = a.attrelid and a.attnum = s.attnum))) AS attnums\n"
@@ -2128,6 +2129,8 @@ describeOneTableDetails(const char *schemaname,
printTableAddFooter(&cont, _("Statistics:"));
for (i = 0; i < tuples; i++)
{
+ bool first = true;
+
printfPQExpBuffer(&buf, " ");
/* statistics name (qualified with namespace) */
@@ -2137,10 +2140,22 @@ describeOneTableDetails(const char *schemaname,
/* options */
if (!strcmp(PQgetvalue(result, i, 4), "t"))
- appendPQExpBuffer(&buf, "(dependencies)");
+ {
+ appendPQExpBuffer(&buf, "(dependencies");
+ first = false;
+ }
+
+ if (!strcmp(PQgetvalue(result, i, 5), "t"))
+ {
+ if (! first)
+ appendPQExpBuffer(&buf, ", mcv");
+ else
+ appendPQExpBuffer(&buf, "(mcv");
+ first = false;
+ }
- appendPQExpBuffer(&buf, " ON (%s)",
- PQgetvalue(result, i, 6));
+ appendPQExpBuffer(&buf, ") ON (%s)",
+ PQgetvalue(result, i, 9));
printTableAddFooter(&cont, buf.data);
}
diff --git a/src/include/catalog/pg_mv_statistic.h b/src/include/catalog/pg_mv_statistic.h
index a568a07..fd7107d 100644
--- a/src/include/catalog/pg_mv_statistic.h
+++ b/src/include/catalog/pg_mv_statistic.h
@@ -37,15 +37,21 @@ CATALOG(pg_mv_statistic,3381)
/* statistics requested to build */
bool deps_enabled; /* analyze dependencies? */
+ bool mcv_enabled; /* build MCV list? */
+
+ /* MCV size */
+ int32 mcv_max_items; /* max MCV items */
/* statistics that are available (if requested) */
bool deps_built; /* dependencies were built */
+ bool mcv_built; /* MCV list was built */
/* variable-length fields start here, but we allow direct access to stakeys */
int2vector stakeys; /* array of column keys */
#ifdef CATALOG_VARLEN
bytea stadeps; /* dependencies (serialized) */
+ bytea stamcv; /* MCV list (serialized) */
#endif
} FormData_pg_mv_statistic;
@@ -61,13 +67,17 @@ typedef FormData_pg_mv_statistic *Form_pg_mv_statistic;
* compiler constants for pg_mv_statistic
* ----------------
*/
-#define Natts_pg_mv_statistic 7
+#define Natts_pg_mv_statistic 11
#define Anum_pg_mv_statistic_starelid 1
#define Anum_pg_mv_statistic_staname 2
#define Anum_pg_mv_statistic_stanamespace 3
#define Anum_pg_mv_statistic_deps_enabled 4
-#define Anum_pg_mv_statistic_deps_built 5
-#define Anum_pg_mv_statistic_stakeys 6
-#define Anum_pg_mv_statistic_stadeps 7
+#define Anum_pg_mv_statistic_mcv_enabled 5
+#define Anum_pg_mv_statistic_mcv_max_items 6
+#define Anum_pg_mv_statistic_deps_built 7
+#define Anum_pg_mv_statistic_mcv_built 8
+#define Anum_pg_mv_statistic_stakeys 9
+#define Anum_pg_mv_statistic_stadeps 10
+#define Anum_pg_mv_statistic_stamcv 11
#endif /* PG_MV_STATISTIC_H */
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
index eecce40..b16eebc 100644
--- a/src/include/catalog/pg_proc.h
+++ b/src/include/catalog/pg_proc.h
@@ -2670,6 +2670,10 @@ DATA(insert OID = 3998 ( pg_mv_stats_dependencies_info PGNSP PGUID 12 1 0 0
DESCR("multivariate stats: functional dependencies info");
DATA(insert OID = 3999 ( pg_mv_stats_dependencies_show PGNSP PGUID 12 1 0 0 0 f f f f t f i s 1 0 25 "17" _null_ _null_ _null_ _null_ _null_ pg_mv_stats_dependencies_show _null_ _null_ _null_ ));
DESCR("multivariate stats: functional dependencies show");
+DATA(insert OID = 3376 ( pg_mv_stats_mcvlist_info PGNSP PGUID 12 1 0 0 0 f f f f t f i s 1 0 25 "17" _null_ _null_ _null_ _null_ _null_ pg_mv_stats_mcvlist_info _null_ _null_ _null_ ));
+DESCR("multi-variate statistics: MCV list info");
+DATA(insert OID = 3373 ( pg_mv_mcv_items PGNSP PGUID 12 1 1000 0 0 f f f f t t i s 1 0 2249 "26" "{26,23,1009,1000,701}" "{i,o,o,o,o}" "{oid,index,values,nulls,frequency}" _null_ _null_ pg_mv_mcv_items _null_ _null_ _null_ ));
+DESCR("details about MCV list items");
DATA(insert OID = 1928 ( pg_stat_get_numscans PGNSP PGUID 12 1 0 0 0 f f f f t f s r 1 0 20 "26" _null_ _null_ _null_ _null_ _null_ pg_stat_get_numscans _null_ _null_ _null_ ));
DESCR("statistics: number of scans done for table/index");
diff --git a/src/include/nodes/relation.h b/src/include/nodes/relation.h
index e10dcf1..2bcd582 100644
--- a/src/include/nodes/relation.h
+++ b/src/include/nodes/relation.h
@@ -653,9 +653,11 @@ typedef struct MVStatisticInfo
/* enabled statistics */
bool deps_enabled; /* functional dependencies enabled */
+ bool mcv_enabled; /* MCV list enabled */
/* built/available statistics */
bool deps_built; /* functional dependencies built */
+ bool mcv_built; /* MCV list built */
/* columns in the statistics (attnums) */
int2vector *stakeys; /* attnums of the columns covered */
diff --git a/src/include/utils/mvstats.h b/src/include/utils/mvstats.h
index cc43a79..4535db7 100644
--- a/src/include/utils/mvstats.h
+++ b/src/include/utils/mvstats.h
@@ -51,30 +51,89 @@ typedef MVDependenciesData* MVDependencies;
#define MVSTAT_DEPS_TYPE_BASIC 1 /* basic dependencies type */
/*
+ * Multivariate MCV (most-common value) lists
+ *
+ * A straightforward extension of MCV items - i.e. a list (array) of
+ * combinations of attribute values, together with a frequency and
+ * null flags.
+ */
+typedef struct MCVItemData {
+ double frequency; /* frequency of this combination */
+ bool *isnull; /* flags of NULL values (up to 32 columns) */
+ Datum *values; /* variable-length (ndimensions) */
+} MCVItemData;
+
+typedef MCVItemData *MCVItem;
+
+/* multivariate MCV list - essentially an array of MCV items */
+typedef struct MCVListData {
+ uint32 magic; /* magic constant marker */
+ uint32 type; /* type of MCV list (BASIC) */
+ uint32 ndimensions; /* number of dimensions */
+ uint32 nitems; /* number of MCV items in the array */
+ MCVItem *items; /* array of MCV items */
+} MCVListData;
+
+typedef MCVListData *MCVList;
+
+/* used to flag stats serialized to bytea */
+#define MVSTAT_MCV_MAGIC 0xE1A651C2 /* marks serialized bytea */
+#define MVSTAT_MCV_TYPE_BASIC 1 /* basic MCV list type */
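+
+/*
+ * Example (sketch) of consuming a deserialized MCV list, with mvoid
+ * being the OID of a pg_mv_statistic entry with mcv_built = true
+ * (process() stands in for whatever the caller does with an item):
+ *
+ *     MCVList mcvlist = load_mv_mcvlist(mvoid);
+ *
+ *     for (i = 0; i < mcvlist->nitems; i++)
+ *         process(mcvlist->items[i]->values,
+ *                 mcvlist->items[i]->isnull,
+ *                 mcvlist->items[i]->frequency);
+ */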
+
+/*
+ * Limits used for mcv_max_items option, i.e. we're always guaranteed
+ * to have space for at least MVSTAT_MCVLIST_MIN_ITEMS, and we cannot
+ * have more than MVSTAT_MCVLIST_MAX_ITEMS items.
+ *
+ * This is just a boundary for the 'max' threshold - the actual list
+ * may of course contain fewer items than MVSTAT_MCVLIST_MIN_ITEMS.
+ */
+#define MVSTAT_MCVLIST_MIN_ITEMS 128 /* min items in MCV list */
+#define MVSTAT_MCVLIST_MAX_ITEMS 8192 /* max items in MCV list */
+
+/*
* TODO Maybe fetching the histogram/MCV list separately is inefficient?
* Consider adding a single `fetch_stats` method, fetching all
* stats specified using flags (or something like that).
*/
MVDependencies load_mv_dependencies(Oid mvoid);
+MCVList load_mv_mcvlist(Oid mvoid);
bytea * serialize_mv_dependencies(MVDependencies dependencies);
+bytea * serialize_mv_mcvlist(MCVList mcvlist, int2vector *attrs,
+ VacAttrStats **stats);
/* deserialization of stats (serialization is private to analyze) */
MVDependencies deserialize_mv_dependencies(bytea * data);
+MCVList deserialize_mv_mcvlist(bytea * data);
+
+/*
+ * Returns index of the attribute number within the vector (i.e. a
+ * dimension within the stats).
+ */
+int mv_get_index(AttrNumber varattno, int2vector * stakeys);
+
+int2vector* find_mv_attnums(Oid mvoid, Oid *relid);
/* FIXME this probably belongs somewhere else (not to operations stats) */
extern Datum pg_mv_stats_dependencies_info(PG_FUNCTION_ARGS);
extern Datum pg_mv_stats_dependencies_show(PG_FUNCTION_ARGS);
+extern Datum pg_mv_stats_mcvlist_info(PG_FUNCTION_ARGS);
+extern Datum pg_mv_mcv_items(PG_FUNCTION_ARGS);
MVDependencies
-build_mv_dependencies(int numrows, HeapTuple *rows,
- int2vector *attrs,
- VacAttrStats **stats);
+build_mv_dependencies(int numrows, HeapTuple *rows, int2vector *attrs,
+ VacAttrStats **stats);
+
+MCVList
+build_mv_mcvlist(int numrows, HeapTuple *rows, int2vector *attrs,
+ VacAttrStats **stats, int *numrows_filtered);
void build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
- int natts, VacAttrStats **vacattrstats);
+ int natts, VacAttrStats **vacattrstats);
-void update_mv_stats(Oid relid, MVDependencies dependencies, int2vector *attrs);
+void update_mv_stats(Oid relid, MVDependencies dependencies, MCVList mcvlist,
+ int2vector *attrs, VacAttrStats **stats);
#endif
diff --git a/src/test/regress/expected/mv_mcv.out b/src/test/regress/expected/mv_mcv.out
new file mode 100644
index 0000000..075320b
--- /dev/null
+++ b/src/test/regress/expected/mv_mcv.out
@@ -0,0 +1,207 @@
+-- data type passed by value
+CREATE TABLE mcv_list (
+ a INT,
+ b INT,
+ c INT
+);
+-- unknown column
+CREATE STATISTICS s4 ON mcv_list (unknown_column) WITH (mcv);
+ERROR: column "unknown_column" referenced in statistics does not exist
+-- single column
+CREATE STATISTICS s4 ON mcv_list (a) WITH (mcv);
+ERROR: multivariate stats require 2 or more columns
+-- single column, duplicated
+CREATE STATISTICS s4 ON mcv_list (a, a) WITH (mcv);
+ERROR: duplicate column name in statistics definition
+-- two columns, one duplicated
+CREATE STATISTICS s4 ON mcv_list (a, a, b) WITH (mcv);
+ERROR: duplicate column name in statistics definition
+-- unknown option
+CREATE STATISTICS s4 ON mcv_list (a, b, c) WITH (unknown_option);
+ERROR: unrecognized STATISTICS option "unknown_option"
+-- max_mcv_items requires the mcv option
+CREATE STATISTICS s4 ON mcv_list (a, b, c) WITH (dependencies, max_mcv_items=200);
+ERROR: option 'mcv' is required by other options(s)
+-- invalid max_mcv_items value / too low
+CREATE STATISTICS s4 ON mcv_list (a, b, c) WITH (mcv, max_mcv_items=10);
+ERROR: max number of MCV items must be at least 128
+-- invalid max_mcv_items value / too high
+CREATE STATISTICS s4 ON mcv_list (a, b, c) WITH (mcv, max_mcv_items=10000);
+ERROR: max number of MCV items is 8192
+-- correct command
+CREATE STATISTICS s4 ON mcv_list (a, b, c) WITH (mcv);
+-- random data
+INSERT INTO mcv_list
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+ANALYZE mcv_list;
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+ mcv_enabled | mcv_built | pg_mv_stats_mcvlist_info
+-------------+-----------+--------------------------
+ t | f |
+(1 row)
+
+TRUNCATE mcv_list;
+-- a => b, a => c, b => c
+INSERT INTO mcv_list
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE mcv_list;
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+ mcv_enabled | mcv_built | pg_mv_stats_mcvlist_info
+-------------+-----------+--------------------------
+ t | t | nitems=1000
+(1 row)
+
+TRUNCATE mcv_list;
+-- a => b, a => c
+INSERT INTO mcv_list
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE mcv_list;
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+ mcv_enabled | mcv_built | pg_mv_stats_mcvlist_info
+-------------+-----------+--------------------------
+ t | t | nitems=1000
+(1 row)
+
+TRUNCATE mcv_list;
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO mcv_list
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX mcv_idx ON mcv_list (a, b);
+ANALYZE mcv_list;
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+ mcv_enabled | mcv_built | pg_mv_stats_mcvlist_info
+-------------+-----------+--------------------------
+ t | t | nitems=100
+(1 row)
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mcv_list WHERE a = 10 AND b = 5;
+ QUERY PLAN
+--------------------------------------------
+ Bitmap Heap Scan on mcv_list
+ Recheck Cond: ((a = 10) AND (b = 5))
+ -> Bitmap Index Scan on mcv_idx
+ Index Cond: ((a = 10) AND (b = 5))
+(4 rows)
+
+DROP TABLE mcv_list;
+-- varlena type (text)
+CREATE TABLE mcv_list (
+ a TEXT,
+ b TEXT,
+ c TEXT
+);
+CREATE STATISTICS s5 ON mcv_list (a, b, c) WITH (mcv);
+-- random data
+INSERT INTO mcv_list
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+ANALYZE mcv_list;
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+ mcv_enabled | mcv_built | pg_mv_stats_mcvlist_info
+-------------+-----------+--------------------------
+ t | f |
+(1 row)
+
+TRUNCATE mcv_list;
+-- a => b, a => c, b => c
+INSERT INTO mcv_list
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE mcv_list;
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+ mcv_enabled | mcv_built | pg_mv_stats_mcvlist_info
+-------------+-----------+--------------------------
+ t | t | nitems=1000
+(1 row)
+
+TRUNCATE mcv_list;
+-- a => b, a => c
+INSERT INTO mcv_list
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE mcv_list;
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+ mcv_enabled | mcv_built | pg_mv_stats_mcvlist_info
+-------------+-----------+--------------------------
+ t | t | nitems=1000
+(1 row)
+
+TRUNCATE mcv_list;
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO mcv_list
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX mcv_idx ON mcv_list (a, b);
+ANALYZE mcv_list;
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+ mcv_enabled | mcv_built | pg_mv_stats_mcvlist_info
+-------------+-----------+--------------------------
+ t | t | nitems=100
+(1 row)
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mcv_list WHERE a = '10' AND b = '5';
+ QUERY PLAN
+------------------------------------------------------------
+ Bitmap Heap Scan on mcv_list
+ Recheck Cond: ((a = '10'::text) AND (b = '5'::text))
+ -> Bitmap Index Scan on mcv_idx
+ Index Cond: ((a = '10'::text) AND (b = '5'::text))
+(4 rows)
+
+TRUNCATE mcv_list;
+-- check explain (expect bitmap index scan, not plain index scan) with NULLs
+INSERT INTO mcv_list
+ SELECT
+ (CASE WHEN i/10000 = 0 THEN NULL ELSE i/10000 END),
+ (CASE WHEN i/20000 = 0 THEN NULL ELSE i/20000 END),
+ (CASE WHEN i/40000 = 0 THEN NULL ELSE i/40000 END)
+ FROM generate_series(1,1000000) s(i);
+ANALYZE mcv_list;
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+ mcv_enabled | mcv_built | pg_mv_stats_mcvlist_info
+-------------+-----------+--------------------------
+ t | t | nitems=100
+(1 row)
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mcv_list WHERE a IS NULL AND b IS NULL;
+ QUERY PLAN
+---------------------------------------------------
+ Bitmap Heap Scan on mcv_list
+ Recheck Cond: ((a IS NULL) AND (b IS NULL))
+ -> Bitmap Index Scan on mcv_idx
+ Index Cond: ((a IS NULL) AND (b IS NULL))
+(4 rows)
+
+DROP TABLE mcv_list;
+-- NULL values (mix of int and text columns)
+CREATE TABLE mcv_list (
+ a INT,
+ b TEXT,
+ c INT,
+ d TEXT
+);
+CREATE STATISTICS s6 ON mcv_list (a, b, c, d) WITH (mcv);
+INSERT INTO mcv_list
+ SELECT
+ mod(i, 100),
+ (CASE WHEN mod(i, 200) = 0 THEN NULL ELSE mod(i,200) END),
+ mod(i, 400),
+ (CASE WHEN mod(i, 300) = 0 THEN NULL ELSE mod(i,600) END)
+ FROM generate_series(1,10000) s(i);
+ANALYZE mcv_list;
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+ mcv_enabled | mcv_built | pg_mv_stats_mcvlist_info
+-------------+-----------+--------------------------
+ t | t | nitems=1200
+(1 row)
+
+DROP TABLE mcv_list;
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 84b4425..66071d8 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1373,7 +1373,9 @@ pg_mv_stats| SELECT n.nspname AS schemaname,
s.staname,
s.stakeys AS attnums,
length(s.stadeps) AS depsbytes,
- pg_mv_stats_dependencies_info(s.stadeps) AS depsinfo
+ pg_mv_stats_dependencies_info(s.stadeps) AS depsinfo,
+ length(s.stamcv) AS mcvbytes,
+ pg_mv_stats_mcvlist_info(s.stamcv) AS mcvinfo
FROM ((pg_mv_statistic s
JOIN pg_class c ON ((c.oid = s.starelid)))
LEFT JOIN pg_namespace n ON ((n.oid = c.relnamespace)));
diff --git a/src/test/regress/parallel_schedule b/src/test/regress/parallel_schedule
index 4f2ffb8..85d94f1 100644
--- a/src/test/regress/parallel_schedule
+++ b/src/test/regress/parallel_schedule
@@ -112,4 +112,4 @@ test: event_trigger
test: stats
# run tests of multivariate stats
-test: mv_dependencies
+test: mv_dependencies mv_mcv
diff --git a/src/test/regress/serial_schedule b/src/test/regress/serial_schedule
index 097a04f..6584d73 100644
--- a/src/test/regress/serial_schedule
+++ b/src/test/regress/serial_schedule
@@ -163,3 +163,4 @@ test: xml
test: event_trigger
test: stats
test: mv_dependencies
+test: mv_mcv
diff --git a/src/test/regress/sql/mv_mcv.sql b/src/test/regress/sql/mv_mcv.sql
new file mode 100644
index 0000000..b31d32d
--- /dev/null
+++ b/src/test/regress/sql/mv_mcv.sql
@@ -0,0 +1,178 @@
+-- data type passed by value
+CREATE TABLE mcv_list (
+ a INT,
+ b INT,
+ c INT
+);
+
+-- unknown column
+CREATE STATISTICS s4 ON mcv_list (unknown_column) WITH (mcv);
+
+-- single column
+CREATE STATISTICS s4 ON mcv_list (a) WITH (mcv);
+
+-- single column, duplicated
+CREATE STATISTICS s4 ON mcv_list (a, a) WITH (mcv);
+
+-- two columns, one duplicated
+CREATE STATISTICS s4 ON mcv_list (a, a, b) WITH (mcv);
+
+-- unknown option
+CREATE STATISTICS s4 ON mcv_list (a, b, c) WITH (unknown_option);
+
+-- missing MCV statistics
+CREATE STATISTICS s4 ON mcv_list (a, b, c) WITH (dependencies, max_mcv_items=200);
+
+-- invalid max_mcv_items value / too low
+CREATE STATISTICS s4 ON mcv_list (a, b, c) WITH (mcv, max_mcv_items=10);
+
+-- invalid max_mcv_items value / too high
+CREATE STATISTICS s4 ON mcv_list (a, b, c) WITH (mcv, max_mcv_items=10000);
+
+-- correct command
+CREATE STATISTICS s4 ON mcv_list (a, b, c) WITH (mcv);
+
+-- random data
+INSERT INTO mcv_list
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+
+ANALYZE mcv_list;
+
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+
+TRUNCATE mcv_list;
+
+-- a => b, a => c, b => c
+INSERT INTO mcv_list
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+
+ANALYZE mcv_list;
+
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+
+TRUNCATE mcv_list;
+
+-- a => b, a => c
+INSERT INTO mcv_list
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+
+ANALYZE mcv_list;
+
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+
+TRUNCATE mcv_list;
+
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO mcv_list
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX mcv_idx ON mcv_list (a, b);
+
+ANALYZE mcv_list;
+
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mcv_list WHERE a = 10 AND b = 5;
+
+DROP TABLE mcv_list;
+
+-- varlena type (text)
+CREATE TABLE mcv_list (
+ a TEXT,
+ b TEXT,
+ c TEXT
+);
+
+CREATE STATISTICS s5 ON mcv_list (a, b, c) WITH (mcv);
+
+-- random data
+INSERT INTO mcv_list
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+
+ANALYZE mcv_list;
+
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+
+TRUNCATE mcv_list;
+
+-- a => b, a => c, b => c
+INSERT INTO mcv_list
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+
+ANALYZE mcv_list;
+
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+
+TRUNCATE mcv_list;
+
+-- a => b, a => c
+INSERT INTO mcv_list
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE mcv_list;
+
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+
+TRUNCATE mcv_list;
+
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO mcv_list
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX mcv_idx ON mcv_list (a, b);
+ANALYZE mcv_list;
+
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mcv_list WHERE a = '10' AND b = '5';
+
+TRUNCATE mcv_list;
+
+-- check explain (expect bitmap index scan, not plain index scan) with NULLs
+INSERT INTO mcv_list
+ SELECT
+ (CASE WHEN i/10000 = 0 THEN NULL ELSE i/10000 END),
+ (CASE WHEN i/20000 = 0 THEN NULL ELSE i/20000 END),
+ (CASE WHEN i/40000 = 0 THEN NULL ELSE i/40000 END)
+ FROM generate_series(1,1000000) s(i);
+ANALYZE mcv_list;
+
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mcv_list WHERE a IS NULL AND b IS NULL;
+
+DROP TABLE mcv_list;
+
+-- NULL values (mix of int and text columns)
+CREATE TABLE mcv_list (
+ a INT,
+ b TEXT,
+ c INT,
+ d TEXT
+);
+
+CREATE STATISTICS s6 ON mcv_list (a, b, c, d) WITH (mcv);
+
+INSERT INTO mcv_list
+ SELECT
+ mod(i, 100),
+ (CASE WHEN mod(i, 200) = 0 THEN NULL ELSE mod(i,200) END),
+ mod(i, 400),
+ (CASE WHEN mod(i, 300) = 0 THEN NULL ELSE mod(i,600) END)
+ FROM generate_series(1,10000) s(i);
+
+ANALYZE mcv_list;
+
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+
+DROP TABLE mcv_list;
--
2.1.0
Attachment: 0005-multivariate-histograms.patch (text/x-patch)
From 355eb43e91c636e601c0581e6838b67d635a5981 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tv@fuzzy.cz>
Date: Sun, 11 Jan 2015 20:18:24 +0100
Subject: [PATCH 5/9] multivariate histograms
- extends the pg_mv_statistic catalog (add 'hist' fields)
- building the histograms during ANALYZE
- simple estimation while planning the queries
Includes regression tests mostly equal to those for functional
dependencies / MCV lists.
---
doc/src/sgml/ref/create_statistics.sgml | 18 +
src/backend/catalog/system_views.sql | 4 +-
src/backend/commands/statscmds.c | 44 +-
src/backend/nodes/outfuncs.c | 2 +
src/backend/optimizer/path/clausesel.c | 571 +++++++-
src/backend/optimizer/util/plancat.c | 4 +-
src/backend/utils/mvstats/Makefile | 2 +-
src/backend/utils/mvstats/README.histogram | 287 ++++
src/backend/utils/mvstats/README.stats | 2 +
src/backend/utils/mvstats/common.c | 37 +-
src/backend/utils/mvstats/histogram.c | 2032 ++++++++++++++++++++++++++++
src/bin/psql/describe.c | 17 +-
src/include/catalog/pg_mv_statistic.h | 24 +-
src/include/catalog/pg_proc.h | 4 +
src/include/nodes/relation.h | 2 +
src/include/utils/mvstats.h | 136 +-
src/test/regress/expected/mv_histogram.out | 207 +++
src/test/regress/expected/rules.out | 4 +-
src/test/regress/parallel_schedule | 2 +-
src/test/regress/serial_schedule | 1 +
src/test/regress/sql/mv_histogram.sql | 176 +++
21 files changed, 3538 insertions(+), 38 deletions(-)
create mode 100644 src/backend/utils/mvstats/README.histogram
create mode 100644 src/backend/utils/mvstats/histogram.c
create mode 100644 src/test/regress/expected/mv_histogram.out
create mode 100644 src/test/regress/sql/mv_histogram.sql
diff --git a/doc/src/sgml/ref/create_statistics.sgml b/doc/src/sgml/ref/create_statistics.sgml
index 193e4b0..fd3382e 100644
--- a/doc/src/sgml/ref/create_statistics.sgml
+++ b/doc/src/sgml/ref/create_statistics.sgml
@@ -133,6 +133,24 @@ CREATE STATISTICS [ IF NOT EXISTS ] <replaceable class="PARAMETER">statistics_na
</varlistentry>
<varlistentry>
+ <term><literal>histogram</> (<type>boolean</>)</term>
+ <listitem>
+ <para>
+ Enables a histogram for the statistics.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><literal>max_buckets</> (<type>integer</>)</term>
+ <listitem>
+ <para>
+ Maximum number of histogram buckets.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
<term><literal>max_mcv_items</> (<type>integer</>)</term>
<listitem>
<para>
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 2d570ee..6afdee0 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -167,7 +167,9 @@ CREATE VIEW pg_mv_stats AS
length(S.stadeps) as depsbytes,
pg_mv_stats_dependencies_info(S.stadeps) as depsinfo,
length(S.stamcv) AS mcvbytes,
- pg_mv_stats_mcvlist_info(S.stamcv) AS mcvinfo
+ pg_mv_stats_mcvlist_info(S.stamcv) AS mcvinfo,
+ length(S.stahist) AS histbytes,
+ pg_mv_stats_histogram_info(S.stahist) AS histinfo
FROM (pg_mv_statistic S JOIN pg_class C ON (C.oid = S.starelid))
LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace);
diff --git a/src/backend/commands/statscmds.c b/src/backend/commands/statscmds.c
index 90bfaed..b974655 100644
--- a/src/backend/commands/statscmds.c
+++ b/src/backend/commands/statscmds.c
@@ -137,12 +137,15 @@ CreateStatistics(CreateStatsStmt *stmt)
/* by default build nothing */
bool build_dependencies = false,
- build_mcv = false;
+ build_mcv = false,
+ build_histogram = false;
- int32 max_mcv_items = -1;
+ int32 max_buckets = -1,
+ max_mcv_items = -1;
/* options required because of other options */
- bool require_mcv = false;
+ bool require_mcv = false,
+ require_histogram = false;
Assert(IsA(stmt, CreateStatsStmt));
@@ -241,6 +244,29 @@ CreateStatistics(CreateStatsStmt *stmt)
MVSTAT_MCVLIST_MAX_ITEMS)));
}
+ else if (strcmp(opt->defname, "histogram") == 0)
+ build_histogram = defGetBoolean(opt);
+ else if (strcmp(opt->defname, "max_buckets") == 0)
+ {
+ max_buckets = defGetInt32(opt);
+
+ /* this option requires 'histogram' to be enabled */
+ require_histogram = true;
+
+ /* sanity check */
+ if (max_buckets < MVSTAT_HIST_MIN_BUCKETS)
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("minimum number of buckets is %d",
+ MVSTAT_HIST_MIN_BUCKETS)));
+
+ else if (max_buckets > MVSTAT_HIST_MAX_BUCKETS)
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("maximum number of buckets is %d",
+ MVSTAT_HIST_MAX_BUCKETS)));
+
+ }
else
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
@@ -249,10 +275,10 @@ CreateStatistics(CreateStatsStmt *stmt)
}
/* check that at least some statistics were requested */
- if (! (build_dependencies || build_mcv))
+ if (! (build_dependencies || build_mcv || build_histogram))
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
- errmsg("no statistics type (dependencies, mcv) was requested")));
+ errmsg("no statistics type (dependencies, mcv, histogram) was requested")));
/* now do some checking of the options */
if (require_mcv && (! build_mcv))
@@ -260,6 +286,11 @@ CreateStatistics(CreateStatsStmt *stmt)
(errcode(ERRCODE_SYNTAX_ERROR),
errmsg("option 'mcv' is required by other options(s)")));
+ if (require_histogram && (! build_histogram))
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("option 'histogram' is required by other options(s)")));
+
/* sort the attnums and build int2vector */
qsort(attnums, numcols, sizeof(int16), compare_int16);
stakeys = buildint2vector(attnums, numcols);
@@ -279,11 +310,14 @@ CreateStatistics(CreateStatsStmt *stmt)
values[Anum_pg_mv_statistic_deps_enabled -1] = BoolGetDatum(build_dependencies);
values[Anum_pg_mv_statistic_mcv_enabled -1] = BoolGetDatum(build_mcv);
+ values[Anum_pg_mv_statistic_hist_enabled -1] = BoolGetDatum(build_histogram);
values[Anum_pg_mv_statistic_mcv_max_items -1] = Int32GetDatum(max_mcv_items);
+ values[Anum_pg_mv_statistic_hist_max_buckets -1] = Int32GetDatum(max_buckets);
nulls[Anum_pg_mv_statistic_stadeps -1] = true;
nulls[Anum_pg_mv_statistic_stamcv -1] = true;
+ nulls[Anum_pg_mv_statistic_stahist -1] = true;
/* insert the tuple into pg_mv_statistic */
mvstatrel = heap_open(MvStatisticRelationId, RowExclusiveLock);
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index 333e24b..9172f21 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -2163,10 +2163,12 @@ _outMVStatisticInfo(StringInfo str, const MVStatisticInfo *node)
/* enabled statistics */
WRITE_BOOL_FIELD(deps_enabled);
WRITE_BOOL_FIELD(mcv_enabled);
+ WRITE_BOOL_FIELD(hist_enabled);
/* built/available statistics */
WRITE_BOOL_FIELD(deps_built);
WRITE_BOOL_FIELD(mcv_built);
+ WRITE_BOOL_FIELD(hist_built);
}
static void
diff --git a/src/backend/optimizer/path/clausesel.c b/src/backend/optimizer/path/clausesel.c
index 977f88e..0de2418 100644
--- a/src/backend/optimizer/path/clausesel.c
+++ b/src/backend/optimizer/path/clausesel.c
@@ -49,6 +49,7 @@ static void addRangeClause(RangeQueryClause **rqlist, Node *clause,
#define MV_CLAUSE_TYPE_FDEP 0x01
#define MV_CLAUSE_TYPE_MCV 0x02
+#define MV_CLAUSE_TYPE_HIST 0x04
static bool clause_is_mv_compatible(Node *clause, Index relid, Bitmapset **attnums,
int type);
@@ -74,6 +75,8 @@ static Selectivity clauselist_mv_selectivity(PlannerInfo *root,
static Selectivity clauselist_mv_selectivity_mcvlist(PlannerInfo *root,
List *clauses, MVStatisticInfo *mvstats,
bool *fullmatch, Selectivity *lowsel);
+static Selectivity clauselist_mv_selectivity_histogram(PlannerInfo *root,
+ List *clauses, MVStatisticInfo *mvstats);
static int update_match_bitmap_mcvlist(PlannerInfo *root, List *clauses,
int2vector *stakeys, MCVList mcvlist,
@@ -81,6 +84,12 @@ static int update_match_bitmap_mcvlist(PlannerInfo *root, List *clauses,
Selectivity *lowsel, bool *fullmatch,
bool is_or);
+static int update_match_bitmap_histogram(PlannerInfo *root, List *clauses,
+ int2vector *stakeys,
+ MVSerializedHistogram mvhist,
+ int nmatches, char * matches,
+ bool is_or);
+
static bool has_stats(List *stats, int type);
static List * find_stats(PlannerInfo *root, Index relid);
@@ -93,6 +102,7 @@ static List * find_stats(PlannerInfo *root, Index relid);
#define UPDATE_RESULT(m,r,isor) \
(m) = (isor) ? (MAX(m,r)) : (MIN(m,r))
+
/****************************************************************************
* ROUTINES TO COMPUTE SELECTIVITIES
****************************************************************************/
@@ -121,7 +131,7 @@ static List * find_stats(PlannerInfo *root, Index relid);
*
* First we try to reduce the list of clauses by applying (soft) functional
* dependencies, and then we try to estimate the selectivity of the reduced
- * list of clauses using the multivariate MCV list.
+ * list of clauses using the multivariate MCV list and histograms.
*
* Finally we remove the portion of clauses estimated using multivariate stats,
* and process the rest of the clauses using the regular per-column stats.
@@ -214,11 +224,13 @@ clauselist_selectivity(PlannerInfo *root,
* with the multivariate code and simply skip to estimation using the
* regular per-column stats.
*/
- if (has_stats(stats, MV_CLAUSE_TYPE_MCV) &&
- (count_mv_attnums(clauses, relid, MV_CLAUSE_TYPE_MCV) >= 2))
+ if (has_stats(stats, MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST) &&
+ (count_mv_attnums(clauses, relid,
+ MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST) >= 2))
{
/* collect attributes from the compatible conditions */
- Bitmapset *mvattnums = collect_mv_attnums(clauses, relid, MV_CLAUSE_TYPE_MCV);
+ Bitmapset *mvattnums = collect_mv_attnums(clauses, relid,
+ MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST);
/* and search for the statistic covering the most attributes */
MVStatisticInfo *mvstat = choose_mv_statistics(stats, mvattnums);
@@ -230,7 +242,7 @@ clauselist_selectivity(PlannerInfo *root,
/* split the clauselist into regular and mv-clauses */
clauses = clauselist_mv_split(root, relid, clauses, &mvclauses,
- mvstat, MV_CLAUSE_TYPE_MCV);
+ mvstat, MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST);
/* we've chosen the histogram to match the clauses */
Assert(mvclauses != NIL);
@@ -942,6 +954,7 @@ static Selectivity
clauselist_mv_selectivity(PlannerInfo *root, List *clauses, MVStatisticInfo *mvstats)
{
bool fullmatch = false;
+ Selectivity s1 = 0.0, s2 = 0.0;
/*
* Lowest frequency in the MCV list (may be used as an upper bound
@@ -955,9 +968,24 @@ clauselist_mv_selectivity(PlannerInfo *root, List *clauses, MVStatisticInfo *mvs
* MCV/histogram evaluation).
*/
- /* Evaluate the MCV selectivity */
- return clauselist_mv_selectivity_mcvlist(root, clauses, mvstats,
+ /* Evaluate the MCV first. */
+ s1 = clauselist_mv_selectivity_mcvlist(root, clauses, mvstats,
&fullmatch, &mcv_low);
+
+ /*
+ * If we got a full equality match on the MCV list, we're done (and
+ * the estimate is pretty good).
+ */
+ if (fullmatch && (s1 > 0.0))
+ return s1;
+
+ /* TODO if (fullmatch) without matching MCV item, use the mcv_low
+ * selectivity as upper bound */
+
+ s2 = clauselist_mv_selectivity_histogram(root, clauses, mvstats);
+
+ /* TODO clamp to <= 1.0 (or more strictly, when possible) */
+ return s1 + s2;
}
/*
@@ -1160,7 +1188,7 @@ choose_mv_statistics(List *stats, Bitmapset *attnums)
int numattrs = attrs->dim1;
/* skip dependencies-only stats */
- if (! info->mcv_built)
+ if (! (info->mcv_built || info->hist_built))
continue;
/* count columns covered by the histogram */
@@ -1391,7 +1419,7 @@ mv_compatible_walker(Node *node, mv_compatible_context *context)
case F_SCALARGTSEL:
/* not compatible with functional dependencies */
- if (! (context->types & MV_CLAUSE_TYPE_MCV))
+ if (! (context->types & (MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST)))
return true; /* terminate */
break;
@@ -2007,6 +2035,9 @@ has_stats(List *stats, int type)
if ((type & MV_CLAUSE_TYPE_MCV) && stat->mcv_built)
return true;
+
+ if ((type & MV_CLAUSE_TYPE_HIST) && stat->hist_built)
+ return true;
}
return false;
@@ -2411,3 +2442,525 @@ update_match_bitmap_mcvlist(PlannerInfo *root, List *clauses,
return nmatches;
}
+
+/*
+ * Estimate selectivity of clauses using a histogram.
+ *
+ * If there's no histogram for the stats, the function returns 0.0.
+ *
+ * The general idea of this method is similar to how MCV lists are
+ * processed, except that this introduces the concept of a partial
+ * match (MCV only works with full match / mismatch).
+ *
+ * The algorithm works like this:
+ *
+ * 1) mark all buckets as 'full match'
+ * 2) walk through all the clauses
+ * 3) for a particular clause, walk through all the buckets
+ * 4) skip buckets that are already 'no match'
+ * 5) check clause for buckets that still match (at least partially)
+ * 6) sum frequencies for buckets to get selectivity
+ *
+ * Unlike MCV lists, histograms have a concept of a partial match. In
+ * that case we use 1/2 the bucket, to minimize the average error. The
+ * MV histograms are usually less detailed than the per-column ones,
+ * meaning the sum is often quite high (thanks to combining a lot of
+ * "partially hit" buckets).
+ *
+ * Maybe we could use per-bucket information with number of distinct
+ * values it contains (for each dimension), and then use that to correct
+ * the estimate (so with 10 distinct values, we'd use 1/10 of the bucket
+ * frequency). We might also scale the value depending on the actual
+ * ndistinct estimate (not just the values observed in the sample).
+ *
+ * Another option would be to multiply the selectivities, i.e. if we get
+ * 'partial match' for a bucket for multiple conditions, we might use
+ * 0.5^k (where k is the number of conditions), instead of 0.5. This
+ * probably does not minimize the average error, though.
+ *
+ * TODO This might use a similar shortcut to MCV lists - count buckets
+ * marked as partial/full match, and terminate once this drops to 0.
+ * Not sure if it's really worth it - for MCV lists a situation like
+ * this is not uncommon, but for histograms it's not that clear.
+ */
+static Selectivity
+clauselist_mv_selectivity_histogram(PlannerInfo *root, List *clauses,
+ MVStatisticInfo *mvstats)
+{
+ int i;
+ Selectivity s = 0.0;
+ Selectivity u = 0.0;
+
+ int nmatches = 0;
+ char *matches = NULL;
+
+ MVSerializedHistogram mvhist = NULL;
+
+ /* there's no histogram */
+ if (! mvstats->hist_built)
+ return 0.0;
+
+ /* load the histogram from the catalog (hist_built was verified above) */
+ mvhist = load_mv_histogram(mvstats->mvoid);
+
+ Assert (mvhist != NULL);
+ Assert (clauses != NIL);
+ Assert (list_length(clauses) >= 2);
+
+ /*
+	 * Bitmap of bucket matches (mismatch, partial, full). By default
+	 * all buckets fully match, and we gradually eliminate them.
+ */
+ matches = palloc0(sizeof(char) * mvhist->nbuckets);
+ memset(matches, MVSTATS_MATCH_FULL, sizeof(char)*mvhist->nbuckets);
+
+ nmatches = mvhist->nbuckets;
+
+ /* build the match bitmap */
+ update_match_bitmap_histogram(root, clauses,
+ mvstats->stakeys, mvhist,
+ nmatches, matches, false);
+
+ /* now, walk through the buckets and sum the selectivities */
+ for (i = 0; i < mvhist->nbuckets; i++)
+ {
+ /*
+ * Find out what part of the data is covered by the histogram,
+ * so that we can 'scale' the selectivity properly (e.g. when
+ * only 50% of the sample got into the histogram, and the rest
+ * is in a MCV list).
+ *
+ * TODO This might be handled by keeping a global "frequency"
+ * for the whole histogram, which might save us some time
+ * spent accessing the not-matching part of the histogram.
+ * Although it's likely in a cache, so it's very fast.
+ */
+ u += mvhist->buckets[i]->ntuples;
+
+ if (matches[i] == MVSTATS_MATCH_FULL)
+ s += mvhist->buckets[i]->ntuples;
+ else if (matches[i] == MVSTATS_MATCH_PARTIAL)
+ s += 0.5 * mvhist->buckets[i]->ntuples;
+ }
+
+#ifdef DEBUG_MVHIST
+ debug_histogram_matches(mvhist, matches);
+#endif
+
+ /* release the allocated bitmap and deserialized histogram */
+ pfree(matches);
+ pfree(mvhist);
+
+ return s * u;
+}
+
+/* cached result of bucket boundary comparison for a single dimension */
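+/* (bit 0 set = comparison already evaluated, bit 1 = the cached result) */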
+
+#define HIST_CACHE_NOT_FOUND 0x00
+#define HIST_CACHE_FALSE 0x01
+#define HIST_CACHE_TRUE 0x03
+#define HIST_CACHE_MASK 0x02
+
+static char
+bucket_contains_value(FmgrInfo ltproc, Datum constvalue,
+ Datum min_value, Datum max_value,
+ int min_index, int max_index,
+ bool min_include, bool max_include,
+ char * callcache)
+{
+ bool a, b;
+
+ char min_cached = callcache[min_index];
+ char max_cached = callcache[max_index];
+
+ /*
+	 * First some quick checks on equality - if the constant equals either
+	 * boundary, we have a partial match (so no need to call the comparator).
+ */
+ if (((min_value == constvalue) && (min_include)) ||
+ ((max_value == constvalue) && (max_include)))
+ return MVSTATS_MATCH_PARTIAL;
+
+ /* Keep the values 0/1 because of the XOR at the end. */
+ a = ((min_cached & HIST_CACHE_MASK) >> 1);
+ b = ((max_cached & HIST_CACHE_MASK) >> 1);
+
+ /*
+	 * If the result for the bucket lower bound is not in the cache, evaluate
+	 * the comparison function and store the result in the cache.
+ */
+ if (! min_cached)
+ {
+ a = DatumGetBool(FunctionCall2Coll(&ltproc,
+ DEFAULT_COLLATION_OID,
+ constvalue, min_value));
+ /* remember the result */
+ callcache[min_index] = (a) ? HIST_CACHE_TRUE : HIST_CACHE_FALSE;
+ }
+
+ /* And do the same for the upper bound. */
+ if (! max_cached)
+ {
+ b = DatumGetBool(FunctionCall2Coll(&ltproc,
+ DEFAULT_COLLATION_OID,
+ constvalue, max_value));
+ /* remember the result */
+ callcache[max_index] = (b) ? HIST_CACHE_TRUE : HIST_CACHE_FALSE;
+ }
+
+ return (a ^ b) ? MVSTATS_MATCH_PARTIAL : MVSTATS_MATCH_NONE;
+}
+
+static char
+bucket_is_smaller_than_value(FmgrInfo opproc, Datum constvalue,
+ Datum min_value, Datum max_value,
+ int min_index, int max_index,
+ bool min_include, bool max_include,
+ char * callcache, bool isgt)
+{
+ char min_cached = callcache[min_index];
+ char max_cached = callcache[max_index];
+
+ /* Keep the values 0/1 because of the XOR at the end. */
+ bool a = ((min_cached & HIST_CACHE_MASK) >> 1);
+ bool b = ((max_cached & HIST_CACHE_MASK) >> 1);
+
+ if (! min_cached)
+ {
+ a = DatumGetBool(FunctionCall2Coll(&opproc,
+ DEFAULT_COLLATION_OID,
+ min_value,
+ constvalue));
+ /* remember the result */
+ callcache[min_index] = (a) ? HIST_CACHE_TRUE : HIST_CACHE_FALSE;
+ }
+
+ if (! max_cached)
+ {
+ b = DatumGetBool(FunctionCall2Coll(&opproc,
+ DEFAULT_COLLATION_OID,
+ max_value,
+ constvalue));
+ /* remember the result */
+ callcache[max_index] = (b) ? HIST_CACHE_TRUE : HIST_CACHE_FALSE;
+ }
+
+ /*
+ * Now, we need to combine both results into the final answer, and we need
+ * to be careful about the 'isgt' variable which kinda inverts the meaning.
+ *
+ * First, we handle the case when each boundary returns different results.
+ * In that case the outcome can only be 'partial' match.
+ */
+ if (a != b)
+ return MVSTATS_MATCH_PARTIAL;
+
+ /*
+ * When the results are the same, then it depends on the 'isgt' value. There
+ * are four options:
+ *
+ * isgt=false a=b=true => full match
+ * isgt=false a=b=false => empty
+ * isgt=true a=b=true => empty
+ * isgt=true a=b=false => full match
+ *
+ * We'll cheat a bit, because we know that (a=b) so we'll use just one of them.
+ */
+ if (isgt)
+ return (!a) ? MVSTATS_MATCH_FULL : MVSTATS_MATCH_NONE;
+ else
+ return ( a) ? MVSTATS_MATCH_FULL : MVSTATS_MATCH_NONE;
+}
+
+/*
+ * Evaluate clauses using the histogram, and update the match bitmap.
+ *
+ * The bitmap may be already partially set, so this is really a way to
+ * combine results of several clause lists - either when computing
+ * conditional probability P(A|B) or a combination of AND/OR clauses.
+ *
+ * Note: This is not a simple bitmap in the sense that there are more
+ * than two possible values for each item - no match, partial
+ * match and full match. So we need 2 bits per item.
+ *
+ * TODO This works with 'bitmap' where each item is represented as a
+ * char, which is slightly wasteful. Instead, we could use a bitmap
+ * with 2 bits per item, reducing the size to ~1/4. By using values
+ * 0, 1 and 3 (instead of 0, 1 and 2), the operations (merging etc.)
+ * might be performed just like for simple bitmap by using & and |,
+ * which might be faster than min/max.
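+ *
+ * (The 0/1/3 encoding would work because the values are nested
+ * bitmasks: the AND-merge (MIN) becomes bitwise '&', e.g.
+ * (01 & 11) = 01, and the OR-merge (MAX) becomes bitwise '|',
+ * e.g. (01 | 11) = 11.)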
+ */
+static int
+update_match_bitmap_histogram(PlannerInfo *root, List *clauses,
+ int2vector *stakeys,
+ MVSerializedHistogram mvhist,
+ int nmatches, char * matches,
+ bool is_or)
+{
+ int i;
+ ListCell * l;
+
+ /*
+	 * Used for caching function calls, so that each deduplicated value
+	 * is evaluated only once per clause.
+	 *
+	 * We know there may be up to (2 * nbuckets) values per dimension, so
+	 * we allocate that much, once for all clauses, to minimize overhead.
+	 *
+	 * Also, we only need two bits per value, but this allocates a byte
+	 * per value. Might be worth optimizing.
+ *
+ * 0x00 - not yet called
+ * 0x01 - called, result is 'false'
+ * 0x03 - called, result is 'true'
+ */
+ char *callcache = palloc(2 * mvhist->nbuckets);
+
+ Assert(mvhist != NULL);
+ Assert(mvhist->nbuckets > 0);
+ Assert(nmatches >= 0);
+ Assert(nmatches <= mvhist->nbuckets);
+
+ Assert(clauses != NIL);
+ Assert(list_length(clauses) >= 1);
+
+ /* loop through the clauses and do the estimation */
+ foreach (l, clauses)
+ {
+ Node * clause = (Node*)lfirst(l);
+
+ /* if it's a RestrictInfo, then extract the clause */
+ if (IsA(clause, RestrictInfo))
+ clause = (Node*)((RestrictInfo*)clause)->clause;
+
+ /* it's either OpClause, or NullTest */
+ if (is_opclause(clause))
+ {
+ OpExpr * expr = (OpExpr*)clause;
+ bool varonleft = true;
+ bool ok;
+
+ FmgrInfo opproc; /* operator */
+ fmgr_info(get_opcode(expr->opno), &opproc);
+
+ /* reset the cache (per clause) */
+ memset(callcache, 0, 2 * mvhist->nbuckets);
+
+ ok = (NumRelids(clause) == 1) &&
+ (is_pseudo_constant_clause(lsecond(expr->args)) ||
+ (varonleft = false,
+ is_pseudo_constant_clause(linitial(expr->args))));
+
+ if (ok)
+ {
+ FmgrInfo ltproc;
+ RegProcedure oprrest = get_oprrest(expr->opno);
+
+ Var * var = (varonleft) ? linitial(expr->args) : lsecond(expr->args);
+ Const * cst = (varonleft) ? lsecond(expr->args) : linitial(expr->args);
+ bool isgt = (! varonleft);
+
+ TypeCacheEntry *typecache
+ = lookup_type_cache(var->vartype, TYPECACHE_LT_OPR);
+
+ /* lookup dimension for the attribute */
+ int idx = mv_get_index(var->varattno, stakeys);
+
+ fmgr_info(get_opcode(typecache->lt_opr), &ltproc);
+
+ /*
+ * Check this for all buckets that still have "true" in the bitmap
+ *
+ * We already know the clauses use suitable operators (because that's
+ * how we filtered them).
+ */
+ for (i = 0; i < mvhist->nbuckets; i++)
+ {
+ char res = MVSTATS_MATCH_NONE;
+
+ MVSerializedBucket bucket = mvhist->buckets[i];
+
+ /* histogram boundaries */
+ Datum minval, maxval;
+ bool mininclude, maxinclude;
+ int minidx, maxidx;
+
+ /*
+ * For AND-lists, we can also mark NULL buckets as 'no match'
+ * (and then skip them). For OR-lists this is not possible.
+ */
+ if ((! is_or) && bucket->nullsonly[idx])
+ matches[i] = MVSTATS_MATCH_NONE;
+
+ /*
+ * Skip buckets that were already eliminated - this is important
+ * considering how we update the info (we only lower the match).
+ * We can't really do anything about the MATCH_PARTIAL buckets.
+ */
+ if ((! is_or) && (matches[i] == MVSTATS_MATCH_NONE))
+ continue;
+ else if (is_or && (matches[i] == MVSTATS_MATCH_FULL))
+ continue;
+
+ /* lookup the values and cache of function calls */
+ minidx = bucket->min[idx];
+ maxidx = bucket->max[idx];
+
+ minval = mvhist->values[idx][bucket->min[idx]];
+ maxval = mvhist->values[idx][bucket->max[idx]];
+
+ mininclude = bucket->min_inclusive[idx];
+ maxinclude = bucket->max_inclusive[idx];
+
+ /*
+ * TODO Maybe it's possible to add here a similar optimization
+ * as for the MCV lists:
+ *
+ * (nmatches == 0) && AND-list => all eliminated (FALSE)
+ * (nmatches == N) && OR-list => all eliminated (TRUE)
+ *
+ * But it's more complex because of the partial matches.
+ */
+
+ /*
+ * If it's not a "<" or ">" or "=" operator, just ignore the
+ * clause. Otherwise note the relid and attnum for the variable.
+ *
+ * TODO I'm really unsure the handling of 'isgt' flag (that is, clauses
+ * with reverse order of variable/constant) is correct. I wouldn't
+ * be surprised if there was some mixup. Using the lt/gt operators
+ * instead of messing with the opproc could make it simpler.
+ * It would however be using a different operator than the query,
+ * although it's not any shadier than using the selectivity function
+ * as is done currently.
+ */
+ switch (oprrest)
+ {
+ case F_SCALARLTSEL: /* Var < Const */
+ case F_SCALARGTSEL: /* Var > Const */
+
+ res = bucket_is_smaller_than_value(opproc, cst->constvalue,
+ minval, maxval,
+ minidx, maxidx,
+ mininclude, maxinclude,
+ callcache, isgt);
+ break;
+
+ case F_EQSEL:
+
+ /*
+ * We only check whether the value is within the bucket, using the
+ * lt operator, and we also check for equality with the boundaries.
+ */
+
+ res = bucket_contains_value(ltproc, cst->constvalue,
+ minval, maxval,
+ minidx, maxidx,
+ mininclude, maxinclude,
+ callcache);
+ break;
+ }
+
+ UPDATE_RESULT(matches[i], res, is_or);
+
+ }
+ }
+ }
+ else if (IsA(clause, NullTest))
+ {
+ NullTest * expr = (NullTest*)clause;
+ Var * var = (Var*)(expr->arg);
+
+ /* FIXME proper matching attribute to dimension */
+ int idx = mv_get_index(var->varattno, stakeys);
+
+ /*
+ * Walk through the buckets and evaluate the current clause, skipping
+ * buckets that were already ruled out.
+ */
+ for (i = 0; i < mvhist->nbuckets; i++)
+ {
+ MVSerializedBucket bucket = mvhist->buckets[i];
+
+ /*
+ * Skip buckets that were already eliminated - this is important
+ * considering how we update the info (we only lower the match)
+ */
+ if ((! is_or) && (matches[i] == MVSTATS_MATCH_NONE))
+ continue;
+ else if (is_or && (matches[i] == MVSTATS_MATCH_FULL))
+ continue;
+
+ /* if the clause mismatches the bucket, set it as MATCH_NONE */
+ if ((expr->nulltesttype == IS_NULL)
+ && (! bucket->nullsonly[idx]))
+ UPDATE_RESULT(matches[i], MVSTATS_MATCH_NONE, is_or);
+
+ else if ((expr->nulltesttype == IS_NOT_NULL) &&
+ (bucket->nullsonly[idx]))
+ UPDATE_RESULT(matches[i], MVSTATS_MATCH_NONE, is_or);
+ }
+ }
+ else if (or_clause(clause) || and_clause(clause))
+ {
+ /* AND/OR clause, with all clauses compatible with the selected MV stat */
+
+ int i;
+ BoolExpr *orclause = ((BoolExpr*)clause);
+ List *orclauses = orclause->args;
+
+ /* match/mismatch bitmap for each bucket */
+ int or_nmatches = 0;
+ char * or_matches = NULL;
+
+ Assert(orclauses != NIL);
+ Assert(list_length(orclauses) >= 2);
+
+ /* number of matching buckets */
+ or_nmatches = mvhist->nbuckets;
+
+ /* allocate the match bitmap (the defaults are set just below) */
+ or_matches = palloc0(sizeof(char) * or_nmatches);
+
+ if (or_clause(clause))
+ {
+ /* OR clauses assume nothing matches, initially */
+ memset(or_matches, MVSTATS_MATCH_NONE, sizeof(char)*or_nmatches);
+ or_nmatches = 0;
+ }
+ else
+ {
+ /* AND clauses assume everything matches, initially */
+ memset(or_matches, MVSTATS_MATCH_FULL, sizeof(char)*or_nmatches);
+ }
+
+ /* build the match bitmap for the OR-clauses */
+ or_nmatches = update_match_bitmap_histogram(root, orclauses,
+ stakeys, mvhist,
+ or_nmatches, or_matches, or_clause(clause));
+
+ /* merge the bitmap into the existing one */
+ for (i = 0; i < mvhist->nbuckets; i++)
+ {
+ /*
+ * To AND-merge the bitmaps, a MIN() semantics is used.
+ * For OR-merge, use MAX().
+ *
+ * FIXME this does not decrease the number of matches
+ */
+ UPDATE_RESULT(matches[i], or_matches[i], is_or);
+ }
+
+ pfree(or_matches);
+
+ }
+ else
+ elog(ERROR, "unknown clause type: %d", clause->type);
+ }
+
+ /* free the call cache */
+ pfree(callcache);
+
+ return nmatches;
+}
diff --git a/src/backend/optimizer/util/plancat.c b/src/backend/optimizer/util/plancat.c
index d807dc7..40145e7 100644
--- a/src/backend/optimizer/util/plancat.c
+++ b/src/backend/optimizer/util/plancat.c
@@ -416,7 +416,7 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
mvstat = (Form_pg_mv_statistic) GETSTRUCT(htup);
/* unavailable stats are not interesting for the planner */
- if (mvstat->deps_built || mvstat->mcv_built)
+ if (mvstat->deps_built || mvstat->mcv_built || mvstat->hist_built)
{
info = makeNode(MVStatisticInfo);
@@ -426,10 +426,12 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
/* enabled statistics */
info->deps_enabled = mvstat->deps_enabled;
info->mcv_enabled = mvstat->mcv_enabled;
+ info->hist_enabled = mvstat->hist_enabled;
/* built/available statistics */
info->deps_built = mvstat->deps_built;
info->mcv_built = mvstat->mcv_built;
+ info->hist_built = mvstat->hist_built;
/* stakeys */
adatum = SysCacheGetAttr(MVSTATOID, htup,
diff --git a/src/backend/utils/mvstats/Makefile b/src/backend/utils/mvstats/Makefile
index f9bf10c..9dbb3b6 100644
--- a/src/backend/utils/mvstats/Makefile
+++ b/src/backend/utils/mvstats/Makefile
@@ -12,6 +12,6 @@ subdir = src/backend/utils/mvstats
top_builddir = ../../../..
include $(top_builddir)/src/Makefile.global
-OBJS = common.o dependencies.o mcv.o
+OBJS = common.o dependencies.o histogram.o mcv.o
include $(top_srcdir)/src/backend/common.mk
diff --git a/src/backend/utils/mvstats/README.histogram b/src/backend/utils/mvstats/README.histogram
new file mode 100644
index 0000000..8234d2c
--- /dev/null
+++ b/src/backend/utils/mvstats/README.histogram
@@ -0,0 +1,287 @@
+Multivariate histograms
+=======================
+
+Histograms on individual attributes consist of buckets represented by ranges,
+covering the domain of the attribute. That is, each bucket is a [min,max]
+interval, and contains all values in this range. The histogram is built in such
+a way that all buckets have about the same frequency.
+
+Multivariate histograms are an extension into n-dimensional space - the buckets
+are n-dimensional intervals (i.e. n-dimensional rectangles), covering the domain
+of the combination of attributes. That is, each bucket has a vector of lower
+and upper boundaries, denoted min[i] and max[i] (where i = 1..n).
+
+In addition to the boundaries, each bucket tracks additional info:
+
+ * frequency (fraction of tuples in the bucket)
+ * whether the boundaries are inclusive or exclusive
+ * whether the dimension contains only NULL values
+ * number of distinct values in each dimension (for building only)
+
+It's possible that in the future we'll have multiple histogram types, with different
+features. We do however expect all the types to share the same representation
+(buckets as ranges) and only differ in how we build them.
+
+The current implementation builds non-overlapping buckets, but that may not be
+true for other histogram types, so the code should not rely on this assumption. There
+are interesting types of histograms (or algorithms) with overlapping buckets.
+
+When used on low-cardinality data, histograms usually perform considerably worse
+than MCV lists (which are a good fit for this kind of data). This is especially
+true for label-like values, where the ordering of the values is mostly unrelated
+to the meaning of the data, while proper ordering is crucial for histograms.
+
+On high-cardinality data the histograms are usually a better choice, because MCV
+lists can't represent the distribution accurately enough.
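+
+For example (hypothetical table and column names, using the syntax from this
+patch series), one might combine both statistics types on the same table:
+
+  CREATE STATISTICS s1 ON t (label_a, label_b) WITH (mcv);
+  CREATE STATISTICS s2 ON t (x, y) WITH (histogram, max_buckets = 1000);
+  ANALYZE t;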
+
+
+Selectivity estimation
+----------------------
+
+The estimation is implemented in clauselist_mv_selectivity_histogram(), and
+works very similarly to clauselist_mv_selectivity_mcvlist().
+
+The main difference is that while MCV lists support exact matches, histograms
+often result in approximate matches - e.g. with equality we can only say
+whether the constant falls within the bucket, but not whether it is actually
+present or what fraction of the bucket it matches. In this case we rely on
+some defaults just like in the per-column histograms.
+
+The current implementation uses histograms to estimate these types of clauses
+(think of WHERE conditions):
+
+ (a) equality clauses WHERE (a = 1) AND (b = 2)
+ (b) inequality clauses WHERE (a < 1) AND (b >= 2)
+ (c) NULL clauses WHERE (a IS NULL) AND (b IS NOT NULL)
+ (d) OR-clauses WHERE (a = 1) OR (b = 2)
+
+Similarly to MCV lists, it's possible to add support for additional types of
+clauses, for example:
+
+ (e) multi-var clauses WHERE (a > b)
+
+and so on. These are tasks for the future, not yet implemented.
+
+
+When evaluating a clause on a bucket, we may get one of three results:
+
+ (a) FULL_MATCH - The bucket definitely matches the clause.
+
+ (b) PARTIAL_MATCH - The bucket matches the clause, but not necessarily all
+ the tuples it represents.
+
+ (c) NO_MATCH - The bucket definitely does not match the clause.
+
+This may be illustrated using a range [1, 5], which is essentially a 1-D bucket.
+With clause
+
+ WHERE (a < 10) => FULL_MATCH (all range values are below
+ 10, so the whole bucket matches)
+
+ WHERE (a < 3) => PARTIAL_MATCH (there may be values matching
+ the clause, but we don't know how many)
+
+ WHERE (a < 0) => NO_MATCH (the whole range is above 0, so
+ no values from the bucket can match)
+
+Some clauses may produce only some of those results - for example equality
+clauses may never produce FULL_MATCH as we always hit only part of the bucket
+(we can't match both boundaries at the same time). This results in less accurate
+estimates compared to MCV lists, where we can hit an MCV item exactly (there's
+no PARTIAL match in MCV).
+
+There are also clauses that may not produce any PARTIAL_MATCH results. A nice
+example of that is 'IS [NOT] NULL' clause, which either matches the bucket
+completely (FULL_MATCH) or not at all (NO_MATCH), thanks to how the NULL-buckets
+are constructed.
+
+Computing the total selectivity estimate is trivial - simply sum selectivities
+from all the FULL_MATCH and PARTIAL_MATCH buckets (but for buckets marked with
+PARTIAL_MATCH, multiply the frequency by 0.5 to minimize the average error).
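+
+For example, two FULL_MATCH buckets with frequency 0.01 each, plus four
+PARTIAL_MATCH buckets with frequency 0.02 each, yield
+
+  (2 * 0.01) + 0.5 * (4 * 0.02) = 0.06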
+
+
+Building a histogram
+---------------------
+
+The general algorithm for building a histogram is quite simple:
+
+ (a) create an initial bucket (containing all sample rows)
+
+ (b) create NULL buckets (by splitting the initial bucket)
+
+ (c) repeat
+
+ (1) choose bucket to split next
+
+ (2) terminate if no bucket that might be split is found, or if we've
+ reached the maximum number of buckets (16384)
+
+ (3) choose dimension to partition the bucket by
+
+ (4) partition the bucket by the selected dimension
+
+The main complexity is hidden in steps (c.1) and (c.3), i.e. how we choose the
+bucket and dimension for the split.
+
+Similarly to one-dimensional histograms, we want to produce buckets with roughly
+the same frequency. We also need to produce "regular" buckets, because buckets
+with one "side" much longer than the others are very likely to match a lot of
+conditions (which increases error, even if the bucket frequency is very low).
+
+To achieve this, we choose the largest bucket (containing the most sample rows),
+but we only choose buckets that can actually be split (have at least 3 different
+combinations of values).
+
+Then we choose the "longest" dimension of the bucket, using the number of
+distinct values in the sample as the measure of length.
+
+For details see functions select_bucket_to_partition() and partition_bucket().
+
+The current limit on number of buckets (16384) is mostly arbitrary, but chosen
+so that it guarantees we don't exceed the number of distinct values indexable by
+uint16 in any of the dimensions. In practice we could handle more buckets as we
+index each dimension separately and the splits should use the dimensions evenly.
+
+Also, histograms this large (with 16k values in multiple dimensions) would be
+quite expensive to build and process, so the 16k limit is rather reasonable.
+
+The actual number of buckets is also related to statistics target, because we
+require MIN_BUCKET_ROWS (10) tuples per bucket before a split, so we can't have
+more than (2 * 300 * target / 10) buckets. For the default target (100) this
+evaluates to ~6k.
+
+
+NULL handling (create_null_buckets)
+-----------------------------------
+
+When building histograms on a single attribute, we first filter out NULL values.
+In the multivariate case, we can't really do that because the rows may contain
+a mix of NULL and non-NULL values in different columns (so we can't simply
+filter all of them out).
+
+For this reason, the histograms are built so that in each bucket, each
+dimension contains either only NULL or only non-NULL values. Building the NULL-buckets
+happens as the first step in the build, by the create_null_buckets() function.
+The number of NULL buckets, as produced by this function, has a clear upper
+boundary (2^N) where N is the number of dimensions (attributes the histogram is
+built on). Or rather 2^K where K is the number of attributes that are not marked
+as not-NULL.
+
+The buckets with NULL dimensions are then subject to the same build algorithm
+(i.e. may be split into smaller buckets) just like any other bucket, but may
+only be split by non-NULL dimension.
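+
+For example, with two columns (a,b), both containing NULL values, the initial
+bucket may be split into up to 2^2 = 4 NULL-buckets:
+
+  (a non-NULL, b non-NULL)
+  (a non-NULL, b NULL)
+  (a NULL, b non-NULL)
+  (a NULL, b NULL)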
+
+
+Serialization
+-------------
+
+To store the histogram in the pg_mv_statistic catalog, it is serialized into a
+more efficient form. We also use this representation for estimation, i.e. we don't
+fully deserialize the histogram.
+
+For example the boundary values are deduplicated to minimize the required space.
+How much redundancy is there, actually? Let's assume there are no NULL values,
+so we start with a single bucket - in that case we have 2*N boundaries. Each
+time we split a bucket we introduce one new value (in the "middle" of one of
+the dimensions), and keep boundaries for all the other dimensions. So after K
+splits, we have up to
+
+ 2*N + K
+
+unique boundary values (we may have fewer values, if the same value is used for
+several splits). But after K splits we do have (K+1) buckets, so
+
+ (K+1) * 2 * N
+
+boundary values. Using e.g. N=4 and K=999, we arrive at these numbers:
+
+ 2*N + K = 1007
+ (K+1) * 2 * N = 8000
+
+which means a lot of redundancy. It's somewhat counter-intuitive that the number
+of distinct values does not really depend on the number of dimensions (except
+for the initial bucket, but that's negligible compared to the total).
+
+By deduplicating the values and replacing them with 16-bit indexes (uint16), we
+reduce the required space to
+
+ 1007 * 8 + 8000 * 2 ~= 24kB
+
+which is significantly less than 64kB required for the 'raw' histogram (assuming
+the values are 8B).
+
+While the bytea compression (pglz) might achieve the same reduction of space,
+the deduplicated representation is used to optimize the estimation by caching
+results of function calls for already visited values. This significantly
+reduces the number of calls to (often quite expensive) operators.
+
+Note: Of course, this reasoning only holds for histograms built by the algorithm
+that simply splits the buckets in half. Other histograms types (e.g. containing
+overlapping buckets) may behave differently and require different serialization.
+
+Serialized histograms are marked with a 'magic' constant, to make it easier to
+check that a bytea value really is a serialized histogram.
+
+
+varlena compression
+-------------------
+
+This serialization may however defeat the automatic varlena compression, because
+the array of unique values is placed at the beginning of the serialized form.
+That is exactly the chunk used by pglz to check whether the data is compressible,
+and it will probably decide it's not very compressible. This is similar to the
+issue we initially had with JSONB.
+
+Maybe storing buckets first would make it work, as the buckets may be better
+compressible.
+
+On the other hand the serialization is actually a context-aware compression,
+usually compressing to ~30% (or even less, with large data types). So the lack
+of additional pglz compression may be acceptable.
+
+
+Deserialization
+---------------
+
+The deserialization is not a perfect inverse of the serialization, as we keep
+the deduplicated arrays. This reduces the amount of memory and also allows
+optimizations during estimation (e.g. we can cache results for the distinct
+values, saving expensive function calls).
+
+
+Inspecting the histogram
+------------------------
+
+Inspecting the regular (per-attribute) histograms is trivial, as it's enough
+to select the columns from pg_stats - the data is encoded as anyarray, so we
+simply get the text representation of the array.
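+
+For example (hypothetical table and column names):
+
+  SELECT histogram_bounds FROM pg_stats
+   WHERE tablename = 't' AND attname = 'a';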
+
+With multivariate histograms it's not that simple due to the possible mix of
+data types in the histogram. It might be possible to produce a similar array-like
+text representation, but that would unnecessarily complicate further processing
+and analysis of the histogram. Instead, there's an SRF that allows
+access to lower/upper boundaries, frequencies etc.
+
+ SELECT * FROM pg_mv_histogram_buckets();
+
+It has two input parameters:
+
+ oid - OID of the histogram (pg_mv_statistic.staoid)
+ otype - type of output
+
+and produces a table with these columns:
+
+ - bucket ID (0...nbuckets-1)
+ - lower bucket boundaries (string array)
+ - upper bucket boundaries (string array)
+ - nulls only dimensions (boolean array)
+ - lower boundary inclusive (boolean array)
+ - upper boundary inclusive (boolean array)
+ - frequency (double precision)
+
+The 'otype' accepts three values, determining what will be returned in the
+lower/upper boundary arrays:
+
+ - 0 - values stored in the histogram, encoded as text
+ - 1 - indexes into the deduplicated arrays
+ - 2 - indexes into the deduplicated arrays, scaled to [0,1]
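+
+For example, to inspect a histogram on a hypothetical table 't' (with the
+statistics OID looked up in pg_mv_statistic as described above, and otype = 0,
+i.e. boundaries encoded as text):
+
+  SELECT * FROM pg_mv_histogram_buckets(
+    (SELECT staoid FROM pg_mv_statistic
+      WHERE starelid = 't'::regclass), 0);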
diff --git a/src/backend/utils/mvstats/README.stats b/src/backend/utils/mvstats/README.stats
index 5c5c59a..3e4f4d1 100644
--- a/src/backend/utils/mvstats/README.stats
+++ b/src/backend/utils/mvstats/README.stats
@@ -18,6 +18,8 @@ Currently we only have two kinds of multivariate statistics
(b) MCV lists (README.mcv)
+ (c) multivariate histograms (README.histogram)
+
Compatible clause types
-----------------------
diff --git a/src/backend/utils/mvstats/common.c b/src/backend/utils/mvstats/common.c
index d1da714..ffb76f4 100644
--- a/src/backend/utils/mvstats/common.c
+++ b/src/backend/utils/mvstats/common.c
@@ -13,11 +13,11 @@
*
*-------------------------------------------------------------------------
*/
+#include "postgres.h"
+#include "utils/array.h"
#include "common.h"
-#include "utils/array.h"
-
static VacAttrStats ** lookup_var_attr_stats(int2vector *attrs,
int natts,
VacAttrStats **vacattrstats);
@@ -52,7 +52,8 @@ build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
MVStatisticInfo *stat = (MVStatisticInfo *)lfirst(lc);
MVDependencies deps = NULL;
MCVList mcvlist = NULL;
- int numrows_filtered = 0;
+ MVHistogram histogram = NULL;
+ int numrows_filtered = numrows;
VacAttrStats **stats = NULL;
int numatts = 0;
@@ -95,8 +96,12 @@ build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
if (stat->mcv_enabled)
mcvlist = build_mv_mcvlist(numrows, rows, attrs, stats, &numrows_filtered);
+ /* build a multivariate histogram on the columns */
+ if ((numrows_filtered > 0) && (stat->hist_enabled))
+ histogram = build_mv_histogram(numrows_filtered, rows, attrs, stats, numrows);
+
/* store the histogram / MCV list in the catalog */
- update_mv_stats(stat->mvoid, deps, mcvlist, attrs, stats);
+ update_mv_stats(stat->mvoid, deps, mcvlist, histogram, attrs, stats);
}
}
@@ -176,6 +181,8 @@ list_mv_stats(Oid relid)
info->deps_built = stats->deps_built;
info->mcv_enabled = stats->mcv_enabled;
info->mcv_built = stats->mcv_built;
+ info->hist_enabled = stats->hist_enabled;
+ info->hist_built = stats->hist_built;
result = lappend(result, info);
}
@@ -190,7 +197,6 @@ list_mv_stats(Oid relid)
return result;
}
-
/*
* Find attnims of MV stats using the mvoid.
*/
@@ -236,9 +242,16 @@ find_mv_attnums(Oid mvoid, Oid *relid)
}
+/*
+ * FIXME This adds statistics, but we need to drop statistics when the
+ * table is dropped. Not sure what to do when a column is dropped.
+ * Either we can (a) remove all stats on that column, (b) remove
+ * the column from defined stats and force rebuild, (c) remove the
+ * column on next ANALYZE. Or maybe something else?
+ */
void
update_mv_stats(Oid mvoid,
- MVDependencies dependencies, MCVList mcvlist,
+ MVDependencies dependencies, MCVList mcvlist, MVHistogram histogram,
int2vector *attrs, VacAttrStats **stats)
{
HeapTuple stup,
@@ -271,22 +284,34 @@ update_mv_stats(Oid mvoid,
values[Anum_pg_mv_statistic_stamcv - 1] = PointerGetDatum(data);
}
+ if (histogram != NULL)
+ {
+ bytea * data = serialize_mv_histogram(histogram, attrs, stats);
+ nulls[Anum_pg_mv_statistic_stahist-1] = (data == NULL);
+ values[Anum_pg_mv_statistic_stahist - 1]
+ = PointerGetDatum(data);
+ }
+
/* always replace the value (either by bytea or NULL) */
replaces[Anum_pg_mv_statistic_stadeps -1] = true;
replaces[Anum_pg_mv_statistic_stamcv -1] = true;
+ replaces[Anum_pg_mv_statistic_stahist-1] = true;
/* always change the availability flags */
nulls[Anum_pg_mv_statistic_deps_built -1] = false;
nulls[Anum_pg_mv_statistic_mcv_built -1] = false;
+ nulls[Anum_pg_mv_statistic_hist_built-1] = false;
nulls[Anum_pg_mv_statistic_stakeys-1] = false;
/* use the new attnums, in case we removed some dropped ones */
replaces[Anum_pg_mv_statistic_deps_built-1] = true;
replaces[Anum_pg_mv_statistic_mcv_built -1] = true;
+ replaces[Anum_pg_mv_statistic_hist_built -1] = true;
replaces[Anum_pg_mv_statistic_stakeys -1] = true;
values[Anum_pg_mv_statistic_deps_built-1] = BoolGetDatum(dependencies != NULL);
values[Anum_pg_mv_statistic_mcv_built -1] = BoolGetDatum(mcvlist != NULL);
+ values[Anum_pg_mv_statistic_hist_built -1] = BoolGetDatum(histogram != NULL);
values[Anum_pg_mv_statistic_stakeys -1] = PointerGetDatum(attrs);
/* Is there already a pg_mv_statistic tuple for this attribute? */
diff --git a/src/backend/utils/mvstats/histogram.c b/src/backend/utils/mvstats/histogram.c
new file mode 100644
index 0000000..9e5620a
--- /dev/null
+++ b/src/backend/utils/mvstats/histogram.c
@@ -0,0 +1,2032 @@
+/*-------------------------------------------------------------------------
+ *
+ * histogram.c
+ * POSTGRES multivariate histograms
+ *
+ *
+ * Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ * src/backend/utils/mvstats/histogram.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres.h"
+
+#include "fmgr.h"
+#include "funcapi.h"
+
+#include "utils/lsyscache.h"
+
+#include "common.h"
+#include <math.h>
+
+
+static MVBucket create_initial_mv_bucket(int numrows, HeapTuple *rows,
+ int2vector *attrs,
+ VacAttrStats **stats);
+
+static MVBucket select_bucket_to_partition(int nbuckets, MVBucket * buckets);
+
+static MVBucket partition_bucket(MVBucket bucket, int2vector *attrs,
+ VacAttrStats **stats,
+ int *ndistvalues, Datum **distvalues);
+
+static MVBucket copy_mv_bucket(MVBucket bucket, uint32 ndimensions);
+
+static void update_bucket_ndistinct(MVBucket bucket, int2vector *attrs,
+ VacAttrStats ** stats);
+
+static void update_dimension_ndistinct(MVBucket bucket, int dimension,
+ int2vector *attrs,
+ VacAttrStats ** stats,
+ bool update_boundaries);
+
+static void create_null_buckets(MVHistogram histogram, int bucket_idx,
+ int2vector *attrs, VacAttrStats ** stats);
+
+static int bsearch_comparator(const void * a, const void * b);
+
+/*
+ * Each serialized bucket needs to store (in this order):
+ *
+ * - number of tuples (float)
+ * - min inclusive flags (ndim * sizeof(bool))
+ * - max inclusive flags (ndim * sizeof(bool))
+ * - null dimension flags (ndim * sizeof(bool))
+ * - min boundary indexes (2 * ndim * sizeof(uint16))
+ * - max boundary indexes (2 * ndim * sizeof(uint16))
+ *
+ * So in total:
+ *
+ * ndim * (4 * sizeof(uint16) + 3 * sizeof(bool)) +
+ * sizeof(float)
+ */
+#define BUCKET_SIZE(ndims) \
+ (ndims * (4 * sizeof(uint16) + 3 * sizeof(bool)) + sizeof(float))
+
+/* pointers into a flat serialized bucket of BUCKET_SIZE(n) bytes */
+#define BUCKET_NTUPLES(b) ((float*)b)
+#define BUCKET_MIN_INCL(b,n) ((bool*)(b + sizeof(float)))
+#define BUCKET_MAX_INCL(b,n) (BUCKET_MIN_INCL(b,n) + n)
+#define BUCKET_NULLS_ONLY(b,n) (BUCKET_MAX_INCL(b,n) + n)
+#define BUCKET_MIN_INDEXES(b,n) ((uint16*)(BUCKET_NULLS_ONLY(b,n) + n))
+#define BUCKET_MAX_INDEXES(b,n) ((BUCKET_MIN_INDEXES(b,n) + n))
+
+/* can't split bucket with less than 10 rows */
+#define MIN_BUCKET_ROWS 10
+
+/*
+ * Data used while building the histogram.
+ */
+typedef struct HistogramBuildData {
+
+ float ndistinct; /* frequency of distinct values */
+
+ HeapTuple *rows; /* array of sample rows */
+ uint32 numrows; /* number of sample rows (array size) */
+
+ /*
+ * Number of distinct values in each dimension. This is used when
+ * building the histogram (and is not serialized/deserialized).
+ */
+ uint32 *ndistincts;
+
+} HistogramBuildData;
+
+typedef HistogramBuildData *HistogramBuild;
+
+/*
+ * Building a multivariate algorithm. In short it first creates a single
+ * bucket containing all the rows, and then repeatedly split is by first
+ * searching for the bucket / dimension most in need of a split.
+ *
+ * The current criteria is rather simple, chosen so that the algorithm
+ * produces buckets with about equal frequency and regular size.
+ *
+ * See the discussion at select_bucket_to_partition and partition_bucket
+ * for more details about the algorithm.
+ *
+ * The current algorithm works like this:
+ *
+ * build NULL-buckets (create_null_buckets)
+ *
+ * while [not reaching maximum number of buckets]
+ *
+ * choose bucket to partition (largest bucket)
+ * if no bucket to partition
+ * terminate the algorithm
+ *
+ * choose bucket dimension to partition (largest dimension)
+ * split the bucket into two buckets
+ */
+MVHistogram
+build_mv_histogram(int numrows, HeapTuple *rows, int2vector *attrs,
+ VacAttrStats **stats, int numrows_total)
+{
+ int i;
+ int numattrs = attrs->dim1;
+
+ int *ndistvalues;
+ Datum **distvalues;
+
+ MVHistogram histogram = (MVHistogram)palloc0(sizeof(MVHistogramData));
+
+ HeapTuple * rows_copy = (HeapTuple*)palloc0(numrows * sizeof(HeapTuple));
+ memcpy(rows_copy, rows, sizeof(HeapTuple) * numrows);
+
+ Assert((numattrs >= 2) && (numattrs <= MVSTATS_MAX_DIMENSIONS));
+
+ histogram->ndimensions = numattrs;
+
+ histogram->magic = MVSTAT_HIST_MAGIC;
+ histogram->type = MVSTAT_HIST_TYPE_BASIC;
+ histogram->nbuckets = 1;
+
+ /* create max buckets (better than repalloc for short-lived objects) */
+ histogram->buckets
+ = (MVBucket*)palloc0(MVSTAT_HIST_MAX_BUCKETS * sizeof(MVBucket));
+
+ /* create the initial bucket, covering the whole sample set */
+ histogram->buckets[0]
+ = create_initial_mv_bucket(numrows, rows_copy, attrs, stats);
+
+ /*
+ * Collect info on distinct values in each dimension (used later
+ * to select dimension to partition).
+ */
+ ndistvalues = (int*)palloc0(sizeof(int) * numattrs);
+ distvalues = (Datum**)palloc0(sizeof(Datum*) * numattrs);
+
+ for (i = 0; i < numattrs; i++)
+ {
+ int j;
+ int nvals;
+ Datum *tmp;
+
+ SortSupportData ssup;
+ StdAnalyzeData *mystats = (StdAnalyzeData *) stats[i]->extra_data;
+
+ /* initialize sort support, etc. */
+ memset(&ssup, 0, sizeof(ssup));
+ ssup.ssup_cxt = CurrentMemoryContext;
+
+ /* We always use the default collation for statistics */
+ ssup.ssup_collation = DEFAULT_COLLATION_OID;
+ ssup.ssup_nulls_first = false;
+
+ PrepareSortSupportFromOrderingOp(mystats->ltopr, &ssup);
+
+ nvals = 0;
+ tmp = (Datum*)palloc0(sizeof(Datum) * numrows);
+
+ for (j = 0; j < numrows; j++)
+ {
+ bool isnull;
+
+ /* fetch the value of the i-th attribute from this sample row */
+ Datum value = heap_getattr(rows[j], attrs->values[i],
+ stats[i]->tupDesc, &isnull);
+
+ if (isnull)
+ continue;
+
+ tmp[nvals++] = value;
+ }
+
+ /* do the sort and stuff only if there are non-NULL values */
+ if (nvals > 0)
+ {
+ /* sort the array of values */
+ qsort_arg((void *) tmp, nvals, sizeof(Datum),
+ compare_scalars_simple, (void *) &ssup);
+
+ /* count distinct values */
+ ndistvalues[i] = 1;
+ for (j = 1; j < nvals; j++)
+ if (compare_scalars_simple(&tmp[j], &tmp[j-1], &ssup) != 0)
+ ndistvalues[i] += 1;
+
+ /* allocate exactly the space needed for the distinct values (counted above) */
+ distvalues[i] = (Datum*)palloc0(sizeof(Datum) * ndistvalues[i]);
+
+ /* now collect distinct values into the array */
+ distvalues[i][0] = tmp[0];
+ ndistvalues[i] = 1;
+
+ for (j = 1; j < nvals; j++)
+ {
+ if (compare_scalars_simple(&tmp[j], &tmp[j-1], &ssup) != 0)
+ {
+ distvalues[i][ndistvalues[i]] = tmp[j];
+ ndistvalues[i] += 1;
+ }
+ }
+ }
+
+ pfree(tmp);
+ }
+
+ /*
+ * The initial bucket may contain NULL values, so we have to create
+ * buckets with NULL-only dimensions.
+ *
+ * FIXME We may need up to 2^ndims buckets - check that there are
+ * enough buckets (MVSTAT_HIST_MAX_BUCKETS >= 2^ndims).
+ */
+ create_null_buckets(histogram, 0, attrs, stats);
+
+ while (histogram->nbuckets < MVSTAT_HIST_MAX_BUCKETS)
+ {
+ MVBucket bucket = select_bucket_to_partition(histogram->nbuckets,
+ histogram->buckets);
+
+ /* no more buckets to partition */
+ if (bucket == NULL)
+ break;
+
+ histogram->buckets[histogram->nbuckets]
+ = partition_bucket(bucket, attrs, stats,
+ ndistvalues, distvalues);
+
+ histogram->nbuckets += 1;
+ }
+
+ /* finalize the frequencies etc. */
+ for (i = 0; i < histogram->nbuckets; i++)
+ {
+ HistogramBuild build_data
+ = ((HistogramBuild)histogram->buckets[i]->build_data);
+
+ /*
+ * The frequency has to be computed from the whole sample, in
+ * case some of the rows were used for MCV (and thus are missing
+ * from the histogram).
+ */
+ histogram->buckets[i]->ntuples
+ = (build_data->numrows * 1.0) / numrows_total;
+ }
+
+ return histogram;
+}
+
+/* fetch the histogram (as a bytea) from the pg_mv_statistic catalog */
+MVSerializedHistogram
+load_mv_histogram(Oid mvoid)
+{
+ bool isnull = false;
+ Datum histogram;
+
+#ifdef USE_ASSERT_CHECKING
+ Form_pg_mv_statistic mvstat;
+#endif
+
+ /* Fetch the pg_mv_statistic tuple for the given statistics OID. */
+ HeapTuple htup = SearchSysCache1(MVSTATOID, ObjectIdGetDatum(mvoid));
+
+ if (! HeapTupleIsValid(htup))
+ return NULL;
+
+#ifdef USE_ASSERT_CHECKING
+ mvstat = (Form_pg_mv_statistic) GETSTRUCT(htup);
+ Assert(mvstat->hist_enabled && mvstat->hist_built);
+#endif
+
+ histogram = SysCacheGetAttr(MVSTATOID, htup,
+ Anum_pg_mv_statistic_stahist, &isnull);
+
+ Assert(!isnull);
+
+ ReleaseSysCache(htup);
+
+ return deserialize_mv_histogram(DatumGetByteaP(histogram));
+}
+
+/* print some basic info about the histogram */
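+/*
+ * A usage sketch, reading the serialized histogram straight from the
+ * catalog (this assumes the statistics actually have a histogram built):
+ *
+ *   SELECT staname, pg_mv_stats_histogram_info(stahist)
+ *     FROM pg_mv_statistic;
+ */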
+Datum
+pg_mv_stats_histogram_info(PG_FUNCTION_ARGS)
+{
+ bytea *data = PG_GETARG_BYTEA_P(0);
+ char *result;
+
+ MVSerializedHistogram hist = deserialize_mv_histogram(data);
+
+ result = palloc0(128);
+ snprintf(result, 128, "nbuckets=%d", hist->nbuckets);
+
+ PG_RETURN_TEXT_P(cstring_to_text(result));
+}
+
+
+/* used to pass context into bsearch() */
+static SortSupport ssup_private = NULL;
+
+/*
+ * Serialize the MV histogram into a bytea value. The basic algorithm is quite
+ * simple, and mostly mimics the MCV serialization:
+ *
+ * (1) perform deduplication for each attribute (separately)
+ *
+ * (a) collect all (non-NULL) attribute values from all buckets
+ * (b) sort the data (using 'lt' from VacAttrStats)
+ * (c) remove duplicate values from the array
+ *
+ * (2) serialize the arrays into a bytea value
+ *
+ * (3) process all buckets
+ *
+ * (a) replace min/max values with indexes into the arrays
+ *
+ * Each attribute has to be processed separately, as we're mixing different
+ * datatypes, and we need to use the right operators to compare/sort them.
+ * We're also mixing pass-by-value and pass-by-ref types, and so on.
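+ *
+ * For example (hypothetical values): if the deduplicated array for some
+ * dimension ends up as {1, 5, 9}, a bucket with min=5 and max=9 in that
+ * dimension stores just the uint16 indexes 1 and 2.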
+ *
+ *
+ * FIXME This probably leaks memory, or at least uses it inefficiently
+ * (many small palloc() calls instead of a large one).
+ *
+ * TODO Consider packing boolean flags (NULL) for each item into 'char'
+ * or a longer type (instead of using an array of bool items).
+ */
+bytea *
+serialize_mv_histogram(MVHistogram histogram, int2vector *attrs,
+ VacAttrStats **stats)
+{
+ int i = 0, j = 0;
+ Size total_length = 0;
+
+ bytea *output = NULL;
+ char *data = NULL;
+
+ int nbuckets = histogram->nbuckets;
+ int ndims = histogram->ndimensions;
+
+ /* allocated for serialized bucket data */
+ int bucketsize = BUCKET_SIZE(ndims);
+ char *bucket = palloc0(bucketsize);
+
+ /* values per dimension (and number of non-NULL values) */
+ Datum **values = (Datum**)palloc0(sizeof(Datum*) * ndims);
+ int *counts = (int*)palloc0(sizeof(int) * ndims);
+
+ /* info about dimensions (for deserialize) */
+ DimensionInfo * info
+ = (DimensionInfo *)palloc0(sizeof(DimensionInfo)*ndims);
+
+ /* sort support data */
+ SortSupport ssup = (SortSupport)palloc0(sizeof(SortSupportData)*ndims);
+
+ /* collect and deduplicate values for each dimension separately */
+ for (i = 0; i < ndims; i++)
+ {
+ int count;
+ StdAnalyzeData *tmp = (StdAnalyzeData *)stats[i]->extra_data;
+
+ /* keep important info about the data type */
+ info[i].typlen = stats[i]->attrtype->typlen;
+ info[i].typbyval = stats[i]->attrtype->typbyval;
+
+ /*
+ * Allocate space for all min/max values, including NULLs
+ * (we won't use them, but we don't know how many there are),
+ * and then collect all non-NULL values.
+ */
+ values[i] = (Datum*)palloc0(sizeof(Datum) * nbuckets * 2);
+
+ for (j = 0; j < histogram->nbuckets; j++)
+ {
+ /* skip buckets where this dimension is NULL-only */
+ if (! histogram->buckets[j]->nullsonly[i])
+ {
+ values[i][counts[i]] = histogram->buckets[j]->min[i];
+ counts[i] += 1;
+
+ values[i][counts[i]] = histogram->buckets[j]->max[i];
+ counts[i] += 1;
+ }
+ }
+
+ /* there are just NULL values in this dimension */
+ if (counts[i] == 0)
+ continue;
+
+ /* sort and deduplicate */
+ ssup[i].ssup_cxt = CurrentMemoryContext;
+ ssup[i].ssup_collation = DEFAULT_COLLATION_OID;
+ ssup[i].ssup_nulls_first = false;
+
+ PrepareSortSupportFromOrderingOp(tmp->ltopr, &ssup[i]);
+
+ qsort_arg(values[i], counts[i], sizeof(Datum),
+ compare_scalars_simple, &ssup[i]);
+
+ /*
+ * Walk through the array and eliminate duplicate values, but
+ * keep the ordering (so that we can do bsearch later). We know
+ * there's at least 1 item, so we can skip the first element.
+ */
+ count = 1; /* number of deduplicated items */
+ for (j = 1; j < counts[i]; j++)
+ {
+ /* if it's different from the previous value, we need to keep it */
+ if (compare_datums_simple(values[i][j-1], values[i][j], &ssup[i]) != 0)
+ {
+ /* XXX: not needed if (count == j) */
+ values[i][count] = values[i][j];
+ count += 1;
+ }
+ }
+
+ /* make sure we fit into uint16 */
+ Assert(count <= UINT16_MAX);
+
+ /* keep info about the deduplicated count */
+ info[i].nvalues = count;
+
+ /* compute size of the serialized data */
+ if (info[i].typlen > 0)
+ /* byval or byref, but with fixed length (name, tid, ...) */
+ info[i].nbytes = info[i].nvalues * info[i].typlen;
+ else if (info[i].typlen == -1)
+ /* varlena, so just use VARSIZE_ANY */
+ for (j = 0; j < info[i].nvalues; j++)
+ info[i].nbytes += VARSIZE_ANY(values[i][j]);
+ else if (info[i].typlen == -2)
+ /* cstring - strlen, plus 1B for the '\0' terminator */
+ for (j = 0; j < info[i].nvalues; j++)
+ info[i].nbytes += strlen(DatumGetPointer(values[i][j])) + 1;
+ else
+ elog(ERROR, "unknown data type typbyval=%d typlen=%d",
+ info[i].typbyval, info[i].typlen);
+ }
+
+ /*
+ * Now we finally know how much space we'll need for the serialized
+ * histogram, as it contains these fields:
+ *
+ * - length (4B) for varlena
+ * - magic (4B)
+ * - type (4B)
+ * - ndimensions (4B)
+ * - nbuckets (4B)
+ * - info (ndim * sizeof(DimensionInfo))
+ * - arrays of values for each dimension
+ * - serialized buckets (nbuckets * bucketsize)
+ *
+ * So the 'header' size is 20B + ndim * sizeof(DimensionInfo) and
+ * then we'll place the data (and buckets).
+ */
+ total_length = (sizeof(int32) + offsetof(MVHistogramData, buckets)
+ + ndims * sizeof(DimensionInfo)
+ + nbuckets * bucketsize);
+
+ /* account for the deduplicated data */
+ for (i = 0; i < ndims; i++)
+ total_length += info[i].nbytes;
+
+ /* enforce arbitrary limit of 10MB */
+ if (total_length > (10 * 1024 * 1024))
+ elog(ERROR, "serialized histogram exceeds 10MB (%ld > %d)",
+ total_length, (10 * 1024 * 1024));
+
+ /* allocate space for the serialized histogram list, set header */
+ output = (bytea*)palloc0(total_length);
+ SET_VARSIZE(output, total_length);
+
+ /* we'll use 'data' to keep track of the place to write data */
+ data = VARDATA(output);
+
+ memcpy(data, histogram, offsetof(MVHistogramData, buckets));
+ data += offsetof(MVHistogramData, buckets);
+
+ memcpy(data, info, sizeof(DimensionInfo) * ndims);
+ data += sizeof(DimensionInfo) * ndims;
+
+ /* value array for each dimension */
+ for (i = 0; i < ndims; i++)
+ {
+#ifdef USE_ASSERT_CHECKING
+ char *tmp = data;
+#endif
+ for (j = 0; j < info[i].nvalues; j++)
+ {
+ if (info[i].typlen > 0)
+ {
+ /* passed by value or by reference, but fixed length */
+ memcpy(data, &values[i][j], info[i].typlen);
+ data += info[i].typlen;
+ }
+ else if (info[i].typlen == -1)
+ {
+ /* varlena */
+ memcpy(data, DatumGetPointer(values[i][j]),
+ VARSIZE_ANY(values[i][j]));
+ data += VARSIZE_ANY(values[i][j]);
+ }
+ else if (info[i].typlen == -2)
+ {
+ /* cstring (don't forget the \0 terminator!) */
+ memcpy(data, DatumGetPointer(values[i][j]),
+ strlen(DatumGetPointer(values[i][j])) + 1);
+ data += strlen(DatumGetPointer(values[i][j])) + 1;
+ }
+ }
+ Assert((data - tmp) == info[i].nbytes);
+ }
+
+ /* and finally, the histogram buckets */
+ for (i = 0; i < nbuckets; i++)
+ {
+ /* don't write beyond the allocated space */
+ Assert(data <= (char*)output + total_length - bucketsize);
+
+ /* reset the values for each item */
+ memset(bucket, 0, bucketsize);
+
+ *BUCKET_NTUPLES(bucket) = histogram->buckets[i]->ntuples;
+
+ for (j = 0; j < ndims; j++)
+ {
+ /* do the lookup only for non-NULL values */
+ if (! histogram->buckets[i]->nullsonly[j])
+ {
+ uint16 idx;
+ Datum * v = NULL;
+ ssup_private = &ssup[j];
+
+ /* min boundary */
+ v = (Datum*)bsearch(&histogram->buckets[i]->min[j],
+ values[j], info[j].nvalues, sizeof(Datum),
+ bsearch_comparator);
+
+ if (v == NULL)
+ elog(ERROR, "value for dim %d not found in array", j);
+
+ /* compute index within the array */
+ idx = (v - values[j]);
+
+ Assert((idx >= 0) && (idx < info[j].nvalues));
+
+ BUCKET_MIN_INDEXES(bucket, ndims)[j] = idx;
+
+ /* max boundary */
+ v = (Datum*)bsearch(&histogram->buckets[i]->max[j],
+ values[j], info[j].nvalues, sizeof(Datum),
+ bsearch_comparator);
+
+ if (v == NULL)
+ elog(ERROR, "value for dim %d not found in array", j);
+
+ /* compute index within the array */
+ idx = (v - values[j]);
+
+ Assert((idx >= 0) && (idx < info[j].nvalues));
+
+ BUCKET_MAX_INDEXES(bucket, ndims)[j] = idx;
+ }
+ }
+
+ /* copy flags (nulls, min/max inclusive) */
+ memcpy(BUCKET_NULLS_ONLY(bucket, ndims),
+ histogram->buckets[i]->nullsonly, sizeof(bool) * ndims);
+
+ memcpy(BUCKET_MIN_INCL(bucket, ndims),
+ histogram->buckets[i]->min_inclusive, sizeof(bool) * ndims);
+
+ memcpy(BUCKET_MAX_INCL(bucket, ndims),
+ histogram->buckets[i]->max_inclusive, sizeof(bool) * ndims);
+
+ /* copy the item into the array */
+ memcpy(data, bucket, bucketsize);
+
+ data += bucketsize;
+ }
+
+ /* at this point we expect to match the total_length exactly */
+ Assert((data - (char*)output) == total_length);
+
+ /* FIXME free the values/counts arrays here */
+
+ return output;
+}
+
+/*
+ * Returns histogram in a partially-serialized form (keeps the boundary
+ * values deduplicated, so that it's possible to optimize the estimation
+ * part by caching function call results between buckets etc.).
+ */
+MVSerializedHistogram
+deserialize_mv_histogram(bytea * data)
+{
+ int i = 0, j = 0;
+
+ Size expected_size;
+ char *tmp = NULL;
+
+ MVSerializedHistogram histogram;
+ DimensionInfo *info;
+
+ int nbuckets;
+ int ndims;
+ int bucketsize;
+
+ /* temporary deserialization buffer */
+ int bufflen;
+ char *buff;
+ char *ptr;
+
+ if (data == NULL)
+ return NULL;
+
+ if (VARSIZE_ANY_EXHDR(data) < offsetof(MVSerializedHistogramData,buckets))
+ elog(ERROR, "invalid histogram size %ld (expected at least %ld)",
+ VARSIZE_ANY_EXHDR(data), offsetof(MVSerializedHistogramData,buckets));
+
+ /* read the histogram header */
+ histogram
+ = (MVSerializedHistogram)palloc(sizeof(MVSerializedHistogramData));
+
+ /* initialize pointer to the data part (skip the varlena header) */
+ tmp = VARDATA(data);
+
+ /* get the header and perform basic sanity checks */
+ memcpy(histogram, tmp, offsetof(MVSerializedHistogramData, buckets));
+ tmp += offsetof(MVSerializedHistogramData, buckets);
+
+ if (histogram->magic != MVSTAT_HIST_MAGIC)
+ elog(ERROR, "invalid histogram magic %d (expected %dd)",
+ histogram->magic, MVSTAT_HIST_MAGIC);
+
+ if (histogram->type != MVSTAT_HIST_TYPE_BASIC)
+ elog(ERROR, "invalid histogram type %d (expected %dd)",
+ histogram->type, MVSTAT_HIST_TYPE_BASIC);
+
+ nbuckets = histogram->nbuckets;
+ ndims = histogram->ndimensions;
+ bucketsize = BUCKET_SIZE(ndims);
+
+ Assert((nbuckets > 0) && (nbuckets <= MVSTAT_HIST_MAX_BUCKETS));
+ Assert((ndims >= 2) && (ndims <= MVSTATS_MAX_DIMENSIONS));
+
+ /*
+ * Compute the expected size. This is still incomplete at this
+ * point, as we have yet to add the sizes of the value arrays
+ * (from the DimensionInfo records).
+ */
+ expected_size = offsetof(MVSerializedHistogramData,buckets) +
+ ndims * sizeof(DimensionInfo) +
+ (nbuckets * bucketsize);
+
+ /* check that we have at least the DimensionInfo records */
+ if (VARSIZE_ANY_EXHDR(data) < expected_size)
+ elog(ERROR, "invalid histogram size %ld (expected %ld)",
+ VARSIZE_ANY_EXHDR(data), expected_size);
+
+ info = (DimensionInfo*)(tmp);
+ tmp += ndims * sizeof(DimensionInfo);
+
+ /* account for the value arrays */
+ for (i = 0; i < ndims; i++)
+ expected_size += info[i].nbytes;
+
+ if (VARSIZE_ANY_EXHDR(data) != expected_size)
+ elog(ERROR, "invalid histogram size %ld (expected %ld)",
+ VARSIZE_ANY_EXHDR(data), expected_size);
+
+ /* looks OK - the basic structure is not corrupted */
+
+ /* now let's allocate a single buffer for all the values and counts */
+
+ bufflen = (sizeof(int) + sizeof(Datum*)) * ndims;
+ for (i = 0; i < ndims; i++)
+ {
+ /* don't allocate space for byval types with size matching a Datum */
+ if (! (info[i].typbyval && (info[i].typlen == sizeof(Datum))))
+ bufflen += (sizeof(Datum) * info[i].nvalues);
+ }
+
+ /* also, include space for the result, tracking the buckets */
+ bufflen += nbuckets * (
+ sizeof(MVSerializedBucket) + /* bucket pointer */
+ sizeof(MVSerializedBucketData)); /* bucket data */
+
+ buff = palloc0(bufflen);
+ ptr = buff;
+
+ histogram->nvalues = (int*)ptr;
+ ptr += (sizeof(int) * ndims);
+
+ histogram->values = (Datum**)ptr;
+ ptr += (sizeof(Datum*) * ndims);
+
+ /*
+ * FIXME This uses pointers to the original data array (the types
+ * not passed by value), so when someone frees the memory,
+ * e.g. by doing something like this:
+ *
+ * bytea * data = ... fetch the data from catalog ...
+ * MCVList mcvlist = deserialize_mcv_list(data);
+ * pfree(data);
+ *
+ * then 'mcvlist' references the freed memory. This needs to
+ * copy the pieces.
+ *
+ * TODO same as in MCV deserialization / consider moving to common.c
+ */
+ for (i = 0; i < ndims; i++)
+ {
+ histogram->nvalues[i] = info[i].nvalues;
+
+ if (info[i].typbyval && info[i].typlen == sizeof(Datum))
+ {
+ /* passed by value / Datum - simply reuse the array */
+ histogram->values[i] = (Datum*)tmp;
+ tmp += info[i].nbytes;
+ }
+ else
+ {
+ /* all the varlena data need a chunk from the buffer */
+ histogram->values[i] = (Datum*)ptr;
+ ptr += (sizeof(Datum) * info[i].nvalues);
+
+ if (info[i].typbyval)
+ {
+ /* passed by value, but smaller than a Datum */
+ for (j = 0; j < info[i].nvalues; j++)
+ {
+ /* copy the value into the array */
+ memcpy(&histogram->values[i][j], tmp, info[i].typlen);
+ tmp += info[i].typlen;
+ }
+ }
+ else if (info[i].typlen > 0)
+ {
+ /* passed by reference, but fixed length (name, tid, ...) */
+ for (j = 0; j < info[i].nvalues; j++)
+ {
+ /* just point into the array */
+ histogram->values[i][j] = PointerGetDatum(tmp);
+ tmp += info[i].typlen;
+ }
+ }
+ else if (info[i].typlen == -1)
+ {
+ /* varlena */
+ for (j = 0; j < info[i].nvalues; j++)
+ {
+ /* just point into the array */
+ histogram->values[i][j] = PointerGetDatum(tmp);
+ tmp += VARSIZE_ANY(tmp);
+ }
+ }
+ else if (info[i].typlen == -2)
+ {
+ /* cstring */
+ for (j = 0; j < info[i].nvalues; j++)
+ {
+ /* just point into the array */
+ histogram->values[i][j] = PointerGetDatum(tmp);
+ tmp += (strlen(tmp) + 1); /* don't forget the \0 */
+ }
+ }
+ }
+ }
+
+ histogram->buckets = (MVSerializedBucket*)ptr;
+ ptr += (sizeof(MVSerializedBucket) * nbuckets);
+
+ for (i = 0; i < nbuckets; i++)
+ {
+ MVSerializedBucket bucket = (MVSerializedBucket)ptr;
+ ptr += sizeof(MVSerializedBucketData);
+
+ bucket->ntuples = *BUCKET_NTUPLES(tmp);
+ bucket->nullsonly = BUCKET_NULLS_ONLY(tmp, ndims);
+ bucket->min_inclusive = BUCKET_MIN_INCL(tmp, ndims);
+ bucket->max_inclusive = BUCKET_MAX_INCL(tmp, ndims);
+
+ bucket->min = BUCKET_MIN_INDEXES(tmp, ndims);
+ bucket->max = BUCKET_MAX_INDEXES(tmp, ndims);
+
+ histogram->buckets[i] = bucket;
+
+ Assert(tmp <= (char*)data + VARSIZE_ANY(data));
+
+ tmp += bucketsize;
+ }
+
+ /* at this point we expect to match the expected_size exactly */
+ Assert((tmp - VARDATA(data)) == expected_size);
+
+ /* we should exhaust the output buffer exactly */
+ Assert((ptr - buff) == bufflen);
+
+ return histogram;
+}
+
+/*
+ * Build the initial bucket, which will be then split into smaller ones.
+ */
+static MVBucket
+create_initial_mv_bucket(int numrows, HeapTuple *rows, int2vector *attrs,
+ VacAttrStats **stats)
+{
+ int i;
+ int numattrs = attrs->dim1;
+ HistogramBuild data = NULL;
+
+ /* TODO allocate bucket as a single piece, including all the fields. */
+ MVBucket bucket = (MVBucket)palloc0(sizeof(MVBucketData));
+
+ Assert(numrows > 0);
+ Assert(rows != NULL);
+ Assert((numattrs >= 2) && (numattrs <= MVSTATS_MAX_DIMENSIONS));
+
+ /* allocate the per-dimension arrays */
+
+ /* flags for null-only dimensions */
+ bucket->nullsonly = (bool*)palloc0(numattrs * sizeof(bool));
+
+ /* inclusiveness boundaries - lower/upper bounds */
+ bucket->min_inclusive = (bool*)palloc0(numattrs * sizeof(bool));
+ bucket->max_inclusive = (bool*)palloc0(numattrs * sizeof(bool));
+
+ /* lower/upper boundaries */
+ bucket->min = (Datum*)palloc0(numattrs * sizeof(Datum));
+ bucket->max = (Datum*)palloc0(numattrs * sizeof(Datum));
+
+ /* build-data */
+ data = (HistogramBuild)palloc0(sizeof(HistogramBuildData));
+
+ /* number of distinct values (per dimension) */
+ data->ndistincts = (uint32*)palloc0(numattrs * sizeof(uint32));
+
+ /* all the sample rows fall into the initial bucket */
+ data->numrows = numrows;
+ data->rows = rows;
+
+ bucket->build_data = data;
+
+ /*
+ * Update the number of distinct combinations in the bucket (which
+ * we use when selecting the bucket to partition), and then the
+ * number of distinct values for each dimension (which we use when
+ * choosing which dimension to split).
+ */
+ update_bucket_ndistinct(bucket, attrs, stats);
+
+ /* Update ndistinct (and also set min/max) for all dimensions. */
+ for (i = 0; i < numattrs; i++)
+ update_dimension_ndistinct(bucket, i, attrs, stats, true);
+
+ return bucket;
+}
+
+/*
+ * Choose the bucket to partition next.
+ *
+ * The current criterion is rather simple, chosen so that the algorithm
+ * produces buckets with about equal frequency and regular size. We
+ * select the eligible bucket with the most sampled rows, and then
+ * split it by the longest dimension.
+ *
+ * The distinct values are mapped uniformly to the [0,1] interval, and
+ * this is used to compute the length of each value range.
+ *
+ * NOTE: This is not the same array used for deduplication, as this
+ * contains values for all the tuples from the sample, not just
+ * the boundary values.
+ *
+ * Returns either pointer to the bucket selected to be partitioned,
+ * or NULL if there are no buckets that may be split (i.e. all buckets
+ * contain a single distinct value).
+ *
+ * TODO Consider other partitioning criteria (v-optimal, maxdiff etc.).
+ * For example use the "bucket volume" (product of dimension
+ * lengths) to select the bucket.
+ *
+ * We need buckets containing about the same number of tuples (so
+ * about the same frequency), as that limits the error when we
+ * match the bucket partially (in that case use 1/2 the bucket).
+ *
+ * We also need buckets with "regular" size, i.e. not "narrow" in
+ * some dimensions and "wide" in the others, because that makes
+ * partial matches more likely and increases the estimation error,
+ * especially when the clauses match many buckets partially. This
+ * is especially serious for OR-clauses, because in that case any
+ * of them may add the bucket as a (partial) match. With AND-clauses
+ * all the clauses have to match the bucket, which makes this issue
+ * somewhat less pressing.
+ *
+ * For example this table:
+ *
+ * CREATE TABLE t AS SELECT i AS a, i AS b
+ * FROM generate_series(1,1000000) s(i);
+ * ALTER TABLE t ADD STATISTICS (histogram) ON (a,b);
+ * ANALYZE t;
+ *
+ * It's a very specific (and perhaps artificial) example, because
+ * every bucket always has exactly the same number of distinct
+ * values in all dimensions, which makes the partitioning tricky.
+ *
+ * Then:
+ *
+ * SELECT * FROM t WHERE a < 10 AND b < 10;
+ *
+ * is estimated to return ~120 rows, while in reality it returns 9.
+ *
+ * QUERY PLAN
+ * ----------------------------------------------------------------
+ * Seq Scan on t (cost=0.00..19425.00 rows=117 width=8)
+ * (actual time=0.185..270.774 rows=9 loops=1)
+ * Filter: ((a < 10) AND (b < 10))
+ * Rows Removed by Filter: 999991
+ *
+ * while the query using OR clauses is estimated like this:
+ *
+ * QUERY PLAN
+ * ----------------------------------------------------------------
+ * Seq Scan on t (cost=0.00..19425.00 rows=8100 width=8)
+ * (actual time=0.118..189.919 rows=9 loops=1)
+ * Filter: ((a < 10) OR (b < 10))
+ * Rows Removed by Filter: 999991
+ *
+ * which is clearly much worse. This happens because the histogram
+ * contains buckets like this:
+ *
+ * bucket 592 [3 30310] [30134 30593] => [0.000233]
+ *
+ * i.e. the length of "a" dimension is (30310-3)=30307, while the
+ * length of "b" is (30593-30134)=459. So the "b" dimension is much
+ * narrower than "a". Of course, there are buckets where "b" is the
+ * wider dimension.
+ *
+ * This is partially mitigated by selecting the "longest" dimension
+ * in partition_bucket() but that only happens after we already
+ * selected the bucket. So if we never select the bucket, we can't
+ * really fix it there.
+ *
+ * The other reason why this particular example behaves so poorly
+ * is due to the way we split the partition in partition_bucket().
+ * Currently we attempt to divide the bucket into two parts with
+ * the same number of sampled tuples (frequency), but that does not
+ * work well when all the tuples are squashed on one end of the
+ * bucket (e.g. exactly at the diagonal, as a=b). In that case we
+ * split the bucket into a tiny bucket on the diagonal, and a huge
+ * remaining part of the bucket, which is still going to be narrow
+ * and we're unlikely to fix that.
+ *
+ * So perhaps we need two partitioning strategies - one aiming to
+ * split buckets with high frequency (number of sampled rows), the
+ * other aiming to split "large" buckets. And alternating between
+ * them, somehow.
+ *
+ * TODO Allowing the bucket to degenerate to a single combination of
+ * values makes it a rather strange MCV list. Maybe we should use
+ * a higher lower boundary, or make the selection criteria more
+ * complex (e.g. consider the number of rows in the bucket, etc.).
+ *
+ * That however is different from buckets 'degenerated' only for
+ * some dimensions (e.g. half of them), which is perfectly
+ * appropriate for statistics on a combination of low and high
+ * cardinality columns.
+ *
+ * TODO Consider using a similar lower boundary for the row count as
+ * for simple histograms, i.e. 300 tuples per bucket.
+ */
+static MVBucket
+select_bucket_to_partition(int nbuckets, MVBucket * buckets)
+{
+ int i;
+ int numrows = 0;
+ MVBucket bucket = NULL;
+
+ for (i = 0; i < nbuckets; i++)
+ {
+ HistogramBuild data = (HistogramBuild)buckets[i]->build_data;
+ /* select the eligible bucket with the most sampled rows */
+ if ((data->ndistinct > 2) &&
+ (data->numrows > numrows) &&
+ (data->numrows >= MIN_BUCKET_ROWS))
+ {
+ bucket = buckets[i];
+ numrows = data->numrows;
+ }
+ }
+
+ /* may be NULL if there are no buckets eligible for partitioning */
+ return bucket;
+}
+
+/*
+ * A simple bucket partitioning implementation - we choose the longest
+ * bucket dimension, measured using the array of distinct values built
+ * at the very beginning of the build.
+ *
+ * We map all the distinct values to a [0,1] interval, uniformly
+ * distributed, and then use this to measure length. It's essentially
+ * a number of distinct values within the range, normalized to [0,1].
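+ *
+ * For example (hypothetical counts): if a dimension has 100 distinct
+ * values in the sample, and the bucket boundaries are the 20th and the
+ * 70th of those values, the normalized length is (70 - 20) / 100 = 0.5.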
+ *
+ * Then we choose a 'middle' value splitting the bucket into two parts
+ * with roughly the same frequency.
+ *
+ * This splits the bucket by tweaking the existing one, and returning
+ * the new bucket (essentially shrinking the existing one in-place and
+ * returning the other "half" as a new bucket). The caller is responsible
+ * for adding the new bucket into the list of buckets.
+ *
+ * There are multiple histogram options, centered around the partitioning
+ * criteria, specifying both how to choose a bucket and the dimension
+ * most in need of a split. For a nice summary and general overview, see
+ * "rK-Hist : an R-Tree based histogram for multi-dimensional selectivity
+ * estimation" thesis by J. A. Lopez, Concordia University, p.34-37 (and
+ * possibly p. 32-34 for explanation of the terms).
+ *
+ * TODO It requires care to prevent splitting only one dimension and not
+ * splitting another one at all (which might happen easily in case
+ * of strongly dependent columns - e.g. y=x). The current algorithm
+ * minimizes this, but may still happen for perfectly dependent
+ * examples (when all the dimensions have equal length, the first
+ * one will be selected).
+ *
+ * TODO Should probably consider statistics target for the columns (e.g.
+ * to split dimensions with higher statistics target more frequently).
+ */
+static MVBucket
+partition_bucket(MVBucket bucket, int2vector *attrs,
+ VacAttrStats **stats,
+ int *ndistvalues, Datum **distvalues)
+{
+ int i;
+ int dimension;
+ int numattrs = attrs->dim1;
+
+ Datum split_value;
+ MVBucket new_bucket;
+ HistogramBuild new_data;
+
+ /* needed for sort, when looking for the split value */
+ bool isNull;
+ int nvalues = 0;
+ HistogramBuild data = (HistogramBuild)bucket->build_data;
+ StdAnalyzeData * mystats = NULL;
+ ScalarItem * values = (ScalarItem*)palloc0(data->numrows * sizeof(ScalarItem));
+ SortSupportData ssup;
+
+ /* looking for the split value */
+ // int ndistinct = 1; /* number of distinct values below current value */
+ int nrows = 1; /* number of rows below current value */
+ double delta;
+
+ /* needed when splitting the values */
+ HeapTuple * oldrows = data->rows;
+ int oldnrows = data->numrows;
+
+ /*
+ * We can't split buckets with a single distinct value (this also
+ * disqualifies NULL-only dimensions). Also, there have to be multiple
+ * sample rows (otherwise there could not be multiple distinct values).
+ */
+ Assert(data->ndistinct > 1);
+ Assert(data->numrows > 1);
+ Assert((numattrs >= 2) && (numattrs <= MVSTATS_MAX_DIMENSIONS));
+
+ /*
+ * Look for the next dimension to split.
+ */
+ delta = 0.0;
+ dimension = -1;
+
+ for (i = 0; i < numattrs; i++)
+ {
+ Datum *a, *b;
+
+ mystats = (StdAnalyzeData *) stats[i]->extra_data;
+
+ /* initialize sort support, etc. */
+ memset(&ssup, 0, sizeof(ssup));
+ ssup.ssup_cxt = CurrentMemoryContext;
+
+ /* We always use the default collation for statistics */
+ ssup.ssup_collation = DEFAULT_COLLATION_OID;
+ ssup.ssup_nulls_first = false;
+
+ PrepareSortSupportFromOrderingOp(mystats->ltopr, &ssup);
+
+ /* can't split NULL-only dimension */
+ if (bucket->nullsonly[i])
+ continue;
+
+ /* can't split dimension with a single ndistinct value */
+ if (data->ndistincts[i] <= 1)
+ continue;
+
+ /* sort support for the bsearch_comparator */
+ ssup_private = &ssup;
+
+ /* search for min boundary in the distinct list */
+ a = (Datum*)bsearch(&bucket->min[i],
+ distvalues[i], ndistvalues[i],
+ sizeof(Datum), bsearch_comparator);
+
+ b = (Datum*)bsearch(&bucket->max[i],
+ distvalues[i], ndistvalues[i],
+ sizeof(Datum), bsearch_comparator);
+
+ /* if this dimension is 'longer' than the best so far, partition by it */
+ if (((b-a)*1.0 / ndistvalues[i]) > delta)
+ {
+ delta = ((b-a)*1.0 / ndistvalues[i]);
+ dimension = i;
+ }
+ }
+
+ /*
+ * If we haven't found a dimension here, we've done something
+ * wrong in select_bucket_to_partition.
+ */
+ Assert(dimension != -1);
+
+ /*
+ * Walk through the selected dimension, collect and sort the values
+ * and then choose the value to use as the new boundary.
+ */
+ mystats = (StdAnalyzeData *) stats[dimension]->extra_data;
+
+ /* initialize sort support, etc. */
+ memset(&ssup, 0, sizeof(ssup));
+ ssup.ssup_cxt = CurrentMemoryContext;
+
+ /* We always use the default collation for statistics */
+ ssup.ssup_collation = DEFAULT_COLLATION_OID;
+ ssup.ssup_nulls_first = false;
+
+ PrepareSortSupportFromOrderingOp(mystats->ltopr, &ssup);
+
+ for (i = 0; i < data->numrows; i++)
+ {
+ /* remember the index of the sample row, to make the partitioning simpler */
+ values[nvalues].value = heap_getattr(data->rows[i], attrs->values[dimension],
+ stats[dimension]->tupDesc, &isNull);
+ values[nvalues].tupno = i;
+
+ /* no NULL values allowed here (we don't do splits by null-only dimensions) */
+ Assert(!isNull);
+
+ nvalues++;
+ }
+
+ /* sort the array of values */
+ qsort_arg((void *) values, nvalues, sizeof(ScalarItem),
+ compare_scalars_partition, (void *) &ssup);
+
+ /*
+ * We know there are data->ndistincts[dimension] distinct values
+ * in this dimension, and we want to split this into half, so walk
+ * through the array and stop once we see (ndistinct/2) values.
+ *
+ * We always choose the "next" value, i.e. (n/2+1)-th distinct value,
+ * and use it as an exclusive upper boundary (and inclusive lower
+ * boundary).
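+ *
+ * For example, with (hypothetical) sampled values {1,1,2,2,3,3} the
+ * loop below picks 2 as the split value, keeping the two rows with
+ * value 1 in the original bucket (exclusive upper bound 2) and moving
+ * the remaining four rows into the new bucket.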
+ *
+ * TODO Maybe we should use "average" of the two middle distinct
+ * values (at least for even distinct counts), but that would
+ * require being able to do an average (which does not work
+ * for non-arithmetic types).
+ *
+ * TODO Another option is to look for a split that'd give about
+ * 50% tuples (not distinct values) in each partition. That
+ * might work better when there are a few very frequent
+ * values, and many rare ones.
+ */
+ delta = fabs(data->numrows);
+ split_value = values[0].value;
+
+ for (i = 1; i < data->numrows; i++)
+ {
+ if (values[i].value != values[i-1].value)
+ {
+ /* are we closer to splitting the bucket in half? */
+ if (fabs(i - data->numrows/2.0) < delta)
+ {
+ /* let's assume we'll use this value for the split */
+ split_value = values[i].value;
+ delta = fabs(i - data->numrows/2.0);
+ nrows = i;
+ }
+ }
+ }
+
+ Assert(nrows > 0);
+ Assert(nrows < data->numrows);
+
+ /* create the new bucket as an (incomplete) copy of the one being partitioned */
+ new_bucket = copy_mv_bucket(bucket, numattrs);
+ new_data = (HistogramBuild)new_bucket->build_data;
+
+ /*
+ * Do the actual split of the chosen dimension, using the split value as the
+ * upper bound for the existing bucket, and lower bound for the new one.
+ */
+ bucket->max[dimension] = split_value;
+ new_bucket->min[dimension] = split_value;
+
+ bucket->max_inclusive[dimension] = false;
+ new_bucket->min_inclusive[dimension] = true;
+
+ /*
+ * Redistribute the sample tuples using the 'ScalarItem->tupno'
+ * index. We know 'nrows' rows should remain in the original
+ * bucket and the rest goes to the new one.
+ */
+
+ data->rows = (HeapTuple*)palloc0(nrows * sizeof(HeapTuple));
+ new_data->rows = (HeapTuple*)palloc0((oldnrows - nrows) * sizeof(HeapTuple));
+
+ data->numrows = nrows;
+ new_data->numrows = (oldnrows - nrows);
+
+ /*
+ * The first nrows should go to the first bucket, the rest should
+ * go to the new one. Use the tupno field to get the actual HeapTuple
+ * row from the original array of sample rows.
+ */
+ for (i = 0; i < nrows; i++)
+ memcpy(&data->rows[i], &oldrows[values[i].tupno], sizeof(HeapTuple));
+
+ for (i = nrows; i < oldnrows; i++)
+ memcpy(&new_data->rows[i-nrows], &oldrows[values[i].tupno], sizeof(HeapTuple));
+
+ /* update ndistinct values for the buckets (total and per dimension) */
+ update_bucket_ndistinct(bucket, attrs, stats);
+ update_bucket_ndistinct(new_bucket, attrs, stats);
+
+ /*
+ * TODO We don't need to do this for the dimension we used for split,
+ * because we know how many distinct values went to each partition.
+ */
+ for (i = 0; i < numattrs; i++)
+ {
+ update_dimension_ndistinct(bucket, i, attrs, stats, false);
+ update_dimension_ndistinct(new_bucket, i, attrs, stats, false);
+ }
+
+ pfree(oldrows);
+ pfree(values);
+
+ return new_bucket;
+}
+
+/*
+ * Copy a histogram bucket. The copy does not include the build-time
+ * data, i.e. sampled rows etc.
+ */
+static MVBucket
+copy_mv_bucket(MVBucket bucket, uint32 ndimensions)
+{
+ /* TODO allocate as a single piece (including all the fields) */
+ MVBucket new_bucket = (MVBucket)palloc0(sizeof(MVBucketData));
+ HistogramBuild data = (HistogramBuild)palloc0(sizeof(HistogramBuildData));
+
+ /*
+ * Copy only the attributes that will stay the same after the split;
+ * we'll recompute the rest after the split.
+ */
+
+ /* allocate the per-dimension arrays */
+ new_bucket->nullsonly = (bool*)palloc0(ndimensions * sizeof(bool));
+
+ /* inclusiveness boundaries - lower/upper bounds */
+ new_bucket->min_inclusive = (bool*)palloc0(ndimensions * sizeof(bool));
+ new_bucket->max_inclusive = (bool*)palloc0(ndimensions * sizeof(bool));
+
+ /* lower/upper boundaries */
+ new_bucket->min = (Datum*)palloc0(ndimensions * sizeof(Datum));
+ new_bucket->max = (Datum*)palloc0(ndimensions * sizeof(Datum));
+
+ /* copy data */
+ memcpy(new_bucket->nullsonly, bucket->nullsonly, ndimensions * sizeof(bool));
+
+ memcpy(new_bucket->min_inclusive, bucket->min_inclusive, ndimensions*sizeof(bool));
+ memcpy(new_bucket->min, bucket->min, ndimensions*sizeof(Datum));
+
+ memcpy(new_bucket->max_inclusive, bucket->max_inclusive, ndimensions*sizeof(bool));
+ memcpy(new_bucket->max, bucket->max, ndimensions*sizeof(Datum));
+
+ /* allocate and copy the interesting part of the build data */
+ data->ndistincts = (uint32*)palloc0(ndimensions * sizeof(uint32));
+
+ new_bucket->build_data = data;
+
+ return new_bucket;
+}
+
+/*
+ * Counts the number of distinct combinations of values in the bucket.
+ * The values are collected into an array of sort items and sorted using
+ * the multi-column sort support, so the distinct combinations can be
+ * counted by comparing neighboring items of the sorted array.
+ *
+ * TODO This might evaluate and store the distinct counts for all
+ * possible attribute combinations. The assumption is this might be
+ * useful for estimating things like GROUP BY cardinalities (e.g.
+ * in cases when some buckets contain a lot of low-frequency
+ * combinations, and other buckets contain few high-frequency ones).
+ *
+ * But it's unclear whether it's worth the price. Computing this
+ * is actually quite cheap, because it may be evaluated at the very
+ * end, when the buckets are rather small (so sorting it in 2^N ways
+ * is not a big deal). Assuming the partitioning algorithm does not
+ * use these values to do the decisions, of course (the current
+ * algorithm does not).
+ *
+ * The overhead with storing, fetching and parsing the data is more
+ * concerning - adding 2^N values per bucket (even if it's just
+ * a 1B or 2B value) would significantly bloat the histogram, and
+ * thus the impact on optimizer. Which is not really desirable.
+ *
+ * TODO This only updates the ndistinct for the sample (or bucket), but
+ * we eventually need an estimate of the total number of distinct
+ * values in the dataset. It's possible to either use the current
+ * 1D approach (i.e., if it's more than 10% of the sample, assume
+ * it's proportional to the number of rows). Or it's possible to
+ * implement the estimator suggested in the article, supposedly
+ * giving 'optimal' estimates (w.r.t. probability of error).
+ */
+static void
+update_bucket_ndistinct(MVBucket bucket, int2vector *attrs, VacAttrStats ** stats)
+{
+ int i, j;
+ int numattrs = attrs->dim1;
+
+ HistogramBuild data = (HistogramBuild)bucket->build_data;
+ int numrows = data->numrows;
+
+ MultiSortSupport mss = multi_sort_init(numattrs);
+
+ /*
+ * We could collect this while walking through all the attributes
+ * above (this way we have to call heap_getattr twice).
+ */
+ SortItem *items = (SortItem*)palloc0(numrows * sizeof(SortItem));
+ Datum *values = (Datum*)palloc0(numrows * sizeof(Datum) * numattrs);
+ bool *isnull = (bool*)palloc0(numrows * sizeof(bool) * numattrs);
+
+ for (i = 0; i < numrows; i++)
+ {
+ items[i].values = &values[i * numattrs];
+ items[i].isnull = &isnull[i * numattrs];
+ }
+
+ /* prepare the sort functions for all the dimensions */
+ for (i = 0; i < numattrs; i++)
+ multi_sort_add_dimension(mss, i, i, stats);
+
+ /* collect the values */
+ for (i = 0; i < numrows; i++)
+ for (j = 0; j < numattrs; j++)
+ items[i].values[j]
+ = heap_getattr(data->rows[i], attrs->values[j],
+ stats[j]->tupDesc, &items[i].isnull[j]);
+
+ qsort_arg((void *) items, numrows, sizeof(SortItem),
+ multi_sort_compare, mss);
+
+ data->ndistinct = 1;
+
+ for (i = 1; i < numrows; i++)
+ if (multi_sort_compare(&items[i], &items[i-1], mss) != 0)
+ data->ndistinct += 1;
+
+ pfree(items);
+ pfree(values);
+ pfree(isnull);
+}
+
+/*
+ * Count distinct values per bucket dimension.
+ */
+static void
+update_dimension_ndistinct(MVBucket bucket, int dimension, int2vector *attrs,
+ VacAttrStats ** stats, bool update_boundaries)
+{
+ int j;
+ int nvalues = 0;
+ bool isNull;
+ HistogramBuild data = (HistogramBuild)bucket->build_data;
+ Datum * values = (Datum*)palloc0(data->numrows * sizeof(Datum));
+ SortSupportData ssup;
+
+ StdAnalyzeData * mystats = (StdAnalyzeData *) stats[dimension]->extra_data;
+
+ /* we may already know this is a NULL-only dimension */
+ if (bucket->nullsonly[dimension])
+ data->ndistincts[dimension] = 1;
+
+ memset(&ssup, 0, sizeof(ssup));
+ ssup.ssup_cxt = CurrentMemoryContext;
+
+ /* We always use the default collation for statistics */
+ ssup.ssup_collation = DEFAULT_COLLATION_OID;
+ ssup.ssup_nulls_first = false;
+
+ PrepareSortSupportFromOrderingOp(mystats->ltopr, &ssup);
+
+ for (j = 0; j < data->numrows; j++)
+ {
+ values[nvalues] = heap_getattr(data->rows[j], attrs->values[dimension],
+ stats[dimension]->tupDesc, &isNull);
+
+ /* ignore NULL values */
+ if (! isNull)
+ nvalues++;
+ }
+
+ /* there's always at least 1 distinct value (may be NULL) */
+ data->ndistincts[dimension] = 1;
+
+ /*
+ * If there are only NULL values in the column, mark the dimension
+ * as NULL-only and bail out.
+ */
+ if (nvalues == 0)
+ {
+ pfree(values);
+ bucket->nullsonly[dimension] = true;
+ return;
+ }
+
+ /* sort the array (pass-by-value datums) */
+ qsort_arg((void *) values, nvalues, sizeof(Datum),
+ compare_scalars_simple, (void *) &ssup);
+
+ /*
+ * Update min/max boundaries to the smallest bounding box. Generally, this
+ * needs to be done only when constructing the initial bucket.
+ */
+ if (update_boundaries)
+ {
+ /* store the min/max values */
+ bucket->min[dimension] = values[0];
+ bucket->min_inclusive[dimension] = true;
+
+ bucket->max[dimension] = values[nvalues-1];
+ bucket->max_inclusive[dimension] = true;
+ }
+
+ /*
+ * Walk through the array and count distinct values by comparing
+ * succeeding values.
+ *
+ * FIXME This only works for pass-by-value types (i.e. not VARCHARs
+ * etc.). Although thanks to the deduplication it might work
+ * even for those types (equal values will get the same item
+ * in the deduplicated array).
+ */
+ for (j = 1; j < nvalues; j++)
+ {
+ if (values[j] != values[j-1])
+ data->ndistincts[dimension] += 1;
+ }
+
+ pfree(values);
+}
+
+/*
+ * A properly built histogram must not contain buckets mixing NULL and
+ * non-NULL values in a single dimension. Each dimension may either be
+ * marked as 'nulls only', and thus containing only NULL values, or
+ * it must not contain any NULL values.
+ *
+ * Therefore, if the sample contains NULL values in any of the columns,
+ * it's necessary to build those NULL-buckets. This is done in an
+ * iterative way using this algorithm, operating on a single bucket:
+ *
+ * (1) Check that all dimensions are well-formed (not mixing NULL
+ * and non-NULL values).
+ *
+ * (2) If all dimensions are well-formed, terminate.
+ *
+ * (3) If the dimension contains only NULL values, but is not
+ * marked as NULL-only, mark it as NULL-only and run the
+ * algorithm again (on this bucket).
+ *
+ * (4) If the dimension mixes NULL and non-NULL values, split the
+ * bucket into two parts - one with NULL values, one with
+ * non-NULL values (replacing the current one). Then run
+ * the algorithm on both buckets.
+ *
+ * This is executed in a recursive manner, but the number of executions
+ * should be quite low - limited by the number of NULL-buckets. Also,
+ * in each branch the number of nested calls is limited by the number
+ * of dimensions (attributes) of the histogram.
+ *
+ * At the end, there should be buckets with no mixed dimensions. The
+ * number of buckets produced by this algorithm is rather limited - with
+ * N dimensions, there may be only 2^N such buckets (each dimension may
+ * be either NULL or non-NULL). So with 8 dimensions (current value of
+ * MVSTATS_MAX_DIMENSIONS) there may be only 256 such buckets.
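+ *
+ * For example, with two dimensions (a,b) where both columns contain
+ * NULL values, this may produce up to four buckets - both dimensions
+ * non-NULL, only (a) NULL-only, only (b) NULL-only, and both
+ * dimensions NULL-only.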
+ *
+ * After this, a 'regular' bucket-split algorithm shall run, further
+ * optimizing the histogram.
+ */
+static void
+create_null_buckets(MVHistogram histogram, int bucket_idx,
+ int2vector *attrs, VacAttrStats ** stats)
+{
+ int i, j;
+ int null_dim = -1;
+ int null_count = 0;
+ bool null_found = false;
+ MVBucket bucket, null_bucket;
+ int null_idx, curr_idx;
+ HistogramBuild data, null_data;
+
+ /* remember original values from the bucket */
+ int numrows;
+ HeapTuple *oldrows = NULL;
+
+ Assert(bucket_idx < histogram->nbuckets);
+ Assert(histogram->ndimensions == attrs->dim1);
+
+ bucket = histogram->buckets[bucket_idx];
+ data = (HistogramBuild)bucket->build_data;
+
+ numrows = data->numrows;
+ oldrows = data->rows;
+
+ /*
+ * Walk through all rows / dimensions, and stop once we find NULL
+ * in a dimension not yet marked as NULL-only.
+ */
+ for (i = 0; i < data->numrows; i++)
+ {
+ /*
+ * FIXME We don't need to start from the first attribute
+ * here - we can start from the last known dimension.
+ */
+ for (j = 0; j < histogram->ndimensions; j++)
+ {
+ /* Is this a NULL-only dimension? If yes, skip. */
+ if (bucket->nullsonly[j])
+ continue;
+
+ /* found a NULL in that dimension? */
+ if (heap_attisnull(data->rows[i], attrs->values[j]))
+ {
+ null_found = true;
+ null_dim = j;
+ break;
+ }
+ }
+
+ /* terminate if we found attribute with NULL values */
+ if (null_found)
+ break;
+ }
+
+ /* no regular dimension contains NULL values => we're done */
+ if (! null_found)
+ return;
+
+ /* walk through the rows again, count NULL values in 'null_dim' */
+ for (i = 0; i < data->numrows; i++)
+ {
+ if (heap_attisnull(data->rows[i], attrs->values[null_dim]))
+ null_count += 1;
+ }
+
+ Assert(null_count <= data->numrows);
+
+ /*
+ * If (null_count == numrows) the dimension already is NULL-only,
+ * but is not yet marked like that. It's enough to mark it and
+ * repeat the process recursively (until we run out of dimensions).
+ */
+ if (null_count == data->numrows)
+ {
+ bucket->nullsonly[null_dim] = true;
+ create_null_buckets(histogram, bucket_idx, attrs, stats);
+ return;
+ }
+
+ /*
+ * We have to split the bucket into two - one with NULL values in
+ * the dimension, one with non-NULL values. We don't need to sort
+ * the data or anything, but otherwise it's similar to what's done
+ * in partition_bucket().
+ */
+
+ /* create bucket with NULL-only dimension 'dim' */
+ null_bucket = copy_mv_bucket(bucket, histogram->ndimensions);
+ null_data = (HistogramBuild)null_bucket->build_data;
+
+ /* remember the current array info */
+ oldrows = data->rows;
+ numrows = data->numrows;
+
+ /* we'll keep non-NULL values in the current bucket */
+ data->numrows = (numrows - null_count);
+ data->rows
+ = (HeapTuple*)palloc0(data->numrows * sizeof(HeapTuple));
+
+ /* and the NULL values will go to the new one */
+ null_data->numrows = null_count;
+ null_data->rows
+ = (HeapTuple*)palloc0(null_data->numrows * sizeof(HeapTuple));
+
+ /* mark the dimension as NULL-only (in the new bucket) */
+ null_bucket->nullsonly[null_dim] = true;
+
+ /* walk through the sample rows and distribute them accordingly */
+ null_idx = 0;
+ curr_idx = 0;
+ for (i = 0; i < numrows; i++)
+ {
+ if (heap_attisnull(oldrows[i], attrs->values[null_dim]))
+ /* NULL => copy to the new bucket */
+ memcpy(&null_data->rows[null_idx++], &oldrows[i],
+ sizeof(HeapTuple));
+ else
+ memcpy(&data->rows[curr_idx++], &oldrows[i],
+ sizeof(HeapTuple));
+ }
+
+ /* update ndistinct values for the buckets (total and per dimension) */
+ update_bucket_ndistinct(bucket, attrs, stats);
+ update_bucket_ndistinct(null_bucket, attrs, stats);
+
+ /*
+ * TODO We don't need to do this for the dimension we used for split,
+ * because we know how many distinct values went to each
+ * bucket (NULL is not a value, so 0, and the other bucket got
+ * all the ndistinct values).
+ */
+ for (i = 0; i < histogram->ndimensions; i++)
+ {
+ update_dimension_ndistinct(bucket, i, attrs, stats, false);
+ update_dimension_ndistinct(null_bucket, i, attrs, stats, false);
+ }
+
+ pfree(oldrows);
+
+ /* add the NULL bucket to the histogram */
+ histogram->buckets[histogram->nbuckets++] = null_bucket;
+
+ /*
+ * And now run the function recursively on both buckets (the new
+ * one first, because the call may change number of buckets, and
+ * it's used as an index).
+ */
+ create_null_buckets(histogram, (histogram->nbuckets-1), attrs, stats);
+ create_null_buckets(histogram, bucket_idx, attrs, stats);
+}
+
+/*
+ * We need to pass the SortSupport to the comparator, but bsearch()
+ * has no 'context' parameter, so we use a global variable (ugly).
+ */
+static int
+bsearch_comparator(const void * a, const void * b)
+{
+ Assert(ssup_private != NULL);
+ return compare_scalars_simple(a, b, (void*)ssup_private);
+}
+
+/*
+ * SRF with details about buckets of a histogram:
+ *
+ * - bucket ID (0...nbuckets-1)
+ * - min values (string array)
+ * - max values (string array)
+ * - nulls only (boolean array)
+ * - min inclusive flags (boolean array)
+ * - max inclusive flags (boolean array)
+ * - frequency (double precision)
+ * - density (double precision)
+ * - bucket size (double precision)
+ *
+ * The input is the OID of the statistics, and there are no rows
+ * returned if the statistics contains no histogram (or if there's no
+ * statistics for the OID).
+ *
+ * The second parameter (type) determines what values will be returned
+ * in the (minvals,maxvals). There are three possible values:
+ *
+ * 0 (actual values)
+ * -----------------
+ * - prints actual values
+ * - using the output function of the data type (as string)
+ * - handy for investigating the histogram
+ *
+ * 1 (distinct index)
+ * ------------------
+ * - prints index of the distinct value (into the serialized array)
+ * - makes it easier to spot neighbor buckets, etc.
+ * - handy for plotting the histogram
+ *
+ * 2 (normalized distinct index)
+ * -----------------------------
+ * - prints index of the distinct value, but normalized into [0,1]
+ * - similar to 1, but shows how 'long' the bucket range is
+ * - handy for plotting the histogram
+ *
+ * When plotting the histogram, be careful as the (1) and (2) options
+ * skew the lengths by distributing the distinct values uniformly. For
+ * data types without a clear meaning of 'distance' (e.g. strings) that
+ * is not a big deal, but for numbers it may be confusing.
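+ *
+ * A usage sketch (the statistics OID is simply looked up here, any
+ * other way of obtaining it works just as well):
+ *
+ *   SELECT * FROM pg_mv_histogram_buckets(
+ *     (SELECT oid FROM pg_mv_statistic LIMIT 1), 0);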
+ */
+PG_FUNCTION_INFO_V1(pg_mv_histogram_buckets);
+
+Datum
+pg_mv_histogram_buckets(PG_FUNCTION_ARGS)
+{
+ FuncCallContext *funcctx;
+ int call_cntr;
+ int max_calls;
+ TupleDesc tupdesc;
+ AttInMetadata *attinmeta;
+
+ Oid mvoid = PG_GETARG_OID(0);
+ int otype = PG_GETARG_INT32(1);
+
+ if ((otype < 0) || (otype > 2))
+ elog(ERROR, "invalid output type specified");
+
+ /* stuff done only on the first call of the function */
+ if (SRF_IS_FIRSTCALL())
+ {
+ MemoryContext oldcontext;
+ MVSerializedHistogram histogram;
+
+ /* create a function context for cross-call persistence */
+ funcctx = SRF_FIRSTCALL_INIT();
+
+ /* switch to memory context appropriate for multiple function calls */
+ oldcontext = MemoryContextSwitchTo(funcctx->multi_call_memory_ctx);
+
+ histogram = load_mv_histogram(mvoid);
+
+ funcctx->user_fctx = histogram;
+
+ /* total number of tuples to be returned */
+ funcctx->max_calls = 0;
+ if (funcctx->user_fctx != NULL)
+ funcctx->max_calls = histogram->nbuckets;
+
+ /* Build a tuple descriptor for our result type */
+ if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("function returning record called in context "
+ "that cannot accept type record")));
+
+ /*
+ * generate attribute metadata needed later to produce tuples
+ * from raw C strings
+ */
+ attinmeta = TupleDescGetAttInMetadata(tupdesc);
+ funcctx->attinmeta = attinmeta;
+
+ MemoryContextSwitchTo(oldcontext);
+ }
+
+ /* stuff done on every call of the function */
+ funcctx = SRF_PERCALL_SETUP();
+
+ call_cntr = funcctx->call_cntr;
+ max_calls = funcctx->max_calls;
+ attinmeta = funcctx->attinmeta;
+
+ if (call_cntr < max_calls) /* do when there is more left to send */
+ {
+ char **values;
+ HeapTuple tuple;
+ Datum result;
+ int2vector *stakeys;
+ Oid relid;
+ double bucket_size = 1.0;
+
+ char *buff = palloc0(1024);
+ char *format;
+
+ int i;
+
+ Oid *outfuncs;
+ FmgrInfo *fmgrinfo;
+
+ MVSerializedHistogram histogram;
+ MVSerializedBucket bucket;
+
+ histogram = (MVSerializedHistogram)funcctx->user_fctx;
+
+ Assert(call_cntr < histogram->nbuckets);
+
+ bucket = histogram->buckets[call_cntr];
+
+ stakeys = find_mv_attnums(mvoid, &relid);
+
+ /*
+ * Prepare a values array for building the returned tuple.
+ * This should be an array of C strings which will
+ * be processed later by the type input functions.
+ */
+ values = (char **) palloc(9 * sizeof(char *));
+
+ values[0] = (char *) palloc(64 * sizeof(char));
+
+ /* arrays */
+ values[1] = (char *) palloc0(1024 * sizeof(char));
+ values[2] = (char *) palloc0(1024 * sizeof(char));
+ values[3] = (char *) palloc0(1024 * sizeof(char));
+ values[4] = (char *) palloc0(1024 * sizeof(char));
+ values[5] = (char *) palloc0(1024 * sizeof(char));
+
+ values[6] = (char *) palloc(64 * sizeof(char));
+ values[7] = (char *) palloc(64 * sizeof(char));
+ values[8] = (char *) palloc(64 * sizeof(char));
+
+ /* we need to do this only when printing the actual values */
+ outfuncs = (Oid*)palloc0(sizeof(Oid) * histogram->ndimensions);
+ fmgrinfo = (FmgrInfo*)palloc0(sizeof(FmgrInfo) * histogram->ndimensions);
+
+ for (i = 0; i < histogram->ndimensions; i++)
+ {
+ bool isvarlena;
+
+ getTypeOutputInfo(get_atttype(relid, stakeys->values[i]),
+ &outfuncs[i], &isvarlena);
+
+ fmgr_info(outfuncs[i], &fmgrinfo[i]);
+ }
+
+ snprintf(values[0], 64, "%d", call_cntr); /* bucket ID */
+
+ /*
+ * Print the boundary values in the requested format - the actual
+ * values (otype 0), indexes into the deduplicated arrays (otype 1),
+ * or the indexes normalized into [0,1] (otype 2). The deduplicated
+ * values are sorted, so the indexes are quite useful too.
+ */
+
+ for (i = 0; i < histogram->ndimensions; i++)
+ {
+ bucket_size *= (bucket->max[i] - bucket->min[i]) * 1.0
+ / (histogram->nvalues[i]-1);
+
+ /* print the actual values, i.e. use output function etc. */
+ if (otype == 0)
+ {
+ Datum minval, maxval;
+ Datum minout, maxout;
+
+ format = "%s, %s";
+ if (i == 0)
+ format = "{%s%s";
+ else if (i == histogram->ndimensions-1)
+ format = "%s, %s}";
+
+ minval = histogram->values[i][bucket->min[i]];
+ minout = FunctionCall1(&fmgrinfo[i], minval);
+
+ maxval = histogram->values[i][bucket->max[i]];
+ maxout = FunctionCall1(&fmgrinfo[i], maxval);
+
+ snprintf(buff, 1024, format, values[1], DatumGetPointer(minout));
+ strncpy(values[1], buff, 1023);
+ buff[0] = '\0';
+
+ snprintf(buff, 1024, format, values[2], DatumGetPointer(maxout));
+ strncpy(values[2], buff, 1023);
+ buff[0] = '\0';
+ }
+ else if (otype == 1)
+ {
+ format = "%s, %d";
+ if (i == 0)
+ format = "{%s%d";
+ else if (i == histogram->ndimensions-1)
+ format = "%s, %d}";
+
+ snprintf(buff, 1024, format, values[1], bucket->min[i]);
+ strncpy(values[1], buff, 1023);
+ buff[0] = '\0';
+
+ snprintf(buff, 1024, format, values[2], bucket->max[i]);
+ strncpy(values[2], buff, 1023);
+ buff[0] = '\0';
+ }
+ else
+ {
+ format = "%s, %f";
+ if (i == 0)
+ format = "{%s%f";
+ else if (i == histogram->ndimensions-1)
+ format = "%s, %f}";
+
+ snprintf(buff, 1024, format, values[1],
+ bucket->min[i] * 1.0 / (histogram->nvalues[i]-1));
+ strncpy(values[1], buff, 1023);
+ buff[0] = '\0';
+
+ snprintf(buff, 1024, format, values[2],
+ bucket->max[i] * 1.0 / (histogram->nvalues[i]-1));
+ strncpy(values[2], buff, 1023);
+ buff[0] = '\0';
+ }
+
+ format = "%s, %s";
+ if (i == 0)
+ format = "{%s%s";
+ else if (i == histogram->ndimensions-1)
+ format = "%s, %s}";
+
+ snprintf(buff, 1024, format, values[3], bucket->nullsonly[i] ? "t" : "f");
+ strncpy(values[3], buff, 1023);
+ buff[0] = '\0';
+
+ snprintf(buff, 1024, format, values[4], bucket->min_inclusive[i] ? "t" : "f");
+ strncpy(values[4], buff, 1023);
+ buff[0] = '\0';
+
+ snprintf(buff, 1024, format, values[5], bucket->max_inclusive[i] ? "t" : "f");
+ strncpy(values[5], buff, 1023);
+ buff[0] = '\0';
+ }
+
+ snprintf(values[6], 64, "%f", bucket->ntuples); /* frequency */
+ snprintf(values[7], 64, "%f", bucket->ntuples / bucket_size); /* density */
+ snprintf(values[8], 64, "%f", bucket_size); /* bucket_size */
+
+ /* build a tuple */
+ tuple = BuildTupleFromCStrings(attinmeta, values);
+
+ /* make the tuple into a datum */
+ result = HeapTupleGetDatum(tuple);
+
+ /* clean up (this is not really necessary) */
+ pfree(values[0]);
+ pfree(values[1]);
+ pfree(values[2]);
+ pfree(values[3]);
+ pfree(values[4]);
+ pfree(values[5]);
+ pfree(values[6]);
+ pfree(values[7]);
+ pfree(values[8]);
+
+ pfree(values);
+
+ SRF_RETURN_NEXT(funcctx, result);
+ }
+ else /* do when there is no more left */
+ {
+ SRF_RETURN_DONE(funcctx);
+ }
+}
+
+#ifdef DEBUG_MVHIST
+/*
+ * prints debugging info about matched histogram buckets (full/partial)
+ *
+ * XXX Currently works only for INT data type.
+ */
+void
+debug_histogram_matches(MVSerializedHistogram mvhist, char *matches)
+{
+ int i, j;
+
+ float ffull = 0, fpartial = 0;
+ int nfull = 0, npartial = 0;
+
+ for (i = 0; i < mvhist->nbuckets; i++)
+ {
+ MVSerializedBucket bucket = mvhist->buckets[i];
+
+ char ranges[1024];
+
+ if (! matches[i])
+ continue;
+
+ /* increment the counters */
+ nfull += (matches[i] == MVSTATS_MATCH_FULL) ? 1 : 0;
+ npartial += (matches[i] == MVSTATS_MATCH_PARTIAL) ? 1 : 0;
+
+ /* and also update the frequencies */
+ ffull += (matches[i] == MVSTATS_MATCH_FULL) ? bucket->ntuples : 0;
+ fpartial += (matches[i] == MVSTATS_MATCH_PARTIAL) ? bucket->ntuples : 0;
+
+ memset(ranges, 0, sizeof(ranges));
+
+ /* build ranges for all the dimensions */
+ for (j = 0; j < mvhist->ndimensions; j++)
+ {
+ /* append to the buffer (sprintf with overlapping buffers is undefined) */
+ snprintf(ranges + strlen(ranges), sizeof(ranges) - strlen(ranges),
+ " [%d %d]",
+ DatumGetInt32(mvhist->values[j][bucket->min[j]]),
+ DatumGetInt32(mvhist->values[j][bucket->max[j]]));
+ }
+
+ elog(WARNING, "bucket %d %s => %d [%f]", i, ranges, matches[i], bucket->ntuples);
+ }
+
+ elog(WARNING, "full=%f partial=%f (%f)", ffull, fpartial, (ffull + 0.5 * fpartial));
+}
+#endif
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 6339631..3543239 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -2109,9 +2109,9 @@ describeOneTableDetails(const char *schemaname,
{
printfPQExpBuffer(&buf,
"SELECT oid, stanamespace::regnamespace AS nsp, staname, stakeys,\n"
- " deps_enabled, mcv_enabled,\n"
- " deps_built, mcv_built,\n"
- " mcv_max_items,\n"
+ " deps_enabled, mcv_enabled, hist_enabled,\n"
+ " deps_built, mcv_built, hist_built,\n"
+ " mcv_max_items, hist_max_buckets,\n"
" (SELECT string_agg(attname::text,', ')\n"
" FROM ((SELECT unnest(stakeys) AS attnum) s\n"
" JOIN pg_attribute a ON (starelid = a.attrelid and a.attnum = s.attnum))) AS attnums\n"
@@ -2154,8 +2154,17 @@ describeOneTableDetails(const char *schemaname,
first = false;
}
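+ /* column 6 is hist_enabled - append 'histogram' to the list if set */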
+ if (!strcmp(PQgetvalue(result, i, 6), "t"))
+ {
+ if (! first)
+ appendPQExpBuffer(&buf, ", histogram");
+ else
+ appendPQExpBuffer(&buf, "(histogram");
+ first = false;
+ }
+
appendPQExpBuffer(&buf, ") ON (%s)",
- PQgetvalue(result, i, 9));
+ PQgetvalue(result, i, 12));
printTableAddFooter(&cont, buf.data);
}
diff --git a/src/include/catalog/pg_mv_statistic.h b/src/include/catalog/pg_mv_statistic.h
index fd7107d..a5945af 100644
--- a/src/include/catalog/pg_mv_statistic.h
+++ b/src/include/catalog/pg_mv_statistic.h
@@ -38,13 +38,16 @@ CATALOG(pg_mv_statistic,3381)
/* statistics requested to build */
bool deps_enabled; /* analyze dependencies? */
bool mcv_enabled; /* build MCV list? */
+ bool hist_enabled; /* build histogram? */
- /* MCV size */
+ /* histogram / MCV size */
int32 mcv_max_items; /* max MCV items */
+ int32 hist_max_buckets; /* max histogram buckets */
/* statistics that are available (if requested) */
bool deps_built; /* dependencies were built */
bool mcv_built; /* MCV list was built */
+ bool hist_built; /* histogram was built */
/* variable-length fields start here, but we allow direct access to stakeys */
int2vector stakeys; /* array of column keys */
@@ -52,6 +55,7 @@ CATALOG(pg_mv_statistic,3381)
#ifdef CATALOG_VARLEN
bytea stadeps; /* dependencies (serialized) */
bytea stamcv; /* MCV list (serialized) */
+ bytea stahist; /* MV histogram (serialized) */
#endif
} FormData_pg_mv_statistic;
@@ -67,17 +71,21 @@ typedef FormData_pg_mv_statistic *Form_pg_mv_statistic;
* compiler constants for pg_mv_statistic
* ----------------
*/
-#define Natts_pg_mv_statistic 11
+#define Natts_pg_mv_statistic 15
#define Anum_pg_mv_statistic_starelid 1
#define Anum_pg_mv_statistic_staname 2
#define Anum_pg_mv_statistic_stanamespace 3
#define Anum_pg_mv_statistic_deps_enabled 4
#define Anum_pg_mv_statistic_mcv_enabled 5
-#define Anum_pg_mv_statistic_mcv_max_items 6
-#define Anum_pg_mv_statistic_deps_built 7
-#define Anum_pg_mv_statistic_mcv_built 8
-#define Anum_pg_mv_statistic_stakeys 9
-#define Anum_pg_mv_statistic_stadeps 10
-#define Anum_pg_mv_statistic_stamcv 11
+#define Anum_pg_mv_statistic_hist_enabled 6
+#define Anum_pg_mv_statistic_mcv_max_items 7
+#define Anum_pg_mv_statistic_hist_max_buckets 8
+#define Anum_pg_mv_statistic_deps_built 9
+#define Anum_pg_mv_statistic_mcv_built 10
+#define Anum_pg_mv_statistic_hist_built 11
+#define Anum_pg_mv_statistic_stakeys 12
+#define Anum_pg_mv_statistic_stadeps 13
+#define Anum_pg_mv_statistic_stamcv 14
+#define Anum_pg_mv_statistic_stahist 15
#endif /* PG_MV_STATISTIC_H */
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
index b16eebc..19a490a 100644
--- a/src/include/catalog/pg_proc.h
+++ b/src/include/catalog/pg_proc.h
@@ -2674,6 +2674,10 @@ DATA(insert OID = 3376 ( pg_mv_stats_mcvlist_info PGNSP PGUID 12 1 0 0 0 f f f
DESCR("multi-variate statistics: MCV list info");
DATA(insert OID = 3373 ( pg_mv_mcv_items PGNSP PGUID 12 1 1000 0 0 f f f f t t i s 1 0 2249 "26" "{26,23,1009,1000,701}" "{i,o,o,o,o}" "{oid,index,values,nulls,frequency}" _null_ _null_ pg_mv_mcv_items _null_ _null_ _null_ ));
DESCR("details about MCV list items");
+DATA(insert OID = 3375 ( pg_mv_stats_histogram_info PGNSP PGUID 12 1 0 0 0 f f f f t f i s 1 0 25 "17" _null_ _null_ _null_ _null_ _null_ pg_mv_stats_histogram_info _null_ _null_ _null_ ));
+DESCR("multi-variate statistics: histogram info");
+DATA(insert OID = 3374 ( pg_mv_histogram_buckets PGNSP PGUID 12 1 1000 0 0 f f f f t t i s 2 0 2249 "26 23" "{26,23,23,1009,1009,1000,1000,1000,701,701,701}" "{i,i,o,o,o,o,o,o,o,o,o}" "{oid,otype,index,minvals,maxvals,nullsonly,mininclusive,maxinclusive,frequency,density,bucket_size}" _null_ _null_ pg_mv_histogram_buckets _null_ _null_ _null_ ));
+DESCR("details about histogram buckets");
DATA(insert OID = 1928 ( pg_stat_get_numscans PGNSP PGUID 12 1 0 0 0 f f f f t f s r 1 0 20 "26" _null_ _null_ _null_ _null_ _null_ pg_stat_get_numscans _null_ _null_ _null_ ));
DESCR("statistics: number of scans done for table/index");
diff --git a/src/include/nodes/relation.h b/src/include/nodes/relation.h
index 2bcd582..8c50bfb 100644
--- a/src/include/nodes/relation.h
+++ b/src/include/nodes/relation.h
@@ -654,10 +654,12 @@ typedef struct MVStatisticInfo
/* enabled statistics */
bool deps_enabled; /* functional dependencies enabled */
bool mcv_enabled; /* MCV list enabled */
+ bool hist_enabled; /* histogram enabled */
/* built/available statistics */
bool deps_built; /* functional dependencies built */
bool mcv_built; /* MCV list built */
+ bool hist_built; /* histogram built */
/* columns in the statistics (attnums) */
int2vector *stakeys; /* attnums of the columns covered */
diff --git a/src/include/utils/mvstats.h b/src/include/utils/mvstats.h
index 4535db7..f05a517 100644
--- a/src/include/utils/mvstats.h
+++ b/src/include/utils/mvstats.h
@@ -92,6 +92,123 @@ typedef MCVListData *MCVList;
#define MVSTAT_MCVLIST_MAX_ITEMS 8192 /* max items in MCV list */
/*
+ * Multivariate histograms
+ */
+typedef struct MVBucketData {
+
+ /* Frequencies of this bucket. */
+ float ntuples; /* frequency of tuples in this bucket */
+
+ /*
+ * Information about dimensions being NULL-only. Not yet used.
+ */
+ bool *nullsonly;
+
+ /* lower boundaries - values and information about the inequalities */
+ Datum *min;
+ bool *min_inclusive;
+
+ /* upper boundaries - values and information about the inequalities */
+ Datum *max;
+ bool *max_inclusive;
+
+ /* used when building the histogram (not serialized/deserialized) */
+ void *build_data;
+
+} MVBucketData;
+
+typedef MVBucketData *MVBucket;
+
+
+typedef struct MVHistogramData {
+
+ uint32 magic; /* magic constant marker */
+ uint32 type; /* type of histogram (BASIC) */
+ uint32 nbuckets; /* number of buckets (buckets array) */
+ uint32 ndimensions; /* number of dimensions */
+
+ MVBucket *buckets; /* array of buckets */
+
+} MVHistogramData;
+
+typedef MVHistogramData *MVHistogram;
+
+/*
+ * Histogram in a partially serialized form, with deduplicated boundary
+ * values. The distinct boundary values are kept in per-dimension arrays
+ * ('nvalues' and 'values' in MVSerializedHistogramData), and the buckets
+ * reference them using uint16 indexes instead of full Datum values,
+ * which makes the buckets considerably smaller.
+ */
+
+typedef struct MVSerializedBucketData {
+
+ /* Frequencies of this bucket. */
+ float ntuples; /* frequency of tuples in this bucket */
+
+ /*
+ * Information about dimensions being NULL-only. Not yet used.
+ */
+ bool *nullsonly;
+
+ /* indexes of lower boundaries - values and information about the
+ * inequalities (exclusive vs. inclusive) */
+ uint16 *min;
+ bool *min_inclusive;
+
+ /* indexes of upper boundaries - values and information about the
+ * inequalities (exclusive vs. inclusive) */
+ uint16 *max;
+ bool *max_inclusive;
+
+} MVSerializedBucketData;
+
+typedef MVSerializedBucketData *MVSerializedBucket;
+
+typedef struct MVSerializedHistogramData {
+
+ uint32 magic; /* magic constant marker */
+ uint32 type; /* type of histogram (BASIC) */
+ uint32 nbuckets; /* number of buckets (buckets array) */
+ uint32 ndimensions; /* number of dimensions */
+
+ /*
+ * keep this the same as in MVHistogramData, because deserialization
+ * relies on the field being at the same offset
+ */
+ MVSerializedBucket *buckets; /* array of buckets */
+
+ /*
+ * serialized boundary values, one array per dimension, deduplicated
+ * (the min/max indexes point into these arrays)
+ */
+ int *nvalues;
+ Datum **values;
+
+} MVSerializedHistogramData;
+
+typedef MVSerializedHistogramData *MVSerializedHistogram;
+
+
+/* used to flag stats serialized to bytea */
+#define MVSTAT_HIST_MAGIC 0x7F8C5670 /* marks serialized bytea */
+#define MVSTAT_HIST_TYPE_BASIC 1 /* basic histogram type */
+
+/*
+ * Limits used for max_buckets option, i.e. we're always guaranteed
+ * to have space for at least MVSTAT_HIST_MIN_BUCKETS, and we cannot
+ * have more than MVSTAT_HIST_MAX_BUCKETS buckets.
+ *
+ * This is just a boundary for the 'max' threshold - the actual
+ * histogram may use fewer buckets than MVSTAT_HIST_MAX_BUCKETS.
+ *
+ * TODO The MVSTAT_HIST_MIN_BUCKETS should be related to the number of
+ * attributes (MVSTATS_MAX_DIMENSIONS) because of NULL-buckets.
+ * There should be at least 2^N buckets, otherwise we may be unable
+ * to build the NULL buckets.
+ */
+#define MVSTAT_HIST_MIN_BUCKETS 128 /* min number of buckets */
+#define MVSTAT_HIST_MAX_BUCKETS 16384 /* max number of buckets */
+
+/*
* TODO Maybe fetching the histogram/MCV list separately is inefficient?
* Consider adding a single `fetch_stats` method, fetching all
* stats specified using flags (or something like that).
@@ -99,20 +216,25 @@ typedef MCVListData *MCVList;
MVDependencies load_mv_dependencies(Oid mvoid);
MCVList load_mv_mcvlist(Oid mvoid);
+MVSerializedHistogram load_mv_histogram(Oid mvoid);
bytea * serialize_mv_dependencies(MVDependencies dependencies);
bytea * serialize_mv_mcvlist(MCVList mcvlist, int2vector *attrs,
VacAttrStats **stats);
+bytea * serialize_mv_histogram(MVHistogram histogram, int2vector *attrs,
+ VacAttrStats **stats);
/* deserialization of stats (serialization is private to analyze) */
MVDependencies deserialize_mv_dependencies(bytea * data);
MCVList deserialize_mv_mcvlist(bytea * data);
+MVSerializedHistogram deserialize_mv_histogram(bytea * data);
/*
* Returns index of the attribute number within the vector (i.e. a
* dimension within the stats).
*/
int mv_get_index(AttrNumber varattno, int2vector * stakeys);
int2vector* find_mv_attnums(Oid mvoid, Oid *relid);
@@ -121,6 +243,8 @@ extern Datum pg_mv_stats_dependencies_info(PG_FUNCTION_ARGS);
extern Datum pg_mv_stats_dependencies_show(PG_FUNCTION_ARGS);
extern Datum pg_mv_stats_mcvlist_info(PG_FUNCTION_ARGS);
extern Datum pg_mv_mcvlist_items(PG_FUNCTION_ARGS);
+extern Datum pg_mv_stats_histogram_info(PG_FUNCTION_ARGS);
+extern Datum pg_mv_histogram_buckets(PG_FUNCTION_ARGS);
MVDependencies
build_mv_dependencies(int numrows, HeapTuple *rows, int2vector *attrs,
@@ -130,10 +254,20 @@ MCVList
build_mv_mcvlist(int numrows, HeapTuple *rows, int2vector *attrs,
VacAttrStats **stats, int *numrows_filtered);
+MVHistogram
+build_mv_histogram(int numrows, HeapTuple *rows, int2vector *attrs,
+ VacAttrStats **stats, int numrows_total);
+
void build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
int natts, VacAttrStats **vacattrstats);
-void update_mv_stats(Oid relid, MVDependencies dependencies, MCVList mcvlist,
+void update_mv_stats(Oid relid, MVDependencies dependencies,
+ MCVList mcvlist, MVHistogram histogram,
int2vector *attrs, VacAttrStats **stats);
+#ifdef DEBUG_MVHIST
+extern void debug_histogram_matches(MVSerializedHistogram mvhist, char *matches);
+#endif
+
+
#endif
diff --git a/src/test/regress/expected/mv_histogram.out b/src/test/regress/expected/mv_histogram.out
new file mode 100644
index 0000000..e830816
--- /dev/null
+++ b/src/test/regress/expected/mv_histogram.out
@@ -0,0 +1,207 @@
+-- data type passed by value
+CREATE TABLE mv_histogram (
+ a INT,
+ b INT,
+ c INT
+);
+-- unknown column
+CREATE STATISTICS s7 ON mv_histogram (unknown_column) WITH (histogram);
+ERROR: column "unknown_column" referenced in statistics does not exist
+-- single column
+CREATE STATISTICS s7 ON mv_histogram (a) WITH (histogram);
+ERROR: multivariate stats require 2 or more columns
+-- single column, duplicated
+CREATE STATISTICS s7 ON mv_histogram (a, a) WITH (histogram);
+ERROR: duplicate column name in statistics definition
+-- two columns, one duplicated
+CREATE STATISTICS s7 ON mv_histogram (a, a, b) WITH (histogram);
+ERROR: duplicate column name in statistics definition
+-- unknown option
+CREATE STATISTICS s7 ON mv_histogram (a, b, c) WITH (unknown_option);
+ERROR: unrecognized STATISTICS option "unknown_option"
+-- missing histogram statistics
+CREATE STATISTICS s7 ON mv_histogram (a, b, c) WITH (dependencies, max_buckets=200);
+ERROR: option 'histogram' is required by other option(s)
+-- invalid max_buckets value / too low
+CREATE STATISTICS s7 ON mv_histogram (a, b, c) WITH (mcv, max_buckets=10);
+ERROR: minimum number of buckets is 128
+-- invalid max_buckets value / too high
+CREATE STATISTICS s7 ON mv_histogram (a, b, c) WITH (mcv, max_buckets=100000);
+ERROR: maximum number of buckets is 16384
+-- correct command
+CREATE STATISTICS s7 ON mv_histogram (a, b, c) WITH (histogram);
+-- random data (no functional dependencies)
+INSERT INTO mv_histogram
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+ANALYZE mv_histogram;
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+ hist_enabled | hist_built
+--------------+------------
+ t | t
+(1 row)
+
+TRUNCATE mv_histogram;
+-- a => b, a => c, b => c
+INSERT INTO mv_histogram
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE mv_histogram;
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+ hist_enabled | hist_built
+--------------+------------
+ t | t
+(1 row)
+
+TRUNCATE mv_histogram;
+-- a => b, a => c
+INSERT INTO mv_histogram
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE mv_histogram;
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+ hist_enabled | hist_built
+--------------+------------
+ t | t
+(1 row)
+
+TRUNCATE mv_histogram;
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO mv_histogram
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX hist_idx ON mv_histogram (a, b);
+ANALYZE mv_histogram;
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+ hist_enabled | hist_built
+--------------+------------
+ t | t
+(1 row)
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mv_histogram WHERE a = 10 AND b = 5;
+ QUERY PLAN
+--------------------------------------------
+ Bitmap Heap Scan on mv_histogram
+ Recheck Cond: ((a = 10) AND (b = 5))
+ -> Bitmap Index Scan on hist_idx
+ Index Cond: ((a = 10) AND (b = 5))
+(4 rows)
+
+DROP TABLE mv_histogram;
+-- varlena type (text)
+CREATE TABLE mv_histogram (
+ a TEXT,
+ b TEXT,
+ c TEXT
+);
+CREATE STATISTICS s8 ON mv_histogram (a, b, c) WITH (histogram);
+-- random data (no functional dependencies)
+INSERT INTO mv_histogram
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+ANALYZE mv_histogram;
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+ hist_enabled | hist_built
+--------------+------------
+ t | t
+(1 row)
+
+TRUNCATE mv_histogram;
+-- a => b, a => c, b => c
+INSERT INTO mv_histogram
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE mv_histogram;
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+ hist_enabled | hist_built
+--------------+------------
+ t | t
+(1 row)
+
+TRUNCATE mv_histogram;
+-- a => b, a => c
+INSERT INTO mv_histogram
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE mv_histogram;
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+ hist_enabled | hist_built
+--------------+------------
+ t | t
+(1 row)
+
+TRUNCATE mv_histogram;
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO mv_histogram
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX hist_idx ON mv_histogram (a, b);
+ANALYZE mv_histogram;
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+ hist_enabled | hist_built
+--------------+------------
+ t | t
+(1 row)
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mv_histogram WHERE a = '10' AND b = '5';
+ QUERY PLAN
+------------------------------------------------------------
+ Bitmap Heap Scan on mv_histogram
+ Recheck Cond: ((a = '10'::text) AND (b = '5'::text))
+ -> Bitmap Index Scan on hist_idx
+ Index Cond: ((a = '10'::text) AND (b = '5'::text))
+(4 rows)
+
+TRUNCATE mv_histogram;
+-- check explain (expect bitmap index scan, not plain index scan) with NULLs
+INSERT INTO mv_histogram
+ SELECT
+ (CASE WHEN i/10000 = 0 THEN NULL ELSE i/10000 END),
+ (CASE WHEN i/20000 = 0 THEN NULL ELSE i/20000 END),
+ (CASE WHEN i/40000 = 0 THEN NULL ELSE i/40000 END)
+ FROM generate_series(1,1000000) s(i);
+ANALYZE mv_histogram;
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+ hist_enabled | hist_built
+--------------+------------
+ t | t
+(1 row)
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mv_histogram WHERE a IS NULL AND b IS NULL;
+ QUERY PLAN
+---------------------------------------------------
+ Bitmap Heap Scan on mv_histogram
+ Recheck Cond: ((a IS NULL) AND (b IS NULL))
+ -> Bitmap Index Scan on hist_idx
+ Index Cond: ((a IS NULL) AND (b IS NULL))
+(4 rows)
+
+DROP TABLE mv_histogram;
+-- NULL values (mix of int and text columns)
+CREATE TABLE mv_histogram (
+ a INT,
+ b TEXT,
+ c INT,
+ d TEXT
+);
+CREATE STATISTICS s9 ON mv_histogram (a, b, c, d) WITH (histogram);
+INSERT INTO mv_histogram
+ SELECT
+ mod(i, 100),
+ (CASE WHEN mod(i, 200) = 0 THEN NULL ELSE mod(i,200) END),
+ mod(i, 400),
+ (CASE WHEN mod(i, 300) = 0 THEN NULL ELSE mod(i,600) END)
+ FROM generate_series(1,10000) s(i);
+ANALYZE mv_histogram;
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+ hist_enabled | hist_built
+--------------+------------
+ t | t
+(1 row)
+
+DROP TABLE mv_histogram;
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 66071d8..1a1a4ca 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1375,7 +1375,9 @@ pg_mv_stats| SELECT n.nspname AS schemaname,
length(s.stadeps) AS depsbytes,
pg_mv_stats_dependencies_info(s.stadeps) AS depsinfo,
length(s.stamcv) AS mcvbytes,
- pg_mv_stats_mcvlist_info(s.stamcv) AS mcvinfo
+ pg_mv_stats_mcvlist_info(s.stamcv) AS mcvinfo,
+ length(s.stahist) AS histbytes,
+ pg_mv_stats_histogram_info(s.stahist) AS histinfo
FROM ((pg_mv_statistic s
JOIN pg_class c ON ((c.oid = s.starelid)))
LEFT JOIN pg_namespace n ON ((n.oid = c.relnamespace)));
diff --git a/src/test/regress/parallel_schedule b/src/test/regress/parallel_schedule
index 85d94f1..a885235 100644
--- a/src/test/regress/parallel_schedule
+++ b/src/test/regress/parallel_schedule
@@ -112,4 +112,4 @@ test: event_trigger
test: stats
# run tests of multivariate stats
-test: mv_dependencies mv_mcv
+test: mv_dependencies mv_mcv mv_histogram
diff --git a/src/test/regress/serial_schedule b/src/test/regress/serial_schedule
index 6584d73..2efdcd7 100644
--- a/src/test/regress/serial_schedule
+++ b/src/test/regress/serial_schedule
@@ -164,3 +164,4 @@ test: event_trigger
test: stats
test: mv_dependencies
test: mv_mcv
+test: mv_histogram
diff --git a/src/test/regress/sql/mv_histogram.sql b/src/test/regress/sql/mv_histogram.sql
new file mode 100644
index 0000000..27c2510
--- /dev/null
+++ b/src/test/regress/sql/mv_histogram.sql
@@ -0,0 +1,176 @@
+-- data type passed by value
+CREATE TABLE mv_histogram (
+ a INT,
+ b INT,
+ c INT
+);
+
+-- unknown column
+CREATE STATISTICS s7 ON mv_histogram (unknown_column) WITH (histogram);
+
+-- single column
+CREATE STATISTICS s7 ON mv_histogram (a) WITH (histogram);
+
+-- single column, duplicated
+CREATE STATISTICS s7 ON mv_histogram (a, a) WITH (histogram);
+
+-- two columns, one duplicated
+CREATE STATISTICS s7 ON mv_histogram (a, a, b) WITH (histogram);
+
+-- unknown option
+CREATE STATISTICS s7 ON mv_histogram (a, b, c) WITH (unknown_option);
+
+-- missing histogram statistics
+CREATE STATISTICS s7 ON mv_histogram (a, b, c) WITH (dependencies, max_buckets=200);
+
+-- invalid max_buckets value / too low
+CREATE STATISTICS s7 ON mv_histogram (a, b, c) WITH (mcv, max_buckets=10);
+
+-- invalid max_buckets value / too high
+CREATE STATISTICS s7 ON mv_histogram (a, b, c) WITH (mcv, max_buckets=100000);
+
+-- correct command
+CREATE STATISTICS s7 ON mv_histogram (a, b, c) WITH (histogram);
+
+-- random data (no functional dependencies)
+INSERT INTO mv_histogram
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+
+ANALYZE mv_histogram;
+
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+
+TRUNCATE mv_histogram;
+
+-- a => b, a => c, b => c
+INSERT INTO mv_histogram
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+
+ANALYZE mv_histogram;
+
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+
+TRUNCATE mv_histogram;
+
+-- a => b, a => c
+INSERT INTO mv_histogram
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE mv_histogram;
+
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+
+TRUNCATE mv_histogram;
+
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO mv_histogram
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX hist_idx ON mv_histogram (a, b);
+ANALYZE mv_histogram;
+
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mv_histogram WHERE a = 10 AND b = 5;
+
+DROP TABLE mv_histogram;
+
+-- varlena type (text)
+CREATE TABLE mv_histogram (
+ a TEXT,
+ b TEXT,
+ c TEXT
+);
+
+CREATE STATISTICS s8 ON mv_histogram (a, b, c) WITH (histogram);
+
+-- random data (no functional dependencies)
+INSERT INTO mv_histogram
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+
+ANALYZE mv_histogram;
+
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+
+TRUNCATE mv_histogram;
+
+-- a => b, a => c, b => c
+INSERT INTO mv_histogram
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+
+ANALYZE mv_histogram;
+
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+
+TRUNCATE mv_histogram;
+
+-- a => b, a => c
+INSERT INTO mv_histogram
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE mv_histogram;
+
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+
+TRUNCATE mv_histogram;
+
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO mv_histogram
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX hist_idx ON mv_histogram (a, b);
+ANALYZE mv_histogram;
+
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mv_histogram WHERE a = '10' AND b = '5';
+
+TRUNCATE mv_histogram;
+
+-- check explain (expect bitmap index scan, not plain index scan) with NULLs
+INSERT INTO mv_histogram
+ SELECT
+ (CASE WHEN i/10000 = 0 THEN NULL ELSE i/10000 END),
+ (CASE WHEN i/20000 = 0 THEN NULL ELSE i/20000 END),
+ (CASE WHEN i/40000 = 0 THEN NULL ELSE i/40000 END)
+ FROM generate_series(1,1000000) s(i);
+ANALYZE mv_histogram;
+
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mv_histogram WHERE a IS NULL AND b IS NULL;
+
+DROP TABLE mv_histogram;
+
+-- NULL values (mix of int and text columns)
+CREATE TABLE mv_histogram (
+ a INT,
+ b TEXT,
+ c INT,
+ d TEXT
+);
+
+CREATE STATISTICS s9 ON mv_histogram (a, b, c, d) WITH (histogram);
+
+INSERT INTO mv_histogram
+ SELECT
+ mod(i, 100),
+ (CASE WHEN mod(i, 200) = 0 THEN NULL ELSE mod(i,200) END),
+ mod(i, 400),
+ (CASE WHEN mod(i, 300) = 0 THEN NULL ELSE mod(i,600) END)
+ FROM generate_series(1,10000) s(i);
+
+ANALYZE mv_histogram;
+
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+
+DROP TABLE mv_histogram;
--
2.1.0
[Attachment: 0006-multi-statistics-estimation.patch]
From 04b77a1750694b49ee6f3db9400980b20ae307cd Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@pgaddict.com>
Date: Fri, 6 Feb 2015 01:42:38 +0100
Subject: [PATCH 6/9] multi-statistics estimation
The general idea is that a probability (which is what selectivity is)
can be split into a product of conditional probabilities like this:
P(A & B & C) = P(A & B) * P(C|A & B)
If we assume that C and B are conditionally independent given A, the
last term may be simplified like this
P(A & B & C) = P(A & B) * P(C|A)
so we only need probabilities on [A,B] and [A,C] to compute the original
probability.
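A small worked example: suppose each of A, B, C individually has
selectivity 0.1, but B and C hold exactly when A does. Then
P(A & B & C) = P(A & B) * P(C|A) = 0.1 * 1.0 = 0.1
while the independence assumption would estimate 0.1^3 = 0.001, i.e.
a 100x underestimate.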
The implementation works in the other direction, though. We know what
probability P(A & B & C) we need to compute, and also what statistics
are available.
So we search for a combination of statistics, covering the clauses in
an optimal way (most clauses covered, most dependencies exploited).
There are two possible approaches - exhaustive and greedy. The
exhaustive one walks through all permutations of stats (a backtracking
search), so it's guaranteed to find the optimal solution, but it
soon gets very slow as it's roughly O(N!). Dynamic programming might
improve that a bit, but it's still far too expensive for large numbers
of statistics (on a single table).
The greedy algorithm is very simple - at every step it picks the
statistics that looks best locally. That may not guarantee the best
solution globally (but maybe it does?), but it only needs N steps to
find a solution, so it's very fast (processing the selected stats is
usually way more expensive).
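To make the greedy choice concrete: with clauses on columns (a,b,c,d)
and statistics S1 on (a,b), S2 on (b,c) and S3 on (c,d), the search
might pick S1 first (covering the clauses on a,b), then S2 (estimating
the clause on c with b as a condition), then S3 (estimating d with c
as a condition). Each statistics is considered at most once, hence the
N steps. (This is only an illustration - the actual choice depends on
the gain metric.)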
There's a GUC for selecting the search algorithm
mvstat_search = {'greedy', 'exhaustive'}
The default value is 'greedy' as that's much safer (with respect to
runtime). See choose_mv_statistics().
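For example, to compare the two search strategies on a query (with "t"
standing in for any table that has multivariate statistics defined):

SET mvstat_search = 'exhaustive';
EXPLAIN SELECT * FROM t WHERE a = 1 AND b = 2 AND c = 3;

SET mvstat_search = 'greedy';
EXPLAIN SELECT * FROM t WHERE a = 1 AND b = 2 AND c = 3;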
Once we have found a sequence of statistics, we apply them to the
clauses using the conditional probabilities. We process the selected
stats one by one, and for each we select the estimated clauses and
conditions. See clauselist_selectivity() for more details.
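For example, with statistics S1 on (a,b), S2 on (b,c) and clauses
(a=1), (b=2), (c=3), the sequence [S1, S2] is processed like this:
S1 estimates P(a=1 & b=2), both clauses then become local conditions,
and S2 estimates P(c=3 | b=2) - the clause (a=1) is not covered by S2
and thus is not usable as a condition there. The final selectivity is
the product of the two estimates.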
Limitations
-----------
It's still true that each clause at a given level has to be covered by
a single MV statistics. So with this query
WHERE (clause1) AND (clause2) AND (clause3 OR clause4)
each parenthesized clause has to be covered by a single multivariate
statistics.
Clauses not covered by a single statistics at this level will be passed
to clause_selectivity() but this will treat them as a collection of
simpler clauses (connected by AND or OR), and the clauses from the
previous level will be used as conditions.
So using the same example, the last clause will be passed to
clause_selectivity() with 'clause1' and 'clause2' as conditions, and it
will be processed using multivariate stats if possible.
The other limitation is that all the expressions have to be
mv-compatible, i.e. there can't be a mix of compatible and incompatible
expressions. If this is violated, the clause may be passed to the next
level (just like a list of clauses not covered by a single statistics),
which splits it into clauses handled by multivariate stats and clauses
handled by regular statistics.
rework clauselist_selectivity_or to handle OR-clauses correctly
We might invent a completely new set of functions here, resembling
clauselist_selectivity but adapting the ideas to OR-clauses.
But luckily we know that each OR-clause
(a OR b OR c)
may be rewritten as an equivalent AND-clause using negation:
NOT ((NOT a) AND (NOT b) AND (NOT c))
And that's something we can pass to clauselist_selectivity.
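Note that the NOT node doesn't even need to be constructed - if s is
the selectivity of ((NOT a) AND (NOT b) AND (NOT c)) as computed by
clauselist_selectivity, the selectivity of the OR-clause is simply
(1.0 - s). With independent clauses this reduces to the familiar
1 - (1-s_a)*(1-s_b)*(1-s_c).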
---
contrib/file_fdw/file_fdw.c | 3 +-
contrib/postgres_fdw/postgres_fdw.c | 11 +-
src/backend/optimizer/path/clausesel.c | 1990 ++++++++++++++++++++++++++------
src/backend/optimizer/path/costsize.c | 23 +-
src/backend/optimizer/util/orclauses.c | 4 +-
src/backend/utils/adt/selfuncs.c | 17 +-
src/backend/utils/misc/guc.c | 20 +
src/backend/utils/mvstats/README.stats | 166 +++
src/include/optimizer/cost.h | 6 +-
src/include/utils/mvstats.h | 8 +
10 files changed, 1890 insertions(+), 358 deletions(-)
diff --git a/contrib/file_fdw/file_fdw.c b/contrib/file_fdw/file_fdw.c
index dc035d7..8f11b7a 100644
--- a/contrib/file_fdw/file_fdw.c
+++ b/contrib/file_fdw/file_fdw.c
@@ -969,7 +969,8 @@ estimate_size(PlannerInfo *root, RelOptInfo *baserel,
baserel->baserestrictinfo,
0,
JOIN_INNER,
- NULL);
+ NULL,
+ NIL);
nrows = clamp_row_est(nrows);
diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c
index 76d0e15..e78f140 100644
--- a/contrib/postgres_fdw/postgres_fdw.c
+++ b/contrib/postgres_fdw/postgres_fdw.c
@@ -498,7 +498,8 @@ postgresGetForeignRelSize(PlannerInfo *root,
fpinfo->local_conds,
baserel->relid,
JOIN_INNER,
- NULL);
+ NULL,
+ NIL);
cost_qual_eval(&fpinfo->local_conds_cost, fpinfo->local_conds, root);
@@ -2149,7 +2150,8 @@ estimate_path_cost_size(PlannerInfo *root,
local_param_join_conds,
foreignrel->relid,
JOIN_INNER,
- NULL);
+ NULL,
+ NIL);
local_sel *= fpinfo->local_conds_sel;
rows = clamp_row_est(rows * local_sel);
@@ -3618,7 +3620,8 @@ postgresGetForeignJoinPaths(PlannerInfo *root,
fpinfo->local_conds,
0,
JOIN_INNER,
- NULL);
+ NULL,
+ NIL);
cost_qual_eval(&fpinfo->local_conds_cost, fpinfo->local_conds, root);
/*
@@ -3637,7 +3640,7 @@ postgresGetForeignJoinPaths(PlannerInfo *root,
*/
fpinfo->joinclause_sel = clauselist_selectivity(root, fpinfo->joinclauses,
0, fpinfo->jointype,
- extra->sjinfo);
+ extra->sjinfo, NIL);
}
fpinfo->server = GetForeignServer(joinrel->serverid);
diff --git a/src/backend/optimizer/path/clausesel.c b/src/backend/optimizer/path/clausesel.c
index 0de2418..c1b8999 100644
--- a/src/backend/optimizer/path/clausesel.c
+++ b/src/backend/optimizer/path/clausesel.c
@@ -29,6 +29,8 @@
#include "utils/selfuncs.h"
#include "utils/typcache.h"
+#include "miscadmin.h"
+
/*
* Data structure for accumulating info about possible range-query
@@ -44,6 +46,13 @@ typedef struct RangeQueryClause
Selectivity hibound; /* Selectivity of a var < something clause */
} RangeQueryClause;
+static Selectivity clauselist_selectivity_or(PlannerInfo *root,
+ List *clauses,
+ int varRelid,
+ JoinType jointype,
+ SpecialJoinInfo *sjinfo,
+ List *conditions);
+
static void addRangeClause(RangeQueryClause **rqlist, Node *clause,
bool varonleft, bool isLTsel, Selectivity s2);
@@ -60,23 +69,25 @@ static int count_mv_attnums(List *clauses, Index relid, int type);
static int count_varnos(List *clauses, Index *relid);
+static List *clauses_matching_statistic(List **clauses, MVStatisticInfo *statistic,
+ Index relid, int types, bool remove);
+
static List *clauselist_apply_dependencies(PlannerInfo *root, List *clauses,
Index relid, List *stats);
-static MVStatisticInfo *choose_mv_statistics(List *mvstats, Bitmapset *attnums);
-
-static List *clauselist_mv_split(PlannerInfo *root, Index relid,
- List *clauses, List **mvclauses,
- MVStatisticInfo *mvstats, int types);
-
static Selectivity clauselist_mv_selectivity(PlannerInfo *root,
- List *clauses, MVStatisticInfo *mvstats);
+ MVStatisticInfo *mvstats, List *clauses,
+ List *conditions, bool is_or);
static Selectivity clauselist_mv_selectivity_mcvlist(PlannerInfo *root,
- List *clauses, MVStatisticInfo *mvstats,
- bool *fullmatch, Selectivity *lowsel);
+ MVStatisticInfo *mvstats,
+ List *clauses, List *conditions,
+ bool is_or, bool *fullmatch,
+ Selectivity *lowsel);
static Selectivity clauselist_mv_selectivity_histogram(PlannerInfo *root,
- List *clauses, MVStatisticInfo *mvstats);
+ MVStatisticInfo *mvstats,
+ List *clauses, List *conditions,
+ bool is_or);
static int update_match_bitmap_mcvlist(PlannerInfo *root, List *clauses,
int2vector *stakeys, MCVList mcvlist,
@@ -90,10 +101,33 @@ static int update_match_bitmap_histogram(PlannerInfo *root, List *clauses,
int nmatches, char * matches,
bool is_or);
+/*
+ * Describes a combination of multiple statistics to cover attributes
+ * referenced by the clauses. The array 'stats' (with nstats elements)
+ * lists attributes (in the order as they are applied), and number of
+ * clause attributes covered by this solution.
+ *
+ * choose_mv_statistics_exhaustive() uses this to track both the current
+ * and the best solutions, while walking through the state of possible
+ * combination.
+ */
+typedef struct mv_solution_t {
+ int nclauses; /* number of clauses covered */
+ int nconditions; /* number of conditions covered */
+ int nstats; /* number of stats applied */
+ int *stats; /* stats (in the apply order) */
+} mv_solution_t;
+
+static List *choose_mv_statistics(PlannerInfo *root, Index relid,
+ List *mvstats, List *clauses, List *conditions);
+
static bool has_stats(List *stats, int type);
static List * find_stats(PlannerInfo *root, Index relid);
+static bool stats_type_matches(MVStatisticInfo *stat, int type);
+
+int mvstat_search_type = MVSTAT_SEARCH_GREEDY;
/* used for merging bitmaps - AND (min), OR (max) */
#define MAX(x, y) (((x) > (y)) ? (x) : (y))
@@ -168,14 +202,15 @@ clauselist_selectivity(PlannerInfo *root,
List *clauses,
int varRelid,
JoinType jointype,
- SpecialJoinInfo *sjinfo)
+ SpecialJoinInfo *sjinfo,
+ List *conditions)
{
Selectivity s1 = 1.0;
RangeQueryClause *rqlist = NULL;
ListCell *l;
/* processing mv stats */
- Oid relid = InvalidOid;
+ Index relid = InvalidOid;
/* list of multivariate stats on the relation */
List *stats = NIL;
@@ -191,12 +226,13 @@ clauselist_selectivity(PlannerInfo *root,
stats = find_stats(root, relid);
/*
- * If there's exactly one clause, then no use in trying to match up pairs,
- * so just go directly to clause_selectivity().
+ * If there's exactly one clause, then no use in trying to match up
+ * pairs, or matching multivariate statistics, so just go directly
+ * to clause_selectivity().
*/
if (list_length(clauses) == 1)
return clause_selectivity(root, (Node *) linitial(clauses),
- varRelid, jointype, sjinfo);
+ varRelid, jointype, sjinfo, conditions);
/*
* Apply functional dependencies, but first check that there are some stats
@@ -228,31 +264,96 @@ clauselist_selectivity(PlannerInfo *root,
(count_mv_attnums(clauses, relid,
MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST) >= 2))
{
- /* collect attributes from the compatible conditions */
- Bitmapset *mvattnums = collect_mv_attnums(clauses, relid,
- MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST);
+ ListCell *s;
+
+ /*
+ * Copy the conditions we got from the upper part of the expression tree
+ * so that we can add local conditions to it (we need to keep the
+ * original list intact, for sibling expressions - other expressions
+ * at the same level).
+ */
+ List *conditions_local = list_copy(conditions);
- /* and search for the statistic covering the most attributes */
- MVStatisticInfo *mvstat = choose_mv_statistics(stats, mvattnums);
+ /* find the best combination of statistics */
+ List *solution = choose_mv_statistics(root, relid, stats,
+ clauses, conditions);
- if (mvstat != NULL) /* we have a matching stats */
+ /*
+ * We have a good solution, which is merely a list of statistics that
+ * we need to apply. We'll apply the statistics one by one (in the order
+ * they appear in the list), and for each statistic we'll
+ *
+ * (1) find clauses compatible with the statistic (and remove them
+ * from the list)
+ *
+ * (2) find local conditions compatible with the statistic
+ *
+ * (3) do the estimation P(clauses | conditions)
+ *
+ * (4) append the estimated clauses to local conditions
+ *
+ * so the set of local conditions grows continuously.
+ */
+ foreach (s, solution)
{
- /* clauses compatible with multi-variate stats */
- List *mvclauses = NIL;
+ MVStatisticInfo *mvstat = (MVStatisticInfo *)lfirst(s);
- /* split the clauselist into regular and mv-clauses */
- clauses = clauselist_mv_split(root, relid, clauses, &mvclauses,
- mvstat, MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST);
+ /* clauses compatible with the statistic we're applying right now */
+ List *stat_clauses = NIL;
+ List *stat_conditions = NIL;
- /* we've chosen the histogram to match the clauses */
- Assert(mvclauses != NIL);
+ /*
+ * Find clauses and conditions matching the statistic - the clauses
+ * need to be removed from the list, while conditions should remain
+ * there (so that we can apply them repeatedly).
+ */
+ stat_clauses
+ = clauses_matching_statistic(&clauses, mvstat, relid,
+ MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST,
+ true);
+
+ stat_conditions
+ = clauses_matching_statistic(&conditions_local, mvstat, relid,
+ MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST,
+ false);
+
+ /*
+ * If we got no clauses to estimate, we've done something wrong,
+ * either during the optimization, while detecting compatible clauses, or
+ * somewhere else.
+ *
+ * Also, we need at least two attributes in clauses and conditions.
+ */
+ Assert(stat_clauses != NIL);
+ Assert(count_mv_attnums(list_union(stat_clauses, stat_conditions),
+ relid, MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST) >= 2);
/* compute the multivariate stats */
- s1 *= clauselist_mv_selectivity(root, mvclauses, mvstat);
+ s1 *= clauselist_mv_selectivity(root, mvstat,
+ stat_clauses, stat_conditions,
+ false); /* AND */
+
+ /*
+ * Add the new clauses to the local conditions, so that we can use
+ * them for the subsequent statistics. We only add the clauses,
+ * because the conditions are already there (or should be).
+ */
+ conditions_local = list_concat(conditions_local, stat_clauses);
}
+
+ /* from now on, work only with the 'local' list of conditions */
+ conditions = conditions_local;
}
/*
+ * If there's exactly one clause, then no use in trying to match up
+ * pairs, so just go directly to clause_selectivity().
+ */
+ if (list_length(clauses) == 1)
+ return s1 * clause_selectivity(root, (Node *) linitial(clauses),
+ varRelid, jointype, sjinfo, conditions);
+
+ /*
* Initial scan over clauses. Anything that doesn't look like a potential
* rangequery clause gets multiplied into s1 and forgotten. Anything that
* does gets inserted into an rqlist entry.
@@ -264,7 +365,8 @@ clauselist_selectivity(PlannerInfo *root,
Selectivity s2;
/* Always compute the selectivity using clause_selectivity */
- s2 = clause_selectivity(root, clause, varRelid, jointype, sjinfo);
+ s2 = clause_selectivity(root, clause, varRelid, jointype, sjinfo,
+ conditions);
/*
* Check for being passed a RestrictInfo.
@@ -423,6 +525,55 @@ clauselist_selectivity(PlannerInfo *root,
}
/*
+ * Similar to clauselist_selectivity(), but for OR-clauses. We can't simply use
+ * the same multi-statistic estimation logic as for AND-clauses, at least not
+ * directly, because there are a few key differences:
+ *
+ * - functional dependencies don't really apply to OR-clauses
+ *
+ * - clauselist_selectivity() is based on decomposing the selectivity into
+ * a sequence of conditional probabilities (selectivities), but that can
+ * be done only for AND-clauses
+ *
+ * We might invent a similar infrastructure for optimizing OR-clauses, doing
+ * something similar to what clause_selectivity does for AND-clauses, but
+ * luckily we know that each disjunction (aka OR-clause)
+ *
+ * (a OR b OR c)
+ *
+ * may be rewritten as an equivalent conjunction (aka AND-clause)
+ * by using negation:
+ *
+ * NOT ((NOT a) AND (NOT b) AND (NOT c))
+ *
+ * And that's something we can pass to clauselist_selectivity and let it do
+ * all the heavy lifting.
+ */
+static Selectivity
+clauselist_selectivity_or(PlannerInfo *root,
+ List *clauses,
+ int varRelid,
+ JoinType jointype,
+ SpecialJoinInfo *sjinfo,
+ List *conditions)
+{
+ List *args = NIL;
+ ListCell *l;
+ Expr *expr;
+
+ /* build arguments for the AND-clause by negating args of the OR-clause */
+ foreach (l, clauses)
+ args = lappend(args, makeBoolExpr(NOT_EXPR, list_make1(lfirst(l)), -1));
+
+ /* and then build the equivalent AND-clause on the negated args */
+ expr = makeBoolExpr(AND_EXPR, args, -1);
+
+ /* instead of constructing NOT expression, just do (1.0 - s) */
+ return 1.0 - clauselist_selectivity(root, list_make1(expr), varRelid,
+ jointype, sjinfo, conditions);
+}
+
+/*
* addRangeClause --- add a new range clause for clauselist_selectivity
*
* Here is where we try to match up pairs of range-query clauses
@@ -629,7 +780,8 @@ clause_selectivity(PlannerInfo *root,
Node *clause,
int varRelid,
JoinType jointype,
- SpecialJoinInfo *sjinfo)
+ SpecialJoinInfo *sjinfo,
+ List *conditions)
{
Selectivity s1 = 0.5; /* default for any unhandled clause type */
RestrictInfo *rinfo = NULL;
@@ -749,7 +901,8 @@ clause_selectivity(PlannerInfo *root,
(Node *) get_notclausearg((Expr *) clause),
varRelid,
jointype,
- sjinfo);
+ sjinfo,
+ conditions);
}
else if (and_clause(clause))
{
@@ -758,29 +911,18 @@ clause_selectivity(PlannerInfo *root,
((BoolExpr *) clause)->args,
varRelid,
jointype,
- sjinfo);
+ sjinfo,
+ conditions);
}
else if (or_clause(clause))
{
- /*
- * Selectivities for an OR clause are computed as s1+s2 - s1*s2 to
- * account for the probable overlap of selected tuple sets.
- *
- * XXX is this too conservative?
- */
- ListCell *arg;
-
- s1 = 0.0;
- foreach(arg, ((BoolExpr *) clause)->args)
- {
- Selectivity s2 = clause_selectivity(root,
- (Node *) lfirst(arg),
- varRelid,
- jointype,
- sjinfo);
-
- s1 = s1 + s2 - s1 * s2;
- }
+ /* just call to clauselist_selectivity_or() */
+ s1 = clauselist_selectivity_or(root,
+ ((BoolExpr *) clause)->args,
+ varRelid,
+ jointype,
+ sjinfo,
+ conditions);
}
else if (is_opclause(clause) || IsA(clause, DistinctExpr))
{
@@ -870,7 +1012,8 @@ clause_selectivity(PlannerInfo *root,
(Node *) ((RelabelType *) clause)->arg,
varRelid,
jointype,
- sjinfo);
+ sjinfo,
+ conditions);
}
else if (IsA(clause, CoerceToDomain))
{
@@ -879,7 +1022,8 @@ clause_selectivity(PlannerInfo *root,
(Node *) ((CoerceToDomain *) clause)->arg,
varRelid,
jointype,
- sjinfo);
+ sjinfo,
+ conditions);
}
else
{
@@ -943,15 +1087,16 @@ clause_selectivity(PlannerInfo *root,
* in the MCV list, then the selectivity is below the lowest frequency
* found in the MCV list,
*
- * TODO When applying the clauses to the histogram/MCV list, we can do
- * that from the most selective clauses first, because that'll
- * eliminate the buckets/items sooner (so we'll be able to skip
- * them without inspection, which is more expensive). But this
- * requires really knowing the per-clause selectivities in advance,
- * and that's not what we do now.
+ * TODO When applying the clauses to the histogram/MCV list, we can do that from
+ * the most selective clauses first, because that'll eliminate the
+ * buckets/items sooner (so we'll be able to skip them without inspection,
+ * which is more expensive). But this requires really knowing the
+ * per-clause selectivities in advance, and that's not what we do now.
+ *
*/
static Selectivity
-clauselist_mv_selectivity(PlannerInfo *root, List *clauses, MVStatisticInfo *mvstats)
+clauselist_mv_selectivity(PlannerInfo *root, MVStatisticInfo *mvstats,
+ List *clauses, List *conditions, bool is_or)
{
bool fullmatch = false;
Selectivity s1 = 0.0, s2 = 0.0;
@@ -969,7 +1114,8 @@ clauselist_mv_selectivity(PlannerInfo *root, List *clauses, MVStatisticInfo *mvs
*/
/* Evaluate the MCV first. */
- s1 = clauselist_mv_selectivity_mcvlist(root, clauses, mvstats,
+ s1 = clauselist_mv_selectivity_mcvlist(root, mvstats,
+ clauses, conditions, is_or,
&fullmatch, &mcv_low);
/*
@@ -982,7 +1128,8 @@ clauselist_mv_selectivity(PlannerInfo *root, List *clauses, MVStatisticInfo *mvs
/* TODO if (fullmatch) without matching MCV item, use the mcv_low
* selectivity as upper bound */
- s2 = clauselist_mv_selectivity_histogram(root, clauses, mvstats);
+ s2 = clauselist_mv_selectivity_histogram(root, mvstats,
+ clauses, conditions, is_or);
/* TODO clamp to <= 1.0 (or more strictly, when possible) */
return s1 + s2;
@@ -1016,260 +1163,1325 @@ get_varattnos(Node * node, Index relid)
k + FirstLowInvalidHeapAttributeNumber);
}
- bms_free(varattnos);
+ bms_free(varattnos);
+
+ return result;
+}
+
+/*
+ * Collect attributes from mv-compatible clauses.
+ */
+static Bitmapset *
+collect_mv_attnums(List *clauses, Index relid, int types)
+{
+ Bitmapset *attnums = NULL;
+ ListCell *l;
+
+ /*
+ * Walk through the clauses and identify the ones we can estimate
+ * using multivariate stats, and remember the relid/columns. We'll
+ * then cross-check if we have suitable stats, and only if needed
+ * we'll split the clauses into multivariate and regular lists.
+ *
+ * For now we're only interested in RestrictInfo nodes with nested
+ * OpExpr, using either a range or equality.
+ */
+ foreach (l, clauses)
+ {
+ Node *clause = (Node *) lfirst(l);
+
+ /* ignore the result here - we only need the attnums */
+ clause_is_mv_compatible(clause, relid, &attnums, types);
+ }
+
+ /*
+ * If there are not at least two attributes referenced by the clause(s),
+ * we can throw everything out (as we'll revert to simple stats).
+ */
+ if (bms_num_members(attnums) <= 1)
+ {
+ bms_free(attnums);
+ attnums = NULL;
+ }
+
+ return attnums;
+}
+
+/*
+ * Count the number of attributes in clauses compatible with multivariate stats.
+ */
+static int
+count_mv_attnums(List *clauses, Index relid, int type)
+{
+ int c;
+ Bitmapset *attnums = collect_mv_attnums(clauses, relid, type);
+
+ c = bms_num_members(attnums);
+
+ bms_free(attnums);
+
+ return c;
+}
+
+/*
+ * Count varnos referenced in the clauses, and if there's a single varno then
+ * return the index in 'relid'.
+ */
+static int
+count_varnos(List *clauses, Index *relid)
+{
+ int cnt;
+ Bitmapset *varnos = NULL;
+
+ varnos = pull_varnos((Node *) clauses);
+ cnt = bms_num_members(varnos);
+
+ /* if there's a single varno in the clauses, remember it */
+ if (bms_num_members(varnos) == 1)
+ *relid = bms_singleton_member(varnos);
+
+ bms_free(varnos);
+
+ return cnt;
+}
+
+static List *
+clauses_matching_statistic(List **clauses, MVStatisticInfo *statistic,
+ Index relid, int types, bool remove)
+{
+ int i;
+ Bitmapset *stat_attnums = NULL;
+ List *matching_clauses = NIL;
+ ListCell *lc;
+
+ /* build attnum bitmapset for this statistics */
+ for (i = 0; i < statistic->stakeys->dim1; i++)
+ stat_attnums = bms_add_member(stat_attnums,
+ statistic->stakeys->values[i]);
+
+ /*
+ * We can't use foreach here, because we may need to remove some of the
+ * clauses if (remove=true).
+ */
+ lc = list_head(*clauses);
+ while (lc)
+ {
+ Node *clause = (Node*)lfirst(lc);
+ Bitmapset *attnums = NULL;
+
+ /* must advance lc before list_delete possibly pfree's it */
+ lc = lnext(lc);
+
+ /*
+ * skip clauses that are not compatible with stats (just leave them
+ * in the original list)
+ *
+ * XXX Perhaps this should check what stats are actually available in
+ * the statistics (not a big deal now, because MCV and histograms
+ * handle the same types of conditions).
+ */
+ if (! clause_is_mv_compatible(clause, relid, &attnums, types))
+ {
+ bms_free(attnums);
+ continue;
+ }
+
+ /* if the clause is covered by the statistic, add it to the list */
+ if (bms_is_subset(attnums, stat_attnums))
+ {
+ matching_clauses = lappend(matching_clauses, clause);
+
+ /* if remove=true, remove the matching item from the main list */
+ if (remove)
+ *clauses = list_delete_ptr(*clauses, clause);
+ }
+
+ bms_free(attnums);
+ }
+
+ bms_free(stat_attnums);
+
+ return matching_clauses;
+}
+
+/*
+ * Selects the best combination of multivariate statistics, in an exhaustive
+ * way, where 'best' means:
+ *
+ * (a) covering the most attributes (referenced by clauses)
+ * (b) using the least number of multivariate stats
+ * (c) using the most conditions to exploit dependency
+ *
+ * Don't call this directly but through choose_mv_statistics(), which does some
+ * additional tricks to minimize the runtime.
+ *
+ *
+ * Algorithm
+ * ---------
+ * The algorithm is a recursive implementation of backtracking, with maximum
+ * depth equal to the number of multi-variate statistics available on the table.
+ * It actually explores all valid combinations of stats.
+ *
+ * Whenever it considers adding the next statistics, the clauses it matches are
+ * divided into 'conditions' (clauses already matched by at least one previous
+ * statistics) and clauses that are estimated.
+ *
+ * Then several checks are performed:
+ *
+ * (a) The statistics covers at least 2 columns, referenced in the estimated
+ * clauses (otherwise multi-variate stats are useless).
+ *
+ * (b) The statistics covers at least 1 new column, i.e. a column not referenced
+ * by the already used stats (and the new column has to be referenced by
+ * the clauses, of course). Otherwise the statistics would not add any new
+ * information.
+ *
+ * There are some other sanity checks (e.g. stats must not be used twice etc.).
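+ *
+ * For example, with three statistics the backtracking explores (at most)
+ * the sequences [0], [0,1], [0,1,2], [0,2], [0,2,1], [1], ..., pruning
+ * branches that fail the checks above and remembering the best solution
+ * found so far.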
+ *
+ *
+ * Weaknesses
+ * ----------
+ * The current implementation uses a rather simple optimality criterion, so it
+ * may not make the best choice when
+ *
+ * (a) There may be multiple solutions with the same number of covered
+ * attributes and number of statistics (e.g. the same solution but with
+ * statistics in a different order). It's unclear which solution is the best
+ * one - in a sense all of them are equal.
+ *
+ * TODO It might be possible to compute estimate for each of those solutions,
+ * and then combine them to get the final estimate (e.g. by using average
+ * or median).
+ *
+ * (b) Does not consider that some types of stats are a better match for some
+ * types of clauses (e.g. MCV list is generally a better match for equality
+ * conditions than a histogram).
+ *
+ * But maybe this is pointless - generally, each column is either a label
+ * (it's not important whether because of the data type or how it's used),
+ * or a value with ordering that makes sense. So either a MCV list is more
+ * appropriate (labels) or a histogram (values with orderings).
+ *
+ * Not sure what to do with statistics on columns mixing both types of data
+ * (some columns would work best with MCVs, some with histograms). Maybe we
+ * could invent a new type of statistics combining MCV list and histogram
+ * (keeping a small histogram for each MCV item, and a separate histogram
+ * for values not on the MCV list).
+ *
+ * TODO The algorithm should probably count number of Vars (not just attnums)
+ * when computing the 'score' of each solution. Computing the ratio of
+ * (num of all vars) / (num of condition vars) as a measure of how well
+ * the solution uses conditions might be useful.
+ */
+static void
+choose_mv_statistics_exhaustive(PlannerInfo *root, int step,
+ int nmvstats, MVStatisticInfo *mvstats, Bitmapset ** stats_attnums,
+ int nclauses, Node ** clauses, Bitmapset ** clauses_attnums,
+ int nconditions, Node ** conditions, Bitmapset ** conditions_attnums,
+ bool *cover_map, bool *condition_map, int *ruled_out,
+ mv_solution_t *current, mv_solution_t **best)
+{
+ int i, j;
+
+ Assert(best != NULL);
+ Assert((step == 0 && current == NULL) || (step > 0 && current != NULL));
+
+ /* this may run for a long time, so let's make it interruptible */
+ CHECK_FOR_INTERRUPTS();
+
+ if (current == NULL)
+ {
+ current = (mv_solution_t*)palloc0(sizeof(mv_solution_t));
+ current->stats = (int*)palloc0(sizeof(int)*nmvstats);
+ current->nstats = 0;
+ current->nclauses = 0;
+ current->nconditions = 0;
+ }
+
+ /*
+ * Now try to apply each statistics, matching at least two attributes,
+ * unless it's already used in one of the previous steps.
+ */
+ for (i = 0; i < nmvstats; i++)
+ {
+ int c;
+
+ int ncovered_clauses = 0; /* number of covered clauses */
+ int ncovered_conditions = 0; /* number of covered conditions */
+ int nattnums = 0; /* number of covered attributes */
+
+ Bitmapset *all_attnums = NULL;
+ Bitmapset *new_attnums = NULL;
+
+ /* skip statistics that were already used or eliminated */
+ if (ruled_out[i] != -1)
+ continue;
+
+ /*
+ * See if we have clauses covered by this statistics, but not
+ * yet covered by any of the preceding ones.
+ */
+ for (c = 0; c < nclauses; c++)
+ {
+ bool covered = false;
+ Bitmapset *clause_attnums = clauses_attnums[c];
+ Bitmapset *tmp = NULL;
+
+ /*
+ * If this clause is not covered by this stats, we can't
+ * use the stats to estimate that at all.
+ */
+ if (! cover_map[i * nclauses + c])
+ continue;
+
+ /*
+ * Now we know we'll use this clause - either as a condition
+ * or as a new clause (the estimated one). So let's add the
+ * attributes to the attnums from all the clauses usable with
+ * this statistics.
+ */
+ tmp = bms_union(all_attnums, clause_attnums);
+
+ /* free the old bitmap */
+ bms_free(all_attnums);
+ all_attnums = tmp;
+
+ /* let's see if it's covered by any of the previous stats */
+ for (j = 0; j < step; j++)
+ {
+ /* already covered by the previous stats */
+ if (cover_map[current->stats[j] * nclauses + c])
+ covered = true;
+
+ if (covered)
+ break;
+ }
+
+ /* if already covered, continue with the next clause */
+ if (covered)
+ {
+ ncovered_conditions += 1;
+ continue;
+ }
+
+ /*
+ * OK, this clause is covered by this statistics (and not by
+ * any of the previous ones)
+ */
+ ncovered_clauses += 1;
+
+ /* add the attnums into attnums from 'new clauses' */
+ // new_attnums = bms_union(new_attnums, clause_attnums);
+ }
+
+ /* can't have more new clauses than original clauses */
+ Assert(nclauses >= ncovered_clauses);
+ Assert(ncovered_clauses >= 0); /* mostly paranoia */
+
+ nattnums = bms_num_members(all_attnums);
+
+ /* free all the bitmapsets - we don't need them anymore */
+ bms_free(all_attnums);
+ bms_free(new_attnums);
+
+ all_attnums = NULL;
+ new_attnums = NULL;
+
+ /*
+ * See which of the conditions are covered by this statistics (these
+ * don't need estimating, but may be exploited as dependencies).
+ */
+ for (c = 0; c < nconditions; c++)
+ {
+ Bitmapset *clause_attnums = conditions_attnums[c];
+ Bitmapset *tmp = NULL;
+
+ /*
+ * If this clause is not covered by this stats, we can't
+ * use the stats to estimate that at all.
+ */
+ if (! condition_map[i * nconditions + c])
+ continue;
+
+ /* count this as a condition */
+ ncovered_conditions += 1;
+
+ /*
+ * Now we know we'll use this clause - either as a condition
+ * or as a new clause (the estimated one). So let's add the
+ * attributes to the attnums from all the clauses usable with
+ * this statistics.
+ */
+ tmp = bms_union(all_attnums, clause_attnums);
+
+ /* free the old bitmap */
+ bms_free(all_attnums);
+ all_attnums = tmp;
+ }
+
+ /*
+ * Let's mark the statistics as 'ruled out' - either we'll use
+ * it (and proceed to the next step), or it's incompatible.
+ */
+ ruled_out[i] = step;
+
+ /*
+ * There are no clauses usable with this statistics (i.e. none that
+ * aren't already covered by some of the previous stats).
+ *
+ * Similarly, if the usable clauses reference only a single
+ * attribute, the statistics is useless here.
+ */
+ if ((ncovered_clauses == 0) || (nattnums < 2))
+ continue;
+
+ /*
+ * TODO Not sure if it's possible to add a clause referencing
+ * only attributes already covered by previous stats?
+ * Introducing only some new dependency, not a new
+ * attribute. Couldn't come up with an example, though.
+ * Might be worth adding some assert.
+ */
+
+ /*
+ * got a suitable statistics - let's update the current solution,
+ * maybe use it as the best solution
+ */
+ current->nclauses += ncovered_clauses;
+ current->nconditions += ncovered_conditions;
+ current->nstats += 1;
+ current->stats[step] = i;
+
+ /*
+ * We can never cover more clauses, or use more stats than we
+ * actually have at the beginning.
+ */
+ Assert(nclauses >= current->nclauses);
+ Assert(nmvstats >= current->nstats);
+ Assert(step < nmvstats);
+
+ if (*best == NULL)
+ {
+ *best = (mv_solution_t*)palloc0(sizeof(mv_solution_t));
+ (*best)->stats = (int*)palloc0(sizeof(int)*nmvstats);
+ (*best)->nstats = 0;
+ (*best)->nclauses = 0;
+ (*best)->nconditions = 0;
+ }
+
+ /* see if it's better than the current 'best' solution */
+ if ((current->nclauses > (*best)->nclauses) ||
+ ((current->nclauses == (*best)->nclauses) &&
+ ((current->nstats > (*best)->nstats))))
+ {
+ (*best)->nstats = current->nstats;
+ (*best)->nclauses = current->nclauses;
+ (*best)->nconditions = current->nconditions;
+ memcpy((*best)->stats, current->stats, nmvstats * sizeof(int));
+ }
+
+ /*
+ * Recurse into the next step, but only if there are more statistics
+ * left to add (otherwise the solution can't be extended).
+ */
+ if ((step + 1) < nmvstats)
+ choose_mv_statistics_exhaustive(root, step+1,
+ nmvstats, mvstats, stats_attnums,
+ nclauses, clauses, clauses_attnums,
+ nconditions, conditions, conditions_attnums,
+ cover_map, condition_map, ruled_out,
+ current, best);
+
+ /* reset the last step */
+ current->nclauses -= ncovered_clauses;
+ current->nconditions -= ncovered_conditions;
+ current->nstats -= 1;
+ current->stats[step] = 0;
+
+ /* mark the statistics as usable again */
+ ruled_out[i] = -1;
+
+ Assert(current->nclauses >= 0);
+ Assert(current->nstats >= 0);
+ }
+
+ /* make the statistics ruled out in this step usable again */
+ for (i = 0; i < nmvstats; i++)
+ if (ruled_out[i] == step)
+ ruled_out[i] = -1;
+
+}
+
+/*
+ * Greedy search for a multivariate solution - a sequence of statistics covering
+ * the clauses. This chooses the "best" statistics at each step, so the
+ * resulting solution may not be the best solution globally, but this produces
+ * the solution in only N steps (where N is the number of statistics), while
+ * the exhaustive approach may have to walk through ~N! combinations (although
+ * some of those are terminated early).
+ *
+ * See the comments at choose_mv_statistics_exhaustive() as this does the same
+ * thing (but in a different way).
+ *
+ * Don't call this directly, but through choose_mv_statistics().
+ *
+ * TODO There are probably other metrics we might use - e.g. using number of
+ * columns (num_cond_columns / num_cov_columns), which might work better
+ * with a mix of simple and complex clauses.
+ *
+ * TODO Also the choice at the very first step should be handled in a special
+ * way, because there will be 0 conditions at that moment, so there needs
+ * to be some other criteria - e.g. using the simplest (or most complex?)
+ * clause might be a good idea.
+ *
+ * TODO We might also select multiple stats using different criteria, and branch
+ * the search. This is however tricky, because if we choose k statistics at
+ * each step, we get k^N branches to walk through (with N steps). That's
+ * not really good with a large number of stats (though still better
+ * than an exhaustive search).
+ */
+static void
+choose_mv_statistics_greedy(PlannerInfo *root, int step,
+ int nmvstats, MVStatisticInfo *mvstats, Bitmapset ** stats_attnums,
+ int nclauses, Node ** clauses, Bitmapset ** clauses_attnums,
+ int nconditions, Node ** conditions, Bitmapset ** conditions_attnums,
+ bool *cover_map, bool *condition_map, int *ruled_out,
+ mv_solution_t *current, mv_solution_t **best)
+{
+ int i, j;
+ int best_stat = -1;
+ double gain, max_gain = -1.0;
+
+ /*
+ * Bitmap tracking which clauses are already covered (by the previous
+ * statistics) and may thus serve only as a condition in this step.
+ */
+ bool *covered_clauses = (bool*)palloc0(nclauses);
+
+ /*
+ * Number of clauses and columns covered by each statistics - this
+ * includes both conditions and clauses covered by the statistics for
+ * the first time. The number of columns may count some columns
+ * repeatedly - if a column is shared by multiple clauses, it will
+ * be counted once for each clause (covered by the statistics).
+ * So with two clauses [(a=1 OR b=2),(a<2 OR c>1)] the column "a"
+ * will be counted twice (if both clauses are covered).
+ *
+ * The values for ruled-out statistics (those that can't be applied)
+ * not computed, because that'd be pointless.
+ */
+ int *num_cov_clauses = (int*)palloc0(sizeof(int) * nmvstats);
+ int *num_cov_columns = (int*)palloc0(sizeof(int) * nmvstats);
+
+ /*
+ * Same as above, but this only includes clauses that are already
+ * covered by the previous stats (and the current one).
+ */
+ int *num_cond_clauses = (int*)palloc0(sizeof(int) * nmvstats);
+ int *num_cond_columns = (int*)palloc0(sizeof(int) * nmvstats);
+
+ /*
+ * Number of attributes for each clause.
+ *
+ * TODO Might be computed in choose_mv_statistics() and then passed
+ * here, but then the function would not have the same signature
+ * as _exhaustive().
+ */
+ int *attnum_counts = (int*)palloc0(sizeof(int) * nclauses);
+ int *attnum_cond_counts = (int*)palloc0(sizeof(int) * nconditions);
+
+ CHECK_FOR_INTERRUPTS();
+
+ Assert(best != NULL);
+ Assert((step == 0 && current == NULL) || (step > 0 && current != NULL));
+
+ /* compute attributes (columns) for each clause */
+ for (i = 0; i < nclauses; i++)
+ attnum_counts[i] = bms_num_members(clauses_attnums[i]);
+
+ /* compute attributes (columns) for each condition */
+ for (i = 0; i < nconditions; i++)
+ attnum_cond_counts[i] = bms_num_members(conditions_attnums[i]);
+
+ /* see which clauses are already covered at this point (by previous stats) */
+ for (i = 0; i < step; i++)
+ for (j = 0; j < nclauses; j++)
+ covered_clauses[j] |= (cover_map[current->stats[i] * nclauses + j]);
+
+ /* which remaining statistics covers most clauses / uses most conditions? */
+ for (i = 0; i < nmvstats; i++)
+ {
+ Bitmapset *attnums_covered = NULL;
+ Bitmapset *attnums_conditions = NULL;
+
+ /* skip stats that are already ruled out (either used or inapplicable) */
+ if (ruled_out[i] != -1)
+ continue;
+
+ /* count covered clauses and conditions (for the statistics) */
+ for (j = 0; j < nclauses; j++)
+ {
+ if (cover_map[i * nclauses + j])
+ {
+ Bitmapset *attnums_new
+ = bms_union(attnums_covered, clauses_attnums[j]);
+
+ /* get rid of the old bitmap and keep the unified result */
+ bms_free(attnums_covered);
+ attnums_covered = attnums_new;
+
+ num_cov_clauses[i] += 1;
+ num_cov_columns[i] += attnum_counts[j];
+
+ /* is the clause already covered (i.e. a condition)? */
+ if (covered_clauses[j])
+ {
+ num_cond_clauses[i] += 1;
+ num_cond_columns[i] += attnum_counts[j];
+ attnums_new = bms_union(attnums_conditions,
+ clauses_attnums[j]);
+
+ bms_free(attnums_conditions);
+ attnums_conditions = attnums_new;
+ }
+ }
+ }
+
+ /* if all covered clauses are covered by prev stats (thus conditions) */
+ if (num_cov_clauses[i] == num_cond_clauses[i])
+ ruled_out[i] = step;
+
+ /* same if there are no new attributes */
+ else if (bms_num_members(attnums_conditions) == bms_num_members(attnums_covered))
+ ruled_out[i] = step;
+
+ bms_free(attnums_covered);
+ bms_free(attnums_conditions);
+
+ /* if the statistics is inapplicable, try the next one */
+ if (ruled_out[i] != -1)
+ continue;
+
+ /* now let's walk through conditions and count the covered */
+ for (j = 0; j < nconditions; j++)
+ {
+ if (condition_map[i * nconditions + j])
+ {
+ num_cond_clauses[i] += 1;
+ num_cond_columns[i] += attnum_cond_counts[j];
+ }
+ }
+
+ /* otherwise see if this improves the interesting metrics */
+ gain = num_cond_columns[i] / (double)num_cov_columns[i];
+
+ if (gain > max_gain)
+ {
+ max_gain = gain;
+ best_stat = i;
+ }
+ }
+
+ /*
+ * Have we found a suitable statistics? Add it to the solution and
+ * try next step.
+ */
+ if (best_stat != -1)
+ {
+ /* mark the statistics, so that we skip it in next steps */
+ ruled_out[best_stat] = step;
+
+ /* allocate current solution if necessary */
+ if (current == NULL)
+ {
+ current = (mv_solution_t*)palloc0(sizeof(mv_solution_t));
+ current->stats = (int*)palloc0(sizeof(int)*nmvstats);
+ current->nstats = 0;
+ current->nclauses = 0;
+ current->nconditions = 0;
+ }
+
+ current->nclauses += num_cov_clauses[best_stat];
+ current->nconditions += num_cond_clauses[best_stat];
+ current->stats[step] = best_stat;
+ current->nstats++;
+
+ if (*best == NULL)
+ {
+ (*best) = (mv_solution_t*)palloc0(sizeof(mv_solution_t));
+ (*best)->nstats = current->nstats;
+ (*best)->nclauses = current->nclauses;
+ (*best)->nconditions = current->nconditions;
+
+ (*best)->stats = (int*)palloc0(sizeof(int)*nmvstats);
+ memcpy((*best)->stats, current->stats, nmvstats * sizeof(int));
+ }
+ else
+ {
+ /* see if this is a better solution */
+ double current_gain = (double)current->nconditions / current->nclauses;
+ double best_gain = (double)(*best)->nconditions / (*best)->nclauses;
+
+ if ((current_gain > best_gain) ||
+ ((current_gain == best_gain) && (current->nstats < (*best)->nstats)))
+ {
+ (*best)->nstats = current->nstats;
+ (*best)->nclauses = current->nclauses;
+ (*best)->nconditions = current->nconditions;
+ memcpy((*best)->stats, current->stats, nmvstats * sizeof(int));
+ }
+ }
+
+ /*
+ * The recursion only makes sense if there are still statistics
+ * left to add (otherwise extending the solution is not possible).
+ */
+ if ((step + 1) < nmvstats)
+ choose_mv_statistics_greedy(root, step+1,
+ nmvstats, mvstats, stats_attnums,
+ nclauses, clauses, clauses_attnums,
+ nconditions, conditions, conditions_attnums,
+ cover_map, condition_map, ruled_out,
+ current, best);
+
+ /* reset the last step */
+ current->nclauses -= num_cov_clauses[best_stat];
+ current->nconditions -= num_cond_clauses[best_stat];
+ current->nstats -= 1;
+ current->stats[step] = 0;
+
+ /* mark the statistics as usable again */
+ ruled_out[best_stat] = -1;
+ }
+
+ /* reset all statistics eliminated in this step */
+ for (i = 0; i < nmvstats; i++)
+ if (ruled_out[i] == step)
+ ruled_out[i] = -1;
+
+ /* free everything allocated in this step */
+ pfree(covered_clauses);
+ pfree(attnum_counts);
+ pfree(attnum_cond_counts);
+ pfree(num_cov_clauses);
+ pfree(num_cov_columns);
+ pfree(num_cond_clauses);
+ pfree(num_cond_columns);
+}
+
+/*
+ * Remove clauses not covered by any of the available statistics
+ *
+ * This helps us to reduce the amount of work done in choose_mv_statistics()
+ * by not having to deal with clauses that can't possibly be useful.
+ */
+static List *
+filter_clauses(PlannerInfo *root, Index relid, int type,
+ List *stats, List *clauses, Bitmapset **attnums)
+{
+ ListCell *c;
+ ListCell *s;
+
+ /* results (list of compatible clauses, attnums) */
+ List *rclauses = NIL;
+
+ foreach (c, clauses)
+ {
+ Node *clause = (Node*)lfirst(c);
+ Bitmapset *clause_attnums = NULL;
+
+ /*
+ * Thanks to the previous checks, we should not run into
+ * clauses that are incompatible with multivariate stats here. We also
+ * need to collect the attnums for the clause.
+ *
+ * XXX Maybe turn this into an assert?
+ */
+ if (! clause_is_mv_compatible(clause, relid, &clause_attnums, type))
+ elog(ERROR, "should not get non-mv-compatible cluase");
+
+ /* Is there a multivariate statistics covering the clause? */
+ foreach (s, stats)
+ {
+ int k, matches = 0;
+ MVStatisticInfo *stat = (MVStatisticInfo *)lfirst(s);
+
+ /* skip statistics not matching the required type */
+ if (! stats_type_matches(stat, type))
+ continue;
+
+ /*
+ * see if all clause attributes are covered by the statistic
+ *
+ * We'll do that in the opposite direction, i.e. we'll see how many
+ * attributes of the statistic are referenced in the clause, and then
+ * compare the counts.
+ */
+ for (k = 0; k < stat->stakeys->dim1; k++)
+ if (bms_is_member(stat->stakeys->values[k], clause_attnums))
+ matches += 1;
+
+ /*
+ * If the number of matches is equal to attributes referenced by the
+ * clause, then the clause is covered by the statistic.
+ */
+ if (bms_num_members(clause_attnums) == matches)
+ {
+ *attnums = bms_union(*attnums, clause_attnums);
+ rclauses = lappend(rclauses, clause);
+ break;
+ }
+ }
+
+ bms_free(clause_attnums);
+ }
+
+ /* we can't have more compatible conditions than source conditions */
+ Assert(list_length(clauses) >= list_length(rclauses));
+
+ return rclauses;
+}
+
+/*
+ * Remove statistics not covering any new clauses
+ *
+ * Statistics not covering any new clauses (conditions don't count) are not
+ * really useful, so let's ignore them. Also, we need the statistics to
+ * reference at least two different attributes (both in conditions and clauses
+ * combined), and at least one of them in the clauses alone.
+ *
+ * This check might be made more strict by checking against individual clauses,
+ * because by using the bitmapsets of all attnums we may actually use attnums
+ * from clauses that are not covered by the statistics. For example, we may
+ * have a condition
+ *
+ * (a=1 AND b=2)
+ *
+ * and a new clause
+ *
+ * (c=1 AND d=1)
+ *
+ * With only bitmapsets, statistics on [b,c] will pass through this (assuming
+ * there are some statistics covering both clauses).
+ *
+ * Parameters:
+ *
+ * stats - list of statistics to filter
+ * new_attnums - attnums referenced in new clauses
+ * all_attnums - attnums referenced by conditions and new clauses combined
+ *
+ * Returns filtered list of statistics.
+ *
+ * TODO Do the more strict check, i.e. walk through individual clauses and
+ * conditions and only use those covered by the statistics.
+ */
+static List *
+filter_stats(List *stats, Bitmapset *new_attnums, Bitmapset *all_attnums)
+{
+ ListCell *s;
+ List *stats_filtered = NIL;
+
+ foreach (s, stats)
+ {
+ int k;
+ int matches_new = 0,
+ matches_all = 0;
+
+ MVStatisticInfo *stat = (MVStatisticInfo *)lfirst(s);
+
+ /* see how many attributes the statistics covers */
+ for (k = 0; k < stat->stakeys->dim1; k++)
+ {
+ /* attributes from new clauses */
+ if (bms_is_member(stat->stakeys->values[k], new_attnums))
+ matches_new += 1;
+
+ /* attributes from conditions and new clauses combined */
+ if (bms_is_member(stat->stakeys->values[k], all_attnums))
+ matches_all += 1;
+ }
+
+ /* check we have enough attributes for this statistics */
+ if ((matches_new >= 1) && (matches_all >= 2))
+ stats_filtered = lappend(stats_filtered, stat);
+ }
+
+ /* we can't have more useful stats than we had originally */
+ Assert(list_length(stats) >= list_length(stats_filtered));
+
+ return stats_filtered;
+}
+
+static MVStatisticInfo *
+make_stats_array(List *stats, int *nmvstats)
+{
+ int i;
+ ListCell *l;
+
+ MVStatisticInfo *mvstats = NULL;
+ *nmvstats = list_length(stats);
+
+ mvstats
+ = (MVStatisticInfo*)palloc0((*nmvstats) * sizeof(MVStatisticInfo));
+
+ i = 0;
+ foreach (l, stats)
+ {
+ MVStatisticInfo *stat = (MVStatisticInfo *)lfirst(l);
+ memcpy(&mvstats[i++], stat, sizeof(MVStatisticInfo));
+ }
+
+ return mvstats;
+}
+
+static Bitmapset **
+make_stats_attnums(MVStatisticInfo *mvstats, int nmvstats)
+{
+ int i, j;
+ Bitmapset **stats_attnums = NULL;
+
+ Assert(nmvstats > 0);
- return result;
+ /* build bitmaps of attnums for the stats (easier to compare) */
+ stats_attnums = (Bitmapset **)palloc0(nmvstats * sizeof(Bitmapset*));
+
+ for (i = 0; i < nmvstats; i++)
+ for (j = 0; j < mvstats[i].stakeys->dim1; j++)
+ stats_attnums[i]
+ = bms_add_member(stats_attnums[i],
+ mvstats[i].stakeys->values[j]);
+
+ return stats_attnums;
}
+
/*
- * Collect attributes from mv-compatible clauses.
+ * Remove redundant statistics
+ *
+ * If there are multiple statistics covering the same set of columns (counting
+ * only those referenced by clauses and conditions), we can apply one of those
+ * anyway and further reduce the size of the optimization problem.
+ *
+ * Thus when redundant stats are detected, we keep the smaller one (the one with
+ * fewer columns), based on the assumption that it's more accurate and also
+ * faster to process. That may be untrue for two reasons - first, the accuracy
+ * really depends on number of buckets/MCV items, not the number of columns.
+ * Second, some types of statistics may work better for certain types of clauses
+ * (e.g. MCV lists for equality conditions) etc.
*/
-static Bitmapset *
-collect_mv_attnums(List *clauses, Index relid, int types)
+static List*
+filter_redundant_stats(List *stats, List *clauses, List *conditions)
{
- Bitmapset *attnums = NULL;
- ListCell *l;
+ int i, j, nmvstats;
+
+ MVStatisticInfo *mvstats;
+ bool *redundant;
+ Bitmapset **stats_attnums;
+ Bitmapset *varattnos;
+ Index relid;
+
+ Assert(list_length(stats) > 0);
+ Assert(list_length(clauses) > 0);
/*
- * Walk through the clauses and identify the ones we can estimate using
- * multivariate stats, and remember the relid/columns. We'll then
- * cross-check if we have suitable stats, and only if needed we'll split
- * the clauses into multivariate and regular lists.
+ * We'll convert the list of statistics into an array now, because
+ * the reduction of redundant statistics is easier to do that way
+ * (we can mark previous stats as redundant, etc.).
+ */
+ mvstats = make_stats_array(stats, &nmvstats);
+ stats_attnums = make_stats_attnums(mvstats, nmvstats);
+
+ /* by default, none of the stats is redundant (so palloc0) */
+ redundant = palloc0(nmvstats * sizeof(bool));
+
+ /*
+ * We only expect a single relid here, and also we should get the
+ * same relid from clauses and conditions (but we get it from
+ * clauses, because those are certainly non-empty).
+ */
+ relid = bms_singleton_member(pull_varnos((Node*)clauses));
+
+ /*
+ * Get the varattnos from both conditions and clauses.
+ *
+ * This skips system attributes, although that should be impossible
+ * thanks to previous filtering out of incompatible clauses.
*
- * For now we're only interested in RestrictInfo nodes with nested OpExpr,
- * using either a range or equality.
+ * XXX Is that really true?
*/
- foreach (l, clauses)
+ varattnos = bms_union(get_varattnos((Node*)clauses, relid),
+ get_varattnos((Node*)conditions, relid));
+
+ for (i = 1; i < nmvstats; i++)
{
- Node *clause = (Node *) lfirst(l);
+ /* intersect with current statistics */
+ Bitmapset *curr = bms_intersect(stats_attnums[i], varattnos);
- /* ignore the result here - we only need the attnums */
- clause_is_mv_compatible(clause, relid, &attnums, types);
+ /* walk through 'previous' stats and check redundancy */
+ for (j = 0; j < i; j++)
+ {
+ /* intersect with current statistics */
+ Bitmapset *prev;
+
+ /* skip stats already identified as redundant */
+ if (redundant[j])
+ continue;
+
+ prev = bms_intersect(stats_attnums[j], varattnos);
+
+ switch (bms_subset_compare(curr, prev))
+ {
+ case BMS_EQUAL:
+ /*
+ * Use the smaller one (hopefully more accurate).
+ * If both have the same size, use the first one.
+ */
+ if (mvstats[i].stakeys->dim1 >= mvstats[j].stakeys->dim1)
+ redundant[i] = TRUE;
+ else
+ redundant[j] = TRUE;
+
+ break;
+
+ case BMS_SUBSET1: /* curr is subset of prev */
+ redundant[i] = TRUE;
+ break;
+
+ case BMS_SUBSET2: /* prev is subset of curr */
+ redundant[j] = TRUE;
+ break;
+
+ case BMS_DIFFERENT:
+ /* do nothing - keep both stats */
+ break;
+ }
+
+ bms_free(prev);
+ }
+
+ bms_free(curr);
}
- /*
- * If there are not at least two attributes referenced by the clause(s),
- * we can throw everything out (as we'll revert to simple stats).
- */
- if (bms_num_members(attnums) <= 1)
+ /* can't reduce all statistics (at least one has to remain) */
+ Assert(nmvstats > 0);
+
+ /* now, let's remove the reduced statistics from the arrays */
+ list_free(stats);
+ stats = NIL;
+
+ for (i = 0; i < nmvstats; i++)
{
- if (attnums != NULL)
- pfree(attnums);
- attnums = NULL;
+ MVStatisticInfo *info;
+
+ pfree(stats_attnums[i]);
+
+ if (redundant[i])
+ continue;
+
+ info = makeNode(MVStatisticInfo);
+ memcpy(info, &mvstats[i], sizeof(MVStatisticInfo));
+
+ stats = lappend(stats, info);
}
- return attnums;
+ pfree(mvstats);
+ pfree(stats_attnums);
+ pfree(redundant);
+
+ return stats;
}
-/*
- * Count the number of attributes in clauses compatible with multivariate stats.
- */
-static int
-count_mv_attnums(List *clauses, Index relid, int type)
+static Node**
+make_clauses_array(List *clauses, int *nclauses)
{
- int c;
- Bitmapset *attnums = collect_mv_attnums(clauses, relid, type);
+ int i;
+ ListCell *l;
- c = bms_num_members(attnums);
+ Node** clauses_array;
- bms_free(attnums);
+ *nclauses = list_length(clauses);
+ clauses_array = (Node **)palloc0((*nclauses) * sizeof(Node *));
- return c;
+ i = 0;
+ foreach (l, clauses)
+ clauses_array[i++] = (Node *)lfirst(l);
+
+ *nclauses = i;
+
+ return clauses_array;
}
-/*
- * Count varnos referenced in the clauses, and if there's a single varno then
- * return the index in 'relid'.
- */
-static int
-count_varnos(List *clauses, Index *relid)
+static Bitmapset **
+make_clauses_attnums(PlannerInfo *root, Index relid,
+ int type, Node **clauses, int nclauses)
{
- int cnt;
- Bitmapset *varnos = NULL;
+ int i;
+ Bitmapset **clauses_attnums
+ = (Bitmapset **)palloc0(nclauses * sizeof(Bitmapset *));
- varnos = pull_varnos((Node *) clauses);
- cnt = bms_num_members(varnos);
+ for (i = 0; i < nclauses; i++)
+ {
+ Bitmapset * attnums = NULL;
- /* if there's a single varno in the clauses, remember it */
- if (bms_num_members(varnos) == 1)
- *relid = bms_singleton_member(varnos);
+ if (! clause_is_mv_compatible(clauses[i], relid, &attnums, type))
+ elog(ERROR, "should not get non-mv-compatible clause");
- bms_free(varnos);
+ clauses_attnums[i] = attnums;
+ }
- return cnt;
+ return clauses_attnums;
}
-
+
+static bool*
+make_cover_map(Bitmapset **stats_attnums, int nmvstats,
+ Bitmapset **clauses_attnums, int nclauses)
+{
+ int i, j;
+ bool *cover_map = (bool*)palloc0(nclauses * nmvstats);
+
+ for (i = 0; i < nmvstats; i++)
+ for (j = 0; j < nclauses; j++)
+ cover_map[i * nclauses + j]
+ = bms_is_subset(clauses_attnums[j], stats_attnums[i]);
+
+ return cover_map;
+}
+
/*
- * We're looking for statistics matching at least 2 attributes, referenced in
- * clauses compatible with multivariate statistics. The current selection
- * criteria is very simple - we choose the statistics referencing the most
- * attributes.
- *
- * If there are multiple statistics referencing the same number of columns
- * (from the clauses), the one with less source columns (as listed in the
- * ADD STATISTICS when creating the statistics) wins. Else the first one wins.
- *
- * This is a very simple criteria, and has several weaknesses:
- *
- * (a) does not consider the accuracy of the statistics
- *
- * If there are two histograms built on the same set of columns, but one
- * has 100 buckets and the other one has 1000 buckets (thus likely
- * providing better estimates), this is not currently considered.
- *
- * (b) does not consider the type of statistics
- *
- * If there are three statistics - one containing just a MCV list, another
- * one with just a histogram and a third one with both, we treat them equally.
+ * Chooses the combination of statistics optimal for estimating a particular
+ * clause list.
*
- * (c) does not consider the number of clauses
+ * This only handles the 'preparation' part shared by the exhaustive and greedy
+ * implementations (see the previous methods), mostly trying to reduce the size
+ * of the problem (eliminating clauses/statistics that can't really be used in
+ * the solution).
*
- * As explained, only the number of referenced attributes counts, so if
- * there are multiple clauses on a single attribute, this still counts as
- * a single attribute.
+ * It also precomputes bitmaps for attributes covered by clauses and statistics,
+ * so that we don't need to do that over and over in the actual optimizations
+ * (as it's both CPU and memory intensive).
*
- * (d) does not consider type of condition
*
- * Some clauses may work better with some statistics - for example equality
- * clauses probably work better with MCV lists than with histograms. But
- * IS [NOT] NULL conditions may often work better with histograms (thanks
- * to NULL-buckets).
+ * TODO Another way to make the optimization problems smaller might be splitting
+ * the statistics into several disjoint subsets, i.e. if we can split the
+ * graph of statistics (after the elimination) into multiple components
+ * (so that stats in different components share no attributes), we can do
+ * the optimization for each component separately.
*
- * So for example with five WHERE conditions
- *
- * WHERE (a = 1) AND (b = 1) AND (c = 1) AND (d = 1) AND (e = 1)
- *
- * and statistics on (a,b), (a,b,e) and (a,b,c,d), the last one will be selected
- * as it references the most columns.
- *
- * Once we have selected the multivariate statistics, we split the list of
- * clauses into two parts - conditions that are compatible with the selected
- * stats, and conditions are estimated using simple statistics.
- *
- * From the example above, conditions
- *
- * (a = 1) AND (b = 1) AND (c = 1) AND (d = 1)
- *
- * will be estimated using the multivariate statistics (a,b,c,d) while the last
- * condition (e = 1) will get estimated using the regular ones.
- *
- * There are various alternative selection criteria (e.g. counting conditions
- * instead of just referenced attributes), but eventually the best option should
- * be to combine multiple statistics. But that's much harder to do correctly.
- *
- * TODO Select multiple statistics and combine them when computing the estimate.
- *
- * TODO This will probably have to consider compatibility of clauses, because
- * 'dependencies' will probably work only with equality clauses.
+ * TODO If we could compute what a "perfect solution" is, maybe we could
+ * terminate the search after reaching ~90% of it? Say, if we knew that we
+ * can cover 10 clauses and reuse 8 dependencies, maybe covering 9 clauses
+ * and 7 dependencies would be OK?
*/
-static MVStatisticInfo *
-choose_mv_statistics(List *stats, Bitmapset *attnums)
+static List*
+choose_mv_statistics(PlannerInfo *root, Index relid, List *stats,
+ List *clauses, List *conditions)
{
int i;
- ListCell *lc;
+ mv_solution_t *best = NULL;
+ List *result = NIL;
+
+ int nmvstats;
+ MVStatisticInfo *mvstats;
+
+ /* we only work with MCV lists and histograms here */
+ int type = (MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST);
+
+ bool *clause_cover_map = NULL,
+ *condition_cover_map = NULL;
+ int *ruled_out = NULL;
+
+ /* build bitmapsets for all stats and clauses */
+ Bitmapset **stats_attnums;
+ Bitmapset **clauses_attnums;
+ Bitmapset **conditions_attnums;
- MVStatisticInfo *choice = NULL;
+ int nclauses, nconditions;
+ Node ** clauses_array;
+ Node ** conditions_array;
- int current_matches = 1; /* goal #1: maximize */
- int current_dims = (MVSTATS_MAX_DIMENSIONS+1); /* goal #2: minimize */
+ /* copy lists, so that we can free them during elimination easily */
+ clauses = list_copy(clauses);
+ conditions = list_copy(conditions);
+ stats = list_copy(stats);
/*
- * Walk through the statistics (simple array with nmvstats elements) and for
- * each one count the referenced attributes (encoded in the 'attnums' bitmap).
+ * Reduce the optimization problem size as much as possible.
+ *
+ * Eliminate clauses and conditions not covered by any statistics,
+ * or statistics not matching at least two attributes (one of them
+ * has to be in a regular clause).
+ *
+ * It's possible that removing a statistics in one iteration
+ * eliminates a clause in the next one, so we repeat this until
+ * an iteration eliminates no clauses/stats.
+ *
+ * This can only happen after eliminating a statistics - clauses are
+ * eliminated first, so statistics always reflect that.
*/
- foreach (lc, stats)
+ while (true)
{
- MVStatisticInfo *info = (MVStatisticInfo *)lfirst(lc);
-
- /* columns matching this statistics */
- int matches = 0;
+ List *tmp;
- int2vector * attrs = info->stakeys;
- int numattrs = attrs->dim1;
+ Bitmapset *compatible_attnums = NULL;
+ Bitmapset *condition_attnums = NULL;
+ Bitmapset *all_attnums = NULL;
- /* skip dependencies-only stats */
- if (! (info->mcv_built || info->hist_built))
- continue;
+ /*
+ * Clauses
+ *
+ * Walk through clauses and keep only those covered by at least
+ * one of the statistics we still have. We'll also keep info
+ * about attnums in clauses (without conditions) so that we can
+ * ignore stats covering just conditions (which is pointless).
+ */
+ tmp = filter_clauses(root, relid, type,
+ stats, clauses, &compatible_attnums);
- /* count columns covered by the histogram */
- for (i = 0; i < numattrs; i++)
- if (bms_is_member(attrs->values[i], attnums))
- matches++;
+ /* discard the original list */
+ list_free(clauses);
+ clauses = tmp;
/*
- * Use this statistics when it improves the number of matches or
- * when it matches the same number of attributes but is smaller.
+ * Conditions
+ *
+ * Walk through the conditions and keep only those covered by at
+ * least one of the statistics we still have. Also, collect a bitmap
+ * of attributes so that we can make sure we add at least one new
+ * attribute (by comparing with clauses).
*/
- if ((matches > current_matches) ||
- ((matches == current_matches) && (current_dims > numattrs)))
+ if (conditions != NIL)
{
- choice = info;
- current_matches = matches;
- current_dims = numattrs;
+ tmp = filter_clauses(root, relid, type,
+ stats, conditions, &condition_attnums);
+
+ /* discard the original list */
+ list_free(conditions);
+ conditions = tmp;
}
- }
- return choice;
-}
+ /* get a union of attnums (from conditions and new clauses) */
+ all_attnums = bms_union(compatible_attnums, condition_attnums);
+
+ /*
+ * Statistics
+ *
+ * Walk through the statistics and only keep those covering at least
+ * one new attribute (excluding conditions) and at least two
+ * attributes in clauses and conditions combined.
+ */
+ tmp = filter_stats(stats, compatible_attnums, all_attnums);
+ /* if we've not eliminated anything, terminate */
+ if (list_length(stats) == list_length(tmp))
+ break;
-/*
- * This splits the clauses list into two parts - one containing clauses that
- * will be evaluated using the chosen statistics, and the remaining clauses
- * (either non-mvcompatible, or not related to the histogram).
- */
-static List *
-clauselist_mv_split(PlannerInfo *root, Index relid,
- List *clauses, List **mvclauses,
- MVStatisticInfo *mvstats, int types)
-{
- int i;
- ListCell *l;
- List *non_mvclauses = NIL;
+ /* work only with the filtered statistics from now on */
+ list_free(stats);
+ stats = tmp;
+ }
- /* FIXME is there a better way to get info on int2vector? */
- int2vector * attrs = mvstats->stakeys;
- int numattrs = mvstats->stakeys->dim1;
+ /* only do the optimization if we have clauses/statistics */
+ if ((list_length(stats) == 0) || (list_length(clauses) == 0))
+ return NIL;
- Bitmapset *mvattnums = NULL;
+ /* remove redundant stats (stats covered by another stats) */
+ stats = filter_redundant_stats(stats, clauses, conditions);
- /* build bitmap of attributes, so we can do bms_is_subset later */
- for (i = 0; i < numattrs; i++)
- mvattnums = bms_add_member(mvattnums, attrs->values[i]);
+ /*
+ * TODO We should sort the stats to make the order deterministic,
+ * otherwise we may get different estimates on different
+ * executions - if there are multiple "equally good" solutions,
+ * we'll keep the first solution we see.
+ *
+ * Sorting by OID probably is not the right solution though,
+ * because we'd like it to be somehow reproducible,
+ * irrespective of the order of ADD STATISTICS commands.
+ * So maybe statkeys?
+ */
+ mvstats = make_stats_array(stats, &nmvstats);
+ stats_attnums = make_stats_attnums(mvstats, nmvstats);
- /* erase the list of mv-compatible clauses */
- *mvclauses = NIL;
+ /* collect clauses and bitmaps of attnums */
+ clauses_array = make_clauses_array(clauses, &nclauses);
+ clauses_attnums = make_clauses_attnums(root, relid, type,
+ clauses_array, nclauses);
- foreach (l, clauses)
- {
- bool match = false; /* by default not mv-compatible */
- Bitmapset *attnums = NULL;
- Node *clause = (Node *) lfirst(l);
+ /* collect conditions and bitmap of attnums */
+ conditions_array = make_clauses_array(conditions, &nconditions);
+ conditions_attnums = make_clauses_attnums(root, relid, type,
+ conditions_array, nconditions);
- if (clause_is_mv_compatible(clause, relid, &attnums, types))
+ /*
+ * Build bitmaps with info about which clauses/conditions are
+ * covered by each statistics (so that we don't need to call the
+ * bms_is_subset over and over again).
+ */
+ clause_cover_map = make_cover_map(stats_attnums, nmvstats,
+ clauses_attnums, nclauses);
+
+ condition_cover_map = make_cover_map(stats_attnums, nmvstats,
+ conditions_attnums, nconditions);
+
+ ruled_out = (int*)palloc0(nmvstats * sizeof(int));
+
+ /* no stats are ruled out by default */
+ for (i = 0; i < nmvstats; i++)
+ ruled_out[i] = -1;
+
+ /* do the optimization itself */
+ if (mvstat_search_type == MVSTAT_SEARCH_EXHAUSTIVE)
+ choose_mv_statistics_exhaustive(root, 0,
+ nmvstats, mvstats, stats_attnums,
+ nclauses, clauses_array, clauses_attnums,
+ nconditions, conditions_array, conditions_attnums,
+ clause_cover_map, condition_cover_map,
+ ruled_out, NULL, &best);
+ else
+ choose_mv_statistics_greedy(root, 0,
+ nmvstats, mvstats, stats_attnums,
+ nclauses, clauses_array, clauses_attnums,
+ nconditions, conditions_array, conditions_attnums,
+ clause_cover_map, condition_cover_map,
+ ruled_out, NULL, &best);
+
+ /* create a list of statistics from the array */
+ if (best != NULL)
+ {
+ for (i = 0; i < best->nstats; i++)
{
- /* are all the attributes part of the selected stats? */
- if (bms_is_subset(attnums, mvattnums))
- match = true;
+ MVStatisticInfo *info = makeNode(MVStatisticInfo);
+ memcpy(info, &mvstats[best->stats[i]], sizeof(MVStatisticInfo));
+ result = lappend(result, info);
}
- /*
- * The clause matches the selected stats, so put it to the list of
- * mv-compatible clauses. Otherwise, keep it in the list of 'regular'
- * clauses (that may be selected later).
- */
- if (match)
- *mvclauses = lappend(*mvclauses, clause);
- else
- non_mvclauses = lappend(non_mvclauses, clause);
+ pfree(best);
}
- /*
- * Perform regular estimation using the clauses incompatible with the chosen
- * histogram (or MV stats in general).
- */
- return non_mvclauses;
+ /* cleanup (maybe leave it up to the memory context?) */
+ for (i = 0; i < nmvstats; i++)
+ bms_free(stats_attnums[i]);
+
+ for (i = 0; i < nclauses; i++)
+ bms_free(clauses_attnums[i]);
+
+ for (i = 0; i < nconditions; i++)
+ bms_free(conditions_attnums[i]);
+
+ pfree(stats_attnums);
+ pfree(clauses_attnums);
+ pfree(conditions_attnums);
+ pfree(clauses_array);
+ pfree(conditions_array);
+ pfree(clause_cover_map);
+ pfree(condition_cover_map);
+ pfree(ruled_out);
+ pfree(mvstats);
+
+ list_free(clauses);
+ list_free(conditions);
+ list_free(stats);
+
+ return result;
}
typedef struct
@@ -1474,6 +2686,7 @@ clause_is_mv_compatible(Node *clause, Index relid, Bitmapset **attnums, int type
return true;
}
+
/*
* collect attnums from functional dependencies
*
@@ -2022,6 +3235,24 @@ clauselist_apply_dependencies(PlannerInfo *root, List *clauses,
* Check that there are stats with at least one of the requested types.
*/
static bool
+stats_type_matches(MVStatisticInfo *stat, int type)
+{
+ if ((type & MV_CLAUSE_TYPE_FDEP) && stat->deps_built)
+ return true;
+
+ if ((type & MV_CLAUSE_TYPE_MCV) && stat->mcv_built)
+ return true;
+
+ if ((type & MV_CLAUSE_TYPE_HIST) && stat->hist_built)
+ return true;
+
+ return false;
+}
+
+/*
+ * Check that there are stats with at least one of the requested types.
+ */
+static bool
has_stats(List *stats, int type)
{
ListCell *s;
@@ -2030,13 +3261,8 @@ has_stats(List *stats, int type)
{
MVStatisticInfo *stat = (MVStatisticInfo *)lfirst(s);
- if ((type & MV_CLAUSE_TYPE_FDEP) && stat->deps_built)
- return true;
-
- if ((type & MV_CLAUSE_TYPE_MCV) && stat->mcv_built)
- return true;
-
- if ((type & MV_CLAUSE_TYPE_HIST) && stat->hist_built)
+ /* terminate if we've found at least one matching statistics */
+ if (stats_type_matches(stat, type))
return true;
}
@@ -2087,22 +3313,26 @@ find_stats(PlannerInfo *root, Index relid)
* as the clauses are processed (and skip items that are 'match').
*/
static Selectivity
-clauselist_mv_selectivity_mcvlist(PlannerInfo *root, List *clauses,
- MVStatisticInfo *mvstats, bool *fullmatch,
- Selectivity *lowsel)
+clauselist_mv_selectivity_mcvlist(PlannerInfo *root, MVStatisticInfo *mvstats,
+ List *clauses, List *conditions, bool is_or,
+ bool *fullmatch, Selectivity *lowsel)
{
int i;
Selectivity s = 0.0;
+ Selectivity t = 0.0;
Selectivity u = 0.0;
MCVList mcvlist = NULL;
+
int nmatches = 0;
+ int nconditions = 0;
/* match/mismatch bitmap for each MCV item */
char * matches = NULL;
+ char * condition_matches = NULL;
Assert(clauses != NIL);
- Assert(list_length(clauses) >= 2);
+ Assert(list_length(clauses) >= 1);
/* there's no MCV list built yet */
if (! mvstats->mcv_built)
@@ -2113,32 +3343,85 @@ clauselist_mv_selectivity_mcvlist(PlannerInfo *root, List *clauses,
Assert(mcvlist != NULL);
Assert(mcvlist->nitems > 0);
- /* by default all the MCV items match the clauses fully */
- matches = palloc0(sizeof(char) * mcvlist->nitems);
- memset(matches, MVSTATS_MATCH_FULL, sizeof(char)*mcvlist->nitems);
-
/* number of matching MCV items */
nmatches = mcvlist->nitems;
+ nconditions = mcvlist->nitems;
+
+ /*
+ * Bitmap of bucket matches (mismatch, partial, full).
+ *
+ * For AND clauses all buckets match (and we'll eliminate them).
+ * For OR clauses no buckets match (and we'll add them).
+ *
+ * We only need to do the memset for AND clauses (for OR clauses
+ * it's already set correctly by the palloc0).
+ */
+ matches = palloc0(sizeof(char) * nmatches);
+
+ if (! is_or) /* AND-clause */
+ memset(matches, MVSTATS_MATCH_FULL, sizeof(char)*nmatches);
+ /* Conditions are treated as AND clause, so match by default. */
+ condition_matches = palloc0(sizeof(char) * nconditions);
+ memset(condition_matches, MVSTATS_MATCH_FULL, sizeof(char)*nconditions);
+
+ /*
+ * build the match bitmap for the conditions (conditions are always
+ * connected by AND)
+ */
+ if (conditions != NIL)
+ nconditions = update_match_bitmap_mcvlist(root, conditions,
+ mvstats->stakeys, mcvlist,
+ nconditions, condition_matches,
+ lowsel, fullmatch, false);
+
+ /*
+ * build the match bitmap for the estimated clauses
+ *
+ * TODO This evaluates the clauses for all MCV items, even those
+ * ruled out by the conditions. The final result should be the
+ * same, but skipping them might be faster.
+ */
nmatches = update_match_bitmap_mcvlist(root, clauses,
mvstats->stakeys, mcvlist,
- nmatches, matches,
- lowsel, fullmatch, false);
+ ((is_or) ? 0 : nmatches), matches,
+ lowsel, fullmatch, is_or);
/* sum frequencies for all the matching MCV items */
for (i = 0; i < mcvlist->nitems; i++)
{
- /* used to 'scale' for MCV lists not covering all tuples */
+ /*
+ * Find out what part of the data is covered by the MCV list,
+ * so that we can 'scale' the selectivity properly (e.g. when
+ * only 50% of the sample items got into the MCV, and the rest
+ * is either in a histogram, or not covered by stats).
+ *
+ * TODO This might be handled by keeping a global "frequency"
+ * for the whole list, which might save us a bit of time
+ * spent on accessing the not-matching part of the MCV list.
+ * Although it's likely in a cache, so it's very fast.
+ */
u += mcvlist->items[i]->frequency;
+ /* skip MCV items not matching the conditions */
+ if (condition_matches[i] == MVSTATS_MATCH_NONE)
+ continue;
+
if (matches[i] != MVSTATS_MATCH_NONE)
s += mcvlist->items[i]->frequency;
+
+ t += mcvlist->items[i]->frequency;
}
pfree(matches);
+ pfree(condition_matches);
pfree(mcvlist);
- return s*u;
+ /* no condition matches */
+ if (t == 0.0)
+ return (Selectivity)0.0;
+
+ return (s / t) * u;
}
/*
@@ -2369,64 +3652,57 @@ update_match_bitmap_mcvlist(PlannerInfo *root, List *clauses,
}
}
}
- else if (or_clause(clause) || and_clause(clause))
+ else if (or_clause(clause) || and_clause(clause) || not_clause(clause))
{
/* AND/OR clause, with all clauses compatible with the selected MV stat */
int i;
- BoolExpr *orclause = ((BoolExpr*)clause);
- List *orclauses = orclause->args;
+ List *tmp_clauses = ((BoolExpr*)clause)->args;
/* match/mismatch bitmap for each MCV item */
- int or_nmatches = 0;
- char * or_matches = NULL;
+ int tmp_nmatches = 0;
+ char * tmp_matches = NULL;
- Assert(orclauses != NIL);
- Assert(list_length(orclauses) >= 2);
+ Assert(tmp_clauses != NIL);
+ Assert((list_length(tmp_clauses) >= 2) || (not_clause(clause) && (list_length(tmp_clauses)==1)));
/* number of matching MCV items */
- or_nmatches = mcvlist->nitems;
+ tmp_nmatches = (or_clause(clause)) ? 0 : mcvlist->nitems;
/* by default none of the MCV items matches the clauses */
- or_matches = palloc0(sizeof(char) * or_nmatches);
+ tmp_matches = palloc0(sizeof(char) * mcvlist->nitems);
- if (or_clause(clause))
- {
- /* OR clauses assume nothing matches, initially */
- memset(or_matches, MVSTATS_MATCH_NONE, sizeof(char)*or_nmatches);
- or_nmatches = 0;
- }
- else
- {
- /* AND clauses assume nothing matches, initially */
- memset(or_matches, MVSTATS_MATCH_FULL, sizeof(char)*or_nmatches);
- }
+ /* AND (and NOT) clauses assume everything matches, initially */
+ if (! or_clause(clause))
+ memset(tmp_matches, MVSTATS_MATCH_FULL, sizeof(char)*mcvlist->nitems);
/* build the match bitmap for the OR-clauses */
- or_nmatches = update_match_bitmap_mcvlist(root, orclauses,
+ tmp_nmatches = update_match_bitmap_mcvlist(root, tmp_clauses,
stakeys, mcvlist,
- or_nmatches, or_matches,
+ tmp_nmatches, tmp_matches,
lowsel, fullmatch, or_clause(clause));
/* merge the bitmap into the existing one*/
for (i = 0; i < mcvlist->nitems; i++)
{
+ /* if this is a NOT clause, we need to invert the results first */
+ if (not_clause(clause))
+ tmp_matches[i] = (MVSTATS_MATCH_FULL - tmp_matches[i]);
+
/*
* To AND-merge the bitmaps, a MIN() semantics is used.
* For OR-merge, use MAX().
*
* FIXME this does not decrease the number of matches
*/
- UPDATE_RESULT(matches[i], or_matches[i], is_or);
+ UPDATE_RESULT(matches[i], tmp_matches[i], is_or);
}
- pfree(or_matches);
+ pfree(tmp_matches);
}
else
- {
elog(ERROR, "unknown clause type: %d", clause->type);
- }
}
/*
@@ -2484,15 +3760,18 @@ update_match_bitmap_mcvlist(PlannerInfo *root, List *clauses,
* this is not uncommon, but for histograms it's not that clear.
*/
static Selectivity
-clauselist_mv_selectivity_histogram(PlannerInfo *root, List *clauses,
- MVStatisticInfo *mvstats)
+clauselist_mv_selectivity_histogram(PlannerInfo *root, MVStatisticInfo *mvstats,
+ List *clauses, List *conditions, bool is_or)
{
int i;
Selectivity s = 0.0;
+ Selectivity t = 0.0;
Selectivity u = 0.0;
int nmatches = 0;
+ int nconditions = 0;
char *matches = NULL;
+ char *condition_matches = NULL;
MVSerializedHistogram mvhist = NULL;
@@ -2505,25 +3784,55 @@ clauselist_mv_selectivity_histogram(PlannerInfo *root, List *clauses,
Assert (mvhist != NULL);
Assert (clauses != NIL);
- Assert (list_length(clauses) >= 2);
+ Assert (list_length(clauses) >= 1);
+
+ nmatches = mvhist->nbuckets;
+ nconditions = mvhist->nbuckets;
/*
- * Bitmap of bucket matches (mismatch, partial, full). by default
- * all buckets fully match (and we'll eliminate them).
+ * Bitmap of bucket matches (mismatch, partial, full).
+ *
+ * For AND clauses all buckets match (and we'll eliminate them).
+ * For OR clauses no buckets match (and we'll add them).
+ *
+ * We only need to do the memset for AND clauses (for OR clauses
+ * it's already set correctly by the palloc0).
*/
- matches = palloc0(sizeof(char) * mvhist->nbuckets);
- memset(matches, MVSTATS_MATCH_FULL, sizeof(char)*mvhist->nbuckets);
+ matches = palloc0(sizeof(char) * nmatches);
- nmatches = mvhist->nbuckets;
+ if (! is_or) /* AND-clause */
+ memset(matches, MVSTATS_MATCH_FULL, sizeof(char)*nmatches);
+
+ /* Conditions are treated as AND clause, so match by default. */
+ condition_matches = palloc0(sizeof(char)*nconditions);
+ memset(condition_matches, MVSTATS_MATCH_FULL, sizeof(char)*nconditions);
+
+ /*
+ * build the match bitmap for the conditions (conditions are always
+ * connected by AND)
+ */
+ if (conditions != NIL)
+ update_match_bitmap_histogram(root, conditions,
+ mvstats->stakeys, mvhist,
+ nconditions, condition_matches, false);
- /* build the match bitmap */
+ /*
+ * build the match bitmap for the estimated clauses
+ *
+ * TODO This evaluates the clauses for all buckets, even those
+ * ruled out by the conditions. The final result should be
+ * the same, but skipping them might be faster.
+ */
update_match_bitmap_histogram(root, clauses,
mvstats->stakeys, mvhist,
- nmatches, matches, false);
+ ((is_or) ? 0 : nmatches), matches,
+ is_or);
/* now, walk through the buckets and sum the selectivities */
for (i = 0; i < mvhist->nbuckets; i++)
{
+ float coeff = 1.0;
+
/*
* Find out what part of the data is covered by the histogram,
* so that we can 'scale' the selectivity properly (e.g. when
@@ -2537,10 +3846,23 @@ clauselist_mv_selectivity_histogram(PlannerInfo *root, List *clauses,
*/
u += mvhist->buckets[i]->ntuples;
+ /* skip buckets not matching the conditions */
+ if (condition_matches[i] == MVSTATS_MATCH_NONE)
+ continue;
+ else if (condition_matches[i] == MVSTATS_MATCH_PARTIAL)
+ coeff = 0.5;
+
+ t += coeff * mvhist->buckets[i]->ntuples;
+
if (matches[i] == MVSTATS_MATCH_FULL)
- s += mvhist->buckets[i]->ntuples;
+ s += coeff * mvhist->buckets[i]->ntuples;
else if (matches[i] == MVSTATS_MATCH_PARTIAL)
- s += 0.5 * mvhist->buckets[i]->ntuples;
+ /*
+ * TODO If both conditions and clauses match partially, this
+ * will use a 0.25 match - not sure if that's the right
+ * solution, but it seems about right.
+ */
+ s += coeff * 0.5 * mvhist->buckets[i]->ntuples;
}
#ifdef DEBUG_MVHIST
@@ -2549,9 +3871,14 @@ clauselist_mv_selectivity_histogram(PlannerInfo *root, List *clauses,
/* release the allocated bitmap and deserialized histogram */
pfree(matches);
+ pfree(condition_matches);
pfree(mvhist);
- return s * u;
+ /* no condition matches */
+ if (t == 0.0)
+ return (Selectivity)0.0;
+
+ return (s / t) * u;
}
/* cached result of bucket boundary comparison for a single dimension */
@@ -2699,7 +4026,7 @@ update_match_bitmap_histogram(PlannerInfo *root, List *clauses,
{
int i;
ListCell * l;
-
+
/*
* Used for caching function calls, only once per deduplicated value.
*
@@ -2742,7 +4069,7 @@ update_match_bitmap_histogram(PlannerInfo *root, List *clauses,
FmgrInfo opproc; /* operator */
fmgr_info(get_opcode(expr->opno), &opproc);
-
+
/* reset the cache (per clause) */
memset(callcache, 0, mvhist->nbuckets);
@@ -2902,64 +4229,57 @@ update_match_bitmap_histogram(PlannerInfo *root, List *clauses,
UPDATE_RESULT(matches[i], MVSTATS_MATCH_NONE, is_or);
}
}
- else if (or_clause(clause) || and_clause(clause))
+ else if (or_clause(clause) || and_clause(clause) || not_clause(clause))
{
/* AND/OR clause, with all clauses compatible with the selected MV stat */
int i;
- BoolExpr *orclause = ((BoolExpr*)clause);
- List *orclauses = orclause->args;
+ List *tmp_clauses = ((BoolExpr*)clause)->args;
/* match/mismatch bitmap for each bucket */
- int or_nmatches = 0;
- char * or_matches = NULL;
+ int tmp_nmatches = 0;
+ char * tmp_matches = NULL;
- Assert(orclauses != NIL);
- Assert(list_length(orclauses) >= 2);
+ Assert(tmp_clauses != NIL);
+ Assert((list_length(tmp_clauses) >= 2) || (not_clause(clause) && (list_length(tmp_clauses)==1)));
/* number of matching buckets */
- or_nmatches = mvhist->nbuckets;
+ tmp_nmatches = (or_clause(clause)) ? 0 : mvhist->nbuckets;
- /* by default none of the buckets matches the clauses */
- or_matches = palloc0(sizeof(char) * or_nmatches);
+ /* by default none of the buckets matches the clauses (OR clause) */
+ tmp_matches = palloc0(sizeof(char) * mvhist->nbuckets);
- if (or_clause(clause))
- {
- /* OR clauses assume nothing matches, initially */
- memset(or_matches, MVSTATS_MATCH_NONE, sizeof(char)*or_nmatches);
- or_nmatches = 0;
- }
- else
- {
- /* AND clauses assume nothing matches, initially */
- memset(or_matches, MVSTATS_MATCH_FULL, sizeof(char)*or_nmatches);
- }
+ /* but AND (and NOT) clauses assume everything matches, initially */
+ if (! or_clause(clause))
+ memset(tmp_matches, MVSTATS_MATCH_FULL, sizeof(char)*mvhist->nbuckets);
/* build the match bitmap for the OR-clauses */
- or_nmatches = update_match_bitmap_histogram(root, orclauses,
+ tmp_nmatches = update_match_bitmap_histogram(root, tmp_clauses,
stakeys, mvhist,
- or_nmatches, or_matches, or_clause(clause));
+ tmp_nmatches, tmp_matches, or_clause(clause));
/* merge the bitmap into the existing one*/
for (i = 0; i < mvhist->nbuckets; i++)
{
+ /* if this is a NOT clause, we need to invert the results first */
+ if (not_clause(clause))
+ tmp_matches[i] = (MVSTATS_MATCH_FULL - tmp_matches[i]);
+
/*
* To AND-merge the bitmaps, a MIN() semantics is used.
* For OR-merge, use MAX().
*
* FIXME this does not decrease the number of matches
*/
- UPDATE_RESULT(matches[i], or_matches[i], is_or);
+ UPDATE_RESULT(matches[i], tmp_matches[i], is_or);
}
- pfree(or_matches);
-
+ pfree(tmp_matches);
}
else
elog(ERROR, "unknown clause type: %d", clause->type);
}
- /* free the call cache */
pfree(callcache);
return nmatches;
diff --git a/src/backend/optimizer/path/costsize.c b/src/backend/optimizer/path/costsize.c
index 5350329..57214e0 100644
--- a/src/backend/optimizer/path/costsize.c
+++ b/src/backend/optimizer/path/costsize.c
@@ -3518,7 +3518,8 @@ compute_semi_anti_join_factors(PlannerInfo *root,
joinquals,
0,
jointype,
- sjinfo);
+ sjinfo,
+ NIL);
/*
* Also get the normal inner-join selectivity of the join clauses.
@@ -3541,7 +3542,8 @@ compute_semi_anti_join_factors(PlannerInfo *root,
joinquals,
0,
JOIN_INNER,
- &norm_sjinfo);
+ &norm_sjinfo,
+ NIL);
/* Avoid leaking a lot of ListCells */
if (jointype == JOIN_ANTI)
@@ -3708,7 +3710,7 @@ approx_tuple_count(PlannerInfo *root, JoinPath *path, List *quals)
Node *qual = (Node *) lfirst(l);
/* Note that clause_selectivity will be able to cache its result */
- selec *= clause_selectivity(root, qual, 0, JOIN_INNER, &sjinfo);
+ selec *= clause_selectivity(root, qual, 0, JOIN_INNER, &sjinfo, NIL);
}
/* Apply it to the input relation sizes */
@@ -3744,7 +3746,8 @@ set_baserel_size_estimates(PlannerInfo *root, RelOptInfo *rel)
rel->baserestrictinfo,
0,
JOIN_INNER,
- NULL);
+ NULL,
+ NIL);
rel->rows = clamp_row_est(nrows);
@@ -3781,7 +3784,8 @@ get_parameterized_baserel_size(PlannerInfo *root, RelOptInfo *rel,
allclauses,
rel->relid, /* do not use 0! */
JOIN_INNER,
- NULL);
+ NULL,
+ NIL);
nrows = clamp_row_est(nrows);
/* For safety, make sure result is not more than the base estimate */
if (nrows > rel->rows)
@@ -3919,12 +3923,14 @@ calc_joinrel_size_estimate(PlannerInfo *root,
joinquals,
0,
jointype,
- sjinfo);
+ sjinfo,
+ NIL);
pselec = clauselist_selectivity(root,
pushedquals,
0,
jointype,
- sjinfo);
+ sjinfo,
+ NIL);
/* Avoid leaking a lot of ListCells */
list_free(joinquals);
@@ -3936,7 +3942,8 @@ calc_joinrel_size_estimate(PlannerInfo *root,
restrictlist,
0,
jointype,
- sjinfo);
+ sjinfo,
+ NIL);
pselec = 0.0; /* not used, keep compiler quiet */
}
diff --git a/src/backend/optimizer/util/orclauses.c b/src/backend/optimizer/util/orclauses.c
index ea831f5..6299e75 100644
--- a/src/backend/optimizer/util/orclauses.c
+++ b/src/backend/optimizer/util/orclauses.c
@@ -280,7 +280,7 @@ consider_new_or_clause(PlannerInfo *root, RelOptInfo *rel,
* saving work later.)
*/
or_selec = clause_selectivity(root, (Node *) or_rinfo,
- 0, JOIN_INNER, NULL);
+ 0, JOIN_INNER, NULL, NIL);
/*
* The clause is only worth adding to the query if it rejects a useful
@@ -342,7 +342,7 @@ consider_new_or_clause(PlannerInfo *root, RelOptInfo *rel,
/* Compute inner-join size */
orig_selec = clause_selectivity(root, (Node *) join_or_rinfo,
- 0, JOIN_INNER, &sjinfo);
+ 0, JOIN_INNER, &sjinfo, NIL);
/* And hack cached selectivity so join size remains the same */
join_or_rinfo->norm_selec = orig_selec / or_selec;
diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index 46c95b0..7d0a3a1 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -1627,13 +1627,15 @@ booltestsel(PlannerInfo *root, BoolTestType booltesttype, Node *arg,
case IS_NOT_FALSE:
selec = (double) clause_selectivity(root, arg,
varRelid,
- jointype, sjinfo);
+ jointype, sjinfo,
+ NIL);
break;
case IS_FALSE:
case IS_NOT_TRUE:
selec = 1.0 - (double) clause_selectivity(root, arg,
varRelid,
- jointype, sjinfo);
+ jointype, sjinfo,
+ NIL);
break;
default:
elog(ERROR, "unrecognized booltesttype: %d",
@@ -6259,7 +6261,8 @@ genericcostestimate(PlannerInfo *root,
indexSelectivity = clauselist_selectivity(root, selectivityQuals,
index->rel->relid,
JOIN_INNER,
- NULL);
+ NULL,
+ NIL);
/*
* If caller didn't give us an estimate, estimate the number of index
@@ -6579,7 +6582,8 @@ btcostestimate(PlannerInfo *root, IndexPath *path, double loop_count,
btreeSelectivity = clauselist_selectivity(root, selectivityQuals,
index->rel->relid,
JOIN_INNER,
- NULL);
+ NULL,
+ NIL);
numIndexTuples = btreeSelectivity * index->rel->tuples;
/*
@@ -7330,7 +7334,8 @@ gincostestimate(PlannerInfo *root, IndexPath *path, double loop_count,
*indexSelectivity = clauselist_selectivity(root, selectivityQuals,
index->rel->relid,
JOIN_INNER,
- NULL);
+ NULL,
+ NIL);
/* fetch estimated page cost for tablespace containing index */
get_tablespace_page_costs(index->reltablespace,
@@ -7560,7 +7565,7 @@ brincostestimate(PlannerInfo *root, IndexPath *path, double loop_count,
*indexSelectivity =
clauselist_selectivity(root, indexQuals,
path->indexinfo->rel->relid,
- JOIN_INNER, NULL);
+ JOIN_INNER, NULL, NIL);
*indexCorrelation = 1;
/*
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index ea5a09a..27a8de5 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -75,6 +75,7 @@
#include "utils/bytea.h"
#include "utils/guc_tables.h"
#include "utils/memutils.h"
+#include "utils/mvstats.h"
#include "utils/pg_locale.h"
#include "utils/plancache.h"
#include "utils/portal.h"
@@ -393,6 +394,15 @@ static const struct config_enum_entry force_parallel_mode_options[] = {
};
/*
+ * Search algorithm for multivariate stats.
+ */
+static const struct config_enum_entry mvstat_search_options[] = {
+ {"greedy", MVSTAT_SEARCH_GREEDY, false},
+ {"exhaustive", MVSTAT_SEARCH_EXHAUSTIVE, false},
+ {NULL, 0, false}
+};
+
+/*
* Options for enum values stored in other modules
*/
extern const struct config_enum_entry wal_level_options[];
@@ -3707,6 +3717,16 @@ static struct config_enum ConfigureNamesEnum[] =
NULL, NULL, NULL
},
+ {
+ {"mvstat_search", PGC_USERSET, QUERY_TUNING_OTHER,
+ gettext_noop("Sets the algorithm used for combining multivariate stats."),
+ NULL
+ },
+ &mvstat_search_type,
+ MVSTAT_SEARCH_GREEDY, mvstat_search_options,
+ NULL, NULL, NULL
+ },
+
/* End-of-list marker */
{
{NULL, 0, 0, NULL, NULL}, NULL, 0, NULL, NULL, NULL, NULL
diff --git a/src/backend/utils/mvstats/README.stats b/src/backend/utils/mvstats/README.stats
index 3e4f4d1..d404914 100644
--- a/src/backend/utils/mvstats/README.stats
+++ b/src/backend/utils/mvstats/README.stats
@@ -90,6 +90,137 @@ even attempting to do the more expensive estimation.
Whenever we find there are no suitable stats, we skip the expensive steps.
+Combining multiple statistics
+-----------------------------
+
+When estimating the selectivity of a list of clauses, there may be no statistics
+covering all of them. If there are multiple statistics, each covering some
+subset of the attributes, the optimizer needs to figure out which of those
+statistics to apply.
+
+When the statistics do not overlap, the solution is trivial - we can simply
+split the conditions into groups by the matching statistics, and then multiply
+the selectivities. For example, assume multivariate statistics on (b,c) and (d,e),
+and a condition like this:
+
+ (a=1) AND (b=2) AND (c=3) AND (d=4) AND (e=5)
+
+Then (a=1) is not covered by any of the statistics, so it will be estimated
+using the regular per-column statistics. The conditions ((b=2) AND (c=3)) will
+be estimated using the (b,c) statistics, ((d=4) AND (e=5)) will be estimated
+using the (d,e) statistics, and the resulting selectivities will be multiplied
+together.
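+
+Spelled out as a formula (just restating the paragraph above), the estimate
+for this example is computed as
+
+    P(a=1 & b=2 & c=3 & d=4 & e=5)
+        = P(a=1) * P(b=2 & c=3) * P(d=4 & e=5)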
+
+Now, what if the statistics overlap? For example assume the same condition as
+above, but let's say we have statistics on (a,b,c) and (a,c,d,e). What then?
+
+As selectivity is just a probability that the condition holds for a random row,
+we can write the selectivity like this:
+
+ P(a=1 & b=2 & c=3 & d=4 & e=5)
+
+and we can rewrite it using conditional probability like this
+
+ P(a=1 & b=2 & c=3) * P(d=4 & e=5 | a=1 & b=2 & c=3)
+
+Notice that the first part already matches the (a,b,c) statistics. If we assume
+that columns that are not referenced by the same statistics are independent, we
+may rewrite the second half like this
+
+ P(d=4 & e=5 | a=1 & b=2 & c=3) = P(d=4 & e=5 | a=1 & c=3)
+
+which corresponds to the statistics on (a,c,d,e).
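+
+Putting the two parts together, the selectivity may then be estimated from the
+two statistics as
+
+    P(a=1 & b=2 & c=3) * P(d=4 & e=5 | a=1 & c=3)
+
+with the first term estimated using (a,b,c) and the second one using (a,c,d,e).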
+
+If there are multiple statistics defined on a table, it's not difficult to come
+up with examples where there are multiple ways to combine them to cover a list of
+clauses. We need a way to find the best combination of statistics.
+
+This is the purpose of choose_mv_statistics(). It searches through the possible
+combinations of statistics, looking for a combination that
+
+ (a) covers the most clauses of the list
+
+ (b) reuses the maximum number of clauses as conditions
+ (in conditional probabilities)
+
+While criteria (a) seems natural, (b) may seem a bit awkward at first. The
+idea is that conditions are a way of transferring information about dependencies
+between statistics.
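+
+To illustrate the two criteria on a simple constructed example (not taken from
+the code), consider the clauses
+
+    (a=1) AND (b=2) AND (c=3) AND (d=4)
+
+and statistics on (a,b), (a,b,c) and (c,d). Both combinations [(a,b), (c,d)]
+and [(a,b,c), (c,d)] cover all four clauses, but the latter also reuses the
+clause (c=3) as a condition when applying the (c,d) statistics, and thus
+should win by criteria (b).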
+
+There are two alternative implementations of choose_mv_statistics() - greedy
+and exhaustive. Exhaustive actually searches through all possible combinations
+of statistics, and for larger numbers of statistics may get quite expensive
+(as it, unsurprisingly, has exponential cost). Greedy terminates in less than
+K steps (when K is the number of clauses), and in each step chooses the best
+next statistics. I've been unable to come up with an example where those two
+approaches would produce different combinations.
+
+It's possible to choose the algorithm using the mvstat_search GUC, with either
+'greedy' or 'exhaustive' values (the default is 'greedy'):
+
+    SET mvstat_search = 'exhaustive';
+
+Note: This is meant mostly for experimentation. I do expect we'll choose one of
+the algorithms and remove the GUC before commit.
+
+
+Limitations of combining statistics
+-----------------------------------
+
+As described in the section 'Combining multiple statistics', the current
+approach is based on transferring information between statistics by means of
+conditional probabilities. This is a relatively cheap and efficient approach,
+but it relies on two assumptions:
+
+ (1) The overlap between the statistics needs to be sufficiently large, i.e.
+ there needs to be enough columns shared by the statistics to transfer
+ information about dependencies between the remaining columns.
+
+ (2) The query needs to include sufficient clauses on the shared columns.
+
+A simple example illustrates how violating those assumptions becomes a
+problem. Assume a table with three columns (a,b,c) containing exactly the
+same values, and statistics on (a,b) and (b,c):
+
+    CREATE TABLE test AS SELECT i AS a, i AS b, i AS c
+                           FROM generate_series(1,1000) s(i);
+
+ CREATE STATISTICS s1 ON test (a,b) WITH (mcv);
+ CREATE STATISTICS s2 ON test (b,c) WITH (mcv);
+
+ ANALYZE test;
+
+First, let's estimate this query:
+
+ SELECT * FROM test WHERE (a < 10) AND (c < 10);
+
+Clearly, there are no conditions on 'b' (the only column shared by the two
+statistics), so we'll end up with an estimate based on the independence
+assumption:
+
+ P(a < 10) * P(c < 10) = 0.01 * 0.01 = 0.0001
+
+This is a significant underestimate - the actual selectivity is about 0.01.
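+
+A quick sanity check against the example data (only the 9 rows with values
+1..9 satisfy both conditions):
+
+    SELECT count(*) FROM test WHERE (a < 10) AND (c < 10);  -- returns 9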
+
+But let's estimate another query:
+
+ SELECT * FROM test WHERE (a < 10) AND (b < 500) AND (c < 10);
+
+In this case, the estimate may be computed for example like this:
+
+ P[(a < 10) & (b < 500) & (c < 10)]
+ = P[(a < 10) & (b < 500)] * P[(c < 10) | (a < 10) & (b < 500)]
+ = P[(a < 10) & (b < 500)] * P[(c < 10) | (b < 500)]
+
+The trouble is that the probability P(c < 10 | b < 500) evaluates to 0.02: we
+assume (a) and (c) are independent, as there is no statistic containing both
+columns, and the condition on (b) does not transfer enough information between
+the two statistics.
+
+Currently, the only solution is to build statistics on all three columns, but
+see the 'Combining stats using convolution' section for ideas on how to
+improve this.
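+
+For the example above, that would be something like:
+
+    CREATE STATISTICS s3 ON test (a,b,c) WITH (mcv);
+    ANALYZE test;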
+
+
Further (possibly crazy) ideas
------------------------------
@@ -111,3 +242,38 @@ But of course, this may result in expensive estimation (CPU-wise).
So we might add a GUC to choose between the simple (single-statistics) and the
multi-statistics estimation, possibly as a table-level parameter (ALTER TABLE ...).
+
+
+Combining stats using convolution
+---------------------------------
+
+The current approach to combining statistics is based on conditional
+probabilities, and thus only works when the query includes conditions on the
+overlapping parts of the statistics. But there may be other ways to combine
+statistics, relaxing this requirement.
+
+Let's assume two histograms H1 and H2 - then combining them might work about
+like this:
+
+
+ for (buckets of H1, satisfying local conditions)
+ {
+ for (buckets of H2, overlapping with H1 bucket)
+ {
+ mark H2 bucket as 'valid'
+ }
+ }
+
+ s1 = s2 = 0.0
+ for (buckets of H2 marked as valid)
+ {
+ s1 += frequency
+
+        if (bucket satisfies local conditions)
+ s2 += frequency
+ }
+
+ s = (s2 / s1) /* final selectivity estimate */
+
+However this may quickly get non-trivial, e.g. when combining two statistics
+of different types (histogram vs. MCV).
diff --git a/src/include/optimizer/cost.h b/src/include/optimizer/cost.h
index fea2bb7..33f5a1b 100644
--- a/src/include/optimizer/cost.h
+++ b/src/include/optimizer/cost.h
@@ -192,11 +192,13 @@ extern Selectivity clauselist_selectivity(PlannerInfo *root,
List *clauses,
int varRelid,
JoinType jointype,
- SpecialJoinInfo *sjinfo);
+ SpecialJoinInfo *sjinfo,
+ List *conditions);
extern Selectivity clause_selectivity(PlannerInfo *root,
Node *clause,
int varRelid,
JoinType jointype,
- SpecialJoinInfo *sjinfo);
+ SpecialJoinInfo *sjinfo,
+ List *conditions);
#endif /* COST_H */
diff --git a/src/include/utils/mvstats.h b/src/include/utils/mvstats.h
index f05a517..35b2f8e 100644
--- a/src/include/utils/mvstats.h
+++ b/src/include/utils/mvstats.h
@@ -17,6 +17,14 @@
#include "fmgr.h"
#include "commands/vacuum.h"
+typedef enum MVStatSearchType
+{
+ MVSTAT_SEARCH_EXHAUSTIVE, /* exhaustive search */
+ MVSTAT_SEARCH_GREEDY /* greedy search */
+} MVStatSearchType;
+
+extern int mvstat_search_type;
+
/*
* Degree of how much MCV item / histogram bucket matches a clause.
* This is then considered when computing the selectivity.
--
2.1.0
Attachment: 0007-multivariate-ndistinct-coefficients.patch (text/x-patch)
From bcc5f072c0d14e824c9f50b2b6f5f31e864d92e6 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@pgaddict.com>
Date: Wed, 23 Dec 2015 02:07:58 +0100
Subject: [PATCH 7/9] multivariate ndistinct coefficients
---
doc/src/sgml/ref/create_statistics.sgml | 9 ++
src/backend/catalog/system_views.sql | 3 +-
src/backend/commands/analyze.c | 2 +-
src/backend/commands/statscmds.c | 11 +-
src/backend/optimizer/path/clausesel.c | 4 +
src/backend/optimizer/util/plancat.c | 4 +-
src/backend/utils/adt/selfuncs.c | 93 +++++++++++++++-
src/backend/utils/mvstats/Makefile | 2 +-
src/backend/utils/mvstats/README.ndistinct | 83 ++++++++++++++
src/backend/utils/mvstats/README.stats | 2 +
src/backend/utils/mvstats/common.c | 23 +++-
src/backend/utils/mvstats/mvdist.c | 171 +++++++++++++++++++++++++++++
src/include/catalog/pg_mv_statistic.h | 26 +++--
src/include/nodes/relation.h | 2 +
src/include/utils/mvstats.h | 9 +-
src/test/regress/expected/rules.out | 3 +-
16 files changed, 424 insertions(+), 23 deletions(-)
create mode 100644 src/backend/utils/mvstats/README.ndistinct
create mode 100644 src/backend/utils/mvstats/mvdist.c
diff --git a/doc/src/sgml/ref/create_statistics.sgml b/doc/src/sgml/ref/create_statistics.sgml
index fd3382e..80360a6 100644
--- a/doc/src/sgml/ref/create_statistics.sgml
+++ b/doc/src/sgml/ref/create_statistics.sgml
@@ -168,6 +168,15 @@ CREATE STATISTICS [ IF NOT EXISTS ] <replaceable class="PARAMETER">statistics_na
</listitem>
</varlistentry>
+ <varlistentry>
+ <term><literal>ndistinct</> (<type>boolean</>)</term>
+ <listitem>
+ <para>
+ Enables ndistinct coefficients for the statistics.
+ </para>
+ </listitem>
+ </varlistentry>
+
</variablelist>
</refsect2>
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 6afdee0..a550141 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -169,7 +169,8 @@ CREATE VIEW pg_mv_stats AS
length(S.stamcv) AS mcvbytes,
pg_mv_stats_mcvlist_info(S.stamcv) AS mcvinfo,
length(S.stahist) AS histbytes,
- pg_mv_stats_histogram_info(S.stahist) AS histinfo
+ pg_mv_stats_histogram_info(S.stahist) AS histinfo,
+ standcoeff AS ndcoeff
FROM (pg_mv_statistic S JOIN pg_class C ON (C.oid = S.starelid))
LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace);
diff --git a/src/backend/commands/analyze.c b/src/backend/commands/analyze.c
index 8ac9915..b4f5927 100644
--- a/src/backend/commands/analyze.c
+++ b/src/backend/commands/analyze.c
@@ -582,7 +582,7 @@ do_analyze_rel(Relation onerel, int options, VacuumParams *params,
}
/* Build multivariate stats (if there are any). */
- build_mv_stats(onerel, numrows, rows, attr_cnt, vacattrstats);
+ build_mv_stats(onerel, totalrows, numrows, rows, attr_cnt, vacattrstats);
}
/*
diff --git a/src/backend/commands/statscmds.c b/src/backend/commands/statscmds.c
index b974655..6ea0e13 100644
--- a/src/backend/commands/statscmds.c
+++ b/src/backend/commands/statscmds.c
@@ -138,7 +138,8 @@ CreateStatistics(CreateStatsStmt *stmt)
/* by default build nothing */
bool build_dependencies = false,
build_mcv = false,
- build_histogram = false;
+ build_histogram = false,
+ build_ndistinct = false;
int32 max_buckets = -1,
max_mcv_items = -1;
@@ -221,6 +222,8 @@ CreateStatistics(CreateStatsStmt *stmt)
if (strcmp(opt->defname, "dependencies") == 0)
build_dependencies = defGetBoolean(opt);
+ else if (strcmp(opt->defname, "ndistinct") == 0)
+ build_ndistinct = defGetBoolean(opt);
else if (strcmp(opt->defname, "mcv") == 0)
build_mcv = defGetBoolean(opt);
else if (strcmp(opt->defname, "max_mcv_items") == 0)
@@ -275,10 +278,10 @@ CreateStatistics(CreateStatsStmt *stmt)
}
/* check that at least some statistics were requested */
- if (! (build_dependencies || build_mcv || build_histogram))
+ if (! (build_dependencies || build_mcv || build_histogram || build_ndistinct))
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
- errmsg("no statistics type (dependencies, mcv, histogram) was requested")));
+ errmsg("no statistics type (dependencies, mcv, histogram, ndistinct) was requested")));
/* now do some checking of the options */
if (require_mcv && (! build_mcv))
@@ -311,6 +314,7 @@ CreateStatistics(CreateStatsStmt *stmt)
values[Anum_pg_mv_statistic_deps_enabled -1] = BoolGetDatum(build_dependencies);
values[Anum_pg_mv_statistic_mcv_enabled -1] = BoolGetDatum(build_mcv);
values[Anum_pg_mv_statistic_hist_enabled -1] = BoolGetDatum(build_histogram);
+ values[Anum_pg_mv_statistic_ndist_enabled-1] = BoolGetDatum(build_ndistinct);
values[Anum_pg_mv_statistic_mcv_max_items -1] = Int32GetDatum(max_mcv_items);
values[Anum_pg_mv_statistic_hist_max_buckets -1] = Int32GetDatum(max_buckets);
@@ -318,6 +322,7 @@ CreateStatistics(CreateStatsStmt *stmt)
nulls[Anum_pg_mv_statistic_stadeps -1] = true;
nulls[Anum_pg_mv_statistic_stamcv -1] = true;
nulls[Anum_pg_mv_statistic_stahist -1] = true;
+ nulls[Anum_pg_mv_statistic_standist -1] = true;
/* insert the tuple into pg_mv_statistic */
mvstatrel = heap_open(MvStatisticRelationId, RowExclusiveLock);
diff --git a/src/backend/optimizer/path/clausesel.c b/src/backend/optimizer/path/clausesel.c
index c1b8999..2540da9 100644
--- a/src/backend/optimizer/path/clausesel.c
+++ b/src/backend/optimizer/path/clausesel.c
@@ -59,6 +59,7 @@ static void addRangeClause(RangeQueryClause **rqlist, Node *clause,
#define MV_CLAUSE_TYPE_FDEP 0x01
#define MV_CLAUSE_TYPE_MCV 0x02
#define MV_CLAUSE_TYPE_HIST 0x04
+#define MV_CLAUSE_TYPE_NDIST 0x08
static bool clause_is_mv_compatible(Node *clause, Index relid, Bitmapset **attnums,
int type);
@@ -3246,6 +3247,9 @@ stats_type_matches(MVStatisticInfo *stat, int type)
if ((type & MV_CLAUSE_TYPE_HIST) && stat->hist_built)
return true;
+ if ((type & MV_CLAUSE_TYPE_NDIST) && stat->ndist_built)
+ return true;
+
return false;
}
diff --git a/src/backend/optimizer/util/plancat.c b/src/backend/optimizer/util/plancat.c
index 40145e7..328633e 100644
--- a/src/backend/optimizer/util/plancat.c
+++ b/src/backend/optimizer/util/plancat.c
@@ -416,7 +416,7 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
mvstat = (Form_pg_mv_statistic) GETSTRUCT(htup);
/* unavailable stats are not interesting for the planner */
- if (mvstat->deps_built || mvstat->mcv_built || mvstat->hist_built)
+ if (mvstat->deps_built || mvstat->mcv_built || mvstat->hist_built || mvstat->ndist_built)
{
info = makeNode(MVStatisticInfo);
@@ -427,11 +427,13 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
info->deps_enabled = mvstat->deps_enabled;
info->mcv_enabled = mvstat->mcv_enabled;
info->hist_enabled = mvstat->hist_enabled;
+ info->ndist_enabled = mvstat->ndist_enabled;
/* built/available statistics */
info->deps_built = mvstat->deps_built;
info->mcv_built = mvstat->mcv_built;
info->hist_built = mvstat->hist_built;
+ info->ndist_built = mvstat->ndist_built;
/* stakeys */
adatum = SysCacheGetAttr(MVSTATOID, htup,
diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index 7d0a3a1..a84dd2b 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -132,6 +132,7 @@
#include "utils/fmgroids.h"
#include "utils/index_selfuncs.h"
#include "utils/lsyscache.h"
+#include "utils/mvstats.h"
#include "utils/nabstime.h"
#include "utils/pg_locale.h"
#include "utils/rel.h"
@@ -206,6 +207,7 @@ static Const *string_to_const(const char *str, Oid datatype);
static Const *string_to_bytea_const(const char *str, size_t str_len);
static List *add_predicate_to_quals(IndexOptInfo *index, List *indexQuals);
+static Oid find_ndistinct_coeff(PlannerInfo *root, RelOptInfo *rel, List *varinfos);
/*
* eqsel - Selectivity of "=" for any data types.
@@ -3422,12 +3424,26 @@ estimate_num_groups(PlannerInfo *root, List *groupExprs, double input_rows,
* don't know by how much. We should never clamp to less than the
* largest ndistinct value for any of the Vars, though, since
* there will surely be at least that many groups.
+ *
+ * However we don't need to do this if we have ndistinct stats on
+ * the columns - in that case we can simply use the coefficient
+ * to get the (probably way more accurate) estimate.
+ *
+	 * XXX Probably needs refactoring (don't like mixing the clamp
+	 * and the coeff at the same time).
*/
double clamp = rel->tuples;
+ double coeff = 1.0;
if (relvarcount > 1)
{
- clamp *= 0.1;
+ Oid oid = find_ndistinct_coeff(root, rel, varinfos);
+
+ if (oid != InvalidOid)
+ coeff = load_mv_ndistinct(oid);
+ else
+ clamp *= 0.1;
+
if (clamp < relmaxndistinct)
{
clamp = relmaxndistinct;
@@ -3436,6 +3452,13 @@ estimate_num_groups(PlannerInfo *root, List *groupExprs, double input_rows,
clamp = rel->tuples;
}
}
+
+ /*
+ * Apply ndistinct coefficient from multivar stats (we must do this
+	 * before clamping the estimate in any way).
+ */
+ reldistinct /= coeff;
+
if (reldistinct > clamp)
reldistinct = clamp;
@@ -7582,3 +7605,71 @@ brincostestimate(PlannerInfo *root, IndexPath *path, double loop_count,
/* XXX what about pages_per_range? */
}
+
+/*
+ * Find applicable ndistinct statistics and compute the coefficient to
+ * correct the estimate (simply a product of per-column ndistincts).
+ *
+ * Currently we only look for a perfect match, i.e. a single statistics
+ * whose columns exactly match the grouped columns.
+ */
+static Oid
+find_ndistinct_coeff(PlannerInfo *root, RelOptInfo *rel, List *varinfos)
+{
+ ListCell *lc;
+ Bitmapset *attnums = NULL;
+ VariableStatData vardata;
+
+ foreach(lc, varinfos)
+ {
+ GroupVarInfo *varinfo = (GroupVarInfo *) lfirst(lc);
+
+ if (varinfo->rel != rel)
+ continue;
+
+		/* FIXME handle general expressions (not just plain Vars) */
+
+ /*
+ * examine the variable (or expression) so that we know which
+ * attribute we're dealing with - we need this for matching the
+ * ndistinct coefficient
+ *
+ * FIXME probably might remember this from estimate_num_groups
+ */
+ examine_variable(root, varinfo->var, 0, &vardata);
+
+ if (HeapTupleIsValid(vardata.statsTuple))
+ {
+ Form_pg_statistic stats
+ = (Form_pg_statistic) GETSTRUCT(vardata.statsTuple);
+
+ attnums = bms_add_member(attnums, stats->staattnum);
+
+ ReleaseVariableStats(vardata);
+ }
+ }
+
+ /* look for a matching ndistinct statistics */
+ foreach (lc, rel->mvstatlist)
+ {
+ int i;
+ MVStatisticInfo *info = (MVStatisticInfo *)lfirst(lc);
+
+ /* skip statistics without ndistinct coefficient built */
+ if (!info->ndist_built)
+ continue;
+
+ /* only exact matches for now (same set of columns) */
+ if (bms_num_members(attnums) != info->stakeys->dim1)
+ continue;
+
+		/* check that all the columns match */
+		for (i = 0; i < info->stakeys->dim1; i++)
+			if (!bms_is_member(info->stakeys->values[i], attnums))
+				break;
+
+		/* some column didn't match, try the next statistics */
+		if (i < info->stakeys->dim1)
+			continue;
+
+		return info->mvoid;
+ }
+
+ return InvalidOid;
+}
diff --git a/src/backend/utils/mvstats/Makefile b/src/backend/utils/mvstats/Makefile
index 9dbb3b6..d4b88e9 100644
--- a/src/backend/utils/mvstats/Makefile
+++ b/src/backend/utils/mvstats/Makefile
@@ -12,6 +12,6 @@ subdir = src/backend/utils/mvstats
top_builddir = ../../../..
include $(top_builddir)/src/Makefile.global
-OBJS = common.o dependencies.o histogram.o mcv.o
+OBJS = common.o dependencies.o histogram.o mcv.o mvdist.o
include $(top_srcdir)/src/backend/common.mk
diff --git a/src/backend/utils/mvstats/README.ndistinct b/src/backend/utils/mvstats/README.ndistinct
new file mode 100644
index 0000000..32d1624
--- /dev/null
+++ b/src/backend/utils/mvstats/README.ndistinct
@@ -0,0 +1,83 @@
+ndistinct coefficients
+======================
+
+Estimating the number of distinct groups in a combination of columns is tricky,
+and the estimation error is often significant. By ndistinct coefficient we
+mean the ratio
+
+ q = ndistinct(a) * ndistinct(b) / ndistinct(a,b)
+
+where 'a' and 'b' are columns, ndistinct(a) is (an estimate of) the number of
+distinct values in column 'a', and ndistinct(a,b) is the same thing for the
+pair of columns.
+
+The meaning of the coefficient may be illustrated by answering the following
+question: Given a combination of columns (a,b), how many distinct values of 'b'
+matches a chosen value of 'a' on average?
+
+Let's assume we know ndistinct(a) and ndistinct(a,b). Then the answer to the
+question clearly is
+
+ ndistinct(a,b) / ndistinct(a)
+
+and by using 'q' we may rewrite this as
+
+ ndistinct(b) / q
+
+so 'q' may be considered as a correction factor of the ndistinct estimate given
+a condition on one of the columns.
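+
+As a concrete (made-up) example, assume ndistinct(a) = 100, ndistinct(b) = 100
+and ndistinct(a,b) = 1000 (under independence we'd expect up to 10000). Then
+
+    q = 100 * 100 / 1000 = 10
+
+and a condition fixing 'a' reduces the expected number of distinct values of
+'b' from ndistinct(b) = 100 to ndistinct(b) / q = 10.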
+
+This may be generalized to a combination of 'n' columns
+
+ [ndistinct(c1) * ... * ndistinct(cn)] / ndistinct(c1, ..., cn)
+
+and the meaning is very similar, except that we need to use conditions on (n-1)
+of the columns.
+
+
+Selectivity estimation
+----------------------
+
+As explained in the previous paragraph, ndistinct coefficients may be used to
+estimate the cardinality of a column, given some a priori knowledge. Let's
+assume we need to estimate the selectivity of a condition
+
+ (a=1) AND (b=2)
+
+which we can expand like this
+
+ P(a=1 & b=2) = P(a=1) * P(b=2 | a=1)
+
+Let's also assume that the distributions are uniform, i.e. that
+
+ P(a=1) = 1/ndistinct(a)
+ P(b=2) = 1/ndistinct(b)
+ P(a=1 & b=2) = 1/ndistinct(a,b)
+
+ P(b=2 | a=1) = ndistinct(a) / ndistinct(a,b)
+
+which may be rewritten like
+
+ P(b=2 | a=1)
+	  = ndistinct(a) / ndistinct(a,b)
+ = (1/ndistinct(b)) * [(ndistinct(a) * ndistinct(b)) / ndistinct(a,b)]
+ = (1/ndistinct(b)) * q
+
+and therefore
+
+ P(a=1 & b=2) = (1/ndistinct(a)) * (1/ndistinct(b)) * q
+
+This also illustrates 'q' as a correction coefficient.
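+
+Continuing the numeric example above (ndistinct(a) = 100, ndistinct(b) = 100,
+ndistinct(a,b) = 1000, q = 10):
+
+    P(a=1 & b=2) = (1/100) * (1/100) * 10 = 1/1000
+
+i.e. ten times the estimate we'd get under the independence assumption.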
+
+It also explains why we store the coefficient and not simply ndistinct(a,b).
+This way we can estimate the individual clauses and then correct the result
+by multiplying it with 'q' - we don't have to mess with the ndistinct
+estimates at all.
+
+Naturally, as the coefficient is derived from ndistinct(a,b), it may also be
+used to estimate GROUP BY clauses on the combination of columns, replacing the
+existing heuristics in estimate_num_groups().
+
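+For example, on a hypothetical table 't' with correlated columns (a,b), the
+coefficient is built using the syntax from this patch series:
+
+    CREATE STATISTICS s ON t (a,b) WITH (ndistinct);
+    ANALYZE t;
+
+    -- the GROUP BY estimate may now use the (a,b) ndistinct coefficient
+    EXPLAIN SELECT 1 FROM t GROUP BY a, b;
+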
+Note: Currently only the GROUP BY estimation is implemented. It's a bit unclear
+how to implement the clause estimation when there are other statistics (esp.
+MCV lists and/or functional dependencies) available.
diff --git a/src/backend/utils/mvstats/README.stats b/src/backend/utils/mvstats/README.stats
index d404914..6d4b09b 100644
--- a/src/backend/utils/mvstats/README.stats
+++ b/src/backend/utils/mvstats/README.stats
@@ -20,6 +20,8 @@ Currently we only have two kinds of multivariate statistics
(c) multivariate histograms (README.histogram)
+ (d) ndistinct coefficients
+
Compatible clause types
-----------------------
diff --git a/src/backend/utils/mvstats/common.c b/src/backend/utils/mvstats/common.c
index ffb76f4..2be980d 100644
--- a/src/backend/utils/mvstats/common.c
+++ b/src/backend/utils/mvstats/common.c
@@ -32,7 +32,8 @@ static List* list_mv_stats(Oid relid);
* and serializes them back into the catalog (as bytea values).
*/
void
-build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
+build_mv_stats(Relation onerel, double totalrows,
+ int numrows, HeapTuple *rows,
int natts, VacAttrStats **vacattrstats)
{
ListCell *lc;
@@ -53,6 +54,7 @@ build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
MVDependencies deps = NULL;
MCVList mcvlist = NULL;
MVHistogram histogram = NULL;
+ double ndist = -1;
int numrows_filtered = numrows;
VacAttrStats **stats = NULL;
@@ -92,6 +94,9 @@ build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
if (stat->deps_enabled)
deps = build_mv_dependencies(numrows, rows, attrs, stats);
+ if (stat->ndist_enabled)
+ ndist = build_mv_ndistinct(totalrows, numrows, rows, attrs, stats);
+
/* build the MCV list */
if (stat->mcv_enabled)
mcvlist = build_mv_mcvlist(numrows, rows, attrs, stats, &numrows_filtered);
@@ -101,7 +106,7 @@ build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
histogram = build_mv_histogram(numrows_filtered, rows, attrs, stats, numrows);
/* store the histogram / MCV list in the catalog */
- update_mv_stats(stat->mvoid, deps, mcvlist, histogram, attrs, stats);
+ update_mv_stats(stat->mvoid, deps, mcvlist, histogram, ndist, attrs, stats);
}
}
@@ -183,6 +188,8 @@ list_mv_stats(Oid relid)
info->mcv_built = stats->mcv_built;
info->hist_enabled = stats->hist_enabled;
info->hist_built = stats->hist_built;
+ info->ndist_enabled = stats->ndist_enabled;
+ info->ndist_built = stats->ndist_built;
result = lappend(result, info);
}
@@ -252,7 +259,7 @@ find_mv_attnums(Oid mvoid, Oid *relid)
void
update_mv_stats(Oid mvoid,
MVDependencies dependencies, MCVList mcvlist, MVHistogram histogram,
- int2vector *attrs, VacAttrStats **stats)
+ double ndistcoeff, int2vector *attrs, VacAttrStats **stats)
{
HeapTuple stup,
oldtup;
@@ -292,26 +299,36 @@ update_mv_stats(Oid mvoid,
= PointerGetDatum(data);
}
+ if (ndistcoeff > 1.0)
+ {
+ nulls[Anum_pg_mv_statistic_standist -1] = false;
+ values[Anum_pg_mv_statistic_standist-1] = Float8GetDatum(ndistcoeff);
+ }
+
/* always replace the value (either by bytea or NULL) */
replaces[Anum_pg_mv_statistic_stadeps -1] = true;
replaces[Anum_pg_mv_statistic_stamcv -1] = true;
replaces[Anum_pg_mv_statistic_stahist-1] = true;
+ replaces[Anum_pg_mv_statistic_standist-1] = true;
/* always change the availability flags */
nulls[Anum_pg_mv_statistic_deps_built -1] = false;
nulls[Anum_pg_mv_statistic_mcv_built -1] = false;
nulls[Anum_pg_mv_statistic_hist_built-1] = false;
+ nulls[Anum_pg_mv_statistic_ndist_built-1] = false;
nulls[Anum_pg_mv_statistic_stakeys-1] = false;
/* use the new attnums, in case we removed some dropped ones */
replaces[Anum_pg_mv_statistic_deps_built-1] = true;
replaces[Anum_pg_mv_statistic_mcv_built -1] = true;
+ replaces[Anum_pg_mv_statistic_ndist_built-1] = true;
replaces[Anum_pg_mv_statistic_hist_built -1] = true;
replaces[Anum_pg_mv_statistic_stakeys -1] = true;
values[Anum_pg_mv_statistic_deps_built-1] = BoolGetDatum(dependencies != NULL);
values[Anum_pg_mv_statistic_mcv_built -1] = BoolGetDatum(mcvlist != NULL);
values[Anum_pg_mv_statistic_hist_built -1] = BoolGetDatum(histogram != NULL);
+ values[Anum_pg_mv_statistic_ndist_built-1] = BoolGetDatum(ndistcoeff > 1.0);
values[Anum_pg_mv_statistic_stakeys -1] = PointerGetDatum(attrs);
/* Is there already a pg_mv_statistic tuple for this attribute? */
diff --git a/src/backend/utils/mvstats/mvdist.c b/src/backend/utils/mvstats/mvdist.c
new file mode 100644
index 0000000..59b8358
--- /dev/null
+++ b/src/backend/utils/mvstats/mvdist.c
@@ -0,0 +1,171 @@
+/*-------------------------------------------------------------------------
+ *
+ * mvdist.c
+ * POSTGRES multivariate distinct coefficients
+ *
+ *
+ * Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ * src/backend/utils/mvstats/mvdist.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include <math.h>
+
+#include "common.h"
+#include "utils/lsyscache.h"
+
+static double estimate_ndistinct(double totalrows, int numrows, int d, int f1);
+
+/*
+ * Compute ndistinct coefficient for the combination of attributes. This
+ * computes the ndistinct estimate using the same estimator used in analyze.c
+ * and then computes the coefficient.
+ */
+double
+build_mv_ndistinct(double totalrows, int numrows, HeapTuple *rows,
+ int2vector *attrs, VacAttrStats **stats)
+{
+ int i, j;
+ int f1, cnt, d;
+	int			nmultiple = 0,
+				summultiple = 0;
+ int numattrs = attrs->dim1;
+ MultiSortSupport mss = multi_sort_init(numattrs);
+ double ndistcoeff;
+
+ /*
+	 * It's possible to sort the sample rows directly, but this seemed
+	 * somewhat simpler / less error prone. Another option would be to
+ * allocate the arrays for each SortItem separately, but that'd be
+ * significant overhead (not just CPU, but especially memory bloat).
+ */
+ SortItem * items = (SortItem*)palloc0(numrows * sizeof(SortItem));
+
+ Datum *values = (Datum*)palloc0(sizeof(Datum) * numrows * numattrs);
+ bool *isnull = (bool*)palloc0(sizeof(bool) * numrows * numattrs);
+
+ for (i = 0; i < numrows; i++)
+ {
+ items[i].values = &values[i * numattrs];
+ items[i].isnull = &isnull[i * numattrs];
+ }
+
+ Assert(numattrs >= 2);
+
+ for (i = 0; i < numattrs; i++)
+ {
+		/* prepare the sort function for this dimension */
+ multi_sort_add_dimension(mss, i, i, stats);
+
+ /* accumulate all the data into the array and sort it */
+ for (j = 0; j < numrows; j++)
+ {
+ items[j].values[i]
+ = heap_getattr(rows[j], attrs->values[i],
+ stats[i]->tupDesc, &items[j].isnull[i]);
+ }
+ }
+
+ qsort_arg((void *) items, numrows, sizeof(SortItem),
+ multi_sort_compare, mss);
+
+ /* count number of distinct combinations */
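+	/* (d counts the distinct combinations, f1 those occurring exactly
+	 * once; both are inputs of the Duj1 estimator below) */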
+
+ f1 = 0;
+ cnt = 1;
+ d = 1;
+ for (i = 1; i < numrows; i++)
+ {
+ if (multi_sort_compare(&items[i], &items[i-1], mss) != 0)
+ {
+ if (cnt == 1)
+ f1 += 1;
+ else
+ {
+ nmultiple += 1;
+ summultiple += cnt;
+ }
+
+ d++;
+ cnt = 0;
+ }
+
+ cnt += 1;
+ }
+
+ if (cnt == 1)
+ f1 += 1;
+ else
+ {
+ nmultiple += 1;
+ summultiple += cnt;
+ }
+
+ ndistcoeff = 1 / estimate_ndistinct(totalrows, numrows, d, f1);
+
+ /*
+ * now count distinct values for each attribute and incrementally
+ * compute ndistinct(a,b) / (ndistinct(a) * ndistinct(b))
+ *
+ * FIXME Probably need to handle cases when one of the ndistinct
+ * estimates is negative, and also check that the combined
+ * ndistinct is greater than any of those partial values.
+ */
+ for (i = 0; i < numattrs; i++)
+ ndistcoeff *= stats[i]->stadistinct;
+
+ return ndistcoeff;
+}
+
+double
+load_mv_ndistinct(Oid mvoid)
+{
+ bool isnull = false;
+ Datum deps;
+
+	/* Fetch the pg_mv_statistic tuple for this statistics OID. */
+ HeapTuple htup = SearchSysCache1(MVSTATOID, ObjectIdGetDatum(mvoid));
+
+#ifdef USE_ASSERT_CHECKING
+ Form_pg_mv_statistic mvstat = (Form_pg_mv_statistic) GETSTRUCT(htup);
+ Assert(mvstat->ndist_enabled && mvstat->ndist_built);
+#endif
+
+ deps = SysCacheGetAttr(MVSTATOID, htup,
+ Anum_pg_mv_statistic_standist, &isnull);
+
+ Assert(!isnull);
+
+ ReleaseSysCache(htup);
+
+ return DatumGetFloat8(deps);
+}
+
+/* The Duj1 estimator (already used in analyze.c). */
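+/*
+ * Computes n*d / (n - f1 + f1*n/N), where n is the sample size, N the
+ * total number of rows, d the number of distinct values observed in the
+ * sample and f1 the number of values observed exactly once.
+ */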
+static double
+estimate_ndistinct(double totalrows, int numrows, int d, int f1)
+{
+ double numer,
+ denom,
+ ndistinct;
+
+ numer = (double) numrows *(double) d;
+
+ denom = (double) (numrows - f1) +
+ (double) f1 * (double) numrows / totalrows;
+
+ ndistinct = numer / denom;
+
+ /* Clamp to sane range in case of roundoff error */
+ if (ndistinct < (double) d)
+ ndistinct = (double) d;
+
+ if (ndistinct > totalrows)
+ ndistinct = totalrows;
+
+ return floor(ndistinct + 0.5);
+}
diff --git a/src/include/catalog/pg_mv_statistic.h b/src/include/catalog/pg_mv_statistic.h
index a5945af..ee353da 100644
--- a/src/include/catalog/pg_mv_statistic.h
+++ b/src/include/catalog/pg_mv_statistic.h
@@ -39,6 +39,7 @@ CATALOG(pg_mv_statistic,3381)
bool deps_enabled; /* analyze dependencies? */
bool mcv_enabled; /* build MCV list? */
bool hist_enabled; /* build histogram? */
+ bool ndist_enabled; /* build ndist coefficient? */
/* histogram / MCV size */
int32 mcv_max_items; /* max MCV items */
@@ -48,6 +49,7 @@ CATALOG(pg_mv_statistic,3381)
bool deps_built; /* dependencies were built */
bool mcv_built; /* MCV list was built */
bool hist_built; /* histogram was built */
+ bool ndist_built; /* ndistinct coeff built */
/* variable-length fields start here, but we allow direct access to stakeys */
int2vector stakeys; /* array of column keys */
@@ -56,6 +58,7 @@ CATALOG(pg_mv_statistic,3381)
bytea stadeps; /* dependencies (serialized) */
bytea stamcv; /* MCV list (serialized) */
bytea stahist; /* MV histogram (serialized) */
+ float8 standcoeff; /* ndistinct coeff (serialized) */
#endif
} FormData_pg_mv_statistic;
@@ -71,21 +74,24 @@ typedef FormData_pg_mv_statistic *Form_pg_mv_statistic;
* compiler constants for pg_mv_statistic
* ----------------
*/
-#define Natts_pg_mv_statistic 15
+#define Natts_pg_mv_statistic 18
#define Anum_pg_mv_statistic_starelid 1
#define Anum_pg_mv_statistic_staname 2
#define Anum_pg_mv_statistic_stanamespace 3
#define Anum_pg_mv_statistic_deps_enabled 4
#define Anum_pg_mv_statistic_mcv_enabled 5
#define Anum_pg_mv_statistic_hist_enabled 6
-#define Anum_pg_mv_statistic_mcv_max_items 7
-#define Anum_pg_mv_statistic_hist_max_buckets 8
-#define Anum_pg_mv_statistic_deps_built 9
-#define Anum_pg_mv_statistic_mcv_built 10
-#define Anum_pg_mv_statistic_hist_built 11
-#define Anum_pg_mv_statistic_stakeys 12
-#define Anum_pg_mv_statistic_stadeps 13
-#define Anum_pg_mv_statistic_stamcv 14
-#define Anum_pg_mv_statistic_stahist 15
+#define Anum_pg_mv_statistic_ndist_enabled 7
+#define Anum_pg_mv_statistic_mcv_max_items 8
+#define Anum_pg_mv_statistic_hist_max_buckets 9
+#define Anum_pg_mv_statistic_deps_built 10
+#define Anum_pg_mv_statistic_mcv_built 11
+#define Anum_pg_mv_statistic_hist_built 12
+#define Anum_pg_mv_statistic_ndist_built 13
+#define Anum_pg_mv_statistic_stakeys 14
+#define Anum_pg_mv_statistic_stadeps 15
+#define Anum_pg_mv_statistic_stamcv 16
+#define Anum_pg_mv_statistic_stahist 17
+#define Anum_pg_mv_statistic_standist 18
#endif /* PG_MV_STATISTIC_H */
diff --git a/src/include/nodes/relation.h b/src/include/nodes/relation.h
index 8c50bfb..1923f2b 100644
--- a/src/include/nodes/relation.h
+++ b/src/include/nodes/relation.h
@@ -655,11 +655,13 @@ typedef struct MVStatisticInfo
bool deps_enabled; /* functional dependencies enabled */
bool mcv_enabled; /* MCV list enabled */
bool hist_enabled; /* histogram enabled */
+ bool ndist_enabled; /* ndistinct coefficient enabled */
/* built/available statistics */
bool deps_built; /* functional dependencies built */
bool mcv_built; /* MCV list built */
bool hist_built; /* histogram built */
+ bool ndist_built; /* ndistinct coefficient built */
/* columns in the statistics (attnums) */
int2vector *stakeys; /* attnums of the columns covered */
diff --git a/src/include/utils/mvstats.h b/src/include/utils/mvstats.h
index 35b2f8e..fb2c5d8 100644
--- a/src/include/utils/mvstats.h
+++ b/src/include/utils/mvstats.h
@@ -225,6 +225,7 @@ typedef MVSerializedHistogramData *MVSerializedHistogram;
MVDependencies load_mv_dependencies(Oid mvoid);
MCVList load_mv_mcvlist(Oid mvoid);
MVSerializedHistogram load_mv_histogram(Oid mvoid);
+double load_mv_ndistinct(Oid mvoid);
bytea * serialize_mv_dependencies(MVDependencies dependencies);
bytea * serialize_mv_mcvlist(MCVList mcvlist, int2vector *attrs,
@@ -266,11 +267,17 @@ MVHistogram
build_mv_histogram(int numrows, HeapTuple *rows, int2vector *attrs,
VacAttrStats **stats, int numrows_total);
-void build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
+double
+build_mv_ndistinct(double totalrows, int numrows, HeapTuple *rows,
+ int2vector *attrs, VacAttrStats **stats);
+
+void build_mv_stats(Relation onerel, double totalrows,
+ int numrows, HeapTuple *rows,
int natts, VacAttrStats **vacattrstats);
void update_mv_stats(Oid relid, MVDependencies dependencies,
MCVList mcvlist, MVHistogram histogram,
+ double ndistcoeff,
int2vector *attrs, VacAttrStats **stats);
#ifdef DEBUG_MVHIST
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 1a1a4ca..0ad935e 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1377,7 +1377,8 @@ pg_mv_stats| SELECT n.nspname AS schemaname,
length(s.stamcv) AS mcvbytes,
pg_mv_stats_mcvlist_info(s.stamcv) AS mcvinfo,
length(s.stahist) AS histbytes,
- pg_mv_stats_histogram_info(s.stahist) AS histinfo
+ pg_mv_stats_histogram_info(s.stahist) AS histinfo,
+ s.standcoeff AS ndcoeff
FROM ((pg_mv_statistic s
JOIN pg_class c ON ((c.oid = s.starelid)))
LEFT JOIN pg_namespace n ON ((n.oid = c.relnamespace)));
--
2.1.0
Attachment: 0008-change-how-we-apply-selectivity-to-number-of-groups-.patch (text/x-patch)
From 19fae36e03b6e2b4cd2ea1702ffbe9676c0aca52 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@pgaddict.com>
Date: Tue, 26 Jan 2016 18:14:33 +0100
Subject: [PATCH 8/9] change how we apply selectivity to number of groups
estimate
Instead of simply multiplying the ndistinct estimate by the selectivity,
we instead use the formula for the expected number of distinct values
observed in 'k' rows when there are 'd' distinct values in the bin
d * (1 - ((d - 1) / d)^k)
This is 'with replacement', which seems appropriate for this use, and it
mostly assumes uniform distribution of the distinct values. So if the
distribution is not uniform (e.g. there are very frequent groups) this
may be less accurate than the current algorithm in some cases, giving
over-estimates. But that's probably better than OOM.
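As a quick sanity check of the formula: with d = 100 distinct values and
k = 10 rows, it gives 100 * (1 - (99/100)^10) ~= 9.6 expected groups,
slightly below the natural upper bound min(k, d) = 10.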
---
src/backend/utils/adt/selfuncs.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index a84dd2b..ce3ad19 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -3465,7 +3465,7 @@ estimate_num_groups(PlannerInfo *root, List *groupExprs, double input_rows,
/*
* Multiply by restriction selectivity.
*/
- reldistinct *= rel->rows / rel->tuples;
+	reldistinct = reldistinct * (1 - powl((reldistinct - 1) / reldistinct, rel->rows));
/*
* Update estimate of total distinct groups.
--
2.1.0
Attachment: 0009-fixup-of-regression-tests-plans-changes-by-group-by-.patch (text/x-patch)
From d37345b7e2a8868c5ca44507c3402affaaa0cb07 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@pgaddict.com>
Date: Sun, 28 Feb 2016 21:16:40 +0100
Subject: [PATCH 9/9] fixup of regression tests (plans changes by group by
estimation)
---
src/test/regress/expected/join.out | 18 ++++++++++--------
src/test/regress/expected/subselect.out | 25 +++++++++++--------------
src/test/regress/expected/union.out | 16 ++++++++--------
3 files changed, 29 insertions(+), 30 deletions(-)
diff --git a/src/test/regress/expected/join.out b/src/test/regress/expected/join.out
index cafbc5e..151402d 100644
--- a/src/test/regress/expected/join.out
+++ b/src/test/regress/expected/join.out
@@ -3965,18 +3965,20 @@ select d.* from d left join (select * from b group by b.id, b.c_id) s
explain (costs off)
select d.* from d left join (select distinct * from b) s
on d.a = s.id;
- QUERY PLAN
---------------------------------------
+ QUERY PLAN
+---------------------------------------------
Merge Right Join
- Merge Cond: (b.id = d.a)
- -> Unique
- -> Sort
- Sort Key: b.id, b.c_id
- -> Seq Scan on b
+ Merge Cond: (s.id = d.a)
+ -> Sort
+ Sort Key: s.id
+ -> Subquery Scan on s
+ -> HashAggregate
+ Group Key: b.id, b.c_id
+ -> Seq Scan on b
-> Sort
Sort Key: d.a
-> Seq Scan on d
-(9 rows)
+(11 rows)
-- check join removal works when uniqueness of the join condition is enforced
-- by a UNION
diff --git a/src/test/regress/expected/subselect.out b/src/test/regress/expected/subselect.out
index de64ca7..0fc93d9 100644
--- a/src/test/regress/expected/subselect.out
+++ b/src/test/regress/expected/subselect.out
@@ -807,27 +807,24 @@ select * from int4_tbl where
explain (verbose, costs off)
select * from int4_tbl o where (f1, f1) in
(select f1, generate_series(1,2) / 10 g from int4_tbl i group by f1);
- QUERY PLAN
-----------------------------------------------------------------------
- Hash Join
+ QUERY PLAN
+----------------------------------------------------------------
+ Hash Semi Join
Output: o.f1
Hash Cond: (o.f1 = "ANY_subquery".f1)
-> Seq Scan on public.int4_tbl o
Output: o.f1
-> Hash
Output: "ANY_subquery".f1, "ANY_subquery".g
- -> HashAggregate
+ -> Subquery Scan on "ANY_subquery"
Output: "ANY_subquery".f1, "ANY_subquery".g
- Group Key: "ANY_subquery".f1, "ANY_subquery".g
- -> Subquery Scan on "ANY_subquery"
- Output: "ANY_subquery".f1, "ANY_subquery".g
- Filter: ("ANY_subquery".f1 = "ANY_subquery".g)
- -> HashAggregate
- Output: i.f1, (generate_series(1, 2) / 10)
- Group Key: i.f1
- -> Seq Scan on public.int4_tbl i
- Output: i.f1
-(18 rows)
+ Filter: ("ANY_subquery".f1 = "ANY_subquery".g)
+ -> HashAggregate
+ Output: i.f1, (generate_series(1, 2) / 10)
+ Group Key: i.f1
+ -> Seq Scan on public.int4_tbl i
+ Output: i.f1
+(15 rows)
select * from int4_tbl o where (f1, f1) in
(select f1, generate_series(1,2) / 10 g from int4_tbl i group by f1);
diff --git a/src/test/regress/expected/union.out b/src/test/regress/expected/union.out
index 016571b..f2e297e 100644
--- a/src/test/regress/expected/union.out
+++ b/src/test/regress/expected/union.out
@@ -263,16 +263,16 @@ ORDER BY 1;
SELECT q2 FROM int8_tbl INTERSECT SELECT q1 FROM int8_tbl;
q2
------------------
- 4567890123456789
123
+ 4567890123456789
(2 rows)
SELECT q2 FROM int8_tbl INTERSECT ALL SELECT q1 FROM int8_tbl;
q2
------------------
+ 123
4567890123456789
4567890123456789
- 123
(3 rows)
SELECT q2 FROM int8_tbl EXCEPT SELECT q1 FROM int8_tbl ORDER BY 1;
@@ -305,16 +305,16 @@ SELECT q1 FROM int8_tbl EXCEPT SELECT q2 FROM int8_tbl;
SELECT q1 FROM int8_tbl EXCEPT ALL SELECT q2 FROM int8_tbl;
q1
------------------
- 4567890123456789
123
+ 4567890123456789
(2 rows)
SELECT q1 FROM int8_tbl EXCEPT ALL SELECT DISTINCT q2 FROM int8_tbl;
q1
------------------
+ 123
4567890123456789
4567890123456789
- 123
(3 rows)
SELECT q1 FROM int8_tbl EXCEPT ALL SELECT q1 FROM int8_tbl FOR NO KEY UPDATE;
@@ -343,8 +343,8 @@ SELECT f1 FROM float8_tbl EXCEPT SELECT f1 FROM int4_tbl ORDER BY 1;
SELECT q1 FROM int8_tbl INTERSECT SELECT q2 FROM int8_tbl UNION ALL SELECT q2 FROM int8_tbl;
q1
-------------------
- 4567890123456789
123
+ 4567890123456789
456
4567890123456789
123
@@ -355,15 +355,15 @@ SELECT q1 FROM int8_tbl INTERSECT SELECT q2 FROM int8_tbl UNION ALL SELECT q2 FR
SELECT q1 FROM int8_tbl INTERSECT (((SELECT q2 FROM int8_tbl UNION ALL SELECT q2 FROM int8_tbl)));
q1
------------------
- 4567890123456789
123
+ 4567890123456789
(2 rows)
(((SELECT q1 FROM int8_tbl INTERSECT SELECT q2 FROM int8_tbl))) UNION ALL SELECT q2 FROM int8_tbl;
q1
-------------------
- 4567890123456789
123
+ 4567890123456789
456
4567890123456789
123
@@ -419,8 +419,8 @@ HINT: There is a column named "q2" in table "*SELECT* 2", but it cannot be refe
SELECT q1 FROM int8_tbl EXCEPT (((SELECT q2 FROM int8_tbl ORDER BY q2 LIMIT 1)));
q1
------------------
- 4567890123456789
123
+ 4567890123456789
(2 rows)
--
--
2.1.0
Hi,
I gave a very quick skim to patch 0002. Not a real review yet. But
there are a few trivial points to fix:
* You still have empty sections in the SGML docs (such as the EXAMPLES).
I suppose the syntax is now firm enough that we can get some. (I looked
at the other patches to see whether it was filled in, but couldn't find
any additional text there.)
* check_object_ownership() needs to be filled in
* Since you're adding a new object type, please add a case to cover it
in the object_address.sql pg_regress test.
* in analyze.c (and elsewhere), please put new #include lines sorted.
* I think the AT_PASS_ADD_STATS is a leftover which should be removed.
* The XXX comment in get_relation_info should probably be handled
differently (namely, in a way that makes the syscache not contain OIDs
of dropped stats)
* The README.dependencies has a lot of TODOs. Do we need to get them
done during the first cut? If not, I suggest creating a new section
"Future work" in the file.
* Please put the common.h header in src/include. Make sure not to
include "postgres.h" in it -- our policy is that postgres.h goes at the
top of every .c file and never in any .h file. Also please find a
better name for it; even mvstats_common.h would be a lot more
convincing. However:
* ISTM that the code in common.c properly belongs in
src/backend/catalog/pg_mvstats.c instead (or more properly
catalog/pg_mv_statistics.c), which probably means the common.h file
should be named something else; perhaps some of it could become
pg_mv_statistic_fn.h, while the rest continues to be
src/include/utils/mvstats_common.h? Not sure.
* The version check in psql/describe.c uses 90500; should probably be
updated to 90600.
* _copyCreateStatsStmt is missing if_not_exists
--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Hi,
thanks for the feedback. Attached is v14 of the patch series, fixing
most of the points you've raised.
On Wed, 2016-03-09 at 09:22 -0300, Alvaro Herrera wrote:
Hi,
I gave a very quick skim to patch 0002. Not a real review yet. But
there are a few trivial points to fix:
* You still have empty sections in the SGML docs (such as the EXAMPLES).
I suppose the syntax is now firm enough that we can get some. (I looked
at the other patches to see whether it was filled in, but couldn't find
any additional text there.)
Yes, that's one of the items I plan to work on next. Until now the
regression tests were a sufficient source of examples, but it's time to
do the SGML piece.
* check_object_ownership() needs to be filled in
Done.
I've added pg_statistics_ownercheck, which also required adding OID of
the owner to the catalog. Initially the plan was to use the same owner
as for the table, but now that we've switched to CREATE STATISTICS
partially because it will allow multi-table stats, that does not make
sense (multiple tables with different owners).
This probably means we also need an 'ALTER STATISTICS ... OWNER TO'
command, which does not exist at this point.
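Presumably something like this (not implemented yet, names made up):
    ALTER STATISTICS myschema.mystats OWNER TO new_owner;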
* Since you're adding a new object type, please add a case to cover it
in the object_address.sql pg_regress test.
Done.
Apparently there was a bunch of missing pieces in objectaddress.c, so
this adds them too.
* in analyze.c (and elsewhere), please put new #include lines sorted.
Done.
I've also significantly reduced the excessive list of includes in
statscmds.c. I expect the headers to require a bit more love, especially
in the subsequent patches (MCV, histograms etc.).
* I think the AT_PASS_ADD_STATS is a leftover which should be removed.
Yeah. Now that we've invented CREATE STATISTICS, all the changes to
tablecmds.c were just unnecessary leftovers. Removed.
* The XXX comment in get_relation_info should probably be handled
differently (namely, in a way that makes the syscache not contain OIDs
of dropped stats)
I believe that was actually an obsolete comment. Removed.
* The README.dependencies has a lot of TODOs. Do we need to get them
done during the first cut? If not, I suggest creating a new section
"Future work" in the file.
Right. Most of those TODOs are future work, or rather ideas (more or
less crazy). The one thing I definitely want to address now is support
for dependencies with multiple columns on the left side, because that
requires changes to serialized format. I might also look at handling IS
NULL clauses, but that may wait.
* Please put the common.h header in src/include. Make sure not to
include "postgres.h" in it -- our policy is that postgres.h goes at the
top of every .c file and never in any .h file. Also please find a
better name for it; even mvstats_common.h would be a lot more
convincing. However:
* ISTM that the code in common.c properly belongs in
src/backend/catalog/pg_mvstats.c instead (or more properly
catalog/pg_mv_statistics.c), which probably means the common.h file
should be named something else; perhaps some of it could become
pg_mv_statistic_fn.h, while the rest continues to be
src/include/utils/mvstats_common.h? Not sure.
Hmmm, not sure either. The idea was that the "common.h" is pretty much
just a private header with stuff that's not very useful anywhere else.
No changes here, for now.
* The version check in psql/describe.c uses 90500; should probably be
updated to 90600.
Fixed.
* _copyCreateStatsStmt is missing if_not_exists
Fixed.
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachments:
Attachment: 0001-teach-pull_-varno-varattno-_walker-about-RestrictInf.patch (text/x-patch)
From 5c28e5ca8feb2c2010d98bc69de952355bd6f3a5 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@pgaddict.com>
Date: Tue, 28 Apr 2015 19:56:33 +0200
Subject: [PATCH 1/9] teach pull_(varno|varattno)_walker about RestrictInfo
otherwise pull_varnos fails when processing OR clauses
---
src/backend/optimizer/util/var.c | 16 ++++++++++++++++
1 file changed, 16 insertions(+)
diff --git a/src/backend/optimizer/util/var.c b/src/backend/optimizer/util/var.c
index dff52c4..80d01bd 100644
--- a/src/backend/optimizer/util/var.c
+++ b/src/backend/optimizer/util/var.c
@@ -197,6 +197,13 @@ pull_varnos_walker(Node *node, pull_varnos_context *context)
context->sublevels_up--;
return result;
}
+ if (IsA(node, RestrictInfo))
+ {
+ RestrictInfo *rinfo = (RestrictInfo*)node;
+ context->varnos = bms_add_members(context->varnos,
+ rinfo->clause_relids);
+ return false;
+ }
return expression_tree_walker(node, pull_varnos_walker,
(void *) context);
}
@@ -245,6 +252,15 @@ pull_varattnos_walker(Node *node, pull_varattnos_context *context)
return false;
}
+ if (IsA(node, RestrictInfo))
+ {
+ RestrictInfo *rinfo = (RestrictInfo *)node;
+
+ return expression_tree_walker((Node*)rinfo->clause,
+ pull_varattnos_walker,
+ (void*) context);
+ }
+
/* Should not find an unplanned subquery */
Assert(!IsA(node, Query));
--
2.1.0
Attachment: 0002-shared-infrastructure-and-functional-dependencies.patch (text/x-patch)
From 1c42a02189088ba194e30f5878bb67bc61953a11 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tv@fuzzy.cz>
Date: Sun, 11 Jan 2015 19:51:48 +0100
Subject: [PATCH 2/9] shared infrastructure and functional dependencies
Basic infrastructure shared by all kinds of multivariate stats, most
importantly:
- adds a new system catalog (pg_mv_statistic)
- CREATE STATISTICS name ON table (columns) WITH (options)
- DROP STATISTICS name
- implementation of functional dependencies (the simplest type of
multivariate statistics)
- building functional dependencies in ANALYZE
- updates regression tests (new catalog etc.)
This does not include any changes to the optimizer, i.e. it does not
influence the query planning (subject to follow-up patches).
The current implementation requires a valid 'ltopr' for the columns, so
that we can sort the sample rows in various ways, both in this patch
and other kinds of statistics. Maybe this restriction could be relaxed
in the future, requiring just 'eqopr' in case of stats not sorting the
data (e.g. functional dependencies and MCV lists).
Maybe some of the stats (functional dependencies and MCV list with
limited functionality) might be made to work with hashes of the values,
which is sufficient for equality comparisons. But the queries would
require the equality operator anyway, so it's not really a weaker
requirement. The hashes might reduce space requirements, though.
The algorithm detecting the dependencies is rather simple and probably
needs improvements, so that it detects more complicated dependencies,
and also validation of the math.
The name 'functional dependencies' is more correct (than 'association
rules') as it's exactly the name used in relational theory (esp. Normal
Forms) for tracking column-level dependencies.
The multivariate statistics are automatically removed in two situations
(a) after a DROP TABLE (obviously)
(b) after ALTER TABLE ... DROP COLUMN, if the statistics would be
defined on less than 2 columns (remaining)
If at least two columns remain, we keep the
statistics but perform cleanup on the next ANALYZE. The dropped columns
are removed from stakeys, and the new statistics is built on the
smaller set.
We can't do this at DROP COLUMN, because that'd leave us with invalid
statistics, or we'd have to throw it away although we can still use it.
This lazy approach lets us use the statistics although some of the
columns are dead.
This also adds a simple list of statistics to \d in psql.
This means the statistics are created within a schema by using a
qualified name (or using the default schema)
CREATE STATISTICS schema.statistics ON ...
and then dropped by specifying qualified name
DROP STATISTICS schema.statistics
or searching through search_path (just like with other objects).
This also gets rid of the "(opt_)stats_name" definitions in gram.y and
instead replaces them with just "opt_any_name", although the optional
case is not really handled currently - there's no generated name yet
(so either we should drop it or implement it).
I'm not entirely sure making statistics schema-specific is such a great
idea. Maybe it should be "global", but that does not seem right (e.g.
it makes multi-tenant systems based on schemas more difficult to
manage, because tenants would interact).
---
doc/src/sgml/ref/allfiles.sgml | 2 +
doc/src/sgml/ref/create_statistics.sgml | 174 ++++++++++
doc/src/sgml/ref/drop_statistics.sgml | 90 ++++++
doc/src/sgml/reference.sgml | 2 +
src/backend/catalog/Makefile | 1 +
src/backend/catalog/aclchk.c | 27 ++
src/backend/catalog/dependency.c | 11 +-
src/backend/catalog/heap.c | 102 ++++++
src/backend/catalog/namespace.c | 51 +++
src/backend/catalog/objectaddress.c | 54 ++++
src/backend/catalog/system_views.sql | 11 +
src/backend/commands/Makefile | 6 +-
src/backend/commands/analyze.c | 21 ++
src/backend/commands/dropcmds.c | 4 +
src/backend/commands/event_trigger.c | 3 +
src/backend/commands/statscmds.c | 266 ++++++++++++++++
src/backend/nodes/copyfuncs.c | 17 +
src/backend/nodes/outfuncs.c | 18 ++
src/backend/optimizer/util/plancat.c | 59 ++++
src/backend/parser/gram.y | 34 +-
src/backend/tcop/utility.c | 11 +
src/backend/utils/Makefile | 2 +-
src/backend/utils/cache/relcache.c | 59 ++++
src/backend/utils/cache/syscache.c | 23 ++
src/backend/utils/mvstats/Makefile | 17 +
src/backend/utils/mvstats/README.dependencies | 222 +++++++++++++
src/backend/utils/mvstats/common.c | 356 +++++++++++++++++++++
src/backend/utils/mvstats/common.h | 75 +++++
src/backend/utils/mvstats/dependencies.c | 437 ++++++++++++++++++++++++++
src/bin/psql/describe.c | 44 +++
src/include/catalog/dependency.h | 5 +-
src/include/catalog/heap.h | 1 +
src/include/catalog/indexing.h | 7 +
src/include/catalog/namespace.h | 2 +
src/include/catalog/pg_mv_statistic.h | 75 +++++
src/include/catalog/pg_proc.h | 5 +
src/include/catalog/toasting.h | 1 +
src/include/commands/defrem.h | 4 +
src/include/nodes/nodes.h | 2 +
src/include/nodes/parsenodes.h | 12 +
src/include/nodes/relation.h | 28 ++
src/include/utils/acl.h | 1 +
src/include/utils/mvstats.h | 70 +++++
src/include/utils/rel.h | 4 +
src/include/utils/relcache.h | 1 +
src/include/utils/syscache.h | 2 +
src/test/regress/expected/object_address.out | 7 +-
src/test/regress/expected/rules.out | 9 +
src/test/regress/expected/sanity_check.out | 1 +
src/test/regress/sql/object_address.sql | 4 +-
50 files changed, 2429 insertions(+), 11 deletions(-)
create mode 100644 doc/src/sgml/ref/create_statistics.sgml
create mode 100644 doc/src/sgml/ref/drop_statistics.sgml
create mode 100644 src/backend/commands/statscmds.c
create mode 100644 src/backend/utils/mvstats/Makefile
create mode 100644 src/backend/utils/mvstats/README.dependencies
create mode 100644 src/backend/utils/mvstats/common.c
create mode 100644 src/backend/utils/mvstats/common.h
create mode 100644 src/backend/utils/mvstats/dependencies.c
create mode 100644 src/include/catalog/pg_mv_statistic.h
create mode 100644 src/include/utils/mvstats.h
diff --git a/doc/src/sgml/ref/allfiles.sgml b/doc/src/sgml/ref/allfiles.sgml
index bf95453..c0f7653 100644
--- a/doc/src/sgml/ref/allfiles.sgml
+++ b/doc/src/sgml/ref/allfiles.sgml
@@ -76,6 +76,7 @@ Complete list of usable sgml source files in this directory.
<!ENTITY createSchema SYSTEM "create_schema.sgml">
<!ENTITY createSequence SYSTEM "create_sequence.sgml">
<!ENTITY createServer SYSTEM "create_server.sgml">
+<!ENTITY createStatistics SYSTEM "create_statistics.sgml">
<!ENTITY createTable SYSTEM "create_table.sgml">
<!ENTITY createTableAs SYSTEM "create_table_as.sgml">
<!ENTITY createTableSpace SYSTEM "create_tablespace.sgml">
@@ -119,6 +120,7 @@ Complete list of usable sgml source files in this directory.
<!ENTITY dropSchema SYSTEM "drop_schema.sgml">
<!ENTITY dropSequence SYSTEM "drop_sequence.sgml">
<!ENTITY dropServer SYSTEM "drop_server.sgml">
+<!ENTITY dropStatistics SYSTEM "drop_statistics.sgml">
<!ENTITY dropTable SYSTEM "drop_table.sgml">
<!ENTITY dropTableSpace SYSTEM "drop_tablespace.sgml">
<!ENTITY dropTransform SYSTEM "drop_transform.sgml">
diff --git a/doc/src/sgml/ref/create_statistics.sgml b/doc/src/sgml/ref/create_statistics.sgml
new file mode 100644
index 0000000..a86eae3
--- /dev/null
+++ b/doc/src/sgml/ref/create_statistics.sgml
@@ -0,0 +1,174 @@
+<!--
+doc/src/sgml/ref/create_statistics.sgml
+PostgreSQL documentation
+-->
+
+<refentry id="SQL-CREATESTATISTICS">
+ <indexterm zone="sql-createstatistics">
+ <primary>CREATE STATISTICS</primary>
+ </indexterm>
+
+ <refmeta>
+ <refentrytitle>CREATE STATISTICS</refentrytitle>
+ <manvolnum>7</manvolnum>
+ <refmiscinfo>SQL - Language Statements</refmiscinfo>
+ </refmeta>
+
+ <refnamediv>
+ <refname>CREATE STATISTICS</refname>
+ <refpurpose>define a new statistics</refpurpose>
+ </refnamediv>
+
+ <refsynopsisdiv>
+<synopsis>
+CREATE STATISTICS [ IF NOT EXISTS ] <replaceable class="PARAMETER">statistics_name</replaceable> ON <replaceable class="PARAMETER">table_name</replaceable> ( [
+ { <replaceable class="PARAMETER">column_name</replaceable> } ] [, ...])
+[ WITH ( <replaceable class="PARAMETER">statistics_parameter</replaceable> [= <replaceable class="PARAMETER">value</replaceable>] [, ... ] ) ]
+</synopsis>
+
+ </refsynopsisdiv>
+
+ <refsect1 id="SQL-CREATESTATISTICS-description">
+ <title>Description</title>
+
+ <para>
+ <command>CREATE STATISTICS</command> will create a new multivariate
+   statistics on the table. The statistics will be created in the
+ current database. The statistics will be owned by the user issuing
+ the command.
+ </para>
+
+ <para>
+ If a schema name is given (for example, <literal>CREATE STATISTICS
+ myschema.mystat ...</>) then the statistics is created in the specified
+ schema. Otherwise it is created in the current schema. The name of
+ the table must be distinct from the name of any other statistics in the
+   the statistics must be distinct from the name of any other statistics in the
+ </para>
+
+ <para>
+   To be able to create statistics, you must have <literal>USAGE</literal>
+   privilege on all the column types.
+ </para>
+ </refsect1>
+
+ <refsect1>
+ <title>Parameters</title>
+
+ <variablelist>
+
+ <varlistentry>
+ <term><literal>IF NOT EXISTS</></term>
+ <listitem>
+ <para>
+ Do not throw an error if a statistics with the same name already exists.
+ A notice is issued in this case. Note that there is no guarantee that
+ the existing statistics is anything like the one that would have been
+ created.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><replaceable class="PARAMETER">statistics_name</replaceable></term>
+ <listitem>
+ <para>
+ The name (optionally schema-qualified) of the statistics to be created.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><replaceable class="PARAMETER">table_name</replaceable></term>
+ <listitem>
+ <para>
+ The name (optionally schema-qualified) of the table the statistics should
+ be created on.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><replaceable class="PARAMETER">column_name</replaceable></term>
+ <listitem>
+ <para>
+ The name of a column to be included in the statistics.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><literal>WITH ( <replaceable class="PARAMETER">statistics_parameter</replaceable> [= <replaceable class="PARAMETER">value</replaceable>] [, ... ] )</literal></term>
+ <listitem>
+ <para>
+ ...
+ </para>
+ </listitem>
+ </varlistentry>
+
+ </variablelist>
+
+ <refsect2 id="SQL-CREATESTATISTICS-parameters">
+ <title id="SQL-CREATESTATISTICS-parameters-title">Statistics Parameters</title>
+
+ <indexterm zone="sql-createstatistics-parameters">
+ <primary>statistics parameters</primary>
+ </indexterm>
+
+ <para>
+ The <literal>WITH</> clause can specify <firstterm>statistics parameters</>
+ for statistics. The currently available parameters are listed below.
+ </para>
+
+ <variablelist>
+
+ <varlistentry>
+ <term><literal>dependencies</> (<type>boolean</>)</term>
+ <listitem>
+ <para>
+ Enables functional dependencies for the statistics.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ </variablelist>
+
+ </refsect2>
+ </refsect1>
+
+ <refsect1 id="SQL-CREATESTATISTICS-notes">
+ <title>Notes</title>
+
+ <para>
+ ...
+ </para>
+
+ </refsect1>
+
+
+ <refsect1 id="SQL-CREATESTATISTICS-examples">
+ <title>Examples</title>
+
+ <para>
+ ...
+ </para>
+
+ </refsect1>
+
+ <refsect1>
+ <title>Compatibility</title>
+
+ <para>
+ There is no <command>CREATE STATISTICS</command> command in the SQL standard.
+ </para>
+ </refsect1>
+
+ <refsect1>
+ <title>See Also</title>
+
+ <simplelist type="inline">
+ <member><xref linkend="sql-dropstatistics"></member>
+ </simplelist>
+ </refsect1>
+</refentry>
diff --git a/doc/src/sgml/ref/drop_statistics.sgml b/doc/src/sgml/ref/drop_statistics.sgml
new file mode 100644
index 0000000..4cc0b70
--- /dev/null
+++ b/doc/src/sgml/ref/drop_statistics.sgml
@@ -0,0 +1,90 @@
+<!--
+doc/src/sgml/ref/drop_statistics.sgml
+PostgreSQL documentation
+-->
+
+<refentry id="SQL-DROPSTATISTICS">
+ <indexterm zone="sql-dropstatistics">
+ <primary>DROP STATISTICS</primary>
+ </indexterm>
+
+ <refmeta>
+ <refentrytitle>DROP STATISTICS</refentrytitle>
+ <manvolnum>7</manvolnum>
+ <refmiscinfo>SQL - Language Statements</refmiscinfo>
+ </refmeta>
+
+ <refnamediv>
+ <refname>DROP STATISTICS</refname>
+ <refpurpose>remove multivariate statistics</refpurpose>
+ </refnamediv>
+
+ <refsynopsisdiv>
+<synopsis>
+DROP STATISTICS [ IF EXISTS ] <replaceable class="PARAMETER">name</replaceable> [, ...]
+</synopsis>
+ </refsynopsisdiv>
+
+ <refsect1>
+ <title>Description</title>
+
+ <para>
+ <command>DROP STATISTICS</command> removes statistics from the database.
+ Only the statistics owner, the schema owner, and a superuser can drop
+ statistics.
+ </para>
+
+ </refsect1>
+
+ <refsect1>
+ <title>Parameters</title>
+
+ <variablelist>
+ <varlistentry>
+ <term><literal>IF EXISTS</literal></term>
+ <listitem>
+ <para>
+ Do not throw an error if the statistics does not exist. A notice is
+ issued in this case.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><replaceable class="PARAMETER">name</replaceable></term>
+ <listitem>
+ <para>
+ The name (optionally schema-qualified) of the statistics to drop.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ </variablelist>
+ </refsect1>
+
+ <refsect1>
+ <title>Examples</title>
+
+ <para>
+ ...
+ </para>
+
+ </refsect1>
+
+ <refsect1>
+ <title>Compatibility</title>
+
+ <para>
+ There is no <command>DROP STATISTICS</command> command in the SQL standard.
+ </para>
+ </refsect1>
+
+ <refsect1>
+ <title>See Also</title>
+
+ <simplelist type="inline">
+ <member><xref linkend="sql-createstatistics"></member>
+ </simplelist>
+ </refsect1>
+
+</refentry>
diff --git a/doc/src/sgml/reference.sgml b/doc/src/sgml/reference.sgml
index 03020df..2b07b2d 100644
--- a/doc/src/sgml/reference.sgml
+++ b/doc/src/sgml/reference.sgml
@@ -104,6 +104,7 @@
&createSchema;
&createSequence;
&createServer;
+ &createStatistics;
&createTable;
&createTableAs;
&createTableSpace;
@@ -147,6 +148,7 @@
&dropSchema;
&dropSequence;
&dropServer;
+ &dropStatistics;
&dropTable;
&dropTableSpace;
&dropTSConfig;
diff --git a/src/backend/catalog/Makefile b/src/backend/catalog/Makefile
index 25130ec..058b8a9 100644
--- a/src/backend/catalog/Makefile
+++ b/src/backend/catalog/Makefile
@@ -32,6 +32,7 @@ POSTGRES_BKI_SRCS = $(addprefix $(top_srcdir)/src/include/catalog/,\
pg_attrdef.h pg_constraint.h pg_inherits.h pg_index.h pg_operator.h \
pg_opfamily.h pg_opclass.h pg_am.h pg_amop.h pg_amproc.h \
pg_language.h pg_largeobject_metadata.h pg_largeobject.h pg_aggregate.h \
+ pg_mv_statistic.h \
pg_statistic.h pg_rewrite.h pg_trigger.h pg_event_trigger.h pg_description.h \
pg_cast.h pg_enum.h pg_namespace.h pg_conversion.h pg_depend.h \
pg_database.h pg_db_role_setting.h pg_tablespace.h pg_pltemplate.h \
diff --git a/src/backend/catalog/aclchk.c b/src/backend/catalog/aclchk.c
index 0f3bc07..e21aacd 100644
--- a/src/backend/catalog/aclchk.c
+++ b/src/backend/catalog/aclchk.c
@@ -38,6 +38,7 @@
#include "catalog/pg_language.h"
#include "catalog/pg_largeobject.h"
#include "catalog/pg_largeobject_metadata.h"
+#include "catalog/pg_mv_statistic.h"
#include "catalog/pg_namespace.h"
#include "catalog/pg_opclass.h"
#include "catalog/pg_operator.h"
@@ -5021,6 +5022,32 @@ pg_extension_ownercheck(Oid ext_oid, Oid roleid)
}
/*
+ * Ownership check for a multivariate statistics (specified by OID).
+ */
+bool
+pg_statistics_ownercheck(Oid stat_oid, Oid roleid)
+{
+ HeapTuple tuple;
+ Oid ownerId;
+
+ /* Superusers bypass all permission checking. */
+ if (superuser_arg(roleid))
+ return true;
+
+ tuple = SearchSysCache1(MVSTATOID, ObjectIdGetDatum(stat_oid));
+ if (!HeapTupleIsValid(tuple))
+ ereport(ERROR,
+ (errcode(ERRCODE_UNDEFINED_OBJECT),
+ errmsg("statistics with OID %u does not exist", stat_oid)));
+
+ ownerId = ((Form_pg_mv_statistic) GETSTRUCT(tuple))->staowner;
+
+ ReleaseSysCache(tuple);
+
+ return has_privs_of_role(roleid, ownerId);
+}
+
+/*
* Check whether specified role has CREATEROLE privilege (or is a superuser)
*
* Note: roles do not have owners per se; instead we use this test in
diff --git a/src/backend/catalog/dependency.c b/src/backend/catalog/dependency.c
index c48e37b..8200454 100644
--- a/src/backend/catalog/dependency.c
+++ b/src/backend/catalog/dependency.c
@@ -40,6 +40,7 @@
#include "catalog/pg_foreign_server.h"
#include "catalog/pg_language.h"
#include "catalog/pg_largeobject.h"
+#include "catalog/pg_mv_statistic.h"
#include "catalog/pg_namespace.h"
#include "catalog/pg_opclass.h"
#include "catalog/pg_operator.h"
@@ -160,7 +161,8 @@ static const Oid object_classes[] = {
ExtensionRelationId, /* OCLASS_EXTENSION */
EventTriggerRelationId, /* OCLASS_EVENT_TRIGGER */
PolicyRelationId, /* OCLASS_POLICY */
- TransformRelationId /* OCLASS_TRANSFORM */
+ TransformRelationId, /* OCLASS_TRANSFORM */
+ MvStatisticRelationId /* OCLASS_STATISTICS */
};
@@ -1272,6 +1274,10 @@ doDeletion(const ObjectAddress *object, int flags)
DropTransformById(object->objectId);
break;
+ case OCLASS_STATISTICS:
+ RemoveStatisticsById(object->objectId);
+ break;
+
default:
elog(ERROR, "unrecognized object class: %u",
object->classId);
@@ -2415,6 +2421,9 @@ getObjectClass(const ObjectAddress *object)
case TransformRelationId:
return OCLASS_TRANSFORM;
+
+ case MvStatisticRelationId:
+ return OCLASS_STATISTICS;
}
/* shouldn't get here */
diff --git a/src/backend/catalog/heap.c b/src/backend/catalog/heap.c
index 6a4a9d9..e7d9aaa 100644
--- a/src/backend/catalog/heap.c
+++ b/src/backend/catalog/heap.c
@@ -47,6 +47,7 @@
#include "catalog/pg_constraint_fn.h"
#include "catalog/pg_foreign_table.h"
#include "catalog/pg_inherits.h"
+#include "catalog/pg_mv_statistic.h"
#include "catalog/pg_namespace.h"
#include "catalog/pg_statistic.h"
#include "catalog/pg_tablespace.h"
@@ -1613,7 +1614,10 @@ RemoveAttributeById(Oid relid, AttrNumber attnum)
heap_close(attr_rel, RowExclusiveLock);
if (attnum > 0)
+ {
RemoveStatistics(relid, attnum);
+ RemoveMVStatistics(relid, attnum);
+ }
relation_close(rel, NoLock);
}
@@ -1841,6 +1845,11 @@ heap_drop_with_catalog(Oid relid)
RemoveStatistics(relid, 0);
/*
+ * delete multi-variate statistics
+ */
+ RemoveMVStatistics(relid, 0);
+
+ /*
* delete attribute tuples
*/
DeleteAttributeTuples(relid);
@@ -2696,6 +2705,99 @@ RemoveStatistics(Oid relid, AttrNumber attnum)
/*
+ * RemoveMVStatistics --- remove entries in pg_mv_statistic for a rel
+ *
+ * If attnum is zero, remove all entries for rel; else remove only the one(s)
+ * for that column.
+ */
+void
+RemoveMVStatistics(Oid relid, AttrNumber attnum)
+{
+ Relation pgmvstatistic;
+ TupleDesc tupdesc = NULL;
+ SysScanDesc scan;
+ ScanKeyData key;
+ HeapTuple tuple;
+
+ /*
+ * When dropping a column, we also drop statistics that would be left
+ * with a single remaining (undropped) column. To check that, we need
+ * the tuple descriptor.
+ *
+ * We already have the relation locked (as we're running ALTER
+ * TABLE ... DROP COLUMN), so we'll just get the descriptor here.
+ */
+ if (attnum != 0)
+ {
+ Relation rel = relation_open(relid, NoLock);
+
+ /* multivariate stats are supported on tables and matviews */
+ if (rel->rd_rel->relkind == RELKIND_RELATION ||
+ rel->rd_rel->relkind == RELKIND_MATVIEW)
+ tupdesc = RelationGetDescr(rel);
+
+ relation_close(rel, NoLock);
+
+ /* nothing to do for relation kinds without multivariate stats */
+ if (tupdesc == NULL)
+ return;
+ }
+
+ pgmvstatistic = heap_open(MvStatisticRelationId, RowExclusiveLock);
+
+ ScanKeyInit(&key,
+ Anum_pg_mv_statistic_starelid,
+ BTEqualStrategyNumber, F_OIDEQ,
+ ObjectIdGetDatum(relid));
+
+ scan = systable_beginscan(pgmvstatistic,
+ MvStatisticRelidIndexId,
+ true, NULL, 1, &key);
+
+ /* we must loop even when attnum != 0, in case of inherited stats */
+ while (HeapTupleIsValid(tuple = systable_getnext(scan)))
+ {
+ bool delete = true;
+
+ if (attnum != 0)
+ {
+ Datum adatum;
+ bool isnull;
+ int i;
+ int ncolumns = 0;
+ ArrayType *arr;
+ int16 *attnums;
+
+ /* get the columns */
+ adatum = SysCacheGetAttr(MVSTATOID, tuple,
+ Anum_pg_mv_statistic_stakeys, &isnull);
+ Assert(!isnull);
+
+ arr = DatumGetArrayTypeP(adatum);
+ attnums = (int16*)ARR_DATA_PTR(arr);
+
+ for (i = 0; i < ARR_DIMS(arr)[0]; i++)
+ {
+ /* count the column unless it has been / is being dropped */
+ if ((! tupdesc->attrs[attnums[i]-1]->attisdropped) &&
+ (attnums[i] != attnum))
+ ncolumns += 1;
+ }
+
+ /* delete the statistics if fewer than two columns remain */
+ delete = (ncolumns < 2);
+ }
+
+ if (delete)
+ simple_heap_delete(pgmvstatistic, &tuple->t_self);
+ }
+
+ systable_endscan(scan);
+
+ heap_close(pgmvstatistic, RowExclusiveLock);
+}
+
+
+/*
* RelationTruncateIndexes - truncate all indexes associated
* with the heap relation to zero tuples.
*
diff --git a/src/backend/catalog/namespace.c b/src/backend/catalog/namespace.c
index 446b2ac..dfd5bef 100644
--- a/src/backend/catalog/namespace.c
+++ b/src/backend/catalog/namespace.c
@@ -4201,3 +4201,54 @@ pg_is_other_temp_schema(PG_FUNCTION_ARGS)
PG_RETURN_BOOL(isOtherTempNamespace(oid));
}
+
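+/*
+ * get_statistics_oid - find multivariate statistics by possibly
+ * qualified name
+ *
+ * If not found, error out unless missing_ok is true, in which case
+ * InvalidOid is returned.
+ */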
+Oid
+get_statistics_oid(List *names, bool missing_ok)
+{
+ char *schemaname;
+ char *stats_name;
+ Oid namespaceId;
+ Oid stats_oid = InvalidOid;
+ ListCell *l;
+
+ /* deconstruct the name list */
+ DeconstructQualifiedName(names, &schemaname, &stats_name);
+
+ if (schemaname)
+ {
+ /* use exact schema given */
+ namespaceId = LookupExplicitNamespace(schemaname, missing_ok);
+ if (missing_ok && !OidIsValid(namespaceId))
+ stats_oid = InvalidOid;
+ else
+ stats_oid = GetSysCacheOid2(MVSTATNAMENSP,
+ PointerGetDatum(stats_name),
+ ObjectIdGetDatum(namespaceId));
+ }
+ else
+ {
+ /* search for it in search path */
+ recomputeNamespacePath();
+
+ foreach(l, activeSearchPath)
+ {
+ namespaceId = lfirst_oid(l);
+
+ if (namespaceId == myTempNamespace)
+ continue; /* do not look in temp namespace */
+ stats_oid = GetSysCacheOid2(MVSTATNAMENSP,
+ PointerGetDatum(stats_name),
+ ObjectIdGetDatum(namespaceId));
+ if (OidIsValid(stats_oid))
+ break;
+ }
+ }
+
+ if (!OidIsValid(stats_oid) && !missing_ok)
+ ereport(ERROR,
+ (errcode(ERRCODE_UNDEFINED_OBJECT),
+ errmsg("statistics \"%s\" does not exist",
+ NameListToString(names))));
+
+ return stats_oid;
+}
diff --git a/src/backend/catalog/objectaddress.c b/src/backend/catalog/objectaddress.c
index d2aaa6d..85841e1 100644
--- a/src/backend/catalog/objectaddress.c
+++ b/src/backend/catalog/objectaddress.c
@@ -39,6 +39,7 @@
#include "catalog/pg_language.h"
#include "catalog/pg_largeobject.h"
#include "catalog/pg_largeobject_metadata.h"
+#include "catalog/pg_mv_statistic.h"
#include "catalog/pg_namespace.h"
#include "catalog/pg_opclass.h"
#include "catalog/pg_opfamily.h"
@@ -438,9 +439,22 @@ static const ObjectPropertyType ObjectProperty[] =
Anum_pg_type_typacl,
ACL_KIND_TYPE,
true
+ },
+ {
+ MvStatisticRelationId,
+ MvStatisticOidIndexId,
+ MVSTATOID,
+ MVSTATNAMENSP,
+ Anum_pg_mv_statistic_staname,
+ Anum_pg_mv_statistic_stanamespace,
+ InvalidAttrNumber, /* XXX same owner as relation */
+ InvalidAttrNumber, /* no ACL (same as relation) */
+ -1, /* no ACL */
+ true
}
};
+
/*
* This struct maps the string object types as returned by
* getObjectTypeDescription into ObjType enum values. Note that some enum
@@ -640,6 +654,10 @@ static const struct object_type_map
/* OCLASS_TRANSFORM */
{
"transform", OBJECT_TRANSFORM
+ },
+ /* OBJECT_STATISTICS */
+ {
+ "statistics", OBJECT_STATISTICS
}
};
@@ -913,6 +931,11 @@ get_object_address(ObjectType objtype, List *objname, List *objargs,
address = get_object_address_defacl(objname, objargs,
missing_ok);
break;
+ case OBJECT_STATISTICS:
+ address.classId = MvStatisticRelationId;
+ address.objectId = get_statistics_oid(objname, missing_ok);
+ address.objectSubId = 0;
+ break;
default:
elog(ERROR, "unrecognized objtype: %d", (int) objtype);
/* placate compiler, in case it thinks elog might return */
@@ -2185,6 +2208,10 @@ check_object_ownership(Oid roleid, ObjectType objtype, ObjectAddress address,
(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
errmsg("must be superuser")));
break;
+ case OBJECT_STATISTICS:
+ if (!pg_statistics_ownercheck(address.objectId, roleid))
+ aclcheck_error_type(ACLCHECK_NOT_OWNER, address.objectId);
+ break;
default:
elog(ERROR, "unrecognized object type: %d",
(int) objtype);
@@ -3610,6 +3637,10 @@ getObjectTypeDescription(const ObjectAddress *object)
appendStringInfoString(&buffer, "transform");
break;
+ case OCLASS_STATISTICS:
+ appendStringInfoString(&buffer, "statistics");
+ break;
+
default:
appendStringInfo(&buffer, "unrecognized %u", object->classId);
break;
@@ -4566,6 +4597,29 @@ getObjectIdentityParts(const ObjectAddress *object,
}
break;
+ case OCLASS_STATISTICS:
+ {
+ HeapTuple tup;
+ Form_pg_mv_statistic formStatistic;
+ char *schema;
+
+ tup = SearchSysCache1(MVSTATOID,
+ ObjectIdGetDatum(object->objectId));
+ if (!HeapTupleIsValid(tup))
+ elog(ERROR, "cache lookup failed for statistics %u",
+ object->objectId);
+ formStatistic = (Form_pg_mv_statistic) GETSTRUCT(tup);
+ schema = get_namespace_name_or_temp(formStatistic->stanamespace);
+ appendStringInfoString(&buffer,
+ quote_qualified_identifier(schema,
+ NameStr(formStatistic->staname)));
+ if (objname)
+ *objname = list_make2(schema,
+ pstrdup(NameStr(formStatistic->staname)));
+ ReleaseSysCache(tup);
+ break;
+ }
+
default:
appendStringInfo(&buffer, "unrecognized object %u %u %d",
object->classId,
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index abf9a70..b8a264e 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -158,6 +158,17 @@ CREATE VIEW pg_indexes AS
LEFT JOIN pg_tablespace T ON (T.oid = I.reltablespace)
WHERE C.relkind IN ('r', 'm') AND I.relkind = 'i';
+CREATE VIEW pg_mv_stats AS
+ SELECT
+ N.nspname AS schemaname,
+ C.relname AS tablename,
+ S.staname AS staname,
+ S.stakeys AS attnums,
+ length(S.stadeps) as depsbytes,
+ pg_mv_stats_dependencies_info(S.stadeps) as depsinfo
+ FROM (pg_mv_statistic S JOIN pg_class C ON (C.oid = S.starelid))
+ LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace);
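+
+-- For example, one might inspect the built statistics like this (a
+-- hypothetical query; it returns rows only once statistics have been
+-- created and ANALYZE has run):
+--
+--   SELECT staname, attnums, depsbytes FROM pg_mv_stats;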
+
CREATE VIEW pg_stats WITH (security_barrier) AS
SELECT
nspname AS schemaname,
diff --git a/src/backend/commands/Makefile b/src/backend/commands/Makefile
index b1ac704..5151001 100644
--- a/src/backend/commands/Makefile
+++ b/src/backend/commands/Makefile
@@ -18,8 +18,8 @@ OBJS = aggregatecmds.o alter.o analyze.o async.o cluster.o comment.o \
event_trigger.o explain.o extension.o foreigncmds.o functioncmds.o \
indexcmds.o lockcmds.o matview.o operatorcmds.o opclasscmds.o \
policy.o portalcmds.o prepare.o proclang.o \
- schemacmds.o seclabel.o sequence.o tablecmds.o tablespace.o trigger.o \
- tsearchcmds.o typecmds.o user.o vacuum.o vacuumlazy.o \
- variable.o view.o
+ schemacmds.o seclabel.o sequence.o statscmds.o \
+ tablecmds.o tablespace.o trigger.o tsearchcmds.o typecmds.o \
+ user.o vacuum.o vacuumlazy.o variable.o view.o
include $(top_srcdir)/src/backend/common.mk
diff --git a/src/backend/commands/analyze.c b/src/backend/commands/analyze.c
index 8a5f07c..9087532 100644
--- a/src/backend/commands/analyze.c
+++ b/src/backend/commands/analyze.c
@@ -17,6 +17,7 @@
#include <math.h>
#include "access/multixact.h"
+#include "access/sysattr.h"
#include "access/transam.h"
#include "access/tupconvert.h"
#include "access/tuptoaster.h"
@@ -27,6 +28,7 @@
#include "catalog/indexing.h"
#include "catalog/pg_collation.h"
#include "catalog/pg_inherits_fn.h"
+#include "catalog/pg_mv_statistic.h"
#include "catalog/pg_namespace.h"
#include "commands/dbcommands.h"
#include "commands/tablecmds.h"
@@ -45,10 +47,13 @@
#include "storage/procarray.h"
#include "utils/acl.h"
#include "utils/attoptcache.h"
+#include "utils/builtins.h"
#include "utils/datum.h"
+#include "utils/fmgroids.h"
#include "utils/guc.h"
#include "utils/lsyscache.h"
#include "utils/memutils.h"
+#include "utils/mvstats.h"
#include "utils/pg_rusage.h"
#include "utils/sampling.h"
#include "utils/sortsupport.h"
@@ -460,6 +465,19 @@ do_analyze_rel(Relation onerel, int options, VacuumParams *params,
* all analyzable columns. We use a lower bound of 100 rows to avoid
* possible overflow in Vitter's algorithm. (Note: that will also be the
* target in the corner case where there are no analyzable columns.)
+ *
+ * FIXME This sample sizing is mostly OK when computing stats for
+ * individual columns, but rather insufficient when computing
+ * multivariate stats (histograms, MCV lists, ...). For stats on
+ * multiple columns / complex stats we need larger samples, because
+ * we need to build more detailed stats (more MCV items / histogram
+ * buckets) to get good accuracy. Maybe a sample proportional to the
+ * table size (say, 0.5% - 1%) would be more appropriate than a
+ * fixed size. Also, this should be bound to the requested statistics
+ * size - e.g. the number of MCV items or histogram buckets should
+ * require several sample rows per item/bucket (so the sample should
+ * be k*size).
*/
targrows = 100;
for (i = 0; i < attr_cnt; i++)
@@ -562,6 +580,9 @@ do_analyze_rel(Relation onerel, int options, VacuumParams *params,
update_attstats(RelationGetRelid(Irel[ind]), false,
thisdata->attr_cnt, thisdata->vacattrstats);
}
+
+ /* Build multivariate stats (if there are any). */
+ build_mv_stats(onerel, numrows, rows, attr_cnt, vacattrstats);
}
/*
diff --git a/src/backend/commands/dropcmds.c b/src/backend/commands/dropcmds.c
index 522027a..cd65b58 100644
--- a/src/backend/commands/dropcmds.c
+++ b/src/backend/commands/dropcmds.c
@@ -292,6 +292,10 @@ does_not_exist_skipping(ObjectType objtype, List *objname, List *objargs)
msg = gettext_noop("schema \"%s\" does not exist, skipping");
name = NameListToString(objname);
break;
+ case OBJECT_STATISTICS:
+ msg = gettext_noop("statistics \"%s\" does not exist, skipping");
+ name = NameListToString(objname);
+ break;
case OBJECT_TSPARSER:
if (!schema_does_not_exist_skipping(objname, &msg, &name))
{
diff --git a/src/backend/commands/event_trigger.c b/src/backend/commands/event_trigger.c
index 9e32f8d..09061bb 100644
--- a/src/backend/commands/event_trigger.c
+++ b/src/backend/commands/event_trigger.c
@@ -110,6 +110,7 @@ static event_trigger_support_data event_trigger_support[] = {
{"SCHEMA", true},
{"SEQUENCE", true},
{"SERVER", true},
+ {"STATISTICS", true},
{"TABLE", true},
{"TABLESPACE", false},
{"TRANSFORM", true},
@@ -1106,6 +1107,7 @@ EventTriggerSupportsObjectType(ObjectType obtype)
case OBJECT_RULE:
case OBJECT_SCHEMA:
case OBJECT_SEQUENCE:
+ case OBJECT_STATISTICS:
case OBJECT_TABCONSTRAINT:
case OBJECT_TABLE:
case OBJECT_TRANSFORM:
@@ -1167,6 +1169,7 @@ EventTriggerSupportsObjectClass(ObjectClass objclass)
case OCLASS_DEFACL:
case OCLASS_EXTENSION:
case OCLASS_POLICY:
+ case OCLASS_STATISTICS:
return true;
}
diff --git a/src/backend/commands/statscmds.c b/src/backend/commands/statscmds.c
new file mode 100644
index 0000000..1b89bbe
--- /dev/null
+++ b/src/backend/commands/statscmds.c
@@ -0,0 +1,266 @@
+/*-------------------------------------------------------------------------
+ *
+ * statscmds.c
+ * Commands for creating and altering multivariate statistics
+ *
+ * Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ * src/backend/commands/statscmds.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/relscan.h"
+#include "catalog/dependency.h"
+#include "catalog/indexing.h"
+#include "catalog/namespace.h"
+#include "catalog/pg_mv_statistic.h"
+#include "catalog/pg_namespace.h"
+#include "commands/defrem.h"
+#include "miscadmin.h"
+#include "utils/builtins.h"
+#include "utils/inval.h"
+#include "utils/memutils.h"
+#include "utils/mvstats.h"
+#include "utils/rel.h"
+#include "utils/syscache.h"
+
+
+/* used for sorting the attnums in CreateStatistics */
+static int
+compare_int16(const void *a, const void *b)
+{
+ /* memcmp compares bytes, which gives a wrong order on little-endian */
+ return (int) *(const int16 *) a - (int) *(const int16 *) b;
+}
+
+/*
+ * Implements CREATE STATISTICS name ON table (columns) WITH (options).
+ *
+ * TODO Check that the types support sort, although maybe we can live
+ * without it (and only build MCV list / association rules).
+ *
+ * TODO This should probably check for duplicate stats (i.e. same
+ * keys, same options). Although maybe it's useful to have
+ * multiple stats on the same columns with different options
+ * (say, a detailed MCV-only stats for some queries, histogram
+ * for others, etc.)
+ */
+ObjectAddress
+CreateStatistics(CreateStatsStmt *stmt)
+{
+ int i, j;
+ ListCell *l;
+ int16 attnums[INDEX_MAX_KEYS];
+ int numcols = 0;
+ ObjectAddress address = InvalidObjectAddress;
+ char *namestr;
+ NameData staname;
+ Oid statoid;
+ Oid namespaceId;
+
+ HeapTuple htup;
+ Datum values[Natts_pg_mv_statistic];
+ bool nulls[Natts_pg_mv_statistic];
+ int2vector *stakeys;
+ Relation mvstatrel;
+ Relation rel;
+ ObjectAddress parentobject, childobject;
+
+ /* by default build nothing */
+ bool build_dependencies = false;
+
+ Assert(IsA(stmt, CreateStatsStmt));
+
+ /* resolve the pieces of the name (namespace etc.) */
+ namespaceId = QualifiedNameGetCreationNamespace(stmt->defnames, &namestr);
+ namestrcpy(&staname, namestr);
+
+ /*
+ * If if_not_exists was given and the statistics already exists, bail out.
+ */
+ if (stmt->if_not_exists &&
+ SearchSysCacheExists2(MVSTATNAMENSP,
+ PointerGetDatum(&staname),
+ ObjectIdGetDatum(namespaceId)))
+ {
+ ereport(NOTICE,
+ (errcode(ERRCODE_DUPLICATE_OBJECT),
+ errmsg("statistics \"%s\" already exists, skipping",
+ namestr)));
+ return InvalidObjectAddress;
+ }
+
+ rel = heap_openrv(stmt->relation, AccessExclusiveLock);
+
+ /* transform the column names to attnum values */
+
+ foreach(l, stmt->keys)
+ {
+ char *attname = strVal(lfirst(l));
+ HeapTuple atttuple;
+
+ atttuple = SearchSysCacheAttName(RelationGetRelid(rel), attname);
+
+ if (!HeapTupleIsValid(atttuple))
+ ereport(ERROR,
+ (errcode(ERRCODE_UNDEFINED_COLUMN),
+ errmsg("column \"%s\" referenced in statistics does not exist",
+ attname)));
+
+ /* more than MVSTATS_MAX_DIMENSIONS columns not allowed */
+ if (numcols >= MVSTATS_MAX_DIMENSIONS)
+ ereport(ERROR,
+ (errcode(ERRCODE_TOO_MANY_COLUMNS),
+ errmsg("cannot have more than %d keys in a statistics",
+ MVSTATS_MAX_DIMENSIONS)));
+
+ attnums[numcols] = ((Form_pg_attribute) GETSTRUCT(atttuple))->attnum;
+ ReleaseSysCache(atttuple);
+ numcols++;
+ }
+
+ /*
+ * Check the lower bound (at least 2 columns); the upper bound was
+ * already checked in the loop.
+ */
+ if (numcols < 2)
+ ereport(ERROR,
+ (errcode(ERRCODE_TOO_MANY_COLUMNS),
+ errmsg("multivariate stats require 2 or more columns")));
+
+ /* look for duplicate columns */
+ for (i = 0; i < numcols; i++)
+ for (j = 0; j < numcols; j++)
+ if ((i != j) && (attnums[i] == attnums[j]))
+ ereport(ERROR,
+ (errcode(ERRCODE_UNDEFINED_COLUMN),
+ errmsg("duplicate column name in statistics definition")));
+
+ /* parse the statistics options */
+ foreach (l, stmt->options)
+ {
+ DefElem *opt = (DefElem*)lfirst(l);
+
+ if (strcmp(opt->defname, "dependencies") == 0)
+ build_dependencies = defGetBoolean(opt);
+ else
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("unrecognized STATISTICS option \"%s\"",
+ opt->defname)));
+ }
+
+ /* check that at least some statistics were requested */
+ if (! build_dependencies)
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("no statistics type (dependencies) was requested")));
+
+ /* sort the attnums and build int2vector */
+ qsort(attnums, numcols, sizeof(int16), compare_int16);
+ stakeys = buildint2vector(attnums, numcols);
+
+ /*
+ * Okay, let's create the pg_mv_statistic entry.
+ */
+ memset(values, 0, sizeof(values));
+ memset(nulls, false, sizeof(nulls));
+
+ /* no stats collected yet, so just the keys */
+ values[Anum_pg_mv_statistic_starelid-1] = ObjectIdGetDatum(RelationGetRelid(rel));
+ values[Anum_pg_mv_statistic_staname -1] = NameGetDatum(&staname);
+ values[Anum_pg_mv_statistic_stanamespace -1] = ObjectIdGetDatum(namespaceId);
+ values[Anum_pg_mv_statistic_staowner-1] = ObjectIdGetDatum(GetUserId());
+
+ values[Anum_pg_mv_statistic_stakeys -1] = PointerGetDatum(stakeys);
+
+ values[Anum_pg_mv_statistic_deps_enabled -1] = BoolGetDatum(build_dependencies);
+
+ nulls[Anum_pg_mv_statistic_stadeps -1] = true;
+
+ /* insert the tuple into pg_mv_statistic */
+ mvstatrel = heap_open(MvStatisticRelationId, RowExclusiveLock);
+
+ htup = heap_form_tuple(mvstatrel->rd_att, values, nulls);
+
+ simple_heap_insert(mvstatrel, htup);
+
+ CatalogUpdateIndexes(mvstatrel, htup);
+
+ statoid = HeapTupleGetOid(htup);
+
+ heap_freetuple(htup);
+
+
+ /*
+ * Store a dependency too, so that statistics are dropped on DROP TABLE
+ */
+ parentobject.classId = RelationRelationId;
+ parentobject.objectId = RelationGetRelid(rel);
+ parentobject.objectSubId = 0;
+ childobject.classId = MvStatisticRelationId;
+ childobject.objectId = statoid;
+ childobject.objectSubId = 0;
+
+ recordDependencyOn(&childobject, &parentobject, DEPENDENCY_AUTO);
+
+ /*
+ * Also record dependency on the schema (to drop statistics on DROP SCHEMA)
+ */
+ parentobject.classId = NamespaceRelationId;
+ parentobject.objectId = namespaceId;
+ parentobject.objectSubId = 0;
+ childobject.classId = MvStatisticRelationId;
+ childobject.objectId = statoid;
+ childobject.objectSubId = 0;
+
+ recordDependencyOn(&childobject, &parentobject, DEPENDENCY_AUTO);
+
+
+ heap_close(mvstatrel, RowExclusiveLock);
+
+ /*
+ * Invalidate relcache so that others see the new statistics. Do this
+ * before closing the relation.
+ */
+ CacheInvalidateRelcache(rel);
+
+ relation_close(rel, NoLock);
+
+ ObjectAddressSet(address, MvStatisticRelationId, statoid);
+
+ return address;
+}
+
+
+/*
+ * Implements DROP STATISTICS (and statistics removal during dependency
+ * cleanup) - deletes the pg_mv_statistic entry with the given OID.
+ */
+void
+RemoveStatisticsById(Oid statsOid)
+{
+ Relation relation;
+ HeapTuple tup;
+
+ /*
+ * Delete the pg_mv_statistic tuple.
+ */
+ relation = heap_open(MvStatisticRelationId, RowExclusiveLock);
+
+ tup = SearchSysCache1(MVSTATOID, ObjectIdGetDatum(statsOid));
+ if (!HeapTupleIsValid(tup)) /* should not happen */
+ elog(ERROR, "cache lookup failed for statistics %u", statsOid);
+
+ simple_heap_delete(relation, &tup->t_self);
+
+ ReleaseSysCache(tup);
+
+ heap_close(relation, RowExclusiveLock);
+}
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index df7c2fa..3b7c87f 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -4124,6 +4124,20 @@ _copyAlterPolicyStmt(const AlterPolicyStmt *from)
return newnode;
}
+static CreateStatsStmt *
+_copyCreateStatsStmt(const CreateStatsStmt *from)
+{
+ CreateStatsStmt *newnode = makeNode(CreateStatsStmt);
+
+ COPY_NODE_FIELD(defnames);
+ COPY_NODE_FIELD(relation);
+ COPY_NODE_FIELD(keys);
+ COPY_NODE_FIELD(options);
+ COPY_SCALAR_FIELD(if_not_exists);
+
+ return newnode;
+}
+
/* ****************************************************************
* pg_list.h copy functions
* ****************************************************************
@@ -4999,6 +5013,9 @@ copyObject(const void *from)
case T_CommonTableExpr:
retval = _copyCommonTableExpr(from);
break;
+ case T_CreateStatsStmt:
+ retval = _copyCreateStatsStmt(from);
+ break;
case T_FuncWithArgs:
retval = _copyFuncWithArgs(from);
break;
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index eb0fc1e..07206d7 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -2153,6 +2153,21 @@ _outIndexOptInfo(StringInfo str, const IndexOptInfo *node)
}
static void
+_outMVStatisticInfo(StringInfo str, const MVStatisticInfo *node)
+{
+ WRITE_NODE_TYPE("MVSTATISTICINFO");
+
+ /* NB: this isn't a complete set of fields */
+ WRITE_OID_FIELD(mvoid);
+
+ /* enabled statistics */
+ WRITE_BOOL_FIELD(deps_enabled);
+
+ /* built/available statistics */
+ WRITE_BOOL_FIELD(deps_built);
+}
+
+static void
_outEquivalenceClass(StringInfo str, const EquivalenceClass *node)
{
/*
@@ -3636,6 +3651,9 @@ _outNode(StringInfo str, const void *obj)
case T_PlannerParamItem:
_outPlannerParamItem(str, obj);
break;
+ case T_MVStatisticInfo:
+ _outMVStatisticInfo(str, obj);
+ break;
case T_ExtensibleNode:
_outExtensibleNode(str, obj);
diff --git a/src/backend/optimizer/util/plancat.c b/src/backend/optimizer/util/plancat.c
index ad715bb..7fb2088 100644
--- a/src/backend/optimizer/util/plancat.c
+++ b/src/backend/optimizer/util/plancat.c
@@ -28,6 +28,7 @@
#include "catalog/dependency.h"
#include "catalog/heap.h"
#include "catalog/pg_am.h"
+#include "catalog/pg_mv_statistic.h"
#include "foreign/fdwapi.h"
#include "miscadmin.h"
#include "nodes/makefuncs.h"
@@ -40,7 +41,9 @@
#include "parser/parsetree.h"
#include "rewrite/rewriteManip.h"
#include "storage/bufmgr.h"
+#include "utils/builtins.h"
#include "utils/lsyscache.h"
+#include "utils/syscache.h"
#include "utils/rel.h"
#include "utils/snapmgr.h"
@@ -94,6 +97,7 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
Relation relation;
bool hasindex;
List *indexinfos = NIL;
+ List *stainfos = NIL;
/*
* We need not lock the relation since it was already locked, either by
@@ -387,6 +391,61 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
rel->indexlist = indexinfos;
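+
+ /* Load any multivariate statistics defined on the relation. */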
+ if (true)
+ {
+ List *mvstatoidlist;
+ ListCell *l;
+
+ mvstatoidlist = RelationGetMVStatList(relation);
+
+ foreach(l, mvstatoidlist)
+ {
+ ArrayType *arr;
+ Datum adatum;
+ bool isnull;
+ Oid mvoid = lfirst_oid(l);
+ Form_pg_mv_statistic mvstat;
+ MVStatisticInfo *info;
+
+ HeapTuple htup = SearchSysCache1(MVSTATOID, ObjectIdGetDatum(mvoid));
+
+ mvstat = (Form_pg_mv_statistic) GETSTRUCT(htup);
+
+ /* unavailable stats are not interesting for the planner */
+ if (mvstat->deps_built)
+ {
+ info = makeNode(MVStatisticInfo);
+
+ info->mvoid = mvoid;
+ info->rel = rel;
+
+ /* enabled statistics */
+ info->deps_enabled = mvstat->deps_enabled;
+
+ /* built/available statistics */
+ info->deps_built = mvstat->deps_built;
+
+ /* stakeys */
+ adatum = SysCacheGetAttr(MVSTATOID, htup,
+ Anum_pg_mv_statistic_stakeys, &isnull);
+ Assert(!isnull);
+
+ arr = DatumGetArrayTypeP(adatum);
+
+ info->stakeys = buildint2vector((int16 *) ARR_DATA_PTR(arr),
+ ARR_DIMS(arr)[0]);
+
+ stainfos = lcons(info, stainfos);
+ }
+
+ ReleaseSysCache(htup);
+ }
+
+ list_free(mvstatoidlist);
+ }
+
+ rel->mvstatlist = stainfos;
+
/* Grab foreign-table info using the relcache, while we have it */
if (relation->rd_rel->relkind == RELKIND_FOREIGN_TABLE)
{
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index b307b48..3be3f02 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -241,7 +241,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
ConstraintsSetStmt CopyStmt CreateAsStmt CreateCastStmt
CreateDomainStmt CreateExtensionStmt CreateGroupStmt CreateOpClassStmt
CreateOpFamilyStmt AlterOpFamilyStmt CreatePLangStmt
- CreateSchemaStmt CreateSeqStmt CreateStmt CreateTableSpaceStmt
+ CreateSchemaStmt CreateSeqStmt CreateStmt CreateStatsStmt CreateTableSpaceStmt
CreateFdwStmt CreateForeignServerStmt CreateForeignTableStmt
CreateAssertStmt CreateTransformStmt CreateTrigStmt CreateEventTrigStmt
CreateUserStmt CreateUserMappingStmt CreateRoleStmt CreatePolicyStmt
@@ -809,6 +809,7 @@ stmt :
| CreateSchemaStmt
| CreateSeqStmt
| CreateStmt
+ | CreateStatsStmt
| CreateTableSpaceStmt
| CreateTransformStmt
| CreateTrigStmt
@@ -3436,6 +3437,36 @@ OptConsTableSpace: USING INDEX TABLESPACE name { $$ = $4; }
ExistingIndex: USING INDEX index_name { $$ = $3; }
;
+/*****************************************************************************
+ *
+ * QUERY :
+ * CREATE STATISTICS stats_name ON relname (columns) WITH (options)
+ *
+ *****************************************************************************/
+
+
+CreateStatsStmt: CREATE STATISTICS any_name ON qualified_name '(' columnList ')' opt_reloptions
+ {
+ CreateStatsStmt *n = makeNode(CreateStatsStmt);
+ n->defnames = $3;
+ n->relation = $5;
+ n->keys = $7;
+ n->options = $9;
+ n->if_not_exists = false;
+ $$ = (Node *)n;
+ }
+ | CREATE STATISTICS IF_P NOT EXISTS any_name ON qualified_name '(' columnList ')' opt_reloptions
+ {
+ CreateStatsStmt *n = makeNode(CreateStatsStmt);
+ n->defnames = $6;
+ n->relation = $8;
+ n->keys = $10;
+ n->options = $12;
+ n->if_not_exists = true;
+ $$ = (Node *)n;
+ }
+ ;
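+
+/*
+ * For illustration, statements matched by this production (the table and
+ * statistics names are made up):
+ *
+ *		CREATE STATISTICS s1 ON t1 (a, b) WITH (dependencies = true);
+ *		CREATE STATISTICS IF NOT EXISTS s1 ON t1 (a, b) WITH (dependencies = true);
+ */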
+
/*****************************************************************************
*
@@ -5621,6 +5652,7 @@ drop_type: TABLE { $$ = OBJECT_TABLE; }
| TEXT_P SEARCH DICTIONARY { $$ = OBJECT_TSDICTIONARY; }
| TEXT_P SEARCH TEMPLATE { $$ = OBJECT_TSTEMPLATE; }
| TEXT_P SEARCH CONFIGURATION { $$ = OBJECT_TSCONFIGURATION; }
+ | STATISTICS { $$ = OBJECT_STATISTICS; }
;
any_name_list:
diff --git a/src/backend/tcop/utility.c b/src/backend/tcop/utility.c
index 045f7f0..2ba88e2 100644
--- a/src/backend/tcop/utility.c
+++ b/src/backend/tcop/utility.c
@@ -1520,6 +1520,10 @@ ProcessUtilitySlow(Node *parsetree,
address = ExecSecLabelStmt((SecLabelStmt *) parsetree);
break;
+ case T_CreateStatsStmt: /* CREATE STATISTICS */
+ address = CreateStatistics((CreateStatsStmt *) parsetree);
+ break;
+
default:
elog(ERROR, "unrecognized node type: %d",
(int) nodeTag(parsetree));
@@ -2160,6 +2164,9 @@ CreateCommandTag(Node *parsetree)
case OBJECT_TRANSFORM:
tag = "DROP TRANSFORM";
break;
+ case OBJECT_STATISTICS:
+ tag = "DROP STATISTICS";
+ break;
default:
tag = "???";
}
@@ -2527,6 +2534,10 @@ CreateCommandTag(Node *parsetree)
tag = "EXECUTE";
break;
+ case T_CreateStatsStmt:
+ tag = "CREATE STATISTICS";
+ break;
+
case T_DeallocateStmt:
{
DeallocateStmt *stmt = (DeallocateStmt *) parsetree;
diff --git a/src/backend/utils/Makefile b/src/backend/utils/Makefile
index 8374533..eba0352 100644
--- a/src/backend/utils/Makefile
+++ b/src/backend/utils/Makefile
@@ -9,7 +9,7 @@ top_builddir = ../../..
include $(top_builddir)/src/Makefile.global
OBJS = fmgrtab.o
-SUBDIRS = adt cache error fmgr hash init mb misc mmgr resowner sort time
+SUBDIRS = adt cache error fmgr hash init mb misc mmgr mvstats resowner sort time
# location of Catalog.pm
catalogdir = $(top_srcdir)/src/backend/catalog
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index 130c06d..3bc4c8a 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -47,6 +47,7 @@
#include "catalog/pg_auth_members.h"
#include "catalog/pg_constraint.h"
#include "catalog/pg_database.h"
+#include "catalog/pg_mv_statistic.h"
#include "catalog/pg_namespace.h"
#include "catalog/pg_opclass.h"
#include "catalog/pg_proc.h"
@@ -3956,6 +3957,62 @@ RelationGetIndexList(Relation relation)
return result;
}
+
+List *
+RelationGetMVStatList(Relation relation)
+{
+ Relation indrel;
+ SysScanDesc indscan;
+ ScanKeyData skey;
+ HeapTuple htup;
+ List *result;
+ List *oldlist;
+ MemoryContext oldcxt;
+
+ /* Quick exit if we already computed the list. */
+ if (relation->rd_mvstatvalid != 0)
+ return list_copy(relation->rd_mvstatlist);
+
+ /*
+ * We build the list we intend to return (in the caller's context) while
+ * doing the scan. After successfully completing the scan, we copy that
+ * list into the relcache entry. This avoids cache-context memory leakage
+ * if we get some sort of error partway through.
+ */
+ result = NIL;
+
+ /* Prepare to scan pg_mv_statistic for entries having starelid = this rel. */
+ ScanKeyInit(&skey,
+ Anum_pg_mv_statistic_starelid,
+ BTEqualStrategyNumber, F_OIDEQ,
+ ObjectIdGetDatum(RelationGetRelid(relation)));
+
+ indrel = heap_open(MvStatisticRelationId, AccessShareLock);
+ indscan = systable_beginscan(indrel, MvStatisticRelidIndexId, true,
+ NULL, 1, &skey);
+
+ while (HeapTupleIsValid(htup = systable_getnext(indscan)))
+ {
+ /* TODO maybe include only already built statistics? */
+ result = insert_ordered_oid(result, HeapTupleGetOid(htup));
+ }
+
+ systable_endscan(indscan);
+
+ heap_close(indrel, AccessShareLock);
+
+ /* Now save a copy of the completed list in the relcache entry. */
+ oldcxt = MemoryContextSwitchTo(CacheMemoryContext);
+ oldlist = relation->rd_mvstatlist;
+ relation->rd_mvstatlist = list_copy(result);
+
+ relation->rd_mvstatvalid = true;
+ MemoryContextSwitchTo(oldcxt);
+
+ /* Don't leak the old list, if there is one */
+ list_free(oldlist);
+
+ return result;
+}
+
/*
* insert_ordered_oid
* Insert a new Oid into a sorted list of Oids, preserving ordering
@@ -4920,6 +4977,8 @@ load_relcache_init_file(bool shared)
rel->rd_indexattr = NULL;
rel->rd_keyattr = NULL;
rel->rd_idattr = NULL;
+ rel->rd_mvstatvalid = false;
+ rel->rd_mvstatlist = NIL;
rel->rd_createSubid = InvalidSubTransactionId;
rel->rd_newRelfilenodeSubid = InvalidSubTransactionId;
rel->rd_amcache = NULL;
diff --git a/src/backend/utils/cache/syscache.c b/src/backend/utils/cache/syscache.c
index 65ffe84..3c1bc4b 100644
--- a/src/backend/utils/cache/syscache.c
+++ b/src/backend/utils/cache/syscache.c
@@ -44,6 +44,7 @@
#include "catalog/pg_foreign_server.h"
#include "catalog/pg_foreign_table.h"
#include "catalog/pg_language.h"
+#include "catalog/pg_mv_statistic.h"
#include "catalog/pg_namespace.h"
#include "catalog/pg_opclass.h"
#include "catalog/pg_operator.h"
@@ -502,6 +503,28 @@ static const struct cachedesc cacheinfo[] = {
},
4
},
+ {MvStatisticRelationId, /* MVSTATNAMENSP */
+ MvStatisticNameIndexId,
+ 2,
+ {
+ Anum_pg_mv_statistic_staname,
+ Anum_pg_mv_statistic_stanamespace,
+ 0,
+ 0
+ },
+ 4
+ },
+ {MvStatisticRelationId, /* MVSTATOID */
+ MvStatisticOidIndexId,
+ 1,
+ {
+ ObjectIdAttributeNumber,
+ 0,
+ 0,
+ 0
+ },
+ 4
+ },
{NamespaceRelationId, /* NAMESPACENAME */
NamespaceNameIndexId,
1,
diff --git a/src/backend/utils/mvstats/Makefile b/src/backend/utils/mvstats/Makefile
new file mode 100644
index 0000000..099f1ed
--- /dev/null
+++ b/src/backend/utils/mvstats/Makefile
@@ -0,0 +1,17 @@
+#-------------------------------------------------------------------------
+#
+# Makefile--
+# Makefile for utils/mvstats
+#
+# IDENTIFICATION
+# src/backend/utils/mvstats/Makefile
+#
+#-------------------------------------------------------------------------
+
+subdir = src/backend/utils/mvstats
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+
+OBJS = common.o dependencies.o
+
+include $(top_srcdir)/src/backend/common.mk
diff --git a/src/backend/utils/mvstats/README.dependencies b/src/backend/utils/mvstats/README.dependencies
new file mode 100644
index 0000000..1f96fbc
--- /dev/null
+++ b/src/backend/utils/mvstats/README.dependencies
@@ -0,0 +1,222 @@
+Soft functional dependencies
+============================
+
+A type of multivariate statistics used to capture cases when one column (or
+possibly a combination of columns) determines values in another column. We may
+also say that one column implies the other one.
+
+A simple artificial example may be a table with two columns, created like this
+
+ CREATE TABLE t (a INT, b INT)
+ AS SELECT i, i/10 FROM generate_series(1,100000) s(i);
+
+Clearly, once we know the value for column 'a' the value for 'b' is trivially
+determined, as it's simply (a/10). A more practical example may be addresses,
+where (ZIP code -> city name), i.e. once we know the ZIP, we probably know the
+city it belongs to, as ZIP codes are usually assigned to one city. Larger cities
+may have multiple ZIP codes, so the dependency can't be reversed.
+
+Functional dependencies are a concept well described in relational theory,
+particularly in definition of normalization and "normal forms". Wikipedia has a
+nice definition of a functional dependency [1]:
+
+ In a given table, an attribute Y is said to have a functional dependency on
+ a set of attributes X (written X -> Y) if and only if each X value is
+ associated with precisely one Y value. For example, in an "Employee" table
+ that includes the attributes "Employee ID" and "Employee Date of Birth", the
+ functional dependency {Employee ID} -> {Employee Date of Birth} would hold.
+ It follows from the previous two sentences that each {Employee ID} is
+ associated with precisely one {Employee Date of Birth}.
+
+ [1] http://en.wikipedia.org/wiki/Database_normalization
+
+Many datasets might be normalized not to contain such dependencies, but often
+it's not practical for various reasons. In some cases it's actually a conscious
+design choice to model the dataset in a denormalized way, either because of
+performance or to make querying easier.
+
+The functional dependencies are called 'soft' because the implementation is
+meant to allow a small number of rows contradicting the dependency. Many actual
+data sets contain some sort of errors, either because of data entry mistakes
+(user mistyping the ZIP code) or issues in generating the data (e.g. a ZIP code
+mistakenly assigned to two cities in different states). A strict implementation
+would ignore dependencies on such noisy data, rendering the approach unusable on
+such data sets.
+
+
+Mining dependencies (ANALYZE)
+-----------------------------
+
+The current build algorithm is rather simple - for each pair (a,b) of columns,
+the data are sorted lexicographically (first by 'a', then by 'b'). Then for each
+group (rows with the same 'a' value) we decide whether the group is neutral,
+supporting or contradicting the dependency (a->b).
+
+A group is considered neutral when it's too small - e.g. when there's a single
+row in the group, there can't possibly be multiple values in 'b'. For this
+reason we ignore groups smaller than a threshold (currently 3 rows).
+
+For sufficiently large groups (3 rows or more), we count the number of distinct
+values in 'b'. When there's a single 'b' value, the group is considered to
+support the dependency (a->b), otherwise it's considered to contradict it.
+
+At the end, we compare the number of rows in supporting and contradicting groups,
+and if there are at least 10x as many supporting rows, we consider the
+functional dependency to be valid.
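+
+The same test can be sketched in plain SQL (only a sketch - the actual
+implementation works on the sampled rows, not the whole table, and uses
+the threshold constants described above):
+
+ SELECT sum(cnt) FILTER (WHERE ndist = 1) AS supporting,
+        sum(cnt) FILTER (WHERE ndist > 1) AS contradicting
+   FROM (SELECT a, count(*) AS cnt, count(DISTINCT b) AS ndist
+           FROM t GROUP BY a) g
+  WHERE cnt >= 3;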
+
+
+This approach has the negative property that the algorithm is a bit
+fragile with respect to the sample - there may be data sets producing quite
+different results for each ANALYZE execution (as even a single row may change
+the outcome of the final 10x test).
+
+It was proposed to make the dependencies "fuzzy" - e.g. track some coefficient
+between [0,1] determining how much the dependency holds. That would however mean
+we have to keep all the dependencies, as eliminating them based on the value of
+the coefficient (e.g. throw away dependencies <= 0.5) would result in exactly
+the same fragility issues. This would also make it more complicated to combine
+dependencies. So this does not seem like a practical approach.
+
+A better approach might be to replace the constants (min_group_size=3 and 10x)
+with values somehow related to the particular data set.
+
+
+Clause reduction (planner/optimizer)
+------------------------------------
+
+Applying the functional dependencies is quite simple - given a list of equality
+clauses, check which clauses are redundant (i.e. implied by some other clause).
+For example, given the clause list
+
+ (a = 2) AND (b = 2) AND (c = 3)
+
+and the dependency (a->b), the list of clauses may be simplified to
+
+ (a = 2) AND (c = 3)
+
+Functional dependencies may only be applied to equality clauses; all other
+types of clauses are ignored. See clauselist_apply_dependencies() for more
+details.
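+
+For example, using the syntax added by this patch (and the table 't' from the
+example above - the improved estimate assumes ANALYZE detects the (a->b)
+dependency):
+
+ CREATE STATISTICS s1 ON t (a, b) WITH (dependencies = true);
+ ANALYZE t;
+
+ -- (b = 2) is implied by (a = 20), so the planner may estimate the
+ -- query using the (a = 20) clause alone
+ SELECT * FROM t WHERE (a = 20) AND (b = 2);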
+
+
+Compatibility of clauses
+------------------------
+
+The reduction assumes the clauses really are redundant, i.e. that the value in
+the reduced clause (b=2) is the value determined by (a=2). If that's not the
+case and the values are "incompatible", the result will be an over-estimate.
+
+This may happen for example when using conditions on ZIP and city name with
+mismatching values (ZIP for a different city), etc. In such case the result
+set will be empty, but we'll estimate the selectivity using the ZIP condition.
+
+In this case the default estimation, based on the attribute value independence
+assumption (AVIA), happens to work better - but mostly by chance.
+
+
+Dependencies vs. MCV/histogram
+------------------------------
+
+In some cases the "compatibility" of the conditions might be verified using the
+other types of multivariate stats - MCV lists and histograms.
+
+For MCV lists the verification might be very simple - peek into the list to see
+whether there are any items matching the clause on the 'a' column (e.g. ZIP
+code), and if such an item is found, check that the 'b' column matches the
+other clause. If it does not, the clauses are contradictory. If no such item
+is found we can't really conclude anything, except maybe restricting the
+selectivity using the MCV data (e.g. using min/max selectivity, or something).
+
+With histograms, it might work similarly - we can't check the values directly
+(because histograms use buckets, unlike MCV lists, which store the actual values).
+So we can only observe the buckets matching the clauses - if those buckets have
+very low frequency, it probably means the two clauses are incompatible.
+
+It's unclear what 'low frequency' is, but if one of the clauses is implied
+(automatically true because of the other clause), then
+
+ selectivity[clause(A)] = selectivity[clause(A) & clause(B)]
+
+So we might compute selectivity of the first clause - for example using regular
+statistics. And then check if the selectivity computed from the histogram is
+about the same (or significantly lower).
+
+The problem is that histograms work well only when the data ordering matches the
+natural meaning. For values that serve as labels - like city names or ZIP codes,
+or even generated IDs - histograms really don't work all that well. For example
+sorting cities by name won't match the sorting of ZIP codes, rendering the
+histogram unusable.
+
+So MCVs are probably going to work much better, because they don't really assume
+any sort of ordering. And it's probably more appropriate for the label-like data.
+
+A good question however is why even use functional dependencies in such cases
+and not simply use the MCV/histogram instead. One reason is that the functional
+dependencies allow fallback to regular stats, and often produce more accurate
+estimates - especially compared to histograms, that are quite bad in estimating
+equality clauses.
+
+
+Limitations
+-----------
+
+Let's look at the main limitations of functional dependencies, especially those
+related to the current implementation.
+
+The current implementation supports only dependencies between two columns, but
+this is merely a simplification of the initial implementation. It's certainly
+useful to mine for dependencies involving multiple columns on the 'left' side,
+i.e. as the condition of the dependency - that is, dependencies like (a,b -> c).
+
+The implementation may/should be smart enough not to mine redundant conditions,
+e.g. (a->b) and (a,c -> b), because the latter is a trivial consequence of the
+former one (if values of 'a' determine 'b', adding another column won't change
+that relationship). The ANALYZE should first analyze 1:1 dependencies, then 2:1
+dependencies (and skip the already identified ones), etc.
+
+For example the dependency
+
+ (city name -> zip code)
+
+is much stronger, i.e. whenever it holds, then
+
+ (city name, state name -> zip code)
+
+holds too. But in case there are cities with the same name in different states,
+then only the latter dependency will be valid.
+
+Of course, there probably are cities with the same name within a single state,
+but hopefully this is a relatively rare occurrence (and thus we'll still detect
+the 'soft' dependency).
+
+Handling multiple columns on the right side of the dependency is not necessary,
+as those dependencies may be simply decomposed into a set of dependencies with
+the same meaning, one for each column on the right side. For example
+
+ (a -> b,c)
+
+is exactly the same as
+
+ (a -> b) & (a -> c)
+
+Of course, storing the first form may be more efficient than storing multiple
+'simple' dependencies separately.
+
+
+TODO Support dependencies with multiple columns on left/right.
+
+TODO Investigate using histogram and MCV list to verify the dependencies.
+
+TODO Investigate statistical testing of the distribution (to decide whether it
+ makes sense to build the histogram/MCV list).
+
+TODO Using a min/max of selectivities would probably make more sense for the
+ associated columns.
+
+TODO Consider eliminating the implied columns from the histogram and MCV lists
+ (but maybe that's not a good idea, because that'd make it impossible to use
+ these stats for non-equality clauses and also it wouldn't be possible to
+ use the stats for verification of the dependencies).
+
+TODO The reduction probably might be extended to also handle IS NULL clauses,
+ assuming we fix the ANALYZE to properly handle NULL values. We however
+ won't be able to reduce IS NOT NULL (unless I'm missing something).
diff --git a/src/backend/utils/mvstats/common.c b/src/backend/utils/mvstats/common.c
new file mode 100644
index 0000000..a755c49
--- /dev/null
+++ b/src/backend/utils/mvstats/common.c
@@ -0,0 +1,356 @@
+/*-------------------------------------------------------------------------
+ *
+ * common.c
+ * POSTGRES multivariate statistics
+ *
+ *
+ * Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ * src/backend/utils/mvstats/common.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "common.h"
+
+static VacAttrStats ** lookup_var_attr_stats(int2vector *attrs,
+ int natts, VacAttrStats **vacattrstats);
+
+static List* list_mv_stats(Oid relid);
+
+
+/*
+ * Compute requested multivariate stats, using the rows sampled for the
+ * plain (single-column) stats.
+ *
+ * This fetches a list of stats from pg_mv_statistic, computes the stats
+ * and serializes them back into the catalog (as bytea values).
+ */
+void
+build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
+ int natts, VacAttrStats **vacattrstats)
+{
+ ListCell *lc;
+ List *mvstats;
+
+ TupleDesc tupdesc = RelationGetDescr(onerel);
+
+ /*
+ * Fetch defined MV stats from pg_mv_statistic, and then compute
+ * the MV statistics (functional dependencies for now).
+ */
+ mvstats = list_mv_stats(RelationGetRelid(onerel));
+
+ foreach (lc, mvstats)
+ {
+ int j;
+ MVStatisticInfo *stat = (MVStatisticInfo *)lfirst(lc);
+ MVDependencies deps = NULL;
+
+ VacAttrStats **stats = NULL;
+ int numatts = 0;
+
+ /* int2 vector of attnums the stats should be computed on */
+ int2vector * attrs = stat->stakeys;
+
+ /* see how many of the columns are not dropped */
+ for (j = 0; j < attrs->dim1; j++)
+ if (! tupdesc->attrs[attrs->values[j]-1]->attisdropped)
+ numatts += 1;
+
+ /* if there are dropped attributes, build a filtered int2vector */
+ if (numatts != attrs->dim1)
+ {
+ int16 *tmp = palloc0(numatts * sizeof(int16));
+ int attnum = 0;
+
+ for (j = 0; j < attrs->dim1; j++)
+ if (! tupdesc->attrs[attrs->values[j]-1]->attisdropped)
+ tmp[attnum++] = attrs->values[j];
+
+ pfree(attrs);
+ attrs = buildint2vector(tmp, numatts);
+ }
+
+ /* filter only the interesting vacattrstats records */
+ stats = lookup_var_attr_stats(attrs, natts, vacattrstats);
+
+ /* check allowed number of dimensions */
+ Assert((attrs->dim1 >= 2) && (attrs->dim1 <= MVSTATS_MAX_DIMENSIONS));
+
+ /*
+ * Analyze functional dependencies of columns.
+ */
+ deps = build_mv_dependencies(numrows, rows, attrs, stats);
+
+ /* store the computed dependencies in the catalog */
+ update_mv_stats(stat->mvoid, deps, attrs);
+ }
+}
+
+/*
+ * Lookup the VacAttrStats info for the selected columns, with indexes
+ * matching the attrs vector (to make it easy to work with when
+ * computing multivariate stats).
+ */
+static VacAttrStats **
+lookup_var_attr_stats(int2vector *attrs, int natts, VacAttrStats **vacattrstats)
+{
+ int i, j;
+ int numattrs = attrs->dim1;
+ VacAttrStats **stats = (VacAttrStats**)palloc0(numattrs * sizeof(VacAttrStats*));
+
+ /* lookup VacAttrStats info for the requested columns (same attnum) */
+ for (i = 0; i < numattrs; i++)
+ {
+ stats[i] = NULL;
+ for (j = 0; j < natts; j++)
+ {
+ if (attrs->values[i] == vacattrstats[j]->tupattnum)
+ {
+ stats[i] = vacattrstats[j];
+ break;
+ }
+ }
+
+ /*
+ * Check that we found the info, that the attnum matches, and
+ * that the requested 'lt' operator is available.
+ */
+ Assert(stats[i] != NULL);
+ Assert(stats[i]->tupattnum == attrs->values[i]);
+
+ /* FIXME This is a rather ugly way to check for 'ltopr' (which
+ * is defined for 'scalar' attributes).
+ */
+ Assert(((StdAnalyzeData *)stats[i]->extra_data)->ltopr != InvalidOid);
+ }
+
+ return stats;
+}
+
+/*
+ * Fetch list of MV stats defined on a table, without the actual data
+ * for histograms, MCV lists etc.
+ */
+static List*
+list_mv_stats(Oid relid)
+{
+ Relation indrel;
+ SysScanDesc indscan;
+ ScanKeyData skey;
+ HeapTuple htup;
+ List *result = NIL;
+
+ /* Prepare to scan pg_mv_statistic for entries having starelid = this rel. */
+ ScanKeyInit(&skey,
+ Anum_pg_mv_statistic_starelid,
+ BTEqualStrategyNumber, F_OIDEQ,
+ ObjectIdGetDatum(relid));
+
+ indrel = heap_open(MvStatisticRelationId, AccessShareLock);
+ indscan = systable_beginscan(indrel, MvStatisticRelidIndexId, true,
+ NULL, 1, &skey);
+
+ while (HeapTupleIsValid(htup = systable_getnext(indscan)))
+ {
+ MVStatisticInfo *info = makeNode(MVStatisticInfo);
+ Form_pg_mv_statistic stats = (Form_pg_mv_statistic) GETSTRUCT(htup);
+
+ info->mvoid = HeapTupleGetOid(htup);
+ info->stakeys = buildint2vector(stats->stakeys.values, stats->stakeys.dim1);
+ info->deps_built = stats->deps_built;
+
+ result = lappend(result, info);
+ }
+
+ systable_endscan(indscan);
+
+ heap_close(indrel, AccessShareLock);
+
+ /* TODO maybe save the list into relcache, as in RelationGetIndexList
+ * (which was used as inspiration for this one). */
+
+ return result;
+}
+
+void
+update_mv_stats(Oid mvoid, MVDependencies dependencies, int2vector *attrs)
+{
+ HeapTuple stup,
+ oldtup;
+ Datum values[Natts_pg_mv_statistic];
+ bool nulls[Natts_pg_mv_statistic];
+ bool replaces[Natts_pg_mv_statistic];
+
+ Relation sd = heap_open(MvStatisticRelationId, RowExclusiveLock);
+
+ memset(nulls, 1, Natts_pg_mv_statistic * sizeof(bool));
+ memset(replaces, 0, Natts_pg_mv_statistic * sizeof(bool));
+ memset(values, 0, Natts_pg_mv_statistic * sizeof(Datum));
+
+ /*
+ * Construct a new pg_mv_statistic tuple - replace only the dependencies,
+ * depending on whether they actually were computed.
+ */
+ if (dependencies != NULL)
+ {
+ nulls[Anum_pg_mv_statistic_stadeps -1] = false;
+ values[Anum_pg_mv_statistic_stadeps - 1]
+ = PointerGetDatum(serialize_mv_dependencies(dependencies));
+ }
+
+ /* always replace the value (either by bytea or NULL) */
+ replaces[Anum_pg_mv_statistic_stadeps -1] = true;
+
+ /* always change the availability flags */
+ nulls[Anum_pg_mv_statistic_deps_built -1] = false;
+ nulls[Anum_pg_mv_statistic_stakeys-1] = false;
+
+ /* use the new attnums, in case we removed some dropped ones */
+ replaces[Anum_pg_mv_statistic_deps_built-1] = true;
+ replaces[Anum_pg_mv_statistic_stakeys -1] = true;
+
+ values[Anum_pg_mv_statistic_deps_built-1] = BoolGetDatum(dependencies != NULL);
+ values[Anum_pg_mv_statistic_stakeys -1] = PointerGetDatum(attrs);
+
+ /* Is there already a pg_mv_statistic tuple for this attribute? */
+ oldtup = SearchSysCache1(MVSTATOID,
+ ObjectIdGetDatum(mvoid));
+
+ if (HeapTupleIsValid(oldtup))
+ {
+ /* Yes, replace it */
+ stup = heap_modify_tuple(oldtup,
+ RelationGetDescr(sd),
+ values,
+ nulls,
+ replaces);
+ ReleaseSysCache(oldtup);
+ simple_heap_update(sd, &stup->t_self, stup);
+ }
+ else
+ elog(ERROR, "invalid pg_mv_statistic record (oid=%d)", mvoid);
+
+ /* update indexes too */
+ CatalogUpdateIndexes(sd, stup);
+
+ heap_freetuple(stup);
+
+ heap_close(sd, RowExclusiveLock);
+}
+
+/* multi-variate stats comparator */
+
+/*
+ * qsort_arg comparator for sorting Datums (MV stats)
+ *
+ * This does not maintain the tupnoLink array.
+ */
+int
+compare_scalars_simple(const void *a, const void *b, void *arg)
+{
+ Datum da = *(Datum*)a;
+ Datum db = *(Datum*)b;
+ SortSupport ssup= (SortSupport) arg;
+
+ return ApplySortComparator(da, false, db, false, ssup);
+}
+
+/*
+ * qsort_arg comparator for sorting data when partitioning a MV bucket
+ */
+int
+compare_scalars_partition(const void *a, const void *b, void *arg)
+{
+ Datum da = ((ScalarItem*)a)->value;
+ Datum db = ((ScalarItem*)b)->value;
+ SortSupport ssup= (SortSupport) arg;
+
+ return ApplySortComparator(da, false, db, false, ssup);
+}
+
+/* initialize multi-dimensional sort */
+MultiSortSupport
+multi_sort_init(int ndims)
+{
+ MultiSortSupport mss;
+
+ Assert(ndims >= 2);
+
+ mss = (MultiSortSupport)palloc0(offsetof(MultiSortSupportData, ssup)
+ + sizeof(SortSupportData)*ndims);
+
+ mss->ndims = ndims;
+
+ return mss;
+}
+
+/*
+ * add sort info for dimension 'dim' (index into vacattrstats) to mss,
+ * at position 'sortdim'
+ */
+void
+multi_sort_add_dimension(MultiSortSupport mss, int sortdim,
+ int dim, VacAttrStats **vacattrstats)
+{
+ /* first, lookup StdAnalyzeData for the dimension (attribute) */
+ SortSupportData ssup;
+ StdAnalyzeData *tmp = (StdAnalyzeData *)vacattrstats[dim]->extra_data;
+
+ Assert(mss != NULL);
+ Assert(sortdim < mss->ndims);
+
+ /* initialize sort support, etc. */
+ memset(&ssup, 0, sizeof(ssup));
+ ssup.ssup_cxt = CurrentMemoryContext;
+
+ /* We always use the default collation for statistics */
+ ssup.ssup_collation = DEFAULT_COLLATION_OID;
+ ssup.ssup_nulls_first = false;
+
+ PrepareSortSupportFromOrderingOp(tmp->ltopr, &ssup);
+
+ mss->ssup[sortdim] = ssup;
+}
+
+/* compare all the dimensions in the selected order */
+int
+multi_sort_compare(const void *a, const void *b, void *arg)
+{
+ int i;
+ SortItem *ia = (SortItem*)a;
+ SortItem *ib = (SortItem*)b;
+
+ MultiSortSupport mss = (MultiSortSupport)arg;
+
+ for (i = 0; i < mss->ndims; i++)
+ {
+ int compare;
+
+ compare = ApplySortComparator(ia->values[i], ia->isnull[i],
+ ib->values[i], ib->isnull[i],
+ &mss->ssup[i]);
+
+ if (compare != 0)
+ return compare;
+
+ }
+
+ /* equal by default */
+ return 0;
+}
+
+/* compare selected dimension */
+int
+multi_sort_compare_dim(int dim, const SortItem *a, const SortItem *b,
+ MultiSortSupport mss)
+{
+ return ApplySortComparator(a->values[dim], a->isnull[dim],
+ b->values[dim], b->isnull[dim],
+ &mss->ssup[dim]);
+}
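+
+/*
+ * Typical use of the multi-sort support (an illustrative sketch only;
+ * this mirrors what build_mv_dependencies does in dependencies.c, with
+ * 'items', 'numrows' and 'stats' prepared by the caller):
+ *
+ * MultiSortSupport mss = multi_sort_init(2);
+ *
+ * multi_sort_add_dimension(mss, 0, dima, stats);
+ * multi_sort_add_dimension(mss, 1, dimb, stats);
+ *
+ * qsort_arg((void *) items, numrows, sizeof(SortItem),
+ * multi_sort_compare, mss);
+ */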
diff --git a/src/backend/utils/mvstats/common.h b/src/backend/utils/mvstats/common.h
new file mode 100644
index 0000000..d96422d
--- /dev/null
+++ b/src/backend/utils/mvstats/common.h
@@ -0,0 +1,75 @@
+/*-------------------------------------------------------------------------
+ *
+ * common.h
+ * POSTGRES multivariate statistics
+ *
+ *
+ * Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ * src/backend/utils/mvstats/common.h
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres.h"
+
+#include "access/sysattr.h"
+#include "access/tuptoaster.h"
+#include "catalog/indexing.h"
+#include "catalog/pg_collation.h"
+#include "catalog/pg_mv_statistic.h"
+#include "foreign/fdwapi.h"
+#include "postmaster/autovacuum.h"
+#include "storage/lmgr.h"
+#include "utils/builtins.h"
+#include "utils/datum.h"
+#include "utils/fmgroids.h"
+#include "utils/mvstats.h"
+#include "utils/sortsupport.h"
+#include "utils/syscache.h"
+
+
+/* FIXME private structure copied from analyze.c */
+
+typedef struct
+{
+ Oid eqopr; /* '=' operator for datatype, if any */
+ Oid eqfunc; /* and associated function */
+ Oid ltopr; /* '<' operator for datatype, if any */
+} StdAnalyzeData;
+
+typedef struct
+{
+ Datum value; /* a data value */
+ int tupno; /* position index for tuple it came from */
+} ScalarItem;
+
+/* multi-sort */
+typedef struct MultiSortSupportData {
+ int ndims; /* number of dimensions supported by the sort */
+ SortSupportData ssup[1]; /* sort support data for each dimension */
+} MultiSortSupportData;
+
+typedef MultiSortSupportData* MultiSortSupport;
+
+typedef struct SortItem {
+ Datum *values;
+ bool *isnull;
+} SortItem;
+
+MultiSortSupport multi_sort_init(int ndims);
+
+void multi_sort_add_dimension(MultiSortSupport mss, int sortdim,
+ int dim, VacAttrStats **vacattrstats);
+
+int multi_sort_compare(const void *a, const void *b, void *arg);
+
+int multi_sort_compare_dim(int dim, const SortItem *a,
+ const SortItem *b, MultiSortSupport mss);
+
+/* comparators, used when constructing multivariate stats */
+int compare_scalars_simple(const void *a, const void *b, void *arg);
+int compare_scalars_partition(const void *a, const void *b, void *arg);
diff --git a/src/backend/utils/mvstats/dependencies.c b/src/backend/utils/mvstats/dependencies.c
new file mode 100644
index 0000000..2a064a0
--- /dev/null
+++ b/src/backend/utils/mvstats/dependencies.c
@@ -0,0 +1,437 @@
+/*-------------------------------------------------------------------------
+ *
+ * dependencies.c
+ * POSTGRES multivariate functional dependencies
+ *
+ *
+ * Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ * src/backend/utils/mvstats/dependencies.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "common.h"
+#include "utils/lsyscache.h"
+
+/*
+ * Detect functional dependencies between columns.
+ *
+ * TODO This builds a complete set of dependencies, i.e. including transitive
+ * dependencies - if we identify [A => B] and [B => C], we're likely to
+ * identify [A => C] too. It might be better to keep only the minimal set
+ * of dependencies, i.e. prune all the dependencies that we can recreate
+ * by transitivity.
+ *
+ * There are two conceptual ways to do that:
+ *
+ * (a) generate all the rules, and then prune the rules that may be
+ * recreated by combining other dependencies, or
+ *
+ * (b) perform the 'is combination of other dependencies' check before
+ * actually doing the work
+ *
+ * The second option has the advantage that we don't really need to perform
+ * the sort/count. It's not sufficient alone, though, because we may
+ * discover the dependencies in the wrong order. For example we may find
+ *
+ * (a -> b), (a -> c) and then (b -> c)
+ *
+ * None of those dependencies is a combination of the already known ones,
+ * yet (a -> c) is a combination of (a -> b) and (b -> c).
+ *
+ *
+ * FIXME Currently we simply replace NULL values with 0 and then handle it as
+ * a regular value, but that groups NULL and actual 0 values. That's
+ * clearly incorrect - we need to handle NULL values as a separate value.
+ */
+MVDependencies
+build_mv_dependencies(int numrows, HeapTuple *rows, int2vector *attrs,
+ VacAttrStats **stats)
+{
+ int i;
+ int numattrs = attrs->dim1;
+
+ /* result */
+ int ndeps = 0;
+ MVDependencies dependencies = NULL;
+ MultiSortSupport mss = multi_sort_init(2); /* 2 dimensions for now */
+
+ /* TODO Maybe this should be somehow related to the number of
+ * distinct values in the two columns we're currently analyzing.
+ * Assuming the distribution is uniform, we can estimate the
+ * average group size and use it as a threshold. Or something
+ * like that. Seems better than a static approach.
+ */
+ int min_group_size = 3;
+
+ /* dimension indexes we'll check for associations [a => b] */
+ int dima, dimb;
+
+ /*
+ * We'll reuse the same array for all the 2-column combinations.
+ *
+ * It's possible to sort the sample rows directly, but this seemed
+ * somewhat simpler / less error prone. Another option would be to
+ * allocate the arrays for each SortItem separately, but that'd be
+ * significant overhead (not just CPU, but especially memory bloat).
+ */
+ SortItem * items = (SortItem*)palloc0(numrows * sizeof(SortItem));
+
+ Datum *values = (Datum*)palloc0(sizeof(Datum) * numrows * 2);
+ bool *isnull = (bool*)palloc0(sizeof(bool) * numrows * 2);
+
+ for (i = 0; i < numrows; i++)
+ {
+ items[i].values = &values[i * 2];
+ items[i].isnull = &isnull[i * 2];
+ }
+
+ Assert(numattrs >= 2);
+
+ /*
+ * Evaluate all possible combinations of [A => B], using a simple algorithm:
+ *
+ * (a) sort the data by [A,B]
+ * (b) split the data into groups by A (new group whenever a value changes)
+ * (c) count different values in the B column (again, value changes)
+ *
+ * TODO It should be rather simple to merge [A => B] and [A => C] into
+ * [A => B,C]. Just keep A constant, collect all the "implied" columns
+ * and you're done.
+ */
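+ /*
+ * For illustration (hypothetical data): with rows sorted by [zip, city]
+ *
+ * (12345, Boston), (12345, Boston), (12345, Chicago), (67890, Dallas)
+ *
+ * the zip=12345 group contains two distinct city values, so it counts
+ * as contradicting [zip => city]; a group with a single city value
+ * (and at least min_group_size rows) would count as supporting it.
+ */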
+ for (dima = 0; dima < numattrs; dima++)
+ {
+ /* prepare the sort function for the first dimension */
+ multi_sort_add_dimension(mss, 0, dima, stats);
+
+ for (dimb = 0; dimb < numattrs; dimb++)
+ {
+ SortItem current;
+
+ /* number of groups supporting / contradicting the dependency */
+ int n_supporting = 0;
+ int n_contradicting = 0;
+
+ /* counters valid within a group */
+ int group_size = 0;
+ int n_violations = 0;
+
+ int n_supporting_rows = 0;
+ int n_contradicting_rows = 0;
+
+ /* make sure the columns are different (i.e. skip the A => A case) */
+ if (dima == dimb)
+ continue;
+
+ /* prepare the sort function for the second dimension */
+ multi_sort_add_dimension(mss, 1, dimb, stats);
+
+ /* reset the values and isnull flags */
+ memset(values, 0, sizeof(Datum) * numrows * 2);
+ memset(isnull, 0, sizeof(bool) * numrows * 2);
+
+ /* accumulate all the data for both columns into an array and sort it */
+ for (i = 0; i < numrows; i++)
+ {
+ items[i].values[0]
+ = heap_getattr(rows[i], attrs->values[dima],
+ stats[dima]->tupDesc, &items[i].isnull[0]);
+
+ items[i].values[1]
+ = heap_getattr(rows[i], attrs->values[dimb],
+ stats[dimb]->tupDesc, &items[i].isnull[1]);
+ }
+
+ qsort_arg((void *) items, numrows, sizeof(SortItem),
+ multi_sort_compare, mss);
+
+ /*
+ * Walk through the array, split it into groups according to
+ * the A value, and count distinct values in the other one.
+ * If there's a single B value for the whole group, we count
+ * it as supporting the association, otherwise we count it
+ * as contradicting.
+ *
+ * Furthermore we require a group to have at least a certain
+ * number of rows to be considered useful for supporting the
+ * dependency. A contradicting group, however, is always counted.
+ */
+
+ /* start with values from the first row */
+ current = items[0];
+ group_size = 1;
+
+ for (i = 1; i < numrows; i++)
+ {
+ /* end of the group */
+ if (multi_sort_compare_dim(0, &items[i], &current, mss) != 0)
+ {
+ /*
+ * If there are no contradicting rows, count it as
+ * supporting (otherwise contradicting), but only if
+ * the group is large enough.
+ *
+ * The requirement of a minimum group size makes it
+ * impossible to identify [unique,unique] cases, but
+ * that's probably a different case. This is more
+ * about [zip => city] associations etc.
+ *
+ * If there are violations, count the group/rows as
+ * a violation.
+ *
+ * It may be neither, if the group is too small (does
+ * not contain at least min_group_size rows).
+ */
+ if ((n_violations == 0) && (group_size >= min_group_size))
+ {
+ n_supporting += 1;
+ n_supporting_rows += group_size;
+ }
+ else if (n_violations > 0)
+ {
+ n_contradicting += 1;
+ n_contradicting_rows += group_size;
+ }
+
+ /* current values start a new group */
+ n_violations = 0;
+ group_size = 0;
+ }
+ /* mismatch of a B value is contradicting */
+ else if (multi_sort_compare_dim(1, &items[i], &current, mss) != 0)
+ {
+ n_violations += 1;
+ }
+
+ current = items[i];
+ group_size += 1;
+ }
+
+ /* handle the last group (just like above) */
+ if ((n_violations == 0) && (group_size >= min_group_size))
+ {
+ n_supporting += 1;
+ n_supporting_rows += group_size;
+ }
+ else if (n_violations)
+ {
+ n_contradicting += 1;
+ n_contradicting_rows += group_size;
+ }
+
+ /*
+ * See if the number of rows supporting the association is at least
+ * 10x the number of rows violating the hypothetical dependency.
+ *
+ * TODO This is a rather arbitrary limit - I guess it's possible to do
+ * some math to come up with a better rule (e.g. testing a hypothesis
+ * 'this is due to randomness'). We can create a contingency table
+ * from the values and use it for testing. Possibly only when
+ * there are no contradicting rows?
+ *
+ * TODO Also, if (a => b) and (b => a) at the same time, it pretty much
+ * means there's a 1:1 relation (or one is a 'label'), making the
+ * conditions rather redundant. Although it's possible that the
+ * query uses an incompatible combination of values.
+ */
+ if (n_supporting_rows > (n_contradicting_rows * 10))
+ {
+ if (dependencies == NULL)
+ {
+ dependencies = (MVDependencies)palloc0(sizeof(MVDependenciesData));
+ dependencies->magic = MVSTAT_DEPS_MAGIC;
+ }
+ else
+ dependencies = repalloc(dependencies, offsetof(MVDependenciesData, deps)
+ + sizeof(MVDependency) * (dependencies->ndeps + 1));
+
+ /* add the new dependency to the list */
+ dependencies->deps[ndeps] = (MVDependency)palloc0(sizeof(MVDependencyData));
+ dependencies->deps[ndeps]->a = attrs->values[dima];
+ dependencies->deps[ndeps]->b = attrs->values[dimb];
+
+ dependencies->ndeps = (++ndeps);
+ }
+ }
+ }
+
+ pfree(items);
+ pfree(values);
+ pfree(isnull);
+ pfree(stats);
+ pfree(mss);
+
+ return dependencies;
+}
+
+/*
+ * Store the dependencies into a bytea, so that it can be stored in the
+ * pg_mv_statistic catalog.
+ *
+ * Currently this only supports simple two-column rules, and stores them
+ * as a sequence of attnum pairs. In the future, this needs to be made
+ * more complex to support multiple columns on both sides of the
+ * implication (using AND on left, OR on right).
+ */
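+/*
+ * The resulting layout is (sizes in bytes):
+ *
+ * varlena header | magic (4) | ndeps (4) | a,b (2+2) | a,b (2+2) | ...
+ *
+ * i.e. the MVDependenciesData header up to offsetof(..., deps), followed
+ * by ndeps pairs of attnums.
+ */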
+bytea *
+serialize_mv_dependencies(MVDependencies dependencies)
+{
+ int i;
+
+ /* we need to store ndeps, and each needs 2 * int16 */
+ Size len = VARHDRSZ + offsetof(MVDependenciesData, deps)
+ + dependencies->ndeps * (sizeof(int16) * 2);
+
+ bytea * output = (bytea*)palloc0(len);
+
+ char * tmp = VARDATA(output);
+
+ SET_VARSIZE(output, len);
+
+ /* first, store the number of dimensions / items */
+ memcpy(tmp, dependencies, offsetof(MVDependenciesData, deps));
+ tmp += offsetof(MVDependenciesData, deps);
+
+ /* walk through the dependencies and copy both columns into the bytea */
+ for (i = 0; i < dependencies->ndeps; i++)
+ {
+ memcpy(tmp, &(dependencies->deps[i]->a), sizeof(int16));
+ tmp += sizeof(int16);
+
+ memcpy(tmp, &(dependencies->deps[i]->b), sizeof(int16));
+ tmp += sizeof(int16);
+ }
+
+ return output;
+}
+
+/*
+ * Reads serialized dependencies into MVDependencies structure.
+ */
+MVDependencies
+deserialize_mv_dependencies(bytea * data)
+{
+ int i;
+ Size expected_size;
+ MVDependencies dependencies;
+ char *tmp;
+
+ if (data == NULL)
+ return NULL;
+
+ if (VARSIZE_ANY_EXHDR(data) < offsetof(MVDependenciesData,deps))
+ elog(ERROR, "invalid MVDependencies size %ld (expected at least %ld)",
+ VARSIZE_ANY_EXHDR(data), offsetof(MVDependenciesData,deps));
+
+ /* read the MVDependencies header */
+ dependencies = (MVDependencies)palloc0(sizeof(MVDependenciesData));
+
+ /* initialize pointer to the data part (skip the varlena header) */
+ tmp = VARDATA(data);
+
+ /* get the header and perform basic sanity checks */
+ memcpy(dependencies, tmp, offsetof(MVDependenciesData, deps));
+ tmp += offsetof(MVDependenciesData, deps);
+
+ if (dependencies->magic != MVSTAT_DEPS_MAGIC)
+ {
+ pfree(dependencies);
+ elog(WARNING, "not a MV Dependencies (magic number mismatch)");
+ return NULL;
+ }
+
+ Assert(dependencies->ndeps > 0);
+
+ /* what bytea size do we expect for those parameters */
+ expected_size = offsetof(MVDependenciesData,deps) +
+ dependencies->ndeps * sizeof(int16) * 2;
+
+ if (VARSIZE_ANY_EXHDR(data) != expected_size)
+ elog(ERROR, "invalid dependencies size %ld (expected %ld)",
+ VARSIZE_ANY_EXHDR(data), expected_size);
+
+ /* allocate space for the dependency pointers */
+ dependencies = repalloc(dependencies, offsetof(MVDependenciesData,deps)
+ + (dependencies->ndeps * sizeof(MVDependency)));
+
+ for (i = 0; i < dependencies->ndeps; i++)
+ {
+ dependencies->deps[i] = (MVDependency)palloc0(sizeof(MVDependencyData));
+
+ memcpy(&(dependencies->deps[i]->a), tmp, sizeof(int16));
+ tmp += sizeof(int16);
+
+ memcpy(&(dependencies->deps[i]->b), tmp, sizeof(int16));
+ tmp += sizeof(int16);
+ }
+
+ return dependencies;
+}
+
+/* print some basic info about dependencies (number of dependencies) */
+Datum
+pg_mv_stats_dependencies_info(PG_FUNCTION_ARGS)
+{
+ bytea *data = PG_GETARG_BYTEA_P(0);
+ char *result;
+
+ MVDependencies dependencies = deserialize_mv_dependencies(data);
+
+ if (dependencies == NULL)
+ PG_RETURN_NULL();
+
+ result = palloc0(128);
+ snprintf(result, 128, "dependencies=%d", dependencies->ndeps);
+
+ /* FIXME free the deserialized data (pfree is not enough) */
+
+ PG_RETURN_TEXT_P(cstring_to_text(result));
+}
+
+/* print the dependencies
+ *
+ * TODO Would be nice if this knew the actual column names (instead of
+ * the attnums).
+ *
+ * FIXME This is really ugly and does not really check the lengths and
+ * strcpy/snprintf return values properly. Needs to be fixed.
+ */
+Datum
+pg_mv_stats_dependencies_show(PG_FUNCTION_ARGS)
+{
+ int i = 0;
+ bytea *data = PG_GETARG_BYTEA_P(0);
+ char *result = NULL;
+ int len = 0;
+
+ MVDependencies dependencies = deserialize_mv_dependencies(data);
+
+ if (dependencies == NULL)
+ PG_RETURN_NULL();
+
+ for (i = 0; i < dependencies->ndeps; i++)
+ {
+ MVDependency dependency = dependencies->deps[i];
+ char buffer[128];
+
+ int tmp = snprintf(buffer, 128, "%s%d => %d",
+ ((i == 0) ? "" : ", "), dependency->a, dependency->b);
+
+ if (tmp < 127)
+ {
+ if (result == NULL)
+ result = palloc0(len + tmp + 1);
+ else
+ result = repalloc(result, len + tmp + 1);
+
+ strcpy(result + len, buffer);
+ len += tmp;
+ }
+ }
+
+ PG_RETURN_TEXT_P(cstring_to_text(result));
+}
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index fd8dc91..8ce9c0e 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -2104,6 +2104,50 @@ describeOneTableDetails(const char *schemaname,
PQclear(result);
}
+ /* print any multivariate statistics */
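+ /*
+ * This produces a footer like (illustrative, names made up):
+ *
+ * Statistics:
+ * "public.test_stat" (dependencies) ON (a, b, c)
+ */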
+ if (pset.sversion >= 90600)
+ {
+ printfPQExpBuffer(&buf,
+ "SELECT oid, stanamespace::regnamespace AS nsp, staname, stakeys,\n"
+ " deps_enabled,\n"
+ " deps_built,\n"
+ " (SELECT string_agg(attname::text,', ')\n"
+ " FROM ((SELECT unnest(stakeys) AS attnum) s\n"
+ " JOIN pg_attribute a ON (starelid = a.attrelid and a.attnum = s.attnum))) AS attnums\n"
+ "FROM pg_mv_statistic stat WHERE starelid = '%s' ORDER BY 1;",
+ oid);
+
+ result = PSQLexec(buf.data);
+ if (!result)
+ goto error_return;
+ else
+ tuples = PQntuples(result);
+
+ if (tuples > 0)
+ {
+ printTableAddFooter(&cont, _("Statistics:"));
+ for (i = 0; i < tuples; i++)
+ {
+ printfPQExpBuffer(&buf, " ");
+
+ /* statistics name (qualified with namespace) */
+ appendPQExpBuffer(&buf, "\"%s.%s\" ",
+ PQgetvalue(result, i, 1),
+ PQgetvalue(result, i, 2));
+
+ /* options */
+ if (!strcmp(PQgetvalue(result, i, 4), "t"))
+ appendPQExpBuffer(&buf, "(dependencies)");
+
+ appendPQExpBuffer(&buf, " ON (%s)",
+ PQgetvalue(result, i, 6));
+
+ printTableAddFooter(&cont, buf.data);
+ }
+ }
+ PQclear(result);
+ }
+
/* print rules */
if (tableinfo.hasrules && tableinfo.relkind != 'm')
{
diff --git a/src/include/catalog/dependency.h b/src/include/catalog/dependency.h
index 049bf9f..12211fe 100644
--- a/src/include/catalog/dependency.h
+++ b/src/include/catalog/dependency.h
@@ -153,10 +153,11 @@ typedef enum ObjectClass
OCLASS_EXTENSION, /* pg_extension */
OCLASS_EVENT_TRIGGER, /* pg_event_trigger */
OCLASS_POLICY, /* pg_policy */
- OCLASS_TRANSFORM /* pg_transform */
+ OCLASS_TRANSFORM, /* pg_transform */
+ OCLASS_STATISTICS /* pg_mv_statistics */
} ObjectClass;
-#define LAST_OCLASS OCLASS_TRANSFORM
+#define LAST_OCLASS OCLASS_STATISTICS
/* in dependency.c */
diff --git a/src/include/catalog/heap.h b/src/include/catalog/heap.h
index b80d8d8..5ae42f7 100644
--- a/src/include/catalog/heap.h
+++ b/src/include/catalog/heap.h
@@ -119,6 +119,7 @@ extern void RemoveAttrDefault(Oid relid, AttrNumber attnum,
DropBehavior behavior, bool complain, bool internal);
extern void RemoveAttrDefaultById(Oid attrdefId);
extern void RemoveStatistics(Oid relid, AttrNumber attnum);
+extern void RemoveMVStatistics(Oid relid, AttrNumber attnum);
extern Form_pg_attribute SystemAttributeDefinition(AttrNumber attno,
bool relhasoids);
diff --git a/src/include/catalog/indexing.h b/src/include/catalog/indexing.h
index ab2c1a8..a768bb5 100644
--- a/src/include/catalog/indexing.h
+++ b/src/include/catalog/indexing.h
@@ -173,6 +173,13 @@ DECLARE_UNIQUE_INDEX(pg_largeobject_loid_pn_index, 2683, on pg_largeobject using
DECLARE_UNIQUE_INDEX(pg_largeobject_metadata_oid_index, 2996, on pg_largeobject_metadata using btree(oid oid_ops));
#define LargeObjectMetadataOidIndexId 2996
+DECLARE_UNIQUE_INDEX(pg_mv_statistic_oid_index, 3380, on pg_mv_statistic using btree(oid oid_ops));
+#define MvStatisticOidIndexId 3380
+DECLARE_UNIQUE_INDEX(pg_mv_statistic_name_index, 3997, on pg_mv_statistic using btree(staname name_ops, stanamespace oid_ops));
+#define MvStatisticNameIndexId 3997
+DECLARE_INDEX(pg_mv_statistic_relid_index, 3379, on pg_mv_statistic using btree(starelid oid_ops));
+#define MvStatisticRelidIndexId 3379
+
DECLARE_UNIQUE_INDEX(pg_namespace_nspname_index, 2684, on pg_namespace using btree(nspname name_ops));
#define NamespaceNameIndexId 2684
DECLARE_UNIQUE_INDEX(pg_namespace_oid_index, 2685, on pg_namespace using btree(oid oid_ops));
diff --git a/src/include/catalog/namespace.h b/src/include/catalog/namespace.h
index 2ccb3a7..44cf9c6 100644
--- a/src/include/catalog/namespace.h
+++ b/src/include/catalog/namespace.h
@@ -137,6 +137,8 @@ extern Oid get_collation_oid(List *collname, bool missing_ok);
extern Oid get_conversion_oid(List *conname, bool missing_ok);
extern Oid FindDefaultConversionProc(int32 for_encoding, int32 to_encoding);
+extern Oid get_statistics_oid(List *names, bool missing_ok);
+
/* initialization & transaction cleanup code */
extern void InitializeSearchPath(void);
extern void AtEOXact_Namespace(bool isCommit, bool parallel);
diff --git a/src/include/catalog/pg_mv_statistic.h b/src/include/catalog/pg_mv_statistic.h
new file mode 100644
index 0000000..c74af47
--- /dev/null
+++ b/src/include/catalog/pg_mv_statistic.h
@@ -0,0 +1,75 @@
+/*-------------------------------------------------------------------------
+ *
+ * pg_mv_statistic.h
+ * definition of the system "multivariate statistic" relation (pg_mv_statistic)
+ * along with the relation's initial contents.
+ *
+ *
+ * Portions Copyright (c) 1996-2014, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/catalog/pg_mv_statistic.h
+ *
+ * NOTES
+ * the genbki.pl script reads this file and generates .bki
+ * information from the DATA() statements.
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef PG_MV_STATISTIC_H
+#define PG_MV_STATISTIC_H
+
+#include "catalog/genbki.h"
+
+/* ----------------
+ * pg_mv_statistic definition. cpp turns this into
+ * typedef struct FormData_pg_mv_statistic
+ * ----------------
+ */
+#define MvStatisticRelationId 3381
+
+CATALOG(pg_mv_statistic,3381)
+{
+ /* These fields form the unique key for the entry: */
+ Oid starelid; /* relation containing attributes */
+ NameData staname; /* statistics name */
+ Oid stanamespace; /* OID of namespace containing this statistics */
+ Oid staowner; /* statistics owner */
+
+ /* statistics requested to build */
+ bool deps_enabled; /* analyze dependencies? */
+
+ /* statistics that are available (if requested) */
+ bool deps_built; /* dependencies were built */
+
+ /* variable-length fields start here, but we allow direct access to stakeys */
+ int2vector stakeys; /* array of column keys */
+
+#ifdef CATALOG_VARLEN
+ bytea stadeps; /* dependencies (serialized) */
+#endif
+
+} FormData_pg_mv_statistic;
+
+/* ----------------
+ * Form_pg_mv_statistic corresponds to a pointer to a tuple with
+ * the format of pg_mv_statistic relation.
+ * ----------------
+ */
+typedef FormData_pg_mv_statistic *Form_pg_mv_statistic;
+
+/* ----------------
+ * compiler constants for pg_mv_statistic
+ * ----------------
+ */
+#define Natts_pg_mv_statistic 8
+#define Anum_pg_mv_statistic_starelid 1
+#define Anum_pg_mv_statistic_staname 2
+#define Anum_pg_mv_statistic_stanamespace 3
+#define Anum_pg_mv_statistic_staowner 4
+#define Anum_pg_mv_statistic_deps_enabled 5
+#define Anum_pg_mv_statistic_deps_built 6
+#define Anum_pg_mv_statistic_stakeys 7
+#define Anum_pg_mv_statistic_stadeps 8
+
+#endif /* PG_MV_STATISTIC_H */
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
index cbbb883..eecce40 100644
--- a/src/include/catalog/pg_proc.h
+++ b/src/include/catalog/pg_proc.h
@@ -2666,6 +2666,11 @@ DESCR("current user privilege on any column by rel name");
DATA(insert OID = 3029 ( has_any_column_privilege PGNSP PGUID 12 10 0 0 0 f f f f t f s s 2 0 16 "26 25" _null_ _null_ _null_ _null_ _null_ has_any_column_privilege_id _null_ _null_ _null_ ));
DESCR("current user privilege on any column by rel oid");
+DATA(insert OID = 3998 ( pg_mv_stats_dependencies_info PGNSP PGUID 12 1 0 0 0 f f f f t f i s 1 0 25 "17" _null_ _null_ _null_ _null_ _null_ pg_mv_stats_dependencies_info _null_ _null_ _null_ ));
+DESCR("multivariate stats: functional dependencies info");
+DATA(insert OID = 3999 ( pg_mv_stats_dependencies_show PGNSP PGUID 12 1 0 0 0 f f f f t f i s 1 0 25 "17" _null_ _null_ _null_ _null_ _null_ pg_mv_stats_dependencies_show _null_ _null_ _null_ ));
+DESCR("multivariate stats: functional dependencies show");
+
DATA(insert OID = 1928 ( pg_stat_get_numscans PGNSP PGUID 12 1 0 0 0 f f f f t f s r 1 0 20 "26" _null_ _null_ _null_ _null_ _null_ pg_stat_get_numscans _null_ _null_ _null_ ));
DESCR("statistics: number of scans done for table/index");
DATA(insert OID = 1929 ( pg_stat_get_tuples_returned PGNSP PGUID 12 1 0 0 0 f f f f t f s r 1 0 20 "26" _null_ _null_ _null_ _null_ _null_ pg_stat_get_tuples_returned _null_ _null_ _null_ ));
diff --git a/src/include/catalog/toasting.h b/src/include/catalog/toasting.h
index b7a38ce..a52096b 100644
--- a/src/include/catalog/toasting.h
+++ b/src/include/catalog/toasting.h
@@ -49,6 +49,7 @@ extern void BootstrapToastTable(char *relName,
DECLARE_TOAST(pg_attrdef, 2830, 2831);
DECLARE_TOAST(pg_constraint, 2832, 2833);
DECLARE_TOAST(pg_description, 2834, 2835);
+DECLARE_TOAST(pg_mv_statistic, 3577, 3578);
DECLARE_TOAST(pg_proc, 2836, 2837);
DECLARE_TOAST(pg_rewrite, 2838, 2839);
DECLARE_TOAST(pg_seclabel, 3598, 3599);
diff --git a/src/include/commands/defrem.h b/src/include/commands/defrem.h
index 54f67e9..99a6a62 100644
--- a/src/include/commands/defrem.h
+++ b/src/include/commands/defrem.h
@@ -75,6 +75,10 @@ extern ObjectAddress DefineOperator(List *names, List *parameters);
extern void RemoveOperatorById(Oid operOid);
extern ObjectAddress AlterOperator(AlterOperatorStmt *stmt);
+/* commands/statscmds.c */
+extern ObjectAddress CreateStatistics(CreateStatsStmt *stmt);
+extern void RemoveStatisticsById(Oid statsOid);
+
/* commands/aggregatecmds.c */
extern ObjectAddress DefineAggregate(List *name, List *args, bool oldstyle,
List *parameters, const char *queryString);
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index fad9988..545b62a 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -266,6 +266,7 @@ typedef enum NodeTag
T_PlaceHolderInfo,
T_MinMaxAggInfo,
T_PlannerParamItem,
+ T_MVStatisticInfo,
/*
* TAGS FOR MEMORY NODES (memnodes.h)
@@ -401,6 +402,7 @@ typedef enum NodeTag
T_CreatePolicyStmt,
T_AlterPolicyStmt,
T_CreateTransformStmt,
+ T_CreateStatsStmt,
/*
* TAGS FOR PARSE TREE NODES (parsenodes.h)
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 2fd0629..e1807fb 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -601,6 +601,17 @@ typedef struct ColumnDef
int location; /* parse location, or -1 if none/unknown */
} ColumnDef;
+typedef struct CreateStatsStmt
+{
+ NodeTag type;
+ List *defnames; /* qualified name (list of Value strings) */
+ RangeVar *relation; /* relation to build statistics on */
+ List *keys; /* String nodes naming referenced column(s) */
+ List *options; /* list of DefElem nodes */
+ bool if_not_exists; /* just do nothing if it already exists? */
+} CreateStatsStmt;
+
+
/*
* TableLikeClause - CREATE TABLE ( ... LIKE ... ) clause
*/
@@ -1410,6 +1421,7 @@ typedef enum ObjectType
OBJECT_RULE,
OBJECT_SCHEMA,
OBJECT_SEQUENCE,
+ OBJECT_STATISTICS,
OBJECT_TABCONSTRAINT,
OBJECT_TABLE,
OBJECT_TABLESPACE,
diff --git a/src/include/nodes/relation.h b/src/include/nodes/relation.h
index 641728b..e10dcf1 100644
--- a/src/include/nodes/relation.h
+++ b/src/include/nodes/relation.h
@@ -539,6 +539,7 @@ typedef struct RelOptInfo
List *lateral_vars; /* LATERAL Vars and PHVs referenced by rel */
Relids lateral_referencers; /* rels that reference me laterally */
List *indexlist; /* list of IndexOptInfo */
+ List *mvstatlist; /* list of MVStatisticInfo */
BlockNumber pages; /* size estimates derived from pg_class */
double tuples;
double allvisfrac;
@@ -634,6 +635,33 @@ typedef struct IndexOptInfo
void (*amcostestimate) (); /* AM's cost estimator */
} IndexOptInfo;
+/*
+ * MVStatisticInfo
+ * Information about multivariate stats for planning/optimization
+ *
+ * This contains information about which columns are covered by the
+ * statistics (stakeys), which options were requested while adding the
+ * statistics (*_enabled), and which kinds of statistics were actually
+ * built and are available for the optimizer (*_built).
+ */
+typedef struct MVStatisticInfo
+{
+ NodeTag type;
+
+ Oid mvoid; /* OID of the statistics row */
+ RelOptInfo *rel; /* back-link to index's table */
+
+ /* enabled statistics */
+ bool deps_enabled; /* functional dependencies enabled */
+
+ /* built/available statistics */
+ bool deps_built; /* functional dependencies built */
+
+ /* columns in the statistics (attnums) */
+ int2vector *stakeys; /* attnums of the columns covered */
+
+} MVStatisticInfo;
+
/*
* EquivalenceClasses
diff --git a/src/include/utils/acl.h b/src/include/utils/acl.h
index 4e15a14..3e11253 100644
--- a/src/include/utils/acl.h
+++ b/src/include/utils/acl.h
@@ -330,6 +330,7 @@ extern bool pg_foreign_data_wrapper_ownercheck(Oid srv_oid, Oid roleid);
extern bool pg_foreign_server_ownercheck(Oid srv_oid, Oid roleid);
extern bool pg_event_trigger_ownercheck(Oid et_oid, Oid roleid);
extern bool pg_extension_ownercheck(Oid ext_oid, Oid roleid);
+extern bool pg_statistics_ownercheck(Oid stat_oid, Oid roleid);
extern bool has_createrole_privilege(Oid roleid);
extern bool has_bypassrls_privilege(Oid roleid);
diff --git a/src/include/utils/mvstats.h b/src/include/utils/mvstats.h
new file mode 100644
index 0000000..7ebd961
--- /dev/null
+++ b/src/include/utils/mvstats.h
@@ -0,0 +1,70 @@
+/*-------------------------------------------------------------------------
+ *
+ * mvstats.h
+ * Multivariate statistics and selectivity estimation functions.
+ *
+ *
+ * Portions Copyright (c) 1996-2014, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/utils/mvstats.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef MVSTATS_H
+#define MVSTATS_H
+
+#include "fmgr.h"
+#include "commands/vacuum.h"
+
+
+#define MVSTATS_MAX_DIMENSIONS 8 /* max number of attributes */
+
+/* An associative rule, tracking [a => b] dependency.
+ *
+ * TODO Make this work with multiple columns on both sides.
+ */
+typedef struct MVDependencyData {
+ int16 a;
+ int16 b;
+} MVDependencyData;
+
+typedef MVDependencyData* MVDependency;
+
+typedef struct MVDependenciesData {
+ uint32 magic; /* magic constant marker */
+ int32 ndeps; /* number of dependencies */
+ MVDependency deps[1]; /* XXX why not a pointer? */
+} MVDependenciesData;
+
+typedef MVDependenciesData* MVDependencies;
+
+#define MVSTAT_DEPS_MAGIC 0xB4549A2C /* marks serialized bytea */
+#define MVSTAT_DEPS_TYPE_BASIC 1 /* basic dependencies type */
+
+/*
+ * TODO Maybe fetching the histogram/MCV list separately is inefficient?
+ * Consider adding a single `fetch_stats` method, fetching all
+ * stats specified using flags (or something like that).
+ */
+
+bytea * serialize_mv_dependencies(MVDependencies dependencies);
+
+/* deserialization of stats (the inverse of serialize_mv_dependencies) */
+MVDependencies deserialize_mv_dependencies(bytea * data);
+
+/* FIXME this probably belongs somewhere else (not to operations stats) */
+extern Datum pg_mv_stats_dependencies_info(PG_FUNCTION_ARGS);
+extern Datum pg_mv_stats_dependencies_show(PG_FUNCTION_ARGS);
+
+MVDependencies
+build_mv_dependencies(int numrows, HeapTuple *rows,
+ int2vector *attrs,
+ VacAttrStats **stats);
+
+void build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
+ int natts, VacAttrStats **vacattrstats);
+
+void update_mv_stats(Oid relid, MVDependencies dependencies, int2vector *attrs);
+
+#endif
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index f2bebf2..8771f9c 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -61,6 +61,7 @@ typedef struct RelationData
bool rd_isvalid; /* relcache entry is valid */
char rd_indexvalid; /* state of rd_indexlist: 0 = not valid, 1 =
* valid, 2 = temporarily forced */
+ bool rd_mvstatvalid; /* state of rd_mvstatlist: true/false */
/*
* rd_createSubid is the ID of the highest subtransaction the rel has
@@ -93,6 +94,9 @@ typedef struct RelationData
List *rd_indexlist; /* list of OIDs of indexes on relation */
Oid rd_oidindex; /* OID of unique index on OID, if any */
Oid rd_replidindex; /* OID of replica identity index, if any */
+
+ /* data managed by RelationGetMVStatList: */
+ List *rd_mvstatlist; /* list of OIDs of multivariate stats */
/* data managed by RelationGetIndexAttrBitmap: */
Bitmapset *rd_indexattr; /* identifies columns used in indexes */
diff --git a/src/include/utils/relcache.h b/src/include/utils/relcache.h
index 1b48304..9f03c8d 100644
--- a/src/include/utils/relcache.h
+++ b/src/include/utils/relcache.h
@@ -38,6 +38,7 @@ extern void RelationClose(Relation relation);
* Routines to compute/retrieve additional cached information
*/
extern List *RelationGetIndexList(Relation relation);
+extern List *RelationGetMVStatList(Relation relation);
extern Oid RelationGetOidIndex(Relation relation);
extern Oid RelationGetReplicaIndex(Relation relation);
extern List *RelationGetIndexExpressions(Relation relation);
diff --git a/src/include/utils/syscache.h b/src/include/utils/syscache.h
index 256615b..0e0658d 100644
--- a/src/include/utils/syscache.h
+++ b/src/include/utils/syscache.h
@@ -66,6 +66,8 @@ enum SysCacheIdentifier
INDEXRELID,
LANGNAME,
LANGOID,
+ MVSTATNAMENSP,
+ MVSTATOID,
NAMESPACENAME,
NAMESPACEOID,
OPERNAMENSP,
diff --git a/src/test/regress/expected/object_address.out b/src/test/regress/expected/object_address.out
index 75751be..eb60960 100644
--- a/src/test/regress/expected/object_address.out
+++ b/src/test/regress/expected/object_address.out
@@ -35,6 +35,7 @@ ALTER DEFAULT PRIVILEGES FOR ROLE regtest_addr_user REVOKE DELETE ON TABLES FROM
CREATE TRANSFORM FOR int LANGUAGE SQL (
FROM SQL WITH FUNCTION varchar_transform(internal),
TO SQL WITH FUNCTION int4recv(internal));
+CREATE STATISTICS addr_nsp.gentable_stat ON addr_nsp.gentable(a,b) WITH (dependencies);
-- test some error cases
SELECT pg_get_object_address('stone', '{}', '{}');
ERROR: unrecognized object type "stone"
@@ -373,7 +374,8 @@ WITH objects (type, name, args) AS (VALUES
-- extension
-- event trigger
('policy', '{addr_nsp, gentable, genpol}', '{}'),
- ('transform', '{int}', '{sql}')
+ ('transform', '{int}', '{sql}'),
+ ('statistics', '{addr_nsp, gentable_stat}', '{}')
)
SELECT (pg_identify_object(addr1.classid, addr1.objid, addr1.subobjid)).*,
-- test roundtrip through pg_identify_object_as_address
@@ -420,13 +422,14 @@ SELECT (pg_identify_object(addr1.classid, addr1.objid, addr1.subobjid)).*,
trigger | | | t on addr_nsp.gentable | t
operator family | pg_catalog | integer_ops | pg_catalog.integer_ops USING btree | t
policy | | | genpol on addr_nsp.gentable | t
+ statistics | addr_nsp | gentable_stat | addr_nsp.gentable_stat | t
collation | pg_catalog | "default" | pg_catalog."default" | t
transform | | | for integer on language sql | t
text search dictionary | addr_nsp | addr_ts_dict | addr_nsp.addr_ts_dict | t
text search parser | addr_nsp | addr_ts_prs | addr_nsp.addr_ts_prs | t
text search configuration | addr_nsp | addr_ts_conf | addr_nsp.addr_ts_conf | t
text search template | addr_nsp | addr_ts_temp | addr_nsp.addr_ts_temp | t
-(41 rows)
+(42 rows)
---
--- Cleanup resources
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 81bc5c9..84b4425 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1368,6 +1368,15 @@ pg_matviews| SELECT n.nspname AS schemaname,
LEFT JOIN pg_namespace n ON ((n.oid = c.relnamespace)))
LEFT JOIN pg_tablespace t ON ((t.oid = c.reltablespace)))
WHERE (c.relkind = 'm'::"char");
+pg_mv_stats| SELECT n.nspname AS schemaname,
+ c.relname AS tablename,
+ s.staname,
+ s.stakeys AS attnums,
+ length(s.stadeps) AS depsbytes,
+ pg_mv_stats_dependencies_info(s.stadeps) AS depsinfo
+ FROM ((pg_mv_statistic s
+ JOIN pg_class c ON ((c.oid = s.starelid)))
+ LEFT JOIN pg_namespace n ON ((n.oid = c.relnamespace)));
pg_policies| SELECT n.nspname AS schemaname,
c.relname AS tablename,
pol.polname AS policyname,
diff --git a/src/test/regress/expected/sanity_check.out b/src/test/regress/expected/sanity_check.out
index eb0bc88..92a0d8a 100644
--- a/src/test/regress/expected/sanity_check.out
+++ b/src/test/regress/expected/sanity_check.out
@@ -113,6 +113,7 @@ pg_inherits|t
pg_language|t
pg_largeobject|t
pg_largeobject_metadata|t
+pg_mv_statistic|t
pg_namespace|t
pg_opclass|t
pg_operator|t
diff --git a/src/test/regress/sql/object_address.sql b/src/test/regress/sql/object_address.sql
index 68e7cb0..3775b28 100644
--- a/src/test/regress/sql/object_address.sql
+++ b/src/test/regress/sql/object_address.sql
@@ -39,6 +39,7 @@ ALTER DEFAULT PRIVILEGES FOR ROLE regtest_addr_user REVOKE DELETE ON TABLES FROM
CREATE TRANSFORM FOR int LANGUAGE SQL (
FROM SQL WITH FUNCTION varchar_transform(internal),
TO SQL WITH FUNCTION int4recv(internal));
+CREATE STATISTICS addr_nsp.gentable_stat ON addr_nsp.gentable(a,b) WITH (dependencies);
-- test some error cases
SELECT pg_get_object_address('stone', '{}', '{}');
@@ -166,7 +167,8 @@ WITH objects (type, name, args) AS (VALUES
-- extension
-- event trigger
('policy', '{addr_nsp, gentable, genpol}', '{}'),
- ('transform', '{int}', '{sql}')
+ ('transform', '{int}', '{sql}'),
+ ('statistics', '{addr_nsp, gentable_stat}', '{}')
)
SELECT (pg_identify_object(addr1.classid, addr1.objid, addr1.subobjid)).*,
-- test roundtrip through pg_identify_object_as_address
--
2.1.0
Attachment: 0003-clause-reduction-using-functional-dependencies.patch (text/x-patch)
From 2433b5b3cb25a093f78857adb7f9c0b12ac88967 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@pgaddict.com>
Date: Mon, 6 Apr 2015 19:42:18 +0200
Subject: [PATCH 3/9] clause reduction using functional dependencies
During planning, use functional dependencies to decide which clauses to
skip during cardinality estimation. Initial and rather simplistic
implementation.
This only works with regular WHERE clauses, not with clauses used as
join conditions.
Note: clause_is_mv_compatible() needs to identify the relation (so
that we can fetch the list of multivariate stats by OID).
planner_rt_fetch() seems like the appropriate way to get the relation
OID, but apparently it only works with simple vars. Maybe
examine_variable() would make this work with more complex vars too?
Includes regression tests analyzing functional dependencies (part of
ANALYZE) on several datasets (no dependencies, no transitive
dependencies, ...).
Checks that for a query with conditions on two columns, where one (B) is
functionally dependent on the other one (A), the planner correctly
ignores the clause on (B) and chooses a bitmap index scan instead of a
plain index scan (which is what happens otherwise, thanks to the
assumption of independence).
Note: Functional dependencies only work with equality clauses, no
inequalities etc.
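For example (a sketch using the syntax from this series; the table and
statistics names are made up):

CREATE TABLE zips (zip INT, city INT);
-- load data where each zip implies exactly one city
CREATE STATISTICS zips_stat ON zips (zip, city) WITH (dependencies);
ANALYZE zips;

-- once ANALYZE detects [zip => city], the planner ignores the second
-- clause during estimation, as if the query contained (zip = 12345) alone
SELECT * FROM zips WHERE zip = 12345 AND city = 1;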
---
src/backend/optimizer/path/clausesel.c | 891 +++++++++++++++++++++++++-
src/backend/utils/mvstats/README.stats | 36 ++
src/backend/utils/mvstats/common.c | 5 +-
src/backend/utils/mvstats/dependencies.c | 24 +
src/include/utils/mvstats.h | 16 +-
src/test/regress/expected/mv_dependencies.out | 172 +++++
src/test/regress/parallel_schedule | 3 +
src/test/regress/serial_schedule | 1 +
src/test/regress/sql/mv_dependencies.sql | 150 +++++
9 files changed, 1293 insertions(+), 5 deletions(-)
create mode 100644 src/backend/utils/mvstats/README.stats
create mode 100644 src/test/regress/expected/mv_dependencies.out
create mode 100644 src/test/regress/sql/mv_dependencies.sql
diff --git a/src/backend/optimizer/path/clausesel.c b/src/backend/optimizer/path/clausesel.c
index 02660c2..80708fe 100644
--- a/src/backend/optimizer/path/clausesel.c
+++ b/src/backend/optimizer/path/clausesel.c
@@ -14,14 +14,19 @@
*/
#include "postgres.h"
+#include "access/sysattr.h"
+#include "catalog/pg_operator.h"
#include "nodes/makefuncs.h"
#include "optimizer/clauses.h"
#include "optimizer/cost.h"
#include "optimizer/pathnode.h"
#include "optimizer/plancat.h"
+#include "optimizer/var.h"
#include "utils/fmgroids.h"
#include "utils/lsyscache.h"
+#include "utils/mvstats.h"
#include "utils/selfuncs.h"
+#include "utils/typcache.h"
/*
@@ -41,6 +46,23 @@ typedef struct RangeQueryClause
static void addRangeClause(RangeQueryClause **rqlist, Node *clause,
bool varonleft, bool isLTsel, Selectivity s2);
+#define MV_CLAUSE_TYPE_FDEP 0x01
+
+static bool clause_is_mv_compatible(Node *clause, Index relid, AttrNumber *attnum);
+
+static Bitmapset *collect_mv_attnums(List *clauses, Index relid);
+
+static int count_mv_attnums(List *clauses, Index relid);
+
+static int count_varnos(List *clauses, Index *relid);
+
+static List *clauselist_apply_dependencies(PlannerInfo *root, List *clauses,
+ Index relid, List *stats);
+
+static bool has_stats(List *stats, int type);
+
+static List * find_stats(PlannerInfo *root, Index relid);
+
/****************************************************************************
* ROUTINES TO COMPUTE SELECTIVITIES
@@ -60,7 +82,19 @@ static void addRangeClause(RangeQueryClause **rqlist, Node *clause,
* subclauses. However, that's only right if the subclauses have independent
* probabilities, and in reality they are often NOT independent. So,
* we want to be smarter where we can.
-
+ *
+ * The first thing we try is applying multivariate statistics, in a way
+ * that minimizes the overhead when there are no multivariate stats on
+ * the relation. Thus we do several simple (and inexpensive) checks first,
+ * to verify that suitable multivariate statistics exist.
+ *
+ * If we find that suitable multivariate statistics apply, we use them.
+ * Currently we only have (soft) functional dependencies, so we try to reduce
+ * the list of clauses.
+ *
+ * Then we remove the clauses estimated using multivariate stats, and process
+ * the rest of the clauses using the regular per-column stats.
+ *
* Currently, the only extra smarts we have is to recognize "range queries",
* such as "x > 34 AND x < 42". Clauses are recognized as possible range
* query components if they are restriction opclauses whose operators have
@@ -99,6 +133,22 @@ clauselist_selectivity(PlannerInfo *root,
RangeQueryClause *rqlist = NULL;
ListCell *l;
+ /* processing mv stats */
+ Oid relid = InvalidOid;
+
+ /* list of multivariate stats on the relation */
+ List *stats = NIL;
+
+ /*
+ * To fetch the statistics, we first need to determine the rel. Currently
+ * we only support estimates of simple restrictions with all Vars
+ * referencing a single baserel. However set_baserel_size_estimates() sets
+ * varRelid=0 so we have to actually inspect the clauses by pull_varnos
+ * and see if there's just a single varno referenced.
+ */
+ if ((count_varnos(clauses, &relid) == 1) && ((varRelid == 0) || (varRelid == relid)))
+ stats = find_stats(root, relid);
+
/*
* If there's exactly one clause, then no use in trying to match up pairs,
* so just go directly to clause_selectivity().
@@ -108,6 +158,24 @@ clauselist_selectivity(PlannerInfo *root,
varRelid, jointype, sjinfo);
/*
+ * Apply functional dependencies, but first check that there are some stats
+ * with functional dependencies built (by simply walking the stats list),
+ * and that there are two or more attributes referenced by clauses that
+ * may be reduced using functional dependencies.
+ *
+ * We would find that anyway when trying to actually apply the functional
+ * dependencies, but let's do the cheap checks first.
+ *
+ * After applying the functional dependencies we get the remaining clauses
+ * that need to be estimated by other types of stats (MCV, histograms etc).
+ */
+ if (has_stats(stats, MV_CLAUSE_TYPE_FDEP) &&
+ (count_mv_attnums(clauses, relid) >= 2))
+ {
+ clauses = clauselist_apply_dependencies(root, clauses, relid, stats);
+ }
+
+ /*
* Initial scan over clauses. Anything that doesn't look like a potential
* rangequery clause gets multiplied into s1 and forgotten. Anything that
* does gets inserted into an rqlist entry.
@@ -763,3 +831,824 @@ clause_selectivity(PlannerInfo *root,
return s1;
}
+
+/*
+ * Pull varattnos from the clauses, similarly to pull_varattnos() but:
+ *
+ * (a) only get attributes for a particular relation (relid)
+ * (b) ignore system attributes (we can't build stats on them anyway)
+ *
+ * This makes it possible to directly compare the result with attnum
+ * values from pg_attribute etc.
+ */
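+/*
+ * Note that pull_varattnos() stores attnums offset by
+ * FirstLowInvalidHeapAttributeNumber (so that system attributes fit in
+ * the bitmapset); here we shift the members back and keep only the
+ * positive (user-defined) attnums.
+ */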
+static Bitmapset *
+get_varattnos(Node * node, Index relid)
+{
+ int k;
+ Bitmapset *varattnos = NULL;
+ Bitmapset *result = NULL;
+
+ /* get the varattnos */
+ pull_varattnos(node, relid, &varattnos);
+
+ k = -1;
+ while ((k = bms_next_member(varattnos, k)) >= 0)
+ {
+ if (k + FirstLowInvalidHeapAttributeNumber > 0)
+ result
+ = bms_add_member(result,
+ k + FirstLowInvalidHeapAttributeNumber);
+ }
+
+ bms_free(varattnos);
+
+ return result;
+}
+
+/*
+ * Collect attributes from mv-compatible clauses.
+ */
+static Bitmapset *
+collect_mv_attnums(List *clauses, Index relid)
+{
+ Bitmapset *attnums = NULL;
+ ListCell *l;
+
+ /*
+ * Walk through the clauses and identify the ones we can estimate
+ * using multivariate stats, and remember the relid/columns. We'll
+ * then cross-check if we have suitable stats, and only if needed
+ * we'll split the clauses into multivariate and regular lists.
+ *
+ * For now we're only interested in RestrictInfo nodes with nested
+ * OpExpr, using either a range or equality.
+ */
+ foreach (l, clauses)
+ {
+ AttrNumber attnum;
+ Node *clause = (Node *) lfirst(l);
+
+ /* ignore the result for now - we only need the info */
+ if (clause_is_mv_compatible(clause, relid, &attnum))
+ attnums = bms_add_member(attnums, attnum);
+ }
+
+ /*
+ * If there are not at least two attributes referenced by the clause(s),
+ * we can throw everything out (as we'll revert to simple stats).
+ */
+ if (bms_num_members(attnums) <= 1)
+ {
+ if (attnums != NULL)
+ pfree(attnums);
+ attnums = NULL;
+ }
+
+ return attnums;
+}
+
+/*
+ * Count the number of attributes in clauses compatible with multivariate stats.
+ */
+static int
+count_mv_attnums(List *clauses, Index relid)
+{
+ int c;
+ Bitmapset *attnums = collect_mv_attnums(clauses, relid);
+
+ c = bms_num_members(attnums);
+
+ bms_free(attnums);
+
+ return c;
+}
+
+/*
+ * Count varnos referenced in the clauses, and if there's a single varno
+ * then store it into 'relid'.
+ */
+static int
+count_varnos(List *clauses, Index *relid)
+{
+ int cnt;
+ Bitmapset *varnos = NULL;
+
+ varnos = pull_varnos((Node *) clauses);
+ cnt = bms_num_members(varnos);
+
+ /* if there's a single varno in the clauses, remember it */
+ if (bms_num_members(varnos) == 1)
+ *relid = bms_singleton_member(varnos);
+
+ bms_free(varnos);
+
+ return cnt;
+}
+
+typedef struct
+{
+ Index varno; /* relid we're interested in */
+ Bitmapset *varattnos; /* attnums referenced by the clauses */
+} mv_compatible_context;
+
+/*
+ * Recursive walker that checks compatibility of the clause with multivariate
+ * statistics, and collects attnums from the Vars.
+ *
+ * XXX The original idea was to combine this with expression_tree_walker, but
+ * I've been unable to make that work - it seems it does not quite allow
+ * checking the structure. Hence the explicit calls to the walker.
+ */
+static bool
+mv_compatible_walker(Node *node, mv_compatible_context *context)
+{
+ if (node == NULL)
+ return false;
+
+ if (IsA(node, RestrictInfo))
+ {
+ RestrictInfo *rinfo = (RestrictInfo *) node;
+
+ /* Pseudoconstants are not really interesting here. */
+ if (rinfo->pseudoconstant)
+ return true;
+
+ /* clauses referencing multiple varnos are incompatible */
+ if (bms_membership(rinfo->clause_relids) != BMS_SINGLETON)
+ return true;
+
+ /* check the clause inside the RestrictInfo */
+ return mv_compatible_walker((Node*)rinfo->clause, (void *) context);
+ }
+
+ if (IsA(node, Var))
+ {
+ Var * var = (Var*)node;
+
+ /*
+ * Also, the variable needs to reference the right relid (this might be
+ * unnecessary given the other checks, but let's be sure).
+ */
+ if (var->varno != context->varno)
+ return true;
+
+ /* Also skip system attributes (we don't allow stats on those). */
+ if (! AttrNumberIsForUserDefinedAttr(var->varattno))
+ return true;
+
+ /* Seems fine, so let's remember the attnum. */
+ context->varattnos = bms_add_member(context->varattnos, var->varattno);
+
+ return false;
+ }
+
+ /*
+ * And finally the operator expressions - we only allow simple expressions
+ * with two arguments, where one is a Var and the other is a constant, and
+ * it's a simple comparison (which we detect using estimator function).
+ */
+ if (is_opclause(node))
+ {
+ OpExpr *expr = (OpExpr *) node;
+ Var *var;
+ bool varonleft = true;
+ bool ok;
+
+ /*
+ * Only expressions with two arguments are considered compatible.
+ *
+ * XXX Possibly unnecessary (can OpExpr have different arg count?).
+ */
+ if (list_length(expr->args) != 2)
+ return true;
+
+ /* see if it actually has the right shape (one Var, one pseudo-constant) */
+ ok = (NumRelids((Node*)expr) == 1) &&
+ (is_pseudo_constant_clause(lsecond(expr->args)) ||
+ (varonleft = false,
+ is_pseudo_constant_clause(linitial(expr->args))));
+
+ /* unsupported structure (two variables or so) */
+ if (! ok)
+ return true;
+
+ /*
+ * If it's not a "<" or ">" or "=" operator, just ignore the clause.
+ * Otherwise note the relid and attnum for the variable. This uses the
+ * function for estimating selectivity, not the operator directly (a bit
+ * awkward, but well ...).
+ */
+ switch (get_oprrest(expr->opno))
+ {
+ case F_EQSEL:
+
+ /* equality conditions are compatible with all statistics */
+ break;
+
+ default:
+
+ /* unknown estimator */
+ return true;
+ }
+
+ var = (varonleft) ? linitial(expr->args) : lsecond(expr->args);
+
+ return mv_compatible_walker((Node *) var, context);
+ }
+
+ /* Node not explicitly supported, so terminate */
+ return true;
+}
+
+/*
+ * Determines whether the clause is compatible with multivariate stats,
+ * and if it is, returns some additional information - varno (index
+ * into simple_rte_array) and a bitmap of attributes. This is then
+ * used to fetch related multivariate statistics.
+ *
+ * At this moment we only support basic conditions of the form
+ *
+ * variable OP constant
+ *
+ * where OP is one of [=,<,<=,>=,>] (which is however determined by
+ * looking at the associated function for estimating selectivity, just
+ * like with the single-dimensional case).
+ *
+ * TODO Support 'OR clauses' - shouldn't be all that difficult to
+ * evaluate them using multivariate stats.
+ */
+static bool
+clause_is_mv_compatible(Node *clause, Index relid, AttrNumber *attnum)
+{
+ mv_compatible_context context;
+
+ context.varno = relid;
+ context.varattnos = NULL; /* no attnums */
+
+ if (mv_compatible_walker(clause, (void *) &context))
+ return false;
+
+ /* remember the newly collected attnums */
+ *attnum = bms_singleton_member(context.varattnos);
+
+ return true;
+}
+
+/*
+ * collect attnums from functional dependencies
+ *
+ * Walk through all statistics on the relation, and collect attnums covered
+ * by those with functional dependencies. We only look at columns specified
+ * when creating the statistics, not at columns actually referenced by the
+ * dependencies (which may only be a subset of the attributes).
+ */
+static Bitmapset*
+fdeps_collect_attnums(List *stats)
+{
+ ListCell *lc;
+ Bitmapset *attnums = NULL;
+
+ foreach (lc, stats)
+ {
+ int j;
+ MVStatisticInfo *info = (MVStatisticInfo *)lfirst(lc);
+
+ int2vector *stakeys = info->stakeys;
+
+ /* skip stats without functional dependencies built */
+ if (! info->deps_built)
+ continue;
+
+ for (j = 0; j < stakeys->dim1; j++)
+ attnums = bms_add_member(attnums, stakeys->values[j]);
+ }
+
+ return attnums;
+}
+
+/* transforms bitmapset into an array (index => value) */
+static int*
+make_idx_to_attnum_mapping(Bitmapset *attnums)
+{
+ int attidx = 0;
+ int attnum;
+
+ int *mapping = (int*)palloc0(bms_num_members(attnums) * sizeof(int));
+
+ attnum = -1;
+ while ((attnum = bms_next_member(attnums, attnum)) >= 0)
+ mapping[attidx++] = attnum;
+
+ Assert(attidx == bms_num_members(attnums));
+
+ return mapping;
+}
+
+/* transforms bitmapset into an array (value => index) */
+static int*
+make_attnum_to_idx_mapping(Bitmapset *attnums)
+{
+ int attidx = 0;
+ int attnum;
+ int maxattnum = -1;
+ int *mapping;
+
+ attnum = -1;
+ while ((attnum = bms_next_member(attnums, attnum)) >= 0)
+ maxattnum = attnum;
+
+ mapping = (int*)palloc0((maxattnum+1) * sizeof(int));
+
+ attnum = -1;
+ while ((attnum = bms_next_member(attnums, attnum)) >= 0)
+ mapping[attnum] = attidx++;
+
+ Assert(attidx == bms_num_members(attnums));
+
+ return mapping;
+}
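+
+/*
+ * For example, for attnums {2,5,7} the two mappings are:
+ *
+ * idx_to_attnum: [0 => 2, 1 => 5, 2 => 7]
+ * attnum_to_idx: [2 => 0, 5 => 1, 7 => 2] (other slots unused)
+ */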
+
+/* build adjacency matrix for the dependencies */
+static bool*
+build_adjacency_matrix(List *stats, Bitmapset *attnums,
+ int *idx_to_attnum, int *attnum_to_idx)
+{
+ ListCell *lc;
+ int natts = bms_num_members(attnums);
+ bool *matrix = (bool*)palloc0(natts * natts * sizeof(bool));
+
+ foreach (lc, stats)
+ {
+ int j;
+ MVStatisticInfo *stat = (MVStatisticInfo *)lfirst(lc);
+ MVDependencies dependencies = NULL;
+
+ /* skip stats without functional dependencies built */
+ if (! stat->deps_built)
+ continue;
+
+ /* fetch and deserialize dependencies */
+ dependencies = load_mv_dependencies(stat->mvoid);
+ if (dependencies == NULL)
+ {
+ elog(WARNING, "failed to deserialize func deps %d", stat->mvoid);
+ continue;
+ }
+
+ /* set matrix[a,b] to 'true' if 'a=>b' */
+ for (j = 0; j < dependencies->ndeps; j++)
+ {
+ int aidx = attnum_to_idx[dependencies->deps[j]->a];
+ int bidx = attnum_to_idx[dependencies->deps[j]->b];
+
+ /* a=> b */
+ matrix[aidx * natts + bidx] = true;
+ }
+ }
+
+ return matrix;
+}
+
+/*
+ * multiply the adjacency matrix
+ *
+ * By multiplying the adjacency matrix, we derive dependencies implied by those
+ * stored in the catalog (but possibly in several separate rows). We need to
+ * repeat the multiplication until no new dependencies are discovered. The
+ * maximum number of multiplications is equal to the number of attributes.
+ *
+ * This is based on modeling the functional dependencies as edges in a directed
+ * graph with attributes as vertices.
+ */
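+/*
+ * For example, with (a => b) and (b => c) coming from two different
+ * statistics, the first round of multiplication derives (a => c):
+ *
+ * a b c a b c
+ * a [ . T . ] a [ . T T ]
+ * b [ . . T ] => b [ . . T ]
+ * c [ . . . ] c [ . . . ]
+ *
+ * (illustrative only; T = true, . = false)
+ */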
+static void
+multiply_adjacency_matrix(bool *matrix, int natts)
+{
+ int i;
+
+ /* repeat the multiplication up to natts-times */
+ for (i = 0; i < natts; i++)
+ {
+ bool changed = false; /* no changes in this round */
+ int k, l, m;
+
+ /* k => l */
+ for (k = 0; k < natts; k++)
+ {
+ for (l = 0; l < natts; l++)
+ {
+ /* skip already known dependencies */
+ if (matrix[k * natts + l])
+ continue;
+
+ /*
+ * compute (k,l) in the multiplied matrix
+ *
+ * We don't really care about the exact value, just true/false,
+ * so terminate the loop once we get a hit. Also, this makes it
+ * safe to modify the matrix in-place.
+ */
+ for (m = 0; m < natts; m++)
+ {
+ if (matrix[k * natts + m] * matrix[m * natts + l])
+ {
+ matrix[k * natts + l] = true;
+ changed = true;
+ break;
+ }
+ }
+ }
+ }
+
+ /* no transitive dependency added in this round, so terminate */
+ if (! changed)
+ break;
+ }
+}
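+
+/*
+ * An illustrative example (dependencies assumed, not read from any
+ * catalog): with three attributes and dependencies (a => b), (b => c),
+ * the adjacency matrix evolves like this:
+ *
+ *       a  b  c                      a  b  c
+ *   a [ .  1  . ]                a [ .  1  1 ]
+ *   b [ .  .  1 ]   -- round --> b [ .  .  1 ]
+ *   c [ .  .  . ]        1       c [ .  .  . ]
+ *
+ * The first round derives the transitive dependency (a => c), the second
+ * round adds nothing new, so the 'changed' flag terminates the loop early.
+ */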
+
+/*
+ * Reduce clauses using functional dependencies
+ *
+ * Walk through clauses and eliminate the redundant ones (implied by other
+ * clauses). This is done by first deriving a transitive closure of all the
+ * functional dependencies (by multiplying the adjacency matrix).
+ */
+static List*
+fdeps_reduce_clauses(List *clauses, Bitmapset *attnums, bool *matrix,
+ int *idx_to_attnum, int *attnum_to_idx, Index relid)
+{
+ int i;
+ ListCell *lc;
+ List *reduced_clauses = NIL;
+
+ int nmvclauses; /* size of the arrays */
+ bool *reduced;
+ AttrNumber *mvattnums;
+ Node **mvclauses;
+
+ int natts = bms_num_members(attnums);
+
+ /*
+ * Preallocate space for all clauses (the list only contains
+ * compatible clauses at this point). This makes it somewhat easier
+ * to access the stats / attnums randomly.
+ *
+ * XXX This assumes each clause references exactly one Var, so the
+ * arrays are sized accordingly - for functional dependencies
+ * this is safe, because it only works with Var=Const.
+ */
+ mvclauses = (Node**)palloc0(list_length(clauses) * sizeof(Node*));
+ mvattnums = (AttrNumber*)palloc0(list_length(clauses) * sizeof(AttrNumber));
+ reduced = (bool*)palloc0(list_length(clauses) * sizeof(bool));
+
+ /* fill the arrays */
+ nmvclauses = 0;
+ foreach (lc, clauses)
+ {
+ Node * clause = (Node*)lfirst(lc);
+ Bitmapset * attnums = get_varattnos(clause, relid);
+
+ mvclauses[nmvclauses] = clause;
+ mvattnums[nmvclauses] = bms_singleton_member(attnums);
+ nmvclauses++;
+ }
+
+ Assert(nmvclauses == list_length(clauses));
+
+ /* now try to reduce the clauses (using the dependencies) */
+ for (i = 0; i < nmvclauses; i++)
+ {
+ int j;
+
+ /* not covered by dependencies */
+ if (! bms_is_member(mvattnums[i], attnums))
+ continue;
+
+ /* this clause was already reduced, so let's skip it */
+ if (reduced[i])
+ continue;
+
+ /* walk the potentially 'implied' clauses */
+ for (j = 0; j < nmvclauses; j++)
+ {
+ int aidx, bidx;
+
+ /* not covered by dependencies */
+ if (! bms_is_member(mvattnums[j], attnums))
+ continue;
+
+ aidx = attnum_to_idx[mvattnums[i]];
+ bidx = attnum_to_idx[mvattnums[j]];
+
+ /* can't reduce the clause by itself, or if already reduced */
+ if ((i == j) || reduced[j])
+ continue;
+
+ /* mark the clause as reduced (if aidx => bidx) */
+ reduced[j] = matrix[aidx * natts + bidx];
+ }
+ }
+
+ /* now walk through the clauses, and keep only those not reduced */
+ for (i = 0; i < nmvclauses; i++)
+ if (! reduced[i])
+ reduced_clauses = lappend(reduced_clauses, mvclauses[i]);
+
+ pfree(reduced);
+ pfree(mvclauses);
+ pfree(mvattnums);
+
+ return reduced_clauses;
+}
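+
+/*
+ * An illustrative example of the cycle handling above (dependencies and
+ * clauses assumed): with (a => b) and (b => a), and clauses
+ * (a = 1) AND (b = 2), the clause on 'a' marks the clause on 'b' as
+ * reduced first. When the outer loop then reaches the clause on 'b', it
+ * is skipped as already reduced, so it cannot in turn eliminate the
+ * clause on 'a' - exactly one clause survives, instead of both being
+ * dropped.
+ */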
+
+/*
+ * filter clauses that are interesting for the reduction step
+ *
+ * Functional dependencies can only work with equality clauses with attributes
+ * covered by at least one of the statistics, so we walk through the clauses
+ * and copy the uninteresting ones directly to the result (reduced) clauses.
+ *
+ * That includes clauses that:
+ * (a) are not mv-compatible
+ * (b) reference more than a single attnum
+ * (c) use an attnum not covered by the functional dependencies
+ *
+ * The clauses interesting for the reduction step are copied to deps_clauses.
+ *
+ * root - planner root
+ * clauses - list of clauses (input)
+ * deps_attnums - attributes covered by dependencies
+ * reduced_clauses - resulting clauses (not subject to reduction step)
+ * deps_clauses - clauses to be processed by reduction
+ * relid - relid of the baserel
+ *
+ * The return value is a bitmap of attnums referenced by deps_clauses.
+ */
+static Bitmapset *
+fdeps_filter_clauses(PlannerInfo *root,
+ List *clauses, Bitmapset *deps_attnums,
+ List **reduced_clauses, List **deps_clauses,
+ Index relid)
+{
+ ListCell *lc;
+ Bitmapset *clause_attnums = NULL;
+
+ foreach (lc, clauses)
+ {
+ AttrNumber attnum;
+ Node *clause = (Node *) lfirst(lc);
+
+ if (! clause_is_mv_compatible(clause, relid, &attnum))
+
+ /* clause incompatible with functional dependencies */
+ *reduced_clauses = lappend(*reduced_clauses, clause);
+
+ else if (! bms_is_member(attnum, deps_attnums))
+
+ /* clause not covered by the dependencies */
+ *reduced_clauses = lappend(*reduced_clauses, clause);
+
+ else
+ {
+ *deps_clauses = lappend(*deps_clauses, clause);
+ clause_attnums = bms_add_member(clause_attnums, attnum);
+ }
+ }
+
+ return clause_attnums;
+}
+
+/*
+ * reduce list of equality clauses using soft functional dependencies
+ *
+ * We simply walk through list of functional dependencies, and for each one we
+ * check whether the dependency 'matches' the clauses, i.e. if there's a clause
+ * matching the condition. If yes, we attempt to remove all clauses matching
+ * the implied part of the dependency from the list.
+ *
+ * This only reduces equality clauses, and ignores all the other types. We might
+ * extend it to handle IS NULL clause, in the future.
+ *
+ * We also assume the equality clauses are 'compatible'. For example we can't
+ * identify when the clauses use a mismatching zip code and city name. In such
+ * case the usual approach (product of selectivities) would produce a better
+ * estimate, although mostly by chance.
+ *
+ * The implementation needs to be careful about cyclic dependencies, e.g. when
+ *
+ * (a -> b) and (b -> a)
+ *
+ * at the same time, which means there's a 1:1 relationship between the columns.
+ * In this case we must not reduce clauses on both attributes at the same time.
+ *
+ * TODO Currently we only apply functional dependencies at the same level, but
+ * maybe we could transfer the clauses from upper levels to the subtrees?
+ * For example let's say we have (a->b) dependency, and condition
+ *
+ * (a=1) AND (b=2 OR c=3)
+ *
+ * Currently, we won't be able to perform any reduction, because we'll
+ * consider (a=1) and (b=2 OR c=3) independently. But maybe we could pass
+ * (a=1) into the other expression, and only check it against conditions
+ * of the functional dependencies?
+ *
+ * In this case we'd end up with
+ *
+ * (a=1)
+ *
+ * as we'd consider (b=2) implied thanks to the rule, rendering the whole
+ * OR clause valid.
+ */
+static List *
+clauselist_apply_dependencies(PlannerInfo *root, List *clauses,
+ Index relid, List *stats)
+{
+ List *reduced_clauses = NIL;
+
+ /*
+ * matrix of (natts x natts), 1 means x=>y
+ *
+ * This serves two purposes - first, it merges dependencies from all
+ * the statistics, second it makes generating all the transitive
+ * dependencies easier.
+ *
+ * We need to build this only for attributes from the dependencies,
+ * not for all attributes in the table.
+ *
+ * We can't do that only for attributes from the clauses, because we
+ * want to build transitive dependencies (including those going
+ * through attributes not listed in the stats).
+ *
+ * This only works for A=>B dependencies, not sure how to do that
+ * for complex dependencies.
+ */
+ bool *deps_matrix;
+ int deps_natts; /* size of the matrix */
+
+ /* mapping attnum <=> matrix index */
+ int *deps_idx_to_attnum;
+ int *deps_attnum_to_idx;
+
+ /* attnums in dependencies and clauses (and intersection) */
+ List *deps_clauses = NIL;
+ Bitmapset *deps_attnums = NULL;
+ Bitmapset *clause_attnums = NULL;
+ Bitmapset *intersect_attnums = NULL;
+
+ /*
+ * Is there at least one statistics with functional dependencies?
+ * If not, return the original clauses right away.
+ *
+ * XXX Isn't this pointless, thanks to exactly the same check in
+ * clauselist_selectivity()? Can we trigger the condition here?
+ */
+ if (! has_stats(stats, MV_CLAUSE_TYPE_FDEP))
+ return clauses;
+
+ /*
+ * Build the dependency matrix, i.e. attribute adjacency matrix,
+ * where 1 means (a=>b). Once we have the adjacency matrix, we'll
+ * multiply it by itself, to get transitive dependencies.
+ *
+ * Note: This is pretty much transitive closure from graph theory.
+ *
+ * First, let's see which attributes are covered by functional
+ * dependencies (sides of the adjacency matrix), and also the maximum
+ * attnum (which determines the size of the attnum-to-index mapping).
+ */
+ deps_attnums = fdeps_collect_attnums(stats);
+
+ /*
+ * Walk through the clauses - clauses that are (one of)
+ *
+ * (a) not mv-compatible
+ * (b) reference more than a single attnum
+ * (c) use an attnum not covered by the functional dependencies
+ *
+ * may be copied directly to the result. The interesting clauses are
+ * kept in 'deps_clauses' and will be processed later.
+ */
+ clause_attnums = fdeps_filter_clauses(root, clauses, deps_attnums,
+ &reduced_clauses, &deps_clauses, relid);
+
+ /*
+ * We need at least two clauses, referencing two different attributes,
+ * to perform the reduction.
+ */
+ if ((list_length(deps_clauses) < 2) || (bms_num_members(clause_attnums) < 2))
+ {
+ bms_free(clause_attnums);
+ list_free(reduced_clauses);
+ list_free(deps_clauses);
+
+ return clauses;
+ }
+
+ /*
+ * We need at least two matching attributes in the clauses and
+ * dependencies, otherwise we can't really reduce anything.
+ */
+ intersect_attnums = bms_intersect(clause_attnums, deps_attnums);
+ if (bms_num_members(intersect_attnums) < 2)
+ {
+ bms_free(clause_attnums);
+ bms_free(deps_attnums);
+ bms_free(intersect_attnums);
+
+ list_free(deps_clauses);
+ list_free(reduced_clauses);
+
+ return clauses;
+ }
+
+ /*
+ * Build mapping between matrix indexes and attnums, and then the
+ * adjacency matrix itself.
+ */
+ deps_idx_to_attnum = make_idx_to_attnum_mapping(deps_attnums);
+ deps_attnum_to_idx = make_attnum_to_idx_mapping(deps_attnums);
+
+ /* build the adjacency matrix */
+ deps_matrix = build_adjacency_matrix(stats, deps_attnums,
+ deps_idx_to_attnum,
+ deps_attnum_to_idx);
+
+ deps_natts = bms_num_members(deps_attnums);
+
+ /*
+ * Multiply the matrix N-times (N = size of the matrix), so that we
+ * get all the transitive dependencies. That makes the next step
+ * much easier and faster.
+ *
+ * This is essentially an adjacency matrix from graph theory, and
+ * by multiplying it we get transitive edges. We don't really care
+ * about the exact number (number of paths between vertices) though,
+ * so we can do the multiplication in-place (we don't care whether
+ * we found the dependency in this round or in the previous one).
+ *
+ * Track how many new dependencies were added, and stop when 0, but
+ * we can't multiply more than N-times (longest path in the graph).
+ */
+ multiply_adjacency_matrix(deps_matrix, deps_natts);
+
+ /*
+ * Walk through the clauses, and see which other clauses we may
+ * reduce. The matrix contains all transitive dependencies, which
+ * makes this very fast.
+ *
+ * We have to be careful not to reduce the clause using itself, or
+ * reducing all clauses forming a cycle (so we have to skip already
+ * eliminated clauses).
+ *
+ * I'm not sure whether this guarantees finding the best solution,
+ * i.e. reducing the most clauses, but it probably does (thanks to
+ * having all the transitive dependencies).
+ */
+ deps_clauses = fdeps_reduce_clauses(deps_clauses,
+ deps_attnums, deps_matrix,
+ deps_idx_to_attnum,
+ deps_attnum_to_idx, relid);
+
+ /* join the two lists of clauses */
+ reduced_clauses = list_union(reduced_clauses, deps_clauses);
+
+ pfree(deps_matrix);
+ pfree(deps_idx_to_attnum);
+ pfree(deps_attnum_to_idx);
+
+ bms_free(deps_attnums);
+ bms_free(clause_attnums);
+ bms_free(intersect_attnums);
+
+ return reduced_clauses;
+}
+
+/*
+ * Check that there are stats with at least one of the requested types.
+ */
+static bool
+has_stats(List *stats, int type)
+{
+ ListCell *s;
+
+ foreach (s, stats)
+ {
+ MVStatisticInfo *stat = (MVStatisticInfo *)lfirst(s);
+
+ if ((type & MV_CLAUSE_TYPE_FDEP) && stat->deps_built)
+ return true;
+ }
+
+ return false;
+}
+
+/*
+ * Look up stats for a given baserel.
+ */
+static List *
+find_stats(PlannerInfo *root, Index relid)
+{
+ Assert(root->simple_rel_array[relid] != NULL);
+
+ return root->simple_rel_array[relid]->mvstatlist;
+}
diff --git a/src/backend/utils/mvstats/README.stats b/src/backend/utils/mvstats/README.stats
new file mode 100644
index 0000000..a38ea7b
--- /dev/null
+++ b/src/backend/utils/mvstats/README.stats
@@ -0,0 +1,36 @@
+Multivariate statistics
+=======================
+
+When estimating various quantities (e.g. condition selectivities) the default
+approach relies on the assumption of independence. In practice that's often
+not true, resulting in estimation errors.
+
+Multivariate stats track different types of dependencies between the columns,
+hopefully improving the estimates.
+
+Currently we only have one kind of multivariate statistics - soft functional
+dependencies - which we use to improve estimates of equality clauses. See
+README.dependencies for details.
+
+
+Selectivity estimation
+----------------------
+
+When estimating selectivity, we aim to achieve several things:
+
+ (a) maximize the estimate accuracy
+
+ (b) minimize the overhead, especially when no suitable multivariate stats
+ exist (so if you are not using multivariate stats, there's no overhead)
+
+Thus clauselist_selectivity() performs several inexpensive checks first,
+before even attempting the more expensive estimation:
+
+ (1) check if there are multivariate stats on the relation
+
+ (2) check there are at least two attributes referenced by clauses compatible
+ with multivariate statistics (equality clauses for func. dependencies)
+
+ (3) perform reduction of equality clauses using func. dependencies
+
+ (4) estimate the reduced list of clauses using regular statistics
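+
+As an illustration of steps (3) and (4) (table and column names are made up),
+consider a statistics object on (zip_code, city) with a functional dependency
+(zip_code => city), and a query
+
+    SELECT * FROM addresses WHERE zip_code = '12345' AND city = 'Anytown';
+
+Step (3) may remove the (city = ...) clause as implied by the zip code, so
+step (4) only estimates the (zip_code = ...) clause, avoiding the
+underestimate that multiplying the two per-column selectivities would cause.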
diff --git a/src/backend/utils/mvstats/common.c b/src/backend/utils/mvstats/common.c
index a755c49..bd200bc 100644
--- a/src/backend/utils/mvstats/common.c
+++ b/src/backend/utils/mvstats/common.c
@@ -84,7 +84,8 @@ build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
/*
* Analyze functional dependencies of columns.
*/
- deps = build_mv_dependencies(numrows, rows, attrs, stats);
+ if (stat->deps_enabled)
+ deps = build_mv_dependencies(numrows, rows, attrs, stats);
/* store the histogram / MCV list in the catalog */
update_mv_stats(stat->mvoid, deps, attrs);
@@ -163,6 +164,7 @@ list_mv_stats(Oid relid)
info->mvoid = HeapTupleGetOid(htup);
info->stakeys = buildint2vector(stats->stakeys.values, stats->stakeys.dim1);
+ info->deps_enabled = stats->deps_enabled;
info->deps_built = stats->deps_built;
result = lappend(result, info);
@@ -274,6 +276,7 @@ compare_scalars_partition(const void *a, const void *b, void *arg)
return ApplySortComparator(da, false, db, false, ssup);
}
+
/* initialize multi-dimensional sort */
MultiSortSupport
multi_sort_init(int ndims)
diff --git a/src/backend/utils/mvstats/dependencies.c b/src/backend/utils/mvstats/dependencies.c
index 2a064a0..c80ba33 100644
--- a/src/backend/utils/mvstats/dependencies.c
+++ b/src/backend/utils/mvstats/dependencies.c
@@ -435,3 +435,27 @@ pg_mv_stats_dependencies_show(PG_FUNCTION_ARGS)
PG_RETURN_TEXT_P(cstring_to_text(result));
}
+
+MVDependencies
+load_mv_dependencies(Oid mvoid)
+{
+ bool isnull = false;
+ Datum deps;
+
+ /* Fetch the pg_mv_statistic tuple for the requested statistics OID. */
+ HeapTuple htup = SearchSysCache1(MVSTATOID, ObjectIdGetDatum(mvoid));
+
+#ifdef USE_ASSERT_CHECKING
+ Form_pg_mv_statistic mvstat = (Form_pg_mv_statistic) GETSTRUCT(htup);
+ Assert(mvstat->deps_enabled && mvstat->deps_built);
+#endif
+
+ deps = SysCacheGetAttr(MVSTATOID, htup,
+ Anum_pg_mv_statistic_stadeps, &isnull);
+
+ Assert(!isnull);
+
+ ReleaseSysCache(htup);
+
+ return deserialize_mv_dependencies(DatumGetByteaP(deps));
+}
diff --git a/src/include/utils/mvstats.h b/src/include/utils/mvstats.h
index 7ebd961..cc43a79 100644
--- a/src/include/utils/mvstats.h
+++ b/src/include/utils/mvstats.h
@@ -17,12 +17,20 @@
#include "fmgr.h"
#include "commands/vacuum.h"
+/*
+ * Degree of how much MCV item / histogram bucket matches a clause.
+ * This is then considered when computing the selectivity.
+ */
+#define MVSTATS_MATCH_NONE 0 /* no match at all */
+#define MVSTATS_MATCH_PARTIAL 1 /* partial match */
+#define MVSTATS_MATCH_FULL 2 /* full match */
#define MVSTATS_MAX_DIMENSIONS 8 /* max number of attributes */
-/* An associative rule, tracking [a => b] dependency.
- *
- * TODO Make this work with multiple columns on both sides.
+
+/*
+ * Functional dependencies, tracking column-level relationships (values
+ * in one column determine values in another one).
*/
typedef struct MVDependencyData {
int16 a;
@@ -48,6 +56,8 @@ typedef MVDependenciesData* MVDependencies;
* stats specified using flags (or something like that).
*/
+MVDependencies load_mv_dependencies(Oid mvoid);
+
bytea * serialize_mv_dependencies(MVDependencies dependencies);
/* deserialization of stats (serialization is private to analyze) */
diff --git a/src/test/regress/expected/mv_dependencies.out b/src/test/regress/expected/mv_dependencies.out
new file mode 100644
index 0000000..e759997
--- /dev/null
+++ b/src/test/regress/expected/mv_dependencies.out
@@ -0,0 +1,172 @@
+-- data type passed by value
+CREATE TABLE functional_dependencies (
+ a INT,
+ b INT,
+ c INT
+);
+-- unknown column
+CREATE STATISTICS s1 ON functional_dependencies (unknown_column) WITH (dependencies);
+ERROR: column "unknown_column" referenced in statistics does not exist
+-- single column
+CREATE STATISTICS s1 ON functional_dependencies (a) WITH (dependencies);
+ERROR: multivariate stats require 2 or more columns
+-- single column, duplicated
+CREATE STATISTICS s1 ON functional_dependencies (a,a) WITH (dependencies);
+ERROR: duplicate column name in statistics definition
+-- two columns, one duplicated
+CREATE STATISTICS s1 ON functional_dependencies (a, a, b) WITH (dependencies);
+ERROR: duplicate column name in statistics definition
+-- unknown option
+CREATE STATISTICS s1 ON functional_dependencies (a, b, c) WITH (unknown_option);
+ERROR: unrecognized STATISTICS option "unknown_option"
+-- correct command
+CREATE STATISTICS s1 ON functional_dependencies (a, b, c) WITH (dependencies);
+-- random data (no functional dependencies)
+INSERT INTO functional_dependencies
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+ANALYZE functional_dependencies;
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+ deps_enabled | deps_built | pg_mv_stats_dependencies_show
+--------------+------------+-------------------------------
+ t | f |
+(1 row)
+
+TRUNCATE functional_dependencies;
+-- a => b, a => c, b => c
+INSERT INTO functional_dependencies
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE functional_dependencies;
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+ deps_enabled | deps_built | pg_mv_stats_dependencies_show
+--------------+------------+-------------------------------
+ t | t | 1 => 2, 1 => 3, 2 => 3
+(1 row)
+
+TRUNCATE functional_dependencies;
+-- a => b, a => c
+INSERT INTO functional_dependencies
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE functional_dependencies;
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+ deps_enabled | deps_built | pg_mv_stats_dependencies_show
+--------------+------------+-------------------------------
+ t | t | 1 => 2, 1 => 3
+(1 row)
+
+TRUNCATE functional_dependencies;
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO functional_dependencies
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX fdeps_idx ON functional_dependencies (a, b);
+ANALYZE functional_dependencies;
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+ deps_enabled | deps_built | pg_mv_stats_dependencies_show
+--------------+------------+-------------------------------
+ t | t | 1 => 2, 1 => 3, 2 => 3
+(1 row)
+
+EXPLAIN (COSTS off)
+ SELECT * FROM functional_dependencies WHERE a = 10 AND b = 5;
+ QUERY PLAN
+---------------------------------------------
+ Bitmap Heap Scan on functional_dependencies
+ Recheck Cond: ((a = 10) AND (b = 5))
+ -> Bitmap Index Scan on fdeps_idx
+ Index Cond: ((a = 10) AND (b = 5))
+(4 rows)
+
+DROP TABLE functional_dependencies;
+-- varlena type (text)
+CREATE TABLE functional_dependencies (
+ a TEXT,
+ b TEXT,
+ c TEXT
+);
+CREATE STATISTICS s2 ON functional_dependencies (a, b, c) WITH (dependencies);
+-- random data (no functional dependencies)
+INSERT INTO functional_dependencies
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+ANALYZE functional_dependencies;
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+ deps_enabled | deps_built | pg_mv_stats_dependencies_show
+--------------+------------+-------------------------------
+ t | f |
+(1 row)
+
+TRUNCATE functional_dependencies;
+-- a => b, a => c, b => c
+INSERT INTO functional_dependencies
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE functional_dependencies;
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+ deps_enabled | deps_built | pg_mv_stats_dependencies_show
+--------------+------------+-------------------------------
+ t | t | 1 => 2, 1 => 3, 2 => 3
+(1 row)
+
+TRUNCATE functional_dependencies;
+-- a => b, a => c
+INSERT INTO functional_dependencies
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE functional_dependencies;
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+ deps_enabled | deps_built | pg_mv_stats_dependencies_show
+--------------+------------+-------------------------------
+ t | t | 1 => 2, 1 => 3
+(1 row)
+
+TRUNCATE functional_dependencies;
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO functional_dependencies
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX fdeps_idx ON functional_dependencies (a, b);
+ANALYZE functional_dependencies;
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+ deps_enabled | deps_built | pg_mv_stats_dependencies_show
+--------------+------------+-------------------------------
+ t | t | 1 => 2, 1 => 3, 2 => 3
+(1 row)
+
+EXPLAIN (COSTS off)
+ SELECT * FROM functional_dependencies WHERE a = '10' AND b = '5';
+ QUERY PLAN
+------------------------------------------------------------
+ Bitmap Heap Scan on functional_dependencies
+ Recheck Cond: ((a = '10'::text) AND (b = '5'::text))
+ -> Bitmap Index Scan on fdeps_idx
+ Index Cond: ((a = '10'::text) AND (b = '5'::text))
+(4 rows)
+
+DROP TABLE functional_dependencies;
+-- NULL values (mix of int and text columns)
+CREATE TABLE functional_dependencies (
+ a INT,
+ b TEXT,
+ c INT,
+ d TEXT
+);
+CREATE STATISTICS s3 ON functional_dependencies (a, b, c, d) WITH (dependencies);
+INSERT INTO functional_dependencies
+ SELECT
+ mod(i, 100),
+ (CASE WHEN mod(i, 200) = 0 THEN NULL ELSE mod(i,200) END),
+ mod(i, 400),
+ (CASE WHEN mod(i, 300) = 0 THEN NULL ELSE mod(i,600) END)
+ FROM generate_series(1,10000) s(i);
+ANALYZE functional_dependencies;
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+ deps_enabled | deps_built | pg_mv_stats_dependencies_show
+--------------+------------+----------------------------------------
+ t | t | 2 => 1, 3 => 1, 3 => 2, 4 => 1, 4 => 2
+(1 row)
+
+DROP TABLE functional_dependencies;
diff --git a/src/test/regress/parallel_schedule b/src/test/regress/parallel_schedule
index bec0316..4f2ffb8 100644
--- a/src/test/regress/parallel_schedule
+++ b/src/test/regress/parallel_schedule
@@ -110,3 +110,6 @@ test: event_trigger
# run stats by itself because its delay may be insufficient under heavy load
test: stats
+
+# run tests of multivariate stats
+test: mv_dependencies
diff --git a/src/test/regress/serial_schedule b/src/test/regress/serial_schedule
index 7e9b319..097a04f 100644
--- a/src/test/regress/serial_schedule
+++ b/src/test/regress/serial_schedule
@@ -162,3 +162,4 @@ test: with
test: xml
test: event_trigger
test: stats
+test: mv_dependencies
diff --git a/src/test/regress/sql/mv_dependencies.sql b/src/test/regress/sql/mv_dependencies.sql
new file mode 100644
index 0000000..48dea4d
--- /dev/null
+++ b/src/test/regress/sql/mv_dependencies.sql
@@ -0,0 +1,150 @@
+-- data type passed by value
+CREATE TABLE functional_dependencies (
+ a INT,
+ b INT,
+ c INT
+);
+
+-- unknown column
+CREATE STATISTICS s1 ON functional_dependencies (unknown_column) WITH (dependencies);
+
+-- single column
+CREATE STATISTICS s1 ON functional_dependencies (a) WITH (dependencies);
+
+-- single column, duplicated
+CREATE STATISTICS s1 ON functional_dependencies (a,a) WITH (dependencies);
+
+-- two columns, one duplicated
+CREATE STATISTICS s1 ON functional_dependencies (a, a, b) WITH (dependencies);
+
+-- unknown option
+CREATE STATISTICS s1 ON functional_dependencies (a, b, c) WITH (unknown_option);
+
+-- correct command
+CREATE STATISTICS s1 ON functional_dependencies (a, b, c) WITH (dependencies);
+
+-- random data (no functional dependencies)
+INSERT INTO functional_dependencies
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+
+ANALYZE functional_dependencies;
+
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+
+TRUNCATE functional_dependencies;
+
+-- a => b, a => c, b => c
+INSERT INTO functional_dependencies
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+
+ANALYZE functional_dependencies;
+
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+
+TRUNCATE functional_dependencies;
+
+-- a => b, a => c
+INSERT INTO functional_dependencies
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE functional_dependencies;
+
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+
+TRUNCATE functional_dependencies;
+
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO functional_dependencies
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX fdeps_idx ON functional_dependencies (a, b);
+ANALYZE functional_dependencies;
+
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+
+EXPLAIN (COSTS off)
+ SELECT * FROM functional_dependencies WHERE a = 10 AND b = 5;
+
+DROP TABLE functional_dependencies;
+
+-- varlena type (text)
+CREATE TABLE functional_dependencies (
+ a TEXT,
+ b TEXT,
+ c TEXT
+);
+
+CREATE STATISTICS s2 ON functional_dependencies (a, b, c) WITH (dependencies);
+
+-- random data (no functional dependencies)
+INSERT INTO functional_dependencies
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+
+ANALYZE functional_dependencies;
+
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+
+TRUNCATE functional_dependencies;
+
+-- a => b, a => c, b => c
+INSERT INTO functional_dependencies
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+
+ANALYZE functional_dependencies;
+
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+
+TRUNCATE functional_dependencies;
+
+-- a => b, a => c
+INSERT INTO functional_dependencies
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE functional_dependencies;
+
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+
+TRUNCATE functional_dependencies;
+
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO functional_dependencies
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX fdeps_idx ON functional_dependencies (a, b);
+ANALYZE functional_dependencies;
+
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+
+EXPLAIN (COSTS off)
+ SELECT * FROM functional_dependencies WHERE a = '10' AND b = '5';
+
+DROP TABLE functional_dependencies;
+
+-- NULL values (mix of int and text columns)
+CREATE TABLE functional_dependencies (
+ a INT,
+ b TEXT,
+ c INT,
+ d TEXT
+);
+
+CREATE STATISTICS s3 ON functional_dependencies (a, b, c, d) WITH (dependencies);
+
+INSERT INTO functional_dependencies
+ SELECT
+ mod(i, 100),
+ (CASE WHEN mod(i, 200) = 0 THEN NULL ELSE mod(i,200) END),
+ mod(i, 400),
+ (CASE WHEN mod(i, 300) = 0 THEN NULL ELSE mod(i,600) END)
+ FROM generate_series(1,10000) s(i);
+
+ANALYZE functional_dependencies;
+
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+
+DROP TABLE functional_dependencies;
--
2.1.0
[Attachment: 0004-multivariate-MCV-lists.patch (text/x-patch)]
From 11e08f7a0ffc186dbc23605d522c278e9b393ea5 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@pgaddict.com>
Date: Mon, 6 Apr 2015 16:52:15 +0200
Subject: [PATCH 4/9] multivariate MCV lists
- extends the pg_mv_statistic catalog (add 'mcv' fields)
- building the MCV lists during ANALYZE
- simple estimation while planning the queries
Includes regression tests, mostly mirroring the regression tests for
functional dependencies.
---
doc/src/sgml/ref/create_statistics.sgml | 18 +
src/backend/catalog/system_views.sql | 4 +-
src/backend/commands/statscmds.c | 45 +-
src/backend/nodes/outfuncs.c | 2 +
src/backend/optimizer/path/clausesel.c | 829 ++++++++++++++++++++++-
src/backend/optimizer/util/plancat.c | 4 +-
src/backend/utils/mvstats/Makefile | 2 +-
src/backend/utils/mvstats/README.mcv | 137 ++++
src/backend/utils/mvstats/README.stats | 89 ++-
src/backend/utils/mvstats/common.c | 104 ++-
src/backend/utils/mvstats/common.h | 11 +-
src/backend/utils/mvstats/mcv.c | 1094 +++++++++++++++++++++++++++++++
src/bin/psql/describe.c | 25 +-
src/include/catalog/pg_mv_statistic.h | 18 +-
src/include/catalog/pg_proc.h | 4 +
src/include/nodes/relation.h | 2 +
src/include/utils/mvstats.h | 69 +-
src/test/regress/expected/mv_mcv.out | 207 ++++++
src/test/regress/expected/rules.out | 4 +-
src/test/regress/parallel_schedule | 2 +-
src/test/regress/serial_schedule | 1 +
src/test/regress/sql/mv_mcv.sql | 178 +++++
22 files changed, 2776 insertions(+), 73 deletions(-)
create mode 100644 src/backend/utils/mvstats/README.mcv
create mode 100644 src/backend/utils/mvstats/mcv.c
create mode 100644 src/test/regress/expected/mv_mcv.out
create mode 100644 src/test/regress/sql/mv_mcv.sql
diff --git a/doc/src/sgml/ref/create_statistics.sgml b/doc/src/sgml/ref/create_statistics.sgml
index a86eae3..193e4b0 100644
--- a/doc/src/sgml/ref/create_statistics.sgml
+++ b/doc/src/sgml/ref/create_statistics.sgml
@@ -132,6 +132,24 @@ CREATE STATISTICS [ IF NOT EXISTS ] <replaceable class="PARAMETER">statistics_na
</listitem>
</varlistentry>
+ <varlistentry>
+ <term><literal>max_mcv_items</> (<type>integer</>)</term>
+ <listitem>
+ <para>
+ Maximum number of MCV list items.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><literal>mcv</> (<type>boolean</>)</term>
+ <listitem>
+ <para>
+ Enables MCV list for the statistics.
+ </para>
+ </listitem>
+ </varlistentry>
+
</variablelist>
</refsect2>
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index b8a264e..2d570ee 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -165,7 +165,9 @@ CREATE VIEW pg_mv_stats AS
S.staname AS staname,
S.stakeys AS attnums,
length(S.stadeps) as depsbytes,
- pg_mv_stats_dependencies_info(S.stadeps) as depsinfo
+ pg_mv_stats_dependencies_info(S.stadeps) as depsinfo,
+ length(S.stamcv) AS mcvbytes,
+ pg_mv_stats_mcvlist_info(S.stamcv) AS mcvinfo
FROM (pg_mv_statistic S JOIN pg_class C ON (C.oid = S.starelid))
LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace);
diff --git a/src/backend/commands/statscmds.c b/src/backend/commands/statscmds.c
index 1b89bbe..b04c583 100644
--- a/src/backend/commands/statscmds.c
+++ b/src/backend/commands/statscmds.c
@@ -70,7 +70,13 @@ CreateStatistics(CreateStatsStmt *stmt)
ObjectAddress parentobject, childobject;
/* by default build nothing */
- bool build_dependencies = false;
+ bool build_dependencies = false,
+ build_mcv = false;
+
+ int32 max_mcv_items = -1;
+
+ /* options required because of other options */
+ bool require_mcv = false;
Assert(IsA(stmt, CreateStatsStmt));
@@ -146,6 +152,29 @@ CreateStatistics(CreateStatsStmt *stmt)
if (strcmp(opt->defname, "dependencies") == 0)
build_dependencies = defGetBoolean(opt);
+ else if (strcmp(opt->defname, "mcv") == 0)
+ build_mcv = defGetBoolean(opt);
+ else if (strcmp(opt->defname, "max_mcv_items") == 0)
+ {
+ max_mcv_items = defGetInt32(opt);
+
+ /* this option requires 'mcv' to be enabled */
+ require_mcv = true;
+
+ /* sanity check */
+ if (max_mcv_items < MVSTAT_MCVLIST_MIN_ITEMS)
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("max number of MCV items must be at least %d",
+ MVSTAT_MCVLIST_MIN_ITEMS)));
+
+ else if (max_mcv_items > MVSTAT_MCVLIST_MAX_ITEMS)
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("max number of MCV items is %d",
+ MVSTAT_MCVLIST_MAX_ITEMS)));
+
+ }
else
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
@@ -154,10 +183,16 @@ CreateStatistics(CreateStatsStmt *stmt)
}
/* check that at least some statistics were requested */
- if (! build_dependencies)
+ if (! (build_dependencies || build_mcv))
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("no statistics type (dependencies, mcv) was requested")));
+
+ /* now do some checking of the options */
+ if (require_mcv && (! build_mcv))
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
- errmsg("no statistics type (dependencies) was requested")));
+ errmsg("option 'mcv' is required by other options(s)")));
/* sort the attnums and build int2vector */
qsort(attnums, numcols, sizeof(int16), compare_int16);
@@ -178,8 +213,12 @@ CreateStatistics(CreateStatsStmt *stmt)
values[Anum_pg_mv_statistic_stakeys -1] = PointerGetDatum(stakeys);
values[Anum_pg_mv_statistic_deps_enabled -1] = BoolGetDatum(build_dependencies);
+ values[Anum_pg_mv_statistic_mcv_enabled -1] = BoolGetDatum(build_mcv);
+
+ values[Anum_pg_mv_statistic_mcv_max_items -1] = Int32GetDatum(max_mcv_items);
nulls[Anum_pg_mv_statistic_stadeps -1] = true;
+ nulls[Anum_pg_mv_statistic_stamcv -1] = true;
/* insert the tuple into pg_mv_statistic */
mvstatrel = heap_open(MvStatisticRelationId, RowExclusiveLock);
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index 07206d7..333e24b 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -2162,9 +2162,11 @@ _outMVStatisticInfo(StringInfo str, const MVStatisticInfo *node)
/* enabled statistics */
WRITE_BOOL_FIELD(deps_enabled);
+ WRITE_BOOL_FIELD(mcv_enabled);
/* built/available statistics */
WRITE_BOOL_FIELD(deps_built);
+ WRITE_BOOL_FIELD(mcv_built);
}
static void
diff --git a/src/backend/optimizer/path/clausesel.c b/src/backend/optimizer/path/clausesel.c
index 80708fe..977f88e 100644
--- a/src/backend/optimizer/path/clausesel.c
+++ b/src/backend/optimizer/path/clausesel.c
@@ -15,6 +15,7 @@
#include "postgres.h"
#include "access/sysattr.h"
+#include "catalog/pg_collation.h"
#include "catalog/pg_operator.h"
#include "nodes/makefuncs.h"
#include "optimizer/clauses.h"
@@ -47,23 +48,51 @@ static void addRangeClause(RangeQueryClause **rqlist, Node *clause,
bool varonleft, bool isLTsel, Selectivity s2);
#define MV_CLAUSE_TYPE_FDEP 0x01
+#define MV_CLAUSE_TYPE_MCV 0x02
-static bool clause_is_mv_compatible(Node *clause, Index relid, AttrNumber *attnum);
+static bool clause_is_mv_compatible(Node *clause, Index relid, Bitmapset **attnums,
+ int type);
-static Bitmapset *collect_mv_attnums(List *clauses, Index relid);
+static Bitmapset *collect_mv_attnums(List *clauses, Index relid, int type);
-static int count_mv_attnums(List *clauses, Index relid);
+static int count_mv_attnums(List *clauses, Index relid, int type);
static int count_varnos(List *clauses, Index *relid);
static List *clauselist_apply_dependencies(PlannerInfo *root, List *clauses,
Index relid, List *stats);
+static MVStatisticInfo *choose_mv_statistics(List *mvstats, Bitmapset *attnums);
+
+static List *clauselist_mv_split(PlannerInfo *root, Index relid,
+ List *clauses, List **mvclauses,
+ MVStatisticInfo *mvstats, int types);
+
+static Selectivity clauselist_mv_selectivity(PlannerInfo *root,
+ List *clauses, MVStatisticInfo *mvstats);
+
+static Selectivity clauselist_mv_selectivity_mcvlist(PlannerInfo *root,
+ List *clauses, MVStatisticInfo *mvstats,
+ bool *fullmatch, Selectivity *lowsel);
+
+static int update_match_bitmap_mcvlist(PlannerInfo *root, List *clauses,
+ int2vector *stakeys, MCVList mcvlist,
+ int nmatches, char * matches,
+ Selectivity *lowsel, bool *fullmatch,
+ bool is_or);
+
static bool has_stats(List *stats, int type);
static List * find_stats(PlannerInfo *root, Index relid);
+/* used for merging bitmaps - AND (min), OR (max) */
+#define MAX(x, y) (((x) > (y)) ? (x) : (y))
+#define MIN(x, y) (((x) < (y)) ? (x) : (y))
+
+#define UPDATE_RESULT(m,r,isor) \
+ (m) = (isor) ? (MAX(m,r)) : (MIN(m,r))
+
/****************************************************************************
* ROUTINES TO COMPUTE SELECTIVITIES
****************************************************************************/
@@ -89,11 +118,13 @@ static List * find_stats(PlannerInfo *root, Index relid);
* to verify that suitable multivariate statistics exist.
*
* If we identify such multivariate statistics apply, we try to apply them.
- * Currently we only have (soft) functional dependencies, so we try to reduce
- * the list of clauses.
*
- * Then we remove the clauses estimated using multivariate stats, and process
- * the rest of the clauses using the regular per-column stats.
+ * First we try to reduce the list of clauses by applying (soft) functional
+ * dependencies, and then we try to estimate the selectivity of the reduced
+ * list of clauses using the multivariate MCV list.
+ *
+ * Finally we remove the portion of clauses estimated using multivariate stats,
+ * and process the rest of the clauses using the regular per-column stats.
*
* Currently, the only extra smarts we have is to recognize "range queries",
* such as "x > 34 AND x < 42". Clauses are recognized as possible range
@@ -170,12 +201,46 @@ clauselist_selectivity(PlannerInfo *root,
* that need to be estimated by other types of stats (MCV, histograms etc).
*/
if (has_stats(stats, MV_CLAUSE_TYPE_FDEP) &&
- (count_mv_attnums(clauses, relid) >= 2))
+ (count_mv_attnums(clauses, relid, MV_CLAUSE_TYPE_FDEP) >= 2))
{
clauses = clauselist_apply_dependencies(root, clauses, relid, stats);
}
/*
+ * Check that there are statistics with MCV list or histogram, and also the
+ * number of attributes covered by these types of statistics.
+ *
+ * If there are no such stats or not enough attributes, don't waste time
+ * with the multivariate code and simply skip to estimation using the
+ * regular per-column stats.
+ */
+ if (has_stats(stats, MV_CLAUSE_TYPE_MCV) &&
+ (count_mv_attnums(clauses, relid, MV_CLAUSE_TYPE_MCV) >= 2))
+ {
+ /* collect attributes from the compatible conditions */
+ Bitmapset *mvattnums = collect_mv_attnums(clauses, relid, MV_CLAUSE_TYPE_MCV);
+
+ /* and search for the statistic covering the most attributes */
+ MVStatisticInfo *mvstat = choose_mv_statistics(stats, mvattnums);
+
+ if (mvstat != NULL) /* we have matching stats */
+ {
+ /* clauses compatible with multi-variate stats */
+ List *mvclauses = NIL;
+
+ /* split the clauselist into regular and mv-clauses */
+ clauses = clauselist_mv_split(root, relid, clauses, &mvclauses,
+ mvstat, MV_CLAUSE_TYPE_MCV);
+
+ /* we've chosen the histogram to match the clauses */
+ Assert(mvclauses != NIL);
+
+ /* compute the multivariate stats */
+ s1 *= clauselist_mv_selectivity(root, mvclauses, mvstat);
+ }
+ }
+
+ /*
* Initial scan over clauses. Anything that doesn't look like a potential
* rangequery clause gets multiplied into s1 and forgotten. Anything that
* does gets inserted into an rqlist entry.
@@ -832,6 +897,69 @@ clause_selectivity(PlannerInfo *root,
return s1;
}
+
+/*
+ * estimate selectivity of clauses using multivariate statistic
+ *
+ * Perform estimation of the clauses using a MCV list.
+ *
+ * This assumes all the clauses are compatible with the selected statistics
+ * (e.g. only reference columns covered by the statistics, use a supported
+ * operator, etc.).
+ *
+ * TODO We may support some additional conditions, most importantly those
+ * matching multiple columns (e.g. "a = b" or "a < b").
+ *
+ * TODO Clamp the selectivity by min of the per-clause selectivities (i.e. the
+ * selectivity of the most restrictive clause), because that's the maximum
+ * we can ever get from an ANDed list of clauses. This should help prevent
+ * issues with hitting too many buckets and low precision histograms.
+ *
+ * TODO We may remember the lowest frequency in the MCV list, and then later use
+ * it as an upper boundary for the selectivity (had there been a more
+ * frequent item, it'd be in the MCV list). This might improve cases with
+ * low-detail histograms.
+ *
+ * TODO We may also derive some additional boundaries for the selectivity from
+ * the MCV list, because
+ *
+ * (a) if we have a "full equality condition" (one equality condition on
+ * each column of the statistic) and we found a match in the MCV list,
+ * then this is the final selectivity (and pretty accurate),
+ *
+ * (b) if we have a "full equality condition" and we haven't found a match
+ * in the MCV list, then the selectivity is below the lowest frequency
+ * found in the MCV list,
+ *
+ * TODO When applying the clauses to the histogram/MCV list, we can do
+ * that from the most selective clauses first, because that'll
+ * eliminate the buckets/items sooner (so we'll be able to skip
+ * them without inspection, which is more expensive). But this
+ * requires really knowing the per-clause selectivities in advance,
+ * and that's not what we do now.
+ */
+static Selectivity
+clauselist_mv_selectivity(PlannerInfo *root, List *clauses, MVStatisticInfo *mvstats)
+{
+ bool fullmatch = false;
+
+ /*
+ * Lowest frequency in the MCV list (may be used as an upper bound
+ * for full equality conditions that did not match any MCV item).
+ */
+ Selectivity mcv_low = 0.0;
+
+ /* TODO Evaluate simple 1D selectivities, use the smallest one as
+ * an upper bound, product as lower bound, and sort the
+ * clauses in ascending order by selectivity (to optimize the
+ * MCV/histogram evaluation).
+ */
+
+ /* Evaluate the MCV selectivity */
+ return clauselist_mv_selectivity_mcvlist(root, clauses, mvstats,
+ &fullmatch, &mcv_low);
+}
+
/*
* Pull varattnos from the clauses, similarly to pull_varattnos() but:
*
@@ -869,28 +997,26 @@ get_varattnos(Node * node, Index relid)
* Collect attributes from mv-compatible clauses.
*/
static Bitmapset *
-collect_mv_attnums(List *clauses, Index relid)
+collect_mv_attnums(List *clauses, Index relid, int types)
{
Bitmapset *attnums = NULL;
ListCell *l;
/*
- * Walk through the clauses and identify the ones we can estimate
- * using multivariate stats, and remember the relid/columns. We'll
- * then cross-check if we have suitable stats, and only if needed
- * we'll split the clauses into multivariate and regular lists.
+ * Walk through the clauses and identify the ones we can estimate using
+ * multivariate stats, and remember the relid/columns. We'll then
+ * cross-check if we have suitable stats, and only if needed we'll split
+ * the clauses into multivariate and regular lists.
*
- * For now we're only interested in RestrictInfo nodes with nested
- * OpExpr, using either a range or equality.
+ * For now we're only interested in RestrictInfo nodes with nested OpExpr,
+ * using either a range or equality.
*/
foreach (l, clauses)
{
- AttrNumber attnum;
Node *clause = (Node *) lfirst(l);
- /* ignore the result for now - we only need the info */
- if (clause_is_mv_compatible(clause, relid, &attnum))
- attnums = bms_add_member(attnums, attnum);
+ /* ignore the result here - we only need the attnums */
+ clause_is_mv_compatible(clause, relid, &attnums, types);
}
/*
@@ -911,10 +1037,10 @@ collect_mv_attnums(List *clauses, Index relid)
* Count the number of attributes in clauses compatible with multivariate stats.
*/
static int
-count_mv_attnums(List *clauses, Index relid)
+count_mv_attnums(List *clauses, Index relid, int type)
{
int c;
- Bitmapset *attnums = collect_mv_attnums(clauses, relid);
+ Bitmapset *attnums = collect_mv_attnums(clauses, relid, type);
c = bms_num_members(attnums);
@@ -944,9 +1070,183 @@ count_varnos(List *clauses, Index *relid)
return cnt;
}
+
+/*
+ * We're looking for statistics matching at least 2 attributes, referenced in
+ * clauses compatible with multivariate statistics. The current selection
+ * criterion is very simple - we choose the statistics referencing the most
+ * attributes.
+ *
+ * If there are multiple statistics referencing the same number of columns
+ * (from the clauses), the one with fewer source columns (as listed in the
+ * CREATE STATISTICS command) wins. Otherwise the first one wins.
+ *
+ * This is a very simple criterion, and it has several weaknesses:
+ *
+ * (a) does not consider the accuracy of the statistics
+ *
+ * If there are two histograms built on the same set of columns, but one
+ * has 100 buckets and the other one has 1000 buckets (thus likely
+ * providing better estimates), this is not currently considered.
+ *
+ * (b) does not consider the type of statistics
+ *
+ * If there are three statistics - one containing just a MCV list, another
+ * one with just a histogram and a third one with both, we treat them equally.
+ *
+ * (c) does not consider the number of clauses
+ *
+ * As explained, only the number of referenced attributes counts, so if
+ * there are multiple clauses on a single attribute, this still counts as
+ * a single attribute.
+ *
+ * (d) does not consider type of condition
+ *
+ * Some clauses may work better with some statistics - for example equality
+ * clauses probably work better with MCV lists than with histograms. But
+ * IS [NOT] NULL conditions may often work better with histograms (thanks
+ * to NULL-buckets).
+ *
+ * So for example with five WHERE conditions
+ *
+ * WHERE (a = 1) AND (b = 1) AND (c = 1) AND (d = 1) AND (e = 1)
+ *
+ * and statistics on (a,b), (a,b,e) and (a,b,c,d), the last one will be selected
+ * as it references the most columns.
+ *
+ * Once we have selected the multivariate statistics, we split the list of
+ * clauses into two parts - conditions that are compatible with the selected
+ * stats, and conditions that will be estimated using simple statistics.
+ *
+ * From the example above, conditions
+ *
+ * (a = 1) AND (b = 1) AND (c = 1) AND (d = 1)
+ *
+ * will be estimated using the multivariate statistics (a,b,c,d) while the last
+ * condition (e = 1) will get estimated using the regular ones.
+ *
+ * There are various alternative selection criteria (e.g. counting conditions
+ * instead of just referenced attributes), but eventually the best option should
+ * be to combine multiple statistics. But that's much harder to do correctly.
+ *
+ * TODO Select multiple statistics and combine them when computing the estimate.
+ *
+ * TODO This will probably have to consider compatibility of clauses, because
+ * 'dependencies' will probably work only with equality clauses.
+ */
+static MVStatisticInfo *
+choose_mv_statistics(List *stats, Bitmapset *attnums)
+{
+ int i;
+ ListCell *lc;
+
+ MVStatisticInfo *choice = NULL;
+
+ int current_matches = 1; /* goal #1: maximize */
+ int current_dims = (MVSTATS_MAX_DIMENSIONS+1); /* goal #2: minimize */
+
+ /*
+ * Walk through the statistics (simple array with nmvstats elements) and for
+ * each one count the referenced attributes (encoded in the 'attnums' bitmap).
+ */
+ foreach (lc, stats)
+ {
+ MVStatisticInfo *info = (MVStatisticInfo *)lfirst(lc);
+
+ /* columns matching this statistics */
+ int matches = 0;
+
+ int2vector * attrs = info->stakeys;
+ int numattrs = attrs->dim1;
+
+ /* skip dependencies-only stats */
+ if (! info->mcv_built)
+ continue;
+
+ /* count columns covered by the histogram */
+ for (i = 0; i < numattrs; i++)
+ if (bms_is_member(attrs->values[i], attnums))
+ matches++;
+
+ /*
+ * Use this statistics when it improves the number of matches or
+ * when it matches the same number of attributes but is smaller.
+ */
+ if ((matches > current_matches) ||
+ ((matches == current_matches) && (current_dims > numattrs)))
+ {
+ choice = info;
+ current_matches = matches;
+ current_dims = numattrs;
+ }
+ }
+
+ return choice;
+}
+
+
+/*
+ * This splits the clauses list into two parts - one containing clauses that
+ * will be evaluated using the chosen statistics, and the remaining clauses
+ * (either non-mvcompatible, or not related to the histogram).
+ */
+static List *
+clauselist_mv_split(PlannerInfo *root, Index relid,
+ List *clauses, List **mvclauses,
+ MVStatisticInfo *mvstats, int types)
+{
+ int i;
+ ListCell *l;
+ List *non_mvclauses = NIL;
+
+ /* FIXME is there a better way to get info on int2vector? */
+ int2vector * attrs = mvstats->stakeys;
+ int numattrs = mvstats->stakeys->dim1;
+
+ Bitmapset *mvattnums = NULL;
+
+ /* build bitmap of attributes, so we can do bms_is_subset later */
+ for (i = 0; i < numattrs; i++)
+ mvattnums = bms_add_member(mvattnums, attrs->values[i]);
+
+ /* erase the list of mv-compatible clauses */
+ *mvclauses = NIL;
+
+ foreach (l, clauses)
+ {
+ bool match = false; /* by default not mv-compatible */
+ Bitmapset *attnums = NULL;
+ Node *clause = (Node *) lfirst(l);
+
+ if (clause_is_mv_compatible(clause, relid, &attnums, types))
+ {
+ /* are all the attributes part of the selected stats? */
+ if (bms_is_subset(attnums, mvattnums))
+ match = true;
+ }
+
+ /*
+ * The clause matches the selected stats, so put it to the list of
+ * mv-compatible clauses. Otherwise, keep it in the list of 'regular'
+ * clauses (that may be selected later).
+ */
+ if (match)
+ *mvclauses = lappend(*mvclauses, clause);
+ else
+ non_mvclauses = lappend(non_mvclauses, clause);
+ }
+
+ /*
+ * Return the remaining clauses, to be estimated later using the regular
+ * per-column statistics (they are incompatible with the chosen stats).
+ */
+ return non_mvclauses;
+
+}
typedef struct
{
+ int types; /* types of statistics to consider */
Index varno; /* relid we're interested in */
Bitmapset *varattnos; /* attnums referenced by the clauses */
} mv_compatible_context;
@@ -964,23 +1264,66 @@ mv_compatible_walker(Node *node, mv_compatible_context *context)
{
if (node == NULL)
return false;
-
+
if (IsA(node, RestrictInfo))
{
RestrictInfo *rinfo = (RestrictInfo *) node;
-
+
/* Pseudoconstants are not really interesting here. */
if (rinfo->pseudoconstant)
return true;
-
+
/* clauses referencing multiple varnos are incompatible */
if (bms_membership(rinfo->clause_relids) != BMS_SINGLETON)
return true;
-
+
/* check the clause inside the RestrictInfo */
return mv_compatible_walker((Node*)rinfo->clause, (void *) context);
}
+ if (or_clause(node) || and_clause(node) || not_clause(node))
+ {
+ /*
+ * AND/OR/NOT-clauses are supported if all sub-clauses are supported
+ *
+ * TODO We might support mixed case, where some of the clauses are
+ * supported and some are not, and treat all supported subclauses
+ * as a single clause, compute it's selectivity using mv stats,
+ * and compute the total selectivity using the current algorithm.
+ *
+ * TODO For RestrictInfo above an OR-clause, we might use the orclause
+ * with nested RestrictInfo - we won't have to call pull_varnos()
+ * for each clause, saving time.
+ *
+ * TODO Perhaps this needs a bit more thought for functional
+ * dependencies? Those don't quite work for NOT cases.
+ */
+ BoolExpr *expr = (BoolExpr *) node;
+ ListCell *lc;
+
+ foreach (lc, expr->args)
+ {
+ if (mv_compatible_walker((Node *) lfirst(lc), context))
+ return true;
+ }
+
+ return false;
+ }
+
+ if (IsA(node, NullTest))
+ {
+ NullTest* nt = (NullTest*)node;
+
+ /*
+ * Only simple (Var IS NULL) expressions are supported for now. Maybe we could
+ * use examine_variable to fix this?
+ */
+ if (! IsA(nt->arg, Var))
+ return true;
+
+ return mv_compatible_walker((Node*)(nt->arg), context);
+ }
+
if (IsA(node, Var))
{
Var * var = (Var*)node;
@@ -1031,7 +1374,7 @@ mv_compatible_walker(Node *node, mv_compatible_context *context)
/* unsupported structure (two variables or so) */
if (! ok)
return true;
-
+
/*
* If it's not a "<" or ">" or "=" operator, just ignore the clause.
* Otherwise note the relid and attnum for the variable. This uses the
@@ -1041,10 +1384,18 @@ mv_compatible_walker(Node *node, mv_compatible_context *context)
switch (get_oprrest(expr->opno))
{
case F_EQSEL:
-
/* equality conditions are compatible with all statistics */
break;
+ case F_SCALARLTSEL:
+ case F_SCALARGTSEL:
+
+ /* not compatible with functional dependencies */
+ if (! (context->types & MV_CLAUSE_TYPE_MCV))
+ return true; /* terminate */
+
+ break;
+
default:
/* unknown estimator */
@@ -1055,11 +1406,11 @@ mv_compatible_walker(Node *node, mv_compatible_context *context)
return mv_compatible_walker((Node *) var, context);
}
-
+
/* Node not explicitly supported, so terminate */
return true;
}
-
+
/*
* Determines whether the clause is compatible with multivariate stats,
* and if it is, returns some additional information - varno (index
@@ -1078,10 +1429,11 @@ mv_compatible_walker(Node *node, mv_compatible_context *context)
* evaluate them using multivariate stats.
*/
static bool
-clause_is_mv_compatible(Node *clause, Index relid, AttrNumber *attnum)
+clause_is_mv_compatible(Node *clause, Index relid, Bitmapset **attnums, int types)
{
mv_compatible_context context;
+ context.types = types;
context.varno = relid;
context.varattnos = NULL; /* no attnums */
@@ -1089,7 +1441,7 @@ clause_is_mv_compatible(Node *clause, Index relid, AttrNumber *attnum)
return false;
/* remember the newly collected attnums */
- *attnum = bms_singleton_member(context.varattnos);
+ *attnums = bms_add_members(*attnums, context.varattnos);
return true;
}
@@ -1394,24 +1746,39 @@ fdeps_filter_clauses(PlannerInfo *root,
foreach (lc, clauses)
{
- AttrNumber attnum;
+ Bitmapset *attnums = NULL;
Node *clause = (Node *) lfirst(lc);
- if (! clause_is_mv_compatible(clause, relid, &attnum))
+ if (! clause_is_mv_compatible(clause, relid, &attnums,
+ MV_CLAUSE_TYPE_FDEP))
/* clause incompatible with functional dependencies */
*reduced_clauses = lappend(*reduced_clauses, clause);
- else if (! bms_is_member(attnum, deps_attnums))
+ else if (bms_num_members(attnums) > 1)
+
+ /*
+ * clause referencing multiple attributes (strange, should
+ * this be handled by clause_is_mv_compatible directly)
+ */
+ *reduced_clauses = lappend(*reduced_clauses, clause);
+
+ else if (! bms_is_member(bms_singleton_member(attnums), deps_attnums))
/* clause not covered by the dependencies */
*reduced_clauses = lappend(*reduced_clauses, clause);
else
{
+ /* ok, clause compatible with existing dependencies */
+ Assert(bms_num_members(attnums) == 1);
+
*deps_clauses = lappend(*deps_clauses, clause);
- clause_attnums = bms_add_member(clause_attnums, attnum);
+ clause_attnums = bms_add_member(clause_attnums,
+ bms_singleton_member(attnums));
}
+
+ bms_free(attnums);
}
return clause_attnums;
@@ -1637,6 +2004,9 @@ has_stats(List *stats, int type)
if ((type & MV_CLAUSE_TYPE_FDEP) && stat->deps_built)
return true;
+
+ if ((type & MV_CLAUSE_TYPE_MCV) && stat->mcv_built)
+ return true;
}
return false;
@@ -1652,3 +2022,392 @@ find_stats(PlannerInfo *root, Index relid)
return root->simple_rel_array[relid]->mvstatlist;
}
+
+/*
+ * Estimate selectivity of clauses using a MCV list.
+ *
+ * If there's no MCV list for the stats, the function returns 0.0.
+ *
+ * While computing the estimate, the function checks whether all the
+ * columns were matched with an equality condition. If that's the case,
+ * we can skip processing the histogram, as there can be no rows in
+ * it with the same values - all the rows matching the condition are
+ * represented by the MCV item. This can only happen with equality
+ * on all the attributes.
+ *
+ * The algorithm works like this:
+ *
+ * 1) mark all items as 'match'
+ * 2) walk through all the clauses
+ * 3) for a particular clause, walk through all the items
+ * 4) skip items that are already 'no match'
+ * 5) check clause for items that still match
+ * 6) sum frequencies for items to get selectivity
+ *
+ * The function also returns the frequency of the least frequent item
+ * in the MCV list, which may be useful for clamping the estimate from
+ * the histogram (all items not present in the MCV list are less
+ * frequent). This however seems useful only for cases with conditions
+ * on all the attributes.
+ *
+ * TODO This only handles AND-ed clauses, but it might work for OR-ed
+ * lists too - it just needs to reverse the logic a bit. I.e. start
+ * with 'no match' for all items, and mark the items as a match
+ * as the clauses are processed (and skip items that are 'match').
+ */
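+/*
+ * For example (an illustration with made-up numbers, not actual
+ * output), take the clauses (a = 1) AND (b < 2) and a three-item
+ * MCV list
+ *
+ *     (a=1, b=1)   frequency 0.20
+ *     (a=1, b=3)   frequency 0.10
+ *     (a=2, b=1)   frequency 0.05
+ *
+ * The first clause eliminates the third item, the second clause the
+ * second one, so only the first item remains and the estimate is
+ * derived from its frequency (0.20), scaled by the total frequency
+ * covered by the MCV list.
+ */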
+static Selectivity
+clauselist_mv_selectivity_mcvlist(PlannerInfo *root, List *clauses,
+ MVStatisticInfo *mvstats, bool *fullmatch,
+ Selectivity *lowsel)
+{
+ int i;
+ Selectivity s = 0.0;
+ Selectivity u = 0.0;
+
+ MCVList mcvlist = NULL;
+ int nmatches = 0;
+
+ /* match/mismatch bitmap for each MCV item */
+ char * matches = NULL;
+
+ Assert(clauses != NIL);
+ Assert(list_length(clauses) >= 2);
+
+ /* there's no MCV list built yet */
+ if (! mvstats->mcv_built)
+ return 0.0;
+
+ mcvlist = load_mv_mcvlist(mvstats->mvoid);
+
+ Assert(mcvlist != NULL);
+ Assert(mcvlist->nitems > 0);
+
+ /* by default all the MCV items match the clauses fully */
+ matches = palloc0(sizeof(char) * mcvlist->nitems);
+ memset(matches, MVSTATS_MATCH_FULL, sizeof(char)*mcvlist->nitems);
+
+ /* number of matching MCV items */
+ nmatches = mcvlist->nitems;
+
+ nmatches = update_match_bitmap_mcvlist(root, clauses,
+ mvstats->stakeys, mcvlist,
+ nmatches, matches,
+ lowsel, fullmatch, false);
+
+ /* sum frequencies for all the matching MCV items */
+ for (i = 0; i < mcvlist->nitems; i++)
+ {
+ /* used to 'scale' for MCV lists not covering all tuples */
+ u += mcvlist->items[i]->frequency;
+
+ if (matches[i] != MVSTATS_MATCH_NONE)
+ s += mcvlist->items[i]->frequency;
+ }
+
+ pfree(matches);
+ pfree(mcvlist);
+
+ return s*u;
+}
+
+/*
+ * Evaluate clauses using the MCV list, and update the match bitmap.
+ *
+ * The bitmap may be already partially set, so this is really a way to
+ * combine results of several clause lists - either when computing
+ * conditional probability P(A|B) or a combination of AND/OR clauses.
+ *
+ * TODO This works with 'bitmap' where each bit is represented as a char,
+ * which is slightly wasteful. Instead, we could use a regular
+ * bitmap, reducing the size to ~1/8. Another thing is merging the
+ * bitmaps using & and |, which might be faster than min/max.
+ */
+static int
+update_match_bitmap_mcvlist(PlannerInfo *root, List *clauses,
+ int2vector *stakeys, MCVList mcvlist,
+ int nmatches, char * matches,
+ Selectivity *lowsel, bool *fullmatch,
+ bool is_or)
+{
+ int i;
+ ListCell * l;
+
+ Bitmapset *eqmatches = NULL; /* attributes with equality matches */
+
+ /* The bitmap may be partially built. */
+ Assert(nmatches >= 0);
+ Assert(nmatches <= mcvlist->nitems);
+ Assert(clauses != NIL);
+ Assert(list_length(clauses) >= 1);
+ Assert(mcvlist != NULL);
+ Assert(mcvlist->nitems > 0);
+
+ /* no further changes possible: no matches remain (AND), or everything matches (OR) */
+ if (((nmatches == 0) && (! is_or)) ||
+ ((nmatches == mcvlist->nitems) && is_or))
+ return nmatches;
+
+ /*
+ * find the lowest frequency in the MCV list
+ *
+ * We need to do that here, because we do various tricks in the following
+ * code - skipping items already ruled out, etc.
+ *
+ * XXX A loop is necessary because the MCV list is not sorted by frequency.
+ */
+ *lowsel = 1.0;
+ for (i = 0; i < mcvlist->nitems; i++)
+ {
+ MCVItem item = mcvlist->items[i];
+
+ if (item->frequency < *lowsel)
+ *lowsel = item->frequency;
+ }
+
+ /*
+ * Loop through the list of clauses, and for each of them evaluate
+ * all the MCV items not yet eliminated by the preceding clauses.
+ */
+ foreach (l, clauses)
+ {
+ Node * clause = (Node*)lfirst(l);
+
+ /* if it's a RestrictInfo, then extract the clause */
+ if (IsA(clause, RestrictInfo))
+ clause = (Node*)((RestrictInfo*)clause)->clause;
+
+ /* if there are no remaining matches possible, we can stop */
+ if (((nmatches == 0) && (! is_or)) ||
+ ((nmatches == mcvlist->nitems) && is_or))
+ break;
+
+ /* it's an OpExpr, a NullTest, or a nested AND/OR clause */
+ if (is_opclause(clause))
+ {
+ OpExpr *expr = (OpExpr*)clause;
+ bool varonleft = true;
+ bool ok;
+ FmgrInfo opproc;
+
+ /* get procedure computing operator selectivity */
+ RegProcedure oprrest = get_oprrest(expr->opno);
+
+ fmgr_info(get_opcode(expr->opno), &opproc);
+
+ ok = (NumRelids(clause) == 1) &&
+ (is_pseudo_constant_clause(lsecond(expr->args)) ||
+ (varonleft = false,
+ is_pseudo_constant_clause(linitial(expr->args))));
+
+ if (ok)
+ {
+
+ FmgrInfo gtproc;
+ Var * var = (varonleft) ? linitial(expr->args) : lsecond(expr->args);
+ Const * cst = (varonleft) ? lsecond(expr->args) : linitial(expr->args);
+ bool isgt = (! varonleft);
+
+ TypeCacheEntry *typecache
+ = lookup_type_cache(var->vartype, TYPECACHE_GT_OPR);
+
+ /* FIXME properly match the attribute to the dimension */
+ int idx = mv_get_index(var->varattno, stakeys);
+
+ fmgr_info(get_opcode(typecache->gt_opr), &gtproc);
+
+ /*
+ * Walk through the MCV items and evaluate the current clause. We can
+ * skip items that were already ruled out, and terminate if there are
+ * no remaining MCV items that might possibly match.
+ */
+ for (i = 0; i < mcvlist->nitems; i++)
+ {
+ bool mismatch = false;
+ MCVItem item = mcvlist->items[i];
+
+ /*
+ * If there are no more matches (AND) or no remaining unmatched
+ * items (OR), we can stop processing this clause.
+ */
+ if (((nmatches == 0) && (! is_or)) ||
+ ((nmatches == mcvlist->nitems) && is_or))
+ break;
+
+ /*
+ * For AND-lists, we can also mark NULL items as 'no match' (and
+ * then skip them). For OR-lists this is not possible.
+ */
+ if ((! is_or) && item->isnull[idx])
+ matches[i] = MVSTATS_MATCH_NONE;
+
+ /* skip MCV items that were already ruled out */
+ if ((! is_or) && (matches[i] == MVSTATS_MATCH_NONE))
+ continue;
+ else if (is_or && (matches[i] == MVSTATS_MATCH_FULL))
+ continue;
+
+ switch (oprrest)
+ {
+ case F_EQSEL:
+ /*
+ * We don't care about isgt in equality, because it does not
+ * matter whether it's (var = const) or (const = var).
+ */
+ mismatch = ! DatumGetBool(FunctionCall2Coll(&opproc,
+ DEFAULT_COLLATION_OID,
+ cst->constvalue,
+ item->values[idx]));
+
+ if (! mismatch)
+ eqmatches = bms_add_member(eqmatches, idx);
+
+ break;
+
+ case F_SCALARLTSEL: /* column < constant */
+ case F_SCALARGTSEL: /* column > constant */
+
+ /*
+ * The operator is called with the constant on the left, i.e. as
+ * (const op value), so for (var op const) clauses the result
+ * directly indicates a mismatch, while for (const op var) it
+ * has to be inverted (that's what the isgt flag tracks).
+ */
+ mismatch = DatumGetBool(FunctionCall2Coll(&opproc,
+ DEFAULT_COLLATION_OID,
+ cst->constvalue,
+ item->values[idx]));
+
+ /* invert the result if isgt=true */
+ mismatch = (isgt) ? (! mismatch) : mismatch;
+ break;
+ }
+
+ /* XXX The conditions on matches[i] are not needed, as we
+ * skip MCV items that can't become true/false, depending
+ * on the current flag. See beginning of the loop over
+ * MCV items.
+ */
+
+ if ((is_or) && (matches[i] == MVSTATS_MATCH_NONE) && (! mismatch))
+ {
+ /* OR - was MATCH_NONE, but will be MATCH_FULL */
+ matches[i] = MVSTATS_MATCH_FULL;
+ ++nmatches;
+ continue;
+ }
+ else if ((! is_or) && (matches[i] == MVSTATS_MATCH_FULL) && mismatch)
+ {
+ /* AND - was MATCH_FULL, but will be MATCH_NONE */
+ matches[i] = MVSTATS_MATCH_NONE;
+ --nmatches;
+ continue;
+ }
+
+ }
+ }
+ }
+ else if (IsA(clause, NullTest))
+ {
+ NullTest * expr = (NullTest*)clause;
+ Var * var = (Var*)(expr->arg);
+
+ /* FIXME properly match the attribute to the dimension */
+ int idx = mv_get_index(var->varattno, stakeys);
+
+ /*
+ * Walk through the MCV items and evaluate the current clause. We can
+ * skip items that were already ruled out, and terminate if there are
+ * no remaining MCV items that might possibly match.
+ */
+ for (i = 0; i < mcvlist->nitems; i++)
+ {
+ MCVItem item = mcvlist->items[i];
+
+ /* if there are no more matches, we can stop processing this clause */
+ if (nmatches == 0)
+ break;
+
+ /* skip MCV items that were already ruled out */
+ if (matches[i] == MVSTATS_MATCH_NONE)
+ continue;
+
+ /* if the clause mismatches the MCV item, set it as MATCH_NONE */
+ if (((expr->nulltesttype == IS_NULL) && (! item->isnull[idx])) ||
+ ((expr->nulltesttype == IS_NOT_NULL) && (item->isnull[idx])))
+ {
+ matches[i] = MVSTATS_MATCH_NONE;
+ --nmatches;
+ }
+ }
+ }
+ else if (or_clause(clause) || and_clause(clause))
+ {
+ /* AND/OR clause, with all clauses compatible with the selected MV stat */
+
+ int i;
+ BoolExpr *orclause = ((BoolExpr*)clause);
+ List *orclauses = orclause->args;
+
+ /* match/mismatch bitmap for each MCV item */
+ int or_nmatches = 0;
+ char * or_matches = NULL;
+
+ Assert(orclauses != NIL);
+ Assert(list_length(orclauses) >= 2);
+
+ /* number of matching MCV items */
+ or_nmatches = mcvlist->nitems;
+
+ /* by default none of the MCV items matches the clauses */
+ or_matches = palloc0(sizeof(char) * or_nmatches);
+
+ if (or_clause(clause))
+ {
+ /* OR clauses assume nothing matches, initially */
+ memset(or_matches, MVSTATS_MATCH_NONE, sizeof(char)*or_nmatches);
+ or_nmatches = 0;
+ }
+ else
+ {
+ /* AND clauses assume everything matches, initially */
+ memset(or_matches, MVSTATS_MATCH_FULL, sizeof(char)*or_nmatches);
+ }
+
+ /* build the match bitmap for the nested clauses */
+ or_nmatches = update_match_bitmap_mcvlist(root, orclauses,
+ stakeys, mcvlist,
+ or_nmatches, or_matches,
+ lowsel, fullmatch, or_clause(clause));
+
+ /* merge the bitmap into the existing one */
+ for (i = 0; i < mcvlist->nitems; i++)
+ {
+ /*
+ * To AND-merge the bitmaps, a MIN() semantics is used.
+ * For OR-merge, use MAX().
+ *
+ * FIXME this does not decrease the number of matches
+ */
+ UPDATE_RESULT(matches[i], or_matches[i], is_or);
+ }
+
+ pfree(or_matches);
+
+ }
+ else
+ {
+ elog(ERROR, "unknown clause type: %d", clause->type);
+ }
+ }
+
+ /*
+ * If all the columns were matched by equality clauses, it's a full
+ * match. In that case at most a single MCV item can match the
+ * clauses (two matching items would have to have equal values in
+ * all the attributes, i.e. they'd be duplicates).
+ */
+ *fullmatch = (bms_num_members(eqmatches) == mcvlist->ndimensions);
+
+ /* free the allocated pieces */
+ if (eqmatches)
+ pfree(eqmatches);
+
+ return nmatches;
+}
diff --git a/src/backend/optimizer/util/plancat.c b/src/backend/optimizer/util/plancat.c
index 7fb2088..8394111 100644
--- a/src/backend/optimizer/util/plancat.c
+++ b/src/backend/optimizer/util/plancat.c
@@ -412,7 +412,7 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
mvstat = (Form_pg_mv_statistic) GETSTRUCT(htup);
/* unavailable stats are not interesting for the planner */
- if (mvstat->deps_built)
+ if (mvstat->deps_built || mvstat->mcv_built)
{
info = makeNode(MVStatisticInfo);
@@ -421,9 +421,11 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
/* enabled statistics */
info->deps_enabled = mvstat->deps_enabled;
+ info->mcv_enabled = mvstat->mcv_enabled;
/* built/available statistics */
info->deps_built = mvstat->deps_built;
+ info->mcv_built = mvstat->mcv_built;
/* stakeys */
adatum = SysCacheGetAttr(MVSTATOID, htup,
diff --git a/src/backend/utils/mvstats/Makefile b/src/backend/utils/mvstats/Makefile
index 099f1ed..f9bf10c 100644
--- a/src/backend/utils/mvstats/Makefile
+++ b/src/backend/utils/mvstats/Makefile
@@ -12,6 +12,6 @@ subdir = src/backend/utils/mvstats
top_builddir = ../../../..
include $(top_builddir)/src/Makefile.global
-OBJS = common.o dependencies.o
+OBJS = common.o dependencies.o mcv.o
include $(top_srcdir)/src/backend/common.mk
diff --git a/src/backend/utils/mvstats/README.mcv b/src/backend/utils/mvstats/README.mcv
new file mode 100644
index 0000000..e93cfe4
--- /dev/null
+++ b/src/backend/utils/mvstats/README.mcv
@@ -0,0 +1,137 @@
+MCV lists
+=========
+
+Multivariate MCV (most-common values) lists are a straightforward extension of
+the regular per-column MCV lists, tracking the most frequent combinations of
+values for a group of attributes.
+
+This works particularly well for columns with a small number of distinct values,
+as the list may include all the combinations and approximate the distribution
+very accurately.
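+
+For example (illustrative numbers), for two columns with 10 distinct values
+each there are at most 100 distinct combinations, so the whole distribution
+may be represented by a list of (at most) 100 items.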
+
+For columns with a large number of distinct values (e.g. those with continuous
+domains), the list will only track the most frequent combinations. If the
+distribution is mostly uniform (all combinations about equally frequent), the
+MCV list will be empty.
+
+Estimates of some clauses (e.g. equality) based on MCV lists are more accurate
+than when using histograms.
+
+Also, MCV lists don't necessarily require sorting of the values (the fact that
+we use sorting when building them is an implementation detail), and more
+importantly the ordering is not built into the approximation (while histograms
+are built on ordering). So MCV lists work well even for attributes where the
+ordering of the data type is disconnected from the meaning of the data. For
+example we know how to sort strings, but the ordering is unlikely to mean much
+for city names (or other label-like attributes).
+
+
+Selectivity estimation
+----------------------
+
+The estimation, implemented in clauselist_mv_selectivity_mcvlist(), is quite
+simple in principle - identify the MCV items matching all the clauses, and sum
+the frequencies of those items (see the sketch at the end of this section).
+
+Currently MCV lists support estimation of the following clause types:
+
+ (a) equality clauses WHERE (a = 1) AND (b = 2)
+ (b) inequality clauses WHERE (a < 1) AND (b >= 2)
+ (c) NULL clauses WHERE (a IS NULL) AND (b IS NOT NULL)
+ (d) OR clauses WHERE (a < 1) OR (b >= 2)
+
+It's possible to add support for additional clauses, for example:
+
+ (e) multi-var clauses WHERE (a > b)
+
+and possibly others. These are tasks for the future, not yet implemented.
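+
+The following sketch (simplified - the actual implementation in
+update_match_bitmap_mcvlist() also handles OR clauses, NULL tests and early
+termination) illustrates the basic loop; clause_matches_item() is a
+hypothetical helper, standing in for the per-clause checks:
+
+    /* all items match initially */
+    for (i = 0; i < mcvlist->nitems; i++)
+        matches[i] = MVSTATS_MATCH_FULL;
+
+    /* each AND-ed clause may only eliminate items */
+    foreach (lc, clauses)
+        for (i = 0; i < mcvlist->nitems; i++)
+            if ((matches[i] != MVSTATS_MATCH_NONE) &&
+                (! clause_matches_item((Node *) lfirst(lc), mcvlist->items[i])))
+                matches[i] = MVSTATS_MATCH_NONE;
+
+    /* the estimate is the sum of frequencies of the remaining items */
+    for (i = 0; i < mcvlist->nitems; i++)
+        if (matches[i] != MVSTATS_MATCH_NONE)
+            s += mcvlist->items[i]->frequency;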
+
+
+Estimating equality clauses
+---------------------------
+
+When computing selectivity estimate for equality clauses
+
+ (a = 1) AND (b = 2)
+
+we can do this estimate pretty exactly assuming that two conditions are met:
+
+ (1) there's an equality condition on all attributes of the statistic
+
+ (2) we find a matching item in the MCV list
+
+In this case we know the MCV item represents all tuples matching the clauses,
+and the selectivity estimate is complete (i.e. we don't need to perform
+estimation using the histogram). This is what we call 'full match'.
+
+When only (1) holds, but there's no matching MCV item, we don't know whether
+there are no such rows at all, or whether they are just not frequent enough to
+make it into the list. We can however use the frequency of the least frequent
+MCV item as an upper bound for the selectivity.
+
+For a combination of equality conditions (not full-match case) we can clamp the
+selectivity by the minimum of selectivities for each condition. For example if
+we know the number of distinct values for each column, we can use 1/ndistinct
+as a per-column estimate. Or rather 1/ndistinct + selectivity derived from the
+MCV list.
+
+We should also probably use only the 'residual ndistinct' by excluding the items
+included in the MCV list (and also the residual frequency):
+
+ f = (1.0 - sum(MCV frequencies)) / (ndistinct - ndistinct(MCV list))
+
+but it's worth pointing out that the ndistinct values here are multivariate,
+i.e. computed for the combination of columns referenced by the equality clauses.
+
+Note: Only the "full match" limit is currently implemented.
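+
+As a worked example (hypothetical numbers): if the MCV list covers 80% of the
+sampled rows (sum of frequencies 0.8) and contains 50 of an estimated 150
+distinct combinations, the residual frequency works out as
+
+    f = (1.0 - 0.8) / (150 - 50) = 0.002
+
+i.e. each combination not on the list is assumed to match about 0.2% of rows.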
+
+
+Hashed MCV (not yet implemented)
+--------------------------------
+
+Regular MCV lists have to include actual values for each item, so if those items
+are large the list may be quite large. This is especially true for multivariate
+MCV lists, although the current implementation partially mitigates this by
+deduplicating the values before storing them on disk.
+
+It's possible to only store hashes (32-bit values) instead of the actual values,
+significantly reducing the space requirements. Obviously, this would only make
+the MCV lists useful for estimating equality conditions (assuming the 32-bit
+hashes make the collisions rare enough).
+
+This might also complicate matching the columns to available stats.
+
+
+TODO Consider implementing hashed MCV list, storing just 32-bit hashes instead
+ of the actual values. This type of MCV list will be useful only for
+ estimating equality clauses, and will reduce space requirements for large
+ varlena types (in such cases we usually only want equality anyway).
+
+TODO Currently there's no logic to consider building only a MCV list (and not
+     building the histogram at all), except for making this decision manually
+     in ADD STATISTICS.
+
+
+Inspecting the MCV list
+-----------------------
+
+Inspecting the regular (per-attribute) MCV lists is trivial, as it's enough
+to select the columns from pg_stats - the data is encoded as anyarrays, so we
+simply get the text representation of the arrays.
+
+With multivariate MCV lists it's not that simple due to the possible mix of
+data types. It might be possible to produce a similar array-like representation,
+but that'd unnecessarily complicate further processing and analysis of the MCV
+list. Instead, there's a SRF function returning the values, frequencies etc.
+
+ SELECT * FROM pg_mv_mcv_items(oid);
+
+It has a single input parameter:
+
+ oid - OID of the MCV list (pg_mv_statistic.staoid)
+
+and produces a table with these columns:
+
+ - item ID (0...nitems-1)
+ - values (string array)
+ - nulls only (boolean array)
+ - frequency (double precision)
diff --git a/src/backend/utils/mvstats/README.stats b/src/backend/utils/mvstats/README.stats
index a38ea7b..5c5c59a 100644
--- a/src/backend/utils/mvstats/README.stats
+++ b/src/backend/utils/mvstats/README.stats
@@ -8,9 +8,50 @@ not true, resulting in estimation errors.
Multivariate stats track different types of dependencies between the columns,
hopefully improving the estimates.
-Currently we only have one kind of multivariate statistics - soft functional
-dependencies, and we use it to improve estimates of equality clauses. See
-README.dependencies for details.
+
+Types of statistics
+-------------------
+
+Currently we have two kinds of multivariate statistics:
+
+ (a) soft functional dependencies (README.dependencies)
+
+ (b) MCV lists (README.mcv)
+
+
+Compatible clause types
+-----------------------
+
+Each type of statistics may be used to estimate some subset of clause types.
+
+ (a) functional dependencies - equality clauses (AND), possibly IS NULL
+
+ (b) MCV list - equality and inequality clauses, IS [NOT] NULL, AND/OR
+
+Currently only simple operator clauses (Var op Const) are supported, but it's
+possible to support more complex clause types, e.g. (Var op Var).
+
+
+Complex clauses
+---------------
+
+We also support estimating more complex clauses - essentially AND/OR clauses
+with (Var op Const) as leaves, as long as all the referenced attributes are
+covered by a single statistics.
+
+For example this condition
+
+ (a=1) AND ((b=2) OR ((c=3) AND (d=4)))
+
+may be estimated using statistics on (a,b,c,d). If we only have statistics on
+(b,c,d) we may estimate the second part, and estimate (a=1) using simple stats.
+
+If we only have statistics on (a,b,c), we can't apply it at all at this point,
+but it's worth pointing out that clauselist_selectivity() works recursively, and
+when handling the second part (the OR-clause) we'll be able to apply the
+statistics.
+
+Note: The multi-statistics estimation patch also makes it possible to pass some
+clauses as 'conditions' into the deeper parts of the expression tree.
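+
+For example, with statistics on (b,c,d) the condition above may be decomposed
+(assuming independence of the two parts) roughly as
+
+    P(a=1) * P((b=2) OR ((c=3) AND (d=4)))
+
+with the first factor estimated from the per-column statistics on "a" and the
+second one from the multivariate statistics.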
Selectivity estimation
@@ -23,14 +64,48 @@ When estimating selectivity, we aim to achieve several things:
(b) minimize the overhead, especially when no suitable multivariate stats
exist (so if you are not using multivariate stats, there's no overhead)
-This clauselist_selectivity() performs several inexpensive checks first, before
+Thus clauselist_selectivity() performs several inexpensive checks first, before
even attempting to do the more expensive estimation.
(1) check if there are multivariate stats on the relation
- (2) check there are at least two attributes referenced by clauses compatible
- with multivariate statistics (equality clauses for func. dependencies)
+ (2) check that there are functional dependencies on the table, and that
+ there are at least two attributes referenced by compatible clauses
+ (equality clauses for func. dependencies)
(3) perform reduction of equality clauses using func. dependencies
- (4) estimate the reduced list of clauses using regular statistics
+ (4) check that there are multivariate MCV lists on the table, and that
+ there are at least two attributes referenced by compatible clauses
+ (equalities, inequalities, etc.)
+
+ (5) find the best multivariate statistics (matching the most conditions)
+ and use it to compute the estimate
+
+ (6) estimate the remaining clauses (not estimated using multivariate stats)
+ using the regular per-column statistics
+
+Whenever we find there are no suitable stats, we skip the expensive steps.
+
+
+Further (possibly crazy) ideas
+------------------------------
+
+Currently the clauses are only estimated using a single statistics, even if
+there are multiple candidate statistics - for example assume we have statistics
+on (a,b,c) and (b,c,d), and estimate conditions
+
+ (b = 1) AND (c = 2)
+
+Then both statistics may be used, but we only use one of them. Maybe we could
+compute estimates using all the candidate stats, and somehow aggregate them
+into the final estimate, e.g. by using the average or median.
+
+Some stats may give better estimates than others, but it's very difficult to say
+in advance which stats are the best (it depends on the number of buckets, number
+of additional columns not referenced in the clauses, type of condition etc.).
+
+But of course, this may result in expensive estimation (CPU-wise).
+
+So we might add a GUC to choose between the simple (single-statistics) and the
+multi-statistics estimation, possibly as a table-level parameter (ALTER TABLE ...).
diff --git a/src/backend/utils/mvstats/common.c b/src/backend/utils/mvstats/common.c
index bd200bc..d1da714 100644
--- a/src/backend/utils/mvstats/common.c
+++ b/src/backend/utils/mvstats/common.c
@@ -16,12 +16,14 @@
#include "common.h"
+#include "utils/array.h"
+
static VacAttrStats ** lookup_var_attr_stats(int2vector *attrs,
- int natts, VacAttrStats **vacattrstats);
+ int natts,
+ VacAttrStats **vacattrstats);
static List* list_mv_stats(Oid relid);
-
/*
* Compute requested multivariate stats, using the rows sampled for the
* plain (single-column) stats.
@@ -49,6 +51,8 @@ build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
int j;
MVStatisticInfo *stat = (MVStatisticInfo *)lfirst(lc);
MVDependencies deps = NULL;
+ MCVList mcvlist = NULL;
+ int numrows_filtered = 0;
VacAttrStats **stats = NULL;
int numatts = 0;
@@ -87,8 +91,12 @@ build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
if (stat->deps_enabled)
deps = build_mv_dependencies(numrows, rows, attrs, stats);
+ /* build the MCV list */
+ if (stat->mcv_enabled)
+ mcvlist = build_mv_mcvlist(numrows, rows, attrs, stats, &numrows_filtered);
+
/* store the histogram / MCV list in the catalog */
- update_mv_stats(stat->mvoid, deps, attrs);
+ update_mv_stats(stat->mvoid, deps, mcvlist, attrs, stats);
}
}
@@ -166,6 +174,8 @@ list_mv_stats(Oid relid)
info->stakeys = buildint2vector(stats->stakeys.values, stats->stakeys.dim1);
info->deps_enabled = stats->deps_enabled;
info->deps_built = stats->deps_built;
+ info->mcv_enabled = stats->mcv_enabled;
+ info->mcv_built = stats->mcv_built;
result = lappend(result, info);
}
@@ -180,8 +190,56 @@ list_mv_stats(Oid relid)
return result;
}
+
+/*
+ * Find the attnums of MV stats using the mvoid.
+ */
+int2vector*
+find_mv_attnums(Oid mvoid, Oid *relid)
+{
+ ArrayType *arr;
+ Datum adatum;
+ bool isnull;
+ HeapTuple htup;
+ int2vector *keys;
+
+ /* Fetch the pg_mv_statistic tuple for the given OID. */
+ htup = SearchSysCache1(MVSTATOID,
+ ObjectIdGetDatum(mvoid));
+
+ /* XXX syscache contains OIDs of deleted stats (not invalidated) */
+ if (! HeapTupleIsValid(htup))
+ return NULL;
+
+ /* starelid */
+ adatum = SysCacheGetAttr(MVSTATOID, htup,
+ Anum_pg_mv_statistic_starelid, &isnull);
+ Assert(!isnull);
+
+ *relid = DatumGetObjectId(adatum);
+
+ /* stakeys */
+ adatum = SysCacheGetAttr(MVSTATOID, htup,
+ Anum_pg_mv_statistic_stakeys, &isnull);
+ Assert(!isnull);
+
+ arr = DatumGetArrayTypeP(adatum);
+
+ keys = buildint2vector((int16 *) ARR_DATA_PTR(arr),
+ ARR_DIMS(arr)[0]);
+ ReleaseSysCache(htup);
+
+ /* TODO Maybe save the list in the relcache, as in RelationGetIndexList
+ * (which served as an inspiration for this function)? */
+
+ return keys;
+}
+
+
void
-update_mv_stats(Oid mvoid, MVDependencies dependencies, int2vector *attrs)
+update_mv_stats(Oid mvoid,
+ MVDependencies dependencies, MCVList mcvlist,
+ int2vector *attrs, VacAttrStats **stats)
{
HeapTuple stup,
oldtup;
@@ -206,18 +264,29 @@ update_mv_stats(Oid mvoid, MVDependencies dependencies, int2vector *attrs)
= PointerGetDatum(serialize_mv_dependencies(dependencies));
}
+ if (mcvlist != NULL)
+ {
+ bytea * data = serialize_mv_mcvlist(mcvlist, attrs, stats);
+ nulls[Anum_pg_mv_statistic_stamcv -1] = (data == NULL);
+ values[Anum_pg_mv_statistic_stamcv - 1] = PointerGetDatum(data);
+ }
+
/* always replace the value (either by bytea or NULL) */
replaces[Anum_pg_mv_statistic_stadeps -1] = true;
+ replaces[Anum_pg_mv_statistic_stamcv -1] = true;
/* always change the availability flags */
nulls[Anum_pg_mv_statistic_deps_built -1] = false;
+ nulls[Anum_pg_mv_statistic_mcv_built -1] = false;
nulls[Anum_pg_mv_statistic_stakeys-1] = false;
/* use the new attnums, in case we removed some dropped ones */
replaces[Anum_pg_mv_statistic_deps_built-1] = true;
+ replaces[Anum_pg_mv_statistic_mcv_built -1] = true;
replaces[Anum_pg_mv_statistic_stakeys -1] = true;
values[Anum_pg_mv_statistic_deps_built-1] = BoolGetDatum(dependencies != NULL);
+ values[Anum_pg_mv_statistic_mcv_built -1] = BoolGetDatum(mcvlist != NULL);
values[Anum_pg_mv_statistic_stakeys -1] = PointerGetDatum(attrs);
/* Is there already a pg_mv_statistic tuple for this attribute? */
@@ -246,6 +315,21 @@ update_mv_stats(Oid mvoid, MVDependencies dependencies, int2vector *attrs)
heap_close(sd, RowExclusiveLock);
}
+
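+/*
+ * Translate the attribute number to an index (dimension) within the
+ * statistics, relying on the stakeys vector being sorted - e.g. for
+ * stakeys = (2, 5, 7) and varattno = 5 the result is 1 (the values
+ * here are illustrative only).
+ */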
+int
+mv_get_index(AttrNumber varattno, int2vector * stakeys)
+{
+ int i, idx = 0;
+ for (i = 0; i < stakeys->dim1; i++)
+ {
+ if (stakeys->values[i] < varattno)
+ idx += 1;
+ else
+ break;
+ }
+ return idx;
+}
+
/* multi-variate stats comparator */
/*
@@ -256,11 +340,15 @@ update_mv_stats(Oid mvoid, MVDependencies dependencies, int2vector *attrs)
int
compare_scalars_simple(const void *a, const void *b, void *arg)
{
- Datum da = *(Datum*)a;
- Datum db = *(Datum*)b;
- SortSupport ssup= (SortSupport) arg;
+ return compare_datums_simple(*(Datum*)a,
+ *(Datum*)b,
+ (SortSupport)arg);
+}
- return ApplySortComparator(da, false, db, false, ssup);
+int
+compare_datums_simple(Datum a, Datum b, SortSupport ssup)
+{
+ return ApplySortComparator(a, false, b, false, ssup);
}
/*
diff --git a/src/backend/utils/mvstats/common.h b/src/backend/utils/mvstats/common.h
index d96422d..9f1bd59 100644
--- a/src/backend/utils/mvstats/common.h
+++ b/src/backend/utils/mvstats/common.h
@@ -46,7 +46,15 @@ typedef struct
Datum value; /* a data value */
int tupno; /* position index for tuple it came from */
} ScalarItem;
-
+
+/* (de)serialization info */
+typedef struct DimensionInfo {
+ int nvalues; /* number of deduplicated values */
+ int nbytes; /* number of bytes (serialized) */
+ int typlen; /* pg_type.typlen */
+ bool typbyval; /* pg_type.typbyval */
+} DimensionInfo;
+
/* multi-sort */
typedef struct MultiSortSupportData {
int ndims; /* number of dimensions supported by the */
@@ -71,5 +79,6 @@ int multi_sort_compare_dim(int dim, const SortItem *a,
const SortItem *b, MultiSortSupport mss);
/* comparators, used when constructing multivariate stats */
+int compare_datums_simple(Datum a, Datum b, SortSupport ssup);
int compare_scalars_simple(const void *a, const void *b, void *arg);
int compare_scalars_partition(const void *a, const void *b, void *arg);
diff --git a/src/backend/utils/mvstats/mcv.c b/src/backend/utils/mvstats/mcv.c
new file mode 100644
index 0000000..551c934
--- /dev/null
+++ b/src/backend/utils/mvstats/mcv.c
@@ -0,0 +1,1094 @@
+/*-------------------------------------------------------------------------
+ *
+ * mcv.c
+ * POSTGRES multivariate MCV lists
+ *
+ *
+ * Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ * src/backend/utils/mvstats/mcv.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres.h"
+
+#include "fmgr.h"
+#include "funcapi.h"
+
+#include "utils/lsyscache.h"
+
+#include "common.h"
+
+/*
+ * Each serialized item needs to store (in this order):
+ *
+ * - indexes (ndim * sizeof(uint16))
+ * - null flags (ndim * sizeof(bool))
+ * - frequency (sizeof(double))
+ *
+ * So in total:
+ *
+ * ndim * (sizeof(uint16) + sizeof(bool)) + sizeof(double)
+ */
+#define ITEM_SIZE(ndims) \
+ (ndims * (sizeof(uint16) + sizeof(bool)) + sizeof(double))
+
+/* pointers into a flat serialized item of ITEM_SIZE(n) bytes */
+#define ITEM_INDEXES(item) ((uint16*)item)
+#define ITEM_NULLS(item,ndims) ((bool*)(ITEM_INDEXES(item) + ndims))
+#define ITEM_FREQUENCY(item,ndims) ((double*)(ITEM_NULLS(item,ndims) + ndims))
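+
+/* e.g. for 2 dimensions: ITEM_SIZE(2) = 2 * (2 + 1) + 8 = 14 bytes per item */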
+
+/*
+ * Builds MCV list from sample rows, and removes rows represented by
+ * the MCV list from the sample (the number of remaining sample rows is
+ * returned by the numrows_filtered parameter).
+ *
+ * The method is quite simple - in short it does about these steps:
+ *
+ * (1) sort the data (default collation, '<' for the data type)
+ *
+ * (2) count distinct groups, decide how many to keep
+ *
+ * (3) build the MCV list using the threshold determined in (2)
+ *
+ * (4) remove rows represented by the MCV from the sample
+ *
+ * For more details, see the comments in the code.
+ *
+ * FIXME Use max_mcv_items from ALTER TABLE ADD STATISTICS command.
+ *
+ * FIXME Single-dimensional MCV is sorted by frequency (descending). We
+ * should do that too, because when walking through the list we
+ * want to check the most frequent items first.
+ *
+ * TODO We're using Datum (8B), even for smaller data types (e.g. int4
+ * or float4). Maybe we could save some space here, but the bytea
+ * compression should handle it just fine.
+ *
+ * TODO This probably should not use the ndistinct as computed from
+ * the sample directly, but rather an estimate of the number of
+ * distinct values in the whole table, no?
+ */
+MCVList
+build_mv_mcvlist(int numrows, HeapTuple *rows, int2vector *attrs,
+ VacAttrStats **stats, int *numrows_filtered)
+{
+ int i, j;
+ int numattrs = attrs->dim1;
+ int ndistinct = 0;
+ int mcv_threshold = 0;
+ int count = 0;
+ int nitems = 0;
+
+ MCVList mcvlist = NULL;
+
+ /* Sort by multiple columns (using array of SortSupport) */
+ MultiSortSupport mss = multi_sort_init(numattrs);
+
+ /*
+ * Preallocate space for all the items as a single chunk, and point
+ * the items to the appropriate parts of the array.
+ */
+ SortItem *items = (SortItem*)palloc0(numrows * sizeof(SortItem));
+ Datum *values = (Datum*)palloc0(sizeof(Datum) * numrows * numattrs);
+ bool *isnull = (bool*)palloc0(sizeof(bool) * numrows * numattrs);
+
+ /* keep all the rows by default (as if there was no MCV list) */
+ *numrows_filtered = numrows;
+
+ for (i = 0; i < numrows; i++)
+ {
+ items[i].values = &values[i * numattrs];
+ items[i].isnull = &isnull[i * numattrs];
+ }
+
+ /* load the values/null flags from sample rows */
+ for (j = 0; j < numrows; j++)
+ for (i = 0; i < numattrs; i++)
+ items[j].values[i] = heap_getattr(rows[j], attrs->values[i],
+ stats[i]->tupDesc, &items[j].isnull[i]);
+
+ /* prepare the sort functions for all the attributes */
+ for (i = 0; i < numattrs; i++)
+ multi_sort_add_dimension(mss, i, i, stats);
+
+ /* do the sort, using the multi-sort */
+ qsort_arg((void *) items, numrows, sizeof(SortItem),
+ multi_sort_compare, mss);
+
+ /*
+ * Count the number of distinct groups - just walk through the
+ * sorted list and count the number of key changes. We use this to
+ * determine the threshold (125% of the average frequency).
+ */
+ ndistinct = 1;
+ for (i = 1; i < numrows; i++)
+ if (multi_sort_compare(&items[i], &items[i-1], mss) != 0)
+ ndistinct += 1;
+
+ /*
+ * Determine how many groups actually exceed the threshold, and then
+ * walk the array again and collect them into an array. We'll always
+ * require at least 4 rows per group.
+ *
+ * But if we can fit all the distinct values in the MCV list (i.e.
+ * if there are fewer distinct groups than MVSTAT_MCVLIST_MAX_ITEMS),
+ * we'll require only 2 rows per group.
+ *
+ * TODO For now the threshold is the same as in the single-column
+ * case (average + 25%), but maybe that's worth revisiting
+ * for the multivariate case.
+ *
+ * TODO We can do this only if we believe we got all the distinct
+ * values of the table.
+ *
+ * FIXME This should really reference mcv_max_items (from catalog)
+ * instead of the constant MVSTAT_MCVLIST_MAX_ITEMS.
+ */
+ mcv_threshold = 1.25 * numrows / ndistinct;
+ mcv_threshold = (mcv_threshold < 4) ? 4 : mcv_threshold;
+
+ if (ndistinct <= MVSTAT_MCVLIST_MAX_ITEMS)
+ mcv_threshold = 2;
+
+ /*
+ * Walk through the sorted data again, and see how many groups
+ * reach the mcv_threshold (and become an item in the MCV list).
+ */
+ count = 1;
+ for (i = 1; i <= numrows; i++)
+ {
+ /* last row or new group, so check if we exceed mcv_threshold */
+ if ((i == numrows) || (multi_sort_compare(&items[i], &items[i-1], mss) != 0))
+ {
+ /* group hits the threshold, count the group as MCV item */
+ if (count >= mcv_threshold)
+ nitems += 1;
+
+ count = 1;
+ }
+ else /* within group, so increase the number of items */
+ count += 1;
+ }
+
+ /* we know the number of MCV list items, so let's build the list */
+ if (nitems > 0)
+ {
+ /* allocate the MCV list structure, set parameters we know */
+ mcvlist = (MCVList)palloc0(sizeof(MCVListData));
+
+ mcvlist->magic = MVSTAT_MCV_MAGIC;
+ mcvlist->type = MVSTAT_MCV_TYPE_BASIC;
+ mcvlist->ndimensions = numattrs;
+ mcvlist->nitems = nitems;
+
+ /*
+ * Preallocate the Datum/isnull arrays (not as a single chunk, as
+ * we'll pass this outside this method, so it needs to be easy to
+ * pfree() the pieces - with a single chunk we wouldn't know where
+ * the arrays start).
+ *
+ * TODO Maybe the reasoning that we can't allocate a single
+ * piece because we're passing it out is bogus? Who'd
+ * free a single item of the MCV list, anyway?
+ *
+ * TODO Maybe with a proper encoding (stuffing all the values
+ * into a list-level array), this will be untrue?
+ */
+ mcvlist->items = (MCVItem*)palloc0(sizeof(MCVItem)*nitems);
+
+ for (i = 0; i < nitems; i++)
+ {
+ mcvlist->items[i] = (MCVItem)palloc0(sizeof(MCVItemData));
+ mcvlist->items[i]->values = (Datum*)palloc0(sizeof(Datum)*numattrs);
+ mcvlist->items[i]->isnull = (bool*)palloc0(sizeof(bool)*numattrs);
+ }
+
+ /*
+ * Repeat the same loop as above, but this time copy the data
+ * into the MCV list (for items exceeding the threshold).
+ *
+ * TODO Maybe we could simply remember indexes of the last item
+ * in each group (from the previous loop)?
+ */
+ count = 1;
+ nitems = 0;
+ for (i = 1; i <= numrows; i++)
+ {
+ /* last row or a new group */
+ if ((i == numrows) || (multi_sort_compare(&items[i], &items[i-1], mss) != 0))
+ {
+ /* count the MCV item if exceeding the threshold (and copy into the array) */
+ if (count >= mcv_threshold)
+ {
+ /* just pointer to the proper place in the list */
+ MCVItem item = mcvlist->items[nitems];
+
+ /* copy values from the _previous_ group (its last item) */
+ memcpy(item->values, items[(i-1)].values, sizeof(Datum) * numattrs);
+ memcpy(item->isnull, items[(i-1)].isnull, sizeof(bool) * numattrs);
+
+
+ /* and finally the group frequency */
+ item->frequency = (double)count / numrows;
+
+ /* next item */
+ nitems += 1;
+ }
+
+ count = 1;
+ }
+ else /* same group, just increase the number of items */
+ count += 1;
+ }
+
+ /* make sure the loops are consistent */
+ Assert(nitems == mcvlist->nitems);
+
+ /*
+ * Remove the rows matching the MCV list (i.e. keep only rows
+ * that are not represented by the MCV list).
+ *
+ * FIXME This implementation is rather naive, effectively O(N^2).
+ * As the MCV list grows, the check will take longer and
+ * longer. And as the number of sampled rows increases (by
+ * increasing statistics target), it will take longer and
+ * longer. One option is to sort the MCV items first and
+ * then perform a binary search.
+ *
+ * A better option would be keeping the ID of the row in
+ * the sort item, and then just walk through the items and
+ * mark rows to remove (in a bitmap of the same size).
+ * There's not space for that in SortItem at this moment,
+ * but it's trivial to add 'private' pointer, or just
+ * using another structure with extra field (starting with
+ * SortItem, so that the comparators etc. still work).
+ *
+ * Another option is to use the sorted array of items
+ * (because that's how we sorted the source data), and
+ * simply do a bsearch() into it. If we find a matching
+ * item, the row belongs to the MCV list.
+ */
+ if (nitems == ndistinct) /* all rows are covered by MCV items */
+ *numrows_filtered = 0;
+ else /* (nitems < ndistinct) && (nitems > 0) */
+ {
+ int nfiltered = 0;
+ HeapTuple *rows_filtered = (HeapTuple*)palloc0(sizeof(HeapTuple) * numrows);
+
+ /* used for the searches */
+ SortItem item, mcvitem;
+
+ item.values = (Datum*)palloc0(numattrs * sizeof(Datum));
+ item.isnull = (bool*)palloc0(numattrs * sizeof(bool));
+
+ /*
+ * FIXME we don't need to allocate this, we can reference
+ * the MCV item directly ...
+ */
+ mcvitem.values = (Datum*)palloc0(numattrs * sizeof(Datum));
+ mcvitem.isnull = (bool*)palloc0(numattrs * sizeof(bool));
+
+ /* walk through the tuples, compare the values to MCV items */
+ for (i = 0; i < numrows; i++)
+ {
+ bool match = false;
+
+ /* collect the key values from the row */
+ for (j = 0; j < numattrs; j++)
+ item.values[j] = heap_getattr(rows[i], attrs->values[j],
+ stats[j]->tupDesc, &item.isnull[j]);
+
+ /* scan through the MCV list for matches */
+ for (j = 0; j < mcvlist->nitems; j++)
+ {
+ /*
+ * TODO Create a SortItem/MCVItem comparator so that
+ * we don't need to do memcpy() like crazy.
+ */
+ memcpy(mcvitem.values, mcvlist->items[j]->values,
+ numattrs * sizeof(Datum));
+ memcpy(mcvitem.isnull, mcvlist->items[j]->isnull,
+ numattrs * sizeof(bool));
+
+ if (multi_sort_compare(&item, &mcvitem, mss) == 0)
+ {
+ match = true;
+ break;
+ }
+ }
+
+ /* if no match in the MCV list, copy the row into the filtered ones */
+ if (! match)
+ memcpy(&rows_filtered[nfiltered++], &rows[i], sizeof(HeapTuple));
+ }
+
+ /* replace the rows and remember how many rows we kept */
+ memcpy(rows, rows_filtered, sizeof(HeapTuple) * nfiltered);
+ *numrows_filtered = nfiltered;
+
+ /* free all the data used here */
+ pfree(rows_filtered);
+ pfree(item.values);
+ pfree(item.isnull);
+ pfree(mcvitem.values);
+ pfree(mcvitem.isnull);
+ }
+ }
+
+ pfree(values);
+ pfree(items);
+ pfree(isnull);
+
+ return mcvlist;
+}
+
+
+/* fetch the MCV list (as a bytea) from the pg_mv_statistic catalog */
+MCVList
+load_mv_mcvlist(Oid mvoid)
+{
+ bool isnull = false;
+ Datum mcvlist;
+
+#ifdef USE_ASSERT_CHECKING
+ Form_pg_mv_statistic mvstat;
+#endif
+
+ /* Fetch the pg_mv_statistic tuple for the given OID. */
+ HeapTuple htup = SearchSysCache1(MVSTATOID, ObjectIdGetDatum(mvoid));
+
+ if (! HeapTupleIsValid(htup))
+ return NULL;
+
+#ifdef USE_ASSERT_CHECKING
+ mvstat = (Form_pg_mv_statistic) GETSTRUCT(htup);
+ Assert(mvstat->mcv_enabled && mvstat->mcv_built);
+#endif
+
+ mcvlist = SysCacheGetAttr(MVSTATOID, htup,
+ Anum_pg_mv_statistic_stamcv, &isnull);
+
+ Assert(!isnull);
+
+ ReleaseSysCache(htup);
+
+ return deserialize_mv_mcvlist(DatumGetByteaP(mcvlist));
+}
+
+/*
+ * Print some basic info about the MCV list.
+ *
+ * TODO Add info about what part of the table this covers.
+ */
+Datum
+pg_mv_stats_mcvlist_info(PG_FUNCTION_ARGS)
+{
+ bytea *data = PG_GETARG_BYTEA_P(0);
+ char *result;
+
+ MCVList mcvlist = deserialize_mv_mcvlist(data);
+
+ result = palloc0(128);
+ snprintf(result, 128, "nitems=%d", mcvlist->nitems);
+
+ pfree(mcvlist);
+
+ PG_RETURN_TEXT_P(cstring_to_text(result));
+}
+
+/* used to pass context into bsearch() */
+static SortSupport ssup_private = NULL;
+
+static int bsearch_comparator(const void * a, const void * b);
+
+/*
+ * Serialize MCV list into a bytea value. The basic algorithm is simple:
+ *
+ * (1) perform deduplication for each attribute (separately)
+ * (a) collect all (non-NULL) attribute values from all MCV items
+ * (b) sort the data (using 'lt' from VacAttrStats)
+ * (c) remove duplicate values from the array
+ *
+ * (2) serialize the arrays into a bytea value
+ *
+ * (3) process all MCV list items
+ * (a) replace values with indexes into the arrays
+ *
+ * Each attribute has to be processed separately, because we're mixing
+ * different datatypes, and we don't know what equality means for them.
+ * We're also mixing pass-by-value and pass-by-ref types, and so on.
+ *
+ * We'll use uint16 values for the indexes in step (3), as we don't
+ * allow more than 8k MCV items (see max_mcv_items). We might
+ * increase this to 65k and still fit into uint16.
+ *
+ * We don't really expect the high compression as with histograms,
+ * because we're not doing any bucket splits etc. (which is the source
+ * of high redundancy there), but we need to do it anyway as we need
+ * to serialize varlena values etc. We might invent another way to
+ * serialize MCV lists, but let's keep it consistent.
+ *
+ * FIXME This probably leaks memory, or at least uses it inefficiently
+ * (many small palloc() calls instead of a large one).
+ *
+ * TODO Consider packing boolean flags (NULL) for each item into 'char'
+ * or a longer type (instead of using an array of bool items).
+ */
+bytea *
+serialize_mv_mcvlist(MCVList mcvlist, int2vector *attrs,
+ VacAttrStats **stats)
+{
+ int i, j;
+ int ndims = mcvlist->ndimensions;
+ int itemsize = ITEM_SIZE(ndims);
+
+ Size total_length = 0;
+
+ char *item = palloc0(itemsize);
+
+ /* serialized items (indexes into arrays, etc.) */
+ bytea *output;
+ char *data = NULL;
+
+ /* values per dimension (and number of non-NULL values) */
+ Datum **values = (Datum**)palloc0(sizeof(Datum*) * ndims);
+ int *counts = (int*)palloc0(sizeof(int) * ndims);
+
+ /* info about dimensions (for deserialize) */
+ DimensionInfo * info
+ = (DimensionInfo *)palloc0(sizeof(DimensionInfo)*ndims);
+
+ /* sort support data */
+ SortSupport ssup = (SortSupport)palloc0(sizeof(SortSupportData)*ndims);
+
+ /* collect and deduplicate values for each dimension */
+ for (i = 0; i < ndims; i++)
+ {
+ int count;
+ StdAnalyzeData *tmp = (StdAnalyzeData *)stats[i]->extra_data;
+
+ /* keep important info about the data type */
+ info[i].typlen = stats[i]->attrtype->typlen;
+ info[i].typbyval = stats[i]->attrtype->typbyval;
+
+ /* allocate space for all values, including NULLs (won't use them) */
+ values[i] = (Datum*)palloc0(sizeof(Datum) * mcvlist->nitems);
+
+ for (j = 0; j < mcvlist->nitems; j++)
+ {
+ if (! mcvlist->items[j]->isnull[i]) /* skip NULL values */
+ {
+ values[i][counts[i]] = mcvlist->items[j]->values[i];
+ counts[i] += 1;
+ }
+ }
+
+ /* there are just NULL values in this dimension */
+ if (counts[i] == 0)
+ continue;
+
+ /* sort and deduplicate */
+ ssup[i].ssup_cxt = CurrentMemoryContext;
+ ssup[i].ssup_collation = DEFAULT_COLLATION_OID;
+ ssup[i].ssup_nulls_first = false;
+
+ PrepareSortSupportFromOrderingOp(tmp->ltopr, &ssup[i]);
+
+ qsort_arg(values[i], counts[i], sizeof(Datum),
+ compare_scalars_simple, &ssup[i]);
+
+ /*
+ * Walk through the array and eliminate duplicate values, but
+ * keep the ordering (so that we can do bsearch later). We know
+ * there's at least 1 item, so we can skip the first element.
+ */
+ count = 1; /* number of deduplicated items */
+ for (j = 1; j < counts[i]; j++)
+ {
+ /* if it's different from the previous value, we need to keep it */
+ if (compare_datums_simple(values[i][j-1], values[i][j], &ssup[i]) != 0)
+ {
+ /* XXX: not needed if (count == j) */
+ values[i][count] = values[i][j];
+ count += 1;
+ }
+ }
+
+ /* do not exceed UINT16_MAX */
+ Assert(count <= UINT16_MAX);
+
+ /* keep info about the deduplicated count */
+ info[i].nvalues = count;
+
+ /* compute size of the serialized data */
+ if (info[i].typbyval || (info[i].typlen > 0))
+ /* passed by value, or by reference with a fixed length */
+ info[i].nbytes = info[i].nvalues * info[i].typlen;
+ else if (info[i].typlen == -1)
+ /* varlena, so just use VARSIZE_ANY */
+ for (j = 0; j < info[i].nvalues; j++)
+ info[i].nbytes += VARSIZE_ANY(values[i][j]);
+ else if (info[i].typlen == -2)
+ /* cstring, so strlen plus 1B for the \0 terminator
+ * (that's how many bytes we copy when serializing) */
+ for (j = 0; j < info[i].nvalues; j++)
+ info[i].nbytes += strlen(DatumGetPointer(values[i][j])) + 1;
+ else
+ elog(ERROR, "unknown data type typbyval=%d typlen=%d",
+ info[i].typbyval, info[i].typlen);
+ }
+
+ /*
+ * Now we finally know how much space we'll need for the serialized
+ * MCV list, as it contains these fields:
+ *
+ * - length (4B) for varlena
+ * - magic (4B)
+ * - type (4B)
+ * - ndimensions (4B)
+ * - nitems (4B)
+ * - info (ndim * sizeof(DimensionInfo))
+ * - arrays of values for each dimension
+ * - serialized items (nitems * itemsize)
+ *
+ * So the 'header' size is 20B + ndim * sizeof(DimensionInfo) and
+ * then we'll place the data.
+ */
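+
+ /*
+ * For example (illustrative numbers only): with 2 dimensions and
+ * 100 items this is 20B of header, 2 * sizeof(DimensionInfo) for
+ * the dimension infos, 100 * ITEM_SIZE(2) = 1400B for the items,
+ * plus the sizes of the two deduplicated value arrays.
+ */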
+ total_length = (sizeof(int32) + offsetof(MCVListData, items)
+ + ndims * sizeof(DimensionInfo)
+ + mcvlist->nitems * itemsize);
+
+ for (i = 0; i < ndims; i++)
+ total_length += info[i].nbytes;
+
+ /* enforce arbitrary limit of 1MB */
+ if (total_length > 1024 * 1024)
+ elog(ERROR, "serialized MCV exceeds 1MB (%ld)", total_length);
+
+ /* allocate space for the serialized MCV list, set header fields */
+ output = (bytea*)palloc0(total_length);
+ SET_VARSIZE(output, total_length);
+
+ /* we'll use 'data' to keep track of the place to write into */
+ data = VARDATA(output);
+
+ memcpy(data, mcvlist, offsetof(MCVListData, items));
+ data += offsetof(MCVListData, items);
+
+ memcpy(data, info, sizeof(DimensionInfo) * ndims);
+ data += sizeof(DimensionInfo) * ndims;
+
+ /* value array for each dimension */
+ for (i = 0; i < ndims; i++)
+ {
+#ifdef USE_ASSERT_CHECKING
+ char *tmp = data;
+#endif
+ for (j = 0; j < info[i].nvalues; j++)
+ {
+ if (info[i].typbyval)
+ {
+ /* passed by value / Datum */
+ memcpy(data, &values[i][j], info[i].typlen);
+ data += info[i].typlen;
+ }
+ else if (info[i].typlen > 0)
+ {
+ /* passed by reference, but fixed length (name, tid, ...) */
+ memcpy(data, &values[i][j], info[i].typlen);
+ data += info[i].typlen;
+ }
+ else if (info[i].typlen == -1)
+ {
+ /* varlena */
+ memcpy(data, DatumGetPointer(values[i][j]),
+ VARSIZE_ANY(values[i][j]));
+ data += VARSIZE_ANY(values[i][j]);
+ }
+ else if (info[i].typlen == -2)
+ {
+ /* cstring (don't forget the \0 terminator!) */
+ memcpy(data, DatumGetPointer(values[i][j]),
+ strlen(DatumGetPointer(values[i][j])) + 1);
+ data += strlen(DatumGetPointer(values[i][j])) + 1;
+ }
+ }
+ Assert((data - tmp) == info[i].nbytes);
+ }
+
+ /* and finally, the MCV items */
+ for (i = 0; i < mcvlist->nitems; i++)
+ {
+ /* don't write beyond the allocated space */
+ Assert(data <= (char*)output + total_length - itemsize);
+
+ /* reset the values for each item */
+ memset(item, 0, itemsize);
+
+ for (j = 0; j < ndims; j++)
+ {
+ /* do the lookup only for non-NULL values */
+ if (! mcvlist->items[i]->isnull[j])
+ {
+ Datum * v = NULL;
+ ssup_private = &ssup[j];
+
+ v = (Datum*)bsearch(&mcvlist->items[i]->values[j],
+ values[j], info[j].nvalues, sizeof(Datum),
+ bsearch_comparator);
+
+ if (v == NULL)
+ elog(ERROR, "value for dim %d not found in array", j);
+
+ /* compute index within the array */
+ ITEM_INDEXES(item)[j] = (v - values[j]);
+
+ /* check the index is within expected bounds */
+ Assert(ITEM_INDEXES(item)[j] >= 0);
+ Assert(ITEM_INDEXES(item)[j] < info[j].nvalues);
+ }
+ }
+
+ /* copy NULL and frequency flags into the item */
+ memcpy(ITEM_NULLS(item, ndims),
+ mcvlist->items[i]->isnull, sizeof(bool) * ndims);
+ memcpy(ITEM_FREQUENCY(item, ndims),
+ &mcvlist->items[i]->frequency, sizeof(double));
+
+ /* copy the item into the array */
+ memcpy(data, item, itemsize);
+
+ data += itemsize;
+ }
+
+ /* at this point we expect to match the total_length exactly */
+ Assert((data - (char*)output) == total_length);
+
+ return output;
+}
+
+/*
+ * Inverse to serialize_mv_mcvlist() - see the comment there.
+ *
+ * We'll do full deserialization, because we don't really expect high
+ * duplication of values so the caching may not be as efficient as with
+ * histograms.
+ */
+MCVList
+deserialize_mv_mcvlist(bytea *data)
+{
+ int i, j;
+ Size expected_size;
+ MCVList mcvlist;
+ char *tmp;
+
+ int ndims, nitems, itemsize;
+ DimensionInfo *info = NULL;
+
+ uint16 *indexes = NULL;
+ Datum **values = NULL;
+
+ /* local allocation buffer (used only for deserialization) */
+ int bufflen;
+ char *buff;
+ char *ptr;
+
+ /* buffer used for the result */
+ int rbufflen;
+ char *rbuff;
+ char *rptr;
+
+ if (data == NULL)
+ return NULL;
+
+ if (VARSIZE_ANY_EXHDR(data) < offsetof(MCVListData,items))
+ elog(ERROR, "invalid MCV Size %ld (expected at least %ld)",
+ VARSIZE_ANY_EXHDR(data), offsetof(MCVListData,items));
+
+ /* read the MCV list header */
+ mcvlist = (MCVList)palloc0(sizeof(MCVListData));
+
+ /* initialize pointer to the data part (skip the varlena header) */
+ tmp = VARDATA(data);
+
+ /* get the header and perform basic sanity checks */
+ memcpy(mcvlist, tmp, offsetof(MCVListData,items));
+ tmp += offsetof(MCVListData,items);
+
+ if (mcvlist->magic != MVSTAT_MCV_MAGIC)
+ elog(ERROR, "invalid MCV magic %d (expected %dd)",
+ mcvlist->magic, MVSTAT_MCV_MAGIC);
+
+ if (mcvlist->type != MVSTAT_MCV_TYPE_BASIC)
+ elog(ERROR, "invalid MCV type %d (expected %dd)",
+ mcvlist->type, MVSTAT_MCV_TYPE_BASIC);
+
+ nitems = mcvlist->nitems;
+ ndims = mcvlist->ndimensions;
+ itemsize = ITEM_SIZE(ndims);
+
+ Assert(nitems > 0);
+ Assert((ndims >= 2) && (ndims <= MVSTATS_MAX_DIMENSIONS));
+
+ /*
+ * Compute the size we expect with these parameters. It's incomplete
+ * at this point, as we have yet to add the sizes of the value arrays
+ * (from the DimensionInfo records).
+ */
+ expected_size = offsetof(MCVListData,items) +
+ ndims * sizeof(DimensionInfo) +
+ (nitems * itemsize);
+
+ /* check that we have at least the DimensionInfo records */
+ if (VARSIZE_ANY_EXHDR(data) < expected_size)
+ elog(ERROR, "invalid MCV Size %ld (expected %ld)",
+ VARSIZE_ANY_EXHDR(data), expected_size);
+
+ info = (DimensionInfo*)(tmp);
+ tmp += ndims * sizeof(DimensionInfo);
+
+ /* account for the value arrays */
+ for (i = 0; i < ndims; i++)
+ expected_size += info[i].nbytes;
+
+ if (VARSIZE_ANY_EXHDR(data) != expected_size)
+ elog(ERROR, "invalid MCV Size %ld (expected %ld)",
+ VARSIZE_ANY_EXHDR(data), expected_size);
+
+ /* looks OK - not corrupted or something */
+
+ /*
+ * We'll allocate one large chunk of memory for the intermediate
+ * data, needed only for deserializing the MCV list, and we'll use
+ * a local dense allocation to minimize the palloc overhead.
+ *
+ * Let's see how much space we'll actually need, and also include
+ * space for the array with pointers.
+ */
+ bufflen = sizeof(Datum*) * ndims; /* space for pointers */
+
+ for (i = 0; i < ndims; i++)
+ /* for full-size byval types, we reuse the serialized value */
+ if (! (info[i].typbyval && info[i].typlen == sizeof(Datum)))
+ bufflen += (sizeof(Datum) * info[i].nvalues);
+
+ buff = palloc0(bufflen);
+ ptr = buff;
+
+ values = (Datum**)buff;
+ ptr += (sizeof(Datum*) * ndims);
+
+ /*
+ * FIXME This uses pointers to the original data array (the types
+ * not passed by value), so when someone frees the memory,
+ * e.g. by doing something like this:
+ *
+ * bytea * data = ... fetch the data from catalog ...
+ * MCVList mcvlist = deserialize_mcv_list(data);
+ * pfree(data);
+ *
+ * then 'mcvlist' references the freed memory. This needs to
+ * copy the pieces.
+ */
+ for (i = 0; i < ndims; i++)
+ {
+ if (info[i].typbyval)
+ {
+ /* passed by value / Datum - simply reuse the array */
+ if (info[i].typlen == sizeof(Datum))
+ {
+ values[i] = (Datum*)tmp;
+ tmp += info[i].nbytes;
+ }
+ else
+ {
+ values[i] = (Datum*)ptr;
+ ptr += (sizeof(Datum) * info[i].nvalues);
+
+ for (j = 0; j < info[i].nvalues; j++)
+ {
+ /* copy the value (shorter than Datum) from the array */
+ memcpy(&values[i][j], tmp, info[i].typlen);
+ tmp += info[i].typlen;
+ }
+ }
+ }
+ else
+ {
+ /* all the varlena data need a chunk from the buffer */
+ values[i] = (Datum*)ptr;
+ ptr += (sizeof(Datum) * info[i].nvalues);
+
+ /* passed by reference, but fixed length (name, tid, ...) */
+ if (info[i].typlen > 0)
+ {
+ for (j = 0; j < info[i].nvalues; j++)
+ {
+ /* just point into the array */
+ values[i][j] = PointerGetDatum(tmp);
+ tmp += info[i].typlen;
+ }
+ }
+ else if (info[i].typlen == -1)
+ {
+ /* varlena */
+ for (j = 0; j < info[i].nvalues; j++)
+ {
+ /* just point into the array */
+ values[i][j] = PointerGetDatum(tmp);
+ tmp += VARSIZE_ANY(tmp);
+ }
+ }
+ else if (info[i].typlen == -2)
+ {
+ /* cstring */
+ for (j = 0; j < info[i].nvalues; j++)
+ {
+ /* just point into the array */
+ values[i][j] = PointerGetDatum(tmp);
+ tmp += (strlen(tmp) + 1); /* don't forget the \0 */
+ }
+ }
+ }
+ }
+
+ /* we should exhaust the buffer exactly */
+ Assert((ptr - buff) == bufflen);
+
+ /* allocate space for the MCV items in a single piece */
+ rbufflen = (sizeof(MCVItem) + sizeof(MCVItemData) +
+ sizeof(Datum)*ndims + sizeof(bool)*ndims) * nitems;
+
+ rbuff = palloc(rbufflen);
+ rptr = rbuff;
+
+ mcvlist->items = (MCVItem*)rbuff;
+ rptr += (sizeof(MCVItem) * nitems);
+
+ for (i = 0; i < nitems; i++)
+ {
+ MCVItem item = (MCVItem)rptr;
+ rptr += (sizeof(MCVItemData));
+
+ item->values = (Datum*)rptr;
+ rptr += (sizeof(Datum)*ndims);
+
+ item->isnull = (bool*)rptr;
+ rptr += (sizeof(bool) *ndims);
+
+ /* just point to the right place */
+ indexes = ITEM_INDEXES(tmp);
+
+ memcpy(item->isnull, ITEM_NULLS(tmp, ndims), sizeof(bool) * ndims);
+ memcpy(&item->frequency, ITEM_FREQUENCY(tmp, ndims), sizeof(double));
+
+#ifdef USE_ASSERT_CHECKING
+ for (j = 0; j < ndims; j++)
+ Assert(indexes[j] <= UINT16_MAX);
+#endif
+
+ /* translate the values */
+ for (j = 0; j < ndims; j++)
+ if (! item->isnull[j])
+ item->values[j] = values[j][indexes[j]];
+
+ mcvlist->items[i] = item;
+
+ tmp += ITEM_SIZE(ndims);
+
+ Assert(tmp <= (char*)data + VARSIZE_ANY(data));
+ }
+
+ /* check that we processed all the data */
+ Assert(tmp == (char*)data + VARSIZE_ANY(data));
+
+ /* release the temporary buffer */
+ pfree(buff);
+
+ return mcvlist;
+}
+
+/*
+ * We need to pass the SortSupport to the comparator, but bsearch()
+ * has no 'context' parameter, so we use a global variable (ugly).
+ */
+static int
+bsearch_comparator(const void * a, const void * b)
+{
+ Assert(ssup_private != NULL);
+ return compare_scalars_simple(a, b, (void*)ssup_private);
+}
+
+/*
+ * SRF with details about items of an MCV list:
+ *
+ * - item ID (0...nitems)
+ * - values (string array)
+ * - null flags (boolean array)
+ * - frequency (double precision)
+ *
+ * The input is the OID of the statistics, and there are no rows
+ * returned if the statistics contains no MCV list.
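+ *
+ * For example (an illustrative query; the OID comes from pg_mv_statistic,
+ * and 's4' is one of the statistics names used in the regression tests):
+ *
+ *     SELECT * FROM pg_mv_mcv_items(
+ *         (SELECT oid FROM pg_mv_statistic WHERE staname = 's4'));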
+ */
+PG_FUNCTION_INFO_V1(pg_mv_mcv_items);
+
+Datum
+pg_mv_mcv_items(PG_FUNCTION_ARGS)
+{
+ FuncCallContext *funcctx;
+ int call_cntr;
+ int max_calls;
+ TupleDesc tupdesc;
+ AttInMetadata *attinmeta;
+
+ /* stuff done only on the first call of the function */
+ if (SRF_IS_FIRSTCALL())
+ {
+ MemoryContext oldcontext;
+ MCVList mcvlist;
+
+ /* create a function context for cross-call persistence */
+ funcctx = SRF_FIRSTCALL_INIT();
+
+ /* switch to memory context appropriate for multiple function calls */
+ oldcontext = MemoryContextSwitchTo(funcctx->multi_call_memory_ctx);
+
+ mcvlist = load_mv_mcvlist(PG_GETARG_OID(0));
+
+ funcctx->user_fctx = mcvlist;
+
+ /* total number of tuples to be returned */
+ funcctx->max_calls = 0;
+ if (funcctx->user_fctx != NULL)
+ funcctx->max_calls = mcvlist->nitems;
+
+ /* Build a tuple descriptor for our result type */
+ if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("function returning record called in context "
+ "that cannot accept type record")));
+
+ /*
+ * generate attribute metadata needed later to produce tuples
+ * from raw C strings
+ */
+ attinmeta = TupleDescGetAttInMetadata(tupdesc);
+ funcctx->attinmeta = attinmeta;
+
+ MemoryContextSwitchTo(oldcontext);
+ }
+
+ /* stuff done on every call of the function */
+ funcctx = SRF_PERCALL_SETUP();
+
+ call_cntr = funcctx->call_cntr;
+ max_calls = funcctx->max_calls;
+ attinmeta = funcctx->attinmeta;
+
+ if (call_cntr < max_calls) /* do when there is more left to send */
+ {
+ char **values;
+ HeapTuple tuple;
+ Datum result;
+ int2vector *stakeys;
+ Oid relid;
+
+ char *buff = palloc0(1024);
+ char *format;
+
+ int i;
+
+ Oid *outfuncs;
+ FmgrInfo *fmgrinfo;
+
+ MCVList mcvlist;
+ MCVItem item;
+
+ mcvlist = (MCVList)funcctx->user_fctx;
+
+ Assert(call_cntr < mcvlist->nitems);
+
+ item = mcvlist->items[call_cntr];
+
+ stakeys = find_mv_attnums(PG_GETARG_OID(0), &relid);
+
+ /*
+ * Prepare a values array for building the returned tuple.
+ * This should be an array of C strings which will
+ * be processed later by the type input functions.
+ */
+ values = (char **) palloc(4 * sizeof(char *));
+
+ values[0] = (char *) palloc(64 * sizeof(char));
+
+ /* arrays */
+ values[1] = (char *) palloc0(1024 * sizeof(char));
+ values[2] = (char *) palloc0(1024 * sizeof(char));
+
+ /* frequency */
+ values[3] = (char *) palloc(64 * sizeof(char));
+
+ outfuncs = (Oid*)palloc0(sizeof(Oid) * mcvlist->ndimensions);
+ fmgrinfo = (FmgrInfo*)palloc0(sizeof(FmgrInfo) * mcvlist->ndimensions);
+
+ for (i = 0; i < mcvlist->ndimensions; i++)
+ {
+ bool isvarlena;
+
+ getTypeOutputInfo(get_atttype(relid, stakeys->values[i]),
+ &outfuncs[i], &isvarlena);
+
+ fmgr_info(outfuncs[i], &fmgrinfo[i]);
+ }
+
+ snprintf(values[0], 64, "%d", call_cntr); /* item ID */
+
+ for (i = 0; i < mcvlist->ndimensions; i++)
+ {
+ Datum val, valout;
+
+ format = "%s, %s";
+ if (i == 0)
+ format = "{%s%s";
+ else if (i == mcvlist->ndimensions-1)
+ format = "%s, %s}";
+
+ val = item->values[i];
+ valout = FunctionCall1(&fmgrinfo[i], val);
+
+ snprintf(buff, 1024, format, values[1], DatumGetPointer(valout));
+ strncpy(values[1], buff, 1023);
+ buff[0] = '\0';
+
+ snprintf(buff, 1024, format, values[2], item->isnull[i] ? "t" : "f");
+ strncpy(values[2], buff, 1023);
+ buff[0] = '\0';
+ }
+
+ snprintf(values[3], 64, "%f", item->frequency); /* frequency */
+
+ /* build a tuple */
+ tuple = BuildTupleFromCStrings(attinmeta, values);
+
+ /* make the tuple into a datum */
+ result = HeapTupleGetDatum(tuple);
+
+ /* clean up (this is not really necessary) */
+ pfree(values[0]);
+ pfree(values[1]);
+ pfree(values[2]);
+ pfree(values[3]);
+
+ pfree(values);
+
+ SRF_RETURN_NEXT(funcctx, result);
+ }
+ else /* do when there is no more left */
+ {
+ SRF_RETURN_DONE(funcctx);
+ }
+}
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 8ce9c0e..2c22d31 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -2109,8 +2109,9 @@ describeOneTableDetails(const char *schemaname,
{
printfPQExpBuffer(&buf,
"SELECT oid, stanamespace::regnamespace AS nsp, staname, stakeys,\n"
- " deps_enabled,\n"
- " deps_built,\n"
+ " deps_enabled, mcv_enabled,\n"
+ " deps_built, mcv_built,\n"
+ " mcv_max_items,\n"
" (SELECT string_agg(attname::text,', ')\n"
" FROM ((SELECT unnest(stakeys) AS attnum) s\n"
" JOIN pg_attribute a ON (starelid = a.attrelid and a.attnum = s.attnum))) AS attnums\n"
@@ -2128,6 +2129,8 @@ describeOneTableDetails(const char *schemaname,
printTableAddFooter(&cont, _("Statistics:"));
for (i = 0; i < tuples; i++)
{
+ bool first = true;
+
printfPQExpBuffer(&buf, " ");
/* statistics name (qualified with namespace) */
@@ -2137,10 +2140,22 @@ describeOneTableDetails(const char *schemaname,
/* options */
if (!strcmp(PQgetvalue(result, i, 4), "t"))
- appendPQExpBuffer(&buf, "(dependencies)");
+ {
+ appendPQExpBuffer(&buf, "(dependencies");
+ first = false;
+ }
+
+ if (!strcmp(PQgetvalue(result, i, 5), "t"))
+ {
+ if (! first)
+ appendPQExpBuffer(&buf, ", mcv");
+ else
+ appendPQExpBuffer(&buf, "(mcv");
+ first = false;
+ }
- appendPQExpBuffer(&buf, " ON (%s)",
- PQgetvalue(result, i, 6));
+ appendPQExpBuffer(&buf, ") ON (%s)",
+ PQgetvalue(result, i, 9));
printTableAddFooter(&cont, buf.data);
}
diff --git a/src/include/catalog/pg_mv_statistic.h b/src/include/catalog/pg_mv_statistic.h
index c74af47..3529b03 100644
--- a/src/include/catalog/pg_mv_statistic.h
+++ b/src/include/catalog/pg_mv_statistic.h
@@ -38,15 +38,21 @@ CATALOG(pg_mv_statistic,3381)
/* statistics requested to build */
bool deps_enabled; /* analyze dependencies? */
+ bool mcv_enabled; /* build MCV list? */
+
+ /* MCV size */
+ int32 mcv_max_items; /* max MCV items */
/* statistics that are available (if requested) */
bool deps_built; /* dependencies were built */
+ bool mcv_built; /* MCV list was built */
/* variable-length fields start here, but we allow direct access to stakeys */
int2vector stakeys; /* array of column keys */
#ifdef CATALOG_VARLEN
bytea stadeps; /* dependencies (serialized) */
+ bytea stamcv; /* MCV list (serialized) */
#endif
} FormData_pg_mv_statistic;
@@ -62,14 +68,18 @@ typedef FormData_pg_mv_statistic *Form_pg_mv_statistic;
* compiler constants for pg_mv_statistic
* ----------------
*/
-#define Natts_pg_mv_statistic 8
+#define Natts_pg_mv_statistic 12
#define Anum_pg_mv_statistic_starelid 1
#define Anum_pg_mv_statistic_staname 2
#define Anum_pg_mv_statistic_stanamespace 3
#define Anum_pg_mv_statistic_staowner 4
#define Anum_pg_mv_statistic_deps_enabled 5
-#define Anum_pg_mv_statistic_deps_built 6
-#define Anum_pg_mv_statistic_stakeys 7
-#define Anum_pg_mv_statistic_stadeps 8
+#define Anum_pg_mv_statistic_mcv_enabled 6
+#define Anum_pg_mv_statistic_mcv_max_items 7
+#define Anum_pg_mv_statistic_deps_built 8
+#define Anum_pg_mv_statistic_mcv_built 9
+#define Anum_pg_mv_statistic_stakeys 10
+#define Anum_pg_mv_statistic_stadeps 11
+#define Anum_pg_mv_statistic_stamcv 12
#endif /* PG_MV_STATISTIC_H */
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
index eecce40..b16eebc 100644
--- a/src/include/catalog/pg_proc.h
+++ b/src/include/catalog/pg_proc.h
@@ -2670,6 +2670,10 @@ DATA(insert OID = 3998 ( pg_mv_stats_dependencies_info PGNSP PGUID 12 1 0 0
DESCR("multivariate stats: functional dependencies info");
DATA(insert OID = 3999 ( pg_mv_stats_dependencies_show PGNSP PGUID 12 1 0 0 0 f f f f t f i s 1 0 25 "17" _null_ _null_ _null_ _null_ _null_ pg_mv_stats_dependencies_show _null_ _null_ _null_ ));
DESCR("multivariate stats: functional dependencies show");
+DATA(insert OID = 3376 ( pg_mv_stats_mcvlist_info PGNSP PGUID 12 1 0 0 0 f f f f t f i s 1 0 25 "17" _null_ _null_ _null_ _null_ _null_ pg_mv_stats_mcvlist_info _null_ _null_ _null_ ));
+DESCR("multi-variate statistics: MCV list info");
+DATA(insert OID = 3373 ( pg_mv_mcv_items PGNSP PGUID 12 1 1000 0 0 f f f f t t i s 1 0 2249 "26" "{26,23,1009,1000,701}" "{i,o,o,o,o}" "{oid,index,values,nulls,frequency}" _null_ _null_ pg_mv_mcv_items _null_ _null_ _null_ ));
+DESCR("details about MCV list items");
DATA(insert OID = 1928 ( pg_stat_get_numscans PGNSP PGUID 12 1 0 0 0 f f f f t f s r 1 0 20 "26" _null_ _null_ _null_ _null_ _null_ pg_stat_get_numscans _null_ _null_ _null_ ));
DESCR("statistics: number of scans done for table/index");
diff --git a/src/include/nodes/relation.h b/src/include/nodes/relation.h
index e10dcf1..2bcd582 100644
--- a/src/include/nodes/relation.h
+++ b/src/include/nodes/relation.h
@@ -653,9 +653,11 @@ typedef struct MVStatisticInfo
/* enabled statistics */
bool deps_enabled; /* functional dependencies enabled */
+ bool mcv_enabled; /* MCV list enabled */
/* built/available statistics */
bool deps_built; /* functional dependencies built */
+ bool mcv_built; /* MCV list built */
/* columns in the statistics (attnums) */
int2vector *stakeys; /* attnums of the columns covered */
diff --git a/src/include/utils/mvstats.h b/src/include/utils/mvstats.h
index cc43a79..4535db7 100644
--- a/src/include/utils/mvstats.h
+++ b/src/include/utils/mvstats.h
@@ -51,30 +51,89 @@ typedef MVDependenciesData* MVDependencies;
#define MVSTAT_DEPS_TYPE_BASIC 1 /* basic dependencies type */
/*
+ * Multivariate MCV (most-common value) lists
+ *
+ * A straight-forward extension of MCV items - i.e. a list (array) of
+ * combinations of attribute values, together with a frequency and
+ * null flags.
+ */
+typedef struct MCVItemData {
+ double frequency; /* frequency of this combination */
+ bool *isnull; /* flags of NULL values (up to 32 columns) */
+ Datum *values; /* variable-length (ndimensions) */
+} MCVItemData;
+
+typedef MCVItemData *MCVItem;
+
+/* multivariate MCV list - essentially an array of MCV items */
+typedef struct MCVListData {
+ uint32 magic; /* magic constant marker */
+ uint32 type; /* type of MCV list (BASIC) */
+ uint32 ndimensions; /* number of dimensions */
+ uint32 nitems; /* number of MCV items in the array */
+ MCVItem *items; /* array of MCV items */
+} MCVListData;
+
+typedef MCVListData *MCVList;
+
+/* used to flag stats serialized to bytea */
+#define MVSTAT_MCV_MAGIC 0xE1A651C2 /* marks serialized bytea */
+#define MVSTAT_MCV_TYPE_BASIC 1 /* basic MCV list type */
+
+/*
+ * Limits used for mcv_max_items option, i.e. we're always guaranteed
+ * to have space for at least MVSTAT_MCVLIST_MIN_ITEMS, and we cannot
+ * have more than MVSTAT_MCVLIST_MAX_ITEMS items.
+ *
+ * This is just a boundary for the 'max' threshold - the actual list
+ * may of course contain fewer items than MVSTAT_MCVLIST_MIN_ITEMS.
+ */
+#define MVSTAT_MCVLIST_MIN_ITEMS 128 /* min items in MCV list */
+#define MVSTAT_MCVLIST_MAX_ITEMS 8192 /* max items in MCV list */
+
+/*
* TODO Maybe fetching the histogram/MCV list separately is inefficient?
* Consider adding a single `fetch_stats` method, fetching all
* stats specified using flags (or something like that).
*/
MVDependencies load_mv_dependencies(Oid mvoid);
+MCVList load_mv_mcvlist(Oid mvoid);
bytea * serialize_mv_dependencies(MVDependencies dependencies);
+bytea * serialize_mv_mcvlist(MCVList mcvlist, int2vector *attrs,
+ VacAttrStats **stats);
/* deserialization of stats (serialization is private to analyze) */
MVDependencies deserialize_mv_dependencies(bytea * data);
+MCVList deserialize_mv_mcvlist(bytea * data);
+
+/*
+ * Returns index of the attribute number within the vector (i.e. a
+ * dimension within the stats).
+ */
+int mv_get_index(AttrNumber varattno, int2vector * stakeys);
+
+int2vector* find_mv_attnums(Oid mvoid, Oid *relid);
/* FIXME this probably belongs somewhere else (not to operations stats) */
extern Datum pg_mv_stats_dependencies_info(PG_FUNCTION_ARGS);
extern Datum pg_mv_stats_dependencies_show(PG_FUNCTION_ARGS);
+extern Datum pg_mv_stats_mcvlist_info(PG_FUNCTION_ARGS);
+extern Datum pg_mv_mcv_items(PG_FUNCTION_ARGS);
MVDependencies
-build_mv_dependencies(int numrows, HeapTuple *rows,
- int2vector *attrs,
- VacAttrStats **stats);
+build_mv_dependencies(int numrows, HeapTuple *rows, int2vector *attrs,
+ VacAttrStats **stats);
+
+MCVList
+build_mv_mcvlist(int numrows, HeapTuple *rows, int2vector *attrs,
+ VacAttrStats **stats, int *numrows_filtered);
void build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
- int natts, VacAttrStats **vacattrstats);
+ int natts, VacAttrStats **vacattrstats);
-void update_mv_stats(Oid relid, MVDependencies dependencies, int2vector *attrs);
+void update_mv_stats(Oid relid, MVDependencies dependencies, MCVList mcvlist,
+ int2vector *attrs, VacAttrStats **stats);
#endif
diff --git a/src/test/regress/expected/mv_mcv.out b/src/test/regress/expected/mv_mcv.out
new file mode 100644
index 0000000..075320b
--- /dev/null
+++ b/src/test/regress/expected/mv_mcv.out
@@ -0,0 +1,207 @@
+-- data type passed by value
+CREATE TABLE mcv_list (
+ a INT,
+ b INT,
+ c INT
+);
+-- unknown column
+CREATE STATISTICS s4 ON mcv_list (unknown_column) WITH (mcv);
+ERROR: column "unknown_column" referenced in statistics does not exist
+-- single column
+CREATE STATISTICS s4 ON mcv_list (a) WITH (mcv);
+ERROR: multivariate stats require 2 or more columns
+-- single column, duplicated
+CREATE STATISTICS s4 ON mcv_list (a, a) WITH (mcv);
+ERROR: duplicate column name in statistics definition
+-- two columns, one duplicated
+CREATE STATISTICS s4 ON mcv_list (a, a, b) WITH (mcv);
+ERROR: duplicate column name in statistics definition
+-- unknown option
+CREATE STATISTICS s4 ON mcv_list (a, b, c) WITH (unknown_option);
+ERROR: unrecognized STATISTICS option "unknown_option"
+-- missing MCV statistics
+CREATE STATISTICS s4 ON mcv_list (a, b, c) WITH (dependencies, max_mcv_items=200);
+ERROR: option 'mcv' is required by other option(s)
+-- invalid mcv_max_items value / too low
+CREATE STATISTICS s4 ON mcv_list (a, b, c) WITH (mcv, max_mcv_items=10);
+ERROR: max number of MCV items must be at least 128
+-- invalid mcv_max_items value / too high
+CREATE STATISTICS s4 ON mcv_list (a, b, c) WITH (mcv, max_mcv_items=10000);
+ERROR: max number of MCV items is 8192
+-- correct command
+CREATE STATISTICS s4 ON mcv_list (a, b, c) WITH (mcv);
+-- random data
+INSERT INTO mcv_list
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+ANALYZE mcv_list;
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+ mcv_enabled | mcv_built | pg_mv_stats_mcvlist_info
+-------------+-----------+--------------------------
+ t | f |
+(1 row)
+
+TRUNCATE mcv_list;
+-- a => b, a => c, b => c
+INSERT INTO mcv_list
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE mcv_list;
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+ mcv_enabled | mcv_built | pg_mv_stats_mcvlist_info
+-------------+-----------+--------------------------
+ t | t | nitems=1000
+(1 row)
+
+TRUNCATE mcv_list;
+-- a => b, a => c
+INSERT INTO mcv_list
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE mcv_list;
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+ mcv_enabled | mcv_built | pg_mv_stats_mcvlist_info
+-------------+-----------+--------------------------
+ t | t | nitems=1000
+(1 row)
+
+TRUNCATE mcv_list;
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO mcv_list
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX mcv_idx ON mcv_list (a, b);
+ANALYZE mcv_list;
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+ mcv_enabled | mcv_built | pg_mv_stats_mcvlist_info
+-------------+-----------+--------------------------
+ t | t | nitems=100
+(1 row)
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mcv_list WHERE a = 10 AND b = 5;
+ QUERY PLAN
+--------------------------------------------
+ Bitmap Heap Scan on mcv_list
+ Recheck Cond: ((a = 10) AND (b = 5))
+ -> Bitmap Index Scan on mcv_idx
+ Index Cond: ((a = 10) AND (b = 5))
+(4 rows)
+
+DROP TABLE mcv_list;
+-- varlena type (text)
+CREATE TABLE mcv_list (
+ a TEXT,
+ b TEXT,
+ c TEXT
+);
+CREATE STATISTICS s5 ON mcv_list (a, b, c) WITH (mcv);
+-- random data
+INSERT INTO mcv_list
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+ANALYZE mcv_list;
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+ mcv_enabled | mcv_built | pg_mv_stats_mcvlist_info
+-------------+-----------+--------------------------
+ t | f |
+(1 row)
+
+TRUNCATE mcv_list;
+-- a => b, a => c, b => c
+INSERT INTO mcv_list
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE mcv_list;
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+ mcv_enabled | mcv_built | pg_mv_stats_mcvlist_info
+-------------+-----------+--------------------------
+ t | t | nitems=1000
+(1 row)
+
+TRUNCATE mcv_list;
+-- a => b, a => c
+INSERT INTO mcv_list
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE mcv_list;
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+ mcv_enabled | mcv_built | pg_mv_stats_mcvlist_info
+-------------+-----------+--------------------------
+ t | t | nitems=1000
+(1 row)
+
+TRUNCATE mcv_list;
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO mcv_list
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX mcv_idx ON mcv_list (a, b);
+ANALYZE mcv_list;
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+ mcv_enabled | mcv_built | pg_mv_stats_mcvlist_info
+-------------+-----------+--------------------------
+ t | t | nitems=100
+(1 row)
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mcv_list WHERE a = '10' AND b = '5';
+ QUERY PLAN
+------------------------------------------------------------
+ Bitmap Heap Scan on mcv_list
+ Recheck Cond: ((a = '10'::text) AND (b = '5'::text))
+ -> Bitmap Index Scan on mcv_idx
+ Index Cond: ((a = '10'::text) AND (b = '5'::text))
+(4 rows)
+
+TRUNCATE mcv_list;
+-- check explain (expect bitmap index scan, not plain index scan) with NULLs
+INSERT INTO mcv_list
+ SELECT
+ (CASE WHEN i/10000 = 0 THEN NULL ELSE i/10000 END),
+ (CASE WHEN i/20000 = 0 THEN NULL ELSE i/20000 END),
+ (CASE WHEN i/40000 = 0 THEN NULL ELSE i/40000 END)
+ FROM generate_series(1,1000000) s(i);
+ANALYZE mcv_list;
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+ mcv_enabled | mcv_built | pg_mv_stats_mcvlist_info
+-------------+-----------+--------------------------
+ t | t | nitems=100
+(1 row)
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mcv_list WHERE a IS NULL AND b IS NULL;
+ QUERY PLAN
+---------------------------------------------------
+ Bitmap Heap Scan on mcv_list
+ Recheck Cond: ((a IS NULL) AND (b IS NULL))
+ -> Bitmap Index Scan on mcv_idx
+ Index Cond: ((a IS NULL) AND (b IS NULL))
+(4 rows)
+
+DROP TABLE mcv_list;
+-- NULL values (mix of int and text columns)
+CREATE TABLE mcv_list (
+ a INT,
+ b TEXT,
+ c INT,
+ d TEXT
+);
+CREATE STATISTICS s6 ON mcv_list (a, b, c, d) WITH (mcv);
+INSERT INTO mcv_list
+ SELECT
+ mod(i, 100),
+ (CASE WHEN mod(i, 200) = 0 THEN NULL ELSE mod(i,200) END),
+ mod(i, 400),
+ (CASE WHEN mod(i, 300) = 0 THEN NULL ELSE mod(i,600) END)
+ FROM generate_series(1,10000) s(i);
+ANALYZE mcv_list;
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+ mcv_enabled | mcv_built | pg_mv_stats_mcvlist_info
+-------------+-----------+--------------------------
+ t | t | nitems=1200
+(1 row)
+
+DROP TABLE mcv_list;
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 84b4425..66071d8 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1373,7 +1373,9 @@ pg_mv_stats| SELECT n.nspname AS schemaname,
s.staname,
s.stakeys AS attnums,
length(s.stadeps) AS depsbytes,
- pg_mv_stats_dependencies_info(s.stadeps) AS depsinfo
+ pg_mv_stats_dependencies_info(s.stadeps) AS depsinfo,
+ length(s.stamcv) AS mcvbytes,
+ pg_mv_stats_mcvlist_info(s.stamcv) AS mcvinfo
FROM ((pg_mv_statistic s
JOIN pg_class c ON ((c.oid = s.starelid)))
LEFT JOIN pg_namespace n ON ((n.oid = c.relnamespace)));
diff --git a/src/test/regress/parallel_schedule b/src/test/regress/parallel_schedule
index 4f2ffb8..85d94f1 100644
--- a/src/test/regress/parallel_schedule
+++ b/src/test/regress/parallel_schedule
@@ -112,4 +112,4 @@ test: event_trigger
test: stats
# run tests of multivariate stats
-test: mv_dependencies
+test: mv_dependencies mv_mcv
diff --git a/src/test/regress/serial_schedule b/src/test/regress/serial_schedule
index 097a04f..6584d73 100644
--- a/src/test/regress/serial_schedule
+++ b/src/test/regress/serial_schedule
@@ -163,3 +163,4 @@ test: xml
test: event_trigger
test: stats
test: mv_dependencies
+test: mv_mcv
diff --git a/src/test/regress/sql/mv_mcv.sql b/src/test/regress/sql/mv_mcv.sql
new file mode 100644
index 0000000..b31d32d
--- /dev/null
+++ b/src/test/regress/sql/mv_mcv.sql
@@ -0,0 +1,178 @@
+-- data type passed by value
+CREATE TABLE mcv_list (
+ a INT,
+ b INT,
+ c INT
+);
+
+-- unknown column
+CREATE STATISTICS s4 ON mcv_list (unknown_column) WITH (mcv);
+
+-- single column
+CREATE STATISTICS s4 ON mcv_list (a) WITH (mcv);
+
+-- single column, duplicated
+CREATE STATISTICS s4 ON mcv_list (a, a) WITH (mcv);
+
+-- two columns, one duplicated
+CREATE STATISTICS s4 ON mcv_list (a, a, b) WITH (mcv);
+
+-- unknown option
+CREATE STATISTICS s4 ON mcv_list (a, b, c) WITH (unknown_option);
+
+-- missing MCV statistics
+CREATE STATISTICS s4 ON mcv_list (a, b, c) WITH (dependencies, max_mcv_items=200);
+
+-- invalid mcv_max_items value / too low
+CREATE STATISTICS s4 ON mcv_list (a, b, c) WITH (mcv, max_mcv_items=10);
+
+-- invalid mcv_max_items value / too high
+CREATE STATISTICS s4 ON mcv_list (a, b, c) WITH (mcv, max_mcv_items=10000);
+
+-- correct command
+CREATE STATISTICS s4 ON mcv_list (a, b, c) WITH (mcv);
+
+-- random data
+INSERT INTO mcv_list
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+
+ANALYZE mcv_list;
+
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+
+TRUNCATE mcv_list;
+
+-- a => b, a => c, b => c
+INSERT INTO mcv_list
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+
+ANALYZE mcv_list;
+
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+
+TRUNCATE mcv_list;
+
+-- a => b, a => c
+INSERT INTO mcv_list
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+
+ANALYZE mcv_list;
+
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+
+TRUNCATE mcv_list;
+
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO mcv_list
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX mcv_idx ON mcv_list (a, b);
+
+ANALYZE mcv_list;
+
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mcv_list WHERE a = 10 AND b = 5;
+
+DROP TABLE mcv_list;
+
+-- varlena type (text)
+CREATE TABLE mcv_list (
+ a TEXT,
+ b TEXT,
+ c TEXT
+);
+
+CREATE STATISTICS s5 ON mcv_list (a, b, c) WITH (mcv);
+
+-- random data
+INSERT INTO mcv_list
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+
+ANALYZE mcv_list;
+
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+
+TRUNCATE mcv_list;
+
+-- a => b, a => c, b => c
+INSERT INTO mcv_list
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+
+ANALYZE mcv_list;
+
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+
+TRUNCATE mcv_list;
+
+-- a => b, a => c
+INSERT INTO mcv_list
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE mcv_list;
+
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+
+TRUNCATE mcv_list;
+
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO mcv_list
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX mcv_idx ON mcv_list (a, b);
+ANALYZE mcv_list;
+
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mcv_list WHERE a = '10' AND b = '5';
+
+TRUNCATE mcv_list;
+
+-- check explain (expect bitmap index scan, not plain index scan) with NULLs
+INSERT INTO mcv_list
+ SELECT
+ (CASE WHEN i/10000 = 0 THEN NULL ELSE i/10000 END),
+ (CASE WHEN i/20000 = 0 THEN NULL ELSE i/20000 END),
+ (CASE WHEN i/40000 = 0 THEN NULL ELSE i/40000 END)
+ FROM generate_series(1,1000000) s(i);
+ANALYZE mcv_list;
+
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mcv_list WHERE a IS NULL AND b IS NULL;
+
+DROP TABLE mcv_list;
+
+-- NULL values (mix of int and text columns)
+CREATE TABLE mcv_list (
+ a INT,
+ b TEXT,
+ c INT,
+ d TEXT
+);
+
+CREATE STATISTICS s6 ON mcv_list (a, b, c, d) WITH (mcv);
+
+INSERT INTO mcv_list
+ SELECT
+ mod(i, 100),
+ (CASE WHEN mod(i, 200) = 0 THEN NULL ELSE mod(i,200) END),
+ mod(i, 400),
+ (CASE WHEN mod(i, 300) = 0 THEN NULL ELSE mod(i,600) END)
+ FROM generate_series(1,10000) s(i);
+
+ANALYZE mcv_list;
+
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+
+DROP TABLE mcv_list;
--
2.1.0
Attachment: 0005-multivariate-histograms.patch (text/x-patch)
From fea437ee38376fda67d177276ce9812f2b0e9d81 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tv@fuzzy.cz>
Date: Sun, 11 Jan 2015 20:18:24 +0100
Subject: [PATCH 5/9] multivariate histograms
- extends the pg_mv_statistic catalog (add 'hist' fields)
- building the histograms during ANALYZE
- simple estimation while planning the queries
Includes regression tests mostly equal to those for functional
dependencies / MCV lists.
---
doc/src/sgml/ref/create_statistics.sgml | 18 +
src/backend/catalog/system_views.sql | 4 +-
src/backend/commands/statscmds.c | 44 +-
src/backend/nodes/outfuncs.c | 2 +
src/backend/optimizer/path/clausesel.c | 571 +++++++-
src/backend/optimizer/util/plancat.c | 4 +-
src/backend/utils/mvstats/Makefile | 2 +-
src/backend/utils/mvstats/README.histogram | 287 ++++
src/backend/utils/mvstats/README.stats | 2 +
src/backend/utils/mvstats/common.c | 37 +-
src/backend/utils/mvstats/histogram.c | 2032 ++++++++++++++++++++++++++++
src/bin/psql/describe.c | 17 +-
src/include/catalog/pg_mv_statistic.h | 24 +-
src/include/catalog/pg_proc.h | 4 +
src/include/nodes/relation.h | 2 +
src/include/utils/mvstats.h | 136 +-
src/test/regress/expected/mv_histogram.out | 207 +++
src/test/regress/expected/rules.out | 4 +-
src/test/regress/parallel_schedule | 2 +-
src/test/regress/serial_schedule | 1 +
src/test/regress/sql/mv_histogram.sql | 176 +++
21 files changed, 3538 insertions(+), 38 deletions(-)
create mode 100644 src/backend/utils/mvstats/README.histogram
create mode 100644 src/backend/utils/mvstats/histogram.c
create mode 100644 src/test/regress/expected/mv_histogram.out
create mode 100644 src/test/regress/sql/mv_histogram.sql
diff --git a/doc/src/sgml/ref/create_statistics.sgml b/doc/src/sgml/ref/create_statistics.sgml
index 193e4b0..fd3382e 100644
--- a/doc/src/sgml/ref/create_statistics.sgml
+++ b/doc/src/sgml/ref/create_statistics.sgml
@@ -133,6 +133,24 @@ CREATE STATISTICS [ IF NOT EXISTS ] <replaceable class="PARAMETER">statistics_na
</varlistentry>
<varlistentry>
+ <term><literal>histogram</> (<type>boolean</>)</term>
+ <listitem>
+ <para>
+ Enables histogram for the statistics.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><literal>max_buckets</> (<type>integer</>)</term>
+ <listitem>
+ <para>
+ Maximum number of histogram buckets.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
<term><literal>max_mcv_items</> (<type>integer</>)</term>
<listitem>
<para>
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 2d570ee..6afdee0 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -167,7 +167,9 @@ CREATE VIEW pg_mv_stats AS
length(S.stadeps) as depsbytes,
pg_mv_stats_dependencies_info(S.stadeps) as depsinfo,
length(S.stamcv) AS mcvbytes,
- pg_mv_stats_mcvlist_info(S.stamcv) AS mcvinfo
+ pg_mv_stats_mcvlist_info(S.stamcv) AS mcvinfo,
+ length(S.stahist) AS histbytes,
+ pg_mv_stats_histogram_info(S.stahist) AS histinfo
FROM (pg_mv_statistic S JOIN pg_class C ON (C.oid = S.starelid))
LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace);
diff --git a/src/backend/commands/statscmds.c b/src/backend/commands/statscmds.c
index b04c583..e2f3ff1 100644
--- a/src/backend/commands/statscmds.c
+++ b/src/backend/commands/statscmds.c
@@ -71,12 +71,15 @@ CreateStatistics(CreateStatsStmt *stmt)
/* by default build nothing */
bool build_dependencies = false,
- build_mcv = false;
+ build_mcv = false,
+ build_histogram = false;
- int32 max_mcv_items = -1;
+ int32 max_buckets = -1,
+ max_mcv_items = -1;
/* options required because of other options */
- bool require_mcv = false;
+ bool require_mcv = false,
+ require_histogram = false;
Assert(IsA(stmt, CreateStatsStmt));
@@ -175,6 +178,29 @@ CreateStatistics(CreateStatsStmt *stmt)
MVSTAT_MCVLIST_MAX_ITEMS)));
}
+ else if (strcmp(opt->defname, "histogram") == 0)
+ build_histogram = defGetBoolean(opt);
+ else if (strcmp(opt->defname, "max_buckets") == 0)
+ {
+ max_buckets = defGetInt32(opt);
+
+ /* this option requires 'histogram' to be enabled */
+ require_histogram = true;
+
+ /* sanity check */
+ if (max_buckets < MVSTAT_HIST_MIN_BUCKETS)
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("minimum number of buckets is %d",
+ MVSTAT_HIST_MIN_BUCKETS)));
+
+ else if (max_buckets > MVSTAT_HIST_MAX_BUCKETS)
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("maximum number of buckets is %d",
+ MVSTAT_HIST_MAX_BUCKETS)));
+
+ }
else
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
@@ -183,10 +209,10 @@ CreateStatistics(CreateStatsStmt *stmt)
}
/* check that at least some statistics were requested */
- if (! (build_dependencies || build_mcv))
+ if (! (build_dependencies || build_mcv || build_histogram))
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
- errmsg("no statistics type (dependencies, mcv) was requested")));
+ errmsg("no statistics type (dependencies, mcv, histogram) was requested")));
/* now do some checking of the options */
if (require_mcv && (! build_mcv))
@@ -194,6 +220,11 @@ CreateStatistics(CreateStatsStmt *stmt)
(errcode(ERRCODE_SYNTAX_ERROR),
errmsg("option 'mcv' is required by other options(s)")));
+ if (require_histogram && (! build_histogram))
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("option 'histogram' is required by other options(s)")));
+
/* sort the attnums and build int2vector */
qsort(attnums, numcols, sizeof(int16), compare_int16);
stakeys = buildint2vector(attnums, numcols);
@@ -214,11 +245,14 @@ CreateStatistics(CreateStatsStmt *stmt)
values[Anum_pg_mv_statistic_deps_enabled -1] = BoolGetDatum(build_dependencies);
values[Anum_pg_mv_statistic_mcv_enabled -1] = BoolGetDatum(build_mcv);
+ values[Anum_pg_mv_statistic_hist_enabled -1] = BoolGetDatum(build_histogram);
values[Anum_pg_mv_statistic_mcv_max_items -1] = Int32GetDatum(max_mcv_items);
+ values[Anum_pg_mv_statistic_hist_max_buckets -1] = Int32GetDatum(max_buckets);
nulls[Anum_pg_mv_statistic_stadeps -1] = true;
nulls[Anum_pg_mv_statistic_stamcv -1] = true;
+ nulls[Anum_pg_mv_statistic_stahist -1] = true;
/* insert the tuple into pg_mv_statistic */
mvstatrel = heap_open(MvStatisticRelationId, RowExclusiveLock);
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index 333e24b..9172f21 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -2163,10 +2163,12 @@ _outMVStatisticInfo(StringInfo str, const MVStatisticInfo *node)
/* enabled statistics */
WRITE_BOOL_FIELD(deps_enabled);
WRITE_BOOL_FIELD(mcv_enabled);
+ WRITE_BOOL_FIELD(hist_enabled);
/* built/available statistics */
WRITE_BOOL_FIELD(deps_built);
WRITE_BOOL_FIELD(mcv_built);
+ WRITE_BOOL_FIELD(hist_built);
}
static void
diff --git a/src/backend/optimizer/path/clausesel.c b/src/backend/optimizer/path/clausesel.c
index 977f88e..0de2418 100644
--- a/src/backend/optimizer/path/clausesel.c
+++ b/src/backend/optimizer/path/clausesel.c
@@ -49,6 +49,7 @@ static void addRangeClause(RangeQueryClause **rqlist, Node *clause,
#define MV_CLAUSE_TYPE_FDEP 0x01
#define MV_CLAUSE_TYPE_MCV 0x02
+#define MV_CLAUSE_TYPE_HIST 0x04
static bool clause_is_mv_compatible(Node *clause, Index relid, Bitmapset **attnums,
int type);
@@ -74,6 +75,8 @@ static Selectivity clauselist_mv_selectivity(PlannerInfo *root,
static Selectivity clauselist_mv_selectivity_mcvlist(PlannerInfo *root,
List *clauses, MVStatisticInfo *mvstats,
bool *fullmatch, Selectivity *lowsel);
+static Selectivity clauselist_mv_selectivity_histogram(PlannerInfo *root,
+ List *clauses, MVStatisticInfo *mvstats);
static int update_match_bitmap_mcvlist(PlannerInfo *root, List *clauses,
int2vector *stakeys, MCVList mcvlist,
@@ -81,6 +84,12 @@ static int update_match_bitmap_mcvlist(PlannerInfo *root, List *clauses,
Selectivity *lowsel, bool *fullmatch,
bool is_or);
+static int update_match_bitmap_histogram(PlannerInfo *root, List *clauses,
+ int2vector *stakeys,
+ MVSerializedHistogram mvhist,
+ int nmatches, char * matches,
+ bool is_or);
+
static bool has_stats(List *stats, int type);
static List * find_stats(PlannerInfo *root, Index relid);
@@ -93,6 +102,7 @@ static List * find_stats(PlannerInfo *root, Index relid);
#define UPDATE_RESULT(m,r,isor) \
(m) = (isor) ? (MAX(m,r)) : (MIN(m,r))
+
/****************************************************************************
* ROUTINES TO COMPUTE SELECTIVITIES
****************************************************************************/
@@ -121,7 +131,7 @@ static List * find_stats(PlannerInfo *root, Index relid);
*
* First we try to reduce the list of clauses by applying (soft) functional
* dependencies, and then we try to estimate the selectivity of the reduced
- * list of clauses using the multivariate MCV list.
+ * list of clauses using the multivariate MCV list and histograms.
*
* Finally we remove the portion of clauses estimated using multivariate stats,
* and process the rest of the clauses using the regular per-column stats.
@@ -214,11 +224,13 @@ clauselist_selectivity(PlannerInfo *root,
* with the multivariate code and simply skip to estimation using the
* regular per-column stats.
*/
- if (has_stats(stats, MV_CLAUSE_TYPE_MCV) &&
- (count_mv_attnums(clauses, relid, MV_CLAUSE_TYPE_MCV) >= 2))
+ if (has_stats(stats, MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST) &&
+ (count_mv_attnums(clauses, relid,
+ MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST) >= 2))
{
/* collect attributes from the compatible conditions */
- Bitmapset *mvattnums = collect_mv_attnums(clauses, relid, MV_CLAUSE_TYPE_MCV);
+ Bitmapset *mvattnums = collect_mv_attnums(clauses, relid,
+ MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST);
/* and search for the statistic covering the most attributes */
MVStatisticInfo *mvstat = choose_mv_statistics(stats, mvattnums);
@@ -230,7 +242,7 @@ clauselist_selectivity(PlannerInfo *root,
/* split the clauselist into regular and mv-clauses */
clauses = clauselist_mv_split(root, relid, clauses, &mvclauses,
- mvstat, MV_CLAUSE_TYPE_MCV);
+ mvstat, MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST);
/* we've chosen the histogram to match the clauses */
Assert(mvclauses != NIL);
@@ -942,6 +954,7 @@ static Selectivity
clauselist_mv_selectivity(PlannerInfo *root, List *clauses, MVStatisticInfo *mvstats)
{
bool fullmatch = false;
+ Selectivity s1 = 0.0, s2 = 0.0;
/*
* Lowest frequency in the MCV list (may be used as an upper bound
@@ -955,9 +968,24 @@ clauselist_mv_selectivity(PlannerInfo *root, List *clauses, MVStatisticInfo *mvs
* MCV/histogram evaluation).
*/
- /* Evaluate the MCV selectivity */
- return clauselist_mv_selectivity_mcvlist(root, clauses, mvstats,
+ /* Evaluate the MCV first. */
+ s1 = clauselist_mv_selectivity_mcvlist(root, clauses, mvstats,
&fullmatch, &mcv_low);
+
+ /*
+ * If we got a full equality match on the MCV list, we're done (and
+ * the estimate is pretty good).
+ */
+ if (fullmatch && (s1 > 0.0))
+ return s1;
+
+ /* TODO if (fullmatch) without matching MCV item, use the mcv_low
+ * selectivity as upper bound */
+
+ s2 = clauselist_mv_selectivity_histogram(root, clauses, mvstats);
+
+ /* TODO clamp to <= 1.0 (or more strictly, when possible) */
+ return s1 + s2;
}
/*
@@ -1160,7 +1188,7 @@ choose_mv_statistics(List *stats, Bitmapset *attnums)
int numattrs = attrs->dim1;
/* skip dependencies-only stats */
- if (! info->mcv_built)
+ if (! (info->mcv_built || info->hist_built))
continue;
/* count columns covered by the histogram */
@@ -1391,7 +1419,7 @@ mv_compatible_walker(Node *node, mv_compatible_context *context)
case F_SCALARGTSEL:
/* not compatible with functional dependencies */
- if (! (context->types & MV_CLAUSE_TYPE_MCV))
+ if (! (context->types & (MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST)))
return true; /* terminate */
break;
@@ -2007,6 +2035,9 @@ has_stats(List *stats, int type)
if ((type & MV_CLAUSE_TYPE_MCV) && stat->mcv_built)
return true;
+
+ if ((type & MV_CLAUSE_TYPE_HIST) && stat->hist_built)
+ return true;
}
return false;
@@ -2411,3 +2442,525 @@ update_match_bitmap_mcvlist(PlannerInfo *root, List *clauses,
return nmatches;
}
+
+/*
+ * Estimate selectivity of clauses using a histogram.
+ *
+ * If there's no histogram for the stats, the function returns 0.0.
+ *
+ * The general idea of this method is similar to how MCV lists are
+ * processed, except that this introduces the concept of a partial
+ * match (MCV only works with full match / mismatch).
+ *
+ * The algorithm works like this:
+ *
+ * 1) mark all buckets as 'full match'
+ * 2) walk through all the clauses
+ * 3) for a particular clause, walk through all the buckets
+ * 4) skip buckets that are already 'no match'
+ * 5) check clause for buckets that still match (at least partially)
+ * 6) sum frequencies for buckets to get selectivity
+ *
+ * Unlike MCV lists, histograms have a concept of a partial match. In
+ * that case we use 1/2 the bucket, to minimize the average error. The
+ * MV histograms are usually less detailed than the per-column ones,
+ * meaning the summed selectivity is often quite high (thanks to
+ * combining a lot of "partially hit" buckets).
+ *
+ * Maybe we could use per-bucket information with number of distinct
+ * values it contains (for each dimension), and then use that to correct
+ * the estimate (so with 10 distinct values, we'd use 1/10 of the bucket
+ * frequency). We might also scale the value depending on the actual
+ * ndistinct estimate (not just the values observed in the sample).
+ *
+ * Another option would be to multiply the selectivities, i.e. if we get
+ * 'partial match' for a bucket for multiple conditions, we might use
+ * 0.5^k (where k is the number of conditions), instead of 0.5. This
+ * probably does not minimize the average error, though.
+ *
+ * TODO This might use a similar shortcut to MCV lists - count buckets
+ * marked as partial/full match, and terminate once this drops to 0.
+ * Not sure if it's really worth it - for MCV lists a situation like
+ * this is not uncommon, but for histograms it's not that clear.
+ */
+static Selectivity
+clauselist_mv_selectivity_histogram(PlannerInfo *root, List *clauses,
+ MVStatisticInfo *mvstats)
+{
+ int i;
+ Selectivity s = 0.0;
+ Selectivity u = 0.0;
+
+ int nmatches = 0;
+ char *matches = NULL;
+
+ MVSerializedHistogram mvhist = NULL;
+
+ /* there's no histogram */
+ if (! mvstats->hist_built)
+ return 0.0;
+
+ /* There may be no histogram in the stats (check hist_built flag) */
+ mvhist = load_mv_histogram(mvstats->mvoid);
+
+ Assert (mvhist != NULL);
+ Assert (clauses != NIL);
+ Assert (list_length(clauses) >= 2);
+
+ /*
+ * Bitmap of bucket matches (mismatch, partial, full). By default
+ * all buckets fully match (and we gradually eliminate them).
+ */
+ matches = palloc0(sizeof(char) * mvhist->nbuckets);
+ memset(matches, MVSTATS_MATCH_FULL, sizeof(char)*mvhist->nbuckets);
+
+ nmatches = mvhist->nbuckets;
+
+ /* build the match bitmap */
+ update_match_bitmap_histogram(root, clauses,
+ mvstats->stakeys, mvhist,
+ nmatches, matches, false);
+
+ /* now, walk through the buckets and sum the selectivities */
+ for (i = 0; i < mvhist->nbuckets; i++)
+ {
+ /*
+ * Find out what part of the data is covered by the histogram,
+ * so that we can 'scale' the selectivity properly (e.g. when
+ * only 50% of the sample got into the histogram, and the rest
+ * is in a MCV list).
+ *
+ * TODO This might be handled by keeping a global "frequency"
+ * for the whole histogram, which might save us some time
+ * spent accessing the not-matching part of the histogram.
+ * Although it's likely in a cache, so it's very fast.
+ */
+ u += mvhist->buckets[i]->ntuples;
+
+ if (matches[i] == MVSTATS_MATCH_FULL)
+ s += mvhist->buckets[i]->ntuples;
+ else if (matches[i] == MVSTATS_MATCH_PARTIAL)
+ s += 0.5 * mvhist->buckets[i]->ntuples;
+ }
+
+#ifdef DEBUG_MVHIST
+ debug_histogram_matches(mvhist, matches);
+#endif
+
+ /* release the allocated bitmap and deserialized histogram */
+ pfree(matches);
+ pfree(mvhist);
+
+ return s * u;
+}
+
+/* cached result of bucket boundary comparison for a single dimension */
+
+#define HIST_CACHE_NOT_FOUND 0x00
+#define HIST_CACHE_FALSE 0x01
+#define HIST_CACHE_TRUE 0x03
+#define HIST_CACHE_MASK 0x02
+
+static char
+bucket_contains_value(FmgrInfo ltproc, Datum constvalue,
+ Datum min_value, Datum max_value,
+ int min_index, int max_index,
+ bool min_include, bool max_include,
+ char * callcache)
+{
+ bool a, b;
+
+ char min_cached = callcache[min_index];
+ char max_cached = callcache[max_index];
+
+ /*
+ * First some quick checks on equality - if any of the boundaries equals,
+ * we have a partial match (so no need to call the comparator).
+ */
+ if (((min_value == constvalue) && (min_include)) ||
+ ((max_value == constvalue) && (max_include)))
+ return MVSTATS_MATCH_PARTIAL;
+
+ /* Keep the values 0/1 because of the XOR at the end. */
+ a = ((min_cached & HIST_CACHE_MASK) >> 1);
+ b = ((max_cached & HIST_CACHE_MASK) >> 1);
+
+ /*
+ * If the result for the bucket lower bound is not in the cache,
+ * evaluate the function and store the result in the cache.
+ */
+ if (! min_cached)
+ {
+ a = DatumGetBool(FunctionCall2Coll(<proc,
+ DEFAULT_COLLATION_OID,
+ constvalue, min_value));
+ /* remember the result */
+ callcache[min_index] = (a) ? HIST_CACHE_TRUE : HIST_CACHE_FALSE;
+ }
+
+ /* And do the same for the upper bound. */
+ if (! max_cached)
+ {
+ b = DatumGetBool(FunctionCall2Coll(&ltproc,
+ DEFAULT_COLLATION_OID,
+ constvalue, max_value));
+ /* remember the result */
+ callcache[max_index] = (b) ? HIST_CACHE_TRUE : HIST_CACHE_FALSE;
+ }
+
+ return (a ^ b) ? MVSTATS_MATCH_PARTIAL : MVSTATS_MATCH_NONE;
+}
+
+static char
+bucket_is_smaller_than_value(FmgrInfo opproc, Datum constvalue,
+ Datum min_value, Datum max_value,
+ int min_index, int max_index,
+ bool min_include, bool max_include,
+ char * callcache, bool isgt)
+{
+ char min_cached = callcache[min_index];
+ char max_cached = callcache[max_index];
+
+ /* Keep the values 0/1 because of the XOR at the end. */
+ bool a = ((min_cached & HIST_CACHE_MASK) >> 1);
+ bool b = ((max_cached & HIST_CACHE_MASK) >> 1);
+
+ if (! min_cached)
+ {
+ a = DatumGetBool(FunctionCall2Coll(&opproc,
+ DEFAULT_COLLATION_OID,
+ min_value,
+ constvalue));
+ /* remember the result */
+ callcache[min_index] = (a) ? HIST_CACHE_TRUE : HIST_CACHE_FALSE;
+ }
+
+ if (! max_cached)
+ {
+ b = DatumGetBool(FunctionCall2Coll(&opproc,
+ DEFAULT_COLLATION_OID,
+ max_value,
+ constvalue));
+ /* remember the result */
+ callcache[max_index] = (b) ? HIST_CACHE_TRUE : HIST_CACHE_FALSE;
+ }
+
+ /*
+ * Now, we need to combine both results into the final answer, and we need
+ * to be careful about the 'isgt' flag, which effectively inverts the meaning.
+ *
+ * First, we handle the case when each boundary returns different results.
+ * In that case the outcome can only be 'partial' match.
+ */
+ if (a != b)
+ return MVSTATS_MATCH_PARTIAL;
+
+ /*
+ * When the results are the same, then it depends on the 'isgt' value. There
+ * are four options:
+ *
+ * isgt=false a=b=true => full match
+ * isgt=false a=b=false => empty
+ * isgt=true a=b=true => empty
+ * isgt=true a=b=false => full match
+ *
+ * We'll cheat a bit, because we know that (a=b) so we'll use just one of them.
+ */
+ if (isgt)
+ return (!a) ? MVSTATS_MATCH_FULL : MVSTATS_MATCH_NONE;
+ else
+ return ( a) ? MVSTATS_MATCH_FULL : MVSTATS_MATCH_NONE;
+}
+
+/*
+ * Evaluate clauses using the histogram, and update the match bitmap.
+ *
+ * The bitmap may be already partially set, so this is really a way to
+ * combine results of several clause lists - either when computing
+ * conditional probability P(A|B) or a combination of AND/OR clauses.
+ *
+ * Note: This is not a simple bitmap in the sense that there are more
+ * than two possible values for each item - no match, partial
+ * match and full match. So we need 2 bits per item.
+ *
+ * TODO This works with 'bitmap' where each item is represented as a
+ * char, which is slightly wasteful. Instead, we could use a bitmap
+ * with 2 bits per item, reducing the size to ~1/4. By using values
+ * 0, 1 and 3 (instead of 0, 1 and 2), the operations (merging etc.)
+ * might be performed just like for simple bitmap by using & and |,
+ * which might be faster than min/max.
+ */
+static int
+update_match_bitmap_histogram(PlannerInfo *root, List *clauses,
+ int2vector *stakeys,
+ MVSerializedHistogram mvhist,
+ int nmatches, char * matches,
+ bool is_or)
+{
+ int i;
+ ListCell * l;
+
+ /*
+ * Used for caching function calls, only once per deduplicated value.
+ *
+ * We may have up to (2 * nbuckets) boundary values per dimension.
+ * It's probably overkill, but let's allocate that once for all
+ * clauses, to minimize overhead.
+ *
+ * Also, we only need two bits per value, but this allocates a byte
+ * per value. Might be worth optimizing.
+ *
+ * 0x00 - not yet called
+ * 0x01 - called, result is 'false'
+ * 0x03 - called, result is 'true'
+ */
+ char *callcache = palloc(2 * mvhist->nbuckets);
+
+ Assert(mvhist != NULL);
+ Assert(mvhist->nbuckets > 0);
+ Assert(nmatches >= 0);
+ Assert(nmatches <= mvhist->nbuckets);
+
+ Assert(clauses != NIL);
+ Assert(list_length(clauses) >= 1);
+
+ /* loop through the clauses and do the estimation */
+ foreach (l, clauses)
+ {
+ Node * clause = (Node*)lfirst(l);
+
+ /* if it's a RestrictInfo, then extract the clause */
+ if (IsA(clause, RestrictInfo))
+ clause = (Node*)((RestrictInfo*)clause)->clause;
+
+ /* it's either OpClause, or NullTest */
+ if (is_opclause(clause))
+ {
+ OpExpr * expr = (OpExpr*)clause;
+ bool varonleft = true;
+ bool ok;
+
+ FmgrInfo opproc; /* operator */
+ fmgr_info(get_opcode(expr->opno), &opproc);
+
+ /* reset the cache (per clause) */
+ memset(callcache, 0, 2 * mvhist->nbuckets);
+
+ ok = (NumRelids(clause) == 1) &&
+ (is_pseudo_constant_clause(lsecond(expr->args)) ||
+ (varonleft = false,
+ is_pseudo_constant_clause(linitial(expr->args))));
+
+ if (ok)
+ {
+ FmgrInfo ltproc;
+ RegProcedure oprrest = get_oprrest(expr->opno);
+
+ Var * var = (varonleft) ? linitial(expr->args) : lsecond(expr->args);
+ Const * cst = (varonleft) ? lsecond(expr->args) : linitial(expr->args);
+ bool isgt = (! varonleft);
+
+ TypeCacheEntry *typecache
+ = lookup_type_cache(var->vartype, TYPECACHE_LT_OPR);
+
+ /* lookup dimension for the attribute */
+ int idx = mv_get_index(var->varattno, stakeys);
+
+ fmgr_info(get_opcode(typecache->lt_opr), &ltproc);
+
+ /*
+ * Check this for all buckets that still have "true" in the bitmap
+ *
+ * We already know the clauses use suitable operators (because that's
+ * how we filtered them).
+ */
+ for (i = 0; i < mvhist->nbuckets; i++)
+ {
+ char res = MVSTATS_MATCH_NONE;
+
+ MVSerializedBucket bucket = mvhist->buckets[i];
+
+ /* histogram boundaries */
+ Datum minval, maxval;
+ bool mininclude, maxinclude;
+ int minidx, maxidx;
+
+ /*
+ * For AND-lists, we can also mark NULL buckets as 'no match'
+ * (and then skip them). For OR-lists this is not possible.
+ */
+ if ((! is_or) && bucket->nullsonly[idx])
+ matches[i] = MVSTATS_MATCH_NONE;
+
+ /*
+ * Skip buckets that were already eliminated - this is important
+ * considering how we update the info (we only lower the match).
+ * We can't really do anything about the MATCH_PARTIAL buckets.
+ */
+ if ((! is_or) && (matches[i] == MVSTATS_MATCH_NONE))
+ continue;
+ else if (is_or && (matches[i] == MVSTATS_MATCH_FULL))
+ continue;
+
+ /* lookup the values and cache of function calls */
+ minidx = bucket->min[idx];
+ maxidx = bucket->max[idx];
+
+ minval = mvhist->values[idx][bucket->min[idx]];
+ maxval = mvhist->values[idx][bucket->max[idx]];
+
+ mininclude = bucket->min_inclusive[idx];
+ maxinclude = bucket->max_inclusive[idx];
+
+ /*
+ * TODO Maybe it's possible to add here a similar optimization
+ * as for the MCV lists:
+ *
+ * (nmatches == 0) && AND-list => all eliminated (FALSE)
+ * (nmatches == N) && OR-list => all eliminated (TRUE)
+ *
+ * But it's more complex because of the partial matches.
+ */
+
+ /*
+ * If it's not a "<" or ">" or "=" operator, just ignore the
+ * clause. Otherwise note the relid and attnum for the variable.
+ *
+ * TODO I'm really unsure the handling of 'isgt' flag (that is, clauses
+ * with reverse order of variable/constant) is correct. I wouldn't
+ * be surprised if there was some mixup. Using the lt/gt operators
+ * instead of messing with the opproc could make it simpler.
+ * It would however be using a different operator than the query,
+ * although it's not any shadier than using the selectivity function
+ * as is done currently.
+ */
+ switch (oprrest)
+ {
+ case F_SCALARLTSEL: /* Var < Const */
+ case F_SCALARGTSEL: /* Var > Const */
+
+ res = bucket_is_smaller_than_value(opproc, cst->constvalue,
+ minval, maxval,
+ minidx, maxidx,
+ mininclude, maxinclude,
+ callcache, isgt);
+ break;
+
+ case F_EQSEL:
+
+ /*
+ * We only check whether the value is within the bucket, using the
+ * lt operator, and we also check for equality with the boundaries.
+ */
+
+ res = bucket_contains_value(ltproc, cst->constvalue,
+ minval, maxval,
+ minidx, maxidx,
+ mininclude, maxinclude,
+ callcache);
+ break;
+ }
+
+ UPDATE_RESULT(matches[i], res, is_or);
+
+ }
+ }
+ }
+ else if (IsA(clause, NullTest))
+ {
+ NullTest * expr = (NullTest*)clause;
+ Var * var = (Var*)(expr->arg);
+
+ /* FIXME proper matching attribute to dimension */
+ int idx = mv_get_index(var->varattno, stakeys);
+
+ /*
+ * Walk through the buckets and evaluate the current clause. We can
+ * skip items that were already ruled out, and terminate if there are
+ * no remaining buckets that might possibly match.
+ */
+ for (i = 0; i < mvhist->nbuckets; i++)
+ {
+ MVSerializedBucket bucket = mvhist->buckets[i];
+
+ /*
+ * Skip buckets that were already eliminated - this is important
+ * considering how we update the info (we only lower the match)
+ */
+ if ((! is_or) && (matches[i] == MVSTATS_MATCH_NONE))
+ continue;
+ else if (is_or && (matches[i] == MVSTATS_MATCH_FULL))
+ continue;
+
+ /* if the clause mismatches the bucket, set it as MATCH_NONE */
+ if ((expr->nulltesttype == IS_NULL)
+ && (! bucket->nullsonly[idx]))
+ UPDATE_RESULT(matches[i], MVSTATS_MATCH_NONE, is_or);
+
+ else if ((expr->nulltesttype == IS_NOT_NULL) &&
+ (bucket->nullsonly[idx]))
+ UPDATE_RESULT(matches[i], MVSTATS_MATCH_NONE, is_or);
+ }
+ }
+ else if (or_clause(clause) || and_clause(clause))
+ {
+ /* AND/OR clause, with all clauses compatible with the selected MV stat */
+
+ int i;
+ BoolExpr *orclause = ((BoolExpr*)clause);
+ List *orclauses = orclause->args;
+
+ /* match/mismatch bitmap for each bucket */
+ int or_nmatches = 0;
+ char * or_matches = NULL;
+
+ Assert(orclauses != NIL);
+ Assert(list_length(orclauses) >= 2);
+
+ /* number of matching buckets */
+ or_nmatches = mvhist->nbuckets;
+
+ /* by default none of the buckets matches the clauses */
+ or_matches = palloc0(sizeof(char) * or_nmatches);
+
+ if (or_clause(clause))
+ {
+ /* OR clauses assume nothing matches, initially */
+ memset(or_matches, MVSTATS_MATCH_NONE, sizeof(char)*or_nmatches);
+ or_nmatches = 0;
+ }
+ else
+ {
+ /* AND clauses assume everything matches, initially */
+ memset(or_matches, MVSTATS_MATCH_FULL, sizeof(char)*or_nmatches);
+ }
+
+ /* build the match bitmap for the OR-clauses */
+ or_nmatches = update_match_bitmap_histogram(root, orclauses,
+ stakeys, mvhist,
+ or_nmatches, or_matches, or_clause(clause));
+
+ /* merge the bitmap into the existing one */
+ for (i = 0; i < mvhist->nbuckets; i++)
+ {
+ /*
+ * To AND-merge the bitmaps, a MIN() semantics is used.
+ * For OR-merge, use MAX().
+ *
+ * FIXME this does not decrease the number of matches
+ */
+ UPDATE_RESULT(matches[i], or_matches[i], is_or);
+ }
+
+ pfree(or_matches);
+
+ }
+ else
+ elog(ERROR, "unknown clause type: %d", clause->type);
+ }
+
+ /* free the call cache */
+ pfree(callcache);
+
+ return nmatches;
+}
diff --git a/src/backend/optimizer/util/plancat.c b/src/backend/optimizer/util/plancat.c
index 8394111..2519249 100644
--- a/src/backend/optimizer/util/plancat.c
+++ b/src/backend/optimizer/util/plancat.c
@@ -412,7 +412,7 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
mvstat = (Form_pg_mv_statistic) GETSTRUCT(htup);
/* unavailable stats are not interesting for the planner */
- if (mvstat->deps_built || mvstat->mcv_built)
+ if (mvstat->deps_built || mvstat->mcv_built || mvstat->hist_built)
{
info = makeNode(MVStatisticInfo);
@@ -422,10 +422,12 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
/* enabled statistics */
info->deps_enabled = mvstat->deps_enabled;
info->mcv_enabled = mvstat->mcv_enabled;
+ info->hist_enabled = mvstat->hist_enabled;
/* built/available statistics */
info->deps_built = mvstat->deps_built;
info->mcv_built = mvstat->mcv_built;
+ info->hist_built = mvstat->hist_built;
/* stakeys */
adatum = SysCacheGetAttr(MVSTATOID, htup,
diff --git a/src/backend/utils/mvstats/Makefile b/src/backend/utils/mvstats/Makefile
index f9bf10c..9dbb3b6 100644
--- a/src/backend/utils/mvstats/Makefile
+++ b/src/backend/utils/mvstats/Makefile
@@ -12,6 +12,6 @@ subdir = src/backend/utils/mvstats
top_builddir = ../../../..
include $(top_builddir)/src/Makefile.global
-OBJS = common.o dependencies.o mcv.o
+OBJS = common.o dependencies.o histogram.o mcv.o
include $(top_srcdir)/src/backend/common.mk
diff --git a/src/backend/utils/mvstats/README.histogram b/src/backend/utils/mvstats/README.histogram
new file mode 100644
index 0000000..8234d2c
--- /dev/null
+++ b/src/backend/utils/mvstats/README.histogram
@@ -0,0 +1,287 @@
+Multivariate histograms
+=======================
+
+Histograms on individual attributes consist of buckets represented by ranges,
+covering the domain of the attribute. That is, each bucket is a [min,max]
+interval, and contains all values in this range. The histogram is built in such
+a way that all buckets have about the same frequency.
+
+Multivariate histograms are an extension into n-dimensional space - the buckets
+are n-dimensional intervals (i.e. n-dimensional rectangles), covering the domain
+of the combination of attributes. That is, each bucket has a vector of lower
+and upper boundaries, denoted min[i] and max[i] (where i = 1..n).
+
+In addition to the boundaries, each bucket tracks additional info (see the
+sketch below):
+
+ * frequency (fraction of tuples in the bucket)
+ * whether the boundaries are inclusive or exclusive
+ * whether the dimension contains only NULL values
+ * number of distinct values in each dimension (for building only)
+
+It's possible that in the future we'll have multiple histogram types, with
+different features. We do however expect all the types to share the same
+representation (buckets as ranges) and only differ in how we build them.
+
+The current implementation builds non-overlapping buckets, but that may not be
+true for other histogram types, and the code should not rely on this assumption.
+There are interesting types of histograms (or build algorithms) with overlapping
+buckets.
+
+When used on low-cardinality data, histograms usually perform considerably worse
+than MCV lists (which are a good fit for this kind of data). This is especially
+true for label-like values, where the ordering of the values is mostly unrelated
+to the meaning of the data, as a meaningful ordering is crucial for histograms.
+
+On high-cardinality data the histograms are usually a better choice, because MCV
+lists can't represent the distribution accurately enough.
+
+
+Selectivity estimation
+----------------------
+
+The estimation is implemented in clauselist_mv_selectivity_histogram(), and
+works very similarly to clauselist_mv_selectivity_mcvlist().
+
+The main difference is that while MCV lists support exact matches, histograms
+often result in approximate matches - e.g. with an equality clause we can only
+say whether the constant falls into the bucket, but not whether it actually
+appears in the data or what fraction of the bucket it represents. In such cases
+we rely on defaults, just like the per-column histograms do.
+
+The current implementation uses histograms to estimate these types of clauses
+(think of WHERE conditions):
+
+ (a) equality clauses WHERE (a = 1) AND (b = 2)
+ (b) inequality clauses WHERE (a < 1) AND (b >= 2)
+ (c) NULL clauses WHERE (a IS NULL) AND (b IS NOT NULL)
+ (d) OR-clauses WHERE (a = 1) OR (b = 2)
+
+Similarly to MCV lists, it's possible to add support for additional types of
+clauses, for example:
+
+ (e) multi-var clauses WHERE (a > b)
+
+and so on. These are tasks for the future, not yet implemented.
+
+
+When evaluating a clause on a bucket, we may get one of three results:
+
+ (a) FULL_MATCH - The bucket definitely matches the clause.
+
+ (b) PARTIAL_MATCH - The bucket matches the clause, but not necessarily all
+ the tuples it represents.
+
+ (c) NO_MATCH - The bucket definitely does not match the clause.
+
+This may be illustrated using a range [1, 5], which is essentially a 1-D bucket.
+With clause
+
+ WHERE (a < 10) => FULL_MATCH (all range values are below
+ 10, so the whole bucket matches)
+
+ WHERE (a < 3) => PARTIAL_MATCH (there may be values matching
+ the clause, but we don't know how many)
+
+ WHERE (a < 0) => NO_MATCH (the whole range is above 0, so
+ no values from the bucket can match)
+
+Some clauses may produce only some of those results - for example equality
+clauses can never produce FULL_MATCH, as we always hit only part of the bucket
+(we can't match both boundaries at the same time). This results in less accurate
+estimates compared to MCV lists, where we can match an MCV item exactly (there's
+no PARTIAL match for MCV lists).
+
+There are also clauses that may never produce a PARTIAL_MATCH result. A nice
+example is the 'IS [NOT] NULL' clause, which either matches the bucket
+completely (FULL_MATCH) or not at all (NO_MATCH), thanks to how the NULL-buckets
+are constructed.
+
+Computing the total selectivity estimate is then trivial - simply sum the
+frequencies of all the FULL_MATCH and PARTIAL_MATCH buckets (but multiply the
+frequency of each PARTIAL_MATCH bucket by 0.5, to minimize the average error).
+
+
+Building a histogram
+--------------------
+
+The algorithm of building a histogram in general is quite simple:
+
+ (a) create an initial bucket (containing all sample rows)
+
+ (b) create NULL buckets (by splitting the initial bucket)
+
+ (c) repeat
+
+ (1) choose bucket to split next
+
+ (2) terminate if no bucket that might be split is found, or if we've
+ reached the maximum number of buckets (16384)
+
+ (3) choose dimension to partition the bucket by
+
+ (4) partition the bucket by the selected dimension
+
+The main complexity is hidden in steps (c.1) and (c.3), i.e. how we choose the
+bucket and dimension for the split.
+
+Similarly to one-dimensional histograms, we want to produce buckets with roughly
+the same frequency. We also need to produce "regular" buckets, because buckets
+with one "side" much longer than the others are very likely to match a lot of
+conditions (which increases error, even if the bucket frequency is very low).
+
+To achieve this, we choose the largest bucket (containing the most sample rows),
+but we only choose buckets that can actually be split (have at least 3 different
+combinations of values).
+
+Then we choose the "longest" dimension of the bucket, which is computed by using
+the distinct values in the sample as a measure.
+
+For details see functions select_bucket_to_partition() and partition_bucket().
+
+The current limit on the number of buckets (16384) is mostly arbitrary, but
+chosen so that it guarantees we don't exceed the number of distinct values
+indexable by uint16 in any of the dimensions (16384 buckets produce at most
+2 * 16384 = 32768 boundary values per dimension, which fits into uint16). In
+practice we could handle more buckets, as we index each dimension separately
+and the splits should use the dimensions evenly.
+
+Also, histograms this large (with up to 16k buckets over multiple dimensions)
+would be quite expensive to build and process, so the 16k limit is rather
+reasonable.
+
+The actual number of buckets is also related to statistics target, because we
+require MIN_BUCKET_ROWS (10) tuples per bucket before a split, so we can't have
+more than (2 * 300 * target / 10) buckets. For the default target (100) this
+evaluates to ~6k.
+
+
+NULL handling (create_null_buckets)
+-----------------------------------
+
+When building histograms on a single attribute, we first filter out NULL values.
+In the multivariate case, we can't really do that because the rows may contain
+a mix of NULL and non-NULL values in different columns (so we can't simply
+filter all of them out).
+
+For this reason, the histograms are built so that in each bucket, each dimension
+contains either only NULL or only non-NULL values. Building the NULL-buckets
+happens as the first step of the build, in the create_null_buckets() function.
+The number of NULL-buckets produced this way has an obvious upper bound of 2^N,
+where N is the number of dimensions (attributes the histogram is built on) - or
+rather 2^K, where K is the number of attributes not marked as not-NULL. For
+example, for two nullable attributes (a,b) we may get up to four buckets:
+(NULL, NULL), (NULL, non-NULL), (non-NULL, NULL) and (non-NULL, non-NULL).
+
+The buckets with NULL dimensions are then subject to the same build algorithm
+(i.e. may be split into smaller buckets) just like any other bucket, but may
+only be split by a non-NULL dimension.
+
+
+Serialization
+-------------
+
+To store the histogram in the pg_mv_statistic catalog, it is serialized into a
+more efficient form. We also use this representation during estimation, i.e. we
+don't fully deserialize the histogram.
+
+For example the boundary values are deduplicated to minimize the required space.
+How much redundancy is there, actually? Let's assume there are no NULL values,
+so we start with a single bucket - in that case we have 2*N boundaries. Each
+time we split a bucket we introduce one new value (in the "middle" of one of
+the dimensions), and keep boundaries for all the other dimensions. So after K
+splits, we have up to
+
+ 2*N + K
+
+unique boundary values (we may have fewer values, if the same value is used for
+several splits). But after K splits we have (K+1) buckets, and thus
+
+ (K+1) * 2 * N
+
+boundary values. Using e.g. N=4 and K=999, we arrive at these numbers:
+
+ 2*N + K = 1007
+ (K+1) * 2 * N = 8000
+
+which means a lot of redundancy. It's somewhat counter-intuitive that the number
+of distinct values does not really depend on the number of dimensions (except
+for the initial bucket, but that's negligible compared to the total).
+
+By deduplicating the values and replacing them with 16-bit indexes (uint16), we
+reduce the required space to
+
+ 1007 * 8 + 8000 * 2 ~= 24kB
+
+which is significantly less than 64kB required for the 'raw' histogram (assuming
+the values are 8B).
+
+While the bytea compression (pglz) might achieve the same reduction of space,
+the deduplicated representation is used to optimize the estimation by caching
+results of function calls for already visited values. This significantly
+reduces the number of calls to (often quite expensive) operators.
+
+Note: Of course, this reasoning only holds for histograms built by the algorithm
+that simply splits the buckets in half. Other histogram types (e.g. containing
+overlapping buckets) may behave differently and require different serialization.
+
+Serialized histograms are marked with a 'magic' constant, which makes it easier
+to check that a bytea value really is a serialized histogram.
+
+
+varlena compression
+-------------------
+
+This serialization may however defeat the automatic varlena compression: the
+array of unique values is placed at the beginning of the serialized form, which
+is exactly the chunk pglz inspects to decide whether the data is compressible -
+and it will probably decide it's not very compressible. This is similar to the
+issue we initially had with JSONB.
+
+Maybe storing buckets first would make it work, as the buckets may be better
+compressible.
+
+On the other hand the serialization is actually a context-aware compression,
+usually compressing to ~30% (or even less, with large data types). So the lack
+of additional pglz compression may be acceptable.
+
+
+Deserialization
+---------------
+
+The deserialization is not a perfect inverse of the serialization, as we keep
+the deduplicated arrays. This reduces the amount of memory and also allows
+optimizations during estimation (e.g. we can cache results for the distinct
+values, saving expensive function calls).
+
+
+Inspecting the histogram
+------------------------
+
+Inspecting the regular (per-attribute) histograms is trivial, as it's enough
+to select the columns from pg_stats - the data is encoded as anyarray, so we
+simply get the text representation of the array.
+
+With multivariate histograms it's not that simple due to the possible mix of
+data types in the histogram. It might be possible to produce a similar
+array-like text representation, but that'd unnecessarily complicate further
+processing and analysis of the histogram. Instead, there's an SRF that allows
+access to lower/upper boundaries, frequencies etc.
+
+ SELECT * FROM pg_mv_histogram_buckets(oid, otype);
+
+It has two input parameters:
+
+ oid - OID of the histogram (pg_mv_statistic.staoid)
+ otype - type of output
+
+and produces a table with these columns:
+
+ - bucket ID (0...nbuckets-1)
+ - lower bucket boundaries (string array)
+ - upper bucket boundaries (string array)
+ - nulls only dimensions (boolean array)
+ - lower boundary inclusive (boolean array)
+ - upper boundary inclusive (boolean array)
+ - frequency (double precision)
+
+The 'otype' accepts three values, determining what will be returned in the
+lower/upper boundary arrays:
+
+ - 0 - values stored in the histogram, encoded as text
+ - 1 - indexes into the deduplicated arrays
+ - 2 - indexes into the deduplicated arrays, scaled to [0,1]
diff --git a/src/backend/utils/mvstats/README.stats b/src/backend/utils/mvstats/README.stats
index 5c5c59a..3e4f4d1 100644
--- a/src/backend/utils/mvstats/README.stats
+++ b/src/backend/utils/mvstats/README.stats
@@ -18,6 +18,8 @@ Currently we only have two kinds of multivariate statistics
(b) MCV lists (README.mcv)
+ (c) multivariate histograms (README.histogram)
+
Compatible clause types
-----------------------
diff --git a/src/backend/utils/mvstats/common.c b/src/backend/utils/mvstats/common.c
index d1da714..ffb76f4 100644
--- a/src/backend/utils/mvstats/common.c
+++ b/src/backend/utils/mvstats/common.c
@@ -13,11 +13,11 @@
*
*-------------------------------------------------------------------------
*/
+#include "postgres.h"
+#include "utils/array.h"
#include "common.h"
-#include "utils/array.h"
-
static VacAttrStats ** lookup_var_attr_stats(int2vector *attrs,
int natts,
VacAttrStats **vacattrstats);
@@ -52,7 +52,8 @@ build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
MVStatisticInfo *stat = (MVStatisticInfo *)lfirst(lc);
MVDependencies deps = NULL;
MCVList mcvlist = NULL;
- int numrows_filtered = 0;
+ MVHistogram histogram = NULL;
+ int numrows_filtered = numrows;
VacAttrStats **stats = NULL;
int numatts = 0;
@@ -95,8 +96,12 @@ build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
if (stat->mcv_enabled)
mcvlist = build_mv_mcvlist(numrows, rows, attrs, stats, &numrows_filtered);
+ /* build a multivariate histogram on the columns */
+ if ((numrows_filtered > 0) && (stat->hist_enabled))
+ histogram = build_mv_histogram(numrows_filtered, rows, attrs, stats, numrows);
+
/* store the histogram / MCV list in the catalog */
- update_mv_stats(stat->mvoid, deps, mcvlist, attrs, stats);
+ update_mv_stats(stat->mvoid, deps, mcvlist, histogram, attrs, stats);
}
}
@@ -176,6 +181,8 @@ list_mv_stats(Oid relid)
info->deps_built = stats->deps_built;
info->mcv_enabled = stats->mcv_enabled;
info->mcv_built = stats->mcv_built;
+ info->hist_enabled = stats->hist_enabled;
+ info->hist_built = stats->hist_built;
result = lappend(result, info);
}
@@ -190,7 +197,6 @@ list_mv_stats(Oid relid)
return result;
}
-
/*
* Find attnims of MV stats using the mvoid.
*/
@@ -236,9 +242,16 @@ find_mv_attnums(Oid mvoid, Oid *relid)
}
+/*
+ * FIXME This adds statistics, but we need to drop statistics when the
+ * table is dropped. Not sure what to do when a column is dropped.
+ * Either we can (a) remove all stats on that column, (b) remove
+ * the column from defined stats and force rebuild, (c) remove the
+ * column on next ANALYZE. Or maybe something else?
+ */
void
update_mv_stats(Oid mvoid,
- MVDependencies dependencies, MCVList mcvlist,
+ MVDependencies dependencies, MCVList mcvlist, MVHistogram histogram,
int2vector *attrs, VacAttrStats **stats)
{
HeapTuple stup,
@@ -271,22 +284,34 @@ update_mv_stats(Oid mvoid,
values[Anum_pg_mv_statistic_stamcv - 1] = PointerGetDatum(data);
}
+ if (histogram != NULL)
+ {
+ bytea * data = serialize_mv_histogram(histogram, attrs, stats);
+ nulls[Anum_pg_mv_statistic_stahist-1] = (data == NULL);
+ values[Anum_pg_mv_statistic_stahist - 1]
+ = PointerGetDatum(data);
+ }
+
/* always replace the value (either by bytea or NULL) */
replaces[Anum_pg_mv_statistic_stadeps -1] = true;
replaces[Anum_pg_mv_statistic_stamcv -1] = true;
+ replaces[Anum_pg_mv_statistic_stahist-1] = true;
/* always change the availability flags */
nulls[Anum_pg_mv_statistic_deps_built -1] = false;
nulls[Anum_pg_mv_statistic_mcv_built -1] = false;
+ nulls[Anum_pg_mv_statistic_hist_built-1] = false;
nulls[Anum_pg_mv_statistic_stakeys-1] = false;
/* use the new attnums, in case we removed some dropped ones */
replaces[Anum_pg_mv_statistic_deps_built-1] = true;
replaces[Anum_pg_mv_statistic_mcv_built -1] = true;
+ replaces[Anum_pg_mv_statistic_hist_built -1] = true;
replaces[Anum_pg_mv_statistic_stakeys -1] = true;
values[Anum_pg_mv_statistic_deps_built-1] = BoolGetDatum(dependencies != NULL);
values[Anum_pg_mv_statistic_mcv_built -1] = BoolGetDatum(mcvlist != NULL);
+ values[Anum_pg_mv_statistic_hist_built -1] = BoolGetDatum(histogram != NULL);
values[Anum_pg_mv_statistic_stakeys -1] = PointerGetDatum(attrs);
/* Is there already a pg_mv_statistic tuple for this attribute? */
diff --git a/src/backend/utils/mvstats/histogram.c b/src/backend/utils/mvstats/histogram.c
new file mode 100644
index 0000000..9e5620a
--- /dev/null
+++ b/src/backend/utils/mvstats/histogram.c
@@ -0,0 +1,2032 @@
+/*-------------------------------------------------------------------------
+ *
+ * histogram.c
+ * POSTGRES multivariate histograms
+ *
+ *
+ * Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ * src/backend/utils/mvstats/histogram.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres.h"
+
+#include "fmgr.h"
+#include "funcapi.h"
+
+#include "utils/lsyscache.h"
+
+#include "common.h"
+#include <math.h>
+
+
+static MVBucket create_initial_mv_bucket(int numrows, HeapTuple *rows,
+ int2vector *attrs,
+ VacAttrStats **stats);
+
+static MVBucket select_bucket_to_partition(int nbuckets, MVBucket * buckets);
+
+static MVBucket partition_bucket(MVBucket bucket, int2vector *attrs,
+ VacAttrStats **stats,
+ int *ndistvalues, Datum **distvalues);
+
+static MVBucket copy_mv_bucket(MVBucket bucket, uint32 ndimensions);
+
+static void update_bucket_ndistinct(MVBucket bucket, int2vector *attrs,
+ VacAttrStats ** stats);
+
+static void update_dimension_ndistinct(MVBucket bucket, int dimension,
+ int2vector *attrs,
+ VacAttrStats ** stats,
+ bool update_boundaries);
+
+static void create_null_buckets(MVHistogram histogram, int bucket_idx,
+ int2vector *attrs, VacAttrStats ** stats);
+
+static int bsearch_comparator(const void * a, const void * b);
+
+/*
+ * Each serialized bucket needs to store (in this order):
+ *
+ * - number of tuples (float)
+ * - min inclusive flags (ndim * sizeof(bool))
+ * - max inclusive flags (ndim * sizeof(bool))
+ * - null dimension flags (ndim * sizeof(bool))
+ * - min boundary indexes (ndim * sizeof(uint16))
+ * - max boundary indexes (ndim * sizeof(uint16))
+ *
+ * So in total:
+ *
+ * ndim * (2 * sizeof(uint16) + 3 * sizeof(bool)) + sizeof(float)
+ */
+#define BUCKET_SIZE(ndims) \
+ ((ndims) * (2 * sizeof(uint16) + 3 * sizeof(bool)) + sizeof(float))
+
+/* pointers into a flat serialized bucket of BUCKET_SIZE(n) bytes */
+#define BUCKET_NTUPLES(b) ((float*)b)
+#define BUCKET_MIN_INCL(b,n) ((bool*)(b + sizeof(float)))
+#define BUCKET_MAX_INCL(b,n) (BUCKET_MIN_INCL(b,n) + n)
+#define BUCKET_NULLS_ONLY(b,n) (BUCKET_MAX_INCL(b,n) + n)
+#define BUCKET_MIN_INDEXES(b,n) ((uint16*)(BUCKET_NULLS_ONLY(b,n) + n))
+#define BUCKET_MAX_INDEXES(b,n) ((BUCKET_MIN_INDEXES(b,n) + n))
+
+/* can't split bucket with less than 10 rows */
+#define MIN_BUCKET_ROWS 10
+
+/*
+ * Data used while building the histogram.
+ */
+typedef struct HistogramBuildData {
+
+ float ndistinct; /* number of distinct value combinations */
+
+ HeapTuple *rows; /* array of sample rows */
+ uint32 numrows; /* number of sample rows (array size) */
+
+ /*
+ * Number of distinct values in each dimension. This is used when
+ * building the histogram (and is not serialized/deserialized).
+ */
+ uint32 *ndistincts;
+
+} HistogramBuildData;
+
+typedef HistogramBuildData *HistogramBuild;
+
+/*
+ * Building a multivariate histogram. In short, it first creates a single
+ * bucket containing all the rows, and then repeatedly splits it, each time
+ * searching for the bucket / dimension most in need of a split.
+ *
+ * The current criterion is rather simple, chosen so that the algorithm
+ * produces buckets with about equal frequency and regular size.
+ *
+ * See the discussion at select_bucket_to_partition and partition_bucket
+ * for more details about the algorithm.
+ *
+ * The current algorithm works like this:
+ *
+ * build NULL-buckets (create_null_buckets)
+ *
+ * while [not reaching maximum number of buckets]
+ *
+ * choose bucket to partition (largest bucket)
+ * if no bucket to partition
+ * terminate the algorithm
+ *
+ * choose bucket dimension to partition (largest dimension)
+ * split the bucket into two buckets
+ */
+MVHistogram
+build_mv_histogram(int numrows, HeapTuple *rows, int2vector *attrs,
+ VacAttrStats **stats, int numrows_total)
+{
+ int i;
+ int numattrs = attrs->dim1;
+
+ int *ndistvalues;
+ Datum **distvalues;
+
+ MVHistogram histogram = (MVHistogram)palloc0(sizeof(MVHistogramData));
+
+ HeapTuple * rows_copy = (HeapTuple*)palloc0(numrows * sizeof(HeapTuple));
+ memcpy(rows_copy, rows, sizeof(HeapTuple) * numrows);
+
+ Assert((numattrs >= 2) && (numattrs <= MVSTATS_MAX_DIMENSIONS));
+
+ histogram->ndimensions = numattrs;
+
+ histogram->magic = MVSTAT_HIST_MAGIC;
+ histogram->type = MVSTAT_HIST_TYPE_BASIC;
+ histogram->nbuckets = 1;
+
+ /* create max buckets (better than repalloc for short-lived objects) */
+ histogram->buckets
+ = (MVBucket*)palloc0(MVSTAT_HIST_MAX_BUCKETS * sizeof(MVBucket));
+
+ /* create the initial bucket, covering the whole sample set */
+ histogram->buckets[0]
+ = create_initial_mv_bucket(numrows, rows_copy, attrs, stats);
+
+ /*
+ * Collect info on distinct values in each dimension (used later
+ * to select dimension to partition).
+ */
+ ndistvalues = (int*)palloc0(sizeof(int) * numattrs);
+ distvalues = (Datum**)palloc0(sizeof(Datum*) * numattrs);
+
+ for (i = 0; i < numattrs; i++)
+ {
+ int j;
+ int nvals;
+ Datum *tmp;
+
+ SortSupportData ssup;
+ StdAnalyzeData *mystats = (StdAnalyzeData *) stats[i]->extra_data;
+
+ /* initialize sort support, etc. */
+ memset(&ssup, 0, sizeof(ssup));
+ ssup.ssup_cxt = CurrentMemoryContext;
+
+ /* We always use the default collation for statistics */
+ ssup.ssup_collation = DEFAULT_COLLATION_OID;
+ ssup.ssup_nulls_first = false;
+
+ PrepareSortSupportFromOrderingOp(mystats->ltopr, &ssup);
+
+ nvals = 0;
+ tmp = (Datum*)palloc0(sizeof(Datum) * numrows);
+
+ for (j = 0; j < numrows; j++)
+ {
+ bool isnull;
+
+ /* fetch the value of this attribute for the sample row */
+ Datum value = heap_getattr(rows[j], attrs->values[i],
+ stats[i]->tupDesc, &isnull);
+
+ if (isnull)
+ continue;
+
+ tmp[nvals++] = value;
+ }
+
+ /* do the sort and stuff only if there are non-NULL values */
+ if (nvals > 0)
+ {
+ /* sort the array of values */
+ qsort_arg((void *) tmp, nvals, sizeof(Datum),
+ compare_scalars_simple, (void *) &ssup);
+
+ /* count distinct values */
+ ndistvalues[i] = 1;
+ for (j = 1; j < nvals; j++)
+ if (compare_scalars_simple(&tmp[j], &tmp[j-1], &ssup) != 0)
+ ndistvalues[i] += 1;
+
+ /* allocate exactly the needed space (ndistinct was counted above) */
+ distvalues[i] = (Datum*)palloc0(sizeof(Datum) * ndistvalues[i]);
+
+ /* now collect distinct values into the array */
+ distvalues[i][0] = tmp[0];
+ ndistvalues[i] = 1;
+
+ for (j = 1; j < nvals; j++)
+ {
+ if (compare_scalars_simple(&tmp[j], &tmp[j-1], &ssup) != 0)
+ {
+ distvalues[i][ndistvalues[i]] = tmp[j];
+ ndistvalues[i] += 1;
+ }
+ }
+ }
+
+ pfree(tmp);
+ }
+
+ /*
+ * The initial bucket may contain NULL values, so we have to create
+ * buckets with NULL-only dimensions.
+ *
+ * FIXME We may need up to 2^ndims buckets - check that there are
+ * enough buckets (MVSTAT_HIST_MAX_BUCKETS >= 2^ndims).
+ */
+ create_null_buckets(histogram, 0, attrs, stats);
+
+ while (histogram->nbuckets < MVSTAT_HIST_MAX_BUCKETS)
+ {
+ MVBucket bucket = select_bucket_to_partition(histogram->nbuckets,
+ histogram->buckets);
+
+ /* no more buckets to partition */
+ if (bucket == NULL)
+ break;
+
+ histogram->buckets[histogram->nbuckets]
+ = partition_bucket(bucket, attrs, stats,
+ ndistvalues, distvalues);
+
+ histogram->nbuckets += 1;
+ }
+
+ /* finalize the frequencies etc. */
+ for (i = 0; i < histogram->nbuckets; i++)
+ {
+ HistogramBuild build_data
+ = ((HistogramBuild)histogram->buckets[i]->build_data);
+
+ /*
+ * The frequency has to be computed from the whole sample, in
+ * case some of the rows were used for MCV (and thus are missing
+ * from the histogram).
+ */
+ histogram->buckets[i]->ntuples
+ = (build_data->numrows * 1.0) / numrows_total;
+ }
+
+ return histogram;
+}
+
+/* fetch the histogram (as a bytea) from the pg_mv_statistic catalog */
+MVSerializedHistogram
+load_mv_histogram(Oid mvoid)
+{
+ bool isnull = false;
+ Datum histogram;
+
+#ifdef USE_ASSERT_CHECKING
+ Form_pg_mv_statistic mvstat;
+#endif
+
+ /* fetch the pg_mv_statistic tuple for the given statistics OID */
+ HeapTuple htup = SearchSysCache1(MVSTATOID, ObjectIdGetDatum(mvoid));
+
+ if (! HeapTupleIsValid(htup))
+ return NULL;
+
+#ifdef USE_ASSERT_CHECKING
+ mvstat = (Form_pg_mv_statistic) GETSTRUCT(htup);
+ Assert(mvstat->hist_enabled && mvstat->hist_built);
+#endif
+
+ histogram = SysCacheGetAttr(MVSTATOID, htup,
+ Anum_pg_mv_statistic_stahist, &isnull);
+
+ Assert(!isnull);
+
+ ReleaseSysCache(htup);
+
+ return deserialize_mv_histogram(DatumGetByteaP(histogram));
+}
+
+/* print some basic info about the histogram */
+Datum
+pg_mv_stats_histogram_info(PG_FUNCTION_ARGS)
+{
+ bytea *data = PG_GETARG_BYTEA_P(0);
+ char *result;
+
+ MVSerializedHistogram hist = deserialize_mv_histogram(data);
+
+ result = palloc0(128);
+ snprintf(result, 128, "nbuckets=%d", hist->nbuckets);
+
+ PG_RETURN_TEXT_P(cstring_to_text(result));
+}
+
+
+/* used to pass context into bsearch() */
+static SortSupport ssup_private = NULL;
+
+/*
+ * Serialize the MV histogram into a bytea value. The basic algorithm is quite
+ * simple, and mostly mimics the MCV serialization:
+ *
+ * (1) perform deduplication for each attribute (separately)
+ *
+ * (a) collect all (non-NULL) attribute values from all buckets
+ * (b) sort the data (using 'lt' from VacAttrStats)
+ * (c) remove duplicate values from the array
+ *
+ * (2) serialize the arrays into a bytea value
+ *
+ * (3) process all buckets
+ *
+ * (a) replace min/max values with indexes into the arrays
+ *
+ * Each attribute has to be processed separately, as we're mixing different
+ * datatypes, and we need to use the right operators to compare/sort them.
+ * We're also mixing pass-by-value and pass-by-ref types, and so on.
+ *
+ *
+ * FIXME This probably leaks memory, or at least uses it inefficiently
+ * (many small palloc() calls instead of a large one).
+ *
+ * TODO Consider packing boolean flags (NULL) for each item into 'char'
+ * or a longer type (instead of using an array of bool items).
+ */
+bytea *
+serialize_mv_histogram(MVHistogram histogram, int2vector *attrs,
+ VacAttrStats **stats)
+{
+ int i = 0, j = 0;
+ Size total_length = 0;
+
+ bytea *output = NULL;
+ char *data = NULL;
+
+ int nbuckets = histogram->nbuckets;
+ int ndims = histogram->ndimensions;
+
+ /* allocated for serialized bucket data */
+ int bucketsize = BUCKET_SIZE(ndims);
+ char *bucket = palloc0(bucketsize);
+
+ /* values per dimension (and number of non-NULL values) */
+ Datum **values = (Datum**)palloc0(sizeof(Datum*) * ndims);
+ int *counts = (int*)palloc0(sizeof(int) * ndims);
+
+ /* info about dimensions (for deserialize) */
+ DimensionInfo * info
+ = (DimensionInfo *)palloc0(sizeof(DimensionInfo)*ndims);
+
+ /* sort support data */
+ SortSupport ssup = (SortSupport)palloc0(sizeof(SortSupportData)*ndims);
+
+ /* collect and deduplicate values for each dimension separately */
+ for (i = 0; i < ndims; i++)
+ {
+ int count;
+ StdAnalyzeData *tmp = (StdAnalyzeData *)stats[i]->extra_data;
+
+ /* keep important info about the data type */
+ info[i].typlen = stats[i]->attrtype->typlen;
+ info[i].typbyval = stats[i]->attrtype->typbyval;
+
+ /*
+ * Allocate space for all min/max values, including NULLs
+ * (we won't use them, but we don't know how many there are),
+ * and then collect all non-NULL values.
+ */
+ values[i] = (Datum*)palloc0(sizeof(Datum) * nbuckets * 2);
+
+ for (j = 0; j < histogram->nbuckets; j++)
+ {
+ /* skip buckets where this dimension is NULL-only */
+ if (! histogram->buckets[j]->nullsonly[i])
+ {
+ values[i][counts[i]] = histogram->buckets[j]->min[i];
+ counts[i] += 1;
+
+ values[i][counts[i]] = histogram->buckets[j]->max[i];
+ counts[i] += 1;
+ }
+ }
+
+ /* there are just NULL values in this dimension */
+ if (counts[i] == 0)
+ continue;
+
+ /* sort and deduplicate */
+ ssup[i].ssup_cxt = CurrentMemoryContext;
+ ssup[i].ssup_collation = DEFAULT_COLLATION_OID;
+ ssup[i].ssup_nulls_first = false;
+
+ PrepareSortSupportFromOrderingOp(tmp->ltopr, &ssup[i]);
+
+ qsort_arg(values[i], counts[i], sizeof(Datum),
+ compare_scalars_simple, &ssup[i]);
+
+ /*
+ * Walk through the array and eliminate duplicate values, but
+ * keep the ordering (so that we can do bsearch later). We know
+ * there's at least 1 item, so we can skip the first element.
+ */
+ count = 1; /* number of deduplicated items */
+ for (j = 1; j < counts[i]; j++)
+ {
+ /* if it's different from the previous value, we need to keep it */
+ if (compare_datums_simple(values[i][j-1], values[i][j], &ssup[i]) != 0)
+ {
+ /* XXX: not needed if (count == j) */
+ values[i][count] = values[i][j];
+ count += 1;
+ }
+ }
+
+ /* make sure we fit into uint16 */
+ Assert(count <= UINT16_MAX);
+
+ /* keep info about the deduplicated count */
+ info[i].nvalues = count;
+
+ /* compute size of the serialized data */
+ if (info[i].typlen > 0)
+ /* byval or byref, but with fixed length (name, tid, ...) */
+ info[i].nbytes = info[i].nvalues * info[i].typlen;
+ else if (info[i].typlen == -1)
+ /* varlena, so just use VARSIZE_ANY */
+ for (j = 0; j < info[i].nvalues; j++)
+ info[i].nbytes += VARSIZE_ANY(values[i][j]);
+ else if (info[i].typlen == -2)
+ /* cstring, so strlen plus 1B for the '\0' terminator */
+ for (j = 0; j < info[i].nvalues; j++)
+ info[i].nbytes += strlen(DatumGetPointer(values[i][j])) + 1;
+ else
+ elog(ERROR, "unknown data type typbyval=%d typlen=%d",
+ info[i].typbyval, info[i].typlen);
+ }
+
+ /*
+ * Now we finally know how much space we'll need for the serialized
+ * histogram, as it contains these fields:
+ *
+ * - length (4B) for varlena
+ * - magic (4B)
+ * - type (4B)
+ * - ndimensions (4B)
+ * - nbuckets (4B)
+ * - info (ndim * sizeof(DimensionInfo)
+ * - arrays of values for each dimension
+ * - serialized buckets (nbuckets * bucketsize)
+ *
+ * So the 'header' size is 20B + ndim * sizeof(DimensionInfo) and
+ * then we'll place the data (and buckets).
+ */
+ total_length = (sizeof(int32) + offsetof(MVHistogramData, buckets)
+ + ndims * sizeof(DimensionInfo)
+ + nbuckets * bucketsize);
+
+ /* account for the deduplicated data */
+ for (i = 0; i < ndims; i++)
+ total_length += info[i].nbytes;
+
+ /* enforce arbitrary limit of 10MB */
+ if (total_length > (10 * 1024 * 1024))
+ elog(ERROR, "serialized histogram exceeds 10MB (%ld > %d)",
+ total_length, (10 * 1024 * 1024));
+
+ /* allocate space for the serialized histogram list, set header */
+ output = (bytea*)palloc0(total_length);
+ SET_VARSIZE(output, total_length);
+
+ /* we'll use 'data' to keep track of the place to write data */
+ data = VARDATA(output);
+
+ memcpy(data, histogram, offsetof(MVHistogramData, buckets));
+ data += offsetof(MVHistogramData, buckets);
+
+ memcpy(data, info, sizeof(DimensionInfo) * ndims);
+ data += sizeof(DimensionInfo) * ndims;
+
+ /* value array for each dimension */
+ for (i = 0; i < ndims; i++)
+ {
+#ifdef USE_ASSERT_CHECKING
+ char *tmp = data;
+#endif
+ for (j = 0; j < info[i].nvalues; j++)
+ {
+ if (info[i].typlen > 0)
+ {
+ /* passed by value or reference, but fixed length */
+ memcpy(data, &values[i][j], info[i].typlen);
+ data += info[i].typlen;
+ }
+ else if (info[i].typlen == -1)
+ {
+ /* varlena */
+ memcpy(data, DatumGetPointer(values[i][j]),
+ VARSIZE_ANY(values[i][j]));
+ data += VARSIZE_ANY(values[i][j]);
+ }
+ else if (info[i].typlen == -2)
+ {
+ /* cstring (don't forget the \0 terminator!) */
+ memcpy(data, DatumGetPointer(values[i][j]),
+ strlen(DatumGetPointer(values[i][j])) + 1);
+ data += strlen(DatumGetPointer(values[i][j])) + 1;
+ }
+ }
+ Assert((data - tmp) == info[i].nbytes);
+ }
+
+ /* and finally, the histogram buckets */
+ for (i = 0; i < nbuckets; i++)
+ {
+ /* don't write beyond the allocated space */
+ Assert(data <= (char*)output + total_length - bucketsize);
+
+ /* reset the values for each item */
+ memset(bucket, 0, bucketsize);
+
+ *BUCKET_NTUPLES(bucket) = histogram->buckets[i]->ntuples;
+
+ for (j = 0; j < ndims; j++)
+ {
+ /* do the lookup only for non-NULL values */
+ if (! histogram->buckets[i]->nullsonly[j])
+ {
+ uint16 idx;
+ Datum * v = NULL;
+ ssup_private = &ssup[j];
+
+ /* min boundary */
+ v = (Datum*)bsearch(&histogram->buckets[i]->min[j],
+ values[j], info[j].nvalues, sizeof(Datum),
+ bsearch_comparator);
+
+ if (v == NULL)
+ elog(ERROR, "value for dim %d not found in array", j);
+
+ /* compute index within the array */
+ idx = (v - values[j]);
+
+ Assert((idx >= 0) && (idx < info[j].nvalues));
+
+ BUCKET_MIN_INDEXES(bucket, ndims)[j] = idx;
+
+ /* max boundary */
+ v = (Datum*)bsearch(&histogram->buckets[i]->max[j],
+ values[j], info[j].nvalues, sizeof(Datum),
+ bsearch_comparator);
+
+ if (v == NULL)
+ elog(ERROR, "value for dim %d not found in array", j);
+
+ /* compute index within the array */
+ idx = (v - values[j]);
+
+ Assert((idx >= 0) && (idx < info[j].nvalues));
+
+ BUCKET_MAX_INDEXES(bucket, ndims)[j] = idx;
+ }
+ }
+
+ /* copy flags (nulls, min/max inclusive) */
+ memcpy(BUCKET_NULLS_ONLY(bucket, ndims),
+ histogram->buckets[i]->nullsonly, sizeof(bool) * ndims);
+
+ memcpy(BUCKET_MIN_INCL(bucket, ndims),
+ histogram->buckets[i]->min_inclusive, sizeof(bool) * ndims);
+
+ memcpy(BUCKET_MAX_INCL(bucket, ndims),
+ histogram->buckets[i]->max_inclusive, sizeof(bool) * ndims);
+
+ /* copy the item into the array */
+ memcpy(data, bucket, bucketsize);
+
+ data += bucketsize;
+ }
+
+ /* at this point we expect to match the total_length exactly */
+ Assert((data - (char*)output) == total_length);
+
+ /* FIXME free the values/counts arrays here */
+
+ return output;
+}
+
+/*
+ * Returns histogram in a partially-serialized form (keeps the boundary
+ * values deduplicated, so that it's possible to optimize the estimation
+ * part by caching function call results between buckets etc.).
+ */
+MVSerializedHistogram
+deserialize_mv_histogram(bytea * data)
+{
+ int i = 0, j = 0;
+
+ Size expected_size;
+ char *tmp = NULL;
+
+ MVSerializedHistogram histogram;
+ DimensionInfo *info;
+
+ int nbuckets;
+ int ndims;
+ int bucketsize;
+
+ /* temporary deserialization buffer */
+ int bufflen;
+ char *buff;
+ char *ptr;
+
+ if (data == NULL)
+ return NULL;
+
+ if (VARSIZE_ANY_EXHDR(data) < offsetof(MVSerializedHistogramData,buckets))
+ elog(ERROR, "invalid histogram size %ld (expected at least %ld)",
+ VARSIZE_ANY_EXHDR(data), offsetof(MVSerializedHistogramData,buckets));
+
+ /* read the histogram header */
+ histogram
+ = (MVSerializedHistogram)palloc(sizeof(MVSerializedHistogramData));
+
+ /* initialize pointer to the data part (skip the varlena header) */
+ tmp = VARDATA(data);
+
+ /* get the header and perform basic sanity checks */
+ memcpy(histogram, tmp, offsetof(MVSerializedHistogramData, buckets));
+ tmp += offsetof(MVSerializedHistogramData, buckets);
+
+ if (histogram->magic != MVSTAT_HIST_MAGIC)
+ elog(ERROR, "invalid histogram magic %d (expected %dd)",
+ histogram->magic, MVSTAT_HIST_MAGIC);
+
+ if (histogram->type != MVSTAT_HIST_TYPE_BASIC)
+ elog(ERROR, "invalid histogram type %d (expected %dd)",
+ histogram->type, MVSTAT_HIST_TYPE_BASIC);
+
+ nbuckets = histogram->nbuckets;
+ ndims = histogram->ndimensions;
+ bucketsize = BUCKET_SIZE(ndims);
+
+ Assert((nbuckets > 0) && (nbuckets <= MVSTAT_HIST_MAX_BUCKETS));
+ Assert((ndims >= 2) && (ndims <= MVSTATS_MAX_DIMENSIONS));
+
+ /*
+ * What size do we expect with these parameters? (This is still
+ * incomplete, as we have yet to add the value array sizes from
+ * the DimensionInfo records.)
+ */
+ expected_size = offsetof(MVSerializedHistogramData,buckets) +
+ ndims * sizeof(DimensionInfo) +
+ (nbuckets * bucketsize);
+
+ /* check that we have at least the DimensionInfo records */
+ if (VARSIZE_ANY_EXHDR(data) < expected_size)
+ elog(ERROR, "invalid histogram size %ld (expected %ld)",
+ VARSIZE_ANY_EXHDR(data), expected_size);
+
+ info = (DimensionInfo*)(tmp);
+ tmp += ndims * sizeof(DimensionInfo);
+
+ /* account for the value arrays */
+ for (i = 0; i < ndims; i++)
+ expected_size += info[i].nbytes;
+
+ if (VARSIZE_ANY_EXHDR(data) != expected_size)
+ elog(ERROR, "invalid histogram size %ld (expected %ld)",
+ VARSIZE_ANY_EXHDR(data), expected_size);
+
+ /* looks OK - not corrupted or something */
+
+ /* now let's allocate a single buffer for all the values and counts */
+
+ bufflen = (sizeof(int) + sizeof(Datum*)) * ndims;
+ for (i = 0; i < ndims; i++)
+ {
+ /* don't allocate space for byval types, matching Datum */
+ if (! (info[i].typbyval && (info[i].typlen == sizeof(Datum))))
+ bufflen += (sizeof(Datum) * info[i].nvalues);
+ }
+
+ /* also, include space for the result, tracking the buckets */
+ bufflen += nbuckets * (
+ sizeof(MVSerializedBucket) + /* bucket pointer */
+ sizeof(MVSerializedBucketData)); /* bucket data */
+
+ buff = palloc0(bufflen);
+ ptr = buff;
+
+ histogram->nvalues = (int*)ptr;
+ ptr += (sizeof(int) * ndims);
+
+ histogram->values = (Datum**)ptr;
+ ptr += (sizeof(Datum*) * ndims);
+
+ /*
+ * FIXME This uses pointers to the original data array (the types
+ * not passed by value), so when someone frees the memory,
+ * e.g. by doing something like this:
+ *
+ * bytea * data = ... fetch the data from catalog ...
+ * MCVList mcvlist = deserialize_mcv_list(data);
+ * pfree(data);
+ *
+ * then 'mcvlist' references the freed memory. This needs to
+ * copy the pieces.
+ *
+ * TODO same as in MCV deserialization / consider moving to common.c
+ */
+ for (i = 0; i < ndims; i++)
+ {
+ histogram->nvalues[i] = info[i].nvalues;
+
+ if (info[i].typbyval && info[i].typlen == sizeof(Datum))
+ {
+ /* passed by value / Datum - simply reuse the array */
+ histogram->values[i] = (Datum*)tmp;
+ tmp += info[i].nbytes;
+ }
+ else
+ {
+ /* all the varlena data need a chunk from the buffer */
+ histogram->values[i] = (Datum*)ptr;
+ ptr += (sizeof(Datum) * info[i].nvalues);
+
+ if (info[i].typbyval)
+ {
+ /* passed by value, but smaller than Datum */
+ for (j = 0; j < info[i].nvalues; j++)
+ {
+ /* copy the value into the Datum array */
+ memcpy(&histogram->values[i][j], tmp, info[i].typlen);
+ tmp += info[i].typlen;
+ }
+ }
+ else if (info[i].typlen > 0)
+ {
+ /* passed by reference, but fixed length (name, tid, ...) */
+ for (j = 0; j < info[i].nvalues; j++)
+ {
+ /* just point into the array */
+ histogram->values[i][j] = PointerGetDatum(tmp);
+ tmp += info[i].typlen;
+ }
+ }
+ else if (info[i].typlen == -1)
+ {
+ /* varlena */
+ for (j = 0; j < info[i].nvalues; j++)
+ {
+ /* just point into the array */
+ histogram->values[i][j] = PointerGetDatum(tmp);
+ tmp += VARSIZE_ANY(tmp);
+ }
+ }
+ else if (info[i].typlen == -2)
+ {
+ /* cstring */
+ for (j = 0; j < info[i].nvalues; j++)
+ {
+ /* just point into the array */
+ histogram->values[i][j] = PointerGetDatum(tmp);
+ tmp += (strlen(tmp) + 1); /* don't forget the \0 */
+ }
+ }
+ }
+ }
+
+ histogram->buckets = (MVSerializedBucket*)ptr;
+ ptr += (sizeof(MVSerializedBucket) * nbuckets);
+
+ for (i = 0; i < nbuckets; i++)
+ {
+ MVSerializedBucket bucket = (MVSerializedBucket)ptr;
+ ptr += sizeof(MVSerializedBucketData);
+
+ bucket->ntuples = *BUCKET_NTUPLES(tmp);
+ bucket->nullsonly = BUCKET_NULLS_ONLY(tmp, ndims);
+ bucket->min_inclusive = BUCKET_MIN_INCL(tmp, ndims);
+ bucket->max_inclusive = BUCKET_MAX_INCL(tmp, ndims);
+
+ bucket->min = BUCKET_MIN_INDEXES(tmp, ndims);
+ bucket->max = BUCKET_MAX_INDEXES(tmp, ndims);
+
+ histogram->buckets[i] = bucket;
+
+ Assert(tmp <= (char*)data + VARSIZE_ANY(data));
+
+ tmp += bucketsize;
+ }
+
+ /* at this point we expect to match the total_length exactly */
+ Assert((tmp - VARDATA(data)) == expected_size);
+
+ /* we should exhaust the output buffer exactly */
+ Assert((ptr - buff) == bufflen);
+
+ return histogram;
+}
+
+/*
+ * Build the initial bucket, which will then be split into smaller ones.
+ */
+static MVBucket
+create_initial_mv_bucket(int numrows, HeapTuple *rows, int2vector *attrs,
+ VacAttrStats **stats)
+{
+ int i;
+ int numattrs = attrs->dim1;
+ HistogramBuild data = NULL;
+
+ /* TODO allocate bucket as a single piece, including all the fields. */
+ MVBucket bucket = (MVBucket)palloc0(sizeof(MVBucketData));
+
+ Assert(numrows > 0);
+ Assert(rows != NULL);
+ Assert((numattrs >= 2) && (numattrs <= MVSTATS_MAX_DIMENSIONS));
+
+ /* allocate the per-dimension arrays */
+
+ /* flags for null-only dimensions */
+ bucket->nullsonly = (bool*)palloc0(numattrs * sizeof(bool));
+
+ /* inclusiveness boundaries - lower/upper bounds */
+ bucket->min_inclusive = (bool*)palloc0(numattrs * sizeof(bool));
+ bucket->max_inclusive = (bool*)palloc0(numattrs * sizeof(bool));
+
+ /* lower/upper boundaries */
+ bucket->min = (Datum*)palloc0(numattrs * sizeof(Datum));
+ bucket->max = (Datum*)palloc0(numattrs * sizeof(Datum));
+
+ /* build-data */
+ data = (HistogramBuild)palloc0(sizeof(HistogramBuildData));
+
+ /* number of distinct values (per dimension) */
+ data->ndistincts = (uint32*)palloc0(numattrs * sizeof(uint32));
+
+ /* all the sample rows fall into the initial bucket */
+ data->numrows = numrows;
+ data->rows = rows;
+
+ bucket->build_data = data;
+
+ /*
+ * Update the number of distinct value combinations in the bucket
+ * (which we use when selecting the bucket to partition), and then the
+ * number of distinct values for each dimension (which we use when
+ * choosing which dimension to split).
+ */
+ update_bucket_ndistinct(bucket, attrs, stats);
+
+ /* Update ndistinct (and also set min/max) for all dimensions. */
+ for (i = 0; i < numattrs; i++)
+ update_dimension_ndistinct(bucket, i, attrs, stats, true);
+
+ return bucket;
+}
+
+/*
+ * Choose the bucket to partition next.
+ *
+ * The current criterion is rather simple, chosen so that the algorithm
+ * produces buckets with about equal frequency and regular size. We
+ * select the bucket with the most sample rows (among those that may
+ * still be split), and then split it by the longest dimension.
+ *
+ * The distinct values are uniformly mapped to [0,1] interval, and this
+ * is used to compute length of the value range.
+ *
+ * NOTE: This is not the same array used for deduplication, as this
+ * contains values for all the tuples from the sample, not just
+ * the boundary values.
+ *
+ * Returns either a pointer to the bucket selected to be partitioned,
+ * or NULL if there are no buckets that may be split (i.e. all buckets
+ * contain a single distinct value).
+ *
+ * TODO Consider other partitioning criteria (v-optimal, maxdiff etc.).
+ * For example use the "bucket volume" (product of dimension
+ * lengths) to select the bucket.
+ *
+ * We need buckets containing about the same number of tuples (so
+ * about the same frequency), as that limits the error when we
+ * match the bucket partially (in that case use 1/2 the bucket).
+ *
+ * We also need buckets with "regular" size, i.e. not "narrow" in
+ * some dimensions and "wide" in the others, because that makes
+ * partial matches more likely and increases the estimation error,
+ * especially when the clauses match many buckets partially. This
+ * is especially serious for OR-clauses, because in that case any
+ * of them may add the bucket as a (partial) match. With AND-clauses
+ * all the clauses have to match the bucket, which makes this issue
+ * somewhat less pressing.
+ *
+ * For example this table:
+ *
+ * CREATE TABLE t AS SELECT i AS a, i AS b
+ * FROM generate_series(1,1000000) s(i);
+ * ALTER TABLE t ADD STATISTICS (histogram) ON (a,b);
+ * ANALYZE t;
+ *
+ * It's a very specific (and perhaps artificial) example, because
+ * every bucket always has exactly the same number of distinct
+ * values in all dimensions, which makes the partitioning tricky.
+ *
+ * Then:
+ *
+ * SELECT * FROM t WHERE a < 10 AND b < 10;
+ *
+ * is estimated to return ~120 rows, while in reality it returns 9.
+ *
+ * QUERY PLAN
+ * ----------------------------------------------------------------
+ * Seq Scan on t (cost=0.00..19425.00 rows=117 width=8)
+ * (actual time=0.185..270.774 rows=9 loops=1)
+ * Filter: ((a < 10) AND (b < 10))
+ * Rows Removed by Filter: 999991
+ *
+ * while the query using OR clauses is estimated like this:
+ *
+ * QUERY PLAN
+ * ----------------------------------------------------------------
+ * Seq Scan on t (cost=0.00..19425.00 rows=8100 width=8)
+ * (actual time=0.118..189.919 rows=9 loops=1)
+ * Filter: ((a < 10) OR (b < 10))
+ * Rows Removed by Filter: 999991
+ *
+ * which is clearly much worse. This happens because the histogram
+ * contains buckets like this:
+ *
+ * bucket 592 [3 30310] [30134 30593] => [0.000233]
+ *
+ * i.e. the length of "a" dimension is (30310-3)=30307, while the
+ * length of "b" is (30593-30134)=459. So the "b" dimension is much
+ * narrower than "a". Of course, there are buckets where "b" is the
+ * wider dimension.
+ *
+ * This is partially mitigated by selecting the "longest" dimension
+ * in partition_bucket() but that only happens after we already
+ * selected the bucket. So if we never select the bucket, we can't
+ * really fix it there.
+ *
+ * The other reason why this particular example behaves so poorly
+ * is due to the way we split the partition in partition_bucket().
+ * Currently we attempt to divide the bucket into two parts with
+ * the same number of sampled tuples (frequency), but that does not
+ * work well when all the tuples are squashed on one end of the
+ * bucket (e.g. exactly at the diagonal, as a=b). In that case we
+ * split the bucket into a tiny bucket on the diagonal, and a huge
+ * remaining part of the bucket, which is still going to be narrow
+ * and we're unlikely to fix that.
+ *
+ * So perhaps we need two partitioning strategies - one aiming to
+ * split buckets with high frequency (number of sampled rows), the
+ * other aiming to split "large" buckets. And alternating between
+ * them, somehow.
+ *
+ * TODO Allowing the bucket to degenerate to a single combination of
+ * values makes it a rather strange MCV list. Maybe we should use a
+ * higher lower boundary, or maybe make the selection criteria more
+ * complex (e.g. consider the number of rows in the bucket, etc.).
+ *
+ * That however is different from buckets 'degenerated' only for
+ * some dimensions (e.g. half of them), which is perfectly
+ * appropriate for statistics on a combination of low and high
+ * cardinality columns.
+ *
+ * TODO Consider using similar lower boundary for row count as for simple
+ * histograms, i.e. 300 tuples per bucket.
+ */
+static MVBucket
+select_bucket_to_partition(int nbuckets, MVBucket * buckets)
+{
+ int i;
+ int numrows = 0;
+ MVBucket bucket = NULL;
+
+ for (i = 0; i < nbuckets; i++)
+ {
+ HistogramBuild data = (HistogramBuild)buckets[i]->build_data;
+ /* if the number of rows is higher, use this bucket */
+ if ((data->ndistinct > 2) &&
+ (data->numrows > numrows) &&
+ (data->numrows >= MIN_BUCKET_ROWS))
+ {
+ bucket = buckets[i];
+ numrows = data->numrows;
+ }
+ }
+
+ /* may be NULL if there are no buckets with (ndistinct > 2) */
+ return bucket;
+}
+
+/*
+ * A simple bucket partitioning implementation - we choose the longest
+ * bucket dimension, measured using the array of distinct values built
+ * at the very beginning of the build.
+ *
+ * We map all the distinct values to a [0,1] interval, uniformly
+ * distributed, and then use this to measure length. It's essentially
+ * a number of distinct values within the range, normalized to [0,1].
+ *
+ * Then we choose a 'middle' value splitting the bucket into two parts
+ * with roughly the same frequency.
+ *
+ * This splits the bucket by tweaking the existing one, and returning
+ * the new bucket (essentially shrinking the existing one in-place and
+ * returning the other "half" as a new bucket). The caller is responsible
+ * for adding the new bucket into the list of buckets.
+ *
+ * There are multiple histogram options, centered around the partitioning
+ * criteria, specifying both how to choose a bucket and the dimension
+ * most in need of a split. For a nice summary and general overview, see
+ * "rK-Hist : an R-Tree based histogram for multi-dimensional selectivity
+ * estimation" thesis by J. A. Lopez, Concordia University, p.34-37 (and
+ * possibly p. 32-34 for explanation of the terms).
+ *
+ * TODO It requires care to prevent splitting only one dimension and not
+ * splitting another one at all (which might happen easily in case
+ * of strongly dependent columns - e.g. y=x). The current algorithm
+ * minimizes this, but may still happen for perfectly dependent
+ * examples (when all the dimensions have equal length, the first
+ * one will be selected).
+ *
+ * TODO Should probably consider statistics target for the columns (e.g.
+ * to split dimensions with higher statistics target more frequently).
+ */
+static MVBucket
+partition_bucket(MVBucket bucket, int2vector *attrs,
+ VacAttrStats **stats,
+ int *ndistvalues, Datum **distvalues)
+{
+ int i;
+ int dimension;
+ int numattrs = attrs->dim1;
+
+ Datum split_value;
+ MVBucket new_bucket;
+ HistogramBuild new_data;
+
+ /* needed for sort, when looking for the split value */
+ bool isNull;
+ int nvalues = 0;
+ HistogramBuild data = (HistogramBuild)bucket->build_data;
+ StdAnalyzeData * mystats = NULL;
+ ScalarItem * values = (ScalarItem*)palloc0(data->numrows * sizeof(ScalarItem));
+ SortSupportData ssup;
+
+ /* looking for the split value */
+ int nrows = 1; /* number of rows below current value */
+ double delta;
+
+ /* needed when splitting the values */
+ HeapTuple * oldrows = data->rows;
+ int oldnrows = data->numrows;
+
+ /*
+ * We can't split buckets with a single distinct value (this also
+ * disqualifies NULL-only dimensions). Also, there have to be multiple
+ * sample rows (otherwise there couldn't be multiple distinct values).
+ */
+ Assert(data->ndistinct > 1);
+ Assert(data->numrows > 1);
+ Assert((numattrs >= 2) && (numattrs <= MVSTATS_MAX_DIMENSIONS));
+
+ /*
+ * Look for the next dimension to split.
+ */
+ delta = 0.0;
+ dimension = -1;
+
+ for (i = 0; i < numattrs; i++)
+ {
+ Datum *a, *b;
+
+ mystats = (StdAnalyzeData *) stats[i]->extra_data;
+
+ /* initialize sort support, etc. */
+ memset(&ssup, 0, sizeof(ssup));
+ ssup.ssup_cxt = CurrentMemoryContext;
+
+ /* We always use the default collation for statistics */
+ ssup.ssup_collation = DEFAULT_COLLATION_OID;
+ ssup.ssup_nulls_first = false;
+
+ PrepareSortSupportFromOrderingOp(mystats->ltopr, &ssup);
+
+ /* can't split NULL-only dimension */
+ if (bucket->nullsonly[i])
+ continue;
+
+ /* can't split dimension with a single ndistinct value */
+ if (data->ndistincts[i] <= 1)
+ continue;
+
+ /* sort support for the bsearch_comparator */
+ ssup_private = &ssup;
+
+ /* search for min boundary in the distinct list */
+ a = (Datum*)bsearch(&bucket->min[i],
+ distvalues[i], ndistvalues[i],
+ sizeof(Datum), bsearch_comparator);
+
+ b = (Datum*)bsearch(&bucket->max[i],
+ distvalues[i], ndistvalues[i],
+ sizeof(Datum), bsearch_comparator);
+
+ /* if this dimension is 'longer', partition by it */
+ if (((b-a)*1.0 / ndistvalues[i]) > delta)
+ {
+ delta = ((b-a)*1.0 / ndistvalues[i]);
+ dimension = i;
+ }
+ }
+
+ /*
+ * If we haven't found a dimension here, we've done something
+ * wrong in select_bucket_to_partition.
+ */
+ Assert(dimension != -1);
+
+ /*
+ * Walk through the selected dimension, collect and sort the values
+ * and then choose the value to use as the new boundary.
+ */
+ mystats = (StdAnalyzeData *) stats[dimension]->extra_data;
+
+ /* initialize sort support, etc. */
+ memset(&ssup, 0, sizeof(ssup));
+ ssup.ssup_cxt = CurrentMemoryContext;
+
+ /* We always use the default collation for statistics */
+ ssup.ssup_collation = DEFAULT_COLLATION_OID;
+ ssup.ssup_nulls_first = false;
+
+ PrepareSortSupportFromOrderingOp(mystats->ltopr, &ssup);
+
+ for (i = 0; i < data->numrows; i++)
+ {
+ /* remember the index of the sample row, to make the partitioning simpler */
+ values[nvalues].value = heap_getattr(data->rows[i], attrs->values[dimension],
+ stats[dimension]->tupDesc, &isNull);
+ values[nvalues].tupno = i;
+
+ /* no NULL values allowed here (we don't do splits by null-only dimensions) */
+ Assert(!isNull);
+
+ nvalues++;
+ }
+
+ /* sort the array of values */
+ qsort_arg((void *) values, nvalues, sizeof(ScalarItem),
+ compare_scalars_partition, (void *) &ssup);
+
+ /*
+ * We know there are data->ndistincts[dimension] distinct values
+ * in this dimension, and we want to split this into half, so walk
+ * through the array and stop once we see (ndistinct/2) values.
+ *
+ * We always choose the "next" value, i.e. (n/2+1)-th distinct value,
+ * and use it as an exclusive upper boundary (and inclusive lower
+ * boundary).
+ *
+ * TODO Maybe we should use "average" of the two middle distinct
+ * values (at least for even distinct counts), but that would
+ * require being able to do an average (which does not work
+ * for non-arithmetic types).
+ *
+ * TODO Another option is to look for a split that'd give about
+ * 50% tuples (not distinct values) in each partition. That
+ * might work better when there are a few very frequent
+ * values, and many rare ones.
+ */
+ delta = fabs(data->numrows);
+ split_value = values[0].value;
+
+ for (i = 1; i < data->numrows; i++)
+ {
+ /* compare via the comparator (Datum equality is wrong for by-ref types) */
+ if (compare_datums_simple(values[i-1].value, values[i].value, &ssup) != 0)
+ {
+ /* are we closer to splitting the bucket in half? */
+ if (fabs(i - data->numrows/2.0) < delta)
+ {
+ /* let's assume we'll use this value for the split */
+ split_value = values[i].value;
+ delta = fabs(i - data->numrows/2.0);
+ nrows = i;
+ }
+ }
+ }
+
+ Assert(nrows > 0);
+ Assert(nrows < data->numrows);
+
+ /* create the new bucket as an (incomplete) copy of the one being partitioned */
+ new_bucket = copy_mv_bucket(bucket, numattrs);
+ new_data = (HistogramBuild)new_bucket->build_data;
+
+ /*
+ * Do the actual split of the chosen dimension, using the split value as the
+ * upper bound for the existing bucket, and lower bound for the new one.
+ */
+ bucket->max[dimension] = split_value;
+ new_bucket->min[dimension] = split_value;
+
+ bucket->max_inclusive[dimension] = false;
+ new_bucket->min_inclusive[dimension] = true;
+
+ /*
+ * Redistribute the sample tuples using the 'ScalarItem->tupno'
+ * index. We know 'nrows' rows should remain in the original
+ * bucket and the rest goes to the new one.
+ */
+
+ data->rows = (HeapTuple*)palloc0(nrows * sizeof(HeapTuple));
+ new_data->rows = (HeapTuple*)palloc0((oldnrows - nrows) * sizeof(HeapTuple));
+
+ data->numrows = nrows;
+ new_data->numrows = (oldnrows - nrows);
+
+ /*
+ * The first nrows should go to the first bucket, the rest should
+ * go to the new one. Use the tupno field to get the actual HeapTuple
+ * row from the original array of sample rows.
+ */
+ for (i = 0; i < nrows; i++)
+ memcpy(&data->rows[i], &oldrows[values[i].tupno], sizeof(HeapTuple));
+
+ for (i = nrows; i < oldnrows; i++)
+ memcpy(&new_data->rows[i-nrows], &oldrows[values[i].tupno], sizeof(HeapTuple));
+
+ /* update ndistinct values for the buckets (total and per dimension) */
+ update_bucket_ndistinct(bucket, attrs, stats);
+ update_bucket_ndistinct(new_bucket, attrs, stats);
+
+ /*
+ * TODO We don't need to do this for the dimension we used for split,
+ * because we know how many distinct values went to each partition.
+ */
+ for (i = 0; i < numattrs; i++)
+ {
+ update_dimension_ndistinct(bucket, i, attrs, stats, false);
+ update_dimension_ndistinct(new_bucket, i, attrs, stats, false);
+ }
+
+ pfree(oldrows);
+ pfree(values);
+
+ return new_bucket;
+}
+
+/*
+ * Copy a histogram bucket. The copy does not include the build-time
+ * data, i.e. sampled rows etc.
+ */
+static MVBucket
+copy_mv_bucket(MVBucket bucket, uint32 ndimensions)
+{
+ /* TODO allocate as a single piece (including all the fields) */
+ MVBucket new_bucket = (MVBucket)palloc0(sizeof(MVBucketData));
+ HistogramBuild data = (HistogramBuild)palloc0(sizeof(HistogramBuildData));
+
+ /*
+ * Copy only the attributes that will stay the same after the split;
+ * the rest will be recomputed after the split.
+ */
+
+ /* allocate the per-dimension arrays */
+ new_bucket->nullsonly = (bool*)palloc0(ndimensions * sizeof(bool));
+
+ /* inclusiveness boundaries - lower/upper bounds */
+ new_bucket->min_inclusive = (bool*)palloc0(ndimensions * sizeof(bool));
+ new_bucket->max_inclusive = (bool*)palloc0(ndimensions * sizeof(bool));
+
+ /* lower/upper boundaries */
+ new_bucket->min = (Datum*)palloc0(ndimensions * sizeof(Datum));
+ new_bucket->max = (Datum*)palloc0(ndimensions * sizeof(Datum));
+
+ /* copy data */
+ memcpy(new_bucket->nullsonly, bucket->nullsonly, ndimensions * sizeof(bool));
+
+ memcpy(new_bucket->min_inclusive, bucket->min_inclusive, ndimensions*sizeof(bool));
+ memcpy(new_bucket->min, bucket->min, ndimensions*sizeof(Datum));
+
+ memcpy(new_bucket->max_inclusive, bucket->max_inclusive, ndimensions*sizeof(bool));
+ memcpy(new_bucket->max, bucket->max, ndimensions*sizeof(Datum));
+
+ /* allocate and copy the interesting part of the build data */
+ data->ndistincts = (uint32*)palloc0(ndimensions * sizeof(uint32));
+
+ new_bucket->build_data = data;
+
+ return new_bucket;
+}
+
+/*
+ * Counts the number of distinct value combinations in the bucket. This
+ * just copies the Datum values into a simple array, sorts it using a
+ * multi-column comparator, and counts distinct combinations by
+ * comparing neighboring items. That only works reliably for
+ * pass-by-value data types (assuming they don't use collations etc.)
+ *
+ * TODO This might evaluate and store the distinct counts for all
+ * possible attribute combinations. The assumption is this might be
+ * useful for estimating things like GROUP BY cardinalities (e.g.
+ * in cases when some buckets contain a lot of low-frequency
+ * combinations, and other buckets contain few high-frequency ones).
+ *
+ * But it's unclear whether it's worth the price. Computing this
+ * is actually quite cheap, because it may be evaluated at the very
+ * end, when the buckets are rather small (so sorting it in 2^N ways
+ * is not a big deal). Assuming the partitioning algorithm does not
+ * use these values to make its decisions, of course (the current
+ * algorithm does not).
+ *
+ * The overhead of storing, fetching and parsing the data is more
+ * concerning - adding 2^N values per bucket (even if it's just
+ * a 1B or 2B value) would significantly bloat the histogram, and
+ * thus its impact on the optimizer, which is not really desirable.
+ *
+ * TODO This only updates the ndistinct for the sample (or bucket), but
+ * we eventually need an estimate of the total number of distinct
+ * values in the dataset. We can either use the current 1D approach
+ * (i.e., if it's more than 10% of the sample, assume the count is
+ * proportional to the number of rows), or implement the estimator
+ * suggested in the article, supposedly giving 'optimal' estimates
+ * (w.r.t. probability of error).
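+ *
+ * A small example of the counting step below: for sorted combinations
+ * (1,1), (1,1), (1,2), (2,2), comparing each item with its predecessor
+ * yields 3 distinct combinations.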
+ */
+static void
+update_bucket_ndistinct(MVBucket bucket, int2vector *attrs, VacAttrStats ** stats)
+{
+ int i, j;
+ int numattrs = attrs->dim1;
+
+ HistogramBuild data = (HistogramBuild)bucket->build_data;
+ int numrows = data->numrows;
+
+ MultiSortSupport mss = multi_sort_init(numattrs);
+
+ /*
+ * We could collect this while walking through the attributes in the
+ * caller (as it is, heap_getattr gets called twice for each value).
+ */
+ SortItem *items = (SortItem*)palloc0(numrows * sizeof(SortItem));
+ Datum *values = (Datum*)palloc0(numrows * sizeof(Datum) * numattrs);
+ bool *isnull = (bool*)palloc0(numrows * sizeof(bool) * numattrs);
+
+ for (i = 0; i < numrows; i++)
+ {
+ items[i].values = &values[i * numattrs];
+ items[i].isnull = &isnull[i * numattrs];
+ }
+
+ /* prepare the sort functions for all the dimensions */
+ for (i = 0; i < numattrs; i++)
+ multi_sort_add_dimension(mss, i, i, stats);
+
+ /* collect the values */
+ for (i = 0; i < numrows; i++)
+ for (j = 0; j < numattrs; j++)
+ items[i].values[j]
+ = heap_getattr(data->rows[i], attrs->values[j],
+ stats[j]->tupDesc, &items[i].isnull[j]);
+
+ qsort_arg((void *) items, numrows, sizeof(SortItem),
+ multi_sort_compare, mss);
+
+ data->ndistinct = 1;
+
+ for (i = 1; i < numrows; i++)
+ if (multi_sort_compare(&items[i], &items[i-1], mss) != 0)
+ data->ndistinct += 1;
+
+ pfree(items);
+ pfree(values);
+ pfree(isnull);
+}
+
+/*
+ * Count distinct values per bucket dimension.
+ */
+static void
+update_dimension_ndistinct(MVBucket bucket, int dimension, int2vector *attrs,
+ VacAttrStats ** stats, bool update_boundaries)
+{
+ int j;
+ int nvalues = 0;
+ bool isNull;
+ HistogramBuild data = (HistogramBuild)bucket->build_data;
+ Datum * values = (Datum*)palloc0(data->numrows * sizeof(Datum));
+ SortSupportData ssup;
+
+ StdAnalyzeData * mystats = (StdAnalyzeData *) stats[dimension]->extra_data;
+
+ /* we may already know this is a NULL-only dimension */
+ if (bucket->nullsonly[dimension])
+ {
+ data->ndistincts[dimension] = 1;
+ pfree(values);
+ return;
+ }
+
+ memset(&ssup, 0, sizeof(ssup));
+ ssup.ssup_cxt = CurrentMemoryContext;
+
+ /* We always use the default collation for statistics */
+ ssup.ssup_collation = DEFAULT_COLLATION_OID;
+ ssup.ssup_nulls_first = false;
+
+ PrepareSortSupportFromOrderingOp(mystats->ltopr, &ssup);
+
+ for (j = 0; j < data->numrows; j++)
+ {
+ values[nvalues] = heap_getattr(data->rows[j], attrs->values[dimension],
+ stats[dimension]->tupDesc, &isNull);
+
+ /* ignore NULL values */
+ if (! isNull)
+ nvalues++;
+ }
+
+ /* there's always at least 1 distinct value (may be NULL) */
+ data->ndistincts[dimension] = 1;
+
+ /* if there are only NULL values in the column, mark it so and bail
+ * out (the remaining dimensions are handled by the caller) */
+ if (nvalues == 0)
+ {
+ pfree(values);
+ bucket->nullsonly[dimension] = true;
+ return;
+ }
+
+ /* sort the array of (pass-by-value) datums */
+ qsort_arg((void *) values, nvalues, sizeof(Datum),
+ compare_scalars_simple, (void *) &ssup);
+
+ /*
+ * Update min/max boundaries to the smallest bounding box. Generally, this
+ * needs to be done only when constructing the initial bucket.
+ */
+ if (update_boundaries)
+ {
+ /* store the min/max values */
+ bucket->min[dimension] = values[0];
+ bucket->min_inclusive[dimension] = true;
+
+ bucket->max[dimension] = values[nvalues-1];
+ bucket->max_inclusive[dimension] = true;
+ }
+
+ /*
+ * Walk through the array and count distinct values by comparing
+ * succeeding values.
+ *
+ * FIXME This only works for pass-by-value types (i.e. not VARCHARs
+ * etc.). Although thanks to the deduplication it might work
+ * even for those types (equal values will get the same item
+ * in the deduplicated array).
+ */
+ for (j = 1; j < nvalues; j++)
+ {
+ if (values[j] != values[j-1])
+ data->ndistincts[dimension] += 1;
+ }
+
+ pfree(values);
+}
+
+/*
+ * A properly built histogram must not contain buckets mixing NULL and
+ * non-NULL values in a single dimension. Each dimension either is
+ * marked as 'nulls only' (and thus contains only NULL values), or
+ * it must not contain any NULL values.
+ *
+ * Therefore, if the sample contains NULL values in any of the columns,
+ * it's necessary to build those NULL-buckets. This is done recursively,
+ * using the following algorithm, operating on a single bucket:
+ *
+ * (1) Check that all dimensions are well-formed (not mixing NULL
+ * and non-NULL values).
+ *
+ * (2) If all dimensions are well-formed, terminate.
+ *
+ * (3) If a dimension contains only NULL values, but is not
+ * marked as NULL-only, mark it as NULL-only and run the
+ * algorithm again (on this bucket).
+ *
+ * (4) If a dimension mixes NULL and non-NULL values, split the
+ * bucket into two parts - one with NULL values, one with
+ * non-NULL values (replacing the current one). Then run
+ * the algorithm on both buckets.
+ *
+ * This is executed in a recursive manner, but the number of executions
+ * should be quite low - limited by the number of NULL-buckets. Also,
+ * in each branch the number of nested calls is limited by the number
+ * of dimensions (attributes) of the histogram.
+ *
+ * At the end, there should be buckets with no mixed dimensions. The
+ * number of buckets produced by this algorithm is rather limited - with
+ * N dimensions, there may be only 2^N such buckets (each dimension may
+ * be either NULL or non-NULL). So with 8 dimensions (current value of
+ * MVSTATS_MAX_DIMENSIONS) there may be only 256 such buckets.
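+ *
+ * For example (a sketch): with 2 dimensions (a,b) and NULL values
+ * present in both columns, this can produce at most 4 buckets:
+ * (NOT NULL, NOT NULL), (NOT NULL, NULL), (NULL, NOT NULL) and
+ * (NULL, NULL).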
+ *
+ * After this, the 'regular' bucket-split algorithm runs, further
+ * optimizing the histogram.
+ */
+static void
+create_null_buckets(MVHistogram histogram, int bucket_idx,
+ int2vector *attrs, VacAttrStats ** stats)
+{
+ int i, j;
+ int null_dim = -1;
+ int null_count = 0;
+ bool null_found = false;
+ MVBucket bucket, null_bucket;
+ int null_idx, curr_idx;
+ HistogramBuild data, null_data;
+
+ /* remember original values from the bucket */
+ int numrows;
+ HeapTuple *oldrows = NULL;
+
+ Assert(bucket_idx < histogram->nbuckets);
+ Assert(histogram->ndimensions == attrs->dim1);
+
+ bucket = histogram->buckets[bucket_idx];
+ data = (HistogramBuild)bucket->build_data;
+
+ numrows = data->numrows;
+ oldrows = data->rows;
+
+ /*
+ * Walk through all rows / dimensions, and stop once we find NULL
+ * in a dimension not yet marked as NULL-only.
+ */
+ for (i = 0; i < data->numrows; i++)
+ {
+ /*
+ * FIXME We don't need to start from the first attribute
+ * here - we can start from the last known dimension.
+ */
+ for (j = 0; j < histogram->ndimensions; j++)
+ {
+ /* Is this a NULL-only dimension? If yes, skip. */
+ if (bucket->nullsonly[j])
+ continue;
+
+ /* found a NULL in that dimension? */
+ if (heap_attisnull(data->rows[i], attrs->values[j]))
+ {
+ null_found = true;
+ null_dim = j;
+ break;
+ }
+ }
+
+ /* terminate if we found attribute with NULL values */
+ if (null_found)
+ break;
+ }
+
+ /* no regular dimension contains NULL values => we're done */
+ if (! null_found)
+ return;
+
+ /* walk through the rows again, count NULL values in 'null_dim' */
+ for (i = 0; i < data->numrows; i++)
+ {
+ if (heap_attisnull(data->rows[i], attrs->values[null_dim]))
+ null_count += 1;
+ }
+
+ Assert(null_count <= data->numrows);
+
+ /*
+ * If (null_count == numrows) the dimension already is NULL-only,
+ * but is not yet marked as such. It's enough to mark it and
+ * repeat the process recursively (until we run out of dimensions).
+ */
+ if (null_count == data->numrows)
+ {
+ bucket->nullsonly[null_dim] = true;
+ create_null_buckets(histogram, bucket_idx, attrs, stats);
+ return;
+ }
+
+ /*
+ * We have to split the bucket into two - one with NULL values in
+ * the dimension, one with non-NULL values. We don't need to sort
+ * the data or anything, but otherwise it's similar to what's done
+ * in partition_bucket().
+ */
+
+ /* create bucket with NULL-only dimension 'dim' */
+ null_bucket = copy_mv_bucket(bucket, histogram->ndimensions);
+ null_data = (HistogramBuild)null_bucket->build_data;
+
+ /* remember the current array info */
+ oldrows = data->rows;
+ numrows = data->numrows;
+
+ /* we'll keep non-NULL values in the current bucket */
+ data->numrows = (numrows - null_count);
+ data->rows
+ = (HeapTuple*)palloc0(data->numrows * sizeof(HeapTuple));
+
+ /* and the NULL values will go to the new one */
+ null_data->numrows = null_count;
+ null_data->rows
+ = (HeapTuple*)palloc0(null_data->numrows * sizeof(HeapTuple));
+
+ /* mark the dimension as NULL-only (in the new bucket) */
+ null_bucket->nullsonly[null_dim] = true;
+
+ /* walk through the sample rows and distribute them accordingly */
+ null_idx = 0;
+ curr_idx = 0;
+ for (i = 0; i < numrows; i++)
+ {
+ if (heap_attisnull(oldrows[i], attrs->values[null_dim]))
+ /* NULL => copy to the new bucket */
+ memcpy(&null_data->rows[null_idx++], &oldrows[i],
+ sizeof(HeapTuple));
+ else
+ memcpy(&data->rows[curr_idx++], &oldrows[i],
+ sizeof(HeapTuple));
+ }
+
+ /* update ndistinct values for the buckets (total and per dimension) */
+ update_bucket_ndistinct(bucket, attrs, stats);
+ update_bucket_ndistinct(null_bucket, attrs, stats);
+
+ /*
+ * TODO We don't need to do this for the dimension we used for split,
+ * because we know how many distinct values went to each
+ * bucket (NULL is not a value, so 0, and the other bucket got
+ * all the ndistinct values).
+ */
+ for (i = 0; i < histogram->ndimensions; i++)
+ {
+ update_dimension_ndistinct(bucket, i, attrs, stats, false);
+ update_dimension_ndistinct(null_bucket, i, attrs, stats, false);
+ }
+
+ pfree(oldrows);
+
+ /* add the NULL bucket to the histogram */
+ histogram->buckets[histogram->nbuckets++] = null_bucket;
+
+ /*
+ * And now run the function recursively on both buckets (the new
+ * one first, because the call may change the number of buckets, and
+ * it's used as an index).
+ */
+ create_null_buckets(histogram, (histogram->nbuckets-1), attrs, stats);
+ create_null_buckets(histogram, bucket_idx, attrs, stats);
+
+}
+
+/*
+ * We need to pass the SortSupport to the comparator, but bsearch()
+ * has no 'context' parameter, so we use a global variable (ugly).
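+ *
+ * A sketch of the intended use (the caller sets the global before
+ * searching the deduplicated values):
+ *
+ * ssup_private = &ssup;
+ * match = bsearch(&value, values, nvalues, sizeof(Datum),
+ * bsearch_comparator);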
+ */
+static int
+bsearch_comparator(const void * a, const void * b)
+{
+ Assert(ssup_private != NULL);
+ return compare_scalars_simple(a, b, (void*)ssup_private);
+}
+
+/*
+ * SRF with details about buckets of a histogram:
+ *
+ * - bucket ID (0 .. nbuckets-1)
+ * - min values (string array)
+ * - max values (string array)
+ * - nulls only (boolean array)
+ * - min inclusive flags (boolean array)
+ * - max inclusive flags (boolean array)
+ * - frequency (double precision)
+ *
+ * The input is the OID of the statistics, and there are no rows
+ * returned if the statistics contains no histogram (or if there's no
+ * statistics for the OID).
+ *
+ * The second parameter (type) determines what values will be returned
+ * in the (minvals, maxvals) columns. There are three possible values:
+ *
+ * 0 (actual values)
+ * -----------------
+ * - prints actual values
+ * - using the output function of the data type (as string)
+ * - handy for investigating the histogram
+ *
+ * 1 (distinct index)
+ * ------------------
+ * - prints index of the distinct value (into the serialized array)
+ * - makes it easier to spot neighbor buckets, etc.
+ * - handy for plotting the histogram
+ *
+ * 2 (normalized distinct index)
+ * -----------------------------
+ * - prints index of the distinct value, but normalized into [0,1]
+ * - similar to 1, but shows how 'long' the bucket range is
+ * - handy for plotting the histogram
+ *
+ * When plotting the histogram, be careful as the (1) and (2) options
+ * skew the lengths by distributing the distinct values uniformly. For
+ * data types without a clear meaning of 'distance' (e.g. strings) that
+ * is not a big deal, but for numbers it may be confusing.
+ */
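+
+/*
+ * For example (a sketch - the statistics OID comes from the
+ * pg_mv_statistic catalog):
+ *
+ * SELECT * FROM pg_mv_histogram_buckets(
+ *     (SELECT oid FROM pg_mv_statistic LIMIT 1), 0);
+ */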
+PG_FUNCTION_INFO_V1(pg_mv_histogram_buckets);
+
+Datum
+pg_mv_histogram_buckets(PG_FUNCTION_ARGS)
+{
+ FuncCallContext *funcctx;
+ int call_cntr;
+ int max_calls;
+ TupleDesc tupdesc;
+ AttInMetadata *attinmeta;
+
+ Oid mvoid = PG_GETARG_OID(0);
+ int otype = PG_GETARG_INT32(1);
+
+ if ((otype < 0) || (otype > 2))
+ elog(ERROR, "invalid output type specified");
+
+ /* stuff done only on the first call of the function */
+ if (SRF_IS_FIRSTCALL())
+ {
+ MemoryContext oldcontext;
+ MVSerializedHistogram histogram;
+
+ /* create a function context for cross-call persistence */
+ funcctx = SRF_FIRSTCALL_INIT();
+
+ /* switch to memory context appropriate for multiple function calls */
+ oldcontext = MemoryContextSwitchTo(funcctx->multi_call_memory_ctx);
+
+ histogram = load_mv_histogram(mvoid);
+
+ funcctx->user_fctx = histogram;
+
+ /* total number of tuples to be returned */
+ funcctx->max_calls = 0;
+ if (funcctx->user_fctx != NULL)
+ funcctx->max_calls = histogram->nbuckets;
+
+ /* Build a tuple descriptor for our result type */
+ if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("function returning record called in context "
+ "that cannot accept type record")));
+
+ /*
+ * generate attribute metadata needed later to produce tuples
+ * from raw C strings
+ */
+ attinmeta = TupleDescGetAttInMetadata(tupdesc);
+ funcctx->attinmeta = attinmeta;
+
+ MemoryContextSwitchTo(oldcontext);
+ }
+
+ /* stuff done on every call of the function */
+ funcctx = SRF_PERCALL_SETUP();
+
+ call_cntr = funcctx->call_cntr;
+ max_calls = funcctx->max_calls;
+ attinmeta = funcctx->attinmeta;
+
+ if (call_cntr < max_calls) /* do when there is more left to send */
+ {
+ char **values;
+ HeapTuple tuple;
+ Datum result;
+ int2vector *stakeys;
+ Oid relid;
+ double bucket_size = 1.0;
+
+ char *buff = palloc0(1024);
+ char *format;
+
+ int i;
+
+ Oid *outfuncs;
+ FmgrInfo *fmgrinfo;
+
+ MVSerializedHistogram histogram;
+ MVSerializedBucket bucket;
+
+ histogram = (MVSerializedHistogram)funcctx->user_fctx;
+
+ Assert(call_cntr < histogram->nbuckets);
+
+ bucket = histogram->buckets[call_cntr];
+
+ stakeys = find_mv_attnums(mvoid, &relid);
+
+ /*
+ * Prepare a values array for building the returned tuple.
+ * This should be an array of C strings which will
+ * be processed later by the type input functions.
+ */
+ values = (char **) palloc(9 * sizeof(char *));
+
+ values[0] = (char *) palloc(64 * sizeof(char));
+
+ /* arrays */
+ values[1] = (char *) palloc0(1024 * sizeof(char));
+ values[2] = (char *) palloc0(1024 * sizeof(char));
+ values[3] = (char *) palloc0(1024 * sizeof(char));
+ values[4] = (char *) palloc0(1024 * sizeof(char));
+ values[5] = (char *) palloc0(1024 * sizeof(char));
+
+ values[6] = (char *) palloc(64 * sizeof(char));
+ values[7] = (char *) palloc(64 * sizeof(char));
+ values[8] = (char *) palloc(64 * sizeof(char));
+
+ /* XXX only really needed when printing the actual values (otype == 0) */
+ outfuncs = (Oid*)palloc0(sizeof(Oid) * histogram->ndimensions);
+ fmgrinfo = (FmgrInfo*)palloc0(sizeof(FmgrInfo) * histogram->ndimensions);
+
+ for (i = 0; i < histogram->ndimensions; i++)
+ {
+ bool isvarlena;
+
+ getTypeOutputInfo(get_atttype(relid, stakeys->values[i]),
+ &outfuncs[i], &isvarlena);
+
+ fmgr_info(outfuncs[i], &fmgrinfo[i]);
+ }
+
+ snprintf(values[0], 64, "%d", call_cntr); /* bucket ID */
+
+ /*
+ * Print the bucket boundaries. For otype=0 this prints the actual
+ * values (using the output function of the attribute type), otherwise
+ * it prints indexes into the deduplicated arrays - those arrays are
+ * sorted, so even the indexes are quite useful.
+ */
+
+ for (i = 0; i < histogram->ndimensions; i++)
+ {
+ bucket_size *= (bucket->max[i] - bucket->min[i]) * 1.0
+ / (histogram->nvalues[i]-1);
+
+ /* print the actual values, i.e. use output function etc. */
+ if (otype == 0)
+ {
+ Datum minval, maxval;
+ Datum minout, maxout;
+
+ format = "%s, %s";
+ if (i == 0)
+ format = "{%s%s";
+ else if (i == histogram->ndimensions-1)
+ format = "%s, %s}";
+
+ minval = histogram->values[i][bucket->min[i]];
+ minout = FunctionCall1(&fmgrinfo[i], minval);
+
+ maxval = histogram->values[i][bucket->max[i]];
+ maxout = FunctionCall1(&fmgrinfo[i], maxval);
+
+ snprintf(buff, 1024, format, values[1], DatumGetPointer(minout));
+ strncpy(values[1], buff, 1023);
+ buff[0] = '\0';
+
+ snprintf(buff, 1024, format, values[2], DatumGetPointer(maxout));
+ strncpy(values[2], buff, 1023);
+ buff[0] = '\0';
+ }
+ else if (otype == 1)
+ {
+ format = "%s, %d";
+ if (i == 0)
+ format = "{%s%d";
+ else if (i == histogram->ndimensions-1)
+ format = "%s, %d}";
+
+ snprintf(buff, 1024, format, values[1], bucket->min[i]);
+ strncpy(values[1], buff, 1023);
+ buff[0] = '\0';
+
+ snprintf(buff, 1024, format, values[2], bucket->max[i]);
+ strncpy(values[2], buff, 1023);
+ buff[0] = '\0';
+ }
+ else
+ {
+ format = "%s, %f";
+ if (i == 0)
+ format = "{%s%f";
+ else if (i == histogram->ndimensions-1)
+ format = "%s, %f}";
+
+ snprintf(buff, 1024, format, values[1],
+ bucket->min[i] * 1.0 / (histogram->nvalues[i]-1));
+ strncpy(values[1], buff, 1023);
+ buff[0] = '\0';
+
+ snprintf(buff, 1024, format, values[2],
+ bucket->max[i] * 1.0 / (histogram->nvalues[i]-1));
+ strncpy(values[2], buff, 1023);
+ buff[0] = '\0';
+ }
+
+ format = "%s, %s";
+ if (i == 0)
+ format = "{%s%s";
+ else if (i == histogram->ndimensions-1)
+ format = "%s, %s}";
+
+ snprintf(buff, 1024, format, values[3], bucket->nullsonly[i] ? "t" : "f");
+ strncpy(values[3], buff, 1023);
+ buff[0] = '\0';
+
+ snprintf(buff, 1024, format, values[4], bucket->min_inclusive[i] ? "t" : "f");
+ strncpy(values[4], buff, 1023);
+ buff[0] = '\0';
+
+ snprintf(buff, 1024, format, values[5], bucket->max_inclusive[i] ? "t" : "f");
+ strncpy(values[5], buff, 1023);
+ buff[0] = '\0';
+ }
+
+ snprintf(values[6], 64, "%f", bucket->ntuples); /* frequency */
+ snprintf(values[7], 64, "%f", bucket->ntuples / bucket_size); /* density */
+ snprintf(values[8], 64, "%f", bucket_size); /* bucket_size */
+
+ /* build a tuple */
+ tuple = BuildTupleFromCStrings(attinmeta, values);
+
+ /* make the tuple into a datum */
+ result = HeapTupleGetDatum(tuple);
+
+ /* clean up (not strictly necessary - the memory context would
+ * release this anyway) */
+ pfree(values[0]);
+ pfree(values[1]);
+ pfree(values[2]);
+ pfree(values[3]);
+ pfree(values[4]);
+ pfree(values[5]);
+ pfree(values[6]);
+ pfree(values[7]);
+ pfree(values[8]);
+
+ pfree(values);
+
+ SRF_RETURN_NEXT(funcctx, result);
+ }
+ else /* do when there is no more left */
+ {
+ SRF_RETURN_DONE(funcctx);
+ }
+}
+
+#ifdef DEBUG_MVHIST
+/*
+ * prints debugging info about matched histogram buckets (full/partial)
+ *
+ * XXX Currently works only for INT data type.
+ */
+void
+debug_histogram_matches(MVSerializedHistogram mvhist, char *matches)
+{
+ int i, j;
+
+ float ffull = 0, fpartial = 0;
+ int nfull = 0, npartial = 0;
+
+ for (i = 0; i < mvhist->nbuckets; i++)
+ {
+ MVSerializedBucket bucket = mvhist->buckets[i];
+
+ char ranges[1024];
+
+ if (! matches[i])
+ continue;
+
+ /* increment the counters */
+ nfull += (matches[i] == MVSTATS_MATCH_FULL) ? 1 : 0;
+ npartial += (matches[i] == MVSTATS_MATCH_PARTIAL) ? 1 : 0;
+
+ /* and also update the frequencies */
+ ffull += (matches[i] == MVSTATS_MATCH_FULL) ? bucket->ntuples : 0;
+ fpartial += (matches[i] == MVSTATS_MATCH_PARTIAL) ? bucket->ntuples : 0;
+
+ memset(ranges, 0, sizeof(ranges));
+
+ /* build ranges for all the dimensions */
+ for (j = 0; j < mvhist->ndimensions; j++)
+ {
+ sprintf(ranges + strlen(ranges), " [%d %d]",
+ DatumGetInt32(mvhist->values[j][bucket->min[j]]),
+ DatumGetInt32(mvhist->values[j][bucket->max[j]]));
+ }
+
+ elog(WARNING, "bucket %d %s => %d [%f]", i, ranges, matches[i], bucket->ntuples);
+ }
+
+ elog(WARNING, "full=%f partial=%f (%f)", ffull, fpartial, (ffull + 0.5 * fpartial));
+}
+#endif
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 2c22d31..b693f36 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -2109,9 +2109,9 @@ describeOneTableDetails(const char *schemaname,
{
printfPQExpBuffer(&buf,
"SELECT oid, stanamespace::regnamespace AS nsp, staname, stakeys,\n"
- " deps_enabled, mcv_enabled,\n"
- " deps_built, mcv_built,\n"
- " mcv_max_items,\n"
+ " deps_enabled, mcv_enabled, hist_enabled,\n"
+ " deps_built, mcv_built, hist_built,\n"
+ " mcv_max_items, hist_max_buckets,\n"
" (SELECT string_agg(attname::text,', ')\n"
" FROM ((SELECT unnest(stakeys) AS attnum) s\n"
" JOIN pg_attribute a ON (starelid = a.attrelid and a.attnum = s.attnum))) AS attnums\n"
@@ -2154,8 +2154,17 @@ describeOneTableDetails(const char *schemaname,
first = false;
}
+ if (!strcmp(PQgetvalue(result, i, 6), "t"))
+ {
+ if (! first)
+ appendPQExpBuffer(&buf, ", histogram");
+ else
+ appendPQExpBuffer(&buf, "(histogram");
+ first = false;
+ }
+
appendPQExpBuffer(&buf, ") ON (%s)",
- PQgetvalue(result, i, 9));
+ PQgetvalue(result, i, 12));
printTableAddFooter(&cont, buf.data);
}
diff --git a/src/include/catalog/pg_mv_statistic.h b/src/include/catalog/pg_mv_statistic.h
index 3529b03..37f473f 100644
--- a/src/include/catalog/pg_mv_statistic.h
+++ b/src/include/catalog/pg_mv_statistic.h
@@ -39,13 +39,16 @@ CATALOG(pg_mv_statistic,3381)
/* statistics requested to build */
bool deps_enabled; /* analyze dependencies? */
bool mcv_enabled; /* build MCV list? */
+ bool hist_enabled; /* build histogram? */
- /* MCV size */
+ /* histogram / MCV size */
int32 mcv_max_items; /* max MCV items */
+ int32 hist_max_buckets; /* max histogram buckets */
/* statistics that are available (if requested) */
bool deps_built; /* dependencies were built */
bool mcv_built; /* MCV list was built */
+ bool hist_built; /* histogram was built */
/* variable-length fields start here, but we allow direct access to stakeys */
int2vector stakeys; /* array of column keys */
@@ -53,6 +56,7 @@ CATALOG(pg_mv_statistic,3381)
#ifdef CATALOG_VARLEN
bytea stadeps; /* dependencies (serialized) */
bytea stamcv; /* MCV list (serialized) */
+ bytea stahist; /* MV histogram (serialized) */
#endif
} FormData_pg_mv_statistic;
@@ -68,18 +72,22 @@ typedef FormData_pg_mv_statistic *Form_pg_mv_statistic;
* compiler constants for pg_mv_statistic
* ----------------
*/
-#define Natts_pg_mv_statistic 12
+#define Natts_pg_mv_statistic 16
#define Anum_pg_mv_statistic_starelid 1
#define Anum_pg_mv_statistic_staname 2
#define Anum_pg_mv_statistic_stanamespace 3
#define Anum_pg_mv_statistic_staowner 4
#define Anum_pg_mv_statistic_deps_enabled 5
#define Anum_pg_mv_statistic_mcv_enabled 6
-#define Anum_pg_mv_statistic_mcv_max_items 7
-#define Anum_pg_mv_statistic_deps_built 8
-#define Anum_pg_mv_statistic_mcv_built 9
-#define Anum_pg_mv_statistic_stakeys 10
-#define Anum_pg_mv_statistic_stadeps 11
-#define Anum_pg_mv_statistic_stamcv 12
+#define Anum_pg_mv_statistic_hist_enabled 7
+#define Anum_pg_mv_statistic_mcv_max_items 8
+#define Anum_pg_mv_statistic_hist_max_buckets 9
+#define Anum_pg_mv_statistic_deps_built 10
+#define Anum_pg_mv_statistic_mcv_built 11
+#define Anum_pg_mv_statistic_hist_built 12
+#define Anum_pg_mv_statistic_stakeys 13
+#define Anum_pg_mv_statistic_stadeps 14
+#define Anum_pg_mv_statistic_stamcv 15
+#define Anum_pg_mv_statistic_stahist 16
#endif /* PG_MV_STATISTIC_H */
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
index b16eebc..19a490a 100644
--- a/src/include/catalog/pg_proc.h
+++ b/src/include/catalog/pg_proc.h
@@ -2674,6 +2674,10 @@ DATA(insert OID = 3376 ( pg_mv_stats_mcvlist_info PGNSP PGUID 12 1 0 0 0 f f f
DESCR("multi-variate statistics: MCV list info");
DATA(insert OID = 3373 ( pg_mv_mcv_items PGNSP PGUID 12 1 1000 0 0 f f f f t t i s 1 0 2249 "26" "{26,23,1009,1000,701}" "{i,o,o,o,o}" "{oid,index,values,nulls,frequency}" _null_ _null_ pg_mv_mcv_items _null_ _null_ _null_ ));
DESCR("details about MCV list items");
+DATA(insert OID = 3375 ( pg_mv_stats_histogram_info PGNSP PGUID 12 1 0 0 0 f f f f t f i s 1 0 25 "17" _null_ _null_ _null_ _null_ _null_ pg_mv_stats_histogram_info _null_ _null_ _null_ ));
+DESCR("multi-variate statistics: histogram info");
+DATA(insert OID = 3374 ( pg_mv_histogram_buckets PGNSP PGUID 12 1 1000 0 0 f f f f t t i s 2 0 2249 "26 23" "{26,23,23,1009,1009,1000,1000,1000,701,701,701}" "{i,i,o,o,o,o,o,o,o,o,o}" "{oid,otype,index,minvals,maxvals,nullsonly,mininclusive,maxinclusive,frequency,density,bucket_size}" _null_ _null_ pg_mv_histogram_buckets _null_ _null_ _null_ ));
+DESCR("details about histogram buckets");
DATA(insert OID = 1928 ( pg_stat_get_numscans PGNSP PGUID 12 1 0 0 0 f f f f t f s r 1 0 20 "26" _null_ _null_ _null_ _null_ _null_ pg_stat_get_numscans _null_ _null_ _null_ ));
DESCR("statistics: number of scans done for table/index");
diff --git a/src/include/nodes/relation.h b/src/include/nodes/relation.h
index 2bcd582..8c50bfb 100644
--- a/src/include/nodes/relation.h
+++ b/src/include/nodes/relation.h
@@ -654,10 +654,12 @@ typedef struct MVStatisticInfo
/* enabled statistics */
bool deps_enabled; /* functional dependencies enabled */
bool mcv_enabled; /* MCV list enabled */
+ bool hist_enabled; /* histogram enabled */
/* built/available statistics */
bool deps_built; /* functional dependencies built */
bool mcv_built; /* MCV list built */
+ bool hist_built; /* histogram built */
/* columns in the statistics (attnums) */
int2vector *stakeys; /* attnums of the columns covered */
diff --git a/src/include/utils/mvstats.h b/src/include/utils/mvstats.h
index 4535db7..f05a517 100644
--- a/src/include/utils/mvstats.h
+++ b/src/include/utils/mvstats.h
@@ -92,6 +92,123 @@ typedef MCVListData *MCVList;
#define MVSTAT_MCVLIST_MAX_ITEMS 8192 /* max items in MCV list */
/*
+ * Multivariate histograms
+ */
+typedef struct MVBucketData {
+
+ /* Frequencies of this bucket. */
+ float ntuples; /* frequency of tuples in this bucket */
+
+ /*
+ * Information about dimensions being NULL-only. Not yet used.
+ */
+ bool *nullsonly;
+
+ /* lower boundaries - values and information about the inequalities */
+ Datum *min;
+ bool *min_inclusive;
+
+ /* upper boundaries - values and information about the inequalities */
+ Datum *max;
+ bool *max_inclusive;
+
+ /* used when building the histogram (not serialized/deserialized) */
+ void *build_data;
+
+} MVBucketData;
+
+typedef MVBucketData *MVBucket;
+
+
+typedef struct MVHistogramData {
+
+ uint32 magic; /* magic constant marker */
+ uint32 type; /* type of histogram (BASIC) */
+ uint32 nbuckets; /* number of buckets (buckets array) */
+ uint32 ndimensions; /* number of dimensions */
+
+ MVBucket *buckets; /* array of buckets */
+
+} MVHistogramData;
+
+typedef MVHistogramData *MVHistogram;
+
+/*
+ * Histogram in a partially serialized form, with deduplicated boundary
+ * values etc.
+ *
+ * TODO add more detailed description here
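+ *
+ * A rough sketch of the deduplication: if the boundary values in
+ * dimension 0 are {10, 20, 30}, then values[0] = {10, 20, 30} and
+ * nvalues[0] = 3, and a bucket with min[0] = 0 and max[0] = 2 covers
+ * the range [10, 30] in that dimension.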
+ */
+
+typedef struct MVSerializedBucketData {
+
+ /* Frequencies of this bucket. */
+ float ntuples; /* frequency of tuples in this bucket */
+
+ /*
+ * Information about dimensions being NULL-only. Not yet used.
+ */
+ bool *nullsonly;
+
+ /* lower boundaries - values and information about the inequalities */
+ uint16 *min;
+ bool *min_inclusive;
+
+ /* indexes of upper boundaries - values and information about the
+ * inequalities (exclusive vs. inclusive) */
+ uint16 *max;
+ bool *max_inclusive;
+
+} MVSerializedBucketData;
+
+typedef MVSerializedBucketData *MVSerializedBucket;
+
+typedef struct MVSerializedHistogramData {
+
+ uint32 magic; /* magic constant marker */
+ uint32 type; /* type of histogram (BASIC) */
+ uint32 nbuckets; /* number of buckets (buckets array) */
+ uint32 ndimensions; /* number of dimensions */
+
+ /*
+ * keep this the same as in MVHistogramData, because deserialization
+ * relies on the buckets field being at the same offset
+ */
+ MVSerializedBucket *buckets; /* array of buckets */
+
+ /*
+ * serialized boundary values, one array per dimension, deduplicated
+ * (the min/max indexes point into these arrays)
+ */
+ int *nvalues;
+ Datum **values;
+
+} MVSerializedHistogramData;
+
+typedef MVSerializedHistogramData *MVSerializedHistogram;
+
+
+/* used to flag stats serialized to bytea */
+#define MVSTAT_HIST_MAGIC 0x7F8C5670 /* marks serialized bytea */
+#define MVSTAT_HIST_TYPE_BASIC 1 /* basic histogram type */
+
+/*
+ * Limits used for max_buckets option, i.e. we're always guaranteed
+ * to have space for at least MVSTAT_HIST_MIN_BUCKETS, and we cannot
+ * have more than MVSTAT_HIST_MAX_BUCKETS buckets.
+ *
+ * This is just a boundary for the 'max' threshold - the actual
+ * histogram may use fewer buckets than MVSTAT_HIST_MAX_BUCKETS.
+ *
+ * TODO The MVSTAT_HIST_MIN_BUCKETS should be related to the number of
+ * attributes (MVSTATS_MAX_DIMENSIONS) because of NULL-buckets.
+ * There should be at least 2^N buckets, otherwise we may be unable
+ * to build the NULL buckets.
+ */
+#define MVSTAT_HIST_MIN_BUCKETS 128 /* min number of buckets */
+#define MVSTAT_HIST_MAX_BUCKETS 16384 /* max number of buckets */
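+
+/*
+ * For example (using the syntax from the regression tests), this
+ * requests a histogram with at most 1024 buckets:
+ *
+ * CREATE STATISTICS s ON t (a, b, c) WITH (histogram, max_buckets = 1024);
+ */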
+
+/*
* TODO Maybe fetching the histogram/MCV list separately is inefficient?
* Consider adding a single `fetch_stats` method, fetching all
* stats specified using flags (or something like that).
@@ -99,20 +216,25 @@ typedef MCVListData *MCVList;
MVDependencies load_mv_dependencies(Oid mvoid);
MCVList load_mv_mcvlist(Oid mvoid);
+MVSerializedHistogram load_mv_histogram(Oid mvoid);
bytea * serialize_mv_dependencies(MVDependencies dependencies);
bytea * serialize_mv_mcvlist(MCVList mcvlist, int2vector *attrs,
VacAttrStats **stats);
+bytea * serialize_mv_histogram(MVHistogram histogram, int2vector *attrs,
+ VacAttrStats **stats);
/* deserialization of stats (serialization is private to analyze) */
MVDependencies deserialize_mv_dependencies(bytea * data);
MCVList deserialize_mv_mcvlist(bytea * data);
+MVSerializedHistogram deserialize_mv_histogram(bytea * data);
/*
* Returns index of the attribute number within the vector (i.e. a
* dimension within the stats).
*/
int mv_get_index(AttrNumber varattno, int2vector * stakeys);
int2vector* find_mv_attnums(Oid mvoid, Oid *relid);
@@ -121,6 +243,8 @@ extern Datum pg_mv_stats_dependencies_info(PG_FUNCTION_ARGS);
extern Datum pg_mv_stats_dependencies_show(PG_FUNCTION_ARGS);
extern Datum pg_mv_stats_mcvlist_info(PG_FUNCTION_ARGS);
extern Datum pg_mv_mcvlist_items(PG_FUNCTION_ARGS);
+extern Datum pg_mv_stats_histogram_info(PG_FUNCTION_ARGS);
+extern Datum pg_mv_histogram_buckets(PG_FUNCTION_ARGS);
MVDependencies
build_mv_dependencies(int numrows, HeapTuple *rows, int2vector *attrs,
@@ -130,10 +254,20 @@ MCVList
build_mv_mcvlist(int numrows, HeapTuple *rows, int2vector *attrs,
VacAttrStats **stats, int *numrows_filtered);
+MVHistogram
+build_mv_histogram(int numrows, HeapTuple *rows, int2vector *attrs,
+ VacAttrStats **stats, int numrows_total);
+
void build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
int natts, VacAttrStats **vacattrstats);
-void update_mv_stats(Oid relid, MVDependencies dependencies, MCVList mcvlist,
+void update_mv_stats(Oid relid, MVDependencies dependencies,
+ MCVList mcvlist, MVHistogram histogram,
int2vector *attrs, VacAttrStats **stats);
+#ifdef DEBUG_MVHIST
+extern void debug_histogram_matches(MVSerializedHistogram mvhist, char *matches);
+#endif
+
#endif
diff --git a/src/test/regress/expected/mv_histogram.out b/src/test/regress/expected/mv_histogram.out
new file mode 100644
index 0000000..e830816
--- /dev/null
+++ b/src/test/regress/expected/mv_histogram.out
@@ -0,0 +1,207 @@
+-- data type passed by value
+CREATE TABLE mv_histogram (
+ a INT,
+ b INT,
+ c INT
+);
+-- unknown column
+CREATE STATISTICS s7 ON mv_histogram (unknown_column) WITH (histogram);
+ERROR: column "unknown_column" referenced in statistics does not exist
+-- single column
+CREATE STATISTICS s7 ON mv_histogram (a) WITH (histogram);
+ERROR: multivariate stats require 2 or more columns
+-- single column, duplicated
+CREATE STATISTICS s7 ON mv_histogram (a, a) WITH (histogram);
+ERROR: duplicate column name in statistics definition
+-- two columns, one duplicated
+CREATE STATISTICS s7 ON mv_histogram (a, a, b) WITH (histogram);
+ERROR: duplicate column name in statistics definition
+-- unknown option
+CREATE STATISTICS s7 ON mv_histogram (a, b, c) WITH (unknown_option);
+ERROR: unrecognized STATISTICS option "unknown_option"
+-- missing histogram statistics
+CREATE STATISTICS s7 ON mv_histogram (a, b, c) WITH (dependencies, max_buckets=200);
+ERROR: option 'histogram' is required by other options(s)
+-- invalid max_buckets value / too low
+CREATE STATISTICS s7 ON mv_histogram (a, b, c) WITH (mcv, max_buckets=10);
+ERROR: minimum number of buckets is 128
+-- invalid max_buckets value / too high
+CREATE STATISTICS s7 ON mv_histogram (a, b, c) WITH (mcv, max_buckets=100000);
+ERROR: maximum number of buckets is 16384
+-- correct command
+CREATE STATISTICS s7 ON mv_histogram (a, b, c) WITH (histogram);
+-- random data (no functional dependencies)
+INSERT INTO mv_histogram
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+ANALYZE mv_histogram;
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+ hist_enabled | hist_built
+--------------+------------
+ t | t
+(1 row)
+
+TRUNCATE mv_histogram;
+-- a => b, a => c, b => c
+INSERT INTO mv_histogram
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE mv_histogram;
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+ hist_enabled | hist_built
+--------------+------------
+ t | t
+(1 row)
+
+TRUNCATE mv_histogram;
+-- a => b, a => c
+INSERT INTO mv_histogram
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE mv_histogram;
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+ hist_enabled | hist_built
+--------------+------------
+ t | t
+(1 row)
+
+TRUNCATE mv_histogram;
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO mv_histogram
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX hist_idx ON mv_histogram (a, b);
+ANALYZE mv_histogram;
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+ hist_enabled | hist_built
+--------------+------------
+ t | t
+(1 row)
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mv_histogram WHERE a = 10 AND b = 5;
+ QUERY PLAN
+--------------------------------------------
+ Bitmap Heap Scan on mv_histogram
+ Recheck Cond: ((a = 10) AND (b = 5))
+ -> Bitmap Index Scan on hist_idx
+ Index Cond: ((a = 10) AND (b = 5))
+(4 rows)
+
+DROP TABLE mv_histogram;
+-- varlena type (text)
+CREATE TABLE mv_histogram (
+ a TEXT,
+ b TEXT,
+ c TEXT
+);
+CREATE STATISTICS s8 ON mv_histogram (a, b, c) WITH (histogram);
+-- random data (no functional dependencies)
+INSERT INTO mv_histogram
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+ANALYZE mv_histogram;
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+ hist_enabled | hist_built
+--------------+------------
+ t | t
+(1 row)
+
+TRUNCATE mv_histogram;
+-- a => b, a => c, b => c
+INSERT INTO mv_histogram
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE mv_histogram;
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+ hist_enabled | hist_built
+--------------+------------
+ t | t
+(1 row)
+
+TRUNCATE mv_histogram;
+-- a => b, a => c
+INSERT INTO mv_histogram
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE mv_histogram;
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+ hist_enabled | hist_built
+--------------+------------
+ t | t
+(1 row)
+
+TRUNCATE mv_histogram;
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO mv_histogram
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX hist_idx ON mv_histogram (a, b);
+ANALYZE mv_histogram;
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+ hist_enabled | hist_built
+--------------+------------
+ t | t
+(1 row)
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mv_histogram WHERE a = '10' AND b = '5';
+ QUERY PLAN
+------------------------------------------------------------
+ Bitmap Heap Scan on mv_histogram
+ Recheck Cond: ((a = '10'::text) AND (b = '5'::text))
+ -> Bitmap Index Scan on hist_idx
+ Index Cond: ((a = '10'::text) AND (b = '5'::text))
+(4 rows)
+
+TRUNCATE mv_histogram;
+-- check explain (expect bitmap index scan, not plain index scan) with NULLs
+INSERT INTO mv_histogram
+ SELECT
+ (CASE WHEN i/10000 = 0 THEN NULL ELSE i/10000 END),
+ (CASE WHEN i/20000 = 0 THEN NULL ELSE i/20000 END),
+ (CASE WHEN i/40000 = 0 THEN NULL ELSE i/40000 END)
+ FROM generate_series(1,1000000) s(i);
+ANALYZE mv_histogram;
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+ hist_enabled | hist_built
+--------------+------------
+ t | t
+(1 row)
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mv_histogram WHERE a IS NULL AND b IS NULL;
+ QUERY PLAN
+---------------------------------------------------
+ Bitmap Heap Scan on mv_histogram
+ Recheck Cond: ((a IS NULL) AND (b IS NULL))
+ -> Bitmap Index Scan on hist_idx
+ Index Cond: ((a IS NULL) AND (b IS NULL))
+(4 rows)
+
+DROP TABLE mv_histogram;
+-- NULL values (mix of int and text columns)
+CREATE TABLE mv_histogram (
+ a INT,
+ b TEXT,
+ c INT,
+ d TEXT
+);
+CREATE STATISTICS s9 ON mv_histogram (a, b, c, d) WITH (histogram);
+INSERT INTO mv_histogram
+ SELECT
+ mod(i, 100),
+ (CASE WHEN mod(i, 200) = 0 THEN NULL ELSE mod(i,200) END),
+ mod(i, 400),
+ (CASE WHEN mod(i, 300) = 0 THEN NULL ELSE mod(i,600) END)
+ FROM generate_series(1,10000) s(i);
+ANALYZE mv_histogram;
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+ hist_enabled | hist_built
+--------------+------------
+ t | t
+(1 row)
+
+DROP TABLE mv_histogram;
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 66071d8..1a1a4ca 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1375,7 +1375,9 @@ pg_mv_stats| SELECT n.nspname AS schemaname,
length(s.stadeps) AS depsbytes,
pg_mv_stats_dependencies_info(s.stadeps) AS depsinfo,
length(s.stamcv) AS mcvbytes,
- pg_mv_stats_mcvlist_info(s.stamcv) AS mcvinfo
+ pg_mv_stats_mcvlist_info(s.stamcv) AS mcvinfo,
+ length(s.stahist) AS histbytes,
+ pg_mv_stats_histogram_info(s.stahist) AS histinfo
FROM ((pg_mv_statistic s
JOIN pg_class c ON ((c.oid = s.starelid)))
LEFT JOIN pg_namespace n ON ((n.oid = c.relnamespace)));
diff --git a/src/test/regress/parallel_schedule b/src/test/regress/parallel_schedule
index 85d94f1..a885235 100644
--- a/src/test/regress/parallel_schedule
+++ b/src/test/regress/parallel_schedule
@@ -112,4 +112,4 @@ test: event_trigger
test: stats
# run tests of multivariate stats
-test: mv_dependencies mv_mcv
+test: mv_dependencies mv_mcv mv_histogram
diff --git a/src/test/regress/serial_schedule b/src/test/regress/serial_schedule
index 6584d73..2efdcd7 100644
--- a/src/test/regress/serial_schedule
+++ b/src/test/regress/serial_schedule
@@ -164,3 +164,4 @@ test: event_trigger
test: stats
test: mv_dependencies
test: mv_mcv
+test: mv_histogram
diff --git a/src/test/regress/sql/mv_histogram.sql b/src/test/regress/sql/mv_histogram.sql
new file mode 100644
index 0000000..27c2510
--- /dev/null
+++ b/src/test/regress/sql/mv_histogram.sql
@@ -0,0 +1,176 @@
+-- data type passed by value
+CREATE TABLE mv_histogram (
+ a INT,
+ b INT,
+ c INT
+);
+
+-- unknown column
+CREATE STATISTICS s7 ON mv_histogram (unknown_column) WITH (histogram);
+
+-- single column
+CREATE STATISTICS s7 ON mv_histogram (a) WITH (histogram);
+
+-- single column, duplicated
+CREATE STATISTICS s7 ON mv_histogram (a, a) WITH (histogram);
+
+-- two columns, one duplicated
+CREATE STATISTICS s7 ON mv_histogram (a, a, b) WITH (histogram);
+
+-- unknown option
+CREATE STATISTICS s7 ON mv_histogram (a, b, c) WITH (unknown_option);
+
+-- missing histogram statistics
+CREATE STATISTICS s7 ON mv_histogram (a, b, c) WITH (dependencies, max_buckets=200);
+
+-- invalid max_buckets value / too low
+CREATE STATISTICS s7 ON mv_histogram (a, b, c) WITH (mcv, max_buckets=10);
+
+-- invalid max_buckets value / too high
+CREATE STATISTICS s7 ON mv_histogram (a, b, c) WITH (mcv, max_buckets=100000);
+
+-- correct command
+CREATE STATISTICS s7 ON mv_histogram (a, b, c) WITH (histogram);
+
+-- random data (no functional dependencies)
+INSERT INTO mv_histogram
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+
+ANALYZE mv_histogram;
+
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+
+TRUNCATE mv_histogram;
+
+-- a => b, a => c, b => c
+INSERT INTO mv_histogram
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+
+ANALYZE mv_histogram;
+
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+
+TRUNCATE mv_histogram;
+
+-- a => b, a => c
+INSERT INTO mv_histogram
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE mv_histogram;
+
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+
+TRUNCATE mv_histogram;
+
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO mv_histogram
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX hist_idx ON mv_histogram (a, b);
+ANALYZE mv_histogram;
+
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mv_histogram WHERE a = 10 AND b = 5;
+
+DROP TABLE mv_histogram;
+
+-- varlena type (text)
+CREATE TABLE mv_histogram (
+ a TEXT,
+ b TEXT,
+ c TEXT
+);
+
+CREATE STATISTICS s8 ON mv_histogram (a, b, c) WITH (histogram);
+
+-- random data (no functional dependencies)
+INSERT INTO mv_histogram
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+
+ANALYZE mv_histogram;
+
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+
+TRUNCATE mv_histogram;
+
+-- a => b, a => c, b => c
+INSERT INTO mv_histogram
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+
+ANALYZE mv_histogram;
+
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+
+TRUNCATE mv_histogram;
+
+-- a => b, a => c
+INSERT INTO mv_histogram
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE mv_histogram;
+
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+
+TRUNCATE mv_histogram;
+
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO mv_histogram
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX hist_idx ON mv_histogram (a, b);
+ANALYZE mv_histogram;
+
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mv_histogram WHERE a = '10' AND b = '5';
+
+TRUNCATE mv_histogram;
+
+-- check explain (expect bitmap index scan, not plain index scan) with NULLs
+INSERT INTO mv_histogram
+ SELECT
+ (CASE WHEN i/10000 = 0 THEN NULL ELSE i/10000 END),
+ (CASE WHEN i/20000 = 0 THEN NULL ELSE i/20000 END),
+ (CASE WHEN i/40000 = 0 THEN NULL ELSE i/40000 END)
+ FROM generate_series(1,1000000) s(i);
+ANALYZE mv_histogram;
+
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mv_histogram WHERE a IS NULL AND b IS NULL;
+
+DROP TABLE mv_histogram;
+
+-- NULL values (mix of int and text columns)
+CREATE TABLE mv_histogram (
+ a INT,
+ b TEXT,
+ c INT,
+ d TEXT
+);
+
+CREATE STATISTICS s9 ON mv_histogram (a, b, c, d) WITH (histogram);
+
+INSERT INTO mv_histogram
+ SELECT
+ mod(i, 100),
+ (CASE WHEN mod(i, 200) = 0 THEN NULL ELSE mod(i,200) END),
+ mod(i, 400),
+ (CASE WHEN mod(i, 300) = 0 THEN NULL ELSE mod(i,600) END)
+ FROM generate_series(1,10000) s(i);
+
+ANALYZE mv_histogram;
+
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+
+DROP TABLE mv_histogram;
--
2.1.0
0006-multi-statistics-estimation.patch
From 3a564dbf9aa2c734d80c5e385f105cf8a48da1f5 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@pgaddict.com>
Date: Fri, 6 Feb 2015 01:42:38 +0100
Subject: [PATCH 6/9] multi-statistics estimation
The general idea is that a probability (which is what selectivity is)
can be split into a product of conditional probabilities like this:
P(A & B & C) = P(A & B) * P(C|A & B)
If we assume that C and B are independent, the last part may be
simplified like this
P(A & B & C) = P(A & B) * P(C|A)
so we only need probabilities on [A,B] and [C,A] to compute the
original probability.
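For instance (a made-up example), suppose each of the three conditions
matches 10% of the table, and that B and C are perfectly determined by
A. The independence assumption then gives 0.1 * 0.1 * 0.1 = 0.001,
while the conditional form gives P(A & B) * P(C|A) = 0.1 * 1.0 = 0.1,
which is the correct selectivity.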
The implementation works in the other direction, though. We know what
probability P(A & B & C) we need to compute, and also what statistics
are available.
So we search for a combination of statistics covering the clauses in
an optimal way (most clauses covered, most dependencies exploited).
There are two possible approaches - exhaustive and greedy. The
exhaustive one walks through all permutations of stats using dynamic
programming, so it's guaranteed to find the optimal solution, but it
soon gets very slow as it's roughly O(N!). The dynamic programming may
improve that a bit, but it's still far too expensive for large numbers
of statistics (on a single table).
The greedy algorithm is very simple - at every step it chooses the
statistic that looks best at that point. That may not guarantee the
globally best solution (but maybe it does?), but it only needs N steps
to find the solution, so it's very
fast (processing the selected stats is usually way more expensive).
There's a GUC for selecting the search algorithm
mvstat_search = {'greedy', 'exhaustive'}
The default value is 'greedy' as that's much safer (with respect to
runtime). See choose_mv_statistics().
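For instance, to try the exhaustive search in a session:
    SET mvstat_search = 'exhaustive';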
Once we have found a sequence of statistics, we apply them to the
clauses using the conditional probabilities. We process the selected
stats one by one, and for each we select the clauses to estimate and
the conditions to apply. See clauselist_selectivity() for more details.
Limitations
-----------
It's still true that each clause at a given level has to be covered by
a single MV statistic. So with this query
WHERE (clause1) AND (clause2) AND (clause3 OR clause4)
each parenthesized clause has to be covered by a single multivariate
statistic.
Clauses not covered by a single statistic at this level will be passed
to clause_selectivity() but this will treat them as a collection of
simpler clauses (connected by AND or OR), and the clauses from the
previous level will be used as conditions.
So using the same example, the last clause will be passed to
clause_selectivity() with 'clause1' and 'clause2' as conditions, and it
will be processed using multivariate stats if possible.
The other limitation is that all the expressions have to be
mv-compatible, i.e. there can't be a mix of expressions. If this is
violated, the clause may be passed to the next level (just like with
list of clauses not covered by a single statistics), which splits that
into clauses handled by multivariate stats and clauses handler by
regular statistics.
rework clauselist_selectivity_or to handle OR-clauses correctly
We might invent a completely new set of functions here, resembling
clauselist_selectivity but adapting the ideas to OR-clauses.
But luckily we know that each OR-clause
(a OR b OR c)
may be rewritten as an equivalent AND-clause using negation:
NOT ((NOT a) AND (NOT b) AND (NOT c))
And that's something we can pass to clauselist_selectivity.
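So the selectivity of the OR-clause may be computed as
    P(a OR b OR c) = 1 - P((NOT a) AND (NOT b) AND (NOT c))
with the AND-clause estimated by clauselist_selectivity.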
---
contrib/file_fdw/file_fdw.c | 3 +-
contrib/postgres_fdw/postgres_fdw.c | 11 +-
src/backend/optimizer/path/clausesel.c | 1990 ++++++++++++++++++++++++++------
src/backend/optimizer/path/costsize.c | 23 +-
src/backend/optimizer/util/orclauses.c | 4 +-
src/backend/utils/adt/selfuncs.c | 17 +-
src/backend/utils/misc/guc.c | 20 +
src/backend/utils/mvstats/README.stats | 166 +++
src/include/optimizer/cost.h | 6 +-
src/include/utils/mvstats.h | 8 +
10 files changed, 1890 insertions(+), 358 deletions(-)
diff --git a/contrib/file_fdw/file_fdw.c b/contrib/file_fdw/file_fdw.c
index dc035d7..8f11b7a 100644
--- a/contrib/file_fdw/file_fdw.c
+++ b/contrib/file_fdw/file_fdw.c
@@ -969,7 +969,8 @@ estimate_size(PlannerInfo *root, RelOptInfo *baserel,
baserel->baserestrictinfo,
0,
JOIN_INNER,
- NULL);
+ NULL,
+ NIL);
nrows = clamp_row_est(nrows);
diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c
index 76d0e15..e78f140 100644
--- a/contrib/postgres_fdw/postgres_fdw.c
+++ b/contrib/postgres_fdw/postgres_fdw.c
@@ -498,7 +498,8 @@ postgresGetForeignRelSize(PlannerInfo *root,
fpinfo->local_conds,
baserel->relid,
JOIN_INNER,
- NULL);
+ NULL,
+ NIL);
cost_qual_eval(&fpinfo->local_conds_cost, fpinfo->local_conds, root);
@@ -2149,7 +2150,8 @@ estimate_path_cost_size(PlannerInfo *root,
local_param_join_conds,
foreignrel->relid,
JOIN_INNER,
- NULL);
+ NULL,
+ NIL);
local_sel *= fpinfo->local_conds_sel;
rows = clamp_row_est(rows * local_sel);
@@ -3618,7 +3620,8 @@ postgresGetForeignJoinPaths(PlannerInfo *root,
fpinfo->local_conds,
0,
JOIN_INNER,
- NULL);
+ NULL,
+ NIL);
cost_qual_eval(&fpinfo->local_conds_cost, fpinfo->local_conds, root);
/*
@@ -3637,7 +3640,7 @@ postgresGetForeignJoinPaths(PlannerInfo *root,
*/
fpinfo->joinclause_sel = clauselist_selectivity(root, fpinfo->joinclauses,
0, fpinfo->jointype,
- extra->sjinfo);
+ extra->sjinfo, NIL);
}
fpinfo->server = GetForeignServer(joinrel->serverid);
diff --git a/src/backend/optimizer/path/clausesel.c b/src/backend/optimizer/path/clausesel.c
index 0de2418..c1b8999 100644
--- a/src/backend/optimizer/path/clausesel.c
+++ b/src/backend/optimizer/path/clausesel.c
@@ -29,6 +29,8 @@
#include "utils/selfuncs.h"
#include "utils/typcache.h"
+#include "miscadmin.h"
+
/*
* Data structure for accumulating info about possible range-query
@@ -44,6 +46,13 @@ typedef struct RangeQueryClause
Selectivity hibound; /* Selectivity of a var < something clause */
} RangeQueryClause;
+static Selectivity clauselist_selectivity_or(PlannerInfo *root,
+ List *clauses,
+ int varRelid,
+ JoinType jointype,
+ SpecialJoinInfo *sjinfo,
+ List *conditions);
+
static void addRangeClause(RangeQueryClause **rqlist, Node *clause,
bool varonleft, bool isLTsel, Selectivity s2);
@@ -60,23 +69,25 @@ static int count_mv_attnums(List *clauses, Index relid, int type);
static int count_varnos(List *clauses, Index *relid);
+static List *clauses_matching_statistic(List **clauses, MVStatisticInfo *statistic,
+ Index relid, int types, bool remove);
+
static List *clauselist_apply_dependencies(PlannerInfo *root, List *clauses,
Index relid, List *stats);
-static MVStatisticInfo *choose_mv_statistics(List *mvstats, Bitmapset *attnums);
-
-static List *clauselist_mv_split(PlannerInfo *root, Index relid,
- List *clauses, List **mvclauses,
- MVStatisticInfo *mvstats, int types);
-
static Selectivity clauselist_mv_selectivity(PlannerInfo *root,
- List *clauses, MVStatisticInfo *mvstats);
+ MVStatisticInfo *mvstats, List *clauses,
+ List *conditions, bool is_or);
static Selectivity clauselist_mv_selectivity_mcvlist(PlannerInfo *root,
- List *clauses, MVStatisticInfo *mvstats,
- bool *fullmatch, Selectivity *lowsel);
+ MVStatisticInfo *mvstats,
+ List *clauses, List *conditions,
+ bool is_or, bool *fullmatch,
+ Selectivity *lowsel);
static Selectivity clauselist_mv_selectivity_histogram(PlannerInfo *root,
- List *clauses, MVStatisticInfo *mvstats);
+ MVStatisticInfo *mvstats,
+ List *clauses, List *conditions,
+ bool is_or);
static int update_match_bitmap_mcvlist(PlannerInfo *root, List *clauses,
int2vector *stakeys, MCVList mcvlist,
@@ -90,10 +101,33 @@ static int update_match_bitmap_histogram(PlannerInfo *root, List *clauses,
int nmatches, char * matches,
bool is_or);
+/*
+ * Describes a combination of multiple statistics used to cover the
+ * attributes referenced by the clauses. The 'stats' array (with nstats
+ * elements) lists the statistics in the order they are applied, and
+ * nclauses/nconditions track how many clauses and conditions the
+ * solution covers.
+ *
+ * choose_mv_statistics_exhaustive() uses this to track both the current
+ * and the best solution, while walking through the space of possible
+ * combinations.
+ */
+typedef struct mv_solution_t {
+ int nclauses; /* number of clauses covered */
+ int nconditions; /* number of conditions covered */
+ int nstats; /* number of stats applied */
+ int *stats; /* stats (in the apply order) */
+} mv_solution_t;
+
+static List *choose_mv_statistics(PlannerInfo *root, Index relid,
+ List *mvstats, List *clauses, List *conditions);
+
static bool has_stats(List *stats, int type);
static List * find_stats(PlannerInfo *root, Index relid);
+static bool stats_type_matches(MVStatisticInfo *stat, int type);
+
+int mvstat_search_type = MVSTAT_SEARCH_GREEDY;
/* used for merging bitmaps - AND (min), OR (max) */
#define MAX(x, y) (((x) > (y)) ? (x) : (y))
@@ -168,14 +202,15 @@ clauselist_selectivity(PlannerInfo *root,
List *clauses,
int varRelid,
JoinType jointype,
- SpecialJoinInfo *sjinfo)
+ SpecialJoinInfo *sjinfo,
+ List *conditions)
{
Selectivity s1 = 1.0;
RangeQueryClause *rqlist = NULL;
ListCell *l;
/* processing mv stats */
- Oid relid = InvalidOid;
+ Index relid = InvalidOid;
/* list of multivariate stats on the relation */
List *stats = NIL;
@@ -191,12 +226,13 @@ clauselist_selectivity(PlannerInfo *root,
stats = find_stats(root, relid);
/*
- * If there's exactly one clause, then no use in trying to match up pairs,
- * so just go directly to clause_selectivity().
+ * If there's exactly one clause, then no use in trying to match up
+ * pairs, or matching multivariate statistics, so just go directly
+ * to clause_selectivity().
*/
if (list_length(clauses) == 1)
return clause_selectivity(root, (Node *) linitial(clauses),
- varRelid, jointype, sjinfo);
+ varRelid, jointype, sjinfo, conditions);
/*
* Apply functional dependencies, but first check that there are some stats
@@ -228,31 +264,96 @@ clauselist_selectivity(PlannerInfo *root,
(count_mv_attnums(clauses, relid,
MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST) >= 2))
{
- /* collect attributes from the compatible conditions */
- Bitmapset *mvattnums = collect_mv_attnums(clauses, relid,
- MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST);
+ ListCell *s;
+
+ /*
+ * Copy the conditions we got from the upper part of the expression tree
+ * so that we can add local conditions to it (we need to keep the
+ * original list intact, for sibling expressions - other expressions
+ * at the same level).
+ */
+ List *conditions_local = list_copy(conditions);
- /* and search for the statistic covering the most attributes */
- MVStatisticInfo *mvstat = choose_mv_statistics(stats, mvattnums);
+ /* find the best combination of statistics */
+ List *solution = choose_mv_statistics(root, relid, stats,
+ clauses, conditions);
- if (mvstat != NULL) /* we have a matching stats */
+ /*
+ * We have a good solution, which is merely a list of statistics that
+ * we need to apply. We'll apply the statistics one by one (in the
+ * order in which they appear in the list), and for each statistic we'll
+ *
+ * (1) find clauses compatible with the statistic (and remove them
+ * from the list)
+ *
+ * (2) find local conditions compatible with the statistic
+ *
+ * (3) do the estimation P(clauses | conditions)
+ *
+ * (4) append the estimated clauses to the local conditions
+ *
+ * so the set of conditions grows continuously as we walk through
+ * the list of statistics.
+ */
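+ /*
+ * A sketch of the decomposition on a hypothetical example: given
+ * statistics on (a,b) and (b,c), and clauses (a=1), (b=1), (c=1),
+ * the loop below effectively computes
+ *
+ * P(a=1,b=1,c=1) = P(a=1,b=1) * P(c=1 | b=1)
+ *
+ * where the first term comes from the (a,b) statistics, and the
+ * second term from the (b,c) statistics, with (b=1) as a condition.
+ */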
+ foreach (s, solution)
{
- /* clauses compatible with multi-variate stats */
- List *mvclauses = NIL;
+ MVStatisticInfo *mvstat = (MVStatisticInfo *)lfirst(s);
- /* split the clauselist into regular and mv-clauses */
- clauses = clauselist_mv_split(root, relid, clauses, &mvclauses,
- mvstat, MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST);
+ /* clauses compatible with the statistic we're applying right now */
+ List *stat_clauses = NIL;
+ List *stat_conditions = NIL;
- /* we've chosen the histogram to match the clauses */
- Assert(mvclauses != NIL);
+ /*
+ * Find clauses and conditions matching the statistic - the clauses
+ * need to be removed from the list, while conditions should remain
+ * there (so that we can apply them repeatedly).
+ */
+ stat_clauses
+ = clauses_matching_statistic(&clauses, mvstat, relid,
+ MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST,
+ true);
+
+ stat_conditions
+ = clauses_matching_statistic(&conditions_local, mvstat, relid,
+ MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST,
+ false);
+
+ /*
+ * If we got no clauses to estimate, we've done something wrong -
+ * either during the optimization, when detecting compatible clauses,
+ * or somewhere else.
+ *
+ * Also, we need at least two attributes in clauses and conditions.
+ */
+ Assert(stat_clauses != NIL);
+ Assert(count_mv_attnums(list_union(stat_clauses, stat_conditions),
+ relid, MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST) >= 2);
/* compute the multivariate stats */
- s1 *= clauselist_mv_selectivity(root, mvclauses, mvstat);
+ s1 *= clauselist_mv_selectivity(root, mvstat,
+ stat_clauses, stat_conditions,
+ false); /* AND */
+
+ /*
+ * Add the new clauses to the local conditions, so that we can use
+ * them for the subsequent statistics. We only add the clauses,
+ * because the conditions are already there (or should be).
+ */
+ conditions_local = list_concat(conditions_local, stat_clauses);
}
+
+ /* from now on, work only with the 'local' list of conditions */
+ conditions = conditions_local;
}
/*
+ * If there's exactly one clause left after the multivariate estimation,
+ * there's no use in trying to match up pairs, so just go directly to
+ * clause_selectivity().
+ */
+ if (list_length(clauses) == 1)
+ return s1 * clause_selectivity(root, (Node *) linitial(clauses),
+ varRelid, jointype, sjinfo, conditions);
+
+ /*
* Initial scan over clauses. Anything that doesn't look like a potential
* rangequery clause gets multiplied into s1 and forgotten. Anything that
* does gets inserted into an rqlist entry.
@@ -264,7 +365,8 @@ clauselist_selectivity(PlannerInfo *root,
Selectivity s2;
/* Always compute the selectivity using clause_selectivity */
- s2 = clause_selectivity(root, clause, varRelid, jointype, sjinfo);
+ s2 = clause_selectivity(root, clause, varRelid, jointype, sjinfo,
+ conditions);
/*
* Check for being passed a RestrictInfo.
@@ -423,6 +525,55 @@ clauselist_selectivity(PlannerInfo *root,
}
/*
+ * Similar to clauselist_selectivity(), but for OR-clauses. We can't simply
+ * reuse the multi-statistic estimation logic for AND-clauses, at least not
+ * directly, because there are a few key differences:
+ *
+ * - functional dependencies don't really apply to OR-clauses
+ *
+ * - clauselist_selectivity() is based on decomposing the selectivity into
+ * a sequence of conditional probabilities (selectivities), but that can
+ * be done only for AND-clauses
+ *
+ * We might invent a similar infrastructure for optimizing OR-clauses, doing
+ * something similar to what clauselist_selectivity does for AND-clauses,
+ * but luckily De Morgan's laws tell us that each disjunction (OR-clause)
+ *
+ * (a OR b OR c)
+ *
+ * may be rewritten as an equivalent negated conjunction (AND-clause):
+ *
+ * NOT ((NOT a) AND (NOT b) AND (NOT c))
+ *
+ * And that's something we can pass to clauselist_selectivity and let it do
+ * all the heavy lifting.
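+ *
+ * A quick sanity check on a hypothetical example: with two independent
+ * clauses of selectivities s(a) = 0.1 and s(b) = 0.2, the rewrite gives
+ *
+ * 1.0 - (1.0 - 0.1) * (1.0 - 0.2) = 0.28
+ *
+ * which matches the usual s1 + s2 - s1*s2 formula for OR-clauses.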
+ */
+static Selectivity
+clauselist_selectivity_or(PlannerInfo *root,
+ List *clauses,
+ int varRelid,
+ JoinType jointype,
+ SpecialJoinInfo *sjinfo,
+ List *conditions)
+{
+ List *args = NIL;
+ ListCell *l;
+ Expr *expr;
+
+ /* build arguments for the AND-clause by negating args of the OR-clause */
+ foreach (l, clauses)
+ args = lappend(args, makeBoolExpr(NOT_EXPR, list_make1(lfirst(l)), -1));
+
+ /* and then build the AND-clause over the negated args */
+ expr = makeBoolExpr(AND_EXPR, args, -1);
+
+ /* instead of constructing the NOT expression on top, just compute (1.0 - s) */
+ return 1.0 - clauselist_selectivity(root, list_make1(expr), varRelid,
+ jointype, sjinfo, conditions);
+}
+
+/*
* addRangeClause --- add a new range clause for clauselist_selectivity
*
* Here is where we try to match up pairs of range-query clauses
@@ -629,7 +780,8 @@ clause_selectivity(PlannerInfo *root,
Node *clause,
int varRelid,
JoinType jointype,
- SpecialJoinInfo *sjinfo)
+ SpecialJoinInfo *sjinfo,
+ List *conditions)
{
Selectivity s1 = 0.5; /* default for any unhandled clause type */
RestrictInfo *rinfo = NULL;
@@ -749,7 +901,8 @@ clause_selectivity(PlannerInfo *root,
(Node *) get_notclausearg((Expr *) clause),
varRelid,
jointype,
- sjinfo);
+ sjinfo,
+ conditions);
}
else if (and_clause(clause))
{
@@ -758,29 +911,18 @@ clause_selectivity(PlannerInfo *root,
((BoolExpr *) clause)->args,
varRelid,
jointype,
- sjinfo);
+ sjinfo,
+ conditions);
}
else if (or_clause(clause))
{
- /*
- * Selectivities for an OR clause are computed as s1+s2 - s1*s2 to
- * account for the probable overlap of selected tuple sets.
- *
- * XXX is this too conservative?
- */
- ListCell *arg;
-
- s1 = 0.0;
- foreach(arg, ((BoolExpr *) clause)->args)
- {
- Selectivity s2 = clause_selectivity(root,
- (Node *) lfirst(arg),
- varRelid,
- jointype,
- sjinfo);
-
- s1 = s1 + s2 - s1 * s2;
- }
+ /* just call to clauselist_selectivity_or() */
+ s1 = clauselist_selectivity_or(root,
+ ((BoolExpr *) clause)->args,
+ varRelid,
+ jointype,
+ sjinfo,
+ conditions);
}
else if (is_opclause(clause) || IsA(clause, DistinctExpr))
{
@@ -870,7 +1012,8 @@ clause_selectivity(PlannerInfo *root,
(Node *) ((RelabelType *) clause)->arg,
varRelid,
jointype,
- sjinfo);
+ sjinfo,
+ conditions);
}
else if (IsA(clause, CoerceToDomain))
{
@@ -879,7 +1022,8 @@ clause_selectivity(PlannerInfo *root,
(Node *) ((CoerceToDomain *) clause)->arg,
varRelid,
jointype,
- sjinfo);
+ sjinfo,
+ conditions);
}
else
{
@@ -943,15 +1087,16 @@ clause_selectivity(PlannerInfo *root,
* in the MCV list, then the selectivity is below the lowest frequency
* found in the MCV list,
*
- * TODO When applying the clauses to the histogram/MCV list, we can do
- * that from the most selective clauses first, because that'll
- * eliminate the buckets/items sooner (so we'll be able to skip
- * them without inspection, which is more expensive). But this
- * requires really knowing the per-clause selectivities in advance,
- * and that's not what we do now.
+ * TODO When applying the clauses to the histogram/MCV list, we can do that from
+ * the most selective clauses first, because that'll eliminate the
+ * buckets/items sooner (so we'll be able to skip them without inspection,
+ * which is more expensive). But this requires really knowing the
+ * per-clause selectivities in advance, and that's not what we do now.
+ *
*/
static Selectivity
-clauselist_mv_selectivity(PlannerInfo *root, List *clauses, MVStatisticInfo *mvstats)
+clauselist_mv_selectivity(PlannerInfo *root, MVStatisticInfo *mvstats,
+ List *clauses, List *conditions, bool is_or)
{
bool fullmatch = false;
Selectivity s1 = 0.0, s2 = 0.0;
@@ -969,7 +1114,8 @@ clauselist_mv_selectivity(PlannerInfo *root, List *clauses, MVStatisticInfo *mvs
*/
/* Evaluate the MCV first. */
- s1 = clauselist_mv_selectivity_mcvlist(root, clauses, mvstats,
+ s1 = clauselist_mv_selectivity_mcvlist(root, mvstats,
+ clauses, conditions, is_or,
&fullmatch, &mcv_low);
/*
@@ -982,7 +1128,8 @@ clauselist_mv_selectivity(PlannerInfo *root, List *clauses, MVStatisticInfo *mvs
/* TODO if (fullmatch) without matching MCV item, use the mcv_low
* selectivity as upper bound */
- s2 = clauselist_mv_selectivity_histogram(root, clauses, mvstats);
+ s2 = clauselist_mv_selectivity_histogram(root, mvstats,
+ clauses, conditions, is_or);
/* TODO clamp to <= 1.0 (or more strictly, when possible) */
return s1 + s2;
@@ -1016,260 +1163,1325 @@ get_varattnos(Node * node, Index relid)
k + FirstLowInvalidHeapAttributeNumber);
}
- bms_free(varattnos);
+ bms_free(varattnos);
+
+ return result;
+}
+
+/*
+ * Collect attributes from mv-compatible clauses.
+ */
+static Bitmapset *
+collect_mv_attnums(List *clauses, Index relid, int types)
+{
+ Bitmapset *attnums = NULL;
+ ListCell *l;
+
+ /*
+ * Walk through the clauses and identify the ones we can estimate
+ * using multivariate stats, and remember the relid/columns. We'll
+ * then cross-check if we have suitable stats, and only if needed
+ * we'll split the clauses into multivariate and regular lists.
+ *
+ * For now we're only interested in RestrictInfo nodes with nested
+ * OpExpr, using either a range or equality.
+ */
+ foreach (l, clauses)
+ {
+ Node *clause = (Node *) lfirst(l);
+
+ /* ignore the result here - we only need the attnums */
+ clause_is_mv_compatible(clause, relid, &attnums, types);
+ }
+
+ /*
+ * If there are not at least two attributes referenced by the clause(s),
+ * we can throw everything out (as we'll revert to simple stats).
+ */
+ if (bms_num_members(attnums) <= 1)
+ {
+ bms_free(attnums);
+ attnums = NULL;
+ }
+
+ return attnums;
+}
+
+/*
+ * Count the number of attributes in clauses compatible with multivariate stats.
+ */
+static int
+count_mv_attnums(List *clauses, Index relid, int type)
+{
+ int c;
+ Bitmapset *attnums = collect_mv_attnums(clauses, relid, type);
+
+ c = bms_num_members(attnums);
+
+ bms_free(attnums);
+
+ return c;
+}
+
+/*
+ * Count varnos referenced in the clauses, and if there's a single varno then
+ * return the index in 'relid'.
+ */
+static int
+count_varnos(List *clauses, Index *relid)
+{
+ int cnt;
+ Bitmapset *varnos = NULL;
+
+ varnos = pull_varnos((Node *) clauses);
+ cnt = bms_num_members(varnos);
+
+ /* if there's a single varno in the clauses, remember it */
+ if (bms_num_members(varnos) == 1)
+ *relid = bms_singleton_member(varnos);
+
+ bms_free(varnos);
+
+ return cnt;
+}
+
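+/*
+ * Find the clauses whose attributes are fully covered by the statistic,
+ * and return them as a new list. With remove=true the matching clauses
+ * are also deleted from the input list - that's what we do for estimated
+ * clauses, while conditions are kept in place so that the following
+ * statistics can reuse them.
+ */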
+static List *
+clauses_matching_statistic(List **clauses, MVStatisticInfo *statistic,
+ Index relid, int types, bool remove)
+{
+ int i;
+ Bitmapset *stat_attnums = NULL;
+ List *matching_clauses = NIL;
+ ListCell *lc;
+
+ /* build attnum bitmapset for this statistics */
+ for (i = 0; i < statistic->stakeys->dim1; i++)
+ stat_attnums = bms_add_member(stat_attnums,
+ statistic->stakeys->values[i]);
+
+ /*
+ * We can't use foreach here, because we may need to remove some of the
+ * clauses if (remove=true).
+ */
+ lc = list_head(*clauses);
+ while (lc)
+ {
+ Node *clause = (Node*)lfirst(lc);
+ Bitmapset *attnums = NULL;
+
+ /* must advance lc before list_delete possibly pfree's it */
+ lc = lnext(lc);
+
+ /*
+ * skip clauses that are not compatible with stats (just leave them
+ * in the original list)
+ *
+ * XXX Perhaps this should check what stats are actually available in
+ * the statistics (not a big deal now, because MCV and histograms
+ * handle the same types of conditions).
+ */
+ if (! clause_is_mv_compatible(clause, relid, &attnums, types))
+ {
+ bms_free(attnums);
+ continue;
+ }
+
+ /* if the clause is covered by the statistic, add it to the list */
+ if (bms_is_subset(attnums, stat_attnums))
+ {
+ matching_clauses = lappend(matching_clauses, clause);
+
+ /* if remove=true, remove the matching item from the main list */
+ if (remove)
+ *clauses = list_delete_ptr(*clauses, clause);
+ }
+
+ bms_free(attnums);
+ }
+
+ bms_free(stat_attnums);
+
+ return matching_clauses;
+}
+
+/*
+ * Selects the best combination of multivariate statistics, in an exhaustive
+ * way, where 'best' means:
+ *
+ * (a) covering the most attributes (referenced by clauses)
+ * (b) using the least number of multivariate stats
+ * (c) using the most conditions to exploit dependency
+ *
+ * Don't call this directly but through choose_mv_statistics(), which does some
+ * additional tricks to minimize the runtime.
+ *
+ *
+ * Algorithm
+ * ---------
+ * The algorithm is a recursive implementation of backtracking, with maximum
+ * depth equal to the number of multi-variate statistics available on the table.
+ * It actually explores all valid combinations of stats.
+ *
+ * Whenever it considers adding the next statistics, the clauses it matches are
+ * divided into 'conditions' (clauses already matched by at least one previous
+ * statistics) and clauses that are estimated.
+ *
+ * Then several checks are performed:
+ *
+ * (a) The statistics covers at least 2 columns referenced in the estimated
+ * clauses (otherwise multi-variate stats are useless).
+ *
+ * (b) The statistics covers at least 1 new column, i.e. a column not referenced
+ * by the already used stats (and the new column has to be referenced by
+ * the clauses, of course). Otherwise the statistics would not add any new
+ * information.
+ *
+ * There are some other sanity checks (e.g. stats must not be used twice etc.).
+ *
+ *
+ * Weaknesses
+ * ----------
+ * The current implementation uses a rather simple optimality criterion, so
+ * it may not make the best choice when
+ *
+ * (a) There may be multiple solutions with the same number of covered
+ * attributes and number of statistics (e.g. the same solution but with
+ * statistics in a different order). It's unclear which solution is the
+ * best one - in a sense, all of them are equal.
+ *
+ * TODO It might be possible to compute estimate for each of those solutions,
+ * and then combine them to get the final estimate (e.g. by using average
+ * or median).
+ *
+ * (b) Does not consider that some types of stats are a better match for some
+ * types of clauses (e.g. MCV list is generally a better match for equality
+ * conditions than a histogram).
+ *
+ * But maybe this is pointless - generally, each column is either a label
+ * (it's not important whether because of the data type or how it's used),
+ * or a value with ordering that makes sense. So either a MCV list is more
+ * appropriate (labels) or a histogram (values with orderings).
+ *
+ * Not sure what to do with statistics on columns mixing both types of data
+ * (some columns would work best with MCVs, some with histograms). Maybe we
+ * could invent a new type of statistics combining MCV list and histogram
+ * (keeping a small histogram for each MCV item, and a separate histogram
+ * for values not on the MCV list).
+ *
+ * TODO The algorithm should probably count number of Vars (not just attnums)
+ * when computing the 'score' of each solution. Computing the ratio of
+ * (num of all vars) / (num of condition vars) as a measure of how well
+ * the solution uses conditions might be useful.
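+ *
+ * To illustrate the search space on a hypothetical example: with three
+ * statistics S0, S1 and S2, the backtracking explores (prefixes of) the
+ * sequences
+ *
+ * [S0], [S0,S1], [S0,S1,S2], [S0,S2], [S0,S2,S1], [S1], [S1,S0], ...
+ *
+ * pruning branches where the next statistics covers no new clauses or
+ * references fewer than two attributes.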
+ */
+static void
+choose_mv_statistics_exhaustive(PlannerInfo *root, int step,
+ int nmvstats, MVStatisticInfo *mvstats, Bitmapset ** stats_attnums,
+ int nclauses, Node ** clauses, Bitmapset ** clauses_attnums,
+ int nconditions, Node ** conditions, Bitmapset ** conditions_attnums,
+ bool *cover_map, bool *condition_map, int *ruled_out,
+ mv_solution_t *current, mv_solution_t **best)
+{
+ int i, j;
+
+ Assert(best != NULL);
+ Assert((step == 0 && current == NULL) || (step > 0 && current != NULL));
+
+ /* this may run for a long time, so let's make it interruptible */
+ CHECK_FOR_INTERRUPTS();
+
+ if (current == NULL)
+ {
+ current = (mv_solution_t*)palloc0(sizeof(mv_solution_t));
+ current->stats = (int*)palloc0(sizeof(int)*nmvstats);
+ current->nstats = 0;
+ current->nclauses = 0;
+ current->nconditions = 0;
+ }
+
+ /*
+ * Now try to apply each statistics, matching at least two attributes,
+ * unless it's already used in one of the previous steps.
+ */
+ for (i = 0; i < nmvstats; i++)
+ {
+ int c;
+
+ int ncovered_clauses = 0; /* number of covered clauses */
+ int ncovered_conditions = 0; /* number of covered conditions */
+ int nattnums = 0; /* number of covered attributes */
+
+ Bitmapset *all_attnums = NULL;
+ Bitmapset *new_attnums = NULL;
+
+ /* skip statistics that were already used or eliminated */
+ if (ruled_out[i] != -1)
+ continue;
+
+ /*
+ * See if we have clauses covered by this statistics, but not
+ * yet covered by any of the preceding ones.
+ */
+ for (c = 0; c < nclauses; c++)
+ {
+ bool covered = false;
+ Bitmapset *clause_attnums = clauses_attnums[c];
+ Bitmapset *tmp = NULL;
+
+ /*
+ * If this clause is not covered by this stats, we can't
+ * use the stats to estimate it at all.
+ */
+ if (! cover_map[i * nclauses + c])
+ continue;
+
+ /*
+ * Now we know we'll use this clause - either as a condition
+ * or as a new clause (the estimated one). So let's add the
+ * attributes to the attnums from all the clauses usable with
+ * this statistics.
+ */
+ tmp = bms_union(all_attnums, clause_attnums);
+
+ /* free the old bitmap */
+ bms_free(all_attnums);
+ all_attnums = tmp;
+
+ /* let's see if it's covered by any of the previous stats */
+ for (j = 0; j < step; j++)
+ {
+ /* already covered by the previous stats */
+ if (cover_map[current->stats[j] * nclauses + c])
+ covered = true;
+
+ if (covered)
+ break;
+ }
+
+ /* if already covered, continue with the next clause */
+ if (covered)
+ {
+ ncovered_conditions += 1;
+ continue;
+ }
+
+ /*
+ * OK, this clause is covered by this statistics (and not by
+ * any of the previous ones)
+ */
+ ncovered_clauses += 1;
+
+ /* XXX attnums from 'new clauses' are not tracked at the moment */
+ /* new_attnums = bms_union(new_attnums, clause_attnums); */
+ }
+
+ /* can't have more new clauses than original clauses */
+ Assert(nclauses >= ncovered_clauses);
+ Assert(ncovered_clauses >= 0); /* mostly paranoia */
+
+ nattnums = bms_num_members(all_attnums);
+
+ /* free all the bitmapsets - we don't need them anymore */
+ bms_free(all_attnums);
+ bms_free(new_attnums);
+
+ all_attnums = NULL;
+ new_attnums = NULL;
+
+ /*
+ * Now do the same for the conditions - see which of them are
+ * covered by this statistics.
+ */
+ for (c = 0; c < nconditions; c++)
+ {
+ Bitmapset *clause_attnums = conditions_attnums[c];
+ Bitmapset *tmp = NULL;
+
+ /*
+ * If this condition is not covered by this stats, we can't
+ * use the stats to apply it at all.
+ */
+ if (! condition_map[i * nconditions + c])
+ continue;
+
+ /* count this as a condition */
+ ncovered_conditions += 1;
+
+ /*
+ * Now we know we'll use this clause - either as a condition
+ * or as a new clause (the estimated one). So let's add the
+ * attributes to the attnums from all the clauses usable with
+ * this statistics.
+ */
+ tmp = bms_union(all_attnums, clause_attnums);
+
+ /* free the old bitmap */
+ bms_free(all_attnums);
+ all_attnums = tmp;
+ }
+
+ /*
+ * Let's mark the statistics as 'ruled out' - either we'll use
+ * it (and proceed to the next step), or it's incompatible.
+ */
+ ruled_out[i] = step;
+
+ /*
+ * There are no clauses usable with this statistics (i.e. not
+ * already covered by some of the previous stats).
+ *
+ * Similarly, if the clauses only use a single attribute, we
+ * can't really use that.
+ */
+ if ((ncovered_clauses == 0) || (nattnums < 2))
+ continue;
+
+ /*
+ * TODO Not sure if it's possible to add a clause referencing only
+ * attributes already covered by the previous stats, introducing
+ * only a new dependency and no new attribute. Couldn't come up
+ * with an example, though. Might be worth adding an assert.
+ */
+
+ /*
+ * got a suitable statistics - let's update the current solution,
+ * maybe use it as the best solution
+ */
+ current->nclauses += ncovered_clauses;
+ current->nconditions += ncovered_conditions;
+ current->nstats += 1;
+ current->stats[step] = i;
+
+ /*
+ * We can never cover more clauses, or use more stats, than we
+ * actually have at the beginning.
+ */
+ Assert(nclauses >= current->nclauses);
+ Assert(nmvstats >= current->nstats);
+ Assert(step < nmvstats);
+
+ if (*best == NULL)
+ {
+ *best = (mv_solution_t*)palloc0(sizeof(mv_solution_t));
+ (*best)->stats = (int*)palloc0(sizeof(int)*nmvstats);
+ (*best)->nstats = 0;
+ (*best)->nclauses = 0;
+ (*best)->nconditions = 0;
+ }
+
+ /* see if it's better than the current 'best' solution */
+ if ((current->nclauses > (*best)->nclauses) ||
+ ((current->nclauses == (*best)->nclauses) &&
+ ((current->nstats < (*best)->nstats))))
+ {
+ (*best)->nstats = current->nstats;
+ (*best)->nclauses = current->nclauses;
+ (*best)->nconditions = current->nconditions;
+ memcpy((*best)->stats, current->stats, nmvstats * sizeof(int));
+ }
+
+ /*
+ * Recurse only if there are more statistics to apply - each step
+ * uses one, so after nmvstats steps there's nothing left to add.
+ if ((step + 1) < nmvstats)
+ choose_mv_statistics_exhaustive(root, step+1,
+ nmvstats, mvstats, stats_attnums,
+ nclauses, clauses, clauses_attnums,
+ nconditions, conditions, conditions_attnums,
+ cover_map, condition_map, ruled_out,
+ current, best);
+
+ /* reset the last step */
+ current->nclauses -= ncovered_clauses;
+ current->nconditions -= ncovered_conditions;
+ current->nstats -= 1;
+ current->stats[step] = 0;
+
+ /* mark the statistics as usable again */
+ ruled_out[i] = -1;
+
+ Assert(current->nclauses >= 0);
+ Assert(current->nstats >= 0);
+ }
+
+ /* reset all statistics eliminated in this step */
+ for (i = 0; i < nmvstats; i++)
+ if (ruled_out[i] == step)
+ ruled_out[i] = -1;
+
+}
+
+/*
+ * Greedy search for a multivariate solution - a sequence of statistics covering
+ * the clauses. This chooses the "best" statistics at each step, so the
+ * resulting solution may not be the best solution globally, but this produces
+ * the solution in only N steps (where N is the number of statistics), while
+ * the exhaustive approach may have to walk through ~N! combinations (although
+ * some of those are terminated early).
+ *
+ * See the comments at choose_mv_statistics_exhaustive() as this does the same
+ * thing (but in a different way).
+ *
+ * Don't call this directly, but through choose_mv_statistics().
+ *
+ * TODO There are probably other metrics we might use - e.g. using number of
+ * columns (num_cond_columns / num_cov_columns), which might work better
+ * with a mix of simple and complex clauses.
+ *
+ * TODO Also the choice at the very first step should be handled in a special
+ * way, because there will be 0 conditions at that moment, so there needs
+ * to be some other criteria - e.g. using the simplest (or most complex?)
+ * clause might be a good idea.
+ *
+ * TODO We might also select multiple stats using different criteria, and branch
+ * the search. This is however tricky, because if we choose k statistics at
+ * each step, we get k^N branches to walk through (with N steps). That's
+ * not really good with large number of stats (yet better than exhaustive
+ * search).
+ */
+static void
+choose_mv_statistics_greedy(PlannerInfo *root, int step,
+ int nmvstats, MVStatisticInfo *mvstats, Bitmapset ** stats_attnums,
+ int nclauses, Node ** clauses, Bitmapset ** clauses_attnums,
+ int nconditions, Node ** conditions, Bitmapset ** conditions_attnums,
+ bool *cover_map, bool *condition_map, int *ruled_out,
+ mv_solution_t *current, mv_solution_t **best)
+{
+ int i, j;
+ int best_stat = -1;
+ double gain, max_gain = -1.0;
+
+ /*
+ * Bitmap tracking which clauses are already covered (by the previous
+ * statistics) and may thus serve only as a condition in this step.
+ */
+ bool *covered_clauses = (bool*)palloc0(nclauses * sizeof(bool));
+
+ /*
+ * Number of clauses and columns covered by each statistics - this
+ * includes both conditions and clauses covered by the statistics for
+ * the first time. The number of columns may count some columns
+ * repeatedly - if a column is shared by multiple clauses, it will
+ * be counted once for each clause (covered by the statistics).
+ * So with two clauses [(a=1 OR b=2),(a<2 OR c>1)] the column "a"
+ * will be counted twice (if both clauses are covered).
+ *
+ * The values for ruled-out statistics (those that can't be applied)
+ * are not computed, because that'd be pointless.
+ */
+ int *num_cov_clauses = (int*)palloc0(sizeof(int) * nmvstats);
+ int *num_cov_columns = (int*)palloc0(sizeof(int) * nmvstats);
+
+ /*
+ * Same as above, but this only includes clauses that are already
+ * covered by the previous stats (and the current one).
+ */
+ int *num_cond_clauses = (int*)palloc0(sizeof(int) * nmvstats);
+ int *num_cond_columns = (int*)palloc0(sizeof(int) * nmvstats);
+
+ /*
+ * Number of attributes for each clause.
+ *
+ * TODO Might be computed in choose_mv_statistics() and then passed
+ * here, but then the function would not have the same signature
+ * as _exhaustive().
+ */
+ int *attnum_counts = (int*)palloc0(sizeof(int) * nclauses);
+ int *attnum_cond_counts = (int*)palloc0(sizeof(int) * nconditions);
+
+ CHECK_FOR_INTERRUPTS();
+
+ Assert(best != NULL);
+ Assert((step == 0 && current == NULL) || (step > 0 && current != NULL));
+
+ /* compute attributes (columns) for each clause */
+ for (i = 0; i < nclauses; i++)
+ attnum_counts[i] = bms_num_members(clauses_attnums[i]);
+
+ /* compute attributes (columns) for each condition */
+ for (i = 0; i < nconditions; i++)
+ attnum_cond_counts[i] = bms_num_members(conditions_attnums[i]);
+
+ /* see which clauses are already covered at this point (by previous stats) */
+ for (i = 0; i < step; i++)
+ for (j = 0; j < nclauses; j++)
+ covered_clauses[j] |= (cover_map[current->stats[i] * nclauses + j]);
+
+ /* which remaining statistics covers most clauses / uses most conditions? */
+ for (i = 0; i < nmvstats; i++)
+ {
+ Bitmapset *attnums_covered = NULL;
+ Bitmapset *attnums_conditions = NULL;
+
+ /* skip stats that are already ruled out (either used or inapplicable) */
+ if (ruled_out[i] != -1)
+ continue;
+
+ /* count covered clauses and conditions (for the statistics) */
+ for (j = 0; j < nclauses; j++)
+ {
+ if (cover_map[i * nclauses + j])
+ {
+ Bitmapset *attnums_new
+ = bms_union(attnums_covered, clauses_attnums[j]);
+
+ /* get rid of the old bitmap and keep the unified result */
+ bms_free(attnums_covered);
+ attnums_covered = attnums_new;
+
+ num_cov_clauses[i] += 1;
+ num_cov_columns[i] += attnum_counts[j];
+
+ /* is the clause already covered (i.e. a condition)? */
+ if (covered_clauses[j])
+ {
+ num_cond_clauses[i] += 1;
+ num_cond_columns[i] += attnum_counts[j];
+ attnums_new = bms_union(attnums_conditions,
+ clauses_attnums[j]);
+
+ bms_free(attnums_conditions);
+ attnums_conditions = attnums_new;
+ }
+ }
+ }
+
+ /* if all covered clauses are covered by prev stats (thus conditions) */
+ if (num_cov_clauses[i] == num_cond_clauses[i])
+ ruled_out[i] = step;
+
+ /* same if there are no new attributes */
+ else if (bms_num_members(attnums_conditions) == bms_num_members(attnums_covered))
+ ruled_out[i] = step;
+
+ bms_free(attnums_covered);
+ bms_free(attnums_conditions);
+
+ /* if the statistics is inapplicable, try the next one */
+ if (ruled_out[i] != -1)
+ continue;
+
+ /* now let's walk through conditions and count the covered */
+ for (j = 0; j < nconditions; j++)
+ {
+ if (condition_map[i * nconditions + j])
+ {
+ num_cond_clauses[i] += 1;
+ num_cond_columns[i] += attnum_cond_counts[j];
+ }
+ }
+
+ /* otherwise see if this statistics improves the gain metric */
+ gain = num_cond_columns[i] / (double)num_cov_columns[i];
+
+ if (gain > max_gain)
+ {
+ max_gain = gain;
+ best_stat = i;
+ }
+ }
+
+ /*
+ * Have we found a suitable statistics? Add it to the solution and
+ * try the next step.
+ */
+ if (best_stat != -1)
+ {
+ /* mark the statistics, so that we skip it in next steps */
+ ruled_out[best_stat] = step;
+
+ /* allocate current solution if necessary */
+ if (current == NULL)
+ {
+ current = (mv_solution_t*)palloc0(sizeof(mv_solution_t));
+ current->stats = (int*)palloc0(sizeof(int)*nmvstats);
+ current->nstats = 0;
+ current->nclauses = 0;
+ current->nconditions = 0;
+ }
+
+ current->nclauses += num_cov_clauses[best_stat];
+ current->nconditions += num_cond_clauses[best_stat];
+ current->stats[step] = best_stat;
+ current->nstats++;
+
+ if (*best == NULL)
+ {
+ (*best) = (mv_solution_t*)palloc0(sizeof(mv_solution_t));
+ (*best)->nstats = current->nstats;
+ (*best)->nclauses = current->nclauses;
+ (*best)->nconditions = current->nconditions;
+
+ (*best)->stats = (int*)palloc0(sizeof(int)*nmvstats);
+ memcpy((*best)->stats, current->stats, nmvstats * sizeof(int));
+ }
+ else
+ {
+ /* see if this is a better solution */
+ double current_gain = (double)current->nconditions / current->nclauses;
+ double best_gain = (double)(*best)->nconditions / (*best)->nclauses;
+
+ if ((current_gain > best_gain) ||
+ ((current_gain == best_gain) && (current->nstats < (*best)->nstats)))
+ {
+ (*best)->nstats = current->nstats;
+ (*best)->nclauses = current->nclauses;
+ (*best)->nconditions = current->nconditions;
+ memcpy((*best)->stats, current->stats, nmvstats * sizeof(int));
+ }
+ }
+
+ /*
+ * Recurse only if there are more statistics to apply - each step
+ * uses one, so after nmvstats steps there's nothing left to add.
+ if ((step + 1) < nmvstats)
+ choose_mv_statistics_greedy(root, step+1,
+ nmvstats, mvstats, stats_attnums,
+ nclauses, clauses, clauses_attnums,
+ nconditions, conditions, conditions_attnums,
+ cover_map, condition_map, ruled_out,
+ current, best);
+
+ /* reset the last step */
+ current->nclauses -= num_cov_clauses[best_stat];
+ current->nconditions -= num_cond_clauses[best_stat];
+ current->nstats -= 1;
+ current->stats[step] = 0;
+
+ /* mark the statistics as usable again */
+ ruled_out[best_stat] = -1;
+ }
+
+ /* reset all statistics eliminated in this step */
+ for (i = 0; i < nmvstats; i++)
+ if (ruled_out[i] == step)
+ ruled_out[i] = -1;
+
+ /* free everything allocated in this step */
+ pfree(covered_clauses);
+ pfree(attnum_counts);
+ pfree(attnum_cond_counts);
+ pfree(num_cov_clauses);
+ pfree(num_cov_columns);
+ pfree(num_cond_clauses);
+ pfree(num_cond_columns);
+}
+
+/*
+ * Remove clauses not covered by any of the available statistics
+ *
+ * This helps us to reduce the amount of work done in choose_mv_statistics()
+ * by not having to deal with clauses that can't possibly be useful.
+ */
+static List *
+filter_clauses(PlannerInfo *root, Index relid, int type,
+ List *stats, List *clauses, Bitmapset **attnums)
+{
+ ListCell *c;
+ ListCell *s;
+
+ /* results (list of compatible clauses, attnums) */
+ List *rclauses = NIL;
+
+ foreach (c, clauses)
+ {
+ Node *clause = (Node*)lfirst(c);
+ Bitmapset *clause_attnums = NULL;
+
+ /*
+ * We do assume that thanks to previous checks, we should not run into
+ * clauses that are incompatible with multivariate stats here. We also
+ * need to collect the attnums for the clause.
+ *
+ * XXX Maybe turn this into an assert?
+ */
+ if (! clause_is_mv_compatible(clause, relid, &clause_attnums, type))
+ elog(ERROR, "should not get non-mv-compatible cluase");
+
+ /* Is there a multivariate statistics covering the clause? */
+ foreach (s, stats)
+ {
+ int k, matches = 0;
+ MVStatisticInfo *stat = (MVStatisticInfo *)lfirst(s);
+
+ /* skip statistics not matching the required type */
+ if (! stats_type_matches(stat, type))
+ continue;
+
+ /*
+ * see if all clause attributes are covered by the statistic
+ *
+ * We'll do that in the opposite direction, i.e. we'll see how many
+ * attributes of the statistic are referenced in the clause, and then
+ * compare the counts.
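+ *
+ * E.g. for a statistic on (a,b,c) and a clause referencing (a,c),
+ * two of the statistic's attributes are found in the clause, which
+ * equals bms_num_members(clause_attnums), so the clause is covered.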
+ */
+ for (k = 0; k < stat->stakeys->dim1; k++)
+ if (bms_is_member(stat->stakeys->values[k], clause_attnums))
+ matches += 1;
+
+ /*
+ * If the number of matches equals the number of attributes referenced
+ * by the clause, then the clause is covered by the statistic.
+ */
+ if (bms_num_members(clause_attnums) == matches)
+ {
+ *attnums = bms_union(*attnums, clause_attnums);
+ rclauses = lappend(rclauses, clause);
+ break;
+ }
+ }
+
+ bms_free(clause_attnums);
+ }
+
+ /* we can't have more compatible clauses than we started with */
+ Assert(list_length(clauses) >= list_length(rclauses));
+
+ return rclauses;
+}
+
+/*
+ * Remove statistics not covering any new clauses
+ *
+ * Statistics not covering any new clauses (conditions don't count) are not
+ * really useful, so let's ignore them. Also, we need the statistics to
+ * reference at least two different attributes (both in conditions and clauses
+ * combined), and at least one of them in the clauses alone.
+ *
+ * This check might be made more strict by checking against individual clauses,
+ * because by using the bitmapsets of all attnums we may actually use attnums
+ * from clauses that are not covered by the statistics. For example, we may
+ * have a condition
+ *
+ * (a=1 AND b=2)
+ *
+ * and a new clause
+ *
+ * (c=1 AND d=1)
+ *
+ * With only bitmapsets, statistics on [b,c] will pass through this (assuming
+ * there are some statistics covering both clauses).
+ *
+ * Parameters:
+ *
+ * stats - list of statistics to filter
+ * new_attnums - attnums referenced in new clauses
+ * all_attnums - attnums referenced by conditions and new clauses combined
+ *
+ * Returns filtered list of statistics.
+ *
+ * TODO Do the more strict check, i.e. walk through individual clauses and
+ * conditions and only use those covered by the statistics.
+ */
+static List *
+filter_stats(List *stats, Bitmapset *new_attnums, Bitmapset *all_attnums)
+{
+ ListCell *s;
+ List *stats_filtered = NIL;
+
+ foreach (s, stats)
+ {
+ int k;
+ int matches_new = 0,
+ matches_all = 0;
+
+ MVStatisticInfo *stat = (MVStatisticInfo *)lfirst(s);
+
+ /* see how many attributes the statistics covers */
+ for (k = 0; k < stat->stakeys->dim1; k++)
+ {
+ /* attributes from new clauses */
+ if (bms_is_member(stat->stakeys->values[k], new_attnums))
+ matches_new += 1;
+
+ /* attributes from conditions and new clauses combined */
+ if (bms_is_member(stat->stakeys->values[k], all_attnums))
+ matches_all += 1;
+ }
+
+ /* check we have enough attributes for this statistics */
+ if ((matches_new >= 1) && (matches_all >= 2))
+ stats_filtered = lappend(stats_filtered, stat);
+ }
+
+ /* we can't have more useful stats than we had originally */
+ Assert(list_length(stats) >= list_length(stats_filtered));
+
+ return stats_filtered;
+}
+
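+/*
+ * Convert a list of MVStatisticInfo elements into a plain array, setting
+ * *nmvstats to the number of elements. The optimization code addresses
+ * the statistics by index, so an array is easier to work with.
+ */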
+static MVStatisticInfo *
+make_stats_array(List *stats, int *nmvstats)
+{
+ int i;
+ ListCell *l;
+
+ MVStatisticInfo *mvstats = NULL;
+ *nmvstats = list_length(stats);
+
+ mvstats
+ = (MVStatisticInfo*)palloc0((*nmvstats) * sizeof(MVStatisticInfo));
+
+ i = 0;
+ foreach (l, stats)
+ {
+ MVStatisticInfo *stat = (MVStatisticInfo *)lfirst(l);
+ memcpy(&mvstats[i++], stat, sizeof(MVStatisticInfo));
+ }
+
+ return mvstats;
+}
+
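+/*
+ * Build a bitmapset of attnums for each statistics, so that the coverage
+ * checks can use cheap bms_is_subset() calls instead of walking the
+ * stakeys arrays over and over.
+ */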
+static Bitmapset **
+make_stats_attnums(MVStatisticInfo *mvstats, int nmvstats)
+{
+ int i, j;
+ Bitmapset **stats_attnums = NULL;
+
+ Assert(nmvstats > 0);
- return result;
+ /* build bitmaps of attnums for the stats (easier to compare) */
+ stats_attnums = (Bitmapset **)palloc0(nmvstats * sizeof(Bitmapset*));
+
+ for (i = 0; i < nmvstats; i++)
+ for (j = 0; j < mvstats[i].stakeys->dim1; j++)
+ stats_attnums[i]
+ = bms_add_member(stats_attnums[i],
+ mvstats[i].stakeys->values[j]);
+
+ return stats_attnums;
}
+
/*
- * Collect attributes from mv-compatible clauses.
+ * Remove redundant statistics
+ *
+ * If there are multiple statistics covering the same set of columns (counting
+ * only those referenced by clauses and conditions), it's enough to keep just
+ * one of them, which further reduces the size of the optimization problem.
+ *
+ * Thus when redundant stats are detected, we keep the smaller one (the one with
+ * fewer columns), based on the assumption that it's more accurate and also
+ * faster to process. That may be untrue for two reasons - first, the accuracy
+ * really depends on number of buckets/MCV items, not the number of columns.
+ * Second, some types of statistics may work better for certain types of clauses
+ * (e.g. MCV lists for equality conditions) etc.
*/
-static Bitmapset *
-collect_mv_attnums(List *clauses, Index relid, int types)
+static List*
+filter_redundant_stats(List *stats, List *clauses, List *conditions)
{
- Bitmapset *attnums = NULL;
- ListCell *l;
+ int i, j, nmvstats;
+
+ MVStatisticInfo *mvstats;
+ bool *redundant;
+ Bitmapset **stats_attnums;
+ Bitmapset *varattnos;
+ Index relid;
+
+ Assert(list_length(stats) > 0);
+ Assert(list_length(clauses) > 0);
/*
- * Walk through the clauses and identify the ones we can estimate using
- * multivariate stats, and remember the relid/columns. We'll then
- * cross-check if we have suitable stats, and only if needed we'll split
- * the clauses into multivariate and regular lists.
+ * We'll convert the list of statistics into an array now, because
+ * the reduction of redundant statistics is easier to do that way
+ * (we can mark previous stats as redundant, etc.).
+ */
+ mvstats = make_stats_array(stats, &nmvstats);
+ stats_attnums = make_stats_attnums(mvstats, nmvstats);
+
+ /* by default, none of the stats is redundant (so palloc0) */
+ redundant = palloc0(nmvstats * sizeof(bool));
+
+ /*
+ * We only expect a single relid here, and also we should get the
+ * same relid from clauses and conditions (but we get it from
+ * clauses, because those are certainly non-empty).
+ */
+ relid = bms_singleton_member(pull_varnos((Node*)clauses));
+
+ /*
+ * Get the varattnos from both conditions and clauses.
+ *
+ * This skips system attributes, although that should be impossible
+ * thanks to previous filtering out of incompatible clauses.
*
- * For now we're only interested in RestrictInfo nodes with nested OpExpr,
- * using either a range or equality.
+ * XXX Is that really true?
*/
- foreach (l, clauses)
+ varattnos = bms_union(get_varattnos((Node*)clauses, relid),
+ get_varattnos((Node*)conditions, relid));
+
+ for (i = 1; i < nmvstats; i++)
{
- Node *clause = (Node *) lfirst(l);
+ /* intersect with current statistics */
+ Bitmapset *curr = bms_intersect(stats_attnums[i], varattnos);
- /* ignore the result here - we only need the attnums */
- clause_is_mv_compatible(clause, relid, &attnums, types);
+ /* walk through 'previous' stats and check redundancy */
+ for (j = 0; j < i; j++)
+ {
+ /* intersection for the previous statistics (computed below) */
+ Bitmapset *prev;
+
+ /* skip stats already identified as redundant */
+ if (redundant[j])
+ continue;
+
+ prev = bms_intersect(stats_attnums[j], varattnos);
+
+ switch (bms_subset_compare(curr, prev))
+ {
+ case BMS_EQUAL:
+ /*
+ * Use the smaller one (hopefully more accurate).
+ * If both have the same size, use the first one.
+ */
+ if (mvstats[i].stakeys->dim1 >= mvstats[j].stakeys->dim1)
+ redundant[i] = TRUE;
+ else
+ redundant[j] = TRUE;
+
+ break;
+
+ case BMS_SUBSET1: /* curr is subset of prev */
+ redundant[i] = TRUE;
+ break;
+
+ case BMS_SUBSET2: /* prev is subset of curr */
+ redundant[j] = TRUE;
+ break;
+
+ case BMS_DIFFERENT:
+ /* do nothing - keep both stats */
+ break;
+ }
+
+ bms_free(prev);
+ }
+
+ bms_free(curr);
}
- /*
- * If there are not at least two attributes referenced by the clause(s),
- * we can throw everything out (as we'll revert to simple stats).
- */
- if (bms_num_members(attnums) <= 1)
+ /* can't reduce all statistics (at least one has to remain) */
+ Assert(nmvstats > 0);
+
+ /* now, let's remove the redundant statistics and rebuild the list */
+ list_free(stats);
+ stats = NIL;
+
+ for (i = 0; i < nmvstats; i++)
{
- if (attnums != NULL)
- pfree(attnums);
- attnums = NULL;
+ MVStatisticInfo *info;
+
+ pfree(stats_attnums[i]);
+
+ if (redundant[i])
+ continue;
+
+ info = makeNode(MVStatisticInfo);
+ memcpy(info, &mvstats[i], sizeof(MVStatisticInfo));
+
+ stats = lappend(stats, info);
}
- return attnums;
+ pfree(mvstats);
+ pfree(stats_attnums);
+ pfree(redundant);
+
+ return stats;
}
-/*
- * Count the number of attributes in clauses compatible with multivariate stats.
- */
-static int
-count_mv_attnums(List *clauses, Index relid, int type)
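+/*
+ * Convert a list of clauses into a plain array, setting *nclauses to the
+ * number of elements (mirrors make_stats_array).
+ */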
+static Node**
+make_clauses_array(List *clauses, int *nclauses)
{
- int c;
- Bitmapset *attnums = collect_mv_attnums(clauses, relid, type);
+ int i;
+ ListCell *l;
- c = bms_num_members(attnums);
+ Node** clauses_array;
- bms_free(attnums);
+ *nclauses = list_length(clauses);
+ clauses_array = (Node **)palloc0((*nclauses) * sizeof(Node *));
- return c;
+ i = 0;
+ foreach (l, clauses)
+ clauses_array[i++] = (Node *)lfirst(l);
+
+ *nclauses = i;
+
+ return clauses_array;
}
-/*
- * Count varnos referenced in the clauses, and if there's a single varno then
- * return the index in 'relid'.
- */
-static int
-count_varnos(List *clauses, Index *relid)
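+/*
+ * Build a bitmapset of attnums referenced by each clause. At this point
+ * all the clauses are expected to be mv-compatible (we error out if not).
+ */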
+static Bitmapset **
+make_clauses_attnums(PlannerInfo *root, Index relid,
+ int type, Node **clauses, int nclauses)
{
- int cnt;
- Bitmapset *varnos = NULL;
+ int i;
+ Bitmapset **clauses_attnums
+ = (Bitmapset **)palloc0(nclauses * sizeof(Bitmapset *));
- varnos = pull_varnos((Node *) clauses);
- cnt = bms_num_members(varnos);
+ for (i = 0; i < nclauses; i++)
+ {
+ Bitmapset * attnums = NULL;
- /* if there's a single varno in the clauses, remember it */
- if (bms_num_members(varnos) == 1)
- *relid = bms_singleton_member(varnos);
+ if (! clause_is_mv_compatible(clauses[i], relid, &attnums, type))
+ elog(ERROR, "should not get non-mv-compatible clause");
- bms_free(varnos);
+ clauses_attnums[i] = attnums;
+ }
- return cnt;
+ return clauses_attnums;
}
-
+
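+/*
+ * Build a map of which clauses are fully covered by which statistics,
+ * as a flat array indexed by [stat * nclauses + clause].
+ *
+ * A hypothetical example: with statistics on (a,b) and (b,c), and three
+ * clauses referencing a, b and c (one column each), the map is
+ *
+ * (a,b): {true,  true,  false}
+ * (b,c): {false, true,  true}
+ */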
+static bool*
+make_cover_map(Bitmapset **stats_attnums, int nmvstats,
+ Bitmapset **clauses_attnums, int nclauses)
+{
+ int i, j;
+ bool *cover_map = (bool*)palloc0(nclauses * nmvstats * sizeof(bool));
+
+ for (i = 0; i < nmvstats; i++)
+ for (j = 0; j < nclauses; j++)
+ cover_map[i * nclauses + j]
+ = bms_is_subset(clauses_attnums[j], stats_attnums[i]);
+
+ return cover_map;
+}
+
/*
- * We're looking for statistics matching at least 2 attributes, referenced in
- * clauses compatible with multivariate statistics. The current selection
- * criteria is very simple - we choose the statistics referencing the most
- * attributes.
- *
- * If there are multiple statistics referencing the same number of columns
- * (from the clauses), the one with less source columns (as listed in the
- * ADD STATISTICS when creating the statistics) wins. Else the first one wins.
- *
- * This is a very simple criteria, and has several weaknesses:
- *
- * (a) does not consider the accuracy of the statistics
- *
- * If there are two histograms built on the same set of columns, but one
- * has 100 buckets and the other one has 1000 buckets (thus likely
- * providing better estimates), this is not currently considered.
- *
- * (b) does not consider the type of statistics
- *
- * If there are three statistics - one containing just a MCV list, another
- * one with just a histogram and a third one with both, we treat them equally.
+ * Chooses the combination of statistics, optimal for estimation of a particular
+ * clause list.
*
- * (c) does not consider the number of clauses
+ * This only handles the 'preparation' phase shared by the exhaustive and
+ * greedy implementations (see the previous methods), mostly trying to reduce
+ * the size of the problem (eliminating clauses/statistics that can't possibly
+ * be used in the solution).
*
- * As explained, only the number of referenced attributes counts, so if
- * there are multiple clauses on a single attribute, this still counts as
- * a single attribute.
+ * It also precomputes bitmaps for attributes covered by clauses and statistics,
+ * so that we don't need to do that over and over in the actual optimizations
+ * (as it's both CPU and memory intensive).
*
- * (d) does not consider type of condition
*
- * Some clauses may work better with some statistics - for example equality
- * clauses probably work better with MCV lists than with histograms. But
- * IS [NOT] NULL conditions may often work better with histograms (thanks
- * to NULL-buckets).
+ * TODO Another way to make the optimization problems smaller might be splitting
+ * the statistics into several disjoint subsets, i.e. if we can split the
+ * graph of statistics (after the elimination) into multiple components
+ * (so that stats in different components share no attributes), we can do
+ * the optimization for each component separately.
*
- * So for example with five WHERE conditions
- *
- * WHERE (a = 1) AND (b = 1) AND (c = 1) AND (d = 1) AND (e = 1)
- *
- * and statistics on (a,b), (a,b,e) and (a,b,c,d), the last one will be selected
- * as it references the most columns.
- *
- * Once we have selected the multivariate statistics, we split the list of
- * clauses into two parts - conditions that are compatible with the selected
- * stats, and conditions are estimated using simple statistics.
- *
- * From the example above, conditions
- *
- * (a = 1) AND (b = 1) AND (c = 1) AND (d = 1)
- *
- * will be estimated using the multivariate statistics (a,b,c,d) while the last
- * condition (e = 1) will get estimated using the regular ones.
- *
- * There are various alternative selection criteria (e.g. counting conditions
- * instead of just referenced attributes), but eventually the best option should
- * be to combine multiple statistics. But that's much harder to do correctly.
- *
- * TODO Select multiple statistics and combine them when computing the estimate.
- *
- * TODO This will probably have to consider compatibility of clauses, because
- * 'dependencies' will probably work only with equality clauses.
+ * TODO If we could compute what is a "perfect solution" maybe we could
+ * terminate the search after reaching ~90% of it? Say, if we knew that we
+ * can cover 10 clauses and reuse 8 dependencies, maybe covering 9 clauses
+ * and 7 dependencies would be OK?
*/
-static MVStatisticInfo *
-choose_mv_statistics(List *stats, Bitmapset *attnums)
+static List*
+choose_mv_statistics(PlannerInfo *root, Index relid, List *stats,
+ List *clauses, List *conditions)
{
int i;
- ListCell *lc;
+ mv_solution_t *best = NULL;
+ List *result = NIL;
+
+ int nmvstats;
+ MVStatisticInfo *mvstats;
+
+ /* we only work with MCV lists and histograms here */
+ int type = (MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST);
+
+ bool *clause_cover_map = NULL,
+ *condition_cover_map = NULL;
+ int *ruled_out = NULL;
+
+ /* build bitmapsets for all stats and clauses */
+ Bitmapset **stats_attnums;
+ Bitmapset **clauses_attnums;
+ Bitmapset **conditions_attnums;
- MVStatisticInfo *choice = NULL;
+ int nclauses, nconditions;
+ Node ** clauses_array;
+ Node ** conditions_array;
- int current_matches = 1; /* goal #1: maximize */
- int current_dims = (MVSTATS_MAX_DIMENSIONS+1); /* goal #2: minimize */
+ /* copy lists, so that we can free them during elimination easily */
+ clauses = list_copy(clauses);
+ conditions = list_copy(conditions);
+ stats = list_copy(stats);
/*
- * Walk through the statistics (simple array with nmvstats elements) and for
- * each one count the referenced attributes (encoded in the 'attnums' bitmap).
+ * Reduce the optimization problem size as much as possible.
+ *
+ * Eliminate clauses and conditions not covered by any statistics,
+ * or statistics not matching at least two attributes (one of them
+ * has to be in a regular clause).
+ *
+ * It's possible that removing a statistics in one iteration
+ * eliminates a clause in the next one, so we repeat this until an
+ * iteration eliminates no clauses/stats.
+ *
+ * This can only happen after eliminating a statistics - clauses are
+ * eliminated first, so statistics always reflect that.
*/
- foreach (lc, stats)
+ while (true)
{
- MVStatisticInfo *info = (MVStatisticInfo *)lfirst(lc);
-
- /* columns matching this statistics */
- int matches = 0;
+ List *tmp;
- int2vector * attrs = info->stakeys;
- int numattrs = attrs->dim1;
+ Bitmapset *compatible_attnums = NULL;
+ Bitmapset *condition_attnums = NULL;
+ Bitmapset *all_attnums = NULL;
- /* skip dependencies-only stats */
- if (! (info->mcv_built || info->hist_built))
- continue;
+ /*
+ * Clauses
+ *
+ * Walk through clauses and keep only those covered by at least
+ * one of the statistics we still have. We'll also keep info
+ * about attnums in clauses (without conditions) so that we can
+ * ignore stats covering just conditions (which is pointless).
+ */
+ tmp = filter_clauses(root, relid, type,
+ stats, clauses, &compatible_attnums);
- /* count columns covered by the histogram */
- for (i = 0; i < numattrs; i++)
- if (bms_is_member(attrs->values[i], attnums))
- matches++;
+ /* discard the original list */
+ list_free(clauses);
+ clauses = tmp;
/*
- * Use this statistics when it improves the number of matches or
- * when it matches the same number of attributes but is smaller.
+ * Conditions
+ *
+ * Walk through conditions and keep only those covered by at least
+ * one of the statistics we still have. Also, collect bitmap of
+ * attributes so that we can make sure we add at least one new
+ * attribute (by comparing with clauses).
*/
- if ((matches > current_matches) ||
- ((matches == current_matches) && (current_dims > numattrs)))
+ if (conditions != NIL)
{
- choice = info;
- current_matches = matches;
- current_dims = numattrs;
+ tmp = filter_clauses(root, relid, type,
+ stats, conditions, &condition_attnums);
+
+ /* discard the original list */
+ list_free(conditions);
+ conditions = tmp;
}
- }
- return choice;
-}
+ /* get a union of attnums (from conditions and new clauses) */
+ all_attnums = bms_union(compatible_attnums, condition_attnums);
+
+ /*
+ * Statistics
+ *
+ * Walk through statistics and only keep those covering at least
+ * one new attribute (excluding conditions) and at least two
+ * attributes in clauses and conditions combined.
+ */
+ tmp = filter_stats(stats, compatible_attnums, all_attnums);
+ /* if we've not eliminated anything, terminate */
+ if (list_length(stats) == list_length(tmp))
+ break;
-/*
- * This splits the clauses list into two parts - one containing clauses that
- * will be evaluated using the chosen statistics, and the remaining clauses
- * (either non-mvcompatible, or not related to the histogram).
- */
-static List *
-clauselist_mv_split(PlannerInfo *root, Index relid,
- List *clauses, List **mvclauses,
- MVStatisticInfo *mvstats, int types)
-{
- int i;
- ListCell *l;
- List *non_mvclauses = NIL;
+ /* work only with the filtered statistics from now on */
+ list_free(stats);
+ stats = tmp;
+ }
- /* FIXME is there a better way to get info on int2vector? */
- int2vector * attrs = mvstats->stakeys;
- int numattrs = mvstats->stakeys->dim1;
+ /* only do the optimization if we have clauses/statistics */
+ if ((list_length(stats) == 0) || (list_length(clauses) == 0))
+ return NIL;
- Bitmapset *mvattnums = NULL;
+ /* remove redundant stats (stats covered by another stats) */
+ stats = filter_redundant_stats(stats, clauses, conditions);
- /* build bitmap of attributes, so we can do bms_is_subset later */
- for (i = 0; i < numattrs; i++)
- mvattnums = bms_add_member(mvattnums, attrs->values[i]);
+ /*
+ * TODO We should sort the stats to make the order deterministic,
+ * otherwise we may get different estimates on different
+ * executions - if there are multiple "equally good" solutions,
+ * we'll keep the first solution we see.
+ *
+ * Sorting by OID probably is not the right solution though,
+ * because we'd like it to be somehow reproducible,
+ * irrespective of the order of ADD STATISTICS commands.
+ * So maybe statkeys?
+ */
+ mvstats = make_stats_array(stats, &nmvstats);
+ stats_attnums = make_stats_attnums(mvstats, nmvstats);
- /* erase the list of mv-compatible clauses */
- *mvclauses = NIL;
+ /* collect clauses and bitmaps of attnums */
+ clauses_array = make_clauses_array(clauses, &nclauses);
+ clauses_attnums = make_clauses_attnums(root, relid, type,
+ clauses_array, nclauses);
- foreach (l, clauses)
- {
- bool match = false; /* by default not mv-compatible */
- Bitmapset *attnums = NULL;
- Node *clause = (Node *) lfirst(l);
+ /* collect conditions and bitmap of attnums */
+ conditions_array = make_clauses_array(conditions, &nconditions);
+ conditions_attnums = make_clauses_attnums(root, relid, type,
+ conditions_array, nconditions);
- if (clause_is_mv_compatible(clause, relid, &attnums, types))
+ /*
+ * Build bitmaps with info about which clauses/conditions are
+ * covered by each statistics (so that we don't need to call the
+ * bms_is_subset over and over again).
+ */
+ clause_cover_map = make_cover_map(stats_attnums, nmvstats,
+ clauses_attnums, nclauses);
+
+ condition_cover_map = make_cover_map(stats_attnums, nmvstats,
+ conditions_attnums, nconditions);
+
+ ruled_out = (int*)palloc0(nmvstats * sizeof(int));
+
+ /* no stats are ruled out by default */
+ for (i = 0; i < nmvstats; i++)
+ ruled_out[i] = -1;
+
+ /* do the optimization itself */
+ if (mvstat_search_type == MVSTAT_SEARCH_EXHAUSTIVE)
+ choose_mv_statistics_exhaustive(root, 0,
+ nmvstats, mvstats, stats_attnums,
+ nclauses, clauses_array, clauses_attnums,
+ nconditions, conditions_array, conditions_attnums,
+ clause_cover_map, condition_cover_map,
+ ruled_out, NULL, &best);
+ else
+ choose_mv_statistics_greedy(root, 0,
+ nmvstats, mvstats, stats_attnums,
+ nclauses, clauses_array, clauses_attnums,
+ nconditions, conditions_array, conditions_attnums,
+ clause_cover_map, condition_cover_map,
+ ruled_out, NULL, &best);
+
+ /* create a list of statistics from the array */
+ if (best != NULL)
+ {
+ for (i = 0; i < best->nstats; i++)
{
- /* are all the attributes part of the selected stats? */
- if (bms_is_subset(attnums, mvattnums))
- match = true;
+ MVStatisticInfo *info = makeNode(MVStatisticInfo);
+ memcpy(info, &mvstats[best->stats[i]], sizeof(MVStatisticInfo));
+ result = lappend(result, info);
}
- /*
- * The clause matches the selected stats, so put it to the list of
- * mv-compatible clauses. Otherwise, keep it in the list of 'regular'
- * clauses (that may be selected later).
- */
- if (match)
- *mvclauses = lappend(*mvclauses, clause);
- else
- non_mvclauses = lappend(non_mvclauses, clause);
+ pfree(best);
}
- /*
- * Perform regular estimation using the clauses incompatible with the chosen
- * histogram (or MV stats in general).
- */
- return non_mvclauses;
+ /* cleanup (maybe leave it up to the memory context?) */
+ for (i = 0; i < nmvstats; i++)
+ bms_free(stats_attnums[i]);
+
+ for (i = 0; i < nclauses; i++)
+ bms_free(clauses_attnums[i]);
+
+ for (i = 0; i < nconditions; i++)
+ bms_free(conditions_attnums[i]);
+
+ pfree(stats_attnums);
+ pfree(clauses_attnums);
+ pfree(conditions_attnums);
+ pfree(clauses_array);
+ pfree(conditions_array);
+ pfree(clause_cover_map);
+ pfree(condition_cover_map);
+ pfree(ruled_out);
+ pfree(mvstats);
+
+ list_free(clauses);
+ list_free(conditions);
+ list_free(stats);
+
+ return result;
}
typedef struct
@@ -1474,6 +2686,7 @@ clause_is_mv_compatible(Node *clause, Index relid, Bitmapset **attnums, int type
return true;
}
+
/*
* collect attnums from functional dependencies
*
@@ -2022,6 +3235,24 @@ clauselist_apply_dependencies(PlannerInfo *root, List *clauses,
* Check that there are stats with at least one of the requested types.
*/
static bool
+stats_type_matches(MVStatisticInfo *stat, int type)
+{
+ if ((type & MV_CLAUSE_TYPE_FDEP) && stat->deps_built)
+ return true;
+
+ if ((type & MV_CLAUSE_TYPE_MCV) && stat->mcv_built)
+ return true;
+
+ if ((type & MV_CLAUSE_TYPE_HIST) && stat->hist_built)
+ return true;
+
+ return false;
+}
+
+/*
+ * Check that there are stats with at least one of the requested types.
+ */
+static bool
has_stats(List *stats, int type)
{
ListCell *s;
@@ -2030,13 +3261,8 @@ has_stats(List *stats, int type)
{
MVStatisticInfo *stat = (MVStatisticInfo *)lfirst(s);
- if ((type & MV_CLAUSE_TYPE_FDEP) && stat->deps_built)
- return true;
-
- if ((type & MV_CLAUSE_TYPE_MCV) && stat->mcv_built)
- return true;
-
- if ((type & MV_CLAUSE_TYPE_HIST) && stat->hist_built)
+ /* terminate if we've found at least one matching statistics */
+ if (stats_type_matches(stat, type))
return true;
}
@@ -2087,22 +3313,26 @@ find_stats(PlannerInfo *root, Index relid)
* as the clauses are processed (and skip items that are 'match').
*/
static Selectivity
-clauselist_mv_selectivity_mcvlist(PlannerInfo *root, List *clauses,
- MVStatisticInfo *mvstats, bool *fullmatch,
- Selectivity *lowsel)
+clauselist_mv_selectivity_mcvlist(PlannerInfo *root, MVStatisticInfo *mvstats,
+ List *clauses, List *conditions, bool is_or,
+ bool *fullmatch, Selectivity *lowsel)
{
int i;
Selectivity s = 0.0;
+ Selectivity t = 0.0;
Selectivity u = 0.0;
MCVList mcvlist = NULL;
+
int nmatches = 0;
+ int nconditions = 0;
/* match/mismatch bitmap for each MCV item */
char * matches = NULL;
+ char * condition_matches = NULL;
Assert(clauses != NIL);
- Assert(list_length(clauses) >= 2);
+ Assert(list_length(clauses) >= 1);
/* there's no MCV list built yet */
if (! mvstats->mcv_built)
@@ -2113,32 +3343,85 @@ clauselist_mv_selectivity_mcvlist(PlannerInfo *root, List *clauses,
Assert(mcvlist != NULL);
Assert(mcvlist->nitems > 0);
- /* by default all the MCV items match the clauses fully */
- matches = palloc0(sizeof(char) * mcvlist->nitems);
- memset(matches, MVSTATS_MATCH_FULL, sizeof(char)*mcvlist->nitems);
-
/* number of matching MCV items */
nmatches = mcvlist->nitems;
+ nconditions = mcvlist->nitems;
+
+ /*
+ * Bitmap of bucket matches (mismatch, partial, full).
+ *
+ * For AND clauses all buckets match (and we'll eliminate them).
+ * For OR clauses no buckets match (and we'll add them).
+ *
+ * We only need to do the memset for AND clauses (for OR clauses
+ * it's already set correctly by the palloc0).
+ */
+ matches = palloc0(sizeof(char) * nmatches);
+
+ if (! is_or) /* AND-clause */
+ memset(matches, MVSTATS_MATCH_FULL, sizeof(char)*nmatches);
+ /* Conditions are treated as an AND clause, so all items match by default. */
+ condition_matches = palloc0(sizeof(char) * nconditions);
+ memset(condition_matches, MVSTATS_MATCH_FULL, sizeof(char)*nconditions);
+
+ /*
+ * build the match bitmap for the conditions (conditions are always
+ * connected by AND)
+ */
+ if (conditions != NIL)
+ nconditions = update_match_bitmap_mcvlist(root, conditions,
+ mvstats->stakeys, mcvlist,
+ nconditions, condition_matches,
+ lowsel, fullmatch, false);
+
+ /*
+ * build the match bitmap for the estimated clauses
+ *
+ * TODO This evaluates the clauses for all MCV items, even those
+ * ruled out by the conditions. The final result should be the
+ * same, but skipping them might be faster.
+ */
nmatches = update_match_bitmap_mcvlist(root, clauses,
mvstats->stakeys, mcvlist,
- nmatches, matches,
- lowsel, fullmatch, false);
+ ((is_or) ? 0 : nmatches), matches,
+ lowsel, fullmatch, is_or);
/* sum frequencies for all the matching MCV items */
for (i = 0; i < mcvlist->nitems; i++)
{
- /* used to 'scale' for MCV lists not covering all tuples */
+ /*
+ * Find out what part of the data is covered by the MCV list,
+ * so that we can 'scale' the selectivity properly (e.g. when
+ * only 50% of the sample items got into the MCV, and the rest
+ * is either in a histogram, or not covered by stats).
+ *
+ * TODO This might be handled by keeping a global "frequency"
+ * for the whole list, which might save us a bit of time
+ * spent on accessing the not-matching part of the MCV list.
+ * Although it's likely in a cache, so it's very fast.
+ */
u += mcvlist->items[i]->frequency;
+ /* skip MCV items not matching the conditions */
+ if (condition_matches[i] == MVSTATS_MATCH_NONE)
+ continue;
+
if (matches[i] != MVSTATS_MATCH_NONE)
s += mcvlist->items[i]->frequency;
+
+ t += mcvlist->items[i]->frequency;
}
pfree(matches);
+ pfree(condition_matches);
pfree(mcvlist);
- return s*u;
+ /* no condition matches */
+ if (t == 0.0)
+ return (Selectivity)0.0;
+
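+ /*
+ * A note on the formula: (s / t) is the conditional probability of the
+ * clauses given the conditions, estimated over the MCV items, and 'u'
+ * scales the result to the part of the data covered by the MCV list.
+ */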
+ return (s / t) * u;
}
/*
@@ -2369,64 +3652,57 @@ update_match_bitmap_mcvlist(PlannerInfo *root, List *clauses,
}
}
}
- else if (or_clause(clause) || and_clause(clause))
+ else if (or_clause(clause) || and_clause(clause) || not_clause(clause))
{
/* AND/OR clause, with all clauses compatible with the selected MV stat */
int i;
- BoolExpr *orclause = ((BoolExpr*)clause);
- List *orclauses = orclause->args;
+ List *tmp_clauses = ((BoolExpr*)clause)->args;
/* match/mismatch bitmap for each MCV item */
- int or_nmatches = 0;
- char * or_matches = NULL;
+ int tmp_nmatches = 0;
+ char * tmp_matches = NULL;
- Assert(orclauses != NIL);
- Assert(list_length(orclauses) >= 2);
+ Assert(tmp_clauses != NIL);
+ Assert((list_length(tmp_clauses) >= 2) || (not_clause(clause) && (list_length(tmp_clauses)==1)));
/* number of matching MCV items */
- or_nmatches = mcvlist->nitems;
+ tmp_nmatches = (or_clause(clause)) ? 0 : mcvlist->nitems;
/* by default none of the MCV items matches the clauses */
- or_matches = palloc0(sizeof(char) * or_nmatches);
+ tmp_matches = palloc0(sizeof(char) * mcvlist->nitems);
- if (or_clause(clause))
- {
- /* OR clauses assume nothing matches, initially */
- memset(or_matches, MVSTATS_MATCH_NONE, sizeof(char)*or_nmatches);
- or_nmatches = 0;
- }
- else
- {
- /* AND clauses assume nothing matches, initially */
- memset(or_matches, MVSTATS_MATCH_FULL, sizeof(char)*or_nmatches);
- }
+ /* AND (and NOT) clauses assume everything matches, initially */
+ if (! or_clause(clause))
+ memset(tmp_matches, MVSTATS_MATCH_FULL, sizeof(char)*mcvlist->nitems);
/* build the match bitmap for the OR-clauses */
- or_nmatches = update_match_bitmap_mcvlist(root, orclauses,
+ tmp_nmatches = update_match_bitmap_mcvlist(root, tmp_clauses,
stakeys, mcvlist,
- or_nmatches, or_matches,
+ tmp_nmatches, tmp_matches,
lowsel, fullmatch, or_clause(clause));
/* merge the bitmap into the existing one*/
for (i = 0; i < mcvlist->nitems; i++)
{
+ /* if this is a NOT clause, we need to invert the results first */
+ if (not_clause(clause))
+ tmp_matches[i] = (MVSTATS_MATCH_FULL - tmp_matches[i]);
+
/*
* To AND-merge the bitmaps, a MIN() semantics is used.
* For OR-merge, use MAX().
*
* FIXME this does not decrease the number of matches
*/
- UPDATE_RESULT(matches[i], or_matches[i], is_or);
+ UPDATE_RESULT(matches[i], tmp_matches[i], is_or);
}
- pfree(or_matches);
+ pfree(tmp_matches);
}
else
- {
elog(ERROR, "unknown clause type: %d", clause->type);
- }
}
/*
@@ -2484,15 +3760,18 @@ update_match_bitmap_mcvlist(PlannerInfo *root, List *clauses,
* this is not uncommon, but for histograms it's not that clear.
*/
static Selectivity
-clauselist_mv_selectivity_histogram(PlannerInfo *root, List *clauses,
- MVStatisticInfo *mvstats)
+clauselist_mv_selectivity_histogram(PlannerInfo *root, MVStatisticInfo *mvstats,
+ List *clauses, List *conditions, bool is_or)
{
int i;
Selectivity s = 0.0;
+ Selectivity t = 0.0;
Selectivity u = 0.0;
int nmatches = 0;
+ int nconditions = 0;
char *matches = NULL;
+ char *condition_matches = NULL;
MVSerializedHistogram mvhist = NULL;
@@ -2505,25 +3784,55 @@ clauselist_mv_selectivity_histogram(PlannerInfo *root, List *clauses,
Assert (mvhist != NULL);
Assert (clauses != NIL);
- Assert (list_length(clauses) >= 2);
+ Assert (list_length(clauses) >= 1);
+
+ nmatches = mvhist->nbuckets;
+ nconditions = mvhist->nbuckets;
/*
- * Bitmap of bucket matches (mismatch, partial, full). by default
- * all buckets fully match (and we'll eliminate them).
+ * Bitmap of bucket matches (mismatch, partial, full).
+ *
+ * For AND clauses all buckets match (and we'll eliminate them).
+ * For OR clauses no buckets match (and we'll add them).
+ *
+ * We only need to do the memset for AND clauses (for OR clauses
+ * it's already set correctly by the palloc0).
*/
- matches = palloc0(sizeof(char) * mvhist->nbuckets);
- memset(matches, MVSTATS_MATCH_FULL, sizeof(char)*mvhist->nbuckets);
+ matches = palloc0(sizeof(char) * nmatches);
- nmatches = mvhist->nbuckets;
+ if (! is_or) /* AND-clause */
+ memset(matches, MVSTATS_MATCH_FULL, sizeof(char)*nmatches);
+
+ /* Conditions are treated as an AND clause, so all buckets match by default. */
+ condition_matches = palloc0(sizeof(char)*nconditions);
+ memset(condition_matches, MVSTATS_MATCH_FULL, sizeof(char)*nconditions);
+
+ /*
+ * build the match bitmap for the conditions (conditions are always
+ * connected by AND)
+ */
+ if (conditions != NIL)
+ update_match_bitmap_histogram(root, conditions,
+ mvstats->stakeys, mvhist,
+ nconditions, condition_matches, false);
- /* build the match bitmap */
+ /*
+ * build the match bitmap for the estimated clauses
+ *
+ * TODO This evaluates the clauses for all buckets, even those
+ * ruled out by the conditions. The final result should be
+ * the same, but skipping them might be faster.
+ */
update_match_bitmap_histogram(root, clauses,
mvstats->stakeys, mvhist,
- nmatches, matches, false);
+ ((is_or) ? 0 : nmatches), matches,
+ is_or);
/* now, walk through the buckets and sum the selectivities */
for (i = 0; i < mvhist->nbuckets; i++)
{
+ float coeff = 1.0;
+
/*
* Find out what part of the data is covered by the histogram,
* so that we can 'scale' the selectivity properly (e.g. when
@@ -2537,10 +3846,23 @@ clauselist_mv_selectivity_histogram(PlannerInfo *root, List *clauses,
*/
u += mvhist->buckets[i]->ntuples;
+ /* skip buckets not matching the conditions */
+ if (condition_matches[i] == MVSTATS_MATCH_NONE)
+ continue;
+ else if (condition_matches[i] == MVSTATS_MATCH_PARTIAL)
+ coeff = 0.5;
+
+ t += coeff * mvhist->buckets[i]->ntuples;
+
if (matches[i] == MVSTATS_MATCH_FULL)
- s += mvhist->buckets[i]->ntuples;
+ s += coeff * mvhist->buckets[i]->ntuples;
else if (matches[i] == MVSTATS_MATCH_PARTIAL)
- s += 0.5 * mvhist->buckets[i]->ntuples;
+ /*
+ * TODO If both conditions and clauses match partially, this
+ * will use a 0.25 match - not sure that's the right solution,
+ * but it seems about right.
+ */
+ s += coeff * 0.5 * mvhist->buckets[i]->ntuples;
}
#ifdef DEBUG_MVHIST
@@ -2549,9 +3871,14 @@ clauselist_mv_selectivity_histogram(PlannerInfo *root, List *clauses,
/* release the allocated bitmap and deserialized histogram */
pfree(matches);
+ pfree(condition_matches);
pfree(mvhist);
- return s * u;
+ /* no condition matches */
+ if (t == 0.0)
+ return (Selectivity)0.0;
+
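+ /* analogous to the MCV case: conditional (s / t), scaled by coverage 'u' */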
+ return (s / t) * u;
}
/* cached result of bucket boundary comparison for a single dimension */
@@ -2699,7 +4026,7 @@ update_match_bitmap_histogram(PlannerInfo *root, List *clauses,
{
int i;
ListCell * l;
-
+
/*
* Used for caching function calls, only once per deduplicated value.
*
@@ -2742,7 +4069,7 @@ update_match_bitmap_histogram(PlannerInfo *root, List *clauses,
FmgrInfo opproc; /* operator */
fmgr_info(get_opcode(expr->opno), &opproc);
-
+
/* reset the cache (per clause) */
memset(callcache, 0, mvhist->nbuckets);
@@ -2902,64 +4229,57 @@ update_match_bitmap_histogram(PlannerInfo *root, List *clauses,
UPDATE_RESULT(matches[i], MVSTATS_MATCH_NONE, is_or);
}
}
- else if (or_clause(clause) || and_clause(clause))
+ else if (or_clause(clause) || and_clause(clause) || not_clause(clause))
{
/* AND/OR clause, with all clauses compatible with the selected MV stat */
int i;
- BoolExpr *orclause = ((BoolExpr*)clause);
- List *orclauses = orclause->args;
+ List *tmp_clauses = ((BoolExpr*)clause)->args;
/* match/mismatch bitmap for each bucket */
- int or_nmatches = 0;
- char * or_matches = NULL;
+ int tmp_nmatches = 0;
+ char * tmp_matches = NULL;
- Assert(orclauses != NIL);
- Assert(list_length(orclauses) >= 2);
+ Assert(tmp_clauses != NIL);
+ Assert((list_length(tmp_clauses) >= 2) || (not_clause(clause) && (list_length(tmp_clauses)==1)));
/* number of matching buckets */
- or_nmatches = mvhist->nbuckets;
+ tmp_nmatches = (or_clause(clause)) ? 0 : mvhist->nbuckets;
- /* by default none of the buckets matches the clauses */
- or_matches = palloc0(sizeof(char) * or_nmatches);
+ /* by default none of the buckets matches the clauses (OR clause) */
+ tmp_matches = palloc0(sizeof(char) * mvhist->nbuckets);
- if (or_clause(clause))
- {
- /* OR clauses assume nothing matches, initially */
- memset(or_matches, MVSTATS_MATCH_NONE, sizeof(char)*or_nmatches);
- or_nmatches = 0;
- }
- else
- {
- /* AND clauses assume nothing matches, initially */
- memset(or_matches, MVSTATS_MATCH_FULL, sizeof(char)*or_nmatches);
- }
+ /* but AND (and NOT) clauses assume everything matches, initially */
+ if (! or_clause(clause))
+ memset(tmp_matches, MVSTATS_MATCH_FULL, sizeof(char)*mvhist->nbuckets);
/* build the match bitmap for the OR-clauses */
- or_nmatches = update_match_bitmap_histogram(root, orclauses,
+ tmp_nmatches = update_match_bitmap_histogram(root, tmp_clauses,
stakeys, mvhist,
- or_nmatches, or_matches, or_clause(clause));
+ tmp_nmatches, tmp_matches, or_clause(clause));
/* merge the bitmap into the existing one*/
for (i = 0; i < mvhist->nbuckets; i++)
{
+ /* if this is a NOT clause, we need to invert the results first */
+ if (not_clause(clause))
+ tmp_matches[i] = (MVSTATS_MATCH_FULL - tmp_matches[i]);
+
/*
* To AND-merge the bitmaps, a MIN() semantics is used.
* For OR-merge, use MAX().
*
* FIXME this does not decrease the number of matches
*/
- UPDATE_RESULT(matches[i], or_matches[i], is_or);
+ UPDATE_RESULT(matches[i], tmp_matches[i], is_or);
}
- pfree(or_matches);
-
+ pfree(tmp_matches);
}
else
elog(ERROR, "unknown clause type: %d", clause->type);
}
- /* free the call cache */
pfree(callcache);
return nmatches;
diff --git a/src/backend/optimizer/path/costsize.c b/src/backend/optimizer/path/costsize.c
index 5350329..57214e0 100644
--- a/src/backend/optimizer/path/costsize.c
+++ b/src/backend/optimizer/path/costsize.c
@@ -3518,7 +3518,8 @@ compute_semi_anti_join_factors(PlannerInfo *root,
joinquals,
0,
jointype,
- sjinfo);
+ sjinfo,
+ NIL);
/*
* Also get the normal inner-join selectivity of the join clauses.
@@ -3541,7 +3542,8 @@ compute_semi_anti_join_factors(PlannerInfo *root,
joinquals,
0,
JOIN_INNER,
- &norm_sjinfo);
+ &norm_sjinfo,
+ NIL);
/* Avoid leaking a lot of ListCells */
if (jointype == JOIN_ANTI)
@@ -3708,7 +3710,7 @@ approx_tuple_count(PlannerInfo *root, JoinPath *path, List *quals)
Node *qual = (Node *) lfirst(l);
/* Note that clause_selectivity will be able to cache its result */
- selec *= clause_selectivity(root, qual, 0, JOIN_INNER, &sjinfo);
+ selec *= clause_selectivity(root, qual, 0, JOIN_INNER, &sjinfo, NIL);
}
/* Apply it to the input relation sizes */
@@ -3744,7 +3746,8 @@ set_baserel_size_estimates(PlannerInfo *root, RelOptInfo *rel)
rel->baserestrictinfo,
0,
JOIN_INNER,
- NULL);
+ NULL,
+ NIL);
rel->rows = clamp_row_est(nrows);
@@ -3781,7 +3784,8 @@ get_parameterized_baserel_size(PlannerInfo *root, RelOptInfo *rel,
allclauses,
rel->relid, /* do not use 0! */
JOIN_INNER,
- NULL);
+ NULL,
+ NIL);
nrows = clamp_row_est(nrows);
/* For safety, make sure result is not more than the base estimate */
if (nrows > rel->rows)
@@ -3919,12 +3923,14 @@ calc_joinrel_size_estimate(PlannerInfo *root,
joinquals,
0,
jointype,
- sjinfo);
+ sjinfo,
+ NIL);
pselec = clauselist_selectivity(root,
pushedquals,
0,
jointype,
- sjinfo);
+ sjinfo,
+ NIL);
/* Avoid leaking a lot of ListCells */
list_free(joinquals);
@@ -3936,7 +3942,8 @@ calc_joinrel_size_estimate(PlannerInfo *root,
restrictlist,
0,
jointype,
- sjinfo);
+ sjinfo,
+ NIL);
pselec = 0.0; /* not used, keep compiler quiet */
}
diff --git a/src/backend/optimizer/util/orclauses.c b/src/backend/optimizer/util/orclauses.c
index ea831f5..6299e75 100644
--- a/src/backend/optimizer/util/orclauses.c
+++ b/src/backend/optimizer/util/orclauses.c
@@ -280,7 +280,7 @@ consider_new_or_clause(PlannerInfo *root, RelOptInfo *rel,
* saving work later.)
*/
or_selec = clause_selectivity(root, (Node *) or_rinfo,
- 0, JOIN_INNER, NULL);
+ 0, JOIN_INNER, NULL, NIL);
/*
* The clause is only worth adding to the query if it rejects a useful
@@ -342,7 +342,7 @@ consider_new_or_clause(PlannerInfo *root, RelOptInfo *rel,
/* Compute inner-join size */
orig_selec = clause_selectivity(root, (Node *) join_or_rinfo,
- 0, JOIN_INNER, &sjinfo);
+ 0, JOIN_INNER, &sjinfo, NIL);
/* And hack cached selectivity so join size remains the same */
join_or_rinfo->norm_selec = orig_selec / or_selec;
diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index 46c95b0..7d0a3a1 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -1627,13 +1627,15 @@ booltestsel(PlannerInfo *root, BoolTestType booltesttype, Node *arg,
case IS_NOT_FALSE:
selec = (double) clause_selectivity(root, arg,
varRelid,
- jointype, sjinfo);
+ jointype, sjinfo,
+ NIL);
break;
case IS_FALSE:
case IS_NOT_TRUE:
selec = 1.0 - (double) clause_selectivity(root, arg,
varRelid,
- jointype, sjinfo);
+ jointype, sjinfo,
+ NIL);
break;
default:
elog(ERROR, "unrecognized booltesttype: %d",
@@ -6259,7 +6261,8 @@ genericcostestimate(PlannerInfo *root,
indexSelectivity = clauselist_selectivity(root, selectivityQuals,
index->rel->relid,
JOIN_INNER,
- NULL);
+ NULL,
+ NIL);
/*
* If caller didn't give us an estimate, estimate the number of index
@@ -6579,7 +6582,8 @@ btcostestimate(PlannerInfo *root, IndexPath *path, double loop_count,
btreeSelectivity = clauselist_selectivity(root, selectivityQuals,
index->rel->relid,
JOIN_INNER,
- NULL);
+ NULL,
+ NIL);
numIndexTuples = btreeSelectivity * index->rel->tuples;
/*
@@ -7330,7 +7334,8 @@ gincostestimate(PlannerInfo *root, IndexPath *path, double loop_count,
*indexSelectivity = clauselist_selectivity(root, selectivityQuals,
index->rel->relid,
JOIN_INNER,
- NULL);
+ NULL,
+ NIL);
/* fetch estimated page cost for tablespace containing index */
get_tablespace_page_costs(index->reltablespace,
@@ -7560,7 +7565,7 @@ brincostestimate(PlannerInfo *root, IndexPath *path, double loop_count,
*indexSelectivity =
clauselist_selectivity(root, indexQuals,
path->indexinfo->rel->relid,
- JOIN_INNER, NULL);
+ JOIN_INNER, NULL, NIL);
*indexCorrelation = 1;
/*
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index ea5a09a..27a8de5 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -75,6 +75,7 @@
#include "utils/bytea.h"
#include "utils/guc_tables.h"
#include "utils/memutils.h"
+#include "utils/mvstats.h"
#include "utils/pg_locale.h"
#include "utils/plancache.h"
#include "utils/portal.h"
@@ -393,6 +394,15 @@ static const struct config_enum_entry force_parallel_mode_options[] = {
};
/*
+ * Search algorithm for multivariate stats.
+ */
+static const struct config_enum_entry mvstat_search_options[] = {
+ {"greedy", MVSTAT_SEARCH_GREEDY, false},
+ {"exhaustive", MVSTAT_SEARCH_EXHAUSTIVE, false},
+ {NULL, 0, false}
+};
+
+/*
* Options for enum values stored in other modules
*/
extern const struct config_enum_entry wal_level_options[];
@@ -3707,6 +3717,16 @@ static struct config_enum ConfigureNamesEnum[] =
NULL, NULL, NULL
},
+ {
+ {"mvstat_search", PGC_USERSET, QUERY_TUNING_OTHER,
+ gettext_noop("Sets the algorithm used for combining multivariate stats."),
+ NULL
+ },
+ &mvstat_search_type,
+ MVSTAT_SEARCH_GREEDY, mvstat_search_options,
+ NULL, NULL, NULL
+ },
+
/* End-of-list marker */
{
{NULL, 0, 0, NULL, NULL}, NULL, 0, NULL, NULL, NULL, NULL
diff --git a/src/backend/utils/mvstats/README.stats b/src/backend/utils/mvstats/README.stats
index 3e4f4d1..d404914 100644
--- a/src/backend/utils/mvstats/README.stats
+++ b/src/backend/utils/mvstats/README.stats
@@ -90,6 +90,137 @@ even attempting to do the more expensive estimation.
Whenever we find there are no suitable stats, we skip the expensive steps.
+Combining multiple statistics
+-----------------------------
+
+When estimating selectivity of a list of clauses, there may exist no statistics
+covering all of them. If there are multiple statistics, each covering some
+subset of the attributes, the optimizer needs to figure out which of those
+statistics to apply.
+
+When the statistics do not overlap, the solution is trivial - we can simply
+split the conditions into groups by the matching statistics, and then multiply the
+selectivities. For example assume multivariate statistics on (b,c) and (d,e),
+and a condition like this:
+
+ (a=1) AND (b=2) AND (c=3) AND (d=4) AND (e=5)
+
+Then (a=1) is not covered by any of the statistics, so it will be estimated
+using the regular per-column statistics. The conditions ((b=2) AND (c=3)) will
+be estimated using the (b,c) statistics, and ((d=4) AND (e=5)) using the (d,e)
+statistics. The resulting selectivities are then multiplied together.
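+
+In terms of probabilities, the whole estimate is
+
+ P(a=1) * P(b=2 & c=3) * P(d=4 & e=5)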
+
+Now, what if the statistics overlap? For example assume the same condition as
+above, but let's say we have statistics on (a,b,c) and (a,c,d,e). What then?
+
+As selectivity is just a probability that the condition holds for a random row,
+we can write the selectivity like this:
+
+ P(a=1 & b=2 & c=3 & d=4 & e=5)
+
+and we can rewrite it using conditional probability like this
+
+ P(a=1 & b=2 & c=3) * P(d=4 & e=5 | a=1 & b=2 & c=3)
+
+Notice that the first part already matches the (a,b,c) statistics. If we assume
+that columns that are not referenced by the same statistics are independent, we
+may rewrite the second half like this
+
+ P(d=4 & e=5 | a=1 & b=2 & c=3) = P(d=4 & e=5 | a=1 & c=3)
+
+which corresponds to the statistics on (a,c,d,e).
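+
+Putting the two pieces together, the whole selectivity may be estimated as
+
+ P(a=1 & b=2 & c=3) * P(d=4 & e=5 | a=1 & c=3)
+
+with the first factor estimated from the (a,b,c) statistics and the second one
+from the (a,c,d,e) statistics.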
+
+If there are multiple statistics defined on a table, it's not difficult to come
+up with examples when there are multiple ways to combine them to cover a list of
+clauses. We need a way to find the best combination of statistics.
+
+This is the purpose of choose_mv_statistics(). It searches through the possible
+combinations of statistics, and selects the combination that
+
+ (a) covers the most clauses of the list
+
+ (b) reuses the maximum number of clauses as conditions
+ (in conditional probabilities)
+
+While criterion (a) seems natural, (b) may seem a bit awkward at first. The
+idea is that conditions are a way of transferring information about
+dependencies between statistics.
+
+There are two alternative implementations of choose_mv_statistics() - greedy
+and exhaustive. Exhaustive actually searches through all possible combinations
+of statistics, and for larger numbers of statistics may get quite expensive
+(as it, unsurprisingly, has exponential cost). Greedy terminates in less than
+K steps (where K is the number of clauses), and in each step chooses the best
+next statistics. I've been unable to come up with an example where those two
+approaches would produce different combinations.
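+
+A rough sketch of the greedy variant (simplified, with details omitted):
+
+ while (some clauses are not covered yet)
+ {
+ pick the statistics covering most of the remaining clauses,
+ preferring statistics that reuse already estimated clauses
+ as conditions
+
+ if (no statistics covers any remaining clause)
+ break;
+ }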
+
+It's possible to choose the algorithm using the mvstat_search GUC, with either
+'greedy' or 'exhaustive' values (default is 'greedy'):
+
+ SET mvstat_search = 'exhaustive';
+
+Note: This is meant mostly for experimentation. I do expect we'll choose one of
+the algorithms and remove the GUC before commit.
+
+
+Limitations of combining statistics
+-----------------------------------
+
+As described in the section 'Combining multiple statistics', the current approach
+is based on transferring information between statistics by means of conditional
+probabilities. This is a relatively cheap and efficient approach, but it is
+based on two assumptions:
+
+ (1) The overlap between the statistics needs to be sufficiently large, i.e.
+ there needs to be enough columns shared by the statistics to transfer
+ information about dependencies between the remaining columns.
+
+ (2) The query needs to include sufficient clauses on the shared columns.
+
+How a violation of those assumptions may be a problem can be illustrated by
+a simple example. Assume a table with three columns (a,b,c) containing exactly
+the same values, and statistics on (a,b) and (b,c):
+
+ CREATE TABLE test AS SELECT i AS a, i AS b, i AS c
+ FROM generate_series(1,1000) s(i);
+
+ CREATE STATISTICS s1 ON test (a,b) WITH (mcv);
+ CREATE STATISTICS s2 ON test (b,c) WITH (mcv);
+
+ ANALYZE test;
+
+First, let's estimate this query:
+
+ SELECT * FROM test WHERE (a < 10) AND (c < 10);
+
+Clearly, there are no conditions on 'b' (which is the only column shared by the
+two statistics), so we'll end up with an estimate based on the assumption of
+independence:
+
+ P(a < 10) * P(c < 10) = 0.01 * 0.01 = 0.0001
+
+This is a significant under-estimate, as the proper selectivity is 0.01.
+
+But let's estimate another query:
+
+ SELECT * FROM test WHERE (a < 10) AND (b < 500) AND (c < 10);
+
+In this case, the estimate may be computed for example like this:
+
+ P[(a < 10) & (b < 500) & (c < 10)]
+ = P[(a < 10) & (b < 500)] * P[(c < 10) | (a < 10) & (b < 500)]
+ = P[(a < 10) & (b < 500)] * P[(c < 10) | (b < 500)]
+
+The trouble is that P(c < 10 | b < 500) evaluates to 0.02 - we have to assume
+(a) and (c) are independent, as there is no statistic covering both columns,
+and the condition on (b) does not transfer a sufficient amount of information
+between the two statistics.
+
+Currently, the only solution is to build statistics on all three columns, but
+see the 'combining statistics using convolution' section for ideas on how to
+improve this.
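+
+For illustration, the workaround for the example above (using the same
+CREATE STATISTICS syntax) might look like this:
+
+ CREATE STATISTICS s3 ON test (a,b,c) WITH (mcv);
+ ANALYZE test;
+
+after which the (a,b,c) statistics covers all three clauses directly, with no
+combining needed.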
+
+
Further (possibly crazy) ideas
------------------------------
@@ -111,3 +242,38 @@ But of course, this may result in expensive estimation (CPU-wise).
So we might add a GUC to choose between the simple (single statistics) and the
multi-statistic estimation, possibly as a table-level parameter (ALTER TABLE ...).
+
+
+Combining stats using convolution
+---------------------------------
+
+The current approach to combining statistics is based on conditional
+probabilities, and thus only works when the query includes conditions on the
+overlapping parts of the statistics. There may however be other ways to
+combine statistics, relaxing this requirement.
+
+Let's assume two histograms H1 and H2 - then combining them might work roughly
+like this:
+
+
+ for (buckets of H1, satisfying local conditions)
+ {
+ for (buckets of H2, overlapping with H1 bucket)
+ {
+ mark H2 bucket as 'valid'
+ }
+ }
+
+ s1 = s2 = 0.0
+ for (buckets of H2 marked as valid)
+ {
+ s1 += frequency
+
+ if (bucket satisfies local conditions)
+ s2 += frequency
+ }
+
+ s = (s2 / s1) /* final selectivity estimate */
+
+However this may quickly get non-trivial, e.g. when combining two statistics
+of different types (histogram vs. MCV).
diff --git a/src/include/optimizer/cost.h b/src/include/optimizer/cost.h
index fea2bb7..33f5a1b 100644
--- a/src/include/optimizer/cost.h
+++ b/src/include/optimizer/cost.h
@@ -192,11 +192,13 @@ extern Selectivity clauselist_selectivity(PlannerInfo *root,
List *clauses,
int varRelid,
JoinType jointype,
- SpecialJoinInfo *sjinfo);
+ SpecialJoinInfo *sjinfo,
+ List *conditions);
extern Selectivity clause_selectivity(PlannerInfo *root,
Node *clause,
int varRelid,
JoinType jointype,
- SpecialJoinInfo *sjinfo);
+ SpecialJoinInfo *sjinfo,
+ List *conditions);
#endif /* COST_H */
diff --git a/src/include/utils/mvstats.h b/src/include/utils/mvstats.h
index f05a517..35b2f8e 100644
--- a/src/include/utils/mvstats.h
+++ b/src/include/utils/mvstats.h
@@ -17,6 +17,14 @@
#include "fmgr.h"
#include "commands/vacuum.h"
+typedef enum MVStatSearchType
+{
+ MVSTAT_SEARCH_EXHAUSTIVE, /* exhaustive search */
+ MVSTAT_SEARCH_GREEDY /* greedy search */
+} MVStatSearchType;
+
+extern int mvstat_search_type;
+
/*
* Degree of how much MCV item / histogram bucket matches a clause.
* This is then considered when computing the selectivity.
--
2.1.0
0007-multivariate-ndistinct-coefficients.patch
From d9b0afd75f2f678079d50f3d520bdd478c75bc89 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@pgaddict.com>
Date: Wed, 23 Dec 2015 02:07:58 +0100
Subject: [PATCH 7/9] multivariate ndistinct coefficients
---
doc/src/sgml/ref/create_statistics.sgml | 9 ++
src/backend/catalog/system_views.sql | 3 +-
src/backend/commands/analyze.c | 2 +-
src/backend/commands/statscmds.c | 11 +-
src/backend/optimizer/path/clausesel.c | 4 +
src/backend/optimizer/util/plancat.c | 4 +-
src/backend/utils/adt/selfuncs.c | 93 +++++++++++++++-
src/backend/utils/mvstats/Makefile | 2 +-
src/backend/utils/mvstats/README.ndistinct | 83 ++++++++++++++
src/backend/utils/mvstats/README.stats | 2 +
src/backend/utils/mvstats/common.c | 23 +++-
src/backend/utils/mvstats/mvdist.c | 171 +++++++++++++++++++++++++++++
src/include/catalog/pg_mv_statistic.h | 26 +++--
src/include/nodes/relation.h | 2 +
src/include/utils/mvstats.h | 9 +-
src/test/regress/expected/rules.out | 3 +-
16 files changed, 424 insertions(+), 23 deletions(-)
create mode 100644 src/backend/utils/mvstats/README.ndistinct
create mode 100644 src/backend/utils/mvstats/mvdist.c
diff --git a/doc/src/sgml/ref/create_statistics.sgml b/doc/src/sgml/ref/create_statistics.sgml
index fd3382e..80360a6 100644
--- a/doc/src/sgml/ref/create_statistics.sgml
+++ b/doc/src/sgml/ref/create_statistics.sgml
@@ -168,6 +168,15 @@ CREATE STATISTICS [ IF NOT EXISTS ] <replaceable class="PARAMETER">statistics_na
</listitem>
</varlistentry>
+ <varlistentry>
+ <term><literal>ndistinct</> (<type>boolean</>)</term>
+ <listitem>
+ <para>
+ Enables ndistinct coefficients for the statistics.
+ </para>
+ </listitem>
+ </varlistentry>
+
</variablelist>
</refsect2>
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 6afdee0..a550141 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -169,7 +169,8 @@ CREATE VIEW pg_mv_stats AS
length(S.stamcv) AS mcvbytes,
pg_mv_stats_mcvlist_info(S.stamcv) AS mcvinfo,
length(S.stahist) AS histbytes,
- pg_mv_stats_histogram_info(S.stahist) AS histinfo
+ pg_mv_stats_histogram_info(S.stahist) AS histinfo,
+ standcoeff AS ndcoeff
FROM (pg_mv_statistic S JOIN pg_class C ON (C.oid = S.starelid))
LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace);
diff --git a/src/backend/commands/analyze.c b/src/backend/commands/analyze.c
index 9087532..c29f1be 100644
--- a/src/backend/commands/analyze.c
+++ b/src/backend/commands/analyze.c
@@ -582,7 +582,7 @@ do_analyze_rel(Relation onerel, int options, VacuumParams *params,
}
/* Build multivariate stats (if there are any). */
- build_mv_stats(onerel, numrows, rows, attr_cnt, vacattrstats);
+ build_mv_stats(onerel, totalrows, numrows, rows, attr_cnt, vacattrstats);
}
/*
diff --git a/src/backend/commands/statscmds.c b/src/backend/commands/statscmds.c
index e2f3ff1..11de1c5 100644
--- a/src/backend/commands/statscmds.c
+++ b/src/backend/commands/statscmds.c
@@ -72,7 +72,8 @@ CreateStatistics(CreateStatsStmt *stmt)
/* by default build nothing */
bool build_dependencies = false,
build_mcv = false,
- build_histogram = false;
+ build_histogram = false,
+ build_ndistinct = false;
int32 max_buckets = -1,
max_mcv_items = -1;
@@ -155,6 +156,8 @@ CreateStatistics(CreateStatsStmt *stmt)
if (strcmp(opt->defname, "dependencies") == 0)
build_dependencies = defGetBoolean(opt);
+ else if (strcmp(opt->defname, "ndistinct") == 0)
+ build_ndistinct = defGetBoolean(opt);
else if (strcmp(opt->defname, "mcv") == 0)
build_mcv = defGetBoolean(opt);
else if (strcmp(opt->defname, "max_mcv_items") == 0)
@@ -209,10 +212,10 @@ CreateStatistics(CreateStatsStmt *stmt)
}
/* check that at least some statistics were requested */
- if (! (build_dependencies || build_mcv || build_histogram))
+ if (! (build_dependencies || build_mcv || build_histogram || build_ndistinct))
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
- errmsg("no statistics type (dependencies, mcv, histogram) was requested")));
+ errmsg("no statistics type (dependencies, mcv, histogram, ndistinct) was requested")));
/* now do some checking of the options */
if (require_mcv && (! build_mcv))
@@ -246,6 +249,7 @@ CreateStatistics(CreateStatsStmt *stmt)
values[Anum_pg_mv_statistic_deps_enabled -1] = BoolGetDatum(build_dependencies);
values[Anum_pg_mv_statistic_mcv_enabled -1] = BoolGetDatum(build_mcv);
values[Anum_pg_mv_statistic_hist_enabled -1] = BoolGetDatum(build_histogram);
+ values[Anum_pg_mv_statistic_ndist_enabled-1] = BoolGetDatum(build_ndistinct);
values[Anum_pg_mv_statistic_mcv_max_items -1] = Int32GetDatum(max_mcv_items);
values[Anum_pg_mv_statistic_hist_max_buckets -1] = Int32GetDatum(max_buckets);
@@ -253,6 +257,7 @@ CreateStatistics(CreateStatsStmt *stmt)
nulls[Anum_pg_mv_statistic_stadeps -1] = true;
nulls[Anum_pg_mv_statistic_stamcv -1] = true;
nulls[Anum_pg_mv_statistic_stahist -1] = true;
+ nulls[Anum_pg_mv_statistic_standist -1] = true;
/* insert the tuple into pg_mv_statistic */
mvstatrel = heap_open(MvStatisticRelationId, RowExclusiveLock);
diff --git a/src/backend/optimizer/path/clausesel.c b/src/backend/optimizer/path/clausesel.c
index c1b8999..2540da9 100644
--- a/src/backend/optimizer/path/clausesel.c
+++ b/src/backend/optimizer/path/clausesel.c
@@ -59,6 +59,7 @@ static void addRangeClause(RangeQueryClause **rqlist, Node *clause,
#define MV_CLAUSE_TYPE_FDEP 0x01
#define MV_CLAUSE_TYPE_MCV 0x02
#define MV_CLAUSE_TYPE_HIST 0x04
+#define MV_CLAUSE_TYPE_NDIST 0x08
static bool clause_is_mv_compatible(Node *clause, Index relid, Bitmapset **attnums,
int type);
@@ -3246,6 +3247,9 @@ stats_type_matches(MVStatisticInfo *stat, int type)
if ((type & MV_CLAUSE_TYPE_HIST) && stat->hist_built)
return true;
+ if ((type & MV_CLAUSE_TYPE_NDIST) && stat->ndist_built)
+ return true;
+
return false;
}
diff --git a/src/backend/optimizer/util/plancat.c b/src/backend/optimizer/util/plancat.c
index 2519249..3741b7a 100644
--- a/src/backend/optimizer/util/plancat.c
+++ b/src/backend/optimizer/util/plancat.c
@@ -412,7 +412,7 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
mvstat = (Form_pg_mv_statistic) GETSTRUCT(htup);
/* unavailable stats are not interesting for the planner */
- if (mvstat->deps_built || mvstat->mcv_built || mvstat->hist_built)
+ if (mvstat->deps_built || mvstat->mcv_built || mvstat->hist_built || mvstat->ndist_built)
{
info = makeNode(MVStatisticInfo);
@@ -423,11 +423,13 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
info->deps_enabled = mvstat->deps_enabled;
info->mcv_enabled = mvstat->mcv_enabled;
info->hist_enabled = mvstat->hist_enabled;
+ info->ndist_enabled = mvstat->ndist_enabled;
/* built/available statistics */
info->deps_built = mvstat->deps_built;
info->mcv_built = mvstat->mcv_built;
info->hist_built = mvstat->hist_built;
+ info->ndist_built = mvstat->ndist_built;
/* stakeys */
adatum = SysCacheGetAttr(MVSTATOID, htup,
diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index 7d0a3a1..a84dd2b 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -132,6 +132,7 @@
#include "utils/fmgroids.h"
#include "utils/index_selfuncs.h"
#include "utils/lsyscache.h"
+#include "utils/mvstats.h"
#include "utils/nabstime.h"
#include "utils/pg_locale.h"
#include "utils/rel.h"
@@ -206,6 +207,7 @@ static Const *string_to_const(const char *str, Oid datatype);
static Const *string_to_bytea_const(const char *str, size_t str_len);
static List *add_predicate_to_quals(IndexOptInfo *index, List *indexQuals);
+static Oid find_ndistinct_coeff(PlannerInfo *root, RelOptInfo *rel, List *varinfos);
/*
* eqsel - Selectivity of "=" for any data types.
@@ -3422,12 +3424,26 @@ estimate_num_groups(PlannerInfo *root, List *groupExprs, double input_rows,
* don't know by how much. We should never clamp to less than the
* largest ndistinct value for any of the Vars, though, since
* there will surely be at least that many groups.
+ *
+ * However we don't need to do this if we have ndistinct stats on
+ * the columns - in that case we can simply use the coefficient
+ * to get the (probably way more accurate) estimate.
+ *
+ * XXX Probably needs refactoring (don't like to mix with clamp
+ * and coeff at the same time).
*/
double clamp = rel->tuples;
+ double coeff = 1.0;
if (relvarcount > 1)
{
- clamp *= 0.1;
+ Oid oid = find_ndistinct_coeff(root, rel, varinfos);
+
+ if (oid != InvalidOid)
+ coeff = load_mv_ndistinct(oid);
+ else
+ clamp *= 0.1;
+
if (clamp < relmaxndistinct)
{
clamp = relmaxndistinct;
@@ -3436,6 +3452,13 @@ estimate_num_groups(PlannerInfo *root, List *groupExprs, double input_rows,
clamp = rel->tuples;
}
}
+
+ /*
+ * Apply the ndistinct coefficient from the multivariate stats (we
+ * must do this before clamping the estimate in any way). Dividing
+ * the product of per-column ndistincts by the coefficient yields
+ * the combined ndistinct estimate.
+ */
+ reldistinct /= coeff;
+
if (reldistinct > clamp)
reldistinct = clamp;
@@ -7582,3 +7605,71 @@ brincostestimate(PlannerInfo *root, IndexPath *path, double loop_count,
/* XXX what about pages_per_range? */
}
+
+/*
+ * Find applicable ndistinct statistics and compute the coefficient to
+ * correct the estimate (simply a product of per-column ndistincts).
+ *
+ * Currently we only look for a perfect match, i.e. a single ndistinct
+ * estimate exactly matching all the columns of the statistics.
+ */
+static Oid
+find_ndistinct_coeff(PlannerInfo *root, RelOptInfo *rel, List *varinfos)
+{
+ ListCell *lc;
+ Bitmapset *attnums = NULL;
+ VariableStatData vardata;
+
+ foreach(lc, varinfos)
+ {
+ GroupVarInfo *varinfo = (GroupVarInfo *) lfirst(lc);
+
+ if (varinfo->rel != rel)
+ continue;
+
+ /* FIXME handle general expressions, not just plain Vars */
+
+ /*
+ * examine the variable (or expression) so that we know which
+ * attribute we're dealing with - we need this for matching the
+ * ndistinct coefficient
+ *
+ * FIXME probably might remember this from estimate_num_groups
+ */
+ examine_variable(root, varinfo->var, 0, &vardata);
+
+ if (HeapTupleIsValid(vardata.statsTuple))
+ {
+ Form_pg_statistic stats
+ = (Form_pg_statistic) GETSTRUCT(vardata.statsTuple);
+
+ attnums = bms_add_member(attnums, stats->staattnum);
+
+ ReleaseVariableStats(vardata);
+ }
+ }
+
+ /* look for a matching ndistinct statistics */
+ foreach (lc, rel->mvstatlist)
+ {
+ int i;
+ MVStatisticInfo *info = (MVStatisticInfo *)lfirst(lc);
+
+ /* skip statistics without ndistinct coefficient built */
+ if (!info->ndist_built)
+ continue;
+
+ /* only exact matches for now (same set of columns) */
+ if (bms_num_members(attnums) != info->stakeys->dim1)
+ continue;
+
+ /* check that all the stats columns match the grouped columns */
+ for (i = 0; i < info->stakeys->dim1; i++)
+ if (!bms_is_member(info->stakeys->values[i], attnums))
+ break;
+
+ /* some column is not covered, try the next statistics */
+ if (i < info->stakeys->dim1)
+ continue;
+
+ return info->mvoid;
+ }
+
+ return InvalidOid;
+}
diff --git a/src/backend/utils/mvstats/Makefile b/src/backend/utils/mvstats/Makefile
index 9dbb3b6..d4b88e9 100644
--- a/src/backend/utils/mvstats/Makefile
+++ b/src/backend/utils/mvstats/Makefile
@@ -12,6 +12,6 @@ subdir = src/backend/utils/mvstats
top_builddir = ../../../..
include $(top_builddir)/src/Makefile.global
-OBJS = common.o dependencies.o histogram.o mcv.o
+OBJS = common.o dependencies.o histogram.o mcv.o mvdist.o
include $(top_srcdir)/src/backend/common.mk
diff --git a/src/backend/utils/mvstats/README.ndistinct b/src/backend/utils/mvstats/README.ndistinct
new file mode 100644
index 0000000..32d1624
--- /dev/null
+++ b/src/backend/utils/mvstats/README.ndistinct
@@ -0,0 +1,83 @@
+ndistinct coefficients
+======================
+
+Estimating number of distinct groups in a combination of columns is tricky,
+and the estimation error is often significant. By ndistinct coefficient we
+mean a ratio
+
+ q = ndistinct(a) * ndistinct(b) / ndistinct(a,b)
+
+where 'a' and 'b' are columns, ndistinct(a) is (an estimate of) the number of
+distinct values in column 'a', and ndistinct(a,b) is the same thing for the
+pair of columns.
+
+The meaning of the coefficient may be illustrated by answering the following
+question: Given a combination of columns (a,b), how many distinct values of 'b'
+matches a chosen value of 'a' on average?
+
+Let's assume we know ndistinct(a) and ndistinct(a,b). Then the answer to the
+question clearly is
+
+ ndistinct(a,b) / ndistinct(a)
+
+and by using 'q' we may rewrite this as
+
+ ndistinct(b) / q
+
+so 'q' may be considered as a correction factor of the ndistinct estimate given
+a condition on one of the columns.
+
+This may be generalized to a combination of 'n' columns
+
+ [ndistinct(c1) * ... * ndistinct(cn)] / ndistinct(c1, ..., cn)
+
+and the meaning is very similar, except that we need to use conditions on (n-1)
+of the columns.
+
+
+Selectivity estimation
+----------------------
+
+As explained in the previous paragraph, ndistinct coefficients may be used to
+estimate the cardinality of a column, given some a priori knowledge. Let's assume
+we need to estimate selectivity of a condition
+
+ (a=1) AND (b=2)
+
+which we can expand like this
+
+ P(a=1 & b=2) = P(a=1) * P(b=2 | a=1)
+
+Let's also assume the distributions are uniform, i.e. that
+
+ P(a=1) = 1/ndistinct(a)
+ P(b=2) = 1/ndistinct(b)
+ P(a=1 & b=2) = 1/ndistinct(a,b)
+
+ P(b=2 | a=1) = ndistinct(a) / ndistinct(a,b)
+
+which may be rewritten like
+
+ P(b=2 | a=1)
+ = ndistinct(a) / ndistinct(a,b)
+ = (1/ndistinct(b)) * [(ndistinct(a) * ndistinct(b)) / ndistinct(a,b)]
+ = (1/ndistinct(b)) * q
+
+and therefore
+
+ P(a=1 & b=2) = (1/ndistinct(a)) * (1/ndistinct(b)) * q
+
+This also illustrates 'q' as a correction coefficient.
+
+It also explains why we store the coefficient and not simply ndistinct(a,b).
+This way we can estimate the individual clauses and then correct the result by
+multiplying it with 'q' - we don't have to mess with ndistinct estimates at
+all.
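+
+A hypothetical example: with ndistinct(a) = ndistinct(b) = 100 and the two
+columns perfectly correlated (so that ndistinct(a,b) = 100), the coefficient is
+
+ q = (100 * 100) / 100 = 100
+
+and the corrected estimate is (1/100) * (1/100) * 100 = 1/100, instead of the
+1/10000 we'd get under the independence assumption.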
+
+Naturally, as the coefficient is derived from ndistinct(a,b), it may also be
+used to estimate GROUP BY clauses on the combination of columns, replacing the
+existing heuristics in estimate_num_groups().
+
+Note: Currently only the GROUP BY estimation is implemented. It's a bit unclear
+how to implement the clause estimation when there are other statistics (esp.
+MCV lists and/or functional dependencies) available.
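+
+A usage sketch (with a hypothetical table 't' and correlated columns a, b):
+
+ CREATE STATISTICS s ON t (a,b) WITH (ndistinct);
+ ANALYZE t;
+
+ -- estimate_num_groups() may now use the ndistinct coefficient
+ SELECT a, b, count(*) FROM t GROUP BY a, b;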
diff --git a/src/backend/utils/mvstats/README.stats b/src/backend/utils/mvstats/README.stats
index d404914..6d4b09b 100644
--- a/src/backend/utils/mvstats/README.stats
+++ b/src/backend/utils/mvstats/README.stats
@@ -20,6 +20,8 @@ Currently we only have two kinds of multivariate statistics
(c) multivariate histograms (README.histogram)
+ (d) ndistinct coefficients
+
Compatible clause types
-----------------------
diff --git a/src/backend/utils/mvstats/common.c b/src/backend/utils/mvstats/common.c
index ffb76f4..2be980d 100644
--- a/src/backend/utils/mvstats/common.c
+++ b/src/backend/utils/mvstats/common.c
@@ -32,7 +32,8 @@ static List* list_mv_stats(Oid relid);
* and serializes them back into the catalog (as bytea values).
*/
void
-build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
+build_mv_stats(Relation onerel, double totalrows,
+ int numrows, HeapTuple *rows,
int natts, VacAttrStats **vacattrstats)
{
ListCell *lc;
@@ -53,6 +54,7 @@ build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
MVDependencies deps = NULL;
MCVList mcvlist = NULL;
MVHistogram histogram = NULL;
+ double ndist = -1;
int numrows_filtered = numrows;
VacAttrStats **stats = NULL;
@@ -92,6 +94,9 @@ build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
if (stat->deps_enabled)
deps = build_mv_dependencies(numrows, rows, attrs, stats);
+ if (stat->ndist_enabled)
+ ndist = build_mv_ndistinct(totalrows, numrows, rows, attrs, stats);
+
/* build the MCV list */
if (stat->mcv_enabled)
mcvlist = build_mv_mcvlist(numrows, rows, attrs, stats, &numrows_filtered);
@@ -101,7 +106,7 @@ build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
histogram = build_mv_histogram(numrows_filtered, rows, attrs, stats, numrows);
/* store the histogram / MCV list in the catalog */
- update_mv_stats(stat->mvoid, deps, mcvlist, histogram, attrs, stats);
+ update_mv_stats(stat->mvoid, deps, mcvlist, histogram, ndist, attrs, stats);
}
}
@@ -183,6 +188,8 @@ list_mv_stats(Oid relid)
info->mcv_built = stats->mcv_built;
info->hist_enabled = stats->hist_enabled;
info->hist_built = stats->hist_built;
+ info->ndist_enabled = stats->ndist_enabled;
+ info->ndist_built = stats->ndist_built;
result = lappend(result, info);
}
@@ -252,7 +259,7 @@ find_mv_attnums(Oid mvoid, Oid *relid)
void
update_mv_stats(Oid mvoid,
MVDependencies dependencies, MCVList mcvlist, MVHistogram histogram,
- int2vector *attrs, VacAttrStats **stats)
+ double ndistcoeff, int2vector *attrs, VacAttrStats **stats)
{
HeapTuple stup,
oldtup;
@@ -292,26 +299,36 @@ update_mv_stats(Oid mvoid,
= PointerGetDatum(data);
}
+ if (ndistcoeff > 1.0)
+ {
+ nulls[Anum_pg_mv_statistic_standist -1] = false;
+ values[Anum_pg_mv_statistic_standist-1] = Float8GetDatum(ndistcoeff);
+ }
+
/* always replace the value (either by bytea or NULL) */
replaces[Anum_pg_mv_statistic_stadeps -1] = true;
replaces[Anum_pg_mv_statistic_stamcv -1] = true;
replaces[Anum_pg_mv_statistic_stahist-1] = true;
+ replaces[Anum_pg_mv_statistic_standist-1] = true;
/* always change the availability flags */
nulls[Anum_pg_mv_statistic_deps_built -1] = false;
nulls[Anum_pg_mv_statistic_mcv_built -1] = false;
nulls[Anum_pg_mv_statistic_hist_built-1] = false;
+ nulls[Anum_pg_mv_statistic_ndist_built-1] = false;
nulls[Anum_pg_mv_statistic_stakeys-1] = false;
/* use the new attnums, in case we removed some dropped ones */
replaces[Anum_pg_mv_statistic_deps_built-1] = true;
replaces[Anum_pg_mv_statistic_mcv_built -1] = true;
+ replaces[Anum_pg_mv_statistic_ndist_built-1] = true;
replaces[Anum_pg_mv_statistic_hist_built -1] = true;
replaces[Anum_pg_mv_statistic_stakeys -1] = true;
values[Anum_pg_mv_statistic_deps_built-1] = BoolGetDatum(dependencies != NULL);
values[Anum_pg_mv_statistic_mcv_built -1] = BoolGetDatum(mcvlist != NULL);
values[Anum_pg_mv_statistic_hist_built -1] = BoolGetDatum(histogram != NULL);
+ values[Anum_pg_mv_statistic_ndist_built-1] = BoolGetDatum(ndistcoeff > 1.0);
values[Anum_pg_mv_statistic_stakeys -1] = PointerGetDatum(attrs);
/* Is there already a pg_mv_statistic tuple for this attribute? */
diff --git a/src/backend/utils/mvstats/mvdist.c b/src/backend/utils/mvstats/mvdist.c
new file mode 100644
index 0000000..59b8358
--- /dev/null
+++ b/src/backend/utils/mvstats/mvdist.c
@@ -0,0 +1,171 @@
+/*-------------------------------------------------------------------------
+ *
+ * mvdist.c
+ * POSTGRES multivariate distinct coefficients
+ *
+ *
+ * Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ * src/backend/utils/mvstats/mvdist.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include <math.h>
+
+#include "common.h"
+#include "utils/lsyscache.h"
+
+static double estimate_ndistinct(double totalrows, int numrows, int d, int f1);
+
+/*
+ * Compute ndistinct coefficient for the combination of attributes. This
+ * computes the ndistinct estimate using the same estimator used in analyze.c
+ * and then computes the coefficient.
+ */
+double
+build_mv_ndistinct(double totalrows, int numrows, HeapTuple *rows,
+ int2vector *attrs, VacAttrStats **stats)
+{
+ int i, j;
+ int f1, cnt, d;
+ int nmultiple = 0, summultiple = 0;
+ int numattrs = attrs->dim1;
+ MultiSortSupport mss = multi_sort_init(numattrs);
+ double ndistcoeff;
+
+ /*
+ * It's possible to sort the sample rows directly, but this seemed
+ * somehow simpler / less error prone. Another option would be to
+ * allocate the arrays for each SortItem separately, but that'd be
+ * significant overhead (not just CPU, but especially memory bloat).
+ */
+ SortItem * items = (SortItem*)palloc0(numrows * sizeof(SortItem));
+
+ Datum *values = (Datum*)palloc0(sizeof(Datum) * numrows * numattrs);
+ bool *isnull = (bool*)palloc0(sizeof(bool) * numrows * numattrs);
+
+ for (i = 0; i < numrows; i++)
+ {
+ items[i].values = &values[i * numattrs];
+ items[i].isnull = &isnull[i * numattrs];
+ }
+
+ Assert(numattrs >= 2);
+
+ for (i = 0; i < numattrs; i++)
+ {
+ /* prepare the sort function for this dimension */
+ multi_sort_add_dimension(mss, i, i, stats);
+
+ /* accumulate all the data into the array and sort it */
+ for (j = 0; j < numrows; j++)
+ {
+ items[j].values[i]
+ = heap_getattr(rows[j], attrs->values[i],
+ stats[i]->tupDesc, &items[j].isnull[i]);
+ }
+ }
+
+ qsort_arg((void *) items, numrows, sizeof(SortItem),
+ multi_sort_compare, mss);
+
+ /* count number of distinct combinations */
+
+ f1 = 0;
+ cnt = 1;
+ d = 1;
+ for (i = 1; i < numrows; i++)
+ {
+ if (multi_sort_compare(&items[i], &items[i-1], mss) != 0)
+ {
+ if (cnt == 1)
+ f1 += 1;
+ else
+ {
+ nmultiple += 1;
+ summultiple += cnt;
+ }
+
+ d++;
+ cnt = 0;
+ }
+
+ cnt += 1;
+ }
+
+ if (cnt == 1)
+ f1 += 1;
+ else
+ {
+ nmultiple += 1;
+ summultiple += cnt;
+ }
+
+ ndistcoeff = 1 / estimate_ndistinct(totalrows, numrows, d, f1);
+
+ /*
+ * now count distinct values for each attribute and incrementally
+ * compute ndistinct(a,b) / (ndistinct(a) * ndistinct(b))
+ *
+ * FIXME Probably need to handle cases when one of the ndistinct
+ * estimates is negative, and also check that the combined
+ * ndistinct is greater than any of those partial values.
+ */
+ for (i = 0; i < numattrs; i++)
+ ndistcoeff *= stats[i]->stadistinct;
+
+ return ndistcoeff;
+}
+
+double
+load_mv_ndistinct(Oid mvoid)
+{
+ bool isnull = false;
+ Datum deps;
+
+ /* Fetch the pg_mv_statistic tuple for the given statistics OID. */
+ HeapTuple htup = SearchSysCache1(MVSTATOID, ObjectIdGetDatum(mvoid));
+
+#ifdef USE_ASSERT_CHECKING
+ Form_pg_mv_statistic mvstat = (Form_pg_mv_statistic) GETSTRUCT(htup);
+ Assert(mvstat->ndist_enabled && mvstat->ndist_built);
+#endif
+
+ deps = SysCacheGetAttr(MVSTATOID, htup,
+ Anum_pg_mv_statistic_standist, &isnull);
+
+ Assert(!isnull);
+
+ ReleaseSysCache(htup);
+
+ return DatumGetFloat8(deps);
+}
+
+/* The Duj1 estimator (already used in analyze.c). */
+static double
+estimate_ndistinct(double totalrows, int numrows, int d, int f1)
+{
+ double numer,
+ denom,
+ ndistinct;
+
+ numer = (double) numrows *(double) d;
+
+ denom = (double) (numrows - f1) +
+ (double) f1 * (double) numrows / totalrows;
+
+ ndistinct = numer / denom;
+
+ /* Clamp to sane range in case of roundoff error */
+ if (ndistinct < (double) d)
+ ndistinct = (double) d;
+
+ if (ndistinct > totalrows)
+ ndistinct = totalrows;
+
+ return floor(ndistinct + 0.5);
+}
diff --git a/src/include/catalog/pg_mv_statistic.h b/src/include/catalog/pg_mv_statistic.h
index 37f473f..e46cc6b 100644
--- a/src/include/catalog/pg_mv_statistic.h
+++ b/src/include/catalog/pg_mv_statistic.h
@@ -40,6 +40,7 @@ CATALOG(pg_mv_statistic,3381)
bool deps_enabled; /* analyze dependencies? */
bool mcv_enabled; /* build MCV list? */
bool hist_enabled; /* build histogram? */
+ bool ndist_enabled; /* build ndist coefficient? */
/* histogram / MCV size */
int32 mcv_max_items; /* max MCV items */
@@ -49,6 +50,7 @@ CATALOG(pg_mv_statistic,3381)
bool deps_built; /* dependencies were built */
bool mcv_built; /* MCV list was built */
bool hist_built; /* histogram was built */
+ bool ndist_built; /* ndistinct coeff built */
/* variable-length fields start here, but we allow direct access to stakeys */
int2vector stakeys; /* array of column keys */
@@ -57,6 +59,7 @@ CATALOG(pg_mv_statistic,3381)
bytea stadeps; /* dependencies (serialized) */
bytea stamcv; /* MCV list (serialized) */
bytea stahist; /* MV histogram (serialized) */
+ float8 standcoeff; /* ndistinct coefficient */
#endif
} FormData_pg_mv_statistic;
@@ -72,7 +75,7 @@ typedef FormData_pg_mv_statistic *Form_pg_mv_statistic;
* compiler constants for pg_mv_statistic
* ----------------
*/
-#define Natts_pg_mv_statistic 15
+#define Natts_pg_mv_statistic 19
#define Anum_pg_mv_statistic_starelid 1
#define Anum_pg_mv_statistic_staname 2
#define Anum_pg_mv_statistic_stanamespace 3
@@ -80,14 +83,17 @@ typedef FormData_pg_mv_statistic *Form_pg_mv_statistic;
#define Anum_pg_mv_statistic_deps_enabled 5
#define Anum_pg_mv_statistic_mcv_enabled 6
#define Anum_pg_mv_statistic_hist_enabled 7
-#define Anum_pg_mv_statistic_mcv_max_items 8
-#define Anum_pg_mv_statistic_hist_max_buckets 9
-#define Anum_pg_mv_statistic_deps_built 10
-#define Anum_pg_mv_statistic_mcv_built 11
-#define Anum_pg_mv_statistic_hist_built 12
-#define Anum_pg_mv_statistic_stakeys 13
-#define Anum_pg_mv_statistic_stadeps 14
-#define Anum_pg_mv_statistic_stamcv 15
-#define Anum_pg_mv_statistic_stahist 16
+#define Anum_pg_mv_statistic_ndist_enabled 8
+#define Anum_pg_mv_statistic_mcv_max_items 9
+#define Anum_pg_mv_statistic_hist_max_buckets 10
+#define Anum_pg_mv_statistic_deps_built 11
+#define Anum_pg_mv_statistic_mcv_built 12
+#define Anum_pg_mv_statistic_hist_built 13
+#define Anum_pg_mv_statistic_ndist_built 14
+#define Anum_pg_mv_statistic_stakeys 15
+#define Anum_pg_mv_statistic_stadeps 16
+#define Anum_pg_mv_statistic_stamcv 17
+#define Anum_pg_mv_statistic_stahist 18
+#define Anum_pg_mv_statistic_standist 19
#endif /* PG_MV_STATISTIC_H */
diff --git a/src/include/nodes/relation.h b/src/include/nodes/relation.h
index 8c50bfb..1923f2b 100644
--- a/src/include/nodes/relation.h
+++ b/src/include/nodes/relation.h
@@ -655,11 +655,13 @@ typedef struct MVStatisticInfo
bool deps_enabled; /* functional dependencies enabled */
bool mcv_enabled; /* MCV list enabled */
bool hist_enabled; /* histogram enabled */
+ bool ndist_enabled; /* ndistinct coefficient enabled */
/* built/available statistics */
bool deps_built; /* functional dependencies built */
bool mcv_built; /* MCV list built */
bool hist_built; /* histogram built */
+ bool ndist_built; /* ndistinct coefficient built */
/* columns in the statistics (attnums) */
int2vector *stakeys; /* attnums of the columns covered */
diff --git a/src/include/utils/mvstats.h b/src/include/utils/mvstats.h
index 35b2f8e..fb2c5d8 100644
--- a/src/include/utils/mvstats.h
+++ b/src/include/utils/mvstats.h
@@ -225,6 +225,7 @@ typedef MVSerializedHistogramData *MVSerializedHistogram;
MVDependencies load_mv_dependencies(Oid mvoid);
MCVList load_mv_mcvlist(Oid mvoid);
MVSerializedHistogram load_mv_histogram(Oid mvoid);
+double load_mv_ndistinct(Oid mvoid);
bytea * serialize_mv_dependencies(MVDependencies dependencies);
bytea * serialize_mv_mcvlist(MCVList mcvlist, int2vector *attrs,
@@ -266,11 +267,17 @@ MVHistogram
build_mv_histogram(int numrows, HeapTuple *rows, int2vector *attrs,
VacAttrStats **stats, int numrows_total);
-void build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
+double
+build_mv_ndistinct(double totalrows, int numrows, HeapTuple *rows,
+ int2vector *attrs, VacAttrStats **stats);
+
+void build_mv_stats(Relation onerel, double totalrows,
+ int numrows, HeapTuple *rows,
int natts, VacAttrStats **vacattrstats);
void update_mv_stats(Oid relid, MVDependencies dependencies,
MCVList mcvlist, MVHistogram histogram,
+ double ndistcoeff,
int2vector *attrs, VacAttrStats **stats);
#ifdef DEBUG_MVHIST
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 1a1a4ca..0ad935e 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1377,7 +1377,8 @@ pg_mv_stats| SELECT n.nspname AS schemaname,
length(s.stamcv) AS mcvbytes,
pg_mv_stats_mcvlist_info(s.stamcv) AS mcvinfo,
length(s.stahist) AS histbytes,
- pg_mv_stats_histogram_info(s.stahist) AS histinfo
+ pg_mv_stats_histogram_info(s.stahist) AS histinfo,
+ s.standcoeff AS ndcoeff
FROM ((pg_mv_statistic s
JOIN pg_class c ON ((c.oid = s.starelid)))
LEFT JOIN pg_namespace n ON ((n.oid = c.relnamespace)));
--
2.1.0
0008-change-how-we-apply-selectivity-to-number-of-groups-.patch
From 3e6238b1651b37c0fc3f1dbad6be3c5bdbae5be8 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@pgaddict.com>
Date: Tue, 26 Jan 2016 18:14:33 +0100
Subject: [PATCH 8/9] change how we apply selectivity to number of groups
estimate
Instead of simply multiplying the ndistinct estimate with selectivity,
we instead use the formula for the expected number of distinct values
observed in 'k' rows when there are 'd' distinct values in the bin
d * (1 - ((d - 1) / d)^k)
This is 'with replacement', which seems appropriate for this use, and it
mostly assumes uniform distribution of the distinct values. So if the
distribution is not uniform (e.g. there are very frequent groups) this
may be less accurate than the current algorithm in some cases, giving
over-estimates. But that's probably better than OOM.
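
For illustration (hypothetical numbers): with d = 1000 distinct values in
the bin and k = 100 rows, the formula gives

    1000 * (1 - (999/1000)^100) ~= 95

expected distinct values, whereas simple multiplication by the selectivity
(say, 100 rows out of 1M tuples) would estimate 1000 * 0.0001 = 0.1.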
---
src/backend/utils/adt/selfuncs.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index a84dd2b..ce3ad19 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -3465,7 +3465,7 @@ estimate_num_groups(PlannerInfo *root, List *groupExprs, double input_rows,
/*
* Multiply by restriction selectivity.
*/
- reldistinct *= rel->rows / rel->tuples;
+ reldistinct = reldistinct * (1 - powl((reldistinct - 1) / reldistinct, rel->rows));
/*
* Update estimate of total distinct groups.
--
2.1.0
0009-fixup-of-regression-tests-plans-changes-by-group-by-.patch
From 82510eec9a98e24bc86deb313f3c031d54420996 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@pgaddict.com>
Date: Sun, 28 Feb 2016 21:16:40 +0100
Subject: [PATCH 9/9] fixup of regression tests (plans changes by group by
estimation)
---
src/test/regress/expected/join.out | 18 ++++++++++--------
src/test/regress/expected/subselect.out | 25 +++++++++++--------------
src/test/regress/expected/union.out | 16 ++++++++--------
3 files changed, 29 insertions(+), 30 deletions(-)
diff --git a/src/test/regress/expected/join.out b/src/test/regress/expected/join.out
index cafbc5e..151402d 100644
--- a/src/test/regress/expected/join.out
+++ b/src/test/regress/expected/join.out
@@ -3965,18 +3965,20 @@ select d.* from d left join (select * from b group by b.id, b.c_id) s
explain (costs off)
select d.* from d left join (select distinct * from b) s
on d.a = s.id;
- QUERY PLAN
---------------------------------------
+ QUERY PLAN
+---------------------------------------------
Merge Right Join
- Merge Cond: (b.id = d.a)
- -> Unique
- -> Sort
- Sort Key: b.id, b.c_id
- -> Seq Scan on b
+ Merge Cond: (s.id = d.a)
+ -> Sort
+ Sort Key: s.id
+ -> Subquery Scan on s
+ -> HashAggregate
+ Group Key: b.id, b.c_id
+ -> Seq Scan on b
-> Sort
Sort Key: d.a
-> Seq Scan on d
-(9 rows)
+(11 rows)
-- check join removal works when uniqueness of the join condition is enforced
-- by a UNION
diff --git a/src/test/regress/expected/subselect.out b/src/test/regress/expected/subselect.out
index de64ca7..0fc93d9 100644
--- a/src/test/regress/expected/subselect.out
+++ b/src/test/regress/expected/subselect.out
@@ -807,27 +807,24 @@ select * from int4_tbl where
explain (verbose, costs off)
select * from int4_tbl o where (f1, f1) in
(select f1, generate_series(1,2) / 10 g from int4_tbl i group by f1);
- QUERY PLAN
-----------------------------------------------------------------------
- Hash Join
+ QUERY PLAN
+----------------------------------------------------------------
+ Hash Semi Join
Output: o.f1
Hash Cond: (o.f1 = "ANY_subquery".f1)
-> Seq Scan on public.int4_tbl o
Output: o.f1
-> Hash
Output: "ANY_subquery".f1, "ANY_subquery".g
- -> HashAggregate
+ -> Subquery Scan on "ANY_subquery"
Output: "ANY_subquery".f1, "ANY_subquery".g
- Group Key: "ANY_subquery".f1, "ANY_subquery".g
- -> Subquery Scan on "ANY_subquery"
- Output: "ANY_subquery".f1, "ANY_subquery".g
- Filter: ("ANY_subquery".f1 = "ANY_subquery".g)
- -> HashAggregate
- Output: i.f1, (generate_series(1, 2) / 10)
- Group Key: i.f1
- -> Seq Scan on public.int4_tbl i
- Output: i.f1
-(18 rows)
+ Filter: ("ANY_subquery".f1 = "ANY_subquery".g)
+ -> HashAggregate
+ Output: i.f1, (generate_series(1, 2) / 10)
+ Group Key: i.f1
+ -> Seq Scan on public.int4_tbl i
+ Output: i.f1
+(15 rows)
select * from int4_tbl o where (f1, f1) in
(select f1, generate_series(1,2) / 10 g from int4_tbl i group by f1);
diff --git a/src/test/regress/expected/union.out b/src/test/regress/expected/union.out
index 016571b..f2e297e 100644
--- a/src/test/regress/expected/union.out
+++ b/src/test/regress/expected/union.out
@@ -263,16 +263,16 @@ ORDER BY 1;
SELECT q2 FROM int8_tbl INTERSECT SELECT q1 FROM int8_tbl;
q2
------------------
- 4567890123456789
123
+ 4567890123456789
(2 rows)
SELECT q2 FROM int8_tbl INTERSECT ALL SELECT q1 FROM int8_tbl;
q2
------------------
+ 123
4567890123456789
4567890123456789
- 123
(3 rows)
SELECT q2 FROM int8_tbl EXCEPT SELECT q1 FROM int8_tbl ORDER BY 1;
@@ -305,16 +305,16 @@ SELECT q1 FROM int8_tbl EXCEPT SELECT q2 FROM int8_tbl;
SELECT q1 FROM int8_tbl EXCEPT ALL SELECT q2 FROM int8_tbl;
q1
------------------
- 4567890123456789
123
+ 4567890123456789
(2 rows)
SELECT q1 FROM int8_tbl EXCEPT ALL SELECT DISTINCT q2 FROM int8_tbl;
q1
------------------
+ 123
4567890123456789
4567890123456789
- 123
(3 rows)
SELECT q1 FROM int8_tbl EXCEPT ALL SELECT q1 FROM int8_tbl FOR NO KEY UPDATE;
@@ -343,8 +343,8 @@ SELECT f1 FROM float8_tbl EXCEPT SELECT f1 FROM int4_tbl ORDER BY 1;
SELECT q1 FROM int8_tbl INTERSECT SELECT q2 FROM int8_tbl UNION ALL SELECT q2 FROM int8_tbl;
q1
-------------------
- 4567890123456789
123
+ 4567890123456789
456
4567890123456789
123
@@ -355,15 +355,15 @@ SELECT q1 FROM int8_tbl INTERSECT SELECT q2 FROM int8_tbl UNION ALL SELECT q2 FR
SELECT q1 FROM int8_tbl INTERSECT (((SELECT q2 FROM int8_tbl UNION ALL SELECT q2 FROM int8_tbl)));
q1
------------------
- 4567890123456789
123
+ 4567890123456789
(2 rows)
(((SELECT q1 FROM int8_tbl INTERSECT SELECT q2 FROM int8_tbl))) UNION ALL SELECT q2 FROM int8_tbl;
q1
-------------------
- 4567890123456789
123
+ 4567890123456789
456
4567890123456789
123
@@ -419,8 +419,8 @@ HINT: There is a column named "q2" in table "*SELECT* 2", but it cannot be refe
SELECT q1 FROM int8_tbl EXCEPT (((SELECT q2 FROM int8_tbl ORDER BY q2 LIMIT 1)));
q1
------------------
- 4567890123456789
123
+ 4567890123456789
(2 rows)
--
--
2.1.0
On Wed, Mar 9, 2016 at 7:02 AM, Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:
Hi,
thanks for the feedback. Attached is v14 of the patch series, fixing
most of the points you've raised.
Hi Tomas,
Applied to aa09cd242fa7e3a694a31f, I still get the seg faults in make
check if I configure without --enable-cassert.
With --enable-cassert, it passes the regression test.
I got the core file, configured and compiled with:
CFLAGS="-fno-omit-frame-pointer" --enable-debug
The first core dump is on this statement:
-- check explain (expect bitmap index scan, not plain index scan)
INSERT INTO functional_dependencies
SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
bt
#0 0x00000000006e1160 in cost_qual_eval (cost=0x2494418,
quals=0x2495550, root=0x2541b88) at costsize.c:3181
#1 0x00000000006e1ee5 in set_baserel_size_estimates (root=0x2541b88,
rel=0x2494300) at costsize.c:3754
#2 0x00000000006d37e8 in set_plain_rel_size (root=0x2541b88,
rel=0x2494300, rte=0x247e660) at allpaths.c:480
#3 0x00000000006d353d in set_rel_size (root=0x2541b88, rel=0x2494300,
rti=1, rte=0x247e660) at allpaths.c:350
#4 0x00000000006d338f in set_base_rel_sizes (root=0x2541b88) at allpaths.c:270
#5 0x00000000006d3233 in make_one_rel (root=0x2541b88,
joinlist=0x2494628) at allpaths.c:169
#6 0x000000000070012e in query_planner (root=0x2541b88,
tlist=0x2541e58, qp_callback=0x7048d4 <standard_qp_callback>,
qp_extra=0x7ffefa6474e0)
at planmain.c:246
#7 0x0000000000702a33 in grouping_planner (root=0x2541b88,
inheritance_update=0 '\000', tuple_fraction=0) at planner.c:1647
#8 0x0000000000701310 in subquery_planner (glob=0x2541af8,
parse=0x246a838, parent_root=0x0, hasRecursion=0 '\000',
tuple_fraction=0) at planner.c:740
#9 0x000000000070055b in standard_planner (parse=0x246a838,
cursorOptions=256, boundParams=0x0) at planner.c:290
#10 0x000000000070023f in planner (parse=0x246a838, cursorOptions=256,
boundParams=0x0) at planner.c:160
#11 0x00000000007b8bf9 in pg_plan_query (querytree=0x246a838,
cursorOptions=256, boundParams=0x0) at postgres.c:798
#12 0x00000000005d1967 in ExplainOneQuery (query=0x246a838, into=0x0,
es=0x246a778,
queryString=0x2443d80 "EXPLAIN (COSTS off)\n SELECT * FROM
mcv_list WHERE a = 10 AND b = 5;", params=0x0) at explain.c:350
#13 0x00000000005d16a3 in ExplainQuery (stmt=0x2444f90,
queryString=0x2443d80 "EXPLAIN (COSTS off)\n SELECT * FROM mcv_list
WHERE a = 10 AND b = 5;",
params=0x0, dest=0x246a6e8) at explain.c:244
#14 0x00000000007c0afb in standard_ProcessUtility (parsetree=0x2444f90,
queryString=0x2443d80 "EXPLAIN (COSTS off)\n SELECT * FROM
mcv_list WHERE a = 10 AND b = 5;", context=PROCESS_UTILITY_TOPLEVEL,
params=0x0,
dest=0x246a6e8, completionTag=0x7ffefa647b60 "") at utility.c:659
#15 0x00000000007c0299 in ProcessUtility (parsetree=0x2444f90,
queryString=0x2443d80 "EXPLAIN (COSTS off)\n SELECT * FROM mcv_list
WHERE a = 10 AND b = 5;",
context=PROCESS_UTILITY_TOPLEVEL, params=0x0, dest=0x246a6e8,
completionTag=0x7ffefa647b60 "") at utility.c:335
#16 0x00000000007bf47b in PortalRunUtility (portal=0x23ed510,
utilityStmt=0x2444f90, isTopLevel=1 '\001', dest=0x246a6e8,
completionTag=0x7ffefa647b60 "")
at pquery.c:1183
#17 0x00000000007bf1ce in FillPortalStore (portal=0x23ed510,
isTopLevel=1 '\001') at pquery.c:1057
#18 0x00000000007beb19 in PortalRun (portal=0x23ed510,
count=9223372036854775807, isTopLevel=1 '\001', dest=0x253f6c0,
altdest=0x253f6c0,
completionTag=0x7ffefa647d40 "") at pquery.c:781
#19 0x00000000007b90ae in exec_simple_query (query_string=0x2443d80
"EXPLAIN (COSTS off)\n SELECT * FROM mcv_list WHERE a = 10 AND b =
5;")
at postgres.c:1094
#20 0x00000000007bcfac in PostgresMain (argc=1, argv=0x23d5070,
dbname=0x23d4e48 "regression", username=0x23d4e30 "jjanes") at
postgres.c:4021
#21 0x0000000000745a62 in BackendRun (port=0x23f4110) at postmaster.c:4258
#22 0x00000000007451d6 in BackendStartup (port=0x23f4110) at postmaster.c:3932
#23 0x0000000000741ab7 in ServerLoop () at postmaster.c:1690
#24 0x00000000007411c0 in PostmasterMain (argc=8, argv=0x23d3f20) at
postmaster.c:1298
#25 0x0000000000690026 in main (argc=8, argv=0x23d3f20) at main.c:223
Cheers,
Jeff
Hi,
On Wed, 2016-03-09 at 08:45 -0800, Jeff Janes wrote:
On Wed, Mar 9, 2016 at 7:02 AM, Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:
Hi,
thanks for the feedback. Attached is v14 of the patch series, fixing
most of the points you've raised.
Hi Tomas,
Applied to aa09cd242fa7e3a694a31f, I still get the seg faults in make
check if I configure without --enable-cassert.
Ah, after disabling asserts I can reproduce it too. And the reason why
it fails is quite simple - clauselist_selectivity modifies the original
list of clauses, which then confuses cost_qual_eval.
Can you check whether the attached patch fixes the issue? I'll need to rework a
bit more of the code, but let's see if this fixes the issue on your
machine too.
With --enable-cassert, it passes the regression test.
I wonder how it can work with casserts and fail without them. That's
exactly the opposite of what I'd expect ...
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachments:
mvstats-segfault-fix.patch (text/x-patch)
diff --git a/src/backend/optimizer/path/clausesel.c b/src/backend/optimizer/path/clausesel.c
index 2540da9..ddfdc3b 100644
--- a/src/backend/optimizer/path/clausesel.c
+++ b/src/backend/optimizer/path/clausesel.c
@@ -279,6 +279,10 @@ clauselist_selectivity(PlannerInfo *root,
List *solution = choose_mv_statistics(root, relid, stats,
clauses, conditions);
+ /* FIXME we must not scribble over the original list */
+ if (solution)
+ clauses = list_copy(clauses);
+
/*
* We have a good solution, which is merely a list of statistics that
* we need to apply. We'll apply the statistics one by one (in the order
On Wed, Mar 9, 2016 at 9:21 AM, Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:
Hi,
On Wed, 2016-03-09 at 08:45 -0800, Jeff Janes wrote:
On Wed, Mar 9, 2016 at 7:02 AM, Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:
Hi,
thanks for the feedback. Attached is v14 of the patch series, fixing
most of the points you've raised.
Hi Tomas,
Applied to aa09cd242fa7e3a694a31f, I still get the seg faults in make
check if I configure without --enable-cassert.
Ah, after disabling asserts I can reproduce it too. And the reason why
it fails is quite simple - clauselist_selectivity modifies the original
list of clauses, which then confuses cost_qual_eval.
Can you check whether the attached patch fixes the issue? I'll need to rework a
bit more of the code, but let's see if this fixes the issue on your
machine too.
Yes, that fixes it.
With --enable-cassert, it passes the regression test.
I wonder how it can work with casserts and fail without them. That's
exactly the opposite of what I'd expect ...
I too was surprised by that. Maybe cassert makes a copy of some data
structure which is used in-place without cassert?
Thanks,
Jeff
On Wed, 2016-03-09 at 18:21 +0100, Tomas Vondra wrote:
Hi,
On Wed, 2016-03-09 at 08:45 -0800, Jeff Janes wrote:
On Wed, Mar 9, 2016 at 7:02 AM, Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:
Hi,
thanks for the feedback. Attached is v14 of the patch series, fixing
most of the points you've raised.
Hi Tomas,
Applied to aa09cd242fa7e3a694a31f, I still get the seg faults in make
check if I configure without --enable-cassert.
Ah, after disabling asserts I can reproduce it too. And the reason why
it fails is quite simple - clauselist_selectivity modifies the original
list of clauses, which then confuses cost_qual_eval.
More precisely, it gets confused because the first clause in the list
gets deleted, but cost_qual_eval never learns about that and follows a
stale pointer to the next cell - hence the segfault.
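For illustration, a toy standalone sketch of that failure mode (plain C
with a hand-rolled list, not the actual PostgreSQL List API):

#include <stdio.h>
#include <stdlib.h>

typedef struct Cell
{
	int		value;
	struct Cell *next;
} Cell;

static Cell *
cons(int value, Cell *next)
{
	Cell   *c = malloc(sizeof(Cell));

	c->value = value;
	c->next = next;
	return c;
}

/* stands in for clauselist_selectivity scribbling on the shared list */
static Cell *
drop_head(Cell *list)
{
	Cell   *rest = list->next;

	free(list);		/* the caller's copy of the pointer now dangles */
	return rest;
}

int
main(void)
{
	Cell   *clauses = cons(1, cons(2, cons(3, NULL)));
	Cell   *head = clauses;		/* cost_qual_eval's view of the list */
	Cell   *c;

	clauses = drop_head(clauses);

	/* iterating from the stale head reads freed memory - it may crash,
	 * or it may appear to work depending on what happens to occupy the
	 * freed chunk, which is how a build option can mask or expose it */
	for (c = head; c != NULL; c = c->next)
		printf("%d\n", c->value);

	return 0;
}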
Can you check whether the attached patch fixes the issue? I'll need to rework a
bit more of the code, but let's see if this fixes the issue on your
machine too.
With --enable-cassert, it passes the regression test.
I wonder how it can work with casserts and fail without them. That's
exactly the opposite of what I'd expect ...
FWIW it seems to be somehow related to this assert in clausesel.c:
Assert(count_mv_attnums(list_union(stat_clauses, stat_conditions),
relid, MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST) >= 2);
With the assert in place, the code passes without a failure. After
removing the assert (commenting it out), or even just changing it to
Assert(count_mv_attnums(stat_clauses, relid,
MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST)
+ count_mv_attnums(stat_conditions, relid,
MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST) >= 2);
i.e. removing the list_union, it fails as expected.
The only thing that I can think of is that list_union happens to place
the right stuff at the right position in memory - pure luck.
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Wed, Mar 9, 2016 at 9:21 AM, Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:
Hi,
On Wed, 2016-03-09 at 08:45 -0800, Jeff Janes wrote:
On Wed, Mar 9, 2016 at 7:02 AM, Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:
Hi,
thanks for the feedback. Attached is v14 of the patch series, fixing
most of the points you've raised.
Hi Tomas,
Applied to aa09cd242fa7e3a694a31f, I still get the seg faults in make
check if I configure without --enable-cassert.
Ah, after disabling asserts I can reproduce it too. And the reason why
it fails is quite simple - clauselist_selectivity modifies the original
list of clauses, which then confuses cost_qual_eval.
Can you check whether the attached patch fixes the issue? I'll need to rework a
bit more of the code, but let's see if this fixes the issue on your
machine too.
That patch on top of v14 did fix the original problem. But I got
another segfault:
jjanes=# create table foo as select x, floor(x/(10000000/500))::int as
y from generate_series(1,10000000) f(x);
jjanes=# create index on foo (x,y);
jjanes=# create index on foo (y,x);
jjanes=# create statistics jjj on foo (x,y) with (dependencies,histogram);
jjanes=# analyze ;
server closed the connection unexpectedly
#0 multi_sort_add_dimension (mss=mss@entry=0x7f45dafc7c88,
sortdim=sortdim@entry=0, dim=dim@entry=0,
vacattrstats=vacattrstats@entry=0x16f0dd0) at common.c:436
#1 0x00000000007d022a in update_bucket_ndistinct (attrs=0x166fdf8,
stats=0x16f0dd0, bucket=<optimized out>) at histogram.c:1384
#2 0x00000000007d09aa in create_initial_mv_bucket (stats=0x16f0dd0,
attrs=0x166fdf8, rows=0x17cda20, numrows=30000) at histogram.c:880
#3 build_mv_histogram (numrows=30000, rows=rows@entry=0x170ecf0,
attrs=attrs@entry=0x166fdf8, stats=stats@entry=0x16f0dd0,
numrows_total=numrows_total@entry=30000)
at histogram.c:156
#4 0x00000000007ced19 in build_mv_stats
(onerel=onerel@entry=0x7f45e797d040, totalrows=9999985,
numrows=numrows@entry=30000, rows=rows@entry=0x170ecf0,
natts=natts@entry=2,
vacattrstats=vacattrstats@entry=0x166efa0) at common.c:106
#5 0x000000000055ff6b in do_analyze_rel
(onerel=onerel@entry=0x7f45e797d040, options=options@entry=2,
va_cols=va_cols@entry=0x0, acquirefunc=<optimized out>,
relpages=44248,
inh=inh@entry=0 '\000', in_outer_xact=in_outer_xact@entry=0
'\000', elevel=elevel@entry=13, params=0x7ffcbe382a30) at
analyze.c:585
#6 0x0000000000560ced in analyze_rel (relid=relid@entry=16441,
relation=relation@entry=0x16bc9d0, options=options@entry=2,
params=params@entry=0x7ffcbe382a30,
va_cols=va_cols@entry=0x0, in_outer_xact=<optimized out>,
bstrategy=0x16640f0) at analyze.c:262
#7 0x00000000005b70fd in vacuum (options=2, relation=0x16bc9d0,
relid=relid@entry=0, params=params@entry=0x7ffcbe382a30, va_cols=0x0,
bstrategy=<optimized out>,
bstrategy@entry=0x0, isTopLevel=isTopLevel@entry=1 '\001') at vacuum.c:313
#8 0x00000000005b748e in ExecVacuum (vacstmt=vacstmt@entry=0x16bca20,
isTopLevel=isTopLevel@entry=1 '\001') at vacuum.c:121
#9 0x00000000006c90f3 in standard_ProcessUtility
(parsetree=0x16bca20, queryString=0x16bbfc0 "analyze foo ;",
context=<optimized out>, params=0x0, dest=0x16bcd60,
completionTag=0x7ffcbe382fa0 "") at utility.c:654
#10 0x00007f45e413b1d1 in pgss_ProcessUtility (parsetree=0x16bca20,
queryString=0x16bbfc0 "analyze foo ;",
context=PROCESS_UTILITY_TOPLEVEL, params=0x0, dest=0x16bcd60,
completionTag=0x7ffcbe382fa0 "") at pg_stat_statements.c:986
#11 0x00000000006c6841 in PortalRunUtility (portal=0x16f7700,
utilityStmt=0x16bca20, isTopLevel=<optimized out>, dest=0x16bcd60,
completionTag=0x7ffcbe382fa0 "") at pquery.c:1175
#12 0x00000000006c73c5 in PortalRunMulti
(portal=portal@entry=0x16f7700, isTopLevel=isTopLevel@entry=1 '\001',
dest=dest@entry=0x16bcd60, altdest=altdest@entry=0x16bcd60,
completionTag=completionTag@entry=0x7ffcbe382fa0 "") at pquery.c:1306
#13 0x00000000006c7dd9 in PortalRun (portal=portal@entry=0x16f7700,
count=count@entry=9223372036854775807, isTopLevel=isTopLevel@entry=1
'\001', dest=dest@entry=0x16bcd60,
altdest=altdest@entry=0x16bcd60,
completionTag=completionTag@entry=0x7ffcbe382fa0 "") at pquery.c:813
#14 0x00000000006c5c98 in exec_simple_query (query_string=0x16bbfc0
"analyze foo ;") at postgres.c:1094
#15 PostgresMain (argc=<optimized out>, argv=argv@entry=0x164baf8,
dbname=0x164b9a8 "jjanes", username=<optimized out>) at
postgres.c:4021
#16 0x000000000047cb1e in BackendRun (port=0x1669d40) at postmaster.c:4258
#17 BackendStartup (port=0x1669d40) at postmaster.c:3932
#18 ServerLoop () at postmaster.c:1690
#19 0x000000000066ff27 in PostmasterMain (argc=argc@entry=1,
argv=argv@entry=0x164aa10) at postmaster.c:1298
#20 0x000000000047d35e in main (argc=1, argv=0x164aa10) at main.c:228
Cheers,
Jeff
On Sat, 2016-03-12 at 23:30 -0800, Jeff Janes wrote:
On Wed, Mar 9, 2016 at 9:21 AM, Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:
Hi,
On Wed, 2016-03-09 at 08:45 -0800, Jeff Janes wrote:
On Wed, Mar 9, 2016 at 7:02 AM, Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:
Hi,
thanks for the feedback. Attached is v14 of the patch series, fixing
most of the points you've raised.
Hi Tomas,
Applied to aa09cd242fa7e3a694a31f, I still get the seg faults in make
check if I configure without --enable-cassert.
Ah, after disabling asserts I can reproduce it too. And the reason why
it fails is quite simple - clauselist_selectivity modifies the original
list of clauses, which then confuses cost_qual_eval.
Can you check whether the attached patch fixes the issue? I'll need to rework a
bit more of the code, but let's see if this fixes the issue on your
machine too.
That patch on top of v14 did fix the original problem. But I got
another segfault:
Oh, yeah. There was an extra pfree().
Attached is v15 of the patch series, fixing this and also doing quite a
few additional improvements:
* added some basic examples into the SGML documentation
* addressing the objectaddress omissions, as pointed out by Alvaro
* support for ALTER STATISTICS ... OWNER TO / RENAME / SET SCHEMA
* significant refactoring of MCV and histogram code, particularly
serialization, deserialization and building
* reworking the functional dependencies to support more complex
dependencies, with multiple columns as 'conditions'
* the reduction using functional dependencies is also significantly
simplified (I decided to get rid of computing the transitive closure
for now - it got too complex after the multi-condition dependencies,
so I'll leave that for the future)
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachments:
0008-change-how-we-apply-selectivity-to-number-of-groups-.patch (text/x-patch)
From 494a31e1ed7976e0f965a32e81c769e1c3dfad66 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@pgaddict.com>
Date: Tue, 26 Jan 2016 18:14:33 +0100
Subject: [PATCH 8/9] change how we apply selectivity to number of groups
estimate
Instead of simply multiplying the ndistinct estimate by the selectivity,
we use the formula for the expected number of distinct values
observed in 'k' rows when there are 'd' distinct values in the bin
d * (1 - ((d - 1) / d)^k)
This is 'with replacement', which seems appropriate for this use, and it
assumes a roughly uniform distribution of the distinct values. So if the
distribution is not uniform (e.g. there are very frequent groups) this
may be less accurate than the current algorithm in some cases, giving
over-estimates. But that's probably better than the OOM risk that
under-estimates cause (e.g. in hash aggregates).
---
src/backend/utils/adt/selfuncs.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index f8d39aa..6eceedf 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -3466,7 +3466,7 @@ estimate_num_groups(PlannerInfo *root, List *groupExprs, double input_rows,
/*
* Multiply by restriction selectivity.
*/
- reldistinct *= rel->rows / rel->tuples;
+ reldistinct = reldistinct * (1 - powl((reldistinct - 1) / reldistinct, rel->rows));
/*
* Update estimate of total distinct groups.
--
2.5.0
0007-multivariate-ndistinct-coefficients.patch (text/x-patch)
From 1b905c77e851d34229da72c2a84107fa0925f54a Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@pgaddict.com>
Date: Wed, 23 Dec 2015 02:07:58 +0100
Subject: [PATCH 7/9] multivariate ndistinct coefficients
---
doc/src/sgml/ref/create_statistics.sgml | 9 ++
src/backend/catalog/system_views.sql | 3 +-
src/backend/commands/analyze.c | 2 +-
src/backend/commands/statscmds.c | 11 +-
src/backend/optimizer/path/clausesel.c | 4 +
src/backend/optimizer/util/plancat.c | 4 +-
src/backend/utils/adt/selfuncs.c | 93 +++++++++++++++-
src/backend/utils/mvstats/Makefile | 2 +-
src/backend/utils/mvstats/README.ndistinct | 83 ++++++++++++++
src/backend/utils/mvstats/README.stats | 2 +
src/backend/utils/mvstats/common.c | 23 +++-
src/backend/utils/mvstats/mvdist.c | 171 +++++++++++++++++++++++++++++
src/include/catalog/pg_mv_statistic.h | 26 +++--
src/include/nodes/relation.h | 2 +
src/include/utils/mvstats.h | 9 +-
src/test/regress/expected/rules.out | 3 +-
16 files changed, 424 insertions(+), 23 deletions(-)
create mode 100644 src/backend/utils/mvstats/README.ndistinct
create mode 100644 src/backend/utils/mvstats/mvdist.c
diff --git a/doc/src/sgml/ref/create_statistics.sgml b/doc/src/sgml/ref/create_statistics.sgml
index f7336fd..80e472f 100644
--- a/doc/src/sgml/ref/create_statistics.sgml
+++ b/doc/src/sgml/ref/create_statistics.sgml
@@ -168,6 +168,15 @@ CREATE STATISTICS [ IF NOT EXISTS ] <replaceable class="PARAMETER">statistics_na
</listitem>
</varlistentry>
+ <varlistentry>
+ <term><literal>ndistinct</> (<type>boolean</>)</term>
+ <listitem>
+ <para>
+ Enables ndistinct coefficients for the statistics.
+ </para>
+ </listitem>
+ </varlistentry>
+
</variablelist>
</refsect2>
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index b151db1..8d2b435 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -169,7 +169,8 @@ CREATE VIEW pg_mv_stats AS
length(S.stamcv) AS mcvbytes,
pg_mv_stats_mcvlist_info(S.stamcv) AS mcvinfo,
length(S.stahist) AS histbytes,
- pg_mv_stats_histogram_info(S.stahist) AS histinfo
+ pg_mv_stats_histogram_info(S.stahist) AS histinfo,
+ standcoeff AS ndcoeff
FROM (pg_mv_statistic S JOIN pg_class C ON (C.oid = S.starelid))
LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace);
diff --git a/src/backend/commands/analyze.c b/src/backend/commands/analyze.c
index 9087532..c29f1be 100644
--- a/src/backend/commands/analyze.c
+++ b/src/backend/commands/analyze.c
@@ -582,7 +582,7 @@ do_analyze_rel(Relation onerel, int options, VacuumParams *params,
}
/* Build multivariate stats (if there are any). */
- build_mv_stats(onerel, numrows, rows, attr_cnt, vacattrstats);
+ build_mv_stats(onerel, totalrows, numrows, rows, attr_cnt, vacattrstats);
}
/*
diff --git a/src/backend/commands/statscmds.c b/src/backend/commands/statscmds.c
index e0b085f..a7c569d 100644
--- a/src/backend/commands/statscmds.c
+++ b/src/backend/commands/statscmds.c
@@ -72,7 +72,8 @@ CreateStatistics(CreateStatsStmt *stmt)
/* by default build nothing */
bool build_dependencies = false,
build_mcv = false,
- build_histogram = false;
+ build_histogram = false,
+ build_ndistinct = false;
int32 max_buckets = -1,
max_mcv_items = -1;
@@ -155,6 +156,8 @@ CreateStatistics(CreateStatsStmt *stmt)
if (strcmp(opt->defname, "dependencies") == 0)
build_dependencies = defGetBoolean(opt);
+ else if (strcmp(opt->defname, "ndistinct") == 0)
+ build_ndistinct = defGetBoolean(opt);
else if (strcmp(opt->defname, "mcv") == 0)
build_mcv = defGetBoolean(opt);
else if (strcmp(opt->defname, "max_mcv_items") == 0)
@@ -209,10 +212,10 @@ CreateStatistics(CreateStatsStmt *stmt)
}
/* check that at least some statistics were requested */
- if (! (build_dependencies || build_mcv || build_histogram))
+ if (! (build_dependencies || build_mcv || build_histogram || build_ndistinct))
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
- errmsg("no statistics type (dependencies, mcv, histogram) was requested")));
+ errmsg("no statistics type (dependencies, mcv, histogram, ndistinct) was requested")));
/* now do some checking of the options */
if (require_mcv && (! build_mcv))
@@ -246,6 +249,7 @@ CreateStatistics(CreateStatsStmt *stmt)
values[Anum_pg_mv_statistic_deps_enabled -1] = BoolGetDatum(build_dependencies);
values[Anum_pg_mv_statistic_mcv_enabled -1] = BoolGetDatum(build_mcv);
values[Anum_pg_mv_statistic_hist_enabled -1] = BoolGetDatum(build_histogram);
+ values[Anum_pg_mv_statistic_ndist_enabled-1] = BoolGetDatum(build_ndistinct);
values[Anum_pg_mv_statistic_mcv_max_items -1] = Int32GetDatum(max_mcv_items);
values[Anum_pg_mv_statistic_hist_max_buckets -1] = Int32GetDatum(max_buckets);
@@ -253,6 +257,7 @@ CreateStatistics(CreateStatsStmt *stmt)
nulls[Anum_pg_mv_statistic_stadeps -1] = true;
nulls[Anum_pg_mv_statistic_stamcv -1] = true;
nulls[Anum_pg_mv_statistic_stahist -1] = true;
+ nulls[Anum_pg_mv_statistic_standist -1] = true;
/* insert the tuple into pg_mv_statistic */
mvstatrel = heap_open(MvStatisticRelationId, RowExclusiveLock);
diff --git a/src/backend/optimizer/path/clausesel.c b/src/backend/optimizer/path/clausesel.c
index e06fd99..255d275 100644
--- a/src/backend/optimizer/path/clausesel.c
+++ b/src/backend/optimizer/path/clausesel.c
@@ -59,6 +59,7 @@ static void addRangeClause(RangeQueryClause **rqlist, Node *clause,
#define MV_CLAUSE_TYPE_FDEP 0x01
#define MV_CLAUSE_TYPE_MCV 0x02
#define MV_CLAUSE_TYPE_HIST 0x04
+#define MV_CLAUSE_TYPE_NDIST 0x08
static bool clause_is_mv_compatible(Node *clause, Index relid, Bitmapset **attnums,
int type);
@@ -2860,6 +2861,9 @@ stats_type_matches(MVStatisticInfo *stat, int type)
if ((type & MV_CLAUSE_TYPE_HIST) && stat->hist_built)
return true;
+ if ((type & MV_CLAUSE_TYPE_NDIST) && stat->ndist_built)
+ return true;
+
return false;
}
diff --git a/src/backend/optimizer/util/plancat.c b/src/backend/optimizer/util/plancat.c
index 2519249..3741b7a 100644
--- a/src/backend/optimizer/util/plancat.c
+++ b/src/backend/optimizer/util/plancat.c
@@ -412,7 +412,7 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
mvstat = (Form_pg_mv_statistic) GETSTRUCT(htup);
/* unavailable stats are not interesting for the planner */
- if (mvstat->deps_built || mvstat->mcv_built || mvstat->hist_built)
+ if (mvstat->deps_built || mvstat->mcv_built || mvstat->hist_built || mvstat->ndist_built)
{
info = makeNode(MVStatisticInfo);
@@ -423,11 +423,13 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
info->deps_enabled = mvstat->deps_enabled;
info->mcv_enabled = mvstat->mcv_enabled;
info->hist_enabled = mvstat->hist_enabled;
+ info->ndist_enabled = mvstat->ndist_enabled;
/* built/available statistics */
info->deps_built = mvstat->deps_built;
info->mcv_built = mvstat->mcv_built;
info->hist_built = mvstat->hist_built;
+ info->ndist_built = mvstat->ndist_built;
/* stakeys */
adatum = SysCacheGetAttr(MVSTATOID, htup,
diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index 805d633..f8d39aa 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -132,6 +132,7 @@
#include "utils/fmgroids.h"
#include "utils/index_selfuncs.h"
#include "utils/lsyscache.h"
+#include "utils/mvstats.h"
#include "utils/nabstime.h"
#include "utils/pg_locale.h"
#include "utils/rel.h"
@@ -206,6 +207,7 @@ static Const *string_to_const(const char *str, Oid datatype);
static Const *string_to_bytea_const(const char *str, size_t str_len);
static List *add_predicate_to_quals(IndexOptInfo *index, List *indexQuals);
+static Oid find_ndistinct_coeff(PlannerInfo *root, RelOptInfo *rel, List *varinfos);
/*
* eqsel - Selectivity of "=" for any data types.
@@ -3423,12 +3425,26 @@ estimate_num_groups(PlannerInfo *root, List *groupExprs, double input_rows,
* don't know by how much. We should never clamp to less than the
* largest ndistinct value for any of the Vars, though, since
* there will surely be at least that many groups.
+ *
+ * However we don't need to do this if we have ndistinct stats on
+ * the columns - in that case we can simply use the coefficient
+ * to get the (probably way more accurate) estimate.
+ *
+ * XXX Probably needs refactoring (don't like mixing the clamp
+ * and the coefficient at the same time).
*/
double clamp = rel->tuples;
+ double coeff = 1.0;
if (relvarcount > 1)
{
- clamp *= 0.1;
+ Oid oid = find_ndistinct_coeff(root, rel, varinfos);
+
+ if (oid != InvalidOid)
+ coeff = load_mv_ndistinct(oid);
+ else
+ clamp *= 0.1;
+
if (clamp < relmaxndistinct)
{
clamp = relmaxndistinct;
@@ -3437,6 +3453,13 @@ estimate_num_groups(PlannerInfo *root, List *groupExprs, double input_rows,
clamp = rel->tuples;
}
}
+
+ /*
+ * Apply ndistinct coefficient from multivar stats (we must do this
+ * before clamping the estimate in any way).
+ */
+ reldistinct /= coeff;
+
if (reldistinct > clamp)
reldistinct = clamp;
@@ -7583,3 +7606,71 @@ brincostestimate(PlannerInfo *root, IndexPath *path, double loop_count,
/* XXX what about pages_per_range? */
}
+
+/*
+ * Find ndistinct statistics applicable to the grouping columns and return
+ * its OID (the coefficient itself is loaded by the caller).
+ *
+ * Currently we only look for a perfect match, i.e. a single ndistinct
+ * statistics covering exactly the grouping columns.
+ */
+static Oid
+find_ndistinct_coeff(PlannerInfo *root, RelOptInfo *rel, List *varinfos)
+{
+ ListCell *lc;
+ Bitmapset *attnums = NULL;
+ VariableStatData vardata;
+
+ foreach(lc, varinfos)
+ {
+ GroupVarInfo *varinfo = (GroupVarInfo *) lfirst(lc);
+
+ if (varinfo->rel != rel)
+ continue;
+
+ /* FIXME handle general expressions, not only plain Vars */
+
+ /*
+ * examine the variable (or expression) so that we know which
+ * attribute we're dealing with - we need this for matching the
+ * ndistinct coefficient
+ *
+ * FIXME we could probably remember this from estimate_num_groups
+ */
+ examine_variable(root, varinfo->var, 0, &vardata);
+
+ if (HeapTupleIsValid(vardata.statsTuple))
+ {
+ Form_pg_statistic stats
+ = (Form_pg_statistic) GETSTRUCT(vardata.statsTuple);
+
+ attnums = bms_add_member(attnums, stats->staattnum);
+
+ ReleaseVariableStats(vardata);
+ }
+ }
+
+ /* look for a matching ndistinct statistics */
+ foreach (lc, rel->mvstatlist)
+ {
+ int i;
+ MVStatisticInfo *info = (MVStatisticInfo *)lfirst(lc);
+
+ /* skip statistics without ndistinct coefficient built */
+ if (!info->ndist_built)
+ continue;
+
+ /* only exact matches for now (same set of columns) */
+ if (bms_num_members(attnums) != info->stakeys->dim1)
+ continue;
+
+ /* check that all the columns match */
+ for (i = 0; i < info->stakeys->dim1; i++)
+ if (!bms_is_member(info->stakeys->values[i], attnums))
+ break;
+
+ /* some column is not covered by the clauses, try the next stats */
+ if (i < info->stakeys->dim1)
+ continue;
+
+ return info->mvoid;
+ }
+
+ return InvalidOid;
+}
diff --git a/src/backend/utils/mvstats/Makefile b/src/backend/utils/mvstats/Makefile
index 9dbb3b6..d4b88e9 100644
--- a/src/backend/utils/mvstats/Makefile
+++ b/src/backend/utils/mvstats/Makefile
@@ -12,6 +12,6 @@ subdir = src/backend/utils/mvstats
top_builddir = ../../../..
include $(top_builddir)/src/Makefile.global
-OBJS = common.o dependencies.o histogram.o mcv.o
+OBJS = common.o dependencies.o histogram.o mcv.o mvdist.o
include $(top_srcdir)/src/backend/common.mk
diff --git a/src/backend/utils/mvstats/README.ndistinct b/src/backend/utils/mvstats/README.ndistinct
new file mode 100644
index 0000000..32d1624
--- /dev/null
+++ b/src/backend/utils/mvstats/README.ndistinct
@@ -0,0 +1,83 @@
+ndistinct coefficients
+======================
+
+Estimating the number of distinct groups in a combination of columns is tricky,
+and the estimation error is often significant. By ndistinct coefficient we
+mean a ratio
+
+ q = ndistinct(a) * ndistinct(b) / ndistinct(a,b)
+
+where 'a' and 'b' are columns, ndistinct(a) is (an estimate of) the number of
+distinct values in column 'a', and ndistinct(a,b) is the same thing for the
+pair of columns.
+
+The meaning of the coefficient may be illustrated by answering the following
+question: Given a combination of columns (a,b), how many distinct values of 'b'
+match a chosen value of 'a' on average?
+
+Let's assume we know ndistinct(a) and ndistinct(a,b). Then the answer to the
+question clearly is
+
+ ndistinct(a,b) / ndistinct(a)
+
+and by using 'q' we may rewrite this as
+
+ ndistinct(b) / q
+
+so 'q' may be considered a correction factor for the ndistinct estimate, given
+a condition on one of the columns.
+
+This may be generalized to a combination of 'n' columns
+
+ [ndistinct(c1) * ... * ndistinct(cn)] / ndistinct(c1, ..., cn)
+
+and the meaning is very similar, except that we need to use conditions on (n-1)
+of the columns.
+
+
+Selectivity estimation
+----------------------
+
+As explained in the previous paragraph, ndistinct coefficients may be used to
+estimate cardinality of a column, given some a priori knowledge. Let's assume
+we need to estimate selectivity of a condition
+
+ (a=1) AND (b=2)
+
+which we can expand like this
+
+ P(a=1 & b=2) = P(a=1) * P(b=2 | a=1)
+
+Let's also assume that the distributions are uniform, i.e. that
+
+ P(a=1) = 1/ndistinct(a)
+ P(b=2) = 1/ndistinct(b)
+ P(a=1 & b=2) = 1/ndistinct(a,b)
+
+ P(b=2 | a=1) = ndistinct(a) / ndistinct(a,b)
+
+which may be rewritten like
+
+ P(b=2 | a=1)
+ = ndistinct(a) / ndistinct(a,b)
+ = (1/ndistinct(b)) * [(ndistinct(a) * ndistinct(b)) / ndistinct(a,b)]
+ = (1/ndistinct(b)) * q
+
+and therefore
+
+ P(a=1 & b=2) = (1/ndistinct(a)) * (1/ndistinct(b)) * q
+
+This also illustrates 'q' as a correction coefficient.
+
+It also explains why we store the coefficient and not simply ndistinct(a,b).
+This way we can estimate the individual clauses as usual, and then correct
+the estimate by multiplying the result by 'q' - we don't have to mess with
+ndistinct estimates at all.
+
+Naturally, as the coefficient is derived from ndistinct(a,b), it may also be
+used to estimate GROUP BY clauses on the combination of columns, replacing the
+existing heuristics in estimate_num_groups().
+
+Note: Currently only the GROUP BY estimation is implemented. It's a bit unclear
+how to implement the clause estimation when there are other statistics (esp.
+MCV lists and/or functional dependencies) available.
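+
+A short worked example (with made-up counts): assume
+
+ ndistinct(a) = 100, ndistinct(b) = 50, ndistinct(a,b) = 1000
+
+so q = (100 * 50) / 1000 = 5. The naive product estimate of the number
+of (a,b) groups is 100 * 50 = 5000; dividing it by q gives back the
+actual ndistinct(a,b) = 1000. And under the uniformity assumption
+
+ P(a=1 & b=2) = (1/100) * (1/50) * 5 = 1/1000 = 1/ndistinct(a,b)
+
+exactly as derived above.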
diff --git a/src/backend/utils/mvstats/README.stats b/src/backend/utils/mvstats/README.stats
index d404914..6d4b09b 100644
--- a/src/backend/utils/mvstats/README.stats
+++ b/src/backend/utils/mvstats/README.stats
@@ -20,6 +20,8 @@ Currently we only have two kinds of multivariate statistics
(c) multivariate histograms (README.histogram)
+ (d) ndistinct coefficients
+
Compatible clause types
-----------------------
diff --git a/src/backend/utils/mvstats/common.c b/src/backend/utils/mvstats/common.c
index f6d1074..d34d072 100644
--- a/src/backend/utils/mvstats/common.c
+++ b/src/backend/utils/mvstats/common.c
@@ -32,7 +32,8 @@ static List* list_mv_stats(Oid relid);
* and serializes them back into the catalog (as bytea values).
*/
void
-build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
+build_mv_stats(Relation onerel, double totalrows,
+ int numrows, HeapTuple *rows,
int natts, VacAttrStats **vacattrstats)
{
ListCell *lc;
@@ -53,6 +54,7 @@ build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
MVDependencies deps = NULL;
MCVList mcvlist = NULL;
MVHistogram histogram = NULL;
+ double ndist = -1;
int numrows_filtered = numrows;
VacAttrStats **stats = NULL;
@@ -92,6 +94,9 @@ build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
if (stat->deps_enabled)
deps = build_mv_dependencies(numrows, rows, attrs, stats);
+ if (stat->ndist_enabled)
+ ndist = build_mv_ndistinct(totalrows, numrows, rows, attrs, stats);
+
/* build the MCV list */
if (stat->mcv_enabled)
mcvlist = build_mv_mcvlist(numrows, rows, attrs, stats, &numrows_filtered);
@@ -101,7 +106,7 @@ build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
histogram = build_mv_histogram(numrows_filtered, rows, attrs, stats, numrows);
/* store the histogram / MCV list in the catalog */
- update_mv_stats(stat->mvoid, deps, mcvlist, histogram, attrs, stats);
+ update_mv_stats(stat->mvoid, deps, mcvlist, histogram, ndist, attrs, stats);
}
}
@@ -183,6 +188,8 @@ list_mv_stats(Oid relid)
info->mcv_built = stats->mcv_built;
info->hist_enabled = stats->hist_enabled;
info->hist_built = stats->hist_built;
+ info->ndist_enabled = stats->ndist_enabled;
+ info->ndist_built = stats->ndist_built;
result = lappend(result, info);
}
@@ -252,7 +259,7 @@ find_mv_attnums(Oid mvoid, Oid *relid)
void
update_mv_stats(Oid mvoid,
MVDependencies dependencies, MCVList mcvlist, MVHistogram histogram,
- int2vector *attrs, VacAttrStats **stats)
+ double ndistcoeff, int2vector *attrs, VacAttrStats **stats)
{
HeapTuple stup,
oldtup;
@@ -292,26 +299,36 @@ update_mv_stats(Oid mvoid,
= PointerGetDatum(data);
}
+ if (ndistcoeff > 1.0)
+ {
+ nulls[Anum_pg_mv_statistic_standist -1] = false;
+ values[Anum_pg_mv_statistic_standist-1] = Float8GetDatum(ndistcoeff);
+ }
+
/* always replace the value (either by bytea or NULL) */
replaces[Anum_pg_mv_statistic_stadeps -1] = true;
replaces[Anum_pg_mv_statistic_stamcv -1] = true;
replaces[Anum_pg_mv_statistic_stahist-1] = true;
+ replaces[Anum_pg_mv_statistic_standist-1] = true;
/* always change the availability flags */
nulls[Anum_pg_mv_statistic_deps_built -1] = false;
nulls[Anum_pg_mv_statistic_mcv_built -1] = false;
nulls[Anum_pg_mv_statistic_hist_built-1] = false;
+ nulls[Anum_pg_mv_statistic_ndist_built-1] = false;
nulls[Anum_pg_mv_statistic_stakeys-1] = false;
/* use the new attnums, in case we removed some dropped ones */
replaces[Anum_pg_mv_statistic_deps_built-1] = true;
replaces[Anum_pg_mv_statistic_mcv_built -1] = true;
+ replaces[Anum_pg_mv_statistic_ndist_built-1] = true;
replaces[Anum_pg_mv_statistic_hist_built -1] = true;
replaces[Anum_pg_mv_statistic_stakeys -1] = true;
values[Anum_pg_mv_statistic_deps_built-1] = BoolGetDatum(dependencies != NULL);
values[Anum_pg_mv_statistic_mcv_built -1] = BoolGetDatum(mcvlist != NULL);
values[Anum_pg_mv_statistic_hist_built -1] = BoolGetDatum(histogram != NULL);
+ values[Anum_pg_mv_statistic_ndist_built-1] = BoolGetDatum(ndistcoeff > 1.0);
values[Anum_pg_mv_statistic_stakeys -1] = PointerGetDatum(attrs);
/* Is there already a pg_mv_statistic tuple for this attribute? */
diff --git a/src/backend/utils/mvstats/mvdist.c b/src/backend/utils/mvstats/mvdist.c
new file mode 100644
index 0000000..59b8358
--- /dev/null
+++ b/src/backend/utils/mvstats/mvdist.c
@@ -0,0 +1,171 @@
+/*-------------------------------------------------------------------------
+ *
+ * mvdist.c
+ * POSTGRES multivariate distinct coefficients
+ *
+ *
+ * Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ * src/backend/utils/mvstats/mvdist.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include <math.h>
+
+#include "common.h"
+#include "utils/lsyscache.h"
+
+static double estimate_ndistinct(double totalrows, int numrows, int d, int f1);
+
+/*
+ * Compute ndistinct coefficient for the combination of attributes. This
+ * computes the ndistinct estimate using the same estimator used in analyze.c
+ * and then computes the coefficient.
+ */
+double
+build_mv_ndistinct(double totalrows, int numrows, HeapTuple *rows,
+ int2vector *attrs, VacAttrStats **stats)
+{
+ int i, j;
+ int f1, cnt, d;
+ int nmultiple = 0, summultiple = 0;
+ int numattrs = attrs->dim1;
+ MultiSortSupport mss = multi_sort_init(numattrs);
+ double ndistcoeff;
+
+ /*
+ * It's possible to sort the sample rows directly, but this seemed
+ * somewhat simpler / less error prone. Another option would be to
+ * allocate the arrays for each SortItem separately, but that'd be
+ * significant overhead (not just CPU, but especially memory bloat).
+ */
+ SortItem * items = (SortItem*)palloc0(numrows * sizeof(SortItem));
+
+ Datum *values = (Datum*)palloc0(sizeof(Datum) * numrows * numattrs);
+ bool *isnull = (bool*)palloc0(sizeof(bool) * numrows * numattrs);
+
+ for (i = 0; i < numrows; i++)
+ {
+ items[i].values = &values[i * numattrs];
+ items[i].isnull = &isnull[i * numattrs];
+ }
+
+ Assert(numattrs >= 2);
+
+ for (i = 0; i < numattrs; i++)
+ {
+ /* prepare the sort function for this dimension */
+ multi_sort_add_dimension(mss, i, i, stats);
+
+ /* accumulate all the data into the array and sort it */
+ for (j = 0; j < numrows; j++)
+ {
+ items[j].values[i]
+ = heap_getattr(rows[j], attrs->values[i],
+ stats[i]->tupDesc, &items[j].isnull[i]);
+ }
+ }
+
+ qsort_arg((void *) items, numrows, sizeof(SortItem),
+ multi_sort_compare, mss);
+
+ /* count number of distinct combinations */
+
+ f1 = 0;
+ cnt = 1;
+ d = 1;
+ for (i = 1; i < numrows; i++)
+ {
+ if (multi_sort_compare(&items[i], &items[i-1], mss) != 0)
+ {
+ if (cnt == 1)
+ f1 += 1;
+ else
+ {
+ nmultiple += 1;
+ summultiple += cnt;
+ }
+
+ d++;
+ cnt = 0;
+ }
+
+ cnt += 1;
+ }
+
+ if (cnt == 1)
+ f1 += 1;
+ else
+ {
+ nmultiple += 1;
+ summultiple += cnt;
+ }
+
+ ndistcoeff = 1 / estimate_ndistinct(totalrows, numrows, d, f1);
+
+ /*
+ * now count distinct values for each attribute and incrementally
+ * compute ndistinct(a,b) / (ndistinct(a) * ndistinct(b))
+ *
+ * FIXME Probably need to handle cases when one of the ndistinct
+ * estimates is negative, and also check that the combined
+ * ndistinct is greater than any of those partial values.
+ */
+ for (i = 0; i < numattrs; i++)
+ ndistcoeff *= stats[i]->stadistinct;
+
+ return ndistcoeff;
+}
+
+double
+load_mv_ndistinct(Oid mvoid)
+{
+ bool isnull = false;
+ Datum deps;
+
+ /* Fetch the pg_mv_statistic tuple for this statistics object. */
+ HeapTuple htup = SearchSysCache1(MVSTATOID, ObjectIdGetDatum(mvoid));
+
+#ifdef USE_ASSERT_CHECKING
+ Form_pg_mv_statistic mvstat = (Form_pg_mv_statistic) GETSTRUCT(htup);
+ Assert(mvstat->ndist_enabled && mvstat->ndist_built);
+#endif
+
+ deps = SysCacheGetAttr(MVSTATOID, htup,
+ Anum_pg_mv_statistic_standist, &isnull);
+
+ Assert(!isnull);
+
+ ReleaseSysCache(htup);
+
+ return DatumGetFloat8(deps);
+}
+
+/*
+ * The Duj1 estimator (already used in analyze.c):
+ *
+ * n * d / (n - f1 + f1 * n / N)
+ *
+ * where n is the sample size, N the total number of rows, d the number
+ * of distinct values in the sample and f1 the number of values that
+ * occurred only once in the sample.
+ */
+static double
+estimate_ndistinct(double totalrows, int numrows, int d, int f1)
+{
+ double numer,
+ denom,
+ ndistinct;
+
+ numer = (double) numrows *(double) d;
+
+ denom = (double) (numrows - f1) +
+ (double) f1 * (double) numrows / totalrows;
+
+ ndistinct = numer / denom;
+
+ /* Clamp to sane range in case of roundoff error */
+ if (ndistinct < (double) d)
+ ndistinct = (double) d;
+
+ if (ndistinct > totalrows)
+ ndistinct = totalrows;
+
+ return floor(ndistinct + 0.5);
+}
diff --git a/src/include/catalog/pg_mv_statistic.h b/src/include/catalog/pg_mv_statistic.h
index 7020772..e46cc6b 100644
--- a/src/include/catalog/pg_mv_statistic.h
+++ b/src/include/catalog/pg_mv_statistic.h
@@ -40,6 +40,7 @@ CATALOG(pg_mv_statistic,3381)
bool deps_enabled; /* analyze dependencies? */
bool mcv_enabled; /* build MCV list? */
bool hist_enabled; /* build histogram? */
+ bool ndist_enabled; /* build ndist coefficient? */
/* histogram / MCV size */
int32 mcv_max_items; /* max MCV items */
@@ -49,6 +50,7 @@ CATALOG(pg_mv_statistic,3381)
bool deps_built; /* dependencies were built */
bool mcv_built; /* MCV list was built */
bool hist_built; /* histogram was built */
+ bool ndist_built; /* ndistinct coeff built */
/* variable-length fields start here, but we allow direct access to stakeys */
int2vector stakeys; /* array of column keys */
@@ -57,6 +59,7 @@ CATALOG(pg_mv_statistic,3381)
bytea stadeps; /* dependencies (serialized) */
bytea stamcv; /* MCV list (serialized) */
bytea stahist; /* MV histogram (serialized) */
+ float8 standcoeff; /* ndistinct coefficient */
#endif
} FormData_pg_mv_statistic;
@@ -72,7 +75,7 @@ typedef FormData_pg_mv_statistic *Form_pg_mv_statistic;
* compiler constants for pg_mv_statistic
* ----------------
*/
-#define Natts_pg_mv_statistic 16
+#define Natts_pg_mv_statistic 19
#define Anum_pg_mv_statistic_starelid 1
#define Anum_pg_mv_statistic_staname 2
#define Anum_pg_mv_statistic_stanamespace 3
@@ -80,14 +83,17 @@ typedef FormData_pg_mv_statistic *Form_pg_mv_statistic;
#define Anum_pg_mv_statistic_deps_enabled 5
#define Anum_pg_mv_statistic_mcv_enabled 6
#define Anum_pg_mv_statistic_hist_enabled 7
-#define Anum_pg_mv_statistic_mcv_max_items 8
-#define Anum_pg_mv_statistic_hist_max_buckets 9
-#define Anum_pg_mv_statistic_deps_built 10
-#define Anum_pg_mv_statistic_mcv_built 11
-#define Anum_pg_mv_statistic_hist_built 12
-#define Anum_pg_mv_statistic_stakeys 13
-#define Anum_pg_mv_statistic_stadeps 14
-#define Anum_pg_mv_statistic_stamcv 15
-#define Anum_pg_mv_statistic_stahist 16
+#define Anum_pg_mv_statistic_ndist_enabled 8
+#define Anum_pg_mv_statistic_mcv_max_items 9
+#define Anum_pg_mv_statistic_hist_max_buckets 10
+#define Anum_pg_mv_statistic_deps_built 11
+#define Anum_pg_mv_statistic_mcv_built 12
+#define Anum_pg_mv_statistic_hist_built 13
+#define Anum_pg_mv_statistic_ndist_built 14
+#define Anum_pg_mv_statistic_stakeys 15
+#define Anum_pg_mv_statistic_stadeps 16
+#define Anum_pg_mv_statistic_stamcv 17
+#define Anum_pg_mv_statistic_stahist 18
+#define Anum_pg_mv_statistic_standist 19
#endif /* PG_MV_STATISTIC_H */
diff --git a/src/include/nodes/relation.h b/src/include/nodes/relation.h
index 8c50bfb..1923f2b 100644
--- a/src/include/nodes/relation.h
+++ b/src/include/nodes/relation.h
@@ -655,11 +655,13 @@ typedef struct MVStatisticInfo
bool deps_enabled; /* functional dependencies enabled */
bool mcv_enabled; /* MCV list enabled */
bool hist_enabled; /* histogram enabled */
+ bool ndist_enabled; /* ndistinct coefficient enabled */
/* built/available statistics */
bool deps_built; /* functional dependencies built */
bool mcv_built; /* MCV list built */
bool hist_built; /* histogram built */
+ bool ndist_built; /* ndistinct coefficient built */
/* columns in the statistics (attnums) */
int2vector *stakeys; /* attnums of the columns covered */
diff --git a/src/include/utils/mvstats.h b/src/include/utils/mvstats.h
index 80bf96f..0ff24ce 100644
--- a/src/include/utils/mvstats.h
+++ b/src/include/utils/mvstats.h
@@ -226,6 +226,7 @@ typedef MVSerializedHistogramData *MVSerializedHistogram;
MVDependencies load_mv_dependencies(Oid mvoid);
MCVList load_mv_mcvlist(Oid mvoid);
MVSerializedHistogram load_mv_histogram(Oid mvoid);
+double load_mv_ndistinct(Oid mvoid);
bytea * serialize_mv_dependencies(MVDependencies dependencies);
bytea * serialize_mv_mcvlist(MCVList mcvlist, int2vector *attrs,
@@ -267,11 +268,17 @@ MVHistogram
build_mv_histogram(int numrows, HeapTuple *rows, int2vector *attrs,
VacAttrStats **stats, int numrows_total);
-void build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
+double
+build_mv_ndistinct(double totalrows, int numrows, HeapTuple *rows,
+ int2vector *attrs, VacAttrStats **stats);
+
+void build_mv_stats(Relation onerel, double totalrows,
+ int numrows, HeapTuple *rows,
int natts, VacAttrStats **vacattrstats);
void update_mv_stats(Oid relid, MVDependencies dependencies,
MCVList mcvlist, MVHistogram histogram,
+ double ndistcoeff,
int2vector *attrs, VacAttrStats **stats);
#ifdef DEBUG_MVHIST
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 528ac36..7a914da 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1377,7 +1377,8 @@ pg_mv_stats| SELECT n.nspname AS schemaname,
length(s.stamcv) AS mcvbytes,
pg_mv_stats_mcvlist_info(s.stamcv) AS mcvinfo,
length(s.stahist) AS histbytes,
- pg_mv_stats_histogram_info(s.stahist) AS histinfo
+ pg_mv_stats_histogram_info(s.stahist) AS histinfo,
+ s.standcoeff AS ndcoeff
FROM ((pg_mv_statistic s
JOIN pg_class c ON ((c.oid = s.starelid)))
LEFT JOIN pg_namespace n ON ((n.oid = c.relnamespace)));
--
2.5.0
0006-multi-statistics-estimation.patch (text/x-patch)
From 91b9b31cbeb22767b33c2f58b912b7a14c943b28 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@pgaddict.com>
Date: Fri, 6 Feb 2015 01:42:38 +0100
Subject: [PATCH 6/9] multi-statistics estimation
The general idea is that a probability (which is what selectivity is)
can be split into a product of conditional probabilities like this:
P(A & B & C) = P(A & B) * P(C|A & B)
If we assume that C and B are independent, the last part may be
simplified like this
P(A & B & C) = P(A & B) * P(C|A)
we only need probabilities on [A,B] and [C,A] to compute the original
probability.
The implementation works in the other direction, though. We know what
probability P(A & B & C) we need to compute, and also what statistics
are available.
So we search for a combination of statistics covering the clauses in
an optimal way (most clauses covered, most dependencies exploited).
There are two possible approaches - exhaustive and greedy. The
exhaustive one walks through all permutations of stats using dynamic
programming, so it's guaranteed to find the optimal solution, but it
soon gets very slow as it's roughly O(N!). The dynamic programming may
improve that a bit, but it's still far too expensive for large numbers
of statistics (on a single table).
The greedy algorithm is very simple - in every step it picks the
statistics that looks best at that point. That may not guarantee the
globally best solution (but maybe it does?), but it only needs N steps
to find a solution, so it's very fast (processing the selected stats is
usually way more expensive).
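A minimal sketch of the greedy step (standalone C with made-up bitmask
inputs, not the real data structures):

#include <stdio.h>

static int
popcount(unsigned x)
{
	int	n = 0;

	while (x)
	{
		n += x & 1;
		x >>= 1;
	}
	return n;
}

int
main(void)
{
	/* attributes referenced by the clauses (attnums 0..3 as bits),
	 * and the attributes covered by three statistics objects */
	unsigned	clause_attrs = 0x0F;
	unsigned	stat_attrs[3] = {0x03, 0x0C, 0x06};
	unsigned	covered = 0;
	int		used[3] = {0, 0, 0};

	/* in every step pick the statistics covering the most
	 * still-uncovered clause attributes, until nothing helps */
	for (;;)
	{
		int		best = -1;
		int		best_gain = 0;
		int		i;

		for (i = 0; i < 3; i++)
		{
			int		gain = popcount(stat_attrs[i] & clause_attrs & ~covered);

			if (!used[i] && gain > best_gain)
			{
				best = i;
				best_gain = gain;
			}
		}

		if (best < 0)
			break;

		used[best] = 1;
		covered |= stat_attrs[best];
		printf("apply statistics %d (%d new attributes)\n", best, best_gain);
	}
	return 0;
}

The real choose_mv_statistics() ranks candidates on more than raw
attribute counts (clauses covered, conditions, exploitable dependencies),
but the overall shape is the same N-step loop.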
There's a GUC for selecting the search algorithm
mvstat_search = {'greedy', 'exhaustive'}
The default value is 'greedy' as that's much safer (with respect to
runtime). See choose_mv_statistics().
Once we have found a sequence of statistics, we apply them to the
clauses using the conditional probabilities. We process the selected
stats one by one, and for each we select the estimated clauses and
conditions. See clauselist_selectivity() for more details.
Limitations
-----------
It's still true that each clause at a given level has to be covered by
a single MV statistic. So with this query
WHERE (clause1) AND (clause2) AND (clause3 OR clause4)
each parenthesized clause has to be covered by a single multivariate
statistic.
Clauses not covered by a single statistic at this level will be passed
to clause_selectivity(), which will treat them as a collection of
simpler clauses (connected by AND or OR), with the clauses from the
previous level used as conditions.
So using the same example, the last clause will be passed to
clause_selectivity() with 'clause1' and 'clause2' as conditions, and it
will be processed using multivariate stats if possible.
The other limitation is that all the expressions have to be
mv-compatible, i.e. there can't be a mix of expression types. If this is
violated, the clause may be passed to the next level (just like a
list of clauses not covered by a single statistic), which splits it
into clauses handled by multivariate stats and clauses handled by
regular statistics.
rework clauselist_selectivity_or to handle OR-clauses correctly
We might invent a completely new set of functions here, resembling
clauselist_selectivity but adapting the ideas to OR-clauses.
But luckily we know that each OR-clause
(a OR b OR c)
may be rewritten as an equivalent AND-clause using negation:
NOT ((NOT a) AND (NOT b) AND (NOT c))
And that's something we can pass to clauselist_selectivity.
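For instance, with made-up selectivities s(a) = 0.1, s(b) = 0.2 and
s(c) = 0.05, and treating the negated clauses as independent just for
the arithmetic:

s(a OR b OR c) = 1 - (1 - 0.1) * (1 - 0.2) * (1 - 0.05)
= 1 - 0.9 * 0.8 * 0.95
= 0.316

The AND machinery (including any multivariate statistics it can apply)
does the real work; we only negate on the way in and on the way out.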
---
contrib/file_fdw/file_fdw.c | 3 +-
contrib/postgres_fdw/postgres_fdw.c | 11 +-
src/backend/optimizer/path/clausesel.c | 2024 ++++++++++++++++++++++++++------
src/backend/optimizer/path/costsize.c | 23 +-
src/backend/optimizer/util/orclauses.c | 4 +-
src/backend/utils/adt/selfuncs.c | 17 +-
src/backend/utils/misc/guc.c | 20 +
src/backend/utils/mvstats/README.stats | 166 +++
src/include/optimizer/cost.h | 6 +-
src/include/utils/mvstats.h | 8 +
10 files changed, 1913 insertions(+), 369 deletions(-)
diff --git a/contrib/file_fdw/file_fdw.c b/contrib/file_fdw/file_fdw.c
index dc035d7..8f11b7a 100644
--- a/contrib/file_fdw/file_fdw.c
+++ b/contrib/file_fdw/file_fdw.c
@@ -969,7 +969,8 @@ estimate_size(PlannerInfo *root, RelOptInfo *baserel,
baserel->baserestrictinfo,
0,
JOIN_INNER,
- NULL);
+ NULL,
+ NIL);
nrows = clamp_row_est(nrows);
diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c
index 40bffd6..d458a81 100644
--- a/contrib/postgres_fdw/postgres_fdw.c
+++ b/contrib/postgres_fdw/postgres_fdw.c
@@ -500,7 +500,8 @@ postgresGetForeignRelSize(PlannerInfo *root,
fpinfo->local_conds,
baserel->relid,
JOIN_INNER,
- NULL);
+ NULL,
+ NIL);
cost_qual_eval(&fpinfo->local_conds_cost, fpinfo->local_conds, root);
@@ -2136,7 +2137,8 @@ estimate_path_cost_size(PlannerInfo *root,
local_param_join_conds,
foreignrel->relid,
JOIN_INNER,
- NULL);
+ NULL,
+ NIL);
local_sel *= fpinfo->local_conds_sel;
rows = clamp_row_est(rows * local_sel);
@@ -3663,7 +3665,8 @@ postgresGetForeignJoinPaths(PlannerInfo *root,
fpinfo->local_conds,
0,
JOIN_INNER,
- NULL);
+ NULL,
+ NIL);
cost_qual_eval(&fpinfo->local_conds_cost, fpinfo->local_conds, root);
/*
@@ -3682,7 +3685,7 @@ postgresGetForeignJoinPaths(PlannerInfo *root,
*/
fpinfo->joinclause_sel = clauselist_selectivity(root, fpinfo->joinclauses,
0, fpinfo->jointype,
- extra->sjinfo);
+ extra->sjinfo, NIL);
}
fpinfo->server = GetForeignServer(joinrel->serverid);
diff --git a/src/backend/optimizer/path/clausesel.c b/src/backend/optimizer/path/clausesel.c
index 5e73a4e..e06fd99 100644
--- a/src/backend/optimizer/path/clausesel.c
+++ b/src/backend/optimizer/path/clausesel.c
@@ -29,6 +29,8 @@
#include "utils/selfuncs.h"
#include "utils/typcache.h"
+#include "miscadmin.h"
+
/*
* Data structure for accumulating info about possible range-query
@@ -44,6 +46,13 @@ typedef struct RangeQueryClause
Selectivity hibound; /* Selectivity of a var < something clause */
} RangeQueryClause;
+static Selectivity clauselist_selectivity_or(PlannerInfo *root,
+ List *clauses,
+ int varRelid,
+ JoinType jointype,
+ SpecialJoinInfo *sjinfo,
+ List *conditions);
+
static void addRangeClause(RangeQueryClause **rqlist, Node *clause,
bool varonleft, bool isLTsel, Selectivity s2);
@@ -60,23 +69,25 @@ static int count_mv_attnums(List *clauses, Index relid, int type);
static int count_varnos(List *clauses, Index *relid);
+static List *clauses_matching_statistic(List **clauses, MVStatisticInfo *statistic,
+ Index relid, int types, bool remove);
+
static List *clauselist_apply_dependencies(PlannerInfo *root, List *clauses,
Index relid, List *stats);
-static MVStatisticInfo *choose_mv_statistics(List *mvstats, Bitmapset *attnums);
-
-static List *clauselist_mv_split(PlannerInfo *root, Index relid,
- List *clauses, List **mvclauses,
- MVStatisticInfo *mvstats, int types);
-
static Selectivity clauselist_mv_selectivity(PlannerInfo *root,
- List *clauses, MVStatisticInfo *mvstats);
+ MVStatisticInfo *mvstats, List *clauses,
+ List *conditions, bool is_or);
static Selectivity clauselist_mv_selectivity_mcvlist(PlannerInfo *root,
- List *clauses, MVStatisticInfo *mvstats,
- bool *fullmatch, Selectivity *lowsel);
+ MVStatisticInfo *mvstats,
+ List *clauses, List *conditions,
+ bool is_or, bool *fullmatch,
+ Selectivity *lowsel);
static Selectivity clauselist_mv_selectivity_histogram(PlannerInfo *root,
- List *clauses, MVStatisticInfo *mvstats);
+ MVStatisticInfo *mvstats,
+ List *clauses, List *conditions,
+ bool is_or);
static int update_match_bitmap_mcvlist(PlannerInfo *root, List *clauses,
int2vector *stakeys, MCVList mcvlist,
@@ -90,12 +101,33 @@ static int update_match_bitmap_histogram(PlannerInfo *root, List *clauses,
int nmatches, char * matches,
bool is_or);
+/*
+ * Describes a combination of multiple statistics covering the attributes
+ * referenced by the clauses. The array 'stats' (with nstats elements)
+ * lists the statistics in the order they are applied, and the struct
+ * also tracks the number of clauses and conditions covered by the
+ * solution.
+ *
+ * choose_mv_statistics_exhaustive() uses this to track both the current
+ * and the best solution while walking through the space of possible
+ * combinations.
+ */
+typedef struct mv_solution_t {
+ int nclauses; /* number of clauses covered */
+ int nconditions; /* number of conditions covered */
+ int nstats; /* number of stats applied */
+ int *stats; /* stats (in the apply order) */
+} mv_solution_t;
+
+static List *choose_mv_statistics(PlannerInfo *root, Index relid,
+ List *mvstats, List *clauses, List *conditions);
+
static bool has_stats(List *stats, int type);
static List * find_stats(PlannerInfo *root, Index relid);
static bool stats_type_matches(MVStatisticInfo *stat, int type);
+int mvstat_search_type = MVSTAT_SEARCH_GREEDY;
/* used for merging bitmaps - AND (min), OR (max) */
#define MAX(x, y) (((x) > (y)) ? (x) : (y))
@@ -170,14 +202,15 @@ clauselist_selectivity(PlannerInfo *root,
List *clauses,
int varRelid,
JoinType jointype,
- SpecialJoinInfo *sjinfo)
+ SpecialJoinInfo *sjinfo,
+ List *conditions)
{
Selectivity s1 = 1.0;
RangeQueryClause *rqlist = NULL;
ListCell *l;
/* processing mv stats */
- Oid relid = InvalidOid;
+ Index relid = InvalidOid;
/* list of multivariate stats on the relation */
List *stats = NIL;
@@ -193,12 +226,13 @@ clauselist_selectivity(PlannerInfo *root,
stats = find_stats(root, relid);
/*
- * If there's exactly one clause, then no use in trying to match up pairs,
- * so just go directly to clause_selectivity().
+ * If there's exactly one clause, then no use in trying to match up
+ * pairs, or matching multivariate statistics, so just go directly
+ * to clause_selectivity().
*/
if (list_length(clauses) == 1)
return clause_selectivity(root, (Node *) linitial(clauses),
- varRelid, jointype, sjinfo);
+ varRelid, jointype, sjinfo, conditions);
/*
* Apply functional dependencies, but first check that there are some stats
@@ -230,31 +264,100 @@ clauselist_selectivity(PlannerInfo *root,
(count_mv_attnums(clauses, relid,
MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST) >= 2))
{
- /* collect attributes from the compatible conditions */
- Bitmapset *mvattnums = collect_mv_attnums(clauses, relid,
- MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST);
+ ListCell *s;
+
+ /*
+ * Copy the conditions we got from the upper part of the expression tree
+ * so that we can add local conditions to it (we need to keep the
+ * original list intact, for sibling expressions - other expressions
+ * at the same level).
+ */
+ List *conditions_local = list_copy(conditions);
+
+ /* find the best combination of statistics */
+ List *solution = choose_mv_statistics(root, relid, stats,
+ clauses, conditions);
- /* and search for the statistic covering the most attributes */
- MVStatisticInfo *mvstat = choose_mv_statistics(stats, mvattnums);
+ /* FIXME we must not scribble over the original list */
+ if (solution)
+ clauses = list_copy(clauses);
- if (mvstat != NULL) /* we have a matching stats */
+ /*
+ * We have a good solution, which is merely a list of statistics that
+ * we need to apply. We'll apply the statistics one by one (in the order
+ * as they appear in the list), and for each statistic we'll
+ *
+ * (1) find clauses compatible with the statistic (and remove them
+ * from the list)
+ *
+ * (2) find local conditions compatible with the statistic
+ *
+ * (3) do the estimation P(clauses | conditions)
+ *
+ * (4) append the estimated clauses to local conditions
+ *
+ * and then continue with the next statistic, now using the extended
+ * list of conditions.
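+ *
+ * For example (a sketch, assuming statistics on (a,b) and (b,c), and
+ * clauses a=1, b=1 and c=1), this computes P(a=1,b=1) from the (a,b)
+ * statistics, then P(c=1 | b=1) from the (b,c) statistics, and the
+ * final estimate is the product of the two.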
+ */
+ foreach (s, solution)
{
- /* clauses compatible with multi-variate stats */
- List *mvclauses = NIL;
+ MVStatisticInfo *mvstat = (MVStatisticInfo *)lfirst(s);
- /* split the clauselist into regular and mv-clauses */
- clauses = clauselist_mv_split(root, relid, clauses, &mvclauses,
- mvstat, MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST);
+ /* clauses compatible with the statistic we're applying right now */
+ List *stat_clauses = NIL;
+ List *stat_conditions = NIL;
- /* we've chosen the histogram to match the clauses */
- Assert(mvclauses != NIL);
+ /*
+ * Find clauses and conditions matching the statistic - the clauses
+ * need to be removed from the list, while conditions should remain
+ * there (so that we can apply them repeatedly).
+ */
+ stat_clauses
+ = clauses_matching_statistic(&clauses, mvstat, relid,
+ MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST,
+ true);
+
+ stat_conditions
+ = clauses_matching_statistic(&conditions_local, mvstat, relid,
+ MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST,
+ false);
+
+ /*
+ * If we got no clauses to estimate, we've done something wrong, either
+ * during the optimization, while detecting compatible clauses, or
+ * somewhere else.
+ *
+ * Also, we need at least two attributes in clauses and conditions.
+ */
+ Assert(stat_clauses != NIL);
+ Assert(count_mv_attnums(list_union(stat_clauses, stat_conditions),
+ relid, MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST) >= 2);
/* compute the multivariate stats */
- s1 *= clauselist_mv_selectivity(root, mvclauses, mvstat);
+ s1 *= clauselist_mv_selectivity(root, mvstat,
+ stat_clauses, stat_conditions,
+ false); /* AND */
+
+ /*
+ * Add the new clauses to the local conditions, so that we can use
+ * them for the subsequent statistics. We only add the clauses,
+ * because the conditions are already there (or should be).
+ */
+ conditions_local = list_concat(conditions_local, stat_clauses);
}
+
+ /* from now on, work only with the 'local' list of conditions */
+ conditions = conditions_local;
}
/*
+ * If there's exactly one clause, then no use in trying to match up
+ * pairs, so just go directly to clause_selectivity().
+ */
+ if (list_length(clauses) == 1)
+ return s1 * clause_selectivity(root, (Node *) linitial(clauses),
+ varRelid, jointype, sjinfo, conditions);
+
+ /*
* Initial scan over clauses. Anything that doesn't look like a potential
* rangequery clause gets multiplied into s1 and forgotten. Anything that
* does gets inserted into an rqlist entry.
@@ -266,7 +369,8 @@ clauselist_selectivity(PlannerInfo *root,
Selectivity s2;
/* Always compute the selectivity using clause_selectivity */
- s2 = clause_selectivity(root, clause, varRelid, jointype, sjinfo);
+ s2 = clause_selectivity(root, clause, varRelid, jointype, sjinfo,
+ conditions);
/*
* Check for being passed a RestrictInfo.
@@ -425,6 +529,55 @@ clauselist_selectivity(PlannerInfo *root,
}
/*
+ * Similar to clauselist_selectivity(), but for OR-clauses. We can't simply
+ * reuse the multi-statistics estimation logic as for AND-clauses, at least
+ * not directly, because there are a few key differences:
+ *
+ * - functional dependencies don't really apply to OR-clauses
+ *
+ * - clauselist_selectivity() is based on decomposing the selectivity into
+ * a sequence of conditional probabilities (selectivities), but that can
+ * be done only for AND-clauses
+ *
+ * We might invent a similar infrastructure for optimizing OR-clauses, doing
+ * something similar to what clauselist_selectivity() does for AND-clauses,
+ * but luckily we know that each disjunction (aka OR-clause)
+ *
+ * (a OR b OR c)
+ *
+ * may be rewritten as an equivalent conjunction (aka AND-clause) by using
+ * negation:
+ *
+ * NOT ((NOT a) AND (NOT b) AND (NOT c))
+ *
+ * And that's something we can pass to clauselist_selectivity and let it do
+ * all the heavy lifting.
+ */
+static Selectivity
+clauselist_selectivity_or(PlannerInfo *root,
+ List *clauses,
+ int varRelid,
+ JoinType jointype,
+ SpecialJoinInfo *sjinfo,
+ List *conditions)
+{
+ List *args = NIL;
+ ListCell *l;
+ Expr *expr;
+
+ /* build arguments for the AND-clause by negating args of the OR-clause */
+ foreach (l, clauses)
+ args = lappend(args, makeBoolExpr(NOT_EXPR, list_make1(lfirst(l)), -1));
+
+ /* and then build the AND-clause from the negated args */
+ expr = makeBoolExpr(AND_EXPR, args, -1);
+
+ /* instead of constructing the NOT expression, just compute (1.0 - s) */
+ return 1.0 - clauselist_selectivity(root, list_make1(expr), varRelid,
+ jointype, sjinfo, conditions);
+}
+
+/*
* addRangeClause --- add a new range clause for clauselist_selectivity
*
* Here is where we try to match up pairs of range-query clauses
@@ -631,7 +784,8 @@ clause_selectivity(PlannerInfo *root,
Node *clause,
int varRelid,
JoinType jointype,
- SpecialJoinInfo *sjinfo)
+ SpecialJoinInfo *sjinfo,
+ List *conditions)
{
Selectivity s1 = 0.5; /* default for any unhandled clause type */
RestrictInfo *rinfo = NULL;
@@ -751,7 +905,8 @@ clause_selectivity(PlannerInfo *root,
(Node *) get_notclausearg((Expr *) clause),
varRelid,
jointype,
- sjinfo);
+ sjinfo,
+ conditions);
}
else if (and_clause(clause))
{
@@ -760,29 +915,18 @@ clause_selectivity(PlannerInfo *root,
((BoolExpr *) clause)->args,
varRelid,
jointype,
- sjinfo);
+ sjinfo,
+ conditions);
}
else if (or_clause(clause))
{
- /*
- * Selectivities for an OR clause are computed as s1+s2 - s1*s2 to
- * account for the probable overlap of selected tuple sets.
- *
- * XXX is this too conservative?
- */
- ListCell *arg;
-
- s1 = 0.0;
- foreach(arg, ((BoolExpr *) clause)->args)
- {
- Selectivity s2 = clause_selectivity(root,
- (Node *) lfirst(arg),
- varRelid,
- jointype,
- sjinfo);
-
- s1 = s1 + s2 - s1 * s2;
- }
+ /* just call clauselist_selectivity_or() */
+ s1 = clauselist_selectivity_or(root,
+ ((BoolExpr *) clause)->args,
+ varRelid,
+ jointype,
+ sjinfo,
+ conditions);
}
else if (is_opclause(clause) || IsA(clause, DistinctExpr))
{
@@ -872,7 +1016,8 @@ clause_selectivity(PlannerInfo *root,
(Node *) ((RelabelType *) clause)->arg,
varRelid,
jointype,
- sjinfo);
+ sjinfo,
+ conditions);
}
else if (IsA(clause, CoerceToDomain))
{
@@ -881,7 +1026,8 @@ clause_selectivity(PlannerInfo *root,
(Node *) ((CoerceToDomain *) clause)->arg,
varRelid,
jointype,
- sjinfo);
+ sjinfo,
+ conditions);
}
else
{
@@ -945,15 +1091,16 @@ clause_selectivity(PlannerInfo *root,
* in the MCV list, then the selectivity is below the lowest frequency
* found in the MCV list,
*
- * TODO When applying the clauses to the histogram/MCV list, we can do
- * that from the most selective clauses first, because that'll
- * eliminate the buckets/items sooner (so we'll be able to skip
- * them without inspection, which is more expensive). But this
- * requires really knowing the per-clause selectivities in advance,
- * and that's not what we do now.
+ * TODO When applying the clauses to the histogram/MCV list, we can do that from
+ * the most selective clauses first, because that'll eliminate the
+ * buckets/items sooner (so we'll be able to skip them without inspection,
+ * which is more expensive). But this requires really knowing the
+ * per-clause selectivities in advance, and that's not what we do now.
+ *
*/
static Selectivity
-clauselist_mv_selectivity(PlannerInfo *root, List *clauses, MVStatisticInfo *mvstats)
+clauselist_mv_selectivity(PlannerInfo *root, MVStatisticInfo *mvstats,
+ List *clauses, List *conditions, bool is_or)
{
bool fullmatch = false;
Selectivity s1 = 0.0, s2 = 0.0;
@@ -964,281 +1111,1375 @@ clauselist_mv_selectivity(PlannerInfo *root, List *clauses, MVStatisticInfo *mvs
*/
Selectivity mcv_low = 0.0;
- /* TODO Evaluate simple 1D selectivities, use the smallest one as
- * an upper bound, product as lower bound, and sort the
- * clauses in ascending order by selectivity (to optimize the
- * MCV/histogram evaluation).
- */
+ /* TODO Evaluate simple 1D selectivities, use the smallest one as
+ * an upper bound, product as lower bound, and sort the
+ * clauses in ascending order by selectivity (to optimize the
+ * MCV/histogram evaluation).
+ */
+
+ /* Evaluate the MCV first. */
+ s1 = clauselist_mv_selectivity_mcvlist(root, mvstats,
+ clauses, conditions, is_or,
+ &fullmatch, &mcv_low);
+
+ /*
+ * If we got a full equality match on the MCV list, we're done (and
+ * the estimate is pretty good).
+ */
+ if (fullmatch && (s1 > 0.0))
+ return s1;
+
+ /* TODO if (fullmatch) without matching MCV item, use the mcv_low
+ * selectivity as upper bound */
+
+ s2 = clauselist_mv_selectivity_histogram(root, mvstats,
+ clauses, conditions, is_or);
+
+ /* TODO clamp to <= 1.0 (or more strictly, when possible) */
+ return s1 + s2;
+}
+
+/*
+ * Pull varattnos from the clauses, similarly to pull_varattnos() but:
+ *
+ * (a) only get attributes for a particular relation (relid)
+ * (b) ignore system attributes (we can't build stats on them anyway)
+ *
+ * This makes it possible to directly compare the result with attnum
+ * values from pg_attribute etc.
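+ *
+ * For example, a user column with attnum 2 is stored by pull_varattnos()
+ * as (2 - FirstLowInvalidHeapAttributeNumber), so we shift each member
+ * back here; system attributes end up with non-positive values after the
+ * shift and get skipped.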
+ */
+static Bitmapset *
+get_varattnos(Node * node, Index relid)
+{
+ int k;
+ Bitmapset *varattnos = NULL;
+ Bitmapset *result = NULL;
+
+ /* get the varattnos */
+ pull_varattnos(node, relid, &varattnos);
+
+ k = -1;
+ while ((k = bms_next_member(varattnos, k)) >= 0)
+ {
+ if (k + FirstLowInvalidHeapAttributeNumber > 0)
+ result
+ = bms_add_member(result,
+ k + FirstLowInvalidHeapAttributeNumber);
+ }
+
+ bms_free(varattnos);
+
+ return result;
+}
+
+/*
+ * Collect attributes from mv-compatible clauses.
+ */
+static Bitmapset *
+collect_mv_attnums(List *clauses, Index relid, int types)
+{
+ Bitmapset *attnums = NULL;
+ ListCell *l;
+
+ /*
+ * Walk through the clauses and identify the ones we can estimate
+ * using multivariate stats, and remember the relid/columns. We'll
+ * then cross-check if we have suitable stats, and only if needed
+ * we'll split the clauses into multivariate and regular lists.
+ *
+ * For now we're only interested in RestrictInfo nodes with nested
+ * OpExpr, using either a range or equality.
+ */
+ foreach (l, clauses)
+ {
+ Node *clause = (Node *) lfirst(l);
+
+ /* ignore the result here - we only need the attnums */
+ clause_is_mv_compatible(clause, relid, &attnums, types);
+ }
+
+ /*
+ * If there are not at least two attributes referenced by the clause(s),
+ * we can throw everything out (as we'll revert to simple stats).
+ */
+ if (bms_num_members(attnums) <= 1)
+ {
+ bms_free(attnums);
+ attnums = NULL;
+ }
+
+ return attnums;
+}
+
+/*
+ * Count the number of attributes in clauses compatible with multivariate stats.
+ */
+static int
+count_mv_attnums(List *clauses, Index relid, int type)
+{
+ int c;
+ Bitmapset *attnums = collect_mv_attnums(clauses, relid, type);
+
+ c = bms_num_members(attnums);
+
+ bms_free(attnums);
+
+ return c;
+}
+
+/*
+ * Count varnos referenced in the clauses, and if there's a single varno then
+ * return the index in 'relid'.
+ */
+static int
+count_varnos(List *clauses, Index *relid)
+{
+ int cnt;
+ Bitmapset *varnos = NULL;
+
+ varnos = pull_varnos((Node *) clauses);
+ cnt = bms_num_members(varnos);
+
+ /* if there's a single varno in the clauses, remember it */
+ if (bms_num_members(varnos) == 1)
+ *relid = bms_singleton_member(varnos);
+
+ bms_free(varnos);
+
+ return cnt;
+}
+
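+/*
+ * Find the clauses covered by the given statistics (i.e. clauses referencing
+ * only attributes the statistics is built on) and return them in a new list.
+ * With remove=true the matching clauses are also deleted from the input
+ * list (we do that for clauses, but not for conditions, which need to stay
+ * around for the subsequent statistics).
+ */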
+static List *
+clauses_matching_statistic(List **clauses, MVStatisticInfo *statistic,
+ Index relid, int types, bool remove)
+{
+ int i;
+ Bitmapset *stat_attnums = NULL;
+ List *matching_clauses = NIL;
+ ListCell *lc;
+
+ /* build attnum bitmapset for this statistics */
+ for (i = 0; i < statistic->stakeys->dim1; i++)
+ stat_attnums = bms_add_member(stat_attnums,
+ statistic->stakeys->values[i]);
+
+ /*
+ * We can't use foreach here, because we may need to remove some of the
+ * clauses if (remove=true).
+ */
+ lc = list_head(*clauses);
+ while (lc)
+ {
+ Node *clause = (Node*)lfirst(lc);
+ Bitmapset *attnums = NULL;
+
+ /* must advance lc before list_delete possibly pfree's it */
+ lc = lnext(lc);
+
+ /*
+ * skip clauses that are not compatible with stats (just leave them
+ * in the original list)
+ *
+ * XXX Perhaps this should check what stats are actually available in
+ * the statistics (not a big deal now, because MCV and histograms
+ * handle the same types of conditions).
+ */
+ if (! clause_is_mv_compatible(clause, relid, &attnums, types))
+ {
+ bms_free(attnums);
+ continue;
+ }
+
+ /* if the clause is covered by the statistic, add it to the list */
+ if (bms_is_subset(attnums, stat_attnums))
+ {
+ matching_clauses = lappend(matching_clauses, clause);
+
+ /* if remove=true, remove the matching item from the main list */
+ if (remove)
+ *clauses = list_delete_ptr(*clauses, clause);
+ }
+
+ bms_free(attnums);
+ }
+
+ bms_free(stat_attnums);
+
+ return matching_clauses;
+}
+
+/*
+ * Selects the best combination of multivariate statistics, in an exhaustive
+ * way, where 'best' means:
+ *
+ * (a) covering the most attributes (referenced by clauses)
+ * (b) using the least number of multivariate stats
+ * (c) using the most conditions to exploit dependency
+ *
+ * Don't call this directly but through choose_mv_statistics(), which does some
+ * additional tricks to minimize the runtime.
+ *
+ *
+ * Algorithm
+ * ---------
+ * The algorithm is a recursive implementation of backtracking, with maximum
+ * depth equal to the number of multi-variate statistics available on the table.
+ * It actually explores all valid combinations of stats.
+ *
+ * Whenever it considers adding the next statistics, the clauses it matches are
+ * divided into 'conditions' (clauses already matched by at least one previous
+ * statistics) and clauses that are estimated.
+ *
+ * Then several checks are performed:
+ *
+ * (a) The statistics covers at least 2 columns, referenced in the estimated
+ * clauses (otherwise multi-variate stats are useless).
+ *
+ * (b) The statistics covers at least 1 new column, i.e. a column not referenced
+ * by the already used stats (and the new column has to be referenced by
+ * the clauses, of course). Otherwise the statistics would not add any new
+ * information.
+ *
+ * There are some other sanity checks (e.g. stats must not be used twice etc.).
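+ *
+ * For example (a sketch), with statistics on (a,b) and (b,c), and clauses
+ * referencing columns a, b and c, the algorithm tries the sequences
+ * [(a,b)], [(a,b),(b,c)], [(b,c)] and [(b,c),(a,b)], and keeps the one
+ * covering the most clauses (with fewer statistics as a tie-breaker).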
+ *
+ *
+ * Weaknesses
+ * ----------
+ * The current implementation uses a rather simple optimality criterion, so it
+ * may not make the best choice when
+ *
+ * (a) There are multiple solutions with the same number of covered
+ * attributes and number of statistics (e.g. the same solution but with
+ * statistics in a different order). It's unclear which solution is the best
+ * one - in a sense all of them are equal.
+ *
+ * TODO It might be possible to compute estimate for each of those solutions,
+ * and then combine them to get the final estimate (e.g. by using average
+ * or median).
+ *
+ * (b) Does not consider that some types of stats are a better match for some
+ * types of clauses (e.g. MCV list is generally a better match for equality
+ * conditions than a histogram).
+ *
+ * But maybe this is pointless - generally, each column is either a label
+ * (it's not important whether because of the data type or how it's used),
+ * or a value with ordering that makes sense. So either a MCV list is more
+ * appropriate (labels) or a histogram (values with orderings).
+ *
+ * Not sure what to do with statistics on columns mixing both types of data
+ * (some columns would work best with MCVs, some with histograms). Maybe we
+ * could invent a new type of statistics combining MCV list and histogram
+ * (keeping a small histogram for each MCV item, and a separate histogram
+ * for values not on the MCV list).
+ *
+ * TODO The algorithm should probably count number of Vars (not just attnums)
+ * when computing the 'score' of each solution. Computing the ratio of
+ * (num of all vars) / (num of condition vars) as a measure of how well
+ * the solution uses conditions might be useful.
+ */
+static void
+choose_mv_statistics_exhaustive(PlannerInfo *root, int step,
+ int nmvstats, MVStatisticInfo *mvstats, Bitmapset ** stats_attnums,
+ int nclauses, Node ** clauses, Bitmapset ** clauses_attnums,
+ int nconditions, Node ** conditions, Bitmapset ** conditions_attnums,
+ bool *cover_map, bool *condition_map, int *ruled_out,
+ mv_solution_t *current, mv_solution_t **best)
+{
+ int i, j;
+
+ Assert(best != NULL);
+ Assert((step == 0 && current == NULL) || (step > 0 && current != NULL));
+
+ /* this may run for a long time, so let's make it interruptible */
+ CHECK_FOR_INTERRUPTS();
+
+ if (current == NULL)
+ {
+ current = (mv_solution_t*)palloc0(sizeof(mv_solution_t));
+ current->stats = (int*)palloc0(sizeof(int)*nmvstats);
+ current->nstats = 0;
+ current->nclauses = 0;
+ current->nconditions = 0;
+ }
+
+ /*
+ * Now try to apply each statistics, matching at least two attributes,
+ * unless it's already used in one of the previous steps.
+ */
+ for (i = 0; i < nmvstats; i++)
+ {
+ int c;
+
+ int ncovered_clauses = 0; /* number of covered clauses */
+ int ncovered_conditions = 0; /* number of covered conditions */
+ int nattnums = 0; /* number of covered attributes */
+
+ Bitmapset *all_attnums = NULL;
+
+ /* skip statistics that were already used or eliminated */
+ if (ruled_out[i] != -1)
+ continue;
+
+ /*
+ * See if we have clauses covered by this statistics, but not
+ * yet covered by any of the preceding ones.
+ */
+ for (c = 0; c < nclauses; c++)
+ {
+ bool covered = false;
+ Bitmapset *clause_attnums = clauses_attnums[c];
+ Bitmapset *tmp = NULL;
+
+ /*
+ * If this clause is not covered by this stats, we can't
+ * use the stats to estimate that at all.
+ */
+ if (! cover_map[i * nclauses + c])
+ continue;
+
+ /*
+ * Now we know we'll use this clause - either as a condition
+ * or as a new clause (the estimated one). So let's add its
+ * attributes to the set of attnums from all the clauses usable
+ * with this statistics.
+ */
+ tmp = bms_union(all_attnums, clause_attnums);
+
+ /* free the old bitmap */
+ bms_free(all_attnums);
+ all_attnums = tmp;
+
+ /* let's see if it's covered by any of the previous stats */
+ for (j = 0; j < step; j++)
+ {
+ /* already covered by the previous stats */
+ if (cover_map[current->stats[j] * nclauses + c])
+ covered = true;
+
+ if (covered)
+ break;
+ }
+
+ /* if already covered, continue with the next clause */
+ if (covered)
+ {
+ ncovered_conditions += 1;
+ continue;
+ }
+
+ /*
+ * OK, this clause is covered by this statistics (and not by
+ * any of the previous ones)
+ */
+ ncovered_clauses += 1;
+ }
+
+ /* can't have more new clauses than original clauses */
+ Assert(nclauses >= ncovered_clauses);
+ Assert(ncovered_clauses >= 0); /* mostly paranoia */
+
+ nattnums = bms_num_members(all_attnums);
+
+ /* free all the bitmapsets - we don't need them anymore */
+ bms_free(all_attnums);
+
+ all_attnums = NULL;
+
+ /*
+ * Now do the same for the conditions covered by this statistics
+ * (these always count as conditions, not as new clauses).
+ */
+ for (c = 0; c < nconditions; c++)
+ {
+ Bitmapset *clause_attnums = conditions_attnums[c];
+ Bitmapset *tmp = NULL;
+
+ /*
+ * If this clause is not covered by this stats, we can't
+ * use the stats to estimate that at all.
+ */
+ if (! condition_map[i * nconditions + c])
+ continue;
+
+ /* count this as a condition */
+ ncovered_conditions += 1;
+
+ /*
+ * We know we'll use this clause as a condition, so let's add its
+ * attributes to the set of attnums from all the clauses usable
+ * with this statistics.
+ */
+ tmp = bms_union(all_attnums, clause_attnums);
+
+ /* free the old bitmap */
+ bms_free(all_attnums);
+ all_attnums = tmp;
+ }
+
+ /*
+ * Let's mark the statistics as 'ruled out' - either we'll use
+ * it (and proceed to the next step), or it's incompatible.
+ */
+ ruled_out[i] = step;
+
+ /*
+ * There are no clauses usable with this statistics (not already
+ * covered by some of the previous stats).
+ *
+ * Similarly, if the clauses only use a single attribute, we
+ * can't really use that.
+ */
+ if ((ncovered_clauses == 0) || (nattnums < 2))
+ continue;
+
+ /*
+ * TODO Not sure if it's possible to add a clause referencing
+ * only attributes already covered by previous stats?
+ * Introducing only some new dependency, not a new
+ * attribute. Couldn't come up with an example, though.
+ * Might be worth adding some assert.
+ */
+
+ /*
+ * got a suitable statistics - let's update the current solution,
+ * maybe use it as the best solution
+ */
+ current->nclauses += ncovered_clauses;
+ current->nconditions += ncovered_conditions;
+ current->nstats += 1;
+ current->stats[step] = i;
+
+ /*
+ * We can never cover more clauses, or use more stats, than we
+ * actually have at the beginning.
+ */
+ Assert(nclauses >= current->nclauses);
+ Assert(nmvstats >= current->nstats);
+ Assert(step < nmvstats);
+
+ if (*best == NULL)
+ {
+ *best = (mv_solution_t*)palloc0(sizeof(mv_solution_t));
+ (*best)->stats = (int*)palloc0(sizeof(int)*nmvstats);
+ (*best)->nstats = 0;
+ (*best)->nclauses = 0;
+ (*best)->nconditions = 0;
+ }
+
+ /* see if it's better than the current 'best' solution */
+ if ((current->nclauses > (*best)->nclauses) ||
+ ((current->nclauses == (*best)->nclauses) &&
+ ((current->nstats < (*best)->nstats))))
+ {
+ (*best)->nstats = current->nstats;
+ (*best)->nclauses = current->nclauses;
+ (*best)->nconditions = current->nconditions;
+ memcpy((*best)->stats, current->stats, nmvstats * sizeof(int));
+ }
+
+ /*
+ * The recursion only makes sense if there are still unused
+ * statistics left (otherwise there's nothing more to apply).
+ */
+ if ((step + 1) < nmvstats)
+ choose_mv_statistics_exhaustive(root, step+1,
+ nmvstats, mvstats, stats_attnums,
+ nclauses, clauses, clauses_attnums,
+ nconditions, conditions, conditions_attnums,
+ cover_map, condition_map, ruled_out,
+ current, best);
+
+ /* reset the last step */
+ current->nclauses -= ncovered_clauses;
+ current->nconditions -= ncovered_conditions;
+ current->nstats -= 1;
+ current->stats[step] = 0;
+
+ /* mark the statistics as usable again */
+ ruled_out[i] = -1;
+
+ Assert(current->nclauses >= 0);
+ Assert(current->nstats >= 0);
+ }
+
+ /* reset all statistics marked as ruled out in this step */
+ for (i = 0; i < nmvstats; i++)
+ if (ruled_out[i] == step)
+ ruled_out[i] = -1;
+
+}
+
+/*
+ * Greedy search for a multivariate solution - a sequence of statistics covering
+ * the clauses. This chooses the "best" statistics at each step, so the
+ * resulting solution may not be the best solution globally, but this produces
+ * the solution in only N steps (where N is the number of statistics), while
+ * the exhaustive approach may have to walk through ~N! combinations (although
+ * some of those are terminated early).
+ *
+ * See the comments at choose_mv_statistics_exhaustive() as this does the same
+ * thing (but in a different way).
+ *
+ * Don't call this directly, but through choose_mv_statistics().
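+ *
+ * For example (a sketch of the gain metric used below), a statistics
+ * covering 4 clause columns of which 2 are already covered by conditions
+ * has gain 2/4 = 0.5, and is preferred over a statistics covering 3
+ * columns with only 1 condition column (gain 1/3), because it reuses
+ * more of the already estimated columns.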
+ *
+ * TODO There are probably other metrics we might use - e.g. using number of
+ * columns (num_cond_columns / num_cov_columns), which might work better
+ * with a mix of simple and complex clauses.
+ *
+ * TODO Also the choice at the very first step should be handled in a special
+ * way, because there will be 0 conditions at that moment, so there needs
+ * to be some other criteria - e.g. using the simplest (or most complex?)
+ * clause might be a good idea.
+ *
+ * TODO We might also select multiple stats using different criteria, and branch
+ * the search. This is however tricky, because if we choose k statistics at
+ * each step, we get k^N branches to walk through (with N steps). That's
+ * not really good with large number of stats (yet better than exhaustive
+ * search).
+ */
+static void
+choose_mv_statistics_greedy(PlannerInfo *root, int step,
+ int nmvstats, MVStatisticInfo *mvstats, Bitmapset ** stats_attnums,
+ int nclauses, Node ** clauses, Bitmapset ** clauses_attnums,
+ int nconditions, Node ** conditions, Bitmapset ** conditions_attnums,
+ bool *cover_map, bool *condition_map, int *ruled_out,
+ mv_solution_t *current, mv_solution_t **best)
+{
+ int i, j;
+ int best_stat = -1;
+ double gain, max_gain = -1.0;
+
+ /*
+ * Bitmap tracking which clauses are already covered (by the previous
+ * statistics) and may thus serve only as a condition in this step.
+ */
+ bool *covered_clauses = (bool*)palloc0(nclauses);
+
+ /*
+ * Number of clauses and columns covered by each statistics - this
+ * includes both conditions and clauses covered by the statistics for
+ * the first time. The number of columns may count some columns
+ * repeatedly - if a column is shared by multiple clauses, it will
+ * be counted once for each clause (covered by the statistics).
+ * So with two clauses [(a=1 OR b=2),(a<2 OR c>1)] the column "a"
+ * will be counted twice (if both clauses are covered).
+ *
+ * The values for ruled-out statistics (that can't be applied) are
+ * not computed, because that'd be pointless.
+ */
+ int *num_cov_clauses = (int*)palloc0(sizeof(int) * nmvstats);
+ int *num_cov_columns = (int*)palloc0(sizeof(int) * nmvstats);
+
+ /*
+ * Same as above, but this only includes clauses that are already
+ * covered by the previous stats (and the current one).
+ */
+ int *num_cond_clauses = (int*)palloc0(sizeof(int) * nmvstats);
+ int *num_cond_columns = (int*)palloc0(sizeof(int) * nmvstats);
+
+ /*
+ * Number of attributes for each clause.
+ *
+ * TODO Might be computed in choose_mv_statistics() and then passed
+ * here, but then the function would not have the same signature
+ * as _exhaustive().
+ */
+ int *attnum_counts = (int*)palloc0(sizeof(int) * nclauses);
+ int *attnum_cond_counts = (int*)palloc0(sizeof(int) * nconditions);
+
+ CHECK_FOR_INTERRUPTS();
+
+ Assert(best != NULL);
+ Assert((step == 0 && current == NULL) || (step > 0 && current != NULL));
+
+ /* compute attributes (columns) for each clause */
+ for (i = 0; i < nclauses; i++)
+ attnum_counts[i] = bms_num_members(clauses_attnums[i]);
+
+ /* compute attributes (columns) for each condition */
+ for (i = 0; i < nconditions; i++)
+ attnum_cond_counts[i] = bms_num_members(conditions_attnums[i]);
+
+ /* see which clauses are already covered at this point (by previous stats) */
+ for (i = 0; i < step; i++)
+ for (j = 0; j < nclauses; j++)
+ covered_clauses[j] |= (cover_map[current->stats[i] * nclauses + j]);
+
+ /* which remaining statistics covers most clauses / uses most conditions? */
+ for (i = 0; i < nmvstats; i++)
+ {
+ Bitmapset *attnums_covered = NULL;
+ Bitmapset *attnums_conditions = NULL;
+
+ /* skip stats that are already ruled out (either used or inapplicable) */
+ if (ruled_out[i] != -1)
+ continue;
+
+ /* count covered clauses and conditions (for the statistics) */
+ for (j = 0; j < nclauses; j++)
+ {
+ if (cover_map[i * nclauses + j])
+ {
+ Bitmapset *attnums_new
+ = bms_union(attnums_covered, clauses_attnums[j]);
+
+ /* get rid of the old bitmap and keep the unified result */
+ bms_free(attnums_covered);
+ attnums_covered = attnums_new;
+
+ num_cov_clauses[i] += 1;
+ num_cov_columns[i] += attnum_counts[j];
+
+ /* is the clause already covered (i.e. a condition)? */
+ if (covered_clauses[j])
+ {
+ num_cond_clauses[i] += 1;
+ num_cond_columns[i] += attnum_counts[j];
+ attnums_new = bms_union(attnums_conditions,
+ clauses_attnums[j]);
+
+ bms_free(attnums_conditions);
+ attnums_conditions = attnums_new;
+ }
+ }
+ }
+
+ /* if all covered clauses are covered by prev stats (thus conditions) */
+ if (num_cov_clauses[i] == num_cond_clauses[i])
+ ruled_out[i] = step;
+
+ /* same if there are no new attributes */
+ else if (bms_num_members(attnums_conditions) == bms_num_members(attnums_covered))
+ ruled_out[i] = step;
+
+ bms_free(attnums_covered);
+ bms_free(attnums_conditions);
+
+ /* if the statistics is inapplicable, try the next one */
+ if (ruled_out[i] != -1)
+ continue;
+
+ /* now let's walk through conditions and count the covered ones */
+ for (j = 0; j < nconditions; j++)
+ {
+ if (condition_map[i * nconditions + j])
+ {
+ num_cond_clauses[i] += 1;
+ num_cond_columns[i] += attnum_cond_counts[j];
+ }
+ }
+
+ /* otherwise see if this improves the interesting metrics */
+ gain = num_cond_columns[i] / (double)num_cov_columns[i];
+
+ if (gain > max_gain)
+ {
+ max_gain = gain;
+ best_stat = i;
+ }
+ }
+
+ /*
+ * Have we found a suitable statistics? Add it to the solution and
+ * try next step.
+ */
+ if (best_stat != -1)
+ {
+ /* mark the statistics, so that we skip it in next steps */
+ ruled_out[best_stat] = step;
+
+ /* allocate current solution if necessary */
+ if (current == NULL)
+ {
+ current = (mv_solution_t*)palloc0(sizeof(mv_solution_t));
+ current->stats = (int*)palloc0(sizeof(int)*nmvstats);
+ current->nstats = 0;
+ current->nclauses = 0;
+ current->nconditions = 0;
+ }
+
+ current->nclauses += num_cov_clauses[best_stat];
+ current->nconditions += num_cond_clauses[best_stat];
+ current->stats[step] = best_stat;
+ current->nstats++;
+
+ if (*best == NULL)
+ {
+ (*best) = (mv_solution_t*)palloc0(sizeof(mv_solution_t));
+ (*best)->nstats = current->nstats;
+ (*best)->nclauses = current->nclauses;
+ (*best)->nconditions = current->nconditions;
+
+ (*best)->stats = (int*)palloc0(sizeof(int)*nmvstats);
+ memcpy((*best)->stats, current->stats, nmvstats * sizeof(int));
+ }
+ else
+ {
+ /* see if this is a better solution */
+ double current_gain = (double)current->nconditions / current->nclauses;
+ double best_gain = (double)(*best)->nconditions / (*best)->nclauses;
+
+ if ((current_gain > best_gain) ||
+ ((current_gain == best_gain) && (current->nstats < (*best)->nstats)))
+ {
+ (*best)->nstats = current->nstats;
+ (*best)->nclauses = current->nclauses;
+ (*best)->nconditions = current->nconditions;
+ memcpy((*best)->stats, current->stats, nmvstats * sizeof(int));
+ }
+ }
+
+ /*
+ * The recursion only makes sense if there are still unused
+ * statistics left (otherwise there's nothing more to apply).
+ */
+ if ((step + 1) < nmvstats)
+ choose_mv_statistics_greedy(root, step+1,
+ nmvstats, mvstats, stats_attnums,
+ nclauses, clauses, clauses_attnums,
+ nconditions, conditions, conditions_attnums,
+ cover_map, condition_map, ruled_out,
+ current, best);
+
+ /* reset the last step */
+ current->nclauses -= num_cov_clauses[best_stat];
+ current->nconditions -= num_cond_clauses[best_stat];
+ current->nstats -= 1;
+ current->stats[step] = 0;
+
+ /* mark the statistics as usable again */
+ ruled_out[best_stat] = -1;
+ }
+
+ /* reset all statistics eliminated in this step */
+ for (i = 0; i < nmvstats; i++)
+ if (ruled_out[i] == step)
+ ruled_out[i] = -1;
+
+ /* free everything allocated in this step */
+ pfree(covered_clauses);
+ pfree(attnum_counts);
+ pfree(attnum_cond_counts);
+ pfree(num_cov_clauses);
+ pfree(num_cov_columns);
+ pfree(num_cond_clauses);
+ pfree(num_cond_columns);
+}
+
+/*
+ * Remove clauses not covered by any of the available statistics
+ *
+ * This helps us to reduce the amount of work done in choose_mv_statistics()
+ * by not having to deal with clauses that can't possibly be useful.
+ */
+static List *
+filter_clauses(PlannerInfo *root, Index relid, int type,
+ List *stats, List *clauses, Bitmapset **attnums)
+{
+ ListCell *c;
+ ListCell *s;
+
+ /* results (list of compatible clauses, attnums) */
+ List *rclauses = NIL;
+
+ foreach (c, clauses)
+ {
+ Node *clause = (Node*)lfirst(c);
+ Bitmapset *clause_attnums = NULL;
+
+ /*
+ * We do assume that thanks to previous checks, we should not run into
+ * clauses that are incompatible with multivariate stats here. We also
+ * need to collect the attnums for the clause.
+ *
+ * XXX Maybe turn this into an assert?
+ */
+ if (! clause_is_mv_compatible(clause, relid, &clause_attnums, type))
+ elog(ERROR, "should not get non-mv-compatible cluase");
+
+ /* Is there a multivariate statistics covering the clause? */
+ foreach (s, stats)
+ {
+ int k, matches = 0;
+ MVStatisticInfo *stat = (MVStatisticInfo *)lfirst(s);
+
+ /* skip statistics not matching the required type */
+ if (! stats_type_matches(stat, type))
+ continue;
+
+ /*
+ * see if all clause attributes are covered by the statistic
+ *
+ * We'll do that in the opposite direction, i.e. we'll see how many
+ * attributes of the statistic are referenced in the clause, and then
+ * compare the counts.
+ */
+ for (k = 0; k < stat->stakeys->dim1; k++)
+ if (bms_is_member(stat->stakeys->values[k], clause_attnums))
+ matches += 1;
+
+ /*
+ * If the number of matches is equal to attributes referenced by the
+ * clause, then the clause is covered by the statistic.
+ */
+ if (bms_num_members(clause_attnums) == matches)
+ {
+ *attnums = bms_union(*attnums, clause_attnums);
+ rclauses = lappend(rclauses, clause);
+ break;
+ }
+ }
+
+ bms_free(clause_attnums);
+ }
+
+ /* we can't have more compatible conditions than source conditions */
+ Assert(list_length(clauses) >= list_length(rclauses));
+
+ return rclauses;
+}
+
+/*
+ * Remove statistics not covering any new clauses
+ *
+ * Statistics not covering any new clauses (conditions don't count) are not
+ * really useful, so let's ignore them. Also, we need the statistics to
+ * reference at least two different attributes (both in conditions and clauses
+ * combined), and at least one of them in the clauses alone.
+ *
+ * This check might be made more strict by checking against individual clauses,
+ * because by using the bitmapsets of all attnums we may actually use attnums
+ * from clauses that are not covered by the statistics. For example, we may
+ * have a condition
+ *
+ * (a=1 AND b=2)
+ *
+ * and a new clause
+ *
+ * (c=1 AND d=1)
+ *
+ * With only bitmapsets, statistics on [b,c] will pass through this (assuming
+ * there are some statistics covering both clauses).
+ *
+ * Parameters:
+ *
+ * stats - list of statistics to filter
+ * new_attnums - attnums referenced in new clauses
+ * all_attnums - attnums referenced by conditions and new clauses combined
+ *
+ * Returns filtered list of statistics.
+ *
+ * TODO Do the more strict check, i.e. walk through individual clauses and
+ * conditions and only use those covered by the statistics.
+ */
+static List *
+filter_stats(List *stats, Bitmapset *new_attnums, Bitmapset *all_attnums)
+{
+ ListCell *s;
+ List *stats_filtered = NIL;
+
+ foreach (s, stats)
+ {
+ int k;
+ int matches_new = 0,
+ matches_all = 0;
+
+ MVStatisticInfo *stat = (MVStatisticInfo *)lfirst(s);
+
+ /* see how many attributes the statistics covers */
+ for (k = 0; k < stat->stakeys->dim1; k++)
+ {
+ /* attributes from new clauses */
+ if (bms_is_member(stat->stakeys->values[k], new_attnums))
+ matches_new += 1;
+
+ /* attributes from conditions and new clauses combined */
+ if (bms_is_member(stat->stakeys->values[k], all_attnums))
+ matches_all += 1;
+ }
+
+ /* check we have enough attributes for this statistics */
+ if ((matches_new >= 1) && (matches_all >= 2))
+ stats_filtered = lappend(stats_filtered, stat);
+ }
+
+ /* we can't have more useful stats than we had originally */
+ Assert(list_length(stats) >= list_length(stats_filtered));
+
+ return stats_filtered;
+}
+
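+/*
+ * Convert a list of MVStatisticInfo nodes into a plain array (setting
+ * *nmvstats to the number of elements), which makes the processing in
+ * the optimization easier.
+ */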
+static MVStatisticInfo *
+make_stats_array(List *stats, int *nmvstats)
+{
+ int i;
+ ListCell *l;
+
+ MVStatisticInfo *mvstats = NULL;
+ *nmvstats = list_length(stats);
- /* Evaluate the MCV first. */
- s1 = clauselist_mv_selectivity_mcvlist(root, clauses, mvstats,
- &fullmatch, &mcv_low);
+ mvstats
+ = (MVStatisticInfo*)palloc0((*nmvstats) * sizeof(MVStatisticInfo));
- /*
- * If we got a full equality match on the MCV list, we're done (and
- * the estimate is pretty good).
- */
- if (fullmatch && (s1 > 0.0))
- return s1;
+ i = 0;
+ foreach (l, stats)
+ {
+ MVStatisticInfo *stat = (MVStatisticInfo *)lfirst(l);
+ memcpy(&mvstats[i++], stat, sizeof(MVStatisticInfo));
+ }
- /* TODO if (fullmatch) without matching MCV item, use the mcv_low
- * selectivity as upper bound */
+ return mvstats;
+}
- s2 = clauselist_mv_selectivity_histogram(root, clauses, mvstats);
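+/*
+ * Build a bitmapset of attnums for each statistics, so that we can
+ * cheaply check (using bms_is_subset) which clauses each statistics
+ * covers.
+ */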
+static Bitmapset **
+make_stats_attnums(MVStatisticInfo *mvstats, int nmvstats)
+{
+ int i, j;
+ Bitmapset **stats_attnums = NULL;
- /* TODO clamp to <= 1.0 (or more strictly, when possible) */
- return s1 + s2;
+ Assert(nmvstats > 0);
+
+ /* build bitmaps of attnums for the stats (easier to compare) */
+ stats_attnums = (Bitmapset **)palloc0(nmvstats * sizeof(Bitmapset*));
+
+ for (i = 0; i < nmvstats; i++)
+ for (j = 0; j < mvstats[i].stakeys->dim1; j++)
+ stats_attnums[i]
+ = bms_add_member(stats_attnums[i],
+ mvstats[i].stakeys->values[j]);
+
+ return stats_attnums;
}
+
/*
- * Collect attributes from mv-compatible clauses.
+ * Remove redundant statistics
+ *
+ * If there are multiple statistics covering the same set of columns (counting
+ * only those referenced by clauses and conditions), we only need to apply one
+ * of them, which further reduces the size of the optimization problem.
+ *
+ * Thus when redundant stats are detected, we keep the smaller one (the one with
+ * fewer columns), based on the assumption that it's more accurate and also
+ * faster to process. That may be untrue for two reasons - first, the accuracy
+ * really depends on number of buckets/MCV items, not the number of columns.
+ * Second, some types of statistics may work better for certain types of clauses
+ * (e.g. MCV lists for equality conditions) etc.
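+ *
+ * For example (a sketch), with statistics on (a,b) and (a,b,c), and clauses
+ * referencing only columns a and b, both statistics cover the same set of
+ * referenced attributes, so the (a,b,c) statistics is dropped as redundant
+ * and the smaller (a,b) one is kept.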
*/
-static Bitmapset *
-collect_mv_attnums(List *clauses, Index relid, int types)
+static List*
+filter_redundant_stats(List *stats, List *clauses, List *conditions)
{
- Bitmapset *attnums = NULL;
- ListCell *l;
+ int i, j, nmvstats;
+
+ MVStatisticInfo *mvstats;
+ bool *redundant;
+ Bitmapset **stats_attnums;
+ Bitmapset *varattnos;
+ Index relid;
+
+ Assert(list_length(stats) > 0);
+ Assert(list_length(clauses) > 0);
+
+ /*
+ * We'll convert the list of statistics into an array now, because
+ * the reduction of redundant statistics is easier to do that way
+ * (we can mark previous stats as redundant, etc.).
+ */
+ mvstats = make_stats_array(stats, &nmvstats);
+ stats_attnums = make_stats_attnums(mvstats, nmvstats);
+
+ /* by default, none of the stats is redundant (so palloc0) */
+ redundant = palloc0(nmvstats * sizeof(bool));
+
+ /*
+ * We only expect a single relid here, and also we should get the
+ * same relid from clauses and conditions (but we get it from
+ * clauses, because those are certainly non-empty).
+ */
+ relid = bms_singleton_member(pull_varnos((Node*)clauses));
/*
- * Walk through the clauses and identify the ones we can estimate using
- * multivariate stats, and remember the relid/columns. We'll then
- * cross-check if we have suitable stats, and only if needed we'll split
- * the clauses into multivariate and regular lists.
+ * Get the varattnos from both conditions and clauses.
*
- * For now we're only interested in RestrictInfo nodes with nested OpExpr,
- * using either a range or equality.
+ * This skips system attributes, although that should be impossible
+ * thanks to previous filtering out of incompatible clauses.
+ *
+ * XXX Is that really true?
*/
- foreach (l, clauses)
+ varattnos = bms_union(get_varattnos((Node*)clauses, relid),
+ get_varattnos((Node*)conditions, relid));
+
+ for (i = 1; i < nmvstats; i++)
{
- Node *clause = (Node *) lfirst(l);
+ /* intersect with current statistics */
+ Bitmapset *curr = bms_intersect(stats_attnums[i], varattnos);
- /* ignore the result here - we only need the attnums */
- clause_is_mv_compatible(clause, relid, &attnums, types);
+ /* walk through 'previous' stats and check redundancy */
+ for (j = 0; j < i; j++)
+ {
+ /* intersect with the previous statistics */
+ Bitmapset *prev;
+
+ /* skip stats already identified as redundant */
+ if (redundant[j])
+ continue;
+
+ prev = bms_intersect(stats_attnums[j], varattnos);
+
+ switch (bms_subset_compare(curr, prev))
+ {
+ case BMS_EQUAL:
+ /*
+ * Use the smaller one (hopefully more accurate).
+ * If both have the same size, use the first one.
+ */
+ if (mvstats[i].stakeys->dim1 >= mvstats[j].stakeys->dim1)
+ redundant[i] = TRUE;
+ else
+ redundant[j] = TRUE;
+
+ break;
+
+ case BMS_SUBSET1: /* curr is subset of prev */
+ redundant[i] = TRUE;
+ break;
+
+ case BMS_SUBSET2: /* prev is subset of curr */
+ redundant[j] = TRUE;
+ break;
+
+ case BMS_DIFFERENT:
+ /* do nothing - keep both stats */
+ break;
+ }
+
+ bms_free(prev);
+ }
+
+ bms_free(curr);
}
- /*
- * If there are not at least two attributes referenced by the clause(s),
- * we can throw everything out (as we'll revert to simple stats).
- */
- if (bms_num_members(attnums) <= 1)
+ /* can't reduce all statistics (at least one has to remain) */
+ Assert(nmvstats > 0);
+
+ /* now, let's remove the redundant statistics from the arrays */
+ list_free(stats);
+ stats = NIL;
+
+ for (i = 0; i < nmvstats; i++)
{
- if (attnums != NULL)
- pfree(attnums);
- attnums = NULL;
+ MVStatisticInfo *info;
+
+ pfree(stats_attnums[i]);
+
+ if (redundant[i])
+ continue;
+
+ info = makeNode(MVStatisticInfo);
+ memcpy(info, &mvstats[i], sizeof(MVStatisticInfo));
+
+ stats = lappend(stats, info);
}
- return attnums;
+ pfree(mvstats);
+ pfree(stats_attnums);
+ pfree(redundant);
+
+ return stats;
}
-/*
- * Count the number of attributes in clauses compatible with multivariate stats.
- */
-static int
-count_mv_attnums(List *clauses, Index relid, int type)
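+/*
+ * Convert a list of clauses into a plain array (setting *nclauses to
+ * the number of elements).
+ */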
+static Node**
+make_clauses_array(List *clauses, int *nclauses)
{
- int c;
- Bitmapset *attnums = collect_mv_attnums(clauses, relid, type);
+ int i;
+ ListCell *l;
- c = bms_num_members(attnums);
+ Node** clauses_array;
- bms_free(attnums);
+ *nclauses = list_length(clauses);
+ clauses_array = (Node **)palloc0((*nclauses) * sizeof(Node *));
- return c;
+ i = 0;
+ foreach (l, clauses)
+ clauses_array[i++] = (Node *)lfirst(l);
+
+ *nclauses = i;
+
+ return clauses_array;
}
-/*
- * Count varnos referenced in the clauses, and if there's a single varno then
- * return the index in 'relid'.
- */
-static int
-count_varnos(List *clauses, Index *relid)
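+/*
+ * Build a bitmapset of attnums referenced by each clause. Only clauses
+ * compatible with multivariate stats are expected here (we error out
+ * otherwise).
+ */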
+static Bitmapset **
+make_clauses_attnums(PlannerInfo *root, Index relid,
+ int type, Node **clauses, int nclauses)
{
- int cnt;
- Bitmapset *varnos = NULL;
+ int i;
+ Bitmapset **clauses_attnums
+ = (Bitmapset **)palloc0(nclauses * sizeof(Bitmapset *));
- varnos = pull_varnos((Node *) clauses);
- cnt = bms_num_members(varnos);
+ for (i = 0; i < nclauses; i++)
+ {
+ Bitmapset * attnums = NULL;
- /* if there's a single varno in the clauses, remember it */
- if (bms_num_members(varnos) == 1)
- *relid = bms_singleton_member(varnos);
+ if (! clause_is_mv_compatible(clauses[i], relid, &attnums, type))
+ elog(ERROR, "should not get non-mv-compatible clause");
- bms_free(varnos);
+ clauses_attnums[i] = attnums;
+ }
- return cnt;
+ return clauses_attnums;
}
-
+
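+/*
+ * Build a map of which clauses are covered by which statistics, i.e.
+ * cover_map[i * nclauses + j] is true iff clause j references only
+ * attributes covered by statistics i.
+ */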
+static bool*
+make_cover_map(Bitmapset **stats_attnums, int nmvstats,
+ Bitmapset **clauses_attnums, int nclauses)
+{
+ int i, j;
+ bool *cover_map = (bool*)palloc0(nclauses * nmvstats);
+
+ for (i = 0; i < nmvstats; i++)
+ for (j = 0; j < nclauses; j++)
+ cover_map[i * nclauses + j]
+ = bms_is_subset(clauses_attnums[j], stats_attnums[i]);
+
+ return cover_map;
+}
+
/*
- * We're looking for statistics matching at least 2 attributes, referenced in
- * clauses compatible with multivariate statistics. The current selection
- * criteria is very simple - we choose the statistics referencing the most
- * attributes.
- *
- * If there are multiple statistics referencing the same number of columns
- * (from the clauses), the one with less source columns (as listed in the
- * ADD STATISTICS when creating the statistics) wins. Else the first one wins.
- *
- * This is a very simple criteria, and has several weaknesses:
- *
- * (a) does not consider the accuracy of the statistics
- *
- * If there are two histograms built on the same set of columns, but one
- * has 100 buckets and the other one has 1000 buckets (thus likely
- * providing better estimates), this is not currently considered.
- *
- * (b) does not consider the type of statistics
- *
- * If there are three statistics - one containing just a MCV list, another
- * one with just a histogram and a third one with both, we treat them equally.
+ * Chooses the combination of statistics optimal for estimating a particular
+ * clause list.
*
- * (c) does not consider the number of clauses
+ * This only handles the 'preparation' phase shared by the exhaustive and greedy
+ * implementations (see the preceding functions), mostly trying to reduce the
+ * size of the problem (eliminating clauses/statistics that can't really be
+ * used in the solution).
*
- * As explained, only the number of referenced attributes counts, so if
- * there are multiple clauses on a single attribute, this still counts as
- * a single attribute.
+ * It also precomputes bitmaps for attributes covered by clauses and statistics,
+ * so that we don't need to do that over and over in the actual optimizations
+ * (as it's both CPU and memory intensive).
*
- * (d) does not consider type of condition
*
- * Some clauses may work better with some statistics - for example equality
- * clauses probably work better with MCV lists than with histograms. But
- * IS [NOT] NULL conditions may often work better with histograms (thanks
- * to NULL-buckets).
+ * TODO Another way to make the optimization problems smaller might be splitting
+ * the statistics into several disjoint subsets, i.e. if we can split the
+ * graph of statistics (after the elimination) into multiple components
+ * (so that stats in different components share no attributes), we can do
+ * the optimization for each component separately.
*
- * So for example with five WHERE conditions
- *
- * WHERE (a = 1) AND (b = 1) AND (c = 1) AND (d = 1) AND (e = 1)
- *
- * and statistics on (a,b), (a,b,e) and (a,b,c,d), the last one will be selected
- * as it references the most columns.
- *
- * Once we have selected the multivariate statistics, we split the list of
- * clauses into two parts - conditions that are compatible with the selected
- * stats, and conditions are estimated using simple statistics.
- *
- * From the example above, conditions
- *
- * (a = 1) AND (b = 1) AND (c = 1) AND (d = 1)
- *
- * will be estimated using the multivariate statistics (a,b,c,d) while the last
- * condition (e = 1) will get estimated using the regular ones.
- *
- * There are various alternative selection criteria (e.g. counting conditions
- * instead of just referenced attributes), but eventually the best option should
- * be to combine multiple statistics. But that's much harder to do correctly.
- *
- * TODO Select multiple statistics and combine them when computing the estimate.
- *
- * TODO This will probably have to consider compatibility of clauses, because
- * 'dependencies' will probably work only with equality clauses.
+ * TODO If we could compute what is a "perfect solution" maybe we could
+ * terminate the search after reaching ~90% of it? Say, if we knew that we
+ * can cover 10 clauses and reuse 8 dependencies, maybe covering 9 clauses
+ * and 7 dependencies would be OK?
*/
-static MVStatisticInfo *
-choose_mv_statistics(List *stats, Bitmapset *attnums)
+static List*
+choose_mv_statistics(PlannerInfo *root, Index relid, List *stats,
+ List *clauses, List *conditions)
{
int i;
- ListCell *lc;
+ mv_solution_t *best = NULL;
+ List *result = NIL;
+
+ int nmvstats;
+ MVStatisticInfo *mvstats;
+
+ /* we only work with MCV lists and histograms here */
+ int type = (MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST);
+
+ bool *clause_cover_map = NULL,
+ *condition_cover_map = NULL;
+ int *ruled_out = NULL;
+
+ /* build bitmapsets for all stats and clauses */
+ Bitmapset **stats_attnums;
+ Bitmapset **clauses_attnums;
+ Bitmapset **conditions_attnums;
- MVStatisticInfo *choice = NULL;
+ int nclauses, nconditions;
+ Node ** clauses_array;
+ Node ** conditions_array;
- int current_matches = 1; /* goal #1: maximize */
- int current_dims = (MVSTATS_MAX_DIMENSIONS+1); /* goal #2: minimize */
+ /* copy lists, so that we can free them during elimination easily */
+ clauses = list_copy(clauses);
+ conditions = list_copy(conditions);
+ stats = list_copy(stats);
/*
- * Walk through the statistics (simple array with nmvstats elements) and for
- * each one count the referenced attributes (encoded in the 'attnums' bitmap).
+ * Reduce the optimization problem size as much as possible.
+ *
+ * Eliminate clauses and conditions not covered by any statistics,
+ * or statistics not matching at least two attributes (one of them
+ * has to be in a regular clause).
+ *
+ * It's possible that removing a statistics in one iteration
+ * eliminates a clause in the next one, so we'll repeat this until
+ * an iteration eliminates no clauses/stats.
+ *
+ * This can only happen after eliminating a statistics - clauses are
+ * eliminated first, so statistics always reflect that.
*/
- foreach (lc, stats)
+ while (true)
{
- MVStatisticInfo *info = (MVStatisticInfo *)lfirst(lc);
-
- /* columns matching this statistics */
- int matches = 0;
+ List *tmp;
- int2vector * attrs = info->stakeys;
- int numattrs = attrs->dim1;
+ Bitmapset *compatible_attnums = NULL;
+ Bitmapset *condition_attnums = NULL;
+ Bitmapset *all_attnums = NULL;
- /* skip dependencies-only stats */
- if (! (info->mcv_built || info->hist_built))
- continue;
+ /*
+ * Clauses
+ *
+ * Walk through clauses and keep only those covered by at least
+ * one of the statistics we still have. We'll also keep info
+ * about attnums in clauses (without conditions) so that we can
+ * ignore stats covering just conditions (which is pointless).
+ */
+ tmp = filter_clauses(root, relid, type,
+ stats, clauses, &compatible_attnums);
- /* count columns covered by the histogram */
- for (i = 0; i < numattrs; i++)
- if (bms_is_member(attrs->values[i], attnums))
- matches++;
+ /* discard the original list */
+ list_free(clauses);
+ clauses = tmp;
/*
- * Use this statistics when it improves the number of matches or
- * when it matches the same number of attributes but is smaller.
+ * Conditions
+ *
+ * Walk through conditions and keep only those covered by at least
+ * one of the statistics we still have. Also, collect bitmap of
+ * attributes so that we can make sure we add at least one new
+ * attribute (by comparing with clauses).
*/
- if ((matches > current_matches) ||
- ((matches == current_matches) && (current_dims > numattrs)))
+ if (conditions != NIL)
{
- choice = info;
- current_matches = matches;
- current_dims = numattrs;
+ tmp = filter_clauses(root, relid, type,
+ stats, conditions, &condition_attnums);
+
+ /* discard the original list */
+ list_free(conditions);
+ conditions = tmp;
}
- }
- return choice;
-}
+ /* get a union of attnums (from conditions and new clauses) */
+ all_attnums = bms_union(compatible_attnums, condition_attnums);
+
+ /*
+ * Statistics
+ *
+ * Walk through the statistics and only keep those covering at least
+ * one new attribute (excluding conditions) and at least two attributes
+ * in clauses and conditions combined.
+ */
+ tmp = filter_stats(stats, compatible_attnums, all_attnums);
+ /* if we've not eliminated anything, terminate */
+ if (list_length(stats) == list_length(tmp))
+ break;
-/*
- * This splits the clauses list into two parts - one containing clauses that
- * will be evaluated using the chosen statistics, and the remaining clauses
- * (either non-mvcompatible, or not related to the histogram).
- */
-static List *
-clauselist_mv_split(PlannerInfo *root, Index relid,
- List *clauses, List **mvclauses,
- MVStatisticInfo *mvstats, int types)
-{
- int i;
- ListCell *l;
- List *non_mvclauses = NIL;
+ /* work only with filtered statistics from now */
+ list_free(stats);
+ stats = tmp;
+ }
- /* FIXME is there a better way to get info on int2vector? */
- int2vector * attrs = mvstats->stakeys;
- int numattrs = mvstats->stakeys->dim1;
+ /* only do the optimization if we have clauses/statistics */
+ if ((list_length(stats) == 0) || (list_length(clauses) == 0))
+ return NIL;
- Bitmapset *mvattnums = NULL;
+ /* remove redundant stats (stats covered by another stats) */
+ stats = filter_redundant_stats(stats, clauses, conditions);
- /* build bitmap of attributes, so we can do bms_is_subset later */
- for (i = 0; i < numattrs; i++)
- mvattnums = bms_add_member(mvattnums, attrs->values[i]);
+ /*
+ * TODO We should sort the stats to make the order deterministic,
+ * otherwise we may get different estimates on different
+ * executions - if there are multiple "equally good" solutions,
+ * we'll keep the first solution we see.
+ *
+ * Sorting by OID probably is not the right solution though,
+ * because we'd like it to be reproducible, irrespective of
+ * the order of ADD STATISTICS commands. So maybe statkeys?
+ */
+ mvstats = make_stats_array(stats, &nmvstats);
+ stats_attnums = make_stats_attnums(mvstats, nmvstats);
- /* erase the list of mv-compatible clauses */
- *mvclauses = NIL;
+ /* collect clauses and a bitmap of attnums */
+ clauses_array = make_clauses_array(clauses, &nclauses);
+ clauses_attnums = make_clauses_attnums(root, relid, type,
+ clauses_array, nclauses);
- foreach (l, clauses)
- {
- bool match = false; /* by default not mv-compatible */
- Bitmapset *attnums = NULL;
- Node *clause = (Node *) lfirst(l);
+ /* collect conditions and a bitmap of attnums */
+ conditions_array = make_clauses_array(conditions, &nconditions);
+ conditions_attnums = make_clauses_attnums(root, relid, type,
+ conditions_array, nconditions);
- if (clause_is_mv_compatible(clause, relid, &attnums, types))
+ /*
+ * Build bitmaps with info about which clauses/conditions are
+ * covered by each statistics (so that we don't need to call the
+ * bms_is_subset over and over again).
+ */
+ clause_cover_map = make_cover_map(stats_attnums, nmvstats,
+ clauses_attnums, nclauses);
+
+ condition_cover_map = make_cover_map(stats_attnums, nmvstats,
+ conditions_attnums, nconditions);
+
+ ruled_out = (int*)palloc0(nmvstats * sizeof(int));
+
+ /* no stats are ruled out by default */
+ for (i = 0; i < nmvstats; i++)
+ ruled_out[i] = -1;
+
+ /* do the optimization itself */
+ if (mvstat_search_type == MVSTAT_SEARCH_EXHAUSTIVE)
+ choose_mv_statistics_exhaustive(root, 0,
+ nmvstats, mvstats, stats_attnums,
+ nclauses, clauses_array, clauses_attnums,
+ nconditions, conditions_array, conditions_attnums,
+ clause_cover_map, condition_cover_map,
+ ruled_out, NULL, &best);
+ else
+ choose_mv_statistics_greedy(root, 0,
+ nmvstats, mvstats, stats_attnums,
+ nclauses, clauses_array, clauses_attnums,
+ nconditions, conditions_array, conditions_attnums,
+ clause_cover_map, condition_cover_map,
+ ruled_out, NULL, &best);
+
+ /* create a list of statistics from the array */
+ if (best != NULL)
+ {
+ for (i = 0; i < best->nstats; i++)
{
- /* are all the attributes part of the selected stats? */
- if (bms_is_subset(attnums, mvattnums))
- match = true;
+ MVStatisticInfo *info = makeNode(MVStatisticInfo);
+ memcpy(info, &mvstats[best->stats[i]], sizeof(MVStatisticInfo));
+ result = lappend(result, info);
}
- /*
- * The clause matches the selected stats, so put it to the list of
- * mv-compatible clauses. Otherwise, keep it in the list of 'regular'
- * clauses (that may be selected later).
- */
- if (match)
- *mvclauses = lappend(*mvclauses, clause);
- else
- non_mvclauses = lappend(non_mvclauses, clause);
+ pfree(best);
}
- /*
- * Perform regular estimation using the clauses incompatible with the chosen
- * histogram (or MV stats in general).
- */
- return non_mvclauses;
+ /* cleanup (maybe leave it up to the memory context?) */
+ for (i = 0; i < nmvstats; i++)
+ bms_free(stats_attnums[i]);
+ for (i = 0; i < nclauses; i++)
+ bms_free(clauses_attnums[i]);
+
+ for (i = 0; i < nconditions; i++)
+ bms_free(conditions_attnums[i]);
+
+ pfree(stats_attnums);
+ pfree(clauses_attnums);
+ pfree(conditions_attnums);
+
+ pfree(clauses_array);
+ pfree(conditions_array);
+ pfree(clause_cover_map);
+ pfree(condition_cover_map);
+ pfree(ruled_out);
+ pfree(mvstats);
+
+ list_free(clauses);
+ list_free(conditions);
+ list_free(stats);
+
+ return result;
}
typedef struct
@@ -1637,9 +2878,6 @@ has_stats(List *stats, int type)
/* terminate if we've found at least one matching statistics */
if (stats_type_matches(stat, type))
return true;
-
- if ((type & MV_CLAUSE_TYPE_HIST) && stat->hist_built)
- return true;
}
return false;
@@ -1689,22 +2927,26 @@ find_stats(PlannerInfo *root, Index relid)
* as the clauses are processed (and skip items that are 'match').
*/
static Selectivity
-clauselist_mv_selectivity_mcvlist(PlannerInfo *root, List *clauses,
- MVStatisticInfo *mvstats, bool *fullmatch,
- Selectivity *lowsel)
+clauselist_mv_selectivity_mcvlist(PlannerInfo *root, MVStatisticInfo *mvstats,
+ List *clauses, List *conditions, bool is_or,
+ bool *fullmatch, Selectivity *lowsel)
{
int i;
Selectivity s = 0.0;
+ Selectivity t = 0.0;
Selectivity u = 0.0;
MCVList mcvlist = NULL;
+
int nmatches = 0;
+ int nconditions = 0;
/* match/mismatch bitmap for each MCV item */
char * matches = NULL;
+ char * condition_matches = NULL;
Assert(clauses != NIL);
- Assert(list_length(clauses) >= 2);
+ Assert(list_length(clauses) >= 1);
/* there's no MCV list built yet */
if (! mvstats->mcv_built)
@@ -1715,32 +2957,85 @@ clauselist_mv_selectivity_mcvlist(PlannerInfo *root, List *clauses,
Assert(mcvlist != NULL);
Assert(mcvlist->nitems > 0);
- /* by default all the MCV items match the clauses fully */
- matches = palloc0(sizeof(char) * mcvlist->nitems);
- memset(matches, MVSTATS_MATCH_FULL, sizeof(char)*mcvlist->nitems);
-
/* number of matching MCV items */
nmatches = mcvlist->nitems;
+ nconditions = mcvlist->nitems;
+
+ /*
+ * Bitmap of MCV item matches (mismatch, partial, full).
+ *
+ * For AND clauses all items match initially (and we'll eliminate them).
+ * For OR clauses no items match initially (and we'll add them).
+ *
+ * We only need to do the memset for AND clauses (for OR clauses
+ * it's already set correctly by the palloc0).
+ */
+ matches = palloc0(sizeof(char) * nmatches);
+
+ if (! is_or) /* AND-clause */
+ memset(matches, MVSTATS_MATCH_FULL, sizeof(char)*nmatches);
+ /* Conditions are treated as an AND clause, so all items match by default. */
+ condition_matches = palloc0(sizeof(char) * nconditions);
+ memset(condition_matches, MVSTATS_MATCH_FULL, sizeof(char)*nconditions);
+
+ /*
+ * build the match bitmap for the conditions (conditions are always
+ * connected by AND)
+ */
+ if (conditions != NIL)
+ nconditions = update_match_bitmap_mcvlist(root, conditions,
+ mvstats->stakeys, mcvlist,
+ nconditions, condition_matches,
+ lowsel, fullmatch, false);
+
+ /*
+ * build the match bitmap for the estimated clauses
+ *
+ * TODO This evaluates the clauses for all MCV items, even those
+ * ruled out by the conditions. The final result would be the
+ * same, but skipping them might be faster.
+ */
nmatches = update_match_bitmap_mcvlist(root, clauses,
mvstats->stakeys, mcvlist,
- nmatches, matches,
- lowsel, fullmatch, false);
+ ((is_or) ? 0 : nmatches), matches,
+ lowsel, fullmatch, is_or);
/* sum frequencies for all the matching MCV items */
for (i = 0; i < mcvlist->nitems; i++)
{
- /* used to 'scale' for MCV lists not covering all tuples */
+ /*
+ * Find out what part of the data is covered by the MCV list,
+ * so that we can 'scale' the selectivity properly (e.g. when
+ * only 50% of the sample items got into the MCV, and the rest
+ * is either in a histogram, or not covered by stats).
+ *
+ * TODO This might be handled by keeping a global "frequency"
+ * for the whole list, which might save us a bit of time
+ * spent on accessing the not-matching part of the MCV list.
+ * Although it's likely in a cache, so it's very fast.
+ */
u += mcvlist->items[i]->frequency;
+ /* skip MCV items not matching the conditions */
+ if (condition_matches[i] == MVSTATS_MATCH_NONE)
+ continue;
+
if (matches[i] != MVSTATS_MATCH_NONE)
s += mcvlist->items[i]->frequency;
+
+ t += mcvlist->items[i]->frequency;
}
pfree(matches);
+ pfree(condition_matches);
pfree(mcvlist);
- return s*u;
+ /* no condition matches */
+ if (t == 0.0)
+ return (Selectivity)0.0;
+
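+ /*
+  * (s / t) is the conditional probability P(clauses | conditions) within
+  * the MCV list, and u scales that to the fraction of rows the MCV list
+  * actually covers.
+  */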
+ return (s / t) * u;
}
/*
@@ -1971,64 +3266,57 @@ update_match_bitmap_mcvlist(PlannerInfo *root, List *clauses,
}
}
}
- else if (or_clause(clause) || and_clause(clause))
+ else if (or_clause(clause) || and_clause(clause) || not_clause(clause))
{
/* AND/OR clause, with all clauses compatible with the selected MV stat */
int i;
- BoolExpr *orclause = ((BoolExpr*)clause);
- List *orclauses = orclause->args;
+ List *tmp_clauses = ((BoolExpr*)clause)->args;
/* match/mismatch bitmap for each MCV item */
- int or_nmatches = 0;
- char * or_matches = NULL;
+ int tmp_nmatches = 0;
+ char * tmp_matches = NULL;
- Assert(orclauses != NIL);
- Assert(list_length(orclauses) >= 2);
+ Assert(tmp_clauses != NIL);
+ Assert((list_length(tmp_clauses) >= 2) || (not_clause(clause) && (list_length(tmp_clauses)==1)));
/* number of matching MCV items */
- or_nmatches = mcvlist->nitems;
+ tmp_nmatches = (or_clause(clause)) ? 0 : mcvlist->nitems;
/* by default none of the MCV items matches the clauses */
- or_matches = palloc0(sizeof(char) * or_nmatches);
+ tmp_matches = palloc0(sizeof(char) * mcvlist->nitems);
- if (or_clause(clause))
- {
- /* OR clauses assume nothing matches, initially */
- memset(or_matches, MVSTATS_MATCH_NONE, sizeof(char)*or_nmatches);
- or_nmatches = 0;
- }
- else
- {
- /* AND clauses assume nothing matches, initially */
- memset(or_matches, MVSTATS_MATCH_FULL, sizeof(char)*or_nmatches);
- }
+ /* AND (and NOT) clauses assume everything matches, initially */
+ if (! or_clause(clause))
+ memset(tmp_matches, MVSTATS_MATCH_FULL, sizeof(char)*mcvlist->nitems);
/* build the match bitmap for the OR-clauses */
- or_nmatches = update_match_bitmap_mcvlist(root, orclauses,
+ tmp_nmatches = update_match_bitmap_mcvlist(root, tmp_clauses,
stakeys, mcvlist,
- or_nmatches, or_matches,
+ tmp_nmatches, tmp_matches,
lowsel, fullmatch, or_clause(clause));
/* merge the bitmap into the existing one*/
for (i = 0; i < mcvlist->nitems; i++)
{
+ /* if this is a NOT clause, we need to invert the results first */
+ if (not_clause(clause))
+ tmp_matches[i] = (MVSTATS_MATCH_FULL - tmp_matches[i]);
+
/*
* To AND-merge the bitmaps, a MIN() semantics is used.
* For OR-merge, use MAX().
*
* FIXME this does not decrease the number of matches
*/
- UPDATE_RESULT(matches[i], or_matches[i], is_or);
+ UPDATE_RESULT(matches[i], tmp_matches[i], is_or);
}
- pfree(or_matches);
+ pfree(tmp_matches);
}
else
- {
elog(ERROR, "unknown clause type: %d", clause->type);
- }
}
/*
@@ -2086,15 +3374,18 @@ update_match_bitmap_mcvlist(PlannerInfo *root, List *clauses,
* this is not uncommon, but for histograms it's not that clear.
*/
static Selectivity
-clauselist_mv_selectivity_histogram(PlannerInfo *root, List *clauses,
- MVStatisticInfo *mvstats)
+clauselist_mv_selectivity_histogram(PlannerInfo *root, MVStatisticInfo *mvstats,
+ List *clauses, List *conditions, bool is_or)
{
int i;
Selectivity s = 0.0;
+ Selectivity t = 0.0;
Selectivity u = 0.0;
int nmatches = 0;
+ int nconditions = 0;
char *matches = NULL;
+ char *condition_matches = NULL;
MVSerializedHistogram mvhist = NULL;
@@ -2107,25 +3398,55 @@ clauselist_mv_selectivity_histogram(PlannerInfo *root, List *clauses,
Assert (mvhist != NULL);
Assert (clauses != NIL);
- Assert (list_length(clauses) >= 2);
+ Assert (list_length(clauses) >= 1);
+
+ nmatches = mvhist->nbuckets;
+ nconditions = mvhist->nbuckets;
/*
- * Bitmap of bucket matches (mismatch, partial, full). by default
- * all buckets fully match (and we'll eliminate them).
+ * Bitmap of bucket matches (mismatch, partial, full).
+ *
+ * For AND clauses all buckets match (and we'll eliminate them).
+ * For OR clauses no buckets match (and we'll add them).
+ *
+ * We only need to do the memset for AND clauses (for OR clauses
+ * it's already set correctly by the palloc0).
*/
- matches = palloc0(sizeof(char) * mvhist->nbuckets);
- memset(matches, MVSTATS_MATCH_FULL, sizeof(char)*mvhist->nbuckets);
+ matches = palloc0(sizeof(char) * nmatches);
- nmatches = mvhist->nbuckets;
+ if (! is_or) /* AND-clause */
+ memset(matches, MVSTATS_MATCH_FULL, sizeof(char)*nmatches);
+
+ /* Conditions are treated as an AND clause, so all buckets match by default. */
+ condition_matches = palloc0(sizeof(char)*nconditions);
+ memset(condition_matches, MVSTATS_MATCH_FULL, sizeof(char)*nconditions);
+
+ /*
+ * build the match bitmap for the conditions (conditions are always
+ * connected by AND)
+ */
+ if (conditions != NIL)
+ update_match_bitmap_histogram(root, conditions,
+ mvstats->stakeys, mvhist,
+ nconditions, condition_matches, false);
- /* build the match bitmap */
+ /*
+ * build the match bitmap for the estimated clauses
+ *
+ * TODO This evaluates the clauses for all buckets, even those
+ * ruled out by the conditions. The final result would be
+ * the same, but skipping them might be faster.
+ */
update_match_bitmap_histogram(root, clauses,
mvstats->stakeys, mvhist,
- nmatches, matches, false);
+ ((is_or) ? 0 : nmatches), matches,
+ is_or);
/* now, walk through the buckets and sum the selectivities */
for (i = 0; i < mvhist->nbuckets; i++)
{
+ float coeff = 1.0;
+
/*
* Find out what part of the data is covered by the histogram,
* so that we can 'scale' the selectivity properly (e.g. when
@@ -2139,10 +3460,23 @@ clauselist_mv_selectivity_histogram(PlannerInfo *root, List *clauses,
*/
u += mvhist->buckets[i]->ntuples;
+ /* skip buckets not matching the conditions */
+ if (condition_matches[i] == MVSTATS_MATCH_NONE)
+ continue;
+ else if (condition_matches[i] == MVSTATS_MATCH_PARTIAL)
+ coeff = 0.5;
+
+ t += coeff * mvhist->buckets[i]->ntuples;
+
if (matches[i] == MVSTATS_MATCH_FULL)
- s += mvhist->buckets[i]->ntuples;
+ s += coeff * mvhist->buckets[i]->ntuples;
else if (matches[i] == MVSTATS_MATCH_PARTIAL)
- s += 0.5 * mvhist->buckets[i]->ntuples;
+ /*
+ * TODO If both conditions and clauses match partially, this
+ * will use a 0.25 match - not sure if that's the right
+ * solution, but it seems about right.
+ */
+ s += coeff * 0.5 * mvhist->buckets[i]->ntuples;
}
#ifdef DEBUG_MVHIST
@@ -2151,9 +3485,14 @@ clauselist_mv_selectivity_histogram(PlannerInfo *root, List *clauses,
/* release the allocated bitmap and deserialized histogram */
pfree(matches);
+ pfree(condition_matches);
pfree(mvhist);
- return s * u;
+ /* no condition matches */
+ if (t == 0.0)
+ return (Selectivity)0.0;
+
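+ /*
+  * As for the MCV list: (s / t) is P(clauses | conditions) within the
+  * histogram, and u scales that to the fraction of rows the histogram
+  * covers.
+  */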
+ return (s / t) * u;
}
/* cached result of bucket boundary comparison for a single dimension */
@@ -2301,7 +3640,7 @@ update_match_bitmap_histogram(PlannerInfo *root, List *clauses,
{
int i;
ListCell * l;
-
+
/*
* Used for caching function calls, only once per deduplicated value.
*
@@ -2344,7 +3683,7 @@ update_match_bitmap_histogram(PlannerInfo *root, List *clauses,
FmgrInfo opproc; /* operator */
fmgr_info(get_opcode(expr->opno), &opproc);
-
+
/* reset the cache (per clause) */
memset(callcache, 0, mvhist->nbuckets);
@@ -2504,64 +3843,57 @@ update_match_bitmap_histogram(PlannerInfo *root, List *clauses,
UPDATE_RESULT(matches[i], MVSTATS_MATCH_NONE, is_or);
}
}
- else if (or_clause(clause) || and_clause(clause))
+ else if (or_clause(clause) || and_clause(clause) || not_clause(clause))
{
/* AND/OR clause, with all clauses compatible with the selected MV stat */
int i;
- BoolExpr *orclause = ((BoolExpr*)clause);
- List *orclauses = orclause->args;
+ List *tmp_clauses = ((BoolExpr*)clause)->args;
/* match/mismatch bitmap for each bucket */
- int or_nmatches = 0;
- char * or_matches = NULL;
+ int tmp_nmatches = 0;
+ char * tmp_matches = NULL;
- Assert(orclauses != NIL);
- Assert(list_length(orclauses) >= 2);
+ Assert(tmp_clauses != NIL);
+ Assert((list_length(tmp_clauses) >= 2) || (not_clause(clause) && (list_length(tmp_clauses)==1)));
/* number of matching buckets */
- or_nmatches = mvhist->nbuckets;
+ tmp_nmatches = (or_clause(clause)) ? 0 : mvhist->nbuckets;
- /* by default none of the buckets matches the clauses */
- or_matches = palloc0(sizeof(char) * or_nmatches);
+ /* by default none of the buckets matches the clauses (OR clause) */
+ tmp_matches = palloc0(sizeof(char) * mvhist->nbuckets);
- if (or_clause(clause))
- {
- /* OR clauses assume nothing matches, initially */
- memset(or_matches, MVSTATS_MATCH_NONE, sizeof(char)*or_nmatches);
- or_nmatches = 0;
- }
- else
- {
- /* AND clauses assume nothing matches, initially */
- memset(or_matches, MVSTATS_MATCH_FULL, sizeof(char)*or_nmatches);
- }
+ /* but AND (and NOT) clauses assume everything matches, initially */
+ if (! or_clause(clause))
+ memset(tmp_matches, MVSTATS_MATCH_FULL, sizeof(char)*mvhist->nbuckets);
/* build the match bitmap for the OR-clauses */
- or_nmatches = update_match_bitmap_histogram(root, orclauses,
+ tmp_nmatches = update_match_bitmap_histogram(root, tmp_clauses,
stakeys, mvhist,
- or_nmatches, or_matches, or_clause(clause));
+ tmp_nmatches, tmp_matches, or_clause(clause));
/* merge the bitmap into the existing one*/
for (i = 0; i < mvhist->nbuckets; i++)
{
+ /* if this is a NOT clause, we need to invert the results first */
+ if (not_clause(clause))
+ tmp_matches[i] = (MVSTATS_MATCH_FULL - tmp_matches[i]);
+
/*
* To AND-merge the bitmaps, a MIN() semantics is used.
* For OR-merge, use MAX().
*
* FIXME this does not decrease the number of matches
*/
- UPDATE_RESULT(matches[i], or_matches[i], is_or);
+ UPDATE_RESULT(matches[i], tmp_matches[i], is_or);
}
- pfree(or_matches);
-
+ pfree(tmp_matches);
}
else
elog(ERROR, "unknown clause type: %d", clause->type);
}
- /* free the call cache */
pfree(callcache);
return nmatches;
diff --git a/src/backend/optimizer/path/costsize.c b/src/backend/optimizer/path/costsize.c
index 5350329..57214e0 100644
--- a/src/backend/optimizer/path/costsize.c
+++ b/src/backend/optimizer/path/costsize.c
@@ -3518,7 +3518,8 @@ compute_semi_anti_join_factors(PlannerInfo *root,
joinquals,
0,
jointype,
- sjinfo);
+ sjinfo,
+ NIL);
/*
* Also get the normal inner-join selectivity of the join clauses.
@@ -3541,7 +3542,8 @@ compute_semi_anti_join_factors(PlannerInfo *root,
joinquals,
0,
JOIN_INNER,
- &norm_sjinfo);
+ &norm_sjinfo,
+ NIL);
/* Avoid leaking a lot of ListCells */
if (jointype == JOIN_ANTI)
@@ -3708,7 +3710,7 @@ approx_tuple_count(PlannerInfo *root, JoinPath *path, List *quals)
Node *qual = (Node *) lfirst(l);
/* Note that clause_selectivity will be able to cache its result */
- selec *= clause_selectivity(root, qual, 0, JOIN_INNER, &sjinfo);
+ selec *= clause_selectivity(root, qual, 0, JOIN_INNER, &sjinfo, NIL);
}
/* Apply it to the input relation sizes */
@@ -3744,7 +3746,8 @@ set_baserel_size_estimates(PlannerInfo *root, RelOptInfo *rel)
rel->baserestrictinfo,
0,
JOIN_INNER,
- NULL);
+ NULL,
+ NIL);
rel->rows = clamp_row_est(nrows);
@@ -3781,7 +3784,8 @@ get_parameterized_baserel_size(PlannerInfo *root, RelOptInfo *rel,
allclauses,
rel->relid, /* do not use 0! */
JOIN_INNER,
- NULL);
+ NULL,
+ NIL);
nrows = clamp_row_est(nrows);
/* For safety, make sure result is not more than the base estimate */
if (nrows > rel->rows)
@@ -3919,12 +3923,14 @@ calc_joinrel_size_estimate(PlannerInfo *root,
joinquals,
0,
jointype,
- sjinfo);
+ sjinfo,
+ NIL);
pselec = clauselist_selectivity(root,
pushedquals,
0,
jointype,
- sjinfo);
+ sjinfo,
+ NIL);
/* Avoid leaking a lot of ListCells */
list_free(joinquals);
@@ -3936,7 +3942,8 @@ calc_joinrel_size_estimate(PlannerInfo *root,
restrictlist,
0,
jointype,
- sjinfo);
+ sjinfo,
+ NIL);
pselec = 0.0; /* not used, keep compiler quiet */
}
diff --git a/src/backend/optimizer/util/orclauses.c b/src/backend/optimizer/util/orclauses.c
index ea831f5..6299e75 100644
--- a/src/backend/optimizer/util/orclauses.c
+++ b/src/backend/optimizer/util/orclauses.c
@@ -280,7 +280,7 @@ consider_new_or_clause(PlannerInfo *root, RelOptInfo *rel,
* saving work later.)
*/
or_selec = clause_selectivity(root, (Node *) or_rinfo,
- 0, JOIN_INNER, NULL);
+ 0, JOIN_INNER, NULL, NIL);
/*
* The clause is only worth adding to the query if it rejects a useful
@@ -342,7 +342,7 @@ consider_new_or_clause(PlannerInfo *root, RelOptInfo *rel,
/* Compute inner-join size */
orig_selec = clause_selectivity(root, (Node *) join_or_rinfo,
- 0, JOIN_INNER, &sjinfo);
+ 0, JOIN_INNER, &sjinfo, NIL);
/* And hack cached selectivity so join size remains the same */
join_or_rinfo->norm_selec = orig_selec / or_selec;
diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index d396ef1..805d633 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -1627,13 +1627,15 @@ booltestsel(PlannerInfo *root, BoolTestType booltesttype, Node *arg,
case IS_NOT_FALSE:
selec = (double) clause_selectivity(root, arg,
varRelid,
- jointype, sjinfo);
+ jointype, sjinfo,
+ NIL);
break;
case IS_FALSE:
case IS_NOT_TRUE:
selec = 1.0 - (double) clause_selectivity(root, arg,
varRelid,
- jointype, sjinfo);
+ jointype, sjinfo,
+ NIL);
break;
default:
elog(ERROR, "unrecognized booltesttype: %d",
@@ -6260,7 +6262,8 @@ genericcostestimate(PlannerInfo *root,
indexSelectivity = clauselist_selectivity(root, selectivityQuals,
index->rel->relid,
JOIN_INNER,
- NULL);
+ NULL,
+ NIL);
/*
* If caller didn't give us an estimate, estimate the number of index
@@ -6580,7 +6583,8 @@ btcostestimate(PlannerInfo *root, IndexPath *path, double loop_count,
btreeSelectivity = clauselist_selectivity(root, selectivityQuals,
index->rel->relid,
JOIN_INNER,
- NULL);
+ NULL,
+ NIL);
numIndexTuples = btreeSelectivity * index->rel->tuples;
/*
@@ -7331,7 +7335,8 @@ gincostestimate(PlannerInfo *root, IndexPath *path, double loop_count,
*indexSelectivity = clauselist_selectivity(root, selectivityQuals,
index->rel->relid,
JOIN_INNER,
- NULL);
+ NULL,
+ NIL);
/* fetch estimated page cost for tablespace containing index */
get_tablespace_page_costs(index->reltablespace,
@@ -7561,7 +7566,7 @@ brincostestimate(PlannerInfo *root, IndexPath *path, double loop_count,
*indexSelectivity =
clauselist_selectivity(root, indexQuals,
path->indexinfo->rel->relid,
- JOIN_INNER, NULL);
+ JOIN_INNER, NULL, NIL);
*indexCorrelation = 1;
/*
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index edcafce..b7aabed 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -75,6 +75,7 @@
#include "utils/bytea.h"
#include "utils/guc_tables.h"
#include "utils/memutils.h"
+#include "utils/mvstats.h"
#include "utils/pg_locale.h"
#include "utils/plancache.h"
#include "utils/portal.h"
@@ -393,6 +394,15 @@ static const struct config_enum_entry force_parallel_mode_options[] = {
};
/*
+ * Search algorithm for multivariate stats.
+ */
+static const struct config_enum_entry mvstat_search_options[] = {
+ {"greedy", MVSTAT_SEARCH_GREEDY, false},
+ {"exhaustive", MVSTAT_SEARCH_EXHAUSTIVE, false},
+ {NULL, 0, false}
+};
+
+/*
* Options for enum values stored in other modules
*/
extern const struct config_enum_entry wal_level_options[];
@@ -3743,6 +3753,16 @@ static struct config_enum ConfigureNamesEnum[] =
NULL, NULL, NULL
},
+ {
+ {"mvstat_search", PGC_USERSET, QUERY_TUNING_OTHER,
+ gettext_noop("Sets the algorithm used for combining multivariate stats."),
+ NULL
+ },
+ &mvstat_search_type,
+ MVSTAT_SEARCH_GREEDY, mvstat_search_options,
+ NULL, NULL, NULL
+ },
+
/* End-of-list marker */
{
{NULL, 0, 0, NULL, NULL}, NULL, 0, NULL, NULL, NULL, NULL
diff --git a/src/backend/utils/mvstats/README.stats b/src/backend/utils/mvstats/README.stats
index 3e4f4d1..d404914 100644
--- a/src/backend/utils/mvstats/README.stats
+++ b/src/backend/utils/mvstats/README.stats
@@ -90,6 +90,137 @@ even attempting to do the more expensive estimation.
Whenever we find there are no suitable stats, we skip the expensive steps.
+Combining multiple statistics
+-----------------------------
+
+When estimating selectivity of a list of clauses, there may exist no statistics
+covering all of them. If there are multiple statistics, each covering some
+subset of the attributes, the optimizer needs to figure out which of those
+statistics to apply.
+
+When the statistics do not overlap, the solution is trivial - we can simply
+split the groups of conditions by the matching statistics, and then multiply the
+selectivities. For example assume multivariate statistics on (b,c) and (d,e),
+and a condition like this:
+
+ (a=1) AND (b=2) AND (c=3) AND (d=4) AND (e=5)
+
+Then (a=1) is not covered by any of the statistics, so it will be estimated
+using the regular per-column statistics. The two conditions ((b=2) AND (c=3))
+will be estimated using the (b,c) statistics, and ((d=4) AND (e=5)) using the
+(d,e) statistics. The resulting selectivities are then simply multiplied:
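+
+    P(a=1) * P(b=2 & c=3) * P(d=4 & e=5)
+
+where the first factor comes from the per-column statistics and the other two
+from the multivariate statistics.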
+
+Now, what if the statistics overlap? For example assume the same condition as
+above, but let's say we have statistics on (a,b,c) and (a,c,d,e). What then?
+
+As selectivity is just the probability that the condition holds for a random row,
+we can write the selectivity like this:
+
+ P(a=1 & b=2 & c=3 & d=4 & e=5)
+
+and we can rewrite it using conditional probability like this
+
+ P(a=1 & b=2 & c=3) * P(d=4 & e=5 | a=1 & b=2 & c=3)
+
+Notice that the first part already matches the (a,b,c) statistics. If we assume
+that columns that are not referenced by the same statistics are independent, we
+may rewrite the second half like this
+
+ P(d=4 & e=5 | a=1 & b=2 & c=3) = P(d=4 & e=5 | a=1 & c=3)
+
+which corresponds to the statistics on (a,c,d,e).
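+
+To illustrate with made-up numbers: if the (a,b,c) statistics gives
+P(a=1 & b=2 & c=3) = 0.01, and the (a,c,d,e) statistics gives
+P(d=4 & e=5 | a=1 & c=3) = 0.5, the combined estimate is
+
+    0.01 * 0.5 = 0.005
+
+likely a much better result than multiplying the plain per-column
+selectivities of (d=4) and (e=5) on top of the 0.01.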
+
+If there are multiple statistics defined on a table, it's not difficult to come
+up with examples where there are multiple ways to combine them to cover a list of
+clauses. We need a way to find the best combination of statistics.
+
+This is the purpose of choose_mv_statistics(). It searches through the possible
+combinations of statistics, looking for a combination that
+
+ (a) covers the most clauses of the list
+
+ (b) reuses the maximum number of clauses as conditions
+ (in conditional probabilities)
+
+While criterion (a) seems natural, (b) may seem a bit awkward at first. The
+idea is that conditions are a way of transferring information about
+dependencies between statistics.
+
+There are two alternative implementations of choose_mv_statistics() - greedy
+and exhaustive. Exhaustive actually searches through all possible combinations
+of statistics, and for larger numbers of statistics may get quite expensive
+(as it, unsurprisingly, has exponential cost). Greedy terminates in fewer than
+K steps (where K is the number of clauses), and in each step chooses the best
+next statistics. I've been unable to come up with an example where those two
+approaches would produce different combinations.
+
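+A minimal sketch of a single greedy step is below, assuming the attnum sets
+are plain 64-bit masks; the real implementation works with Bitmapsets and the
+cover maps built in choose_mv_statistics(), so greedy_pick() and its scoring
+are illustrative only (it also uses the GCC/Clang __builtin_popcountll):
+
+    #include <stdint.h>
+
+    /*
+     * Pick the next statistics for the greedy search: the one covering
+     * the most not-yet-estimated clause attributes, breaking ties by
+     * the number of condition attributes it can reuse. Returns -1 if
+     * no statistics covers any new attribute.
+     */
+    static int
+    greedy_pick(int nstats, const uint64_t *stats_attnums,
+                uint64_t clause_attnums, uint64_t condition_attnums)
+    {
+        int     i;
+        int     best = -1;
+        int     best_new = 0;
+        int     best_reused = 0;
+
+        for (i = 0; i < nstats; i++)
+        {
+            /* clause attributes this statistics would newly cover */
+            int new_attrs = __builtin_popcountll(stats_attnums[i] &
+                                clause_attnums & ~condition_attnums);
+
+            /* condition attributes usable to transfer information */
+            int reused = __builtin_popcountll(stats_attnums[i] &
+                                condition_attnums);
+
+            if (new_attrs == 0)
+                continue;       /* adds nothing new, skip */
+
+            if ((new_attrs > best_new) ||
+                ((new_attrs == best_new) && (reused > best_reused)))
+            {
+                best = i;
+                best_new = new_attrs;
+                best_reused = reused;
+            }
+        }
+
+        return best;
+    }
+
+The caller then moves the attributes covered by the chosen statistics from
+the clause set into the condition set and repeats, so the search terminates
+after at most K steps.
+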
+It's possible to choose the algorithm using the mvstat_search_type GUC, set to
+either 'greedy' or 'exhaustive' (the default is 'greedy'):
+
+ SET mvstat_search_type = 'exhaustive';
+
+Note: This is meant mostly for experimentation. I do expect we'll choose one of
+the algorithms and remove the GUC before commit.
+
+
+Limitations of combining statistics
+-----------------------------------
+
+As described in the section 'Combining multiple statistics', the current approach
+is based on transferring information between statistics by means of conditional
+probabilities. This is a relatively cheap and efficient approach, but it is
+based on two assumptions:
+
+ (1) The overlap between the statistics needs to be sufficiently large, i.e.
+ there needs to be enough columns shared by the statistics to transfer
+ information about dependencies between the remaining columns.
+
+ (2) The query needs to include sufficient clauses on the shared columns.
+
+How violating those assumptions causes problems can be illustrated by
+a simple example. Assume a table with three columns (a,b,c) containing exactly
+the same values, and statistics on (a,b) and (b,c):
+
+ CREATE TABLE test (a, b, c) AS SELECT i, i, i
+ FROM generate_series(1,1000) s(i);
+
+ CREATE STATISTICS s1 ON test (a,b) WITH (mcv);
+ CREATE STATISTICS s2 ON test (b,c) WITH (mcv);
+
+ ANALYZE test;
+
+First, let's estimate this query:
+
+ SELECT * FROM test WHERE (a < 10) AND (c < 10);
+
+Clearly, there are no conditions on 'b' (which is the only column shared by the
+two statistics), so we'll end up with an estimate based on the assumption of
+independence:
+
+ P(a < 10) * P(c < 10) = 0.01 * 0.01 = 0.0001
+
+That is a significant under-estimate, as the actual selectivity is 0.01.
+
+But let's estimate another query:
+
+ SELECT * FROM test WHERE (a < 10) AND (b < 500) AND (c < 10);
+
+In this case, the estimate may be computed for example like this:
+
+ P[(a < 10) & (b < 500) & (c < 10)]
+ = P[(a < 10) & (b < 500)] * P[(c < 10) | (a < 10) & (b < 500)]
+ = P[(a < 10) & (b < 500)] * P[(c < 10) | (b < 500)]
+
+The trouble is that the probability P(c < 10 | b < 500) evaluates to 0.02,
+because we have to assume (a) and (c) are independent (there is no statistics
+covering both these columns), and the condition on (b) does not transfer a
+sufficient amount of information between the two statistics.
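+
+Plugging in the numbers: (b < 500) matches half the rows, and since all the
+columns are identical, P[(a < 10) & (b < 500)] = P(a < 10) = 0.01, so the
+estimate works out as
+
+    0.01 * P[(c < 10) | (b < 500)] = 0.01 * 0.02 = 0.0002
+
+still a severe under-estimate of the actual selectivity 0.01.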
+
+Currently, the only solution is to build statistics on all three columns, but
+see the 'Combining stats using convolution' section for ideas on how to
+improve this.
+
+
Further (possibly crazy) ideas
------------------------------
@@ -111,3 +242,38 @@ But of course, this may result in expensive estimation (CPU-wise).
So we might add a GUC to choose between a simple (single statistics) and thus
multi-statistic estimation, possibly table-level parameter (ALTER TABLE ...).
+
+
+Combining stats using convolution
+---------------------------------
+
+The current approach to combining statistics is based on conditional
+probabilities, and thus only works when the query includes conditions on the
+overlapping parts of the statistics. But there may be other ways to combine
+statistics, relaxing this requirement.
+
+Let's assume two histograms H1 and H2 - then combining them might work roughly
+like this:
+
+
+ for (buckets of H1, satisfying local conditions)
+ {
+ for (buckets of H2, overlapping with H1 bucket)
+ {
+ mark H2 bucket as 'valid'
+ }
+ }
+
+ s1 = s2 = 0.0
+ for (buckets of H2 marked as valid)
+ {
+ s1 += frequency
+
+ if (bucket satisfies local conditions)
+ s2 += frequency
+ }
+
+ s = (s2 / s1) /* final selectivity estimate */
+
+However this may quickly get non-trivial, e.g. when combining two statistics
+of different types (histogram vs. MCV).
diff --git a/src/include/optimizer/cost.h b/src/include/optimizer/cost.h
index fea2bb7..33f5a1b 100644
--- a/src/include/optimizer/cost.h
+++ b/src/include/optimizer/cost.h
@@ -192,11 +192,13 @@ extern Selectivity clauselist_selectivity(PlannerInfo *root,
List *clauses,
int varRelid,
JoinType jointype,
- SpecialJoinInfo *sjinfo);
+ SpecialJoinInfo *sjinfo,
+ List *conditions);
extern Selectivity clause_selectivity(PlannerInfo *root,
Node *clause,
int varRelid,
JoinType jointype,
- SpecialJoinInfo *sjinfo);
+ SpecialJoinInfo *sjinfo,
+ List *conditions);
#endif /* COST_H */
diff --git a/src/include/utils/mvstats.h b/src/include/utils/mvstats.h
index 6708139..80bf96f 100644
--- a/src/include/utils/mvstats.h
+++ b/src/include/utils/mvstats.h
@@ -17,6 +17,14 @@
#include "fmgr.h"
#include "commands/vacuum.h"
+typedef enum MVStatSearchType
+{
+ MVSTAT_SEARCH_EXHAUSTIVE, /* exhaustive search */
+ MVSTAT_SEARCH_GREEDY /* greedy search */
+} MVStatSearchType;
+
+extern int mvstat_search_type;
+
/*
* Degree of how much MCV item / histogram bucket matches a clause.
* This is then considered when computing the selectivity.
--
2.5.0
Attachment: 0005-multivariate-histograms.patch (text/x-patch)
From 93e428970d3d814f6b61e4a5f4384237cf94ed41 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tv@fuzzy.cz>
Date: Sun, 11 Jan 2015 20:18:24 +0100
Subject: [PATCH 5/9] multivariate histograms
- extends the pg_mv_statistic catalog (add 'hist' fields)
- building the histograms during ANALYZE
- simple estimation while planning the queries
Includes regression tests mostly equal to those for functional
dependencies / MCV lists.
---
doc/src/sgml/ref/create_statistics.sgml | 44 +
src/backend/catalog/system_views.sql | 4 +-
src/backend/commands/statscmds.c | 44 +-
src/backend/nodes/outfuncs.c | 2 +
src/backend/optimizer/path/clausesel.c | 574 +++++++-
src/backend/optimizer/util/plancat.c | 4 +-
src/backend/utils/mvstats/Makefile | 2 +-
src/backend/utils/mvstats/README.histogram | 299 ++++
src/backend/utils/mvstats/README.stats | 2 +
src/backend/utils/mvstats/common.c | 37 +-
src/backend/utils/mvstats/histogram.c | 2023 ++++++++++++++++++++++++++++
src/bin/psql/describe.c | 17 +-
src/include/catalog/pg_mv_statistic.h | 24 +-
src/include/catalog/pg_proc.h | 4 +
src/include/nodes/relation.h | 2 +
src/include/utils/mvstats.h | 136 +-
src/test/regress/expected/mv_histogram.out | 207 +++
src/test/regress/expected/rules.out | 4 +-
src/test/regress/parallel_schedule | 2 +-
src/test/regress/serial_schedule | 1 +
src/test/regress/sql/mv_histogram.sql | 176 +++
21 files changed, 3570 insertions(+), 38 deletions(-)
create mode 100644 src/backend/utils/mvstats/README.histogram
create mode 100644 src/backend/utils/mvstats/histogram.c
create mode 100644 src/test/regress/expected/mv_histogram.out
create mode 100644 src/test/regress/sql/mv_histogram.sql
diff --git a/doc/src/sgml/ref/create_statistics.sgml b/doc/src/sgml/ref/create_statistics.sgml
index d6973e8..f7336fd 100644
--- a/doc/src/sgml/ref/create_statistics.sgml
+++ b/doc/src/sgml/ref/create_statistics.sgml
@@ -133,6 +133,24 @@ CREATE STATISTICS [ IF NOT EXISTS ] <replaceable class="PARAMETER">statistics_na
</varlistentry>
<varlistentry>
+ <term><literal>histogram</> (<type>boolean</>)</term>
+ <listitem>
+ <para>
+ Enables histogram for the statistics.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><literal>max_buckets</> (<type>integer</>)</term>
+ <listitem>
+ <para>
+ Maximum number of histogram buckets.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
<term><literal>max_mcv_items</> (<type>integer</>)</term>
<listitem>
<para>
@@ -220,6 +238,32 @@ EXPLAIN ANALYZE SELECT * FROM t2 WHERE (a = 1) AND (b = 2);
</programlisting>
</para>
+ <para>
+ Create table <structname>t3</> with two strongly correlated columns, and
+ a histogram on those two columns:
+
+<programlisting>
+CREATE TABLE t3 (
+ a float,
+ b float
+);
+
+INSERT INTO t3 SELECT mod(i,1000), mod(i,1000) + 50 * (r - 0.5) FROM (
+ SELECT i, random() r FROM generate_series(1,1000000) s(i)
+ ) foo;
+
+CREATE STATISTICS s3 ON t3 (a, b) WITH (histogram);
+
+ANALYZE t3;
+
+-- small overlap
+EXPLAIN ANALYZE SELECT * FROM t3 WHERE (a < 500) AND (b > 500);
+
+-- no overlap
+EXPLAIN ANALYZE SELECT * FROM t3 WHERE (a < 400) AND (b > 600);
+</programlisting>
+ </para>
+
</refsect1>
<refsect1>
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 5c40334..b151db1 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -167,7 +167,9 @@ CREATE VIEW pg_mv_stats AS
length(S.stadeps) as depsbytes,
pg_mv_stats_dependencies_info(S.stadeps) as depsinfo,
length(S.stamcv) AS mcvbytes,
- pg_mv_stats_mcvlist_info(S.stamcv) AS mcvinfo
+ pg_mv_stats_mcvlist_info(S.stamcv) AS mcvinfo,
+ length(S.stahist) AS histbytes,
+ pg_mv_stats_histogram_info(S.stahist) AS histinfo
FROM (pg_mv_statistic S JOIN pg_class C ON (C.oid = S.starelid))
LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace);
diff --git a/src/backend/commands/statscmds.c b/src/backend/commands/statscmds.c
index c480fbe..e0b085f 100644
--- a/src/backend/commands/statscmds.c
+++ b/src/backend/commands/statscmds.c
@@ -71,12 +71,15 @@ CreateStatistics(CreateStatsStmt *stmt)
/* by default build nothing */
bool build_dependencies = false,
- build_mcv = false;
+ build_mcv = false,
+ build_histogram = false;
- int32 max_mcv_items = -1;
+ int32 max_buckets = -1,
+ max_mcv_items = -1;
/* options required because of other options */
- bool require_mcv = false;
+ bool require_mcv = false,
+ require_histogram = false;
Assert(IsA(stmt, CreateStatsStmt));
@@ -175,6 +178,29 @@ CreateStatistics(CreateStatsStmt *stmt)
MVSTAT_MCVLIST_MAX_ITEMS)));
}
+ else if (strcmp(opt->defname, "histogram") == 0)
+ build_histogram = defGetBoolean(opt);
+ else if (strcmp(opt->defname, "max_buckets") == 0)
+ {
+ max_buckets = defGetInt32(opt);
+
+ /* this option requires 'histogram' to be enabled */
+ require_histogram = true;
+
+ /* sanity check */
+ if (max_buckets < MVSTAT_HIST_MIN_BUCKETS)
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("minimum number of buckets is %d",
+ MVSTAT_HIST_MIN_BUCKETS)));
+
+ else if (max_buckets > MVSTAT_HIST_MAX_BUCKETS)
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("maximum number of buckets is %d",
+ MVSTAT_HIST_MAX_BUCKETS)));
+
+ }
else
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
@@ -183,10 +209,10 @@ CreateStatistics(CreateStatsStmt *stmt)
}
/* check that at least some statistics were requested */
- if (! (build_dependencies || build_mcv))
+ if (! (build_dependencies || build_mcv || build_histogram))
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
- errmsg("no statistics type (dependencies, mcv) was requested")));
+ errmsg("no statistics type (dependencies, mcv, histogram) was requested")));
/* now do some checking of the options */
if (require_mcv && (! build_mcv))
@@ -194,6 +220,11 @@ CreateStatistics(CreateStatsStmt *stmt)
(errcode(ERRCODE_SYNTAX_ERROR),
errmsg("option 'mcv' is required by other options(s)")));
+ if (require_histogram && (! build_histogram))
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("option 'histogram' is required by other options(s)")));
+
/* sort the attnums and build int2vector */
qsort(attnums, numcols, sizeof(int16), compare_int16);
stakeys = buildint2vector(attnums, numcols);
@@ -214,11 +245,14 @@ CreateStatistics(CreateStatsStmt *stmt)
values[Anum_pg_mv_statistic_deps_enabled -1] = BoolGetDatum(build_dependencies);
values[Anum_pg_mv_statistic_mcv_enabled -1] = BoolGetDatum(build_mcv);
+ values[Anum_pg_mv_statistic_hist_enabled -1] = BoolGetDatum(build_histogram);
values[Anum_pg_mv_statistic_mcv_max_items -1] = Int32GetDatum(max_mcv_items);
+ values[Anum_pg_mv_statistic_hist_max_buckets -1] = Int32GetDatum(max_buckets);
nulls[Anum_pg_mv_statistic_stadeps -1] = true;
nulls[Anum_pg_mv_statistic_stamcv -1] = true;
+ nulls[Anum_pg_mv_statistic_stahist -1] = true;
/* insert the tuple into pg_mv_statistic */
mvstatrel = heap_open(MvStatisticRelationId, RowExclusiveLock);
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index 333e24b..9172f21 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -2163,10 +2163,12 @@ _outMVStatisticInfo(StringInfo str, const MVStatisticInfo *node)
/* enabled statistics */
WRITE_BOOL_FIELD(deps_enabled);
WRITE_BOOL_FIELD(mcv_enabled);
+ WRITE_BOOL_FIELD(hist_enabled);
/* built/available statistics */
WRITE_BOOL_FIELD(deps_built);
WRITE_BOOL_FIELD(mcv_built);
+ WRITE_BOOL_FIELD(hist_built);
}
static void
diff --git a/src/backend/optimizer/path/clausesel.c b/src/backend/optimizer/path/clausesel.c
index 7fc0c49..5e73a4e 100644
--- a/src/backend/optimizer/path/clausesel.c
+++ b/src/backend/optimizer/path/clausesel.c
@@ -49,6 +49,7 @@ static void addRangeClause(RangeQueryClause **rqlist, Node *clause,
#define MV_CLAUSE_TYPE_FDEP 0x01
#define MV_CLAUSE_TYPE_MCV 0x02
+#define MV_CLAUSE_TYPE_HIST 0x04
static bool clause_is_mv_compatible(Node *clause, Index relid, Bitmapset **attnums,
int type);
@@ -74,6 +75,8 @@ static Selectivity clauselist_mv_selectivity(PlannerInfo *root,
static Selectivity clauselist_mv_selectivity_mcvlist(PlannerInfo *root,
List *clauses, MVStatisticInfo *mvstats,
bool *fullmatch, Selectivity *lowsel);
+static Selectivity clauselist_mv_selectivity_histogram(PlannerInfo *root,
+ List *clauses, MVStatisticInfo *mvstats);
static int update_match_bitmap_mcvlist(PlannerInfo *root, List *clauses,
int2vector *stakeys, MCVList mcvlist,
@@ -81,6 +84,12 @@ static int update_match_bitmap_mcvlist(PlannerInfo *root, List *clauses,
Selectivity *lowsel, bool *fullmatch,
bool is_or);
+static int update_match_bitmap_histogram(PlannerInfo *root, List *clauses,
+ int2vector *stakeys,
+ MVSerializedHistogram mvhist,
+ int nmatches, char * matches,
+ bool is_or);
+
static bool has_stats(List *stats, int type);
static List * find_stats(PlannerInfo *root, Index relid);
@@ -95,6 +104,7 @@ static bool stats_type_matches(MVStatisticInfo *stat, int type);
#define UPDATE_RESULT(m,r,isor) \
(m) = (isor) ? (MAX(m,r)) : (MIN(m,r))
+
/****************************************************************************
* ROUTINES TO COMPUTE SELECTIVITIES
****************************************************************************/
@@ -123,7 +133,7 @@ static bool stats_type_matches(MVStatisticInfo *stat, int type);
*
* First we try to reduce the list of clauses by applying (soft) functional
* dependencies, and then we try to estimate the selectivity of the reduced
- * list of clauses using the multivariate MCV list.
+ * list of clauses using the multivariate MCV list and histograms.
*
* Finally we remove the portion of clauses estimated using multivariate stats,
* and process the rest of the clauses using the regular per-column stats.
@@ -216,11 +226,13 @@ clauselist_selectivity(PlannerInfo *root,
* with the multivariate code and simply skip to estimation using the
* regular per-column stats.
*/
- if (has_stats(stats, MV_CLAUSE_TYPE_MCV) &&
- (count_mv_attnums(clauses, relid, MV_CLAUSE_TYPE_MCV) >= 2))
+ if (has_stats(stats, MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST) &&
+ (count_mv_attnums(clauses, relid,
+ MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST) >= 2))
{
/* collect attributes from the compatible conditions */
- Bitmapset *mvattnums = collect_mv_attnums(clauses, relid, MV_CLAUSE_TYPE_MCV);
+ Bitmapset *mvattnums = collect_mv_attnums(clauses, relid,
+ MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST);
/* and search for the statistic covering the most attributes */
MVStatisticInfo *mvstat = choose_mv_statistics(stats, mvattnums);
@@ -232,7 +244,7 @@ clauselist_selectivity(PlannerInfo *root,
/* split the clauselist into regular and mv-clauses */
clauses = clauselist_mv_split(root, relid, clauses, &mvclauses,
- mvstat, MV_CLAUSE_TYPE_MCV);
+ mvstat, MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST);
/* we've chosen the histogram to match the clauses */
Assert(mvclauses != NIL);
@@ -944,6 +956,7 @@ static Selectivity
clauselist_mv_selectivity(PlannerInfo *root, List *clauses, MVStatisticInfo *mvstats)
{
bool fullmatch = false;
+ Selectivity s1 = 0.0, s2 = 0.0;
/*
* Lowest frequency in the MCV list (may be used as an upper bound
@@ -957,9 +970,24 @@ clauselist_mv_selectivity(PlannerInfo *root, List *clauses, MVStatisticInfo *mvs
* MCV/histogram evaluation).
*/
- /* Evaluate the MCV selectivity */
- return clauselist_mv_selectivity_mcvlist(root, clauses, mvstats,
+ /* Evaluate the MCV first. */
+ s1 = clauselist_mv_selectivity_mcvlist(root, clauses, mvstats,
&fullmatch, &mcv_low);
+
+ /*
+ * If we got a full equality match on the MCV list, we're done (and
+ * the estimate is pretty good).
+ */
+ if (fullmatch && (s1 > 0.0))
+ return s1;
+
+ /* TODO if (fullmatch) without matching MCV item, use the mcv_low
+ * selectivity as upper bound */
+
+ s2 = clauselist_mv_selectivity_histogram(root, clauses, mvstats);
+
+ /* TODO clamp to <= 1.0 (or more strictly, when possible) */
+ return s1 + s2;
}
/*
@@ -1129,7 +1157,7 @@ choose_mv_statistics(List *stats, Bitmapset *attnums)
int numattrs = attrs->dim1;
/* skip dependencies-only stats */
- if (! info->mcv_built)
+ if (! (info->mcv_built || info->hist_built))
continue;
/* count columns covered by the histogram */
@@ -1360,7 +1388,7 @@ mv_compatible_walker(Node *node, mv_compatible_context *context)
case F_SCALARGTSEL:
/* not compatible with functional dependencies */
- if (! (context->types & MV_CLAUSE_TYPE_MCV))
+ if (! (context->types & (MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST)))
return true; /* terminate */
break;
@@ -1588,6 +1616,9 @@ stats_type_matches(MVStatisticInfo *stat, int type)
if ((type & MV_CLAUSE_TYPE_MCV) && stat->mcv_built)
return true;
+ if ((type & MV_CLAUSE_TYPE_HIST) && stat->hist_built)
+ return true;
+
return false;
}
@@ -1606,6 +1637,9 @@ has_stats(List *stats, int type)
/* terminate if we've found at least one matching statistics */
if (stats_type_matches(stat, type))
return true;
+
+ if ((type & MV_CLAUSE_TYPE_HIST) && stat->hist_built)
+ return true;
}
return false;
@@ -2010,3 +2044,525 @@ update_match_bitmap_mcvlist(PlannerInfo *root, List *clauses,
return nmatches;
}
+
+/*
+ * Estimate selectivity of clauses using a histogram.
+ *
+ * If there's no histogram for the stats, the function returns 0.0.
+ *
+ * The general idea of this method is similar to how MCV lists are
+ * processed, except that this introduces the concept of a partial
+ * match (MCV only works with full match / mismatch).
+ *
+ * The algorithm works like this:
+ *
+ * 1) mark all buckets as 'full match'
+ * 2) walk through all the clauses
+ * 3) for a particular clause, walk through all the buckets
+ * 4) skip buckets that are already 'no match'
+ * 5) check clause for buckets that still match (at least partially)
+ * 6) sum frequencies for buckets to get selectivity
+ *
+ * Unlike MCV lists, histograms have a concept of a partial match. In
+ * that case we use 1/2 the bucket, to minimize the average error. The
+ * MV histograms are usually less detailed than the per-column ones,
+ * meaning the sum is often quite high (thanks to combining a lot of
+ * "partially hit" buckets).
+ *
+ * Maybe we could use per-bucket information with the number of distinct
+ * values it contains (for each dimension), and then use that to correct
+ * the estimate (so with 10 distinct values, we'd use 1/10 of the bucket
+ * frequency). We might also scale the value depending on the actual
+ * ndistinct estimate (not just the values observed in the sample).
+ *
+ * Another option would be to multiply the selectivities, i.e. if we get
+ * 'partial match' for a bucket for multiple conditions, we might use
+ * 0.5^k (where k is the number of conditions), instead of 0.5. This
+ * probably does not minimize the average error, though.
+ *
+ * TODO This might use a similar shortcut to MCV lists - count buckets
+ * marked as partial/full match, and terminate once this drops to 0.
+ * Not sure if it's really worth it - for MCV lists a situation like
+ * this is not uncommon, but for histograms it's not that clear.
+ */
+static Selectivity
+clauselist_mv_selectivity_histogram(PlannerInfo *root, List *clauses,
+ MVStatisticInfo *mvstats)
+{
+ int i;
+ Selectivity s = 0.0;
+ Selectivity u = 0.0;
+
+ int nmatches = 0;
+ char *matches = NULL;
+
+ MVSerializedHistogram mvhist = NULL;
+
+ /* there's no histogram */
+ if (! mvstats->hist_built)
+ return 0.0;
+
+ /* There may be no histogram in the stats (check hist_built flag) */
+ mvhist = load_mv_histogram(mvstats->mvoid);
+
+ Assert (mvhist != NULL);
+ Assert (clauses != NIL);
+ Assert (list_length(clauses) >= 2);
+
+ /*
+ * Bitmap of bucket matches (mismatch, partial, full). By default
+ * all buckets fully match (and we'll eliminate them).
+ */
+ matches = palloc0(sizeof(char) * mvhist->nbuckets);
+ memset(matches, MVSTATS_MATCH_FULL, sizeof(char)*mvhist->nbuckets);
+
+ nmatches = mvhist->nbuckets;
+
+ /* build the match bitmap */
+ update_match_bitmap_histogram(root, clauses,
+ mvstats->stakeys, mvhist,
+ nmatches, matches, false);
+
+ /* now, walk through the buckets and sum the selectivities */
+ for (i = 0; i < mvhist->nbuckets; i++)
+ {
+ /*
+ * Find out what part of the data is covered by the histogram,
+ * so that we can 'scale' the selectivity properly (e.g. when
+ * only 50% of the sample got into the histogram, and the rest
+ * is in a MCV list).
+ *
+ * TODO This might be handled by keeping a global "frequency"
+ * for the whole histogram, which might save us some time
+ * spent accessing the not-matching part of the histogram.
+ * Although it's likely in a cache, so it's very fast.
+ */
+ u += mvhist->buckets[i]->ntuples;
+
+ if (matches[i] == MVSTATS_MATCH_FULL)
+ s += mvhist->buckets[i]->ntuples;
+ else if (matches[i] == MVSTATS_MATCH_PARTIAL)
+ s += 0.5 * mvhist->buckets[i]->ntuples;
+ }
+
+#ifdef DEBUG_MVHIST
+ debug_histogram_matches(mvhist, matches);
+#endif
+
+ /* release the allocated bitmap and deserialized histogram */
+ pfree(matches);
+ pfree(mvhist);
+
+ return s * u;
+}
+
+/* cached result of bucket boundary comparison for a single dimension */
+
+#define HIST_CACHE_NOT_FOUND 0x00
+#define HIST_CACHE_FALSE 0x01
+#define HIST_CACHE_TRUE 0x03
+#define HIST_CACHE_MASK 0x02
+
+static char
+bucket_contains_value(FmgrInfo ltproc, Datum constvalue,
+ Datum min_value, Datum max_value,
+ int min_index, int max_index,
+ bool min_include, bool max_include,
+ char * callcache)
+{
+ bool a, b;
+
+ char min_cached = callcache[min_index];
+ char max_cached = callcache[max_index];
+
+ /*
+ * First some quick checks on equality - if either (inclusive) boundary
+ * equals the constant, we have a partial match (so no need to call the
+ * comparator).
+ */
+ if (((min_value == constvalue) && (min_include)) ||
+ ((max_value == constvalue) && (max_include)))
+ return MVSTATS_MATCH_PARTIAL;
+
+ /* Keep the values 0/1 because of the XOR at the end. */
+ a = ((min_cached & HIST_CACHE_MASK) >> 1);
+ b = ((max_cached & HIST_CACHE_MASK) >> 1);
+
+ /*
+ * If the result for the bucket lower bound is not in the cache, evaluate
+ * the function and store the result in the cache.
+ */
+ if (! min_cached)
+ {
+ a = DatumGetBool(FunctionCall2Coll(<proc,
+ DEFAULT_COLLATION_OID,
+ constvalue, min_value));
+ /* remember the result */
+ callcache[min_index] = (a) ? HIST_CACHE_TRUE : HIST_CACHE_FALSE;
+ }
+
+ /* And do the same for the upper bound. */
+ if (! max_cached)
+ {
+ b = DatumGetBool(FunctionCall2Coll(<proc,
+ DEFAULT_COLLATION_OID,
+ constvalue, max_value));
+ /* remember the result */
+ callcache[max_index] = (b) ? HIST_CACHE_TRUE : HIST_CACHE_FALSE;
+ }
+
+ return (a ^ b) ? MVSTATS_MATCH_PARTIAL : MVSTATS_MATCH_NONE;
+}
+
+static char
+bucket_is_smaller_than_value(FmgrInfo opproc, Datum constvalue,
+ Datum min_value, Datum max_value,
+ int min_index, int max_index,
+ bool min_include, bool max_include,
+ char * callcache, bool isgt)
+{
+ char min_cached = callcache[min_index];
+ char max_cached = callcache[max_index];
+
+ /* Keep the values 0/1 because of the XOR at the end. */
+ bool a = ((min_cached & HIST_CACHE_MASK) >> 1);
+ bool b = ((max_cached & HIST_CACHE_MASK) >> 1);
+
+ if (! min_cached)
+ {
+ a = DatumGetBool(FunctionCall2Coll(&opproc,
+ DEFAULT_COLLATION_OID,
+ min_value,
+ constvalue));
+ /* remember the result */
+ callcache[min_index] = (a) ? HIST_CACHE_TRUE : HIST_CACHE_FALSE;
+ }
+
+ if (! max_cached)
+ {
+ b = DatumGetBool(FunctionCall2Coll(&opproc,
+ DEFAULT_COLLATION_OID,
+ max_value,
+ constvalue));
+ /* remember the result */
+ callcache[max_index] = (b) ? HIST_CACHE_TRUE : HIST_CACHE_FALSE;
+ }
+
+ /*
+ * Now, we need to combine both results into the final answer, and we need
+ * to be careful about the 'isgt' variable which kinda inverts the meaning.
+ *
+ * First, we handle the case when each boundary returns different results.
+ * In that case the outcome can only be 'partial' match.
+ */
+ if (a != b)
+ return MVSTATS_MATCH_PARTIAL;
+
+ /*
+ * When the results are the same, then it depends on the 'isgt' value. There
+ * are four options:
+ *
+ * isgt=false a=b=true => full match
+ * isgt=false a=b=false => empty
+ * isgt=true a=b=true => empty
+ * isgt=true a=b=false => full match
+ *
+ * We'll cheat a bit, because we know that (a=b) so we'll use just one of them.
+ */
+ if (isgt)
+ return (!a) ? MVSTATS_MATCH_FULL : MVSTATS_MATCH_NONE;
+ else
+ return ( a) ? MVSTATS_MATCH_FULL : MVSTATS_MATCH_NONE;
+}
+
+/*
+ * Evaluate clauses using the histogram, and update the match bitmap.
+ *
+ * The bitmap may be already partially set, so this is really a way to
+ * combine results of several clause lists - either when computing
+ * conditional probability P(A|B) or a combination of AND/OR clauses.
+ *
+ * Note: This is not a simple bitmap in the sense that there are more
+ * than two possible values for each item - no match, partial
+ * match and full match. So we need 2 bits per item.
+ *
+ * TODO This works with 'bitmap' where each item is represented as a
+ * char, which is slightly wasteful. Instead, we could use a bitmap
+ * with 2 bits per item, reducing the size to ~1/4. By using values
+ * 0, 1 and 3 (instead of 0, 1 and 2), the operations (merging etc.)
+ * might be performed just like for simple bitmap by using & and |,
+ * which might be faster than min/max.
+ */
+static int
+update_match_bitmap_histogram(PlannerInfo *root, List *clauses,
+ int2vector *stakeys,
+ MVSerializedHistogram mvhist,
+ int nmatches, char * matches,
+ bool is_or)
+{
+ int i;
+ ListCell * l;
+
+ /*
+ * Used for caching function calls, only once per deduplicated value.
+ *
+ * We know we may have up to (2 * nbuckets) values per dimension. It's
+ * probably overkill, but let's allocate that once for all clauses,
+ * to minimize overhead.
+ *
+ * Also, we only need two bits per value, but this allocates a byte
+ * per value. Might be worth optimizing.
+ *
+ * 0x00 - not yet called
+ * 0x01 - called, result is 'false'
+ * 0x03 - called, result is 'true'
+ */
+ char *callcache = palloc(mvhist->nbuckets);
+
+ Assert(mvhist != NULL);
+ Assert(mvhist->nbuckets > 0);
+ Assert(nmatches >= 0);
+ Assert(nmatches <= mvhist->nbuckets);
+
+ Assert(clauses != NIL);
+ Assert(list_length(clauses) >= 1);
+
+ /* loop through the clauses and do the estimation */
+ foreach (l, clauses)
+ {
+ Node * clause = (Node*)lfirst(l);
+
+ /* if it's a RestrictInfo, then extract the clause */
+ if (IsA(clause, RestrictInfo))
+ clause = (Node*)((RestrictInfo*)clause)->clause;
+
+ /* it's either OpClause, or NullTest */
+ if (is_opclause(clause))
+ {
+ OpExpr * expr = (OpExpr*)clause;
+ bool varonleft = true;
+ bool ok;
+
+ FmgrInfo opproc; /* operator */
+ fmgr_info(get_opcode(expr->opno), &opproc);
+
+ /* reset the cache (per clause) */
+ memset(callcache, 0, mvhist->nbuckets);
+
+ ok = (NumRelids(clause) == 1) &&
+ (is_pseudo_constant_clause(lsecond(expr->args)) ||
+ (varonleft = false,
+ is_pseudo_constant_clause(linitial(expr->args))));
+
+ if (ok)
+ {
+ FmgrInfo ltproc;
+ RegProcedure oprrest = get_oprrest(expr->opno);
+
+ Var * var = (varonleft) ? linitial(expr->args) : lsecond(expr->args);
+ Const * cst = (varonleft) ? lsecond(expr->args) : linitial(expr->args);
+ bool isgt = (! varonleft);
+
+ TypeCacheEntry *typecache
+ = lookup_type_cache(var->vartype, TYPECACHE_LT_OPR);
+
+ /* lookup dimension for the attribute */
+ int idx = mv_get_index(var->varattno, stakeys);
+
+ fmgr_info(get_opcode(typecache->lt_opr), <proc);
+
+ /*
+ * Check this for all buckets that still have "true" in the bitmap
+ *
+ * We already know the clauses use suitable operators (because that's
+ * how we filtered them).
+ */
+ for (i = 0; i < mvhist->nbuckets; i++)
+ {
+ char res = MVSTATS_MATCH_NONE;
+
+ MVSerializedBucket bucket = mvhist->buckets[i];
+
+ /* histogram boundaries */
+ Datum minval, maxval;
+ bool mininclude, maxinclude;
+ int minidx, maxidx;
+
+ /*
+ * For AND-lists, we can also mark NULL buckets as 'no match'
+ * (and then skip them). For OR-lists this is not possible.
+ */
+ if ((! is_or) && bucket->nullsonly[idx])
+ matches[i] = MVSTATS_MATCH_NONE;
+
+ /*
+ * Skip buckets that were already eliminated - this is important
+ * considering how we update the info (we only lower the match).
+ * We can't really do anything about the MATCH_PARTIAL buckets.
+ */
+ if ((! is_or) && (matches[i] == MVSTATS_MATCH_NONE))
+ continue;
+ else if (is_or && (matches[i] == MVSTATS_MATCH_FULL))
+ continue;
+
+ /* lookup the values and cache of function calls */
+ minidx = bucket->min[idx];
+ maxidx = bucket->max[idx];
+
+ minval = mvhist->values[idx][bucket->min[idx]];
+ maxval = mvhist->values[idx][bucket->max[idx]];
+
+ mininclude = bucket->min_inclusive[idx];
+ maxinclude = bucket->max_inclusive[idx];
+
+ /*
+ * TODO Maybe it's possible to add here a similar optimization
+ * as for the MCV lists:
+ *
+ * (nmatches == 0) && AND-list => all eliminated (FALSE)
+ * (nmatches == N) && OR-list => all eliminated (TRUE)
+ *
+ * But it's more complex because of the partial matches.
+ */
+
+ /*
+ * If it's not a "<" or ">" or "=" operator, just ignore the
+ * clause. Otherwise note the relid and attnum for the variable.
+ *
+ * TODO I'm really unsure whether the handling of the 'isgt' flag (that
+ * is, clauses with reverse order of variable/constant) is correct. I wouldn't
+ * be surprised if there was some mixup. Using the lt/gt operators
+ * instead of messing with the opproc could make it simpler.
+ * It would however be using a different operator than the query,
+ * although it's not any shadier than using the selectivity function
+ * as is done currently.
+ */
+ switch (oprrest)
+ {
+ case F_SCALARLTSEL: /* Var < Const */
+ case F_SCALARGTSEL: /* Var > Const */
+
+ res = bucket_is_smaller_than_value(opproc, cst->constvalue,
+ minval, maxval,
+ minidx, maxidx,
+ mininclude, maxinclude,
+ callcache, isgt);
+ break;
+
+ case F_EQSEL:
+
+ /*
+ * We only check whether the value is within the bucket, using the
+ * lt operator, and we also check for equality with the boundaries.
+ */
+
+ res = bucket_contains_value(ltproc, cst->constvalue,
+ minval, maxval,
+ minidx, maxidx,
+ mininclude, maxinclude,
+ callcache);
+ break;
+ }
+
+ UPDATE_RESULT(matches[i], res, is_or);
+
+ }
+ }
+ }
+ else if (IsA(clause, NullTest))
+ {
+ NullTest * expr = (NullTest*)clause;
+ Var * var = (Var*)(expr->arg);
+
+ /* FIXME proper matching attribute to dimension */
+ int idx = mv_get_index(var->varattno, stakeys);
+
+ /*
+ * Walk through the buckets and evaluate the current clause. We can
+ * skip items that were already ruled out, and terminate if there are
+ * no remaining buckets that might possibly match.
+ */
+ for (i = 0; i < mvhist->nbuckets; i++)
+ {
+ MVSerializedBucket bucket = mvhist->buckets[i];
+
+ /*
+ * Skip buckets that were already eliminated - this is important
+ * considering how we update the info (we only lower the match)
+ */
+ if ((! is_or) && (matches[i] == MVSTATS_MATCH_NONE))
+ continue;
+ else if (is_or && (matches[i] == MVSTATS_MATCH_FULL))
+ continue;
+
+ /* if the clause mismatches the bucket, set it as MATCH_NONE */
+ if ((expr->nulltesttype == IS_NULL)
+ && (! bucket->nullsonly[idx]))
+ UPDATE_RESULT(matches[i], MVSTATS_MATCH_NONE, is_or);
+
+ else if ((expr->nulltesttype == IS_NOT_NULL) &&
+ (bucket->nullsonly[idx]))
+ UPDATE_RESULT(matches[i], MVSTATS_MATCH_NONE, is_or);
+ }
+ }
+ else if (or_clause(clause) || and_clause(clause))
+ {
+ /* AND/OR clause, with all clauses compatible with the selected MV stat */
+
+ int i;
+ BoolExpr *orclause = ((BoolExpr*)clause);
+ List *orclauses = orclause->args;
+
+ /* match/mismatch bitmap for each bucket */
+ int or_nmatches = 0;
+ char * or_matches = NULL;
+
+ Assert(orclauses != NIL);
+ Assert(list_length(orclauses) >= 2);
+
+ /* number of matching buckets */
+ or_nmatches = mvhist->nbuckets;
+
+ /* by default none of the buckets matches the clauses */
+ or_matches = palloc0(sizeof(char) * or_nmatches);
+
+ if (or_clause(clause))
+ {
+ /* OR clauses assume nothing matches, initially */
+ memset(or_matches, MVSTATS_MATCH_NONE, sizeof(char)*or_nmatches);
+ or_nmatches = 0;
+ }
+ else
+ {
+ /* AND clauses assume everything matches, initially */
+ memset(or_matches, MVSTATS_MATCH_FULL, sizeof(char)*or_nmatches);
+ }
+
+ /* build the match bitmap for the OR-clauses */
+ or_nmatches = update_match_bitmap_histogram(root, orclauses,
+ stakeys, mvhist,
+ or_nmatches, or_matches, or_clause(clause));
+
+ /* merge the bitmap into the existing one */
+ for (i = 0; i < mvhist->nbuckets; i++)
+ {
+ /*
+ * To AND-merge the bitmaps, a MIN() semantics is used.
+ * For OR-merge, use MAX().
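+ *
+ * For example, a bucket marked MATCH_FULL by one subclause and
+ * MATCH_PARTIAL by another ends up MATCH_PARTIAL when AND-merging
+ * (MIN), but MATCH_FULL when OR-merging (MAX).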
+ *
+ * FIXME this does not decrease the number of matches
+ */
+ UPDATE_RESULT(matches[i], or_matches[i], is_or);
+ }
+
+ pfree(or_matches);
+
+ }
+ else
+ elog(ERROR, "unknown clause type: %d", clause->type);
+ }
+
+ /* free the call cache */
+ pfree(callcache);
+
+ return nmatches;
+}
diff --git a/src/backend/optimizer/util/plancat.c b/src/backend/optimizer/util/plancat.c
index 8394111..2519249 100644
--- a/src/backend/optimizer/util/plancat.c
+++ b/src/backend/optimizer/util/plancat.c
@@ -412,7 +412,7 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
mvstat = (Form_pg_mv_statistic) GETSTRUCT(htup);
/* unavailable stats are not interesting for the planner */
- if (mvstat->deps_built || mvstat->mcv_built)
+ if (mvstat->deps_built || mvstat->mcv_built || mvstat->hist_built)
{
info = makeNode(MVStatisticInfo);
@@ -422,10 +422,12 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
/* enabled statistics */
info->deps_enabled = mvstat->deps_enabled;
info->mcv_enabled = mvstat->mcv_enabled;
+ info->hist_enabled = mvstat->hist_enabled;
/* built/available statistics */
info->deps_built = mvstat->deps_built;
info->mcv_built = mvstat->mcv_built;
+ info->hist_built = mvstat->hist_built;
/* stakeys */
adatum = SysCacheGetAttr(MVSTATOID, htup,
diff --git a/src/backend/utils/mvstats/Makefile b/src/backend/utils/mvstats/Makefile
index f9bf10c..9dbb3b6 100644
--- a/src/backend/utils/mvstats/Makefile
+++ b/src/backend/utils/mvstats/Makefile
@@ -12,6 +12,6 @@ subdir = src/backend/utils/mvstats
top_builddir = ../../../..
include $(top_builddir)/src/Makefile.global
-OBJS = common.o dependencies.o mcv.o
+OBJS = common.o dependencies.o histogram.o mcv.o
include $(top_srcdir)/src/backend/common.mk
diff --git a/src/backend/utils/mvstats/README.histogram b/src/backend/utils/mvstats/README.histogram
new file mode 100644
index 0000000..cd640e5
--- /dev/null
+++ b/src/backend/utils/mvstats/README.histogram
@@ -0,0 +1,299 @@
+Multivariate histograms
+=======================
+
+Histograms on individual attributes consist of buckets represented by ranges,
+covering the domain of the attribute. That is, each bucket is a [min,max]
+interval, and contains all values in this range. The histogram is built in such
+a way that all buckets have about the same frequency.
+
+Multivariate histograms are an extension into n-dimensional space - the buckets
+are n-dimensional intervals (i.e. n-dimensional rectangles), covering the domain
+of the combination of attributes. That is, each bucket has a vector of lower
+and upper boundaries, denoted min[i] and max[i] (where i = 1..n).
+
+In addition to the boundaries, each bucket tracks additional info:
+
+ * frequency (fraction of tuples in the bucket)
+ * whether the boundaries are inclusive or exclusive
+ * whether the dimension contains only NULL values
+ * number of distinct values in each dimension (for building only)
+
+It's possible that in the future we'll have multiple histogram types, with different
+features. We do however expect all the types to share the same representation
+(buckets as ranges) and only differ in how we build them.
+
+The current implementation builds non-overlapping buckets, but that may not be
+true for other histogram types, and the code should not rely on this assumption.
+There are interesting types of histograms (or algorithms) with overlapping buckets.
+
+When used on low-cardinality data, histograms usually perform considerably worse
+than MCV lists (which are a good fit for this kind of data). This is especially
+true on label-like values, where ordering of the values is mostly unrelated to
+meaning of the data, as proper ordering is crucial for histograms.
+
+On high-cardinality data the histograms are usually a better choice, because MCV
+lists can't represent the distribution accurately enough.
+
+
+Selectivity estimation
+----------------------
+
+The estimation is implemented in clauselist_mv_selectivity_histogram(), and
+works very similarly to clauselist_mv_selectivity_mcvlist.
+
+The main difference is that while MCV lists support exact matches, histograms
+often result in approximate matches - e.g. with equality we can only say if
+the constant would be part of the bucket, but not whether it really is there
+or what fraction of the bucket it corresponds to. In this case we rely on
+some defaults just like in the per-column histograms.
+
+The current implementation uses histograms to estimate these types of clauses
+(think of WHERE conditions):
+
+ (a) equality clauses WHERE (a = 1) AND (b = 2)
+ (b) inequality clauses WHERE (a < 1) AND (b >= 2)
+ (c) NULL clauses WHERE (a IS NULL) AND (b IS NOT NULL)
+ (d) OR-clauses WHERE (a = 1) OR (b = 2)
+
+Similarly to MCV lists, it's possible to add support for additional types of
+clauses, for example:
+
+ (e) multi-var clauses WHERE (a > b)
+
+and so on. These are tasks for the future, not yet implemented.
+
+
+When evaluating a clause on a bucket, we may get one of three results:
+
+ (a) FULL_MATCH - The bucket definitely matches the clause.
+
+ (b) PARTIAL_MATCH - The bucket matches the clause, but not necessarily all
+ the tuples it represents.
+
+ (c) NO_MATCH - The bucket definitely does not match the clause.
+
+This may be illustrated using a range [1, 5], which is essentially a 1-D bucket.
+With these clauses:
+
+ WHERE (a < 10) => FULL_MATCH (all range values are below
+ 10, so the whole bucket matches)
+
+ WHERE (a < 3) => PARTIAL_MATCH (there may be values matching
+ the clause, but we don't know how many)
+
+ WHERE (a < 0) => NO_MATCH (the whole range is above 0, so
+ no values from the bucket can match)
+
+Some clauses may produce only some of those results - for example equality
+clauses may never produce FULL_MATCH as we always hit only part of the bucket
+(we can't match both boundaries at the same time). This results in less accurate
+estimates compared to MCV lists, where we can match the MCV items exactly
+(there's no PARTIAL match in MCV).
+
+There are also clauses that may not produce any PARTIAL_MATCH results. A nice
+example of that is the 'IS [NOT] NULL' clause, which either matches the bucket
+completely (FULL_MATCH) or not at all (NO_MATCH), thanks to how the NULL-buckets
+are constructed.
+
+Computing the total selectivity estimate is trivial - simply sum selectivities
+from all the FULL_MATCH and PARTIAL_MATCH buckets (but for buckets marked with
+PARTIAL_MATCH, multiply the frequency by 0.5 to minimize the average error).
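+
+For example, given two fully matching buckets with frequencies 0.10 and 0.05,
+and one partially matching bucket with frequency 0.20 (the frequencies here
+are of course made up), the estimate would be computed as
+
+    0.10 + 0.05 + 0.5 * 0.20 = 0.25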
+
+
+Building a histogram
+---------------------
+
+The algorithm of building a histogram in general is quite simple:
+
+ (a) create an initial bucket (containing all sample rows)
+
+ (b) create NULL buckets (by splitting the initial bucket)
+
+ (c) repeat
+
+ (1) choose bucket to split next
+
+ (2) terminate if no bucket that might be split is found, or if we've
+ reached the maximum number of buckets (16384)
+
+ (3) choose dimension to partition the bucket by
+
+ (4) partition the bucket by the selected dimension
+
+The main complexity is hidden in steps (c.1) and (c.3), i.e. how we choose the
+bucket and dimension for the split, as discussed in the next section.
+
+
+Partitioning criteria
+---------------------
+
+Similarly to one-dimensional histograms, we want to produce buckets with roughly
+the same frequency.
+
+We also need to produce "regular" buckets, because buckets with one dimension
+much longer than the others are very likely to match a lot of conditions (which
+increases error, even if the bucket frequency is very low).
+
+This is especially important when handling OR-clauses, because in that case each
+clause may add buckets independently. With AND-clauses all the clauses have to
+match each bucket, which makes this issue somewhat less concerning.
+
+To achieve this, we choose the largest bucket (containing the most sample rows),
+but we only choose buckets that can actually be split (have at least 3 different
+combinations of values).
+
+Then we choose the "longest" dimension of the bucket, which is computed by using
+the distinct values in the sample as a measure.
+
+For details see functions select_bucket_to_partition() and partition_bucket(),
+which also includes further discussion.
+
+
+The current limit on number of buckets (16384) is mostly arbitrary, but chosen
+so that it guarantees we don't exceed the number of distinct values indexable by
+uint16 in any of the dimensions. In practice we could handle more buckets as we
+index each dimension separately and the splits should use the dimensions evenly.
+
+Also, histograms this large (with 16k values in multiple dimensions) would be
+quite expensive to build and process, so the 16k limit is rather reasonable.
+
+The actual number of buckets is also related to statistics target, because we
+require MIN_BUCKET_ROWS (10) tuples per bucket before a split, so we can't have
+more than (2 * 300 * target / 10) buckets. For the default target (100) this
+evaluates to ~6k.
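+
+For illustration, even with the maximum statistics target (10000) the formula
+gives 2 * 300 * 10000 / 10 = 600000 buckets, so in that case it's the 16384
+cap that actually limits the histogram size.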
+
+
+NULL handling (create_null_buckets)
+-----------------------------------
+
+When building histograms on a single attribute, we first filter out NULL values.
+In the multivariate case, we can't really do that because the rows may contain
+a mix of NULL and non-NULL values in different columns (so we can't simply
+filter all of them out).
+
+For this reason, the histograms are built so that in each bucket, each
+dimension contains either only NULL or only non-NULL values. The NULL-buckets
+are built as the first step, by the create_null_buckets() function.
+The number of NULL buckets, as produced by this function, has a clear upper
+boundary (2^N) where N is the number of dimensions (attributes the histogram is
+built on). Or rather 2^K where K is the number of attributes that are not marked
+as not-NULL.
+
+The buckets with NULL dimensions are then subject to the same build algorithm
+(i.e. may be split into smaller buckets) just like any other bucket, but may
+only be split by a non-NULL dimension.
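+
+For example, with a histogram on two nullable columns the initial bucket may
+be split into up to 2^2 = 4 NULL-buckets: (NULL, NULL), (NULL, non-NULL),
+(non-NULL, NULL) and (non-NULL, non-NULL).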
+
+
+Serialization
+-------------
+
+To store the histogram in the pg_mv_statistic table, it is serialized into a
+more efficient form. We also use this representation during estimation, i.e. we don't
+fully deserialize the histogram.
+
+For example the boundary values are deduplicated to minimize the required space.
+How much redundancy is there, actually? Let's assume there are no NULL values,
+so we start with a single bucket - in that case we have 2*N boundaries. Each
+time we split a bucket we introduce one new value (in the "middle" of one of
+the dimensions), and keep boundaries for all the other dimensions. So after K
+splits, we have up to
+
+ 2*N + K
+
+unique boundary values (we may have fewer values, if the same value is used for
+several splits). But after K splits we do have (K+1) buckets, so
+
+ (K+1) * 2 * N
+
+boundary values. Using e.g. N=4 and K=999, we arrive at these numbers:
+
+ 2*N + K = 1007
+ (K+1) * 2 * N = 8000
+
+which means a lot of redundancy. It's somewhat counter-intuitive that the number
+of distinct values does not really depend on the number of dimensions (except
+for the initial bucket, but that's negligible compared to the total).
+
+By deduplicating the values and replacing them with 16-bit indexes (uint16), we
+reduce the required space to
+
+ 1007 * 8 + 8000 * 2 ~= 24kB
+
+which is significantly less than 64kB required for the 'raw' histogram (assuming
+the values are 8B).
+
+While the bytea compression (pglz) might achieve the same reduction of space,
+the deduplicated representation is used to optimize the estimation by caching
+results of function calls for already visited values. This significantly
+reduces the number of calls to (often quite expensive) operators.
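+
+Continuing the example above (N=4, K=999), there are only ~1007 distinct
+boundary values in total, so a clause needs at most that many operator calls,
+with the cached results then reused for all 8000 boundary references.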
+
+Note: Of course, this reasoning only holds for histograms built by the algorithm
+that simply splits the buckets in half. Other histograms types (e.g. containing
+overlapping buckets) may behave differently and require different serialization.
+
+Serialized histograms are marked with 'magic' constant, to make it easier to
+check the bytea value really is a serialized histogram.
+
+
+varlena compression
+-------------------
+
+This serialization may however disable automatic varlena compression, because
+the array of unique values is placed at the beginning of the serialized form.
+That is exactly the chunk pglz inspects to check whether the data is
+compressible, and it will probably decide it's not very compressible. This is
+similar to the issue we had with JSONB initially.
+
+Maybe storing buckets first would make it work, as the buckets may be better
+compressible.
+
+On the other hand the serialization is actually a context-aware compression,
+usually compressing to ~30% (or even less, with large data types). So the lack
+of additional pglz compression may be acceptable.
+
+
+Deserialization
+---------------
+
+The deserialization is not a perfect inverse of the serialization, as we keep
+the deduplicated arrays. This reduces the amount of memory and also allows
+optimizations during estimation (e.g. we can cache results for the distinct
+values, saving expensive function calls).
+
+
+Inspecting the histogram
+------------------------
+
+Inspecting the regular (per-attribute) histograms is trivial, as it's enough
+to select the columns from pg_stats - the data is encoded as anyarray, so we
+simply get the text representation of the array.
+
+With multivariate histograms it's not that simple due to the possible mix of
+data types in the histogram. It might be possible to produce similar array-like
+text representation, but that'd unnecessarily complicate further processing
+and analysis of the histogram. Instead, there's a SRF function that allows
+access to lower/upper boundaries, frequencies etc.
+
+ SELECT * FROM pg_mv_histogram_buckets();
+
+It has two input parameters:
+
+ oid - OID of the histogram (pg_mv_statistic.staoid)
+ otype - type of output
+
+and produces a table with these columns:
+
+ - bucket ID (0...nbuckets-1)
+ - lower bucket boundaries (string array)
+ - upper bucket boundaries (string array)
+ - nulls only dimensions (boolean array)
+ - lower boundary inclusive (boolean array)
+ - upper boundary inclusive (boolean array)
+ - frequency (double precision)
+
+The 'otype' accepts three values, determining what will be returned in the
+lower/upper boundary arrays:
+
+ - 0 - values stored in the histogram, encoded as text
+ - 1 - indexes into the deduplicated arrays
+ - 2 - indexes into the deduplicated arrays, scaled to [0,1]
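+
+For example, to list the buckets with the boundary values encoded as text
+(assuming 16428 is the OID of the statistics):
+
+    SELECT * FROM pg_mv_histogram_buckets(16428, 0);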
diff --git a/src/backend/utils/mvstats/README.stats b/src/backend/utils/mvstats/README.stats
index 5c5c59a..3e4f4d1 100644
--- a/src/backend/utils/mvstats/README.stats
+++ b/src/backend/utils/mvstats/README.stats
@@ -18,6 +18,8 @@ Currently we only have two kinds of multivariate statistics
(b) MCV lists (README.mcv)
+ (c) multivariate histograms (README.histogram)
+
Compatible clause types
-----------------------
diff --git a/src/backend/utils/mvstats/common.c b/src/backend/utils/mvstats/common.c
index 4f5a842..f6d1074 100644
--- a/src/backend/utils/mvstats/common.c
+++ b/src/backend/utils/mvstats/common.c
@@ -13,11 +13,11 @@
*
*-------------------------------------------------------------------------
*/
+#include "postgres.h"
+#include "utils/array.h"
#include "common.h"
-#include "utils/array.h"
-
static VacAttrStats ** lookup_var_attr_stats(int2vector *attrs,
int natts,
VacAttrStats **vacattrstats);
@@ -52,7 +52,8 @@ build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
MVStatisticInfo *stat = (MVStatisticInfo *)lfirst(lc);
MVDependencies deps = NULL;
MCVList mcvlist = NULL;
- int numrows_filtered = 0;
+ MVHistogram histogram = NULL;
+ int numrows_filtered = numrows;
VacAttrStats **stats = NULL;
int numatts = 0;
@@ -95,8 +96,12 @@ build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
if (stat->mcv_enabled)
mcvlist = build_mv_mcvlist(numrows, rows, attrs, stats, &numrows_filtered);
+ /* build a multivariate histogram on the columns */
+ if ((numrows_filtered > 0) && (stat->hist_enabled))
+ histogram = build_mv_histogram(numrows_filtered, rows, attrs, stats, numrows);
+
/* store the histogram / MCV list in the catalog */
- update_mv_stats(stat->mvoid, deps, mcvlist, attrs, stats);
+ update_mv_stats(stat->mvoid, deps, mcvlist, histogram, attrs, stats);
}
}
@@ -176,6 +181,8 @@ list_mv_stats(Oid relid)
info->deps_built = stats->deps_built;
info->mcv_enabled = stats->mcv_enabled;
info->mcv_built = stats->mcv_built;
+ info->hist_enabled = stats->hist_enabled;
+ info->hist_built = stats->hist_built;
result = lappend(result, info);
}
@@ -190,7 +197,6 @@ list_mv_stats(Oid relid)
return result;
}
-
/*
* Find attnims of MV stats using the mvoid.
*/
@@ -236,9 +242,16 @@ find_mv_attnums(Oid mvoid, Oid *relid)
}
+/*
+ * FIXME This adds statistics, but we need to drop statistics when the
+ * table is dropped. Not sure what to do when a column is dropped.
+ * Either we can (a) remove all stats on that column, (b) remove
+ * the column from defined stats and force rebuild, (c) remove the
+ * column on next ANALYZE. Or maybe something else?
+ */
void
update_mv_stats(Oid mvoid,
- MVDependencies dependencies, MCVList mcvlist,
+ MVDependencies dependencies, MCVList mcvlist, MVHistogram histogram,
int2vector *attrs, VacAttrStats **stats)
{
HeapTuple stup,
@@ -271,22 +284,34 @@ update_mv_stats(Oid mvoid,
values[Anum_pg_mv_statistic_stamcv - 1] = PointerGetDatum(data);
}
+ if (histogram != NULL)
+ {
+ bytea * data = serialize_mv_histogram(histogram, attrs, stats);
+ nulls[Anum_pg_mv_statistic_stahist-1] = (data == NULL);
+ values[Anum_pg_mv_statistic_stahist - 1]
+ = PointerGetDatum(data);
+ }
+
/* always replace the value (either by bytea or NULL) */
replaces[Anum_pg_mv_statistic_stadeps -1] = true;
replaces[Anum_pg_mv_statistic_stamcv -1] = true;
+ replaces[Anum_pg_mv_statistic_stahist-1] = true;
/* always change the availability flags */
nulls[Anum_pg_mv_statistic_deps_built -1] = false;
nulls[Anum_pg_mv_statistic_mcv_built -1] = false;
+ nulls[Anum_pg_mv_statistic_hist_built-1] = false;
nulls[Anum_pg_mv_statistic_stakeys-1] = false;
/* use the new attnums, in case we removed some dropped ones */
replaces[Anum_pg_mv_statistic_deps_built-1] = true;
replaces[Anum_pg_mv_statistic_mcv_built -1] = true;
+ replaces[Anum_pg_mv_statistic_hist_built -1] = true;
replaces[Anum_pg_mv_statistic_stakeys -1] = true;
values[Anum_pg_mv_statistic_deps_built-1] = BoolGetDatum(dependencies != NULL);
values[Anum_pg_mv_statistic_mcv_built -1] = BoolGetDatum(mcvlist != NULL);
+ values[Anum_pg_mv_statistic_hist_built -1] = BoolGetDatum(histogram != NULL);
values[Anum_pg_mv_statistic_stakeys -1] = PointerGetDatum(attrs);
/* Is there already a pg_mv_statistic tuple for this attribute? */
diff --git a/src/backend/utils/mvstats/histogram.c b/src/backend/utils/mvstats/histogram.c
new file mode 100644
index 0000000..6b07b51
--- /dev/null
+++ b/src/backend/utils/mvstats/histogram.c
@@ -0,0 +1,2023 @@
+/*-------------------------------------------------------------------------
+ *
+ * histogram.c
+ * POSTGRES multivariate histograms
+ *
+ *
+ * Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ * src/backend/utils/mvstats/histogram.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres.h"
+
+#include "fmgr.h"
+#include "funcapi.h"
+
+#include "utils/lsyscache.h"
+
+#include "common.h"
+#include <math.h>
+
+
+static MVBucket create_initial_mv_bucket(int numrows, HeapTuple *rows,
+ int2vector *attrs,
+ VacAttrStats **stats);
+
+static MVBucket select_bucket_to_partition(int nbuckets, MVBucket * buckets);
+
+static MVBucket partition_bucket(MVBucket bucket, int2vector *attrs,
+ VacAttrStats **stats,
+ int *ndistvalues, Datum **distvalues);
+
+static MVBucket copy_mv_bucket(MVBucket bucket, uint32 ndimensions);
+
+static void update_bucket_ndistinct(MVBucket bucket, int2vector *attrs,
+ VacAttrStats ** stats);
+
+static void update_dimension_ndistinct(MVBucket bucket, int dimension,
+ int2vector *attrs,
+ VacAttrStats ** stats,
+ bool update_boundaries);
+
+static void create_null_buckets(MVHistogram histogram, int bucket_idx,
+ int2vector *attrs, VacAttrStats ** stats);
+
+static Datum * build_ndistinct(int numrows, HeapTuple *rows, int2vector *attrs,
+ VacAttrStats **stats, int i, int *nvals);
+
+/*
+ * Each serialized bucket needs to store (in this order):
+ *
+ * - number of tuples (float)
+ * - min inclusive flags (ndim * sizeof(bool))
+ * - max inclusive flags (ndim * sizeof(bool))
+ * - null dimension flags (ndim * sizeof(bool))
+ * - min boundary indexes (ndim * sizeof(uint16))
+ * - max boundary indexes (ndim * sizeof(uint16))
+ *
+ * So in total (the macro reserves twice the space actually needed for
+ * the boundary indexes, and stores just a single float):
+ *
+ * ndim * (4 * sizeof(uint16) + 3 * sizeof(bool)) + sizeof(float)
+ */
+#define BUCKET_SIZE(ndims) \
+ (ndims * (4 * sizeof(uint16) + 3 * sizeof(bool)) + sizeof(float))
+
+/* pointers into a flat serialized bucket of BUCKET_SIZE(n) bytes */
+#define BUCKET_NTUPLES(b) (*(float*)b)
+#define BUCKET_MIN_INCL(b,n) ((bool*)(b + sizeof(float)))
+#define BUCKET_MAX_INCL(b,n) (BUCKET_MIN_INCL(b,n) + n)
+#define BUCKET_NULLS_ONLY(b,n) (BUCKET_MAX_INCL(b,n) + n)
+#define BUCKET_MIN_INDEXES(b,n) ((uint16*)(BUCKET_NULLS_ONLY(b,n) + n))
+#define BUCKET_MAX_INDEXES(b,n) ((BUCKET_MIN_INDEXES(b,n) + n))
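+
+/*
+ * For example (assuming a 4-byte float, 1-byte bool and 2-byte uint16),
+ * a bucket with ndims=2 is laid out as: ntuples at offset 0, min/max
+ * inclusive flags at offsets 4 and 6, nulls-only flags at offset 8, and
+ * min/max boundary indexes at offsets 10 and 14.
+ */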
+
+/* can't split bucket with less than 10 rows */
+#define MIN_BUCKET_ROWS 10
+
+/*
+ * Data used while building the histogram.
+ */
+typedef struct HistogramBuildData {
+
+ float ndistinct; /* frequency of distinct values */
+
+ HeapTuple *rows; /* array of sample rows */
+ uint32 numrows; /* number of sample rows (array size) */
+
+ /*
+ * Number of distinct values in each dimension. This is used when
+ * building the histogram (and is not serialized/deserialized).
+ */
+ uint32 *ndistincts;
+
+} HistogramBuildData;
+
+typedef HistogramBuildData *HistogramBuild;
+
+/*
+ * builds a multivariate histogram
+ *
+ * The build algorithm is iterative - initially a single bucket containing all
+ * the sample rows is formed, and then repeatedly split into smaller buckets.
+ * In each step the largest bucket (in some sense) is chosen to be split next.
+ *
+ * The criteria for selecting the largest bucket (and the dimension for the
+ * split) needs to be elaborate enough to produce buckets of roughly the same
+ * size, and also regular shape (not very long in one dimension).
+ *
+ * The current algorithm works like this:
+ *
+ * build NULL-buckets (create_null_buckets)
+ *
+ * while [maximum number of buckets not reached]
+ *
+ * choose bucket to partition (largest bucket)
+ * if no bucket to partition
+ * terminate the algorithm
+ *
+ * choose bucket dimension to partition (largest dimension)
+ * split the bucket into two buckets
+ *
+ * See the discussion at select_bucket_to_partition and partition_bucket for
+ * more details about the algorithm.
+ */
+MVHistogram
+build_mv_histogram(int numrows, HeapTuple *rows, int2vector *attrs,
+ VacAttrStats **stats, int numrows_total)
+{
+ int i;
+ int numattrs = attrs->dim1;
+
+ int *ndistvalues;
+ Datum **distvalues;
+
+ MVHistogram histogram;
+
+ HeapTuple * rows_copy = (HeapTuple*)palloc0(numrows * sizeof(HeapTuple));
+ memcpy(rows_copy, rows, sizeof(HeapTuple) * numrows);
+
+ Assert((numattrs >= 2) && (numattrs <= MVSTATS_MAX_DIMENSIONS));
+
+ /* build histogram header */
+
+ histogram = (MVHistogram)palloc0(sizeof(MVHistogramData));
+
+ histogram->magic = MVSTAT_HIST_MAGIC;
+ histogram->type = MVSTAT_HIST_TYPE_BASIC;
+
+ histogram->nbuckets = 1;
+ histogram->ndimensions = numattrs;
+
+ /* create max buckets (better than repalloc for short-lived objects) */
+ histogram->buckets
+ = (MVBucket*)palloc0(MVSTAT_HIST_MAX_BUCKETS * sizeof(MVBucket));
+
+ /* create the initial bucket, covering the whole sample set */
+ histogram->buckets[0]
+ = create_initial_mv_bucket(numrows, rows_copy, attrs, stats);
+
+ /*
+ * Collect info on distinct values in each dimension (used later to select
+ * dimension to partition).
+ */
+ ndistvalues = (int*)palloc0(sizeof(int) * numattrs);
+ distvalues = (Datum**)palloc0(sizeof(Datum*) * numattrs);
+
+ for (i = 0; i < numattrs; i++)
+ distvalues[i] = build_ndistinct(numrows, rows, attrs, stats, i,
+ &ndistvalues[i]);
+
+ /*
+ * Split the initial bucket into buckets that don't mix NULL and non-NULL
+ * values in a single dimension.
+ */
+ create_null_buckets(histogram, 0, attrs, stats);
+
+ /*
+ * Do the actual histogram build - select a bucket and split it.
+ *
+ * FIXME This should use the max_buckets specified in CREATE STATISTICS.
+ */
+ while (histogram->nbuckets < MVSTAT_HIST_MAX_BUCKETS)
+ {
+ MVBucket bucket = select_bucket_to_partition(histogram->nbuckets,
+ histogram->buckets);
+
+ /* no buckets eligible for partitioning */
+ if (bucket == NULL)
+ break;
+
+ /* we modify the bucket in-place and add one new bucket */
+ histogram->buckets[histogram->nbuckets++]
+ = partition_bucket(bucket, attrs, stats, ndistvalues, distvalues);
+ }
+
+ /* finalize the histogram build - compute the frequencies etc. */
+ for (i = 0; i < histogram->nbuckets; i++)
+ {
+ HistogramBuild build_data
+ = ((HistogramBuild)histogram->buckets[i]->build_data);
+
+ /*
+ * The frequency has to be computed from the whole sample, in case some
+ * of the rows were used for MCV.
+ *
+ * XXX Perhaps this should simply compute frequency with respect to the
+ * local frequency, and then factor-in the MCV later.
+ *
+ * FIXME The 'ntuples' sounds a bit inappropriate for frequency.
+ */
+ histogram->buckets[i]->ntuples
+ = (build_data->numrows * 1.0) / numrows_total;
+ }
+
+ return histogram;
+}
+
+/* build array of distinct values for a single attribute */
+static Datum *
+build_ndistinct(int numrows, HeapTuple *rows, int2vector *attrs,
+ VacAttrStats **stats, int i, int *nvals)
+{
+ int j;
+ int nvalues,
+ ndistinct;
+ Datum *values,
+ *distvalues;
+
+ SortSupportData ssup;
+ StdAnalyzeData *mystats = (StdAnalyzeData *) stats[i]->extra_data;
+
+ /* initialize sort support, etc. */
+ memset(&ssup, 0, sizeof(ssup));
+ ssup.ssup_cxt = CurrentMemoryContext;
+
+ /* We always use the default collation for statistics */
+ ssup.ssup_collation = DEFAULT_COLLATION_OID;
+ ssup.ssup_nulls_first = false;
+
+ PrepareSortSupportFromOrderingOp(mystats->ltopr, &ssup);
+
+ nvalues = 0;
+ values = (Datum*)palloc0(sizeof(Datum) * numrows);
+
+ /* collect values from the sample rows, ignore NULLs */
+ for (j = 0; j < numrows; j++)
+ {
+ Datum value;
+ bool isnull;
+
+ /* fetch the attribute value (NULLs are skipped below) */
+ value = heap_getattr(rows[j], attrs->values[i],
+ stats[i]->tupDesc, &isnull);
+
+ if (isnull)
+ continue;
+
+ values[nvalues++] = value;
+ }
+
+ /* if no non-NULL values were found, free the memory and terminate */
+ if (nvalues == 0)
+ {
+ pfree(values);
+ return NULL;
+ }
+
+ /* sort the array of values using the SortSupport */
+ qsort_arg((void *) values, nvalues, sizeof(Datum),
+ compare_scalars_simple, (void *) &ssup);
+
+ /* count the distinct values first, and allocate just enough memory */
+ ndistinct = 1;
+ for (j = 1; j < nvalues; j++)
+ if (compare_scalars_simple(&values[j], &values[j-1], &ssup) != 0)
+ ndistinct += 1;
+
+ distvalues = (Datum*)palloc0(sizeof(Datum) * ndistinct);
+
+ /* now collect distinct values into the array */
+ distvalues[0] = values[0];
+ ndistinct = 1;
+
+ for (j = 1; j < nvalues; j++)
+ {
+ if (compare_scalars_simple(&values[j], &values[j-1], &ssup) != 0)
+ {
+ distvalues[ndistinct] = values[j];
+ ndistinct += 1;
+ }
+ }
+
+ pfree(values);
+
+ *nvals = ndistinct;
+ return distvalues;
+}
+
+/* fetch the histogram (as a bytea) from the pg_mv_statistic catalog */
+MVSerializedHistogram
+load_mv_histogram(Oid mvoid)
+{
+ bool isnull = false;
+ Datum histogram;
+
+#ifdef USE_ASSERT_CHECKING
+ Form_pg_mv_statistic mvstat;
+#endif
+
+ /* Prepare to scan pg_mv_statistic for entries having indrelid = this rel. */
+ HeapTuple htup = SearchSysCache1(MVSTATOID, ObjectIdGetDatum(mvoid));
+
+ if (! HeapTupleIsValid(htup))
+ return NULL;
+
+#ifdef USE_ASSERT_CHECKING
+ mvstat = (Form_pg_mv_statistic) GETSTRUCT(htup);
+ Assert(mvstat->hist_enabled && mvstat->hist_built);
+#endif
+
+ histogram = SysCacheGetAttr(MVSTATOID, htup,
+ Anum_pg_mv_statistic_stahist, &isnull);
+
+ Assert(!isnull);
+
+ ReleaseSysCache(htup);
+
+ return deserialize_mv_histogram(DatumGetByteaP(histogram));
+}
+
+/* print some basic info about the histogram */
+Datum
+pg_mv_stats_histogram_info(PG_FUNCTION_ARGS)
+{
+ bytea *data = PG_GETARG_BYTEA_P(0);
+ char *result;
+
+ MVSerializedHistogram hist = deserialize_mv_histogram(data);
+
+ result = palloc0(128);
+ snprintf(result, 128, "nbuckets=%d", hist->nbuckets);
+
+ PG_RETURN_TEXT_P(cstring_to_text(result));
+}
+
+/*
+ * Serialize the MV histogram into a bytea value. The basic algorithm is quite
+ * simple, and mostly mimics the MCV serialization:
+ *
+ * (1) perform deduplication for each attribute (separately)
+ *
+ * (a) collect all (non-NULL) attribute values from all buckets
+ * (b) sort the data (using 'lt' from VacAttrStats)
+ * (c) remove duplicate values from the array
+ *
+ * (2) serialize the arrays into a bytea value
+ *
+ * (3) process all buckets
+ *
+ * (a) replace min/max values with indexes into the arrays
+ *
+ * Each attribute has to be processed separately, as we're mixing different
+ * datatypes, and we need to use the right operators to compare/sort them.
+ * We're also mixing pass-by-value and pass-by-ref types, and so on.
+ *
+ *
+ * FIXME This probably leaks memory, or at least uses it inefficiently
+ * (many small palloc() calls instead of a large one).
+ *
+ * TODO Consider packing boolean flags (NULL) for each item into 'char'
+ * or a longer type (instead of using an array of bool items).
+ */
+bytea *
+serialize_mv_histogram(MVHistogram histogram, int2vector *attrs,
+ VacAttrStats **stats)
+{
+ int i = 0, j = 0;
+ Size total_length = 0;
+
+ bytea *output = NULL;
+ char *data = NULL;
+
+ DimensionInfo *info;
+ SortSupport ssup;
+
+ int nbuckets = histogram->nbuckets;
+ int ndims = histogram->ndimensions;
+
+ /* allocated for serialized bucket data */
+ int bucketsize = BUCKET_SIZE(ndims);
+ char *bucket = palloc0(bucketsize);
+
+ /* values per dimension (and number of non-NULL values) */
+ Datum **values = (Datum**)palloc0(sizeof(Datum*) * ndims);
+ int *counts = (int*)palloc0(sizeof(int) * ndims);
+
+ /* info about dimensions (for deserialize) */
+ info = (DimensionInfo *)palloc0(sizeof(DimensionInfo)*ndims);
+
+ /* sort support data */
+ ssup = (SortSupport)palloc0(sizeof(SortSupportData)*ndims);
+
+ /* collect and deduplicate values for each dimension separately */
+ for (i = 0; i < ndims; i++)
+ {
+ int count;
+ StdAnalyzeData *tmp = (StdAnalyzeData *)stats[i]->extra_data;
+
+ /* keep important info about the data type */
+ info[i].typlen = stats[i]->attrtype->typlen;
+ info[i].typbyval = stats[i]->attrtype->typbyval;
+
+ /*
+ * Allocate space for all min/max values, including NULLs (we won't use
+ * them, but we don't know how many are there), and then collect all
+ * non-NULL values.
+ */
+ values[i] = (Datum*)palloc0(sizeof(Datum) * nbuckets * 2);
+
+ for (j = 0; j < histogram->nbuckets; j++)
+ {
+ /* skip buckets where this dimension is NULL-only */
+ if (! histogram->buckets[j]->nullsonly[i])
+ {
+ values[i][counts[i]] = histogram->buckets[j]->min[i];
+ counts[i] += 1;
+
+ values[i][counts[i]] = histogram->buckets[j]->max[i];
+ counts[i] += 1;
+ }
+ }
+
+ /* there are just NULL values in this dimension */
+ if (counts[i] == 0)
+ continue;
+
+ /* sort and deduplicate */
+ ssup[i].ssup_cxt = CurrentMemoryContext;
+ ssup[i].ssup_collation = DEFAULT_COLLATION_OID;
+ ssup[i].ssup_nulls_first = false;
+
+ PrepareSortSupportFromOrderingOp(tmp->ltopr, &ssup[i]);
+
+ qsort_arg(values[i], counts[i], sizeof(Datum),
+ compare_scalars_simple, &ssup[i]);
+
+ /*
+ * Walk through the array and eliminate duplicate values, but
+ * keep the ordering (so that we can do bsearch later). We know
+ * there's at least 1 item, so we can skip the first element.
+ */
+ count = 1; /* number of deduplicated items */
+ for (j = 1; j < counts[i]; j++)
+ {
+ /* if it's different from the previous value, we need to keep it */
+ if (compare_datums_simple(values[i][j-1], values[i][j], &ssup[i]) != 0)
+ {
+ /* XXX: not needed if (count == j) */
+ values[i][count] = values[i][j];
+ count += 1;
+ }
+ }
+
+ /* make sure we fit into uint16 */
+ Assert(count <= UINT16_MAX);
+
+ /* keep info about the deduplicated count */
+ info[i].nvalues = count;
+
+ /* compute size of the serialized data */
+ if (info[i].typlen > 0)
+ /* byval or byref, but with fixed length (name, tid, ...) */
+ info[i].nbytes = info[i].nvalues * info[i].typlen;
+ else if (info[i].typlen == -1)
+ /* varlena, so just use VARSIZE_ANY */
+ for (j = 0; j < info[i].nvalues; j++)
+ info[i].nbytes += VARSIZE_ANY(values[i][j]);
+ else if (info[i].typlen == -2)
+ /* cstring, so simply strlen */
+ for (j = 0; j < info[i].nvalues; j++)
+ info[i].nbytes += strlen(DatumGetPointer(values[i][j]));
+ else
+ elog(ERROR, "unknown data type typbyval=%d typlen=%d",
+ info[i].typbyval, info[i].typlen);
+ }
+
+ /*
+ * Now we finally know how much space we'll need for the serialized
+ * histogram, as it contains these fields:
+ *
+ * - length (4B) for varlena
+ * - magic (4B)
+ * - type (4B)
+ * - ndimensions (4B)
+ * - nbuckets (4B)
+ * - info (ndim * sizeof(DimensionInfo)
+ * - arrays of values for each dimension
+ * - serialized buckets (nbuckets * bucketsize)
+ *
+ * So the 'header' size is 20B + ndim * sizeof(DimensionInfo) and
+ * then we'll place the data (and buckets).
+ */
+ total_length = (sizeof(int32) + offsetof(MVHistogramData, buckets)
+ + ndims * sizeof(DimensionInfo)
+ + nbuckets * bucketsize);
+
+ /* account for the deduplicated data */
+ for (i = 0; i < ndims; i++)
+ total_length += info[i].nbytes;
+
+ /* enforce arbitrary limit of 1MB */
+ if (total_length > (1024 * 1024))
+ elog(ERROR, "serialized histogram exceeds 1MB (%ld > %d)",
+ total_length, (1024 * 1024));
+
+ /* allocate space for the serialized histogram list, set header */
+ output = (bytea*)palloc0(total_length);
+ SET_VARSIZE(output, total_length);
+
+ /* we'll use 'data' to keep track of the place to write data */
+ data = VARDATA(output);
+
+ memcpy(data, histogram, offsetof(MVHistogramData, buckets));
+ data += offsetof(MVHistogramData, buckets);
+
+ memcpy(data, info, sizeof(DimensionInfo) * ndims);
+ data += sizeof(DimensionInfo) * ndims;
+
+ /* serialize the deduplicated values for all attributes */
+ for (i = 0; i < ndims; i++)
+ {
+#ifdef USE_ASSERT_CHECKING
+ char *tmp = data;
+#endif
+ for (j = 0; j < info[i].nvalues; j++)
+ {
+ Datum v = values[i][j];
+
+ if (info[i].typbyval) /* passed by value */
+ {
+ memcpy(data, &v, info[i].typlen);
+ data += info[i].typlen;
+ }
+ else if (info[i].typlen > 0) /* passed by reference */
+ {
+ memcpy(data, DatumGetPointer(v), info[i].typlen);
+ data += info[i].typlen;
+ }
+ else if (info[i].typlen == -1) /* varlena */
+ {
+ memcpy(data, DatumGetPointer(v), VARSIZE_ANY(v));
+ data += VARSIZE_ANY(values[i][j]);
+ }
+ else if (info[i].typlen == -2) /* cstring */
+ {
+ memcpy(data, DatumGetPointer(v), strlen(DatumGetPointer(v))+1);
+ data += strlen(DatumGetPointer(v)) + 1;
+ }
+ }
+
+ /* make sure we got exactly the amount of data we expected */
+ Assert((data - tmp) == info[i].nbytes);
+ }
+
+ /* finally serialize the items, with uint16 indexes instead of the values */
+ for (i = 0; i < nbuckets; i++)
+ {
+ /* don't write beyond the allocated space */
+ Assert(data <= (char*)output + total_length - bucketsize);
+
+ /* reset the values for each item */
+ memset(bucket, 0, bucketsize);
+
+ BUCKET_NTUPLES(bucket) = histogram->buckets[i]->ntuples;
+
+ for (j = 0; j < ndims; j++)
+ {
+ /* do the lookup only for non-NULL values */
+ if (! histogram->buckets[i]->nullsonly[j])
+ {
+ uint16 idx;
+ Datum * v = NULL;
+
+ /* min boundary */
+ v = (Datum*)bsearch_arg(&histogram->buckets[i]->min[j],
+ values[j], info[j].nvalues, sizeof(Datum),
+ compare_scalars_simple, &ssup[j]);
+
+ Assert(v != NULL); /* serialization or deduplication error */
+
+ /* compute index within the array */
+ idx = (v - values[j]);
+
+ Assert((idx >= 0) && (idx < info[j].nvalues));
+
+ BUCKET_MIN_INDEXES(bucket, ndims)[j] = idx;
+
+ /* max boundary */
+ v = (Datum*)bsearch_arg(&histogram->buckets[i]->max[j],
+ values[j], info[j].nvalues, sizeof(Datum),
+ compare_scalars_simple, &ssup[j]);
+
+ Assert(v != NULL); /* serialization or deduplication error */
+
+ /* compute index within the array */
+ idx = (v - values[j]);
+
+ Assert((idx >= 0) && (idx < info[j].nvalues));
+
+ BUCKET_MAX_INDEXES(bucket, ndims)[j] = idx;
+ }
+ }
+
+ /* copy flags (nulls, min/max inclusive) */
+ memcpy(BUCKET_NULLS_ONLY(bucket, ndims),
+ histogram->buckets[i]->nullsonly, sizeof(bool) * ndims);
+
+ memcpy(BUCKET_MIN_INCL(bucket, ndims),
+ histogram->buckets[i]->min_inclusive, sizeof(bool) * ndims);
+
+ memcpy(BUCKET_MAX_INCL(bucket, ndims),
+ histogram->buckets[i]->max_inclusive, sizeof(bool) * ndims);
+
+ /* copy the item into the array */
+ memcpy(data, bucket, bucketsize);
+
+ data += bucketsize;
+ }
+
+ /* at this point we expect to match the total_length exactly */
+ Assert((data - (char*)output) == total_length);
+
+ /* free the values/counts arrays here */
+ pfree(counts);
+ pfree(info);
+ pfree(ssup);
+
+ for (i = 0; i < ndims; i++)
+ pfree(values[i]);
+
+ pfree(values);
+
+ return output;
+}
+
+/*
+ * Returns histogram in a partially-serialized form (keeps the boundary values
+ * deduplicated, so that it's possible to optimize the estimation part by
+ * caching function call results between buckets etc.).
+ */
+MVSerializedHistogram
+deserialize_mv_histogram(bytea * data)
+{
+ int i = 0, j = 0;
+
+ Size expected_size;
+ char *tmp = NULL;
+
+ MVSerializedHistogram histogram;
+ DimensionInfo *info;
+
+ int nbuckets;
+ int ndims;
+ int bucketsize;
+
+ /* temporary deserialization buffer */
+ int bufflen;
+ char *buff;
+ char *ptr;
+
+ if (data == NULL)
+ return NULL;
+
+ if (VARSIZE_ANY_EXHDR(data) < offsetof(MVSerializedHistogramData,buckets))
+ elog(ERROR, "invalid histogram size %ld (expected at least %ld)",
+ VARSIZE_ANY_EXHDR(data), offsetof(MVSerializedHistogramData,buckets));
+
+ /* read the histogram header */
+ histogram
+ = (MVSerializedHistogram)palloc(sizeof(MVSerializedHistogramData));
+
+ /* initialize pointer to the data part (skip the varlena header) */
+ tmp = VARDATA(data);
+
+ /* get the header and perform basic sanity checks */
+ memcpy(histogram, tmp, offsetof(MVSerializedHistogramData, buckets));
+ tmp += offsetof(MVSerializedHistogramData, buckets);
+
+ if (histogram->magic != MVSTAT_HIST_MAGIC)
+ elog(ERROR, "invalid histogram magic %d (expected %dd)",
+ histogram->magic, MVSTAT_HIST_MAGIC);
+
+ if (histogram->type != MVSTAT_HIST_TYPE_BASIC)
+ elog(ERROR, "invalid histogram type %d (expected %dd)",
+ histogram->type, MVSTAT_HIST_TYPE_BASIC);
+
+ nbuckets = histogram->nbuckets;
+ ndims = histogram->ndimensions;
+ bucketsize = BUCKET_SIZE(ndims);
+
+ Assert((nbuckets > 0) && (nbuckets <= MVSTAT_HIST_MAX_BUCKETS));
+ Assert((ndims >= 2) && (ndims <= MVSTATS_MAX_DIMENSIONS));
+
+ /*
+ * What size do we expect with those parameters? (It's incomplete, as we
+ * have yet to add the array sizes from the DimensionInfo records.)
+ */
+ expected_size = offsetof(MVSerializedHistogramData,buckets) +
+ ndims * sizeof(DimensionInfo) +
+ (nbuckets * bucketsize);
+
+ /* check that we have at least the DimensionInfo records */
+ if (VARSIZE_ANY_EXHDR(data) < expected_size)
+ elog(ERROR, "invalid histogram size %ld (expected %ld)",
+ VARSIZE_ANY_EXHDR(data), expected_size);
+
+ info = (DimensionInfo*)(tmp);
+ tmp += ndims * sizeof(DimensionInfo);
+
+ /* account for the value arrays */
+ for (i = 0; i < ndims; i++)
+ expected_size += info[i].nbytes;
+
+ if (VARSIZE_ANY_EXHDR(data) != expected_size)
+ elog(ERROR, "invalid histogram size %ld (expected %ld)",
+ VARSIZE_ANY_EXHDR(data), expected_size);
+
+ /* looks OK - not corrupted or something */
+
+ /* a single buffer for all the values and counts */
+ bufflen = (sizeof(int) + sizeof(Datum*)) * ndims;
+
+ for (i = 0; i < ndims; i++)
+ /* don't allocate space for byval types, matching Datum */
+ if (! (info[i].typbyval && (info[i].typlen == sizeof(Datum))))
+ bufflen += (sizeof(Datum) * info[i].nvalues);
+
+ /* also, include space for the result, tracking the buckets */
+ bufflen += nbuckets * (
+ sizeof(MVSerializedBucket) + /* bucket pointer */
+ sizeof(MVSerializedBucketData)); /* bucket data */
+
+ buff = palloc0(bufflen);
+ ptr = buff;
+
+ histogram->nvalues = (int*)ptr;
+ ptr += (sizeof(int) * ndims);
+
+ histogram->values = (Datum**)ptr;
+ ptr += (sizeof(Datum*) * ndims);
+
+ /*
+ * FIXME This uses pointers to the original data array (the types
+ * not passed by value), so when someone frees the memory,
+ * e.g. by doing something like this:
+ *
+ * bytea * data = ... fetch the data from catalog ...
+ * MCVList mcvlist = deserialize_mcv_list(data);
+ * pfree(data);
+ *
+ * then 'mcvlist' references the freed memory. This needs to
+ * copy the pieces.
+ *
+ * TODO same as in MCV deserialization / consider moving to common.c
+ */
+ for (i = 0; i < ndims; i++)
+ {
+ histogram->nvalues[i] = info[i].nvalues;
+
+ if (info[i].typbyval)
+ {
+ /* passed by value / Datum - simply reuse the array */
+ if (info[i].typlen == sizeof(Datum))
+ {
+ histogram->values[i] = (Datum*)tmp;
+ tmp += info[i].nbytes;
+ }
+ else
+ {
+ histogram->values[i] = (Datum*)ptr;
+ ptr += (sizeof(Datum) * info[i].nvalues);
+
+ for (j = 0; j < info[i].nvalues; j++)
+ {
+ /* just point into the array */
+ memcpy(&histogram->values[i][j], tmp, info[i].typlen);
+ tmp += info[i].typlen;
+ }
+ }
+ }
+ else
+ {
+ /* all the other types need a chunk of the buffer */
+ histogram->values[i] = (Datum*)ptr;
+ ptr += (sizeof(Datum) * info[i].nvalues);
+
+ if (info[i].typlen > 0)
+ {
+ /* passed by reference, but fixed length (name, tid, ...) */
+ for (j = 0; j < info[i].nvalues; j++)
+ {
+ /* just point into the array */
+ histogram->values[i][j] = PointerGetDatum(tmp);
+ tmp += info[i].typlen;
+ }
+ }
+ else if (info[i].typlen == -1)
+ {
+ /* varlena */
+ for (j = 0; j < info[i].nvalues; j++)
+ {
+ /* just point into the array */
+ histogram->values[i][j] = PointerGetDatum(tmp);
+ tmp += VARSIZE_ANY(tmp);
+ }
+ }
+ else if (info[i].typlen == -2)
+ {
+ /* cstring */
+ for (j = 0; j < info[i].nvalues; j++)
+ {
+ /* just point into the array */
+ histogram->values[i][j] = PointerGetDatum(tmp);
+ tmp += (strlen(tmp) + 1); /* don't forget the \0 */
+ }
+ }
+ }
+ }
+
+ histogram->buckets = (MVSerializedBucket*)ptr;
+ ptr += (sizeof(MVSerializedBucket) * nbuckets);
+
+ for (i = 0; i < nbuckets; i++)
+ {
+ MVSerializedBucket bucket = (MVSerializedBucket)ptr;
+ ptr += sizeof(MVSerializedBucketData);
+
+ bucket->ntuples = BUCKET_NTUPLES(tmp);
+ bucket->nullsonly = BUCKET_NULLS_ONLY(tmp, ndims);
+ bucket->min_inclusive = BUCKET_MIN_INCL(tmp, ndims);
+ bucket->max_inclusive = BUCKET_MAX_INCL(tmp, ndims);
+
+ bucket->min = BUCKET_MIN_INDEXES(tmp, ndims);
+ bucket->max = BUCKET_MAX_INDEXES(tmp, ndims);
+
+ histogram->buckets[i] = bucket;
+
+ Assert(tmp <= (char*)data + VARSIZE_ANY(data));
+
+ tmp += bucketsize;
+ }
+
+ /* at this point we expect to match the total_length exactly */
+ Assert((tmp - VARDATA(data)) == expected_size);
+
+ /* we should exhaust the output buffer exactly */
+ Assert((ptr - buff) == bufflen);
+
+ return histogram;
+}
+
+/*
+ * Build the initial bucket, which will be then split into smaller ones.
+ */
+static MVBucket
+create_initial_mv_bucket(int numrows, HeapTuple *rows, int2vector *attrs,
+ VacAttrStats **stats)
+{
+ int i;
+ int numattrs = attrs->dim1;
+ HistogramBuild data = NULL;
+
+ /* TODO allocate bucket as a single piece, including all the fields. */
+ MVBucket bucket = (MVBucket)palloc0(sizeof(MVBucketData));
+
+ Assert(numrows > 0);
+ Assert(rows != NULL);
+ Assert((numattrs >= 2) && (numattrs <= MVSTATS_MAX_DIMENSIONS));
+
+ /* allocate the per-dimension arrays */
+
+ /* flags for null-only dimensions */
+ bucket->nullsonly = (bool*)palloc0(numattrs * sizeof(bool));
+
+ /* inclusiveness boundaries - lower/upper bounds */
+ bucket->min_inclusive = (bool*)palloc0(numattrs * sizeof(bool));
+ bucket->max_inclusive = (bool*)palloc0(numattrs * sizeof(bool));
+
+ /* lower/upper boundaries */
+ bucket->min = (Datum*)palloc0(numattrs * sizeof(Datum));
+ bucket->max = (Datum*)palloc0(numattrs * sizeof(Datum));
+
+ /* build-data */
+ data = (HistogramBuild)palloc0(sizeof(HistogramBuildData));
+
+ /* number of distinct values (per dimension) */
+ data->ndistincts = (uint32*)palloc0(numattrs * sizeof(uint32));
+
+ /* all the sample rows fall into the initial bucket */
+ data->numrows = numrows;
+ data->rows = rows;
+
+ bucket->build_data = data;
+
+ /*
+ * Update the number of ndistinct combinations in the bucket (which we use
+ * when selecting a bucket to partition), and then the number of distinct values
+ * for each dimension (which we use when choosing which dimension to split).
+ */
+ update_bucket_ndistinct(bucket, attrs, stats);
+
+ /* Update ndistinct (and also set min/max) for all dimensions. */
+ for (i = 0; i < numattrs; i++)
+ update_dimension_ndistinct(bucket, i, attrs, stats, true);
+
+ return bucket;
+}
+
+/*
+ * Choose the bucket to partition next.
+ *
+ * The current criterion is rather simple, chosen so that the algorithm produces
+ * buckets with about equal frequency and regular size. We select the bucket
+ * with the highest number of distinct values, and then split it by the longest
+ * dimension.
+ *
+ * The distinct values are uniformly mapped to [0,1] interval, and this is used
+ * to compute length of the value range.
+ *
+ * NOTE: This is not the same array used for deduplication, as this contains
+ * values for all the tuples from the sample, not just the boundary values.
+ *
+ * Returns either pointer to the bucket selected to be partitioned, or NULL if
+ * there are no buckets that may be split (e.g. if all buckets are too small
+ * or contain too few distinct values).
+ *
+ *
+ * Tricky example
+ * --------------
+ *
+ * Consider this table:
+ *
+ * CREATE TABLE t AS SELECT i AS a, i AS b
+ * FROM generate_series(1,1000000) s(i);
+ *
+ * CREATE STATISTICS s1 ON t (a,b) WITH (histogram);
+ *
+ * ANALYZE t;
+ *
+ * It's a very specific (and perhaps artificial) example, because every bucket
+ * always has exactly the same number of distinct values in all dimensions,
+ * which makes the partitioning tricky.
+ *
+ * Then:
+ *
+ * SELECT * FROM t WHERE (a < 100) AND (b < 100);
+ *
+ * is estimated to return ~120 rows, while in reality it returns only 99.
+ *
+ * QUERY PLAN
+ * -------------------------------------------------------------
+ * Seq Scan on t (cost=0.00..19425.00 rows=117 width=8)
+ * (actual time=0.129..82.776 rows=99 loops=1)
+ * Filter: ((a < 100) AND (b < 100))
+ * Rows Removed by Filter: 999901
+ * Planning time: 1.286 ms
+ * Execution time: 82.984 ms
+ * (5 rows)
+ *
+ * So this estimate is reasonably close. Let's change the query to OR clause:
+ *
+ * SELECT * FROM t WHERE (a < 100) OR (b < 100);
+ *
+ * QUERY PLAN
+ * -------------------------------------------------------------
+ * Seq Scan on t (cost=0.00..19425.00 rows=8100 width=8)
+ * (actual time=0.145..99.910 rows=99 loops=1)
+ * Filter: ((a < 100) OR (b < 100))
+ * Rows Removed by Filter: 999901
+ * Planning time: 1.578 ms
+ * Execution time: 100.132 ms
+ * (5 rows)
+ *
+ * That's clearly a much worse estimate. This happens because the histogram
+ * contains buckets like this:
+ *
+ * bucket 592 [3 30310] [30134 30593] => [0.000233]
+ *
+ * i.e. the length of "a" dimension is (30310-3)=30307, while the length of "b"
+ * is (30593-30134)=459. So the "b" dimension is much narrower than "a".
+ * Of course, there are also buckets where "b" is the wider dimension.
+ *
+ * This is partially mitigated by selecting the "longest" dimension but that
+ * only happens after we already selected the bucket. So if we never select the
+ * bucket, this optimization does not apply.
+ *
+ * The other reason why this particular example behaves so poorly is due to the
+ * way we actually split the selected bucket. We do attempt to divide the bucket
+ * into two parts containing about the same number of tuples, but that does not
+ * work too well when most of the tuples are squashed on one side of the bucket.
+ *
+ * For example for columns with data on the diagonal (i.e. when a=b), we end up
+ * with a narrow bucket on the diagonal and a huge bucket covering the remaining
+ * part (with much lower density).
+ *
+ * So perhaps we need two partitioning strategies - one aiming to split buckets
+ * with high frequency (number of sampled rows), the other aiming to split
+ * "large" buckets. And alternating between them, somehow.
+ *
+ * TODO Consider using a similar lower boundary for row count as for simple
+ * histograms, i.e. 300 tuples per bucket.
+ */
+static MVBucket
+select_bucket_to_partition(int nbuckets, MVBucket * buckets)
+{
+ int i;
+ int numrows = 0;
+ MVBucket bucket = NULL;
+
+ for (i = 0; i < nbuckets; i++)
+ {
+ HistogramBuild data = (HistogramBuild)buckets[i]->build_data;
+
+ /* if the number of rows is higher, use this bucket */
+ if ((data->ndistinct > 2) &&
+ (data->numrows > numrows) &&
+ (data->numrows >= MIN_BUCKET_ROWS)) {
+ bucket = buckets[i];
+ numrows = data->numrows;
+ }
+ }
+
+ /* may be NULL if there are no buckets with (ndistinct > 2) */
+ return bucket;
+}
+
+/*
+ * A simple bucket partitioning implementation - we choose the longest bucket
+ * dimension, measured using the array of distinct values built at the very
+ * beginning of the build.
+ *
+ * We map all the distinct values to a [0,1] interval, uniformly distributed,
+ * and then use this to measure length. It's essentially a number of distinct
+ * values within the range, normalized to [0,1].
+ *
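+ * For example, if a dimension has 100 distinct values in the sample, and
+ * the bucket boundaries happen to be the 20th and 50th of those values,
+ * the bucket has length (50 - 20) / 100 = 0.3 in that dimension.
+ *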
+ * Then we choose a 'middle' value splitting the bucket into two parts with
+ * roughly the same frequency.
+ *
+ * This splits the bucket by tweaking the existing one, and returning the new
+ * bucket (essentially shrinking the existing one in-place and returning the
+ * other "half" as a new bucket). The caller is responsible for adding the new
+ * bucket into the list of buckets.
+ *
+ * There are multiple histogram options, centered around the partitioning
+ * criteria, specifying both how to choose a bucket and the dimension most in
+ * need of a split. For a nice summary and general overview, see "rK-Hist : an
+ * R-Tree based histogram for multi-dimensional selectivity estimation" thesis
+ * by J. A. Lopez, Concordia University, p.34-37 (and possibly p. 32-34 for
+ * explanation of the terms).
+ *
+ * It requires care to prevent splitting only one dimension and not splitting
+ * another one at all (which might happen easily in case of strongly dependent
+ * columns - e.g. y=x). The current algorithm minimizes this, but may still
+ * happen for perfectly dependent examples (when all the dimensions have equal
+ * length, the first one will be selected).
+ *
+ * TODO Should probably consider statistics target for the columns (e.g.
+ * to split dimensions with higher statistics target more frequently).
+ */
+static MVBucket
+partition_bucket(MVBucket bucket, int2vector *attrs,
+ VacAttrStats **stats,
+ int *ndistvalues, Datum **distvalues)
+{
+ int i;
+ int dimension;
+ int numattrs = attrs->dim1;
+
+ Datum split_value;
+ MVBucket new_bucket;
+ HistogramBuild new_data;
+
+ /* needed for sort, when looking for the split value */
+ bool isNull;
+ int nvalues = 0;
+ HistogramBuild data = (HistogramBuild)bucket->build_data;
+ StdAnalyzeData * mystats = NULL;
+ ScalarItem * values = (ScalarItem*)palloc0(data->numrows * sizeof(ScalarItem));
+ SortSupportData ssup;
+
+ int nrows = 1; /* number of rows below current value */
+ double delta;
+
+ /* needed when splitting the values */
+ HeapTuple * oldrows = data->rows;
+ int oldnrows = data->numrows;
+
+ /*
+ * We can't split buckets with a single distinct value (this also
+ * disqualifies NULL-only dimensions). Also, there have to be multiple
+ * sample rows (otherwise there couldn't be multiple distinct values).
+ */
+ Assert(data->ndistinct > 1);
+ Assert(data->numrows > 1);
+ Assert((numattrs >= 2) && (numattrs <= MVSTATS_MAX_DIMENSIONS));
+
+ /* Look for the next dimension to split. */
+ delta = 0.0;
+ dimension = -1;
+
+ for (i = 0; i < numattrs; i++)
+ {
+ Datum *a, *b;
+
+ mystats = (StdAnalyzeData *) stats[i]->extra_data;
+
+ /* initialize sort support, etc. */
+ memset(&ssup, 0, sizeof(ssup));
+ ssup.ssup_cxt = CurrentMemoryContext;
+
+ /* We always use the default collation for statistics */
+ ssup.ssup_collation = DEFAULT_COLLATION_OID;
+ ssup.ssup_nulls_first = false;
+
+ PrepareSortSupportFromOrderingOp(mystats->ltopr, &ssup);
+
+ /* can't split NULL-only dimension */
+ if (bucket->nullsonly[i])
+ continue;
+
+ /* can't split dimension with a single ndistinct value */
+ if (data->ndistincts[i] <= 1)
+ continue;
+
+ /* search for min boundary in the distinct list */
+ a = (Datum*)bsearch_arg(&bucket->min[i],
+ distvalues[i], ndistvalues[i],
+ sizeof(Datum), compare_scalars_simple, &ssup);
+
+ b = (Datum*)bsearch_arg(&bucket->max[i],
+ distvalues[i], ndistvalues[i],
+ sizeof(Datum), compare_scalars_simple, &ssup);
+
+ /* if this dimension is 'larger' then partition by it */
+ if (((b-a)*1.0 / ndistvalues[i]) > delta)
+ {
+ delta = ((b-a)*1.0 / ndistvalues[i]);
+ dimension = i;
+ }
+ }
+
+ /*
+ * If we haven't found a dimension here, we've done something
+ * wrong in select_bucket_to_partition.
+ */
+ Assert(dimension != -1);
+
+ /*
+ * Walk through the selected dimension, collect and sort the values and
+ * then choose the value to use as the new boundary.
+ */
+ mystats = (StdAnalyzeData *) stats[dimension]->extra_data;
+
+ /* initialize sort support, etc. */
+ memset(&ssup, 0, sizeof(ssup));
+ ssup.ssup_cxt = CurrentMemoryContext;
+
+ /* We always use the default collation for statistics */
+ ssup.ssup_collation = DEFAULT_COLLATION_OID;
+ ssup.ssup_nulls_first = false;
+
+ PrepareSortSupportFromOrderingOp(mystats->ltopr, &ssup);
+
+ for (i = 0; i < data->numrows; i++)
+ {
+ /* remember the index of the sample row, to make the partitioning simpler */
+ values[nvalues].value = heap_getattr(data->rows[i], attrs->values[dimension],
+ stats[dimension]->tupDesc, &isNull);
+ values[nvalues].tupno = i;
+
+ /* no NULL values allowed here (we never split null-only dimension) */
+ Assert(!isNull);
+
+ nvalues++;
+ }
+
+ /* sort the array of values */
+ qsort_arg((void *) values, nvalues, sizeof(ScalarItem),
+ compare_scalars_partition, (void *) &ssup);
+
+ /*
+ * Walk through the sorted array and consider each distinct-value
+ * boundary, checking how close it is to splitting the sample rows in
+ * half. The boundary closest to (numrows/2) wins, and the distinct value
+ * starting there becomes the split value, used as an exclusive upper
+ * boundary (and inclusive lower boundary).
+ *
+ * TODO Maybe we should use "average" of the two middle distinct values
+ * (at least for even distinct counts), but that would require being
+ * able to do an average (which does not work for non-numeric types).
+ *
+ * TODO Another option would be to split by distinct values, i.e. to put
+ * about (ndistinct/2) distinct values into each partition. The
+ * tuple-based split used here should however work better when there
+ * are a few very frequent values, and many rare ones.
+ */
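+ /*
+ * Illustration (hypothetical sample): with six sorted values
+ * {1, 1, 2, 2, 3, 3}, the candidate boundaries are i=2 (first '2') and
+ * i=4 (first '3'), both at distance 1 from numrows/2 = 3. The first one
+ * wins, so split_value = 2 and the two '1' rows stay in this bucket.
+ */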
+ delta = fabs(data->numrows);
+ split_value = values[0].value;
+
+ for (i = 1; i < data->numrows; i++)
+ {
+ if (values[i].value != values[i-1].value)
+ {
+ /* are we closer to splitting the bucket in half? */
+ if (fabs(i - data->numrows/2.0) < delta)
+ {
+ /* let's assume we'll use this value for the split */
+ split_value = values[i].value;
+ delta = fabs(i - data->numrows/2.0);
+ nrows = i;
+ }
+ }
+ }
+
+ Assert(nrows > 0);
+ Assert(nrows < data->numrows);
+
+ /* create the new bucket as an (incomplete) copy of the one being partitioned */
+ new_bucket = copy_mv_bucket(bucket, numattrs);
+ new_data = (HistogramBuild)new_bucket->build_data;
+
+ /*
+ * Do the actual split of the chosen dimension, using the split value as the
+ * upper bound for the existing bucket, and lower bound for the new one.
+ */
+ bucket->max[dimension] = split_value;
+ new_bucket->min[dimension] = split_value;
+
+ bucket->max_inclusive[dimension] = false;
+ new_bucket->min_inclusive[dimension] = true;
+
+ /*
+ * Redistribute the sample tuples using the 'ScalarItem->tupno' index. We
+ * know 'nrows' rows should remain in the original bucket and the rest goes
+ * to the new one.
+ */
+
+ data->rows = (HeapTuple*)palloc0(nrows * sizeof(HeapTuple));
+ new_data->rows = (HeapTuple*)palloc0((oldnrows - nrows) * sizeof(HeapTuple));
+
+ data->numrows = nrows;
+ new_data->numrows = (oldnrows - nrows);
+
+ /*
+ * The first nrows should go to the first bucket, the rest should go to the
+ * new one. Use the tupno field to get the actual HeapTuple row from the
+ * original array of sample rows.
+ */
+ for (i = 0; i < nrows; i++)
+ memcpy(&data->rows[i], &oldrows[values[i].tupno], sizeof(HeapTuple));
+
+ for (i = nrows; i < oldnrows; i++)
+ memcpy(&new_data->rows[i-nrows], &oldrows[values[i].tupno], sizeof(HeapTuple));
+
+ /* update ndistinct values for the buckets (total and per dimension) */
+ update_bucket_ndistinct(bucket, attrs, stats);
+ update_bucket_ndistinct(new_bucket, attrs, stats);
+
+ /*
+ * TODO We don't need to do this for the dimension we used for split,
+ * because we know how many distinct values went to each partition.
+ */
+ for (i = 0; i < numattrs; i++)
+ {
+ update_dimension_ndistinct(bucket, i, attrs, stats, false);
+ update_dimension_ndistinct(new_bucket, i, attrs, stats, false);
+ }
+
+ pfree(oldrows);
+ pfree(values);
+
+ return new_bucket;
+}
+
+/*
+ * Copy a histogram bucket. The copy does not include the build-time data, i.e.
+ * sampled rows etc.
+ */
+static MVBucket
+copy_mv_bucket(MVBucket bucket, uint32 ndimensions)
+{
+ /* TODO allocate as a single piece (including all the fields) */
+ MVBucket new_bucket = (MVBucket)palloc0(sizeof(MVBucketData));
+ HistogramBuild data = (HistogramBuild)palloc0(sizeof(HistogramBuildData));
+
+ /*
+ * Copy only the attributes that will stay the same after the split;
+ * the rest will be recomputed afterwards.
+ */
+
+ /* allocate the per-dimension arrays */
+ new_bucket->nullsonly = (bool*)palloc0(ndimensions * sizeof(bool));
+
+ /* inclusiveness boundaries - lower/upper bounds */
+ new_bucket->min_inclusive = (bool*)palloc0(ndimensions * sizeof(bool));
+ new_bucket->max_inclusive = (bool*)palloc0(ndimensions * sizeof(bool));
+
+ /* lower/upper boundaries */
+ new_bucket->min = (Datum*)palloc0(ndimensions * sizeof(Datum));
+ new_bucket->max = (Datum*)palloc0(ndimensions * sizeof(Datum));
+
+ /* copy data */
+ memcpy(new_bucket->nullsonly, bucket->nullsonly, ndimensions * sizeof(bool));
+
+ memcpy(new_bucket->min_inclusive, bucket->min_inclusive, ndimensions*sizeof(bool));
+ memcpy(new_bucket->min, bucket->min, ndimensions*sizeof(Datum));
+
+ memcpy(new_bucket->max_inclusive, bucket->max_inclusive, ndimensions*sizeof(bool));
+ memcpy(new_bucket->max, bucket->max, ndimensions*sizeof(Datum));
+
+ /* allocate the interesting part of the build data */
+ data->ndistincts = (uint32*)palloc0(ndimensions * sizeof(uint32));
+
+ new_bucket->build_data = data;
+
+ return new_bucket;
+}
+
+/*
+ * Count the number of distinct values in the bucket. This just copies the
+ * Datum values into a simple array, and sorts them using a memcmp-based
+ * comparator. That means it only works for pass-by-value data types
+ * (assuming they don't use collations etc.)
+ */
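+/*
+ * Illustration (hypothetical sample): for rows {(1,2), (1,2), (2,3)} the
+ * sort places the two (1,2) rows next to each other, so only one of the two
+ * adjacent comparisons differs and ndistinct ends up as 2.
+ */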
+static void
+update_bucket_ndistinct(MVBucket bucket, int2vector *attrs, VacAttrStats ** stats)
+{
+ int i, j;
+ int numattrs = attrs->dim1;
+
+ HistogramBuild data = (HistogramBuild)bucket->build_data;
+ int numrows = data->numrows;
+
+ MultiSortSupport mss = multi_sort_init(numattrs);
+
+ /*
+ * We could have collected this while walking through the attributes
+ * earlier (as it is, we have to call heap_getattr twice per value).
+ */
+ SortItem *items = (SortItem*)palloc0(numrows * sizeof(SortItem));
+ Datum *values = (Datum*)palloc0(numrows * sizeof(Datum) * numattrs);
+ bool *isnull = (bool*)palloc0(numrows * sizeof(bool) * numattrs);
+
+ for (i = 0; i < numrows; i++)
+ {
+ items[i].values = &values[i * numattrs];
+ items[i].isnull = &isnull[i * numattrs];
+ }
+
+ /* prepare the sort functions for all the dimensions */
+ for (i = 0; i < numattrs; i++)
+ multi_sort_add_dimension(mss, i, i, stats);
+
+ /* collect the values */
+ for (i = 0; i < numrows; i++)
+ for (j = 0; j < numattrs; j++)
+ items[i].values[j]
+ = heap_getattr(data->rows[i], attrs->values[j],
+ stats[j]->tupDesc, &items[i].isnull[j]);
+
+ qsort_arg((void *) items, numrows, sizeof(SortItem),
+ multi_sort_compare, mss);
+
+ data->ndistinct = 1;
+
+ for (i = 1; i < numrows; i++)
+ if (multi_sort_compare(&items[i], &items[i-1], mss) != 0)
+ data->ndistinct += 1;
+
+ pfree(items);
+ pfree(values);
+ pfree(isnull);
+}
+
+/*
+ * Count distinct values per bucket dimension.
+ */
+static void
+update_dimension_ndistinct(MVBucket bucket, int dimension, int2vector *attrs,
+ VacAttrStats ** stats, bool update_boundaries)
+{
+ int j;
+ int nvalues = 0;
+ bool isNull;
+ HistogramBuild data = (HistogramBuild)bucket->build_data;
+ Datum * values = (Datum*)palloc0(data->numrows * sizeof(Datum));
+ SortSupportData ssup;
+
+ StdAnalyzeData * mystats = (StdAnalyzeData *) stats[dimension]->extra_data;
+
+ /* we may already know this is a NULL-only dimension */
+ if (bucket->nullsonly[dimension])
+ data->ndistincts[dimension] = 1;
+
+ memset(&ssup, 0, sizeof(ssup));
+ ssup.ssup_cxt = CurrentMemoryContext;
+
+ /* We always use the default collation for statistics */
+ ssup.ssup_collation = DEFAULT_COLLATION_OID;
+ ssup.ssup_nulls_first = false;
+
+ PrepareSortSupportFromOrderingOp(mystats->ltopr, &ssup);
+
+ for (j = 0; j < data->numrows; j++)
+ {
+ values[nvalues] = heap_getattr(data->rows[j], attrs->values[dimension],
+ stats[dimension]->tupDesc, &isNull);
+
+ /* ignore NULL values */
+ if (! isNull)
+ nvalues++;
+ }
+
+ /* there's always at least 1 distinct value (may be NULL) */
+ data->ndistincts[dimension] = 1;
+
+ /* if there are only NULL values in the column, mark the dimension as
+ * NULL-only and bail out */
+ if (nvalues == 0)
+ {
+ pfree(values);
+ bucket->nullsonly[dimension] = true;
+ return;
+ }
+
+ /* sort the array of (pass-by-value) datums */
+ qsort_arg((void *) values, nvalues, sizeof(Datum),
+ compare_scalars_simple, (void *) &ssup);
+
+ /*
+ * Update min/max boundaries to the smallest bounding box. Generally, this
+ * needs to be done only when constructing the initial bucket.
+ */
+ if (update_boundaries)
+ {
+ /* store the min/max values */
+ bucket->min[dimension] = values[0];
+ bucket->min_inclusive[dimension] = true;
+
+ bucket->max[dimension] = values[nvalues-1];
+ bucket->max_inclusive[dimension] = true;
+ }
+
+ /*
+ * Walk through the array and count distinct values by comparing
+ * succeeding values.
+ *
+ * FIXME This only works for pass-by-value types (i.e. not VARCHARs
+ * etc.). Although thanks to the deduplication it might work
+ * even for those types (equal values will get the same item
+ * in the deduplicated array).
+ */
+ for (j = 1; j < nvalues; j++) {
+ if (values[j] != values[j-1])
+ data->ndistincts[dimension] += 1;
+ }
+
+ pfree(values);
+}
+
+/*
+ * A properly built histogram must not contain buckets mixing NULL and non-NULL
+ * values in a single dimension. Each dimension either is marked as 'nulls
+ * only' (and thus contains only NULL values), or it contains no NULL
+ * values at all.
+ *
+ * Therefore, if the sample contains NULL values in any of the columns, it's
+ * necessary to build those NULL-buckets. This is done in an iterative way
+ * using this algorithm, operating on a single bucket:
+ *
+ * (1) Check that all dimensions are well-formed (not mixing NULL and
+ * non-NULL values).
+ *
+ * (2) If all dimensions are well-formed, terminate.
+ *
+ * (3) If a dimension contains only NULL values but is not yet marked as
+ * NULL-only, mark it as NULL-only and run the algorithm again (on
+ * this bucket).
+ *
+ * (4) If a dimension mixes NULL and non-NULL values, split the bucket
+ * into two parts - one with the NULL values, one with the non-NULL
+ * values (replacing the current one). Then run the algorithm on both
+ * buckets.
+ *
+ * This is executed in a recursive manner, but the number of executions should
+ * be quite low - limited by the number of NULL-buckets. Also, in each branch
+ * the number of nested calls is limited by the number of dimensions
+ * (attributes) of the histogram.
+ *
+ * At the end, there should be buckets with no mixed dimensions. The number of
+ * buckets produced by this algorithm is rather limited - with N dimensions,
+ * there may be at most 2^N such buckets (each dimension may be either NULL
+ * or non-NULL). So with 8 dimensions (the current value of
+ * MVSTATS_MAX_DIMENSIONS) there may be at most 256 such buckets.
+ *
+ * After this, a 'regular' bucket-split algorithm shall run, further optimizing
+ * the histogram.
+ */
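+/*
+ * Illustration (hypothetical, two dimensions): a bucket with sample rows
+ * {(1, NULL), (2, 3), (NULL, NULL)} first hits the NULL in dimension 1 (in
+ * the first row), so it gets split into {(2, 3)} and
+ * {(1, NULL), (NULL, NULL)}, marking dimension 1 as NULL-only in the latter.
+ * Recursing into that bucket then splits dimension 0, leaving three buckets
+ * {(2, 3)}, {(1, NULL)} and {(NULL, NULL)} with no mixed dimensions.
+ */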
+static void
+create_null_buckets(MVHistogram histogram, int bucket_idx,
+ int2vector *attrs, VacAttrStats ** stats)
+{
+ int i, j;
+ int null_dim = -1;
+ int null_count = 0;
+ bool null_found = false;
+ MVBucket bucket, null_bucket;
+ int null_idx, curr_idx;
+ HistogramBuild data, null_data;
+
+ /* remember original values from the bucket */
+ int numrows;
+ HeapTuple *oldrows = NULL;
+
+ Assert(bucket_idx < histogram->nbuckets);
+ Assert(histogram->ndimensions == attrs->dim1);
+
+ bucket = histogram->buckets[bucket_idx];
+ data = (HistogramBuild)bucket->build_data;
+
+ numrows = data->numrows;
+ oldrows = data->rows;
+
+ /*
+ * Walk through all rows / dimensions, and stop once we find NULL in a
+ * dimension not yet marked as NULL-only.
+ */
+ for (i = 0; i < data->numrows; i++)
+ {
+ /*
+ * FIXME We don't need to start from the first attribute here - we can
+ * start from the last known dimension.
+ */
+ for (j = 0; j < histogram->ndimensions; j++)
+ {
+ /* Is this a NULL-only dimension? If yes, skip. */
+ if (bucket->nullsonly[j])
+ continue;
+
+ /* found a NULL in that dimension? */
+ if (heap_attisnull(data->rows[i], attrs->values[j]))
+ {
+ null_found = true;
+ null_dim = j;
+ break;
+ }
+ }
+
+ /* terminate if we found an attribute with NULL values */
+ if (null_found)
+ break;
+ }
+
+ /* no regular dimension contains NULL values => we're done */
+ if (! null_found)
+ return;
+
+ /* walk through the rows again, count NULL values in 'null_dim' */
+ for (i = 0; i < data->numrows; i++)
+ {
+ if (heap_attisnull(data->rows[i], attrs->values[null_dim]))
+ null_count += 1;
+ }
+
+ Assert(null_count <= data->numrows);
+
+ /*
+ * If (null_count == numrows) the dimension already is NULL-only, but is
+ * not yet marked as such. It's enough to mark it and repeat the process
+ * recursively (until we run out of dimensions).
+ */
+ if (null_count == data->numrows)
+ {
+ bucket->nullsonly[null_dim] = true;
+ create_null_buckets(histogram, bucket_idx, attrs, stats);
+ return;
+ }
+
+ /*
+ * We have to split the bucket into two - one with NULL values in the
+ * dimension, one with non-NULL values. We don't need to sort the data or
+ * anything, but otherwise it's similar to what partition_bucket() does.
+ */
+
+ /* create bucket with NULL-only dimension 'dim' */
+ null_bucket = copy_mv_bucket(bucket, histogram->ndimensions);
+ null_data = (HistogramBuild)null_bucket->build_data;
+
+ /* remember the current array info */
+ oldrows = data->rows;
+ numrows = data->numrows;
+
+ /* we'll keep non-NULL values in the current bucket */
+ data->numrows = (numrows - null_count);
+ data->rows
+ = (HeapTuple*)palloc0(data->numrows * sizeof(HeapTuple));
+
+ /* and the NULL values will go to the new one */
+ null_data->numrows = null_count;
+ null_data->rows
+ = (HeapTuple*)palloc0(null_data->numrows * sizeof(HeapTuple));
+
+ /* mark the dimension as NULL-only (in the new bucket) */
+ null_bucket->nullsonly[null_dim] = true;
+
+ /* walk through the sample rows and distribute them accordingly */
+ null_idx = 0;
+ curr_idx = 0;
+ for (i = 0; i < numrows; i++)
+ {
+ if (heap_attisnull(oldrows[i], attrs->values[null_dim]))
+ /* NULL => copy to the new bucket */
+ memcpy(&null_data->rows[null_idx++], &oldrows[i],
+ sizeof(HeapTuple));
+ else
+ memcpy(&data->rows[curr_idx++], &oldrows[i],
+ sizeof(HeapTuple));
+ }
+
+ /* update ndistinct values for the buckets (total and per dimension) */
+ update_bucket_ndistinct(bucket, attrs, stats);
+ update_bucket_ndistinct(null_bucket, attrs, stats);
+
+ /*
+ * TODO We don't need to do this for the dimension we used for split,
+ * because we know how many distinct values went to each bucket (NULL
+ * is not a value, so NULL buckets get 0, and the other bucket got all
+ * the distinct values).
+ */
+ for (i = 0; i < histogram->ndimensions; i++)
+ {
+ update_dimension_ndistinct(bucket, i, attrs, stats, false);
+ update_dimension_ndistinct(null_bucket, i, attrs, stats, false);
+ }
+
+ pfree(oldrows);
+
+ /* add the NULL bucket to the histogram */
+ histogram->buckets[histogram->nbuckets++] = null_bucket;
+
+ /*
+ * And now run the function recursively on both buckets (the new
+ * one first, because recursive calls may append more buckets, and
+ * (nbuckets-1) would then no longer point at the new bucket).
+ */
+ create_null_buckets(histogram, (histogram->nbuckets-1), attrs, stats);
+ create_null_buckets(histogram, bucket_idx, attrs, stats);
+}
+
+/*
+ * SRF with details about buckets of a histogram:
+ *
+ * - bucket ID (0 .. nbuckets-1)
+ * - min values (string array)
+ * - max values (string array)
+ * - nulls only (boolean array)
+ * - min inclusive flags (boolean array)
+ * - max inclusive flags (boolean array)
+ * - frequency (double precision)
+ *
+ * The input is the OID of the statistics, and no rows are returned if the
+ * statistics contains no histogram (or if there's no statistics for the OID).
+ *
+ * The second parameter (type) determines what values will be returned
+ * in the (minvals,maxvals). There are three possible values:
+ *
+ * 0 (actual values)
+ * -----------------
+ * - prints actual values
+ * - using the output function of the data type (as string)
+ * - handy for investigating the histogram
+ *
+ * 1 (distinct index)
+ * ------------------
+ * - prints index of the distinct value (into the serialized array)
+ * - makes it easier to spot neighbor buckets, etc.
+ * - handy for plotting the histogram
+ *
+ * 2 (normalized distinct index)
+ * -----------------------------
+ * - prints index of the distinct value, but normalized into [0,1]
+ * - similar to 1, but shows how 'long' the bucket range is
+ * - handy for plotting the histogram
+ *
+ * When plotting the histogram, be careful as the (1) and (2) options skew the
+ * lengths by distributing the distinct values uniformly. For data types
+ * without a clear meaning of 'distance' (e.g. strings) that is not a big deal,
+ * but for numbers it may be confusing.
+ */
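+/*
+ * Example usage (illustrative; assumes at least one statistics entry with a
+ * built histogram exists):
+ *
+ * SELECT * FROM pg_mv_histogram_buckets(
+ * (SELECT oid FROM pg_mv_statistic LIMIT 1), 0);
+ */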
+PG_FUNCTION_INFO_V1(pg_mv_histogram_buckets);
+
+#define OUTPUT_FORMAT_RAW 0 /* actual values */
+#define OUTPUT_FORMAT_INDEXES 1 /* distinct indexes */
+#define OUTPUT_FORMAT_DISTINCT 2 /* normalized distinct indexes */
+
+Datum
+pg_mv_histogram_buckets(PG_FUNCTION_ARGS)
+{
+ FuncCallContext *funcctx;
+ int call_cntr;
+ int max_calls;
+ TupleDesc tupdesc;
+ AttInMetadata *attinmeta;
+
+ Oid mvoid = PG_GETARG_OID(0);
+ int otype = PG_GETARG_INT32(1);
+
+ if ((otype < 0) || (otype > 2))
+ elog(ERROR, "invalid output type specified");
+
+ /* stuff done only on the first call of the function */
+ if (SRF_IS_FIRSTCALL())
+ {
+ MemoryContext oldcontext;
+ MVSerializedHistogram histogram;
+
+ /* create a function context for cross-call persistence */
+ funcctx = SRF_FIRSTCALL_INIT();
+
+ /* switch to memory context appropriate for multiple function calls */
+ oldcontext = MemoryContextSwitchTo(funcctx->multi_call_memory_ctx);
+
+ histogram = load_mv_histogram(mvoid);
+
+ funcctx->user_fctx = histogram;
+
+ /* total number of tuples to be returned */
+ funcctx->max_calls = 0;
+ if (funcctx->user_fctx != NULL)
+ funcctx->max_calls = histogram->nbuckets;
+
+ /* Build a tuple descriptor for our result type */
+ if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("function returning record called in context "
+ "that cannot accept type record")));
+
+ /*
+ * generate attribute metadata needed later to produce tuples
+ * from raw C strings
+ */
+ attinmeta = TupleDescGetAttInMetadata(tupdesc);
+ funcctx->attinmeta = attinmeta;
+
+ MemoryContextSwitchTo(oldcontext);
+ }
+
+ /* stuff done on every call of the function */
+ funcctx = SRF_PERCALL_SETUP();
+
+ call_cntr = funcctx->call_cntr;
+ max_calls = funcctx->max_calls;
+ attinmeta = funcctx->attinmeta;
+
+ if (call_cntr < max_calls) /* do when there is more left to send */
+ {
+ char **values;
+ HeapTuple tuple;
+ Datum result;
+ int2vector *stakeys;
+ Oid relid;
+ double bucket_volume = 1.0;
+ StringInfo bufs;
+
+ char *format;
+ int i;
+
+ Oid *outfuncs;
+ FmgrInfo *fmgrinfo;
+
+ MVSerializedHistogram histogram;
+ MVSerializedBucket bucket;
+
+ histogram = (MVSerializedHistogram)funcctx->user_fctx;
+
+ Assert(call_cntr < histogram->nbuckets);
+
+ bucket = histogram->buckets[call_cntr];
+
+ stakeys = find_mv_attnums(mvoid, &relid);
+
+ /*
+ * The scalar values will be formatted directly, using snprintf.
+ *
+ * The 'array' values will be formatted through StringInfo.
+ */
+ values = (char **) palloc0(9 * sizeof(char *));
+ bufs = (StringInfo) palloc0(9 * sizeof(StringInfoData));
+
+ values[0] = (char *) palloc(64 * sizeof(char));
+
+ initStringInfo(&bufs[1]); /* lower boundaries */
+ initStringInfo(&bufs[2]); /* upper boundaries */
+ initStringInfo(&bufs[3]); /* nulls-only */
+ initStringInfo(&bufs[4]); /* lower inclusive */
+ initStringInfo(&bufs[5]); /* upper inclusive */
+
+ values[6] = (char *) palloc(64 * sizeof(char));
+ values[7] = (char *) palloc(64 * sizeof(char));
+ values[8] = (char *) palloc(64 * sizeof(char));
+
+ /* we need to do this only when printing the actual values */
+ outfuncs = (Oid*)palloc0(sizeof(Oid) * histogram->ndimensions);
+ fmgrinfo = (FmgrInfo*)palloc0(sizeof(FmgrInfo) * histogram->ndimensions);
+
+ /*
+ * lookup output functions for all histogram dimensions
+ *
+ * XXX This might be done once in the first call and stored in user_fctx.
+ */
+ for (i = 0; i < histogram->ndimensions; i++)
+ {
+ bool isvarlena;
+
+ getTypeOutputInfo(get_atttype(relid, stakeys->values[i]),
+ &outfuncs[i], &isvarlena);
+
+ fmgr_info(outfuncs[i], &fmgrinfo[i]);
+ }
+
+ snprintf(values[0], 64, "%d", call_cntr); /* bucket ID */
+
+ /* for the arrays of lower/upper boundaries, formatted according to otype */
+ for (i = 0; i < histogram->ndimensions; i++)
+ {
+ Datum *vals = histogram->values[i];
+
+ uint16 minidx = bucket->min[i];
+ uint16 maxidx = bucket->max[i];
+
+ /* compute bucket volume, using distinct values as a measure
+ *
+ * XXX Not really sure what to do for NULL dimensions here, so let's
+ * simply count them as '1'.
+ */
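+ /* e.g. (illustrative) minidx = 2, maxidx = 5 and nvalues = 11 give
+ * (5 - 2 + 1) / 10 = 0.4 for this dimension */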
+ bucket_volume
+ *= (double)(maxidx - minidx + 1) / (histogram->nvalues[i]-1);
+
+ if (i == 0)
+ format = "{%s"; /* fist dimension */
+ else if (i < (histogram->ndimensions - 1))
+ format = ", %s"; /* medium dimensions */
+ else
+ format = ", %s}"; /* last dimension */
+
+ appendStringInfo(&bufs[3], format, bucket->nullsonly[i] ? "t" : "f");
+ appendStringInfo(&bufs[4], format, bucket->min_inclusive[i] ? "t" : "f");
+ appendStringInfo(&bufs[5], format, bucket->max_inclusive[i] ? "t" : "f");
+
+ /* for a NULL-only dimension, simply print NULL and continue */
+ if (bucket->nullsonly[i])
+ {
+ if (i == 0)
+ format = "{%s";
+ else if (i < (histogram->ndimensions - 1))
+ format = ", %s";
+ else
+ format = ", %s}";
+
+ appendStringInfo(&bufs[1], format, "NULL");
+ appendStringInfo(&bufs[2], format, "NULL");
+
+ continue;
+ }
+
+ /* otherwise we really need to format the value */
+ switch (otype)
+ {
+ case OUTPUT_FORMAT_RAW: /* actual boundary values */
+
+ if (i == 0)
+ format = "{%s";
+ else if (i < (histogram->ndimensions - 1))
+ format = ", %s";
+ else
+ format = ", %s}";
+
+ appendStringInfo(&bufs[1], format,
+ OutputFunctionCall(&fmgrinfo[i], vals[minidx]));
+
+ appendStringInfo(&bufs[2], format,
+ OutputFunctionCall(&fmgrinfo[i], vals[maxidx]));
+
+ break;
+
+ case OUTPUT_FORMAT_INDEXES: /* indexes into deduplicated arrays */
+
+ if (i == 0)
+ format = "{%d";
+ else if (i < (histogram->ndimensions - 1))
+ format = ", %d";
+ else
+ format = ", %d}";
+
+ appendStringInfo(&bufs[1], format, minidx);
+
+ appendStringInfo(&bufs[2], format, maxidx);
+
+ break;
+
+ case OUTPUT_FORMAT_DISTINCT: /* distinct arrays as measure */
+
+ if (i == 0)
+ format = "{%f";
+ else if (i < (histogram->ndimensions - 1))
+ format = ", %f";
+ else
+ format = ", %f}";
+
+ appendStringInfo(&bufs[1], format,
+ (minidx * 1.0 / (histogram->nvalues[i]-1)));
+
+ appendStringInfo(&bufs[2], format,
+ (maxidx * 1.0 / (histogram->nvalues[i]-1)));
+
+ break;
+
+ default:
+ elog(ERROR, "unknown output type: %d", otype);
+ }
+ }
+
+ values[1] = bufs[1].data;
+ values[2] = bufs[2].data;
+ values[3] = bufs[3].data;
+ values[4] = bufs[4].data;
+ values[5] = bufs[5].data;
+
+ snprintf(values[6], 64, "%f", bucket->ntuples); /* frequency */
+ snprintf(values[7], 64, "%f", bucket->ntuples / bucket_volume); /* density */
+ snprintf(values[8], 64, "%f", bucket_volume); /* volume (as a fraction) */
+
+ /* build a tuple */
+ tuple = BuildTupleFromCStrings(attinmeta, values);
+
+ /* make the tuple into a datum */
+ result = HeapTupleGetDatum(tuple);
+
+ /* clean up (this is not really necessary) */
+ pfree(values[0]);
+ pfree(values[6]);
+ pfree(values[7]);
+ pfree(values[8]);
+
+ resetStringInfo(&bufs[1]);
+ resetStringInfo(&bufs[2]);
+ resetStringInfo(&bufs[3]);
+ resetStringInfo(&bufs[4]);
+ resetStringInfo(&bufs[5]);
+
+ pfree(bufs);
+ pfree(values);
+
+ SRF_RETURN_NEXT(funcctx, result);
+ }
+ else /* do when there is no more left */
+ {
+ SRF_RETURN_DONE(funcctx);
+ }
+}
+
+#ifdef DEBUG_MVHIST
+/*
+ * prints debugging info about matched histogram buckets (full/partial)
+ *
+ * XXX Currently works only for INT data type.
+ */
+void
+debug_histogram_matches(MVSerializedHistogram mvhist, char *matches)
+{
+ int i, j;
+
+ float ffull = 0, fpartial = 0;
+ int nfull = 0, npartial = 0;
+
+ StringInfoData buf;
+
+ initStringInfo(&buf);
+
+ for (i = 0; i < mvhist->nbuckets; i++)
+ {
+ MVSerializedBucket bucket = mvhist->buckets[i];
+
+ if (! matches[i])
+ continue;
+
+ /* increment the counters */
+ nfull += (matches[i] == MVSTATS_MATCH_FULL) ? 1 : 0;
+ npartial += (matches[i] == MVSTATS_MATCH_PARTIAL) ? 1 : 0;
+
+ /* and also update the frequencies */
+ ffull += (matches[i] == MVSTATS_MATCH_FULL) ? bucket->ntuples : 0;
+ fpartial += (matches[i] == MVSTATS_MATCH_PARTIAL) ? bucket->ntuples : 0;
+
+ resetStringInfo(&buf);
+
+ /* build ranges for all the dimensions */
+ for (j = 0; j < mvhist->ndimensions; j++)
+ {
+ appendStringInfo(&buf, "[%d %d]",
+ DatumGetInt32(mvhist->values[j][bucket->min[j]]),
+ DatumGetInt32(mvhist->values[j][bucket->max[j]]));
+ }
+
+ elog(WARNING, "bucket %d %s => %d [%f]", i, buf.data, matches[i], bucket->ntuples);
+ }
+
+ elog(WARNING, "full=%f partial=%f (%f)", ffull, fpartial, (ffull + 0.5 * fpartial));
+}
+#endif
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 2c22d31..b693f36 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -2109,9 +2109,9 @@ describeOneTableDetails(const char *schemaname,
{
printfPQExpBuffer(&buf,
"SELECT oid, stanamespace::regnamespace AS nsp, staname, stakeys,\n"
- " deps_enabled, mcv_enabled,\n"
- " deps_built, mcv_built,\n"
- " mcv_max_items,\n"
+ " deps_enabled, mcv_enabled, hist_enabled,\n"
+ " deps_built, mcv_built, hist_built,\n"
+ " mcv_max_items, hist_max_buckets,\n"
" (SELECT string_agg(attname::text,', ')\n"
" FROM ((SELECT unnest(stakeys) AS attnum) s\n"
" JOIN pg_attribute a ON (starelid = a.attrelid and a.attnum = s.attnum))) AS attnums\n"
@@ -2154,8 +2154,17 @@ describeOneTableDetails(const char *schemaname,
first = false;
}
+ if (!strcmp(PQgetvalue(result, i, 6), "t"))
+ {
+ if (! first)
+ appendPQExpBuffer(&buf, ", histogram");
+ else
+ appendPQExpBuffer(&buf, "(histogram");
+ first = false;
+ }
+
appendPQExpBuffer(&buf, ") ON (%s)",
- PQgetvalue(result, i, 9));
+ PQgetvalue(result, i, 12));
printTableAddFooter(&cont, buf.data);
}
diff --git a/src/include/catalog/pg_mv_statistic.h b/src/include/catalog/pg_mv_statistic.h
index 3529b03..7020772 100644
--- a/src/include/catalog/pg_mv_statistic.h
+++ b/src/include/catalog/pg_mv_statistic.h
@@ -39,13 +39,16 @@ CATALOG(pg_mv_statistic,3381)
/* statistics requested to build */
bool deps_enabled; /* analyze dependencies? */
bool mcv_enabled; /* build MCV list? */
+ bool hist_enabled; /* build histogram? */
- /* MCV size */
+ /* histogram / MCV size */
int32 mcv_max_items; /* max MCV items */
+ int32 hist_max_buckets; /* max histogram buckets */
/* statistics that are available (if requested) */
bool deps_built; /* dependencies were built */
bool mcv_built; /* MCV list was built */
+ bool hist_built; /* histogram was built */
/* variable-length fields start here, but we allow direct access to stakeys */
int2vector stakeys; /* array of column keys */
@@ -53,6 +56,7 @@ CATALOG(pg_mv_statistic,3381)
#ifdef CATALOG_VARLEN
bytea stadeps; /* dependencies (serialized) */
bytea stamcv; /* MCV list (serialized) */
+ bytea stahist; /* MV histogram (serialized) */
#endif
} FormData_pg_mv_statistic;
@@ -68,18 +72,22 @@ typedef FormData_pg_mv_statistic *Form_pg_mv_statistic;
* compiler constants for pg_mv_statistic
* ----------------
*/
-#define Natts_pg_mv_statistic 12
+#define Natts_pg_mv_statistic 16
#define Anum_pg_mv_statistic_starelid 1
#define Anum_pg_mv_statistic_staname 2
#define Anum_pg_mv_statistic_stanamespace 3
#define Anum_pg_mv_statistic_staowner 4
#define Anum_pg_mv_statistic_deps_enabled 5
#define Anum_pg_mv_statistic_mcv_enabled 6
-#define Anum_pg_mv_statistic_mcv_max_items 7
-#define Anum_pg_mv_statistic_deps_built 8
-#define Anum_pg_mv_statistic_mcv_built 9
-#define Anum_pg_mv_statistic_stakeys 10
-#define Anum_pg_mv_statistic_stadeps 11
-#define Anum_pg_mv_statistic_stamcv 12
+#define Anum_pg_mv_statistic_hist_enabled 7
+#define Anum_pg_mv_statistic_mcv_max_items 8
+#define Anum_pg_mv_statistic_hist_max_buckets 9
+#define Anum_pg_mv_statistic_deps_built 10
+#define Anum_pg_mv_statistic_mcv_built 11
+#define Anum_pg_mv_statistic_hist_built 12
+#define Anum_pg_mv_statistic_stakeys 13
+#define Anum_pg_mv_statistic_stadeps 14
+#define Anum_pg_mv_statistic_stamcv 15
+#define Anum_pg_mv_statistic_stahist 16
#endif /* PG_MV_STATISTIC_H */
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
index f8ceabf..0ca4957 100644
--- a/src/include/catalog/pg_proc.h
+++ b/src/include/catalog/pg_proc.h
@@ -2674,6 +2674,10 @@ DATA(insert OID = 3376 ( pg_mv_stats_mcvlist_info PGNSP PGUID 12 1 0 0 0 f f f
DESCR("multi-variate statistics: MCV list info");
DATA(insert OID = 3373 ( pg_mv_mcv_items PGNSP PGUID 12 1 1000 0 0 f f f f t t i s 1 0 2249 "26" "{26,23,1009,1000,701}" "{i,o,o,o,o}" "{oid,index,values,nulls,frequency}" _null_ _null_ pg_mv_mcv_items _null_ _null_ _null_ ));
DESCR("details about MCV list items");
+DATA(insert OID = 3375 ( pg_mv_stats_histogram_info PGNSP PGUID 12 1 0 0 0 f f f f t f i s 1 0 25 "17" _null_ _null_ _null_ _null_ _null_ pg_mv_stats_histogram_info _null_ _null_ _null_ ));
+DESCR("multi-variate statistics: histogram info");
+DATA(insert OID = 3374 ( pg_mv_histogram_buckets PGNSP PGUID 12 1 1000 0 0 f f f f t t i s 2 0 2249 "26 23" "{26,23,23,1009,1009,1000,1000,1000,701,701,701}" "{i,i,o,o,o,o,o,o,o,o,o}" "{oid,otype,index,minvals,maxvals,nullsonly,mininclusive,maxinclusive,frequency,density,bucket_volume}" _null_ _null_ pg_mv_histogram_buckets _null_ _null_ _null_ ));
+DESCR("details about histogram buckets");
DATA(insert OID = 1928 ( pg_stat_get_numscans PGNSP PGUID 12 1 0 0 0 f f f f t f s r 1 0 20 "26" _null_ _null_ _null_ _null_ _null_ pg_stat_get_numscans _null_ _null_ _null_ ));
DESCR("statistics: number of scans done for table/index");
diff --git a/src/include/nodes/relation.h b/src/include/nodes/relation.h
index 2bcd582..8c50bfb 100644
--- a/src/include/nodes/relation.h
+++ b/src/include/nodes/relation.h
@@ -654,10 +654,12 @@ typedef struct MVStatisticInfo
/* enabled statistics */
bool deps_enabled; /* functional dependencies enabled */
bool mcv_enabled; /* MCV list enabled */
+ bool hist_enabled; /* histogram enabled */
/* built/available statistics */
bool deps_built; /* functional dependencies built */
bool mcv_built; /* MCV list built */
+ bool hist_built; /* histogram built */
/* columns in the statistics (attnums) */
int2vector *stakeys; /* attnums of the columns covered */
diff --git a/src/include/utils/mvstats.h b/src/include/utils/mvstats.h
index ce7c3ad..6708139 100644
--- a/src/include/utils/mvstats.h
+++ b/src/include/utils/mvstats.h
@@ -93,6 +93,123 @@ typedef MCVListData *MCVList;
#define MVSTAT_MCVLIST_MAX_ITEMS 8192 /* max items in MCV list */
/*
+ * Multivariate histograms
+ */
+typedef struct MVBucketData {
+
+ /* Frequency of this bucket. */
+ float ntuples; /* frequency of the tuples in this bucket */
+
+ /*
+ * Information about dimensions being NULL-only. Not yet used.
+ */
+ bool *nullsonly;
+
+ /* lower boundaries - values and information about the inequalities */
+ Datum *min;
+ bool *min_inclusive;
+
+ /* upper boundaries - values and information about the inequalities */
+ Datum *max;
+ bool *max_inclusive;
+
+ /* used when building the histogram (not serialized/deserialized) */
+ void *build_data;
+
+} MVBucketData;
+
+typedef MVBucketData *MVBucket;
+
+
+typedef struct MVHistogramData {
+
+ uint32 magic; /* magic constant marker */
+ uint32 type; /* type of histogram (BASIC) */
+ uint32 nbuckets; /* number of buckets (buckets array) */
+ uint32 ndimensions; /* number of dimensions */
+
+ MVBucket *buckets; /* array of buckets */
+
+} MVHistogramData;
+
+typedef MVHistogramData *MVHistogram;
+
+/*
+ * Histogram in a partially serialized form, with deduplicated boundary
+ * values etc.
+ *
+ * TODO add more detailed description here
+ */
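+/*
+ * For example (illustrative): if the deduplicated values for dimension 0 are
+ * values[0] = {10, 20, 30} (so nvalues[0] = 3), a bucket with min[0] = 0 and
+ * max[0] = 2 covers the range [10, 30] in that dimension.
+ */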
+
+typedef struct MVSerializedBucketData {
+
+ /* Frequency of this bucket. */
+ float ntuples; /* frequency of the tuples in this bucket */
+
+ /*
+ * Information about dimensions being NULL-only. Not yet used.
+ */
+ bool *nullsonly;
+
+ /* indexes of lower boundaries - values and information about the
+ * inequalities (exclusive vs. inclusive) */
+ uint16 *min;
+ bool *min_inclusive;
+
+ /* indexes of upper boundaries - values and information about the
+ * inequalities (exclusive vs. inclusive) */
+ uint16 *max;
+ bool *max_inclusive;
+
+} MVSerializedBucketData;
+
+typedef MVSerializedBucketData *MVSerializedBucket;
+
+typedef struct MVSerializedHistogramData {
+
+ uint32 magic; /* magic constant marker */
+ uint32 type; /* type of histogram (BASIC) */
+ uint32 nbuckets; /* number of buckets (buckets array) */
+ uint32 ndimensions; /* number of dimensions */
+
+ /*
+ * keep this the same as in MVHistogramData, because deserialization
+ * relies on the same offsets
+ */
+ MVSerializedBucket *buckets; /* array of buckets */
+
+ /*
+ * serialized boundary values, one array per dimension, deduplicated
+ * (the min/max indexes point into these arrays)
+ */
+ int *nvalues;
+ Datum **values;
+
+} MVSerializedHistogramData;
+
+typedef MVSerializedHistogramData *MVSerializedHistogram;
+
+
+/* used to flag stats serialized to bytea */
+#define MVSTAT_HIST_MAGIC 0x7F8C5670 /* marks serialized bytea */
+#define MVSTAT_HIST_TYPE_BASIC 1 /* basic histogram type */
+
+/*
+ * Limits used for max_buckets option, i.e. we're always guaranteed
+ * to have space for at least MVSTAT_HIST_MIN_BUCKETS, and we cannot
+ * have more than MVSTAT_HIST_MAX_BUCKETS buckets.
+ *
+ * This is just a boundary for the 'max' threshold - the actual
+ * histogram may use fewer buckets than MVSTAT_HIST_MAX_BUCKETS.
+ *
+ * TODO The MVSTAT_HIST_MIN_BUCKETS should be related to the number of
+ * attributes (MVSTATS_MAX_DIMENSIONS) because of NULL-buckets.
+ * There should be at least 2^N buckets, otherwise we may be unable
+ * to build the NULL buckets.
+ */
+#define MVSTAT_HIST_MIN_BUCKETS 128 /* min number of buckets */
+#define MVSTAT_HIST_MAX_BUCKETS 16384 /* max number of buckets */
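+/*
+ * For example, with MVSTATS_MAX_DIMENSIONS = 8 the NULL-bucket step may need
+ * up to 2^8 = 256 buckets, which already exceeds the current minimum of 128
+ * buckets - hence the TODO above.
+ */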
+
+/*
* TODO Maybe fetching the histogram/MCV list separately is inefficient?
* Consider adding a single `fetch_stats` method, fetching all
* stats specified using flags (or something like that).
@@ -100,20 +217,25 @@ typedef MCVListData *MCVList;
MVDependencies load_mv_dependencies(Oid mvoid);
MCVList load_mv_mcvlist(Oid mvoid);
+MVSerializedHistogram load_mv_histogram(Oid mvoid);
bytea * serialize_mv_dependencies(MVDependencies dependencies);
bytea * serialize_mv_mcvlist(MCVList mcvlist, int2vector *attrs,
VacAttrStats **stats);
+bytea * serialize_mv_histogram(MVHistogram histogram, int2vector *attrs,
+ VacAttrStats **stats);
/* deserialization of stats (serialization is private to analyze) */
MVDependencies deserialize_mv_dependencies(bytea * data);
MCVList deserialize_mv_mcvlist(bytea * data);
+MVSerializedHistogram deserialize_mv_histogram(bytea * data);
/*
* Returns index of the attribute number within the vector (i.e. a
* dimension within the stats).
*/
int mv_get_index(AttrNumber varattno, int2vector * stakeys);
int2vector* find_mv_attnums(Oid mvoid, Oid *relid);
@@ -122,6 +244,8 @@ extern Datum pg_mv_stats_dependencies_info(PG_FUNCTION_ARGS);
extern Datum pg_mv_stats_dependencies_show(PG_FUNCTION_ARGS);
extern Datum pg_mv_stats_mcvlist_info(PG_FUNCTION_ARGS);
extern Datum pg_mv_mcvlist_items(PG_FUNCTION_ARGS);
+extern Datum pg_mv_stats_histogram_info(PG_FUNCTION_ARGS);
+extern Datum pg_mv_histogram_buckets(PG_FUNCTION_ARGS);
MVDependencies
build_mv_dependencies(int numrows, HeapTuple *rows, int2vector *attrs,
@@ -131,10 +255,20 @@ MCVList
build_mv_mcvlist(int numrows, HeapTuple *rows, int2vector *attrs,
VacAttrStats **stats, int *numrows_filtered);
+MVHistogram
+build_mv_histogram(int numrows, HeapTuple *rows, int2vector *attrs,
+ VacAttrStats **stats, int numrows_total);
+
void build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
int natts, VacAttrStats **vacattrstats);
-void update_mv_stats(Oid relid, MVDependencies dependencies, MCVList mcvlist,
+void update_mv_stats(Oid relid, MVDependencies dependencies,
+ MCVList mcvlist, MVHistogram histogram,
int2vector *attrs, VacAttrStats **stats);
+#ifdef DEBUG_MVHIST
+extern void debug_histogram_matches(MVSerializedHistogram mvhist, char *matches);
+#endif
+
+
#endif
diff --git a/src/test/regress/expected/mv_histogram.out b/src/test/regress/expected/mv_histogram.out
new file mode 100644
index 0000000..e830816
--- /dev/null
+++ b/src/test/regress/expected/mv_histogram.out
@@ -0,0 +1,207 @@
+-- data type passed by value
+CREATE TABLE mv_histogram (
+ a INT,
+ b INT,
+ c INT
+);
+-- unknown column
+CREATE STATISTICS s7 ON mv_histogram (unknown_column) WITH (histogram);
+ERROR: column "unknown_column" referenced in statistics does not exist
+-- single column
+CREATE STATISTICS s7 ON mv_histogram (a) WITH (histogram);
+ERROR: multivariate stats require 2 or more columns
+-- single column, duplicated
+CREATE STATISTICS s7 ON mv_histogram (a, a) WITH (histogram);
+ERROR: duplicate column name in statistics definition
+-- two columns, one duplicated
+CREATE STATISTICS s7 ON mv_histogram (a, a, b) WITH (histogram);
+ERROR: duplicate column name in statistics definition
+-- unknown option
+CREATE STATISTICS s7 ON mv_histogram (a, b, c) WITH (unknown_option);
+ERROR: unrecognized STATISTICS option "unknown_option"
+-- missing histogram statistics
+CREATE STATISTICS s7 ON mv_histogram (a, b, c) WITH (dependencies, max_buckets=200);
+ERROR: option 'histogram' is required by other option(s)
+-- invalid max_buckets value / too low
+CREATE STATISTICS s7 ON mv_histogram (a, b, c) WITH (histogram, max_buckets=10);
+ERROR: minimum number of buckets is 128
+-- invalid max_buckets value / too high
+CREATE STATISTICS s7 ON mv_histogram (a, b, c) WITH (histogram, max_buckets=100000);
+ERROR: maximum number of buckets is 16384
+-- correct command
+CREATE STATISTICS s7 ON mv_histogram (a, b, c) WITH (histogram);
+-- random data (no functional dependencies)
+INSERT INTO mv_histogram
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+ANALYZE mv_histogram;
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+ hist_enabled | hist_built
+--------------+------------
+ t | t
+(1 row)
+
+TRUNCATE mv_histogram;
+-- a => b, a => c, b => c
+INSERT INTO mv_histogram
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE mv_histogram;
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+ hist_enabled | hist_built
+--------------+------------
+ t | t
+(1 row)
+
+TRUNCATE mv_histogram;
+-- a => b, a => c
+INSERT INTO mv_histogram
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE mv_histogram;
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+ hist_enabled | hist_built
+--------------+------------
+ t | t
+(1 row)
+
+TRUNCATE mv_histogram;
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO mv_histogram
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX hist_idx ON mv_histogram (a, b);
+ANALYZE mv_histogram;
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+ hist_enabled | hist_built
+--------------+------------
+ t | t
+(1 row)
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mv_histogram WHERE a = 10 AND b = 5;
+ QUERY PLAN
+--------------------------------------------
+ Bitmap Heap Scan on mv_histogram
+ Recheck Cond: ((a = 10) AND (b = 5))
+ -> Bitmap Index Scan on hist_idx
+ Index Cond: ((a = 10) AND (b = 5))
+(4 rows)
+
+DROP TABLE mv_histogram;
+-- varlena type (text)
+CREATE TABLE mv_histogram (
+ a TEXT,
+ b TEXT,
+ c TEXT
+);
+CREATE STATISTICS s8 ON mv_histogram (a, b, c) WITH (histogram);
+-- random data (no functional dependencies)
+INSERT INTO mv_histogram
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+ANALYZE mv_histogram;
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+ hist_enabled | hist_built
+--------------+------------
+ t | t
+(1 row)
+
+TRUNCATE mv_histogram;
+-- a => b, a => c, b => c
+INSERT INTO mv_histogram
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE mv_histogram;
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+ hist_enabled | hist_built
+--------------+------------
+ t | t
+(1 row)
+
+TRUNCATE mv_histogram;
+-- a => b, a => c
+INSERT INTO mv_histogram
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE mv_histogram;
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+ hist_enabled | hist_built
+--------------+------------
+ t | t
+(1 row)
+
+TRUNCATE mv_histogram;
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO mv_histogram
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX hist_idx ON mv_histogram (a, b);
+ANALYZE mv_histogram;
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+ hist_enabled | hist_built
+--------------+------------
+ t | t
+(1 row)
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mv_histogram WHERE a = '10' AND b = '5';
+ QUERY PLAN
+------------------------------------------------------------
+ Bitmap Heap Scan on mv_histogram
+ Recheck Cond: ((a = '10'::text) AND (b = '5'::text))
+ -> Bitmap Index Scan on hist_idx
+ Index Cond: ((a = '10'::text) AND (b = '5'::text))
+(4 rows)
+
+TRUNCATE mv_histogram;
+-- check explain (expect bitmap index scan, not plain index scan) with NULLs
+INSERT INTO mv_histogram
+ SELECT
+ (CASE WHEN i/10000 = 0 THEN NULL ELSE i/10000 END),
+ (CASE WHEN i/20000 = 0 THEN NULL ELSE i/20000 END),
+ (CASE WHEN i/40000 = 0 THEN NULL ELSE i/40000 END)
+ FROM generate_series(1,1000000) s(i);
+ANALYZE mv_histogram;
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+ hist_enabled | hist_built
+--------------+------------
+ t | t
+(1 row)
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mv_histogram WHERE a IS NULL AND b IS NULL;
+ QUERY PLAN
+---------------------------------------------------
+ Bitmap Heap Scan on mv_histogram
+ Recheck Cond: ((a IS NULL) AND (b IS NULL))
+ -> Bitmap Index Scan on hist_idx
+ Index Cond: ((a IS NULL) AND (b IS NULL))
+(4 rows)
+
+DROP TABLE mv_histogram;
+-- NULL values (mix of int and text columns)
+CREATE TABLE mv_histogram (
+ a INT,
+ b TEXT,
+ c INT,
+ d TEXT
+);
+CREATE STATISTICS s9 ON mv_histogram (a, b, c, d) WITH (histogram);
+INSERT INTO mv_histogram
+ SELECT
+ mod(i, 100),
+ (CASE WHEN mod(i, 200) = 0 THEN NULL ELSE mod(i,200) END),
+ mod(i, 400),
+ (CASE WHEN mod(i, 300) = 0 THEN NULL ELSE mod(i,600) END)
+ FROM generate_series(1,10000) s(i);
+ANALYZE mv_histogram;
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+ hist_enabled | hist_built
+--------------+------------
+ t | t
+(1 row)
+
+DROP TABLE mv_histogram;
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 3d55ffe..528ac36 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1375,7 +1375,9 @@ pg_mv_stats| SELECT n.nspname AS schemaname,
length(s.stadeps) AS depsbytes,
pg_mv_stats_dependencies_info(s.stadeps) AS depsinfo,
length(s.stamcv) AS mcvbytes,
- pg_mv_stats_mcvlist_info(s.stamcv) AS mcvinfo
+ pg_mv_stats_mcvlist_info(s.stamcv) AS mcvinfo,
+ length(s.stahist) AS histbytes,
+ pg_mv_stats_histogram_info(s.stahist) AS histinfo
FROM ((pg_mv_statistic s
JOIN pg_class c ON ((c.oid = s.starelid)))
LEFT JOIN pg_namespace n ON ((n.oid = c.relnamespace)));
diff --git a/src/test/regress/parallel_schedule b/src/test/regress/parallel_schedule
index 85d94f1..a885235 100644
--- a/src/test/regress/parallel_schedule
+++ b/src/test/regress/parallel_schedule
@@ -112,4 +112,4 @@ test: event_trigger
test: stats
# run tests of multivariate stats
-test: mv_dependencies mv_mcv
+test: mv_dependencies mv_mcv mv_histogram
diff --git a/src/test/regress/serial_schedule b/src/test/regress/serial_schedule
index 6584d73..2efdcd7 100644
--- a/src/test/regress/serial_schedule
+++ b/src/test/regress/serial_schedule
@@ -164,3 +164,4 @@ test: event_trigger
test: stats
test: mv_dependencies
test: mv_mcv
+test: mv_histogram
diff --git a/src/test/regress/sql/mv_histogram.sql b/src/test/regress/sql/mv_histogram.sql
new file mode 100644
index 0000000..27c2510
--- /dev/null
+++ b/src/test/regress/sql/mv_histogram.sql
@@ -0,0 +1,176 @@
+-- data type passed by value
+CREATE TABLE mv_histogram (
+ a INT,
+ b INT,
+ c INT
+);
+
+-- unknown column
+CREATE STATISTICS s7 ON mv_histogram (unknown_column) WITH (histogram);
+
+-- single column
+CREATE STATISTICS s7 ON mv_histogram (a) WITH (histogram);
+
+-- single column, duplicated
+CREATE STATISTICS s7 ON mv_histogram (a, a) WITH (histogram);
+
+-- two columns, one duplicated
+CREATE STATISTICS s7 ON mv_histogram (a, a, b) WITH (histogram);
+
+-- unknown option
+CREATE STATISTICS s7 ON mv_histogram (a, b, c) WITH (unknown_option);
+
+-- missing histogram statistics
+CREATE STATISTICS s7 ON mv_histogram (a, b, c) WITH (dependencies, max_buckets=200);
+
+-- invalid max_buckets value / too low
+CREATE STATISTICS s7 ON mv_histogram (a, b, c) WITH (histogram, max_buckets=10);
+
+-- invalid max_buckets value / too high
+CREATE STATISTICS s7 ON mv_histogram (a, b, c) WITH (histogram, max_buckets=100000);
+
+-- correct command
+CREATE STATISTICS s7 ON mv_histogram (a, b, c) WITH (histogram);
+
+-- random data (no functional dependencies)
+INSERT INTO mv_histogram
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+
+ANALYZE mv_histogram;
+
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+
+TRUNCATE mv_histogram;
+
+-- a => b, a => c, b => c
+INSERT INTO mv_histogram
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+
+ANALYZE mv_histogram;
+
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+
+TRUNCATE mv_histogram;
+
+-- a => b, a => c
+INSERT INTO mv_histogram
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE mv_histogram;
+
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+
+TRUNCATE mv_histogram;
+
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO mv_histogram
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX hist_idx ON mv_histogram (a, b);
+ANALYZE mv_histogram;
+
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mv_histogram WHERE a = 10 AND b = 5;
+
+DROP TABLE mv_histogram;
+
+-- varlena type (text)
+CREATE TABLE mv_histogram (
+ a TEXT,
+ b TEXT,
+ c TEXT
+);
+
+CREATE STATISTICS s8 ON mv_histogram (a, b, c) WITH (histogram);
+
+-- random data (no functional dependencies)
+INSERT INTO mv_histogram
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+
+ANALYZE mv_histogram;
+
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+
+TRUNCATE mv_histogram;
+
+-- a => b, a => c, b => c
+INSERT INTO mv_histogram
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+
+ANALYZE mv_histogram;
+
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+
+TRUNCATE mv_histogram;
+
+-- a => b, a => c
+INSERT INTO mv_histogram
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE mv_histogram;
+
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+
+TRUNCATE mv_histogram;
+
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO mv_histogram
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX hist_idx ON mv_histogram (a, b);
+ANALYZE mv_histogram;
+
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mv_histogram WHERE a = '10' AND b = '5';
+
+TRUNCATE mv_histogram;
+
+-- check explain (expect bitmap index scan, not plain index scan) with NULLs
+INSERT INTO mv_histogram
+ SELECT
+ (CASE WHEN i/10000 = 0 THEN NULL ELSE i/10000 END),
+ (CASE WHEN i/20000 = 0 THEN NULL ELSE i/20000 END),
+ (CASE WHEN i/40000 = 0 THEN NULL ELSE i/40000 END)
+ FROM generate_series(1,1000000) s(i);
+ANALYZE mv_histogram;
+
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mv_histogram WHERE a IS NULL AND b IS NULL;
+
+DROP TABLE mv_histogram;
+
+-- NULL values (mix of int and text columns)
+CREATE TABLE mv_histogram (
+ a INT,
+ b TEXT,
+ c INT,
+ d TEXT
+);
+
+CREATE STATISTICS s9 ON mv_histogram (a, b, c, d) WITH (histogram);
+
+INSERT INTO mv_histogram
+ SELECT
+ mod(i, 100),
+ (CASE WHEN mod(i, 200) = 0 THEN NULL ELSE mod(i,200) END),
+ mod(i, 400),
+ (CASE WHEN mod(i, 300) = 0 THEN NULL ELSE mod(i,600) END)
+ FROM generate_series(1,10000) s(i);
+
+ANALYZE mv_histogram;
+
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+
+DROP TABLE mv_histogram;
--
2.5.0
Attachment: 0004-multivariate-MCV-lists.patch (text/x-patch)
From 9786256d4dec9b3d6ea90ebbbeebd41568453b1b Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@pgaddict.com>
Date: Mon, 6 Apr 2015 16:52:15 +0200
Subject: [PATCH 4/9] multivariate MCV lists
- extends the pg_mv_statistic catalog (add 'mcv' fields)
- building the MCV lists during ANALYZE
- simple estimation while planning the queries
Includes regression tests, mostly equal to regression tests for
functional dependencies.
---
doc/src/sgml/ref/create_statistics.sgml | 43 ++
src/backend/catalog/system_views.sql | 4 +-
src/backend/commands/statscmds.c | 45 +-
src/backend/nodes/outfuncs.c | 2 +
src/backend/optimizer/path/clausesel.c | 814 +++++++++++++++++++++-
src/backend/optimizer/util/plancat.c | 4 +-
src/backend/utils/mvstats/Makefile | 2 +-
src/backend/utils/mvstats/README.mcv | 137 ++++
src/backend/utils/mvstats/README.stats | 89 ++-
src/backend/utils/mvstats/common.c | 133 +++-
src/backend/utils/mvstats/common.h | 17 +-
src/backend/utils/mvstats/mcv.c | 1120 +++++++++++++++++++++++++++++++
src/bin/psql/describe.c | 25 +-
src/include/catalog/pg_mv_statistic.h | 18 +-
src/include/catalog/pg_proc.h | 4 +
src/include/nodes/relation.h | 2 +
src/include/utils/mvstats.h | 69 +-
src/test/regress/expected/mv_mcv.out | 207 ++++++
src/test/regress/expected/rules.out | 4 +-
src/test/regress/parallel_schedule | 2 +-
src/test/regress/serial_schedule | 1 +
src/test/regress/sql/mv_mcv.sql | 178 +++++
22 files changed, 2847 insertions(+), 73 deletions(-)
create mode 100644 src/backend/utils/mvstats/README.mcv
create mode 100644 src/backend/utils/mvstats/mcv.c
create mode 100644 src/test/regress/expected/mv_mcv.out
create mode 100644 src/test/regress/sql/mv_mcv.sql
diff --git a/doc/src/sgml/ref/create_statistics.sgml b/doc/src/sgml/ref/create_statistics.sgml
index ff09fa5..d6973e8 100644
--- a/doc/src/sgml/ref/create_statistics.sgml
+++ b/doc/src/sgml/ref/create_statistics.sgml
@@ -132,6 +132,24 @@ CREATE STATISTICS [ IF NOT EXISTS ] <replaceable class="PARAMETER">statistics_na
</listitem>
</varlistentry>
+ <varlistentry>
+ <term><literal>max_mcv_items</> (<type>integer</>)</term>
+ <listitem>
+ <para>
+ Maximum number of MCV list items.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><literal>mcv</> (<type>boolean</>)</term>
+ <listitem>
+ <para>
+ Enables MCV list for the statistics.
+ </para>
+ </listitem>
+ </varlistentry>
+
</variablelist>
</refsect2>
@@ -177,6 +195,31 @@ EXPLAIN ANALYZE SELECT * FROM t1 WHERE (a = 1) AND (b = 2);
</programlisting>
</para>
+ <para>
+ Create table <structname>t2</> with two perfectly correlated columns
+ (containing identical data), and an MCV list on those columns:
+
+<programlisting>
+CREATE TABLE t2 (
+ a int,
+ b int
+);
+
+INSERT INTO t2 SELECT mod(i,100), mod(i,100)
+ FROM generate_series(1,1000000) s(i);
+
+CREATE STATISTICS s2 ON t2 (a, b) WITH (mcv);
+
+ANALYZE t2;
+
+-- valid combination (found in MCV)
+EXPLAIN ANALYZE SELECT * FROM t2 WHERE (a = 1) AND (b = 1);
+
+-- invalid combination (not found in MCV)
+EXPLAIN ANALYZE SELECT * FROM t2 WHERE (a = 1) AND (b = 2);
+</programlisting>
+ </para>
+
</refsect1>
<refsect1>
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 31dbb2c..5c40334 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -165,7 +165,9 @@ CREATE VIEW pg_mv_stats AS
S.staname AS staname,
S.stakeys AS attnums,
length(S.stadeps) as depsbytes,
- pg_mv_stats_dependencies_info(S.stadeps) as depsinfo
+ pg_mv_stats_dependencies_info(S.stadeps) as depsinfo,
+ length(S.stamcv) AS mcvbytes,
+ pg_mv_stats_mcvlist_info(S.stamcv) AS mcvinfo
FROM (pg_mv_statistic S JOIN pg_class C ON (C.oid = S.starelid))
LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace);
diff --git a/src/backend/commands/statscmds.c b/src/backend/commands/statscmds.c
index f43b053..c480fbe 100644
--- a/src/backend/commands/statscmds.c
+++ b/src/backend/commands/statscmds.c
@@ -70,7 +70,13 @@ CreateStatistics(CreateStatsStmt *stmt)
ObjectAddress parentobject, childobject;
/* by default build nothing */
- bool build_dependencies = false;
+ bool build_dependencies = false,
+ build_mcv = false;
+
+ int32 max_mcv_items = -1;
+
+ /* options required because of other options */
+ bool require_mcv = false;
Assert(IsA(stmt, CreateStatsStmt));
@@ -146,6 +152,29 @@ CreateStatistics(CreateStatsStmt *stmt)
if (strcmp(opt->defname, "dependencies") == 0)
build_dependencies = defGetBoolean(opt);
+ else if (strcmp(opt->defname, "mcv") == 0)
+ build_mcv = defGetBoolean(opt);
+ else if (strcmp(opt->defname, "max_mcv_items") == 0)
+ {
+ max_mcv_items = defGetInt32(opt);
+
+ /* this option requires 'mcv' to be enabled */
+ require_mcv = true;
+
+ /* sanity check */
+ if (max_mcv_items < MVSTAT_MCVLIST_MIN_ITEMS)
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("max number of MCV items must be at least %d",
+ MVSTAT_MCVLIST_MIN_ITEMS)));
+
+ else if (max_mcv_items > MVSTAT_MCVLIST_MAX_ITEMS)
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("max number of MCV items is %d",
+ MVSTAT_MCVLIST_MAX_ITEMS)));
+
+ }
else
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
@@ -154,10 +183,16 @@ CreateStatistics(CreateStatsStmt *stmt)
}
/* check that at least some statistics were requested */
- if (! build_dependencies)
+ if (! (build_dependencies || build_mcv))
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("no statistics type (dependencies, mcv) was requested")));
+
+ /* now do some checking of the options */
+ if (require_mcv && (! build_mcv))
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
- errmsg("no statistics type (dependencies) was requested")));
+ errmsg("option 'mcv' is required by other options(s)")));
/* sort the attnums and build int2vector */
qsort(attnums, numcols, sizeof(int16), compare_int16);
@@ -178,8 +213,12 @@ CreateStatistics(CreateStatsStmt *stmt)
values[Anum_pg_mv_statistic_stakeys -1] = PointerGetDatum(stakeys);
values[Anum_pg_mv_statistic_deps_enabled -1] = BoolGetDatum(build_dependencies);
+ values[Anum_pg_mv_statistic_mcv_enabled -1] = BoolGetDatum(build_mcv);
+
+ values[Anum_pg_mv_statistic_mcv_max_items -1] = Int32GetDatum(max_mcv_items);
nulls[Anum_pg_mv_statistic_stadeps -1] = true;
+ nulls[Anum_pg_mv_statistic_stamcv -1] = true;
/* insert the tuple into pg_mv_statistic */
mvstatrel = heap_open(MvStatisticRelationId, RowExclusiveLock);
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index 07206d7..333e24b 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -2162,9 +2162,11 @@ _outMVStatisticInfo(StringInfo str, const MVStatisticInfo *node)
/* enabled statistics */
WRITE_BOOL_FIELD(deps_enabled);
+ WRITE_BOOL_FIELD(mcv_enabled);
/* built/available statistics */
WRITE_BOOL_FIELD(deps_built);
+ WRITE_BOOL_FIELD(mcv_built);
}
static void
diff --git a/src/backend/optimizer/path/clausesel.c b/src/backend/optimizer/path/clausesel.c
index 5ab7f15..7fc0c49 100644
--- a/src/backend/optimizer/path/clausesel.c
+++ b/src/backend/optimizer/path/clausesel.c
@@ -15,6 +15,7 @@
#include "postgres.h"
#include "access/sysattr.h"
+#include "catalog/pg_collation.h"
#include "catalog/pg_operator.h"
#include "nodes/makefuncs.h"
#include "optimizer/clauses.h"
@@ -47,18 +48,39 @@ static void addRangeClause(RangeQueryClause **rqlist, Node *clause,
bool varonleft, bool isLTsel, Selectivity s2);
#define MV_CLAUSE_TYPE_FDEP 0x01
+#define MV_CLAUSE_TYPE_MCV 0x02
-static bool clause_is_mv_compatible(Node *clause, Index relid, AttrNumber *attnum);
+static bool clause_is_mv_compatible(Node *clause, Index relid, Bitmapset **attnums,
+ int type);
-static Bitmapset *collect_mv_attnums(List *clauses, Index relid);
+static Bitmapset *collect_mv_attnums(List *clauses, Index relid, int type);
-static int count_mv_attnums(List *clauses, Index relid);
+static int count_mv_attnums(List *clauses, Index relid, int type);
static int count_varnos(List *clauses, Index *relid);
static List *clauselist_apply_dependencies(PlannerInfo *root, List *clauses,
Index relid, List *stats);
+static MVStatisticInfo *choose_mv_statistics(List *mvstats, Bitmapset *attnums);
+
+static List *clauselist_mv_split(PlannerInfo *root, Index relid,
+ List *clauses, List **mvclauses,
+ MVStatisticInfo *mvstats, int types);
+
+static Selectivity clauselist_mv_selectivity(PlannerInfo *root,
+ List *clauses, MVStatisticInfo *mvstats);
+
+static Selectivity clauselist_mv_selectivity_mcvlist(PlannerInfo *root,
+ List *clauses, MVStatisticInfo *mvstats,
+ bool *fullmatch, Selectivity *lowsel);
+
+static int update_match_bitmap_mcvlist(PlannerInfo *root, List *clauses,
+ int2vector *stakeys, MCVList mcvlist,
+ int nmatches, char * matches,
+ Selectivity *lowsel, bool *fullmatch,
+ bool is_or);
+
static bool has_stats(List *stats, int type);
static List * find_stats(PlannerInfo *root, Index relid);
@@ -66,6 +88,13 @@ static List * find_stats(PlannerInfo *root, Index relid);
static bool stats_type_matches(MVStatisticInfo *stat, int type);
+/* used for merging bitmaps - AND (min), OR (max) */
+#define MAX(x, y) (((x) > (y)) ? (x) : (y))
+#define MIN(x, y) (((x) < (y)) ? (x) : (y))
+
+#define UPDATE_RESULT(m,r,isor) \
+ (m) = (isor) ? (MAX(m,r)) : (MIN(m,r))
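+
+/*
+ * For example, merging MATCH_FULL with MATCH_NONE yields MATCH_NONE for AND
+ * (MIN) and MATCH_FULL for OR (MAX), assuming MATCH_NONE < MATCH_FULL.
+ */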
+
/****************************************************************************
* ROUTINES TO COMPUTE SELECTIVITIES
****************************************************************************/
@@ -91,11 +120,13 @@ static bool stats_type_matches(MVStatisticInfo *stat, int type);
* to verify that suitable multivariate statistics exist.
*
* If we identify such multivariate statistics apply, we try to apply them.
- * Currently we only have (soft) functional dependencies, so we try to reduce
- * the list of clauses.
*
- * Then we remove the clauses estimated using multivariate stats, and process
- * the rest of the clauses using the regular per-column stats.
+ * First we try to reduce the list of clauses by applying (soft) functional
+ * dependencies, and then we try to estimate the selectivity of the reduced
+ * list of clauses using the multivariate MCV list.
+ *
+ * Finally we remove the portion of clauses estimated using multivariate stats,
+ * and process the rest of the clauses using the regular per-column stats.
*
* Currently, the only extra smarts we have is to recognize "range queries",
* such as "x > 34 AND x < 42". Clauses are recognized as possible range
@@ -172,12 +203,46 @@ clauselist_selectivity(PlannerInfo *root,
* that need to be estimated by other types of stats (MCV, histograms etc).
*/
if (has_stats(stats, MV_CLAUSE_TYPE_FDEP) &&
- (count_mv_attnums(clauses, relid) >= 2))
+ (count_mv_attnums(clauses, relid, MV_CLAUSE_TYPE_FDEP) >= 2))
{
clauses = clauselist_apply_dependencies(root, clauses, relid, stats);
}
/*
+ * Check that there are statistics with MCV list or histogram, and also the
+ * number of attributes covered by these types of statistics.
+ *
+ * If there are no such stats or not enough attributes, don't waste time
+ * with the multivariate code and simply skip to estimation using the
+ * regular per-column stats.
+ */
+ if (has_stats(stats, MV_CLAUSE_TYPE_MCV) &&
+ (count_mv_attnums(clauses, relid, MV_CLAUSE_TYPE_MCV) >= 2))
+ {
+ /* collect attributes from the compatible conditions */
+ Bitmapset *mvattnums = collect_mv_attnums(clauses, relid, MV_CLAUSE_TYPE_MCV);
+
+ /* and search for the statistic covering the most attributes */
+ MVStatisticInfo *mvstat = choose_mv_statistics(stats, mvattnums);
+
+ if (mvstat != NULL) /* we have matching statistics */
+ {
+ /* clauses compatible with multi-variate stats */
+ List *mvclauses = NIL;
+
+ /* split the clauselist into regular and mv-clauses */
+ clauses = clauselist_mv_split(root, relid, clauses, &mvclauses,
+ mvstat, MV_CLAUSE_TYPE_MCV);
+
+ /* we've chosen the statistics to match the clauses */
+ Assert(mvclauses != NIL);
+
+ /* compute the multivariate stats */
+ s1 *= clauselist_mv_selectivity(root, mvclauses, mvstat);
+ }
+ }
+
+ /*
* Initial scan over clauses. Anything that doesn't look like a potential
* rangequery clause gets multiplied into s1 and forgotten. Anything that
* does gets inserted into an rqlist entry.
@@ -834,32 +899,93 @@ clause_selectivity(PlannerInfo *root,
return s1;
}
+
+/*
+ * Estimate selectivity of clauses using multivariate statistics.
+ *
+ * Perform estimation of the clauses using an MCV list.
+ *
+ * This assumes all the clauses are compatible with the selected statistics
+ * (e.g. only reference columns covered by the statistics, use supported
+ * operator, etc.).
+ *
+ * TODO We may support some additional conditions, most importantly those
+ * matching multiple columns (e.g. "a = b" or "a < b").
+ *
+ * TODO Clamp the selectivity by min of the per-clause selectivities (i.e. the
+ * selectivity of the most restrictive clause), because that's the maximum
+ * we can ever get from an ANDed list of clauses. This would probably prevent
+ * issues with hitting too many buckets and low-precision histograms.
+ *
+ * TODO We may remember the lowest frequency in the MCV list, and then later use
+ * it as an upper bound for the selectivity (had there been a more
+ * frequent item, it'd be in the MCV list). This might improve cases with
+ * low-detail histograms.
+ *
+ * TODO We may also derive some additional boundaries for the selectivity from
+ * the MCV list, because
+ *
+ * (a) if we have a "full equality condition" (one equality condition on
+ * each column of the statistic) and we found a match in the MCV list,
+ * then this is the final selectivity (and pretty accurate),
+ *
+ * (b) if we have a "full equality condition" and we haven't found a match
+ * in the MCV list, then the selectivity is below the lowest frequency
+ * found in the MCV list,
+ *
+ * TODO When applying the clauses to the histogram/MCV list, we can do
+ * that from the most selective clauses first, because that'll
+ * eliminate the buckets/items sooner (so we'll be able to skip
+ * them without inspection, which is more expensive). But this
+ * requires really knowing the per-clause selectivities in advance,
+ * and that's not what we do now.
+ */
+static Selectivity
+clauselist_mv_selectivity(PlannerInfo *root, List *clauses, MVStatisticInfo *mvstats)
+{
+ bool fullmatch = false;
+
+ /*
+ * Lowest frequency in the MCV list (may be used as an upper bound
+ * for full equality conditions that did not match any MCV item).
+ */
+ Selectivity mcv_low = 0.0;
+
+ /* TODO Evaluate simple 1D selectivities, use the smallest one as
+ * an upper bound, product as lower bound, and sort the
+ * clauses in ascending order by selectivity (to optimize the
+ * MCV/histogram evaluation).
+ */
+
+ /* Evaluate the MCV selectivity */
+ return clauselist_mv_selectivity_mcvlist(root, clauses, mvstats,
+ &fullmatch, &mcv_low);
+}
+
/*
* Collect attributes from mv-compatible clauses.
*/
static Bitmapset *
-collect_mv_attnums(List *clauses, Index relid)
+collect_mv_attnums(List *clauses, Index relid, int types)
{
Bitmapset *attnums = NULL;
ListCell *l;
/*
- * Walk through the clauses and identify the ones we can estimate
- * using multivariate stats, and remember the relid/columns. We'll
- * then cross-check if we have suitable stats, and only if needed
- * we'll split the clauses into multivariate and regular lists.
+ * Walk through the clauses and identify the ones we can estimate using
+ * multivariate stats, and remember the relid/columns. We'll then
+ * cross-check if we have suitable stats, and only if needed we'll split
+ * the clauses into multivariate and regular lists.
*
- * For now we're only interested in RestrictInfo nodes with nested
- * OpExpr, using either a range or equality.
+ * For now we're only interested in RestrictInfo nodes with nested OpExpr,
+ * using either a range or equality.
*/
foreach (l, clauses)
{
- AttrNumber attnum;
Node *clause = (Node *) lfirst(l);
- /* ignore the result for now - we only need the info */
- if (clause_is_mv_compatible(clause, relid, &attnum))
- attnums = bms_add_member(attnums, attnum);
+ /* ignore the result here - we only need the attnums */
+ clause_is_mv_compatible(clause, relid, &attnums, types);
}
/*
@@ -880,10 +1006,10 @@ collect_mv_attnums(List *clauses, Index relid)
* Count the number of attributes in clauses compatible with multivariate stats.
*/
static int
-count_mv_attnums(List *clauses, Index relid)
+count_mv_attnums(List *clauses, Index relid, int type)
{
int c;
- Bitmapset *attnums = collect_mv_attnums(clauses, relid);
+ Bitmapset *attnums = collect_mv_attnums(clauses, relid, type);
c = bms_num_members(attnums);
@@ -913,9 +1039,183 @@ count_varnos(List *clauses, Index *relid)
return cnt;
}
+
+/*
+ * We're looking for statistics matching at least 2 attributes, referenced in
+ * clauses compatible with multivariate statistics. The current selection
+ * criterion is very simple - we choose the statistics referencing the most
+ * attributes.
+ *
+ * If there are multiple statistics referencing the same number of columns
+ * (from the clauses), the one with fewer source columns (as listed in the
+ * ADD STATISTICS when creating the statistics) wins. Otherwise the first one wins.
+ *
+ * This is a very simple criterion, and it has several weaknesses:
+ *
+ * (a) does not consider the accuracy of the statistics
+ *
+ * If there are two histograms built on the same set of columns, but one
+ * has 100 buckets and the other one has 1000 buckets (thus likely
+ * providing better estimates), this is not currently considered.
+ *
+ * (b) does not consider the type of statistics
+ *
+ * If there are three statistics - one containing just an MCV list, another
+ * one with just a histogram and a third one with both, we treat them equally.
+ *
+ * (c) does not consider the number of clauses
+ *
+ * As explained, only the number of referenced attributes counts, so if
+ * there are multiple clauses on a single attribute, this still counts as
+ * a single attribute.
+ *
+ * (d) does not consider type of condition
+ *
+ * Some clauses may work better with some statistics - for example equality
+ * clauses probably work better with MCV lists than with histograms. But
+ * IS [NOT] NULL conditions may often work better with histograms (thanks
+ * to NULL-buckets).
+ *
+ * So for example with five WHERE conditions
+ *
+ * WHERE (a = 1) AND (b = 1) AND (c = 1) AND (d = 1) AND (e = 1)
+ *
+ * and statistics on (a,b), (a,b,e) and (a,b,c,d), the last one will be selected
+ * as it references the most columns.
+ *
+ * Once we have selected the multivariate statistics, we split the list of
+ * clauses into two parts - conditions that are compatible with the selected
+ * stats, and conditions that will be estimated using simple statistics.
+ *
+ * From the example above, conditions
+ *
+ * (a = 1) AND (b = 1) AND (c = 1) AND (d = 1)
+ *
+ * will be estimated using the multivariate statistics (a,b,c,d) while the last
+ * condition (e = 1) will get estimated using the regular ones.
+ *
+ * There are various alternative selection criteria (e.g. counting conditions
+ * instead of just referenced attributes), but eventually the best option should
+ * be to combine multiple statistics. But that's much harder to do correctly.
+ *
+ * TODO Select multiple statistics and combine them when computing the estimate.
+ *
+ * TODO This will probably have to consider compatibility of clauses, because
+ * 'dependencies' will probably work only with equality clauses.
+ */
+static MVStatisticInfo *
+choose_mv_statistics(List *stats, Bitmapset *attnums)
+{
+ int i;
+ ListCell *lc;
+
+ MVStatisticInfo *choice = NULL;
+
+ int current_matches = 1; /* goal #1: maximize */
+ int current_dims = (MVSTATS_MAX_DIMENSIONS+1); /* goal #2: minimize */
+
+ /*
+ * Walk through the list of statistics, and for each one count the
+ * referenced attributes (encoded in the 'attnums' bitmap).
+ */
+ foreach (lc, stats)
+ {
+ MVStatisticInfo *info = (MVStatisticInfo *)lfirst(lc);
+
+ /* columns matching this statistics */
+ int matches = 0;
+
+ int2vector * attrs = info->stakeys;
+ int numattrs = attrs->dim1;
+
+ /* skip dependencies-only stats */
+ if (! info->mcv_built)
+ continue;
+
+ /* count columns covered by the statistics */
+ for (i = 0; i < numattrs; i++)
+ if (bms_is_member(attrs->values[i], attnums))
+ matches++;
+
+ /*
+ * Use this statistics when it improves the number of matches, or when
+ * it matches the same number of attributes but has fewer dimensions.
+ */
+ if ((matches > current_matches) ||
+ ((matches == current_matches) && (current_dims > numattrs)))
+ {
+ choice = info;
+ current_matches = matches;
+ current_dims = numattrs;
+ }
+ }
+
+ return choice;
+}
+
+
+/*
+ * This splits the clauses list into two parts - one containing clauses that
+ * will be evaluated using the chosen statistics, and the remaining clauses
+ * (either not mv-compatible, or not covered by the chosen statistics).
+ */
+static List *
+clauselist_mv_split(PlannerInfo *root, Index relid,
+ List *clauses, List **mvclauses,
+ MVStatisticInfo *mvstats, int types)
+{
+ int i;
+ ListCell *l;
+ List *non_mvclauses = NIL;
+
+ /* FIXME is there a better way to get info on int2vector? */
+ int2vector * attrs = mvstats->stakeys;
+ int numattrs = mvstats->stakeys->dim1;
+
+ Bitmapset *mvattnums = NULL;
+
+ /* build bitmap of attributes, so we can do bms_is_subset later */
+ for (i = 0; i < numattrs; i++)
+ mvattnums = bms_add_member(mvattnums, attrs->values[i]);
+
+ /* erase the list of mv-compatible clauses */
+ *mvclauses = NIL;
+
+ foreach (l, clauses)
+ {
+ bool match = false; /* by default not mv-compatible */
+ Bitmapset *attnums = NULL;
+ Node *clause = (Node *) lfirst(l);
+
+ if (clause_is_mv_compatible(clause, relid, &attnums, types))
+ {
+ /* are all the attributes part of the selected stats? */
+ if (bms_is_subset(attnums, mvattnums))
+ match = true;
+ }
+
+ /*
+ * The clause matches the selected stats, so add it to the list of
+ * mv-compatible clauses. Otherwise, keep it in the list of 'regular'
+ * clauses (that may be selected later).
+ */
+ if (match)
+ *mvclauses = lappend(*mvclauses, clause);
+ else
+ non_mvclauses = lappend(non_mvclauses, clause);
+ }
+
+ /*
+ * Return the remaining clauses, to be estimated by the caller using the
+ * regular per-column statistics.
+ */
+ return non_mvclauses;
+
+}
typedef struct
{
+ int types; /* types of statistics to consider */
Index varno; /* relid we're interested in */
Bitmapset *varattnos; /* attnums referenced by the clauses */
} mv_compatible_context;
@@ -933,23 +1233,66 @@ mv_compatible_walker(Node *node, mv_compatible_context *context)
{
if (node == NULL)
return false;
-
+
if (IsA(node, RestrictInfo))
{
RestrictInfo *rinfo = (RestrictInfo *) node;
-
+
/* Pseudoconstants are not really interesting here. */
if (rinfo->pseudoconstant)
return true;
-
+
/* clauses referencing multiple varnos are incompatible */
if (bms_membership(rinfo->clause_relids) != BMS_SINGLETON)
return true;
-
+
/* check the clause inside the RestrictInfo */
return mv_compatible_walker((Node*)rinfo->clause, (void *) context);
}
+ if (or_clause(node) || and_clause(node) || not_clause(node))
+ {
+ /*
+ * AND/OR/NOT-clauses are supported if all sub-clauses are supported
+ *
+ * TODO We might support mixed case, where some of the clauses are
+ * supported and some are not, and treat all supported subclauses
+ * as a single clause, compute its selectivity using mv stats,
+ * and compute the total selectivity using the current algorithm.
+ *
+ * TODO For RestrictInfo above an OR-clause, we might use the orclause
+ * with nested RestrictInfo - we won't have to call pull_varnos()
+ * for each clause, saving time.
+ *
+ * TODO Perhaps this needs a bit more thought for functional
+ * dependencies? Those don't quite work for NOT cases.
+ */
+ BoolExpr *expr = (BoolExpr *) node;
+ ListCell *lc;
+
+ foreach (lc, expr->args)
+ {
+ if (mv_compatible_walker((Node *) lfirst(lc), context))
+ return true;
+ }
+
+ return false;
+ }
+
+ if (IsA(node, NullTest))
+ {
+ NullTest* nt = (NullTest*)node;
+
+ /*
+ * Only simple (Var IS NULL) expressions supported for now. Maybe we could
+ * use examine_variable to fix this?
+ */
+ if (! IsA(nt->arg, Var))
+ return true;
+
+ return mv_compatible_walker((Node*)(nt->arg), context);
+ }
+
if (IsA(node, Var))
{
Var * var = (Var*)node;
@@ -1000,7 +1343,7 @@ mv_compatible_walker(Node *node, mv_compatible_context *context)
/* unsupported structure (two variables or so) */
if (! ok)
return true;
-
+
/*
* If it's not a "<" or ">" or "=" operator, just ignore the clause.
* Otherwise note the relid and attnum for the variable. This uses the
@@ -1010,10 +1353,18 @@ mv_compatible_walker(Node *node, mv_compatible_context *context)
switch (get_oprrest(expr->opno))
{
case F_EQSEL:
-
/* equality conditions are compatible with all statistics */
break;
+ case F_SCALARLTSEL:
+ case F_SCALARGTSEL:
+
+ /* not compatible with functional dependencies */
+ if (! (context->types & MV_CLAUSE_TYPE_MCV))
+ return true; /* terminate */
+
+ break;
+
default:
/* unknown estimator */
@@ -1024,11 +1375,11 @@ mv_compatible_walker(Node *node, mv_compatible_context *context)
return mv_compatible_walker((Node *) var, context);
}
-
+
/* Node not explicitly supported, so terminate */
return true;
}
-
+
/*
* Determines whether the clause is compatible with multivariate stats,
* and if it is, returns some additional information - varno (index
@@ -1047,10 +1398,11 @@ mv_compatible_walker(Node *node, mv_compatible_context *context)
* evaluate them using multivariate stats.
*/
static bool
-clause_is_mv_compatible(Node *clause, Index relid, AttrNumber *attnum)
+clause_is_mv_compatible(Node *clause, Index relid, Bitmapset **attnums, int types)
{
mv_compatible_context context;
+ context.types = types;
context.varno = relid;
context.varattnos = NULL; /* no attnums */
@@ -1058,7 +1410,7 @@ clause_is_mv_compatible(Node *clause, Index relid, AttrNumber *attnum)
return false;
/* remember the newly collected attnums */
- *attnum = bms_singleton_member(context.varattnos);
+ *attnums = bms_add_members(*attnums, context.varattnos);
return true;
}
@@ -1075,15 +1427,15 @@ fdeps_reduce_clauses(List *clauses, Index relid, Bitmapset *reduced_attnums)
foreach (lc, clauses)
{
- AttrNumber attnum = InvalidAttrNumber;
+ Bitmapset *attnums = NULL;
Node * clause = (Node*)lfirst(lc);
/* ignore clauses that are not compatible with functional dependencies */
- if (! clause_is_mv_compatible(clause, relid, &attnum))
+ if (! clause_is_mv_compatible(clause, relid, &attnums, MV_CLAUSE_TYPE_FDEP))
reduced_clauses = lappend(reduced_clauses, clause);
/* for equality clauses, only keep those not on reduced attributes */
- if (! bms_is_member(attnum, reduced_attnums))
+ if (! bms_is_subset(attnums, reduced_attnums))
reduced_clauses = lappend(reduced_clauses, clause);
}
@@ -1208,7 +1560,7 @@ clauselist_apply_dependencies(PlannerInfo *root, List *clauses,
return clauses;
/* collect attnums from clauses compatible with dependencies (equality) */
- clause_attnums = collect_mv_attnums(clauses, relid);
+ clause_attnums = collect_mv_attnums(clauses, relid, MV_CLAUSE_TYPE_FDEP);
/* decide which attnums may be eliminated */
reduced_attnums = fdeps_reduce_attnums(stats, clause_attnums);
@@ -1233,6 +1585,9 @@ stats_type_matches(MVStatisticInfo *stat, int type)
if ((type & MV_CLAUSE_TYPE_FDEP) && stat->deps_built)
return true;
+ if ((type & MV_CLAUSE_TYPE_MCV) && stat->mcv_built)
+ return true;
+
return false;
}
@@ -1266,3 +1621,392 @@ find_stats(PlannerInfo *root, Index relid)
return root->simple_rel_array[relid]->mvstatlist;
}
+
+/*
+ * Estimate selectivity of clauses using an MCV list.
+ *
+ * If there's no MCV list for the stats, the function returns 0.0.
+ *
+ * While computing the estimate, the function checks whether all the
+ * columns were matched with an equality condition. If that's the case,
+ * we can skip processing the histogram, as there can be no rows in
+ * it with the same values - all the rows matching the condition are
+ * represented by the MCV item. This can only happen with equality
+ * on all the attributes.
+ *
+ * The algorithm works like this:
+ *
+ * 1) mark all items as 'match'
+ * 2) walk through all the clauses
+ * 3) for a particular clause, walk through all the items
+ * 4) skip items that are already 'no match'
+ * 5) check clause for items that still match
+ * 6) sum frequencies for items to get selectivity
+ *
+ * The function also returns the frequency of the least frequent item
+ * on the MCV list, which may be useful for clamping the estimate from the
+ * histogram (all items not present in the MCV list are less frequent).
+ * This however seems useful only for cases with conditions on all
+ * attributes.
+ *
+ * TODO This only handles AND-ed clauses, but it might work for OR-ed
+ * lists too - it just needs to reverse the logic a bit. I.e. start
+ * with 'no match' for all items, and mark the items as a match
+ * as the clauses are processed (and skip items that are 'match').
+ */
+static Selectivity
+clauselist_mv_selectivity_mcvlist(PlannerInfo *root, List *clauses,
+ MVStatisticInfo *mvstats, bool *fullmatch,
+ Selectivity *lowsel)
+{
+ int i;
+ Selectivity s = 0.0;
+ Selectivity u = 0.0;
+
+ MCVList mcvlist = NULL;
+ int nmatches = 0;
+
+ /* match/mismatch bitmap for each MCV item */
+ char * matches = NULL;
+
+ Assert(clauses != NIL);
+ Assert(list_length(clauses) >= 2);
+
+ /* there's no MCV list built yet */
+ if (! mvstats->mcv_built)
+ return 0.0;
+
+ mcvlist = load_mv_mcvlist(mvstats->mvoid);
+
+ Assert(mcvlist != NULL);
+ Assert(mcvlist->nitems > 0);
+
+ /* by default all the MCV items match the clauses fully */
+ matches = palloc0(sizeof(char) * mcvlist->nitems);
+ memset(matches, MVSTATS_MATCH_FULL, sizeof(char)*mcvlist->nitems);
+
+ /* number of matching MCV items */
+ nmatches = mcvlist->nitems;
+
+ nmatches = update_match_bitmap_mcvlist(root, clauses,
+ mvstats->stakeys, mcvlist,
+ nmatches, matches,
+ lowsel, fullmatch, false);
+
+ /* sum frequencies for all the matching MCV items */
+ for (i = 0; i < mcvlist->nitems; i++)
+ {
+ /* used to 'scale' for MCV lists not covering all tuples */
+ u += mcvlist->items[i]->frequency;
+
+ if (matches[i] != MVSTATS_MATCH_NONE)
+ s += mcvlist->items[i]->frequency;
+ }
+
+ pfree(matches);
+ pfree(mcvlist);
+
+ return s*u;
+}
+
+/*
+ * Evaluate clauses using the MCV list, and update the match bitmap.
+ *
+ * The bitmap may be already partially set, so this is really a way to
+ * combine results of several clause lists - either when computing
+ * conditional probability P(A|B) or a combination of AND/OR clauses.
+ *
+ * TODO This works with 'bitmap' where each bit is represented as a char,
+ * which is slightly wasteful. Instead, we could use a regular
+ * bitmap, reducing the size to ~1/8. Another thing is merging the
+ * bitmaps using & and |, which might be faster than min/max.
+ */
+static int
+update_match_bitmap_mcvlist(PlannerInfo *root, List *clauses,
+ int2vector *stakeys, MCVList mcvlist,
+ int nmatches, char * matches,
+ Selectivity *lowsel, bool *fullmatch,
+ bool is_or)
+{
+ int i;
+ ListCell * l;
+
+ Bitmapset *eqmatches = NULL; /* attributes with equality matches */
+
+ /* The bitmap may be partially built. */
+ Assert(nmatches >= 0);
+ Assert(nmatches <= mcvlist->nitems);
+ Assert(clauses != NIL);
+ Assert(list_length(clauses) >= 1);
+ Assert(mcvlist != NULL);
+ Assert(mcvlist->nitems > 0);
+
+ /* No possible matches (only works for AND-ed clauses) */
+ if (((nmatches == 0) && (! is_or)) ||
+ ((nmatches == mcvlist->nitems) && is_or))
+ return nmatches;
+
+ /*
+ * find the lowest frequency in the MCV list
+ *
+ * We need to do that here, because we do various tricks in the following
+ * code - skipping items already ruled out, etc.
+ *
+ * XXX A loop is necessary because the MCV list is not sorted by frequency.
+ */
+ *lowsel = 1.0;
+ for (i = 0; i < mcvlist->nitems; i++)
+ {
+ MCVItem item = mcvlist->items[i];
+
+ if (item->frequency < *lowsel)
+ *lowsel = item->frequency;
+ }
+
+ /*
+ * Loop through the list of clauses, and for each of them evaluate
+ * all the MCV items not yet eliminated by the preceding clauses.
+ */
+ foreach (l, clauses)
+ {
+ Node * clause = (Node*)lfirst(l);
+
+ /* if it's a RestrictInfo, then extract the clause */
+ if (IsA(clause, RestrictInfo))
+ clause = (Node*)((RestrictInfo*)clause)->clause;
+
+ /* if there are no remaining matches possible, we can stop */
+ if (((nmatches == 0) && (! is_or)) ||
+ ((nmatches == mcvlist->nitems) && is_or))
+ break;
+
+ /* it's either an OpExpr, a NullTest, or an AND/OR clause */
+ if (is_opclause(clause))
+ {
+ OpExpr *expr = (OpExpr*)clause;
+ bool varonleft = true;
+ bool ok;
+ FmgrInfo opproc;
+
+ /* get procedure computing operator selectivity */
+ RegProcedure oprrest = get_oprrest(expr->opno);
+
+ fmgr_info(get_opcode(expr->opno), &opproc);
+
+ ok = (NumRelids(clause) == 1) &&
+ (is_pseudo_constant_clause(lsecond(expr->args)) ||
+ (varonleft = false,
+ is_pseudo_constant_clause(linitial(expr->args))));
+
+ if (ok)
+ {
+
+ FmgrInfo gtproc;
+ Var * var = (varonleft) ? linitial(expr->args) : lsecond(expr->args);
+ Const * cst = (varonleft) ? lsecond(expr->args) : linitial(expr->args);
+ bool isgt = (! varonleft);
+
+ TypeCacheEntry *typecache
+ = lookup_type_cache(var->vartype, TYPECACHE_GT_OPR);
+
+ /* FIXME proper matching attribute to dimension */
+ int idx = mv_get_index(var->varattno, stakeys);
+
+ fmgr_info(get_opcode(typecache->gt_opr), >proc);
+
+ /*
+ * Walk through the MCV items and evaluate the current clause. We can
+ * skip items that were already ruled out, and terminate if there are
+ * no remaining MCV items that might possibly match.
+ */
+ for (i = 0; i < mcvlist->nitems; i++)
+ {
+ bool mismatch = false;
+ MCVItem item = mcvlist->items[i];
+
+ /*
+ * If there are no more matches (AND) or no remaining unmatched
+ * items (OR), we can stop processing this clause.
+ */
+ if (((nmatches == 0) && (! is_or)) ||
+ ((nmatches == mcvlist->nitems) && is_or))
+ break;
+
+ /*
+ * For AND-lists, we can also mark NULL items as 'no match' (and
+ * then skip them). For OR-lists this is not possible.
+ */
+ if ((! is_or) && item->isnull[idx])
+ matches[i] = MVSTATS_MATCH_NONE;
+
+ /* skip MCV items that were already ruled out */
+ if ((! is_or) && (matches[i] == MVSTATS_MATCH_NONE))
+ continue;
+ else if (is_or && (matches[i] == MVSTATS_MATCH_FULL))
+ continue;
+
+ switch (oprrest)
+ {
+ case F_EQSEL:
+ /*
+ * We don't care about isgt in equality, because it does not
+ * matter whether it's (var = const) or (const = var).
+ */
+ mismatch = ! DatumGetBool(FunctionCall2Coll(&opproc,
+ DEFAULT_COLLATION_OID,
+ cst->constvalue,
+ item->values[idx]));
+
+ if (! mismatch)
+ eqmatches = bms_add_member(eqmatches, idx);
+
+ break;
+
+ case F_SCALARLTSEL: /* column < constant */
+ case F_SCALARGTSEL: /* column > constant */
+
+ /*
+ * Evaluate the inequality for the MCV item value (unlike a histogram
+ * bucket, a single value either matches the clause or not).
+ */
+ mismatch = DatumGetBool(FunctionCall2Coll(&opproc,
+ DEFAULT_COLLATION_OID,
+ cst->constvalue,
+ item->values[idx]));
+
+ /* invert the result if isgt=true */
+ mismatch = (isgt) ? (! mismatch) : mismatch;
+ break;
+ }
+
+ /* XXX The conditions on matches[i] are not needed, as we
+ * skip MCV items that can't become true/false, depending
+ * on the current flag. See beginning of the loop over
+ * MCV items.
+ */
+
+ if ((is_or) && (matches[i] == MVSTATS_MATCH_NONE) && (! mismatch))
+ {
+ /* OR - was MATCH_NONE, but will be MATCH_FULL */
+ matches[i] = MVSTATS_MATCH_FULL;
+ ++nmatches;
+ continue;
+ }
+ else if ((! is_or) && (matches[i] == MVSTATS_MATCH_FULL) && mismatch)
+ {
+ /* AND - was MATCH_FULL, but will be MATCH_NONE */
+ matches[i] = MVSTATS_MATCH_NONE;
+ --nmatches;
+ continue;
+ }
+
+ }
+ }
+ }
+ else if (IsA(clause, NullTest))
+ {
+ NullTest * expr = (NullTest*)clause;
+ Var * var = (Var*)(expr->arg);
+
+ /* FIXME proper matching attribute to dimension */
+ int idx = mv_get_index(var->varattno, stakeys);
+
+ /*
+ * Walk through the MCV items and evaluate the current clause. We can
+ * skip items that were already ruled out, and terminate if there are
+ * no remaining MCV items that might possibly match.
+ */
+ for (i = 0; i < mcvlist->nitems; i++)
+ {
+ MCVItem item = mcvlist->items[i];
+
+ /* if there are no more matches, we can stop processing this clause */
+ if (nmatches == 0)
+ break;
+
+ /* skip MCV items that were already ruled out */
+ if (matches[i] == MVSTATS_MATCH_NONE)
+ continue;
+
+ /* if the clause mismatches the MCV item, set it as MATCH_NONE */
+ if (((expr->nulltesttype == IS_NULL) && (! item->isnull[idx])) ||
+ ((expr->nulltesttype == IS_NOT_NULL) && (item->isnull[idx])))
+ {
+ matches[i] = MVSTATS_MATCH_NONE;
+ --nmatches;
+ }
+ }
+ }
+ else if (or_clause(clause) || and_clause(clause))
+ {
+ /* AND/OR clause, with all clauses compatible with the selected MV stat */
+
+ int i;
+ BoolExpr *orclause = ((BoolExpr*)clause);
+ List *orclauses = orclause->args;
+
+ /* match/mismatch bitmap for each MCV item */
+ int or_nmatches = 0;
+ char * or_matches = NULL;
+
+ Assert(orclauses != NIL);
+ Assert(list_length(orclauses) >= 2);
+
+ /* number of matching MCV items */
+ or_nmatches = mcvlist->nitems;
+
+ /* allocate the match bitmap for the nested clauses (initialized below) */
+ or_matches = palloc0(sizeof(char) * or_nmatches);
+
+ if (or_clause(clause))
+ {
+ /* OR clauses assume nothing matches, initially */
+ memset(or_matches, MVSTATS_MATCH_NONE, sizeof(char)*or_nmatches);
+ or_nmatches = 0;
+ }
+ else
+ {
+ /* AND clauses assume everything matches, initially */
+ memset(or_matches, MVSTATS_MATCH_FULL, sizeof(char)*or_nmatches);
+ }
+
+ /* build the match bitmap for the OR-clauses */
+ or_nmatches = update_match_bitmap_mcvlist(root, orclauses,
+ stakeys, mcvlist,
+ or_nmatches, or_matches,
+ lowsel, fullmatch, or_clause(clause));
+
+ /* merge the bitmap into the existing one */
+ for (i = 0; i < mcvlist->nitems; i++)
+ {
+ /*
+ * To AND-merge the bitmaps, a MIN() semantics is used.
+ * For OR-merge, use MAX().
+ *
+ * FIXME this does not decrease the number of matches
+ */
+ UPDATE_RESULT(matches[i], or_matches[i], is_or);
+ }
+
+ pfree(or_matches);
+
+ }
+ else
+ {
+ elog(ERROR, "unknown clause type: %d", clause->type);
+ }
+ }
+
+ /*
+ * If all the columns were matched by equality, it's a full match.
+ * In this case there can be at most one matching MCV item (two different
+ * items could not both match the same set of equality clauses).
+ */
+ *fullmatch = (bms_num_members(eqmatches) == mcvlist->ndimensions);
+
+ /* free the allocated pieces */
+ if (eqmatches)
+ pfree(eqmatches);
+
+ return nmatches;
+}
diff --git a/src/backend/optimizer/util/plancat.c b/src/backend/optimizer/util/plancat.c
index 7fb2088..8394111 100644
--- a/src/backend/optimizer/util/plancat.c
+++ b/src/backend/optimizer/util/plancat.c
@@ -412,7 +412,7 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
mvstat = (Form_pg_mv_statistic) GETSTRUCT(htup);
/* unavailable stats are not interesting for the planner */
- if (mvstat->deps_built)
+ if (mvstat->deps_built || mvstat->mcv_built)
{
info = makeNode(MVStatisticInfo);
@@ -421,9 +421,11 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
/* enabled statistics */
info->deps_enabled = mvstat->deps_enabled;
+ info->mcv_enabled = mvstat->mcv_enabled;
/* built/available statistics */
info->deps_built = mvstat->deps_built;
+ info->mcv_built = mvstat->mcv_built;
/* stakeys */
adatum = SysCacheGetAttr(MVSTATOID, htup,
diff --git a/src/backend/utils/mvstats/Makefile b/src/backend/utils/mvstats/Makefile
index 099f1ed..f9bf10c 100644
--- a/src/backend/utils/mvstats/Makefile
+++ b/src/backend/utils/mvstats/Makefile
@@ -12,6 +12,6 @@ subdir = src/backend/utils/mvstats
top_builddir = ../../../..
include $(top_builddir)/src/Makefile.global
-OBJS = common.o dependencies.o
+OBJS = common.o dependencies.o mcv.o
include $(top_srcdir)/src/backend/common.mk
diff --git a/src/backend/utils/mvstats/README.mcv b/src/backend/utils/mvstats/README.mcv
new file mode 100644
index 0000000..e93cfe4
--- /dev/null
+++ b/src/backend/utils/mvstats/README.mcv
@@ -0,0 +1,137 @@
+MCV lists
+=========
+
+Multivariate MCV (most-common values) lists are a straightforward extension of
+regular MCV lists, tracking the most frequent combinations of values for a
+group of attributes.
+
+This works particularly well for columns with a small number of distinct values,
+as the list may include all the combinations and approximate the distribution
+very accurately.
+
+For columns with a large number of distinct values (e.g. those with continuous
+domains), the list will only track the most frequent combinations. If the
+distribution is mostly uniform (all combinations about equally frequent), the
+MCV list will be empty.
+
+Estimates of some clauses (e.g. equality) based on MCV lists are more accurate
+than when using histograms.
+
+Also, MCV lists don't necessarily require sorting of the values (the fact that
+we use sorting when building them is an implementation detail), but even more
+importantly, the ordering is not built into the approximation (while histograms
+are built on ordering). So MCV lists work well even for attributes where the
+ordering of the data type is disconnected from the meaning of the data. For
+example we know how to sort strings, but it's unlikely to make much sense for
+city names (or other label-like attributes).
+
+
+Selectivity estimation
+----------------------
+
+The estimation, implemented in clauselist_mv_selectivity_mcvlist(), is quite
+simple in principle - we need to identify MCV items matching all the clauses
+and sum frequencies of all those items.
+
+Currently MCV lists support estimation of the following clause types:
+
+ (a) equality clauses WHERE (a = 1) AND (b = 2)
+ (b) inequality clauses WHERE (a < 1) AND (b >= 2)
+ (c) NULL clauses WHERE (a IS NULL) AND (b IS NOT NULL)
+ (d) OR clauses WHERE (a < 1) OR (b >= 2)
+
+It's possible to add support for additional clauses, for example:
+
+ (e) multi-var clauses WHERE (a > b)
+
+and possibly others. These are tasks for the future, not yet implemented.
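+
+As a usage sketch (assuming the ALTER TABLE ... ADD STATISTICS syntax from
+this patch, with the 'mcv' option enabled), the supported clause types may
+be exercised like this:
+
+    ALTER TABLE t ADD STATISTICS (mcv true) ON (a, b);
+    ANALYZE t;
+
+    -- estimated using the multivariate MCV list on (a,b)
+    SELECT * FROM t WHERE (a = 1) AND (b >= 2);
+    SELECT * FROM t WHERE (a IS NULL) OR (b < 10);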
+
+
+Estimating equality clauses
+---------------------------
+
+When computing selectivity estimate for equality clauses
+
+ (a = 1) AND (b = 2)
+
+we can do this estimate pretty exactly assuming that two conditions are met:
+
+ (1) there's an equality condition on all attributes of the statistic
+
+ (2) we find a matching item in the MCV list
+
+In this case we know the MCV item represents all tuples matching the clauses,
+and the selectivity estimate is complete (i.e. we don't need to perform
+estimation using the histogram). This is what we call 'full match'.
+
+When only (1) holds, but there's no matching MCV item, we don't know whether
+there are no such rows or they are just not very frequent. We can however use the
+frequency of the least frequent MCV item as an upper bound for the selectivity.
+
+For a combination of equality conditions (not full-match case) we can clamp the
+selectivity by the minimum of selectivities for each condition. For example if
+we know the number of distinct values for each column, we can use 1/ndistinct
+as a per-column estimate. Or rather 1/ndistinct + selectivity derived from the
+MCV list.
+
+We should also probably use only the 'residual ndistinct', excluding the items
+included in the MCV list (and also the residual frequency):
+
+ f = (1.0 - sum(MCV frequencies)) / (ndistinct - ndistinct(MCV list))
+
+but it's worth pointing out the ndistinct values are multi-variate for the
+columns referenced by the equality conditions.
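+
+As a purely illustrative example, with multivariate ndistinct = 1000 and an
+MCV list covering 900 of those groups with sum(MCV frequencies) = 0.8, a
+group not present in the list would get
+
+    f = (1.0 - 0.8) / (1000 - 900) = 0.002
+
+(the numbers are made up, the formula is the one above).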
+
+Note: Only the "full match" limit is currently implemented.
+
+
+Hashed MCV (not yet implemented)
+--------------------------------
+
+Regular MCV lists have to include actual values for each item, so if those items
+are large the list may be quite large. This is especially true for multi-variate
+MCV lists, although the current implementation partially mitigates this by
+de-duplicating the values before storing them on disk.
+
+It's possible to only store hashes (32-bit values) instead of the actual values,
+significantly reducing the space requirements. Obviously, this would only make
+the MCV lists useful for estimating equality conditions (assuming the 32-bit
+hashes make the collisions rare enough).
+
+This might also complicate matching the columns to available stats.
+
+
+TODO Consider implementing hashed MCV list, storing just 32-bit hashes instead
+ of the actual values. This type of MCV list will be useful only for
+ estimating equality clauses, and will reduce space requirements for large
+ varlena types (in such cases we usually only want equality anyway).
+
+TODO Currently there's no logic to consider building only a MCV list (and not
+ building the histogram at all), except for making this decision manually in
+ ADD STATISTICS.
+
+
+Inspecting the MCV list
+-----------------------
+
+Inspecting the regular (per-attribute) MCV lists is trivial, as it's enough
+to select the columns from pg_stats - the data is encoded as anyarrays, so we
+simply get the text representation of the arrays.
+
+With multivariate MCV lists it's not that simple due to the possible mix of
+data types. It might be possible to produce similar array-like representation,
+but that'd unnecessarily complicate further processing and analysis of the MCV
+list. Instead, there's a SRF function providing values, frequencies etc.
+
+    SELECT * FROM pg_mv_mcv_items(oid);
+
+It has a single input parameter:
+
+ oid - OID of the MCV list (pg_mv_statistic.staoid)
+
+and produces a table with these columns:
+
+ - item ID (0...nitems-1)
+ - values (string array)
+ - nulls only (boolean array)
+ - frequency (double precision)
diff --git a/src/backend/utils/mvstats/README.stats b/src/backend/utils/mvstats/README.stats
index a38ea7b..5c5c59a 100644
--- a/src/backend/utils/mvstats/README.stats
+++ b/src/backend/utils/mvstats/README.stats
@@ -8,9 +8,50 @@ not true, resulting in estimation errors.
Multivariate stats track different types of dependencies between the columns,
hopefully improving the estimates.
-Currently we only have one kind of multivariate statistics - soft functional
-dependencies, and we use it to improve estimates of equality clauses. See
-README.dependencies for details.
+
+Types of statistics
+-------------------
+
+Currently we have only two kinds of multivariate statistics:
+
+ (a) soft functional dependencies (README.dependencies)
+
+ (b) MCV lists (README.mcv)
+
+
+Compatible clause types
+-----------------------
+
+Each type of statistics may be used to estimate some subset of clause types.
+
+ (a) functional dependencies - equality clauses (AND), possibly IS NULL
+
+ (b) MCV list - equality and inequality clauses, IS [NOT] NULL, AND/OR
+
+Currently only simple operator clauses (Var op Const) are supported, but it's
+possible to support more complex clause types, e.g. (Var op Var).
+
+
+Complex clauses
+---------------
+
+We also support estimating more complex clauses - essentially AND/OR clauses
+with (Var op Const) as leaves, as long as all the referenced attributes are
+covered by a single statistics.
+
+For example this condition
+
+ (a=1) AND ((b=2) OR ((c=3) AND (d=4)))
+
+may be estimated using statistics on (a,b,c,d). If we only have statistics on
+(b,c,d) we may estimate the second part, and estimate (a=1) using simple stats.
+
+If we only have statistics on (a,b,c) we can't apply it at all at this point,
+but it's worth pointing out clauselist_selectivity() works recursively and when
+handling the second part (the OR-clause), we'll be able to apply the statistics.
+
+Note: The multi-statistics estimation patch also makes it possible to pass some
+clauses as 'conditions' into the deeper parts of the expression tree.
Selectivity estimation
@@ -23,14 +64,48 @@ When estimating selectivity, we aim to achieve several things:
(b) minimize the overhead, especially when no suitable multivariate stats
exist (so if you are not using multivariate stats, there's no overhead)
-This clauselist_selectivity() performs several inexpensive checks first, before
+Thus clauselist_selectivity() performs several inexpensive checks first, before
even attempting to do the more expensive estimation.
(1) check if there are multivariate stats on the relation
- (2) check there are at least two attributes referenced by clauses compatible
- with multivariate statistics (equality clauses for func. dependencies)
+ (2) check that there are functional dependencies on the table, and that
+ there are at least two attributes referenced by compatible clauses
+ (equality clauses for func. dependencies)
(3) perform reduction of equality clauses using func. dependencies
- (4) estimate the reduced list of clauses using regular statistics
+ (4) check that there are multivariate MCV lists on the table, and that
+ there are at least two attributes referenced by compatible clauses
+ (equalities, inequalities, etc.)
+
+ (5) find the best multivariate statistics (matching the most conditions)
+ and use it to compute the estimate
+
+ (6) estimate the remaining clauses (not estimated using multivariate stats)
+ using the regular per-column statistics
+
+Whenever we find there are no suitable stats, we skip the expensive steps.
+
+
+Further (possibly crazy) ideas
+------------------------------
+
+Currently the clauses are only estimated using a single statistics, even if
+there are multiple candidate statistics - for example assume we have statistics
+on (a,b,c) and (b,c,d), and estimate conditions
+
+ (b = 1) AND (c = 2)
+
+Then both statistics may be used, but we only use one of them. Maybe we could
+compute estimates using all the candidate stats, and somehow aggregate them
+into the final estimate, e.g. by using the average or median.
+
+Some stats may give better estimates than others, but it's very difficult to say
+in advance which stats are the best (it depends on the number of buckets, number
+of additional columns not referenced in the clauses, type of condition etc.).
+
+But of course, this may result in expensive estimation (CPU-wise).
+
+So we might add a GUC to choose between the simple (single-statistics) and the
+multi-statistics estimation, possibly as a table-level parameter (ALTER TABLE ...).
diff --git a/src/backend/utils/mvstats/common.c b/src/backend/utils/mvstats/common.c
index dcb7c78..4f5a842 100644
--- a/src/backend/utils/mvstats/common.c
+++ b/src/backend/utils/mvstats/common.c
@@ -16,12 +16,14 @@
#include "common.h"
+#include "utils/array.h"
+
static VacAttrStats ** lookup_var_attr_stats(int2vector *attrs,
- int natts, VacAttrStats **vacattrstats);
+ int natts,
+ VacAttrStats **vacattrstats);
static List* list_mv_stats(Oid relid);
-
/*
* Compute requested multivariate stats, using the rows sampled for the
* plain (single-column) stats.
@@ -49,6 +51,8 @@ build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
int j;
MVStatisticInfo *stat = (MVStatisticInfo *)lfirst(lc);
MVDependencies deps = NULL;
+ MCVList mcvlist = NULL;
+ int numrows_filtered = 0;
VacAttrStats **stats = NULL;
int numatts = 0;
@@ -87,8 +91,12 @@ build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
if (stat->deps_enabled)
deps = build_mv_dependencies(numrows, rows, attrs, stats);
+ /* build the MCV list */
+ if (stat->mcv_enabled)
+ mcvlist = build_mv_mcvlist(numrows, rows, attrs, stats, &numrows_filtered);
+
/* store the histogram / MCV list in the catalog */
- update_mv_stats(stat->mvoid, deps, attrs);
+ update_mv_stats(stat->mvoid, deps, mcvlist, attrs, stats);
}
}
@@ -166,6 +174,8 @@ list_mv_stats(Oid relid)
info->stakeys = buildint2vector(stats->stakeys.values, stats->stakeys.dim1);
info->deps_enabled = stats->deps_enabled;
info->deps_built = stats->deps_built;
+ info->mcv_enabled = stats->mcv_enabled;
+ info->mcv_built = stats->mcv_built;
result = lappend(result, info);
}
@@ -180,8 +190,56 @@ list_mv_stats(Oid relid)
return result;
}
+
+/*
+ * Find attnums of MV stats using the mvoid.
+ */
+int2vector*
+find_mv_attnums(Oid mvoid, Oid *relid)
+{
+ ArrayType *arr;
+ Datum adatum;
+ bool isnull;
+ HeapTuple htup;
+ int2vector *keys;
+
+ /* Fetch the pg_mv_statistic tuple with the given mvoid from the syscache. */
+ htup = SearchSysCache1(MVSTATOID,
+ ObjectIdGetDatum(mvoid));
+
+ /* XXX syscache contains OIDs of deleted stats (not invalidated) */
+ if (! HeapTupleIsValid(htup))
+ return NULL;
+
+ /* starelid */
+ adatum = SysCacheGetAttr(MVSTATOID, htup,
+ Anum_pg_mv_statistic_starelid, &isnull);
+ Assert(!isnull);
+
+ *relid = DatumGetObjectId(adatum);
+
+ /* stakeys */
+ adatum = SysCacheGetAttr(MVSTATOID, htup,
+ Anum_pg_mv_statistic_stakeys, &isnull);
+ Assert(!isnull);
+
+ arr = DatumGetArrayTypeP(adatum);
+
+ keys = buildint2vector((int16 *) ARR_DATA_PTR(arr),
+ ARR_DIMS(arr)[0]);
+ ReleaseSysCache(htup);
+
+ /* TODO maybe save the list into relcache, as in RelationGetIndexList
+ * (which served as an inspiration for this one). */
+
+ return keys;
+}
+
+
void
-update_mv_stats(Oid mvoid, MVDependencies dependencies, int2vector *attrs)
+update_mv_stats(Oid mvoid,
+ MVDependencies dependencies, MCVList mcvlist,
+ int2vector *attrs, VacAttrStats **stats)
{
HeapTuple stup,
oldtup;
@@ -206,18 +264,29 @@ update_mv_stats(Oid mvoid, MVDependencies dependencies, int2vector *attrs)
= PointerGetDatum(serialize_mv_dependencies(dependencies));
}
+ if (mcvlist != NULL)
+ {
+ bytea * data = serialize_mv_mcvlist(mcvlist, attrs, stats);
+ nulls[Anum_pg_mv_statistic_stamcv -1] = (data == NULL);
+ values[Anum_pg_mv_statistic_stamcv - 1] = PointerGetDatum(data);
+ }
+
/* always replace the value (either by bytea or NULL) */
replaces[Anum_pg_mv_statistic_stadeps -1] = true;
+ replaces[Anum_pg_mv_statistic_stamcv -1] = true;
/* always change the availability flags */
nulls[Anum_pg_mv_statistic_deps_built -1] = false;
+ nulls[Anum_pg_mv_statistic_mcv_built -1] = false;
nulls[Anum_pg_mv_statistic_stakeys-1] = false;
/* use the new attnums, in case we removed some dropped ones */
replaces[Anum_pg_mv_statistic_deps_built-1] = true;
+ replaces[Anum_pg_mv_statistic_mcv_built -1] = true;
replaces[Anum_pg_mv_statistic_stakeys -1] = true;
values[Anum_pg_mv_statistic_deps_built-1] = BoolGetDatum(dependencies != NULL);
+ values[Anum_pg_mv_statistic_mcv_built -1] = BoolGetDatum(mcvlist != NULL);
values[Anum_pg_mv_statistic_stakeys -1] = PointerGetDatum(attrs);
/* Is there already a pg_mv_statistic tuple for this attribute? */
@@ -246,6 +315,21 @@ update_mv_stats(Oid mvoid, MVDependencies dependencies, int2vector *attrs)
heap_close(sd, RowExclusiveLock);
}
+
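+/*
+ * Return the dimension index of the attribute within the (sorted) stakeys
+ * vector, i.e. the number of keys preceding varattno.
+ */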
+int
+mv_get_index(AttrNumber varattno, int2vector * stakeys)
+{
+ int i, idx = 0;
+ for (i = 0; i < stakeys->dim1; i++)
+ {
+ if (stakeys->values[i] < varattno)
+ idx += 1;
+ else
+ break;
+ }
+ return idx;
+}
+
/* multi-variate stats comparator */
/*
@@ -256,11 +340,15 @@ update_mv_stats(Oid mvoid, MVDependencies dependencies, int2vector *attrs)
int
compare_scalars_simple(const void *a, const void *b, void *arg)
{
- Datum da = *(Datum*)a;
- Datum db = *(Datum*)b;
- SortSupport ssup= (SortSupport) arg;
+ return compare_datums_simple(*(Datum*)a,
+ *(Datum*)b,
+ (SortSupport)arg);
+}
- return ApplySortComparator(da, false, db, false, ssup);
+int
+compare_datums_simple(Datum a, Datum b, SortSupport ssup)
+{
+ return ApplySortComparator(a, false, b, false, ssup);
}
/*
@@ -377,3 +465,32 @@ multi_sort_compare_dims(int start, int end,
return 0;
}
+
+/* simple counterpart to qsort_arg */
+void *
+bsearch_arg(const void *key, const void *base, size_t nmemb, size_t size,
+ int (*compar) (const void *, const void *, void *),
+ void *arg)
+{
+ size_t l, u, idx;
+ const void *p;
+ int comparison;
+
+ l = 0;
+ u = nmemb;
+ while (l < u)
+ {
+ idx = (l + u) / 2;
+ p = (void *) (((const char *) base) + (idx * size));
+ comparison = (*compar) (key, p, arg);
+
+ if (comparison < 0)
+ u = idx;
+ else if (comparison > 0)
+ l = idx + 1;
+ else
+ return (void *) p;
+ }
+
+ return NULL;
+}
diff --git a/src/backend/utils/mvstats/common.h b/src/backend/utils/mvstats/common.h
index a019ea6..350760b 100644
--- a/src/backend/utils/mvstats/common.h
+++ b/src/backend/utils/mvstats/common.h
@@ -46,7 +46,15 @@ typedef struct
Datum value; /* a data value */
int tupno; /* position index for tuple it came from */
} ScalarItem;
-
+
+/* (de)serialization info */
+typedef struct DimensionInfo {
+ int nvalues; /* number of deduplicated values */
+ int nbytes; /* number of bytes (serialized) */
+ int typlen; /* pg_type.typlen */
+ bool typbyval; /* pg_type.typbyval */
+} DimensionInfo;
+
/* multi-sort */
typedef struct MultiSortSupportData {
int ndims; /* number of dimensions supported by the */
@@ -58,6 +66,7 @@ typedef MultiSortSupportData* MultiSortSupport;
typedef struct SortItem {
Datum *values;
bool *isnull;
+ int count;
} SortItem;
MultiSortSupport multi_sort_init(int ndims);
@@ -74,5 +83,11 @@ int multi_sort_compare_dims(int start, int end, const SortItem *a,
const SortItem *b, MultiSortSupport mss);
/* comparators, used when constructing multivariate stats */
+int compare_datums_simple(Datum a, Datum b, SortSupport ssup);
int compare_scalars_simple(const void *a, const void *b, void *arg);
int compare_scalars_partition(const void *a, const void *b, void *arg);
+
+void * bsearch_arg(const void *key, const void *base,
+ size_t nmemb, size_t size,
+ int (*compar) (const void *, const void *, void *),
+ void *arg);
diff --git a/src/backend/utils/mvstats/mcv.c b/src/backend/utils/mvstats/mcv.c
new file mode 100644
index 0000000..b300c1a
--- /dev/null
+++ b/src/backend/utils/mvstats/mcv.c
@@ -0,0 +1,1120 @@
+/*-------------------------------------------------------------------------
+ *
+ * mcv.c
+ * POSTGRES multivariate MCV lists
+ *
+ *
+ * Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ * src/backend/utils/mvstats/mcv.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres.h"
+
+#include "fmgr.h"
+#include "funcapi.h"
+
+#include "utils/lsyscache.h"
+
+#include "common.h"
+
+/*
+ * Each serialized item needs to store (in this order):
+ *
+ * - indexes (ndim * sizeof(uint16))
+ * - null flags (ndim * sizeof(bool))
+ * - frequency (sizeof(double))
+ *
+ * So in total:
+ *
+ * ndim * (sizeof(uint16) + sizeof(bool)) + sizeof(double)
+ */
+#define ITEM_SIZE(ndims) \
+ (ndims * (sizeof(uint16) + sizeof(bool)) + sizeof(double))
+
+/* Macros for convenient access to parts of the serialized MCV item */
+#define ITEM_INDEXES(item) ((uint16*)item)
+#define ITEM_NULLS(item,ndims) ((bool*)(ITEM_INDEXES(item) + ndims))
+#define ITEM_FREQUENCY(item,ndims) ((double*)(ITEM_NULLS(item,ndims) + ndims))
+
+static MultiSortSupport build_mss(VacAttrStats **stats, int2vector *attrs);
+
+static SortItem *build_sorted_items(int numrows, HeapTuple *rows,
+ TupleDesc tdesc, MultiSortSupport mss,
+ int2vector *attrs);
+
+static SortItem *build_distinct_groups(int numrows, SortItem *items,
+ MultiSortSupport mss, int *ndistinct);
+
+static int count_distinct_groups(int numrows, SortItem *items,
+ MultiSortSupport mss);
+
+/*
+ * Builds MCV list from the set of sampled rows.
+ *
+ * The algorithm is quite simple:
+ *
+ * (1) sort the data (default collation, '<' for the data type)
+ *
+ * (2) count distinct groups, decide how many to keep
+ *
+ * (3) build the MCV list using the threshold determined in (2)
+ *
+ * (4) remove rows represented by the MCV from the sample
+ *
+ * The method also removes rows matching the MCV items from the input array,
+ * and passes the number of remaining rows (useful for building histograms)
+ * using the numrows_filtered parameter.
+ *
+ * FIXME Use max_mcv_items from ALTER TABLE ADD STATISTICS command.
+ *
+ * FIXME Single-dimensional MCV is sorted by frequency (descending). We should
+ * do that too, because when walking through the list we want to check
+ * the most frequent items first.
+ *
+ * TODO We're using Datum (8B), even for smaller data types (e.g. int4 or float4).
+ * Maybe we could save some space here, but the bytea compression should
+ * handle it just fine.
+ *
+ * TODO This probably should not use the ndistinct directly (as computed from
+ * the sample), but rather estimate the number of distinct values in the
+ * table, no?
+ */
+MCVList
+build_mv_mcvlist(int numrows, HeapTuple *rows, int2vector *attrs,
+ VacAttrStats **stats, int *numrows_filtered)
+{
+ int i;
+ int numattrs = attrs->dim1;
+ int ndistinct = 0;
+ int mcv_threshold = 0;
+ int nitems = 0;
+
+ MCVList mcvlist = NULL;
+
+ /* comparator for all the columns */
+ MultiSortSupport mss = build_mss(stats, attrs);
+
+ /* sort the rows */
+ SortItem *items = build_sorted_items(numrows, rows, stats[0]->tupDesc,
+ mss, attrs);
+
+ /* transform the sorted rows into groups (sorted by frequency) */
+ SortItem *groups = build_distinct_groups(numrows, items, mss, &ndistinct);
+
+ /*
+ * Determine the minimum size of a group to be eligible for MCV list, and
+ * check how many groups actually pass that threshold. We use 1.25x the
+ * average group size, just like for regular statistics.
+ *
+ * But if we can fit all the distinct values in the MCV list (i.e. if there
+ * are fewer distinct groups than MVSTAT_MCVLIST_MAX_ITEMS), we'll require
+ * only 2 rows per group.
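+ *
+ * For example, with numrows = 30000 and ndistinct = 1000 the threshold
+ * works out as (int) (1.25 * 30000 / 1000) = 37 rows per group.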
+ *
+ * FIXME This should really reference mcv_max_items (from catalog) instead
+ * of the constant MVSTAT_MCVLIST_MAX_ITEMS.
+ */
+ mcv_threshold = 1.25 * numrows / ndistinct;
+ mcv_threshold = (mcv_threshold < 4) ? 4 : mcv_threshold;
+
+ if (ndistinct <= MVSTAT_MCVLIST_MAX_ITEMS)
+ mcv_threshold = 2;
+
+ /* Walk through the groups and stop once we fall below the threshold. */
+ nitems = 0;
+ for (i = 0; i < ndistinct; i++)
+ {
+ if (groups[i].count < mcv_threshold)
+ break;
+
+ nitems++;
+ }
+
+ /* we know the number of MCV list items, so let's build the list */
+ if (nitems > 0)
+ {
+ /* allocate the MCV list structure, set parameters we know */
+ mcvlist = (MCVList)palloc0(sizeof(MCVListData));
+
+ mcvlist->magic = MVSTAT_MCV_MAGIC;
+ mcvlist->type = MVSTAT_MCV_TYPE_BASIC;
+ mcvlist->ndimensions = numattrs;
+ mcvlist->nitems = nitems;
+
+ /*
+ * Preallocate Datum/isnull arrays (not as a single chunk, as we will
+ * pass the result outside and thus it needs to be easy to pfree()).
+ *
+ * XXX Although we're the only ones dealing with this.
+ */
+ mcvlist->items = (MCVItem*)palloc0(sizeof(MCVItem)*nitems);
+
+ for (i = 0; i < nitems; i++)
+ {
+ mcvlist->items[i] = (MCVItem)palloc0(sizeof(MCVItemData));
+ mcvlist->items[i]->values = (Datum*)palloc0(sizeof(Datum)*numattrs);
+ mcvlist->items[i]->isnull = (bool*)palloc0(sizeof(bool)*numattrs);
+ }
+
+ /* Copy the first chunk of groups into the result. */
+ for (i = 0; i < nitems; i++)
+ {
+ /* just pointer to the proper place in the list */
+ MCVItem item = mcvlist->items[i];
+
+ /* copy the values and null flags from the group */
+ memcpy(item->values, groups[i].values, sizeof(Datum) * numattrs);
+ memcpy(item->isnull, groups[i].isnull, sizeof(bool) * numattrs);
+
+ /* and finally the group frequency */
+ item->frequency = (double)groups[i].count / numrows;
+ }
+
+ /* make sure the loops are consistent */
+ Assert(nitems == mcvlist->nitems);
+
+ /*
+ * Remove the rows matching the MCV list (i.e. keep only rows that are
+ * not represented by the MCV list). We will first sort the groups
+ * by the keys (not by count) and then use binary search.
+ */
+ if (nitems < ndistinct)
+ {
+ int i, j;
+ int nfiltered = 0;
+
+ /* used for the searches */
+ SortItem key;
+
+ /* we'll fill this with data from the rows */
+ key.values = (Datum*)palloc0(numattrs * sizeof(Datum));
+ key.isnull = (bool*)palloc0(numattrs * sizeof(bool));
+
+ /*
+ * Sort the groups for bsearch_arg (but only the items that actually
+ * made it to the MCV list).
+ */
+ qsort_arg((void *) groups, nitems, sizeof(SortItem),
+ multi_sort_compare, mss);
+
+ /* walk through the tuples, compare the values to MCV items */
+ for (i = 0; i < numrows; i++)
+ {
+ /* collect the key values from the row */
+ for (j = 0; j < numattrs; j++)
+ key.values[j]
+ = heap_getattr(rows[i], attrs->values[j],
+ stats[j]->tupDesc, &key.isnull[j]);
+
+ /* if not included in the MCV list, keep it in the array */
+ if (bsearch_arg(&key, groups, nitems, sizeof(SortItem),
+ multi_sort_compare, mss) == NULL)
+ rows[nfiltered++] = rows[i];
+ }
+
+ /* remember how many rows we actually kept */
+ *numrows_filtered = nfiltered;
+
+ /* free all the data used here */
+ pfree(key.values);
+ pfree(key.isnull);
+ }
+ else
+ /* the MCV list covers all the rows */
+ *numrows_filtered = 0;
+ }
+
+ pfree(items);
+ pfree(groups);
+
+ return mcvlist;
+}
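+
+/*
+ * Note on usage (a sketch of the expected caller behavior): on return,
+ * 'rows' has been compacted in place so that the first *numrows_filtered
+ * entries are the sample rows not covered by the MCV list, allowing the
+ * caller to build further statistics from the remainder.
+ */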
+
+/* build MultiSortSupport for the attributes passed in attrs */
+static MultiSortSupport
+build_mss(VacAttrStats **stats, int2vector *attrs)
+{
+ int i;
+ int numattrs = attrs->dim1;
+
+ /* Sort by multiple columns (using array of SortSupport) */
+ MultiSortSupport mss = multi_sort_init(numattrs);
+
+ /* prepare the sort functions for all the attributes */
+ for (i = 0; i < numattrs; i++)
+ multi_sort_add_dimension(mss, i, i, stats);
+
+ return mss;
+}
+
+/* build sorted array of SortItem with values from rows */
+static SortItem *
+build_sorted_items(int numrows, HeapTuple *rows, TupleDesc tdesc,
+ MultiSortSupport mss, int2vector *attrs)
+{
+ int i, j, len;
+ int numattrs = attrs->dim1;
+ int nvalues = numrows * numattrs;
+
+ /*
+ * We won't allocate the arrays for each item independently, but in one
+ * large chunk, and then just set the pointers.
+ */
+ SortItem *items;
+ Datum *values;
+ bool *isnull;
+ char *ptr;
+
+ /* Compute the total amount of memory we need (both items and values). */
+ len = numrows * sizeof(SortItem) + nvalues * (sizeof(Datum) + sizeof(bool));
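+
+ /*
+ * Illustrative sizing: for numrows = 100 and numattrs = 3 we get
+ * nvalues = 300, i.e. len = 100 * sizeof(SortItem)
+ * + 300 * (sizeof(Datum) + sizeof(bool)).
+ */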
+
+ /* Allocate the memory and split it into the pieces. */
+ ptr = palloc0(len);
+
+ /* items to sort */
+ items = (SortItem*)ptr;
+ ptr += numrows * sizeof(SortItem);
+
+ /* values and null flags */
+ values = (Datum*)ptr;
+ ptr += nvalues * sizeof(Datum);
+
+ isnull = (bool*)ptr;
+ ptr += nvalues * sizeof(bool);
+
+ /* make sure we consumed the whole buffer exactly */
+ Assert((ptr - (char*)items) == len);
+
+ /* fix the pointers to Datum and bool arrays */
+ for (i = 0; i < numrows; i++)
+ {
+ items[i].values = &values[i * numattrs];
+ items[i].isnull = &isnull[i * numattrs];
+
+ /* load the values/null flags from sample rows */
+ for (j = 0; j < numattrs; j++)
+ {
+ items[i].values[j] = heap_getattr(rows[i],
+ attrs->values[j], /* attnum */
+ tdesc,
+ &items[i].isnull[j]); /* isnull */
+ }
+ }
+
+ /* do the sort, using the multi-sort */
+ qsort_arg((void *) items, numrows, sizeof(SortItem),
+ multi_sort_compare, mss);
+
+ return items;
+}
+
+/* count distinct combinations of SortItems in the array */
+static int
+count_distinct_groups(int numrows, SortItem *items, MultiSortSupport mss)
+{
+ int i;
+ int ndistinct;
+
+ ndistinct = 1;
+ for (i = 1; i < numrows; i++)
+ if (multi_sort_compare(&items[i], &items[i-1], mss) != 0)
+ ndistinct += 1;
+
+ return ndistinct;
+}
+
+/* compares frequencies of the SortItem entries (in descending order) */
+static int
+compare_sort_item_count(const void *a, const void *b)
+{
+ SortItem *ia = (SortItem *)a;
+ SortItem *ib = (SortItem *)b;
+
+ if (ia->count == ib->count)
+ return 0;
+ else if (ia->count > ib->count)
+ return -1;
+
+ return 1;
+}
+
+/* builds SortItems for distinct groups and counts the matching items */
+static SortItem *
+build_distinct_groups(int numrows, SortItem *items, MultiSortSupport mss,
+ int *ndistinct)
+{
+ int i, j;
+ int ngroups = count_distinct_groups(numrows, items, mss);
+
+ SortItem *groups = (SortItem*)palloc0(ngroups * sizeof(SortItem));
+
+ j = 0;
+ groups[0] = items[0];
+ groups[0].count = 1;
+
+ for (i = 1; i < numrows; i++)
+ {
+ if (multi_sort_compare(&items[i], &items[i-1], mss) != 0)
+ groups[++j] = items[i];
+
+ groups[j].count++;
+ }
+
+ pg_qsort((void *) groups, ngroups, sizeof(SortItem),
+ compare_sort_item_count);
+
+ *ndistinct = ngroups;
+ return groups;
+}
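+
+/*
+ * Example: sorted items (1,1), (1,1), (2,3) yield ngroups = 2, i.e. the
+ * group (1,1) with count = 2 followed by (2,3) with count = 1, ordered
+ * by count in descending order.
+ */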
+
+
+/* fetch the MCV list (as a bytea) from the pg_mv_statistic catalog */
+MCVList
+load_mv_mcvlist(Oid mvoid)
+{
+ bool isnull = false;
+ Datum mcvlist;
+
+#ifdef USE_ASSERT_CHECKING
+ Form_pg_mv_statistic mvstat;
+#endif
+
+ /* Fetch the pg_mv_statistic tuple with the given statistics OID. */
+ HeapTuple htup = SearchSysCache1(MVSTATOID, ObjectIdGetDatum(mvoid));
+
+ if (! HeapTupleIsValid(htup))
+ return NULL;
+
+#ifdef USE_ASSERT_CHECKING
+ mvstat = (Form_pg_mv_statistic) GETSTRUCT(htup);
+ Assert(mvstat->mcv_enabled && mvstat->mcv_built);
+#endif
+
+ mcvlist = SysCacheGetAttr(MVSTATOID, htup,
+ Anum_pg_mv_statistic_stamcv, &isnull);
+
+ Assert(!isnull);
+
+ ReleaseSysCache(htup);
+
+ return deserialize_mv_mcvlist(DatumGetByteaP(mcvlist));
+}
+
+/*
+ * Print some basic info about the MCV list.
+ *
+ * TODO Add info about what part of the table this covers.
+ */
+Datum
+pg_mv_stats_mcvlist_info(PG_FUNCTION_ARGS)
+{
+ bytea *data = PG_GETARG_BYTEA_P(0);
+ char *result;
+
+ MCVList mcvlist = deserialize_mv_mcvlist(data);
+
+ result = palloc0(128);
+ snprintf(result, 128, "nitems=%d", mcvlist->nitems);
+
+ pfree(mcvlist);
+
+ PG_RETURN_TEXT_P(cstring_to_text(result));
+}
+
+/*
+ * serialize MCV list into a bytea value
+ *
+ *
+ * The basic algorithm is simple:
+ *
+ * (1) perform deduplication (for each attribute separately)
+ * (a) collect all (non-NULL) attribute values from all MCV items
+ * (b) sort the data (using 'lt' from VacAttrStats)
+ * (c) remove duplicate values from the array
+ *
+ * (2) serialize the arrays into a bytea value
+ *
+ * (3) process all MCV list items
+ * (a) replace values with indexes into the arrays
+ *
+ * Each attribute has to be processed separately, because we may be mixing
+ * different datatypes, with different sort operators, etc.
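+ *
+ * For example, if one attribute of a 100-item MCV list contains only the
+ * values 10, 20 and 30, step (1) reduces them to the sorted array
+ * {10, 20, 30}, and step (3) stores a uint16 index (0, 1 or 2) in each
+ * item instead of the full Datum.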
+ *
+ * We'll use uint16 values for the indexes in step (3), as we don't allow
+ * more than 8k MCV items (see the max_mcv_items option), although that's a
+ * mostly arbitrary limit. We might increase it up to 65k and still fit
+ * into uint16.
+ *
+ * We don't really expect the serialization to save as much space as for
+ * histograms, because we are not doing any bucket splits (which is the source
+ * of high redundancy in histograms).
+ *
+ * TODO Consider packing boolean flags (NULL) for each item into a single char
+ * (or a longer type) instead of using an array of bool items.
+ */
+bytea *
+serialize_mv_mcvlist(MCVList mcvlist, int2vector *attrs,
+ VacAttrStats **stats)
+{
+ int i, j;
+ int ndims = mcvlist->ndimensions;
+ int itemsize = ITEM_SIZE(ndims);
+
+ SortSupport ssup;
+ DimensionInfo *info;
+
+ Size total_length;
+
+ /* allocate just once */
+ char *item = palloc0(itemsize);
+
+ /* serialized items (indexes into arrays, etc.) */
+ bytea *output;
+ char *data = NULL;
+
+ /* values per dimension (and number of non-NULL values) */
+ Datum **values = (Datum**)palloc0(sizeof(Datum*) * ndims);
+ int *counts = (int*)palloc0(sizeof(int) * ndims);
+
+ /*
+ * We'll include some rudimentary information about the attributes (type
+ * length, etc.), so that we don't have to look them up while deserializing
+ * the MCV list.
+ */
+ info = (DimensionInfo *)palloc0(sizeof(DimensionInfo)*ndims);
+
+ /* sort support data for all attributes included in the MCV list */
+ ssup = (SortSupport)palloc0(sizeof(SortSupportData)*ndims);
+
+ /* collect and deduplicate values for all attributes */
+ for (i = 0; i < ndims; i++)
+ {
+ int ndistinct;
+ StdAnalyzeData *tmp = (StdAnalyzeData *)stats[i]->extra_data;
+
+ /* copy important info about the data type (length, by-value) */
+ info[i].typlen = stats[i]->attrtype->typlen;
+ info[i].typbyval = stats[i]->attrtype->typbyval;
+
+ /* allocate space for values in the attribute and collect them */
+ values[i] = (Datum*)palloc0(sizeof(Datum) * mcvlist->nitems);
+
+ for (j = 0; j < mcvlist->nitems; j++)
+ {
+ /* skip NULL values - we don't need to serialize them */
+ if (mcvlist->items[j]->isnull[i])
+ continue;
+
+ values[i][counts[i]] = mcvlist->items[j]->values[i];
+ counts[i] += 1;
+ }
+
+ /* there are just NULL values in this dimension, we're done */
+ if (counts[i] == 0)
+ continue;
+
+ /* sort and deduplicate the data */
+ ssup[i].ssup_cxt = CurrentMemoryContext;
+ ssup[i].ssup_collation = DEFAULT_COLLATION_OID;
+ ssup[i].ssup_nulls_first = false;
+
+ PrepareSortSupportFromOrderingOp(tmp->ltopr, &ssup[i]);
+
+ qsort_arg(values[i], counts[i], sizeof(Datum),
+ compare_scalars_simple, &ssup[i]);
+
+ /*
+ * Walk through the array and eliminate duplicate values, but keep the
+ * ordering (so that we can do bsearch later). We know there's at least
+ * one item as (counts[i] != 0), so we can skip the first element.
+ */
+ ndistinct = 1; /* number of distinct values */
+ for (j = 1; j < counts[i]; j++)
+ {
+ /* if the value is the same as the previous one, we can skip it */
+ if (! compare_datums_simple(values[i][j-1], values[i][j], &ssup[i]))
+ continue;
+
+ values[i][ndistinct] = values[i][j];
+ ndistinct += 1;
+ }
+
+ /* we must not exceed UINT16_MAX, as we use uint16 indexes */
+ Assert(ndistinct <= UINT16_MAX);
+
+ /*
+ * Store additional info about the attribute - number of deduplicated
+ * values, and also size of the serialized data. For fixed-length data
+ * types this is trivial to compute, for varwidth types we need to
+ * actually walk the array and sum the sizes.
+ */
+ info[i].nvalues = ndistinct;
+
+ if (info[i].typlen > 0) /* fixed-length data types */
+ info[i].nbytes = info[i].nvalues * info[i].typlen;
+ else if (info[i].typlen == -1) /* varlena */
+ {
+ info[i].nbytes = 0;
+ for (j = 0; j < info[i].nvalues; j++)
+ info[i].nbytes += VARSIZE_ANY(values[i][j]);
+ }
+ else if (info[i].typlen == -2) /* cstring */
+ {
+ info[i].nbytes = 0;
+ for (j = 0; j < info[i].nvalues; j++)
+ info[i].nbytes += strlen(DatumGetPointer(values[i][j]));
+ }
+
+ /* we know (count>0) so there must be some data */
+ Assert(info[i].nbytes > 0);
+ }
+
+ /*
+ * Now we can finally compute how much space we'll actually need for the
+ * serialized MCV list, as it contains these fields:
+ *
+ * - length (4B) for varlena
+ * - magic (4B)
+ * - type (4B)
+ * - ndimensions (4B)
+ * - nitems (4B)
+ * - info (ndim * sizeof(DimensionInfo)
+ * - arrays of values for each dimension
+ * - serialized items (nitems * itemsize)
+ *
+ * So the 'header' size is 20B + ndim * sizeof(DimensionInfo) and then we
+ * will place all the data (values + indexes).
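+ *
+ * Layout sketch (not to scale):
+ *
+ * [len][magic][type][ndims][nitems][DimensionInfo x ndims]
+ * [values dim 0] ... [values dim ndims-1][item 0] ... [item nitems-1]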
+ */
+ total_length = (sizeof(int32) + offsetof(MCVListData, items)
+ + ndims * sizeof(DimensionInfo)
+ + mcvlist->nitems * itemsize);
+
+ for (i = 0; i < ndims; i++)
+ total_length += info[i].nbytes;
+
+ /* enforce arbitrary limit of 1MB */
+ if (total_length > (1024 * 1024))
+ elog(ERROR, "serialized MCV list exceeds 1MB (%ld)", total_length);
+
+ /* allocate space for the serialized MCV list, set header fields */
+ output = (bytea*)palloc0(total_length);
+ SET_VARSIZE(output, total_length);
+
+ /* 'data' points to the current position in the output buffer */
+ data = VARDATA(output);
+
+ /* MCV list header (number of items, ...) */
+ memcpy(data, mcvlist, offsetof(MCVListData, items));
+ data += offsetof(MCVListData, items);
+
+ /* information about the attributes */
+ memcpy(data, info, sizeof(DimensionInfo) * ndims);
+ data += sizeof(DimensionInfo) * ndims;
+
+ /* now serialize the deduplicated values for all attributes */
+ for (i = 0; i < ndims; i++)
+ {
+#ifdef USE_ASSERT_CHECKING
+ char *tmp = data; /* remember the starting point */
+#endif
+ for (j = 0; j < info[i].nvalues; j++)
+ {
+ Datum v = values[i][j];
+
+ if (info[i].typbyval) /* passed by value */
+ {
+ memcpy(data, &v, info[i].typlen);
+ data += info[i].typlen;
+ }
+ else if (info[i].typlen > 0) /* passed by reference */
+ {
+ memcpy(data, DatumGetPointer(v), info[i].typlen);
+ data += info[i].typlen;
+ }
+ else if (info[i].typlen == -1) /* varlena */
+ {
+ memcpy(data, DatumGetPointer(v), VARSIZE_ANY(v));
+ data += VARSIZE_ANY(v);
+ }
+ else if (info[i].typlen == -2) /* cstring */
+ {
+ memcpy(data, DatumGetPointer(v), strlen(DatumGetPointer(v))+1);
+ data += strlen(DatumGetPointer(v)) + 1; /* terminator */
+ }
+ }
+
+ /* make sure we got exactly the amount of data we expected */
+ Assert((data - tmp) == info[i].nbytes);
+ }
+
+ /* finally serialize the items, with uint16 indexes instead of the values */
+ for (i = 0; i < mcvlist->nitems; i++)
+ {
+ MCVItem mcvitem = mcvlist->items[i];
+
+ /* don't write beyond the allocated space */
+ Assert(data <= (char*)output + total_length - itemsize);
+
+ /* reset the item (we only allocate it once and reuse it) */
+ memset(item, 0, itemsize);
+
+ for (j = 0; j < ndims; j++)
+ {
+ Datum *v = NULL;
+
+ /* do the lookup only for non-NULL values */
+ if (mcvlist->items[i]->isnull[j])
+ continue;
+
+ v = (Datum*)bsearch_arg(&mcvitem->values[j], values[j],
+ info[j].nvalues, sizeof(Datum),
+ compare_scalars_simple, &ssup[j]);
+
+ Assert(v != NULL); /* serialization or deduplication error */
+
+ /* compute index within the array */
+ ITEM_INDEXES(item)[j] = (v - values[j]);
+
+ /* check the index is within expected bounds */
+ Assert(ITEM_INDEXES(item)[j] >= 0);
+ Assert(ITEM_INDEXES(item)[j] < info[j].nvalues);
+ }
+
+ /* copy NULL and frequency flags into the item */
+ memcpy(ITEM_NULLS(item, ndims), mcvitem->isnull, sizeof(bool) * ndims);
+ memcpy(ITEM_FREQUENCY(item, ndims), &mcvitem->frequency, sizeof(double));
+
+ /* copy the serialized item into the array */
+ memcpy(data, item, itemsize);
+
+ data += itemsize;
+ }
+
+ /* at this point we expect to match the total_length exactly */
+ Assert((data - (char*)output) == total_length);
+
+ return output;
+}
+
+/*
+ * deserialize MCV list from the varlena value
+ *
+ *
+ * We deserialize the MCV list fully, because we don't expect there to be
+ * a lot of duplicate values. But perhaps we should keep the MCV list in
+ * serialized form, just like histograms.
+ */
+MCVList
+deserialize_mv_mcvlist(bytea *data)
+{
+ int i, j;
+ Size expected_size;
+ MCVList mcvlist;
+ char *tmp;
+
+ int ndims, nitems, itemsize;
+ DimensionInfo *info = NULL;
+
+ uint16 *indexes = NULL;
+ Datum **values = NULL;
+
+ /* local allocation buffer (used only for deserialization) */
+ int bufflen;
+ char *buff;
+ char *ptr;
+
+ /* buffer used for the result */
+ int rbufflen;
+ char *rbuff;
+ char *rptr;
+
+ if (data == NULL)
+ return NULL;
+
+ /* we can't deserialize the MCV if there's not even a complete header */
+ expected_size = offsetof(MCVListData,items);
+
+ if (VARSIZE_ANY_EXHDR(data) < expected_size)
+ elog(ERROR, "invalid MCV Size %ld (expected at least %ld)",
+ VARSIZE_ANY_EXHDR(data), offsetof(MCVListData,items));
+
+ /* read the MCV list header */
+ mcvlist = (MCVList)palloc0(sizeof(MCVListData));
+
+ /* initialize pointer to the data part (skip the varlena header) */
+ tmp = VARDATA(data);
+
+ /* get the header and perform further sanity checks */
+ memcpy(mcvlist, tmp, offsetof(MCVListData,items));
+ tmp += offsetof(MCVListData,items);
+
+ if (mcvlist->magic != MVSTAT_MCV_MAGIC)
+ elog(ERROR, "invalid MCV magic %d (expected %dd)",
+ mcvlist->magic, MVSTAT_MCV_MAGIC);
+
+ if (mcvlist->type != MVSTAT_MCV_TYPE_BASIC)
+ elog(ERROR, "invalid MCV type %d (expected %dd)",
+ mcvlist->type, MVSTAT_MCV_TYPE_BASIC);
+
+ nitems = mcvlist->nitems;
+ ndims = mcvlist->ndimensions;
+ itemsize = ITEM_SIZE(ndims);
+
+ Assert((nitems > 0) && (nitems <= MVSTAT_MCVLIST_MAX_ITEMS));
+ Assert((ndims >= 2) && (ndims <= MVSTATS_MAX_DIMENSIONS));
+
+ /*
+ * Check amount of data including DimensionInfo for all dimensions and
+ * also the serialized items (including uint16 indexes). Also, walk
+ * through the dimension information and add it to the sum.
+ */
+ expected_size += ndims * sizeof(DimensionInfo) +
+ (nitems * itemsize);
+
+ /* check that we have at least the DimensionInfo records */
+ if (VARSIZE_ANY_EXHDR(data) < expected_size)
+ elog(ERROR, "invalid MCV size %ld (expected %ld)",
+ VARSIZE_ANY_EXHDR(data), expected_size);
+
+ info = (DimensionInfo*)(tmp);
+ tmp += ndims * sizeof(DimensionInfo);
+
+ /* account for the value arrays */
+ for (i = 0; i < ndims; i++)
+ {
+ Assert(info[i].nvalues >= 0);
+ Assert(info[i].nbytes >= 0);
+
+ expected_size += info[i].nbytes;
+ }
+
+ if (VARSIZE_ANY_EXHDR(data) != expected_size)
+ elog(ERROR, "invalid MCV size %ld (expected %ld)",
+ VARSIZE_ANY_EXHDR(data), expected_size);
+
+ /* looks OK - not corrupted or something */
+
+ /*
+ * Allocate one large chunk of memory for the intermediate data, needed
+ * only for deserializing the MCV list (and allocate densely to minimize
+ * the palloc overhead).
+ *
+ * Let's see how much space we'll actually need, and also include space
+ * for the array with pointers.
+ */
+ bufflen = sizeof(Datum*) * ndims; /* space for pointers */
+
+ for (i = 0; i < ndims; i++)
+ /* for full-size byval types, we reuse the serialized value */
+ if (! (info[i].typbyval && info[i].typlen == sizeof(Datum)))
+ bufflen += (sizeof(Datum) * info[i].nvalues);
+
+ buff = palloc0(bufflen);
+ ptr = buff;
+
+ values = (Datum**)buff;
+ ptr += (sizeof(Datum*) * ndims);
+
+ /*
+ * XXX This uses pointers to the original data array (the types not passed
+ * by value), so when someone frees the memory, e.g. by doing something
+ * like this:
+ *
+ * bytea * data = ... fetch the data from catalog ...
+ * MCVList mcvlist = deserialize_mv_mcvlist(data);
+ * pfree(data);
+ *
+ * then 'mcvlist' references the freed memory. Should copy the pieces.
+ */
+ for (i = 0; i < ndims; i++)
+ {
+ if (info[i].typbyval)
+ {
+ /* passed by value / Datum - simply reuse the array */
+ if (info[i].typlen == sizeof(Datum))
+ {
+ values[i] = (Datum*)tmp;
+ tmp += info[i].nbytes;
+ }
+ else
+ {
+ values[i] = (Datum*)ptr;
+ ptr += (sizeof(Datum) * info[i].nvalues);
+
+ for (j = 0; j < info[i].nvalues; j++)
+ {
+ /* copy the value into the Datum array */
+ memcpy(&values[i][j], tmp, info[i].typlen);
+ tmp += info[i].typlen;
+ }
+ }
+ }
+ else
+ {
+ /* all the other types need a chunk of the buffer */
+ values[i] = (Datum*)ptr;
+ ptr += (sizeof(Datum) * info[i].nvalues);
+
+ /* pased by reference, but fixed length (name, tid, ...) */
+ if (info[i].typlen > 0)
+ {
+ for (j = 0; j < info[i].nvalues; j++)
+ {
+ /* just point into the array */
+ values[i][j] = PointerGetDatum(tmp);
+ tmp += info[i].typlen;
+ }
+ }
+ else if (info[i].typlen == -1)
+ {
+ /* varlena */
+ for (j = 0; j < info[i].nvalues; j++)
+ {
+ /* just point into the array */
+ values[i][j] = PointerGetDatum(tmp);
+ tmp += VARSIZE_ANY(tmp);
+ }
+ }
+ else if (info[i].typlen == -2)
+ {
+ /* cstring */
+ for (j = 0; j < info[i].nvalues; j++)
+ {
+ /* just point into the array */
+ values[i][j] = PointerGetDatum(tmp);
+ tmp += (strlen(tmp) + 1); /* don't forget the \0 */
+ }
+ }
+ }
+ }
+
+ /* we should have exhausted the buffer exactly */
+ Assert((ptr - buff) == bufflen);
+
+ /* allocate space for all the MCV items in a single piece */
+ rbufflen = (sizeof(MCVItem) + sizeof(MCVItemData) +
+ sizeof(Datum)*ndims + sizeof(bool)*ndims) * nitems;
+
+ rbuff = palloc0(rbufflen);
+ rptr = rbuff;
+
+ mcvlist->items = (MCVItem*)rbuff;
+ rptr += (sizeof(MCVItem) * nitems);
+
+ for (i = 0; i < nitems; i++)
+ {
+ MCVItem item = (MCVItem)rptr;
+ rptr += (sizeof(MCVItemData));
+
+ item->values = (Datum*)rptr;
+ rptr += (sizeof(Datum)*ndims);
+
+ item->isnull = (bool*)rptr;
+ rptr += (sizeof(bool) *ndims);
+
+ /* just point to the right place */
+ indexes = ITEM_INDEXES(tmp);
+
+ memcpy(item->isnull, ITEM_NULLS(tmp, ndims), sizeof(bool) * ndims);
+ memcpy(&item->frequency, ITEM_FREQUENCY(tmp, ndims), sizeof(double));
+
+#ifdef USE_ASSERT_CHECKING
+ for (j = 0; j < ndims; j++)
+ Assert(indexes[j] <= UINT16_MAX);
+#endif
+
+ /* translate the values */
+ for (j = 0; j < ndims; j++)
+ if (! item->isnull[j])
+ item->values[j] = values[j][indexes[j]];
+
+ mcvlist->items[i] = item;
+
+ tmp += ITEM_SIZE(ndims);
+
+ Assert(tmp <= (char*)data + VARSIZE_ANY(data));
+ }
+
+ /* check that we processed all the data */
+ Assert(tmp == (char*)data + VARSIZE_ANY(data));
+
+ /* release the temporary buffer */
+ pfree(buff);
+
+ return mcvlist;
+}
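+
+/*
+ * Round-trip sketch (illustrative only): serializing and then
+ * deserializing a list is expected to reproduce it, e.g.
+ *
+ *   bytea *data = serialize_mv_mcvlist(mcvlist, attrs, stats);
+ *   MCVList copy = deserialize_mv_mcvlist(data);
+ *   Assert(copy->nitems == mcvlist->nitems);
+ */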
+
+/*
+ * SRF with details about items of an MCV list:
+ *
+ * - item ID (0 .. nitems-1)
+ * - values (string array)
+ * - nulls (boolean array)
+ * - frequency (double precision)
+ *
+ * The input is the OID of the statistics, and no rows are returned if
+ * the statistics contains no MCV list.
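+ *
+ * Example invocation (with a hypothetical statistics OID):
+ *
+ *   SELECT * FROM pg_mv_mcv_items(12345);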
+ */
+PG_FUNCTION_INFO_V1(pg_mv_mcv_items);
+
+Datum
+pg_mv_mcv_items(PG_FUNCTION_ARGS)
+{
+ FuncCallContext *funcctx;
+ int call_cntr;
+ int max_calls;
+ TupleDesc tupdesc;
+ AttInMetadata *attinmeta;
+
+ /* stuff done only on the first call of the function */
+ if (SRF_IS_FIRSTCALL())
+ {
+ MemoryContext oldcontext;
+ MCVList mcvlist;
+
+ /* create a function context for cross-call persistence */
+ funcctx = SRF_FIRSTCALL_INIT();
+
+ /* switch to memory context appropriate for multiple function calls */
+ oldcontext = MemoryContextSwitchTo(funcctx->multi_call_memory_ctx);
+
+ mcvlist = load_mv_mcvlist(PG_GETARG_OID(0));
+
+ funcctx->user_fctx = mcvlist;
+
+ /* total number of tuples to be returned */
+ funcctx->max_calls = 0;
+ if (funcctx->user_fctx != NULL)
+ funcctx->max_calls = mcvlist->nitems;
+
+ /* Build a tuple descriptor for our result type */
+ if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("function returning record called in context "
+ "that cannot accept type record")));
+
+ /* build metadata needed later to produce tuples from raw C-strings */
+ attinmeta = TupleDescGetAttInMetadata(tupdesc);
+ funcctx->attinmeta = attinmeta;
+
+ MemoryContextSwitchTo(oldcontext);
+ }
+
+ /* stuff done on every call of the function */
+ funcctx = SRF_PERCALL_SETUP();
+
+ call_cntr = funcctx->call_cntr;
+ max_calls = funcctx->max_calls;
+ attinmeta = funcctx->attinmeta;
+
+ if (call_cntr < max_calls) /* do when there is more left to send */
+ {
+ char **values;
+ HeapTuple tuple;
+ Datum result;
+ int2vector *stakeys;
+ Oid relid;
+
+ char *buff = palloc0(1024);
+ char *format;
+
+ int i;
+
+ Oid *outfuncs;
+ FmgrInfo *fmgrinfo;
+
+ MCVList mcvlist;
+ MCVItem item;
+
+ mcvlist = (MCVList)funcctx->user_fctx;
+
+ Assert(call_cntr < mcvlist->nitems);
+
+ item = mcvlist->items[call_cntr];
+
+ stakeys = find_mv_attnums(PG_GETARG_OID(0), &relid);
+
+ /*
+ * Prepare a values array for building the returned tuple. This should
+ * be an array of C strings which will be processed later by the type
+ * input functions.
+ */
+ values = (char **) palloc(4 * sizeof(char *));
+
+ values[0] = (char *) palloc(64 * sizeof(char));
+
+ /* arrays */
+ values[1] = (char *) palloc0(1024 * sizeof(char));
+ values[2] = (char *) palloc0(1024 * sizeof(char));
+
+ /* frequency */
+ values[3] = (char *) palloc(64 * sizeof(char));
+
+ outfuncs = (Oid*)palloc0(sizeof(Oid) * mcvlist->ndimensions);
+ fmgrinfo = (FmgrInfo*)palloc0(sizeof(FmgrInfo) * mcvlist->ndimensions);
+
+ for (i = 0; i < mcvlist->ndimensions; i++)
+ {
+ bool isvarlena;
+
+ getTypeOutputInfo(get_atttype(relid, stakeys->values[i]),
+ &outfuncs[i], &isvarlena);
+
+ fmgr_info(outfuncs[i], &fmgrinfo[i]);
+ }
+
+ snprintf(values[0], 64, "%d", call_cntr); /* item ID */
+
+ for (i = 0; i < mcvlist->ndimensions; i++)
+ {
+ Datum val, valout;
+
+ format = "%s, %s";
+ if (i == 0)
+ format = "{%s%s";
+ else if (i == mcvlist->ndimensions-1)
+ format = "%s, %s}";
+
+ if (item->isnull[i])
+ valout = CStringGetDatum("NULL");
+ else
+ {
+ val = item->values[i];
+ valout = FunctionCall1(&fmgrinfo[i], val);
+ }
+
+ snprintf(buff, 1024, format, values[1], DatumGetPointer(valout));
+ strncpy(values[1], buff, 1023);
+ buff[0] = '\0';
+
+ snprintf(buff, 1024, format, values[2], item->isnull[i] ? "t" : "f");
+ strncpy(values[2], buff, 1023);
+ buff[0] = '\0';
+ }
+
+ snprintf(values[3], 64, "%f", item->frequency); /* frequency */
+
+ /* build a tuple */
+ tuple = BuildTupleFromCStrings(attinmeta, values);
+
+ /* make the tuple into a datum */
+ result = HeapTupleGetDatum(tuple);
+
+ /* clean up (this is not really necessary) */
+ pfree(values[0]);
+ pfree(values[1]);
+ pfree(values[2]);
+ pfree(values[3]);
+
+ pfree(values);
+
+ SRF_RETURN_NEXT(funcctx, result);
+ }
+ else /* do when there is no more left */
+ {
+ SRF_RETURN_DONE(funcctx);
+ }
+}
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 8ce9c0e..2c22d31 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -2109,8 +2109,9 @@ describeOneTableDetails(const char *schemaname,
{
printfPQExpBuffer(&buf,
"SELECT oid, stanamespace::regnamespace AS nsp, staname, stakeys,\n"
- " deps_enabled,\n"
- " deps_built,\n"
+ " deps_enabled, mcv_enabled,\n"
+ " deps_built, mcv_built,\n"
+ " mcv_max_items,\n"
" (SELECT string_agg(attname::text,', ')\n"
" FROM ((SELECT unnest(stakeys) AS attnum) s\n"
" JOIN pg_attribute a ON (starelid = a.attrelid and a.attnum = s.attnum))) AS attnums\n"
@@ -2128,6 +2129,8 @@ describeOneTableDetails(const char *schemaname,
printTableAddFooter(&cont, _("Statistics:"));
for (i = 0; i < tuples; i++)
{
+ bool first = true;
+
printfPQExpBuffer(&buf, " ");
/* statistics name (qualified with namespace) */
@@ -2137,10 +2140,22 @@ describeOneTableDetails(const char *schemaname,
/* options */
if (!strcmp(PQgetvalue(result, i, 4), "t"))
- appendPQExpBuffer(&buf, "(dependencies)");
+ {
+ appendPQExpBuffer(&buf, "(dependencies");
+ first = false;
+ }
+
+ if (!strcmp(PQgetvalue(result, i, 5), "t"))
+ {
+ if (! first)
+ appendPQExpBuffer(&buf, ", mcv");
+ else
+ appendPQExpBuffer(&buf, "(mcv");
+ first = false;
+ }
- appendPQExpBuffer(&buf, " ON (%s)",
- PQgetvalue(result, i, 6));
+ appendPQExpBuffer(&buf, ") ON (%s)",
+ PQgetvalue(result, i, 9));
printTableAddFooter(&cont, buf.data);
}
diff --git a/src/include/catalog/pg_mv_statistic.h b/src/include/catalog/pg_mv_statistic.h
index c74af47..3529b03 100644
--- a/src/include/catalog/pg_mv_statistic.h
+++ b/src/include/catalog/pg_mv_statistic.h
@@ -38,15 +38,21 @@ CATALOG(pg_mv_statistic,3381)
/* statistics requested to build */
bool deps_enabled; /* analyze dependencies? */
+ bool mcv_enabled; /* build MCV list? */
+
+ /* MCV size */
+ int32 mcv_max_items; /* max MCV items */
/* statistics that are available (if requested) */
bool deps_built; /* dependencies were built */
+ bool mcv_built; /* MCV list was built */
/* variable-length fields start here, but we allow direct access to stakeys */
int2vector stakeys; /* array of column keys */
#ifdef CATALOG_VARLEN
bytea stadeps; /* dependencies (serialized) */
+ bytea stamcv; /* MCV list (serialized) */
#endif
} FormData_pg_mv_statistic;
@@ -62,14 +68,18 @@ typedef FormData_pg_mv_statistic *Form_pg_mv_statistic;
* compiler constants for pg_mv_statistic
* ----------------
*/
-#define Natts_pg_mv_statistic 8
+#define Natts_pg_mv_statistic 12
#define Anum_pg_mv_statistic_starelid 1
#define Anum_pg_mv_statistic_staname 2
#define Anum_pg_mv_statistic_stanamespace 3
#define Anum_pg_mv_statistic_staowner 4
#define Anum_pg_mv_statistic_deps_enabled 5
-#define Anum_pg_mv_statistic_deps_built 6
-#define Anum_pg_mv_statistic_stakeys 7
-#define Anum_pg_mv_statistic_stadeps 8
+#define Anum_pg_mv_statistic_mcv_enabled 6
+#define Anum_pg_mv_statistic_mcv_max_items 7
+#define Anum_pg_mv_statistic_deps_built 8
+#define Anum_pg_mv_statistic_mcv_built 9
+#define Anum_pg_mv_statistic_stakeys 10
+#define Anum_pg_mv_statistic_stadeps 11
+#define Anum_pg_mv_statistic_stamcv 12
#endif /* PG_MV_STATISTIC_H */
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
index ff2d797..f8ceabf 100644
--- a/src/include/catalog/pg_proc.h
+++ b/src/include/catalog/pg_proc.h
@@ -2670,6 +2670,10 @@ DATA(insert OID = 3998 ( pg_mv_stats_dependencies_info PGNSP PGUID 12 1 0 0
DESCR("multivariate stats: functional dependencies info");
DATA(insert OID = 3999 ( pg_mv_stats_dependencies_show PGNSP PGUID 12 1 0 0 0 f f f f t f i s 1 0 25 "17" _null_ _null_ _null_ _null_ _null_ pg_mv_stats_dependencies_show _null_ _null_ _null_ ));
DESCR("multivariate stats: functional dependencies show");
+DATA(insert OID = 3376 ( pg_mv_stats_mcvlist_info PGNSP PGUID 12 1 0 0 0 f f f f t f i s 1 0 25 "17" _null_ _null_ _null_ _null_ _null_ pg_mv_stats_mcvlist_info _null_ _null_ _null_ ));
+DESCR("multi-variate statistics: MCV list info");
+DATA(insert OID = 3373 ( pg_mv_mcv_items PGNSP PGUID 12 1 1000 0 0 f f f f t t i s 1 0 2249 "26" "{26,23,1009,1000,701}" "{i,o,o,o,o}" "{oid,index,values,nulls,frequency}" _null_ _null_ pg_mv_mcv_items _null_ _null_ _null_ ));
+DESCR("details about MCV list items");
DATA(insert OID = 1928 ( pg_stat_get_numscans PGNSP PGUID 12 1 0 0 0 f f f f t f s r 1 0 20 "26" _null_ _null_ _null_ _null_ _null_ pg_stat_get_numscans _null_ _null_ _null_ ));
DESCR("statistics: number of scans done for table/index");
diff --git a/src/include/nodes/relation.h b/src/include/nodes/relation.h
index e10dcf1..2bcd582 100644
--- a/src/include/nodes/relation.h
+++ b/src/include/nodes/relation.h
@@ -653,9 +653,11 @@ typedef struct MVStatisticInfo
/* enabled statistics */
bool deps_enabled; /* functional dependencies enabled */
+ bool mcv_enabled; /* MCV list enabled */
/* built/available statistics */
bool deps_built; /* functional dependencies built */
+ bool mcv_built; /* MCV list built */
/* columns in the statistics (attnums) */
int2vector *stakeys; /* attnums of the columns covered */
diff --git a/src/include/utils/mvstats.h b/src/include/utils/mvstats.h
index c6f45ab..ce7c3ad 100644
--- a/src/include/utils/mvstats.h
+++ b/src/include/utils/mvstats.h
@@ -52,30 +52,89 @@ typedef MVDependenciesData* MVDependencies;
#define MVSTAT_DEPS_TYPE_BASIC 1 /* basic dependencies type */
/*
+ * Multivariate MCV (most-common value) lists
+ *
+ * A straightforward extension of the per-column MCV list - i.e. a list
+ * (array) of combinations of attribute values, together with a frequency
+ * and null flags.
+ */
+typedef struct MCVItemData {
+ double frequency; /* frequency of this combination */
+ bool *isnull; /* flags of NULL values (up to 32 columns) */
+ Datum *values; /* variable-length (ndimensions) */
+} MCVItemData;
+
+typedef MCVItemData *MCVItem;
+
+/* multivariate MCV list - essentially an array of MCV items */
+typedef struct MCVListData {
+ uint32 magic; /* magic constant marker */
+ uint32 type; /* type of MCV list (BASIC) */
+ uint32 ndimensions; /* number of dimensions */
+ uint32 nitems; /* number of MCV items in the array */
+ MCVItem *items; /* array of MCV items */
+} MCVListData;
+
+typedef MCVListData *MCVList;
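+
+/*
+ * Example: for statistics on columns (a, b), an MCVItem with
+ * values = {10, 20}, isnull = {false, false} and frequency = 0.01
+ * represents the combination (a = 10, b = 20), observed in roughly 1%
+ * of the sampled rows.
+ */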
+
+/* used to flag stats serialized to bytea */
+#define MVSTAT_MCV_MAGIC 0xE1A651C2 /* marks serialized bytea */
+#define MVSTAT_MCV_TYPE_BASIC 1 /* basic MCV list type */
+
+/*
+ * Limits used for mcv_max_items option, i.e. we're always guaranteed
+ * to have space for at least MVSTAT_MCVLIST_MIN_ITEMS, and we cannot
+ * have more than MVSTAT_MCVLIST_MAX_ITEMS items.
+ *
+ * This is just a boundary for the 'max' threshold - the actual list
+ * may of course contain fewer items than MVSTAT_MCVLIST_MIN_ITEMS.
+ */
+#define MVSTAT_MCVLIST_MIN_ITEMS 128 /* min items in MCV list */
+#define MVSTAT_MCVLIST_MAX_ITEMS 8192 /* max items in MCV list */
+
+/*
* TODO Maybe fetching the histogram/MCV list separately is inefficient?
* Consider adding a single `fetch_stats` method, fetching all
* stats specified using flags (or something like that).
*/
MVDependencies load_mv_dependencies(Oid mvoid);
+MCVList load_mv_mcvlist(Oid mvoid);
bytea * serialize_mv_dependencies(MVDependencies dependencies);
+bytea * serialize_mv_mcvlist(MCVList mcvlist, int2vector *attrs,
+ VacAttrStats **stats);
/* deserialization of stats (serialization is private to analyze) */
MVDependencies deserialize_mv_dependencies(bytea * data);
+MCVList deserialize_mv_mcvlist(bytea * data);
+
+/*
+ * Returns index of the attribute number within the vector (i.e. a
+ * dimension within the stats).
+ */
+int mv_get_index(AttrNumber varattno, int2vector * stakeys);
+
+int2vector* find_mv_attnums(Oid mvoid, Oid *relid);
/* FIXME this probably belongs somewhere else (not to operations stats) */
extern Datum pg_mv_stats_dependencies_info(PG_FUNCTION_ARGS);
extern Datum pg_mv_stats_dependencies_show(PG_FUNCTION_ARGS);
+extern Datum pg_mv_stats_mcvlist_info(PG_FUNCTION_ARGS);
+extern Datum pg_mv_mcvlist_items(PG_FUNCTION_ARGS);
MVDependencies
-build_mv_dependencies(int numrows, HeapTuple *rows,
- int2vector *attrs,
- VacAttrStats **stats);
+build_mv_dependencies(int numrows, HeapTuple *rows, int2vector *attrs,
+ VacAttrStats **stats);
+
+MCVList
+build_mv_mcvlist(int numrows, HeapTuple *rows, int2vector *attrs,
+ VacAttrStats **stats, int *numrows_filtered);
void build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
- int natts, VacAttrStats **vacattrstats);
+ int natts, VacAttrStats **vacattrstats);
-void update_mv_stats(Oid relid, MVDependencies dependencies, int2vector *attrs);
+void update_mv_stats(Oid relid, MVDependencies dependencies, MCVList mcvlist,
+ int2vector *attrs, VacAttrStats **stats);
#endif
diff --git a/src/test/regress/expected/mv_mcv.out b/src/test/regress/expected/mv_mcv.out
new file mode 100644
index 0000000..075320b
--- /dev/null
+++ b/src/test/regress/expected/mv_mcv.out
@@ -0,0 +1,207 @@
+-- data type passed by value
+CREATE TABLE mcv_list (
+ a INT,
+ b INT,
+ c INT
+);
+-- unknown column
+CREATE STATISTICS s4 ON mcv_list (unknown_column) WITH (mcv);
+ERROR: column "unknown_column" referenced in statistics does not exist
+-- single column
+CREATE STATISTICS s4 ON mcv_list (a) WITH (mcv);
+ERROR: multivariate stats require 2 or more columns
+-- single column, duplicated
+CREATE STATISTICS s4 ON mcv_list (a, a) WITH (mcv);
+ERROR: duplicate column name in statistics definition
+-- two columns, one duplicated
+CREATE STATISTICS s4 ON mcv_list (a, a, b) WITH (mcv);
+ERROR: duplicate column name in statistics definition
+-- unknown option
+CREATE STATISTICS s4 ON mcv_list (a, b, c) WITH (unknown_option);
+ERROR: unrecognized STATISTICS option "unknown_option"
+-- missing MCV statistics
+CREATE STATISTICS s4 ON mcv_list (a, b, c) WITH (dependencies, max_mcv_items=200);
+ERROR: option 'mcv' is required by other options(s)
+-- invalid mcv_max_items value / too low
+CREATE STATISTICS s4 ON mcv_list (a, b, c) WITH (mcv, max_mcv_items=10);
+ERROR: max number of MCV items must be at least 128
+-- invalid mcv_max_items value / too high
+CREATE STATISTICS s4 ON mcv_list (a, b, c) WITH (mcv, max_mcv_items=10000);
+ERROR: max number of MCV items is 8192
+-- correct command
+CREATE STATISTICS s4 ON mcv_list (a, b, c) WITH (mcv);
+-- random data
+INSERT INTO mcv_list
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+ANALYZE mcv_list;
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+ mcv_enabled | mcv_built | pg_mv_stats_mcvlist_info
+-------------+-----------+--------------------------
+ t | f |
+(1 row)
+
+TRUNCATE mcv_list;
+-- a => b, a => c, b => c
+INSERT INTO mcv_list
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE mcv_list;
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+ mcv_enabled | mcv_built | pg_mv_stats_mcvlist_info
+-------------+-----------+--------------------------
+ t | t | nitems=1000
+(1 row)
+
+TRUNCATE mcv_list;
+-- a => b, a => c
+INSERT INTO mcv_list
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE mcv_list;
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+ mcv_enabled | mcv_built | pg_mv_stats_mcvlist_info
+-------------+-----------+--------------------------
+ t | t | nitems=1000
+(1 row)
+
+TRUNCATE mcv_list;
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO mcv_list
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX mcv_idx ON mcv_list (a, b);
+ANALYZE mcv_list;
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+ mcv_enabled | mcv_built | pg_mv_stats_mcvlist_info
+-------------+-----------+--------------------------
+ t | t | nitems=100
+(1 row)
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mcv_list WHERE a = 10 AND b = 5;
+ QUERY PLAN
+--------------------------------------------
+ Bitmap Heap Scan on mcv_list
+ Recheck Cond: ((a = 10) AND (b = 5))
+ -> Bitmap Index Scan on mcv_idx
+ Index Cond: ((a = 10) AND (b = 5))
+(4 rows)
+
+DROP TABLE mcv_list;
+-- varlena type (text)
+CREATE TABLE mcv_list (
+ a TEXT,
+ b TEXT,
+ c TEXT
+);
+CREATE STATISTICS s5 ON mcv_list (a, b, c) WITH (mcv);
+-- random data
+INSERT INTO mcv_list
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+ANALYZE mcv_list;
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+ mcv_enabled | mcv_built | pg_mv_stats_mcvlist_info
+-------------+-----------+--------------------------
+ t | f |
+(1 row)
+
+TRUNCATE mcv_list;
+-- a => b, a => c, b => c
+INSERT INTO mcv_list
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE mcv_list;
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+ mcv_enabled | mcv_built | pg_mv_stats_mcvlist_info
+-------------+-----------+--------------------------
+ t | t | nitems=1000
+(1 row)
+
+TRUNCATE mcv_list;
+-- a => b, a => c
+INSERT INTO mcv_list
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE mcv_list;
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+ mcv_enabled | mcv_built | pg_mv_stats_mcvlist_info
+-------------+-----------+--------------------------
+ t | t | nitems=1000
+(1 row)
+
+TRUNCATE mcv_list;
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO mcv_list
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX mcv_idx ON mcv_list (a, b);
+ANALYZE mcv_list;
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+ mcv_enabled | mcv_built | pg_mv_stats_mcvlist_info
+-------------+-----------+--------------------------
+ t | t | nitems=100
+(1 row)
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mcv_list WHERE a = '10' AND b = '5';
+ QUERY PLAN
+------------------------------------------------------------
+ Bitmap Heap Scan on mcv_list
+ Recheck Cond: ((a = '10'::text) AND (b = '5'::text))
+ -> Bitmap Index Scan on mcv_idx
+ Index Cond: ((a = '10'::text) AND (b = '5'::text))
+(4 rows)
+
+TRUNCATE mcv_list;
+-- check explain (expect bitmap index scan, not plain index scan) with NULLs
+INSERT INTO mcv_list
+ SELECT
+ (CASE WHEN i/10000 = 0 THEN NULL ELSE i/10000 END),
+ (CASE WHEN i/20000 = 0 THEN NULL ELSE i/20000 END),
+ (CASE WHEN i/40000 = 0 THEN NULL ELSE i/40000 END)
+ FROM generate_series(1,1000000) s(i);
+ANALYZE mcv_list;
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+ mcv_enabled | mcv_built | pg_mv_stats_mcvlist_info
+-------------+-----------+--------------------------
+ t | t | nitems=100
+(1 row)
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mcv_list WHERE a IS NULL AND b IS NULL;
+ QUERY PLAN
+---------------------------------------------------
+ Bitmap Heap Scan on mcv_list
+ Recheck Cond: ((a IS NULL) AND (b IS NULL))
+ -> Bitmap Index Scan on mcv_idx
+ Index Cond: ((a IS NULL) AND (b IS NULL))
+(4 rows)
+
+DROP TABLE mcv_list;
+-- NULL values (mix of int and text columns)
+CREATE TABLE mcv_list (
+ a INT,
+ b TEXT,
+ c INT,
+ d TEXT
+);
+CREATE STATISTICS s6 ON mcv_list (a, b, c, d) WITH (mcv);
+INSERT INTO mcv_list
+ SELECT
+ mod(i, 100),
+ (CASE WHEN mod(i, 200) = 0 THEN NULL ELSE mod(i,200) END),
+ mod(i, 400),
+ (CASE WHEN mod(i, 300) = 0 THEN NULL ELSE mod(i,600) END)
+ FROM generate_series(1,10000) s(i);
+ANALYZE mcv_list;
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+ mcv_enabled | mcv_built | pg_mv_stats_mcvlist_info
+-------------+-----------+--------------------------
+ t | t | nitems=1200
+(1 row)
+
+DROP TABLE mcv_list;
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 06f2231..3d55ffe 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1373,7 +1373,9 @@ pg_mv_stats| SELECT n.nspname AS schemaname,
s.staname,
s.stakeys AS attnums,
length(s.stadeps) AS depsbytes,
- pg_mv_stats_dependencies_info(s.stadeps) AS depsinfo
+ pg_mv_stats_dependencies_info(s.stadeps) AS depsinfo,
+ length(s.stamcv) AS mcvbytes,
+ pg_mv_stats_mcvlist_info(s.stamcv) AS mcvinfo
FROM ((pg_mv_statistic s
JOIN pg_class c ON ((c.oid = s.starelid)))
LEFT JOIN pg_namespace n ON ((n.oid = c.relnamespace)));
diff --git a/src/test/regress/parallel_schedule b/src/test/regress/parallel_schedule
index 4f2ffb8..85d94f1 100644
--- a/src/test/regress/parallel_schedule
+++ b/src/test/regress/parallel_schedule
@@ -112,4 +112,4 @@ test: event_trigger
test: stats
# run tests of multivariate stats
-test: mv_dependencies
+test: mv_dependencies mv_mcv
diff --git a/src/test/regress/serial_schedule b/src/test/regress/serial_schedule
index 097a04f..6584d73 100644
--- a/src/test/regress/serial_schedule
+++ b/src/test/regress/serial_schedule
@@ -163,3 +163,4 @@ test: xml
test: event_trigger
test: stats
test: mv_dependencies
+test: mv_mcv
diff --git a/src/test/regress/sql/mv_mcv.sql b/src/test/regress/sql/mv_mcv.sql
new file mode 100644
index 0000000..b31d32d
--- /dev/null
+++ b/src/test/regress/sql/mv_mcv.sql
@@ -0,0 +1,178 @@
+-- data type passed by value
+CREATE TABLE mcv_list (
+ a INT,
+ b INT,
+ c INT
+);
+
+-- unknown column
+CREATE STATISTICS s4 ON mcv_list (unknown_column) WITH (mcv);
+
+-- single column
+CREATE STATISTICS s4 ON mcv_list (a) WITH (mcv);
+
+-- single column, duplicated
+CREATE STATISTICS s4 ON mcv_list (a, a) WITH (mcv);
+
+-- two columns, one duplicated
+CREATE STATISTICS s4 ON mcv_list (a, a, b) WITH (mcv);
+
+-- unknown option
+CREATE STATISTICS s4 ON mcv_list (a, b, c) WITH (unknown_option);
+
+-- missing MCV statistics
+CREATE STATISTICS s4 ON mcv_list (a, b, c) WITH (dependencies, max_mcv_items=200);
+
+-- invalid mcv_max_items value / too low
+CREATE STATISTICS s4 ON mcv_list (a, b, c) WITH (mcv, max_mcv_items=10);
+
+-- invalid mcv_max_items value / too high
+CREATE STATISTICS s4 ON mcv_list (a, b, c) WITH (mcv, max_mcv_items=10000);
+
+-- correct command
+CREATE STATISTICS s4 ON mcv_list (a, b, c) WITH (mcv);
+
+-- random data
+INSERT INTO mcv_list
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+
+ANALYZE mcv_list;
+
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+
+TRUNCATE mcv_list;
+
+-- a => b, a => c, b => c
+INSERT INTO mcv_list
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+
+ANALYZE mcv_list;
+
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+
+TRUNCATE mcv_list;
+
+-- a => b, a => c
+INSERT INTO mcv_list
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+
+ANALYZE mcv_list;
+
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+
+TRUNCATE mcv_list;
+
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO mcv_list
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX mcv_idx ON mcv_list (a, b);
+
+ANALYZE mcv_list;
+
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mcv_list WHERE a = 10 AND b = 5;
+
+DROP TABLE mcv_list;
+
+-- varlena type (text)
+CREATE TABLE mcv_list (
+ a TEXT,
+ b TEXT,
+ c TEXT
+);
+
+CREATE STATISTICS s5 ON mcv_list (a, b, c) WITH (mcv);
+
+-- random data
+INSERT INTO mcv_list
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+
+ANALYZE mcv_list;
+
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+
+TRUNCATE mcv_list;
+
+-- a => b, a => c, b => c
+INSERT INTO mcv_list
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+
+ANALYZE mcv_list;
+
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+
+TRUNCATE mcv_list;
+
+-- a => b, a => c
+INSERT INTO mcv_list
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE mcv_list;
+
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+
+TRUNCATE mcv_list;
+
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO mcv_list
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX mcv_idx ON mcv_list (a, b);
+ANALYZE mcv_list;
+
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mcv_list WHERE a = '10' AND b = '5';
+
+TRUNCATE mcv_list;
+
+-- check explain (expect bitmap index scan, not plain index scan) with NULLs
+INSERT INTO mcv_list
+ SELECT
+ (CASE WHEN i/10000 = 0 THEN NULL ELSE i/10000 END),
+ (CASE WHEN i/20000 = 0 THEN NULL ELSE i/20000 END),
+ (CASE WHEN i/40000 = 0 THEN NULL ELSE i/40000 END)
+ FROM generate_series(1,1000000) s(i);
+ANALYZE mcv_list;
+
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mcv_list WHERE a IS NULL AND b IS NULL;
+
+DROP TABLE mcv_list;
+
+-- NULL values (mix of int and text columns)
+CREATE TABLE mcv_list (
+ a INT,
+ b TEXT,
+ c INT,
+ d TEXT
+);
+
+CREATE STATISTICS s6 ON mcv_list (a, b, c, d) WITH (mcv);
+
+INSERT INTO mcv_list
+ SELECT
+ mod(i, 100),
+ (CASE WHEN mod(i, 200) = 0 THEN NULL ELSE mod(i,200) END),
+ mod(i, 400),
+ (CASE WHEN mod(i, 300) = 0 THEN NULL ELSE mod(i,600) END)
+ FROM generate_series(1,10000) s(i);
+
+ANALYZE mcv_list;
+
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+
+DROP TABLE mcv_list;
--
2.5.0
0003-clause-reduction-using-functional-dependencies.patch
From 70cf8ed9f0d161a335be1e72a25f325261ea9e46 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@pgaddict.com>
Date: Mon, 6 Apr 2015 19:42:18 +0200
Subject: [PATCH 3/9] clause reduction using functional dependencies
During planning, use functional dependencies to decide which clauses to
skip during cardinality estimation. Initial and rather simplistic
implementation.
This only works with regular WHERE clauses, not with clauses used as
join conditions.
Note: The clause_is_mv_compatible() needs to identify the relation (so
that we can fetch the list of multivariate stats by OID).
planner_rt_fetch() seems like the appropriate way to get the relation
OID, but apparently it only works with simple vars. Maybe
examine_variable() would make this work with more complex vars too?
Includes regression tests analyzing functional dependencies (part of
ANALYZE) on several datasets (no dependencies, no transitive
dependencies, ...).
Checks that a query with conditions on two columns, where one (B) is
functionally dependent on the other one (A), correctly ignores the
clause on (B) and chooses bitmap index scan instead of plain index scan
(which is what happens otherwise, thanks to assumption of
independence).
Note: Functional dependencies only work with equality clauses, no
inequalities etc.
---
src/backend/optimizer/path/clausesel.c | 505 ++++++++++++++++-
src/backend/utils/mvstats/README.dependencies | 63 +--
src/backend/utils/mvstats/README.stats | 36 ++
src/backend/utils/mvstats/common.c | 25 +-
src/backend/utils/mvstats/common.h | 3 +
src/backend/utils/mvstats/dependencies.c | 775 +++++++++++++++++---------
src/include/utils/mvstats.h | 21 +-
src/test/regress/expected/mv_dependencies.out | 172 ++++++
src/test/regress/parallel_schedule | 3 +
src/test/regress/serial_schedule | 1 +
src/test/regress/sql/mv_dependencies.sql | 150 +++++
11 files changed, 1457 insertions(+), 297 deletions(-)
create mode 100644 src/backend/utils/mvstats/README.stats
create mode 100644 src/test/regress/expected/mv_dependencies.out
create mode 100644 src/test/regress/sql/mv_dependencies.sql
diff --git a/src/backend/optimizer/path/clausesel.c b/src/backend/optimizer/path/clausesel.c
index 02660c2..5ab7f15 100644
--- a/src/backend/optimizer/path/clausesel.c
+++ b/src/backend/optimizer/path/clausesel.c
@@ -14,14 +14,19 @@
*/
#include "postgres.h"
+#include "access/sysattr.h"
+#include "catalog/pg_operator.h"
#include "nodes/makefuncs.h"
#include "optimizer/clauses.h"
#include "optimizer/cost.h"
#include "optimizer/pathnode.h"
#include "optimizer/plancat.h"
+#include "optimizer/var.h"
#include "utils/fmgroids.h"
#include "utils/lsyscache.h"
+#include "utils/mvstats.h"
#include "utils/selfuncs.h"
+#include "utils/typcache.h"
/*
@@ -41,6 +46,25 @@ typedef struct RangeQueryClause
static void addRangeClause(RangeQueryClause **rqlist, Node *clause,
bool varonleft, bool isLTsel, Selectivity s2);
+#define MV_CLAUSE_TYPE_FDEP 0x01
+
+static bool clause_is_mv_compatible(Node *clause, Index relid, AttrNumber *attnum);
+
+static Bitmapset *collect_mv_attnums(List *clauses, Index relid);
+
+static int count_mv_attnums(List *clauses, Index relid);
+
+static int count_varnos(List *clauses, Index *relid);
+
+static List *clauselist_apply_dependencies(PlannerInfo *root, List *clauses,
+ Index relid, List *stats);
+
+static bool has_stats(List *stats, int type);
+
+static List * find_stats(PlannerInfo *root, Index relid);
+
+static bool stats_type_matches(MVStatisticInfo *stat, int type);
+
/****************************************************************************
* ROUTINES TO COMPUTE SELECTIVITIES
@@ -60,7 +84,19 @@ static void addRangeClause(RangeQueryClause **rqlist, Node *clause,
* subclauses. However, that's only right if the subclauses have independent
* probabilities, and in reality they are often NOT independent. So,
* we want to be smarter where we can.
-
+ *
+ * The first thing we try to do is apply multivariate statistics, in a way
+ * that intends to minimize the overhead when there are no multivariate stats
+ * on the relation. Thus we do several simple (and inexpensive) checks first,
+ * to verify that suitable multivariate statistics exist.
+ *
+ * If we identify suitable multivariate statistics, we try to apply them.
+ * Currently we only have (soft) functional dependencies, so we try to reduce
+ * the list of clauses.
+ *
+ * Then we remove the clauses estimated using multivariate stats, and process
+ * the rest of the clauses using the regular per-column stats.
+ *
* Currently, the only extra smarts we have is to recognize "range queries",
* such as "x > 34 AND x < 42". Clauses are recognized as possible range
* query components if they are restriction opclauses whose operators have
@@ -99,6 +135,22 @@ clauselist_selectivity(PlannerInfo *root,
RangeQueryClause *rqlist = NULL;
ListCell *l;
+ /* processing mv stats */
+ Oid relid = InvalidOid;
+
+ /* list of multivariate stats on the relation */
+ List *stats = NIL;
+
+ /*
+ * To fetch the statistics, we first need to determine the rel. Currently
+ * we only support estimates of simple restrictions with all Vars
+ * referencing a single baserel. However set_baserel_size_estimates() sets
+ * varRelid=0 so we have to actually inspect the clauses by pull_varnos
+ * and see if there's just a single varno referenced.
+ */
+ if ((count_varnos(clauses, &relid) == 1) && ((varRelid == 0) || (varRelid == relid)))
+ stats = find_stats(root, relid);
+
/*
* If there's exactly one clause, then no use in trying to match up pairs,
* so just go directly to clause_selectivity().
@@ -108,6 +160,24 @@ clauselist_selectivity(PlannerInfo *root,
varRelid, jointype, sjinfo);
/*
+ * Apply functional dependencies, but first check that there are some stats
+ * with functional dependencies built (by simply walking the stats list),
+ * and that there are at least two attributes referenced by clauses that
+ * may be reduced using functional dependencies.
+ *
+ * We would find that anyway when trying to actually apply the functional
+ * dependencies, but let's do the cheap checks first.
+ *
+ * After applying the functional dependencies we get the remaining clauses
+ * that need to be estimated by other types of stats (MCV, histograms etc).
+ */
+ if (has_stats(stats, MV_CLAUSE_TYPE_FDEP) &&
+ (count_mv_attnums(clauses, relid) >= 2))
+ {
+ clauses = clauselist_apply_dependencies(root, clauses, relid, stats);
+ }
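+
+ /*
+ * For example, with a functional dependency (a => b), the clause list
+ * (a = 1) AND (b = 1) is reduced to just (a = 1), and only the
+ * remaining clause is estimated by the per-column logic below.
+ */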
+
+ /*
* Initial scan over clauses. Anything that doesn't look like a potential
* rangequery clause gets multiplied into s1 and forgotten. Anything that
* does gets inserted into an rqlist entry.
@@ -763,3 +833,436 @@ clause_selectivity(PlannerInfo *root,
return s1;
}
+
+/*
+ * Collect attributes from mv-compatible clauses.
+ */
+static Bitmapset *
+collect_mv_attnums(List *clauses, Index relid)
+{
+ Bitmapset *attnums = NULL;
+ ListCell *l;
+
+ /*
+ * Walk through the clauses and identify the ones we can estimate
+ * using multivariate stats, and remember the relid/columns. We'll
+ * then cross-check if we have suitable stats, and only if needed
+ * we'll split the clauses into multivariate and regular lists.
+ *
+ * For now we're only interested in RestrictInfo nodes with nested
+ * OpExpr, using either a range or equality.
+ */
+ foreach (l, clauses)
+ {
+ AttrNumber attnum;
+ Node *clause = (Node *) lfirst(l);
+
+ /* ignore the result for now - we only need the info */
+ if (clause_is_mv_compatible(clause, relid, &attnum))
+ attnums = bms_add_member(attnums, attnum);
+ }
+
+ /*
+ * If there are not at least two attributes referenced by the clause(s),
+ * we can throw everything out (as we'll revert to simple stats).
+ */
+ if (bms_num_members(attnums) <= 1)
+ {
+ if (attnums != NULL)
+ pfree(attnums);
+ attnums = NULL;
+ }
+
+ return attnums;
+}
+
+/*
+ * Count the number of attributes in clauses compatible with multivariate stats.
+ */
+static int
+count_mv_attnums(List *clauses, Index relid)
+{
+ int c;
+ Bitmapset *attnums = collect_mv_attnums(clauses, relid);
+
+ c = bms_num_members(attnums);
+
+ bms_free(attnums);
+
+ return c;
+}
+
+/*
+ * Count varnos referenced in the clauses, and if there's a single varno then
+ * return the index in 'relid'.
+ */
+static int
+count_varnos(List *clauses, Index *relid)
+{
+ int cnt;
+ Bitmapset *varnos = NULL;
+
+ varnos = pull_varnos((Node *) clauses);
+ cnt = bms_num_members(varnos);
+
+ /* if there's a single varno in the clauses, remember it */
+ if (bms_num_members(varnos) == 1)
+ *relid = bms_singleton_member(varnos);
+
+ bms_free(varnos);
+
+ return cnt;
+}
+
+typedef struct
+{
+ Index varno; /* relid we're interested in */
+ Bitmapset *varattnos; /* attnums referenced by the clauses */
+} mv_compatible_context;
+
+/*
+ * Recursive walker that checks compatibility of the clause with multivariate
+ * statistics, and collects attnums from the Vars.
+ *
+ * XXX The original idea was to combine this with expression_tree_walker, but
+ * I've been unable to make that work - seems that does not quite allow
+ * checking the structure. Hence the explicit calls to the walker.
+ */
+static bool
+mv_compatible_walker(Node *node, mv_compatible_context *context)
+{
+ if (node == NULL)
+ return false;
+
+ if (IsA(node, RestrictInfo))
+ {
+ RestrictInfo *rinfo = (RestrictInfo *) node;
+
+ /* Pseudoconstants are not really interesting here. */
+ if (rinfo->pseudoconstant)
+ return true;
+
+ /* clauses referencing multiple varnos are incompatible */
+ if (bms_membership(rinfo->clause_relids) != BMS_SINGLETON)
+ return true;
+
+ /* check the clause inside the RestrictInfo */
+ return mv_compatible_walker((Node *) rinfo->clause, context);
+ }
+
+ if (IsA(node, Var))
+ {
+ Var *var = (Var *) node;
+
+ /*
+ * The variable needs to reference the right relid (this might be
+ * unnecessary given the other checks, but let's be sure).
+ */
+ if (var->varno != context->varno)
+ return true;
+
+ /* Also skip system attributes (we don't allow stats on those). */
+ if (! AttrNumberIsForUserDefinedAttr(var->varattno))
+ return true;
+
+ /* Seems fine, so let's remember the attnum. */
+ context->varattnos = bms_add_member(context->varattnos, var->varattno);
+
+ return false;
+ }
+
+ /*
+ * And finally the operator expressions - we only allow simple expressions
+ * with two arguments, where one is a Var and the other is a constant, and
+ * it's a simple comparison (which we detect using the estimator function).
+ */
+ if (is_opclause(node))
+ {
+ OpExpr *expr = (OpExpr *) node;
+ Var *var;
+ bool varonleft = true;
+ bool ok;
+
+ /*
+ * Only expressions with two arguments are considered compatible.
+ *
+ * XXX Possibly unnecessary (can OpExpr have different arg count?).
+ */
+ if (list_length(expr->args) != 2)
+ return true;
+
+ /* see if it actually has the right shape (one Var, one Const) */
+ ok = (NumRelids((Node*)expr) == 1) &&
+ (is_pseudo_constant_clause(lsecond(expr->args)) ||
+ (varonleft = false,
+ is_pseudo_constant_clause(linitial(expr->args))));
+
+ /* unsupported structure (two variables or so) */
+ if (! ok)
+ return true;
+
+ /*
+ * If it's not an equality operator, just ignore the clause. Otherwise
+ * note the relid and attnum for the variable. This uses the function
+ * for estimating selectivity, not the operator directly (a bit
+ * awkward, but well ...).
+ */
+ switch (get_oprrest(expr->opno))
+ {
+ case F_EQSEL:
+
+ /* equality conditions are compatible with all statistics */
+ break;
+
+ default:
+
+ /* unknown estimator */
+ return true;
+ }
+
+ var = (varonleft) ? linitial(expr->args) : lsecond(expr->args);
+
+ return mv_compatible_walker((Node *) var, context);
+ }
+
+ /* Node not explicitly supported, so terminate */
+ return true;
+}
+
+/*
+ * Determines whether the clause is compatible with multivariate stats,
+ * and if it is, returns some additional information - varno (index
+ * into simple_rte_array) and a bitmap of attributes. This is then
+ * used to fetch related multivariate statistics.
+ *
+ * At this moment we only support basic conditions of the form
+ *
+ * variable = constant
+ *
+ * where the equality is determined by looking at the associated function
+ * for estimating selectivity (F_EQSEL), just like with the
+ * single-dimensional case.
+ *
+ * TODO Support 'OR clauses' - shouldn't be all that difficult to
+ * evaluate them using multivariate stats.
+ */
+static bool
+clause_is_mv_compatible(Node *clause, Index relid, AttrNumber *attnum)
+{
+ mv_compatible_context context;
+
+ context.varno = relid;
+ context.varattnos = NULL; /* no attnums */
+
+ if (mv_compatible_walker(clause, (void *) &context))
+ return false;
+
+ /* a compatible clause references exactly one attribute, so remember it */
+ *attnum = bms_singleton_member(context.varattnos);
+
+ return true;
+}
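+
+/*
+ * For example, (a = 1) is mv-compatible (a simple Var = Const equality on
+ * a single relation), while (a = b) or (a + 1 = 2) are not - the walker
+ * rejects clauses referencing two Vars or using non-equality operators.
+ */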
+
+
+/*
+ * Reduce clauses using functional dependencies
+ */
+static List*
+fdeps_reduce_clauses(List *clauses, Index relid, Bitmapset *reduced_attnums)
+{
+ ListCell *lc;
+ List *reduced_clauses = NIL;
+
+ foreach (lc, clauses)
+ {
+ AttrNumber attnum = InvalidAttrNumber;
+ Node *clause = (Node *) lfirst(lc);
+
+ /* keep clauses that are not compatible with functional dependencies */
+ if (! clause_is_mv_compatible(clause, relid, &attnum))
+ {
+ reduced_clauses = lappend(reduced_clauses, clause);
+ continue;
+ }
+
+ /* for equality clauses, only keep those not on reduced attributes */
+ if (! bms_is_member(attnum, reduced_attnums))
+ reduced_clauses = lappend(reduced_clauses, clause);
+ }
+
+ return reduced_clauses;
+}
+
+/*
+ * decide which attributes are redundant (for equality clauses)
+ *
+ * We try to apply all functional dependencies available, and for each one we
+ * check if it matches attnums from equality clauses, but only those not yet
+ * reduced.
+ *
+ * XXX Not sure if the order in which we apply the dependencies matters.
+ *
+ * XXX We do not combine functional dependencies from separate stats. That is
+ * if we have dependencies on [a,b] and [b,c], then we don't deduce
+ * a->c from a->b and b->c. Computing such transitive closure is a possible
+ * future improvement.
+ */
+static Bitmapset *
+fdeps_reduce_attnums(List *stats, Bitmapset *attnums)
+{
+ ListCell *lc;
+ Bitmapset *reduced = NULL;
+
+ foreach (lc, stats)
+ {
+ int i;
+ MVDependencies dependencies = NULL;
+ MVStatisticInfo *info = (MVStatisticInfo *)lfirst(lc);
+
+ /* skip statistics without dependencies */
+ if (! stats_type_matches(info, MV_CLAUSE_TYPE_FDEP))
+ continue;
+
+ /* fetch and deserialize dependencies */
+ dependencies = load_mv_dependencies(info->mvoid);
+
+ for (i = 0; i < dependencies->ndeps; i++)
+ {
+ int j;
+ bool matched = true;
+ MVDependency dep = dependencies->deps[i];
+
+ /* we don't bother to break the loop early (only a few attributes) */
+ for (j = 0; j < dep->nattributes; j++)
+ {
+ if (! bms_is_member(dep->attributes[j], attnums))
+ matched = false;
+
+ if (bms_is_member(dep->attributes[j], reduced))
+ matched = false;
+ }
+
+ /* if dependency applies, mark the last attribute as reduced */
+ if (matched)
+ reduced = bms_add_member(reduced,
+ dep->attributes[dep->nattributes-1]);
+ }
+ }
+
+ return reduced;
+}
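+
+/*
+ * For example, with the cyclic dependencies (1 => 2) and (2 => 1) and
+ * equality clauses on both attributes, only attnum 2 gets reduced (assuming
+ * the dependencies are examined in this order) - once an attribute is
+ * reduced, dependencies using it are skipped, which prevents reducing both
+ * sides of the cycle.
+ */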
+
+/*
+ * reduce list of equality clauses using soft functional dependencies
+ *
+ * We simply walk through the list of functional dependencies, and for each one
+ * we check whether the dependency 'matches' the clauses, i.e. if there's a
+ * clause matching the condition. If yes, we attempt to remove all clauses
+ * matching the implied part of the dependency from the list.
+ *
+ * This only reduces equality clauses, and ignores all the other types. We
+ * might extend it to handle IS NULL clauses in the future.
+ *
+ * We also assume the equality clauses are 'compatible'. For example we can't
+ * identify when the clauses use a mismatching zip code and city name. In such
+ * case the usual approach (product of selectivities) would produce a better
+ * estimate, although mostly by chance.
+ *
+ * The implementation needs to be careful about cyclic dependencies, e.g. when
+ *
+ * (a -> b) and (b -> a)
+ *
+ * at the same time, which means there's a 1:1 relationship between the columns.
+ * In this case we must not reduce clauses on both attributes at the same time.
+ *
+ * TODO Currently we only apply functional dependencies at the same level, but
+ * maybe we could transfer the clauses from upper levels to the subtrees?
+ * For example let's say we have (a->b) dependency, and condition
+ *
+ * (a=1) AND (b=2 OR c=3)
+ *
+ * Currently, we won't be able to perform any reduction, because we'll
+ * consider (a=1) and (b=2 OR c=3) independently. But maybe we could pass
+ * (a=1) into the other expression, and only check it against conditions
+ * of the functional dependencies?
+ *
+ * In this case we'd end up with
+ *
+ * (a=1)
+ *
+ * as we'd consider (b=2) implied thanks to the rule, rendering the whole
+ * OR clause valid.
+ */
+static List *
+clauselist_apply_dependencies(PlannerInfo *root, List *clauses,
+ Index relid, List *stats)
+{
+ Bitmapset *clause_attnums = NULL;
+ Bitmapset *reduced_attnums = NULL;
+
+ /*
+ * Is there at least one statistics with functional dependencies?
+ * If not, return the original clauses right away.
+ *
+ * XXX Isn't this a bit pointless, thanks to exactly the same check in
+ * clauselist_selectivity()? Can we trigger the condition here?
+ */
+ if (! has_stats(stats, MV_CLAUSE_TYPE_FDEP))
+ return clauses;
+
+ /* collect attnums from clauses compatible with dependencies (equality) */
+ clause_attnums = collect_mv_attnums(clauses, relid);
+
+ /* decide which attnums may be eliminated */
+ reduced_attnums = fdeps_reduce_attnums(stats, clause_attnums);
+
+ /*
+ * Walk through the clauses, and see which other clauses we may reduce.
+ */
+ clauses = fdeps_reduce_clauses(clauses, relid, reduced_attnums);
+
+ bms_free(clause_attnums);
+ bms_free(reduced_attnums);
+
+ return clauses;
+}
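+
+/*
+ * A worked example: with a dependency (1 => 2) and the clause list
+ * (a = 1) AND (b = 2) (where 'a' and 'b' have attnums 1 and 2), attnum 2
+ * is marked as reduced, the (b = 2) clause is dropped, and the remaining
+ * (a = 1) clause is estimated using regular per-column statistics.
+ */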
+
+/*
+ * Check whether the statistics matches at least one of the requested types.
+ */
+static bool
+stats_type_matches(MVStatisticInfo *stat, int type)
+{
+ if ((type & MV_CLAUSE_TYPE_FDEP) && stat->deps_built)
+ return true;
+
+ return false;
+}
+
+/*
+ * Check that there are stats with at least one of the requested types.
+ */
+static bool
+has_stats(List *stats, int type)
+{
+ ListCell *s;
+
+ foreach (s, stats)
+ {
+ MVStatisticInfo *stat = (MVStatisticInfo *)lfirst(s);
+
+ /* terminate if we've found at least one matching statistics */
+ if (stats_type_matches(stat, type))
+ return true;
+ }
+
+ return false;
+}
+
+/*
+ * Looks up stats for a given baserel.
+ */
+static List *
+find_stats(PlannerInfo *root, Index relid)
+{
+ Assert(root->simple_rel_array[relid] != NULL);
+
+ return root->simple_rel_array[relid]->mvstatlist;
+}
diff --git a/src/backend/utils/mvstats/README.dependencies b/src/backend/utils/mvstats/README.dependencies
index 1f96fbc..f248459 100644
--- a/src/backend/utils/mvstats/README.dependencies
+++ b/src/backend/utils/mvstats/README.dependencies
@@ -156,37 +156,24 @@ estimates - especially compared to histograms, that are quite bad in estimating
equality clauses.
-Limitations
------------
-
-Let's see the main liminations of functional dependencies, especially those
-related to the current implementation.
+Multi-column dependencies
+-------------------------
-The current implementation supports only dependencies between two columns, but
-this is merely a simplification of the initial implementation. It's certainly
-useful to mine for dependencies involving multiple columns on the 'left' side,
-i.e. a condition for the dependency. That is dependencies like (a,b -> c).
+The implementation supports dependencies with multiple columns on the left side
+(i.e. the condition of the dependency). The detection starts from dependencies
+a single condition, and then proceeds to higher condition counts.
-The implementation may/should be smart enough not to mine redundant conditions,
-e.g. (a->b) and (a,c -> b), because the latter is a trivial consequence of the
-former one (if values of 'a' determine 'b', adding another column won't change
-that relationship). The ANALYZE should first analyze 1:1 dependencies, then 2:1
-dependencies (and skip the already identified ones), etc.
+It also detects dependencies that are implied by already identified ones, and
+ignores them. For example if we know that (a->b) then we won't add (a,c->b) as
+this dependency is a trivial consequence of (a->b).
-For example the dependency
+For a more practical example, consider these two dependencies
(city name -> zip code)
-
-is much stronger, i.e. whenever it hold, then
-
(city name, state name -> zip code)
-holds too. But in case there are cities with the same name in different states,
-then only the latter dependency will be valid.
-
-Of course, there probably are cities with the same name within a single state,
-but hopefully this is relatively rare occurence (and thus we'll still detect
-the 'soft' dependency).
+We could say that the former dependency is stronger, because whenever it is
+valid, the latter one is valid too.
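+
+So if the (city name -> zip code) dependency is detected first, the detection
+skips (city name, state name -> zip code) as implied, and only stores the
+former dependency.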
Handling multiple columns on the right side of the dependency, is not necessary,
as those dependencies may be simply decomposed into a set of dependencies with
@@ -199,24 +186,22 @@ is exactly the same as
(a -> b) & (a -> c)
Of course, storing the first form may be more efficient thant storing multiple
-'simple' dependencies separately.
-
+'simple' dependencies separately. This is left as future work.
-TODO Support dependencies with multiple columns on left/right.
-TODO Investigate using histogram and MCV list to verify the dependencies.
+Future work
+-----------
-TODO Investigate statistical testing of the distribution (to decide whether it
- makes sense to build the histogram/MCV list).
+* Investigate using histogram and MCV list to verify the dependencies.
-TODO Using a min/max of selectivities would probably make more sense for the
- associated columns.
+* Investigate statistical testing of the distribution (to decide whether it
+ makes sense to build the histogram/MCV list).
-TODO Consider eliminating the implied columns from the histogram and MCV lists
- (but maybe that's not a good idea, because that'd make it impossible to use
- these stats for non-equality clauses and also it wouldn't be possible to
- use the stats for verification of the dependencies).
+* Consider eliminating the implied columns from the histogram and MCV lists
+ (but maybe that's not a good idea, because that'd make it impossible to use
+ these stats for non-equality clauses and also it wouldn't be possible to
+ use the stats for verification of the dependencies).
-TODO The reduction probably might be extended to also handle IS NULL clauses,
- assuming we fix the ANALYZE to properly handle NULL values. We however
- won't be able to reduce IS NOT NULL (unless I'm missing something).
+* The reduction probably might be extended to also handle IS NULL clauses,
+ assuming we fix the ANALYZE to properly handle NULL values. We however
+ won't be able to reduce IS NOT NULL (unless I'm missing something).
diff --git a/src/backend/utils/mvstats/README.stats b/src/backend/utils/mvstats/README.stats
new file mode 100644
index 0000000..a38ea7b
--- /dev/null
+++ b/src/backend/utils/mvstats/README.stats
@@ -0,0 +1,36 @@
+Multivariate statistics
+=======================
+
+When estimating various quantities (e.g. condition selectivities) the default
+approach relies on the assumption of independence. In practice that's often
+not true, resulting in estimation errors.
+
+Multivariate stats track different types of dependencies between the columns,
+hopefully improving the estimates.
+
+Currently we only have one kind of multivariate statistics - soft functional
+dependencies, and we use it to improve estimates of equality clauses. See
+README.dependencies for details.
+
+
+Selectivity estimation
+----------------------
+
+When estimating selectivity, we aim to achieve several things:
+
+ (a) maximize the estimate accuracy
+
+ (b) minimize the overhead, especially when no suitable multivariate stats
+ exist (so if you are not using multivariate stats, there's no overhead)
+
+To this end, clauselist_selectivity() performs several inexpensive checks first,
+even attempting to do the more expensive estimation.
+
+ (1) check if there are multivariate stats on the relation
+
+ (2) check there are at least two attributes referenced by clauses compatible
+ with multivariate statistics (equality clauses for func. dependencies)
+
+ (3) perform reduction of equality clauses using func. dependencies
+
+ (4) estimate the reduced list of clauses using regular statistics
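+
+As a hypothetical example (the table and statistics names are made up), assume
+a statistics on (a,b) with a detected dependency (a => b):
+
+    CREATE STATISTICS s1 ON t (a, b) WITH (dependencies);
+    ANALYZE t;
+
+Then for a query such as
+
+    SELECT * FROM t WHERE a = 1 AND b = 2;
+
+step (3) reduces the clause list to just (a = 1), and step (4) estimates that
+using the single-column statistics on "a".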
diff --git a/src/backend/utils/mvstats/common.c b/src/backend/utils/mvstats/common.c
index a755c49..dcb7c78 100644
--- a/src/backend/utils/mvstats/common.c
+++ b/src/backend/utils/mvstats/common.c
@@ -84,7 +84,8 @@ build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
/*
* Analyze functional dependencies of columns.
*/
- deps = build_mv_dependencies(numrows, rows, attrs, stats);
+ if (stat->deps_enabled)
+ deps = build_mv_dependencies(numrows, rows, attrs, stats);
/* store the histogram / MCV list in the catalog */
update_mv_stats(stat->mvoid, deps, attrs);
@@ -163,6 +164,7 @@ list_mv_stats(Oid relid)
info->mvoid = HeapTupleGetOid(htup);
info->stakeys = buildint2vector(stats->stakeys.values, stats->stakeys.dim1);
+ info->deps_enabled = stats->deps_enabled;
info->deps_built = stats->deps_built;
result = lappend(result, info);
@@ -274,6 +276,7 @@ compare_scalars_partition(const void *a, const void *b, void *arg)
return ApplySortComparator(da, false, db, false, ssup);
}
+
/* initialize multi-dimensional sort */
MultiSortSupport
multi_sort_init(int ndims)
@@ -354,3 +357,23 @@ multi_sort_compare_dim(int dim, const SortItem *a, const SortItem *b,
b->values[dim], b->isnull[dim],
&mss->ssup[dim]);
}
+
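+/*
+ * Compare the SortItems on dimensions 'start' through 'end' (inclusive),
+ * returning the first non-zero comparison result, or 0 if the items are
+ * equal on all those dimensions.
+ */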
+int
+multi_sort_compare_dims(int start, int end,
+ const SortItem *a, const SortItem *b,
+ MultiSortSupport mss)
+{
+ int dim;
+
+ for (dim = start; dim <= end; dim++)
+ {
+ int r = ApplySortComparator(a->values[dim], a->isnull[dim],
+ b->values[dim], b->isnull[dim],
+ &mss->ssup[dim]);
+
+ if (r != 0)
+ return r;
+ }
+
+ return 0;
+}
diff --git a/src/backend/utils/mvstats/common.h b/src/backend/utils/mvstats/common.h
index d96422d..a019ea6 100644
--- a/src/backend/utils/mvstats/common.h
+++ b/src/backend/utils/mvstats/common.h
@@ -70,6 +70,9 @@ int multi_sort_compare(const void *a, const void *b, void *arg);
int multi_sort_compare_dim(int dim, const SortItem *a,
const SortItem *b, MultiSortSupport mss);
+int multi_sort_compare_dims(int start, int end, const SortItem *a,
+ const SortItem *b, MultiSortSupport mss);
+
/* comparators, used when constructing multivariate stats */
int compare_scalars_simple(const void *a, const void *b, void *arg);
int compare_scalars_partition(const void *a, const void *b, void *arg);
diff --git a/src/backend/utils/mvstats/dependencies.c b/src/backend/utils/mvstats/dependencies.c
index 2a064a0..412dc30 100644
--- a/src/backend/utils/mvstats/dependencies.c
+++ b/src/backend/utils/mvstats/dependencies.c
@@ -17,293 +17,521 @@
#include "common.h"
#include "utils/lsyscache.h"
+/* internal state for generator of variations (k-permutations of n elements) */
+typedef struct VariationGeneratorData {
+
+ int k; /* size of the k-permutation */
+ int current; /* index of the next variation to return */
+
+ int nvariations; /* number of variations generated (size of array) */
+ int variations[1]; /* array of pre-built variations */
+
+} VariationGeneratorData;
+
+typedef VariationGeneratorData* VariationGenerator;
+
+/*
+ * generate all variations (k-permutations of n elements)
+ */
+static void
+generate_variations(VariationGenerator state,
+ int n, int maxlevel, int level, int *current)
+{
+ int i, j;
+
+ /* initialize */
+ if (level == 0)
+ {
+ current = (int*)palloc0(sizeof(int) * (maxlevel+1));
+ state->current = 0;
+ }
+
+ for (i = 0; i < n; i++)
+ {
+ /* check if the value is already used in the current variation */
+ bool found = false;
+ for (j = 0; j < level; j++)
+ {
+ if (current[j] == i)
+ {
+ found = true;
+ break;
+ }
+ }
+
+ /* already used, so try the next element */
+ if (found)
+ continue;
+
+ /* ok, we can use this element, so store it */
+ current[level] = i;
+
+ /* and check if we do have a complete variation of k elements */
+ if (level == maxlevel)
+ {
+ /* yep, store the variation */
+ Assert(state->current < state->nvariations);
+ memcpy(&state->variations[(state->k * state->current)], current,
+ sizeof(int) * (maxlevel+1));
+ state->current++;
+ }
+ else
+ /* nope, look for additional elements */
+ generate_variations(state, n, maxlevel, level+1, current);
+ }
+
+ if (level == 0)
+ pfree(current);
+}
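+
+/*
+ * For example, for n=3 elements and k=2, this produces the six variations
+ * (0,1), (0,2), (1,0), (1,2), (2,0), (2,1), in this order.
+ */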
+
/*
- * Detect functional dependencies between columns.
+ * initialize the generator of variations, and prebuild the variations
*
- * TODO This builds a complete set of dependencies, i.e. including transitive
- * dependencies - if we identify [A => B] and [B => C], we're likely to
- * identify [A => C] too. It might be better to keep only the minimal set
- * of dependencies, i.e. prune all the dependencies that we can recreate
- * by transivitity.
- *
- * There are two conceptual ways to do that:
- *
- * (a) generate all the rules, and then prune the rules that may be
- * recteated by combining other dependencies, or
- *
- * (b) performing the 'is combination of other dependencies' check before
- * actually doing the work
- *
- * The second option has the advantage that we don't really need to perform
- * the sort/count. It's not sufficient alone, though, because we may
- * discover the dependencies in the wrong order. For example we may find
+ * This pre-builds all the variations. We could also generate them in
+ * generator_next(), but this seems simpler.
+ */
+static VariationGenerator
+generator_init(int2vector *attrs, int k)
+{
+ int i;
+ int n = attrs->dim1;
+ int nvariations;
+ VariationGenerator state;
+
+ Assert((n >= k) && (k > 0));
+
+ /* compute the total number of variations as n!/(n-k)! */
+ nvariations = n;
+ for (i = 1; i < k; i++)
+ nvariations *= (n - i);
+
+ /* allocate the generator state as a single chunk of memory */
+ state = (VariationGenerator)palloc0(
+ offsetof(VariationGeneratorData, variations)
+ + (nvariations * k * sizeof(int))); /* variations */
+
+ state->nvariations = nvariations;
+ state->k = k;
+
+ /* now actually pre-generate all the variations */
+ generate_variations(state, n, (k-1), 0, NULL);
+
+ /* we expect to generate exactly the right number of variations */
+ Assert(state->nvariations == state->current);
+
+ /* reset the index */
+ state->current = 0;
+
+ return state;
+}
+
+/* free the generator state */
+static void
+generator_free(VariationGenerator state)
+{
+ /* we've allocated a single chunk, so just free it */
+ pfree(state);
+}
+
+/* generate the next variation */
+static int*
+generator_next(VariationGenerator state, int2vector *attrs)
+{
+ if (state->current == state->nvariations)
+ return NULL;
+
+ return &state->variations[state->k * state->current++];
+}
+
+/*
+ * check if the dependency is implied by existing dependencies
+ *
+ * A dependency is considered implied, if there exists a dependency with the
+ * same column on the right (the implied column), and a subset of the columns
+ * on the left side (the conditions). So for example if we have a dependency
+ *
+ * (a,b,c) -> d
*
- * (a -> b), (a -> c) and then (b -> c)
+ * then we are looking for these six dependencies
*
- * None of those dependencies is a combination of the already known ones,
- * yet (a -> C) is a combination of (a -> b) and (b -> c).
+ * (a) -> d
+ * (b) -> d
+ * (c) -> d
+ * (a,b) -> d
+ * (a,c) -> d
+ * (b,c) -> d
*
- *
- * FIXME Currently we simply replace NULL values with 0 and then handle is as
- * a regular value, but that groups NULL and actual 0 values. That's
- * clearly incorrect - we need to handle NULL values as a separate value.
+ * This does not detect transitive dependencies. For example if we have
+ *
+ * (a) -> b
+ * (b) -> c
+ *
+ * then obviously
+ *
+ * (a) -> c
+ *
+ * but this is not detected. Extending the method to handle transitive cases
+ * is future work.
*/
-MVDependencies
-build_mv_dependencies(int numrows, HeapTuple *rows, int2vector *attrs,
- VacAttrStats **stats)
+static bool
+dependency_is_implied(MVDependencies dependencies, int k, int *dependency,
+ int2vector * attrs)
{
- int i;
- int numattrs = attrs->dim1;
+ bool implied = false;
+ int i, j, l;
+ int *tmp;
- /* result */
- int ndeps = 0;
- MVDependencies dependencies = NULL;
- MultiSortSupport mss = multi_sort_init(2); /* 2 dimensions for now */
+ if (dependencies == NULL)
+ return false;
- /* TODO Maybe this should be somehow related to the number of
- * distinct values in the two columns we're currently analyzing.
- * Assuming the distribution is uniform, we can estimate the
- * average group size and use it as a threshold. Or something
- * like that. Seems better than a static approach.
- */
- int min_group_size = 3;
+ tmp = (int*)palloc0(sizeof(int) * k);
+
+ /* translate the indexes to actual attribute numbers */
+ for (i = 0; i < k; i++)
+ tmp[i] = attrs->values[dependency[i]];
+
+ /* search for an existing dependency with a subset of the conditions */
+ for (i = 0; i < dependencies->ndeps; i++)
+ {
+ bool contained = true;
+ MVDependency dep = dependencies->deps[i];
+
+ /* does the last attribute match? */
+ if (tmp[k-1] != dep->attributes[dep->nattributes-1])
+ continue; /* nope, no need to check this dependency further */
+
+ /* are the conditions a superset of the existing dependency's conditions? */
+ for (j = 0; j < (dep->nattributes-1); j++)
+ {
+ bool found = false;
+
+ for (l = 0; l < (k-1); l++)
+ {
+ if (tmp[l] == dep->attributes[j])
+ {
+ found = true;
+ break;
+ }
+ }
+
+ /* we've found an attribute not included in the new dependency */
+ if (! found)
+ {
+ contained = false;
+ break;
+ }
+ }
+
+ /* we've found an existing dependency, trivially proving the new one */
+ if (contained)
+ {
+ implied = true;
+ break;
+ }
+ }
- /* dimension indexes we'll check for associations [a => b] */
- int dima, dimb;
+ pfree(tmp);
+
+ return implied;
+}
+
+/*
+ * validates functional dependency on the data
+ *
+ * The actual workhorse of detecting functional dependencies. Given a variation
+ * of k attributes, it checks whether the first (k-1) are sufficient to
+ * determine the last one.
+ */
+static bool
+dependency_is_valid(int numrows, HeapTuple *rows, int k, int * dependency,
+ VacAttrStats **stats, int2vector *attrs)
+{
+ int i, j;
+ int nvalues = numrows * k;
/*
- * We'll reuse the same array for all the 2-column combinations.
- *
- * It's possible to sort the sample rows directly, but this seemed
- * somehow simples / less error prone. Another option would be to
- * allocate the arrays for each SortItem separately, but that'd be
- * significant overhead (not just CPU, but especially memory bloat).
+ * XXX Maybe the threshold should be somehow related to the number of
+ * distinct values in the combination of columns we're analyzing.
+ * Assuming the distribution is uniform, we can estimate the average
+ * group size and use it as a threshold, similarly to what we do for
+ * MCV lists.
*/
- SortItem * items = (SortItem*)palloc0(numrows * sizeof(SortItem));
+ int min_group_size = 3;
+
+ /* number of groups supporting / contradicting the dependency */
+ int n_supporting = 0;
+ int n_contradicting = 0;
+
+ /* counters valid within a group */
+ int group_size = 0;
+ int n_violations = 0;
- Datum *values = (Datum*)palloc0(sizeof(Datum) * numrows * 2);
- bool *isnull = (bool*)palloc0(sizeof(bool) * numrows * 2);
+ int n_supporting_rows = 0;
+ int n_contradicting_rows = 0;
+ /* sort info for all the attribute columns */
+ MultiSortSupport mss = multi_sort_init(k);
+
+ /* data for the sort */
+ SortItem *items = (SortItem*)palloc0(numrows * sizeof(SortItem));
+ Datum *values = (Datum*)palloc0(sizeof(Datum) * nvalues);
+ bool *isnull = (bool*)palloc0(sizeof(bool) * nvalues);
+
+ /* fix the pointers to values/isnull */
for (i = 0; i < numrows; i++)
{
- items[i].values = &values[i * 2];
- items[i].isnull = &isnull[i * 2];
+ items[i].values = &values[i * k];
+ items[i].isnull = &isnull[i * k];
}
- Assert(numattrs >= 2);
-
/*
- * Evaluate all possible combinations of [A => B], using a simple algorithm:
+ * Verify the dependency (a,b,...)->z, using a rather simple algorithm:
+ *
+ * (a) sort the data lexicographically
*
- * (a) sort the data by [A,B]
- * (b) split the data into groups by A (new group whenever a value changes)
- * (c) count different values in the B column (again, value changes)
+ * (b) split the data into groups by the first (k-1) columns
*
- * TODO It should be rather simple to merge [A => B] and [A => C] into
- * [A => B,C]. Just keep A constant, collect all the "implied" columns
- * and you're done.
+ * (c) for each group count different values in the last column
*/
- for (dima = 0; dima < numattrs; dima++)
+
+ /* prepare the sort functions for all k dimensions, and fill the SortItem array */
+ for (i = 0; i < k; i++)
{
- /* prepare the sort function for the first dimension */
- multi_sort_add_dimension(mss, 0, dima, stats);
+ multi_sort_add_dimension(mss, i, dependency[i], stats);
- for (dimb = 0; dimb < numattrs; dimb++)
+ /* accumulate the data for this dimension into the items array */
+ for (j = 0; j < numrows; j++)
{
- SortItem current;
-
- /* number of groups supporting / contradicting the dependency */
- int n_supporting = 0;
- int n_contradicting = 0;
-
- /* counters valid within a group */
- int group_size = 0;
- int n_violations = 0;
-
- int n_supporting_rows = 0;
- int n_contradicting_rows = 0;
-
- /* make sure the columns are different (A => A) */
- if (dima == dimb)
- continue;
-
- /* prepare the sort function for the second dimension */
- multi_sort_add_dimension(mss, 1, dimb, stats);
-
- /* reset the values and isnull flags */
- memset(values, 0, sizeof(Datum) * numrows * 2);
- memset(isnull, 0, sizeof(bool) * numrows * 2);
+ items[j].values[i]
+ = heap_getattr(rows[j], attrs->values[dependency[i]],
+ stats[dependency[i]]->tupDesc, &items[j].isnull[i]);
+ }
+ }
- /* accumulate all the data for both columns into an array and sort it */
- for (i = 0; i < numrows; i++)
- {
- items[i].values[0]
- = heap_getattr(rows[i], attrs->values[dima],
- stats[dima]->tupDesc, &items[i].isnull[0]);
+ /* sort the items so that we can detect the groups */
+ qsort_arg((void *) items, numrows, sizeof(SortItem),
+ multi_sort_compare, mss);
- items[i].values[1]
- = heap_getattr(rows[i], attrs->values[dimb],
- stats[dimb]->tupDesc, &items[i].isnull[1]);
- }
+ /*
+ * Walk through the sorted array, split it into groups according to the
+ * first (k-1) columns. If there's a single value in the last column, we
+ * count the group as 'supporting' the functional dependency. Otherwise we
+ * count it as contradicting.
+ *
+ * We also require a group to have a minimum number of rows to be considered
+ * useful for supporting the dependency. Contradicting groups may be of
+ * any size, though.
+ *
+ * XXX The minimum size requirement makes it impossible to identify the
+ * case when the columns are unique (or nearly unique), and therefore
+ * trivially functionally dependent.
+ */
- qsort_arg((void *) items, numrows, sizeof(SortItem),
- multi_sort_compare, mss);
+ /* start with the first row forming a group */
+ group_size = 1;
+ for (i = 1; i < numrows; i++)
+ {
+ /* end of the preceding group */
+ if (multi_sort_compare_dims(0, (k-2), &items[i-1], &items[i], mss) != 0)
+ {
/*
- * Walk through the array, split it into rows according to
- * the A value, and count distinct values in the other one.
- * If there's a single B value for the whole group, we count
- * it as supporting the association, otherwise we count it
- * as contradicting.
- *
- * Furthermore we require a group to have at least a certain
- * number of rows to be considered useful for supporting the
- * dependency. But when it's contradicting, use it always useful.
+ * If there are no contradicting rows, count the group as supporting,
+ * otherwise as contradicting.
*/
-
- /* start with values from the first row */
- current = items[0];
- group_size = 1;
-
- for (i = 1; i < numrows; i++)
- {
- /* end of the group */
- if (multi_sort_compare_dim(0, &items[i], ¤t, mss) != 0)
- {
- /*
- * If there are no contradicting rows, count it as
- * supporting (otherwise contradicting), but only if
- * the group is large enough.
- *
- * The requirement of a minimum group size makes it
- * impossible to identify [unique,unique] cases, but
- * that's probably a different case. This is more
- * about [zip => city] associations etc.
- *
- * If there are violations, count the group/rows as
- * a violation.
- *
- * It may ne neither, if the group is too small (does
- * not contain at least min_group_size rows).
- */
- if ((n_violations == 0) && (group_size >= min_group_size))
- {
- n_supporting += 1;
- n_supporting_rows += group_size;
- }
- else if (n_violations > 0)
- {
- n_contradicting += 1;
- n_contradicting_rows += group_size;
- }
-
- /* current values start a new group */
- n_violations = 0;
- group_size = 0;
- }
- /* mismatch of a B value is contradicting */
- else if (multi_sort_compare_dim(1, &items[i], ¤t, mss) != 0)
- {
- n_violations += 1;
- }
-
- current = items[i];
- group_size += 1;
- }
-
- /* handle the last group (just like above) */
if ((n_violations == 0) && (group_size >= min_group_size))
{
- n_supporting += 1;
+ n_supporting += 1;
n_supporting_rows += group_size;
}
- else if (n_violations)
+ else if (n_violations > 0)
{
- n_contradicting += 1;
+ n_contradicting += 1;
n_contradicting_rows += group_size;
}
- /*
- * See if the number of rows supporting the association is at least
- * 10x the number of rows violating the hypothetical dependency.
- *
- * TODO This is rather arbitrary limit - I guess it's possible to do
- * some math to come up with a better rule (e.g. testing a hypothesis
- * 'this is due to randomness'). We can create a contingency table
- * from the values and use it for testing. Possibly only when
- * there are no contradicting rows?
- *
- * TODO Also, if (a => b) and (b => a) at the same time, it pretty much
- * means there's a 1:1 relation (or one is a 'label'), making the
- * conditions rather redundant. Although it's possible that the
- * query uses incompatible combination of values.
- */
- if (n_supporting_rows > (n_contradicting_rows * 10))
- {
- if (dependencies == NULL)
- {
- dependencies = (MVDependencies)palloc0(sizeof(MVDependenciesData));
- dependencies->magic = MVSTAT_DEPS_MAGIC;
- }
- else
- dependencies = repalloc(dependencies, offsetof(MVDependenciesData, deps)
- + sizeof(MVDependency) * (dependencies->ndeps + 1));
+ /* current values start a new group */
+ n_violations = 0;
+ group_size = 0;
+ }
+ /* first columns match, but the last one does not (a violation) */
+ else if (multi_sort_compare_dims((k-1), (k-1), &items[i-1], &items[i], mss) != 0)
+ n_violations += 1;
- /* update the */
- dependencies->deps[ndeps] = (MVDependency)palloc0(sizeof(MVDependencyData));
- dependencies->deps[ndeps]->a = attrs->values[dima];
- dependencies->deps[ndeps]->b = attrs->values[dimb];
+ group_size += 1;
+ }
- dependencies->ndeps = (++ndeps);
- }
- }
+ /* handle the last group (just like above) */
+ if ((n_violations == 0) && (group_size >= min_group_size))
+ {
+ n_supporting += 1;
+ n_supporting_rows += group_size;
+ }
+ else if (n_violations)
+ {
+ n_contradicting += 1;
+ n_contradicting_rows += group_size;
}
pfree(items);
pfree(values);
pfree(isnull);
- pfree(stats);
pfree(mss);
- return dependencies;
+ /*
+ * See if the number of rows supporting the association is at least 10x the
+ * number of rows violating the hypothetical dependency.
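+ *
+ * For example, with 500 rows in contradicting groups, more than 5000 rows
+ * would have to fall into supporting groups for the dependency to pass.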
+ */
+ return (n_supporting_rows > (n_contradicting_rows * 10));
}
/*
- * Store the dependencies into a bytea, so that it can be stored in the
- * pg_mv_statistic catalog.
+ * detects functional dependencies between groups of columns
+ *
+ * Generates all possible variations of the columns (ordered subsets) and
+ * checks if the last column is determined by the preceding ones. For example
+ * given 3 columns, there are 12 variations (6 on 2 columns, 6 on 3 columns):
+ *
+ * two columns three columns
+ * ----------- -------------
+ * (a) -> c (a,b) -> c
+ * (b) -> c (b,a) -> c
+ * (a) -> b (a,c) -> b
+ * (c) -> b (c,a) -> b
+ * (c) -> a (c,b) -> a
+ * (b) -> a (b,c) -> a
+ *
+ * Clearly some of the variations are redundant, as the order of columns on the
+ * left side does not matter. This is detected in dependency_is_implied, and
+ * those dependencies are ignored.
*
- * Currently this only supports simple two-column rules, and stores them
- * as a sequence of attnum pairs. In the future, this needs to be made
- * more complex to support multiple columns on both sides of the
- * implication (using AND on left, OR on right).
+ * We however do not detect that dependencies are transitively implied. For
+ * example given dependencies
+ *
+ * (a) -> b
+ * (b) -> c
+ *
+ * then
+ *
+ * (a) -> c
+ *
+ * is trivially implied. However we don't detect that and all three dependencies
+ * will get included in the resulting set. Eliminating such transitively implied
+ * dependencies is future work.
+ */
+MVDependencies
+build_mv_dependencies(int numrows, HeapTuple *rows, int2vector *attrs,
+ VacAttrStats **stats)
+{
+ int i;
+ int k;
+ int numattrs = attrs->dim1;
+
+ /* result */
+ MVDependencies dependencies = NULL;
+
+ Assert(numattrs >= 2);
+
+ /*
+ * We'll try to build functional dependencies starting from the smallest
+ * ones, covering just 2 columns, up to the largest ones, covering all
+ * columns included in the statistics. We start from the smallest ones
+ * because we want to be able to skip the already implied ones.
+ */
+ for (k = 2; k <= numattrs; k++)
+ {
+ int *dependency; /* array with k elements */
+
+ /* prepare a generator of variations */
+ VariationGenerator generator = generator_init(attrs, k);
+
+ /* generate all possible variations of k values (out of n) */
+ while ((dependency = generator_next(generator, attrs)))
+ {
+ MVDependency d;
+
+ /* skip dependencies that are already trivially implied */
+ if (dependency_is_implied(dependencies, k, dependency, attrs))
+ continue;
+
+ /* also skip dependencies that don't seem to be valid */
+ if (! dependency_is_valid(numrows, rows, k, dependency, stats, attrs))
+ continue;
+
+ d = (MVDependency)palloc0(offsetof(MVDependencyData, attributes)
+ + k * sizeof(int));
+
+ /* copy the dependency, but translate it to actual attnums */
+ d->nattributes = k;
+ for (i = 0; i < k; i++)
+ d->attributes[i] = attrs->values[dependency[i]];
+
+ /* initialize the list of dependencies */
+ if (dependencies == NULL)
+ {
+ dependencies
+ = (MVDependencies)palloc0(sizeof(MVDependenciesData));
+
+ dependencies->magic = MVSTAT_DEPS_MAGIC;
+ dependencies->type = MVSTAT_DEPS_TYPE_BASIC;
+ dependencies->ndeps = 0;
+ }
+
+ dependencies->ndeps++;
+ dependencies = (MVDependencies)repalloc(dependencies,
+ offsetof(MVDependenciesData, deps)
+ + dependencies->ndeps * sizeof(MVDependency));
+
+ dependencies->deps[dependencies->ndeps-1] = d;
+ }
+
+ /* we're done with variations of k elements, so free the generator */
+ generator_free(generator);
+ }
+
+ return dependencies;
+}
+
+
+/*
+ * serialize list of dependencies into a bytea
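+ *
+ * The format (matching the code below) is the MVDependenciesData header
+ * (magic, type, ndeps), followed by the dependencies - for each one the
+ * number of attributes (int) and the attribute numbers (int16 array).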
*/
bytea *
serialize_mv_dependencies(MVDependencies dependencies)
{
int i;
+ bytea * output;
+ char *tmp;
- /* we need to store ndeps, and each needs 2 * int16 */
+ /* we need to store ndeps, with a number of attributes for each one */
Size len = VARHDRSZ + offsetof(MVDependenciesData, deps)
- + dependencies->ndeps * (sizeof(int16) * 2);
-
- bytea * output = (bytea*)palloc0(len);
+ + sizeof(int) * dependencies->ndeps;
- char * tmp = VARDATA(output);
+ /* and also include space for the actual attribute numbers */
+ for (i = 0; i < dependencies->ndeps; i++)
+ len += (sizeof(int16) * dependencies->deps[i]->nattributes);
+ output = (bytea*)palloc0(len);
SET_VARSIZE(output, len);
+ tmp = VARDATA(output);
+
/* first, store the number of dimensions / items */
memcpy(tmp, dependencies, offsetof(MVDependenciesData, deps));
tmp += offsetof(MVDependenciesData, deps);
- /* walk through the dependencies and copy both columns into the bytea */
+ /* store number of attributes and attribute numbers for each dependency */
for (i = 0; i < dependencies->ndeps; i++)
{
- memcpy(tmp, &(dependencies->deps[i]->a), sizeof(int16));
- tmp += sizeof(int16);
+ MVDependency d = dependencies->deps[i];
+
+ memcpy(tmp, &(d->nattributes), sizeof(int));
+ tmp += sizeof(int);
- memcpy(tmp, &(dependencies->deps[i]->b), sizeof(int16));
- tmp += sizeof(int16);
+ memcpy(tmp, d->attributes, sizeof(int16) * d->nattributes);
+ tmp += sizeof(int16) * d->nattributes;
+
+ Assert(tmp <= ((char*)output + len));
}
return output;
@@ -338,20 +566,21 @@ deserialize_mv_dependencies(bytea * data)
tmp += offsetof(MVDependenciesData, deps);
if (dependencies->magic != MVSTAT_DEPS_MAGIC)
- {
- pfree(dependencies);
- elog(WARNING, "not a MV Dependencies (magic number mismatch)");
- return NULL;
- }
+ elog(ERROR, "invalid dependency type %d (expected %dd)",
+ dependencies->type, MVSTAT_DEPS_MAGIC);
+
+ if (dependencies->type != MVSTAT_DEPS_TYPE_BASIC)
+ elog(ERROR, "invalid dependency type %d (expected %dd)",
+ dependencies->type, MVSTAT_DEPS_TYPE_BASIC);
Assert(dependencies->ndeps > 0);
- /* what bytea size do we expect for those parameters */
+ /* what minimum bytea size do we expect for those parameters */
expected_size = offsetof(MVDependenciesData,deps) +
- dependencies->ndeps * sizeof(int16) * 2;
+ dependencies->ndeps * (sizeof(int) + sizeof(int16) * 2);
- if (VARSIZE_ANY_EXHDR(data) != expected_size)
- elog(ERROR, "invalid dependencies size %ld (expected %ld)",
+ if (VARSIZE_ANY_EXHDR(data) < expected_size)
+ elog(ERROR, "invalid dependencies size %ld (expected at least %ld)",
VARSIZE_ANY_EXHDR(data), expected_size);
/* allocate space for the MCV items */
@@ -360,15 +589,35 @@ deserialize_mv_dependencies(bytea * data)
for (i = 0; i < dependencies->ndeps; i++)
{
- dependencies->deps[i] = (MVDependency)palloc0(sizeof(MVDependencyData));
+ int k;
+ MVDependency d;
+
+ /* number of attributes */
+ memcpy(&k, tmp, sizeof(int));
+ tmp += sizeof(int);
+
+ /* is the number of attributes valid? */
+ Assert((k >= 2) && (k <= MVSTATS_MAX_DIMENSIONS));
- memcpy(&(dependencies->deps[i]->a), tmp, sizeof(int16));
- tmp += sizeof(int16);
+ /* now that we know the number of attributes, allocate the dependency */
+ d = (MVDependency)palloc0(offsetof(MVDependencyData, attributes)
+ + k * sizeof(int));
- memcpy(&(dependencies->deps[i]->b), tmp, sizeof(int16));
- tmp += sizeof(int16);
+ d->nattributes = k;
+
+ /* copy attribute numbers */
+ memcpy(d->attributes, tmp, sizeof(int16) * d->nattributes);
+ tmp += sizeof(int16) * d->nattributes;
+
+ dependencies->deps[i] = d;
+
+ /* still within the bytea */
+ Assert(tmp <= ((char*)data + VARSIZE_ANY(data)));
}
+ /* we should have consumed the whole bytea exactly */
+ Assert(tmp == ((char*)data + VARSIZE_ANY(data)));
+
return dependencies;
}
@@ -392,46 +641,70 @@ pg_mv_stats_dependencies_info(PG_FUNCTION_ARGS)
PG_RETURN_TEXT_P(cstring_to_text(result));
}
-/* print the dependencies
- *
- * TODO Would be nice if this knew the actual column names (instead of
- * the attnums).
+/*
+ * print the dependencies
*
- * FIXME This is really ugly and does not really check the lengths and
- * strcpy/snprintf return values properly. Needs to be fixed.
+ * TODO Would be nice if this printed column names (instead of just attnums).
*/
Datum
pg_mv_stats_dependencies_show(PG_FUNCTION_ARGS)
{
- int i = 0;
- bytea *data = PG_GETARG_BYTEA_P(0);
- char *result = NULL;
- int len = 0;
+ int i, j;
+ bytea *data = PG_GETARG_BYTEA_P(0);
+ StringInfoData buf;
MVDependencies dependencies = deserialize_mv_dependencies(data);
if (dependencies == NULL)
PG_RETURN_NULL();
+ initStringInfo(&buf);
+
for (i = 0; i < dependencies->ndeps; i++)
{
MVDependency dependency = dependencies->deps[i];
- char buffer[128];
- int tmp = snprintf(buffer, 128, "%s%d => %d",
- ((i == 0) ? "" : ", "), dependency->a, dependency->b);
+ if (i > 0)
+ appendStringInfo(&buf, ", ");
- if (tmp < 127)
+ /* conditions */
+ appendStringInfoChar(&buf, '(');
+ for (j = 0; j < dependency->nattributes-1; j++)
{
- if (result == NULL)
- result = palloc0(len + tmp + 1);
- else
- result = repalloc(result, len + tmp + 1);
+ if (j > 0)
+ appendStringInfoChar(&buf, ',');
- strcpy(result + len, buffer);
- len += tmp;
+ appendStringInfo(&buf, "%d", dependency->attributes[j]);
}
+
+ /* the implied attribute */
+ appendStringInfo(&buf, ") => %d",
+ dependency->attributes[dependency->nattributes-1]);
}
- PG_RETURN_TEXT_P(cstring_to_text(result));
+ PG_RETURN_TEXT_P(cstring_to_text(buf.data));
+}
+
+MVDependencies
+load_mv_dependencies(Oid mvoid)
+{
+ bool isnull = false;
+ Datum deps;
+
+ /* fetch the pg_mv_statistic tuple for the given statistics OID */
+ HeapTuple htup = SearchSysCache1(MVSTATOID, ObjectIdGetDatum(mvoid));
+
+#ifdef USE_ASSERT_CHECKING
+ Form_pg_mv_statistic mvstat = (Form_pg_mv_statistic) GETSTRUCT(htup);
+ Assert(mvstat->deps_enabled && mvstat->deps_built);
+#endif
+
+ deps = SysCacheGetAttr(MVSTATOID, htup,
+ Anum_pg_mv_statistic_stadeps, &isnull);
+
+ Assert(!isnull);
+
+ ReleaseSysCache(htup);
+
+ return deserialize_mv_dependencies(DatumGetByteaP(deps));
}
diff --git a/src/include/utils/mvstats.h b/src/include/utils/mvstats.h
index 7ebd961..c6f45ab 100644
--- a/src/include/utils/mvstats.h
+++ b/src/include/utils/mvstats.h
@@ -17,22 +17,31 @@
#include "fmgr.h"
#include "commands/vacuum.h"
+/*
+ * Degree of how much MCV item / histogram bucket matches a clause.
+ * This is then considered when computing the selectivity.
+ */
+#define MVSTATS_MATCH_NONE 0 /* no match at all */
+#define MVSTATS_MATCH_PARTIAL 1 /* partial match */
+#define MVSTATS_MATCH_FULL 2 /* full match */
#define MVSTATS_MAX_DIMENSIONS 8 /* max number of attributes */
-/* An associative rule, tracking [a => b] dependency.
- *
- * TODO Make this work with multiple columns on both sides.
+
+/*
+ * Functional dependencies, tracking column-level relationships (values
+ * in one column determine values in another one).
*/
typedef struct MVDependencyData {
- int16 a;
- int16 b;
+ int nattributes; /* number of attributes */
+ int16 attributes[1]; /* attribute numbers */
} MVDependencyData;
typedef MVDependencyData* MVDependency;
typedef struct MVDependenciesData {
uint32 magic; /* magic constant marker */
+ uint32 type; /* type of MV Dependencies (BASIC) */
int32 ndeps; /* number of dependencies */
MVDependency deps[1]; /* XXX why not a pointer? */
} MVDependenciesData;
@@ -48,6 +57,8 @@ typedef MVDependenciesData* MVDependencies;
* stats specified using flags (or something like that).
*/
+MVDependencies load_mv_dependencies(Oid mvoid);
+
bytea * serialize_mv_dependencies(MVDependencies dependencies);
/* deserialization of stats (serialization is private to analyze) */
diff --git a/src/test/regress/expected/mv_dependencies.out b/src/test/regress/expected/mv_dependencies.out
new file mode 100644
index 0000000..e759997
--- /dev/null
+++ b/src/test/regress/expected/mv_dependencies.out
@@ -0,0 +1,172 @@
+-- data type passed by value
+CREATE TABLE functional_dependencies (
+ a INT,
+ b INT,
+ c INT
+);
+-- unknown column
+CREATE STATISTICS s1 ON functional_dependencies (unknown_column) WITH (dependencies);
+ERROR: column "unknown_column" referenced in statistics does not exist
+-- single column
+CREATE STATISTICS s1 ON functional_dependencies (a) WITH (dependencies);
+ERROR: multivariate stats require 2 or more columns
+-- single column, duplicated
+CREATE STATISTICS s1 ON functional_dependencies (a,a) WITH (dependencies);
+ERROR: duplicate column name in statistics definition
+-- two columns, one duplicated
+CREATE STATISTICS s1 ON functional_dependencies (a, a, b) WITH (dependencies);
+ERROR: duplicate column name in statistics definition
+-- unknown option
+CREATE STATISTICS s1 ON functional_dependencies (a, b, c) WITH (unknown_option);
+ERROR: unrecognized STATISTICS option "unknown_option"
+-- correct command
+CREATE STATISTICS s1 ON functional_dependencies (a, b, c) WITH (dependencies);
+-- random data (no functional dependencies)
+INSERT INTO functional_dependencies
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+ANALYZE functional_dependencies;
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+ deps_enabled | deps_built | pg_mv_stats_dependencies_show
+--------------+------------+-------------------------------
+ t | f |
+(1 row)
+
+TRUNCATE functional_dependencies;
+-- a => b, a => c, b => c
+INSERT INTO functional_dependencies
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE functional_dependencies;
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+ deps_enabled | deps_built | pg_mv_stats_dependencies_show
+--------------+------------+-------------------------------
+ t | t | 1 => 2, 1 => 3, 2 => 3
+(1 row)
+
+TRUNCATE functional_dependencies;
+-- a => b, a => c
+INSERT INTO functional_dependencies
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE functional_dependencies;
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+ deps_enabled | deps_built | pg_mv_stats_dependencies_show
+--------------+------------+-------------------------------
+ t | t | 1 => 2, 1 => 3
+(1 row)
+
+TRUNCATE functional_dependencies;
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO functional_dependencies
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX fdeps_idx ON functional_dependencies (a, b);
+ANALYZE functional_dependencies;
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+ deps_enabled | deps_built | pg_mv_stats_dependencies_show
+--------------+------------+-------------------------------
+ t | t | 1 => 2, 1 => 3, 2 => 3
+(1 row)
+
+EXPLAIN (COSTS off)
+ SELECT * FROM functional_dependencies WHERE a = 10 AND b = 5;
+ QUERY PLAN
+---------------------------------------------
+ Bitmap Heap Scan on functional_dependencies
+ Recheck Cond: ((a = 10) AND (b = 5))
+ -> Bitmap Index Scan on fdeps_idx
+ Index Cond: ((a = 10) AND (b = 5))
+(4 rows)
+
+DROP TABLE functional_dependencies;
+-- varlena type (text)
+CREATE TABLE functional_dependencies (
+ a TEXT,
+ b TEXT,
+ c TEXT
+);
+CREATE STATISTICS s2 ON functional_dependencies (a, b, c) WITH (dependencies);
+-- random data (no functional dependencies)
+INSERT INTO functional_dependencies
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+ANALYZE functional_dependencies;
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+ deps_enabled | deps_built | pg_mv_stats_dependencies_show
+--------------+------------+-------------------------------
+ t | f |
+(1 row)
+
+TRUNCATE functional_dependencies;
+-- a => b, a => c, b => c
+INSERT INTO functional_dependencies
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE functional_dependencies;
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+ deps_enabled | deps_built | pg_mv_stats_dependencies_show
+--------------+------------+-------------------------------
+ t | t | 1 => 2, 1 => 3, 2 => 3
+(1 row)
+
+TRUNCATE functional_dependencies;
+-- a => b, a => c
+INSERT INTO functional_dependencies
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE functional_dependencies;
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+ deps_enabled | deps_built | pg_mv_stats_dependencies_show
+--------------+------------+-------------------------------
+ t | t | 1 => 2, 1 => 3
+(1 row)
+
+TRUNCATE functional_dependencies;
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO functional_dependencies
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX fdeps_idx ON functional_dependencies (a, b);
+ANALYZE functional_dependencies;
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+ deps_enabled | deps_built | pg_mv_stats_dependencies_show
+--------------+------------+-------------------------------
+ t | t | 1 => 2, 1 => 3, 2 => 3
+(1 row)
+
+EXPLAIN (COSTS off)
+ SELECT * FROM functional_dependencies WHERE a = '10' AND b = '5';
+ QUERY PLAN
+------------------------------------------------------------
+ Bitmap Heap Scan on functional_dependencies
+ Recheck Cond: ((a = '10'::text) AND (b = '5'::text))
+ -> Bitmap Index Scan on fdeps_idx
+ Index Cond: ((a = '10'::text) AND (b = '5'::text))
+(4 rows)
+
+DROP TABLE functional_dependencies;
+-- NULL values (mix of int and text columns)
+CREATE TABLE functional_dependencies (
+ a INT,
+ b TEXT,
+ c INT,
+ d TEXT
+);
+CREATE STATISTICS s3 ON functional_dependencies (a, b, c, d) WITH (dependencies);
+INSERT INTO functional_dependencies
+ SELECT
+ mod(i, 100),
+ (CASE WHEN mod(i, 200) = 0 THEN NULL ELSE mod(i,200) END),
+ mod(i, 400),
+ (CASE WHEN mod(i, 300) = 0 THEN NULL ELSE mod(i,600) END)
+ FROM generate_series(1,10000) s(i);
+ANALYZE functional_dependencies;
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+ deps_enabled | deps_built | pg_mv_stats_dependencies_show
+--------------+------------+----------------------------------------
+ t | t | 2 => 1, 3 => 1, 3 => 2, 4 => 1, 4 => 2
+(1 row)
+
+DROP TABLE functional_dependencies;
diff --git a/src/test/regress/parallel_schedule b/src/test/regress/parallel_schedule
index bec0316..4f2ffb8 100644
--- a/src/test/regress/parallel_schedule
+++ b/src/test/regress/parallel_schedule
@@ -110,3 +110,6 @@ test: event_trigger
# run stats by itself because its delay may be insufficient under heavy load
test: stats
+
+# run tests of multivariate stats
+test: mv_dependencies
diff --git a/src/test/regress/serial_schedule b/src/test/regress/serial_schedule
index 7e9b319..097a04f 100644
--- a/src/test/regress/serial_schedule
+++ b/src/test/regress/serial_schedule
@@ -162,3 +162,4 @@ test: with
test: xml
test: event_trigger
test: stats
+test: mv_dependencies
diff --git a/src/test/regress/sql/mv_dependencies.sql b/src/test/regress/sql/mv_dependencies.sql
new file mode 100644
index 0000000..48dea4d
--- /dev/null
+++ b/src/test/regress/sql/mv_dependencies.sql
@@ -0,0 +1,150 @@
+-- data type passed by value
+CREATE TABLE functional_dependencies (
+ a INT,
+ b INT,
+ c INT
+);
+
+-- unknown column
+CREATE STATISTICS s1 ON functional_dependencies (unknown_column) WITH (dependencies);
+
+-- single column
+CREATE STATISTICS s1 ON functional_dependencies (a) WITH (dependencies);
+
+-- single column, duplicated
+CREATE STATISTICS s1 ON functional_dependencies (a,a) WITH (dependencies);
+
+-- two columns, one duplicated
+CREATE STATISTICS s1 ON functional_dependencies (a, a, b) WITH (dependencies);
+
+-- unknown option
+CREATE STATISTICS s1 ON functional_dependencies (a, b, c) WITH (unknown_option);
+
+-- correct command
+CREATE STATISTICS s1 ON functional_dependencies (a, b, c) WITH (dependencies);
+
+-- random data (no functional dependencies)
+INSERT INTO functional_dependencies
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+
+ANALYZE functional_dependencies;
+
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+
+TRUNCATE functional_dependencies;
+
+-- a => b, a => c, b => c
+INSERT INTO functional_dependencies
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+
+ANALYZE functional_dependencies;
+
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+
+TRUNCATE functional_dependencies;
+
+-- a => b, a => c
+INSERT INTO functional_dependencies
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE functional_dependencies;
+
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+
+TRUNCATE functional_dependencies;
+
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO functional_dependencies
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX fdeps_idx ON functional_dependencies (a, b);
+ANALYZE functional_dependencies;
+
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+
+EXPLAIN (COSTS off)
+ SELECT * FROM functional_dependencies WHERE a = 10 AND b = 5;
+
+DROP TABLE functional_dependencies;
+
+-- varlena type (text)
+CREATE TABLE functional_dependencies (
+ a TEXT,
+ b TEXT,
+ c TEXT
+);
+
+CREATE STATISTICS s2 ON functional_dependencies (a, b, c) WITH (dependencies);
+
+-- random data (no functional dependencies)
+INSERT INTO functional_dependencies
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+
+ANALYZE functional_dependencies;
+
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+
+TRUNCATE functional_dependencies;
+
+-- a => b, a => c, b => c
+INSERT INTO functional_dependencies
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+
+ANALYZE functional_dependencies;
+
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+
+TRUNCATE functional_dependencies;
+
+-- a => b, a => c
+INSERT INTO functional_dependencies
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE functional_dependencies;
+
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+
+TRUNCATE functional_dependencies;
+
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO functional_dependencies
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX fdeps_idx ON functional_dependencies (a, b);
+ANALYZE functional_dependencies;
+
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+
+EXPLAIN (COSTS off)
+ SELECT * FROM functional_dependencies WHERE a = '10' AND b = '5';
+
+DROP TABLE functional_dependencies;
+
+-- NULL values (mix of int and text columns)
+CREATE TABLE functional_dependencies (
+ a INT,
+ b TEXT,
+ c INT,
+ d TEXT
+);
+
+CREATE STATISTICS s3 ON functional_dependencies (a, b, c, d) WITH (dependencies);
+
+INSERT INTO functional_dependencies
+ SELECT
+ mod(i, 100),
+ (CASE WHEN mod(i, 200) = 0 THEN NULL ELSE mod(i,200) END),
+ mod(i, 400),
+ (CASE WHEN mod(i, 300) = 0 THEN NULL ELSE mod(i,600) END)
+ FROM generate_series(1,10000) s(i);
+
+ANALYZE functional_dependencies;
+
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+
+DROP TABLE functional_dependencies;
--
2.5.0
Attachment: 0002-shared-infrastructure-and-functional-dependencies.patch (text/x-patch)
From 2c719f145ab1f74e87b0da2c4cff67af9f4a8e50 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tv@fuzzy.cz>
Date: Sun, 11 Jan 2015 19:51:48 +0100
Subject: [PATCH 2/9] shared infrastructure and functional dependencies
Basic infrastructure shared by all kinds of multivariate stats, most
importantly:
- adds a new system catalog (pg_mv_statistic)
- CREATE STATISTICS name ON table (columns) WITH (options)
- DROP STATISTICS name
- ALTER STATISTICS ... OWNER TO / SET SCHEMA / RENAME
- implementation of functional dependencies (the simplest type of
multivariate statistics)
- building functional dependencies in ANALYZE
- updates regression tests (new catalog etc.)
This does not include any changes to the optimizer, i.e. it does not
influence query planning (that is the subject of follow-up patches).
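For illustration, the intended life cycle of such statistics looks
roughly like this (the table and statistics names are made up):

CREATE TABLE t (a INT, b INT);
CREATE STATISTICS s ON t (a, b) WITH (dependencies);
ANALYZE t;    -- builds the functional dependencies
ALTER STATISTICS s RENAME TO s2;
DROP STATISTICS s2;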
The current implementation requires a valid 'ltopr' for the columns, so
that we can sort the sample rows in various ways, both in this patch
and in other kinds of statistics. Maybe this restriction could be
relaxed in the future, requiring just 'eqopr' for stats that do not
sort the data (e.g. functional dependencies and MCV lists).
Maybe some of the stats (functional dependencies and MCV lists with
limited functionality) might be made to work with hashes of the values,
which is sufficient for equality comparisons. But the queries would
require the equality operator anyway, so it's not really a weaker
requirement. The hashes might reduce space requirements, though.
The algorithm detecting the dependencies is rather simple and probably
needs improvements, both to detect more complicated dependencies and to
validate the math.
The name 'functional dependencies' is more correct than 'association
rules', as it's exactly the term used in relational theory (esp. Normal
Forms) for this kind of column-level dependency.
The multivariate statistics are automatically removed in two situations:
(a) after a DROP TABLE (obviously)
(b) after ALTER TABLE ... DROP COLUMN, if the statistics would be left
defined on fewer than 2 remaining columns
If at least two columns remain, we keep the statistics but perform a
cleanup on the next ANALYZE: the dropped columns are removed from
stakeys, and the new statistics is built on the smaller set.
We can't do this rebuild at DROP COLUMN time, because that would either
leave us with invalid statistics, or force us to throw away statistics
we could still use. This lazy approach lets us keep using the statistics
even though some of the columns are dead.
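A sketch of the intended behavior (assuming statistics defined on three
columns of a table t):

CREATE STATISTICS s ON t (a, b, c) WITH (dependencies);
ALTER TABLE t DROP COLUMN c;  -- s is kept, next ANALYZE rebuilds it on (a, b)
ALTER TABLE t DROP COLUMN b;  -- s would cover a single column, so it's removed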
This also adds a simple list of statistics to \d in psql.
Statistics are schema objects, i.e. they are created within a schema by
using a qualified name (or in the default schema)
CREATE STATISTICS schema.statistics ON ...
and then dropped either by specifying the qualified name
DROP STATISTICS schema.statistics
or by searching through search_path (just like with other objects).
This also gets rid of the "(opt_)stats_name" definitions in gram.y and
replaces them with just "opt_any_name", although the optional case is
not really handled currently - there's no generated name yet (so we
should either drop the optional form or implement name generation).
I'm not entirely sure making statistics schema-specific is such a great
idea. Maybe they should be "global", but that does not seem right either
(e.g. it makes multi-tenant systems based on schemas more difficult to
manage, because tenants would interact).
---
doc/src/sgml/ref/allfiles.sgml | 3 +
doc/src/sgml/ref/alter_statistics.sgml | 115 +++++++
doc/src/sgml/ref/create_statistics.sgml | 198 ++++++++++++
doc/src/sgml/ref/drop_statistics.sgml | 91 ++++++
doc/src/sgml/reference.sgml | 2 +
src/backend/catalog/Makefile | 1 +
src/backend/catalog/aclchk.c | 27 ++
src/backend/catalog/dependency.c | 11 +-
src/backend/catalog/heap.c | 102 ++++++
src/backend/catalog/namespace.c | 51 +++
src/backend/catalog/objectaddress.c | 54 ++++
src/backend/catalog/system_views.sql | 11 +
src/backend/commands/Makefile | 6 +-
src/backend/commands/alter.c | 3 +
src/backend/commands/analyze.c | 21 ++
src/backend/commands/dropcmds.c | 4 +
src/backend/commands/event_trigger.c | 3 +
src/backend/commands/statscmds.c | 277 ++++++++++++++++
src/backend/nodes/copyfuncs.c | 17 +
src/backend/nodes/outfuncs.c | 18 ++
src/backend/optimizer/util/plancat.c | 59 ++++
src/backend/parser/gram.y | 60 +++-
src/backend/tcop/utility.c | 14 +
src/backend/utils/Makefile | 2 +-
src/backend/utils/cache/relcache.c | 59 ++++
src/backend/utils/cache/syscache.c | 23 ++
src/backend/utils/mvstats/Makefile | 17 +
src/backend/utils/mvstats/README.dependencies | 222 +++++++++++++
src/backend/utils/mvstats/common.c | 356 +++++++++++++++++++++
src/backend/utils/mvstats/common.h | 75 +++++
src/backend/utils/mvstats/dependencies.c | 437 ++++++++++++++++++++++++++
src/bin/psql/describe.c | 44 +++
src/include/catalog/dependency.h | 5 +-
src/include/catalog/heap.h | 1 +
src/include/catalog/indexing.h | 7 +
src/include/catalog/namespace.h | 2 +
src/include/catalog/pg_mv_statistic.h | 75 +++++
src/include/catalog/pg_proc.h | 5 +
src/include/catalog/toasting.h | 1 +
src/include/commands/defrem.h | 4 +
src/include/nodes/nodes.h | 2 +
src/include/nodes/parsenodes.h | 12 +
src/include/nodes/relation.h | 28 ++
src/include/utils/acl.h | 1 +
src/include/utils/mvstats.h | 70 +++++
src/include/utils/rel.h | 4 +
src/include/utils/relcache.h | 1 +
src/include/utils/syscache.h | 2 +
src/test/regress/expected/object_address.out | 7 +-
src/test/regress/expected/rules.out | 9 +
src/test/regress/expected/sanity_check.out | 1 +
src/test/regress/sql/object_address.sql | 4 +-
52 files changed, 2613 insertions(+), 11 deletions(-)
create mode 100644 doc/src/sgml/ref/alter_statistics.sgml
create mode 100644 doc/src/sgml/ref/create_statistics.sgml
create mode 100644 doc/src/sgml/ref/drop_statistics.sgml
create mode 100644 src/backend/commands/statscmds.c
create mode 100644 src/backend/utils/mvstats/Makefile
create mode 100644 src/backend/utils/mvstats/README.dependencies
create mode 100644 src/backend/utils/mvstats/common.c
create mode 100644 src/backend/utils/mvstats/common.h
create mode 100644 src/backend/utils/mvstats/dependencies.c
create mode 100644 src/include/catalog/pg_mv_statistic.h
create mode 100644 src/include/utils/mvstats.h
diff --git a/doc/src/sgml/ref/allfiles.sgml b/doc/src/sgml/ref/allfiles.sgml
index bf95453..524ed83 100644
--- a/doc/src/sgml/ref/allfiles.sgml
+++ b/doc/src/sgml/ref/allfiles.sgml
@@ -32,6 +32,7 @@ Complete list of usable sgml source files in this directory.
<!ENTITY alterServer SYSTEM "alter_server.sgml">
<!ENTITY alterSequence SYSTEM "alter_sequence.sgml">
<!ENTITY alterSystem SYSTEM "alter_system.sgml">
+<!ENTITY alterStatistics SYSTEM "alter_statistics.sgml">
<!ENTITY alterTable SYSTEM "alter_table.sgml">
<!ENTITY alterTableSpace SYSTEM "alter_tablespace.sgml">
<!ENTITY alterTSConfig SYSTEM "alter_tsconfig.sgml">
@@ -76,6 +77,7 @@ Complete list of usable sgml source files in this directory.
<!ENTITY createSchema SYSTEM "create_schema.sgml">
<!ENTITY createSequence SYSTEM "create_sequence.sgml">
<!ENTITY createServer SYSTEM "create_server.sgml">
+<!ENTITY createStatistics SYSTEM "create_statistics.sgml">
<!ENTITY createTable SYSTEM "create_table.sgml">
<!ENTITY createTableAs SYSTEM "create_table_as.sgml">
<!ENTITY createTableSpace SYSTEM "create_tablespace.sgml">
@@ -119,6 +121,7 @@ Complete list of usable sgml source files in this directory.
<!ENTITY dropSchema SYSTEM "drop_schema.sgml">
<!ENTITY dropSequence SYSTEM "drop_sequence.sgml">
<!ENTITY dropServer SYSTEM "drop_server.sgml">
+<!ENTITY dropStatistics SYSTEM "drop_statistics.sgml">
<!ENTITY dropTable SYSTEM "drop_table.sgml">
<!ENTITY dropTableSpace SYSTEM "drop_tablespace.sgml">
<!ENTITY dropTransform SYSTEM "drop_transform.sgml">
diff --git a/doc/src/sgml/ref/alter_statistics.sgml b/doc/src/sgml/ref/alter_statistics.sgml
new file mode 100644
index 0000000..aa421c0
--- /dev/null
+++ b/doc/src/sgml/ref/alter_statistics.sgml
@@ -0,0 +1,115 @@
+<!--
+doc/src/sgml/ref/alter_statistics.sgml
+PostgreSQL documentation
+-->
+
+<refentry id="SQL-ALTERSTATISTICS">
+ <indexterm zone="sql-alterstatistics">
+ <primary>ALTER STATISTICS</primary>
+ </indexterm>
+
+ <refmeta>
+ <refentrytitle>ALTER STATISTICS</refentrytitle>
+ <manvolnum>7</manvolnum>
+ <refmiscinfo>SQL - Language Statements</refmiscinfo>
+ </refmeta>
+
+ <refnamediv>
+ <refname>ALTER STATISTICS</refname>
+ <refpurpose>
+ change the definition of a multivariate statistics
+ </refpurpose>
+ </refnamediv>
+
+ <refsynopsisdiv>
+<synopsis>
+ALTER STATISTICS <replaceable class="parameter">name</replaceable> OWNER TO { <replaceable class="PARAMETER">new_owner</replaceable> | CURRENT_USER | SESSION_USER }
+ALTER STATISTICS <replaceable class="parameter">name</replaceable> RENAME TO <replaceable class="parameter">new_name</replaceable>
+ALTER STATISTICS <replaceable class="parameter">name</replaceable> SET SCHEMA <replaceable class="parameter">new_schema</replaceable>
+</synopsis>
+ </refsynopsisdiv>
+
+ <refsect1>
+ <title>Description</title>
+
+ <para>
+ <command>ALTER STATISTICS</command> changes the parameters of an existing
+ multivariate statistics. Any parameters not specifically set in the
+ <command>ALTER STATISTICS</command> command retain their prior settings.
+ </para>
+
+ <para>
+ You must own the statistics to use <command>ALTER STATISTICS</>.
+ To change a statistics' schema, you must also have <literal>CREATE</>
+ privilege on the new schema.
+ To alter the owner, you must also be a direct or indirect member of the new
+ owning role, and that role must have <literal>CREATE</literal> privilege on
+ the statistics' schema. (These restrictions enforce that altering the owner
+ doesn't do anything you couldn't do by dropping and recreating the statistics.
+ However, a superuser can alter ownership of any statistics anyway.)
+ </para>
+ </refsect1>
+
+ <refsect1>
+ <title>Parameters</title>
+
+ <para>
+ <variablelist>
+ <varlistentry>
+ <term><replaceable class="parameter">name</replaceable></term>
+ <listitem>
+ <para>
+ The name (optionally schema-qualified) of a statistics to be altered.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><replaceable class="PARAMETER">new_owner</replaceable></term>
+ <listitem>
+ <para>
+ The user name of the new owner of the statistics.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><replaceable class="parameter">new_name</replaceable></term>
+ <listitem>
+ <para>
+ The new name for the statistics.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><replaceable class="parameter">new_schema</replaceable></term>
+ <listitem>
+ <para>
+ The new schema for the statistics.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ </variablelist>
+ </para>
+ </refsect1>
+
+ <refsect1>
+ <title>Compatibility</title>
+
+ <para>
+ There's no <command>ALTER STATISTICS</command> command in the SQL standard.
+ </para>
+ </refsect1>
+
+ <refsect1>
+ <title>See Also</title>
+
+ <simplelist type="inline">
+ <member><xref linkend="sql-createstatistics"></member>
+ <member><xref linkend="sql-dropstatistics"></member>
+ </simplelist>
+ </refsect1>
+
+</refentry>
diff --git a/doc/src/sgml/ref/create_statistics.sgml b/doc/src/sgml/ref/create_statistics.sgml
new file mode 100644
index 0000000..ff09fa5
--- /dev/null
+++ b/doc/src/sgml/ref/create_statistics.sgml
@@ -0,0 +1,198 @@
+<!--
+doc/src/sgml/ref/create_statistics.sgml
+PostgreSQL documentation
+-->
+
+<refentry id="SQL-CREATESTATISTICS">
+ <indexterm zone="sql-createstatistics">
+ <primary>CREATE STATISTICS</primary>
+ </indexterm>
+
+ <refmeta>
+ <refentrytitle>CREATE STATISTICS</refentrytitle>
+ <manvolnum>7</manvolnum>
+ <refmiscinfo>SQL - Language Statements</refmiscinfo>
+ </refmeta>
+
+ <refnamediv>
+ <refname>CREATE STATISTICS</refname>
+ <refpurpose>define a new statistics</refpurpose>
+ </refnamediv>
+
+ <refsynopsisdiv>
+<synopsis>
+CREATE STATISTICS [ IF NOT EXISTS ] <replaceable class="PARAMETER">statistics_name</replaceable> ON <replaceable class="PARAMETER">table_name</replaceable>
+    ( <replaceable class="PARAMETER">column_name</replaceable>, <replaceable class="PARAMETER">column_name</replaceable> [, ...] )
+    [ WITH ( <replaceable class="PARAMETER">statistics_parameter</replaceable> [= <replaceable class="PARAMETER">value</replaceable>] [, ... ] ) ]
+</synopsis>
+
+ </refsynopsisdiv>
+
+ <refsect1 id="SQL-CREATESTATISTICS-description">
+ <title>Description</title>
+
+ <para>
+   <command>CREATE STATISTICS</command> will create a new multivariate
+   statistics on the table. The statistics will be created in the
+   current database and will be owned by the user issuing the command.
+ </para>
+
+ <para>
+   If a schema name is given (for example, <literal>CREATE STATISTICS
+   myschema.mystat ...</>) then the statistics is created in the specified
+   schema. Otherwise it is created in the current schema. The name of
+   the statistics must be distinct from the name of any other statistics
+   in the same schema.
+ </para>
+
+  <para>
+   To be able to create statistics, you must have <literal>USAGE</literal>
+   privilege on the types of all referenced columns.
+  </para>
+ </refsect1>
+
+ <refsect1>
+ <title>Parameters</title>
+
+ <variablelist>
+
+ <varlistentry>
+ <term><literal>IF NOT EXISTS</></term>
+ <listitem>
+ <para>
+ Do not throw an error if a statistics with the same name already exists.
+ A notice is issued in this case. Note that there is no guarantee that
+ the existing statistics is anything like the one that would have been
+ created.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><replaceable class="PARAMETER">statistics_name</replaceable></term>
+ <listitem>
+ <para>
+ The name (optionally schema-qualified) of the statistics to be created.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><replaceable class="PARAMETER">table_name</replaceable></term>
+ <listitem>
+ <para>
+ The name (optionally schema-qualified) of the table the statistics should
+ be created on.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><replaceable class="PARAMETER">column_name</replaceable></term>
+ <listitem>
+ <para>
+ The name of a column to be included in the statistics.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><literal>WITH ( <replaceable class="PARAMETER">statistics_parameter</replaceable> [= <replaceable class="PARAMETER">value</replaceable>] [, ... ] )</literal></term>
+ <listitem>
+      <para>
+       This clause specifies optional parameters for the statistics; see
+       <xref linkend="sql-createstatistics-parameters"> below.
+      </para>
+ </listitem>
+ </varlistentry>
+
+ </variablelist>
+
+ <refsect2 id="SQL-CREATESTATISTICS-parameters">
+ <title id="SQL-CREATESTATISTICS-parameters-title">Statistics Parameters</title>
+
+ <indexterm zone="sql-createstatistics-parameters">
+ <primary>statistics parameters</primary>
+ </indexterm>
+
+ <para>
+ The <literal>WITH</> clause can specify <firstterm>statistics parameters</>
+ for statistics. The currently available parameters are listed below.
+ </para>
+
+ <variablelist>
+
+ <varlistentry>
+ <term><literal>dependencies</> (<type>boolean</>)</term>
+ <listitem>
+ <para>
+ Enables functional dependencies for the statistics.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ </variablelist>
+
+ </refsect2>
+ </refsect1>
+
+ <refsect1 id="SQL-CREATESTATISTICS-notes">
+ <title>Notes</title>
+
+ <para>
+ ...
+ </para>
+
+ </refsect1>
+
+
+ <refsect1 id="SQL-CREATESTATISTICS-examples">
+ <title>Examples</title>
+
+ <para>
+ Create table <structname>t1</> with two functionally dependent columns, i.e.
+   knowledge of a value in the first column is sufficient for determining the
+ value in the other column. Then functional dependencies are built on those
+ columns:
+
+<programlisting>
+CREATE TABLE t1 (
+ a int,
+ b int
+);
+
+INSERT INTO t1 SELECT i/100, i/500
+ FROM generate_series(1,1000000) s(i);
+
+CREATE STATISTICS s1 ON t1 (a, b) WITH (dependencies);
+
+ANALYZE t1;
+
+-- valid combination of values
+EXPLAIN ANALYZE SELECT * FROM t1 WHERE (a = 1) AND (b = 1);
+
+-- invalid combination of values
+EXPLAIN ANALYZE SELECT * FROM t1 WHERE (a = 1) AND (b = 2);
+</programlisting>
+ </para>
+
+ </refsect1>
+
+ <refsect1>
+ <title>Compatibility</title>
+
+ <para>
+ There's no <command>CREATE STATISTICS</command> command in the SQL standard.
+ </para>
+ </refsect1>
+
+ <refsect1>
+ <title>See Also</title>
+
+ <simplelist type="inline">
+ <member><xref linkend="sql-alterstatistics"></member>
+ <member><xref linkend="sql-dropstatistics"></member>
+ </simplelist>
+ </refsect1>
+</refentry>
diff --git a/doc/src/sgml/ref/drop_statistics.sgml b/doc/src/sgml/ref/drop_statistics.sgml
new file mode 100644
index 0000000..dd9047a
--- /dev/null
+++ b/doc/src/sgml/ref/drop_statistics.sgml
@@ -0,0 +1,91 @@
+<!--
+doc/src/sgml/ref/drop_statistics.sgml
+PostgreSQL documentation
+-->
+
+<refentry id="SQL-DROPSTATISTICS">
+ <indexterm zone="sql-dropstatistics">
+ <primary>DROP STATISTICS</primary>
+ </indexterm>
+
+ <refmeta>
+ <refentrytitle>DROP STATISTICS</refentrytitle>
+ <manvolnum>7</manvolnum>
+ <refmiscinfo>SQL - Language Statements</refmiscinfo>
+ </refmeta>
+
+ <refnamediv>
+ <refname>DROP STATISTICS</refname>
+ <refpurpose>remove a statistics</refpurpose>
+ </refnamediv>
+
+ <refsynopsisdiv>
+<synopsis>
+DROP STATISTICS [ IF EXISTS ] <replaceable class="PARAMETER">name</replaceable> [, ...]
+</synopsis>
+ </refsynopsisdiv>
+
+ <refsect1>
+ <title>Description</title>
+
+ <para>
+   <command>DROP STATISTICS</command> removes statistics from the database.
+   Only the statistics owner, the schema owner, and a superuser can drop
+   statistics.
+ </para>
+
+ </refsect1>
+
+ <refsect1>
+ <title>Parameters</title>
+
+ <variablelist>
+ <varlistentry>
+ <term><literal>IF EXISTS</literal></term>
+ <listitem>
+ <para>
+ Do not throw an error if the statistics does not exist. A notice is
+ issued in this case.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><replaceable class="PARAMETER">name</replaceable></term>
+ <listitem>
+ <para>
+ The name (optionally schema-qualified) of the statistics to drop.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ </variablelist>
+ </refsect1>
+
+ <refsect1>
+ <title>Examples</title>
+
+  <para>
+   Drop the statistics <literal>s1</literal> created in the example for
+   <command>CREATE STATISTICS</command>:
+  </para>
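+
+<programlisting>
+DROP STATISTICS s1;
+</programlisting>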
+
+ </refsect1>
+
+ <refsect1>
+ <title>Compatibility</title>
+
+ <para>
+ There's no <command>DROP STATISTICS</command> command in the SQL standard.
+ </para>
+ </refsect1>
+
+ <refsect1>
+ <title>See Also</title>
+
+ <simplelist type="inline">
+ <member><xref linkend="sql-alterstatistics"></member>
+ <member><xref linkend="sql-createstatistics"></member>
+ </simplelist>
+ </refsect1>
+
+</refentry>
diff --git a/doc/src/sgml/reference.sgml b/doc/src/sgml/reference.sgml
index 03020df..2b07b2d 100644
--- a/doc/src/sgml/reference.sgml
+++ b/doc/src/sgml/reference.sgml
@@ -104,6 +104,7 @@
&createSchema;
&createSequence;
&createServer;
+ &createStatistics;
&createTable;
&createTableAs;
&createTableSpace;
@@ -147,6 +148,7 @@
&dropSchema;
&dropSequence;
&dropServer;
+ &dropStatistics;
&dropTable;
&dropTableSpace;
&dropTSConfig;
diff --git a/src/backend/catalog/Makefile b/src/backend/catalog/Makefile
index 25130ec..058b8a9 100644
--- a/src/backend/catalog/Makefile
+++ b/src/backend/catalog/Makefile
@@ -32,6 +32,7 @@ POSTGRES_BKI_SRCS = $(addprefix $(top_srcdir)/src/include/catalog/,\
pg_attrdef.h pg_constraint.h pg_inherits.h pg_index.h pg_operator.h \
pg_opfamily.h pg_opclass.h pg_am.h pg_amop.h pg_amproc.h \
pg_language.h pg_largeobject_metadata.h pg_largeobject.h pg_aggregate.h \
+ pg_mv_statistic.h \
pg_statistic.h pg_rewrite.h pg_trigger.h pg_event_trigger.h pg_description.h \
pg_cast.h pg_enum.h pg_namespace.h pg_conversion.h pg_depend.h \
pg_database.h pg_db_role_setting.h pg_tablespace.h pg_pltemplate.h \
diff --git a/src/backend/catalog/aclchk.c b/src/backend/catalog/aclchk.c
index 0f3bc07..e21aacd 100644
--- a/src/backend/catalog/aclchk.c
+++ b/src/backend/catalog/aclchk.c
@@ -38,6 +38,7 @@
#include "catalog/pg_language.h"
#include "catalog/pg_largeobject.h"
#include "catalog/pg_largeobject_metadata.h"
+#include "catalog/pg_mv_statistic.h"
#include "catalog/pg_namespace.h"
#include "catalog/pg_opclass.h"
#include "catalog/pg_operator.h"
@@ -5021,6 +5022,32 @@ pg_extension_ownercheck(Oid ext_oid, Oid roleid)
}
/*
+ * Ownership check for a multivariate statistics (specified by OID).
+ */
+bool
+pg_statistics_ownercheck(Oid stat_oid, Oid roleid)
+{
+ HeapTuple tuple;
+ Oid ownerId;
+
+ /* Superusers bypass all permission checking. */
+ if (superuser_arg(roleid))
+ return true;
+
+ tuple = SearchSysCache1(MVSTATOID, ObjectIdGetDatum(stat_oid));
+ if (!HeapTupleIsValid(tuple))
+ ereport(ERROR,
+ (errcode(ERRCODE_UNDEFINED_OBJECT),
+ errmsg("statistics with OID %u does not exist", stat_oid)));
+
+ ownerId = ((Form_pg_mv_statistic) GETSTRUCT(tuple))->staowner;
+
+ ReleaseSysCache(tuple);
+
+ return has_privs_of_role(roleid, ownerId);
+}
+
+/*
* Check whether specified role has CREATEROLE privilege (or is a superuser)
*
* Note: roles do not have owners per se; instead we use this test in
diff --git a/src/backend/catalog/dependency.c b/src/backend/catalog/dependency.c
index c48e37b..8200454 100644
--- a/src/backend/catalog/dependency.c
+++ b/src/backend/catalog/dependency.c
@@ -40,6 +40,7 @@
#include "catalog/pg_foreign_server.h"
#include "catalog/pg_language.h"
#include "catalog/pg_largeobject.h"
+#include "catalog/pg_mv_statistic.h"
#include "catalog/pg_namespace.h"
#include "catalog/pg_opclass.h"
#include "catalog/pg_operator.h"
@@ -160,7 +161,8 @@ static const Oid object_classes[] = {
ExtensionRelationId, /* OCLASS_EXTENSION */
EventTriggerRelationId, /* OCLASS_EVENT_TRIGGER */
PolicyRelationId, /* OCLASS_POLICY */
- TransformRelationId /* OCLASS_TRANSFORM */
+ TransformRelationId, /* OCLASS_TRANSFORM */
+ MvStatisticRelationId /* OCLASS_STATISTICS */
};
@@ -1272,6 +1274,10 @@ doDeletion(const ObjectAddress *object, int flags)
DropTransformById(object->objectId);
break;
+ case OCLASS_STATISTICS:
+ RemoveStatisticsById(object->objectId);
+ break;
+
default:
elog(ERROR, "unrecognized object class: %u",
object->classId);
@@ -2415,6 +2421,9 @@ getObjectClass(const ObjectAddress *object)
case TransformRelationId:
return OCLASS_TRANSFORM;
+
+ case MvStatisticRelationId:
+ return OCLASS_STATISTICS;
}
/* shouldn't get here */
diff --git a/src/backend/catalog/heap.c b/src/backend/catalog/heap.c
index e997b57..47ec8cc 100644
--- a/src/backend/catalog/heap.c
+++ b/src/backend/catalog/heap.c
@@ -47,6 +47,7 @@
#include "catalog/pg_constraint_fn.h"
#include "catalog/pg_foreign_table.h"
#include "catalog/pg_inherits.h"
+#include "catalog/pg_mv_statistic.h"
#include "catalog/pg_namespace.h"
#include "catalog/pg_statistic.h"
#include "catalog/pg_tablespace.h"
@@ -1613,7 +1614,10 @@ RemoveAttributeById(Oid relid, AttrNumber attnum)
heap_close(attr_rel, RowExclusiveLock);
if (attnum > 0)
+ {
RemoveStatistics(relid, attnum);
+ RemoveMVStatistics(relid, attnum);
+ }
relation_close(rel, NoLock);
}
@@ -1841,6 +1845,11 @@ heap_drop_with_catalog(Oid relid)
RemoveStatistics(relid, 0);
/*
+	 * delete multivariate statistics
+ */
+ RemoveMVStatistics(relid, 0);
+
+ /*
* delete attribute tuples
*/
DeleteAttributeTuples(relid);
@@ -2692,6 +2701,99 @@ RemoveStatistics(Oid relid, AttrNumber attnum)
/*
+ * RemoveMVStatistics --- remove entries in pg_mv_statistic for a rel
+ *
+ * If attnum is zero, remove all entries for rel; else remove only the one(s)
+ * for that column.
+ */
+void
+RemoveMVStatistics(Oid relid, AttrNumber attnum)
+{
+ Relation pgmvstatistic;
+ TupleDesc tupdesc = NULL;
+ SysScanDesc scan;
+ ScanKeyData key;
+ HeapTuple tuple;
+
+ /*
+ * When dropping a column, we'll drop statistics with a single
+ * remaining (undropped column). To do that, we need the tuple
+ * descriptor.
+ *
+ * We already have the relation locked (as we're running ALTER
+ * TABLE ... DROP COLUMN), so we'll just get the descriptor here.
+ */
+	if (attnum != 0)
+	{
+		Relation rel = relation_open(relid, NoLock);
+
+		/* multivariate stats are supported on tables and matviews */
+		if (rel->rd_rel->relkind == RELKIND_RELATION ||
+			rel->rd_rel->relkind == RELKIND_MATVIEW)
+			tupdesc = RelationGetDescr(rel);
+
+		relation_close(rel, NoLock);
+
+		/* unsupported relkind, so there can be no stats to remove */
+		if (tupdesc == NULL)
+			return;
+	}
+
+ pgmvstatistic = heap_open(MvStatisticRelationId, RowExclusiveLock);
+
+ ScanKeyInit(&key,
+ Anum_pg_mv_statistic_starelid,
+ BTEqualStrategyNumber, F_OIDEQ,
+ ObjectIdGetDatum(relid));
+
+ scan = systable_beginscan(pgmvstatistic,
+ MvStatisticRelidIndexId,
+ true, NULL, 1, &key);
+
+ /* we must loop even when attnum != 0, in case of inherited stats */
+ while (HeapTupleIsValid(tuple = systable_getnext(scan)))
+ {
+ bool delete = true;
+
+ if (attnum != 0)
+ {
+ Datum adatum;
+ bool isnull;
+ int i;
+ int ncolumns = 0;
+ ArrayType *arr;
+ int16 *attnums;
+
+ /* get the columns */
+ adatum = SysCacheGetAttr(MVSTATOID, tuple,
+ Anum_pg_mv_statistic_stakeys, &isnull);
+ Assert(!isnull);
+
+ arr = DatumGetArrayTypeP(adatum);
+ attnums = (int16*)ARR_DATA_PTR(arr);
+
+ for (i = 0; i < ARR_DIMS(arr)[0]; i++)
+ {
+			/* count the column unless it has been / is being dropped */
+ if ((! tupdesc->attrs[attnums[i]-1]->attisdropped) &&
+ (attnums[i] != attnum))
+ ncolumns += 1;
+ }
+
+		/* delete the stats if fewer than two columns remain */
+ delete = (ncolumns < 2);
+ }
+
+ if (delete)
+ simple_heap_delete(pgmvstatistic, &tuple->t_self);
+ }
+
+ systable_endscan(scan);
+
+ heap_close(pgmvstatistic, RowExclusiveLock);
+}
+
+
+/*
* RelationTruncateIndexes - truncate all indexes associated
* with the heap relation to zero tuples.
*
diff --git a/src/backend/catalog/namespace.c b/src/backend/catalog/namespace.c
index 446b2ac..dfd5bef 100644
--- a/src/backend/catalog/namespace.c
+++ b/src/backend/catalog/namespace.c
@@ -4201,3 +4201,54 @@ pg_is_other_temp_schema(PG_FUNCTION_ARGS)
PG_RETURN_BOOL(isOtherTempNamespace(oid));
}
+
+Oid
+get_statistics_oid(List *names, bool missing_ok)
+{
+ char *schemaname;
+ char *stats_name;
+ Oid namespaceId;
+ Oid stats_oid = InvalidOid;
+ ListCell *l;
+
+ /* deconstruct the name list */
+ DeconstructQualifiedName(names, &schemaname, &stats_name);
+
+ if (schemaname)
+ {
+ /* use exact schema given */
+ namespaceId = LookupExplicitNamespace(schemaname, missing_ok);
+ if (missing_ok && !OidIsValid(namespaceId))
+ stats_oid = InvalidOid;
+ else
+ stats_oid = GetSysCacheOid2(MVSTATNAMENSP,
+ PointerGetDatum(stats_name),
+ ObjectIdGetDatum(namespaceId));
+ }
+ else
+ {
+ /* search for it in search path */
+ recomputeNamespacePath();
+
+ foreach(l, activeSearchPath)
+ {
+ namespaceId = lfirst_oid(l);
+
+ if (namespaceId == myTempNamespace)
+ continue; /* do not look in temp namespace */
+ stats_oid = GetSysCacheOid2(MVSTATNAMENSP,
+ PointerGetDatum(stats_name),
+ ObjectIdGetDatum(namespaceId));
+ if (OidIsValid(stats_oid))
+ break;
+ }
+ }
+
+ if (!OidIsValid(stats_oid) && !missing_ok)
+ ereport(ERROR,
+ (errcode(ERRCODE_UNDEFINED_OBJECT),
+ errmsg("statistics \"%s\" does not exist",
+ NameListToString(names))));
+
+ return stats_oid;
+}
diff --git a/src/backend/catalog/objectaddress.c b/src/backend/catalog/objectaddress.c
index d2aaa6d..c13a569 100644
--- a/src/backend/catalog/objectaddress.c
+++ b/src/backend/catalog/objectaddress.c
@@ -39,6 +39,7 @@
#include "catalog/pg_language.h"
#include "catalog/pg_largeobject.h"
#include "catalog/pg_largeobject_metadata.h"
+#include "catalog/pg_mv_statistic.h"
#include "catalog/pg_namespace.h"
#include "catalog/pg_opclass.h"
#include "catalog/pg_opfamily.h"
@@ -438,9 +439,22 @@ static const ObjectPropertyType ObjectProperty[] =
Anum_pg_type_typacl,
ACL_KIND_TYPE,
true
+ },
+ {
+ MvStatisticRelationId,
+ MvStatisticOidIndexId,
+ MVSTATOID,
+ MVSTATNAMENSP,
+ Anum_pg_mv_statistic_staname,
+ Anum_pg_mv_statistic_stanamespace,
+ Anum_pg_mv_statistic_staowner,
+ InvalidAttrNumber, /* no ACL (same as relation) */
+ -1, /* no ACL */
+ true
}
};
+
/*
* This struct maps the string object types as returned by
* getObjectTypeDescription into ObjType enum values. Note that some enum
@@ -640,6 +654,10 @@ static const struct object_type_map
/* OCLASS_TRANSFORM */
{
"transform", OBJECT_TRANSFORM
+ },
+ /* OBJECT_STATISTICS */
+ {
+ "statistics", OBJECT_STATISTICS
}
};
@@ -913,6 +931,11 @@ get_object_address(ObjectType objtype, List *objname, List *objargs,
address = get_object_address_defacl(objname, objargs,
missing_ok);
break;
+ case OBJECT_STATISTICS:
+ address.classId = MvStatisticRelationId;
+ address.objectId = get_statistics_oid(objname, missing_ok);
+ address.objectSubId = 0;
+ break;
default:
elog(ERROR, "unrecognized objtype: %d", (int) objtype);
/* placate compiler, in case it thinks elog might return */
@@ -2185,6 +2208,10 @@ check_object_ownership(Oid roleid, ObjectType objtype, ObjectAddress address,
(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
errmsg("must be superuser")));
break;
+ case OBJECT_STATISTICS:
+ if (!pg_statistics_ownercheck(address.objectId, roleid))
+ aclcheck_error_type(ACLCHECK_NOT_OWNER, address.objectId);
+ break;
default:
elog(ERROR, "unrecognized object type: %d",
(int) objtype);
@@ -3610,6 +3637,10 @@ getObjectTypeDescription(const ObjectAddress *object)
appendStringInfoString(&buffer, "transform");
break;
+ case OCLASS_STATISTICS:
+ appendStringInfoString(&buffer, "statistics");
+ break;
+
default:
appendStringInfo(&buffer, "unrecognized %u", object->classId);
break;
@@ -4566,6 +4597,29 @@ getObjectIdentityParts(const ObjectAddress *object,
}
break;
+ case OCLASS_STATISTICS:
+ {
+ HeapTuple tup;
+ Form_pg_mv_statistic formStatistic;
+ char *schema;
+
+ tup = SearchSysCache1(MVSTATOID,
+ ObjectIdGetDatum(object->objectId));
+ if (!HeapTupleIsValid(tup))
+ elog(ERROR, "cache lookup failed for statistics %u",
+ object->objectId);
+ formStatistic = (Form_pg_mv_statistic) GETSTRUCT(tup);
+ schema = get_namespace_name_or_temp(formStatistic->stanamespace);
+ appendStringInfoString(&buffer,
+ quote_qualified_identifier(schema,
+ NameStr(formStatistic->staname)));
+ if (objname)
+ *objname = list_make2(schema,
+ pstrdup(NameStr(formStatistic->staname)));
+ ReleaseSysCache(tup);
+ break;
+ }
+
default:
appendStringInfo(&buffer, "unrecognized object %u %u %d",
object->classId,
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 84aa061..31dbb2c 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -158,6 +158,17 @@ CREATE VIEW pg_indexes AS
LEFT JOIN pg_tablespace T ON (T.oid = I.reltablespace)
WHERE C.relkind IN ('r', 'm') AND I.relkind = 'i';
+CREATE VIEW pg_mv_stats AS
+ SELECT
+ N.nspname AS schemaname,
+ C.relname AS tablename,
+ S.staname AS staname,
+ S.stakeys AS attnums,
+ length(S.stadeps) as depsbytes,
+ pg_mv_stats_dependencies_info(S.stadeps) as depsinfo
+ FROM (pg_mv_statistic S JOIN pg_class C ON (C.oid = S.starelid))
+ LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace);
+
CREATE VIEW pg_stats WITH (security_barrier) AS
SELECT
nspname AS schemaname,
diff --git a/src/backend/commands/Makefile b/src/backend/commands/Makefile
index b1ac704..5151001 100644
--- a/src/backend/commands/Makefile
+++ b/src/backend/commands/Makefile
@@ -18,8 +18,8 @@ OBJS = aggregatecmds.o alter.o analyze.o async.o cluster.o comment.o \
event_trigger.o explain.o extension.o foreigncmds.o functioncmds.o \
indexcmds.o lockcmds.o matview.o operatorcmds.o opclasscmds.o \
policy.o portalcmds.o prepare.o proclang.o \
- schemacmds.o seclabel.o sequence.o tablecmds.o tablespace.o trigger.o \
- tsearchcmds.o typecmds.o user.o vacuum.o vacuumlazy.o \
- variable.o view.o
+ schemacmds.o seclabel.o sequence.o statscmds.o \
+ tablecmds.o tablespace.o trigger.o tsearchcmds.o typecmds.o \
+ user.o vacuum.o vacuumlazy.o variable.o view.o
include $(top_srcdir)/src/backend/common.mk
diff --git a/src/backend/commands/alter.c b/src/backend/commands/alter.c
index 5af0f2f..89985499 100644
--- a/src/backend/commands/alter.c
+++ b/src/backend/commands/alter.c
@@ -359,6 +359,7 @@ ExecRenameStmt(RenameStmt *stmt)
case OBJECT_OPCLASS:
case OBJECT_OPFAMILY:
case OBJECT_LANGUAGE:
+ case OBJECT_STATISTICS:
case OBJECT_TSCONFIGURATION:
case OBJECT_TSDICTIONARY:
case OBJECT_TSPARSER:
@@ -437,6 +438,7 @@ ExecAlterObjectSchemaStmt(AlterObjectSchemaStmt *stmt,
case OBJECT_OPERATOR:
case OBJECT_OPCLASS:
case OBJECT_OPFAMILY:
+ case OBJECT_STATISTICS:
case OBJECT_TSCONFIGURATION:
case OBJECT_TSDICTIONARY:
case OBJECT_TSPARSER:
@@ -745,6 +747,7 @@ ExecAlterOwnerStmt(AlterOwnerStmt *stmt)
case OBJECT_OPERATOR:
case OBJECT_OPCLASS:
case OBJECT_OPFAMILY:
+ case OBJECT_STATISTICS:
case OBJECT_TABLESPACE:
case OBJECT_TSDICTIONARY:
case OBJECT_TSCONFIGURATION:
diff --git a/src/backend/commands/analyze.c b/src/backend/commands/analyze.c
index 8a5f07c..9087532 100644
--- a/src/backend/commands/analyze.c
+++ b/src/backend/commands/analyze.c
@@ -17,6 +17,7 @@
#include <math.h>
#include "access/multixact.h"
+#include "access/sysattr.h"
#include "access/transam.h"
#include "access/tupconvert.h"
#include "access/tuptoaster.h"
@@ -27,6 +28,7 @@
#include "catalog/indexing.h"
#include "catalog/pg_collation.h"
#include "catalog/pg_inherits_fn.h"
+#include "catalog/pg_mv_statistic.h"
#include "catalog/pg_namespace.h"
#include "commands/dbcommands.h"
#include "commands/tablecmds.h"
@@ -45,10 +47,13 @@
#include "storage/procarray.h"
#include "utils/acl.h"
#include "utils/attoptcache.h"
+#include "utils/builtins.h"
#include "utils/datum.h"
+#include "utils/fmgroids.h"
#include "utils/guc.h"
#include "utils/lsyscache.h"
#include "utils/memutils.h"
+#include "utils/mvstats.h"
#include "utils/pg_rusage.h"
#include "utils/sampling.h"
#include "utils/sortsupport.h"
@@ -460,6 +465,19 @@ do_analyze_rel(Relation onerel, int options, VacuumParams *params,
* all analyzable columns. We use a lower bound of 100 rows to avoid
* possible overflow in Vitter's algorithm. (Note: that will also be the
* target in the corner case where there are no analyzable columns.)
+ *
+	 * FIXME This sample sizing is mostly OK when computing stats for
+	 * individual columns, but when computing multivariate stats
+	 * (histograms, MCV lists, ...) it's rather insufficient. For
+	 * stats on multiple columns / complex stats we need larger
+	 * samples, because we need to build more detailed stats (more
+	 * MCV items / histogram buckets) to get good accuracy. Maybe
+	 * samples proportional to the table size (say, 0.5% - 1%)
+	 * would be more appropriate than a fixed size. Also, this
+	 * should be bound to the requested statistics size - e.g. the
+	 * number of MCV items or histogram buckets should require
+	 * several sample rows per item/bucket (so the sample should
+	 * be k*size).
*/
targrows = 100;
for (i = 0; i < attr_cnt; i++)
@@ -562,6 +580,9 @@ do_analyze_rel(Relation onerel, int options, VacuumParams *params,
update_attstats(RelationGetRelid(Irel[ind]), false,
thisdata->attr_cnt, thisdata->vacattrstats);
}
+
+ /* Build multivariate stats (if there are any). */
+ build_mv_stats(onerel, numrows, rows, attr_cnt, vacattrstats);
}
/*
diff --git a/src/backend/commands/dropcmds.c b/src/backend/commands/dropcmds.c
index 522027a..cd65b58 100644
--- a/src/backend/commands/dropcmds.c
+++ b/src/backend/commands/dropcmds.c
@@ -292,6 +292,10 @@ does_not_exist_skipping(ObjectType objtype, List *objname, List *objargs)
msg = gettext_noop("schema \"%s\" does not exist, skipping");
name = NameListToString(objname);
break;
+ case OBJECT_STATISTICS:
+ msg = gettext_noop("statistics \"%s\" does not exist, skipping");
+ name = NameListToString(objname);
+ break;
case OBJECT_TSPARSER:
if (!schema_does_not_exist_skipping(objname, &msg, &name))
{
diff --git a/src/backend/commands/event_trigger.c b/src/backend/commands/event_trigger.c
index 9e32f8d..09061bb 100644
--- a/src/backend/commands/event_trigger.c
+++ b/src/backend/commands/event_trigger.c
@@ -110,6 +110,7 @@ static event_trigger_support_data event_trigger_support[] = {
{"SCHEMA", true},
{"SEQUENCE", true},
{"SERVER", true},
+ {"STATISTICS", true},
{"TABLE", true},
{"TABLESPACE", false},
{"TRANSFORM", true},
@@ -1106,6 +1107,7 @@ EventTriggerSupportsObjectType(ObjectType obtype)
case OBJECT_RULE:
case OBJECT_SCHEMA:
case OBJECT_SEQUENCE:
+ case OBJECT_STATISTICS:
case OBJECT_TABCONSTRAINT:
case OBJECT_TABLE:
case OBJECT_TRANSFORM:
@@ -1167,6 +1169,7 @@ EventTriggerSupportsObjectClass(ObjectClass objclass)
case OCLASS_DEFACL:
case OCLASS_EXTENSION:
case OCLASS_POLICY:
+ case OCLASS_STATISTICS:
return true;
}
diff --git a/src/backend/commands/statscmds.c b/src/backend/commands/statscmds.c
new file mode 100644
index 0000000..f43b053
--- /dev/null
+++ b/src/backend/commands/statscmds.c
@@ -0,0 +1,277 @@
+/*-------------------------------------------------------------------------
+ *
+ * statscmds.c
+ * Commands for creating and altering multivariate statistics
+ *
+ * Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ * src/backend/commands/statscmds.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/relscan.h"
+#include "catalog/dependency.h"
+#include "catalog/indexing.h"
+#include "catalog/namespace.h"
+#include "catalog/pg_mv_statistic.h"
+#include "catalog/pg_namespace.h"
+#include "commands/defrem.h"
+#include "miscadmin.h"
+#include "utils/builtins.h"
+#include "utils/inval.h"
+#include "utils/memutils.h"
+#include "utils/mvstats.h"
+#include "utils/rel.h"
+#include "utils/syscache.h"
+
+
+/* used for sorting the attnums in ExecCreateStatistics */
+static int compare_int16(const void *a, const void *b)
+{
+ return memcmp(a, b, sizeof(int16));
+}
+
+/*
+ * Implements the CREATE STATISTICS name ON table (columns) WITH (options)
+ *
+ * TODO Check that the types support sort, although maybe we can live
+ * without it (and only build MCV list / association rules).
+ *
+ * TODO This should probably check for duplicate stats (i.e. same
+ * keys, same options). Although maybe it's useful to have
+ * multiple stats on the same columns with different options
+ * (say, a detailed MCV-only stats for some queries, histogram
+ * for others, etc.)
+ */
+ObjectAddress
+CreateStatistics(CreateStatsStmt *stmt)
+{
+ int i, j;
+ ListCell *l;
+ int16 attnums[INDEX_MAX_KEYS];
+ int numcols = 0;
+ ObjectAddress address = InvalidObjectAddress;
+ char *namestr;
+ NameData staname;
+ Oid statoid;
+ Oid namespaceId;
+
+ HeapTuple htup;
+ Datum values[Natts_pg_mv_statistic];
+ bool nulls[Natts_pg_mv_statistic];
+ int2vector *stakeys;
+ Relation mvstatrel;
+ Relation rel;
+ ObjectAddress parentobject, childobject;
+
+ /* by default build nothing */
+ bool build_dependencies = false;
+
+ Assert(IsA(stmt, CreateStatsStmt));
+
+ /* resolve the pieces of the name (namespace etc.) */
+ namespaceId = QualifiedNameGetCreationNamespace(stmt->defnames, &namestr);
+ namestrcpy(&staname, namestr);
+
+ /*
+ * If if_not_exists was given and the statistics already exists, bail out.
+ */
+ if (stmt->if_not_exists &&
+ SearchSysCacheExists2(MVSTATNAMENSP,
+ PointerGetDatum(&staname),
+ ObjectIdGetDatum(namespaceId)))
+ {
+ ereport(NOTICE,
+ (errcode(ERRCODE_DUPLICATE_OBJECT),
+ errmsg("statistics \"%s\" already exists, skipping",
+ namestr)));
+ return InvalidObjectAddress;
+ }
+
+ rel = heap_openrv(stmt->relation, AccessExclusiveLock);
+
+ /* transform the column names to attnum values */
+
+ foreach(l, stmt->keys)
+ {
+ char *attname = strVal(lfirst(l));
+ HeapTuple atttuple;
+
+ atttuple = SearchSysCacheAttName(RelationGetRelid(rel), attname);
+
+ if (!HeapTupleIsValid(atttuple))
+ ereport(ERROR,
+ (errcode(ERRCODE_UNDEFINED_COLUMN),
+ errmsg("column \"%s\" referenced in statistics does not exist",
+ attname)));
+
+		/* more than MVSTATS_MAX_DIMENSIONS columns not allowed */
+ if (numcols >= MVSTATS_MAX_DIMENSIONS)
+ ereport(ERROR,
+ (errcode(ERRCODE_TOO_MANY_COLUMNS),
+ errmsg("cannot have more than %d keys in a statistics",
+ MVSTATS_MAX_DIMENSIONS)));
+
+ attnums[numcols] = ((Form_pg_attribute) GETSTRUCT(atttuple))->attnum;
+ ReleaseSysCache(atttuple);
+ numcols++;
+ }
+
+ /*
+ * Check the lower bound (at least 2 columns), the upper bound was
+ * already checked in the loop.
+ */
+ if (numcols < 2)
+ ereport(ERROR,
+				(errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
+ errmsg("multivariate stats require 2 or more columns")));
+
+	/* look for duplicate columns */
+ for (i = 0; i < numcols; i++)
+ for (j = 0; j < numcols; j++)
+ if ((i != j) && (attnums[i] == attnums[j]))
+ ereport(ERROR,
+					(errcode(ERRCODE_DUPLICATE_COLUMN),
+ errmsg("duplicate column name in statistics definition")));
+
+ /* parse the statistics options */
+ foreach (l, stmt->options)
+ {
+ DefElem *opt = (DefElem*)lfirst(l);
+
+ if (strcmp(opt->defname, "dependencies") == 0)
+ build_dependencies = defGetBoolean(opt);
+ else
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("unrecognized STATISTICS option \"%s\"",
+ opt->defname)));
+ }
+
+ /* check that at least some statistics were requested */
+ if (! build_dependencies)
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("no statistics type (dependencies) was requested")));
+
+ /* sort the attnums and build int2vector */
+ qsort(attnums, numcols, sizeof(int16), compare_int16);
+ stakeys = buildint2vector(attnums, numcols);
+
+ /*
+ * Okay, let's create the pg_mv_statistic entry.
+ */
+ memset(values, 0, sizeof(values));
+ memset(nulls, false, sizeof(nulls));
+
+ /* no stats collected yet, so just the keys */
+ values[Anum_pg_mv_statistic_starelid-1] = ObjectIdGetDatum(RelationGetRelid(rel));
+ values[Anum_pg_mv_statistic_staname -1] = NameGetDatum(&staname);
+ values[Anum_pg_mv_statistic_stanamespace -1] = ObjectIdGetDatum(namespaceId);
+ values[Anum_pg_mv_statistic_staowner-1] = ObjectIdGetDatum(GetUserId());
+
+ values[Anum_pg_mv_statistic_stakeys -1] = PointerGetDatum(stakeys);
+
+ values[Anum_pg_mv_statistic_deps_enabled -1] = BoolGetDatum(build_dependencies);
+
+ nulls[Anum_pg_mv_statistic_stadeps -1] = true;
+
+ /* insert the tuple into pg_mv_statistic */
+ mvstatrel = heap_open(MvStatisticRelationId, RowExclusiveLock);
+
+ htup = heap_form_tuple(mvstatrel->rd_att, values, nulls);
+
+ simple_heap_insert(mvstatrel, htup);
+
+ CatalogUpdateIndexes(mvstatrel, htup);
+
+ statoid = HeapTupleGetOid(htup);
+
+ heap_freetuple(htup);
+
+
+ /*
+ * Store a dependency too, so that statistics are dropped on DROP TABLE
+ */
+ parentobject.classId = RelationRelationId;
+ parentobject.objectId = ObjectIdGetDatum(RelationGetRelid(rel));
+ parentobject.objectSubId = 0;
+ childobject.classId = MvStatisticRelationId;
+ childobject.objectId = statoid;
+ childobject.objectSubId = 0;
+
+ recordDependencyOn(&childobject, &parentobject, DEPENDENCY_AUTO);
+
+ /*
+ * Also record dependency on the schema (to drop statistics on DROP SCHEMA)
+ */
+ parentobject.classId = NamespaceRelationId;
+ parentobject.objectId = ObjectIdGetDatum(namespaceId);
+ parentobject.objectSubId = 0;
+ childobject.classId = MvStatisticRelationId;
+ childobject.objectId = statoid;
+ childobject.objectSubId = 0;
+
+ recordDependencyOn(&childobject, &parentobject, DEPENDENCY_AUTO);
+
+
+	heap_close(mvstatrel, RowExclusiveLock);
+
+	/*
+	 * Invalidate relcache so that others see the new statistics (do this
+	 * before closing the relation).
+	 */
+	CacheInvalidateRelcache(rel);
+
+	relation_close(rel, NoLock);
+
+ ObjectAddressSet(address, MvStatisticRelationId, statoid);
+
+ return address;
+}
+
+
+/*
+ * Guts of statistics removal: delete the pg_mv_statistic entry with the
+ * given OID. This implements DROP STATISTICS, and is also invoked via
+ * dependencies (e.g. on DROP TABLE).
+ */
+void
+RemoveStatisticsById(Oid statsOid)
+{
+ Relation relation;
+ Oid relid;
+ Relation rel;
+ HeapTuple tup;
+ Form_pg_mv_statistic mvstat;
+
+	/*
+	 * Delete the pg_mv_statistic tuple.
+	 */
+ relation = heap_open(MvStatisticRelationId, RowExclusiveLock);
+
+ tup = SearchSysCache1(MVSTATOID, ObjectIdGetDatum(statsOid));
+ if (!HeapTupleIsValid(tup)) /* should not happen */
+ elog(ERROR, "cache lookup failed for statistics %u", statsOid);
+
+ mvstat = (Form_pg_mv_statistic) GETSTRUCT(tup);
+ relid = mvstat->starelid;
+
+ rel = heap_open(relid, AccessExclusiveLock);
+
+ simple_heap_delete(relation, &tup->t_self);
+
+ CacheInvalidateRelcache(rel);
+
+ ReleaseSysCache(tup);
+
+ heap_close(relation, RowExclusiveLock);
+ heap_close(rel, NoLock);
+}
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index df7c2fa..3b7c87f 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -4124,6 +4124,20 @@ _copyAlterPolicyStmt(const AlterPolicyStmt *from)
return newnode;
}
+static CreateStatsStmt *
+_copyCreateStatsStmt(const CreateStatsStmt *from)
+{
+ CreateStatsStmt *newnode = makeNode(CreateStatsStmt);
+
+ COPY_NODE_FIELD(defnames);
+ COPY_NODE_FIELD(relation);
+ COPY_NODE_FIELD(keys);
+ COPY_NODE_FIELD(options);
+ COPY_SCALAR_FIELD(if_not_exists);
+
+ return newnode;
+}
+
/* ****************************************************************
* pg_list.h copy functions
* ****************************************************************
@@ -4999,6 +5013,9 @@ copyObject(const void *from)
case T_CommonTableExpr:
retval = _copyCommonTableExpr(from);
break;
+ case T_CreateStatsStmt:
+ retval = _copyCreateStatsStmt(from);
+ break;
case T_FuncWithArgs:
retval = _copyFuncWithArgs(from);
break;
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index eb0fc1e..07206d7 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -2153,6 +2153,21 @@ _outIndexOptInfo(StringInfo str, const IndexOptInfo *node)
}
static void
+_outMVStatisticInfo(StringInfo str, const MVStatisticInfo *node)
+{
+ WRITE_NODE_TYPE("MVSTATISTICINFO");
+
+ /* NB: this isn't a complete set of fields */
+ WRITE_OID_FIELD(mvoid);
+
+ /* enabled statistics */
+ WRITE_BOOL_FIELD(deps_enabled);
+
+ /* built/available statistics */
+ WRITE_BOOL_FIELD(deps_built);
+}
+
+static void
_outEquivalenceClass(StringInfo str, const EquivalenceClass *node)
{
/*
@@ -3636,6 +3651,9 @@ _outNode(StringInfo str, const void *obj)
case T_PlannerParamItem:
_outPlannerParamItem(str, obj);
break;
+ case T_MVStatisticInfo:
+ _outMVStatisticInfo(str, obj);
+ break;
case T_ExtensibleNode:
_outExtensibleNode(str, obj);
diff --git a/src/backend/optimizer/util/plancat.c b/src/backend/optimizer/util/plancat.c
index ad715bb..7fb2088 100644
--- a/src/backend/optimizer/util/plancat.c
+++ b/src/backend/optimizer/util/plancat.c
@@ -28,6 +28,7 @@
#include "catalog/dependency.h"
#include "catalog/heap.h"
#include "catalog/pg_am.h"
+#include "catalog/pg_mv_statistic.h"
#include "foreign/fdwapi.h"
#include "miscadmin.h"
#include "nodes/makefuncs.h"
@@ -40,7 +41,9 @@
#include "parser/parsetree.h"
#include "rewrite/rewriteManip.h"
#include "storage/bufmgr.h"
+#include "utils/builtins.h"
#include "utils/lsyscache.h"
+#include "utils/syscache.h"
#include "utils/rel.h"
#include "utils/snapmgr.h"
@@ -94,6 +97,7 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
Relation relation;
bool hasindex;
List *indexinfos = NIL;
+ List *stainfos = NIL;
/*
* We need not lock the relation since it was already locked, either by
@@ -387,6 +391,61 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
rel->indexlist = indexinfos;
+ if (true)
+ {
+ List *mvstatoidlist;
+ ListCell *l;
+
+ mvstatoidlist = RelationGetMVStatList(relation);
+
+ foreach(l, mvstatoidlist)
+ {
+ ArrayType *arr;
+ Datum adatum;
+ bool isnull;
+ Oid mvoid = lfirst_oid(l);
+ Form_pg_mv_statistic mvstat;
+ MVStatisticInfo *info;
+
+ HeapTuple htup = SearchSysCache1(MVSTATOID, ObjectIdGetDatum(mvoid));
+
+ mvstat = (Form_pg_mv_statistic) GETSTRUCT(htup);
+
+ /* unavailable stats are not interesting for the planner */
+ if (mvstat->deps_built)
+ {
+ info = makeNode(MVStatisticInfo);
+
+ info->mvoid = mvoid;
+ info->rel = rel;
+
+ /* enabled statistics */
+ info->deps_enabled = mvstat->deps_enabled;
+
+ /* built/available statistics */
+ info->deps_built = mvstat->deps_built;
+
+ /* stakeys */
+ adatum = SysCacheGetAttr(MVSTATOID, htup,
+ Anum_pg_mv_statistic_stakeys, &isnull);
+ Assert(!isnull);
+
+ arr = DatumGetArrayTypeP(adatum);
+
+ info->stakeys = buildint2vector((int16 *) ARR_DATA_PTR(arr),
+ ARR_DIMS(arr)[0]);
+
+ stainfos = lcons(info, stainfos);
+ }
+
+ ReleaseSysCache(htup);
+ }
+
+ list_free(mvstatoidlist);
+ }
+
+ rel->mvstatlist = stainfos;
+
/* Grab foreign-table info using the relcache, while we have it */
if (relation->rd_rel->relkind == RELKIND_FOREIGN_TABLE)
{
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index b9aeb31..eed9927 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -241,7 +241,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
ConstraintsSetStmt CopyStmt CreateAsStmt CreateCastStmt
CreateDomainStmt CreateExtensionStmt CreateGroupStmt CreateOpClassStmt
CreateOpFamilyStmt AlterOpFamilyStmt CreatePLangStmt
- CreateSchemaStmt CreateSeqStmt CreateStmt CreateTableSpaceStmt
+ CreateSchemaStmt CreateSeqStmt CreateStmt CreateStatsStmt CreateTableSpaceStmt
CreateFdwStmt CreateForeignServerStmt CreateForeignTableStmt
CreateAssertStmt CreateTransformStmt CreateTrigStmt CreateEventTrigStmt
CreateUserStmt CreateUserMappingStmt CreateRoleStmt CreatePolicyStmt
@@ -809,6 +809,7 @@ stmt :
| CreateSchemaStmt
| CreateSeqStmt
| CreateStmt
+ | CreateStatsStmt
| CreateTableSpaceStmt
| CreateTransformStmt
| CreateTrigStmt
@@ -3436,6 +3437,36 @@ OptConsTableSpace: USING INDEX TABLESPACE name { $$ = $4; }
ExistingIndex: USING INDEX index_name { $$ = $3; }
;
+/*****************************************************************************
+ *
+ * QUERY :
+ * CREATE STATISTICS stats_name ON relname (columns) WITH (options)
+ *
+ *****************************************************************************/
+
+
+CreateStatsStmt: CREATE STATISTICS any_name ON qualified_name '(' columnList ')' opt_reloptions
+ {
+ CreateStatsStmt *n = makeNode(CreateStatsStmt);
+ n->defnames = $3;
+ n->relation = $5;
+ n->keys = $7;
+ n->options = $9;
+ n->if_not_exists = false;
+ $$ = (Node *)n;
+ }
+ | CREATE STATISTICS IF_P NOT EXISTS any_name ON qualified_name '(' columnList ')' opt_reloptions
+ {
+ CreateStatsStmt *n = makeNode(CreateStatsStmt);
+ n->defnames = $6;
+ n->relation = $8;
+ n->keys = $10;
+ n->options = $12;
+ n->if_not_exists = true;
+ $$ = (Node *)n;
+ }
+ ;
+
/*****************************************************************************
*
@@ -5621,6 +5652,7 @@ drop_type: TABLE { $$ = OBJECT_TABLE; }
| TEXT_P SEARCH DICTIONARY { $$ = OBJECT_TSDICTIONARY; }
| TEXT_P SEARCH TEMPLATE { $$ = OBJECT_TSTEMPLATE; }
| TEXT_P SEARCH CONFIGURATION { $$ = OBJECT_TSCONFIGURATION; }
+ | STATISTICS { $$ = OBJECT_STATISTICS; }
;
any_name_list:
@@ -7995,6 +8027,15 @@ RenameStmt: ALTER AGGREGATE func_name aggr_args RENAME TO name
n->missing_ok = false;
$$ = (Node *)n;
}
+ | ALTER STATISTICS any_name RENAME TO name
+ {
+ RenameStmt *n = makeNode(RenameStmt);
+ n->renameType = OBJECT_STATISTICS;
+ n->object = $3;
+ n->newname = $6;
+ n->missing_ok = false;
+ $$ = (Node *)n;
+ }
;
opt_column: COLUMN { $$ = COLUMN; }
@@ -8231,6 +8272,15 @@ AlterObjectSchemaStmt:
n->missing_ok = false;
$$ = (Node *)n;
}
+ | ALTER STATISTICS any_name SET SCHEMA name
+ {
+ AlterObjectSchemaStmt *n = makeNode(AlterObjectSchemaStmt);
+ n->objectType = OBJECT_STATISTICS;
+ n->object = $3;
+ n->newschema = $6;
+ n->missing_ok = false;
+ $$ = (Node *)n;
+ }
;
/*****************************************************************************
@@ -8421,6 +8471,14 @@ AlterOwnerStmt: ALTER AGGREGATE func_name aggr_args OWNER TO RoleSpec
n->newowner = $7;
$$ = (Node *)n;
}
+ | ALTER STATISTICS name OWNER TO RoleSpec
+ {
+ AlterOwnerStmt *n = makeNode(AlterOwnerStmt);
+ n->objectType = OBJECT_STATISTICS;
+ n->object = list_make1(makeString($3));
+ n->newowner = $6;
+ $$ = (Node *)n;
+ }
;
diff --git a/src/backend/tcop/utility.c b/src/backend/tcop/utility.c
index 045f7f0..96b58f8 100644
--- a/src/backend/tcop/utility.c
+++ b/src/backend/tcop/utility.c
@@ -1520,6 +1520,10 @@ ProcessUtilitySlow(Node *parsetree,
address = ExecSecLabelStmt((SecLabelStmt *) parsetree);
break;
+ case T_CreateStatsStmt: /* CREATE STATISTICS */
+ address = CreateStatistics((CreateStatsStmt *) parsetree);
+ break;
+
default:
elog(ERROR, "unrecognized node type: %d",
(int) nodeTag(parsetree));
@@ -1878,6 +1882,9 @@ AlterObjectTypeCommandTag(ObjectType objtype)
case OBJECT_MATVIEW:
tag = "ALTER MATERIALIZED VIEW";
break;
+ case OBJECT_STATISTICS:
+ tag = "ALTER STATISTICS";
+ break;
default:
tag = "???";
break;
@@ -2160,6 +2167,9 @@ CreateCommandTag(Node *parsetree)
case OBJECT_TRANSFORM:
tag = "DROP TRANSFORM";
break;
+ case OBJECT_STATISTICS:
+ tag = "DROP STATISTICS";
+ break;
default:
tag = "???";
}
@@ -2527,6 +2537,10 @@ CreateCommandTag(Node *parsetree)
tag = "EXECUTE";
break;
+ case T_CreateStatsStmt:
+ tag = "CREATE STATISTICS";
+ break;
+
case T_DeallocateStmt:
{
DeallocateStmt *stmt = (DeallocateStmt *) parsetree;
diff --git a/src/backend/utils/Makefile b/src/backend/utils/Makefile
index 8374533..eba0352 100644
--- a/src/backend/utils/Makefile
+++ b/src/backend/utils/Makefile
@@ -9,7 +9,7 @@ top_builddir = ../../..
include $(top_builddir)/src/Makefile.global
OBJS = fmgrtab.o
-SUBDIRS = adt cache error fmgr hash init mb misc mmgr resowner sort time
+SUBDIRS = adt cache error fmgr hash init mb misc mmgr mvstats resowner sort time
# location of Catalog.pm
catalogdir = $(top_srcdir)/src/backend/catalog
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index 130c06d..3bc4c8a 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -47,6 +47,7 @@
#include "catalog/pg_auth_members.h"
#include "catalog/pg_constraint.h"
#include "catalog/pg_database.h"
+#include "catalog/pg_mv_statistic.h"
#include "catalog/pg_namespace.h"
#include "catalog/pg_opclass.h"
#include "catalog/pg_proc.h"
@@ -3956,6 +3957,62 @@ RelationGetIndexList(Relation relation)
return result;
}
+
+List *
+RelationGetMVStatList(Relation relation)
+{
+ Relation indrel;
+ SysScanDesc indscan;
+ ScanKeyData skey;
+ HeapTuple htup;
+ List *result;
+ List *oldlist;
+ MemoryContext oldcxt;
+
+ /* Quick exit if we already computed the list. */
+ if (relation->rd_mvstatvalid != 0)
+ return list_copy(relation->rd_mvstatlist);
+
+ /*
+ * We build the list we intend to return (in the caller's context) while
+ * doing the scan. After successfully completing the scan, we copy that
+ * list into the relcache entry. This avoids cache-context memory leakage
+ * if we get some sort of error partway through.
+ */
+ result = NIL;
+
+ /* Prepare to scan pg_mv_statistic for entries having starelid = this rel. */
+ ScanKeyInit(&skey,
+ Anum_pg_mv_statistic_starelid,
+ BTEqualStrategyNumber, F_OIDEQ,
+ ObjectIdGetDatum(RelationGetRelid(relation)));
+
+ indrel = heap_open(MvStatisticRelationId, AccessShareLock);
+ indscan = systable_beginscan(indrel, MvStatisticRelidIndexId, true,
+ NULL, 1, &skey);
+
+ while (HeapTupleIsValid(htup = systable_getnext(indscan)))
+ /* TODO maybe include only already built statistics? */
+ result = insert_ordered_oid(result, HeapTupleGetOid(htup));
+
+ systable_endscan(indscan);
+
+ heap_close(indrel, AccessShareLock);
+
+ /* Now save a copy of the completed list in the relcache entry. */
+ oldcxt = MemoryContextSwitchTo(CacheMemoryContext);
+ oldlist = relation->rd_mvstatlist;
+ relation->rd_mvstatlist = list_copy(result);
+
+ relation->rd_mvstatvalid = true;
+ MemoryContextSwitchTo(oldcxt);
+
+ /* Don't leak the old list, if there is one */
+ list_free(oldlist);
+
+ return result;
+}
+
/*
* insert_ordered_oid
* Insert a new Oid into a sorted list of Oids, preserving ordering
@@ -4920,6 +4977,8 @@ load_relcache_init_file(bool shared)
rel->rd_indexattr = NULL;
rel->rd_keyattr = NULL;
rel->rd_idattr = NULL;
+ rel->rd_mvstatvalid = false;
+ rel->rd_mvstatlist = NIL;
rel->rd_createSubid = InvalidSubTransactionId;
rel->rd_newRelfilenodeSubid = InvalidSubTransactionId;
rel->rd_amcache = NULL;
diff --git a/src/backend/utils/cache/syscache.c b/src/backend/utils/cache/syscache.c
index 65ffe84..3c1bc4b 100644
--- a/src/backend/utils/cache/syscache.c
+++ b/src/backend/utils/cache/syscache.c
@@ -44,6 +44,7 @@
#include "catalog/pg_foreign_server.h"
#include "catalog/pg_foreign_table.h"
#include "catalog/pg_language.h"
+#include "catalog/pg_mv_statistic.h"
#include "catalog/pg_namespace.h"
#include "catalog/pg_opclass.h"
#include "catalog/pg_operator.h"
@@ -502,6 +503,28 @@ static const struct cachedesc cacheinfo[] = {
},
4
},
+ {MvStatisticRelationId, /* MVSTATNAMENSP */
+ MvStatisticNameIndexId,
+ 2,
+ {
+ Anum_pg_mv_statistic_staname,
+ Anum_pg_mv_statistic_stanamespace,
+ 0,
+ 0
+ },
+ 4
+ },
+ {MvStatisticRelationId, /* MVSTATOID */
+ MvStatisticOidIndexId,
+ 1,
+ {
+ ObjectIdAttributeNumber,
+ 0,
+ 0,
+ 0
+ },
+ 4
+ },
{NamespaceRelationId, /* NAMESPACENAME */
NamespaceNameIndexId,
1,
diff --git a/src/backend/utils/mvstats/Makefile b/src/backend/utils/mvstats/Makefile
new file mode 100644
index 0000000..099f1ed
--- /dev/null
+++ b/src/backend/utils/mvstats/Makefile
@@ -0,0 +1,17 @@
+#-------------------------------------------------------------------------
+#
+# Makefile--
+# Makefile for utils/mvstats
+#
+# IDENTIFICATION
+# src/backend/utils/mvstats/Makefile
+#
+#-------------------------------------------------------------------------
+
+subdir = src/backend/utils/mvstats
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+
+OBJS = common.o dependencies.o
+
+include $(top_srcdir)/src/backend/common.mk
diff --git a/src/backend/utils/mvstats/README.dependencies b/src/backend/utils/mvstats/README.dependencies
new file mode 100644
index 0000000..1f96fbc
--- /dev/null
+++ b/src/backend/utils/mvstats/README.dependencies
@@ -0,0 +1,222 @@
+Soft functional dependencies
+============================
+
+A type of multivariate statistics used to capture cases when one column (or
+possibly a combination of columns) determines values in another column. We may
+also say that one column implies the other one.
+
+A simple artificial example may be a table with two columns, created like this
+
+ CREATE TABLE t (a, b)
+ AS SELECT i, i/10 FROM generate_series(1,100000) s(i);
+
+Clearly, once we know the value of column 'a', the value of 'b' is trivially
+determined, as it's simply (a/10). A more practical example may be addresses,
+where (ZIP code -> city name), i.e. once we know the ZIP, we probably know the
+city it belongs to, as ZIP codes are usually assigned to one city. Larger cities
+may have multiple ZIP codes, so the dependency can't be reversed.
+
+Functional dependencies are a concept well described in relational theory,
+particularly in the definition of normalization and "normal forms". Wikipedia has a
+nice definition of a functional dependency [1]:
+
+ In a given table, an attribute Y is said to have a functional dependency on
+ a set of attributes X (written X -> Y) if and only if each X value is
+ associated with precisely one Y value. For example, in an "Employee" table
+ that includes the attributes "Employee ID" and "Employee Date of Birth", the
+ functional dependency {Employee ID} -> {Employee Date of Birth} would hold.
+ It follows from the previous two sentences that each {Employee ID} is
+ associated with precisely one {Employee Date of Birth}.
+
+ [1] http://en.wikipedia.org/wiki/Database_normalization
+
+Many datasets might be normalized not to contain such dependencies, but often
+it's not practical for various reasons. In some cases it's actually a conscious
+design choice to model the dataset in a denormalized way, either because of
+performance or to make querying easier.
+
+The functional dependencies are called 'soft' because the implementation is
+meant to allow a small number of rows contradicting the dependency. Many actual
+data sets contain some sort of errors, either because of data entry mistakes
+(user mistyping the ZIP code) or issues in generating the data (e.g. a ZIP code
+mistakenly assigned to two cities in different states). A strict implementation
+would ignore dependencies on such noisy data, rendering the approach unusable on
+such data sets.
+
+
+Mining dependencies (ANALYZE)
+-----------------------------
+
+The current build algorithm is rather simple - for each pair (a,b) of columns,
+the data are sorted lexicographically (first by 'a', then by 'b'). Then for each
+group (rows with the same 'a' value) we decide whether the group is neutral,
+supporting or contradicting the dependency (a->b).
+
+A group is considered neutral when it's too small - e.g. when there's a single
+row in the group, there can't possibly be multiple values in 'b'. For this
+reason we ignore groups smaller than a threshold (currently 3 rows).
+
+For sufficiently large groups (3 rows or more), we count the number of distinct
+values in 'b'. When there's a single 'b' value, the group is considered to
+support the dependency (a->b), otherwise it's considered to contradict it.
+
+At the end, we compare the number of rows in supporting and contradicting groups,
+and if there are at least 10x as many supporting rows, we consider the
+functional dependency to be valid.
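+
+For example, assume a sample of 30,000 rows where grouping by 'a' works out
+like this (numbers purely illustrative):
+
+ supporting: 290 groups, 28,300 rows (a single 'b' value per group)
+ contradicting: 8 groups, 1,400 rows (multiple 'b' values)
+ neutral: 300 rows (groups smaller than 3 rows)
+
+As 28,300 > 10 * 1,400, the dependency (a->b) is accepted.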
+
+
+This approach has the negative property that it's a bit fragile with respect
+to the sample - there may be data sets producing quite different results for
+each ANALYZE run, as even a single row may change the outcome of the final
+10x test.
+
+It was proposed to make the dependencies "fuzzy" - e.g. track some coefficient
+between [0,1] determining how much the dependency holds. That would however mean
+we have to keep all the dependencies, as eliminating them based on the value of
+the coefficient (e.g. throw away dependencies <= 0.5) would result in exactly
+the same fragility issues. This would also make it more complicated to combine
+dependencies. So this does not seem like a practical approach.
+
+A better approach might be to replace the constants (min_group_size=3 and 10x)
+with values somehow related to the particular data set.
+
+
+Clause reduction (planner/optimizer)
+------------------------------------
+
+Applying the functional dependencies is quite simple - given a list of equality
+clauses, check which clauses are redundant (i.e. implied by some other clause).
+For example given clause list
+
+ (a = 1) AND (b = 2) AND (c = 3)
+
+and the dependency (a->b), the list of clauses may be simplified to
+
+ (a = 1) AND (c = 3)
+
+Functional dependencies may only be applied to equality clauses; all other
+types of clauses are ignored. See clauselist_apply_dependencies() for details.
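+
+To illustrate the whole cycle (only a sketch - the CREATE STATISTICS syntax
+follows the regression tests in this patch, and the table/statistics names
+are made up):
+
+ CREATE TABLE t2 (a, b)
+ AS SELECT i/100, i/100 FROM generate_series(1,100000) s(i);
+
+ CREATE STATISTICS t2_deps ON t2(a,b) WITH (dependencies);
+ ANALYZE t2;
+
+ -- (b = 10) is implied by (a = 10), so after the reduction only the
+ -- (a = 10) clause contributes to the estimate
+ EXPLAIN SELECT * FROM t2 WHERE (a = 10) AND (b = 10);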
+
+
+Compatibility of clauses
+------------------------
+
+The reduction assumes the clauses really are redundant, and the value in the
+reduced clause (b=2) is the value determined by (a=1). If that's not the case
+and the values are "incompatible", the result will be an overestimate.
+
+This may happen for example when using conditions on ZIP and city name with
+mismatching values (ZIP for a different city), etc. In such a case the result
+set will be empty, but we'll estimate the selectivity using the ZIP condition.
+
+In this case the default estimation, based on the attribute value independence
+assumption, happens to work better, but mostly by chance.
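+
+For example (the table and values are purely illustrative):
+
+ SELECT * FROM addresses WHERE (zip = '10001') AND (city = 'Los Angeles');
+
+returns no rows (ZIP 10001 belongs to New York), yet with a (zip -> city)
+dependency the reduction drops the city clause and estimates the query as if
+it were (zip = '10001') alone.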
+
+
+Dependencies vs. MCV/histogram
+------------------------------
+
+In some cases the "compatibility" of the conditions might be verified using the
+other types of multivariate stats - MCV lists and histograms.
+
+For MCV lists the verification might be very simple - peek into the list to see
+whether any items match the clause on the 'a' column (e.g. ZIP code), and if
+such an item is found, check that the 'b' column matches the other clause. If it
+does not, the clauses are contradictory. If no such item is found, we can't
+really conclude anything, except maybe restricting the selectivity using the
+MCV data (e.g. using min/max selectivity, or something).
+
+With histograms, it might work similarly - we can't check the values directly
+(because histograms use buckets, unlike MCV lists, which store the actual
+values). So we can only look at the buckets matching the clauses - if those
+buckets have very low frequency, it probably means the two clauses are
+incompatible.
+
+It's unclear what 'low frequency' is, but if one of the clauses is implied
+(automatically true because of the other clause), then
+
+ selectivity[clause(A)] = selectivity[clause(A) & clause(B)]
+
+So we might compute selectivity of the first clause - for example using regular
+statistics. And then check if the selectivity computed from the histogram is
+about the same (or significantly lower).
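+
+For example (numbers purely illustrative):
+
+ selectivity[zip = '10001'] = 0.0010 (regular per-column stats)
+ selectivity[zip = '10001' & city = 'New York'] = 0.0009 (histogram)
+ selectivity[zip = '10001' & city = 'Los Angeles'] = 0.00001 (histogram)
+
+The first pair is roughly equal, suggesting compatible clauses; the second
+histogram estimate is far lower, suggesting the clauses are incompatible.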
+
+The problem is that histograms work well only when the data ordering matches the
+natural meaning. For values that serve as labels - like city names, ZIP codes,
+or generated IDs - histograms really don't work all that well. For example
+sorting cities by name won't match the sorting of ZIP codes, rendering the
+histogram unusable.
+
+So MCVs are probably going to work much better, because they don't really assume
+any sort of ordering. And they're probably more appropriate for label-like data.
+
+A good question however is why even use functional dependencies in such cases
+and not simply use the MCV/histogram instead. One reason is that the functional
+dependencies allow fallback to regular stats, and often produce more accurate
+estimates - especially compared to histograms, which are quite bad at
+estimating equality clauses.
+
+
+Limitations
+-----------
+
+Let's look at the main limitations of functional dependencies, especially those
+related to the current implementation.
+
+The current implementation supports only dependencies between two columns, but
+that is merely a simplification of the initial implementation. It's certainly
+useful to mine for dependencies involving multiple columns on the 'left' side,
+i.e. as the condition of the dependency - that is, dependencies like (a,b -> c).
+
+The implementation may/should be smart enough not to mine redundant dependencies,
+e.g. (a->b) and (a,c -> b), because the latter is a trivial consequence of the
+former one (if values of 'a' determine 'b', adding another column won't change
+that relationship). The ANALYZE should first analyze 1:1 dependencies, then 2:1
+dependencies (and skip the already identified ones), etc.
+
+For example the dependency
+
+ (city name -> zip code)
+
+is much stronger, i.e. whenever it holds, then
+
+ (city name, state name -> zip code)
+
+holds too. But in case there are cities with the same name in different states,
+then only the latter dependency will be valid.
+
+Of course, there probably are cities with the same name within a single state,
+but hopefully this is a relatively rare occurrence (and thus we'll still detect
+the 'soft' dependency).
+
+Handling multiple columns on the right side of the dependency is not necessary,
+as those dependencies may be simply decomposed into a set of dependencies with
+the same meaning, one for each column on the right side. For example
+
+ (a -> b,c)
+
+is exactly the same as
+
+ (a -> b) & (a -> c)
+
+Of course, storing the first form may be more efficient than storing multiple
+'simple' dependencies separately.
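+
+With the current serialization (a pair of int16 attnums per dependency), the
+two 'simple' dependencies take 2 * 4 = 8 bytes, whereas a combined (a -> b,c)
+entry could fit in 6 bytes - a purely illustrative comparison, as the combined
+format is not implemented.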
+
+
+TODO Support dependencies with multiple columns on left/right.
+
+TODO Investigate using histogram and MCV list to verify the dependencies.
+
+TODO Investigate statistical testing of the distribution (to decide whether it
+ makes sense to build the histogram/MCV list).
+
+TODO Using a min/max of selectivities would probably make more sense for the
+ associated columns.
+
+TODO Consider eliminating the implied columns from the histogram and MCV lists
+ (but maybe that's not a good idea, because that'd make it impossible to use
+ these stats for non-equality clauses and also it wouldn't be possible to
+ use the stats for verification of the dependencies).
+
+TODO The reduction probably might be extended to also handle IS NULL clauses,
+ assuming we fix the ANALYZE to properly handle NULL values. We however
+ won't be able to reduce IS NOT NULL (unless I'm missing something).
diff --git a/src/backend/utils/mvstats/common.c b/src/backend/utils/mvstats/common.c
new file mode 100644
index 0000000..a755c49
--- /dev/null
+++ b/src/backend/utils/mvstats/common.c
@@ -0,0 +1,356 @@
+/*-------------------------------------------------------------------------
+ *
+ * common.c
+ * POSTGRES multivariate statistics
+ *
+ *
+ * Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ * src/backend/utils/mvstats/common.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "common.h"
+
+static VacAttrStats ** lookup_var_attr_stats(int2vector *attrs,
+ int natts, VacAttrStats **vacattrstats);
+
+static List* list_mv_stats(Oid relid);
+
+
+/*
+ * Compute requested multivariate stats, using the rows sampled for the
+ * plain (single-column) stats.
+ *
+ * This fetches a list of stats from pg_mv_statistic, computes the stats
+ * and serializes them back into the catalog (as bytea values).
+ */
+void
+build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
+ int natts, VacAttrStats **vacattrstats)
+{
+ ListCell *lc;
+ List *mvstats;
+
+ TupleDesc tupdesc = RelationGetDescr(onerel);
+
+ /*
+ * Fetch defined MV groups from pg_mv_statistic, and then compute
+ * the MV statistics (histograms for now).
+ */
+ mvstats = list_mv_stats(RelationGetRelid(onerel));
+
+ foreach (lc, mvstats)
+ {
+ int j;
+ MVStatisticInfo *stat = (MVStatisticInfo *)lfirst(lc);
+ MVDependencies deps = NULL;
+
+ VacAttrStats **stats = NULL;
+ int numatts = 0;
+
+ /* int2 vector of attnums the stats should be computed on */
+ int2vector * attrs = stat->stakeys;
+
+ /* see how many of the columns are not dropped */
+ for (j = 0; j < attrs->dim1; j++)
+ if (! tupdesc->attrs[attrs->values[j]-1]->attisdropped)
+ numatts += 1;
+
+ /* if there are dropped attributes, build a filtered int2vector */
+ if (numatts != attrs->dim1)
+ {
+ int16 *tmp = palloc0(numatts * sizeof(int16));
+ int attnum = 0;
+
+ for (j = 0; j < attrs->dim1; j++)
+ if (! tupdesc->attrs[attrs->values[j]-1]->attisdropped)
+ tmp[attnum++] = attrs->values[j];
+
+ pfree(attrs);
+ attrs = buildint2vector(tmp, numatts);
+ }
+
+ /* filter only the interesting vacattrstats records */
+ stats = lookup_var_attr_stats(attrs, natts, vacattrstats);
+
+ /* check allowed number of dimensions */
+ Assert((attrs->dim1 >= 2) && (attrs->dim1 <= MVSTATS_MAX_DIMENSIONS));
+
+ /*
+ * Analyze functional dependencies of columns.
+ */
+ deps = build_mv_dependencies(numrows, rows, attrs, stats);
+
+ /* store the histogram / MCV list in the catalog */
+ update_mv_stats(stat->mvoid, deps, attrs);
+ }
+}
+
+/*
+ * Lookup the VacAttrStats info for the selected columns, with indexes
+ * matching the attrs vector (to make it easy to work with when
+ * computing multivariate stats).
+ */
+static VacAttrStats **
+lookup_var_attr_stats(int2vector *attrs, int natts, VacAttrStats **vacattrstats)
+{
+ int i, j;
+ int numattrs = attrs->dim1;
+ VacAttrStats **stats = (VacAttrStats**)palloc0(numattrs * sizeof(VacAttrStats*));
+
+ /* lookup VacAttrStats info for the requested columns (same attnum) */
+ for (i = 0; i < numattrs; i++)
+ {
+ stats[i] = NULL;
+ for (j = 0; j < natts; j++)
+ {
+ if (attrs->values[i] == vacattrstats[j]->tupattnum)
+ {
+ stats[i] = vacattrstats[j];
+ break;
+ }
+ }
+
+ /*
+ * Check that we found the info, that the attnum matches and
+ * that there's the requested 'lt' operator and that the type
+ * is 'passed-by-value'.
+ */
+ Assert(stats[i] != NULL);
+ Assert(stats[i]->tupattnum == attrs->values[i]);
+
+ /* FIXME This is a rather ugly way to check for 'ltopr' (which
+ * is defined for 'scalar' attributes).
+ */
+ Assert(((StdAnalyzeData *)stats[i]->extra_data)->ltopr != InvalidOid);
+ }
+
+ return stats;
+}
+
+/*
+ * Fetch list of MV stats defined on a table, without the actual data
+ * for histograms, MCV lists etc.
+ */
+static List*
+list_mv_stats(Oid relid)
+{
+ Relation indrel;
+ SysScanDesc indscan;
+ ScanKeyData skey;
+ HeapTuple htup;
+ List *result = NIL;
+
+ /* Prepare to scan pg_mv_statistic for entries having starelid = this rel. */
+ ScanKeyInit(&skey,
+ Anum_pg_mv_statistic_starelid,
+ BTEqualStrategyNumber, F_OIDEQ,
+ ObjectIdGetDatum(relid));
+
+ indrel = heap_open(MvStatisticRelationId, AccessShareLock);
+ indscan = systable_beginscan(indrel, MvStatisticRelidIndexId, true,
+ NULL, 1, &skey);
+
+ while (HeapTupleIsValid(htup = systable_getnext(indscan)))
+ {
+ MVStatisticInfo *info = makeNode(MVStatisticInfo);
+ Form_pg_mv_statistic stats = (Form_pg_mv_statistic) GETSTRUCT(htup);
+
+ info->mvoid = HeapTupleGetOid(htup);
+ info->stakeys = buildint2vector(stats->stakeys.values, stats->stakeys.dim1);
+ info->deps_built = stats->deps_built;
+
+ result = lappend(result, info);
+ }
+
+ systable_endscan(indscan);
+
+ heap_close(indrel, AccessShareLock);
+
+ /* TODO maybe save the list into relcache, as in RelationGetIndexList
+ * (which served as an inspiration for this one)? */
+
+ return result;
+}
+
+void
+update_mv_stats(Oid mvoid, MVDependencies dependencies, int2vector *attrs)
+{
+ HeapTuple stup,
+ oldtup;
+ Datum values[Natts_pg_mv_statistic];
+ bool nulls[Natts_pg_mv_statistic];
+ bool replaces[Natts_pg_mv_statistic];
+
+ Relation sd = heap_open(MvStatisticRelationId, RowExclusiveLock);
+
+ memset(nulls, 1, Natts_pg_mv_statistic * sizeof(bool));
+ memset(replaces, 0, Natts_pg_mv_statistic * sizeof(bool));
+ memset(values, 0, Natts_pg_mv_statistic * sizeof(Datum));
+
+ /*
+ * Construct a new pg_mv_statistic tuple - replace only the dependencies,
+ * depending on whether they actually were computed.
+ */
+ if (dependencies != NULL)
+ {
+ nulls[Anum_pg_mv_statistic_stadeps -1] = false;
+ values[Anum_pg_mv_statistic_stadeps - 1]
+ = PointerGetDatum(serialize_mv_dependencies(dependencies));
+ }
+
+ /* always replace the value (either by bytea or NULL) */
+ replaces[Anum_pg_mv_statistic_stadeps -1] = true;
+
+ /* always change the availability flags */
+ nulls[Anum_pg_mv_statistic_deps_built -1] = false;
+ nulls[Anum_pg_mv_statistic_stakeys-1] = false;
+
+ /* use the new attnums, in case we removed some dropped ones */
+ replaces[Anum_pg_mv_statistic_deps_built-1] = true;
+ replaces[Anum_pg_mv_statistic_stakeys -1] = true;
+
+ values[Anum_pg_mv_statistic_deps_built-1] = BoolGetDatum(dependencies != NULL);
+ values[Anum_pg_mv_statistic_stakeys -1] = PointerGetDatum(attrs);
+
+ /* Is there already a pg_mv_statistic tuple for these statistics? */
+ oldtup = SearchSysCache1(MVSTATOID,
+ ObjectIdGetDatum(mvoid));
+
+ if (HeapTupleIsValid(oldtup))
+ {
+ /* Yes, replace it */
+ stup = heap_modify_tuple(oldtup,
+ RelationGetDescr(sd),
+ values,
+ nulls,
+ replaces);
+ ReleaseSysCache(oldtup);
+ simple_heap_update(sd, &stup->t_self, stup);
+ }
+ else
+ elog(ERROR, "invalid pg_mv_statistic record (oid=%d)", mvoid);
+
+ /* update indexes too */
+ CatalogUpdateIndexes(sd, stup);
+
+ heap_freetuple(stup);
+
+ heap_close(sd, RowExclusiveLock);
+}
+
+/* multi-variate stats comparator */
+
+/*
+ * qsort_arg comparator for sorting Datums (MV stats)
+ *
+ * This does not maintain the tupnoLink array.
+ */
+int
+compare_scalars_simple(const void *a, const void *b, void *arg)
+{
+ Datum da = *(Datum*)a;
+ Datum db = *(Datum*)b;
+ SortSupport ssup= (SortSupport) arg;
+
+ return ApplySortComparator(da, false, db, false, ssup);
+}
+
+/*
+ * qsort_arg comparator for sorting data when partitioning a MV bucket
+ */
+int
+compare_scalars_partition(const void *a, const void *b, void *arg)
+{
+ Datum da = ((ScalarItem*)a)->value;
+ Datum db = ((ScalarItem*)b)->value;
+ SortSupport ssup= (SortSupport) arg;
+
+ return ApplySortComparator(da, false, db, false, ssup);
+}
+
+/* initialize multi-dimensional sort */
+MultiSortSupport
+multi_sort_init(int ndims)
+{
+ MultiSortSupport mss;
+
+ Assert(ndims >= 2);
+
+ mss = (MultiSortSupport)palloc0(offsetof(MultiSortSupportData, ssup)
+ + sizeof(SortSupportData)*ndims);
+
+ mss->ndims = ndims;
+
+ return mss;
+}
+
+/*
+ * add sort info for dimension 'dim' (index into vacattrstats) to mss,
+ * at position 'sortdim'
+ */
+void
+multi_sort_add_dimension(MultiSortSupport mss, int sortdim,
+ int dim, VacAttrStats **vacattrstats)
+{
+ /* first, lookup StdAnalyzeData for the dimension (attribute) */
+ SortSupportData ssup;
+ StdAnalyzeData *tmp = (StdAnalyzeData *)vacattrstats[dim]->extra_data;
+
+ Assert(mss != NULL);
+ Assert(sortdim < mss->ndims);
+
+ /* initialize sort support, etc. */
+ memset(&ssup, 0, sizeof(ssup));
+ ssup.ssup_cxt = CurrentMemoryContext;
+
+ /* We always use the default collation for statistics */
+ ssup.ssup_collation = DEFAULT_COLLATION_OID;
+ ssup.ssup_nulls_first = false;
+
+ PrepareSortSupportFromOrderingOp(tmp->ltopr, &ssup);
+
+ mss->ssup[sortdim] = ssup;
+}
+
+/* compare all the dimensions in the selected order */
+int
+multi_sort_compare(const void *a, const void *b, void *arg)
+{
+ int i;
+ SortItem *ia = (SortItem*)a;
+ SortItem *ib = (SortItem*)b;
+
+ MultiSortSupport mss = (MultiSortSupport)arg;
+
+ for (i = 0; i < mss->ndims; i++)
+ {
+ int compare;
+
+ compare = ApplySortComparator(ia->values[i], ia->isnull[i],
+ ib->values[i], ib->isnull[i],
+ &mss->ssup[i]);
+
+ if (compare != 0)
+ return compare;
+
+ }
+
+ /* equal by default */
+ return 0;
+}
+
+/* compare selected dimension */
+int
+multi_sort_compare_dim(int dim, const SortItem *a, const SortItem *b,
+ MultiSortSupport mss)
+{
+ return ApplySortComparator(a->values[dim], a->isnull[dim],
+ b->values[dim], b->isnull[dim],
+ &mss->ssup[dim]);
+}
diff --git a/src/backend/utils/mvstats/common.h b/src/backend/utils/mvstats/common.h
new file mode 100644
index 0000000..d96422d
--- /dev/null
+++ b/src/backend/utils/mvstats/common.h
@@ -0,0 +1,75 @@
+/*-------------------------------------------------------------------------
+ *
+ * common.h
+ * POSTGRES multivariate statistics
+ *
+ *
+ * Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ * src/backend/utils/mvstats/common.h
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres.h"
+
+#include "access/sysattr.h"
+#include "access/tuptoaster.h"
+#include "catalog/indexing.h"
+#include "catalog/pg_collation.h"
+#include "catalog/pg_mv_statistic.h"
+#include "foreign/fdwapi.h"
+#include "postmaster/autovacuum.h"
+#include "storage/lmgr.h"
+#include "utils/builtins.h"
+#include "utils/datum.h"
+#include "utils/fmgroids.h"
+#include "utils/mvstats.h"
+#include "utils/sortsupport.h"
+#include "utils/syscache.h"
+
+
+/* FIXME private structure copied from analyze.c */
+
+typedef struct
+{
+ Oid eqopr; /* '=' operator for datatype, if any */
+ Oid eqfunc; /* and associated function */
+ Oid ltopr; /* '<' operator for datatype, if any */
+} StdAnalyzeData;
+
+typedef struct
+{
+ Datum value; /* a data value */
+ int tupno; /* position index for tuple it came from */
+} ScalarItem;
+
+/* multi-sort */
+typedef struct MultiSortSupportData {
+ int ndims; /* number of dimensions supported by the sort */
+ SortSupportData ssup[1]; /* sort support data for each dimension */
+} MultiSortSupportData;
+
+typedef MultiSortSupportData* MultiSortSupport;
+
+typedef struct SortItem {
+ Datum *values;
+ bool *isnull;
+} SortItem;
+
+MultiSortSupport multi_sort_init(int ndims);
+
+void multi_sort_add_dimension(MultiSortSupport mss, int sortdim,
+ int dim, VacAttrStats **vacattrstats);
+
+int multi_sort_compare(const void *a, const void *b, void *arg);
+
+int multi_sort_compare_dim(int dim, const SortItem *a,
+ const SortItem *b, MultiSortSupport mss);
+
+/* comparators, used when constructing multivariate stats */
+int compare_scalars_simple(const void *a, const void *b, void *arg);
+int compare_scalars_partition(const void *a, const void *b, void *arg);
diff --git a/src/backend/utils/mvstats/dependencies.c b/src/backend/utils/mvstats/dependencies.c
new file mode 100644
index 0000000..2a064a0
--- /dev/null
+++ b/src/backend/utils/mvstats/dependencies.c
@@ -0,0 +1,437 @@
+/*-------------------------------------------------------------------------
+ *
+ * dependencies.c
+ * POSTGRES multivariate functional dependencies
+ *
+ *
+ * Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ * src/backend/utils/mvstats/dependencies.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "common.h"
+#include "utils/lsyscache.h"
+
+/*
+ * Detect functional dependencies between columns.
+ *
+ * TODO This builds a complete set of dependencies, i.e. including transitive
+ * dependencies - if we identify [A => B] and [B => C], we're likely to
+ * identify [A => C] too. It might be better to keep only the minimal set
+ * of dependencies, i.e. prune all the dependencies that we can recreate
+ * by transitivity.
+ *
+ * There are two conceptual ways to do that:
+ *
+ * (a) generate all the rules, and then prune the rules that may be
+ * recreated by combining other dependencies, or
+ *
+ * (b) perform the 'is a combination of other dependencies' check before
+ * actually doing the work
+ *
+ * The second option has the advantage that we don't really need to perform
+ * the sort/count. It's not sufficient alone, though, because we may
+ * discover the dependencies in the wrong order. For example we may find
+ *
+ * (a -> b), (a -> c) and then (b -> c)
+ *
+ * None of those dependencies is a combination of the already known ones,
+ * yet (a -> c) is a combination of (a -> b) and (b -> c).
+ *
+ *
+ * FIXME Currently we simply replace NULL values with 0 and then handle them
+ * as regular values, but that lumps NULLs together with actual 0 values.
+ * That's clearly incorrect - we need to treat NULL as a separate value.
+ */
+MVDependencies
+build_mv_dependencies(int numrows, HeapTuple *rows, int2vector *attrs,
+ VacAttrStats **stats)
+{
+ int i;
+ int numattrs = attrs->dim1;
+
+ /* result */
+ int ndeps = 0;
+ MVDependencies dependencies = NULL;
+ MultiSortSupport mss = multi_sort_init(2); /* 2 dimensions for now */
+
+ /* TODO Maybe this should be somehow related to the number of
+ * distinct values in the two columns we're currently analyzing.
+ * Assuming the distribution is uniform, we can estimate the
+ * average group size and use it as a threshold. Or something
+ * like that. Seems better than a static approach.
+ */
+ int min_group_size = 3;
+
+ /* dimension indexes we'll check for associations [a => b] */
+ int dima, dimb;
+
+ /*
+ * We'll reuse the same array for all the 2-column combinations.
+ *
+ * It's possible to sort the sample rows directly, but this seemed
+ * somewhat simpler / less error-prone. Another option would be to
+ * allocate the arrays for each SortItem separately, but that'd be
+ * significant overhead (not just CPU, but especially memory bloat).
+ */
+ SortItem * items = (SortItem*)palloc0(numrows * sizeof(SortItem));
+
+ Datum *values = (Datum*)palloc0(sizeof(Datum) * numrows * 2);
+ bool *isnull = (bool*)palloc0(sizeof(bool) * numrows * 2);
+
+ for (i = 0; i < numrows; i++)
+ {
+ items[i].values = &values[i * 2];
+ items[i].isnull = &isnull[i * 2];
+ }
+
+ Assert(numattrs >= 2);
+
+ /*
+ * Evaluate all possible combinations of [A => B], using a simple algorithm:
+ *
+ * (a) sort the data by [A,B]
+ * (b) split the data into groups by A (new group whenever a value changes)
+ * (c) count different values in the B column (again, value changes)
+ *
+ * TODO It should be rather simple to merge [A => B] and [A => C] into
+ * [A => B,C]. Just keep A constant, collect all the "implied" columns
+ * and you're done.
+ */
+ for (dima = 0; dima < numattrs; dima++)
+ {
+ /* prepare the sort function for the first dimension */
+ multi_sort_add_dimension(mss, 0, dima, stats);
+
+ for (dimb = 0; dimb < numattrs; dimb++)
+ {
+ SortItem current;
+
+ /* number of groups supporting / contradicting the dependency */
+ int n_supporting = 0;
+ int n_contradicting = 0;
+
+ /* counters valid within a group */
+ int group_size = 0;
+ int n_violations = 0;
+
+ int n_supporting_rows = 0;
+ int n_contradicting_rows = 0;
+
+ /* make sure the columns are different (A => A) */
+ if (dima == dimb)
+ continue;
+
+ /* prepare the sort function for the second dimension */
+ multi_sort_add_dimension(mss, 1, dimb, stats);
+
+ /* reset the values and isnull flags */
+ memset(values, 0, sizeof(Datum) * numrows * 2);
+ memset(isnull, 0, sizeof(bool) * numrows * 2);
+
+ /* accumulate all the data for both columns into an array and sort it */
+ for (i = 0; i < numrows; i++)
+ {
+ items[i].values[0]
+ = heap_getattr(rows[i], attrs->values[dima],
+ stats[dima]->tupDesc, &items[i].isnull[0]);
+
+ items[i].values[1]
+ = heap_getattr(rows[i], attrs->values[dimb],
+ stats[dimb]->tupDesc, &items[i].isnull[1]);
+ }
+
+ qsort_arg((void *) items, numrows, sizeof(SortItem),
+ multi_sort_compare, mss);
+
+ /*
+ * Walk through the array, split it into groups according to
+ * the A value, and count distinct values in the other one.
+ * If there's a single B value for the whole group, we count
+ * it as supporting the association, otherwise we count it
+ * as contradicting.
+ *
+ * Furthermore we require a group to have at least a certain
+ * number of rows to be considered useful for supporting the
+ * dependency. But a contradicting group counts regardless of its size.
+ */
+
+ /* start with values from the first row */
+ current = items[0];
+ group_size = 1;
+
+ for (i = 1; i < numrows; i++)
+ {
+ /* end of the group */
+ if (multi_sort_compare_dim(0, &items[i], &current, mss) != 0)
+ {
+ /*
+ * If there are no contradicting rows, count it as
+ * supporting (otherwise contradicting), but only if
+ * the group is large enough.
+ *
+ * The requirement of a minimum group size makes it
+ * impossible to identify [unique,unique] cases, but
+ * that's probably a different case. This is more
+ * about [zip => city] associations etc.
+ *
+ * If there are violations, count the group/rows as
+ * a violation.
+ *
+ * It may be neither, if the group is too small (does
+ * not contain at least min_group_size rows).
+ */
+ if ((n_violations == 0) && (group_size >= min_group_size))
+ {
+ n_supporting += 1;
+ n_supporting_rows += group_size;
+ }
+ else if (n_violations > 0)
+ {
+ n_contradicting += 1;
+ n_contradicting_rows += group_size;
+ }
+
+ /* current values start a new group */
+ n_violations = 0;
+ group_size = 0;
+ }
+ /* mismatch of a B value is contradicting */
+ else if (multi_sort_compare_dim(1, &items[i], &current, mss) != 0)
+ {
+ n_violations += 1;
+ }
+
+ current = items[i];
+ group_size += 1;
+ }
+
+ /* handle the last group (just like above) */
+ if ((n_violations == 0) && (group_size >= min_group_size))
+ {
+ n_supporting += 1;
+ n_supporting_rows += group_size;
+ }
+ else if (n_violations)
+ {
+ n_contradicting += 1;
+ n_contradicting_rows += group_size;
+ }
+
+ /*
+ * See if the number of rows supporting the association is at least
+ * 10x the number of rows violating the hypothetical dependency.
+ *
+ * TODO This is a rather arbitrary limit - I guess it's possible to do
+ * some math to come up with a better rule (e.g. testing a hypothesis
+ * 'this is due to randomness'). We can create a contingency table
+ * from the values and use it for testing. Possibly only when
+ * there are no contradicting rows?
+ *
+ * TODO Also, if (a => b) and (b => a) at the same time, it pretty much
+ * means there's a 1:1 relation (or one is a 'label'), making the
+ * conditions rather redundant. Although it's possible that the
+ * query uses incompatible combination of values.
+ */
+ if (n_supporting_rows > (n_contradicting_rows * 10))
+ {
+ if (dependencies == NULL)
+ {
+ dependencies = (MVDependencies)palloc0(sizeof(MVDependenciesData));
+ dependencies->magic = MVSTAT_DEPS_MAGIC;
+ }
+ else
+ dependencies = repalloc(dependencies, offsetof(MVDependenciesData, deps)
+ + sizeof(MVDependency) * (dependencies->ndeps + 1));
+
+ /* add the new dependency to the list */
+ dependencies->deps[ndeps] = (MVDependency)palloc0(sizeof(MVDependencyData));
+ dependencies->deps[ndeps]->a = attrs->values[dima];
+ dependencies->deps[ndeps]->b = attrs->values[dimb];
+
+ dependencies->ndeps = (++ndeps);
+ }
+ }
+ }
+
+ pfree(items);
+ pfree(values);
+ pfree(isnull);
+ pfree(stats);
+ pfree(mss);
+
+ return dependencies;
+}
+
+/*
+ * Store the dependencies into a bytea, so that it can be stored in the
+ * pg_mv_statistic catalog.
+ *
+ * Currently this only supports simple two-column rules, and stores them
+ * as a sequence of attnum pairs. In the future, this needs to be made
+ * more complex to support multiple columns on both sides of the
+ * implication (using AND on left, OR on right).
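+ *
+ * For reference, the layout produced by the code below (after the varlena
+ * header) is simply:
+ *
+ * uint32 magic (MVSTAT_DEPS_MAGIC)
+ * int32 ndeps (number of dependencies)
+ * int16 a, b (one attnum pair per dependency, ndeps times)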
+ */
+bytea *
+serialize_mv_dependencies(MVDependencies dependencies)
+{
+ int i;
+
+ /* we need to store ndeps, and each needs 2 * int16 */
+ Size len = VARHDRSZ + offsetof(MVDependenciesData, deps)
+ + dependencies->ndeps * (sizeof(int16) * 2);
+
+ bytea * output = (bytea*)palloc0(len);
+
+ char * tmp = VARDATA(output);
+
+ SET_VARSIZE(output, len);
+
+ /* first, store the number of dimensions / items */
+ memcpy(tmp, dependencies, offsetof(MVDependenciesData, deps));
+ tmp += offsetof(MVDependenciesData, deps);
+
+ /* walk through the dependencies and copy both columns into the bytea */
+ for (i = 0; i < dependencies->ndeps; i++)
+ {
+ memcpy(tmp, &(dependencies->deps[i]->a), sizeof(int16));
+ tmp += sizeof(int16);
+
+ memcpy(tmp, &(dependencies->deps[i]->b), sizeof(int16));
+ tmp += sizeof(int16);
+ }
+
+ return output;
+}
+
+/*
+ * Reads serialized dependencies into MVDependencies structure.
+ */
+MVDependencies
+deserialize_mv_dependencies(bytea * data)
+{
+ int i;
+ Size expected_size;
+ MVDependencies dependencies;
+ char *tmp;
+
+ if (data == NULL)
+ return NULL;
+
+ if (VARSIZE_ANY_EXHDR(data) < offsetof(MVDependenciesData,deps))
+ elog(ERROR, "invalid MVDependencies size %ld (expected at least %ld)",
+ VARSIZE_ANY_EXHDR(data), offsetof(MVDependenciesData,deps));
+
+ /* read the MVDependencies header */
+ dependencies = (MVDependencies)palloc0(sizeof(MVDependenciesData));
+
+ /* initialize pointer to the data part (skip the varlena header) */
+ tmp = VARDATA(data);
+
+ /* get the header and perform basic sanity checks */
+ memcpy(dependencies, tmp, offsetof(MVDependenciesData, deps));
+ tmp += offsetof(MVDependenciesData, deps);
+
+ if (dependencies->magic != MVSTAT_DEPS_MAGIC)
+ {
+ pfree(dependencies);
+ elog(WARNING, "not a MV Dependencies (magic number mismatch)");
+ return NULL;
+ }
+
+ Assert(dependencies->ndeps > 0);
+
+ /* what bytea size do we expect for those parameters */
+ expected_size = offsetof(MVDependenciesData,deps) +
+ dependencies->ndeps * sizeof(int16) * 2;
+
+ if (VARSIZE_ANY_EXHDR(data) != expected_size)
+ elog(ERROR, "invalid dependencies size %ld (expected %ld)",
+ VARSIZE_ANY_EXHDR(data), expected_size);
+
+ /* allocate space for the dependency pointers */
+ dependencies = repalloc(dependencies, offsetof(MVDependenciesData,deps)
+ + (dependencies->ndeps * sizeof(MVDependency)));
+
+ for (i = 0; i < dependencies->ndeps; i++)
+ {
+ dependencies->deps[i] = (MVDependency)palloc0(sizeof(MVDependencyData));
+
+ memcpy(&(dependencies->deps[i]->a), tmp, sizeof(int16));
+ tmp += sizeof(int16);
+
+ memcpy(&(dependencies->deps[i]->b), tmp, sizeof(int16));
+ tmp += sizeof(int16);
+ }
+
+ return dependencies;
+}
+
+/* print some basic info about dependencies (number of dependencies) */
+Datum
+pg_mv_stats_dependencies_info(PG_FUNCTION_ARGS)
+{
+ bytea *data = PG_GETARG_BYTEA_P(0);
+ char *result;
+
+ MVDependencies dependencies = deserialize_mv_dependencies(data);
+
+ if (dependencies == NULL)
+ PG_RETURN_NULL();
+
+ result = palloc0(128);
+ snprintf(result, 128, "dependencies=%d", dependencies->ndeps);
+
+ /* FIXME free the deserialized data (pfree is not enough) */
+
+ PG_RETURN_TEXT_P(cstring_to_text(result));
+}
+
+/* print the dependencies
+ *
+ * TODO Would be nice if this knew the actual column names (instead of
+ * the attnums).
+ *
+ * FIXME This is really ugly and does not really check the lengths and
+ * strcpy/snprintf return values properly. Needs to be fixed.
+ */
+Datum
+pg_mv_stats_dependencies_show(PG_FUNCTION_ARGS)
+{
+ int i = 0;
+ bytea *data = PG_GETARG_BYTEA_P(0);
+ char *result = NULL;
+ int len = 0;
+
+ MVDependencies dependencies = deserialize_mv_dependencies(data);
+
+ if (dependencies == NULL)
+ PG_RETURN_NULL();
+
+ for (i = 0; i < dependencies->ndeps; i++)
+ {
+ MVDependency dependency = dependencies->deps[i];
+ char buffer[128];
+
+ int tmp = snprintf(buffer, 128, "%s%d => %d",
+ ((i == 0) ? "" : ", "), dependency->a, dependency->b);
+
+ if (tmp < 127)
+ {
+ if (result == NULL)
+ result = palloc0(len + tmp + 1);
+ else
+ result = repalloc(result, len + tmp + 1);
+
+ strcpy(result + len, buffer);
+ len += tmp;
+ }
+ }
+
+ PG_RETURN_TEXT_P(cstring_to_text(result));
+}
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index fd8dc91..8ce9c0e 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -2104,6 +2104,50 @@ describeOneTableDetails(const char *schemaname,
PQclear(result);
}
+ /* print any multivariate statistics */
+ if (pset.sversion >= 90600)
+ {
+ printfPQExpBuffer(&buf,
+ "SELECT oid, stanamespace::regnamespace AS nsp, staname, stakeys,\n"
+ " deps_enabled,\n"
+ " deps_built,\n"
+ " (SELECT string_agg(attname::text,', ')\n"
+ " FROM ((SELECT unnest(stakeys) AS attnum) s\n"
+ " JOIN pg_attribute a ON (starelid = a.attrelid and a.attnum = s.attnum))) AS attnums\n"
+ "FROM pg_mv_statistic stat WHERE starelid = '%s' ORDER BY 1;",
+ oid);
+
+ result = PSQLexec(buf.data);
+ if (!result)
+ goto error_return;
+ else
+ tuples = PQntuples(result);
+
+ if (tuples > 0)
+ {
+ printTableAddFooter(&cont, _("Statistics:"));
+ for (i = 0; i < tuples; i++)
+ {
+ printfPQExpBuffer(&buf, " ");
+
+ /* statistics name (qualified with namespace) */
+ appendPQExpBuffer(&buf, "\"%s.%s\" ",
+ PQgetvalue(result, i, 1),
+ PQgetvalue(result, i, 2));
+
+ /* options */
+ if (!strcmp(PQgetvalue(result, i, 4), "t"))
+ appendPQExpBuffer(&buf, "(dependencies)");
+
+ appendPQExpBuffer(&buf, " ON (%s)",
+ PQgetvalue(result, i, 6));
+
+ printTableAddFooter(&cont, buf.data);
+ }
+ }
+ PQclear(result);
+ }
+
/* print rules */
if (tableinfo.hasrules && tableinfo.relkind != 'm')
{
diff --git a/src/include/catalog/dependency.h b/src/include/catalog/dependency.h
index 049bf9f..12211fe 100644
--- a/src/include/catalog/dependency.h
+++ b/src/include/catalog/dependency.h
@@ -153,10 +153,11 @@ typedef enum ObjectClass
OCLASS_EXTENSION, /* pg_extension */
OCLASS_EVENT_TRIGGER, /* pg_event_trigger */
OCLASS_POLICY, /* pg_policy */
- OCLASS_TRANSFORM /* pg_transform */
+ OCLASS_TRANSFORM, /* pg_transform */
+ OCLASS_STATISTICS /* pg_mv_statistics */
} ObjectClass;
-#define LAST_OCLASS OCLASS_TRANSFORM
+#define LAST_OCLASS OCLASS_STATISTICS
/* in dependency.c */
diff --git a/src/include/catalog/heap.h b/src/include/catalog/heap.h
index b80d8d8..5ae42f7 100644
--- a/src/include/catalog/heap.h
+++ b/src/include/catalog/heap.h
@@ -119,6 +119,7 @@ extern void RemoveAttrDefault(Oid relid, AttrNumber attnum,
DropBehavior behavior, bool complain, bool internal);
extern void RemoveAttrDefaultById(Oid attrdefId);
extern void RemoveStatistics(Oid relid, AttrNumber attnum);
+extern void RemoveMVStatistics(Oid relid, AttrNumber attnum);
extern Form_pg_attribute SystemAttributeDefinition(AttrNumber attno,
bool relhasoids);
diff --git a/src/include/catalog/indexing.h b/src/include/catalog/indexing.h
index ab2c1a8..a768bb5 100644
--- a/src/include/catalog/indexing.h
+++ b/src/include/catalog/indexing.h
@@ -173,6 +173,13 @@ DECLARE_UNIQUE_INDEX(pg_largeobject_loid_pn_index, 2683, on pg_largeobject using
DECLARE_UNIQUE_INDEX(pg_largeobject_metadata_oid_index, 2996, on pg_largeobject_metadata using btree(oid oid_ops));
#define LargeObjectMetadataOidIndexId 2996
+DECLARE_UNIQUE_INDEX(pg_mv_statistic_oid_index, 3380, on pg_mv_statistic using btree(oid oid_ops));
+#define MvStatisticOidIndexId 3380
+DECLARE_UNIQUE_INDEX(pg_mv_statistic_name_index, 3997, on pg_mv_statistic using btree(staname name_ops, stanamespace oid_ops));
+#define MvStatisticNameIndexId 3997
+DECLARE_INDEX(pg_mv_statistic_relid_index, 3379, on pg_mv_statistic using btree(starelid oid_ops));
+#define MvStatisticRelidIndexId 3379
+
DECLARE_UNIQUE_INDEX(pg_namespace_nspname_index, 2684, on pg_namespace using btree(nspname name_ops));
#define NamespaceNameIndexId 2684
DECLARE_UNIQUE_INDEX(pg_namespace_oid_index, 2685, on pg_namespace using btree(oid oid_ops));
diff --git a/src/include/catalog/namespace.h b/src/include/catalog/namespace.h
index 2ccb3a7..44cf9c6 100644
--- a/src/include/catalog/namespace.h
+++ b/src/include/catalog/namespace.h
@@ -137,6 +137,8 @@ extern Oid get_collation_oid(List *collname, bool missing_ok);
extern Oid get_conversion_oid(List *conname, bool missing_ok);
extern Oid FindDefaultConversionProc(int32 for_encoding, int32 to_encoding);
+extern Oid get_statistics_oid(List *names, bool missing_ok);
+
/* initialization & transaction cleanup code */
extern void InitializeSearchPath(void);
extern void AtEOXact_Namespace(bool isCommit, bool parallel);
diff --git a/src/include/catalog/pg_mv_statistic.h b/src/include/catalog/pg_mv_statistic.h
new file mode 100644
index 0000000..c74af47
--- /dev/null
+++ b/src/include/catalog/pg_mv_statistic.h
@@ -0,0 +1,75 @@
+/*-------------------------------------------------------------------------
+ *
+ * pg_mv_statistic.h
+ * definition of the system "multivariate statistic" relation (pg_mv_statistic)
+ * along with the relation's initial contents.
+ *
+ *
+ * Portions Copyright (c) 1996-2014, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/catalog/pg_mv_statistic.h
+ *
+ * NOTES
+ * the genbki.pl script reads this file and generates .bki
+ * information from the DATA() statements.
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef PG_MV_STATISTIC_H
+#define PG_MV_STATISTIC_H
+
+#include "catalog/genbki.h"
+
+/* ----------------
+ * pg_mv_statistic definition. cpp turns this into
+ * typedef struct FormData_pg_mv_statistic
+ * ----------------
+ */
+#define MvStatisticRelationId 3381
+
+CATALOG(pg_mv_statistic,3381)
+{
+ /* These fields form the unique key for the entry: */
+ Oid starelid; /* relation containing attributes */
+ NameData staname; /* statistics name */
+ Oid stanamespace; /* OID of namespace containing this statistics */
+ Oid staowner; /* statistics owner */
+
+ /* statistics requested to build */
+ bool deps_enabled; /* analyze dependencies? */
+
+ /* statistics that are available (if requested) */
+ bool deps_built; /* dependencies were built */
+
+ /* variable-length fields start here, but we allow direct access to stakeys */
+ int2vector stakeys; /* array of column keys */
+
+#ifdef CATALOG_VARLEN
+ bytea stadeps; /* dependencies (serialized) */
+#endif
+
+} FormData_pg_mv_statistic;
+
+/* ----------------
+ * Form_pg_mv_statistic corresponds to a pointer to a tuple with
+ * the format of pg_mv_statistic relation.
+ * ----------------
+ */
+typedef FormData_pg_mv_statistic *Form_pg_mv_statistic;
+
+/* ----------------
+ * compiler constants for pg_mv_statistic
+ * ----------------
+ */
+#define Natts_pg_mv_statistic 8
+#define Anum_pg_mv_statistic_starelid 1
+#define Anum_pg_mv_statistic_staname 2
+#define Anum_pg_mv_statistic_stanamespace 3
+#define Anum_pg_mv_statistic_staowner 4
+#define Anum_pg_mv_statistic_deps_enabled 5
+#define Anum_pg_mv_statistic_deps_built 6
+#define Anum_pg_mv_statistic_stakeys 7
+#define Anum_pg_mv_statistic_stadeps 8
+
+#endif /* PG_MV_STATISTIC_H */
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
index 5c71bce..ff2d797 100644
--- a/src/include/catalog/pg_proc.h
+++ b/src/include/catalog/pg_proc.h
@@ -2666,6 +2666,11 @@ DESCR("current user privilege on any column by rel name");
DATA(insert OID = 3029 ( has_any_column_privilege PGNSP PGUID 12 10 0 0 0 f f f f t f s s 2 0 16 "26 25" _null_ _null_ _null_ _null_ _null_ has_any_column_privilege_id _null_ _null_ _null_ ));
DESCR("current user privilege on any column by rel oid");
+DATA(insert OID = 3998 ( pg_mv_stats_dependencies_info PGNSP PGUID 12 1 0 0 0 f f f f t f i s 1 0 25 "17" _null_ _null_ _null_ _null_ _null_ pg_mv_stats_dependencies_info _null_ _null_ _null_ ));
+DESCR("multivariate stats: functional dependencies info");
+DATA(insert OID = 3999 ( pg_mv_stats_dependencies_show PGNSP PGUID 12 1 0 0 0 f f f f t f i s 1 0 25 "17" _null_ _null_ _null_ _null_ _null_ pg_mv_stats_dependencies_show _null_ _null_ _null_ ));
+DESCR("multivariate stats: functional dependencies show");
+
DATA(insert OID = 1928 ( pg_stat_get_numscans PGNSP PGUID 12 1 0 0 0 f f f f t f s r 1 0 20 "26" _null_ _null_ _null_ _null_ _null_ pg_stat_get_numscans _null_ _null_ _null_ ));
DESCR("statistics: number of scans done for table/index");
DATA(insert OID = 1929 ( pg_stat_get_tuples_returned PGNSP PGUID 12 1 0 0 0 f f f f t f s r 1 0 20 "26" _null_ _null_ _null_ _null_ _null_ pg_stat_get_tuples_returned _null_ _null_ _null_ ));
diff --git a/src/include/catalog/toasting.h b/src/include/catalog/toasting.h
index b7a38ce..a52096b 100644
--- a/src/include/catalog/toasting.h
+++ b/src/include/catalog/toasting.h
@@ -49,6 +49,7 @@ extern void BootstrapToastTable(char *relName,
DECLARE_TOAST(pg_attrdef, 2830, 2831);
DECLARE_TOAST(pg_constraint, 2832, 2833);
DECLARE_TOAST(pg_description, 2834, 2835);
+DECLARE_TOAST(pg_mv_statistic, 3577, 3578);
DECLARE_TOAST(pg_proc, 2836, 2837);
DECLARE_TOAST(pg_rewrite, 2838, 2839);
DECLARE_TOAST(pg_seclabel, 3598, 3599);
diff --git a/src/include/commands/defrem.h b/src/include/commands/defrem.h
index 54f67e9..99a6a62 100644
--- a/src/include/commands/defrem.h
+++ b/src/include/commands/defrem.h
@@ -75,6 +75,10 @@ extern ObjectAddress DefineOperator(List *names, List *parameters);
extern void RemoveOperatorById(Oid operOid);
extern ObjectAddress AlterOperator(AlterOperatorStmt *stmt);
+/* commands/statscmds.c */
+extern ObjectAddress CreateStatistics(CreateStatsStmt *stmt);
+extern void RemoveStatisticsById(Oid statsOid);
+
/* commands/aggregatecmds.c */
extern ObjectAddress DefineAggregate(List *name, List *args, bool oldstyle,
List *parameters, const char *queryString);
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index fad9988..545b62a 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -266,6 +266,7 @@ typedef enum NodeTag
T_PlaceHolderInfo,
T_MinMaxAggInfo,
T_PlannerParamItem,
+ T_MVStatisticInfo,
/*
* TAGS FOR MEMORY NODES (memnodes.h)
@@ -401,6 +402,7 @@ typedef enum NodeTag
T_CreatePolicyStmt,
T_AlterPolicyStmt,
T_CreateTransformStmt,
+ T_CreateStatsStmt,
/*
* TAGS FOR PARSE TREE NODES (parsenodes.h)
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 2fd0629..e1807fb 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -601,6 +601,17 @@ typedef struct ColumnDef
int location; /* parse location, or -1 if none/unknown */
} ColumnDef;
+typedef struct CreateStatsStmt
+{
+ NodeTag type;
+ List *defnames; /* qualified name (list of Value strings) */
+ RangeVar *relation; /* relation to build statistics on */
+ List *keys; /* String nodes naming referenced column(s) */
+ List *options; /* list of DefElem nodes */
+ bool if_not_exists; /* just do nothing if statistics already exists? */
+} CreateStatsStmt;
+
+
/*
* TableLikeClause - CREATE TABLE ( ... LIKE ... ) clause
*/
@@ -1410,6 +1421,7 @@ typedef enum ObjectType
OBJECT_RULE,
OBJECT_SCHEMA,
OBJECT_SEQUENCE,
+ OBJECT_STATISTICS,
OBJECT_TABCONSTRAINT,
OBJECT_TABLE,
OBJECT_TABLESPACE,
diff --git a/src/include/nodes/relation.h b/src/include/nodes/relation.h
index 641728b..e10dcf1 100644
--- a/src/include/nodes/relation.h
+++ b/src/include/nodes/relation.h
@@ -539,6 +539,7 @@ typedef struct RelOptInfo
List *lateral_vars; /* LATERAL Vars and PHVs referenced by rel */
Relids lateral_referencers; /* rels that reference me laterally */
List *indexlist; /* list of IndexOptInfo */
+ List *mvstatlist; /* list of MVStatisticInfo */
BlockNumber pages; /* size estimates derived from pg_class */
double tuples;
double allvisfrac;
@@ -634,6 +635,33 @@ typedef struct IndexOptInfo
void (*amcostestimate) (); /* AM's cost estimator */
} IndexOptInfo;
+/*
+ * MVStatisticInfo
+ * Information about multivariate stats for planning/optimization
+ *
+ * This contains information about which columns are covered by the
+ * statistics (stakeys), which options were requested while adding the
+ * statistics (*_enabled), and which kinds of statistics were actually
+ * built and are available for the optimizer (*_built).
+ */
+typedef struct MVStatisticInfo
+{
+ NodeTag type;
+
+ Oid mvoid; /* OID of the statistics row */
+ RelOptInfo *rel; /* back-link to index's table */
+
+ /* enabled statistics */
+ bool deps_enabled; /* functional dependencies enabled */
+
+ /* built/available statistics */
+ bool deps_built; /* functional dependencies built */
+
+ /* columns in the statistics (attnums) */
+ int2vector *stakeys; /* attnums of the columns covered */
+
+} MVStatisticInfo;
+
/*
* EquivalenceClasses
diff --git a/src/include/utils/acl.h b/src/include/utils/acl.h
index 4e15a14..3e11253 100644
--- a/src/include/utils/acl.h
+++ b/src/include/utils/acl.h
@@ -330,6 +330,7 @@ extern bool pg_foreign_data_wrapper_ownercheck(Oid srv_oid, Oid roleid);
extern bool pg_foreign_server_ownercheck(Oid srv_oid, Oid roleid);
extern bool pg_event_trigger_ownercheck(Oid et_oid, Oid roleid);
extern bool pg_extension_ownercheck(Oid ext_oid, Oid roleid);
+extern bool pg_statistics_ownercheck(Oid stat_oid, Oid roleid);
extern bool has_createrole_privilege(Oid roleid);
extern bool has_bypassrls_privilege(Oid roleid);
diff --git a/src/include/utils/mvstats.h b/src/include/utils/mvstats.h
new file mode 100644
index 0000000..7ebd961
--- /dev/null
+++ b/src/include/utils/mvstats.h
@@ -0,0 +1,70 @@
+/*-------------------------------------------------------------------------
+ *
+ * mvstats.h
+ * Multivariate statistics and selectivity estimation functions.
+ *
+ *
+ * Portions Copyright (c) 1996-2014, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/utils/mvstats.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef MVSTATS_H
+#define MVSTATS_H
+
+#include "fmgr.h"
+#include "commands/vacuum.h"
+
+
+#define MVSTATS_MAX_DIMENSIONS 8 /* max number of attributes */
+
+/* An associative rule, tracking [a => b] dependency.
+ *
+ * TODO Make this work with multiple columns on both sides.
+ */
+typedef struct MVDependencyData {
+ int16 a;
+ int16 b;
+} MVDependencyData;
+
+typedef MVDependencyData* MVDependency;
+
+typedef struct MVDependenciesData {
+ uint32 magic; /* magic constant marker */
+ int32 ndeps; /* number of dependencies */
+ MVDependency deps[1]; /* XXX why not a pointer? */
+} MVDependenciesData;
+
+typedef MVDependenciesData* MVDependencies;
+
+#define MVSTAT_DEPS_MAGIC 0xB4549A2C /* marks serialized bytea */
+#define MVSTAT_DEPS_TYPE_BASIC 1 /* basic dependencies type */
+
+/*
+ * TODO Maybe fetching the histogram/MCV list separately is inefficient?
+ * Consider adding a single `fetch_stats` method, fetching all
+ * stats specified using flags (or something like that).
+ */
+
+bytea * serialize_mv_dependencies(MVDependencies dependencies);
+
+/* deserialization of stats (serialization is private to analyze) */
+MVDependencies deserialize_mv_dependencies(bytea * data);
+
+/* FIXME this probably belongs somewhere else (not to operations stats) */
+extern Datum pg_mv_stats_dependencies_info(PG_FUNCTION_ARGS);
+extern Datum pg_mv_stats_dependencies_show(PG_FUNCTION_ARGS);
+
+MVDependencies
+build_mv_dependencies(int numrows, HeapTuple *rows,
+ int2vector *attrs,
+ VacAttrStats **stats);
+
+void build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
+ int natts, VacAttrStats **vacattrstats);
+
+void update_mv_stats(Oid relid, MVDependencies dependencies, int2vector *attrs);
+
+#endif
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index f2bebf2..8771f9c 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -61,6 +61,7 @@ typedef struct RelationData
bool rd_isvalid; /* relcache entry is valid */
char rd_indexvalid; /* state of rd_indexlist: 0 = not valid, 1 =
* valid, 2 = temporarily forced */
+ bool rd_mvstatvalid; /* state of rd_mvstatlist: true/false */
/*
* rd_createSubid is the ID of the highest subtransaction the rel has
@@ -93,6 +94,9 @@ typedef struct RelationData
List *rd_indexlist; /* list of OIDs of indexes on relation */
Oid rd_oidindex; /* OID of unique index on OID, if any */
Oid rd_replidindex; /* OID of replica identity index, if any */
+
+ /* data managed by RelationGetMVStatList: */
+ List *rd_mvstatlist; /* list of OIDs of multivariate stats */
/* data managed by RelationGetIndexAttrBitmap: */
Bitmapset *rd_indexattr; /* identifies columns used in indexes */
diff --git a/src/include/utils/relcache.h b/src/include/utils/relcache.h
index 1b48304..9f03c8d 100644
--- a/src/include/utils/relcache.h
+++ b/src/include/utils/relcache.h
@@ -38,6 +38,7 @@ extern void RelationClose(Relation relation);
* Routines to compute/retrieve additional cached information
*/
extern List *RelationGetIndexList(Relation relation);
+extern List *RelationGetMVStatList(Relation relation);
extern Oid RelationGetOidIndex(Relation relation);
extern Oid RelationGetReplicaIndex(Relation relation);
extern List *RelationGetIndexExpressions(Relation relation);
diff --git a/src/include/utils/syscache.h b/src/include/utils/syscache.h
index 256615b..0e0658d 100644
--- a/src/include/utils/syscache.h
+++ b/src/include/utils/syscache.h
@@ -66,6 +66,8 @@ enum SysCacheIdentifier
INDEXRELID,
LANGNAME,
LANGOID,
+ MVSTATNAMENSP,
+ MVSTATOID,
NAMESPACENAME,
NAMESPACEOID,
OPERNAMENSP,
diff --git a/src/test/regress/expected/object_address.out b/src/test/regress/expected/object_address.out
index 75751be..eb60960 100644
--- a/src/test/regress/expected/object_address.out
+++ b/src/test/regress/expected/object_address.out
@@ -35,6 +35,7 @@ ALTER DEFAULT PRIVILEGES FOR ROLE regtest_addr_user REVOKE DELETE ON TABLES FROM
CREATE TRANSFORM FOR int LANGUAGE SQL (
FROM SQL WITH FUNCTION varchar_transform(internal),
TO SQL WITH FUNCTION int4recv(internal));
+CREATE STATISTICS addr_nsp.gentable_stat ON addr_nsp.gentable(a,b) WITH (dependencies);
-- test some error cases
SELECT pg_get_object_address('stone', '{}', '{}');
ERROR: unrecognized object type "stone"
@@ -373,7 +374,8 @@ WITH objects (type, name, args) AS (VALUES
-- extension
-- event trigger
('policy', '{addr_nsp, gentable, genpol}', '{}'),
- ('transform', '{int}', '{sql}')
+ ('transform', '{int}', '{sql}'),
+ ('statistics', '{addr_nsp, gentable_stat}', '{}')
)
SELECT (pg_identify_object(addr1.classid, addr1.objid, addr1.subobjid)).*,
-- test roundtrip through pg_identify_object_as_address
@@ -420,13 +422,14 @@ SELECT (pg_identify_object(addr1.classid, addr1.objid, addr1.subobjid)).*,
trigger | | | t on addr_nsp.gentable | t
operator family | pg_catalog | integer_ops | pg_catalog.integer_ops USING btree | t
policy | | | genpol on addr_nsp.gentable | t
+ statistics | addr_nsp | gentable_stat | addr_nsp.gentable_stat | t
collation | pg_catalog | "default" | pg_catalog."default" | t
transform | | | for integer on language sql | t
text search dictionary | addr_nsp | addr_ts_dict | addr_nsp.addr_ts_dict | t
text search parser | addr_nsp | addr_ts_prs | addr_nsp.addr_ts_prs | t
text search configuration | addr_nsp | addr_ts_conf | addr_nsp.addr_ts_conf | t
text search template | addr_nsp | addr_ts_temp | addr_nsp.addr_ts_temp | t
-(41 rows)
+(42 rows)
---
--- Cleanup resources
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 22ea06c..06f2231 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1368,6 +1368,15 @@ pg_matviews| SELECT n.nspname AS schemaname,
LEFT JOIN pg_namespace n ON ((n.oid = c.relnamespace)))
LEFT JOIN pg_tablespace t ON ((t.oid = c.reltablespace)))
WHERE (c.relkind = 'm'::"char");
+pg_mv_stats| SELECT n.nspname AS schemaname,
+ c.relname AS tablename,
+ s.staname,
+ s.stakeys AS attnums,
+ length(s.stadeps) AS depsbytes,
+ pg_mv_stats_dependencies_info(s.stadeps) AS depsinfo
+ FROM ((pg_mv_statistic s
+ JOIN pg_class c ON ((c.oid = s.starelid)))
+ LEFT JOIN pg_namespace n ON ((n.oid = c.relnamespace)));
pg_policies| SELECT n.nspname AS schemaname,
c.relname AS tablename,
pol.polname AS policyname,
diff --git a/src/test/regress/expected/sanity_check.out b/src/test/regress/expected/sanity_check.out
index eb0bc88..92a0d8a 100644
--- a/src/test/regress/expected/sanity_check.out
+++ b/src/test/regress/expected/sanity_check.out
@@ -113,6 +113,7 @@ pg_inherits|t
pg_language|t
pg_largeobject|t
pg_largeobject_metadata|t
+pg_mv_statistic|t
pg_namespace|t
pg_opclass|t
pg_operator|t
diff --git a/src/test/regress/sql/object_address.sql b/src/test/regress/sql/object_address.sql
index 68e7cb0..3775b28 100644
--- a/src/test/regress/sql/object_address.sql
+++ b/src/test/regress/sql/object_address.sql
@@ -39,6 +39,7 @@ ALTER DEFAULT PRIVILEGES FOR ROLE regtest_addr_user REVOKE DELETE ON TABLES FROM
CREATE TRANSFORM FOR int LANGUAGE SQL (
FROM SQL WITH FUNCTION varchar_transform(internal),
TO SQL WITH FUNCTION int4recv(internal));
+CREATE STATISTICS addr_nsp.gentable_stat ON addr_nsp.gentable(a,b) WITH (dependencies);
-- test some error cases
SELECT pg_get_object_address('stone', '{}', '{}');
@@ -166,7 +167,8 @@ WITH objects (type, name, args) AS (VALUES
-- extension
-- event trigger
('policy', '{addr_nsp, gentable, genpol}', '{}'),
- ('transform', '{int}', '{sql}')
+ ('transform', '{int}', '{sql}'),
+ ('statistics', '{addr_nsp, gentable_stat}', '{}')
)
SELECT (pg_identify_object(addr1.classid, addr1.objid, addr1.subobjid)).*,
-- test roundtrip through pg_identify_object_as_address
--
2.5.0
Attachment: 0009-fixup-of-regression-tests-plans-changes-by-group-by-.patch (text/x-patch)
From 771376751dfba0f469f6830c2a9eb545d1e25235 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@pgaddict.com>
Date: Sun, 28 Feb 2016 21:16:40 +0100
Subject: [PATCH 9/9] fixup of regression tests (plans changes by group by
estimation)
---
src/test/regress/expected/join.out | 18 ++++++++++--------
src/test/regress/expected/subselect.out | 25 +++++++++++--------------
src/test/regress/expected/union.out | 16 ++++++++--------
3 files changed, 29 insertions(+), 30 deletions(-)
diff --git a/src/test/regress/expected/join.out b/src/test/regress/expected/join.out
index cafbc5e..151402d 100644
--- a/src/test/regress/expected/join.out
+++ b/src/test/regress/expected/join.out
@@ -3965,18 +3965,20 @@ select d.* from d left join (select * from b group by b.id, b.c_id) s
explain (costs off)
select d.* from d left join (select distinct * from b) s
on d.a = s.id;
- QUERY PLAN
---------------------------------------
+ QUERY PLAN
+---------------------------------------------
Merge Right Join
- Merge Cond: (b.id = d.a)
- -> Unique
- -> Sort
- Sort Key: b.id, b.c_id
- -> Seq Scan on b
+ Merge Cond: (s.id = d.a)
+ -> Sort
+ Sort Key: s.id
+ -> Subquery Scan on s
+ -> HashAggregate
+ Group Key: b.id, b.c_id
+ -> Seq Scan on b
-> Sort
Sort Key: d.a
-> Seq Scan on d
-(9 rows)
+(11 rows)
-- check join removal works when uniqueness of the join condition is enforced
-- by a UNION
diff --git a/src/test/regress/expected/subselect.out b/src/test/regress/expected/subselect.out
index de64ca7..0fc93d9 100644
--- a/src/test/regress/expected/subselect.out
+++ b/src/test/regress/expected/subselect.out
@@ -807,27 +807,24 @@ select * from int4_tbl where
explain (verbose, costs off)
select * from int4_tbl o where (f1, f1) in
(select f1, generate_series(1,2) / 10 g from int4_tbl i group by f1);
- QUERY PLAN
-----------------------------------------------------------------------
- Hash Join
+ QUERY PLAN
+----------------------------------------------------------------
+ Hash Semi Join
Output: o.f1
Hash Cond: (o.f1 = "ANY_subquery".f1)
-> Seq Scan on public.int4_tbl o
Output: o.f1
-> Hash
Output: "ANY_subquery".f1, "ANY_subquery".g
- -> HashAggregate
+ -> Subquery Scan on "ANY_subquery"
Output: "ANY_subquery".f1, "ANY_subquery".g
- Group Key: "ANY_subquery".f1, "ANY_subquery".g
- -> Subquery Scan on "ANY_subquery"
- Output: "ANY_subquery".f1, "ANY_subquery".g
- Filter: ("ANY_subquery".f1 = "ANY_subquery".g)
- -> HashAggregate
- Output: i.f1, (generate_series(1, 2) / 10)
- Group Key: i.f1
- -> Seq Scan on public.int4_tbl i
- Output: i.f1
-(18 rows)
+ Filter: ("ANY_subquery".f1 = "ANY_subquery".g)
+ -> HashAggregate
+ Output: i.f1, (generate_series(1, 2) / 10)
+ Group Key: i.f1
+ -> Seq Scan on public.int4_tbl i
+ Output: i.f1
+(15 rows)
select * from int4_tbl o where (f1, f1) in
(select f1, generate_series(1,2) / 10 g from int4_tbl i group by f1);
diff --git a/src/test/regress/expected/union.out b/src/test/regress/expected/union.out
index 016571b..f2e297e 100644
--- a/src/test/regress/expected/union.out
+++ b/src/test/regress/expected/union.out
@@ -263,16 +263,16 @@ ORDER BY 1;
SELECT q2 FROM int8_tbl INTERSECT SELECT q1 FROM int8_tbl;
q2
------------------
- 4567890123456789
123
+ 4567890123456789
(2 rows)
SELECT q2 FROM int8_tbl INTERSECT ALL SELECT q1 FROM int8_tbl;
q2
------------------
+ 123
4567890123456789
4567890123456789
- 123
(3 rows)
SELECT q2 FROM int8_tbl EXCEPT SELECT q1 FROM int8_tbl ORDER BY 1;
@@ -305,16 +305,16 @@ SELECT q1 FROM int8_tbl EXCEPT SELECT q2 FROM int8_tbl;
SELECT q1 FROM int8_tbl EXCEPT ALL SELECT q2 FROM int8_tbl;
q1
------------------
- 4567890123456789
123
+ 4567890123456789
(2 rows)
SELECT q1 FROM int8_tbl EXCEPT ALL SELECT DISTINCT q2 FROM int8_tbl;
q1
------------------
+ 123
4567890123456789
4567890123456789
- 123
(3 rows)
SELECT q1 FROM int8_tbl EXCEPT ALL SELECT q1 FROM int8_tbl FOR NO KEY UPDATE;
@@ -343,8 +343,8 @@ SELECT f1 FROM float8_tbl EXCEPT SELECT f1 FROM int4_tbl ORDER BY 1;
SELECT q1 FROM int8_tbl INTERSECT SELECT q2 FROM int8_tbl UNION ALL SELECT q2 FROM int8_tbl;
q1
-------------------
- 4567890123456789
123
+ 4567890123456789
456
4567890123456789
123
@@ -355,15 +355,15 @@ SELECT q1 FROM int8_tbl INTERSECT SELECT q2 FROM int8_tbl UNION ALL SELECT q2 FR
SELECT q1 FROM int8_tbl INTERSECT (((SELECT q2 FROM int8_tbl UNION ALL SELECT q2 FROM int8_tbl)));
q1
------------------
- 4567890123456789
123
+ 4567890123456789
(2 rows)
(((SELECT q1 FROM int8_tbl INTERSECT SELECT q2 FROM int8_tbl))) UNION ALL SELECT q2 FROM int8_tbl;
q1
-------------------
- 4567890123456789
123
+ 4567890123456789
456
4567890123456789
123
@@ -419,8 +419,8 @@ HINT: There is a column named "q2" in table "*SELECT* 2", but it cannot be refe
SELECT q1 FROM int8_tbl EXCEPT (((SELECT q2 FROM int8_tbl ORDER BY q2 LIMIT 1)));
q1
------------------
- 4567890123456789
123
+ 4567890123456789
(2 rows)
--
--
2.5.0
Attachment: 0001-teach-pull_-varno-varattno-_walker-about-RestrictInf.patch (text/x-patch)
From e073a4d8368a0b0c66cd2933e6a7a210c1b5c53f Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@pgaddict.com>
Date: Tue, 28 Apr 2015 19:56:33 +0200
Subject: [PATCH 1/9] teach pull_(varno|varattno)_walker about RestrictInfo
otherwise pull_varnos fails when processing OR clauses
---
src/backend/optimizer/util/var.c | 16 ++++++++++++++++
1 file changed, 16 insertions(+)
diff --git a/src/backend/optimizer/util/var.c b/src/backend/optimizer/util/var.c
index 292e1f4..9228a46 100644
--- a/src/backend/optimizer/util/var.c
+++ b/src/backend/optimizer/util/var.c
@@ -196,6 +196,13 @@ pull_varnos_walker(Node *node, pull_varnos_context *context)
context->sublevels_up--;
return result;
}
+ if (IsA(node, RestrictInfo))
+ {
+ RestrictInfo *rinfo = (RestrictInfo*)node;
+ context->varnos = bms_add_members(context->varnos,
+ rinfo->clause_relids);
+ return false;
+ }
return expression_tree_walker(node, pull_varnos_walker,
(void *) context);
}
@@ -244,6 +251,15 @@ pull_varattnos_walker(Node *node, pull_varattnos_context *context)
return false;
}
+ if (IsA(node, RestrictInfo))
+ {
+ RestrictInfo *rinfo = (RestrictInfo *)node;
+
+ return expression_tree_walker((Node*)rinfo->clause,
+ pull_varattnos_walker,
+ (void*) context);
+ }
+
/* Should not find an unplanned subquery */
Assert(!IsA(node, Query));
--
2.5.0
Instead of simply multiplying the ndistinct estimate with selectivity,
we instead use the formula for the expected number of distinct values
observed in 'k' rows when there are 'd' distinct values in the table:

    d * (1 - ((d - 1) / d)^k)
This is 'with replacements' which seems appropriate for the use, and it
mostly assumes uniform distribution of the distinct values. So if the
distribution is not uniform (e.g. there are very frequent groups) this
may be less accurate than the current algorithm in some cases, giving
over-estimates. But that's probably better than OOM.
---
src/backend/utils/adt/selfuncs.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index f8d39aa..6eceedf 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -3466,7 +3466,7 @@ estimate_num_groups(PlannerInfo *root, List *groupExprs, double input_rows,
 	/*
 	 * Multiply by restriction selectivity.
 	 */
-	reldistinct *= rel->rows / rel->tuples;
+	reldistinct = reldistinct * (1 - powl((reldistinct - 1) / reldistinct, rel->rows));
Why did you change the "*=" style? I see no reason to change this.
reldistinct *= 1 - powl((reldistinct - 1) / reldistinct, rel->rows);
Looks better to me because it's shorter and cleaner.
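To illustrate the formula being discussed: each of the 'd' distinct
values is missed by one random row with probability (d-1)/d, so it is
missed by all 'k' rows (drawn with replacement) with probability
((d-1)/d)^k, which gives d * (1 - ((d-1)/d)^k) expected observed
values. A minimal standalone sketch of the computation (a hypothetical
helper for illustration, not code from the patch):

#include <math.h>
#include <stdio.h>

/*
 * Expected number of distinct values observed when sampling k rows
 * (with replacement) out of a population with d distinct values:
 * d * (1 - ((d - 1) / d)^k).
 */
static double
expected_distinct(double d, double k)
{
	return d * (1.0 - powl((d - 1.0) / d, k));
}

int
main(void)
{
	/* e.g. d = 1000 distinct values, k = 100 rows -> ~95.2 observed */
	printf("%f\n", expected_distinct(1000.0, 100.0));
	return 0;
}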
Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp
I apologize if this has already been discussed. I am new to this patch.
Attached is v15 of the patch series, fixing this and also doing quite a
few additional improvements:

* added some basic examples into the SGML documentation
* addressing the objectaddress omissions, as pointed out by Alvaro
* support for ALTER STATISTICS ... OWNER TO / RENAME / SET SCHEMA
* significant refactoring of MCV and histogram code, particularly
  serialization, deserialization and building
* reworking the functional dependencies to support more complex
  dependencies, with multiple columns as 'conditions'
* the reduction using functional dependencies is also significantly
  simplified (I decided to get rid of computing the transitive closure
  for now - it got too complex after the multi-condition dependencies,
  so I'll leave that for the future)
Are there any other missing parts in this work? I am asking because
I wonder whether you want to push this into 9.6 or rather 9.7.
Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp
Hello, I returned to this.
At Sun, 13 Mar 2016 22:59:38 +0100, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote in <1457906378.27231.10.camel@2ndquadrant.com>
Oh, yeah. There was an extra pfree().
Attached is v15 of the patch series, fixing this and also doing quite a
few additional improvements:

* added some basic examples into the SGML documentation
* addressing the objectaddress omissions, as pointed out by Alvaro
* support for ALTER STATISTICS ... OWNER TO / RENAME / SET SCHEMA
* significant refactoring of MCV and histogram code, particularly
  serialization, deserialization and building
* reworking the functional dependencies to support more complex
  dependencies, with multiple columns as 'conditions'
* the reduction using functional dependencies is also significantly
  simplified (I decided to get rid of computing the transitive closure
  for now - it got too complex after the multi-condition dependencies,
  so I'll leave that for the future)
Many trailing white spaces found.
0002
+ * Portions Copyright (c) 1996-2014, PostgreSQL Global Development Group
2014 should be 2016?
This patch defines a "magic" field for many structs, but in
PostgreSQL magic numbers seem to be used to identify files or buffer
pages. They wouldn't be needed unless you intend to dig out or
identify orphaned memory blocks of mvstats.
+ MVDependency deps[1]; /* XXX why not a pointer? */
MVDependency seems to be a pointer type.
+ if (numcols >= MVSTATS_MAX_DIMENSIONS)
+ ereport(ERROR,
and
+ Assert((attrs->dim1 >= 2) && (attrs->dim1 <= MVSTATS_MAX_DIMENSIONS));
seem to be contradicting.
.. Sorry, time is up..
regards,
--
Kyotaro Horiguchi
NTT Open Source Software Center
On 03/16/2016 09:31 AM, Kyotaro HORIGUCHI wrote:
Hello, I returned to this.
At Sun, 13 Mar 2016 22:59:38 +0100, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote in <1457906378.27231.10.camel@2ndquadrant.com>
Oh, yeah. There was an extra pfree().
Attached is v15 of the patch series, fixing this and also doing quite a
few additional improvements:

* added some basic examples into the SGML documentation
* addressing the objectaddress omissions, as pointed out by Alvaro
* support for ALTER STATISTICS ... OWNER TO / RENAME / SET SCHEMA
* significant refactoring of MCV and histogram code, particularly
  serialization, deserialization and building
* reworking the functional dependencies to support more complex
  dependencies, with multiple columns as 'conditions'
* the reduction using functional dependencies is also significantly
  simplified (I decided to get rid of computing the transitive closure
  for now - it got too complex after the multi-condition dependencies,
  so I'll leave that for the future)

Many trailing white spaces found.
Sorry, haven't noticed that after one of the rebases. Fixed in the
attached v15 of the patch.
0002
+ * Portions Copyright (c) 1996-2014, PostgreSQL Global Development Group
2014 should be 2016?
Yes, the copyright info will need some tweaks. There are a few other files
with 2015, and I think the start should be the current year (and not 1996).
This patch defines a "magic" field for many structs, but in
PostgreSQL magic numbers seem to be used to identify files or buffer
pages. They wouldn't be needed unless you intend to dig out or
identify orphaned memory blocks of mvstats.

+ MVDependency deps[1]; /* XXX why not a pointer? */
MVDependency seems to be a pointer type.
Right, but we need an array of the structures here, so one way is to use
a pointer and the other is a variable-length field. I will remove
the comment; I think the structure is fine as is.
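To illustrate the trade-off discussed above (a sketch with illustrative
names, not the patch's actual definitions): the variable-length trailing
array lets the header and the array live in a single allocation, whereas
a plain pointer member would point to a separately allocated array:

#include <stddef.h>
#include <stdlib.h>

typedef struct MVDependencyData
{
	int		nattributes;	/* number of attributes in the dependency */
} MVDependencyData;

typedef MVDependencyData *MVDependency;

typedef struct MVDependenciesData
{
	int			 ndeps;		/* number of dependencies */
	MVDependency deps[1];	/* variable-length trailing array */
} MVDependenciesData;

static MVDependenciesData *
alloc_dependencies(int ndeps)
{
	/* a single allocation covers both the header and the array */
	MVDependenciesData *result =
		calloc(1, offsetof(MVDependenciesData, deps) +
				  ndeps * sizeof(MVDependency));

	if (result != NULL)
		result->ndeps = ndeps;

	return result;
}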
+ if (numcols >= MVSTATS_MAX_DIMENSIONS)
+ 	ereport(ERROR,

and

+ Assert((attrs->dim1 >= 2) && (attrs->dim1 <= MVSTATS_MAX_DIMENSIONS));

seem to be contradicting.
Nope, because the first check is in a loop where 'numcols' is used as an
index into an array with MVSTATS_MAX_DIMENSIONS elements.
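Condensed into a standalone sketch (illustrative code, not the patch
itself), the two checks are consistent because the ereport fires while
numcols still counts the columns collected so far, so the array index
can never exceed the last valid slot:

#include <assert.h>

#define MVSTATS_MAX_DIMENSIONS 8	/* illustrative value */

static int
collect_columns(const int *requested, int nrequested,
				short attnums[MVSTATS_MAX_DIMENSIONS])
{
	int		numcols = 0;
	int		i;

	for (i = 0; i < nrequested; i++)
	{
		/* rejects the (MAX+1)-th column before writing past the array */
		if (numcols >= MVSTATS_MAX_DIMENSIONS)
			return -1;			/* stands in for ereport(ERROR, ...) */
		attnums[numcols++] = (short) requested[i];
	}

	/* so any successfully built list satisfies the later Assert */
	assert(numcols <= MVSTATS_MAX_DIMENSIONS);
	return numcols;
}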
.. Sorry, time is up..
Thanks for the comments!
Attached is v15 of the patch, which also fixes one mistake - after
reworking the functional dependencies to support multiple columns on the
left side (as conditions), I failed to move it to the proper place in
the patch series. So 0002 built the dependencies in the old way and 0003
changed it to the new one. That was pointless and added another 20kB to
the patch, so v15 moves the new code to 0002.
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachments:
0001-teach-pull_-varno-varattno-_walker-about-RestrictInf.patch (text/x-patch)
From 5de240e541a0893ed945b16ec1fe23522c00ae61 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@pgaddict.com>
Date: Tue, 28 Apr 2015 19:56:33 +0200
Subject: [PATCH 1/9] teach pull_(varno|varattno)_walker about RestrictInfo
otherwise pull_varnos fails when processing OR clauses
---
src/backend/optimizer/util/var.c | 16 ++++++++++++++++
1 file changed, 16 insertions(+)
diff --git a/src/backend/optimizer/util/var.c b/src/backend/optimizer/util/var.c
index 292e1f4..9228a46 100644
--- a/src/backend/optimizer/util/var.c
+++ b/src/backend/optimizer/util/var.c
@@ -196,6 +196,13 @@ pull_varnos_walker(Node *node, pull_varnos_context *context)
context->sublevels_up--;
return result;
}
+ if (IsA(node, RestrictInfo))
+ {
+ RestrictInfo *rinfo = (RestrictInfo*)node;
+ context->varnos = bms_add_members(context->varnos,
+ rinfo->clause_relids);
+ return false;
+ }
return expression_tree_walker(node, pull_varnos_walker,
(void *) context);
}
@@ -244,6 +251,15 @@ pull_varattnos_walker(Node *node, pull_varattnos_context *context)
return false;
}
+ if (IsA(node, RestrictInfo))
+ {
+ RestrictInfo *rinfo = (RestrictInfo *)node;
+
+ return expression_tree_walker((Node*)rinfo->clause,
+ pull_varattnos_walker,
+ (void*) context);
+ }
+
/* Should not find an unplanned subquery */
Assert(!IsA(node, Query));
--
2.5.0
0002-shared-infrastructure-and-functional-dependencies.patch (text/x-patch)
From 6cf5d3b456bb294bb033b9e1e2eb545cfd4c1739 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tv@fuzzy.cz>
Date: Sun, 11 Jan 2015 19:51:48 +0100
Subject: [PATCH 2/9] shared infrastructure and functional dependencies
Basic infrastructure shared by all kinds of multivariate stats, most
importantly:
- adds a new system catalog (pg_mv_statistic)
- CREATE STATISTICS name ON table (columns) WITH (options)
- DROP STATISTICS name
- ALTER STATISTICS ... OWNER TO / SET SCHEMA / RENAME
- implementation of functional dependencies (the simplest type of
multivariate statistics)
- building functional dependencies in ANALYZE
- updates existing regression tests (new catalog etc.)
- adds a new regression test for functional dependencies
This does not include any changes to the optimizer, i.e. it does not
influence the query planning (subject to follow-up patches).
The current implementation requires a valid 'ltopr' for the columns, so
that we can sort the sample rows in various ways, both in this patch
and other kinds of statistics. Maybe this restriction could be relaxed
in the future, requiring just 'eqopr' in case of stats not sorting the
data (e.g. functional dependencies and MCV lists).
Maybe some of the stats (functional dependencies and MCV list with
limited functionality) might be made to work with hashes of the values,
which is sufficient for equality comparisons. But the queries would
require the equality operator anyway, so it's not really a weaker
requirement. The hashes might reduce space requirements, though.
The algorithm detecting the dependencies is rather simple and probably
needs improvements, both to detect more complicated dependencies and
to validate the math.
The name 'functional dependencies' is more correct (than 'association
rules') as it's exactly the name used in relational theory (esp. Normal
Forms) for tracking column-level dependencies.
The multivariate statistics are automatically removed in two situations:
(a) after a DROP TABLE (obviously)
(b) after ALTER TABLE ... DROP COLUMN, if the statistics would be
left with fewer than 2 remaining columns
If at least two columns remain, we keep the
statistics but perform cleanup on the next ANALYZE. The dropped columns
are removed from stakeys, and the new statistics is built on the
smaller set.
We can't do this at DROP COLUMN, because that'd leave us with invalid
statistics, or we'd have to throw it away although we can still use it.
This lazy approach lets us use the statistics although some of the
columns are dead.
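A compact sketch of that drop decision (simplified from the
RemoveMVStatistics logic included below; illustrative types, not the
actual catalog-scanning code): a statistics entry survives the DROP
COLUMN only if at least two of its columns are still live:

#include <stdbool.h>

static bool
stats_still_useful(const short *stakeys, int nkeys,
				   short dropped_attnum, const bool *attisdropped)
{
	int		i;
	int		remaining = 0;

	for (i = 0; i < nkeys; i++)
	{
		/* ignore the column being dropped and previously dropped ones */
		if (stakeys[i] != dropped_attnum &&
			!attisdropped[stakeys[i] - 1])
			remaining++;
	}

	/* keep the statistics only if two or more columns remain */
	return (remaining >= 2);
}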
This also adds a simple list of statistics to \d in psql.
This means the statistics are created within a schema by using a
qualified name (or using the default schema)
CREATE STATISTICS schema.statistics ON ...
and then dropped by specifying qualified name
DROP STATISTICS schema.statistics
or searching through search_path (just like with other objects).
This also gets rid of the "(opt_)stats_name" definitions in gram.y and
instead replaces them with just "opt_any_name", although the optional
case is not really handled currently - there's no generated name yet
(so either we should drop it or implement it).
I'm not entirely sure making statistics schema-specific is such a great
idea. Maybe it should be "global", but that does not seem right (e.g.
it makes multi-tenant systems based on schemas more difficult to
manage, because tenants would interact).
---
doc/src/sgml/ref/allfiles.sgml | 3 +
doc/src/sgml/ref/alter_statistics.sgml | 115 +++++
doc/src/sgml/ref/create_statistics.sgml | 198 ++++++++
doc/src/sgml/ref/drop_statistics.sgml | 91 ++++
doc/src/sgml/reference.sgml | 2 +
src/backend/catalog/Makefile | 1 +
src/backend/catalog/aclchk.c | 27 +
src/backend/catalog/dependency.c | 11 +-
src/backend/catalog/heap.c | 102 ++++
src/backend/catalog/namespace.c | 51 ++
src/backend/catalog/objectaddress.c | 54 ++
src/backend/catalog/system_views.sql | 11 +
src/backend/commands/Makefile | 6 +-
src/backend/commands/alter.c | 3 +
src/backend/commands/analyze.c | 21 +
src/backend/commands/dropcmds.c | 4 +
src/backend/commands/event_trigger.c | 3 +
src/backend/commands/statscmds.c | 277 +++++++++++
src/backend/nodes/copyfuncs.c | 17 +
src/backend/nodes/outfuncs.c | 18 +
src/backend/optimizer/util/plancat.c | 59 +++
src/backend/parser/gram.y | 60 ++-
src/backend/tcop/utility.c | 14 +
src/backend/utils/Makefile | 2 +-
src/backend/utils/cache/relcache.c | 59 +++
src/backend/utils/cache/syscache.c | 23 +
src/backend/utils/mvstats/Makefile | 17 +
src/backend/utils/mvstats/README.dependencies | 222 +++++++++
src/backend/utils/mvstats/common.c | 376 ++++++++++++++
src/backend/utils/mvstats/common.h | 78 +++
src/backend/utils/mvstats/dependencies.c | 686 ++++++++++++++++++++++++++
src/bin/psql/describe.c | 44 ++
src/include/catalog/dependency.h | 5 +-
src/include/catalog/heap.h | 1 +
src/include/catalog/indexing.h | 7 +
src/include/catalog/namespace.h | 2 +
src/include/catalog/pg_mv_statistic.h | 75 +++
src/include/catalog/pg_proc.h | 5 +
src/include/catalog/toasting.h | 1 +
src/include/commands/defrem.h | 4 +
src/include/nodes/nodes.h | 2 +
src/include/nodes/parsenodes.h | 12 +
src/include/nodes/relation.h | 28 ++
src/include/utils/acl.h | 1 +
src/include/utils/mvstats.h | 71 +++
src/include/utils/rel.h | 4 +
src/include/utils/relcache.h | 1 +
src/include/utils/syscache.h | 2 +
src/test/regress/expected/mv_dependencies.out | 150 ++++++
src/test/regress/expected/object_address.out | 7 +-
src/test/regress/expected/rules.out | 9 +
src/test/regress/expected/sanity_check.out | 1 +
src/test/regress/serial_schedule | 1 +
src/test/regress/sql/mv_dependencies.sql | 142 ++++++
src/test/regress/sql/object_address.sql | 4 +-
55 files changed, 3179 insertions(+), 11 deletions(-)
create mode 100644 doc/src/sgml/ref/alter_statistics.sgml
create mode 100644 doc/src/sgml/ref/create_statistics.sgml
create mode 100644 doc/src/sgml/ref/drop_statistics.sgml
create mode 100644 src/backend/commands/statscmds.c
create mode 100644 src/backend/utils/mvstats/Makefile
create mode 100644 src/backend/utils/mvstats/README.dependencies
create mode 100644 src/backend/utils/mvstats/common.c
create mode 100644 src/backend/utils/mvstats/common.h
create mode 100644 src/backend/utils/mvstats/dependencies.c
create mode 100644 src/include/catalog/pg_mv_statistic.h
create mode 100644 src/include/utils/mvstats.h
create mode 100644 src/test/regress/expected/mv_dependencies.out
create mode 100644 src/test/regress/sql/mv_dependencies.sql
diff --git a/doc/src/sgml/ref/allfiles.sgml b/doc/src/sgml/ref/allfiles.sgml
index bf95453..524ed83 100644
--- a/doc/src/sgml/ref/allfiles.sgml
+++ b/doc/src/sgml/ref/allfiles.sgml
@@ -32,6 +32,7 @@ Complete list of usable sgml source files in this directory.
<!ENTITY alterServer SYSTEM "alter_server.sgml">
<!ENTITY alterSequence SYSTEM "alter_sequence.sgml">
<!ENTITY alterSystem SYSTEM "alter_system.sgml">
+<!ENTITY alterStatistics SYSTEM "alter_statistics.sgml">
<!ENTITY alterTable SYSTEM "alter_table.sgml">
<!ENTITY alterTableSpace SYSTEM "alter_tablespace.sgml">
<!ENTITY alterTSConfig SYSTEM "alter_tsconfig.sgml">
@@ -76,6 +77,7 @@ Complete list of usable sgml source files in this directory.
<!ENTITY createSchema SYSTEM "create_schema.sgml">
<!ENTITY createSequence SYSTEM "create_sequence.sgml">
<!ENTITY createServer SYSTEM "create_server.sgml">
+<!ENTITY createStatistics SYSTEM "create_statistics.sgml">
<!ENTITY createTable SYSTEM "create_table.sgml">
<!ENTITY createTableAs SYSTEM "create_table_as.sgml">
<!ENTITY createTableSpace SYSTEM "create_tablespace.sgml">
@@ -119,6 +121,7 @@ Complete list of usable sgml source files in this directory.
<!ENTITY dropSchema SYSTEM "drop_schema.sgml">
<!ENTITY dropSequence SYSTEM "drop_sequence.sgml">
<!ENTITY dropServer SYSTEM "drop_server.sgml">
+<!ENTITY dropStatistics SYSTEM "drop_statistics.sgml">
<!ENTITY dropTable SYSTEM "drop_table.sgml">
<!ENTITY dropTableSpace SYSTEM "drop_tablespace.sgml">
<!ENTITY dropTransform SYSTEM "drop_transform.sgml">
diff --git a/doc/src/sgml/ref/alter_statistics.sgml b/doc/src/sgml/ref/alter_statistics.sgml
new file mode 100644
index 0000000..aa421c0
--- /dev/null
+++ b/doc/src/sgml/ref/alter_statistics.sgml
@@ -0,0 +1,115 @@
+<!--
+doc/src/sgml/ref/alter_statistics.sgml
+PostgreSQL documentation
+-->
+
+<refentry id="SQL-ALTERSTATISTICS">
+ <indexterm zone="sql-alterstatistics">
+ <primary>ALTER STATISTICS</primary>
+ </indexterm>
+
+ <refmeta>
+ <refentrytitle>ALTER STATISTICS</refentrytitle>
+ <manvolnum>7</manvolnum>
+ <refmiscinfo>SQL - Language Statements</refmiscinfo>
+ </refmeta>
+
+ <refnamediv>
+ <refname>ALTER STATISTICS</refname>
+ <refpurpose>
+ change the definition of a multivariate statistics
+ </refpurpose>
+ </refnamediv>
+
+ <refsynopsisdiv>
+<synopsis>
+ALTER STATISTICS <replaceable class="parameter">name</replaceable> OWNER TO { <replaceable class="PARAMETER">new_owner</replaceable> | CURRENT_USER | SESSION_USER }
+ALTER STATISTICS <replaceable class="parameter">name</replaceable> RENAME TO <replaceable class="parameter">new_name</replaceable>
+ALTER STATISTICS <replaceable class="parameter">name</replaceable> SET SCHEMA <replaceable class="parameter">new_schema</replaceable>
+</synopsis>
+ </refsynopsisdiv>
+
+ <refsect1>
+ <title>Description</title>
+
+ <para>
+ <command>ALTER STATISTICS</command> changes the parameters of an existing
+ multivariate statistics. Any parameters not specifically set in the
+ <command>ALTER STATISTICS</command> command retain their prior settings.
+ </para>
+
+ <para>
+ You must own the statistics to use <command>ALTER STATISTICS</>.
+ To change a statistics' schema, you must also have <literal>CREATE</>
+ privilege on the new schema.
+ To alter the owner, you must also be a direct or indirect member of the new
+ owning role, and that role must have <literal>CREATE</literal> privilege on
+ the statistics' schema. (These restrictions enforce that altering the owner
+ doesn't do anything you couldn't do by dropping and recreating the statistics.
+ However, a superuser can alter ownership of any statistics anyway.)
+ </para>
+ </refsect1>
+
+ <refsect1>
+ <title>Parameters</title>
+
+ <para>
+ <variablelist>
+ <varlistentry>
+ <term><replaceable class="parameter">name</replaceable></term>
+ <listitem>
+ <para>
+ The name (optionally schema-qualified) of a statistics to be altered.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><replaceable class="PARAMETER">new_owner</replaceable></term>
+ <listitem>
+ <para>
+ The user name of the new owner of the statistics.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><replaceable class="parameter">new_name</replaceable></term>
+ <listitem>
+ <para>
+ The new name for the statistics.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><replaceable class="parameter">new_schema</replaceable></term>
+ <listitem>
+ <para>
+ The new schema for the statistics.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ </variablelist>
+ </para>
+ </refsect1>
+
+ <refsect1>
+ <title>Compatibility</title>
+
+ <para>
+ There's no <command>ALTER STATISTICS</command> command in the SQL standard.
+ </para>
+ </refsect1>
+
+ <refsect1>
+ <title>See Also</title>
+
+ <simplelist type="inline">
+ <member><xref linkend="sql-createstatistics"></member>
+ <member><xref linkend="sql-dropstatistics"></member>
+ </simplelist>
+ </refsect1>
+
+</refentry>
diff --git a/doc/src/sgml/ref/create_statistics.sgml b/doc/src/sgml/ref/create_statistics.sgml
new file mode 100644
index 0000000..ff09fa5
--- /dev/null
+++ b/doc/src/sgml/ref/create_statistics.sgml
@@ -0,0 +1,198 @@
+<!--
+doc/src/sgml/ref/create_statistics.sgml
+PostgreSQL documentation
+-->
+
+<refentry id="SQL-CREATESTATISTICS">
+ <indexterm zone="sql-createstatistics">
+ <primary>CREATE STATISTICS</primary>
+ </indexterm>
+
+ <refmeta>
+ <refentrytitle>CREATE STATISTICS</refentrytitle>
+ <manvolnum>7</manvolnum>
+ <refmiscinfo>SQL - Language Statements</refmiscinfo>
+ </refmeta>
+
+ <refnamediv>
+ <refname>CREATE STATISTICS</refname>
+ <refpurpose>define a new statistics</refpurpose>
+ </refnamediv>
+
+ <refsynopsisdiv>
+<synopsis>
+CREATE STATISTICS [ IF NOT EXISTS ] <replaceable class="PARAMETER">statistics_name</replaceable> ON <replaceable class="PARAMETER">table_name</replaceable> ( [
+ { <replaceable class="PARAMETER">column_name</replaceable> } ] [, ...])
+[ WITH ( <replaceable class="PARAMETER">statistics_parameter</replaceable> [= <replaceable class="PARAMETER">value</replaceable>] [, ... ] )
+</synopsis>
+
+ </refsynopsisdiv>
+
+ <refsect1 id="SQL-CREATESTATISTICS-description">
+ <title>Description</title>
+
+ <para>
+ <command>CREATE STATISTICS</command> will create a new multivariate
+ statistics on the table. The statistics will be created in the
+ current database. The statistics will be owned by the user issuing
+ the command.
+ </para>
+
+ <para>
+ If a schema name is given (for example, <literal>CREATE STATISTICS
+ myschema.mystat ...</>) then the statistics is created in the specified
+ schema. Otherwise it is created in the current schema. The name of
+ the statistics must be distinct from the name of any other statistics in the
+ same schema.
+ </para>
+
+ <para>
+ To be able to create statistics, you must have <literal>USAGE</literal>
+ privilege on the types of all the referenced columns.
+ </para>
+ </refsect1>
+
+ <refsect1>
+ <title>Parameters</title>
+
+ <variablelist>
+
+ <varlistentry>
+ <term><literal>IF NOT EXISTS</></term>
+ <listitem>
+ <para>
+ Do not throw an error if a statistics with the same name already exists.
+ A notice is issued in this case. Note that there is no guarantee that
+ the existing statistics is anything like the one that would have been
+ created.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><replaceable class="PARAMETER">statistics_name</replaceable></term>
+ <listitem>
+ <para>
+ The name (optionally schema-qualified) of the statistics to be created.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><replaceable class="PARAMETER">table_name</replaceable></term>
+ <listitem>
+ <para>
+ The name (optionally schema-qualified) of the table the statistics should
+ be created on.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><replaceable class="PARAMETER">column_name</replaceable></term>
+ <listitem>
+ <para>
+ The name of a column to be included in the statistics.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><literal>WITH ( <replaceable class="PARAMETER">statistics_parameter</replaceable> [= <replaceable class="PARAMETER">value</replaceable>] [, ... ] )</literal></term>
+ <listitem>
+ <para>
+ ...
+ </para>
+ </listitem>
+ </varlistentry>
+
+ </variablelist>
+
+ <refsect2 id="SQL-CREATESTATISTICS-parameters">
+ <title id="SQL-CREATESTATISTICS-parameters-title">Statistics Parameters</title>
+
+ <indexterm zone="sql-createstatistics-parameters">
+ <primary>statistics parameters</primary>
+ </indexterm>
+
+ <para>
+ The <literal>WITH</> clause can specify <firstterm>statistics parameters</>
+ for statistics. The currently available parameters are listed below.
+ </para>
+
+ <variablelist>
+
+ <varlistentry>
+ <term><literal>dependencies</> (<type>boolean</>)</term>
+ <listitem>
+ <para>
+ Enables functional dependencies for the statistics.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ </variablelist>
+
+ </refsect2>
+ </refsect1>
+
+ <refsect1 id="SQL-CREATESTATISTICS-notes">
+ <title>Notes</title>
+
+ <para>
+ ...
+ </para>
+
+ </refsect1>
+
+
+ <refsect1 id="SQL-CREATESTATISTICS-examples">
+ <title>Examples</title>
+
+ <para>
+ Create table <structname>t1</> with two functionally dependent columns, i.e.
+ knowledge of a value in the first column is sufficient for determining the
+ value in the other column. Then functional dependencies are built on those
+ columns:
+
+<programlisting>
+CREATE TABLE t1 (
+ a int,
+ b int
+);
+
+INSERT INTO t1 SELECT i/100, i/500
+ FROM generate_series(1,1000000) s(i);
+
+CREATE STATISTICS s1 ON t1 (a, b) WITH (dependencies);
+
+ANALYZE t1;
+
+-- valid combination of values
+EXPLAIN ANALYZE SELECT * FROM t1 WHERE (a = 1) AND (b = 1);
+
+-- invalid combination of values
+EXPLAIN ANALYZE SELECT * FROM t1 WHERE (a = 1) AND (b = 2);
+</programlisting>
+ </para>
+
+ </refsect1>
+
+ <refsect1>
+ <title>Compatibility</title>
+
+ <para>
+ There's no <command>CREATE STATISTICS</command> command in the SQL standard.
+ </para>
+ </refsect1>
+
+ <refsect1>
+ <title>See Also</title>
+
+ <simplelist type="inline">
+ <member><xref linkend="sql-alterstatistics"></member>
+ <member><xref linkend="sql-dropstatistics"></member>
+ </simplelist>
+ </refsect1>
+</refentry>
diff --git a/doc/src/sgml/ref/drop_statistics.sgml b/doc/src/sgml/ref/drop_statistics.sgml
new file mode 100644
index 0000000..dd9047a
--- /dev/null
+++ b/doc/src/sgml/ref/drop_statistics.sgml
@@ -0,0 +1,91 @@
+<!--
+doc/src/sgml/ref/drop_statistics.sgml
+PostgreSQL documentation
+-->
+
+<refentry id="SQL-DROPSTATISTICS">
+ <indexterm zone="sql-dropstatistics">
+ <primary>DROP STATISTICS</primary>
+ </indexterm>
+
+ <refmeta>
+ <refentrytitle>DROP STATISTICS</refentrytitle>
+ <manvolnum>7</manvolnum>
+ <refmiscinfo>SQL - Language Statements</refmiscinfo>
+ </refmeta>
+
+ <refnamediv>
+ <refname>DROP STATISTICS</refname>
+ <refpurpose>remove a statistics</refpurpose>
+ </refnamediv>
+
+ <refsynopsisdiv>
+<synopsis>
+DROP STATISTICS [ IF EXISTS ] <replaceable class="PARAMETER">name</replaceable> [, ...]
+</synopsis>
+ </refsynopsisdiv>
+
+ <refsect1>
+ <title>Description</title>
+
+ <para>
+ <command>DROP STATISTICS</command> removes statistics from the database.
+ Only the statistics owner, the schema owner, and superuser can drop a
+ statistics.
+ </para>
+
+ </refsect1>
+
+ <refsect1>
+ <title>Parameters</title>
+
+ <variablelist>
+ <varlistentry>
+ <term><literal>IF EXISTS</literal></term>
+ <listitem>
+ <para>
+ Do not throw an error if the statistics does not exist. A notice is
+ issued in this case.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><replaceable class="PARAMETER">name</replaceable></term>
+ <listitem>
+ <para>
+ The name (optionally schema-qualified) of the statistics to drop.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ </variablelist>
+ </refsect1>
+
+ <refsect1>
+ <title>Examples</title>
+
+ <para>
+ ...
+ </para>
+
+ </refsect1>
+
+ <refsect1>
+ <title>Compatibility</title>
+
+ <para>
+ There's no <command>DROP STATISTICS</command> command in the SQL standard.
+ </para>
+ </refsect1>
+
+ <refsect1>
+ <title>See Also</title>
+
+ <simplelist type="inline">
+ <member><xref linkend="sql-alterstatistics"></member>
+ <member><xref linkend="sql-createstatistics"></member>
+ </simplelist>
+ </refsect1>
+
+</refentry>
diff --git a/doc/src/sgml/reference.sgml b/doc/src/sgml/reference.sgml
index 03020df..2b07b2d 100644
--- a/doc/src/sgml/reference.sgml
+++ b/doc/src/sgml/reference.sgml
@@ -104,6 +104,7 @@
&createSchema;
&createSequence;
&createServer;
+ &createStatistics;
&createTable;
&createTableAs;
&createTableSpace;
@@ -147,6 +148,7 @@
&dropSchema;
&dropSequence;
&dropServer;
+ &dropStatistics;
&dropTable;
&dropTableSpace;
&dropTSConfig;
diff --git a/src/backend/catalog/Makefile b/src/backend/catalog/Makefile
index 25130ec..058b8a9 100644
--- a/src/backend/catalog/Makefile
+++ b/src/backend/catalog/Makefile
@@ -32,6 +32,7 @@ POSTGRES_BKI_SRCS = $(addprefix $(top_srcdir)/src/include/catalog/,\
pg_attrdef.h pg_constraint.h pg_inherits.h pg_index.h pg_operator.h \
pg_opfamily.h pg_opclass.h pg_am.h pg_amop.h pg_amproc.h \
pg_language.h pg_largeobject_metadata.h pg_largeobject.h pg_aggregate.h \
+ pg_mv_statistic.h \
pg_statistic.h pg_rewrite.h pg_trigger.h pg_event_trigger.h pg_description.h \
pg_cast.h pg_enum.h pg_namespace.h pg_conversion.h pg_depend.h \
pg_database.h pg_db_role_setting.h pg_tablespace.h pg_pltemplate.h \
diff --git a/src/backend/catalog/aclchk.c b/src/backend/catalog/aclchk.c
index 0f3bc07..e21aacd 100644
--- a/src/backend/catalog/aclchk.c
+++ b/src/backend/catalog/aclchk.c
@@ -38,6 +38,7 @@
#include "catalog/pg_language.h"
#include "catalog/pg_largeobject.h"
#include "catalog/pg_largeobject_metadata.h"
+#include "catalog/pg_mv_statistic.h"
#include "catalog/pg_namespace.h"
#include "catalog/pg_opclass.h"
#include "catalog/pg_operator.h"
@@ -5021,6 +5022,32 @@ pg_extension_ownercheck(Oid ext_oid, Oid roleid)
}
/*
+ * Ownership check for a multivariate statistics (specified by OID).
+ */
+bool
+pg_statistics_ownercheck(Oid stat_oid, Oid roleid)
+{
+ HeapTuple tuple;
+ Oid ownerId;
+
+ /* Superusers bypass all permission checking. */
+ if (superuser_arg(roleid))
+ return true;
+
+ tuple = SearchSysCache1(MVSTATOID, ObjectIdGetDatum(stat_oid));
+ if (!HeapTupleIsValid(tuple))
+ ereport(ERROR,
+ (errcode(ERRCODE_UNDEFINED_OBJECT),
+ errmsg("statistics with OID %u does not exist", stat_oid)));
+
+ ownerId = ((Form_pg_mv_statistic) GETSTRUCT(tuple))->staowner;
+
+ ReleaseSysCache(tuple);
+
+ return has_privs_of_role(roleid, ownerId);
+}
+
+/*
* Check whether specified role has CREATEROLE privilege (or is a superuser)
*
* Note: roles do not have owners per se; instead we use this test in
diff --git a/src/backend/catalog/dependency.c b/src/backend/catalog/dependency.c
index c48e37b..8200454 100644
--- a/src/backend/catalog/dependency.c
+++ b/src/backend/catalog/dependency.c
@@ -40,6 +40,7 @@
#include "catalog/pg_foreign_server.h"
#include "catalog/pg_language.h"
#include "catalog/pg_largeobject.h"
+#include "catalog/pg_mv_statistic.h"
#include "catalog/pg_namespace.h"
#include "catalog/pg_opclass.h"
#include "catalog/pg_operator.h"
@@ -160,7 +161,8 @@ static const Oid object_classes[] = {
ExtensionRelationId, /* OCLASS_EXTENSION */
EventTriggerRelationId, /* OCLASS_EVENT_TRIGGER */
PolicyRelationId, /* OCLASS_POLICY */
- TransformRelationId /* OCLASS_TRANSFORM */
+ TransformRelationId, /* OCLASS_TRANSFORM */
+ MvStatisticRelationId /* OCLASS_STATISTICS */
};
@@ -1272,6 +1274,10 @@ doDeletion(const ObjectAddress *object, int flags)
DropTransformById(object->objectId);
break;
+ case OCLASS_STATISTICS:
+ RemoveStatisticsById(object->objectId);
+ break;
+
default:
elog(ERROR, "unrecognized object class: %u",
object->classId);
@@ -2415,6 +2421,9 @@ getObjectClass(const ObjectAddress *object)
case TransformRelationId:
return OCLASS_TRANSFORM;
+
+ case MvStatisticRelationId:
+ return OCLASS_STATISTICS;
}
/* shouldn't get here */
diff --git a/src/backend/catalog/heap.c b/src/backend/catalog/heap.c
index e997b57..47ec8cc 100644
--- a/src/backend/catalog/heap.c
+++ b/src/backend/catalog/heap.c
@@ -47,6 +47,7 @@
#include "catalog/pg_constraint_fn.h"
#include "catalog/pg_foreign_table.h"
#include "catalog/pg_inherits.h"
+#include "catalog/pg_mv_statistic.h"
#include "catalog/pg_namespace.h"
#include "catalog/pg_statistic.h"
#include "catalog/pg_tablespace.h"
@@ -1613,7 +1614,10 @@ RemoveAttributeById(Oid relid, AttrNumber attnum)
heap_close(attr_rel, RowExclusiveLock);
if (attnum > 0)
+ {
RemoveStatistics(relid, attnum);
+ RemoveMVStatistics(relid, attnum);
+ }
relation_close(rel, NoLock);
}
@@ -1841,6 +1845,11 @@ heap_drop_with_catalog(Oid relid)
RemoveStatistics(relid, 0);
/*
+ * delete multi-variate statistics
+ */
+ RemoveMVStatistics(relid, 0);
+
+ /*
* delete attribute tuples
*/
DeleteAttributeTuples(relid);
@@ -2692,6 +2701,99 @@ RemoveStatistics(Oid relid, AttrNumber attnum)
/*
+ * RemoveMVStatistics --- remove entries in pg_mv_statistic for a rel
+ *
+ * If attnum is zero, remove all entries for rel; else remove only the one(s)
+ * for that column.
+ */
+void
+RemoveMVStatistics(Oid relid, AttrNumber attnum)
+{
+ Relation pgmvstatistic;
+ TupleDesc tupdesc = NULL;
+ SysScanDesc scan;
+ ScanKeyData key;
+ HeapTuple tuple;
+
+ /*
+ * When dropping a column, we'll drop statistics with a single
+ * remaining (undropped column). To do that, we need the tuple
+ * descriptor.
+ *
+ * We already have the relation locked (as we're running ALTER
+ * TABLE ... DROP COLUMN), so we'll just get the descriptor here.
+ */
+ if (attnum != 0)
+ {
+ Relation rel = relation_open(relid, NoLock);
+
+ /* multivariate stats are supported on tables and matviews */
+ if (rel->rd_rel->relkind == RELKIND_RELATION ||
+ rel->rd_rel->relkind == RELKIND_MATVIEW)
+ tupdesc = RelationGetDescr(rel);
+
+ relation_close(rel, NoLock);
+ }
+
+ if (tupdesc == NULL)
+ return;
+
+ pgmvstatistic = heap_open(MvStatisticRelationId, RowExclusiveLock);
+
+ ScanKeyInit(&key,
+ Anum_pg_mv_statistic_starelid,
+ BTEqualStrategyNumber, F_OIDEQ,
+ ObjectIdGetDatum(relid));
+
+ scan = systable_beginscan(pgmvstatistic,
+ MvStatisticRelidIndexId,
+ true, NULL, 1, &key);
+
+ /* we must loop even when attnum != 0, in case of inherited stats */
+ while (HeapTupleIsValid(tuple = systable_getnext(scan)))
+ {
+ bool delete = true;
+
+ if (attnum != 0)
+ {
+ Datum adatum;
+ bool isnull;
+ int i;
+ int ncolumns = 0;
+ ArrayType *arr;
+ int16 *attnums;
+
+ /* get the columns */
+ adatum = SysCacheGetAttr(MVSTATOID, tuple,
+ Anum_pg_mv_statistic_stakeys, &isnull);
+ Assert(!isnull);
+
+ arr = DatumGetArrayTypeP(adatum);
+ attnums = (int16*)ARR_DATA_PTR(arr);
+
+ for (i = 0; i < ARR_DIMS(arr)[0]; i++)
+ {
+ /* count the column unless it has been / is being dropped */
+ if ((! tupdesc->attrs[attnums[i]-1]->attisdropped) &&
+ (attnums[i] != attnum))
+ ncolumns += 1;
+ }
+
+ /* delete if there are less than two attributes */
+ delete = (ncolumns < 2);
+ }
+
+ if (delete)
+ simple_heap_delete(pgmvstatistic, &tuple->t_self);
+ }
+
+ systable_endscan(scan);
+
+ heap_close(pgmvstatistic, RowExclusiveLock);
+}
+
+
+/*
* RelationTruncateIndexes - truncate all indexes associated
* with the heap relation to zero tuples.
*
diff --git a/src/backend/catalog/namespace.c b/src/backend/catalog/namespace.c
index 446b2ac..dfd5bef 100644
--- a/src/backend/catalog/namespace.c
+++ b/src/backend/catalog/namespace.c
@@ -4201,3 +4201,54 @@ pg_is_other_temp_schema(PG_FUNCTION_ARGS)
PG_RETURN_BOOL(isOtherTempNamespace(oid));
}
+
+Oid
+get_statistics_oid(List *names, bool missing_ok)
+{
+ char *schemaname;
+ char *stats_name;
+ Oid namespaceId;
+ Oid stats_oid = InvalidOid;
+ ListCell *l;
+
+ /* deconstruct the name list */
+ DeconstructQualifiedName(names, &schemaname, &stats_name);
+
+ if (schemaname)
+ {
+ /* use exact schema given */
+ namespaceId = LookupExplicitNamespace(schemaname, missing_ok);
+ if (missing_ok && !OidIsValid(namespaceId))
+ stats_oid = InvalidOid;
+ else
+ stats_oid = GetSysCacheOid2(MVSTATNAMENSP,
+ PointerGetDatum(stats_name),
+ ObjectIdGetDatum(namespaceId));
+ }
+ else
+ {
+ /* search for it in search path */
+ recomputeNamespacePath();
+
+ foreach(l, activeSearchPath)
+ {
+ namespaceId = lfirst_oid(l);
+
+ if (namespaceId == myTempNamespace)
+ continue; /* do not look in temp namespace */
+ stats_oid = GetSysCacheOid2(MVSTATNAMENSP,
+ PointerGetDatum(stats_name),
+ ObjectIdGetDatum(namespaceId));
+ if (OidIsValid(stats_oid))
+ break;
+ }
+ }
+
+ if (!OidIsValid(stats_oid) && !missing_ok)
+ ereport(ERROR,
+ (errcode(ERRCODE_UNDEFINED_OBJECT),
+ errmsg("statistics \"%s\" does not exist",
+ NameListToString(names))));
+
+ return stats_oid;
+}
diff --git a/src/backend/catalog/objectaddress.c b/src/backend/catalog/objectaddress.c
index d2aaa6d..c13a569 100644
--- a/src/backend/catalog/objectaddress.c
+++ b/src/backend/catalog/objectaddress.c
@@ -39,6 +39,7 @@
#include "catalog/pg_language.h"
#include "catalog/pg_largeobject.h"
#include "catalog/pg_largeobject_metadata.h"
+#include "catalog/pg_mv_statistic.h"
#include "catalog/pg_namespace.h"
#include "catalog/pg_opclass.h"
#include "catalog/pg_opfamily.h"
@@ -438,9 +439,22 @@ static const ObjectPropertyType ObjectProperty[] =
Anum_pg_type_typacl,
ACL_KIND_TYPE,
true
+ },
+ {
+ MvStatisticRelationId,
+ MvStatisticOidIndexId,
+ MVSTATOID,
+ MVSTATNAMENSP,
+ Anum_pg_mv_statistic_staname,
+ Anum_pg_mv_statistic_stanamespace,
+ Anum_pg_mv_statistic_staowner,
+ InvalidAttrNumber, /* no ACL (same as relation) */
+ -1, /* no ACL */
+ true
}
};
+
/*
* This struct maps the string object types as returned by
* getObjectTypeDescription into ObjType enum values. Note that some enum
@@ -640,6 +654,10 @@ static const struct object_type_map
/* OCLASS_TRANSFORM */
{
"transform", OBJECT_TRANSFORM
+ },
+ /* OBJECT_STATISTICS */
+ {
+ "statistics", OBJECT_STATISTICS
}
};
@@ -913,6 +931,11 @@ get_object_address(ObjectType objtype, List *objname, List *objargs,
address = get_object_address_defacl(objname, objargs,
missing_ok);
break;
+ case OBJECT_STATISTICS:
+ address.classId = MvStatisticRelationId;
+ address.objectId = get_statistics_oid(objname, missing_ok);
+ address.objectSubId = 0;
+ break;
default:
elog(ERROR, "unrecognized objtype: %d", (int) objtype);
/* placate compiler, in case it thinks elog might return */
@@ -2185,6 +2208,10 @@ check_object_ownership(Oid roleid, ObjectType objtype, ObjectAddress address,
(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
errmsg("must be superuser")));
break;
+ case OBJECT_STATISTICS:
+ if (!pg_statistics_ownercheck(address.objectId, roleid))
+ aclcheck_error_type(ACLCHECK_NOT_OWNER, address.objectId);
+ break;
default:
elog(ERROR, "unrecognized object type: %d",
(int) objtype);
@@ -3610,6 +3637,10 @@ getObjectTypeDescription(const ObjectAddress *object)
appendStringInfoString(&buffer, "transform");
break;
+ case OCLASS_STATISTICS:
+ appendStringInfoString(&buffer, "statistics");
+ break;
+
default:
appendStringInfo(&buffer, "unrecognized %u", object->classId);
break;
@@ -4566,6 +4597,29 @@ getObjectIdentityParts(const ObjectAddress *object,
}
break;
+ case OCLASS_STATISTICS:
+ {
+ HeapTuple tup;
+ Form_pg_mv_statistic formStatistic;
+ char *schema;
+
+ tup = SearchSysCache1(MVSTATOID,
+ ObjectIdGetDatum(object->objectId));
+ if (!HeapTupleIsValid(tup))
+ elog(ERROR, "cache lookup failed for statistics %u",
+ object->objectId);
+ formStatistic = (Form_pg_mv_statistic) GETSTRUCT(tup);
+ schema = get_namespace_name_or_temp(formStatistic->stanamespace);
+ appendStringInfoString(&buffer,
+ quote_qualified_identifier(schema,
+ NameStr(formStatistic->staname)));
+ if (objname)
+ *objname = list_make2(schema,
+ pstrdup(NameStr(formStatistic->staname)));
+ ReleaseSysCache(tup);
+ break;
+ }
+
default:
appendStringInfo(&buffer, "unrecognized object %u %u %d",
object->classId,
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 84aa061..31dbb2c 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -158,6 +158,17 @@ CREATE VIEW pg_indexes AS
LEFT JOIN pg_tablespace T ON (T.oid = I.reltablespace)
WHERE C.relkind IN ('r', 'm') AND I.relkind = 'i';
+CREATE VIEW pg_mv_stats AS
+ SELECT
+ N.nspname AS schemaname,
+ C.relname AS tablename,
+ S.staname AS staname,
+ S.stakeys AS attnums,
+ length(S.stadeps) as depsbytes,
+ pg_mv_stats_dependencies_info(S.stadeps) as depsinfo
+ FROM (pg_mv_statistic S JOIN pg_class C ON (C.oid = S.starelid))
+ LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace);
+
CREATE VIEW pg_stats WITH (security_barrier) AS
SELECT
nspname AS schemaname,
diff --git a/src/backend/commands/Makefile b/src/backend/commands/Makefile
index b1ac704..5151001 100644
--- a/src/backend/commands/Makefile
+++ b/src/backend/commands/Makefile
@@ -18,8 +18,8 @@ OBJS = aggregatecmds.o alter.o analyze.o async.o cluster.o comment.o \
event_trigger.o explain.o extension.o foreigncmds.o functioncmds.o \
indexcmds.o lockcmds.o matview.o operatorcmds.o opclasscmds.o \
policy.o portalcmds.o prepare.o proclang.o \
- schemacmds.o seclabel.o sequence.o tablecmds.o tablespace.o trigger.o \
- tsearchcmds.o typecmds.o user.o vacuum.o vacuumlazy.o \
- variable.o view.o
+ schemacmds.o seclabel.o sequence.o statscmds.o \
+ tablecmds.o tablespace.o trigger.o tsearchcmds.o typecmds.o \
+ user.o vacuum.o vacuumlazy.o variable.o view.o
include $(top_srcdir)/src/backend/common.mk
diff --git a/src/backend/commands/alter.c b/src/backend/commands/alter.c
index 5af0f2f..89985499 100644
--- a/src/backend/commands/alter.c
+++ b/src/backend/commands/alter.c
@@ -359,6 +359,7 @@ ExecRenameStmt(RenameStmt *stmt)
case OBJECT_OPCLASS:
case OBJECT_OPFAMILY:
case OBJECT_LANGUAGE:
+ case OBJECT_STATISTICS:
case OBJECT_TSCONFIGURATION:
case OBJECT_TSDICTIONARY:
case OBJECT_TSPARSER:
@@ -437,6 +438,7 @@ ExecAlterObjectSchemaStmt(AlterObjectSchemaStmt *stmt,
case OBJECT_OPERATOR:
case OBJECT_OPCLASS:
case OBJECT_OPFAMILY:
+ case OBJECT_STATISTICS:
case OBJECT_TSCONFIGURATION:
case OBJECT_TSDICTIONARY:
case OBJECT_TSPARSER:
@@ -745,6 +747,7 @@ ExecAlterOwnerStmt(AlterOwnerStmt *stmt)
case OBJECT_OPERATOR:
case OBJECT_OPCLASS:
case OBJECT_OPFAMILY:
+ case OBJECT_STATISTICS:
case OBJECT_TABLESPACE:
case OBJECT_TSDICTIONARY:
case OBJECT_TSCONFIGURATION:
diff --git a/src/backend/commands/analyze.c b/src/backend/commands/analyze.c
index 8a5f07c..9087532 100644
--- a/src/backend/commands/analyze.c
+++ b/src/backend/commands/analyze.c
@@ -17,6 +17,7 @@
#include <math.h>
#include "access/multixact.h"
+#include "access/sysattr.h"
#include "access/transam.h"
#include "access/tupconvert.h"
#include "access/tuptoaster.h"
@@ -27,6 +28,7 @@
#include "catalog/indexing.h"
#include "catalog/pg_collation.h"
#include "catalog/pg_inherits_fn.h"
+#include "catalog/pg_mv_statistic.h"
#include "catalog/pg_namespace.h"
#include "commands/dbcommands.h"
#include "commands/tablecmds.h"
@@ -45,10 +47,13 @@
#include "storage/procarray.h"
#include "utils/acl.h"
#include "utils/attoptcache.h"
+#include "utils/builtins.h"
#include "utils/datum.h"
+#include "utils/fmgroids.h"
#include "utils/guc.h"
#include "utils/lsyscache.h"
#include "utils/memutils.h"
+#include "utils/mvstats.h"
#include "utils/pg_rusage.h"
#include "utils/sampling.h"
#include "utils/sortsupport.h"
@@ -460,6 +465,19 @@ do_analyze_rel(Relation onerel, int options, VacuumParams *params,
* all analyzable columns. We use a lower bound of 100 rows to avoid
* possible overflow in Vitter's algorithm. (Note: that will also be the
* target in the corner case where there are no analyzable columns.)
+ *
+ * FIXME This sample sizing is mostly OK when computing stats for
+ * individual columns, but when computing multivariate stats
+ * (histograms, mcv, ...) it's rather
+ * insufficient. For stats on multiple columns / complex stats
+ * we need larger sample sizes, because we need to build more
+ * detailed stats (more MCV items / histogram buckets) to get
+ * good accuracy. Maybe it'd be appropriate to use samples
+ * proportional to the table (say, 0.5% - 1%) instead of a
+ * fixed size might be more appropriate. Also, this should be
+ * bound to the requested statistics size - e.g. number of MCV
+ * items or histogram buckets should require several sample
+ * rows per item/bucket (so the sample should be k*size).
*/
targrows = 100;
for (i = 0; i < attr_cnt; i++)
@@ -562,6 +580,9 @@ do_analyze_rel(Relation onerel, int options, VacuumParams *params,
update_attstats(RelationGetRelid(Irel[ind]), false,
thisdata->attr_cnt, thisdata->vacattrstats);
}
+
+ /* Build multivariate stats (if there are any). */
+ build_mv_stats(onerel, numrows, rows, attr_cnt, vacattrstats);
}
/*
diff --git a/src/backend/commands/dropcmds.c b/src/backend/commands/dropcmds.c
index 522027a..cd65b58 100644
--- a/src/backend/commands/dropcmds.c
+++ b/src/backend/commands/dropcmds.c
@@ -292,6 +292,10 @@ does_not_exist_skipping(ObjectType objtype, List *objname, List *objargs)
msg = gettext_noop("schema \"%s\" does not exist, skipping");
name = NameListToString(objname);
break;
+ case OBJECT_STATISTICS:
+ msg = gettext_noop("statistics \"%s\" does not exist, skipping");
+ name = NameListToString(objname);
+ break;
case OBJECT_TSPARSER:
if (!schema_does_not_exist_skipping(objname, &msg, &name))
{
diff --git a/src/backend/commands/event_trigger.c b/src/backend/commands/event_trigger.c
index 9e32f8d..09061bb 100644
--- a/src/backend/commands/event_trigger.c
+++ b/src/backend/commands/event_trigger.c
@@ -110,6 +110,7 @@ static event_trigger_support_data event_trigger_support[] = {
{"SCHEMA", true},
{"SEQUENCE", true},
{"SERVER", true},
+ {"STATISTICS", true},
{"TABLE", true},
{"TABLESPACE", false},
{"TRANSFORM", true},
@@ -1106,6 +1107,7 @@ EventTriggerSupportsObjectType(ObjectType obtype)
case OBJECT_RULE:
case OBJECT_SCHEMA:
case OBJECT_SEQUENCE:
+ case OBJECT_STATISTICS:
case OBJECT_TABCONSTRAINT:
case OBJECT_TABLE:
case OBJECT_TRANSFORM:
@@ -1167,6 +1169,7 @@ EventTriggerSupportsObjectClass(ObjectClass objclass)
case OCLASS_DEFACL:
case OCLASS_EXTENSION:
case OCLASS_POLICY:
+ case OCLASS_STATISTICS:
return true;
}
diff --git a/src/backend/commands/statscmds.c b/src/backend/commands/statscmds.c
new file mode 100644
index 0000000..f43b053
--- /dev/null
+++ b/src/backend/commands/statscmds.c
@@ -0,0 +1,277 @@
+/*-------------------------------------------------------------------------
+ *
+ * statscmds.c
+ * Commands for creating and altering multivariate statistics
+ *
+ * Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ * src/backend/commands/statscmds.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/relscan.h"
+#include "catalog/dependency.h"
+#include "catalog/indexing.h"
+#include "catalog/namespace.h"
+#include "catalog/pg_mv_statistic.h"
+#include "catalog/pg_namespace.h"
+#include "commands/defrem.h"
+#include "miscadmin.h"
+#include "utils/builtins.h"
+#include "utils/inval.h"
+#include "utils/memutils.h"
+#include "utils/mvstats.h"
+#include "utils/rel.h"
+#include "utils/syscache.h"
+
+
+/* used for sorting the attnums in ExecCreateStatistics */
+static int compare_int16(const void *a, const void *b)
+{
+ return memcmp(a, b, sizeof(int16));
+}
+
+/*
+ * Implements the CREATE STATISTICS name ON table (columns) WITH (options)
+ *
+ * TODO Check that the types support sort, although maybe we can live
+ * without it (and only build MCV list / association rules).
+ *
+ * TODO This should probably check for duplicate stats (i.e. same
+ * keys, same options). Although maybe it's useful to have
+ * multiple stats on the same columns with different options
+ * (say, a detailed MCV-only stats for some queries, histogram
+ * for others, etc.)
+ */
+ObjectAddress
+CreateStatistics(CreateStatsStmt *stmt)
+{
+ int i, j;
+ ListCell *l;
+ int16 attnums[INDEX_MAX_KEYS];
+ int numcols = 0;
+ ObjectAddress address = InvalidObjectAddress;
+ char *namestr;
+ NameData staname;
+ Oid statoid;
+ Oid namespaceId;
+
+ HeapTuple htup;
+ Datum values[Natts_pg_mv_statistic];
+ bool nulls[Natts_pg_mv_statistic];
+ int2vector *stakeys;
+ Relation mvstatrel;
+ Relation rel;
+ ObjectAddress parentobject, childobject;
+
+ /* by default build nothing */
+ bool build_dependencies = false;
+
+ Assert(IsA(stmt, CreateStatsStmt));
+
+ /* resolve the pieces of the name (namespace etc.) */
+ namespaceId = QualifiedNameGetCreationNamespace(stmt->defnames, &namestr);
+ namestrcpy(&staname, namestr);
+
+ /*
+ * If if_not_exists was given and the statistics already exists, bail out.
+ */
+ if (stmt->if_not_exists &&
+ SearchSysCacheExists2(MVSTATNAMENSP,
+ PointerGetDatum(&staname),
+ ObjectIdGetDatum(namespaceId)))
+ {
+ ereport(NOTICE,
+ (errcode(ERRCODE_DUPLICATE_OBJECT),
+ errmsg("statistics \"%s\" already exists, skipping",
+ namestr)));
+ return InvalidObjectAddress;
+ }
+
+ rel = heap_openrv(stmt->relation, AccessExclusiveLock);
+
+ /* transform the column names to attnum values */
+
+ foreach(l, stmt->keys)
+ {
+ char *attname = strVal(lfirst(l));
+ HeapTuple atttuple;
+
+ atttuple = SearchSysCacheAttName(RelationGetRelid(rel), attname);
+
+ if (!HeapTupleIsValid(atttuple))
+ ereport(ERROR,
+ (errcode(ERRCODE_UNDEFINED_COLUMN),
+ errmsg("column \"%s\" referenced in statistics does not exist",
+ attname)));
+
+ /* more than MVSTATS_MAX_DIMENSIONS columns not allowed */
+ if (numcols >= MVSTATS_MAX_DIMENSIONS)
+ ereport(ERROR,
+ (errcode(ERRCODE_TOO_MANY_COLUMNS),
+ errmsg("cannot have more than %d columns in statistics",
+ MVSTATS_MAX_DIMENSIONS)));
+
+ attnums[numcols] = ((Form_pg_attribute) GETSTRUCT(atttuple))->attnum;
+ ReleaseSysCache(atttuple);
+ numcols++;
+ }
+
+ /*
+ * Check the lower bound (at least 2 columns); the upper bound was
+ * already checked in the loop above.
+ */
+ if (numcols < 2)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
+ errmsg("multivariate stats require 2 or more columns")));
+
+ /* look for duplicate columns */
+ for (i = 0; i < numcols; i++)
+ for (j = 0; j < i; j++)
+ if (attnums[i] == attnums[j])
+ ereport(ERROR,
+ (errcode(ERRCODE_DUPLICATE_COLUMN),
+ errmsg("duplicate column name in statistics definition")));
+
+ /* parse the statistics options */
+ foreach (l, stmt->options)
+ {
+ DefElem *opt = (DefElem*)lfirst(l);
+
+ if (strcmp(opt->defname, "dependencies") == 0)
+ build_dependencies = defGetBoolean(opt);
+ else
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("unrecognized STATISTICS option \"%s\"",
+ opt->defname)));
+ }
+
+ /* check that at least some statistics were requested */
+ if (! build_dependencies)
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("no statistics type (dependencies) was requested")));
+
+ /* sort the attnums and build int2vector */
+ qsort(attnums, numcols, sizeof(int16), compare_int16);
+ stakeys = buildint2vector(attnums, numcols);
+
+ /*
+ * Okay, let's create the pg_mv_statistic entry.
+ */
+ memset(values, 0, sizeof(values));
+ memset(nulls, false, sizeof(nulls));
+
+ /* no stats collected yet, so just the keys */
+ values[Anum_pg_mv_statistic_starelid-1] = ObjectIdGetDatum(RelationGetRelid(rel));
+ values[Anum_pg_mv_statistic_staname -1] = NameGetDatum(&staname);
+ values[Anum_pg_mv_statistic_stanamespace -1] = ObjectIdGetDatum(namespaceId);
+ values[Anum_pg_mv_statistic_staowner-1] = ObjectIdGetDatum(GetUserId());
+
+ values[Anum_pg_mv_statistic_stakeys -1] = PointerGetDatum(stakeys);
+
+ values[Anum_pg_mv_statistic_deps_enabled -1] = BoolGetDatum(build_dependencies);
+
+ nulls[Anum_pg_mv_statistic_stadeps -1] = true;
+
+ /* insert the tuple into pg_mv_statistic */
+ mvstatrel = heap_open(MvStatisticRelationId, RowExclusiveLock);
+
+ htup = heap_form_tuple(mvstatrel->rd_att, values, nulls);
+
+ simple_heap_insert(mvstatrel, htup);
+
+ CatalogUpdateIndexes(mvstatrel, htup);
+
+ statoid = HeapTupleGetOid(htup);
+
+ heap_freetuple(htup);
+
+
+ /*
+ * Store a dependency too, so that statistics are dropped on DROP TABLE
+ */
+ parentobject.classId = RelationRelationId;
+ parentobject.objectId = RelationGetRelid(rel);
+ parentobject.objectSubId = 0;
+ childobject.classId = MvStatisticRelationId;
+ childobject.objectId = statoid;
+ childobject.objectSubId = 0;
+
+ recordDependencyOn(&childobject, &parentobject, DEPENDENCY_AUTO);
+
+ /*
+ * Also record dependency on the schema (to drop statistics on DROP SCHEMA)
+ */
+ parentobject.classId = NamespaceRelationId;
+ parentobject.objectId = namespaceId;
+ parentobject.objectSubId = 0;
+ childobject.classId = MvStatisticRelationId;
+ childobject.objectId = statoid;
+ childobject.objectSubId = 0;
+
+ recordDependencyOn(&childobject, &parentobject, DEPENDENCY_AUTO);
+
+
+ heap_close(mvstatrel, RowExclusiveLock);
+
+ /*
+ * Invalidate relcache so that others see the new statistics (do this
+ * before releasing the relation).
+ */
+ CacheInvalidateRelcache(rel);
+
+ relation_close(rel, NoLock);
+
+ ObjectAddressSet(address, MvStatisticRelationId, statoid);
+
+ return address;
+}
+
+
+/*
+ * Implements DROP STATISTICS
+ *
+ * Removes the pg_mv_statistic entry with the given OID, and invalidates
+ * the relcache of the relation the statistics were defined on.
+ */
+void
+RemoveStatisticsById(Oid statsOid)
+{
+ Relation relation;
+ Oid relid;
+ Relation rel;
+ HeapTuple tup;
+ Form_pg_mv_statistic mvstat;
+
+ /*
+ * Delete the pg_mv_statistic tuple.
+ */
+ relation = heap_open(MvStatisticRelationId, RowExclusiveLock);
+
+ tup = SearchSysCache1(MVSTATOID, ObjectIdGetDatum(statsOid));
+ if (!HeapTupleIsValid(tup)) /* should not happen */
+ elog(ERROR, "cache lookup failed for statistics %u", statsOid);
+
+ mvstat = (Form_pg_mv_statistic) GETSTRUCT(tup);
+ relid = mvstat->starelid;
+
+ rel = heap_open(relid, AccessExclusiveLock);
+
+ simple_heap_delete(relation, &tup->t_self);
+
+ CacheInvalidateRelcache(rel);
+
+ ReleaseSysCache(tup);
+
+ heap_close(relation, RowExclusiveLock);
+ heap_close(rel, NoLock);
+}
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index df7c2fa..3b7c87f 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -4124,6 +4124,20 @@ _copyAlterPolicyStmt(const AlterPolicyStmt *from)
return newnode;
}
+static CreateStatsStmt *
+_copyCreateStatsStmt(const CreateStatsStmt *from)
+{
+ CreateStatsStmt *newnode = makeNode(CreateStatsStmt);
+
+ COPY_NODE_FIELD(defnames);
+ COPY_NODE_FIELD(relation);
+ COPY_NODE_FIELD(keys);
+ COPY_NODE_FIELD(options);
+ COPY_SCALAR_FIELD(if_not_exists);
+
+ return newnode;
+}
+
/* ****************************************************************
* pg_list.h copy functions
* ****************************************************************
@@ -4999,6 +5013,9 @@ copyObject(const void *from)
case T_CommonTableExpr:
retval = _copyCommonTableExpr(from);
break;
+ case T_CreateStatsStmt:
+ retval = _copyCreateStatsStmt(from);
+ break;
case T_FuncWithArgs:
retval = _copyFuncWithArgs(from);
break;
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index eb0fc1e..07206d7 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -2153,6 +2153,21 @@ _outIndexOptInfo(StringInfo str, const IndexOptInfo *node)
}
static void
+_outMVStatisticInfo(StringInfo str, const MVStatisticInfo *node)
+{
+ WRITE_NODE_TYPE("MVSTATISTICINFO");
+
+ /* NB: this isn't a complete set of fields */
+ WRITE_OID_FIELD(mvoid);
+
+ /* enabled statistics */
+ WRITE_BOOL_FIELD(deps_enabled);
+
+ /* built/available statistics */
+ WRITE_BOOL_FIELD(deps_built);
+}
+
+static void
_outEquivalenceClass(StringInfo str, const EquivalenceClass *node)
{
/*
@@ -3636,6 +3651,9 @@ _outNode(StringInfo str, const void *obj)
case T_PlannerParamItem:
_outPlannerParamItem(str, obj);
break;
+ case T_MVStatisticInfo:
+ _outMVStatisticInfo(str, obj);
+ break;
case T_ExtensibleNode:
_outExtensibleNode(str, obj);
diff --git a/src/backend/optimizer/util/plancat.c b/src/backend/optimizer/util/plancat.c
index ad715bb..7fb2088 100644
--- a/src/backend/optimizer/util/plancat.c
+++ b/src/backend/optimizer/util/plancat.c
@@ -28,6 +28,7 @@
#include "catalog/dependency.h"
#include "catalog/heap.h"
#include "catalog/pg_am.h"
+#include "catalog/pg_mv_statistic.h"
#include "foreign/fdwapi.h"
#include "miscadmin.h"
#include "nodes/makefuncs.h"
@@ -40,7 +41,9 @@
#include "parser/parsetree.h"
#include "rewrite/rewriteManip.h"
#include "storage/bufmgr.h"
+#include "utils/builtins.h"
#include "utils/lsyscache.h"
+#include "utils/syscache.h"
#include "utils/rel.h"
#include "utils/snapmgr.h"
@@ -94,6 +97,7 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
Relation relation;
bool hasindex;
List *indexinfos = NIL;
+ List *stainfos = NIL;
/*
* We need not lock the relation since it was already locked, either by
@@ -387,6 +391,61 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
rel->indexlist = indexinfos;
+ /* load info about multivariate statistics defined on this relation */
+ {
+ List *mvstatoidlist;
+ ListCell *l;
+
+ mvstatoidlist = RelationGetMVStatList(relation);
+
+ foreach(l, mvstatoidlist)
+ {
+ ArrayType *arr;
+ Datum adatum;
+ bool isnull;
+ Oid mvoid = lfirst_oid(l);
+ Form_pg_mv_statistic mvstat;
+ MVStatisticInfo *info;
+
+ HeapTuple htup = SearchSysCache1(MVSTATOID, ObjectIdGetDatum(mvoid));
+
+ mvstat = (Form_pg_mv_statistic) GETSTRUCT(htup);
+
+ /* unavailable stats are not interesting for the planner */
+ if (mvstat->deps_built)
+ {
+ info = makeNode(MVStatisticInfo);
+
+ info->mvoid = mvoid;
+ info->rel = rel;
+
+ /* enabled statistics */
+ info->deps_enabled = mvstat->deps_enabled;
+
+ /* built/available statistics */
+ info->deps_built = mvstat->deps_built;
+
+ /* stakeys */
+ adatum = SysCacheGetAttr(MVSTATOID, htup,
+ Anum_pg_mv_statistic_stakeys, &isnull);
+ Assert(!isnull);
+
+ arr = DatumGetArrayTypeP(adatum);
+
+ info->stakeys = buildint2vector((int16 *) ARR_DATA_PTR(arr),
+ ARR_DIMS(arr)[0]);
+
+ stainfos = lcons(info, stainfos);
+ }
+
+ ReleaseSysCache(htup);
+ }
+
+ list_free(mvstatoidlist);
+ }
+
+ rel->mvstatlist = stainfos;
+
/* Grab foreign-table info using the relcache, while we have it */
if (relation->rd_rel->relkind == RELKIND_FOREIGN_TABLE)
{
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index b9aeb31..eed9927 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -241,7 +241,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
ConstraintsSetStmt CopyStmt CreateAsStmt CreateCastStmt
CreateDomainStmt CreateExtensionStmt CreateGroupStmt CreateOpClassStmt
CreateOpFamilyStmt AlterOpFamilyStmt CreatePLangStmt
- CreateSchemaStmt CreateSeqStmt CreateStmt CreateTableSpaceStmt
+ CreateSchemaStmt CreateSeqStmt CreateStmt CreateStatsStmt CreateTableSpaceStmt
CreateFdwStmt CreateForeignServerStmt CreateForeignTableStmt
CreateAssertStmt CreateTransformStmt CreateTrigStmt CreateEventTrigStmt
CreateUserStmt CreateUserMappingStmt CreateRoleStmt CreatePolicyStmt
@@ -809,6 +809,7 @@ stmt :
| CreateSchemaStmt
| CreateSeqStmt
| CreateStmt
+ | CreateStatsStmt
| CreateTableSpaceStmt
| CreateTransformStmt
| CreateTrigStmt
@@ -3436,6 +3437,36 @@ OptConsTableSpace: USING INDEX TABLESPACE name { $$ = $4; }
ExistingIndex: USING INDEX index_name { $$ = $3; }
;
+/*****************************************************************************
+ *
+ * QUERY :
+ * CREATE STATISTICS stats_name ON relname (columns) WITH (options)
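+ *
+ * e.g. CREATE STATISTICS s1 ON t1 (a, b) WITH (dependencies)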
+ *
+ *****************************************************************************/
+
+
+CreateStatsStmt: CREATE STATISTICS any_name ON qualified_name '(' columnList ')' opt_reloptions
+ {
+ CreateStatsStmt *n = makeNode(CreateStatsStmt);
+ n->defnames = $3;
+ n->relation = $5;
+ n->keys = $7;
+ n->options = $9;
+ n->if_not_exists = false;
+ $$ = (Node *)n;
+ }
+ | CREATE STATISTICS IF_P NOT EXISTS any_name ON qualified_name '(' columnList ')' opt_reloptions
+ {
+ CreateStatsStmt *n = makeNode(CreateStatsStmt);
+ n->defnames = $6;
+ n->relation = $8;
+ n->keys = $10;
+ n->options = $12;
+ n->if_not_exists = true;
+ $$ = (Node *)n;
+ }
+ ;
+
/*****************************************************************************
*
@@ -5621,6 +5652,7 @@ drop_type: TABLE { $$ = OBJECT_TABLE; }
| TEXT_P SEARCH DICTIONARY { $$ = OBJECT_TSDICTIONARY; }
| TEXT_P SEARCH TEMPLATE { $$ = OBJECT_TSTEMPLATE; }
| TEXT_P SEARCH CONFIGURATION { $$ = OBJECT_TSCONFIGURATION; }
+ | STATISTICS { $$ = OBJECT_STATISTICS; }
;
any_name_list:
@@ -7995,6 +8027,15 @@ RenameStmt: ALTER AGGREGATE func_name aggr_args RENAME TO name
n->missing_ok = false;
$$ = (Node *)n;
}
+ | ALTER STATISTICS any_name RENAME TO name
+ {
+ RenameStmt *n = makeNode(RenameStmt);
+ n->renameType = OBJECT_STATISTICS;
+ n->object = $3;
+ n->newname = $6;
+ n->missing_ok = false;
+ $$ = (Node *)n;
+ }
;
opt_column: COLUMN { $$ = COLUMN; }
@@ -8231,6 +8272,15 @@ AlterObjectSchemaStmt:
n->missing_ok = false;
$$ = (Node *)n;
}
+ | ALTER STATISTICS any_name SET SCHEMA name
+ {
+ AlterObjectSchemaStmt *n = makeNode(AlterObjectSchemaStmt);
+ n->objectType = OBJECT_STATISTICS;
+ n->object = $3;
+ n->newschema = $6;
+ n->missing_ok = false;
+ $$ = (Node *)n;
+ }
;
/*****************************************************************************
@@ -8421,6 +8471,14 @@ AlterOwnerStmt: ALTER AGGREGATE func_name aggr_args OWNER TO RoleSpec
n->newowner = $7;
$$ = (Node *)n;
}
+ | ALTER STATISTICS name OWNER TO RoleSpec
+ {
+ AlterOwnerStmt *n = makeNode(AlterOwnerStmt);
+ n->objectType = OBJECT_STATISTICS;
+ n->object = list_make1(makeString($3));
+ n->newowner = $6;
+ $$ = (Node *)n;
+ }
;
diff --git a/src/backend/tcop/utility.c b/src/backend/tcop/utility.c
index 045f7f0..96b58f8 100644
--- a/src/backend/tcop/utility.c
+++ b/src/backend/tcop/utility.c
@@ -1520,6 +1520,10 @@ ProcessUtilitySlow(Node *parsetree,
address = ExecSecLabelStmt((SecLabelStmt *) parsetree);
break;
+ case T_CreateStatsStmt: /* CREATE STATISTICS */
+ address = CreateStatistics((CreateStatsStmt *) parsetree);
+ break;
+
default:
elog(ERROR, "unrecognized node type: %d",
(int) nodeTag(parsetree));
@@ -1878,6 +1882,9 @@ AlterObjectTypeCommandTag(ObjectType objtype)
case OBJECT_MATVIEW:
tag = "ALTER MATERIALIZED VIEW";
break;
+ case OBJECT_STATISTICS:
+ tag = "ALTER STATISTICS";
+ break;
default:
tag = "???";
break;
@@ -2160,6 +2167,9 @@ CreateCommandTag(Node *parsetree)
case OBJECT_TRANSFORM:
tag = "DROP TRANSFORM";
break;
+ case OBJECT_STATISTICS:
+ tag = "DROP STATISTICS";
+ break;
default:
tag = "???";
}
@@ -2527,6 +2537,10 @@ CreateCommandTag(Node *parsetree)
tag = "EXECUTE";
break;
+ case T_CreateStatsStmt:
+ tag = "CREATE STATISTICS";
+ break;
+
case T_DeallocateStmt:
{
DeallocateStmt *stmt = (DeallocateStmt *) parsetree;
diff --git a/src/backend/utils/Makefile b/src/backend/utils/Makefile
index 8374533..eba0352 100644
--- a/src/backend/utils/Makefile
+++ b/src/backend/utils/Makefile
@@ -9,7 +9,7 @@ top_builddir = ../../..
include $(top_builddir)/src/Makefile.global
OBJS = fmgrtab.o
-SUBDIRS = adt cache error fmgr hash init mb misc mmgr resowner sort time
+SUBDIRS = adt cache error fmgr hash init mb misc mmgr mvstats resowner sort time
# location of Catalog.pm
catalogdir = $(top_srcdir)/src/backend/catalog
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index 130c06d..3bc4c8a 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -47,6 +47,7 @@
#include "catalog/pg_auth_members.h"
#include "catalog/pg_constraint.h"
#include "catalog/pg_database.h"
+#include "catalog/pg_mv_statistic.h"
#include "catalog/pg_namespace.h"
#include "catalog/pg_opclass.h"
#include "catalog/pg_proc.h"
@@ -3956,6 +3957,62 @@ RelationGetIndexList(Relation relation)
return result;
}
+
+List *
+RelationGetMVStatList(Relation relation)
+{
+ Relation indrel;
+ SysScanDesc indscan;
+ ScanKeyData skey;
+ HeapTuple htup;
+ List *result;
+ List *oldlist;
+ MemoryContext oldcxt;
+
+ /* Quick exit if we already computed the list. */
+ if (relation->rd_mvstatvalid)
+ return list_copy(relation->rd_mvstatlist);
+
+ /*
+ * We build the list we intend to return (in the caller's context) while
+ * doing the scan. After successfully completing the scan, we copy that
+ * list into the relcache entry. This avoids cache-context memory leakage
+ * if we get some sort of error partway through.
+ */
+ result = NIL;
+
+ /* Prepare to scan pg_mv_statistic for entries having starelid = this rel. */
+ ScanKeyInit(&skey,
+ Anum_pg_mv_statistic_starelid,
+ BTEqualStrategyNumber, F_OIDEQ,
+ ObjectIdGetDatum(RelationGetRelid(relation)));
+
+ indrel = heap_open(MvStatisticRelationId, AccessShareLock);
+ indscan = systable_beginscan(indrel, MvStatisticRelidIndexId, true,
+ NULL, 1, &skey);
+
+ while (HeapTupleIsValid(htup = systable_getnext(indscan)))
+ /* TODO maybe include only already built statistics? */
+ result = insert_ordered_oid(result, HeapTupleGetOid(htup));
+
+ systable_endscan(indscan);
+
+ heap_close(indrel, AccessShareLock);
+
+ /* Now save a copy of the completed list in the relcache entry. */
+ oldcxt = MemoryContextSwitchTo(CacheMemoryContext);
+ oldlist = relation->rd_mvstatlist;
+ relation->rd_mvstatlist = list_copy(result);
+
+ relation->rd_mvstatvalid = true;
+ MemoryContextSwitchTo(oldcxt);
+
+ /* Don't leak the old list, if there is one */
+ list_free(oldlist);
+
+ return result;
+}
+
/*
* insert_ordered_oid
* Insert a new Oid into a sorted list of Oids, preserving ordering
@@ -4920,6 +4977,8 @@ load_relcache_init_file(bool shared)
rel->rd_indexattr = NULL;
rel->rd_keyattr = NULL;
rel->rd_idattr = NULL;
+ rel->rd_mvstatvalid = false;
+ rel->rd_mvstatlist = NIL;
rel->rd_createSubid = InvalidSubTransactionId;
rel->rd_newRelfilenodeSubid = InvalidSubTransactionId;
rel->rd_amcache = NULL;
diff --git a/src/backend/utils/cache/syscache.c b/src/backend/utils/cache/syscache.c
index 65ffe84..3c1bc4b 100644
--- a/src/backend/utils/cache/syscache.c
+++ b/src/backend/utils/cache/syscache.c
@@ -44,6 +44,7 @@
#include "catalog/pg_foreign_server.h"
#include "catalog/pg_foreign_table.h"
#include "catalog/pg_language.h"
+#include "catalog/pg_mv_statistic.h"
#include "catalog/pg_namespace.h"
#include "catalog/pg_opclass.h"
#include "catalog/pg_operator.h"
@@ -502,6 +503,28 @@ static const struct cachedesc cacheinfo[] = {
},
4
},
+ {MvStatisticRelationId, /* MVSTATNAMENSP */
+ MvStatisticNameIndexId,
+ 2,
+ {
+ Anum_pg_mv_statistic_staname,
+ Anum_pg_mv_statistic_stanamespace,
+ 0,
+ 0
+ },
+ 4
+ },
+ {MvStatisticRelationId, /* MVSTATOID */
+ MvStatisticOidIndexId,
+ 1,
+ {
+ ObjectIdAttributeNumber,
+ 0,
+ 0,
+ 0
+ },
+ 4
+ },
{NamespaceRelationId, /* NAMESPACENAME */
NamespaceNameIndexId,
1,
diff --git a/src/backend/utils/mvstats/Makefile b/src/backend/utils/mvstats/Makefile
new file mode 100644
index 0000000..099f1ed
--- /dev/null
+++ b/src/backend/utils/mvstats/Makefile
@@ -0,0 +1,17 @@
+#-------------------------------------------------------------------------
+#
+# Makefile--
+# Makefile for utils/mvstats
+#
+# IDENTIFICATION
+# src/backend/utils/mvstats/Makefile
+#
+#-------------------------------------------------------------------------
+
+subdir = src/backend/utils/mvstats
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+
+OBJS = common.o dependencies.o
+
+include $(top_srcdir)/src/backend/common.mk
diff --git a/src/backend/utils/mvstats/README.dependencies b/src/backend/utils/mvstats/README.dependencies
new file mode 100644
index 0000000..1f96fbc
--- /dev/null
+++ b/src/backend/utils/mvstats/README.dependencies
@@ -0,0 +1,222 @@
+Soft functional dependencies
+============================
+
+A type of multivariate statistics used to capture cases when one column (or
+possibly a combination of columns) determines values in another column. We may
+also say that one column implies the other one.
+
+A simple artificial example may be a table with two columns, created like this
+
+ CREATE TABLE t (a INT, b INT)
+ AS SELECT i, i/10 FROM generate_series(1,100000) s(i);
+
+Clearly, once we know the value for column 'a' the value for 'b' is trivially
+determined, as it's simply (a/10). A more practical example may be addresses,
+where (ZIP code -> city name), i.e. once we know the ZIP, we probably know the
+city it belongs to, as ZIP codes are usually assigned to one city. Larger cities
+may have multiple ZIP codes, so the dependency can't be reversed.
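+
+With this patch, such a dependency can be mined by defining statistics on the
+columns and re-analyzing the table (the statistics/table names here are just
+examples):
+
+ CREATE STATISTICS s1 ON t (a, b) WITH (dependencies);
+ ANALYZE t;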
+
+Functional dependencies are a concept well described in relational theory,
+particularly in definition of normalization and "normal forms". Wikipedia has a
+nice definition of a functional dependency [1]:
+
+ In a given table, an attribute Y is said to have a functional dependency on
+ a set of attributes X (written X -> Y) if and only if each X value is
+ associated with precisely one Y value. For example, in an "Employee" table
+ that includes the attributes "Employee ID" and "Employee Date of Birth", the
+ functional dependency {Employee ID} -> {Employee Date of Birth} would hold.
+ It follows from the previous two sentences that each {Employee ID} is
+ associated with precisely one {Employee Date of Birth}.
+
+ [1] http://en.wikipedia.org/wiki/Database_normalization
+
+Many datasets might be normalized not to contain such dependencies, but often
+that's not practical, for various reasons. In some cases it's actually a
+conscious design choice to model the dataset in a denormalized way, either
+because of performance or to make querying easier.
+
+The functional dependencies are called 'soft' because the implementation is
+meant to tolerate a small number of rows contradicting the dependency. Many
+real data sets contain some errors, either because of data entry mistakes
+(user mistyping the ZIP code) or issues in generating the data (e.g. a ZIP code
+mistakenly assigned to two cities in different states). A strict implementation
+would ignore dependencies on such noisy data, rendering the approach unusable
+on such data sets.
+
+
+Mining dependencies (ANALYZE)
+-----------------------------
+
+The current build algorithm is rather simple - for each pair (a,b) of columns,
+the data are sorted lexicographically (first by 'a', then by 'b'). Then for each
+group (rows with the same 'a' value) we decide whether the group is neutral,
+supporting or contradicting the dependency (a->b).
+
+A group is considered neutral when it's too small - e.g. when there's a single
+row in the group, there can't possibly be multiple values in 'b'. For this
+reason we ignore groups smaller than a threshold (currently 3 rows).
+
+For sufficiently large groups (3 rows or more), we count the number of distinct
+values in 'b'. When there's a single 'b' value, the group is considered to
+support the dependency (a->b), otherwise it's considered to contradict it.
+
+At the end, we compare the number of rows in supporting and contradicting groups,
+and if there are at least 10x as many supporting rows, we consider the
+functional dependency to be valid.
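+
+For illustration, consider a hypothetical sample (the numbers are made up)
+where sorting by (a,b) yields these groups:
+
+ group (a=1): 50 rows, 1 distinct 'b' value  => supporting
+ group (a=2): 40 rows, 3 distinct 'b' values => contradicting
+ group (a=3): 2 rows                         => neutral (below threshold)
+
+Here the 50 supporting rows vs. 40 contradicting rows fail the 10x test
+(50 < 10 * 40), so the dependency (a->b) would be rejected.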
+
+
+A downside of this approach is that the algorithm is a bit fragile with respect
+to the sample - there may be data sets producing quite different results for
+each ANALYZE execution (as even a single row may change the outcome of the
+final 10x test).
+
+It was proposed to make the dependencies "fuzzy" - e.g. track some coefficient
+between [0,1] determining how much the dependency holds. That would however mean
+we have to keep all the dependencies, as eliminating them based on the value of
+the coefficient (e.g. throw away dependencies <= 0.5) would result in exactly
+the same fragility issues. This would also make it more complicated to combine
+dependencies. So this does not seem like a practical approach.
+
+A better approach might be to replace the constants (min_group_size=3 and 10x)
+with values somehow related to the particular data set.
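+
+One option (suggested by an XXX comment in dependency_is_valid) might be to
+derive the minimum group size from the estimated average group size, e.g.
+something like
+
+ min_group_size = Max(3, numrows / ndistinct(a))
+
+but that is only a sketch, not something the patch currently implements.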
+
+
+Clause reduction (planner/optimizer)
+------------------------------------
+
+Applying the functional dependencies is quite simple - given a list of equality
+clauses, check which clauses are redundant (i.e. implied by some other clause).
+For example given the clause list
+
+ (a = 1) AND (b = 2) AND (c = 3)
+
+and the dependency (a->b), the list of clauses may be simplified to
+
+ (a = 1) AND (c = 3)
+
+Functional dependencies may only be applied to equality clauses; all other
+types of clauses are ignored. See clauselist_apply_dependencies() for details.
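+
+In terms of selectivities, the reduction means the implied clause simply does
+not contribute to the estimate - with the dependency (a->b) and compatible
+values we get
+
+ selectivity[(a = 1) AND (b = 2)] = selectivity[(a = 1)]
+
+which is the same equality used below when discussing verification against
+MCV lists and histograms.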
+
+
+Compatibility of clauses
+------------------------
+
+The reduction assumes the clauses really are redundant, and that the value in
+the reduced clause (b=2) is the value determined by (a=1). If that's not the
+case and the values are "incompatible", the result will be an overestimate.
+
+This may happen for example when using conditions on ZIP and city name with
+mismatching values (ZIP for a different city), etc. In such a case the result
+set will be empty, but we'll estimate the selectivity using the ZIP condition.
+
+In this case the default estimate, based on the attribute value independence
+assumption, happens to work better - but mostly by chance.
+
+
+Dependencies vs. MCV/histogram
+------------------------------
+
+In some cases the "compatibility" of the conditions might be verified using the
+other types of multivariate stats - MCV lists and histograms.
+
+For MCV lists the verification might be very simple - peek into the list for
+items matching the clause on the 'a' column (e.g. ZIP code), and if such an
+item is found, check whether the 'b' column matches the other clause. If it
+does not, the clauses are contradictory. If no such item is found, we can't
+really conclude anything, except maybe restricting the selectivity using the
+MCV data (e.g. using min/max selectivity, or something along those lines).
+
+With histograms it might work similarly - we can't check the values directly
+(because histograms store buckets, not the actual values as MCV lists do), so
+we can only look at the buckets matching the clauses - if those buckets have
+very low frequency, it probably means the two clauses are incompatible.
+
+It's unclear what 'low frequency' is, but if one of the clauses is implied
+(automatically true because of the other clause), then
+
+ selectivity[clause(A)] = selectivity[clause(A) & clause(B)]
+
+So we might compute the selectivity of the first clause - for example using the
+regular statistics - and then check whether the selectivity computed from the
+histogram is about the same (or significantly lower).
+
+The problem is that histograms only work well when the data ordering matches
+the natural meaning. For values that serve as labels - like city names, ZIP
+codes, or even generated IDs - histograms really don't work all that well. For
+example sorting cities by name won't match the sorting of ZIP codes, rendering
+the histogram unusable.
+
+So MCVs are probably going to work much better, because they don't really
+assume any sort of ordering, and they're more appropriate for label-like data.
+
+A good question however is why even use functional dependencies in such cases,
+and not simply use the MCV list or histogram instead. One reason is that
+functional dependencies allow falling back to regular stats, and often produce
+more accurate estimates - especially compared to histograms, which are quite
+bad at estimating equality clauses.
+
+
+Limitations
+-----------
+
+Let's go through the main limitations of functional dependencies, especially
+those related to the current implementation.
+
+The current implementation only supports dependencies between two columns, but
+this is merely a simplification of the initial implementation. It's certainly
+useful to mine for dependencies involving multiple columns on the 'left' side,
+i.e. in the condition of the dependency - that is, dependencies like (a,b -> c).
+
+The implementation may/should be smart enough not to mine redundant
+dependencies, e.g. (a->b) and (a,c -> b), because the latter is a trivial
+consequence of the former (if values of 'a' determine 'b', adding another
+column won't change that relationship). ANALYZE should therefore first analyze
+1:1 dependencies, then 2:1 dependencies (skipping the already identified ones),
+etc.
+
+For example the dependency
+
+ (city name -> zip code)
+
+is much stronger, i.e. whenever it holds,
+
+ (city name, state name -> zip code)
+
+holds too. But when there are cities with the same name in different states,
+only the latter dependency will be valid.
+
+Of course, there probably are cities with the same name even within a single
+state, but hopefully that's a relatively rare occurrence (and thus we'll still
+detect the 'soft' dependency).
+
+Handling multiple columns on the right side of the dependency is not necessary,
+as such dependencies may simply be decomposed into a set of dependencies with
+the same meaning, one for each column on the right side. For example
+
+ (a -> b,c)
+
+is exactly the same as
+
+ (a -> b) & (a -> c)
+
+Of course, storing the first form may be more efficient than storing multiple
+'simple' dependencies separately.
+
+
+TODO Support dependencies with multiple columns on left/right.
+
+TODO Investigate using histogram and MCV list to verify the dependencies.
+
+TODO Investigate statistical testing of the distribution (to decide whether it
+ makes sense to build the histogram/MCV list).
+
+TODO Using a min/max of selectivities would probably make more sense for the
+ associated columns.
+
+TODO Consider eliminating the implied columns from the histogram and MCV lists
+ (but maybe that's not a good idea, because that'd make it impossible to use
+ these stats for non-equality clauses and also it wouldn't be possible to
+ use the stats for verification of the dependencies).
+
+TODO The reduction probably might be extended to also handle IS NULL clauses,
+ assuming we fix the ANALYZE to properly handle NULL values. We however
+ won't be able to reduce IS NOT NULL (unless I'm missing something).
diff --git a/src/backend/utils/mvstats/common.c b/src/backend/utils/mvstats/common.c
new file mode 100644
index 0000000..82f2177
--- /dev/null
+++ b/src/backend/utils/mvstats/common.c
@@ -0,0 +1,376 @@
+/*-------------------------------------------------------------------------
+ *
+ * common.c
+ * POSTGRES multivariate statistics
+ *
+ *
+ * Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ * src/backend/utils/mvstats/common.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "common.h"
+
+static VacAttrStats ** lookup_var_attr_stats(int2vector *attrs,
+ int natts, VacAttrStats **vacattrstats);
+
+static List* list_mv_stats(Oid relid);
+
+
+/*
+ * Compute requested multivariate stats, using the rows sampled for the
+ * plain (single-column) stats.
+ *
+ * This fetches a list of stats from pg_mv_statistic, computes the stats
+ * and serializes them back into the catalog (as bytea values).
+ */
+void
+build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
+ int natts, VacAttrStats **vacattrstats)
+{
+ ListCell *lc;
+ List *mvstats;
+
+ TupleDesc tupdesc = RelationGetDescr(onerel);
+
+ /*
+ * Fetch the defined MV stats from pg_mv_statistic, and then compute
+ * the multivariate statistics (functional dependencies for now).
+ */
+ mvstats = list_mv_stats(RelationGetRelid(onerel));
+
+ foreach (lc, mvstats)
+ {
+ int j;
+ MVStatisticInfo *stat = (MVStatisticInfo *)lfirst(lc);
+ MVDependencies deps = NULL;
+
+ VacAttrStats **stats = NULL;
+ int numatts = 0;
+
+ /* int2 vector of attnums the stats should be computed on */
+ int2vector * attrs = stat->stakeys;
+
+ /* see how many of the columns are not dropped */
+ for (j = 0; j < attrs->dim1; j++)
+ if (! tupdesc->attrs[attrs->values[j]-1]->attisdropped)
+ numatts += 1;
+
+ /* if there are dropped attributes, build a filtered int2vector */
+ if (numatts != attrs->dim1)
+ {
+ int16 *tmp = palloc0(numatts * sizeof(int16));
+ int attnum = 0;
+
+ for (j = 0; j < attrs->dim1; j++)
+ if (! tupdesc->attrs[attrs->values[j]-1]->attisdropped)
+ tmp[attnum++] = attrs->values[j];
+
+ pfree(attrs);
+ attrs = buildint2vector(tmp, numatts);
+ }
+
+ /* filter only the interesting vacattrstats records */
+ stats = lookup_var_attr_stats(attrs, natts, vacattrstats);
+
+ /* check allowed number of dimensions */
+ Assert((attrs->dim1 >= 2) && (attrs->dim1 <= MVSTATS_MAX_DIMENSIONS));
+
+ /*
+ * Analyze functional dependencies of columns.
+ */
+ deps = build_mv_dependencies(numrows, rows, attrs, stats);
+
+ /* store the serialized statistics (dependencies) in the catalog */
+ update_mv_stats(stat->mvoid, deps, attrs);
+ }
+}
+
+/*
+ * Lookup the VacAttrStats info for the selected columns, with indexes
+ * matching the attrs vector (to make it easy to work with when
+ * computing multivariate stats).
+ */
+static VacAttrStats **
+lookup_var_attr_stats(int2vector *attrs, int natts, VacAttrStats **vacattrstats)
+{
+ int i, j;
+ int numattrs = attrs->dim1;
+ VacAttrStats **stats = (VacAttrStats**)palloc0(numattrs * sizeof(VacAttrStats*));
+
+ /* lookup VacAttrStats info for the requested columns (same attnum) */
+ for (i = 0; i < numattrs; i++)
+ {
+ stats[i] = NULL;
+ for (j = 0; j < natts; j++)
+ {
+ if (attrs->values[i] == vacattrstats[j]->tupattnum)
+ {
+ stats[i] = vacattrstats[j];
+ break;
+ }
+ }
+
+ /*
+ * Check that we found the info, that the attnum matches, and that
+ * the requested 'lt' operator exists (i.e. the attribute was
+ * analyzed as a scalar type).
+ */
+ Assert(stats[i] != NULL);
+ Assert(stats[i]->tupattnum == attrs->values[i]);
+
+ /* FIXME This is rather ugly way to check for 'ltopr' (which
+ * is defined for 'scalar' attributes).
+ */
+ Assert(((StdAnalyzeData *)stats[i]->extra_data)->ltopr != InvalidOid);
+ }
+
+ return stats;
+}
+
+/*
+ * Fetch list of MV stats defined on a table, without the actual data
+ * for histograms, MCV lists etc.
+ */
+static List*
+list_mv_stats(Oid relid)
+{
+ Relation indrel;
+ SysScanDesc indscan;
+ ScanKeyData skey;
+ HeapTuple htup;
+ List *result = NIL;
+
+ /* Prepare to scan pg_mv_statistic for entries having starelid = this rel. */
+ ScanKeyInit(&skey,
+ Anum_pg_mv_statistic_starelid,
+ BTEqualStrategyNumber, F_OIDEQ,
+ ObjectIdGetDatum(relid));
+
+ indrel = heap_open(MvStatisticRelationId, AccessShareLock);
+ indscan = systable_beginscan(indrel, MvStatisticRelidIndexId, true,
+ NULL, 1, &skey);
+
+ while (HeapTupleIsValid(htup = systable_getnext(indscan)))
+ {
+ MVStatisticInfo *info = makeNode(MVStatisticInfo);
+ Form_pg_mv_statistic stats = (Form_pg_mv_statistic) GETSTRUCT(htup);
+
+ info->mvoid = HeapTupleGetOid(htup);
+ info->stakeys = buildint2vector(stats->stakeys.values, stats->stakeys.dim1);
+ info->deps_built = stats->deps_built;
+
+ result = lappend(result, info);
+ }
+
+ systable_endscan(indscan);
+
+ heap_close(indrel, AccessShareLock);
+
+ /* TODO Maybe cache the list in the relcache, as RelationGetIndexList
+ * (which served as inspiration for this function) does. */
+
+ return result;
+}
+
+void
+update_mv_stats(Oid mvoid, MVDependencies dependencies, int2vector *attrs)
+{
+ HeapTuple stup,
+ oldtup;
+ Datum values[Natts_pg_mv_statistic];
+ bool nulls[Natts_pg_mv_statistic];
+ bool replaces[Natts_pg_mv_statistic];
+
+ Relation sd = heap_open(MvStatisticRelationId, RowExclusiveLock);
+
+ memset(nulls, 1, Natts_pg_mv_statistic * sizeof(bool));
+ memset(replaces, 0, Natts_pg_mv_statistic * sizeof(bool));
+ memset(values, 0, Natts_pg_mv_statistic * sizeof(Datum));
+
+ /*
+ * Construct a new pg_mv_statistic tuple - replace only the dependencies
+ * value, depending on whether it actually was computed.
+ */
+ if (dependencies != NULL)
+ {
+ nulls[Anum_pg_mv_statistic_stadeps -1] = false;
+ values[Anum_pg_mv_statistic_stadeps - 1]
+ = PointerGetDatum(serialize_mv_dependencies(dependencies));
+ }
+
+ /* always replace the value (either by bytea or NULL) */
+ replaces[Anum_pg_mv_statistic_stadeps -1] = true;
+
+ /* always change the availability flags */
+ nulls[Anum_pg_mv_statistic_deps_built -1] = false;
+ nulls[Anum_pg_mv_statistic_stakeys-1] = false;
+
+ /* use the new attnums, in case we removed some dropped ones */
+ replaces[Anum_pg_mv_statistic_deps_built-1] = true;
+ replaces[Anum_pg_mv_statistic_stakeys -1] = true;
+
+ values[Anum_pg_mv_statistic_deps_built-1] = BoolGetDatum(dependencies != NULL);
+ values[Anum_pg_mv_statistic_stakeys -1] = PointerGetDatum(attrs);
+
+ /* Is there already a pg_mv_statistic tuple for these statistics? */
+ oldtup = SearchSysCache1(MVSTATOID,
+ ObjectIdGetDatum(mvoid));
+
+ if (HeapTupleIsValid(oldtup))
+ {
+ /* Yes, replace it */
+ stup = heap_modify_tuple(oldtup,
+ RelationGetDescr(sd),
+ values,
+ nulls,
+ replaces);
+ ReleaseSysCache(oldtup);
+ simple_heap_update(sd, &stup->t_self, stup);
+ }
+ else
+ elog(ERROR, "invalid pg_mv_statistic record (oid=%u)", mvoid);
+
+ /* update indexes too */
+ CatalogUpdateIndexes(sd, stup);
+
+ heap_freetuple(stup);
+
+ heap_close(sd, RowExclusiveLock);
+}
+
+/* multi-variate stats comparator */
+
+/*
+ * qsort_arg comparator for sorting Datums (MV stats)
+ *
+ * This does not maintain the tupnoLink array.
+ */
+int
+compare_scalars_simple(const void *a, const void *b, void *arg)
+{
+ Datum da = *(Datum*)a;
+ Datum db = *(Datum*)b;
+ SortSupport ssup = (SortSupport) arg;
+
+ return ApplySortComparator(da, false, db, false, ssup);
+}
+
+/*
+ * qsort_arg comparator for sorting data when partitioning a MV bucket
+ */
+int
+compare_scalars_partition(const void *a, const void *b, void *arg)
+{
+ Datum da = ((ScalarItem*)a)->value;
+ Datum db = ((ScalarItem*)b)->value;
+ SortSupport ssup = (SortSupport) arg;
+
+ return ApplySortComparator(da, false, db, false, ssup);
+}
+
+/* initialize multi-dimensional sort */
+MultiSortSupport
+multi_sort_init(int ndims)
+{
+ MultiSortSupport mss;
+
+ Assert(ndims >= 2);
+
+ mss = (MultiSortSupport)palloc0(offsetof(MultiSortSupportData, ssup)
+ + sizeof(SortSupportData)*ndims);
+
+ mss->ndims = ndims;
+
+ return mss;
+}
+
+/*
+ * add sort support for dimension 'dim' (index into vacattrstats) to mss,
+ * at position 'sortdim'
+ */
+ */
+void
+multi_sort_add_dimension(MultiSortSupport mss, int sortdim,
+ int dim, VacAttrStats **vacattrstats)
+{
+ /* first, lookup StdAnalyzeData for the dimension (attribute) */
+ SortSupportData ssup;
+ StdAnalyzeData *tmp = (StdAnalyzeData *)vacattrstats[dim]->extra_data;
+
+ Assert(mss != NULL);
+ Assert(sortdim < mss->ndims);
+
+ /* initialize sort support, etc. */
+ memset(&ssup, 0, sizeof(ssup));
+ ssup.ssup_cxt = CurrentMemoryContext;
+
+ /* We always use the default collation for statistics */
+ ssup.ssup_collation = DEFAULT_COLLATION_OID;
+ ssup.ssup_nulls_first = false;
+
+ PrepareSortSupportFromOrderingOp(tmp->ltopr, &ssup);
+
+ mss->ssup[sortdim] = ssup;
+}
+
+/* compare all the dimensions in the selected order */
+int
+multi_sort_compare(const void *a, const void *b, void *arg)
+{
+ int i;
+ SortItem *ia = (SortItem*)a;
+ SortItem *ib = (SortItem*)b;
+
+ MultiSortSupport mss = (MultiSortSupport)arg;
+
+ for (i = 0; i < mss->ndims; i++)
+ {
+ int compare;
+
+ compare = ApplySortComparator(ia->values[i], ia->isnull[i],
+ ib->values[i], ib->isnull[i],
+ &mss->ssup[i]);
+
+ if (compare != 0)
+ return compare;
+
+ }
+
+ /* equal by default */
+ return 0;
+}
+
+/* compare selected dimension */
+int
+multi_sort_compare_dim(int dim, const SortItem *a, const SortItem *b,
+ MultiSortSupport mss)
+{
+ return ApplySortComparator(a->values[dim], a->isnull[dim],
+ b->values[dim], b->isnull[dim],
+ &mss->ssup[dim]);
+}
+
+int
+multi_sort_compare_dims(int start, int end,
+ const SortItem *a, const SortItem *b,
+ MultiSortSupport mss)
+{
+ int dim;
+
+ for (dim = start; dim <= end; dim++)
+ {
+ int r = ApplySortComparator(a->values[dim], a->isnull[dim],
+ b->values[dim], b->isnull[dim],
+ &mss->ssup[dim]);
+
+ if (r != 0)
+ return r;
+ }
+
+ return 0;
+}
diff --git a/src/backend/utils/mvstats/common.h b/src/backend/utils/mvstats/common.h
new file mode 100644
index 0000000..75b9c54
--- /dev/null
+++ b/src/backend/utils/mvstats/common.h
@@ -0,0 +1,78 @@
+/*-------------------------------------------------------------------------
+ *
+ * common.h
+ * POSTGRES multivariate statistics
+ *
+ *
+ * Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ * src/backend/utils/mvstats/common.h
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres.h"
+
+#include "access/sysattr.h"
+#include "access/tuptoaster.h"
+#include "catalog/indexing.h"
+#include "catalog/pg_collation.h"
+#include "catalog/pg_mv_statistic.h"
+#include "foreign/fdwapi.h"
+#include "postmaster/autovacuum.h"
+#include "storage/lmgr.h"
+#include "utils/builtins.h"
+#include "utils/datum.h"
+#include "utils/fmgroids.h"
+#include "utils/mvstats.h"
+#include "utils/sortsupport.h"
+#include "utils/syscache.h"
+
+
+/* FIXME private structure copied from analyze.c */
+
+typedef struct
+{
+ Oid eqopr; /* '=' operator for datatype, if any */
+ Oid eqfunc; /* and associated function */
+ Oid ltopr; /* '<' operator for datatype, if any */
+} StdAnalyzeData;
+
+typedef struct
+{
+ Datum value; /* a data value */
+ int tupno; /* position index for tuple it came from */
+} ScalarItem;
+
+/* multi-sort */
+typedef struct MultiSortSupportData {
+ int ndims; /* number of dimensions supported by the sort */
+ SortSupportData ssup[1]; /* sort support data for each dimension */
+} MultiSortSupportData;
+
+typedef MultiSortSupportData* MultiSortSupport;
+
+typedef struct SortItem {
+ Datum *values;
+ bool *isnull;
+} SortItem;
+
+MultiSortSupport multi_sort_init(int ndims);
+
+void multi_sort_add_dimension(MultiSortSupport mss, int sortdim,
+ int dim, VacAttrStats **vacattrstats);
+
+int multi_sort_compare(const void *a, const void *b, void *arg);
+
+int multi_sort_compare_dim(int dim, const SortItem *a,
+ const SortItem *b, MultiSortSupport mss);
+
+int multi_sort_compare_dims(int start, int end, const SortItem *a,
+ const SortItem *b, MultiSortSupport mss);
+
+/* comparators, used when constructing multivariate stats */
+int compare_scalars_simple(const void *a, const void *b, void *arg);
+int compare_scalars_partition(const void *a, const void *b, void *arg);
diff --git a/src/backend/utils/mvstats/dependencies.c b/src/backend/utils/mvstats/dependencies.c
new file mode 100644
index 0000000..5437bdf
--- /dev/null
+++ b/src/backend/utils/mvstats/dependencies.c
@@ -0,0 +1,686 @@
+/*-------------------------------------------------------------------------
+ *
+ * dependencies.c
+ * POSTGRES multivariate functional dependencies
+ *
+ *
+ * Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ * src/backend/utils/mvstats/dependencies.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "common.h"
+#include "utils/lsyscache.h"
+
+/* internal state for generator of variations (k-permutations of n elements) */
+typedef struct VariationGeneratorData {
+
+ int k; /* size of the k-permutation */
+ int current; /* index of the next variation to return */
+
+ int nvariations; /* number of variations generated (size of array) */
+ int variations[1]; /* array of pre-built variations */
+
+} VariationGeneratorData;
+
+typedef VariationGeneratorData* VariationGenerator;
+
+/*
+ * generate all variations (k-permutations of n elements)
+ */
+static void
+generate_variations(VariationGenerator state,
+ int n, int maxlevel, int level, int *current)
+{
+ int i, j;
+
+ /* initialize */
+ if (level == 0)
+ {
+ current = (int*)palloc0(sizeof(int) * (maxlevel+1));
+ state->current = 0;
+ }
+
+ for (i = 0; i < n; i++)
+ {
+ /* check if the value is already used in the current variation */
+ bool found = false;
+ for (j = 0; j < level; j++)
+ {
+ if (current[j] == i)
+ {
+ found = true;
+ break;
+ }
+ }
+
+ /* already used, so try the next element */
+ if (found)
+ continue;
+
+ /* ok, we can use this element, so store it */
+ current[level] = i;
+
+ /* and check if we do have a complete variation of k elements */
+ if (level == maxlevel)
+ {
+ /* yep, store the variation */
+ Assert(state->current < state->nvariations);
+ memcpy(&state->variations[(state->k * state->current)], current,
+ sizeof(int) * (maxlevel+1));
+ state->current++;
+ }
+ else
+ /* nope, look for additional elements */
+ generate_variations(state, n, maxlevel, level+1, current);
+ }
+
+ if (level == 0)
+ pfree(current);
+}
+
+/*
+ * initialize the generator of variations, and prebuild the variations
+ *
+ * This pre-builds all the variations. We could also generate them in
+ * generator_next(), but this seems simpler.
+ */
+static VariationGenerator
+generator_init(int2vector *attrs, int k)
+{
+ int i;
+ int n = attrs->dim1;
+ int nvariations;
+ VariationGenerator state;
+
+ Assert((n >= k) && (k > 0));
+
+ /* compute the total number of variations as n!/(n-k)! */
+ nvariations = n;
+ for (i = 1; i < k; i++)
+ nvariations *= (n - i);
+
+ /* allocate the generator state as a single chunk of memory */
+ state = (VariationGenerator)palloc0(
+ offsetof(VariationGeneratorData, variations)
+ + (nvariations * k * sizeof(int))); /* variations */
+
+ state->nvariations = nvariations;
+ state->k = k;
+
+ /* now actually pre-generate all the variations */
+ generate_variations(state, n, (k-1), 0, NULL);
+
+ /* we expect to generate exactly the right number of variations */
+ Assert(state->nvariations == state->current);
+
+ /* reset the index */
+ state->current = 0;
+
+ return state;
+}
+
+/* free the generator state */
+static void
+generator_free(VariationGenerator state)
+{
+ /* we've allocated a single chunk, so just free it */
+ pfree(state);
+}
+
+/* generate the next variation (or return NULL when all were produced) */
+static int*
+generator_next(VariationGenerator state, int2vector *attrs)
+{
+ if (state->current == state->nvariations)
+ return NULL;
+
+ return &state->variations[state->k * state->current++];
+}
+
+/*
+ * check if the dependency is implied by existing dependencies
+ *
+ * A dependency is considered implied, if there exists a dependency with the
+ * same column on the left, and a subset of columns on the right side. So for
+ * example if we have a dependency
+ *
+ * (a,b,c) -> d
+ *
+ * then we are looking for these six dependencies
+ *
+ * (a) -> d
+ * (b) -> d
+ * (c) -> d
+ * (a,b) -> d
+ * (a,c) -> d
+ * (b,c) -> d
+ *
+ * This does not detect transitive dependencies. For example if we have
+ *
+ * (a) -> b
+ * (b) -> c
+ *
+ * then obviously
+ *
+ * (a) -> c
+ *
+ * but this is not detected. Extending the method to handle transitive cases
+ * is future work.
+ */
+static bool
+dependency_is_implied(MVDependencies dependencies, int k, int *dependency,
+ int2vector * attrs)
+{
+ bool implied = false;
+ int i, j, l;
+ int *tmp;
+
+ if (dependencies == NULL)
+ return false;
+
+ tmp = (int*)palloc0(sizeof(int) * k);
+
+ /* translate the indexes to actual attribute numbers */
+ for (i = 0; i < k; i++)
+ tmp[i] = attrs->values[dependency[i]];
+
+ /* search for an existing dependency with a subset of the conditions */
+ for (i = 0; i < dependencies->ndeps; i++)
+ {
+ bool contained = true;
+ MVDependency dep = dependencies->deps[i];
+
+ /* does the last attribute match? */
+ if (tmp[k-1] != dep->attributes[dep->nattributes-1])
+ continue; /* nope, no need to check this dependency further */
+
+ /* are the conditions superset of the existing dependency? */
+ for (j = 0; j < (dep->nattributes-1); j++)
+ {
+ bool found = false;
+
+ for (l = 0; l < (k-1); l++)
+ {
+ if (tmp[l] == dep->attributes[j])
+ {
+ found = true;
+ break;
+ }
+ }
+
+ /* we've found an attribute not included in the new dependency */
+ if (! found)
+ {
+ contained = false;
+ break;
+ }
+ }
+
+ /* we've found an existing dependency, trivially proving the new one */
+ if (contained)
+ {
+ implied = true;
+ break;
+ }
+ }
+
+ pfree(tmp);
+
+ return implied;
+}
+
+/*
+ * validates a functional dependency on the data
+ *
+ * The actual workhorse of detecting functional dependencies. Given a
+ * variation of k attributes, it checks whether the first (k-1) are
+ * sufficient to determine the last one.
+ */
+static bool
+dependency_is_valid(int numrows, HeapTuple *rows, int k, int * dependency,
+ VacAttrStats **stats, int2vector *attrs)
+{
+ int i, j;
+ int nvalues = numrows * k;
+
+ /*
+ * XXX Maybe the threshold should be somehow related to the number of
+ * distinct values in the combination of columns we're analyzing.
+ * Assuming the distribution is uniform, we can estimate the average
+ * group size and use it as a threshold, similarly to what we do for
+ * MCV lists.
+ */
+ int min_group_size = 3;
+
+ /* number of groups supporting / contradicting the dependency */
+ int n_supporting = 0;
+ int n_contradicting = 0;
+
+ /* counters valid within a group */
+ int group_size = 0;
+ int n_violations = 0;
+
+ int n_supporting_rows = 0;
+ int n_contradicting_rows = 0;
+
+ /* sort info for all the columns */
+ MultiSortSupport mss = multi_sort_init(k);
+
+ /* data for the sort */
+ SortItem *items = (SortItem*)palloc0(numrows * sizeof(SortItem));
+ Datum *values = (Datum*)palloc0(sizeof(Datum) * nvalues);
+ bool *isnull = (bool*)palloc0(sizeof(bool) * nvalues);
+
+ /* fix the pointers to values/isnull */
+ for (i = 0; i < numrows; i++)
+ {
+ items[i].values = &values[i * k];
+ items[i].isnull = &isnull[i * k];
+ }
+
+ /*
+ * Verify the dependency (a,b,...)->z, using a rather simple algorithm:
+ *
+ * (a) sort the data lexicographically
+ *
+ * (b) split the data into groups by first (k-1) columns
+ *
+ * (c) for each group count different values in the last column
+ */
+
+ /* prepare the sort functions for all the dimensions, and fill the items */
+ for (i = 0; i < k; i++)
+ {
+ multi_sort_add_dimension(mss, i, dependency[i], stats);
+
+ /* accumulate the values for this column into the item array */
+ for (j = 0; j < numrows; j++)
+ {
+ items[j].values[i]
+ = heap_getattr(rows[j], attrs->values[dependency[i]],
+ stats[i]->tupDesc, &items[j].isnull[i]);
+ }
+ }
+
+ /* sort the items so that we can detect the groups */
+ qsort_arg((void *) items, numrows, sizeof(SortItem),
+ multi_sort_compare, mss);
+
+ /*
+ * Walk through the sorted array, split it into groups according to the first
+ * (k-1) columns. If there's a single value in the last column, we count
+ * the group as 'supporting' the functional dependency. Otherwise we count
+ * it as contradicting.
+ *
+ * We also require a group to have a minimum number of rows to be considered
+ * useful for supporting the dependency. Contradicting groups may be of
+ * any size, though.
+ *
+ * XXX The minimum size requirement makes it impossible to identify case
+ * when both columns are unique (or nearly unique), and therefore
+ * trivially functionally dependent.
+ */
+
+ /* start with the first row forming a group */
+ group_size = 1;
+
+ for (i = 1; i < numrows; i++)
+ {
+ /* end of the preceding group */
+ if (multi_sort_compare_dims(0, (k-2), &items[i-1], &items[i], mss) != 0)
+ {
+ /*
+ * If there were no contradicting rows and the group is large enough,
+ * count it as supporting; if there were any violations, count it as
+ * contradicting.
+ */
+ if ((n_violations == 0) && (group_size >= min_group_size))
+ {
+ n_supporting += 1;
+ n_supporting_rows += group_size;
+ }
+ else if (n_violations > 0)
+ {
+ n_contradicting += 1;
+ n_contradicting_rows += group_size;
+ }
+
+ /* current values start a new group */
+ n_violations = 0;
+ group_size = 0;
+ }
+ /* first columns match, but the last one does not (so contradicting) */
+ else if (multi_sort_compare_dims((k-1), (k-1), &items[i-1], &items[i], mss) != 0)
+ n_violations += 1;
+
+ group_size += 1;
+ }
+
+ /* handle the last group (just like above) */
+ if ((n_violations == 0) && (group_size >= min_group_size))
+ {
+ n_supporting += 1;
+ n_supporting_rows += group_size;
+ }
+ else if (n_violations)
+ {
+ n_contradicting += 1;
+ n_contradicting_rows += group_size;
+ }
+
+ pfree(items);
+ pfree(values);
+ pfree(isnull);
+ pfree(mss);
+
+ /*
+ * See if the number of rows supporting the association is at least 10x the
+ * number of rows violating the hypothetical dependency.
+ */
+ return (n_supporting_rows > (n_contradicting_rows * 10));
+}
+
+/*
+ * detects functional dependencies between groups of columns
+ *
+ * Generates all possible subsets of columns (variations) and checks if the
+ * last one is determined by the preceding ones. For example given 3 columns,
+ * there are 12 variations (6 for variations on 2 columns, 6 for 3 columns):
+ *
+ * two columns three columns
+ * ----------- -------------
+ * (a) -> c (a,b) -> c
+ * (b) -> c (b,a) -> c
+ * (a) -> b (a,c) -> b
+ * (c) -> b (c,a) -> b
+ * (c) -> a (c,b) -> a
+ * (b) -> a (b,c) -> a
+ *
+ * Clearly some of the variations are redundant, as the order of columns on the
+ * left side does not matter. This is detected in dependency_is_implied, and
+ * those dependencies are ignored.
+ *
+ * We however do not detect that dependencies are transitively implied. For
+ * example given dependencies
+ *
+ * (a) -> b
+ * (b) -> c
+ *
+ * then
+ *
+ * (a) -> c
+ *
+ * is trivially implied. However we don't detect that and all three dependencies
+ * will get included in the resulting set. Eliminating such transitively implied
+ * dependencies is future work.
+ */
+MVDependencies
+build_mv_dependencies(int numrows, HeapTuple *rows, int2vector *attrs,
+ VacAttrStats **stats)
+{
+ int i;
+ int k;
+ int numattrs = attrs->dim1;
+
+ /* result */
+ MVDependencies dependencies = NULL;
+
+ Assert(numattrs >= 2);
+
+ /*
+ * We'll try to build functional dependencies starting from the smallest
+ * ones, covering just 2 columns, up to the largest ones, covering all
+ * columns included in the statistics. We start from the smallest ones
+ * because we want to be able to skip the already implied ones.
+ */
+ for (k = 2; k <= numattrs; k++)
+ {
+ int *dependency; /* array with k elements */
+
+ /* prepare a generator of variations */
+ VariationGenerator generator = generator_init(attrs, k);
+
+ /* generate all possible variations of k values (out of n) */
+ while ((dependency = generator_next(generator, attrs)))
+ {
+ MVDependency d;
+
+ /* skip dependencies that are already trivially implied */
+ if (dependency_is_implied(dependencies, k, dependency, attrs))
+ continue;
+
+ /* also skip dependencies that don't seem to be valid */
+ if (! dependency_is_valid(numrows, rows, k, dependency, stats, attrs))
+ continue;
+
+ d = (MVDependency)palloc0(offsetof(MVDependencyData, attributes)
+ + k * sizeof(int));
+
+ /* copy the dependency, but translate it to actual attnums */
+ d->nattributes = k;
+ for (i = 0; i < k; i++)
+ d->attributes[i] = attrs->values[dependency[i]];
+
+ /* initialize the list of dependencies */
+ if (dependencies == NULL)
+ {
+ dependencies
+ = (MVDependencies)palloc0(sizeof(MVDependenciesData));
+
+ dependencies->magic = MVSTAT_DEPS_MAGIC;
+ dependencies->type = MVSTAT_DEPS_TYPE_BASIC;
+ dependencies->ndeps = 0;
+ }
+
+ dependencies->ndeps++;
+ dependencies = (MVDependencies)repalloc(dependencies,
+ offsetof(MVDependenciesData, deps)
+ + dependencies->ndeps * sizeof(MVDependency));
+
+ dependencies->deps[dependencies->ndeps-1] = d;
+ }
+
+ /* we're done with variations of k elements, so free the generator */
+ generator_free(generator);
+ }
+
+ return dependencies;
+}
+
+
+/*
+ * serialize list of dependencies into a bytea
+ */
+bytea *
+serialize_mv_dependencies(MVDependencies dependencies)
+{
+ int i;
+ bytea * output;
+ char *tmp;
+
+ /* we need to store ndeps, with a number of attributes for each one */
+ Size len = VARHDRSZ + offsetof(MVDependenciesData, deps)
+ + sizeof(int) * dependencies->ndeps;
+
+ /* and also include space for the actual attribute numbers */
+ for (i = 0; i < dependencies->ndeps; i++)
+ len += (sizeof(int16) * dependencies->deps[i]->nattributes);
+
+ output = (bytea*)palloc0(len);
+ SET_VARSIZE(output, len);
+
+ tmp = VARDATA(output);
+
+ /* first, store the number of dimensions / items */
+ memcpy(tmp, dependencies, offsetof(MVDependenciesData, deps));
+ tmp += offsetof(MVDependenciesData, deps);
+
+ /* store number of attributes and attribute numbers for each dependency */
+ for (i = 0; i < dependencies->ndeps; i++)
+ {
+ MVDependency d = dependencies->deps[i];
+
+ memcpy(tmp, &(d->nattributes), sizeof(int));
+ tmp += sizeof(int);
+
+ memcpy(tmp, d->attributes, sizeof(int16) * d->nattributes);
+ tmp += sizeof(int16) * d->nattributes;
+
+ Assert(tmp <= ((char*)output + len));
+ }
+
+ return output;
+}
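+
+/*
+ * For reference, the serialized layout produced above is (after the
+ * varlena header):
+ *
+ * magic (uint32), type (uint32), ndeps (int32)
+ * then, for each dependency:
+ * nattributes (int), attributes[nattributes] (int16 each)
+ */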
+
+/*
+ * Reads serialized dependencies into MVDependencies structure.
+ */
+MVDependencies
+deserialize_mv_dependencies(bytea * data)
+{
+ int i;
+ Size expected_size;
+ MVDependencies dependencies;
+ char *tmp;
+
+ if (data == NULL)
+ return NULL;
+
+ if (VARSIZE_ANY_EXHDR(data) < offsetof(MVDependenciesData,deps))
+ elog(ERROR, "invalid MVDependencies size %ld (expected at least %ld)",
+ VARSIZE_ANY_EXHDR(data), offsetof(MVDependenciesData,deps));
+
+ /* read the MVDependencies header */
+ dependencies = (MVDependencies)palloc0(sizeof(MVDependenciesData));
+
+ /* initialize pointer to the data part (skip the varlena header) */
+ tmp = VARDATA(data);
+
+ /* get the header and perform basic sanity checks */
+ memcpy(dependencies, tmp, offsetof(MVDependenciesData, deps));
+ tmp += offsetof(MVDependenciesData, deps);
+
+ if (dependencies->magic != MVSTAT_DEPS_MAGIC)
+ elog(ERROR, "invalid dependency magic %u (expected %u)",
+ dependencies->magic, MVSTAT_DEPS_MAGIC);
+
+ if (dependencies->type != MVSTAT_DEPS_TYPE_BASIC)
+ elog(ERROR, "invalid dependency type %d (expected %dd)",
+ dependencies->type, MVSTAT_DEPS_TYPE_BASIC);
+
+ Assert(dependencies->ndeps > 0);
+
+ /* what minimum bytea size do we expect for those parameters */
+ expected_size = offsetof(MVDependenciesData,deps) +
+ dependencies->ndeps * (sizeof(int) + sizeof(int16) * 2);
+
+ if (VARSIZE_ANY_EXHDR(data) < expected_size)
+ elog(ERROR, "invalid dependencies size %ld (expected at least %ld)",
+ VARSIZE_ANY_EXHDR(data), expected_size);
+
+ /* allocate space for the dependency pointers */
+ dependencies = repalloc(dependencies, offsetof(MVDependenciesData,deps)
+ + (dependencies->ndeps * sizeof(MVDependency)));
+
+ for (i = 0; i < dependencies->ndeps; i++)
+ {
+ int k;
+ MVDependency d;
+
+ /* number of attributes */
+ memcpy(&k, tmp, sizeof(int));
+ tmp += sizeof(int);
+
+ /* is the number of attributes valid? */
+ Assert((k >= 2) && (k <= MVSTATS_MAX_DIMENSIONS));
+
+ /* now that we know the number of attributes, allocate the dependency */
+ d = (MVDependency)palloc0(offsetof(MVDependencyData, attributes)
+ + k * sizeof(int));
+
+ d->nattributes = k;
+
+ /* copy attribute numbers */
+ memcpy(d->attributes, tmp, sizeof(int16) * d->nattributes);
+ tmp += sizeof(int16) * d->nattributes;
+
+ dependencies->deps[i] = d;
+
+ /* still within the bytea */
+ Assert(tmp <= ((char*)data + VARSIZE_ANY(data)));
+ }
+
+ /* we should have consumed the whole bytea exactly */
+ Assert(tmp == ((char*)data + VARSIZE_ANY(data)));
+
+ return dependencies;
+}
+
+/* print some basic info about dependencies (number of dependencies) */
+Datum
+pg_mv_stats_dependencies_info(PG_FUNCTION_ARGS)
+{
+ bytea *data = PG_GETARG_BYTEA_P(0);
+ char *result;
+
+ MVDependencies dependencies = deserialize_mv_dependencies(data);
+
+ if (dependencies == NULL)
+ PG_RETURN_NULL();
+
+ result = palloc0(128);
+ snprintf(result, 128, "dependencies=%d", dependencies->ndeps);
+
+ /* FIXME free the deserialized data (pfree is not enough) */
+
+ PG_RETURN_TEXT_P(cstring_to_text(result));
+}
+
+/*
+ * print the dependencies
+ *
+ * TODO Would be nice if this printed column names (instead of just attnums).
+ */
+Datum
+pg_mv_stats_dependencies_show(PG_FUNCTION_ARGS)
+{
+ int i, j;
+ bytea *data = PG_GETARG_BYTEA_P(0);
+ StringInfoData buf;
+
+ MVDependencies dependencies = deserialize_mv_dependencies(data);
+
+ if (dependencies == NULL)
+ PG_RETURN_NULL();
+
+ initStringInfo(&buf);
+
+ for (i = 0; i < dependencies->ndeps; i++)
+ {
+ MVDependency dependency = dependencies->deps[i];
+
+ if (i > 0)
+ appendStringInfo(&buf, ", ");
+
+ /* conditions */
+ appendStringInfoChar(&buf, '(');
+ for (j = 0; j < dependency->nattributes-1; j++)
+ {
+ if (j > 0)
+ appendStringInfoChar(&buf, ',');
+
+ appendStringInfo(&buf, "%d", dependency->attributes[j]);
+ }
+
+ /* the implied attribute */
+ appendStringInfo(&buf, ") => %d",
+ dependency->attributes[dependency->nattributes-1]);
+ }
+
+ PG_RETURN_TEXT_P(cstring_to_text(buf.data));
+}
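+
+/*
+ * A usage sketch (mirroring the regression tests): both functions are
+ * intended to be called on pg_mv_statistic.stadeps, e.g.
+ *
+ * SELECT staname, pg_mv_stats_dependencies_info(stadeps),
+ * pg_mv_stats_dependencies_show(stadeps)
+ * FROM pg_mv_statistic;
+ */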
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index fd8dc91..8ce9c0e 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -2104,6 +2104,50 @@ describeOneTableDetails(const char *schemaname,
PQclear(result);
}
+ /* print any multivariate statistics */
+ if (pset.sversion >= 90600)
+ {
+ printfPQExpBuffer(&buf,
+ "SELECT oid, stanamespace::regnamespace AS nsp, staname, stakeys,\n"
+ " deps_enabled,\n"
+ " deps_built,\n"
+ " (SELECT string_agg(attname::text,', ')\n"
+ " FROM ((SELECT unnest(stakeys) AS attnum) s\n"
+ " JOIN pg_attribute a ON (starelid = a.attrelid and a.attnum = s.attnum))) AS attnums\n"
+ "FROM pg_mv_statistic stat WHERE starelid = '%s' ORDER BY 1;",
+ oid);
+
+ result = PSQLexec(buf.data);
+ if (!result)
+ goto error_return;
+ else
+ tuples = PQntuples(result);
+
+ if (tuples > 0)
+ {
+ printTableAddFooter(&cont, _("Statistics:"));
+ for (i = 0; i < tuples; i++)
+ {
+ printfPQExpBuffer(&buf, " ");
+
+ /* statistics name (qualified with namespace) */
+ appendPQExpBuffer(&buf, "\"%s.%s\" ",
+ PQgetvalue(result, i, 1),
+ PQgetvalue(result, i, 2));
+
+ /* options */
+ if (!strcmp(PQgetvalue(result, i, 4), "t"))
+ appendPQExpBuffer(&buf, "(dependencies)");
+
+ appendPQExpBuffer(&buf, " ON (%s)",
+ PQgetvalue(result, i, 6));
+
+ printTableAddFooter(&cont, buf.data);
+ }
+ }
+ PQclear(result);
+ }
+
/* print rules */
if (tableinfo.hasrules && tableinfo.relkind != 'm')
{
diff --git a/src/include/catalog/dependency.h b/src/include/catalog/dependency.h
index 049bf9f..12211fe 100644
--- a/src/include/catalog/dependency.h
+++ b/src/include/catalog/dependency.h
@@ -153,10 +153,11 @@ typedef enum ObjectClass
OCLASS_EXTENSION, /* pg_extension */
OCLASS_EVENT_TRIGGER, /* pg_event_trigger */
OCLASS_POLICY, /* pg_policy */
- OCLASS_TRANSFORM /* pg_transform */
+ OCLASS_TRANSFORM, /* pg_transform */
+ OCLASS_STATISTICS /* pg_mv_statistics */
} ObjectClass;
-#define LAST_OCLASS OCLASS_TRANSFORM
+#define LAST_OCLASS OCLASS_STATISTICS
/* in dependency.c */
diff --git a/src/include/catalog/heap.h b/src/include/catalog/heap.h
index b80d8d8..5ae42f7 100644
--- a/src/include/catalog/heap.h
+++ b/src/include/catalog/heap.h
@@ -119,6 +119,7 @@ extern void RemoveAttrDefault(Oid relid, AttrNumber attnum,
DropBehavior behavior, bool complain, bool internal);
extern void RemoveAttrDefaultById(Oid attrdefId);
extern void RemoveStatistics(Oid relid, AttrNumber attnum);
+extern void RemoveMVStatistics(Oid relid, AttrNumber attnum);
extern Form_pg_attribute SystemAttributeDefinition(AttrNumber attno,
bool relhasoids);
diff --git a/src/include/catalog/indexing.h b/src/include/catalog/indexing.h
index ab2c1a8..a768bb5 100644
--- a/src/include/catalog/indexing.h
+++ b/src/include/catalog/indexing.h
@@ -173,6 +173,13 @@ DECLARE_UNIQUE_INDEX(pg_largeobject_loid_pn_index, 2683, on pg_largeobject using
DECLARE_UNIQUE_INDEX(pg_largeobject_metadata_oid_index, 2996, on pg_largeobject_metadata using btree(oid oid_ops));
#define LargeObjectMetadataOidIndexId 2996
+DECLARE_UNIQUE_INDEX(pg_mv_statistic_oid_index, 3380, on pg_mv_statistic using btree(oid oid_ops));
+#define MvStatisticOidIndexId 3380
+DECLARE_UNIQUE_INDEX(pg_mv_statistic_name_index, 3997, on pg_mv_statistic using btree(staname name_ops, stanamespace oid_ops));
+#define MvStatisticNameIndexId 3997
+DECLARE_INDEX(pg_mv_statistic_relid_index, 3379, on pg_mv_statistic using btree(starelid oid_ops));
+#define MvStatisticRelidIndexId 3379
+
DECLARE_UNIQUE_INDEX(pg_namespace_nspname_index, 2684, on pg_namespace using btree(nspname name_ops));
#define NamespaceNameIndexId 2684
DECLARE_UNIQUE_INDEX(pg_namespace_oid_index, 2685, on pg_namespace using btree(oid oid_ops));
diff --git a/src/include/catalog/namespace.h b/src/include/catalog/namespace.h
index 2ccb3a7..44cf9c6 100644
--- a/src/include/catalog/namespace.h
+++ b/src/include/catalog/namespace.h
@@ -137,6 +137,8 @@ extern Oid get_collation_oid(List *collname, bool missing_ok);
extern Oid get_conversion_oid(List *conname, bool missing_ok);
extern Oid FindDefaultConversionProc(int32 for_encoding, int32 to_encoding);
+extern Oid get_statistics_oid(List *names, bool missing_ok);
+
/* initialization & transaction cleanup code */
extern void InitializeSearchPath(void);
extern void AtEOXact_Namespace(bool isCommit, bool parallel);
diff --git a/src/include/catalog/pg_mv_statistic.h b/src/include/catalog/pg_mv_statistic.h
new file mode 100644
index 0000000..c74af47
--- /dev/null
+++ b/src/include/catalog/pg_mv_statistic.h
@@ -0,0 +1,75 @@
+/*-------------------------------------------------------------------------
+ *
+ * pg_mv_statistic.h
+ * definition of the system "multivariate statistic" relation (pg_mv_statistic)
+ * along with the relation's initial contents.
+ *
+ *
+ * Portions Copyright (c) 1996-2014, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/catalog/pg_mv_statistic.h
+ *
+ * NOTES
+ * the genbki.pl script reads this file and generates .bki
+ * information from the DATA() statements.
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef PG_MV_STATISTIC_H
+#define PG_MV_STATISTIC_H
+
+#include "catalog/genbki.h"
+
+/* ----------------
+ * pg_mv_statistic definition. cpp turns this into
+ * typedef struct FormData_pg_mv_statistic
+ * ----------------
+ */
+#define MvStatisticRelationId 3381
+
+CATALOG(pg_mv_statistic,3381)
+{
+ /* These fields form the unique key for the entry: */
+ Oid starelid; /* relation containing attributes */
+ NameData staname; /* statistics name */
+ Oid stanamespace; /* OID of namespace containing this statistics */
+ Oid staowner; /* statistics owner */
+
+ /* statistics requested to build */
+ bool deps_enabled; /* analyze dependencies? */
+
+ /* statistics that are available (if requested) */
+ bool deps_built; /* dependencies were built */
+
+ /* variable-length fields start here, but we allow direct access to stakeys */
+ int2vector stakeys; /* array of column keys */
+
+#ifdef CATALOG_VARLEN
+ bytea stadeps; /* dependencies (serialized) */
+#endif
+
+} FormData_pg_mv_statistic;
+
+/* ----------------
+ * Form_pg_mv_statistic corresponds to a pointer to a tuple with
+ * the format of pg_mv_statistic relation.
+ * ----------------
+ */
+typedef FormData_pg_mv_statistic *Form_pg_mv_statistic;
+
+/* ----------------
+ * compiler constants for pg_mv_statistic
+ * ----------------
+ */
+#define Natts_pg_mv_statistic 8
+#define Anum_pg_mv_statistic_starelid 1
+#define Anum_pg_mv_statistic_staname 2
+#define Anum_pg_mv_statistic_stanamespace 3
+#define Anum_pg_mv_statistic_staowner 4
+#define Anum_pg_mv_statistic_deps_enabled 5
+#define Anum_pg_mv_statistic_deps_built 6
+#define Anum_pg_mv_statistic_stakeys 7
+#define Anum_pg_mv_statistic_stadeps 8
+
+#endif /* PG_MV_STATISTIC_H */
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
index ceb8129..cdcbf95 100644
--- a/src/include/catalog/pg_proc.h
+++ b/src/include/catalog/pg_proc.h
@@ -2666,6 +2666,11 @@ DESCR("current user privilege on any column by rel name");
DATA(insert OID = 3029 ( has_any_column_privilege PGNSP PGUID 12 10 0 0 0 f f f f t f s s 2 0 16 "26 25" _null_ _null_ _null_ _null_ _null_ has_any_column_privilege_id _null_ _null_ _null_ ));
DESCR("current user privilege on any column by rel oid");
+DATA(insert OID = 3998 ( pg_mv_stats_dependencies_info PGNSP PGUID 12 1 0 0 0 f f f f t f i s 1 0 25 "17" _null_ _null_ _null_ _null_ _null_ pg_mv_stats_dependencies_info _null_ _null_ _null_ ));
+DESCR("multivariate stats: functional dependencies info");
+DATA(insert OID = 3999 ( pg_mv_stats_dependencies_show PGNSP PGUID 12 1 0 0 0 f f f f t f i s 1 0 25 "17" _null_ _null_ _null_ _null_ _null_ pg_mv_stats_dependencies_show _null_ _null_ _null_ ));
+DESCR("multivariate stats: functional dependencies show");
+
DATA(insert OID = 1928 ( pg_stat_get_numscans PGNSP PGUID 12 1 0 0 0 f f f f t f s r 1 0 20 "26" _null_ _null_ _null_ _null_ _null_ pg_stat_get_numscans _null_ _null_ _null_ ));
DESCR("statistics: number of scans done for table/index");
DATA(insert OID = 1929 ( pg_stat_get_tuples_returned PGNSP PGUID 12 1 0 0 0 f f f f t f s r 1 0 20 "26" _null_ _null_ _null_ _null_ _null_ pg_stat_get_tuples_returned _null_ _null_ _null_ ));
diff --git a/src/include/catalog/toasting.h b/src/include/catalog/toasting.h
index b7a38ce..a52096b 100644
--- a/src/include/catalog/toasting.h
+++ b/src/include/catalog/toasting.h
@@ -49,6 +49,7 @@ extern void BootstrapToastTable(char *relName,
DECLARE_TOAST(pg_attrdef, 2830, 2831);
DECLARE_TOAST(pg_constraint, 2832, 2833);
DECLARE_TOAST(pg_description, 2834, 2835);
+DECLARE_TOAST(pg_mv_statistic, 3577, 3578);
DECLARE_TOAST(pg_proc, 2836, 2837);
DECLARE_TOAST(pg_rewrite, 2838, 2839);
DECLARE_TOAST(pg_seclabel, 3598, 3599);
diff --git a/src/include/commands/defrem.h b/src/include/commands/defrem.h
index 54f67e9..99a6a62 100644
--- a/src/include/commands/defrem.h
+++ b/src/include/commands/defrem.h
@@ -75,6 +75,10 @@ extern ObjectAddress DefineOperator(List *names, List *parameters);
extern void RemoveOperatorById(Oid operOid);
extern ObjectAddress AlterOperator(AlterOperatorStmt *stmt);
+/* commands/statscmds.c */
+extern ObjectAddress CreateStatistics(CreateStatsStmt *stmt);
+extern void RemoveStatisticsById(Oid statsOid);
+
/* commands/aggregatecmds.c */
extern ObjectAddress DefineAggregate(List *name, List *args, bool oldstyle,
List *parameters, const char *queryString);
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index fad9988..545b62a 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -266,6 +266,7 @@ typedef enum NodeTag
T_PlaceHolderInfo,
T_MinMaxAggInfo,
T_PlannerParamItem,
+ T_MVStatisticInfo,
/*
* TAGS FOR MEMORY NODES (memnodes.h)
@@ -401,6 +402,7 @@ typedef enum NodeTag
T_CreatePolicyStmt,
T_AlterPolicyStmt,
T_CreateTransformStmt,
+ T_CreateStatsStmt,
/*
* TAGS FOR PARSE TREE NODES (parsenodes.h)
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 2fd0629..e1807fb 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -601,6 +601,17 @@ typedef struct ColumnDef
int location; /* parse location, or -1 if none/unknown */
} ColumnDef;
+typedef struct CreateStatsStmt
+{
+ NodeTag type;
+ List *defnames; /* qualified name (list of Value strings) */
+ RangeVar *relation; /* relation to build statistics on */
+ List *keys; /* String nodes naming referenced column(s) */
+ List *options; /* list of DefElem nodes */
+ bool if_not_exists; /* just do nothing if statistics already exists? */
+} CreateStatsStmt;
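+
+/*
+ * For example (as exercised by the regression tests), this node
+ * represents statements like
+ *
+ * CREATE STATISTICS s1 ON functional_dependencies (a, b, c)
+ * WITH (dependencies);
+ */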
+
+
/*
* TableLikeClause - CREATE TABLE ( ... LIKE ... ) clause
*/
@@ -1410,6 +1421,7 @@ typedef enum ObjectType
OBJECT_RULE,
OBJECT_SCHEMA,
OBJECT_SEQUENCE,
+ OBJECT_STATISTICS,
OBJECT_TABCONSTRAINT,
OBJECT_TABLE,
OBJECT_TABLESPACE,
diff --git a/src/include/nodes/relation.h b/src/include/nodes/relation.h
index bdea72c..75c4752 100644
--- a/src/include/nodes/relation.h
+++ b/src/include/nodes/relation.h
@@ -541,6 +541,7 @@ typedef struct RelOptInfo
List *lateral_vars; /* LATERAL Vars and PHVs referenced by rel */
Relids lateral_referencers; /* rels that reference me laterally */
List *indexlist; /* list of IndexOptInfo */
+ List *mvstatlist; /* list of MVStatisticInfo */
BlockNumber pages; /* size estimates derived from pg_class */
double tuples;
double allvisfrac;
@@ -636,6 +637,33 @@ typedef struct IndexOptInfo
void (*amcostestimate) (); /* AM's cost estimator */
} IndexOptInfo;
+/*
+ * MVStatisticInfo
+ * Information about multivariate stats for planning/optimization
+ *
+ * This contains information about which columns are covered by the
+ * statistics (stakeys), which options were requested while adding the
+ * statistics (*_enabled), and which kinds of statistics were actually
+ * built and are available for the optimizer (*_built).
+ */
+typedef struct MVStatisticInfo
+{
+ NodeTag type;
+
+ Oid mvoid; /* OID of the statistics row */
+ RelOptInfo *rel; /* back-link to index's table */
+
+ /* enabled statistics */
+ bool deps_enabled; /* functional dependencies enabled */
+
+ /* built/available statistics */
+ bool deps_built; /* functional dependencies built */
+
+ /* columns in the statistics (attnums) */
+ int2vector *stakeys; /* attnums of the columns covered */
+
+} MVStatisticInfo;
+
/*
* EquivalenceClasses
diff --git a/src/include/utils/acl.h b/src/include/utils/acl.h
index 4e15a14..3e11253 100644
--- a/src/include/utils/acl.h
+++ b/src/include/utils/acl.h
@@ -330,6 +330,7 @@ extern bool pg_foreign_data_wrapper_ownercheck(Oid srv_oid, Oid roleid);
extern bool pg_foreign_server_ownercheck(Oid srv_oid, Oid roleid);
extern bool pg_event_trigger_ownercheck(Oid et_oid, Oid roleid);
extern bool pg_extension_ownercheck(Oid ext_oid, Oid roleid);
+extern bool pg_statistics_ownercheck(Oid stat_oid, Oid roleid);
extern bool has_createrole_privilege(Oid roleid);
extern bool has_bypassrls_privilege(Oid roleid);
diff --git a/src/include/utils/mvstats.h b/src/include/utils/mvstats.h
new file mode 100644
index 0000000..7837bc0
--- /dev/null
+++ b/src/include/utils/mvstats.h
@@ -0,0 +1,71 @@
+/*-------------------------------------------------------------------------
+ *
+ * mvstats.h
+ * Multivariate statistics and selectivity estimation functions.
+ *
+ *
+ * Portions Copyright (c) 1996-2014, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/utils/mvstats.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef MVSTATS_H
+#define MVSTATS_H
+
+#include "fmgr.h"
+#include "commands/vacuum.h"
+
+
+#define MVSTATS_MAX_DIMENSIONS 8 /* max number of attributes */
+
+/*
+ * Functional dependencies, tracking column-level relationships (values
+ * in one column determine values in another one).
+ */
+typedef struct MVDependencyData {
+ int nattributes; /* number of attributes */
+ int16 attributes[1]; /* attribute numbers */
+} MVDependencyData;
+
+typedef MVDependencyData* MVDependency;
+
+typedef struct MVDependenciesData {
+ uint32 magic; /* magic constant marker */
+ uint32 type; /* type of MV Dependencies (BASIC) */
+ int32 ndeps; /* number of dependencies */
+ MVDependency deps[1]; /* XXX why not a pointer? */
+} MVDependenciesData;
+
+typedef MVDependenciesData* MVDependencies;
+
+#define MVSTAT_DEPS_MAGIC 0xB4549A2C /* marks serialized bytea */
+#define MVSTAT_DEPS_TYPE_BASIC 1 /* basic dependencies type */
+
+/*
+ * TODO Maybe fetching the histogram/MCV list separately is inefficient?
+ * Consider adding a single `fetch_stats` method, fetching all
+ * stats specified using flags (or something like that).
+ */
+
+bytea * serialize_mv_dependencies(MVDependencies dependencies);
+
+/* deserialization of stats (serialization is private to analyze) */
+MVDependencies deserialize_mv_dependencies(bytea * data);
+
+/* FIXME this probably belongs somewhere else (not to operations stats) */
+extern Datum pg_mv_stats_dependencies_info(PG_FUNCTION_ARGS);
+extern Datum pg_mv_stats_dependencies_show(PG_FUNCTION_ARGS);
+
+MVDependencies
+build_mv_dependencies(int numrows, HeapTuple *rows,
+ int2vector *attrs,
+ VacAttrStats **stats);
+
+void build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
+ int natts, VacAttrStats **vacattrstats);
+
+void update_mv_stats(Oid relid, MVDependencies dependencies, int2vector *attrs);
+
+#endif
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index f2bebf2..8771f9c 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -61,6 +61,7 @@ typedef struct RelationData
bool rd_isvalid; /* relcache entry is valid */
char rd_indexvalid; /* state of rd_indexlist: 0 = not valid, 1 =
* valid, 2 = temporarily forced */
+ bool rd_mvstatvalid; /* state of rd_mvstatlist: true/false */
/*
* rd_createSubid is the ID of the highest subtransaction the rel has
@@ -93,6 +94,9 @@ typedef struct RelationData
List *rd_indexlist; /* list of OIDs of indexes on relation */
Oid rd_oidindex; /* OID of unique index on OID, if any */
Oid rd_replidindex; /* OID of replica identity index, if any */
+
+ /* data managed by RelationGetMVStatList: */
+ List *rd_mvstatlist; /* list of OIDs of multivariate stats */
/* data managed by RelationGetIndexAttrBitmap: */
Bitmapset *rd_indexattr; /* identifies columns used in indexes */
diff --git a/src/include/utils/relcache.h b/src/include/utils/relcache.h
index 1b48304..9f03c8d 100644
--- a/src/include/utils/relcache.h
+++ b/src/include/utils/relcache.h
@@ -38,6 +38,7 @@ extern void RelationClose(Relation relation);
* Routines to compute/retrieve additional cached information
*/
extern List *RelationGetIndexList(Relation relation);
+extern List *RelationGetMVStatList(Relation relation);
extern Oid RelationGetOidIndex(Relation relation);
extern Oid RelationGetReplicaIndex(Relation relation);
extern List *RelationGetIndexExpressions(Relation relation);
diff --git a/src/include/utils/syscache.h b/src/include/utils/syscache.h
index 256615b..0e0658d 100644
--- a/src/include/utils/syscache.h
+++ b/src/include/utils/syscache.h
@@ -66,6 +66,8 @@ enum SysCacheIdentifier
INDEXRELID,
LANGNAME,
LANGOID,
+ MVSTATNAMENSP,
+ MVSTATOID,
NAMESPACENAME,
NAMESPACEOID,
OPERNAMENSP,
diff --git a/src/test/regress/expected/mv_dependencies.out b/src/test/regress/expected/mv_dependencies.out
new file mode 100644
index 0000000..f54e1b7
--- /dev/null
+++ b/src/test/regress/expected/mv_dependencies.out
@@ -0,0 +1,150 @@
+-- data type passed by value
+CREATE TABLE functional_dependencies (
+ a INT,
+ b INT,
+ c INT
+);
+-- unknown column
+CREATE STATISTICS s1 ON functional_dependencies (unknown_column) WITH (dependencies);
+ERROR: column "unknown_column" referenced in statistics does not exist
+-- single column
+CREATE STATISTICS s1 ON functional_dependencies (a) WITH (dependencies);
+ERROR: multivariate stats require 2 or more columns
+-- single column, duplicated
+CREATE STATISTICS s1 ON functional_dependencies (a,a) WITH (dependencies);
+ERROR: duplicate column name in statistics definition
+-- two columns, one duplicated
+CREATE STATISTICS s1 ON functional_dependencies (a, a, b) WITH (dependencies);
+ERROR: duplicate column name in statistics definition
+-- unknown option
+CREATE STATISTICS s1 ON functional_dependencies (a, b, c) WITH (unknown_option);
+ERROR: unrecognized STATISTICS option "unknown_option"
+-- correct command
+CREATE STATISTICS s1 ON functional_dependencies (a, b, c) WITH (dependencies);
+-- random data (no functional dependencies)
+INSERT INTO functional_dependencies
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+ANALYZE functional_dependencies;
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+ deps_enabled | deps_built | pg_mv_stats_dependencies_show
+--------------+------------+-------------------------------
+ t | f |
+(1 row)
+
+TRUNCATE functional_dependencies;
+-- a => b, a => c, b => c
+INSERT INTO functional_dependencies
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE functional_dependencies;
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+ deps_enabled | deps_built | pg_mv_stats_dependencies_show
+--------------+------------+-------------------------------
+ t | t | (1) => 2, (1) => 3, (2) => 3
+(1 row)
+
+TRUNCATE functional_dependencies;
+-- a => b, a => c
+INSERT INTO functional_dependencies
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE functional_dependencies;
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+ deps_enabled | deps_built | pg_mv_stats_dependencies_show
+--------------+------------+-------------------------------
+ t | t | (1) => 2, (1) => 3
+(1 row)
+
+TRUNCATE functional_dependencies;
+-- a => b, a => c, b => c
+INSERT INTO functional_dependencies
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+ANALYZE functional_dependencies;
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+ deps_enabled | deps_built | pg_mv_stats_dependencies_show
+--------------+------------+-------------------------------
+ t | t | (1) => 2, (1) => 3, (2) => 3
+(1 row)
+
+DROP TABLE functional_dependencies;
+-- varlena type (text)
+CREATE TABLE functional_dependencies (
+ a TEXT,
+ b TEXT,
+ c TEXT
+);
+CREATE STATISTICS s2 ON functional_dependencies (a, b, c) WITH (dependencies);
+-- random data (no functional dependencies)
+INSERT INTO functional_dependencies
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+ANALYZE functional_dependencies;
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+ deps_enabled | deps_built | pg_mv_stats_dependencies_show
+--------------+------------+-------------------------------
+ t | f |
+(1 row)
+
+TRUNCATE functional_dependencies;
+-- a => b, a => c, b => c
+INSERT INTO functional_dependencies
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE functional_dependencies;
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+ deps_enabled | deps_built | pg_mv_stats_dependencies_show
+--------------+------------+-------------------------------
+ t | t | (1) => 2, (1) => 3, (2) => 3
+(1 row)
+
+TRUNCATE functional_dependencies;
+-- a => b, a => c
+INSERT INTO functional_dependencies
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE functional_dependencies;
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+ deps_enabled | deps_built | pg_mv_stats_dependencies_show
+--------------+------------+-------------------------------
+ t | t | (1) => 2, (1) => 3
+(1 row)
+
+TRUNCATE functional_dependencies;
+-- a => b, a => c, b => c
+INSERT INTO functional_dependencies
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+ANALYZE functional_dependencies;
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+ deps_enabled | deps_built | pg_mv_stats_dependencies_show
+--------------+------------+-------------------------------
+ t | t | (1) => 2, (1) => 3, (2) => 3
+(1 row)
+
+DROP TABLE functional_dependencies;
+-- NULL values (mix of int and text columns)
+CREATE TABLE functional_dependencies (
+ a INT,
+ b TEXT,
+ c INT,
+ d TEXT
+);
+CREATE STATISTICS s3 ON functional_dependencies (a, b, c, d) WITH (dependencies);
+INSERT INTO functional_dependencies
+ SELECT
+ mod(i, 100),
+ (CASE WHEN mod(i, 200) = 0 THEN NULL ELSE mod(i,200) END),
+ mod(i, 400),
+ (CASE WHEN mod(i, 300) = 0 THEN NULL ELSE mod(i,600) END)
+ FROM generate_series(1,10000) s(i);
+ANALYZE functional_dependencies;
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+ deps_enabled | deps_built | pg_mv_stats_dependencies_show
+--------------+------------+--------------------------------------------------
+ t | t | (2) => 1, (3) => 1, (3) => 2, (4) => 1, (4) => 2
+(1 row)
+
+DROP TABLE functional_dependencies;
diff --git a/src/test/regress/expected/object_address.out b/src/test/regress/expected/object_address.out
index 75751be..eb60960 100644
--- a/src/test/regress/expected/object_address.out
+++ b/src/test/regress/expected/object_address.out
@@ -35,6 +35,7 @@ ALTER DEFAULT PRIVILEGES FOR ROLE regtest_addr_user REVOKE DELETE ON TABLES FROM
CREATE TRANSFORM FOR int LANGUAGE SQL (
FROM SQL WITH FUNCTION varchar_transform(internal),
TO SQL WITH FUNCTION int4recv(internal));
+CREATE STATISTICS addr_nsp.gentable_stat ON addr_nsp.gentable(a,b) WITH (dependencies);
-- test some error cases
SELECT pg_get_object_address('stone', '{}', '{}');
ERROR: unrecognized object type "stone"
@@ -373,7 +374,8 @@ WITH objects (type, name, args) AS (VALUES
-- extension
-- event trigger
('policy', '{addr_nsp, gentable, genpol}', '{}'),
- ('transform', '{int}', '{sql}')
+ ('transform', '{int}', '{sql}'),
+ ('statistics', '{addr_nsp, gentable_stat}', '{}')
)
SELECT (pg_identify_object(addr1.classid, addr1.objid, addr1.subobjid)).*,
-- test roundtrip through pg_identify_object_as_address
@@ -420,13 +422,14 @@ SELECT (pg_identify_object(addr1.classid, addr1.objid, addr1.subobjid)).*,
trigger | | | t on addr_nsp.gentable | t
operator family | pg_catalog | integer_ops | pg_catalog.integer_ops USING btree | t
policy | | | genpol on addr_nsp.gentable | t
+ statistics | addr_nsp | gentable_stat | addr_nsp.gentable_stat | t
collation | pg_catalog | "default" | pg_catalog."default" | t
transform | | | for integer on language sql | t
text search dictionary | addr_nsp | addr_ts_dict | addr_nsp.addr_ts_dict | t
text search parser | addr_nsp | addr_ts_prs | addr_nsp.addr_ts_prs | t
text search configuration | addr_nsp | addr_ts_conf | addr_nsp.addr_ts_conf | t
text search template | addr_nsp | addr_ts_temp | addr_nsp.addr_ts_temp | t
-(41 rows)
+(42 rows)
---
--- Cleanup resources
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 22ea06c..06f2231 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1368,6 +1368,15 @@ pg_matviews| SELECT n.nspname AS schemaname,
LEFT JOIN pg_namespace n ON ((n.oid = c.relnamespace)))
LEFT JOIN pg_tablespace t ON ((t.oid = c.reltablespace)))
WHERE (c.relkind = 'm'::"char");
+pg_mv_stats| SELECT n.nspname AS schemaname,
+ c.relname AS tablename,
+ s.staname,
+ s.stakeys AS attnums,
+ length(s.stadeps) AS depsbytes,
+ pg_mv_stats_dependencies_info(s.stadeps) AS depsinfo
+ FROM ((pg_mv_statistic s
+ JOIN pg_class c ON ((c.oid = s.starelid)))
+ LEFT JOIN pg_namespace n ON ((n.oid = c.relnamespace)));
pg_policies| SELECT n.nspname AS schemaname,
c.relname AS tablename,
pol.polname AS policyname,
diff --git a/src/test/regress/expected/sanity_check.out b/src/test/regress/expected/sanity_check.out
index eb0bc88..92a0d8a 100644
--- a/src/test/regress/expected/sanity_check.out
+++ b/src/test/regress/expected/sanity_check.out
@@ -113,6 +113,7 @@ pg_inherits|t
pg_language|t
pg_largeobject|t
pg_largeobject_metadata|t
+pg_mv_statistic|t
pg_namespace|t
pg_opclass|t
pg_operator|t
diff --git a/src/test/regress/serial_schedule b/src/test/regress/serial_schedule
index 7e9b319..097a04f 100644
--- a/src/test/regress/serial_schedule
+++ b/src/test/regress/serial_schedule
@@ -162,3 +162,4 @@ test: with
test: xml
test: event_trigger
test: stats
+test: mv_dependencies
diff --git a/src/test/regress/sql/mv_dependencies.sql b/src/test/regress/sql/mv_dependencies.sql
new file mode 100644
index 0000000..051633a
--- /dev/null
+++ b/src/test/regress/sql/mv_dependencies.sql
@@ -0,0 +1,142 @@
+-- data type passed by value
+CREATE TABLE functional_dependencies (
+ a INT,
+ b INT,
+ c INT
+);
+
+-- unknown column
+CREATE STATISTICS s1 ON functional_dependencies (unknown_column) WITH (dependencies);
+
+-- single column
+CREATE STATISTICS s1 ON functional_dependencies (a) WITH (dependencies);
+
+-- single column, duplicated
+CREATE STATISTICS s1 ON functional_dependencies (a,a) WITH (dependencies);
+
+-- two columns, one duplicated
+CREATE STATISTICS s1 ON functional_dependencies (a, a, b) WITH (dependencies);
+
+-- unknown option
+CREATE STATISTICS s1 ON functional_dependencies (a, b, c) WITH (unknown_option);
+
+-- correct command
+CREATE STATISTICS s1 ON functional_dependencies (a, b, c) WITH (dependencies);
+
+-- random data (no functional dependencies)
+INSERT INTO functional_dependencies
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+
+ANALYZE functional_dependencies;
+
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+
+TRUNCATE functional_dependencies;
+
+-- a => b, a => c, b => c
+INSERT INTO functional_dependencies
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+
+ANALYZE functional_dependencies;
+
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+
+TRUNCATE functional_dependencies;
+
+-- a => b, a => c
+INSERT INTO functional_dependencies
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE functional_dependencies;
+
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+
+TRUNCATE functional_dependencies;
+
+-- a => b, a => c, b => c
+INSERT INTO functional_dependencies
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+ANALYZE functional_dependencies;
+
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+
+DROP TABLE functional_dependencies;
+
+-- varlena type (text)
+CREATE TABLE functional_dependencies (
+ a TEXT,
+ b TEXT,
+ c TEXT
+);
+
+CREATE STATISTICS s2 ON functional_dependencies (a, b, c) WITH (dependencies);
+
+-- random data (no functional dependencies)
+INSERT INTO functional_dependencies
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+
+ANALYZE functional_dependencies;
+
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+
+TRUNCATE functional_dependencies;
+
+-- a => b, a => c, b => c
+INSERT INTO functional_dependencies
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+
+ANALYZE functional_dependencies;
+
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+
+TRUNCATE functional_dependencies;
+
+-- a => b, a => c
+INSERT INTO functional_dependencies
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE functional_dependencies;
+
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+
+TRUNCATE functional_dependencies;
+
+-- a => b, a => c, b => c
+INSERT INTO functional_dependencies
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+ANALYZE functional_dependencies;
+
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+
+DROP TABLE functional_dependencies;
+
+-- NULL values (mix of int and text columns)
+CREATE TABLE functional_dependencies (
+ a INT,
+ b TEXT,
+ c INT,
+ d TEXT
+);
+
+CREATE STATISTICS s3 ON functional_dependencies (a, b, c, d) WITH (dependencies);
+
+INSERT INTO functional_dependencies
+ SELECT
+ mod(i, 100),
+ (CASE WHEN mod(i, 200) = 0 THEN NULL ELSE mod(i,200) END),
+ mod(i, 400),
+ (CASE WHEN mod(i, 300) = 0 THEN NULL ELSE mod(i,600) END)
+ FROM generate_series(1,10000) s(i);
+
+ANALYZE functional_dependencies;
+
+SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
+ FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+
+DROP TABLE functional_dependencies;
diff --git a/src/test/regress/sql/object_address.sql b/src/test/regress/sql/object_address.sql
index 68e7cb0..3775b28 100644
--- a/src/test/regress/sql/object_address.sql
+++ b/src/test/regress/sql/object_address.sql
@@ -39,6 +39,7 @@ ALTER DEFAULT PRIVILEGES FOR ROLE regtest_addr_user REVOKE DELETE ON TABLES FROM
CREATE TRANSFORM FOR int LANGUAGE SQL (
FROM SQL WITH FUNCTION varchar_transform(internal),
TO SQL WITH FUNCTION int4recv(internal));
+CREATE STATISTICS addr_nsp.gentable_stat ON addr_nsp.gentable(a,b) WITH (dependencies);
-- test some error cases
SELECT pg_get_object_address('stone', '{}', '{}');
@@ -166,7 +167,8 @@ WITH objects (type, name, args) AS (VALUES
-- extension
-- event trigger
('policy', '{addr_nsp, gentable, genpol}', '{}'),
- ('transform', '{int}', '{sql}')
+ ('transform', '{int}', '{sql}'),
+ ('statistics', '{addr_nsp, gentable_stat}', '{}')
)
SELECT (pg_identify_object(addr1.classid, addr1.objid, addr1.subobjid)).*,
-- test roundtrip through pg_identify_object_as_address
--
2.5.0
0003-clause-reduction-using-functional-dependencies.patchtext/x-patch; name=0003-clause-reduction-using-functional-dependencies.patchDownload
From 6e3e16f46f93f045c137c070b48e387a470c3a08 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@pgaddict.com>
Date: Mon, 6 Apr 2015 19:42:18 +0200
Subject: [PATCH 3/9] clause reduction using functional dependencies
During planning, use functional dependencies to decide which clauses to
skip during cardinality estimation. Initial and rather simplistic
implementation.
This only works with regular WHERE clauses, not with clauses used as
join clauses.
Note: The clause_is_mv_compatible() needs to identify the relation (so
that we can fetch the list of multivariate stats by OID).
planner_rt_fetch() seems like the appropriate way to get the relation
OID, but apparently it only works with simple vars. Maybe
examine_variable() would make this work with more complex vars too?
Includes regression tests analyzing functional dependencies (part of
ANALYZE) on several datasets (no dependencies, no transitive
dependencies, ...).
Checks that a query with conditions on two columns, where one (B) is
functionally dependent on the other one (A), correctly ignores the
clause on (B) and chooses bitmap index scan instead of plain index scan
(which is what happens otherwise, thanks to assumption of
independence).
Note: Functional dependencies only work with equality clauses, no
inequalities etc.
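For example (a sketch of the behavior the tests check, with illustrative
table and statistics names):

    CREATE TABLE t (a INT, b INT);
    INSERT INTO t SELECT i/100, i/200 FROM generate_series(1,10000) s(i);
    CREATE STATISTICS s ON t (a, b) WITH (dependencies);
    ANALYZE t;

    -- with the dependency (a) -> b built, the estimate for
    --   SELECT * FROM t WHERE a = 10 AND b = 5
    -- ignores the clause on b instead of multiplying the selectivities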
---
src/backend/optimizer/path/clausesel.c | 505 +++++++++++++++++++++++++-
src/backend/utils/mvstats/README.dependencies | 63 ++--
src/backend/utils/mvstats/README.stats | 36 ++
src/backend/utils/mvstats/common.c | 5 +-
src/backend/utils/mvstats/dependencies.c | 24 ++
src/include/utils/mvstats.h | 3 +-
src/include/utils/rel.h | 2 +-
src/test/regress/expected/mv_dependencies.out | 24 ++
src/test/regress/parallel_schedule | 3 +
src/test/regress/sql/mv_dependencies.sql | 15 +
10 files changed, 637 insertions(+), 43 deletions(-)
create mode 100644 src/backend/utils/mvstats/README.stats
diff --git a/src/backend/optimizer/path/clausesel.c b/src/backend/optimizer/path/clausesel.c
index 02660c2..a3afdf5 100644
--- a/src/backend/optimizer/path/clausesel.c
+++ b/src/backend/optimizer/path/clausesel.c
@@ -14,14 +14,19 @@
*/
#include "postgres.h"
+#include "access/sysattr.h"
+#include "catalog/pg_operator.h"
#include "nodes/makefuncs.h"
#include "optimizer/clauses.h"
#include "optimizer/cost.h"
#include "optimizer/pathnode.h"
#include "optimizer/plancat.h"
+#include "optimizer/var.h"
#include "utils/fmgroids.h"
#include "utils/lsyscache.h"
+#include "utils/mvstats.h"
#include "utils/selfuncs.h"
+#include "utils/typcache.h"
/*
@@ -41,6 +46,25 @@ typedef struct RangeQueryClause
static void addRangeClause(RangeQueryClause **rqlist, Node *clause,
bool varonleft, bool isLTsel, Selectivity s2);
+#define MV_CLAUSE_TYPE_FDEP 0x01
+
+static bool clause_is_mv_compatible(Node *clause, Index relid, AttrNumber *attnum);
+
+static Bitmapset *collect_mv_attnums(List *clauses, Index relid);
+
+static int count_mv_attnums(List *clauses, Index relid);
+
+static int count_varnos(List *clauses, Index *relid);
+
+static List *clauselist_apply_dependencies(PlannerInfo *root, List *clauses,
+ Index relid, List *stats);
+
+static bool has_stats(List *stats, int type);
+
+static List * find_stats(PlannerInfo *root, Index relid);
+
+static bool stats_type_matches(MVStatisticInfo *stat, int type);
+
/****************************************************************************
* ROUTINES TO COMPUTE SELECTIVITIES
@@ -60,7 +84,19 @@ static void addRangeClause(RangeQueryClause **rqlist, Node *clause,
* subclauses. However, that's only right if the subclauses have independent
* probabilities, and in reality they are often NOT independent. So,
* we want to be smarter where we can.
-
+ *
+ * The first thing we try to do is apply multivariate statistics, in a way
+ * that minimizes the overhead when there are no multivariate stats on the
+ * relation. Thus we do several simple (and inexpensive) checks first, to
+ * verify that suitable multivariate statistics exist.
+ *
+ * If we find suitable multivariate statistics, we try to apply them.
+ * Currently we only have (soft) functional dependencies, so we try to reduce
+ * the list of clauses.
+ *
+ * Then we remove the clauses estimated using multivariate stats, and process
+ * the rest of the clauses using the regular per-column stats.
+ *
* Currently, the only extra smarts we have is to recognize "range queries",
* such as "x > 34 AND x < 42". Clauses are recognized as possible range
* query components if they are restriction opclauses whose operators have
@@ -99,6 +135,22 @@ clauselist_selectivity(PlannerInfo *root,
RangeQueryClause *rqlist = NULL;
ListCell *l;
+ /* processing mv stats */
+ Oid relid = InvalidOid;
+
+ /* list of multivariate stats on the relation */
+ List *stats = NIL;
+
+ /*
+ * To fetch the statistics, we first need to determine the rel. At this
+ * point we only support estimates of simple restrictions with all Vars
+ * referencing a single baserel. However, set_baserel_size_estimates()
+ * sets varRelid=0, so we have to actually inspect the clauses using
+ * pull_varnos and see if there's just a single varno referenced.
+ */
+ if ((count_varnos(clauses, &relid) == 1) && ((varRelid == 0) || (varRelid == relid)))
+ stats = find_stats(root, relid);
+
/*
* If there's exactly one clause, then no use in trying to match up pairs,
* so just go directly to clause_selectivity().
@@ -108,6 +160,24 @@ clauselist_selectivity(PlannerInfo *root,
varRelid, jointype, sjinfo);
/*
+ * Apply functional dependencies, but first check that there are some stats
+ * with functional dependencies built (by simply walking the stats list),
+ * and that there are two or more attributes referenced by clauses that
+ * may be reduced using functional dependencies.
+ *
+ * We would find that anyway when trying to actually apply the functional
+ * dependencies, but let's do the cheap checks first.
+ *
+ * After applying the functional dependencies we get the remaining clauses
+ * that need to be estimated by other types of stats (MCV, histograms etc).
+ */
+ if (has_stats(stats, MV_CLAUSE_TYPE_FDEP) &&
+ (count_mv_attnums(clauses, relid) >= 2))
+ {
+ clauses = clauselist_apply_dependencies(root, clauses, relid, stats);
+ }
+
+ /*
* Initial scan over clauses. Anything that doesn't look like a potential
* rangequery clause gets multiplied into s1 and forgotten. Anything that
* does gets inserted into an rqlist entry.
@@ -763,3 +833,436 @@ clause_selectivity(PlannerInfo *root,
return s1;
}
+
+/*
+ * Collect attributes from mv-compatible clauses.
+ */
+static Bitmapset *
+collect_mv_attnums(List *clauses, Index relid)
+{
+ Bitmapset *attnums = NULL;
+ ListCell *l;
+
+ /*
+ * Walk through the clauses and identify the ones we can estimate
+ * using multivariate stats, and remember the relid/columns. We'll
+ * then cross-check if we have suitable stats, and only if needed
+ * we'll split the clauses into multivariate and regular lists.
+ *
+ * For now we're only interested in RestrictInfo nodes with nested
+ * OpExpr, using either a range or equality.
+ */
+ foreach (l, clauses)
+ {
+ AttrNumber attnum;
+ Node *clause = (Node *) lfirst(l);
+
+ /* ignore the result for now - we only need the info */
+ if (clause_is_mv_compatible(clause, relid, &attnum))
+ attnums = bms_add_member(attnums, attnum);
+ }
+
+ /*
+ * If there are not at least two attributes referenced by the clause(s),
+ * we can throw everything out (as we'll revert to simple stats).
+ */
+ if (bms_num_members(attnums) <= 1)
+ {
+ if (attnums != NULL)
+ pfree(attnums);
+ attnums = NULL;
+ }
+
+ return attnums;
+}
+
+/*
+ * Count the number of attributes in clauses compatible with multivariate stats.
+ */
+static int
+count_mv_attnums(List *clauses, Index relid)
+{
+ int c;
+ Bitmapset *attnums = collect_mv_attnums(clauses, relid);
+
+ c = bms_num_members(attnums);
+
+ bms_free(attnums);
+
+ return c;
+}
+
+/*
+ * Count varnos referenced in the clauses, and if there's a single varno then
+ * return the index in 'relid'.
+ */
+static int
+count_varnos(List *clauses, Index *relid)
+{
+ int cnt;
+ Bitmapset *varnos = NULL;
+
+ varnos = pull_varnos((Node *) clauses);
+ cnt = bms_num_members(varnos);
+
+ /* if there's a single varno in the clauses, remember it */
+ if (bms_num_members(varnos) == 1)
+ *relid = bms_singleton_member(varnos);
+
+ bms_free(varnos);
+
+ return cnt;
+}
+
+typedef struct
+{
+ Index varno; /* relid we're interested in */
+ Bitmapset *varattnos; /* attnums referenced by the clauses */
+} mv_compatible_context;
+
+/*
+ * Recursive walker that checks compatibility of the clause with multivariate
+ * statistics, and collects attnums from the Vars.
+ *
+ * XXX The original idea was to combine this with expression_tree_walker, but
+ * I've been unable to make that work - it seems it does not quite allow
+ * checking the structure. Hence the explicit calls to the walker.
+ */
+static bool
+mv_compatible_walker(Node *node, mv_compatible_context *context)
+{
+ if (node == NULL)
+ return false;
+
+ if (IsA(node, RestrictInfo))
+ {
+ RestrictInfo *rinfo = (RestrictInfo *) node;
+
+ /* Pseudoconstants are not really interesting here. */
+ if (rinfo->pseudoconstant)
+ return true;
+
+ /* clauses referencing multiple varnos are incompatible */
+ if (bms_membership(rinfo->clause_relids) != BMS_SINGLETON)
+ return true;
+
+ /* check the clause inside the RestrictInfo */
+ return mv_compatible_walker((Node*)rinfo->clause, (void *) context);
+ }
+
+ if (IsA(node, Var))
+ {
+ Var * var = (Var*)node;
+
+ /*
+ * Also, the variable needs to reference the right relid (this might be
+ * unnecessary given the other checks, but let's be sure).
+ */
+ if (var->varno != context->varno)
+ return true;
+
+ /* Also skip system attributes (we don't allow stats on those). */
+ if (! AttrNumberIsForUserDefinedAttr(var->varattno))
+ return true;
+
+ /* Seems fine, so let's remember the attnum. */
+ context->varattnos = bms_add_member(context->varattnos, var->varattno);
+
+ return false;
+ }
+
+ /*
+ * And finally the operator expressions - we only allow simple expressions
+ * with two arguments, where one is a Var and the other is a constant, and
+ * it's a simple comparison (which we detect using estimator function).
+ */
+ if (is_opclause(node))
+ {
+ OpExpr *expr = (OpExpr *) node;
+ Var *var;
+ bool varonleft = true;
+ bool ok;
+
+ /*
+ * Only expressions with two arguments are considered compatible.
+ *
+ * XXX Possibly unnecessary (can OpExpr have different arg count?).
+ */
+ if (list_length(expr->args) != 2)
+ return true;
+
+ /* see if it actually has the right shape (one Var, one pseudo-constant) */
+ ok = (NumRelids((Node*)expr) == 1) &&
+ (is_pseudo_constant_clause(lsecond(expr->args)) ||
+ (varonleft = false,
+ is_pseudo_constant_clause(linitial(expr->args))));
+
+ /* unsupported structure (two variables or so) */
+ if (! ok)
+ return true;
+
+ /*
+ * If it's not an equality operator, just ignore the clause. Otherwise
+ * note the relid and attnum for the variable. This uses the function
+ * for estimating selectivity, not the operator directly (a bit
+ * awkward, but well ...).
+ */
+ switch (get_oprrest(expr->opno))
+ {
+ case F_EQSEL:
+
+ /* equality conditions are compatible with all statistics */
+ break;
+
+ default:
+
+ /* unknown estimator */
+ return true;
+ }
+
+ var = (varonleft) ? linitial(expr->args) : lsecond(expr->args);
+
+ return mv_compatible_walker((Node *) var, context);
+ }
+
+ /* Node not explicitly supported, so terminate */
+ return true;
+}
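+
+/*
+ * For example, under the rules above a clause like "a = 1" (or "1 = a")
+ * is compatible, while "a = b" (two Vars), "a < 1" (estimator other than
+ * F_EQSEL) and "(a = 1 OR b = 2)" (not a simple opclause) are not.
+ */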
+
+/*
+ * Determines whether the clause is compatible with multivariate stats,
+ * and if it is, returns some additional information - varno (index
+ * into simple_rte_array) and a bitmap of attributes. This is then
+ * used to fetch related multivariate statistics.
+ *
+ * At this moment we only support basic conditions of the form
+ *
+ * variable OP constant
+ *
+ * where OP is the equality operator (which is however determined by
+ * looking at the associated function for estimating selectivity, just
+ * like with the single-dimensional case).
+ *
+ * TODO Support 'OR clauses' - shouldn't be all that difficult to
+ * evaluate them using multivariate stats.
+ */
+static bool
+clause_is_mv_compatible(Node *clause, Index relid, AttrNumber *attnum)
+{
+ mv_compatible_context context;
+
+ context.varno = relid;
+ context.varattnos = NULL; /* no attnums */
+
+ if (mv_compatible_walker(clause, (void *) &context))
+ return false;
+
+ /* remember the newly collected attnums */
+ *attnum = bms_singleton_member(context.varattnos);
+
+ return true;
+}
+
+
+/*
+ * Reduce clauses using functional dependencies
+ */
+static List*
+fdeps_reduce_clauses(List *clauses, Index relid, Bitmapset *reduced_attnums)
+{
+ ListCell *lc;
+ List *reduced_clauses = NIL;
+
+ foreach (lc, clauses)
+ {
+ AttrNumber attnum = InvalidAttrNumber;
+ Node * clause = (Node*)lfirst(lc);
+
+ /* keep clauses that are not compatible with functional dependencies */
+ if (! clause_is_mv_compatible(clause, relid, &attnum))
+ reduced_clauses = lappend(reduced_clauses, clause);
+
+ /* for equality clauses, only keep those not on reduced attributes */
+ else if (! bms_is_member(attnum, reduced_attnums))
+ reduced_clauses = lappend(reduced_clauses, clause);
+ }
+
+ return reduced_clauses;
+}
+
+/*
+ * decide which attributes are redundant (for equality clauses)
+ *
+ * We try to apply all functional dependencies available, and for each one we
+ * check if it matches attnums from equality clauses, but only those not yet
+ * reduced.
+ *
+ * XXX Not sure if the order in which we apply the dependencies matters.
+ *
+ * XXX We do not combine functional dependencies from separate stats. That is
+ * if we have dependencies on [a,b] and [b,c], then we don't deduce
+ * a->c from a->b and b->c. Computing such transitive closure is a possible
+ * future improvement.
+ */
+static Bitmapset *
+fdeps_reduce_attnums(List *stats, Bitmapset *attnums)
+{
+ ListCell *lc;
+ Bitmapset *reduced = NULL;
+
+ foreach (lc, stats)
+ {
+ int i;
+ MVDependencies dependencies = NULL;
+ MVStatisticInfo *info = (MVStatisticInfo *)lfirst(lc);
+
+ /* skip statistics without dependencies */
+ if (! stats_type_matches(info, MV_CLAUSE_TYPE_FDEP))
+ continue;
+
+ /* fetch and deserialize dependencies */
+ dependencies = load_mv_dependencies(info->mvoid);
+
+ for (i = 0; i < dependencies->ndeps; i++)
+ {
+ int j;
+ bool matched = true;
+ MVDependency dep = dependencies->deps[i];
+
+ /* we don't bother to break the loop early (only a few attributes) */
+ for (j = 0; j < dep->nattributes; j++)
+ {
+ if (! bms_is_member(dep->attributes[j], attnums))
+ matched = false;
+
+ if (bms_is_member(dep->attributes[j], reduced))
+ matched = false;
+ }
+
+ /* if dependency applies, mark the last attribute as reduced */
+ if (matched)
+ reduced = bms_add_member(reduced,
+ dep->attributes[dep->nattributes-1]);
+ }
+ }
+
+ return reduced;
+}
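+
+/*
+ * For example, given the cyclic dependencies (a) -> b and (b) -> a with
+ * equality clauses on both columns (and assuming (a) -> b is examined
+ * first): the first dependency marks 'b' as reduced, so when (b) -> a
+ * is examined, 'b' is already in 'reduced', the dependency does not
+ * match, and 'a' is kept.
+ */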
+
+/*
+ * reduce list of equality clauses using soft functional dependencies
+ *
+ * We simply walk through list of functional dependencies, and for each one we
+ * check whether the dependency 'matches' the clauses, i.e. if there's a clause
+ * matching the condition. If yes, we attempt to remove all clauses matching
+ * the implied part of the dependency from the list.
+ *
+ * This only reduces equality clauses, and ignores all the other types. We might
+ * extend it to handle IS NULL clauses in the future.
+ *
+ * We also assume the equality clauses are 'compatible'. For example we can't
+ * identify when the clauses use a mismatching zip code and city name. In such
+ * case the usual approach (product of selectivities) would produce a better
+ * estimate, although mostly by chance.
+ *
+ * The implementation needs to be careful about cyclic dependencies, e.g. when
+ *
+ * (a -> b) and (b -> a)
+ *
+ * at the same time, which means there's a 1:1 relationship between the columns.
+ * In this case we must not reduce clauses on both attributes at the same time.
+ *
+ * TODO Currently we only apply functional dependencies at the same level, but
+ * maybe we could transfer the clauses from upper levels to the subtrees?
+ * For example let's say we have (a->b) dependency, and condition
+ *
+ * (a=1) AND (b=2 OR c=3)
+ *
+ * Currently, we won't be able to perform any reduction, because we'll
+ * consider (a=1) and (b=2 OR c=3) independently. But maybe we could pass
+ * (a=1) into the other expression, and only check it against conditions
+ * of the functional dependencies?
+ *
+ * In this case we'd end up with
+ *
+ * (a=1)
+ *
+ * as we'd consider (b=2) implied thanks to the rule, rendering the whole
+ * OR clause valid.
+ */
+static List *
+clauselist_apply_dependencies(PlannerInfo *root, List *clauses,
+ Index relid, List *stats)
+{
+ Bitmapset *clause_attnums = NULL;
+ Bitmapset *reduced_attnums = NULL;
+
+ /*
+ * Is there at least one statistics with functional dependencies?
+ * If not, return the original clauses right away.
+ *
+ * XXX Isn't this a bit pointless, thanks to exactly the same check in
+ * clauselist_selectivity()? Can we trigger the condition here?
+ */
+ if (! has_stats(stats, MV_CLAUSE_TYPE_FDEP))
+ return clauses;
+
+ /* collect attnums from clauses compatible with dependencies (equality) */
+ clause_attnums = collect_mv_attnums(clauses, relid);
+
+ /* decide which attnums may be eliminated */
+ reduced_attnums = fdeps_reduce_attnums(stats, clause_attnums);
+
+ /*
+ * Walk through the clauses, and see which other clauses we may reduce.
+ */
+ clauses = fdeps_reduce_clauses(clauses, relid, reduced_attnums);
+
+ bms_free(clause_attnums);
+ bms_free(reduced_attnums);
+
+ return clauses;
+}
+
+/*
+ * Check whether the given statistics match at least one of the requested types.
+ */
+static bool
+stats_type_matches(MVStatisticInfo *stat, int type)
+{
+ if ((type & MV_CLAUSE_TYPE_FDEP) && stat->deps_built)
+ return true;
+
+ return false;
+}
+
+/*
+ * Check that there are stats with at least one of the requested types.
+ */
+static bool
+has_stats(List *stats, int type)
+{
+ ListCell *s;
+
+ foreach (s, stats)
+ {
+ MVStatisticInfo *stat = (MVStatisticInfo *)lfirst(s);
+
+ /* terminate if we've found at least one matching statistics */
+ if (stats_type_matches(stat, type))
+ return true;
+ }
+
+ return false;
+}
+
+/*
+ * Looks up stats for a given baserel.
+ */
+static List *
+find_stats(PlannerInfo *root, Index relid)
+{
+ Assert(root->simple_rel_array[relid] != NULL);
+
+ return root->simple_rel_array[relid]->mvstatlist;
+}
diff --git a/src/backend/utils/mvstats/README.dependencies b/src/backend/utils/mvstats/README.dependencies
index 1f96fbc..f248459 100644
--- a/src/backend/utils/mvstats/README.dependencies
+++ b/src/backend/utils/mvstats/README.dependencies
@@ -156,37 +156,24 @@ estimates - especially compared to histograms, that are quite bad in estimating
equality clauses.
-Limitations
------------
-
-Let's see the main liminations of functional dependencies, especially those
-related to the current implementation.
+Multi-column dependencies
+-------------------------
-The current implementation supports only dependencies between two columns, but
-this is merely a simplification of the initial implementation. It's certainly
-useful to mine for dependencies involving multiple columns on the 'left' side,
-i.e. a condition for the dependency. That is dependencies like (a,b -> c).
+The implementation supports dependencies with multiple columns on the left side
+(i.e. condition of the dependency). The detection starts from dependencies with
+a single condition, and then proceeds to higher condition counts.
-The implementation may/should be smart enough not to mine redundant conditions,
-e.g. (a->b) and (a,c -> b), because the latter is a trivial consequence of the
-former one (if values of 'a' determine 'b', adding another column won't change
-that relationship). The ANALYZE should first analyze 1:1 dependencies, then 2:1
-dependencies (and skip the already identified ones), etc.
+It also detects dependencies that are implied by already identified ones, and
+ignores them. For example, if we know that (a->b) holds, we won't add (a,c->b),
+as that dependency is a trivial consequence of (a->b).
-For example the dependency
+For a more practical example, consider these two dependencies
(city name -> zip code)
-
-is much stronger, i.e. whenever it hold, then
-
(city name, state name -> zip code)
-holds too. But in case there are cities with the same name in different states,
-then only the latter dependency will be valid.
-
-Of course, there probably are cities with the same name within a single state,
-but hopefully this is relatively rare occurence (and thus we'll still detect
-the 'soft' dependency).
+We could say that the former dependency is stronger, because whenever it is
+valid, the latter dependency is valid too.
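+
+As a sketch, using the CREATE STATISTICS syntax from this patch series (the
+table and column names are hypothetical):
+
+  CREATE TABLE addresses (city_name text, state_name text, zip_code text);
+
+  CREATE STATISTICS addr_stats ON addresses (city_name, state_name, zip_code)
+    WITH (dependencies);
+
+  ANALYZE addresses;
+
+If ANALYZE detects (city_name -> zip_code), it won't also report the implied
+dependency (city_name, state_name -> zip_code).
+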
Handling multiple columns on the right side of the dependency, is not necessary,
as those dependencies may be simply decomposed into a set of dependencies with
@@ -199,24 +186,22 @@ is exactly the same as
(a -> b) & (a -> c)
Of course, storing the first form may be more efficient thant storing multiple
-'simple' dependencies separately.
-
+'simple' dependencies separately. This is left as future work.
-TODO Support dependencies with multiple columns on left/right.
-TODO Investigate using histogram and MCV list to verify the dependencies.
+Future work
+-----------
-TODO Investigate statistical testing of the distribution (to decide whether it
- makes sense to build the histogram/MCV list).
+* Investigate using histogram and MCV list to verify the dependencies.
-TODO Using a min/max of selectivities would probably make more sense for the
- associated columns.
+* Investigate statistical testing of the distribution (to decide whether it
+ makes sense to build the histogram/MCV list).
-TODO Consider eliminating the implied columns from the histogram and MCV lists
- (but maybe that's not a good idea, because that'd make it impossible to use
- these stats for non-equality clauses and also it wouldn't be possible to
- use the stats for verification of the dependencies).
+* Consider eliminating the implied columns from the histogram and MCV lists
+ (but maybe that's not a good idea, because that'd make it impossible to use
+ these stats for non-equality clauses and also it wouldn't be possible to
+ use the stats for verification of the dependencies).
-TODO The reduction probably might be extended to also handle IS NULL clauses,
- assuming we fix the ANALYZE to properly handle NULL values. We however
- won't be able to reduce IS NOT NULL (unless I'm missing something).
+* The reduction could probably be extended to also handle IS NULL clauses,
+  assuming we fix ANALYZE to properly handle NULL values. We won't, however,
+  be able to reduce IS NOT NULL (unless I'm missing something).
diff --git a/src/backend/utils/mvstats/README.stats b/src/backend/utils/mvstats/README.stats
new file mode 100644
index 0000000..a38ea7b
--- /dev/null
+++ b/src/backend/utils/mvstats/README.stats
@@ -0,0 +1,36 @@
+Multivariate statistics
+=======================
+
+When estimating various quantities (e.g. condition selectivities) the default
+approach relies on the assumption of independence. In practice that's often
+not true, resulting in estimation errors.
+
+Multivariate stats track different types of dependencies between the columns,
+hopefully improving the estimates.
+
+Currently we only have one kind of multivariate statistics - soft functional
+dependencies, and we use it to improve estimates of equality clauses. See
+README.dependencies for details.
+
+
+Selectivity estimation
+----------------------
+
+When estimating selectivity, we aim to achieve several things:
+
+ (a) maximize the estimate accuracy
+
+ (b) minimize the overhead, especially when no suitable multivariate stats
+ exist (so if you are not using multivariate stats, there's no overhead)
+
+This clauselist_selectivity() performs several inexpensive checks first, before
+even attempting to do the more expensive estimation.
+
+ (1) check if there are multivariate stats on the relation
+
+ (2) check there are at least two attributes referenced by clauses compatible
+ with multivariate statistics (equality clauses for func. dependencies)
+
+ (3) perform reduction of equality clauses using func. dependencies
+
+ (4) estimate the reduced list of clauses using regular statistics
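+
+As a sketch of step (3): given statistics on (a,b) with a detected dependency
+(a -> b), the condition
+
+  WHERE (a = 1) AND (b = 2)
+
+is reduced to (a = 1), and step (4) then estimates the remaining clause using
+the regular per-column statistics.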
diff --git a/src/backend/utils/mvstats/common.c b/src/backend/utils/mvstats/common.c
index 82f2177..dcb7c78 100644
--- a/src/backend/utils/mvstats/common.c
+++ b/src/backend/utils/mvstats/common.c
@@ -84,7 +84,8 @@ build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
/*
* Analyze functional dependencies of columns.
*/
- deps = build_mv_dependencies(numrows, rows, attrs, stats);
+ if (stat->deps_enabled)
+ deps = build_mv_dependencies(numrows, rows, attrs, stats);
/* store the histogram / MCV list in the catalog */
update_mv_stats(stat->mvoid, deps, attrs);
@@ -163,6 +164,7 @@ list_mv_stats(Oid relid)
info->mvoid = HeapTupleGetOid(htup);
info->stakeys = buildint2vector(stats->stakeys.values, stats->stakeys.dim1);
+ info->deps_enabled = stats->deps_enabled;
info->deps_built = stats->deps_built;
result = lappend(result, info);
@@ -274,6 +276,7 @@ compare_scalars_partition(const void *a, const void *b, void *arg)
return ApplySortComparator(da, false, db, false, ssup);
}
+
/* initialize multi-dimensional sort */
MultiSortSupport
multi_sort_init(int ndims)
diff --git a/src/backend/utils/mvstats/dependencies.c b/src/backend/utils/mvstats/dependencies.c
index 5437bdf..412dc30 100644
--- a/src/backend/utils/mvstats/dependencies.c
+++ b/src/backend/utils/mvstats/dependencies.c
@@ -684,3 +684,27 @@ pg_mv_stats_dependencies_show(PG_FUNCTION_ARGS)
PG_RETURN_TEXT_P(cstring_to_text(buf.data));
}
+
+MVDependencies
+load_mv_dependencies(Oid mvoid)
+{
+ bool isnull = false;
+ Datum deps;
+
+ /* Fetch the pg_mv_statistic tuple for the requested statistics OID. */
+ HeapTuple htup = SearchSysCache1(MVSTATOID, ObjectIdGetDatum(mvoid));
+
+#ifdef USE_ASSERT_CHECKING
+ Form_pg_mv_statistic mvstat = (Form_pg_mv_statistic) GETSTRUCT(htup);
+ Assert(mvstat->deps_enabled && mvstat->deps_built);
+#endif
+
+ deps = SysCacheGetAttr(MVSTATOID, htup,
+ Anum_pg_mv_statistic_stadeps, &isnull);
+
+ Assert(!isnull);
+
+ ReleaseSysCache(htup);
+
+ return deserialize_mv_dependencies(DatumGetByteaP(deps));
+}
diff --git a/src/include/utils/mvstats.h b/src/include/utils/mvstats.h
index 7837bc0..ec55a09 100644
--- a/src/include/utils/mvstats.h
+++ b/src/include/utils/mvstats.h
@@ -17,7 +17,6 @@
#include "fmgr.h"
#include "commands/vacuum.h"
-
#define MVSTATS_MAX_DIMENSIONS 8 /* max number of attributes */
/*
@@ -49,6 +48,8 @@ typedef MVDependenciesData* MVDependencies;
* stats specified using flags (or something like that).
*/
+MVDependencies load_mv_dependencies(Oid mvoid);
+
bytea * serialize_mv_dependencies(MVDependencies dependencies);
/* deserialization of stats (serialization is private to analyze) */
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index 8771f9c..d09ba25 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -94,7 +94,7 @@ typedef struct RelationData
List *rd_indexlist; /* list of OIDs of indexes on relation */
Oid rd_oidindex; /* OID of unique index on OID, if any */
Oid rd_replidindex; /* OID of replica identity index, if any */
-
+
/* data managed by RelationGetMVStatList: */
List *rd_mvstatlist; /* list of OIDs of multivariate stats */
diff --git a/src/test/regress/expected/mv_dependencies.out b/src/test/regress/expected/mv_dependencies.out
index f54e1b7..ee8a9b2 100644
--- a/src/test/regress/expected/mv_dependencies.out
+++ b/src/test/regress/expected/mv_dependencies.out
@@ -58,8 +58,10 @@ SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
TRUNCATE functional_dependencies;
-- a => b, a => c, b => c
+-- check explain (expect bitmap index scan, not plain index scan)
INSERT INTO functional_dependencies
SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX fdeps_idx ON functional_dependencies (a, b);
ANALYZE functional_dependencies;
SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
@@ -68,6 +70,16 @@ SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
t | t | (1) => 2, (1) => 3, (2) => 3
(1 row)
+EXPLAIN (COSTS off)
+ SELECT * FROM functional_dependencies WHERE a = 10 AND b = 5;
+ QUERY PLAN
+---------------------------------------------
+ Bitmap Heap Scan on functional_dependencies
+ Recheck Cond: ((a = 10) AND (b = 5))
+ -> Bitmap Index Scan on fdeps_idx
+ Index Cond: ((a = 10) AND (b = 5))
+(4 rows)
+
DROP TABLE functional_dependencies;
-- varlena type (text)
CREATE TABLE functional_dependencies (
@@ -113,8 +125,10 @@ SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
TRUNCATE functional_dependencies;
-- a => b, a => c, b => c
+-- check explain (expect bitmap index scan, not plain index scan)
INSERT INTO functional_dependencies
SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX fdeps_idx ON functional_dependencies (a, b);
ANALYZE functional_dependencies;
SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
@@ -123,6 +137,16 @@ SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
t | t | (1) => 2, (1) => 3, (2) => 3
(1 row)
+EXPLAIN (COSTS off)
+ SELECT * FROM functional_dependencies WHERE a = '10' AND b = '5';
+ QUERY PLAN
+------------------------------------------------------------
+ Bitmap Heap Scan on functional_dependencies
+ Recheck Cond: ((a = '10'::text) AND (b = '5'::text))
+ -> Bitmap Index Scan on fdeps_idx
+ Index Cond: ((a = '10'::text) AND (b = '5'::text))
+(4 rows)
+
DROP TABLE functional_dependencies;
-- NULL values (mix of int and text columns)
CREATE TABLE functional_dependencies (
diff --git a/src/test/regress/parallel_schedule b/src/test/regress/parallel_schedule
index bec0316..4f2ffb8 100644
--- a/src/test/regress/parallel_schedule
+++ b/src/test/regress/parallel_schedule
@@ -110,3 +110,6 @@ test: event_trigger
# run stats by itself because its delay may be insufficient under heavy load
test: stats
+
+# run tests of multivariate stats
+test: mv_dependencies
diff --git a/src/test/regress/sql/mv_dependencies.sql b/src/test/regress/sql/mv_dependencies.sql
index 051633a..8ba72a4 100644
--- a/src/test/regress/sql/mv_dependencies.sql
+++ b/src/test/regress/sql/mv_dependencies.sql
@@ -56,13 +56,20 @@ SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
TRUNCATE functional_dependencies;
-- a => b, a => c, b => c
+-- check explain (expect bitmap index scan, not plain index scan)
INSERT INTO functional_dependencies
SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+
+CREATE INDEX fdeps_idx ON functional_dependencies (a, b);
+
ANALYZE functional_dependencies;
SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+EXPLAIN (COSTS off)
+ SELECT * FROM functional_dependencies WHERE a = 10 AND b = 5;
+
DROP TABLE functional_dependencies;
-- varlena type (text)
@@ -99,6 +106,7 @@ TRUNCATE functional_dependencies;
-- a => b, a => c
INSERT INTO functional_dependencies
SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+
ANALYZE functional_dependencies;
SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
@@ -107,13 +115,20 @@ SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
TRUNCATE functional_dependencies;
-- a => b, a => c, b => c
+-- check explain (expect bitmap index scan, not plain index scan)
INSERT INTO functional_dependencies
SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+
+CREATE INDEX fdeps_idx ON functional_dependencies (a, b);
+
ANALYZE functional_dependencies;
SELECT deps_enabled, deps_built, pg_mv_stats_dependencies_show(stadeps)
FROM pg_mv_statistic WHERE starelid = 'functional_dependencies'::regclass;
+EXPLAIN (COSTS off)
+ SELECT * FROM functional_dependencies WHERE a = '10' AND b = '5';
+
DROP TABLE functional_dependencies;
-- NULL values (mix of int and text columns)
--
2.5.0
Attachment: 0004-multivariate-MCV-lists.patch (text/x-patch)
From 6aba7480c5a4fd56896bf1a2d320e19ea231225d Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@pgaddict.com>
Date: Mon, 6 Apr 2015 16:52:15 +0200
Subject: [PATCH 4/9] multivariate MCV lists
- extends the pg_mv_statistic catalog (add 'mcv' fields)
- building the MCV lists during ANALYZE
- simple estimation while planning the queries
Includes regression tests, mostly mirroring the regression tests for
functional dependencies.
---
doc/src/sgml/ref/create_statistics.sgml | 43 ++
src/backend/catalog/system_views.sql | 4 +-
src/backend/commands/statscmds.c | 45 +-
src/backend/nodes/outfuncs.c | 2 +
src/backend/optimizer/path/clausesel.c | 800 +++++++++++++++++++++-
src/backend/optimizer/util/plancat.c | 4 +-
src/backend/utils/mvstats/Makefile | 2 +-
src/backend/utils/mvstats/README.mcv | 137 ++++
src/backend/utils/mvstats/README.stats | 89 ++-
src/backend/utils/mvstats/common.c | 133 +++-
src/backend/utils/mvstats/common.h | 15 +
src/backend/utils/mvstats/mcv.c | 1120 +++++++++++++++++++++++++++++++
src/bin/psql/describe.c | 25 +-
src/include/catalog/pg_mv_statistic.h | 18 +-
src/include/catalog/pg_proc.h | 4 +
src/include/nodes/relation.h | 2 +
src/include/utils/mvstats.h | 77 ++-
src/test/regress/expected/mv_mcv.out | 207 ++++++
src/test/regress/expected/rules.out | 4 +-
src/test/regress/parallel_schedule | 2 +-
src/test/regress/serial_schedule | 1 +
src/test/regress/sql/mv_mcv.sql | 178 +++++
22 files changed, 2847 insertions(+), 65 deletions(-)
create mode 100644 src/backend/utils/mvstats/README.mcv
create mode 100644 src/backend/utils/mvstats/mcv.c
create mode 100644 src/test/regress/expected/mv_mcv.out
create mode 100644 src/test/regress/sql/mv_mcv.sql
diff --git a/doc/src/sgml/ref/create_statistics.sgml b/doc/src/sgml/ref/create_statistics.sgml
index ff09fa5..d6973e8 100644
--- a/doc/src/sgml/ref/create_statistics.sgml
+++ b/doc/src/sgml/ref/create_statistics.sgml
@@ -132,6 +132,24 @@ CREATE STATISTICS [ IF NOT EXISTS ] <replaceable class="PARAMETER">statistics_na
</listitem>
</varlistentry>
+ <varlistentry>
+ <term><literal>max_mcv_items</> (<type>integer</>)</term>
+ <listitem>
+ <para>
+ Maximum number of MCV list items.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><literal>mcv</> (<type>boolean</>)</term>
+ <listitem>
+ <para>
+ Enables building the MCV list for the statistics.
+ </para>
+ </listitem>
+ </varlistentry>
+
</variablelist>
</refsect2>
@@ -177,6 +195,31 @@ EXPLAIN ANALYZE SELECT * FROM t1 WHERE (a = 1) AND (b = 2);
</programlisting>
</para>
+ <para>
+ Create table <structname>t2</> with two perfectly correlated columns
+ (containing identical data), and an MCV list on those columns:
+
+<programlisting>
+CREATE TABLE t2 (
+ a int,
+ b int
+);
+
+INSERT INTO t2 SELECT mod(i,100), mod(i,100)
+ FROM generate_series(1,1000000) s(i);
+
+CREATE STATISTICS s2 ON t2 (a, b) WITH (mcv);
+
+ANALYZE t2;
+
+-- valid combination (found in MCV)
+EXPLAIN ANALYZE SELECT * FROM t2 WHERE (a = 1) AND (b = 1);
+
+-- invalid combination (not found in MCV)
+EXPLAIN ANALYZE SELECT * FROM t2 WHERE (a = 1) AND (b = 2);
+</programlisting>
+ </para>
+
</refsect1>
<refsect1>
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 31dbb2c..5c40334 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -165,7 +165,9 @@ CREATE VIEW pg_mv_stats AS
S.staname AS staname,
S.stakeys AS attnums,
length(S.stadeps) as depsbytes,
- pg_mv_stats_dependencies_info(S.stadeps) as depsinfo
+ pg_mv_stats_dependencies_info(S.stadeps) as depsinfo,
+ length(S.stamcv) AS mcvbytes,
+ pg_mv_stats_mcvlist_info(S.stamcv) AS mcvinfo
FROM (pg_mv_statistic S JOIN pg_class C ON (C.oid = S.starelid))
LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace);
diff --git a/src/backend/commands/statscmds.c b/src/backend/commands/statscmds.c
index f43b053..c480fbe 100644
--- a/src/backend/commands/statscmds.c
+++ b/src/backend/commands/statscmds.c
@@ -70,7 +70,13 @@ CreateStatistics(CreateStatsStmt *stmt)
ObjectAddress parentobject, childobject;
/* by default build nothing */
- bool build_dependencies = false;
+ bool build_dependencies = false,
+ build_mcv = false;
+
+ int32 max_mcv_items = -1;
+
+ /* options required because of other options */
+ bool require_mcv = false;
Assert(IsA(stmt, CreateStatsStmt));
@@ -146,6 +152,29 @@ CreateStatistics(CreateStatsStmt *stmt)
if (strcmp(opt->defname, "dependencies") == 0)
build_dependencies = defGetBoolean(opt);
+ else if (strcmp(opt->defname, "mcv") == 0)
+ build_mcv = defGetBoolean(opt);
+ else if (strcmp(opt->defname, "max_mcv_items") == 0)
+ {
+ max_mcv_items = defGetInt32(opt);
+
+ /* this option requires 'mcv' to be enabled */
+ require_mcv = true;
+
+ /* sanity check */
+ if (max_mcv_items < MVSTAT_MCVLIST_MIN_ITEMS)
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("max number of MCV items must be at least %d",
+ MVSTAT_MCVLIST_MIN_ITEMS)));
+
+ else if (max_mcv_items > MVSTAT_MCVLIST_MAX_ITEMS)
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("max number of MCV items is %d",
+ MVSTAT_MCVLIST_MAX_ITEMS)));
+
+ }
else
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
@@ -154,10 +183,16 @@ CreateStatistics(CreateStatsStmt *stmt)
}
/* check that at least some statistics were requested */
- if (! build_dependencies)
+ if (! (build_dependencies || build_mcv))
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("no statistics type (dependencies, mcv) was requested")));
+
+ /* now do some checking of the options */
+ if (require_mcv && (! build_mcv))
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
- errmsg("no statistics type (dependencies) was requested")));
+ errmsg("option 'mcv' is required by other options(s)")));
/* sort the attnums and build int2vector */
qsort(attnums, numcols, sizeof(int16), compare_int16);
@@ -178,8 +213,12 @@ CreateStatistics(CreateStatsStmt *stmt)
values[Anum_pg_mv_statistic_stakeys -1] = PointerGetDatum(stakeys);
values[Anum_pg_mv_statistic_deps_enabled -1] = BoolGetDatum(build_dependencies);
+ values[Anum_pg_mv_statistic_mcv_enabled -1] = BoolGetDatum(build_mcv);
+
+ values[Anum_pg_mv_statistic_mcv_max_items -1] = Int32GetDatum(max_mcv_items);
nulls[Anum_pg_mv_statistic_stadeps -1] = true;
+ nulls[Anum_pg_mv_statistic_stamcv -1] = true;
/* insert the tuple into pg_mv_statistic */
mvstatrel = heap_open(MvStatisticRelationId, RowExclusiveLock);
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index 07206d7..333e24b 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -2162,9 +2162,11 @@ _outMVStatisticInfo(StringInfo str, const MVStatisticInfo *node)
/* enabled statistics */
WRITE_BOOL_FIELD(deps_enabled);
+ WRITE_BOOL_FIELD(mcv_enabled);
/* built/available statistics */
WRITE_BOOL_FIELD(deps_built);
+ WRITE_BOOL_FIELD(mcv_built);
}
static void
diff --git a/src/backend/optimizer/path/clausesel.c b/src/backend/optimizer/path/clausesel.c
index a3afdf5..c16d559 100644
--- a/src/backend/optimizer/path/clausesel.c
+++ b/src/backend/optimizer/path/clausesel.c
@@ -15,6 +15,7 @@
#include "postgres.h"
#include "access/sysattr.h"
+#include "catalog/pg_collation.h"
#include "catalog/pg_operator.h"
#include "nodes/makefuncs.h"
#include "optimizer/clauses.h"
@@ -47,18 +48,39 @@ static void addRangeClause(RangeQueryClause **rqlist, Node *clause,
bool varonleft, bool isLTsel, Selectivity s2);
#define MV_CLAUSE_TYPE_FDEP 0x01
+#define MV_CLAUSE_TYPE_MCV 0x02
-static bool clause_is_mv_compatible(Node *clause, Index relid, AttrNumber *attnum);
+static bool clause_is_mv_compatible(Node *clause, Index relid, Bitmapset **attnums,
+ int type);
-static Bitmapset *collect_mv_attnums(List *clauses, Index relid);
+static Bitmapset *collect_mv_attnums(List *clauses, Index relid, int type);
-static int count_mv_attnums(List *clauses, Index relid);
+static int count_mv_attnums(List *clauses, Index relid, int type);
static int count_varnos(List *clauses, Index *relid);
static List *clauselist_apply_dependencies(PlannerInfo *root, List *clauses,
Index relid, List *stats);
+static MVStatisticInfo *choose_mv_statistics(List *mvstats, Bitmapset *attnums);
+
+static List *clauselist_mv_split(PlannerInfo *root, Index relid,
+ List *clauses, List **mvclauses,
+ MVStatisticInfo *mvstats, int types);
+
+static Selectivity clauselist_mv_selectivity(PlannerInfo *root,
+ List *clauses, MVStatisticInfo *mvstats);
+
+static Selectivity clauselist_mv_selectivity_mcvlist(PlannerInfo *root,
+ List *clauses, MVStatisticInfo *mvstats,
+ bool *fullmatch, Selectivity *lowsel);
+
+static int update_match_bitmap_mcvlist(PlannerInfo *root, List *clauses,
+ int2vector *stakeys, MCVList mcvlist,
+ int nmatches, char * matches,
+ Selectivity *lowsel, bool *fullmatch,
+ bool is_or);
+
static bool has_stats(List *stats, int type);
static List * find_stats(PlannerInfo *root, Index relid);
@@ -66,6 +88,13 @@ static List * find_stats(PlannerInfo *root, Index relid);
static bool stats_type_matches(MVStatisticInfo *stat, int type);
+/* used for merging bitmaps - AND (min), OR (max) */
+#define MAX(x, y) (((x) > (y)) ? (x) : (y))
+#define MIN(x, y) (((x) < (y)) ? (x) : (y))
+
+#define UPDATE_RESULT(m,r,isor) \
+ (m) = (isor) ? (MAX(m,r)) : (MIN(m,r))
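+
+/*
+ * Example (assuming MVSTATS_MATCH_NONE < MVSTATS_MATCH_FULL numerically):
+ * merging item state m = MATCH_FULL with result r = MATCH_NONE yields
+ * MATCH_NONE for AND-lists (MIN) and keeps MATCH_FULL for OR-lists (MAX).
+ */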
+
/****************************************************************************
* ROUTINES TO COMPUTE SELECTIVITIES
****************************************************************************/
@@ -91,11 +120,13 @@ static bool stats_type_matches(MVStatisticInfo *stat, int type);
* to verify that suitable multivariate statistics exist.
*
* If we identify such multivariate statistics apply, we try to apply them.
- * Currently we only have (soft) functional dependencies, so we try to reduce
- * the list of clauses.
*
- * Then we remove the clauses estimated using multivariate stats, and process
- * the rest of the clauses using the regular per-column stats.
+ * First we try to reduce the list of clauses by applying (soft) functional
+ * dependencies, and then we try to estimate the selectivity of the reduced
+ * list of clauses using the multivariate MCV list.
+ *
+ * Finally we remove the portion of clauses estimated using multivariate stats,
+ * and process the rest of the clauses using the regular per-column stats.
*
* Currently, the only extra smarts we have is to recognize "range queries",
* such as "x > 34 AND x < 42". Clauses are recognized as possible range
@@ -172,12 +203,46 @@ clauselist_selectivity(PlannerInfo *root,
* that need to be estimated by other types of stats (MCV, histograms etc).
*/
if (has_stats(stats, MV_CLAUSE_TYPE_FDEP) &&
- (count_mv_attnums(clauses, relid) >= 2))
+ (count_mv_attnums(clauses, relid, MV_CLAUSE_TYPE_FDEP) >= 2))
{
clauses = clauselist_apply_dependencies(root, clauses, relid, stats);
}
/*
+ * Check that there are statistics with MCV list or histogram, and also the
+ * number of attributes covered by these types of statistics.
+ *
+ * If there are no such stats or not enough attributes, don't waste time
+ * with the multivariate code and simply skip to estimation using the
+ * regular per-column stats.
+ */
+ if (has_stats(stats, MV_CLAUSE_TYPE_MCV) &&
+ (count_mv_attnums(clauses, relid, MV_CLAUSE_TYPE_MCV) >= 2))
+ {
+ /* collect attributes from the compatible conditions */
+ Bitmapset *mvattnums = collect_mv_attnums(clauses, relid, MV_CLAUSE_TYPE_MCV);
+
+ /* and search for the statistic covering the most attributes */
+ MVStatisticInfo *mvstat = choose_mv_statistics(stats, mvattnums);
+
+ if (mvstat != NULL) /* we have matching stats */
+ {
+ /* clauses compatible with multi-variate stats */
+ List *mvclauses = NIL;
+
+ /* split the clauselist into regular and mv-clauses */
+ clauses = clauselist_mv_split(root, relid, clauses, &mvclauses,
+ mvstat, MV_CLAUSE_TYPE_MCV);
+
+ /* we've chosen the statistics to match the clauses */
+ Assert(mvclauses != NIL);
+
+ /* compute the multivariate stats */
+ s1 *= clauselist_mv_selectivity(root, mvclauses, mvstat);
+ }
+ }
+
+ /*
* Initial scan over clauses. Anything that doesn't look like a potential
* rangequery clause gets multiplied into s1 and forgotten. Anything that
* does gets inserted into an rqlist entry.
@@ -834,32 +899,93 @@ clause_selectivity(PlannerInfo *root,
return s1;
}
+
+/*
+ * estimate selectivity of clauses using a multivariate statistic
+ *
+ * Perform estimation of the clauses using an MCV list.
+ *
+ * This assumes all the clauses are compatible with the selected statistics
+ * (e.g. only reference columns covered by the statistics, use supported
+ * operator, etc.).
+ *
+ * TODO We may support some additional conditions, most importantly those
+ * matching multiple columns (e.g. "a = b" or "a < b").
+ *
+ * TODO Clamp the selectivity by min of the per-clause selectivities (i.e. the
+ * selectivity of the most restrictive clause), because that's the maximum
+ * we can ever get from an AND-ed list of clauses. This would probably
+ * prevent issues with hitting too many buckets and low-precision histograms.
+ *
+ * TODO We may remember the lowest frequency in the MCV list, and then later use
+ * it as an upper boundary for the selectivity (had there been a more
+ * frequent item, it'd be in the MCV list). This might improve cases with
+ * low-detail histograms.
+ *
+ * TODO We may also derive some additional boundaries for the selectivity from
+ * the MCV list, because
+ *
+ * (a) if we have a "full equality condition" (one equality condition on
+ * each column of the statistic) and we found a match in the MCV list,
+ * then this is the final selectivity (and pretty accurate),
+ *
+ * (b) if we have a "full equality condition" and we haven't found a match
+ * in the MCV list, then the selectivity is below the lowest frequency
+ * found in the MCV list,
+ *
+ * TODO When applying the clauses to the histogram/MCV list, we can do
+ * that from the most selective clauses first, because that'll
+ * eliminate the buckets/items sooner (so we'll be able to skip
+ * them without the more expensive inspection). But this
+ * requires really knowing the per-clause selectivities in advance,
+ * and that's not what we do now.
+ */
+static Selectivity
+clauselist_mv_selectivity(PlannerInfo *root, List *clauses, MVStatisticInfo *mvstats)
+{
+ bool fullmatch = false;
+
+ /*
+ * Lowest frequency in the MCV list (may be used as an upper bound
+ * for full equality conditions that did not match any MCV item).
+ */
+ Selectivity mcv_low = 0.0;
+
+ /* TODO Evaluate simple 1D selectivities, use the smallest one as
+ * an upper bound, product as lower bound, and sort the
+ * clauses in ascending order by selectivity (to optimize the
+ * MCV/histogram evaluation).
+ */
+
+ /* Evaluate the MCV selectivity */
+ return clauselist_mv_selectivity_mcvlist(root, clauses, mvstats,
+ &fullmatch, &mcv_low);
+}
+
/*
* Collect attributes from mv-compatible clauses.
*/
static Bitmapset *
-collect_mv_attnums(List *clauses, Index relid)
+collect_mv_attnums(List *clauses, Index relid, int types)
{
Bitmapset *attnums = NULL;
ListCell *l;
/*
- * Walk through the clauses and identify the ones we can estimate
- * using multivariate stats, and remember the relid/columns. We'll
- * then cross-check if we have suitable stats, and only if needed
- * we'll split the clauses into multivariate and regular lists.
+ * Walk through the clauses and identify the ones we can estimate using
+ * multivariate stats, and remember the relid/columns. We'll then
+ * cross-check if we have suitable stats, and only if needed we'll split
+ * the clauses into multivariate and regular lists.
*
- * For now we're only interested in RestrictInfo nodes with nested
- * OpExpr, using either a range or equality.
+ * For now we're only interested in RestrictInfo nodes with nested OpExpr,
+ * using either a range or equality.
*/
foreach (l, clauses)
{
- AttrNumber attnum;
Node *clause = (Node *) lfirst(l);
- /* ignore the result for now - we only need the info */
- if (clause_is_mv_compatible(clause, relid, &attnum))
- attnums = bms_add_member(attnums, attnum);
+ /* ignore the result here - we only need the attnums */
+ clause_is_mv_compatible(clause, relid, &attnums, types);
}
/*
@@ -880,10 +1006,10 @@ collect_mv_attnums(List *clauses, Index relid)
* Count the number of attributes in clauses compatible with multivariate stats.
*/
static int
-count_mv_attnums(List *clauses, Index relid)
+count_mv_attnums(List *clauses, Index relid, int type)
{
int c;
- Bitmapset *attnums = collect_mv_attnums(clauses, relid);
+ Bitmapset *attnums = collect_mv_attnums(clauses, relid, type);
c = bms_num_members(attnums);
@@ -913,9 +1039,183 @@ count_varnos(List *clauses, Index *relid)
return cnt;
}
+
+/*
+ * We're looking for statistics matching at least 2 attributes, referenced in
+ * clauses compatible with multivariate statistics. The current selection
+ * criterion is very simple - we choose the statistics referencing the most
+ * attributes.
+ *
+ * If there are multiple statistics referencing the same number of columns
+ * (from the clauses), the one with fewer source columns (as listed in the
+ * ADD STATISTICS when creating the statistics) wins. Else the first one wins.
+ *
+ * This is a very simple criterion, and it has several weaknesses:
+ *
+ * (a) does not consider the accuracy of the statistics
+ *
+ * If there are two histograms built on the same set of columns, but one
+ * has 100 buckets and the other one has 1000 buckets (thus likely
+ * providing better estimates), this is not currently considered.
+ *
+ * (b) does not consider the type of statistics
+ *
+ * If there are three statistics - one containing just a MCV list, another
+ * one with just a histogram and a third one with both, we treat them equally.
+ *
+ * (c) does not consider the number of clauses
+ *
+ * As explained, only the number of referenced attributes counts, so if
+ * there are multiple clauses on a single attribute, this still counts as
+ * a single attribute.
+ *
+ * (d) does not consider the type of condition
+ *
+ * Some clauses may work better with some statistics - for example equality
+ * clauses probably work better with MCV lists than with histograms. But
+ * IS [NOT] NULL conditions may often work better with histograms (thanks
+ * to NULL-buckets).
+ *
+ * So for example with five WHERE conditions
+ *
+ * WHERE (a = 1) AND (b = 1) AND (c = 1) AND (d = 1) AND (e = 1)
+ *
+ * and statistics on (a,b), (a,b,e) and (a,b,c,d), the last one will be selected
+ * as it references the most columns.
+ *
+ * Once we have selected the multivariate statistics, we split the list of
+ * clauses into two parts - conditions that are compatible with the selected
+ * stats, and conditions that will be estimated using simple statistics.
+ *
+ * From the example above, conditions
+ *
+ * (a = 1) AND (b = 1) AND (c = 1) AND (d = 1)
+ *
+ * will be estimated using the multivariate statistics (a,b,c,d) while the last
+ * condition (e = 1) will get estimated using the regular ones.
+ *
+ * There are various alternative selection criteria (e.g. counting conditions
+ * instead of just referenced attributes), but eventually the best option should
+ * be to combine multiple statistics. But that's much harder to do correctly.
+ *
+ * TODO Select multiple statistics and combine them when computing the estimate.
+ *
+ * TODO This will probably have to consider compatibility of clauses, because
+ * 'dependencies' will probably work only with equality clauses.
+ */
+static MVStatisticInfo *
+choose_mv_statistics(List *stats, Bitmapset *attnums)
+{
+ int i;
+ ListCell *lc;
+
+ MVStatisticInfo *choice = NULL;
+
+ int current_matches = 1; /* goal #1: maximize */
+ int current_dims = (MVSTATS_MAX_DIMENSIONS+1); /* goal #2: minimize */
+
+ /*
+ * Walk through the list of statistics, and for each one count how many of
+ * the attributes referenced by the clauses (the 'attnums' bitmap) it covers.
+ */
+ foreach (lc, stats)
+ {
+ MVStatisticInfo *info = (MVStatisticInfo *)lfirst(lc);
+
+ /* columns matching this statistics */
+ int matches = 0;
+
+ int2vector * attrs = info->stakeys;
+ int numattrs = attrs->dim1;
+
+ /* skip dependencies-only stats */
+ if (! info->mcv_built)
+ continue;
+
+ /* count clause attributes covered by this statistic */
+ for (i = 0; i < numattrs; i++)
+ if (bms_is_member(attrs->values[i], attnums))
+ matches++;
+
+ /*
+ * Use this statistic when it matches more attributes than the current
+ * choice, or the same number of attributes over fewer total columns.
+ */
+ if ((matches > current_matches) ||
+ ((matches == current_matches) && (current_dims > numattrs)))
+ {
+ choice = info;
+ current_matches = matches;
+ current_dims = numattrs;
+ }
+ }
+
+ return choice;
+}
+
+
+/*
+ * This splits the clauses list into two parts - one containing clauses that
+ * will be evaluated using the chosen statistics, and the remaining clauses
+ * (either not mv-compatible, or not covered by the chosen statistics).
+ */
+static List *
+clauselist_mv_split(PlannerInfo *root, Index relid,
+ List *clauses, List **mvclauses,
+ MVStatisticInfo *mvstats, int types)
+{
+ int i;
+ ListCell *l;
+ List *non_mvclauses = NIL;
+
+ /* FIXME is there a better way to get info on int2vector? */
+ int2vector * attrs = mvstats->stakeys;
+ int numattrs = mvstats->stakeys->dim1;
+
+ Bitmapset *mvattnums = NULL;
+
+ /* build bitmap of attributes, so we can do bms_is_subset later */
+ for (i = 0; i < numattrs; i++)
+ mvattnums = bms_add_member(mvattnums, attrs->values[i]);
+
+ /* erase the list of mv-compatible clauses */
+ *mvclauses = NIL;
+
+ foreach (l, clauses)
+ {
+ bool match = false; /* by default not mv-compatible */
+ Bitmapset *attnums = NULL;
+ Node *clause = (Node *) lfirst(l);
+
+ if (clause_is_mv_compatible(clause, relid, &attnums, types))
+ {
+ /* are all the attributes part of the selected stats? */
+ if (bms_is_subset(attnums, mvattnums))
+ match = true;
+ }
+
+ /*
+ * The clause matches the selected stats, so add it to the list of
+ * mv-compatible clauses. Otherwise, keep it in the list of 'regular'
+ * clauses (that may be selected later).
+ */
+ if (match)
+ *mvclauses = lappend(*mvclauses, clause);
+ else
+ non_mvclauses = lappend(non_mvclauses, clause);
+ }
+
+ /*
+ * Return the remaining clauses, to be estimated using the regular
+ * per-column statistics.
+ */
+ return non_mvclauses;
+}
typedef struct
{
+ int types; /* types of statistics to consider */
Index varno; /* relid we're interested in */
Bitmapset *varattnos; /* attnums referenced by the clauses */
} mv_compatible_context;
@@ -950,6 +1250,49 @@ mv_compatible_walker(Node *node, mv_compatible_context *context)
return mv_compatible_walker((Node*)rinfo->clause, (void *) context);
}
+ if (or_clause(node) || and_clause(node) || not_clause(node))
+ {
+ /*
+ * AND/OR/NOT-clauses are supported if all sub-clauses are supported
+ *
+ * TODO We might support mixed case, where some of the clauses are
+ * supported and some are not, and treat all supported subclauses
+ * as a single clause, compute its selectivity using mv stats,
+ * and compute the total selectivity using the current algorithm.
+ *
+ * TODO For RestrictInfo above an OR-clause, we might use the orclause
+ * with nested RestrictInfo - we won't have to call pull_varnos()
+ * for each clause, saving time.
+ *
+ * TODO Perhaps this needs a bit more thought for functional
+ * dependencies? Those don't quite work for NOT cases.
+ */
+ BoolExpr *expr = (BoolExpr *) node;
+ ListCell *lc;
+
+ foreach (lc, expr->args)
+ {
+ if (mv_compatible_walker((Node *) lfirst(lc), context))
+ return true;
+ }
+
+ return false;
+ }
+
+ if (IsA(node, NullTest))
+ {
+ NullTest* nt = (NullTest*)node;
+
+ /*
+ * Only simple (Var IS NULL) expressions are supported for now. Maybe we could
+ * use examine_variable to fix this?
+ */
+ if (! IsA(nt->arg, Var))
+ return true;
+
+ return mv_compatible_walker((Node*)(nt->arg), context);
+ }
+
if (IsA(node, Var))
{
Var * var = (Var*)node;
@@ -1010,10 +1353,18 @@ mv_compatible_walker(Node *node, mv_compatible_context *context)
switch (get_oprrest(expr->opno))
{
case F_EQSEL:
-
/* equality conditions are compatible with all statistics */
break;
+ case F_SCALARLTSEL:
+ case F_SCALARGTSEL:
+
+ /* not compatible with functional dependencies */
+ if (! (context->types & MV_CLAUSE_TYPE_MCV))
+ return true; /* terminate */
+
+ break;
+
default:
/* unknown estimator */
@@ -1047,10 +1398,11 @@ mv_compatible_walker(Node *node, mv_compatible_context *context)
* evaluate them using multivariate stats.
*/
static bool
-clause_is_mv_compatible(Node *clause, Index relid, AttrNumber *attnum)
+clause_is_mv_compatible(Node *clause, Index relid, Bitmapset **attnums, int types)
{
mv_compatible_context context;
+ context.types = types;
context.varno = relid;
context.varattnos = NULL; /* no attnums */
@@ -1058,7 +1410,7 @@ clause_is_mv_compatible(Node *clause, Index relid, AttrNumber *attnum)
return false;
/* remember the newly collected attnums */
- *attnum = bms_singleton_member(context.varattnos);
+ *attnums = bms_add_members(*attnums, context.varattnos);
return true;
}
@@ -1075,15 +1427,15 @@ fdeps_reduce_clauses(List *clauses, Index relid, Bitmapset *reduced_attnums)
foreach (lc, clauses)
{
- AttrNumber attnum = InvalidAttrNumber;
+ Bitmapset *attnums = NULL;
Node * clause = (Node*)lfirst(lc);
/* ignore clauses that are not compatible with functional dependencies */
- if (! clause_is_mv_compatible(clause, relid, &attnum))
+ if (! clause_is_mv_compatible(clause, relid, &attnums, MV_CLAUSE_TYPE_FDEP))
reduced_clauses = lappend(reduced_clauses, clause);
/* for equality clauses, only keep those not on reduced attributes */
- if (! bms_is_member(attnum, reduced_attnums))
+ if (! bms_is_subset(attnums, reduced_attnums))
reduced_clauses = lappend(reduced_clauses, clause);
}
@@ -1208,7 +1560,7 @@ clauselist_apply_dependencies(PlannerInfo *root, List *clauses,
return clauses;
/* collect attnums from clauses compatible with dependencies (equality) */
- clause_attnums = collect_mv_attnums(clauses, relid);
+ clause_attnums = collect_mv_attnums(clauses, relid, MV_CLAUSE_TYPE_FDEP);
/* decide which attnums may be eliminated */
reduced_attnums = fdeps_reduce_attnums(stats, clause_attnums);
@@ -1233,6 +1585,9 @@ stats_type_matches(MVStatisticInfo *stat, int type)
if ((type & MV_CLAUSE_TYPE_FDEP) && stat->deps_built)
return true;
+ if ((type & MV_CLAUSE_TYPE_MCV) && stat->mcv_built)
+ return true;
+
return false;
}
@@ -1266,3 +1621,392 @@ find_stats(PlannerInfo *root, Index relid)
return root->simple_rel_array[relid]->mvstatlist;
}
+
+/*
+ * Estimate selectivity of clauses using a MCV list.
+ *
+ * If there's no MCV list for the stats, the function returns 0.0.
+ *
+ * While computing the estimate, the function checks whether all the
+ * columns were matched with an equality condition. If that's the case,
+ * we can skip processing the histogram, as there can be no rows in
+ * it with the same values - all the rows matching the condition are
+ * represented by the MCV item. This can only happen with equality
+ * on all the attributes.
+ *
+ * The algorithm works like this:
+ *
+ * 1) mark all items as 'match'
+ * 2) walk through all the clauses
+ * 3) for a particular clause, walk through all the items
+ * 4) skip items that are already 'no match'
+ * 5) check clause for items that still match
+ * 6) sum frequencies for items to get selectivity
+ *
+ * The function also returns the frequency of the least frequent item
+ * in the MCV list, which may be useful for clamping the estimate from the
+ * histogram (all items not present in the MCV list are less frequent).
+ * This however seems useful only for cases with conditions on all
+ * attributes.
+ *
+ * TODO This only handles AND-ed clauses, but it might work for OR-ed
+ * lists too - it just needs to reverse the logic a bit. I.e. start
+ * with 'no match' for all items, and mark the items as a match
+ * as the clauses are processed (and skip items that are 'match').
+ */
+static Selectivity
+clauselist_mv_selectivity_mcvlist(PlannerInfo *root, List *clauses,
+ MVStatisticInfo *mvstats, bool *fullmatch,
+ Selectivity *lowsel)
+{
+ int i;
+ Selectivity s = 0.0;
+ Selectivity u = 0.0;
+
+ MCVList mcvlist = NULL;
+ int nmatches = 0;
+
+ /* match/mismatch bitmap for each MCV item */
+ char * matches = NULL;
+
+ Assert(clauses != NIL);
+ Assert(list_length(clauses) >= 2);
+
+ /* there's no MCV list built yet */
+ if (! mvstats->mcv_built)
+ return 0.0;
+
+ mcvlist = load_mv_mcvlist(mvstats->mvoid);
+
+ Assert(mcvlist != NULL);
+ Assert(mcvlist->nitems > 0);
+
+ /* by default all the MCV items match the clauses fully */
+ matches = palloc0(sizeof(char) * mcvlist->nitems);
+ memset(matches, MVSTATS_MATCH_FULL, sizeof(char)*mcvlist->nitems);
+
+ /* number of matching MCV items */
+ nmatches = mcvlist->nitems;
+
+ nmatches = update_match_bitmap_mcvlist(root, clauses,
+ mvstats->stakeys, mcvlist,
+ nmatches, matches,
+ lowsel, fullmatch, false);
+
+ /* sum frequencies for all the matching MCV items */
+ for (i = 0; i < mcvlist->nitems; i++)
+ {
+ /* used to 'scale' for MCV lists not covering all tuples */
+ u += mcvlist->items[i]->frequency;
+
+ if (matches[i] != MVSTATS_MATCH_NONE)
+ s += mcvlist->items[i]->frequency;
+ }
+
+ pfree(matches);
+ pfree(mcvlist);
+
+ return s*u;
+}
+
+/*
+ * Evaluate clauses using the MCV list, and update the match bitmap.
+ *
+ * The bitmap may be already partially set, so this is really a way to
+ * combine results of several clause lists - either when computing
+ * conditional probability P(A|B) or a combination of AND/OR clauses.
+ *
+ * TODO This works with 'bitmap' where each bit is represented as a char,
+ * which is slightly wasteful. Instead, we could use a regular
+ * bitmap, reducing the size to ~1/8. Another thing is merging the
+ * bitmaps using & and |, which might be faster than min/max.
+ */
+static int
+update_match_bitmap_mcvlist(PlannerInfo *root, List *clauses,
+ int2vector *stakeys, MCVList mcvlist,
+ int nmatches, char * matches,
+ Selectivity *lowsel, bool *fullmatch,
+ bool is_or)
+{
+ int i;
+ ListCell * l;
+
+ Bitmapset *eqmatches = NULL; /* attributes with equality matches */
+
+ /* The bitmap may be partially built. */
+ Assert(nmatches >= 0);
+ Assert(nmatches <= mcvlist->nitems);
+ Assert(clauses != NIL);
+ Assert(list_length(clauses) >= 1);
+ Assert(mcvlist != NULL);
+ Assert(mcvlist->nitems > 0);
+
+ /* No further work needed - AND with no matches, or OR with all matching */
+ if (((nmatches == 0) && (! is_or)) ||
+ ((nmatches == mcvlist->nitems) && is_or))
+ return nmatches;
+
+ /*
+ * find the lowest frequency in the MCV list
+ *
+ * We need to do that here, because we do various tricks in the following
+ * code - skipping items already ruled out, etc.
+ *
+ * XXX A loop is necessary because the MCV list is not sorted by frequency.
+ */
+ *lowsel = 1.0;
+ for (i = 0; i < mcvlist->nitems; i++)
+ {
+ MCVItem item = mcvlist->items[i];
+
+ if (item->frequency < *lowsel)
+ *lowsel = item->frequency;
+ }
+
+ /*
+ * Loop through the list of clauses, and for each of them evaluate
+ * all the MCV items not yet eliminated by the preceding clauses.
+ */
+ foreach (l, clauses)
+ {
+ Node * clause = (Node*)lfirst(l);
+
+ /* if it's a RestrictInfo, then extract the clause */
+ if (IsA(clause, RestrictInfo))
+ clause = (Node*)((RestrictInfo*)clause)->clause;
+
+ /* if there are no remaining matches possible, we can stop */
+ if (((nmatches == 0) && (! is_or)) ||
+ ((nmatches == mcvlist->nitems) && is_or))
+ break;
+
+ /* it's either OpClause, or NullTest */
+ if (is_opclause(clause))
+ {
+ OpExpr *expr = (OpExpr*)clause;
+ bool varonleft = true;
+ bool ok;
+ FmgrInfo opproc;
+
+ /* get procedure computing operator selectivity */
+ RegProcedure oprrest = get_oprrest(expr->opno);
+
+ fmgr_info(get_opcode(expr->opno), &opproc);
+
+ ok = (NumRelids(clause) == 1) &&
+ (is_pseudo_constant_clause(lsecond(expr->args)) ||
+ (varonleft = false,
+ is_pseudo_constant_clause(linitial(expr->args))));
+
+ if (ok)
+ {
+
+ FmgrInfo gtproc;
+ Var * var = (varonleft) ? linitial(expr->args) : lsecond(expr->args);
+ Const * cst = (varonleft) ? lsecond(expr->args) : linitial(expr->args);
+ bool isgt = (! varonleft);
+
+ TypeCacheEntry *typecache
+ = lookup_type_cache(var->vartype, TYPECACHE_GT_OPR);
+
+ /* FIXME proper matching attribute to dimension */
+ int idx = mv_get_index(var->varattno, stakeys);
+
+ fmgr_info(get_opcode(typecache->gt_opr), &gtproc);
+
+ /*
+ * Walk through the MCV items and evaluate the current clause. We can
+ * skip items that were already ruled out, and terminate if there are
+ * no remaining MCV items that might possibly match.
+ */
+ for (i = 0; i < mcvlist->nitems; i++)
+ {
+ bool mismatch = false;
+ MCVItem item = mcvlist->items[i];
+
+ /*
+ * If there are no more matches (AND) or no remaining unmatched
+ * items (OR), we can stop processing this clause.
+ */
+ if (((nmatches == 0) && (! is_or)) ||
+ ((nmatches == mcvlist->nitems) && is_or))
+ break;
+
+ /*
+ * For AND-lists, we can also mark NULL items as 'no match' (and
+ * then skip them). For OR-lists this is not possible.
+ */
+ if ((! is_or) && item->isnull[idx])
+ matches[i] = MVSTATS_MATCH_NONE;
+
+ /* skip MCV items that were already ruled out */
+ if ((! is_or) && (matches[i] == MVSTATS_MATCH_NONE))
+ continue;
+ else if (is_or && (matches[i] == MVSTATS_MATCH_FULL))
+ continue;
+
+ switch (oprrest)
+ {
+ case F_EQSEL:
+ /*
+ * We don't care about isgt in equality, because it does not
+ * matter whether it's (var = const) or (const = var).
+ */
+ mismatch = ! DatumGetBool(FunctionCall2Coll(&opproc,
+ DEFAULT_COLLATION_OID,
+ cst->constvalue,
+ item->values[idx]));
+
+ if (! mismatch)
+ eqmatches = bms_add_member(eqmatches, idx);
+
+ break;
+
+ case F_SCALARLTSEL: /* column < constant */
+ case F_SCALARGTSEL: /* column > constant */
+
+ /*
+ * Evaluate the operator with swapped arguments (const op value),
+ * so a 'true' result means the MCV item does not match. With the
+ * Var on the right side of the operator (isgt), the result is
+ * inverted below.
+ */
+ mismatch = DatumGetBool(FunctionCall2Coll(&opproc,
+ DEFAULT_COLLATION_OID,
+ cst->constvalue,
+ item->values[idx]));
+
+ /* invert the result if isgt=true */
+ mismatch = (isgt) ? (! mismatch) : mismatch;
+ break;
+ }
+
+ /* XXX The conditions on matches[i] are not needed, as we
+ * skip MCV items that can't become true/false, depending
+ * on the current flag. See beginning of the loop over
+ * MCV items.
+ */
+
+ if ((is_or) && (matches[i] == MVSTATS_MATCH_NONE) && (! mismatch))
+ {
+ /* OR - was MATCH_NONE, but will be MATCH_FULL */
+ matches[i] = MVSTATS_MATCH_FULL;
+ ++nmatches;
+ continue;
+ }
+ else if ((! is_or) && (matches[i] == MVSTATS_MATCH_FULL) && mismatch)
+ {
+ /* AND - was MATCH_FULL, but will be MATCH_NONE */
+ matches[i] = MVSTATS_MATCH_NONE;
+ --nmatches;
+ continue;
+ }
+
+ }
+ }
+ }
+ else if (IsA(clause, NullTest))
+ {
+ NullTest * expr = (NullTest*)clause;
+ Var * var = (Var*)(expr->arg);
+
+ /* FIXME proper matching attribute to dimension */
+ int idx = mv_get_index(var->varattno, stakeys);
+
+ /*
+ * Walk through the MCV items and evaluate the current clause. We can
+ * skip items that were already ruled out, and terminate if there are
+ * no remaining MCV items that might possibly match.
+ */
+ for (i = 0; i < mcvlist->nitems; i++)
+ {
+ MCVItem item = mcvlist->items[i];
+
+ /* if there are no more matches, we can stop processing this clause */
+ if (nmatches == 0)
+ break;
+
+ /* skip MCV items that were already ruled out */
+ if (matches[i] == MVSTATS_MATCH_NONE)
+ continue;
+
+ /* if the clause mismatches the MCV item, set it as MATCH_NONE */
+ if (((expr->nulltesttype == IS_NULL) && (! item->isnull[idx])) ||
+ ((expr->nulltesttype == IS_NOT_NULL) && (item->isnull[idx])))
+ {
+ matches[i] = MVSTATS_MATCH_NONE;
+ --nmatches;
+ }
+ }
+ }
+ else if (or_clause(clause) || and_clause(clause))
+ {
+ /* AND/OR clause, with all clauses compatible with the selected MV stat */
+
+ int i;
+ BoolExpr *orclause = ((BoolExpr*)clause);
+ List *orclauses = orclause->args;
+
+ /* match/mismatch bitmap for each MCV item */
+ int or_nmatches = 0;
+ char * or_matches = NULL;
+
+ Assert(orclauses != NIL);
+ Assert(list_length(orclauses) >= 2);
+
+ /* number of matching MCV items */
+ or_nmatches = mcvlist->nitems;
+
+ /* by default none of the MCV items matches the clauses */
+ or_matches = palloc0(sizeof(char) * or_nmatches);
+
+ if (or_clause(clause))
+ {
+ /* OR clauses assume nothing matches, initially */
+ memset(or_matches, MVSTATS_MATCH_NONE, sizeof(char)*or_nmatches);
+ or_nmatches = 0;
+ }
+ else
+ {
+ /* AND clauses assume everything matches, initially */
+ memset(or_matches, MVSTATS_MATCH_FULL, sizeof(char)*or_nmatches);
+ }
+
+ /* build the match bitmap for the OR-clauses */
+ or_nmatches = update_match_bitmap_mcvlist(root, orclauses,
+ stakeys, mcvlist,
+ or_nmatches, or_matches,
+ lowsel, fullmatch, or_clause(clause));
+
+ /* merge the bitmap into the existing one */
+ for (i = 0; i < mcvlist->nitems; i++)
+ {
+ /*
+ * To AND-merge the bitmaps, MIN() semantics is used; for OR-merge,
+ * MAX() is used.
+ *
+ * FIXME this does not update nmatches to reflect the merged bitmap
+ */
+ UPDATE_RESULT(matches[i], or_matches[i], is_or);
+ }
+
+ pfree(or_matches);
+
+ }
+ else
+ {
+ elog(ERROR, "unknown clause type: %d", clause->type);
+ }
+ }
+
+ /*
+ * If all the columns were matched by equality, it's a full match.
+ * In this case at most a single MCV item can match the clauses
+ * (two matching items would have to contain exactly the same values).
+ */
+ *fullmatch = (bms_num_members(eqmatches) == mcvlist->ndimensions);
+
+ /* free the allocated pieces */
+ if (eqmatches)
+ pfree(eqmatches);
+
+ return nmatches;
+}
diff --git a/src/backend/optimizer/util/plancat.c b/src/backend/optimizer/util/plancat.c
index 7fb2088..8394111 100644
--- a/src/backend/optimizer/util/plancat.c
+++ b/src/backend/optimizer/util/plancat.c
@@ -412,7 +412,7 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
mvstat = (Form_pg_mv_statistic) GETSTRUCT(htup);
/* unavailable stats are not interesting for the planner */
- if (mvstat->deps_built)
+ if (mvstat->deps_built || mvstat->mcv_built)
{
info = makeNode(MVStatisticInfo);
@@ -421,9 +421,11 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
/* enabled statistics */
info->deps_enabled = mvstat->deps_enabled;
+ info->mcv_enabled = mvstat->mcv_enabled;
/* built/available statistics */
info->deps_built = mvstat->deps_built;
+ info->mcv_built = mvstat->mcv_built;
/* stakeys */
adatum = SysCacheGetAttr(MVSTATOID, htup,
diff --git a/src/backend/utils/mvstats/Makefile b/src/backend/utils/mvstats/Makefile
index 099f1ed..f9bf10c 100644
--- a/src/backend/utils/mvstats/Makefile
+++ b/src/backend/utils/mvstats/Makefile
@@ -12,6 +12,6 @@ subdir = src/backend/utils/mvstats
top_builddir = ../../../..
include $(top_builddir)/src/Makefile.global
-OBJS = common.o dependencies.o
+OBJS = common.o dependencies.o mcv.o
include $(top_srcdir)/src/backend/common.mk
diff --git a/src/backend/utils/mvstats/README.mcv b/src/backend/utils/mvstats/README.mcv
new file mode 100644
index 0000000..e93cfe4
--- /dev/null
+++ b/src/backend/utils/mvstats/README.mcv
@@ -0,0 +1,137 @@
+MCV lists
+=========
+
+Multivariate MCV (most-common values) lists are a straightforward extension of
+regular MCV lists, tracking the most frequent combinations of values for a
+group of attributes.
+
+This works particularly well for columns with a small number of distinct values,
+as the list may include all the combinations and approximate the distribution
+very accurately.
+
+For columns with a large number of distinct values (e.g. those with continuous
+domains), the list will only track the most frequent combinations. If the
+distribution is mostly uniform (all combinations about equally frequent), the
+MCV list will be empty.
+
+Estimates of some clauses (e.g. equality) based on MCV lists are more accurate
+than when using histograms.
+
+Also, MCV lists don't necessarily require sorting of the values (the fact that
+we use sorting when building them is an implementation detail), but even more
+importantly the ordering is not built into the approximation (while histograms
+are built on ordering). So MCV lists work well even for attributes where the
+ordering of the data type is disconnected from the meaning of the data. For
+example we know how to sort strings, but it's unlikely to make much sense for
+city names (or other label-like attributes).
+
+
+Selectivity estimation
+----------------------
+
+The estimation, implemented in clauselist_mv_selectivity_mcvlist(), is quite
+simple in principle - we need to identify MCV items matching all the clauses
+and sum frequencies of all those items.
+
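+For example, take a hypothetical MCV list on columns (a,b) with these items
+and frequencies:
+
+  (1, 1)    0.30
+  (1, 2)    0.10
+  (2, 2)    0.05
+
+The conditions WHERE (a = 1) AND (b < 3) match the first two items, so the
+estimate is 0.30 + 0.10 = 0.40.
+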
+Currently MCV lists support estimation of the following clause types:
+
+ (a) equality clauses WHERE (a = 1) AND (b = 2)
+ (b) inequality clauses WHERE (a < 1) AND (b >= 2)
+ (c) NULL clauses WHERE (a IS NULL) AND (b IS NOT NULL)
+ (d) OR clauses WHERE (a < 1) OR (b >= 2)
+
+It's possible to add support for additional clauses, for example:
+
+ (e) multi-var clauses WHERE (a > b)
+
+and possibly others. These are tasks for the future, not yet implemented.
+
+
+Estimating equality clauses
+---------------------------
+
+When computing selectivity estimate for equality clauses
+
+ (a = 1) AND (b = 2)
+
+we can do this estimate pretty exactly assuming that two conditions are met:
+
+ (1) there's an equality condition on all attributes of the statistic
+
+ (2) we find a matching item in the MCV list
+
+In this case we know the MCV item represents all tuples matching the clauses,
+and the selectivity estimate is complete (i.e. we don't need to perform
+estimation using the histogram). This is what we call 'full match'.
+
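+For example, with statistics on (a,b), the clauses
+
+  WHERE (a = 1) AND (b = 1)
+
+form a full equality condition; finding an MCV item (1,1) means the item's
+frequency directly gives the final estimate.
+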
+When only (1) holds, but there's no matching MCV item, we don't know whether
+there are no such rows or they're just not very frequent. We can however use the
+frequency of the least frequent MCV item as an upper bound for the selectivity.
+
+For a combination of equality conditions (not full-match case) we can clamp the
+selectivity by the minimum of selectivities for each condition. For example if
+we know the number of distinct values for each column, we can use 1/ndistinct
+as a per-column estimate. Or rather 1/ndistinct + selectivity derived from the
+MCV list.
+
+We should probably also use only the 'residual ndistinct', excluding the items
+included in the MCV list (and likewise the residual frequency):
+
+ f = (1.0 - sum(MCV frequencies)) / (ndistinct - ndistinct(MCV list))
+
+but it's worth pointing out the ndistinct values are multi-variate for the
+columns referenced by the equality conditions.
+
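+To illustrate with made-up numbers: if the MCV items sum to 0.8 of the table,
+the (multivariate) ndistinct is 1000, and the MCV list covers 100 of those
+combinations, then
+
+  f = (1.0 - 0.8) / (1000 - 100) = 0.2 / 900 ~= 0.00022
+
+for each combination not present in the MCV list.
+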
+Note: Only the "full match" limit is currently implemented.
+
+
+Hashed MCV (not yet implemented)
+--------------------------------
+
+Regular MCV lists have to include actual values for each item, so if those items
+are large the list may be quite large. This is especially true for multi-variate
+MCV lists, although the current implementation partially mitigates this by
+de-duplicating the values before storing them on disk.
+
+It's possible to only store hashes (32-bit values) instead of the actual values,
+significantly reducing the space requirements. Obviously, this would only make
+the MCV lists useful for estimating equality conditions (assuming the 32-bit
+hashes make the collisions rare enough).
+
+This might also complicate matching the columns to available stats.
+
+
+TODO Consider implementing hashed MCV list, storing just 32-bit hashes instead
+ of the actual values. This type of MCV list will be useful only for
+ estimating equality clauses, and will reduce space requirements for large
+ varlena types (in such cases we usually only want equality anyway).
+
+TODO Currently there's no logic to consider building only a MCV list (and not
+ building the histogram at all), except for doing this decision manually in
+ ADD STATISTICS.
+
+
+Inspecting the MCV list
+-----------------------
+
+Inspecting the regular (per-attribute) MCV lists is trivial, as it's enough
+to select the columns from pg_stats - the data is encoded as anyarrays, so we
+simply get the text representation of the arrays.
+
+With multivariate MCV lists it's not that simple due to the possible mix of
+data types. It might be possible to produce a similar array-like
+representation, but that would unnecessarily complicate further processing and
+analysis of the MCV list. Instead, there's a set-returning function providing
+the values, frequencies etc.
+
+ SELECT * FROM pg_mv_mcv_items();
+
+It has a single input parameter:
+
+ oid - OID of the MCV list (pg_mv_statistic.staoid)
+
+and produces a table with these columns:
+
+ - item ID (0...nitems-1)
+ - values (string array)
+ - nulls only (boolean array)
+ - frequency (double precision)
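+
+For example, to inspect the MCV list built on a table 't' (a sketch, assuming
+a single statistics entry on that table):
+
+    SELECT m.* FROM pg_mv_statistic s,
+                    LATERAL pg_mv_mcv_items(s.oid) m
+     WHERE s.starelid = 't'::regclass;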
diff --git a/src/backend/utils/mvstats/README.stats b/src/backend/utils/mvstats/README.stats
index a38ea7b..5c5c59a 100644
--- a/src/backend/utils/mvstats/README.stats
+++ b/src/backend/utils/mvstats/README.stats
@@ -8,9 +8,50 @@ not true, resulting in estimation errors.
Multivariate stats track different types of dependencies between the columns,
hopefully improving the estimates.
-Currently we only have one kind of multivariate statistics - soft functional
-dependencies, and we use it to improve estimates of equality clauses. See
-README.dependencies for details.
+
+Types of statistics
+-------------------
+
+Currently we have only two kinds of multivariate statistics:
+
+ (a) soft functional dependencies (README.dependencies)
+
+ (b) MCV lists (README.mcv)
+
+
+Compatible clause types
+-----------------------
+
+Each type of statistics may be used to estimate some subset of clause types.
+
+ (a) functional dependencies - equality clauses (AND), possibly IS NULL
+
+ (b) MCV list - equality and inequality clauses, IS [NOT] NULL, AND/OR
+
+Currently only simple operator clauses (Var op Const) are supported, but it's
+possible to support more complex clause types, e.g. (Var op Var).
+
+
+Complex clauses
+---------------
+
+We also support estimating more complex clauses - essentially AND/OR clauses
+with (Var op Const) as leaves, as long as all the referenced attributes are
+covered by a single statistics.
+
+For example this condition
+
+ (a=1) AND ((b=2) OR ((c=3) AND (d=4)))
+
+may be estimated using statistics on (a,b,c,d). If we only have statistics on
+(b,c,d) we may estimate the second part, and estimate (a=1) using simple stats.
+
+If we only have statistics on (a,b,c) we can't apply it at all at this point,
+but it's worth pointing out that clauselist_selectivity() works recursively,
+so when handling the second part (the OR-clause) we'll be able to apply the
+statistics.
+
+Note: The multi-statistics estimation patch also makes it possible to pass some
+clauses as 'conditions' into the deeper parts of the expression tree.
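+
+For example (a sketch, using the CREATE STATISTICS syntax from the regression
+tests):
+
+    CREATE STATISTICS s1 ON t (a, b, c, d) WITH (mcv);
+    ANALYZE t;
+
+    SELECT * FROM t WHERE (a = 1) AND ((b = 2) OR ((c = 3) AND (d = 4)));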
Selectivity estimation
@@ -23,14 +64,48 @@ When estimating selectivity, we aim to achieve several things:
(b) minimize the overhead, especially when no suitable multivariate stats
exist (so if you are not using multivariate stats, there's no overhead)
-This clauselist_selectivity() performs several inexpensive checks first, before
+Thus clauselist_selectivity() performs several inexpensive checks first, before
even attempting to do the more expensive estimation.
(1) check if there are multivariate stats on the relation
- (2) check there are at least two attributes referenced by clauses compatible
- with multivariate statistics (equality clauses for func. dependencies)
+ (2) check that there are functional dependencies on the table, and that
+ there are at least two attributes referenced by compatible clauses
+ (equality clauses for func. dependencies)
(3) perform reduction of equality clauses using func. dependencies
- (4) estimate the reduced list of clauses using regular statistics
+ (4) check that there are multivariate MCV lists on the table, and that
+ there are at least two attributes referenced by compatible clauses
+ (equalities, inequalities, etc.)
+
+ (5) find the best multivariate statistics (matching the most conditions)
+ and use it to compute the estimate
+
+ (6) estimate the remaining clauses (not estimated using multivariate stats)
+ using the regular per-column statistics
+
+Whenever we find there are no suitable stats, we skip the expensive steps.
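+
+For example, with both functional dependencies and a MCV list built on (a,b),
+a query like
+
+    SELECT * FROM t WHERE (a = 1) AND (b = 2);
+
+may have its clause list reduced by steps (2)-(3) using the dependencies, with
+the remainder estimated by steps (4)-(6) using the MCV list and the regular
+per-column statistics.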
+
+
+Further (possibly crazy) ideas
+------------------------------
+
+Currently the clauses are only estimated using a single statistics, even if
+there are multiple candidate statistics - for example assume we have statistics
+on (a,b,c) and (b,c,d), and estimate conditions
+
+ (b = 1) AND (c = 2)
+
+Then both statistics may be used, but we only use one of them. Maybe we could
+compute estimates using all the candidate stats, and somehow aggregate them
+into the final estimate, e.g. using the average or median.
+
+Some stats may give better estimates than others, but it's very difficult to say
+in advance which stats are the best (it depends on the number of buckets, number
+of additional columns not referenced in the clauses, type of condition etc.).
+
+But of course, this may result in expensive estimation (CPU-wise).
+
+So we might add a GUC to choose between the simple (single-statistics) and the
+multi-statistics estimation, possibly with a table-level parameter (ALTER TABLE ...).
diff --git a/src/backend/utils/mvstats/common.c b/src/backend/utils/mvstats/common.c
index dcb7c78..4f5a842 100644
--- a/src/backend/utils/mvstats/common.c
+++ b/src/backend/utils/mvstats/common.c
@@ -16,12 +16,14 @@
#include "common.h"
+#include "utils/array.h"
+
static VacAttrStats ** lookup_var_attr_stats(int2vector *attrs,
- int natts, VacAttrStats **vacattrstats);
+ int natts,
+ VacAttrStats **vacattrstats);
static List* list_mv_stats(Oid relid);
-
/*
* Compute requested multivariate stats, using the rows sampled for the
* plain (single-column) stats.
@@ -49,6 +51,8 @@ build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
int j;
MVStatisticInfo *stat = (MVStatisticInfo *)lfirst(lc);
MVDependencies deps = NULL;
+ MCVList mcvlist = NULL;
+ int numrows_filtered = 0;
VacAttrStats **stats = NULL;
int numatts = 0;
@@ -87,8 +91,12 @@ build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
if (stat->deps_enabled)
deps = build_mv_dependencies(numrows, rows, attrs, stats);
+ /* build the MCV list */
+ if (stat->mcv_enabled)
+ mcvlist = build_mv_mcvlist(numrows, rows, attrs, stats, &numrows_filtered);
+
/* store the histogram / MCV list in the catalog */
- update_mv_stats(stat->mvoid, deps, attrs);
+ update_mv_stats(stat->mvoid, deps, mcvlist, attrs, stats);
}
}
@@ -166,6 +174,8 @@ list_mv_stats(Oid relid)
info->stakeys = buildint2vector(stats->stakeys.values, stats->stakeys.dim1);
info->deps_enabled = stats->deps_enabled;
info->deps_built = stats->deps_built;
+ info->mcv_enabled = stats->mcv_enabled;
+ info->mcv_built = stats->mcv_built;
result = lappend(result, info);
}
@@ -180,8 +190,56 @@ list_mv_stats(Oid relid)
return result;
}
+
+/*
+ * Find the attnums of the MV statistics with the given mvoid (also sets *relid).
+ */
+int2vector*
+find_mv_attnums(Oid mvoid, Oid *relid)
+{
+ ArrayType *arr;
+ Datum adatum;
+ bool isnull;
+ HeapTuple htup;
+ int2vector *keys;
+
+ /* Fetch the pg_mv_statistic tuple for the given OID from the syscache. */
+ htup = SearchSysCache1(MVSTATOID,
+ ObjectIdGetDatum(mvoid));
+
+ /* XXX syscache contains OIDs of deleted stats (not invalidated) */
+ if (! HeapTupleIsValid(htup))
+ return NULL;
+
+ /* starelid */
+ adatum = SysCacheGetAttr(MVSTATOID, htup,
+ Anum_pg_mv_statistic_starelid, &isnull);
+ Assert(!isnull);
+
+ *relid = DatumGetObjectId(adatum);
+
+ /* stakeys */
+ adatum = SysCacheGetAttr(MVSTATOID, htup,
+ Anum_pg_mv_statistic_stakeys, &isnull);
+ Assert(!isnull);
+
+ arr = DatumGetArrayTypeP(adatum);
+
+ keys = buildint2vector((int16 *) ARR_DATA_PTR(arr),
+ ARR_DIMS(arr)[0]);
+ ReleaseSysCache(htup);
+
+ /* TODO Maybe save the list into the relcache, as in RelationGetIndexList
+ * (which was used as an inspiration for this one)? */
+
+ return keys;
+}
+
+
void
-update_mv_stats(Oid mvoid, MVDependencies dependencies, int2vector *attrs)
+update_mv_stats(Oid mvoid,
+ MVDependencies dependencies, MCVList mcvlist,
+ int2vector *attrs, VacAttrStats **stats)
{
HeapTuple stup,
oldtup;
@@ -206,18 +264,29 @@ update_mv_stats(Oid mvoid, MVDependencies dependencies, int2vector *attrs)
= PointerGetDatum(serialize_mv_dependencies(dependencies));
}
+ if (mcvlist != NULL)
+ {
+ bytea * data = serialize_mv_mcvlist(mcvlist, attrs, stats);
+ nulls[Anum_pg_mv_statistic_stamcv -1] = (data == NULL);
+ values[Anum_pg_mv_statistic_stamcv - 1] = PointerGetDatum(data);
+ }
+
/* always replace the value (either by bytea or NULL) */
replaces[Anum_pg_mv_statistic_stadeps -1] = true;
+ replaces[Anum_pg_mv_statistic_stamcv -1] = true;
/* always change the availability flags */
nulls[Anum_pg_mv_statistic_deps_built -1] = false;
+ nulls[Anum_pg_mv_statistic_mcv_built -1] = false;
nulls[Anum_pg_mv_statistic_stakeys-1] = false;
/* use the new attnums, in case we removed some dropped ones */
replaces[Anum_pg_mv_statistic_deps_built-1] = true;
+ replaces[Anum_pg_mv_statistic_mcv_built -1] = true;
replaces[Anum_pg_mv_statistic_stakeys -1] = true;
values[Anum_pg_mv_statistic_deps_built-1] = BoolGetDatum(dependencies != NULL);
+ values[Anum_pg_mv_statistic_mcv_built -1] = BoolGetDatum(mcvlist != NULL);
values[Anum_pg_mv_statistic_stakeys -1] = PointerGetDatum(attrs);
/* Is there already a pg_mv_statistic tuple for this attribute? */
@@ -246,6 +315,21 @@ update_mv_stats(Oid mvoid, MVDependencies dependencies, int2vector *attrs)
heap_close(sd, RowExclusiveLock);
}
+
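+/*
+ * Return the index (dimension) of the attribute within the sorted stakeys
+ * vector, i.e. the number of keys smaller than varattno.
+ */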
+int
+mv_get_index(AttrNumber varattno, int2vector * stakeys)
+{
+ int i, idx = 0;
+ for (i = 0; i < stakeys->dim1; i++)
+ {
+ if (stakeys->values[i] < varattno)
+ idx += 1;
+ else
+ break;
+ }
+ return idx;
+}
+
/* multi-variate stats comparator */
/*
@@ -256,11 +340,15 @@ update_mv_stats(Oid mvoid, MVDependencies dependencies, int2vector *attrs)
int
compare_scalars_simple(const void *a, const void *b, void *arg)
{
- Datum da = *(Datum*)a;
- Datum db = *(Datum*)b;
- SortSupport ssup= (SortSupport) arg;
+ return compare_datums_simple(*(Datum*)a,
+ *(Datum*)b,
+ (SortSupport)arg);
+}
- return ApplySortComparator(da, false, db, false, ssup);
+int
+compare_datums_simple(Datum a, Datum b, SortSupport ssup)
+{
+ return ApplySortComparator(a, false, b, false, ssup);
}
/*
@@ -377,3 +465,32 @@ multi_sort_compare_dims(int start, int end,
return 0;
}
+
+/* simple counterpart to qsort_arg */
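+/*
+ * Expects 'base' to be sorted using the same comparator (and 'arg');
+ * returns a pointer to a matching element, or NULL if there is none.
+ */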
+void *
+bsearch_arg(const void *key, const void *base, size_t nmemb, size_t size,
+ int (*compar) (const void *, const void *, void *),
+ void *arg)
+{
+ size_t l, u, idx;
+ const void *p;
+ int comparison;
+
+ l = 0;
+ u = nmemb;
+ while (l < u)
+ {
+ idx = (l + u) / 2;
+ p = (void *) (((const char *) base) + (idx * size));
+ comparison = (*compar) (key, p, arg);
+
+ if (comparison < 0)
+ u = idx;
+ else if (comparison > 0)
+ l = idx + 1;
+ else
+ return (void *) p;
+ }
+
+ return NULL;
+}
diff --git a/src/backend/utils/mvstats/common.h b/src/backend/utils/mvstats/common.h
index 75b9c54..350760b 100644
--- a/src/backend/utils/mvstats/common.h
+++ b/src/backend/utils/mvstats/common.h
@@ -47,6 +47,14 @@ typedef struct
int tupno; /* position index for tuple it came from */
} ScalarItem;
+/* (de)serialization info */
+typedef struct DimensionInfo {
+ int nvalues; /* number of deduplicated values */
+ int nbytes; /* number of bytes (serialized) */
+ int typlen; /* pg_type.typlen */
+ bool typbyval; /* pg_type.typbyval */
+} DimensionInfo;
+
/* multi-sort */
typedef struct MultiSortSupportData {
int ndims; /* number of dimensions supported by the */
@@ -58,6 +66,7 @@ typedef MultiSortSupportData* MultiSortSupport;
typedef struct SortItem {
Datum *values;
bool *isnull;
+ int count;
} SortItem;
MultiSortSupport multi_sort_init(int ndims);
@@ -74,5 +83,11 @@ int multi_sort_compare_dims(int start, int end, const SortItem *a,
const SortItem *b, MultiSortSupport mss);
/* comparators, used when constructing multivariate stats */
+int compare_datums_simple(Datum a, Datum b, SortSupport ssup);
int compare_scalars_simple(const void *a, const void *b, void *arg);
int compare_scalars_partition(const void *a, const void *b, void *arg);
+
+void * bsearch_arg(const void *key, const void *base,
+ size_t nmemb, size_t size,
+ int (*compar) (const void *, const void *, void *),
+ void *arg);
diff --git a/src/backend/utils/mvstats/mcv.c b/src/backend/utils/mvstats/mcv.c
new file mode 100644
index 0000000..b300c1a
--- /dev/null
+++ b/src/backend/utils/mvstats/mcv.c
@@ -0,0 +1,1120 @@
+/*-------------------------------------------------------------------------
+ *
+ * mcv.c
+ * POSTGRES multivariate MCV lists
+ *
+ *
+ * Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ * src/backend/utils/mvstats/mcv.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres.h"
+
+#include "fmgr.h"
+#include "funcapi.h"
+
+#include "utils/lsyscache.h"
+
+#include "common.h"
+
+/*
+ * Each serialized item needs to store (in this order):
+ *
+ * - indexes (ndim * sizeof(uint16))
+ * - null flags (ndim * sizeof(bool))
+ * - frequency (sizeof(double))
+ *
+ * So in total:
+ *
+ * ndim * (sizeof(uint16) + sizeof(bool)) + sizeof(double)
+ */
+#define ITEM_SIZE(ndims) \
+ (ndims * (sizeof(uint16) + sizeof(bool)) + sizeof(double))
+
+/* Macros for convenient access to parts of the serialized MCV item */
+#define ITEM_INDEXES(item) ((uint16*)item)
+#define ITEM_NULLS(item,ndims) ((bool*)(ITEM_INDEXES(item) + ndims))
+#define ITEM_FREQUENCY(item,ndims) ((double*)(ITEM_NULLS(item,ndims) + ndims))
+
+static MultiSortSupport build_mss(VacAttrStats **stats, int2vector *attrs);
+
+static SortItem *build_sorted_items(int numrows, HeapTuple *rows,
+ TupleDesc tdesc, MultiSortSupport mss,
+ int2vector *attrs);
+
+static SortItem *build_distinct_groups(int numrows, SortItem *items,
+ MultiSortSupport mss, int *ndistinct);
+
+static int count_distinct_groups(int numrows, SortItem *items,
+ MultiSortSupport mss);
+
+/*
+ * Builds MCV list from the set of sampled rows.
+ *
+ * The algorithm is quite simple:
+ *
+ * (1) sort the data (default collation, '<' for the data type)
+ *
+ * (2) count distinct groups, decide how many to keep
+ *
+ * (3) build the MCV list using the threshold determined in (2)
+ *
+ * (4) remove rows represented by the MCV from the sample
+ *
+ * The method also removes rows matching the MCV items from the input array,
+ * and passes the number of remaining rows (useful for building histograms)
+ * using the numrows_filtered parameter.
+ *
+ * FIXME Use max_mcv_items from ALTER TABLE ADD STATISTICS command.
+ *
+ * FIXME Single-dimensional MCV is sorted by frequency (descending). We should
+ * do that too, because when walking through the list we want to check
+ * the most frequent items first.
+ *
+ * TODO We're using Datum (8B) even for narrower data types (e.g. int4 or
+ * float4). Maybe we could save some space here, but the bytea compression
+ * should handle it just fine.
+ *
+ * TODO This probably should not use the ndistinct directly (as computed
+ * from the sample), but rather an estimate of the number of distinct
+ * values in the whole table, no?
+ */
+MCVList
+build_mv_mcvlist(int numrows, HeapTuple *rows, int2vector *attrs,
+ VacAttrStats **stats, int *numrows_filtered)
+{
+ int i;
+ int numattrs = attrs->dim1;
+ int ndistinct = 0;
+ int mcv_threshold = 0;
+ int nitems = 0;
+
+ MCVList mcvlist = NULL;
+
+ /* comparator for all the columns */
+ MultiSortSupport mss = build_mss(stats, attrs);
+
+ /* sort the rows */
+ SortItem *items = build_sorted_items(numrows, rows, stats[0]->tupDesc,
+ mss, attrs);
+
+ /* transform the sorted rows into groups (sorted by frequency) */
+ SortItem *groups = build_distinct_groups(numrows, items, mss, &ndistinct);
+
+ /*
+ * Determine the minimum size of a group to be eligible for MCV list, and
+ * check how many groups actually pass that threshold. We use 1.25x the
+ * average group size, just like for regular statistics.
+ *
+ * But if we can fit all the distinct values in the MCV list (i.e. if there
+ * are fewer distinct groups than MVSTAT_MCVLIST_MAX_ITEMS), we'll require
+ * only 2 rows per group.
+ *
+ * FIXME This should really reference mcv_max_items (from catalog) instead
+ * of the constant MVSTAT_MCVLIST_MAX_ITEMS.
+ */
+ mcv_threshold = 1.25 * numrows / ndistinct;
+ mcv_threshold = (mcv_threshold < 4) ? 4 : mcv_threshold;
+
+ if (ndistinct <= MVSTAT_MCVLIST_MAX_ITEMS)
+ mcv_threshold = 2;
+
+ /* Walk through the groups and stop once we fall below the threshold. */
+ nitems = 0;
+ for (i = 0; i < ndistinct; i++)
+ {
+ if (groups[i].count < mcv_threshold)
+ break;
+
+ nitems++;
+ }
+
+ /* we know the number of MCV list items, so let's build the list */
+ if (nitems > 0)
+ {
+ /* allocate the MCV list structure, set parameters we know */
+ mcvlist = (MCVList)palloc0(sizeof(MCVListData));
+
+ mcvlist->magic = MVSTAT_MCV_MAGIC;
+ mcvlist->type = MVSTAT_MCV_TYPE_BASIC;
+ mcvlist->ndimensions = numattrs;
+ mcvlist->nitems = nitems;
+
+ /*
+ * Preallocate Datum/isnull arrays (not as a single chunk, as we will
+ * pass the result outside and thus it needs to be easy to pfree()).
+ *
+ * XXX Although we're the only ones dealing with this.
+ */
+ mcvlist->items = (MCVItem*)palloc0(sizeof(MCVItem)*nitems);
+
+ for (i = 0; i < nitems; i++)
+ {
+ mcvlist->items[i] = (MCVItem)palloc0(sizeof(MCVItemData));
+ mcvlist->items[i]->values = (Datum*)palloc0(sizeof(Datum)*numattrs);
+ mcvlist->items[i]->isnull = (bool*)palloc0(sizeof(bool)*numattrs);
+ }
+
+ /* Copy the first chunk of groups into the result. */
+ for (i = 0; i < nitems; i++)
+ {
+ /* just pointer to the proper place in the list */
+ MCVItem item = mcvlist->items[i];
+
+ /* copy the values and null flags from the group */
+ memcpy(item->values, groups[i].values, sizeof(Datum) * numattrs);
+ memcpy(item->isnull, groups[i].isnull, sizeof(bool) * numattrs);
+
+ /* and finally the group frequency */
+ item->frequency = (double)groups[i].count / numrows;
+ }
+
+ /* make sure the loops are consistent */
+ Assert(nitems == mcvlist->nitems);
+
+ /*
+ * Remove the rows matching the MCV list (i.e. keep only rows that are
+ * not represented by the MCV list). We will first sort the groups
+ * by the keys (not by count) and then use binary search.
+ */
+ if (nitems < ndistinct)
+ {
+ int i, j;
+ int nfiltered = 0;
+
+ /* used for the searches */
+ SortItem key;
+
+ /* we'll fill this with data from the rows */
+ key.values = (Datum*)palloc0(numattrs * sizeof(Datum));
+ key.isnull = (bool*)palloc0(numattrs * sizeof(bool));
+
+ /*
+ * Sort the groups for bsearch_arg (but only the items that actually
+ * made it to the MCV list).
+ */
+ qsort_arg((void *) groups, nitems, sizeof(SortItem),
+ multi_sort_compare, mss);
+
+ /* walk through the tuples, compare the values to MCV items */
+ for (i = 0; i < numrows; i++)
+ {
+ /* collect the key values from the row */
+ for (j = 0; j < numattrs; j++)
+ key.values[j]
+ = heap_getattr(rows[i], attrs->values[j],
+ stats[j]->tupDesc, &key.isnull[j]);
+
+ /* if not included in the MCV list, keep it in the array */
+ if (bsearch_arg(&key, groups, nitems, sizeof(SortItem),
+ multi_sort_compare, mss) == NULL)
+ rows[nfiltered++] = rows[i];
+ }
+
+ /* remember how many rows we actually kept */
+ *numrows_filtered = nfiltered;
+
+ /* free all the data used here */
+ pfree(key.values);
+ pfree(key.isnull);
+ }
+ else
+ /* the MCV list covers all the rows */
+ *numrows_filtered = 0;
+ }
+
+ pfree(items);
+ pfree(groups);
+
+ return mcvlist;
+}
+
+/* build MultiSortSupport for the attributes passed in attrs */
+static MultiSortSupport
+build_mss(VacAttrStats **stats, int2vector *attrs)
+{
+ int i;
+ int numattrs = attrs->dim1;
+
+ /* Sort by multiple columns (using array of SortSupport) */
+ MultiSortSupport mss = multi_sort_init(numattrs);
+
+ /* prepare the sort functions for all the attributes */
+ for (i = 0; i < numattrs; i++)
+ multi_sort_add_dimension(mss, i, i, stats);
+
+ return mss;
+}
+
+/* build sorted array of SortItem with values from rows */
+static SortItem *
+build_sorted_items(int numrows, HeapTuple *rows, TupleDesc tdesc,
+ MultiSortSupport mss, int2vector *attrs)
+{
+ int i, j, len;
+ int numattrs = attrs->dim1;
+ int nvalues = numrows * numattrs;
+
+ /*
+ * We won't allocate the arrays for each item independently, but in one large
+ * chunk and then just set the pointers.
+ */
+ SortItem *items;
+ Datum *values;
+ bool *isnull;
+ char *ptr;
+
+ /* Compute the total amount of memory we need (both items and values). */
+ len = numrows * sizeof(SortItem) + nvalues * (sizeof(Datum) + sizeof(bool));
+
+ /* Allocate the memory and split it into the pieces. */
+ ptr = palloc0(len);
+
+ /* items to sort */
+ items = (SortItem*)ptr;
+ ptr += numrows * sizeof(SortItem);
+
+ /* values and null flags */
+ values = (Datum*)ptr;
+ ptr += nvalues * sizeof(Datum);
+
+ isnull = (bool*)ptr;
+ ptr += nvalues * sizeof(bool);
+
+ /* make sure we consumed the whole buffer exactly */
+ Assert((ptr - (char*)items) == len);
+
+ /* fix the pointers to Datum and bool arrays */
+ for (i = 0; i < numrows; i++)
+ {
+ items[i].values = &values[i * numattrs];
+ items[i].isnull = &isnull[i * numattrs];
+
+ /* load the values/null flags from sample rows */
+ for (j = 0; j < numattrs; j++)
+ {
+ items[i].values[j] = heap_getattr(rows[i],
+ attrs->values[j], /* attnum */
+ tdesc,
+ &items[i].isnull[j]); /* isnull */
+ }
+ }
+
+ /* do the sort, using the multi-sort */
+ qsort_arg((void *) items, numrows, sizeof(SortItem),
+ multi_sort_compare, mss);
+
+ return items;
+}
+
+/* count distinct combinations of SortItems in the array */
+static int
+count_distinct_groups(int numrows, SortItem *items, MultiSortSupport mss)
+{
+ int i;
+ int ndistinct;
+
+ ndistinct = 1;
+ for (i = 1; i < numrows; i++)
+ if (multi_sort_compare(&items[i], &items[i-1], mss) != 0)
+ ndistinct += 1;
+
+ return ndistinct;
+}
+
+/* compares frequencies of the SortItem entries (in descending order) */
+static int
+compare_sort_item_count(const void *a, const void *b)
+{
+ SortItem *ia = (SortItem *)a;
+ SortItem *ib = (SortItem *)b;
+
+ if (ia->count == ib->count)
+ return 0;
+ else if (ia->count > ib->count)
+ return -1;
+
+ return 1;
+}
+
+/* builds SortItems for distinct groups and counts the matching items */
+static SortItem *
+build_distinct_groups(int numrows, SortItem *items, MultiSortSupport mss,
+ int *ndistinct)
+{
+ int i, j;
+ int ngroups = count_distinct_groups(numrows, items, mss);
+
+ SortItem *groups = (SortItem*)palloc0(ngroups * sizeof(SortItem));
+
+ j = 0;
+ groups[0] = items[0];
+ groups[0].count = 1;
+
+ for (i = 1; i < numrows; i++)
+ {
+ if (multi_sort_compare(&items[i], &items[i-1], mss) != 0)
+ groups[++j] = items[i];
+
+ groups[j].count++;
+ }
+
+ pg_qsort((void *) groups, ngroups, sizeof(SortItem),
+ compare_sort_item_count);
+
+ *ndistinct = ngroups;
+ return groups;
+}
+
+
+/* fetch the MCV list (as a bytea) from the pg_mv_statistic catalog */
+MCVList
+load_mv_mcvlist(Oid mvoid)
+{
+ bool isnull = false;
+ Datum mcvlist;
+
+#ifdef USE_ASSERT_CHECKING
+ Form_pg_mv_statistic mvstat;
+#endif
+
+ /* Fetch the pg_mv_statistic tuple for the given OID from the syscache. */
+ HeapTuple htup = SearchSysCache1(MVSTATOID, ObjectIdGetDatum(mvoid));
+
+ if (! HeapTupleIsValid(htup))
+ return NULL;
+
+#ifdef USE_ASSERT_CHECKING
+ mvstat = (Form_pg_mv_statistic) GETSTRUCT(htup);
+ Assert(mvstat->mcv_enabled && mvstat->mcv_built);
+#endif
+
+ mcvlist = SysCacheGetAttr(MVSTATOID, htup,
+ Anum_pg_mv_statistic_stamcv, &isnull);
+
+ Assert(!isnull);
+
+ ReleaseSysCache(htup);
+
+ return deserialize_mv_mcvlist(DatumGetByteaP(mcvlist));
+}
+
+/* print some basic info about the MCV list
+ *
+ * TODO Add info about what part of the table this covers.
+ */
+Datum
+pg_mv_stats_mcvlist_info(PG_FUNCTION_ARGS)
+{
+ bytea *data = PG_GETARG_BYTEA_P(0);
+ char *result;
+
+ MCVList mcvlist = deserialize_mv_mcvlist(data);
+
+ result = palloc0(128);
+ snprintf(result, 128, "nitems=%d", mcvlist->nitems);
+
+ pfree(mcvlist);
+
+ PG_RETURN_TEXT_P(cstring_to_text(result));
+}
+
+/*
+ * serialize MCV list into a bytea value
+ *
+ *
+ * The basic algorithm is simple:
+ *
+ * (1) perform deduplication (for each attribute separately)
+ * (a) collect all (non-NULL) attribute values from all MCV items
+ * (b) sort the data (using 'lt' from VacAttrStats)
+ * (c) remove duplicate values from the array
+ *
+ * (2) serialize the arrays into a bytea value
+ *
+ * (3) process all MCV list items
+ * (a) replace values with indexes into the arrays
+ *
+ * Each attribute has to be processed separately, because we may be mixing
+ * different datatypes, with different sort operators, etc.
+ *
+ * We'll use uint16 values for the indexes in step (3), as we don't allow more
+ * than 8k MCV items (see max_mcv_items), although that's a mostly arbitrary
+ * limit - we might increase it to 65k and still fit into uint16.
+ *
+ * We don't really expect the serialization to save as much space as for
+ * histograms, because we are not doing any bucket splits (which is the source
+ * of high redundancy in histograms).
+ *
+ * TODO Consider packing boolean flags (NULL) for each item into a single char
+ * (or a longer type) instead of using an array of bool items.
+ */
+bytea *
+serialize_mv_mcvlist(MCVList mcvlist, int2vector *attrs,
+ VacAttrStats **stats)
+{
+ int i, j;
+ int ndims = mcvlist->ndimensions;
+ int itemsize = ITEM_SIZE(ndims);
+
+ SortSupport ssup;
+ DimensionInfo *info;
+
+ Size total_length;
+
+ /* allocate just once */
+ char *item = palloc0(itemsize);
+
+ /* serialized items (indexes into arrays, etc.) */
+ bytea *output;
+ char *data = NULL;
+
+ /* values per dimension (and number of non-NULL values) */
+ Datum **values = (Datum**)palloc0(sizeof(Datum*) * ndims);
+ int *counts = (int*)palloc0(sizeof(int) * ndims);
+
+ /*
+ * We'll include some rudimentary information about the attributes (type
+ * length, etc.), so that we don't have to look them up while deserializing
+ * the MCV list.
+ */
+ info = (DimensionInfo *)palloc0(sizeof(DimensionInfo)*ndims);
+
+ /* sort support data for all attributes included in the MCV list */
+ ssup = (SortSupport)palloc0(sizeof(SortSupportData)*ndims);
+
+ /* collect and deduplicate values for all attributes */
+ for (i = 0; i < ndims; i++)
+ {
+ int ndistinct;
+ StdAnalyzeData *tmp = (StdAnalyzeData *)stats[i]->extra_data;
+
+ /* copy important info about the data type (length, by-value) */
+ info[i].typlen = stats[i]->attrtype->typlen;
+ info[i].typbyval = stats[i]->attrtype->typbyval;
+
+ /* allocate space for values in the attribute and collect them */
+ values[i] = (Datum*)palloc0(sizeof(Datum) * mcvlist->nitems);
+
+ for (j = 0; j < mcvlist->nitems; j++)
+ {
+ /* skip NULL values - we don't need to serialize them */
+ if (mcvlist->items[j]->isnull[i])
+ continue;
+
+ values[i][counts[i]] = mcvlist->items[j]->values[i];
+ counts[i] += 1;
+ }
+
+ /* there are just NULL values in this dimension, we're done */
+ if (counts[i] == 0)
+ continue;
+
+ /* sort and deduplicate the data */
+ ssup[i].ssup_cxt = CurrentMemoryContext;
+ ssup[i].ssup_collation = DEFAULT_COLLATION_OID;
+ ssup[i].ssup_nulls_first = false;
+
+ PrepareSortSupportFromOrderingOp(tmp->ltopr, &ssup[i]);
+
+ qsort_arg(values[i], counts[i], sizeof(Datum),
+ compare_scalars_simple, &ssup[i]);
+
+ /*
+ * Walk through the array and eliminate duplicate values, but keep the
+ * ordering (so that we can do bsearch later). We know there's at least
+ * one item as (counts[i] != 0), so we can skip the first element.
+ */
+ ndistinct = 1; /* number of distinct values */
+ for (j = 1; j < counts[i]; j++)
+ {
+ /* if the value is the same as the previous one, we can skip it */
+ if (! compare_datums_simple(values[i][j-1], values[i][j], &ssup[i]))
+ continue;
+
+ values[i][ndistinct] = values[i][j];
+ ndistinct += 1;
+ }
+
+ /* we must not exceed UINT16_MAX, as we use uint16 indexes */
+ Assert(ndistinct <= UINT16_MAX);
+
+ /*
+ * Store additional info about the attribute - number of deduplicated
+ * values, and also size of the serialized data. For fixed-length data
+ * types this is trivial to compute, for varwidth types we need to
+ * actually walk the array and sum the sizes.
+ */
+ info[i].nvalues = ndistinct;
+
+ if (info[i].typlen > 0) /* fixed-length data types */
+ info[i].nbytes = info[i].nvalues * info[i].typlen;
+ else if (info[i].typlen == -1) /* varlena */
+ {
+ info[i].nbytes = 0;
+ for (j = 0; j < info[i].nvalues; j++)
+ info[i].nbytes += VARSIZE_ANY(values[i][j]);
+ }
+ else if (info[i].typlen == -2) /* cstring */
+ {
+ info[i].nbytes = 0;
+ for (j = 0; j < info[i].nvalues; j++)
+ info[i].nbytes += strlen(DatumGetPointer(values[i][j]));
+ }
+
+ /* we know (count>0) so there must be some data */
+ Assert(info[i].nbytes > 0);
+ }
+
+ /*
+ * Now we can finally compute how much space we'll actually need for the
+ * serialized MCV list, as it contains these fields:
+ *
+ * - length (4B) for varlena
+ * - magic (4B)
+ * - type (4B)
+ * - ndimensions (4B)
+ * - nitems (4B)
+ * - info (ndim * sizeof(DimensionInfo)
+ * - arrays of values for each dimension
+ * - serialized items (nitems * itemsize)
+ *
+ * So the 'header' size is 20B + ndim * sizeof(DimensionInfo) and then we
+ * will place all the data (values + indexes).
+ */
+ total_length = (sizeof(int32) + offsetof(MCVListData, items)
+ + ndims * sizeof(DimensionInfo)
+ + mcvlist->nitems * itemsize);
+
+ for (i = 0; i < ndims; i++)
+ total_length += info[i].nbytes;
+
+ /* enforce arbitrary limit of 1MB */
+ if (total_length > (1024 * 1024))
+ elog(ERROR, "serialized MCV list exceeds 1MB (%ld)", total_length);
+
+ /* allocate space for the serialized MCV list, set header fields */
+ output = (bytea*)palloc0(total_length);
+ SET_VARSIZE(output, total_length);
+
+ /* 'data' points to the current position in the output buffer */
+ data = VARDATA(output);
+
+ /* MCV list header (number of items, ...) */
+ memcpy(data, mcvlist, offsetof(MCVListData, items));
+ data += offsetof(MCVListData, items);
+
+ /* information about the attributes */
+ memcpy(data, info, sizeof(DimensionInfo) * ndims);
+ data += sizeof(DimensionInfo) * ndims;
+
+ /* now serialize the deduplicated values for all attributes */
+ for (i = 0; i < ndims; i++)
+ {
+#ifdef USE_ASSERT_CHECKING
+ char *tmp = data; /* remember the starting point */
+#endif
+ for (j = 0; j < info[i].nvalues; j++)
+ {
+ Datum v = values[i][j];
+
+ if (info[i].typbyval) /* passed by value */
+ {
+ memcpy(data, &v, info[i].typlen);
+ data += info[i].typlen;
+ }
+ else if (info[i].typlen > 0) /* passed by reference */
+ {
+ memcpy(data, DatumGetPointer(v), info[i].typlen);
+ data += info[i].typlen;
+ }
+ else if (info[i].typlen == -1) /* varlena */
+ {
+ memcpy(data, DatumGetPointer(v), VARSIZE_ANY(v));
+ data += VARSIZE_ANY(v);
+ }
+ else if (info[i].typlen == -2) /* cstring */
+ {
+ memcpy(data, DatumGetPointer(v), strlen(DatumGetPointer(v))+1);
+ data += strlen(DatumGetPointer(v)) + 1; /* terminator */
+ }
+ }
+
+ /* make sure we got exactly the amount of data we expected */
+ Assert((data - tmp) == info[i].nbytes);
+ }
+
+ /* finally serialize the items, with uint16 indexes instead of the values */
+ for (i = 0; i < mcvlist->nitems; i++)
+ {
+ MCVItem mcvitem = mcvlist->items[i];
+
+ /* don't write beyond the allocated space */
+ Assert(data <= (char*)output + total_length - itemsize);
+
+ /* reset the item (we only allocate it once and reuse it) */
+ memset(item, 0, itemsize);
+
+ for (j = 0; j < ndims; j++)
+ {
+ Datum *v = NULL;
+
+ /* do the lookup only for non-NULL values */
+ if (mcvlist->items[i]->isnull[j])
+ continue;
+
+ v = (Datum*)bsearch_arg(&mcvitem->values[j], values[j],
+ info[j].nvalues, sizeof(Datum),
+ compare_scalars_simple, &ssup[j]);
+
+ Assert(v != NULL); /* serialization or deduplication error */
+
+ /* compute index within the array */
+ ITEM_INDEXES(item)[j] = (v - values[j]);
+
+ /* check the index is within expected bounds */
+ Assert(ITEM_INDEXES(item)[j] >= 0);
+ Assert(ITEM_INDEXES(item)[j] < info[j].nvalues);
+ }
+
+ /* copy NULL and frequency flags into the item */
+ memcpy(ITEM_NULLS(item, ndims), mcvitem->isnull, sizeof(bool) * ndims);
+ memcpy(ITEM_FREQUENCY(item, ndims), &mcvitem->frequency, sizeof(double));
+
+ /* copy the serialized item into the array */
+ memcpy(data, item, itemsize);
+
+ data += itemsize;
+ }
+
+ /* at this point we expect to match the total_length exactly */
+ Assert((data - (char*)output) == total_length);
+
+ return output;
+}
+
+/*
+ * deserialize MCV list from the varlena value
+ *
+ *
+ * We deserialize the MCV list fully, because we don't expect there to be a lot
+ * of duplicate values. But perhaps we should keep the MCV list in serialized
+ * form, just like histograms.
+ */
+MCVList deserialize_mv_mcvlist(bytea * data)
+{
+ int i, j;
+ Size expected_size;
+ MCVList mcvlist;
+ char *tmp;
+
+ int ndims, nitems, itemsize;
+ DimensionInfo *info = NULL;
+
+ uint16 *indexes = NULL;
+ Datum **values = NULL;
+
+ /* local allocation buffer (used only for deserialization) */
+ int bufflen;
+ char *buff;
+ char *ptr;
+
+ /* buffer used for the result */
+ int rbufflen;
+ char *rbuff;
+ char *rptr;
+
+ if (data == NULL)
+ return NULL;
+
+ /* we can't deserialize the MCV if there's not even a complete header */
+ expected_size = offsetof(MCVListData,items);
+
+ if (VARSIZE_ANY_EXHDR(data) < expected_size)
+ elog(ERROR, "invalid MCV Size %ld (expected at least %ld)",
+ VARSIZE_ANY_EXHDR(data), offsetof(MCVListData,items));
+
+ /* read the MCV list header */
+ mcvlist = (MCVList)palloc0(sizeof(MCVListData));
+
+ /* initialize pointer to the data part (skip the varlena header) */
+ tmp = VARDATA(data);
+
+ /* get the header and perform further sanity checks */
+ memcpy(mcvlist, tmp, offsetof(MCVListData,items));
+ tmp += offsetof(MCVListData,items);
+
+ if (mcvlist->magic != MVSTAT_MCV_MAGIC)
+ elog(ERROR, "invalid MCV magic %d (expected %dd)",
+ mcvlist->magic, MVSTAT_MCV_MAGIC);
+
+ if (mcvlist->type != MVSTAT_MCV_TYPE_BASIC)
+ elog(ERROR, "invalid MCV type %d (expected %dd)",
+ mcvlist->type, MVSTAT_MCV_TYPE_BASIC);
+
+ nitems = mcvlist->nitems;
+ ndims = mcvlist->ndimensions;
+ itemsize = ITEM_SIZE(ndims);
+
+ Assert((nitems > 0) && (nitems <= MVSTAT_MCVLIST_MAX_ITEMS));
+ Assert((ndims >= 2) && (ndims <= MVSTATS_MAX_DIMENSIONS));
+
+ /*
+ * Check amount of data including DimensionInfo for all dimensions and
+ * also the serialized items (including uint16 indexes). Also, walk
+ * through the dimension information and add it to the sum.
+ */
+ expected_size += ndims * sizeof(DimensionInfo) +
+ (nitems * itemsize);
+
+ /* check that we have at least the DimensionInfo records */
+ if (VARSIZE_ANY_EXHDR(data) < expected_size)
+ elog(ERROR, "invalid MCV size %ld (expected %ld)",
+ VARSIZE_ANY_EXHDR(data), expected_size);
+
+ info = (DimensionInfo*)(tmp);
+ tmp += ndims * sizeof(DimensionInfo);
+
+ /* account for the value arrays */
+ for (i = 0; i < ndims; i++)
+ {
+ Assert(info[i].nvalues >= 0);
+ Assert(info[i].nbytes >= 0);
+
+ expected_size += info[i].nbytes;
+ }
+
+ if (VARSIZE_ANY_EXHDR(data) != expected_size)
+ elog(ERROR, "invalid MCV size %ld (expected %ld)",
+ VARSIZE_ANY_EXHDR(data), expected_size);
+
+ /* looks OK - not corrupted or something */
+
+ /*
+ * Allocate one large chunk of memory for the intermediate data, needed
+ * only for deserializing the MCV list (and allocate densely to minimize
+ * the palloc overhead).
+ *
+ * Let's see how much space we'll actually need, and also include space
+ * for the array with pointers.
+ */
+ bufflen = sizeof(Datum*) * ndims; /* space for pointers */
+
+ for (i = 0; i < ndims; i++)
+ /* for full-size byval types, we reuse the serialized value */
+ if (! (info[i].typbyval && info[i].typlen == sizeof(Datum)))
+ bufflen += (sizeof(Datum) * info[i].nvalues);
+
+ buff = palloc0(bufflen);
+ ptr = buff;
+
+ values = (Datum**)buff;
+ ptr += (sizeof(Datum*) * ndims);
+
+ /*
+ * XXX This uses pointers to the original data array (the types not passed
+ * by value), so when someone frees the memory, e.g. by doing something
+ * like this:
+ *
+ * bytea * data = ... fetch the data from catalog ...
+ * MCVList mcvlist = deserialize_mcv_list(data);
+ * pfree(data);
+ *
+ * then 'mcvlist' references the freed memory. Should copy the pieces.
+ */
+ for (i = 0; i < ndims; i++)
+ {
+ if (info[i].typbyval)
+ {
+ /* passed by value / Datum - simply reuse the array */
+ if (info[i].typlen == sizeof(Datum))
+ {
+ values[i] = (Datum*)tmp;
+ tmp += info[i].nbytes;
+ }
+ else
+ {
+ values[i] = (Datum*)ptr;
+ ptr += (sizeof(Datum) * info[i].nvalues);
+
+ for (j = 0; j < info[i].nvalues; j++)
+ {
+ /* just point into the array */
+ memcpy(&values[i][j], tmp, info[i].typlen);
+ tmp += info[i].typlen;
+ }
+ }
+ }
+ else
+ {
+ /* all the other types need a chunk of the buffer */
+ values[i] = (Datum*)ptr;
+ ptr += (sizeof(Datum) * info[i].nvalues);
+
+ /* passed by reference, but fixed length (name, tid, ...) */
+ if (info[i].typlen > 0)
+ {
+ for (j = 0; j < info[i].nvalues; j++)
+ {
+ /* just point into the array */
+ values[i][j] = PointerGetDatum(tmp);
+ tmp += info[i].typlen;
+ }
+ }
+ else if (info[i].typlen == -1)
+ {
+ /* varlena */
+ for (j = 0; j < info[i].nvalues; j++)
+ {
+ /* just point into the array */
+ values[i][j] = PointerGetDatum(tmp);
+ tmp += VARSIZE_ANY(tmp);
+ }
+ }
+ else if (info[i].typlen == -2)
+ {
+ /* cstring */
+ for (j = 0; j < info[i].nvalues; j++)
+ {
+ /* just point into the array */
+ values[i][j] = PointerGetDatum(tmp);
+ tmp += (strlen(tmp) + 1); /* don't forget the \0 */
+ }
+ }
+ }
+ }
+
+ /* we should have exhausted the buffer exactly */
+ Assert((ptr - buff) == bufflen);
+
+ /* allocate space for all the MCV items in a single piece */
+ rbufflen = (sizeof(MCVItem) + sizeof(MCVItemData) +
+ sizeof(Datum)*ndims + sizeof(bool)*ndims) * nitems;
+
+ rbuff = palloc0(rbufflen);
+ rptr = rbuff;
+
+ mcvlist->items = (MCVItem*)rbuff;
+ rptr += (sizeof(MCVItem) * nitems);
+
+ for (i = 0; i < nitems; i++)
+ {
+ MCVItem item = (MCVItem)rptr;
+ rptr += (sizeof(MCVItemData));
+
+ item->values = (Datum*)rptr;
+ rptr += (sizeof(Datum)*ndims);
+
+ item->isnull = (bool*)rptr;
+ rptr += (sizeof(bool) *ndims);
+
+ /* just point to the right place */
+ indexes = ITEM_INDEXES(tmp);
+
+ memcpy(item->isnull, ITEM_NULLS(tmp, ndims), sizeof(bool) * ndims);
+ memcpy(&item->frequency, ITEM_FREQUENCY(tmp, ndims), sizeof(double));
+
+#ifdef USE_ASSERT_CHECKING
+ for (j = 0; j < ndims; j++)
+ Assert(indexes[j] <= UINT16_MAX);
+#endif
+
+ /* translate the values */
+ for (j = 0; j < ndims; j++)
+ if (! item->isnull[j])
+ item->values[j] = values[j][indexes[j]];
+
+ mcvlist->items[i] = item;
+
+ tmp += ITEM_SIZE(ndims);
+
+ Assert(tmp <= (char*)data + VARSIZE_ANY(data));
+ }
+
+ /* check that we processed all the data */
+ Assert(tmp == (char*)data + VARSIZE_ANY(data));
+
+ /* release the temporary buffer */
+ pfree(buff);
+
+ return mcvlist;
+}
+
+/*
+ * SRF with details about items of a multivariate MCV list:
+ *
+ * - item ID (0...nitems-1)
+ * - values (string array)
+ * - nulls only (boolean array)
+ * - frequency (double precision)
+ *
+ * The input is the OID of the statistics, and no rows are returned if the
+ * statistics contains no MCV list.
+ */
+PG_FUNCTION_INFO_V1(pg_mv_mcv_items);
+
+Datum
+pg_mv_mcv_items(PG_FUNCTION_ARGS)
+{
+ FuncCallContext *funcctx;
+ int call_cntr;
+ int max_calls;
+ TupleDesc tupdesc;
+ AttInMetadata *attinmeta;
+
+ /* stuff done only on the first call of the function */
+ if (SRF_IS_FIRSTCALL())
+ {
+ MemoryContext oldcontext;
+ MCVList mcvlist;
+
+ /* create a function context for cross-call persistence */
+ funcctx = SRF_FIRSTCALL_INIT();
+
+ /* switch to memory context appropriate for multiple function calls */
+ oldcontext = MemoryContextSwitchTo(funcctx->multi_call_memory_ctx);
+
+ mcvlist = load_mv_mcvlist(PG_GETARG_OID(0));
+
+ funcctx->user_fctx = mcvlist;
+
+ /* total number of tuples to be returned */
+ funcctx->max_calls = 0;
+ if (funcctx->user_fctx != NULL)
+ funcctx->max_calls = mcvlist->nitems;
+
+ /* Build a tuple descriptor for our result type */
+ if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("function returning record called in context "
+ "that cannot accept type record")));
+
+ /* build metadata needed later to produce tuples from raw C-strings */
+ attinmeta = TupleDescGetAttInMetadata(tupdesc);
+ funcctx->attinmeta = attinmeta;
+
+ MemoryContextSwitchTo(oldcontext);
+ }
+
+ /* stuff done on every call of the function */
+ funcctx = SRF_PERCALL_SETUP();
+
+ call_cntr = funcctx->call_cntr;
+ max_calls = funcctx->max_calls;
+ attinmeta = funcctx->attinmeta;
+
+ if (call_cntr < max_calls) /* do when there is more left to send */
+ {
+ char **values;
+ HeapTuple tuple;
+ Datum result;
+ int2vector *stakeys;
+ Oid relid;
+
+ char *buff = palloc0(1024);
+ char *format;
+
+ int i;
+
+ Oid *outfuncs;
+ FmgrInfo *fmgrinfo;
+
+ MCVList mcvlist;
+ MCVItem item;
+
+ mcvlist = (MCVList)funcctx->user_fctx;
+
+ Assert(call_cntr < mcvlist->nitems);
+
+ item = mcvlist->items[call_cntr];
+
+ stakeys = find_mv_attnums(PG_GETARG_OID(0), &relid);
+
+ /*
+ * Prepare a values array for building the returned tuple. This should
+ * be an array of C strings which will be processed later by the type
+ * input functions.
+ */
+ values = (char **) palloc(4 * sizeof(char *));
+
+ values[0] = (char *) palloc(64 * sizeof(char));
+
+ /* arrays */
+ values[1] = (char *) palloc0(1024 * sizeof(char));
+ values[2] = (char *) palloc0(1024 * sizeof(char));
+
+ /* frequency */
+ values[3] = (char *) palloc(64 * sizeof(char));
+
+ outfuncs = (Oid*)palloc0(sizeof(Oid) * mcvlist->ndimensions);
+ fmgrinfo = (FmgrInfo*)palloc0(sizeof(FmgrInfo) * mcvlist->ndimensions);
+
+ for (i = 0; i < mcvlist->ndimensions; i++)
+ {
+ bool isvarlena;
+
+ getTypeOutputInfo(get_atttype(relid, stakeys->values[i]),
+ &outfuncs[i], &isvarlena);
+
+ fmgr_info(outfuncs[i], &fmgrinfo[i]);
+ }
+
+ snprintf(values[0], 64, "%d", call_cntr); /* item ID */
+
+ for (i = 0; i < mcvlist->ndimensions; i++)
+ {
+ Datum val, valout;
+
+ format = "%s, %s";
+ if (i == 0)
+ format = "{%s%s";
+ else if (i == mcvlist->ndimensions-1)
+ format = "%s, %s}";
+
+ if (item->isnull[i])
+ valout = CStringGetDatum("NULL");
+ else
+ {
+ val = item->values[i];
+ valout = FunctionCall1(&fmgrinfo[i], val);
+ }
+
+ snprintf(buff, 1024, format, values[1], DatumGetPointer(valout));
+ strncpy(values[1], buff, 1023);
+ buff[0] = '\0';
+
+ snprintf(buff, 1024, format, values[2], item->isnull[i] ? "t" : "f");
+ strncpy(values[2], buff, 1023);
+ buff[0] = '\0';
+ }
+
+ snprintf(values[3], 64, "%f", item->frequency); /* frequency */
+
+ /* build a tuple */
+ tuple = BuildTupleFromCStrings(attinmeta, values);
+
+ /* make the tuple into a datum */
+ result = HeapTupleGetDatum(tuple);
+
+ /* clean up (this is not really necessary) */
+ pfree(values[0]);
+ pfree(values[1]);
+ pfree(values[2]);
+ pfree(values[3]);
+
+ pfree(values);
+
+ SRF_RETURN_NEXT(funcctx, result);
+ }
+ else /* do when there is no more left */
+ {
+ SRF_RETURN_DONE(funcctx);
+ }
+}
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 8ce9c0e..2c22d31 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -2109,8 +2109,9 @@ describeOneTableDetails(const char *schemaname,
{
printfPQExpBuffer(&buf,
"SELECT oid, stanamespace::regnamespace AS nsp, staname, stakeys,\n"
- " deps_enabled,\n"
- " deps_built,\n"
+ " deps_enabled, mcv_enabled,\n"
+ " deps_built, mcv_built,\n"
+ " mcv_max_items,\n"
" (SELECT string_agg(attname::text,', ')\n"
" FROM ((SELECT unnest(stakeys) AS attnum) s\n"
" JOIN pg_attribute a ON (starelid = a.attrelid and a.attnum = s.attnum))) AS attnums\n"
@@ -2128,6 +2129,8 @@ describeOneTableDetails(const char *schemaname,
printTableAddFooter(&cont, _("Statistics:"));
for (i = 0; i < tuples; i++)
{
+ bool first = true;
+
printfPQExpBuffer(&buf, " ");
/* statistics name (qualified with namespace) */
@@ -2137,10 +2140,22 @@ describeOneTableDetails(const char *schemaname,
/* options */
if (!strcmp(PQgetvalue(result, i, 4), "t"))
- appendPQExpBuffer(&buf, "(dependencies)");
+ {
+ appendPQExpBuffer(&buf, "(dependencies");
+ first = false;
+ }
+
+ if (!strcmp(PQgetvalue(result, i, 5), "t"))
+ {
+ if (! first)
+ appendPQExpBuffer(&buf, ", mcv");
+ else
+ appendPQExpBuffer(&buf, "(mcv");
+ first = false;
+ }
- appendPQExpBuffer(&buf, " ON (%s)",
- PQgetvalue(result, i, 6));
+ appendPQExpBuffer(&buf, ") ON (%s)",
+ PQgetvalue(result, i, 9));
printTableAddFooter(&cont, buf.data);
}
diff --git a/src/include/catalog/pg_mv_statistic.h b/src/include/catalog/pg_mv_statistic.h
index c74af47..3529b03 100644
--- a/src/include/catalog/pg_mv_statistic.h
+++ b/src/include/catalog/pg_mv_statistic.h
@@ -38,15 +38,21 @@ CATALOG(pg_mv_statistic,3381)
/* statistics requested to build */
bool deps_enabled; /* analyze dependencies? */
+ bool mcv_enabled; /* build MCV list? */
+
+ /* MCV size */
+ int32 mcv_max_items; /* max MCV items */
/* statistics that are available (if requested) */
bool deps_built; /* dependencies were built */
+ bool mcv_built; /* MCV list was built */
/* variable-length fields start here, but we allow direct access to stakeys */
int2vector stakeys; /* array of column keys */
#ifdef CATALOG_VARLEN
bytea stadeps; /* dependencies (serialized) */
+ bytea stamcv; /* MCV list (serialized) */
#endif
} FormData_pg_mv_statistic;
@@ -62,14 +68,18 @@ typedef FormData_pg_mv_statistic *Form_pg_mv_statistic;
* compiler constants for pg_mv_statistic
* ----------------
*/
-#define Natts_pg_mv_statistic 8
+#define Natts_pg_mv_statistic 12
#define Anum_pg_mv_statistic_starelid 1
#define Anum_pg_mv_statistic_staname 2
#define Anum_pg_mv_statistic_stanamespace 3
#define Anum_pg_mv_statistic_staowner 4
#define Anum_pg_mv_statistic_deps_enabled 5
-#define Anum_pg_mv_statistic_deps_built 6
-#define Anum_pg_mv_statistic_stakeys 7
-#define Anum_pg_mv_statistic_stadeps 8
+#define Anum_pg_mv_statistic_mcv_enabled 6
+#define Anum_pg_mv_statistic_mcv_max_items 7
+#define Anum_pg_mv_statistic_deps_built 8
+#define Anum_pg_mv_statistic_mcv_built 9
+#define Anum_pg_mv_statistic_stakeys 10
+#define Anum_pg_mv_statistic_stadeps 11
+#define Anum_pg_mv_statistic_stamcv 12
#endif /* PG_MV_STATISTIC_H */
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
index cdcbf95..5640dc1 100644
--- a/src/include/catalog/pg_proc.h
+++ b/src/include/catalog/pg_proc.h
@@ -2670,6 +2670,10 @@ DATA(insert OID = 3998 ( pg_mv_stats_dependencies_info PGNSP PGUID 12 1 0 0
DESCR("multivariate stats: functional dependencies info");
DATA(insert OID = 3999 ( pg_mv_stats_dependencies_show PGNSP PGUID 12 1 0 0 0 f f f f t f i s 1 0 25 "17" _null_ _null_ _null_ _null_ _null_ pg_mv_stats_dependencies_show _null_ _null_ _null_ ));
DESCR("multivariate stats: functional dependencies show");
+DATA(insert OID = 3376 ( pg_mv_stats_mcvlist_info PGNSP PGUID 12 1 0 0 0 f f f f t f i s 1 0 25 "17" _null_ _null_ _null_ _null_ _null_ pg_mv_stats_mcvlist_info _null_ _null_ _null_ ));
+DESCR("multi-variate statistics: MCV list info");
+DATA(insert OID = 3373 ( pg_mv_mcv_items PGNSP PGUID 12 1 1000 0 0 f f f f t t i s 1 0 2249 "26" "{26,23,1009,1000,701}" "{i,o,o,o,o}" "{oid,index,values,nulls,frequency}" _null_ _null_ pg_mv_mcv_items _null_ _null_ _null_ ));
+DESCR("details about MCV list items");
DATA(insert OID = 1928 ( pg_stat_get_numscans PGNSP PGUID 12 1 0 0 0 f f f f t f s r 1 0 20 "26" _null_ _null_ _null_ _null_ _null_ pg_stat_get_numscans _null_ _null_ _null_ ));
DESCR("statistics: number of scans done for table/index");
diff --git a/src/include/nodes/relation.h b/src/include/nodes/relation.h
index 75c4752..f52884a 100644
--- a/src/include/nodes/relation.h
+++ b/src/include/nodes/relation.h
@@ -655,9 +655,11 @@ typedef struct MVStatisticInfo
/* enabled statistics */
bool deps_enabled; /* functional dependencies enabled */
+ bool mcv_enabled; /* MCV list enabled */
/* built/available statistics */
bool deps_built; /* functional dependencies built */
+ bool mcv_built; /* MCV list built */
/* columns in the statistics (attnums) */
int2vector *stakeys; /* attnums of the columns covered */
diff --git a/src/include/utils/mvstats.h b/src/include/utils/mvstats.h
index ec55a09..b2643ec 100644
--- a/src/include/utils/mvstats.h
+++ b/src/include/utils/mvstats.h
@@ -17,6 +17,14 @@
#include "fmgr.h"
#include "commands/vacuum.h"
+/*
+ * Degree to which an MCV item matches a clause.
+ * This is then used when computing the selectivity.
+ */
+#define MVSTATS_MATCH_NONE 0 /* no match at all */
+#define MVSTATS_MATCH_PARTIAL 1 /* partial match */
+#define MVSTATS_MATCH_FULL 2 /* full match */
+
#define MVSTATS_MAX_DIMENSIONS 8 /* max number of attributes */
/*
@@ -43,30 +51,89 @@ typedef MVDependenciesData* MVDependencies;
#define MVSTAT_DEPS_TYPE_BASIC 1 /* basic dependencies type */
/*
+ * Multivariate MCV (most-common value) lists
+ *
+ * A straight-forward extension of MCV items - i.e. a list (array) of
+ * combinations of attribute values, together with a frequency and
+ * null flags.
+ */
+typedef struct MCVItemData {
+ double frequency; /* frequency of this combination */
+ bool *isnull; /* flags marking NULL values (ndimensions) */
+ Datum *values; /* variable-length (ndimensions) */
+} MCVItemData;
+
+typedef MCVItemData *MCVItem;
+
+/* multivariate MCV list - essentially an array of MCV items */
+typedef struct MCVListData {
+ uint32 magic; /* magic constant marker */
+ uint32 type; /* type of MCV list (BASIC) */
+ uint32 ndimensions; /* number of dimensions */
+ uint32 nitems; /* number of MCV items in the array */
+ MCVItem *items; /* array of MCV items */
+} MCVListData;
+
+typedef MCVListData *MCVList;
+
+/* used to flag stats serialized to bytea */
+#define MVSTAT_MCV_MAGIC 0xE1A651C2 /* marks serialized bytea */
+#define MVSTAT_MCV_TYPE_BASIC 1 /* basic MCV list type */
+
+/*
+ * Limits used for mcv_max_items option, i.e. we're always guaranteed
+ * to have space for at least MVSTAT_MCVLIST_MIN_ITEMS, and we cannot
+ * have more than MVSTAT_MCVLIST_MAX_ITEMS items.
+ *
+ * This is just a boundary for the 'max' threshold - the actual list
+ * may of course contain fewer items than MVSTAT_MCVLIST_MIN_ITEMS.
+ */
+#define MVSTAT_MCVLIST_MIN_ITEMS 128 /* min items in MCV list */
+#define MVSTAT_MCVLIST_MAX_ITEMS 8192 /* max items in MCV list */
+
+/*
* TODO Maybe fetching the histogram/MCV list separately is inefficient?
* Consider adding a single `fetch_stats` method, fetching all
* stats specified using flags (or something like that).
*/
MVDependencies load_mv_dependencies(Oid mvoid);
+MCVList load_mv_mcvlist(Oid mvoid);
bytea * serialize_mv_dependencies(MVDependencies dependencies);
+bytea * serialize_mv_mcvlist(MCVList mcvlist, int2vector *attrs,
+ VacAttrStats **stats);
/* deserialization of stats (serialization is private to analyze) */
MVDependencies deserialize_mv_dependencies(bytea * data);
+MCVList deserialize_mv_mcvlist(bytea * data);
+
+/*
+ * Returns index of the attribute number within the vector (i.e. a
+ * dimension within the stats).
+ */
+int mv_get_index(AttrNumber varattno, int2vector * stakeys);
+
+int2vector* find_mv_attnums(Oid mvoid, Oid *relid);
/* FIXME this probably belongs somewhere else (not to operations stats) */
extern Datum pg_mv_stats_dependencies_info(PG_FUNCTION_ARGS);
extern Datum pg_mv_stats_dependencies_show(PG_FUNCTION_ARGS);
+extern Datum pg_mv_stats_mcvlist_info(PG_FUNCTION_ARGS);
+extern Datum pg_mv_mcv_items(PG_FUNCTION_ARGS);
MVDependencies
-build_mv_dependencies(int numrows, HeapTuple *rows,
- int2vector *attrs,
- VacAttrStats **stats);
+build_mv_dependencies(int numrows, HeapTuple *rows, int2vector *attrs,
+ VacAttrStats **stats);
+
+MCVList
+build_mv_mcvlist(int numrows, HeapTuple *rows, int2vector *attrs,
+ VacAttrStats **stats, int *numrows_filtered);
void build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
- int natts, VacAttrStats **vacattrstats);
+ int natts, VacAttrStats **vacattrstats);
-void update_mv_stats(Oid relid, MVDependencies dependencies, int2vector *attrs);
+void update_mv_stats(Oid relid, MVDependencies dependencies, MCVList mcvlist,
+ int2vector *attrs, VacAttrStats **stats);
#endif
diff --git a/src/test/regress/expected/mv_mcv.out b/src/test/regress/expected/mv_mcv.out
new file mode 100644
index 0000000..075320b
--- /dev/null
+++ b/src/test/regress/expected/mv_mcv.out
@@ -0,0 +1,207 @@
+-- data type passed by value
+CREATE TABLE mcv_list (
+ a INT,
+ b INT,
+ c INT
+);
+-- unknown column
+CREATE STATISTICS s4 ON mcv_list (unknown_column) WITH (mcv);
+ERROR: column "unknown_column" referenced in statistics does not exist
+-- single column
+CREATE STATISTICS s4 ON mcv_list (a) WITH (mcv);
+ERROR: multivariate stats require 2 or more columns
+-- single column, duplicated
+CREATE STATISTICS s4 ON mcv_list (a, a) WITH (mcv);
+ERROR: duplicate column name in statistics definition
+-- two columns, one duplicated
+CREATE STATISTICS s4 ON mcv_list (a, a, b) WITH (mcv);
+ERROR: duplicate column name in statistics definition
+-- unknown option
+CREATE STATISTICS s4 ON mcv_list (a, b, c) WITH (unknown_option);
+ERROR: unrecognized STATISTICS option "unknown_option"
+-- missing MCV statistics
+CREATE STATISTICS s4 ON mcv_list (a, b, c) WITH (dependencies, max_mcv_items=200);
+ERROR: option 'mcv' is required by other options(s)
+-- invalid mcv_max_items value / too low
+CREATE STATISTICS s4 ON mcv_list (a, b, c) WITH (mcv, max_mcv_items=10);
+ERROR: max number of MCV items must be at least 128
+-- invalid mcv_max_items value / too high
+CREATE STATISTICS s4 ON mcv_list (a, b, c) WITH (mcv, max_mcv_items=10000);
+ERROR: max number of MCV items is 8192
+-- correct command
+CREATE STATISTICS s4 ON mcv_list (a, b, c) WITH (mcv);
+-- random data
+INSERT INTO mcv_list
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+ANALYZE mcv_list;
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+ mcv_enabled | mcv_built | pg_mv_stats_mcvlist_info
+-------------+-----------+--------------------------
+ t | f |
+(1 row)
+
+TRUNCATE mcv_list;
+-- a => b, a => c, b => c
+INSERT INTO mcv_list
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE mcv_list;
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+ mcv_enabled | mcv_built | pg_mv_stats_mcvlist_info
+-------------+-----------+--------------------------
+ t | t | nitems=1000
+(1 row)
+
+TRUNCATE mcv_list;
+-- a => b, a => c
+INSERT INTO mcv_list
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE mcv_list;
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+ mcv_enabled | mcv_built | pg_mv_stats_mcvlist_info
+-------------+-----------+--------------------------
+ t | t | nitems=1000
+(1 row)
+
+TRUNCATE mcv_list;
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO mcv_list
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX mcv_idx ON mcv_list (a, b);
+ANALYZE mcv_list;
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+ mcv_enabled | mcv_built | pg_mv_stats_mcvlist_info
+-------------+-----------+--------------------------
+ t | t | nitems=100
+(1 row)
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mcv_list WHERE a = 10 AND b = 5;
+ QUERY PLAN
+--------------------------------------------
+ Bitmap Heap Scan on mcv_list
+ Recheck Cond: ((a = 10) AND (b = 5))
+ -> Bitmap Index Scan on mcv_idx
+ Index Cond: ((a = 10) AND (b = 5))
+(4 rows)
+
+DROP TABLE mcv_list;
+-- varlena type (text)
+CREATE TABLE mcv_list (
+ a TEXT,
+ b TEXT,
+ c TEXT
+);
+CREATE STATISTICS s5 ON mcv_list (a, b, c) WITH (mcv);
+-- random data
+INSERT INTO mcv_list
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+ANALYZE mcv_list;
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+ mcv_enabled | mcv_built | pg_mv_stats_mcvlist_info
+-------------+-----------+--------------------------
+ t | f |
+(1 row)
+
+TRUNCATE mcv_list;
+-- a => b, a => c, b => c
+INSERT INTO mcv_list
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE mcv_list;
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+ mcv_enabled | mcv_built | pg_mv_stats_mcvlist_info
+-------------+-----------+--------------------------
+ t | t | nitems=1000
+(1 row)
+
+TRUNCATE mcv_list;
+-- a => b, a => c
+INSERT INTO mcv_list
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE mcv_list;
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+ mcv_enabled | mcv_built | pg_mv_stats_mcvlist_info
+-------------+-----------+--------------------------
+ t | t | nitems=1000
+(1 row)
+
+TRUNCATE mcv_list;
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO mcv_list
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX mcv_idx ON mcv_list (a, b);
+ANALYZE mcv_list;
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+ mcv_enabled | mcv_built | pg_mv_stats_mcvlist_info
+-------------+-----------+--------------------------
+ t | t | nitems=100
+(1 row)
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mcv_list WHERE a = '10' AND b = '5';
+ QUERY PLAN
+------------------------------------------------------------
+ Bitmap Heap Scan on mcv_list
+ Recheck Cond: ((a = '10'::text) AND (b = '5'::text))
+ -> Bitmap Index Scan on mcv_idx
+ Index Cond: ((a = '10'::text) AND (b = '5'::text))
+(4 rows)
+
+TRUNCATE mcv_list;
+-- check explain (expect bitmap index scan, not plain index scan) with NULLs
+INSERT INTO mcv_list
+ SELECT
+ (CASE WHEN i/10000 = 0 THEN NULL ELSE i/10000 END),
+ (CASE WHEN i/20000 = 0 THEN NULL ELSE i/20000 END),
+ (CASE WHEN i/40000 = 0 THEN NULL ELSE i/40000 END)
+ FROM generate_series(1,1000000) s(i);
+ANALYZE mcv_list;
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+ mcv_enabled | mcv_built | pg_mv_stats_mcvlist_info
+-------------+-----------+--------------------------
+ t | t | nitems=100
+(1 row)
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mcv_list WHERE a IS NULL AND b IS NULL;
+ QUERY PLAN
+---------------------------------------------------
+ Bitmap Heap Scan on mcv_list
+ Recheck Cond: ((a IS NULL) AND (b IS NULL))
+ -> Bitmap Index Scan on mcv_idx
+ Index Cond: ((a IS NULL) AND (b IS NULL))
+(4 rows)
+
+DROP TABLE mcv_list;
+-- NULL values (mix of int and text columns)
+CREATE TABLE mcv_list (
+ a INT,
+ b TEXT,
+ c INT,
+ d TEXT
+);
+CREATE STATISTICS s6 ON mcv_list (a, b, c, d) WITH (mcv);
+INSERT INTO mcv_list
+ SELECT
+ mod(i, 100),
+ (CASE WHEN mod(i, 200) = 0 THEN NULL ELSE mod(i,200) END),
+ mod(i, 400),
+ (CASE WHEN mod(i, 300) = 0 THEN NULL ELSE mod(i,600) END)
+ FROM generate_series(1,10000) s(i);
+ANALYZE mcv_list;
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+ mcv_enabled | mcv_built | pg_mv_stats_mcvlist_info
+-------------+-----------+--------------------------
+ t | t | nitems=1200
+(1 row)
+
+DROP TABLE mcv_list;
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 06f2231..3d55ffe 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1373,7 +1373,9 @@ pg_mv_stats| SELECT n.nspname AS schemaname,
s.staname,
s.stakeys AS attnums,
length(s.stadeps) AS depsbytes,
- pg_mv_stats_dependencies_info(s.stadeps) AS depsinfo
+ pg_mv_stats_dependencies_info(s.stadeps) AS depsinfo,
+ length(s.stamcv) AS mcvbytes,
+ pg_mv_stats_mcvlist_info(s.stamcv) AS mcvinfo
FROM ((pg_mv_statistic s
JOIN pg_class c ON ((c.oid = s.starelid)))
LEFT JOIN pg_namespace n ON ((n.oid = c.relnamespace)));
diff --git a/src/test/regress/parallel_schedule b/src/test/regress/parallel_schedule
index 4f2ffb8..85d94f1 100644
--- a/src/test/regress/parallel_schedule
+++ b/src/test/regress/parallel_schedule
@@ -112,4 +112,4 @@ test: event_trigger
test: stats
# run tests of multivariate stats
-test: mv_dependencies
+test: mv_dependencies mv_mcv
diff --git a/src/test/regress/serial_schedule b/src/test/regress/serial_schedule
index 097a04f..6584d73 100644
--- a/src/test/regress/serial_schedule
+++ b/src/test/regress/serial_schedule
@@ -163,3 +163,4 @@ test: xml
test: event_trigger
test: stats
test: mv_dependencies
+test: mv_mcv
diff --git a/src/test/regress/sql/mv_mcv.sql b/src/test/regress/sql/mv_mcv.sql
new file mode 100644
index 0000000..b31d32d
--- /dev/null
+++ b/src/test/regress/sql/mv_mcv.sql
@@ -0,0 +1,178 @@
+-- data type passed by value
+CREATE TABLE mcv_list (
+ a INT,
+ b INT,
+ c INT
+);
+
+-- unknown column
+CREATE STATISTICS s4 ON mcv_list (unknown_column) WITH (mcv);
+
+-- single column
+CREATE STATISTICS s4 ON mcv_list (a) WITH (mcv);
+
+-- single column, duplicated
+CREATE STATISTICS s4 ON mcv_list (a, a) WITH (mcv);
+
+-- two columns, one duplicated
+CREATE STATISTICS s4 ON mcv_list (a, a, b) WITH (mcv);
+
+-- unknown option
+CREATE STATISTICS s4 ON mcv_list (a, b, c) WITH (unknown_option);
+
+-- missing MCV statistics
+CREATE STATISTICS s4 ON mcv_list (a, b, c) WITH (dependencies, max_mcv_items=200);
+
+-- invalid max_mcv_items value / too low
+CREATE STATISTICS s4 ON mcv_list (a, b, c) WITH (mcv, max_mcv_items=10);
+
+-- invalid max_mcv_items value / too high
+CREATE STATISTICS s4 ON mcv_list (a, b, c) WITH (mcv, max_mcv_items=10000);
+
+-- correct command
+CREATE STATISTICS s4 ON mcv_list (a, b, c) WITH (mcv);
+
+-- random data
+INSERT INTO mcv_list
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+
+ANALYZE mcv_list;
+
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+
+TRUNCATE mcv_list;
+
+-- a => b, a => c, b => c
+INSERT INTO mcv_list
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+
+ANALYZE mcv_list;
+
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+
+TRUNCATE mcv_list;
+
+-- a => b, a => c
+INSERT INTO mcv_list
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+
+ANALYZE mcv_list;
+
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+
+TRUNCATE mcv_list;
+
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO mcv_list
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX mcv_idx ON mcv_list (a, b);
+
+ANALYZE mcv_list;
+
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mcv_list WHERE a = 10 AND b = 5;
+
+DROP TABLE mcv_list;
+
+-- varlena type (text)
+CREATE TABLE mcv_list (
+ a TEXT,
+ b TEXT,
+ c TEXT
+);
+
+CREATE STATISTICS s5 ON mcv_list (a, b, c) WITH (mcv);
+
+-- random data
+INSERT INTO mcv_list
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+
+ANALYZE mcv_list;
+
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+
+TRUNCATE mcv_list;
+
+-- a => b, a => c, b => c
+INSERT INTO mcv_list
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+
+ANALYZE mcv_list;
+
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+
+TRUNCATE mcv_list;
+
+-- a => b, a => c
+INSERT INTO mcv_list
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE mcv_list;
+
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+
+TRUNCATE mcv_list;
+
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO mcv_list
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX mcv_idx ON mcv_list (a, b);
+ANALYZE mcv_list;
+
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mcv_list WHERE a = '10' AND b = '5';
+
+TRUNCATE mcv_list;
+
+-- check explain (expect bitmap index scan, not plain index scan) with NULLs
+INSERT INTO mcv_list
+ SELECT
+ (CASE WHEN i/10000 = 0 THEN NULL ELSE i/10000 END),
+ (CASE WHEN i/20000 = 0 THEN NULL ELSE i/20000 END),
+ (CASE WHEN i/40000 = 0 THEN NULL ELSE i/40000 END)
+ FROM generate_series(1,1000000) s(i);
+ANALYZE mcv_list;
+
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mcv_list WHERE a IS NULL AND b IS NULL;
+
+DROP TABLE mcv_list;
+
+-- NULL values (mix of int and text columns)
+CREATE TABLE mcv_list (
+ a INT,
+ b TEXT,
+ c INT,
+ d TEXT
+);
+
+CREATE STATISTICS s6 ON mcv_list (a, b, c, d) WITH (mcv);
+
+INSERT INTO mcv_list
+ SELECT
+ mod(i, 100),
+ (CASE WHEN mod(i, 200) = 0 THEN NULL ELSE mod(i,200) END),
+ mod(i, 400),
+ (CASE WHEN mod(i, 300) = 0 THEN NULL ELSE mod(i,600) END)
+ FROM generate_series(1,10000) s(i);
+
+ANALYZE mcv_list;
+
+SELECT mcv_enabled, mcv_built, pg_mv_stats_mcvlist_info(stamcv)
+ FROM pg_mv_statistic WHERE starelid = 'mcv_list'::regclass;
+
+DROP TABLE mcv_list;
--
2.5.0
Attachment: 0005-multivariate-histograms.patch (text/x-patch)
From eb184e590bc0d1e41b9bf69d7cdc6d09e28daac3 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tv@fuzzy.cz>
Date: Sun, 11 Jan 2015 20:18:24 +0100
Subject: [PATCH 5/9] multivariate histograms
- extends the pg_mv_statistic catalog (add 'hist' fields)
- building the histograms during ANALYZE
- simple estimation while planning the queries
Includes regression tests mostly equal to those for functional
dependencies / MCV lists.
---
doc/src/sgml/ref/create_statistics.sgml | 44 +
src/backend/catalog/system_views.sql | 4 +-
src/backend/commands/statscmds.c | 44 +-
src/backend/nodes/outfuncs.c | 2 +
src/backend/optimizer/path/clausesel.c | 584 +++++++-
src/backend/optimizer/util/plancat.c | 4 +-
src/backend/utils/mvstats/Makefile | 2 +-
src/backend/utils/mvstats/README.histogram | 299 ++++
src/backend/utils/mvstats/README.stats | 2 +
src/backend/utils/mvstats/common.c | 37 +-
src/backend/utils/mvstats/histogram.c | 2023 ++++++++++++++++++++++++++++
src/bin/psql/describe.c | 17 +-
src/include/catalog/pg_mv_statistic.h | 24 +-
src/include/catalog/pg_proc.h | 4 +
src/include/nodes/relation.h | 2 +
src/include/utils/mvstats.h | 138 +-
src/test/regress/expected/mv_histogram.out | 207 +++
src/test/regress/expected/rules.out | 4 +-
src/test/regress/parallel_schedule | 2 +-
src/test/regress/serial_schedule | 1 +
src/test/regress/sql/mv_histogram.sql | 176 +++
21 files changed, 3576 insertions(+), 44 deletions(-)
create mode 100644 src/backend/utils/mvstats/README.histogram
create mode 100644 src/backend/utils/mvstats/histogram.c
create mode 100644 src/test/regress/expected/mv_histogram.out
create mode 100644 src/test/regress/sql/mv_histogram.sql
diff --git a/doc/src/sgml/ref/create_statistics.sgml b/doc/src/sgml/ref/create_statistics.sgml
index d6973e8..f7336fd 100644
--- a/doc/src/sgml/ref/create_statistics.sgml
+++ b/doc/src/sgml/ref/create_statistics.sgml
@@ -133,6 +133,24 @@ CREATE STATISTICS [ IF NOT EXISTS ] <replaceable class="PARAMETER">statistics_na
</varlistentry>
<varlistentry>
+ <term><literal>histogram</> (<type>boolean</>)</term>
+ <listitem>
+ <para>
+ Enables histogram for the statistics.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><literal>max_buckets</> (<type>integer</>)</term>
+ <listitem>
+ <para>
+ Maximum number of histogram buckets.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
<term><literal>max_mcv_items</> (<type>integer</>)</term>
<listitem>
<para>
@@ -220,6 +238,32 @@ EXPLAIN ANALYZE SELECT * FROM t2 WHERE (a = 1) AND (b = 2);
</programlisting>
</para>
+ <para>
+ Create table <structname>t3</> with two strongly correlated columns, and
+ a histogram on those two columns:
+
+<programlisting>
+CREATE TABLE t3 (
+ a float,
+ b float
+);
+
+INSERT INTO t3 SELECT mod(i,1000), mod(i,1000) + 50 * (r - 0.5) FROM (
+ SELECT i, random() r FROM generate_series(1,1000000) s(i)
+ ) foo;
+
+CREATE STATISTICS s3 ON t3 (a, b) WITH (histogram);
+
+ANALYZE t3;
+
+-- small overlap
+EXPLAIN ANALYZE SELECT * FROM t3 WHERE (a < 500) AND (b > 500);
+
+-- no overlap
+EXPLAIN ANALYZE SELECT * FROM t3 WHERE (a < 400) AND (b > 600);
+</programlisting>
+ </para>
+
</refsect1>
<refsect1>
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 5c40334..b151db1 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -167,7 +167,9 @@ CREATE VIEW pg_mv_stats AS
length(S.stadeps) as depsbytes,
pg_mv_stats_dependencies_info(S.stadeps) as depsinfo,
length(S.stamcv) AS mcvbytes,
- pg_mv_stats_mcvlist_info(S.stamcv) AS mcvinfo
+ pg_mv_stats_mcvlist_info(S.stamcv) AS mcvinfo,
+ length(S.stahist) AS histbytes,
+ pg_mv_stats_histogram_info(S.stahist) AS histinfo
FROM (pg_mv_statistic S JOIN pg_class C ON (C.oid = S.starelid))
LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace);
diff --git a/src/backend/commands/statscmds.c b/src/backend/commands/statscmds.c
index c480fbe..e0b085f 100644
--- a/src/backend/commands/statscmds.c
+++ b/src/backend/commands/statscmds.c
@@ -71,12 +71,15 @@ CreateStatistics(CreateStatsStmt *stmt)
/* by default build nothing */
bool build_dependencies = false,
- build_mcv = false;
+ build_mcv = false,
+ build_histogram = false;
- int32 max_mcv_items = -1;
+ int32 max_buckets = -1,
+ max_mcv_items = -1;
/* options required because of other options */
- bool require_mcv = false;
+ bool require_mcv = false,
+ require_histogram = false;
Assert(IsA(stmt, CreateStatsStmt));
@@ -175,6 +178,29 @@ CreateStatistics(CreateStatsStmt *stmt)
MVSTAT_MCVLIST_MAX_ITEMS)));
}
+ else if (strcmp(opt->defname, "histogram") == 0)
+ build_histogram = defGetBoolean(opt);
+ else if (strcmp(opt->defname, "max_buckets") == 0)
+ {
+ max_buckets = defGetInt32(opt);
+
+ /* this option requires 'histogram' to be enabled */
+ require_histogram = true;
+
+ /* sanity check */
+ if (max_buckets < MVSTAT_HIST_MIN_BUCKETS)
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("minimum number of buckets is %d",
+ MVSTAT_HIST_MIN_BUCKETS)));
+
+ else if (max_buckets > MVSTAT_HIST_MAX_BUCKETS)
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("maximum number of buckets is %d",
+ MVSTAT_HIST_MAX_BUCKETS)));
+
+ }
else
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
@@ -183,10 +209,10 @@ CreateStatistics(CreateStatsStmt *stmt)
}
/* check that at least some statistics were requested */
- if (! (build_dependencies || build_mcv))
+ if (! (build_dependencies || build_mcv || build_histogram))
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
- errmsg("no statistics type (dependencies, mcv) was requested")));
+ errmsg("no statistics type (dependencies, mcv, histogram) was requested")));
/* now do some checking of the options */
if (require_mcv && (! build_mcv))
@@ -194,6 +220,11 @@ CreateStatistics(CreateStatsStmt *stmt)
(errcode(ERRCODE_SYNTAX_ERROR),
errmsg("option 'mcv' is required by other options(s)")));
+ if (require_histogram && (! build_histogram))
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("option 'histogram' is required by other options(s)")));
+
/* sort the attnums and build int2vector */
qsort(attnums, numcols, sizeof(int16), compare_int16);
stakeys = buildint2vector(attnums, numcols);
@@ -214,11 +245,14 @@ CreateStatistics(CreateStatsStmt *stmt)
values[Anum_pg_mv_statistic_deps_enabled -1] = BoolGetDatum(build_dependencies);
values[Anum_pg_mv_statistic_mcv_enabled -1] = BoolGetDatum(build_mcv);
+ values[Anum_pg_mv_statistic_hist_enabled -1] = BoolGetDatum(build_histogram);
values[Anum_pg_mv_statistic_mcv_max_items -1] = Int32GetDatum(max_mcv_items);
+ values[Anum_pg_mv_statistic_hist_max_buckets -1] = Int32GetDatum(max_buckets);
nulls[Anum_pg_mv_statistic_stadeps -1] = true;
nulls[Anum_pg_mv_statistic_stamcv -1] = true;
+ nulls[Anum_pg_mv_statistic_stahist -1] = true;
/* insert the tuple into pg_mv_statistic */
mvstatrel = heap_open(MvStatisticRelationId, RowExclusiveLock);
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index 333e24b..9172f21 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -2163,10 +2163,12 @@ _outMVStatisticInfo(StringInfo str, const MVStatisticInfo *node)
/* enabled statistics */
WRITE_BOOL_FIELD(deps_enabled);
WRITE_BOOL_FIELD(mcv_enabled);
+ WRITE_BOOL_FIELD(hist_enabled);
/* built/available statistics */
WRITE_BOOL_FIELD(deps_built);
WRITE_BOOL_FIELD(mcv_built);
+ WRITE_BOOL_FIELD(hist_built);
}
static void
diff --git a/src/backend/optimizer/path/clausesel.c b/src/backend/optimizer/path/clausesel.c
index c16d559..fe96a73 100644
--- a/src/backend/optimizer/path/clausesel.c
+++ b/src/backend/optimizer/path/clausesel.c
@@ -49,6 +49,7 @@ static void addRangeClause(RangeQueryClause **rqlist, Node *clause,
#define MV_CLAUSE_TYPE_FDEP 0x01
#define MV_CLAUSE_TYPE_MCV 0x02
+#define MV_CLAUSE_TYPE_HIST 0x04
static bool clause_is_mv_compatible(Node *clause, Index relid, Bitmapset **attnums,
int type);
@@ -74,6 +75,8 @@ static Selectivity clauselist_mv_selectivity(PlannerInfo *root,
static Selectivity clauselist_mv_selectivity_mcvlist(PlannerInfo *root,
List *clauses, MVStatisticInfo *mvstats,
bool *fullmatch, Selectivity *lowsel);
+static Selectivity clauselist_mv_selectivity_histogram(PlannerInfo *root,
+ List *clauses, MVStatisticInfo *mvstats);
static int update_match_bitmap_mcvlist(PlannerInfo *root, List *clauses,
int2vector *stakeys, MCVList mcvlist,
@@ -81,6 +84,12 @@ static int update_match_bitmap_mcvlist(PlannerInfo *root, List *clauses,
Selectivity *lowsel, bool *fullmatch,
bool is_or);
+static int update_match_bitmap_histogram(PlannerInfo *root, List *clauses,
+ int2vector *stakeys,
+ MVSerializedHistogram mvhist,
+ int nmatches, char * matches,
+ bool is_or);
+
static bool has_stats(List *stats, int type);
static List * find_stats(PlannerInfo *root, Index relid);
@@ -95,6 +104,7 @@ static bool stats_type_matches(MVStatisticInfo *stat, int type);
#define UPDATE_RESULT(m,r,isor) \
(m) = (isor) ? (MAX(m,r)) : (MIN(m,r))
+
/****************************************************************************
* ROUTINES TO COMPUTE SELECTIVITIES
****************************************************************************/
@@ -123,7 +133,7 @@ static bool stats_type_matches(MVStatisticInfo *stat, int type);
*
* First we try to reduce the list of clauses by applying (soft) functional
* dependencies, and then we try to estimate the selectivity of the reduced
- * list of clauses using the multivariate MCV list.
+ * list of clauses using the multivariate MCV list and histograms.
*
* Finally we remove the portion of clauses estimated using multivariate stats,
* and process the rest of the clauses using the regular per-column stats.
@@ -216,11 +226,13 @@ clauselist_selectivity(PlannerInfo *root,
* with the multivariate code and simply skip to estimation using the
* regular per-column stats.
*/
- if (has_stats(stats, MV_CLAUSE_TYPE_MCV) &&
- (count_mv_attnums(clauses, relid, MV_CLAUSE_TYPE_MCV) >= 2))
+ if (has_stats(stats, MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST) &&
+ (count_mv_attnums(clauses, relid,
+ MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST) >= 2))
{
/* collect attributes from the compatible conditions */
- Bitmapset *mvattnums = collect_mv_attnums(clauses, relid, MV_CLAUSE_TYPE_MCV);
+ Bitmapset *mvattnums = collect_mv_attnums(clauses, relid,
+ MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST);
/* and search for the statistic covering the most attributes */
MVStatisticInfo *mvstat = choose_mv_statistics(stats, mvattnums);
@@ -232,7 +244,7 @@ clauselist_selectivity(PlannerInfo *root,
/* split the clauselist into regular and mv-clauses */
clauses = clauselist_mv_split(root, relid, clauses, &mvclauses,
- mvstat, MV_CLAUSE_TYPE_MCV);
+ mvstat, MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST);
/* we've chosen the histogram to match the clauses */
Assert(mvclauses != NIL);
@@ -944,6 +956,7 @@ static Selectivity
clauselist_mv_selectivity(PlannerInfo *root, List *clauses, MVStatisticInfo *mvstats)
{
bool fullmatch = false;
+ Selectivity s1 = 0.0, s2 = 0.0;
/*
* Lowest frequency in the MCV list (may be used as an upper bound
@@ -957,9 +970,24 @@ clauselist_mv_selectivity(PlannerInfo *root, List *clauses, MVStatisticInfo *mvs
* MCV/histogram evaluation).
*/
- /* Evaluate the MCV selectivity */
- return clauselist_mv_selectivity_mcvlist(root, clauses, mvstats,
+ /* Evaluate the MCV first. */
+ s1 = clauselist_mv_selectivity_mcvlist(root, clauses, mvstats,
&fullmatch, &mcv_low);
+
+ /*
+ * If we got a full equality match on the MCV list, we're done (and
+ * the estimate is pretty good).
+ */
+ if (fullmatch && (s1 > 0.0))
+ return s1;
+
+ /* TODO if (fullmatch) without matching MCV item, use the mcv_low
+ * selectivity as upper bound */
+
+ s2 = clauselist_mv_selectivity_histogram(root, clauses, mvstats);
+
+ /* TODO clamp to <= 1.0 (or more strictly, when possible) */
+ return s1 + s2;
}
/*
@@ -1039,7 +1067,7 @@ count_varnos(List *clauses, Index *relid)
return cnt;
}
-
+
/*
* We're looking for statistics matching at least 2 attributes, referenced in
* clauses compatible with multivariate statistics. The current selection
@@ -1129,7 +1157,7 @@ choose_mv_statistics(List *stats, Bitmapset *attnums)
int numattrs = attrs->dim1;
/* skip dependencies-only stats */
- if (! info->mcv_built)
+ if (! (info->mcv_built || info->hist_built))
continue;
/* count columns covered by the histogram */
@@ -1251,7 +1279,7 @@ mv_compatible_walker(Node *node, mv_compatible_context *context)
}
if (or_clause(node) || and_clause(node) || not_clause(node))
- {
+ {
/*
* AND/OR/NOT-clauses are supported if all sub-clauses are supported
*
@@ -1277,10 +1305,10 @@ mv_compatible_walker(Node *node, mv_compatible_context *context)
}
return false;
- }
+ }
if (IsA(node, NullTest))
- {
+ {
NullTest* nt = (NullTest*)node;
/*
@@ -1360,9 +1388,9 @@ mv_compatible_walker(Node *node, mv_compatible_context *context)
case F_SCALARGTSEL:
/* not compatible with functional dependencies */
- if (! (context->types & MV_CLAUSE_TYPE_MCV))
+ if (! (context->types & (MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST)))
return true; /* terminate */
-
+
break;
default:
@@ -1588,6 +1616,9 @@ stats_type_matches(MVStatisticInfo *stat, int type)
if ((type & MV_CLAUSE_TYPE_MCV) && stat->mcv_built)
return true;
+ if ((type & MV_CLAUSE_TYPE_HIST) && stat->hist_built)
+ return true;
+
return false;
}
@@ -1606,6 +1637,9 @@ has_stats(List *stats, int type)
/* terminate if we've found at least one matching statistics */
if (stats_type_matches(stat, type))
return true;
+
+ if ((type & MV_CLAUSE_TYPE_HIST) && stat->hist_built)
+ return true;
}
return false;
@@ -2010,3 +2044,525 @@ update_match_bitmap_mcvlist(PlannerInfo *root, List *clauses,
return nmatches;
}
+
+/*
+ * Estimate selectivity of clauses using a histogram.
+ *
+ * If there's no histogram for the stats, the function returns 0.0.
+ *
+ * The general idea of this method is similar to how MCV lists are
+ * processed, except that this introduces the concept of a partial
+ * match (MCV only works with full match / mismatch).
+ *
+ * The algorithm works like this:
+ *
+ * 1) mark all buckets as 'full match'
+ * 2) walk through all the clauses
+ * 3) for a particular clause, walk through all the buckets
+ * 4) skip buckets that are already 'no match'
+ * 5) check clause for buckets that still match (at least partially)
+ * 6) sum frequencies for buckets to get selectivity
+ *
+ * Unlike MCV lists, histograms have a concept of a partial match. In
+ * that case we use 1/2 the bucket, to minimize the average error. The
+ * MV histograms are usually less detailed than the per-column ones,
+ * meaning the sum is often quite high (thanks to combining a lot of
+ * "partially hit" buckets).
+ *
+ * Maybe we could use per-bucket information with number of distinct
+ * values it contains (for each dimension), and then use that to correct
+ * the estimate (so with 10 distinct values, we'd use 1/10 of the bucket
+ * frequency). We might also scale the value depending on the actual
+ * ndistinct estimate (not just the values observed in the sample).
+ *
+ * Another option would be to multiply the selectivities, i.e. if we get
+ * 'partial match' for a bucket for multiple conditions, we might use
+ * 0.5^k (where k is the number of conditions), instead of 0.5. This
+ * probably does not minimize the average error, though.
+ *
+ * TODO This might use a similar shortcut to MCV lists - count buckets
+ * marked as partial/full match, and terminate once this drops to 0.
+ * Not sure if it's really worth it - for MCV lists a situation like
+ * this is not uncommon, but for histograms it's not that clear.
+ */
+static Selectivity
+clauselist_mv_selectivity_histogram(PlannerInfo *root, List *clauses,
+ MVStatisticInfo *mvstats)
+{
+ int i;
+ Selectivity s = 0.0;
+ Selectivity u = 0.0;
+
+ int nmatches = 0;
+ char *matches = NULL;
+
+ MVSerializedHistogram mvhist = NULL;
+
+ /* there's no histogram */
+ if (! mvstats->hist_built)
+ return 0.0;
+
+ /* load the serialized histogram from the catalog */
+ mvhist = load_mv_histogram(mvstats->mvoid);
+
+ Assert (mvhist != NULL);
+ Assert (clauses != NIL);
+ Assert (list_length(clauses) >= 2);
+
+ /*
+ * Bitmap of bucket matches (mismatch, partial, full). By default
+ * all buckets fully match (and we'll eliminate them).
+ */
+ matches = palloc0(sizeof(char) * mvhist->nbuckets);
+ memset(matches, MVSTATS_MATCH_FULL, sizeof(char)*mvhist->nbuckets);
+
+ nmatches = mvhist->nbuckets;
+
+ /* build the match bitmap */
+ update_match_bitmap_histogram(root, clauses,
+ mvstats->stakeys, mvhist,
+ nmatches, matches, false);
+
+ /* now, walk through the buckets and sum the selectivities */
+ for (i = 0; i < mvhist->nbuckets; i++)
+ {
+ /*
+ * Find out what part of the data is covered by the histogram,
+ * so that we can 'scale' the selectivity properly (e.g. when
+ * only 50% of the sample got into the histogram, and the rest
+ * is in an MCV list).
+ *
+ * TODO This might be handled by keeping a global "frequency"
+ * for the whole histogram, which might save us some time
+ * spent accessing the not-matching part of the histogram.
+ * Although it's likely in a cache, so it's very fast.
+ */
+ u += mvhist->buckets[i]->ntuples;
+
+ if (matches[i] == MVSTATS_MATCH_FULL)
+ s += mvhist->buckets[i]->ntuples;
+ else if (matches[i] == MVSTATS_MATCH_PARTIAL)
+ s += 0.5 * mvhist->buckets[i]->ntuples;
+ }
+
+#ifdef DEBUG_MVHIST
+ debug_histogram_matches(mvhist, matches);
+#endif
+
+ /* release the allocated bitmap and deserialized histogram */
+ pfree(matches);
+ pfree(mvhist);
+
+ return s * u;
+}
+
+/* cached result of bucket boundary comparison for a single dimension */
+
+#define HIST_CACHE_NOT_FOUND 0x00
+#define HIST_CACHE_FALSE 0x01
+#define HIST_CACHE_TRUE 0x03
+#define HIST_CACHE_MASK 0x02
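+/*
+ * Each cache entry uses two bits: bit 0 records that the comparator was
+ * already called for the value, bit 1 stores the cached result, so
+ * ((cache & HIST_CACHE_MASK) >> 1) recovers the cached boolean.
+ */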
+
+static char
+bucket_contains_value(FmgrInfo ltproc, Datum constvalue,
+ Datum min_value, Datum max_value,
+ int min_index, int max_index,
+ bool min_include, bool max_include,
+ char * callcache)
+{
+ bool a, b;
+
+ char min_cached = callcache[min_index];
+ char max_cached = callcache[max_index];
+
+ /*
+ * First some quick checks on equality - if any of the boundaries equals,
+ * we have a partial match (so no need to call the comparator).
+ */
+ if (((min_value == constvalue) && (min_include)) ||
+ ((max_value == constvalue) && (max_include)))
+ return MVSTATS_MATCH_PARTIAL;
+
+ /* Keep the values 0/1 because of the XOR at the end. */
+ a = ((min_cached & HIST_CACHE_MASK) >> 1);
+ b = ((max_cached & HIST_CACHE_MASK) >> 1);
+
+ /*
+ * If the result for the bucket lower bound is not in the cache, evaluate
+ * the function and store the result in the cache.
+ */
+ if (! min_cached)
+ {
+ a = DatumGetBool(FunctionCall2Coll(<proc,
+ DEFAULT_COLLATION_OID,
+ constvalue, min_value));
+ /* remember the result */
+ callcache[min_index] = (a) ? HIST_CACHE_TRUE : HIST_CACHE_FALSE;
+ }
+
+ /* And do the same for the upper bound. */
+ if (! max_cached)
+ {
+ b = DatumGetBool(FunctionCall2Coll(<proc,
+ DEFAULT_COLLATION_OID,
+ constvalue, max_value));
+ /* remember the result */
+ callcache[max_index] = (b) ? HIST_CACHE_TRUE : HIST_CACHE_FALSE;
+ }
+
+ return (a ^ b) ? MVSTATS_MATCH_PARTIAL : MVSTATS_MATCH_NONE;
+}
+
+static char
+bucket_is_smaller_than_value(FmgrInfo opproc, Datum constvalue,
+ Datum min_value, Datum max_value,
+ int min_index, int max_index,
+ bool min_include, bool max_include,
+ char * callcache, bool isgt)
+{
+ char min_cached = callcache[min_index];
+ char max_cached = callcache[max_index];
+
+ /* Keep the values 0/1 because of the XOR at the end. */
+ bool a = ((min_cached & HIST_CACHE_MASK) >> 1);
+ bool b = ((max_cached & HIST_CACHE_MASK) >> 1);
+
+ if (! min_cached)
+ {
+ a = DatumGetBool(FunctionCall2Coll(&opproc,
+ DEFAULT_COLLATION_OID,
+ min_value,
+ constvalue));
+ /* remember the result */
+ callcache[min_index] = (a) ? HIST_CACHE_TRUE : HIST_CACHE_FALSE;
+ }
+
+ if (! max_cached)
+ {
+ b = DatumGetBool(FunctionCall2Coll(&opproc,
+ DEFAULT_COLLATION_OID,
+ max_value,
+ constvalue));
+ /* remember the result */
+ callcache[max_index] = (b) ? HIST_CACHE_TRUE : HIST_CACHE_FALSE;
+ }
+
+ /*
+ * Now, we need to combine both results into the final answer, and we need
+ * to be careful about the 'isgt' variable which kinda inverts the meaning.
+ *
+ * First, we handle the case when each boundary returns different results.
+ * In that case the outcome can only be 'partial' match.
+ */
+ if (a != b)
+ return MVSTATS_MATCH_PARTIAL;
+
+ /*
+ * When the results are the same, then it depends on the 'isgt' value. There
+ * are four options:
+ *
+ * isgt=false a=b=true => full match
+ * isgt=false a=b=false => empty
+ * isgt=true a=b=true => empty
+ * isgt=true a=b=false => full match
+ *
+ * We'll cheat a bit, because we know that (a=b) so we'll use just one of them.
+ */
+ if (isgt)
+ return (!a) ? MVSTATS_MATCH_FULL : MVSTATS_MATCH_NONE;
+ else
+ return ( a) ? MVSTATS_MATCH_FULL : MVSTATS_MATCH_NONE;
+}
+
+/*
+ * Evaluate clauses using the histogram, and update the match bitmap.
+ *
+ * The bitmap may be already partially set, so this is really a way to
+ * combine results of several clause lists - either when computing
+ * conditional probability P(A|B) or a combination of AND/OR clauses.
+ *
+ * Note: This is not a simple bitmap in the sense that there are more
+ * than two possible values for each item - no match, partial
+ * match and full match. So we need 2 bits per item.
+ *
+ * TODO This works with 'bitmap' where each item is represented as a
+ * char, which is slightly wasteful. Instead, we could use a bitmap
+ * with 2 bits per item, reducing the size to ~1/4. By using values
+ * 0, 1 and 3 (instead of 0, 1 and 2), the operations (merging etc.)
+ * might be performed just like for simple bitmap by using & and |,
+ * which might be faster than min/max.
+ */
+static int
+update_match_bitmap_histogram(PlannerInfo *root, List *clauses,
+ int2vector *stakeys,
+ MVSerializedHistogram mvhist,
+ int nmatches, char * matches,
+ bool is_or)
+{
+ int i;
+ ListCell * l;
+
+ /*
+ * Used for caching function call results, so each deduplicated value is evaluated once.
+ *
+ * We know we may have up to (2 * nbuckets) values per dimension. It's
+ * probably overkill, but let's allocate that once for all clauses,
+ * to minimize overhead.
+ *
+ * Also, we only need two bits per value, but this allocates a byte
+ * per value. Might be worth optimizing.
+ *
+ * 0x00 - not yet called
+ * 0x01 - called, result is 'false'
+ * 0x03 - called, result is 'true'
+ */
+ char *callcache = palloc(2 * mvhist->nbuckets);
+
+ Assert(mvhist != NULL);
+ Assert(mvhist->nbuckets > 0);
+ Assert(nmatches >= 0);
+ Assert(nmatches <= mvhist->nbuckets);
+
+ Assert(clauses != NIL);
+ Assert(list_length(clauses) >= 1);
+
+ /* loop through the clauses and do the estimation */
+ foreach (l, clauses)
+ {
+ Node * clause = (Node*)lfirst(l);
+
+ /* if it's a RestrictInfo, then extract the clause */
+ if (IsA(clause, RestrictInfo))
+ clause = (Node*)((RestrictInfo*)clause)->clause;
+
+ /* it's either OpClause, or NullTest */
+ if (is_opclause(clause))
+ {
+ OpExpr * expr = (OpExpr*)clause;
+ bool varonleft = true;
+ bool ok;
+
+ FmgrInfo opproc; /* operator */
+ fmgr_info(get_opcode(expr->opno), &opproc);
+
+ /* reset the cache (per clause) */
+ memset(callcache, 0, 2 * mvhist->nbuckets);
+
+ ok = (NumRelids(clause) == 1) &&
+ (is_pseudo_constant_clause(lsecond(expr->args)) ||
+ (varonleft = false,
+ is_pseudo_constant_clause(linitial(expr->args))));
+
+ if (ok)
+ {
+ FmgrInfo ltproc;
+ RegProcedure oprrest = get_oprrest(expr->opno);
+
+ Var * var = (varonleft) ? linitial(expr->args) : lsecond(expr->args);
+ Const * cst = (varonleft) ? lsecond(expr->args) : linitial(expr->args);
+ bool isgt = (! varonleft);
+
+ TypeCacheEntry *typecache
+ = lookup_type_cache(var->vartype, TYPECACHE_LT_OPR);
+
+ /* lookup dimension for the attribute */
+ int idx = mv_get_index(var->varattno, stakeys);
+
+ fmgr_info(get_opcode(typecache->lt_opr), <proc);
+
+ /*
+ * Check this for all buckets that still have "true" in the bitmap
+ *
+ * We already know the clauses use suitable operators (because that's
+ * how we filtered them).
+ */
+ for (i = 0; i < mvhist->nbuckets; i++)
+ {
+ char res = MVSTATS_MATCH_NONE;
+
+ MVSerializedBucket bucket = mvhist->buckets[i];
+
+ /* histogram boundaries */
+ Datum minval, maxval;
+ bool mininclude, maxinclude;
+ int minidx, maxidx;
+
+ /*
+ * For AND-lists, we can also mark NULL buckets as 'no match'
+ * (and then skip them). For OR-lists this is not possible.
+ */
+ if ((! is_or) && bucket->nullsonly[idx])
+ matches[i] = MVSTATS_MATCH_NONE;
+
+ /*
+ * Skip buckets that were already eliminated - this is important
+ * considering how we update the info (we only lower the match).
+ * We can't really do anything about the MATCH_PARTIAL buckets.
+ */
+ if ((! is_or) && (matches[i] == MVSTATS_MATCH_NONE))
+ continue;
+ else if (is_or && (matches[i] == MVSTATS_MATCH_FULL))
+ continue;
+
+ /* lookup the values and cache of function calls */
+ minidx = bucket->min[idx];
+ maxidx = bucket->max[idx];
+
+ minval = mvhist->values[idx][bucket->min[idx]];
+ maxval = mvhist->values[idx][bucket->max[idx]];
+
+ mininclude = bucket->min_inclusive[idx];
+ maxinclude = bucket->max_inclusive[idx];
+
+ /*
+ * TODO Maybe it's possible to add here a similar optimization
+ * as for the MCV lists:
+ *
+ * (nmatches == 0) && AND-list => all eliminated (FALSE)
+ * (nmatches == N) && OR-list => all eliminated (TRUE)
+ *
+ * But it's more complex because of the partial matches.
+ */
+
+ /*
+ * If it's not a "<" or ">" or "=" operator, just ignore the
+ * clause. Otherwise note the relid and attnum for the variable.
+ *
+ * TODO I'm really unsure the handling of 'isgt' flag (that is, clauses
+ * with reverse order of variable/constant) is correct. I wouldn't
+ * be surprised if there was some mixup. Using the lt/gt operators
+ * instead of messing with the opproc could make it simpler.
+ * It would however be using a different operator than the query,
+ * although it's not any shadier than using the selectivity function
+ * as is done currently.
+ */
+ switch (oprrest)
+ {
+ case F_SCALARLTSEL: /* Var < Const */
+ case F_SCALARGTSEL: /* Var > Const */
+
+ res = bucket_is_smaller_than_value(opproc, cst->constvalue,
+ minval, maxval,
+ minidx, maxidx,
+ mininclude, maxinclude,
+ callcache, isgt);
+ break;
+
+ case F_EQSEL:
+
+ /*
+ * We only check whether the value is within the bucket, using the
+ * lt operator, and we also check for equality with the boundaries.
+ */
+
+ res = bucket_contains_value(ltproc, cst->constvalue,
+ minval, maxval,
+ minidx, maxidx,
+ mininclude, maxinclude,
+ callcache);
+ break;
+ }
+
+ UPDATE_RESULT(matches[i], res, is_or);
+
+ }
+ }
+ }
+ else if (IsA(clause, NullTest))
+ {
+ NullTest * expr = (NullTest*)clause;
+ Var * var = (Var*)(expr->arg);
+
+ /* FIXME proper matching attribute to dimension */
+ int idx = mv_get_index(var->varattno, stakeys);
+
+ /*
+ * Walk through the buckets and evaluate the current clause. We can
+ * skip items that were already ruled out, and terminate if there are
+ * no remaining buckets that might possibly match.
+ */
+ for (i = 0; i < mvhist->nbuckets; i++)
+ {
+ MVSerializedBucket bucket = mvhist->buckets[i];
+
+ /*
+ * Skip buckets that were already eliminated - this is important
+ * considering how we update the info (we only lower the match)
+ */
+ if ((! is_or) && (matches[i] == MVSTATS_MATCH_NONE))
+ continue;
+ else if (is_or && (matches[i] == MVSTATS_MATCH_FULL))
+ continue;
+
+ /* if the clause mismatches the bucket, set it as MATCH_NONE */
+ if ((expr->nulltesttype == IS_NULL)
+ && (! bucket->nullsonly[idx]))
+ UPDATE_RESULT(matches[i], MVSTATS_MATCH_NONE, is_or);
+
+ else if ((expr->nulltesttype == IS_NOT_NULL) &&
+ (bucket->nullsonly[idx]))
+ UPDATE_RESULT(matches[i], MVSTATS_MATCH_NONE, is_or);
+ }
+ }
+ else if (or_clause(clause) || and_clause(clause))
+ {
+ /* AND/OR clause, with all clauses compatible with the selected MV stat */
+
+ int i;
+ BoolExpr *orclause = ((BoolExpr*)clause);
+ List *orclauses = orclause->args;
+
+ /* match/mismatch bitmap for each bucket */
+ int or_nmatches = 0;
+ char * or_matches = NULL;
+
+ Assert(orclauses != NIL);
+ Assert(list_length(orclauses) >= 2);
+
+ /* number of matching buckets */
+ or_nmatches = mvhist->nbuckets;
+
+ /* by default none of the buckets matches the clauses */
+ or_matches = palloc0(sizeof(char) * or_nmatches);
+
+ if (or_clause(clause))
+ {
+ /* OR clauses assume nothing matches, initially */
+ memset(or_matches, MVSTATS_MATCH_NONE, sizeof(char)*or_nmatches);
+ or_nmatches = 0;
+ }
+ else
+ {
+ /* AND clauses assume everything matches, initially */
+ memset(or_matches, MVSTATS_MATCH_FULL, sizeof(char)*or_nmatches);
+ }
+
+ /* build the match bitmap for the OR-clauses */
+ or_nmatches = update_match_bitmap_histogram(root, orclauses,
+ stakeys, mvhist,
+ or_nmatches, or_matches, or_clause(clause));
+
+ /* merge the bitmap into the existing one */
+ for (i = 0; i < mvhist->nbuckets; i++)
+ {
+ /*
+ * To AND-merge the bitmaps, a MIN() semantics is used.
+ * For OR-merge, use MAX().
+ *
+ * FIXME this does not decrease the number of matches
+ */
+ UPDATE_RESULT(matches[i], or_matches[i], is_or);
+ }
+
+ pfree(or_matches);
+
+ }
+ else
+ elog(ERROR, "unknown clause type: %d", clause->type);
+ }
+
+ /* free the call cache */
+ pfree(callcache);
+
+ return nmatches;
+}
diff --git a/src/backend/optimizer/util/plancat.c b/src/backend/optimizer/util/plancat.c
index 8394111..2519249 100644
--- a/src/backend/optimizer/util/plancat.c
+++ b/src/backend/optimizer/util/plancat.c
@@ -412,7 +412,7 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
mvstat = (Form_pg_mv_statistic) GETSTRUCT(htup);
/* unavailable stats are not interesting for the planner */
- if (mvstat->deps_built || mvstat->mcv_built)
+ if (mvstat->deps_built || mvstat->mcv_built || mvstat->hist_built)
{
info = makeNode(MVStatisticInfo);
@@ -422,10 +422,12 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
/* enabled statistics */
info->deps_enabled = mvstat->deps_enabled;
info->mcv_enabled = mvstat->mcv_enabled;
+ info->hist_enabled = mvstat->hist_enabled;
/* built/available statistics */
info->deps_built = mvstat->deps_built;
info->mcv_built = mvstat->mcv_built;
+ info->hist_built = mvstat->hist_built;
/* stakeys */
adatum = SysCacheGetAttr(MVSTATOID, htup,
diff --git a/src/backend/utils/mvstats/Makefile b/src/backend/utils/mvstats/Makefile
index f9bf10c..9dbb3b6 100644
--- a/src/backend/utils/mvstats/Makefile
+++ b/src/backend/utils/mvstats/Makefile
@@ -12,6 +12,6 @@ subdir = src/backend/utils/mvstats
top_builddir = ../../../..
include $(top_builddir)/src/Makefile.global
-OBJS = common.o dependencies.o mcv.o
+OBJS = common.o dependencies.o histogram.o mcv.o
include $(top_srcdir)/src/backend/common.mk
diff --git a/src/backend/utils/mvstats/README.histogram b/src/backend/utils/mvstats/README.histogram
new file mode 100644
index 0000000..a182fa3
--- /dev/null
+++ b/src/backend/utils/mvstats/README.histogram
@@ -0,0 +1,299 @@
+Multivariate histograms
+=======================
+
+Histograms on individual attributes consist of buckets represented by ranges,
+covering the domain of the attribute. That is, each bucket is a [min,max]
+interval, and contains all values in this range. The histogram is built in such
+a way that all buckets have about the same frequency.
+
+Multivariate histograms are an extension into n-dimensional space - the buckets
+are n-dimensional intervals (i.e. n-dimensional rectangles), covering the domain
+of the combination of attributes. That is, each bucket has a vector of lower
+and upper boundaries, denoted min[i] and max[i] (where i = 1..n).
+
+In addition to the boundaries, each bucket tracks additional info:
+
+ * frequency (fraction of tuples in the bucket)
+ * whether the boundaries are inclusive or exclusive
+ * whether the dimension contains only NULL values
+ * number of distinct values in each dimension (for building only)
+
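+For example, in a histogram on two columns (a, b), a bucket with min = [0, 100]
+and max = [10, 200] represents the rectangle 0 <= a <= 10 AND 100 <= b <= 200
+(with the exact edge behavior given by the inclusive flags).
+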
+It's possible that in the future we'll have multiple histogram types, with
+different features. We do however expect all the types to share the same
+representation (buckets as ranges) and only differ in how we build them.
+
+The current implementation builds non-overlapping buckets, but that may not be
+true for other histogram types, so the code should not rely on this assumption.
+There are interesting types of histograms (or algorithms) with overlapping
+buckets.
+
+When used on low-cardinality data, histograms usually perform considerably worse
+than MCV lists (which are a good fit for this kind of data). This is especially
+true on label-like values, where the ordering of the values is mostly unrelated
+to the meaning of the data, as proper ordering is crucial for histograms.
+
+On high-cardinality data the histograms are usually a better choice, because MCV
+lists can't represent the distribution accurately enough.
+
+
+Selectivity estimation
+----------------------
+
+The estimation is implemented in clauselist_mv_selectivity_histogram(), and
+works very similarly to clauselist_mv_selectivity_mcvlist().
+
+The main difference is that while MCV lists support exact matches, histograms
+often result in approximate matches - e.g. with equality we can only say if
+the constant would be part of the bucket, but not whether it really is there
+or what fraction of the bucket it corresponds to. In this case we rely on
+some defaults just like in the per-column histograms.
+
+The current implementation uses histograms to estimate these types of clauses
+(think of WHERE conditions):
+
+ (a) equality clauses WHERE (a = 1) AND (b = 2)
+ (b) inequality clauses WHERE (a < 1) AND (b >= 2)
+ (c) NULL clauses WHERE (a IS NULL) AND (b IS NOT NULL)
+ (d) OR-clauses WHERE (a = 1) OR (b = 2)
+
+Similarly to MCV lists, it's possible to add support for additional types of
+clauses, for example:
+
+ (e) multi-var clauses WHERE (a > b)
+
+and so on. These are tasks for the future, not yet implemented.
+
+
+When evaluating a clause on a bucket, we may get one of three results:
+
+ (a) FULL_MATCH - The bucket definitely matches the clause.
+
+ (b) PARTIAL_MATCH - The bucket matches the clause, but not necessarily all
+ the tuples it represents.
+
+ (c) NO_MATCH - The bucket definitely does not match the clause.
+
+This may be illustrated using a range [1, 5], which is essentially a 1-D bucket.
+With clause
+
+ WHERE (a < 10) => FULL_MATCH (all range values are below
+ 10, so the whole bucket matches)
+
+ WHERE (a < 3) => PARTIAL_MATCH (there may be values matching
+ the clause, but we don't know how many)
+
+ WHERE (a < 0) => NO_MATCH (the whole range is above 0, so
+ no values from the bucket can match)
+
+Some clauses may produce only some of those results - for example equality
+clauses may never produce FULL_MATCH as we always hit only part of the bucket
+(we can't match both boundaries at the same time). This results in less accurate
+estimates compared to MCV lists, where we can hit an MCV item exactly (there's
+no PARTIAL match in MCV).
+
+There are also clauses that may not produce any PARTIAL_MATCH results. A nice
+example of that is the 'IS [NOT] NULL' clause, which either matches the bucket
+completely (FULL_MATCH) or not at all (NO_MATCH), thanks to how the NULL-buckets
+are constructed.
+
+Computing the total selectivity estimate is trivial - simply sum selectivities
+from all the FULL_MATCH and PARTIAL_MATCH buckets (but for buckets marked with
+PARTIAL_MATCH, multiply the frequency by 0.5 to minimize the average error).
+
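+For example (with illustrative frequencies), one FULL_MATCH bucket with
+frequency 0.20 and two PARTIAL_MATCH buckets with frequencies 0.10 and 0.04
+give the estimate
+
+    0.20 + 0.5 * 0.10 + 0.5 * 0.04 = 0.27
+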
+
+Building a histogram
+---------------------
+
+The algorithm of building a histogram in general is quite simple:
+
+ (a) create an initial bucket (containing all sample rows)
+
+ (b) create NULL buckets (by splitting the initial bucket)
+
+ (c) repeat
+
+ (1) choose bucket to split next
+
+ (2) terminate if no bucket that might be split found, or if we've
+ reached the maximum number of buckets (16384)
+
+ (3) choose dimension to partition the bucket by
+
+ (4) partition the bucket by the selected dimension
+
+The main complexity is hidden in steps (c.1) and (c.3), i.e. how we choose the
+bucket and dimension for the split, as discussed in the next section.
+
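+In pseudo-code, the loop in step (c) might look like this (a sketch, not the
+actual implementation - choose_split_dimension() stands in for the logic
+described in the next section):
+
+    while (nbuckets < max_buckets)
+    {
+        bucket = select_bucket_to_partition(nbuckets, buckets);
+
+        if (bucket == NULL)     /* no bucket can be split further */
+            break;
+
+        dim = choose_split_dimension(bucket);   /* "longest" dimension */
+
+        buckets[nbuckets++] = partition_bucket(bucket, dim);
+    }
+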
+
+Partitioning criteria
+---------------------
+
+Similarly to one-dimensional histograms, we want to produce buckets with roughly
+the same frequency.
+
+We also need to produce "regular" buckets, because buckets with one dimension
+much longer than the others are very likely to match a lot of conditions (which
+increases error, even if the bucket frequency is very low).
+
+This is especially important when handling OR-clauses, because in that case each
+clause may add buckets independently. With AND-clauses all the clauses have to
+match each bucket, which makes this issue somewhat less concerning.
+
+To achieve this, we choose the largest bucket (containing the most sample rows),
+but we only choose buckets that can actually be split (have at least 3 different
+combinations of values).
+
+Then we choose the "longest" dimension of the bucket, which is computed by using
+the distinct values in the sample as a measure.
+
+For details see functions select_bucket_to_partition() and partition_bucket(),
+which also includes further discussion.
+
+
+The current limit on number of buckets (16384) is mostly arbitrary, but chosen
+so that it guarantees we don't exceed the number of distinct values indexable by
+uint16 in any of the dimensions. In practice we could handle more buckets as we
+index each dimension separately and the splits should use the dimensions evenly.
+
+Also, histograms this large (with 16k values in multiple dimensions) would be
+quite expensive to build and process, so the 16k limit is rather reasonable.
+
+The actual number of buckets is also related to statistics target, because we
+require MIN_BUCKET_ROWS (10) tuples per bucket before a split, so we can't have
+more than (2 * 300 * target / 10) buckets. For the default target (100) this
+evaluates to ~6k.
+
+
+NULL handling (create_null_buckets)
+-----------------------------------
+
+When building histograms on a single attribute, we first filter out NULL values.
+In the multivariate case, we can't really do that because the rows may contain
+a mix of NULL and non-NULL values in different columns (so we can't simply
+filter all of them out).
+
+For this reason, the histograms are built so that in each bucket, each dimension
+contains either only NULL or only non-NULL values. Building the NULL-buckets
+happens as the first step in the build, by the create_null_buckets() function.
+The number of NULL buckets, as produced by this function, has a clear upper
+boundary (2^N) where N is the number of dimensions (attributes the histogram is
+built on). Or rather 2^K where K is the number of attributes that are not marked
+as not-NULL.
+
+The buckets with NULL dimensions are then subject to the same build algorithm
+(i.e. may be split into smaller buckets) just like any other bucket, but may
+only be split by non-NULL dimension.
+
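+For example, with two nullable columns, the initial bucket may be split into up
+to four NULL-buckets, one per NULL pattern: (NULL, NULL), (NULL, non-NULL),
+(non-NULL, NULL) and (non-NULL, non-NULL).
+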
+
+Serialization
+-------------
+
+To store the histogram in the pg_mv_statistic table, it is serialized into a
+more efficient form. We also use this representation during estimation, i.e.
+we don't fully deserialize the histogram.
+
+For example the boundary values are deduplicated to minimize the required space.
+How much redundancy is there, actually? Let's assume there are no NULL values,
+so we start with a single bucket - in that case we have 2*N boundaries. Each
+time we split a bucket we introduce one new value (in the "middle" of one of
+the dimensions), and keep boundaries for all the other dimensions. So after K
+splits, we have up to
+
+ 2*N + K
+
+unique boundary values (we may have fewer values, if the same value is used for
+several splits). But after K splits we do have (K+1) buckets, so
+
+ (K+1) * 2 * N
+
+boundary values. Using e.g. N=4 and K=999, we arrive at these numbers:
+
+ 2*N + K = 1007
+ (K+1) * 2 * N = 8000
+
+which means a lot of redundancy. It's somewhat counter-intuitive that the number
+of distinct values does not really depend on the number of dimensions (except
+for the initial bucket, but that's negligible compared to the total).
+
+By deduplicating the values and replacing them with 16-bit indexes (uint16), we
+reduce the required space to
+
+ 1007 * 8 + 8000 * 2 ~= 24kB
+
+which is significantly less than 64kB required for the 'raw' histogram (assuming
+the values are 8B).
+
+While the bytea compression (pglz) might achieve the same reduction of space,
+the deduplicated representation is used to optimize the estimation by caching
+results of function calls for already visited values. This significantly
+reduces the number of calls to (often quite expensive) operators.
+
+Note: Of course, this reasoning only holds for histograms built by the algorithm
+that simply splits the buckets in half. Other histograms types (e.g. containing
+overlapping buckets) may behave differently and require different serialization.
+
+Serialized histograms are marked with 'magic' constant, to make it easier to
+check the bytea value really is a serialized histogram.
+
+
+varlena compression
+-------------------
+
+This serialization may however disable automatic varlena compression, because
+the array of unique values is placed at the beginning of the serialized form.
+That is exactly the chunk used by pglz to check if the data is compressible,
+and it will probably decide it's not very compressible. This is similar to the
+issue we had with JSONB initially.
+
+Maybe storing the buckets first would make it work, as the buckets may be more
+compressible.
+
+On the other hand the serialization is actually a context-aware compression,
+usually compressing to ~30% (or even less, with large data types). So the lack
+of additional pglz compression may be acceptable.
+
+
+Deserialization
+---------------
+
+The deserialization is not a perfect inverse of the serialization, as we keep
+the deduplicated arrays. This reduces the amount of memory and also allows
+optimizations during estimation (e.g. we can cache results for the distinct
+values, saving expensive function calls).
+
+
+Inspecting the histogram
+------------------------
+
+Inspecting the regular (per-attribute) histograms is trivial, as it's enough
+to select the columns from pg_stats - the data is encoded as anyarray, so we
+simply get the text representation of the array.
+
+With multivariate histograms it's not that simple due to the possible mix of
+data types in the histogram. It might be possible to produce similar array-like
+text representation, but that'd unnecessarily complicate further processing
+and analysis of the histogram. Instead, there's a set-returning function (SRF)
+that provides access to lower/upper boundaries, frequencies etc.
+
+ SELECT * FROM pg_mv_histogram_buckets();
+
+It has two input parameters:
+
+ oid - OID of the histogram (pg_mv_statistic.staoid)
+ otype - type of output
+
+and produces a table with these columns:
+
+ - bucket ID (0...nbuckets-1)
+ - lower bucket boundaries (string array)
+ - upper bucket boundaries (string array)
+ - nulls only dimensions (boolean array)
+ - lower boundary inclusive (boolean array)
+ - upper boundary inclusive (boolean array)
+ - frequency (double precision)
+
+The 'otype' accepts three values, determining what will be returned in the
+lower/upper boundary arrays:
+
+ - 0 - values stored in the histogram, encoded as text
+ - 1 - indexes into the deduplicated arrays
+ - 2 - indexes into the deduplicated arrays, scaled to [0,1]
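+
+For example, to list the buckets with boundary values encoded as text
+(otype = 0), substituting the OID of the statistics row:
+
+    SELECT * FROM pg_mv_histogram_buckets(<staoid>, 0);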
diff --git a/src/backend/utils/mvstats/README.stats b/src/backend/utils/mvstats/README.stats
index 5c5c59a..3e4f4d1 100644
--- a/src/backend/utils/mvstats/README.stats
+++ b/src/backend/utils/mvstats/README.stats
@@ -18,6 +18,8 @@ Currently we only have two kinds of multivariate statistics
(b) MCV lists (README.mcv)
+ (c) multivariate histograms (README.histogram)
+
Compatible clause types
-----------------------
diff --git a/src/backend/utils/mvstats/common.c b/src/backend/utils/mvstats/common.c
index 4f5a842..f6d1074 100644
--- a/src/backend/utils/mvstats/common.c
+++ b/src/backend/utils/mvstats/common.c
@@ -13,11 +13,11 @@
*
*-------------------------------------------------------------------------
*/
+#include "postgres.h"
+#include "utils/array.h"
#include "common.h"
-#include "utils/array.h"
-
static VacAttrStats ** lookup_var_attr_stats(int2vector *attrs,
int natts,
VacAttrStats **vacattrstats);
@@ -52,7 +52,8 @@ build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
MVStatisticInfo *stat = (MVStatisticInfo *)lfirst(lc);
MVDependencies deps = NULL;
MCVList mcvlist = NULL;
- int numrows_filtered = 0;
+ MVHistogram histogram = NULL;
+ int numrows_filtered = numrows;
VacAttrStats **stats = NULL;
int numatts = 0;
@@ -95,8 +96,12 @@ build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
if (stat->mcv_enabled)
mcvlist = build_mv_mcvlist(numrows, rows, attrs, stats, &numrows_filtered);
+ /* build a multivariate histogram on the columns */
+ if ((numrows_filtered > 0) && (stat->hist_enabled))
+ histogram = build_mv_histogram(numrows_filtered, rows, attrs, stats, numrows);
+
/* store the histogram / MCV list in the catalog */
- update_mv_stats(stat->mvoid, deps, mcvlist, attrs, stats);
+ update_mv_stats(stat->mvoid, deps, mcvlist, histogram, attrs, stats);
}
}
@@ -176,6 +181,8 @@ list_mv_stats(Oid relid)
info->deps_built = stats->deps_built;
info->mcv_enabled = stats->mcv_enabled;
info->mcv_built = stats->mcv_built;
+ info->hist_enabled = stats->hist_enabled;
+ info->hist_built = stats->hist_built;
result = lappend(result, info);
}
@@ -190,7 +197,6 @@ list_mv_stats(Oid relid)
return result;
}
-
/*
* Find attnims of MV stats using the mvoid.
*/
@@ -236,9 +242,16 @@ find_mv_attnums(Oid mvoid, Oid *relid)
}
+/*
+ * FIXME This adds statistics, but we need to drop statistics when the
+ * table is dropped. Not sure what to do when a column is dropped.
+ * Either we can (a) remove all stats on that column, (b) remove
+ * the column from defined stats and force rebuild, (c) remove the
+ * column on next ANALYZE. Or maybe something else?
+ */
void
update_mv_stats(Oid mvoid,
- MVDependencies dependencies, MCVList mcvlist,
+ MVDependencies dependencies, MCVList mcvlist, MVHistogram histogram,
int2vector *attrs, VacAttrStats **stats)
{
HeapTuple stup,
@@ -271,22 +284,34 @@ update_mv_stats(Oid mvoid,
values[Anum_pg_mv_statistic_stamcv - 1] = PointerGetDatum(data);
}
+ if (histogram != NULL)
+ {
+ bytea * data = serialize_mv_histogram(histogram, attrs, stats);
+ nulls[Anum_pg_mv_statistic_stahist-1] = (data == NULL);
+ values[Anum_pg_mv_statistic_stahist - 1]
+ = PointerGetDatum(data);
+ }
+
/* always replace the value (either by bytea or NULL) */
replaces[Anum_pg_mv_statistic_stadeps -1] = true;
replaces[Anum_pg_mv_statistic_stamcv -1] = true;
+ replaces[Anum_pg_mv_statistic_stahist-1] = true;
/* always change the availability flags */
nulls[Anum_pg_mv_statistic_deps_built -1] = false;
nulls[Anum_pg_mv_statistic_mcv_built -1] = false;
+ nulls[Anum_pg_mv_statistic_hist_built-1] = false;
nulls[Anum_pg_mv_statistic_stakeys-1] = false;
/* use the new attnums, in case we removed some dropped ones */
replaces[Anum_pg_mv_statistic_deps_built-1] = true;
replaces[Anum_pg_mv_statistic_mcv_built -1] = true;
+ replaces[Anum_pg_mv_statistic_hist_built -1] = true;
replaces[Anum_pg_mv_statistic_stakeys -1] = true;
values[Anum_pg_mv_statistic_deps_built-1] = BoolGetDatum(dependencies != NULL);
values[Anum_pg_mv_statistic_mcv_built -1] = BoolGetDatum(mcvlist != NULL);
+ values[Anum_pg_mv_statistic_hist_built -1] = BoolGetDatum(histogram != NULL);
values[Anum_pg_mv_statistic_stakeys -1] = PointerGetDatum(attrs);
/* Is there already a pg_mv_statistic tuple for this attribute? */
diff --git a/src/backend/utils/mvstats/histogram.c b/src/backend/utils/mvstats/histogram.c
new file mode 100644
index 0000000..4bf7ec6
--- /dev/null
+++ b/src/backend/utils/mvstats/histogram.c
@@ -0,0 +1,2023 @@
+/*-------------------------------------------------------------------------
+ *
+ * histogram.c
+ * POSTGRES multivariate histograms
+ *
+ *
+ * Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ * src/backend/utils/mvstats/histogram.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres.h"
+
+#include "fmgr.h"
+#include "funcapi.h"
+
+#include "utils/lsyscache.h"
+
+#include "common.h"
+#include <math.h>
+
+
+static MVBucket create_initial_mv_bucket(int numrows, HeapTuple *rows,
+ int2vector *attrs,
+ VacAttrStats **stats);
+
+static MVBucket select_bucket_to_partition(int nbuckets, MVBucket * buckets);
+
+static MVBucket partition_bucket(MVBucket bucket, int2vector *attrs,
+ VacAttrStats **stats,
+ int *ndistvalues, Datum **distvalues);
+
+static MVBucket copy_mv_bucket(MVBucket bucket, uint32 ndimensions);
+
+static void update_bucket_ndistinct(MVBucket bucket, int2vector *attrs,
+ VacAttrStats ** stats);
+
+static void update_dimension_ndistinct(MVBucket bucket, int dimension,
+ int2vector *attrs,
+ VacAttrStats ** stats,
+ bool update_boundaries);
+
+static void create_null_buckets(MVHistogram histogram, int bucket_idx,
+ int2vector *attrs, VacAttrStats ** stats);
+
+static Datum * build_ndistinct(int numrows, HeapTuple *rows, int2vector *attrs,
+ VacAttrStats **stats, int i, int *nvals);
+
+/*
+ * Each serialized bucket needs to store (in this order):
+ *
+ * - number of tuples (float)
+ * - min inclusive flags (ndim * sizeof(bool))
+ * - max inclusive flags (ndim * sizeof(bool))
+ * - null dimension flags (ndim * sizeof(bool))
+ * - min boundary indexes (ndim * sizeof(uint16))
+ * - max boundary indexes (ndim * sizeof(uint16))
+ *
+ * So in total:
+ *
+ * ndim * (2 * sizeof(uint16) + 3 * sizeof(bool)) + sizeof(float)
+ */
+#define BUCKET_SIZE(ndims) \
+ ((ndims) * (2 * sizeof(uint16) + 3 * sizeof(bool)) + sizeof(float))
+
+/* pointers into a flat serialized bucket of BUCKET_SIZE(n) bytes */
+#define BUCKET_NTUPLES(b) (*(float*)b)
+#define BUCKET_MIN_INCL(b,n) ((bool*)(b + sizeof(float)))
+#define BUCKET_MAX_INCL(b,n) (BUCKET_MIN_INCL(b,n) + n)
+#define BUCKET_NULLS_ONLY(b,n) (BUCKET_MAX_INCL(b,n) + n)
+#define BUCKET_MIN_INDEXES(b,n) ((uint16*)(BUCKET_NULLS_ONLY(b,n) + n))
+#define BUCKET_MAX_INDEXES(b,n) ((BUCKET_MIN_INDEXES(b,n) + n))
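+
+/*
+ * For illustration, with ndims = 2 the flat layout works out like this
+ * (a sketch, assuming 4-byte float, 1-byte bool and 2-byte uint16):
+ *
+ *   offset  0: ntuples        (float, 4 bytes)
+ *   offset  4: min inclusive  (2 x bool)
+ *   offset  6: max inclusive  (2 x bool)
+ *   offset  8: nulls only     (2 x bool)
+ *   offset 10: min indexes    (2 x uint16)
+ *   offset 14: max indexes    (2 x uint16)
+ *
+ * i.e. BUCKET_SIZE(2) = 2 * (2 * 2 + 3 * 1) + 4 = 18 bytes.
+ */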
+
+/* can't split bucket with less than 10 rows */
+#define MIN_BUCKET_ROWS 10
+
+/*
+ * Data used while building the histogram.
+ */
+typedef struct HistogramBuildData {
+
+ float ndistinct; /* frequency of distinct values */
+
+ HeapTuple *rows; /* array of sample rows */
+ uint32 numrows; /* number of sample rows (array size) */
+
+ /*
+ * Number of distinct values in each dimension. This is used when
+ * building the histogram (and is not serialized/deserialized).
+ */
+ uint32 *ndistincts;
+
+} HistogramBuildData;
+
+typedef HistogramBuildData *HistogramBuild;
+
+/*
+ * Build a multivariate histogram from the sample rows.
+ *
+ * The build algorithm is iterative - initially a single bucket containing all
+ * the sample rows is formed, and then repeatedly split into smaller buckets.
+ * In each step the largest bucket (in some sense) is chosen to be split next.
+ *
+ * The criteria for selecting the largest bucket (and the dimension for the
+ * split) need to be elaborate enough to produce buckets of roughly the same
+ * size, and of reasonably regular shape (i.e. not very long in one dimension).
+ *
+ * The current algorithm works like this:
+ *
+ * build NULL-buckets (create_null_buckets)
+ *
+ * while [maximum number of buckets not reached]
+ *
+ * choose bucket to partition (largest bucket)
+ * if no bucket to partition
+ * terminate the algorithm
+ *
+ * choose bucket dimension to partition (largest dimension)
+ * split the bucket into two buckets
+ *
+ * See the discussion at select_bucket_to_partition and partition_bucket for
+ * more details about the algorithm.
+ */
+MVHistogram
+build_mv_histogram(int numrows, HeapTuple *rows, int2vector *attrs,
+ VacAttrStats **stats, int numrows_total)
+{
+ int i;
+ int numattrs = attrs->dim1;
+
+ int *ndistvalues;
+ Datum **distvalues;
+
+ MVHistogram histogram;
+
+ HeapTuple * rows_copy = (HeapTuple*)palloc0(numrows * sizeof(HeapTuple));
+ memcpy(rows_copy, rows, sizeof(HeapTuple) * numrows);
+
+ Assert((numattrs >= 2) && (numattrs <= MVSTATS_MAX_DIMENSIONS));
+
+ /* build histogram header */
+
+ histogram = (MVHistogram)palloc0(sizeof(MVHistogramData));
+
+ histogram->magic = MVSTAT_HIST_MAGIC;
+ histogram->type = MVSTAT_HIST_TYPE_BASIC;
+
+ histogram->nbuckets = 1;
+ histogram->ndimensions = numattrs;
+
+ /* allocate the maximum number of buckets right away (cheaper than repeated repalloc for a short-lived object) */
+ histogram->buckets
+ = (MVBucket*)palloc0(MVSTAT_HIST_MAX_BUCKETS * sizeof(MVBucket));
+
+ /* create the initial bucket, covering the whole sample set */
+ histogram->buckets[0]
+ = create_initial_mv_bucket(numrows, rows_copy, attrs, stats);
+
+ /*
+ * Collect info on distinct values in each dimension (used later to select
+ * dimension to partition).
+ */
+ ndistvalues = (int*)palloc0(sizeof(int) * numattrs);
+ distvalues = (Datum**)palloc0(sizeof(Datum*) * numattrs);
+
+ for (i = 0; i < numattrs; i++)
+ distvalues[i] = build_ndistinct(numrows, rows, attrs, stats, i,
+ &ndistvalues[i]);
+
+ /*
+ * Split the initial bucket into buckets that don't mix NULL and non-NULL
+ * values in a single dimension.
+ */
+ create_null_buckets(histogram, 0, attrs, stats);
+
+ /*
+ * Do the actual histogram build - select a bucket and split it.
+ *
+ * FIXME This should use the max_buckets specified in CREATE STATISTICS.
+ */
+ while (histogram->nbuckets < MVSTAT_HIST_MAX_BUCKETS)
+ {
+ MVBucket bucket = select_bucket_to_partition(histogram->nbuckets,
+ histogram->buckets);
+
+ /* no buckets eligible for partitioning */
+ if (bucket == NULL)
+ break;
+
+ /* we modify the bucket in-place and add one new bucket */
+ histogram->buckets[histogram->nbuckets++]
+ = partition_bucket(bucket, attrs, stats, ndistvalues, distvalues);
+ }
+
+ /* finalize the histogram build - compute the frequencies etc. */
+ for (i = 0; i < histogram->nbuckets; i++)
+ {
+ HistogramBuild build_data
+ = ((HistogramBuild)histogram->buckets[i]->build_data);
+
+ /*
+ * The frequency has to be computed from the whole sample, in case some
+ * of the rows were used for MCV.
+ *
+ * XXX Perhaps this should simply compute the frequency with respect
+ * to the local row count, and then factor in the MCV list later.
+ *
+ * FIXME The name 'ntuples' is a bit misleading for a frequency.
+ */
+ histogram->buckets[i]->ntuples
+ = (build_data->numrows * 1.0) / numrows_total;
+ }
+
+ return histogram;
+}
+
+/* build array of distinct values for a single attribute */
+static Datum *
+build_ndistinct(int numrows, HeapTuple *rows, int2vector *attrs,
+ VacAttrStats **stats, int i, int *nvals)
+{
+ int j;
+ int nvalues,
+ ndistinct;
+ Datum *values,
+ *distvalues;
+
+ SortSupportData ssup;
+ StdAnalyzeData *mystats = (StdAnalyzeData *) stats[i]->extra_data;
+
+ /* initialize sort support, etc. */
+ memset(&ssup, 0, sizeof(ssup));
+ ssup.ssup_cxt = CurrentMemoryContext;
+
+ /* We always use the default collation for statistics */
+ ssup.ssup_collation = DEFAULT_COLLATION_OID;
+ ssup.ssup_nulls_first = false;
+
+ PrepareSortSupportFromOrderingOp(mystats->ltopr, &ssup);
+
+ nvalues = 0;
+ values = (Datum*)palloc0(sizeof(Datum) * numrows);
+
+ /* collect values from the sample rows, ignore NULLs */
+ for (j = 0; j < numrows; j++)
+ {
+ Datum value;
+ bool isnull;
+
+ /* fetch the value of the attribute (NULL values are skipped below) */
+ value = heap_getattr(rows[j], attrs->values[i],
+ stats[i]->tupDesc, &isnull);
+
+ if (isnull)
+ continue;
+
+ values[nvalues++] = value;
+ }
+
+ /* if no non-NULL values were found, free the memory and terminate */
+ if (nvalues == 0)
+ {
+ pfree(values);
+ return NULL;
+ }
+
+ /* sort the array of values using the SortSupport */
+ qsort_arg((void *) values, nvalues, sizeof(Datum),
+ compare_scalars_simple, (void *) &ssup);
+
+ /* count the distinct values first, and allocate just enough memory */
+ ndistinct = 1;
+ for (j = 1; j < nvalues; j++)
+ if (compare_scalars_simple(&values[j], &values[j-1], &ssup) != 0)
+ ndistinct += 1;
+
+ distvalues = (Datum*)palloc0(sizeof(Datum) * ndistinct);
+
+ /* now collect distinct values into the array */
+ distvalues[0] = values[0];
+ ndistinct = 1;
+
+ for (j = 1; j < nvalues; j++)
+ {
+ if (compare_scalars_simple(&values[j], &values[j-1], &ssup) != 0)
+ {
+ distvalues[ndistinct] = values[j];
+ ndistinct += 1;
+ }
+ }
+
+ pfree(values);
+
+ *nvals = ndistinct;
+ return distvalues;
+}
+
+/* fetch the histogram (as a bytea) from the pg_mv_statistic catalog */
+MVSerializedHistogram
+load_mv_histogram(Oid mvoid)
+{
+ bool isnull = false;
+ Datum histogram;
+
+#ifdef USE_ASSERT_CHECKING
+ Form_pg_mv_statistic mvstat;
+#endif
+
+ /* Fetch the pg_mv_statistic tuple for the given statistics OID. */
+ HeapTuple htup = SearchSysCache1(MVSTATOID, ObjectIdGetDatum(mvoid));
+
+ if (! HeapTupleIsValid(htup))
+ return NULL;
+
+#ifdef USE_ASSERT_CHECKING
+ mvstat = (Form_pg_mv_statistic) GETSTRUCT(htup);
+ Assert(mvstat->hist_enabled && mvstat->hist_built);
+#endif
+
+ histogram = SysCacheGetAttr(MVSTATOID, htup,
+ Anum_pg_mv_statistic_stahist, &isnull);
+
+ Assert(!isnull);
+
+ ReleaseSysCache(htup);
+
+ return deserialize_mv_histogram(DatumGetByteaP(histogram));
+}
+
+/* print some basic info about the histogram */
+Datum
+pg_mv_stats_histogram_info(PG_FUNCTION_ARGS)
+{
+ bytea *data = PG_GETARG_BYTEA_P(0);
+ char *result;
+
+ MVSerializedHistogram hist = deserialize_mv_histogram(data);
+
+ result = palloc0(128);
+ snprintf(result, 128, "nbuckets=%d", hist->nbuckets);
+
+ PG_RETURN_TEXT_P(cstring_to_text(result));
+}
+
+/*
+ * Serialize the MV histogram into a bytea value. The basic algorithm is quite
+ * simple, and mostly mimics the MCV serialization:
+ *
+ * (1) perform deduplication for each attribute (separately)
+ *
+ * (a) collect all (non-NULL) attribute values from all buckets
+ * (b) sort the data (using 'lt' from VacAttrStats)
+ * (c) remove duplicate values from the array
+ *
+ * (2) serialize the arrays into a bytea value
+ *
+ * (3) process all buckets
+ *
+ * (a) replace min/max values with indexes into the arrays
+ *
+ * Each attribute has to be processed separately, as we're mixing different
+ * datatypes, and we need to use the right operators to compare/sort them.
+ * We're also mixing pass-by-value and pass-by-ref types, and so on.
+ *
+ *
+ * FIXME This probably leaks memory, or at least uses it inefficiently
+ * (many small palloc() calls instead of a large one).
+ *
+ * TODO Consider packing boolean flags (NULL) for each item into 'char'
+ * or a longer type (instead of using an array of bool items).
+ */
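+
+/*
+ * For instance (made-up values), if dimension 0 has bucket boundaries
+ * {1, 5, 5, 8, 1, 8}, the deduplication produces the sorted array
+ * {1, 5, 8}, and each bucket then stores a uint16 index (0, 1 or 2)
+ * instead of the full Datum value for that boundary.
+ */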
+bytea *
+serialize_mv_histogram(MVHistogram histogram, int2vector *attrs,
+ VacAttrStats **stats)
+{
+ int i = 0, j = 0;
+ Size total_length = 0;
+
+ bytea *output = NULL;
+ char *data = NULL;
+
+ DimensionInfo *info;
+ SortSupport ssup;
+
+ int nbuckets = histogram->nbuckets;
+ int ndims = histogram->ndimensions;
+
+ /* allocated for serialized bucket data */
+ int bucketsize = BUCKET_SIZE(ndims);
+ char *bucket = palloc0(bucketsize);
+
+ /* values per dimension (and number of non-NULL values) */
+ Datum **values = (Datum**)palloc0(sizeof(Datum*) * ndims);
+ int *counts = (int*)palloc0(sizeof(int) * ndims);
+
+ /* info about dimensions (for deserialize) */
+ info = (DimensionInfo *)palloc0(sizeof(DimensionInfo)*ndims);
+
+ /* sort support data */
+ ssup = (SortSupport)palloc0(sizeof(SortSupportData)*ndims);
+
+ /* collect and deduplicate values for each dimension separately */
+ for (i = 0; i < ndims; i++)
+ {
+ int count;
+ StdAnalyzeData *tmp = (StdAnalyzeData *)stats[i]->extra_data;
+
+ /* keep important info about the data type */
+ info[i].typlen = stats[i]->attrtype->typlen;
+ info[i].typbyval = stats[i]->attrtype->typbyval;
+
+ /*
+ * Allocate space for all min/max values, including NULLs (we won't use
+ * them, but we don't know how many are there), and then collect all
+ * non-NULL values.
+ */
+ values[i] = (Datum*)palloc0(sizeof(Datum) * nbuckets * 2);
+
+ for (j = 0; j < histogram->nbuckets; j++)
+ {
+ /* skip buckets where this dimension is NULL-only */
+ if (! histogram->buckets[j]->nullsonly[i])
+ {
+ values[i][counts[i]] = histogram->buckets[j]->min[i];
+ counts[i] += 1;
+
+ values[i][counts[i]] = histogram->buckets[j]->max[i];
+ counts[i] += 1;
+ }
+ }
+
+ /* there are just NULL values in this dimension */
+ if (counts[i] == 0)
+ continue;
+
+ /* sort and deduplicate */
+ ssup[i].ssup_cxt = CurrentMemoryContext;
+ ssup[i].ssup_collation = DEFAULT_COLLATION_OID;
+ ssup[i].ssup_nulls_first = false;
+
+ PrepareSortSupportFromOrderingOp(tmp->ltopr, &ssup[i]);
+
+ qsort_arg(values[i], counts[i], sizeof(Datum),
+ compare_scalars_simple, &ssup[i]);
+
+ /*
+ * Walk through the array and eliminate duplicate values, but
+ * keep the ordering (so that we can do bsearch later). We know
+ * there's at least 1 item, so we can skip the first element.
+ */
+ count = 1; /* number of deduplicated items */
+ for (j = 1; j < counts[i]; j++)
+ {
+ /* if it's different from the previous value, we need to keep it */
+ if (compare_datums_simple(values[i][j-1], values[i][j], &ssup[i]) != 0)
+ {
+ /* XXX: not needed if (count == j) */
+ values[i][count] = values[i][j];
+ count += 1;
+ }
+ }
+
+ /* make sure we fit into uint16 */
+ Assert(count <= UINT16_MAX);
+
+ /* keep info about the deduplicated count */
+ info[i].nvalues = count;
+
+ /* compute size of the serialized data */
+ if (info[i].typlen > 0)
+ /* byval or byref, but with fixed length (name, tid, ...) */
+ info[i].nbytes = info[i].nvalues * info[i].typlen;
+ else if (info[i].typlen == -1)
+ /* varlena, so just use VARSIZE_ANY */
+ for (j = 0; j < info[i].nvalues; j++)
+ info[i].nbytes += VARSIZE_ANY(values[i][j]);
+ else if (info[i].typlen == -2)
+ /* cstring, so simply strlen */
+ for (j = 0; j < info[i].nvalues; j++)
+ info[i].nbytes += strlen(DatumGetPointer(values[i][j]));
+ else
+ elog(ERROR, "unknown data type typbyval=%d typlen=%d",
+ info[i].typbyval, info[i].typlen);
+ }
+
+ /*
+ * Now we finally know how much space we'll need for the serialized
+ * histogram, as it contains these fields:
+ *
+ * - length (4B) for varlena
+ * - magic (4B)
+ * - type (4B)
+ * - ndimensions (4B)
+ * - nbuckets (4B)
+ * - info (ndim * sizeof(DimensionInfo))
+ * - arrays of values for each dimension
+ * - serialized buckets (nbuckets * bucketsize)
+ *
+ * So the 'header' size is 20B + ndim * sizeof(DimensionInfo) and
+ * then we'll place the data (and buckets).
+ */
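+ /*
+ * A worked example (made-up numbers, not from an actual run): a 2-D
+ * histogram on two int4 columns, with 1024 buckets and about 1500
+ * deduplicated boundary values per dimension, needs roughly
+ *
+ *   20 + 2 * sizeof(DimensionInfo)    -- varlena length + header fields
+ *   + 2 * (1500 * 4)                  -- deduplicated value arrays
+ *   + 1024 * BUCKET_SIZE(2)           -- serialized buckets
+ *
+ * which is about 30kB, well below the 1MB limit enforced below.
+ */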
+ total_length = (sizeof(int32) + offsetof(MVHistogramData, buckets)
+ + ndims * sizeof(DimensionInfo)
+ + nbuckets * bucketsize);
+
+ /* account for the deduplicated data */
+ for (i = 0; i < ndims; i++)
+ total_length += info[i].nbytes;
+
+ /* enforce arbitrary limit of 1MB */
+ if (total_length > (1024 * 1024))
+ elog(ERROR, "serialized histogram exceeds 1MB (%ld > %d)",
+ total_length, (1024 * 1024));
+
+ /* allocate space for the serialized histogram list, set header */
+ output = (bytea*)palloc0(total_length);
+ SET_VARSIZE(output, total_length);
+
+ /* we'll use 'data' to keep track of the place to write data */
+ data = VARDATA(output);
+
+ memcpy(data, histogram, offsetof(MVHistogramData, buckets));
+ data += offsetof(MVHistogramData, buckets);
+
+ memcpy(data, info, sizeof(DimensionInfo) * ndims);
+ data += sizeof(DimensionInfo) * ndims;
+
+ /* serialize the deduplicated values for all attributes */
+ for (i = 0; i < ndims; i++)
+ {
+#ifdef USE_ASSERT_CHECKING
+ char *tmp = data;
+#endif
+ for (j = 0; j < info[i].nvalues; j++)
+ {
+ Datum v = values[i][j];
+
+ if (info[i].typbyval) /* passed by value */
+ {
+ memcpy(data, &v, info[i].typlen);
+ data += info[i].typlen;
+ }
+ else if (info[i].typlen > 0) /* passed by reference */
+ {
+ memcpy(data, DatumGetPointer(v), info[i].typlen);
+ data += info[i].typlen;
+ }
+ else if (info[i].typlen == -1) /* varlena */
+ {
+ memcpy(data, DatumGetPointer(v), VARSIZE_ANY(v));
+ data += VARSIZE_ANY(values[i][j]);
+ }
+ else if (info[i].typlen == -2) /* cstring */
+ {
+ memcpy(data, DatumGetPointer(v), strlen(DatumGetPointer(v))+1);
+ data += strlen(DatumGetPointer(v)) + 1;
+ }
+ }
+
+ /* make sure we got exactly the amount of data we expected */
+ Assert((data - tmp) == info[i].nbytes);
+ }
+
+ /* finally serialize the items, with uint16 indexes instead of the values */
+ for (i = 0; i < nbuckets; i++)
+ {
+ /* don't write beyond the allocated space */
+ Assert(data <= (char*)output + total_length - bucketsize);
+
+ /* reset the values for each item */
+ memset(bucket, 0, bucketsize);
+
+ BUCKET_NTUPLES(bucket) = histogram->buckets[i]->ntuples;
+
+ for (j = 0; j < ndims; j++)
+ {
+ /* do the lookup only for non-NULL values */
+ if (! histogram->buckets[i]->nullsonly[j])
+ {
+ uint16 idx;
+ Datum * v = NULL;
+
+ /* min boundary */
+ v = (Datum*)bsearch_arg(&histogram->buckets[i]->min[j],
+ values[j], info[j].nvalues, sizeof(Datum),
+ compare_scalars_simple, &ssup[j]);
+
+ Assert(v != NULL); /* serialization or deduplication error */
+
+ /* compute index within the array */
+ idx = (v - values[j]);
+
+ Assert((idx >= 0) && (idx < info[j].nvalues));
+
+ BUCKET_MIN_INDEXES(bucket, ndims)[j] = idx;
+
+ /* max boundary */
+ v = (Datum*)bsearch_arg(&histogram->buckets[i]->max[j],
+ values[j], info[j].nvalues, sizeof(Datum),
+ compare_scalars_simple, &ssup[j]);
+
+ Assert(v != NULL); /* serialization or deduplication error */
+
+ /* compute index within the array */
+ idx = (v - values[j]);
+
+ Assert((idx >= 0) && (idx < info[j].nvalues));
+
+ BUCKET_MAX_INDEXES(bucket, ndims)[j] = idx;
+ }
+ }
+
+ /* copy flags (nulls, min/max inclusive) */
+ memcpy(BUCKET_NULLS_ONLY(bucket, ndims),
+ histogram->buckets[i]->nullsonly, sizeof(bool) * ndims);
+
+ memcpy(BUCKET_MIN_INCL(bucket, ndims),
+ histogram->buckets[i]->min_inclusive, sizeof(bool) * ndims);
+
+ memcpy(BUCKET_MAX_INCL(bucket, ndims),
+ histogram->buckets[i]->max_inclusive, sizeof(bool) * ndims);
+
+ /* copy the item into the array */
+ memcpy(data, bucket, bucketsize);
+
+ data += bucketsize;
+ }
+
+ /* at this point we expect to match the total_length exactly */
+ Assert((data - (char*)output) == total_length);
+
+ /* free the values/counts arrays here */
+ pfree(counts);
+ pfree(info);
+ pfree(ssup);
+
+ for (i = 0; i < ndims; i++)
+ pfree(values[i]);
+
+ pfree(values);
+
+ return output;
+}
+
+/*
+ * Returns histogram in a partially-serialized form (keeps the boundary values
+ * deduplicated, so that it's possible to optimize the estimation part by
+ * caching function call results between buckets etc.).
+ */
+MVSerializedHistogram
+deserialize_mv_histogram(bytea * data)
+{
+ int i = 0, j = 0;
+
+ Size expected_size;
+ char *tmp = NULL;
+
+ MVSerializedHistogram histogram;
+ DimensionInfo *info;
+
+ int nbuckets;
+ int ndims;
+ int bucketsize;
+
+ /* temporary deserialization buffer */
+ int bufflen;
+ char *buff;
+ char *ptr;
+
+ if (data == NULL)
+ return NULL;
+
+ if (VARSIZE_ANY_EXHDR(data) < offsetof(MVSerializedHistogramData,buckets))
+ elog(ERROR, "invalid histogram size %ld (expected at least %ld)",
+ VARSIZE_ANY_EXHDR(data), offsetof(MVSerializedHistogramData,buckets));
+
+ /* read the histogram header */
+ histogram
+ = (MVSerializedHistogram)palloc(sizeof(MVSerializedHistogramData));
+
+ /* initialize pointer to the data part (skip the varlena header) */
+ tmp = VARDATA(data);
+
+ /* get the header and perform basic sanity checks */
+ memcpy(histogram, tmp, offsetof(MVSerializedHistogramData, buckets));
+ tmp += offsetof(MVSerializedHistogramData, buckets);
+
+ if (histogram->magic != MVSTAT_HIST_MAGIC)
+ elog(ERROR, "invalid histogram magic %d (expected %dd)",
+ histogram->magic, MVSTAT_HIST_MAGIC);
+
+ if (histogram->type != MVSTAT_HIST_TYPE_BASIC)
+ elog(ERROR, "invalid histogram type %d (expected %dd)",
+ histogram->type, MVSTAT_HIST_TYPE_BASIC);
+
+ nbuckets = histogram->nbuckets;
+ ndims = histogram->ndimensions;
+ bucketsize = BUCKET_SIZE(ndims);
+
+ Assert((nbuckets > 0) && (nbuckets <= MVSTAT_HIST_MAX_BUCKETS));
+ Assert((ndims >= 2) && (ndims <= MVSTATS_MAX_DIMENSIONS));
+
+ /*
+ * Compute the size we expect for these parameters. It's incomplete at
+ * this point, as we still have to add the value array sizes (from the
+ * DimensionInfo records).
+ */
+ expected_size = offsetof(MVSerializedHistogramData,buckets) +
+ ndims * sizeof(DimensionInfo) +
+ (nbuckets * bucketsize);
+
+ /* check that we have at least the DimensionInfo records */
+ if (VARSIZE_ANY_EXHDR(data) < expected_size)
+ elog(ERROR, "invalid histogram size %ld (expected %ld)",
+ VARSIZE_ANY_EXHDR(data), expected_size);
+
+ info = (DimensionInfo*)(tmp);
+ tmp += ndims * sizeof(DimensionInfo);
+
+ /* account for the value arrays */
+ for (i = 0; i < ndims; i++)
+ expected_size += info[i].nbytes;
+
+ if (VARSIZE_ANY_EXHDR(data) != expected_size)
+ elog(ERROR, "invalid histogram size %ld (expected %ld)",
+ VARSIZE_ANY_EXHDR(data), expected_size);
+
+ /* looks OK - not corrupted or something */
+
+ /* a single buffer for all the values and counts */
+ bufflen = (sizeof(int) + sizeof(Datum*)) * ndims;
+
+ for (i = 0; i < ndims; i++)
+ /* no extra space needed for byval types with length matching Datum */
+ if (! (info[i].typbyval && (info[i].typlen == sizeof(Datum))))
+ bufflen += (sizeof(Datum) * info[i].nvalues);
+
+ /* also, include space for the result, tracking the buckets */
+ bufflen += nbuckets * (
+ sizeof(MVSerializedBucket) + /* bucket pointer */
+ sizeof(MVSerializedBucketData)); /* bucket data */
+
+ buff = palloc0(bufflen);
+ ptr = buff;
+
+ histogram->nvalues = (int*)ptr;
+ ptr += (sizeof(int) * ndims);
+
+ histogram->values = (Datum**)ptr;
+ ptr += (sizeof(Datum*) * ndims);
+
+ /*
+ * FIXME This uses pointers to the original data array (the types
+ * not passed by value), so when someone frees the memory,
+ * e.g. by doing something like this:
+ *
+ * bytea * data = ... fetch the data from catalog ...
+ * MCVList mcvlist = deserialize_mcv_list(data);
+ * pfree(data);
+ *
+ * then 'mcvlist' references the freed memory. This needs to
+ * copy the pieces.
+ *
+ * TODO same as in MCV deserialization / consider moving to common.c
+ */
+ for (i = 0; i < ndims; i++)
+ {
+ histogram->nvalues[i] = info[i].nvalues;
+
+ if (info[i].typbyval)
+ {
+ /* passed by value / Datum - simply reuse the array */
+ if (info[i].typlen == sizeof(Datum))
+ {
+ histogram->values[i] = (Datum*)tmp;
+ tmp += info[i].nbytes;
+ }
+ else
+ {
+ histogram->values[i] = (Datum*)ptr;
+ ptr += (sizeof(Datum) * info[i].nvalues);
+
+ for (j = 0; j < info[i].nvalues; j++)
+ {
+ /* copy the value out of the serialized array */
+ memcpy(&histogram->values[i][j], tmp, info[i].typlen);
+ tmp += info[i].typlen;
+ }
+ }
+ }
+ else
+ {
+ /* all the other types need a chunk of the buffer */
+ histogram->values[i] = (Datum*)ptr;
+ ptr += (sizeof(Datum) * info[i].nvalues);
+
+ if (info[i].typlen > 0)
+ {
+ /* passed by reference, but fixed length (name, tid, ...) */
+ for (j = 0; j < info[i].nvalues; j++)
+ {
+ /* just point into the array */
+ histogram->values[i][j] = PointerGetDatum(tmp);
+ tmp += info[i].typlen;
+ }
+ }
+ else if (info[i].typlen == -1)
+ {
+ /* varlena */
+ for (j = 0; j < info[i].nvalues; j++)
+ {
+ /* just point into the array */
+ histogram->values[i][j] = PointerGetDatum(tmp);
+ tmp += VARSIZE_ANY(tmp);
+ }
+ }
+ else if (info[i].typlen == -2)
+ {
+ /* cstring */
+ for (j = 0; j < info[i].nvalues; j++)
+ {
+ /* just point into the array */
+ histogram->values[i][j] = PointerGetDatum(tmp);
+ tmp += (strlen(tmp) + 1); /* don't forget the \0 */
+ }
+ }
+ }
+ }
+
+ histogram->buckets = (MVSerializedBucket*)ptr;
+ ptr += (sizeof(MVSerializedBucket) * nbuckets);
+
+ for (i = 0; i < nbuckets; i++)
+ {
+ MVSerializedBucket bucket = (MVSerializedBucket)ptr;
+ ptr += sizeof(MVSerializedBucketData);
+
+ bucket->ntuples = BUCKET_NTUPLES(tmp);
+ bucket->nullsonly = BUCKET_NULLS_ONLY(tmp, ndims);
+ bucket->min_inclusive = BUCKET_MIN_INCL(tmp, ndims);
+ bucket->max_inclusive = BUCKET_MAX_INCL(tmp, ndims);
+
+ bucket->min = BUCKET_MIN_INDEXES(tmp, ndims);
+ bucket->max = BUCKET_MAX_INDEXES(tmp, ndims);
+
+ histogram->buckets[i] = bucket;
+
+ Assert(tmp <= (char*)data + VARSIZE_ANY(data));
+
+ tmp += bucketsize;
+ }
+
+ /* at this point we expect to match the total_length exactly */
+ Assert((tmp - VARDATA(data)) == expected_size);
+
+ /* we should exhaust the output buffer exactly */
+ Assert((ptr - buff) == bufflen);
+
+ return histogram;
+}
+
+/*
+ * Build the initial bucket, which will be then split into smaller ones.
+ */
+static MVBucket
+create_initial_mv_bucket(int numrows, HeapTuple *rows, int2vector *attrs,
+ VacAttrStats **stats)
+{
+ int i;
+ int numattrs = attrs->dim1;
+ HistogramBuild data = NULL;
+
+ /* TODO allocate bucket as a single piece, including all the fields. */
+ MVBucket bucket = (MVBucket)palloc0(sizeof(MVBucketData));
+
+ Assert(numrows > 0);
+ Assert(rows != NULL);
+ Assert((numattrs >= 2) && (numattrs <= MVSTATS_MAX_DIMENSIONS));
+
+ /* allocate the per-dimension arrays */
+
+ /* flags for null-only dimensions */
+ bucket->nullsonly = (bool*)palloc0(numattrs * sizeof(bool));
+
+ /* inclusiveness boundaries - lower/upper bounds */
+ bucket->min_inclusive = (bool*)palloc0(numattrs * sizeof(bool));
+ bucket->max_inclusive = (bool*)palloc0(numattrs * sizeof(bool));
+
+ /* lower/upper boundaries */
+ bucket->min = (Datum*)palloc0(numattrs * sizeof(Datum));
+ bucket->max = (Datum*)palloc0(numattrs * sizeof(Datum));
+
+ /* build-data */
+ data = (HistogramBuild)palloc0(sizeof(HistogramBuildData));
+
+ /* number of distinct values (per dimension) */
+ data->ndistincts = (uint32*)palloc0(numattrs * sizeof(uint32));
+
+ /* all the sample rows fall into the initial bucket */
+ data->numrows = numrows;
+ data->rows = rows;
+
+ bucket->build_data = data;
+
+ /*
+ * Update the number of distinct combinations in the bucket (which we use
+ * when selecting the bucket to partition), and then the number of distinct
+ * values for each dimension (which we use when choosing which dimension
+ * to split).
+ */
+ update_bucket_ndistinct(bucket, attrs, stats);
+
+ /* Update ndistinct (and also set min/max) for all dimensions. */
+ for (i = 0; i < numattrs; i++)
+ update_dimension_ndistinct(bucket, i, attrs, stats, true);
+
+ return bucket;
+}
+
+/*
+ * Choose the bucket to partition next.
+ *
+ * The current criterion is rather simple, chosen so that the algorithm produces
+ * buckets with about equal frequency and regular size. We select the bucket
+ * with the highest number of distinct values, and then split it by the longest
+ * dimension.
+ *
+ * The distinct values are uniformly mapped to [0,1] interval, and this is used
+ * to compute length of the value range.
+ *
+ * NOTE: This is not the same array used for deduplication, as this contains
+ * values for all the tuples from the sample, not just the boundary values.
+ *
+ * Returns either pointer to the bucket selected to be partitioned, or NULL if
+ * there are no buckets that may be split (e.g. if all buckets are too small
+ * or contain too few distinct values).
+ *
+ *
+ * Tricky example
+ * --------------
+ *
+ * Consider this table:
+ *
+ * CREATE TABLE t AS SELECT i AS a, i AS b
+ * FROM generate_series(1,1000000) s(i);
+ *
+ * CREATE STATISTICS s1 ON t (a,b) WITH (histogram);
+ *
+ * ANALYZE t;
+ *
+ * It's a very specific (and perhaps artificial) example, because every bucket
+ * always has exactly the same number of distinct values in all dimensions,
+ * which makes the partitioning tricky.
+ *
+ * Then:
+ *
+ * SELECT * FROM t WHERE (a < 100) AND (b < 100);
+ *
+ * is estimated to return ~120 rows, while in reality it returns only 99.
+ *
+ * QUERY PLAN
+ * -------------------------------------------------------------
+ * Seq Scan on t (cost=0.00..19425.00 rows=117 width=8)
+ * (actual time=0.129..82.776 rows=99 loops=1)
+ * Filter: ((a < 100) AND (b < 100))
+ * Rows Removed by Filter: 999901
+ * Planning time: 1.286 ms
+ * Execution time: 82.984 ms
+ * (5 rows)
+ *
+ * So this estimate is reasonably close. Let's change the query to OR clause:
+ *
+ * SELECT * FROM t WHERE (a < 100) OR (b < 100);
+ *
+ * QUERY PLAN
+ * -------------------------------------------------------------
+ * Seq Scan on t (cost=0.00..19425.00 rows=8100 width=8)
+ * (actual time=0.145..99.910 rows=99 loops=1)
+ * Filter: ((a < 100) OR (b < 100))
+ * Rows Removed by Filter: 999901
+ * Planning time: 1.578 ms
+ * Execution time: 100.132 ms
+ * (5 rows)
+ *
+ * That's clearly a much worse estimate. This happens because the histogram
+ * contains buckets like this:
+ *
+ * bucket 592 [3 30310] [30134 30593] => [0.000233]
+ *
+ * i.e. the length of "a" dimension is (30310-3)=30307, while the length of "b"
+ * is (30593-30134)=459. So the "b" dimension is much narrower than "a".
+ * Of course, there are also buckets where "b" is the wider dimension.
+ *
+ * This is partially mitigated by selecting the "longest" dimension but that
+ * only happens after we already selected the bucket. So if we never select the
+ * bucket, this optimization does not apply.
+ *
+ * The other reason why this particular example behaves so poorly is the way
+ * we actually split the selected bucket. We do attempt to divide the bucket
+ * into two parts containing about the same number of tuples, but that does
+ * not work too well when most of the tuples are squashed on one side of the
+ * bucket.
+ *
+ * For example, for columns with data on the diagonal (i.e. when a=b), we end
+ * up with a narrow bucket on the diagonal and a huge bucket covering the
+ * remaining part (with much lower density).
+ *
+ * So perhaps we need two partitioning strategies - one aiming to split buckets
+ * with high frequency (number of sampled rows), the other aiming to split
+ * "large" buckets. And alternating between them, somehow.
+ *
+ * TODO Consider using similar lower boundary for row count as for simple
+ * histograms, i.e. 300 tuples per bucket.
+ */
+static MVBucket
+select_bucket_to_partition(int nbuckets, MVBucket * buckets)
+{
+ int i;
+ int numrows = 0;
+ MVBucket bucket = NULL;
+
+ for (i = 0; i < nbuckets; i++)
+ {
+ HistogramBuild data = (HistogramBuild)buckets[i]->build_data;
+
+ /* if the number of rows is higher, use this bucket */
+ if ((data->ndistinct > 2) &&
+ (data->numrows > numrows) &&
+ (data->numrows >= MIN_BUCKET_ROWS)) {
+ bucket = buckets[i];
+ numrows = data->numrows;
+ }
+ }
+
+ /* may be NULL if there are no buckets eligible for partitioning */
+ return bucket;
+}
+
+/*
+ * A simple bucket partitioning implementation - we choose the longest bucket
+ * dimension, measured using the array of distinct values built at the very
+ * beginning of the build.
+ *
+ * We map all the distinct values to a [0,1] interval, uniformly distributed,
+ * and then use this to measure length. It's essentially a number of distinct
+ * values within the range, normalized to [0,1].
+ *
+ * Then we choose a 'middle' value splitting the bucket into two parts with
+ * roughly the same frequency.
+ *
+ * This splits the bucket by tweaking the existing one, and returning the new
+ * bucket (essentially shrinking the existing one in-place and returning the
+ * other "half" as a new bucket). The caller is responsible for adding the new
+ * bucket into the list of buckets.
+ *
+ * There are multiple histogram options, centered around the partitioning
+ * criteria, specifying both how to choose a bucket and the dimension most in
+ * need of a split. For a nice summary and general overview, see "rK-Hist : an
+ * R-Tree based histogram for multi-dimensional selectivity estimation" thesis
+ * by J. A. Lopez, Concordia University, p.34-37 (and possibly p. 32-34 for
+ * explanation of the terms).
+ *
+ * It requires care to prevent splitting only one dimension and not splitting
+ * another one at all (which might happen easily in case of strongly dependent
+ * columns - e.g. y=x). The current algorithm minimizes this, but may still
+ * happen for perfectly dependent examples (when all the dimensions have equal
+ * length, the first one will be selected).
+ *
+ * TODO Should probably consider statistics target for the columns (e.g.
+ * to split dimensions with higher statistics target more frequently).
+ */
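+
+/*
+ * A small illustration of the length measure (hypothetical numbers): if a
+ * dimension has 1000 distinct values in the whole sample, and the bucket
+ * min/max boundaries bsearch to indexes 200 and 700 in the sorted array of
+ * distinct values, the normalized length of that dimension is
+ * (700 - 200) / 1000.0 = 0.5, regardless of the actual datum values or
+ * their datatype.
+ */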
+static MVBucket
+partition_bucket(MVBucket bucket, int2vector *attrs,
+ VacAttrStats **stats,
+ int *ndistvalues, Datum **distvalues)
+{
+ int i;
+ int dimension;
+ int numattrs = attrs->dim1;
+
+ Datum split_value;
+ MVBucket new_bucket;
+ HistogramBuild new_data;
+
+ /* needed for sort, when looking for the split value */
+ bool isNull;
+ int nvalues = 0;
+ HistogramBuild data = (HistogramBuild)bucket->build_data;
+ StdAnalyzeData * mystats = NULL;
+ ScalarItem * values = (ScalarItem*)palloc0(data->numrows * sizeof(ScalarItem));
+ SortSupportData ssup;
+
+ int nrows = 1; /* number of rows below current value */
+ double delta;
+
+ /* needed when splitting the values */
+ HeapTuple * oldrows = data->rows;
+ int oldnrows = data->numrows;
+
+ /*
+ * We can't split buckets with a single distinct value (this also
+ * disqualifies NULL-only dimensions). Also, there have to be multiple
+ * sample rows (otherwise there couldn't be multiple distinct values).
+ */
+ Assert(data->ndistinct > 1);
+ Assert(data->numrows > 1);
+ Assert((numattrs >= 2) && (numattrs <= MVSTATS_MAX_DIMENSIONS));
+
+ /* Look for the next dimension to split. */
+ delta = 0.0;
+ dimension = -1;
+
+ for (i = 0; i < numattrs; i++)
+ {
+ Datum *a, *b;
+
+ mystats = (StdAnalyzeData *) stats[i]->extra_data;
+
+ /* initialize sort support, etc. */
+ memset(&ssup, 0, sizeof(ssup));
+ ssup.ssup_cxt = CurrentMemoryContext;
+
+ /* We always use the default collation for statistics */
+ ssup.ssup_collation = DEFAULT_COLLATION_OID;
+ ssup.ssup_nulls_first = false;
+
+ PrepareSortSupportFromOrderingOp(mystats->ltopr, &ssup);
+
+ /* can't split NULL-only dimension */
+ if (bucket->nullsonly[i])
+ continue;
+
+ /* can't split dimension with a single ndistinct value */
+ if (data->ndistincts[i] <= 1)
+ continue;
+
+ /* search for min boundary in the distinct list */
+ a = (Datum*)bsearch_arg(&bucket->min[i],
+ distvalues[i], ndistvalues[i],
+ sizeof(Datum), compare_scalars_simple, &ssup);
+
+ b = (Datum*)bsearch_arg(&bucket->max[i],
+ distvalues[i], ndistvalues[i],
+ sizeof(Datum), compare_scalars_simple, &ssup);
+
+ /* if this dimension is 'longer', select it for partitioning */
+ if (((b-a)*1.0 / ndistvalues[i]) > delta)
+ {
+ delta = ((b-a)*1.0 / ndistvalues[i]);
+ dimension = i;
+ }
+ }
+
+ /*
+ * If we haven't found a dimension here, we've done something
+ * wrong in select_bucket_to_partition.
+ */
+ Assert(dimension != -1);
+
+ /*
+ * Walk through the selected dimension, collect and sort the values and
+ * then choose the value to use as the new boundary.
+ */
+ mystats = (StdAnalyzeData *) stats[dimension]->extra_data;
+
+ /* initialize sort support, etc. */
+ memset(&ssup, 0, sizeof(ssup));
+ ssup.ssup_cxt = CurrentMemoryContext;
+
+ /* We always use the default collation for statistics */
+ ssup.ssup_collation = DEFAULT_COLLATION_OID;
+ ssup.ssup_nulls_first = false;
+
+ PrepareSortSupportFromOrderingOp(mystats->ltopr, &ssup);
+
+ for (i = 0; i < data->numrows; i++)
+ {
+ /* remember the index of the sample row, to make the partitioning simpler */
+ values[nvalues].value = heap_getattr(data->rows[i], attrs->values[dimension],
+ stats[dimension]->tupDesc, &isNull);
+ values[nvalues].tupno = i;
+
+ /* no NULL values allowed here (we never split null-only dimension) */
+ Assert(!isNull);
+
+ nvalues++;
+ }
+
+ /* sort the array of values */
+ qsort_arg((void *) values, nvalues, sizeof(ScalarItem),
+ compare_scalars_partition, (void *) &ssup);
+
+ /*
+ * We know there are data->ndistincts[dimension] distinct values in this
+ * dimension, and we want to split the bucket in half, so walk through the
+ * array and stop once we see (ndistinct/2) values.
+ *
+ * We always choose the "next" value, i.e. (n/2+1)-th distinct value, and
+ * use it as an exclusive upper boundary (and inclusive lower boundary).
+ *
+ * TODO Maybe we should use "average" of the two middle distinct values
+ * (at least for even distinct counts), but that would require being
+ * able to do an average (which does not work for non-numeric types).
+ *
+ * TODO Another option is to look for a split that'd give about 50% tuples
+ * (not distinct values) in each partition. That might work better
+ * when there are a few very frequent values, and many rare ones.
+ */
+ delta = fabs(data->numrows);
+ split_value = values[0].value;
+
+ for (i = 1; i < data->numrows; i++)
+ {
+ if (values[i].value != values[i-1].value)
+ {
+ /* are we closer to splitting the bucket in half? */
+ if (fabs(i - data->numrows/2.0) < delta)
+ {
+ /* let's assume we'll use this value for the split */
+ split_value = values[i].value;
+ delta = fabs(i - data->numrows/2.0);
+ nrows = i;
+ }
+ }
+ }
+
+ Assert(nrows > 0);
+ Assert(nrows < data->numrows);
+
+ /* create the new bucket as an (incomplete) copy of the one being partitioned */
+ new_bucket = copy_mv_bucket(bucket, numattrs);
+ new_data = (HistogramBuild)new_bucket->build_data;
+
+ /*
+ * Do the actual split of the chosen dimension, using the split value as the
+ * upper bound for the existing bucket, and lower bound for the new one.
+ */
+ bucket->max[dimension] = split_value;
+ new_bucket->min[dimension] = split_value;
+
+ bucket->max_inclusive[dimension] = false;
+ new_bucket->min_inclusive[dimension] = true;
+
+ /*
+ * Redistribute the sample tuples using the 'ScalarItem->tupno' index. We
+ * know 'nrows' rows should remain in the original bucket and the rest goes
+ * to the new one.
+ */
+
+ data->rows = (HeapTuple*)palloc0(nrows * sizeof(HeapTuple));
+ new_data->rows = (HeapTuple*)palloc0((oldnrows - nrows) * sizeof(HeapTuple));
+
+ data->numrows = nrows;
+ new_data->numrows = (oldnrows - nrows);
+
+ /*
+ * The first nrows should go to the first bucket, the rest should go to the
+ * new one. Use the tupno field to get the actual HeapTuple row from the
+ * original array of sample rows.
+ */
+ for (i = 0; i < nrows; i++)
+ memcpy(&data->rows[i], &oldrows[values[i].tupno], sizeof(HeapTuple));
+
+ for (i = nrows; i < oldnrows; i++)
+ memcpy(&new_data->rows[i-nrows], &oldrows[values[i].tupno], sizeof(HeapTuple));
+
+ /* update ndistinct values for the buckets (total and per dimension) */
+ update_bucket_ndistinct(bucket, attrs, stats);
+ update_bucket_ndistinct(new_bucket, attrs, stats);
+
+ /*
+ * TODO We don't need to do this for the dimension we used for split,
+ * because we know how many distinct values went to each partition.
+ */
+ for (i = 0; i < numattrs; i++)
+ {
+ update_dimension_ndistinct(bucket, i, attrs, stats, false);
+ update_dimension_ndistinct(new_bucket, i, attrs, stats, false);
+ }
+
+ pfree(oldrows);
+ pfree(values);
+
+ return new_bucket;
+}
+
+/*
+ * Copy a histogram bucket. The copy does not include the build-time data, i.e.
+ * sampled rows etc.
+ */
+static MVBucket
+copy_mv_bucket(MVBucket bucket, uint32 ndimensions)
+{
+ /* TODO allocate as a single piece (including all the fields) */
+ MVBucket new_bucket = (MVBucket)palloc0(sizeof(MVBucketData));
+ HistogramBuild data = (HistogramBuild)palloc0(sizeof(HistogramBuildData));
+
+ /* Copy only the attributes that will stay the same after the split, and
+ * we'll recompute the rest after the split. */
+
+ /* allocate the per-dimension arrays */
+ new_bucket->nullsonly = (bool*)palloc0(ndimensions * sizeof(bool));
+
+ /* inclusiveness boundaries - lower/upper bounds */
+ new_bucket->min_inclusive = (bool*)palloc0(ndimensions * sizeof(bool));
+ new_bucket->max_inclusive = (bool*)palloc0(ndimensions * sizeof(bool));
+
+ /* lower/upper boundaries */
+ new_bucket->min = (Datum*)palloc0(ndimensions * sizeof(Datum));
+ new_bucket->max = (Datum*)palloc0(ndimensions * sizeof(Datum));
+
+ /* copy data */
+ memcpy(new_bucket->nullsonly, bucket->nullsonly, ndimensions * sizeof(bool));
+
+ memcpy(new_bucket->min_inclusive, bucket->min_inclusive, ndimensions*sizeof(bool));
+ memcpy(new_bucket->min, bucket->min, ndimensions*sizeof(Datum));
+
+ memcpy(new_bucket->max_inclusive, bucket->max_inclusive, ndimensions*sizeof(bool));
+ memcpy(new_bucket->max, bucket->max, ndimensions*sizeof(Datum));
+
+ /* allocate and copy the interesting part of the build data */
+ data->ndistincts = (uint32*)palloc0(ndimensions * sizeof(uint32));
+
+ new_bucket->build_data = data;
+
+ return new_bucket;
+}
+
+/*
+ * Counts the number of distinct value combinations in the bucket. The values
+ * are collected into an array of sort items, sorted using a multi-column
+ * comparator, and then adjacent items are compared to count the distinct
+ * combinations.
+ */
+static void
+update_bucket_ndistinct(MVBucket bucket, int2vector *attrs, VacAttrStats ** stats)
+{
+ int i, j;
+ int numattrs = attrs->dim1;
+
+ HistogramBuild data = (HistogramBuild)bucket->build_data;
+ int numrows = data->numrows;
+
+ MultiSortSupport mss = multi_sort_init(numattrs);
+
+ /*
+ * XXX We could collect this elsewhere, while already walking through
+ * all the attributes, to avoid calling heap_getattr twice per value.
+ */
+ SortItem *items = (SortItem*)palloc0(numrows * sizeof(SortItem));
+ Datum *values = (Datum*)palloc0(numrows * sizeof(Datum) * numattrs);
+ bool *isnull = (bool*)palloc0(numrows * sizeof(bool) * numattrs);
+
+ for (i = 0; i < numrows; i++)
+ {
+ items[i].values = &values[i * numattrs];
+ items[i].isnull = &isnull[i * numattrs];
+ }
+
+ /* add all the dimensions to the sort */
+ for (i = 0; i < numattrs; i++)
+ multi_sort_add_dimension(mss, i, i, stats);
+
+ /* collect the values */
+ for (i = 0; i < numrows; i++)
+ for (j = 0; j < numattrs; j++)
+ items[i].values[j]
+ = heap_getattr(data->rows[i], attrs->values[j],
+ stats[j]->tupDesc, &items[i].isnull[j]);
+
+ qsort_arg((void *) items, numrows, sizeof(SortItem),
+ multi_sort_compare, mss);
+
+ data->ndistinct = 1;
+
+ for (i = 1; i < numrows; i++)
+ if (multi_sort_compare(&items[i], &items[i-1], mss) != 0)
+ data->ndistinct += 1;
+
+ pfree(items);
+ pfree(values);
+ pfree(isnull);
+}
+
+/*
+ * Count distinct values per bucket dimension.
+ */
+static void
+update_dimension_ndistinct(MVBucket bucket, int dimension, int2vector *attrs,
+ VacAttrStats ** stats, bool update_boundaries)
+{
+ int j;
+ int nvalues = 0;
+ bool isNull;
+ HistogramBuild data = (HistogramBuild)bucket->build_data;
+ Datum * values = (Datum*)palloc0(data->numrows * sizeof(Datum));
+ SortSupportData ssup;
+
+ StdAnalyzeData * mystats = (StdAnalyzeData *) stats[dimension]->extra_data;
+
+ /* we may already know this is a NULL-only dimension */
+ if (bucket->nullsonly[dimension])
+ data->ndistincts[dimension] = 1;
+
+ memset(&ssup, 0, sizeof(ssup));
+ ssup.ssup_cxt = CurrentMemoryContext;
+
+ /* We always use the default collation for statistics */
+ ssup.ssup_collation = DEFAULT_COLLATION_OID;
+ ssup.ssup_nulls_first = false;
+
+ PrepareSortSupportFromOrderingOp(mystats->ltopr, &ssup);
+
+ for (j = 0; j < data->numrows; j++)
+ {
+ values[nvalues] = heap_getattr(data->rows[j], attrs->values[dimension],
+ stats[dimension]->tupDesc, &isNull);
+
+ /* ignore NULL values */
+ if (! isNull)
+ nvalues++;
+ }
+
+ /* there's always at least 1 distinct value (may be NULL) */
+ data->ndistincts[dimension] = 1;
+
+ /*
+ * If there are only NULL values in the column, mark the dimension as
+ * NULL-only and bail out (the caller continues with the next dimension).
+ */
+ if (nvalues == 0)
+ {
+ pfree(values);
+ bucket->nullsonly[dimension] = true;
+ return;
+ }
+
+ /* sort the array of (pass-by-value) Datum values */
+ qsort_arg((void *) values, nvalues, sizeof(Datum),
+ compare_scalars_simple, (void *) &ssup);
+
+ /*
+ * Update min/max boundaries to the smallest bounding box. Generally, this
+ * needs to be done only when constructing the initial bucket.
+ */
+ if (update_boundaries)
+ {
+ /* store the min/max values */
+ bucket->min[dimension] = values[0];
+ bucket->min_inclusive[dimension] = true;
+
+ bucket->max[dimension] = values[nvalues-1];
+ bucket->max_inclusive[dimension] = true;
+ }
+
+ /*
+ * Walk through the array and count distinct values by comparing
+ * succeeding values.
+ *
+ * FIXME This only works for pass-by-value types (i.e. not VARCHARs
+ * etc.). Although thanks to the deduplication it might work
+ * even for those types (equal values will get the same item
+ * in the deduplicated array).
+ */
+ for (j = 1; j < nvalues; j++) {
+ if (values[j] != values[j-1])
+ data->ndistincts[dimension] += 1;
+ }
+
+ pfree(values);
+}
+
+/*
+ * A properly built histogram must not contain buckets mixing NULL and non-NULL
+ * values in a single dimension. Each dimension may either be marked as
+ * NULL-only (and thus contain only NULL values), or it must not contain
+ * any NULL values at all.
+ *
+ * Therefore, if the sample contains NULL values in any of the columns, it's
+ * necessary to build those NULL-buckets. This is done in an iterative way
+ * using this algorithm, operating on a single bucket:
+ *
+ * (1) Check that all dimensions are well-formed (not mixing NULL and
+ * non-NULL values).
+ *
+ * (2) If all dimensions are well-formed, terminate.
+ *
+ * (3) If the dimension contains only NULL values, but is not marked as
+ * NULL-only, mark it as NULL-only and run the algorithm again (on
+ * this bucket).
+ *
+ * (4) If the dimension mixes NULL and non-NULL values, split the bucket
+ * into two parts - one with NULL values, one with non-NULL values
+ * (replacing the current one). Then run the algorithm on both buckets.
+ *
+ * This is executed in a recursive manner, but the number of executions should
+ * be quite low - limited by the number of NULL-buckets. Also, in each branch
+ * the number of nested calls is limited by the number of dimensions
+ * (attributes) of the histogram.
+ *
+ * At the end, there should be buckets with no mixed dimensions. The number of
+ * buckets produced by this algorithm is rather limited - with N dimensions,
+ * there may be only 2^N such buckets (each dimension may be either NULL or
+ * non-NULL). So with 8 dimensions (current value of MVSTATS_MAX_DIMENSIONS)
+ * there may be only 256 such buckets.
+ *
+ * After this, a 'regular' bucket-split algorithm shall run, further optimizing
+ * the histogram.
+ */
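+
+/*
+ * For example (hypothetical sample), with two columns (a,b) that both
+ * contain NULL values, the initial bucket ends up split into up to four
+ * buckets:
+ *
+ *   (a non-NULL,  b non-NULL)
+ *   (a non-NULL,  b NULL-only)
+ *   (a NULL-only, b non-NULL)
+ *   (a NULL-only, b NULL-only)
+ *
+ * where combinations not present in the sample are simply not created.
+ */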
+static void
+create_null_buckets(MVHistogram histogram, int bucket_idx,
+ int2vector *attrs, VacAttrStats ** stats)
+{
+ int i, j;
+ int null_dim = -1;
+ int null_count = 0;
+ bool null_found = false;
+ MVBucket bucket, null_bucket;
+ int null_idx, curr_idx;
+ HistogramBuild data, null_data;
+
+ /* remember original values from the bucket */
+ int numrows;
+ HeapTuple *oldrows = NULL;
+
+ Assert(bucket_idx < histogram->nbuckets);
+ Assert(histogram->ndimensions == attrs->dim1);
+
+ bucket = histogram->buckets[bucket_idx];
+ data = (HistogramBuild)bucket->build_data;
+
+ numrows = data->numrows;
+ oldrows = data->rows;
+
+ /*
+ * Walk through all rows / dimensions, and stop once we find NULL in a
+ * dimension not yet marked as NULL-only.
+ */
+ for (i = 0; i < data->numrows; i++)
+ {
+ /*
+ * FIXME We don't need to start from the first attribute here - we can
+ * start from the last known dimension.
+ */
+ for (j = 0; j < histogram->ndimensions; j++)
+ {
+ /* Is this a NULL-only dimension? If yes, skip. */
+ if (bucket->nullsonly[j])
+ continue;
+
+ /* found a NULL in that dimension? */
+ if (heap_attisnull(data->rows[i], attrs->values[j]))
+ {
+ null_found = true;
+ null_dim = j;
+ break;
+ }
+ }
+
+ /* terminate if we found attribute with NULL values */
+ if (null_found)
+ break;
+ }
+
+ /* no regular dimension contains NULL values => we're done */
+ if (! null_found)
+ return;
+
+ /* walk through the rows again, count NULL values in 'null_dim' */
+ for (i = 0; i < data->numrows; i++)
+ {
+ if (heap_attisnull(data->rows[i], attrs->values[null_dim]))
+ null_count += 1;
+ }
+
+ Assert(null_count <= data->numrows);
+
+ /*
+ * If (null_count == numrows) the dimension already is NULL-only, but is
+ * not yet marked like that. It's enough to mark it and repeat the process
+ * recursively (until we run out of dimensions).
+ */
+ if (null_count == data->numrows)
+ {
+ bucket->nullsonly[null_dim] = true;
+ create_null_buckets(histogram, bucket_idx, attrs, stats);
+ return;
+ }
+
+ /*
+ * We have to split the bucket into two - one with NULL values in the
+ * dimension, one with non-NULL values. We don't need to sort the data or
+ * anything, but otherwise it's similar to what partition_bucket() does.
+ */
+
+ /* create bucket with NULL-only dimension 'dim' */
+ null_bucket = copy_mv_bucket(bucket, histogram->ndimensions);
+ null_data = (HistogramBuild)null_bucket->build_data;
+
+ /* remember the current array info */
+ oldrows = data->rows;
+ numrows = data->numrows;
+
+ /* we'll keep non-NULL values in the current bucket */
+ data->numrows = (numrows - null_count);
+ data->rows
+ = (HeapTuple*)palloc0(data->numrows * sizeof(HeapTuple));
+
+ /* and the NULL values will go to the new one */
+ null_data->numrows = null_count;
+ null_data->rows
+ = (HeapTuple*)palloc0(null_data->numrows * sizeof(HeapTuple));
+
+ /* mark the dimension as NULL-only (in the new bucket) */
+ null_bucket->nullsonly[null_dim] = true;
+
+ /* walk through the sample rows and distribute them accordingly */
+ null_idx = 0;
+ curr_idx = 0;
+ for (i = 0; i < numrows; i++)
+ {
+ if (heap_attisnull(oldrows[i], attrs->values[null_dim]))
+ /* NULL => copy to the new bucket */
+ memcpy(&null_data->rows[null_idx++], &oldrows[i],
+ sizeof(HeapTuple));
+ else
+ memcpy(&data->rows[curr_idx++], &oldrows[i],
+ sizeof(HeapTuple));
+ }
+
+ /* update ndistinct values for the buckets (total and per dimension) */
+ update_bucket_ndistinct(bucket, attrs, stats);
+ update_bucket_ndistinct(null_bucket, attrs, stats);
+
+ /*
+ * TODO We don't need to do this for the dimension we used for split,
+ * because we know how many distinct values went to each bucket (NULL
+ * is not a value, so NULL buckets get 0, and the other bucket got all
+ * the distinct values).
+ */
+ for (i = 0; i < histogram->ndimensions; i++)
+ {
+ update_dimension_ndistinct(bucket, i, attrs, stats, false);
+ update_dimension_ndistinct(null_bucket, i, attrs, stats, false);
+ }
+
+ pfree(oldrows);
+
+ /* add the NULL bucket to the histogram */
+ histogram->buckets[histogram->nbuckets++] = null_bucket;
+
+ /*
+ * And now run the function recursively on both buckets (the new
+ * one first, because the call may change number of buckets, and
+ * it's used as an index).
+ */
+ create_null_buckets(histogram, (histogram->nbuckets-1), attrs, stats);
+ create_null_buckets(histogram, bucket_idx, attrs, stats);
+}
+
+/*
+ * SRF with details about buckets of a histogram:
+ *
+ * - bucket ID (0 .. nbuckets-1)
+ * - min values (string array)
+ * - max values (string array)
+ * - nulls only (boolean array)
+ * - min inclusive flags (boolean array)
+ * - max inclusive flags (boolean array)
+ * - frequency (double precision)
+ *
+ * The input is the OID of the statistics, and there are no rows returned if the
+ * statistics contains no histogram (or if there's no statistics for the OID).
+ *
+ * The second parameter (type) determines what values will be returned
+ * in the (minvals,maxvals). There are three possible values:
+ *
+ * 0 (actual values)
+ * -----------------
+ * - prints actual values
+ * - using the output function of the data type (as string)
+ * - handy for investigating the histogram
+ *
+ * 1 (distinct index)
+ * ------------------
+ * - prints index of the distinct value (into the serialized array)
+ * - makes it easier to spot neighbor buckets, etc.
+ * - handy for plotting the histogram
+ *
+ * 2 (normalized distinct index)
+ * -----------------------------
+ * - prints index of the distinct value, but normalized into [0,1]
+ * - similar to 1, but shows how 'long' the bucket range is
+ * - handy for plotting the histogram
+ *
+ * When plotting the histogram, be careful as the (1) and (2) options skew the
+ * lengths by distributing the distinct values uniformly. For data types
+ * without a clear meaning of 'distance' (e.g. strings) that is not a big deal,
+ * but for numbers it may be confusing.
+ */
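+
+/*
+ * Example usage (16385 is a hypothetical statistics OID; use the OID of an
+ * actual pg_mv_statistic row with a built histogram):
+ *
+ *   SELECT * FROM pg_mv_histogram_buckets(16385, 0);
+ */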
+PG_FUNCTION_INFO_V1(pg_mv_histogram_buckets);
+
+#define OUTPUT_FORMAT_RAW 0
+#define OUTPUT_FORMAT_INDEXES 1
+#define OUTPUT_FORMAT_DISTINCT 2
+
+Datum
+pg_mv_histogram_buckets(PG_FUNCTION_ARGS)
+{
+ FuncCallContext *funcctx;
+ int call_cntr;
+ int max_calls;
+ TupleDesc tupdesc;
+ AttInMetadata *attinmeta;
+
+ Oid mvoid = PG_GETARG_OID(0);
+ int otype = PG_GETARG_INT32(1);
+
+ if ((otype < 0) || (otype > 2))
+ elog(ERROR, "invalid output type specified");
+
+ /* stuff done only on the first call of the function */
+ if (SRF_IS_FIRSTCALL())
+ {
+ MemoryContext oldcontext;
+ MVSerializedHistogram histogram;
+
+ /* create a function context for cross-call persistence */
+ funcctx = SRF_FIRSTCALL_INIT();
+
+ /* switch to memory context appropriate for multiple function calls */
+ oldcontext = MemoryContextSwitchTo(funcctx->multi_call_memory_ctx);
+
+ histogram = load_mv_histogram(mvoid);
+
+ funcctx->user_fctx = histogram;
+
+ /* total number of tuples to be returned */
+ funcctx->max_calls = 0;
+ if (funcctx->user_fctx != NULL)
+ funcctx->max_calls = histogram->nbuckets;
+
+ /* Build a tuple descriptor for our result type */
+ if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("function returning record called in context "
+ "that cannot accept type record")));
+
+ /*
+ * generate attribute metadata needed later to produce tuples
+ * from raw C strings
+ */
+ attinmeta = TupleDescGetAttInMetadata(tupdesc);
+ funcctx->attinmeta = attinmeta;
+
+ MemoryContextSwitchTo(oldcontext);
+ }
+
+ /* stuff done on every call of the function */
+ funcctx = SRF_PERCALL_SETUP();
+
+ call_cntr = funcctx->call_cntr;
+ max_calls = funcctx->max_calls;
+ attinmeta = funcctx->attinmeta;
+
+ if (call_cntr < max_calls) /* do when there is more left to send */
+ {
+ char **values;
+ HeapTuple tuple;
+ Datum result;
+ int2vector *stakeys;
+ Oid relid;
+ double bucket_volume = 1.0;
+ StringInfo bufs;
+
+ char *format;
+ int i;
+
+ Oid *outfuncs;
+ FmgrInfo *fmgrinfo;
+
+ MVSerializedHistogram histogram;
+ MVSerializedBucket bucket;
+
+ histogram = (MVSerializedHistogram)funcctx->user_fctx;
+
+ Assert(call_cntr < histogram->nbuckets);
+
+ bucket = histogram->buckets[call_cntr];
+
+ stakeys = find_mv_attnums(mvoid, &relid);
+
+ /*
+ * The scalar values will be formatted directly, using snprintf.
+ *
+ * The 'array' values will be formatted through StringInfo.
+ */
+ values = (char **) palloc0(9 * sizeof(char *));
+ bufs = (StringInfo) palloc0(9 * sizeof(StringInfoData));
+
+ values[0] = (char *) palloc(64 * sizeof(char));
+
+ initStringInfo(&bufs[1]); /* lower boundaries */
+ initStringInfo(&bufs[2]); /* upper boundaries */
+ initStringInfo(&bufs[3]); /* nulls-only */
+ initStringInfo(&bufs[4]); /* lower inclusive */
+ initStringInfo(&bufs[5]); /* upper inclusive */
+
+ values[6] = (char *) palloc(64 * sizeof(char));
+ values[7] = (char *) palloc(64 * sizeof(char));
+ values[8] = (char *) palloc(64 * sizeof(char));
+
+ /* we need to do this only when printing the actual values */
+ outfuncs = (Oid*)palloc0(sizeof(Oid) * histogram->ndimensions);
+ fmgrinfo = (FmgrInfo*)palloc0(sizeof(FmgrInfo) * histogram->ndimensions);
+
+ /*
+ * lookup output functions for all histogram dimensions
+ *
+ * XXX This might be done in the first call and stored in user_fctx.
+ */
+ for (i = 0; i < histogram->ndimensions; i++)
+ {
+ bool isvarlena;
+
+ getTypeOutputInfo(get_atttype(relid, stakeys->values[i]),
+ &outfuncs[i], &isvarlena);
+
+ fmgr_info(outfuncs[i], &fmgrinfo[i]);
+ }
+
+ snprintf(values[0], 64, "%d", call_cntr); /* bucket ID */
+
+ /* format the arrays of lower/upper boundaries according to otype */
+ for (i = 0; i < histogram->ndimensions; i++)
+ {
+ Datum *vals = histogram->values[i];
+
+ uint16 minidx = bucket->min[i];
+ uint16 maxidx = bucket->max[i];
+
+ /*
+ * Compute bucket volume, using the number of distinct values in
+ * each dimension as the measure.
+ *
+ * XXX Not really sure what to do for NULL dimensions here, so let's
+ * simply count them as '1'.
+ */
+ bucket_volume
+ *= (double)(maxidx - minidx + 1) / (histogram->nvalues[i]-1);
+
+ if (i == 0)
+ format = "{%s"; /* first dimension */
+ else if (i < (histogram->ndimensions - 1))
+ format = ", %s"; /* middle dimensions */
+ else
+ format = ", %s}"; /* last dimension */
+
+ appendStringInfo(&bufs[3], format, bucket->nullsonly[i] ? "t" : "f");
+ appendStringInfo(&bufs[4], format, bucket->min_inclusive[i] ? "t" : "f");
+ appendStringInfo(&bufs[5], format, bucket->max_inclusive[i] ? "t" : "f");
+
+ /* for NULL-only dimension, simply put there the NULL and continue */
+ if (bucket->nullsonly[i])
+ {
+ if (i == 0)
+ format = "{%s";
+ else if (i < (histogram->ndimensions - 1))
+ format = ", %s";
+ else
+ format = ", %s}";
+
+ appendStringInfo(&bufs[1], format, "NULL");
+ appendStringInfo(&bufs[2], format, "NULL");
+
+ continue;
+ }
+
+ /* otherwise we really need to format the value */
+ switch (otype)
+ {
+ case OUTPUT_FORMAT_RAW: /* actual boundary values */
+
+ if (i == 0)
+ format = "{%s";
+ else if (i < (histogram->ndimensions - 1))
+ format = ", %s";
+ else
+ format = ", %s}";
+
+ /* OutputFunctionCall() returns the C string expected by %s */
+ appendStringInfo(&bufs[1], format,
+ OutputFunctionCall(&fmgrinfo[i], vals[minidx]));
+
+ appendStringInfo(&bufs[2], format,
+ OutputFunctionCall(&fmgrinfo[i], vals[maxidx]));
+
+ break;
+
+ case OUTPUT_FORMAT_INDEXES: /* indexes into deduplicated arrays */
+
+ if (i == 0)
+ format = "{%d";
+ else if (i < (histogram->ndimensions - 1))
+ format = ", %d";
+ else
+ format = ", %d}";
+
+ appendStringInfo(&bufs[1], format, minidx);
+
+ appendStringInfo(&bufs[2], format, maxidx);
+
+ break;
+
+ case OUTPUT_FORMAT_DISTINCT: /* distinct arrays as measure */
+
+ if (i == 0)
+ format = "{%f";
+ else if (i < (histogram->ndimensions - 1))
+ format = ", %f";
+ else
+ format = ", %f}";
+
+ appendStringInfo(&bufs[1], format,
+ (minidx * 1.0 / (histogram->nvalues[i]-1)));
+
+ appendStringInfo(&bufs[2], format,
+ (maxidx * 1.0 / (histogram->nvalues[i]-1)));
+
+ break;
+
+ default:
+ elog(ERROR, "unknown output type: %d", otype);
+ }
+ }
+
+ values[1] = bufs[1].data;
+ values[2] = bufs[2].data;
+ values[3] = bufs[3].data;
+ values[4] = bufs[4].data;
+ values[5] = bufs[5].data;
+
+ snprintf(values[6], 64, "%f", bucket->ntuples); /* frequency */
+ snprintf(values[7], 64, "%f", bucket->ntuples / bucket_volume); /* density */
+ snprintf(values[8], 64, "%f", bucket_volume); /* volume (as a fraction) */
+
+ /* build a tuple */
+ tuple = BuildTupleFromCStrings(attinmeta, values);
+
+ /* make the tuple into a datum */
+ result = HeapTupleGetDatum(tuple);
+
+ /* clean up (this is not really necessary) */
+ pfree(values[0]);
+ pfree(values[6]);
+ pfree(values[7]);
+ pfree(values[8]);
+
+ resetStringInfo(&bufs[1]);
+ resetStringInfo(&bufs[2]);
+ resetStringInfo(&bufs[3]);
+ resetStringInfo(&bufs[4]);
+ resetStringInfo(&bufs[5]);
+
+ pfree(bufs);
+ pfree(values);
+
+ SRF_RETURN_NEXT(funcctx, result);
+ }
+ else /* do when there is no more left */
+ {
+ SRF_RETURN_DONE(funcctx);
+ }
+}
+
+#ifdef DEBUG_MVHIST
+/*
+ * prints debugging info about matched histogram buckets (full/partial)
+ *
+ * XXX Currently works only for INT data type.
+ */
+void
+debug_histogram_matches(MVSerializedHistogram mvhist, char *matches)
+{
+ int i, j;
+
+ float ffull = 0, fpartial = 0;
+ int nfull = 0, npartial = 0;
+
+ StringInfoData buf;
+
+ initStringInfo(&buf);
+
+ for (i = 0; i < mvhist->nbuckets; i++)
+ {
+ MVSerializedBucket bucket = mvhist->buckets[i];
+
+ if (! matches[i])
+ continue;
+
+ /* increment the counters */
+ nfull += (matches[i] == MVSTATS_MATCH_FULL) ? 1 : 0;
+ npartial += (matches[i] == MVSTATS_MATCH_PARTIAL) ? 1 : 0;
+
+ /* and also update the frequencies */
+ ffull += (matches[i] == MVSTATS_MATCH_FULL) ? bucket->ntuples : 0;
+ fpartial += (matches[i] == MVSTATS_MATCH_PARTIAL) ? bucket->ntuples : 0;
+
+ resetStringInfo(&buf);
+
+ /* build ranges for all the dimensions */
+ for (j = 0; j < mvhist->ndimensions; j++)
+ {
+ appendStringInfo(&buf, "[%d %d]",
+ DatumGetInt32(mvhist->values[j][bucket->min[j]]),
+ DatumGetInt32(mvhist->values[j][bucket->max[j]]));
+ }
+
+ elog(WARNING, "bucket %d %s => %d [%f]", i, buf.data, matches[i], bucket->ntuples);
+ }
+
+ elog(WARNING, "full=%f partial=%f (%f)", ffull, fpartial, (ffull + 0.5 * fpartial));
+}
+#endif
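FWIW to play with the histogram from SQL, something like this should work
(just a sketch - 's7' is the statistics name used in the regression tests
below, and otype=0 is assumed to be OUTPUT_FORMAT_RAW, i.e. actual
boundary values):

    SELECT index, minvals, maxvals, frequency, density, bucket_volume
      FROM pg_mv_histogram_buckets(
             (SELECT oid FROM pg_mv_statistic WHERE staname = 's7'), 0);

The otype parameter switches between raw boundary values, indexes into
the deduplicated arrays, and the distinct-count measure (see the switch
in pg_mv_histogram_buckets above).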
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 2c22d31..b693f36 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -2109,9 +2109,9 @@ describeOneTableDetails(const char *schemaname,
{
printfPQExpBuffer(&buf,
"SELECT oid, stanamespace::regnamespace AS nsp, staname, stakeys,\n"
- " deps_enabled, mcv_enabled,\n"
- " deps_built, mcv_built,\n"
- " mcv_max_items,\n"
+ " deps_enabled, mcv_enabled, hist_enabled,\n"
+ " deps_built, mcv_built, hist_built,\n"
+ " mcv_max_items, hist_max_buckets,\n"
" (SELECT string_agg(attname::text,', ')\n"
" FROM ((SELECT unnest(stakeys) AS attnum) s\n"
" JOIN pg_attribute a ON (starelid = a.attrelid and a.attnum = s.attnum))) AS attnums\n"
@@ -2154,8 +2154,17 @@ describeOneTableDetails(const char *schemaname,
first = false;
}
+ if (!strcmp(PQgetvalue(result, i, 6), "t"))
+ {
+ if (! first)
+ appendPQExpBuffer(&buf, ", histogram");
+ else
+ appendPQExpBuffer(&buf, "(histogram");
+ first = false;
+ }
+
appendPQExpBuffer(&buf, ") ON (%s)",
- PQgetvalue(result, i, 9));
+ PQgetvalue(result, i, 12));
printTableAddFooter(&cont, buf.data);
}
diff --git a/src/include/catalog/pg_mv_statistic.h b/src/include/catalog/pg_mv_statistic.h
index 3529b03..7020772 100644
--- a/src/include/catalog/pg_mv_statistic.h
+++ b/src/include/catalog/pg_mv_statistic.h
@@ -39,13 +39,16 @@ CATALOG(pg_mv_statistic,3381)
/* statistics requested to build */
bool deps_enabled; /* analyze dependencies? */
bool mcv_enabled; /* build MCV list? */
+ bool hist_enabled; /* build histogram? */
- /* MCV size */
+ /* histogram / MCV size */
int32 mcv_max_items; /* max MCV items */
+ int32 hist_max_buckets; /* max histogram buckets */
/* statistics that are available (if requested) */
bool deps_built; /* dependencies were built */
bool mcv_built; /* MCV list was built */
+ bool hist_built; /* histogram was built */
/* variable-length fields start here, but we allow direct access to stakeys */
int2vector stakeys; /* array of column keys */
@@ -53,6 +56,7 @@ CATALOG(pg_mv_statistic,3381)
#ifdef CATALOG_VARLEN
bytea stadeps; /* dependencies (serialized) */
bytea stamcv; /* MCV list (serialized) */
+ bytea stahist; /* MV histogram (serialized) */
#endif
} FormData_pg_mv_statistic;
@@ -68,18 +72,22 @@ typedef FormData_pg_mv_statistic *Form_pg_mv_statistic;
* compiler constants for pg_mv_statistic
* ----------------
*/
-#define Natts_pg_mv_statistic 12
+#define Natts_pg_mv_statistic 16
#define Anum_pg_mv_statistic_starelid 1
#define Anum_pg_mv_statistic_staname 2
#define Anum_pg_mv_statistic_stanamespace 3
#define Anum_pg_mv_statistic_staowner 4
#define Anum_pg_mv_statistic_deps_enabled 5
#define Anum_pg_mv_statistic_mcv_enabled 6
-#define Anum_pg_mv_statistic_mcv_max_items 7
-#define Anum_pg_mv_statistic_deps_built 8
-#define Anum_pg_mv_statistic_mcv_built 9
-#define Anum_pg_mv_statistic_stakeys 10
-#define Anum_pg_mv_statistic_stadeps 11
-#define Anum_pg_mv_statistic_stamcv 12
+#define Anum_pg_mv_statistic_hist_enabled 7
+#define Anum_pg_mv_statistic_mcv_max_items 8
+#define Anum_pg_mv_statistic_hist_max_buckets 9
+#define Anum_pg_mv_statistic_deps_built 10
+#define Anum_pg_mv_statistic_mcv_built 11
+#define Anum_pg_mv_statistic_hist_built 12
+#define Anum_pg_mv_statistic_stakeys 13
+#define Anum_pg_mv_statistic_stadeps 14
+#define Anum_pg_mv_statistic_stamcv 15
+#define Anum_pg_mv_statistic_stahist 16
#endif /* PG_MV_STATISTIC_H */
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
index 5640dc1..f2f735d 100644
--- a/src/include/catalog/pg_proc.h
+++ b/src/include/catalog/pg_proc.h
@@ -2674,6 +2674,10 @@ DATA(insert OID = 3376 ( pg_mv_stats_mcvlist_info PGNSP PGUID 12 1 0 0 0 f f f
DESCR("multi-variate statistics: MCV list info");
DATA(insert OID = 3373 ( pg_mv_mcv_items PGNSP PGUID 12 1 1000 0 0 f f f f t t i s 1 0 2249 "26" "{26,23,1009,1000,701}" "{i,o,o,o,o}" "{oid,index,values,nulls,frequency}" _null_ _null_ pg_mv_mcv_items _null_ _null_ _null_ ));
DESCR("details about MCV list items");
+DATA(insert OID = 3375 ( pg_mv_stats_histogram_info PGNSP PGUID 12 1 0 0 0 f f f f t f i s 1 0 25 "17" _null_ _null_ _null_ _null_ _null_ pg_mv_stats_histogram_info _null_ _null_ _null_ ));
+DESCR("multi-variate statistics: histogram info");
+DATA(insert OID = 3374 ( pg_mv_histogram_buckets PGNSP PGUID 12 1 1000 0 0 f f f f t t i s 2 0 2249 "26 23" "{26,23,23,1009,1009,1000,1000,1000,701,701,701}" "{i,i,o,o,o,o,o,o,o,o,o}" "{oid,otype,index,minvals,maxvals,nullsonly,mininclusive,maxinclusive,frequency,density,bucket_volume}" _null_ _null_ pg_mv_histogram_buckets _null_ _null_ _null_ ));
+DESCR("details about histogram buckets");
DATA(insert OID = 1928 ( pg_stat_get_numscans PGNSP PGUID 12 1 0 0 0 f f f f t f s r 1 0 20 "26" _null_ _null_ _null_ _null_ _null_ pg_stat_get_numscans _null_ _null_ _null_ ));
DESCR("statistics: number of scans done for table/index");
diff --git a/src/include/nodes/relation.h b/src/include/nodes/relation.h
index f52884a..84be0ce 100644
--- a/src/include/nodes/relation.h
+++ b/src/include/nodes/relation.h
@@ -656,10 +656,12 @@ typedef struct MVStatisticInfo
/* enabled statistics */
bool deps_enabled; /* functional dependencies enabled */
bool mcv_enabled; /* MCV list enabled */
+ bool hist_enabled; /* histogram enabled */
/* built/available statistics */
bool deps_built; /* functional dependencies built */
bool mcv_built; /* MCV list built */
+ bool hist_built; /* histogram built */
/* columns in the statistics (attnums) */
int2vector *stakeys; /* attnums of the columns covered */
diff --git a/src/include/utils/mvstats.h b/src/include/utils/mvstats.h
index b2643ec..777c7da 100644
--- a/src/include/utils/mvstats.h
+++ b/src/include/utils/mvstats.h
@@ -18,7 +18,7 @@
#include "commands/vacuum.h"
/*
- * Degree of how much MCV item matches a clause.
+ * Degree of how much MCV item / histogram bucket matches a clause.
* This is then considered when computing the selectivity.
*/
#define MVSTATS_MATCH_NONE 0 /* no match at all */
@@ -92,6 +92,123 @@ typedef MCVListData *MCVList;
#define MVSTAT_MCVLIST_MAX_ITEMS 8192 /* max items in MCV list */
/*
+ * Multivariate histograms
+ */
+typedef struct MVBucketData {
+
+ /* Frequency of this bucket. */
+ float ntuples; /* frequency of tuples in this bucket */
+
+ /*
+ * Information about dimensions being NULL-only. Not yet used.
+ */
+ bool *nullsonly;
+
+ /* lower boundaries - values and information about the inequalities */
+ Datum *min;
+ bool *min_inclusive;
+
+ /* upper boundaries - values and information about the inequalities */
+ Datum *max;
+ bool *max_inclusive;
+
+ /* used when building the histogram (not serialized/deserialized) */
+ void *build_data;
+
+} MVBucketData;
+
+typedef MVBucketData *MVBucket;
+
+
+typedef struct MVHistogramData {
+
+ uint32 magic; /* magic constant marker */
+ uint32 type; /* type of histogram (BASIC) */
+ uint32 nbuckets; /* number of buckets (buckets array) */
+ uint32 ndimensions; /* number of dimensions */
+
+ MVBucket *buckets; /* array of buckets */
+
+} MVHistogramData;
+
+typedef MVHistogramData *MVHistogram;
+
+/*
+ * Histogram in a partially serialized form, with deduplicated boundary
+ * values etc.
+ *
+ * TODO add more detailed description here
+ */
+
+typedef struct MVSerializedBucketData {
+
+ /* Frequency of this bucket. */
+ float ntuples; /* frequency of tuples in this bucket */
+
+ /*
+ * Information about dimensions being NULL-only. Not yet used.
+ */
+ bool *nullsonly;
+
+ /* indexes of lower boundaries + inclusive/exclusive flags */
+ uint16 *min;
+ bool *min_inclusive;
+
+ /* indexes of upper boundaries + inclusive/exclusive flags */
+ uint16 *max;
+ bool *max_inclusive;
+
+} MVSerializedBucketData;
+
+typedef MVSerializedBucketData *MVSerializedBucket;
+
+typedef struct MVSerializedHistogramData {
+
+ uint32 magic; /* magic constant marker */
+ uint32 type; /* type of histogram (BASIC) */
+ uint32 nbuckets; /* number of buckets (buckets array) */
+ uint32 ndimensions; /* number of dimensions */
+
+ /*
+ * Keep this in sync with MVHistogramData, because deserialization
+ * relies on the fields being at the same offsets.
+ */
+ MVSerializedBucket *buckets; /* array of buckets */
+
+ /*
+ * serialized boundary values, one array per dimension, deduplicated
+ * (the min/max indexes point into these arrays)
+ */
+ int *nvalues;
+ Datum **values;
+
+} MVSerializedHistogramData;
+
+typedef MVSerializedHistogramData *MVSerializedHistogram;
+
+
+/* used to flag stats serialized to bytea */
+#define MVSTAT_HIST_MAGIC 0x7F8C5670 /* marks serialized bytea */
+#define MVSTAT_HIST_TYPE_BASIC 1 /* basic histogram type */
+
+/*
+ * Limits used for max_buckets option, i.e. we're always guaranteed
+ * to have space for at least MVSTAT_HIST_MIN_BUCKETS, and we cannot
+ * have more than MVSTAT_HIST_MAX_BUCKETS buckets.
+ *
+ * This is just a boundary for the 'max' threshold - the actual
+ * histogram may use fewer buckets than MVSTAT_HIST_MAX_BUCKETS.
+ *
+ * TODO The MVSTAT_HIST_MIN_BUCKETS should be related to the number of
+ * attributes (MVSTATS_MAX_DIMENSIONS) because of NULL-buckets.
+ * There should be at least 2^N buckets, otherwise we may be unable
+ * to build the NULL buckets.
+ */
+#define MVSTAT_HIST_MIN_BUCKETS 128 /* min number of buckets */
+#define MVSTAT_HIST_MAX_BUCKETS 16384 /* max number of buckets */
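+
+/*
+ * For example, 3 dimensions give 2^3 = 8 combinations of NULL and
+ * non-NULL dimensions, so up to 8 buckets may be needed just for the
+ * NULL-buckets; the current minimum (128 = 2^7) is thus enough for
+ * up to 7 dimensions.
+ */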
+
+/*
* TODO Maybe fetching the histogram/MCV list separately is inefficient?
* Consider adding a single `fetch_stats` method, fetching all
* stats specified using flags (or something like that).
@@ -99,20 +216,25 @@ typedef MCVListData *MCVList;
MVDependencies load_mv_dependencies(Oid mvoid);
MCVList load_mv_mcvlist(Oid mvoid);
+MVSerializedHistogram load_mv_histogram(Oid mvoid);
bytea * serialize_mv_dependencies(MVDependencies dependencies);
bytea * serialize_mv_mcvlist(MCVList mcvlist, int2vector *attrs,
VacAttrStats **stats);
+bytea * serialize_mv_histogram(MVHistogram histogram, int2vector *attrs,
+ VacAttrStats **stats);
/* deserialization of stats (serialization is private to analyze) */
MVDependencies deserialize_mv_dependencies(bytea * data);
MCVList deserialize_mv_mcvlist(bytea * data);
+MVSerializedHistogram deserialize_mv_histogram(bytea * data);
/*
* Returns index of the attribute number within the vector (i.e. a
* dimension within the stats).
*/
int mv_get_index(AttrNumber varattno, int2vector * stakeys);
int2vector* find_mv_attnums(Oid mvoid, Oid *relid);
@@ -121,6 +243,8 @@ extern Datum pg_mv_stats_dependencies_info(PG_FUNCTION_ARGS);
extern Datum pg_mv_stats_dependencies_show(PG_FUNCTION_ARGS);
extern Datum pg_mv_stats_mcvlist_info(PG_FUNCTION_ARGS);
extern Datum pg_mv_mcvlist_items(PG_FUNCTION_ARGS);
+extern Datum pg_mv_stats_histogram_info(PG_FUNCTION_ARGS);
+extern Datum pg_mv_histogram_buckets(PG_FUNCTION_ARGS);
MVDependencies
build_mv_dependencies(int numrows, HeapTuple *rows, int2vector *attrs,
@@ -130,10 +254,20 @@ MCVList
build_mv_mcvlist(int numrows, HeapTuple *rows, int2vector *attrs,
VacAttrStats **stats, int *numrows_filtered);
+MVHistogram
+build_mv_histogram(int numrows, HeapTuple *rows, int2vector *attrs,
+ VacAttrStats **stats, int numrows_total);
+
void build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
int natts, VacAttrStats **vacattrstats);
-void update_mv_stats(Oid relid, MVDependencies dependencies, MCVList mcvlist,
+void update_mv_stats(Oid relid, MVDependencies dependencies,
+ MCVList mcvlist, MVHistogram histogram,
int2vector *attrs, VacAttrStats **stats);
+#ifdef DEBUG_MVHIST
+extern void debug_histogram_matches(MVSerializedHistogram mvhist, char *matches);
+#endif
+
+
#endif
diff --git a/src/test/regress/expected/mv_histogram.out b/src/test/regress/expected/mv_histogram.out
new file mode 100644
index 0000000..e830816
--- /dev/null
+++ b/src/test/regress/expected/mv_histogram.out
@@ -0,0 +1,207 @@
+-- data type passed by value
+CREATE TABLE mv_histogram (
+ a INT,
+ b INT,
+ c INT
+);
+-- unknown column
+CREATE STATISTICS s7 ON mv_histogram (unknown_column) WITH (histogram);
+ERROR: column "unknown_column" referenced in statistics does not exist
+-- single column
+CREATE STATISTICS s7 ON mv_histogram (a) WITH (histogram);
+ERROR: multivariate stats require 2 or more columns
+-- single column, duplicated
+CREATE STATISTICS s7 ON mv_histogram (a, a) WITH (histogram);
+ERROR: duplicate column name in statistics definition
+-- two columns, one duplicated
+CREATE STATISTICS s7 ON mv_histogram (a, a, b) WITH (histogram);
+ERROR: duplicate column name in statistics definition
+-- unknown option
+CREATE STATISTICS s7 ON mv_histogram (a, b, c) WITH (unknown_option);
+ERROR: unrecognized STATISTICS option "unknown_option"
+-- missing histogram statistics
+CREATE STATISTICS s7 ON mv_histogram (a, b, c) WITH (dependencies, max_buckets=200);
+ERROR: option 'histogram' is required by other options(s)
+-- invalid max_buckets value / too low
+CREATE STATISTICS s7 ON mv_histogram (a, b, c) WITH (mcv, max_buckets=10);
+ERROR: minimum number of buckets is 128
+-- invalid max_buckets value / too high
+CREATE STATISTICS s7 ON mv_histogram (a, b, c) WITH (mcv, max_buckets=100000);
+ERROR: maximum number of buckets is 16384
+-- correct command
+CREATE STATISTICS s7 ON mv_histogram (a, b, c) WITH (histogram);
+-- random data (no functional dependencies)
+INSERT INTO mv_histogram
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+ANALYZE mv_histogram;
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+ hist_enabled | hist_built
+--------------+------------
+ t | t
+(1 row)
+
+TRUNCATE mv_histogram;
+-- a => b, a => c, b => c
+INSERT INTO mv_histogram
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE mv_histogram;
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+ hist_enabled | hist_built
+--------------+------------
+ t | t
+(1 row)
+
+TRUNCATE mv_histogram;
+-- a => b, a => c
+INSERT INTO mv_histogram
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE mv_histogram;
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+ hist_enabled | hist_built
+--------------+------------
+ t | t
+(1 row)
+
+TRUNCATE mv_histogram;
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO mv_histogram
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX hist_idx ON mv_histogram (a, b);
+ANALYZE mv_histogram;
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+ hist_enabled | hist_built
+--------------+------------
+ t | t
+(1 row)
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mv_histogram WHERE a = 10 AND b = 5;
+ QUERY PLAN
+--------------------------------------------
+ Bitmap Heap Scan on mv_histogram
+ Recheck Cond: ((a = 10) AND (b = 5))
+ -> Bitmap Index Scan on hist_idx
+ Index Cond: ((a = 10) AND (b = 5))
+(4 rows)
+
+DROP TABLE mv_histogram;
+-- varlena type (text)
+CREATE TABLE mv_histogram (
+ a TEXT,
+ b TEXT,
+ c TEXT
+);
+CREATE STATISTICS s8 ON mv_histogram (a, b, c) WITH (histogram);
+-- random data (no functional dependencies)
+INSERT INTO mv_histogram
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+ANALYZE mv_histogram;
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+ hist_enabled | hist_built
+--------------+------------
+ t | t
+(1 row)
+
+TRUNCATE mv_histogram;
+-- a => b, a => c, b => c
+INSERT INTO mv_histogram
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE mv_histogram;
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+ hist_enabled | hist_built
+--------------+------------
+ t | t
+(1 row)
+
+TRUNCATE mv_histogram;
+-- a => b, a => c
+INSERT INTO mv_histogram
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE mv_histogram;
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+ hist_enabled | hist_built
+--------------+------------
+ t | t
+(1 row)
+
+TRUNCATE mv_histogram;
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO mv_histogram
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX hist_idx ON mv_histogram (a, b);
+ANALYZE mv_histogram;
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+ hist_enabled | hist_built
+--------------+------------
+ t | t
+(1 row)
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mv_histogram WHERE a = '10' AND b = '5';
+ QUERY PLAN
+------------------------------------------------------------
+ Bitmap Heap Scan on mv_histogram
+ Recheck Cond: ((a = '10'::text) AND (b = '5'::text))
+ -> Bitmap Index Scan on hist_idx
+ Index Cond: ((a = '10'::text) AND (b = '5'::text))
+(4 rows)
+
+TRUNCATE mv_histogram;
+-- check explain (expect bitmap index scan, not plain index scan) with NULLs
+INSERT INTO mv_histogram
+ SELECT
+ (CASE WHEN i/10000 = 0 THEN NULL ELSE i/10000 END),
+ (CASE WHEN i/20000 = 0 THEN NULL ELSE i/20000 END),
+ (CASE WHEN i/40000 = 0 THEN NULL ELSE i/40000 END)
+ FROM generate_series(1,1000000) s(i);
+ANALYZE mv_histogram;
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+ hist_enabled | hist_built
+--------------+------------
+ t | t
+(1 row)
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mv_histogram WHERE a IS NULL AND b IS NULL;
+ QUERY PLAN
+---------------------------------------------------
+ Bitmap Heap Scan on mv_histogram
+ Recheck Cond: ((a IS NULL) AND (b IS NULL))
+ -> Bitmap Index Scan on hist_idx
+ Index Cond: ((a IS NULL) AND (b IS NULL))
+(4 rows)
+
+DROP TABLE mv_histogram;
+-- NULL values (mix of int and text columns)
+CREATE TABLE mv_histogram (
+ a INT,
+ b TEXT,
+ c INT,
+ d TEXT
+);
+CREATE STATISTICS s9 ON mv_histogram (a, b, c, d) WITH (histogram);
+INSERT INTO mv_histogram
+ SELECT
+ mod(i, 100),
+ (CASE WHEN mod(i, 200) = 0 THEN NULL ELSE mod(i,200) END),
+ mod(i, 400),
+ (CASE WHEN mod(i, 300) = 0 THEN NULL ELSE mod(i,600) END)
+ FROM generate_series(1,10000) s(i);
+ANALYZE mv_histogram;
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+ hist_enabled | hist_built
+--------------+------------
+ t | t
+(1 row)
+
+DROP TABLE mv_histogram;
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 3d55ffe..528ac36 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1375,7 +1375,9 @@ pg_mv_stats| SELECT n.nspname AS schemaname,
length(s.stadeps) AS depsbytes,
pg_mv_stats_dependencies_info(s.stadeps) AS depsinfo,
length(s.stamcv) AS mcvbytes,
- pg_mv_stats_mcvlist_info(s.stamcv) AS mcvinfo
+ pg_mv_stats_mcvlist_info(s.stamcv) AS mcvinfo,
+ length(s.stahist) AS histbytes,
+ pg_mv_stats_histogram_info(s.stahist) AS histinfo
FROM ((pg_mv_statistic s
JOIN pg_class c ON ((c.oid = s.starelid)))
LEFT JOIN pg_namespace n ON ((n.oid = c.relnamespace)));
diff --git a/src/test/regress/parallel_schedule b/src/test/regress/parallel_schedule
index 85d94f1..a885235 100644
--- a/src/test/regress/parallel_schedule
+++ b/src/test/regress/parallel_schedule
@@ -112,4 +112,4 @@ test: event_trigger
test: stats
# run tests of multivariate stats
-test: mv_dependencies mv_mcv
+test: mv_dependencies mv_mcv mv_histogram
diff --git a/src/test/regress/serial_schedule b/src/test/regress/serial_schedule
index 6584d73..2efdcd7 100644
--- a/src/test/regress/serial_schedule
+++ b/src/test/regress/serial_schedule
@@ -164,3 +164,4 @@ test: event_trigger
test: stats
test: mv_dependencies
test: mv_mcv
+test: mv_histogram
diff --git a/src/test/regress/sql/mv_histogram.sql b/src/test/regress/sql/mv_histogram.sql
new file mode 100644
index 0000000..27c2510
--- /dev/null
+++ b/src/test/regress/sql/mv_histogram.sql
@@ -0,0 +1,176 @@
+-- data type passed by value
+CREATE TABLE mv_histogram (
+ a INT,
+ b INT,
+ c INT
+);
+
+-- unknown column
+CREATE STATISTICS s7 ON mv_histogram (unknown_column) WITH (histogram);
+
+-- single column
+CREATE STATISTICS s7 ON mv_histogram (a) WITH (histogram);
+
+-- single column, duplicated
+CREATE STATISTICS s7 ON mv_histogram (a, a) WITH (histogram);
+
+-- two columns, one duplicated
+CREATE STATISTICS s7 ON mv_histogram (a, a, b) WITH (histogram);
+
+-- unknown option
+CREATE STATISTICS s7 ON mv_histogram (a, b, c) WITH (unknown_option);
+
+-- missing histogram statistics
+CREATE STATISTICS s7 ON mv_histogram (a, b, c) WITH (dependencies, max_buckets=200);
+
+-- invalid max_buckets value / too low
+CREATE STATISTICS s7 ON mv_histogram (a, b, c) WITH (mcv, max_buckets=10);
+
+-- invalid max_buckets value / too high
+CREATE STATISTICS s7 ON mv_histogram (a, b, c) WITH (mcv, max_buckets=100000);
+
+-- correct command
+CREATE STATISTICS s7 ON mv_histogram (a, b, c) WITH (histogram);
+
+-- random data (no functional dependencies)
+INSERT INTO mv_histogram
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+
+ANALYZE mv_histogram;
+
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+
+TRUNCATE mv_histogram;
+
+-- a => b, a => c, b => c
+INSERT INTO mv_histogram
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+
+ANALYZE mv_histogram;
+
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+
+TRUNCATE mv_histogram;
+
+-- a => b, a => c
+INSERT INTO mv_histogram
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE mv_histogram;
+
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+
+TRUNCATE mv_histogram;
+
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO mv_histogram
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX hist_idx ON mv_histogram (a, b);
+ANALYZE mv_histogram;
+
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mv_histogram WHERE a = 10 AND b = 5;
+
+DROP TABLE mv_histogram;
+
+-- varlena type (text)
+CREATE TABLE mv_histogram (
+ a TEXT,
+ b TEXT,
+ c TEXT
+);
+
+CREATE STATISTICS s8 ON mv_histogram (a, b, c) WITH (histogram);
+
+-- random data (no functional dependencies)
+INSERT INTO mv_histogram
+ SELECT mod(i, 111), mod(i, 123), mod(i, 23) FROM generate_series(1,10000) s(i);
+
+ANALYZE mv_histogram;
+
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+
+TRUNCATE mv_histogram;
+
+-- a => b, a => c, b => c
+INSERT INTO mv_histogram
+ SELECT i/10, i/100, i/200 FROM generate_series(1,10000) s(i);
+
+ANALYZE mv_histogram;
+
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+
+TRUNCATE mv_histogram;
+
+-- a => b, a => c
+INSERT INTO mv_histogram
+ SELECT i/10, i/150, i/200 FROM generate_series(1,10000) s(i);
+ANALYZE mv_histogram;
+
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+
+TRUNCATE mv_histogram;
+
+-- check explain (expect bitmap index scan, not plain index scan)
+INSERT INTO mv_histogram
+ SELECT i/10000, i/20000, i/40000 FROM generate_series(1,1000000) s(i);
+CREATE INDEX hist_idx ON mv_histogram (a, b);
+ANALYZE mv_histogram;
+
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mv_histogram WHERE a = '10' AND b = '5';
+
+TRUNCATE mv_histogram;
+
+-- check explain (expect bitmap index scan, not plain index scan) with NULLs
+INSERT INTO mv_histogram
+ SELECT
+ (CASE WHEN i/10000 = 0 THEN NULL ELSE i/10000 END),
+ (CASE WHEN i/20000 = 0 THEN NULL ELSE i/20000 END),
+ (CASE WHEN i/40000 = 0 THEN NULL ELSE i/40000 END)
+ FROM generate_series(1,1000000) s(i);
+ANALYZE mv_histogram;
+
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+
+EXPLAIN (COSTS off)
+ SELECT * FROM mv_histogram WHERE a IS NULL AND b IS NULL;
+
+DROP TABLE mv_histogram;
+
+-- NULL values (mix of int and text columns)
+CREATE TABLE mv_histogram (
+ a INT,
+ b TEXT,
+ c INT,
+ d TEXT
+);
+
+CREATE STATISTICS s9 ON mv_histogram (a, b, c, d) WITH (histogram);
+
+INSERT INTO mv_histogram
+ SELECT
+ mod(i, 100),
+ (CASE WHEN mod(i, 200) = 0 THEN NULL ELSE mod(i,200) END),
+ mod(i, 400),
+ (CASE WHEN mod(i, 300) = 0 THEN NULL ELSE mod(i,600) END)
+ FROM generate_series(1,10000) s(i);
+
+ANALYZE mv_histogram;
+
+SELECT hist_enabled, hist_built
+ FROM pg_mv_statistic WHERE starelid = 'mv_histogram'::regclass;
+
+DROP TABLE mv_histogram;
--
2.5.0
Attachment: 0006-multi-statistics-estimation.patch (text/x-patch)
From 6a965d339eca0b8573dbc709dc30eb9ac3c95e02 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@pgaddict.com>
Date: Fri, 6 Feb 2015 01:42:38 +0100
Subject: [PATCH 6/9] multi-statistics estimation
The general idea is that a probability (which is what selectivity is)
can be split into a product of conditional probabilities like this:
P(A & B & C) = P(A & B) * P(C|A & B)
If we assume that C and B are conditionally independent given A, the
last factor may be simplified like this
P(A & B & C) = P(A & B) * P(C|A)
so we only need probabilities on [A,B] and [C,A] to compute the original
probability.
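A worked example with made-up numbers: say a statistics on [A,B] gives
P(A & B) = 0.005, and C is perfectly implied by A, so P(C|A) = 1.0. Then

    P(A & B & C) = P(A & B) * P(C|A) = 0.005 * 1.0 = 0.005

whereas multiplying in P(C) = 0.01 under the independence assumption
would yield 0.00005, a 100x underestimate.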
The implementation works in the other direction, though. We know what
probability P(A & B & C) we need to compute, and also what statistics
are available.
So we search for a combination of statistics, covering the clauses in
an optimal way (most clauses covered, most dependencies exploited).
There are two possible approaches - exhaustive and greedy. The
exhaustive one walks through all permutations of stats using dynamic
programming, so it's guaranteed to find the optimal solution, but it
soon gets very slow as it's roughly O(N!). The dynamic programming may
improve that a bit, but it's still far too expensive for large numbers
of statistics (on a single table).
The greedy algorithm is very simple - at every step it picks the locally
best statistics. That may not guarantee the best solution globally (but maybe
it does?), but it only needs N steps to find the solution, so it's very
fast (processing the selected stats is usually way more expensive).
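As a hypothetical illustration of the difference: with clauses on columns
a, b, c and d, and statistics S1 on (a,b), S2 on (b,c), S3 on (c,d), all
three candidates initially cover two attributes. Starting with S1, the
next greedy step picks S3 (two new attributes vs. one for S2), and two
statistics cover everything. Starting with S2, however, S1 and S3 each
add only one new attribute, so all three statistics end up being used.
At least with this simple scoring, the greedy result depends on
tie-breaking and is not always globally optimal.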
There's a GUC for selecting the search algorithm
mvstat_search = {'greedy', 'exhaustive'}
The default value is 'greedy' as that's much safer (with respect to
runtime). See choose_mv_statistics().
Once we have found a sequence of statistics, we apply them to the
clauses using the conditional probabilities. We process the selected
stats one by one, and for each we select the estimated clauses and
conditions. See clauselist_selectivity() for more details.
Limitations
-----------
It's still true that each clause at a given level has to be covered by
a single MV statistics. So with this query
WHERE (clause1) AND (clause2) AND (clause3 OR clause4)
each parenthesized clause has to be covered by a single multivariate
statistics.
Clauses not covered by a single statistics at this level will be passed
to clause_selectivity() but this will treat them as a collection of
simpler clauses (connected by AND or OR), and the clauses from the
previous level will be used as conditions.
So using the same example, the last clause will be passed to
clause_selectivity() with 'clause1' and 'clause2' as conditions, and it
will be processed using multivariate stats if possible.
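For example (hypothetical table and columns), with a statistics on (a,b)
only:

    SELECT * FROM t
     WHERE (a = 1) AND (b = 2) AND (c = 3 OR d = 4);

the first two clauses are estimated using the (a,b) statistics, while
(c = 3 OR d = 4) is not covered by any single statistics and gets passed
to clause_selectivity() with (a = 1) and (b = 2) as conditions - there it
falls back to the regular per-column estimates, as no statistics covers
(c,d) in this example.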
The other limitation is that all the expressions have to be
mv-compatible, i.e. there can't be a mix of expressions. If this is
violated, the clause may be passed to the next level (just like with
list of clauses not covered by a single statistics), which splits that
into clauses handled by multivariate stats and clauses handled by
regular statistics.
rework clauselist_selectivity_or to handle OR-clauses correctly
We might invent a completely new set of functions here, resembling
clauselist_selectivity but adapting the ideas to OR-clauses.
But luckily we know that each OR-clause
(a OR b OR c)
may be rewritten as an equivalent AND-clause using negation:
NOT ((NOT a) AND (NOT b) AND (NOT c))
And that's something we can pass to clauselist_selectivity.
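In terms of selectivities this is

    s(a OR b OR c) = 1 - s((NOT a) AND (NOT b) AND (NOT c))

so the OR-clause estimate is simply 1.0 minus the selectivity that
clauselist_selectivity computes for the negated arguments, which is
exactly what the new clauselist_selectivity_or() does.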
---
contrib/file_fdw/file_fdw.c | 3 +-
contrib/postgres_fdw/postgres_fdw.c | 11 +-
src/backend/optimizer/path/clausesel.c | 2030 ++++++++++++++++++++++++++------
src/backend/optimizer/path/costsize.c | 23 +-
src/backend/optimizer/util/orclauses.c | 4 +-
src/backend/utils/adt/selfuncs.c | 17 +-
src/backend/utils/misc/guc.c | 20 +
src/backend/utils/mvstats/README.stats | 166 +++
src/include/optimizer/cost.h | 6 +-
src/include/utils/mvstats.h | 8 +
10 files changed, 1916 insertions(+), 372 deletions(-)
diff --git a/contrib/file_fdw/file_fdw.c b/contrib/file_fdw/file_fdw.c
index dc035d7..8f11b7a 100644
--- a/contrib/file_fdw/file_fdw.c
+++ b/contrib/file_fdw/file_fdw.c
@@ -969,7 +969,8 @@ estimate_size(PlannerInfo *root, RelOptInfo *baserel,
baserel->baserestrictinfo,
0,
JOIN_INNER,
- NULL);
+ NULL,
+ NIL);
nrows = clamp_row_est(nrows);
diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c
index aa745f2..d89f9e3 100644
--- a/contrib/postgres_fdw/postgres_fdw.c
+++ b/contrib/postgres_fdw/postgres_fdw.c
@@ -500,7 +500,8 @@ postgresGetForeignRelSize(PlannerInfo *root,
fpinfo->local_conds,
baserel->relid,
JOIN_INNER,
- NULL);
+ NULL,
+ NIL);
cost_qual_eval(&fpinfo->local_conds_cost, fpinfo->local_conds, root);
@@ -2136,7 +2137,8 @@ estimate_path_cost_size(PlannerInfo *root,
local_param_join_conds,
foreignrel->relid,
JOIN_INNER,
- NULL);
+ NULL,
+ NIL);
local_sel *= fpinfo->local_conds_sel;
rows = clamp_row_est(rows * local_sel);
@@ -3661,7 +3663,8 @@ postgresGetForeignJoinPaths(PlannerInfo *root,
fpinfo->local_conds,
0,
JOIN_INNER,
- NULL);
+ NULL,
+ NIL);
cost_qual_eval(&fpinfo->local_conds_cost, fpinfo->local_conds, root);
/*
@@ -3680,7 +3683,7 @@ postgresGetForeignJoinPaths(PlannerInfo *root,
*/
fpinfo->joinclause_sel = clauselist_selectivity(root, fpinfo->joinclauses,
0, fpinfo->jointype,
- extra->sjinfo);
+ extra->sjinfo, NIL);
}
fpinfo->server = GetForeignServer(joinrel->serverid);
diff --git a/src/backend/optimizer/path/clausesel.c b/src/backend/optimizer/path/clausesel.c
index fe96a73..14e3444 100644
--- a/src/backend/optimizer/path/clausesel.c
+++ b/src/backend/optimizer/path/clausesel.c
@@ -29,6 +29,8 @@
#include "utils/selfuncs.h"
#include "utils/typcache.h"
+#include "miscadmin.h"
+
/*
* Data structure for accumulating info about possible range-query
@@ -44,6 +46,13 @@ typedef struct RangeQueryClause
Selectivity hibound; /* Selectivity of a var < something clause */
} RangeQueryClause;
+static Selectivity clauselist_selectivity_or(PlannerInfo *root,
+ List *clauses,
+ int varRelid,
+ JoinType jointype,
+ SpecialJoinInfo *sjinfo,
+ List *conditions);
+
static void addRangeClause(RangeQueryClause **rqlist, Node *clause,
bool varonleft, bool isLTsel, Selectivity s2);
@@ -60,23 +69,25 @@ static int count_mv_attnums(List *clauses, Index relid, int type);
static int count_varnos(List *clauses, Index *relid);
+static List *clauses_matching_statistic(List **clauses, MVStatisticInfo *statistic,
+ Index relid, int types, bool remove);
+
static List *clauselist_apply_dependencies(PlannerInfo *root, List *clauses,
Index relid, List *stats);
-static MVStatisticInfo *choose_mv_statistics(List *mvstats, Bitmapset *attnums);
-
-static List *clauselist_mv_split(PlannerInfo *root, Index relid,
- List *clauses, List **mvclauses,
- MVStatisticInfo *mvstats, int types);
-
static Selectivity clauselist_mv_selectivity(PlannerInfo *root,
- List *clauses, MVStatisticInfo *mvstats);
+ MVStatisticInfo *mvstats, List *clauses,
+ List *conditions, bool is_or);
static Selectivity clauselist_mv_selectivity_mcvlist(PlannerInfo *root,
- List *clauses, MVStatisticInfo *mvstats,
- bool *fullmatch, Selectivity *lowsel);
+ MVStatisticInfo *mvstats,
+ List *clauses, List *conditions,
+ bool is_or, bool *fullmatch,
+ Selectivity *lowsel);
static Selectivity clauselist_mv_selectivity_histogram(PlannerInfo *root,
- List *clauses, MVStatisticInfo *mvstats);
+ MVStatisticInfo *mvstats,
+ List *clauses, List *conditions,
+ bool is_or);
static int update_match_bitmap_mcvlist(PlannerInfo *root, List *clauses,
int2vector *stakeys, MCVList mcvlist,
@@ -90,12 +101,33 @@ static int update_match_bitmap_histogram(PlannerInfo *root, List *clauses,
int nmatches, char * matches,
bool is_or);
+/*
+ * Describes a combination of multiple statistics to cover attributes
+ * referenced by the clauses. The array 'stats' (with nstats elements)
+ * lists the statistics (in the order in which they are applied); the
+ * counters track how many clauses and conditions the solution covers.
+ *
+ * choose_mv_statistics_exhaustive() uses this to track both the current
+ * and the best solution, while walking through the space of possible
+ * combinations.
+ */
+typedef struct mv_solution_t {
+ int nclauses; /* number of clauses covered */
+ int nconditions; /* number of conditions covered */
+ int nstats; /* number of stats applied */
+ int *stats; /* stats (in the apply order) */
+} mv_solution_t;
+
+static List *choose_mv_statistics(PlannerInfo *root, Index relid,
+ List *mvstats, List *clauses, List *conditions);
+
static bool has_stats(List *stats, int type);
static List * find_stats(PlannerInfo *root, Index relid);
static bool stats_type_matches(MVStatisticInfo *stat, int type);
+int mvstat_search_type = MVSTAT_SEARCH_GREEDY;
/* used for merging bitmaps - AND (min), OR (max) */
#define MAX(x, y) (((x) > (y)) ? (x) : (y))
@@ -170,14 +202,15 @@ clauselist_selectivity(PlannerInfo *root,
List *clauses,
int varRelid,
JoinType jointype,
- SpecialJoinInfo *sjinfo)
+ SpecialJoinInfo *sjinfo,
+ List *conditions)
{
Selectivity s1 = 1.0;
RangeQueryClause *rqlist = NULL;
ListCell *l;
/* processing mv stats */
- Oid relid = InvalidOid;
+ Index relid = InvalidOid;
/* list of multivariate stats on the relation */
List *stats = NIL;
@@ -193,12 +226,13 @@ clauselist_selectivity(PlannerInfo *root,
stats = find_stats(root, relid);
/*
- * If there's exactly one clause, then no use in trying to match up pairs,
- * so just go directly to clause_selectivity().
+ * If there's exactly one clause, then no use in trying to match up
+ * pairs, or matching multivariate statistics, so just go directly
+ * to clause_selectivity().
*/
if (list_length(clauses) == 1)
return clause_selectivity(root, (Node *) linitial(clauses),
- varRelid, jointype, sjinfo);
+ varRelid, jointype, sjinfo, conditions);
/*
* Apply functional dependencies, but first check that there are some stats
@@ -230,31 +264,100 @@ clauselist_selectivity(PlannerInfo *root,
(count_mv_attnums(clauses, relid,
MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST) >= 2))
{
- /* collect attributes from the compatible conditions */
- Bitmapset *mvattnums = collect_mv_attnums(clauses, relid,
- MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST);
+ ListCell *s;
+
+ /*
+ * Copy the conditions we got from the upper part of the expression tree
+ * so that we can add local conditions to it (we need to keep the
+ * original list intact, for sibling expressions - other expressions
+ * at the same level).
+ */
+ List *conditions_local = list_copy(conditions);
+
+ /* find the best combination of statistics */
+ List *solution = choose_mv_statistics(root, relid, stats,
+ clauses, conditions);
- /* and search for the statistic covering the most attributes */
- MVStatisticInfo *mvstat = choose_mv_statistics(stats, mvattnums);
+ /* FIXME we must not scribble over the original list */
+ if (solution)
+ clauses = list_copy(clauses);
- if (mvstat != NULL) /* we have a matching stats */
+ /*
+ * We have a good solution, which is merely a list of statistics that
+ * we need to apply. We'll apply the statistics one by one (in the order
+ * as they appear in the list), and for each statistic we'll
+ *
+ * (1) find clauses compatible with the statistic (and remove them
+ * from the list)
+ *
+ * (2) find local conditions compatible with the statistic
+ *
+ * (3) do the estimation P(clauses | conditions)
+ *
+ * (4) append the estimated clauses to local conditions
+ *
+ * thus continuously growing the list of local conditions.
+ */
+ foreach (s, solution)
{
- /* clauses compatible with multi-variate stats */
- List *mvclauses = NIL;
+ MVStatisticInfo *mvstat = (MVStatisticInfo *)lfirst(s);
- /* split the clauselist into regular and mv-clauses */
- clauses = clauselist_mv_split(root, relid, clauses, &mvclauses,
- mvstat, MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST);
+ /* clauses compatible with the statistic we're applying right now */
+ List *stat_clauses = NIL;
+ List *stat_conditions = NIL;
- /* we've chosen the histogram to match the clauses */
- Assert(mvclauses != NIL);
+ /*
+ * Find clauses and conditions matching the statistic - the clauses
+ * need to be removed from the list, while conditions should remain
+ * there (so that we can apply them repeatedly).
+ */
+ stat_clauses
+ = clauses_matching_statistic(&clauses, mvstat, relid,
+ MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST,
+ true);
+
+ stat_conditions
+ = clauses_matching_statistic(&conditions_local, mvstat, relid,
+ MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST,
+ false);
+
+ /*
+ * If we got no clauses to estimate, we've done something wrong,
+ * either during the optimization, when detecting compatible clauses, or
+ * somewhere else.
+ *
+ * Also, we need at least two attributes in clauses and conditions.
+ */
+ Assert(stat_clauses != NIL);
+ Assert(count_mv_attnums(list_union(stat_clauses, stat_conditions),
+ relid, MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST) >= 2);
/* compute the multivariate stats */
- s1 *= clauselist_mv_selectivity(root, mvclauses, mvstat);
+ s1 *= clauselist_mv_selectivity(root, mvstat,
+ stat_clauses, stat_conditions,
+ false); /* AND */
+
+ /*
+ * Add the new clauses to the local conditions, so that we can use
+ * them for the subsequent statistics. We only add the clauses,
+ * because the conditions are already there (or should be).
+ */
+ conditions_local = list_concat(conditions_local, stat_clauses);
}
+
+ /* from now on, work only with the 'local' list of conditions */
+ conditions = conditions_local;
}
/*
+ * If there's exactly one clause, then no use in trying to match up
+ * pairs, so just go directly to clause_selectivity().
+ */
+ if (list_length(clauses) == 1)
+ return s1 * clause_selectivity(root, (Node *) linitial(clauses),
+ varRelid, jointype, sjinfo, conditions);
+
+ /*
* Initial scan over clauses. Anything that doesn't look like a potential
* rangequery clause gets multiplied into s1 and forgotten. Anything that
* does gets inserted into an rqlist entry.
@@ -266,7 +369,8 @@ clauselist_selectivity(PlannerInfo *root,
Selectivity s2;
/* Always compute the selectivity using clause_selectivity */
- s2 = clause_selectivity(root, clause, varRelid, jointype, sjinfo);
+ s2 = clause_selectivity(root, clause, varRelid, jointype, sjinfo,
+ conditions);
/*
* Check for being passed a RestrictInfo.
@@ -425,6 +529,55 @@ clauselist_selectivity(PlannerInfo *root,
}
/*
+ * Similar to clauselist_selectivity(), but for OR-clauses. We can't simply use
+ * the same multi-statistic estimation logic as for AND-clauses, at least not
+ * directly, because there are a few key differences:
+ *
+ * - functional dependencies don't really apply to OR-clauses
+ *
+ * - clauselist_selectivity() is based on decomposing the selectivity into
+ * a sequence of conditional probabilities (selectivities), but that can
+ * be done only for AND-clauses
+ *
+ * We might invent a similar infrastructure for optimizing OR-clauses, doing
+ * something similar to what clause_selectivity does for AND-clauses, but
+ * luckily we know that each OR-clause
+ *
+ * (a OR b OR c)
+ *
+ * may be rewritten as an equivalent AND-clause (by De Morgan's laws)
+ * by using negation:
+ *
+ * NOT ((NOT a) AND (NOT b) AND (NOT c))
+ *
+ * And that's something we can pass to clauselist_selectivity and let it do
+ * all the heavy lifting.
+ */
+static Selectivity
+clauselist_selectivity_or(PlannerInfo *root,
+ List *clauses,
+ int varRelid,
+ JoinType jointype,
+ SpecialJoinInfo *sjinfo,
+ List *conditions)
+{
+ List *args = NIL;
+ ListCell *l;
+ Expr *expr;
+
+ /* build arguments for the AND-clause by negating args of the OR-clause */
+ foreach (l, clauses)
+ args = lappend(args, makeBoolExpr(NOT_EXPR, list_make1(lfirst(l)), -1));
+
+ /* and then build the AND-clause over the negated args */
+ expr = makeBoolExpr(AND_EXPR, args, -1);
+
+ /* instead of constructing NOT expression, just do (1.0 - s) */
+ return 1.0 - clauselist_selectivity(root, list_make1(expr), varRelid,
+ jointype, sjinfo, conditions);
+}
+
+/*
* addRangeClause --- add a new range clause for clauselist_selectivity
*
* Here is where we try to match up pairs of range-query clauses
@@ -631,7 +784,8 @@ clause_selectivity(PlannerInfo *root,
Node *clause,
int varRelid,
JoinType jointype,
- SpecialJoinInfo *sjinfo)
+ SpecialJoinInfo *sjinfo,
+ List *conditions)
{
Selectivity s1 = 0.5; /* default for any unhandled clause type */
RestrictInfo *rinfo = NULL;
@@ -751,7 +905,8 @@ clause_selectivity(PlannerInfo *root,
(Node *) get_notclausearg((Expr *) clause),
varRelid,
jointype,
- sjinfo);
+ sjinfo,
+ conditions);
}
else if (and_clause(clause))
{
@@ -760,29 +915,18 @@ clause_selectivity(PlannerInfo *root,
((BoolExpr *) clause)->args,
varRelid,
jointype,
- sjinfo);
+ sjinfo,
+ conditions);
}
else if (or_clause(clause))
{
- /*
- * Selectivities for an OR clause are computed as s1+s2 - s1*s2 to
- * account for the probable overlap of selected tuple sets.
- *
- * XXX is this too conservative?
- */
- ListCell *arg;
-
- s1 = 0.0;
- foreach(arg, ((BoolExpr *) clause)->args)
- {
- Selectivity s2 = clause_selectivity(root,
- (Node *) lfirst(arg),
- varRelid,
- jointype,
- sjinfo);
-
- s1 = s1 + s2 - s1 * s2;
- }
+ /* just call clauselist_selectivity_or() */
+ s1 = clauselist_selectivity_or(root,
+ ((BoolExpr *) clause)->args,
+ varRelid,
+ jointype,
+ sjinfo,
+ conditions);
}
else if (is_opclause(clause) || IsA(clause, DistinctExpr))
{
@@ -872,7 +1016,8 @@ clause_selectivity(PlannerInfo *root,
(Node *) ((RelabelType *) clause)->arg,
varRelid,
jointype,
- sjinfo);
+ sjinfo,
+ conditions);
}
else if (IsA(clause, CoerceToDomain))
{
@@ -881,7 +1026,8 @@ clause_selectivity(PlannerInfo *root,
(Node *) ((CoerceToDomain *) clause)->arg,
varRelid,
jointype,
- sjinfo);
+ sjinfo,
+ conditions);
}
else
{
@@ -945,300 +1091,1395 @@ clause_selectivity(PlannerInfo *root,
* in the MCV list, then the selectivity is below the lowest frequency
* found in the MCV list,
*
- * TODO When applying the clauses to the histogram/MCV list, we can do
- * that from the most selective clauses first, because that'll
- * eliminate the buckets/items sooner (so we'll be able to skip
- * them without inspection, which is more expensive). But this
- * requires really knowing the per-clause selectivities in advance,
- * and that's not what we do now.
+ * TODO When applying the clauses to the histogram/MCV list, we can do that from
+ * the most selective clauses first, because that'll eliminate the
+ * buckets/items sooner (so we'll be able to skip them without inspection,
+ * which is more expensive). But this requires really knowing the
+ * per-clause selectivities in advance, and that's not what we do now.
+ *
*/
static Selectivity
-clauselist_mv_selectivity(PlannerInfo *root, List *clauses, MVStatisticInfo *mvstats)
+clauselist_mv_selectivity(PlannerInfo *root, MVStatisticInfo *mvstats,
+ List *clauses, List *conditions, bool is_or)
{
bool fullmatch = false;
Selectivity s1 = 0.0, s2 = 0.0;
- /*
- * Lowest frequency in the MCV list (may be used as an upper bound
- * for full equality conditions that did not match any MCV item).
- */
- Selectivity mcv_low = 0.0;
+ /*
+ * Lowest frequency in the MCV list (may be used as an upper bound
+ * for full equality conditions that did not match any MCV item).
+ */
+ Selectivity mcv_low = 0.0;
+
+ /* TODO Evaluate simple 1D selectivities, use the smallest one as
+ * an upper bound, product as lower bound, and sort the
+ * clauses in ascending order by selectivity (to optimize the
+ * MCV/histogram evaluation).
+ */
+
+ /* Evaluate the MCV first. */
+ s1 = clauselist_mv_selectivity_mcvlist(root, mvstats,
+ clauses, conditions, is_or,
+ &fullmatch, &mcv_low);
+
+ /*
+ * If we got a full equality match on the MCV list, we're done (and
+ * the estimate is pretty good).
+ */
+ if (fullmatch && (s1 > 0.0))
+ return s1;
+
+ /* TODO if (fullmatch) without matching MCV item, use the mcv_low
+ * selectivity as upper bound */
+
+ s2 = clauselist_mv_selectivity_histogram(root, mvstats,
+ clauses, conditions, is_or);
+
+ /* TODO clamp to <= 1.0 (or more strictly, when possible) */
+ return s1 + s2;
+}
+
+/*
+ * Pull varattnos from the clauses, similarly to pull_varattnos() but:
+ *
+ * (a) only get attributes for a particular relation (relid)
+ * (b) ignore system attributes (we can't build stats on them anyway)
+ *
+ * This makes it possible to directly compare the result with attnum
+ * values from pg_attribute etc.
+ */
+static Bitmapset *
+get_varattnos(Node * node, Index relid)
+{
+ int k;
+ Bitmapset *varattnos = NULL;
+ Bitmapset *result = NULL;
+
+ /* get the varattnos */
+ pull_varattnos(node, relid, &varattnos);
+
+ k = -1;
+ while ((k = bms_next_member(varattnos, k)) >= 0)
+ {
+ if (k + FirstLowInvalidHeapAttributeNumber > 0)
+ result
+ = bms_add_member(result,
+ k + FirstLowInvalidHeapAttributeNumber);
+ }
+
+ bms_free(varattnos);
+
+ return result;
+}
+
+/*
+ * Collect attributes from mv-compatible clauses.
+ */
+static Bitmapset *
+collect_mv_attnums(List *clauses, Index relid, int types)
+{
+ Bitmapset *attnums = NULL;
+ ListCell *l;
+
+ /*
+ * Walk through the clauses and identify the ones we can estimate
+ * using multivariate stats, and remember the relid/columns. We'll
+ * then cross-check if we have suitable stats, and only if needed
+ * we'll split the clauses into multivariate and regular lists.
+ *
+ * For now we're only interested in RestrictInfo nodes with nested
+ * OpExpr, using either a range or equality.
+ */
+ foreach (l, clauses)
+ {
+ Node *clause = (Node *) lfirst(l);
+
+ /* ignore the result here - we only need the attnums */
+ clause_is_mv_compatible(clause, relid, &attnums, types);
+ }
+
+ /*
+ * If there are not at least two attributes referenced by the clause(s),
+ * we can throw everything out (as we'll revert to simple stats).
+ */
+ if (bms_num_members(attnums) <= 1)
+ {
+ bms_free(attnums);
+ attnums = NULL;
+ }
+
+ return attnums;
+}
+
+/*
+ * Count the number of attributes in clauses compatible with multivariate stats.
+ */
+static int
+count_mv_attnums(List *clauses, Index relid, int type)
+{
+ int c;
+ Bitmapset *attnums = collect_mv_attnums(clauses, relid, type);
+
+ c = bms_num_members(attnums);
+
+ bms_free(attnums);
+
+ return c;
+}
+
+/*
+ * Count varnos referenced in the clauses, and if there's a single varno then
+ * return the index in 'relid'.
+ */
+static int
+count_varnos(List *clauses, Index *relid)
+{
+ int cnt;
+ Bitmapset *varnos = NULL;
+
+ varnos = pull_varnos((Node *) clauses);
+ cnt = bms_num_members(varnos);
+
+ /* if there's a single varno in the clauses, remember it */
+ if (bms_num_members(varnos) == 1)
+ *relid = bms_singleton_member(varnos);
+
+ bms_free(varnos);
+
+ return cnt;
+}
+
+static List *
+clauses_matching_statistic(List **clauses, MVStatisticInfo *statistic,
+ Index relid, int types, bool remove)
+{
+ int i;
+ Bitmapset *stat_attnums = NULL;
+ List *matching_clauses = NIL;
+ ListCell *lc;
+
+ /* build attnum bitmapset for this statistics */
+ for (i = 0; i < statistic->stakeys->dim1; i++)
+ stat_attnums = bms_add_member(stat_attnums,
+ statistic->stakeys->values[i]);
+
+ /*
+ * We can't use foreach here, because we may need to remove some of the
+ * clauses if (remove=true).
+ */
+ lc = list_head(*clauses);
+ while (lc)
+ {
+ Node *clause = (Node*)lfirst(lc);
+ Bitmapset *attnums = NULL;
+
+ /* must advance lc before list_delete possibly pfree's it */
+ lc = lnext(lc);
+
+ /*
+ * skip clauses that are not compatible with stats (just leave them
+ * in the original list)
+ *
+ * XXX Perhaps this should check what stats are actually available in
+ * the statistics (not a big deal now, because MCV and histograms
+ * handle the same types of conditions).
+ */
+ if (! clause_is_mv_compatible(clause, relid, &attnums, types))
+ {
+ bms_free(attnums);
+ continue;
+ }
+
+ /* if the clause is covered by the statistic, add it to the list */
+ if (bms_is_subset(attnums, stat_attnums))
+ {
+ matching_clauses = lappend(matching_clauses, clause);
+
+ /* if remove=true, remove the matching item from the main list */
+ if (remove)
+ *clauses = list_delete_ptr(*clauses, clause);
+ }
+
+ bms_free(attnums);
+ }
+
+ bms_free(stat_attnums);
+
+ return matching_clauses;
+}
+
+/*
+ * Selects the best combination of multivariate statistics, in an exhaustive
+ * way, where 'best' means:
+ *
+ * (a) covering the most attributes (referenced by clauses)
+ * (b) using the least number of multivariate stats
+ * (c) using the most conditions to exploit dependency
+ *
+ * Don't call this directly but through choose_mv_statistics(), which does some
+ * additional tricks to minimize the runtime.
+ *
+ *
+ * Algorithm
+ * ---------
+ * The algorithm is a recursive implementation of backtracking, with maximum
+ * depth equal to the number of multi-variate statistics available on the table.
+ * It actually explores all valid combinations of stats.
+ *
+ * Whenever it considers adding the next statistics, the clauses it matches are
+ * divided into 'conditions' (clauses already matched by at least one previous
+ * statistics) and clauses that are estimated.
+ *
+ * Then several checks are performed:
+ *
+ * (a) The statistics covers at least 2 columns, referenced in the estimated
+ * clauses (otherwise multi-variate stats are useless).
+ *
+ * (b) The statistics covers at least 1 new column, i.e. a column not referenced
+ * by the already used stats (and the new column has to be referenced by
+ * the clauses, of course). Otherwise the statistics would not add any new
+ * information.
+ *
+ * There are some other sanity checks (e.g. stats must not be used twice etc.).
+ *
+ *
+ * Weaknesses
+ * ----------
+ * The current implementation uses a rather simple optimality criterion, so it
+ * may not make the best choice when
+ *
+ * (a) There may be multiple solutions with the same number of covered
+ * attributes and number of statistics (e.g. the same solution but with
+ * statistics in a different order). It's unclear which solution in the best
+ * one - in a sense all of them are equal.
+ *
+ * TODO It might be possible to compute estimate for each of those solutions,
+ * and then combine them to get the final estimate (e.g. by using average
+ * or median).
+ *
+ * (b) Does not consider that some types of stats are a better match for some
+ * types of clauses (e.g. MCV list is generally a better match for equality
+ * conditions than a histogram).
+ *
+ * But maybe this is pointless - generally, each column is either a label
+ * (it's not important whether because of the data type or how it's used),
+ * or a value with ordering that makes sense. So either a MCV list is more
+ * appropriate (labels) or a histogram (values with orderings).
+ *
+ * Not sure what to do with statistics on columns mixing both types of data
+ * (some columns would work best with MCVs, some with histograms). Maybe we
+ * could invent a new type of statistics combining MCV list and histogram
+ * (keeping a small histogram for each MCV item, and a separate histogram
+ * for values not on the MCV list).
+ *
+ * TODO The algorithm should probably count number of Vars (not just attnums)
+ * when computing the 'score' of each solution. Computing the ratio of
+ * (num of all vars) / (num of condition vars) as a measure of how well
+ * the solution uses conditions might be useful.
+ */
+static void
+choose_mv_statistics_exhaustive(PlannerInfo *root, int step,
+ int nmvstats, MVStatisticInfo *mvstats, Bitmapset ** stats_attnums,
+ int nclauses, Node ** clauses, Bitmapset ** clauses_attnums,
+ int nconditions, Node ** conditions, Bitmapset ** conditions_attnums,
+ bool *cover_map, bool *condition_map, int *ruled_out,
+ mv_solution_t *current, mv_solution_t **best)
+{
+ int i, j;
+
+ Assert(best != NULL);
+ Assert((step == 0 && current == NULL) || (step > 0 && current != NULL));
+
+ /* this may run for a long time, so let's make it interruptible */
+ CHECK_FOR_INTERRUPTS();
+
+ if (current == NULL)
+ {
+ current = (mv_solution_t*)palloc0(sizeof(mv_solution_t));
+ current->stats = (int*)palloc0(sizeof(int)*nmvstats);
+ current->nstats = 0;
+ current->nclauses = 0;
+ current->nconditions = 0;
+ }
+
+ /*
+ * Now try to apply each statistics matching at least two attributes,
+ * unless it was already used in one of the previous steps.
+ */
+ for (i = 0; i < nmvstats; i++)
+ {
+ int c;
+
+ int ncovered_clauses = 0; /* number of covered clauses */
+ int ncovered_conditions = 0; /* number of covered conditions */
+ int nattnums = 0; /* number of covered attributes */
+
+ Bitmapset *all_attnums = NULL;
+
+ /* skip statistics that were already used or eliminated */
+ if (ruled_out[i] != -1)
+ continue;
+
+ /*
+ * See if we have clauses covered by this statistics, but not
+ * yet covered by any of the preceding ones.
+ */
+ for (c = 0; c < nclauses; c++)
+ {
+ bool covered = false;
+ Bitmapset *clause_attnums = clauses_attnums[c];
+ Bitmapset *tmp = NULL;
+
+ /*
+ * If this clause is not covered by this statistics, we can't
+ * use the statistics to estimate it at all.
+ */
+ if (! cover_map[i * nclauses + c])
+ continue;
+
+ /*
+ * Now we know we'll use this clause - either as a condition
+ * or as a new clause (the estimated one). So let's add the
+ * attributes to the attnums from all the clauses usable with
+ * this statistics.
+ */
+ tmp = bms_union(all_attnums, clause_attnums);
+
+ /* free the old bitmap */
+ bms_free(all_attnums);
+ all_attnums = tmp;
+
+ /* let's see if it's covered by any of the previous stats */
+ for (j = 0; j < step; j++)
+ {
+ /* already covered by the previous stats */
+ if (cover_map[current->stats[j] * nclauses + c])
+ covered = true;
+
+ if (covered)
+ break;
+ }
+
+ /* if already covered, continue with the next clause */
+ if (covered)
+ {
+ ncovered_conditions += 1;
+ continue;
+ }
+
+ /*
+ * OK, this clause is covered by this statistics (and not by
+ * any of the previous ones)
+ */
+ ncovered_clauses += 1;
+ }
+
+ /* can't have more new clauses than original clauses */
+ Assert(nclauses >= ncovered_clauses);
+ Assert(ncovered_clauses >= 0); /* mostly paranoia */
+
+ nattnums = bms_num_members(all_attnums);
+
+ /* free all the bitmapsets - we don't need them anymore */
+ bms_free(all_attnums);
+
+ all_attnums = NULL;
+
+ /*
+ * Now do the same for conditions - count those covered by this
+ * statistics, and collect attnums from all of them.
+ */
+ for (c = 0; c < nconditions; c++)
+ {
+ Bitmapset *clause_attnums = conditions_attnums[c];
+ Bitmapset *tmp = NULL;
+
+ /*
+ * If this condition is not covered by this statistics, we
+ * can't use the statistics to evaluate it at all.
+ */
+ if (! condition_map[i * nconditions + c])
+ continue;
+
+ /* count this as a condition */
+ ncovered_conditions += 1;
+
+ /*
+ * Now we know we'll use this clause - either as a condition
+ * or as a new clause (the estimated one). So let's add the
+ * attributes to the attnums from all the clauses usable with
+ * this statistics.
+ */
+ tmp = bms_union(all_attnums, clause_attnums);
+
+ /* free the old bitmap */
+ bms_free(all_attnums);
+ all_attnums = tmp;
+ }
+
+ /*
+ * Let's mark the statistics as 'ruled out' - either we'll use
+ * it (and proceed to the next step), or it's incompatible.
+ */
+ ruled_out[i] = step;
+
+ /*
+ * Skip this statistics if there are no usable clauses (i.e. if all
+ * the matching clauses are already covered by some of the previous
+ * stats).
+ *
+ * Similarly, if the usable clauses reference only a single
+ * attribute, multivariate statistics are pointless.
+ */
+ if ((ncovered_clauses == 0) || (nattnums < 2))
+ continue;
+
+ /*
+ * TODO Not sure if it's possible to add a clause referencing only
+ * attributes already covered by the previous stats (i.e. a clause
+ * introducing only a new dependency, not a new attribute).
+ * Couldn't come up with an example, though. Might be worth
+ * adding an assert.
+ */
+
+ /*
+ * got a suitable statistics - let's update the current solution,
+ * maybe use it as the best solution
+ */
+ current->nclauses += ncovered_clauses;
+ current->nconditions += ncovered_conditions;
+ current->nstats += 1;
+ current->stats[step] = i;
+
+ /*
+ * We can never cover more clauses, or use more stats than we
+ * actually have at the beginning.
+ */
+ Assert(nclauses >= current->nclauses);
+ Assert(nmvstats >= current->nstats);
+ Assert(step < nmvstats);
+
+ if (*best == NULL)
+ {
+ *best = (mv_solution_t*)palloc0(sizeof(mv_solution_t));
+ (*best)->stats = (int*)palloc0(sizeof(int)*nmvstats);
+ (*best)->nstats = 0;
+ (*best)->nclauses = 0;
+ (*best)->nconditions = 0;
+ }
+
+ /*
+ * See if it's better than the current 'best' solution - more covered
+ * clauses win, and on a tie we prefer fewer statistics (goal (b) in
+ * the header comment, matching the greedy search).
+ */
+ if ((current->nclauses > (*best)->nclauses) ||
+ ((current->nclauses == (*best)->nclauses) &&
+ ((current->nstats < (*best)->nstats))))
+ {
+ (*best)->nstats = current->nstats;
+ (*best)->nclauses = current->nclauses;
+ (*best)->nconditions = current->nconditions;
+ memcpy((*best)->stats, current->stats, nmvstats * sizeof(int));
+ }
+
+ /*
+ * Recurse to the next step only if there are still unused
+ * statistics left to add.
+ */
+ if ((step + 1) < nmvstats)
+ choose_mv_statistics_exhaustive(root, step+1,
+ nmvstats, mvstats, stats_attnums,
+ nclauses, clauses, clauses_attnums,
+ nconditions, conditions, conditions_attnums,
+ cover_map, condition_map, ruled_out,
+ current, best);
+
+ /* reset the last step */
+ current->nclauses -= ncovered_clauses;
+ current->nconditions -= ncovered_conditions;
+ current->nstats -= 1;
+ current->stats[step] = 0;
+
+ /* mark the statistics as usable again */
+ ruled_out[i] = -1;
+
+ Assert(current->nclauses >= 0);
+ Assert(current->nstats >= 0);
+ }
+
+ /* reset all statistics as 'incompatible' in this step */
+ for (i = 0; i < nmvstats; i++)
+ if (ruled_out[i] == step)
+ ruled_out[i] = -1;
+
+}
+
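+/*
+ * A worked example (hypothetical, for illustration): with statistics
+ * S1 on (a,b) and S2 on (b,c), and clauses (a=1 AND b=1) and
+ * (b=1 AND c=1), the backtracking explores [S1], [S1,S2], [S2] and
+ * [S2,S1], and keeps [S1,S2] - the first solution covering both
+ * clauses with two statistics.
+ */
+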
+/*
+ * Greedy search for a multivariate solution - a sequence of statistics covering
+ * the clauses. This chooses the "best" statistics at each step, so the
+ * resulting solution may not be the best solution globally, but this produces
+ * the solution in only N steps (where N is the number of statistics), while
+ * the exhaustive approach may have to walk through ~N! combinations (although
+ * some of those are terminated early).
+ *
+ * See the comments at choose_mv_statistics_exhaustive() as this does the same
+ * thing (but in a different way).
+ *
+ * Don't call this directly, but through choose_mv_statistics().
+ *
+ * TODO There are probably other metrics we might use - e.g. using number of
+ * columns (num_cond_columns / num_cov_columns), which might work better
+ * with a mix of simple and complex clauses.
+ *
+ * TODO Also the choice at the very first step should be handled in a special
+ * way, because there will be 0 conditions at that moment, so there needs
+ * to be some other criterion - e.g. using the simplest (or most complex?)
+ * clause might be a good idea.
+ *
+ * TODO We might also select multiple stats using different criteria, and branch
+ * the search. This is however tricky, because if we choose k statistics at
+ * each step, we get k^N branches to walk through (with N steps). That's
+ not really good with a large number of stats (though still better
+ than an exhaustive search).
+ */
+static void
+choose_mv_statistics_greedy(PlannerInfo *root, int step,
+ int nmvstats, MVStatisticInfo *mvstats, Bitmapset ** stats_attnums,
+ int nclauses, Node ** clauses, Bitmapset ** clauses_attnums,
+ int nconditions, Node ** conditions, Bitmapset ** conditions_attnums,
+ bool *cover_map, bool *condition_map, int *ruled_out,
+ mv_solution_t *current, mv_solution_t **best)
+{
+ int i, j;
+ int best_stat = -1;
+ double gain, max_gain = -1.0;
+
+ /*
+ * Bitmap tracking which clauses are already covered (by the previous
+ * statistics) and may thus serve only as a condition in this step.
+ */
+ bool *covered_clauses = (bool*)palloc0(nclauses * sizeof(bool));
+
+ /*
+ * Number of clauses and columns covered by each statistics - this
+ * includes both conditions and clauses covered by the statistics for
+ * the first time. The number of columns may count some columns
+ * repeatedly - if a column is shared by multiple clauses, it will
+ * be counted once for each clause (covered by the statistics).
+ * So with two clauses [(a=1 OR b=2),(a<2 OR c>1)] the column "a"
+ * will be counted twice (if both clauses are covered).
+ *
+ * The values for ruled-out statistics (that can't be applied) are
+ * not computed, because that'd be pointless.
+ */
+ int *num_cov_clauses = (int*)palloc0(sizeof(int) * nmvstats);
+ int *num_cov_columns = (int*)palloc0(sizeof(int) * nmvstats);
+
+ /*
+ * Same as above, but this only includes clauses that are already
+ * covered by the previous stats (and the current one).
+ */
+ int *num_cond_clauses = (int*)palloc0(sizeof(int) * nmvstats);
+ int *num_cond_columns = (int*)palloc0(sizeof(int) * nmvstats);
+
+ /*
+ * Number of attributes for each clause.
+ *
+ * TODO Might be computed in choose_mv_statistics() and then passed
+ * here, but then the function would not have the same signature
+ * as _exhaustive().
+ */
+ int *attnum_counts = (int*)palloc0(sizeof(int) * nclauses);
+ int *attnum_cond_counts = (int*)palloc0(sizeof(int) * nconditions);
+
+ CHECK_FOR_INTERRUPTS();
+
+ Assert(best != NULL);
+ Assert((step == 0 && current == NULL) || (step > 0 && current != NULL));
+
+ /* compute attributes (columns) for each clause */
+ for (i = 0; i < nclauses; i++)
+ attnum_counts[i] = bms_num_members(clauses_attnums[i]);
+
+ /* compute attributes (columns) for each condition */
+ for (i = 0; i < nconditions; i++)
+ attnum_cond_counts[i] = bms_num_members(conditions_attnums[i]);
+
+ /* see which clauses are already covered at this point (by previous stats) */
+ for (i = 0; i < step; i++)
+ for (j = 0; j < nclauses; j++)
+ covered_clauses[j] |= (cover_map[current->stats[i] * nclauses + j]);
+
+ /* which remaining statistics covers most clauses / uses most conditions? */
+ for (i = 0; i < nmvstats; i++)
+ {
+ Bitmapset *attnums_covered = NULL;
+ Bitmapset *attnums_conditions = NULL;
+
+ /* skip stats that are already ruled out (either used or inapplicable) */
+ if (ruled_out[i] != -1)
+ continue;
+
+ /* count covered clauses and conditions (for the statistics) */
+ for (j = 0; j < nclauses; j++)
+ {
+ if (cover_map[i * nclauses + j])
+ {
+ Bitmapset *attnums_new
+ = bms_union(attnums_covered, clauses_attnums[j]);
+
+ /* get rid of the old bitmap and keep the unified result */
+ bms_free(attnums_covered);
+ attnums_covered = attnums_new;
+
+ num_cov_clauses[i] += 1;
+ num_cov_columns[i] += attnum_counts[j];
+
+ /* is the clause already covered (i.e. a condition)? */
+ if (covered_clauses[j])
+ {
+ num_cond_clauses[i] += 1;
+ num_cond_columns[i] += attnum_counts[j];
+ attnums_new = bms_union(attnums_conditions,
+ clauses_attnums[j]);
+
+ bms_free(attnums_conditions);
+ attnums_conditions = attnums_new;
+ }
+ }
+ }
+
+ /* if all covered clauses are covered by prev stats (thus conditions) */
+ if (num_cov_clauses[i] == num_cond_clauses[i])
+ ruled_out[i] = step;
+
+ /* same if there are no new attributes */
+ else if (bms_num_members(attnums_conditions) == bms_num_members(attnums_covered))
+ ruled_out[i] = step;
+
+ bms_free(attnums_covered);
+ bms_free(attnums_conditions);
+
+ /* if the statistics is inapplicable, try the next one */
+ if (ruled_out[i] != -1)
+ continue;
+
+ /* now let's walk through conditions and count the covered */
+ for (j = 0; j < nconditions; j++)
+ {
+ if (condition_map[i * nconditions + j])
+ {
+ num_cond_clauses[i] += 1;
+ num_cond_columns[i] += attnum_cond_counts[j];
+ }
+ }
+
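+ /*
+ * Example (illustrative numbers): a statistics whose covered clauses
+ * reference 4 columns in total, 3 of them from clauses that are
+ * already covered (i.e. conditions), gets gain 3/4 = 0.75 and is
+ * preferred over a statistics covering only new clauses (gain 0.0).
+ */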
+ /* otherwise see if this statistics improves the gain metric */
+ gain = num_cond_columns[i] / (double)num_cov_columns[i];
+
+ if (gain > max_gain)
+ {
+ max_gain = gain;
+ best_stat = i;
+ }
+ }
+
+ /*
+ * Have we found a suitable statistics? Add it to the solution and
+ * try next step.
+ */
+ if (best_stat != -1)
+ {
+ /* mark the statistics, so that we skip it in next steps */
+ ruled_out[best_stat] = step;
+
+ /* allocate current solution if necessary */
+ if (current == NULL)
+ {
+ current = (mv_solution_t*)palloc0(sizeof(mv_solution_t));
+ current->stats = (int*)palloc0(sizeof(int)*nmvstats);
+ current->nstats = 0;
+ current->nclauses = 0;
+ current->nconditions = 0;
+ }
+
+ current->nclauses += num_cov_clauses[best_stat];
+ current->nconditions += num_cond_clauses[best_stat];
+ current->stats[step] = best_stat;
+ current->nstats++;
+
+ if (*best == NULL)
+ {
+ (*best) = (mv_solution_t*)palloc0(sizeof(mv_solution_t));
+ (*best)->nstats = current->nstats;
+ (*best)->nclauses = current->nclauses;
+ (*best)->nconditions = current->nconditions;
+
+ (*best)->stats = (int*)palloc0(sizeof(int)*nmvstats);
+ memcpy((*best)->stats, current->stats, nmvstats * sizeof(int));
+ }
+ else
+ {
+ /* see if this is a better solution */
+ double current_gain = (double)current->nconditions / current->nclauses;
+ double best_gain = (double)(*best)->nconditions / (*best)->nclauses;
+
+ if ((current_gain > best_gain) ||
+ ((current_gain == best_gain) && (current->nstats < (*best)->nstats)))
+ {
+ (*best)->nstats = current->nstats;
+ (*best)->nclauses = current->nclauses;
+ (*best)->nconditions = current->nconditions;
+ memcpy((*best)->stats, current->stats, nmvstats * sizeof(int));
+ }
+ }
+
+ /*
+ * Recurse to the next step only if there are still unused
+ * statistics left to add.
+ */
+ if ((step + 1) < nmvstats)
+ choose_mv_statistics_greedy(root, step+1,
+ nmvstats, mvstats, stats_attnums,
+ nclauses, clauses, clauses_attnums,
+ nconditions, conditions, conditions_attnums,
+ cover_map, condition_map, ruled_out,
+ current, best);
+
+ /* reset the last step */
+ current->nclauses -= num_cov_clauses[best_stat];
+ current->nconditions -= num_cond_clauses[best_stat];
+ current->nstats -= 1;
+ current->stats[step] = 0;
+
+ /* mark the statistics as usable again */
+ ruled_out[best_stat] = -1;
+ }
+
+ /* reset all statistics eliminated in this step */
+ for (i = 0; i < nmvstats; i++)
+ if (ruled_out[i] == step)
+ ruled_out[i] = -1;
+
+ /* free everything allocated in this step */
+ pfree(covered_clauses);
+ pfree(attnum_counts);
+ pfree(attnum_cond_counts);
+ pfree(num_cov_clauses);
+ pfree(num_cov_columns);
+ pfree(num_cond_clauses);
+ pfree(num_cond_columns);
+}
+
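+/*
+ * Note on complexity: the greedy recursion adds exactly one statistics per
+ * step, so with N statistics and C clauses it does at most N steps, each
+ * scanning the remaining statistics and clauses - roughly O(N^2 * C) work
+ * in total, instead of the ~N! combinations of the exhaustive search.
+ */
+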
+/*
+ * Remove clauses not covered by any of the available statistics
+ *
+ * This helps us to reduce the amount of work done in choose_mv_statistics()
+ * by not having to deal with clauses that can't possibly be useful.
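+ *
+ * For example (hypothetical): with a single statistics on (a,b,c), a
+ * clause (a = 1 AND c = 2) is kept (all its attributes are covered),
+ * while (a = 1 AND d = 2) is filtered out, as no statistics covers "d".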
+ */
+static List *
+filter_clauses(PlannerInfo *root, Index relid, int type,
+ List *stats, List *clauses, Bitmapset **attnums)
+{
+ ListCell *c;
+ ListCell *s;
+
+ /* results (list of compatible clauses, attnums) */
+ List *rclauses = NIL;
+
+ foreach (c, clauses)
+ {
+ Node *clause = (Node*)lfirst(c);
+ Bitmapset *clause_attnums = NULL;
+
+ /*
+ * We do assume that thanks to previous checks, we should not run into
+ * clauses that are incompatible with multivariate stats here. We also
+ * need to collect the attnums for the clause.
+ *
+ * XXX Maybe turn this into an assert?
+ */
+ if (! clause_is_mv_compatible(clause, relid, &clause_attnums, type))
+ elog(ERROR, "should not get non-mv-compatible clause");
+
+ /* Is there a multivariate statistics covering the clause? */
+ foreach (s, stats)
+ {
+ int k, matches = 0;
+ MVStatisticInfo *stat = (MVStatisticInfo *)lfirst(s);
+
+ /* skip statistics not matching the required type */
+ if (! stats_type_matches(stat, type))
+ continue;
+
+ /*
+ * see if all clause attributes are covered by the statistic
+ *
+ * We'll do that in the opposite direction, i.e. we'll see how many
+ * attributes of the statistic are referenced in the clause, and then
+ * compare the counts.
+ */
+ for (k = 0; k < stat->stakeys->dim1; k++)
+ if (bms_is_member(stat->stakeys->values[k], clause_attnums))
+ matches += 1;
+
+ /*
+ * If the number of matches is equal to attributes referenced by the
+ * clause, then the clause is covered by the statistic.
+ */
+ if (bms_num_members(clause_attnums) == matches)
+ {
+ *attnums = bms_union(*attnums, clause_attnums);
+ rclauses = lappend(rclauses, clause);
+ break;
+ }
+ }
+
+ bms_free(clause_attnums);
+ }
+
+ /* we can't have more compatible clauses than source clauses */
+ Assert(list_length(clauses) >= list_length(rclauses));
+
+ return rclauses;
+}
+
+/*
+ * Remove statistics not covering any new clauses
+ *
+ * Statistics not covering any new clauses (conditions don't count) are not
+ * really useful, so let's ignore them. Also, we need the statistics to
+ * reference at least two different attributes (both in conditions and clauses
+ * combined), and at least one of them in the clauses alone.
+ *
+ * This check might be made more strict by checking against individual clauses,
+ * because by using the bitmapsets of all attnums we may actually use attnums
+ * from clauses that are not covered by the statistics. For example, we may
+ * have a condition
+ *
+ * (a=1 AND b=2)
+ *
+ * and a new clause
+ *
+ * (c=1 AND d=1)
+ *
+ * With only bitmapsets, statistics on [b,c] will pass through this (assuming
+ * there are some statistics covering both clauses).
+ *
+ * Parameters:
+ *
+ * stats - list of statistics to filter
+ * new_attnums - attnums referenced in new clauses
+ * all_attnums - attnums referenced by conditions and new clauses combined
+ *
+ * Returns filtered list of statistics.
+ *
+ * TODO Do the more strict check, i.e. walk through individual clauses and
+ * conditions and only use those covered by the statistics.
+ */
+static List *
+filter_stats(List *stats, Bitmapset *new_attnums, Bitmapset *all_attnums)
+{
+ ListCell *s;
+ List *stats_filtered = NIL;
+
+ foreach (s, stats)
+ {
+ int k;
+ int matches_new = 0,
+ matches_all = 0;
+
+ MVStatisticInfo *stat = (MVStatisticInfo *)lfirst(s);
+
+ /* see how many attributes the statistics covers */
+ for (k = 0; k < stat->stakeys->dim1; k++)
+ {
+ /* attributes from new clauses */
+ if (bms_is_member(stat->stakeys->values[k], new_attnums))
+ matches_new += 1;
+
+ /* attributes from conditions and new clauses combined */
+ if (bms_is_member(stat->stakeys->values[k], all_attnums))
+ matches_all += 1;
+ }
+
+ /* check we have enough attributes for this statistics */
+ if ((matches_new >= 1) && (matches_all >= 2))
+ stats_filtered = lappend(stats_filtered, stat);
+ }
+
+ /* we can't have more useful stats than we had originally */
+ Assert(list_length(stats) >= list_length(stats_filtered));
+
+ return stats_filtered;
+}
+
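+/*
+ * Build an array of statistics from a list (returning its size through
+ * nmvstats) - an array is easier to work with when marking entries as
+ * redundant or ruled out.
+ */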
+static MVStatisticInfo *
+make_stats_array(List *stats, int *nmvstats)
+{
+ int i;
+ ListCell *l;
+
+ MVStatisticInfo *mvstats = NULL;
+ *nmvstats = list_length(stats);
- /* TODO Evaluate simple 1D selectivities, use the smallest one as
- * an upper bound, product as lower bound, and sort the
- * clauses in ascending order by selectivity (to optimize the
- * MCV/histogram evaluation).
- */
+ mvstats
+ = (MVStatisticInfo*)palloc0((*nmvstats) * sizeof(MVStatisticInfo));
- /* Evaluate the MCV first. */
- s1 = clauselist_mv_selectivity_mcvlist(root, clauses, mvstats,
- &fullmatch, &mcv_low);
+ i = 0;
+ foreach (l, stats)
+ {
+ MVStatisticInfo *stat = (MVStatisticInfo *)lfirst(l);
+ memcpy(&mvstats[i++], stat, sizeof(MVStatisticInfo));
+ }
- /*
- * If we got a full equality match on the MCV list, we're done (and
- * the estimate is pretty good).
- */
- if (fullmatch && (s1 > 0.0))
- return s1;
+ return mvstats;
+}
- /* TODO if (fullmatch) without matching MCV item, use the mcv_low
- * selectivity as upper bound */
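+/*
+ * Build an array of attnum bitmapsets, one per statistics, so that the
+ * coverage checks can use cheap bms_is_subset() calls.
+ */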
+static Bitmapset **
+make_stats_attnums(MVStatisticInfo *mvstats, int nmvstats)
+{
+ int i, j;
+ Bitmapset **stats_attnums = NULL;
- s2 = clauselist_mv_selectivity_histogram(root, clauses, mvstats);
+ Assert(nmvstats > 0);
- /* TODO clamp to <= 1.0 (or more strictly, when possible) */
- return s1 + s2;
+ /* build bitmaps of attnums for the stats (easier to compare) */
+ stats_attnums = (Bitmapset **)palloc0(nmvstats * sizeof(Bitmapset*));
+
+ for (i = 0; i < nmvstats; i++)
+ for (j = 0; j < mvstats[i].stakeys->dim1; j++)
+ stats_attnums[i]
+ = bms_add_member(stats_attnums[i],
+ mvstats[i].stakeys->values[j]);
+
+ return stats_attnums;
}
+
/*
- * Collect attributes from mv-compatible clauses.
+ * Remove redundant statistics
+ *
+ * If there are multiple statistics covering the same set of columns (counting
+ * only those referenced by clauses and conditions), we can apply one of those
+ * anyway and further reduce the size of the optimization problem.
+ *
+ * Thus when redundant stats are detected, we keep the smaller one (the one with
+ * fewer columns), based on the assumption that it's more accurate and also
+ * faster to process. That may be untrue for two reasons - first, the accuracy
+ * really depends on number of buckets/MCV items, not the number of columns.
+ * Second, some types of statistics may work better for certain types of clauses
+ * (e.g. MCV lists for equality conditions) etc.
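+ *
+ * Example (hypothetical): with clauses referencing only columns (a,b),
+ * statistics on (a,b) and on (a,b,c) intersect the referenced columns
+ * in the same set {a,b}, so one of them is redundant - we keep the
+ * smaller one, (a,b).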
*/
-static Bitmapset *
-collect_mv_attnums(List *clauses, Index relid, int types)
+static List*
+filter_redundant_stats(List *stats, List *clauses, List *conditions)
{
- Bitmapset *attnums = NULL;
- ListCell *l;
+ int i, j, nmvstats;
+
+ MVStatisticInfo *mvstats;
+ bool *redundant;
+ Bitmapset **stats_attnums;
+ Bitmapset *varattnos;
+ Index relid;
+
+ Assert(list_length(stats) > 0);
+ Assert(list_length(clauses) > 0);
+
+ /*
+ * We'll convert the list of statistics into an array now, because
+ * the reduction of redundant statistics is easier to do that way
+ * (we can mark previous stats as redundant, etc.).
+ */
+ mvstats = make_stats_array(stats, &nmvstats);
+ stats_attnums = make_stats_attnums(mvstats, nmvstats);
+
+ /* by default, none of the stats is redundant (so palloc0) */
+ redundant = palloc0(nmvstats * sizeof(bool));
+
+ /*
+ * We only expect a single relid here, and also we should get the
+ * same relid from clauses and conditions (but we get it from
+ * clauses, because those are certainly non-empty).
+ */
+ relid = bms_singleton_member(pull_varnos((Node*)clauses));
/*
- * Walk through the clauses and identify the ones we can estimate using
- * multivariate stats, and remember the relid/columns. We'll then
- * cross-check if we have suitable stats, and only if needed we'll split
- * the clauses into multivariate and regular lists.
+ * Get the varattnos from both conditions and clauses.
+ *
+ * This skips system attributes, although that should be impossible
+ * thanks to previous filtering out of incompatible clauses.
*
- * For now we're only interested in RestrictInfo nodes with nested OpExpr,
- * using either a range or equality.
+ * XXX Is that really true?
*/
- foreach (l, clauses)
+ varattnos = bms_union(get_varattnos((Node*)clauses, relid),
+ get_varattnos((Node*)conditions, relid));
+
+ for (i = 1; i < nmvstats; i++)
{
- Node *clause = (Node *) lfirst(l);
+ /* intersect with current statistics */
+ Bitmapset *curr = bms_intersect(stats_attnums[i], varattnos);
- /* ignore the result here - we only need the attnums */
- clause_is_mv_compatible(clause, relid, &attnums, types);
+ /* walk through 'previous' stats and check redundancy */
+ for (j = 0; j < i; j++)
+ {
+ /* intersect with current statistics */
+ Bitmapset *prev;
+
+ /* skip stats already identified as redundant */
+ if (redundant[j])
+ continue;
+
+ prev = bms_intersect(stats_attnums[j], varattnos);
+
+ switch (bms_subset_compare(curr, prev))
+ {
+ case BMS_EQUAL:
+ /*
+ * Use the smaller one (hopefully more accurate).
+ * If both have the same size, use the first one.
+ */
+ if (mvstats[i].stakeys->dim1 >= mvstats[j].stakeys->dim1)
+ redundant[i] = TRUE;
+ else
+ redundant[j] = TRUE;
+
+ break;
+
+ case BMS_SUBSET1: /* curr is subset of prev */
+ redundant[i] = TRUE;
+ break;
+
+ case BMS_SUBSET2: /* prev is subset of curr */
+ redundant[j] = TRUE;
+ break;
+
+ case BMS_DIFFERENT:
+ /* do nothing - keep both stats */
+ break;
+ }
+
+ bms_free(prev);
+ }
+
+ bms_free(curr);
}
- /*
- * If there are not at least two attributes referenced by the clause(s),
- * we can throw everything out (as we'll revert to simple stats).
- */
- if (bms_num_members(attnums) <= 1)
+ /* can't reduce all statistics (at least one has to remain) */
+ Assert(nmvstats > 0);
+
+ /* now, let's remove the redundant statistics and rebuild the list */
+ list_free(stats);
+ stats = NIL;
+
+ for (i = 0; i < nmvstats; i++)
{
- if (attnums != NULL)
- pfree(attnums);
- attnums = NULL;
+ MVStatisticInfo *info;
+
+ pfree(stats_attnums[i]);
+
+ if (redundant[i])
+ continue;
+
+ info = makeNode(MVStatisticInfo);
+ memcpy(info, &mvstats[i], sizeof(MVStatisticInfo));
+
+ stats = lappend(stats, info);
}
- return attnums;
+ pfree(mvstats);
+ pfree(stats_attnums);
+ pfree(redundant);
+
+ return stats;
}
-/*
- * Count the number of attributes in clauses compatible with multivariate stats.
- */
-static int
-count_mv_attnums(List *clauses, Index relid, int type)
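+/*
+ * Flatten a list of clauses into a plain array (returning the number of
+ * clauses through nclauses).
+ */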
+static Node**
+make_clauses_array(List *clauses, int *nclauses)
{
- int c;
- Bitmapset *attnums = collect_mv_attnums(clauses, relid, type);
+ int i;
+ ListCell *l;
- c = bms_num_members(attnums);
+ Node** clauses_array;
- bms_free(attnums);
+ *nclauses = list_length(clauses);
+ clauses_array = (Node **)palloc0((*nclauses) * sizeof(Node *));
- return c;
+ i = 0;
+ foreach (l, clauses)
+ clauses_array[i++] = (Node *)lfirst(l);
+
+ *nclauses = i;
+
+ return clauses_array;
}
-/*
- * Count varnos referenced in the clauses, and if there's a single varno then
- * return the index in 'relid'.
- */
-static int
-count_varnos(List *clauses, Index *relid)
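+/*
+ * Build an array of attnum bitmapsets, one per clause (all the clauses
+ * are expected to be mv-compatible at this point).
+ */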
+static Bitmapset **
+make_clauses_attnums(PlannerInfo *root, Index relid,
+ int type, Node **clauses, int nclauses)
{
- int cnt;
- Bitmapset *varnos = NULL;
+ int i;
+ Bitmapset **clauses_attnums
+ = (Bitmapset **)palloc0(nclauses * sizeof(Bitmapset *));
- varnos = pull_varnos((Node *) clauses);
- cnt = bms_num_members(varnos);
+ for (i = 0; i < nclauses; i++)
+ {
+ Bitmapset * attnums = NULL;
- /* if there's a single varno in the clauses, remember it */
- if (bms_num_members(varnos) == 1)
- *relid = bms_singleton_member(varnos);
+ if (! clause_is_mv_compatible(clauses[i], relid, &attnums, type))
+ elog(ERROR, "should not get non-mv-compatible clause");
- bms_free(varnos);
+ clauses_attnums[i] = attnums;
+ }
- return cnt;
+ return clauses_attnums;
+}
+
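+/*
+ * Build a 2D boolean map (nmvstats x nclauses, row-major) where the
+ * entry [i,j] says whether all attributes of clause j are covered by
+ * statistics i.
+ */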
+static bool*
+make_cover_map(Bitmapset **stats_attnums, int nmvstats,
+ Bitmapset **clauses_attnums, int nclauses)
+{
+ int i, j;
+ bool *cover_map = (bool*)palloc0(nclauses * nmvstats * sizeof(bool));
+
+ for (i = 0; i < nmvstats; i++)
+ for (j = 0; j < nclauses; j++)
+ cover_map[i * nclauses + j]
+ = bms_is_subset(clauses_attnums[j], stats_attnums[i]);
+
+ return cover_map;
}
/*
- * We're looking for statistics matching at least 2 attributes, referenced in
- * clauses compatible with multivariate statistics. The current selection
- * criteria is very simple - we choose the statistics referencing the most
- * attributes.
- *
- * If there are multiple statistics referencing the same number of columns
- * (from the clauses), the one with less source columns (as listed in the
- * ADD STATISTICS when creating the statistics) wins. Else the first one wins.
- *
- * This is a very simple criteria, and has several weaknesses:
- *
- * (a) does not consider the accuracy of the statistics
+ * Choose the combination of statistics optimal for estimation of a particular
+ * clause list.
*
- * If there are two histograms built on the same set of columns, but one
- * has 100 buckets and the other one has 1000 buckets (thus likely
- * providing better estimates), this is not currently considered.
+ * This only handles the 'preparation' phase shared by the exhaustive and greedy
+ * implementations (see the functions above), mostly trying to reduce the size
+ * of the problem (eliminating clauses/statistics that can't possibly be used in
+ * the solution).
*
- * (b) does not consider the type of statistics
+ * It also precomputes bitmaps for attributes covered by clauses and statistics,
+ * so that we don't need to do that over and over in the actual optimizations
+ * (as it's both CPU and memory intensive).
*
- * If there are three statistics - one containing just a MCV list, another
- * one with just a histogram and a third one with both, we treat them equally.
*
- * (c) does not consider the number of clauses
+ * TODO Another way to make the optimization problems smaller might be splitting
+ * the statistics into several disjoint subsets, i.e. if we can split the
+ * graph of statistics (after the elimination) into multiple components
+ * (so that stats in different components share no attributes), we can do
+ * the optimization for each component separately.
*
- * As explained, only the number of referenced attributes counts, so if
- * there are multiple clauses on a single attribute, this still counts as
- * a single attribute.
- *
- * (d) does not consider type of condition
- *
- * Some clauses may work better with some statistics - for example equality
- * clauses probably work better with MCV lists than with histograms. But
- * IS [NOT] NULL conditions may often work better with histograms (thanks
- * to NULL-buckets).
- *
- * So for example with five WHERE conditions
- *
- * WHERE (a = 1) AND (b = 1) AND (c = 1) AND (d = 1) AND (e = 1)
- *
- * and statistics on (a,b), (a,b,e) and (a,b,c,d), the last one will be selected
- * as it references the most columns.
- *
- * Once we have selected the multivariate statistics, we split the list of
- * clauses into two parts - conditions that are compatible with the selected
- * stats, and conditions are estimated using simple statistics.
- *
- * From the example above, conditions
- *
- * (a = 1) AND (b = 1) AND (c = 1) AND (d = 1)
- *
- * will be estimated using the multivariate statistics (a,b,c,d) while the last
- * condition (e = 1) will get estimated using the regular ones.
- *
- * There are various alternative selection criteria (e.g. counting conditions
- * instead of just referenced attributes), but eventually the best option should
- * be to combine multiple statistics. But that's much harder to do correctly.
- *
- * TODO Select multiple statistics and combine them when computing the estimate.
- *
- * TODO This will probably have to consider compatibility of clauses, because
- * 'dependencies' will probably work only with equality clauses.
+ * TODO If we could compute what is a "perfect solution" maybe we could
+ * terminate the search after reaching ~90% of it? Say, if we knew that we
+ * can cover 10 clauses and reuse 8 dependencies, maybe covering 9 clauses
+ * and 7 dependencies would be OK?
*/
-static MVStatisticInfo *
-choose_mv_statistics(List *stats, Bitmapset *attnums)
+static List*
+choose_mv_statistics(PlannerInfo *root, Index relid, List *stats,
+ List *clauses, List *conditions)
{
int i;
- ListCell *lc;
+ mv_solution_t *best = NULL;
+ List *result = NIL;
+
+ int nmvstats;
+ MVStatisticInfo *mvstats;
- MVStatisticInfo *choice = NULL;
+ /* we only work with MCV lists and histograms here */
+ int type = (MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST);
- int current_matches = 1; /* goal #1: maximize */
- int current_dims = (MVSTATS_MAX_DIMENSIONS+1); /* goal #2: minimize */
+ bool *clause_cover_map = NULL,
+ *condition_cover_map = NULL;
+ int *ruled_out = NULL;
+
+ /* build bitmapsets for all stats and clauses */
+ Bitmapset **stats_attnums;
+ Bitmapset **clauses_attnums;
+ Bitmapset **conditions_attnums;
+
+ int nclauses, nconditions;
+ Node ** clauses_array;
+ Node ** conditions_array;
+
+ /* copy lists, so that we can free them during elimination easily */
+ clauses = list_copy(clauses);
+ conditions = list_copy(conditions);
+ stats = list_copy(stats);
/*
- * Walk through the statistics (simple array with nmvstats elements) and for
- * each one count the referenced attributes (encoded in the 'attnums' bitmap).
+ * Reduce the optimization problem size as much as possible.
+ *
+ * Eliminate clauses and conditions not covered by any statistics,
+ * or statistics not matching at least two attributes (one of them
+ * has to be in a regular clause).
+ *
+ * It's possible that removing a statistics in one iteration
+ * eliminates a clause in the next one, so we repeat this until an
+ * iteration eliminates no clauses or stats.
+ *
+ * This can only happen after eliminating a statistics - clauses are
+ * eliminated first, so statistics always reflect that.
*/
- foreach (lc, stats)
+ while (true)
{
- MVStatisticInfo *info = (MVStatisticInfo *)lfirst(lc);
+ List *tmp;
- /* columns matching this statistics */
- int matches = 0;
+ Bitmapset *compatible_attnums = NULL;
+ Bitmapset *condition_attnums = NULL;
+ Bitmapset *all_attnums = NULL;
- int2vector * attrs = info->stakeys;
- int numattrs = attrs->dim1;
-
- /* skip dependencies-only stats */
- if (! (info->mcv_built || info->hist_built))
- continue;
+ /*
+ * Clauses
+ *
+ * Walk through clauses and keep only those covered by at least
+ * one of the statistics we still have. We'll also keep info
+ * about attnums in clauses (without conditions) so that we can
+ * ignore stats covering just conditions (which is pointless).
+ */
+ tmp = filter_clauses(root, relid, type,
+ stats, clauses, &compatible_attnums);
- /* count columns covered by the histogram */
- for (i = 0; i < numattrs; i++)
- if (bms_is_member(attrs->values[i], attnums))
- matches++;
+ /* discard the original list */
+ list_free(clauses);
+ clauses = tmp;
/*
- * Use this statistics when it improves the number of matches or
- * when it matches the same number of attributes but is smaller.
+ * Conditions
+ *
+ * Walk through conditions and keep only those covered by at least
+ * one of the statistics we still have. Also, collect a bitmap of
+ * attributes so that we can make sure we add at least one new
+ * attribute (by comparing with clauses).
*/
- if ((matches > current_matches) ||
- ((matches == current_matches) && (current_dims > numattrs)))
+ if (conditions != NIL)
{
- choice = info;
- current_matches = matches;
- current_dims = numattrs;
+ tmp = filter_clauses(root, relid, type,
+ stats, conditions, &condition_attnums);
+
+ /* discard the original list */
+ list_free(conditions);
+ conditions = tmp;
}
- }
- return choice;
-}
+ /* get a union of attnums (from conditions and new clauses) */
+ all_attnums = bms_union(compatible_attnums, condition_attnums);
+ /*
+ * Statistics
+ *
+ * Walk through statistics and only keep those covering at least
+ * one new attribute (excluding conditions) and at least two
+ * attributes in clauses and conditions combined.
+ */
+ tmp = filter_stats(stats, compatible_attnums, all_attnums);
-/*
- * This splits the clauses list into two parts - one containing clauses that
- * will be evaluated using the chosen statistics, and the remaining clauses
- * (either non-mvcompatible, or not related to the histogram).
- */
-static List *
-clauselist_mv_split(PlannerInfo *root, Index relid,
- List *clauses, List **mvclauses,
- MVStatisticInfo *mvstats, int types)
-{
- int i;
- ListCell *l;
- List *non_mvclauses = NIL;
+ /* if we've not eliminated anything, terminate */
+ if (list_length(stats) == list_length(tmp))
+ break;
- /* FIXME is there a better way to get info on int2vector? */
- int2vector * attrs = mvstats->stakeys;
- int numattrs = mvstats->stakeys->dim1;
+ /* work only with filtered statistics from now */
+ list_free(stats);
+ stats = tmp;
+ }
- Bitmapset *mvattnums = NULL;
+ /* only do the optimization if we have clauses/statistics */
+ if ((list_length(stats) == 0) || (list_length(clauses) == 0))
+ return NIL;
- /* build bitmap of attributes, so we can do bms_is_subset later */
- for (i = 0; i < numattrs; i++)
- mvattnums = bms_add_member(mvattnums, attrs->values[i]);
+ /* remove redundant stats (stats covered by another stats) */
+ stats = filter_redundant_stats(stats, clauses, conditions);
- /* erase the list of mv-compatible clauses */
- *mvclauses = NIL;
+ /*
+ * TODO We should sort the stats to make the order deterministic,
+ * otherwise we may get different estimates on different
+ * executions - if there are multiple "equally good" solutions,
+ * we'll keep the first solution we see.
+ *
+ * Sorting by OID probably is not the right solution though,
+ * because we'd like it to be reproducible, irrespective
+ * of the order of the ADD STATISTICS commands.
+ * So maybe statkeys?
+ */
+ mvstats = make_stats_array(stats, &nmvstats);
+ stats_attnums = make_stats_attnums(mvstats, nmvstats);
- foreach (l, clauses)
- {
- bool match = false; /* by default not mv-compatible */
- Bitmapset *attnums = NULL;
- Node *clause = (Node *) lfirst(l);
+ /* collect clauses and bitmaps of their attnums */
+ clauses_array = make_clauses_array(clauses, &nclauses);
+ clauses_attnums = make_clauses_attnums(root, relid, type,
+ clauses_array, nclauses);
+
+ /* collect conditions and bitmap of attnums */
+ conditions_array = make_clauses_array(conditions, &nconditions);
+ conditions_attnums = make_clauses_attnums(root, relid, type,
+ conditions_array, nconditions);
- if (clause_is_mv_compatible(clause, relid, &attnums, types))
+ /*
+ * Build bitmaps with info about which clauses/conditions are
+ * covered by each statistics (so that we don't need to call the
+ * bms_is_subset over and over again).
+ */
+ clause_cover_map = make_cover_map(stats_attnums, nmvstats,
+ clauses_attnums, nclauses);
+
+ condition_cover_map = make_cover_map(stats_attnums, nmvstats,
+ conditions_attnums, nconditions);
+
+ ruled_out = (int*)palloc0(nmvstats * sizeof(int));
+
+ /* no stats are ruled out by default */
+ for (i = 0; i < nmvstats; i++)
+ ruled_out[i] = -1;
+
+ /* do the optimization itself */
+ if (mvstat_search_type == MVSTAT_SEARCH_EXHAUSTIVE)
+ choose_mv_statistics_exhaustive(root, 0,
+ nmvstats, mvstats, stats_attnums,
+ nclauses, clauses_array, clauses_attnums,
+ nconditions, conditions_array, conditions_attnums,
+ clause_cover_map, condition_cover_map,
+ ruled_out, NULL, &best);
+ else
+ choose_mv_statistics_greedy(root, 0,
+ nmvstats, mvstats, stats_attnums,
+ nclauses, clauses_array, clauses_attnums,
+ nconditions, conditions_array, conditions_attnums,
+ clause_cover_map, condition_cover_map,
+ ruled_out, NULL, &best);
+
+ /* create a list of statistics from the array */
+ if (best != NULL)
+ {
+ for (i = 0; i < best->nstats; i++)
{
- /* are all the attributes part of the selected stats? */
- if (bms_is_subset(attnums, mvattnums))
- match = true;
+ MVStatisticInfo *info = makeNode(MVStatisticInfo);
+ memcpy(info, &mvstats[best->stats[i]], sizeof(MVStatisticInfo));
+ result = lappend(result, info);
}
- /*
- * The clause matches the selected stats, so put it to the list of
- * mv-compatible clauses. Otherwise, keep it in the list of 'regular'
- * clauses (that may be selected later).
- */
- if (match)
- *mvclauses = lappend(*mvclauses, clause);
- else
- non_mvclauses = lappend(non_mvclauses, clause);
+ pfree(best);
}
- /*
- * Perform regular estimation using the clauses incompatible with the chosen
- * histogram (or MV stats in general).
- */
- return non_mvclauses;
+ /* cleanup (maybe leave it up to the memory context?) */
+ for (i = 0; i < nmvstats; i++)
+ bms_free(stats_attnums[i]);
+
+ for (i = 0; i < nclauses; i++)
+ bms_free(clauses_attnums[i]);
+
+ for (i = 0; i < nconditions; i++)
+ bms_free(conditions_attnums[i]);
+
+ pfree(stats_attnums);
+ pfree(clauses_attnums);
+ pfree(conditions_attnums);
+
+ pfree(clauses_array);
+ pfree(conditions_array);
+ pfree(clause_cover_map);
+ pfree(condition_cover_map);
+ pfree(ruled_out);
+ pfree(mvstats);
+
+ list_free(clauses);
+ list_free(conditions);
+ list_free(stats);
+ return result;
}
typedef struct
@@ -1637,9 +2878,6 @@ has_stats(List *stats, int type)
/* terminate if we've found at least one matching statistics */
if (stats_type_matches(stat, type))
return true;
-
- if ((type & MV_CLAUSE_TYPE_HIST) && stat->hist_built)
- return true;
}
return false;
@@ -1689,22 +2927,26 @@ find_stats(PlannerInfo *root, Index relid)
* as the clauses are processed (and skip items that are 'match').
*/
static Selectivity
-clauselist_mv_selectivity_mcvlist(PlannerInfo *root, List *clauses,
- MVStatisticInfo *mvstats, bool *fullmatch,
- Selectivity *lowsel)
+clauselist_mv_selectivity_mcvlist(PlannerInfo *root, MVStatisticInfo *mvstats,
+ List *clauses, List *conditions, bool is_or,
+ bool *fullmatch, Selectivity *lowsel)
{
int i;
Selectivity s = 0.0;
+ Selectivity t = 0.0;
Selectivity u = 0.0;
MCVList mcvlist = NULL;
+
int nmatches = 0;
+ int nconditions = 0;
/* match/mismatch bitmap for each MCV item */
char * matches = NULL;
+ char * condition_matches = NULL;
Assert(clauses != NIL);
- Assert(list_length(clauses) >= 2);
+ Assert(list_length(clauses) >= 1);
/* there's no MCV list built yet */
if (! mvstats->mcv_built)
@@ -1715,32 +2957,85 @@ clauselist_mv_selectivity_mcvlist(PlannerInfo *root, List *clauses,
Assert(mcvlist != NULL);
Assert(mcvlist->nitems > 0);
- /* by default all the MCV items match the clauses fully */
- matches = palloc0(sizeof(char) * mcvlist->nitems);
- memset(matches, MVSTATS_MATCH_FULL, sizeof(char)*mcvlist->nitems);
-
/* number of matching MCV items */
nmatches = mcvlist->nitems;
+ nconditions = mcvlist->nitems;
+
+ /*
+ * Bitmap of MCV item matches (mismatch, partial, full).
+ *
+ * For AND clauses all items match initially (and we'll eliminate them).
+ * For OR clauses no items match initially (and we'll add them).
+ *
+ * We only need to do the memset for AND clauses (for OR clauses
+ * it's already set correctly by the palloc0).
+ */
+ matches = palloc0(sizeof(char) * nmatches);
+
+ if (! is_or) /* AND-clause */
+ memset(matches, MVSTATS_MATCH_FULL, sizeof(char)*nmatches);
+
+ /* Conditions are treated as an AND clause, so all items match by default. */
+ condition_matches = palloc0(sizeof(char) * nconditions);
+ memset(condition_matches, MVSTATS_MATCH_FULL, sizeof(char)*nconditions);
+ /*
+ * build the match bitmap for the conditions (conditions are always
+ * connected by AND)
+ */
+ if (conditions != NIL)
+ nconditions = update_match_bitmap_mcvlist(root, conditions,
+ mvstats->stakeys, mcvlist,
+ nconditions, condition_matches,
+ lowsel, fullmatch, false);
+
+ /*
+ * build the match bitmap for the estimated clauses
+ *
+ * TODO This evaluates the clauses for all MCV items, even those
+ * ruled out by the conditions. The final result should be the
+ * same, but skipping them might be faster.
+ */
nmatches = update_match_bitmap_mcvlist(root, clauses,
mvstats->stakeys, mcvlist,
- nmatches, matches,
- lowsel, fullmatch, false);
+ ((is_or) ? 0 : nmatches), matches,
+ lowsel, fullmatch, is_or);
/* sum frequencies for all the matching MCV items */
for (i = 0; i < mcvlist->nitems; i++)
{
- /* used to 'scale' for MCV lists not covering all tuples */
+ /*
+ * Find out what part of the data is covered by the MCV list,
+ * so that we can 'scale' the selectivity properly (e.g. when
+ * only 50% of the sample items got into the MCV, and the rest
+ * is either in a histogram, or not covered by stats).
+ *
+ * TODO This might be handled by keeping a global "frequency"
+ * for the whole list, which might save us a bit of time
+ * spent on accessing the not-matching part of the MCV list.
+ * Although it's likely in a cache, so it's very fast.
+ */
u += mcvlist->items[i]->frequency;
+ /* skip MCV items not matching the conditions */
+ if (condition_matches[i] == MVSTATS_MATCH_NONE)
+ continue;
+
if (matches[i] != MVSTATS_MATCH_NONE)
s += mcvlist->items[i]->frequency;
+
+ t += mcvlist->items[i]->frequency;
}
pfree(matches);
+ pfree(condition_matches);
pfree(mcvlist);
- return s*u;
+ /* no condition matches */
+ if (t == 0.0)
+ return (Selectivity)0.0;
+
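+ /*
+ * Scale the conditional match ratio (s / t) by the fraction of rows
+ * covered by the MCV list (u). With illustrative numbers u = 0.8,
+ * t = 0.4 and s = 0.1 this gives (0.1 / 0.4) * 0.8 = 0.2.
+ */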
+ return (s / t) * u;
}
/*
@@ -1971,64 +3266,57 @@ update_match_bitmap_mcvlist(PlannerInfo *root, List *clauses,
}
}
}
- else if (or_clause(clause) || and_clause(clause))
+ else if (or_clause(clause) || and_clause(clause) || not_clause(clause))
{
/* AND/OR clause, with all clauses compatible with the selected MV stat */
int i;
- BoolExpr *orclause = ((BoolExpr*)clause);
- List *orclauses = orclause->args;
+ List *tmp_clauses = ((BoolExpr*)clause)->args;
/* match/mismatch bitmap for each MCV item */
- int or_nmatches = 0;
- char * or_matches = NULL;
+ int tmp_nmatches = 0;
+ char * tmp_matches = NULL;
- Assert(orclauses != NIL);
- Assert(list_length(orclauses) >= 2);
+ Assert(tmp_clauses != NIL);
+ Assert((list_length(tmp_clauses) >= 2) || (not_clause(clause) && (list_length(tmp_clauses)==1)));
/* number of matching MCV items */
- or_nmatches = mcvlist->nitems;
+ tmp_nmatches = (or_clause(clause)) ? 0 : mcvlist->nitems;
/* by default none of the MCV items matches the clauses */
- or_matches = palloc0(sizeof(char) * or_nmatches);
+ tmp_matches = palloc0(sizeof(char) * mcvlist->nitems);
- if (or_clause(clause))
- {
- /* OR clauses assume nothing matches, initially */
- memset(or_matches, MVSTATS_MATCH_NONE, sizeof(char)*or_nmatches);
- or_nmatches = 0;
- }
- else
- {
- /* AND clauses assume nothing matches, initially */
- memset(or_matches, MVSTATS_MATCH_FULL, sizeof(char)*or_nmatches);
- }
+ /* AND (and NOT) clauses assume everything matches, initially */
+ if (! or_clause(clause))
+ memset(tmp_matches, MVSTATS_MATCH_FULL, sizeof(char)*mcvlist->nitems);
/* build the match bitmap for the OR-clauses */
- or_nmatches = update_match_bitmap_mcvlist(root, orclauses,
+ tmp_nmatches = update_match_bitmap_mcvlist(root, tmp_clauses,
stakeys, mcvlist,
- or_nmatches, or_matches,
+ tmp_nmatches, tmp_matches,
lowsel, fullmatch, or_clause(clause));
/* merge the bitmap into the existing one*/
for (i = 0; i < mcvlist->nitems; i++)
{
+ /* if this is a NOT clause, we need to invert the results first */
+ if (not_clause(clause))
+ tmp_matches[i] = (MVSTATS_MATCH_FULL - tmp_matches[i]);
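+ /* (i.e. full and none swap; a partial match stays partial) */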
+
/*
* To AND-merge the bitmaps, a MIN() semantics is used.
* For OR-merge, use MAX().
*
* FIXME this does not decrease the number of matches
*/
- UPDATE_RESULT(matches[i], or_matches[i], is_or);
+ UPDATE_RESULT(matches[i], tmp_matches[i], is_or);
}
- pfree(or_matches);
+ pfree(tmp_matches);
}
else
- {
elog(ERROR, "unknown clause type: %d", clause->type);
- }
}
/*
@@ -2086,15 +3374,18 @@ update_match_bitmap_mcvlist(PlannerInfo *root, List *clauses,
* this is not uncommon, but for histograms it's not that clear.
*/
static Selectivity
-clauselist_mv_selectivity_histogram(PlannerInfo *root, List *clauses,
- MVStatisticInfo *mvstats)
+clauselist_mv_selectivity_histogram(PlannerInfo *root, MVStatisticInfo *mvstats,
+ List *clauses, List *conditions, bool is_or)
{
int i;
Selectivity s = 0.0;
+ Selectivity t = 0.0;
Selectivity u = 0.0;
int nmatches = 0;
+ int nconditions = 0;
char *matches = NULL;
+ char *condition_matches = NULL;
MVSerializedHistogram mvhist = NULL;
@@ -2107,25 +3398,55 @@ clauselist_mv_selectivity_histogram(PlannerInfo *root, List *clauses,
Assert (mvhist != NULL);
Assert (clauses != NIL);
- Assert (list_length(clauses) >= 2);
+ Assert (list_length(clauses) >= 1);
+
+ nmatches = mvhist->nbuckets;
+ nconditions = mvhist->nbuckets;
/*
- * Bitmap of bucket matches (mismatch, partial, full). by default
- * all buckets fully match (and we'll eliminate them).
+ * Bitmap of bucket matches (mismatch, partial, full).
+ *
+ * For AND clauses all buckets match (and we'll eliminate them).
+ * For OR clauses no buckets match (and we'll add them).
+ *
+ * We only need to do the memset for AND clauses (for OR clauses
+ * it's already set correctly by the palloc0).
*/
- matches = palloc0(sizeof(char) * mvhist->nbuckets);
- memset(matches, MVSTATS_MATCH_FULL, sizeof(char)*mvhist->nbuckets);
+ matches = palloc0(sizeof(char) * nmatches);
- nmatches = mvhist->nbuckets;
+ if (! is_or) /* AND-clause */
+ memset(matches, MVSTATS_MATCH_FULL, sizeof(char)*nmatches);
+
+ /* Conditions are treated as an AND clause, so all buckets match by default. */
+ condition_matches = palloc0(sizeof(char)*nconditions);
+ memset(condition_matches, MVSTATS_MATCH_FULL, sizeof(char)*nconditions);
+
+ /*
+ * build the match bitmap for the conditions (conditions are always
+ * connected by AND)
+ */
+ if (conditions != NIL)
+ update_match_bitmap_histogram(root, conditions,
+ mvstats->stakeys, mvhist,
+ nconditions, condition_matches, false);
- /* build the match bitmap */
+ /*
+ * build the match bitmap for the estimated clauses
+ *
+ * TODO This evaluates the clauses for all buckets, even those
+ * ruled out by the conditions. The final result should be
+ * the same, but skipping them might be faster.
+ */
update_match_bitmap_histogram(root, clauses,
mvstats->stakeys, mvhist,
- nmatches, matches, false);
+ ((is_or) ? 0 : nmatches), matches,
+ is_or);
/* now, walk through the buckets and sum the selectivities */
for (i = 0; i < mvhist->nbuckets; i++)
{
+ float coeff = 1.0;
+
/*
* Find out what part of the data is covered by the histogram,
* so that we can 'scale' the selectivity properly (e.g. when
@@ -2139,10 +3460,23 @@ clauselist_mv_selectivity_histogram(PlannerInfo *root, List *clauses,
*/
u += mvhist->buckets[i]->ntuples;
+ /* skip buckets not matching the conditions */
+ if (condition_matches[i] == MVSTATS_MATCH_NONE)
+ continue;
+ else if (condition_matches[i] == MVSTATS_MATCH_PARTIAL)
+ coeff = 0.5;
+
+ t += coeff * mvhist->buckets[i]->ntuples;
+
if (matches[i] == MVSTATS_MATCH_FULL)
- s += mvhist->buckets[i]->ntuples;
+ s += coeff * mvhist->buckets[i]->ntuples;
else if (matches[i] == MVSTATS_MATCH_PARTIAL)
- s += 0.5 * mvhist->buckets[i]->ntuples;
+ /*
+ * TODO If both conditions and clauses match partially, this
+ * will use a 0.25 match - not sure that's the right
+ * solution, but it seems about right.
+ */
+ s += coeff * 0.5 * mvhist->buckets[i]->ntuples;
}
#ifdef DEBUG_MVHIST
@@ -2151,9 +3485,14 @@ clauselist_mv_selectivity_histogram(PlannerInfo *root, List *clauses,
/* release the allocated bitmap and deserialized histogram */
pfree(matches);
+ pfree(condition_matches);
pfree(mvhist);
- return s * u;
+ /* no condition matches */
+ if (t == 0.0)
+ return (Selectivity)0.0;
+
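+ /* as in the MCV case, scale the conditional match (s / t) by the
+ * fraction of rows covered by the histogram (u) */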
+ return (s / t) * u;
}
/* cached result of bucket boundary comparison for a single dimension */
@@ -2344,7 +3683,7 @@ update_match_bitmap_histogram(PlannerInfo *root, List *clauses,
FmgrInfo opproc; /* operator */
fmgr_info(get_opcode(expr->opno), &opproc);
-
+
/* reset the cache (per clause) */
memset(callcache, 0, mvhist->nbuckets);
@@ -2504,64 +3843,57 @@ update_match_bitmap_histogram(PlannerInfo *root, List *clauses,
UPDATE_RESULT(matches[i], MVSTATS_MATCH_NONE, is_or);
}
}
- else if (or_clause(clause) || and_clause(clause))
+ else if (or_clause(clause) || and_clause(clause) || not_clause(clause))
{
/* AND/OR clause, with all clauses compatible with the selected MV stat */
int i;
- BoolExpr *orclause = ((BoolExpr*)clause);
- List *orclauses = orclause->args;
+ List *tmp_clauses = ((BoolExpr*)clause)->args;
/* match/mismatch bitmap for each bucket */
- int or_nmatches = 0;
- char * or_matches = NULL;
+ int tmp_nmatches = 0;
+ char * tmp_matches = NULL;
- Assert(orclauses != NIL);
- Assert(list_length(orclauses) >= 2);
+ Assert(tmp_clauses != NIL);
+ Assert((list_length(tmp_clauses) >= 2) || (not_clause(clause) && (list_length(tmp_clauses)==1)));
/* number of matching buckets */
- or_nmatches = mvhist->nbuckets;
+ tmp_nmatches = (or_clause(clause)) ? 0 : mvhist->nbuckets;
- /* by default none of the buckets matches the clauses */
- or_matches = palloc0(sizeof(char) * or_nmatches);
+ /* by default none of the buckets matches the clauses (OR clause) */
+ tmp_matches = palloc0(sizeof(char) * mvhist->nbuckets);
- if (or_clause(clause))
- {
- /* OR clauses assume nothing matches, initially */
- memset(or_matches, MVSTATS_MATCH_NONE, sizeof(char)*or_nmatches);
- or_nmatches = 0;
- }
- else
- {
- /* AND clauses assume nothing matches, initially */
- memset(or_matches, MVSTATS_MATCH_FULL, sizeof(char)*or_nmatches);
- }
+ /* but AND (and NOT) clauses assume everything matches, initially */
+ if (! or_clause(clause))
+ memset(tmp_matches, MVSTATS_MATCH_FULL, sizeof(char)*mvhist->nbuckets);
/* build the match bitmap for the OR-clauses */
- or_nmatches = update_match_bitmap_histogram(root, orclauses,
+ tmp_nmatches = update_match_bitmap_histogram(root, tmp_clauses,
stakeys, mvhist,
- or_nmatches, or_matches, or_clause(clause));
+ tmp_nmatches, tmp_matches, or_clause(clause));
/* merge the bitmap into the existing one*/
for (i = 0; i < mvhist->nbuckets; i++)
{
+ /* if this is a NOT clause, we need to invert the results first */
+ if (not_clause(clause))
+ tmp_matches[i] = (MVSTATS_MATCH_FULL - tmp_matches[i]);
+
/*
* To AND-merge the bitmaps, a MIN() semantics is used.
* For OR-merge, use MAX().
*
* FIXME this does not decrease the number of matches
*/
- UPDATE_RESULT(matches[i], or_matches[i], is_or);
+ UPDATE_RESULT(matches[i], tmp_matches[i], is_or);
}
- pfree(or_matches);
-
+ pfree(tmp_matches);
}
else
elog(ERROR, "unknown clause type: %d", clause->type);
}
- /* free the call cache */
pfree(callcache);
return nmatches;
diff --git a/src/backend/optimizer/path/costsize.c b/src/backend/optimizer/path/costsize.c
index 5350329..57214e0 100644
--- a/src/backend/optimizer/path/costsize.c
+++ b/src/backend/optimizer/path/costsize.c
@@ -3518,7 +3518,8 @@ compute_semi_anti_join_factors(PlannerInfo *root,
joinquals,
0,
jointype,
- sjinfo);
+ sjinfo,
+ NIL);
/*
* Also get the normal inner-join selectivity of the join clauses.
@@ -3541,7 +3542,8 @@ compute_semi_anti_join_factors(PlannerInfo *root,
joinquals,
0,
JOIN_INNER,
- &norm_sjinfo);
+ &norm_sjinfo,
+ NIL);
/* Avoid leaking a lot of ListCells */
if (jointype == JOIN_ANTI)
@@ -3708,7 +3710,7 @@ approx_tuple_count(PlannerInfo *root, JoinPath *path, List *quals)
Node *qual = (Node *) lfirst(l);
/* Note that clause_selectivity will be able to cache its result */
- selec *= clause_selectivity(root, qual, 0, JOIN_INNER, &sjinfo);
+ selec *= clause_selectivity(root, qual, 0, JOIN_INNER, &sjinfo, NIL);
}
/* Apply it to the input relation sizes */
@@ -3744,7 +3746,8 @@ set_baserel_size_estimates(PlannerInfo *root, RelOptInfo *rel)
rel->baserestrictinfo,
0,
JOIN_INNER,
- NULL);
+ NULL,
+ NIL);
rel->rows = clamp_row_est(nrows);
@@ -3781,7 +3784,8 @@ get_parameterized_baserel_size(PlannerInfo *root, RelOptInfo *rel,
allclauses,
rel->relid, /* do not use 0! */
JOIN_INNER,
- NULL);
+ NULL,
+ NIL);
nrows = clamp_row_est(nrows);
/* For safety, make sure result is not more than the base estimate */
if (nrows > rel->rows)
@@ -3919,12 +3923,14 @@ calc_joinrel_size_estimate(PlannerInfo *root,
joinquals,
0,
jointype,
- sjinfo);
+ sjinfo,
+ NIL);
pselec = clauselist_selectivity(root,
pushedquals,
0,
jointype,
- sjinfo);
+ sjinfo,
+ NIL);
/* Avoid leaking a lot of ListCells */
list_free(joinquals);
@@ -3936,7 +3942,8 @@ calc_joinrel_size_estimate(PlannerInfo *root,
restrictlist,
0,
jointype,
- sjinfo);
+ sjinfo,
+ NIL);
pselec = 0.0; /* not used, keep compiler quiet */
}
diff --git a/src/backend/optimizer/util/orclauses.c b/src/backend/optimizer/util/orclauses.c
index ea831f5..6299e75 100644
--- a/src/backend/optimizer/util/orclauses.c
+++ b/src/backend/optimizer/util/orclauses.c
@@ -280,7 +280,7 @@ consider_new_or_clause(PlannerInfo *root, RelOptInfo *rel,
* saving work later.)
*/
or_selec = clause_selectivity(root, (Node *) or_rinfo,
- 0, JOIN_INNER, NULL);
+ 0, JOIN_INNER, NULL, NIL);
/*
* The clause is only worth adding to the query if it rejects a useful
@@ -342,7 +342,7 @@ consider_new_or_clause(PlannerInfo *root, RelOptInfo *rel,
/* Compute inner-join size */
orig_selec = clause_selectivity(root, (Node *) join_or_rinfo,
- 0, JOIN_INNER, &sjinfo);
+ 0, JOIN_INNER, &sjinfo, NIL);
/* And hack cached selectivity so join size remains the same */
join_or_rinfo->norm_selec = orig_selec / or_selec;
diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index d396ef1..805d633 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -1627,13 +1627,15 @@ booltestsel(PlannerInfo *root, BoolTestType booltesttype, Node *arg,
case IS_NOT_FALSE:
selec = (double) clause_selectivity(root, arg,
varRelid,
- jointype, sjinfo);
+ jointype, sjinfo,
+ NIL);
break;
case IS_FALSE:
case IS_NOT_TRUE:
selec = 1.0 - (double) clause_selectivity(root, arg,
varRelid,
- jointype, sjinfo);
+ jointype, sjinfo,
+ NIL);
break;
default:
elog(ERROR, "unrecognized booltesttype: %d",
@@ -6260,7 +6262,8 @@ genericcostestimate(PlannerInfo *root,
indexSelectivity = clauselist_selectivity(root, selectivityQuals,
index->rel->relid,
JOIN_INNER,
- NULL);
+ NULL,
+ NIL);
/*
* If caller didn't give us an estimate, estimate the number of index
@@ -6580,7 +6583,8 @@ btcostestimate(PlannerInfo *root, IndexPath *path, double loop_count,
btreeSelectivity = clauselist_selectivity(root, selectivityQuals,
index->rel->relid,
JOIN_INNER,
- NULL);
+ NULL,
+ NIL);
numIndexTuples = btreeSelectivity * index->rel->tuples;
/*
@@ -7331,7 +7335,8 @@ gincostestimate(PlannerInfo *root, IndexPath *path, double loop_count,
*indexSelectivity = clauselist_selectivity(root, selectivityQuals,
index->rel->relid,
JOIN_INNER,
- NULL);
+ NULL,
+ NIL);
/* fetch estimated page cost for tablespace containing index */
get_tablespace_page_costs(index->reltablespace,
@@ -7561,7 +7566,7 @@ brincostestimate(PlannerInfo *root, IndexPath *path, double loop_count,
*indexSelectivity =
clauselist_selectivity(root, indexQuals,
path->indexinfo->rel->relid,
- JOIN_INNER, NULL);
+ JOIN_INNER, NULL, NIL);
*indexCorrelation = 1;
/*
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index edcafce..b7aabed 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -75,6 +75,7 @@
#include "utils/bytea.h"
#include "utils/guc_tables.h"
#include "utils/memutils.h"
+#include "utils/mvstats.h"
#include "utils/pg_locale.h"
#include "utils/plancache.h"
#include "utils/portal.h"
@@ -393,6 +394,15 @@ static const struct config_enum_entry force_parallel_mode_options[] = {
};
/*
+ * Search algorithm for multivariate stats.
+ */
+static const struct config_enum_entry mvstat_search_options[] = {
+ {"greedy", MVSTAT_SEARCH_GREEDY, false},
+ {"exhaustive", MVSTAT_SEARCH_EXHAUSTIVE, false},
+ {NULL, 0, false}
+};
+
+/*
* Options for enum values stored in other modules
*/
extern const struct config_enum_entry wal_level_options[];
@@ -3743,6 +3753,16 @@ static struct config_enum ConfigureNamesEnum[] =
NULL, NULL, NULL
},
+ {
+ {"mvstat_search", PGC_USERSET, QUERY_TUNING_OTHER,
+ gettext_noop("Sets the algorithm used for combining multivariate stats."),
+ NULL
+ },
+ &mvstat_search_type,
+ MVSTAT_SEARCH_GREEDY, mvstat_search_options,
+ NULL, NULL, NULL
+ },
+
/* End-of-list marker */
{
{NULL, 0, 0, NULL, NULL}, NULL, 0, NULL, NULL, NULL, NULL
diff --git a/src/backend/utils/mvstats/README.stats b/src/backend/utils/mvstats/README.stats
index 3e4f4d1..d404914 100644
--- a/src/backend/utils/mvstats/README.stats
+++ b/src/backend/utils/mvstats/README.stats
@@ -90,6 +90,137 @@ even attempting to do the more expensive estimation.
Whenever we find there are no suitable stats, we skip the expensive steps.
+Combining multiple statistics
+-----------------------------
+
+When estimating selectivity of a list of clauses, there may exist no statistics
+covering all of them. If there are multiple statistics, each covering some
+subset of the attributes, the optimizer needs to figure out which of those
+statistics to apply.
+
+When the statistics do not overlap, the solution is trivial - we can simply
+split the groups of conditions by the matching statistics, and then multiply the
+selectivities. For example assume multivariate statistics on (b,c) and (d,e),
+and a condition like this:
+
+ (a=1) AND (b=2) AND (c=3) AND (d=4) AND (e=5)
+
+Then (a=1) is not covered by any of the statistics, so will be estimated using
+the regular per-column statistics. The two conditions ((b=2) AND (c=3)) will be
+estimated using the (b,c) statistics, and ((d=4) AND (e=5)) will be estimated
+using (d,e) statistics. The resulting selectivities are then multiplied together.
+
+Now, what if the statistics overlap? For example assume the same condition as
+above, but let's say we have statistics on (a,b,c) and (a,c,d,e). What then?
+
+As selectivity is just a probability that the condition holds for a random row,
+we can write the selectivity like this:
+
+ P(a=1 & b=2 & c=3 & d=4 & e=5)
+
+and we can rewrite it using conditional probability like this
+
+ P(a=1 & b=2 & c=3) * P(d=4 & e=5 | a=1 & b=2 & c=3)
+
+Notice that the first part already matches the (a,b,c) statistics. If we assume
+that columns that are not referenced by the same statistics are independent, we
+may rewrite the second half like this
+
+ P(d=4 & e=5 | a=1 & b=2 & c=3) = P(d=4 & e=5 | a=1 & c=3)
+
+which corresponds to the statistics on (a,c,d,e).
+
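+So, under this independence assumption, the whole condition may be estimated
+as
+
+    P(a=1 & b=2 & c=3) * P(d=4 & e=5 | a=1 & c=3)
+
+where the first term is computed using the (a,b,c) statistics and the second
+one using the (a,c,d,e) statistics, with the clauses on 'a' and 'c' reused as
+conditions.
+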
+If there are multiple statistics defined on a table, it's not difficult to come
+up with examples where there are multiple ways to combine them to cover a list
+of clauses. We need a way to find the best combination of statistics.
+
+This is the purpose of choose_mv_statistics(). It searches through the possible
+combinations of statistics, and selects the combination that
+
+ (a) covers the most clauses of the list
+
+ (b) reuses the maximum number of clauses as conditions
+ (in conditional probabilities)
+
+While criterion (a) seems natural, (b) may seem a bit awkward at first. The
+idea is that conditions are a way of transferring information about
+dependencies between the statistics.
+
+There are two alternative implementations of choose_mv_statistics() - greedy
+and exhaustive. The exhaustive one actually searches through all possible
+combinations of statistics, and for larger numbers of statistics may get quite
+expensive (unsurprisingly, it has exponential cost). The greedy one terminates
+in less than K steps (where K is the number of clauses), and in each step
+chooses the next best statistics. I've been unable to come up with an example
+where those two approaches would produce different combinations.
+
+It's possible to choose the algorithm using the mvstat_search GUC, with either
+'greedy' or 'exhaustive' values (default is 'greedy'):
+
+    SET mvstat_search = 'exhaustive';
+
+Note: This is meant mostly for experimentation. I do expect we'll choose one of
+the algorithms and remove the GUC before commit.
+
+
+Limitations of combining statistics
+-----------------------------------
+
+As described in the section 'Combining multiple statistics', the current approach
+is based on transferring information between statistics by means of conditional
+probabilities. This is a relatively cheap and efficient approach, but it is
+based on two assumptions:
+
+ (1) The overlap between the statistics needs to be sufficiently large, i.e.
+ there needs to be enough columns shared by the statistics to transfer
+ information about dependencies between the remaining columns.
+
+ (2) The query needs to include sufficient clauses on the shared columns.
+
+How a violation of those assumptions may be a problem can be illustrated by
+a simple example. Assume a table with three columns (a,b,c) containing exactly
+the same values, and statistics on (a,b) and (b,c):
+
+ CREATE TABLE test AS SELECT i AS a, i AS b, i AS c
+ FROM generate_series(1,1000) s(i);
+
+ CREATE STATISTICS s1 ON test (a,b) WITH (mcv);
+ CREATE STATISTICS s2 ON test (b,c) WITH (mcv);
+
+ ANALYZE test;
+
+First, let's estimate this query:
+
+ SELECT * FROM test WHERE (a < 10) AND (c < 10);
+
+Clearly, there are no conditions on 'b' (which is the only column shared by the
+two statistics), so we'll end up with an estimate based on the assumption of
+independence:
+
+ P(a < 10) * P(c < 10) = 0.01 * 0.01 = 0.0001
+
+This is a significant under-estimate, as the actual selectivity is about 0.01.
+
+But let's estimate another query:
+
+ SELECT * FROM test WHERE (a < 10) AND (b < 500) AND (c < 10);
+
+In this case, the estimate may be computed for example like this:
+
+ P[(a < 10) & (b < 500) & (c < 10)]
+ = P[(a < 10) & (b < 500)] * P[(c < 10) | (a < 10) & (b < 500)]
+ = P[(a < 10) & (b < 500)] * P[(c < 10) | (b < 500)]
+
+The trouble is that the probability P(c < 10 | b < 500) evaluates to 0.02: we
+have assumed (a) and (c) are independent (there is no statistic covering both
+columns), and the condition on (b) does not transfer a sufficient amount of
+information between the two statistics.
+
+Currently, the only solution is to build statistics on all three columns, but
+see the 'combining statistics using convolution' section for ideas on how to
+improve this.
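+
+For example (using the CREATE STATISTICS syntax from this patch series):
+
+    CREATE STATISTICS s3 ON test (a,b,c) WITH (mcv);
+    ANALYZE test;
+
+With a single statistics object covering all three columns no combining is
+needed, so the limitations described above do not apply.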
+
+
Further (possibly crazy) ideas
------------------------------
@@ -111,3 +242,38 @@ But of course, this may result in expensive estimation (CPU-wise).
So we might add a GUC to choose between a simple (single statistics) and thus
multi-statistic estimation, possibly table-level parameter (ALTER TABLE ...).
+
+
+Combining stats using convolution
+---------------------------------
+
+The current approach for combining statistics is based on conditional
+probabilities, and thus only works when the query includes conditions on the
+overlapping parts of the statistics. There may be other ways to combine
+statistics, relaxing this requirement.
+
+Let's assume two histograms H1 and H2 - then combining them might work about
+like this:
+
+
+ for (buckets of H1, satisfying local conditions)
+ {
+ for (buckets of H2, overlapping with H1 bucket)
+ {
+ mark H2 bucket as 'valid'
+ }
+ }
+
+ s1 = s2 = 0.0
+ for (buckets of H2 marked as valid)
+ {
+ s1 += frequency
+
+ if (bucket satisfies local conditions)
+ s2 += frequency
+ }
+
+ s = (s2 / s1) /* final selectivity estimate */
+
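+(Here s2/s1 is essentially the conditional selectivity of the local H2
+conditions given the restriction implied by H1, so presumably it would then
+be combined with the selectivity derived from H1 itself.)
+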
+However this may quickly get non-trivial, e.g. when combining two statistics
+of different types (histogram vs. MCV).
diff --git a/src/include/optimizer/cost.h b/src/include/optimizer/cost.h
index fea2bb7..33f5a1b 100644
--- a/src/include/optimizer/cost.h
+++ b/src/include/optimizer/cost.h
@@ -192,11 +192,13 @@ extern Selectivity clauselist_selectivity(PlannerInfo *root,
List *clauses,
int varRelid,
JoinType jointype,
- SpecialJoinInfo *sjinfo);
+ SpecialJoinInfo *sjinfo,
+ List *conditions);
extern Selectivity clause_selectivity(PlannerInfo *root,
Node *clause,
int varRelid,
JoinType jointype,
- SpecialJoinInfo *sjinfo);
+ SpecialJoinInfo *sjinfo,
+ List *conditions);
#endif /* COST_H */
diff --git a/src/include/utils/mvstats.h b/src/include/utils/mvstats.h
index 777c7da..2b67772 100644
--- a/src/include/utils/mvstats.h
+++ b/src/include/utils/mvstats.h
@@ -17,6 +17,14 @@
#include "fmgr.h"
#include "commands/vacuum.h"
+typedef enum MVStatSearchType
+{
+ MVSTAT_SEARCH_EXHAUSTIVE, /* exhaustive search */
+ MVSTAT_SEARCH_GREEDY /* greedy search */
+} MVStatSearchType;
+
+extern int mvstat_search_type;
+
/*
* Degree of how much MCV item / histogram bucket matches a clause.
* This is then considered when computing the selectivity.
--
2.5.0
Attachment: 0007-multivariate-ndistinct-coefficients.patch (text/x-patch)
From 238d8994c85d8b64bd898604b6ad1219850a5a26 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@pgaddict.com>
Date: Wed, 23 Dec 2015 02:07:58 +0100
Subject: [PATCH 7/9] multivariate ndistinct coefficients
---
doc/src/sgml/ref/create_statistics.sgml | 9 ++
src/backend/catalog/system_views.sql | 3 +-
src/backend/commands/analyze.c | 2 +-
src/backend/commands/statscmds.c | 11 +-
src/backend/optimizer/path/clausesel.c | 4 +
src/backend/optimizer/util/plancat.c | 4 +-
src/backend/utils/adt/selfuncs.c | 93 +++++++++++++++-
src/backend/utils/mvstats/Makefile | 2 +-
src/backend/utils/mvstats/README.ndistinct | 83 ++++++++++++++
src/backend/utils/mvstats/README.stats | 2 +
src/backend/utils/mvstats/common.c | 23 +++-
src/backend/utils/mvstats/mvdist.c | 171 +++++++++++++++++++++++++++++
src/include/catalog/pg_mv_statistic.h | 26 +++--
src/include/nodes/relation.h | 2 +
src/include/utils/mvstats.h | 9 +-
src/test/regress/expected/rules.out | 3 +-
16 files changed, 424 insertions(+), 23 deletions(-)
create mode 100644 src/backend/utils/mvstats/README.ndistinct
create mode 100644 src/backend/utils/mvstats/mvdist.c
diff --git a/doc/src/sgml/ref/create_statistics.sgml b/doc/src/sgml/ref/create_statistics.sgml
index f7336fd..80e472f 100644
--- a/doc/src/sgml/ref/create_statistics.sgml
+++ b/doc/src/sgml/ref/create_statistics.sgml
@@ -168,6 +168,15 @@ CREATE STATISTICS [ IF NOT EXISTS ] <replaceable class="PARAMETER">statistics_na
</listitem>
</varlistentry>
+ <varlistentry>
+ <term><literal>ndistinct</> (<type>boolean</>)</term>
+ <listitem>
+ <para>
+ Enables ndistinct coefficients for the statistics.
+ </para>
+ </listitem>
+ </varlistentry>
+
</variablelist>
</refsect2>
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index b151db1..8d2b435 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -169,7 +169,8 @@ CREATE VIEW pg_mv_stats AS
length(S.stamcv) AS mcvbytes,
pg_mv_stats_mcvlist_info(S.stamcv) AS mcvinfo,
length(S.stahist) AS histbytes,
- pg_mv_stats_histogram_info(S.stahist) AS histinfo
+ pg_mv_stats_histogram_info(S.stahist) AS histinfo,
+ standcoeff AS ndcoeff
FROM (pg_mv_statistic S JOIN pg_class C ON (C.oid = S.starelid))
LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace);
diff --git a/src/backend/commands/analyze.c b/src/backend/commands/analyze.c
index 9087532..c29f1be 100644
--- a/src/backend/commands/analyze.c
+++ b/src/backend/commands/analyze.c
@@ -582,7 +582,7 @@ do_analyze_rel(Relation onerel, int options, VacuumParams *params,
}
/* Build multivariate stats (if there are any). */
- build_mv_stats(onerel, numrows, rows, attr_cnt, vacattrstats);
+ build_mv_stats(onerel, totalrows, numrows, rows, attr_cnt, vacattrstats);
}
/*
diff --git a/src/backend/commands/statscmds.c b/src/backend/commands/statscmds.c
index e0b085f..a7c569d 100644
--- a/src/backend/commands/statscmds.c
+++ b/src/backend/commands/statscmds.c
@@ -72,7 +72,8 @@ CreateStatistics(CreateStatsStmt *stmt)
/* by default build nothing */
bool build_dependencies = false,
build_mcv = false,
- build_histogram = false;
+ build_histogram = false,
+ build_ndistinct = false;
int32 max_buckets = -1,
max_mcv_items = -1;
@@ -155,6 +156,8 @@ CreateStatistics(CreateStatsStmt *stmt)
if (strcmp(opt->defname, "dependencies") == 0)
build_dependencies = defGetBoolean(opt);
+ else if (strcmp(opt->defname, "ndistinct") == 0)
+ build_ndistinct = defGetBoolean(opt);
else if (strcmp(opt->defname, "mcv") == 0)
build_mcv = defGetBoolean(opt);
else if (strcmp(opt->defname, "max_mcv_items") == 0)
@@ -209,10 +212,10 @@ CreateStatistics(CreateStatsStmt *stmt)
}
/* check that at least some statistics were requested */
- if (! (build_dependencies || build_mcv || build_histogram))
+ if (! (build_dependencies || build_mcv || build_histogram || build_ndistinct))
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
- errmsg("no statistics type (dependencies, mcv, histogram) was requested")));
+ errmsg("no statistics type (dependencies, mcv, histogram, ndistinct) was requested")));
/* now do some checking of the options */
if (require_mcv && (! build_mcv))
@@ -246,6 +249,7 @@ CreateStatistics(CreateStatsStmt *stmt)
values[Anum_pg_mv_statistic_deps_enabled -1] = BoolGetDatum(build_dependencies);
values[Anum_pg_mv_statistic_mcv_enabled -1] = BoolGetDatum(build_mcv);
values[Anum_pg_mv_statistic_hist_enabled -1] = BoolGetDatum(build_histogram);
+ values[Anum_pg_mv_statistic_ndist_enabled-1] = BoolGetDatum(build_ndistinct);
values[Anum_pg_mv_statistic_mcv_max_items -1] = Int32GetDatum(max_mcv_items);
values[Anum_pg_mv_statistic_hist_max_buckets -1] = Int32GetDatum(max_buckets);
@@ -253,6 +257,7 @@ CreateStatistics(CreateStatsStmt *stmt)
nulls[Anum_pg_mv_statistic_stadeps -1] = true;
nulls[Anum_pg_mv_statistic_stamcv -1] = true;
nulls[Anum_pg_mv_statistic_stahist -1] = true;
+ nulls[Anum_pg_mv_statistic_standist -1] = true;
/* insert the tuple into pg_mv_statistic */
mvstatrel = heap_open(MvStatisticRelationId, RowExclusiveLock);
diff --git a/src/backend/optimizer/path/clausesel.c b/src/backend/optimizer/path/clausesel.c
index 14e3444..63baa73 100644
--- a/src/backend/optimizer/path/clausesel.c
+++ b/src/backend/optimizer/path/clausesel.c
@@ -59,6 +59,7 @@ static void addRangeClause(RangeQueryClause **rqlist, Node *clause,
#define MV_CLAUSE_TYPE_FDEP 0x01
#define MV_CLAUSE_TYPE_MCV 0x02
#define MV_CLAUSE_TYPE_HIST 0x04
+#define MV_CLAUSE_TYPE_NDIST 0x08
static bool clause_is_mv_compatible(Node *clause, Index relid, Bitmapset **attnums,
int type);
@@ -2860,6 +2861,9 @@ stats_type_matches(MVStatisticInfo *stat, int type)
if ((type & MV_CLAUSE_TYPE_HIST) && stat->hist_built)
return true;
+ if ((type & MV_CLAUSE_TYPE_NDIST) && stat->ndist_built)
+ return true;
+
return false;
}
diff --git a/src/backend/optimizer/util/plancat.c b/src/backend/optimizer/util/plancat.c
index 2519249..3741b7a 100644
--- a/src/backend/optimizer/util/plancat.c
+++ b/src/backend/optimizer/util/plancat.c
@@ -412,7 +412,7 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
mvstat = (Form_pg_mv_statistic) GETSTRUCT(htup);
/* unavailable stats are not interesting for the planner */
- if (mvstat->deps_built || mvstat->mcv_built || mvstat->hist_built)
+ if (mvstat->deps_built || mvstat->mcv_built || mvstat->hist_built || mvstat->ndist_built)
{
info = makeNode(MVStatisticInfo);
@@ -423,11 +423,13 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
info->deps_enabled = mvstat->deps_enabled;
info->mcv_enabled = mvstat->mcv_enabled;
info->hist_enabled = mvstat->hist_enabled;
+ info->ndist_enabled = mvstat->ndist_enabled;
/* built/available statistics */
info->deps_built = mvstat->deps_built;
info->mcv_built = mvstat->mcv_built;
info->hist_built = mvstat->hist_built;
+ info->ndist_built = mvstat->ndist_built;
/* stakeys */
adatum = SysCacheGetAttr(MVSTATOID, htup,
diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index 805d633..f8d39aa 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -132,6 +132,7 @@
#include "utils/fmgroids.h"
#include "utils/index_selfuncs.h"
#include "utils/lsyscache.h"
+#include "utils/mvstats.h"
#include "utils/nabstime.h"
#include "utils/pg_locale.h"
#include "utils/rel.h"
@@ -206,6 +207,7 @@ static Const *string_to_const(const char *str, Oid datatype);
static Const *string_to_bytea_const(const char *str, size_t str_len);
static List *add_predicate_to_quals(IndexOptInfo *index, List *indexQuals);
+static Oid find_ndistinct_coeff(PlannerInfo *root, RelOptInfo *rel, List *varinfos);
/*
* eqsel - Selectivity of "=" for any data types.
@@ -3423,12 +3425,26 @@ estimate_num_groups(PlannerInfo *root, List *groupExprs, double input_rows,
* don't know by how much. We should never clamp to less than the
* largest ndistinct value for any of the Vars, though, since
* there will surely be at least that many groups.
+ *
+ * However we don't need to do this if we have ndistinct stats on
+ * the columns - in that case we can simply use the coefficient
+ * to get the (probably way more accurate) estimate.
+ *
+ * XXX Probably needs refactoring (mixing the clamp and the coeff
+ * like this is not nice).
*/
double clamp = rel->tuples;
+ double coeff = 1.0;
if (relvarcount > 1)
{
- clamp *= 0.1;
+ Oid oid = find_ndistinct_coeff(root, rel, varinfos);
+
+ if (oid != InvalidOid)
+ coeff = load_mv_ndistinct(oid);
+ else
+ clamp *= 0.1;
+
if (clamp < relmaxndistinct)
{
clamp = relmaxndistinct;
@@ -3437,6 +3453,13 @@ estimate_num_groups(PlannerInfo *root, List *groupExprs, double input_rows,
clamp = rel->tuples;
}
}
+
+ /*
+ * Apply ndistinct coefficient from multivar stats (we must do this
+ * before clamping the estimate in any way).
+ */
+ reldistinct /= coeff;
+
if (reldistinct > clamp)
reldistinct = clamp;
@@ -7583,3 +7606,71 @@ brincostestimate(PlannerInfo *root, IndexPath *path, double loop_count,
/* XXX what about pages_per_range? */
}
+
+/*
+ * Find applicable ndistinct statistics and compute the coefficient to
+ * correct the estimate (simply a product of per-column ndistincts).
+ *
+ * Currently we only look for a perfect match, i.e. a single ndistinct
+ * estimate exactly matching all the columns of the statistics.
+ */
+static Oid
+find_ndistinct_coeff(PlannerInfo *root, RelOptInfo *rel, List *varinfos)
+{
+ ListCell *lc;
+ Bitmapset *attnums = NULL;
+ VariableStatData vardata;
+
+ foreach(lc, varinfos)
+ {
+ GroupVarInfo *varinfo = (GroupVarInfo *) lfirst(lc);
+
+ if (varinfo->rel != rel)
+ continue;
+
+ /* FIXME handle general expressions, not only plain Vars */
+
+ /*
+ * examine the variable (or expression) so that we know which
+ * attribute we're dealing with - we need this for matching the
+ * ndistinct coefficient
+ *
+ * FIXME we could probably remember this from estimate_num_groups
+ */
+ examine_variable(root, varinfo->var, 0, &vardata);
+
+ if (HeapTupleIsValid(vardata.statsTuple))
+ {
+ Form_pg_statistic stats
+ = (Form_pg_statistic) GETSTRUCT(vardata.statsTuple);
+
+ attnums = bms_add_member(attnums, stats->staattnum);
+
+ ReleaseVariableStats(vardata);
+ }
+ }
+
+ /* look for a matching ndistinct statistics */
+ foreach (lc, rel->mvstatlist)
+ {
+ int i;
+ MVStatisticInfo *info = (MVStatisticInfo *)lfirst(lc);
+
+ /* skip statistics without ndistinct coefficient built */
+ if (!info->ndist_built)
+ continue;
+
+ /* only exact matches for now (same set of columns) */
+ if (bms_num_members(attnums) != info->stakeys->dim1)
+ continue;
+
+ /* check that all columns of the statistics match */
+ for (i = 0; i < info->stakeys->dim1; i++)
+ if (!bms_is_member(info->stakeys->values[i], attnums))
+ break;
+
+ /* some column does not match, so try the next statistics */
+ if (i < info->stakeys->dim1)
+ continue;
+
+ return info->mvoid;
+ }
+
+ return InvalidOid;
+}
diff --git a/src/backend/utils/mvstats/Makefile b/src/backend/utils/mvstats/Makefile
index 9dbb3b6..d4b88e9 100644
--- a/src/backend/utils/mvstats/Makefile
+++ b/src/backend/utils/mvstats/Makefile
@@ -12,6 +12,6 @@ subdir = src/backend/utils/mvstats
top_builddir = ../../../..
include $(top_builddir)/src/Makefile.global
-OBJS = common.o dependencies.o histogram.o mcv.o
+OBJS = common.o dependencies.o histogram.o mcv.o mvdist.o
include $(top_srcdir)/src/backend/common.mk
diff --git a/src/backend/utils/mvstats/README.ndistinct b/src/backend/utils/mvstats/README.ndistinct
new file mode 100644
index 0000000..32d1624
--- /dev/null
+++ b/src/backend/utils/mvstats/README.ndistinct
@@ -0,0 +1,83 @@
+ndistinct coefficients
+======================
+
+Estimating number of distinct groups in a combination of columns is tricky,
+and the estimation error is often significant. By ndistinct coefficient we
+mean a ratio
+
+ q = ndistinct(a) * ndistinct(b) / ndistinct(a,b)
+
+where 'a' and 'b' are columns, ndistinct(a) is (an estimate of) a number of
+distinct values in column 'a'. And ndistinct(a,b) is the same thing for the
+pair of columns.
+
+The meaning of the coefficient may be illustrated by answering the following
+question: Given a combination of columns (a,b), how many distinct values of 'b'
+match a chosen value of 'a', on average?
+
+Let's assume we know ndistinct(a) and ndistinct(a,b). Then the answer to the
+question clearly is
+
+ ndistinct(a,b) / ndistinct(a)
+
+and by using 'q' we may rewrite this as
+
+ ndistinct(b) / q
+
+so 'q' may be considered as a correction factor of the ndistinct estimate given
+a condition on one of the columns.
+
+This may be generalized to a combination of 'n' columns
+
+ [ndistinct(c1) * ... * ndistinct(cn)] / ndistinct(c1, ..., cn)
+
+and the meaning is very similar, except that we need to use conditions on (n-1)
+of the columns.
+
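+For example, if columns 'a' and 'b' always contain exactly the same value
+(say 1000 distinct values each), then ndistinct(a,b) = 1000 and
+
+    q = (1000 * 1000) / 1000 = 1000
+
+while for two independent columns ndistinct(a,b) may be as high as
+1000 * 1000 (given enough rows), making q close to 1.
+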
+
+Selectivity estimation
+----------------------
+
+As explained in the previous paragraph, ndistinct coefficients may be used to
+estimate cardinality of a column, given some apriori knowledge. Let's assume
+we need to estimate selectivity of a condition
+
+ (a=1) AND (b=2)
+
+which we can expand like this
+
+ P(a=1 & b=2) = P(a=1) * P(b=2 | a=1)
+
+Let's also assume that the distributions are uniform, i.e. that
+
+ P(a=1) = 1/ndistinct(a)
+ P(b=2) = 1/ndistinct(b)
+ P(a=1 & b=2) = 1/ndistinct(a,b)
+
+ P(b=2 | a=1) = ndistinct(a) / ndistinct(a,b)
+
+which may be rewritten like
+
+ P(b=2 | a=1)
+ = ndistinct(a) / ndistinct(a,b)
+ = (1/ndistinct(b)) * [(ndistinct(a) * ndistinct(b)) / ndistinct(a,b)]
+ = (1/ndistinct(b)) * q
+
+and therefore
+
+ P(a=1 & b=2) = (1/ndistinct(a)) * (1/ndistinct(b)) * q
+
+This also illustrates 'q' as a correction coefficient.
+
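+Continuing the example above (a = b, 1000 distinct values each, q = 1000):
+
+    P(a=1 & b=1) = (1/1000) * (1/1000) * 1000 = 1/1000
+
+which matches the intuition that the second clause adds no new information.
+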
+It also explains why we store the coefficient and not simply ndistinct(a,b).
+This way we can estimate the individual clauses and then correct the
+estimate by multiplying the result with 'q' - we don't have to mess with
+the ndistinct estimates at all.
+
+Naturally, as the coefficient is derived from ndistinct(a,b), it may also be
+used to estimate GROUP BY clauses on the combination of columns, replacing the
+existing heuristics in estimate_num_groups().
+
+Note: Currently only the GROUP BY estimation is implemented. It's a bit unclear
+how to implement the clause estimation when there are other statistics (esp.
+MCV lists and/or functional dependencies) available.
diff --git a/src/backend/utils/mvstats/README.stats b/src/backend/utils/mvstats/README.stats
index d404914..6d4b09b 100644
--- a/src/backend/utils/mvstats/README.stats
+++ b/src/backend/utils/mvstats/README.stats
@@ -20,6 +20,8 @@ Currently we only have two kinds of multivariate statistics
(c) multivariate histograms (README.histogram)
+ (d) ndistinct coefficients
+
Compatible clause types
-----------------------
diff --git a/src/backend/utils/mvstats/common.c b/src/backend/utils/mvstats/common.c
index f6d1074..d34d072 100644
--- a/src/backend/utils/mvstats/common.c
+++ b/src/backend/utils/mvstats/common.c
@@ -32,7 +32,8 @@ static List* list_mv_stats(Oid relid);
* and serializes them back into the catalog (as bytea values).
*/
void
-build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
+build_mv_stats(Relation onerel, double totalrows,
+ int numrows, HeapTuple *rows,
int natts, VacAttrStats **vacattrstats)
{
ListCell *lc;
@@ -53,6 +54,7 @@ build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
MVDependencies deps = NULL;
MCVList mcvlist = NULL;
MVHistogram histogram = NULL;
+ double ndist = -1;
int numrows_filtered = numrows;
VacAttrStats **stats = NULL;
@@ -92,6 +94,9 @@ build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
if (stat->deps_enabled)
deps = build_mv_dependencies(numrows, rows, attrs, stats);
+ if (stat->ndist_enabled)
+ ndist = build_mv_ndistinct(totalrows, numrows, rows, attrs, stats);
+
/* build the MCV list */
if (stat->mcv_enabled)
mcvlist = build_mv_mcvlist(numrows, rows, attrs, stats, &numrows_filtered);
@@ -101,7 +106,7 @@ build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
histogram = build_mv_histogram(numrows_filtered, rows, attrs, stats, numrows);
/* store the histogram / MCV list in the catalog */
- update_mv_stats(stat->mvoid, deps, mcvlist, histogram, attrs, stats);
+ update_mv_stats(stat->mvoid, deps, mcvlist, histogram, ndist, attrs, stats);
}
}
@@ -183,6 +188,8 @@ list_mv_stats(Oid relid)
info->mcv_built = stats->mcv_built;
info->hist_enabled = stats->hist_enabled;
info->hist_built = stats->hist_built;
+ info->ndist_enabled = stats->ndist_enabled;
+ info->ndist_built = stats->ndist_built;
result = lappend(result, info);
}
@@ -252,7 +259,7 @@ find_mv_attnums(Oid mvoid, Oid *relid)
void
update_mv_stats(Oid mvoid,
MVDependencies dependencies, MCVList mcvlist, MVHistogram histogram,
- int2vector *attrs, VacAttrStats **stats)
+ double ndistcoeff, int2vector *attrs, VacAttrStats **stats)
{
HeapTuple stup,
oldtup;
@@ -292,26 +299,36 @@ update_mv_stats(Oid mvoid,
= PointerGetDatum(data);
}
+ if (ndistcoeff > 1.0)
+ {
+ nulls[Anum_pg_mv_statistic_standist -1] = false;
+ values[Anum_pg_mv_statistic_standist-1] = Float8GetDatum(ndistcoeff);
+ }
+
/* always replace the value (either by bytea or NULL) */
replaces[Anum_pg_mv_statistic_stadeps -1] = true;
replaces[Anum_pg_mv_statistic_stamcv -1] = true;
replaces[Anum_pg_mv_statistic_stahist-1] = true;
+ replaces[Anum_pg_mv_statistic_standist-1] = true;
/* always change the availability flags */
nulls[Anum_pg_mv_statistic_deps_built -1] = false;
nulls[Anum_pg_mv_statistic_mcv_built -1] = false;
nulls[Anum_pg_mv_statistic_hist_built-1] = false;
+ nulls[Anum_pg_mv_statistic_ndist_built-1] = false;
nulls[Anum_pg_mv_statistic_stakeys-1] = false;
/* use the new attnums, in case we removed some dropped ones */
replaces[Anum_pg_mv_statistic_deps_built-1] = true;
replaces[Anum_pg_mv_statistic_mcv_built -1] = true;
+ replaces[Anum_pg_mv_statistic_ndist_built-1] = true;
replaces[Anum_pg_mv_statistic_hist_built -1] = true;
replaces[Anum_pg_mv_statistic_stakeys -1] = true;
values[Anum_pg_mv_statistic_deps_built-1] = BoolGetDatum(dependencies != NULL);
values[Anum_pg_mv_statistic_mcv_built -1] = BoolGetDatum(mcvlist != NULL);
values[Anum_pg_mv_statistic_hist_built -1] = BoolGetDatum(histogram != NULL);
+ values[Anum_pg_mv_statistic_ndist_built-1] = BoolGetDatum(ndistcoeff > 1.0);
values[Anum_pg_mv_statistic_stakeys -1] = PointerGetDatum(attrs);
/* Is there already a pg_mv_statistic tuple for this attribute? */
diff --git a/src/backend/utils/mvstats/mvdist.c b/src/backend/utils/mvstats/mvdist.c
new file mode 100644
index 0000000..59b8358
--- /dev/null
+++ b/src/backend/utils/mvstats/mvdist.c
@@ -0,0 +1,171 @@
+/*-------------------------------------------------------------------------
+ *
+ * mvdist.c
+ * POSTGRES multivariate distinct coefficients
+ *
+ *
+ * Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ * src/backend/utils/mvstats/mvdist.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include <math.h>
+
+#include "common.h"
+#include "utils/lsyscache.h"
+
+static double estimate_ndistinct(double totalrows, int numrows, int d, int f1);
+
+/*
+ * Compute ndistinct coefficient for the combination of attributes. This
+ * computes the ndistinct estimate using the same estimator used in analyze.c
+ * and then computes the coefficient.
+ */
+double
+build_mv_ndistinct(double totalrows, int numrows, HeapTuple *rows,
+ int2vector *attrs, VacAttrStats **stats)
+{
+ int i, j;
+ int f1, cnt, d;
+ int nmultiple = 0, summultiple = 0;
+ int numattrs = attrs->dim1;
+ MultiSortSupport mss = multi_sort_init(numattrs);
+ double ndistcoeff;
+
+ /*
+ * It's possible to sort the sample rows directly, but this seemed
+ * somewhat simpler / less error-prone. Another option would be to
+ * allocate the arrays for each SortItem separately, but that'd be
+ * significant overhead (not just CPU, but especially memory bloat).
+ */
+ SortItem * items = (SortItem*)palloc0(numrows * sizeof(SortItem));
+
+ Datum *values = (Datum*)palloc0(sizeof(Datum) * numrows * numattrs);
+ bool *isnull = (bool*)palloc0(sizeof(bool) * numrows * numattrs);
+
+ for (i = 0; i < numrows; i++)
+ {
+ items[i].values = &values[i * numattrs];
+ items[i].isnull = &isnull[i * numattrs];
+ }
+
+ Assert(numattrs >= 2);
+
+ for (i = 0; i < numattrs; i++)
+ {
+ /* prepare the sort function for this dimension */
+ multi_sort_add_dimension(mss, i, i, stats);
+
+ /* accumulate all the data into the array and sort it */
+ for (j = 0; j < numrows; j++)
+ {
+ items[j].values[i]
+ = heap_getattr(rows[j], attrs->values[i],
+ stats[i]->tupDesc, &items[j].isnull[i]);
+ }
+ }
+
+ qsort_arg((void *) items, numrows, sizeof(SortItem),
+ multi_sort_compare, mss);
+
+ /* count number of distinct combinations */
+
+ f1 = 0;
+ cnt = 1;
+ d = 1;
+ for (i = 1; i < numrows; i++)
+ {
+ if (multi_sort_compare(&items[i], &items[i-1], mss) != 0)
+ {
+ if (cnt == 1)
+ f1 += 1;
+ else
+ {
+ nmultiple += 1;
+ summultiple += cnt;
+ }
+
+ d++;
+ cnt = 0;
+ }
+
+ cnt += 1;
+ }
+
+ if (cnt == 1)
+ f1 += 1;
+ else
+ {
+ nmultiple += 1;
+ summultiple += cnt;
+ }
+
+ ndistcoeff = 1 / estimate_ndistinct(totalrows, numrows, d, f1);
+
+ /*
+ * now count distinct values for each attribute and incrementally
+ * compute ndistinct(a,b) / (ndistinct(a) * ndistinct(b))
+ *
+ * FIXME Probably need to handle cases when one of the ndistinct
+ * estimates is negative, and also check that the combined
+ * ndistinct is greater than any of those partial values.
+ */
+ for (i = 0; i < numattrs; i++)
+ ndistcoeff *= stats[i]->stadistinct;
+
+ return ndistcoeff;
+}
+
+double
+load_mv_ndistinct(Oid mvoid)
+{
+ bool isnull = false;
+ Datum deps;
+
+ /* Fetch the pg_mv_statistic tuple for this statistics object. */
+ HeapTuple htup = SearchSysCache1(MVSTATOID, ObjectIdGetDatum(mvoid));
+
+#ifdef USE_ASSERT_CHECKING
+ Form_pg_mv_statistic mvstat = (Form_pg_mv_statistic) GETSTRUCT(htup);
+ Assert(mvstat->ndist_enabled && mvstat->ndist_built);
+#endif
+
+ deps = SysCacheGetAttr(MVSTATOID, htup,
+ Anum_pg_mv_statistic_standist, &isnull);
+
+ Assert(!isnull);
+
+ ReleaseSysCache(htup);
+
+ return DatumGetFloat8(deps);
+}
+
+/* The Duj1 estimator (already used in analyze.c). */
+static double
+estimate_ndistinct(double totalrows, int numrows, int d, int f1)
+{
+ double numer,
+ denom,
+ ndistinct;
+
+ numer = (double) numrows *(double) d;
+
+ denom = (double) (numrows - f1) +
+ (double) f1 * (double) numrows / totalrows;
+
+ ndistinct = numer / denom;
+
+ /* Clamp to sane range in case of roundoff error */
+ if (ndistinct < (double) d)
+ ndistinct = (double) d;
+
+ if (ndistinct > totalrows)
+ ndistinct = totalrows;
+
+ return floor(ndistinct + 0.5);
+}
diff --git a/src/include/catalog/pg_mv_statistic.h b/src/include/catalog/pg_mv_statistic.h
index 7020772..e46cc6b 100644
--- a/src/include/catalog/pg_mv_statistic.h
+++ b/src/include/catalog/pg_mv_statistic.h
@@ -40,6 +40,7 @@ CATALOG(pg_mv_statistic,3381)
bool deps_enabled; /* analyze dependencies? */
bool mcv_enabled; /* build MCV list? */
bool hist_enabled; /* build histogram? */
+ bool ndist_enabled; /* build ndist coefficient? */
/* histogram / MCV size */
int32 mcv_max_items; /* max MCV items */
@@ -49,6 +50,7 @@ CATALOG(pg_mv_statistic,3381)
bool deps_built; /* dependencies were built */
bool mcv_built; /* MCV list was built */
bool hist_built; /* histogram was built */
+ bool ndist_built; /* ndistinct coeff built */
/* variable-length fields start here, but we allow direct access to stakeys */
int2vector stakeys; /* array of column keys */
@@ -57,6 +59,7 @@ CATALOG(pg_mv_statistic,3381)
bytea stadeps; /* dependencies (serialized) */
bytea stamcv; /* MCV list (serialized) */
bytea stahist; /* MV histogram (serialized) */
+ float8 standcoeff; /* ndistinct coeff (serialized) */
#endif
} FormData_pg_mv_statistic;
@@ -72,7 +75,7 @@ typedef FormData_pg_mv_statistic *Form_pg_mv_statistic;
* compiler constants for pg_mv_statistic
* ----------------
*/
-#define Natts_pg_mv_statistic 16
+#define Natts_pg_mv_statistic 19
#define Anum_pg_mv_statistic_starelid 1
#define Anum_pg_mv_statistic_staname 2
#define Anum_pg_mv_statistic_stanamespace 3
@@ -80,14 +83,17 @@ typedef FormData_pg_mv_statistic *Form_pg_mv_statistic;
#define Anum_pg_mv_statistic_deps_enabled 5
#define Anum_pg_mv_statistic_mcv_enabled 6
#define Anum_pg_mv_statistic_hist_enabled 7
-#define Anum_pg_mv_statistic_mcv_max_items 8
-#define Anum_pg_mv_statistic_hist_max_buckets 9
-#define Anum_pg_mv_statistic_deps_built 10
-#define Anum_pg_mv_statistic_mcv_built 11
-#define Anum_pg_mv_statistic_hist_built 12
-#define Anum_pg_mv_statistic_stakeys 13
-#define Anum_pg_mv_statistic_stadeps 14
-#define Anum_pg_mv_statistic_stamcv 15
-#define Anum_pg_mv_statistic_stahist 16
+#define Anum_pg_mv_statistic_ndist_enabled 8
+#define Anum_pg_mv_statistic_mcv_max_items 9
+#define Anum_pg_mv_statistic_hist_max_buckets 10
+#define Anum_pg_mv_statistic_deps_built 11
+#define Anum_pg_mv_statistic_mcv_built 12
+#define Anum_pg_mv_statistic_hist_built 13
+#define Anum_pg_mv_statistic_ndist_built 14
+#define Anum_pg_mv_statistic_stakeys 15
+#define Anum_pg_mv_statistic_stadeps 16
+#define Anum_pg_mv_statistic_stamcv 17
+#define Anum_pg_mv_statistic_stahist 18
+#define Anum_pg_mv_statistic_standist 19
#endif /* PG_MV_STATISTIC_H */
diff --git a/src/include/nodes/relation.h b/src/include/nodes/relation.h
index 84be0ce..ba587da 100644
--- a/src/include/nodes/relation.h
+++ b/src/include/nodes/relation.h
@@ -657,11 +657,13 @@ typedef struct MVStatisticInfo
bool deps_enabled; /* functional dependencies enabled */
bool mcv_enabled; /* MCV list enabled */
bool hist_enabled; /* histogram enabled */
+ bool ndist_enabled; /* ndistinct coefficient enabled */
/* built/available statistics */
bool deps_built; /* functional dependencies built */
bool mcv_built; /* MCV list built */
bool hist_built; /* histogram built */
+ bool ndist_built; /* ndistinct coefficient built */
/* columns in the statistics (attnums) */
int2vector *stakeys; /* attnums of the columns covered */
diff --git a/src/include/utils/mvstats.h b/src/include/utils/mvstats.h
index 2b67772..67ed2f8 100644
--- a/src/include/utils/mvstats.h
+++ b/src/include/utils/mvstats.h
@@ -225,6 +225,7 @@ typedef MVSerializedHistogramData *MVSerializedHistogram;
MVDependencies load_mv_dependencies(Oid mvoid);
MCVList load_mv_mcvlist(Oid mvoid);
MVSerializedHistogram load_mv_histogram(Oid mvoid);
+double load_mv_ndistinct(Oid mvoid);
bytea * serialize_mv_dependencies(MVDependencies dependencies);
bytea * serialize_mv_mcvlist(MCVList mcvlist, int2vector *attrs,
@@ -266,11 +267,17 @@ MVHistogram
build_mv_histogram(int numrows, HeapTuple *rows, int2vector *attrs,
VacAttrStats **stats, int numrows_total);
-void build_mv_stats(Relation onerel, int numrows, HeapTuple *rows,
+double
+build_mv_ndistinct(double totalrows, int numrows, HeapTuple *rows,
+ int2vector *attrs, VacAttrStats **stats);
+
+void build_mv_stats(Relation onerel, double totalrows,
+ int numrows, HeapTuple *rows,
int natts, VacAttrStats **vacattrstats);
void update_mv_stats(Oid relid, MVDependencies dependencies,
MCVList mcvlist, MVHistogram histogram,
+ double ndistcoeff,
int2vector *attrs, VacAttrStats **stats);
#ifdef DEBUG_MVHIST
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 528ac36..7a914da 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1377,7 +1377,8 @@ pg_mv_stats| SELECT n.nspname AS schemaname,
length(s.stamcv) AS mcvbytes,
pg_mv_stats_mcvlist_info(s.stamcv) AS mcvinfo,
length(s.stahist) AS histbytes,
- pg_mv_stats_histogram_info(s.stahist) AS histinfo
+ pg_mv_stats_histogram_info(s.stahist) AS histinfo,
+ s.standcoeff AS ndcoeff
FROM ((pg_mv_statistic s
JOIN pg_class c ON ((c.oid = s.starelid)))
LEFT JOIN pg_namespace n ON ((n.oid = c.relnamespace)));
--
2.5.0
Attachment: 0008-change-how-we-apply-selectivity-to-number-of-groups-.patch (text/x-patch)
From c968f05c26ecfa9344a8a9c9209bd755fa4ddf7b Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@pgaddict.com>
Date: Tue, 26 Jan 2016 18:14:33 +0100
Subject: [PATCH 8/9] change how we apply selectivity to number of groups
estimate
Instead of simply multiplying the ndistinct estimate by the selectivity,
we instead use the formula for the expected number of distinct values
observed in 'k' rows when there are 'd' distinct values in the bin
d * (1 - ((d - 1) / d)^k)
This is 'with replacement', which seems appropriate here, and it
mostly assumes a uniform distribution of the distinct values. So if the
distribution is not uniform (e.g. there are very frequent groups) this
may be less accurate than the current algorithm in some cases, giving
over-estimates. But that's probably better than OOM.
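For example, with d = 100 distinct values in a bin and k = 10 sampled rows,
the formula gives 100 * (1 - (99/100)^10) ~= 9.6 expected distinct values,
i.e. close to k; for k = 1000 it approaches d itself.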
---
src/backend/utils/adt/selfuncs.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index f8d39aa..76be0e3 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -3464,9 +3464,9 @@ estimate_num_groups(PlannerInfo *root, List *groupExprs, double input_rows,
reldistinct = clamp;
/*
- * Multiply by restriction selectivity.
+ * Estimate the number of distinct values observed in rel->rows.
*/
- reldistinct *= rel->rows / rel->tuples;
+ reldistinct *= (1 - powl(1 - rel->rows/rel->tuples, rel->tuples/reldistinct));
/*
* Update estimate of total distinct groups.
--
2.5.0
Attachment: 0009-fixup-of-regression-tests-plans-changes-by-group-by-.patch (text/x-patch)
From 29ea451f45fa5b8891ebde195551180f2841826d Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@pgaddict.com>
Date: Sun, 28 Feb 2016 21:16:40 +0100
Subject: [PATCH 9/9] fixup of regression tests (plans changes by group by
estimation)
---
src/test/regress/expected/subselect.out | 25 +++++++++++--------------
1 file changed, 11 insertions(+), 14 deletions(-)
diff --git a/src/test/regress/expected/subselect.out b/src/test/regress/expected/subselect.out
index de64ca7..0fc93d9 100644
--- a/src/test/regress/expected/subselect.out
+++ b/src/test/regress/expected/subselect.out
@@ -807,27 +807,24 @@ select * from int4_tbl where
explain (verbose, costs off)
select * from int4_tbl o where (f1, f1) in
(select f1, generate_series(1,2) / 10 g from int4_tbl i group by f1);
- QUERY PLAN
-----------------------------------------------------------------------
- Hash Join
+ QUERY PLAN
+----------------------------------------------------------------
+ Hash Semi Join
Output: o.f1
Hash Cond: (o.f1 = "ANY_subquery".f1)
-> Seq Scan on public.int4_tbl o
Output: o.f1
-> Hash
Output: "ANY_subquery".f1, "ANY_subquery".g
- -> HashAggregate
+ -> Subquery Scan on "ANY_subquery"
Output: "ANY_subquery".f1, "ANY_subquery".g
- Group Key: "ANY_subquery".f1, "ANY_subquery".g
- -> Subquery Scan on "ANY_subquery"
- Output: "ANY_subquery".f1, "ANY_subquery".g
- Filter: ("ANY_subquery".f1 = "ANY_subquery".g)
- -> HashAggregate
- Output: i.f1, (generate_series(1, 2) / 10)
- Group Key: i.f1
- -> Seq Scan on public.int4_tbl i
- Output: i.f1
-(18 rows)
+ Filter: ("ANY_subquery".f1 = "ANY_subquery".g)
+ -> HashAggregate
+ Output: i.f1, (generate_series(1, 2) / 10)
+ Group Key: i.f1
+ -> Seq Scan on public.int4_tbl i
+ Output: i.f1
+(15 rows)
select * from int4_tbl o where (f1, f1) in
(select f1, generate_series(1,2) / 10 g from int4_tbl i group by f1);
--
2.5.0
Hi,
On 03/16/2016 03:58 AM, Tatsuo Ishii wrote:
> I apologize if it's already been discussed. I am new to this patch.
>
>> Attached is v15 of the patch series, fixing this and also doing quite a
>> few additional improvements:
>>
>> * added some basic examples into the SGML documentation
>> * addressing the objectaddress omissions, as pointed out by Alvaro
>> * support for ALTER STATISTICS ... OWNER TO / RENAME / SET SCHEMA
>> * significant refactoring of MCV and histogram code, particularly
>>   serialization, deserialization and building
>> * reworking the functional dependencies to support more complex
>>   dependencies, with multiple columns as 'conditions'
>> * the reduction using functional dependencies is also significantly
>>   simplified (I decided to get rid of computing the transitive closure
>>   for now - it got too complex after the multi-condition dependencies,
>>   so I'll leave that for the future)
>
> Do you have any other missing parts in this work? I am asking
> because I wonder if you want to push this into 9.6 or rather 9.7.
I think the first few parts of the patch series, namely:
* shared infrastructure (0002)
* functional dependencies (0003)
* MCV lists (0004)
* histograms (0005)
might make it into 9.6. I believe the code for building and storing the
different kinds of stats is reasonably solid. What probably needs more
thorough review are the changes in clauselist_selectivity(), but the
code in these parts is reasonably simple as it only supports using a
single multivariate statistics object per relation.
The part (0006) that allows using multiple statistics (i.e. selects
which of the available stats to use and in what order) is probably the
most complex part of the whole patch, and I myself do have some
questions about some aspects of it. I don't think this part is likely to
get into 9.6 at this point (although it'd be nice if we managed to do that).
I can also imagine moving the ndistinct pieces forward, in front of 0006,
if that helps getting them into 9.6. There's a bit more work on making
them more flexible, though, to allow handling subsets of columns
(currently we need a perfect match), as sketched below.
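A rough sketch of what the perfect-match restriction means, using the
CREATE STATISTICS syntax from the attached patches:

    CREATE STATISTICS s ON t (a,b,c) WITH (ndistinct);

    -- GROUP BY a, b, c can use the ndistinct coefficient, but
    -- GROUP BY a, b currently cannot, because only statistics
    -- exactly matching the grouped columns are considered.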
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
>> Many trailing white spaces found.
>
> Sorry, haven't noticed that after one of the rebases. Fixed in the
> attached v15 of the patch.

There are still a few trailing spaces:
/home/t-ishii/0002-shared-infrastructure-and-functional-dependencies.patch:3792: trailing whitespace.
/home/t-ishii/0004-multivariate-MCV-lists.patch:471: trailing whitespace.
/home/t-ishii/0004-multivariate-MCV-lists.patch:656: space before tab in indent.
{
/home/t-ishii/0004-multivariate-MCV-lists.patch:682: space before tab in indent.
}
/home/t-ishii/0004-multivariate-MCV-lists.patch:685: space before tab in indent.
{
/home/t-ishii/0004-multivariate-MCV-lists.patch:715: trailing whitespace.
/home/t-ishii/0006-multi-statistics-estimation.patch:2513: trailing whitespace.
Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp
On 03/21/2016 12:00 AM, Tatsuo Ishii wrote:
>>> Many trailing white spaces found.
>>
>> Sorry, haven't noticed that after one of the rebases. Fixed in the
>> attached v15 of the patch.
>
> There are still a few trailing spaces:
>
> /home/t-ishii/0002-shared-infrastructure-and-functional-dependencies.patch:3792: trailing whitespace.
> /home/t-ishii/0004-multivariate-MCV-lists.patch:471: trailing whitespace.
> /home/t-ishii/0004-multivariate-MCV-lists.patch:656: space before tab in indent.
> {
> /home/t-ishii/0004-multivariate-MCV-lists.patch:682: space before tab in indent.
> }
> /home/t-ishii/0004-multivariate-MCV-lists.patch:685: space before tab in indent.
> {
> /home/t-ishii/0004-multivariate-MCV-lists.patch:715: trailing whitespace.
> /home/t-ishii/0006-multi-statistics-estimation.patch:2513: trailing whitespace.
>
> Best regards,
D'oh. Thanks for reporting. Attached is v16, hopefully fixing the few
remaining whitespace issues.
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachments:
multivariate-stats-v16.tgz (application/x-compressed-tar)